Failing to report unbound UDP traffic #81
Oh dear - it should definitely support udp! Could you tell us a bit more about your use case, platform and perhaps an easy way to reproduce this? |
I downloaded the binary. Ran on:
I'm using FreeSWITCH running lots of traffic. Not sure an easy way to test... google says netcat? |
Hmm, with netcat I see UDP traffic.
And then:
And I see traffic in bandwhich and the process is correctly identified as netcat. I wonder what's different in these use cases? (Will keep investigating if nothing comes up, in any case). |
One idea: @zhangxp1998 - does this happen to you when you use your patched |
@imsnif Yes this still happens |
Ok found why: We indeed captured UDP traffic, but |
Further investigation: this regex bandwhich/src/os/lsof_utils.rs Line 19 in 5fdf236
Doesn't even pick up UDP lines Since we only use |
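To illustrate the two shapes the NAME column of an lsof line can take, here is a minimal sketch in plain Rust. It is not the regex from `lsof_utils.rs`; `LsofName` and `parse_lsof_name` are hypothetical names, and the input is assumed to be just the address token (for example `*:54852` or `10.0.0.2:40000->1.1.1.1:53`), without any trailing state such as `(ESTABLISHED)`.

```rust
/// Hypothetical result type: what can be recovered from the NAME token.
#[derive(Debug, PartialEq, Eq)]
pub struct LsofName {
    pub local_port: u16,
    /// Remote "address:port" text, present only when lsof printed "->remote".
    pub remote: Option<String>,
}

/// Parses a NAME token such as "*:54852" or "10.0.0.2:40000->1.1.1.1:53".
/// Returns None if no local port can be found.
pub fn parse_lsof_name(name: &str) -> Option<LsofName> {
    // Connected sockets are printed as "local->remote"; unconnected ones
    // (typical for UDP without connect()) only have the local part.
    let (local, remote) = match name.split_once("->") {
        Some((l, r)) => (l, Some(r.to_string())),
        None => (name, None),
    };
    // The port is whatever follows the last ':' of the local part, which
    // also works for bracketed IPv6 addresses like "[::1]:5000".
    let (_, port) = local.rsplit_once(':')?;
    let local_port = port.parse().ok()?;
    Some(LsofName { local_port, remote })
}
```

A parser along these lines accepts lines with no remote part instead of skipping them, so the local port stays available for matching.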
AFAIK a UDP datagram does have a source and a destination port though, which is what we need. |
The datagram does have source/destination port. But output from |
The output of my lsof for the netcat test is this:
Is it different on a mac? |
On Mac it looks like
EDIT:
(rem_address being 0) |
Hmm. For outgoing packets, this can make sense. |
Hmmm we should be able to determine which process these packets belong to. For example we could use triple EDIT: |
I'm not sure I understand. Right now, we do this: pub struct Socket {
pub ip: Ipv4Addr,
pub port: u16,
}
// ...
pub struct Connection {
pub remote_socket: Socket,
pub protocol: Protocol,
pub local_port: u16,
} Could you help me understand what you're suggesting? |
Sure. Right now we have a HashMap like this: bandwhich/src/display/ui_state.rs Line 55 in 5fdf236
Whenever we received a packet, we look up this Currently, the key to this hash map is
Problem of this approach: the
as key in Why? |
The kernel only shows a connected destination if the program called connect(). If it didn't, it can instead specify a destination for each packet, using sendto().
Here's a C program that talks UDP with 1.1.1.1 and 8.8.8.8, calling connect() for only one of the sockets. It looks like this in lsof on Linux (I don't know about macOS and other Unix-likes):
a.out 18303 alcaro 3u IPv4 558532 0t0 UDP *:54852
(It's also possible to talk to both servers on the same socket, but it acts weirdly if you didn't call connect(). I suspect the kernel discards incoming packets from the wrong source if you connect().)
I can't offer any solutions, but perhaps this can help you understand the problem better. |
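The same distinction can be sketched in Rust with `std::net::UdpSocket`. This is not the C program referred to above; the helper names are hypothetical, and everything is bound to loopback so it needs no outside network. A socket that called connect() has a remote peer on record, while one that only uses send_to() does not.

```rust
use std::io;
use std::net::{SocketAddr, UdpSocket};

/// Binds a UDP socket on an ephemeral loopback port and fixes its remote
/// peer with connect(), so the kernel records a remote address for it.
pub fn connected_socket(peer: SocketAddr) -> io::Result<UdpSocket> {
    let socket = UdpSocket::bind("127.0.0.1:0")?;
    socket.connect(peer)?;
    Ok(socket)
}

/// Binds a UDP socket but never calls connect(); the destination is
/// supplied per datagram through send_to() (sendto() in C).
pub fn unconnected_socket() -> io::Result<UdpSocket> {
    UdpSocket::bind("127.0.0.1:0")
}

/// True if the kernel has a remote peer on record for this socket.
pub fn has_remote(socket: &UdpSocket) -> bool {
    socket.peer_addr().is_ok()
}
```

Both sockets can send datagrams, but only the connected one reports a peer, which matches lsof listing just a local `*:port` for the other.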
@zhangxp1998 - Right - you said "remote port" instead of "local port" before and that's what threw me off. :) Thanks for clearing this up. We indeed match the process by its local port, which afaik is a safe approach since processes should not bind to the same local port with the same protocol. So, if I understand both you and @Alcaro - the piece we're missing here is the local port. We have it in the packet, but sometimes we don't have it in lsof or /proc. Right? |
I wouldn't recommend that - servers tend to have many connections with iface=eth0 local=80 proto=tcp. Use the remote port/address as well if available; if not, skip it and hope for the best. It'll still give weird results if a UDP server doesn't connect(), but it's right for all TCP, all connect()ors, and all clients. Not perfect, but pretty close.
Even for clients, I also suspect you can get the same local port on different connections if you have multiple network interfaces (wifi+ethernet), dual-stack (aka ipv4+ipv6), and of course port 12345/tcp vs 12345/udp, but I didn't test that.
(fork() can also lead to having the same fd open in two processes, attributing Chrome's traffic to another Chrome. But it'll get the process name and target address right, which should be good enough for users.)
Or, even better, tell pcap to report the PID for each packet. That'll not only fix disobedient UDP servers, but also raw sockets, like ping and traceroute. If pcap can't do that, BPF probably can. |
Just to add my 2c, what about using the file descriptor to uniquely identify a "connection" (including UDP)? If I'm not mistaken that should be unique. |
In that case, all these connections will have local port 80, but possibly with different remote ip/ports. Using <local port, interface, protocol> to identify the process is correct, because there can't be two processes listening on TCP port 80 (if you have two HTTP servers on the same machine, they must be listening on different ports/interfaces). There can be multiple connections going into port 80. We are not trying to identify which connection a packet belongs to; we are simply trying to identify which process sent the packet.
Yes you can. That's why I suggested |
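A minimal sketch of the lookup described above, assuming a table keyed only by local port and transport protocol (the interface part of the triple is left out for brevity). `ProcessTable`, `LocalSocket` and this `Protocol` enum are hypothetical illustrations, not bandwhich's actual types.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Protocol {
    Tcp,
    Udp,
}

/// Hypothetical lookup key: the local side only, no remote address.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct LocalSocket {
    pub port: u16,
    pub protocol: Protocol,
}

/// Maps a local socket to the name of the process that owns it.
pub struct ProcessTable {
    by_local: HashMap<LocalSocket, String>,
}

impl ProcessTable {
    pub fn new() -> Self {
        ProcessTable { by_local: HashMap::new() }
    }

    /// Records an entry as it might be read from lsof or /proc.
    pub fn register(&mut self, port: u16, protocol: Protocol, name: &str) {
        self.by_local
            .insert(LocalSocket { port, protocol }, name.to_string());
    }

    /// Finds the owning process for a sniffed packet by its local side.
    pub fn owner(&self, port: u16, protocol: Protocol) -> Option<&str> {
        self.by_local
            .get(&LocalSocket { port, protocol })
            .map(String::as_str)
    }
}
```

Because the remote address is not part of the key, a UDP socket that never called connect() still resolves to its process, and every connection into port 80/tcp maps to the same server.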
An interesting idea, but fds do, to my knowledge, not have system-unique identifiers accessible to userspace. (They have kernelspace addresses, but kernel doesn't like revealing that. Wouldn't make much sense for a 32bit x86 process on a 64bit kernel, anyways.) Process ID would work, but judging by this thread, pcap can't return that.
I somehow thought we needed to match the packet to a specific line in lsof, but you're right, all relevant matches will be owned by the same program, and that's all we need. Remote address is unnecessary, I got confused, sorry about that. There can be two processes listening on 80/tcp with fork() or SO_REUSEADDR, but worst case is identifying apache2's traffic as another apache2, no real harm. Grouping every apache2 together is probably the desired behavior, anyways.
May want to split protocol to transport layer protocol (tcp/udp) and network layer (ipv4/ipv6). It wouldn't surprise me if some systems refuse to assign the same local port between ipv4/v6, but it would surprise me if all systems do. I believe this is what you meant, but it's easy to misread. |
hmmm network layer has no concept of ports... On my machine, binding to the same port twice using a different network layer protocol fails. It would surprise me if any system allowed such behavior. Different OSI layers should act independently: when I listen on TCP port 80, I want all TCP traffic to port 80 to go to me, regardless of the network layer protocol used by the sender. If we are concerned we could try to use IPv4/v6 when matching, |
I can reproduce your results if I try to bind to any overlapping subset, such as v4 However, non-overlapping combinations work fine, for example v4 and v6 localhost (
Conclusion: Correct matching must include local address. Next question: Is correctness necessary? Probably not for servers, but I believe the randomized client-side addresses can overlap between v4 and v6. I don't know for sure if they do, but I'd rather include at least a v4/v6 flag, just to be safe. (With a fallback to IPv6 lookup if v4 fails, so UDP servers listening to |
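One way to picture a lookup that includes the local address is sketched below. The fallback order (exact address, then the wildcard of the same family, then the IPv6 wildcard `::`) is an assumption made for illustration and not necessarily what the comment above had in mind; `Transport`, `Key` and `find_owner` are hypothetical names.

```rust
use std::collections::HashMap;
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Transport {
    Tcp,
    Udp,
}

/// Hypothetical key: local address, local port, transport protocol.
pub type Key = (IpAddr, u16, Transport);

/// Looks up the owning process, trying progressively broader local addresses.
pub fn find_owner<'a>(
    table: &'a HashMap<Key, String>,
    local: IpAddr,
    port: u16,
    transport: Transport,
) -> Option<&'a str> {
    // Wildcard address of the same family as the packet's local address.
    let same_family_wildcard = match local {
        IpAddr::V4(_) => IpAddr::V4(Ipv4Addr::UNSPECIFIED),
        IpAddr::V6(_) => IpAddr::V6(Ipv6Addr::UNSPECIFIED),
    };
    // Assumed order: exact match, same-family wildcard, then "::", since a
    // dual-stack socket bound to "::" may also receive IPv4 traffic.
    let candidates = [
        local,
        same_family_wildcard,
        IpAddr::V6(Ipv6Addr::UNSPECIFIED),
    ];
    candidates
        .iter()
        .find_map(|addr| table.get(&(*addr, port, transport)).map(String::as_str))
}
```

With the address in the key, a v4 and a v6 client that happen to share a local port resolve to different entries, while a server bound to a wildcard address is still found.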
@Alcaro for the file descriptor topic, we have it in the output of EDIT: forget about it, seems to be per process |
Fixed by PR #82 |
Merged in release 0.8.0. Seems to work fine! |
I'm not seeing much traffic (basically just ssh and sshd) despite knowing there's a ton of traffic. iftop shows me over 20x streams open, but they are all UDP -- freeswitch VoIP streams. They are listed in lsof -i.
Is this a bug or by design? I saw no mention in the docs or issues of tcp vs udp.
Thanks!