This repository has been archived by the owner on Oct 30, 2019. It is now read-only.

Low-level networking #15

Closed
dlrobertson opened this issue Apr 26, 2018 · 17 comments
Comments

@dlrobertson
Contributor

dlrobertson commented Apr 26, 2018

Is/will this working group be focused on the lower-level layers of the network stack (e.g. projects like libpnet and smoltcp)? If so, are there particular areas of interest? There may be some overlap with the embedded-wg here.

@aturon
Contributor

aturon commented Apr 26, 2018

I think the amount of focus depends entirely on who gets involved! For Rust 2018 the main goal is async/await and layers above it, but with enough interest we could pursue other layers in parallel.

@dlrobertson
Contributor Author

Makes sense. I'm definitely interested in helping however I can with this group, but as a whole I tend to be more interested in the lower-level layers of the stack.

@stephanbuys

Low-Level Networking should ideally be a sub-group of this WG

@deg4uss3r

Agreed, I'm very interested in this!

@Nemo157
Member

Nemo157 commented Jul 27, 2018

I'm interested in this from the embedded side, hoping to be able to build up an asynchronous network stack all the way from the hardware to the application service.

@stephanbuys

In this sub-group, protocols will mainly refer to IP, TCP, ARP, BGP, etc. I'm sure those common crates and components would be really useful in the embedded space too.

@bubaflub

At Cloudflare we've begun using Rust for a major project at L3/L4. I'm interested in helping contribute in whatever way I can.

@stephanbuys

@bubaflub anyone that has the muscle to solve this issue comprehensively would unlock so much potential: tokio-rs/mio#839

@polachok

What do we need libpcap for? I'm working on mio-afpacket, which adds support for raw sockets on Linux. The same can be done for other platforms. I also plan to add a tokio shim on top of it.

@mrmonday

mrmonday commented Jul 28, 2018

Hi, main author of libpnet here. I'm not sure that I'll have time to follow the thread/WG, but I'll do a quick braindump/retrospective of my experience and thoughts with libpnet, Rust, and low-level networking, and try to answer any questions.

libpnet

libpnet is a product of its time - it was written pre-Rust 1.0, by someone who didn't know Rust, or anything about low-level networking. It's a mixed bag of features, with the general goal of providing safe and fast abstractions for user space low-level networking. Nowadays the core bits are pretty good, and there's still a lot of nice ideas in there, but it is a long way from what I would want from a stable/modern Rust library.

Datalink/Layer 2

This is what I would consider most mature in libpnet. It provides a cross-platform abstraction for synchronous datalink programming on Windows, Linux, assorted BSDs, and macOS. It has no problem saturating dual-10G links (I did not have the hardware to test beyond that).

The Good

  • Relatively easy to strike a balance between "this code works the same on all platforms" and "I want to use this feature which only exists on platform X".
  • Supports batching of packets to minimise overhead

The Bad

  • Some required interfaces on some platforms are inherently unsafe, and I'm not convinced they can be made safe. I've settled for a "this is probably fine" approach, but maybe others aren't OK with out-of-bounds "but it's not, I promise" array access (read the comments next to sdl_data).
  • At least as far as libpnet is concerned, there's a whole bunch of unsafe code. Mostly written pre-nomicon and without any assertions of "this is actually safe". For the most part this should be fixable.
  • While the abstraction works OK for the general case, it is probably insufficient for, eg. proper DPDK support. Even with netmap, which libpnet supports, you do not get access to the full suite of functionality.
    • Also missing are sensible things like AsRawFd.
  • Only supports synchronous usage, except on Linux, where it kinda-sorta supports async but with a sync interface.
  • Some platforms really want you to use libpcap. I have managed to hit kernel panics because I did something in a slightly different way to libpcap before.

The Ugly

  • Doing anything on Windows at L2 needs a driver to be installed. libpnet uses WinPcap for this, rather than writing its own. WinPcap doesn't support using it directly, only via libpcap.
    • Incidentally, libpnet has a libpcap backend
  • Testing is hard
    • To run tests which actually use a real backend, you need the right permissions
    • The easy way is "run tests as root" - this is obviously not something you want to do
    • There are nicer ways on a per-platform basis, but they don't really integrate with cargo (eg. modifying capabilities of the test binary before it runs)
    • It's possible all these problems become trivial if you containerise the tests - you can create test interfaces, and sandbox the containers so it doesn't matter that you're running as root.

Streaming iterators

As an aside, in my opinion, to do good synchronous low level networking, you need streaming iterators. The abstraction "this set of bytes is valid until the end of the block" is incredibly powerful, and enables safe zero-copy networking.
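
To make the streaming-iterator idea concrete, here is a minimal sketch. It is not libpnet's API: `FrameReader`, `next_frame`, and the length-prefixed buffer layout are all hypothetical, chosen only to show a slice that stays valid until the next call.

```rust
/// Hypothetical receive buffer holding several frames back to back,
/// each stored as a 2-byte big-endian length followed by that many bytes.
struct FrameReader {
    buf: Vec<u8>,
    pos: usize,
}

impl FrameReader {
    fn new(buf: Vec<u8>) -> Self {
        FrameReader { buf, pos: 0 }
    }

    /// Returns the next frame as a slice borrowed from `self`.
    /// The borrow must end before `next_frame` can be called again, so a
    /// real reader would be free to refill the buffer between calls
    /// without ever copying a frame out.
    fn next_frame(&mut self) -> Option<&[u8]> {
        let header = self.buf.get(self.pos..self.pos + 2)?;
        let len = u16::from_be_bytes([header[0], header[1]]) as usize;
        let start = self.pos + 2;
        let frame = self.buf.get(start..start + len)?;
        self.pos = start + len;
        Some(frame)
    }
}

/// Typical use: each frame is only touched inside its loop iteration.
fn frame_lengths(reader: &mut FrameReader) -> Vec<usize> {
    let mut lens = Vec::new();
    while let Some(frame) = reader.next_frame() {
        lens.push(frame.len());
    }
    lens
}
```

The standard `Iterator` trait cannot express this, because its `Item` type is not allowed to borrow from the iterator itself. That is why the sketch uses an inherent method rather than implementing `Iterator`.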

Network/Transport Layers 3/4

Don't even bother.

The Good

  • This section intentionally left blank

The Bad and Ugly

  • Operating system support varies wildly
    • Some platforms helpfully re-order some bytes of packets for you
    • Some platforms allow you to send/receive TCP/UDP, others don't.
    • Some platforms let you use the loopback interface. Some let you do it if you jump through some hoops.
  • It is slow
    • OSes optimise for application layer, the lower levels usually take an unusual route
    • On some platforms, it slows down application layer networking
    • On most platforms, you cannot saturate gigabit, let alone 10G and above at this layer
  • IPv6 support varies wildly, from not at all to "kind of a bit"
  • Honestly, your time would be better spent writing your own IP stack on top of the datalink layer

Packet abstractions

These are great, but I would not do what libpnet does again.

The Good

  • When I wrote the packet abstraction for libpnet, I fixed multiple bugs from my manual packet manipulation, and reduced the code size by a factor of N, where N is a number I don't remember, but could look up.
  • Not dealing with endianness is great
  • Not manually shifting bits is great
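
As a rough illustration of what such an abstraction buys you, here is a hand-written sketch of a typed view over an IPv4 header. `Ipv4View` and its method names are hypothetical and are not what libpnet generates. The point is only that the shifts, masks, and byte-order conversions live in one place instead of at every call site.

```rust
/// Hypothetical zero-copy view over the fixed part of an IPv4 header.
struct Ipv4View<'a> {
    bytes: &'a [u8],
}

impl<'a> Ipv4View<'a> {
    /// Returns `None` if the slice is shorter than the 20-byte minimum header.
    fn new(bytes: &'a [u8]) -> Option<Self> {
        if bytes.len() < 20 {
            None
        } else {
            Some(Ipv4View { bytes })
        }
    }

    /// Version is the high nibble of the first byte.
    fn version(&self) -> u8 {
        self.bytes[0] >> 4
    }

    /// Header length is the low nibble of the first byte, in 32-bit words.
    fn header_len_bytes(&self) -> usize {
        ((self.bytes[0] & 0x0f) as usize) * 4
    }

    /// Total length is a big-endian u16 at byte offset 2.
    fn total_length(&self) -> u16 {
        u16::from_be_bytes([self.bytes[2], self.bytes[3]])
    }

    fn ttl(&self) -> u8 {
        self.bytes[8]
    }

    fn protocol(&self) -> u8 {
        self.bytes[9]
    }
}
```

Callers then write `view.total_length()` and never see the wire encoding.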

The Bad/Ugly

  • In libpnet, I wrote support using compiler plugins/macros. This was a terrible idea.
    • Compile times suffer due to syntex
    • Don't have access to type information, which makes for very hacky code
    • Far too much magic
  • Packet composition is critical to a successful abstraction. libpnet's packet support is mediocre at best for this.
  • You can get really far with simple abstractions. The second you try to implement more interesting packets you start hitting a lot of walls.
  • It's incredibly verbose. Compare this scapy code with the equivalent in libpnet. It is possible to get a near-scapy level of terseness in Rust, and any abstraction should endeavour to get close to that

What I would do next time

I think a more sensible approach for a packet abstraction is to not write Rust. I would propose a rust-like DSL, which compiles to Rust, and gives a scapy-level experience. There should be minimal magic. libpnet's packet code generates lots of traits and implementations, which people find confusing. I have a lot more thoughts on this, but I will save them for another time.

Async

This is a thing everyone wants.

When I started libpnet, Rust's story was "we use green threads". Then it was "we do zero overhead abstractions". For libpnet, I settled on synchronous, blocking networking. This is great for when you are handling millions of packets per second - you don't want or need the overhead of a full async runtime. It's not so great for anything else.

I haven't spent much time on how to implement this - no doubt it's different for each platform. It looks like @polachok is working on Linux support for this, so he might be a better person to ask. BSDs/macOS follow an "everything is a file" approach, so I imagine the code is similar to async file handling.

libpnet would need a big overhaul to support async well.

Documentation

As a general rule, documentation for anything low-level networking is sparse and conflicting at best. There are a few good resources around, they tend to be hard to find though. I would hope that any Rust work in this area would fix that ;)

Let me know if I can be of any more use/lend any further insight.

@stephanbuys

@polachok Windows doesn't support raw sockets, so at least for it we would need a solution that integrates with WinPcap, or we would need to provide our own alternative (a definite long-term goal).

@mrmonday

@stephanbuys Since you brought it up, I'll expand on some technical details for L2 networking.

Native

Linux

Linux exposes L2 networking through the standard Berkeley socket APIs, with AF_PACKET. There's nothing particularly special here compared to normal application layer networking.

See also: https://github.com/libpnet/libpnet/blob/b74c3da988573a35ce1f9839317d3aa10b4b0a43/pnet_datalink/src/linux.rs

BSDs/macOS

These platforms take the unix-y "everything is a file" approach. You open either /dev/bpf or /dev/bpfN, (where N is an incrementing natural number) depending on the platform, then read/write to it like a file.

See also: https://github.com/libpnet/libpnet/blob/b74c3da988573a35ce1f9839317d3aa10b4b0a43/pnet_datalink/src/bpf.rs
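
A minimal sketch of the "open it like a file" step, using only the standard library. The function names `bpf_candidates` and `open_bpf` are made up for illustration, and trying `/dev/bpf` before the numbered devices is an assumption of this sketch rather than something any particular platform mandates. The linked bpf.rs shows what libpnet really does.

```rust
use std::fs::{File, OpenOptions};
use std::io;

/// Candidate device paths: /dev/bpf first, then /dev/bpf0, /dev/bpf1, ...
fn bpf_candidates(max: u32) -> Vec<String> {
    let mut paths = vec![String::from("/dev/bpf")];
    paths.extend((0..max).map(|n| format!("/dev/bpf{}", n)));
    paths
}

/// Opens the first candidate that can be opened for reading and writing.
/// On failure, returns the error from the last path that was tried.
fn open_bpf(max: u32) -> io::Result<File> {
    let mut last_err = io::Error::new(io::ErrorKind::NotFound, "no BPF device tried");
    for path in bpf_candidates(max) {
        match OpenOptions::new().read(true).write(true).open(&path) {
            Ok(file) => return Ok(file),
            Err(err) => last_err = err,
        }
    }
    Err(last_err)
}
```

Opening the device is only the first step. Before any frames can be read, the descriptor still has to be bound to an interface with an ioctl (`BIOCSETIF`), which needs FFI and is not shown here.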

Non-native

Windows

Windows does not support L2 networking out of the box. It requires a driver to do it. Usually this will be WinPcap or Npcap, but you can write your own. These function very similarly to BPF for sending/receiving, but the set up is rather different.

See also: https://github.com/libpnet/libpnet/blob/b74c3da988573a35ce1f9839317d3aa10b4b0a43/pnet_datalink/src/winpcap.rs

libpcap

libpcap is what you would usually reach for in the C world. It already supports every platform which I mention here, and would mean we only need to target a single API. The main downside is that you cannot (last time I checked) reach the same level of performance as is otherwise possible (with eg. netmap/DPDK).

See also: https://github.com/libpnet/libpnet/blob/b74c3da988573a35ce1f9839317d3aa10b4b0a43/pnet_datalink/src/pcap.rs

netmap/DPDK/pf_ring

There are a bunch of userspace networking libraries which cut out the middle man and work directly with the NIC (ish, library dependent). These typically enable you to process packets at line speed on serious hardware (40G/100G). Their APIs vary quite a bit - but obviously don't conform to anything that looks like the file/socket APIs. I believe they all offer versions of libpcap, but take a performance hit when doing so. The platform support also varies.

See also:

Long story short - this needs to be implemented very differently on every platform, unless pcap is targeted.

@stephanbuys

@mrmonday this is awesome - thank you, this kind of gold is scattered all over the place - I'll make sure this is captured somewhere.

I believe one of the best things this effort could accomplish would be to support the above appropriately in mio - possibly via a mio "shim" for libpcap? (@polachok?)

@dlrobertson
Contributor Author

@mrmonday great summary!

"Low-Level Networking should ideally be a sub-group of this WG"

@stephanbuys agreed, that sounds great!

"I'm interested in this from the embedded side, hoping to be able to build up an asynchronous network stack all the way from the hardware to the application service."

@Nemo157 You may be interested in smoltcp-rs/smoltcp#208.

@ereichert

I'm very interested in the low level networking topics.

@polachok

@stephanbuys
I'm not quite sure pcap is the best way to go for low-level networking, as it hides many details of the platform APIs which can be crucial if you want to implement some sort of high-performance processing application (see the packet(7) options on Linux).

A better approach, in my opinion, would be to implement the platform APIs in Rust via C FFI where possible (AF_PACKET on Linux, BPF on BSD/macOS, pcap on Windows) and create an umbrella crate which chooses the appropriate one for the platform via #[cfg]. This approach would benefit both advanced users (who know which options they want to set) and newcomers (who want an easy-to-use cross-platform interface).
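
A rough sketch of how the umbrella-crate idea could look. Every `backend` module here is a hypothetical placeholder. A real one would wrap the platform API via FFI and expose the platform-specific options to advanced users. The list of BSD targets is also just an assumption for illustration.

```rust
// Exactly one `backend` module is compiled in, depending on the target.
#[cfg(target_os = "linux")]
mod backend {
    pub const NAME: &str = "AF_PACKET";
}

#[cfg(any(
    target_os = "macos",
    target_os = "freebsd",
    target_os = "openbsd",
    target_os = "netbsd"
))]
mod backend {
    pub const NAME: &str = "BPF";
}

#[cfg(windows)]
mod backend {
    pub const NAME: &str = "pcap";
}

// Fallback so the sketch still compiles on targets not listed above.
#[cfg(not(any(
    target_os = "linux",
    target_os = "macos",
    target_os = "freebsd",
    target_os = "openbsd",
    target_os = "netbsd",
    windows
)))]
mod backend {
    pub const NAME: &str = "unsupported";
}

/// The cross-platform surface only ever refers to `backend`,
/// whichever implementation was selected at compile time.
fn backend_name() -> &'static str {
    backend::NAME
}
```

Newcomers would use only the shared surface, while advanced users could reach into the per-platform module behind their own `#[cfg]` guard.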

@Ralith

Ralith commented Jul 30, 2018

As the author of a QUIC implementation I'm interested. Though QUIC does not require raw sockets in any sense as it's implemented atop UDP, an implementation has much in common with a TCP stack.
