Networking API design #370

Closed
badeend opened this issue Dec 28, 2020 · 14 comments

@badeend
Contributor

badeend commented Dec 28, 2020

I've had some thoughts, and here's my braindump. Feedback is welcome.

1. Which use cases should the WASI networking APIs account for?

The official WebAssembly website mentions lots of use cases: https://webassembly.org/docs/use-cases/. I've kept the cases that seem relevant and added some of my own:

  • Standalone desktop & server Wasm apps.
  • IoT devices running on microcontrollers without a full-blown OS.
  • Web browsers.
  • Container-based serverless providers like AWS, GCP and Azure.
  • JavaScript/Wasm-based serverless providers like Fastly's Compute@Edge, Cloudflare Workers and Akamai EdgeWorkers. These look to be more locked down than their container-based counterparts.
  • HTTP proxies.
  • Plugin/extension-style embeddings into existing applications.
  • Audio / video streaming.
  • Gaming.
  • Peer-to-peer applications (games, collaborative editing, decentralized and centralized).
  • Image recognition.
  • Platform simulation / emulation (ARC, DOSBox, QEMU, MAME, …).
  • Language interpreters and virtual machines.
  • POSIX user-space environment, allowing porting of existing POSIX applications.
  • Remote desktop.
  • VPN.
  • Local web server.
  • Fat client for enterprise applications (e.g. databases).
  • Server-side compute of untrusted code.
  • Symmetric computations across multiple nodes.

2. Which protocols should be focused on first?

For reference, I've compiled a non-exhaustive list of commonly used protocols:

UDP, TCP, DNS, HTTP1/2/3, QUIC, TLS, SSH, SFTP, FTP(S), SMTP, POP3, IMAP, WebRTC, WebSockets, gRPC, UNIX sockets.

Notes on HTTP

Outside the browser

Many runtimes (Java, .NET, ...) have built their own HTTP stack on top of the OS-provided socket APIs. Even if WASI were to provide an HTTP interface, it seems unlikely that existing codebases would switch to it anytime soon.

Inside the browser

Browsers already have an HTTP API: fetch(). All WASI can provide is a wrapper for that function.

Also, all WASI's current proposals are, as far as I can see, blocking synchronous functions. And browsers won't swallow that. Until there is a general answer for asynchronicity in Wasm/WASI, I propose to leave this use case out of scope for the networking proposal.

My thoughts

TCP and UDP are the lingua franca of the web with pretty much every other protocol in existence built on top of that. It seems logical to start with these, although I'd be happy to be proven wrong.

Pros:

  • Wasm engines only need to expose a minimal surface area.
  • Gives users the freedom to use any protocol they want.

Cons:

  • Browsers will never support it.
  • Every module must bring its own implementation for the higher-level protocols it wishes to use. This increases file sizes, but file size is not a top priority since browser usage is ruled out.
  • Very low level. Makes Wasm embedders unable to hand out capabilities based on application protocol-specific traits.

3. What should and shouldn't be allowed?

Prior art regarding blocking unwanted network activity:

  • The existing WASI filesystem API. Only file descriptors explicitly passed down from the host can be acted upon.
  • Web browsers restrict a script's network access using same/cross origin policies. An "origin" is defined as the combination of protocol+hostname+port. Also, the script only has access to a limited subset of the transmitted HTTP request and response.
  • WebExtensions use a manifest.json to declare url patterns which they're allowed to communicate with. Example: "*://developer.mozilla.org/*".
  • Layer 3/4 firewalls generally allow or block traffic based on direction (inbound, outbound), protocol (TCP or UDP), source (address, port), destination (address, port). This can in theory be applied on any box between the two endpoints of the connection.
  • Layer 7 firewalls allow or block traffic based on the data inside the packets, usually HTTP. This allows filtering on things like hostname, method, path, headers, body. This kind of firewall needs access to the cleartext data, so it must be placed before the data is encrypted or after the data is decrypted.

Random questions

  • How should capabilities be passed down?
  • Based on what conditions will embedders decide to allow or deny networking requests?
  • Should modules be allowed to instantiate their own network connections or is this a privilege of the host?
  • Should networking capability handles be discoverable by modules? Like preopened directories can currently be discovered by iterating through the file descriptor ids until an error is reached.
  • Sockets connect to IP addresses only. Example pseudo-code: connect(fd, "[2a00:1450:400e:80d::200a]", 443). However, if a Wasm embedder wishes to block network access based on the host, they'll probably want to do so based on the domain name instead of meaningless IP addresses.

4. What should be the general API design?

Follow the designs of existing APIs or come up with something new?

Is it a goal to have one unifying API abstracting away multiple protocols? This is what "TAPS" mentioned in 315 seems to be doing. If there is a time to steer away from the conventional APIs, this would be it.

Many existing applications are built on the presumption that the OS doesn't provide anything higher-level than Berkeley sockets and therefore already bundle their own implementations of the other protocols one way or another. Whatever the WASI networking APIs end up looking like, is it a goal to provide a compatibility layer similar to wasi-libc automagically remapping open() to openat() ?

What should be the interface boundary? "Networking" is a very broad topic. Assuming embedders will only implement interfaces that are relevant to them, where should the dividing line between interfaces be placed? Some fictional examples for illustration:

  • wasi_sockets vs wasi_tcp & wasi_udp
  • wasi_http vs wasi_http_server & wasi_http_client

Transparent TLS

I've seen this idea surface multiple times in this repository: unify unencrypted and encrypted connections into a single API.

Some considerations:

  • Not all communications are encrypted throughout the entire duration of the connection. Examples: SMTP/IMAP/POP3 using STARTTLS, back-end servers forwarding encrypted data but with the PROXY protocol header prefixed.
  • Applications need access to the current state of encryption when deciding whether or not to allow authentication commands.
  • QUIC has a tight TLS integration.

My thoughts

The WASI overview document mentions:

The first version of WASI is relatively simple, small, and POSIX-like in order to make it easy for implementers to prototype it and port existing code to it, making it a good way to start building momentum and allow us to start getting feedback based on experience.

... hinting at a POSIX-sockets API. This might be an obvious answer since they're well understood and an industry standard.

Random questions:

  • As mentioned before, there is no general agreed upon answer to non-blocking functions yet. For listening sockets, does this mean that we're forced to a thread-per-connection solution?
  • How to decide which address(es) to bind on? Just assume 127.0.0.1 and/or 0.0.0.0 are available? Or expose an API to list the available network interfaces?
  • Allow multicast UDP?
  • There is an open pull request to add Berkeley sockets. It features a modified socket(...) function that includes a "capability" file descriptor. This is not standard and breaks existing software. Should a compatibility workaround be devised?

5. Gather feedback

It would be wise to reach out to existing Wasm embedders and get their vision on what the networking API should and shouldn't be allowed to do. For example: Fastly is both a founding member of the Bytecode Alliance and an embedder. They'll probably have something to say about what networking functionalities they want to allow inside their workers, if any.

@sunfishcode
Member

Interesting thoughts! Here are some thoughts to go with them :-).

One of the big ideas in WASI is that we don't have to think in terms of "which abstraction level should WASI as a whole target?", because it can target multiple levels. We could have bare sockets APIs for programs doing their own POP3, transparent-TLS sockets APIs for programs doing custom protocols, and high-level HTTP fetch/proxy APIs for programs that just want to do simple HTTP operations, all in WASI at the same time.

APIs in wasm are virtualizable -- there's no syntactic difference between a system call and a library call, so if we design things well, it should be possible to implement the high-level APIs as pure wasm modules that call the low-level APIs. This means that implementations that want to minimize their surface area can do that. Other implementations will want to implement the high-level APIs natively, because they'll be able to do high-level optimizations that depend on having the high-level information about how the protocols are being used. So we can let implementations choose.

all WASI's current proposals are, as far as I can see, blocking synchronous functions.

Right; there are efforts underway to design a Wasm-level way to do async, so at the WASI level we're mostly just keeping an eye on those efforts right now to avoid duplicating work.

TCP and UDP are the lingua franca of the web with pretty much every other protocol in existence built on top of that. It seems logical to start with these, although I'd be happy to be proven wrong.

Popular operating systems don't expose bare TCP APIs, so while in theory that might be the lowest-level abstraction layer, in practice most implementations today wouldn't be able to support it. Berkeley-style sockets are the lowest-level API that is widely supported by existing OSes, so that seems more practical than bare TCP, at least to start with.

And as I mentioned at the top, we can work on high-level APIs independently of the low-level APIs.

is it a goal to provide a compatibility layer similar to wasi-libc automagically remapping open() to openat() ?

Yes. This is one of the expectations of the Berkeley sockets prototype -- it's ok to have an additional capability argument, because wasi-libc will be the thing which provides the standard Berkeley sockets API, handling the capability argument implicitly.
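To illustrate the shim idea, here is a minimal Python sketch of a libc-like layer that stores a capability once and then exposes the familiar capability-free socket() signature. All names (`Capability`, `host_socket_create`, `LibcShim`) are hypothetical and invented for this illustration; they are not taken from wasi-libc or from the sockets prototype.

```python
class Capability:
    """Hypothetical token the host hands to the module at startup."""
    def __init__(self, name):
        self.name = name


def host_socket_create(cap, family, kind):
    """Stand-in for a host import that requires an explicit capability."""
    if not isinstance(cap, Capability):
        raise PermissionError("no networking capability")
    return {"cap": cap.name, "family": family, "kind": kind}


class LibcShim:
    """Plays the role of the libc layer: it remembers the capability and
    supplies it implicitly on every call."""
    def __init__(self, ambient_cap=None):
        self._cap = ambient_cap

    def socket(self, family, kind):
        # Existing code calls socket(family, kind) as usual; the
        # capability argument is filled in behind the scenes.
        return host_socket_create(self._cap, family, kind)
```

Code written against the shim never sees the extra argument, which is what would keep existing Berkeley-sockets software compiling unmodified.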

How should capabilities be passed down?

The rough idea is that they'll be passed into main from the outside, similarly to how the preopened directories are opened, and then passed into the APIs as needed. This is an area where the current infrastructure is very primitive at this time, but which we plan to develop into a much more general framework, especially once the underlying Wasm platform provides better mechanisms for doing so.

Based on what conditions will embedders decide to allow or deny networking requests?

In the sockets prototype, the idea is to have an address pool. The details of what's in the pool and how it decides which connections are permitted are areas where I expect the API will evolve.
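As a rough illustration of what an address pool check could look like, the sketch below treats the pool as a list of CIDR blocks and tests membership with Python's standard `ipaddress` module. The `AddressPool` class and the CIDR-list format are assumptions made for this example; the prototype's actual pool semantics may differ.

```python
from ipaddress import ip_address, ip_network


class AddressPool:
    """Hypothetical pool: a connection is permitted only if the target
    address falls inside one of the configured networks."""
    def __init__(self, cidrs):
        self._networks = [ip_network(c) for c in cidrs]

    def permits(self, address):
        addr = ip_address(address)
        # Membership is always False when the IP versions differ.
        return any(addr in net for net in self._networks)


pool = AddressPool(["127.0.0.0/8", "2001:db8::/32"])
```

With this pool, `pool.permits("127.0.0.1")` is true and `pool.permits("8.8.8.8")` is false. Ports and protocols are deliberately left out to keep the example short.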

Should modules be allowed to instantiate their own network connections or is this a privilege of the host?

They'll need a handle to do so. But given a handle, the details of what can be done with that handle will be dependent on the individual API design.

Should networking capability handles be discoverable by modules? Like preopened directories can currently be discovered by iterating through the file descriptor ids until an error is reached.

Yes. As above, the current infrastructure for this is primitive right now, but this is something we want to enable.
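The probing style of discovery mentioned in the question (iterate through handle numbers until an error is reached) can be sketched as follows. `FakeHost`, `BadHandle` and the starting handle number 3 are assumptions for illustration only, not part of any WASI API.

```python
class BadHandle(Exception):
    """Raised by the fake host when a handle number is not in use."""


class FakeHost:
    """Stand-in for an embedder; maps handle numbers to descriptions."""
    def __init__(self, preopens, first_handle=3):
        self._table = {first_handle + i: p for i, p in enumerate(preopens)}

    def describe(self, handle):
        try:
            return self._table[handle]
        except KeyError:
            raise BadHandle(handle) from None


def discover(host, first_handle=3):
    """Probe consecutive handle numbers until the host reports an error."""
    found = {}
    handle = first_handle
    while True:
        try:
            found[handle] = host.describe(handle)
        except BadHandle:
            return found
        handle += 1


host = FakeHost(["dir:/data", "tcp-connect:example.com:443"])
```

This only works while handed-down handles are numbered contiguously, which is one reason a more explicit discovery mechanism might be preferable.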

As mentioned before, there is no general agreed upon answer to non-blocking functions yet. For listening sockets, does this mean that we're forced to a thread-per-connection solution?

WASI does have a poll_oneoff function for waiting for I/O on multiple handles. It won't scale to very-many handles, but it's likely better than thread-per-connection for many use cases. If you need very-many-handles, that's where we'll need something like the async support.
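As an analogy only (this is Python's standard `selectors` module, not `poll_oneoff` itself), the sketch below shows the general pattern of one thread waiting on several handles at once instead of dedicating a thread to each connection. The `read_ready` helper is a hypothetical name.

```python
import selectors
import socket


def read_ready(sockets, timeout=1.0):
    """Block once for all sockets; return {label: bytes} for readable ones."""
    sel = selectors.DefaultSelector()
    try:
        for label, sock in sockets.items():
            sel.register(sock, selectors.EVENT_READ, data=label)
        return {key.data: key.fileobj.recv(1024)
                for key, _events in sel.select(timeout)}
    finally:
        sel.close()


# Two local connections; only "a" has data waiting.
a_local, a_peer = socket.socketpair()
b_local, b_peer = socket.socketpair()
a_peer.sendall(b"ping")
result = read_ready({"a": a_local, "b": b_local})
for s in (a_local, a_peer, b_local, b_peer):
    s.close()
```

A single call services whichever connections are ready, so the number of threads no longer has to grow with the number of connections.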

@badeend
Contributor Author

badeend commented Jan 7, 2021

The rough idea is that they'll be passed into main from the outside, similarly to how the preopened directories are opened, and then passed into the APIs as needed. This is an area where the current infrastructure is very primitive at this time, but which we plan to develop into a much more general framework, especially once the underlying Wasm platform provides better mechanisms for doing so.

How do you see the relationship between "capabilities" and "files"? In the object-capability world, the reference is the capability. And when talking about file descriptors Wikipedia has the following to say:

This file descriptor is a capability. Its existence in the process's file descriptor table is sufficient to know that the process does indeed have legitimate access to the object.

Suggestion:

Rename the file descriptor table to a more general term like "resource table" and let files and other resources share the address space. To maintain compatibility, preopened file descriptors could keep starting at index 0 and count up. Other non-POSIX resources could start at MAX_INT_VALUE counting down.
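A minimal sketch of such a shared table, in Python for brevity. The `ResourceTable` class is hypothetical, and the concrete value chosen for `MAX_HANDLE` is an assumption, since the suggestion above only says MAX_INT_VALUE.

```python
MAX_HANDLE = 2**31 - 1  # assumed upper bound for handle numbers


class ResourceTable:
    """One index space: files count up from 0, other resources count down."""
    def __init__(self):
        self._entries = {}
        self._next_file = 0
        self._next_other = MAX_HANDLE

    def _check_space(self):
        if self._next_file > self._next_other:
            raise RuntimeError("resource table is full")

    def add_file(self, resource):
        self._check_space()
        handle = self._next_file
        self._next_file += 1
        self._entries[handle] = resource
        return handle

    def add_other(self, resource):
        self._check_space()
        handle = self._next_other
        self._next_other -= 1
        self._entries[handle] = resource
        return handle

    def get(self, handle):
        return self._entries[handle]
```

Existing code that assumes low-numbered file descriptors keeps working, while non-POSIX resources stay out of its way at the top of the range.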

Regardless of how capabilities get transported into a Wasm module, wasi-libc needs a way to take a collection of capabilities and somehow redirect every C-function call to the correct handle. Some options:

  • Expose only one "ambient" handle. wasi-libc uses this for every wasi call.
  • Expose multiple handles and add a WASI API that can discover which WASI functions can be called using a particular handle.
  • Expose multiple handles at predefined locations:
    • handle 1: filesystem
    • handle 2: networking
    • handle 3: clock
    • ...

Out of these three, the first one has my preference.


Let me try to answer some of my own questions:

Sockets connect to IP addresses only. Example pseudo-code: connect(fd, "[2a00:1450:400e:80d::200a]", 443). However, if a Wasm embedder wishes to block network access based on the host, they'll probably want to do so based on the domain name instead of meaningless IP addresses.

One way to implement this: let the Wasm module "prove" the relationship between a domain name and an IP address:

  1. The embedder has a rule that allows TCP connections to the domain name "example.com" on port 22.
  2. The Wasm module uses a t.b.d. WASI DNS lookup function to perform a lookup of "example.com", which resolves to 123.1.2.3.
  3. The embedder remembers the address in a dictionary, mapping resolved addresses back to their domain names.
  4. The Wasm module calls connect(fd, "123.1.2.3", 22)
  5. The embedder looks up the IP address in its dictionary and finds that the address is associated with "example.com". There is a rule allowing connections to "example.com" on port 22, so the connect call continues. Otherwise it fails.

This workflow is based on the presumption that applications usually perform a DNS lookup before connecting.
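The five steps above can be sketched in a few lines of Python. `DomainPolicy`, `fake_resolver` and the method names are hypothetical, and the resolver is a hard-coded stand-in rather than a real DNS client.

```python
class DomainPolicy:
    """Embedder-side bookkeeping for domain-based connect rules."""
    def __init__(self, allowed, resolver):
        self._allowed = set(allowed)  # {(domain, port)} rules (step 1)
        self._resolver = resolver     # callable: domain -> list of IP strings
        self._seen = {}               # ip -> set of domains that resolved to it

    def lookup(self, domain):
        """Steps 2 and 3: resolve and remember which domain gave which IP."""
        addresses = self._resolver(domain)
        for ip in addresses:
            self._seen.setdefault(ip, set()).add(domain)
        return addresses

    def may_connect(self, ip, port):
        """Steps 4 and 5: allow only if a remembered domain has a rule."""
        return any((d, port) in self._allowed
                   for d in self._seen.get(ip, ()))


def fake_resolver(domain):
    # Hard-coded answers standing in for a real DNS lookup.
    return {"example.com": ["123.1.2.3"]}.get(domain, [])


policy = DomainPolicy({("example.com", 22)}, fake_resolver)
```

Before `policy.lookup("example.com")` is called, `policy.may_connect("123.1.2.3", 22)` is false; afterwards it is true. Note that in this sketch an address shared by an allowed and a disallowed domain becomes reachable as soon as the allowed one has been resolved.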

How to decide which address(es) to bind on? Just assume 127.0.0.1 and/or 0.0.0.0 are available? Or expose an API to list the available network interfaces?

I guess just like any other piece of data:

  • hardcode it, or:
  • make it configurable (command line, config file, ...).

The embedder will block unwanted behaviour either way.

Allow multicast UDP?

Seems like a niche, but: sure. Maybe not in the first iteration.

Transparent TLS

Whether it is transparent or not, TLS seems like a large effort on its own. Not something to just "tack on" in a network proposal.

Having said that, it doesn't mean TLS can't integrate into a socket API. For example:

var sock1 = socket_create(cap);
socket_connect(sock1, ...);
tls_upgrade_client_socket(sock1, ...);
socket_send(sock1, "My private data");
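To show why an in-place upgrade pairs well with exposing the encryption state, here is a toy Python model. No real TLS happens; `Connection`, `upgrade_to_tls` and `send_credentials` are hypothetical names used only for illustration.

```python
class Connection:
    """Toy connection that tracks whether TLS has been negotiated."""
    def __init__(self):
        self.encrypted = False
        self.sent = []  # (was_encrypted, payload) pairs, in order

    def upgrade_to_tls(self):
        # A real implementation would run the TLS handshake here.
        self.encrypted = True

    def send(self, data):
        self.sent.append((self.encrypted, data))


def send_credentials(conn, secret):
    """Refuse to send secrets over a connection that is still cleartext."""
    if not conn.encrypted:
        raise RuntimeError("refusing to authenticate over cleartext")
    conn.send(secret)
```

The same handle is used before and after the upgrade, which matches STARTTLS-style protocols where a connection begins in cleartext and the application decides when authentication becomes acceptable.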

@ShadowJonathan

Small comment, but I personally find the "allow based on DNS domains, remember IP address behind domains, verify on that" method a bit unsound, and I have a feeling it can expose an attack surface to exploit (e.g. to "shift" or "trick" the DNS checker into believing that the resolved address is one that's allowed to be connected to, or something along those lines).

Couldn't the connect() function instead accept a union of u32 (for IPv4), 2xu64 (for IPv6), or string (or equivalent)? With a string, it would be a domain that is checked against that "allowed" list before a DNS lookup is made. This is safer, has less risk, and makes more sense as a design, IMO.
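A rough Python sketch of that idea: the target is either an IP literal or a domain string, and a domain is checked against the allow list before any resolution happens. `connect_target`, its parameters and the error messages are hypothetical, and the resolver is passed in by the caller.

```python
from ipaddress import ip_address


def connect_target(target, port, allowed_domains, allowed_ips, resolve):
    """Return the addresses to connect to, or raise PermissionError.

    allowed_domains: set of (domain, port) pairs
    allowed_ips:     set of ipaddress objects
    resolve:         callable mapping a domain to a list of IP strings
    """
    try:
        addr = ip_address(target)
    except ValueError:
        addr = None  # not an IP literal, so treat it as a domain name

    if addr is not None:
        if addr not in allowed_ips:
            raise PermissionError(f"address {addr} is not allowed")
        return [str(addr)]

    domain = target.lower().rstrip(".")
    if (domain, port) not in allowed_domains:
        # Rejected before any DNS traffic is generated.
        raise PermissionError(f"{domain}:{port} is not allowed")
    return resolve(domain)
```

Because the check happens on the name itself, the embedder does not need to remember earlier lookups, at the cost of a connect() signature that differs from BSD sockets.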

Also, I think TAPS might be onto something by separating the actual connection process from the capability/requirements aspect. I personally still don't know how a server would then "properly receive" those connections (if TAPS can make connection-method decisions based on a myriad of variables), but the prospect is interesting, certainly if it means (per that example) that SMTP/POP3 clients can become browser-native. That was a real eyebrow-raiser for me, as it could rapidly change the role the browser plays in experimenting and working with old APIs (maybe a new mail protocol will then finally be designed and implemented through these rapidly versioned web-based client libraries).

@unmellow

unmellow commented Sep 2, 2021

Whatever is decided upon for the primary API, I do think it would be good to have an API for managing a network device directly,

e.g. a standardized API for network card drivers.

In addition, programs like Lokinet implement TCP/UDP on top of their own protocols, in this case to support existing protocols on their mixed net by default.

Being able to write these once to support any Wasm engine that supports whatever we name these specialized APIs would be a game changer for allowing new high-level APIs to evolve quickly.

(Also wanted to mention distributed APIs like IPFS, BitTorrent and Hyperdrive. Who knows how they might evolve in the world of Wasm with APIs like this.)

@lukehinds

lukehinds commented Dec 22, 2021

How do we follow the progress of this design? Is there an API working group or another medium used to host discussion?

So far I have noted bytecodealliance/wasmtime#70 which has fizzled out with no progress since almost a year ago and this issue appears to be going the same way.

Perhaps something like https://github.com/WebAssembly/wasi-crypto should exist?

@sunfishcode
Member

A number of different people have started work on sockets at various points, and I don't know all the reasons why this work hasn't progressed. But I can explain some of the things in the broader context here.

One of the things that's happened in the WASI Subgroup is that we've learned a lot about the limitations of the witx format and associated tooling that WASI is currently built on, from a lot of feedback from different people building things. And, while witx attempted to anticipate where the interface types and module linking proposals were heading, those proposals have now evolved significantly, to the point where the current witx tooling is now out of date. In light of both of these, the WASI Subgroup meetings have been focusing on the next generation of API tools, called wit, which is coordinated with interface types and module linking, has more features, is more usable, and has a plan for async support. In particular, having a unified async story across WASI will allow us to design APIs that work with each other. Building on that foundation will get us to a much better place in the long run.

This is an evolving scene, and there's not a lot of documentation yet, and the tooling is still maturing. To follow the space, I encourage people to attend the Subgroup meetings if they can, and ask questions in the issue trackers here. Beyond that, lots of people are expecting to be writing lots more documentation soon. And, I'm expecting some major new proposals built on the new tools soon. And yes, proposals will have dedicated proposal repositories that people can follow and participate in.

@ShadowJonathan

Could you maybe point me to the "async" story? This sounds interesting from a number of perspectives for me.

Also, to summarize: work isn't being done on WASI networking because the core collaboration/definition tools are being reworked?

@sunfishcode
Member

The async functions and streams presentation which was recently presented in the WASI Subgroup is a starting point. The Component examples presentation to the WASI Subgroup gives some examples of what this might look like in practice.

Also, to summarize: work isn't being done on WASI networking because the core collaboration/definition tools are being reworked?

Work is being done; in the presentations I linked here, there are examples which involve networking. And as I mentioned above, there are some proposals in flight.

@badeend
Contributor Author

badeend commented Dec 23, 2021

@ShadowJonathan

Small comment, but I personally find the "allow based on DNS domains, remember IP address behind domains, verify on that" method a bit unsound, and I have a feeling it can expose an attack surface to exploit

I don't know if it is sound. Definitely something that needs extra investigation. I've created a small POC in C# to demonstrate the idea.

Couldn't the connect() function instead accept a union of u32 (for IPv4), 2xu64 (for IPv6), or string (or equivalent)? With a string, it would be a domain that is checked against that "allowed" list before a DNS lookup is made. This is safer, has less risk, and makes more sense as a design, IMO.

Definitely possible, but that would require all existing software written for BSD sockets to be modified.

I think it boils down to this: BSD sockets don't have a notion of domain names. So any solution that applies domain-name-based restrictions on sockets is going to be imperfect in one way or another. For example: my POC would not work for applications that ship their own DNS client.

@sunfishcode
To give these discussions a better home, I would be glad to create a Sockets proposal repo.
Should I just fork this repo? The current proposal process documentation is a bit sparse on this point.

@badeend
Contributor Author

badeend commented Dec 28, 2021

I've created an initial draft at: https://github.com/badeend/WASI-Networking

wenyongh pushed a commit to bytecodealliance/wasm-micro-runtime that referenced this issue Feb 23, 2022
Refer to [Networking API design](WebAssembly/WASI#370) and [feat(socket): berkeley socket API v2](WebAssembly/WASI#459)

Support the socket API of synchronous mode, including socket/bind/listen/accept/send/recv/close/shutdown,
the asynchronous mode isn't supported yet.
Support adding `--addr-pool=<pool1,pool2,..>` argument for command line to identify the valid ip address range.
And add the sample.
wenyongh added a commit to bytecodealliance/wasm-micro-runtime that referenced this issue Mar 10, 2022
Refer to [Networking API design](WebAssembly/WASI#370)
and [feat(socket): berkeley socket API v2](WebAssembly/WASI#459):

- Support the socket API of synchronous mode, including `socket/bind/listen/accept/send/recv/close/shutdown`,
    the asynchronous mode isn't supported yet.
- Support adding `--addr-pool=<pool1,pool2,..>` argument for command line to identify the valid ip address range
- Add socket-api sample and update the document
xujuntwt95329 pushed a commit to xujuntwt95329/wasm-micro-runtime that referenced this issue Mar 13, 2022
…#1036)

@badeend badeend closed this as completed Jul 9, 2022
@ShadowJonathan

@badeend how was this completed? Only bytecodealliance/wasm-micro-runtime#1036 was merged, which is not an API RFC or completed proposal to do networking?

@SamuraiCrow

Maybe he doesn't recognize the difference between WASI and WAMR.

@StevenACoffman

StevenACoffman commented Jul 10, 2022

I think this issue is closed as there is an initial design proposal here:
https://github.com/WebAssembly/wasi-sockets

If you are interested in tracking progress on networking for WASI, that design proposal is currently listed there as Phase 1, with the following note:

Phase 4 Advancement Criteria
At least two independent production implementations.
Implementations available for at least Windows, Linux & MacOS.
A testsuite that passes on the platforms and implementations mentioned above.

@SamuraiCrow

Oh.

victoryang00 pushed a commit to victoryang00/wamr-aot-gc-checkpoint-restore that referenced this issue May 27, 2024
…#1036)
