Hacker News | wh33zle's comments

If Rust ever gets native generator syntax, this might become achievable because one would be able to say `yield transmit` to "write" data whilst staying within the context of your async operation. In other words, every `socket.write` would turn into a `yield transmit`.

To read data, the generator would suspend (`.await`) and wait to be resumed with incoming data. I am not sure if there is nightly syntax for this, but it would have to look something like:

  // Made up `gen` syntax: gen(yield_type, resume_type)
  gen(Transmit, &[u8]) fn stun_binding(server: SocketAddr) -> SocketAddr {
      let req = make_stun_request();

      yield Transmit {
          server,
          payload: req,
      };

      let res = .await; // Made up "suspend and resume with argument" syntax.

      let addr = parse_stun_response(res);

      addr
  }


Rust has had native generator syntax for a few years FYI. It's what async-await is built on. It's just gated behind a nightly feature.

https://doc.rust-lang.org/stable/std/ops/trait.Coroutine.htm... and the syntax you're looking for for resuming with a value is `let res = yield ...`

Alternatively, there is a proc-macro crate that transforms generator blocks into async blocks so that they work on stable, which is of course a roundabout way of doing it, but it certainly works.


They actually play together fairly well higher up the stack. Non-blocking IO (i.e. async) makes it easy to concurrently wait for socket IO and time. You can do it with blocking IO too by setting a read-timeout on the socket, but using async primitives makes it a bit easier.

But I've also been mulling over how they could be combined! One issue I've arrived at is that async functions compile into opaque types. That makes it hard or impossible to use the compiler's facility for code-generating the state machine, because you can't interact with the state machine once it has been created. This also gets in the borrow checker's way.

For example, suppose I have an async operation with multiple steps (i.e. `await` points), but only one of those steps needs a mutable reference to some shared data structure. As soon as I express this as an `async` function, the mutable reference is captured in the generated `Future` type, which spans all the steps. As a result, Rust doesn't allow me to run more than one of those operations concurrently.

Normally, the advice in these situations is to hold the mutable reference for as short a time as possible, but in the case of async, I can't do that. Splitting the async function into multiple functions also gets messy and kind of defeats the point of wanting to express everything in a single function.


Yes, traffic is routed to the gateway through a WireGuard tunnel. Broadly speaking, what happens is:

- Client and gateway perform ICE to agree on a socket pair (this is where hole-punching happens or, if that fails, a relay is used)

- The socket pair determined by ICE is used to set up a WireGuard tunnel (i.e. a noise handshake using ephemeral keys).

- IP traffic is read from the TUN device and sent via the WireGuard tunnel to the gateway.

- Gateway decrypts it and emits it as a packet from its TUN device, thereby forwarding it to the actual destination.

It is worth noting that a WireGuard tunnel in this case is "just" the Noise Protocol [0] layered on top of UDP. This ensures the traffic is end-to-end encrypted.

[0]: https://noiseprotocol.org


Haha thank you!

Yes there are indeed similarities to rust-libp2p! Over there, things are more interleaved though because the actual streams and connections are still within `Future`-like constructs and not strictly split like in the sans-IO case here.


I tried to address this at the end of the post: If what you are implementing is mostly _sequential_ IO operations, then this model becomes a bit painful.

That isn't always the case though. In more packet-oriented use cases (QUIC, WebRTC & IP), doing the actual IO bit is easy: send & receive individual packets / datagrams.

There isn't really much the compiler can generate for you because you don't end up with many `.await` points. At the same time, the state management across all these futures becomes spaghetti code because many of these aspects should run concurrently and thus need to be in their own future / task.


In Firezone's case, things are built on top of UDP so technically there aren't any (kernel-managed) connections and only a single file descriptor is allocated for the UDP socket.

The main benefit is being able to use `&mut` everywhere: At the time when we read an IP packet from the TUN device, we don't yet know which gateway (exit node) it needs to go to. We first have to look at the user's policies and then encrypt and send it via a WireGuard tunnel.

Similarly, we need to concurrently receive on all of these tunnels. The tunnels are just a user-space concept though. All we do is receive on the UDP socket and index into the corresponding data structure based on the sending socket.

If each of these "connections" used its own task and UDP socket, we'd have to use channels (and thus copying) to dispatch packets to them. Additionally, the policy state would have to live in an `Arc<Mutex>` because it is shared among all connections.


I did some quick research and found that there is an "async job" API in OpenSSL. That one appears to do IO though; it even says that creating a job is a very expensive operation and thus jobs should be reused.

Is the similarity you are seeing that the work itself that gets scheduled via a job is agnostic over how it is executed?

From this example [0], that async API actually looks very similar to Rust's futures:

- Within a job you can access a "wait context"

- You can suspend on some condition

- You can trigger a wake-up to continue executing

[0]: https://www.openssl.org/docs/man1.1.1/man3/ASYNC_is_capable....


Yes, you're right, it's not entirely similar: it's not IO-less. But in async Rust (or any other stackless-coroutine runtime), IO should be bound to the scheduler. That is what allows IO events to call back into the scheduler and wake the tasks they are bound to. Exposing the state and pushing it manually is a good way to decouple IO from the scheduler.


Yes! Decoupling is the goal of this! Using non-blocking IO is still useful in this case because it means we can wait on two conditions at once (i.e. socket IO and time), see [0].

It is possible to do the same with blocking IO, but it feels a little less natural: you have to set the read-timeout on the socket to the time when you need to wake up the state machine.

[0]: https://github.com/firezone/sans-io-blog-example/blob/99df77...


If you don't mind me asking, how much did they pay you to build it?


My salary was about 30k. Best deal ever lol


With the software still running and being used, the company got a bargain.


The person you're replying to is not the GP


Per year, for those who are confused.


I really enjoyed Software Architecture: The Hard Parts.

https://www.oreilly.com/library/view/software-architecture-t...

One part it discusses is in fact communication patterns, along the dimensions of:

- sync/async

- atomic/eventually consistent

- orchestrated/choreographed


At the bottom is the link to the more technical specification: https://github.com/libp2p/specs/blob/master/kad-dht/README.m...

