The point of a single address space OS is that you can pass pointers/references ...

zozbot234 · on March 31, 2022

Address space is entirely orthogonal to memory protection. You can have multiple protected tasks in a single address space, or multiple address spaces sharing blocks of physical memory among themselves with different virtual addresses, or any combination.

mike_hearn · on March 31, 2022

Yes, you could configure your memory maps so they never overlap and then call it a single address space, but if passing pointers between realms doesn't work then why bother? You didn't get any real benefit. The point of using a unified GC is that you can actually do this: just call a method in another protection domain, pass a pointer to a huge object graph, and you're done. There's no need for concepts like SHM segments or IPC marshalling. Even if you segmented your address space and then used classical process-like context switching, you'd still need all those things.

zozbot234 · on March 31, 2022

> but if passing pointers between realms doesn't work then why bother?

Because then it can work? It's a matter of what virtual addresses each "realm" has access to, either reading, writing or both.

mike_hearn · on March 31, 2022

I don't think I quite follow what you have in mind.

If there are two realms or protection domains or whatever we want to call them, but there is memory protection in place to prevent reading/writing of the others when one is active, you can pass a pointer from one to the other and the receiver can tell that the pointer doesn't belong to it. But the moment that receiver tries to read through it, it'll segfault. Or what are you imagining happens here?

It seems to me like to solve that you have to copy data, not pointers. Now you have marshalling.
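To make the copy-versus-pointer distinction concrete, here is a minimal Python sketch. It assumes JSON as a stand-in for an IPC marshalling layer; `send_by_copy` and the nested-dict graph are hypothetical names for illustration, not anything from the systems discussed here.

```python
import json


def send_by_copy(graph):
    """Stand-in for IPC: serialize the whole graph to bytes, then rebuild it
    on the 'receiving' side. The cost grows with the size of the graph."""
    wire_bytes = json.dumps(graph).encode("utf-8")
    return json.loads(wire_bytes.decode("utf-8")), len(wire_bytes)


sender_graph = {
    "value": 1,
    "child": {"value": 2, "child": {"value": 3, "child": None}},
}

# Passing a reference: both sides see the very same objects.
by_reference = sender_graph
by_reference["child"]["value"] = 20

# Marshalling: the receiver gets a detached copy of the data.
received, wire_size = send_by_copy(sender_graph)
received["child"]["value"] = 99  # the sender never sees this write

print(sender_graph["child"]["value"])  # 20
print(received is sender_graph)        # False
```

Once the receiver only holds a copy, any genuinely shared state needs some extra mechanism on top, which is where handles and proxies tend to come in.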

There's a second problem with trying to solve this with WASM - C and the associated ABIs don't handle evolution all that well. But part of what you need in a single address space OS is the ability for components to evolve independently, or semi-independently. In particular you need the ability to add fields to structures/objects without needing to recompile the world. Ideally, you'd even be able to modify structures in memory without restarting the software. Higher level VMs than WASM can do this because they provide higher level semantics for memory layouts and linkage. You can evolve a JAR in much more flexible ways than you can evolve a binary C/Rust module, or at least it's a lot less painful. That's why Win32 is full of reserved fields, and why most attempts at long-term stable C APIs end up as OO-style APIs like GObject or COM, in which structs are always opaque and everything has to be modified via slow inter-DLL calls to setters.
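A small sketch of the layout problem, using Python's `ctypes` to mimic C struct layouts. The `Rect*` structures and their fields are made up for illustration; the only assumption is that an old client keeps using the layout it was compiled against.

```python
import ctypes


class RectV1(ctypes.Structure):
    """Layout the old client was compiled against (hypothetical)."""
    _fields_ = [("width", ctypes.c_uint32), ("height", ctypes.c_uint32)]


class RectV2(ctypes.Structure):
    """New library layout: a field was inserted before `height`."""
    _fields_ = [
        ("width", ctypes.c_uint32),
        ("flags", ctypes.c_uint32),
        ("height", ctypes.c_uint32),
    ]


class RectReservedV1(ctypes.Structure):
    """Win32-style defence: ship spare space from day one."""
    _fields_ = [
        ("width", ctypes.c_uint32),
        ("height", ctypes.c_uint32),
        ("reserved", ctypes.c_uint32 * 2),
    ]


class RectReservedV2(ctypes.Structure):
    """Later version: one reserved slot is repurposed as `flags`."""
    _fields_ = [
        ("width", ctypes.c_uint32),
        ("height", ctypes.c_uint32),
        ("flags", ctypes.c_uint32),
        ("reserved", ctypes.c_uint32 * 1),
    ]


new_rect = RectV2(width=640, flags=7, height=480)

# An old client reinterprets the new bytes using the layout it was built with.
stale_view = RectV1.from_buffer_copy(bytes(new_rect)[: ctypes.sizeof(RectV1)])

print(RectV1.height.offset, RectV2.height.offset)  # 4 8: the field moved
print(stale_view.height)  # 7: the old client reads `flags` as the height
print(ctypes.sizeof(RectReservedV1), ctypes.sizeof(RectReservedV2))  # 16 16
```

With the reserved-field variant the size and existing offsets stay put, at the price of guessing up front how much spare room you'll ever need.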

legalcorrection · on March 31, 2022

I think the piece you're missing is the continuing role of the page tables or similar functionality in such systems. You can have a single address space, i.e. a particular address can only ever refer to the same memory, while still determining that only certain processes are allowed to access that address. In such a system, the page tables would always have the same mapping to physical addresses no matter what process you're in, but the read/write/execute bits on the page table would still change as you context switch.
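As a rough illustration of that model, here is a toy Python simulation. It is not how any real MMU or OS is programmed: the page numbers, the `domain_a`/`domain_b` names, and the `access` helper are all hypothetical, and a 4096-byte page is assumed.

```python
PAGE_SIZE = 4096  # assumed page size

# One translation shared by every domain: virtual page -> physical frame.
TRANSLATION = {0x10: 0xA0, 0x11: 0xA1}

# Per-domain permission bits for the same virtual pages ("r", "w", "x").
PERMISSIONS = {
    "domain_a": {0x10: "rw", 0x11: "r"},
    "domain_b": {0x11: "rw"},
}


class ProtectionFault(Exception):
    """Raised where real hardware would deliver a page fault."""


def access(domain, virtual_address, mode):
    """Translate an address for `domain`, checking its permission bits."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if mode not in PERMISSIONS[domain].get(page, ""):
        raise ProtectionFault(f"{domain} may not {mode} page {page:#x}")
    return TRANSLATION[page] * PAGE_SIZE + offset


# A pointer into a page that only domain_a has been granted.
pointer = 0x10 * PAGE_SIZE + 0x40
physical = access("domain_a", pointer, "r")

try:
    access("domain_b", pointer, "r")
    b_faulted = False
except ProtectionFault:
    b_faulted = True

# A page both domains can see resolves to the same physical memory.
shared = 0x11 * PAGE_SIZE
same_memory = access("domain_a", shared, "r") == access("domain_b", shared, "w")

print(b_faulted, same_memory)  # True True
```

The address means the same thing everywhere, but whether a given domain can dereference it depends entirely on the permission bits active at that moment.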

mike_hearn · on March 31, 2022

That's exactly what I understood from the proposal too, but I don't see why that is useful, nor why it'd be worth implementing with WASM.

Perhaps it's worth stepping back. The reason SASOSs are always written in managed languages like C# or Java [dialects] is that they're trying to solve several problems simultaneously:

1. IPC is slow due to all the copying and context switching.

2. Beyond being slow, it's also awkward to share data structures across processes. You need SHM segments, special negotiations, custom allocators that let you control precisely where data goes and which are thread safe across processes etc. Even if you do it, you need a lot of protocols to get memory management right, like IUnknown. So in practice it's rarely done outside of simple and special cases like shared pixel buffers.

3. Hardware processes conflate several different things together that we'd like to unpack, such as fault isolation, privacy, permissions etc.

4. It's hard to evolve data structures when code is compiled using C ABIs.

and so on.

Simply creating non-overlapping address spaces doesn't help with any of these things. Even if all you do on a context switch is twiddle permission bits, it doesn't matter: you still need to do a TLB flush and that's the bulk of the cost of the context switch, and at any rate, you can't just call a function in another address space. Even if you know its address, and can allocate some memory to hold the arguments and pointers, you can't jump there because the target is mapped with zero permissions. And even if you go via a trampoline, so what, the stack holding your arguments also isn't readable. If you fix that with more hacks, now any structures the arguments point to aren't readable and so on recursively.

So you end up having to introduce RPC to copy the data structures across. Well, now what if you want a genuinely shared bit of state? You need some notion of handles, proxies, stubs, and that in turn means you need a way to coordinate lifetime management so different quasi-processes don't try to access memory another process freed. That's COM IUnknown::AddRef. Then you need ways to handle loosely coupled components that can be upgraded independently. That's COM IUnknown::QueryInterface and friends. And so on and so forth.
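For readers who haven't dealt with COM, the bookkeeping looks roughly like the following Python sketch. It only mirrors the general shape of `AddRef`, `Release` and `QueryInterface`; the `SharedObject` class and the interface names are hypothetical, and real COM works with interface pointers and HRESULTs rather than Python objects.

```python
class SharedObject:
    """Hypothetical stand-in for a COM-style reference-counted object."""

    def __init__(self, interfaces):
        self._interfaces = set(interfaces)
        self._refs = 1  # the creator holds the first reference
        self.alive = True

    def add_ref(self):
        """Every new holder must announce itself."""
        self._refs += 1
        return self._refs

    def release(self):
        """Every holder must say when it is done; the last one frees."""
        self._refs -= 1
        if self._refs == 0:
            self.alive = False  # real code would free the memory here
        return self._refs

    def query_interface(self, name):
        """Return a new counted reference if `name` is supported, else None."""
        if name not in self._interfaces:
            return None
        self.add_ref()
        return self


obj = SharedObject({"IShape", "IShapeV2"})

client_ref = obj.query_interface("IShapeV2")  # newer component: supported
missing = obj.query_interface("IShapeV3")     # not implemented yet

obj.release()            # the creator is done with it
still_alive = obj.alive  # True: the client still holds a reference
client_ref.release()     # last holder gone, object is torn down

print(missing, still_alive, obj.alive)  # None True False
```

Every participant has to get these calls exactly right by hand; a missed `release` leaks and an extra one frees memory someone else is still using.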

In a SASOS all that goes away because the compiler and GC are tightly integrated, and they don't let you manufacture arbitrary pointers. You don't have to do refcounting or marshalling as a consequence, you can evolve the ABIs of components without breaking things, you can create capabilities easily and cheaply, and so on.

As discussed above, speculation is a pain because it lets you break the rule of not crafting arbitrary pointers, but there are caveats to that caveat. I'm not actually convinced Spectre kills SASOS, though you do need to do things differently.

legalcorrection · on March 31, 2022

Why can't it be that whatever procedure you use to give/get a pointer into another process also makes the necessary modifications to the page table? As you point out, this would become very tedious to the programmer if you just tried to bolt it on to current languages as a library, but I can imagine, e.g., a version of Java or C# that makes this all mostly seamless.

As for what the benefit is, I think you can at the very least get rid of needing to copy data back and forth.

Not that I'm an advocate for single address space OSes. I'd have to think about this more. You might be right. I'm playing devil's advocate to think it through, not to defend a position, if that makes sense.

mike_hearn · on March 31, 2022

Sure.

Well, pages are coarse grained so you'd end up giving the other quasi-process access to more stuff than it should have. And you'd have to flush the TLB so you pay the context switch cost anyway, at which point why bother? The reason operating systems make overlapping mappings is (classically) to enable better memory sharing due to not needing relocations or GOT/PLTs. That's all irrelevant these days due to ASLR (which doesn't even work that well anymore) but that was the idea.
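The granularity problem is easy to put numbers on. The sketch below assumes 4096-byte pages; the `exposure` helper and the example addresses are made up for illustration.

```python
PAGE_SIZE = 4096  # assumed page size


def exposure(address, size, page_size=PAGE_SIZE):
    """Return (pages_granted, bytes_granted, bytes_overexposed) when an
    object of `size` bytes at `address` is shared via page permissions."""
    first_page = address // page_size
    last_page = (address + size - 1) // page_size
    pages = last_page - first_page + 1
    granted = pages * page_size
    return pages, granted, granted - size


# A 64-byte object sitting inside one page: the whole page gets exposed.
print(exposure(0x1000, 64))            # (1, 4096, 4032)

# The same object straddling a page boundary exposes two pages.
print(exposure(0x7F00_0000_0FF0, 64))  # (2, 8192, 8128)
```

Unless the allocator deliberately segregates shareable data onto its own pages, whatever else lives on those pages comes along for the ride.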

You can do some tricks with special allocators that place the stuff you want to reveal in SHM segments, then blit the stack across to the other process, and it can work. I know this because I built such a system as my undergrad thesis project :)

https://web.archive.org/web/20160331135050/plan99.net/~mike/...

It's very ugly though. And again, you need to coordinate memory management. For FastRPC it didn't matter because it was used mostly for holding stuff temporarily whilst you called into a library, so the 'outer' process owned the memory and the 'inner' process just used it temporarily. I never did anything with complex lifetimes.

One way of thinking about it is to study the issues that cause people to write large JVM apps instead of large C apps. It's not any different at the systems level. They want GC that works across all the libraries they use, they want to be able to upgrade the backend compiler without frontend-recompiling everything, they want to be able to upgrade a library or change a memory layout without recompiling the world, and they don't want all the goop that is triggered by allowing libraries to use arbitrary compilers and memory management subsystems. Forcing a unified compiler and GC simplifies the inter-module protocols so drastically, the wins are enormous.

legalcorrection · on March 31, 2022

> Well, pages are coarse grained so you'd end up giving the other quasi-process access to more stuff than it should have.

Good point. Embarrassing oversight on my part. My whole mental model of how this would work has come crashing down. It's now obvious to me that to have it work the way I envisioned, you would need a managed-code-only environment that supervises all memory accesses.

> I know this because I built such a system as my undergrad thesis project

Very cool!

imtringued · on April 1, 2022

Isn't it still possible to cause corruption within Java? If your operating system is written in Java it will have some unsafe interfaces where programmer, hardware or concurrency bugs will then let you bypass security features.

Yes, this is a problem with all operating systems, and it is much less likely with Java than with C, but it feels to me like you are entirely reliant on the JVM producing "verified" code. It's kinda like how the Rust gang has to do formal verification to prove that their borrow checker is actually watertight, instead of just acting as if the programmer made sure it is watertight like in C land.

mike_hearn · on April 5, 2022

Yes, it gets you a long way but then on top you need things like IOMMUs to stop a driver mis-programming a hardware device and evading the memory safety that way. Fortunately all modern platforms have IOMMUs.

The point of managed code is that it compresses the space where (memory) safety problems can creep in down to a small core in the runtime, which gets really well tested and reviewed. Even if there are concurrency bugs in your user-level code, they can't corrupt memory the way C code can. They can only yield objects in an invalid state, which will hopefully be detected and cause exceptions very quickly. In turn you can then recover and try again, or let the user know. ConcurrentModificationExceptions (CoModExceptions) are a good example of that in action.
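To show the idea outside Java, here is a Python sketch of fail-fast iteration, loosely modeled on the modification-counter approach that `java.util` collections use. `FailFastList` and `ConcurrentModificationError` are hypothetical names, and the single-threaded mutation below just stands in for a concurrency bug.

```python
class ConcurrentModificationError(RuntimeError):
    """Python stand-in for Java's ConcurrentModificationException."""


class FailFastList:
    """Minimal list with a modification counter (illustrative sketch)."""

    def __init__(self, items=()):
        self._items = list(items)
        self._mod_count = 0

    def append(self, item):
        self._items.append(item)
        self._mod_count += 1  # every structural change bumps the counter

    def __iter__(self):
        expected = self._mod_count
        index = 0
        while index < len(self._items):
            # Detect the inconsistency and stop, instead of walking on.
            if self._mod_count != expected:
                raise ConcurrentModificationError("list changed during iteration")
            yield self._items[index]
            index += 1


items = FailFastList([1, 2, 3])
seen = []
try:
    for item in items:
        seen.append(item)
        if item == 2:
            items.append(4)  # structural change in the middle of iteration
    detected = False
except ConcurrentModificationError:
    detected = True

print(seen, detected)  # [1, 2] True
```

The bug still happened, but it surfaced as a catchable exception at the point of misuse, and the list itself is left in a valid state.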