Linus Torvalds said something about this in relation to microkernels. The gist is that the interactions between many pieces make the whole thing complex. Here's the quote from his book "Just for Fun":
"The theory behind the microkernel is that operating systems are complicated. So you try to get some of the complexity out by modularizing it a lot. The tenet of the microkernel approach is that the kernel, which is the core of the core of the core, should do as little as possible. Its main function is to communicate. All the different things that the computer offers are services that are available through the microkernel communications channels. In the microkernel approach, you’re supposed to split up the problem space so much that none of it is complex. I thought this was stupid. Yes, it makes every single piece simple. But the interactions make it far more complex than it would be if many of the services were included in the kernel itself, as they are in Linux. Think of your brain. Every single piece is simple, but the interactions between the pieces make for a highly complex system. It’s the whole-is-bigger-than-the-parts problem. If you take a problem and split it in half and say that the halves are half as complicated, you’re ignoring the fact that you have to add in the complication of communication between the two halves. The theory behind the microkernel was that you split the kernel into fifty independent parts, and each of the parts is a fiftieth of the complexity. But then everybody ignores the fact that the communication among the parts is actually more complicated than the original system was—never mind the fact that the parts are still not trivial. That’s the biggest argument against microkernels. The simplicity you try to reach is a false simplicity."
> "...The simplicity you try to reach is a false simplicity."
Also applies to some microservice architectures I've seen. People completely disregard the complexity (and overhead!) of the interactions between microservices.
I feel like I've become a crusader against microservices for the same reason.
They're so easy to set up and they immediately solve problems. But they also create many more, which aren't immediately obvious. And very few people are willing to say, "I was totally wrong to move to them, and let's spend some more precious time rolling them back."
As somebody who has built some stuff with microservices before, I think the key is that you don't view any architectural pattern as a silver bullet. All things come with their own trade-offs.
It can make total sense to break out certain parts into their own microservices if you have thought long and hard about the interface and what data is going to be passed around. But if you break things out into their own microservices just for the heck of it, you will end up in a mess very quickly.
Using microservices to solve problems which don't demand them is similar to using OOP patterns in places where they are known to bring pain: you are holding a hammer and you think the world is made of nails.
That being said, I am sure this has nothing to do with the real practicality of the underlying patterns; it just shows how easily people can lie to themselves.
This is a poor takeaway. Even in a monolith, you still want to separate your concerns, yes? Your code should be almost as abstracted in a monolith as it is in microservices; in microservices, the code is just deployed across multiple instances.
RPCs and local method calls both need to be fault-tolerant and free of race conditions. As you break up datastores, transactions become more complex, but presumably you had a specific reason to do that, so the complexity isn't an arbitrary choice.
Sure, the communication layer is added complexity, but that too should be abstracted into boilerplate such that you shouldn't have to think about it (something like the sketch below). Overall the added complexity requires more work, but it shouldn't really make your business logic problems more complicated.
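To make that concrete, here's a minimal Python sketch of the kind of boilerplate I mean (the service name, endpoint, and retry policy are all hypothetical): the business logic just calls send(), and the transport, timeouts, and retries live behind the client interface.

```python
import time

import requests  # assumes the hypothetical email service speaks plain HTTP/JSON


class EmailServiceClient:
    """Illustrative client wrapping a remote email microservice.

    Business logic only calls send(); it never sees URLs, timeouts, or retries.
    """

    def __init__(self, base_url, retries=3, timeout=2.0):
        self.base_url = base_url
        self.retries = retries
        self.timeout = timeout

    def send(self, to, subject, body):
        payload = {"to": to, "subject": subject, "body": body}
        last_error = None
        for attempt in range(self.retries):
            try:
                resp = requests.post(
                    f"{self.base_url}/send", json=payload, timeout=self.timeout
                )
                resp.raise_for_status()
                return resp.json()
            except requests.RequestException as exc:
                last_error = exc
                time.sleep(0.1 * (2 ** attempt))  # simple exponential backoff
        raise RuntimeError("email service unavailable") from last_error
```

Whether you retry, queue, or fail fast is a real design decision, but the point is that the calling code shouldn't be the place where that decision is visible.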
Microservices are a different kettle of fish, though.
Firstly, the benefits they offer often have little to do with the architecture itself, but with the bigger picture (separating teams, CIs, allowing different stacks, managing costs, scaling, etc.).
Secondly, unlike microkernels, not all microservices have to talk to every single other microservice. If you have a service to send emails, say, there'll be a few services that interact with it, but the majority won't. The same for an image resizing service.
We (~150 eng) build microservices in a polyglot environment (mostly Python, JS, and Go), all in a monorepo! We also build + deploy in containers with Jenkins, etc.
The structure looks something like this:
|-- third_party (third party libs)
`-- python
    |-- libs (internally-written, common dependencies)
    `-- application_name
        |-- client (react app, connects to flask)
        |-- server (flask app, connects to services)
        `-- services (microservices)
We use pants (https://www.pantsbuild.org/index.html), a dependency-management and build tool, to manage dependencies; we started with it before Bazel was public. Without pants, our repo would be a mess.
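Roughly, each library and service directory gets a BUILD file declaring its targets and dependencies, and pants builds from that graph. Here's an illustrative, Pants-v1-style sketch (the paths and target names are made up, and exact target types and fields vary by Pants version):

```python
# python/application_name/services/image_resizer/BUILD (hypothetical path)

python_library(
    name='lib',
    sources=globs('*.py'),
    dependencies=[
        'python/libs/logging_utils',    # shared internal lib (made-up name)
        'third_party/python:requests',  # pinned third-party requirement
    ],
)

python_binary(
    name='image_resizer',
    source='main.py',
    dependencies=[':lib'],
)
```

The win is that the dependency graph is explicit and versioned in the repo, so reproducible builds and "what depends on what" queries come mostly for free.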
Let me know if you have any questions, I'm happy to answer them! I'm super happy about our setup and eager to share it :)
Hahaha, because pants is a bad tool? Or because it sounds funny? I’m sympathetic to both :)
In defense of pants, I meant that our repo would be a mess without a versioned dependency graph + reproducible builds. Of course other tools give you that too, and definitely do it better than pants does.
I guess I should have said “without some build tool”, our repo would be a mess.
Hierarchical layout demands choices be made, but there are advantages in grouping source packages by language:
1) it can naturally reflect the packaging model of the target language (e.g. Python packages, JVM artifacts);
2) this can encourage reuse of packages across projects.
I agree with the GP that it seems weird, and with you that it has its benefits. Of course, other approaches have their own benefits.
Personally, I feel like the top-level directory ordering for a monorepo is somewhat arbitrary, in that you can argue for anything, but it probably doesn’t matter; especially if you have a decent build tool.
It boils down to gut instinct again, since there is no clear line for how much complexity to accrue before splitting something up. I think he likes to keep Linux in one repo because that makes it easier for him to watch the project and manage it. Currently, if Linux gets separated into 10 components, Linus will still have to keep an eye on all 10, so from his point of view the complexity is the same, or worse. But if he could actually let go and let someone else completely own another component, this would not be the case. The bottom line is that he is smart enough to manage a repo that large, which only proves the author's point.
If I'm not mistaken, another argument for microkernels was the isolation of modules. For example, if I'm using some driver X and it crashes, the rest of my system will continue to work fine. That's not the case with a monolithic kernel. I think this safety is pretty cool to have.
That said, Linux still probably does the best job of being stable compared to the other OSes I use (Windows, macOS). I can't recall the last time I got a kernel panic or crash (despite worse drivers in some cases).
I've been positively surprised by Windows in the presence of some issues with bad graphics drivers. On Windows, the screen flickers, the "guilty" app dies, and a popup appears in the corner: "sorry, the graphics subsystem had to be restarted". Whereas GPU driver issues on Linux typically leave you at the text console at best. (A driver going totally haywire can of course bring down both entirely.)
This is mostly because a long history of crummy graphics drivers on Windows led to countless bluescreens. In the past, if your Windows machine bluescreened, it was a safe bet that the graphics driver was the cause.
Microsoft had enough telemetry telling them this that they spent a large effort restructuring the graphics driver subsystem so that it could crash and burn and be restarted without affecting the rest of the system.
Although Linux already has the isolation, it doesn't have the clean recovery. Since the year of the Linux desktop hasn't arrived yet, Linux is yet to make this journey.
The advantage of microkernels is that they can be extended with “untrusted” code like hardware drivers or file systems. This runs in user space and thus any bugs in such code will not crash the kernel process.
So I agree with you that Linus is presenting a straw man and your comment shouldn’t have been downvoted.
> The advantage of microkernels is that they can be extended with “untrusted” code like hardware drivers or file systems. This runs in user space and thus any bugs in such code will not crash the kernel process.
Did this advantage play out in practice? If your filesystem module goes down then every module that talks to the file system module needs to gracefully handle the failure or it will still effectively crash the system.
Or the module core dumps and the system keeps chugging on, but everything is locked up because they're waiting for the return from the crashed module. Did MINIX have a way to gracefully restart crashed modules?
> Did this advantage play out in practice? If your filesystem module goes down then every module that talks to the file system module needs to gracefully handle the failure or it will still effectively crash the system.
If the file system process crashes, then in theory the OS would simply relaunch it (roughly the sketch below).
But your core services should be stable; it's more about extensions. For example, you may want to have virtual file systems (FTP, sshfs, etc.), which, before FUSE, wasn't possible in the non-microkernel world.
As for how it played out in practice: I think microkernels lost early on because of performance and things like FUSE were created to allow the most obvious extension mechanisms for the otherwise non-extendable monolithic kernels.
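For flavor, here's a toy Python sketch of the "just relaunch it" idea from above (purely illustrative; this is not how MINIX's reincarnation server is actually implemented): a supervisor spawns the service process and restarts it whenever it exits abnormally.

```python
import subprocess
import time


def supervise(cmd, restart_delay=1.0):
    """Keep restarting a service process whenever it exits with a non-zero status.

    Real systems (MINIX's reincarnation server, systemd, etc.) also deal with
    state recovery, restart limits/backoff, and notifying dependent services.
    """
    while True:
        proc = subprocess.Popen(cmd)
        code = proc.wait()
        if code == 0:
            break  # clean shutdown, stop supervising
        print(f"{cmd[0]} exited with code {code}, restarting...")
        time.sleep(restart_delay)


if __name__ == "__main__":
    supervise(["./file_server"])  # hypothetical user-space file server binary
```

As the sibling comments point out, the hard part isn't the restart loop; it's what happens to in-flight state and to the callers that were blocked on the crashed service.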
That's the theory yes, but I was asking about real life. Did those early microkernel systems actually deliver?
Also, for anything stateful, like a filesystem, simply relaunching it may not be sufficient. You need to make sure it hasn't lost any data in the crash and possibly rewind some state changes in related modules.
> That's the theory yes, but I was asking about real life. Did those early microkernel systems actually deliver?
According to Wikipedia “[MINIX] can also withstand driver crashes. In many cases it can automatically restart drivers without affecting running processes. In this way, MINIX is self-healing and can be used in applications demanding high reliability”.
While this kernel was originally written to teach kernel design, all Intel chipsets post-2015 are running MINIX 3 internally as the software component of the Intel Management Engine.
Another widely deployed microkernel is L4, I assume this has similar capabilities, as it also puts most things in user space and is used for mission critical stuff.
> Also, for anything stateful, like a filesystem, simply relaunching it may not be sufficient.
True, but simply rebooting when the kernel process crashes due to buggy driver code won’t be sufficient either :)
FYI when Apple introduced extended attributes their AFP (network file system) did have a bug that made the kernel (and thus entire machine) crash for certain edge cases involving extended attributes.
In that case, had their AFP file system been a user space process, I may still have lost data, but it would have saved me from dozens of reboots.
My nvidia driver regularly hangs my system every ~90 minutes or so, so I can certainly empathize with the goals & vouch that they still have a role today.
Please note that Linus wrote an operating system that in practice showed greater reliability than competing commercial microkernels. I do not believe that the principles that he came to believe in that process should be dismissed as straw man arguments.
> […] showed greater reliability than competing commercial microkernels
What is your basis for this claim?
I am only aware of QNX as a commercial microkernel (and real-time OS) and that is widely used in cars, medical devices, etc. with a strong reputation for reliability.
But for many tasks, Linux is good enough and free, which is hard to beat. But that does not mean that Linus is automatically correct in his statements.
According to the public advertising at the time, Windows NT, GNU Hurd, and Mach were all designed as microkernels. Mach of course is the basis for OS X.
At the same time that Windows NT was being claimed as a microkernel, Linux was outperforming it and had a reputation for being more reliable. Ditto with Mach. And GNU Hurd famously was hard to get running at all.
QNX is highly reliable, but is also a specialized use case.
Tell that to my (lack of) graphics drivers. You can say it's political, but as it stands it's nowhere near apples to apples in terms of what Windows supports vs. what Linux supports.
Which video card do you have that lacks drivers for Linux? Or do you need fully open source drivers that fully support 3D acceleration and computation?
And yet somehow Linux manages to run on a greater variety of hardware than Windows does.
I am of course including supercomputers, embedded hardware, and hand-held phones. Admittedly, Windows has greater support for consumer desktop hardware. But that has to do with how small the Linux market share is, and it is hardly an indictment of Linus' work.
In practice it was less useful than people assumed, because:
1. Things like drivers and filesystems are usually written by a small handful of vendors, who already have rigorous engineering cultures (hardware is a lot less forgiving than say web design), and a large base of demanding users who will rapidly complain and/or sue you if you get it wrong. When was the last time you personally had a crash due to a driver or filesystem issue? It used to happen semi-frequently in the Win95 days, but there was a strong incentive for hardware manufacturers to Fix Their Shit, and so all the ones who didn't went out of business.
2. You pay a hefty performance price for that stability - since the kernel is mediating all interactions between applications and drivers, every I/O call needs to go from app to kernel to driver and back again. There's been a lot of research put into optimizing this, but the simplest solution is just don't do this, and put things like drivers & filesystems in the kernel itself.
3. The existence of Linux as a decentralized open-source operating system took away one of the main organizational reasons for microkernels. When the kernel is proprietary, then all the negotiations, documentation, & interactions needed to get code into an external codebase become quite a hassle, with everyone trying to protect their trade secrets. When it's open-source, you go look for a driver that does something similar, write yours, and then mail a patch or submit a pull request.
It's only a false simplicity if you still need to track the interactions between everything and everything else.
When you break up a problem the goal is to find clear bottlenecks of complexity such that you can abstract a thing to its inputs and outputs and ignore the complexity within. You reduce the amount of knowledge required from any given perspective, thus reducing peak cognitive load.
Sure, the system as a whole is as complex, or possibly slightly more so, but there is a distinct advantage to reducing the peak complexity of any given sub-problem.
This is analogous to the debate about when to break long functions into shorter ones. The simplicity argument usually doesn't consider the increased complexity of all the interactions between the new, shorter functions. If you omit that, you get a misleading picture, or as Linus puts it, a false simplicity.
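As a trivial illustration (made-up code, not from the thread): splitting a function doesn't erase the complexity, it moves some of it into an interface you now have to reason about across boundaries.

```python
# Before: one longer function; all the state is local and visible at once.
def process_order(order):
    total = sum(item["price"] * item["qty"] for item in order["items"])
    if order.get("coupon") == "SAVE10":
        total *= 0.9
    return {"id": order["id"], "total": round(total, 2)}


# After: two "simpler" functions, but now there's an interface between them:
# what compute_total() takes, what it returns, and who applies the discount
# are contracts the reader has to track across the function boundary.
def compute_total(items, coupon=None):
    total = sum(item["price"] * item["qty"] for item in items)
    if coupon == "SAVE10":
        total *= 0.9
    return total


def process_order_split(order):
    total = compute_total(order["items"], order.get("coupon"))
    return {"id": order["id"], "total": round(total, 2)}
```

Neither version is wrong; the point is just that the interface between the pieces is part of the complexity budget, whether the pieces are functions, kernel servers, or microservices.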
"The theory behind the microkernel is that operating systems are complicated. So you try to get some of the complexity out by modularizing it a lot. The tenet of the microkernel approach is that the kernel, which is the core of the core of the core, should do as little as possible. Its main function is to communicate. All the different things that the computer offers are services that are available through the microkernel communications channels. In the microkernel approach, you’re supposed to split up the problem space so much that none of it is complex. I thought this was stupid. Yes, it makes every single piece simple. But the interactions make it far more complex than it would be if many of the services were included in the kernel itself, as they are in Linux. Think of your brain. Every single piece is simple, but the interactions between the pieces make for a highly complex system. It’s the whole-is-bigger-than-the-parts problem. If you take a problem and split it in half and say that the halves are half as complicated, you’re ignoring the fact that you have to add in the complication of communication between the two halves. The theory behind the microkernel was that you split the kernel into fifty independent parts, and each of the parts is a fiftieth of the complexity. But then everybody ignores the fact that the communication among the parts is actually more complicated than the original system was—never mind the fact that the parts are still not trivial. That’s the biggest argument against microkernels. The simplicity you try to reach is a false simplicity."