The wild world of non-C operating systems (theregister.com)
332 points by dharmatech on March 31, 2022 | 308 comments



Also z/OS (formerly known as MVS) – large chunks of it are written in an IBM-internal-use-only PL/I dialect called PL/X.

The core of the AS/400 operating system, OS/400, the Licensed Internal Code (LIC), used to be written in another IBM-only PL/I dialect, PL/MP. In the 1990s, while porting the AS/400 from CISC to RISC, they rewrote all the PL/MP code into C++. But IBM has another private dialect of PL/I called PL/MI, which was used to write the higher-level part of OS/400, the XPF - basically, PL/MI compiled to the virtual machine bytecode, while PL/MP compiled to the machine code of the physical hardware. From what I understand, parts of the XPF are still written in PL/MI even in the latest versions of OS/400 (now called IBM i OS), although newer components tend to be developed in C/C++ instead. Other parts of the XPF were written in Modula-2, and I believe there is still some Modula-2 code surviving in the most recent versions as well.

IBM has had even more secret proprietary PL/I variants. The long-defunct IBM 8100 line’s DPPX operating system was written in PL/DS. Another dialect, PL.8, was used to write IBM’s compilers (although I believe it has now been replaced by C++) and also some of the firmware for the mainframes. IBM even wrote a GCC front-end for PL.8, although it appears they never released it.


> IBM even wrote a GCC front-end for PL.8, although it appears they never released it.

Oh, how I had hoped I had scrubbed PL.8 from my memory.

"80% of PL/I." Only missing everything that made it bearable to use PL/I. No interrupts. You get ITERATE but no LEAVE statement. No STATIC, EXPORT or PACKAGE. You do get DEFINE... But not the one in other IBM PL/? langs. Incompatible.

The two weeks I spent debugging an interbank transferring program IBM wrote for them decades ago was one of the least pleasant experiences I've had when actually having access to the source code (albeit printed on five reams of paper, rather than digital). Give me pre-standard COBOL or FORTRAN over PL.8, any day.

Though, to be fair, the version of PL.8 I was granted access to was the original compiler, only. Not the updated one used for z/Arch firmware, today. Which I think solves a lot of those things. And I do mean the original compiler. The very, very first version.

I said no interrupts? Most of IBM's PL/? langs will signal when they run out of memory, so you can do something intelligent to kick the never-allowed-to-go-down mainframe into gracefully continuing to crunch endlessly (like clearing up part of a queue, and re-requesting it from a previous point). That signal is missing. So instead, you have developers build their own systems of heuristics to guess if a memory allocation had failed. Heuristics that had begun to fail, after an old network card burned out and was replaced with a similar one... But a faster, more modern, one. Faster hardware is the bane of old, hard-coded software.

    DCL IntVar INTEGER BASED (Cursed)


> The two weeks I spent debugging an interbank transferring program IBM wrote for them decades ago was one of the least pleasant experiences I've had when actually having access to the source code (albeit printed on five reams of paper, rather than digital). Give me pre-standard COBOL or FORTRAN over PL.8, any day.

Interesting. I didn't realise PL.8 was used to write application software. I knew about it being used to write compilers, firmware, etc. If customer applications were written in it, did this mean they actually shipped PL.8 development tools to customers? I thought it was always IBM-internal-only.


> If customer applications were written in it, did this mean they actually shipped PL.8 development tools to customers?

Access to the compiler was by special request, as was the source code for the project. It wasn't something generally available, but for this specific project, and there was someone whose job it was to make certain I didn't copy anything or so on. I have no reason to believe the interbank software was normal for IBM in any way, shape or form.

I do have a suspicion as to why PL.8 was used instead of another language, and it does make a little bit of sense, but I'm afraid I don't believe that I'm currently allowed to reveal that. I've given what details I currently can.


Interesting story; I only know PL.8 from the public sources and always thought it started and died with the RISC project, which later gave birth to AIX on PowerPC.

Wasn't aware that it still had a life of its own after that.


Speaking of PL/I, Intel made a few operating systems in the late 70s / early 80s written in their dialect of PL/I called PL/M. There was ISIS, which ran on their microprocessor development stations, and iRMX, the real time O/S running on the target microprocessors.

Having used both, ISIS was quite a bit more advanced than 'competitors' like DOS and CP/M at the time.


> Speaking of PL/I, Intel made a few operating systems in the late 70s / early 80s written in their dialect of PL/I called PL/M

PL/M was created by Gary Kildall, creator of CP/M, and CP/M 1.x was written in PL/M. In CP/M-80 2.x, they rewrote the kernel (BDOS) and command processor (CCP) in 8080 assembly to improve performance, but utility programs were still written in PL/M. Some later CP/M ports, such as CP/M-68K and CP/M-8000, were written in C instead.

Wikipedia says that IBM used PL/M too – to write some of the firmware for the CISC AS/400s (the Service Processor).


I think those might be two different PL/Ms that were both born in the 01970s.


They are the same PL/M. Gary Kildall was working at Intel as a contractor when he invented the first version of PL/M. Intel adopted it for their own internal software development (ISIS, RMX, etc), Kildall used it for writing CP/M and other Digital Research products.

The compiler was forked into two versions – an Intel version and a Digital Research version – and they were mostly compatible, but each had some extensions the other lacked. As a result, code written for one couldn't always be compiled by the other.


Sorry, I was thinking of IBM's PL/M, which clearly wasn't the one Intel was using. Thanks for the correction.


Did IBM have a "PL/M"? IBM has had PL/MP and PL/MI but not (to my knowledge) a PL/M.

Wikipedia claims (without a source) that the CISC AS/400 Service Processor firmware was written in Gary Kildall's PL/M. The claim seems very plausible – the Service Processor is not the main CPU, it is a microprocessor used to control system startup/etc. I'm not sure what microprocessor was used in the CISC AS/400's Service Processor, but IBM could well have chosen an Intel CPU (such as 80x86) for it; and if they did, Intel's PL/M compiler would have had a logical attraction – at the time (the AS/400 came out in 1988 so this would have been some time in the 1980s) Intel was still actively supporting it as an embedded systems development language; it had similar syntax to the systems programming PL/I dialects which IBM's developers were using for other parts of the AS/400 (PL/MI and PL/MP); using Intel's PL/M compiler would have avoided the need for IBM to implement an 80x86 backend for their PL/whatever compilers.


I might be misremembering! My source is a friend of mine whose day job was maintaining DB2; about 15 years ago, she told me — if I'm not misremembering! — that DB2 was written in "PL/M", which was a variant of PL/I. But it wouldn't be terribly surprising if it was actually PL/MI or PL/MP or (as you suggest in https://news.ycombinator.com/item?id=30877856) PL/X and I just misheard. Or misremembered.

She was also born around the time DB2 was first released, but she did work for IBM.


I actually don’t mind PL/I. Always wondered why they needed a ton of derivative proprietary languages. I think I’d almost prefer language bloat.


PL/I is a kitchen sink language already.

Modern languages tend to minimize the amount of features that they provide in the language itself, and do as much as possible in their standard libraries.

(This also goes together with simplified syntax, because user-defined complex syntax is hard, slow and potentially ambiguous to parse, whereas reserving verbose, "readable" syntax for just a handful of language-provided features is just silly.)


>PL/I is a kitchen sink language already.

   Speaking as someone who has delved into the 
   intricacies of PL/I, I am sure that only Real Men 
   could have written such a machine-hogging,
   cycle-grabbing, all-encompassing monster. 

   Allocate an array and free the middle third? Sure! 
   Why not? 

   Multiply a character string times a bit string and 
   assign the result to a float decimal? Go ahead!

   Free a controlled variable procedure parameter and 
   reallocate it before  passing it back? 

   Overlay three different types of variable on the 
   same memory ___location? Anything you say! 

   Write a recursive macro? Well, no, but Real Men use 
   rescan. 

   How could a language so obviously designed and 
   written by Real Men not be intended for Real Man 
   use?
Heh. Heh. Heh.

[Source: http://www.anvari.org/fortune/Miscellaneous_Collections/3620... ]


    The design criteria are as follows:

    1. Anything goes. If a particular combination of symbols
       has a reasonably sensible meaning, that meaning will be
       made official.

https://dl.acm.org/doi/10.1145/363707.363708


Looks like a perfect language to write an OS in!


Multics was written in PL/1


And is still being updated!


I should have written is, not was.


Here is a trivia question for you. What language was the first optimizing compiler written in?


It was the initial IBM FORTRAN written in 704 assembler. Very much an optimizing compiler.


Not quite.

The idea of "basic blocks", key to optimizing compilers, was invented by the team that wrote the Fortran I compiler in Fortran I. See https://ucla-biostat-257-2020spring.github.io/readings/fortr...


The first Fortran I compiler is the one pinewurst was referring to, and it was written for the 704 -- in 704 assembly. Source code is here -- this is Fortran II (basically Fortran I plus provision for separately compiled subprograms), but obviously still assembly: http://www.softwarepreservation.org/projects/FORTRAN/index.h...

(The article in the course reading you cited also correctly states that the compiler was written in assembly, for the 704.)


That is why I posted the question about an "optimizing" compiler.


Not sure what you mean here... Fortran I had a damn good optimizer, and the papers on it introduced a whole lot of terminology (e.g., as above, "basic block") which is still in use. And it was written for the 704, in assembly.


IIRC this is the punchline of a story about Seymour Cray programming a BIOS in octal via front switches and blinkenlights. Or somewhere in that ballpark. It's a person, so the answer is DNA.


Nope. This happened before Seymour was at Control Data.


Surely a more accurate answer would be "Cray's bookshelf".


Last I heard DB2 was still written in PL/M too.


DB2 is really four different products marketed under one brand – three of which have some PL/I heritage, the fourth doesn't:

(1) DB2 for VM/VSE – the first to be released, as SQL/DS in 1981, its code was descended from IBM Research's System R research project, although most of that code was rewritten in the process of productisation

(2) DB2 for z/OS – the first to be called "DB2" (the others were later rebranded to that), released for MVS in 1983. I believe this started life as a fork of the SQL/DS code base, but then evolved in a different direction

(3) DB2 for IBM i – System/38 had an integrated database, but it was non-relational. In the process of evolving System/38 into the AS/400, IBM took the SQL layer out of SQL/DS (which was written in PL/S) and ported it to OS/400 (on which it would have been PL/MI code not PL/S, although those languages are very close to each other), but the lower layers of the stack (storage, etc) were kept from the System/38 non-relational database. In 1994, this was rebranded to "DB2/400", and more recently "DB2 for IBM i"

(4) DB2 for Linux/Unix/Windows (LUW) – this began life as OS/2 Extended Edition Database Manager. It was written in C/C++ from the very start, it has never contained any PL/I dialect code, although some ideas/concepts/etc were borrowed from IBM's mainframe/midrange RDBMS products. The code base was ported from its OS/2 birthplace to Unix (including AIX, Solaris and HP/UX), Windows NT, and Linux.

From what I understand (2) is still predominantly PL/X, although I would not be surprised if it now had some components in other languages such as C++. (1) is almost certainly mostly PL/X as well. (3) probably still contains PL/I-ish code too (albeit technically PL/MI rather than PL/X.)

(Disclaimer: I have no firsthand knowledge of this – I was born around the time SQL/DS and DB2 for MVS was first released, and I've never worked for IBM. It is just what I've pieced together from publicly available sources.)


And Smalltalk. But this was a great article which highlighted some of the creativity that got me interested in computers in the first place. Building an operating system isn't "hard", but the more you do, the more work it becomes. That is time consuming (but in a fun way, like reading books is time consuming).

Off and on I've experimented with a small event driven real time OS for my robots that started as a Forth kernel, switched over to a C kernel, and then to a Rust kernel. The experimental part has been looking at "infinite asynchronous threading" and it was hugely influenced by Rodney Brooks's subsumption architecture. My thesis being that organisms aren't driven by a single scheduler and a bunch of processes, so why should an OS be? (There are arguments pro and con on that one :-).)

Anyway, not everyone is an 'OS' type. At Sun we used to joke you were either an 'OS' type, 'Language' type, or a 'GUI' type. Those were the core areas of investigation in the software organization. I tend to be an 'OS' type with a bit of 'language' thrown in :-)


> organisms aren't driven by a single scheduler and a bunch of processes, why should an OS be

Sounds like neuromorphic computing, or clockless circuits in general. Really interesting stuff, I'm working on a programming language based on macro-expansion and dependency resolution with a scheduling model of "do everything as soon as you can, and not a moment sooner!"

Have you happened upon Andy Clark's book, Being There -- Putting Brain, Body, and World Together Again? It was very influential on how I think of computers as organisms that can process parallel stimuli.


This sounds a lot like what I was considering. Initially I had a uC/OS kernel and added the equivalent of sleep()/wakeup() and the AmigaOS message ports. QNX took messaging to the extreme (in my opinion :-) but it makes a good way to delineate taskable data.

Back at NetApp I was trying to accelerate file processing and realized that files (and mutations thereof) were just a state machine (and this isn't new or anything, log file systems are an extension of this observation), but the interesting thing was how you could project a sequence of state transitions and reliably determine whether it was possible to jump to the end state or not. (And yes, this is basically optimizing non-directed graph traversal using robot state, which can be reflected in external things like actuator position.) The games people are doing these sorts of things to animate elements of their game environments, which I find pretty cool too. Once my Tilt5 glasses arrive I expect I can play with this in 3Space using Unity and the SDK, which will make it less of a thought exercise and more of a visualization exercise.

Guessing that lookalive-software is your github, have you put up your thoughts anywhere? Would love to read through them.


thanks for the interest, the implementation has been in stealth mode but an old first draft description of the language is at https://lookalive.software/

Could you say what it means to 'project a sequence of state transitions' ? I am not hip to the CS lingo - you could say I am scanning the AST for tasks whose prerequisites include no further tasks, and replacing the branch of a completed task with its result.

Essentially all I'm doing is asking the coder to write an AST directly, except expressions can be mixed in with literals. I was just doing a kind of handlebars/mustache templating engine with JSON before realizing I had all the pieces for a general purpose language. My hello world(s) looks like:

  {"body": [
    {"#!cat planets.txt": [
      {"h1": "#!echo hello :0"}
    ]}
  ]}
The interpreter scans the JSON for the #! token and checks if all the parameters are literals, so first it hits #!cat and sees the argument 'planets.txt', OK, nothing further to compute, pass it off to a subshell and get the result, say, `mercury\nvenus\nearth\nmars`. Newline-separated results get interpreted as an array, and arrays are mapped over the next branch of the tree to get:

  {"body": [
    {"#= mercury": [
      {"h1": "#!echo hello :0"}
    ]},
    {"#= venus": [
      {"h1": "#!echo hello :0"}
    ]},
    {"#= earth": [
      {"h1": "#!echo hello :0"}
    ]},
    {"#= mars": [
      {"h1": "#!echo hello :0"}
    ]}
  ]}
At this point, there are four branches of the AST that have no shared dependencies, so my interpreter can solve them in parallel: before `#!echo` is ready, the colon marks a named (in this case numbered) argument to look up, so "#!echo hello :0" gets replaced with "#!echo hello mercury", and now all four echo branches can be forked in parallel. (If you're asking why I have to use a subshell and the echo program just to concatenate two strings, it's because the language has no builtins except #=, #&, #!, and #? (for label, lookup, solve, and switch). You can define functions, but I wanted to start off with builtins provided by the OS, communications overhead be damned.) The '#=' then collapses once its children are all literal, so the final state of the program is:

  {"body": [
    {"h1":"hello mercury"},
    {"h1":"hello venus"},
    {"h1":"hello earth"},
    {"h1":"hello mars"}
  ]}
Which can then be translated to HTML by `github.com/lookalive-software/elementary` -- besides the parallelism and declarative/functional style, my selling point is being able to step through the process, so the editor is able to rewind the state, adjust something, and resume execution. Every state is a new valid program (valid JSON too), so if you want to pause the program, upload it to a bigger computer, and resume, you can do that too.

Not sure if those were quite the thoughts you were looking for, but I've been meaning to start writing docs for the current implementation anyway, so thanks for the push. If you want to know more details about the queueing / scheduler, I need to write the docs for that too ;)


I’ve been on a Smalltalk binge the last few days and have really been enjoying this two hour long interview / demo with Dan Ingalls on Smalltalk and the Alto: https://youtu.be/uknEhXyZgsg


Thanks very much! Especially in the context that reading on as this thread rapidly tends towards "terrifying incomprehensibility", I take that as high praise.


What's "infinite asynchronous threading"?


DSP is a language and operating system (Forth-like) made in the Soviet Union. I believe it was originally crafted on ternary computation hardware but eventually transitioned to whatever was being used throughout the Union.

http://www.euroforth.org/ef00/lyakina00.pdf

Apparently they came to the syntax independently of Forth. However, my research shows that some material about Forth may have shown up in a journal around the same time DSP was made. Either way, the resemblance is uncanny. There is an interesting quote about how the design requirements manifested as the language: “DSP wasn’t invented, it was discovered” - I probably butchered it. But I’m pretty sure Chuck may have said it too, or at least agrees.

The point is, you reach this interface following a method of reaching for something akin to “irreducible computational language that makes sense between humans and machines.”


Hmm, they didn't list SerenityOS[0] under the C++-based operating systems for some reason... maybe it's still too under the radar?

[0]: https://serenityos.org/


Also, it's funny the editors of The Register didn't catch the omission of Serenity since The Register is definitely aware of Serenity.

Edit: More than just the editors missing it -- it's the same author!

https://www.theregister.com/2022/03/31/serenityos/


Yup, I was and am aware of it. :-)

However, AIUI, the Serenity kernel isn't C++ but C. I could be wrong.

As such, as this was specifically a piece about non-C OSes, I felt it didn't really fit.

Secondly, although now I've had a bit of exposure to it writing that 2nd article, nonetheless, SerenityOS -- impressive as it is -- remains a bit of a prototype sort of project, rather than anything actually deployed in production anywhere. The project pages note that there are issues getting it to run anywhere outside of QEMU.

Whereas, say, Oberon is pretty obscure, but various versions of Oberon run on bare metal, or as VMs under multiple other OSes, or both, and ETH very much ran Oberon in production on bare metal for non-technical staff for years. There are 3-4 different dialects of Oberon, it runs natively on other OSes, it has OSes designed solely to be VMs, and so on.

To draw a comparison: it straddles the spaces of Plan 9 (mainly intended as a native bare-metal OS) to Taos (primarily bare-metal but deeply cross-platform, no native binaries as such) to Inferno (deeply cross-platform, no native binaries as such, but mainly intended to run under other OSes).

Whereas in the space of, say, Rust OSes, there are so few that even niche ones that are in prototype stage are worthy of mention... IMHO.


Looks like C++ to me [0]!

And my point is that when you mention OSes like Mezzano (3k stars on GitHub, a dozen contributors [1]) and Redox (13k stars, 80 contributors [2]) but don't mention Serenity (18k stars, over 100 contributors [3] (GitHub limits this view to the top 100)), it seems funny.

In any case, it's your article and you can do what you'd like. I don't like people telling me I should change my articles.

[0] https://github.com/SerenityOS/serenity/tree/master/Kernel/Ar...

[1] https://github.com/froggey/Mezzano/graphs/contributors

[2] https://github.com/redox-os/redox/graphs/contributors

[3] https://github.com/SerenityOS/serenity/graphs/contributors


I don't think it's any more under the radar than Redox or Tock listed in the article.


Discussed yesterday:

SerenityOS Browser now passes the Acid3 test - https://news.ycombinator.com/item?id=30853392 - March 2022 (141 comments)

Lots more at https://news.ycombinator.com/item?id=30858768


Nor do they mention Google's upcoming Fuchsia OS, whose Zircon kernel is written almost entirely in C++. While not quite "beloved" yet, it generates substantial buzz every time it hits the HN front page.


I've been looking at it.

The difficulties for me right now are:

* it's not FOSS;

* it's not well-described in the literature (yet?);

* and ISTM that the project's goals seem to keep shifting.



Hmmm. My mistake, then.

I was and am keeping an eye on it anyway. As and when it gets to a more useful state, I will try it and probably write about it.


The article notably omits Multics, which was written in PL/I: https://en.wikipedia.org/wiki/Multics

Multics, first released in 1969, was a major influence on Unix. From the Wikipedia article:

> Multics was the first operating system to provide a hierarchical file system, and file names could be of almost arbitrary length and syntax. A given file or directory could have multiple names (typically a long and short form), and symbolic links between directories were also supported. ... It was also the first to have a command processor implemented as ordinary user code – an idea later used in the Unix shell. It was also one of the first written in a high-level language (Multics PL/I), after the Burroughs MCP system written in ALGOL.


I’m delighted they mentioned TAOS and (following through) the amazing Transputer architecture of the 1980s. Being a teen ‘interested’ in computers in the early 1990s I was fascinated by massively parallel architectures (Connection Machine 1/2 and Connection Machine 5, IBM’s SP/2 “scalable parallel” RS/6000 based machines, and the Transputer concept) and I’m still figuring out whether GPUs are true embodiments of that concept or not.


The Tilera [1] processors were a recent embodiment of the Transputer. (comedy option: the TIS-100 [2])

Modern GPU architectures are very much their own thing. They have more in common with Vector machines in how the ALUs are organized, but the macro control flow is more like Dataflow architectures.

[1] https://en.wikipedia.org/wiki/TILEPro64 [2] https://en.wikipedia.org/wiki/TIS-100


TIS-100 is my current favourite puzzle game, and the plethora of implementations on GitHub plus the various community resources (including a rather good subreddit) make it even more fun. If it sounds interesting, I recommend https://alandesmet.github.io/TIS-100-Hackers-Guide/ as a starting point.


Glad to hear it. :-)

Tilera was very cool. Maybe too much too young, though?

It was MIPS not Transputer based, AFAIK, but maybe I am misunderstanding you.

Both the Transputer and Alpha are CPU arches I think could still be usefully resurrected.

Re the Transputer, this earlier piece of mine may interest:

https://www.theregister.com/2021/12/06/heliosng/


"Massively parallel" was just a marketing buzzword that was used for a lot of different things. One thing to look for is whether all the compute elements run from the same clock or not.

CM1/2 is a SIMD machine that is mostly nothing more than an FPGA turned inside out, and in fact just an accelerator that cannot do anything useful standalone (control flow is mostly done in software on the frontend machine). In this regard it is somewhat similar to today's GPUs.

SP/2 is more or less the equivalent of the "Beowulf clusters" of turn-of-the-century Slashdot fame, that is, a bunch of computers with full OS instances connected together by some reasonably fast network. But it was done a few years earlier, on somewhat specialized IBM hardware (and, well, the Myrinet of the early large Linux-based clusters was kind of funky specialized hardware too, with those 18-pair Cat5 cables...).

CM5 is weird, as it is a cluster of more or less off-the-shelf SPARC nodes with several independent communications networks, one of which is capable of doing a bunch of "collective communications" (reduce, broadcast...) operations in hardware. As with CM1/2, it does not run any kind of OS and is mostly managed by a set of frontend computers that do the job setup and network configuration for each new job.

And the Transputer is not just a concept: there was real hardware, and the solar system is full of things that are directly descended from that hardware. The issue with Transputers as an HPC platform (even for the early 90s) is that the nodes are too small and slow, and have too slow an interconnect, to be meaningfully usable as a general-purpose HPC platform, because you end up spending a significant fraction of the potential performance on housekeeping tasks. (Notice that the above-mentioned CM1/2/5 have some out-of-band method for the frontend to manage the nodes and even directly access their local memory; on Transputers this had to be done through the main communication network and supported by software running on the nodes themselves, which makes even booting the thing an interesting exercise in parallel programming.)


I'd have to dig out the documentation, but I think Tilera addressed some of the Transputer's link limitations by having five (and six in the TilePro64) separate communications meshes between the cores.

There's one mesh each for: memory access, packet transfers, user data network, cache misses, and IPC. The Pro added a cache coherency mesh too.

Sadly, Tilera was bought by EZChip, who were bought by Mellanox, who were bought by NVIDIA, and the Tilera processors don't seem to be available or being updated.


We asked Walter Bright at Handmade Seattle [0] what he thinks of a future with non-C ABIs. He makes the case we must all accept: C is deeply entrenched and all-encompassing.

That's not to discourage creative efforts -- it's more like "be aware this is 10x bigger than Mt. Everest"

[0] https://vimeo.com/652821195#t=10m52s


Likely C will be supplanted by something that isn’t even trying to supplant it.


It's funny that's usually how it goes.


RPC via Flatbuffers or Cap'n proto? Those aren't aiming at replacing C ABIs but they could.


MCP (as in the Burroughs OS) stood for Master Control Program. The villain in Tron was named for the OS. The technical consultant for Tron was none other than Alan Kay — himself a huge Burroughs fan.


Yeah! Also I think Alan Kay's wife (Bonnie MacBird) was the writer for the movie: https://en.wikipedia.org/wiki/Bonnie_MacBird


Yep. They met and fell in love while working on Tron. Doubtless Bonnie thought that the MCP Alan was talking about made a cool villain name. Also, Tron's programmer Alan Bradley is named after Kay.

The major conflict in Tron is also themed after Kay's thought: Whom should a computer serve, its users or the bureaucracy that administrates it? Kay believed in user-centric computing, and was profoundly influential in making that happen, but it's clear that this struggle still goes on today.


> MCP (as in the Burroughs OS) stood for Master Control Program. The villain in Tron was named for the OS.

I specifically said this in the article:

> B5000's MCP or Master Control Program. (Yes, the same name as the big baddie in Tron.)


MCP written in some version of Algol, iirc.


ESPOL, a systems dialect of ALGOL.

Uniquely for a machine at the time, you didn't program a Burroughs in assembly. The tools just weren't available. Everything from the kernel to user applications was written in some HLL.


Initially, yes; it was later replaced by NEWP, which ClearPath MCP uses to this day.


This article misses a ton. In addition to all the ones mentioned already, what about all the early operating systems written in assembly, like Apple DOS or Commodore DOS, and the Pascal family of OSes like Mac OS Classic and Toro Kernel?



I specifically said, in the article:

> This is not intended to be a comprehensive list.

And that is because:

> There are too many such efforts to count

And that I was:

> intentionally excluding early OSes that were partly or wholly written in assembly language.


If they're discussing historical OS's, which I think they were, I thought VMS was written in BLISS. That one was sorta important, and I recall "Ignorance is BLISS, BLISS is ignorance" T-shirts.


Hi. Author here.

It's not all historical stuff by any means; in fact, I tried to skew it towards stuff from the recent or modern era and things that are still in maintenance, still being worked on, or actively sold.

And, yes, I did consider VMS, but the article was getting too long already and while BLISS does qualify, it's also relatively obscure. Maybe I made the wrong call there: it's in current use, on sale, and about to become generally available on x86-64. Ah, well.


Google sea that openvms is still used. Talk about inertia…


"Google sea"?

VMS is alive, well, and about to ship a new version.

https://vmssoftware.com/about/openvmsv9-1/

It is exceptionally solid, it has the best clustering ever invented for any OS in history, and above all: if it ain't broke, don't fix it.


Great job, thanks. I’ve been around awhile, but I’d not heard of about a third of those systems.


Oh cool! Thank you!


The article didn't list JNode, but it's also a pure Java OS.

I noticed in another thread that a few people seem to think you can't implement an entire operating system in a GCd language like Java or C#, but that isn't true. You can do it like this:

1. Use an ahead of time compiler to compile the base runtime/kernel image to static machine code. In managed language operating systems like JNode or Singularity there isn't a strong distinction between kernel and language runtime, so this is the same thing. This base needs to contain at least a GC and probably a JITC, as well as some basic support services. This can itself be written in the managed language.

2. Write a very short bootstrap loader in assembly like every OS has, which sets things up enough to jump into the entry point of that runtime.

3. Writing a compiler in a managed language is easy enough but what about a GC? To do this you teach the compiler that some methods are 'magic' and shouldn't be treated as normal method calls. Instead they become either compiler intrinsics that are translated to pure assembly e.g. to read/write raw memory locations, or they are compiled in special ways for example to remove GC safe points.

The current best example of this in action is the GC in SubstrateVM, which is the "VM" compiled into any program AOT compiled with the GraalVM native image tool:

https://github.com/oracle/graal/tree/master/substratevm/src/...

If you flick through it you'll see various annotations and types you wouldn't normally see in Java, like `Pointer` and `@Uninterruptible`. These are recognized by the compiler and affects how the machine code is generated. The language is the same, so all existing tools continue to work - it's not a dialect or subset of Java, it's the full thing, just with slightly modified rules for how the final generated code behaves.
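
To make the "magic methods" idea concrete, here is a rough sketch (not actual SubstrateVM source; the packages follow the GraalVM sources but are internal details, and the method bodies are invented purely for illustration) of what such code can look like. The compiler lowers `Pointer` reads/writes to plain loads and stores, and `@Uninterruptible` tells it not to insert GC safepoint checks:

    import org.graalvm.word.Pointer;
    import com.oracle.svm.core.annotate.Uninterruptible;

    final class RawMemorySketch {
        // No safepoint may be placed in here, so this can run while the
        // collector is in the middle of moving objects around.
        @Uninterruptible(reason = "Writes raw memory during GC.")
        static void zero(Pointer base, int bytes) {
            for (int offset = 0; offset < bytes; offset += 8) {
                base.writeLong(offset, 0L);   // compiled to a plain store
            }
        }

        @Uninterruptible(reason = "Reads an object header word.")
        static long readHeaderWord(Pointer objectStart) {
            return objectStart.readLong(0);   // compiled to a plain load
        }
    }

To ordinary Java tooling this is just a class with two static methods; only the Graal/SubstrateVM compiler gives the annotation and the `Pointer` type their special meaning.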

SubstrateVM has one more trick up its sleeve to break the circularity: some objects can be initialized and persisted to an "image heap" at build time. In other words, the GC code can use Java classes to structure itself, despite being unable to allocate.

And that's all it needs.

There have been efforts to do things like this in the past for full operating systems. They have nice properties: for example you can sandbox drivers, IPC overhead goes away because it's all in a single address space, capabilities actually work and are pervasive, and it's quite easy to keep ABIs stable. There are usually a few sticking points that prevent them taking off:

1. Historically, GCs have either been good at latency or throughput but not both simultaneously. Some of them also had trouble with large heaps. That's a problem because virtually all reasonable computing requires a mix of programs with wildly different sensitivity, most obviously, developers want latency to be prioritized for editing in their IDE but throughput to be prioritized for their build system. If you have one GCd heap for the entire computer then you have two options:

1a. Pick one algorithm to manage it.

1b. Do what Singularity did and come up with a quasi-process notion in which each unit has its own GCd heap. Singularity had a nice concept called an 'exchange heap' which allowed objects to be passed between these quasi-processes very fast and cheaply, whilst ensuring that object graph could only be pointed to by one unit at once. This made IPC tremendously cheap, allowed one unit to be paused for GC whilst other units ran, and let them use IPC all over the place. However it did reduce the benefits of using managed languages somewhat as it reintroduced complex data ownership rules.

NB: This is changing now with tech like HotSpot ZGC and Shenandoah (which are written in C++ though). They drive latencies through the floor, it's basically pauseless, and the new not yet released fully generational variants have very good throughput too. Also, G1 GC has monstrous throughput even with low pause times, they just aren't quite as low as ZGC/Shenandoah.

2. Overheads of managed languages are higher. The successor at MS Research to Singularity was codenamed Midori and not much was ever published about it publicly, but from what was written (by Joe Duffy) it seemed apparent that they went down a rabbithole of trying to make Midori have the same raw efficiency and overhead as C++ based Windows. They got a long way but ended up not having any interesting enough new features to justify the investment and the project was eventually canned.

3. All the same problems new operating systems always have: no apps, drivers etc.

4. Spectre attacks make single address space operating systems more complex. The new support for memory protection keys in Intel CPUs could re-awaken this area of OS research however because MPKs let you block speculation attacks within a single address space.


Regarding Midori, besides the blog posts, Joe Duffy did two talks about the subject,

"RustConf 2017 - Closing Keynote: Safe Systems Software and the Future of Computing by Joe Duffy"

https://www.youtube.com/watch?v=CuD7SCqHB7k

"Safe Systems Programming in C# and .NET"

https://www.infoq.com/presentations/csharp-systems-programmi...

In one of them, he mentions that even with Midori proving its value to the Windows team, they were quite dismissive of it.

It appears to have also been yet another victim of the usual DevDiv vs WinDev politics.


> 4. Spectre attacks make single address space operating systems more complex.

I will say, unequivocally, that Spectre actually makes process isolation in single address space operating systems impossible on modern hardware. There is too much speculation and too many leaks, and it's not just branches. We wrote a whole paper about it a few years back.

https://arxiv.org/abs/1902.05178


I believe that paper predates the introduction of speculation-blocking MPKs. Could you build a single-address space OS out of those, without hitting problems with Spectre attacks? It's an open research question but my gut says yes. MPKs are limited so you may need an equivalent of swapping with fallback to page table based isolation, but it's worth noting that in a SASOS the notion of a process is unpacked, so you can then add on top newly defined hardware enforced privacy domains that don't cleanly map to any existing notion of a process.

For example all code from a particular vendor (origin) could share a single MPK whilst running, even if the failure ___domain for things like fault isolation is finer grained.


> I believe that paper predates the introduction of speculation-blocking MPKs

That isn't enough, because you can induce misspeculation through paths that do (or would) have access to appropriate MPKs and do almost anything you want, including disclosing information through sidechannels you do have access to. Loading an MPK is akin to changing address space protections; it has to be a hard barrier that cannot be speculated through. You cannot even have code mapped that would have access to those MPKs, as you can induce misspeculation into this code.

> It's an open research question but my gut says yes.

My gut says no. There is just too much dark stuff going on in hardware. Sidechannels are outside of models. You can't verify anything until you have not the model, but the whole design of the chip.

Also, variant 4 is not addressed much in the literature. I couldn't write what I wanted to write because of NDAs, but I have personally written PoCs that basically say hardware has to turn off memory speculation or you end up with a universal read gadget again. There is no software solution for variant 4.


I'm told, but haven't verified, that in older Intel CPUs loading an MPK wasn't a speculation barrier, but in newer CPUs it is. In other words changing your current MPK is like changing address spaces but much faster because there's no TLB flush or actual context switch.

I think there are also other caveats to consider. A lot of Spectre research (like your paper) is implicitly focused on the web renderer/V8 use case but here we're discussing theoretical operating system designs. Let's say you sandbox an image decoder library in-process, using type security and without using MPKs. Is this useless? No, because even if the image decoder is:

a. Maliciously doing speculation attacks.

b. Somehow this doesn't get noticed during development.

... the sandbox means it won't have any ability to make architectural-level changes. It can spy on you but its mouth is sealed; it doesn't have the ability to do any IO due to the architectural sandbox. To abuse Spectre in this context would require something really crazy, like trying to speculatively walk the heap, find the data you're looking for, encrypt it, steganographically encode the result into the images it decodes, and then hope that somehow those images make it back to the attackers even though the destination is probably just the screen. This isn't even NSA-level stuff, it's more like Hollywood at that point.

Compare to the situation today: the image decoder (or whatever) is a buggy C library running in a process with network access because it's a web origin. Game over.

I worry that the consequence of Spectre research has been that people conclude "in-process sandboxing is useless, cross-process is too hard, oh well, too bad so sad". Whereas in reality in-process sandboxing even without MPKs or equivalent would still be a massive win for computer security in many contexts where Spectre is hardly your biggest problem.


> I worry that the consequence of Spectre research has been that people conclude "in-process sandboxing is useless, cross-process is too hard, oh well, too bad so sad". Whereas in reality in-process sandboxing even without MPKs or equivalent would still be a massive win for computer security in many contexts where Spectre is hardly your biggest problem.

Well, I agree that in-process sandboxing is still quite useful; it at least closes the barn door. But the rest of the conclusion is not what we made in Chrome; we had to go whole-hog multi-process for site isolation. That and just moving as much as possible out of the renderer process so that there aren't many secrets left to steal.

It's really an issue for situations where a process (or platform) is required to run untrusted code from lots of different sources. There isn't a software solution that is robust to side channels yet. They can still spy on each other. Clearly, two important cases that Google cares about are Cloud and the web.


> I noticed in another thread that a few people seem to think you can't implement an entire operating system in a GCd language like Java or C#, but that isn't true.

Both Smalltalk and LISP were used to write operating systems decades ago.


There's a USENIX article from a year or two ago about the "Biscuit" research OS, which was written in golang.

There were clear compromises & performance degradations related to using a GC'ed language, but it definitely was interesting enough that you hope for more.


Mike, this is very well-written. Thank you. I grok maybe 40% of what you are saying which is a testament to what you typed out.

Is there a situation where, if your sticking points could be addressed or tolerated, using a GC'd language could shine vs a non-GC'd one?


Sorry it's only 40% :) If you like I can elaborate.

I think there are many cases where GC'd languages can benefit tasks traditionally thought of as 'systems' tasks (I don't personally recognize much distinction between systems programming and non-systems programming). The Graal project is probably the most successful in this area. Like Midori it came out of long term corporate R&D, but unlike Midori it successfully shipped impactful software. They also published papers which the Midori guys never did.

Graal is an implementation of a JIT and AOT compiler for many languages, both source code and bytecode based. It can run not only JVM bytecode but e.g. Python, Ruby, LLVM bitcode, WASM, and a whole bunch more, which makes a JVM powered by it easily the most polyglot compiler and runtime system in history. Languages aren't translated to JVM bytecode, which would introduce impedance mismatches, but rather fed to the compiler via a rather interesting technique called partial evaluation of interpreters.

Graal came out of asking the question, what if we wrote a JVM in Java? How can we do that, and if we can, are there benefits? Well, just rewriting something in Java doesn't give any benefits to the end users, just the developers, so to justify this you really have to find ways to really boost productivity a lot and then use that to yield interesting end-user features as well. Midori seems to have failed at this, Graal succeeded.

IMO the primary benefits Graal gets out of being written in a high level statically typed managed language are:

1. All the usual benefits like GC, great libraries, rock solid refactoring IDEs.

2. Annotations and the associated annotation processing/reflection infrastructure. The part that makes Graal polyglot is called Truffle and Truffle relies on this feature extensively.

3. Medium-level program input representation. One thing the Java world got really right is the clean separation between frontend and backend compilers via a stable bytecode format. Bytecode is not too high level, so you don't have to think about the complexities of evolving syntax when reading it, but also not too low level. Graal is of course itself expressed using bytecode and the compiler exploits this all over the place, for instance, it can parse bits of itself into compiler graphs at runtime and then use them as templates. They call this snippets and they use it as a way to lower high level constructs into lower level ones. It's really neat and a boost to compiler development productivity.

4. Related to that, Truffle relies heavily on that sort of meta-circular reflection capability. Truffle is the way Graal can compile languages that aren't JVM bytecode based. You write an AST interpreter for the language in any language that can produce bytecode (but in reality it's always Java because you need fairly precise control over the output; Kotlin might also work). The interpreter uses the Truffle API to express the AST, in particular it uses lots of annotations. The resulting interpreter is a normal Java program that can run on anything that can interpret bytecode, but the Graal compiler has plugins that recognize when it's compiling a Truffle program and handle it in special ways.

This ability to construct large, complex API surfaces that trigger special compiler behavior is one of the huge productivity wins that allowed the Graal team to add so many languages with such high runtime performance, so fast and so cheaply. It's like compiler intrinsics but taken to the next level, and then the next level again. And the end result is real benefits delivered to end users, for instance, Shopify uses TruffleRuby to get better performance for components of their web app.
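
As a flavour of point 4 above, here is a rough sketch of a Truffle-style AST node, written in the spirit of the Truffle DSL (the class is made up and the exact DSL details vary between Truffle versions). The `@Specialization` methods are plain Java; the DSL's annotation processor generates the node implementation that picks a specialization based on the runtime types seen so far, and Graal can then partially evaluate it into fast machine code:

    import com.oracle.truffle.api.dsl.Specialization;
    import com.oracle.truffle.api.nodes.Node;

    // Hypothetical "add" node for a guest language.
    public abstract class AddNode extends Node {
        public abstract Object execute(Object left, Object right);

        @Specialization
        long addLongs(long left, long right) {
            return left + right;
        }

        @Specialization
        String addStrings(String left, String right) {
            return left + right;
        }
    }

A real interpreter has many more such nodes plus frame handling, but the shape is the same: ordinary annotated Java that the compiler treats specially.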


Truffle/Graal sounded almost like magic when we first looked at it. Its partial evaluation was one of the best efforts we've seen for speeding up R, which is notoriously averse to being compiled. (https://www.graalvm.org/r/)


I think that there might be space for a separate article on "OSes implemented in GCed languages", but I do not yet know enough to write it.


> you can sandbox drivers, IPC overheads goes away because it's all in a single address space, capabilities actually work and are pervasive, and it's quite easy to keep ABIs stable.

You can do these things with WASM, which has no GC and simply manages memory segments on a per-module basis.


The point of a single address space OS is that you can pass pointers/references between modules without requiring any form of marshalling, nor introducing memory safety problems.

WASM cannot do this because it's meant to be a target for C and C-like languages. Either you have to make one WASM address space contain all your software, in which case it can easily corrupt itself internally (the sandbox is useless because it contains everything), or, you have to re-introduce a process-like concept that gives you isolated memory spaces, at which point it's not a single address space OS anymore.


Address space is entirely orthogonal to memory protection. You can have multiple protected tasks in a single address space, or multiple address spaces sharing blocks of physical memory among themselves with different virtual addresses, or any combination.


Yes, you could configure your memory maps so they never overlap and then call it a single address space, but if passing pointers between realms doesn't work then why bother? You didn't get any real benefit. The point of using a unified GC is that you can actually do this: just call a method in another protection ___domain, pass a pointer to a huge object graph, and you're done. There's no need for concepts like SHM segments or IPC marshalling. Even if you segmented your address space and then used classical process-like context switching, you'd still need all those things.


> but if passing pointers between realms doesn't work then why bother?

Because then it can work? It's a matter of what virtual addresses each "realm" has access to, either reading, writing or both.


I don't think I quite follow what you have in mind.

If there are two realms or protection domains or whatever we want to call them, but there is memory protection in place to prevent reading/writing of others when one is active, you can pass a pointer from one to the other and the other knows it's not belonging to itself. But the moment that receiver tries to read it, it'll segfault. Or what are you imagining happens here?

It seems to me like to solve that you have to copy data, not pointers. Now you have marshalling.

There's a second problem with trying to solve this with WASM - C and the associated ABIs don't handle evolution all that well. But part of what you need in a single address space OS is the ability for components to evolve independently, or semi-independently. In particular you need the ability to add fields to structures/objects without needing to recompile the world. Ideally, you'd even be able to modify structures in memory without even restarting the software. Higher level VMs than WASM can do this because they provide higher level semantics for memory layouts and linkage. You can evolve a JAR in much more flexible ways than you can evolve a binary C/rust module, or at least it's a lot less painful, which is why Win32 is full of reserved fields, and most attempts at long-term stable C APIs are full of OO-style APIs like GObject or COM in which structs are always opaque and everything has to be modified via slow inter-DLL calls to setters.


I think the piece you're missing is the continuing role of the page tables or similar functionality in such systems. You can have a single address space, i.e. a particular address can only ever refer to the same memory, while still determining that only certain processes are allowed to access that address. In such a system, the page tables would always have the same mapping to physical addresses no matter what process you're in, but the read/write/execute bits on the page table would still change as you context switch.


That's exactly what I understood from the proposal too, but I don't see why that is useful, nor why it'd be worth implementing with WASM.

Perhaps it's worth stepping back. The reason SASOSs are always written in managed languages like C# or Java [dialects] is that they're trying to solve several problems simultaneously:

1. IPC is slow due to all the copying and context switching.

2. Beyond slow it's also awkward to share data structures across processes. You need SHM segments, special negotiations, custom allocators that let you control precisely where data goes and which are thread safe across processes etc. Even if you do it, you need a lot of protocols to get memory management right like IUnknown. So in practice it's rarely done outside of simple and special cases like shared pixel buffers.

3. Hardware processes conflate several different things together that we'd like to unpack, such as fault isolation, privacy, permissions etc.

4. Hard to evolve data structures when code is compiled using C ABIs.

and so on.

Simply creating non-overlapping address spaces doesn't help with any of these things. Even if all you do on a context switch is twiddle permission bits, it doesn't matter: you still need to do a TLB flush and that's the bulk of the cost of the context switch, and at any rate, you can't just call a function in another address space. Even if you know its address, and can allocate some memory to hold the arguments and pointers, you can't jump there because the target is mapped with zero permissions. And even if you go via a trampoline, so what, the stack holding your arguments also isn't readable. If you fix that with more hacks, now any structures the arguments point to aren't readable and so on recursively.

So you end up having to introduce RPC to copy the data structures across. Well, now what if you want a genuinely shared bit of state? You need some notion of handles, proxies, stubs, and that in turn means you need a way to coordinate lifetime management so different quasi-processes don't try to access memory another process freed. That's COM IUnknown::AddRef. Then you need ways to handle loosely coupled components that can be upgraded independently. That's COM IUnknown::QueryInterface and friends. And so on and so forth.

In a SASOS all that goes away because the compiler and GC are tightly integrated, and they don't let you manufacture arbitrary pointers. You don't have to do refcounting or marshalling as a consequence, you can evolve the ABIs of components without breaking things, you can create capabilities easily and cheaply, and so on.

As discussed above, speculation is a pain because it lets you break the rule of not crafting arbitrary pointers, but there are caveats to that caveat. I'm not actually convinced Spectre kills SASOS though you do need to do things differently.


Why can't it be that whatever procedure you use to give/get a pointer into another process also makes the necessary modifications to the page table? As you point out, this would become very tedious to the programmer if you just tried to bolt it on to current languages as a library, but I can imagine, e.g., a version of Java or C# that makes this all mostly seamless.

As for what the benefit is, I think you can at the very least get rid of needing to copy data back and forth.

Not that I'm an advocate for single address space OS's. I'd have to think about this more. You might be right. I'm playing devil's advocate to think it through, not to defend a position, if that makes sense.


Sure.

Well, pages are coarse grained so you'd end up giving the other quasi-process access to more stuff than it should have. And you'd have to flush the TLB so you pay the context switch cost anyway, at which point why bother? The reason operating systems make overlapping mappings is (classically) to enable better memory sharing due to not needing relocations or GOT/PLTs. That's all irrelevant these days due to ASLR (which doesn't even work that well anymore) but that was the idea.

You can do some tricks with using special allocators for the stuff you want to reveal that places the data in SHM segments, then blit the stack across to the other process and it can work. I know this because I built such a system as my undergrad thesis project :)

https://web.archive.org/web/20160331135050/plan99.net/~mike/...

It's very ugly though. And again, you need to coordinate memory management. For FastRPC it didn't matter because it was used mostly for holding stuff temporarily whilst you called into a library, so the 'outer' process owned the memory and the 'inner' process just used it temporarily. I never did anything with complex lifetimes.

One way of thinking about it is to study the issues that cause people to write large JVM apps instead of large C apps. It's not any different at the systems level. They want GC that works across all the libraries they use, they want to be able to upgrade the backend compiler without frontend-recompiling everything, they want to be able to upgrade a library or change a memory layout without recompiling the world, and they don't want all the goop that is triggered by allowing libraries to use arbitrary compilers and memory management subsystems. Forcing a unified compiler and GC simplifies the inter-module protocols so drastically, the wins are enormous.


>Well, pages are coarse grained so you'd end up giving the other quasi-process access to more stuff than it should have.

Good point. Embarrassing oversight on my part. My whole mental model of how this would work has come crashing down. It's now obvious to me that to have it work the way I envisioned, you would need a managed-code-only environment that supervises all memory accesses.

>I know this because I built such a system as my undergrad thesis project

Very cool!


Isn't it still possible to cause corruption within Java? If your operating system is written in Java it will have some unsafe interfaces where programmer, hardware or concurrency bugs will then let you bypass security features.

Yes, this is a problem with all operating systems, and it is much less likely with Java compared to C, but it feels to me like you are entirely reliant on the JVM producing "verified" code. It's kinda like how the Rust gang has to do formal verification to prove that their borrow checker is actually watertight, instead of just acting as if the programmer made sure it is watertight like in C land.


Yes, it gets you a long way but then on top you need things like IOMMUs to stop a driver mis-programming a hardware device and evading the memory safety that way. Fortunately all modern platforms have IOMMUs.

The point of managed code is that it compresses the space where (memory) safety problems can creep in down to a small core in the runtime, which gets really well tested and reviewed. Even if there are concurrency bugs in your user-level code they can't corrupt memory in the manner C code can do - just yield objects in an invalid state that will hopefully be detected and cause exceptions very quickly. In turn you can then recover and try again or let the user know. ConcurrentModificationExceptions are a good example of that in action.


jOS was another Java OS.

http://jos.sourceforge.net/


Which was the other thread you read?


Fun to think that ClearPath is the oldest operating system in production, dating from 1962, and that it's still fully supported.

Comes close to another one that also predates C, IBM’s z/OS, which has parts written in their own assembly dialect, as well as PL/1 and others.

Next week we'll probably see a new version of it with support for the new z16 (based on the Telum chip they showed at the last Hot Chips).


Porting a mention from the other submission of this article:

It misses the big name of Multics, written in PL/I, and direct inspiration (both positive and negative) for Unix


Yes it does. I had reasons. I'm not saying that they're great reasons, but...

* because it's not obscure

* because I don't know much about it

* because once you open the can of worms labelled "PL/?" then it sort of explodes in your face, with fun aspects touched on in other comments, such as there being at least 2 different languages called PL/M, but PL/I and PL/1 are the same, and...

It's complicated, and I don't know enough about it... yet.


Fortunately Multics is one of the easier ones to learn about from that list :)

Thanks to <http://multicians.org> (which gathers information about Multics including source code) and <http://ban.ai> (public access Multics system)


This took me down memory lane but in a weird way. One of the first languages I really took to was vb6. I was absolutely convinced I could write an OS with that language... I tried and tried - really not knowing what I was doing and finally realized its limitations. Such a good lesson on using the right tool for the job.


> C fans tend to regard BCPL as an unimportant transitional step, but it was used in two celebrated OSes. One was the TRIPOS operating system, later used as the original basis of AmigaOS.

The latter is a bit of an exaggeration. Tripos related code written in BCPL was used for "AmigaDOS" - meaning the filesystem driver, the command line shell, a few bits of process control, and a variety of generic command line utilities. It was not used for the kernel (Exec), or graphics support, or the gui library (Intuition), or the graphical shell (Workbench), or lower level device drivers, which were all written in C or 68K assembly.

The Tripos-derived parts were nice though - generally nicer than the MS-DOS equivalent and somewhat friendlier than the Unix command line, even if the filesystem was a little slow, comparatively speaking.


I know. Almost every mention in the article is to some degree an oversimplification, though. It was either that or write a book.

Re Tripos: see this thread of comments...

https://twitter.com/Resuna/status/1509569315892346890

You may also enjoy this earlier piece:

https://www.theregister.com/2021/12/06/heliosng/


I remember the original MacOS being a mix of 68K and Pascal.

I tended to use MPW C and 68K in those days, and recall the stack sizes being different between C and Pascal.


Yes it was, but large parts were assembler, and I therefore excluded it for the reasons given in the last paragraph of the article.


I think it should count since it didn't provide the C stdlib and so wasn't "really" in C even when it was. There was no printf or malloc, instead lots of MoreMasters().


Hey, look, FWIW, I welcome more info in this area and if you want to go write your own article or blog post covering the systems that I omitted and you feel I should have included, please do! I'd love to read something like that, and if there's a future installment in this (very loose) series, which I hope there will be, then I will mention and link to your work.

As I said, though, in this version, I intentionally omitted OSes partly or largely written in assembler.

I also didn't cover a load of OSes written in very obscure languages, proprietary languages, or dead languages, as well as languages and OSes I just don't know much about.

As I've said elsewhere in these comments: if I tried to cover everything equally, it wouldn't be an article, it would be a book.

And while I'd quite like to write that book, it'd be a very niche one that would sell about a dozen copies.


It would be fun to try to make a Scheme-based kernel/OS. My reasoning is that the older revised reports document a much smaller Scheme language than R6RS and R7RS, so it would be easier to implement in asm. Then once you have a working Scheme you can build on top of it.


And start bootstrapping with SectorLISP: https://justine.lol/sectorlisp/


The article falsely claims, "Rust is eclipsing C++". "Rust is desperately chasing after C++" would be accurate.

It also fails to mention Apollo Aegis, coded in Apollo's extended Pascal, which both at introduction and at retirement was more advanced than Unix. Some key features still are not seen in Linux or most BSDs.

And, it fails to mention SerenityOS, in C++.


It depends on what they mean by popularity. If they mean % of developers using Rust vs C++ they are wrong. But I teach students both C++ and Rust, and by far Rust is preferred. It’s never even a contest.

C++ is long in the tooth, and if you want to talk desperation, C++ is really struggling with its identity now that Rust is on the scene. Before there were some pretty good arguments to use it, but in light of Rust, the best argument for C++ is just that it has a larger ecosystem (which says more about how old C++ is than how good it is as a language). That argument becomes less convincing with each passing day.

IMO in 2022 there’s not a good reason to start a new project in C++ unless that’s all you know, or you have some constraint tied to C++.


> C++ is long in the tooth

C++ is mature, there's a difference. It's definitely not "long in the tooth" as it still has plenty of modern features & niceties and is getting more all the time.

> C++ is really struggling with its identity now that Rust is on the scene

C++'s identity struggles have nothing to do with Rust whatsoever. It's struggling to balance ABI & historical compatibility with moving the language forward, that's about it. And that's just something many mature languages face at some point (see Python3 for example).

Beyond that struggle of supporting legacy code, there's no identity crisis with C++ at all.

> the best argument for C++ is just that it has a larger ecosystem (which says more about how old C++ is than how good it is as a language). That argument becomes less convincing with each passing day.

It's still an incredibly convincing argument. Rust has decent C interop, but it's not especially seamless to do safely. And Rust's interop with C++ is rather complicated. So if you have any C++ dependencies, you're almost certainly far better off to just go with C++. Even if you just have C dependencies and someone else hasn't yet made a crate for it, you might still be better off with C++.

Particularly since many of the Rust crates that bind to popular C & C++ libraries are not owned or maintained by the library in question. So you're entering into a bit of a sketchy maintenance world. And there's already been churn & abandonware here, for example the various projects (including one from Mozilla/Servo!) to bind to libpng that were then later all abandoned, and you're supposed to migrate to rewrites of PNG support in Rust like LodePNG. This is not ignorable churn & risk.

It can be fun, and if it's a hobby project absolutely go for it. But if you're on the clock to ship & maintain something, it's a risk you must consider quite seriously.


An "unsafe" set of calls from Rust to C is still no less unsafe than calling C from other C or from C++. The biggest difference is in programmer expectation, which is why it's so unfortunate when a "safe" wrapper has bugs. Rust can't fix bugs in the underlying C library either, of course. That's one place where Mozilla's recent foray into FFI via compiling to WASM and then compiling the WASM to native code is very interesting to me.


But making that unsafe Rust block safe to call from safe Rust isn't the most trivial of things to do. Conforming to all the safety requirements is still a thing to deal with, otherwise you contaminate all your safe Rust code.

It's not an unreasonable burden for what you get in return, but it is still a burden.


That burden exists if you're calling into it from C, it's just implicit. At least Rust gives you the tools to write safe wrappers. Once you've done that consumers of your library don't have to tiptoe around the possibility of unsafe behavior.
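
For readers following along, the shape of such a wrapper looks roughly like this - a minimal sketch, assuming a hypothetical C function `c_sum(ptr, len)` linked into the build, rather than any real library:

    use std::os::raw::{c_double, c_ulong};

    // Hypothetical C side: double c_sum(const double *ptr, unsigned long len);
    // (This only links if the corresponding C object file is part of the build.)
    extern "C" {
        fn c_sum(ptr: *const c_double, len: c_ulong) -> c_double;
    }

    /// Safe wrapper: the slice guarantees a valid pointer and a correct
    /// length, so callers of `sum` never have to write `unsafe` themselves.
    pub fn sum(values: &[f64]) -> f64 {
        // Safety: `values` is a live, contiguous run of `len` doubles for the
        // duration of the call, which is all the C function requires.
        unsafe { c_sum(values.as_ptr(), values.len() as c_ulong) }
    }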


> That burden exists if you're calling into it from C, it's just implicit.

That point is absolutely crucial. It's not just implicit, it's also often undocumented. Recently, I tried to call LLVM's ORC2 JIT functions from two threads concurrently – an interface which was designed to be thread-safe [1]. And yet, actually doing that resulted in non-deterministic crashes and failed assertions. Guess it wasn't thread-safe after all... None of the types or function prototypes gave any indication that they weren't thread-safe, not to mention the documentation. And that's an extremely popular open-source project! The reality for most other C++ code looks even worse.

[1] https://llvm.org/docs/ORCv2.html#features


This is my experience as well. I feel like a lot of people who are experts at C++ and have been using it for literal decades now really fail to appreciate how hard it is to learn C++ in 2022. They have all of this stuff in their heads already and it's easy, so the vast amount of undocumented knowledge is invisible to them.


> C++ is mature, there's a difference. It's definitely not "long in the tooth" as it still has plenty of modern features & niceties and is getting more all the time.

C is mature. C++ drinks and sleeps around, but that hardly makes it mature.


I mostly agree. C++ is mature, has standards, and has a whole array of production-ready libraries. If I wanted to build a production cross-platform GUI app today and not use Electron (vomit), I would likely use Qt for it. Rust has nothing quite ready to offer there yet.

However, my overall view is that Rust is the right direction. If I had to improve C++, it would look a lot like Rust: move bounds checking and ownership into the compiler, not the template machinery; remove classes; introduce traits; give C++ concepts that don't require me to type every punctuation symbol on my keyboard to use; and make things const by default.

Getting the compiler to do some of this checking by default is exactly what I want. If we can have an Ada/Spark style subset via Kani, also done at compile time to check correctness, even better.

> It's still an incredibly convincing argument. Rust has decent C interop, but it's not especially seamless to do safely.

I would make this statement slightly stronger. If you interop with C code and there are bugs in C code, all bets are off. You don't get any magical protection because your main routine is written in Rust. Cross-language safety, e.g. C++/Javascript, for example, is a very difficult problem. Here's a discussion of C++ and Rust https://dl.acm.org/doi/fullHtml/10.1145/3485832.3485903 If you want a tl;dr of that paper, it is that memory corruptions in C or C++ code sharing the address space of safe Rust code can make safe Rust code unsafe too. There have been plenty of attempts to study for example the interactions between browser C++ and the various runtimes a browser hosts.

The compact you are making with the Rust compiler when you type 'unsafe' is that you are saying you have verified the correctness of that code: that it is in fact already safe and the compiler should not (probably because it cannot) check it. The same is true of any FFI code you use. However, this is not how many people actually _use_ Rust. Here for example is a 2 year old unfixed segfault in a Rust crate caused by interfacing with non-Rust code: https://rustsec.org/advisories/RUSTSEC-2020-0159. This a) demonstrates the issue and b) demonstrates that random crates pulled off crates.io can contain safety issues for 2+ years with no fix.


> I mostly agree. C++ is mature, has standards and a whole array of production ready libraries. If I wanted to build a production cross platform GUI app today and not use electron (vomit) I would likely use Qt for it.

C++ is mature? Every couple of years a new C++ standard emerges which breaks old programs. Qt? Which version? 3? 4? 5? 6?


Each is more mature than the previous one. But, they do not break old code. Some old code, though, started out broken, and C++ is not called upon to fix it.


> C++ is long in the tooth

Welcome to the world of any language that the cool guys don't use anymore :-)


If you have a very deep understanding of C++ then you are potentially yourself a good reason to start a project in C++ and this statement is true of any language. Knowing the language very well can mean more time focusing on the actual goals and less time messing around with incidental language things. C++ is not sufficiently bad that mastery of it doesn’t trump other concerns for personal projects. Even in companies you are often likely to find more C++ expertise than Rust expertise.


I don't think you understand just how much of your system and the libraries out there are C/C++. Popularity isn't a measure of entrenchment. C/C++ will likely exist forever because the industry isn't going to rewrite every single piece of the system into Rust. And it isn't like you can just go to one community and ask them to switch. These are huge federations that talk to each other using C/C++ bindings and have hashed out protocols over years and years of board meetings. I think Rust has a place. We can build new things in Rust or replace some things with Rust. But acting like everyone should learn Rust and C/C++ has no value is dangerously ignorant.


C library bindings can be used directly in Rust, and there are facilities to improve interfacing with C++ as well.


Try to use those facilities with CUDA or Unreal.


> IMO in 2022 there’s not a good reason to start a new project in C++ unless that’s all you know, or you have some constraint tied to C++.

I personally don't agree. I can start with performance, continue with a better compiler ecosystem, and add suitability for low-resource applications. And I'm barely scratching the surface here.

Rust is not bad, but it's not a replacement of C++. They aspire to be different things.


When is C++ higher-performance than Rust?


Rust's lack of fast thread-local storage makes it a non-starter for me.

It's really disappointing, especially when the language has incorporated so many other excellent performance improvements like Google's SwissTable and pattern-defeating quicksort.

https://github.com/rust-lang/rust/issues/29594

https://matklad.github.io/2020/10/03/fast-thread-locals-in-r...


FWIW, Rust does have fast thread locals, but only in nightly. https://github.com/rust-lang/rust/issues/29594
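
For context, the stable route today is the `thread_local!` macro, which pays a lazy-initialisation check on every access; as I understand it, the nightly `#[thread_local]` attribute tracked in that issue is what gets you C `__thread`-style codegen. A minimal sketch of the stable form:

    use std::cell::Cell;

    thread_local! {
        // Each thread gets its own COUNTER. Every access goes through a
        // lazy-init check, which is the overhead being complained about.
        static COUNTER: Cell<u64> = Cell::new(0);
    }

    fn bump() -> u64 {
        COUNTER.with(|c| {
            c.set(c.get() + 1);
            c.get()
        })
    }

    fn main() {
        println!("{}", bump()); // 1
        std::thread::spawn(|| println!("{}", bump())) // 1 again: a fresh per-thread copy
            .join()
            .unwrap();
    }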


How many companies are running Rust nightly in production?


According to the 2020 Rust Survey[0], a decent amount. The majority is on stable rust, but a bit less than a third of rust users are on nightly.

The company I work at uses a pinned nightly rust in order to access some soon-to-be-stabilized features that simplify our life a bit. We update our pinned nightly in lockstep with stable rust releases. In practice, we've almost never had problems with using nightly rust - those are still very well tested, and problems are caught early and fast. The Rust test suite is quite thorough!

However, we do try to avoid using features that don't have a clear trajectory towards stabilization, so for us to consider thread_locals, we'd need to have some very solid proof that it would fix some critical performance problems.

I suspect every org will have their own policy, and while the majority will use stable rust, it's not like nightly rust is completely unthinkable.

[0]: https://blog.rust-lang.org/2020/12/16/rust-survey-2020.html


This survey says about half of the responders are using Rust in production (or at work in some capacity). It doesn't say what percentage of those who were surveyed AND use Rust at work are ALSO using stable vs nightly.


Yup, this is the #1 annoying performance issue in Rust. It would probably require a few language extensions to make it work safely, but nothing insurmountable, I don't think.


There's no placement new IIRC so you always build on the stack and copy it to the heap.

https://users.rust-lang.org/t/how-to-create-large-objects-di...


You can do "placement new" (Rust has no new operator, but in this context) unsafely with MaybeUninit -- https://doc.rust-lang.org/stable/std/mem/union.MaybeUninit.h...

Make a MaybeUninit<Doodad> on the heap, initialise it, and then unsafely assume_init(), remembering to write your safety justification ("I initialised this, so it's fine"), and you get a Doodad on the heap.

The reason it isn't mentioned in that 2019 u.r-l.o post is that MaybeUninit wasn't stabilized until mid-2019.
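
For the curious, here's one way to spell that idea using only stable APIs (raw allocation rather than Box::new_uninit, which wasn't stable at the time of writing; note that the obvious Box::new(MaybeUninit::uninit()) can still round-trip through the stack, which is the very thing being avoided). `Doodad` is just the hypothetical large type from the comment above:

    use std::alloc::{alloc, handle_alloc_error, Layout};
    use std::ptr;

    // Hypothetical large type we'd rather not build on the stack first.
    struct Doodad {
        buf: [u8; 1 << 20],
    }

    fn boxed_doodad() -> Box<Doodad> {
        let layout = Layout::new::<Doodad>();
        unsafe {
            // Grab uninitialised heap memory directly.
            let p = alloc(layout) as *mut Doodad;
            if p.is_null() {
                handle_alloc_error(layout);
            }
            // Initialise every field in place; nothing Doodad-sized ever
            // touches the stack.
            ptr::addr_of_mut!((*p).buf).write_bytes(0, 1);
            // The "I initialised this, so it's fine" step.
            Box::from_raw(p)
        }
    }

    fn main() {
        let d = boxed_doodad();
        println!("{}", d.buf.len());
    }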


I think that's a question of an "insufficiently smart compiler" rather than necessarily something the programmer should be concerned about?


No, it is a language semantics issue. You can get the compiler to optimize it away today, if you're careful. But that's a bad user experience. You should be able to specifically request this semantic, and be sure it works.


The programmer should certainly be concerned about the compiler being insufficiently smart.


Fair point.


I would frame the question more in terms of economy than performance. In theory, I could write a database engine in Rust that is as performant as C++, but it would not make sense to.

Making Rust perform the same as C++ in database engines requires marking large portions of the code base as "unsafe", while requiring substantially more lines of code due to being less expressive. Note that this doesn't imply that the code is actually unsafe, just that the Rust compiler cannot reason about the safety of correct and idiomatic code in database engines.

And this is why state-of-the-art database engines are still written in C++. There is not much point in using Rust if doing so requires disabling the safety features and writing a lot more code to achieve the same result.


Definitely not my experience with file io and database-related software in rust. Actually two of my publically available programs - fclones and latte - seem to be the most efficient in their respective classes, beating a wide variety of C and C++ programs, yet they contain almost no unsafe code (< 0.1%).

The state-of-the-art database engines are written in C++ (and Java) because at the time they were developed Rust didn't exist.


All high-performance database engines do full kernel bypass -- Linux has standard APIs for this. Databases designed this way are much higher performance, which is why people do it. It may not be obvious what this has to do with Rust.

The downside of kernel bypass is that the kernel is no longer transparently hiding the relationship between storage, memory, and related low-level hardware mechanics. A consequence of this is that the mutability and lifetime of most objects cannot be evaluated at compile-time, which Rust relies on for its safety model. The hardware can hold mutable references to memory that are not represented in software and which don't understand your object model. How do you manage lifetimes when references to objects are not visible in the code? When an object does not have a fixed address? And so on. This is not trivial even in C++ but the language does allow you to write transparent abstractions that hide this behind something that behaves like a reference without complaining.

There are proven models that guarantee safety under these constraints, widely used in database engines. Instead of ownership semantics, they are based on dynamic scheduling semantics. The execution scheduler can dynamically detect unsafe (or other) situations and rewrite the execution schedule to guarantee safety and forward progress without violating other invariants. This is why, as a simple example, some databases never seem to produce deadlocks -- deadlock-prone situations still arise, but they are dynamically detected before the deadlock occurs and are resolved transparently by editing the lock and execution graph.

Some major classes of database architecture optimization don't work in garbage collected languages. For this reason, you never see state-of-the-art database engines written in e.g. Java.


I've seen a similar pattern implemented in Rust already, presenting a safe interface around relatively little unsafe code.

For example, the Bevy game engine's entity system works like a small in-memory database, and also replaces ownership (games also have lots of mutability and lifetimes that aren't clear at compile time) with careful dynamic scheduling. Users of this system just request a particular kind of access (mutable or immutable) to some set of entities, and the engine schedules them to avoid conflicts.

I'm not that familiar with databases so maybe I'm just totally missing what you're talking about, but from what it sounds like this is not something that would require large portions of a Rust program to be unsafe.

Overall I think the idea that Rust relies on knowing specific mutability and lifetimes at compile time is a bit of a misunderstanding of how the borrow checker works. It's rather a sort of glue that describes relationships between APIs, and when you are coming up with a custom approach to ensure safety it is still a useful language you can speak on the boundaries.


A tacit assumption of Rust's model is that code is running on top of a transparent virtual memory system e.g. what Linux provides by default. Game engines, and in-memory databases, typically run on top of the OS virtual memory implementation. A database engine that bypasses or disables the OS virtual memory system, which many do, breaks this assumption. This is common architecture in any performance-sensitive application that is I/O intensive, but databases are the most extreme example.

Almost every object in a database engine, whether transient or persistent, lives outside the virtual memory model provided by the OS. Because it is in user space, this is part of the database code and no longer transparent to the compiler. To a compiler, these objects have no fixed address, have an ambiguous number of mutable references at any point in time, and may have no address at all (e.g. if moved to storage). This is safe, and you can write abstractions in C++ that make this transparent to the developer, but the compiler can still see it and Rust does not like what it sees.

Performance is not the only reason to do this. The OS implementation is tightly coupled to the silicon implementation of virtual memory. Very large storage volumes can exceed the limits of what the hardware can support, but user space software implementations have few practical limits on scale.


> A tacit assumption of Rust's model is that code is running on top of a transparent virtual memory system e.g. what Linux provides by default.

Where did you get that from? There is nothing more in Rust than there is in C++ that relies on virtual memory subsystem. Rust is perfectly able to compile for platforms with no OS at all.

> This is safe, and you can write abstractions in C++ that make this transparent to the developer, but the compiler can still see it and Rust does not like what it sees.

Rust can do the same abstractions as C++ can. This is pretty much a standard for low-level crates, which come with some unsafe code, wrapped in higher-level, easy and safe to use abstractions.


You seem to be missing a critical point.

As a consequence of implementing some of these kernel functions in user space, it is not possible to determine object lifetimes or track mutable references at compile-time. Any object created in memory that bypasses the kernel mechanisms is unsafe as far as Rust is concerned. In a database, the vast majority of your objects are constructed on this type of memory.

You can implement this in Rust, but essentially the entire database kernel will be "unsafe". At which point, why bother with Rust? C++ is significantly more expressive at writing this type of code, and it is not trivial even in C++.


I work at Materialize, which is writing a database in Rust. We use some `unsafe` code, but it's a very small proportion of the overall codebase and quite well abstracted away from the rest of the non-unsafe code.

To be fair, we're trying to do a bit of a different thing (in-memory incrementally maintained views on streaming data) than most existing databases, so you could argue that it is not an apples-to-apples comparison.

But there are plenty of other databases written in non-C++ languages -- there is even TiDB which is written in Rust too.


High-performance and scientific computing, or more generally when you transfer a lot of data structures via pointers to a lot of threads. In those scenarios, a half-second performance difference in the main loop can translate to hours in long runs.

The code I'm developing runs at ~1.7M iterations/core/second. Every one of these iterations contains another set of 1000 iterations or so (the number is variable, so I don't remember the exact mean, but the total is a lot). Also, this number is on an 8-year-old system.

More benchmarks are here: https://benchmarksgame-team.pages.debian.net/benchmarksgame/...


You haven’t actually explained when Rust is slower. You just described a situation where you care about performance.

What specific operations happen slower in Rust than in C++?


The link you provided shows C++ and Rust both performing best in various benchmarks. Seems like a wash to me, and to the extent one beats the other it's marginal. Which is to be expected because as long as the language is compiled and the compiler is good enough, you should be able to get as much performance out of it as your CPU can handle.


The numbers might look small, and indeed for most cases it can be considered a wash, but in some cases, these small differences return as hours.

For example, I'm using a lot of vector, matrix and array accesses in my code. A lot of these are copied around for computation. Moving that amount of data will trigger a lot of checks in Rust, and will create a slowdown inevitably.


What makes you think Rust forces more “checks” when moving data than C++ does? Can you give an example?


To wit, the Rust compiler is "sufficiently smart" in an extremely high number of scenarios such that it can achieve optimal performance by eliding things like bounds checks when it's able to determine that they would never fail.

This is, in my experience, the overwhelming majority of idiomatic Rust code. For cases where it can't, and performance is absolutely critical, there are easy escape hatches that let you explicitly elide the checks yourself.


"Extremely high" means "noticeably many". They are identically the same cases where doing something unsafe in C++ is (likewise) impossible. As such, they drop out of the comparison.


Do you have a link to your code that you've benchmarked in C++ and Rust? I'm also doing a lot of vector/matrix/array access in my code, so I'm curious as to what you'd be doing that would cause slow downs to the point that C++ can beat Rust by hours. That would be very enlightening to see.


I would love to, but the framework I'm developing is closed source (it's my Ph.D. and postdoc work at the end of the day), though we have a paper for some part of it. Let me see if I can share it.


Moves don't require any runtime checks in Rust to my knowledge. Moves do involve some compile time analysis to make sure you aren't accessing a value that has been moved, maybe that is what you are thinking of?

In Rust, moves are always a simple memcpy from one place to another, that's it! With non `Copy` types this destroys the original value which is why the compiler makes sure you don't access it after it's moved.

There is also `Clone` which is generally a deep copy of everything, but this also doesn't involve any runtime checks as long as you aren't using reference counted pointers or something.

There are bounds checks when indexing into vectors and whatnot though, but non checked methods are available.
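
A tiny illustration of all three of those points (the error text in the comment is paraphrased):

    fn main() {
        let v = vec![1, 2, 3];
        let w = v; // move: a shallow memcpy of the Vec header; the heap data is not copied
        // println!("{:?}", v); // error[E0382]: borrow of moved value: `v`
        println!("{:?}", w);

        let x = w.clone(); // Clone: an explicit deep copy
        println!("{:?} {:?}", w, x);

        // Indexing is bounds-checked by default; unchecked access is opt-in.
        let first = unsafe { *w.get_unchecked(0) };
        println!("{}", first);
    }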


In other words, moves in Rust are very limited and limiting.


They are more limited than C++ moves for sure, whether that is a good or a bad thing I am not so sure of!

I am curious if there are any examples of things you wanted to do but couldn't with Rusts move semantics.

After using both C++ and Rust I personally feel that the way Rust handles moves makes a lot more sense and ends up being much more ergonomic. In C++ moves feel messy and complex due to them being added on rather than having the language built around them from the start. It's also not ideal that C++ moves must leave values hanging around in an unspecified state rather than destroying them immediately, even if there are a few cases where that is useful, most of the time it's not.

Rust also doesn't have any of the crazy `T&&` rvalue reference stuff or require explicitly defining move constructors, since everything is movable and all moves are bitwise copies.


The classic thing that's tough to do is self-referential structures; since you have no hook into moves, you can't fix-up the references.

It also has quite a few pros, and many people think it's better the way it is. Since moves are memcopies, they can never fail, meaning there's no opportunities for panics, which means it's much easier to optimize. They aren't arbitrarily expensive; you always know what the cost is. As you mentioned, there's no need for a "moved-from" state, which can save space and/or program logic.


It is correct to say that allowing move constructors to fail was a grave design mistake.


Rust's Vec indexing is bounds-checked by default. Just something like this is abysmal for performance (but pretty good for consultants who get called to fix stuff afterwards, so, like, keep them coming!):

- Rust: https://rust.godbolt.org/z/GK8WY599o

- CPP: https://gcc.godbolt.org/z/qvxzKfv8q


> Rust's Vec indexing is bound-checked by default. Just something like this is abysmal for performance

Unless you can provide numbers to back this claim up, I'll continue to rely on my measurements that bounds checking (in Virgil) costs anywhere from 0.5% to 3% of performance. It's sometimes more or less in other languages, but it is by far not the reason any programming language has bad performance. I have no reason to suspect that it costs more in Rust, other than misleading microbenchmarks.


Here's a quick and dirty example: https://gcc.godbolt.org/z/3WK7eYM4z

I observe (-Ofast -march=native -std=c++20 ; CPU is intel 6900k with performance governor on Arch... blasting at 4GHz) :

- clang 13: 8.93 ms per iteration

- gcc 11: 9.46 ms per iteration

so roughly around 9ms.

Replacing

    return mat[r * matrix_size + c]; 
by

    return mat.at(r * matrix_size + c);
I observe

- clang 13: 20.95 ms per iteration

- gcc 11: 18.11 ms per iteration

so literally more than twice as slow. I also tried with libc++ instead of libstdc++ for clang and the results did not meaningfully change (around 9ms without bound-checking and 21 ms with).


From the godbolt link, it looked like most of the vector operations were not getting inlined. You'll need -O2 or higher to have representative results.

I could counter by just writing a Virgil program and turning off bounds check with a compiler flag. We could stare at the machine code together; it's literally an additional load, compare, and branch. The Virgil compiler loves eliminating bounds checks when it can. I know your original comment was in the context of the STL, but it really muddies the waters to see huge piles of code get inlined or not depending on compiler flags. Machine code is what matters.

Regardless, this is still microbenchmarking. Maybe matrix multiply is an important kernel, but we need to benchmark whole programs. If I turn off bounds checks in a big program like the Virgil compiler, I cannot measure more than about 1.5% performance difference.


> From the godbolt link,

I just used godbolt to quickly share the code. On my computer I tried with -Ofast -march=native (broadwell)

> I could counter by just writing a Virgil program and turning off bounds check with a compiler flag. We could stare at the machine code together; it's literally an additional load, compare, and branch.

This sounds like Virgil is not vectorizing, which makes the comparison much less useful.

I see much more than a couple instructions changed there: https://gcc.godbolt.org/z/8jPMb734x


> Virgil is not vectorizing

That's just even more confounding variables. We are, after all, not even doing experiments with Rust vectors, which is what your original comment was about. You wrote examples in C++ and we went down a wormhole already, but I let it slip since at least it was empirical. But I think we shouldn't get distracted with even more side alleys now. Bounds checks are literally just a couple machine instructions, and often eliminated by the compiler anyway, which enables vectorization.


> We are, after all, not even doing experiments with Rust vectors,

they are very close to C++ ones, down to the stack unwinding to report the panic in case of bound error if I'm not mistaken.

> That's just even more confounding variables.

No they are not. We are trying to see how much bounds checking costs. If you compare suboptimal programs, the comparison is meaningless (or rather not interesting to anyone) - the starting point has to be the absolute best performance that it is possible to get, and then add the worst-case bounds checking. (I'm happy for you if you never have to worry about the worst case, though!)

> Bounds checks are literally just a couple machine instructions, and often eliminated by the compiler anyway

Please provide a source for this? Sure, if you use spans as other commenters mentioned, that moves the checking to span creation time, but that only works for the simplest cases where you are going to access linearly - and I would say that it's a library feature rather than a compiler one.

   for(double v : vec) { } // or any sub-span you can take from it
in C++ also does not need bounds checks by design, but this kind of construct also utterly does not matter for HPC workloads.

I can look into my PDF folders and bring out dozens of papers where the core algorithms all use funky non-linear indexing schemes where you cannot just iterate a range (algorithms based on accessing i, i-1, i+1, or accessing i and N-i, or accessing even/odd values, etc.) - how would you implement an FFT, for instance? This is the code that matters!
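
For what it's worth, even the i / N-i style patterns can often have the checks hoisted rather than paid per element - a hedged sketch (whether the optimizer actually drops the per-element checks is worth confirming on godbolt for your exact loop):

    // Symmetric i / n-1-i access, FFT-butterfly style.
    fn mirror_sums(v: &[f64]) -> Vec<f64> {
        let n = v.len();
        let mut out = Vec::with_capacity(n / 2);
        for i in 0..n / 2 {
            // Both i < n and n-1-i < n follow from the loop bound, so the
            // optimizer has everything it needs to elide the checks here.
            out.push(v[i] + v[n - 1 - i]);
        }
        out
    }

    fn main() {
        println!("{:?}", mirror_sums(&[1.0, 2.0, 3.0, 4.0])); // [5.0, 5.0]
    }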


> accessing i, i-1, i+1, or accessing i and N-i,

A compiler can do loop versioning for that. And they do. Hotspot C2 (and Graal, too) does a ton of loop optimizations, partitioning the index space into in-bounds and potentially out-of-bounds ranges, unrolling loops, peeling the first iteration off a loop, generating code before a loop to dispatch to a customized version if everything will be in bounds.

When a compiler is doing that sophisticated of loop transforms, you are not measuring the cost of bounds checks anymore, you are measuring a whole host of other things. And sometimes if a loop is just a little different, the results can be disastrously bad or miraculously good. Which is why microbenchmarking is so fraught with peril. A microbenchmark might be written assuming a simple model of the computation and then a sophisticated compiler with analyses never dreamed of comes along and completely reorganizes the code. And it cuts both ways; a C++ compiler might do some unrolling, fusion, tiling, vectorization, software pipelining or interchange on your loops and suddenly you aren't measuring bounds check cost anymore, but loop optimization heuristics that have been perturbed by their presence. You end up studying second-order effects and not realizing it. And it keeps running away from you the more you try to study it.

>> That's just even more confounding variables.

> No they are not. We are trying to see how much bound checks costs. If you compare between suboptimal programs the comparison is meaningless (or rather not interesting to anyone) - the starting point has to be the absolute best performance that it is possible to get, and add the worst-case bound-checking (I'm happy for you if you never have to worry about the worst

We are always comparing suboptimal programs. No compiler is producing optimal code, otherwise they would dead-code eliminate everything not explicitly intertwined into program output, statically evaluate half of our microbenchmarks, and replace them with table lookups.

You're going down a linear algebra rabbit hole trying to come up with a result that paints bounds checks in the worst possible light. If this is the real problem you have, maybe your linear algebra kernels would be better off just using pointers, or you could even try Fortran or handwritten assembly instead, if it is so important. Unsafe by default is bad IMHO. For real programs bounds checks really don't matter that much. Where this thread is going is all about loop optimizers, and they don't really get a chance to go nuts on most code, so I think we're way off in the weeds.


Note that Rust has unchecked indexing; you just have to explicitly ask for it if you want it. You can even - if you insist on writing cursed code, which it seems jcelerier does - write your own type in which the index is always unchecked and it is Undefined Behaviour when you inevitably make a mistake. Just implement std::ops::Index (and std::ops::IndexMut if you also want to mutate through those probably-invalid indices), say you're unsafe in your implementation, and just don't bother doing bounds checks.

You can shoot yourself in the foot with Rust, it's just that you need to explicitly point a loaded gun at your foot and pull the trigger, whereas C++ feels like any excuse to shoot you in the foot is just too tempting to ignore even if you were trying pretty hard not to have that happen.
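
A sketch of that cursed type, for the curious (illustrative only, and genuinely UB the moment an index is out of range):

    use std::ops::{Index, IndexMut};

    // Indexing skips the bounds check entirely; an out-of-range index is
    // Undefined Behaviour. You really do have to aim the gun yourself.
    struct Unchecked<T>(Vec<T>);

    impl<T> Index<usize> for Unchecked<T> {
        type Output = T;
        fn index(&self, i: usize) -> &T {
            // Safety: none whatsoever - the caller promises i < self.0.len().
            unsafe { self.0.get_unchecked(i) }
        }
    }

    impl<T> IndexMut<usize> for Unchecked<T> {
        fn index_mut(&mut self, i: usize) -> &mut T {
            unsafe { self.0.get_unchecked_mut(i) }
        }
    }

    fn main() {
        let mut v = Unchecked(vec![1, 2, 3]);
        v[0] = 10;
        println!("{}", v[2]);
        // println!("{}", v[99]); // compiles fine, and is UB at runtime
    }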


C++ devs that actually care about security the same way titzer does turn on the security checks that are disabled by default, so you can have the same experience.

For example, on Visual Studio, define _ITERATOR_DEBUG_LEVEL as 1 and enable /analyze as part of the build.

While not Rust-like, it is already much better than not caring.


I think you’re confusing two different senses of “vector”, there is the contiguous series of items “vector” data structure that both C++ and Rust have in their standard libraries that’s used in the code example, and there’s “vectorizing” which is an optimization to use things like SIMD to operate on multiple things at the same time.


I understood what was meant by these terms.


Okay, then I got lost reading your comment, my apologies.


No worries. It's too late for me to edit it now, though, but I meant vectorizing="compiler introduces SIMD".


Part of the problem is that the multiply() function takes variable-size matrices as inputs and isn't inlined because it's declared as an external function and so could, in theory, be called from another compilation unit. If it's declared as "static" instead then the compiler generates identical code (modulo register selection) at -Ofast for both versions—eliminating the bounds checking at compile-time:

https://gcc.godbolt.org/z/qqWr1xjT8


> Part of the problem is that the multiply() function takes variable-size matrices as inputs

So does real-life code? I don't know about you, but I pretty much never know the size of my data at compile time.

> If it's declared as "static"

again, real-life code calls math operations defined in other libraries, sometimes even proprietary.


In Java or other languages with bounds-checked arrays, you would typically just loop from 0 to the end of the array. In Java the JIT will analyze the loop, see that it is bounded by the length of the array, conclude all accesses are in-bounds and eliminate bounds checks. Virgil does this simple type of bounds check elimination as well in its static compiler, but its analysis is not quite as sophisticated as Java JITs.


> I don't know about you but I pretty much never know the size of my data at compile time.

Whole-program optimization (LTO) can deal with that. Also, Rust inlines across modules much more aggressively than C++ does, even without LTO, so optimization will be more effective and its bounds-checking won't have as much (if any) runtime overhead. Especially if you write your Rust code idiomatically, as others have already suggested. Your earlier Rust example was practically designed to inhibit any attempt at optimization. Simply iterating over a slice of the vector, rather than the indices, results in a much tighter inner loop as it only needs to check the bounds once.

That being said, in this case I think it would be better to have fixed-size matrices (using const generic/template arguments) so that the bounds are encoded in the types and known to the compiler locally without relying on whole-program optimization.
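
Concretely, the indexed-loop versus slice-once-then-iterate difference mentioned above looks roughly like this (a sketch; the exact codegen is worth checking on godbolt for your own loop):

    // Index in the loop: each `v[i]` carries a potential bounds check
    // unless the optimizer can prove it away.
    fn sum_indexed(v: &[f64]) -> f64 {
        let mut total = 0.0;
        for i in 1..v.len() {
            total += v[i];
        }
        total
    }

    // Slice once, then iterate: the single range check happens when the
    // subslice is created (this panics if `v` is empty), and the loop body
    // has nothing left to check.
    fn sum_sliced(v: &[f64]) -> f64 {
        v[1..].iter().sum()
    }

    fn main() {
        let v = [1.0, 2.0, 3.0, 4.0];
        assert_eq!(sum_indexed(&v), sum_sliced(&v));
    }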


> Especially if you write your Rust code idiomatically, as others have already suggested. Your earlier Rust example was practically designed to inhibit any attempt at optimization.

This is pretty much word-for-word what I've seen in other similar disagreements. Rust Skeptic transcribes their solution from its original programming language into Rust line by line, producing something nobody using Rust would actually write. They find that Rust performs poorly. Rust Proponent writes it the way you'd expect someone who actually knows Rust to write it, and performance meets or exceeds the original. Sometimes they even catch a few bugs.

Yes, if you don't know Rust, and you think it's just a weird re-skin of C++ that can be gsub'd from one to another, you're going to have a bad time. If you're an expert in C++ and have never written Rust, you probably aren't going to beat your finely-tuned C++ program with a ten minute Rust translation. But someone with a year of experience in Rust is probably going to be able to rewrite it in half an hour and come within spitting distance of the thing you've optimized by hand over the course of a few years.

I've written Rust for half a decade and I'm not sure I've ever actually explicitly indexed into a Vec or slice in production code. If I needed to, and it was in a hot loop, and it wasn't a fixed-size array whose size is known at compile time... there's always `get_unchecked()` which is functionally identical to indexing into a C++ vec without `at`.


> Yes, if you don't know Rust, and you think it's just a weird re-skin of C++ that can be gsub'd from one to another, you're going to have a bad time.

most C++ code is not "idiomatic C++" either, it's code which looks like this:

https://github.com/cdemel/OpenCV/blob/master/modules/calib3d...

or this

https://github.com/madronalabs/madronalib/blob/master/source...

or this

https://github.com/MTG/essentia/blob/master/src/3rdparty/spl...

etc ... which is in general code from science papers written in pseudocode which is ported at 2AM by tired astrophysics or DSP masters students to have something to show to their advisor. You can have whatever abstraction you want but they won't be used because the point is not to write rust or C++ or anything but to get some pseudo-MATLABish thing to run ASAP, and that won't have anything that looks like range / span, only arrays being indexed raw.


If you were using std::vector::at (the proper way), you'd have bounds checking as well. One should only direct index if they really know that they are inside the bounds. And in Rust, there's std::vec::Vec::get_unchecked: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.get...


> If you were using std::vector::at (the proper way),

it is absolutely not the proper way and a complete anti-pattern


Yeah I agree using vector::at is a common and fairly bad anti-pattern.

So long as you make sure your program is correct you never need to worry about indices being out of bounds. Requiring bounds checking is a sign you need to eliminate the errors in your software.

This makes me wonder, why are people writing software with errors in it in the first place? Even master programmers seem to be too lazy and careless to remove the errors from their software. What gives?


Using C++ is an anti-pattern

That is why I write all my code in Pascal, where you can enable bound checking for []

Enable it for a debug build, and you can be sure there are no overflows of any kind, and when it runs the program has pretty much no errors at all. Disable it for the release build, and it runs as fast as if it was written in C.


I think I'm falling for Poe's law.

Here is an honest answer. The problem is frankly unsolvable thanks to the halting problem. It's impossible to determine in general whether a program is going to reach a certain state except by testing all potential inputs, and even that is only going to give you an approximate answer. If it were possible we would have written a program that solves the problem through static analysis.


That depends on the project. Anyway, it’s irrelevant, because you can do it either way in either language.


This is the Ossia Score lead dev FYI, he's no stranger to either language


I don't know what Ossia Score is. Regardless, I didn't mean to imply anything negative about the competence of the person I was replying to.


Definitely not a complete anti-pattern in distributed computing software that cares about security and for whatever reason needs to be written in C++.


Both languages let you access vector elements with or without bounds checking. Stroustrup even recommends you use .at by default in c++.


Idiomatic Rust here would use something like this - not that it makes the two performance-identical in this case, but it is more representative of real Rust code, and is slightly smaller: https://rust.godbolt.org/z/brh6heEKE

EDIT: lol, don't write code five minutes after waking; the reply to this has far better code

(EDIT: incidentally, I was also curious about a non-extern function, but the compiler is seeing through all of my attempts to call a native Rust function with an opaque body here...)

That being said this code will also be a no-op instead of... something worse, if the vector is empty, so, they're not directly comparable, but they weren't really in the first place, as you mention, so...


Why not

  for i in &v[1..num]
instead of all the iter take skip stuff?


Because I wrote this comment shortly after waking up and made a mistake. That code is far far better. Thank you.


>but pretty good for consultants who get called to fix stuff afterwards, so, like, keep them coming!):

I'm confused. Doesn't that apply to the C cowboy way of doing things? You introduce security vulnerability after vulnerability and then lots of people have to hire expert security consultants all the time. Your snark just makes no sense to me. Fixing an ArrayIndexOutOfBoundsException in Java is something even a novice programmer with less than a year of experience can do. No expensive consultant needed.

The Java vulnerabilities aren't even in the same class as C's. They are usually quite dumb shit like class-loading remote class files using obsolete features that nobody even remembers (Log4Shell). It's like setting up SSH with a default password (Raspberry Pi). It happens, but it's rare because of the sheer amount of "incompetence" required - it's like needing lightning to strike twice.


I have never come across security consultants. Performance OTOH ...


https://www.mandiant.com

Sometimes they're called auditors. Or "formal audits" because people think if they call something "formal" it's somehow different from not being "formal".


I mean, I know that this is a business that exists, what I mean is that I get called regularly for fixing performance things, but have never heard of any request for security - or met security consultants during my jobs. However I know of a few cases where people would sell members of their extended family in a heartbeat for a % more oomph.


Just because indexing is bounds-checked by default doesn't mean that the bounds-checking is necessarily in your final code:

https://rust.godbolt.org/z/er9Phcr3c

Yes, you still have the bounds check when constructing the slice, but that's fine.

And since it's probably slightly more common to iterate over the length of the vector anyway:

https://rust.godbolt.org/z/e5sdTb7vK


https://rust.godbolt.org/z/zM58dcPse

And look, putting a limit that the compiler can guarantee once, outside the critical loop, eliminates it. You're already using unsafe. If you really want to shoot yourself in the foot with this nonsense you can use get_unchecked or an unchecked assumption

https://rust.godbolt.org/z/naeaYad5T

On top of that, no one would write Rust code like that. This version has one bounds check at the beginning and is a lot more idiomatic

https://rust.godbolt.org/z/1s9zsMx36


The point is to have some code that exhibits what happens when a bound check is needed. You can also do

    for(double v : std::span(vec).subspan(1)) { } 
in C++, and that will be safe and not need a bounds check except a user-provided one at the creation of the span (-1 for C++ here :-)) - and it also doesn't matter at all for the bounds-check problem, which will occur as soon as you don't just iterate linearly, which is extremely common in the kind of problems that end up showing at the top of the profiler, and is the whole point of this conversation.


You can just use unsafe if you care in this case though.

It's a saner default option tbh.


C is even better. One instruction less than C++ ;)

https://gcc.godbolt.org/z/oc19Gc4eY


C++ is really struggling with whether or not to break its ABI. But its "identity" is clear enough - much like C, it wants to support existing code while still opening up support for newer features if possible. Rust could actually benefit from cribbing a few things that are now getting defined in C++, such as its support for heterogeneous executors.


C++ doesn't have a standardised ABI. I'm not aware of any languages that do, but I would be interested in any more information on this.


C doesn't have a standardized ABI, either, and yet it's the most widely used ABI in the world anyway.

Just because it isn't standardized doesn't mean it isn't stable & relied upon anyway. Clang & G++ are ABI-compatible with each other, for example, and more importantly (and relevantly) both libstdc++ & libcxx strive to maintain ABI compatibility ( https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html & https://libcxx.llvm.org/DesignDocs/ABIVersioning.html )


> both libstdc++ & libcxx strive to maintain ABI compatibility

They don't, but they do keep it stable.


Alternative phrasing: they strive to maintain "backwards compatibility" rather than "compatibility with each other".


For libcxx & libstdc++ yes correct.

Clang, however, does strive to maintain compatibility with G++. So you can build a library with clang & use it from g++. It strives, and for the most part achieves, to be a drop-in replacement for GCC, which includes ABI compatibility.


There's the widely used Itanium C++ ABI [0], which contrary to the name is actually the one used on x86-64. Without a standard ABI it wouldn't be possible to dynamically link C++ libraries (which includes the standard library!), especially if they weren't compiled with identical compiler versions.

[0]: https://itanium-cxx-abi.github.io/cxx-abi/abi.html


COM and Windows Runtime (WinRT) provide standardized ABIs across multiple languages. With the WinRT language projections, it's just like writing 'normal' code.


And with C++/WinRT you even get back the experience of what it was like to do ATL development.


So I gather that, much more than previous COM projections and templating libraries, C++/WinRT is much nicer to work with. And you can target the WinRT ABI with Rust, .NET (C#, F#, etc.), Python, and JS too.


I completely disagree. C++/CX was the best Microsoft ever had, the closest to C++ Builder, and they killed it with their office politics.

On top of that, the C++/WinRT folks are adamant about pushing their east-const preference onto every Microsoft customer.

If you enjoy editing IDL files without tooling support and merging generated files by hand, please do.


I really didn't like C++/CX because it added weird stuff to the existing syntax of the language, and no compiler other than Microsoft's was on board. Those two things probably killed it more than "office politics".


It is a framework for doing Windows development in Visual Studio, and it follows the same syntax as C++/CLI, in use since 2003; it doesn't matter if anyone else is on board.

Even today no other compiler can fit into WinUI/UWP workflows no matter what, unless one is willing to do lots of under the hood workarounds.

Office politics driven by people without any consideration for paying customers.


> But I teach students both C++ and Rust, and by far Rust is preferred. It’s never even a contest.

Why is Rust preferred, though? Is it "fearless concurrency" (or whatever the old tag line was), a more modern stdlib, or is it Rust's tooling with `cargo` rather than the shitty tool-scape of C++'s make/cmake/meson and vcpkg/conan and whatever else?


As a newbie who only learned Java and JS in bootcamp, and C in EE, Rust's tooling is absolutely phenomenal.

Things just work, and that is a rare experience.


It’s a rare experience but once you’ve had it, anything less becomes unacceptable. Rust has ruined me for so many languages, my bullshit tolerance used to be much higher.


Borland languages and IDEs did that to me, that is the minimum bar.


> Why is Rust preferred though? Is it "fearless concurrency"

Yes.

> a more modern stdlib

Yes.

> is it Rust's tooling with `cargo` rather the shitty tool-scape of C++'s make/cmake/meson and vcpkg/conan and whatever else

Yes.

I don't mean to be flippant. The entire experience with Rust is just so much better than with C or C++, and that's even before you get to the actual language.


Those students are in for a surprise when they get to do anything graphics- or HPC-related.


Can you explain more? I don't have any experience to infer what your comment means.


HPC is all about C, C++, Fortran, and Python (as a scripting language), and now Julia is slowly becoming the new kid in town.

Stuff like OpenACC, OpenMP and MPI are expected to be in the box, regardless of the language.

On the graphics space, Khronos, NVidia, Intel and AMD are driving the adoption of standard C++ as compute language.

NVidia now has many key contributors to ISO C++ on their payroll and as shown at GTC 2022, wants to make CUDA the best GPU implementation of such features.

Then on game consoles, C++ is the main language on the SDKs, alongside C# and Unity.

One can naturally use other stacks, like Embark Studios is doing with Rust, but then it is on them to spend development resources on making it work, and deal with possible integration issues.


Most of the interfaces you're discussing are more-or-less "purely functional", though. Interop overhead is going to be a lot less when you're squaring a matrix than when you're trying to write an event loop. In fact, most people who use those C++ libraries never touch C++, as you've pointed out yourself: they use Python or (recently) Julia. The ecosystem as it exists is already a hodgepodge of C[++] and Fortran, and while the interop between those might be annoying if you were building a video game, a NumPy user doesn't care if one subroutine is in C++ and the next one is in Fortran. So, I don't see why one more backend language would be a problem, realistically.


Hiring, team knowledge, vendor support, community support when reviewing papers... there are better things in a researcher's life than writing interop bindings.


I think they mean that C++ has a lot of momentum in graphics and HPC.

So in their model of the world, students who learn Rust would be surprised that there are people who do not use Rust.


They will find that their Rust experience is worth little, because they will need to go back and learn C++ to make any progress in those, and indeed most areas.


I guess one advantage is that it's a lot harder to get (safe) Rust that compiles while doing something awful with regard to important subtleties of C++ that aren't obvious to a "C with classes" programmer (e.g. Rule of 5 stuff); my university's everyone-takes-it C++ class made essentially no attempt to teach this stuff past mentioning it.


University programming education is a dumpster fire. Teaching C++ as "C with classes" amounts to malpractice and student abuse, and in a better world would generate dismissals and even lawsuits.

Beginning students are, by definition, by far the least well equipped to understand what would be most beneficial to learn. That universities fail to step up is a disgrace. Those students who learn Rust instead of C++, or "C with classes", will graduate very poorly equipped for what will be expected of them.


The very least they could have done was teach clang-tidy, as I bet Rust classes teach clippy.


This statement is true for literally any pair of [technology] and [thing-theres-a-better-technology-for].


As for any pair of [technology] and [hipster-fad].

What matters is whether they come out equipped to do what is needed. Those picking up Rust will graduate finding very few places to use it, and will have to pick up C++ as best they can under much more difficult conditions.


In my experience students who learn Rust first have a much easier time with C++. It's a more forgiving language for beginners, especially considering the tooling, which is top notch. I think most students coming out of university programs know or at least have been exposed to C++, so I don't know what you're trying to say about students graduating only knowing Rust.


You are saying that people who learned Rust in class have an easier time, later, using C++ than people who learned C++ in class?

That is absurd, unless in fact you are teaching C++ so badly that they first need to unlearn more than somebody with no experience with it needs to learn starting from zero.


Again, this has been true for literally every nascent language. Those picking up languages early in their lifecycle find fewer places to use it. As they gain momentum, it becomes easier.

This cycle has played out for C++, Java, Ruby, C#, Python, Golang… and everything else. What makes Rust a hipster fad, but these languages weren't?


Everything was a hipster fad. A few didn't fizzle, and matured. Not every hipster fad is taught instead of mature tooling.

Rust might not fizzle. It is too early to say. Ruby seems to be fizzling after what seemed like a promising start. C#, as a walled-garden language, does not even count.


99.999% of C++ developers either don't care about Rust or have never heard of it. There are 5 million+ C++ developers out there. And the number is going up.


Can you point to a Rust alternative to https://doc.qt.io/qt-5/qtreeview.html ?


Exactly. I think you managed to summarize all my concerns in a single sentence.

https://www.areweguiyet.com/


https://slint-ui.com/ is an in-progress one from some ex-Trolltech folks.


As much as I have extreme confidence in the technical ability of the folks involved, even TQTC did not manage to replicate it in a useful enough way so far in QML, and QML has been around for something like 11 years now.


Yeah, I'm certainly not claiming it's equivalent today, I haven't even tried a demo. Just a thing to watch if you're interested in this space.


> by far Rust is preferred. It’s never even a contest

But it is (for the most part) C++ that pays the bills.


Judging from the same Stack Overflow polling, only by about 3:1 and so we might also say it's really Javascript paying the bills since that's what is most used rather than C++. Just not to write operating systems.

If we're thinking about university students, it's very likely the ratio is even smaller by the time they graduate and are looking for employment. Rust jobs grow, there are lots of new Rust codebases, and they need maintenance people just the same as C++. You may have fewer mysterious unsafety bugs to fix, but your customers still have changing requirements and other people's APIs still change, so you aren't going to go hungry any time soon programming in Rust any more than C++.

And if we're talking about in-house training, whatever they are teaching pays the bills, the employer isn't sending employees to learn how to do stuff they've no use for, otherwise they'd just give you indefinite leave on full pay.


Wishing is very popular, but makes a poor foundation for argument. Or reputation.


Nevertheless, even if you wish it was otherwise, the thing paying the bills for the most people isn't C++ it's Javascript.

Brendan Eich's slapped together Lisp clone with semi-colon language syntax is paying the bills more than Bjarne's "What if C but with a kitchen sink of extra stuff bolted on more or less at random?" language. They're both terrible in different ways, but at least Brendan had the excuse of a hard deadline.


It would be miraculous if even a single person were paid $1M to code JS. But many get that coding C++.


I'm not sure it's "desperately" chasing; its trajectory looks positive. For example, in the 2019 Stack Overflow developer survey[1] only 3% of professional developers reported using Rust. A year later[2] it was 4.8% and a year after that[3] it was 6.4%. So based on that metric (of debatable value), professional developer use may have doubled in two years. For reference, in the SO surveys C++ and C use was about 20% and 16%, respectively and, if anything, slowly declining (as a percentage of respondents, of course).

1. https://insights.stackoverflow.com/survey/2019#technology-_-...

2. https://insights.stackoverflow.com/survey/2020#technology-pr...

3. https://insights.stackoverflow.com/survey/2021#technology-mo...


That is a deeply biased sample. Easily more people pick up C++ for professional use in any given week than the total number of people paid to code Rust.


PHP had a positive trajectory too. Rust is riding a wave of interest but most are going to revolt against its complexity.


That sounds plausible in the general case...but we’re putting it up against C++ here, which is at least as (probably more) complex.


Languages get complex as they come to address more real needs. Rust is getting more complex to address needs. Dilettantes will abandon Rust for next year's fad, and those remaining will continue strengthening it for their serious use.

Rust will soon be wholly as complex as C++; or it will fizzle. It could still do both. It would be disappointing if Rust were to fizzle unless it is supplanted by something better.


> Languages get complex as they come to address more real needs.

We should (but the C++ committee too often does not) distinguish between additional complexity to serve some particular purpose, and constant greenfield development done solely because it was easier to burn some more forest than deal with the mess left by the previous occupants when renovating. This latter practice is of course unsustainable.

It is also good to distinguish complexity which matters to the implementer (for C++ this may represent maybe three groups - LLVM, GNU and MS - all of which have committee members) versus that which matters to ordinary programmers.

The novice Rust programmer in 2022 sees only that arrays work as you would expect -- the apparent complexity is quite small. You can't grow or shrink them (use a Vec) but you can mutate them (if you asked for a mutable array), you can iterate over them like other containers, they have the affordances you'd expect from a slice type (e.g. you can sort them, and if they're sorted you can do binary searches, that sort of thing) and so on.

In fact arranging this was pretty complicated, const Generics are involved for example, which Rust 1.0 did not have. But this novice programmer needn't care why arrays work, they just do, it's very simple. Compare the C++ novice who will be told that only some array features work, and they're not a collection, and they also aren't really a slice, and the affordances you'd want are missing, the C++ novice must carry around a pretty complicated understanding, all in order to simplify implementation for a handful of people.
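
To make the C++ half of that comparison concrete, here is a small sketch in plain standard C++ (the function names are made up purely for illustration):

    #include <array>
    #include <iostream>

    // A built-in array quietly decays to a pointer at the function boundary;
    // the "4" in the parameter is ignored and the length is gone.
    void takes_builtin(int a[4]) {
        std::cout << sizeof(a) << '\n';   // size of a pointer, not of 4 ints
    }

    // std::array keeps its length as part of the type and behaves like a value.
    void takes_std(const std::array<int, 4>& a) {
        std::cout << a.size() << '\n';    // always 4
    }

    int main() {
        int raw[4] = {3, 1, 4, 1};
        std::array<int, 4> arr = {3, 1, 4, 1};
        takes_builtin(raw);   // compiles fine, but the callee has lost the size
        takes_std(arr);
    }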

Rust also has a third distinction of complexity, where it matters whether the complexity is exposed in safe Rust. For C++ the pointer provenance problem affects everybody. Finally resolving this long-standing hole in the C++ standard would significantly increase complexity for the language, and every C++ programmer may be at risk from the consequences. In contrast, safe Rust programmers needn't even be aware that provenance is a thing, only the nomicon explains that of course pointers aren't really just addresses because only in unsafe is management of provenance your problem.


You entirely miss the topic.

Const generics, what C++ calls non-type template parameters, are a recent and fundamental complexification of Rust. Libraries using them can be simple, as always and ever for all features and all libraries, but a person coding a library that would benefit from the feature is in no way insulated from it. Async/await, similarly. There is no end to those. They add complexity, but solve real problems. Novices can be shielded, but professionals cannot be, because the features are exactly those needed by the professionals.
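
For reference, a minimal sketch of the C++ side of that equivalence, a non-type template parameter doing what a Rust const generic would (the dot function is just an illustrative example):

    #include <array>
    #include <cstddef>

    // N is a compile-time value baked into the type, so calling dot() with
    // two arrays of different lengths is rejected at compile time.
    template <std::size_t N>
    double dot(const std::array<double, N>& a, const std::array<double, N>& b) {
        double sum = 0.0;
        for (std::size_t i = 0; i < N; ++i) sum += a[i] * b[i];
        return sum;
    }

    int main() {
        std::array<double, 3> u{1.0, 2.0, 3.0};
        std::array<double, 3> v{4.0, 5.0, 6.0};
        // dot(u, std::array<double, 2>{}) would not compile: N cannot be deduced consistently
        return dot(u, v) > 0.0 ? 0 : 1;
    }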


> Novices can be shielded, but professionals cannot be, because the features are exactly those needed by the professionals.

C++ chooses to add wrinkles, upon wrinkles, upon yet more wrinkles so as to produce a fractal of complexity that serves nobody, not novices, not professionals, not even the implementers benefit in the end. In the short term it was an administrative convenience I suppose, but it can hardly have been difficult to foresee the consequences.


Tialaramex chooses to make shit up.


Rust is literally the most beloved language in SO's developer survey and has been for several years running.

In my own experience, it's allowed me to write software that pretty much doesn't have bugs, and I say this as a 25 year veteran of being an engineer. The extra bits of complexity that are in the language allow me, the developer, to not have to write code to deal with that complexity in my own projects. It's handled for me.

Rust's primary competition is C and C++. It compares favorably against either on the overwhelming majority of axes. Not every one, but most.


There is plenty I can say and do in C++ that cannot be said at all in Rust. Rust is adding complexity very quickly, but has a very long way to go to match C++'s expressive power. In some ways, it never will match C++, having permanently closed off those avenues.

You will know Rust is beginning to catch up when people complain about it as much as C++.

Not everybody needs the expressive power of C++. Certainly most Java, Go, and C coders who have no inkling of most of what I rely on C++ for, or even for what many rely on Rust for. But for those of us who do, there is simply no contest. Nothing else will do.


Rust is absolutely dominating the very tiny ___domain of greenfield OS development.


So Rust is dominating an area that has zero impact on the world. While the world runs on C/C++. OK thanks. Noted.


It's also made key steps into Linux. I haven't checked the latest state, but as of last year it was pretty much approved as the way to write new drivers on Linux.


Rust for Linux has not landed, so you can't write Linux drivers in Rust even on Linux Next (the git branch where in-progress work lives before Linus might ship it in an actual Linux kernel).

It's probably on the order of a year out.


Rust is anyway making a better showing there than other places.


> In terms of popularity, depending on which survey you look at, Rust is stalking or eclipsing C++

The full quote is in reference to popularity, which, if you look at the Stack Overflow developer survey, is definitely true.


Stack overflow is a sampling of hipsters. It is doubtful even 0.1% of professionals ever look at it. Relying on its numbers, or Tiobe's, is purely an exercise in wishing.

Wishing is totally allowed, but is not a good look.


The Rust vs C++ thing is also addressed in the comments on the Reg.

SerenityOS is addressed elsewhere in the comments here, so I won't repeat myself. DRY is a maxim, right?


Strange, and I think others have expressed it:

- The IBM PC and Apple personal computers were assembler, if not BASIC.

- So is the IBM 360, where even to this day you have to use assembler macros etc.

- The Mac was Pascal.

And one can say Mac OS is on C, but Objective-C is not just C. It is C-based but really not C all the way down.

- Many embedded systems …

Outside this, application-wise …

- Java and even JavaScript are not C, and neither are HTML and CSS, plus, well, COBOL and Fortran!

In fact one can do a lot without touching, let alone learning, C in one's professional job. I suspect "if you just know C …" (cf. "if you just know X", where X = ?) would be an interesting question.


Article forgot to mention that Niklaus Wirth was a consultant for the team at Apple that created Object Pascal. Apple had used variations of Pascal and Object Pascal for early Macs. This was in addition to companies using Wirth's Modula-2 and Oberon.

Part of the reason why C became so prevalent is it being created within and backed by AT&T (one of the largest companies in the world back then). They pushed C and Unix everywhere, it took hold, and here we are today.


Yes, Borland gets the fame for Object Pascal, but it was actually born at Apple.


We could just as easily have lived in a world where we all programmed in Go because Google pushed gUnix, had they been the industry leader in the 80s, or something like that.



CDC mainframes and minis in the late 1980s (some or all of them) had a lovely OS written in Algol 68. I don't remember what it was called.



Thank you! You jogged my memory. It was NOS/VE - https://en.wikipedia.org/wiki/NOS/VE - and I see no mention of it being written in Algol68, but it used Algol68 semantics for some things (at least).


BeOS/Haiku and Redox come to mind


BeOS and Haiku are mentioned in the article.


Redox too


Was this written by a machine?


I can attest that I am flesh and blood. But then I suppose a bot would say that.

I cut my finger quite badly while writing the article so it did involve actual blood being spilled. Does that help?


I think Modula-2 was also quite influential on one of the main programming languages for modern industrial PLCs, called Structured Text. One of its features is lightning-fast compile times compared to C++.


I wouldn't say Symbian was C++ - I remember it being a sort of intermediate step between C and C++. There were no exceptions, no STL, and many other weird quirks.


C is still C without libc, and Python is still Python even if you can't "import os, sys", so why does lack of the STL matter for C++? Symbian used a standard C++ compiler and mandated a certain style of programming because C++ with the STL and exceptions wasn't actually safe on small machines with limited memory. No exceptions were allowed in the kernel, IIRC, and new had to return an error code rather than throw an exception, because handling an out-of-memory exception required... spare memory.
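
For illustration, the standard-C++ analogue of that policy looks something like this (a sketch only; Symbian's real mechanism was its own leave/cleanup-stack system rather than std::nothrow, IIRC):

    #include <new>
    #include <cstdio>

    int main() {
        // Ask for memory without exceptions: failure comes back as a null
        // pointer the caller checks, so no spare memory is needed just to
        // report that there is no memory.
        int* buf = new (std::nothrow) int[1024];
        if (buf == nullptr) {
            std::puts("allocation failed; degrade gracefully");
            return 1;
        }
        delete[] buf;
        return 0;
    }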


This is a common argument of the anti-C++ crowd when defending C, it is either 100% of all C++ features or it doesn't count.

Yet, the fact that most kernels cannot do ISO C gets cleverly forgotten.


I agree that the STL isn't a core part of C++, but exceptions clearly are - can you call it C++ if you can't use all the features of C++?


You can use them for sure - the compiler still supports them - but nasty stuff will happen if you do it in the Symbian kernel. Presumably there are things you don't want to do in C in the kernel - like using heap memory allocation in certain situations or calling printf() or something like that.


> There are still stranger things out there, such as House, implemented in the purely functional Haskell.

Yay, shout-out to a deep cut that once had me very excited :).


Very big omission of TempleOS. It was written in HolyC.


Not really, since the title of the article expressly says that it's about OSes not written in C.

Even weird C.


Writing a fully working OS is one thing. Writing a compiler is another thing.

Writing an OS that is compilable with that compiler (and also able to run that compiler) is ANOTHER thing.

Terry Davis was a mad genius.


TempleOS and HolyC excited me way more than they should have. The fact that a single person wrote not only the OS, but the language, games, and applications blew me away. Few people have that kind of persistence.


As far as I know, MS Windows was written in C++. DomainOS (Aegis) was written in Pascal.


Do these operating systems also use a non-C calling convention for libraries?


Hubris, the Rust OS I work on, is... aggressively static and doesn't have dynamic linking (or loading, in fact), but the kernel ABI is non-C, yes: https://hubris.oxide.computer/reference/#_syscall_abi

(We use it in Rust via inline assembly stubs.)


It is common for an OS to use a non-C protocol for actual system-call traps, but also unfortunately common not to document, support, or maintain those, so users are obliged to call a provided C library to use them.
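
For concreteness, here is roughly what the raw trap protocol looks like on x86-64 Linux, one of the few systems that does document and stabilize it (a sketch using GCC/Clang inline asm; elsewhere you really are expected to go through the vendor's C library):

    #include <cstddef>

    // write(2) invoked directly via the register protocol, bypassing libc:
    // syscall number in rax, arguments in rdi/rsi/rdx, result back in rax.
    static long raw_write(int fd, const void* buf, std::size_t len) {
        long ret;
        asm volatile("syscall"
                     : "=a"(ret)
                     : "a"(1),                    // 1 = SYS_write on x86-64 Linux
                       "D"(fd), "S"(buf), "d"(len)
                     : "rcx", "r11", "memory");   // the syscall instruction clobbers these
        return ret;
    }

    int main() {
        raw_write(1, "hello from a raw trap\n", 22);
        return 0;
    }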


> Wirth a go

They should rename Oberon to Wirth.


“You can call me by name, or you can call me by value.”


click-bait title for programmers if I ever saw one.


That was part of the idea, yes. :-D

Also note that it was a loose follow-on to:

https://www.theregister.com/2022/03/23/c_not_a_language/



