> Borrow checking is incompatible with some useful patterns and optimizations (described later on), and its infectious constraints can have trouble coexisting with non-borrow-checked code.
Not that this isn't true, but the rest of the article introduces a system with a superset of those limitations, gradually decreasing over time but never becoming a subset. In fact the pattern described in the article is a common pattern in Rust and I make use of it all the time; the library for making use of it is `slotmap`.
Later on, it adds generational references and constraint references to relax the restrictions. These are both more flexible than SlotMap because they don't require a new parameter to be passed in from the callers (and callers' callers etc), which can cause problems when an indirect caller's signature can't change (trait method override, public API, drop, etc.)
> Borrow checking is incompatible with some useful patterns.
The main problem is back references, as in doubly linked lists. In Rust, you can do that sort of thing using Rc and the weak/strong reference mechanism. Forward references own, and are strong. Back references are weak.
I've been toying with the idea of some generic types which allow strong forward references which you can't copy or clone, and weak back references which you can't make strong outside a contained scope. This can be implemented with the existing Rc system, and potentially could be proven, with a static analyzer, to not require the reference counts. It's worth a try to see if one can effectively program under those restrictions. If it's not too much of a pain to use, this might be an effective way out of Rust's back-reference problem.
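A minimal sketch of that idea using today's Rc machinery (all names here are illustrative, not an existing crate's API): forward links own via `Rc`, back links are `Weak`, and following a back link means upgrading inside a contained scope.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Forward references own (strong Rc); back references don't (Weak),
// so the forward/back cycle never leaks.
struct Node {
    value: i32,
    next: Option<Rc<RefCell<Node>>>,
    prev: Option<Weak<RefCell<Node>>>,
}

fn main() {
    let first = Rc::new(RefCell::new(Node { value: 1, next: None, prev: None }));
    let second = Rc::new(RefCell::new(Node { value: 2, next: None, prev: None }));

    // Strong forward link, weak back link.
    first.borrow_mut().next = Some(Rc::clone(&second));
    second.borrow_mut().prev = Some(Rc::downgrade(&first));

    // Following the back reference requires upgrading within a scope;
    // the strong handle exists only for the duration of this statement.
    let back = second.borrow().prev.as_ref().unwrap().upgrade().unwrap();
    assert_eq!(back.borrow().value, 1);
}
```

The "can't copy or clone the strong link, can't keep the upgraded back link" rules would be extra restrictions layered on top of this; plain Rc/Weak doesn't enforce them.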
> In fact the pattern described in the article is a common pattern in Rust and I make use of it all the time; the library for making use of it is `slotmap`.
Slotmap uses unsafe everywhere, it's a memory usage pattern not supported by the borrow checker. It's basically hand-implementing use-after-free and double-free checks, which is what the borrow checker is supposed to do. Is that really a common pattern in Rust?
> Slotmap uses unsafe everywhere, it's a memory usage pattern not supported by the borrow checker. Is disabling the borrow checker really a common pattern in Rust?
Wrapping "unsafe" code in a safe interface is a common pattern in Rust, yes. There is absolutely nothing wrong with using "unsafe" so long as you are diligent about checking invariants, and keep it contained as much as possible. Obviously the standard library uses some "unsafe" as well, for instance.
"unsafe" just means "safe but the compiler cannot verify it".
Unsafe does not disable the borrow checker, though. All of the restrictions of safe Rust still apply. All "unsafe" does is unlock the ability to use raw pointers and a few other constructs.
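A small illustration of that point: creating a raw pointer is safe, only dereferencing it needs `unsafe`, and the borrow checker keeps running inside the block.

```rust
fn main() {
    let mut x = 10i32;

    // Creating a raw pointer is safe; dereferencing it is not.
    let p = &mut x as *mut i32;
    unsafe {
        *p += 1;
    }
    assert_eq!(x, 11);

    // The borrow checker still applies inside `unsafe` blocks: taking
    // two simultaneous `&mut x` borrows here would still be rejected.
}
```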
> Obviously the standard library uses some "unsafe" as well, for instance.
Most beautifully, MaybeUninit<T>::assume_init() -> T
This unsafe Rust method says "I promise that I actually did initialize this MaybeUninit<T>, so give me the T".
In terms of the resulting program the machine is not going to do any work whatsoever, a MaybeUninit<T> and a T are the same size, they're in the same place, your CPU doesn't care that this is a T not a MaybeUninit<T> now.
But from a type safety point of view, there's all the difference in the world.
Even though it won't result in emitting any actual CPU instructions, MaybeUninit::assume_init has to be unsafe. Most of the rest of that API surface is not. Because that API call, the one which emitted no CPU instructions, is where you took responsibility for type correctness. If you were wrong, if you haven't initialized T properly, everything may be about to go spectacularly wrong and there's no-one else to blame but you.
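The whole exchange fits in a few lines: the wrapper is the same size as the payload, initialization is safe, and only the promise at the end is `unsafe`.

```rust
use std::mem::{size_of, MaybeUninit};

fn main() {
    // Same size, same place: the wrapper is purely a type-level claim.
    assert_eq!(size_of::<MaybeUninit<u64>>(), size_of::<u64>());

    let mut slot: MaybeUninit<u64> = MaybeUninit::uninit();
    slot.write(42); // safe: this is where the value actually gets initialized

    // The unsafe part is the *promise* that initialization happened;
    // the conversion itself emits no machine code.
    let value: u64 = unsafe { slot.assume_init() };
    assert_eq!(value, 42);
}
```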
Exactly. People miss this all the time when they write off Rust for "needing unsafe to do real programming" or whatever uninformed criticism they're parroting (they've clearly never actually done this "real programming" in Rust). The whole point is to reduce the opportunity for unforced errors by minimizing the cognitive load required for the programmer to ensure the program is correct. And a program with a few unsafe blocks to `assume_init` some memory that e.g. a driver initialized for you is still infinitely better in that regard than a program that's littered with `void*` everywhere.
> than a program that's littered with `void*` everywhere
Strawman argument. A properly written C++ program isn't littered with `void*` everywhere in the same way that a properly written Rust program isn't littered with `unsafe` everywhere. You build safe abstractions around the ugly low-level pointer handling, you just don't have a keyword for a clear delineation.
> People miss this all the time when they write off Rust for "needing unsafe to do real programming" or whatever uninformed criticism they're parroting
Hard-core Rust proponents also seem to miss this all the time. Because "you basically write the same unsafe code that you would write in C++ but you now have a keyword to mark it" just doesn't imply the same urgency for adopting the language as "you only need unsafe to implement a few primitives in the standard library" does, which always seems to be tacitly implied until called out, and then the critics are "misinformed."
Firstly, the delineation clarity is much more valuable than you seem to appreciate. A day-one beginner in Rust can see that this stuff is roped off - so they know when to call a grown-up - and everything which isn't roped off is safe for them. This also benefits an experienced developer when you're not at your best. Let's not write unsafe Rust today; we can do that when the air conditioning works, the coffee machine is fixed, and there aren't contractors using power tools in the office.
I also think you very seriously underestimate how much equivalently unsafe C++ you write, and overestimate how much actual unsafe Rust is needed. Philosophically, WG21 (the C++ committee) didn't like safe abstractions, so it doesn't provide them. To the point where the C++ slice type std::span is exactly like the proposal in which it was originally suggested, except with all the safety explicitly ripped out. "We like this safety feature, except for the safety, get rid of that." I am not even kidding.
Most Rust programmers don't need to write any unsafe Rust. They can rely on Rust's promises, about aliasing, races, memory safety, performance characteristics, and they have no responsibility for delivering those promises, it's all done for them so long as they write safe Rust.
The other crucial element is culture. Culturally Rust wants safe abstractions, that applies to the standard library of course, but it also applies to third party code, you can expect other Rust programmers to think your library is crap if it has a method which is actually not safe to call without certain pre-conditions but isn't labelled "unsafe" -- because that's exactly what "unsafe" is for so you're not fulfilling your social contract.
> You build safe abstractions around the ugly low-level pointer handling, you just don't have a keyword for a clear delineation.
The main difference is they are not really safe. It is trivial to accidentally invoke UB with incorrect use of "safe" abstractions in C++ like built-in containers or smart pointers. Keep a reference to a vector element, add a new item to the vector and it will sometimes blow up ;)
I disagree that it is "trivial," at least in the example you stated. This take-reference-then-mutate is exactly the kind of usage that the borrow checker prevents. You have to avoid it systematically in both languages.
The built-in containers are also not the best examples of "safe" abstractions. You can build safer abstractions, and you can employ safer usage patterns of built-in vectors, at non-zero but marginal costs.
The honest view on C++ is that there is no such thing as "safe" in absolute terms, but you have a lot of tools to mitigate the unsafe nature of the core language.
The honest view on Rust is that the idea of categorically excluding memory safety errors didn't quite pan out, but we're nonetheless left with an improvement over C++.
It’s subtle, but you don't avoid “take reference then mutate” in Rust, you are told exactly how to do it without aliasing the memory.
I’m not going to say Rust is perfect, that’s obviously not the case. But I really think your argument, like others are saying, underplays the actual value of Rust.
I’ve written entire projects in both C++ and Rust. I’ve never wasted days debugging memory corruption in Rust. Just sayin’.
If unsafe means “safe but the compiler cannot verify” then I guess just consider .cpp to mean “safe but the compiler cannot verify” and we have suddenly made C++ memory safe
There's a related idea in Haskell, usually considered a memory safe language. You can write a program in Haskell that directly mutates memory, or does IO operations, freely, anywhere in the code. This violates functional purity and the compiler cannot offer its usual promises; your program may very well segfault from a bug in such code. But sometimes you just have to, perhaps to implement an algorithm efficiently.
Still, it is discouraged; both culturally in the language community, and discouraged through the subtle prodding of the language itself (such as everything being typed "IO", or the slightly ominous "unsafe" in the "unsafePerformIO".) Very often, the amount of code that must truly live in IO can be reduced to a few dozen lines, if that. That code is crucial to get right -- it's where the actual sequence of computation and external effects are handled. Such isolation allows the rest of the code to not have to worry about those matters.
Sure, and if a typical Rust program that I write has no unsafe in it directly, and 5% of its dependencies' code have unsafe in them, that's also the same as writing a program in the "not c++" language directly, and using "not c++" dependencies for all but 5% of the dependency code.
Unsafe Rust is safer than C++, and even if it wasn't, 5% unsafe in Rust programs (in well-marked locations) is vastly superior to 100% unsafe in C++ programs.
Unsafe Rust is less safe than C++ because of the provenance and aliasing semantics that unsafe Rust must adhere to to avoid UB, which are generally trickier than those of C++.
The provenance rules in the C++ standard are basically just a shrug emoji†, so it's unclear whether those are worse, I can see an argument for the idea that obeying Aria's strict provenance experiment rules in Rust is easier - not because it's easy (although for many common cases it is) but because at least these are coherent rules.
The core value proposition of rust is that it’s memory safe by default, and it’s possible to limit the set of code that needs to be manually checked for UB. This isn’t the case for C++, as any code anywhere can invoke undefined behavior.
True, as long as static analysers aren't part of the build; once they are, specific constructs can be made to break the CI/CD build, forcing everyone to play by the rules if they want the PR to go through.
It isn't perfect, but it does improve the security baseline a lot.
> so long as you are diligent about checking invariants
That's the hard part. Could you go through and check all the parts of a huge C++ codebase to make sure invariants are held, as opposed to a few hundred lines of unsafe Rust code?
Right, I guess the question is what will that proportion be when Rust is used for things like operating systems and web browsers. 30% would be untenable but a few hundred/thousand lines of unsafe code is fairly easy to put under a microscope.
For some current day research into this, there is the paper "How Do Programmers Use Unsafe Rust?"[1] which I'll drop a quote from here:
> The majority of crates (76.4%) contain no unsafe features at all. Even in most crates that do contain unsafe blocks or functions, only a small fraction of the code is unsafe: for 92.3% of all crates, the unsafe statement ratio is at most 10%, i.e., up to 10% of the codebase consists of unsafe blocks and unsafe functions
That paper is definitely worth reading and goes into why programmers use unsafe. e.g 5% of the crates at that time were using it to perform FFI.
In writing "RUDRA: Finding Memory Safety Bugs in Rust at the Ecosystem Scale" [2], I recreated this data, and year by year the % of crates using unsafe is going down. And for what it's worth, crates are probably a bad dataset for this: crates tend to be libraries, which are exactly where we would expect to find unsafe code encapsulated to be used safely. There are also plenty of experimental and hobby crates. A large dataset of actual binaries would be far more interesting to look at.
Or Rust in Android: in one deep dive there were only two places with unsafe code, and the vetting triggered by their being the only two places found a bug in the existing implementation.
As we follow the standard rust rule that "safe code should not be able to use unsafe code to do unsafe things", those unsafe bits of code have been very carefully checked, to the best of our abilities, to ensure they don't create memory safety issues. It is a lot easier to triple-check 170 lines of code than 30,000 lines.
Are you using wgpu for the rendering stuff? Heard that WebGPU had to sacrifice some performance in order to make the API safer for the web (like more bounds checking and sanity checks). These kinds of issues are actually plaguing projects like Tensorflow.js (for example see https://github.com/gpuweb/gpuweb/issues/1202).
Other libraries like Vulkan and DirectX 12 are fundamentally unsafe in the API level, so direct usage of it would lead to heaps of unsafe Rust code. Rust people have tried wrapping it in a safe way (like gfx-rs and vulkano) but nowadays most seem to have transitioned to wgpu (since WebGPU API is safe by design so it fits more for the Rust ecosystem).
Rust does sacrifice some performance in general in order to achieve its safety claims, but people are happy with it so far, since the majority of applications using Rust (like CLI apps and web servers) don't have to squeeze out performance that much (for webdev there are too many things that can cause performance issues other than not writing it in Rust). But for 3D graphics people can be more sensitive about these problems. Though maybe if you're not developing a triple-A game with the latest cutting-edge graphics (with new techniques like "hardware ray tracing" and "bindless descriptors", which are both impossible in wgpu), writing in Rust can be a good-enough tradeoff for your needs.
WGPU is just finishing up a major reorganization of locking and internal memory management, going from a global lock to fine-grained Arc reference counts.[1] Change log, just posted a few minutes ago: "Arcanization of wgpu core resources: Removed 'Token' and 'LifeTime' related management, removed 'RefCount' and 'MultiRefCount' in favour of using only 'Arc' internal reference count, removing mut from resources and added instead internal members locks on demand or atomics operations, resources now implement Drop and destroy stuff when last 'Arc' resources is released, resources hold an 'Arc' in order to be able to implement Drop, resources have an utility to retrieve the id of the resource itself, removed all guards and just retrive the 'Arc' needed on-demand to unlock registry of resources asap removing locking from hot paths."
From a performance standpoint, I'm much more concerned about being able to get all the CPUs working on the problem than slight improvements in per-CPU performance. My metaverse viewer has slow frames because loading content into the GPU from outside the rendering thread blocks the rendering thread. All that Arcanization should fix that.
A counterpoint that makes this argument a bit weaker: Rust’s “unsafe” marker doesn’t pollute only its scope; it actually pollutes the whole module. You need to make sure that the invariants in unsafe code are met even in safe code. (An explanation of this in the Rustonomicon: https://doc.rust-lang.org/nomicon/working-with-unsafe.html)
So there’s quite a lot more code to actually check than what some of the Rust proponents are saying. One can say that C++ is still worse in this regard (theoretically you need to check 100% of your code to be safe in C++). But for the minority of developers who frequently need to delve into unsafe code, the advantages of Rust might seem a bit more disappointing (“the compiler doesn’t really do that much for the more important stuff…”)
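The Rustonomicon's point can be sketched like this (the type and methods are my own illustration, not real library code): the `unsafe` block below is only sound because every method in the module maintains an invariant, including the methods that contain no unsafe at all.

```rust
// A fixed buffer with an invariant: `len <= data.len()`. The unsafe
// block in `last` relies on it.
struct TinyVec {
    data: [i32; 4],
    len: usize,
}

impl TinyVec {
    fn new() -> Self {
        TinyVec { data: [0; 4], len: 0 }
    }

    fn push(&mut self, v: i32) {
        assert!(self.len < self.data.len(), "full");
        self.data[self.len] = v;
        self.len += 1;
    }

    fn last(&self) -> Option<i32> {
        if self.len == 0 {
            return None;
        }
        // SAFETY: `self.len - 1` is in bounds *only if* every method in
        // this module maintains `len <= data.len()`. A safe method like
        // `fn grow(&mut self) { self.len += 1; }` would break that
        // without writing any unsafe code -- which is why the whole
        // module must be audited, not just this block.
        Some(unsafe { *self.data.get_unchecked(self.len - 1) })
    }
}

fn main() {
    let mut v = TinyVec::new();
    assert_eq!(v.last(), None);
    v.push(5);
    v.push(7);
    assert_eq!(v.last(), Some(7));
}
```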
> whole point of rust is that memory safety issues are never worth the cost
I don’t think that’s the point of Rust — otherwise why not write Java, or one of a litany of GC’d languages instead?
Rust is a low-level/systems programming language where you have more control over the program’s execution (e.g. no fat runtime), which is a necessity in some rare, niche, but important use cases.
That's not what unsafe means. Unsafe means a construct might cause UB for some invocations (accessing raw pointers, calling into another language, etc.). Safe means it will not cause UB for any invocation (though it may panic or abort).
It's essentially a "user-space" memory allocator with its own use-after-free and double-free checks, apparently because the language implementation isn't adequate. If anything it just reinforces the article's point that "borrow checking is incompatible with some useful patterns and optimizations."
There is absolutely no need for unsafe in slotmap. I chose to use unsafe (wrapped in a safe API) to reduce memory usage using intrusive linked freelists.
If done using safe Rust this would involve `enum`s that would take up extra space.
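A sketch of what that space cost looks like (the `Slot` type here is illustrative, not slotmap's actual internals): the safe enum needs a discriminant alongside the payload, so for payloads with no spare niche the slot is bigger than the value it holds.

```rust
use std::mem::size_of;

// The safe-Rust slot: a tagged union. The intrusive-freelist trick
// instead reuses the payload's own storage for the free-list link,
// which is what requires unsafe.
enum Slot<T> {
    Occupied(T),
    Vacant { next_free: u32 },
}

fn main() {
    // A u64 payload has no spare niche, so the enum pays for a tag
    // (plus padding to the u64's alignment).
    assert!(size_of::<Slot<u64>>() > size_of::<u64>());
}
```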
A downside for sure, but one that, at least in this specific example, has limited downsides. If you can button it up into a safe abstraction that you can share with others, then I don't really see what the huge problem is. The fact that you might need to write `unsafe` inside of a well optimized data structure isn't a weakness of Rust, it's the entire point: you use it to encapsulate an unsafe core within a safe interface. The standard library is full of these things.
Now if you're trying to do something that you can't button up into a safe abstraction for others to use, then that's a different story.
There are two things here. The `unsafe` in an `unsafe { ... }` block is referring to the contents of the block. From the outside it is indeed safe to use as if it were safe code. No special requirements necessary. So, yes, from a certain point of view `safe` would have been a better name (albeit confusing in a different way).
An `unsafe fn` however does need to be used correctly (and should document those requirements). However, these can only be called within `unsafe` blocks, so see above.
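A small sketch of that division of labor (function names are made up for illustration): the `unsafe fn` documents its precondition, and the safe wrapper checks it before discharging the obligation in an `unsafe` block.

```rust
/// Returns the element at `index` without a bounds check.
///
/// # Safety
/// `index` must be less than `slice.len()`.
unsafe fn get_unchecked_copy(slice: &[i32], index: usize) -> i32 {
    // SAFETY: the caller guarantees `index < slice.len()`.
    unsafe { *slice.get_unchecked(index) }
}

/// Safe wrapper: checks the precondition, then discharges it in an
/// `unsafe` block. Callers need no special care.
fn get_copy(slice: &[i32], index: usize) -> Option<i32> {
    if index < slice.len() {
        // SAFETY: bounds were just checked above.
        Some(unsafe { get_unchecked_copy(slice, index) })
    } else {
        None
    }
}

fn main() {
    let data = [1, 2, 3];
    assert_eq!(get_copy(&data, 1), Some(2));
    assert_eq!(get_copy(&data, 9), None);
}
```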
Not entirely correct, Rust’s “unsafe” marker doesn’t pollute only its scope, it actually pollutes the whole module; You need to make sure that the invariants in unsafe code are met even in safe code. (An explanation of this in the Rustonomicon: https://doc.rust-lang.org/nomicon/working-with-unsafe.html)
> Slotmap uses unsafe everywhere, it's a memory usage pattern not supported by the borrow checker.
Author of slotmap here. This is patently false.
Yes, the slotmap crate uses a lot of unsafe to squeeze out maximum performance. But it is not 'a memory usage pattern not supported by the borrow checker'. You can absolutely write a crate with an API identical to slotmap without using unsafe.
> But it is not 'a memory usage pattern not supported by the borrow checker'. You can absolutely write a crate with an API identical to slotmap without using unsafe.
I think that might actually be worse though, performance aside. You're performing memory / object lifetime management but the Rust borrow checker still would have no idea what's going on because now you've tricked it by using indices or an opaque handle instead of references. The program may compile just fine but could have use-after-free bugs.
At least with unsafe there's an explicit acknowledgement that the borrow checker is turned off.
Yes, using slotmap you can get "use after free"-style bugs that you would not encounter if you strictly stayed with the borrow checker. So if the borrow checker fits your purpose, by all means, go ahead.
But the borrow checker can't represent circular/self-referential structures you see very often in graphs. Nor is it convenient in some cases as it has a strict separation between references that can mutate, and those that can't, which doesn't fit all problems either because the mutable references are by necessity unique.
Note that a "use after free" in slotmap results in a None value, or a panic (exception for the C++ people), depending on which API you use. In other words, it is detected and you can handle it. It does not trigger undefined behavior, you don't get ABA-style spurious references, there are no security issues. It is not the same as the issues pointers have at all.
I have implemented my own slotmap crate for a lisp interpreter that uses no unsafe code and provides exactly the same features as the "standard" slotmap crate.
There is nothing inherent to the slotmap that requires unsafe code! It's only used for optimization purposes.
Mine works in a similar way to the "standard" slotmap. It's a vec of slots; a slot is an enum that can be occupied or vacant, the occupied variant is a two-tuple containing the value and generation, and the vacant variant holds just a generation. Inserting into the slotmap simply switches the variant of the slot from vacant to occupied, and popping does the reverse. If there are no currently vacant slots, we just use the underlying push method on the vec of slots, which will handle resizing for us! I also store a stack of indexes to vacant slots to make insertion fast.
When you insert into the slotmap, it gives you an opaque key, but the data inside is an index and a generation. When you attempt to retrieve a value with a key, the slotmap checks whether the slot is occupied and the generation matches; if so it returns the value, otherwise it returns none.
There is also an indirect slotmap that adds an extra layer of indirection: rather than the key being an index directly into the underlying vec of slots, it's an index into a vec of indexes. This allows moving the slots around without invalidating currently living keys.
The indirect slotmap has the advantage of faster iteration, since it doesn't have to skip over empty "holes" of vacant slots in the vec of slots. The tradeoff is that insertion is slightly slower!
Anyways, no unsafe is required to implement a performant slotmap data structure! I have not uploaded my slotmap to crates.io because I didn't think anyone would find it useful, but maybe I should reconsider this!
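The description above can be condensed into a minimal, entirely safe sketch (names and API are mine, not the `slotmap` crate's): a vec of enum slots, a free-slot stack, and generation checks that turn stale keys into `None` instead of dangling pointers.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Key {
    index: usize,
    generation: u32,
}

enum Slot<T> {
    Occupied { value: T, generation: u32 },
    Vacant { generation: u32 },
}

struct SlotMap<T> {
    slots: Vec<Slot<T>>,
    free: Vec<usize>, // stack of vacant slot indexes for fast insertion
}

impl<T> SlotMap<T> {
    fn new() -> Self {
        SlotMap { slots: Vec::new(), free: Vec::new() }
    }

    fn insert(&mut self, value: T) -> Key {
        if let Some(index) = self.free.pop() {
            // Reuse a vacant slot, keeping its already-bumped generation.
            let generation = match self.slots[index] {
                Slot::Vacant { generation } => generation,
                Slot::Occupied { .. } => unreachable!("free list points at an occupied slot"),
            };
            self.slots[index] = Slot::Occupied { value, generation };
            Key { index, generation }
        } else {
            self.slots.push(Slot::Occupied { value, generation: 0 });
            Key { index: self.slots.len() - 1, generation: 0 }
        }
    }

    fn get(&self, key: Key) -> Option<&T> {
        match self.slots.get(key.index) {
            Some(Slot::Occupied { value, generation }) if *generation == key.generation => {
                Some(value)
            }
            _ => None,
        }
    }

    fn remove(&mut self, key: Key) -> Option<T> {
        match self.slots.get(key.index) {
            Some(Slot::Occupied { generation, .. }) if *generation == key.generation => {}
            _ => return None,
        }
        // Bump the generation so stale keys are detected, not dangling.
        let old = std::mem::replace(
            &mut self.slots[key.index],
            Slot::Vacant { generation: key.generation + 1 },
        );
        self.free.push(key.index);
        match old {
            Slot::Occupied { value, .. } => Some(value),
            Slot::Vacant { .. } => unreachable!(),
        }
    }
}

fn main() {
    let mut map = SlotMap::new();
    let a = map.insert("alpha");
    let b = map.insert("beta");
    assert_eq!(map.get(a), Some(&"alpha"));

    // A stale lookup yields None, not UB.
    assert_eq!(map.remove(a), Some("alpha"));
    assert_eq!(map.get(a), None);

    // The slot is reused, but under a new generation.
    let c = map.insert("gamma");
    assert_eq!(c.index, a.index);
    assert_ne!(c.generation, a.generation);
    assert_eq!(map.get(b), Some(&"beta"));
    assert_eq!(map.get(c), Some(&"gamma"));
}
```

No `unsafe` anywhere; the cost relative to the real crate is the enum tag per slot and the bounds checks.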
The question is far too broad, and contextual. You're never going to get an answer to that question.
Sometimes, the rules add more optimization potential. (like how restrict technically exists in C but is on every (okay almost every) reference in Rust) Sometimes, the rules let you be more confident that a trickier and faster design will be maintainable over time, so even if it is possible without these rules, you may not be able to do that in practice. (Stylo)
Sometimes, they may result in slower things. Maybe while you could use Rust's type system to help you with a design, it's too tough for you, or simply not worth the effort, so you make a copy instead of using a reference. Maybe the compiler isn't fantastic at compiling away an abstraction, and you end up with slower code than you otherwise would.
And that's before you get into complexities like "I see Rc<RefCell<T>> all the time in Rust code" "that doesn't make sense, I never see that pattern in code".
I'd say it mostly applies to manual optimization, when we're restructuring our program.
If the situation calls for a B-tree, the borrow checker loves that. If the situation calls for some sort of intrusive or self-referential data structure (like in https://lwn.net/Articles/907876/), then you might have to retreat to a different data structure which could incur more bounds checking, hasher costs, or expansion costs.
It's probably not worth worrying about most of the time, unless you're in a very performance-sensitive situation.
There can be no answer. Research is ongoing and smart people are actively trying to make the optimizer better, so even if I gave a 100% correct answer now (which would be pages long), a new commit one minute later would change the rules. Sometimes someone discovers that what we thought was safe isn't safe in some obscure case, and we are forced to stop applying some optimization. Sometimes an optimization is a compromise, and we decide that spending a couple of extra CPU cycles is worth it because of some other gain (a single CPU cycle is often impossible to measure in the real world, as things like caches tend to dominate benchmarks, so you can make this compromise many times before the total adds up to something you can measure).
The short answer for those who don't want details: it is unlikely you can measure a difference in real world code assuming good clean code with the right algorithm.
Without directly answering your question, it's worth noting that there are also additional optimizations made available by Rust that are not easily accessible in C/C++ (mostly around stronger guarantees the Rust compiler is able to make about aliasing).
However, what you can say is that the borrow checker works like a straitjacket for the programmer, making them less able to focus on other things like performance issues, high-level data leaks (e.g. a map that is filled with values without removing them eventually), or high-level safety issues.
You can also say that the borrow checker works like a helpful editor, double checking your work, so that you can focus on the important details of performance issues, safety issues, and such, without needing to waste brain power on the low-level details.
The point is that the compiler helps you “read” it. This takes mental effort off of you.
I agree that not everyone thinks this is true, but this is my experience. I do not relate to the compiler as a straitjacket. I relate to it as a helpful assistant.
I think it’s generally accepted that writing code is nearly universally easier than reading code, in any language. That aside, getting a mechanical check on memory safety for the price of some extra language verbosity is obviously worth it IMO.
By the same token, it is common to see criticisms of the complexity of templates in C++, but templates are the cornerstone of “Modern C++” and many libraries could not exist without them.
GC has little to do with it. The borrow checker as a developer tool has much more to do with preventing concurrency bugs and unexpected mutation than it does with memory management.
"As a developer tool" is doing some work in that sentence though. As a language implementation characteristic, the checker can help inform (or, more accurately, ensures that code is written in a way that informs) memory management decisions.
What performance? That’s not a single thing. Do you pay in throughput or latency?
It certainly has a price but it is waaay too overblown in many discussions. What it mostly does entail is a slightly larger p99 latency. Where it actually matters is entirely another question.