Hacker News

When is C++ higher-performance than Rust?



Rust's lack of fast thread-local storage makes it a non-starter for me.

It's really disappointing, especially when the language has incorporated so many other excellent performance improvements like Google's SwissTable and pattern-defeating quicksort.

https://github.com/rust-lang/rust/issues/29594

https://matklad.github.io/2020/10/03/fast-thread-locals-in-r...
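For context, here's a sketch of what stable Rust offers today versus what the issue tracks (`COUNTER`/`bump` are made-up names):

```rust
use std::cell::Cell;

// Stable Rust only has the thread_local! macro: every access goes through
// a .with() closure and, on some platforms, a lazy-init check or function
// call. The nightly-only #[thread_local] attribute instead lowers to a
// direct, C-style `__thread` access -- the "fast" variant the issue asks for.
thread_local! {
    static COUNTER: Cell<u64> = Cell::new(0);
}

fn bump() -> u64 {
    COUNTER.with(|c| {
        c.set(c.get() + 1);
        c.get()
    })
}

fn main() {
    bump();
    println!("{}", bump()); // each thread sees its own counter
}
```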


FWIW, Rust does have fast thread locals, but only in nightly. https://github.com/rust-lang/rust/issues/29594


How many companies are running Rust nightly in production?


According to the 2020 Rust Survey[0], a decent number. The majority are on stable Rust, but a bit less than a third of Rust users are on nightly.

The company I work at uses a pinned nightly Rust in order to access some soon-to-be-stabilized features that simplify our life a bit. We update our pinned nightly in lockstep with stable Rust releases. In practice, we've almost never had problems with using nightly Rust - nightly builds are still very well tested, and problems are caught early and fast. The Rust test suite is quite thorough!
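For reference, pinning is typically done with a rust-toolchain.toml file at the repo root (the date below is made up):

```toml
# rust-toolchain.toml -- everyone on the team builds with this exact nightly
[toolchain]
channel = "nightly-2021-10-01"
components = ["rustfmt", "clippy"]
```

rustup picks this file up automatically, so the pin applies to the whole team without any extra setup.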

However, we do try to avoid using features that don't have a clear trajectory towards stabilization, so for us to consider thread_locals, we'd need to have some very solid proof that it would fix some critical performance problems.

I suspect every org will have its own policy, and while the majority will use stable Rust, it's not like nightly Rust is completely unthinkable.

[0]: https://blog.rust-lang.org/2020/12/16/rust-survey-2020.html


This survey says about half of the respondents are using Rust in production (or at work in some capacity). It doesn't say what percentage of those who use Rust at work are on stable vs. nightly.


Yup, this is the #1 annoying performance issue in Rust. It would probably require a few language extensions to make it work safely, but nothing insurmountable, I think.


There's no placement new IIRC, so you always build on the stack and copy to the heap.

https://users.rust-lang.org/t/how-to-create-large-objects-di...


You can do "placement new" (Rust has no new operator, but in this context) unsafely with MaybeUninit -- https://doc.rust-lang.org/stable/std/mem/union.MaybeUninit.h...

Make a MaybeUninit<Doodad> on the heap, initialise it, and then unsafely assume_init() remembering to write your safety justification ("I initialised this, so it's fine") and get a Doodad, on the heap.

The reason it isn't mentioned in that 2019 u.r-l.o post is that MaybeUninit wasn't stabilized until mid-2019.
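Roughly like this (a sketch on stable Rust; `Doodad` is a stand-in for your large type):

```rust
use std::mem::MaybeUninit;

// Stand-in for a type too large to want to build on the stack first.
struct Doodad {
    data: [u8; 4096],
}

fn heap_doodad() -> Box<Doodad> {
    // Uninitialized storage, allocated on the heap.
    let mut slot: Box<MaybeUninit<Doodad>> = Box::new(MaybeUninit::uninit());
    // Initialise it in place.
    slot.write(Doodad { data: [0; 4096] });
    // SAFETY: "I initialised this, so it's fine." The cast is sound because
    // MaybeUninit<T> has the same layout as T.
    unsafe { Box::from_raw(Box::into_raw(slot) as *mut Doodad) }
}

fn main() {
    let d = heap_doodad();
    println!("{}", d.data.len()); // 4096
}
```

Note that whether the optimizer actually avoids a transient stack copy inside Box::new is not guaranteed, which is part of why people still want real placement new; on recent Rust, Box::new_uninit allocates the uninitialized storage directly.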


I think that's a question of an "insufficiently smart compiler" rather than necessarily something the programmer should be concerned about?


No, it is a language semantics issue. You can get the compiler to optimize it away today, if you're careful. But that's a bad user experience. You should be able to specifically request this semantic, and be sure it works.


The programmer should certainly be concerned about the compiler being insufficiently smart.


Fair point.


I would frame the question more in terms of economy than performance. In theory, I could write a database engine in Rust that is as performant as C++, but it would not make sense to.

Making Rust perform the same as C++ in database engines requires marking large portions of the code base as "unsafe", while requiring substantially more lines of code due to being less expressive. Note that this doesn't imply that the code is actually unsafe, just that the Rust compiler cannot reason about the safety of correct and idiomatic code in database engines.

And this is why state-of-the-art database engines are still written in C++. There is not much point in using Rust if doing so requires disabling the safety features and writing a lot more code to achieve the same result.


Definitely not my experience with file I/O and database-related software in Rust. Actually, two of my publicly available programs - fclones and latte - seem to be the most efficient in their respective classes, beating a wide variety of C and C++ programs, yet they contain almost no unsafe code (< 0.1%).

The state of the art database engines are written in C++ (and Java) because at the time they were developed Rust didn't exist.


All high-performance database engines do full kernel bypass -- Linux has standard APIs for this. Databases designed this way are much higher performance, which is why people do it. It may not be obvious what this has to do with Rust.

The downside of kernel bypass is that the kernel is no longer transparently hiding the relationship between storage, memory, and related low-level hardware mechanics. A consequence of this is that the mutability and lifetime of most objects cannot be evaluated at compile-time, which Rust relies on for its safety model. The hardware can hold mutable references to memory that are not represented in software and which don't understand your object model. How do you manage lifetimes when references to objects are not visible in the code? When an object does not have a fixed address? And so on. This is not trivial even in C++ but the language does allow you to write transparent abstractions that hide this behind something that behaves like a reference without complaining.

There are proven models that guarantee safety under these constraints, widely used in database engines. Instead of ownership semantics, they are based on dynamic scheduling semantics. The execution scheduler can dynamically detect unsafe (or other) situations and rewrite the execution schedule to guarantee safety and forward progress without violating other invariants. This is why, as a simple example, some databases never seem to produce deadlocks -- deadlock situations still occur but they are dynamically detected before they occur and are resolved transparently by editing the lock and execution graph.

Some major classes of database architecture optimization don't work in garbage collected languages. For this reason, you never see state-of-the-art database engines written in e.g. Java.


I've seen a similar pattern implemented in Rust already, presenting a safe interface around relatively little unsafe code.

For example, the Bevy game engine's entity system works like a small in-memory database, and also replaces ownership (games also have lots of mutability and lifetimes that aren't clear at compile time) with careful dynamic scheduling. Users of this system just request a particular kind of access (mutable or immutable) to some set of entities, and the engine schedules them to avoid conflicts.

I'm not that familiar with databases so maybe I'm just totally missing what you're talking about, but from what it sounds like this is not something that would require large portions of a Rust program to be unsafe.

Overall I think the idea that Rust relies on knowing specific mutability and lifetimes at compile time is a bit of a misunderstanding of how the borrow checker works. It's rather a sort of glue that describes relationships between APIs, and when you are coming up with a custom approach to ensure safety it is still a useful language you can speak on the boundaries.


A tacit assumption of Rust's model is that code is running on top of a transparent virtual memory system e.g. what Linux provides by default. Game engines, and in-memory databases, typically run on top of the OS virtual memory implementation. A database engine that bypasses or disables the OS virtual memory system, which many do, breaks this assumption. This is common architecture in any performance-sensitive application that is I/O intensive, but databases are the most extreme example.

Almost every object in a database engine, whether transient or persistent, lives outside the virtual memory model provided by the OS. Because it is in user space, this is part of the database code and no longer transparent to the compiler. To a compiler, these objects have no fixed address, have an ambiguous number of mutable references at any point in time, and may have no address at all (e.g. if moved to storage). This is safe, and you can write abstractions in C++ that make this transparent to the developer, but the compiler can still see it and Rust does not like what it sees.

Performance is not the only reason to do this. The OS implementation is tightly coupled to the silicon implementation of virtual memory. Very large storage volumes can exceed the limits of what the hardware can support, but user space software implementations have few practical limits on scale.


> A tacit assumption of Rust's model is that code is running on top of a transparent virtual memory system e.g. what Linux provides by default.

Where did you get that from? There is nothing more in Rust than there is in C++ that relies on virtual memory subsystem. Rust is perfectly able to compile for platforms with no OS at all.

> This is safe, and you can write abstractions in C++ that make this transparent to the developer, but the compiler can still see it and Rust does not like what it sees.

Rust can do the same abstractions as C++ can. This is pretty much a standard for low-level crates, which come with some unsafe code, wrapped in higher-level, easy and safe to use abstractions.


You seem to be missing a critical point.

As a consequence of implementing some of these kernel functions in user space, it is not possible to determine object lifetimes or track mutable references at compile-time. Any object created in memory that bypasses the kernel mechanisms is unsafe as far as Rust is concerned. In a database, the vast majority of your objects are constructed on this type of memory.

You can implement this in Rust, but essentially the entire database kernel will be "unsafe". At which point, why bother with Rust? C++ is significantly more expressive at writing this type of code, and it is not trivial even in C++.


I work at Materialize, which is writing a database in Rust. We use some `unsafe` code, but it's a very small proportion of the overall codebase and quite well abstracted away from the rest of the non-unsafe code.

To be fair, we're trying to do a bit of a different thing (in-memory incrementally maintained views on streaming data) than most existing databases, so you could argue that it is not an apples-to-apples comparison.

But there are plenty of other databases written in non-C++ languages -- there is even TiDB which is written in Rust too.


High-performance and scientific computing - or, more generally, any workload where you transfer a lot of data structures via pointers to a lot of threads. In those scenarios, a half-second performance difference in the main loop can translate to hours over long runs.

The code I'm developing runs at ~1.7M iterations/core/second. Every one of these iterations contains another set of 1000 iterations or so (the number is variable, so I don't remember the exact mean, but the total is a lot). Also, this number is from an 8-year-old system.

More benchmarks are here: https://benchmarksgame-team.pages.debian.net/benchmarksgame/...


You haven’t actually explained when Rust is slower. You just described a situation where you care about performance.

What specific operations happen slower in Rust than in C++?


The link you provided shows C++ and Rust both performing best in various benchmarks. Seems like a wash to me, and to the extent one beats the other it's marginal. Which is to be expected because as long as the language is compiled and the compiler is good enough, you should be able to get as much performance out of it as your CPU can handle.


The numbers might look small, and indeed for most cases it can be considered a wash, but in some cases, these small differences return as hours.

For example, I'm using a lot of vector, matrix and array accesses in my code. A lot of these are copied around for computation. Moving that amount of data will trigger a lot of checks in Rust, and will create a slowdown inevitably.


What makes you think Rust forces more “checks” when moving data than C++ does? Can you give an example?


To wit, the Rust compiler is "sufficiently smart" in an extremely high number of scenarios such that it can achieve optimal performance by eliding things like bounds checks when it's able to determine that they would never fail.

This is, in my experience, the overwhelming majority of idiomatic Rust code. For cases where it can't, and performance is absolutely critical, there are easy escape hatches that let you explicitly elide the checks yourself.
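For example (a sketch; the SAFETY comment is the programmer's obligation, not something the compiler checks):

```rust
// Default indexing panics on out-of-range access; the escape hatch skips
// the check entirely when you can prove the invariant yourself.
fn last_checked(v: &[u32]) -> u32 {
    v[v.len() - 1] // bounds-checked; panics if v is empty
}

fn last_unchecked(v: &[u32]) -> u32 {
    // SAFETY: caller guarantees v is non-empty; out of range would be UB.
    unsafe { *v.get_unchecked(v.len() - 1) }
}

fn main() {
    let v = [1, 2, 3];
    println!("{} {}", last_checked(&v), last_unchecked(&v)); // 3 3
}
```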


"Extremely high" means "noticeably many". They are identically the same cases where doing something unsafe in C++ is (likewise) impossible. As such, they drop out of the comparison.


Do you have a link to your code that you've benchmarked in C++ and Rust? I'm also doing a lot of vector/matrix/array access in my code, so I'm curious as to what you'd be doing that would cause slow downs to the point that C++ can beat Rust by hours. That would be very enlightening to see.


I would love to, but the framework I'm developing is closed source (it's my Ph.D. and postdoc work at the end of the day), but we have a paper for some part of it. Let me see if I can share it.


Moves don't require any runtime checks in Rust to my knowledge. Moves do involve some compile time analysis to make sure you aren't accessing a value that has been moved, maybe that is what you are thinking of?

In Rust, moves are always a simple memcpy from one place to another, that's it! With non `Copy` types this destroys the original value which is why the compiler makes sure you don't access it after it's moved.

There is also `Clone` which is generally a deep copy of everything, but this also doesn't involve any runtime checks as long as you aren't using reference counted pointers or something.

There are bounds checks when indexing into vectors and whatnot though, but non checked methods are available.
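To make that concrete, a small sketch (illustrative values) of move vs. Clone vs. checked indexing:

```rust
fn main() {
    // A move of a non-Copy type is just a bitwise copy of its header
    // (pointer/length/capacity for Vec) -- no deep copy, no runtime check.
    let v = vec![1, 2, 3];
    let w = v; // `v` is now statically dead
    // println!("{:?}", v); // error[E0382]: borrow of moved value: `v`

    // Clone is the explicit deep copy.
    let x = w.clone();
    assert_eq!(w, x);

    // Indexing is bounds-checked by default; `get` returns an Option instead.
    assert_eq!(w.get(10), None);
    println!("{}", w[0]); // 1
}
```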


In other words, moves in Rust are very limited and limiting.


They are more limited than C++ moves for sure, whether that is a good or a bad thing I am not so sure of!

I am curious if there are any examples of things you wanted to do but couldn't with Rusts move semantics.

After using both C++ and Rust I personally feel that the way Rust handles moves makes a lot more sense and ends up being much more ergonomic. In C++ moves feel messy and complex due to them being added on rather than having the language built around them from the start. It's also not ideal that C++ moves must leave values hanging around in an unspecified state rather than destroying them immediately, even if there are a few cases where that is useful, most of the time it's not.

Rust also doesn't have any of the crazy `&&T` rvalue reference stuff or require explicitly defining move constructors since everything is movable and all moves are bitwise copies.


The classic thing that's tough to do is self-referential structures; since you have no hook into moves, you can't fix-up the references.

It also has quite a few pros, and many people think it's better the way it is. Since moves are memcpys, they can never fail, meaning there are no opportunities for panics, which makes them much easier to optimize. They aren't arbitrarily expensive; you always know what the cost is. And as you mentioned, there's no need for a "moved-from" state, which can save space and/or program logic.
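A tiny sketch of the self-reference problem (`SelfRef` is a made-up type):

```rust
// A move in Rust is a plain memcpy with no move constructor, so nothing
// can fix up a pointer that points back into the value itself.
struct SelfRef {
    buf: [u8; 4],
    first: *const u8, // intended to point at buf[0]
}

fn make() -> SelfRef {
    let mut s = SelfRef { buf: *b"abcd", first: std::ptr::null() };
    s.first = s.buf.as_ptr();
    s // returning moves (memcpys) s into the caller's frame...
}

fn main() {
    let s = make();
    // ...so s.first may still hold the address of the *old* stack slot.
    // Dereferencing it would be unsound; Pin exists to rule this out.
    println!("fixed up? {}", s.first == s.buf.as_ptr());
}
```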


It is correct to say that allowing move constructors to fail was a grave design mistake.


Rust's Vec indexing is bound-checked by default. Just something like this is abysmal for performance (but pretty good for consultants who get called to fix stuff afterwards, so, like, keep them coming!):

- Rust: https://rust.godbolt.org/z/GK8WY599o

- CPP: https://gcc.godbolt.org/z/qvxzKfv8q


> Rust's Vec indexing is bound-checked by default. Just something like this is abysmal for performance

Unless you can provide numbers to back this claim up, I'll continue to rely on my measurements that bounds checking (in Virgil) costs anywhere from 0.5% to 3% of performance. It's sometimes more or less in other languages, but it is by far not the reason any programming language has bad performance. I have no reason to suspect that it costs more in Rust, other than misleading microbenchmarks.


Here's a quick and dirty example: https://gcc.godbolt.org/z/3WK7eYM4z

I observe (-Ofast -march=native -std=c++20; CPU is an Intel 6900K with the performance governor on Arch... blasting at 4 GHz):

- clang 13: 8.93 ms per iteration

- gcc 11: 9.46 ms per iteration

so roughly around 9ms.

Replacing

    return mat[r * matrix_size + c]; 
by

    return mat.at(r * matrix_size + c);
I observe

- clang 13: 20.95 ms per iteration

- gcc 11: 18.11 ms per iteration

so literally more than twice as slow. I also tried with libc++ instead of libstdc++ for clang and the results did not meaningfully change (around 9ms without bound-checking and 21 ms with).


From the godbolt link, it looked like most of the vector operations were not getting inlined. You'll need -O2 or higher to have representative results.

I could counter by just writing a Virgil program and turning off bounds check with a compiler flag. We could stare at the machine code together; it's literally an additional load, compare, and branch. The Virgil compiler loves eliminating bounds checks when it can. I know your original comment was in the context of the STL, but it really muddies the waters to see huge piles of code get inlined or not depending on compiler flags. Machine code is what matters.

Regardless, this is still microbenchmarking. Maybe matrix multiply is an important kernel, but we need to benchmark whole programs. If I turn off bounds checks in a big program like the Virgil compiler, I cannot measure more than about 1.5% performance difference.


> From the godbolt link,

I just used godbolt to quickly share the code. On my computer I tried with -Ofast -march=native (broadwell)

> I could counter by just writing a Virgil program and turning off bounds check with a compiler flag. We could stare at the machine code together; it's literally an additional load, compare, and branch.

This sounds like Virgil is not vectorizing, which makes the comparison much less useful.

I see much more than a couple instructions changed there: https://gcc.godbolt.org/z/8jPMb734x


> Virgil is not vectorizing

That's just even more confounding variables. We are, after all, not even doing experiments with Rust vectors, which is what your original comment was about. You wrote examples in C++ and we went down a wormhole already, but I let it slip since at least it was empirical. But I think we shouldn't get distracted with even more side alleys now. Bounds checks are literally just a couple machine instructions, and often eliminated by the compiler anyway, which enables vectorization.


> We are, after all, not even doing experiments with Rust vectors,

They are very close to C++ ones, down to the stack unwinding to report the panic in case of a bounds error, if I'm not mistaken.

> That's just even more confounding variables.

No, they are not. We are trying to see how much bounds checks cost. If you compare suboptimal programs, the comparison is meaningless (or rather, not interesting to anyone) - the starting point has to be the absolute best performance it is possible to get, plus worst-case bounds checking on top (I'm happy for you if you never have to worry about the worst case, though!)

> Bounds checks are literally just a couple machine instructions, and often eliminated by the compiler anyway

Please provide a source for this? Sure, if you use spans as other commenters mentioned, that moves the checking to span-creation time, but that only works for the simplest cases where you are going to access linearly - and I would say that it's a library feature rather than a compiler one.

   for(double v : vec) { } // or any sub-span you can take from it
in C++ also does not need bounds checks by design, but this kind of construct utterly does not matter for HPC workloads.

I can look into my PDFs folder and bring out dozens of papers where the core algorithms all use funky non-linear indexing schemes where you cannot just iterate over a range (algorithms based on accessing i, i-1, i+1, or accessing i and N-i, or accessing even/odd values, etc.) - how would you implement an FFT, for instance? This is the code that matters!


> accessing i, i-1, i+1, or accessing i and N-i,

A compiler can do loop versioning for that. And they do. Hotspot C2 (and Graal, too) does a ton of loop optimizations, partitioning the index space into in-bounds and potentially out-of-bounds ranges, unrolling loops, peeling the first iteration off a loop, generating code before a loop to dispatch to a customized version if everything will be in bounds.

When a compiler is doing that sophisticated of loop transforms, you are not measuring the cost of bounds checks anymore, you are measuring a whole host of other things. And sometimes if a loop is just a little different, the results can be disastrously bad or miraculously good. Which is why microbenchmarking is so fraught with peril. A microbenchmark might be written assuming a simple model of the computation and then a sophisticated compiler with analyses never dreamed of comes along and completely reorganizes the code. And it cuts both ways; a C++ compiler might do some unrolling, fusion, tiling, vectorization, software pipelining or interchange on your loops and suddenly you aren't measuring bounds check cost anymore, but loop optimization heuristics that have been perturbed by their presence. You end up studying second-order effects and not realizing it. And it keeps running away from you the more you try to study it.

>> That's just even more confounding variables.

> No they are not. We are trying to see how much bound checks costs. If you compare between suboptimal programs the comparison is meaningless (or rather not interesting to anyone) - the starting point has to be the absolute best performance that it is possible to get, and add the worst-case bound-checking (I'm happy for you if you never have to worry about the worst

We are always comparing suboptimal programs. No compiler is producing optimal code, otherwise they would dead-code eliminate everything not explicitly intertwined into program output, statically evaluate half of our microbenchmarks, and replace them with table lookups.

You're going down a linear algebra rabbit hole trying to come up with a result that paints bounds checks in the worst possible light. If this is the real problem you have, maybe your linear algebra kernels would be better off just using pointers, or you could even try Fortran or handwritten assembly instead, if it is so important. Unsafe by default is bad IMHO. For real programs bounds checks really don't matter that much. Where this thread is going is all about loop optimizers, and they don't really get a chance to go nuts on most code, so I think we're way off in the weeds.


Note that Rust has unchecked indexing; you just have to explicitly ask for it if you want it. You can even - if you insist on writing cursed code, which it seems jcelerier is - write your own type in which indexing is always unchecked and it is Undefined Behaviour when you inevitably make a mistake. Just implement std::ops::Index (and, if you also want to mutate these probably-invalid indices, std::ops::IndexMut), say you're unsafe in your implementation, and just don't bother doing bounds checks.

You can shoot yourself in the foot with Rust, it's just that you need to explicitly point a loaded gun at your foot and pull the trigger, whereas C++ feels like any excuse to shoot you in the foot is just too tempting to ignore even if you were trying pretty hard not to have that happen.
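The "cursed" wrapper described above would look something like this (a sketch; `Unchecked` is a made-up name, and this is explicitly not a recommendation):

```rust
use std::ops::Index;

// A wrapper whose indexing never bounds-checks. Out-of-range access is
// Undefined Behaviour -- the loaded gun pointed at your foot.
struct Unchecked<T>(Vec<T>);

impl<T> Index<usize> for Unchecked<T> {
    type Output = T;
    fn index(&self, i: usize) -> &T {
        // SAFETY: none whatsoever -- the caller must ensure i < len.
        unsafe { self.0.get_unchecked(i) }
    }
}

fn main() {
    let u = Unchecked(vec![10, 20, 30]);
    println!("{}", u[1]); // 20 -- and u[99] would be UB, not a panic
}
```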


C++ devs that actually care about security the same way titzer does can turn on the security checks that are disabled by default, and thus have a similar experience.

For example, on Visual Studio, define _ITERATOR_DEBUG_LEVEL to 1 and enable /analyze as part of the build.

While not Rust-like, it is already much better than not caring.


I think you’re confusing two different senses of “vector”, there is the contiguous series of items “vector” data structure that both C++ and Rust have in their standard libraries that’s used in the code example, and there’s “vectorizing” which is an optimization to use things like SIMD to operate on multiple things at the same time.


I understood what was meant by these terms.


Okay, then I got lost reading your comment, my apologies.


No worries. It's too late for me to edit it now, though, but I meant vectorizing="compiler introduces SIMD".


Part of the problem is that the multiply() function takes variable-size matrices as inputs and isn't inlined because it's declared as an external function and so could, in theory, be called from another compilation unit. If it's declared as "static" instead then the compiler generates identical code (modulo register selection) at -Ofast for both versions—eliminating the bounds checking at compile-time:

https://gcc.godbolt.org/z/qqWr1xjT8


> Part of the problem is that the multiply() function takes variable-size matrices as inputs

So does real-life code? I don't know about you, but I pretty much never know the size of my data at compile time.

> If it's declared as "static"

Again, real-life code calls math operations defined in other libraries, sometimes even proprietary ones.


In Java or other languages with bounds-checked arrays, you would typically just loop from 0 to the end of the array. In Java the JIT will analyze the loop, see that it is bounded by the length of the array, conclude all accesses are in-bounds and eliminate bounds checks. Virgil does this simple type of bounds check elimination as well in its static compiler, but its analysis is not quite as sophisticated as Java JITs.


> I don't know about you but I pretty much never know the size of my data at compile time.

Whole-program optimization (LTO) can deal with that. Also, Rust inlines across modules much more aggressively than C++ does, even without LTO, so optimization will be more effective and its bounds-checking won't have as much (if any) runtime overhead. Especially if you write your Rust code idiomatically, as others have already suggested. Your earlier Rust example was practically designed to inhibit any attempt at optimization. Simply iterating over a slice of the vector, rather than the indices, results in a much tighter inner loop as it only needs to check the bounds once.

That being said, in this case I think it would be better to have fixed-size matrices (using const generic/template arguments) so that the bounds are encoded in the types and known to the compiler locally without relying on whole-program optimization.
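The difference between indexing and iterating can be sketched like this (hypothetical helper names):

```rust
// Indexing in a loop vs. iterating the slice: the iterator version checks
// bounds once (when the slice is formed), not per element, and vectorizes
// readily.
fn sum_by_index(v: &[f64]) -> f64 {
    let mut s = 0.0;
    for i in 0..v.len() {
        s += v[i]; // per-element check, usually (not always) optimized out
    }
    s
}

fn sum_by_iter(v: &[f64]) -> f64 {
    v.iter().sum() // no per-element bounds checks by construction
}

fn main() {
    let v = vec![1.0, 2.0, 3.0, 4.0];
    println!("{} {}", sum_by_index(&v), sum_by_iter(&v)); // 10 10
}
```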


> Especially if you write your Rust code idiomatically, as others have already suggested. Your earlier Rust example was practically designed to inhibit any attempt at optimization.

This is pretty much word-for-word what I've seen in other similar disagreements. Rust Skeptic transcribes their solution from its original programming language into Rust line by line, producing something nobody using Rust would actually write. They find that Rust performs poorly. Rust Proponent writes it the way you'd expect someone who actually knows Rust to write it, and performance meets or exceeds the original. Sometimes they even catch a few bugs.

Yes, if you don't know Rust, and you think it's just a weird re-skin of C++ that can be gsub'd from one to another, you're going to have a bad time. If you're an expert in C++ and have never written Rust, you probably aren't going to beat your finely-tuned C++ program with a ten minute Rust translation. But someone with a year of experience in Rust is probably going to be able to rewrite it in half an hour and come within spitting distance of the thing you've optimized by hand over the course of a few years.

I've written Rust for half a decade and I'm not sure I've ever actually explicitly indexed into a Vec or slice in production code. If I needed to, and it was in a hot loop, and it wasn't a fixed-size array whose size is known at compile time... there's always `get_unchecked()` which is functionally identical to indexing into a C++ vec without `at`.


> Yes, if you don't know Rust, and you think it's just a weird re-skin of C++ that can be gsub'd from one to another, you're going to have a bad time.

Most C++ code is not "idiomatic C++" either; it's code which looks like this:

https://github.com/cdemel/OpenCV/blob/master/modules/calib3d...

or this

https://github.com/madronalabs/madronalib/blob/master/source...

or this

https://github.com/MTG/essentia/blob/master/src/3rdparty/spl...

etc. ... which is generally code from science papers, written in pseudocode and ported at 2 AM by tired astrophysics or DSP master's students who need something to show their advisor. You can have whatever abstractions you want, but they won't be used, because the point is not to write Rust or C++ or anything in particular but to get some pseudo-MATLABish thing running ASAP - and that won't have anything that looks like a range/span, only arrays being indexed raw.


If you were using std::vector::at (the proper way), you'd have bounds checking as well. One should only use direct indexing when they really know they are inside the bounds. And in Rust, there's std::vec::Vec::get_unchecked: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.get...


> If you were using std::vector::at (the proper way),

it is absolutely not the proper way and a complete anti-pattern


Yeah I agree using vector::at is a common and fairly bad anti-pattern.

So long as you make sure your program is correct you never need to worry about indices being out of bounds. Requiring bounds checking is a sign you need to eliminate the errors in your software.

This makes me wonder, why are people writing software with errors in it in the first place? Even master programmers seem to be too lazy and careless to remove the errors from their software. What gives?


Using C++ is an anti-pattern

That is why I write all my code in Pascal, where you can enable bound checking for []

Enable it for a debug build, and you can be sure there are no overflows of any kind; when it runs, the program has pretty much no errors at all. Disable it for the release build, and it runs as fast as if it were written in C.


I think I'm falling for Poe's law.

Here is an honest answer. The problem is frankly unsolvable, thanks to the halting problem. It's impossible to determine in general whether a program will reach a certain state except by testing all potential inputs, and even that only gives you an approximate answer. If it were possible, we would have written a program that solves the problem through static analysis.


That depends on the project. Anyway, it’s irrelevant, because you can do it either way in either language.


This is the Ossia Score lead dev, FYI - he's no stranger to either language.


I don't know what Ossia Score is. Regardless, I didn't mean to imply anything negative about the competence of the person I was replying to.


Definitely not a complete anti-pattern in distributed computing software that cares about security and for whatever reason needs to be written in C++.


Both languages let you access vector elements with or without bounds checking. Stroustrup even recommends using .at() by default in C++.


Idiomatic Rust here would use something like this - not that it makes the performance identical in this case, but it is more representative of real Rust code, and is slightly smaller: https://rust.godbolt.org/z/brh6heEKE

EDIT: lol, don't write code five minutes after waking; the reply to this has far better code

(EDIT: incidentally, I was also curious about a non-extern function, but the compiler is seeing through all of my attempts to call a native Rust function with an opaque body here...)

That being said this code will also be a no-op instead of... something worse, if the vector is empty, so, they're not directly comparable, but they weren't really in the first place, as you mention, so...


Why not

  for i in &v[1..num]
instead of all the iter take skip stuff?


Because I wrote this comment shortly after waking up and made a mistake. That code is far far better. Thank you.


>but pretty good for consultants who get called to fix stuff afterwards, so, like, keep them coming!):

I'm confused. Doesn't that apply to the C cowboy way of doing things? You introduce security vulnerability after vulnerability, and then lots of people have to hire expert security consultants all the time. Your snark just makes no sense to me. Fixing an ArrayIndexOutOfBoundsException in Java is something even a novice programmer with less than a year of experience can do. No expensive consultant needed.

The Java vulnerabilities aren't even in the same class as C's. They are usually quite dumb stuff like loading remote class files via obsolete features that nobody even remembers (Log4Shell). It's like setting up SSH with a default password (Raspberry Pi). It happens, but it's rare because of the sheer amount of "incompetence" required - it takes lightning striking twice.


I have never come across security consultants. Performance OTOH ...


https://www.mandiant.com

Sometimes they're called auditors. Or "formal audits" because people think if they call something "formal" it's somehow different from not being "formal".


I mean, I know that this business exists; what I mean is that I get called regularly to fix performance issues, but I have never heard of any request for security work - or met security consultants during my jobs. However, I know of a few cases where people would sell members of their extended family in a heartbeat for a few % more oomph.


Just because indexing is bounds-checked by default doesn't mean that the bounds-checking is necessarily in your final code:

https://rust.godbolt.org/z/er9Phcr3c

Yes, you still have the bounds check when constructing the slice, but that's fine.

And since it's probably slightly more common to iterate over the length of the vector anyway:

https://rust.godbolt.org/z/e5sdTb7vK


https://rust.godbolt.org/z/zM58dcPse

And look, putting a limit that the compiler can guarantee once, outside the critical loop, eliminates it. You're already using unsafe. If you really want to shoot yourself in the foot with this nonsense, you can use get_unchecked or an unchecked assumption:

https://rust.godbolt.org/z/naeaYad5T

On top of that, no one would write Rust code like that. This version has one bounds check at the beginning and is a lot more idiomatic:

https://rust.godbolt.org/z/1s9zsMx36


The point is to have some code that exhibits what happens when a bound check is needed. You can also do

    for(double v : std::span(vec).subspan(1)) { } 
in C++, and that will be safe and not need a bounds check except a user-provided one at the creation of the span (-1 for C++ here :-)) - and it also doesn't matter at all for the bounds-check problem, which occurs as soon as you don't just iterate linearly, which is extremely common in the kinds of problems that end up at the top of the profiler, and is the whole point of this conversation.


You can just use unsafe if you care in this case though.

It's a saner default option tbh.


C is even better. One instruction less than C++ ;)

https://gcc.godbolt.org/z/oc19Gc4eY



