Hacker News
Rust for C++ programmers – part 4: unique pointers (featherweightmusings.blogspot.com)
119 points by pohl on April 29, 2014 | 96 comments



Rust looks very exciting and promising. I see the hardest things for it being not necessarily syntax and concurrency (which are very well done), but performance and competing with C++11 (C++14), which actually seems to be becoming fresh and interesting again.

Performance is tough. I feel most often C++ is not chosen for its inherent cleanliness, elegance and beauty, but because there are no viable competitors at the given performance point. C++ compilers have been honed and tweaked for more than a decade now, so it will be a hard battle ahead.


I feel pretty good about it. For one, we have significantly better aliasing information than C++, and we use most of the guts of a C++ compiler (clang/LLVM), including all the optimizations and code generation. There are also some optimizations around move semantics that are open to us but not to C++ as standardized.

That said, there are definitely performance bugs that have yet to be fixed. But I would say that most Rust code, if well-written, can be competitive with C++ today.


How does one do intrinsics (SSE / AVX) in Rust - I guess they're just the exposed functions from emmintrin.h and the compiler calls them directly?

How do you do ASM in Rust?

Can you align memory in Rust?


> How do you do ASM in Rust?

There's an asm! macro: http://static.rust-lang.org/doc/0.10/guide-unsafe.html#inlin...
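For a concrete flavour (note: the 0.10-era `asm!` syntax linked above has since been replaced; this sketch uses the much later `std::arch::asm` form, with a plain-Rust fallback on non-x86_64 targets so it builds anywhere):

```rust
use std::arch::asm;

// Increment a value via inline assembly on x86_64.
#[cfg(target_arch = "x86_64")]
fn add_one(mut x: u64) -> u64 {
    // Intel syntax: add immediate 1 to whatever register `x` lands in.
    unsafe { asm!("add {0}, 1", inout(reg) x) };
    x
}

// Fallback so the example stays portable on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn add_one(x: u64) -> u64 {
    x + 1
}

fn main() {
    assert_eq!(add_one(41), 42);
}
```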

> Can you align memory in rust?

I think that's an open problem for now, e.g. https://github.com/mozilla/rust/issues/4578


Aligning memory is pretty important for high-performance apps using SSE/AVX (archs after Sandy Bridge don't require alignment any more, but still benefit from it), and more generally anyway - cacheline alignment, etc. While it's often possible to force structs/classes to be aligned by padding them, this doesn't always work.

Does Rust allow using memory allocated by specific external allocators? i.e. libnuma or something?


> Aligning memory is pretty important for high-performance apps using SSE/AVX

Er… yes? Did I say it was unimportant anywhere? (also please note that I'm not a Rust developer)

> Does Rust allow using memory allocated by specific external allocators? i.e. libnuma or something?

I'm not quite sure what you're asking:

* if you're asking whether it's possible to use arbitrary memory returned by a third party, then yes.

* if you're asking whether user-defined allocators for boxes are supported, so you can ask the runtime to put its stuff in memory you get from e.g. a bundled jemalloc, then that's planned but not in yet, I believe: https://github.com/mozilla/rust/issues/12038


> Er… yes? Did I say it was unimportant anywhere?

No, but that link seemed to indicate it wasn't being planned for the first release and pcwalton seems to think it's possible to get Rust programs to run as fast as C++ programs in most cases.


It can easily be added backward-compatibly, so could appear in, e.g., a 1.1 release. Or... someone who really wants it scratches their itch and gets it implemented. :)

(The speed of a language isn't determined by the speed of its "first release".)


I would imagine that "most cases" don't require this kind of alignment, because most cases don't involve vector instructions.

There is certainly a set of performance-critical problems where making optimal use of vector instructions is essential. It does seem that Rust is unlikely to be competitive for those today. But I think it's a minority of all problems.


> How does one do intrinsics (SSE / AVX) in Rust - I guess they're just the exposed functions from emmintrin.h and the compiler calls them directly?

It uses the LLVM support.

> How do you do ASM in Rust?

With the asm! macro.

> Can you align memory in rust?

Yes. Write an allocator that does this and use it. The language doesn't have an allocator "built-in" and has full support for custom allocators.
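As a sketch of what an aligned allocation looks like in today's Rust (the `std::alloc` API shown here postdates this thread; `Layout::from_size_align` carries the alignment request):

```rust
use std::alloc::{alloc, dealloc, Layout};

fn main() {
    // Request 1024 bytes aligned to a 64-byte cache line.
    let layout = Layout::from_size_align(1024, 64).expect("bad layout");
    unsafe {
        let ptr = alloc(layout);
        assert!(!ptr.is_null());
        // The returned pointer honours the requested alignment.
        assert_eq!(ptr as usize % 64, 0);
        dealloc(ptr, layout);
    }
}
```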


> It uses LLVM support.

What do you mean by this? 1. Any scalar code might get vectorised by LLVM if you're lucky,

or 2. You can write the intrinsics yourself in inline code, and the compiler will (almost) obey them verbatim?

In my experience, LLVM's vectorisation is behind GCC and quite a way behind ICC...


> There are also some optimizations around move semantics

That piqued my curiosity -- would you mind going into a little more detail, or pointing me to somewhere where these are discussed?


https://github.com/mozilla/rust/issues/5016 is the main optimization (and we have more discussed but not written up). Since we have a very strict notion of liveness, we don't have to zero out moved data: the compiler can simply not run the destructor.
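A small illustration of the point: once a value is moved, the compiler statically knows the old binding is dead, so no destructor runs for it. A sketch in current Rust, with a hypothetical `Tracked` type counting its drops:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

static DROPS: AtomicU32 = AtomicU32::new(0);

struct Tracked;

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Takes ownership; `Tracked` is dropped when this function returns.
fn consume(_t: Tracked) {}

fn main() {
    {
        let t = Tracked;
        consume(t); // `t` is moved out; this scope runs no destructor for it
    }
    // Dropped exactly once, inside `consume` -- no drop at the call site.
    assert_eq!(DROPS.load(Ordering::SeqCst), 1);
}
```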


> Performance is tough. I feel most often C++ is not chosen for its inherent cleanliness, elegance and beauty, but because there are no viable competitors at the given performance point.

I don't think there are many popular high level languages that target systems programming in the first place.


The problem was the trend in the mid-90's to move away from native AOT compilers to VM JITs, thus leaving C and C++ as the main go-to languages most mainstream developers know, as other options faded out of sight.

Hopefully the "going back to native" trend will move us back on track, similar to what happened with the VM attempts of the early 80's.


I think the real problem is the move from manual memory management to garbage collection. Garbage collection has immense advantages, and is indisputably the right choice for most application programming, but it only goes fast if you feed it lots of memory.

See figure 3 of this wonderful but terrifying paper:

http://people.cs.umass.edu/~emery/pubs/gcvsmalloc.pdf

(Caveat: that was published in 2005, and garbage collection has improved since then. However, I doubt it has improved enough to invalidate its conclusions.)

The best collector there needed ~3x as much memory as (admittedly superhumanly perfect) manual memory management to get to a competitive speed.

There are plenty of problems where that memory is available. But there are plenty where it is not. I can justify giving loads of memory to the application that makes my company money. I can't justify giving it to some tiny little log shipper or message forwarder or other random system utility that I want to run on every machine in my infrastructure.


What does it have to do with VMs, though?

There are lots of languages with native code generation compilers that have GC[1] support.

Since the Xerox PARC days there have been system programming languages with automatic memory management (Cedar, Interlisp, Modula-3, Oberon, ....).

They were just ignored by the mainstream OS vendors, who were busy creating UNIX System V and VMS clones.

[1] RC (reference counting) is usually chapter 1 of GC in CS books.


> What does it have to do with VMs, though?

It doesn't. I think the point about VMs and JITs is a red herring.


There are few viable competitors that:

- give access to as low a level as you need, in a way that you can tell what the costs of various actions are going to be,

- let you manage different pools of memory differently,

- let you use higher level abstractions (clear up to lambdas),

- give you confidence that the language will still be around in a decade or two (I have worked on code bases that lasted two decades), and that coders will still be available in it.

It's not just the performance point. It's how well it's known. (Including how well the flaws are known. What's wrong with C++? Plenty, but there are people who know how to work around it. What's wrong with Rust? Presumably a fair amount as well, but we don't know what yet.) It's how mature our understanding is of how to build million line code bases that can be maintained for decades.

And, sure, people are going to scream about the memory bugs that are going to make that million-line code base a maintenance nightmare for decades. But what's going to make a Rust program a maintenance nightmare for decades? (Don't bother trying to tell me it won't be one.) You don't know. I suspect, however, that there is at least some danger that you won't be able to find many Rust programmers two decades from now.

tl;dr: C++, for all its flaws, is pretty well understood. We know that we can build at-least-somewhat-working large programs and maintain them for decades using it. We know of few other languages where that is true.


> I feel most often C++ is not chosen for its inherent cleanliness, elegance and beauty, but because there are no viable competitors at the given performance point.

My feeling is that most often, (i) people don't need nearly as much performance as they think, and (ii) they greatly overestimate the performance gap between C++ and garbage collected languages (most notably those that are compiled to native code, such as Lisp, ML, and Haskell).


A game at 60fps is hard to achieve if something takes more than a few milliseconds to finish (since there are other things to finish too); even if it happens only occasionally, it'll produce a noticeable frame drop.

An audio mixer is the same: it's given a short, fixed slice of time (say 5ms) to mix (resample, apply effects, etc.), and it's called at regular periods.

I don't know much about web servers, but wouldn't that be the case there too? You want a quick response (HTML).

I'm not arguing against scripting/dynamic/garbage-collected languages - they are very useful - but often there is a need for the garbage collection mechanism to be exposed, so that it can be run piecewise, or run fully at times when it does not matter (say in a game, when transitioning from one level to another, or when some UI is popped up on the screen (pause), etc.).

It's important for such things to be exposed, rather than hidden (and most of the time they are hidden). Lua and LuaJIT are good examples there, and other runtime implementations allow it too (in .NET, I think there is a way to pause the garbage collector).

But performance is needed in all the applications mentioned above - physics, collision, occlusion, rendering, etc.


> Game at 60fps

…is one of the most demanding applications ever. And a tiny niche to boot: while prominent, games represent a tiny fraction of all programming effort. Web browsers and operating systems are an even more extreme example of this availability bias.

For interactive stuff that otherwise doesn't move (regular web browsing, GUI stuff…), 100ms response time is perfect, except maybe for text entry. GC pauses can be made negligible in that context. (Though as you said, we need access to that stuff.)


As a developer who's worked in games, embedded, and real-time situational awareness software (ex. air traffic control), my feeling is exactly the opposite.

People often greatly underestimate the amount of performance sensitive software out there, often because they're used to working on software where the hardware isn't a fixed constraint, or are indirectly relying on optimized C/C++ software provided by the system itself.

Moreover, simple benchmarks which are often used for comparing languages are not a valid indicator of the performance available in a lower-level language.

For example, switching loop-processed data from a standard AOS (array of structures) to a SOA (structure of arrays) format can improve performance by orders of magnitude by reducing cache misses (on an i7, main memory latency is ~25x that of L1 cache).

With that latency difference in mind, imagine for a moment the impact of iterating an array of objects stored disparately in memory versus a tightly packed array.
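The AOS vs SOA point can be sketched in Rust (the field names are made up for illustration); a loop that only touches positions streams through contiguous memory in the SOA form:

```rust
// AoS: each particle's fields are interleaved in memory, so a loop
// over positions drags velocities through the cache as well.
#[allow(dead_code)]
struct ParticleAos {
    x: f32,
    y: f32,
    vx: f32,
    vy: f32,
}

// SoA: each field is its own tightly packed array.
struct ParticlesSoa {
    x: Vec<f32>,
    y: Vec<f32>,
    vx: Vec<f32>,
    vy: Vec<f32>,
}

impl ParticlesSoa {
    // Integrate positions; reads run sequentially through each array.
    fn advance(&mut self, dt: f32) {
        for i in 0..self.x.len() {
            self.x[i] += self.vx[i] * dt;
            self.y[i] += self.vy[i] * dt;
        }
    }
}

fn main() {
    let mut p = ParticlesSoa {
        x: vec![0.0],
        y: vec![0.0],
        vx: vec![1.0],
        vy: vec![2.0],
    };
    p.advance(0.5);
    assert_eq!((p.x[0], p.y[0]), (0.5, 1.0));
}
```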

Or more related to memory management, a common allocation optimization used in games is to provide a block of per-frame memory; allocations are a simple addition, and deallocation is an assignment. All the easy cleanup of GC, but with a constant (and insignificant) cost.
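The per-frame scheme can be sketched as a toy bump arena (a hypothetical `FrameArena`, not any particular engine's API): allocation is an addition, and end-of-frame cleanup is a single assignment:

```rust
// A toy per-frame bump allocator over a fixed byte buffer.
struct FrameArena {
    buf: Vec<u8>,
    offset: usize,
}

impl FrameArena {
    fn new(capacity: usize) -> Self {
        FrameArena { buf: vec![0; capacity], offset: 0 }
    }

    // Hand out `size` bytes by bumping an offset, or None if this
    // frame's budget is exhausted.
    fn alloc(&mut self, size: usize) -> Option<&mut [u8]> {
        if self.offset + size > self.buf.len() {
            return None;
        }
        let start = self.offset;
        self.offset += size;
        Some(&mut self.buf[start..start + size])
    }

    // End of frame: everything is "freed" with one assignment.
    fn reset(&mut self) {
        self.offset = 0;
    }
}

fn main() {
    let mut frame = FrameArena::new(16);
    assert!(frame.alloc(8).is_some());
    assert!(frame.alloc(16).is_none()); // over this frame's budget
    frame.reset();
    assert!(frame.alloc(16).is_some()); // budget restored for the next frame
}
```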

Certainly most of the code does not need to be C++, nor heavily optimized; even the 16/32ms per frame game industry tends towards garbage collected scripting languages for a significant portion of the code base. That does not however obviate the necessity for a systems level language to provide the capacity for such optimization.


> As a developer who's worked in games, embedded, and real-time situational awareness software (ex. air traffic control), my feeling is exactly the opposite.

I call availability bias: people who work on high-performance niches feel everyone is underestimating the problem. People who work on low-performance niches wonder what the big deal is.

What I have personally observed is more like a fear of poor performance, coming from people who don't understand the problem like you do. So the team chooses C++, which eventually leads to an unmaintainable, slow Big Ball of Mud. 'Cause as your AOS vs SOA example demonstrates, C++ doesn't magically make your code fast. Oops.

> That does not however obviate the necessity for a systems level language to provide the capacity for such optimization.

Agreed. Just two caveats: first, C++ is really a last resort, to be used when nothing else will do, not even C + Lua or a similar combination. Even for high performance code, this language is way overused. Second, while we all use the high performance infrastructure you speak of, few of us get to write it.


> they greatly overestimate the performance gap between C++ and garbage collected languages

I think you're right about that, but I also think that people greatly underestimate the determinism gap between them, too. (Not necessarily the same set of people, of course.)


I'm not sure. If you start using smart pointers in C++, you can get the same problem: instead of a GC pause, you get a cascading delete pause. To eliminate it, you have to devise a smarter memory scheme, at which point you probably don't underestimate the determinism gap.

Also don't forget that garbage collectors can often be tuned. Granted, many of them suck, but a generational, incremental collector whose parameters can be tweaked? Much less room for pauses.


A cascading delete pause may be undesirable, but is it nondeterministic?


In the same way that GC is, yes. Technically both are probably fully deterministic, just not readily determinable by casual examination of the code at compile time.


I guess I didn't mean nondeterministic in the philosophical indeterminism sense, but in the sense that different runs might produce different behavior.

http://en.wikipedia.org/wiki/Nondeterministic_algorithm


Well, you can get there if you lag enough to drop a few frames in a competitive FPS game: while the game is still deterministic, the game/players system will diverge: a few dropped frames can get you fragged.

Also, some naive simulations will adapt the length of their steps with the time it takes to compute them. Any performance variation gives you full blown unpredictability.


Even a single delete/free is nondeterministic. How long it takes depends on the state of the heap (true of new/malloc as well).


Very good point. Performance folds in latency (determinism) and throughput. There is usually a trade-off between the two. GC might handle throughput reasonably, but latency is a bit tougher. You are sort of left tweaking knobs on a black box, hoping to get good results in the end.


When you're IO- or network/database-bound, you won't notice.

When you're just CPU-bound without having to cope with lots of memory transfer/throughput, you won't notice either.

Without a doubt, there are many apps that don't need this pure speed, but for specialist apps and even general apps on the desktop (compare Firefox to Chrome and Safari, for example), you can notice the difference quite distinctly.

In my experience writing high performance applications for the VFX industry, where both memory performance and memory compactness (so you can fit as much in memory as possible) are important, when profiling applications I've found that maybe 70% of the time it's the memory allocation/deallocation that's the actual bottleneck, not the calculations themselves. Being able to separate stuff into separate allocators based on different use cases adds complexity, but gives huge benefits when you're memory throughput constrained.


Nobody will ever beat C/C++ for raw soft-real-time speed. Even a language like Rust that may have the linguistic features to be blazingly fast will never close that massive gap in historical optimization. Rust will never have an Intel-compiler.

That said, Rust will get close to C/C++ in ways that GC-based languages never will. GC languages mean memory bloat and soft-realtime problems related to GC cleaning. It will likely hit the sweet spot for a lot of embedded or gaming uses. Obviously you can do those tasks in other languages (great stuff is done on phones with GC-based languages, obviously), but you have to work around some challenges that would not exist in Rust.


> Rust will never have an Intel-compiler.

Yesterday, Intel announced a project combining a clang frontend with icc's backend. [1] The backend takes LLVM IR, so hopefully Rust's LLVM IR output could be run through it.

[1] https://news.ycombinator.com/item?id=7663462


> Even a language like Rust that may have the linguistic features to be blazingly fast will never close that massive gap in historical optimization. Rust will never have an Intel-compiler.

That's most probably false: the current self-hosting implementation has an LLVM backend, so it has back-end optimisations covered. As for the front end, Rust's cleaner semantics will make many optimizations easier.

Plus, it's not like the historical gap couldn't be closed by reading the gazillion papers on C++ optimization out there. Copying a known technique is much faster than developing it in the first place.

Maybe Rust will end up slower than C++ anyway, but it won't be the compiler's fault.


Well, it's already been beaten on most common hardware, though not fairly and not on the same "CPU" chip, but on the GPU: OpenCL, CUDA, DX/GL shaders. Can't beat that (yet)!

There exist some fairly minimal "C++"-like languages for the GPU, but nothing that fully deals with strings, exceptions, the STL, etc.

And then there are FPGAs, but that's probably not a fair mention since they are not (in general) part of most common hardware (PCs, phones, pads, etc.), while GPUs are found almost anywhere; on most devices now, one can exploit extra computing power exactly by using the GPU.


Rust gives me the same feeling I had when learning C. I started with BASIC, then learned Java and a bit of Pascal, Perl, etc., but C was my first language with pointers. Now it seems silly, but understanding pointers was a huge step to me. There's a before and an after. Getting used to owned/transferable pointers seems to involve a similar step, although it's probably easier this time since I understand what they do.

Btw., if you like this kind of intelligent pointers, Vala offers something similar [1], and the syntax seems a bit easier at first glance. You have owned pointers (and can transfer ownership), shared pointers (with reference counting) and unmanaged pointers. It uses reference counting for all the UI stuff (Vala is highly integrated with Glib and Gtk), since that is usually not performance critical and you'd rather be correct there. If you feel the need to do without reference counting, you can manage memory manually, too.

[1]: https://wiki.gnome.org/Projects/Vala/ReferenceHandling


> You have owned pointers (and can transfer ownership), shared pointers (with reference counting) and unmanaged pointers. It uses reference counting for all the UI stuff (Vala is highly integrated with Glib and Gtk), since that is usually not performance critical and you'd rather be correct there. If you feel the need to do without reference counting, you can manage memory manually, too.

That doesn't sound safe—what if unowned pointers outlive the owned pointer, and you get use-after-free?

I don't believe it's possible to avoid pervasive reference counting or GC in a practical sense, and still be safe, without something like lifetimes built into the type system.


> Rust gives me the same feeling I had when learning C. I started with BASIC, then learned Java and a bit of Pascal, Perl, etc., but C was my first language with pointers. Now it seems silly, but understanding pointers was a huge step to me.

You learned programming the right way round, unlike myself. One of the current problems with the web is that any old noob can learn any old language, so rather than following good advice and learning Python, I hacked C. One of the most interesting things about C is the more proficient you become, the more you use pointer to pointer to pointer of type array of pointer to function that returns a pointer to pointer of type array of pointer to function that returns pointer to void.

Proper high level languages just go straight over my head, they're way too complicated.


I've been working in C++ professionally for a couple of years and honestly I'm a huge fan - so I was excited to read about an alternative. After reading your 5 posts, I get the impression that RUST is mostly mildly useful syntactic sugar on top of C++.

Here is my feedback:

1 - If memory management is a serious problem for the software you work on, I've never found the boost library lacking. This seems like the main selling point for RUST. Given the scope of the project: you guys must be doing something that is so different that it couldn't be rolled into a library - so I'm looking forward to your future posts to see if there is something here that I really am missing out on.

2 - I'm not a fan of the implicitness and I personally don't use 'auto' b/c it makes scanning code harder. I guess this is more of a personal preference.

3 - A lot of things are renamed. auto->let, new->box, switch->match. You get the feeling that effort was put in to make the language explicitly look different from C++

4 - the Rust switch statement doesn't fall through... This one was truly mind-blowing. The one useful feature of the switch statement got ripped out! If you don't really need the fall-through, I'd just avoid using them completely...

5 - I've never really seen an equivalent to boost (in combination to the STL) in other languages (maybe I didn't look hard enough). Could you maybe make a post about the RUST standard library? Libraries are always the deal breaker

To that point, my last comment is maybe a little more wishy washy. The main reason I'm consistently happy with using C++ (and why I put up with the header files) is that everything is available. If you need to do X, and X has at some point been put into library by someone: you can be sure that that library will be available in C++. Since Rust seems so close to C++, does this mean that linking to C++ code is trivial? If I can seamlessly start programming parts of our codebase in RUST, that could potentially make a huge impact.


> Given the scope of the project: you guys must be doing something that is so different that it couldn't be rolled into a library

Memory safety and static checking.

> A lot of things are renamed. auto->let, new->box, switch->match. You get the feeling that effort was put in to make the language explicitly look different from C++

* let comes from ML and similar languages (e.g. Haskell), which are a huge inspiration for Rust (e.g. the type system). Furthermore, let is not auto: it does pattern matching, not just a limited form of type inference (or it'd be after the colon, where the type goes).

* match is much the same, it's a pattern-matching construct not just a jump table

* box is for placement box, IIRC it's similar to a universal placement new. For instance it can be used thus:

    // looks like it returns a basic stack-allocated value
    fn foo() -> int {
        3
    }

    let stack_allocated = foo();
    let heap_allocated = box foo();

  the second one has no more overhead than if the function had returned `box 3` and a ~int (~ unique_ptr<int>) in the first place. This also works for structs; the caller can decide where they want the return value allocated.
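Returning to the first bullet: `let` really does pattern-match rather than merely infer a type. A small sketch in current syntax:

```rust
fn main() {
    // Destructure a tuple in one binding.
    let (quotient, remainder) = (17 / 5, 17 % 5);
    assert_eq!(quotient, 3);
    assert_eq!(remainder, 2);

    // Destructure a struct the same way.
    struct Point {
        x: i32,
        y: i32,
    }
    let Point { x, y } = Point { x: 1, y: 2 };
    assert_eq!(x + y, 3);
}
```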


If I had to guess, I'd say "box" is a reference to the "boxed values" mentioned in many papers about functional language compilation. Typically, the stacks of these implementations would hold either "unboxed values" or pointers to "boxes" on the heap.

Clearly, some of those folks have worked on ML or Haskell implementations.


> If I had to guess, I'd say "box" is a reference to "boxed values"

Yes, that's the start of it but it was expanded to multiple box classes (and maybe eventually user-defined ones) rather than just boxed and unboxed values: a developer can currently use box(HEAP) and box(GC) (a bare `box` is an alias for `box(HEAP)`)


> If memory management is a serious problem for the software you work on, I've never found the boost library lacking.

As a developer that isn't working with C++, I'm finding memory management in C++ to be a nightmare and no amount of libraries can solve it.

Say you receive a pointer from somewhere. Is the referenced value allocated on the stack or on the heap? If allocated on the heap, do you need to free it yourself, or is it managed by whatever factory passed it to you? If you need to deallocate that value, is it safe doing so? Maybe another thread is using it right now. If you received it, but it should get deallocated by the factory that gave it to you, then how will the factory know that your copy is no longer in use? Maybe it's automatic, maybe you need to call some method, the only way to know is to read the docs or source-code very carefully for every third-party piece of code you interact with.

All of this is context that you have to keep in your head for everything you do. No matter how good you are, no matter how sane your practices are, it's easy to make accidental mistakes. I just reissued my SSL certificate, thanks to C++.

Yeah, for your own code you can use RAII, smart pointers, whatever is in boost these days and have consistent rules and policies for how allocation/deallocation happens. Yay! Still a nightmare.

Even if manageable, there's a general rule of thumb that if a C++ project doesn't have multiple conflicting ways of dealing with memory management and multiple String classes, then it's not mature enough.


> Say you receive a pointer from somewhere.

Here's your problem. In general I don't want to be receiving a single pointer from anyone. Lately, I've found it helpful to think of pointers in C++ as special iterators rather than a referential relic from C. In such a mindset passing pointers around without an accompanying end iterator, or iteration count, just makes no sense. Anywhere that implied iteration count is always a constant, I'm probably not structuring my code correctly.

So my recommendation is to use references (foo&) for passing down (well, up) the stack, never to heap allocated objects. Because you can't use delete on a reference there's no longer an ambiguity. Use smart pointers to manage the heap. Write RAII wrappers (it's not a lot of code) to manage external resources. RAII wrappers are especially useful for encapsulating smart pointers so big things can be passed around with value semantics, which gives you even stronger ability to reason. Implementing optimisations like copy-on-write becomes fairly trivial.

> I just reissued my SSL certificate, thanks to C++.

If you're referring to Heartbleed, then OpenSSL is written in C, not C++. Generally only a language that inserts array bounds checks for every access would have shielded you from this bug... C++'s <vector> does this if you use the at() function, but op[] doesn't by default, for performance reasons.
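For comparison, Rust takes the at() behaviour as the default: slice indexing is always bounds-checked (out of range panics rather than reading adjacent memory), and `get` surfaces the check as a recoverable `Option`. A minimal sketch:

```rust
fn main() {
    let v = vec![10, 20, 30];

    // In-bounds indexing works as usual, but every access is checked.
    assert_eq!(v[2], 30);

    // Out-of-range access via `get` yields None instead of undefined behaviour.
    assert_eq!(v.get(3), None);
}
```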


The problem that Rust solves is that your advice, while good, is still advice. I absolutely agree that naked pointers are a code smell, and stack allocated objects should be the norm, with passing around (const) references to them. And RAII wrappers are great.

But all of those are patterns of use, enforced mostly by convention. In Rust, they are enforced by the language itself, and violating them is a compiler error. The following kind of shenanigans won't be allowed outside of unsafe regions:

  int main()
  {
    int on_stack;
    int& ref = on_stack;
    int* ptr = static_cast<int*>(&ref);
    delete ptr;
    return 0;
  }
Yes, it's obviously bad code, but C++ happily let me write it, and it compiled with no warnings under -Wall -Wpedantic.


This is because delete is an operator that can be overridden, and whether it has been overridden isn't known until link time.

    void operator delete(void*) {  }

    int main()
    {
      int on_stack;
      int& ref = on_stack;
      int* ptr = &ref;
      delete ptr;
      return 0;
    }
and now it's safe :P... and yes, never freeing any memory is arguably a perfectly valid memory management strategy. Ok, this example is nuts... but it's a feature of C++, in the C tradition, that it lets you do crazy things. Can I plug custom per-type memory allocators in to Rust?


> In such a mindset passing pointers around without an accompanying end iterator, or iteration count, just makes no sense. Anywhere that implied iteration count is always a constant, I'm probably not structuring my code correctly.

Passing around two iterators is still not safe, due to iterator invalidation.

Memory safety in the C++ model is a hard problem.

> So my recommendation is to use references (foo&) for passing down (well, up) the stack, never to heap allocated objects.

I don't think this is practical. Consider `operator[]` on a vector, which returns a reference to a heap allocated object. If that were to copy out, you'd have a lot of overhead, and if it were to move, then a lot of very common patterns would be annoying to write.


> As a developer that isn't working with C++, I'm finding memory management in C++ to be a nightmare

?


I've never had an issue with memory management in C / C++ (and I've had experience of writing C#/Java in enterprise applications, so I know what not having to do it is like).

There is an overhead to having to think and plan for it, but in my experience of using it for things ranging from realtime systems through to high performance, multi-threaded image processing and rendering applications, at least if you're in control of most of the code and it's pretty good code, it's not really an issue.


Of course, if it's good code, then you're good. The issue is getting good code in the first place. ;) That's where static analysis can help: compilers are tireless. Humans are faulty.

Memory management issues cause tons of security issues in C++ applications. In the recent Pwn2Own, all of the security vulnerabilities in Firefox would not have been possible had Firefox been implemented in Rust.


> if a C++ project doesn't have multiple conflicting ways of dealing with memory management and multiple String classes, then it's not mature enough.

Hmm… How can I tell maturity from rot?


bad_user's definition of maturity sounds an awful lot like rot to me...


That was me being sarcastic. Taking it seriously implies that you felt it to some degree, no? :-P


Sigh. And here I thought I had my sarcasm detector properly calibrated...


>1 - If memory management is a serious problem for the software you work on, I've never found the boost library lacking. This seems like the main selling point for RUST. Given the scope of the project: you guys must be doing something that is so different that it couldn't be rolled into a library - so I'm looking forward to your future posts to see if there is something here that I really am missing out on.

I would say that Rust's defining feature is the borrow checker[1], which eliminates an entire category of pointer related errors at compile time.

>3 - A lot of things are renamed. auto->let, new->box, switch->box You get the feeling that effort was put in to make the language explicitly look different from C++

Rust is strongly influenced by the ML family of languages. I believe that's where some of the keywords came from.

>4 - the Rust switch statement don't fall through... This one was truly mind blowing. The one useful feature of switch statement got ripped it out! If you don't really need the fall through, I'd just avoid using them completely...

Rust doesn't have a classical switch statement. Instead, it has a 'match' keyword that performs pattern matching on the input so you can easily destructure a complicated blob of data into more manageable pieces. This is also a concept borrowed from the ML family. If you really wanted to, you could write a macro that emulates the C style switch statement.
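A hedged sketch of that destructuring, using a hypothetical `Shape` type and present-day syntax: each arm pulls the fields it needs out of the "blob" in one step.

```rust
// `match` destructures the enum and binds its fields in the pattern.
enum Shape {
    Circle { radius: f64 },
    Rect { w: f64, h: f64 },
}

fn area(s: &Shape) -> f64 {
    match s {
        Shape::Circle { radius } => 3.14159 * radius * radius,
        Shape::Rect { w, h } => w * h,
    }
}

fn main() {
    println!("{}", area(&Shape::Rect { w: 2.0, h: 3.0 }));
}
```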

>5 - I've never really seen an equivalent to boost (in combination to the STL) in other languages (maybe I didn't look hard enough). Could you maybe make a post about the RUST standard library? Libraries are always the deal breaker

There's a great overview[2] of what the standard distribution contains on the Rust website.

[1] http://static.rust-lang.org/doc/master/rustc/middle/borrowck...

[2] http://static.rust-lang.org/doc/master/index.html#libraries


I'll add that Rust's match statement, as far as I can see, is more powerful than what Haskell offers: it combines pattern matching and guards, and lets you match different patterns with the same block (the patterns should bind the same variables, and those should have the same types, obviously).
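A small illustrative sketch (current syntax, hypothetical function) of one block serving several patterns: both alternatives must bind `x`, with the same type.

```rust
// `|` lets two patterns share one arm; each alternative binds `x`.
fn first_nonzero(p: (i32, i32)) -> i32 {
    match p {
        (x, 0) | (0, x) => x,
        (x, _) => x,
    }
}

fn main() {
    assert_eq!(first_nonzero((0, 7)), 7);
    assert_eq!(first_nonzero((5, 9)), 5);
}
```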


Looks like I have a lot to read up on. Thank you for pointing me in the right direction.


> 1 - If memory management is a serious problem for the software you work on, I've never found the boost library lacking.

A key feature of Rust is being memory safe by default without sacrificing too much performance. In C++ it's not really possible to be completely safe, and pretty much all real-world C++ I've seen doesn't even come close to that ideal, using raw pointers frequently; the result is that crashes can be relatively mysterious and, perhaps more importantly, that in large C++ codebases with lots of attack surface, like browsers, security vulnerabilities are extremely frequent. So yes, memory management is a serious problem.

In particular, there is no way to replicate Rust-style borrowed pointers in C++, which make safe low-level programming easier.

> 4 - the Rust switch statement don't fall through... This one was truly mind blowing. The one useful feature of switch statement got ripped it out! If you don't really need the fall through, I'd just avoid using them completely...

Most C++ code uses switch frequently, usually without taking advantage of fallthrough. Rust match is good for this, but it can also be used for more complex things (destructuring); using this frequently makes for code that looks quite different from C++.

> The main reason I'm consistently happy with using C++ (and why I put up with the header files) is that everything is available.

The bad news: as Rust is still a very unstable language, there isn't much of a library ecosystem.

The good news: by release, Rust will have a standard package manager, which should make importing libraries easier than in C and C++ where every package has its own build system. I think Go has been pretty successful with this approach.

> Since Rust seems so close to C++, does this mean that linking to C++ code is trivial?

Nope... I would personally like this feature, although it would be difficult to make reliable because Rust has different semantics in various areas - OOP is very different, templates are (intentionally) not as powerful, etc.

However, you can import C headers using rust-bindgen [1] and fairly easily link to C code.

[1] https://github.com/crabtw/rust-bindgen


Thanks for taking the time to address my questions. Memory management hasn't been a major stumbling block in my line of work, but I'll read up and give it a whirl if I have the appropriate project.


> Most C++ code uses switch frequently, usually without taking advantage of fallthrough.

I am not so sure about this. Thinking back on my uses of switch in C and C++, I am not able to remember using switch without taking advantage of fallthrough. Maybe I am just not a typical C++ programmer...


Rust still offers an equivalent to things like

  switch (x) {
  case 0: case 1: foo(); break;
  case 2: case 3: bar(); break;
  default: baz();
  }
in the form of

  match x { 
      0 | 1 => foo(),
      2 | 3 => bar(),
      _ => baz()
  }
In any case, Rust's match is so much more powerful than switch, offering (nested) pattern matching like Haskell's `case`.


There are two different kinds of fallthrough in C++. The most common is using the same code for multiple values. Rust already supports this by allowing multiple values and ranges of values for each pattern.

The much rarer use of fallthrough is executing the code for one case and then continuing on to execute the code for the next case. In fact, in my large Android application I turned on warnings for this type of fallthrough, and out of hundreds of switch statements only eight used it, and all but one were mistakes.
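For that rare execute-and-continue kind, the usual Rust translation is to factor the shared tail into a function and call it explicitly from both arms; a sketch (current syntax, hypothetical names):

```rust
// Emulating C's "case 1 falls into case 2" by calling the shared
// tail explicitly, instead of relying on an implicit missing `break`.
fn step_two(log: &mut Vec<&'static str>) {
    log.push("two");
}

fn run(n: i32, log: &mut Vec<&'static str>) {
    match n {
        1 => {
            log.push("one"); // C: body of `case 1:`...
            step_two(log);   // ...then an explicit "fallthrough" into case 2
        }
        2 => step_two(log),
        _ => log.push("other"),
    }
}

fn main() {
    let mut log = Vec::new();
    run(1, &mut log);
    assert_eq!(log, ["one", "two"]);
}
```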


> 1 - If memory management is a serious problem for the software you work on, I've never found the boost library lacking. This seems like the main selling point for RUST.

When use-after-free becomes a security concern that commonly leads to remote code execution, as it is for us in the browser space, it becomes very apparent just how inadequate modern C++ is at the task. Everything works fine, the tests pass, people use it in production, and yet all the time someone discovers some way to make the stars align to produce a use-after-free bug. This has happened over and over again, despite all the smart pointers and modern C++ techniques.

The fact is that modern C++ just isn't memory safe, and after digging deep into the problem to try to solve it with Rust I'm convinced now that it can never be. The language just has too many core features, such as references, iterators, and the "this" pointer, that cannot be made memory safe without sacrificing backwards compatibility.


>1 - If memory management is a serious problem for the software you work on, I've never found the boost library lacking. This seems like the main selling point for RUST.

No, the main selling point is memory management with safety guarantees but full performance. You don't get that just with Boost: its smart pointers are mostly reference-counted GC.

>2 - I'm not a fan of the implicitness and I personally don't use 'auto' b/c it makes scanning code harder. I guess this is more of a personal preference.

Type inference removes redundant information.
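A small sketch of what that means in practice (current syntax, hypothetical function): the annotations would only restate what the signature and the right-hand side already say.

```rust
// The element type and the collection type of the result are both
// inferred from the return type; no annotation inside is needed.
fn squares(n: i32) -> Vec<i32> {
    (1..=n).map(|x| x * x).collect()
}

fn main() {
    let v = squares(3); // `v` is inferred as Vec<i32>
    assert_eq!(v, [1, 4, 9]);
}
```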

>4 - the Rust switch statement don't fall through... This one was truly mind blowing. The one useful feature of switch statement got ripped it out!

Huh? Fall-through has been known to be a bug hazard and bad style for ages. And for what, to save a few lines with a not-that-clever trick? The only useful abuse of switch was Duff's device, and that's a stretch already.


The 'let' keyword introduces a variable. It's not 'auto'.

You can still declare variables types explicitly: 'let x: int = 5'

'match' is also much more powerful than 'switch'. You can match on several patterns, ranges of numbers, as well as use guards.

http://rustbyexample.com/examples/match/README.html
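A sketch of those features together, with one caveat: this uses current syntax, where an inclusive range pattern is written `..=` (the spelling in this thread's era was `..`).

```rust
// Multiple patterns, a range pattern, and a guard in one match.
fn describe(n: i32) -> &'static str {
    match n {
        0 => "zero",
        1 | 2 => "one or two",        // several values, one arm
        3..=9 => "three to nine",     // a range of values
        x if x < 0 => "negative",     // a guard on the bound value
        _ => "big",
    }
}

fn main() {
    assert_eq!(describe(7), "three to nine");
    assert_eq!(describe(-3), "negative");
}
```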


Features-wise, C++ probably subsumes a lot of languages, so from a certain perspective, everything else looks like a weird syntax for a subset of C++.

And it's true that Rust is edging closer and closer to C++ in some ways. It certainly didn't start there; old ("old", 2010 or so) Rust looks pretty weird, with lots of esoteric features (effects! structural types!) and built-in magic. Over time Rust probably ~focused on its core values~ etc, which resemble those of C++, and also grew more powerful as a language so that most things can now live in the library rather than in the compiler, which is I guess something that C++ also prides itself on.

But more seriously, the whole motivation for Rust isn't what features it gives, but what it takes away. All the traits and algebraic datatypes and type inference and other shiny Rust features, if someone walked up to the Rust people with an implementation of that as a source filter for C++ or whatever, it still wouldn't sell. The mission of Rust (as I see it, from the distance) isn't to make a more convenient or powerful C++, it's to make a language convenient and powerful enough that people won't miss C++'s laissez-faire attitude towards memory safety.

To that effect, Rust isn't really aiming to compete with C++ on a feature-for-feature basis, but it has to include features that enable a style of programming that is competitive with C++ performance and convenience without relying on memory-unsafe features.

(I guess the `unsafe { }` sub-language comes off as an admission that that won't work, but it's arguably just a way to introduce new safe primitives that require particularly careful manual checking, just like adding features to the language that the compiler assumes are safe. And anyway it's mostly equivalent to linking to arbitrary C code, without which the language would pretty much be a non-starter anyway.)


Warning: these are just my impressions; I haven't fully tried out Rust.

> 1 - If memory management is a serious problem for the software you work on, I've never found the boost library lacking. This seems like the main selling point for RUST. Given the scope of the project: you guys must be doing something that is so different that it couldn't be rolled into a library - so I'm looking forward to your future posts to see if there is something here that I really am missing out on.

Generally every non-trivial program has bugs (a possible exception being TeX, which was written by one of the greatest computer scientists of all time and has been open for years with bragging rights for anyone who finds a bug, and even then I'm not sure). Rust aims to reduce bugs, including memory bugs, as much as possible with static typing. It's probably the language that concentrates most on preventing bugs through static typing/analysis this side of Haskell. Even if C++ could get a similar feature from a carefully used library, that is not the same: people could still use unsafe features that Rust only allows inside unsafe blocks.

> 2 - I'm not a fan of the implicitness and I personally don't use 'auto' b/c it makes scanning code harder. I guess this is more of a personal preference.

It is a matter of style, but as someone who puts var in front of everything but basic types in C#, I think it's a good default. An IDE can help a lot by showing the type on hover.

> 3 - A lot of things are renamed. auto->let, new->box, switch->box You get the feeling that effort was put in to make the language explicitly look different from C++

Although Rust is somewhat influenced by C++, aiming to take over its problem domain, it is probably more influenced by functional languages like OCaml. let comes from OCaml; match instead of switch also comes from OCaml (note that match is more powerful than switch). Also note that let isn't exactly the same as auto: let declares local variables, and Rust defaults to inferring their types, but you can still specify a type and keep the let. ex.

    let monster_size: int = 50;
> 4 - the Rust switch statement don't fall through... This one was truly mind blowing. The one useful feature of switch statement got ripped it out! If you don't really need the fall through, I'd just avoid using them completely...

There aren't that many useful switch statements that fall through for reasons other than matching multiple values to one execution path, and Rust lets you match one of a number of values or a containing range. ex.

    match my_number {
      0     => println!("zero"),
      1 | 2 => println!("one or two"),
      3..10 => println!("three to ten"),
      _     => println!("something else")
    }
There are some other uses for switch fallthrough, such as Duff's device (cool, but hard to understand and not necessarily a speedup these days) and things like http://programmers.stackexchange.com/a/116232 where some cases include other cases (not that common, and can be simulated with if/elses). Fallthrough is a confusing "feature" which can lead to bugs if you forget a break, which goes against the Rust philosophy, and it would be even more confusing when using the return value of match. (I think this next example is right:)

    println!("my number is {}!", match my_number {
      0     => "zero",
      1 | 2 => "one or two",
      3..10 => "three to ten",
      _     => "something else"
    });
In Rust, matches and if/else statements are expressions. This gives you a nice-looking ternary expression and makes some functions shorter.

I think that if there are multiple ';'-separated expressions in an if/else or => body wrapped with {}, it returns the last expression if that isn't followed by a ';', and otherwise returns unit (a type meaning something similar to void). But I could be wrong.
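That is indeed the rule; a small sketch (current syntax, hypothetical names) showing both sides of it:

```rust
// The last expression without a trailing `;` is the block's value.
fn block_value() -> i32 {
    let a = 2;
    let b = 3;
    a * b // no semicolon: this is what the function returns
}

fn main() {
    // A `{ ... }` block used as an expression works the same way.
    let x = { let n = block_value(); n + 1 };
    assert_eq!(x, 7);

    // With a trailing `;`, the block evaluates to `()` (unit).
    let y: () = { block_value(); };
    assert_eq!(y, ());
}
```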

> 5 - I've never really seen an equivalent to boost (in combination to the STL) in other languages (maybe I didn't look hard enough). Could you maybe make a post about the RUST standard library? Libraries are always the deal breaker

The equivalent of Boost in what way? Boost is a collection of many (>80) different libraries, some of which (ab)use the language in very interesting ways. Some of these eventually move into the standard and stdlib. I don't think any other language has this sort of feeder collection. That said, many languages have officially or unofficially blessed libraries which sometimes make it into the stdlib. Rust is fairly young and a moving target, so there aren't many 3rd party libraries yet, but hopefully that will change once it stabilizes and the package manager is mature.


> 3 - A lot of things are renamed. auto->let, new->box, switch->box You get the feeling that effort was put in to make the language explicitly look different from C++

Rust is inspired by more languages than just C++, believe it or not. :)


I recently wondered whether it's possible to compile rust into a dll/so or whether there is a way to call rust from other languages (e.g. C, R, or ruby). All I found is that this isn't (easily?) possible because of rust's runtime. Is this true? If so, will it be possible to call rust from, e.g., C code? I'd like to have an alternative to c/c++ for writing native extensions for interpreted languages.


Yes, it is possible to do it by just avoiding the runtime, e.g. the following works fine:

    #![crate_id="example_c"]
    #![crate_type="dylib"]
    
    #[no_mangle]
    pub extern fn my_rust_function(x: i32, y: i32) -> i32 {
        x + y
    }
Then compiling that gives a libexample_c....so file, which can be linked against the following C:

    #include<stdio.h>
    extern int my_rust_function(int, int);

    int main() {
        printf("%d", my_rust_function(1, 2));
        return 0;
    }
printing 3. (There's no interaction with a runtime here at all.)

In fact, it's even possible to "manually" start a runtime[1] inside another process (and if it's started on its own thread then I think it will work flawlessly, if not, then it might break in some corner cases).

[1]: http://static.rust-lang.org/doc/master/guide-runtime.html


> If so, will it be possible to call rust from, e.g., C code? I'd like to have an alternative to c/c++ for writing native extensions for interpreted languages.

Yes. Technically it's already possible[0][1] but AFAIK it requires throwing out all (?) of the stdlib. rust-core[2] is an exploration of a runtime-less subset of the stdlib, which would ultimately be usable under a no_rt flag or something like that (at this point you have to disable the stdlib entirely and use rust-core instead)

[0] https://github.com/mozilla/rust/issues/3608

[1] https://github.com/charliesome/rustboot

[2] https://github.com/thestinger/rust-core


It's on the roadmap: there are plans to split the standard library into parts which require the runtime and parts which don't (as well as into parts which don't require malloc). As it stands you can write them, but you need to avoid parts of the standard library.


The third production deployment of Rust is a Ruby extension written in Rust. https://air.mozilla.org/sprocketnes-practical-systems-progra... (You'll have to scroll forward)

And technically, it's a Ruby extension written in C. But the C is very small, and the rest is written in Rust. This is because Ruby's cexts use so many C macros that it was easier to make a shim than to try to port them all to Rust.


Alternatively you could write runtime-less Rust. I don't know if the offerings for this are at the prototype stage or better at this point, but it seems like something they want to keep.


Man. Now I really want to read the next post...


From the blog:

>I have history with Firefox layout and graphics, and programming language theory and type systems (mostly of the OO, Featherweight flavour, thus the title of the blog).

Hmm, what are "type systems" of "featherweight flavour"? Anything real or some inside joke? Or perhaps an elaborate way to say "not that complicated"?

(Google mostly returns references to the same blog)


Featherweight Java (http://www.fos.kuis.kyoto-u.ac.jp/~igarashi/papers/fj.html) was a seminal paper which showed type soundness for Java. The formal syntax was a subset of Java and preserved the interesting features in the semantics (as opposed to encoding a language in an extension of the lambda calculus, which is an alternative style of formalisation). So, "featherweight flavour" refers to formal type systems work which follows this style of formalisation.


I'd expect something along the lines of:

> Several recent studies have introduced lightweight versions of Java: reduced languages in which complex features like threads and reflection are dropped to enable rigorous arguments about key properties such as type safety. We carry this process a step further, omitting almost all features of the full language (including interfaces and even assignment) to obtain a small calculus, Featherweight Java, for which rigorous proofs are not only possible but easy.

> Featherweight Java bears a similar relation to Java as the lambda-calculus does to languages such as ML and Haskell. It offers a similar computational "feel," providing classes, methods, fields, inheritance, and dynamic typecasts with a semantics closely following Java's. A proof of type safety for Featherweight Java thus illustrates many of the interesting features of a safety proof for the full language, while remaining pleasingly compact. The minimal syntax, typing rules, and operational semantics of Featherweight Java make it a handy tool for studying the consequences of extensions and variations.

You can see a few of his pubs on his website: http://www.ncameron.org/papers/index.html


>>The memory will not leak however, eventually it must go out of scope and then it will be free. Yes, if the function that called it lives for ever the point is perhaps moot, in an extreme case if it was called by 'main'.


If the pointer is returned all the way up to the main function, then it must be required for the lifetime of the program (or else the program is poorly designed). What point are you trying to make?


It could also be a memory leak. For example, I have a for loop that keeps requesting a new pointer...


> For example, I have a for loop that keeps requesting a new pointer...

Unless you put that pointer in a structure which outlives the loop, it will be freed at the end of each iteration:

    struct Foo { a: int, b: int }
    impl Drop for Foo {
        fn drop(&mut self) {
            println!("Drop {:?}", *self);
        }
    }

    fn foo(i: int) -> Foo {
        println!("Creating Foo with {}", i);
        Foo { a: i, b: 5}
    }

    fn do_thing(foo: &Foo) {
        println!("\tdo thing {:?}", *foo);
    }
    fn do_other_thing(foo: &Foo) {
        println!("\tdo other thing {:?}", *foo);
    }

    fn main() {
        for i in range(0, 5) {
            let a = box foo(i);
            do_thing(a);
            do_other_thing(a);
        }
    }
will print

    Creating Foo with 0
        do thing Foo{a: 0, b: 5}
        do other thing Foo{a: 0, b: 5}
    Drop Foo{a: 0, b: 5}
    Creating Foo with 1
        do thing Foo{a: 1, b: 5}
        do other thing Foo{a: 1, b: 5}
    Drop Foo{a: 1, b: 5}
    Creating Foo with 2
        do thing Foo{a: 2, b: 5}
        do other thing Foo{a: 2, b: 5}
    Drop Foo{a: 2, b: 5}
    Creating Foo with 3
        do thing Foo{a: 3, b: 5}
        do other thing Foo{a: 3, b: 5}
    Drop Foo{a: 3, b: 5}
    Creating Foo with 4
        do thing Foo{a: 4, b: 5}
        do other thing Foo{a: 4, b: 5}
    Drop Foo{a: 4, b: 5}
(drop is ~equivalent to a C++ destructor)


If you're creating a new ~T in a loop, it will be freed after each iteration of the loop (ie, when the block ends). Witness for yourself: https://gist.github.com/cmr/f80c0a58e5b90021bb35


I just tried the following:

        let mut blah = ~MyStruct{x: 3, y: 4};
        for i in range(0,100000000) {
            blah = ~MyStruct{x: i + blah.x, y: i + blah.y};
        }
        println!("{} {}", blah.x, blah.y);
The memory usage didn't increase over time - the owned pointer frees when it gets reassigned.


See this example for more details: https://news.ycombinator.com/item?id=7665617


Only if you move the pointer into a vector or similar. Just having loop { let x = ~"something"; } will not create a memory leak, as the pointer will be dropped and freed at the end of each iteration.


What the heck is this?

  y: ~~~~~Foo


I assumed it reads more or less like:

  Foo *****y;
Or, probably, more like:

  std::unique_ptr<std::unique_ptr<std::unique_ptr<std::unique_ptr<std::unique_ptr<Foo>>>>> y;


Hey, at least you can write ‘>>>>>’ instead of ‘> > > > >’ now. :)


Thanks, I hope this is never needed anywhere, ever.


I suspect it's a deliberately-perverse example, meant only to illustrate how deeply method calls will automatically dereference.


It also dereferences through ~~~&~@~~&~Foo.


y is a pointer to a pointer to a pointer to a pointer to a pointer to an instance of Foo.

What the author wanted to show was that there is automatic dereference:

z = y.foo()

Will just reference through all five pointers and call the method on the object directly.
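In later syntax (where `~T` became `Box<T>`), a sketch of the same thing, with three levels instead of five for brevity:

```rust
// Method calls auto-dereference through any depth of owning pointers.
struct Foo;

impl Foo {
    fn answer(&self) -> i32 {
        42
    }
}

fn main() {
    // The old example's `~~~~~Foo`, in miniature.
    let y: Box<Box<Box<Foo>>> = Box::new(Box::new(Box::new(Foo)));
    assert_eq!(y.answer(), 42); // `.answer()` derefs through every Box
}
```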


A pointer to a pointer to a pointer to a pointer to a pointer to a Foo.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: