I have also found people often unestimate realloc (but have never done the same level of investigation to find out just how clever it is!)
On several occasions I have wanted to use mmap, mremap, and friends more often to do fancy things like copy-on-write memory. However, I always find this whole area depressingly poorly documented, and hard to do (because if you mess up a copy-on-write call, it just turns into a copy it seems, with the same result but less performance).
While it's good realloc is clever, I find it increasingly embarassing how badly C (and even worse, C++ which doesn't even really have realloc (as most C++ types can't be bitwise moved) memory allocation maps to what operating systems efficiently support.
C++ really wants a realloc variant that extends an allocation if it can be extended without a copy, and leaves the allocation unchanged if it can't. The annoying thing is that there's no good reason why this can't exist beyond that the STL allocator interface happens not to have it.
There have been C++ templates written that use jemalloc-specific calls; for instance see Folly from facebook. I haven't taken a close look, but I know they do some jemalloc-specific code:
https://github.com/facebook/folly/tree/master/folly
The other allocated-related thing that C++ really wants (and could benefit C as well) is "sized deallocation". Most of the time you often know the exact size of the memory you allocated. If you could pass that to free() the allocator could save some work determining it. In the case of C++ the compiler can often do this on your behalf for many "delete" calls (at least in the cases where it knows the exact type). Google did an implementation of this idea and got good results. They submitted a proposal to the standards body but I don't know if there is any recent activity. I hope it does happen though:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n353...
Folly does take advantage of jemalloc to expand allocations in-place when possible, but afaik it doesn't do the more extreme optimization mentioned in the article where pages are moved to a different virtual address without actually paging into memory.
Since 2.1, jemalloc does support using mremap to do large realloc()'s, although it seems to be off by default. You need "./configure --enable-mremap" to get it.
That's good news about sized deallocation, I hadn't noticed that there is an updated "N3778" proposal which apparently was accepted. I still haven't seen the dlmalloc work to support that show up in the main svn branch.
It always calls (the equivalent of) malloc + memcpy + free for each grow, so it can be anywhere from the exact same speed (when realloc does the same thing internally) to absurdly slower (in the given case of a large array that has to be paged in). The first case is by far the most common case, but it is something that sometimes matters.
It seems that even vector::shrink_to_fit is implemented that way.
That is, when you call shrink_to_fit, it first copies all elements it has, swaps itself with the new copy, and deletes the original vector. (Probably because there's no portable way of returning only part of a contiguous memory region. Yikes.)
I think the reason is the existence of move/copy constructors, i.e. it's invalid to just move a C++ object in memory (as a realloc would do), in general.
(shrink_to_fit does need to (attempt to) get a smaller allocation, otherwise or wouldn't be shrinking the vector.)
Why use realloc when growing an array you have an interface to? Just add another allocated buffer to the previous allocated areas. When there are too many small areas, consolidate with realloc/free.
Much faster (yes yes, almost always).
(Disclaimer: Last time I used C++ I had hair. :-) )
Well, there is no grow() method on std::vector, so ... no?
But generally speaking, std::vector implementations are basically required to[1] grow the backing store exponentially so that adding elements to them absolutely does not call malloc on every growth of the vector.
You can make a vector do this kind of pessimistic allocation by calling reserve() for every element you add, which will cause the allocation of exactly the amount you reserved. This would be dumb, though. Reserve is there so you can allocate a precise large number and avoid even the logarithmic cost of allocation in adding elements to the vector.
It's really worth noting that this is a better worst case than the worst case for realloc(), which is entirely entitled to reallocate and copy every single time you call it. You're pretty likely to implement the exact same algorithm as vector if you DIY because of this exact issue when performance is important.
I do agree, though, that it would be nice if there was a failable realloc in C++ (and C for that matter) as described above, where it simply returns NULL if there's no more room in the allocated space. What to do in that event should really be up to the caller, not a black box algorithm sensitive to all sorts of variables.
[N] However, std::basic_string allows linear complexity on its push_back, so that may be what the poster meant. I'm not aware of any widely used implementation that actually does it in worse than amortized constant, though.
I was talking of the standard way of sllocating extra buffers, to keep new data. With a layer on top of this, using a series of pointers (8-16) to the underlaying memory areas.
(When you fill up the sub-buffers, you do a realloc to all of them. Future sub-buffers get the new size.)
This needs less copying and (potentially) free calls. On the other hand: An access would need some extra operation to find the right buffer and get an offset into it.
Edit: Ok, thanks. I explained if I was unclear, since you went off on a tangent. It was informative, so it is all good. (I'm not going back to C++ et al anyway. :-) )
std::vector guarantees that it stores its elements contiguously, as this is required for a lot of use-cases (such as passing the buffer to one of the millions of functions that take just a pointer and a size).
What you're talking about is some sort of rope data structure which is pretty simple to implement on top of std::vector and std::list. The std::vector guarantees contiguous memory locations so you can't do it for that particular container.
On several occasions I have wanted to use mmap, mremap, and friends more often to do fancy things like copy-on-write memory. However, I always find this whole area depressingly poorly documented, and hard to do (because if you mess up a copy-on-write call, it just turns into a copy it seems, with the same result but less performance).
While it's good realloc is clever, I find it increasingly embarassing how badly C (and even worse, C++ which doesn't even really have realloc (as most C++ types can't be bitwise moved) memory allocation maps to what operating systems efficiently support.