A few years ago I wanted to make a shoot-em-up game using C# and XNA that I could play with my kids on an Xbox. It worked fine except for slight pauses for GC every once in a while, which ruined the experience.
The best solution I found was to get rid of objects and allocation entirely and instead store the state of the game in per-type tables like in this article.
Then I realized that I didn't need per-type tables at all, and if I used what amounted to a union of all types I could store all the game state in a single table, excluding graphical data and textures.
Next I realized that since most objects had properties like position and velocity, I could update those with a single method by iterating through the table.
That led to each "object" being a collection of various properties acted on by independent methods which were things like gravity and collision detection. I could specify that a particular game entity was subject to gravity or not, etc.
The resulting performance was fantastic and I found I could have many thousands of on-screen objects at a time with no hiccups. The design also made it really easy to save and replay game state; just save or serialize the single state table and the input for each frame.
The final optimization would have been to write a language that would define game entities in terms of the game components they were subject to and automatically generate the single class that was union of all possible types and would be a "row" in the table. I didn't get around to that because it just wasn't necessary for the simple game I made.
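Roughly, the "row" idea looks like this (a minimal C++ sketch with invented names, not the parent's actual C#/XNA code): a single struct that unions all possible properties, plus a flag bitmask saying which systems apply to each entity.

    #include <cstdint>
    #include <vector>

    // Hypothetical "row" type: the union of every property any entity might need.
    struct Entity {
        std::uint32_t flags = 0;   // which behaviors apply to this entity
        float x = 0, y = 0;        // position
        float vx = 0, vy = 0;      // velocity
        int frame = 0;             // index into animation data stored outside the table
        int hp = 0;                // unused for entities that can't take damage
    };

    enum : std::uint32_t { HAS_GRAVITY = 1u << 0, HAS_MOTION = 1u << 1 };

    // The entire game state: one flat, preallocated table, so no heap churn per frame.
    std::vector<Entity> world(10000);

    void update(float dt) {
        for (Entity& e : world) {
            if (e.flags & HAS_GRAVITY) e.vy += 9.8f * dt;
            if (e.flags & HAS_MOTION)  { e.x += e.vx * dt; e.y += e.vy * dt; }
        }
    }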
I was inspired to do this from various articles about "component based game design" published at the time that were variations on this technique, and the common thread was that a hierarchical OO structure ended up adding a lot of unneeded complexity for games that hindered flexibility as requirements changed or different behaviors for in-game entities were added.
The general approach described by the author is the one used by many game engines. The main reason, however, that the refactor you've described here isn't used is that it bloats memory by a significant factor (which thereby reduces performance). Since you have to allocate, for each array index, the size of the largest item in the union, there is a lot of wasted space when you have lots of smaller items. If you have fewer than a couple billion points, a triangle will fit in 12 bytes, while a game object can easily have 10 fields for a total of 30 bytes. That's a massive expansion of your triangle array, which will result in more cache misses.
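To make the bloat concrete, here's a hedged sketch (the struct layouts are invented for illustration, not taken from any engine): a tight per-type array stores only what each item needs, while a union-of-everything row pays for the largest member on every element.

    #include <cstdint>
    #include <cstdio>

    struct Triangle { std::uint32_t v0, v1, v2; };   // 12 bytes: three vertex indices

    struct FatRow {                                  // union-of-everything "row"
        std::uint32_t flags;
        float x, y, z, vx, vy, vz;
        std::uint32_t v0, v1, v2;
        // ...plus whatever the largest entity type needs
    };

    int main() {
        // Every triangle stored in the fat table wastes sizeof(FatRow) - sizeof(Triangle)
        // bytes, which also means fewer triangles per cache line.
        std::printf("%zu vs %zu bytes per element\n", sizeof(Triangle), sizeof(FatRow));
    }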
Of course, this isn't a general solution to every problem and I wouldn't use it if there was a huge distribution in entity sizes like you described.
The reason it worked in my case was that for a 2d game, the state of even the maximum sized entity was very small. For animated objects I had a per-entity integer that stored what frame to display. The texture and animation data was stored elsewhere outside the table, but that was also loaded before the game started so there was no allocation after that point.
For a 3d game with complex animations it would be foolish to try to store all the animation and texture data in the same table, but I'd probably do something very similar for the game state.
I've tried a lot of ways of structuring game data but I basically agree with this assessment. Simple game logic can usually be premised on a single "God Object" type of deal that contains every possible property, or if you're in a language that supports it, is just a form of dynamic type. In old 8-bit games bit packing and unions would be used to conserve memory, but the games generally weren't complex enough in architectural scope to require a generalized approach to custom data structures and containers - just a few for optimization's sake.
It took until the late 90's and some overenthusiastic misfires with class inheritance-based designs (they don't compose cleanly) for the modern component-driven styles to start emerging. The tradeoff being made in current engines is basically one of late vs early binding - it's easier to write tooling against a late-binding system where you can attach more nodes arbitrarily (e.g. Godot's scene system), but your optimization potential is better if you can compute exactly the data structures and containers needed for each category of entity and asset (there's a talk from HandmadeCon about how Destiny went too far down this path and worked themselves into a corner).
Edit: Also, the biggest sticking point for working with a GC in games isn't really the entity and asset management, since pooling will suffice for guaranteeing the quantities of elements needed for any given scene; it's the algorithms that rely on standard functions that expect a GC environment, like string manipulation.
Shows what I know... I came around to the same conclusion as drblast with my own ECS project and assumed a single array of one union type would perform better than multiple arrays.
Of course, I also didn't consider any extra space wasted, since it could potentially be reused when recycling slots for new components.
Reminds me of some assembly (!) source code I read once for, I think it was, Super Mario Bros. as ported to the TI-89 calculator. Every object in the game had the same bit structure (position, velocity, weight, width, some others), with the final word being a pointer to an "update" function. The game loop was a very straightforward physics update on each object followed by a call to its update function.
Pretty much. The fact that it solved a specific performance problem (avoiding GC pauses) in a tiny game I was making, along with the ease of reasoning about and debugging game state was really interesting to me. Most of the articles about it were about managing large code bases which didn't apply to my little toy game project.
I then took it to the furthest extreme I could (everything in a single table) just to see if it would work well, and it did. Almost a purely functional approach to game architecture.
Frame State 0 -> Collection of composable functions to modify state -> Frame State 1
I found that to be vastly simpler and more effective than the OO approach of segregating state and functions into individual opaque objects. At that point OO seemed like such a brain-dead way to make things more difficult for myself I wondered if the entity-component idea wouldn't be more universally applicable to systems other than games. It is.
This was one of those rare cases where taking something to a logical extreme had an architectural benefit (all state in one easy-to-reason about table) that matched up really well with the performance benefit (memory locality). For systems where the system state is too large or varied to fit in a human's head, the benefits wouldn't be as apparent. But game states are, by definition, something that should be fairly easy to reason about because that's what the player is seeing and interacting with.
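As a rough sketch of that pipeline (a toy C++ example with invented names; each "system" is just a state-in, state-out function):

    #include <cstddef>
    #include <vector>

    struct GameState { std::vector<float> y, vy; };   // toy state: one table of positions/velocities

    // Each composable function takes the frame state and returns the next one.
    GameState applyGravity(GameState s, float dt) { for (float& v : s.vy) v += 9.8f * dt; return s; }
    GameState applyMotion (GameState s, float dt) { for (std::size_t i = 0; i < s.y.size(); ++i) s.y[i] += s.vy[i] * dt; return s; }

    // Frame State 0 -> collection of composable functions -> Frame State 1
    GameState step(GameState s, float dt) { return applyMotion(applyGravity(s, dt), dt); }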
This is a bit pedantic, but this type of entity component architecture isn't really the architecture that "Entity Component System" has come to describe. ECS usually refers to storing the components separately and looping over each one (or small groups of them) separately.
For instance, you could have a physics system that loops over all of the velocity and position components and increments the positions by the velocities, and a collision system that loops over the position components and handles collisions.
>memory locality
The system you're describing has less memory locality than the ECS system described above. That's the main benefit of the added complexity of storing components separately. Another big benefit is that you can store different components in different types of data structures depending on how you need to access them.
Don't get me wrong, I'm not saying the ECS I'm talking about is necessarily better. It's not normally needed, and it's almost definitely overused by games that don't need it.
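For contrast, a bare-bones sketch of the separate-arrays style (structure-of-arrays; names are illustrative, not from any particular engine): each system walks only the tightly packed component arrays it actually needs.

    #include <cstddef>
    #include <vector>

    // Components stored separately, one tightly packed array per component type.
    struct Positions  { std::vector<float> x, y; };
    struct Velocities { std::vector<float> x, y; };

    // The physics system touches only positions + velocities; a collision system
    // could walk positions alone, or use an entirely different data structure.
    void physicsSystem(Positions& p, const Velocities& v, float dt) {
        for (std::size_t i = 0; i < p.x.size(); ++i) {
            p.x[i] += v.x[i] * dt;
            p.y[i] += v.y[i] * dt;
        }
    }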
Quick rant: it's "Entity Component Systems". The "System" doesn't describe the approach in general, it describes an object called a "System". Entities are IDs (equivalent to indexes into his array), Components are property bags (values in his array), Systems are things that act on components (his update loop could be considered a System).
The idea behind it is that you can have a "WorldPosition" Component which stores all world positions as a flat array, and a "Gravity" System which acts upon WorldPosition components and makes them move. The idea is that since your data is tightly packed, it's easier to update.
A lot of people get this confused, though, so no worries.
> The final optimization would have been to write a language that would define game entities in terms of the game components they were subject to and automatically generate the single class that was union of all possible types and would be a "row" in the table
> polymorphic django models using automatic type-field downcasting
> The actual type of each object is stored in the database, and when the object is retrieved it is automatically cast to the correct model class
...
> the common thread was that a hierarchical OO structure ended up adding a lot of unneeded complexity for games that hindered flexibility as requirements changed or different behaviors for in-game entities were added.
So, in order to draw a bounding box for an ensemble of hierarchically/tree/graph-linked objects (possibly modified in supersteps for reproducibility), is an array-based adjacency matrix still fastest?
Are sparse arrays any faster for this data architecture?
Interesting, never came across it. But I've got a Django GitHub library (not yet open source, because it's in a project I've been working with on and off) that does the same for managing GitHub's accounts. An Organization and User inherit the same properties and I downcast them based on their polymorphic type field.
IDK how to do partial indexes with the Django ORM? A simple filter(bool, rows) could probably significantly shrink the indexes for such a wide table.
Arrays are fast if the features/dimensions are known at compile time (if the TBox/schema is static). There's probably an intersection between object reference overhead and array copy costs.
Arrow (with e.g. parquet on disk) can help minimize data serialization/deserialization costs and maximize copy-free data interoperability (with columnar arrays that may have different performance characteristics for whole-scene transformation operations than regular arrays).
Many implementations of SQL ALTER TABLE don't have to create a full copy in order to add a column, but do require a permission that probably shouldn't be GRANTed to the application user and so online schema changes are scheduled downtime operations.
If you're not discovering new features at runtime and your access pattern is generally linear, arrays probably are the fastest data structure.
Types in RDF are additive: a thing may have zero or more rdf:type property instances. RDF quads can be stored in one SQL table like:
_id,g,s,p,o,xsd:datatype,xml:lang
... with a few compound indexes that are combinations of (s,p,o) so that triple pattern graph queries like (?s,?p,1) are fast. Partial indexes (SQLite, PostgreSQL) would be faster than full-table indexes for RDF in SQL, too.
Under XNA, the garbage collector on the Xbox 360 was rudimentary compared to the generational garbage collector on PC. I had the same problems when working on a game in 2008.
Ugh yeah, you're right - I had to look this up again, it's been that long. Makes sense, as we always ran it on the console; running it on PC was fine.
ECS style is nice from a design perspective, but as far as GC goes, it was pooling objects into tables that really solved your issue. You can pool objects in many different styles. Preallocation is a good idea, as is stack-allocating what you can.
Centralised memory management doesn't inherently conflict with a decentralised OO design.
If you refactor your object construction into factory functions, you can make the factory responsible for implementing the memory management, and the objects themselves responsible for implementing the memory use. The objects can be ignorant about the handle array. The objects can delegate any memory-related functions to the factory methods.
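A hedged sketch of that split (widget and widget_factory are invented names): the factory owns the backing array and hands out handles; the widget code itself never sees the array.

    #include <cstdint>
    #include <vector>

    struct widget { float x = 0, y = 0; /* memory *use* lives here */ };

    using widget_handle = std::uint32_t;

    // Memory *management* lives here; construction goes through the factory.
    class widget_factory {
        std::vector<widget> slots_;
    public:
        widget_handle create()                 { slots_.emplace_back(); return widget_handle(slots_.size() - 1); }
        widget&       resolve(widget_handle h) { return slots_[h]; }
    };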
> If you refactor your object construction into factory functions
Given this article is about c++, no need for factory patterns or anything like that. You can overload the new operator on a global, and a per type basis and do it that way.
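For instance (a toy sketch, not a production allocator): a per-type operator new that serves every `new Particle` from a fixed pool, with nothing special at the call site.

    #include <cstddef>
    #include <new>

    struct Particle {
        float x, y, vx, vy;

        // Per-type overload: the allocation policy is attached to the type itself.
        static void* operator new(std::size_t) {
            alignas(Particle) static unsigned char pool[sizeof(Particle) * 1024];
            static std::size_t next = 0;
            if (next == 1024) throw std::bad_alloc{};
            return pool + (next++) * sizeof(Particle);
        }
        static void operator delete(void*) noexcept { /* a real pool would recycle the slot */ }
    };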
A potential problem with "operator new" is that you can only have one allocator per type for the entire process. When you explicitly pass a factory, you can have separate factory instances.
Agreed that there's not an inherent conflict, but it seems the author is also arguing against an OO kind of design. In typical OO, all components of a given object are "owned" by that object which provides an interface through which they are accessed. This article seems to propose that the different components of an object are owned and accessed via different systems.
It's an abstraction used to create objects and manage their lifetime. There is no inherent opposition to managing memory in that pattern. Also, it's been done before and it works, so what is the issue specifically (i.e. not some religious nonsense about separation of concerns)?
I use handles quite a bit in some Rust code because you can't have self-referential objects yet, since Rust does not have immovable objects. While it felt like a huge hack in the beginning, I have really warmed up to the idea now.
Using handles instead of proper references is still a huge hack IMO, because you lose all the soundness checks that the compiler can do for references. It's a good thing that immovable objects are coming soon.
You can, but that only solves part of the problem: keeping pointer types from being used as the wrong type. You still have the equivalent of use-after-free to guard against which the lifetime system tracks for you for pointers, but not for handles.
I'm old enough to remember programming for the original Macintosh which used handles and relocatable memory. It was a nightmare. Constantly locking, unlocking handles was a chore and error prone. I also remember early Windows APIs that had opaque handle-based memory allocations. Sometimes in life it's useful to check your history books and understand why some things were consigned to history.
Constantly locking and unlocking handles was the problem. The whole point of having handles was so they could be moved around, and locking them defeated the purpose. People just created their own problems by dereferencing a handle and then wondering why their pointer became invalid and then locking stuff willy-nilly to prevent it.
And sometimes, the reasons themselves are historical and the technique can get a resurgence.
Specifically, this can easily be used to create a narrative for many functional idioms. The cost of memory and compute has been a large part of the ride in many modern tricks.
I've done this when I've had heavily interlinked structures. I had a 3D physics engine with elaborate data structures for all the polyhedra objects. Vertices link to edges, edges link to faces, faces link to edges and vertices, and the collision detection system follows those links to compute the closest points on two nearby objects efficiently.
All those data items were structures in arrays, part of the "Polyhedron" object. All interconnections between them were array indices. The Polyhedron object owned everything for that polyhedron. Delete that and all the parts went away.
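Roughly this shape, as a hedged sketch (invented field names, not the actual engine code): every cross-reference is an index into an array owned by the Polyhedron, so destroying the Polyhedron frees the whole mesh at once.

    #include <cstdint>
    #include <vector>

    struct Vertex { float x, y, z; std::uint32_t firstEdge; };   // index into edges
    struct Edge   { std::uint32_t v0, v1, face; };               // indices into vertices/faces
    struct Face   { std::uint32_t firstEdge; };                  // index into edges

    struct Polyhedron {
        std::vector<Vertex> vertices;   // the Polyhedron owns every part;
        std::vector<Edge>   edges;      // interconnections are array indices,
        std::vector<Face>   faces;      // so deleting it releases everything
    };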
Given the fixed amount of memory available back then, handles were the only way to avoid fragmentation, although you had to be careful if you required locking them temporarily. I even wrote a tech note on this when I was at Apple back then. How far we have come since then.
This was my first thought too. For extra confusion, Mac handles were pointers to pointers managed by the OS while on Windows, a "handle" just meant a pointer to a struct.
Originally, a handle was an index in a table of allocated memory blocks; the idea was to be able to move these blocks around without affecting the program.
It worked out fine for 16 bit systems and the like, just as it did on the Mac. As machines became more powerful, the need for this technique disappeared.
yes this was my immediate thought after reading about handles with upper bits used for meta-data .... right before a shudder passed down my spine .... it brought back all those early bad mac memories
Early versions of Mac OS ran on the 68000 and 68010, which had a 24-bit address bus. The upper 8 bits of the address were completely ignored, so some parts of the Memory Manager (documented in my link above!) would use those bits to store flags for handles, like "locked", "purgeable", and "resource". Seeing that Apple used these bits, some third-party developers followed suit in their own applications (or accessed those bits directly as a questionable optimization).
When Apple released computers which used 68030 processors, which had a 32-bit address bus, this trick stopped working. There was an awkward period where "32-bit addressing" was an option which could be turned on and off in the Memory control panel for compatibility with older software (and hardware!) which wasn't yet "32-bit clean".
Sure, they used upper bits of addresses that the hardware of the day ignored; pretty predictably, the very next hardware release (the 68020) supported 32-bit addresses, potentially breaking just about every Mac binary (initially they faked it out with a PAL in the address path, and eventually faked it out in the MMU, resulting in a bit of TLB thrashing).
This was definitely a case of someone not planning to be successful and not looking far enough into the future
Are we seeing the emergence of a generic software best practice of centralised state management across the industry? This table-based storage pattern is now common to:
- functional frontend apps via redux
- classical apps and server backends via relational databases
- game engines like the one mentioned here, and Unity's Entity Component System
- probably more systems I don't know about
It seems to me like separation of state and behaviours is one of those fundamental rules that should apply to most large software applications.
> However, the underlying design philosophy doesn’t fit very well into a classical OOP world
These problems indeed belong to a non-OOP view of programming. In OOP, the behavior is concentrated in objects while here a significant part of the behavior is concentrated in references.
I have been working on one possible generic solution which is called concept-oriented programming [1] and where references are first-class elements of the program where a great deal of the program behavior is performed during object access using these references.
One of the main ideas is that each object class is accompanied by a reference class. Such a pair is referred to as a concept (hence the name of the model). If the reference class is empty then we get normal OOP. If the object class is empty then we can model normal values (which are passed by copy). Of course, we can build a hierarchy by inheriting the behavior of references and objects (which is different from classical inheritance but generalizes it in the case where we have only object classes).
COP is still at an early stage and cannot be used yet because there are other quite serious unsolved problems. But I see that the problems described in this post are not exceptions - describing how objects are represented and how they are accessed is an important piece of custom functionality in any complex system, and it is precisely what COP tries to formalize (along with some other tricky problems).
I've always wondered what the performance impact would be on 64 bit, assuming 32 bit indexes: on the one hand, half-size for most pointers, on the other hand extra indirection and the size of the table.
> Most of the following blog post is written from a game developer’s perspective, but should also apply to other areas ...
Video games are a special case where allocations and deallocations are often well scheduled and predictable. For the more general case, using a data type like ivector[1] for the object arrays would provide the desired memory safety. It gives you the flexibility of safely accessing an object via index, iterator or (zero-overhead smart) pointer. Even though it's a vector (like std::vector) its iterators behave like list iterators. That is, after an insertion or removal operation, unlike indexes, they continue to point to the same item (not necessarily the same position) and they only become invalid when the item they point to is removed. In particular, they don't become invalid after a resize/reallocation operation.
So using ivector's iterators instead of index handles would allow you to safely perform insertion/removal operations if necessary.
Also, for performance reasons you may sometimes want to temporarily obtain a direct pointer to the item. ivector allows you to obtain a zero-overhead smart pointer to the item that is guaranteed not to become invalid. [2] It does this by disabling "resize" operations on the vector while it exists.
The next step is to use the old structure called gap buffer, piece table or the more general zipper... And drop the pretense you are inventing something new. :)
Pretense not intended, thanks for clarifying. :) Rather than algorithmic novelty, the point is more that ivector has an API and implementation that is memory safe (memory safety seemed to be an emphasis of the blog post).
As for those next steps, I'm under the vague impression that they aren't as popular these days because larger cpu caches have greatly reduced the cost of moving contiguous data in most real-world scenarios. So that more often than not the cache coherency benefits of simple contiguous data outweigh the avoidance of shifting some of that (contiguous) data.[1]
Exactly why a gap buffer, piece table and zipper are a thing. These structures are optimized exactly for local modification, especially chunked or contiguous. These are semi-contiguous structures.
At some point with bigger data you will hit the brick wall where the memory move is too expensive. This is typically at a bunch of megabytes in size for localized block inserts. The idea of these only slightly more advanced structures is to amortize the cost or defer it.
For instance, a gap buffer is essentially 3 vectors/arrays. Piece table is a few more, merged as needed, usually few. A zipper can mix contiguous (potentially heap) access with tree structuring for rarely modified or accessed parts.
As for iterator stability: either the structure knows about active iterators (expensive) and fixes them up, or you should probably use plain old pointers as such.
Bet you didn't know you can and should use a pointer as an iterator into contiguous memory... And if you really, for some reason, do need to hold stable iterators to anything, std::array is good enough. Remember you cannot safely resize due to the potential for memory copies on reallocation. (Yes, there are tricks; no, they're not cheap, threadsafe, or concurrent.)
The consistency model of ivector iterators is not quite explicit.
From what I can tell, while this memory-management strategy can stop a program from seg-faulting or from executing remote code, the style of programming is the same as with `malloc` and `free`. You still have to `alloc_texture()` and `free_texture()` in your code even if those functions return a handle instead of a raw pointer.
Some observations about the strategy based on the article:
- It does *not* prevent memory leaks. It uses fixed-size arrays, so once you allocate more memory than there is room in the array, the program has to exit.
- It does not prevent use-after-free. It just makes it easier to detect when this has happened. This could make debugging easier in some situations. I do wonder, though, how it interacts with code analyzers.
Also this could work in any language. I wonder if it's common in game development for Android, because of Java and garbage collection spikes.
Oh, and another observation: I think this strategy is fully compatible with RAII and smart pointers. Smart pointers are implemented using malloc/free, and this strategy is isomorphic to malloc/free, so combining them should work fine.
I started using Rust recently and I noticed my programs tend to use the techniques described in this article. Using an "object" pool that is responsible for the creation / management / destruction of values and their referencing is a good way to reason about lifetimes when you need to manage DAGs or graphs (where reference ownership and lifecycles are hard to reason about).
Centralized resource management, public index handles... So basically OpenGL? (They're called "object names" rather than "index handles" in OpenGL, but the principle seems to be the same.)
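As a concrete illustration (a minimal sketch; assumes a loader such as GLAD has set up the GL function prototypes): the GLuint "name" is an opaque handle owned by the driver, never a pointer you dereference.

    #include <glad/glad.h>   // assumption: GLAD (or similar) provides the GL prototypes

    GLuint makeVertexBuffer(const float* data, GLsizeiptr bytes) {
        GLuint vbo = 0;
        glGenBuffers(1, &vbo);                                      // the driver allocates and returns a "name"
        glBindBuffer(GL_ARRAY_BUFFER, vbo);                         // all access goes through the handle
        glBufferData(GL_ARRAY_BUFFER, bytes, data, GL_STATIC_DRAW); // storage managed centrally by the driver
        return vbo;                                                 // pass the name around like an index
    }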
I had envisaged my 'manager' variable above as type widget_manager. Having a global manager of all things, or independent managers, makes no difference to my argument.
The manager has to live in file-scope or above since there is no room in the handle to save a reference to it. At that point, you're creating a singleton and reducing the flexibility of the module. You now have all the usual issues with globals and singletons.
Unless of course you're passing around structures containing a handle and pointer to its manager...
There is nothing wrong with globals and singletons. They are very good patterns. Some areas of code deal with global state, and there is no sense in pretending it is not so. On the other hand, some other areas definitely need a closure of their own (like a workload dumped to a worker thread).
The usual issues with globals pop up when people want to take shortcuts and never implement the separation of data from global to local when they should, think globals can be used to replace a database-like system, or - the horrors - use some multi-level stack-based system to push and pop global state as they merrily progress along the call stack, i.e. run computations while using the global state as some sort of auxiliary storage unit.
Generally it is better to have a context with a well defined lifetime rather than the other two options.
Additionally, thread safety puts very stringent requirements on global state, further making singletons and language-level globals worthless with few exceptions.
I think OO would be way faster if, instead of keeping all of an object's properties together in memory and dealing with one object at a time, you had an object collection as your base type (you rarely have just one of anything in your program). It would have an array for each object property, and you would never access, change, or pass a whole object, just specific properties of specific objects. This should be very cache friendly.
If you just want to inherit from an object collection you already have, to get a new object collection that behaves slightly differently, then it should work exactly the same way as usual.
If you wanted to have different types of objects in one collection, first you'd have to ask yourself if you really need to keep them together. But if so, I suppose you'd put the base fields in one collection and the new additional fields in another. Then a handle to a derived object would consist of two indexes: one into the base collection and the other into the collection of additional fields. Methods that access base fields would use the first index, and methods that access the additional fields would use the second.
Not sure if I'm being clear enough. Imagine you want to store a collection of various subtyped objects in a database; this is pretty much the same problem. You keep the base fields in one table and the subtype-specific fields in other tables (one per subtype). If you want to operate just on base fields you use the base table, but if you want to operate on fields specific to a subtype you join the base table with the table that contains that subtype's fields.
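Something like this, as a rough sketch (types invented for illustration): base fields in one collection, subtype fields in another, and a "derived" handle is just a pair of indexes.

    #include <cstdint>
    #include <vector>

    struct BaseFields  { float x, y; };            // shared by every object
    struct EnemyFields { int hp; float aggro; };   // only for the "enemy" subtype

    struct World {
        std::vector<BaseFields>  base;    // the "base table"
        std::vector<EnemyFields> enemy;   // one extra "table" per subtype
    };

    // Handle to a derived object: one index per table it participates in,
    // much like joining a base table with a subtype table in a database.
    struct EnemyHandle { std::uint32_t base, enemy; };

    float& positionX(World& w, EnemyHandle h) { return w.base[h.base].x; }
    int&   hitPoints(World& w, EnemyHandle h) { return w.enemy[h.enemy].hp; }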
The method of filling spare bits with random values to guard against stale object access seems like it would probably work well in practice, but is there a way to make it 100% air-tight and guarantee that stale object access will always be prevented? Otherwise it seems like that could end up as the basis of a security vulnerability in a program that uses this pattern.
Use a lot of bits, increment upon re-use of the slot, and throw when the pattern reaches its maximum.
Alternatively, give each item a unique handle (again counting up) and implement some sort of lookup table to convert those handles into actual indexes.
If you don't mind the handle being rather large, make the sequence number 64 bits, and store it alongside the index. 2^64 nanoseconds in years = 585... computers just aren't fast enough to wrap it around in any reasonable time frame.
Depending on how many index bits you want, you might be good with only 64 bits in total. 2^56 ns in years = 2.3; 2^52 ns in days = 52; 2^48 ns in days = 3.3; and so on.
All of these figures are worst case scenarios, anyway. Your program probably simply can't actually be made to allocate objects at that rate.
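A sketch of the counting variant (names invented; use a wider generation field if you want the figures above): pack a generation number next to the index, bump it whenever a slot is freed, and refuse to resolve handles whose generation no longer matches.

    #include <cstdint>
    #include <vector>

    struct Handle { std::uint32_t index; std::uint32_t gen; };

    template <typename T>
    class Pool {
        struct Slot { T value{}; std::uint32_t gen = 0; bool alive = false; };
        std::vector<Slot> slots_;
    public:
        Handle create() {
            for (std::uint32_t i = 0; i < slots_.size(); ++i)
                if (!slots_[i].alive) { slots_[i].alive = true; return {i, slots_[i].gen}; }
            slots_.emplace_back();
            slots_.back().alive = true;
            return {std::uint32_t(slots_.size() - 1), 0};
        }
        void destroy(Handle h) {
            slots_[h.index].alive = false;
            ++slots_[h.index].gen;            // any outstanding handle now fails the check
        }
        T* resolve(Handle h) {                // nullptr instead of a silent use-after-free
            Slot& s = slots_[h.index];
            return (s.alive && s.gen == h.gen) ? &s.value : nullptr;
        }
    };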
It's only a security vulnerability if there is a bug already present in the code. This mechanism is not introducing the vulnerability. In fact, it's helping find out the bug during development. At release runtime, the spare bits should always match.
I mean that if an object is deleted and its space in the array is reused for a new object, there's a small but non-zero chance that the spare bits will be identical, in which case the code would gain access to the new object when it is not supposed to. Or am I misunderstanding how the handle concept works?
Yeah, I think you are misunderstanding something here.
Yes, there is a non-zero chance that spare bits will be identical. But that's only relevant, if you have a bug in your code wherein you are trying to use a destroyed handle. If you are never using a destroyed handle, then the spare bits being identical is completely okay.
Another way to think of this is that the spare bits are optional. The interface is good even if there were no spare bits. The spare bits only add an additional probabilistic security check.
The fact that it's probabilistic is what makes it unreliable as a real security check. That's how I understood the original poster's concern. It might also lend a false sense of security, like most other solutions that don't actually solve the full problem.
> A good example for this ‘system ___domain knowledge’ is the destruction of rendering resource objects with modern 3D APIs: a resource object can not simply be destroyed immediately when the user code says so because the resource might still be referenced in a command list waiting to be consumed by the GPU. Instead the rendering system would only mark the resource object for destruction when the user code requests its destruction, but the actual destruction happens at a later time when the GPU no longer uses the resource.
This is exactly the same behavior you get with shared_ptr: both the user code AND the rendering system would hold a reference to the object, and it would only be deallocated once BOTH of them destroy their shared_ptr. weak_ptr allows even more fine-grained control.
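For what it's worth, a minimal sketch of that shared ownership (the types are invented): the texture survives until both the caller and the pending command list drop their references.

    #include <memory>
    #include <vector>

    struct Texture { /* wraps a GPU resource */ };

    struct CommandList {
        std::vector<std::shared_ptr<Texture>> referenced;   // keeps resources alive until the GPU is done
    };

    int main() {
        auto tex = std::make_shared<Texture>();
        CommandList pending;
        pending.referenced.push_back(tex);   // the rendering system holds one reference

        tex.reset();                         // user code "destroys" its handle; the object survives
        pending.referenced.clear();          // GPU finished: last reference dropped, now it's freed
    }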
You still need to manage your resource lifetime somehow, with either some equivalent of unique_ptr or shared_ptr. Otherwise you've just reinvented global variables. How else would you know when to free an object from your object pool? A "least-used" pattern could be useful in some scenarios but then you also need to be able to reconstruct the object again in case someone asks for it after a long time of being idle (like swapping to disk).
I am actually trying to implement generic pointers in Rust, that could safely work with object pools and/or preallocated memory regions of arbitrary bitness.
Deterministic performance hasn't been a thing since the early 90s. I'll settle for just performance.
But sure, there are cases where you want the level of predictability of reference counting, in which case I'd rather take the performance penalty than spend time and pollute the codebase doing a lot of tricks to arrange memory as a compacting garbage collector would.
Edit: This is a good article on the approach. http://cowboyprogramming.com/2007/01/05/evolve-your-heirachy...