
[flagged]



We have 50 years of experience with real code telling us that, no, programmers are not consistently capable of avoiding memory safety bugs just by being careful. Saying that it's just a failing of lesser programmers is the height of arrogance, since I guarantee you that you've written memory safety vulnerabilities if you've written any significant amount of C code.

The problem is not that the rules are hard to follow--they're actually quite easy! The problem is that, to follow the rules, you need to set up certain invariants, and in any appreciably large codebase, remembering all of the invariants is challenging. As an example, here's a recent memory safety vulnerability I accidentally created:

  int map_value(map_t *map, void *key) {
    // This returns a pointer to the internals of map, so it's invalidated
    // any time map is changed. Therefore, don't change the map while this
    // value is live.
    int *value = add_to_map(map, key, default_value());

    // ... 300 lines of code later...
    // oops, need to recurse, and the recursive call adds to map,
    // which invalidates value...
    int inner = map_value(map, f(key));

    // ... 300 lines of code later...
    // Hi, this is now a use-after-free!
    *value = 5;
    return *value;
  }
It's not that I'm too stupid to figure out how to avoid use-after-frees, it's that in the course of refactoring, I broke an invariant I forgot I needed. And the advantage of a language like Rust is that it bops me on the head when I do this.


The solution to that is to either:

1) If you're gonna return pointers, don't use an allocation scheme that invalidates pointers in the implementation of map_t.

2) If you want to reallocate and move memory, don't return pointers, return abstracted handles/indices.

Both of those are completely possible and not particularly unergonomic to do in C. If you need to enforce invariants, then enforce them, C does have enough abstraction level for that.
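
For illustration, option 2 might look roughly like the sketch below. It is a minimal example, not anyone's actual map_t: the names `int_pool`, `handle_t`, `pool_add`, `pool_get`, `pool_set`, and `handle_demo` are all hypothetical, and the store is a plain growable array rather than a real map. Callers keep an index and the store re-resolves it on every access, so a `realloc` that moves the storage cannot leave them with a dangling pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable store of ints. Callers hold indices, never pointers. */
typedef size_t handle_t;
#define INVALID_HANDLE ((handle_t)-1)

typedef struct {
    int *items;
    size_t len;
    size_t cap;
} int_pool;

/* Appends a value and returns its handle, or INVALID_HANDLE if allocation fails. */
static handle_t pool_add(int_pool *p, int value) {
    if (p->len == p->cap) {
        size_t new_cap = p->cap ? p->cap * 2 : 4;
        int *grown = realloc(p->items, new_cap * sizeof *grown);
        if (!grown) return INVALID_HANDLE;
        p->items = grown;  /* storage may have moved; existing handles are unaffected */
        p->cap = new_cap;
    }
    p->items[p->len] = value;
    return p->len++;
}

/* Reads by value, so no pointer into the pool ever escapes. */
static int pool_get(const int_pool *p, handle_t h) {
    assert(h < p->len);
    return p->items[h];
}

static void pool_set(int_pool *p, handle_t h, int value) {
    assert(h < p->len);
    p->items[h] = value;
}

static void pool_free(int_pool *p) {
    free(p->items);
    p->items = NULL;
    p->len = p->cap = 0;
}

/* Mirrors the bug above: take a handle, grow the store, then write through the handle. */
int handle_demo(void) {
    int_pool pool = {0};
    handle_t first = pool_add(&pool, 1);
    assert(first != INVALID_HANDLE);
    for (int i = 0; i < 1000; i++) {
        handle_t h = pool_add(&pool, i);  /* forces several reallocations */
        assert(h != INVALID_HANDLE);
    }
    pool_set(&pool, first, 5);            /* still valid after the storage moved */
    int result = pool_get(&pool, first);
    pool_free(&pool);
    return result;
}
```

The trade-off in this sketch is that a handle only guards against moved storage. It says nothing about an element that was removed and whose slot was reused, which would need something extra such as a generation counter.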


Handles for the win, I never understood why we don't use handles for all dynamic memory.


We have more than 50 years of experience in this, I've been coding for 50 years myself, and there was a huge industry when I started.

I still follow KISS, and if you're writing 600+ line functions, it is little surprise you forget things in the tightly written logic of code.


Is your function body really 600+ LOC?? If so, then I think I might have found your problem...


I once worked on a commercial product, a mix of C++ calling our own C libraries, that had 4000+ LOC in a single case statement. This was one of my first jobs out of school so I was shocked.


My experience reading lower-level languages suggests to me that this is not only possible, but that the higher the quality of the project (in scope, authors, products), the more of those you find.

When people focus on the _right_ implementation for _their_ problem they often find out that best practices like DRY or abstractions don't scale well at all.


As someone who has been designing, writing, and operating high-performance systems for decades, I can guarantee you that it does not boil down to "laziness".

Everyone starts with the best of intentions. malloc() and free() pairs. Then inevitable complexity comes in - the function gets split to multiple, then across modules, and maybe even across systems/services (for other shareable resources).

The mental overhead of making sure every release happens keeps growing. It _is_ hard in anything beyond a trivial implementation, and that's most definitely not a lie.

Surprisingly "just design systems that manage their memory correctly", as you said, is a very legitimate solution. It just so happens that those systems need good language support, to offload a good chunk of the complexity from the programmer's brain to the machine.


No, it is laziness, at the system architecture level. I've been doing this for decades too, in major corporations, writing the big name services that millions to billions of people use. The system architects are lazy, they do not want to do the accounting - that is all it is, just accounting of the resources one has and their current states, integrating that accounting tracking system into the environment - but few to none do, because it creates hard accountability, which they do not want. A soup of complexity is better for them, it grows their staff.

I've been playing this game long enough to see the fatal flaws built in, which grows complexity, staff, and the magnitude of the failures.


> I've been playing this game long enough to see the fatal flaws built in, which grows complexity, staff, and the magnitude of the failures.

René Descartes, a brilliant philosopher and scientist, always used to say that if he thinks A and the rest of the world thinks B, the wrong one is probably him. And he was one of the most brilliant humans we've ever produced!

Mind you, don't misread that as "you should act like a sheep" but in the opposite sense, that even if you're 100% certain others are wrong, you should still have the mental acumen to understand that probability isn't stacked in your favour.

People aren't burning their professional (or even hobby) time for years or decades working on tooling and languages that help in memory management just to help with others' skill issues.

They find it an issue that needs better solutions, and that's it in C.


Why do you use pointers and a high-level language like C when you could just write assembly and load and unload all your instructions and data into registers directly? Why do you need functions? You could just use nothing but JMP instructions. There's a whole lot of stuff that C handles for you that is completely unnecessary if you really understood assembly and paid attention to what you're doing.


I'm from the era that when I was taught Assembly, half way through the class we'd written vi (the editor), and when finishing that one semester we had a working C compiler. When I write C, I drop into Assembly often, and tend to consider C a macro language over Assembly. It's not, but when you really understand, it is.


And Rust developers drop down and manage memory directly when they need to and even inline assembly, sometimes.


Yes we know, all the world's problems can be solved with a rewrite in Rust.

André Malraux was right: "The 21st century will be religious or it will not be".

He just got the definition of religion wrong.


I guess you are "the one". This means you won't fail this stuff and this discussion is not for you, it is for the rest of us who would.

https://rachelbythebay.com/w/2018/04/28/meta/


I agree with the grandparent mostly because the article doesn't have any real world applications.

Forgetting to free memory that is allocated and then used inside of a function is the rarest kind of memory management bug that I have run into in large code bases. It's frequently obvious if you read the function and are following good practices by making the code clean and easy to read / follow.

The ones that bite are typically a pointer embedded in some object in the middle of a complicated data structure that persists after a function returns. Reference counting may or may not be involved. It may be a cache of some sort to reduce CPU overhead from recomputing some expensive operation. It's rarely a chunk of memory that has just been allocated. To actually recover the lost memory in those cases is going to need something more complicated like garbage collection.

But garbage collection is really hard to retrofit into C when libraries are involved as who knows what kind of pointer manipulation madness exists inside other people's code.

What would be really interesting is if someone made a C compiler that replaced pointers with fat pointers that could be used to track references and ensure they are valid before dereferencing. Sure, it would be an ABI bump equivalent to implementing a new architecture along with plenty of fixups in legacy C code, but we've done that before. The security pendulum has swung over to the point that rebuilding the world would be considered worthwhile as compared to where we stood 10-15 years ago. It'd certainly be a lot of work to get that working compared to a simple hack per the Fine Article, but it would have real value.
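
To make the idea concrete, here is a library-level approximation of what such a compiler might emit. It is only a sketch under loud assumptions: a fixed table of 64 slots, byte-sized loads and stores, and hypothetical names throughout (`fat_ptr`, `fat_alloc`, `fat_load`, `fat_store`, `fat_offset`, `fat_free`). A real implementation would live in the compiler and ABI, not in user code. Each fat pointer carries a slot index, the generation it was created under, and an offset, and every access checks both liveness and bounds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_BLOCKS 64  /* assumption: small fixed table, for illustration only */

typedef struct {
    unsigned char *data;  /* NULL while the slot is free */
    size_t size;          /* payload size in bytes */
    uint32_t generation;  /* bumped on every free; wraparound is ignored here */
} block_slot;

static block_slot slots[MAX_BLOCKS];

typedef struct {
    uint32_t slot;        /* which table entry this pointer refers to */
    uint32_t generation;  /* generation of the slot when the pointer was made */
    size_t offset;        /* byte offset into the block */
} fat_ptr;

/* Allocates a zeroed block and writes a fat pointer to its start into *out. */
bool fat_alloc(size_t size, fat_ptr *out) {
    if (size == 0) return false;
    for (uint32_t i = 0; i < MAX_BLOCKS; i++) {
        if (slots[i].data == NULL) {
            unsigned char *data = calloc(1, size);
            if (!data) return false;
            slots[i].data = data;
            slots[i].size = size;
            out->slot = i;
            out->generation = slots[i].generation;
            out->offset = 0;
            return true;
        }
    }
    return false;  /* table full */
}

/* Returns the slot if p is live and [offset, offset + len) is in bounds, else NULL. */
static block_slot *fat_resolve(fat_ptr p, size_t len) {
    if (p.slot >= MAX_BLOCKS) return NULL;
    block_slot *s = &slots[p.slot];
    if (s->data == NULL || s->generation != p.generation) return NULL;  /* freed or reused */
    if (p.offset > s->size || len > s->size - p.offset) return NULL;   /* out of bounds */
    return s;
}

/* Pointer arithmetic: moves the offset, saturating instead of wrapping. */
fat_ptr fat_offset(fat_ptr p, size_t delta) {
    p.offset = (delta > SIZE_MAX - p.offset) ? SIZE_MAX : p.offset + delta;
    return p;
}

bool fat_store(fat_ptr p, unsigned char byte) {
    block_slot *s = fat_resolve(p, 1);
    if (!s) return false;
    s->data[p.offset] = byte;
    return true;
}

bool fat_load(fat_ptr p, unsigned char *out) {
    block_slot *s = fat_resolve(p, 1);
    if (!s) return false;
    *out = s->data[p.offset];
    return true;
}

/* Frees the block; a stale pointer or double free is detected and ignored. */
void fat_free(fat_ptr p) {
    block_slot *s = fat_resolve(p, 0);
    if (!s) return;
    free(s->data);
    s->data = NULL;
    s->generation++;  /* every outstanding fat_ptr to this slot is now stale */
}
```

In this sketch a use-after-free turns into a `false` return from `fat_load` or `fat_store` rather than undefined behavior, even when the slot has been handed out again. The cost is a table lookup and two comparisons on every access, which is the overhead such a compiler would have to justify.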


Yep, okay, I mostly agree with you on this.


I'm advocating not to wing it, design up front and then follow that design. When the design is found lacking, redesign with the entire system in mind. Basically, all I'm saying is do not take short cuts, they are not short cuts. Your project may finish faster, but you are harming yourself as a developer.


I had initially written a very snarky comment here, but this one [1] expresses my view quite well and respectfully, so I would answer with that. I guess the discussion can continue there as well.

[1] https://news.ycombinator.com/item?id=43387334#43388305


You can have situations like

https://github.com/bsenftner/kvs/blob/master/kvs/kvs.cpp#L72...

where it's then not clear why there is no call to

`kv.m_binarySize = byte_size`

after calling

`kv.mp_binaryData = (uint8_t*)malloc( sizeof(uint8_t) * byte_size );`

because it seems like `kv.mp_binaryData` will now have a different size than it had before. That is, there will be a mismatch. Though it should not affect the `free` call.

I hope I'm missing something because I just dealt with the code for around 5 minutes.


That m_binarySize should have been removed; nothing referenced it beyond its own code. I knew it did not affect the free() call and left it. That entire KVS lib is a fine example of KISS: it's so small I can hold it in my head, and issues like that m_binarySize field are just left because they end up being nops.


Yes, and the general trend of falling traffic fatalities is because people are driving better, right? Nobody's perfect, most people are far from perfect, and if it's possible to automate things that let you do better, we should do that.


Beware of automation that negates understanding. At some point, changes or maintenance requirements will need to revisit the situation. If it is wrapped in some time-consuming complexity, it will just be thrown out.


There's no "market propaganda", but 50+ years of evidence of memory-related bugs and vulnerabilities telling us that if we can do better in terms of tools and programming languages, we should try.


Performance art?


[flagged]


I downvoted because in my mind you are winging it. "Just give it back" works well for simple cases, I suppose.

We observe that engineering teams struggle to write correct code without tools helping them. This is just an unavoidable fact. Even with tools that are unsound we still see oodles of memory safety bugs. This is true for small projects run by individuals up to massive projects with hundreds or thousands of developers. There are few activities as humbling as taking a project and throwing the sanitizers at it.

And bugs aren't "well you called malloc at the top of the function and forgot to call free at the bottom." Real systems have lifetime management that is vastly more complex than this and it is just not the case that telling people to not suck mitigates bugs.


I'm advocating to design, and then follow the design, and when the design is found lacking, redesign to include the new understanding. This career of writing software is all about understanding, and automating that understanding. Due to market pressures, many companies try to make do with developers who take shortcuts, and these shortcut takers are the majority of developers today, skewing the intellectual foundations of the entire industry. Taking a shortcut means short-sheeting one's understanding of what is actually occurring in that situation. These shortcuts are lazy non-understandings, and that harms the project and its architecture, and increases the cognitive load on maintenance. It's creating problems for others and bailing, hoping you're not trapped maintaining the complex mess.


And I'm telling you that designing an application with a coherent memory management plan still leads to teams producing errors and bugs that are effectively prevented with sound tools. Soundness is not a shortcut.


You can consider whatever you want. That doesn't make it accurate. The reason I downvoted was for unnecessary inflammatory language. Your point would have been better without it (more likely to be heard by the people you claim to be talking to, at a minimum).

If you're actually trying to talk to people, if you're not just here to say "I'm smart and you're stupid" to gratify your ego, then why talk in a way that makes other people less likely to listen?


You are correct, and I seriously need to work on my language usage.


Well, see, I never give in to the temptation of using inflammatory language. Never... um, never today... um, so far... I think...

We've all been there. (Well, maybe dang hasn't. Most of the rest of us have, though.)



