
There's a simple trick the real-time and high-assurance communities use to catch stuff like this: enforce a hard limit on the time each task may take to complete. Some tasks you might give extra leeway so you don't end up DoS'ing your own users. Many (most?) things have a sane, hard limit that can be applied along with a log entry or a notification to ops. That keeps one user or task from DoS'ing the whole system, or forces it to fail fast and noticeably.

Not a lot of developers use this trick anymore, but it's quite powerful and should get more consideration. Not to mention the real solution: choosing tools or algorithms with predictable, reliable behavior. Those definitely exist for the task they're performing, as other commenters noted.

EDIT to add a modern example from the high-assurance field that shows how awesome it can be (esp. see the GitHub link):

https://leepike.github.io/Copilot/




For the specific example, this is how I dealt with a situation in one service I wrote that takes an arbitrary user-controlled regex (for search).

The program was written in Python, and as it turns out the GIL prevents one thread from interrupting another thread that's in the middle of a re.match() call: the entire regex engine is in C, so the GIL is held for the entire call.

My solution was to use SIGALRM to set a timer before calling, and have the signal handler raise an exception. This causes the regex call to raise that exception, which I then catch and display an appropriate error.


You can only safely abort a "task" if the task has specifically been designed that way, or if the task is a process.
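For the "task is a process" case, here's a rough sketch of running a regex search in a child interpreter that gets killed if it runs too long. The function name search_in_subprocess and the tiny child program are hypothetical, and spawning an interpreter per search is expensive, so treat this as an illustration of the idea, not a tuned design.

```python
import subprocess
import sys

# Child program: pattern and text arrive via argv; prints 1 on a match, else 0.
CHILD_CODE = (
    "import re, sys\n"
    "print(1 if re.search(sys.argv[1], sys.argv[2]) else 0)\n"
)


def search_in_subprocess(pattern, text, limit_seconds=1.0):
    """Return True/False for a match, or None if the child hit the time limit."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", CHILD_CODE, pattern, text],
            capture_output=True,
            text=True,
            timeout=limit_seconds,  # run() kills the child when this expires
        )
    except subprocess.TimeoutExpired:
        return None
    if result.returncode != 0:
        # For example, the pattern failed to compile in the child.
        raise ValueError(result.stderr.strip())
    return result.stdout.strip() == "1"
```

Because the operating system tears the child down, nothing in the parent is left half-updated, which is what makes the abort safe.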


Which is of course part of his point. Anything that processes arbitrary user input should be designed to be abortable in some way. As this particular case shows, even attempts to sanitize input can be vectors for a DoS against them.


It's why you use isolation techniques outside that code as the fallback. The simplest forms I saw involved while loops counting downward and threads that got interrupted to assess things.
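The counting-down loop can be sketched like this. The Collatz iteration is only a stand-in for any loop whose iteration count is hard to predict from the input, and BudgetExceeded and collatz_steps are names made up for the example.

```python
class BudgetExceeded(Exception):
    """Hypothetical exception raised when the step budget runs out."""


def collatz_steps(n, budget=1000):
    """Count Collatz steps from n down to 1, giving up after `budget` steps."""
    remaining = budget
    steps = 0
    while n != 1:
        if remaining == 0:
            # Fail fast and noticeably instead of spinning forever.
            raise BudgetExceeded(f"gave up after {budget} steps")
        remaining -= 1
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps
```

A call such as collatz_steps(0) would never terminate without the budget; with it, the caller gets an exception it can log and handle.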


Everything processes user input. Whether a numeric user id or some text.


That's going a bit far. User input is usually handled by a small percentage of modules. Those should guard against malicious or faulty input where possible.

That's not even the issue here, though. The issue is that the input is processed via an algorithm that is easy to hang and/or a format that's hard to parse. On top of that, there was no isolation or monitor to catch the fault. Each of these can be done differently (and is, in many programs) to avoid or better mitigate such risks. Developers often don't.



