
Calls like execvp() do little more than splitting PATH on ':', followed by repeatedly invoking execve() on ${dir}/${filename}. The fewer elements you have in PATH, the fewer execve() calls need to be performed in the worst case.
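As a rough illustration of that description (not libc's actual implementation), here is a minimal Python sketch. The names candidate_paths and execvp_sketch are made up for this example. It simplifies in several ways: real execvp() skips the PATH search when the filename contains a slash, falls back to a default path when PATH is unset, and treats some errors specially, none of which is modelled here.

```python
import os

def candidate_paths(filename, path):
    """Return the ${dir}/${filename} candidates in PATH order."""
    # An empty PATH element conventionally means the current directory.
    return [os.path.join(d or ".", filename) for d in path.split(":")]

def execvp_sketch(filename, argv, env):
    """Try execve() on each candidate in turn; raise if none could be executed."""
    last_error = FileNotFoundError(filename)
    for candidate in candidate_paths(filename, env.get("PATH", "")):
        try:
            os.execve(candidate, argv, env)  # does not return on success
        except OSError as err:
            # Simplification: every failure just moves on to the next directory.
            last_error = err
    raise last_error
```

Each directory that does not contain the command costs one failed execve(), which is why a shorter PATH means fewer calls in the worst case.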



Is that ever going to be a hot path?


It's probably not exactly going to be hot, but even a failing execve is inherently semi-expensive, since it is a syscall and incurs context switches.

It's just outweighed by a couple of orders of magnitude by all the overhead that comes with successfully launching another executable, unless you have, like, a thousand junk paths in your PATH.


Theoretically it can be. Every command you invoke without a path requires a PATH lookup.

In practice, well-behaved shells cache the results of PATH lookups to speed things up.


Sounds like they need to be fixed for inefficient handling of a simple operation.


The fix is for the user to use a smaller $PATH when possible. Any method of checking that the command exists and is executable before trying to execute it leads to TOCTOU race conditions.

https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use


I’m assuming you are proposing to stat each candidate before trying to execve it. I’m also assuming that a stat system call is roughly as expensive as an execve of a nonexistent or non-executable path.

For every failed candidate, you are doing one system call, so roughly the same cost each way.

Now if you just do an execve, you’re just paying that cost. If you stat first, you pay the cost of another system call that doesn’t change the flow of your program at all (a nice way of saying you’re wasting time).

Unless stat is dramatically faster than exec on a nonexistent or non-executable path, there’s never a case where this is better.
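To make the counting concrete, here is a small sketch under the same assumption as above (a stat costs about as much as a failed execve). The function names are hypothetical, and "misses" means the number of PATH directories tried before the one that actually holds the command.

```python
def syscalls_exec_only(misses):
    # One failed execve() per miss, plus the execve() that succeeds.
    return misses + 1

def syscalls_stat_first(misses):
    # One stat() per miss, one stat() on the hit, then a single execve().
    return misses + 2

for misses in (0, 3, 10):
    print(misses, syscalls_exec_only(misses), syscalls_stat_first(misses))
```

Under this model the stat-first variant is always exactly one system call behind, no matter how long PATH is.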


Context switches could straightforwardly be saved by doing the PATH splitting and lookup in-kernel, or just providing a list of executable paths to check.

It didn't work out this way historically: doing unnecessary string processing in the kernel, and requiring extra memory for it, could have been more expensive than the context switches. The performance impact of a failed execve isn't normally a high priority, and there are other reasons not to want this stuff in the kernel (not that it stops frankly less critical stuff from getting in). Still, there's definitely low-hanging fruit here if it, like, mattered.


Enlighten me how you would implement it instead.


It's not really an accurate description anyway. Most shells will only perform the PATH lookup one time per command, then store the fully-qualified file path they found in an in-memory hash table for quicker lookup on each subsequent invocation. This is why you need to blast the cache if you delete or move an executable. Plus, many common utilities are replaced by shell built-ins anyway, and those never require directory traversal at all.
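A toy sketch of that caching behaviour, for illustration only. CommandCache and its methods are invented names, not any shell's real internals, and "executable" is approximated here as "regular file with any execute bit set".

```python
import os
import stat

def _is_executable_file(path):
    # Approximation: a regular file with at least one execute bit set.
    try:
        mode = os.stat(path).st_mode
    except OSError:
        return False
    return stat.S_ISREG(mode) and bool(mode & 0o111)

class CommandCache:
    def __init__(self, path):
        self.dirs = [d or "." for d in path.split(":")]
        self.table = {}  # command name -> fully-qualified path
        self.scans = 0   # number of full PATH walks performed

    def lookup(self, name):
        if name in self.table:
            return self.table[name]  # cache hit: no directory traversal
        self.scans += 1
        for d in self.dirs:
            candidate = os.path.join(d, name)
            if _is_executable_file(candidate):
                self.table[name] = candidate
                return candidate
        return None

    def forget(self):
        # Drop every remembered location, forcing a fresh PATH walk next time.
        self.table.clear()
```

The second lookup of the same name skips the PATH walk entirely, but it also keeps returning the old location after the file is deleted or moved until forget() is called. In bash, the analogous step is `hash -r`.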



