Calls like execvp() do little more than splitting PATH on ':', followed by repea...
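For readers who want to see the shape of that search, here is a minimal Python sketch of an execvp-style lookup. The helper names `path_candidates` and `execvp_sketch` are made up for illustration, and the error handling is deliberately simplified: a real libc treats errors such as EACCES specially, which this sketch does not attempt to reproduce.

```python
import os

def path_candidates(name, path):
    """Return the paths an execvp-style search would try, in order."""
    # A name containing a slash is used as-is; PATH is not consulted.
    if "/" in name:
        return [name]
    # An empty PATH entry conventionally means the current directory.
    return [os.path.join(d or ".", name) for d in path.split(":")]

def execvp_sketch(name, argv, env):
    """Try execve on each candidate; raises only if every attempt fails."""
    last_err = FileNotFoundError(name)
    for candidate in path_candidates(name, env.get("PATH", os.defpath)):
        try:
            # On success this replaces the process and never returns.
            os.execve(candidate, argv, env)
        except OSError as e:
            # Simplification: remember the last error and keep searching.
            last_err = e
    raise last_err
```

Each failed candidate costs one execve syscall, which is the cost the replies below are arguing about.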

criddell · on May 11, 2022

Is that ever going to be a hot path?

naniwaduni · on May 11, 2022

It's probably not exactly going to be hot, but even a failing execve is inherently semi-expensive, since it is a syscall and incurs context switches.

It's just outweighed by a couple of orders of magnitude by all the overhead that comes with successfully launching another executable, unless you have, like, a thousand junk paths in your PATH.

hnlmorg · on May 11, 2022

Theoretically can be. Every command you invoke without a path will need to look up PATH.

In practice, well-behaved shells cache the contents of PATH to speed up operations.

mekster · on May 11, 2022

Sounds like they need to be fixed for inefficient handling of a simple operation.

rascul · on May 11, 2022

The fix is for the user to use a smaller $PATH when possible. Any method of checking that the command exists and is executable before trying to execute it leads to TOCTOU race conditions.

https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use

koenigdavidmj · on May 11, 2022

I’m assuming you are proposing to stat each candidate before trying to execve it. I’m also assuming that a stat system call is roughly as expensive as an execve of a nonexistent or non-executable path.

For every failed candidate, you are doing one system call, so roughly the same cost each way.

Now if you just do an execve, you’re just paying that cost. If you stat first, you pay the cost of another system call that doesn’t change the flow of your program at all (a nice way of saying you’re wasting time).

Unless stat is dramatically faster than exec on a nonexistent or non-executable path, there’s never a case where this is better.
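The assumption that a failed stat costs about the same as a failed execve can be checked with a rough microbenchmark. This is only a sketch: the path `/nonexistent-dir/nonexistent-cmd` is a hypothetical one assumed not to exist (if it did exist and were executable, the execve call would replace the running process), and Python's exception overhead is included in both timings, so only the comparison between the two numbers is meaningful.

```python
import os
import time

# Hypothetical path, assumed not to exist on the machine running this.
MISSING = "/nonexistent-dir/nonexistent-cmd"

def time_failed_call(fn, n=10000):
    """Average wall-clock seconds per call of fn, which is expected to fail."""
    start = time.perf_counter()
    for _ in range(n):
        try:
            fn()
        except OSError:
            pass  # failure is the case being measured
    return (time.perf_counter() - start) / n

def compare(n=10000):
    """Return (stat_cost, exec_cost) in seconds per failed call."""
    stat_cost = time_failed_call(lambda: os.stat(MISSING), n)
    exec_cost = time_failed_call(lambda: os.execve(MISSING, [MISSING], {}), n)
    return stat_cost, exec_cost

if __name__ == "__main__":
    stat_cost, exec_cost = compare()
    print(f"failed stat:   {stat_cost * 1e6:.2f} us")
    print(f"failed execve: {exec_cost * 1e6:.2f} us")
```

If the two numbers come out in the same ballpark, the argument holds: stat-then-exec pays one extra syscall on the successful candidate and saves nothing on the failed ones.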

naniwaduni · on May 11, 2022

Context switches could straightforwardly be saved by doing the PATH splitting and lookup in-kernel, or just providing a list of executable paths to check.

It didn't work out this way historically (doing unnecessary string processing and requiring extra memory could've been more expensive than the context switches), the performance impact of a failed execve isn't normally a high priority, and there are other reasons not to want stuff in the kernel (not that it stops frankly less critical stuff from getting in). But there's definitely low-hanging fruit here if it, like, mattered.

EdSchouten · on May 11, 2022

Enlighten me how you would implement it instead.

nonameiguess · on May 11, 2022

It's not really an accurate description anyway. Most shells will only perform the PATH lookup one time per command, then store the found fully-qualified file path in an in-memory hash table for quicker lookup on each subsequent invocation. This is why you need to blast the cache if you delete or move an executable. Plus, many common utilities are replaced by shell built-ins anyway, and those never require directory traversal at all.
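A toy version of that hash table might look like the following. `CommandCache` and its method names are invented for illustration and do not mirror any particular shell's internals; `forget()` plays roughly the role of `hash -r` in POSIX shells.

```python
import os

class CommandCache:
    """Remember where each command was found so PATH is walked only once."""

    def __init__(self, path):
        self.dirs = path.split(":")
        self.table = {}    # command name -> full path found earlier
        self.searches = 0  # how many times PATH was actually walked

    def lookup(self, name):
        # Cache hit: no directory traversal at all.
        if name in self.table:
            return self.table[name]
        self.searches += 1
        for d in self.dirs:
            candidate = os.path.join(d or ".", name)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                self.table[name] = candidate
                return candidate
        return None  # misses are not cached in this sketch

    def forget(self):
        """Drop every remembered location."""
        self.table.clear()
```

Note that a cached entry goes stale if the executable is deleted or moved: `lookup` keeps returning the old path until `forget()` is called, which is the "blast the cache" step mentioned above. The cached path is also only a hint, since the eventual exec can still fail.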