
I think it has shaken out the way it has because compile-time optimizations to this extent require knowing runtime constraints/data at compile time, which for non-trivial situations is impossible: the code will be run with too many different types of input data, on too many different cache sizes, etc.

The CPU has better visibility into the actual runtime situation, so it can do runtime optimization better.

In some ways, it’s like a bytecode/JVM type situation.

If we can write code to dispatch different code paths (as has been done for decades to support SSE, and later AVX, within one binary), then we can write code to parallelize large-array execution based on heuristics. Not much different from busy spins falling back to sleep or other mechanisms when the fast path fails after ca. 100-1000 attempts to secure a lock.
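
Something like the sketch below is what I mean; the 1 << 16 cutoff and the simple chunking are made-up values for illustration, not tuned numbers.

    #include <algorithm>
    #include <cstddef>
    #include <numeric>
    #include <thread>
    #include <vector>

    double sum_serial(const std::vector<double>& v) {
        return std::accumulate(v.begin(), v.end(), 0.0);
    }

    double sum_parallel(const std::vector<double>& v) {
        const unsigned n_threads = std::max(1u, std::thread::hardware_concurrency());
        std::vector<double> partial(n_threads, 0.0);
        std::vector<std::thread> workers;
        const std::size_t chunk = v.size() / n_threads;
        for (unsigned t = 0; t < n_threads; ++t) {
            const std::size_t lo = t * chunk;
            const std::size_t hi = (t + 1 == n_threads) ? v.size() : lo + chunk;
            workers.emplace_back([&v, &partial, t, lo, hi] {
                partial[t] = std::accumulate(v.begin() + lo, v.begin() + hi, 0.0);
            });
        }
        for (auto& w : workers) w.join();
        return std::accumulate(partial.begin(), partial.end(), 0.0);
    }

    double sum(const std::vector<double>& v) {
        // Heuristic dispatch: only pay the thread overhead for large inputs.
        return v.size() < (std::size_t{1} << 16) ? sum_serial(v) : sum_parallel(v);
    }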

For the trivial example of 2+2 like above, of course, this is a moot discussion. The commenter should've led with a better example.


Sure, but it’s a rare situation (by code path) where it will beat the CPU’s auto optimization, eh?

And when that happens, almost always the developer knows it is that type of situation and will want to tune things themselves anyway.


What kind of CPU auto-optimization? Here specifically I envisioned a macro-level optimization, triggered when an array is detected to have a length on the order of thousands or tens of thousands of elements. I guess some advanced sorting algorithms do extend their operation to multiple threads in such cases.
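
Roughly what I have in mind, as a sketch: the 100000 threshold is a made-up number, and std::execution::par is just one way to hand the multi-threading off to the standard library.

    #include <algorithm>
    #include <execution>
    #include <vector>

    void sort_adaptive(std::vector<int>& v) {
        if (v.size() < 100000) {
            std::sort(v.begin(), v.end());                       // small input: stay single-threaded
        } else {
            std::sort(std::execution::par, v.begin(), v.end());  // large input: let the library use threads
        }
    }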

For CPU machine code, it's the compilers doing the hard work of reordering code to enable ILP (instruction-level parallelism), eliminating false dependencies, inlining, and vectorizing; whatever else it takes to keep the pipeline filled and busy.
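
A hand-written sketch of the kind of transformation I mean: a single accumulator forms one long dependency chain, while several independent accumulators let the CPU overlap the additions. Compilers typically do this reassociation automatically for integer sums; for floating point it usually needs something like -ffast-math or a manual rewrite.

    #include <cstddef>
    #include <vector>

    long long sum_chained(const std::vector<long long>& v) {
        long long s = 0;
        for (std::size_t i = 0; i < v.size(); ++i)
            s += v[i];                          // each add waits on the previous one
        return s;
    }

    long long sum_unrolled(const std::vector<long long>& v) {
        long long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
        std::size_t i = 0;
        for (; i + 4 <= v.size(); i += 4) {
            s0 += v[i];                         // four independent dependency chains
            s1 += v[i + 1];
            s2 += v[i + 2];
            s3 += v[i + 3];
        }
        for (; i < v.size(); ++i) s0 += v[i];   // remainder
        return s0 + s1 + s2 + s3;
    }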

I'd love for the sentiment "the dev knows" to be true, but I think this is no longer the case. Maybe if you are in a low-level language AND have time to reason about it? Add to this the reserved smile when I see someone "benchmarking" their piece of code in a "for i to 100000" loop, without any other considerations. Next, suppose a high-level language project: the most straightforward optimization for new code is to apply proper algorithms and fitting data structures. And I think even this is too much to ask nowadays, because it takes time, effort, and the knowledge that those options exist in the first place.
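
To make the benchmarking point concrete, a rough sketch (sizes and repetition count are arbitrary): unless the result is consumed somewhere observable, the optimizer may delete the measured work entirely, and a fixed, cache-warm input only measures the happy path.

    #include <chrono>
    #include <cstdio>
    #include <numeric>
    #include <vector>

    int main() {
        std::vector<int> data(1 << 20, 1);
        volatile long long sink = 0;   // observable result: keeps the work from being optimized away

        const auto t0 = std::chrono::steady_clock::now();
        for (int rep = 0; rep < 100; ++rep)
            sink = sink + std::accumulate(data.begin(), data.end(), 0LL);
        const auto t1 = std::chrono::steady_clock::now();

        const auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
        std::printf("%lld ns per repetition\n", static_cast<long long>(ns / 100));
        return 0;
    }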



