
The thing that looked odd to me was:

> *IFE, IFN, IFG, IFB take 2 cycles, plus the cost of a and b, plus 1 if the test fails*

It is a long time since I worked in assembly, but I don't remember comparison functions having different timings depending on results when I were a lad.

(FYI, most of the assembler I played with was for 6502s (in the old beebs) with a little for the Z80 family and early x86)




Many CPUs with branch prediction carry a penalty of at least one cycle for mispredicted branches, as the fetch stage(s) of the pipeline must be invalidated. From Wikipedia:

> The time that is wasted in case of a branch misprediction is equal to the number of stages in the pipeline from the fetch stage to the execute stage. Modern microprocessors tend to have quite long pipelines so that the misprediction delay is between 10 and 20 clock cycles. The longer the pipeline the higher the need for a good branch predictor.

http://en.wikipedia.org/wiki/Branch_predictor

EDIT: the inclusion of this is somewhat interesting as there's not much of a point in simulating a pipelined processor unless you care about hardware details. My best guess is they're adding this "feature" to make compilation to assembly MORE difficult and increase the advantages of hand-compiled assembly. Branch prediction is a tricky thing to do right in compilers.


Ah, obviously my experience is somewhat out of date! My "hacking cycles off loops in assembler" days were all before I got my hands on any kit advanced enough for pipelining and branch prediction to be a consideration.


AVR behaves similarly. Branch instructions take 1 cycle if the condition is false, and 2 if the condition is true.


low and mid-range PICs too


It seems to me the extra cycle is used to read the first word of the next instruction, since it needs to know how long the next instruction is to skip it.
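A rough sketch of why the skip needs that read, assuming the DCPU-16 1.1 encoding as I understand it: a basic instruction word is laid out as bbbbbbaaaaaaoooo, a non-basic one has opcode 0 with its single operand in the top six bits, and operand codes 0x10-0x17, 0x1e and 0x1f each consume one extra word. The function names here are made up for illustration; none of this is taken from an actual emulator.

```python
def operand_extra_words(value):
    """Return 1 if a 6-bit operand code consumes a following word, else 0."""
    # Assumed encoding: 0x10-0x17 = [next word + register],
    # 0x1e = [next word], 0x1f = next word (literal).
    return 1 if 0x10 <= value <= 0x17 or value in (0x1E, 0x1F) else 0


def instruction_length(word):
    """Length in words of the instruction whose first word is `word`."""
    opcode = word & 0xF
    a = (word >> 4) & 0x3F
    b = (word >> 10) & 0x3F
    if opcode == 0:
        # Non-basic instruction: the a field holds the opcode,
        # so only the top six bits are an operand.
        return 1 + operand_extra_words(b)
    return 1 + operand_extra_words(a) + operand_extra_words(b)


def skip_next(memory, pc):
    """PC after skipping the instruction at `pc`, as a failed IFx would.

    The first word has to be read before the new PC can be computed,
    which is the read the extra cycle would plausibly pay for.
    """
    return (pc + instruction_length(memory[pc])) & 0xFFFF
```

For example, 0x7de1 (SET [next word], next word) decodes as three words long, so a failed test in front of it would move PC forward by three rather than one.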



