A lot of people have ways of accomplishing this, but my way is using compile-time execution in Zig (I know at least D, C++, and Terra have their own versions of this feature). You can specify a parameter as `comptime` and then do different things based on whatever conditions you want. You can also execute a lot of code at compile-time, including your sqrt check.
E.g. I wrote a `pextComptime` function, which will compile to just a `pext` instruction on machines that have a fast implementation; otherwise it will try to figure out whether it can use a few clever tricks to emit just a couple of instructions, and if those aren't applicable it will fall back on a naïve technique.
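The same idea can be sketched in Rust terms (the original is Zig comptime; `pext_comptime` and `pext_naive` here are made-up names for illustration, not the actual implementation): dispatch on the target at compile time, with a portable fallback.

```rust
/// Naive fallback: extract the bits of `src` selected by `mask` and
/// pack them into the low bits of the result.
fn pext_naive(src: u64, mut mask: u64) -> u64 {
    let mut result = 0u64;
    let mut out = 0u32;
    while mask != 0 {
        let lowest = mask & mask.wrapping_neg(); // isolate lowest set bit
        if src & lowest != 0 {
            result |= 1 << out;
        }
        out += 1;
        mask &= mask - 1; // clear lowest set bit
    }
    result
}

fn pext_comptime(src: u64, mask: u64) -> u64 {
    // `#[cfg]` is resolved at compile time, so only one branch ever
    // exists in the compiled code -- the hardware instruction on BMI2
    // targets, the naive loop everywhere else.
    #[cfg(all(target_arch = "x86_64", target_feature = "bmi2"))]
    return unsafe { core::arch::x86_64::_pext_u64(src, mask) };

    #[allow(unreachable_code)]
    pext_naive(src, mask)
}
```

(The Zig version additionally tries intermediate tricks before giving up and using the naive loop; this sketch only shows the two extremes.)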
Your suggestions introduce, in effect, a hypothetical `if` statement, only one branch of which is taken. I can change the condition arbitrarily, but ultimately it's still going to be either one or the other.
I want the `if` to take both branches at once. I want the compiler to assume that both branches trigger the exact same side effects and return the same results. I want it to try both approaches and determine the better one depending on the environment, e.g. the number of free registers, (lack of) inlining, facts statically known about the input -- all those things that you can't write a condition for on the source level.
Think about it this way. A standard compiler like LLVM contains passes which rewrite the program in order. If something has been rewritten, it will never be rolled back, unless another pass performs a separate rewrite that explicitly does that. In contrast, e-graphs-based compilers like Cranelift maintain an equivalence graph that represents all possible lowerings, and after the whole graph is built, an algorithm finds a single optimal lowering.
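A toy sketch of the "keep everything, choose at the end" structure (these are not Cranelift's actual data structures, and the sketch assumes an acyclic graph, whereas real e-graphs handle cycles with a fixpoint computation):

```rust
// An e-class holds several equivalent candidate nodes. Rewrite rules
// only ever ADD candidates; nothing is discarded while rules run.
// A final "extraction" pass then picks the cheapest overall term.

struct Node {
    cost: u32,            // cost of this node itself
    children: Vec<usize>, // indices of child e-classes
}

struct EGraph {
    classes: Vec<Vec<Node>>, // classes[i] = all equivalent nodes in e-class i
}

impl EGraph {
    /// Cheapest total cost achievable for `class`, considering every
    /// equivalent candidate it contains.
    fn best_cost(&self, class: usize) -> u32 {
        self.classes[class]
            .iter()
            .map(|n| {
                n.cost
                    + n.children.iter().map(|&c| self.best_cost(c)).sum::<u32>()
            })
            .min()
            .expect("e-class must be non-empty")
    }
}
```

The point is that the decision between equivalent forms is made once, globally, by extraction, rather than eagerly inside each rewrite.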
Existing solutions make me choose immediately without knowing all the context. The solution I'd like to see would delay the choice until lowering.
> e-graphs-based compilers like Cranelift maintain an equivalence graph that represents all possible lowerings, and after the whole graph is built, an algorithm finds a single optimal lowering
Do you have a good entrypoint reference for learning about how this works? This (and the associated mention in the article) is the first time I've heard of this approach.
I recently ran down approximately the same rabbit hole when trying to figure out what to do about x86 treating addition and bitwise OR differently. There's https://llvm.org/docs/LangRef.html#id171, but it can't generally be synthesized in Rust. So I went on a short-lived quest:
Oh, yes, I understand now. I've thought to myself before that it would be nice if I could have implementation 1 go into variable x and implementation 2 go into variable y. Then I'd do `assert(x == y)`, and a compiler like Cranelift should know it only needs to pick one of them.
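Concretely, the pattern would look something like this (function names made up for illustration; today the assert mainly serves as documentation and a test hook, since mainstream backends don't exploit it this way):

```rust
// Two implementations of the same function.
fn double_shift(n: u32) -> u32 {
    n << 1
}
fn double_add(n: u32) -> u32 {
    n + n
}

fn double(n: u32) -> u32 {
    let x = double_shift(n);
    let y = double_add(n);
    // The hope: an equality-aware backend could merge x and y into one
    // e-class here and keep whichever lowering is cheaper in context.
    assert!(x == y);
    x
}
```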
I'm glad to know that's the design of Cranelift, since that's how I would think it should be done, although I haven't written a middle or backend for a compiler yet.
Another cool thing you could do is fuzz test a version of the code that actually does take both branches (in separate runs with fresh memory etc.) and aborts if they give different results.
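A minimal sketch of that differential-testing idea (the harness and the tiny PRNG here are hypothetical, dependency-free stand-ins for a real fuzzer):

```rust
// Cheap xorshift PRNG so the sketch needs no external crates.
fn xorshift(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

/// Run both implementations on the same random inputs, each call a
/// fresh invocation, and abort on the first divergence.
fn fuzz_equivalence(impl_a: fn(u64) -> u64, impl_b: fn(u64) -> u64, iters: u32) {
    let mut state = 0x9E37_79B9_7F4A_7C15;
    for _ in 0..iters {
        let input = xorshift(&mut state);
        let (a, b) = (impl_a(input), impl_b(input));
        assert_eq!(a, b, "implementations diverge on input {input:#x}");
    }
}
```

A real setup would run each branch in a fresh process to get the "fresh memory" isolation mentioned above; this sketch only captures the compare-and-abort part.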
this is called symbolic execution - as i have already told you, many compilers do this in the form of sccp and scev. you don't need silly things like egraphs for this.
> If something has been rewritten, it will never be rolled back
this is patently false - not only do passes get rolled back to catch/correct errors but there is a fixed-point iteration system (at least in MLIR) that will apply passes as long as they are "profitable".
I don't see how symbolic execution is relevant to this. Yes, symbolic execution does "check both paths" for some definition of "check", but ultimately I as a programmer still need to write a condition, and that condition is on the source level, so it can't access information on e.g. register pressure, which is what I would like to comptime-branch on.
> not only do passes get rolled back to catch/correct errors but there is a fixed-point iteration system (at least in MLIR) that will apply passes as long as they are "profitable"
Huh? I'm not arguing against fixed-point iteration, that's perfectly fine because it's still a unidirectional process. What do you mean by "passes get rolled back to catch/correct errors" though? Certain rewrites can certainly be skipped in the first place if they would pessimize the code, but that's not what I'm talking about.
If there's pass 1 that chooses between rewrites A and B, pass 2 that chooses between C and D, and pass 3 that chooses between E and F, then in a typical compiler these choices would be made one by one, mostly greedily. An e-graph-style approach allows all eight combinations to be explored without necessarily leading to a combinatorial explosion.
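Here's a made-up cost model showing why that matters: the locally cheapest rewrite at pass 1 can interact badly with the locally cheapest rewrite at pass 3, so per-pass greedy selection loses to searching all eight combinations. (All numbers are invented for illustration.)

```rust
// BASE[p][c] = standalone cost of choice c (0 or 1) at pass p.
const BASE: [[u32; 2]; 3] = [
    [1, 2], // pass 1: rewrite A or B
    [3, 3], // pass 2: rewrite C or D
    [1, 5], // pass 3: rewrite E or F
];

fn total_cost(choice: [usize; 3]) -> u32 {
    let mut cost: u32 = (0..3).map(|p| BASE[p][choice[p]]).sum();
    if choice[0] == 0 && choice[2] == 0 {
        cost += 10; // A and E together force an expensive fixup
    }
    cost
}

fn greedy_cost() -> u32 {
    // Each pass picks its locally cheapest option in isolation:
    // A, C, E all look best on their own -- and hit the penalty.
    total_cost([0, 0, 0])
}

fn exhaustive_cost() -> u32 {
    // Explore all 2^3 combinations, as e-graph extraction effectively does.
    (0..8usize)
        .map(|i| total_cost([i & 1, (i >> 1) & 1, (i >> 2) & 1]))
        .min()
        .unwrap()
}
```

Real e-graphs avoid literally enumerating the product by sharing common sub-terms between the combinations, which is why this doesn't blow up combinatorially in practice.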
https://github.com/Validark/Accelerated-Zig-Parser/blob/8782...