Compilers are usually at a disadvantage compared to human programmers, as they're under pressure to produce code as quickly as possible: ideally in seconds, minutes at worst. A human may spend many hours or days writing, profiling, testing, etc. This biases the kinds of algorithms that compilers use (especially JITs, which have even stricter requirements).
It would be nice to have a compiler/optimiser/analyser/profiler/tester/fuzzer/etc. designed to run for long periods, running all sorts of improvement-finding algorithms, building up a knowledge base about the code on disk (which can be updated incrementally when the code changes), and providing reports and messages to the user.
When we're about to embark on a deep dive for optimisation/debugging/etc., we can fire up this assistant and have it running for the entire time we're devoting to the problem. It can even keep running overnight if we spend several days on the issue.
Superoptimisation is very cool, although we'd want a bunch of infrastructure around it, e.g. to identify bottlenecks and make reuse easier.
Souper is a cool project, and I definitely agree with the use of such techniques as a form of static/dynamic analysis to inform the user, rather than as a compilation step.
Whilst the time taken by superoptimisation is a big problem, another issue with this kind of approach is its unpredictability. Rather than using the output of such optimisers directly, it seems preferable to use them to find hints which we can annotate our code with; that way, the performance will be more predictable, and more robust to code changes.
Profiling information could be gathered during testing; we could kill two birds with one stone by gathering it during property checking, e.g. in the style of QuickCheck/(Lazy)SmallCheck/etc. Maybe with an option to add weights to the test cases, so we can assign a low weight to tests which throw crazy data at the system, like fuzzing, and a higher weight to those with realistic data generators, golden tests, etc.
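As a minimal sketch of the weighting idea using QuickCheck's frequency combinator; Request, handle and both generators are hypothetical stand-ins for a real test suite:

    import Test.QuickCheck

    data Request = Request { path :: String, size :: Int } deriving Show

    -- A "realistic" generator, biased towards typical inputs.
    realisticRequest :: Gen Request
    realisticRequest = Request <$> elements ["/", "/home", "/api/v1/users"]
                               <*> choose (0, 4096)

    -- A "fuzzing" generator, throwing crazy data at the system.
    fuzzRequest :: Gen Request
    fuzzRequest = Request <$> arbitrary <*> arbitrary

    -- Weight realistic cases more heavily, so any profiling data gathered
    -- whilst checking properties reflects plausible workloads.
    weightedRequest :: Gen Request
    weightedRequest = frequency [(9, realisticRequest), (1, fuzzRequest)]

    -- A placeholder property over the hypothetical system under test.
    handle :: Request -> Bool
    handle r = size r >= 0

    prop_handlesRequest :: Property
    prop_handlesRequest = forAll weightedRequest handle

Running that suite with GHC profiling enabled (-prof -fprof-auto) would then give cost-centre data skewed towards the realistic workload rather than the fuzzed one.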
Your description of the assistant reminds me of a Clojure talk I watched recently[1] where the speaker outlines how a central pool of knowledge about invariant compilation properties could embody the scientific method.
Not watched the talk yet, but it sounds very similar to my own thinking. For example, I think the "pipeline" approach to compilation (preprocess -> lex -> parse -> desugar -> infer -> check -> optimise -> generate code) is very restrictive, as it precludes many other activities (e.g. static analysis), forcing custom lexers, parsers, etc. to be created in parallel, which may or may not work with existing code, etc.
I think a more data-based approach would be preferable: for example, we might think of "source text", "preprocessed", "lexed", etc. as tables in a relational database, and the phases of the pipeline as views/projections which populate one table from another. Optimisation would just be a one-to-many relation, with a single source text corresponding to multiple possible expressions. This data could be stored, collated, mined, augmented, etc. by various other processes, which allows more approaches to be taken than just compiling.
Of course, this is just one idea. The relational part would only need to be an interface to the data: it could be computed lazily, and wouldn't necessarily be implemented with an actual RDBMS storing all of the intermediate bits.
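To make that concrete, here's a minimal Haskell sketch of "phases as relations"; all of the types and phase definitions are hypothetical placeholders rather than a real compiler API:

    type SourceId = Int

    -- Each "table" relates source identifiers to artefacts of some phase.
    newtype Table a = Table { rows :: [(SourceId, a)] }

    data Token = Token String             deriving Show
    data Ast   = Leaf String | Node [Ast] deriving Show

    -- A phase is a view: it populates one table from another. Because the
    -- function returns a list, a phase can be one-to-many; optimisation
    -- would map a single row to several candidate rows.
    phase :: (a -> [b]) -> Table a -> Table b
    phase f (Table xs) = Table [(i, y) | (i, x) <- xs, y <- f x]

    -- Toy stand-ins for lexing and parsing.
    lexPhase :: Table String -> Table [Token]
    lexPhase = phase (\s -> [map Token (words s)])

    parsePhase :: Table [Token] -> Table Ast
    parsePhase = phase (\ts -> [Node [Leaf t | Token t <- ts]])

Other processes (static analysers, profilers, the knowledge base above) would then just be more views over the same tables, and nothing forces them to be materialised eagerly.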
Just watched the talk. The technique the speaker has discovered is called "supercompilation" (note: this is distinct from "superoptimisation", mentioned in sibling comments).
The idea is to use an evaluation strategy which works at compile time, reducing statically-known expressions whilst shuffling around unknown values symbolically to preserve the semantics. The result ends up folding constants, inlining, unrolling, etc. automatically, just like the examples in the talk.
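As a rough illustration (closer to a toy partial evaluator than a real supercompiler, which would also unfold function calls, split on cases, generalise, etc.), here's how statically-known parts can be reduced whilst unknowns stay symbolic:

    data Expr = Lit Int
              | Var String
              | Add Expr Expr
              | Mul Expr Expr
      deriving Show

    reduce :: Expr -> Expr
    reduce (Add a b) = case (reduce a, reduce b) of
      (Lit x, Lit y) -> Lit (x + y)   -- statically known: fold the constant
      (Lit 0, e)     -> e             -- algebraic simplification
      (e, Lit 0)     -> e
      (a', b')       -> Add a' b'     -- unknowns are kept symbolically
    reduce (Mul a b) = case (reduce a, reduce b) of
      (Lit x, Lit y) -> Lit (x * y)
      (Lit 1, e)     -> e
      (e, Lit 1)     -> e
      (Lit 0, _)     -> Lit 0
      (_, Lit 0)     -> Lit 0
      (a', b')       -> Mul a' b'
    reduce e = e

    -- reduce (Add (Mul (Lit 2) (Lit 3)) (Var "x"))  ==>  Add (Lit 6) (Var "x")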
Whilst supercompilation can be slow, the main criticism is that the resulting code is quite large, due to the amount of inlining.
The main difference between supercompilers and the technique discussed in the talk is that supercompilation is provably correct: it doesn't rely on testing, or on placing exhaustiveness requirements on the user. I don't think it's necessary to sacrifice correctness. The speaker mentioned the use of guards in JITs, and how he avoids them for speed; but since we're performing such extensive rewriting of the code anyway, many redundant guards can be supercompiled away, and those remaining can be shunted to the start of the program, so that no branches appear in the hot path.
The allusions to the scientific method seem to be about properties/invariants, and potential counterexamples to them. I'm all for finding and sharing properties of programs, both conjectured and proven, although I don't like the idea of assuming the conjectures are true.
I've looked into this in Haskell, and found some great work, including:
I think there's definitely some low-hanging fruit there, for finding equivalent expressions, comparing their performance, and generating a list of rewrite rules which the programmer may want to consider adding to their code.
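In GHC, such suggestions could take the form of rewrite rules. As a purely illustrative example of the sort of thing a tool might emit (GHC's base libraries already ship an equivalent "map/map" rule):

    module SuggestedRules where

    -- Fuse two list traversals into one; the tool would have found the two
    -- sides to be equivalent and measured the right-hand side as faster.
    {-# RULES
    "map/map fusion"  forall f g xs.  map f (map g xs) = map (f . g) xs
      #-}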
A slightly more ambitious project would use equations/properties about the program to generate a specialised supercompiler (maybe using something like a variant of Knuth-Bendix completion, ordered such that the normal forms are the higher-performing variants).
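Very roughly, the "ordered by performance" part might look like this, where a hypothetical cost estimate decides which way each equation becomes a rewrite rule (real completion would also have to derive new rules from critical pairs and worry about termination):

    type Equation a = (a, a)
    type Rule     a = (a, a)  -- (lhs, rhs): rewrite lhs to rhs

    -- Orient each equation so that rewriting always moves towards the
    -- cheaper side, making the fast variants the normal forms.
    orientByCost :: (a -> Double) -> [Equation a] -> [Rule a]
    orientByCost cost eqs =
      [ if cost lhs >= cost rhs then (lhs, rhs) else (rhs, lhs)
      | (lhs, rhs) <- eqs ]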