How to write correct code by construction using the Coq Proof Assistant

agumonkey · on Sept 3, 2023

Interesting article, they list applied use of the theory on mainstream tools, things I never see on my readings. https://plv.mpi-sws.org/rustbelt/popl18/ for instance. Very cool

nextos · on Sept 4, 2023

Also take a look at Dafny, Why3/WhyML, and Ada/SPARK. Very pragmatic tools that are getting some industry adoption.

Coq and Isabelle also see some use, but they are more costly to use in my experience.

mirekrusin · on Sept 3, 2023

Software Foundations in Coq by Michael Ryan Clarkson [0] is great introduction to this topic.

[0] https://www.youtube.com/playlist?list=PLre5AT9JnKShFK9l9HYzk...

stepchowfun · on Sept 3, 2023

If you're already familiar with a functional programming language like Haskell or OCaml, you have the background to work through my Coq tutorial here: https://github.com/stepchowfun/proofs/tree/main/proofs/Tutor...

My goal with this tutorial was to introduce the core aspects of the language (dependent types, tactics, etc.) from first principles. If you're fascinated by proof assistants like Coq or Lean and want to understand how to use them, this tutorial is written for you.

Any feedback is appreciated!

schoen · on Sept 4, 2023

I've done Logical Foundations and part of Programming Language Foundations, and some of the stuff in your tutorial is new to me (especially dependent types).

Your tutorial is very interesting, but in comparison to the Software Foundations series you have relatively few exercises and they're all at the end of the chapter. I found it helpful to do exercises as I went along, in many cases testing or exercising my knowledge of a new concept or Coq feature right away. So I would suggest writing more exercises and sprinkling them closer to where some of the material is introduced.

Also, some of your exercises in some chapters have a pretty steep ramp-up in difficulty, so again it might be nice to have exercises that gradually build in difficulty.

These are probably important considerations when readers are going to be working through this material entirely on their own and it might be their first exposure to Coq. (I think your tutorial says it was previously presented in-person, which has got to be a lot easier, since people can ask questions!)

Thanks for writing and sharing this material!

stepchowfun · on Sept 4, 2023

Thanks for your feedback!

schoen · on Sept 4, 2023

I'd like to add belatedly that I think it's cool that you introduce the Curry-Howard stuff so early on, basically from the very beginning.

jcarrano · on Sept 4, 2023

For a more practical approach (though limited to C) you can try the Frama-C suite. It works by annotating C code, instead of having to manually rewrite the code in Coq.

One thing that the post does not mention is that things get nasty real quick if you use actual machine integers (or number generally) instead of ideal ones in Coq- though that can also be due to my own ignorance as an amateur.

mdnahas · on Sept 4, 2023

If you’re interested, here is my Coq tutorial.

http://michaeldnahas.com/doc/nahas_tutorial.html

It has been mentioned on HN before (multiple times).

https://news.ycombinator.com/item?id=21625852

JonChesterfield · on Sept 3, 2023

I can't tell if this is pseudocode or what proof in coq usually looks like:

Theorem NestedTreeContains1:

  forall (t: tree),
    t = Node example 2 (Node Nil 3 Nil) 

  -> 

    contains 1 t = true.

Proof.

nail.

wreck t.

- wat.

- wreck H into Ht1, Hv and Ht2.

  sub Hv.
  evaluate.
  sub Ht1.
  just ExampleTreeContains1.

Qed.

Theorem, forall, qed look fairly likely. Are there built-in operations in coq called 'wat' and 'wreck'?

082349872349872 · on Sept 3, 2023

A cursory search reveals they're not builtin, but were written for the talk: https://github.com/awalterschulze/ccc-talk/blob/main/Coq/src...

    nail ~= intros
    wat ~= discriminate
    etc.

colonwqbang · on Sept 3, 2023

"wreck" is actually "destruct", "same" is "reflexivity"...

It boggles the mind why the author would make up fanciful new names for the fundamental tactics in this "how to" article.

21eleven · on Sept 4, 2023

He's giving a talk, it's for the lulz

If you are giving a talk on a theorem proving programing language you're gonna want to make it fun

thaliaarchi · on Sept 3, 2023

The only useful ltac I see in there is compare, which can save a lot of tedious rewrites, when needing that frequently.

legobmw99 · on Sept 3, 2023

I used Coq in one class in school, and I never encountered wat or wreck. It’s possible they’re user- or library-defined tactics.

One thing I think is important about Coq is that it really is meant to be interactive. Even after a fair amount of usage I still find it somewhat tricky to just read a proof, especially a nontrivial one. A good IDE will show you a panel that includes all your hypothesis and requirements so far, updating as you run each line

ekidd · on Sept 4, 2023

As other people note, a lot of those seem to be local aliases defined by the author.

But the most important thing to understand about Coq is that it's meant to be read in an IDE. The IDE will show you, at each step, what has been proved and what remains to be proved. The "tactics" you see above basically mean, "At this point in the proof, we can make progress by substituting Hv. And in context, this will make sense.

But without being able to see the proof state at each step, Coq can be incredibly cryptic.

Smaug123 · on Sept 3, 2023

They are not built-in, no. https://github.com/awalterschulze/ccc-talk/blob/1e781326d2e3...

nuancebydefault · on Sept 3, 2023

I find the introduction interesting, it somehow clicked. Devs are like toddlers playing with wooden bricks, who are demanded to scale up this act of building to the level of a grown up's house.

In fact it often feels that way. You're either asked to extend or improve existing constructions that feel like sticks and stones bound together with rubber bands. Or one feels excited for being allowed to make up such a construction as a demo, not knowing it will be used as a basis for a production-quality construction.

I'm not sure if mathematical proof is the way to go. There's just too much code to cover for this approach to be realistic.

Most problems are born on the drawing table. The designs seem right on paper and only in practice, during or after building, they become visable.

mrkeen · on Sept 4, 2023

> I'm not sure if mathematical proof is the way to go. There's just too much code to cover for this approach to be realistic.

It may be unrealistic, but people are doing it anyway. It may be the case that there are just a few Leslie Lamports capable of doing this stuff, and it's unrealistic to expect the rest of us to.

But even if that's the case, I still want them to publish their code as libraries, so I can reach for tools that will work - even if I couldn't have proved it myself.

And if this whole mathematical/proofy way of doing things is a dead end (which I strongly disagree with - we're just at the beginning!) then it's no big deal, because we are already really good at turning JSON into stack traces.

zmgsabst · on Sept 3, 2023

Turning a foam architectural model into a house without the required cost investment because the foam model was “so quick to make” feels like a frustratingly accurate description.

brigadier132 · on Sept 3, 2023

Yep, we build a foundation with plywood and styrofoam and then someone comes and asks you to build a skyscraper on top of it.

trwaw · on Sept 3, 2023

We build a train and then someone asks to turn it into a skyscraper because skyscrapers are the in thing now. It already looks like one on its side, how hard can it be to turn it right side up?

smaudet · on Sept 4, 2023

While I do think proof constructed code could be superior, I am not convinced for two reasons:

1. The problem expressed here has to do with a particular, well understood problem. It is not subject to the whims or constraint needs of the "customer". Proof construction is great when the problem is properly understood, I don't know how well it handles constantly changing requirements. (This encompasses business and interaction constraints too - perhaps you can construct a provable correct UI, but will it be engaging or usable?)

2. Test driven development can guide high quality code construction, arguably at lower construction cost. That is the reason I'd argue this has not taken off in comparison to traditional "test" based design.

You can argue it produces lower quality code, and you'd probably be right. But in comparison to construction of complex, rich design, I suspect that it produces faster results at lower cost. The provable solution will win out...eventually. Depends on whether it can make significant marker penetration once it is complete.

Nevertheless, yes of course we should be building foundations witb concrete not plywood and Styrofoam, so this is a good development. I do think a proper engineering approach with test based verification can build a "somewhat concrete" foundation, at least.

atq2119 · on Sept 3, 2023

The article shows how interesting properties can be proved (with a nontrivial amount of effort). That's cool, but I wouldn't call it correct by construction.

MaxBarraclough · on Sept 3, 2023

It's a standard phrase in the formal methods field, [0] emphasising that it's an alternative to the usual approach where we hope that our testing (which can never be complete) is enough to catch all the worrisome bugs.

Personally, this article didn't strike me as being very easy to follow, or as being the most compelling demonstration of the value of formal methods. At the risk of going off-topic: if you're not seeing the point, I recommend this example using SPARK, a non-functional language. [1]

[0] https://betterprogramming.pub/a-taste-of-coq-and-correct-cod...

[1] https://docs.adacore.com/spark2014-docs/html/ug/en/tutorial....

MaxBarraclough · on Sept 14, 2023

10 days too late to edit: the first link should have been https://wiki.c2.com/?CorrectByConstruction

thaliaarchi · on Sept 3, 2023

Right, not all of the techniques shown yield code that’s correct by construction. For example, an argument was indirectly given, and that error was only found after writing a theorem involving it and finding it was impossible to prove.

In the section “Correct BST by Construction”, they explain the technique of requiring every constructor of a BST to supply a proof that it maintains the properties of a BST. Any modification to a BST constructs a new BST (since Coq is pure), so any operations on a BST must also supply a proof of their correctness. That is correct by construction.

staunton · on Sept 3, 2023

There's some software where it's important not to have bugs and know exactly what it does, and some where any attempt to specify what it's supposed to be doing, let alone fixing all the bugs, is a waste of time. (And things in between...)

Currently, a lot of instances of the first situation get the treatment befitting the second. This is because it's still too hard to use these tools at scale. However, the trend is certainly towards improving tooling and the cost of doing this will go down a lot.

Maybe in a hundred years programming will be replaced by either "formally specify what you want", if you know what you want, (AI fills in implementation of how it does what you want, as well as a proof that it does) or, if you only very roughly know what you want, explain it to the chatbot and answer it's questions.

cvoss · on Sept 3, 2023

In the general case, the "write the spec and the AI does the rest" boils down to automated theorem proving, and I'm optimistic AIs will get pretty good at this in a lot of useful situations.

Even today, in limited well-defined problem spaces, there are sometimes good algorithms for this sort of thing. See Dana Angluin's algorithm for learning the DFA that recognizes a regular language given through an interactive process of having the user provide examples/counterexamples, which serve to provide an increasingly constrained spec.

seanmcdirmid · on Sept 3, 2023

AI probably disrupts programming in the next couple of decades, where they produce not only software but static and dynamic validation artifacts of correctness. It’s sad though, I would have really liked to see how human programming could improve with better tooling, it seems like a lot of formal specification work (and program synthesis using such) could be improvised with just better user interfaces.

rgmerk · on Sept 4, 2023

So the AI spat out code and a formal spec and a theorem prover spat out a proof that the code is consistent with the formal spec?

Great.

Does the formal spec represent a complete and accurate description of what the code is supposed to do?

shrugs

seanmcdirmid · on Sept 4, 2023

If an LML can write an essay that humans can read, surely it’s conceivable that it could spit out proofs/tests/docs etc…in a human readable form. So really, you can just decide for yourself then.

rgmerk · on Sept 4, 2023

So really, you can just decide for yourself then

Even if you can understand what the proof proves (not a given) do you personally understand the actual requirements for the software perfectly?

seanmcdirmid · on Sept 4, 2023

Building software is a conversation right? Your specification starts out vague, the AI makes a bunch of assumptions, and then you just say “oh, I really didn’t want that”, wash rinse and repeat.

smaudet · on Sept 4, 2023

I'm not convinced this is adequate, your specifications will be come rather unweildy.

If you employed this approach, I would want to pair it at a minimum with a test suite.

Even then, how do you know that your system (covered by hill climbing suites of tests that pass), has correct behavior for all things you didn't specify?

You don't. Better to build from the ground up formally than fight with "AI" that probably got it worse than wrong first try.

seanmcdirmid · on Sept 4, 2023

You don’t anyways now. You just have some assurance, not total assurance, with current practices. Many people will find something conversational (in that you have some documentation about what it did and can correct assumptions) good enough because it brings close enough to the human status quo.

I hope your pessimism is founded though, I would really like to see programming by human advance rather than the whole thing be taken over by AI. Hybrid approaches are also possible (AI in the loop but not in the core position).

fortran77 · on Sept 3, 2023

Why was the title changed? Isn't HN policy to use the original title unless it's very misleading?

grzm · on Sept 3, 2023

'dang changed it to the subtitle because it was trollish: https://news.ycombinator.com/item?id=37374404

ftxbro · on Sept 3, 2023

[flagged]

dang · on Sept 3, 2023

Please don't do this here.

abraae · on Sept 3, 2023

Without being able to see the flagged comment, I'd venture the authors had a quiet chuckle when titling this piece.

dang · on Sept 3, 2023

I've unkilled it for you.

dudus · on Sept 3, 2023

I agree with you here 99% of the time.

But this one is too on the nose. This might qualify as the 1%.

hotnfresh · on Sept 3, 2023

I’d blame this one on the headline author. You don’t do that by accident.

dang · on Sept 3, 2023

The article title was obviously a troll (I've changed it to the subtitle now). The HN guidelines ask commenters to abstain not only from feeding trolls but also from gobbling their bait:

"Please don't pick the most provocative thing in an article or post to complain about in the thread. Find something interesting to respond to instead."

I know it's a lot to ask, but we have to, given the site mandate, and anyway why not? It's an interesting experiment to work on as a community.

https://news.ycombinator.com/newsguidelines.html

Exuma · on Sept 3, 2023

You have to admit it’s a good one though. Haha. I don’t want this place to become a cesspool like Reddit but once in a blue moon I do need a good laugh

dang · on Sept 3, 2023

Not sure I'd agree it was a "good one" even before this joke was made a million times. Certainly not after.

andrepd · on Sept 3, 2023

Funnily enough, the pun is intentional and the reason that name was chosen.

dang · on Sept 4, 2023

I know; that's one of the components of this endlessly looping cliché.