Interactive Sudoku Zero-Knowledge Proof

dewitt · on Aug 10, 2016

I don't know the math behind ZKP, but how does this statement work:

> If Peggy didn’t have a solution, there was a chance she’d be caught in this round if Victor had asked for the right set of 9 squares to be revealed.

Doesn't this technique make it a "slightly more than zero knowledge proof", where the amount of knowledge revealed correlates to the probability a proof was obtained? The more knowledge, the more proof?

SilasX · on Aug 10, 2016

True, the right guess on a fraud will expose them immediately. But each round that they don't fail is still valid (additional) evidence in favor of them not being a fraud.

(Because a truthful prover must pass all the tests, the probability of them lying is 1 minus the AND of getting lucky every time, and thus decays exponentially.)

The "proof" here refers to "whether Peggy has a solution", and the "zero knowledge" refers to the content of the solution. With each round, you do gain information about the former, but not the latter.

jobigoud · on Aug 10, 2016

It should be noted that at each round Peggy performs a different permutation, so Victor cannot accumulate knowledge about Peggy's solution.

kevhito · on Aug 10, 2016

This is correct, and key to making this zero knowledge. But "accumulate" isn't quite the right word, I think. During a single round, by itself, Peggy reveals literally nothing about the actual solution -- no information whatsoever -- so one might wonder, how can a bunch of repeated nothings "accumulate" to anything?

Here's how. If Peggy were to use the same permutation repeatedly, revealing different rows/columns/boxes, then Victor can cross-reference the permuted numbers in one revealed row against the same-permuted numbers in other rows. (E.g. Victor might notice that the top left corner has the same as the bottom center tile, which he didn't know before.) And each new reveal allows for more cross-referencing, and soon Victor can learn a lot. Notice that cross-referencing the numbers within a single revealed row don't tell Victor anything -- no number will appear more than once, so all Victor learns is that all the cells are different, which Victor knew already by the rules of Sudoku.

Of course, Peggy uses a different permutation each round, so each round Victor learns nothing, and the different permutations prevent Victor from cross-referencing numbers from different rounds to learn anything that way either.

Manishearth · on Aug 11, 2016

> Peggy reveals literally nothing about the actual solution -- no information whatsoever -- so one might wonder, how can a bunch of repeated nothings "accumulate" to anything?

The trick here is in the wording; Peggy does reveal something. She essentially reveals that she _may_ have a correct solution, with some probability. :)

All information revealed is information that Victor knows to be true (and can guess beforehand) assuming that Peggy had a correct solution. The last part is the operative bit here; that still means that something is revealed if we are considering the space of all moves (but nothing is revealed if we're considering the space of all moves of a non-malicious Peggy). These probabilities accumulate and eventually we're reasonably certain that that assumption is true.

This sleight of hand happens because from Peggy's side, Peggy knows that she is non-malicious and only cares about no information being leaked with this assumption in mind. On Bob's side, he has no idea about Peggy and must consider the case that Peggy is malicious. However, he doesn't care about defending from leaks, he would love leaks. He only wants the information that Peggy has a solution. From his side, he needs to obtain the information that Peggy has a solution (which Peggy is already assuming to be true).

Our brains get confused whilst juggling these assumptions and it seems like something was revealed without anything being revealed -- the trick is in the difference of sample spaces and assumptions.

Overall this is one reason why I love ZKP -- it seems like magic.

jlebar · on Aug 11, 2016

It's not explicitly stated in the article, but it's very important that we can ask for a whole row or column. If we can only ask for rows, then Peggy can cheat by constructing a "solution" that duplicates a number within a column.

heinrich5991 · on Aug 11, 2016

I think it is explicitly mentioned here:

>The procedure can be repeated (with a new permutation each time) to minimize this chance, with Victor asking to reveal a row, column, or 3x3 subsquare each time, until he is certain that Peggy has a solution.

Manishearth · on Aug 11, 2016

Yeah, the first round in the post is just an example. General rounds can ask for anything.

jstanley · on Aug 11, 2016

Is it then also important that we are able to ask for a whole 3x3 square? Or does the availability of rows+columns cover this case implicitly?

Manishearth · on Aug 11, 2016

No, you can construct sudoku solutions where the row/column thing is satisfied but not 3x3 (the simplest one is by staggering the rows by 1. subsquares are mentioned in the post (and are a part of the interactive prover)

dooglius · on Aug 11, 2016

There seems to be a bug here. Suppose that in a puzzle there is only one 8 and one 9 that do not overlap in any way (row, column, or 3x3 square). Now, suppose that Peggy comes up with a false solution to the problem which sets both of these tiles to 8. Then, neither the 8 nor the 9 is allowed to be picked for the green squares, and so no test of Victor's can disprove Peggy's false solution.

Edit: this seems easily fixable too: rather than the whole green tiles thing, just allow Victor to query either a row/column/3x3 to prove the solution is a valid sudoku, or query the original filled in tiles, to show that the solution corresponds to the original problem.

Manishearth · on Aug 11, 2016

Good observation, and fixed! As I noted in the issue, I initially was going to go with a fourth "preset" option but I decided against it for some reason. Added it to the code and blog post.

nurettin · on Aug 11, 2016

So in this scenario ZKP is just a cipher and a key where the cipher anonymizes the document, but lets the format be validated to a varying degree depending on how much of the encrypted document you are willing to read and a key is given to de-anonymize the document.

ianpurton · on Aug 11, 2016

That's not how I understand it.

More like it's a way to prove you have a solution to a problem without revealing the solution.

Manishearth · on Aug 11, 2016

That's nowhere near being "just" a cipher :)

"A cipher that lets the format be validated" is precisely what a ZKP is, though nobody calls it a cipher.

(Cryptographic signature is a simpler form of ZKP though, where the proof doesn't require any back-and-forth)

dmichulke · on Aug 11, 2016

My problem with the approach is this:

How can Victor detect that Peggy didn't just send some random numbers where the previously known fields are still consistent?

jlebar · on Aug 11, 2016

That's the bit about the Peggy sending Victor a commitment.