I don't know the math behind ZKP, but how does this statement work:
> If Peggy didn’t have a solution, there was a chance she’d be caught in this round if Victor had asked for the right set of 9 squares to be revealed.
Doesn't this technique make it a "slightly more than zero knowledge proof", where the amount of knowledge revealed correlates to the probability a proof was obtained? The more knowledge, the more proof?
True, the right guess on a fraud will expose them immediately. But each round that they don't fail is still valid (additional) evidence in favor of them not being a fraud.
(Because a truthful prover must pass all the tests, the probability of them lying is 1 minus the AND of getting lucky every time, and thus decays exponentially.)
The "proof" here refers to "whether Peggy has a solution", and the "zero knowledge" refers to the content of the solution. With each round, you do gain information about the former, but not the latter.
This is correct, and key to making this zero knowledge. But "accumulate" isn't quite the right word, I think. During a single round, by itself, Peggy reveals literally nothing about the actual solution -- no information whatsoever -- so one might wonder, how can a bunch of repeated nothings "accumulate" to anything?
Here's how. If Peggy were to use the same permutation repeatedly, revealing different rows/columns/boxes, then Victor can cross-reference the permuted numbers in one revealed row against the same-permuted numbers in other rows. (E.g. Victor might notice that the top left corner has the same as the bottom center tile, which he didn't know before.) And each new reveal allows for more cross-referencing, and soon Victor can learn a lot. Notice that cross-referencing the numbers within a single revealed row don't tell Victor anything -- no number will appear more than once, so all Victor learns is that all the cells are different, which Victor knew already by the rules of Sudoku.
Of course, Peggy uses a different permutation each round, so each round Victor learns nothing, and the different permutations prevent Victor from cross-referencing numbers from different rounds to learn anything that way either.
> Peggy reveals literally nothing about the actual solution -- no information whatsoever -- so one might wonder, how can a bunch of repeated nothings "accumulate" to anything?
The trick here is in the wording; Peggy does reveal something. She essentially reveals that she _may_ have a correct solution, with some probability. :)
All information revealed is information that Victor knows to be true (and can guess beforehand) assuming that Peggy had a correct solution. The last part is the operative bit here; that still means that something is revealed if we are considering the space of all moves (but nothing is revealed if we're considering the space of all moves of a non-malicious Peggy). These probabilities accumulate and eventually we're reasonably certain that that assumption is true.
This sleight of hand happens because from Peggy's side, Peggy knows that she is non-malicious and only cares about no information being leaked with this assumption in mind. On Bob's side, he has no idea about Peggy and must consider the case that Peggy is malicious. However, he doesn't care about defending from leaks, he would love leaks. He only wants the information that Peggy has a solution. From his side, he needs to obtain the information that Peggy has a solution (which Peggy is already assuming to be true).
Our brains get confused whilst juggling these assumptions and it seems like something was revealed without anything being revealed -- the trick is in the difference of sample spaces and assumptions.
Overall this is one reason why I love ZKP -- it seems like magic.
It's not explicitly stated in the article, but it's very important that we can ask for a whole row or column. If we can only ask for rows, then Peggy can cheat by constructing a "solution" that duplicates a number within a column.
>The procedure can be repeated (with a new permutation each time) to minimize this chance, with Victor asking to reveal a row, column, or 3x3 subsquare each time, until he is certain that Peggy has a solution.
No, you can construct sudoku solutions where the row/column thing is satisfied but not 3x3 (the simplest one is by staggering the rows by 1. subsquares are mentioned in the post (and are a part of the interactive prover)
There seems to be a bug here. Suppose that in a puzzle there is only one 8 and one 9 that do not overlap in any way (row, column, or 3x3 square). Now, suppose that Peggy comes up with a false solution to the problem which sets both of these tiles to 8. Then, neither the 8 nor the 9 is allowed to be picked for the green squares, and so no test of Victor's can disprove Peggy's false solution.
Edit: this seems easily fixable too: rather than the whole green tiles thing, just allow Victor to query either a row/column/3x3 to prove the solution is a valid sudoku, or query the original filled in tiles, to show that the solution corresponds to the original problem.
Good observation, and fixed! As I noted in the issue, I initially was going to go with a fourth "preset" option but I decided against it for some reason. Added it to the code and blog post.
So in this scenario ZKP is just a cipher and a key where the cipher anonymizes the document, but lets the format be validated to a varying degree depending on how much of the encrypted document you are willing to read and a key is given to de-anonymize the document.
> If Peggy didn’t have a solution, there was a chance she’d be caught in this round if Victor had asked for the right set of 9 squares to be revealed.
Doesn't this technique make it a "slightly more than zero knowledge proof", where the amount of knowledge revealed correlates to the probability a proof was obtained? The more knowledge, the more proof?