Research on how to backdoor Diffie-Hellman

tptacek · on March 23, 2016

This is a lot more interesting than I thought it would be from the title.

Obviously, anyone can trivially generate "backdoored" Diffie-Hellman parameters; for instance, just use a bogus generator that locks the algorithm into a tiny subgroup. No matter what private keys Alice or Bob generate, the DH computation will be predictable.

This writer wants to do something cooler: generate NOBUS parameters, in the fashion of Dual_EC DRBG, so that the parameter's authors can predict DH computations but (without Significant effort) nobody else can.

Several of the methods the writer is playing with involve DH parameters for which they can use Pohlig-Hellman to recover secrets. Pohlig-Hellman breaks DH when its victims are unfortunate enough to be in a group with a smooth order --- that is (I'm butchering this), when the group order factors into small primes. It works by collecting samples from the target for each factor of the group; each leaks a bit of the key modulo that factor. You knit the samples together with CRT.

David wrote this up well here: https://www.cryptologie.net/article/196/pohlig-hellman-algor...

Obligatory plug: a fun version of this attack is the kick-off problem on Sean Devlin's very excellent Set 8 of the Cryptopals challenges. Set 8 isn't on the website, but you can mail him (finding him is an exercise for the reader) to get a copy (on the honor system, like the other challenges were before we published). Set 8 builds this attack to an extremely useful attack on Elliptic Curve Diffie-Hellman.

cperciva · on March 24, 2016

Several of the methods the writer is playing with involve DH parameters for which they can use Pohlig-Hellman to recover secrets. Pohlig-Hellman breaks DH when its victims are unfortunate enough to be in a group with a smooth order --- that is (I'm butchering this), when the group order factors into small primes. It works by collecting samples from the target for each factor of the group; each leaks a bit of the key modulo that factor. You knit the samples together with CRT.

I'm not entirely sure how this gets you NOBUS parameters though. You can find any small prime factors of the group order using ECM, so if you're relying on those to let you run Pohlig-Hellman then everybody else can do it too.

(But in any case, you should be locking down your DH implementations to use a fixed nothing-up-my-sleeve group, so even if this is a backdoor it would be a backdoor with flashing neon lights around it.)

pbsd · on March 24, 2016

The idea is to hide the factorization of the group order by using a composite modulus, e.g., p = q . r, where q and r are kept secret. Now the order (q-1)(r-1) is safely protected from ECM (though not from Pollard's p-1, if the order is smooth enough). Better yet, make it p = q . r . s and make each of q, r, and s be safe primes, using index calculus instead of Pohlig-Hellman to find the logs quickly. Yes, I agree that this is as glaring a backdoor as it gets.

For what it's worth, the author missed that interpreting the modulus bytes as 16-bit words in little-endian order does get us a prime, though still not a safe one.

tptacek · on March 24, 2016

I feel like maybe Colin is missing the point a little. Obviously it's a terrible backdoor simply by dint of targeting conventional DH.

But it's a really fun exercise!

baby · on March 24, 2016

> targeting conventional DH

this is targeting ephemeral diffie-hellman. So this would work on any library/private product where you can "hey guys, we should start using 2048bits for DHE, let me suggest this one". This would not work on DH/RSA/... because people usually generate their own parameters for these.

If this remark was pointed at the fact that ECDHE > DHE: I'm not taking the point of view of an implementer trying to implement "new" crypto, I'm trying to take the point of view of a backdoorer (?) trying to push this backdoor in various products/libraries. And it seems like most still use DHE (some stats are in the logjam paper, but I'll do some stats myself out of scans.io when I get time)

baby · on March 24, 2016

the repo is a bit of a mess right now (current research). The p-1 factorization technique doesn't work because of the factor is large enough.

There is also another method that I use that hides the generator's subgroup with a composite modulus.

Both methods are NOBUS and are easy to exploit (no need for index calculus)

> For what it's worth, the author missed that interpreting the modulus bytes as 16-bit words in little-endian order does get us a prime, though still not a safe one.

I'm reading that over and over again and still don't understand what you're pointing at, care to expand?

pbsd · on March 24, 2016

But index calculus is the right way to do it, and it's not particularly difficult to exploit. P-H only puts a quadratic (at best) factor between you and an attacker; index calculus makes that gap subexponential.

Below is the prime number corresponding to the interpretation of the byte array I was speaking of:

    3c6d8e07b9ec437a4177a6c5efbc95994aaf1246780e7d17
    4d7f75538ecebfabf5af04f66d4f37f086734a0b046928ac
    bdbb8f7a43c8d38df5f890a8d66dfc00c6058e9908c6609b
    03bff987e840bc0ecbbfc857cbc822bf941c9259255f5b22
    cea945a2d8a91694b3bfa7bcc1cee388550cb8263e0e46c5
    59a496dff2dccc17

baby · on March 24, 2016

Ahh, you were talking about the Socat modulus. I did miss that! Was it said somewhere or did you find it yourself? I'll check that later but nice finding!

> But index calculus is the right way to do it, and it's not particularly difficult to exploit. P-H only puts a quadratic (at best) factor between you and an attacker; index calculus makes that gap subexponential.

Right now it just doesn't make sense to use index calculus. Method1 is to use a hidden (by the composite modulus) subgroup of small order for the generator. So for a 1024bits modulus, you would either try to do index calculus in a 512 bits modulus or do something like Pollard Rho with a small order.

For the second method you use Pohlig-Hellman on small orders as well, so it wouldn't make sense to use index calculus on a 512 bits modulus again.

Now imagine a 2048bits main modulus n=pq, the problem is now split into two 1024bits modulus p and q. There is no way you can use IC there. But the method I "explained" previously will work. The backdoor will be a NOBUS.

pbsd · on March 24, 2016

You're misrepresenting my suggestion. The idea is to choose the factor size according to the complexity of ECM; the modulus size is irrelevant. So for a 2^80 security level, you choose factors of ~341 bits; relaxing the security to a 2^64-ish place, you can get away with 256-bit factors. This makes index calculus very quick, even on modest hardware, especially considering that the individual logarithm phase---the part you need to do in real-time---is much faster than the whole thing.

baby · on March 24, 2016

I see what you mean, but I don't see any reason to use NFS on the maximum sized group when you can reduce the size of that group (by using a generator of a "small" subgroup modulo each factor) and use a quicker Pollard Rho modulo each factors. You already have a significant advantage and you don't need any heavy pre-computations with NFS.

By doing it your way, you do increase both my computation and the reverser's computation who tries to do Pollard Rho on the full subgroup size. Which is good, but what I found is that this is not necessary as doing it my way is already too hard for someone with no knowledge of the factors to use.

cperciva · on March 24, 2016

Ah, I missed the "composite modulus" bit. Right, that could provide NOBUS, at the expense of being even more glaringly obvious.

baby · on March 24, 2016

> this is as glaring a backdoor as it gets

which lasted a year in Socat. I wonder how long this would last in a closed source project.

_hyn3 · on March 24, 2016

Link to more details on what the OP is working on:

https://github.com/mimoo/Diffie-Hellman_Backdoor/tree/master...

yeukhon · on March 23, 2016

I am always interested in knowing how cryptographers come up with these big big big big numbers?

https://github.com/mimoo/Diffie-Hellman_Backdoor/blob/master...

They appear to be mysterious, magic numbers in every single crypto implementation. And many are written in byte form. I mean how can people comprehend them?

tptacek · on March 23, 2016

People generate large primes using programs designed to do that, and maybe generators/primitive roots for those primes using a CAS like Sage or Octave/Matlab. The same goes for ECC crypto, with a lot of extra testing/filtering to find "nice" constants.

A lot of the other giant numbers you see in cryptosystems are random numbers derived from other constants.

Sometimes there are things to learn by studying the numbers themselves. For instance, the prime field in Curve25519 is a special prime with (bit-level) characteristics that make it especially nice to do math on with general-purpose computers. Other times, the numbers are, for instance, just giant random primes, and aren't themselves very interesting.

yeukhon · on March 23, 2016

I see. Thank you, tptacek.

> People generate large primes using programs designed to do that

Do you mind to name one such program?

tptacek · on March 23, 2016

Well, anything that does RSA, right?

An easy way to do this yourself is to jump into a Ruby IRB and do "OpenSSL::BN.generate_prime(512).to_s".

dunham · on March 24, 2016

If I remember right, this generates a number that is probably prime. :)

Filligree · on March 24, 2016

For values of "probably" that are effectively "certainly", yes.

y04nn · on March 23, 2016

Wikipedia gives some information.

https://en.wikipedia.org/wiki/Primality_test

https://en.wikipedia.org/wiki/Fermat%27s_little_theorem#Gene...

egonschiele · on March 23, 2016

"This repo contains some research I'm currently doing on how to bakdoor Diffie-Hellman". Is this a working backdoor, or a work-in-progress?

noobie · on March 23, 2016

It works.

https://youtu.be/W8vBm6oXlEM