That piece of CSS appeared relatively recently. I guess it's a weird side effect: that page was supposed to annoy web hipsters, not challenge them to restore vintage tag behavior.
They're probably referring to the bug/"feature" in Linux's implementation, which will happily return data before the system has gathered enough entropy, instead of the sane behavior of blocking only until it has been seeded for the first time.
Fortunately major distros work around this bug, so it's only an issue in unusual cases, like cloud VMs.
That link lays out the single most important technical detail:
> FreeBSD’s kernel crypto RNG doesn’t block regardless of whether you use /dev/random or urandom. Unless it hasn’t been seeded, in which case both block. This behavior, unlike Linux’s, makes sense. Linux should adopt it.
It boils down to a tricky and potentially misleading interface. Abstractions are leaky beasts, and if there are many ways to get apparently identical results, we will use the one that most closely aligns with our usual way of thinking.
Security is hard to get right. Cryptographic security depends on entropy, so getting sufficient entropy should be hard too. Right?
Maybe the default answer should be "yeah, right" instead.
It depends what type of random number you need. Having one interface isn't sufficient to describe the types of random applications need. A load balancing system probably doesn't have the same requirements as the RSA private key generation algorithm.
Would it help to think of it as a (kernel) RNG daemon that you're trying to connect to, that doesn't finish starting up until it's seeded? That's basically what the blocking means, in the OpenBSD case.
Which is the correct behavior. `/dev/urandom` should really be the only source of randomness on Linux. Mac [0] got this right, FreeBSD [1] gets this right. I totally agree with sockpuppet. Solving the tabula rasa system boot is a separate issue. Temporarily blocking for seeding is fine; my "shouldn't" was an RFC-style SHOULD NOT.
Should your filesystem just start returning a stream of nulls or other deterministic data if the device isn't ready yet? If there's no entropy to pull from, then the kernel shouldn't try to pretend that there is.
/dev/urandom is a perfectly fine source of machine-generated randomness. To break /dev/urandom (but not /dev/random), you need to find a corner case where you can break the cryptographically secure random number generator (CSPRNG) only when you can make certain guesses about its seed values. Since recovering those seed values from the CSPRNG's output is itself a hard problem, and predicting what the CSPRNG will eventually output is also a hard problem, this is pretty unlikely.
But let's quote the kernel source on the matter[1], just to be clear:
> The two other interfaces are two character devices /dev/random and /dev/urandom. /dev/random is suitable for use when very high quality randomness is desired (for example, for key generation or one-time pads), as it will only return a maximum of the number of bits of randomness (as estimated by the random number generator) contained in the entropy pool.

> The /dev/urandom device does not have this limit, and will return as many bytes as are requested. As more and more random bytes are requested without giving time for the entropy pool to recharge, this will result in random numbers that are merely cryptographically strong. For many applications, however, this is acceptable.
The real point to be made here is that, yes, /dev/random is theoretically better - but for many applications, letting /dev/random hang to wait for entropy is worse than having /dev/urandom use a CSPRNG in a way that is generally recognized to be secure.
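For concreteness, here is a minimal sketch (mine, not from the article or the kernel) of how an application might pull bytes from /dev/urandom, coping with short reads and interrupted system calls:

    /* Minimal sketch: read n bytes of randomness from /dev/urandom,
     * handling short reads and EINTR. */
    #include <errno.h>
    #include <fcntl.h>
    #include <stddef.h>
    #include <unistd.h>

    int urandom_bytes(unsigned char *buf, size_t n)
    {
        int fd = open("/dev/urandom", O_RDONLY);
        if (fd == -1)
            return -1;

        size_t got = 0;
        while (got < n) {
            ssize_t r = read(fd, buf + got, n - got);
            if (r > 0) {
                got += (size_t)r;
            } else if (r == -1 && errno == EINTR) {
                continue;            /* interrupted by a signal; retry */
            } else {
                close(fd);
                return -1;           /* real error (or unexpected EOF) */
            }
        }
        close(fd);
        return 0;
    }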
I would like to add that the original article is talking about using /dev/urandom to generate long-lived keys, not session keys or similar. In that case, blocking to gather entropy is sometimes acceptable, since the fact that the key is long-lived implies that you don't do this very often. The argument for /dev/urandom only carries weight when you are making a tradeoff for non-blocking behavior (which is 99% of the time). As such, there is nothing wrong with being slightly paranoid and using /dev/random if you can afford the time spent collecting entropy.
My understanding is that /dev/urandom is perfectly fine for almost all cases, including session keys and such, but (if only for the sake of paranoia) for long-lived keys it is worth the potential extra time waiting for /dev/random to serve what you need.
The key thing to remember is that when the pool has sufficient entropy there is no difference between /dev/random and /dev/urandom, and when the pool is low there is practically no difference either: the quality of the PRNG means it is practically impossible to tell the two outputs apart (take a few thousand bits from each at a time and see if any statistical analysis can reliably tell the difference).
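If you want to try that yourself, here is a toy sketch of the idea (my own code; a large read from /dev/random may block for a while, and a serious comparison would need a full statistical battery such as dieharder, which still cannot distinguish a good CSPRNG from true randomness):

    /* Toy illustration: count set bits in a few thousand bits from each
     * device.  This is nowhere near a real statistical test. */
    #include <stdio.h>

    static long ones_in(const char *path, size_t nbytes)
    {
        FILE *f = fopen(path, "rb");
        if (!f)
            return -1;
        long ones = 0;
        for (size_t i = 0; i < nbytes; i++) {
            int c = fgetc(f);
            if (c == EOF)
                break;
            for (int b = 0; b < 8; b++)
                ones += (c >> b) & 1;
        }
        fclose(f);
        return ones;
    }

    int main(void)
    {
        size_t nbytes = 1024;   /* 8192 bits from each device */
        printf("/dev/random:  %ld ones\n", ones_in("/dev/random", nbytes));
        printf("/dev/urandom: %ld ones\n", ones_in("/dev/urandom", nbytes));
        return 0;
    }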
It is increasingly common for CPUs and/or related chipsets to have a built-in TRNG, so keeping the entropy pool "topped up" is getting easier by feeding the pool from those using rng-tools. The SoC the RPi is based around has an RNG that pushes out more than 500 kbit/s, for instance.
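For reference, what rng-tools' rngd does boils down to roughly this (a rough sketch of my own, assuming a /dev/hwrng device exposed by the kernel's hw_random driver; it needs root, and real rngd also health-tests the hardware output before crediting it):

    /* Read bytes from a hardware RNG and credit them to the kernel
     * entropy pool via the RNDADDENTROPY ioctl. */
    #include <fcntl.h>
    #include <linux/random.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    int main(void)
    {
        int hw = open("/dev/hwrng", O_RDONLY);
        int rnd = open("/dev/random", O_WRONLY);
        if (hw == -1 || rnd == -1)
            return 1;

        unsigned char buf[32];
        struct rand_pool_info *info = malloc(sizeof(*info) + sizeof(buf));
        if (!info)
            return 1;

        for (;;) {
            ssize_t r = read(hw, buf, sizeof(buf));
            if (r <= 0)
                break;
            info->entropy_count = (int)r * 8;   /* credit in bits */
            info->buf_size = (int)r;            /* payload size in bytes */
            memcpy(info->buf, buf, (size_t)r);
            if (ioctl(rnd, RNDADDENTROPY, info) == -1)
                break;
        }
        return 0;
    }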
Your understanding is common, is stated explicitly in the manpage, and is unfortunately incorrect.
/dev/random and /dev/urandom both even use the same CSPRNG behind the scenes. The former tries to maintain a count of the estimated entropy, but this is a meaningless distinction. CSPRNGs can't run out of entropy (for instance, a stream cipher is essentially a non-reseeded CSPRNG that works by generating an arbitrarily long sequence of computationally random bits that can be XORed against a plaintext).
There might be a meaningful distinction if /dev/random provided "true" randomness (and could therefore be used for something like an OTP). But it doesn't. Both use the same CSPRNG algorithm.
I understand that both use the same CSPRNG and the same seed source(s) for entropy; the difference is that one will block if those sources have not produced enough input recently (the "pool count" is too low).
There is some genuine randomness there, as the entropy sources are (unlike the PRNG) not deterministic: they take whitened fractional values from I/O timings (time between key presses and mouse events, and some aspects of physical drive I/O; the low bits of such timings are essentially random noise if the timer is granular enough).
/dev/urandom uses the CSPRNG in whatever state it is in; /dev/random waits until it considers the CSPRNG to have been sufficiently randomly reseeded. In cases where the current state is considered random enough (the pool count is high, so /dev/random will not block) you get output of the same quality from either /dev/random or /dev/urandom.
That assumes it has been seeded with enough entropy, though. If you have just booted and haven't gathered/seeded enough entropy yet, then /dev/urandom can potentially give you predictable values, whereas /dev/random would be safer as it waits until it has enough entropy.
Too bad there isn't a way to tell whether the CSPRNG has been seeded or not.
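For what it's worth, the getrandom(2) syscall that recently landed for Linux 3.17 is meant to expose exactly that: with GRND_NONBLOCK it fails with EAGAIN until the kernel CSPRNG has been seeded. A minimal sketch, assuming a libc new enough to ship <sys/random.h>:

    #include <errno.h>
    #include <stdio.h>
    #include <sys/random.h>

    int main(void)
    {
        unsigned char buf[16];
        ssize_t r = getrandom(buf, sizeof(buf), GRND_NONBLOCK);

        if (r == (ssize_t)sizeof(buf))
            printf("CSPRNG is seeded; got %zd bytes\n", r);
        else if (r == -1 && errno == EAGAIN)
            printf("CSPRNG not yet seeded\n");
        else
            perror("getrandom");
        return 0;
    }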
If you are being paranoid you might prefer to wait for ever for a good random value instead of accepting something you are even fractionally less sure of.
Though practically speaking, that would probably not be acceptable in most (if not all) circumstances.
If you are that paranoid then there are inexpensive true RNGs out there (free, in fact, if your CPU or other chipsets have one that is easily accessible) which can provide enough bits for all but the larger bulk requirements (i.e. generating many keys in a short space of time). You can either use one of them directly for the process(es) that definitely want truly random bits, or feed their output into the standard entropy pool.
Probably? It's fine to use /dev/urandom as a seed for random number generators, and for most applications it is safe. But I think within SSL/TLS implementations, there could be reasons to use their own cryptographic PRNG. For one thing, it's easier to reason about in a platform independent way. On modern Linux kernels, /dev/urandom is Probably Safe(tm). But what about everything else? That's where it gets murkier.
> On modern Linux kernels, /dev/urandom is Probably Safe(tm). But what about everything else? That's where it gets murkier.
No. That argument is exactly why I didn't just use /dev/urandom in PyCrypto's userspace RNG when I wrote it in 2008. The result was 5 years of a catastrophic failure in certain cases where fork() is used, even though I specifically designed it to cope with fork(). If someone hadn't made that argument, PyCrypto wouldn't have had a catastrophic failure mode that went undetected for 5 years until I stumbled across it: CVE-2013-1445 http://www.openwall.com/lists/oss-security/2013/10/17/3
It is surprisingly difficult to implement a fast, reliable CSPRNG in a crypto library. There are innumerable things that can leak or corrupt your state, which compromises everything. You can leak state as a result of multithreading, fork(), signal-handling, etc., and libraries generally can't cope with that without having complicated APIs that application developers WILL misuse, causing silent security failures for end-users that go unnoticed for years. Plus, since you're still relying on /dev/urandom anyway, it really only gives you another way to fail.
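To make the fork() hazard concrete, here is a toy demonstration (not PyCrypto's code; the xorshift generator is just a stand-in for a real CSPRNG's state): after fork(), parent and child hold identical RNG state and emit identical "random" output unless the library notices the fork and reseeds.

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static uint64_t state;                 /* userspace RNG state */

    static void rng_seed(void)
    {
        int fd = open("/dev/urandom", O_RDONLY);
        read(fd, &state, sizeof(state));   /* error handling omitted */
        close(fd);
    }

    static uint64_t rng_next(void)         /* xorshift64, NOT a CSPRNG */
    {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        return state;
    }

    int main(void)
    {
        rng_seed();
        pid_t pid = fork();
        /* Both processes now hold the same state, so both print the
         * same "random" value. */
        printf("%s: %016llx\n", pid == 0 ? "child " : "parent",
               (unsigned long long)rng_next());
        if (pid > 0)
            wait(NULL);
        return 0;
    }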
Arguably, there are so few people who understand this stuff that---at least in the FOSS world---we should kill off all but one implementation, so that the few of us who collectively understand how this stuff really works can focus on that one implementation.
Your point seems valid, but from my reading the article appears to be explicitly describing Linux's /dev/urandom as a poor source of entropy.
And it seems to be implying that while use of the arc4random_buf function is platform-independent, its implementation is permitted to be platform-specific.
That big-endian bug is a perfect example of OCD in coding.
"I can't ever reach that code path so I'll remove it".
I have caught myself doing this in some cases. I once removed a test to see if the CSPRNG was actually working, because the test coverage showed that I could never reach that code path. I then realised that it needed to be there: otherwise, if the CSPRNG ever stopped working, the code wouldn't know about it and might start using streams of zeroes as its entropy.
Sometimes you need to remember that hardware can fail, or be compromised, even though in most cases it will just cause the program to crash.
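The kind of check being described is something like this sketch (my own; a real implementation would use a proper continuous health test, not just a crude "is it all zeroes" comparison):

    #include <fcntl.h>
    #include <stdlib.h>
    #include <unistd.h>

    void get_entropy_or_die(unsigned char *buf, size_t n)
    {
        int fd = open("/dev/urandom", O_RDONLY);
        if (fd == -1)
            abort();                    /* no entropy source at all */
        if (read(fd, buf, n) != (ssize_t)n)
            abort();                    /* short read or error: source is broken */
        close(fd);

        /* A working RNG returning all zeroes is astronomically unlikely;
         * treat it as a failed or compromised source rather than use it. */
        int all_zero = 1;
        for (size_t i = 0; i < n; i++)
            if (buf[i] != 0)
                all_zero = 0;
        if (all_zero)
            abort();
    }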
Would it be possible to add system tests for some/all of these problems?
e.g. a test which calls explicit_bzero() in a way which would have it optimised out in a platform with a low-quality port.
A reasonably descriptive comment in the header (or failure text) of the test should guide a porter onto the path of wisdom.
(If there is a problem in that the test would need to inspect the output of explicit_bzero(), hence negating the optimisation, it can be implemented as multiple processes).
How does the other process inspect the memory at the right time? How do you know all the scenarios where some compiler would optimize things out? It doesn't sound like it'd be easy to do a portable & reliable test.
Testing that your entropy source is good sounds harder still.
reallocarray() should be pretty easy to test though.
But how many potential issues will your tests miss? If we had perfect test coverage for everything (and the tests were perfect, or we had test for them...), all software would be 100% bug-free.
Tests might not hurt, but I am not sure trying to cater for braindead porters is a good idea. They might get the idea they're doing it right once they get the tests to pass one way or another...
Reasonably descriptive commentary on the mentioned functions is there in the man pages. That is where porters should look.
> How does the other process inspect the memory at the right time?
There must be some side effect of the optimisation, otherwise there wouldn't be a problem. Detect that side effect (e.g. write a buffer of memory to disk, check timing of some code, ptrace-attach to the other process and inspect it, trigger a core dump and pick over the bones, code up an exploit which would work if the explicit_bzero() wasn't present)
> How do you know all the scenarios where some compiler would optimize things out?
You only really need to know one case where the compiler will optimise it out, if the prevent-optimisation compiler magic isn't sprinkled on it.
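For example (a sketch of my own, not OpenBSD's exact code): dead-store elimination lets gcc and clang drop a memset of a buffer that is never read again, which is exactly why explicit_bzero exists, and one common way a port defeats it is an empty asm statement with a "memory" clobber.

    #include <string.h>

    void handle_secret(void)
    {
        char key[32];
        /* ... use key ... */
        memset(key, 0, sizeof(key));   /* dead store: may be optimised away at -O2 */
    }

    /* One countermeasure: force the compiler to assume the buffer is
     * still observable after the memset. */
    void my_explicit_bzero(void *p, size_t n)
    {
        memset(p, 0, n);
        __asm__ __volatile__("" : : "r"(p) : "memory");
    }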
> Tests might not hurt, but I am not sure trying to cater for braindead porters is a good idea. They might get the idea they're doing it right once they get the tests to pass one way or another...
At least they'd get an idea that something was up. I guess you might get away with:
    #ifndef OPENBSD
    #error "You can't just call bzero() for explicit_bzero() - see http://good-description-here for why not"
    #endif
> At least they'd get an idea that something was up.
If they are up for the job, they get that idea when they try to compile the thing and it doesn't because they are missing a function. They will read that function's documentation, and understand it; they may even take a peek at the implementation too, before porting it or implementing their own.
Call me smug but I think porting security sensitive software should be left to people who have a clue. If you have to litter the code with hints and education for people who don't know what they are doing, then you end up with a port that was done by someone who seemed like he might know what he's doing, when there's a good chance that he doesn't. I would rather be able to immediately recognize ports made by people who obviously don't have a clue. So I know what to avoid...
I am all for education, by the way. There are good secure coding guides out there, though having more wouldn't hurt. I just don't believe the approach you proposed is a good one.
"You only really need to know one case where the compiler will, if the prevent-optimisation compiler magic isn't sprinkled on it."
If you do that, the best you can get is a test that works with one specific version of one specific compiler used with one specific set of compiler flags. It probably is easier to just inspect the resulting binary.
And you may not even get that, as the optimizer may use some fairly complex heuristics to choose whether to optimize away a call, such as register pressure (for example, in a three level deep for loop, it may not make sense to try and get extra stuff into registers)
The process under test could send SIGSTOP to itself at the right time, and a parent process could use one of the wait() variants to notice (or the test process could send SIGUSR1 to the monitor).
How many systems today get calloc() wrong? I checked the implementation of quite a few open source implementations a year or two back, and I don't recall seeing one get it wrong.
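The thing a correct calloc() (and OpenBSD's reallocarray()) has to get right is rejecting nmemb * size when the multiplication would overflow, instead of silently allocating a too-small buffer. A simplified sketch, not OpenBSD's exact code:

    #include <errno.h>
    #include <stdint.h>
    #include <stdlib.h>

    void *my_reallocarray(void *ptr, size_t nmemb, size_t size)
    {
        if (nmemb != 0 && size > SIZE_MAX / nmemb) {
            errno = ENOMEM;             /* nmemb * size would overflow */
            return NULL;
        }
        return realloc(ptr, nmemb * size);
    }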
I get the logic in the post, but there are concerns that FRP256v1 was weakened in the standard similarly to the FIPS curves, so I'm not sure that is good reasoning. Also, I am unsure whether the libressl/openssl implementation even has good small-subgroup attack defenses.
Are you suggesting they remove all curves that may be tainted and ship without them? Thereby forcing application developers that do want to use them to implement each and every single one themselves?
I suppose so, but I'd rather people not use anything other than Goldilocks or 41417. I'm hoping that for those applications, if they are forced to use something like p=192, they ignore the ECC option entirely, don't code it, and fall back to some interoperable DSA or RSA scheme instead in whatever protocol it may be. Maybe there is some case where that is not possible?
How about seeming to ship them, but when you try to compile with them, getting an error containing a link to a page that explains why you shouldn't be using them?
From http://www.libressl.org/
"LibreSSL is primarily developed by the OpenBSD Project, and its first inclusion into an operating system will be in OpenBSD 5.6. "
They run every version of OpenBSD on every machine they support, including 32-bit SPARC, hp300 and SGI. By running on all those machines they uncover subtle bugs that are made evident by architecture differences.
That wouldn't have caught Heartbleed, wouldn't have caught a vulnerability like the one in Apple's TLS implementation, wouldn't have caught... Basically, testing that your software works in normal operation isn't enough to ensure it's secure, you need to explicitly test its behaviour under attack.
Actually, OpenBSD did have things in place that would have caught Heartbleed. OpenSSL went out of their way to create a situation that defeated them.
Look, the whole OpenSSL debacle comes down to the fact that OpenSSL has ONE programmer reliably working on it. LibreSSL now has 5x-10x the manpower that was working on OpenSSL, and that's STILL probably low by an order of magnitude.
Google by itself should pledge 5 people to work on LibreSSL. They clearly have them, since one of their internal audits uncovered Heartbleed.
The thing is, nobody at these companies actually cared until the NSA started spying on them.
All OpenBSD developers work on -current and commonly on multiple platforms. Snapshots are rolled continuously for most platforms and made available to anyone who wants to run the latest code without having to build it themselves. The entire ports tree is compiled regularly on -current too. The compiled packages are then made available.
A bit unfair that this was downvoted. Why does the hive mind collectively think this is an OK state for LibreSSL/OpenSSL, a critical component of internet security, to be in?
What does "testing" mean in the LibreSSL/OpenSSL situation anyway? It compiles? A regression suite passes? Manual verification?
Battle testing sounds like something you'd do to a new implementation. But so far there's very little new in LibreSSL; it's just cleanups and bugfixes. Do you battle test dead code removals and bug fixes?
If you look at the portable versions of their products they tend to ship a chunk of the OpenBSD library implementation with them to give consistency guarantees.
Perhaps we need a consistent OpenBSD platform abstraction layer that gives solid guarantees?
The problem with porting POSIX code to Win32/64 starts with Windows not being POSIX, which causes a lot of problems by itself.
Windows lacks a lot of fundamental equivalents to Unix-like system calls. E.g. Windows has no equivalent to fork(); instead you need to use something like spawn() and do some tricky memory cloning to get the same effect.
In fact, one could say there is no port of OpenSSL for Windows. It hasn't been updated since 2004, and lacks 64-bit support.
OpenSSL doesn't need the fork system call and it already builds just fine on Win32 and Win64. You can even get pre-built binaries of the latest release: