Never mind -- the cited report doesn't actually indicate a problem
with consecutive 32-bit RDRAND outputs; the output had been passed
through sort and cherry-picked before it was put in a comment, and
that was not mentioned until much later.

# HG changeset patch
# User Taylor R Campbell
# Date 1610403538 0
#      Mon Jan 11 22:18:58 2021 +0000
# Branch trunk
# Node ID a5c97fbff1662a2ee4d6b0025b56112d425fb29e
# Parent  5ccab6038c52ea3019d3defa5366332e4fa38b64
# EXP-Topic riastradh-uhidev
x86: Detect a new flavour of AMD RDRAND bug.

Reported here:

https://github.com/systemd/systemd/issues/18184

While here:
- Add a citation for the 2019 AMD RDRAND bug -- best one I could
  find; couldn't find any errata from AMD.
- Fix bug in clamping count of how many more iterations we need to
  satisfy the request.

The new bug might actually just effectively halve the entropy.  But
until there's more information from AMD on the subject, let's treat
it as zero.

diff -r 5ccab6038c52 -r a5c97fbff166 sys/arch/x86/x86/cpu_rng.c
--- a/sys/arch/x86/x86/cpu_rng.c	Sun Jan 10 18:19:10 2021 +0000
+++ b/sys/arch/x86/x86/cpu_rng.c	Mon Jan 11 22:18:58 2021 +0000
@@ -261,25 +261,68 @@ static void
 cpu_rng_get(size_t nbytes, void *cookie)
 {
 #define N howmany(256, 64)
-	uint64_t buf[2*N];
+	union {
+		uint64_t u64[2*N];
+		uint32_t u32[4*N];
+	} u;
 	unsigned i, nbits = 0;
+	uint32_t m0, m1;
 
 	while (nbytes) {
+		/* Draw output from RDRAND/RDSEED. */
+		for (i = 0; i < __arraycount(u.u64); i++)
+			nbits += cpu_rng(cpu_rng_mode, &u.u64[i]);
+
 		/*
-		 * The fraction of outputs this rejects in correct
-		 * operation is 1/2^256, which is close enough to zero
-		 * that we round it to having no effect on the number
-		 * of bits of entropy.
+		 * Apply simple repetition tests to detect two
+		 * different AMD RDRAND bugs:
+		 *
+		 *	https://arstechnica.com/gadgets/2019/10/how-a-months-old-amd-microcode-bug-destroyed-my-weekend/
+		 *	https://github.com/systemd/systemd/issues/18184
+		 *
+		 * Of the 2^512 possible outputs, we raise an alarm for
+		 * 2^256 + 2^256 + 2^288 outputs that indicate broken
+		 * CPU RNG. This is such a negligible fraction of the
+		 * 2^512 possible outputs that we round it to having no
+		 * effect on the entropy when the tests pass -- but the
+		 * tests are 100% guaranteed to detect the documented
+		 * RDRAND bugs.
 		 */
-		for (i = 0; i < __arraycount(buf); i++)
-			nbits += cpu_rng(cpu_rng_mode, &buf[i]);
-		if (consttime_memequal(buf, buf + N, N)) {
-			printf("cpu_rng %s: failed repetition test\n",
+
+		/*
+		 * Check for a 256-bit repeated pair. False alarm
+		 * rate: 2^256/2^512 = 1/2^256.
+		 */
+		if (consttime_memequal(u.u64, u.u64 + N, N)) {
+			printf("cpu_rng %s: failed 256-bit repetition test\n",
 			    cpu_rng_name[cpu_rng_mode]);
 			nbits = 0;
 		}
-		rnd_add_data_sync(&cpu_rng_source, buf, sizeof buf, nbits);
-		nbytes -= MIN(MIN(nbytes, sizeof buf), MAX(1, 8*nbits));
+
+		/*
+		 * Check for eight consecutive 32-bit repeated pairs.
+		 * False alarm rate: (2^32)^8/2^512 = 1/2^256.
+		 */
+		for (m0 = 0, i = 0; i < __arraycount(u.u32); i += 2)
+			m0 |= u.u32[i] ^ u.u32[i + 1];
+
+		/*
+		 * In case the output queue was misaligned, check for
+		 * seven consecutive 32-bit repeated pairs in the
+		 * middle, ignoring the 32-bit ends. False alarm rate:
+		 * (2^32)^2 (2^32)^7 / 2^512 = 1/2^224.
+		 */
+		for (m1 = 0, i = 1; i < __arraycount(u.u32) - 1; i += 2)
+			m1 |= u.u32[i] ^ u.u32[i + 1];
+
+		if (m0 == 0 || m1 == 0) {
+			printf("cpu_rng %s: failed 32-bit repetition test\n",
+			    cpu_rng_name[cpu_rng_mode]);
+			nbits = 0;
+		}
+
+		rnd_add_data_sync(&cpu_rng_source, u.u64, sizeof u.u64, nbits);
+		nbytes -= MIN(MIN(nbytes, sizeof u.u64), MAX(1, nbits/8));
 	}
 #undef N
 }
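
For anyone who wants to poke at the three repetition tests outside the kernel, here is a minimal userland sketch of the same checks. It is not part of the patch: the names NWORDS32 and repetition_alarm are hypothetical, the 512-bit sample is viewed directly as sixteen 32-bit words rather than through the union, and plain memcmp stands in for consttime_memequal, so it is only meant to illustrate which sample shapes raise an alarm.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name: number of 32-bit words in one 512-bit sample. */
#define NWORDS32 16

/*
 * Hypothetical helper, not from the patch.  Returns 1 if any of the
 * three repetition tests fires on the sample, 0 otherwise.
 */
static int
repetition_alarm(const uint32_t w[NWORDS32])
{
	uint32_t m0 = 0, m1 = 0;
	size_t i;

	/* 256-bit repeated pair: all 32 bytes of each half are equal. */
	if (memcmp(w, w + NWORDS32/2, (NWORDS32/2) * sizeof w[0]) == 0)
		return 1;

	/* Eight aligned pairs: (w0,w1), (w2,w3), ..., (w14,w15). */
	for (i = 0; i < NWORDS32; i += 2)
		m0 |= w[i] ^ w[i + 1];

	/* Seven misaligned pairs: (w1,w2), ..., (w13,w14); ends ignored. */
	for (i = 1; i < NWORDS32 - 1; i += 2)
		m1 |= w[i] ^ w[i + 1];

	/* A mask of zero means every pair in that pass was identical. */
	return m0 == 0 || m1 == 0;
}
```

Under these assumptions, a sample whose words are 1,1,2,2,...,8,8 trips the aligned test, a sample like 9,1,1,2,2,...,7,7,8 trips the misaligned test, and sixteen distinct words pass all three.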