Warning: This article is still an ongoing work (updated Jul 6th 2018)
How to measure random quality ?
I came accross a technical and theoretical challenge recently: how to evaluate the quality of a True Random Number Generator (TRNG) ? (Hopefully others had the same question, see stackoverflow and references at the end of this article).
A TRNG can have many implementation, in my case it was a small piece of a computer chip built around free running oscillators (FROs) which has been integrated to provide random numbers. As the source of entropy (namly quantum effects and what not, but do not ask me I am not an expert) is not deterministic such a generator is called "True" compared to DRBG / DRNG: deterministic random number generators which look like random but are in fact fully deterministic. Basically if you know their inner state (which is much more reduced that the quantom states of a few transistors) you know the next value that will come out, and the one after that and so on and so fourth.
I had not reason not to trust the IP vendor which provided the TRNG, nor the front end engineer which integrated it in our SoC nor the backend engineer which P&R the IP and check DRC enforcement. But still it would have been good to verify at the end that the number coming from the IP where "truly" random. But How do you do that ?
How to test random number generator "quality" ?
The NIST has an answer: a series of tests (aimed at DRBG: link & link) which ensure with a certain level of certainty that some bits are random, i.e. the outcome of the generation could not have been predicted. That the thing with RNG generator, you can only evaluate randomness afterward once the generation is done, for example by computing statistics on the generated number and check if they "look" random enough. Some tests are pretty simple some are more complex but they are not perfect.
By definition strong encryption / hash should be indistinguishable from pure random, also it is likely to be predicatble, as said before if ones know the internal state of the generator and the generation algorithm (see differences between PRNG and TRNG).
No test is absolute: the only thing a test can show is that a source is definitely not random but one can never be sure a source is truly random.
Testing /dev/urandom using dieharder:
And the first lines of the output (on my laptop):
Notice the "Weak" line 22 ?
Post on reddit about interpreting diehard results: link
A. Recommendation for RNG using DRBG (NIST webpage)
B. Recommendation for Entropy Source for RBG (NIST webpage)
C. Recommendation for RBG Construction (NIST Draft)
No test is absolute: the only thing a test can show is that a source is definitely not random but one can never be sure a source is truly random.
Testing /dev/urandom using dieharder:
cat /dev/urandom | dieharder -a -g 200
And the first lines of the output (on my laptop):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 | #=============================================================================# # dieharder version 3.31.1 Copyright 2003 Robert G. Brown # #=============================================================================# rng_name |rands/second| Seed | stdin_input_raw| 2.86e+07 |1906170125| #=============================================================================# test_name |ntup| tsamples |psamples| p-value |Assessment #=============================================================================# diehard_birthdays| 0| 100| 100|0.04063257| PASSED diehard_operm5| 0| 1000000| 100|0.39003783| PASSED diehard_rank_32x32| 0| 40000| 100|0.29253977| PASSED diehard_rank_6x8| 0| 100000| 100|0.54463760| PASSED diehard_bitstream| 0| 2097152| 100|0.08717072| PASSED diehard_opso| 0| 2097152| 100|0.17956788| PASSED diehard_oqso| 0| 2097152| 100|0.23600778| PASSED diehard_dna| 0| 2097152| 100|0.56596361| PASSED diehard_count_1s_str| 0| 256000| 100|0.11492636| PASSED diehard_count_1s_byt| 0| 256000| 100|0.38479922| PASSED diehard_parking_lot| 0| 12000| 100|0.12488862| PASSED diehard_2dsphere| 2| 8000| 100|0.05409247| PASSED diehard_3dsphere| 3| 4000| 100|0.48664732| PASSED diehard_squeeze| 0| 100000| 100|0.99872432| WEAK |
Notice the "Weak" line 22 ?
Post on reddit about interpreting diehard results: link
Differences between PRNG and TRNG ?
(unpredictability)What is an entropy pool ?
How fast can you generate Random Numbers ?
Intel entropy source is around 3Gbs (to be refined by conditionner)Can good entropy sources be expected to fail some ENT / Dieharder tests ?
Indeed, as indicated in this stackoverflow responses: Perform ENT / Dieharder and raw entropy or on whitened entropy ?Certifications for Random Number Generators:
NIST SP800-90A. Recommendation for RNG using DRBG (NIST webpage)
B. Recommendation for Entropy Source for RBG (NIST webpage)
C. Recommendation for RBG Construction (NIST Draft)
References:
- Wikipedia's page https://en.wikipedia.org/wiki/Randomness_tests
- Ensuring quality of numbers from TRNG (pdf)
- Statistical testing of random number generators (NIST, pdf)
- Quality measures for various pseudo random number generators (website)
- Random number generators for parallel computers (pdf)
- Wikipedia's page on Diehard tests (link)
- https://stackoverflow.com/questions/43050347/how-to-test-pseudorandom-sequence-of-bits-with-dieharder
- Dieharder's webpage (link)
- Intel's Digital Random Number Generator Implementation Guide
- French agency opinion on AIS-31 (in french, document)
- Stackoverflow question: How does the kernel entropy pool work ?
- LWN article: On entropy and randomness
- Linux's kernel random.c source code
- Wikipedia's definition of entropy (web)
- Understanding and managing entropy (pdf)
- Good practice in (Pseudo) Random Number Generation (pdf)
- Stackoverflow's question: Perform ENT / Dieharder and raw entropy or on whitened entropy ?
- An other stackoverflow question: Differences between methods of hardware random number generators
- ent utility to "evaluate" entropy http://www.fourmilab.ch/random/