Secure random number generation on virtual hardware
Explore how entropy sources, virtual hardware and cryptographic modules shape secure random number generation.
The specter of generating a secure random number can unnerve a software engineer of any experience level. The apparitional subtleties of the myriad available approaches can unsteady even the practiced hand of a seasoned cybersecurity professional. And as with the cryptographic domain’s other spindly fingers, secure random number generation demands precisely correct implementation; otherwise, the security of the entire system in question might collapse. Yet despite the severe implications of a poor choice, a rudimentary understanding of where random numbers come from suffices to make an informed decision and effectively mitigate the security risk associated with an inadequate source of random numbers.
For engineers, cybersecurity professionals or anyone who generates encryption keys, sets up a process for generating them or writes governance around that process, this article demystifies the selection of a random number generator by explaining the levels of security that different solutions offer in different contexts.
Understanding entropy and randomness
A few key terms will aid in understanding the nuances of random number generation.
Random number generators
The output of a function is called random if, even given all inputs to the function, there is no effective computational method to forecast that output. Such a function can be called a true random number generator, or RNG. In other words, its output does not follow any predetermined algorithm.
Pseudorandom number generators
A pseudorandom number generator (PRNG), on the other hand, is a function that, given a seed value, produces a sequence that passes statistical tests for randomness. PRNGs are different from RNGs because, given an identical seed and context, they will produce an identical sequence of numbers. However, they can suffice in many cases, provided an attacker has no means by which to discover the seed.
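As a minimal illustration of that determinism, the Python sketch below seeds the standard library’s random.Random class twice with the same value. This class is a general-purpose PRNG that is not cryptographically secure; it is used here only because it makes the seed visible. The seed values and the helper name first_values are arbitrary choices for the example.

```python
import random


def first_values(seed: int, count: int = 5) -> list[int]:
    """Return the first `count` outputs of a PRNG initialized with `seed`."""
    prng = random.Random(seed)  # independent generator with its own internal state
    return [prng.randrange(2**32) for _ in range(count)]


# Hypothetical seeds, chosen for illustration only.
run_a = first_values(2024)
run_b = first_values(2024)
run_c = first_values(2025)

print(run_a == run_b)  # True: identical seed, identical sequence
print(run_a == run_c)  # a different seed yields a different sequence
```

Anyone who learns the seed can reproduce run_a exactly, which is why the secrecy and unpredictability of the seed carry the security of every value derived from it.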
Random bit generators and entropy
The National Institute of Standards and Technology (NIST) defines both of these concepts in computational terms. According to SP 800-90, a random bit generator (RBG) is defined as “a device or algorithm that outputs a sequence of binary bits that appears to be statistically independent and unbiased.” As such, RBGs can be either RNGs or PRNGs. To distinguish between the two, NIST defines non-deterministic random bit generators and deterministic random bit generators (DRBGs) as their respective computational equivalents.
Entropy is a related concept that refers to the level of uncertainty in the output of a random function. NIST defines entropy as “a measure of the disorder, randomness or variability in a closed system.” Entropy is measured in bits: a fair coin flip has exactly one bit of entropy; an unfair coin flip has less than one bit, since there is less uncertainty in the outcome; and a roll of a fair six-sided die has about 2.58 bits of entropy, since there is more uncertainty in the outcome. As the name implies, it is analogous to entropy in the context of thermodynamics. A source of entropy is an RBG acting as a source of uncertainty in the same way a coin or die might.
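The coin and die figures can be checked with a short calculation. The sketch below assumes Shannon’s formula, the sum of -p × log2(p) over all outcomes, which is one common way to measure entropy; the NIST SP 800-90 series works with min-entropy, a more conservative measure that gives the same value for uniform sources like a fair coin or die. The 90/10 bias of the unfair coin is a hypothetical value picked for the example.

```python
import math


def shannon_entropy(probabilities):
    """Entropy in bits of a discrete distribution: -sum(p * log2(p))."""
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("probabilities must sum to 1")
    # Outcomes with zero probability contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


fair_coin = shannon_entropy([0.5, 0.5])
biased_coin = shannon_entropy([0.9, 0.1])  # hypothetical 90/10 coin
fair_die = shannon_entropy([1 / 6] * 6)

print(f"fair coin:   {fair_coin:.3f} bits")    # 1.000
print(f"biased coin: {biased_coin:.3f} bits")  # 0.469
print(f"fair die:    {fair_die:.3f} bits")     # 2.585
```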
Cryptographically secure language built-ins
How standard libraries generate random bits
Desktop or server computers generate random bits by feeding output from a source of entropy through a DRBG. For example, computers might accumulate a pool of entropy from sources such as mouse movements, temperature fluctuations in the central processing unit (CPU) or specialty entropy-generation hardware like Intel’s digital random number generator. Upon user request, a DRBG siphons a few bits from the entropy pool and transforms them into a usable sequence of random bits.
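From application code, this machinery is usually reached through a single standard library call. As a brief sketch in Python, the secrets module returns bytes from the operating system’s cryptographically secure generator; the 32-byte length and the variable names are illustrative choices, not requirements.

```python
import secrets

# 32 bytes = 256 bits, a common size for a symmetric key.
key = secrets.token_bytes(32)

# A URL-safe text token built from 16 random bytes, e.g. for a session identifier.
token = secrets.token_urlsafe(16)

print(len(key))  # 32
print(token)
```

The quality of these bytes depends entirely on the entropy the operating system was able to collect from the underlying hardware.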
Why virtual environments add risk
The cryptographically secure RBG from any language’s standard library uses this type of random bit generation—entropy collection from the hardware running it. Standard library RBGs are ubiquitously recommended by online forums, enterprise policies and government agencies, and they are secure in a wide range of cases. But when working in a virtualized setting, blindly following broadly scoped guidelines and collecting entropy from local hardware can leave your application vulnerable to attack.
Problems with local entropy in virtualized infrastructure
Virtualized infrastructure includes almost anything deployed using a cloud provider, but it can also mean a fleet of dynamically provisioned virtual machines in a local data center or even on a single server. Of course, architects must examine each application individually to determine the level of risk it poses, but two problems bedevil secure local entropy generation on most deployments of virtual hardware.
Entropy starvation at boot time
The first issue is a lack of entropy. Large software deployments often involve dynamic scaling, which means frequent and automated provisioning of new servers or virtual machines. Soon after boot, an operating system may not have had enough time to collect sufficient entropy to generate secure random bits. Operating systems work around this problem in a number of different ways, and some of them return output regardless, so a generation call may yield not a random bit sequence but a predictable vestige of a starved entropy source. For example, in some older, diskless Linux distributions, the kernel’s RBG relies purely on reasonably predictable system events, making the initial entropy state highly predictable.
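On Linux, one way to detect this condition is to ask the kernel for random bytes without allowing the call to block. The sketch below assumes a Linux host where Python exposes os.getrandom; the function name kernel_rng_ready is hypothetical. It reports only whether the kernel considers its pool initialized, and it says nothing about whether the collected entropy has been observed or influenced by another party.

```python
import os
from typing import Optional


def kernel_rng_ready() -> Optional[bool]:
    """Report whether the kernel's random pool has been initialized.

    Returns True if it has, False if it has not yet, and None when the
    check is unavailable on this platform.
    """
    if not hasattr(os, "getrandom"):
        return None  # os.getrandom is only exposed on Linux
    try:
        # GRND_NONBLOCK makes the call fail instead of waiting for entropy.
        os.getrandom(1, os.GRND_NONBLOCK)
    except BlockingIOError:
        return False  # pool not yet initialized: too early to generate keys
    except OSError:
        return None  # system call unavailable, e.g. blocked by a sandbox
    return True


print(kernel_rng_ready())
```

A provisioning script could run a check like this before generating any long-lived keys on a freshly booted machine.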
Entropy spying and manipulation on shared hardware
The second issue pesters systems even more persistently and resists simple solutions. On virtualized hardware, another user of a shared physical server may be able to read or influence the entropy collected by that particular server. It has been demonstrated in laboratory environments that entropy spying and manipulation permit an attacker to effectively predict RBG output on virtual machine- and container-based deployments across cloud platforms and operating systems. In a related case, as a common workaround for boot-time entropy starvation, many Linux distributions save their final entropy state to disk on shutdown as a seed to be loaded at reboot. But after the operating system has been shut down, anyone with access to a shared physical disk might be able to read the seed data.
Entropy spying is an extreme danger for software on virtualized infrastructure. Any key generated on a device compromised this way is itself compromised; thus, generating secure random bits on virtual hardware requires a completely novel solution.
A practical solution: Using hardware security modules
What HSMs are and why they’re secure
Hardware-based RBGs are the best source of entropy available on a regular computer, but specialized hardware exists. In particular, for secure random bits when local entropy doesn’t suffice, choose a hardware security module (HSM).
HSMs are special-purpose servers that are primarily used for key management in security-sensitive sectors like government, health care and finance. Capital One uses HSMs to store our most secure keys. HSMs ship with purpose-built entropy generation hardware that is significantly more sophisticated than the entropy sources supplied with general-purpose consumer CPUs. These purpose-built chips are protected from any spying or meddling, generate enormous amounts of entropy, and are nearly impossible to compromise.
Unfortunately, HSMs are often expensive to buy and maintain, and access to HSMs both physically and over the network must be tightly restricted to keep the attack surface on these sensitive machines as small as possible. But to get around these apparently prohibitive cost and security requirements, major cloud platforms provide a way to get secure random bits from an HSM on demand without requiring that you purchase and maintain one yourself.
Cloud KMS alternatives from AWS, Azure and GCP
To generate high-quality random bits, AWS Key Management Service (KMS) provides a simple GenerateRandom application programming interface (API) call, Azure Key Vault has Get Random Bytes and Google Cloud Platform (GCP) KMS offers GenerateRandomBytes. None of these API calls requires the creation or management of a dedicated HSM resource, and all three are billed at just $0.03 per 10,000 requests. Random bits generated using these functions are immune to the cloud vulnerabilities that haunt system RBGs.
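As a hedged sketch of what such a call can look like, the Python helper below wraps the AWS KMS GenerateRandom operation as exposed by the boto3 SDK’s generate_random method. It assumes boto3 is installed and that AWS credentials and a region are configured in the environment; the names random_bytes_from_kms and main are hypothetical. The client is passed in as a parameter so the helper can be exercised with a stand-in object during testing.

```python
def random_bytes_from_kms(kms_client, num_bytes: int = 32) -> bytes:
    """Request `num_bytes` random bytes through the KMS GenerateRandom API.

    `kms_client` is assumed to be a boto3 KMS client, e.g. boto3.client("kms").
    """
    # GenerateRandom accepts between 1 and 1024 bytes per request.
    if not 1 <= num_bytes <= 1024:
        raise ValueError("num_bytes must be between 1 and 1024")
    response = kms_client.generate_random(NumberOfBytes=num_bytes)
    data = response["Plaintext"]
    # Fail loudly rather than silently continue with a short result.
    if len(data) != num_bytes:
        raise RuntimeError("KMS returned an unexpected number of bytes")
    return data


def main() -> None:
    # Imported here so the helper above has no third-party dependency.
    import boto3

    kms = boto3.client("kms")  # region and credentials come from the environment
    key_material = random_bytes_from_kms(kms, 32)  # 256 bits
    print(len(key_material))


# Call main() from an environment with AWS credentials configured.
```

One design choice worth noting: the helper raises an error instead of falling back to a local generator, so a network or permissions failure cannot quietly downgrade the source of randomness.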
A perfect solution: Quantum random number generation
For enterprises living on the cutting edge of security that wish to use in-house HSMs directly, some newer HSMs have begun incorporating exciting new quantum random number generator (QRNG) chips. QRNG chips source their entropy from measurements of systems governed by the rules of quantum mechanics. Quantum random number generation is therefore protected by the universe itself: measurements of quantum systems are unpredictable as a law of physics. No attacker, even given perfect knowledge of a QRNG chip and all its inputs, can predict its output.
QRNG chips have limited availability today. Integration and cost challenges will likely prohibit on-prem adoption for a while, though cloud-based solutions through the providers of other key generation APIs may be available sooner. And because QRNGs are new, they remain under scrutiny for the side-channel and implementation-specific attacks that imperil all real-world cryptographic systems. But don’t despair if you can’t get access to one yet; modern DRBGs are good enough—so good that QRNGs don’t provide any practical advantages in terms of security (at least not any that are known to the public).
Each application is unique: Matching entropy sources to risk
Depending on the context, securing an application may only require a standard hardware-based entropy source or it may require a bleeding-edge QRNG chip. Since attacks on hardware-based entropy generation are possible, we must assume nation-state threat actors are executing them. So for the most security-sensitive random numbers, like AES or RSA keys protecting sensitive data, using random bits generated by a cloud provider’s HSMs is prudent. But for other applications, such a high level of caution may not be necessary.
Ultimately, each application needs an individual security review, and the source of any required random bits should be examined and evaluated as part of a comprehensive threat model. It is easy to find oneself caught in the Lethean maelstrom of online forums that immerse the topic of random number generation in verbiage that implies a mystery too deep to unravel. But—as with any other security decision—it is a set of practical and understandable criteria that determines the appropriate choice.

