BackgroundRandom Number Generators (RNGs) are deterministic calculations, but they (generally) seem random because they use a very "chaotic" calculation to pick the next random number/state based on the current state. That means that just about any RNG needs to be "seeded" with a starting state. As a simple example, the basic random number generator you find on most systems takes a number (in a fairly massive range) as a seed, and then the next number it gives you is based on that. And the next number, and so on. If you're looking for a number from 1 to 10, then you can just take the last digit of that number, and you're going to end up with a seemingly random sequence of single digit numbers. Then if you want the same "random" sequence, you can start with the same seed and you'll be able to replay those random events. As you get into better RNGs, the state ends up often being more complex than a single number. As it calculates new numbers, it updates the state that is behind-the-scenes. But what if it's still seeded with just a single number? What generally happens is that only a portion of the initial state is set by the seed, and then the amount of "random information" that is kept in the random state grows over time. And this was where our problem started.
Initial SettingsVery early in my career I was working with a test generator that used an RNG to help create a random test. If you gave it a particular seed, you'd get a particular test - that was one of the advantages of the seeds, you could recreate tests with the same test generator and the same seed. The first thing this test generator did was setup a bunch of global settings for the test. For example, maybe the system had a bunch of clock settings or I/O settings that were determined by some system register values, and the test generator just came up with random values for these registers so our settings were completely randomized...
The Problem..except they weren't. I generated massive amounts of tests and then I'd look at coverage, which helped me understand what I was testing. And certain pairs of settings would never happen. In other words, I'd never have, for example, system setting "A" turned on when system setting "B" was turned off. This was with millions of tests where A and B were randomly set and were on and off about half the time, but I never saw specific combinations. To make things more confusing, I did a random analysis on the random numbers it was giving us, and they were random.
Basically.So what's going on? It turns out the problem is in the size of the seed as compared to the size of the random state. The random library I was using was quite good, but it took a single (32b) number as the seed, and it needed to setup the state of an array of numbers. So obviously our starting point was limited. So it would only initialize the bottom of this array, and then let state grow in the rest of the array over time. I ended up hacking into this initial state to watch it evolve over time, and what I generally saw was that it started with only the last number in the array set to the seed and the rest of the array was zeros. Then as each random number was calculated, more and more bits in the array would be set randomly, but it wasn't until about the 17th random number that one could fully rely on "randomness" in the random state. So the "chaotic" mathematical calculation that would turn the random state into a single random number, it would initially be using a lot of zeros for it's calculation. You can think of it as this big, complex equation to create a seemingly random number, but many parts of the equation would be turned "off" until those zeros went away. The consequence was that while the numbers coming out of it seemed random, on the face of it, they had direct bit interdependencies between two different calls - at least until the random state was finally built up. So that meant that bit 14 of the 3rd number, for example, was always the opposite of bit 17 of the 5th number, for every sequence of random numbers that started with a new seed. And if bit 14 of the 3rd number would turn some feature on, and bit 17 of the 5th number was turning some other feature on, then I would never see both features on at the same time. And hence I had a serious problem with what we were testing.
The SolutionOnce I understood what was going on, the solution was pretty simple. I considered making a more complex seed and hacking into the RNG state, but I wanted to use something a little cleaner and a little more portable. So instead I simple started up the RNG by setting the seed, and then asked for a random number a few dozen times, and threw all the results away. At that point the random state had finally become interesting, and our dependency issues were effectively gone, and it was off to the races! Eventually I switched to a better RNG and was able to sleep easily at night, knowing that my tests were going to definitely get a good collection of features tested.
Interested in having someone with devotion to understanding complex problems at your company? Consider hiring me to see how it can be done.