There’s a dirty little secret in programming: generating truly random numbers is tough. When are random numbers really random?
Nearly everybody plays a computer game such as Solitaire from time to time. Have you ever had the déjà vu feeling that you’ve played this same game before? Did that initial “deal” look unsettlingly familiar?
Have you ever yelled at a computer game, “who shuffled this?”
My irritation with this reminded me of all the nasty little complications we’ll survey in this article.
There’s a reason for “bad shuffles.” To make each game unique, computers depend on random numbers.
There’s a more serious side to this discussion. Even if you don’t play computer games, you probably realize that secure passwords are essentially random strings of numbers and letters. Random numbers are essential to secure password generation, encryption, and even national security.
The core idea behind a series of random numbers is unpredictability. Knowing one number should give us no clue what the next number will be.
As explained in Wikipedia’s entry on Random number generation, scientists, mathematicians, statisticians and programmers recognize two general “types” of random numbers:
1. “True” random numbers, generated by sampling some natural random process, such as the flip of a coin, toss of the dice, or sampling the states of quantum bits. But if you think about it, many physical phenomena in the real world that appear to be random are not truly random. The shape of the lava blob at each instant in time in the popular 1960s “Lava lamps” is always dependent on its shape in the previous instant.
“Wheels of fortune,” lottery ball tumblers, roulette wheels and card shuffling machines are examples of mechanical “true” random sequence generators which, when not rigged, do a better job than most of the “stock” functions supplied with personal computers and server-based user functions.
2. “Generated” random numbers use mathematical algorithms, some highly ingenious, to generate sequences that are for most purposes indistinguishable from true randomness. But, because they are generated by a predictable or programmable process, they are also called “pseudo-random” numbers. Our desktop PCs and Macs use simple random numbers to display arbitrary screensaver images on our monitors.
When I was in pre-computer high school math, we all used the “CRC Standard Mathematical Tables and Formulae” book. (Amazon still sells them!) There were pages and pages of random number tables in the back of the book. They were probably generated in some university laboratory somewhere. They were very random. But, if you had the tables, and you knew what the 2,573rd random number was, you always knew what the 2,574th number was going to be. To generate a UNIQUE random number series, you also had to devise an arbitrary random lookup scheme.
I’ve written any number of computer and web programs for personal and private use that depend in some way on random numbers. For simple tasks like a random quotations generator or photo banner carousel, one can call the Perl rand() function to pick a quote at random out of the entire collection. You can call another function to generate a random “seed” to plug into srand(), which initializes rand(), or you can just give it a “seed” like server time (“1347727038”) which increments every second.
The thing is, a poorly designed “seed” may produce the same starting number and the same random number sequence every time you start the program. This can lead to those “bad shuffles” of your card game. Or, the next time you call server time, one second in the future, it will return “1347727039,” a seed anyone could have guessed, which means we are off to a bad start right away!
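To make the “bad shuffle” concrete, here is a minimal sketch in Python rather than Perl. The function name deal and the numbering of cards from 0 to 51 are assumptions made for illustration; the point is only that a repeated seed reproduces the same deal.

```python
import random

def deal(seed):
    """Return a shuffled 52-card deck (cards numbered 0-51) for the given seed."""
    rng = random.Random(seed)  # private generator, so the seed is explicit
    deck = list(range(52))
    rng.shuffle(deck)
    return deck

first = deal(1347727038)
again = deal(1347727038)
print(first == again)  # True: same seed, same "random" deal
```

A program that always starts from the same seed will therefore always open with the same game.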
I NEVER try to write my own code for password generation, secure hashes, or anything involving identity protection, privacy or file and data security. Instead, I use canned industrial-strength functions like 128-bit MD5 or 256-bit SHA-256 (and MD5 itself is no longer considered secure, since practical collision attacks exist).
Having said all that, I once wrote my own random number generator to see how well I could do it.
The following isn’t a workshop or tutorial. It’s just to show you, very roughly, one or two simple techniques that might be used. Here’s a rough outline of what I remember doing (and it is very crude):
1. Pick a unique number, say, server time: 1347727039
2. Scramble the digits: 0719743723
3. Square it: 518031026797900729
4. Extract the middle 10 digits: 3102679790
5. Square that: 9626621879274444100
6. Take your required number of digits out of the middle: ‘9’, or ‘218792744’ or whatever you need.
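The squaring and middle-digit steps above can be sketched in Python. This is a rough reconstruction, not my original code: the names middle_digits and crude_random are made up for illustration, and the scramble step is left out, so the function takes the already scrambled number as its input. Python integers are exact, so the squares are computed in full with no floating-point rounding.

```python
def middle_digits(number, count):
    """Return `count` digits from the middle of `number`, as a string."""
    digits = str(number)
    start = (len(digits) - count) // 2
    return digits[start:start + count]

def crude_random(scrambled_seed, count):
    """Square, keep the middle 10 digits, square again, keep `count` middle digits."""
    first_square = scrambled_seed ** 2
    middle = int(middle_digits(first_square, 10))
    second_square = middle ** 2
    return middle_digits(second_square, count)

print(crude_random(719743723, 1))  # 9
print(crude_random(719743723, 9))  # 218792744
```

Feeding in the same scrambled number always gives the same digits back, which is exactly the weakness discussed next.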
The serious programmer or statistician can see several flaws in the assumptions my simple algorithm uses. For example, the “scramble” action itself assumes access to some sort of randomizing process. Ultimately, since any process, once guessed, can be repeated, these numbers are at best pseudo-random.
Unless you have a doctorate in math theory, there are hidden pitfalls to trap the unwary, and that includes me. One would avoid division, for example. The fraction 5/9 generates the horribly un-random series .55555555555 … Seven is another weird divisor: 1/7 repeats the same six digits, 142857, forever. If you subtract a preset number like ‘1’ from a result that happens to end in ‘000’, you get ‘999’ which is not what you were hoping for! Are your sequences always even numbers (bad)? Odd numbers (bad)? Do they include an expected distribution of prime numbers? (Good).
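The division pitfall is easy to see with a few lines of long division. This is only an illustrative sketch; decimal_digits is a hypothetical helper name, not part of any library.

```python
def decimal_digits(numerator, denominator, count):
    """Return the first `count` digits after the decimal point of numerator/denominator."""
    digits = []
    remainder = numerator % denominator
    for _ in range(count):
        remainder *= 10
        digits.append(str(remainder // denominator))
        remainder %= denominator
    return "".join(digits)

print(decimal_digits(5, 9, 12))  # 555555555555
print(decimal_digits(1, 7, 12))  # 142857142857
```

Any fraction eventually repeats like this, so digits produced by division make a poor source of “random” numbers.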
As the Wikipedia article emphasizes,
A “random number generator” based solely on deterministic computation cannot be regarded as a “true” random number generator, since its output is inherently predictable.
Nevertheless, my generator worked pretty well for whatever my requirement was at the time. But remember, home-brew techniques can’t be counted on to be truly random or truly secure (if security is your purpose).
Some years back I wrote a Perl password generator calling a stock MD5 checksum generator. I posted “Random Password” to this website where you can still find it in the Front Page “Utility” menu. These days, most password vaults like 1Password will generate very secure unguessable passwords of arbitrary length.
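For comparison, here is a minimal sketch of password generation that leans on a canned library instead of home-brew arithmetic. It is not the code behind my “Random Password” page; it uses Python’s standard secrets module, which draws on the operating system’s cryptographic random source. The names ALPHABET and make_password, the choice of letters and digits only, and the default length of 16 are assumptions for illustration.

```python
import secrets
import string

# Assumed character set: upper-case and lower-case letters plus digits.
ALPHABET = string.ascii_letters + string.digits

def make_password(length=16):
    """Build a password by picking each character with secrets.choice."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print(make_password(20))  # different on every run
```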
A final word on security: password protection is NOT the same as encryption. In many applications, a password just disables the file-open command until the password is authenticated at the document or application level. I suspect you can still read many common password-protected documents with one of the disk utilities that can read the hard drive directly, bit by bit. The thing is, you won’t know, because the companies behind password programs never guarantee file security. So, to protect against identity and data theft on a hard drive or other local storage media, NEVER entrust your personal financial, banking, medical or other sensitive files and data to a simple password.
Encryption scrambles the whole document or vault storage space with sophisticated algorithms that could take a NASA supercomputer weeks to crack. Password vaults like 1Password or RoboForm aren’t just password protected; they securely encrypt all data.
Lastly, if your card program doesn’t shuffle well enough, try another one!