Last year, for extra-special Pi Day, I wrote a post that pretty much says all I have to say about Pi. (see: Here Today; Pi Tomorrow) That post was actually published the day before. I used the actual day to kick off last Spring’s series on Special Relativity.
So what remains to be said? Not much, really, but I’ve never let that stop me before, so why start now?
Looking back over my posts here, I see that the first time I wrote about Pi Day was back in 2013, shortly after I retired. (see: Party Time!) That was more of a humor post that just touched on the coincidence of pi with dates.
Last year’s post got into a curious property of pi: The way its digits go on forever without repeating.
More importantly, the sequence of digits appears to be normal (this is widely believed, though it has never been proven). If so, any given finite sequence of digits appears somewhere in the digits of pi.
So every text, sound, and image file (past, present, or future) exists somewhere in pi. And in every other transcendental number with the property of being normal.
I covered all that last year.
What I didn’t get too much into was how random the digits are. Except for the fact that the sequence is well-known, pi is a handy source of randomness (a repeatable, but random, sequence can be useful for testing and development).
That the sequence is well-known means the randomness of pi can’t be used to secure information, but it is useful if you just need a string of random digits.
The reason for today’s post (other than to commemorate the day) is that last fall it occurred to me to see if I could find a text file containing lots and lots of the digits of pi.
I found plenty (yea, internet) and grabbed one that had ten million digits of pi. I thought it would be fun to see for myself just how random the digits of pi really are.
Short answer: Indeed, yes, them is random.
That shouldn’t (and didn’t) surprise me, but it was a little startling to see just how “flat” that randomness really is. Check it out:
0:  999440 ( 9.994400, +0.005600)
1:  999333 ( 9.993330, +0.006670)
2: 1000306 (10.003060, -0.003060)
3:  999964 ( 9.999640, +0.000360)
4: 1001093 (10.010930, -0.010930)
5: 1000466 (10.004660, -0.004660)
6:  999337 ( 9.993370, +0.006630)
7: 1000207 (10.002070, -0.002070)
8:  999814 ( 9.998140, +0.001860)
9: 1000040 (10.000400, -0.000400)
This table shows the ten decimal digits (0-9) along with how many times each appears in the ten-million-digit string, the percentage of occurrences, and how much that percentage deviates from the expected 10% (the total of the two numbers in parentheses is exactly 10%).
As you see, the maximum deviation — for the digit ‘4’ — is about one-hundredth of a percent, with most of them being much lower. I would expect the deviation to approach zero as more digits are processed.
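For the curious, a tally like this takes only a few lines. Below is a minimal Python sketch, offered as an illustration rather than the program that produced the table above. The name `digit_frequencies` and the `SAMPLE` string (just the first 50 decimals of pi, standing in for the ten-million-digit file) are assumptions made for the example, as is the choice to leave off the leading 3.

```python
from collections import Counter

def digit_frequencies(digits):
    """Return {digit: (count, percent, 10 - percent)} for a string of decimal digits."""
    counts = Counter(digits)
    total = len(digits)
    table = {}
    for d in "0123456789":
        count = counts[d]                 # Counter gives 0 for digits never seen
        percent = 100.0 * count / total   # share of all digits, in percent
        table[d] = (count, percent, 10.0 - percent)  # deviation from the expected 10%
    return table

# Assumption: a tiny stand-in for the real file (first 50 decimals of pi).
SAMPLE = "14159265358979323846264338327950288419716939937510"

for d, (count, percent, deviation) in digit_frequencies(SAMPLE).items():
    print(f"{d}: {count:7d} ({percent:9.6f}, {deviation:+9.6f})")
```

Swapping `SAMPLE` for the contents of a real digits file (read in and stripped of anything that isn't a digit) should give a table in the same shape as the one above. With only 50 digits, of course, the percentages are nowhere near flat.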
I also tried scanning the digits to see if there was any structure based on what digit followed any given digit. That, too, should be entirely random.
Here’s a table showing the same sort of data for what digit follows the digit ‘4’ (which “won” the table above):
4_0: 100217 (1.002170, -0.002170)
4_1: 100232 (1.002320, -0.002320)
4_2:  99991 (0.999910, +0.000090)
4_3: 100008 (1.000080, -0.000080)
4_4:  99915 (0.999150, +0.000850)
4_5: 100073 (1.000730, -0.000730)
4_6: 100394 (1.003940, -0.003940)
4_7:  99944 (0.999440, +0.000560)
4_8:  99994 (0.999940, +0.000060)
4_9: 100325 (1.003250, -0.003250)
The only difference is that the expected percentage is 1% here, because the percentage is calculated against the total of digits. But we’re only considering digits following the digit ‘4’ — which only occurs 10% (or so) of the time.
So each following digit here is from a pool that’s 10% of the total. Divvying that up among the ten digits gives us the expected 1% figure.
The flatness remains, and this result is similar for the other nine digits.
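The "what follows what" scan can be sketched the same way: walk the digits in adjacent pairs and tally the second digit whenever the first is the one of interest. This is a rough Python illustration, not necessarily how the numbers above were generated; `follower_table` and `SAMPLE` are names invented for the example.

```python
from collections import Counter

def follower_table(digits, lead):
    """For each digit d, count how often d comes right after `lead`.

    Percentages are taken against the total number of digits, which is
    why the expected value is about 1% rather than 10%.
    """
    # zip(digits, digits[1:]) yields every adjacent pair (a, b).
    followers = Counter(b for a, b in zip(digits, digits[1:]) if a == lead)
    total = len(digits)
    return {d: (followers[d], 100.0 * followers[d] / total) for d in "0123456789"}

# Assumption: a tiny stand-in for the real file (first 50 decimals of pi).
SAMPLE = "14159265358979323846264338327950288419716939937510"

for d, (count, percent) in follower_table(SAMPLE, "4").items():
    print(f"4_{d}: {count:6d} ({percent:8.6f}, {1.0 - percent:+9.6f})")
```

Calling it once for each of the ten lead digits would give the full ten-by-ten picture.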
Last year’s post touched on how a random series does contain structure, the simplest of which are strings of repeating digits. (More complex structure generally involves patterns of digits.)
So here’s a table showing repeating sequence frequencies:
If we add the frequency counts of each row, we get about 100,000, which is about the count of any given digit following itself (the second table above).
(That the results of two analysis algorithms agree adds to the confidence there aren’t any major programming bugs!)
It’s interesting how each additional digit decreases the number of occurrences by a factor of ten. (Holy Logarithms, Batman!) One can assume this holds for longer sequences as more digits are scanned.
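One way to count repeats is to split the digits into runs of the same digit and tally the runs by length. The sketch below does that in Python with `itertools.groupby`. It assumes a "repeating sequence" means a maximal run (so 444 counts once as a run of three, not twice as a pair), which may or may not match how the counts discussed here were produced; `run_lengths` is a name made up for the example.

```python
from collections import Counter
from itertools import groupby

def run_lengths(digits, min_len=2):
    """Return {digit: Counter({run_length: how_many_runs})}.

    Only maximal runs of at least `min_len` identical digits are counted.
    Assumes `digits` contains nothing but the characters 0-9.
    """
    table = {d: Counter() for d in "0123456789"}
    for digit, group in groupby(digits):
        n = sum(1 for _ in group)   # length of this run of identical digits
        if n >= min_len:
            table[digit][n] += 1
    return table

# Small made-up input: one run of "11", one of "222", one of "3333".
demo = run_lengths("0112223333450")
for digit, counts in demo.items():
    if counts:
        print(digit, dict(counts))
```

Each row of the result holds one digit's counts by run length, so the factor-of-ten drop can be checked by comparing neighboring lengths.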
The fraction of two-digit sequences remains about 0.8% when scanning only 1,000,000 or 100,000 digits, which implies a simple formula giving us the likely count for a repeating sequence of any length:
count = total-digits * (0.8 ÷ 10^length)
For a sequence length of 3 in our 10-million digits, we have:
count = 10^7 * (0.8 ÷ 10^3)
Which, as expected, is 8,000.
Let’s try a sequence length of 20 in 100-quintillion digits:
count = 10^20 * (0.8 ÷ 10^20)
Which is 0.8, so we might find one such string in all those digits, but if we keep searching more digits we’re certain to find it! (Here you can see the correlation between number of digits and length of the sequence.)
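The formula is simple enough to wrap in a tiny Python function for trying other combinations. This is only a sketch of the arithmetic above; the name `expected_repeats` and the `scale` parameter (the 0.8 figure) are labels invented for the example.

```python
def expected_repeats(total_digits, length, scale=0.8):
    """Estimate from the formula above: total_digits * (scale / 10**length)."""
    return total_digits * (scale / 10 ** length)

# The two worked examples from the text.
print(f"{expected_repeats(10**7, 3):,.1f}")    # length 3 in ten million digits
print(f"{expected_repeats(10**20, 20):.1f}")   # length 20 in 100 quintillion digits
```

Because the result is a floating-point estimate, it's best read as a ballpark figure, not an exact count.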
One might also look for a simple sequence, such as 123456789, but such a sequence is really no more structured than 444444444. Both are specific digit strings of length nine.
So we would expect, per table three above, that the best we could do in the 10-million digits is a sequence like 1234567. We might expect to find almost any given sequence of length seven.
We would, likewise, not expect to find any of length eight or greater.
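To put rough numbers on those expectations, here is a back-of-the-envelope Python sketch. It assumes the digits behave like independent, uniformly random draws (an assumption, not a proven fact about pi) and ignores the wrinkle that self-overlapping strings behave slightly differently. The function names are invented for the example.

```python
import math

def expected_hits(total_digits, length):
    """Expected number of times one specific string of `length` digits shows up."""
    positions = total_digits - length + 1   # possible starting positions
    return positions / 10 ** length         # each position matches with chance 10**-length

def chance_of_at_least_one(total_digits, length):
    """Poisson-style approximation: P(at least one hit) = 1 - e^(-expected)."""
    return 1.0 - math.exp(-expected_hits(total_digits, length))

for length in (6, 7, 8, 9):
    hits = expected_hits(10**7, length)
    chance = chance_of_at_least_one(10**7, length)
    print(f"length {length}: expected {hits:.3f}, chance of at least one {chance:.3f}")
```

Under those assumptions, a given seven-digit string has somewhat better than even odds of turning up in ten million digits, while a given eight-digit string is a long shot.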
Sure enough, the sequence 1234567 appears in the data at index 9,470,444 (very close to the end of the data). And the sequence 12345678 does not occur.
I’ve gotten tired of typing in six-digit search strings trying to find one that doesn’t appear.
The odds of finding any given seven-digit string are decent, though not certain, since we expect each one to appear only about once on average. The string 1010101 does not appear, but 0101010 appears five times!
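Typing in search strings by hand gets old fast, so a small helper is handy. The Python sketch below finds every occurrence of a pattern, including overlapping ones. `find_all` is a made-up name, and the positions it returns are 0-based offsets into whatever string it is given, which may not match the indexing behind the position quoted above.

```python
def find_all(digits, pattern):
    """Return the 0-based start index of every (possibly overlapping) match."""
    hits = []
    start = digits.find(pattern)
    while start != -1:
        hits.append(start)
        # Step forward by one, not by len(pattern), so overlaps are found too.
        start = digits.find(pattern, start + 1)
    return hits

# Assumption: a tiny stand-in for the real file (first 50 decimals of pi).
SAMPLE = "14159265358979323846264338327950288419716939937510"

for pattern in ("33", "1234567", "1010101"):
    hits = find_all(SAMPLE, pattern)
    print(pattern, len(hits), hits)
```

Pointed at the full ten-million-digit string, `len(find_all(digits, "0101010"))` gives the kind of occurrence count mentioned above.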
Not that I expected the mathematicians to be wrong, but it’s kind of fun seeing it for myself. Kind of like the first time I looked through a telescope and saw Saturn with my own eyes. Awesome!
Stay mathematical, my friends!