Happy Pi Day!

Well, it’s Pi Day once again (although this date becomes more and more inaccurate as the century proceeds). So, once again, I’ll opine that Tau Day is cooler. (see: Happy Tau Day!)

Last year, for extra-special Pi Day, I wrote a post that pretty much says all I have to say about Pi. (see: Here Today; Pi Tomorrow) That post was actually published the day before. I used the actual day to kick off last Spring’s series on Special Relativity.

So what remains to be said? Not much, really, but I’ve never let that stop me before, so why start now?

Looking back over my posts here, I see that the first time I wrote about Pi Day was back in 2013, shortly after I retired. (see: Party Time!) That was more of a humor post that just touched on the coincidence of pi with dates.

Last year’s post got into a curious property of pi: The way its digits go on forever without repeating.

More importantly, the sequence of digits is believed to be normal (no one has proven it, but every test so far agrees). That means any given finite sequence of digits appears somewhere in the digits of pi.

Pi Cake!

So every text, sound, and image file (past, present, or future) exists somewhere in pi. And in every other transcendental number with the property of being normal.

I covered all that last year.

What I didn’t get too much into was how random the digits are. Except for the fact that the sequence is well-known, pi is a handy source of randomness (a repeatable, but random, sequence can be useful for testing and development).

That the sequence is well-known means the randomness of pi can’t be used to secure information, but it is useful if you just need a string of random digits.

The reason for today’s post (other than to commemorate the day) is that last fall it occurred to me to see if I could find a text file containing lots and lots of the digits of pi.

I found plenty (yea, internet) and grabbed one that had ten million digits of pi. I thought it would be fun to see for myself just how random the digits of pi really are.

Short answer: Indeed, yes, them is random.

That shouldn’t (and didn’t) surprise me, but it was a little startling to see just how “flat” that randomness really is. Check it out:

0:   999440 ( 9.994400, +0.005600)
1:   999333 ( 9.993330, +0.006670)
2:  1000306 (10.003060, -0.003060)
3:   999964 ( 9.999640, +0.000360)
4:  1001093 (10.010930, -0.010930)
5:  1000466 (10.004660, -0.004660)
6:   999337 ( 9.993370, +0.006630)
7:  1000207 (10.002070, -0.002070)
8:   999814 ( 9.998140, +0.001860)
9:  1000040 (10.000400, -0.000400)

This table shows the ten decimal digits (0-9) along with how many times each appears in the ten-million-digit string, the percentage of occurrences, and how much that percentage deviates from the expected 10% (the total of the two numbers in parentheses is exactly 10%).

As you see, the maximum deviation — for the digit ‘4’ — is about one-hundredth of a percent, with most of them being much lower. I would expect the deviation to approach zero as more digits are processed.
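For anyone who wants to try this at home, here is a minimal Python sketch of that kind of count. It is not the program that produced the table; the function name `digit_frequencies` and the 50-digit `SAMPLE` string (standing in for the ten-million-digit file) are assumptions made for illustration.

```python
from collections import Counter

def digit_frequencies(digits):
    """Return (digit, count, percent, deviation) for each decimal digit.

    deviation is expected minus actual, so percent + deviation == 10.
    """
    counts = Counter(digits)
    total = len(digits)
    expected = 10.0  # percent, if all ten digits are equally likely
    rows = []
    for d in "0123456789":
        pct = 100.0 * counts[d] / total
        rows.append((d, counts[d], pct, expected - pct))
    return rows

# First 50 decimal digits of pi, standing in for the ten-million-digit file.
SAMPLE = "14159265358979323846264338327950288419716939937510"

for d, count, pct, dev in digit_frequencies(SAMPLE):
    print(f"{d}: {count:8d} ({pct:9.6f}, {dev:+9.6f})")
```

With only 50 digits the percentages wander a long way from 10%; the flatness only shows up once a lot more digits are fed in.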

I also tried scanning the digits to see if there was any structure based on what digit followed any given digit. That, too, should be entirely random.

Here’s a table showing the same sort of data for what digit follows the digit ‘4’ (which “won” the table above):

4_0:   100217 (1.002170, -0.002170)
4_1:   100232 (1.002320, -0.002320)
4_2:    99991 (0.999910, +0.000090)
4_3:   100008 (1.000080, -0.000080)
4_4:    99915 (0.999150, +0.000850)
4_5:   100073 (1.000730, -0.000730)
4_6:   100394 (1.003940, -0.003940)
4_7:    99944 (0.999440, +0.000560)
4_8:    99994 (0.999940, +0.000060)
4_9:   100325 (1.003250, -0.003250)

The only difference is that the expected percentage is 1% here, because the percentage is calculated against the total of digits. But we’re only considering digits following the digit ‘4’ — which only occurs 10% (or so) of the time.

So each following digit here is from a pool that’s 10% of the total. Divvying that up among the ten digits gives us the expected 1% figure.

The flatness remains, and this result is similar for the other nine digits.
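The following-digit scan can be sketched in a few lines of Python too. Again, this is an illustration under my own naming (`follower_frequencies` is a made-up name), not the original program; it takes the percentage against the total digit count, as described above.

```python
from collections import Counter

def follower_frequencies(digits, first):
    """Count which digit follows each occurrence of `first`.

    Percentages are taken against the total digit count, so with
    random digits each follower is expected to land near 1%.
    """
    # Walk adjacent pairs (a, b) and keep b whenever a is the digit of interest.
    pairs = Counter(b for a, b in zip(digits, digits[1:]) if a == first)
    total = len(digits)
    return {d: (pairs[d], 100.0 * pairs[d] / total) for d in "0123456789"}

# Tiny hand-checkable example: 4 is followed by 1, 2, 3, 4, 4, 5.
result = follower_frequencies("4142434445", "4")
print(result["4"])  # (2, 20.0)
```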

§

Last year’s post touched on how a random series does contain structure, the simplest of which are strings of repeating digits. (More complex structure generally involves patterns of digits.)

So here’s a table showing repeating sequence frequencies:

Repeating-digit sequences (columns are sequence length)

Digit       2      3    4   5   6  7
0      80,810  8,162  730  78   4  1
1      81,024  7,921  809  84   8  1
2      80,945  8,112  811  71   9  0
3      81,449  8,200  787  77   5  2
4      81,276  8,027  748  74   9  0
5      81,195  8,206  833  83   8  2
6      81,221  7,978  783  82   9  1
7      81,288  8,021  821  87  11  2
8      81,031  8,056  765  64  15  2
9      80,773  8,202  841  75   9  4

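If we add the frequency counts of each row, counting a run of length n as the n-1 adjacent pairs it contains, we get about 100,000, which is about the count of any given digit following itself (the second table above). For the digit ‘4’ the weighted total is 81,276 + 2×8,027 + 3×748 + 4×74 + 5×9 = 99,915, exactly the 4_4 count.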

(That the results of two analysis algorithms agree adds to the confidence there aren’t any major programming bugs!)
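Here is a small Python sketch of that cross-check. It assumes the run table counts maximal runs (a run of exactly that length, with a different digit on either side), which the numbers for the digit ‘4’ seem to bear out. The names `run_counts` and `pairs_from_runs` are mine, not from the original programs.

```python
from collections import Counter
from itertools import groupby

def run_counts(digits):
    """Count maximal runs of a repeated digit, keyed by (digit, run_length)."""
    counts = Counter()
    for digit, group in groupby(digits):
        length = sum(1 for _ in group)
        if length >= 2:
            counts[(digit, length)] += 1
    return counts

def pairs_from_runs(counts, digit):
    """A run of length n contains n - 1 adjacent same-digit pairs."""
    return sum(n * (length - 1)
               for (d, length), n in counts.items() if d == digit)

# Row for the digit 4 from the run table (lengths 2 through 7).
row_4 = {2: 81276, 3: 8027, 4: 748, 5: 74, 6: 9, 7: 0}
counts_4 = Counter({("4", length): n for length, n in row_4.items()})
print(pairs_from_runs(counts_4, "4"))  # 99915, the 4_4 count in the second table
```

Weighting each run by its length minus one is what makes the two tables line up; a plain sum of the row comes to just under 90,000.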

It’s interesting how each additional digit decreases the number of occurrences by a factor of ten. (Holy Logarithms, Batman!) One can assume this holds for longer sequences as more digits are scanned.

The fraction of two-digit sequences remains about 0.8% when scanning only 1,000,000 or 100,000 digits, which implies a simple formula giving us the likely count for a repeating sequence of any length:

count = total-digits * (0.8 ÷ 10^length)

For a sequence length of 3 in our 10-million digits, we have:

count = 10^7 * (0.8 ÷ 10^3)

Which, as expected, is 8,000.

Let’s try a sequence length of 20 in 100-quintillion digits:

count = 10^20 * (0.8 ÷ 10^20)

Which is 0.8, so we might find one such string in all those digits, but if we keep searching more digits we’re certain to find it! (Here you can see the correlation between number of digits and length of the sequence.)
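The rule of thumb is easy to play with in Python. This is just the formula above wrapped in a function; the name `expected_runs` and the `fraction` parameter are my own labels, and the 0.8 is the rough figure read off the table, not an exact constant.

```python
def expected_runs(total_digits, length, fraction=0.8):
    """Rule-of-thumb count of repeating sequences of a given length."""
    return total_digits * (fraction / 10 ** length)

for total, length in [(10**7, 2), (10**7, 3), (10**20, 20)]:
    # Rounded for display, since floating point adds a little fuzz.
    print(total, length, round(expected_runs(total, length), 3))
```

The first two lines land close to the 80,000-ish and 8,000-ish counts in the table's first two columns, and the last one gives the 0.8 from the 100-quintillion example.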

§

One might also look for a simple sequence, such as 123456789, but such a sequence is really no more structured than 444444444. Both are specific digit strings of length nine.

So we would expect, per table three above, that the best we could do in the 10-million digits is a sequence like 1234567. We might expect to find many, though not all, of the possible sequences of length seven.

We would, likewise, not expect to find any of length eight or greater.

Sure enough, the sequence 1234567 appears in the data at index 9,470,444 (very close to the end of the data). And the sequence 12345678 does not occur.

I’ve gotten tired of typing in six-digit search strings trying to find one that doesn’t appear.

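The odds of finding any given seven-digit string are decent but far from certain: with about one occurrence expected in ten million digits, roughly a third of them should be missing. The string 1010101 does not appear, but 0101010 appears five times!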
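Searching for a string is simple enough to sketch in Python, along with a rough estimate of the chance that a string of a given length turns up at all. Both function names are made up for this example, and the estimate assumes the digits behave like independent random digits (a Poisson-style approximation), which is an assumption rather than something proven about pi.

```python
import math

def find_all(digits, pattern):
    """Return every start index of pattern in digits, overlaps included."""
    hits = []
    i = digits.find(pattern)
    while i != -1:
        hits.append(i)
        i = digits.find(pattern, i + 1)
    return hits

def chance_of_appearing(total_digits, length):
    """Rough chance a given string shows up at least once in random digits."""
    expected = total_digits / 10 ** length  # expected number of occurrences
    return 1.0 - math.exp(-expected)

print(find_all("010101010", "01010"))            # [0, 2, 4]
print(round(chance_of_appearing(10**7, 7), 3))   # 0.632
print(round(chance_of_appearing(10**7, 6), 5))   # 0.99995
```

Stepping the search forward by one position (rather than by the pattern length) is what lets overlapping hits get counted. The indices here count from zero, which may or may not match how the index quoted above was counted.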

§

Not that I expected the mathematicians to be wrong, but it’s kind of fun seeing it for myself. Kind of like the first time I looked through a telescope and saw Saturn with my own eyes. Awesome!

Stay mathematical, my friends!


Note to programmers: If you’re curious about the programs that generated the data, they are Python programs. I’ve posted the source code on my programming blog, The Hard-Core Coder. See: Python Pi.

About Wyrd Smythe

The canonical fool on the hill watching the sunset and the rotation of the planet and thinking what he imagines are large thoughts.

10 responses to “Happy Pi Day!”

  • charmarie221

    Apple, please. That is all.

  • Steve Morris

    Of course, it’s only Pi Day in the US, and only if you subscribe to crude rounding 🙂 Here in the UK, it’s just another random day.

    • Wyrd Smythe

      Do you mean because it’s March 15th now or because y’all do dates as “14 March” (which does rather ruin the effect)?

      As for rounding, you inspired me to finally get around to doing something I’ve been meaning to do for a while: compare the different precisions of π with regard to how accurately they give the circumference of a circle (good old 2πr).

      Here’s a result (Earth’s circumference):

      15: 25,132.741,228,718,344 
      12: 25,132.741,228,712 
       9: 25,132.741,224 
       6: 25,132.736 
       3: 25,128

      Once you use just six decimal digits, the circumference is accurate down to 1/10th mile!

      And check this out. It’s the circumference of the orbit of (planet) Pluto, given an average distance from the Sun of 3.67 × 10^9 miles. (Sorry, we do miles here. At least I’m not trying to confuse you with “billions.” 🙂 )

      15: 23,059,290,077.349,080,62 
      12: 23,059,290,077.343,26 
       9: 23,059,290,073.02 
       6: 23,059,285,280 
       3: 23,054,940,000 

      Just 12 digits of π get you that 1/10th mile accuracy.

      Silly me. When I created my arbitrary precision calculator I embedded a π constant having 50 decimal digits.

      Which tells me Pluto travels 23,059,290,077.349,082,370,315,802,433,271,551,170,007,223,391,413,234 miles in its orbit!

  • ~ Sadie ~

    I think this is why I have a humanities degree {{LAUGHING!!!!}}
