Well, it’s Pi Day once again (although this date becomes more and more inaccurate as the century proceeds). So, once again, I’ll opine that Tau Day is cooler. (see: *Happy Tau Day!*)

Last year, for extra-special Pi Day, I wrote a post that pretty much says all I have to say about Pi. (see: *Here Today; Pi Tomorrow*) That post was actually published the day before. I used the actual day to kick off last Spring’s series on Special Relativity.

So what remains to be said? Not much, really, but I’ve never let that stop me before, so why start now?

Looking back over my posts here, I see that the first time I wrote about Pi Day was back in 2013, shortly after I retired. (see: *Party Time!*) That was more of a humor post that just touched on the coincidence of pi with dates.

Last year’s post got into a curious property of pi: The way its digits go on forever without repeating.

More importantly, the sequence of digits is believed to be *normal* (this is widely conjectured, though not yet proven). Among other things, that means *any* given finite sequence of digits appears somewhere in the digits of pi.

So every text, sound, and image file (past, present, or future) exists somewhere in pi. And in every other transcendental number with the property of being normal.

I covered all that last year.

What I didn’t get too much into was how random the digits are. Except for the fact that the sequence is well-known, pi is a handy source of randomness (a repeatable, but random, sequence can be useful for testing and development).

That the sequence is well-known means the randomness of pi can’t be used to secure information, but it is useful if you just need a string of random digits.

The reason for today’s post (other than to commemorate the day) is that last fall it occurred to me to see if I could find a text file containing lots and lots of the digits of pi.

I found plenty (yea, internet) and grabbed one that had *ten million* digits of pi. I thought it would be fun to see for myself just how random the digits of pi really are.

Short answer: Indeed, yes, them is random.

That shouldn’t (and didn’t) surprise me, but it was a little startling to see just how “flat” that randomness really is. Check it out:

```
0: 999440 ( 9.994400, +0.005600)
1: 999333 ( 9.993330, +0.006670)
2: 1000306 (10.003060, -0.003060)
3: 999964 ( 9.999640, +0.000360)
4: 1001093 (10.010930, -0.010930)
5: 1000466 (10.004660, -0.004660)
6: 999337 ( 9.993370, +0.006630)
7: 1000207 (10.002070, -0.002070)
8: 999814 ( 9.998140, +0.001860)
9: 1000040 (10.000400, -0.000400)
```

This table shows the ten decimal digits (0-9) along with how many times each appears in the ten-million-digit string, the percentage of occurrences, and how much that percentage deviates from the expected 10% (the two numbers in parentheses always total exactly 10%).

As you see, the maximum deviation — for the digit ‘4’ — is about one-hundredth of a percent, with most of them being much lower. I would expect the deviation to approach zero as more digits are processed.
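The tally above takes only a few lines to reproduce. Here is a minimal Python sketch, not the original program: the function name `digit_frequencies` is hypothetical, and the sign convention (expected minus actual) is an assumption inferred from the table.

```python
from collections import Counter


def digit_frequencies(digits):
    """Return {digit: (count, percent, deviation)} for a string of decimal digits.

    Deviation is expected (10%) minus actual, matching the signs in the table.
    """
    counts = Counter(digits)
    total = len(digits)
    table = {}
    for d in "0123456789":
        percent = 100.0 * counts[d] / total
        table[d] = (counts[d], percent, 10.0 - percent)
    return table


if __name__ == "__main__":
    # Tiny illustrative sample only; a real run would use the full digit file.
    sample = "31415926535897932384"
    for d, (count, percent, deviation) in digit_frequencies(sample).items():
        print(f"{d}: {count} ({percent:9.6f}, {deviation:+9.6f})")
```

Feeding it the contents of a digits file (with any non-digit characters stripped first) should give a table shaped like the one above.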

I also tried scanning the digits to see if there was any structure based on what digit followed any given digit. That, too, should be entirely random.

Here’s a table showing the same sort of data for what digit follows the digit ‘4’ (which “won” the table above):

| Pair | Count | Percent | Deviation |
|---|---|---|---|
| 4_0 | 100217 | 1.002170 | -0.002170 |
| 4_1 | 100232 | 1.002320 | -0.002320 |
| 4_2 | 99991 | 0.999910 | +0.000090 |
| 4_3 | 100008 | 1.000080 | -0.000080 |
| 4_4 | 99915 | 0.999150 | +0.000850 |
| 4_5 | 100073 | 1.000730 | -0.000730 |
| 4_6 | 100394 | 1.003940 | -0.003940 |
| 4_7 | 99944 | 0.999440 | +0.000560 |
| 4_8 | 99994 | 0.999940 | +0.000060 |
| 4_9 | 100325 | 1.003250 | -0.003250 |

The only difference is that the expected percentage is 1% here, because the percentage is calculated against the total of digits. But we’re only considering digits following the digit ‘4’ — which only occurs 10% (or so) of the time.

So each following digit here is from a pool that’s 10% of the total. Divvying that up among the ten digits gives us the expected 1% figure.

The flatness remains, and this result is similar for the other nine digits.
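For the curious, the follower tally can be sketched like this in Python. This is an illustration rather than the original program; `follower_counts` and `follower_table` are hypothetical names, and the percentage is taken against the total number of digits, as described above.

```python
from collections import Counter


def follower_counts(digits, lead):
    """Count which digit comes immediately after each occurrence of `lead`."""
    return Counter(b for a, b in zip(digits, digits[1:]) if a == lead)


def follower_table(digits, lead):
    """Return {pair_label: (count, percent_of_all_digits, 1% minus percent)}."""
    counts = follower_counts(digits, lead)
    total = len(digits)
    table = {}
    for d in "0123456789":
        percent = 100.0 * counts[d] / total
        table[f"{lead}_{d}"] = (counts[d], percent, 1.0 - percent)
    return table
```

Calling `follower_table(digits, "4")` on the full digit string would produce rows labeled `4_0` through `4_9`, like the table above.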

**§**

Last year’s post touched on how a random series does contain structure, the simplest of which are strings of repeating digits. (More complex structure generally involves *patterns* of digits.)

So here’s a table showing repeating sequence frequencies:

| Digit | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|
| 0 | 80,810 | 8,162 | 730 | 78 | 4 | 1 |
| 1 | 81,024 | 7,921 | 809 | 84 | 8 | 1 |
| 2 | 80,945 | 8,112 | 811 | 71 | 9 | 0 |
| 3 | 81,449 | 8,200 | 787 | 77 | 5 | 2 |
| 4 | 81,276 | 8,027 | 748 | 74 | 9 | 0 |
| 5 | 81,195 | 8,206 | 833 | 83 | 8 | 2 |
| 6 | 81,221 | 7,978 | 783 | 82 | 9 | 1 |
| 7 | 81,288 | 8,021 | 821 | 87 | 11 | 2 |
| 8 | 81,031 | 8,056 | 765 | 64 | 15 | 2 |
| 9 | 80,773 | 8,202 | 841 | 75 | 9 | 4 |

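If we add the frequency counts of each row, counting a run of length *n* as the *n* − 1 back-to-back pairs it contains, we get about 100,000, which is about the count of any given digit following itself (the second table above). For the digit ‘4’ the match is exact: 81,276 + 2 × 8,027 + 3 × 748 + 4 × 74 + 5 × 9 = 99,915, which is the 4_4 count. (A plain, unweighted sum of a row comes to only about 90,000.)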

(That the results of two analysis algorithms agree adds to the confidence there aren’t any major programming bugs!)
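That cross-check is easy to sketch in Python. The version below is an assumption-laden illustration, not the original code: it assumes the table counts *maximal* runs (a run of five 4s counts once under length 5, not also under lengths 2 through 4), and the names `run_counts` and `pairs_from_runs` are hypothetical. The row for the digit ‘4’ is copied from the table above.

```python
from collections import Counter
from itertools import groupby


def run_counts(digits, min_len=2):
    """Count maximal runs of a repeated digit: {(digit, length): occurrences}."""
    runs = Counter()
    for digit, group in groupby(digits):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs[(digit, length)] += 1
    return runs


def pairs_from_runs(runs, digit):
    """A maximal run of length n contains n - 1 adjacent equal pairs."""
    return sum((n - 1) * c for (d, n), c in runs.items() if d == digit)


# Row '4' of the run table (run length -> count), copied from the post.
row4 = {2: 81276, 3: 8027, 4: 748, 5: 74, 6: 9, 7: 0}
weighted_row4 = sum((n - 1) * c for n, c in row4.items())
print(weighted_row4)  # 99915, the same as the 4_4 count in the second table
```

The exact agreement with the 4_4 count is what suggests the table holds maximal runs.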

It’s interesting how each additional digit decreases the number of occurrences by a factor of ten. (Holy Logarithms, Batman!) One can assume this holds for longer sequences as more digits are scanned.

The fraction of two-digit sequences remains about 0.8% when scanning only 1,000,000 or 100,000 digits, which implies a simple formula giving us the likely count for a repeating sequence of *any* length:

count = total-digits * (0.8 ÷ 10^{length})

For a sequence length of **3** in our 10-million digits, we have:

count = 10^{7} * (0.8 ÷ 10^{3})

Which, as expected, is **8,000**.

Let’s try a sequence length of **20** in 100-*quintillion* digits:

count = 10^{20} * (0.8 ÷ 10^{20})

Which is **0.8**, so we might find one such string in all those digits, but if we keep searching more digits we’re certain to find it! (Here you can see the correlation between number of digits and length of the sequence.)
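As a quick sanity check on the arithmetic, here is the formula as a small Python function. The name `expected_runs` and the `factor` parameter are hypothetical. The default of 0.8 is the empirical figure from the table; if the digits are independent and the table counts maximal runs, the factor would work out to 0.9 × 0.9 = 0.81, since the digits just before and just after a run must differ from it.

```python
def expected_runs(total_digits, length, factor=0.8):
    """Estimated count of runs of one given digit with the given length."""
    return total_digits * factor / 10 ** length


print(expected_runs(10**7, 3))                # about 8,000
print(expected_runs(10**20, 20))              # about 0.8
print(expected_runs(10**7, 2, factor=0.81))   # about 81,000, close to the table
```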

**§**

One might also look for a simple sequence, such as 123456789, but such a sequence is really no more structured than 444444444. Both are specific digit strings of length nine.

So we would expect, per table three above, that the best we could do in the 10-million digits is a sequence like 1234567. We might expect to find almost any given sequence of length seven.

We would, likewise, not expect to find any of length eight or greater.

Sure enough, the sequence 1234567 appears in the data at index 9,470,444 (very close to the end of the data). And the sequence 12345678 does not occur.
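A search like that needs nothing fancier than a string find. Here is a minimal sketch; the helper name `find_all` is hypothetical, and the indexes it returns are zero-based, which may or may not match how the index quoted above was counted.

```python
def find_all(digits, pattern):
    """Return the start index of every occurrence, overlapping ones included."""
    hits = []
    start = digits.find(pattern)
    while start != -1:
        hits.append(start)
        start = digits.find(pattern, start + 1)
    return hits


# Small illustration on the first few digits of pi.
print(find_all("31415926535", "5"))  # [4, 8, 10]
```

An empty list means the pattern never occurs in the digits searched.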

I’ve gotten tired of typing in six-digit search strings trying to find one that doesn’t appear.

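The odds of finding any given seven-digit string are decent, but far from certain: with only about one expected occurrence in ten million digits, a given string turns up roughly 63% of the time if the digits behave randomly. The string 1010101 does not appear, but 0101010 appears five times!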
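Those chances can be estimated with a rough model. The sketch below assumes the digits behave like independent random digits and uses a Poisson approximation (it ignores the way overlapping patterns such as 0101010 cluster). The function name `chance_of_appearing` is hypothetical.

```python
import math


def chance_of_appearing(total_digits, length):
    """Approximate probability that one specific string appears at least once."""
    positions = total_digits - length + 1
    expected = positions / 10 ** length  # expected number of occurrences
    return 1.0 - math.exp(-expected)


for length in (6, 7, 8):
    print(length, round(chance_of_appearing(10**7, length), 4))
```

Under this model a given six-digit string is almost certain to appear in ten million digits, a seven-digit string appears a bit under two times in three, and an eight-digit string appears a little under one time in ten.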

**§**

Not that I expected the mathematicians to be wrong, but it’s kind of fun seeing it for myself. Kind of like the first time I looked through a telescope and saw Saturn with my own eyes. Awesome!

*Stay mathematical, my friends!*

Note to programmers: If you’re curious about the programs that generated the data, they are Python programs. I’ve posted the source code on my programming blog, The Hard-Core Coder. See: *Python Pi*.

March 14th, 2016 at 10:14 pm

Apple, please. That is all.

March 14th, 2016 at 11:38 pm

Sorry, the Apple Pi is locked (and encrypted). Even the FBI can’t get a slice…

March 15th, 2016 at 1:23 pm

Somebody just isn’t trying hard enough!

March 15th, 2016 at 3:00 pm

Of course, it’s only Pi Day in the US, and only if you subscribe to crude rounding 🙂 Here in the UK, it’s just another random day.

March 15th, 2016 at 3:41 pm

Do you mean because it’s March 15th now or because y’all do dates as “14 March” (which does rather ruin the effect)?

As for rounding, you inspired me to finally get around to doing something I’ve been meaning to do for a while: compare the different precisions of π with regard to how accurately they give the circumference of a circle (good old `2πr`).

Here’s a result (Earth’s circumference):

Once you use just six decimal digits, the circumference is accurate down to 1/10th mile!

And check this out. It’s the circumference of (planet) Pluto given an average distance from the Sun of 3.67 × 10^{9} miles. (Sorry, we do miles here. At least I’m not trying to confuse you with “billions.” 🙂 )

Just 12 digits of π get you that 1/10th mile accuracy.
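A comparison along those lines can be sketched with Python’s `decimal` module. This is an illustration, not the calculator mentioned here, and several things in it are assumptions: π is *truncated* rather than rounded, “digits” means significant digits (so six digits is 3.14159), Earth’s radius is taken as 3,959 miles, and the names `truncated_pi` and `circumference_error` are hypothetical.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60  # enough working precision for a 50-decimal pi

PI_50 = "3.14159265358979323846264338327950288419716939937510"


def truncated_pi(sig_digits):
    """Pi cut off (not rounded) to the given number of significant digits."""
    return Decimal(PI_50[: sig_digits + 1])  # +1 skips over the decimal point


def circumference_error(radius_miles, sig_digits):
    """Miles lost in 2*pi*r when pi is truncated to sig_digits digits."""
    radius = Decimal(radius_miles)
    return 2 * (Decimal(PI_50) - truncated_pi(sig_digits)) * radius


print(circumference_error(3959, 6))        # Earth (assumed radius)
print(circumference_error("3.67e9", 12))   # Pluto's orbit
```

Under these assumptions, both results come out under a tenth of a mile, while one digit fewer in either case does not.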

Silly me. When I created my arbitrary precision calculator I embedded a π constant having 50 decimal digits.

Which tells me Pluto travels 23,059,290,077.349,082,370,315,802,433,271,551,170,007,223,391,413,234 miles in its orbit!

March 16th, 2016 at 5:15 am

Yes, on March 14, 2016, we say 14/3/16. Logical, I’m sure you’d agree?

March 16th, 2016 at 10:38 am

Oh, quite!

(Although, when it comes to naming files, I reverse that — 2016-03-14 — because then my files with dates in their name (log files and so forth) sort by date automagically.)

March 16th, 2016 at 12:30 pm

Exactly!

May 15th, 2016 at 6:46 pm

I think this is why I have a humanities degree {{LAUGHING!!!!}}

May 16th, 2016 at 7:54 pm

I wonder if there’s a university with an inhumanities degree? 😄

You could take a course in anti-social media there! 😀