L26

Today I’d like to introduce you to a concept I picked up from mathematician Rudy Rucker in his 1987 book, Mind Tools (The Five Levels of Mathematical Reality). I’ll warn you now that there is some math ahead (but no math homework, unless you want some). It won’t get any more complicated than multiplication and addition, but we will be dealing with some extremely large numbers (so large they are more ideas than numbers).

The end result is that we’re going to tie together the written word with numbers.  I’m going to show you how every word, every sentence, every book, magazine and blog article can be reduced to a single (very large) number. That we can do this provides a foundation we can use to discover some amazing things about mathematical reality.

It may sound dry or intimidating, but stick with it! You just might find it worthwhile.

Rucker starts the discussion with the old joke about prison inmates. As they gather in the prison dining hall, a new prisoner notices how sometimes an inmate stands up and calls out a number.  “287!” exclaims one, and all the other inmates break into laughter.  “62!” exclaims another, and again everyone laughs.

The new guy asks about this and is told that most of the guys have been here so long that they’ve all memorized the 1001 Great Jokes book from the library. Therefore, all an inmate needs to do is call out the number of the joke.

The joke goes on to tell how the new guy decides to try it out, so one day he stands up and calls out, “871!”  Not only does no one laugh, some actually groan!  The poor guy asks what happened: “Isn’t 871 a good joke?”

“No, it’s one of the best!” he’s told.  “But why didn’t anyone laugh?” he asks.

“Well,… you didn’t tell it right,” comes the reply.

The point here is that the jokes were coded into numbers.  Given this coding system, the entire joke is reduced to a single number, which represents the joke.  One can envision a scheme that attempts to do this to all jokes that exist.  As there are many, many jokes, the numbers would get large (millions? billions?).  And some sort of index is required to translate between the numbers and the jokes (in the joke book case, the book itself is the index).

Rucker brings up another idea, an extension of which is often used to code secret messages (you may have run into this one in movies or books). If you and a friend each have a copy of the same dictionary, you could number the words from beginning to end.  You could then send secret messages by using those numbers. The recipient would use their copy to decode the message.  (In spy novels, an ordinary book is used, and the code numbers use some combination of page, line and word counts. For example, 121-32-5 might mean the 5th word on the 32nd line of page 121.)
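As a minimal sketch of the dictionary scheme, the Python below numbers the words of a tiny word list and uses those numbers to encode and decode a message. The word list is a made-up stand-in for a real dictionary, and the names (DICTIONARY, encode, decode) are illustrative assumptions, not anything from Rucker's book.

```python
# A tiny, made-up stand-in for the shared dictionary (an assumption for illustration).
DICTIONARY = sorted(["at", "dawn", "meet", "me", "midnight", "the", "bridge"])

# Number the words from beginning to end, starting at 1.
WORD_TO_NUMBER = {word: i for i, word in enumerate(DICTIONARY, start=1)}
NUMBER_TO_WORD = {i: word for word, i in WORD_TO_NUMBER.items()}


def encode(message):
    """Replace each word with its position in the shared dictionary."""
    return [WORD_TO_NUMBER[word] for word in message.lower().split()]


def decode(numbers):
    """Look each number up in the recipient's copy of the dictionary."""
    return " ".join(NUMBER_TO_WORD[n] for n in numbers)


secret = encode("Meet me at the bridge")
print(secret)          # [5, 4, 1, 7, 2]
print(decode(secret))  # meet me at the bridge
```

Both sides need the same word list in the same order, which is why the scheme depends on identical copies of the book.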

Rudy Rucker. Looks like a mathematician, doesn’t he?

The point is that, again, we’re converting text, words, into numbers.  In the spy code scheme, we’re using a bunch of numbers, two or three for each word.  This isn’t very efficient, but it works very well. (The lack of efficiency actually aids the necessary secrecy of the message.)

Something very important to realize here is that these schemes produce unique numbers for the jokes or words.  A given number always stands for only one joke or word, so there is never any confusion going from the number to the text.  (In our spy code, notice that any given word could appear many times in a book, so a given word might have many code numbers.  This is not the case with the joke codes or the dictionary code, since each joke or word only appears once in their respective books.)

So now we’ve seen a couple of specialized ways to relate (“map” is the technical term) text and numbers.  Let’s now consider a more universal way of creating a map between text and numbers.  To do that, I need to make sure you understand how we write numbers.  If you already know what “positional notation” is all about, feel free to skip ahead.

The numbers we use every day (such as those used to number the jokes) consist of ten symbols that we call “digits” (0, 1, 2, 3, 4, 5, 6, 7, 8 & 9).  The digits are the “alphabet” of numbers.  We can think of numbers containing more than one digit as similar to words. Each combination of symbols gives us a unique number or word. (The words “ate” and “eat” are different despite having the same letters. Likewise, “582” and “825” are different numbers.)

Positional notation is the idea that the position of a digit in a number multiplies the digit by some factor.  Our everyday numbering system is called “decimal” because it uses ten distinct symbols. The number of symbols is called the “base” (or radix) of the system.  Decimal notation is a “base 10” numbering system.

Base 10 blocks, a teaching aid. Each shape represents a position. Pictured above, shapes for 1’s, 10’s (the sticks), 100’s (the sheets) and 1000’s (the big block).

The multiplication factor is tied to the base (the symbol count). Ten symbols means that our multipliers are based on ten.  Specifically, each position (moving right to left) has a multiplication factor ten times larger than the one before it.

The first position, the right-most digit, always has the multiplication factor of 1. Single digits therefore always just have their own value (“8” = {8 x 1} = 8).

To get the multiplier for the next position, we multiply by 10.  Since 1 x 10 = 10, the multiplier for the second position (from the right) is always the base of the numbering system.  Binary notation (which computers use internally) is a base 2 system having only two symbols, the famous “1”s and “0”s of computers.  Therefore in binary, the second position multiplier is 2.

To get the third position (in decimal) we multiply again by ten: 10 x 10 = 100.

The fourth position is: 100 x 10 = 1,000.  (And so on, basically adding a zero each time we multiply by ten.)
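The rule for building multipliers can be sketched in a few lines of Python. The function name place_values is just an illustrative choice; the logic is the one described above: start at 1 and multiply by the base for each step to the left.

```python
def place_values(base, count):
    """Return the multipliers for the first `count` positions, right to left."""
    values = []
    multiplier = 1  # the right-most position always has a multiplier of 1
    for _ in range(count):
        values.append(multiplier)
        multiplier *= base  # moving one position left multiplies by the base
    return values


print(place_values(10, 4))  # [1, 10, 100, 1000]
print(place_values(2, 4))   # [1, 2, 4, 8]
```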

So let us put this to use.  We’re in the year 2013. What does that number mean?  It breaks down like this:

2 x 1000 = 2000
0 x 100 = 0
1 x 10 = 10
3 x 1 = 3

And: 2000 + 0 + 10 + 3 = 2013
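The same breakdown can be produced mechanically. This is a small Python sketch (the function name expand is an assumption made for illustration) that pairs each digit with its multiplier and adds up the products.

```python
def expand(numeral, base=10):
    """Return (digit, multiplier, product) rows for a numeral, left to right."""
    rows = []
    for position, symbol in enumerate(reversed(numeral)):
        digit = int(symbol, base)      # value of the single symbol
        multiplier = base ** position  # 1, base, base*base, ...
        rows.append((digit, multiplier, digit * multiplier))
    rows.reverse()  # show the left-most digit first, as in the text
    return rows


rows = expand("2013")
for digit, multiplier, product in rows:
    print(f"{digit} x {multiplier} = {product}")
print("Total:", sum(product for _, _, product in rows))  # Total: 2013
```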

Now that seems like a really trivial example.  Even this business of just adding a zero to each multiplier seems strangely obvious.  Well, of course it does. We’ve been counting in decimal all our lives, and we’re used to seeing decimal numbers.

Let’s try it in binary, and it will seem a bit less obvious.  Remember that in binary we’re dealing with factors of 2. Let’s apply that to a randomly chosen binary number, say “11111011101”.

Calm down! It’s only ones and zeros

1 x 1024 = 1024
1 x 512 = 512
1 x 256 = 256
1 x 128 = 128
1 x 64 = 64
0 x 32 = 0
1 x 16 = 16
1 x 8 = 8
1 x 4 = 4
0 x 2 = 0
1 x 1 = 1

And: 1024 + 512 + 256 + 128 + 64 + 16 + 8 + 4 + 1 = 2013
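If you want to check a binary number without writing out the whole table, here is a short Python sketch. It takes a slightly different route than the table: it reads the bits left to right, doubling a running total and adding each bit (sometimes called Horner's method). The function name binary_to_int is an illustrative assumption; Python's built-in int(text, 2) does the same job and is shown for comparison.

```python
def binary_to_int(bits):
    """Evaluate a string of 1s and 0s by doubling a running total."""
    total = 0
    for bit in bits:
        total = total * 2 + int(bit)  # shift everything one position left, add the new bit
    return total


print(binary_to_int("11111011101"))  # 2013
print(int("11111011101", 2))         # 2013, using Python's built-in conversion
```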

It’s a lot less obvious when we get away from familiar decimal, isn’t it?

Now we know how to calculate the multipliers for a given base. Just multiply by the base every time you move one position to the left.

What about the symbols themselves? We’ve been using the ordinary digits in each position, but realize that these are just symbols.  The symbol “8” stands for the concept of eight somethings (ships, shoes, sheep or whatever). We’ve assigned the value eight to the text symbol “8”.

What if we used different symbols and assigned values to them?

What if we used the alphabet, from “A” to “Z”?  The natural ordering of the alphabet gives us an easy way to assign values.  “A” = 1, “B” = 2, “C” = 3 and so on.  You may have even used this idea to create a simple (easily broken) code.  You should be able to decode “23-25-18-4 19-13-25-20-8-5” very easily.
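For the impatient, a minimal Python sketch of a decoder for this kind of code follows. It assumes hyphens separate letters and spaces separate words, as in the example above; the function name decode_alpha is made up for illustration. The demo uses a different message so the puzzle above stays unspoiled.

```python
def decode_alpha(code):
    """Decode a string like '8-9' where A=1 ... Z=26 and spaces separate words."""
    words = []
    for group in code.split():
        # Turn each number back into its letter: 1 -> 'A', 2 -> 'B', ...
        letters = [chr(ord("A") + int(n) - 1) for n in group.split("-")]
        words.append("".join(letters))
    return " ".join(words)


print(decode_alpha("8-5-12-12-15"))  # HELLO
```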

But let’s take it a step further into the realm of positional notation rather than a string of individual numbers.  We have 26 symbols (letters) in our coding system, and that means we’re using base 26 notation.  Each position’s multiplier is 26 times the previous one (instead of 10 or 2 times).  That’s going to lead to some larger numbers than we’re used to.  Let’s consider the first dozen factors:

1
26
676
17,576
456,976
11,881,376
308,915,776
8,031,810,176
208,827,064,576
5,429,503,678,976
141,167,095,653,376
3,670,344,486,987,776
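The list above is just the powers of 26 from 26^0 through 26^11. A short Python sketch to reproduce it (Python integers don't overflow, so the big values are exact):

```python
# The first dozen base 26 multipliers: 26**0 through 26**11.
factors = [26 ** position for position in range(12)]

for factor in factors:
    print(f"{factor:,}")  # the ',' format option adds thousands separators
```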

As I said, the numbers are gonna get big up in here!  Let’s try a simple case, the word “Wyrd.”

[W] = 23 x 17,576 = 404,248
[Y] = 25 x 676 = 16,900
[R] = 18 x 26 = 468
[D] = 4 x 1 = 4

And: 404,248 + 16,900 + 468 + 4 = 421,620
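Here is a minimal Python sketch of the same calculation. The function name l26_encode is an illustrative assumption. The post only spells out A = 1, B = 2, C = 3 "and so on"; this sketch treats Z as 0 so that there are exactly 26 digit values, which matches the author's remark in the comments below that a lone "Z" has the number zero. Treat that detail as an assumption.

```python
def l26_encode(word):
    """Read a word as a base 26 numeral, with A=1 ... Y=25 and Z=0 (assumed)."""
    number = 0
    for letter in word.upper():
        if not "A" <= letter <= "Z":
            raise ValueError(f"L26 only covers the letters A to Z, got {letter!r}")
        value = (ord(letter) - ord("A") + 1) % 26  # A=1 ... Y=25, Z wraps to 0
        number = number * 26 + value  # shift one position left, add the new letter
    return number


print(f"{l26_encode('Wyrd'):,}")  # 421,620
```

Upper and lower case are folded together here, so "Wyrd" and "WYRD" get the same number.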

Hi! My name is 421,620. Pleased to meetcha!

So now we have a system, let’s call it “L26”, that allows us to code any (English) word into a single number. Effectively this means that every word is a distinct number.
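Going the other way, from number back to word, is the usual base conversion by repeated division. The Python sketch below is one way to do it, under the same assumption as the author's comment further down that "Z" stands for zero (so A = 1 through Y = 25). The function name l26_decode is made up for illustration.

```python
def l26_decode(number):
    """Turn a non-negative integer back into letters (A=1 ... Y=25, Z=0 assumed)."""
    if number == 0:
        return "Z"  # zero is written as a single "Z", like "0" in decimal
    letters = []
    while number > 0:
        number, value = divmod(number, 26)  # peel off the right-most base 26 digit
        letters.append("Z" if value == 0 else chr(ord("A") + value - 1))
    return "".join(reversed(letters))  # digits came out right to left


print(l26_decode(421620))  # WYRD
```

Because leading zeros don't change a number, a decoded word never starts with "Z" (apart from the lone "Z" for zero), which is the caveat the comments below discuss.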

But what about sentences with spaces between the words?  What about punctuation?  What about foreign languages?  What’s the number for Wörd?  Tomorrow I’ll explore these topics!

About Wyrd Smythe

The canonical fool on the hill watching the sunset and the rotation of the planet and thinking what he imagines are large thoughts.

8 responses to “L26”

  • The Color of Lila

    … and Lila is 217,309. But can we decode it back to “Lila” without first knowing the name? And… won’t other names have my number?

    Speaking of numerical cryptography… the number of the Beast is 666. Scholars have long debated who this refers to, but generally it is agreed to be some annoying person of the time, and not Satan.

    • Wyrd Smythe

      Absolutely. The process is identical to how you take a value and calculate the digits that express the number in a given base. (It’s the reverse of the process I demonstrated, so it involves division.) In decimal, of course, you end up with the same digit string you started with. When you’re expressing a number in a different base, you’ll get a different string of “digits.” (If you’ve ever converted a numeric value into a binary string of 1’s and 0’s, you’ve done the steps.)

      A given string has only one number (in a given scheme), and any number has only one possible text string. And every string has a (unique) number, and every number represents some (unique) string. (Mathematicians say the mapping is bijective, or a “one-to-one correspondence.”) So, no one else has your number!

      There is a slight caveat when it comes to the symbol we use for zero. Numerically, “00000” and “0” are the same thing. As strings, they’re not (or shouldn’t be). However their number would be zero in both cases. In the L26 system, “Z” and “ZZZZZZZZ” would both have the number zero. Any string of just “Z”s will. And strings with leading “Z”s will have the same number regardless of the number of leading “Z”s.

      (In today’s article, the L27 system alleviates that problem, but still has the same issue with spaces, so strings with leading spaces will always have the same number, and all strings of just spaces have the number 0. When you get into higher “L” systems, you can assign zero to some character that never appears, and then the problem goes away.)

      I’ve seen a lot of fun with 666 over the years; shows up in horror movies a lot. When a baseball team is winning twice as many as they lose (i.e. winning 2/3 of their games), they’re playing .666, which I like to refer to as “demon ball.” (FWIW, it’s unheard of to play at that level for any time… best team in baseball right now (the Cardinals) are playing .618. Thus “demon ball” earns the name… almost has to involve some sort of nefarious deal! (Damned Yankees!))

  • The Color of Lila

    Thanks. Now my brain hurts. ; )
