Experimental Math Part II: The Naughty, Naughty Life of Pi
What the hell did you put in my Pi?
Searching for naughty (tee hee!) words in the world’s most famous transcendental number
I’m seeing a lot of buzz about the number Pi lately – must be with Pi day coming up (March 14th, 3-14, get it?). Or maybe it’s a more general love affair with transcendental numbers. Lots of interesting factoids (or as I call them, “Factwads”) are being bandied about, and one of them caught my eye – it has to do with the interesting fact that Pi is a never-ending stream of seemingly-random digits. They’re not really random, of course, but they look random – if you read the first 100 digits of Pi, it will look like someone picked digits out of thin air.
So what’s so cool about that? If the digits really do behave like a random stream (mathematicians strongly suspect so, though nobody has actually proven it), then eventually you’ll find any combination of digits in a row that you want – somewhere along the string of decimal places, you’ll find “12345” for example. You will also find your birthday, somewhere. It might take a while to find it, but given you’ve got an infinite sequence of digits to look through, you’ll find it eventually. Somewhere in there is your cholesterol score, today’s winning lottery number, tomorrow’s winning lottery number, the bank account balances of everyone on the planet, the 68th prime number – any string of digits you can think of is in there somewhere.
What’s cooler is if you try encoding words, English text, into numbers – which can then be found somewhere in Pi. That means that any string of words can be found (after suitable conversion to numbers) somewhere in the digits of Pi. Any phrase you’ve ever uttered, any book ever written, even books not yet written, are in there somewhere. Somewhere along Pi is a copy of this very essay! The digits of Pi are essentially an example of the famous “monkeys at typewriters” producing every possible string of text, and it’s right there inside the geometry of a circle.
So naturally, I decided I wanted to try to look for swear words in Pi.
So how do we actually go about this? There are lots of ways to encode letters into numbers – you probably remember doing simple ciphers like A = 1, B = 2, etc. as a kid. That never got you that coveted job at the CIA, did it? Tsk. For this exercise, I’m using the ASCII encoding – for the purpose of computer typing, every letter (and in fact every bit of punctuation you can type on a computer) has been assigned a number. For example, the letter ‘A’ gets assigned the two-digit number 65. Lowercase gets a different number – ‘a’ is assigned to 97. If you’re curious, the percent sign ‘%’ gets assigned to 37. The table to the right shows the whole ASCII code, if you’re interested. So now we can convert any word into a string of numbers, by just stringing along the numeric ASCII codes for each letter. So for example, the word ‘FART’ has F = 70, A = 65, R = 82, T = 84, which strung together gives ‘70658284’. That four-letter word expands into an 8-digit number. Note that you don’t have to use this encoding, and if you use a different one you’ll get different results than I have. If you use an encoding that gets by with fewer numeric digits, you’ll have better luck finding whatever phrase you want in Pi, for reasons we’ll discuss below.
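If you want to play along at home, the conversion takes only a couple of lines of Python. This is just a minimal sketch of the scheme described above; the function name encode_ascii is my own invention for illustration.

```python
def encode_ascii(word: str) -> str:
    """Concatenate the decimal ASCII code of every character in `word`."""
    return "".join(str(ord(ch)) for ch in word)


print(encode_ascii("FART"))  # 70658284
```

Note that uppercase letters all have two-digit codes (65 through 90), so an N-letter uppercase word always becomes exactly 2N digits.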
Now, I need to get my hands on some Pi. Easy – a quick internet search turned up many copies of Pi out to millions of digits, and I selected one that goes out to 100 million, available here (http://archive.org/details/Pi_to_100000000_places). (Note that I didn’t check accuracy – once timeblimp.com becomes a paying enterprise, I’ll double back and check his numbers.) Ideally I’d like to search even farther along Pi, as I may need to search untold billions until I come across the phrases I want. But we’re also limited by practical aspects of computing and time, so for now I’m limiting myself to 100 million decimal places. Now it’s simple to write a computer program to search for any arbitrary string of letters in the entire 100-million sequence of digits. It doesn’t take quite as long as you’d think, so I was able to find a dictionary of 170 thousand English words, which I could then search for in Pi, one by one. With particular attention on certain four-letter words….
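For the curious, here is roughly what such a search could look like in Python. Treat it as a sketch under a few assumptions rather than the exact program I ran: the digits are assumed to be loaded into one long string containing only the decimal places (no leading "3."), positions are counted from 1 at the first decimal place, and the names find_word and search_words are made up for this example.

```python
from typing import Dict, Iterable, Optional


def encode_ascii(word: str) -> str:
    """Concatenate the decimal ASCII code of every character in `word`."""
    return "".join(str(ord(ch)) for ch in word)


def find_word(digits: str, word: str) -> Optional[int]:
    """Return the 1-based position of the first match, or None if absent."""
    index = digits.find(encode_ascii(word))
    return None if index < 0 else index + 1


def search_words(digits: str, words: Iterable[str]) -> Dict[str, int]:
    """Map every word that appears in `digits` to its first position."""
    found = {}
    for word in words:
        position = find_word(digits, word)
        if position is not None:
            found[word] = position
    return found


# Tiny made-up digit string (NOT real digits of Pi) to show the idea.
sample = "14159" + "70658284" + "2653"
print(search_words(sample, ["FART", "BARF"]))  # {'FART': 6}
```

To run this on the real thing, you would read the downloaded digit file into a string and pass it in place of sample. The built-in str.find does the heavy lifting, which is a big part of why the search is quicker than you'd think.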
So what happened? Well, I can’t find shit in Pi, but I can sure find plenty of other interesting words. The first mildly naughty word we come across is ‘SUCK’, at position 3,833,072. That might sound like a pretty large number of decimal places, but that’s actually relatively early along for a four-letter word. We also come across ‘BARF’ at 4,843,188. The quite useful word ‘CRAP’ appears at 51,557,663, and my personal favorite, ‘JIZZ’ appears at position 18,953,642. So now I can say that I found some jizz in my Pi.
Here are all the four-letter words I could think of that I managed to find in the first 100 million digits of Pi:
While some longer phrase (e.g. ‘I got your Pi right HERE’) should in theory appear somewhere in the digits of Pi, as a practical matter it will be very difficult to find. We find two-letter words all over the place, three-letter words fairly commonly, but the first commonly-known four-letter word to show up in Pi according to my scheme is ‘RUTS’, at position 17,346. I can’t find a five-letter word until almost 3 million positions in, the word ‘FLOWN’ at position 2,860,850. Not surprisingly, the difficulty of finding any given N-letter word grows exponentially as N increases.
To illustrate this, below is a plot showing the results I got in trying to find about 170,000 English words in the first 5 million digits of Pi. We see tons of 3-letter words, a fair portion of 4-letter words (though far fewer than 3-letter words), and just a couple of 5-letter words. As you might have guessed, the longer the word, the harder it is to find.
So if you’re looking for your favorite swear word in Pi, and it happens to be pretty long (like my friend @peterwoodward, who continues to pester me to find “FUCKFACE” in Pi), you’d better be prepared to hunt for a while. As a rule of thumb, for my encoding scheme you’ll need to hunt through 10^(2N) digits of Pi to have a reasonable chance at finding a string of letters of length N. For this essay, which is about 10,000 letters long, you’d have to hunt through 10^20,000 digits in Pi, which is a googol raised to the 200th power. Keep in mind that the number of atoms in the universe is much less than one googol. So there’s nowhere near enough material in the universe to form a piece of paper (or computer) large enough to store the digits of Pi you’d need to eventually stumble across this particular essay!
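Here is a back-of-the-envelope version of that rule of thumb in Python. It rests on an assumption: the digits are treated as if they were truly random, so the number of matches is approximately Poisson-distributed. The function names are hypothetical, and the numbers are estimates, not measurements.

```python
import math


def expected_hits(num_letters: int, digits_searched: int) -> float:
    """Expected number of matches for one specific N-letter word,
    given that each letter becomes two decimal digits."""
    return digits_searched / 10 ** (2 * num_letters)


def chance_of_finding(num_letters: int, digits_searched: int) -> float:
    """Approximate probability of at least one match (Poisson model)."""
    return -math.expm1(-expected_hits(num_letters, digits_searched))


# A specific four-letter word in 100 million digits: a bit better than a coin flip.
print(round(chance_of_finding(4, 10**8), 3))  # 0.632

# The essay-sized search really is a googol to the 200th power.
print(10**20000 == (10**100) ** 200)  # True
```

So under this model any particular four-letter word has only about a 63% chance of turning up in 100 million digits, which fits with some of my favorites going missing.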
That’s pretty nerdy… but can you really go overboard with all this stuff?
Me? No, I’m stopping here. Other people? Oh, you betcha! I’ve learned by this point in my career that when I take something to a nerdy extreme, someone else surely has outdone me by a mile. And for Pi, probably the most famous transcendental number of them all? There are endless fascinating topics to explore about the wonders of Pi. Here are a few other folks exploring the search for patterns in Pi, to whet your appetite.
If you’re wondering if there’s any way you could look up your favorite word in the digits of Pi (aside from contacting me on Twitter, @timeblimp), head on over to the Search For Pi page, where mathematician David Bailey has set up a tool that lets you search for any word you want in the first four billion digits of Pi. With that kind of room, you can fairly easily find words up to 7 or 8 letters long, which makes the usual four-letter swear words a piece of cake to find.
Over at The Pi Code, Mike Keith has taken my amateurish fumbling and made it the real deal. He’s followed up on the observation that a more efficient coding of letters-to-numbers than the ASCII Code might be more fruitful, by using Base 26 for the conversion – much more natural for our 26-letter alphabet. For his encoding scheme, he’s estimated how often we’d expect to see an N-letter word show up in the digits of pi, and it clearly shows the exponential growth I observed – we should see 4-letter words every 81 positions, but 8-letter words every 5.7 million positions. The first six-letter word he found was “OXYGEN”, which is appropriate, as that’s what nerds like us need after walking up a flight of stairs.
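To get a feel for the base-26 idea, here is a small Python sketch that rewrites the fractional part of Pi in base 26 and spells each base-26 digit as a letter. The mapping A = 0 through Z = 25 is my assumption for illustration (Keith’s exact convention may differ), and the function name is made up. Only 50 decimal places are hard-coded, which is enough to trust the first few letters but not many more.

```python
from fractions import Fraction
import string

# First 50 decimal places of Pi (the part after "3.").
PI_DECIMALS = "14159265358979323846264338327950288419716939937510"


def fraction_to_base26_letters(decimals: str, count: int) -> str:
    """Convert the decimal fraction 0.<decimals> to `count` base-26 digits,
    written as letters with A = 0, B = 1, ..., Z = 25 (assumed mapping)."""
    frac = Fraction(int(decimals), 10 ** len(decimals))
    letters = []
    for _ in range(count):
        frac *= 26
        digit = int(frac)  # integer part is the next base-26 digit
        letters.append(string.ascii_uppercase[digit])
        frac -= digit
    return "".join(letters)


print(fraction_to_base26_letters(PI_DECIMALS, 5))  # DRSQL
```

Every base-26 digit is already a letter, so an N-letter word needs only N symbols instead of the 2N decimal digits that ASCII demands, which is why words turn up so much sooner.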
Keith takes it up a notch by then considering Pi in TWO DIMENSIONS (cue epic music) – specifically, write out the digits of Pi in a grid, then look for words in the old word-search style, up, down, left, right, and diagonally. This gives you a little more chance to find something interesting (more “degrees of freedom” for interesting coincidences to happen), and sure enough, he’s able to find lots of examples of words, including spooky examples such as “ALPHA”, “OMEGA”, and “GOD” appearing all in the same little grid about 150k positions in, and of course “DEMON” and “SATAN” in another grid at ~250k.
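The word-search part is easy to sketch in Python. The grid below is a made-up toy, not an actual slice of Pi, and find_in_grid is a hypothetical helper of mine, not Keith’s code. It checks all eight directions from every starting cell.

```python
from typing import List, Tuple

# The eight word-search directions as (row step, column step).
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def find_in_grid(grid: List[str], word: str) -> List[Tuple[int, int, int, int]]:
    """Return (row, col, row_step, col_step) for every place `word` occurs."""
    rows, cols = len(grid), len(grid[0])
    hits = []
    for r in range(rows):
        for c in range(cols):
            for dr, dc in DIRECTIONS:
                end_r = r + dr * (len(word) - 1)
                end_c = c + dc * (len(word) - 1)
                if not (0 <= end_r < rows and 0 <= end_c < cols):
                    continue  # the word would run off the edge of the grid
                if all(grid[r + dr * i][c + dc * i] == ch
                       for i, ch in enumerate(word)):
                    hits.append((r, c, dr, dc))
    return hits


toy_grid = ["GXX",
            "XOX",
            "XXD"]
print(find_in_grid(toy_grid, "GOD"))  # [(0, 0, 1, 1)]
print(find_in_grid(toy_grid, "DOG"))  # [(2, 2, -1, -1)]
```

With eight directions per starting cell instead of one, there are roughly eight times as many chances for a coincidence, which is the “degrees of freedom” effect in action.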
Finally, Keith points out that if you do another simple substitution-type conversion of letters to numbers, by using the old substitution cipher A = 1, B = 2, the first few digits of Pi are:
3 . 14 15 9 26 5 …
Which decodes to “C.NOIZE” – get it? You’re “Seeing noise” when you look at digits in Pi! Hmmm, that’s kind of mean of Pi to rub it in our faces like that. What a fuckface.
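If you’d like to check the decoding yourself, here is a tiny Python sketch of the A = 1, B = 2 cipher. The function name is made up, and note that the grouping of the digits (reading 14 rather than 1 then 4, for instance) is chosen by hand, not by the code.

```python
from typing import Iterable


def decode_a1(numbers: Iterable[int]) -> str:
    """Decode numbers 1..26 into letters using A = 1, B = 2, ..., Z = 26."""
    letters = []
    for n in numbers:
        if not 1 <= n <= 26:
            raise ValueError(f"{n} is outside the range 1..26")
        letters.append(chr(ord("A") + n - 1))
    return "".join(letters)


# 3 . 14 15 9 26 5
print(decode_a1([3]) + "." + decode_a1([14, 15, 9, 26, 5]))  # C.NOIZE
```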
Check out Keith’s page for lots more interesting discoveries in the hunt for English words in Pi, using his base-26 scheme.
If this is all too mathy for you, you might want to explore the humanities side of Pi, and look into Writing in Pilish. The idea is to write a story about any topic you want, with one important constraint – each word in your story has to be of length exactly equal to the corresponding digit in Pi. So your first five words need to have lengths 3, 1, 4, 1, and 5, and you try to keep going as long as you can, sticking to the pattern. How far do you think you could keep it up? Maybe 6 or 7 words? Maybe eke out a sentence? Well, sit down, friend, because Mike Keith has written a 10,000-word novella called “Not A Wake”, written entirely in Pilish. Pick up his book on Amazon, for the low low list price of Pi^2 + Pi/2 !
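And if you want to try your own hand at Pilish, here is a little Python sketch that counts how many words of a text follow the digits of Pi before the pattern breaks. It only knows the first 15 digits and sidesteps the question of how to write a 0, so take it as a toy checker with hypothetical function names, not a full Pilish validator. The test sentence is the well-known old mnemonic for Pi.

```python
import re
from typing import List

# First 15 digits of Pi, including the leading 3.
PI_DIGITS = "314159265358979"


def word_lengths(text: str) -> List[int]:
    """Lengths of the alphabetic words in `text`, ignoring punctuation."""
    return [len(word) for word in re.findall(r"[A-Za-z]+", text)]


def pilish_prefix_length(text: str, digits: str = PI_DIGITS) -> int:
    """Count the leading words whose lengths match the digits of Pi."""
    count = 0
    for length, digit in zip(word_lengths(text), digits):
        if length != int(digit):
            break
        count += 1
    return count


mnemonic = ("How I need a drink, alcoholic of course, "
            "after the heavy lectures involving quantum mechanics")
print(pilish_prefix_length(mnemonic))  # 15
```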