The website doesn't look like it's been updated in a dozen or so years, but it's still up. It's a good style marker for how the web used to look. I guess they still make enough in ad sales and Pi t-shirts to stay running.
This was probably very slow back in the day. The irony is that network speed is being used to cram more stuff across the web, rather than speed up existing material. The price of change.
Well I did always find dial up far more bearable when I was drugged out of my head... though I'm kinda upset you don't remember the v92 connect noise... this is HN not reddit ;)
An idea I've had, the answer is most probably no, but I found it interesting to think about, others may find it trivial:
Is there a number that (1) contains most subsequences and where (2) finding a certain subsequence is computationally efficient? Also, (3) reading out a subsequence given a start and end position should be efficient. If so, we have an efficient means of transferring data: just send the index positions. And we can store any data with just those two index positions. Then again, those index numbers are likely to be very, very large. Perhaps we can in turn transform the index positions into smaller index positions by finding their positions in the sequence. Then we need a third number to signify the number of recursive uses of the storage.
I don't know about (1), but (2) could be achieved by scanning through the first n digits and building a lookup tree. If we restrict ourselves to Pi then (3) could be achieved by either just keeping all the digits in memory or using the Bailey–Borwein–Plouffe formula [1], which allows the nth hexadecimal digit to be determined without needing to calculate the preceding digits.
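A rough Python sketch of both pieces, in case it helps. The names `build_index` and `pi_hex_digit` are made up for illustration, a plain dictionary stands in for the lookup tree, and the digit extraction uses ordinary floats, so it is only trustworthy for modest positions. Note that the formula works in base 16, not base 10.

```python
# First 50 decimals of Pi, hard-coded so the example is self-contained.
PI_DECIMALS = "14159265358979323846264338327950288419716939937510"

def build_index(digits, k):
    """Map every k-digit window to the position of its first occurrence."""
    index = {}
    for pos in range(len(digits) - k + 1):
        index.setdefault(digits[pos:pos + k], pos)
    return index

def _bbp_series(j, n):
    """Fractional part of sum over k >= 0 of 16**(n - k) / (8k + j)."""
    s = 0.0
    for k in range(n + 1):
        m = 8 * k + j
        # Modular exponentiation keeps the numerator small.
        s = (s + pow(16, n - k, m) / m) % 1.0
    k = n + 1
    while True:
        term = 16.0 ** (n - k) / (8 * k + j)
        if term < 1e-17:
            break
        s += term
        k += 1
    return s % 1.0

def pi_hex_digit(n):
    """Hex digit of Pi at 0-based position n after the point."""
    x = (4 * _bbp_series(1, n) - 2 * _bbp_series(4, n)
         - _bbp_series(5, n) - _bbp_series(6, n)) % 1.0
    return "0123456789abcdef"[int(x * 16)]

if __name__ == "__main__":
    print(build_index(PI_DECIMALS, 2)["59"])
    print("".join(pi_hex_digit(i) for i in range(8)))
```

The index gives constant-time lookups for (2) but costs memory proportional to the number of digits scanned, which is the trade-off to keep in mind.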
You're essentially describing a compression mechanism. As you might expect, this isn't a new idea [2], and it generally fails because (as you said) the index quickly gets large as the block of data grows, basically destroying any advantage for non-trivial block sizes. There is some discussion of the space trade-offs in [3], and I wouldn't be surprised if the speed of this approach is poor when compared to conventional techniques.
Another problem is that it is still not known whether numbers like Pi actually contain every possible finite sequence of digits [3], so not all input blocks can necessarily be compressed. In practice, this shouldn't be a problem for small blocks though.
(2) I'm too lazy to do the calculation right now, but it should not be too difficult to calculate the sum of the lengths of all numbers smaller than n, and then you've got the index of n.
(3) In any feasible encoding, the expected size of the start index of n will be much larger than the size of n. And every recursive step will just blow up the sizes even more.
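If you slightly modify it to 0. 0 1 2 3 4 5 6 7 8 9 00 01 02 03 04 05... then the number n padded with zeros to k digits can be found starting at position 10 * ((9k - 10) * 10^(k - 1) + 1) / 81 + k * n, counting from 0. The first term is the total length of all the shorter blocks, i.e. the sum of j * 10^j for j < k.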
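A quick way to sanity-check the position arithmetic for this modified sequence is to build a prefix of it by brute force and compare. This is only a sketch: the function names are made up, positions are 0-based, and the block offset is computed by direct summation, with the closed form of the same sum alongside it.

```python
from itertools import product

def block_offset(k):
    """Total length of all blocks shorter than k digits: sum of j * 10**j."""
    return sum(j * 10 ** j for j in range(1, k))

def block_offset_closed(k):
    """Closed form of the same sum."""
    return 10 * ((9 * k - 10) * 10 ** (k - 1) + 1) // 81

def position(n, k):
    """0-based start of n, zero-padded to k digits, in the sequence."""
    return block_offset(k) + k * n

def build_prefix(max_k):
    """Concatenate all 1-digit strings, then all 2-digit strings, and so on."""
    return "".join("".join(p)
                   for k in range(1, max_k + 1)
                   for p in product("0123456789", repeat=k))

if __name__ == "__main__":
    prefix = build_prefix(3)
    for n, k in [(7, 1), (42, 2), (999, 3)]:
        start = position(n, k)
        print(n, k, start, prefix[start:start + k])
```

For k = 3 and n = 999 the start index is 3207, already longer than the three digits it stands for, which is the size blow-up described in (3).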
I proposed a similar idea to a mathematician friend but I couldn't quite understand his dismissal as I am not a mathematician. My idea was to run algorithms trying to find the most convenient equations that generate number sequences that correspond to raw video piece by piece. It would take massive CPU power but if it is a popular video it might be worth it for the bandwidth savings.
Maybe of interest for some of you: http://baby.pirthday.com/
8 days until the perfect creation date for your ultimate pi-baby (born 3/14/15).
Other essential services are: your age in pi (http://pirthday.com/) or seeing who has his pirthday today (http://happy.pirthday.com/).
Disclaimer: Logo has been created by a friend (@turboele), other stuff by me.
Interesting idea. Actually, just thinking, it could be used for a pseudo-crypto app. Just send a series of origin numbers [n1,n2,n3...] and this translates to a series of positions [pos1,pos2,pos3...]. Instead of Pi you can also use any other transcendental/irrational number [0].
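A toy version of that mapping, purely as a sketch: `encode` and `decode` are invented names, the "key" is just the first 50 decimals of Pi, and fixed-width chunks are assumed so the receiver knows how many digits to read at each position. This is a lookup-table substitution, not real cryptography.

```python
# First 50 decimals of Pi; both sides are assumed to share this string.
PI_DECIMALS = "14159265358979323846264338327950288419716939937510"

def encode(message, width=1):
    """Replace each width-digit chunk by the position of its first occurrence."""
    if len(message) % width:
        raise ValueError("message length must be a multiple of width")
    positions = []
    for i in range(0, len(message), width):
        chunk = message[i:i + width]
        pos = PI_DECIMALS.find(chunk)
        if pos < 0:
            raise ValueError("chunk %r not found in the known digits" % chunk)
        positions.append(pos)
    return positions

def decode(positions, width=1):
    """Read width digits back at each position."""
    return "".join(PI_DECIMALS[p:p + width] for p in positions)

if __name__ == "__main__":
    secret = encode("2015")
    print(secret, decode(secret))
```

Anyone who guesses which number is being indexed can decode the positions, so any secrecy would have to come from the choice of number and offset.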
So if I take the latest Lady Gaga mp3 and look at the stream of bytes, and if I can find that identical stream of digits in the digits of Pi, does Lady Gaga (or her record company) "own" that portion of Pi? If yes, then WTF!?!? If no, why not, and what does this mean for the idea of copyright, anyway?
Copyright restricts copying, it doesn't restrict similar, or even identical, expressions that are independently generated (a jury may see similarity as evidence of copying, so merely the theoretical possibility that a similar work was independently arrived at doesn't get you a free pass from liability, but similarity itself isn't restricted by copyright.)
So, no, the fact that a sequence of bytes occurs in a particular digital recording doesn't mean that the copyright owner of that recording "owns" that series of bytes. It means they own a specified set of rights in that particular recording.
Have they found this bit from Contact yet?
"When Ellie looks at what the computer has found, she sees a circle rasterized from 0s and 1s that appear after 10^20 places in the base 11 representation of π. This gives her a way to convince the world of something greater – that intelligence is built into the universe itself."
Well mine isn't within the first 50 million. In any case, what's the relevance of this? Everything (with the correct encoding scheme) can be expressed somewhere in the decimal expansion of Pi.
> Be warned that 50 million digits of pi takes up 50 megabytes. This can take up to 4 hours to download with a 28.8k modem!
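The quoted figure checks out as back-of-the-envelope arithmetic. The sketch below assumes one byte per digit, 1 MB = 10^6 bytes, a line that actually sustains 28,800 bits per second, and no protocol overhead or modem compression; `download_hours` is just an illustrative name.

```python
def download_hours(size_bytes, line_bps):
    """Hours needed to move size_bytes over a line of line_bps bits per second."""
    seconds = size_bytes * 8 / line_bps
    return seconds / 3600

if __name__ == "__main__":
    # 50 million digits at one byte each over a 28.8k modem.
    print(round(download_hours(50_000_000, 28_800), 2))
```

That comes to a little under 3.9 hours, so "up to 4 hours" is about right under those assumptions.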