Slashdot Mirror


1.7 Billion Digits Of Pi On CD

H0ek writes "Not that there is any use for this whatsoever, but there is a torrent available for 1.7 billion digits of pi on a CD. The data is everything after the '3.' on one line, bzipped. There are a couple of the Cygwin tools on the disk as well as source for a small search tool (because grep just didn't cut it this time). Inside the ISO there are links to the source of the data, in case you want the rest of the 4.2 billion digits available. Wear your geek badge with pride! Be the first kid on your block to have the entire set!"

8 of 202 comments

  1. 700 Mb? by Anonymous Coward · · Score: 1, Informative
    This will save you a 700 Mb download:
    perl -e 'for($|++,$_++,$a++;;$,+=$a/$_,$_++,$a*=-1,$_++){printf"%.16f\n",$,*4}'
    Might take a while to get your 1.7 billion digits though...
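
The one-liner above appears to sum the Leibniz series, pi = 4 * (1 - 1/3 + 1/5 - ...), printing each partial sum. A rough Python sketch of the same idea is below; the function name `leibniz_pi` and the chosen term counts are illustrative and not part of the original post.

```python
def leibniz_pi(terms):
    """Approximate pi using the first `terms` terms of the Leibniz series."""
    total = 0.0
    sign = 1.0
    for k in range(terms):
        # k-th term is (+/-) 1 / (2k + 1), with alternating sign
        total += sign / (2 * k + 1)
        sign = -sign
    return 4.0 * total


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, f"{leibniz_pi(n):.16f}")
```

The error after n terms shrinks only in proportion to 1/n, which is why the poster warns that 1.7 billion digits "might take a while".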
  2. Re:Who uses PI? by Daniel+Dvorkin · · Score: 3, Informative

    Our microscopic image processing software uses pi to considerable precision. This is, I admit, a pretty specialized application.

    --
    The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
  3. Hidden messages by jsveiga · · Score: 2, Informative

    If all you want to do is search for mystic stuff inside the number, you don't need the CD with its measly 1.7bi digits.

    Save your bandwidth and just go here to search within 4bi digits.

  4. Re:Woo Hoo! by stienman · · Score: 3, Informative

    You could do that before:
    PI Phone Number Search Engine

    -Adam

  5. Re:Compressing Pi by MankyD · · Score: 2, Informative

    If each digit were stored as ASCII, you could use a Huffman code (one of the building blocks of zip compression) to shrink it down. This gives a result much like the one you suggested. For those not familiar with Huffman codes, here is a quick and dirty summary:

    Say each char takes 8 bits but, in this case, you're only using 10 distinct chars, so you don't need 8 bits to represent each one. A Huffman coder does a quick count of character frequencies and creates a tree of shorter bit representations for each character. Characters that have a higher frequency (say the digit 0 occurred a lot) are given a shorter code. The tree keeps the codes prefix-free, so that no ambiguity occurs. For instance:

    0 - 0
    1 - 100
    2 - 101
    3 - 11000
    4 - 11001
    5 - 11010
    6 - 11011
    7 - 11100
    8 - 11101
    9 - 11110

    If I give you the bits "011000101" you can translate that to "0 - 3 - 2". Google it if you want to learn more.
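
As a small illustration of why a prefix-free table decodes without ambiguity, here is a Python sketch that uses the ten codewords listed above. The names `CODE`, `encode` and `decode` are made up for this example; the table is the poster's illustrative one, not the output of an actual frequency count.

```python
# Codeword table copied from the comment above (digit -> bit string).
CODE = {
    "0": "0",
    "1": "100",
    "2": "101",
    "3": "11000",
    "4": "11001",
    "5": "11010",
    "6": "11011",
    "7": "11100",
    "8": "11101",
    "9": "11110",
}


def encode(digits, code=CODE):
    """Concatenate the codeword for each digit."""
    return "".join(code[d] for d in digits)


def decode(bits, code=CODE):
    """Read bits left to right, emitting a digit whenever a codeword completes."""
    reverse = {word: digit for digit, word in code.items()}
    out = []
    current = ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete codeword")
    return "".join(out)


if __name__ == "__main__":
    print(decode("011000101"))  # the example from the comment
```

Because no codeword is a prefix of another, the decoder never has to look ahead or backtrack.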

    --
    -dave
    http://millionnumbers.com/ - own the number of your dreams
  6. Re:Algorithm by JaxWeb · · Score: 2, Informative
    --
    - Jax
  7. Re:Shouldn't compress well by whydna · · Score: 2, Informative

    Assuming that the ASCII digits of pi are evenly distributed between '0' and '9', you need log2(10) = 3.322 bits per digit. At 3.322 bits per digit and 700 MiB (5,872,025,600 bits at 8 bits per byte) we should be able to fit about 1,767,655,841 digits. Assuming that they're publishing exactly 1.7 billion digits, they're within 3.8% of ideal compression, assuming an exactly even distribution of digits.
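
The capacity estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic; `digits_that_fit` is a hypothetical helper name, and the 700 MiB and 1.7 billion figures are taken from the comment.

```python
import math


def digits_that_fit(capacity_bytes, symbols=10):
    """Digits storable in `capacity_bytes` at log2(symbols) bits per digit."""
    bits = capacity_bytes * 8
    return int(bits / math.log2(symbols))


CD_BYTES = 700 * 1024 * 1024          # 700 MiB, as assumed in the comment
IDEAL_DIGITS = digits_that_fit(CD_BYTES)
SHORTFALL = 1 - 1_700_000_000 / IDEAL_DIGITS

if __name__ == "__main__":
    print("bits on disc:     ", CD_BYTES * 8)
    print("ideal digit count:", IDEAL_DIGITS)
    print(f"shortfall vs ideal: {SHORTFALL:.1%}")
```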

    Where it gets interesting is if there's NOT an even distribution of digits (which I don't believe is actually the case for digits of pi, but humor me). For example, assume that one digit, say 0, occurs 100 times more frequently than each of the other digits, with the remaining 9 digits occurring evenly. If you use something like Huffman encoding and represent '0' with 1 bit, and the remaining digits with an average of 4.22 bits per digit (based on my back-of-the-envelope Huffman encoding), you'd wind up averaging 1.266 bits per digit (100/109 * 1 bit + 9/109 * 4.22 bits). Clearly, as this digit's share of the stream approaches 100%, the average bits per digit approaches 1. With the scheme described above, for any distribution with 0 occurring at least about 27.9% of the time (and an even distribution of the remaining digits), this basic encoding will perform better than the ideal encoding described in the first paragraph. For the curious, my Huffman encoded values in this scenario were (0, 1000, 1001, 1010, 1011, 1100, 1101, 11100, 11101, 1111) for 0-9 respectively.
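
The weighted average in this scenario can be reproduced from the listed codewords. The sketch below assumes weights of 100 for '0' and 1 for every other digit, as in the paragraph above; `average_bits` and `break_even_share` are hypothetical helper names.

```python
import math
from fractions import Fraction

# Codewords for digits 0-9 in the skewed scenario, copied from the comment.
SKEWED_CODE = ["0", "1000", "1001", "1010", "1011",
               "1100", "1101", "11100", "11101", "1111"]


def average_bits(code, weights):
    """Expected codeword length when digit i occurs with relative weight weights[i]."""
    total = sum(weights)
    return sum(Fraction(w, total) * len(c) for c, w in zip(code, weights))


def break_even_share(code):
    """Share p of digit 0 at which this code matches log2(10) bits per digit.

    Solves p * len(code[0]) + (1 - p) * rest = log2(len(code)), where `rest`
    is the mean length of the other codewords (assumed equally likely).
    """
    rest = sum(len(c) for c in code[1:]) / (len(code) - 1)
    ideal = math.log2(len(code))
    return (rest - ideal) / (rest - len(code[0]))


if __name__ == "__main__":
    avg = average_bits(SKEWED_CODE, [100] + [1] * 9)
    print(f"average bits per digit: {float(avg):.3f}")
    print(f"break-even share of '0': {break_even_share(SKEWED_CODE):.2%}")
```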

    Obviously, the digits of pi are NOT distributed as disproportionately as I mentioned above, so a simple Huffman encoding scheme is unlikely to beat the ideal encoding. In fact, assuming an even distribution of digits, a back-of-the-envelope Huffman encoding would yield encoded values of (1100, 1101, 1110, 1111, 000, 001, 010, 011, 100, 101) for 0-9. With this, you'd average 3.4 bits/digit, which is only about 2.3% away from ideal encoding (this is also roughly in line with the estimate from above, especially since bzip2 uses Huffman encoding).
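
For readers who want to check the 3.4 bits/digit figure, here is a minimal Huffman construction in Python using the standard `heapq` module. It only tracks codeword lengths, not the bit patterns themselves, and `huffman_lengths` is a name chosen for this sketch.

```python
import heapq
from itertools import count


def huffman_lengths(weights):
    """Return the Huffman codeword length for each symbol, given its weight."""
    tie = count()  # tie-breaker so the heap never has to compare the lists
    heap = [(w, next(tie), [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    lengths = [0] * len(weights)
    while len(heap) > 1:
        w1, _, group1 = heapq.heappop(heap)
        w2, _, group2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for i in group1 + group2:
            lengths[i] += 1
        heapq.heappush(heap, (w1 + w2, next(tie), group1 + group2))
    return lengths


if __name__ == "__main__":
    uniform = huffman_lengths([1] * 10)
    print(sorted(uniform), sum(uniform) / 10)
```

With ten equally likely digits, six codewords end up 3 bits long and four end up 4 bits long, matching the lengths of the codes listed in the paragraph above.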

    Of course, huffman encoding is not the only option. You could consider checking for multiple-digit combinations and determining the frequency of each combo. Or you could look at actual bit patterns, which is similar to run-length-encoding.

    All of this is discounting the fact that the digits of pi are computable, and thus encoding them can be completely avoided if you're willing to spend considerable resources calculating the values yourself. In this case you need 0 bits per digit (discounting the size of your "decompression program"). This is the most computationally intensive option, but yields the best compression ratio.
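
To make the "decompression program" idea concrete, here is a short Python sketch of one way to generate decimal digits of pi on demand. It uses Gibbons's unbounded spigot algorithm, which is a choice made for this example and not something the comment specifies; it relies on Python's arbitrary-precision integers and would be far too slow for billions of digits.

```python
from itertools import islice


def pi_digits():
    """Yield the decimal digits of pi (3, 1, 4, 1, 5, ...) one at a time.

    Streaming spigot due to Jeremy Gibbons; all state is exact integers.
    """
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is settled: emit it and rescale the state.
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            # Not enough information yet: absorb one more term of the series.
            q, r, t, k, n, l = (
                q * k,
                (2 * q + r) * l,
                t * l,
                k + 1,
                (q * (7 * k + 2) + r * l) // (t * l),
                l + 2,
            )


if __name__ == "__main__":
    print("".join(str(d) for d in islice(pi_digits(), 20)))
```

The whole generator is a few hundred bytes of source, which is the sense in which the digits themselves cost "0 bits" to store.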

  8. Applicable poem by H0ek · · Score: 2, Informative

    The nifty thing about sharing links like this is you get fun mail, like this poem from a friend:

    Now I will a rhyme construct
    By chosen words the youth instruct
    Cunningly devised endeavour
    Con it and remember ever
    Widths in circle, here you see
    Sketched out in strange obscurity

    I might just have to memorize it. ;-)

    --
    H0ek
    Think you're smart? Prove you've got brains!