Nerd Vacation to the Earth Simulator
eecue writes "Earlier this year I went on vacation to Japan. At the end of my trip I was lucky enough to receive a tour of the Earth Simulator, which is the world's fastest supercomputer. I took pictures and wrote about it."
I wonder how far in advance things like, say, the climate can be predicted, even by such a powerful computer. It's almost impossible to predict the weather for even a small area (I live in the Netherlands) for more than the coming few days to a week, because it's so sensitive to small errors. (That doesn't mean I'm not impressed by the thing, of course.)
The article mentions that "They were afraid to mention on their website that they offered tours as there were only 3 English-speaking employees of the lab". Now this hits Slashdot. Guess they may as well mention it on their site now, since it's already known in the world of the rabid technophile.
404 Not Found: No such file or resource as '.sig'
This is the fundamental obstacle to simulation of natural phenomena. However, while local parameters remain hard or impossible to predict, global parameters are easier to forecast, and computing power helps. This is where supercomputers come in: for example, they help us study the effect of global warming far out into the future.
Hmmm. "Final" digits of pi, eh? Hey, the comment was a reasonably funny idea, and the flamebait mod is a bit harsh, but have a think about it for a second.
... although this may not turn out to be the case. (I don't know what algorithms people use to calculate pi to these levels of accuracy.)
I read in Q Magazine (music mag from the UK) a column by Blur's bassist in which he wrote that he was doing some thinking about the value of pi, which as we know is an infinite decimal. No, not in the style of "one ninth" which is, also, an infinite decimal (0.1111... recurring), but rather, an infinite, random sequence of digits, that occurs in a precise order.
Now think about that for a second. He pointed out that if you were to take the number 6 and repeat it a million times, and then string together the phone numbers of everyone in your nation's capital city, then that sequence of digits WILL occur somewhere within pi. In fact, it will occur an infinite number of times, but let's not labour the point.
Taking his original concept, it occurred to me that you could use a system whereby a sender and receiver both have a whizz-bang algorithm for calculating pi. Now, no doubt the maths graduates on Slashdot will chime in with how this can be done, but let's imagine that both you and I have some method of reliably generating a sequence from pi (e.g. start at the millionth digit within pi's sequence, and then crank out the next 100,000 values).
So imagine, say, if you were to take some digital media, e.g. the entire source code for Windows, and zip it up into a single archive. The sequence of values that represent the archive would also occur somewhere within the sequence of pi. Now assuming (ah! a big ask) I can FIND that sequence somewhere in there (may take a while...) I can effectively represent ANY binary stream by simply knowing where to start within pi's sequence, and for how many digits (known beforehand by having access to the original file). This way, the binary stream can be "stored" simply by reference to its starting digit, and its length.
This is a pretty mad concept when you think about it. Data transmissions for previously analysed, static data would become immediate (only two numbers to send) although the burden of using this technique naturally falls on the originator host to find the sequence within pi, and for the recipient to have a method of regenerating those digits. Hopefully, it would be easier to regenerate the sequence for the recipient. So a central computer with access to the media and a staggering quantity of poke (hey! The Earth Simulator!) can scan through the sequence of pi to find the starting point, but once that job is done, the recipients may not have to trawl through all of pi in order to regenerate the sequence (assuming you have an algorithm that can start at an arbitrary location within the digit sequence).
All digital media could be stored by those two values, irrespective of size. No DRM concerns for accessing digital media (hey, it's just two parameters to the pi algorithm, and I'd be fairly confident on the 'prior art' argument against patents prohibiting this if they tried to patent any restrictions). No media degradation in storage (e.g. CD-Rs not being readable after a few years). Who would need terabytes of storage, when a terabyte could be represented by two numbers? Unless, of course, the starting point itself is so far into the sequence of pi that it takes MORE space to store the starting point than the size of the binary stream itself.
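For anyone who wants to poke at the idea, here is a minimal sketch of the "start digit plus length" scheme in Python. Everything in it is my own assumption rather than anything from the column: the names pi_decimals, encode and decode are made up, the payload is a string of decimal digits rather than a real binary stream, the digits come from Machin's formula with plain integer arithmetic, and the search only covers the first 1000 decimals.

```python
def arctan_inv(x, unity):
    """Approximate arctan(1/x) * unity with the Taylor series, integers only."""
    x2 = x * x
    term = unity // x
    total = term
    n, sign = 3, -1
    while term:
        term //= x2                    # next odd power of 1/x
        total += sign * (term // n)
        n += 2
        sign = -sign
    return total


def pi_decimals(count, guard=10):
    """Return the first `count` decimal digits of pi (after the "3.") as a string.

    Uses Machin's formula pi = 16*arctan(1/5) - 4*arctan(1/239); the guard
    digits absorb the rounding error of the integer divisions.
    """
    unity = 10 ** (count + guard)
    pi_scaled = 4 * (4 * arctan_inv(5, unity) - arctan_inv(239, unity))
    return str(pi_scaled // 10 ** guard)[1:]   # drop the leading "3"


def encode(payload, decimals):
    """Return (1-based start position, length), or None if not found."""
    idx = decimals.find(payload)
    return None if idx < 0 else (idx + 1, len(payload))


def decode(start, length, decimals):
    """Regenerate the payload from its two numbers."""
    return decimals[start - 1:start - 1 + length]


if __name__ == "__main__":
    decimals = pi_decimals(1000)
    for payload in ("26535", "999999", "123456"):
        ref = encode(payload, decimals)
        if ref is None:
            print(payload, "not found in the first", len(decimals), "decimals")
        else:
            print(payload, "->", ref, "->", decode(*ref, decimals))
```

Even at this toy scale the catch is visible: a payload either does not show up in the searched range at all, or the start position is a number of roughly the same size as the payload it replaces.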
Anyway this comment is always going to languish in the -1 off-topic silt at the bottom of the Slashdot pond, but this occurred to me not so long ago and so I fancied sharing.
We apologise for this break in transmission, normal service will now be resumed.
Aegilops
If we begin with the assumption that the digits of Pi are completely random, then the following analysis is much simpler and much more correct (or it had better be now that I said it was) than the one presented:
The probability of finding a particular single digit is 0.1, or of finding a particular sequence of two digits is 0.01 or 0.1^2. The probability of finding a particular sequence of n digits is 0.1^n.
Therefore, the expectation is that on average you will find a particular sequence of n digits once every 1/(0.1^n) digits, or 10^n digits.
The question then arises as to the efficiency of indexing this many digits to locate the sequence desired. The amount of storage required for the index is log base 10 (log_10) of the number of digits you need to search. If we assume the desired sequence will always occur within the average number of digits or fewer, then the amount of storage required for the index is:
log_10 (10^n) = n (log_10 (10)) = n
Unfortunately, the assumption I slipped in above, that the desired sequence will always occur within the average number of digits or fewer, does not always hold: if each position is treated as an independent trial, it holds only about 63% of the time (1 - 1/e). Therefore, in order to have a good chance of finding the sequence, we need to include a longer search space, and thus the index needs to be just slightly more digits in length than the sequence being stored.
In essence, a very effective data expansion algorithm.
(Proofreading is left as an exercise for the reader.)
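A rough way to check the analysis above is to simulate it. This is only a sketch under the same assumption of uniformly random digits (it does not touch pi at all), and hit_rate and index_digits are names invented here. It estimates how often a random n-digit target turns up within the first 10^n starting positions of a random digit stream, and how many digits the index needs to address those positions.

```python
import random

DIGITS = "0123456789"


def hit_rate(n, trials=2000, seed=1):
    """Fraction of trials where a random n-digit target occurs within
    the first 10**n starting positions of a random digit stream."""
    rng = random.Random(seed)          # seeded so runs are repeatable
    window = 10 ** n + n - 1           # covers exactly 10**n start positions
    hits = 0
    for _ in range(trials):
        target = "".join(rng.choices(DIGITS, k=n))
        stream = "".join(rng.choices(DIGITS, k=window))
        hits += target in stream
    return hits / trials


def index_digits(n):
    """Digits needed to write any 0-based offset below 10**n."""
    return len(str(10 ** n - 1))


if __name__ == "__main__":
    n = 3
    # Independent trials would give 1 - (1 - 10**-n)**(10**n), close to 1 - 1/e.
    print("found within 10^n positions:", hit_rate(n))
    print("index digits:", index_digits(n), "for a payload of", n, "digits")
```

The index comes out as long as the payload even when the target is found, and catching the remaining targets means searching further and spending more digits on the offset.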
No. This is a common mistake. The supercomputer in HHGTTG calculates 42. The earth's job was to find the question. Something along the line of: :-)
"What do you get if you multiply 9 by 6 and"
(Then they ran out of Scrabble pieces.)
-- Make software not war