The Ultimate All-In-One Storage Solution
karnifex writes "Filled up your LaCie Bigger Disk already, and looking for a little more storage space? Good news! The Petabox is ready! 'The petabox by the Internet Archive is a machine designed to safely store and process one petabyte of information (a petabyte is a million gigabytes).' And luckily, as the Internet Archive notes, it's shipping-container friendly (20' x 8' x 8'). So save on delivery costs and order two!"
Will we find one of these things on eBay in 10 years, selling for $10, and feel all nostalgic about the days when that amount of storage was the size of a room?
If you have to ask, you can't afford it. Just remember that. It might come in handy again someday. :)
From the site:
PILOT STATUS 5/2004
* The first 100TB Rack is up and running!
* The second 100TB Rack will be up by the end of May
* Thermal Targets have been met
* Systems Booted from USB Dongle
* Reiser FS running
* PC-based Router running
Maybe I'm missing something, but this looks to me like they don't really have a petabyte of storage working yet. They plan to build up to a petabyte, with only 100 TB up and running now. Not that 100 TB is anything to brush off.
I know the pull is to get these things as big as you can, but I would love to see hard drives that will work forever. Now, I know everything breaks, but I mean, in 400 years how is anyone going to know what we were like if all the data on us slowly goes away because the hard drives or the CDs don't really last very long?
Just because you're a schizophrenic doesn't mean people aren't really out to get you
Assuming dual-layer DVDs, that is 10 GB per disk (feeling generous).
100 disks -> 1 TB
15000 disks -> 150 TB.
Netflix has a "mere" collection of 15000 disks. Your petabyte disk is only about 15% full.
You upload all music CDs: 1 GB per disk (feeling generous).
How many CDs can be in print? Maybe 500,000?
That is only 500 TB. Now your disk is about 2/3 full.
Let's upload all printed material. May or may not fit in the rest.
Then again, if you want to archive the internet: ~6G pages, 10 kB each, 60 TB per crawl. Store the last 16 versions -> about 1 PB.
Code poet, espresso fiend, starter upper.
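For anyone who wants to check the back-of-envelope numbers above, here is a minimal sketch in Python. It assumes decimal units (1 TB = 1000 GB) and takes the item counts and per-item sizes straight from the comment; the function and variable names are made up for illustration.

```python
# Back-of-envelope check of the storage estimates, using decimal units.
GB_PER_TB = 1_000
KB_PER_TB = 1_000_000_000
PETABOX_TB = 1_000  # one petabyte, expressed in TB


def total_tb(items: int, size_per_item: int, units_per_tb: int) -> float:
    """Total size in TB for `items` objects of `size_per_item` units each."""
    return items * size_per_item / units_per_tb


dvd_tb = total_tb(15_000, 10, GB_PER_TB)           # Netflix-sized DVD collection
cd_tb = total_tb(500_000, 1, GB_PER_TB)            # guessed count of CDs in print
crawl_tb = total_tb(6_000_000_000, 10, KB_PER_TB)  # one crawl of ~6G pages
web_tb = crawl_tb * 16                             # last 16 crawls

for label, tb in [
    ("DVDs", dvd_tb),
    ("DVDs + CDs", dvd_tb + cd_tb),
    ("One web crawl", crawl_tb),
    ("16 web crawls", web_tb),
]:
    print(f"{label}: {tb:.0f} TB ({tb / PETABOX_TB:.0%} of a petabyte)")
```

The DVDs and CDs together come to 650 TB, and 16 crawls come to 960 TB, so the web archive alone nearly fills the box.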
You're complaining that these hard drives won't run forever, and you're right. Neither will CDs. However, I would also like to point out that the vast majority of ancient Egyptian papyrus isn't around today. Also, don't start going off on using clay or stone tablets, because they break (even the Rosetta Stone is broken).
Honestly, computers are still far superior to what we were using before. It's not like we've got Homer's original version of the Iliad sitting in a museum somewhere; we just have many duplicated copies that have been reproduced over the years. You're right that hard drives fail and CDs break, but we can keep copying onto new media. Besides, when a monk drops an iota when transcribing the Bible, Jesus goes from being God to godlike. When a computer adds an iota, the check bit fails and the data is resent.
Somebody is also going to point out that, as systems change, data can become unreadable. Heck, I had a professor who couldn't update his lab instructions because the software that read the lab printouts wouldn't run on new machines and the file format wasn't understood by any other software. So, want to stop our data from becoming unreadable? Well, let's just do what the Etruscans did! Of course, we don't have a clue what they did because nobody can read Etruscan. For a more familiar example, think of hieroglyphics before the Rosetta Stone. It's pretty common for data to become lost and unreadable. That brings us to the solution: along with the data, include the source code for the software that can read it. If you really want to be anal, you could even include the source to an emulator for the machine it was designed to run on.
Still, you might point out, 400 years from now we'll still lose 99% of that due to failures of whatever nature. Once again, you would be right. However, do you honestly believe that we have 1% of all the data that was collected in 1604? Hell, most people couldn't even write, so we don't know ANYTHING about their lives. I'm sorry that we can't digitally preserve our wondrous society for all of eternity, but it's completely blind to believe that this makes us in ANY way different from any other culture. Read Percy Shelley's Ozymandias before complaining about how people in the future won't know what our lives were like.
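To make the check-bit point concrete, here is a minimal sketch of a single even-parity bit catching a one-bit corruption. It is only an illustration: the helper names and the sample message are made up, and real storage and network systems generally use stronger codes such as CRCs, since a single parity bit misses any even number of flipped bits.

```python
def parity_bit(data: bytes) -> int:
    """Return 1 if `data` contains an odd number of 1 bits, else 0."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones % 2


def flip_bit(data: bytes, index: int) -> bytes:
    """Return a copy of `data` with bit `index` flipped (bit 0 = LSB of byte 0)."""
    corrupted = bytearray(data)
    corrupted[index // 8] ^= 1 << (index % 8)
    return bytes(corrupted)


message = b"Sing, O goddess, the anger of Achilles"
sent_parity = parity_bit(message)      # transmitted alongside the message

received = flip_bit(message, 13)       # simulate one bit damaged in transit
needs_resend = parity_bit(received) != sent_parity

print("resend requested:", needs_resend)
```

The receiver never needs to know which bit went bad; a mismatch is enough to ask for the data again.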
If you expect a hard drive to fail after three years (I'm guessing) and these occurrences are randomly distributed (an assumption that will hold after running this thing for a year or two), you can expect the 4000 hard drives in this array to produce between 3 and 4 failures per day. This thing would never be at full speed! It would be constantly rebuilding its RAID. Also, it would cost about $300 a day just in replacement hard drives (not to mention controllers, power supplies, et cetera).
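The failure estimate is easy to reproduce. This is a rough sketch under the comment's own assumptions (4000 drives, a guessed three-year average lifetime, failures spread evenly over time); the function name is made up for illustration.

```python
DAYS_PER_YEAR = 365


def expected_failures_per_day(drives: int, lifetime_years: float) -> float:
    """Steady-state failure rate if each drive lasts `lifetime_years` on average."""
    return drives / (lifetime_years * DAYS_PER_YEAR)


rate = expected_failures_per_day(drives=4000, lifetime_years=3)
hours_between_failures = 24 / rate

print(f"{rate:.2f} failures per day")
print(f"one failure roughly every {hours_between_failures:.1f} hours")
```

That works out to a bit over three and a half failures a day, or one drive dying roughly every six and a half hours.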