IBM Introduces Petabyte-Capacity 'Storage Tank'
statikuz writes "Wired is reporting that IBM's new data storage system, codenamed "Storage Tank", uses software to link servers in multiple locations over an IP network, creating a sort of mega-server capable of connecting thousands of computers and processing multiple petabytes of data. 'Storage Tank has the potential to become to an organization's data what the Dewey Decimal system is to a library,' said Dan Colby, general manager of storage systems at IBM. 'It reinvents the way information is filed, managed, shared and accessed within an organization.' CERN is currently using a beta version of the system to store data from the Large Hadron Collider particle accelerator, which is being used to recreate the first moments of the Big Bang. IBM expects Storage Tank eventually will be able to handle 10 to 20 terabytes of CERN data. Get your own 'starter configuration' for only $90,000!"
Quote:
"Storage Tank has the potential to become to an organization's data what the Dewey Decimal system is to a library"
Strange that he compares it to a system that few libraries use anymore. Yes, it revolutionized cataloguing. Right before it became obsolete (because it cost too much).
Not too long ago Slashdot reported on the owners of the Dewey Decimal system suing a hotel in New York for using it as the theme for their room numbering. How long until IBM starts suing everyone with a storage tank?
I'm betting that SHOULD be 10 to 20 petabytes. 10 to 20 terabytes isn't actually all that much; Maxtor has 300 gigabyte drives out. A very simple array holding 10-20 terabytes could easily be built.
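To put a rough number on that, here is a back-of-the-envelope sketch of how many 300 GB drives the quoted figure works out to. It assumes decimal units (1 TB = 1000 GB, the way drive vendors count) and ignores RAID parity, hot spares and filesystem overhead, so a real array would need somewhat more. The function name is made up for illustration.

```python
import math

DRIVE_GB = 300    # drive size mentioned in the comment above
GB_PER_TB = 1000  # assumption: decimal units, as drive vendors count


def drives_needed(target_tb, drive_gb=DRIVE_GB):
    """Raw drive count for target_tb of capacity, with no parity or spares."""
    return math.ceil(target_tb * GB_PER_TB / drive_gb)


if __name__ == "__main__":
    for tb in (10, 20):
        print(f"{tb} TB -> {drives_needed(tb)} drives of {DRIVE_GB} GB")
```

Either way it comes out to a few dozen drives, which is why the terabyte figure looks like a typo.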
Storage Tank comes extremely late - it was first promised to come out in early 2001.
According to this article at The Register, IBM failed to deliver promised Storage Tank features such as the ability to "link servers and storage systems from all vendors, making it possible to view and access a file from any system." Instead, it will only support AIX and Windows platforms starting this November. Support for other Unix versions, including Linux, is not expected before mid-2004.
...who read that one link as 'Large Hardon Collider' ...yeesh, I think I need to get out more.
<thud>
"The best argument against democracy is a five minute chat with the average voter."
--Winston Churchill
I always thought a good idea was RAID-style storage across the entire network, so that all the files are spread throughout the network with multiple copies. That way, if two or three computers go down, the data is not lost... kind of a cross between SAN and RAID.
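A minimal sketch of that idea, for what it's worth: hash each file name to pick a starting machine, then put copies on the next few machines in the list. This is not how Storage Tank or any particular product does it; the node names and function names are hypothetical, and a real system would also have to handle rebalancing, consistency and machines joining or leaving.

```python
import hashlib


def place_replicas(filename, nodes, copies=3):
    """Pick `copies` distinct nodes for a file by hashing its name."""
    if copies > len(nodes):
        raise ValueError("cannot place more copies than there are nodes")
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    # Walk forward from the hashed position so the copies land on distinct nodes.
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]


def survives(filename, nodes, failed, copies=3):
    """True if at least one copy of the file sits on a node that is still up."""
    return any(n not in failed for n in place_replicas(filename, nodes, copies))


if __name__ == "__main__":
    nodes = [f"node{i}" for i in range(8)]  # hypothetical machine names
    print(place_replicas("report.doc", nodes))
    print(survives("report.doc", nodes, failed={"node0", "node1"}))
```

With `copies=3` a file survives any two machines going down; surviving any three would take four copies, so the storage cost goes up with the number of failures you want to ride out.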
There's an open source solution called LUSTRE that already stores 100s of terabytes... LUSTRE is already deployed in a few live applications run by the NCSE (hope I remembered that right)....
At the symposium this year, the fellow mentioned they were working on scaling to petabyte storage for next year.
Sincerely,
Mentally Challenged Parents Association
(What's a Petafile, Walter?)
--
Power to the Peaceful
Hope they have lots of backup. Of course, how do you back up a system like this?
"Open the pod by doors, Hal" > "I'm afraid I can't do that, Dave" sudo "Open the pod bay doors, Hal" > alright
...you will discover that 1 petabyte is enough room for more DivX encoded porn than a man could watch in a lifetime with no sleep or bathroom breaks. Think about that for a second.
startime - 10-01-03
endtime - 10-01-13
That "10-20" terabytes line has to be a typo.
I spoke w/ some people from CERN regarding their CASTOR HSM, and a few years ago they were up in the petabyte range already. By now, they probably have at least a few hundred TB online, and probably 5 PB offline, as a conservative guess.
IBM's been doing GPFS filesystems in the > 50 TB size, w/ > 1 GB/sec. throughput for years. That, and even IBM's mid-tier FAStT products can comfortably carry 12 TB on one dual-controller storage head.
Still, further abstracting the issue of locality is very exciting stuff. I'd be interested to see exactly how they go about doing it, and if it's anything that you can't get w/ Lustre when it's ready.
PC moderators can suck my White pierced, tattooed dick. If you think pride == hate, s/dick/Aryan meat mallet/g.
I will not be pushed, filed, stamped, indexed, briefed, debriefed, or numbered. My life is my own.
10 to 20 terabytes of data is what the LHC is going to generate each second while it is running. CERN is expecting to generate at least 5 petabytes of data per year.
It should also be noted that CERN is a large user of lower cost large storage arrays based on 3ware cards, but those won't scale to what the LHC will require.
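For scale, here is a rough sketch of what the 5 petabytes per year figure above means as a sustained write rate and as a yearly pile of disks. It assumes decimal units, a 365-day year with data spread evenly across it, and the 300 GB drives mentioned earlier in the thread; the real machine will not run continuously, so peak rates would be higher than this average.

```python
import math

BYTES_PER_PB = 10**15
BYTES_PER_GB = 10**9
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: non-leap year, evenly spread


def avg_mb_per_second(pb_per_year):
    """Average sustained rate in MB/s needed to store pb_per_year."""
    return pb_per_year * BYTES_PER_PB / SECONDS_PER_YEAR / 10**6


def drives_per_year(pb_per_year, drive_gb=300):
    """Raw count of drive_gb-sized drives filled per year, with no redundancy."""
    return math.ceil(pb_per_year * BYTES_PER_PB / (drive_gb * BYTES_PER_GB))


if __name__ == "__main__":
    print(f"{avg_mb_per_second(5):.1f} MB/s average")
    print(f"{drives_per_year(5)} drives of 300 GB per year")
```

Thousands of drives a year before any redundancy gives an idea of why small 3ware-based arrays would not scale to that.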