Slashdot Mirror


Petabyte Storage Array

knight13 writes "Engadget is reporting that EMC is rolling out a petabyte RAID array. From the article, "And if you're ready for that level of storage, there's now someplace to get it: EMC has launched its first petabyte array, a version of the company's flagship Symmetrix DMX-3 system that includes nine room-filling cabinets of drives." The price? A mere $4 million."

5 of 185 comments

  1. Storage Limits by Doc Ruby · · Score: 1, Informative

    Each of your eyes has about 8Kx6K retinal detector cells, signalling at 40Hz. Nyquist sampling means we'd need 16Kx12K 24bit pixels to fill them at 40fps, or 368Gbps. Two ears at even 1Mbps falls into the rounding error, but 10:1 compression is a reasonable minimum expectation: 3.7Gbps. 100 years is 1450PB. The average American earns about $40K:y for 40 years, or $1600K. That means when we get $1:PB, we'll be able to afford to store everything we see and hear in our entire lives.

    25 years ago, 10MB cost $5K to store, or $1:2KB. Today $1 stores 1GB, a 500K-fold increase. The 1500-fold increase to $1:PB is relatively around the corner.
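
    A quick sketch of the arithmetic above, for anyone who wants to plug in their own numbers. It assumes decimal units (K = 1000, PB = 10^15 bytes) and a 365-day year, and it takes the 3.7Gbps compressed rate as given; note that going from 368Gbps to 3.7Gbps is roughly a 100:1 reduction rather than 10:1. The function names are illustrative only.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # leap days ignored (assumption)


def raw_bitrate_bps(width=16_000, height=12_000, bits_per_pixel=24, fps=40, eyes=2):
    """Uncompressed bitrate for both eyes, in bits per second."""
    return width * height * bits_per_pixel * fps * eyes


def lifetime_petabytes(bitrate_bps, years=100):
    """Data accumulated over `years` at a constant bitrate, in decimal PB."""
    total_bits = bitrate_bps * years * SECONDS_PER_YEAR
    return total_bits / 8 / 1e15


def dollars_per_petabyte(budget_dollars, petabytes):
    """Price per PB at which `petabytes` of storage costs exactly the budget."""
    return budget_dollars / petabytes


if __name__ == "__main__":
    raw = raw_bitrate_bps()
    compressed = 3.7e9  # compressed rate as stated in the comment
    lifetime_pb = lifetime_petabytes(compressed)
    earnings = 40_000 * 40  # $40K a year for 40 years

    print(f"raw bitrate:       {raw / 1e9:.1f} Gbps")
    print(f"implied reduction: {raw / compressed:.0f}:1")
    print(f"100 years:         {lifetime_pb:.0f} PB")
    print(f"affordable at:     ${dollars_per_petabyte(earnings, lifetime_pb):,.0f} per PB")
```

    At the stated figures this comes to a bit under 1,460 PB for 100 years, and $1600K spread over that much storage works out to about $1,100 per PB.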

    --
    make install -not war

  2. Re:On a related note... by Anonymous Coward · · Score: 1, Informative

    There are, in fact. When it does work, you may just see one drive, though you'll probably need to use a setup disk/utility if it doesn't default to something. Numerous people get Compaq/HP ProLiants with SCSI RAID cards to work.

    You might like this, it doesn't just apply to SATA.

    http://linux.yyz.us/sata/faq-sata-raid.html

    http://developer.skolelinux.no/info/prosjektet/delprosjekt/hw-raid-info.html

  3. Re:Kinda Interesting by Theatetus · · Score: 4, Informative

    Imagine just once, you had to wait 4 hours for some tapes to come back onsite. Now that is four hours times approx 40,000 people (number of employees unable to work). That one outage just cost you 160,000 hours of downtime, where you could not serve your customers. Assuming you pay on average $25 per employee per hour, you've paid for the system in one go.

    Only if you use Enron math. You have to pay $25 per employee per hour either way. The only thing that matters is what you mentioned as a side note: revenue from customers lost during the outage. If whatever system relies on this backup is generating you $1,000,000 per hour, then an array like this would pay for itself in one four-hour outage. But that doesn't take into account opportunity cost: you could still be better off if you put that $4 million to use generating revenue; if it made back more than the outage costs you, you're still on top.
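
    To put the two ways of counting side by side, here is a minimal sketch using only the figures from this exchange. The function names are made up for illustration, and the opportunity-cost check is deliberately crude: it just compares the outage loss with whatever the same money would have earned elsewhere.

```python
def idle_wage_cost(hours_down, employees, wage_per_hour):
    """Wages paid to idle staff during an outage (a cost incurred either way)."""
    return hours_down * employees * wage_per_hour


def break_even_revenue_per_hour(system_cost, hours_down):
    """Hourly revenue at which one outage of this length costs as much as the system."""
    return system_cost / hours_down


def array_is_better_use(outage_loss, alternative_return):
    """True if the avoided outage loss exceeds what the money would earn elsewhere."""
    return outage_loss > alternative_return


if __name__ == "__main__":
    system_cost = 4_000_000

    # The parent's accounting: 4 hours x 40,000 employees x $25 per hour.
    print(f"idle wages:        ${idle_wage_cost(4, 40_000, 25):,}")

    # The accounting argued for here: lost revenue needed to justify the array.
    print(f"break-even:        ${break_even_revenue_per_hour(system_cost, 4):,.0f} per hour")

    # Hypothetical comparison: a $4M outage loss against a $5M return elsewhere.
    print(f"buy the array?     {array_is_better_use(4_000_000, 5_000_000)}")
```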

    --
    All's true that is mistrusted
  4. Petabyte drives... by jd · · Score: 5, Informative

    It really depends, and Moore's Law doesn't really apply to it. The jumps tend to be much larger and much more random. The problem is that capacity is limited by several factors: drive speed, disk rigidity, read/write-head speed and the distance the read head is from the disk surface.


    The faster a disk spins, the more disk surface is exposed to the magnetic field used to write to the drive, so the less storage you have. Disk rigidity is important for two reasons - it limits how close the read head can get and it limits how precisely you can know how much disk surface has been visible. The faster you can either read magnetic fields or generate them, the less disk you need to write to, thus increasing storage. The distance of the read head determines the surface area exposed to the magnetic field on writing, so determines how far apart your data must be to not overlap.


    A trivial question might be: using a standard, existing hard disk (but modifying the controller as necessary), can you increase the capacity of a hard drive? The answer is "probably".


    One way to do it would be to add enough RAM such that a fairly substantial portion of the disk can be held in ramdisk on the controller. Because you are then not reading and writing to the disk directly, but going through ramdisk, the speed of the drive becomes much less important. If you slow the drive down substantially, whilst writing to it at the same speed, the data won't be smeared over the disk as much, so you should be able to increase the density.


    In practice, as disk manufacturers don't design their disks with that kind of mod in mind, you are very likely to run into significant problems with defects on the surface that simply aren't visible at 7200 or 15000 RPM. Other problems, such as stability (drives depend a lot on gyroscopic effects and aren't built to go slow), may also limit how much you can cheat on the density.


    Another option would be to seriously cool the read/write head, so that you could flip the magnetic state faster. Again, you're limited. Mechanical devices don't like being freeze-dried - even when they ARE dry. However, you may be able to get some improvement that way.


    If you're just looking for ANY increase in capacity, then that's trivial and requires no engineering (but some programming). Modern computers are very fast, compared to modern hard drives. If you have one physical sector per physical track, then break down the structure entirely in memory, you eliminate the need for inter-sector gaps, physical sector headers, etc. You might be able to squeeze out another 10%-15% by this method, which isn't a whole lot but isn't bad for the effort it would take.
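
    A rough way to estimate what this buys you is sketched below. Every number in it is a made-up assumption for illustration (500 sectors per track, 512-byte payloads, 60 bytes of gap plus header per sector), not a measurement of any real drive, and it ignores ECC, which a single large sector would still need.

```python
def usable_bytes_conventional(sectors_per_track, payload_bytes):
    """Payload bytes on a track that keeps per-sector gaps and headers."""
    return sectors_per_track * payload_bytes


def usable_bytes_single_sector(sectors_per_track, payload_bytes, overhead_bytes):
    """Payload bytes if the same track is written as one big sector.

    The track's raw length is unchanged; only one gap/header remains.
    """
    track_bytes = sectors_per_track * (payload_bytes + overhead_bytes)
    return track_bytes - overhead_bytes


def capacity_gain(sectors_per_track, payload_bytes, overhead_bytes):
    """Fractional capacity gained by switching to one sector per track."""
    conventional = usable_bytes_conventional(sectors_per_track, payload_bytes)
    single = usable_bytes_single_sector(sectors_per_track, payload_bytes, overhead_bytes)
    return single / conventional - 1


if __name__ == "__main__":
    # Hypothetical track layout; change these to match a real format.
    gain = capacity_gain(sectors_per_track=500, payload_bytes=512, overhead_bytes=60)
    print(f"estimated gain: {gain:.1%}")
```

    With these made-up numbers the gain comes out at about 11.7%, but the result is only as good as the assumed per-sector overhead.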


    There are very likely other mods that hard disk manufacturers could use, but which would be totally beyond anyone doing homebrew stuff. The platters probably aren't using the absolute ideal materials - let's face it, they're in business to make money and there are far more home buyers wanting cheap drives than there are perfectionists wanting perfect drives. I suspect there are other areas they could improve on, using existing technology, but won't because it's not economic.


    That's probably why you see bursts of improvement. When there's a massive enough need for the extra storage, it can be achieved. When there isn't, it's not worth the extra investment.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  5. Much better drives means lower failure rates by Terje Mathisen · · Score: 2, Informative

    The Internet Archive Project http://www.archive.org/ is running on the PetaBox http://petabox.com/ rack system, which was commercialized by Capricorn Tech http://www.capricorn-tech.com/ more than a year ago.

    This system uses absolutely no board/controller level redundancy; instead, they use a separate file system on every disk, then mirror pairs of 1U units, and finally mirror the entire (mirrored) rack to a geographically distant location.

    I am currently testing a much denser solution, the SATABeast http://nexsan.com/products/products/satabeast/satabeast.html from Nexsan http://nexsan.com/ which manages to pack 42 SATA drives of 500 GB each into a single 4U rackmount box. With multiple RAID5 volumes and shared hot spare drives, this results in about 17-18 TB of usable file system space.
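
    For reference, the capacity arithmetic can be sketched as follows. The group and spare counts are assumptions picked for illustration, since the actual volume layout isn't given here; the only fixed inputs are 42 drives of 500 GB and the RAID5 rule that each group gives up one drive's worth of space to parity.

```python
def raid5_usable_tb(total_drives, drive_gb, groups, hot_spares):
    """Usable capacity in decimal TB for equal-sized RAID5 groups plus shared spares."""
    data_drives = total_drives - hot_spares
    if groups < 1 or data_drives % groups != 0:
        raise ValueError("drives must divide evenly into the RAID5 groups")
    per_group = data_drives // groups
    if per_group < 3:
        raise ValueError("RAID5 needs at least three drives per group")
    # Each RAID5 group loses one drive's capacity to parity.
    usable_drives = groups * (per_group - 1)
    return usable_drives * drive_gb / 1000


if __name__ == "__main__":
    # Hypothetical layouts: 2 shared hot spares, 4 or 5 RAID5 groups.
    print(raid5_usable_tb(42, 500, groups=4, hot_spares=2))  # 18.0
    print(raid5_usable_tb(42, 500, groups=5, hot_spares=2))  # 17.5
```

    File system overhead and binary (TiB) reporting would bring the figure the OS shows somewhat lower than these raw numbers.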

    According to the Nexsan engineer I spoke with today, they do so much burn-in testing of the Hitachi Deskstar drives they ship that, over the 15-18 month period they've used these drives, the total error rate has been just 0.4%.

    Even if these numbers are somewhat skewed due to many systems (i.e. drives) being relatively recently installed, it is still very impressive.

    For our setup we plan to use multiple full boxes, each connected to a separate NFS server. Each server has multiple FC host adapters, so if a server crashes, the corresponding box can be connected to one of the other servers.

    We will also use rsync to mirror all data across the country to a secondary site.

    Terje

    --
    "almost all programming can be viewed as an exercise in caching"