Slashdot Mirror


27 Billion Gigabytes to be Archived by 2010

Lucas123 writes "According to a Computerworld survey of IT managers, data storage projects are the No. 2 project priority for corporations in 2008, up from No. 4 in 2007. IT teams are looking into clustered architectures and centralized storage-area networks as one way to control capacity growth, shifting away from big-iron storage and custom applications. The reason for the data avalanche? Archive data. In the private sector alone electronic archives will take up 27,000 petabytes (27 billion gigabytes) by 2010. E-mail growth accounts for much of that figure."
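As a quick sanity check on the headline figure, the conversion in the summary can be reproduced in a few lines. This is only a sketch and assumes decimal (SI) units, which the summary does not state explicitly; the variable names are illustrative.

```python
# Assumption: decimal (SI) units, i.e. 1 PB = 10**15 bytes and 1 GB = 10**9 bytes.
PETABYTE = 10**15
GIGABYTE = 10**9

archive_petabytes = 27_000
archive_gigabytes = archive_petabytes * PETABYTE // GIGABYTE

# For comparison only: the same count read as binary units (PiB -> GiB).
archive_gibibytes_if_binary = archive_petabytes * 2**20

print(f"{archive_gigabytes:,} GB")
print(f"{archive_gibibytes_if_binary:,} GiB if binary prefixes were meant")
```

Under the decimal reading, the result matches the "27 billion gigabytes" in the headline.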

7 of 178 comments

  1. So, in other words... by thesymbolicfrog · · Score: 5, Interesting

    From the summary:
    "E-mail growth accounts for much of that figure."

    We're archiving spam?

    1. Re:So, in other words... by goodtim · · Score: 5, Interesting

      Actually, I have a partial answer to this question. As a sysadmin for a Novell GroupWise email system, I can tell you that the actual message data for duplicate incoming messages (such as spam sent to many people at the same time) is stored on disk only once. Some sort of "pointer" is used to link the message to each individual user's mailbox. Check out the docs if you are interested.
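The "stored once, referenced by pointer" idea can be sketched roughly as follows. This is not how GroupWise actually implements it (the comment only says "some sort of pointer"); it is a minimal illustration that assumes a content hash is used as the reference. The class and method names are hypothetical.

```python
import hashlib


class SingleInstanceStore:
    """Toy message store: each distinct body is kept once; mailboxes hold references."""

    def __init__(self):
        self.bodies = {}     # digest -> message bytes (stored once)
        self.mailboxes = {}  # user -> list of digests (the "pointers")

    def deliver(self, user, body):
        # Assumption: a SHA-256 digest of the body serves as the pointer.
        digest = hashlib.sha256(body).hexdigest()
        self.bodies.setdefault(digest, body)  # only stored if not seen before
        self.mailboxes.setdefault(user, []).append(digest)
        return digest

    def read(self, user):
        # Resolve each pointer back to the shared message body.
        return [self.bodies[d] for d in self.mailboxes.get(user, [])]

    def stored_bytes(self):
        return sum(len(b) for b in self.bodies.values())


store = SingleInstanceStore()
spam = b"Cheap pills, act now!"
for user in ("alice", "bob", "carol"):
    store.deliver(user, spam)
store.deliver("alice", b"Meeting moved to 3pm")

print(len(store.bodies), "bodies on disk for", len(store.mailboxes), "mailboxes")
```

Three users receive the same spam, but only one copy of its body is kept; each mailbox just holds a reference to it.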

      That said, with about 1400 users (spread across multiple post offices), we have probably about 400 GB of email data. We are able to keep it low by having a 120-day retention policy. After that point, email can be archived locally; otherwise it's deleted. Independent of that, and to comply with regulations and cover disaster recovery scenarios, email data is backed up and replicated offsite using disk-to-disk backup (eVault, in case anyone is interested).

      This gives us the ability to archive email for up to 27 years or something like that (with relatively low storage costs because the disk-to-disk is incremental, storing changes at the per-block level).
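To illustrate why per-block incremental backup keeps storage costs low, here is a rough sketch of the general technique. It is not eVault's actual implementation; the block size, function names, and use of SHA-256 are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size; real products choose their own


def block_digests(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(previous_digests, data, block_size=BLOCK_SIZE):
    """Return {block_index: block_bytes} for blocks that are new or differ
    from the previous backup's digests."""
    changes = {}
    for index, digest in enumerate(block_digests(data, block_size)):
        if index >= len(previous_digests) or previous_digests[index] != digest:
            start = index * block_size
            changes[index] = data[start:start + block_size]
    return changes


monday = bytes(BLOCK_SIZE * 4)      # four zero-filled blocks
tuesday = bytearray(monday)
tuesday[5000] = 1                   # one byte changes, inside block 1
tuesday = bytes(tuesday)

delta = changed_blocks(block_digests(monday), tuesday)
print("blocks to send:", sorted(delta))
```

Only the one block containing the changed byte needs to be shipped offsite, rather than the whole file. The sketch ignores files that shrink between backups, which a real tool would have to handle.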

      As for Microsoft Exchange, I have not the slightest clue how data is stored.

      --
      "Flee at once, all is discovered."
    2. Re:So, in other words... by LoudMusic · · Score: 2, Interesting

      From the summary:
      "E-mail growth accounts for much of that figure."

      We're archiving spam?

      No, we have associates using their email as a file storage device, sending documents to each other through email rather than just sending an email that says "Your 38MB file is on the file server in /X/here/where/there/document.type".
      --
      No sig for you. YOU GET NO SIG!
  2. how much is surveillance data? by petes_PoV · · Score: 2, Interesting
    E-mail growth accounts for much of that figure

    And a great deal of video archive from CCTV as well, I expect.
    The question that arises is how would you index all this?

    --
    politicians are like babies' nappies: they should both be changed regularly and for the same reasons
  3. Re:Wow, welfare for programmers... by phoebusQ · · Score: 3, Interesting

    How do you figure that storage needs driving the increase in disk capacities and creating jobs is "a huge drain on the economy"?

    And what do data-archiving rules have to do with welfare for programmers? Maybe for disk manufacturing firms or data admins, but programmers?

  4. Redundant Data by tm8992 · · Score: 2, Interesting

    I wonder how much of this data is really redundant--copies of other data. How many emails can really be unique? How many employees download the same video a hundred times on the company's server? As network speeds increase, it will be less necessary for multiple users to store the same thing (think streaming those videos), so could this really be an exaggeration of future storage requirements? Could a better system be designed to minimize redundancy?
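One rough way to put a number on that question is to hash every stored item and count how many bytes are exact copies of something already seen. The sketch below assumes byte-identical duplicates only (near-duplicates and partially overlapping files are not detected), and the sample data is made up for illustration.

```python
import hashlib


def redundancy_ratio(items):
    """Fraction of total bytes that are byte-identical copies of an earlier item."""
    seen = set()
    total = unique = 0
    for data in items:
        total += len(data)
        digest = hashlib.sha256(data).digest()
        if digest not in seen:
            seen.add(digest)
            unique += len(data)
    return 0.0 if total == 0 else 1 - unique / total


# Hypothetical sample: one "video" saved four times, plus one unique memo.
video = b"v" * 1000
memo = b"m" * 250
ratio = redundancy_ratio([video, video, video, memo, video])

print(f"{ratio:.1%} of stored bytes are duplicates")
```

Running something like this over a real file server or mail store would give an estimate of how much a deduplicating system could save, at least for exact copies.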

  5. a helpful reference page for large numbers by HappyEngineer · · Score: 4, Interesting

    Here is my helpful reference page for big numbers. I love big numbers. I'm actually working on a site right now which will help people to visualize big numbers. I can't give out the URL yet because it'll be another month or two before it's ready to be seen. But it'll have many fun options like Cow Stacking and Hamster Canyon.

    Cow stacking is where you select cow as the animal and from earth to moon as the place and you'll see a graphic of cows being stacked to the moon and the number of cows which would be required to complete that stack.

    Hamster Canyon will be where you select a hamster and the Grand Canyon and you'll see a picture of the Grand Canyon filled with hamsters and a number that indicates the total number of hamsters required to fill the canyon.