Slashdot Mirror


"Digital Universe" Enters the Zettabyte Era

miller60 writes "In 2010 the volume of digital information created and duplicated in a year will reach 1.2 zettabytes, according to new data from IDC and EMC. The annual Digital Universe report is an effort to visualize the enormous amount of data being generated by our increasingly digital lives. The report's big numbers — a zettabyte is roughly a million petabytes — pose interesting questions about how the IT community will store and manage this firehose of data. Perhaps the biggest challenge isn't how much data we're creating — it's all the copies of it. Seventy-five percent of all the data in the Digital Universe is a copy, according to IDC. See additional analysis from TG Daily, The Guardian, and Search Storage."

19 of 137 comments

  1. Re:Who cares? by Stooshie · · Score: 4, Insightful

    Maybe you won't but then you are not CERN or the Hadron Collider

    --
    America, Home of the Brave. ... .and the Squaw.
  2. Hardware: "Digital Universe" Enters the Zettabyte by Thanshin · · Score: 2, Interesting

    "In 2010 the volume of digital information created and duplicated in a year will reach 1.2 zettabytes, according to new data from IDC and EMC. The annual Digital Universe report is an effort to visualize the enormous amount of data being generated by our increasingly digital lives. The report's big numbers -- a zettabyte is roughly a million petabytes -- pose interesting questions about how the IT community will store and manage this firehose of data. Perhaps the biggest challenge isn't how much data we're creating -- it's all the copies of it. Seventy-five percent of all the data in the Digital Universe is a copy, according to IDC."

  3. How do we have copies of all this data? by HockeyPuck · · Score: 2, Insightful

    Since this is EMC, let me tell you...

    EMC loves to tell you to use RAID1. - 2 copies of your data
    If it's important, you should use timefinder (snapshots), 1 more copy of the data.
    If you want DR, then you should implement SRDF, 1 more copy of the data (this one is remote)
    If you want to do data warehousing on what you just replicated, you run timefinder on the remote copy, 1 more copy.

    So that makes it 5 copies of my data on disk.

    Oh, and to protect myself from data corruption (or a deleted file) being replicated to all these copies, it's still recommended that I backup to tape/VTL/MAID.

    Total of 6 copies of data. That is if I'm using dedup on my VTL or TSM (which stores versions of a given file). If I'm using a traditional scheme (daily incrementals plus weekly fulls), I could have lots of duplication within my tape infrastructure.

    Ever wonder why EMC stands for Endless Mirroring Company?
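For anyone who wants to check the tally above, here is a minimal sketch that just adds up the copies per layer. The labels are informal descriptions of the steps in the comment, not real EMC API names, and the tape layer is counted as a single copy on the assumption that the backup target deduplicates.

```python
# Copies of one dataset, layer by layer, following the tally in the comment.
# Labels are informal descriptions, not EMC product or API identifiers.
copies_per_layer = [
    ("RAID1 mirror", 2),
    ("TimeFinder snapshot", 1),
    ("SRDF remote replica", 1),
    ("TimeFinder on the remote copy", 1),
    # Assumption: the backup target deduplicates, so it counts as one copy.
    ("tape/VTL/MAID backup", 1),
]


def total_copies(layers):
    """Sum the copy counts across the given layers."""
    return sum(count for _, count in layers)


on_disk = total_copies(copies_per_layer[:4])  # everything except the backup
overall = total_copies(copies_per_layer)

print(f"copies on disk: {on_disk}, copies overall: {overall}")
```

With a traditional fulls-plus-incrementals rotation the last entry would be larger than 1, which is the point of the comment.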

  4. Re:Hardware: "Digital Universe" Enters the Zettaby by cgenman · · Score: 2, Insightful

    Only 75%? Considering that all DVDs are copies and all local caches are copies, I wouldn't be surprised if that number was much larger.

    Also, cutting out all the copies would only reduce the problem to .3 zettabytes. For day-to-day IT purposes, that's about the same number.
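A quick sketch of that arithmetic, using the 1.2 zettabyte and 75% figures from the summary and decimal (SI) units, which is an assumption about how the report counts. Integer math keeps the result exact.

```python
# Decimal (SI) units are assumed: 1 ZB = 10^21 bytes, 1 PB = 10^15 bytes.
ZETTABYTE = 10**21
PETABYTE = 10**15

total_bytes = 12 * ZETTABYTE // 10   # 1.2 ZB, from the summary
copy_percent = 75                    # share that is a copy, per IDC

# Whatever is not a copy is original data.
original_bytes = total_bytes * (100 - copy_percent) // 100

petabytes_per_zettabyte = ZETTABYTE // PETABYTE
original_petabytes = original_bytes // PETABYTE

print(f"1 ZB = {petabytes_per_zettabyte:,} PB")
print(f"original data: {original_petabytes:,} PB")
```

So even with every copy removed, the remainder is still 0.3 zettabytes, or 300,000 petabytes.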

  5. I'm happy to see by ltning · · Score: 2, Funny

    That we have all become good citizens, backing up all our data. I presume the data recovery firms are all panicking now that all their potential customers have backups of everything, and thus no longer need their services.

    Not bad to have a global backup ratio of >1:1

    Personally I use RAIM (Redundant Array of Instant Messages) to back up all my important notes and communications. It only works as long as all my friends log everything too, of course.

    --
    Love over Gold.
    1. Re:I'm happy to see by natehoy · · Score: 2, Funny

      Dude, that's so old-school. I use RAT (Redundant Array of Tweets). My data is backed up... 140 characters at a time.

      I'm thinking of upgrading to a system with a larger packet size. RASC (Redundant Array of Slashdot Comments) might work, but I'm afraid of having my pr0n collection marked "insightful".

      --
      "This post contains words, known to the State of California to cause thought. Wash brain thoroughly after reading."
  6. What is the data? by sadtrev · · Score: 2, Interesting

    I was told about 10 years ago that "70% of the world's digital data is stored under MVS" which surprised me a bit, even then.
    After some thought, when you consider that almost all commercial transactions (banks, telcos etc.) would have been running on MVS, it may have been true.
    SETI and CERN and other large scientific endeavours are small fry in comparison.

  7. Challenge? by O('_')O_Bush · · Score: 2, Insightful

    " Perhaps the biggest challenge isn't how much data we're creating — it's all the copies of it. "

    Why is that a challenge? Digital media is somewhat unique in that you can carefully craft media or information (reports, programs, videos), much in the same way you'd carve a chair, but risk instantly and nearly irrecoverably losing it (much unlike a chair).

    Copies of data are a safeguard by redundancy. A website gets taken offline, well good thing there is a mirror. My camera breaks or my hard drive disk fails, well good thing I have an external backup or copies on my DVDs.

    --
    while(1) attack(People.Sandy);
  8. Re:Hardware: "Digital Universe" Enters the Zettaby by Rockoon · · Score: 2, Insightful

    In the world of home storage, 75% is definitely way too low. The average personal desktop probably has 20 to 40 gigabytes of used storage, with far less than 1 gigabyte being original content. If they also back up this data, the original fraction drops even lower.

    Everything on their DVR is also not original.

    Now, in the business world things are a bit different. Here you can expect the same 20 to 40 gigabytes of used storage on the median machine, but backed by a massive networked database of original uptime-critical content with at least a couple mirrors.

    It is this second category that is clearly driving their estimate.

    --
    "His name was James Damore."
  9. Re:Who cares? by natehoy · · Score: 4, Funny

    Yes, 640 petabytes should be enough for anybody.

    --
    "This post contains words, known to the State of California to cause thought. Wash brain thoroughly after reading."
  10. Re:Hardware: "Digital Universe" Enters the Zettaby by PlusFiveTroll · · Score: 3, Insightful

    If every piece of digital data doesn't have a copy made of it, it is one hardware failure away from non-existence. Most of the storage space used in businesses that I administrate is not for the original data, but for multiple backup copies. Copies are not a bad thing; in the business we call them redundancy.

    # Only wimps use tape backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it ;)
            * Torvalds, Linus (1996-07-20).

  11. Of COURSE most of it consists of copies! by King_TJ · · Score: 2, Insightful

    A typical individual wouldn't have a whole lot of unique information to store in the first place.... Basically, a collection of photos and some video from a few vacation trips or holidays, and some handwritten notes .... Maybe some artistic works (a few original songs or paintings, or ?) if he/she was interested in such endeavors. Oh, and your tax records and resume. But let's face it. Most of us are FAR more of content consumers than creators. Content creation usually results in mass re-distribution of the original work, as others want to enjoy a copy of it.

    I don't see any harm with this either, since duplication is the best way to protect against data loss. (When my parents were trying to trace their family history, they reached a dead-end because a library had burnt up in a fire that contained the only known records of some of the people they needed to research. With so much data going digital, on media that's practically EXPECTED to fail after less than 10 years of regular use? You better believe we need lots of duplicates out there!)

  12. Re:Who cares? by CODiNE · · Score: 4, Funny

    Don't you mean

    Yes, 640 petabytes should be enough for everybody.

    --
    Cwm, fjord-bank glyphs vext quiz
  13. 1.21 zettabytes? by nonregistered · · Score: 5, Funny

    1.21 zettabytes? Great Scott!

  14. Retarded IP by static416 · · Score: 5, Insightful

    This beautifully illustrates how idiotic the concept of "copy right" and IP in general is in the digital universe. When 75% of 1.2 zettabytes is mostly untracked copies of other information, just storing the licenses alone would be an impossible task.

    How do you maintain a business model built on the exclusive right to copy information in a world where everything is infinitely copied and copyable? It's like trying to legislate and sell access to saltwater while floating on a raft in the middle of the Pacific.

  15. Re:Hardware: "Digital Universe" Enters the Zettaby by iamhassi · · Score: 2, Insightful

    HD home movies and photographs are far more than 1 GB.

    --
    my karma will be here long after I'm gone
  16. Re:Library of congress by OctaviusIII · · Score: 2, Interesting

    According to Wikipedia, it's about 10^9 Libraries of Congress, not including images.

    --
    What's this? Another weblog? On transit?
  17. Space Program by Ukab+the+Great · · Score: 5, Funny

    - 1 zettabyte / 1.44MB floppy disk = approx 694,444,444,444,444 floppy disks.

    - 694,444,444,444,444 * 3.5 inches per disk = 2,430,555,555,555,550 inches if you laid the floppies end to end.

    - 2,430,555,555,555,550 inches / 63360 inches per mile = 38,361,040,965 miles

    - 38,361,040,965 miles / 2.7 billion miles to Pluto = approx 7 round trips to Pluto via floppy disk.

    In conclusion: Don't kill NASA yet, President Obama. We've found a way to get to Pluto!
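The floppy arithmetic above can be reproduced with a few lines. This is only a sketch under the post's own assumptions: "1.44 MB" is taken as 1.44 × 10^6 bytes, a zettabyte as 10^21 bytes, and the 2.7 billion mile distance to Pluto is the post's figure, not a checked one.

```python
# All constants follow the figures used in the post.
ZETTABYTE_BYTES = 10**21
FLOPPY_BYTES = 1_440_000        # "1.44 MB" taken as 1.44 * 10^6 bytes
FLOPPY_LENGTH_INCHES = 3.5
INCHES_PER_MILE = 63_360
MILES_TO_PLUTO = 2.7e9          # one-way distance as stated in the post

disks = ZETTABYTE_BYTES / FLOPPY_BYTES
inches = disks * FLOPPY_LENGTH_INCHES
miles = inches / INCHES_PER_MILE

one_way_trips = miles / MILES_TO_PLUTO
round_trips = one_way_trips / 2

print(f"floppy disks: {disks:,.0f}")
print(f"miles end to end: {miles:,.0f}")
print(f"round trips to Pluto: {round_trips:.1f}")
```

The chain of floppies covers the one-way distance about 14 times, which is where the "approx 7 round trips" comes from.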

  18. Too many duplicates consuming disk space? by RhapsodyGuru · · Score: 2, Insightful

    No problem...

    zfs set dedup=on tank

    there... that should do the trick.
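For readers wondering what block-level deduplication buys you, here is a toy sketch of the general idea: identical blocks are detected by their checksum and stored only once. It is an illustration only, not how ZFS implements its dedup table; the function names `dedup_store` and `restore` are made up for this example.

```python
import hashlib


def dedup_store(blocks):
    """Store each distinct block once, keyed by its SHA-256 digest.

    Returns (store, refs): store maps digest -> block bytes, and refs
    lists the digest of every input block in its original order.
    """
    store = {}
    refs = []
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # keep only the first identical block
        refs.append(digest)
    return store, refs


def restore(store, refs):
    """Rebuild the original block sequence from the store and references."""
    return [store[digest] for digest in refs]


# One original block plus three identical copies, mirroring the 75% figure.
blocks = [b"original"] + [b"copy"] * 3
store, refs = dedup_store(blocks)

print(f"blocks written: {len(refs)}, blocks actually stored: {len(store)}")
```

The trade-off is that every write now needs a checksum lookup, so a real system pays for the saved space with memory and CPU.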