Slashdot Mirror


Info Glut - Five Exabytes of Data Created in 2002

securitas writes "If you had any doubts that you are overwhelmed by the volume of information in your life, a new Berkeley study (PDF) shows that five exabytes of data were created in 2002, twice the 1999 total. That's five million terabytes of data, or 500,000 Libraries of Congress, which works out to about 800 MB of data for each of the 6.3 billion people on the planet. Of note is that 92 percent of the new information was stored on magnetic media, which may create an interesting problem for historians and archaeologists of the future. The study was conducted by University of California-Berkeley's School of Information Management and Systems professors Peter Lyman and Hal Varian. More at CNet, Infoworld, ByteAndSwitch and The Register."
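The figures in the summary can be sanity-checked with a few lines of arithmetic. This is only a rough sketch: it assumes decimal units (1 exabyte = 10^18 bytes), which the summary does not specify, and the variable names are illustrative.

```python
# Assumption: decimal (SI) units; the summary does not say which convention it uses.
EXABYTE = 10 ** 18
TERABYTE = 10 ** 12
MEGABYTE = 10 ** 6

new_data_bytes = 5 * EXABYTE        # data created in 2002, per the summary
world_population = 6.3e9            # population figure quoted in the summary
libraries_of_congress = 500_000     # comparison quoted in the summary

terabytes = new_data_bytes / TERABYTE
per_person_mb = new_data_bytes / world_population / MEGABYTE
implied_loc_tb = terabytes / libraries_of_congress

print(f"{terabytes:,.0f} TB in total")
print(f"{per_person_mb:.0f} MB per person")
print(f"{implied_loc_tb:.0f} TB per Library of Congress (implied)")
```

Under these assumptions the per-person figure comes out just under 800 MB, and the comparison implies roughly 10 TB per Library of Congress.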

23 of 284 comments

  1. And about 1% was worthwhile by XNuke · · Score: 4, Insightful

    It looks like they are counting every tiny email about "going to lunch". Lots of DATA little INFORMATION.

    1. Re:And about 1% was worthwhile by uberdave · · Score: 3, Interesting

      I wonder how much of that was duplicate data. How many copies of the Matrix are floating around online? Did they count FTP mirror sites as separate data?

      For that matter, how much of the data is real, and how much is virtual? If two sites point to the same download, is that data counted twice, or once?

    2. Re:And about 1% was worthwhile by Jason1729 · · Score: 3, Interesting

      That's a good point. How much of that was spam?

      ProfQuotes

    3. Re:And about 1% was worthwhile by tachin · · Score: 4, Insightful
      Lots of DATA little INFORMATION.
      From data you can extract "information", take a lot of those "going to lunch" mails and you can see what groups of people lunch together and at what time....
    4. Re:And about 1% was worthwhile by Tenebrious1 · · Score: 4, Informative

      I wonder how much of that was duplicate data. How many copies of the Matrix are floating around online? Did they count FTP mirror sites as separate data?

      The blurb said 92% was stored on magnetic media; curious about the rest, I glanced through the article. Surprisingly, a large part, 7%, is FILM! The reason film comprises such a large percentage is that each film reel is duplicated thousands of times to be sent to theaters around the world.

      So if they're counting duplicates in film, I'd guess they'd count duplicates in magnetic media.

      --
      -- If god wanted me to have a sig, he'd have given me a sense of humor.
    5. Re:And about 1% was worthwhile by kfg · · Score: 5, Funny

      "I wonder how much of that was duplicate data."

      3% was [AOL] Me Too! [/AOL] posts.

      1% was In Soviet Russia jokes.

      0.5% Profit!!!

      So I guess there was a fair amount of duplication.

      KFG

  2. Sounds about right. by Matey-O · · Score: 4, Insightful

    That's a believable number. Consider the amount of published data on Kazaa, or that 45 minutes of raw DV video is roughly 12.5 GB*. Move 100 of your CDs to MP3s and you're consuming/creating roughly 3.5 GB* (or more if you're using higher than 128 kbps MP3s). And I'm not even commenting on pr0n.

    (*I said roughly...comment on the comment, not the mathematical precision of the statement.)

    --
    "Draco dormiens nunquam titillandus."
  3. Yeah... by the_mad_poster · · Score: 4, Funny

    ...and most of it is still sitting in my Inbox at work right now.

    --
    Alito: A vote for Alito is a punch in the eye to put that bitch back in her place!
  4. This article says 23 exabytes by SirJaxalot · · Score: 3, Informative
    1. Re:This article says 23 exabytes by Vaevictis666 · · Score: 3, Informative
      Your article states:

      They found that new information flowing across televisions, radios, telephones, Web sites and the Internet had increased by 3 1/2 times to a total of 18 exabytes as of 2002. The amount of new but stored (non-transmitted) information in 2002 was determined to be about five exabytes.

      This jibes with the other articles. 5 exabytes generated content, 18 exabytes transferred content - still one heck of a lot of bits floating around :)

  5. Huzzah! by GaelenBurns · · Score: 4, Interesting

    Hooray for exponential curves! It is daunting, though. As an illustration of this, I read that the White House has already turned over 2 million pages of documents relating to 9/11 to the independent investigation panel.

  6. Re:Damn by Carnildo · · Score: 3, Funny

    You've got a thousand times your allotment of porn! Think of all the poor people in Africa who you are depriving of their annual allowance!

    --
    "They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
  7. quote by CGP314 · · Score: 5, Interesting

    All of the books in the world contain no more information than is broadcast as video in a single large American city in a single year. Not all bits have equal value. --Carl Sagan

  8. that's a LoC per minute, almost. by sulli · · Score: 3, Funny
    525,600 minutes per year. Impressive.

    But if these data were recorded on floppies, and stacked up to the moon n times, how many VWs would it take to carry those floppies to the stack site?

    --

    sulli
    RTFJ.
  9. Storage by 3Suns · · Score: 3, Interesting

    I work at EMC, and this fact (along with projections for similar growth in the future) is a big marketing strategy for the company, especially toward investors. The storage market grows with the amount of information produced... it's gotta be stored somewhere!

    --

    -3Suns

    ~~~~
    The Revolution will be Slashdotted
  10. True it's a lot of info to create, but... by The+Jonas · · Score: 4, Insightful

    ...how much info is destroyed each year to offset these numbers? I mean shredded files, stuff thrown in the trash, bills, deleted data files, discarded/lost storage media, etc... In the end (of each year), I wonder, what is the actual increase in stored information?

  11. It's only going to get worse... by mengel · · Score: 3, Interesting

    At Fermilab where I work, the larger experiments are expecting to generate 1PB/year of data in around 2005, up from somewhere around 300TB/year currently.

    --
    - "History shows again and again how nature points out the folly of men" -- Blue Oyster Cult, 'Godzilla'
  12. Re:No problem here. by GaelenBurns · · Score: 5, Interesting

    I wonder how many pages of paper an exabyte of data would take up? We're talking about gigantic masses, here. Why not figure it out? I'm guessing, based on character counts from Open Office, that you can get about 2kB of data on a single sheet. That's 4kB if you use both sides. And you get around 125 sheets per pound... So, based on some guesses, it looks like it will take 2,251,799,813,685 pounds of paper to print one exabyte of this data. For all 5 exabytes, we're looking at a weight 122 times that of the Great Pyramid. Not as much as I'd suspected... but still fun!
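The paper estimate above can be reproduced with a short sketch. It uses only the guesses stated in the comment (2 kB per side, two sides, 125 sheets per pound). Binary units (1 kB = 1024 bytes, 1 exabyte = 2^60 bytes) are an assumption here, chosen because they match the quoted pound figure; the function name is illustrative.

```python
# Guesses taken from the comment: 2 kB per side, both sides used, 125 sheets per pound.
# Assumption: binary units (1 kB = 1024 bytes, 1 EB = 2**60 bytes).
BYTES_PER_SHEET = 2 * 2 * 1024
SHEETS_PER_POUND = 125
EXABYTE = 2 ** 60

def pounds_of_paper(num_bytes):
    """Weight in pounds of the paper needed to print num_bytes of data."""
    sheets = num_bytes / BYTES_PER_SHEET
    return sheets / SHEETS_PER_POUND

one_eb_pounds = pounds_of_paper(EXABYTE)
five_eb_pounds = pounds_of_paper(5 * EXABYTE)

print(f"{one_eb_pounds:,.0f} pounds per exabyte")
print(f"{five_eb_pounds:,.0f} pounds for five exabytes")
```

With these inputs the first line prints 2,251,799,813,685 pounds, the same figure given in the comment. The Great Pyramid comparison depends on a pyramid weight the comment does not state, so it is left out of the sketch.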

  13. My figures by robogun · · Score: 3, Interesting

    I just did another backup, so the figures are right at hand.
    I'm a news photographer, shooting digital.
    In 2002 I saved 78,742 photos to disk. (Bad images were not saved.)
    That worked out to 122 gig. The output was transferred from the CF cards and archived to DVDs.
    But how much of that 122 gig is really information? The image file saved by the Canon 1d is mostly empty air, as far as I can tell. There is also EXIF data and IPTC, and who knows how much hidden BS is included à la Microsoft Word documents?
    Simple compression was able to whittle that down to 33.2 gig. So that's my contribution.
    The main beneficiaries are the blank DVD-R makers and Western Digital, I guess.
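For a sense of scale, the backup figures above work out as follows. This is a rough sketch that assumes "gig" means 10^9 bytes, which the comment does not specify; the variable names are illustrative.

```python
# Figures quoted in the comment; "gig" is assumed to mean 10**9 bytes.
photos_saved = 78_742
raw_gb = 122.0
compressed_gb = 33.2

avg_photo_mb = raw_gb * 1000 / photos_saved   # average size per saved photo
compressed_fraction = compressed_gb / raw_gb  # share left after compression

print(f"Average photo size: {avg_photo_mb:.2f} MB")
print(f"Compressed size is {compressed_fraction:.1%} of the original")
```

On those numbers each saved photo averages about 1.5 MB, and compression leaves a bit over a quarter of the original volume.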

  14. Re:800 MB per person by Anonymous+Crowhead · · Score: 5, Funny

    I personally burned over 500 CDs last year

    Congrats, you balanced out 1 medium-sized tribe in Africa.

  15. Re:Let's get the standard jokes out of the way by NumLk · · Score: 3, Funny
    You forgot these jokes:

    I for one welcome our new data generating overlords!

    With all that data you'd think that my conne3^&#5$ATDT01[NO CARRIER]

    In Soviet Russia data generates YOU!

    Homer: I see they have the Internet on computers now.

    --
    Children in the backseats don't cause accidents. Accidents in the back seats cause children.
  16. Reminds me of this observation: by targo · · Score: 4, Funny

    5 billion files are created every day.
    3 billion of them will never be found again.
    Poor files...

  17. Re:No problem here. by indianajones428 · · Score: 5, Funny


    So 122 Great Pyramids = 500,000 Libraries of Congress?

    Great, another conversion factor to remember...

    --
    When a thing has been said, and said well, have no scruple. Take it and copy it. --Anatole France