Info Glut - Five Exabytes of Data Created in 2002
securitas writes "If you had any doubts that you are overwhelmed by the volume of information in your life, a new Berkeley study (PDF) shows that five exabytes of data were created in 2002, twice the 1999 total. That's five million terabytes of data, or 500,000 Libraries of Congress, which works out to about 800 MB of data for each of the 6.3 billion people on the planet. Of note is that 92 percent of the new information was stored on magnetic media, which may create an interesting problem for historians and archaeologists of the future. The study was conducted by University of California-Berkeley's School of Information Management and Systems professors Peter Lyman and Hal Varian. More at CNet, Infoworld, ByteAndSwitch and The Register."
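For anyone who wants to check the summary's arithmetic, here is a rough sketch. It assumes decimal (SI) prefixes, which the summary does not actually specify, and takes the 6.3 billion population figure at face value.

```python
# Back-of-the-envelope check of the summary's figures.
# Assumption: decimal (SI) prefixes, i.e. 1 exabyte = 10**18 bytes.
EXABYTE = 10**18
TERABYTE = 10**12
MEGABYTE = 10**6

total_bytes = 5 * EXABYTE   # data created in 2002, per the summary
population = 6.3e9          # world population quoted in the summary

terabytes = total_bytes / TERABYTE
mb_per_person = total_bytes / population / MEGABYTE

print(f"{terabytes:,.0f} TB total")
print(f"{mb_per_person:.0f} MB per person")
```

Under those assumptions the per-person figure lands a little under the "about 800 MB" quoted, which is close enough for a headline number.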
It looks like they are counting every tiny email about "going to lunch". Lots of DATA, little INFORMATION.
That's a believable number. Consider the amount of published data on Kazaa, or that 45 minutes of raw DV video is roughly 12.5 GB*. Move 100 of your CDs to MP3s and you're consuming/creating roughly 3.5 GB* (or more if you're using higher than 128 kbps MP3s). And I'm not even commenting on pr0n.
(*I said roughly...comment on the comment, not the mathematical precision of the statement.)
"Draco dormiens nunquam titillandus."
...and most of it is still sitting in my Inbox at work right now.
Alito: A vote for Alito is a punch in the eye to put that bitch back in her place!
With all the time I spend at work, it seems like I've created about half of that.
That's an awful lot of pr0n!
OpenOffice tips:richhillsoftware.com
That's a lot of porn. Though I think their stats are off a bit, as I have 800gb of porn, not mb. Oh well, better luck next year!
Looking for hardware (Currently need: Large Etch-a-Sketch) Have one? See my journal!
IS IN PDF! Now we know who to blame...
a new Berkeley study (PDF) shows that five exabytes of data were created in 2002,
:-)
Shoot, it felt like my doctoral dissertation was responsible for at least 2 of those 5 exabytes.
Visit Jonesblog and say hello.
here is the article
Geminatron
Subject says it all
The ultimate network admin tool needs HELP!
Of note is that 92 percent of the new information was stored on magnetic media, which may create an interesting problem for historians and archaeologists of the future.
In 70, 60, maybe even 50 years it might be difficult accessing today's hard disks with the future's technology. And of course (as always) it brings about the problem of how long the data lasts before it's corrupted.
When anger rises, think of the consequences.
Confucius (551 BC - 479 BC)
.. just slashdotters copying porn
Of note is that 92 percent of the new information was stored on magnetic media, which may create an interesting problem for historians and archaeologists of the future.
Well, why won't they just print it ? Sheesh...
United States of America, good ol' backers of world peace.
Hooray for exponential curves! It is daunting, though. As an illustration of this, I read that the White House has already turned over 2 million pages of documents relating to 9/11 to the independent investigation panel.
I guess I can buy that new HardDrive I was looking at now.
How about temporary and efferent data, like SSH keys and data passed through X11, used for short point-to-point transfers? It might be just me, but if this doesn't take into account that data, the total could be much higher...
I'm the Devil the Windows users warned you about.
as i just received another couple of letters asking for assistance from the war torn regions of africa, how much of this is spam and related garbage?
oddly enough the most useful information is often the most concise. duck!
Hmmmmm.... I think I might know where all that 'new data' came from.
"Lawyers are for sucks."
- Doug McKenzie
Subject says it all for me but since this requires a body...
For those curious the dictionary's definition of data is as follows.
Factual information, especially information organized for analysis or used to reason or make decisions.
Computer Science. Numerical or other information represented in a form suitable for processing by computer.
Values derived from scientific experiments. Plural of datum.
I have a Cig, but do you have a light?
Glad to see Slashdot is contributing to the glut by reporting on it...
But how many {VW Beetles, encyclopedias, football fields, Coke cans, DVDs, hours of porn} is that?
Of note is that 92 percent of the new information was stored on magnetic media, which may create an interesting problem for historians and archaeologists of the future.
That's because about 57 percent of it was internet porn.
All of the books in the world contain no more information than is broadcast as video in a single large American city in a single year. Not all bits have equal value. --Carl Sagan
I hope that Varian, et. al. realize that by publishing this study, they are adding to the problem.
In the long run, the second law of thermodynamics will take care of this.
From the article, Varian (an economist) states:
``We're producing all this information, but we don't necessarily have the tools to use it most effectively,'' he said.
What does it mean to use data "effectively", and is the "We" producing the data the same "We" using it? My first instinct on not having the tools to use this data most effectively is "that's good". My second instinct tells me that data is already being used TOO effectively. Personally, I hope that cross-reference of mass data stores containing personal information does NOT become more effective.
dd if=/dev/random of=/dev/zero bs=89458905980359804890448 count=403908538905980358904895983
But if these data were recorded on floppies, and stacked up to the moon n times, how many VWs would it take to carry those floppies to the stack site?
sulli
RTFJ.
So what the writeup is saying is that there's a whole lotta data, which is a problem, and that 92% of that data probably won't survive that long, which is a problem. It sounds like these two problems cancel each other out! (That is, as long as the 8% that does survive is the useful stuff.)
I'm 25, I've never had a girlfriend and I have no prospects
Heck, you sound like the average Linux developer. Grab some code and start hacking.
I think more needs to be done to preserve the important e-mails of government for posterity. The DoD and other agencies do not back up or retain e-mails in any meaningful way, nor does the White House or National Archive have any kind of e-mail policy, AFAIK. Hard disks and, by extension, e-mail suffer from the time limit of magnetic media...eventually all those ions disappear and there is no *magnetic* in the media.
CDs have the translation problem...what happens in 150 years when the standards are corrupted or lost and nobody can acknowledge the binary code in any meaningful format?
I work at EMC, and this fact (along with projections for similar growth in the future) is a big marketing strategy for the company, especially toward investors. The storage market grows with the amount of information produced... it's gotta be stored somewhere!
-3Suns
~~~~
The Revolution will be Slashdotted
Is that 5 Exabyte 8505's or did they use 8505XL's?
This page accidentally left blank
People ALWAYS have prospects somehow. You just have to think about it some more and get some help from friends or professionals if you have problems figuring out what to do.
...of course, if you still wanna kill yourself, jumping off of some very high thing is the most beautiful way out... but still, don't do it :)
United States of America, good ol' backers of world peace.
I gave my 200 GB. What did you give.
--"Sorry for the inconvience." Gods Last Words to his Creation
DNA, So Long and Thanks for all the Fish
Today we have dug up another spinning thing in a case made of platters and heads. While we are not sure what exactly they were used for, we think they were used to play with the "charge" of Milky Wayan matter. The "charge" is simply a property of the particles that make up the matter in that some have simple attraction or repulsion. This appears to have been harnessed in a primitive way as to allow simple machines to be created solely out of that matter. These spinning things appear to have been used to store small amounts of information - the heads seem to have been made to transfer this "charge" property to a stronger "charge" property in other parts of the component. This may have been used in some of their simple machines to "display" the information in another place at a much later time.
- Many large companies are building VERY large data warehouses, to capture and analyze every iota of information about every transaction. In a year or two, much of today's data will be largely irrelevant, and will likely be summarized and deleted.
- People send a lot of email, and post a lot of messages, about day-to-day stuff that has no long-term value.
- Surveillance video is used more than ever. This is not going to be stored long-term, except perhaps in the most security-sensitive areas.
Either way, I highly commend the article's author for using both "Libraries of Congress" and "feet of books" as measurement units.
You only get to count data you have generated yourself, anything you got from somewhere else (99% of porn, everything on P2P apps) doesn't count.
As such, I think I'm under my one-cd-per-person (800mb) limit for the year, but I do know a few friends (artists) that would definitely be over :P
Another interesting question is whether data conversion counts - If I copy a CD to oggs, or a DVD to Divx, does that count as new data created for the purposes of this study?
http://www.wired.com/wired/archive/11.09/full.html
If Slashdot were chemistry it would look like this:Cadaverine
How much of that was in kids' artwork for the refrigerator door? Cause that would store a lot better in a vector file format...
the major advances in civilization are processes which all but wreck the societies in which they occur - A.N. White
Like the previous posters mentioned, it is really about quality not quantity. Who cares if all of this so called important information is on magnetic media. The constitution was written on shredded tree pulp that was compressed and dried to an unstable piece of paper; somehow we've managed to go all these years without losing track of *that* important piece of information.
How *did* we do it???
But how many football fields long is that?? Let's try to put that in some context that Joe Sixpack like me can understand!
There's a Mercedes gap too. I want one and can't afford one, but it's not government's job to do anything about it.
I think the more interesting thing to study would be to determine how much unique data is being generated. I mean who cares if two million people have the latest Britney Spears song in mp3 format? And that's not even talking about "information", but just simply raw "data". I also wonder if they took into account "data in transit" (being transmitted over the ethernet) and temporary data (caches, etc).
On slashdot, there have been topics on digital media durability in the past (run your own searches, I'm too lazy). It really couldn't hurt to start archiving stuff on to material that can last hundreds of years if not longer. Is there any digital media that could do that? It couldn't be magnetic because that deteriorates over time, and it couldn't be CD etched as the scratches tear away at it piece by piece....thoughts?
...in bed
...how much info is destroyed each year to offset these numbers. I mean shredded files, stuff thrown in trash, bills, deleted data files, discarded/lost storage media, etc... In the end (of each year), I wonder, what is the actual increase in stored information?
At Fermilab where I work, the larger experiments are expecting to generate 1PB/year of data in around 2005, up from somewhere around 300TB/year currently.
- "History shows again and again how nature points out the folly of men" -- Blue Oyster Cult, 'Godzilla'
Wow, that sounds even more than Gazillion!
My new harddrive will be no less than 1.2EB...
Tera, Giga, Exa, Don't give it to me in those terms. Put it in terms I can understand!
Just how much of that was porn?
-Goran
Carpe Scrotum - The only way to deal with your competition.
Methinks the word "Data" may be used more loosely than Kathleen Fent-Malda's pussy.
500,000 Libraries of Congress, huh? I've always had several problems (SI questions aside) with this unit of measurement. The Library of Congress is constantly expanding & adding new material. What year Library of Congress do they mean? I imagine they aren't working w/ up to the minute data and that the library is expanding much faster now. Not to mention the fact that everyone always makes exabytes ~13% smaller than they really are (10^18 bytes instead of 2^60, and with numbers this big, it actually makes a difference!)... So call me the new number nazi troll already and get it over with...
Webmaster Wanted - Entropic Reactions
Why is it that every amount of data is expressed as either x Libraries of Congress or y Encyclopedia Britannicas, as if either of those is actually an approachable figure? I want to lobby for a new measure, such as x two hour porn DVDs or y illegally downloaded songs.
The truth about Scientology, Xenu, and you: Operation Clambake
pr0n + spam + kazaa
!(^((ri)|(mp))aa$)
It repeatedly calls malloc() and free(), storing information in RAM, which may create an interesting problem for historians and archaeologists of the future.
Then think about how many of those bytes are actually backed up, even if they are irreplaceable.
I'd bet not much. And what is backed up may only have a shelf life of about 20 months if on poor CD-R or Floppies.
Saskboy's blog is good. 9 out of 10 dentists agree.
Damn- that puts some stuff in perspective... 800 MB per person is really not that much... just over one CD per person on the planet.
I personally burned over 500 CDs last year, filled a couple of hard drives, and sent God knows how much email...
I think this goes to show what a wealthy little world we computer people live in.
... how much of it was porn? :)
Hey, way to add another 800k to the glut with this pdf file!!
Try going out to a bar/dance club and getting shit drunk some night.
I just did another backup, so the figures are right at hand.
I'm a news photographer, shooting digital.
In 2002 I saved 78,742 photos to disk. (Bad images were not saved.)
That worked out to 122 gig. The output was transferred from the CF cards and archived to DVDs.
But how much of that 122 gig is really information? The image file saved by the Canon 1D is mostly empty air, as far as I can tell. There is also EXIF data and IPTC, and who knows how much hidden BS is included à la Microsoft Word documents?
Simple compression was able to whittle that down to 33.2 gig. So that's my contribution.
The main beneficiary is the DVD-R blank disc makers and Western Digital, I guess.
It really makes you wonder how much of that data is just redundant waste.
How many other sysadmins out there are tired of hearing this? Every time I go to a company and even suggest quotas on the file server, the engineering group always says, "Disk space is cheap," or "you can get an 80GB disk for cheap."
Of course, this never takes into account backup media and the whole backup infrastructure (anyone price decent commercial backup software recently?).
I'm surprised it's only five exabytes. The admins of the world should go ahead and put a 400MB Quota on all 6.3 Billion people. That way, we'd be down to 1999's storage levels....
It's about 6 Exabytes.
It's a joke..
Friends don't help friends install M$ junk.
If we used analog computers instead of digital, how would this be measured?
tasks(723) drafts(105) languages(484) examples(29106)
that's about two week's worth of linux-kernel traffic.
efferent adj.
1. Directed away from a central organ or section.
2. Carrying impulses from the central nervous system to an effector.
Doesn't just one experiment produce 45 zillion
megabytes. (Don't quote me on that.)
An mp3 is usually about 1 meg a minute. But a raw wav file is several times more. The same goes for raw video versus MPEG-2 or QuickTime.
I suppose the number could be much larger if you expand data before counting it.
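To put a rough number on "several times more", here is a quick sketch. The settings are assumptions, not anything from the study: CD-quality PCM (44.1 kHz, 16-bit, stereo) for the raw wav and a 128 kbit/s MP3.

```python
# Assumptions: CD-quality PCM for the "raw wav" and a 128 kbit/s MP3.
# Other sample rates or bitrates change the ratio.
SAMPLE_RATE = 44_100    # samples per second
BYTES_PER_SAMPLE = 2    # 16-bit audio
CHANNELS = 2            # stereo
MP3_BITRATE = 128_000   # bits per second

wav_bytes_per_min = SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS * 60
mp3_bytes_per_min = MP3_BITRATE / 8 * 60
ratio = wav_bytes_per_min / mp3_bytes_per_min

print(f"wav: {wav_bytes_per_min / 1e6:.2f} MB/min")
print(f"mp3: {mp3_bytes_per_min / 1e6:.2f} MB/min")
print(f"expanding the mp3 to raw audio multiplies the size by about {ratio:.0f}")
```

With those settings the MP3 does come out near 1 meg a minute, and the uncompressed version is roughly an order of magnitude bigger.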
I don't understand, how many elephants does an exabyte weigh?
Looks like 599, assuming said motion picture is a complete rotting turd. Thanks for gems like this one, MPAA!
-Looking for a job as a materials chemist or multivariat
"Of note is that 92 percent of the new information was stored on magnetic media, which may create an interesting problem for historians and archaeologists of the future."
I'm going to start investing in rock quarry companies right away. I predict all future data will be chiseled on rock like in the olden days!!!
:: Either way, I highly commend the article's author for using both "Libraries of Congress" and "feet of books" as measurement units.
Even though it knows the Answer to Life, the Universe, and Everything and number of feet in 10 metres, it can't convert 10 libraries of congress into feet of books.:(
I demand that this be fixed immediately!;)
Is this a sigs-optional kind of place? 'Cause I am totally down with that if you know what I mean.
log2(5 exabytes) is a little over 62!
(62.3 for RAM style exabytes or 62.1 for HD style exabytes).
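A quick way to reproduce both figures, as a sketch: "RAM style" is taken here to mean 2^60 bytes per exabyte and "HD style" to mean 10^18 bytes, which is my reading of the comment rather than anything it spells out.

```python
import math

# Assumed interpretations of the two conventions in the comment above.
ram_style_bytes = 5 * 2**60    # binary exabytes
hd_style_bytes = 5 * 10**18    # decimal exabytes

ram_log2 = math.log2(ram_style_bytes)
hd_log2 = math.log2(hd_style_bytes)

# How much smaller a decimal exabyte is than a binary one.
shortfall = 1 - 10**18 / 2**60

print(f"RAM style: {ram_log2:.1f}")
print(f"HD style:  {hd_log2:.1f}")
print(f"decimal exabyte is {shortfall:.1%} smaller than binary")
```

Either way, 63 bits are enough to number every byte created in 2002.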
Thoughts on tech, Software Engineering, and stuff
Not least for those historians who want to know what my Amazon.com session ID was on the day that my Runescape character hit mining level 33.
Shop as usual. And avoid panic buying.
What's the big deal? That's only five 8mm tapes, isn't it?
Call me old fashioned, but I like a dump to be as memorable as it is devastating - Bender
For he talks shit!
So.. what will the archeologists *really* think when they dig up our hard drives?
That is FUCKING AWESOME!
Let me know if that works, so I can try it!
...how many golf balls falling on said stack it would take to knock it over. And if you laid all the bits in the data side by side, I wonder how many times it would go around the earth?
-Looking for a job as a materials chemist or multivariat
I'm a library science student. I'll have my MLS in December, and I've found a lot about this topic. In fact, I'm sitting in the library science library right now.
For books, the standard is that any book should last for at least 500 years (Though this is a problem, what with all the acidic wood pulp paper publishers have used since the mid-1800s). The much-hated microfilm has that same lifespan.
But we are nowhere close to finding a viable archival format for electronic information.
This is a problem. There is so much important stuff, but digital formats change so fast we can't keep up. And the reliability of computer hardware is another can of worms.
Libraries and Archives would bow down to anyone who found a format that remains viable, readable, and usable for perhaps the next century.
Now, here's a little math for you
United States of America, good ol' backers of world peace.
...in Volkswagen vehicles?
I recommend something messy and embaressing involving autoerotic asphysixiation while speeding down the highway.
You know, something we'll get a laugh at when we read about it on annanova.
Been there.. tried that. A change of scenery helps. Try getting out, and doing something radically different in/with your life.
Don't worry about being able to read old legacy data formats. If there's any interest in the data, there's somebody somewhere who will write an interpreter / converter / emulator for it. Just look at the 8-bit emulation scene.
the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff
5 billion files are created every day.
3 billion of them will never be found again.
Poor files...
When men used to be men
long x;
for (;;) {          /* loop forever */
    x = rand();     /* rand() is declared in <stdlib.h> */
    send_to_info_glut(x);
}
Please send the data created to Info Glut, and while you're at it, send it to all the spammers and to SCO. With some luck, you might DDOS them off the internet.
Wh47 d1d j00 541, 31337 15n't t3h r0xor5 ne m0r3???
> ...it's gotta be stored somewhere!
/dev/null is the prime choice of storage medium. This should really be an opportunity for companies producing high speed, high capacity null-devices.
For most of it
Where are the VC when one needs them?
they seemed to have missed my massive collection of porn
And,
Floppy disk volume: 0.0889m * 0.0889m * 0.015875m = 0.00012546345875m^3
VW Jetta Cargo capacity: 368.119 liters = 0.368119m^3 (assuming all seats in place, and NOT the wagon model)
So, 763549741511.11 floppies * 0.00012546345875m^3 = 95797591.4976523121517125m^3
divide that by the 0.00012546345875m^3 per Jetta, and we get:
~7.635 x 10^11 Jettas required to ferry the floppy disks to the dump site!
And all I want is a VW minibus. makes me seem quite modest..
free speach
Did you mean: free speech
I'm 173.205 percent sure these numbers are not very accurate. I'm 314.159 percent sure that they won't affect how I sleep. And I'm 628.318 percent sure that the funding for this kind of "research" has an upper bound.
WWJD for a Klondike Bar?
I'm still attempting to figure out how to hook up my 20MB hard drive from my first computer (it's not IDE) and get one very small (less than 100k) file.
:) :)
Being the usual procrastinator, it gets more and more difficult to retrieve this file.
The hard drive was hooked up to a 286 through two cables (one small, one large, not including power of course) and went to a daughter board.
If anyone has any suggestions on how to retrieve this data that would be super
-Steve
Candle burns its brightest in the dark
How will the robots ever survive without reliable data to recreate our world accurately?
:(
I don't want chicken to taste like everything
I just downloaded a WinXP "patch" - better chalk up another exabyte.
hell i know i'm personally responsible for several gigs, at least 500. so does that mean there are a bunch of people running dataless? also if it's only 800MB per person, hell all we would need is for EVERYONE to have a PC with a 1GB drive and make a gigantor sized cluster out of them all.
I just want to point out that 800 MB per person works out to 1,600 slices of 512x512 CT data (the standard size of CT slices at 16 bits per voxel) - which means that this amount of data is roughly the same thing as about a 1mm * 1mm * 1mm CT scan of every human on the planet.
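The slice count above can be reproduced with a few lines. This sketch assumes binary megabytes (2^20 bytes) for the 800 MB figure and a 1 mm spacing between slices; neither is stated explicitly in the comment.

```python
# Assumptions: 1 MB = 2**20 bytes, and one slice per millimetre.
MIB = 2**20
SLICE_PIXELS = 512 * 512    # standard CT slice resolution from the comment
BYTES_PER_VOXEL = 2         # 16 bits per voxel

slice_bytes = SLICE_PIXELS * BYTES_PER_VOXEL
slices = 800 * MIB // slice_bytes

# Height covered if slices are stacked 1 mm apart.
height_m = slices / 1000

print(f"{slice_bytes} bytes per slice")
print(f"{slices} slices per person")
print(f"{height_m} m of scan at 1 mm spacing")
```

A stack that tall is in the range of an adult's height, which is why the "CT scan of every human" comparison roughly works.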
Education is the silver bullet.
in 2002 I personally created about 400-500GB of data.
sometimes, I really have to wonder about studies like these and where they get their info from. . .
the history of the world
Statistics like this only serve to amaze and astound pointy haired boss types. Oh my God! They shriek. Do we REALLY??? Meanwhile, the world keeps turning, we all keep getting up in the morning, and I keep wishing I could get laid. Just once. I mean, REALLY!
Seriously, though, I bet the breakdown is something like this:
1. Most of the "information" is probably composed of music and film. We all know how much bandwidth and disk space music and film take up. Here's another thing: different sites might have different copies of a film, so there's probably a lot of duplication. Not to mention the zillion copies of any given song that are being passed around. I really don't think of this stuff as "information". It's more "entertainment" than anything else. Some of it may be interesting for archival purposes (news footage, for instance) but the news companies already do this. THIS AIN'T A PROBLEM, FOLKS.
2. Another large chunk of the "information" they're kvetching about is probably (almost certainly) composed of transitory messages like emailed messages and IM. This stuff was never meant to be hoarded. And it doesn't matter. It's used, it disappears, that's it.
3. Yet another large chunk of this "info" is probably control messages passed around the web as internal controls (ICMP, etc). Again, this stuff is transitory, like emailed memos. Who cares?
4. Getting into the "real stuff", you have all the ecommerce going on. But each company handles its own backup and storage. This is not a societal problem, this is an individual problem. Companies can deal with their own information storage problems. If they design their applications well, they won't have to store so much. But this isn't even that serious a problem there; it's just part of doing business.
5. Then you have informational web sites, and personal sites, and blogs, etc. They come and go -- they always have. Everything interesting gets cached or mirrored anyway. This isn't much of a problem either.
6. Finally, you have real paper documents, like those used by the bank and the government. Ok, some of this might add up. But they've got procedures in place (and they've had them for hundreds of years) to deal with this. Digital technology is actually making this easier, not harder, so that's a good thing, right?
Overall, who cares how much information is generated? It's a useless statistic, like the tonnage of toilet paper people use annually. It might work as filler for, say, a "Ripley's Believe it or Not" strip in the sunday paper, but that's about it. Who cares? If someone started screaming "OH MY GOD, do you know how many TONS of TOILET PAPER America uses in a single YEAR??? IT'S A CRISIS!" wouldn't you slap that person? I would. Unless she was a hot chick (see paragraph 1).
Farewell! It's been a fine buncha years!
Maybe more research could be done into a marketable multi-century (millennial?) storage. For corporate purposes, several decades of fidelity, perhaps a century or two, would be fine - but government will need better than that.
Yeah right. The government wants all historical data destroyed as soon as it is created.
Of note is that 92 percent of the new information was stored on magnetic media, which may create an interesting problem for historians and archaeologists of the future.
They fail to mention that also of note is that 99% of that information is in the form of pr0n! That's a lot!
If I say zettabyte and yottabyte did I just create new measurement terms?
Silly reporters!
I design user interfaces for a free network management application,
Forget spam...
How large is a usual 5 minute MPEG file with stereo sound in a medium resolution? Lots of those were created, way more than spam.
Dangit, Cowboyneal! I told you to turn off that packet sniffer at MAE East!
Now look what you've done.
-Adam
What if you take a page with text and scan it? It can take a size anywhere between 30-1000 KB. The same text can be written in a text editor in 5-6 KB. In MS Word in 60 KB.
2 years back, CD-Rs were the in thing. Everyone and anyone was storing data on them. Since a disc held 700 MB, files were generally smaller and compressed. Now that faster broadband connections and DVD recorders (along with faster processors) are becoming common, people don't care so much about file sizes.
Regarding duplicate data- ask five people to compare what files take up how much of their hard disk.
Maybe slashdot could do a poll on this, asking what percentage of space do movies, music etc take up on the hard disk. This would give a rough guide as to how much data duplication takes place.
If you go to IRC servers, you will see bots with uploading speeds of 2-5-10 Mb/s..
Lots of people download files from there.
Stuff that is interesting to one might be interesting to millions of others on the net.
Similarly, if you check the files downloaded from download.com, you might see a 15 MB application downloaded millions of times.
That is a lot of data duplication.
If the data on the web is say 1 exabyte, then there must be a corresponding amount on the hard drives/backups of people, organisations... who put this stuff on the web in the first place.
If the poster had read the report carefully, they would have seen that the comparison is to the print collection of the Library of Congress. If you add in their audio and film collections they have at least two orders of magnitude more data. Even the LOC doesn't seem to be sure how much their entire collection is.
No electrons were harmed creating this post, though some may have been subjected to electrical and/or magnetic fields.
In a weird way, this reminds me of the Jumping Jesus Phenomenon
Five exabytes of data is a meaningless figure if you consider that probably 52% of that was pr0n. The other 35% was source code (non-human readable data). And the remaining 13% was made up of spam, web logs, and e-mail to grandmaw.
Un-news
"Of note is that 92 percent of the new information was stored on magnetic media, which may create an interesting problem for historians and archaeologists of the future."
Nevertheless, I still think that if historians are to analyze those 8% of the data of the last decade alone, history will become a booming business for the coming centuries.
Regarding web pages:
You read that right, 28% of the internet sampled appears to be porn. Anyone surprised? Read on...
Regarding P2P networks:
This follows my general idea that part of the reason that the internet is as large as it is, is due to the fact that it allows anonymous connection to taboo material.
Clinton made me a Republican. Bush made me a Libertarian. Trump is making me question reality.
just wait until the HDTV porn files start swapping...200 exabytes here we come (no pun intended)
Most optical media does not have any better longevity than magnetic media, and in many cases is actually worse. There are a multitude of problems. For stamped discs, the most insidious is oxidation of the aluminum reflective layer, which reduces the contrast ratio between the pits and lands to a level too low for normal drives to read the discs.
For dye-based writable discs (e.g. CD-R) there is the same problem (though with regard to the pregroove and general reflectivity rather than data pits and lands), and the dye will eventually undergo the same chemical reaction used to write the disc due to ambient temperature and aging.
For phase-change discs (e.g. CD-RW) I expect the temperature and aging problems to be reduced due to the higher activation energy needed for the phase change. However, I am not aware of any actual studies on longevity of phase-change media.
Discs with a gold reflective layer are basically immune to the oxidation problem, but how much of the 8% of data that is not on magnetic media is actually on gold phase-change discs? Probably only a trivial percentage of it.
hmmm....p0rn.
reminds me of that one ep of the simpsons where Bart starts drawing Angry Dad cartoons and Lenny says "It's the number 1 non-porn site on the web; 1 trillionth overall"
it's the other way around.
harmonious design
most of this new 'information' is cryptography-related...I mean, did everyone but me write a book last year?
And most of the rest of it is spam.
As far as I'm concerned, the only new information available to me this year is Stephenson's 'Quicksilver' and the movie 'LotR:RotK'. The rest is just wash.
I admit I take more pictures than most, but I haven't gotten a video camera yet... just think of the Terabytes I'll consume with that bad boy.
--Mike--
But if you've really decided to end it all (I've had these thoughts), consider cashing in all of your assets, and going to maybe the Chicken Ranch or somewhere like that, and finding a girl who excites you, and negotiate a full weekend sex session with all the money you have. Or maybe with two or more girls, if you have the money. You may want to bring some v1agra with You'll end up talking during all that time, and you can tell them your story, about how this is your blaze of glory, and that they'll be the only women you've ever done it with, and they'll think it's so romantic, sort of like the "Leaving Las Vegas" movie, and maybe one of them will fall in love with you, and decide to quit the business and marry you, and support you by only doing lesbian porn. If not, and when your time is up, and they kick you out on your ass, then you will probably have no trouble finding a way to kill yourself at that point. But of course, maybe at that point, you'll decide that the lack of sex with a female isn't worth killing yourself over. You'll have hit rock bottom, and realized that it isn't that bad. You'll rebuild your life and with your newfound clarity and attitude, you will naturally attract women, and you'll live happily ever after.
This is all a big "maybe." Personally, I just recommend the DVD and lube thing.
Helium is the preferred method.
Search for hemlock society.
No GF is no reason to kill oneself anyway.
From the 400 or so years that are classed as Old English (up to about 1150 AD), we have a total of 5 million words in texts. That would probably fit on fewer floppy disks than Windows 3.11 and its DOS. Or in my telephone. It's true that not all bits are equal.
Now, are you using the current Library of Congress Measurement, or are you using an old one? I mean, new books must be coming in. I presume that's not just the ASCII, but scans of the pictures at a decent resolution.
How will I ever do the proper conversions if you aren't using the up-to-date standards?
=Brian
There is nothing so good that someone, somewhere, will not hate it.
Holy crap! There's a lot of everything in the world. Why is data much more exciting?
Dividing 95,797,591m^3 of floppies by 0.368119m^3 per Jetta, the requirement is 260,235,389 Jettas to transport them all there. Or one Jetta, preferably one more reliable than my old thing, 260,235,389 times.
(Is the cargo capacity really that little? I would think it's over a cubic meter. Maybe they reduced the capacity in newer models.)
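For anyone who wants to check the Jetta math above, here is a quick sketch. Both figures are taken straight from the comment; the variable names are made up, and whether 0.368119 m^3 is the right cargo capacity is exactly what the parenthetical is questioning.

```python
# Figures quoted in the comment above; the names are illustrative only.
FLOPPY_VOLUME_M3 = 95_797_591   # total volume of floppies, cubic meters
JETTA_CARGO_M3 = 0.368119       # assumed cargo capacity of one Jetta, cubic meters

# Number of Jetta loads (or trips with a single Jetta) needed.
trips = FLOPPY_VOLUME_M3 / JETTA_CARGO_M3
print(f"{trips:,.0f} Jetta loads")
```

If the real cargo capacity is closer to a cubic meter, swap in that value and the count drops to roughly a third.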
sulli
RTFJ.
That there can't be an accurate data representation of the data in the Library of Congress because THEY don't know how much stuff they have. My cousin worked there this past summer, and he said they still have a large portion of the basement filled up with (unorganized, mind you) stacks of CD's that they haven't even put into their database yet. Same goes for books. It'll be awhile until anybody knows how much data the LoC has.
I belong to the ______ generation.
I wonder how much of that data was duplicate slashdot stories.
Cuz I know the guy hosting this file is going to have a huge bandwidth bill.
You want the truthiness? You can't handle the truthiness!
Okay, call me... A dork, but wouldn't a film reel technically be media and not data?
I mean, come on, why not count all the stuff kids write on paper... Oh wait... Nevermind that comment.
How about the little, itsy-bitsy electric impulses running around in my brain? That's data.. Kinda-sorta... Okay, okay... Most of it is cobwebs, but still.. If a duplicated film reel (aka MEDIA) is counted, then you have to start adding other things to the mix.
Historians/anthropologists/archaeologists are interested in the ways in which the past created its future.
They're not interested in analyzing every lump of dung a past civilization created.
If they have 3 lumps of dung from a million individuals, it's something they'll study. If they have a million lumps of dung from 3 individuals, no.
Just how many copies of the goatse.cx picture do you need to archive, anyway?
The incredibly long thin strip of plastic with the tiny holes running along the edges is the media. The sequence of pictures is the data. What they did was figure out how big an MPEG-2 file would be needed to encode the movie. A lot of what this study measures is not so much how much data was generated, but how much new data storage capacity was generated. For example, if the industry produced 1 million blank CDs, the study would show 700 million megabytes of new data.
"I'm not impatient. I just hate waiting." - My Dad
Rambling in my head: "Maybe I should have read the article first.... Boy do I look stupid!"
They built fudge factors in for this. I read through some of the methods they used. For their internet figures, for example, they sampled 9800 websites of the supposed 61 million URLs compiled by the Internet Archive (enough to get a 95% confidence level), wget/mirrored them to their own servers (dropping links to other domains), and then analyzed the files for creation date, size, and uniqueness. For television: "We estimate about 1/4 of the programs are 'original.'" For CDs, they estimate that 1 in 20 gets trashed. Presumably, these figures are statistically based.
"I'm not impatient. I just hate waiting." - My Dad
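As a rough sanity check on the "9800 of 61 million is enough for 95% confidence" point, here is a sketch of the textbook margin-of-error formula for a sampled proportion. This is an assumption about what such a claim usually means, not the study's actual calculation; the function name, the worst-case p = 0.5, and the finite population correction are my choices.

```python
import math

def margin_of_error(sample_size, population, z=1.96, p=0.5):
    """Worst-case margin of error for a sampled proportion.

    z=1.96 corresponds to a 95% confidence level; p=0.5 maximizes the variance.
    """
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    # Finite population correction; nearly 1 when the sample is a tiny fraction.
    fpc = math.sqrt((population - sample_size) / (population - 1))
    return z * standard_error * fpc

# Sample and population sizes as quoted in the comment above.
moe = margin_of_error(9_800, 61_000_000)
print(f"margin of error: {moe:.2%}")
```

Under those assumptions the margin comes out at roughly one percentage point, so the sample size is at least plausible for proportion-type estimates.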
Well it's a good thing the universe keeps expanding, cause otherwise we might run out of places to put all this data.
In a related report, 4.7 exabytes of that data was swapfiles being written by Windows XP.
TT
Couldn't you know how many hard drives have been shipped, their capacities, estimate how long they last, and then take some random samples of how full people's hard drives are and then make an estimate?
Is that what they did?
Avoid Missing Ball for High Score
It seems wherever you go these days we're taking photos of it, these days usually in digital. Having become the (proud) owner of a Canon 300D 6MP camera in the last few days, I am amazed that in the good old days of the 8086, WordPerfect 4.2 and DOS 3.0 didn't quite fill a 10MB drive, while today that same drive would hold only ~3 JPG photos from the camera ...
and then there is the old saying that junk *will* fill the space provided.
Jon - TheSpork
... is that 2 of those exabytes was just data created by the researchers to discern the amount of data made in 2001.
does that count?
Some days at my job I create gigs of test data every few minutes.
... 5000 years from now?
in their eyes, this century will hardly exist.
But only a fraction of that will make it onto my web site - I have maybe 60 megabytes of photos (cut-down to around 100k each) online and 10 megabytes of text on my web sites, and would be adding less than 40 megabytes a year to that.
Maybe I'll get a video camera, though, or put up some MP3s of my gamelan group...
Danny.
I have written over 900 book reviews
and how much was lost due to people not creating backups?
I've experiments to run, there is research to be done on the people who are still alive.
1. Get 500 TB RAID array
2. Mount at /data
3. cat /dev/urandom > /data/file.dat
4. Wait a while
Does this count as "creating" 500 TB of data? I don't think so. Similarly, many of these comments about Kazaa and P2P are stupid... just because there's 500 TB of data on Kazaa doesn't mean there's 500 UNIQUE TB. Probably over 90% of it is duplicates of other data; after all, that's how P2P functions.
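To make the total-versus-unique distinction concrete, here is a minimal sketch that counts bytes only the first time a given piece of content is seen, using a content hash. The function name and the tiny sample inputs are made up for illustration; this is not how the study measured uniqueness.

```python
import hashlib

def total_and_unique_bytes(blobs):
    """Return (total bytes, bytes of distinct content) for an iterable of bytes objects."""
    seen = set()
    total = 0
    unique = 0
    for blob in blobs:
        total += len(blob)
        digest = hashlib.sha256(blob).digest()
        # Count a blob toward the unique total only on first sight of its content.
        if digest not in seen:
            seen.add(digest)
            unique += len(blob)
    return total, unique

# Hypothetical shares: the same file offered twice, plus one other file.
shares = [b"abc", b"abc", b"de"]
total, unique = total_and_unique_bytes(shares)
print(total, unique)
```

On a real P2P network the gap between the two numbers would be the whole story, since most shared files are copies of each other.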
i dinna believe it ... ... what? ... 5 exa-whatever-bytes of data!
i got this tiny fractal generator (it's like
15 kb in size) but it can easily generat
so you must have been kidding.
as for these "huge" amounts of data they're "producing" in these physical collision tests, well
very simple actually: they haven't figured out
how to make a "small" experiment. nevermind.
oh and don't forget the "massive" amounts of redundant data that is being produced on Internet Relay Chat everyday.
Make sure there is no one standing on the ground before jumping then.. but still, don't do it =)