Slashdot Mirror


Library of Congress To Archive All Public Tweets

After the recent announcement that Groklaw will be archived at the Library of Congress, mjn writes with word that the push to archive more digital content continues: "The US Library of Congress announced a deal with Twitter to archive all public tweets, dating back to Twitter's inception in March 2006. More details at their blog. No word yet on precisely what will be done with the collection, but besides entering your friends' important updates on the quality of breakfast into the permanent archival record, the deal may improve access for researchers wanting to analyze and mine Twitter's giant database."

171 comments

  1. Your tax dollars at work... by Third+Position · · Score: 4, Insightful

    Given the signal to noise ratio for most tweets, I'm not convinced this is a particularly good use of resources...

    Just because you can do something, doesn't mean you have to!

    --
    American Third Position
    Finally, a real choice!
    1. Re:Your tax dollars at work... by sopssa · · Score: 4, Insightful

      It's not like it takes a lot of space to archive them, it's just 140 characters per tweet. There's a lot of useless information in the newspapers and books too, but they have archived them too because some of that info is valuable or might become valuable.

    2. Re:Your tax dollars at work... by Anonymous Coward · · Score: 5, Interesting

      And just because you don't have to, doesn't mean you shouldn't!

      This is probably the best way to capture a snapshot of our current society. Sure, the barrier for entry is a little lower, but I think this will be invaluable for historians who look back and try to understand us.

      Or, if anything, it'll confuse the hell out of them.

      Everyone wins either way!

      captcha: formally

    3. Re:Your tax dollars at work... by Anonymous Coward · · Score: 0

      It might not be, but it clears up who holds the copyrights to all of the messages posted on Twitter. It's certainly not the users.

      If you had any good ideas that you ever mentioned in a tweet, well, you don't own those ideas any more.

    4. Re:Your tax dollars at work... by Anonymous Coward · · Score: 5, Funny

      Hi, @librarycongress! I just took a shit. I am honored that you will be archiving this momentous occasion for future generations.

    5. Re:Your tax dollars at work... by mlush · · Score: 2, Insightful

      Given the signal to noise ratio for most tweets, I'm not convinced this is a particularly good use of resources...

      Just because you can do something, doesn't mean you have to!

      It's a fantastic idea. It's probably only a few TB of data, but it represents the unedited reaction of ordinary people to historical events and a detailed insight into their everyday lives.

    6. Re:Your tax dollars at work... by Anonymous Coward · · Score: 0

      Since you can't copyright an idea, you probably didn't own it in the first place.

      For the limited subset of ideas that are inventions, and the even more limited subset of those which are patentable -- well, if you patented it (or proceed to patent soon enough/meeting correct conditions after tweeting), you still own it, and if not, public disclosure, which prevents someone else from patenting and owning it, is now on record in the LoC.

      (Yeah, the above contains some simplifications of actual patent law -- still a lot better than the all IP=copyright rubbish.)

    7. Re:Your tax dollars at work... by Anonymous Coward · · Score: 0
    8. Re:Your tax dollars at work... by eln · · Score: 1

      I'm reminded of the Futurama episode where they go to a museum of the 20th century and everything there is ridiculously inaccurate because of how information tends to get lost and garbled over time. I can just imagine what a museum of the 21st century will look like if their primary source is old tweets. They'll probably think our self-imposed 140 character limit was due to some bizarre superstition and we worshiped someone known only as "aplusk" as a God whose wisdom came down to us in the form of what will appear to them (and to many of us) as complete gibberish.

    9. Re:Your tax dollars at work... by Dogtanian · · Score: 1

      Hi, @librarycongress! I just took a shit. I am honored that you will be archiving this momentous occasion for future generations.

      Obligatory.

      --
      "Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
    10. Re:Your tax dollars at work... by Moralpanic · · Score: 1

      Just because you don't see a useful or 'good use' of it, doesn't mean that others won't. If you're just randomly looking at tweets, you're going to have a hard time finding anything useful. But if you do a targeted search or data mine, then you can come up with useful stuff. Say for instance, all the tweets somebody has made since the time they went online. You can get a pretty good idea of that individual's development. Or the tweets on a day like 9/11. And like others have mentioned, it's not like resources are scarce to archive them. At 140 characters each, literally trillions upon trillions of tweets could fit on a single hard disk.

    11. Re:Your tax dollars at work... by uncqual · · Score: 1

      Or, it might cause historians to think there was a Little Dark Age in the early part of the 21st century.

      Now, if the LOC would archive /., the historians would know there was a Little Dark Age in the early part of the 21st century (and this post would be evidence that the denizens of the Little Dark Age even knew they were living in such a time).

      --
      Why is there an "insightful" mod and why isn't it "-1"? If I wanted insight, I wouldn't be reading /.
    12. Re:Your tax dollars at work... by mdm-adph · · Score: 1

      You have to remember that the people usually shouting "wargarbal waste of money" to scientific situations such as these aren't the type to give two shits as to generations that come after them, as we've all seen. :(

      Future historians? These people are trying to burn history books today.

      --
      It is by my will alone my thoughts acquire motion; it is by the juice of the coffee bean that the thoughts acquire speed
    13. Re:Your tax dollars at work... by Anonymous Coward · · Score: 0

      And just because you don't have to, doesn't mean you shouldn't!

      Actually, when you're spending tax dollars, that's a pretty damned good rule of thumb you encourage breaking. But hey, tax money is free, right?

    14. Re:Your tax dollars at work... by blair1q · · Score: 2, Insightful

      50 million tweets/day
      140 characters of message
      60 bytes of metadata (timestamp, sender id, etc.)

      10 GB of twitter archive per day
      10 TB per 3 years

      What does 1 TB cost these days? about $100?

      Storage space will indeed be an inexpensive part of the cost, and will decline in price at about the same rate the traffic is growing.
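The estimate above can be reproduced in a few lines. This is only a sketch using the figures quoted in the comment; the constant names are made up for illustration, and one byte per character is an assumption.

```python
# Back-of-the-envelope archive size, using the figures quoted in the comment.
TWEETS_PER_DAY = 50_000_000   # quoted traffic figure
MESSAGE_BYTES = 140           # assumes one byte per character
METADATA_BYTES = 60           # timestamp, sender id, etc.
COST_PER_TB_USD = 100         # the comment's rough price guess


def archive_bytes(days: int) -> int:
    """Raw, uncompressed bytes needed to store `days` worth of tweets."""
    return TWEETS_PER_DAY * (MESSAGE_BYTES + METADATA_BYTES) * days


per_day_gb = archive_bytes(1) / 1e9
three_years_tb = archive_bytes(3 * 365) / 1e12
cost_usd = three_years_tb * COST_PER_TB_USD

print(f"{per_day_gb:.1f} GB/day, {three_years_tb:.2f} TB per 3 years, "
      f"about ${cost_usd:.0f} of disk")
```

Under these assumptions the raw storage bill for three years is on the order of a thousand dollars, which supports the point that disk space is the cheap part.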

    15. Re:Your tax dollars at work... by Chowderbags · · Score: 1

      If they'll be looking at Twitter I don't think I want future historians to understand us.

    16. Re:Your tax dollars at work... by 0100010001010011 · · Score: 1

      That's uncompressed. Toss in bzip, gzip-9 or 7za. It's just plain text, so it should compress by more than 80%.
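One way to check a compression claim like this is to measure it. The sketch below uses Python's standard `zlib` and `bz2` modules on made-up sample messages (not real tweets). Because the sample is highly repetitive, the savings it reports will be far better than a real archive would see, so treat it as a method, not as evidence for the 80% figure.

```python
import bz2
import zlib

# Made-up sample messages, NOT real tweets; real text is far less repetitive.
samples = [
    "Just had breakfast. The toast was excellent.",
    "Stuck in traffic again, going to be late for work.",
    "Watching the game tonight, who else is in?",
    "Library of Congress is archiving tweets? Hello, future historians!",
]
corpus = "\n".join(samples * 250).encode("utf-8")


def saving(raw: bytes, packed: bytes) -> float:
    """Fraction of space saved by compression (0.0 means no saving)."""
    return 1 - len(packed) / len(raw)


deflated = zlib.compress(corpus, 9)   # DEFLATE at level 9, the algorithm behind gzip -9
bzipped = bz2.compress(corpus, 9)     # bzip2 at its highest block size

print(f"raw size:     {len(corpus)} bytes")
print(f"zlib level 9: {saving(corpus, deflated):.1%} saved")
print(f"bz2 level 9:  {saving(corpus, bzipped):.1%} saved")
```

Running the same measurement over a real sample of messages would give a realistic ratio.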

    17. Re:Your tax dollars at work... by omnichad · · Score: 1

      1 trillion bytes = 1TB in the words of HD manufacturers. One trillion 140 character tweets is exactly 140TB.
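The correction is easy to verify. A quick sketch, assuming decimal terabytes and exactly 140 bytes per tweet with no metadata (both simplifications):

```python
TB = 10**12          # decimal terabyte, as drive makers count it
TWEET_BYTES = 140    # one byte per character, no metadata (assumption)

tweets_per_tb = TB // TWEET_BYTES
tb_for_trillion = 10**12 * TWEET_BYTES / TB

print(f"{tweets_per_tb:,} tweets fit in 1 TB")
print(f"{tb_for_trillion:.0f} TB needed for one trillion tweets")
```

So a single 1 TB disk holds a few billion tweets, not trillions.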

    18. Re:Your tax dollars at work... by pushing-robot · · Score: 3, Informative

      It's fun to think of historians as just attributing everything they learn about societies to religion and superstition, but the biggest reason we think pre-Enlightenment civilizations were obsessively religious is because the priest castes were generally among the most literate and the most concerned with preserving knowledge of the past. Much of what we know about history comes through their writings and, therefore, their perceptions. They quite literally wrote history, to a large extent, and our understanding of their society is colored by their bias.

      The Information Age has democratized knowledge to a huge degree. Historians centuries or millennia hence will have plenty of sources other than the lens of the Catholic Church. Given current trends, even just a decade from now a few consumer-grade storage devices could hold everything the Library of Congress or Archive.org contains today. As long as there are a few people in the world interested in preserving it, modern history should be safe.

      --
      How can I believe you when you tell me what I don't want to hear?
    19. Re:Your tax dollars at work... by gyrogeerloose · · Score: 1

      You can get a pretty good idea of that individual's development. Or the tweets on a day like 9/11.

      True enough, although you do have to wonder how much help "9/11? ZOMG--WTF?!" is going to be to future researchers.

      --
      This ain't rocket surgery.
    20. Re:Your tax dollars at work... by Myopic · · Score: 1

      Indeed I agree, especially given the overlap in the topics people tweet about, thus the words/text used in tweets.

    21. Re:Your tax dollars at work... by Myopic · · Score: 1

      Hey, that's interesting and insightful. I never thought of it that way before. I wonder if skeptics and nonbelievers were as common then as now. (In America I peg us at about 20% of the population, with about half of us being in the closet.)

      Imagine an ancient ritual sacrifice of a virgin or something, and one fifth of the crowd is sort of rolling their eyes thinking "really? I mean, really? You guys think that stabbing a girl with a hymen is going to bring you blessings from magical beings in the sky? Get a grip, losers."

    22. Re:Your tax dollars at work... by Anonymous Coward · · Score: 0

      Silly them. They should instead archive Wikipedia -- the definitive source of encyclopedic truth on the Internet. Wikipedia is MUCH MORE widely studied in academia than Twitter, and it is arguably a much more important information source.

    23. Re:Your tax dollars at work... by jecowa · · Score: 1

      It could be useful to people of the future who want to examine the culture of this time period.

      --
      my opportunity to freely express myself with the potential persecution and hangings and such
    24. Re:Your tax dollars at work... by Anonymous Coward · · Score: 0

      Finally got myself a follower...

    25. Re:Your tax dollars at work... by severoon · · Score: 1

      Yes, this truly is a giant database. Let us do math.

      140 characters/tweet * 2 bytes/character * 12E9 tweets = ~3.36TB

      O. M. G. This would fill more than half the hard disk space I have in my NAS...truly massive! (At my company, there was an April Fool's rumor going around on the day that Twitter would be going down for 10 minutes while their high school intern upgraded their "Tweet Storage Unit" (TSU) by adding an extra 2TB drive. Har har!) To be fair, they store a good bit of metadata besides the tweet itself, so let's give them a factor of 10 just for giggles. 30TB! Wow! Truly knee-buckling!
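For anyone who wants to redo the math, here is a minimal sketch with the comment's own inputs. The two-bytes-per-character encoding and the factor of 10 for metadata are the commenter's assumptions, and the variable names are invented for illustration.

```python
CHARS_PER_TWEET = 140
BYTES_PER_CHAR = 2           # commenter's assumption (a fixed two-byte encoding)
TOTAL_TWEETS = 12 * 10**9    # roughly 12 billion tweets, per the comment
METADATA_FACTOR = 10         # the comment's deliberately generous multiplier

text_bytes = CHARS_PER_TWEET * BYTES_PER_CHAR * TOTAL_TWEETS
text_tb = text_bytes / 1e12
with_metadata_tb = text_tb * METADATA_FACTOR

print(f"text only:     {text_tb:.2f} TB")
print(f"with metadata: {with_metadata_tb:.1f} TB")
```

The text-only figure matches the 3.36 TB above; the factor of 10 gives about 33.6 TB, which the comment rounds down to 30 TB.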

      If you happen to work for Google, I will wait courteously as you catch your breath, dry your eyes, and massage your sore ribs. Also, pop the snot bubble coming out of your nose.

      --
      but have you considered the following argument: shut up.
    26. Re:Your tax dollars at work... by russotto · · Score: 2, Funny

      Now, if the LOC would archive /., the historians would know there was a Little Dark Age in the early part of the 21st century (and this post would be evidence that the denizens of the Little Dark Age even knew they were living in such a time).

      When the historians of the 50th century unearth the records of /., they'll realize the Final Dark Age came upon humans in the early part of the 21st century, and that while many saw something happening, none realized the extent. And then they'll click their mandibles in sorrow over what could have been, and move on to the next planet.

    27. Re:Your tax dollars at work... by RealGrouchy · · Score: 3, Funny

      50 million tweets/day
      140 characters of message
      60 bytes of metadata (timestamp, sender id, etc.)

      10 GB of twitter archive per day
      10 TB per 3 years

      Yes, but how much is all of that in Libraries of Congress?

      - RG>

      --
      Hey pal, this isn't a pleasantforest, so don't waste my time with pleasantries!
    28. Re:Your tax dollars at work... by Anonymous Coward · · Score: 0

      I wouldn't admit to such an embarrassing NAS.

    29. Re:Your tax dollars at work... by socsoc · · Score: 1

      Depending on the geographical region of your anecdote, I'd peg it much higher.

    30. Re:Your tax dollars at work... by Anonymous Coward · · Score: 0

      I feel pretty blessed when I stab a virgin.

      And believe me, my "sacrificial instrument" always "finishes them off," if you know what I mean!

    31. Re:Your tax dollars at work... by Anonymous Coward · · Score: 0

      It isn't a good use of resources and if they don't review every tweet they are archiving then they are probably opening up themselves for a major lawsuit if they permanently archive anything hateful, racist, and/or defamatory.

      Conceptually, it's about as meaningful as archiving all the trash from the tabloids, in my own opinion anyway.

    32. Re:Your tax dollars at work... by noidentity · · Score: 1

      Given the signal to noise ratio for most tweets, I'm not convinced this is a particularly good use of resources...

      Obviously it's a project really funded by the DOD, the highest quality source of entropy yet for cryptography.

    33. Re:Your tax dollars at work... by Myopic · · Score: 1

      I live in Wisconsin, grew up in Alaska, and lived for a while in New Hampshire and Massachusetts.

      Also to be clear, for skeptical nonbeliever I refer not only to Christianity or its similar easy-to-characterize religions, but also the Eastern sorts of religions, and the New Age sorts of religion (ghosts, "energy", pagan spirits).

      Obviously, I hope you are right that we amount to greater numbers. Where do you live?

      (Skepticism is a shockingly overdue prevalent worldview.)

    34. Re:Your tax dollars at work... by socsoc · · Score: 1

      California, but I'd say that it's closer to 40 (more if we include skeptics that attend church services just to appease someone else). And maybe it's due to the 20s-30s age group too.

      In WI, I am actually surprised at 20%. I've always felt like an outsider (admittedly not urban areas).

    35. Re:Your tax dollars at work... by Johann+Lau · · Score: 1

      The Information Age has democratized knowledge to a huge degree. Historians centuries or millennia hence will have plenty of sources other than the lens of the Catholic Church.

      I kinda disagree. We are, for the most part, very confined and very conformist in our opinions, very much brainwashed. Maybe even more so than in the dark ages, because there the priests wore uniforms, and the churches were easily identifiable buildings, while today we're just as much gullible puppets as the people in the past, but the strings have become much less visible and are woven much deeper into our society.

      You could also argue that because the church had total power, it had no reason to not be candid in internal records, while what people post in their personal blogs and tweets is, for the most part, insecure people lying to themselves.

      Yeah, I'm partly playing devil's advocate here... just saying it ain't so easy, and there's no reason to declare freedom and enlightenment just yet...

      As long as there are a few people in the world interested in preserving it, modern history should be safe.

      As long as the people who are interested in preserving history are somewhat right in the head maybe, and not interested in "preserving" history because they're using it as their little vehicle.

      The very idea of objectively preserving history is either hilarious or insane, I'm not sure. I'm not saying you shouldn't try, or that nothing insightful comes of it - but know what you're doing, or rather, what you're not doing.

    36. Re:Your tax dollars at work... by Johann+Lau · · Score: 1

      This is probably the best way to capture a snapshot of our current society.

      You mean, of the parts of our society who use Twitter... it's bad enough they exist, but preserving their brain farts?

      captcha: formally

      Good for you.. brain fart for me.

    37. Re:Your tax dollars at work... by Myopic · · Score: 1

      Well, if it's 40% then that's pretty good. But, being from California, can you disclaim the accusation that many Californians are New Age-y hippie cranks?

      In any case, let's hope our numbers keep increasing. If we get to the 40% you suggest, then I think we might start seeing some Out Atheist politicians.

    38. Re:Your tax dollars at work... by socsoc · · Score: 1

      No, I can't rebut that accusation...

    39. Re:Your tax dollars at work... by atisss · · Score: 1

      1/16 of current "Library of Congress archived web" or half of its printed-archive-plaintext size.

      Although I wonder if they are going to print it out, thus increasing its printed size and making this comment totally redundant and useless.

    40. Re:Your tax dollars at work... by severoon · · Score: 1

      6TB is embarrassing? For a NAS used for personal backup? That's not only sufficient, I have lots of room to grow into it...I suppose I should be doubly embarrassed.

      Let me check...hm, nope. I'm not. :-)

      --
      but have you considered the following argument: shut up.
  2. Why? by sexconker · · Score: 0, Insightful

    Seriously, why?

    1. Re:Why? by ColdWetDog · · Score: 1, Interesting

      Seriously, why not? Mayhaps this will be a treasure trove for some unsuspecting social scientist in the 23rd Century. Really, the study of what boring, routine stuff people do day in and day out is important and can yield valuable insights into the past.

      Of course, that assumes that budding social scientists in the 23rd century can read.

      --
      Faster! Faster! Faster would be better!
    2. Re:Why? by oldhack · · Score: 1

      So eons later, whoever inherited this planet discovers this relic "Library of Congress". Seeking the ancient wisdom, they finally manage to decipher it after much struggle, and go:
      WTF?

      --
      Fuck systemd. Fuck Redhat. Fuck Soylent, too. Wait, scratch the last one.
    3. Re:Why? by sopssa · · Score: 0, Offtopic

      So eons later, whoever inherited this planet discovers this relic "Library of Congress". Seeking the ancient wisdom, they finally manage to decipher it after much struggle, and go:

      WTF?

      But maybe the world has always been WTF? All we know are romanticized stories.

    4. Re:Why? by wisnoskij · · Score: 2, Insightful

      I would hope that a social scientist in the 23rd Century does not think that the average human of today posts every triviality in his life like most of the current twitterers.

      --
      Troll is not a replacement for I disagree.
    5. Re:Why? by maxwell+demon · · Score: 2, Interesting

      Soon after, he publishes a paper with his revolutionary new theory: People in the 21st century were so forgetful that they decided to record all details about their daily life in a central database so they could recover it if necessary.

      --
      The Tao of math: The numbers you can count are not the real numbers.
    6. Re:Why? by MobileTatsu-NJG · · Score: 1

      Does the phrase 'history is written by the victors' mean anything to you?

      --

      "I like to lick butts!" by MobileTatsu-NJG (#32700246) (Score:5, Informative)

    7. Re:Why? by Anonymous Coward · · Score: 0

      The world has always been WTF.

      More likely they will say: "No wonder they're gone."

    8. Re:Why? by socsoc · · Score: 1

      brought to you by Carl's Jr.

  3. hmm... by Pojut · · Score: 1

    I could see them archiving tweets that were relevant to pop culture or history...but all of them??? Seems like a waste of time and money to me.

    1. Re:hmm... by Captain+Splendid · · Score: 3, Insightful

      I'm thinking the byte limit on tweets is the main factor here...easier to just scoop 'em all up than to figure how to get the "important" ones.

      --
      Linux, you magnificent bastard, I read the fucking manual!
    2. Re:hmm... by bugi · · Score: 3, Insightful

      all of them???

      Disk space is cheap...

      They should get a copy of the internet archive while they're at it.

    3. Re:hmm... by sopssa · · Score: 1

      Historically, only popular news or writings were archived. Wouldn't it be interesting to see what someone else, normal people, said about Shakespeare or some kings 1000 years from now? All we have now is what was archived - popular writings that governments agreed to.

    4. Re:hmm... by Shakrai · · Score: 1

      Wouldn't it be interesting to see what someone else, normal people, said about Shakespeare or some kings 1000 years from now?

      They were probably too busy watching Medieval Idol to even realize who Shakespeare or the King was ;)

      All we have now is what was archived - popular writings that governments agreed to.

      Which is all we'll have in the future, unless you think the United States Government is liable to be around in a thousand years.

      --
      I want peace on earth and goodwill toward man.
      We are the United States Government! We don't do that sort of thing.
    5. Re:hmm... by Pojut · · Score: 1

      Hmm...that is a good point...

    6. Re:hmm... by Shakrai · · Score: 2, Funny

      They should get a copy of the internet archive while they're at it.

      And alt.binaries too. Think of the "research" potential there... ;)

      --
      I want peace on earth and goodwill toward man.
      We are the United States Government! We don't do that sort of thing.
    7. Re:hmm... by lgarner · · Score: 1

      I agree that it's probably a waste, but I think it'd be an even bigger waste to actually analyze them all to pick the important ones.

    8. Re:hmm... by Anonymous Coward · · Score: 0

      I could see them archiving tweets that were relevant to pop culture or history...but all of them??? Seems like a waste of time and money to me.

      Why should Lady Gaga's tweets be archived but not mine*? Isn't it conceivable that I, Anonymous Coward, will become famous in 20 years time? 50 years after that, Twitter will be gone, and I will be dead while someone is looking up information about my early years to write a biography about me.

      *other than I do not have a twitter account.

    9. Re:hmm... by Trepidity · · Score: 3, Insightful

      I suspect a lot of the interesting information is in the aggregate anyway, not individual tweets: things like trends, analysis of subgroups, linguistic analysis, etc.

    10. Re:hmm... by natehoy · · Score: 4, Interesting

      They were probably too busy watching Medieval Idol to even realize who Shakespeare or the King was ;)

      A jest, I know, but it does demonstrate a serious point.

      Our history books are based on records maintained by the winners of wars, the leaders, the successful, etc. We know a lot about Shakespeare. We know relatively little about how his audiences actually felt about his work.

      We largely speculate as to how life was for the ordinary folk during historical periods based on writings about them, not writings from them. The exception to this is diaries, and not many people maintain those any more. Twitter can help replace some of that perspective.

      Admittedly, Twitter is not an ideal way to get a picture of a society, but you get to hear historical events told from a very different perspective. Actually, you get to hear them from LOTS of perspectives. They may not be an accurate portrayal of the events, but they are a snapshot of how a society reacts to and perceives events.

      Who will represent the narcissists in society for future generations?

      --
      "This post contains words, known to the State of California to cause thought. Wash brain thoroughly after reading."
    11. Re:hmm... by Anonymous Coward · · Score: 0

      Absolutely. I'm inclined to say there *are* no 'important ones', but the totality is very interesting as corpus.

    12. Re:hmm... by Jason+Levine · · Score: 1

      I think that the importance of a single tweet varies depending on who is sending it and who is reading it. If I tweet/twitpic about some activity my children are doing, you might think a giant yawn is being generous. Meanwhile, however, a family member or friend reading it might be genuinely interested in that information. To give another example, if @grantimahara tweets about an upcoming episode of Mythbusters and you are a fan of that show, you'd likely find it very interesting. However, someone else who has no interest in the show would find it boring.

      All that being said, of course, I don't think the number of people who find "I'm in the potty" tweets interesting is very large. (Not that I've ever seen one of these, mind you.)

      --
      My sci-fi novel, Ghost Thief, is now available from Amazon.com.
    13. Re:hmm... by maxwell+demon · · Score: 1

      Isn't it conceivable that I, Anonymous Coward, will become famous in 20 years time?

      At least, you are a quite well-known and active Slashdot user. However, it seems you suffer from a massively split personality.

      --
      The Tao of math: The numbers you can count are not the real numbers.
    14. Re:hmm... by dskzero · · Score: 1

      A waste of money and time would be to hire people to read every goddamn tweet in order to find out the relevant ones.

      --
      Oblivion Awaits
    15. Re:hmm... by kencurry · · Score: 1

      but most people tweet about mundane crap, not what happened on Capitol Hill. i.e., signal to noise will be horrible for trying to decipher What the Hell Happened...

      --
      sigs are for losers (except to point out that sigs are for losers)
    16. Re:hmm... by Anonymous Coward · · Score: 0

      My history professor always talks about how we have all these records about the rich and noble people in history. We even know how many enemas King Louis XIV had! But we don't know what the lives of the common people in France were like at the time.

      For example, we have court records from the Spanish Inquisition, and we know that most people were sent back into society (not executed), but we don't know what life was like for them after that. Were they accepted back into their village? Were they shunned?

    17. Re:hmm... by blair1q · · Score: 1

      That in fact is an ideal reason to do this, and twitter is nearly the ideal forum. The only hole in it is that some people aren't represented. Those who are over- or under-represented can be identified and the weight of their observations adjusted. But those who simply are not recorded will not have had an opinion at all.

      The real problem here is, the LoC is a government entity, and all my experiences with technology provided by government entities have left me less than impressed. Searching the LoC's archive may get you a deeper set of results than searching Twitter (which cuts off results at an unpredictable time in the past), but I'm going to bet that the tool will be slow and have missing or cryptic features and will return the results in a format that's hard to work with. Certainly I don't expect them to provide inferential data mining or self-repairing regexes.

      Can't wait for the first court case that turns on information recorded in a tweet in the LoC.

    18. Re:hmm... by Dogtanian · · Score: 1

      Disk space is cheap...

      Since it's "twitter", surely that should be "cheep"?

      Uh, sorry. :-(

      Anyway, if Twitter messages are 140 bytes and we assume the overhead averages 30% per message, that's 182 bytes per message.

      5.5 tweets per metric kilobyte.
      5,494 tweets per megabyte.
      5,494,505 tweets per gigabyte.
      5,494,505,494 tweets per terabyte.

      Which isn't far short of the earth's population. Figure out the average number of tweets per person on earth, and you know how many $60 1TB hard drives you need to store them all.
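The drive count can be sketched as below, taking 140 bytes plus 30% overhead as roughly 182 bytes per message and using decimal units. The exact figures shift a little with the overhead chosen and with whether a kilobyte is 1000 or 1024 bytes. The total tweet count is an illustrative number in the range quoted elsewhere in this thread, not a measured figure.

```python
MESSAGE_BYTES = 140
OVERHEAD = 0.30          # assumed average overhead per message
DRIVE_BYTES = 10**12     # one decimal terabyte
DRIVE_COST_USD = 60      # drive price quoted in the comment

bytes_per_tweet = round(MESSAGE_BYTES * (1 + OVERHEAD))
tweets_per_drive = DRIVE_BYTES // bytes_per_tweet


def drives_needed(total_tweets: int) -> int:
    """Number of whole drives needed, rounding up (integer ceiling division)."""
    return -(-total_tweets // tweets_per_drive)


example_total = 12_000_000_000   # hypothetical total, for illustration only
count = drives_needed(example_total)

print(f"{bytes_per_tweet} bytes/tweet, {tweets_per_drive:,} tweets per 1 TB drive")
print(f"{count} drives (about ${count * DRIVE_COST_USD}) for {example_total:,} tweets")
```

Swap in a different per-person average or total to see how the drive count scales.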

      The question is the average- do the self-absorbed, narcissistic, twitter-spewing 14-year olds push that up to the point where the large number of people who don't want to or can't use Twitter don't matter?

      --
      "Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
    19. Re:hmm... by SnarfQuest · · Score: 1

      We know a lot about Shakespeare.

      Really? So why did he leave his second best bed to his wife, and not his best one? Who really wrote the plays attributed to him? We eagerly await your answers.

      --
      Who would win this election: Andrew Weiner vs Andrew Weiner's weiner.
    20. Re:hmm... by BJ_Covert_Action · · Score: 1

      The exception to this is diaries, and not many people maintain those any more.

      Maybe not in written paper form, but certainly many people maintain and update their own blogs, notes, and other status updates on things like Myspace, Facebook, and blogspot. Surely those resources would be a good source for the same type of information that is maintained in diaries. I suppose diaries had/have the added advantage of usually being considered private, so more information may be disclosed in them. However, it's become pretty apparent that there are still many netizens that don't think enough about privacy to keep their blogs and facespaces more discriminatory than a typical diary they would keep.

      That said, I wonder if the Library of Congress could find a way to archive the blogosphere.

    21. Re:hmm... by bugi · · Score: 1

      alt.binaries too

      Good idea. Maybe linux-kernel too. Is there a better example of large scale teamwork? For coding, I mean. Not for documenting the downfall of the US legal system.

    22. Re:hmm... by Anonymous Coward · · Score: 0

      Apparently I'm the only one that picked up on your subtle humor... comparing twitter messages to dog poops.

    23. Re:hmm... by Monkeedude1212 · · Score: 1

      but most people tweet about mundane crap, not what happened on Capitol Hill. i.e., signal to noise will be horrible for trying to decipher What the Hell Happened...

      Not really. I enter "White house" on Twitter's own search features and there is only about 30% noise, 70% stuff relevant to my topic.

      So, in the year 31000 when they discover this data cache from the year 2010, they'll have search algorithms better than we could possibly conceive.

    24. Re:hmm... by maxwell+demon · · Score: 1

      Twitter messages should be highly compressible.

      --
      The Tao of math: The numbers you can count are not the real numbers.
    25. Re:hmm... by tpstigers · · Score: 1

      I think Twitter is the ideal way to get a picture of a society. What people say on a daily, mundane level is pretty much what a society IS. The average schmuck doesn't give a rat's ass about what goes on on Capitol Hill (if they even know what Capitol Hill is). A society is made up of people, not leaders.

    26. Re:hmm... by Anonymous Coward · · Score: 0

      looking at my twitter the latest tweet number is around 12,183,083,700 so they would fit on 3 1TB drives based on your maths

    27. Re:hmm... by Jay+L · · Score: 1

      Who will represent the narcissists in society for future generations?

      I will, of course, as I'm sure you all assumed.

    28. Re:hmm... by Anonymous Coward · · Score: 0

      I could see them archiving tweets that were relevant to pop culture or history...but all of them??? Seems like a waste of time and money to me.

      It's far easier to archive them all than to try and sift the wheat from the chaff.

    29. Re:hmm... by CrimsonAvenger · · Score: 1

      but most people tweet about mundane crap, not what happened on Capitol Hill

      Which shows something important in itself - that most people don't care all that much about most of what happens on Capitol Hill.

      --

      "I do not agree with what you say, but I will defend to the death your right to say it"
    30. Re:hmm... by lennier · · Score: 1

      They were probably too busy watching Medieval Idol to even realize who Shakespeare or the King was ;)

      Shakespeare was Renaissance English Idol, while Chaucer slammed the Medieval category.

      Just because something is now stuffy 'literature' doesn't mean it wasn't wildly populist entertainment in its time. There's a reason why a lot of Shakespeare centers on drunks, crossdressing and hitting people with swords.

      --
      You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
    31. Re:hmm... by lennier · · Score: 1

      Who really wrote the plays attributed to him?

      David Tennant, obviously.

      --
      You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
    32. Re:hmm... by socsoc · · Score: 1

      drunks, crossdressing and hitting people with swords.

      So you're saying that we should archive /b/?

    33. Re:hmm... by Anonymous Coward · · Score: 0

      A good point, but we actually know very little about Shakespeare, and that is why debate continues over whether Christopher Marlowe wrote Shakespeare's plays. In fact, the exact date of his birth is not known. Do we really think humans will be here long enough to use this archive? How often will it be updated, and how long will we continue to use Twitter? Or will it wane in popularity, like using the telegraph or Western Union to send a message? Any technology that does not play an active role helping us produce energy

  4. Diabolical Intentions by Ordonator · · Score: 5, Funny

    Clearly, once they've finished, they plan to destroy the entire world so that they can claim to have truly archived all human knowledge, forever.

    1. Re:Diabolical Intentions by Anonymous Coward · · Score: 0

      Have we really degenerated that much that Twitter is the whole of human knowledge?

    2. Re:Diabolical Intentions by Anonymous Coward · · Score: 1, Funny

      Shut up and get on the Scootie-Puff Jr.

    3. Re:Diabolical Intentions by slimjim8094 · · Score: 1

      I saw that episode. "Just remember, Scooty Puff Jr. sucks..."

      --
      I have developed a truly marvelous proof of this comment, which this signature is too narrow to contain.
    4. Re:Diabolical Intentions by Arancaytar · · Score: 1
  5. finally! by Anonymous Coward · · Score: 0

    Something I've written is considered good enough to be put into a library? Who wants to touch me?

  6. Oooh, I know by Prikolist · · Score: 1

    Next they'll archive 4chan

    --
    I think Linux isn't better than Windows hence in the slashdot realm I'm a troll
  7. Pooping by InsaneMosquito · · Score: 1

    Maybe someday, some historian will care how often we all pooped. Without our saved tweets, how would they know this important information?

    1. Re:Pooping by Anonymous Coward · · Score: 0

      I'm a twitter shitter!

    2. Re:Pooping by lee1026 · · Score: 3, Interesting

      I know you are joking, but this kind of stuff is actually very important to historians. For example, the only reason we are able to reconstruct how many hours a day people worked in the medieval era is by looking at court records - the judge will ask things like "what were you doing at five" and the person will respond with answers like "eating" or "sleeping" or "working", and by going through a lot of court records, we were able to guess at how people lived back then.

      This will allow the historian of the future to guess much more accurately.

    3. Re:Pooping by slimjim8094 · · Score: 1

      Tycho is being a douche... alright, poop time... okay, poop is coming out.

      I'm a twitter shitter!

      --
      I have developed a truly marvelous proof of this comment, which this signature is too narrow to contain.
    4. Re:Pooping by PolygamousRanchKid+ · · Score: 1

      Future alien archeologists will say: "These fuckwit twits must have had shit for brains."

      "Let's saucer on over to another planet, Zog . . . there's nothing to learn mining this crap . . . and we might catch something here . . . ick!"

      --
      Schroedinger's Brexit: The UK is both in and out of the EU at the same time!
    5. Re:Pooping by Anonymous Coward · · Score: 0

      Historians will likely look back and find a culture of thieving bastards. Maybe this is something we want to hide from future generations.

    6. Re:Pooping by Dogtanian · · Score: 1

      Trust me on this. There will be *way* more data than anyone needs to reconstruct "typical" examples of this information, even if 99% of the data created from present-day society disappears.

      The obsessives worrying that we're about to enter a digital dark age forget about the massive amount of loss of data, information, photos, etc. from the past, and also underestimate the stupid amount we're archiving (intentionally or otherwise) nowadays.

      Modern society is fast approaching the point where the major problem will be archiving too much, not too little. Every digital transaction leaves a footprint, and data storage is becoming so cheap that it's going to get harder not to leave traces of that data *somewhere*, if only via some cache or whatever.

      --
      "Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
    7. Re:Pooping by SnarfQuest · · Score: 1

      Someone needs to take one of the Discworld books and convert it to a tweet play. Have a large group of people each take a character, and tweet that character's speech.

      Take that, future historians. You'll need retrophrenology to make it through that analysis.

      --
      Who would win this election: Andrew Weiner vs Andrew Weiner's weiner.
    8. Re:Pooping by Stray7Xi · · Score: 1

      I know you are joking, but this kind of stuff is actually very important to historians.

      Plus in twenty years when the current college crowd is running for public office we will have all sorts of shit to dredge up.

  8. All these recursive acronyms are great, but... by zill · · Score: 1

    I think it's a really bad idea to define measurement units recursively.

    1 new Tweet = 0.00000000000000017263 ( the current LoC + the new Tweet )

  9. The only time... by comm2k · · Score: 2, Interesting

    The only time I really actively used Twitter was during the recent LHC 3.5TeV event, because the webstream was completely overloaded. LoC preserving it? Future generations will look back and conclude that some people REALLY did have TOO much time and trivial stuff to share.

    1. Re:The only time... by jfengel · · Score: 2, Interesting

      Future generations will look back and conclude that some people REALLY did have TOO much time and trivial stuff to share.

      Sure, why not? You never know what sort of insights you'll get. What people do in their free time is just as important to historians as what they do when they're working. More so, sometimes, since the work is often ephemeral while the free time is an important insight into the culture as a whole.

      Most of it's garbage, but garbage middens are one of anthropology's favorite data sources.

    2. Re:The only time... by Pojut · · Score: 1

      I find it to be an extremely useful tool for keeping up on various personalities and the goings-on behind the scenes at certain websites. A sampling of the list of the people I follow:

      PADnD (Penny Arcade live tweets their Dungeons and Dragons games)
      mattsinger (critic for IFC)
      aedavis (Ashley Davis, who draws Once Upon a Pixel)
      washcaps (Washington Caps Hockey official twitter)
      mcps (Montgomery County Public Schools, who my fiancee works for)
      CameronPierce (Bizarro author)
      CERN (LHC stuff, obviously)
      BenKuchera (head gaming writer for Ars Technica)
      zanelamprey (the guy from Three Sheets)
      TimOfLegend (Tim Schafer's official twitter)
      reverendanthony (Of Destructoid and Hey Ash Watcha Playin' fame)
      RWZombie (Rob Zombie)
      thekiko (Kiko, one of the Penny Arcade crew)
      Official_PAX (Official twitter of Penny Arcade Expo)
      geoffkeighley (Of GameTrailers fame)
      Templesmith (Ben Templesmith's official twitter)
      TychoBrahe (Tycho aka Jerry Holkins from Penny Arcade)
      neilhimself (Neil Gaiman's official twitter)
      cinemassacre (James Rolfe, aka Angry Video Game Nerd)
      pvponline (Scott Kurtz of PVP fame)
      cwgabriel (Gabe aka Mike Krahulik from Penny Arcade)
      wilw (Wil Wheaton's official twitter)
      joel_gardiner (The guy that plays FPS_Doug on Pure Pwnage)
      glapaire (The guy that plays Kyle on Pure Pwnage)
      jarettcale (The guy that plays Jeremy on Pure Pwnage)

    3. Re:The only time... by Monkeedude1212 · · Score: 1

      Future generations will look back and conclude that some people REALLY did have TOO much time and trivial stuff to share.

      Which is why it's important that we store this information. We know what the history books are going to say. We know that the War on Terror will come out to either be a horrible atrocity that humankind should never try to re-attempt, or it will be declared a huge success that ushered in a new era of peace and stability. People will ask "I wonder what was going through people's heads?"

      And this is the PERFECT example. It will show that a lot of people didn't do anything, and they'll probably infer it to be Apathy.

  10. How many libraries of congress to store all that? by jollyreaper · · Score: 1

    Great, we've got a variable constant now.

    --
    Kwisatz Haderach
    Sell the spice to CHOAM
    This Mahdi took Shaddam's Throne
  11. Future political campaigns will be fun by merrickm · · Score: 1

    When today's teenagers and young adults are old enough to be running for public office and such, this, along with whatever archives of Facebook and the like may exist, will make for some great entertainment.

  12. I tweeted about this. by The+MAZZTer · · Score: 4, Funny
    1. Re:I tweeted about this. by spydabyte · · Score: 1

      too bad your tinyurl won't be archived. maybe if they did then you can set them up to recursively archive itself. hmmmmmm.

    2. Re:I tweeted about this. by A10Mechanic · · Score: 1

      Oh great. You just divided twitter by zero.

    3. Re:I tweeted about this. by Trepidity · · Score: 1

      The LoC isn't archiving URL shortener targets (yet, anyway), but the Internet Archive is on it, which at least ups the likelihood that some future researcher will be able to decode what those links pointed to.

  13. T's by anarking · · Score: 1

    Twittering twits tweet terrible tangents to tantalizing twats teaching totalitarian tools the totality that's timeless trash

  14. Re:How many libraries of congress to store all tha by Qzukk · · Score: 1

    Great, we've got a variable constant now.

    Don't worry, we'll just set up a system to tweet the new value whenever it changes ;)

    --
    If I have been able to see further than others, it is because I bought a pair of binoculars.
  15. The future. by MaWeiTao · · Score: 1

    If they think tweets are worthy of being archived why not just archive every blog and comment in existence? Many of those offer far more worthwhile insight than 99% of tweets.

    I remember in school students and sometimes teachers occasionally mocking the customs of past cultures. There was always that subtle arrogance that we're somehow more enlightened than people were 500, 1000 or 2000 years ago. The problem is that people confuse technological advancements for intellectual and philosophical advancement. I'd argue that socially and philosophically humans have progressed little over the last few thousand years. Certainly there have been some cultural shifts, but I'm hard-pressed to see any fundamental shifts. I do think we may be close to one, but judging from what I see on Twitter and Facebook I'm not particularly optimistic.

    With the massive proliferation of every last inane comment preserved for posterity I can only imagine how utterly stupid we are going to look to people of the future.

    1. Re:The future. by Intron · · Score: 1

      If they think tweets are worthy of being archived why not just archive every blog and comment in existence? Many of those offer far more worthwhile insight than 99% of tweets.

      There is a slippery slope here. What happens when they try to archive the Library of Congress within the LOC? The recursive archiving would destroy them.

      With the massive proliferation of every last inane comment preserved for posterity I can only imagine how utterly stupid we are going to look to people of the future.

      Take that, future people!

      --
      Intron: the portion of DNA which expresses nothing useful.
  16. In other words... by fahrbot-bot · · Score: 1

    Library of Congress To Archive All Public Tweets

    ... Twitter's new (publicly funded) "Backup and Data Retention Plan".

    Okay, I'm sure someone (probably The Daily Show) will, at some point, find something useful in all that noise.

    --
    It must have been something you assimilated. . . .
  17. Legal implications? by slimjim8094 · · Score: 2, Interesting

    All 'useless twits' jokes aside, this is pretty interesting. But I wonder if they'd run into any copyright laws.

    Reading the Twitter ToS turns up with this:

    You retain your rights to any Content you submit, post or display on or through the Services. By submitting, posting or displaying Content on or through the Services, you grant us a worldwide, non-exclusive, royalty-free license (with the right to sublicense) to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute such Content in any and all media or distribution methods (now known or later developed).

    which looks to me like posters retain copyright, but Twitter retains the right to grant others the same license you've granted them (non-exclusive license to provide their service).

    So based on my reading, Twitter (and the LoC) are in the clear?

    --
    I have developed a truly marvelous proof of this comment, which this signature is too narrow to contain.
    1. Re:Legal implications? by Anonymous Coward · · Score: 0
    2. Re:Legal implications? by petsounds · · Score: 1

      It seems like your reading is probably right, but I would hope they would at least anonymize the data. It seems like quite the invasion. Right now, one can only find tweets from a few weeks prior in Twitter's public search. Now anyone can request any prior tweet.

  18. Why bother? by Neuroticwhine · · Score: 1

    And never before will the frivolousness of humanity be on such a display!

    Because in 100 years time, someone might really want to know that "TwitTwittering: Had an awesome soup today" or that "InaneNYC: Just took such a dump."

  19. A little late for April 1st gags by doggo · · Score: 1

    Why? No, seriously, why? Aren't there more important things for the Library of Congress to be spending money and resources on?

  20. One person's trash... by mschaffer · · Score: 1

    ...is another person's treasure.

    Of course, once it goes on the curb, it's up for grabs.

  21. They do it the old-fashioned way. by mschaffer · · Score: 1

    Just dig up the privy.

    Here's Ben Franklin's:
    http://www.flickr.com/photos/wallyg/2571265145/

  22. Small data set by fulldecent · · Score: 3, Interesting

    Math for the day:

    Without compression, all tweets in human history will fit on a single hard drive costing less than $100.

    http://search.twitter.com/search?q=a (to find the latest tweet number)
    http://twitter.com/about (character limit)
    http://www.pricewatch.com/hard_removable_drives/ (1.5TB drive)

    http://www.google.com/buzz/fulldecent/18tfNfPHSBp/Math-for-the-day-Without-compression-all-tweets-in

    --

    -- I was raised on the command line, bitch

    1. Re:Small data set by dskzero · · Score: 1

      That doesn't take into account the amount of data needed to keep it archived. It needs an author and a timestamp. You can't just throw all the twits one after the other.

      --
      Oblivion Awaits
    2. Re:Small data set by Anonymous Coward · · Score: 0

      Twitter hasn't been around that long, so "human history" is a super short time.
      Besides, even $1 spent on archiving something as stupid as a tweet is an insult to me as an American taxpayer.
      I'd sooner see a paid government official getting overtime for having a paid poodle urinate on his face than to do that... geez

    3. Re:Small data set by penguinchris · · Score: 1

      Even if you double or triple the data stored per tweet to account for other metadata, assuming the parent's math is correct, it still shouldn't matter because that's still a trivial amount of storage to manage.
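
      One way to put a number on the metadata overhead is to lay out a fixed-width record and measure it. The layout below is purely hypothetical (an 8-byte tweet ID, an 8-byte Unix timestamp, a 15-byte author field, and 140 bytes of text); it is not Twitter's or the Library's actual storage format, and the tweet count is the tweet number quoted elsewhere in this thread.

```python
import struct

# Hypothetical fixed-width record: little-endian, no padding.
#   Q    -> 8-byte tweet ID
#   Q    -> 8-byte Unix timestamp
#   15s  -> author name, padded with NUL bytes (assumed 15-byte limit)
#   140s -> tweet text, padded with NUL bytes (assumes one byte per character)
RECORD = struct.Struct("<QQ15s140s")

TWEET_COUNT = 12_183_083_700  # tweet number quoted elsewhere in the thread


def pack_tweet(tweet_id: int, timestamp: int, author: str, text: str) -> bytes:
    """Serialize one tweet into the fixed-width record."""
    return RECORD.pack(tweet_id, timestamp, author.encode("ascii"), text.encode("ascii"))


def unpack_tweet(blob: bytes):
    """Recover the fields, stripping the NUL padding from the strings."""
    tweet_id, timestamp, author, text = RECORD.unpack(blob)
    return tweet_id, timestamp, author.rstrip(b"\0").decode(), text.rstrip(b"\0").decode()


blob = pack_tweet(1, 1271289600, "example_user", "Had an awesome soup today")
overhead = RECORD.size / 140  # record size relative to the bare 140-byte text
print(RECORD.size, round(overhead, 2), f"{RECORD.size * TWEET_COUNT / 1e12:.2f} TB")
```

      With this layout an author and a timestamp add well under a doubling of the per-tweet size, which supports the point that the total stays small either way.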

  23. What about other microblogging platforms? by TeXMaster · · Score: 1

    I wonder if they're going to archive stuff like identi.ca too, or any other related platform.

    --
    "I'm never quite so stupid as when I'm being smart" (Linus van Pelt)
  24. Twitter by wisnoskij · · Score: 1

    While on the whole Twitter is very important, in an importance vs. amount comparison it would most likely rate as one of the lowest-scoring collections of data of all time.

    --
    Troll is not a replacement for I disagree.
  25. Researchers? by straponego · · Score: 1

    You mean advertisers and Stasi. Ugh.

    Yeah, yeah, it's public. Agreed. And everybody knows there's no difference whatsoever between what some guy can read and an exhaustive, automated audit trail and connection map of everything that has ever been posted. That's why nobody uses search engines, after all.

  26. One thing I know for sure... by sootman · · Score: 1
    --
    Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
  27. For the future by SmallFurryCreature · · Score: 2, Interesting

    We learned more about ancient Egypt from their twitter then from all the official records designed to survive the ages. Sure sure, very interesting to read the "unbiased" record of a pharaoh in his own tomb, but it is from the "trash" notes that were recovered that we learned about how the country itself worked. Including such little details as that the pyramids were not made by slaves.

    The official records of the US will be Fox news. Better pray that future researchers have access to some other source, or they will come back in time and nuke us all, causality be damned.

    --

    MMO Quests are like orgasms:

    You may solo them, I prefer them in a group.

    1. Re:For the future by Anonymous Coward · · Score: 0

      Ancient Egypt didn't have Twitter.

    2. Re:For the future by Anonymous Coward · · Score: 0

      "...more about ancient Egypt from their twitter then from all..."

      But we didn't learn the difference between 'then' and 'than', did we?

      Even Fox gets that right. :p

  28. I hope for a highlight by flahwho · · Score: 1

    Kyle Orton's (of Chicago Bear & Denver Bronco 'Fame') Twitter is the quintessential collection of tweets ever. I hope it gets highlighted. NECKBEARD FOR PRESIDENT! http://twitter.com/Kingneckbeard

  29. That's a lot of information by Anonymous Coward · · Score: 0

    How many Libraries of Congress is that?

  30. Certainly could be the users by XanC · · Score: 2, Insightful

    A library archiving your work does not necessarily imply that you don't own the copyright on it.

    1. Re:Certainly could be the users by blair1q · · Score: 1

      And I would bet the Library of Congress doesn't have to give a damn about copyright anyway.

    2. Re:Certainly could be the users by Anonymous Coward · · Score: 0

      Are you kidding? That is exactly what it means. Copy-right. Copyright. It means that you hold all of the rights to copy. They are copying users' words without their explicit permission. That does not sound like the users own the copyrights, it sounds like Twitter does.

      Now go and tell Congress that they are not allowed to copy your tweets because you continue to hold the copyright over them. How much do you want to bet that they'll just dismiss you?

      Fuck it, I'm starting up my own "library" and I'm going to be archiving all data linked from The Pirate Bay. If a US corporation and the US government can steal, then so can I.

  31. delete the tweet by Anonymous Coward · · Score: 0

    delete the tweet

  32. Oblig. by SeaFox · · Score: 1

    All your tweet belong to us!

  33. Libraries have an exception by pavon · · Score: 1

    I think this would be legal regardless of what the ToS says. See the exemptions given to libraries and archives in 17 USC 108.

    1. Re:Libraries have an exception by Anonymous Coward · · Score: 0

      But that's only in the US... Twitter is global.

    2. Re:Libraries have an exception by Anonymous Coward · · Score: 0

      Does that mean the Library of Congress could archive all public Facebook accounts?

    3. Re:Libraries have an exception by maxwell+demon · · Score: 1

      But the LoC is only in the US.

      --
      The Tao of math: The numbers you can count are not the real numbers.
  34. Re:How many libraries of congress to store all tha by PriNT2357 · · Score: 1

    What I want to know is, how many tweets to store the library of congress? (Tweets included or not. Take your pick)

  35. Usenet by blair1q · · Score: 1

    They should have been archiving Usenet from the beginning.

    1. Re:Usenet by Somebody+Is+Using+My · · Score: 1

      I'm glad somebody said this. Usenet may be long past its heyday (at least as a source of information; it still has more nefarious uses) but at its height it was both a valuable resource and an entertaining community.

      Google shouldn't have a monopoly on this information; if the LoC considers tweets worth saving, then (non-binary) Usenet should definitely be included in their archives.

  36. Meanwhile: ACTA, not achieved. by Hurricane78 · · Score: 1

    And don’t even ask about Wikileaks as a whole...

    --
    Any sufficiently advanced intelligence is indistinguishable from stupidity.
  37. Twitter steganography by AlpineR · · Score: 1

    You can make it happen. Come up with a method to encode alt.binaries in 140-character chunks and the Library will archive them all for you.

  38. Neat, forever TwitterShare! by tlhIngan · · Score: 1

    Given that we can store almost 525 bytes of data in a single twit (I refuse to call them tweets), which is enough for a sector of data plus metadata, could it now mean we can store our data permanently at taxpayers' expense?

    I call it TwitterShare as a play on RapidShare to send files easily... and now those files will be forever archived. Sounds like a good way to back up data to me! Other than letting everyone else in the world see your files...
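
    A minimal sketch of the idea, assuming plain base64 and a 140-character limit: base64 maps 3 bytes to 4 ASCII characters, so each tweet carries only 105 bytes here, well short of the 525 bytes mentioned above, which would presumably need a denser encoding. The function names are hypothetical, and nothing below talks to Twitter; reassembly assumes the tweets are read back in posting order.

```python
import base64
from typing import List

TWEET_CHARS = 140                        # character limit cited in the thread
BYTES_PER_TWEET = TWEET_CHARS // 4 * 3   # base64: 3 bytes per 4 characters -> 105


def to_tweets(data: bytes) -> List[str]:
    """Split data into chunks and base64-encode each into a tweet-sized string."""
    return [
        base64.b64encode(data[i:i + BYTES_PER_TWEET]).decode("ascii")
        for i in range(0, len(data), BYTES_PER_TWEET)
    ]


def from_tweets(tweets: List[str]) -> bytes:
    """Decode each tweet and concatenate the chunks in the order given."""
    return b"".join(base64.b64decode(t) for t in tweets)


payload = bytes(range(256)) * 4          # 1024 bytes of arbitrary binary data
tweets = to_tweets(payload)
print(len(tweets), max(len(t) for t in tweets))
```

    A real scheme would also want a sequence number or checksum in each chunk, since nothing here detects a missing or reordered tweet.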

  39. Related by Anonymous Coward · · Score: 0

    In related news, Google is making their own copy of the archive of all tweets searchable: Official Google Blog: Replay it: Google search across the Twitter archive

  40. I have five DVDs' worth... by pongo000 · · Score: 1

    ...of archived gopherspace content I'm willing to donate to the LoC. Seems to me this dated mother lode of data would have far more historical significance and impact than thousands upon thousands of dissociated mindfarts.

  41. Copyright registration; eminent domain by tepples · · Score: 1

    You're probably right. For one thing, the Library of Congress runs the Copyright Office, and registering a copyright means the LOC gets two copies anyway. For another, the Library of Congress is an agency of the Congress, which has the power under the Fifth Amendment to take any private property for public use in exchange for just compensation.

    1. Re:Copyright registration; eminent domain by Anonymous Coward · · Score: 0

      But people tweet outside USA jurisdiction too. Of course if I read twitter's TOS I'd have a clearer picture

    2. Re:Copyright registration; eminent domain by tepples · · Score: 1

      But people tweet outside USA jurisdiction too.

      If they tweet on the USA-based server, they make themselves subject to USA jurisdiction.

    3. Re:Copyright registration; eminent domain by stuckinphp · · Score: 0

      If they tweet on the USA-based server, they make themselves subject to USA jurisdiction.

      Bit naive to say that isn't it?

      --
      if only
  42. That's it by Junior+J.+Junior+III · · Score: 1

    I'm putting my Library of Congress stock recommendation to STRONG SELL.

    --
    You see? You see? Your stupid minds! Stupid! Stupid!
  43. Why dont they archive the books first? by NynexNinja · · Score: 1

    I find it quite ironic that the Library of Congress would be spending time archiving totally useless things like twitter.com postings, at the same time ignoring the thousands (if not hundreds of thousands or millions) of books in their archive that they have yet to make public. I would say their first priority should be in making sure that everything that is in their actual Library gets put online and made public first, then after that work is done, then talk about doing other things. It is all a pretty big waste of time to do these other projects IMHO. They need to make their entire book archive public, and they have repeatedly refused to provide any timeline about when that will take place.

  44. Are they going to translate before archiving? by Anonymous Coward · · Score: 0

    Are they going to translate the l33tspeak and lolcatspeak to English before archiving so that the future generations can actually understand the tweets? There are a few online tools to do it.

  45. Scientific Situations? by ClosedSource · · Score: 1

    I didn't realize that scientific papers are now chopped-up and delivered via twitter.

  46. Detailed? by ClosedSource · · Score: 1

    You keep using that word. I do not think it means what you think it means.

  47. Tweet inverse Density by idji · · Score: 1

    Well, at least we now know how many libraries of congress are required to archive all tweets. 1 LOC.

  48. Obligatory by ThatsNotPudding · · Score: 1

    And nothing of value was saved.

  49. Tweetstore in 3... 2... 1... by argent · · Score: 1

    How long before someone comes up with a scheme to back up files in encoded tweets "for posterity"?

    Seriously, they should be spending their effort on funding or replicating the Internet Archive instead.