Slashdot Mirror


Library of Congress To Archive All Public Tweets

After the recent announcement that Groklaw will be archived at the Library of Congress, mjn writes with word that the push to archive more digital content continues: "The US Library of Congress announced a deal with Twitter to archive all public tweets, dating back to Twitter's inception in March 2006. More details at their blog. No word yet on precisely what will be done with the collection, but besides entering your friends' important updates on the quality of breakfast into the permanent archival record, the deal may improve access for researchers wanting to analyze and mine Twitter's giant database."

14 of 171 comments

  1. Your tax dollars at work... by Third Position · · Score: 4, Insightful

    Given the signal to noise ratio for most tweets, I'm not convinced this is a particularly good use of resources...

    Just because you can do something, doesn't mean you have to!

    --
    American Third Position
    Finally, a real choice!
    1. Re:Your tax dollars at work... by sopssa · · Score: 4, Insightful

      It's not like it takes a lot of space to archive them, it's just 140 characters per tweet. There's a lot of useless information in the newspapers and books too, but they have archived them too because some of that info is valuable or might become valuable.

    2. Re:Your tax dollars at work... by Anonymous Coward · · Score: 5, Interesting

      And just because you don't have to, doesn't mean you shouldn't!

      This is probably the best way to capture a snapshot of our current society. Sure, the barrier to entry is a little lower, but I think this will be invaluable for historians who look back and try to understand us.

      Or, if anything, it'll confuse the hell out of them.

      Everyone wins either way!

      captcha: formally

    3. Re:Your tax dollars at work... by Anonymous Coward · · Score: 5, Funny

      Hi, @librarycongress! I just took a shit. I am honored that you will be archiving this momentous occasion for future generations.

    4. Re:Your tax dollars at work... by pushing-robot · · Score: 3, Informative

      It's fun to think of historians as just attributing everything they learn about societies to religion and superstition, but the biggest reason we think pre-Enlightenment civilizations were obsessively religious is that the priest castes were generally among the most literate and the most concerned with preserving knowledge of the past. Much of what we know about history comes through their writings and, therefore, their perceptions. They quite literally wrote history, to a large extent, and our understanding of their society is colored by their bias.

      The Information Age has democratized knowledge to a huge degree. Historians centuries or millennia hence will have plenty of sources other than the lens of the Catholic Church. Given current trends, even just a decade from now a few consumer-grade storage devices could hold everything the Library of Congress or Archive.org contains today. As long as there are a few people in the world interested in preserving it, modern history should be safe.

      --
      How can I believe you when you tell me what I don't want to hear?
    5. Re:Your tax dollars at work... by RealGrouchy · · Score: 3, Funny

      50 million tweets/day
      140 characters of message
      60 bytes of metadata (timestamp, sender id, etc.)

      10 GB of twitter archive per day
      10 TB per 3 years
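
A rough check of the arithmetic above, as a minimal Python sketch. It assumes one byte per character, decimal units (1 GB = 10^9 bytes), and 365-day years; the constant and function names are illustrative and not from the post.

```python
# Figures quoted in the comment above.
TWEETS_PER_DAY = 50_000_000
MESSAGE_BYTES = 140   # assumes one byte per character
METADATA_BYTES = 60   # timestamp, sender id, etc.


def archive_bytes(days):
    """Estimated uncompressed archive size in bytes for a number of days."""
    return TWEETS_PER_DAY * (MESSAGE_BYTES + METADATA_BYTES) * days


daily_gb = archive_bytes(1) / 1e9              # decimal gigabytes per day
three_year_tb = archive_bytes(3 * 365) / 1e12  # decimal terabytes per 3 years

print(f"{daily_gb:.0f} GB per day")
print(f"{three_year_tb:.2f} TB per 3 years")
```

Under these assumptions the daily figure comes out at exactly 10 GB, and three years comes to roughly 11 TB, which is in line with the rounded 10 TB quoted above.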

      Yes, but how much is all of that in Libraries of Congress?

      - RG>

      --
      Hey pal, this isn't a pleasantforest, so don't waste my time with pleasantries!
  2. Diabolical Intentions by Ordonator · · Score: 5, Funny

    Clearly, once they've finished, they plan to destroy the entire world so that they can claim to have truly archived all human knowledge, forever.

  3. Re:hmm... by Captain Splendid · · Score: 3, Insightful

    I'm thinking the byte limit on tweets is the main factor here...easier to just scoop 'em all up than to figure out how to get the "important" ones.

    --
    Linux, you magnificent bastard, I read the fucking manual!
  4. Re:hmm... by bugi · · Score: 3, Insightful

    all of them???

    Disk space is cheap...

    They should get a copy of the internet archive while they're at it.

  5. I tweeted about this. by The MAZZTer · · Score: 4, Funny
  6. Re:hmm... by Trepidity · · Score: 3, Insightful

    I suspect a lot of the interesting information is in the aggregate anyway, not individual tweets: things like trends, analysis of subgroups, linguistic analysis, etc.

  7. Re:Pooping by lee1026 · · Score: 3, Interesting

    I know you are joking, but this kind of stuff is actually very important to historians. For example, the only reason we are able to reconstruct how many hours a day people worked in the medieval era is by looking at court records - the judge will ask things like "what were you doing at five" and the person will respond with answers like "eating" or "sleeping" or "working", and by going through a lot of court records, we were able to guess at how people lived back then.

    This will allow the historian of the future to guess much more accurately.

  8. Small data set by fulldecent · · Score: 3, Interesting

    Math for the day:

    Without compression, all tweets in human history will fit on a single hard drive costing less than $100.

    http://search.twitter.com/search?q=a (to find the latest tweet number)
    http://twitter.com/about (character limit)
    http://www.pricewatch.com/hard_removable_drives/ (1.5TB drive)

    http://www.google.com/buzz/fulldecent/18tfNfPHSBp/Math-for-the-day-Without-compression-all-tweets-in
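
The claim can be sanity-checked with a short Python sketch that computes how many uncompressed tweets a 1.5 TB drive could hold. It assumes a decimal terabyte (10^12 bytes), one byte per character, and no per-tweet metadata. The actual tweet count is not given in the post, so it is left as a parameter, and the function names are hypothetical.

```python
DRIVE_BYTES = 1_500_000_000_000  # 1.5 TB drive, decimal terabytes (assumption)
TWEET_BYTES = 140                # character limit, assuming one byte per character


def max_tweets(drive_bytes=DRIVE_BYTES, tweet_bytes=TWEET_BYTES):
    """Number of full-length, uncompressed tweets that fit on the drive."""
    return drive_bytes // tweet_bytes


def fits_on_drive(tweet_count):
    """True if tweet_count full-length tweets fit on the drive."""
    return tweet_count <= max_tweets()


print(f"{max_tweets():,} tweets fit")
```

Under these assumptions the drive holds about 10.7 billion full-length tweets, so the claim holds as long as the latest tweet number is below that.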

    --

    -- I was raised on the command line, bitch

  9. Re:hmm... by natehoy · · Score: 4, Interesting

    They were probably too busy watching Medieval Idol to even realize who Shakespeare or the King was ;)

    A jest, I know, but it does demonstrate a serious point.

    Our history books are based on records maintained by the winners of wars, the leaders, the successful, etc. We know a lot about Shakespeare. We know relatively little about how his audiences actually felt about his work.

    We largely speculate as to how life was for the ordinary folk during historical periods based on writings about them, not writings from them. The exception to this is diaries, and not many people maintain those anymore. Twitter can help replace some of that perspective.

    Admittedly, Twitter is not an ideal way to get a picture of a society, but you get to hear historical events told from a very different perspective. Actually, you get to hear them from LOTS of perspectives. They may not be an accurate portrayal of the events, but they are a snapshot of how a society reacts to and perceives events.

    Who will represent the narcissists in society for future generations?

    --
    "This post contains words, known to the State of California to cause thought. Wash brain thoroughly after reading."