Slashdot Mirror


Library of Congress To Archive All Public Tweets

After the recent announcement that Groklaw will be archived at the Library of Congress, mjn writes with word that the push to archive more digital content continues: "The US Library of Congress announced a deal with Twitter to archive all public tweets, dating back to Twitter's inception in March 2006. More details at their blog. No word yet on precisely what will be done with the collection, but besides entering your friends' important updates on the quality of breakfast into the permanent archival record, the deal may improve access for researchers wanting to analyze and mine Twitter's giant database."

10 of 171 comments

  1. Your tax dollars at work... by Third Position · · Score: 4, Insightful

    Given the signal to noise ratio for most tweets, I'm not convinced this is a particularly good use of resources...

    Just because you can do something, doesn't mean you have to!

    --
    American Third Position
    Finally, a real choice!
    1. Re:Your tax dollars at work... by sopssa · · Score: 4, Insightful

      It's not like it takes a lot of space to archive them; it's just 140 characters per tweet. There's a lot of useless information in newspapers and books too, but those have been archived anyway because some of that info is valuable or might become valuable.

    2. Re:Your tax dollars at work... by mlush · · Score: 2, Insightful

      Given the signal to noise ratio for most tweets, I'm not convinced this is a particularly good use of resources...

      Just because you can do something, doesn't mean you have to!

      It's a fantastic idea. It's probably only a few TB of data, but it represents the unedited reaction of ordinary people to historical events and a detailed insight into their everyday lives.

    3. Re:Your tax dollars at work... by blair1q · · Score: 2, Insightful

      50 million tweets/day
      140 characters of message
      60 bytes of metadata (timestamp, sender id, etc.)

      10 GB of twitter archive per day
      10 TB per 3 years

      What does 1 TB cost these days? About $100?

      Storage space will indeed be an inexpensive part of the cost, and will decline in price at about the same rate the traffic is growing.
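
The estimate above can be checked with a few lines of Python. This is only a rough sketch built on the figures quoted in the comment: it assumes one byte per character, and the constant and function names are made up for illustration. The $100 per TB price is the commenter's guess, not a measured number.

```python
# Back-of-envelope check of the storage estimate quoted above.
# All figures come from the comment; the names are illustrative only.
TWEETS_PER_DAY = 50_000_000   # quoted daily tweet volume
MESSAGE_BYTES = 140           # assumes 1 byte per character
METADATA_BYTES = 60           # timestamp, sender id, etc.
COST_PER_TB_USD = 100         # the commenter's rough guess


def archive_bytes(days, tweets_per_day=TWEETS_PER_DAY):
    """Total bytes stored after `days` days at a constant tweet rate."""
    return days * tweets_per_day * (MESSAGE_BYTES + METADATA_BYTES)


per_day_gb = archive_bytes(1) / 1e9            # decimal gigabytes
three_years_tb = archive_bytes(3 * 365) / 1e12  # decimal terabytes
three_years_cost = three_years_tb * COST_PER_TB_USD

print(f"{per_day_gb:.1f} GB per day")
print(f"{three_years_tb:.2f} TB per 3 years")
print(f"about ${three_years_cost:.0f} of raw disk at ${COST_PER_TB_USD}/TB")
```

At a constant rate this gives 10 GB per day and just under 11 TB over three years, which agrees with the rounded figures in the comment. It ignores replication, indexing, and traffic growth, so it is a lower bound on raw disk only.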

  2. Why? by sexconker · · Score: 0, Insightful

    Seriously, why?

    1. Re:Why? by wisnoskij · · Score: 2, Insightful

      I would hope that a social scientist in the 23rd century does not think that the average human of today posts every triviality in his life like most of the current Twitter users.

      --
      Troll is not a replacement for I disagree.
  3. Re:hmm... by Captain Splendid · · Score: 3, Insightful

    I'm thinking the byte limit on tweets is the main factor here... easier to just scoop 'em all up than to figure out how to get the "important" ones.

    --
    Linux, you magnificent bastard, I read the fucking manual!
  4. Re:hmm... by bugi · · Score: 3, Insightful

    all of them???

    Disk space is cheap...

    They should get a copy of the internet archive while they're at it.

  5. Re:hmm... by Trepidity · · Score: 3, Insightful

    I suspect a lot of the interesting information is in the aggregate anyway, not individual tweets: things like trends, analysis of subgroups, linguistic analysis, etc.

  6. Certainly could be the users by XanC · · Score: 2, Insightful

    A library archiving your work does not necessarily imply that you don't own the copyright on it.