Slashdot Mirror


Library of Congress To Archive All Public Tweets

After the recent announcement that Groklaw will be archived at the Library of Congress, mjn writes with word that the push to archive more digital content continues: "The US Library of Congress announced a deal with Twitter to archive all public tweets, dating back to Twitter's inception in March 2006. More details at their blog. No word yet on precisely what will be done with the collection, but besides entering your friends' important updates on the quality of breakfast into the permanent archival record, the deal may improve access for researchers wanting to analyze and mine Twitter's giant database."

25 of 171 comments

  1. Your tax dollars at work... by Third+Position · · Score: 4, Insightful

    Given the signal to noise ratio for most tweets, I'm not convinced this is a particularly good use of resources...

    Just because you can do something, doesn't mean you have to!

    --
    American Third Position
    Finally, a real choice!
    1. Re:Your tax dollars at work... by sopssa · · Score: 4, Insightful

      It's not like it takes a lot of space to archive them, it's just 140 characters per tweet. There's a lot of useless information in the newspapers and books too, but they have archived them too because some of that info is valuable or might become valuable.

    2. Re:Your tax dollars at work... by Anonymous Coward · · Score: 5, Interesting

      And just because you don't have to, doesn't mean you shouldn't!

      This is probably the best way to capture a snapshot of our current society. Sure, the barrier for entry is a little lower, but I think this will be invaluable for historians who look back and try to understand us.

      Or, if anything, it'll confuse the hell out of them.

      Everyone wins either way!

      captcha: formally

    3. Re:Your tax dollars at work... by Anonymous Coward · · Score: 5, Funny

      Hi, @librarycongress! I just took a shit. I am honored that you will be archiving this momentous occasion for future generations.

    4. Re:Your tax dollars at work... by mlush · · Score: 2, Insightful

      Given the signal to noise ratio for most tweets, I'm not convinced this is a particularly good use of resources...

      Just because you can do something, doesn't mean you have to!

      It's a fantastic idea. It's probably only a few TB of data, but it represents the unedited reaction of ordinary people to historical events and a detailed insight into their everyday lives.

    5. Re:Your tax dollars at work... by blair1q · · Score: 2, Insightful

      50 million tweets/day
      140 characters of message
      60 bytes of metadata (timestamp, sender id, etc.)

      10 GB of twitter archive per day
      10 TB per 3 years

      What does 1 TB cost these days? About $100?

      Storage space will indeed be an inexpensive part of the cost, and will decline in price at about the same rate the traffic is growing.
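The back-of-envelope estimate above can be checked with a few lines of Python. This is only a sketch of the poster's arithmetic: it assumes one byte per character, decimal units (1 GB = 10^9 bytes), 365-day years, and the poster's guess of roughly $100 per TB. The function and constant names are made up for illustration.

```python
# Figures taken from the comment above; everything else is an assumption.
TWEETS_PER_DAY = 50_000_000
MESSAGE_BYTES = 140        # assumes one byte per character
METADATA_BYTES = 60        # timestamp, sender id, etc.
DOLLARS_PER_TB = 100       # the poster's rough guess, not a quoted price


def daily_bytes(tweets_per_day=TWEETS_PER_DAY):
    """Raw bytes added to the archive per day, uncompressed."""
    return tweets_per_day * (MESSAGE_BYTES + METADATA_BYTES)


def archive_bytes(years, tweets_per_day=TWEETS_PER_DAY):
    """Raw bytes accumulated over a number of 365-day years at a flat rate."""
    return daily_bytes(tweets_per_day) * 365 * years


def storage_cost(total_bytes, dollars_per_tb=DOLLARS_PER_TB):
    """Disk cost in dollars, using decimal terabytes (10**12 bytes)."""
    return total_bytes / 10**12 * dollars_per_tb


if __name__ == "__main__":
    per_day = daily_bytes()
    three_years = archive_bytes(3)
    print(f"{per_day / 10**9:.0f} GB per day")
    print(f"{three_years / 10**12:.2f} TB per 3 years")
    print(f"about ${storage_cost(three_years):.0f} of raw disk")
```

At a flat 50 million tweets/day this gives 10 GB per day and just under 11 TB over three years, which matches the rounded figures in the comment. It ignores traffic growth, replication, and indexing overhead.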

    6. Re:Your tax dollars at work... by pushing-robot · · Score: 3, Informative

      It's fun to think of historians as just attributing everything they learn about societies to religion and superstition, but the biggest reason we think pre-Enlightenment civilizations were obsessively religious is because the priest castes were generally among the most literate and the most concerned with preserving knowledge of the past. Much of what we know about history comes through their writings, and therefore their perceptions. They quite literally wrote history, to a large extent, and our understanding of their society is colored by their bias.

      The Information Age has democratized knowledge to a huge degree. Historians centuries or millennia hence will have plenty of sources other than the lens of the Catholic Church. Given current trends, even just a decade from now a few consumer-grade storage devices could hold everything the Library of Congress or Archive.org contains today. As long as there are a few people in the world interested in preserving it, modern history should be safe.

      --
      How can I believe you when you tell me what I don't want to hear?
    7. Re:Your tax dollars at work... by russotto · · Score: 2, Funny

      Now, if the LOC would archive /., the historians would know there was a Little Dark Age in the early part of the 21st century (and this post would be evidence that the denizens of the Little Dark Age even knew they were living in such a time).

      When the historians of the 50th century unearth the records of /., they'll realize the Final Dark Age came upon humans in the early part of the 21st century, and that while many saw something happening, none realized the extent. And then they'll click their mandibles in sorrow over what could have been, and move on to the next planet.

    8. Re:Your tax dollars at work... by RealGrouchy · · Score: 3, Funny

      50 million tweets/day
      140 characters of message
      60 bytes of metadata (timestamp, sender id, etc.)

      10 GB of twitter archive per day
      10 TB per 3 years

      Yes, but how much is all of that in Libraries of Congress?

      - RG>

      --
      Hey pal, this isn't a pleasantforest, so don't waste my time with pleasantries!
  2. Diabolical Intentions by Ordonator · · Score: 5, Funny

    Clearly, once they've finished, they plan to destroy the entire world so that they can claim to have truly archived all human knowledge, forever.

  3. Re:hmm... by Captain+Splendid · · Score: 3, Insightful

    I'm thinking the byte limit on tweets is the main factor here...easier to just scoop 'em all up than to figure how to get the "important" ones.

    --
    Linux, you magnificent bastard, I read the fucking manual!
  4. Re:hmm... by bugi · · Score: 3, Insightful

    all of them???

    Disk space is cheap...

    They should get a copy of the internet archive while they're at it.

  5. The only time... by comm2k · · Score: 2, Interesting

    The only time I really actively used Twitter was during the recent LHC 3.5TeV event, because the webstream was completely overloaded. LoC preserving it? Future generations will look back and conclude that some people REALLY did have TOO much time and trivial stuff to share.

    1. Re:The only time... by jfengel · · Score: 2, Interesting

      Future generations will look back and conclude that some people REALLY did have TOO much time and trivial stuff to share.

      Sure, why not? You never know what sort of insights you'll get. What people do in their free time is just as important to historians as what they do when they're working. More so, sometimes, since the work is often ephemeral while the free time is an important insight into the culture as a whole.

      Most of it's garbage, but garbage middens are one of anthropology's favorite data sources.

  6. Re:hmm... by Shakrai · · Score: 2, Funny

    They should get a copy of the internet archive while they're at it.

    And alt.binaries too. Think of the "research" potential there... ;)

    --
    I want peace on earth and goodwill toward man.
    We are the United States Government! We don't do that sort of thing.
  7. I tweeted about this. by The+MAZZTer · · Score: 4, Funny
  8. Re:hmm... by Trepidity · · Score: 3, Insightful

    I suspect a lot of the interesting information is in the aggregate anyway, not individual tweets: things like trends, analysis of subgroups, linguistic analysis, etc.

  9. Legal implications? by slimjim8094 · · Score: 2, Interesting

    All 'useless twits' jokes aside, this is pretty interesting. But I wonder if they'd run into any copyright laws.

    Reading the Twitter ToS turns up this:

    You retain your rights to any Content you submit, post or display on or through the Services. By submitting, posting or displaying Content on or through the Services, you grant us a worldwide, non-exclusive, royalty-free license (with the right to sublicense) to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute such Content in any and all media or distribution methods (now known or later developed).

    which looks to me like posters retain copyright, but Twitter retains the right to grant others the same license you've granted them (non-exclusive license to provide their service).

    So based on my reading, Twitter (and the LoC) are in the clear?

    --
    I have developed a truly marvelous proof of this comment, which this signature is too narrow to contain.
  10. Re:Pooping by lee1026 · · Score: 3, Interesting

    I know you are joking, but this kind of stuff is actually very important to historians. For example, the only reason we are able to reconstruct how many hours a day people worked in the medieval era is by looking at court records: the judge will ask things like "what were you doing at five" and the person will respond with answers like "eating" or "sleeping" or "working", and by going through a lot of court records, we were able to guess at how people lived back then.

    This will allow the historian of the future to guess much more accurately.

  11. Small data set by fulldecent · · Score: 3, Interesting

    Math for the day:

    Without compression, all tweets in human history will fit on a single hard drive costing less than $100.

    http://search.twitter.com/search?q=a (to find the latest tweet number)
    http://twitter.com/about (character limit)
    http://www.pricewatch.com/hard_removable_drives/ (1.5TB drive)

    http://www.google.com/buzz/fulldecent/18tfNfPHSBp/Math-for-the-day-Without-compression-all-tweets-in
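The claim above depends on the latest tweet number, which the post links to but does not quote. As a rough sketch, the capacity side of the calculation can be worked out on its own, assuming a 1.5 TB drive holds 1.5 × 10^12 bytes and every tweet is stored at the full 140 bytes with no metadata. The names are hypothetical.

```python
DRIVE_BYTES = 1_500_000_000_000   # 1.5 TB in decimal units (assumption)
TWEET_BYTES = 140                 # character limit, one byte per character


def tweets_that_fit(drive_bytes=DRIVE_BYTES, tweet_bytes=TWEET_BYTES):
    """How many maximum-length tweets fit on the drive, uncompressed."""
    return drive_bytes // tweet_bytes


def fits_on_drive(tweet_count, drive_bytes=DRIVE_BYTES):
    """True if tweet_count maximum-length tweets fit on one drive."""
    return tweet_count <= tweets_that_fit(drive_bytes)


if __name__ == "__main__":
    print(f"{tweets_that_fit():,} tweets fit on one drive")
```

So the claim holds as long as the total number of tweets is below roughly 10.7 billion; plug the latest tweet number into `fits_on_drive` to check.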

    --

    -- I was raised on the command line, bitch

  12. Re:hmm... by natehoy · · Score: 4, Interesting

    They were probably too busy watching Medieval Idol to even realize who Shakespeare or the King was ;)

    A jest, I know, but it does demonstrate a serious point.

    Our history books are based on records maintained by the winners of wars, the leaders, the successful, etc. We know a lot about Shakespeare. We know relatively little about how his audiences actually felt about his work.

    We largely speculate as to how life was for the ordinary folk during historical periods based on writings about them, not writings from them. The exception to this is diaries, and not many people maintain those any more. Twitter can help replace some of that perspective.

    Admittedly, Twitter is not an ideal way to get a picture of a society, but you get to hear historical events told from a very different perspective. Actually, you get to hear them from LOTS of perspectives. They may not be an accurate portrayal of the events, but they are a snapshot of how a society reacts to and perceives events.

    Who will represent the narcissists in society for future generations?

    --
    "This post contains words, known to the State of California to cause thought. Wash brain thoroughly after reading."
  13. Re:Why? by wisnoskij · · Score: 2, Insightful

    I would hope that a social scientist in the 23rd century does not think that the average human of today posts every triviality in his life like most of the current Twitter users.

    --
    Troll is not a replacement for I disagree.
  14. For the future by SmallFurryCreature · · Score: 2, Interesting

    We learned more about ancient Egypt from their twitter than from all the official records designed to survive the ages. Sure sure, very interesting to read the "unbiased" record of a pharaoh in his own tomb, but it is from the "trash" notes that were recovered that we learned about how the country itself worked. Including such little details as that the pyramids were not made by slaves.

    The official records of the US will be Fox news. Better pray that future researchers have access to some other source, or they will come back in time and nuke us all, causality be damned.

    --

    MMO Quests are like orgasms:

    You may solo them, I prefer them in a group.

  15. Certainly could be the users by XanC · · Score: 2, Insightful

    A library archiving your work does not necessarily imply that you don't own the copyright on it.

  16. Re:Why? by maxwell+demon · · Score: 2, Interesting

    Soon after, he publishes a paper with his revolutionary new theory: People in the 21st century were so forgetful that they decided to record all details about their daily life in a central database so they could recover it if necessary.

    --
    The Tao of math: The numbers you can count are not the real numbers.