Slashdot Mirror


How Google Saved USENET

Masem writes: "Salon has a well-written article on the recent revival of much of the USENET archives from '81 to '90 by Google. It mentions that much of the recovery was thanks to years of work in transferring data off 140-some 10" magnetic tapes (~120megs of data) to a more conventional format in order to recover much of the early posts. Even a reference to the previous Slashdot story is made." Update: 01/07 23:52 GMT by T : btempleton adds: "O'Reilly Network asked me to do an article on similar themes and remembrances of USENET history." Thanks, Brad.

17 of 280 comments

  1. Wow, similar story by Tairan · · Score: 3, Informative
    released today in the San Francisco Chronicle. Read it over at sfgate.com. I'm surprised two independent media organizations would review the same company about the same thing and release it in the same general time frame! Amazing!

    --
    /. is a commercial entity. goto slashdot.com
  2. Oooh 10" magnetic tapes! by TheLocustNMI · · Score: 5, Informative
    Having had to work with those bastards, I'd have to give extra kudos to Google! There are few places in the United States that can actually read them, and get you the data from them anymore, and they must've been lovingly cared for, with some of them being 20 years old!

    I think I speak for everyone when I say "Thank you Google for arming me with the information contained in old USENET posts to bring up embarrassing teenage posts to my friends!"

    1. Re:Oooh 10" magnetic tapes! by RadioheadKid · · Score: 5, Informative

      Actually, Google didn't do much, if any, of the magnetic tape work. It was Bruce Jones, a grad student, who transferred 107 tapes in two weeks, and then David Wiseman did the rest over the next ten years. Google just downloaded them from him...

      --
      "Karma can only be portioned out by the cosmos." -Homer Simpson
  3. Re:Just think... by irregular_hero · · Score: 3, Informative

    It's been posted here before, but a list of "first mentions" is here. Notably absent is the first mention of Kibo... just an early post BY him. :)

  4. google rocks for doing it, but don't think... by emn-slashdot · · Score: 2, Informative

    that it's that hard to get or use the equipment.

    http://www.unisys.com sells 10" SCSI readers for their A-series system. You can buy one separately without an A-series service contract, and it works like any other /dev/rmt device.

    I worked for a company that distributed bank software on them as late as... well... now. And yes, it is COBOL software. ;)

    Major kudos to Google for bringing back old Usenet posts. Besides the knowledge base provided, they are fun to read! Lots of tasteful geek humor. I recommend checking it out.

    --
    -EvilMonkeyNinja
    Mild Mannered Host by Day
    Wild Hammered Programmer by Night
  5. Re:Embarrassing posts archived by Anonymous Coward · · Score: 3, Informative

    Scratch that. I found this page which tells you how to remove posts you have made.

  6. Re:Just think... by frank_adrian314159 · · Score: 3, Informative
    I don't know what this "nine parts" jazz is, but that little 1997 blurb is about the funniest thing I've seen all day.

    According to Lucas, SW was supposed to be a trilogy of trilogies (Lucas has since recanted and said that E3 will be the last). E5 was out 3 yr. after E4, E6 three years after that. You do the math. No one expected the long hiatus between E6 and E1. After Jar Jar, they wondered if Lucas had waited long enough...

    --
    That is all.
  7. Re:Just think... by ideut · · Score: 5, Informative
    Reading your first link, it's amusing to see that even ten years ago there were a lot of ridiculous IP shenanigans. Such as

    "Ashton-Tate is once again pushing its case for a copyright on the programming language used in DBase. ".

    And the numerous silly patents, such as

    'Emacs is threatened by IBM patent number 4,674,040 which covers "cut and paste between files" in a text editor. Many Emacs features are threatened by patent number 4,458,311, which covers "text and numeric processing on same screen." Patent 4,398,249 covering the general spreadsheet technique known as "natural order recalc" stops us from using it in GNU '

  8. O'Reilly Network article on the same theme by btempleton · · Score: 4, Informative
    This is a popular theme this month, with no surprises. O'Reilly Network also asked me to do an article on the history of USENET and things discovered in the archives. At the same time I also did an article on the history of some popular net terms like spam and net surfing.

    You can read the article I wrote on the O'Reilly site.

    --
    Has it been over a year since you last donated to the Electronic Frontier Foundation
  9. Re:I've really got to wonder... by athakur999 · · Score: 2, Informative
    No ads whatsoever.

    Google DOES have ads, just not the obtrusive, annoying kind. E.g., look up "car tires" and the first thing you see is a "sponsored link" by Tire Rack.
    --
    "People that quote themselves in their signatures bother me" - athakur999
  10. Copyright? by markb · · Score: 2, Informative

    You posted to a public place. You gave up your copyright when you did that.

  11. Re:That little? by snake_dad · · Score: 3, Informative
    From the google groups faq:
    * Can I access binary content on Google Groups?

    No. Google Groups does not archive any binary content.
    So maybe binaries were not archived in the early days either, or maybe there were no binaries yet, I can't remember. Anyway, nowadays binaries account for most of the enormous amount of data pushed over Usenet. So filtering that out makes the data a bit more usable.
    --
    karma capped .sig seeking available Slashdot poster for long-term relationship.
  12. Re:That little? by btempleton · · Score: 4, Informative

    There were no binaries in the early days. First there were the net.sources groups where you would find new Unix programs, notably the latest updates of USENET software.

    Binaries groups showed up a bit later, mostly after the great renaming, mostly for IBM PC Shareware and freeware binaries. No Warez or photos, not until a lot later.

    --
    Has it been over a year since you last donated to the Electronic Frontier Foundation
  13. Cool it with the damn poem parodies by churchr · · Score: 2, Informative

    On every single story, somebody posts a parody of that poem. This is the new Beowulf cluster.

  14. Re:Let us POST! by savaget · · Score: 3, Informative
  15. Re:I've really got to wonder... by clambert · · Score: 3, Informative

    There IS money to be made with this. Google's text-based ad technology is VERY powerful, and has some of the best targeting potential in the industry.

    While I'm not sure of the legalities, Google will probably add the same text-based ads found on its web search to its newsgroup search. This will mean when you search for "tivo upgrade", you could see a text-based ad offering hard drive upgrade kits next to the news posts. Unobtrusive, yet effective.

    Not really bait and switch, but they're getting everyone hooked on the system now, and will work on ads later (just like they did for the web search).

    Again, I don't blame them. Everyone has to make a buck, and Google's doing it in the best possible way.

    --
    mailto:<?=implode("@", array("chris", implode(".", array("php", "net"))))?>
  16. Napkin calculation: VHS as a backup medium by yerricde · · Score: 3, Informative

    Hell, a converted VCR using VHS as a backup medium can store like 100GB (saw one somewhere, I forget the link.)

    Assuming 9 Mbps of raw data (half the data rate of HDTV, because garden-variety VHS is nowhere near broadcast-quality), and assuming some heavy-duty error correction reducing effective data rate to 6 Mbps, VHS's SP mode records for 7200 seconds, giving 5 gigabytes on a tape at a bare minimum. (For comparison, a single-layer DVD holds about 4 1/2 GB.) If we go to EP mode, increase the bandwidth to S-VHS levels, and apply 3:1 text compression (common with deflation of large Latin-alphabet texts, especially containing quoted material), we may be able to store even more data per tape.
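
    For anyone who wants to check the napkin math, here is a rough sketch in Python. The 6 Mbps effective rate, the 7200-second SP running time, and the 3:1 compression ratio are the figures from the paragraph above; the function name and the choice of decimal gigabytes (1 GB = 10^9 bytes) are just illustrative assumptions, not anything a real VHS backup product is known to use.

```python
# Back-of-the-envelope check of the VHS capacity estimate.
# Rates and times come from the comment above; the function name and the
# decimal-gigabyte convention are illustrative assumptions.

def tape_capacity_bytes(effective_mbps, seconds, compression_ratio=1):
    """Bytes of user data on one tape at the given effective bit rate."""
    bits = effective_mbps * 1_000_000 * seconds
    # 8 bits per byte; compression multiplies how much source text fits.
    return bits // 8 * compression_ratio

SP_SECONDS = 7200  # two hours in SP mode, as stated above

raw = tape_capacity_bytes(6, SP_SECONDS)
text = tape_capacity_bytes(6, SP_SECONDS, compression_ratio=3)

print(f"uncompressed: {raw / 1e9:.1f} GB")
print(f"3:1 compressed text: {text / 1e9:.1f} GB")
```

    That works out to 5.4 GB uncompressed in SP mode, in line with the "5 gigabytes at a bare minimum" figure, and roughly three times that for compressible text. EP mode and S-VHS bandwidth are left out because the comment gives no numbers for them.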

    --
    Will I retire or break 10K?