Slashdot Mirror


How Google Saved USENET

Masem writes: "Salon has a well-written article on the recent revival of much of the USENET archives from '81 to '90 by Google. It mentions that much of the recovery was thanks to years of work in transferring data off 140-some 10" magnetic tapes (~120megs of data) to a more conventional format in order to recover much of the early posts. Even a reference to the previous Slashdot story is made." Update: 01/07 23:52 GMT by T : btempleton adds: "O'Reilly Network asked me to do an article on similar themes and remembrances of USENET history." Thanks, Brad.

17 of 280 comments

  1. I've really got to wonder... by SmittyTheBold · · Score: 4, Interesting

    ...how Google will make money off of this. They supposedly make money off licensing their technology (and presumably their collected data, as well.) No ads whatsoever. I applaud their dedication to that goal so far.

    Groups.google.com seems like the kind of thing they're doing just because they can, though. I can't imagine there is much money to be made off the technology, because it's all text - the same search tech applies. So, as far as I can tell, there is no business reason to be doing this. It's a drain on resources with little to no return, except for (geek) community goodwill.

    The conclusion I draw, then, is Google is in this just for the fun, challenge, or doing something for the community - maybe all three. Philanthropy at its best. =)

    --
    ± 29 dB
    1. Re:I've really got to wonder... by krogoth · · Score: 5, Interesting

      I think Google should be paid just for being so damn cool. They deserve spontaneous income for things like the groups (with the history they now have), having a '1337-h4x0r' language you can use (http://www.google.com/intl/xx-hacker/), changing their banner for special days (anyone else see the christmas thing?)...

      There's a lot of companies right now that should be punished for doing stupid things, but Google is the complete opposite; I'd like to see Microsoft, the RIAA, and the MPAA have to donate 20% of their money to Google :)

      --

      They that quote Benjamin Franklin on liberty and safety deserve neither.
  2. Archaic Technology by irregular_hero · · Score: 5, Interesting
    The article isn't kidding about the difficulty of finding a reader for your typical nine-track tape these days. I spent lots of bucks on a SCSI nine-track a few years ago for archiving system and application software on nine-track from old computer systems. And although the purchase helped, there are still occasions when I have to fire up some very old Big Iron to read one tape or another.

    An interesting thing about these tapes: They stretch over time and can sometimes become unreadable because of that. There are times when, to extract the information on the tape, I would put a number of them in my freezer for an hour or so, then try again. Nine times out of ten that would actually work.

    Another note about the article: I can still remember discussions with others who had modems about 1200 baud being just "too fast". The reasoning was that the average person couldn't read much faster than 300 baud. :)
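    For the curious, the "300 baud is reading speed" claim can be checked with a rough back-of-the-envelope sketch. The framing (10 bits on the wire per character, i.e. 8N1) and the 6 characters per word (five letters plus a space) are assumptions for illustration, not figures from the article, and "baud" is treated as bits per second the way it was used colloquially.

```python
# Rough estimate of how fast text scrolls by at a given modem speed.
BITS_PER_CHAR = 10   # assumption: 1 start bit + 8 data bits + 1 stop bit (8N1)
CHARS_PER_WORD = 6   # assumption: 5 letters plus a trailing space


def words_per_minute(bits_per_second: int) -> float:
    """Convert a line speed in bits per second to words per minute."""
    chars_per_second = bits_per_second / BITS_PER_CHAR
    return chars_per_second * 60 / CHARS_PER_WORD


if __name__ == "__main__":
    for rate in (300, 1200):
        print(f"{rate} bit/s is roughly {words_per_minute(rate):.0f} words per minute")
```

    Under those assumptions 300 bit/s comes out in the neighborhood of commonly quoted reading speeds of a few hundred words per minute, and 1200 bit/s is about four times that, which is presumably where the "too fast" argument came from.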

    1. Re:Archaic Technology by Greg+Lindahl · · Score: 3, Interesting


      Clever businesses transferred their 9-track archives to Exabyte about a decade ago. The problem is people with only a few tapes, not clueful people with lots of them.

      As an example, the VLA (Very Large Array, a radio telescope in New Mexico) had its entire archive on 9 track. When Exabytes finally became cheap, they just copied their entire data archive (everything observed since it started taking data in 1978, thousands of tapes) to Exabyte tapes. The expense wasn't that large compared to their overall operations expense.

  3. Repeat I know, but a great read by C.+Mattix · · Score: 5, Interesting

    I know this is a repeat but this is a great read. Dr. Gene Spafford's farewell posting. If you don't know who that is, look it up.

    ===
    From: spaf@cs.purdue.edu
    Newsgroups: news.announce.newusers,news.misc,news.admin.misc,news.groups,soc.net-people
    Subject: That's all, folks
    Followup-To: poster
    Date: 29 Apr 1993 19:01:12 -0500
    Message-ID:

    [ I originally was going to post nothing on this topic. I'm burned
    out, and I don't want my fatigue to appear like I'm posting
    self-indulgent garbage. However, several people have argued with
    me, and convinced me that maybe I should make a statement to "end an
    era," and as a piece of net "history." At the least, even if it is
    perceived as self-indulgent garbage, it will fit right in with the
    rest of the net. ]

    There is a Zen adage about how anything one cannot bear to give up is
    not owned, but is in fact the owner. What follows relates how I am
    owned by one less thing....

    About a dozen years ago, when I was still a grad student at Georgia
    Tech, we got our first Usenet connection (to allegra, then being run
    by Peter Honeyman, I believe). I'd been using a few dial-in BBS
    systems for a while, so it wasn't a huge transition for me. I quickly
    got "hooked": I can claim to be someone who once read every newsgroup
    on Usenet for weeks at a time!

    After several months, I realized that it was difficult for a newcomer
    to tell what newsgroups were available and what they covered. I made
    a pass at putting together some information, combined it with a
    similar list compiled by another netter, and began posting it for
    others to use. Eventually, the list was joined by other documents
    describing net history and information.

    In April of 1982 (I believe it was -- I saved no record of the year,
    but I know it was April), I began posting those lists regularly,
    sometimes weekly, sometimes monthly; the longest break was for 4
    months a few years ago when I was recovering from pneumonia and poor
    personal time management. (Tellingly, only a few people noticed the
    lack of postings, and almost all the mail was "When will they come
    out?" rather than "Did something happen?") As time went on, people
    began to attach far more significance to the posts than I really
    intended. It was flattering for a very short time, and a burden for
    most of the rest; there is no telling how much time I have devoted
    over the last decade to answering questions, editing the postings, and
    debating the role of newsgroup naming, to cite a few topics. I really
    tired of being a "semi-definitive" voice.

    Starting several years ago, at about the time people started pushing
    for group names designed to offend or annoy others, or with a lack of
    concern about the possible effects it might have on the net as a whole
    (e.g., rec.drugs and comp.protocols.tcp-ip.eniac) I began to question
    why I was doing the postings. I have had a growing sense of futility:
    people on the net can't possibly find the postings useful, because
    most of the advice in them is completely ignored. People don't seem
    to think before posting, they are purposely rude, they blatantly
    violate copyrights, they crosspost everywhere, use 20 line signature
    files, and do basically every other thing the postings (and common
    sense and common courtesy) advise not to. Regularly, there are postings
    of questions that can be answered by the newusers articles, clearly
    indicating that they aren't being read. "Sendsys" bombs and forgeries
    abound. People rail about their "rights" without understanding that
    every right carries responsibilities that need to be observed too, not
    least of which is to respect others' rights as you would have them
    respect your own. Reason, etiquette, accountability, and compromise
    are strangers in far too many newsgroups these days.

    I have finally concluded that my view of how things should be is too
    far out-of-step with the users of the Usenet, and that my efforts are
    not valued by enough people for me to invest any more of my energy in
    the process. I am tired of the effort involved, and the meager --
    nay, nonexistent -- return on my volunteer efforts.

    This hasn't happened all at once, but it has happened. Rather than
    bemoan it, I am acting on it: the set of "periodic postings" posted
    earlier this week was my last. After 11 years, I'm hanging it up.
    David Lawrence and Mark Moraes have generously (naively?) agreed to
    take over the postings, for whatever good they may still do. David
    will do the checkgroups, and lists of newsgroups and moderators
    (news.lists), and Mark will handle the other informational postings
    (news.announce.newusers).

    I'm not predicting the death of the Usenet -- it will continue without
    me, with nary a hiccup, and six months from now most users will have
    forgotten that I did the postings...those few who even know now, that
    is. That is as it should be, I suspect. Nor am I leaving the
    Usenet entirely. There are still a half-dozen groups that I read
    sometimes (a few moderated and comp.* groups), and I will continue to
    read them. That's about it, though. I've gone from reading all the
    groups to reading less than ten. Funny, though, the total volume of
    what I read has stayed almost constant over the years. :-)

    My sincere thanks to everyone who has ever said a "thank you" or
    contributed a suggestion for the postings. You few kept me going at
    this longer than most sane people would consider wise. Please lend
    your support to Mark and David if you believe their efforts are
    valuable. Eventually they too will burn out, just as the Usenet has
    consumed nearly everyone who has made significant contributions to its
    history, but you can help make their burden seem worthwhile in
    between.

    In closing, I'd like to repost my 3 axioms of Usenet. I originally
    posted these in 1987 and 1988. In my opinion as a semi-pro
    curmudgeon, I think they've aged well:

    Axiom #1:
    "The Usenet is not the real world. The Usenet usually does not even
    resemble the real world."
    Corollary #1:
    "Attempts to change the real world by altering the structure
    of the Usenet is an attempt to work sympathetic magic -- electronic
    voodoo."
    Corollary #2:
    "Arguing about the significance of newsgroup names and their
    relation to the way people really think is equivalent to arguing
    whether it is better to read tea leaves or chicken entrails to
    divine the future."

    Axiom #2:
    "Ability to type on a computer terminal is no guarantee of sanity,
    intelligence, or common sense."
    Corollary #3:
    "An infinite number of monkeys at an infinite number of keyboards
    could produce something like Usenet."
    Corollary #4:
    "They could do a better job of it."

    Axiom #3:
    "Sturgeon's Law (90% of everything is crap) applies to Usenet."
    Corollary #5:
    "In an unmoderated newsgroup, no one can agree on what constitutes
    the 10%."
    Corollary #6:
    "Nothing guarantees that the 10% isn't crap, too."

    Which of course ties in to the recent:

    "Usenet is like a herd of performing elephants with diarrhea --
    massive, difficult to redirect, awe-inspiring, entertaining, and a
    source of mind-boggling amounts of excrement when you least expect
    it." --spaf (1992)

    "Don't sweat it -- it's not real life. It's only ones and zeroes."
    -- spaf (1988?)

    --
    Gene Spafford, COAST Project Director
    Software Engineering Research Center & Dept. of Computer Sciences
    Purdue University, W. Lafayette IN 47907-1398
    Internet: spaf@cs.purdue.edu phone: (317) 494-7825
    ===

  4. But they're still missing the important stuff... by dghcasp · · Score: 2, Interesting
    There are no posts in rec.humour by Minas Spetzakis (ca ~1992.)

    Since he's immortalized in the Net Legends FAQ, it's a shame there are few examples of his jokes, other than in our memories.

    And now, the Minas'ized version of this post:

    Friend says to me, "See Google because they have many funny posts." I search for my name and find out I am being a kook. Friend says "Legendary!"

  5. Re:Just think... by Cowculator · · Score: 2, Interesting

    They did leave out this first mention in 1991 of a certain kernel, though, which Linus obviously remembered just a few months later in his own first.

    To quote another /. poster via the article about how embarrassing things like this are, "It's like having naked baby pictures of yourself stapled to your forehead when you walk around"...

  6. Strange New Google Service by kisrael · · Score: 4, Interesting

    Google has a history of doing a lot of things right, but I have my doubts about their new service: catalogs.google.com. It's a search engine for graphically scanned in versions of mail order catalogs! You type in sewing machine, say, and you get 3 views for each match: a scan of the catalog cover, a scan of the page, and a close up of the page, with the search terms highlighted in yellow.

    It's so retrofuture weird! Like what someone on a C=64 in the 1980s might think a future of online shopping would look like...

    --
    SO YOU'RE GOING TO DIE: The Comic for Dealing with Death
  7. USENET -- works in practice, but not in theory by GGardner · · Score: 4, Interesting

    I find it very interesting that in the last 10 years of USENET, its traffic (and presumably use) has grown dramatically. However, the number of servers has, I believe, dropped equally dramatically. USENET was one of the most distributed systems I remember using, with its shared-nothing, "flood-fill" algorithm.

    Yet, as it scales up to more and more messages, it actually is becoming less distributed. A good lesson for all the futurists forecasting the rise of distributed systems...
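    A minimal sketch of the "flood-fill" idea, for anyone who never ran a news server: each server offers a new article to all of its peers, and a peer that already has it (real servers check the Message-ID) simply declines. The four-server topology and the names `peers` and `flood` below are made up for illustration; this is not how any particular news software is implemented.

```python
from collections import deque


def flood(peers: dict[str, list[str]], origin: str) -> tuple[set[str], int]:
    """Propagate one article from `origin`.

    Returns the set of servers that end up holding the article and the
    total number of offers made along the way.
    """
    have = {origin}            # servers that already hold the article
    queue = deque([origin])    # servers that still need to offer it onward
    offers = 0
    while queue:
        server = queue.popleft()
        for neighbor in peers[server]:
            offers += 1                # every peer gets offered the article
            if neighbor not in have:   # duplicate check, as by Message-ID
                have.add(neighbor)
                queue.append(neighbor)
    return have, offers


if __name__ == "__main__":
    # Hypothetical topology: each key lists the peers it feeds.
    peers = {
        "a": ["b", "c"],
        "b": ["a", "c", "d"],
        "c": ["a", "b", "d"],
        "d": ["b", "c"],
    }
    reached, offers = flood(peers, "a")
    print(f"{len(reached)} servers reached with {offers} offers")
```

    Every server ends up with a full copy, and the number of offers grows with the number of links rather than the number of servers, which hints at why carrying a full feed gets expensive as traffic grows.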

  8. Star Wars.... what was he thinking? by kaladorn · · Score: 2, Interesting

    And with Attack of the Clowns on the way, with (rumor has it) NSync in it, and more Jarhead Bites, you certainly have to wonder if Lucas is having "Sellout" tattooed on his forehead....

    Now, LOTR may not have been perfect, but at least it was reasonably true to the book (hence a decent story) and showed what you can do with a good story. In this instance, we have Lucas busily destroying the mystique and the depth built up in the first SW trilogy (well, first in terms of release date).

    I waited outside for a few hours to get tickets to EP1. I'll wait till a while after the premiere to see this next film. If it is as disappointing, I'll wait for EP3 maybe longer than that. George, this is not the way to go about prying Imperial Credits from my wallet....

    --
    -- Mal: "Well they tell you: never hit a man with a closed fist. But it is, on occasion, hilarious."
  9. Henry Spencer... by Jacco+de+Leeuw · · Score: 4, Interesting
    Ironically, Henry Spencer is also the lead programmer for the Linux IPSEC stack FreeS/WAN (encrypted and secret communication).

    While also saving the Usenet archives (public and widely dispersed information)..!

    --
    -------
    Warning: Slashdot may contain traces of nuts.
  10. That little? by man_ls · · Score: 4, Interesting

    So let me get it straight. 9 years of USENET posts occupy only 16.8GB of hard disk space?

    You sure those 10-inch magnetic tapes weren't 1200MB or 120GB or something? Hell, a converted VCR using VHS as a backup medium can store like 100GB (saw one somewhere, I forget the link.)
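    The 16.8GB figure is consistent with simple multiplication, sketched below. It assumes the story's "~120megs" is per tape, that there were exactly 140 tapes (the story says "140-some"), and decimal units (1 GB = 1000 MB); the per-day average additionally assumes the nine years mentioned above.

```python
# Back-of-the-envelope check of the archive size.
TAPES = 140          # assumption: the story says "140-some"
MB_PER_TAPE = 120    # assumption: "~120megs" read as per-tape capacity
YEARS = 9            # span quoted in the comment above

total_mb = TAPES * MB_PER_TAPE
total_gb = total_mb / 1000              # decimal gigabytes
mb_per_day = total_mb / (YEARS * 365)   # average volume, ignoring leap days

if __name__ == "__main__":
    print(f"total: {total_mb} MB = {total_gb} GB")
    print(f"average: about {mb_per_day:.1f} MB per day")
```

    Under those assumptions the total really is that small: the archive is plain text from a much smaller network, averaging only a few megabytes a day.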

  11. God how foolish people look by Kingfox · · Score: 3, Interesting

    This is downright scary.
    Nothing like looking through the archive to see an old post from a skilled sysadmin friend asking a basic question in the wrong group years ago.
    Nothing like seeing delusional inane posts you wrote while in high school making you look like an utter twit.
    Nothing like seeing old usenet posts from friends who have died years ago. This is just too creepy for words.

  12. David Wiseman is cool :) by Large+Green+Mallard · · Score: 3, Interesting

    Aside from his good works in the terms of Usenet, David is the reason I am where I am today. 4 years ago, I was stuck in Perth, Australia and very bored. I was reading the student newspaper one day and saw an article about student exchanges. To cut a long story short, 6 months later I was at The University of Western Ontario.

    I had looked over the courses they ran in Computer Science there, and saw one called "Unix and C". Being a bit of a geek and having used unix a *tiny* bit in my high school days, I thought it would be a cool one to take. David was the lecturer for this course. He had a lot of knowledge and passion for the subject, which is unsurprising considering his experience with all manner of Unices. His classes for CS175a taught me a lot about Unix (and a little about C). I got 92% overall for the unit, an A+ and the highest mark I've ever got for any unit. The next semester I was at Western, I taught myself Perl, using an account on the CS Department servers and on the Reznet linux box a friend had :)

    It was a unit for non comp-sci majors. CS Majors were expected to learn this stuff in a bunch of different classes.
    Sadly, Western no longer offers CS175a - Unix and C. I feel it is a loss to the community as a whole, but at the same time, I understand that a one semester course in Unix and C probably isn't seen as too academic by many. Which I think is a shame. Too many universities turn out gimps fluent in one language, and one language only - Windows *shudder*. I think it sad that units to teach people how to click mice and use Word can get you academic credit, but Unix and C courses don't seem worthy enough to run.

    When my time was up in Canada, I came back to Australia and while I finished my degree, I made money on the side doing CGI scripts in Perl. Then, when my degree was finished, I applied for a job as a System Admin at a department at The University of Western.. Australia. It was the first job I applied for and I got a callback the morning after I had a 70 minute panel interview. Due, in large part, to the stuff I had learnt in David's class, I passed the interview quite well.

    Today, I am 22, earn over AU$40k, I get to play with lots of cool computing and network hardware, and I think it would be safe to say that if I hadn't taken that course with David, I wouldn't be where I am today. I suspect I would have been working as a security guard, making minimum wage, since my degree wasn't actually in Computer Science, but Security Studies. Thinking back, I'm pretty damn glad I did take it ;)

    David's homepage is here

  13. I would KILL for this archive... by pipeb0mb · · Score: 4, Interesting

    Can you still download the archives? If so, where?
    All that info would be incredibly useful!

    What format do you think it would be in? Threaded text or database format or what? How would you read it or search it?

    Also, what do they do with the attachments? Imagine THAT archive. Heh heh heh.

  14. Re:Who owns the posts by graxrmelg · · Score: 4, Interesting

    I did not grant others the rights to my works then. Neither did Bill G.

    You posted your messages on an international network of servers that store messages and provide anyone with access to them. It's a little late to consider them secret. Why does it offend you that someone's storing them and providing anyone with access to them?

    Google Groups is simply a very large and fancy news server that doesn't expire articles, and you implicitly granted permission for your articles to be stored on news servers by posting them in the first place.

    My question is how Google determines whether someone is the real poster of a message. Can just anyone demand the removal of any message they don't like?

  15. SAIL recovery by Animats · · Score: 3, Interesting
    A few years ago, several Stanford CS alumni, including myself, did something like this for the archives of SAIL, the Stanford AI Lab system dating back to about 1970. Old backup tapes still existed, having been recopied around 1990 to 6250 BPI 2400' 0.5" open reel tape. We read in several hundred reels, using an old Sun 3 server. The data was transmitted to Bruce Baumgard at IBM Almaden Research (another Stanford CS alum), who converted it to Unicode (SAIL had a nonstandard character set with extra symbols) and sorted out the files.

    The original SAIL users were contacted, one by one, and offered CD-ROM copies of their files. Where the original users permit, their files will be made publicly available. The permission process is still going on, but the result will be an archive of the early days of AI.