Obtaining Archives of USENET?
Academic Researcher asks: "Google took over Deja and built the mammoth Google Groups, which is a near-complete archive of USENET dating back to the early 1980s. For an academic project, I need to analyse a lot of USENET data. The Terms of Service for Google Groups do not allow automated access (and even if they did, I'd have to write a bunch of tools and reverse engineer it all). Inquiries about purchasing copies of the archive have gone unanswered. No one else seems to have such an archive. Apart from this meaning that the world has one single USENET archive (I hope they have backup floppies!), how can I obtain historical data for research purposes? I'd happily pay money for DVDs of archival material if they were available. Can anyone help?"
http://groups.google.com
Search: *
Then save to file.
DONE!
For every annoying Gentoo user, there are three even more annoying anti-Gentoo crybabies. Take Yosh from #Gimp for example.
You REALLY just want to get all that hot porno and wares and MP3z!!! We can see right through your story! I bet it's "seminal research" too.....
Ron Paul 2012
how can I obtain historical data for research purposes?
Translation:
How can I build the largest pr0n collection ever?
---
I support spreading santorum
But if you follow it up with masses of "C'moooooooon" they generally just give in.
I suppose that if you sent it to the wrong department then it would take *much* longer.
Well, it's not the department actually.
What you need to do is fill in the X-Meta tag in your email (or if you use Microsoft email products, the <HEAD> <META> tags) with keywords describing your email. "Hot anime poon-tang" is often a useful meta tag.
Then, you need to get a lot of people to link to your email (that is, refer to it in their email) with an X-References tag.
You can do this just by getting it popular on a mailing list, but even more effective is to post it to a number of bloggers. Bloggers, being how they are (self-important but afraid of not being noticed), will endlessly refer to it, and will probably get into interminable "blog spats" about it -- well, to be honest, about something completely tangential to it, but what do you care? It pushes your "Mail-Rank" score up regardless.
If you can't do that, try contacting "MailKing", a commercial service that sends out a lot of email, purportedly from unrelated individuals, which refers to your email. They'll make it look like your email is really important.
When your "Mail-Rank" is high enough, Google will have no choice but to notice it and reply.
Best of luck getting your email noticed!
Opinions on the Twiddler2 hand-held keyboard?
"Inquiries about purchasing copies of the archive have gone unanswered"
So, yes, I guess this qualifies as a dumb question.
OK, so say we keep this archive for a year. That's 730 x 160 GB hard drives. Forget internet bandwidth; forget LAN bandwidth; where the fuck are you going to get enough hard drive controller bandwidth to be able to search such a monster?
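For scale, here is a back-of-the-envelope sketch of that arithmetic. The drive count and drive size come from the comment above; the 50 MB/s sustained read rate is purely an assumption for illustration, not a measured figure, and `scan_days` is a made-up helper name.

```python
# Figures from the comment: 730 drives per year, 160 GB (decimal) each.
DRIVES_PER_YEAR = 730
DRIVE_BYTES = 160 * 10**9

total_bytes = DRIVES_PER_YEAR * DRIVE_BYTES
total_tb = total_bytes / 10**12    # decimal terabytes
total_tib = total_bytes / 2**40    # binary tebibytes

# ASSUMPTION: sustained sequential read rate per drive, in MB/s.
ASSUMED_MB_PER_S = 50


def scan_days(num_bytes, drives_in_parallel, mb_per_s=ASSUMED_MB_PER_S):
    """Days needed to read num_bytes once, striped evenly over the drives."""
    seconds = num_bytes / (drives_in_parallel * mb_per_s * 10**6)
    return seconds / 86400


if __name__ == "__main__":
    print(f"{total_tb:.1f} TB decimal, {total_tib:.1f} TiB binary")
    print(f"one drive at a time: {scan_days(total_bytes, 1):.1f} days")
    print(f"all drives at once:  {scan_days(total_bytes, DRIVES_PER_YEAR) * 24:.2f} hours")
```

Under that assumed rate, a full pass is roughly a month if the drives are read one after another, and under an hour if every drive can stream at once, which is exactly where the controller bandwidth question bites.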
"OK, archive, give me the md5 hashes of every article posted between the hours of 10:00 and 11:00 (except Wednesdays) during 2002. Don't forget the porn. Count how many of the hashes contain both the hex strings 'DEAD' and 'BEEF'."
"Hmm, I'll have to get back to you ... I should have an answer by 2012."
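The query itself is trivial to write down; it is the I/O that hurts. A minimal sketch, assuming the archive can be iterated as `(posted, body)` pairs where `posted` is a `datetime` and `body` is the raw article bytes (that interface and both function names are hypothetical, nothing in the thread describes a real one):

```python
import hashlib
from datetime import datetime

WEDNESDAY = 2  # datetime.weekday(): Monday is 0, so Wednesday is 2


def matching_hashes(articles):
    """MD5 hex digests of 2002 articles posted 10:00-10:59, Wednesdays excluded."""
    hashes = []
    for posted, body in articles:
        if posted.year != 2002:
            continue
        if posted.hour != 10:          # keeps 10:00:00 through 10:59:59
            continue
        if posted.weekday() == WEDNESDAY:
            continue
        hashes.append(hashlib.md5(body).hexdigest().upper())
    return hashes


def count_dead_beef(hashes):
    """Count digests containing both hex strings 'DEAD' and 'BEEF'."""
    return sum(1 for h in hashes if "DEAD" in h and "BEEF" in h)


if __name__ == "__main__":
    sample = [
        (datetime(2002, 1, 1, 10, 30), b"first post"),   # Tuesday, in range
        (datetime(2002, 1, 2, 10, 30), b"second post"),  # Wednesday, skipped
    ]
    print(count_dead_beef(matching_hashes(sample)))
```

Every article still has to be read off disk to be hashed, so the running time is dominated by the full scan rather than by anything in this loop.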
It's just not that useful. 106 TB/year, mostly porn and warez.
I don't see how you can reconcile those two sentences.
To make laws that man cannot, and will not obey, serves to bring all law into contempt.
--E.C. Stanton