Slashdot Mirror


SpamArchive.org Launched

An anonymous reader writes "SpamArchive.org has just been launched. SpamArchive.org is a community resource that provides a database of known spam to be used for testing, developing, and benchmarking anti-spam tools. The goal of this project is to provide a large repository of spam that can be used by researchers and tool developers. In the past, there were a few small personal spam archives that were used. There was no large set of spam that could be used to test new anti-spam algorithms. Thus, developers could not sufficiently test their techniques across a range of messages. Also, the lack of a "standard" sample of spam made it difficult to effectively benchmark anti-spam tools."

269 comments

  1. So... by Markus+Landgren · · Score: 5, Funny

    Do they have a mailing list I can sign up for if I want to get updated by e-mail?

    1. Re:So... by RyoSaeba · · Score: 5, Insightful

      LOL, want'em to forward every new spam they receive ?
      Don't you have enough already ? ^_^

      Seriously, this sounds like a great idea.

      I can see a few technical difficulties in cataloging spam, though.
      The most obvious is that spam is usually personalized: the recipient's mail address (or part of it) often appears in the subject or in the body. So will this archive store every variant of every spam, or just a 'global' model?
      They also need to define how cataloging tools are supposed to access the archive, i.e. grab it from a URL? An FTP text file?
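
      One way to picture the 'global model' option is to strip the personalized parts before storing a message. The sketch below is only an illustration, not anything SpamArchive.org is known to do; the `<RCPT>` placeholder and the function names are assumptions made for the example.

```python
import hashlib
import re

# Hypothetical placeholder substituted for every personalized token.
PLACEHOLDER = "<RCPT>"


def normalize(message: str, recipient: str) -> str:
    """Replace the recipient's address and its local part with a placeholder."""
    local_part = recipient.split("@", 1)[0]
    # Replace the full address first, then any remaining bare local part.
    for token in (recipient, local_part):
        if token:  # an empty token would match everywhere, so skip it
            message = re.sub(re.escape(token), PLACEHOLDER, message,
                             flags=re.IGNORECASE)
    return message


def fingerprint(message: str, recipient: str) -> str:
    """Hash the normalized text so personalized variants share one key."""
    normalized = normalize(message, recipient)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

      Two copies that differ only in the recipient's name would then map to the same fingerprint. A very short local part could over-match ordinary words, so a real catalog would probably need something more careful.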

      And in any case, until spam filters are hooked directly on the smtp mail server itself, users will still have to take the time to configure their anti-spam tool, launch it regularly to clean the mailbox, and so on...

      For instance, Mozilla will incorporate spam filters, but from what I gather you'll still have to download that freaking spam before it gets filtered, which can take some time if those are big spams (like viruses or such).

      Ok, it sure beats having legitimate mails removed from the server without our knowledge...

      Just my 2 cents of euro.

      --
      Tsuyoikoto ha taisetsu da ne, dakedo namida mo hitsuyousa (Strength is an important thing, but tears too are necessary)
    2. Re:So... by stevenp · · Score: 4, Funny

      > Do they have a mailing list I can sign up for if I want to get updated by e-mail?

      No, but you can open a Hotmail account and receive a daily dose of UP-TO-DATE spam messages FOR FREE.

    3. Re:So... by Anonymous Coward · · Score: 0

      I've got two @msn.com accounts, and one @hotmail.com account. At most, I'll get two to three spam mails a week. I get more than that on my ISP account (@attbi.com).

    4. Re:So... by Arker · · Score: 3, Informative

      If you want to get a lot of spam to test your filters with, just check the archives of NANAS on Usenet. What precisely this new thing does that a spider of that archive couldn't give you I don't know.

      --
      =-=-=-=-=-=-=-=-=-=-=-=-=-=-
      Friends don't let friends enable ecmascript.
    5. Re:So... by Arker · · Score: 3, Interesting

      I've got two @msn.com accounts, and one @hotmail.com account. At most, I'll get two to three spam mails a week. I get more than that on my ISP account (@attbi.com).

      I don't believe you.

      I'll tell you why. First, my mom has an MSN account, and it's overloaded with spam daily. Now granted, that may be her own damn fault, she could have given it out in ways she shouldn't, etc. But, I also have a hotmail account. I made it a few months ago solely to have a login to the MSN chat thingy because one particular client wanted to contact me that way. I was very careful to make sure that I read every page during sign up, and un-checked all the appropriate boxes - I opted in to NOTHING. I NEVER gave it to ANYONE, I never posted it anywhere, I never even logged into it, I only know about the email that hits it because the chat program tells you how many new mails you have when you sign in. I haven't used that either in awhile, but two weeks after creating the account, it had over 380 new messages.

      So I must say your claim is quite unbelievable.

      --
      =-=-=-=-=-=-=-=-=-=-=-=-=-=-
      Friends don't let friends enable ecmascript.
    6. Re:So... by plumby · · Score: 5, Interesting
      It may partly depend on what user name you picked. I've got two email accounts with my ISP, neither of which I've ever given to anyone. One has a common surname as the account name. The other has a collection of random gibberish as username. The first one receives several spam messages per day. The other one has probably received one in the last 3 months.

      I guess that the spammers quite probably have a standard list of common names that they put in front of @hotmail.com, @aol.com, etc.

      As a tip, though, I've just set my spam levels on hotmail to only receive emails from people that are in my address book. I've not got a single spam on that account (except from MS themselves) since I did that.

    7. Re:So... by Arker · · Score: 1

      But my username at hotmail is not a common first or last name, it does contain a word from the dictionary, barely, but it's also got numbers added. So I think they are doing more than running a list of common names, they have to be doing at least a dictionary attack with numerical additions. And I wouldn't be surprised a bit if MS cooperates with the larger ones, and/or spams me themselves despite what the checkmarks on registration say.
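
      To make the 'dictionary word with numerical additions' idea concrete, here is a rough check for whether a username would fall to that kind of guessing. The tiny word list and the function name are hypothetical; nobody in this thread knows what lists spammers actually use.

```python
import re

# Hypothetical stand-in for a real word list; a spammer's would be far larger.
COMMON_WORDS = {"smith", "jones", "john", "mary", "admin", "info"}


def is_guessable(local_part: str, words=COMMON_WORDS) -> bool:
    """Return True if the name is a listed word plus an optional run of digits."""
    match = re.fullmatch(r"([a-z]+)(\d*)", local_part.lower())
    return bool(match) and match.group(1) in words
```

      Under these assumptions a name like smith1984 is guessable, while random gibberish that mixes letters and digits is not.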

      --
      =-=-=-=-=-=-=-=-=-=-=-=-=-=-
      Friends don't let friends enable ecmascript.
    8. Re:So... by wheany · · Score: 3, Interesting

      I made a hotmail account that has a long username by repeating my "real" username several times. That way it is pretty safe from aaaaaaa, aaaaaab -type attacks. I've gotten 0 spams so far.

    9. Re:So... by bob · · Score: 1

      My mom also has an @msn.com account. Now, that account is in her husband's name, and he doesn't even know how to use a computer, so the primary address under that account is never used. My mom set up an additional email address for herself, and she uses that. The thing is, the primary email address for the account -- that never gets used for anything -- gets tons of spam. But the secondary email address -- that my mom uses all the time -- gets none. I'll let you draw your own conclusions on this one.

    10. Re:So... by Anonymous Coward · · Score: 0

      To short circuit things, if you are a spammer please send all of your spam to:

      submit@spamarchive.org

      submit@spamarchive.org

      submit@spamarchive.org

      submit@spamarchive.org

      submit@spamarchive.org

      submit@spamarchive.org

      submit@spamarchive.org

      submit@spamarchive.org

      submit@spamarchive.org

      submit@spamarchive.org

      submit@spamarchive.org

      submit@spamarchive.org

      submit@spamarchive.org

    11. Re:So... by TheTomcat · · Score: 2

      I'd have to agree with the other reply(ies) to this thread. I have a hotmail account created solely for the purposes of MSN messenger signup. The only mail I receive PERIOD is hotmail service spam. And that's one every month or so, and it's directly FROM hotmail. It depends on the username chosen.

      S

    12. Re:So... by swb · · Score: 2

      I've gotten spam like that to accounts that have zero usage or knowledge by others.

      I've assumed that:

      1) Spammers do random generation sends

      2) Spammers harvest the left hand side of email addresses and then hit big ISPs with the dictionary of usernames @bigISP.com. This would make sense, since lots of people are free with their username on web forums or usenet, assuming that guarding the RHS of their address is enough.

      3) ISPs and service providers that claim they will never spam are lying or at least internally rationalizing the stretched-to-breaking privacy policies so they can sell email addresses.

    13. Re:So... by Strog · · Score: 1

      Set your hotmail to exclusive in the junk filters. Then you only receive email from those on your list. This sounds like it would work well for you since you aren't really using it for email anyway.

    14. Re:So... by Slime-dogg · · Score: 1

      yeah, I've got a couple accounts like that too.

      One, figgy4@prodigy.net gets so much spam that I'd actually post it to a web page. heh heh. (300 mails in 2.5 days).

      The other one, a hotmail account, is 11 characters long, and I don't use it for anything but personal communication. It is not common, in that it is the misspelled name of an element. I never receive spam at that account.

      My Attbi account started receiving spam from the topica.com domain, and after numerous complaints to abuse@attbi.com, as well as topica's admin... I just started signing topica's admin address up to as many spam companies as possible. Soon after that, my spam problem from topica went away. heh heh.

      --
      You need to restart your computer. Hold down the Power button for several seconds or press the Restart button.
    15. Re:So... by Arker · · Score: 1

      Thanks for the suggestion, but I don't need to fix it, it's not a problem, and I wasn't complaining. I don't care how much spam hits that account because I never use it, for anything, I never even log into it. Nothing sent to it is of any importance to me whatsoever. It's just a datapoint regarding how much spam a hotmail account receives by default.

      --
      =-=-=-=-=-=-=-=-=-=-=-=-=-=-
      Friends don't let friends enable ecmascript.
    16. Re:So... by Strog · · Score: 1

      I've had hotmail since before Microsoft did. It was pretty good for years but about 6 months ago I had to go to exclusive because of 150 spams a day. I just use it for messenger now to talk to the rest of the family.

    17. Re:So... by Anonymous Coward · · Score: 0

      THE NATURE OF FREEDOM
      93. We are going to argue that industrial-technological society cannot be reformed in such a way as to prevent it from progressively narrowing the sphere of human freedom. But because "freedom" is a word that can be interpreted in many ways, we must first make clear what kind of freedom we are concerned with.

      94. By "freedom" we mean the opportunity to go through the power process, with real goals not the artificial goals of surrogate activities, and without interference, manipulation or supervision from anyone, especially from any large organization. Freedom means being in control (either as an individual or as a member of a SMALL group) of the life-and-death issues of one's existence; food, clothing, shelter and defense against whatever threats there may be in one's environment. Freedom means having power; not the power to control other people but the power to control the circumstances of one's own life. One does not have freedom if anyone else (especially a large organization) has power over one, no matter how benevolently, tolerantly and permissively that power may be exercised. It is important not to confuse freedom with mere permissiveness (see paragraph 72).

      95. It is said that we live in a free society because we have a certain number of constitutionally guaranteed rights. But these are not as important as they seem. The degree of personal freedom that exists in a society is determined more by the economic and technological structure of the society than by its laws or its form of government. [16] Most of the Indian nations of New England were monarchies, and many of the cities of the Italian Renaissance were controlled by dictators. But in reading about these societies one gets the impression that they allowed far more personal freedom than our society does. In part this was because they lacked efficient mechanisms for enforcing the ruler's will: There were no modern, well-organized police forces, no rapid long-distance communications, no surveillance cameras, no dossiers of information about the lives of average citizens. Hence it was relatively easy to evade control.

  2. wow by gomerbud · · Score: 2, Funny

    I should just gzip my mbox and send it to them. That'll give them years of research material.

    --
    Kan jeg få en pils, vær så snill?
    1. Re:wow by isorox · · Score: 2

      I should just gzip my mbox

      bzip2 would shave a few gigs off mine

    2. Re:wow by vb.warrior · · Score: 0

      Still wouldn't make you any more popular or sociable would it though...

  3. A hotmail account is just as good by Anonymous Coward · · Score: 5, Funny
    There was no large set of spam that could be used to test new anti-spam algorithms


    Whoever wrote this obviously doesn't have a Hotmail account.
    1. Re:A hotmail account is just as good by Anonymous Coward · · Score: 1, Interesting

      Having had a hotmail account for a year now and receiving about 1 spam mail every 2 weeks on it (as opposed to my ISP account with 2 a day or so), I honestly have no idea what you're talking about.

    2. Re:A hotmail account is just as good by Anonymous Coward · · Score: 1, Informative

      Actually, I have a hotmail account JUST FOR MESSENGER. I never give out this address. And yet, even with the spam filters turned on, it was almost always showing "1 new message". Finally I switched it to "allow only messages from people on this list", and with an empty allow-list, I haven't got any spam so far.
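
      The 'allow only messages from people on this list' setting amounts to a very small piece of logic. This is a sketch of the idea only, not Hotmail's implementation; the function name is made up for the example.

```python
from email.utils import parseaddr


def accept(from_header: str, allow_list: set) -> bool:
    """Accept a message only if its sender address is on the allow-list."""
    _, address = parseaddr(from_header)
    return address.lower() in allow_list
```

      With an empty allow-list nothing gets through, which matches the experience above. Since the From header can be forged, this only stops senders who do not happen to fake a listed address.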

  4. Hard to get worked up about that by RebRachman · · Score: 5, Interesting

    Even I know how to buy a domain name and write a few paragraphs of text on a white background. There is nothing about this archive to hint at its origin or credibility. This is a /. worthy story?

    1. Re:Hard to get worked up about that by Paul+03244 · · Score: 0, Troll

      Amen

    2. Re:Hard to get worked up about that by Anonymous Coward · · Score: 1, Funny

      Let's see how well it survives slashdot. That will tell us more about how good the admins are :)

    3. Re:Hard to get worked up about that by arvindn · · Score: 5, Informative

      Even I know how to buy a domain name and write a few paragraphs of text on a white background.
      But you didn't, did you?

      This is a /. worthy story?
      You're missing the point. The story is not on /. because something revolutionary has been done, but because the huge number of /. readers can get together and create a useful database. Obviously it would be no good if no one knew about it. In a sense, the story is worthy because it got on /. :) Kind of a reverse Catch-22, if you like.

      What you can do:
      • Help them implement their automated spam review scripts. As with any project, they need volunteers.
      • Make sure you send them a copy of all the spam you receive. From their page:
        SpamArchive.org's efficiency is proportional to the amount, quality, and variety of spam that is provided. End users can forward known spam to submit@spamarchive.org.
    4. Re:Hard to get worked up about that by jjl · · Score: 1

      It's not about the ease of doing that, but about the will to put it up and the willingness to put time and resources into it.

      --
      --
    5. Re:Hard to get worked up about that by Anonymous Coward · · Score: 1, Informative

      SpamArchive.org's efficiency is proportional to the amount, quality, and variety of spam that is provided. End users can forward known spam to submit@spamarchive.org.

      I expect they mean efficacy.

    6. Re:Hard to get worked up about that by MWelchUK · · Score: 1

      Slashdot has got a white background by default too! Wow, it is easy.

    7. Re:Hard to get worked up about that by RebRachman · · Score: 5, Insightful

      The point is that if they want to do a spam archive, you would expect them to do some minimal research. This page clearly shows that SpamArchive.org has not done the following basic background work:

      1. Told me who they are so that I might trust them.

      2. Told me anything about their technology/database so that I might know if it is really going to be useful. For all I know they haven't even thought about the collection, storage and retrieval issues behind dealing with this.
      3. Collected and collated the supposedly uncoordinated archives that already exist.
      4. Added even one link to a relevant site. You would assume that to undertake such a project they would at least have visited a few sites before concluding there was nothing out there. Posting a couple of relevant URLs wouldn't be too much work.

      In short, I am not impressed that someone who can do 20 minutes of work is the same someone who can undertake the huge project proposed here. It looks like they think that somehow all they need is for people to send them information by e-mail, and for a few other people to volunteer to do the work. Not a promising start.

    8. Re:Hard to get worked up about that by rastachops · · Score: 1

      People who use FP are generally net newbies. Are they really all that trustable?

    9. Re:Hard to get worked up about that by jez9999 · · Score: 1, Redundant

      20 MINUTES of work? I could do that webpage in 1 minute.

    10. Re:Hard to get worked up about that by Anonymous Coward · · Score: 0

      Dude, if the text wasn't black on white, the /. effect would have kicked-in.

      -Marton

    11. Re:Hard to get worked up about that by bobdole34 · · Score: 0

      I'm gonna have to agree. It's written in MS FrontPage no less.

      --
      "Failure of Windows operating systems is extremely rare. If it happens, it is usually due to operating system file c
  5. Database? by dat00ket · · Score: 5, Funny

    Can't researchers just set up their own hotmail account?

    Seems cheaper.

    1. Re:Database? by JessLeah · · Score: 3, Funny

      Ahh, but think of the fees they'd have to pay Microsoft for all that extra storage ;)

      After they carefully posted the new Hotmail address all over the Web, they'd blow their quota in around 12 hours. :)

    2. Re:Database? by stevenp · · Score: 2, Informative

      The learning mechanisms for detecting spam, such as Bayesian classification, require a large number of messages to build a good spam detection profile. The average 500 message JunkMail folder is not big enough for the purpose.

    3. Re:Database? by Anonymous Coward · · Score: 0

      Nowadays you don't even have to post a hotmail address anywhere to get spam.

    4. Re:Database? by jez9999 · · Score: 2

      The average 500 message JunkMail folder is not big enough for the purpose.

      What? If a Bayesian script was having to go through significantly more than that per e-mail to check whether it was spam, you'd be waiting minutes just to get your e-mail classified.

    5. Re:Database? by Gabe+Garza · · Score: 1
      That's not how a "Bayesian script" (at least a correctly implemented one...) works. :) You don't have to scan your entire training set every time you analyze an email; what you do is analyze the training set (once) and build a data structure from it that you can access efficiently.

      For example, in a Bayesian filter you're probably interested in word frequency. So you'd go over the training set once and build a hash table that maps a word to the number of times it's occurred, and also increment a counter of the total number of words. You can incrementally update this table as more spams come in, of course. And since the table can be saved and restored from disk without too much work, you only have to analyze a given spam message once.
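
      The table described above can be sketched in a few lines. This is a minimal illustration under simple assumptions (a crude tokenizer, JSON for storage, the class name `WordTable`), not the code of any particular filter.

```python
import json
import re
from collections import Counter


class WordTable:
    """Word counts plus a running total, built once and updated incrementally."""

    def __init__(self):
        self.counts = Counter()   # word -> number of occurrences
        self.total = 0            # total number of words seen

    def add(self, message: str) -> None:
        """Fold one message into the table; each message is analyzed only once."""
        words = re.findall(r"[a-z0-9']+", message.lower())
        self.counts.update(words)
        self.total += len(words)

    def frequency(self, word: str) -> float:
        """Relative frequency of a word: a table lookup, not a rescan."""
        return self.counts[word.lower()] / self.total if self.total else 0.0

    def save(self, path: str) -> None:
        """Write the table to disk as JSON."""
        with open(path, "w", encoding="utf-8") as f:
            json.dump({"counts": self.counts, "total": self.total}, f)

    @classmethod
    def load(cls, path: str) -> "WordTable":
        """Restore a table previously written by save()."""
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
        table = cls()
        table.counts = Counter(data["counts"])
        table.total = data["total"]
        return table
```

      Classifying a new email then costs one lookup per word, however large the training set was. A working filter would keep one such table for spam and another for legitimate mail and compare the two.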

    6. Re:Database? by spongman · · Score: 2

      not so at all. I have been using the excellent, free spambayes filter and it works remarkably well, even for small spam corpora. 500 spams is plenty.

  6. I can picture a future... by JessLeah · · Score: 3, Funny

    ...where wizened historians wearing horn-rimmed spectacles will sit, hunched over computers, studying the archives of ancient spam.

    "This one mentions sex... apparently, sex was a preoccupation of the early twenty-first century..."

    1. Re:I can picture a future... by larien · · Score: 2

      You say that like it's a bad thing...

  7. archive overload by ndevice · · Score: 2, Interesting

    Asking for a slashdotting is one thing, but asking to be an archive for spam is another.

    I wonder if anyone knows just how much of the stuff is out there, and if it's even possible to store all that. Of course, spam being mostly duplicates and all, maybe they have a chance. But with spammers staying ahead of the game and rotating their text, I wouldn't count on it.

    On the other hand, why not just set up a couple of hotmail accounts, bait them a bit, and just watch the spam come in? Why even bother asking for it?

  8. Trade Spam! by Pathwalker · · Score: 5, Funny

    Now that spam is so collectable, someone should start a service to let people trade it?

    What will someone give me for my rare "Help fund the freedom fighters in Chechnya!" complete with numbered bank accounts to send donations to?

    1. Re:Trade Spam! by EmagGeek · · Score: 1

      I'll trade you my 1997 "Penile Enlargement" rookie card^H^H^H^Hspam for it... this one is the original!

    2. Re:Trade Spam! by Surak · · Score: 3, Funny

      Now that spam is so collectable, someone should start a service to let people trade it?

      Yeah, it's called 'Gnutella'. :-P

    3. Re:Trade Spam! by leecho · · Score: 0

      How long before spammers file for DMCA?

  9. Sounds like a good idea, but by Anonymous Coward · · Score: 0

    but what use will it be if the anti-spam tools it helps develop can't adapt to new forms of spam? It is a good idea to build an archive of old spam, but what about the new spam that it will ultimately give rise to? Just like any biological system they will adapt or die. Hopefully DIE! But if not then they will be more annoying than ever.

  10. Tell everyone! by some+guy+I+know · · Score: 5, Funny

    I think that they should send email out to everybody describing this great service!

    --
    Those who sacrifice security to condemn liberty deserve to repeat history or something. - Benjamin Santayana
    1. Re:Tell everyone! by RyoSaeba · · Score: 1

      And get cataloged as spam senders ?
      Would be easy for'em to catalog their own spam, though !

      --
      Tsuyoikoto ha taisetsu da ne, dakedo namida mo hitsuyousa (Strength is an important thing, but tears too are necessary)
    2. Re:Tell everyone! by Anonymous Coward · · Score: 0

      Do you ride the short bus to school?

  11. Who are these guys? by gomerbud · · Score: 5, Interesting

    Dude, I could have registered a similar domain and put up a comparable web page within a matter of hours. I hope they really exist.

    Wouldn't it be great if the submit email address was forwarded to someone's ex girlfriend? That's the ultimate form of revenge...

    1) Register domain name.
    2) Put up web page advertising some kind of anti-spam database.
    3) Forward all email sent to the submit address to someone you don't like.
    4) Get slashdotted.

    The end result is that three million people send 100 spams the first hour to the submit address. Within a short amount of time, your foe has 300 million emails in his/her mailbox. Now that's spam.

    --
    Kan jeg få en pils, vær så snill?
    1. Re:Who are these guys? by EmagGeek · · Score: 1, Funny
      Dude, I could have registered a similar domain and put up a comparable web page within a matter of hours. I hope they really exist. Wouldn't it be great if the submit email address was forwarded to someone's ex girlfriend? That's the ultimate form of revenge... 1) Register domain name. 2) Put up web page advertising some kind of anti-spam database. 3) Forward all email sent to the submit address to someone you don't like. 4) Get slashdotted.

      Don't forget...

      5) ???
      6) PROFIT!

    2. Re:Who are these guys? by Anonymous Coward · · Score: 0

      More likely, thousands of people forward all their spam to these addresses without removing their email addresses from the message - providing the guys who run spamarchive.org with a ton of new addresses to spam.

      I remember when Slashdot ran an article years ago about a site called Anti SPam or something. You registered with them and they helped you avoid spam by altering a message in a certain way or something (it's been a long time, so I forget).

      It was exactly as I suspected back then though -- it was a scam to collect a ton of email addresses. I used a unique address with that site and I still get about two dozen spams per day to that address that I only used at that slashdot advertised site - and this is three or four years afterward.

    3. Re:Who are these guys? by Anonymous Coward · · Score: 0

      You sir, are an evil genius!

    4. Re:Who are these guys? by piranha(jpl) · · Score: 1
      Wouldn't it be great if the submit email address was forwarded to someone's ex girlfriend? That's the ultimate form of revenge...

      Or... you could just send your own personal collection of 100-1000 spam messages over and over to the target, randomizing headers to avoid filtration.

      If your scenario was correct, the spamarchive.org's mail servers would still be used to originate mail. Even if someone else sent it, it's SA.O's IP address that is making connections to the target's MX(s). And it also would consume SA.O's own bandwidth. Just like if you sent it yourself. (In fact, more of their bandwidth would be used for forwarding than originating, since you have to receive the message as well as send it.) That, uh, puts you in a bit of a bad situation with your ISP.

      Pretty silly idea.

    5. Re:Who are these guys? by theLOUDroom · · Score: 1

      Not necessarily. My registrar offers free email forwarding. If they set it up that way, they can point it wherever they want. But if they do it that way, the to address will be whatever@spamarchive.org or something. Still easy to figure out where it's coming from.

      --
      Life is too short to proofread.
    6. Re:Who are these guys? by Corporate+Troll · · Score: 2, Insightful
      Much easier:
      • Set up sendmail
      • Make a script that sends a mail out of a random collection of SPAM, goatse.cx pictures and viruses. Make sure that the FROM: field is faked
      • For the paranoid: use a free dial-up ISP in order to cover your traces.
      • Set the script in a cronjob and let it run every minute. (or put the script in an infinite loop)

      Your ex is gonna love you for that. Not that *I* ever do such things... Don't be astonished if your car is keyed the next day, by the way.

    7. Re:Who are these guys? by eX-fly · · Score: 1

      Well, I did a WHOIS and found:

      Domain Name: SPAMARCHIVE.ORG

      Registered Through....: 1cheapdomains
      Created on............: Sep 28, 2002 6:07:24 AM
      Expires on............: Sep 27, 2003 3:59:34 PM
      Record last updated on: Sep 28, 2002 6:07:24 AM

      Owner, Administrative Contact, Technical Contact, Billing Contact:
      Guru Rajan (ID00024772)
      11475 Great Oak Way
      Suite 210
      Alpharetta, GA 30022
      us
      Phone: +1.6789699399
      Email: guru.rajan@ciphertrust.com

      Hell, you can even call him if you'd like ;-)

    8. Re:Who are these guys? by Dog+and+Pony · · Score: 2

      Well, yeah, but that is so lame.

      A scheme like this would have style. :)

    9. Re:Who are these guys? by Corporate+Troll · · Score: 1

      Well, yes, my technique is a bit less refined... but definitely gets more "nerd points". Reliable, cheaper (no domain needed) and more efficient.
      Anyway: the weak point in the spamarchive-as-revenge-technique is that point 4 is uncertain. How do you get slashdotted? Don't forget that what is posted on the frontpage is at the whim of the editors. You can submit a good story, have it rejected, and then two months later that story pops up on the frontpage submitted by someone else.
      So how do you get your website posted on slashdot? I don't think there is a safe way to be absolutely certain.

    10. Re:Who are these guys? by signer · · Score: 1

      I just checked out ciphertrust on Dogpile. Guess what? They're an email security company that makes spam filters, based in Atlanta! Looks like this might be for real after all...

      --

      Independent musicians and registration-free net radio at EmergentSound

    11. Re:Who are these guys? by johnnliu · · Score: 1


      >Dude, I could have registered a similar domain and put up a comparable web page within a matter of hours. I hope they really exist.

      Don't know if you'll survive a slashdotting though... :)

    12. Re:Who are these guys? by FunkyChild · · Score: 1

      5) ???
      6) Profit!

  12. Oh i thought it was a collection.... by phunhippy · · Score: 3, Interesting

    Damn!
    And there I was thinking they were creating a historical archive of all the funny worthless spam we get in our mailboxes every day...

    See that could turn spam in to a fun thing! set up a site where spam is ranked most popular by the number of people forwarding in the same SPAMS they get.. i think it would be interesting to see a daily/hourly/weekly TOP 10 SPAM in the world graphs..
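
    The ranking part of this idea is easy to sketch: count how many times each message has been forwarded in and list the most common ones. The snippet below is a rough sketch that treats copies as 'the same spam' when they match after lowercasing and collapsing whitespace; both function names are invented for the example.

```python
import hashlib
from collections import Counter


def spam_key(body: str) -> str:
    """Collapse whitespace and case so trivially different copies match."""
    canonical = " ".join(body.lower().split())
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def top_spam(submissions, n=10):
    """Return the n most-forwarded messages as (sample_text, count) pairs."""
    counts = Counter()
    samples = {}
    for body in submissions:
        key = spam_key(body)
        counts[key] += 1
        samples.setdefault(key, body)  # keep the first copy seen as the sample
    return [(samples[key], count) for key, count in counts.most_common(n)]
```

    Running this over the submissions from each day, hour or week would give the data for the graphs.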

    I would do this myself.. cept i suck at html.. anyone need a VoIP network built? :)

    1. Re:Oh i thought it was a collection.... by martin-boundary · · Score: 3, Funny

      Wow! And if the site becomes popular, you could start putting up banner ads, and maybe a couple of pop-overs and pop-unders. MAKE MONEY REAL FAST! ;0)

    2. Re:Oh i thought it was a collection.... by phunhippy · · Score: 2

      Wow! And if the site becomes popular, you could start putting up banner ads, and maybe a couple of pop-overs and pop-unders. MAKE MONEY REAL FAST! ;0)

      hey! i didn't even think about that!! maybe some flash ads too that cover the screen? tell you what! you're now Sr. Vice President in charge of marketing! Call X-10 RIGHT AWAY!!!!

    3. Re:Oh i thought it was a collection.... by martin-boundary · · Score: 1

      YES! This is going to be even better than the stray-hair-in-pet-food-removal-service-over-the-in ternet idea I had. W A Y T O G O! Let's have a business party^H^H^Hmeeting!

    4. Re:Oh i thought it was a collection.... by Anonymous Coward · · Score: 0

      YES! This is going to be even better than the stray-hair-in-pet-food-removal-service-over-the-in ternet idea I had. W A Y T O G O! Let's have a business party^H^H^Hmeeting!

      Lets learn how to use the preview button buddy

  13. recycled spam by ndevice · · Score: 2, Insightful

    With some people already accusing bugtraq of being a repository for exploits that anyone could use for exploit purposes, you'd think that the same could happen to the spam archive.

    Soon we'll see old spam being recycled as the new breed of spam trolls mine the archive for inspiration - and maybe just material reuse.

    Then, of course, it's not like we don't see recycled spam anyway, so maybe this isn't such a bad thing...

    (And if I sound incoherent, it's 2 in the morning. I should be sleeping.)

  14. They're asking for trouble by EmagGeek · · Score: 3, Insightful
    Is this really necessary? I mean, come on, how hard is it to find spam for research? Most people get more spam than their Hotmail inbox can handle just for signing up for the account. All a researcher has to do is start clicking the "Remove Me" link in those emails and he or she will have more spam than he or she knows what to do with!

    Combine that with posting to some anti-spam newsgroups with their real email address, and bingo boingo, all the spam in the world will come right to them.

    This site also creates a problem in that only the spam posted to that site might be used for research. There might be millions of spam emails overlooked because they don't make it onto that site. Think of those poor spammers that won't get filtered :)

    Won't someone please think of the children!?!?

    1. Re:They're asking for trouble by redshift-systems · · Score: 1

      Good point, how hard can it be to find a bunch of spam? Sounds like they just couldn't be bothered digging anything up themselves, so they go out and get the community to do the legwork.

      Anyway, it seems a shame to use it just for anti-spam algorithm research. Why not put it to good use and build APIs on top of the database, so email apps can update their definitions on a regular basis, similar to anti-virus software? Now that would make sense.

    2. Re:They're asking for trouble by piranha(jpl) · · Score: 2, Insightful
      Is this really necessary? I mean, come on, how hard is it to find spam for research? Most people get more spam than their Hotmail inbox can handle just for signing up for the account. All a researcher has to do is start clicking the "Remove Me" link in those emails and he or she will have more spam than he or she knows what to do with!

      Wrong. I've been setting up bogus e-mail accounts on a domain created exclusively for spam research/testing. I've gone through at least a dozen "unsubscribe" links and never received one spam out of it to those test accounts. Perhaps the spammers only highlight records for people who "unsubscribe" when those people were in their database in the first place.

      (The most spam I've received so far in one of these test accounts was from signing up to freefootfetishezine.com.)

      This site also creates a problem in that only the spam posted to that site might be used for research. There might be millions of spam emails overlooked because they don't make it onto that site. Think of those poor spammers that won't get filtered :)

      That doesn't make sense; they might not get a good sample of the spam if they don't solicit samples, just as much as they might not get a good sample if they do. It makes more sense that they would get more spam--and more diverse spam--from soliciting examples. Consider that submitted samples would come from all over the world, from a variety of sources, and in a variety of languages.

  15. Imagine by Anonymous Coward · · Score: 0

    Imagine a b....
    Oh forget it

  16. Anti-intuitive archive! by krazyninja · · Score: 2
    Well, now that all the possible spam is archived in one place, we can expect spammers to find new methods of spamming which are not in the archive. The people who are behind this (no names, no addresses mentioned on the site) would do better to archive the latest developments in anti-spamming technologies rather than just archive the spam. Also, IMO, a tool that is tested with such a big archive of general spam will never work for specific anti-spamming applications, which is what consumers would prefer.

    --
    "Do something man. Right now."
    1. Re:Anti-intuitive archive! by gomerbud · · Score: 1

      [daver@tombstone:~]$ whois spamarchive.org
      [snip]
      Owner, Administrative Contact, Technical Contact, Billing Contact:
      Guru Rajan (ID00024772)
      11475 Great Oak Way
      Suite 210
      Alpharetta, GA 30022
      us
      Phone: +1.6789699399
      Email: guru.rajan@ciphertrust.com

      --
      Kan jeg få en pils, vær så snill?
  17. What about NANAS? by tsvk · · Score: 5, Informative

    NANAS, or the newsgroup news.admin.net-abuse.sightings, does just this. It is a public archive of spam which can be searched e.g. with Google Groups:

    http://groups.google.com/groups?group=news.admin.net-abuse.sightings

    Why reinvent the wheel? Or does this new spam archive have any new functionality to offer?

    1. Re:What about NANAS? by Anonymous Coward · · Score: 0

      Or does this new spam archive have any new functionality to offer?

      It doesn't. Because it doesn't exist. Visit the site.

  18. Great! by Cheese+Cracker · · Score: 2

    Now Spam Radio got an archive to dig out new infomercials from. :)

  19. NANAS Google Archive by Ricardo+Dias+Marques · · Score: 5, Informative

    Well, there is already a pretty large Email and USENET Spam archive at the NANAS (news.admin.net-abuse.sightings) newsgroup.

    You can check the Google Groups archive

    You can read the NANAS charter at http://www.killfile.org/~tskirvin/nana/charter/nanas.html

  20. Quite obscure problem, actually. by mirko · · Score: 2

    Most obvious is that usually spam is personalized, that is the recipient's mail address (or part of it) often appears either in the subject or in the body. So will this archive store every variant of every spam, or just a 'global' model ?

    I guess it would be easy to implement some "almost identical" recognition filter, but the problem is that somebody forwarding a funny spam to somebody else (hey, haven't you kept your very first "herbal alternative to viagra" spam message in order to show it to somebody? ... ok, neither did I.) might be listed as a spammer. So there should be some recurrence filter, so that a given "spammer" is only flagged for sending a given spam model more than once, to more than one recipient. But here, once again, we may face situations where everybody could be hurt by such restrictions.
    I personally consider the spam problem overhyped, as it doesn't take me more than 15 seconds a day to eliminate unwanted messages.
    I have more of a problem in real life with the advertisers who dump their pizza prices in my mailbox, but here in Switzerland, everyone pays for every piece of garbage he dumps.
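A rough sketch of the "almost identical" recognition idea, for what it's worth. It assumes word-shingle overlap (Jaccard similarity) is a good enough notion of "same spam model"; the function names, the shingle size and the 0.6 threshold are all made up for illustration, not anything SpamArchive.org has announced.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word shingles in text, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of the shingle sets of two messages (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

def is_near_duplicate(a, b, threshold=0.6):
    """Treat two messages as variants of one spam if they overlap enough."""
    return jaccard(a, b) >= threshold
```

Two copies of a spam that differ only in the recipient's name share most of their shingles, so an archive could store one "model" per cluster instead of every variant. It says nothing about who sent the message, so it doesn't solve the forwarded-funny-spam problem.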

    --
    Trolling using another account since 2005.
  21. spamarchive.com by philj · · Score: 3, Informative

    I've owned spamarchive.com for ages.

    Want it? - I have no use for it.....

  22. Whois.. by Anonymous Coward · · Score: 5, Informative

    says:
    Domain Name: SPAMARCHIVE.ORG
    Owner, Administrative Contact, Technical Contact, Billing Contact:
    Guru Rajan (ID00024772)
    11475 Great Oak Way
    Suite 210
    Alpharetta, GA 30022
    us
    Phone: +1.6789699399
    Email: guru.rajan@ciphertrust.com

    http://www.ciphertrust.com introduces itself as:

    Protect Your Email Gateway
    Anti-spam and email security for the enterprise

    CipherTrust has integrated defenses for all email application-level threats into one, comprehensive device. Our IronMail appliance protects enterprise email systems such as Microsoft Exchange, Lotus Notes and Novell GroupWise against viruses, spam, and intruders, and provides message privacy and policy enforcement.

    1. Re:Whois.. by Anonymous Coward · · Score: 4, Insightful

      So let's get this straight...

      This database is run by a little-known company of
      mixed reputation that sells its own anti-spam tool.

      It doesn't promise any new functionality that news.admin.net-abuse.* doesn't already provide. There's absolutely no reason to believe that the spams collected here will be any 'better' a sample than those collected by opening a random Hotmail account.

      So, what's in it for Ciphertrust? As well as their own library of spam, they'll have a collection of e-mail addresses of people who are interested in fighting spam.

      And what's in it for us? Anyone? Bueller? Anyone?

    2. Re:Whois.. by Anonymous Coward · · Score: 0

      Registrant:
      Netbank Inc. (PXAEBDDWQD)
      11475 Great Oaks Way Suite 100
      Alpharetta, GA 30022
      US

      Domain Name: NETBANK.COM

      Administrative Contact, Technical Contact:
      Netbank Inc. (RVNSIMFPUO) dnsadmin@NETBANK.COM
      Netbank Inc.
      11475 Great Oaks Way Suite 100
      Alpharetta, GA 30022
      US
      999-999-9999

      Record expires on 26-May-2010.
      Record created on 26-May-1998.
      Database last updated on 21-Nov-2002 09:43:03 EST.

      Domain servers in listed order:

      NS2.NCRWEBHOST.COM 199.105.175.3
      NS1.NCRWEBHOST.COM 199.105.175.2

    3. Re:Whois.. by Matts · · Score: 2

      And according to their contacts page, Guru Rajan is their Chief Architect.

      --

      Matt. Want XML + Apache + Stylesheets? Get AxKit.
    4. Re:Whois.. by Anonymous Coward · · Score: 0

      Ooh! At last a smart person who does more than wonder about it. I had the same suspicions as others as soon as I read the story and original link, did my own quick research, and found the same results. Nice job (giving myself a pat on the back at the same time, heh).

  23. Re:Top 20 spammers in the country. by Anonymous Coward · · Score: 0

    +1 Informative? WTF? The parent is a _TROLL_, man. Look at some of those names a bit more carefully.

  24. The opposite by sholden · · Score: 5, Insightful

    Exactly the opposite is needed for work on mail filters.

    Spam is really easy to find, everyone knows that: create a hotmail account, fill out some web forms, post to some newsgroups, put a mailto: on a web page. Wait a little while. Bingo, lots of spam.

    However, non-spam email is harder to find. Using your own makes techniques that work with your particular type of email and not other people's.

    Non-spam is harder to collect, since email is often private in nature. Removing identifiers from the headers is easy enough, but the body also can contain things like addresses, emails, phone numbers, comparisons of the boss to bacteria, etc.

    A collection of real emails, from which personal information has been replaced with fake data would be of great use. A few people I know are working on creating such a data set of email. It is aimed at more general email filtering though, not just spam detection, and hence requires categorisation. And is from academia and hence will probably lose the race with the heat death of universe for completion.

    I do note they have a 'non-spam' heading on the very sparse web page which is encouraging.

    1. Re:The opposite by duncf · · Score: 1

      I am one of the developers of SpamAssassin and I'm going to agree; non-spam is far harder to collect and it is needed in just as high a quantity as spam.

      The biggest problem with non-spam is that it's private and often sensitive. It would be impossible to collect a giant corpus of non-spam representative of the business world that could be used to tweak spam filters.

      For SpamAssassin, we get users/developers to submit the results (tests hit) for each message when run through spamassassin, and plug spam and non-spam results into some sort of a Genetic Algorithm. This way, users only need to submit results of rules, not full messages to us for scoring. For the most recent score set, we had 169k non-spam messages and 29k spam messages. (The scores are very good!)

      For testing individual rules, we have a similar mechanism in place, with a smaller volume of results.

      I'd say the best testing you can do involves the user with the mail running the test, and sending you the results, rather than sending you the mail.

      One problem with public corpuses is that they tend to get dated, and generally aren't representative of the messages you want to filter. Filters based on a Bayesian type mechanism will find this sort of an archive entirely useless, and there are clearly better methods for rules-based filters.
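To make the "submit results, not messages" idea concrete, here is a minimal sketch. It is not SpamAssassin's actual code or file format: the rule names, the scores and the 5.0 threshold are hypothetical. It only shows that, given which rules each message hit, you can measure false positives and false negatives for a candidate score set without ever seeing the mail. A tuning step (genetic algorithm or otherwise) would use something like `evaluate` as its fitness function.

```python
def message_score(hits, scores):
    """Sum the scores of the rules a message hit; unknown rules count as 0."""
    return sum(scores.get(rule, 0.0) for rule in hits)

def evaluate(results, scores, threshold=5.0):
    """results: list of (is_spam, set_of_rule_names) submitted by users.

    Returns (false_positive_rate, false_negative_rate) for this score set.
    """
    fp = fn = ham = spam = 0
    for is_spam, hits in results:
        flagged = message_score(hits, scores) >= threshold
        if is_spam:
            spam += 1
            fn += not flagged   # spam that slipped through
        else:
            ham += 1
            fp += flagged       # legitimate mail wrongly flagged
    return (fp / ham if ham else 0.0, fn / spam if spam else 0.0)
```

Usage would look like `evaluate(results, {"ALL_CAPS_SUBJECT": 3.0, "KNOWN_SENDER": -4.0})`, with `results` collected from many users' mailboxes.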

    2. Re:The opposite by sholden · · Score: 1
      One problem with public corpuses is that they tend to get dated, and generally aren't representative of the messages you want to filter. Filters based on a Bayesian type mechanism will find this sort of an archive entirely useless, and there are clearly better methods for rules-based filters.
      The main benefit of a public corpus is academic. It allows different methodologies to be compared and for experiments to be repeatable.
    3. Re:The opposite by duncf · · Score: 1
      The main benefit of a public corpus is academic. It allows different methodologies to be compared and for experiments to be repeatable.

      Perhaps a public corpus can compare different methodologies; however, there are better ways of harvesting one.

      Regardless, these corpora will never be accurate representations of normal mail.

  25. Spam and anti-spam by zedman · · Score: 5, Funny

    Would spammers try to "anti-spam" the spam archive by submitting billions of perfectly normal emails?

    Ian

    1. Re:Spam and anti-spam by Anonymous Coward · · Score: 0

      What a great idea. I can submit every email from my in-laws to the archive!

    2. Re:Spam and anti-spam by leuk_he · · Score: 2

      And what about the users who were lazy and didn't want to unsubscribe from a mailing list (let's say, eBay) and just blocked it as being "spam"? This comes back to the question: what exactly is spam?

      -- This posting is in ACCORDANCE with Slashdot law 2.8.

    3. Re:Spam and anti-spam by elodan · · Score: 1
      Where would they get the billions of perfectly normal emails?

      All the emails would have to be unique - if they were dupes then they would, by definition, be a type of spam all on their own.

      And do you really see those worthless scumbag spammers banding together? The sharks would just eat each other.

    4. Re:Spam and anti-spam by Penguinoflight · · Score: 2

      Spammers are generally just stupid enough to click send. They won't likely find this site, and it's not worth their time to mess it up either.

      --
      "And we have seen and do testify that the Father sent the Son to be the Savior of the World"
      1 John 4:14
    5. Re:Spam and anti-spam by pacc · · Score: 2

      But are normal people smart enough for their own good?
      I'm already contemplating to submit submit@spamarchive.org to "daily-word-of-the-bible mailinglists"

    6. Re:Spam and anti-spam by Penguinoflight · · Score: 2

      Go ahead and do it if this "daily-word-of-the-bible" mailinglist is really unsolicited, but if they aren't unsolicited you'll compromise the whole spamlist and make it harder for the people running it. They don't want a bad name for blacklisting stuff that's perfectly legit.

      --
      "And we have seen and do testify that the Father sent the Son to be the Savior of the World"
      1 John 4:14
  26. I hope they really wanted it! by su-geek · · Score: 1

    I just added a rule to my spam filter to forward all messages!

  27. A project like this needs funding by Anonymous Coward · · Score: 2, Funny

    This worthy effort needs funding to keep it alive. I have some contacts from Nigeria who may be able to help, I will forward their details.

  28. Re:Top 20 spammers in the country. by Gendou · · Score: 1

    Actually, I can vouch for this; it's totally real. I also saw it in last month's issue of Wired, list & all. Yes, I'm a bit ashamed to admit that I read Wired, but, hey, what're ya gunna do? The article on spam was really interesting and is worth a read, even if you already consider yourself an expert on the subject.

  29. Non-spam messages for false hit testing by jjl · · Score: 3, Insightful

    An archive of sample non-spam messages should be collected as well, containing real e-mail messages which aren't spam. These should be more or less normal private e-mails which people have volunteered to make public for testing purposes.
    The purpose of the non-spam samples would be to help prevent false hits in the spam filtering algorithms, just as real spam messages are used to tune the algos for detecting spam.

    --
    --
    1. Re:Non-spam messages for false hit testing by MWelchUK · · Score: 1

      Just what I was thinking!

      If I only had spam coming in to the filter, all I would have to do is pass all e-mail to the TrashCan!

      I wish that they had just mandated something like an extra line in the header of all spam rather than a remove link at the bottom; then we could have just filtered out all occurrences with that in it!

  30. S-P-A-M, again and again and again and again by n3k5 · · Score: 1

    this article reminded me of that hilarious 'spam' song by save ferris and i decided to dig out the lyrics. if you happen to know the direct url (google helps there), it works just fine, but check out what happens if you click at their link in their lyrics listing for 'save ferris'.

    --
    but what do i know, i'm just a model.
    1. Re:S-P-A-M, again and again and again and again by Alranor · · Score: 3, Funny

      Please, if you're going to quote spam songs, why didn't you find this one

      Lovely Spaaam! Wonderful Spaaam!
      Lovely Spaaam! Wonderful Spam.

      Spa-a-a-a-a-a-a-am.
      Spa-a-a-a-a-a-a-am.
      Spa-a-a-a-a-a-a-am.
      Spa-a-a-a-a-a-a-am.

      Lovely Spaaam! (Lovely Spam!)
      Lovely Spaaam! (Lovely Spam!)
      Lovely Spaaam!

      Spaaam, Spaaam, Spaaam, Spaaaaaam!

  31. What about the others ? by ltjohhed · · Score: 2, Interesting
    Like SpamHaus ? It seems like a similar service right ?!

    --
    All generalizations are false
  32. Are YOU a spammer? by Cheese+Cracker · · Score: 2, Funny

    Take the test and find out... ;)

  33. Re:Top 20 spammers in the country. ??? by Anonymous Coward · · Score: 0

    What???? I would like a link to that story just to confirm this...

    Craig McPherson of the hole Debianits is an evil spammer... I refuse to believe it...

    Damn back in the days on LNO he used to be such a nice troll, he had style..

    - Lovechild

  34. What if... by serlaten · · Score: 5, Interesting

    ...spammers use the anti-spam tools to create spam that doesn't trigger the automatic spam filters.

    1. Write spam mail
    2. Filter through widely used spam filter
    3. If spam is flagged as spam, rewrite; goto 2
    4. Send
    5. Profit
    1. Re:What if... by thing_in_itself · · Score: 3, Insightful
      After a certain point though, spammers are pretty much stuck with a few basic "selling points" -- it's hard to sell something if you don't include a product description or URL or address/phone of some sort, and spam filters will evolve to catch those kinds of things unless they're stripped down to their bare bones (as in, just a random bare URL.... hey, wait, that sounds like half the e-mail I send to my friends ;).

      Even then, a hypothetical "widely used" spam filter will probably include a user-specific Bayesian filter, so you can create your own local database of what tends to be spam, and more importantly, what tends not to be spam -- and your own "real mail" keywords will probably be highly specific to your interests/career. So you're basically "evolving" a personal blacklist/whitelist to go along with the global filter.

      But probably the most interesting thing about "spam evolution" is that if spam can get through a spam filter, it's going to be really toned-down and bland. That may not make a difference to you, but it'll drastically lower the spammers' response rates because their ads aren't as flashy. Less profit = less spammers. (This last paragraph wasn't "my idea" -- forget where on the web I saw it.)
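For the curious, a toy sketch of what a user-specific Bayesian filter might look like. The class name, the tokenizer and the add-one smoothing are my own choices for illustration; real filters do considerably more (header tokens, token selection, tuned thresholds).

```python
import math
import re
from collections import Counter

class TinyBayes:
    """Naive Bayes over word counts, trained on one user's own mail."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        return re.findall(r"[a-z0-9$']+", text.lower())

    def train(self, text, label):
        """label is 'spam' or 'ham' (the user's own judgement)."""
        self.counts[label].update(self.tokens(text))
        self.docs[label] += 1

    def log_odds(self, text):
        """Log of P(spam|text) / P(ham|text), with add-one smoothing."""
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        v = len(vocab) or 1
        total = {k: sum(c.values()) for k, c in self.counts.items()}
        odds = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for tok in self.tokens(text):
            p_spam = (self.counts["spam"][tok] + 1) / (total["spam"] + v)
            p_ham = (self.counts["ham"][tok] + 1) / (total["ham"] + v)
            odds += math.log(p_spam / p_ham)
        return odds

    def is_spam(self, text):
        return self.log_odds(text) > 0
```

The point of the per-user training is that the "ham" side is built from your own mail, so words tied to your interests or job pull a message toward "not spam" in a way a spammer can't easily test against in advance.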

    2. Re:What if... by chrj · · Score: 1

      The people targeted by spam emails are not the ones who know how to set up filtering software, so going through this procedure isn't needed.

  35. That could be heaven for spammers.. by heytal · · Score: 4, Insightful

    The archive could give them a lot of valid email addresses...

    Consider this one: You forward a spam to submit@spamarchive.org. The forwarded mail is now a part of the archive. Spammers snoop the archive for email addresses.
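One way a submitter could reduce that risk, sketched under the assumption that the archive would accept scrubbed samples (the site says nothing either way): replace every address in the message with a placeholder before forwarding it. The regex is a loose approximation of an e-mail address, not an RFC-grade parser, and `scrub_addresses` and the placeholder address are made-up names.

```python
import re

# Loose pattern for things that look like e-mail addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def scrub_addresses(message, keep=()):
    """Replace every address in message except those listed in keep."""
    keep_lower = {a.lower() for a in keep}

    def repl(match):
        addr = match.group(0)
        if addr.lower() in keep_lower:
            return addr
        return "redacted@example.invalid"

    return EMAIL_RE.sub(repl, message)
```

The trade-off is that this also wipes the spammer's own addresses, which a researcher might want; passing them in `keep` preserves them.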

  36. Re:Top 20 spammers in the country. by kubrick · · Score: 2

    Interesting, Informative? A 4? For a troll's in-jokes?

    Bah, I say.

    --
    deus does not exist but if he does
  37. Does that mean... ? by Noryungi · · Score: 2

    I can send them a copy of all the awesome, truly fantastic offers that arrive in my mailbox? =)

    Oh, the joy! 300 copies of "make money fa$t", "enlarge the size of your penis" and "Amazing investment opportunities", delivered lovingly every day to this archive, to be preserved for the good of humanity forever more!

    (Clicking hysterically on the "forward" button...) ;)

    --
    The right to offend is far more important than the right not to be offended. (Rowan Atkinson)
  38. Re:Top 20 spammers in the country, or just a troll by DarkSkiesAhead · · Score: 2


    The parent is a troll, folks. This same email list has been posted to multiple discussions, probably by the same loser. I'd really like to see moderators be a little more judicious. A quick search on wired.com turns up nothing looking like the supposed article. This is completely fake and some of those names should look familiar (but not for spam). Will someone more reasonable please mod this one down?

  39. Spam archive and stats by minesweeper · · Score: 4, Informative
    If you're looking for 5+ years of archived spam and plots of spam volume versus time, check out this guy's site.

    His page of graphs shows the exponential growth of spam over the past few years.

  40. Bandwidth friendly Spam.... by hughk · · Score: 2
    Just think, instead of sending you yet another suggestion to partake of the latest penis enlargement scheme, they could just send you a URL pointing to the appropriate message in the archive. I'm sure many recipients would be a lot happier if they received a URL rather than a 1K message. Microsoft's Outlook would be nice and friendly too and probably display it without prompting.

    Of course, it would make filtering easier too.....

    --
    See my journal, I write things there
  41. Good idea by arvindn · · Score: 3, Interesting


    Aside from all the bashing these guys are getting here for not having any working code, this kind of database would actually be quite a good idea.

    One main problem for anti-spam is this: humans are very good at telling spam from legitimate messages. Computers are nowhere close. Why not? Well, humans are simply better at certain types of problems like pattern recognition because of centuries of evolution. But there are ways around this: genetic algorithms and neural nets are two that I can think of. Both of these are "learning" strategies and need large databases to get started. We're talking about billions of messages or more, not the hundreds that you get every day.
    So the kind of database (one for spam, one for non-spam) that these guys are talking about would be an excellent way to develop intelligent spam-detectors.

    Sorry if this is an unpopular opinion, but we are against legal and in favor of technological solutions for most of the problems of the internet, aren't we? Then why are we waiting for anti-spam legislation to fall like manna from the sky? The best way to fight spam is using technology. Methinks this is a step in the right direction. So get off your ass and contribute. Forward your spam to them. Think of clever algorithms that can make good use of a large database. And code them. And submit patches. Isn't that what open source is for? Hey, maybe this is going to be a killer app for open source, considering how big a problem spam is going to be in the next few years :)

    1. Re:Good idea by Debillitatus · · Score: 2
      problems like pattern recognition because of centuries of evolution

      Just centuries, you say?

      --

      Come on, give it up, that's

  42. Launched? by jsse · · Score: 1
    Download Archives
    Spam Archives
    • (coming soon)
    Non-spam Message Archives
    • (coming soon)

    Anti-Spam Community Links

    check back as we create a resource page for the anti-spam community.





    Exactly what's the definition of a 'launch'?
    1. Re:Launched? by Anonymous Coward · · Score: 0

      Apparently a page that says "the site is launched"
      It's all pretty lame if you ask me...

  43. Geekiness by EuroChild · · Score: 2, Funny
    "... a few small personal spam archives that were used..."

    Geekiness has reached a new high! Or should that be low...?

    --
    Does this make my brain look big?
  44. IDIOT MODERATORS MOD THIS TROLL DOWN! by Anonymous Coward · · Score: 0

    You guys are a bunch of frickin' sheep.

  45. Benchmarking "False Positives" by gwappo · · Score: 3, Insightful
    It would seem to me that the value of such a repository is limited if all it contains is spam.

    If anyone writes an anti-spam tool, I need to distinguish between spam and non-spam, making non-spam equally valuable for spam-filter benchmarking.

    Having a log with only spam makes it quite easy to achieve a 100% benchmark (simply reject it all!).

    Couldn't find anything about this on the site, so unless I'm missing something, the value of such a log is limited at best.
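A quick sketch of why that is, with a made-up `benchmark` helper and a toy corpus: a "filter" that rejects everything scores perfectly on a spam-only corpus, and the flaw only shows up once non-spam is included.

```python
def benchmark(classify, corpus):
    """corpus: list of (text, is_spam) pairs.

    Returns the fraction of spam caught and the fraction of non-spam
    wrongly flagged (None when the corpus has no messages of that kind).
    """
    spam = [text for text, is_spam in corpus if is_spam]
    ham = [text for text, is_spam in corpus if not is_spam]
    caught = sum(1 for text in spam if classify(text))
    wrongly = sum(1 for text in ham if classify(text))
    return {
        "spam_caught": caught / len(spam) if spam else None,
        "false_positive": wrongly / len(ham) if ham else None,
    }

def reject_all(text):
    """The degenerate filter: everything is spam."""
    return True
```

On a spam-only corpus `reject_all` gets `spam_caught` of 1.0 and there is simply no false-positive number to report; add a single legitimate message and its false-positive rate is 1.0.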

  46. Not intended purpose by 0x0d0a · · Score: 4, Informative

    This isn't like Distributed Checksum Clearinghouse or some other spam *solution*. It's intended to test what percentage antispam tools get right -- false positives and negatives. It's useless (at least directly) to end users.

    So unless your antispam tool breaks on some names in personalized letters, I would think that it's okay.

  47. Re:Top 20 spammers in the country. ??? by Anonymous Coward · · Score: 0

    Hey, lovechild!

  48. I'll see you, and I'll raise u US$ 5.= by Anonymous Coward · · Score: 0

    I was about to add a comment in the general idea of above post, but he pretty much sums it up.
    Having a 100% spam archive is pretty useless as a base-standard for doing tests. U need to have at least as many REAL mails (I made up the amount,but u get the idea) for a base to work from.
    It's easy to identify all mail as spam, it's much harder to identify the real mail in between. This probably calls for a different approach as well. Rather than looking for spam, you can try to filter out real mails.

    Do not reply to this with technical stuff. I am not THAT technical, and probably the latter suggestion I made is not technically feasible.

    But anyway the main point is: a 100% spam archive doesn't seem that useful to me..

  49. Ummm, yeah... by Anonymous Coward · · Score: 0

    <head>
    <meta http-equiv="Content-Language" content="en-us">
    <meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
    <meta name="GENERATOR" content="Microsoft FrontPage 4.0">
    <meta name="ProgId" content="FrontPage.Editor.Document">
    <title>spama rchive</title>
    </head>
  50. Won't make a difference by ch-chuck · · Score: 2, Interesting

    You might as well start up a database to catalogue all the different shapes of sand on the seashore - largely useless exercise in futility.

    What people are starting to do is block EVERYTHING that isn't on a 'whitelist'. That way granny and Junior don't get mail from anyone unless they're pre-approved. If they get mail from J.Random Stranger it's bounced with a request to put a short random token in the subject line. Thanks to marketing a good third of Internet mail traffic is useless crap. Thanks marketers!
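Roughly how such a whitelist-plus-token scheme could work, as a sketch only. The class and method names are invented, and actually sending the bounce message is left out; `handle` just reports what the mail server should do.

```python
import secrets

class ChallengeWhitelist:
    """Deliver mail from approved senders; challenge everyone else."""

    def __init__(self, approved=()):
        self.approved = {a.lower() for a in approved}
        self.pending = {}  # sender -> token we asked them to echo back

    def handle(self, sender, subject):
        """Return ('deliver', None) or ('bounce', token_to_request)."""
        sender = sender.lower()
        if sender in self.approved:
            return ("deliver", None)
        token = self.pending.get(sender)
        if token and token in subject:
            # Sender echoed the token in the subject line: approve them.
            self.approved.add(sender)
            del self.pending[sender]
            return ("deliver", None)
        # First contact (or wrong token): bounce and ask for the token.
        token = token or secrets.token_hex(4)
        self.pending[sender] = token
        return ("bounce", token)
```

A stranger's first message is bounced with a token; if they resend with the token in the subject, they are added to the whitelist and later mail goes straight through. Bulk mailers that never read bounces never get in.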

    To show just how evil and desperate unemployed, cash strapped, deep in debt spawns of satan those people are - yesterday I got a letter from my mortgage holder, Chase Manhattan bank, marked "IMPORTANT ACCOUNT DOCUMENTS ENCLOSED". It turned out to be yet another credit card pitch. ("You qualify to give us even more money!!") Bastards. It's not my fault the Msft office automation vision they bought into turned out to be way more expensive than the sales flak led them to believe.

    I wish unemployed marketers would turn to prostitution and drugs instead of spam - at least they'd be supplying things people actually WANT.

    --
    try { do() || do_not(); } catch (JediException err) { yoda(err); }
  51. Re:Top 20 spammers in the country. by isorox · · Score: 2

    MOD DOWN PARENT!

    McPherson, Craig, doesn't look like a spammer - I remember him from a couple of years ago at LNO. He's a decent troll.

    trollmastah@hotmail.com - Really, one of the country's top spammers with a hotmail address?

    *@adequacy.org - that well known site isn't spam central.

  52. As Admiral Ackbar says... by imag0 · · Score: 4, Funny

    It's a trap!!!

    1) Set up story about new site accepting spam to assist in creating better anti-spam tools.
    2) accept all the submissions from the teeming millions(tm) at a popular tech site or two.
    3) cull all the email addresses from those duped to forward spam to you.
    4) sell said email addresses to spammers.
    5) PROFIT!!!!

    1. Re:As Admiral Ackbar says... by Walterk · · Score: 3, Funny

      Why don't we simply subscribe them to all those spam lists? They get their daily spam. You've done your job. The spammers have spammed. Everybody happy.

    2. Re:As Admiral Ackbar says... by Anonymous Coward · · Score: 0

      That would be a cute idea.

    3. Re:As Admiral Ackbar says... by Anonymous Coward · · Score: 0

      5) PROFIT!!!!

      Um no. Can't you see that it's SpamArchive.org? That's prima facie evidence that it's non-profit.

      Get a clue!

    4. Re:As Admiral Ackbar says... by Anonymous Coward · · Score: 0

      Some .com's can be non-profit too. Just Ask Jeeves about that!

  53. An easy way for the creators to attract SPAM by Anonymous Coward · · Score: 0

    would be simply posting the email address here, and on alt.sex.*

    then let the email scrubbing bots do the rest.

  54. I think this is just going to make spam more annoy by autopr0n · · Score: 3, Interesting

    Call me a cynic, but in my estimation, the only thing effective Spam filters based on content are going to do is make Spam more annoying. Why? Because spammers are going to have the same access to filters that regular people do. All they'll need to do is run their Spam through the filters to check and make sure they pass. In other words, if these Spam filters really work well then it won't be possible to determine what is and isn't Spam by a quick glance at the subject line or formatting of the message. Rather than "INCREDIBLE OPPORTUNITY FOR FAST EAZY MONEY$$$$$$$$$5390ANFP9O" and "HOT HORNY SLUTS WANT TO MEAT YOU" we'll get stuff like "Dude, check this out!" with a body like "hey man, long time no see. What have you been up to? I've just been hanging out, not too exciting, although I met this cool chick off the 'net. Hrm, you still looking for a gf? You should check out FriendFinder.com :). Anyway, talk to you later, bro."

    And you'll need to read the whole message before you realize it's Spam.

    You might not like to believe it, but spammers (or at least some spammers) are hackers, in both senses of the word. ESR's supposed "hacker ethics" are as much bullshit as anything else he says.

    The only way these things will work is if the vast majority of people do not use these things. I don't know how likely that will be, with MSN already promoting its 'less Spam' features.

    I think what we need is a fundamental change in the way email is handled. The current system is just way too prone to abuse, and should be replaced entirely. The new standard could use things like digital certificates and other technology to make sure you're talking to an individual (while protecting anonymity in some cases, although the receipt of anon email could be optional, etc, etc)

    --
    autopr0n is like, down and stuff.
  55. Maybe I'm being cynical.... by Maddog+Batty · · Score: 5, Interesting

    If you were a spammer and wanted to collect a large number of valid email addresses, how about this as an idea...

    1) Produce a website pretending to be antispam.

    2) Ask people to send their spam emails to the site (generally including a valid from address of course)

    3) Publish on slashdot so as to get lots of interest.

    4) ???

    5) Profit!

    (Unfortunately, we all know what stage 4 is for spammers...)

    --
    wot no sig
    1. Re:Maybe I'm being cynical.... by Anonymous Coward · · Score: 1, Informative

      If it is just a ploy to get addresses, avoid the trap by using a DEA (disposable email address); emailias.com, sneakemail, spamex, etc.

    2. Re:Maybe I'm being cynical.... by gorbachev · · Score: 1

      It wouldn't be the first time spammers are trying to masquerade as anti-spammers.

      This is why it is important that people know who is running this show, how they are running it and what exactly are they planning on doing with the spam archives.

      The whois record shows someone by the name of Rajan Guru with an Email address on ciphertrust.com, which seems to be in the business of "Email Security" and makes a serverside spam filtering product called IronMail.

      Why aren't they telling on spamarchive.org it's them that's behind the operation?

      Are they trying to develop a new commercial product or enhance IronMail based on what people submit to spamarchive.org?

      Proletariat of the world, unite to kill spammers

      --
      In Soviet Russia, I ruled you
    3. Re:Maybe I'm being cynical.... by Anonymous Coward · · Score: 0

      IronMail is a highend 25000 box. That said, there's nothing wrong with them using it for their own anti-spam software, as long as others can share it too.

  56. Spam works. by Big+Mark · · Score: 2

    Think about it: while 99.999...n...9% of spam mails are either deleted before they're read or shunted into a "Spam" folder, there will be enough Internet newbies / technology imbeciles / other non-slashdotters ;=) who think that unsolicited emails can be a cure to their debt problems / small penis / whatever.

    So long as enough people are suckered by the adverts, the spammers get enough to pay their bandwidth bills, and they can continue to spam us.

    What's needed is education for the naive: just ignore unsolicited adverts. TOTALLY. I mean, when was the last time you opened a credit card mailshot? Or one of those "Especially for you" things in real life?

    Exactly. Trial by error is not a good learning solution for spam. It should be mandatory that all ISP sign-up procedures inform new customers that any unsolicited emails can safely be ignored, hopefully that way the spam industry will start to wither and die.

    -Mark

  57. Is it me or by zBoD · · Score: 2, Insightful

    it is exactly the same thing as www.spamrecycle.com, which has existed for a long time now?

    BoD

    --
    BoD
  58. Re:Top 20 spammers in the country. ??? by Anonymous Coward · · Score: 0

    Hi Anonymous Coward...

    - Lovechild

  59. What's the point? by brunnock · · Score: 5, Insightful

    What's the point of testing a filter against a database of known spam if you can't test it against a database of nonspam?

    Anybody can write a filter for bulk mail. How do you differentiate between solicited and unsolicited bulk mail?

    1. Re:What's the point? by triptolemeus · · Score: 1

      I can imagine that everyone has his own mailbox of nonspam. So it should not be so hard to get your hands on such a thing.

      --
      The site where: "I'm right, as long as you ignore the things that prove me wrong", became a valid method of debate.
  60. playing cat and mouse by wiggys · · Score: 1
    Fighting spam is like fighting crime, hackers or piracy. For every measure we put in place some spammer somewhere will find a way around it.

    Take piracy, for example. As soon as someone finds a new way to stop people copying games someone else finds a way around it. It's been going on for years, and it's unlikely there'll ever be a way around it (even Palladium will have its holes).

    Today, for example, I received a spam inviting me to "increase the size of my *enis". They are obviously aware that the word "penis" is blocked by many mail systems so they simply found an easy way around it.

    --

    Sorry, but my karma just ran over your dogma.

  61. Need inspiration by Chatterton · · Score: 1

    Great. Lately my inspiration has been failing... With this great source, I could take some good examples for my next piece of spam :-)

  62. Re:Top 20 spammers in the country. ??? by Anonymous Coward · · Score: 0

    Hey, again, Lovechild!!

  63. Too Bizarre by dfn5 · · Score: 2

    I discussed this idea yesterday with my manager. I've been looking at spamhaus over the last couple of days but they don't take spam reports from end users. So I had the idea of setting up a domain for users to forward spam. This spam database could then be used to create an RBL for the most active mail relays. I suppose now I can create the RBL without collecting the spam. :-)

    --
    -- Thou hast strayed far from the path of the Avatar.
  64. Re:I think this is just going to make spam more an by Anonymous Coward · · Score: 0

    cynic

  65. Re:I think this is just going to make spam more an by Walterk · · Score: 1

    Don't know about you, but I check the sender before I read it, and quickly scan for any URIs, to see if it seems meant for me. I don't know of anyone who would send me something like "check this out" anyhow, not by email, that's what IRC and IMs are for.

  66. How to end spam by Permission+Denied · · Score: 5, Interesting
    I've had the same email address for five years, and I receive zero spam. None whatsoever. I also advertise the email address widely (web, usenet, mailing lists).

    How does this work, you ask? I create a new email address each time I give out my email address. We have a sendmail setup that allows you to make "username+foo@example.com" go to "username@example.com" where "foo" is any arbitrary string.

    So, amazon.com thinks I'm "username+amazon@example.com", securityfocus thinks I'm "username+bugtraq@example.com" and so on. Once I receive spam on one of the addresses, it's trivial to write a filter that matches with near 100% confidence ("username+bugtraq@example.com" should only receive messages originating from securityfocus, etc.). Most times, if an address receives a spam, I can just procmail all mail to the address to /dev/null (eg, no complex rules like for the bugtraq example). This also allows me to track where spammers get their lists.

    We use sendmail. Equivalently, qmail allows "username-foo@example.com" and if you own your own domain, just use "foo@example.com".
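
    A minimal sketch of the rule described above, written in Python instead of procmail. The BURNED_TAGS and EXPECTED_SENDERS tables, the function names, and the "oldforum" tag are assumptions made up for illustration; only the "+" and "-" address forms come from the setup described here.

```python
# Hypothetical illustration of per-recipient address tags. The tables and
# names below are assumptions for this sketch, not a real configuration.

# Tags that have leaked to spammers: all mail to them is discarded.
BURNED_TAGS = {"oldforum"}

# The sender domain each tag is expected to receive mail from.
EXPECTED_SENDERS = {
    "amazon": "amazon.com",
    "bugtraq": "securityfocus.com",
}


def split_tag(recipient, separator="+"):
    """Return (user, tag, domain); tag is None if no separator is present."""
    local, _, domain = recipient.partition("@")
    user, sep, tag = local.partition(separator)
    return user, (tag if sep else None), domain


def classify(recipient, sender):
    """Return 'discard', 'suspect', or 'deliver' for one message."""
    _, tag, _ = split_tag(recipient)
    if tag in BURNED_TAGS:
        return "discard"
    expected = EXPECTED_SENDERS.get(tag)
    sender_domain = sender.rpartition("@")[2].lower()
    if expected and not (sender_domain == expected
                         or sender_domain.endswith("." + expected)):
        # Mail to a tagged address from an unexpected domain: the tag leaked.
        return "suspect"
    return "deliver"
```

    Passing separator="-" to split_tag covers the qmail-style "username-foo@example.com" form. A "suspect" result also tells you which site the address leaked from, since the tag names it.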

    I find this advanced filtering stuff fascinating, from a completely academic point of view. I, of course, can't apply any of it since I don't receive any spam, but it's interesting nonetheless. I just read through how the Bayesian filter works. It is very simple: it only filters based on word (token) probabilities. So, it would assign a value to "make," "money" and "fast," but not "make money fast". Seems like you could get much better results if you do something more advanced like Markov chains or a neural net. There's lots of research out there on textual matching, and I'm not sure why people would start out with such a simple algorithm when there may be better things available (where "better" is measured not only by accuracy, but also by training time).
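
    To make the "word probabilities only" point concrete, here is a rough sketch of a per-token filter. It is a generic naive Bayes classifier with add-one smoothing, not necessarily the exact algorithm from the write-up I read; the class name, the tokenizer, and the smoothing choice are all my own assumptions. It also assumes at least one spam and one non-spam message have been trained before is_spam is called.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9$']+", text.lower())


class TokenFilter:
    """Naive Bayes over individual tokens with add-one smoothing."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record the tokens of one message labelled 'spam' or 'ham'."""
        self.counts[label].update(tokenize(text))
        self.messages[label] += 1

    def _log_score(self, tokens, label):
        counts = self.counts[label]
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"]))
        total = sum(counts.values())
        # Log prior plus one smoothed log likelihood per token.
        score = math.log(self.messages[label] / sum(self.messages.values()))
        for tok in tokens:
            score += math.log((counts[tok] + 1) / (total + vocab))
        return score

    def is_spam(self, text):
        """Compare the two class scores; each token contributes on its own."""
        tokens = tokenize(text)
        return self._log_score(tokens, "spam") > self._log_score(tokens, "ham")
```

    Each token is scored independently, so word order and phrases like "make money fast" carry no extra weight, which is exactly the limitation I mean.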

    1. Re:How to end spam by Queuetue · · Score: 1

      I'm not sure why people would start out with such a simple algorithm when there may be better things available

      You're a people, right? Start coding.

    2. Re:How to end spam by elodan · · Score: 3, Insightful
      IMO, all the spam filtering technology we're so busy inventing is missing the point to an extent. It's not so much the problem of finding the spam in your mailbox and having to delete it, as it is to do with the amount of bandwidth downloading the spam eats up.

      You and I resent the time we spend deleting rude/crude/criminal/porno spam, but at the end of the day if you've got broadband you only notice the TIME lost.

      A user using a cheap Linux handheld in India can't afford the bandwidth to download a hundred graphic-rich spams a day.

      Bandwidth costs.

      Shouldn't we therefore be looking at ways to stop the spam being sent, or at least limit the propagation of it by filtering it early in the routing process?
      Unfortunately I'd guess this messing with other people's email would have legal implications, but can we work round it?

    3. Re:How to end spam by CvD · · Score: 4, Insightful

      It is still too much work for me to have to set up a new email address every time I leave it on a website somewhere.

      With an advanced spam filter, you set it up and forget about it...sometimes checking your spamfolder if there are any false positives.

      How do you create new email addresses? Do you have a CGI script interfaced with your alias file or so to easily make new email addresses? That would be useful.

      For me it still is too much work to set up email addresses that way. And you need to start doing this from the beginning, otherwise there will still be an amount of spam that gets sent to your username@example.com address (as is the case with me).

      Cheers,

      Costyn.

    4. Re:How to end spam by beebware · · Score: 1

      He probably has a "catchall" account setup so mail sent to a non-existent alias is automatically sent to the postmaster account instead of being bounced. It's all "tagged" in the headers where it was destined for, but it all goes into the one mail box.
      The only problem with this, that I have experienced, is dictionary attacks and "generic addresses" (such as 'abuse', 'postmaster' and 'webmaster' - most of which you HAVE to accept by the RFCs)...

    5. Re:How to end spam by Masa · · Score: 2
      If I remember correctly - but I might be wrong - Sendmail ignores everything after the '+' sign in the username part of the address. So "abc+def@example.com" is always sent to address "abc@example.com". No need to play with alias file.

    6. Re:How to end spam by CvD · · Score: 2

      That would be very useful. It would mean only having to adjust your procmail filters when spam came through. :-)

    7. Re:How to end spam by kbeer · · Score: 1

      If you want to do this 'one time email address' method, you don't have to do any sendmail munging. Just use spamgourmet (http://www.spamgourmet.com). Once you register (which is free), you can just make up your own addresses like this:

      ANYWORD.NUMBER.SPAMGOURMETNAME@spamgourmet.com

      Then at most NUMBER emails are forwarded to your email address. You know where it came from because you know what ANYWORD you used.
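
      The counting rule is simple enough to sketch. This is only an illustration of the "forward at most NUMBER messages" idea, not spamgourmet's actual code; the class name and the in-memory counter are assumptions, and a real service would need persistent storage.

```python
import re
from collections import Counter

# Matches the ANYWORD.NUMBER.NAME local part described above.
ADDRESS_FORM = re.compile(r"([^.@]+)\.([0-9]+)\.([^.@]+)")


class DisposableForwarder:
    """Forward at most NUMBER messages per ANYWORD.NUMBER.NAME address."""

    def __init__(self):
        self.seen = Counter()  # messages received so far, per local part

    def should_forward(self, recipient):
        local = recipient.partition("@")[0].lower()
        match = ADDRESS_FORM.fullmatch(local)
        if match is None:
            return False  # not in the word.number.name form
        limit = int(match.group(2))
        self.seen[local] += 1
        return self.seen[local] <= limit
```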

    8. Re:How to end spam by dhall · · Score: 1

      With qmail, it's rather easy to setup a "generic" address prefix.

      You can setup email to

      d-*@insertyourdomainhere.com

      Then, when you give out your email address, you can selectively choose a descriptive name.

      Let's say I need to give the New York Times an address, and I want to see if they ever expose my email address.

      I send them d-nytimes@insertyourdomainhere.com.

      The beauty of this solution is that with qmail, by default anything to d-* will be accepted. If this address is exposed, I can make an explicit rule to deny this address, basically decommissioning it.

      I've had to do that with Onsale, when they either sold or exposed my email address. Real has also exposed my email address.

      If you REALLY need to give out your email, it should be a disposable address. With qmail it's also possible to come up with email addresses that expire after a set time.

      The main problem with this solution is that it doesn't SOLVE the issue. It only hides it. The email is still wasting processor time and bandwidth. Unfortunately the only way to fix the problem is to prevent spammers from spamming. As long as network providers feel they can make a buck off this scum, there will continue to be spammers.

    9. Re:How to end spam by chrisvdp74656 · · Score: 1
      I use Courier. For me, it's as easy as:
      ssh server
      echo chris-website: chris >> /etc/courier/aliases/websites
      makealiases

      If you really wanted, you could set up a shell script to do this for you. Eg:
      #!/bin/bash
      # Arguments: $1 = mailbox user, $2 = tag; appends "user-tag: user" to the alias file
      echo "Adding $1-$2: $1 to /etc/courier/aliases/websites"
      echo "$1-$2: $1" >> /etc/courier/aliases/websites
      echo "Updating aliases database..."
      makealiases
      echo "Done."

      Chris

      PS. I also have all mail coming through Courier's SMTP engine run through SpamAssassin, and only get the occasional false positive - and they're usually daily mailings I'm subscribed to, or have been forwarded from fifty other people first.

      --
      09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
    10. Re:How to end spam by SEWilco · · Score: 1
      I just read through how the Bayesian filter works. It is very simple: it only filters based on word (token) probabilities. So, it would assign a value to "make," "money" and "fast," but not "make money fast".

      It would assign a value to "make money fast" if that is the token which you feed it. Simplest is to run the text through any sentence analyzer so you can boil it down to tokens which always have the same components in the same place in the tokens (+, +, etc). It doesn't have to be perfect, just consistent. This takes care of trivial rephrasings such as "how to make money fast".
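
      As a rough illustration of feeding phrases in as tokens, here is a sketch that emits word n-grams. Note that this swaps in plain n-grams for the sentence analyzer mentioned above, which is a simpler technique; the function name and the max_n parameter are assumptions for the example.

```python
import re


def ngram_tokens(text, max_n=3):
    """Return all word n-grams of length 1..max_n as space-joined tokens."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    tokens = []
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            tokens.append(" ".join(words[i:i + n]))
    return tokens
```

      With this, "how to make money fast" and "make money fast" both yield the token "make money fast", so the rephrasing still hits the same entry in the probability table.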

    11. Re:How to end spam by SEWilco · · Score: 1

      (Oh, great. I used < and > characters around "noun", "adjective", and "verb", and they got processed as HTML, leaving the "+" which was between.)

    12. Re:How to end spam by Anonymous Coward · · Score: 0

      In this article they describe a quite nice system that helps with this spam filtering and also helps anonymize registration on websites and does password management, all using strong cryptography. Quite a nice idea, though I haven't used the actual implementation.

      Consistent yet Anonymous Web Access with LPWA

    13. Re:How to end spam by Permission+Denied · · Score: 1

      That's way difficult. I don't do anything with the sendmail setup. I don't have to "create" email addresses. It's just a sendmail rewrite rule that changes "x+y@z.com" to "x@z.com" where "y" is any alphanumeric string. I just give out the address and I know it will get to me - I don't have to ssh in to some server, I don't go to any webpage.

    14. Re:How to end spam by Anonymous Coward · · Score: 0

      I started using bayesian popfile (popfile.sf.net) last week and it has an amazing success rate. Its accuracy is over 99% after the initial learning. It malfunctioned on one spam email in Spanish - the only one in Spanish I have received.
      It's simple, fast and easy enough to be implemented by "the masses". It simply works even though it seems simple and uninteresting from your academic standpoint.

    15. Re:How to end spam by tshuma · · Score: 1

      I think this is a good idea, but not the best!
      But if someone wants to find the best solution for spam email, he should know about your idea!!!

      I mean there are some problem with this way, but it could make very strong and good base for a very new and usable solution!!

      --
      There is only one good solution: The simpliest!
  67. Copyright by rockdreamer · · Score: 2, Insightful

    Spam, like all written text is subject to copyright

    Couldn't the spammers sue for copyright infringement?

    1. Re:Copyright by Anonymous Coward · · Score: 0

      Honestly -- Can a snail-mail advertiser sue you for giving their advertising materials to the local recycling shop instead of framing them on your wall or storing them in your safety deposit box?

      Surely it's only a copyright offense if you keep a copy of the materials when you give the original away..

  68. Are they legit? by Zocalo · · Score: 5, Informative
    Typical of a Slashdot story. Lots of people asking questions when they can find out the answer and post it in the same amount of time.

    According to WHOIS, "spamarchive.org" was registered by one Guru Rajan, who has an email address at "ciphertrust.com". Also according to WHOIS, "ciphertrust.com" has the same person as technical contact and if you check the website you find they are the vendors of "IronMail: The Secure Internet Email Gateway", an established if not well known product.

    In short, yes, it seems legit, and it probably took me less time to find that out than it took the myriad people asking "is it legit" to post the question. ;)

    --
    UNIX? They're not even circumcised! Savages!
    1. Re:Are they legit? by Anonymous Coward · · Score: 0

      You are my god. I want to subscribe to your newsletter.

    2. Re:Are they legit? by bokal · · Score: 1

      Maybe. And according to the CipherTrust homepage Guru Rajan is "Chief Architect & Director of Engineering" so everything looks fine.

      But everything being legit, good, and fine does not in itself make it a good idea.

      Bo

    3. Re:Are they legit? by revery · · Score: 1

      Also typical of a Slashdot story is redundancy, such as you posting this at 6:40, when at 5:26 an Anonymous Coward posted everything you said with additional detail.

  69. Heck. by Anonymous Coward · · Score: 0
    Just threw 30 megs away three days ago...


    BTW isn't this what is called a 'corpus'? You can do some beautiful research with it...

  70. Re:Database? (you get what you pay for) by Anonymous Coward · · Score: 0

    Pfhu! that's just low quality run of the mill bulk mailed average sort of spam. I however am in possession of some serious l337 spam that's just not sent to anyone, and I'm not parting with those lumps of researcher gold, oh no.
    No free lunch for you.

  71. um, why not just use the FTC? by rakerman · · Score: 3, Interesting

    They've got gazillions of messages sent to uce@ftc.gov

    Why not just make that available to the public for creating training sets for spam?

    The idea of a central archive is good, but I don't see why there's a need to reinvent a New! Improved! wheel.

  72. I told you this would work by Anonymous Coward · · Score: 0

    We set up a quick website, tell people we are 'collecting spam' (make up a good excuse) and voila! thousands of _verified_ email addresses, belonging to well connected people with high incomes within hours! Thanks Slashdot!!

    AC

  73. storage is not free yet by rakerman · · Score: 2

    I don't see how this can work. Sure, hard drives get cheaper all the time, but how can they possibly afford to keep up with a wide open "send us spam" request? They'd need petabytes of storage.

  74. revolting by Anonymous Coward · · Score: 0

    that image of a pig merged with a block of spam is quite revolting! Please change it.

    1. Re:revolting by hAN+sHAN · · Score: 1

      what do you think spam is? candy? soy?

  75. Out with the next interstellar sonde... by Anonymous Coward · · Score: 0

    ...nicely burnt into a DVD. Provide the extraterrestrials with a realistic sample of our culture.
    They won't ever bother to contact us any more after that... or they'll zap us in an instant. Either way the world will be a better place :-)

  76. Not what you think. by pucko · · Score: 1

    "spamarchive.org" is registered by "ciphertrust.com".

    Ciphertrust develops and sells spam-prevention software.

    Interesting.

  77. I think it's already been done, but in reverse... by Pendant · · Score: 5, Interesting

    In order to counter the rising tide of spam I recently installed a spamblocker, even though I'm wary of such beasts because of the danger of false positives.

    Sure enough, I have received false positives. But only from one source: my filter traps the Network Solutions email asking for confirmation to proceed with the transfer away of a domain to another registrar. Net$ol changed the format of these emails a while back: they now start off by talking about a "special offer" and it's only towards the end that the real purpose of the message is revealed. My suspicious mind wonders whether these emails are intentionally designed to look like spam to reduce the number of successful transfers... sneaky :(

  78. Legit? by ek_adam · · Score: 2

    Do we know that this is a good site, or is this a devious mechanism to collect the email addresses of everyone who forwards them spam?

    1. Re:Legit? by ichimunki · · Score: 1
      It sounds legit from the whois info, but so what? It's stupid and I'll tell you why:
      1. It is easy to obtain spam without a central database, even so there are repositories out there already, no?
      2. Filtering spam based on matching-style filters is just an arms race with spammers. Proper filtering requires a trainable filter. One man's spam is another man's dearly loved invitation to invest in shaky African government.
      3. They aren't collecting enough to build proper tests if all they are collecting is spam. I can already avoid 100% of my spam by simply throwing every email I receive into a bit-bucket. Sure, I may get a few false positives, but so what-- my filter works great with their spam corpus, right? I can improve my filter by adding a whitelist capability, but that's still less than ideal since I want strangers to be able to send me mail (closed eBay auctions would be a good example of a case where whitelisting is a total PITA). So unless this test corpus also includes valid emails that should not get caught by the spam filter, my tests are guaranteed to be incomplete. And again, one man's valid mail is another's hated spam.
      So, it's not important to look at spam and try to outsmart spammers. That is a losing proposition. It's more important to work on "smart" and easily configured clients. Stock match-based filters just don't work, not for spam, not as nannies for children surfing the web, and not here on Slashdot (witness the problems posting URLs or samples of Perl code).
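
      Point 3 is easy to demonstrate. The sketch below scores a filter against both a spam corpus and a non-spam corpus; the function names and the tiny sample corpora in the usage note are made up for illustration, and both corpora are assumed to be non-empty.

```python
def evaluate(is_spam, spam_corpus, ham_corpus):
    """Return (spam catch rate, false positive rate) for a filter function."""
    caught = sum(1 for msg in spam_corpus if is_spam(msg))
    false_pos = sum(1 for msg in ham_corpus if is_spam(msg))
    return caught / len(spam_corpus), false_pos / len(ham_corpus)


def bit_bucket(_msg):
    """The degenerate filter from point 3: call everything spam."""
    return True
```

      Against a spam-only archive, bit_bucket scores a perfect catch rate. Only the second number, which needs a corpus of valid mail, shows that it also throws away every legitimate message.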
      --
      I do not have a signature
    2. Re:Legit? by rgmoore · · Score: 2
      They aren't collecting enough to build proper tests if all they are collecting is spam.

      It depends on what their purpose is. If I want to train my personal spam filter, I already have a large corpus of non-spam to use- all of the emails that I've saved over the past few years. OTOH, I don't save all of the spam that I get, so I need a training set of spam to use for the other side of things. Since spam is, by its very nature, a bulk thing that's sent out indiscriminately, other people will probably have a spam corpus that is reasonably similar to what I receive, so I can reasonably use an archive of other people's spam to train my own spam catcher. I think that it would have been very useful to have a few thousand standard spam messages when I started using bogofilter.

      FWIW, it also looks as though they are trying to collect an archive of non-spam email. There is a spot on their page where a non-spam archive will be in the future. I'm not sure if I'd want to send them my personal email to put into that kind of a list because some of it includes personal information, but I could certainly see somebody developing an archive of innocuous non-spam mail- mailing lists, legitimate business mail (like "the item you have been waiting for is now in stock" notices), and the like.

      --

      There's no point in questioning authority if you aren't going to listen to the answers.

  79. Yeah, right. by peterpi · · Score: 1
    "There was no large set of spam that could be used to test new anti-spam algorithms"

    Gimme a break, they could have just set up a hotmail account and left it for a couple of hours.

    Sounds like an overenthusiastic noob just got himself a domain name.

    1. Re:Yeah, right. by Anonymous Coward · · Score: 0

      > Sounds like an overenthusiastic noob just got
      > himself a domain name.

      As someone who used to work with Guru while at Interland, you hit the nail on the head.

      Did you used to work with Guru, too?

  80. easy to filter that last one by Anonymous Coward · · Score: 0

    Any email sent to me that calls me 'dude' gets itself automatically deleted...

  81. Re:Top 20 spammers in the country. by peterpi · · Score: 1
    "it's totally real. I also saw it in last month's issue of Wired"

    Oh, it must be true then! :p

  82. Creative Fun-Spam by Anonymous Coward · · Score: 0

    As anyone knows, filling webform as required by some sites leads to the snail spam being sent. Some of these forms are prompting to fill in manually the position you are in. Usually I omit these as they are mostly non-mandatory. However, once I put myself as 'Yellow Snow Developer' and, lo and behold, here is the envelope in my mail addressing me as such.
    Let them entertain the community (and a few post workers)

  83. An old idea... by beaviz · · Score: 2, Funny

    This reminds me of an idea that I've had for some time... spamnewsreportingforthemasses.com - A news site reporting news from spam-sources - sort of like a satirical view on spam.
    "New indian health care enables you to have more lovers"
    "New solution for your economical problems found"

    - and throw in a hoax section too...

  84. For profit? by alech · · Score: 2, Informative

    The domain is registered to Guru Rajan of ciphertrust.com. Funnily enough, Ciphertrust markets a product called IronMail that does (among other things) spam detection. So who says they are really putting the database out once they have it and not use it for their own good?

  85. Everytime someone sends a hotmail email by The+Analog+Kid · · Score: 1

    Tired of spam? Get advanced junk mail protection with MSN 8. http://join.msn.com/?page=features/junkmail and depending on where the sender lives it will spam you in a different language, Direct chatten met je vrienden met MSN Messenger http://messenger.msn.nl

  86. UGH! by Anonymous Coward · · Score: 0

    How lame is this? A simple domain just registered, nobody has a clue where this idiot starting it came from and a simple static html page.

    This made slashdot headlines, because!?

  87. 1) Open Hotmail account
    2) ???
    3) PROFIT!!!

    --
    how does one change his /. id?
  88. Resistant Strains? by Queuetue · · Score: 3, Interesting

    Although spam eradication is a good idea in general, I wonder if bulk training will only result in resistant strains of superspam developing, much like the v-cillin resistant staphs that are popping up lately.

    If we deal with a little spam by hand today, will that keep us from having to deal with undetectable spam later? I can imagine spam systems that probe you (using actual system probes of you and your contacts, marketing history and social engineering) to target spam that you may actually believe is a recommendation for the Sony(tm) handicam from your Uncle Bowser, or really is your wife asking you to pick up some Clorox(tm) brand bleach and fabric softener on the way home...

    Luckily, neither of them is likely to be sending information about my penis to me at work.

    Much like modding the Xbox (and thus giving MS the practice they need to harden Palladium), giving the hard fight to the spammers might just backfire on us.

  89. Frontpage? by tader · · Score: 0, Troll

    Can someone who uses Frontpage to create an anti-spam homepage, be trusted?

  90. try spamgourmet by jqh1 · · Score: 2, Funny
    auto-create disposable addresses at spamgourmet.com.


    not too much work.

    --
    who's moderating the meta-moderators?
  91. The source code by RobertTaylor · · Score: 1



    spamarchive ...oh dear. This has to be a joke?

  92. Next - Spammers Use DMCA to Get SPAMS removed by semprebon · · Score: 3, Funny

    I expect we'll next see Spammers using the DMCA to get their copyrighted SPAM removed from the database...

    --
    Andrew Semprebon EQ Systems Inc.
  93. Spam bank by jhampson · · Score: 1

    I told my wife that I was going to make a deposit at the spam bank. I went like 5 times a day.
    She's amazed and grateful that I can even 'log in' any more.

  94. Re:Top 20 spammers in the country. ??? by isorox · · Score: 2

    Hey, that wasn't me!

  95. What license??? by LinuxParanoid · · Score: 2

    Are they going to offer the content of spamarchive.org under an Open Content license, or is this just another database that will eventually be absorbed and closed to the public by some corporation protecting database copyrights?

    --LP

  96. Congratulations by Anonymous Coward · · Score: 0

    You got the joke!

  97. There are already many spam archives by Richard+W.M.+Jones · · Score: 2, Informative
    You can find many of them listed from my spam archive :-)

    Rich.

  98. Spam after this site is functioning... by bjb · · Score: 1
    Email message basically contains:

    Hello, please refer to article http://spamarchive.org/message.pl?id=298572

    Thanks for reading, and your click gets our site 0.005 cents per hit!


    --
    Never hit your grandmother with a shovel, for it leaves a bad impression on her mind...
  99. Totally different problems. by Erpo · · Score: 2

    Fighting spam is like fighting crime, hackers or piracy. For every measure we put in place some spammer somewhere will find a way around it.

    All problems are not the same - some have solutions and some don't. Take spam and piracy for example.

    There's a system out there right now for spam blocking (I forget the name or URL at the moment, but it's been mentioned before on slashdot) that maintains a whitelist of people that are allowed to contact you, and when it receives an email from a person that is not on the whitelist, it stores that email in a temporary area and emails the sender asking for a confirmation email in return. If the spam-blocker receives a confirmation email (i.e. the actual person gets the return email, hits reply, and hits send as per the directions) then the original email gets through to your inbox. Right now this is a 100% effective spam-blocker. No good email is filtered out, and no spam is let through because spammers forge their return addresses and therefore never get confirmation emails. It has the added bonus of not requiring the user to look through a "junk mail" folder. Implementing this system universally (1) server-side would solve the spam problem. The only way spammers could get through would be to provide actual "from" email addresses which open them up to lawsuits, and (as they have to check incoming messages and reply to them, meaning they have to either host the "from" account themselves or have fast access to a server that does) it would open them up to all sorts of DDoS attacks. Got a 1KB spam email that slipped through with a from address of from@spammer.dynamicdnsservice.com? Hit that ever so satisfying "Can The Spammer" button and blast spammer.dynamicdnsservice.com with 100KB of data. The more spam the spammer pushes out, the more clogged its downstream pipe gets.

    (1) Ok, not this system, as a spammer could always find out who your friends are and put their email addresses in the from: header, but a system based on public key cryptography would do the job nicely. That would mean client-side software updates and a protocol change, but it's still a solvable problem.
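
    A rough in-memory sketch of the whitelist-plus-confirmation flow described above. It is not the actual system I'm thinking of (I don't remember its name); the class and method names are assumptions, no mail is really sent, and it trusts the from: address, so it has exactly the forgery weakness noted in (1).

```python
class ChallengeResponseInbox:
    """Hold mail from unknown senders until they confirm."""

    def __init__(self, whitelist=()):
        self.whitelist = {addr.lower() for addr in whitelist}
        self.pending = {}     # sender -> list of held message bodies
        self.inbox = []       # delivered (sender, body) pairs
        self.challenges = []  # addresses a challenge would be mailed to

    def receive(self, sender, body):
        """Deliver mail from known senders; hold and challenge the rest."""
        sender = sender.lower()
        if sender in self.whitelist:
            self.inbox.append((sender, body))
            return "delivered"
        if sender not in self.pending:
            self.challenges.append(sender)  # challenge once per sender
        self.pending.setdefault(sender, []).append(body)
        return "held"

    def confirm(self, sender):
        """Called when a reply to the challenge arrives from the sender."""
        sender = sender.lower()
        for body in self.pending.pop(sender, []):
            self.inbox.append((sender, body))
        self.whitelist.add(sender)
```

    Mail with a forged return address never triggers confirm, so it simply stays in pending and can be expired later.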

    Now, take a look at piracy. There is a property of information (or data, or bits, or whatever you want to call it) that is so absolute and inviolable that I would go so far as to call it a law of the physics of information. It is: The only way to control the distribution of information is to ensure that the people and machines that have access to that information all agree to control its distribution. That's it - think about it. It means every technology-based digital restriction mechanism can be broken. (2) Yeah, you could put telescreens in all homes and watch everyone 1984 style, but that's a very poor solution. The best way to deal with "piracy" is to stop thinking along the lines of trying to control information like a physical good and find an alternative business model. No endless wasteful competition between DRM designers and hackers, and no more buying expensive DRM snake oil for businesses.

    (2) Yes, even palladium can be broken. Here's an easy three-step process for breaking a palladium system:

    (1) De-solder the TCPA components from the motherboard except the CTRM (yes, including the cpu if necessary), attach them to an add-in pci card along with a power connector (again, if necessary) and a pci interface chip that talks to the bus and simulates a CTRM that has "measured" a trusted system.

    (1.5) Not really a "step". Design and fabricate the above chip.

    (2) Write a kernel level driver for the OS of your choice that diverts calls to the trusted hardware subsystem in loaded applications to calls to the driver itself which simulates the trusted subsystem. Any time it needs a "Yes, I am a trusted system." certificate signed, the driver should call upon the pci card to perform this function. (Yes, you can install your own drivers. You just have to boot your system in untrusted mode [where applications would normally not receive services from trusted hardware])

    (3) Download "protected files" and let your trusted applications happily place them (in encrypted format) on your hard disk. When you want to directly access the unencrypted data, snag the decryption key directly from the driver.

    Yeah, it's complicated, and not all people have the necessary skills to pull it off, but keep in mind that:
    *It only has to be done once to release information from DRM jail and make it available to anyone.
    *Once the step 1.5 chip has been designed and the driver written (along with a userspace "data recovery" tool), they can be sold fairly easily as the equivalents of "mod chips" in game consoles.

    Two last important notes:

    *Yes, I've read the TCPA specs and I know this will work. If you would like to verify this for yourself (a smart move), they're freely available for download in pdf format from the TCPA web site.

    *This does not mean palladium can be safely ignored - quite the opposite. When the only legal way to access certain content and services is an attempt to violate the physics of information by a single convicted but unpunished monopoly, everyone is in trouble. I'm sure you can think of other terrible consequences, but here's something to get you thinking in another direction. What will happen when everyone trusts the "Trusted Computing Platform Alliance" enough to put their personal (medical, financial, etc...) information into the system?

  100. It helps but only a little by AUsBandit · · Score: 1

    because spam has 'evolved' into a more dynamic form. Spammers now append random letters and characters to emails so they don't match these filters. So then the filters 'evolve' to match the patterns of randomness. It is an evolving game back and forth. But all in all it remains simply pattern recognition and pattern generation done by computer. If a computer can make it then a computer can un-make it. So really we will never be through with spam until we take a less 'trusting' approach to email. By that I mean we all have to adapt to a more opt-in method of receiving mail.

  101. Tagged addresses by gorbachev · · Score: 1

    Tagged addresses (na+foo@example.com, "foo" is the tag) are automatically routed to the correct Email address (in the case of my example to na@example.com).

    There is no need to set up any new email addresses.

    I use it all the time. Too bad many online vendors do not allow me to enter the '+' sign on their registration forms.
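
    For anyone who wants to see what the mail server is doing with the tag, here is a small sketch. The function name `split_tagged_address` is made up for illustration, and it assumes the address contains an "@" and uses "+" as the tag separator, as in the example above.

```python
def split_tagged_address(address):
    """Split 'user+tag@domain' into ('user@domain', 'tag').

    Returns (address, None) when there is no '+' tag.
    Assumes the address contains an '@'.
    """
    local, _, domain = address.rpartition("@")
    base, sep, tag = local.partition("+")
    return f"{base}@{domain}", (tag if sep else None)
```

    Giving each vendor its own tag means that when spam arrives, the tag shows which vendor leaked or sold the address.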

    Proletariat of the world, unite to kill spammers

    --
    In Soviet Russia, I ruled you
  102. I will miss spam... by leecho · · Score: 0

    I surely hope this project gives developers a great tool to fight spam, and I wish best of luck to them! But there's still that deep feeling in me that says that I will miss my ~100 spams a day. I was so used to seeing messages like "MAKE MONEY FAST!" that I think a really (or nearly) perfect spam filter will make my inbox so empty... :'(

  103. Spam-chu, I choose you! by dynayellow · · Score: 1

    Gotta delete 'em all.

  104. SpamArchive.org - sounds like a Can of Spam to me by hotkoolaid · · Score: 1

    Maybe they should rename it CanofSpam.org? ;-)

    --
    koolaid
  105. Large collection of legitimate e-mail needed more? by tschild · · Score: 2, Insightful

    I don't think that a large archive of spam is hard to come by. You don't need to publicly invite submissions either - just acquire a domain and hosting with catchall e-mail service, set up e-mail forwarding to an address for your database, then publish several addresses under that domain where spammers are bound to pick them up (newsgroups, FFA lists) and register them with services that sell their e-mail lists with a lot of different demographic information vectors. You'll get as much input as you have a use for.

    For calibrating spam filters you'll probably only want spam from the last few months as spam does evolve - e.g. it's mostly herb*l vi*gra these days.

    What is at least equally needed but much harder to come by is a large, representative collection of legitimate e-mail, to test spam filters for false positives. This collection would need to cover diverse languages, cultures and contexts (private, business/x-industry, business/y-industry, system error messages, automatic notification messages etc.)

    What is hard about this collection of legitimate e-mail is that the privacy of both sender and recipient is affected, and that, if confidential information is masked or deleted, the e-mail isn't the original one and spam filters might evaluate it differently.

    There is one subset of legitimate e-mail available: public archives of mailing lists. But these e-mails don't cover the style of e-mail in other contexts.

  106. Re:I think this is just going to make spam more an by Ari+Rahikkala · · Score: 1

    Moreover, as far as I know, most of us, when teaching our relatives to use e-mail (hey, the revolution has to propagate somehow), also teach them how to quickly spot and delete spam. That is, we teach them to think "Joe Computer Professional said it would be better for all of us if I just ignored spam". But if filters are installed everywhere, spam will become a different beast - spammers will write messages that get through filters and are thus inevitably also harder for humans to recognize as spam. Because of this more people will read it and also more people will buy things from spammers. Ergo, spam will become more profitable because filters will force spammers to be less stupid.

    And I know that this is not an impossible prediction. Thanks to a little care with my e-mail address (and I mean a little - I don't even have a scratch account for use at the less reputable parts of the 'net I visit), the fact that my main account is not with a big provider and probably most importantly SpamAssassin, I receive very little spam - less than one message a month. But I once got a rather long message that I had to read through twice and visit the URLs given in the message because I couldn't figure out whether it was a legitimate mail, bogus randomness by an insane businessman or true spam. I wouldn't want to have more of those taking my time.

  107. FALSE STATEMENTS by mgkimsal2 · · Score: 4, Insightful

    ... and I receive zero spam

    Once I receive spam on one of the addresses...

    I also advertise the email address widely ...

    So, you receive no spam, but when you do receive spam, you edit procmail. Which is it?

    Also, you widely advertise your email address, but you don't actually use your email address, but made-up aliases. Which is it?

    You're simply masking the problem, and going thru a moderate amount of gyrations (which most average joe 'net users won't/can't go through) to do so.

    1. Re:FALSE STATEMENTS by Anonymous Coward · · Score: 0

      you receive no spam, but when you do receive spam, you edit procmail. Which is it?
      you widely advertise your email address, but you don't actually use your email address, but made-up aliases. Which is it?


      Bingo.

      This guy is full of shit.

      "I don't receive spam because when I receive spam..."

      Jesus, how stupid can one person be?

  108. What About Spamnet? by duncan7 · · Score: 1

    Cloudmark makes SpamNet, a P2P plugin for *gasp* Outlook, that allows users to submit spam messages to a database, where an algorithm integrates the submissions into a master spam list that gets published back to the clients, which then pull messages out of users' inboxes as they arrive. (Works pretty well, too.) I should think their DB would be a good place for this effort to begin.

  109. Mail account by Anonymous Coward · · Score: 0

    Another idea to catch up some spam is to write a message in a newsgroup, using a real email address.
    Surely you'll fill the mailbox quickly.

  110. This is not new by jonadab · · Score: 1

    I already _have_ a large repository of spam in a set of folders
    in my mail repository. The US FTC already has a _huge_ repository
    of spam. The news.admin.net-abuse people have a positively
    *enormous* repository of spam from both email and usenet.

    Anyway, a large repository of past spam is not really what you want
    for testing anti-spam solutions, because spammer tactics keep on
    changing. It used to be that a whitelist solution could trip on
    unrecognised From: fields, but now they're using the same From:
    field for everyone. It used to be that you could filter by the
    IP address of the mailserver used, but these days the mail servers
    migrate constantly across entire Class B networks. It used to be
    that you could filter based on subject lines with lots of digits
    at the end, but these days they're using random sequences of
    letters, and if you filter based on that they'll switch to Markov
    chains, which are simple to create and AI-complete to recognize.
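
    To show how little code the "simple to create" half takes, here is a
    rough word-level Markov chain sketch. The function names and the sample
    text are invented for illustration; nothing here is taken from any real
    spamming tool.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in `text`."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length, rng=None):
    """Walk the chain from `start`, emitting at most `length` words."""
    rng = rng or random.Random()
    word = start
    out = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break  # dead end: no word ever followed this one
        word = rng.choice(followers)
        out.append(word)
    return " ".join(out)


chain = build_chain("make money fast make money now make friends fast")
print(generate(chain, "make", 6))
```

    Every adjacent word pair in the output also occurs in the source text,
    which is why filters keyed on random-looking letter sequences would not
    catch it.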

    For anti-spam testing, what you want is a mail account that never
    gets anything sent to it _except_ spam, for which you can create
    infinite alias addresses and release them in strategic places.
    (You start by designating addresses starting with u as having been
    released only on usenet, generate a few thousand addresses that
    start with u, and use them in the From: fields of a bunch of posts
    to test newsgroups.)

    You have to be constantly getting _new_ spam for testing. The old
    stuff will give you a false sense of how well your stuff is working.

    --
    Cut that out, or I will ship you to Norilsk in a box.
  111. Counter productive by jtougas · · Score: 1

    Who do you think this will help most? The people making anti-spam software, the people sending spam, or the clever ones that send spam telling you to buy anti-spam software?

  112. U n i v e r s i t y D i p o l m a s by gatkinso · · Score: 3, Funny


    Get your now! You gate to betta rife. Moa pay, wok wess.

    www.dipwomas.tw

    --
    I am very small, utmostly microscopic.
  113. Finally ! by jomagam · · Score: 1

    There is an OSS project I can contribute to !

  114. archive of spam not all that useful by pigpen_ · · Score: 2, Insightful

    A "standard" archive of spam might work great for benchmarking rule-based filters against each other, but adaptive filters, like the popular Bayesian kind, work best when they learn on your own emails and spams. There's also no point in testing an adaptive filter when you can't also feed it non-spam emails.
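
    A toy naive Bayes scorer makes the point about needing both kinds of mail concrete. This is a simplified sketch, not the algorithm of any particular filter: the class name, the whitespace tokenizer, and the add-one smoothing are all assumptions chosen to keep it short.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Minimal word-count naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record the words of one message under 'spam' or 'ham'."""
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def spam_log_odds(self, text):
        """Return log P(spam|text) - log P(ham|text); positive means 'looks like spam'."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        denom_extra = len(vocab) + 1  # +1 leaves room for unseen words
        spam_total = sum(self.word_counts["spam"].values())
        ham_total = sum(self.word_counts["ham"].values())
        # Smoothed prior from the number of messages seen in each class.
        score = math.log((self.message_counts["spam"] + 1) /
                         (self.message_counts["ham"] + 1))
        for word in text.lower().split():
            p_spam = (self.word_counts["spam"][word] + 1) / (spam_total + denom_extra)
            p_ham = (self.word_counts["ham"][word] + 1) / (ham_total + denom_extra)
            score += math.log(p_spam / p_ham)
        return score
```

    Trained on spam alone, the ham counts stay at zero, so every word the filter has seen pushes the score toward spam. The score only becomes meaningful once real non-spam is fed in as well.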

    --
    Zambozay! My brain must've been eatin' a sandwich!
  115. Service is already available on the windows side by terradyn · · Score: 2, Informative

    Ok... for the people that still use Outlook, this exact service is provided by a company called CloudMark. The address is Spamnet.com. I've been using it for some time and it seems pretty robust. A community basically earmarks spam messages and based on votes a piece of spam gets moved to a spam folder on retrieval. Nothing is ever deleted.

  116. news.admin.net-abuse.sightings already exists... by tskirvin · · Score: 2, Informative
    I've moderated a Usenet newsgroup that does this kind of stuff for the last six years now (since Nov 1996). (Yes, I know others have stated some of this stuff, but it's worth mentioning it again.)
  117. Copyright Infringment Here? by limekiller4 · · Score: 2

    Can the spam writers claim copyright infringement?

    --
    My .02,
    Limekiller
  118. The FTC Already Has Such an Archive by WayTooOldForThis · · Score: 1

    Your tax dollars have already funded a huge archive of spam at the Federal Trade Commission. In fact, they are running out of room to store the stuff. The FTC says they can't release the contents because of privacy concerns, but surely there is a way around this: xxx out receivers' email addresses; apply secure aliases to protect the innocent, etc.

  119. spam submittal to multiple engines? by aagha · · Score: 1

    Is there anything out there that will let you submit the daily spam you get to Razor, SpamAssassin, and SpamArchive?

    Right now, I use Pine and I can: 'razor-report -d' my spam (speaking of which, I've not been getting any mail caught by Razor for the last couple of days).

    If someone could tie all these puppies together so that every e-mail I receive goes through these filters, I would get even less spam.

  120. I like Spam, spam keeps me warm by Anonymous Coward · · Score: 0

    Some of us like spam, some of us don't have a life and we feel loved by the quantity of spam we receive. It's like mail order catalogs for your mother. Long live spam!!!

  121. Good point actually. by EricHsu · · Score: 1
    This is actually a legitimate question. I believe they can, as distribution of advertising is still covered by copyright. (Witness the sad demise of adcritic.com and people who try to host Apple ads.) The question would be whether this use of spam was "fair use", because it was for research.

    The second issue is whether it's covered by the laws that supposedly protect e-mail conversations. I've been socialized on Usenet to believe that it's illegal to publicly post pieces of private e-mail. I've never seen this law, so perhaps it's merely socially condemned practice. That would be an interesting question too, to see if spam can be considered "private e-mail". Since the same e-mail is sent to millions, it probably isn't. But what if it were personalized? Would it count then? Interesting issues to spice up media law...

  122. copyright infringment? by f64 · · Score: 1

    how long before one of the spammers sues the site for copyright infringement for making
    publicly available the end results (ie spam) of their hard and honest work?

    i'm willing to bet my two cents that the spammers will win the case.

    ---
    i'm not paranoid, just scared of 'them'.

  123. can community anti-spam last? by EricHsu · · Score: 1
    I love the idea as much as anyone, but a service that is based on sheer voting-by-message cannot last. Right now spammers send out exact copies to everyone, so this works. If this service becomes too effective, wouldn't spammers move to personalizing their e-mails with enough text changes to fool the message matching?

    Now if you move to a statistical method, there is the issue of training your filter. By the nature of the statistical method, it may well be more accurate if you train it, as opposed to the masses. Why? Because your pool of Ham (non-Spam) is going to have distinct characteristics that will help avoid false positives for you (but maybe not someone else). If a community trains it, then on average, it may be that the Ham becomes less distinguishable from the Spam.

    On the other hand, this second point is an empirical claim. It would probably be relatively easy to do a little study of this. Get some 100 people to share statistics on their Ham and Spam (not the actual messages). The researchers see if the aggregate generated filtering is better than the individual ones. Nobody's privacy is (too) compromised.

  124. Algorithm testing issues by JoeBuck · · Score: 2

    To be usable for algorithm testing, the spam database would need to be divided into a "training" set and a "testing" set. Algorithms would need to be tuned based only on the training set, and tested on the testing set. Otherwise any stats obtained will be over-optimistic, as the algorithm might be deliberately or accidentally tuned to work really well only with the particular messages in the training set.
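
    A rough sketch of such a split, assuming the archive can be loaded as a plain list of messages. The function name `split_corpus`, the 20% test fraction, and the fixed seed are illustrative choices, not anything SpamArchive.org has published.

```python
import random


def split_corpus(messages, test_fraction=0.2, seed=0):
    """Shuffle once with a fixed seed, then return (training, testing) lists."""
    shuffled = list(messages)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

    Fixing the seed keeps the split reproducible, so different tools can be compared on exactly the same held-out messages.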

  125. Re:Maybe not quite right... by wiresquire · · Score: 1

    Spammers:
    1) Download spam archives
    2) Download tools to fight spam
    3) Generate new spam that doesn't get caught by tools in 2)
    4) Profit

    --

    So does Anonymous Coward have good karma?

  126. Hottest new site has tons of XXX e-mail by Anonymous Coward · · Score: 0

    The hottest site on the net!

    SpamArchive.org

  127. bogus email harvesting by zogger · · Score: 2

    --been following this spam problem for awhile. One of the ideas I have seen that seems to me to have a more proactive approach to it is to poison the spammers' email lists on purpose by using their own robots against them. Instead of trying to build filters and generate lists of IP's to block and etc, wouldn't it be better to create masses of webpages that contain nothing but zillions of bogus but good looking email addy's? From what I understand it's expensive for the spammers to send out huge numbers of spam emails, the profit margin is slim. This idea might knock it to the mass-zero level for most of them as it would become unprofitable for them to be in that business. If thousands of websites had a page of bogus emails, and they were different, then eventually the spammers' harvested lists would be filled with mostly useless emails and the bouncing would resemble superturbo flubber.

    I'm not good enough to know if this would work or not, just seeking commentary on it.

  128. What if... by Anonymous Coward · · Score: 0

    hats were ants?

  129. Dude! by hasse · · Score: 1

    That's one cool radio station. I hadn't heard of this one before, but they use music by Monotonik (used to be Mono, with guys like Supernao and Mortimer Twang on the Amiga). Excellent 'electronic' music. The spam voiceover makes it very unreal.

  130. Cloudmark by Anonymous Coward · · Score: 0

    Maybe CipherTrust is trying to find new ways to beat peer to peer spam-fighting software from competition.

    SpamNet: join up with this innovative service and help fight spam across the Web. (Spam-Filtering Software).

    Scott Parker
    534 words
    1 December 2002
    Internet Magazine
    97
    ISSN: 1355-6428
    English
    Copyright 2002 Gale Group Inc. All rights reserved. COPYRIGHT 2002 EMAP Media Ltd.

    You may have noticed spam is becoming a serious problem. Not only is it clogging up email servers and wasting our valuable time, it's also likely to be the sort of stuff you or your family don't want to see. So what do you do about it?

    Well, we should all be careful about where we display our email address, and create special accounts to use when registering products and services. But even if you never give your address to anyone, spam still gets through.

    There are various products designed to combat unwanted email, but Cloudmark has come up with a new solution. SpamNet is a worldwide community that aims to identify and filter junk mail before it arrives. It's a free Outlook plug-in that lets you report any spam you receive to the entire SpamNet community. It's easy to install, but you might have to tweak things to get it to operate behind a firewall.

    SpamNet adds a couple of extra buttons and options to your Outlook toolbar, and creates a Spam folder. Any incoming mail identified as junk by the community is diverted there, but you can also run the service on any existing mail folders.

    You can filter and report any spam that does get through at the click of a button. To maintain the integrity of the spam database, each member of the SpamNet community is rated according to how much spam they report and how accurate those reports are, so reports from long-time, trusted users will carry more weight than others. This is important, as the network is open to abuse from people trying to block legitimate email.

    SpamNet sounds great in principle, but does it work in practice? We found it managed to filter incoming email quite effectively, diverting about half the unsolicited mail we received into the Spam folder. We also found that running SpamNet on an existing mail folder crashed Outlook several times, although it did eventually shovel the majority of junk into the Spam folder.

    The filtering wasn't perfect--it did class some legitimate emails, including personal ones, as spam. These were easy to retrieve, as SpamNet doesn't actually delete any messages, but it does mean the odd genuine message might be missed.

    The Beta release we tested is only available for Outlook 2000 or XP, but there are plans to release a version for Outlook Express, and hopefully any problems will be ironed out soon.

    SpamNet is an effective filtering tool, but you still have to download the junk mail and delete it. And if you were hoping to hide the spam from your kids, think again: it all remains on your machine.

    ***

  131. Hesitant by Steve+Cowan · · Score: 2

    I would be reluctant to forward messages directly from my personal mailbox to such an archive, in case the headers of my forward get left in their archive.

    My email address would then exist in their archive, and could be wrongly identified by some developers as a spammer's address.

    Or worse, my email address could be spidered so that I could be delivered more junk mail.

    As has already been suggested, some assurances on this site are in order. I don't know who these people are or what they're going to do with my spam when I forward it to them. And the archive is not available to me yet.

    Perhaps /. is a little premature in posting this. The concept is great, but until some content is available from their site, I wouldn't exactly call this a "launch".

  132. this is just stupid by Anonymous Coward · · Score: 0

    Instead of testing anti-spam tools...they'd better create an online list of known spammers and let isps block their stuff from arriving in innocent peoples mailbox

  133. Great idea, IF some issues are handled right by sakeneko · · Score: 2

    I love this idea.

    Among my other activities, I maintain a spam filter. Like most other people who do spam filtering, I rely upon my own spamtrap addresses, reports by my users, and then crosscheck with news.admin.net-abuse.sightings and a few private mailing lists used by anti-spammers. A canonical archive of spam, however, would be a wonderfully helpful tool.

    I can see a number of issues that will need to be managed with a list like this, however. Here are a few:

    • Where will the spam come from? Where will the Spam Archive get its spam, and how will it ensure that only spam, and not legitimate bulk email, is included?

      • This is not a trivial issue. Relying on reports of spam from random individuals almost guarantees that some of your "take" will be legitimate, solicited email. Some spammers report legitimate email as spam in order to make a spam filter ineffective by polluting it. Some anti-spammers consider all commercial email to be spam, whether it was solicited or not. Other users sign up for an email list and then forget that they did so -- lots of people are trigger happy these days because of the deluge of spam. (I'm not making this up -- this has happened to me more than once.)

    • How will spammed email addresses, particularly spamtrap addresses, be protected? Spam is sent to specific email addresses. One of the best sources of "clean" spam -- spam that you know is spam -- are spamtrap addresses deliberately created and planted for spammers to find, which are never used for any other purpose.

      • However, if people submit spam sent to a spamtrap address to the archive, spammers can then access the archive and remove those addresses from their mailing lists, or "listwash" them, making them less useful. In addition, troublemakers can feed those addresses to web sites or subscribe them to legitimate mailing lists. This ruins these addresses for their intended purpose. It can also result in mailbombing spamtrap addresses with a flood of confirmation messages for properly-run email lists.

    • How will you classify and cross-reference the database? To be most useful, a database of this type needs to be searchable. It will rapidly grow large enough to require a supercomputer to search unless the maintainers set it up properly. (Even if they do, I foresee them needing several very powerful computers.)

    • How are they planning to pay for the resources they will need? If by donations, they need to set up a non-profit organization, and solicit donations. I'd be happy to donate, but I suspect that they'll need more money than I and a few geeks who like the idea can afford. :)

    I'm sure I'll think of other concerns as time goes on, but this should get some discussion started. I can think of some ways I'd handle these issues, but I'd like to hear what other Slashdot readers have to say....

  134. What about false positives? by Anonymous Coward · · Score: 0

    The spamarchive only helps in testing filtering algorithms for false negatives.

    There needs to be an archive (corpus) of non-spam email so that filtering algorithms can also be tested for false positives.
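
    To make the two error types concrete, here is a small sketch that measures both. The function name `error_rates` and the True-means-spam convention are assumptions for illustration. Note that the false positive rate cannot be computed at all without non-spam in the test set.

```python
def error_rates(predictions, labels):
    """Return (false_positive_rate, false_negative_rate).

    Both arguments are equal-length lists of booleans where True means spam.
    A rate is None when the test set contains no messages of that class.
    """
    pairs = list(zip(predictions, labels))
    ham_total = sum(1 for _, actual in pairs if not actual)
    spam_total = sum(1 for _, actual in pairs if actual)
    # False positive: legitimate mail flagged as spam.
    false_pos = sum(1 for guess, actual in pairs if guess and not actual)
    # False negative: spam that slipped through as legitimate.
    false_neg = sum(1 for guess, actual in pairs if not guess and actual)
    fp_rate = false_pos / ham_total if ham_total else None
    fn_rate = false_neg / spam_total if spam_total else None
    return fp_rate, fn_rate
```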

  135. Why visit a repository... just visit your inbox. by nenolod · · Score: 1

    This is a fairly useless idea for a website, as to look at spam, all you have to do is open up your e-mail inbox. Why would someone actually care to look at spam? How bored would they have to be to take actions like this without going insane? Why post spam? Are the maintainers of the website being paid for this, is providing spam to the rest of the world overly important to their whole idea of what information should be online, or is it just plainly that they have an unnatural obsession with the unsolicited bulk mail?

    SPAM is not something that should be celebrated or thought of as entertainment. It is an annoying advertisement that has turned into the world's largest electronic nuisance. The idea just seems to be a waste of time and money, and bandwidth.

  136. Well actually... by nenolod · · Score: 1

    From their website:

    "We will publish SpamArchive.org mailing list information soon."

    So, yeah, probably. Probably gonna contain some spam too.

  137. DMCA? by Anonymous Coward · · Score: 0

    Hey, waitaminute, that is my spam! It is copyrighted by sleazydroid inc. I will unleash the thunder of the DMCA on you if you not immediate remove my opt-in, completely voluntary requested emails!
    The US government is my friend, you are not!

  138. This archive violates the DMCA. Please desist. by buswolley · · Score: 1

    So-called "spam" are actually copyrighted works. Please desist in storing spams, comparing spams' properties, etc. This violates the DMCA.

    --

    A Good Troll is better than a Bad Human.

  139. Pay me for my spam... by BSOD+from+above · · Score: 1

    collection. I have been collecting this stuff for years now. I have enough of it to fill a CD. Maybe I can sell it on ebay. Honestly, if they want my spam they'll have to pay me for collecting it in my inbox, it really is hard work (unless your address ends in @hotmail.com). I would love to see any spammer take a loaf for the team.

    --
    Karma: Censored (mostly affected by decency laws)
  140. What spam works? by totierne · · Score: 1

    Ok so there is always sex, but surely the spammers will be able to target spam at people who just have not decided to buy their product yet, and may actually want er 'info-email'.

    Does anyone have click through rates and 'success' of spam?

  141. Dude... by bbtom · · Score: 1

    One organisation? That sucks. We need a peer-to-peer distributed method for cataloguing spam...

    How about we create a Napster clone that has a bunch of master server lists and 'Ultrapeers'.

    No, scratch that. We could use email!

    Who wants to sign up to this peer-to-peer distributed spam catalogue? AOL or Hotmail account recommended, but not required.

    --
    catch (HumourFailureException e) { e.user.send("You, sir, are a humourless idiot."); }
  142. SpamCop by Muskie · · Score: 1

    I never read every post, but I use MailSmith (http://www.barebones.com/products/mailsmith.html) and report my spam to Spam Cop (http://spamcop.net)

    Of course this is Spam I downloaded first... SpamAssassin is also pretty keen and I want to make greater use of it.

    Surely SpamCop or SpamAssassin already has a pretty good database of Spam. I know I send in a pile mostly from my university account which though largely retired was in use on the UseNet, Web etc. since 94/95.

    Muskie

  143. Is it possible to Slashdot a mailbox? by Xandar01 · · Score: 1

    Seeing how addresses can be harvested like in this "Story of Nadine" it might be fun to plaster an email address all over the web such as JoBob@hotmail.com which is really an alias that immediately forwards the mail to submit@spamarchive.org. Heck, if each one of us set up one alias on our mail servers to point to their submission box I bet we'd fill them up with data REAL quick. I might even think it would be the first time a mailbox got slashdotted...maybe.

    --
    Life moves pretty fast; if you don't stop and look around once in a while, you could miss it. -FB
  144. this coupled with... by zonker · · Score: 0

    ...popfile and your spam problem might disappear, however your idea of spam and mine may not be the same so it might not work as well as you'd think...

  145. Re:Top 20 spammers in the country. by Anonymous Coward · · Score: 0

    My first post on /. Hey Gendou, I was interested in conversing with you about the posts you made on the Tresco (warez guy) intellectual property thread. I couldn't find a superior way than this to contact you. Please email me @ fanniecat@hotmail.com with contact info so we can IM or something. As for these subjects being discussed, I think spamarchive.org sucks, what a lame website! And I don't judge people for what they read; I try to get information from as many sources as possible. Wired is OK, like everything else, in moderation.

  146. Not all that useful without sample non-spam by NTDaley · · Score: 1

    I hope people using this collection will make sure they test it against a decent sample of non-spam as well. Otherwise I imagine there could be a pretty good chance of false positives, i.e. marking non-spam as spam.

    --
    bits and peace
    Nicholas Daley
  147. Last Post! by alpg · · Score: 1

    The only promotion rules I can think of are that a sense of shame is to
    be avoided at all costs and there is never any reason for a hustler to
    be less cunning than more virtuous men. Oh yes ... whenever you think
    you've got something really great, add ten per cent more.
    -- Bill Veeck

    - this post brought to you by the Automated Last Post Generator...