Analysis of Spam, and a Proposed Solution

Boycott of Microsoft's Caller ID for E-mail by Anonymous Coward · 2004-04-06 06:53 · Score: 5, Informative

There's a boycott occurring for Microsoft's Caller ID for E-mail. They're asking for anyone developing a mail client, spam filter or mail transport agent to use a more open protocol, rather than a patented one.

Re:Boycott of Microsoft's Caller ID for E-mail by Danse · 2004-04-06 10:40 · Score: 2, Informative

Yes, you're most likely just trolling, but just in case some people don't realize why you're wrong, I figured I should point it out anyway. It's not a philosophical point. It's a very practical point. If Microsoft has a patent on it, then open source software and Microsoft competitors can't adhere to the standard without facing the possibility of lawsuits or large licensing fees. Maybe not right away, but whenever Microsoft feels it would benefit them most (read: after it becomes widely accepted and implemented).

--
It's not enough to bash in heads, you've got to bash in minds. - Captain Hammer

Not a scholarly article - here's the text by Catamaran · 2004-04-06 06:54 · Score: 3, Informative

This is not a scholarly article. Here is his summary:

CONCLUSION REGARDING PROPOSED METHOD

I did not describe the details of how the proposed system would work, but I hope the proposal aspect of this article leads to more thinking about solutions to spam -- especially about solutions that avoid invasion of privacy by any form of content analysis or packet tracking, or cooperation with specific corporations, or censorship.

The web page contains lots of images of SPAM that the author has received.

Here is the text of his proposal:

SPAM CONTROL PROPOSAL

This section contains a proposal for SOFTWARE and SOCIAL PRACTICES that have the potential of greatly reducing the nuisance of spam from a person's life.

GENERAL INFORMATION

Things required by this proposal:
(1) A person who wishes to greatly reduce spam must install software on each computer with an e-mail client application (such as Microsoft Outlook).
(2) A person who wishes to greatly reduce spam, when sharing his or her e-mail address, must also go through the trouble of sharing a code number.
(3) Mailing list services must make a slight modification to their databases and mailing scripts to store and use codes in addition to e-mail addresses.

Things that are NOT required by this proposal:
(1) Changes to e-mail servers, e-mail protocols, e-mail content standards, or Internet infrastructure, are not required.
(2) Existing spam countermeasures (content-filtering, IP blacklisting, anti-spam laws, etc) will not be necessary. (Such countermeasures are futile and dangerous anyhow.)
(3) It is possible that changes to existing e-mail clients will not be required.

Things that will NOT be directly helped by this proposal:
(1) Internet bandwidth consumed by the futile efforts of spammers trying to make it through to people. (Once the futility becomes apparent worldwide, the spamming model may naturally be a very unattractive waste of time.)
(2) E-mail "inbox" clogging while the spammer profession lingers on, before the futility of spamming has a chance to sink in worldwide.
(3) People with e-mail clients and services provided by giant corporations may not experience the diminished spam until the giant corporations have a chance to update software.

Other qualities of this proposal:
(1) Totally open technology; not "security through obscurity".
(2) Non-commercial, public-domain method, can be implemented by anyone without consideration.
(3) Totally smooth transition from current e-mail clients, servers, mailing list services, etc.
(4) Privacy preserved (no content analysis), and possibly even improved (as proposed software becomes more widespread).

CORE CONCEPT

The following paragraphs describe the core concept of the method. Certain details will be discussed in the "Use Cases" section:

Messages received by an e-mail client will be sorted by codes contained in the message subject fields or within the message bodies. Spam messages are extremely unlikely to contain the proper codes, and are thus diverted to an anonymous-sender category. Unlike an e-mail address alone, which is a single, unmoving target for spammers, the additional codes are generated by formulae, and are tiny, constantly-moving targets in a huge expanse of possible target locations. Furthermore, any breach of trust can instantly be traced to specific unscrupulous people, and immediately and conveniently patched. The concept can be likened to "spread-spectrum" communication, or, much more loosely, "port knocking".

CORE IMPLEMENTATION

The following paragraphs describe the core implementation of the method. Three encrypted files are stored on an e-mail client machine:
(1) PRIMARY FORMULA TABLE: Encrypted table with entries in the form: ( SHA hash of recipient e-mail address, primary formula )
(2) SECONDARY FORM

--
Test 1 2 3 4

Wrong by JohnGrahamCumming · 2004-04-06 06:54 · Score: 5, Informative

From TFA:

Salting the message with random words thwarted Bayesian filtering.

No, it hasn't. That's utter nonsense. This entire article is filled with statements like this with no justification. How about reading my presentation at the MIT Spam Conference that showed that random word insertion did not fool POPFile (or other Bayesian filters).

John.

Re:Wrong by Anonymous Coward · 2004-04-06 07:16 · Score: 2, Informative

I don't know what spam data you used, but I've noticed quite a few spams getting through my Bayesian filter lately... they all have more random words in sentences at the bottom than the real message at top. They do it like 'hank urged me and I to send you this flower and important notice'. Bad grammar, but I'm sure it's meant to look like a 'real' sentence, since the computer can't 'read' like a person. It's kinda like an ad-lib game... they make a list of several hundred sentences with verbs and/or nouns missing, then use word lists to fill them in.

Re:Wrong by cpeterso · 2004-04-06 07:20 · Score: 2, Informative

The existence of low-scoring or unknown "regular" words would NOT mask the presence of high-scoring spammy words! The Bayesian filter would not be fooled.
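
The point above can be illustrated with a minimal sketch of Graham-style probability combining. The per-token probabilities, the 0.4 score for unknown words, and the cutoff of 15 tokens are assumptions chosen for illustration; they are not taken from POPFile or any particular filter. The sketch only covers neutral padding words, not padding that the filter has learned to associate strongly with legitimate mail.

```python
from math import prod

def combined_spam_probability(token_probs, top_n=15):
    """Combine per-token spam probabilities into one message score.

    Only the top_n tokens furthest from the neutral value 0.5 are used,
    so near-neutral padding has little influence on the result.
    """
    interesting = sorted(token_probs, key=lambda p: abs(p - 0.5), reverse=True)[:top_n]
    spam = prod(interesting)
    ham = prod(1.0 - p for p in interesting)
    return spam / (spam + ham)

# Hypothetical scores: four strongly spammy tokens, then 200 random
# padding words that the filter has never seen (scored 0.4 here).
spammy = [0.99, 0.98, 0.97, 0.99]
padding = [0.4] * 200

plain_score = combined_spam_probability(spammy)
padded_score = combined_spam_probability(spammy + padding)
print(plain_score > 0.99, padded_score > 0.99)
```

Under these assumed numbers the padded message still scores well above 0.99, because the neutral words contribute only a small factor each and most of them are never selected.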

--
cpeterso

Have the users pay for it... by Vexler · 2004-04-06 06:55 · Score: 4, Informative

Here is another way of looking at it: Spammers exist because there are idiots out there who fall for "vicod1n" or "pen1s enl@rgement" or what have you. We should have users who are purchasing these products pay an additional "spam tax" on it, to compensate for the wasted bandwidth and so on. Sort of like "shipping and handling fee". Actually, it comes close to the Internet tax idea that Congress is punting about, but applied to spams.

Bandwidth and storage for the ISP by RT+Alec · 2004-04-06 07:01 · Score: 5, Informative

I administer a mail server for a small ISP. The problem with filtering on the user's end is that my costs have already been incurred by the time the user deals with the spam. I don't think, as the article suggests, that spammers will slow down if their message is not being read; in fact, they will just spew out ever more spam. If a 1/10 of 1% hit rate does not deter them, a smaller hit rate won't either.

I have to put some upper limit on the amount of storage I can give each person (right now I allow 100M, which I think is quite reasonable). But if a user goes on vacation and does not check their e-mail for a month, they could have their inbox filled with spam and viruses (not much difference these days, from a server admin point of view). This will prevent legitimate messages from coming through. Therefore, I use the following technical measures to help reduce spam:

RBLs: dnsbl.njabl.org, sbl.spamhaus.org, xbl.spamhaus.org, and dul.dnsbl.sorbs.net
SPF (not adopted widely yet, but it does block a few messages a day even now)
Blocking specific subject lines (during virus outbreaks this can help)
Blocking mail "from" non-existent domains

I really have no choice; I cannot afford not to take these measures. I explain all of them to my clients, and nobody has had a problem yet. These measures catch roughly 75% of spam and viruses and, as far as I know, produce no false positives.
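
For readers unfamiliar with how an RBL lookup works, here is a minimal sketch of the usual DNSBL convention: reverse the octets of the connecting IPv4 address, append the list's zone, and do an ordinary DNS lookup. The function names are hypothetical, the address 192.0.2.1 is a documentation-only example, and this is not the poster's actual mail server configuration.

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone):
    """Build the DNS name used to ask a blocklist zone about an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

def is_listed(ip, zone):
    """Return True if the zone answers for this address (requires network access).

    A listed address resolves to some A record; an unlisted one does not
    resolve. A lookup failure for any other reason is also treated as
    "not listed" here, which a real server might want to handle differently.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False

query = dnsbl_query_name("192.0.2.1", "sbl.spamhaus.org")
print(query)
```

A mail server would typically run such a check against the connecting client's address at SMTP time, before accepting the message, which is what keeps the storage and bandwidth cost off the server.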

Re:IM2000 by digital+bath · 2004-04-06 07:28 · Score: 2, Informative

And under the current system, the spammer doesn't know anything about the recipient (or even that the email address is valid) unless he does something stupid like reply or click on a web link. Under this system, the spammer would know which addresses were valid by watching which messages were picked up.

Not entirely true. If a user is running a mail client that allows HTML mail, then the spammer can make the client request something unique from the spammer's server - an image, for example. I've seen spam email with images like this:

<img src="http://1.2.3.4/verify.php?email=YOUR_EMAIL_HERE" />

When the user previews or opens that mail, their client will request that "image", and the spammer immediately knows that your email is valid.
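
A mail client or filter could flag this kind of tracking image before rendering the message. The following is a rough sketch using only the Python standard library; the class and function names are made up for illustration, and it only catches the simple case where the recipient's address appears verbatim as a query parameter value.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs

class RemoteImageFinder(HTMLParser):
    """Collect the src of every image tag that points at a remote HTTP(S) URL."""

    def __init__(self):
        super().__init__()
        self.remote_images = []

    def handle_starttag(self, tag, attrs):
        # Accept the non-standard <image> spelling as well as <img>.
        if tag not in ("img", "image"):
            return
        src = dict(attrs).get("src") or ""
        if urlparse(src).scheme in ("http", "https"):
            self.remote_images.append(src)

def find_tracking_images(html_body, recipient):
    """Return remote image URLs whose query string carries the recipient's address."""
    finder = RemoteImageFinder()
    finder.feed(html_body)
    finder.close()
    hits = []
    for url in finder.remote_images:
        values = [v for vs in parse_qs(urlparse(url).query).values() for v in vs]
        if recipient in values:
            hits.append(url)
    return hits

body = '<p>Hello</p><img src="http://1.2.3.4/verify.php?email=user@example.com" />'
flagged = find_tracking_images(body, "user@example.com")
print(flagged)
```

A spammer can just as easily encode the address as an opaque token, so the more robust defence is simply not to load remote images at all.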

--
find / -name "*.sig" | xargs rm

This is just a less-good PKI solution by 0x0d0a · 2004-04-06 07:33 · Score: 2, Informative

While I'm pretty strongly of the opinion that a PKI system with a trust network and signed content is ultimately going to be the only effective long-term way to deal with spam, this isn't great.

It's essentially just a PKI system, but requires effort on the part of the individuals to manually set up a trusted transmission channel for authentication data for each person, breaks security if an email is exposed, does not provide strong authentication benefits, and seems to be open to forgery containing data from an original email. It still requires the installation of software.

Instead of transmitting each "set of formulas" via a trusted channel, one could hand over an RSA pubkey, and instead of some weird proprietary embedding of secrets, one could simply sign the email. This provides all the benefits of the proposed system, operates in a regular manner, is strong against compromise of a client machine or of sent email, and there are, to some degree, systems in place to handle signing.

I would advise against this solution. It provides no benefits that a conventional email signing system lacks, and has some serious weaknesses.

--
May we never see th

Re:Is Poster Author? -- YES by Anonymous Coward · 2004-04-06 07:46 · Score: 1, Informative

Registrar Name....: Register.com
Registrar Whois...: whois.register.com
Registrar Homepage: http://www.register.com

Domain Name: colinfahey.com
Created on..............: 23 Oct 2001 12:25:20
Expires on..............: 23 Oct 2004 12:25:20

Registrant Info:
Colin Fahey

Colin Fahey
1068 Stanford
Irvine, CA 92612
US
Phone: 9498239921
Fax..:
Email: cpfahey@earthlink.net

Administrative Info:
Colin Fahey
Colin Fahey
1068 Stanford
Irvine, CA 92612
US
Phone: 9498239921
Fax..:
Email: cpfahey@earthlink.net

Not a foolproof solution by Vermy · 2004-04-06 07:57 · Score: 2, Informative

The problem with your solution is that I have never given out my email other than to a hand-selected few whom I trust. However, I am now receiving spam by the handful daily (though over-the-counter anti-spam software has been next to perfect for filtering it out).

The problem is that my email is somewhat generic, with my first initial, last name, plus a numeric conditioner. This email was assigned by the provider. Unfortunately, many spammers, once they realize how emails are formatted for an ISP, can easily run through a list, formatting it with the most common names and values. They will undoubtedly waste some emails on addresses that don't exist, but they also hit a large number of valid addresses without the use of a list.

So you must have a fairly unique address or a creative provider. That, and you are somewhat lucky that your address hasn't gotten out yet. But it will, eventually.

A much better, novel approach that just needs PR.. by mr.+squishie · 2004-04-06 08:48 · Score: 2, Informative

I keep posting about this, I've submitted a story about it, but nobody ever listens, and this strikes me as the only ORIGINAL idea that I've heard in a long time:

Unsolicited Commando

Everyone says that filtering all the spam in the world isn't going to help if we can't stop users from clicking on it. They're right. So if we can't stop them from clicking, why not do the reverse--flood the SPAMMER'S inbox with false positives of our own?? Basically UC is a little program that goes to the websites of companies that spam and fills out their sign-up forms with real-looking but randomly generated info. At SOME point, there is an opportunity cost to checking up on these false positives. For example, if it costs $0.02 to check up on a false positive, and the companies make $10 for each order they sell from spamming, then what we need is a distributed network to put in more than 500 false responses for each positive response they receive. If you've got a distributed network of 1000+ computers, and each one puts in a false positive every 30 seconds, then in 1 hr that's 120,000 false positives, or enough to cover for 240 real responses. The beauty of this is that there is no longer any profit for the business using the spammer. It hits them where it hurts most.
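
The arithmetic in the paragraph above can be checked with a few lines of code. The figures are the poster's own examples ($0.02 per check, $10 per order, 1000 machines, one submission every 30 seconds); the function names are made up for illustration, and amounts are kept in whole cents to avoid floating-point rounding.

```python
def break_even_fakes_per_order(profit_cents_per_order, cost_cents_per_check):
    """Fake sign-ups needed before checking them costs as much as one order earns."""
    return profit_cents_per_order // cost_cents_per_check

def fakes_per_hour(machines, seconds_between_submissions):
    """Total fake sign-ups a network submits in one hour."""
    return machines * (3600 // seconds_between_submissions)

per_order = break_even_fakes_per_order(1000, 2)   # $10.00 profit, $0.02 per check
hourly = fakes_per_hour(1000, 30)                 # 1000 machines, one fake per 30 s
orders_offset = hourly // per_order

print(per_order, hourly, orders_offset)
```

This reproduces the figures in the post: 500 fakes per order to break even, 120,000 fakes per hour, and 240 real orders' worth of profit cancelled per hour.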

But this method requires a large distributed network to work! It could, but nobody seems to know about it! Right now it's just some guy's pet project--if this thing got a serious team and some serious PR, it could really take the spamming world by storm! (Of course you'd have to watch out for abuses--targeting innocent businesses' networks--but we already have large blacklists a la spamcop, and under an open framework I think it'd be safe enough to use.)

For god's sake people, if we got a large enough network, it could really work!

Check your filter training database by Julian+Morrison · 2004-04-06 09:03 · Score: 2, Informative

Have you overtrained your filter? That tends to weaken its usefulness after a while. If so, remove the training DB and retrain it from scratch.

Disposable Email Address Services Review by CheapScott · 2004-04-06 11:29 · Score: 2, Informative

About.com had a write-up a month ago.

Slashdot Mirror