Analysis of Spam, and a Proposed Solution

Posted by michael on Tuesday April 6, 2004 @06:48AM from the enlarge-your-knowledge-now dept.

2bot_or_not_2bot writes "Spam: The Phenomenon is a detailed analysis of spam: products, scams, viruses, obfuscation methods, etc. Failed, and doomed-to-fail, methods of blocking spam are described. A general solution is proposed that does not: invade privacy, perform wide censorship or blacklisting, or involve payment and cooperation with corporations (beyond the transport and storage of data)." Hmmm.

5 of 370 comments

Boycott of Microsoft's Caller ID for E-mail by Anonymous Coward · 2004-04-06 06:53 · Score: 5, Informative

There's a boycott occurring for Microsoft's Caller ID for E-mail. They're asking for anyone developing a mail client, spam filter or mail transport agent to use a more open protocol, rather than a patented one.

Not a scholarly article - here's the text by Catamaran · 2004-04-06 06:54 · Score: 3, Informative

This is not a scholarly article. Here is his summary:

CONCLUSION REGARDING PROPOSED METHOD

I did not describe the details of how the proposed system would work, but I hope the proposal aspect of this article leads to more thinking about solutions to spam -- especially about solutions that avoid invasion of privacy by any form of content analysis or packet tracking, or cooperation with specific corporations, or censorship.
The web page contains lots of images of SPAM that the author has received.
Here is the text of his proposal:

SPAM CONTROL PROPOSAL

This section contains a proposal for SOFTWARE and SOCIAL PRACTICES that have the potential of greatly reducing the nuisance of spam from a person's life.

GENERAL INFORMATION

Things required by this proposal:
(1) A person who wishes to greatly reduce spam must install software on each computer with an e-mail client application (such as Microsoft Outlook).
(2) A person who wishes to greatly reduce spam, when sharing his or her e-mail address, must also go through the trouble of sharing a code number.
(3) Mailing list services must make a slight modification to their databases and mailing scripts to store and use codes in addition to e-mail addresses.

Things that are NOT required by this proposal:
(1) Changes to e-mail servers, e-mail protocols, e-mail content standards, or Internet infrastructure, are not required.
(2) Existing spam countermeasures (content-filtering, IP blacklisting, anti-spam laws, etc) will not be necessary. (Such countermeasures are futile and dangerous anyhow.)
(3) It is possible that changes to existing e-mail clients will not be required.

Things that will NOT be directly helped by this proposal:
(1) Internet bandwidth consumed by the futile efforts of spammers trying to make it through to people. (Once the futility becomes apparent worldwide, the spamming model may naturally be a very unattractive waste of time.)
(2) E-mail "inbox" clogging while the spammer profession lingers on, before the futility of spamming has a chance to sink in worldwide.
(3) People with e-mail clients and services provided by giant corporations may not experience the diminished spam until the giant corporations have a chance to update software.

Other qualities of this proposal:
(1) Totally open technology; not "security through obscurity".
(2) Non-commercial, public-domain method, can be implemented by anyone without consideration.
(3) Totally smooth transition from current e-mail clients, servers, mailing list services, etc.
(4) Privacy preserved (no content analysis), and possibly even improved (as proposed software becomes more widespread).

CORE CONCEPT

The following paragraphs describe the core concept of the method. Certain details will be discussed in the "Use Cases" section:

Messages received by an e-mail client will be sorted by codes contained in the message subject fields or within the message bodies. Spam messages are extremely unlikely to contain the proper codes, and are thus diverted to an anonymous-sender category.

Unlike an e-mail address alone, which is a single, unmoving target for spammers, the additional codes are generated by formulae, and are tiny, constantly-moving targets in a huge expanse of possible target locations. Furthermore, any breach of trust can instantly be traced to specific unscrupulous people, and immediately and conveniently patched.

The concept can be likened to "spread-spectrum" communication, or, much more loosely, "port knocking".

CORE IMPLEMENTATION

The following paragraphs describe the core implementation of the method.

Three encrypted files are stored on an e-mail client machine:
(1) PRIMARY FORMULA TABLE: Encrypted table with entries in the form: ( SHA hash of recipient e-mail address, primary formula )
(2) SECONDARY FORM
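
The quoted text cuts off before the formulae are described, so the following is only a rough sketch of the core concept (sorting mail by a per-correspondent code and tracing a leaked code back to whoever was given it). The HMAC-based code derivation, the function names make_code and classify, and the sample secret and correspondents are all assumptions made for illustration; none of them come from the article.

```python
import hashlib
import hmac


def make_code(secret: bytes, correspondent: str) -> str:
    """Derive a short code for one correspondent (hypothetical formula, not the article's)."""
    digest = hmac.new(secret, correspondent.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:8]


def classify(subject: str, body: str, issued: dict[str, str]) -> str:
    """Return the correspondent whose code appears in the message, else 'anonymous-sender'."""
    text = f"{subject}\n{body}".lower()
    for code, correspondent in issued.items():
        if code in text:
            return correspondent
    return "anonymous-sender"


# Illustrative data only: a made-up secret and two made-up correspondents.
secret = b"example-secret"
issued = {make_code(secret, name): name for name in ("alice", "widgets-list")}
```

Because each correspondent gets a different code, spam that arrives carrying the code issued to "widgets-list" would point at that mailing list as the source of the leak, and that single code could be retired without affecting anyone else.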

--
Test 1 2 3 4

Wrong by JohnGrahamCumming · 2004-04-06 06:54 · Score: 5, Informative

From TFA:
Salting the message with random words thwarted Bayesian filtering.
No, it hasn't. That's utter nonsense. This entire article is filled with statements like this with no justification. How about reading my presentation at the MIT Spam Conference that showed that random word insertion did not fool POPFile (or other Bayesian filters)?
John.
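
One way to see why salting need not defeat a Bayesian filter is the toy naive Bayes combiner below. It is a simplified sketch, not POPFile's actual implementation, and the per-token probabilities are invented for illustration. It assumes the random salt words have never been seen in training and are therefore scored as neutral (0.5); words that happen to be common in a user's legitimate mail would behave differently.

```python
import math


def spam_probability(tokens: list[str], token_spamminess: dict[str, float]) -> float:
    """Combine per-token spam probabilities by summing log-odds (naive Bayes)."""
    log_odds = 0.0
    for token in tokens:
        # Assumption: a token never seen in training is treated as neutral.
        p = token_spamminess.get(token, 0.5)
        log_odds += math.log(p / (1.0 - p))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Made-up training result: probability that a message containing the token is spam.
TOY_SPAMMINESS = {"viagra": 0.99, "mortgage": 0.9, "meeting": 0.1}

message = ["viagra", "mortgage"]
salt = ["walrus", "cobalt", "meander"]  # random words absent from training
```

Under these assumptions, each neutral token contributes log(0.5 / 0.5) = 0 to the sum, so spam_probability(message + salt, TOY_SPAMMINESS) equals spam_probability(message, TOY_SPAMMINESS).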

Have the users pay for it... by Vexler · 2004-04-06 06:55 · Score: 4, Informative

Here is another way of looking at it: Spammers exist because there are idiots out there who fall for "vicod1n" or "pen1s enl@rgement" or what have you. We should have users who purchase these products pay an additional "spam tax" on them, to compensate for the wasted bandwidth and so on. Sort of like a "shipping and handling" fee. Actually, it comes close to the Internet tax idea that Congress is kicking around, but applied to spam.

Bandwidth and storage for the ISP by RT+Alec · 2004-04-06 07:01 · Score: 5, Informative

I administer a mail server for a small ISP. The problem with filtering on the user's end is that my costs have already been incurred by the time the user deals with the spam. I don't think, as the article suggests, that spammers will slow down if their messages are not being read; in fact, they will just spew out ever more spam. If a hit rate of 1/10 of 1% does not deter them, a smaller hit rate won't either.

I have to put some upper limit on the amount of storage I can give each person (right now I allow 100M, which I think is quite reasonable). But if a user goes on vacation and does not check their e-mail for a month, they could have their inbox filled with spam and viruses (not much difference these days, from a server admin point of view). This will prevent legitimate messages from coming through. Therefore, I use the following technical measures to help reduce spam:
- RBLs: dnsbl.njabl.org, sbl.spamhaus.org, xbl.spamhaus.org, and dul.dnsbl.sorbs.net
- SPF:Sender (not adopted widely yet, but it does block a few messages a day even now)
- Blocking specific subject lines (during virus outbreaks this can help)
- Blocking mail "from" nonexistent domains
I really have no choice; I cannot afford not to take these measures. I explain all of them to my clients, and nobody has had a problem yet. These measures catch roughly 75% of spam and viruses and, as far as I know, produce no false positives.
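
For readers unfamiliar with how the RBL check in the list above works: a DNS blocklist is queried by reversing the octets of the connecting IPv4 address, appending the list's zone, and looking the resulting name up in DNS; an answer means the address is listed. The sketch below shows that name construction. The function names are illustrative, the resolver is passed in so the example runs without network access, and the way a real mail server wires this in is not described in the comment.

```python
import ipaddress
import socket
from typing import Callable, Iterable

# Zones named in the comment above.
ZONES = ["dnsbl.njabl.org", "sbl.spamhaus.org", "xbl.spamhaus.org", "dul.dnsbl.sorbs.net"]


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def dns_resolves(name: str) -> bool:
    """Real lookup: True if the name resolves (needs network access)."""
    try:
        socket.gethostbyname(name)
        return True
    except OSError:
        return False


def listed_zones(ip: str, zones: Iterable[str], resolves: Callable[[str], bool]) -> list[str]:
    """Return the zones in which the address is listed, using the given resolver."""
    return [zone for zone in zones if resolves(dnsbl_query_name(ip, zone))]
```

For example, dnsbl_query_name("192.0.2.99", "sbl.spamhaus.org") yields "99.2.0.192.sbl.spamhaus.org", and listed_zones("192.0.2.99", ZONES, dns_resolves) would perform the actual lookups.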