Slashdot Mirror


TarProxy Creates Tar Pit... For Spammers

agravaine writes "I ran across TarProxy, which, IMHO, is one of the cleverest spammer-handling ideas I've seen yet. The gist: Early detection of incoming spam [using the statistical techniques pioneered on the client side] could be used to create an artificial scarcity of bandwidth experienced only by spammers." This project hasn't gone very far yet, but essentially it slows down SMTP sessions for suspected spammers. If this really works, and is installed on enough of the net, it could help. 144 spam so far today. Anything would be an improvement. CT Yup, it's a dupe. There wasn't anything better to post at 9am on a Sunday, so you can just bitch about me instead ;)

10 of 164 comments

  1. Deja Vu by BorgDrone · · Score: 4, Informative

    Anyone else get a bit of a deja vu feeling reading this news post?

    1. Re:Deja Vu by baptiste · · Score: 4, Funny

      Maybe we can use statistical techniques on the client side to help the editors avoid duplicate stories.

  2. Brilliant by FunWithHeadlines · · Score: 4, Insightful
    What's inefficient about current spam solutions?:

    "These classifiers already come in many forms. There are POP3 proxies, IMAP proxies, mail file processors, and even classifiers built directly into mail clients. I use PopFile (a naïve Bayesian classifier in a POP3 proxy) at home with great success. Some work better than others, but with a little training, they all seem to work pretty well. Unfortunately, they have a common shortcoming: They don't cause the spammers any pain."

    What is the goal?:

    "And we all want to cause spammers pain."

    How do they want to accomplish this pain?:

    "None of these classifiers are capable of causing the spammers any pain because the spammer is long gone by the time the classifier has the opportunity to process the message. What we need is a way to use the classifier against the spammer while the spammer is still connected."

    This is brilliant. If all you do is clean up after the spammers when they are long gone, there is little motivation for them to stop. So what if they've dumped a bunch of garbage in your in-box? They don't stick around to see you clean up. But this idea hits them while they are in the process of spamming you.

    That's the key: Make it harder/more unpleasant/less cost-effective for the spammers and you discourage them from spamming. Hit the source, not the results.
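    A minimal sketch of the idea quoted above: score the message while the sender is still connected, then stall each SMTP reply in proportion to that score. This is not TarProxy's actual code. The function names, the 0.5 threshold and the 30-second ceiling are all assumptions picked for illustration.

```python
import time
from typing import Callable, Iterable, List


def reply_delay(spam_score: float, max_delay: float = 30.0,
                threshold: float = 0.5) -> float:
    """Seconds to stall before each SMTP reply (hypothetical policy).

    Scores at or below the threshold are not delayed; above it the
    delay grows linearly up to max_delay at a score of 1.0.
    """
    if not 0.0 <= spam_score <= 1.0:
        raise ValueError("spam_score must be between 0 and 1")
    if spam_score <= threshold:
        return 0.0
    return max_delay * (spam_score - threshold) / (1.0 - threshold)


def tarpit_replies(replies: Iterable[str], spam_score: float,
                   sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Release replies one at a time, pausing before each one."""
    delay = reply_delay(spam_score)
    sent = []
    for reply in replies:
        if delay > 0:
            sleep(delay)  # the sender waits here, still connected
        sent.append(reply)
    return sent
```

    The `sleep` argument is injectable so the policy can be tried out without actually waiting.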

  3. Interesting, but won't solve the problem by TheAB · · Score: 4, Insightful

    This is an interesting proactive approach, but isn't most spam sent through open proxies, which have absent sysadmins? If we can't get them to lock down their mail servers, how can we get them to install this?

  4. Easy to defeat, just use dynamic spamming software by sanermind · · Score: 4, Interesting

    In the spirit of repetition...

    Easy to defeat, just use spamming software that dynamically increases its connection pool whenever it encounters a 'slow' SMTP recipient. Even if a large part of the net population were running this, the spammer could just spawn thousands of simultaneous (slowed down, yes) connections, and still maximize his bandwidth utilization. If it takes 2 minutes to send each message, it doesn't matter if he's sending 5000 messages at once!

    I believe Linux, for example, allows up to 8192 open sockets, and I think this can be changed with a sysctl command, and most definitely could be with a few changes to kernel headers.

    Sure, it would take a machine with decent memory, but that's not too hard to find.
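    The claim above can be put in numbers with a back-of-the-envelope model: if every connection is held for a fixed time per message, steady-state throughput is simply connections divided by that time. The function names are made up for illustration, and the model ignores bandwidth, memory and any limits on the receiving side.

```python
import math


def messages_per_hour(connections: int, seconds_per_message: float) -> float:
    """Steady-state throughput when each connection delivers one
    message every seconds_per_message seconds."""
    return connections * 3600.0 / seconds_per_message


def connections_needed(target_per_hour: float,
                       seconds_per_message: float) -> int:
    """Concurrent connections required to sustain a target rate."""
    return math.ceil(target_per_hour * seconds_per_message / 3600.0)


# The figures from the comment: 5000 connections, 2 minutes per message.
rate = messages_per_hour(5000, 120)
```

    With those figures the model gives 150,000 messages per hour, so a two-minute delay alone does little against a sender willing to hold 5000 connections open.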

    ---
    the pen is mightier than the sword, the sword is mightier than the court, the court is mightier than the pen.
  5. Causing pain, but indirectly by httptech · · Score: 4, Interesting
    Most spammers don't send hundreds of thousands of emails from their own connection. Generally they use open relays to propagate a few messages each with a huge RCPT list. So, tarpitting does nothing to the spammer directly. However, tarpitting the open relay does accomplish something - it could cause a huge backlog of mail, eventually letting the relay choke off its own resources as spammers kept trying to dump messages on it. This would cause some indirect pain to the spammers, as finding open relays that could deliver mail quickly would be difficult. It might also alert the mail administrator to the problem, thus encouraging them to close their server to relaying.

    And you would not need to roll this out on most of the net. If the large ISPs and webmail providers started doing this, it would have a significant impact. Much of the spammer's distribution list consists of a few domains: yahoo, hotmail, aol, etc. If the large providers implemented tarpits, it could quickly damage the ready supply of open relays for spammers.

  6. Doesn't sound easy to me - please help me clarify by Featureless · · Score: 4, Interesting

    Hmm... This calls for some TCP geekhood and some strong math. I am way too hung over for math. Let's just talk about it in broad strokes first.

    If I'm tuning this package, I can make these delays REAL big. I mean, email is one of those systems where a false positive resulting in even a... let's say an **8 hour** delay to a legitimate message would still be considered perfectly fine for most purposes. There's fuzzy logic in play here; I'm thinking not all delays will be equal. But what if you were just really harsh on suspected spam? Not such a loss IMO. Of course... I haven't considered that you will have increased reliability problems trying to hold a stream open for 8 hours, but remember, a legitimate mailserver will keep resending, and as we go Bayesian on servers, perhaps we will learn to resend for a little longer as well? Or perhaps there is another protocol solution (e.g. letting the sender know they're being delayed for spam... so perhaps giving them the option to reformulate their message and resend?). Let's just press on. The precise amount of the delay may not necessarily be important.

    If I'm sending 50 million messages (a modest spammer's run, if I'm well enough informed) and each one holds me on the line for 8 hours that means 400 million hours if run serially. At the 8192 concurrent thread barrier that's still almost 50,000 hours (~5 years)... with mathematical convenience, to do this entire run in 8 hours you will require **50 million** concurrent threads? Or should I have just stayed in bed longer?
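    The arithmetic above can be checked with a few lines. This is only a rough model, assuming every message is held for the full delay and that connections are used in full batches; `run_hours` is a name invented for the sketch.

```python
def run_hours(messages: int, delay_hours: int, connections: int) -> int:
    """Hours to finish a run when each message ties up one connection
    for delay_hours and at most `connections` run at once."""
    batches = (messages + connections - 1) // connections  # ceiling division
    return batches * delay_hours


MESSAGES = 50_000_000
DELAY_HOURS = 8

serial = run_hours(MESSAGES, DELAY_HOURS, 1)            # one at a time
capped = run_hours(MESSAGES, DELAY_HOURS, 8192)         # 8192 at once
parallel = run_hours(MESSAGES, DELAY_HOURS, MESSAGES)   # all at once
capped_years = capped / (24 * 365)
```

    This gives 400,000,000 hours serially, 48,832 hours (about 5.6 years) at 8192 connections, and 8 hours only when all 50 million connections are open at once, which matches the orders of magnitude in the comment.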

    Now it's looking like the exact length of the delays, and the exact number of concurrent threads is not actually something worth too much niggling debate. We just have to get familiar with the orders of magnitude we're dealing with.

    Consider the protocol-to-data ratio of an SMTP transaction over TCP alone. How much is data and how much is just protocol overhead in a given mail transaction? We can figure this out down to the last bit, but I'm going to just throw out the hypothetical notion that when you have to initiate a new SMTP transaction for every message you send, the bandwidth overhead for doing this millions of times is not inconsiderable.

    And we have to think of the other end. Spammers may write themselves custom TCP/IP stacks, but receivers certainly will not. Consider AOL. AOL encompasses some significant percentage of your list of victims. What is AOL going to do with anywhere _near_ that many simultaneous connections... ***from just one spammer?*** Why, call the FBI, of course! It's a DoS attack!

    I'll stop now. I wouldn't be surprised if there were other angles on this I haven't considered. But at first blush it doesn't seem nearly as easy to beat as you suggest.

    Perversely I think the biggest danger in this technique is that it may become widespread and then force spammers to really confront Bayesian filtering head-on. Of course, just thinking aloud (and this probably is undoable for privacy reasons, but just to open a line of speculation) you can do some interesting things with these kinds of filters... retain lists of email addresses that you've received mail from (and/or replied to) more than once... they get a lower score (and a lower delay) than first-time senders... etc. etc. So it's not clear even with very well-designed spam (another cost increase for spammers!) that you could win against the filter.
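    The sender-history idea floated above could look roughly like this. It is a sketch only: the class name, the 0.5 discount and the "seen at least twice" rule are assumptions, and a real filter would have to deal with forged sender addresses and the privacy concerns already mentioned.

```python
from collections import Counter


class SenderHistory:
    """Tracks how often each address has been seen (hypothetical helper)."""

    def __init__(self, discount: float = 0.5, min_messages: int = 2) -> None:
        self.discount = discount          # multiplier for known senders
        self.min_messages = min_messages  # "more than once" means 2 or more
        self.counts = Counter()

    def record(self, sender: str) -> None:
        """Note one message received from (or replied to) this address."""
        self.counts[sender.lower()] += 1

    def adjusted_score(self, sender: str, spam_score: float) -> float:
        """Lower the spam score, and so the delay, for repeat senders."""
        if self.counts[sender.lower()] >= self.min_messages:
            return spam_score * self.discount
        return spam_score
```

    A first-time sender keeps the raw classifier score; an address seen twice or more gets its score halved before any delay is computed.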

  7. New! From SethSoft! by Dark+Lord+Seth · · Score: 4, Funny

    TarEditor!

    A revolutionary new system designed to keep your editors from posting duplicate stories! This new system works through punishment, ranging from mild electrostatic shocks to gushes of boiling sulphuric acid! When properly applied, this piece of software will introduce a new era in the world of online journalism, having a significant effect on most long-term strategies and pleasing the stock market at the same time! This is all you're going to need in our post-modern capitalist system for guaranteed success!

    Features include:

    • Compatible with all editors!
    • Name based modifier system! Want Taco to get kicked in the nuts for the slightest mistake? NO PROBLEM!
    • HTTP proxy supporting HTML 4.0, 4.1 (all forms) XML, XHTML, DHTML AND MORE!
    • Punishments ranging from a painful poke in the side to obliterating one's kneecap with a sledgehammer!
    • Up to date XML support!
    • 3 year on site support!

    TarEditor is copyright (C) SethSoft, 1998-2003. All rights reserved. This work is protected under Dutch copyright laws and the DMCA so I can sue some random people into debt from time to time.

  8. /. duplicate problem by Multics · · Score: 4, Insightful
    Given:

    1) the /. creators are, by and large, the /. people who control the posting of stories

    2) most stories contain at least one URL

    3) URLs, by and large, are unique

    Then:

    Would it be so hard to modify the actual posting code to check that the URL hadn't already been part of a story header within say the last 60 days?

    Such a check would help both /. and all others that run /. code.

    Just a thought!
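    A rough sketch of such a check, not taken from the actual /. posting code: `StoryIndex`, the URL pattern and the in-memory dictionary are all stand-ins for whatever the real story database would provide.

```python
import re
from datetime import datetime, timedelta
from typing import Dict, List

# Deliberately simple URL matcher, good enough for the illustration.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def extract_urls(text: str) -> List[str]:
    """Pull URLs out of a story body, trimming trailing punctuation."""
    return [u.rstrip(".,;:)") for u in URL_RE.findall(text)]


class StoryIndex:
    """Remembers when each URL last appeared in a posted story."""

    def __init__(self, window_days: int = 60) -> None:
        self.window = timedelta(days=window_days)
        self.last_seen: Dict[str, datetime] = {}

    def duplicates(self, story: str, now: datetime) -> List[str]:
        """URLs in this story that were already posted within the window."""
        return [u for u in extract_urls(story)
                if u in self.last_seen
                and now - self.last_seen[u] <= self.window]

    def post(self, story: str, now: datetime) -> None:
        """Record every URL in a story that has just been posted."""
        for u in extract_urls(story):
            self.last_seen[u] = now
```

    An editor's submit step would call `duplicates` first and show a warning if the list is non-empty, then call `post` once the story goes live.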

    -- Multics

  9. TCP slow-down mechanism on IETF list by karl.auerbach · · Score: 4, Interesting

    I suggested a similar mechanism to constipate TCP connections on the IETF e-mail list last summer. The basic idea is to add some new calls to the TCP API so that an application could peek at the incoming traffic without it being acknowledged at the TCP level. If the incoming stream were something bad, then the application could tell the TCP stack to go into a slow acknowledgement mode, thus capturing the spammer in slow-mode transfer.

    For more, see http://www1.ietf.org/mail-archive/ietf/Current/msg17009.html

    The difficulty is getting enough of these deployed so that spammers, and open relays, have a good chance of getting stuck.
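
    The slow-acknowledgement calls described above do not exist in the standard sockets API, so the sketch below is not that proposal. It swaps in the nearest thing an application can do today: ask for a small receive buffer and read from it slowly, so that ordinary TCP flow control makes the sender wait. The function names, buffer size and pause are assumptions for illustration.

```python
import socket
import time
from typing import Callable


def shrink_receive_buffer(sock: socket.socket, size: int = 1024) -> None:
    """Request a small receive buffer; the OS may round the value up."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)


def drain_slowly(sock: socket.socket, chunk: int = 64, pause: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> bytes:
    """Read a few bytes at a time, pausing after each read.

    While we pause, unread data sits in the (small) receive buffer;
    once it fills, a TCP sender has to stop and wait for us.
    """
    received = bytearray()
    while True:
        data = sock.recv(chunk)
        if not data:          # peer closed the connection
            break
        received.extend(data)
        sleep(pause)
    return bytes(received)
```

    A server would call `shrink_receive_buffer` on a connection it has flagged as suspect and then hand it to `drain_slowly` instead of its normal read loop.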