Slashdot Mirror


TarProxy Creates Tar Pit... For Spammers

agravaine writes "I ran across TarProxy, which, IMHO, is one of the cleverest spammer-handling ideas I've seen yet. The gist: Early detection of incoming spam [using the statistical techniques pioneered on the client side] could be used to create an artificial scarcity of bandwidth experienced only by spammers." This project hasn't gone very far yet, but essentially it slows SMTP responses to suspected spammers. If this really works, and is installed on enough of the net, it could make a difference. 144 spam so far today. Anything would be an improvement. CT Yup, it's a dupe. There wasn't anything better to post at 9am on a Sunday, so you can just bitch about me instead ;)

23 of 164 comments

  1. Deja Vu by BorgDrone · · Score: 4, Informative

    Anyone else get a bit of a deja vu feeling reading this news post?

    1. Re:Deja Vu by baptiste · · Score: 4, Funny

      Maybe we can use statistical techniques on the client side to help the editors avoid duplicate stories.

  2. Another similar project... by jdreed1024 · · Score: 2, Funny
    I came across another similar project on Friday. I think I have a link to it somewhere...

    Ah, here it is. Using Statistics to Cause Spammers Pain. Posted on some website named slashdot.org. It also called itself TarProxy and .... oh ... uh .... never mind.

    --
    There is no sig, there is only Zuul.
  3. Spammers could put time limit on SMTP connections by bubblegoose · · Score: 3, Insightful

    I don't see how effective this could be. How long before spammers get smart and set their SMTP program to give up after X seconds?

    Telemarketers killed the Telezapper; they would do the same here. It's just a junk-busting arms race.

    --
    I hope that someday we will be able to put away our fears and prejudices and just laugh at people. - Jack Handey
  4. Brilliant by FunWithHeadlines · · Score: 4, Insightful
    What's inefficient about current spam solutions?:

    "These classifiers already come in many forms. There are POP3 proxies, IMAP proxies, mail file processors, and even classifiers built directly into mail clients. I use PopFile (a naïve Bayesian classifier in a POP3 proxy) at home with great success. Some work better than others, but with a little training, they all seem to work pretty well. Unfortunately, they have a common shortcoming: They don't cause the spammers any pain."

    What is the goal?:

    "And we all want to cause spammers pain."

    How do they want to accomplish this pain?:

    "None of these classifiers are capable of causing the spammers any pain because the spammer is long gone by the time the classifier has the opportunity to process the message. What we need is a way to use the classifier against the spammer while the spammer is still connected."

    This is brilliant. If all you do is clean up after the spammers when they are long gone, there is little motivation for them to stop. So what if they've dumped a bunch of garbage in your in-box? They don't stick around to see you clean up. But this idea hits them while they are in the process of spamming you.

    That's the key: Make it harder/more unpleasant/less cost-effective for the spammers and you discourage them from spamming. Hit the source, not the results.
    ------

  5. Interesting, but won't solve the problem by TheAB · · Score: 4, Insightful

    This is an interesting pro-active approach, but isn't most mail sent through open proxies, which have absent sysadmins? If we can't get them to lock down their mail servers, how can we get them to install this?

    1. Re:Interesting, but won't solve the problem by Fat+Casper · · Score: 2, Insightful
      If we can't get them to lock down their mail servers, how can we get them to install this?

      We don't need them to install this. They are its intended victims. If you have an open relay, the spam comes from you. TarProxy doesn't cut off or blacklist, it just makes it expensive to send spam. These "absent sysadmins" are going to have to show up at work and lock themselves down. Then they will magically stop being penalised. I don't give a damn about the spam they receive, just that which they forward.

      --
      I spent a year in Iraq looking for WMD and all I found was this lousy sig.
  6. Easy to defeat, just use dynamic spamming software by sanermind · · Score: 4, Interesting

    In the spirit of repetition...

    Easy to defeat, just use spamming software that dynamically increases its connection pool whenever it encounters a 'slow' SMTP recipient. Even if a large part of the net population were running this, the spammer could just spawn thousands of simultaneous (slowed down, yes) connections, and still maximize his bandwidth utilization. If it takes 2 minutes to send each message, it doesn't matter if he's sending 5000 messages at once!

    I believe Linux, for example, allows up to 8192 open sockets, and I think this can be changed with a sysctl command, and most definitely could be with a few changes to kernel headers.

    Sure, it would take a machine with decent memory, but that's not too hard to find.
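
    A back-of-the-envelope sketch of the claim above, in Python: if every delivery is held for a fixed delay, the steady-state send rate is roughly the number of open connections divided by that delay. The function name is made up for illustration, and the figures (5000 connections, 2 minutes) are just the ones from this comment. It ignores memory, bandwidth, and any limits on the receiving side.

```python
def messages_per_second(connections: int, delay_seconds: float) -> float:
    """Steady-state send rate when every delivery is held for delay_seconds."""
    if delay_seconds <= 0:
        raise ValueError("delay_seconds must be positive")
    return connections / delay_seconds


# Figures from the comment: 5000 simultaneous connections, 2 minutes each.
rate = messages_per_second(5000, 120)
print(f"{rate:.1f} messages/second")
print(f"{rate * 3600:,.0f} messages/hour")
```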

    --

    ---
    the pen is mightier than the sword, the sword is mightier than the court, the court is mightier than the pen.
  7. OpenBSD's PF? by grub · · Score: 2, Insightful


    OpenBSD has an (alpha? beta? alpha hydroxy? I dunno) anti-relay addition to the PF firewall. Theo first mentioned it here and the story was carried here. It sounds similar in that it puts the onus of time and bandwidth waste back on the spammers.

    --
    Trolling is a art,
  8. Causing pain, but indirectly by httptech · · Score: 4, Interesting
    Most spammers don't send hundreds of thousands of emails from their own connection. Generally they use open relays to propagate a few messages each with a huge RCPT list. So, tarpitting does nothing to the spammer directly. However, tarpitting the open relay does accomplish something - it could cause a huge backlog of mail, eventually letting the relay choke off its own resources as spammers kept trying to dump messages on it. This would cause some indirect pain to the spammers, as finding open relays that could deliver mail quickly would be difficult. It might also alert the mail administrator to the problem, thus encouraging them to close their server to relaying.

    And you would not need to roll this out on most of the net. If the large ISPs and webmail providers started doing this it would have a significant impact. Much of the spammer's distribution list consists of a few domains: yahoo, hotmail, aol, etc. If the large providers implemented tarpits it could quickly damage the ready supply of open relays for spammers.

  9. Re:Spammers could put time limit on SMTP connectio by JanneM · · Score: 3, Insightful

    No problem. It means X seconds in which they do not send another message, and no message sent through that SMTP gateway. With enough mailservers doing this, it will severely limit the number of messages they can send in a given time.

    --
    Trust the Computer. The Computer is your friend.
  10. How about doing something with *closed* relays? by rnt · · Score: 3, Interesting

    Not exactly the same thing as the article is about, but still related: My mailserver is properly secured and refuses to relay anything except legitimate mail (i.e. it will accept incoming mail for users on the domains it serves and it only relays mail to the outside world when it's from a predefined set of internal machines). There are plenty of spammers trying to convince my mailserver to send their spam to other people, but all get a nice "relaying denied" message and a couple of lines in my maillog.

    I think it's a safe bet all relaying attempts originating from the outside of my network are spammers. The information in the maillog about denied relaying attempts should give an accurate list of IP-numbers used by spammers.
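
    As a rough sketch of pulling that list out of a maillog (Python): the log layout here is an assumption, loosely modelled on sendmail-style lines that carry "relay=... [a.b.c.d]" and the text "Relaying denied". Other MTAs log differently, so the pattern would need adjusting. The function name and the sample lines are invented for the example.

```python
import re
from collections import Counter

# Assumed log shape: the rejecting line contains "relay=<optional host> [a.b.c.d]"
# and the phrase "Relaying denied". Adjust the pattern for your own MTA.
RELAY_IP = re.compile(r"relay=[^\[\]]*\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def denied_relay_ips(lines):
    """Count denied relaying attempts per source IP address."""
    counts = Counter()
    for line in lines:
        if "relaying denied" not in line.lower():
            continue  # ordinary deliveries and other rejections are skipped
        match = RELAY_IP.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample lines using documentation-only IP ranges.
sample_log = [
    "Mar  2 09:01:12 mx sendmail[411]: ruleset=check_rcpt, relay=[192.0.2.15], reject=550 5.7.1 Relaying denied",
    "Mar  2 09:01:40 mx sendmail[412]: ruleset=check_rcpt, relay=host.example [198.51.100.7], reject=550 5.7.1 Relaying denied",
    "Mar  2 09:02:03 mx sendmail[413]: ruleset=check_rcpt, relay=[192.0.2.15], reject=550 5.7.1 Relaying denied",
    "Mar  2 09:02:30 mx sendmail[414]: from=<friend@example.org>, relay=[203.0.113.9], stat=Sent",
]

for ip, attempts in denied_relay_ips(sample_log).most_common():
    print(ip, attempts)
```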

    Doesn't this give some interesting opportunities?
    Creating spamtrap daemons that listen on servers that aren't mailservers (so the fact that they behave similarly to a real mailserver and listen on the same TCP port is just a coincidence). Those servers should be unlisted, not have any DNS records pointing at them as MX for any domains, etc.
    The only way to find them should be by randomly scanning an IP range.
    In that case the only people using them would be spammers trying to abuse random mailservers and it would be pretty safe to have the fake mailserver pretend to accept the mail, wait a while, try to gobble up some resources of the spammer, and finally dump the spam-attempt to /dev/null and tell the spammer what he or she wants to hear: I have delivered your junk. The logs would prove useful, the spam is prevented. Happy happy, joy joy.

    The biggest disadvantage would be that such a fake relaying server would probably trigger some of the open-relay scanners (although the clueful scanners would wait until a message is actually received). Hmmm, spammers could do the same, really probing a mailrelay before trying to use it...

    Anyway, it would cost spammers more and more effort and probably annoy the hell out of them, which is a Good Thing.

  11. But he asked for it... by Anonymous Coward · · Score: 2, Informative
    From CmdrTaco's journal entry 2/18/03

    Course in 2 full hours, not a single user emailed me. It's quite different than say, 3 years ago where every story I posted resulted in a few dozen emails in my box. I guess it's a credit to the comment system that users have figured out that they can interact better online & with each other than by emailing me directly. But that said, we have a third of a million readers. Not one of them thought 'Hey, maybe I should tell Rob a dupe was posted'?


    And then he goes on to say that subscribers should check for dupes, since they're paying customers and all.
  12. How about a Slashdot tar-pit for editors? by wackybrit · · Score: 3, Funny

    Every time they post a dupe their bandwidth to Slashdot gets cut by half.

    After a few dupes they end up with 5 minutes between server requests, which gives them ample time to check whether it was actually a dupe or not.

    Et voila, a Slashdot that only gets 2 posts a day, but at least they're not dupes.

  13. Can't handle POP by DuckWing · · Score: 2, Interesting

    While this is a very cool idea and works if you run your *own* mail server, it doesn't do any good for those of us that grab our mail from ISPs and use POP (or some other protocol). It means we have to convince our ISP to use this product/concept, which in my case (cable company) is impossible since they are a bunch of twits anyway.

    --
    -- DuckWing
  14. Re:God, why do people care about duplicate stories by cowmix · · Score: 2, Insightful

    Dupes waste everyone's time. They show the lack of care from the /. staff. They could either automate DUPE detection or read their own site a little more carefully. They choose to do neither.

    Arg.

  15. Doesn't sound easy to me - please help me clarify by Featureless · · Score: 4, Interesting

    Hmm... This calls for some TCP geekhood and some strong math. I am way too hung over for math. Let's just talk about it in broad strokes first.

    If I'm tuning this package, I can make these delays REAL big. I mean, email is one of those systems where a false positive resulting in even a... let's say an **8 hour** delay to a legitimate message would still be considered perfectly fine for most purposes. There's fuzzy logic in play here; I'm thinking not all delays will be equal. But what if you were just really harsh on suspected spam? Not such a loss IMO. Of course... I haven't considered that you will have increased reliability problems trying to hold a stream open for 8 hours, but remember, a legitimate mailserver will keep resending, and as we go Bayesian on servers, perhaps we will learn to resend for a little longer as well? Or perhaps there is another protocol solution (i.e. letting the sender know they're being delayed for spam... so perhaps giving them the option to reformulate their message and resend?) Let's just press on. The precise amount of the delay may not necessarily be important.

    If I'm sending 50 million messages (a modest spammer's run, if I'm well enough informed) and each one holds me on the line for 8 hours that means 400 million hours if run serially. At the 8192 concurrent thread barrier that's still almost 50,000 hours (~5 years)... with mathematical convenience, to do this entire run in 8 hours you will require **50 million** concurrent threads? Or should I have just stayed in bed longer?

    Now it's looking like the exact length of the delays, and the exact number of concurrent threads is not actually something worth too much niggling debate. We just have to get familiar with the orders of magnitude we're dealing with.
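
    For anyone who wants to redo the orders-of-magnitude arithmetic above, here it is as a few lines of Python. The inputs (50 million messages, an 8 hour delay, 8192 concurrent connections) are the hypothetical figures from this comment, not measurements, and the helper name is made up.

```python
import math

MESSAGES = 50_000_000      # size of the hypothetical spam run
DELAY_HOURS = 8            # how long each delivery is held
MAX_CONNECTIONS = 8192     # the socket limit mentioned upthread

# Total connection-hours if messages were sent strictly one after another.
serial_hours = MESSAGES * DELAY_HOURS

# Wall-clock time when MAX_CONNECTIONS deliveries are in flight at once.
pooled_hours = serial_hours / MAX_CONNECTIONS
pooled_years = pooled_hours / (24 * 365)


def connections_needed(messages: int, delay_hours: float, deadline_hours: float) -> int:
    """Concurrent connections required to finish the run within deadline_hours."""
    return math.ceil(messages * delay_hours / deadline_hours)


print(f"serial: {serial_hours:,} hours")
print(f"with {MAX_CONNECTIONS} connections: {pooled_hours:,.0f} hours (~{pooled_years:.1f} years)")
print(f"connections to finish in 8 hours: {connections_needed(MESSAGES, DELAY_HOURS, 8):,}")
```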

    Consider the protocol-to-data ratio of an SMTP transaction over TCP alone. How much is data and how much is just protocol overhead in a given mail transaction? We can figure this out down to the last bit, but I'm going to just throw out the hypothetical notion that when you have to initiate a new SMTP transaction for every message you send, the bandwidth overhead for doing this millions of times is not inconsiderable.

    And we have to think of the other end. Spammers may write themselves custom TCP/IP stacks, but receivers certainly will not. Consider AOL. AOL encompasses some significant percentage of your list of victims. What is AOL going to do with anywhere _near_ that many simultaneous connections... ***from just one spammer?*** Why, call the FBI, of course! It's a DOS attack!

    I'll stop now. I wouldn't be surprised if there were other angles on this I haven't considered. But at first blush it doesn't seem nearly as easy to beat as you suggest.

    Perversely I think the biggest danger in this technique is that it may become widespread and then force spammers to really confront Bayesian filtering head-on. Of course, just thinking aloud (and this probably is undoable for privacy reasons, but just to open a line of speculation) you can do some interesting things with these kinds of filters... retain lists of email addresses that you've received mail from (and/or replied to) more than once... they get a lower score (and a lower delay) than first-time senders... etc. etc. So it's not clear even with very well-designed spam (another cost increase for spammers!) that you could win against the filter.
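
    To make the "lower score, lower delay for known senders" speculation concrete, here is one possible shape for it in Python. Everything in it is an assumption for illustration: the squaring, the halving per prior exchange, the cap of 10, and the 8 hour ceiling are arbitrary choices, not anything TarProxy is known to do.

```python
def tarpit_delay(spam_probability: float,
                 prior_messages_from_sender: int = 0,
                 max_delay_seconds: float = 8 * 3600.0) -> float:
    """Map a classifier score to a delay, going easier on senders seen before."""
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("spam_probability must be between 0 and 1")
    # Hypothetical trust rule: each earlier exchange halves the effective score.
    trust_discount = 0.5 ** min(max(prior_messages_from_sender, 0), 10)
    effective_score = spam_probability * trust_discount
    # Squaring keeps low scores almost undelayed and punishes high scores hard.
    return max_delay_seconds * effective_score ** 2


print(tarpit_delay(0.99))                                  # first-time sender, very spammy
print(tarpit_delay(0.99, prior_messages_from_sender=3))    # same score, known correspondent
print(tarpit_delay(0.10))                                  # probably legitimate
```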

  16. New! From SethSoft! by Dark+Lord+Seth · · Score: 4, Funny

    TarEditor!

    A revolutionary new system designed to keep your editors from posting duplicate stories! This new system works by a punishment system, ranging from mild electrostatic shocks to gushes of boiling sulphuric acid! When properly applied, this piece of software will introduce a new era in the world of online journalism, having a significant effect on most long-term strategies and pleasing the stock market at the same time! This is all you're going to need in our post-modern capitalist system for guaranteed success!

    Features include:

    • Compatible with all editors!
    • Name based modifier system! Want Taco to get kicked in the nuts for the slightest mistake? NO PROBLEM!
    • HTTP proxy supporting HTML 4.0, 4.1 (all forms) XML, XHTML, DHTML AND MORE!
    • Punishments ranging from painful poke in the side to obliterating one's kneecap with a sledgehammer!
    • Up to date XML support!
    • 3 year on site support!

    TarEditor is copyright (C) SethSoft, 1998-2003. All rights reserved. This work is protected under Dutch copyright laws and the DMCA so I can sue some random people into debt from time to time.

  17. /. duplicate problem by Multics · · Score: 4, Insightful
    Given:

    1) the /. creators are by and large the /. people that control the posting of stories

    2) Most stories contain at least one URL

    3) URLs, by and large, are unique

    Then:

    Would it be so hard to modify the actual posting code to check that the URL hadn't already been part of a story header within say the last 60 days?

    Such a check would help both /. and all others that run the /. code.

    Just a thought!
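
    A minimal sketch of such a check in Python, assuming the posting code can look up when each URL last appeared in a story. The function names, the dictionary standing in for that lookup, and the example.org URLs are all hypothetical; this is not how the actual posting code works.

```python
from datetime import datetime, timedelta


def normalize(url: str) -> str:
    """Very light normalization so trivially different spellings still match."""
    return url.strip().rstrip("/")


def find_duplicate_urls(story_urls, posted_urls, now, window=timedelta(days=60)):
    """Return the URLs in a new story that already appeared within the window.

    posted_urls maps a normalized URL to the datetime it was last posted.
    """
    cutoff = now - window
    duplicates = []
    for url in story_urls:
        last_posted = posted_urls.get(normalize(url))
        if last_posted is not None and last_posted >= cutoff:
            duplicates.append(url)
    return duplicates


# Hypothetical history: one URL posted two days ago, one posted a year ago.
now = datetime(2003, 3, 2, 9, 0)
history = {
    "http://example.org/tarproxy": now - timedelta(days=2),
    "http://example.org/old-news": now - timedelta(days=365),
}
new_story = ["http://example.org/tarproxy/", "http://example.org/old-news"]
print(find_duplicate_urls(new_story, history, now))
```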

    -- Multics

  18. Re:Spammers could put time limit on SMTP connectio by Mike+Schiraldi · · Score: 2, Informative

    Additionally, almost all spam goes through an open relay -- the spammers almost never talk directly to the final mailserver. So TarProxy isn't hurting the spammers so much as the open relay sysadmins. The open relay sysadmins, seeing their mail servers slow down and run out of disk space, will either take the time to figure out what's going on (and hopefully solve the problem by securing their server), or do nothing and have their server hammered to the point where it can barely spam anymore.

  19. Re:Spammers could put time limit on SMTP connectio by IKEA-Boy · · Score: 3, Insightful

    I don't see how effective this could be. How long before spammers get smart and set their SMTP program to give up after X seconds?


    This doesn't matter since most spammers use open relays to send their junk. They generally don't have control over the timers for the relays they are using. The relay will be slowed down to a crawl making it less useful for them. Of course the spammer can get around this by running his own mailserver but this means he needs to invest a lot more money in bandwidth/hardware/upkeep etc. and he will make himself much more visible to the net.

  20. TCP slow-down mechanism on IETF list by karl.auerbach · · Score: 4, Interesting

    I suggested a similar mechanism to constipate TCP connections on the IETF e-mail list last summer. The basic idea is to add some new calls to the TCP API so that an application could peek at the incoming traffic without it being acknowledged at the TCP level. If the incoming stream were something bad, then the application could tell the TCP stack to go into a slow acknowledgement mode, thus capturing the spammer in slow-mode transfer.

    For more, see http://www1.ietf.org/mail-archive/ietf/Current/msg17009.html

    The difficulty is getting enough of these deployed so that spammers, and open relays, have a good chance of getting stuck.
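
    The new TCP calls described above don't exist in the ordinary sockets API, so the following Python sketch is only a user-space approximation, not the proposed mechanism: the application reads a few bytes at a time and sleeps in between. The kernel still acknowledges data as it arrives, but on a real TCP connection a full receive buffer shrinks the advertised window and the sender ends up stalling. The function name and parameters are made up for illustration.

```python
import socket
import time


def slow_drain(conn: socket.socket, chunk: int = 16, pause: float = 0.5) -> bytes:
    """Read everything the peer sends, `chunk` bytes at a time, pausing between reads."""
    received = bytearray()
    while True:
        data = conn.recv(chunk)
        if not data:          # peer closed the connection
            break
        received.extend(data)
        time.sleep(pause)     # leave the rest queued in the kernel's receive buffer
    return bytes(received)


# Local demonstration over a connected socket pair (no network needed).
sender, receiver = socket.socketpair()
sender.sendall(b"MAIL FROM:<spammer@example.com>\r\n")
sender.close()
print(slow_drain(receiver, chunk=8, pause=0.01))
receiver.close()
```

    On a real listening socket one would probably also shrink the receive buffer (SO_RCVBUF) so the window closes sooner, at the cost of slowing legitimate senders if the classifier guesses wrong.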

  21. Re:Maybe just remove duplicates from the front pag by Neon+Spiral+Injector · · Score: 2, Funny

    Maybe there should be a special Dupe Section. It'll surely get more usage than the radio section.