Unless someone implements Google App Engine on EC2 (on "Google Previews App Engine", Score: 3, Insightful)
It's inevitable -- someone will write an alternative hosting environment for App Engine applications. Google will also doubtless eventually start selling an App Engine appliance to start penetrating the enterprise market.
"Positive" authentication is not very useful (on "Dealing with Phishing", Score: 2, Informative)
End users cannot distinguish well between legitimate sites and phishing sites. Adding in sugar such as the date of the user's last login is helpful only as a positive reminder that the user is on the right site. It's better than nothing, but not by a factor of 10.
Phishing cannot be prevented completely -- it's a social engineering phenomenon and as such will adapt to any technological intervention that tries to stop it. The best possible "solution" to phishing combines a) hardware authentication, b) increasingly "locked down" web browsers, c) web site "reputation", and d) better anti-phishing protection in email services and software.
Companies like Cloudmark leverage a vast and very active user community to almost instantly detect and mitigate new phishing campaigns. IronKey, founded by the president of the Anti-Phishing Working Group, is developing hardware tokens for authentication. IE7 and Firefox continue to improve their defenses against XSS attacks and the like. And there are good efforts underway to develop URL reputation systems that can help users avoid browsing sites that are dangerous.
Here's one:
http://images.google.com/hosted/life/f?q=porn++source:life&imgurl=c1ab21f9fc98b624
http://blog.mailchannels.com/2008/07/update-o2-leaking-customer-photos.html
Now that O2's MMS servers are offline, it's safe for us to announce a more serious vulnerability that permitted the easy discovery of thousands of truly private MMS messages including videos. See the blog link for more details.
Yes, it will be released 7/11 in the USA.
From the engadget coverage:
Showing all the countries, playing Small World -- most of South America... Norway, Sweden, Finland, Estonia, Latvia, Lithuania, Denmark, Netherlands, Belgium, Poland, Czech, Switzerland, Portugal, Spain, Italy, Malta, Croatia, Slovakia, Hungary, Romania, Greece, Turkey, Jordan, Egypt, Niger, Mali, Senegal, Guinea, Ivory Coast, Cameroon, Kenya, Botswana, South Africa... man, way too many countries!
> STP server correlates this information with the data received from other MTAs
> and replies with a number that reflects how likely the sender is a junk mail source.
How exactly is this done? What differentiates spammers from legitimate senders?
And how is this idea any better than reputation databases which assess the long term
sending history of particular mail servers (and domains, where domain authentication
is provided)?
These days the vast majority of spam is sent from botnets. Botnets by definition
fly under the radar -- the only thing you know about them is that you know nothing about
them until it's too late.
Correlating traffic patterns is an interesting idea, but the author doesn't flesh it out.
What specific correlations would you make? Give us the details!
Yes, some spammers will rapidly adapt to this technique, but it will take a very long time for the majority of them to do so. Grey-listing is still a relatively effective technique, although for reasons explained by other comments in this post, grey-listing causes enough errors with legitimate senders as to be unacceptable to receivers who care about deliverability.
Spam is more than a nuisance now. It's a cause of service disruptions and outages. No-listing, and systems that take it a step further by using reputation to temporarily refuse connections to particular hosts, are now part of the fabric for large email receivers.
Because Henry Stern is a well known anti spam researcher.
Several of the recent articles on the huge increase in spam talk about how the new generation of spam trojans are adapting to beat greylisting. For example, the SpamThru trojan contains a full MTA capable of assembling spam messages from templates it downloads from "template servers".
For personal usage, this is a reasonable technique. Our research has shown that 95% of deliveries from Windows machines are spam. However, if you are considering using fingerprinting in a business or service provider setting, rejecting connections from Windows machines is a bad bad horrible idea. Microsoft Exchange is run by almost as many companies as Sendmail these days (trust me, we've surveyed 400,000 mail servers to determine this). Blocking them all will result in many unhappy end users.
However... fingerprinting can be a very useful technique to identify a bad sender when nothing else is known about it. For example, with our connection management software, you can configure it to throttle (i.e. slow down, traffic shape, etc.) connections from Windows-based hosts if the host has no previous good reputation. See an overview of the technique in this OnLAMP article by Stas Bekman.
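To make the idea concrete, here is a minimal sketch of that kind of policy in Python. It is not how our software is actually implemented; the `Sender` fields, the `GOOD_REPUTATION` threshold and the policy labels are all made up for illustration. The point is only that the fingerprint is used to slow a connection down when nothing better is known, never to reject it outright.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold: any score at or above this counts as "good reputation".
GOOD_REPUTATION = 0.5


@dataclass
class Sender:
    ip: str
    os_fingerprint: str          # passive OS guess, e.g. "windows", "linux", "unknown"
    reputation: Optional[float]  # None means nothing is known about this host yet


def connection_policy(sender: Sender) -> str:
    """Return "accept" or "throttle". The fingerprint alone never causes a reject."""
    # A host with an established good history is accepted regardless of its OS,
    # so a well-behaved Exchange server is never slowed down.
    if sender.reputation is not None and sender.reputation >= GOOD_REPUTATION:
        return "accept"
    # No good history plus a Windows fingerprint: slow the connection down.
    if sender.os_fingerprint == "windows":
        return "throttle"
    # Everything else is left alone in this sketch.
    return "accept"
```

For example, `connection_policy(Sender("192.0.2.1", "windows", None))` returns `"throttle"`, while the same host with a reputation of 0.9 is accepted.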
Who would have thought that plunking a child in front of a television for hours on end would condition that child to become an extreme introvert?
I can actually comment on that. We've surveyed 400,000 mail servers at organizations around the world and have found that Sendmail still holds on to 13% of the market.
Yes, we looked at using POE. We concluded that POE is just far more than we needed for this application.
It would have been too difficult to make POE rock performance-wise in addition to ensuring that POE used an efficient event library like libevent.
And in this kind of application, you need awesome performance. We profiled the app with strace for weeks to get rid of unnecessary system calls.
[shillery notice: I am CEO at MailChannels]
spamd gave us our initial inspiration. I talked with Bob Beck at the Cansecwest security conference after he presented on spamd and was -- to put it mildly -- blown away.
It's important to understand that spamd does not actually deliver mail. It just responds r e a l l y s l o w l y and then returns a 400-series code to force the sender to try again. After the first time, a packet filter rule is added that redirects that sender to a real MTA, which receives the message.
So in essence spamd is (primarily) used as a grey-listing system.
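Roughly, the grey-listing behaviour described above looks like the sketch below. This is a simplification and not spamd's actual code: the real thing also stutters its responses and works through packet filter rules, whereas the `Greylist` class and its in-memory set are just stand-ins I made up for illustration.

```python
class Greylist:
    """Temp-fail a sender's first attempt, then let later attempts through."""

    def __init__(self):
        # Stand-in for the packet filter rules that redirect known senders
        # to the real MTA.
        self.passed = set()

    def check(self, sender_ip: str) -> int:
        """Return an SMTP-style reply code for a connection from sender_ip."""
        if sender_ip in self.passed:
            # Seen before: this attempt would be handed to the real MTA.
            return 250
        # First contact: remember the sender and ask it to try again later.
        self.passed.add(sender_ip)
        return 451
```

A first call to `check("192.0.2.10")` gets a 451, and a retry from the same address gets a 250. A sender that never retries never gets its mail through, which is the whole trick.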
Traffic Control actually delivers the mail in addition to efficiently slowing down connections from _certain_ senders.
In that way it's a lot more sophisticated and less prone to deliverability problems. Deliverability is a major concern for corporate customers -- even though spam is also a big deal.
[full disclosure: I work with Stas at MailChannels]
We looked at using the new Perl threads, but Perl 5.8 threads suffer from a few severe limitations.
1. When you create a new thread, a complete copy of the interpreter is made. The new thread makes use of this new interpreter instance and cannot communicate with the original thread except via the threads::shared module or some traditional IPC mechanism. In short, they're no better than forking a new process and in many ways, they are far worse than this.
2. Perl threads are still quite unstable.
Yes, we could have used Python. Or Ruby. Both these languages have better threading support by leaps and bounds. Additionally, they have great asynchronous libraries like Twisted. Why'd we use Perl? Well, I suppose it's in our blood. Between Stas and the rest of the dev team, we have a good cross-section of Perl talent.
[full disclosure: I work with Stas at MailChannels]
> The inherent delays for just about every message would be particularly painful for business email users, but
> even residential ISP customers are constantly opening tickets when they observe a delay (I work closely
> with several large ISPs, which is how I know).
That would be a problem if every single message were slowed down, but that's not the case. The system uses sender reputation and behaviour to ensure that only malicious senders are slowed down. More often, our customers find that end users notice better deliverability when this technology is in place, because the load on the spam filters is reduced so much that queues no longer back up and cause really bad delays.
One way or another, you have to delay some of the traffic. You either do it up front and selectively -- applying the pain to the bad senders -- or you do it after the messages are queued, which hurts the recipients.
[full disclosure and shillery alert: I work with Stas at MailChannels]
:)
You make some very good points -- and these are all concerns we had when we set out to build this software.
Fortunately for the world, these concerns have turned out to be unwarranted. Furthermore, our experience in actually deploying this technology has been far more breathtaking than we had imagined -- both in terms of spam mitigation and improvements in scalability.
> the core assumption, and the only thing that makes this work, is that botnet spam software will _always_ just
> give up after 30 seconds;
I have a theory that spammers will always be impatient. I believe this theory for several reasons:
1. Spam campaigns are now recognized by anti-spam companies in minutes or hours. New campaigns therefore have a very short life expectancy and have to be completed as fast as possible. If mail can't get delivered fast, it's time to move on to a new domain to get it moving again. With collaborative filters like Cloudmark recognizing campaigns in less than 60 seconds, spammers obviously have to move traffic fast.
2. Botnets are not unlimited in their size or bandwidth capacity. Typically, botnets these days have between 1,000 and 10,000 hosts. Any larger and the command and control channels are very quickly noticed and shut down by service providers. Botnets cost money too -- $250/hour for a 10K botnet is typical.
3. Spammers' raison d'être is to send lots of mail and hope that a small percentage of recipients buy something. The only way to make the business profitable is to send huge amounts of mail. If all zombie traffic in the world was magically being slowed down, spamming would no longer be profitable and spammers would tend to focus more on things like highly targeted phishing instead. Not surprisingly, we're already starting to see this.
4. Because #3 isn't going to happen any time soon, and in light of the technical constraints (1 and 2), spammers have no choice but to abort their connections within a very short time frame. It's just the nature of the economic beast. Hanging on is just for posterity. It doesn't make economic sense.
5. It works. And it's very very scalable. By slowing down traffic and multiplexing what remains, mail server load drops by 90%. In big installations, that means no more being paged in the middle of the night because your cluster of 4-way Xeons with 8GB of RAM is borked by a distributed spam burst.
Oh -- and of course you can't just slow everything down. It's important to be very selective so that legitimate mail isn't delayed.
> if this throttling technique ever became commonplace, spammers would just write their
> own asynchronous mailer -- it's not THAT hard...
Actually, it is that hard. Even Stas got a headache working on this project.
But even if it were easy, it would be pointless for a spammer to launch more than one connection per zombie. If a sender is marked as suspicious, the sender's concurrency is severely limited. One connection per zombie, at 5 bytes per second -- that's just not economical.
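Here is the back-of-the-envelope arithmetic behind that claim, as a small Python sketch. The 5 bytes per second, the 30-second give-up time, and the $250/hour 10K botnet all come from this thread; the 10 KB message size is purely my assumption, so treat the results as illustrative rather than measured.

```python
RATE_BYTES_PER_SEC = 5         # throttled rate per connection (from this thread)
PATIENCE_SEC = 30              # give-up time quoted earlier in the thread
BOTNET_HOSTS = 10_000          # "10K botnet"
BOTNET_COST_PER_HOUR = 250.0   # dollars per hour (from this thread)
MESSAGE_BYTES = 10_000         # ASSUMPTION: a 10 KB spam message


def seconds_to_send(message_bytes, rate):
    """Time for one throttled connection to push one message."""
    return message_bytes / rate


def messages_per_hour(hosts, rate, message_bytes):
    """Best case for the spammer: one throttled connection per zombie, no aborts."""
    return hosts * rate * 3600 / message_bytes


def cost_per_message(cost_per_hour, hosts, rate, message_bytes):
    """Botnet rental cost divided by best-case hourly throughput."""
    return cost_per_hour / messages_per_hour(hosts, rate, message_bytes)


# One message takes 2000 seconds (about 33 minutes) at 5 bytes per second.
one_message_sec = seconds_to_send(MESSAGE_BYTES, RATE_BYTES_PER_SEC)
# A zombie that gives up after 30 seconds has sent only 150 bytes.
bytes_before_giving_up = RATE_BYTES_PER_SEC * PATIENCE_SEC
# The whole botnet manages 18,000 messages per hour at best.
botnet_throughput = messages_per_hour(BOTNET_HOSTS, RATE_BYTES_PER_SEC, MESSAGE_BYTES)
# That works out to roughly 1.4 cents per message.
per_message = cost_per_message(
    BOTNET_COST_PER_HOUR, BOTNET_HOSTS, RATE_BYTES_PER_SEC, MESSAGE_BYTES
)
```

Under those assumptions, a fully throttled 10K botnet delivers about 18,000 messages an hour at around 1.4 cents each, and that is before counting the connections that get abandoned after 30 seconds.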
> furthermore, i bet there are some shitty legitimate MTAs that would just give up too, causing actual
> mail to get discarded
Let's just say the gap between the patience of spammers and the patience of legitimate MTAs is very large indeed. And by carefully fingerprinting and assessing sender reputation, this problem can be minimized to the point where it is a far smaller problem than content filter false positives.
I also want to point out that this technology does not make email suck by slowing it down. It in fact speeds up delivery of legitimate mail in most cases because the load is so reduced on the rest of the infrastructure.
Just talk to our customers. One of them was running four 4-way Xeon boxes with 8GB of RAM each -- all this to service the spam filtering needs of just 10,000 end users. He told us he hadn't slept a full night in months because of load-based outages. Since installing the software Stas built, the only alert he's received is a notification that the load level dropped below the panic threshold!
Touché -- I can't possibly disagree with you on the benefits of living in Canada.
And I suppose being the #1 location in which outsourcing of IT takes place can't be a bad thing, either.
I'm Canadian and from my perspective, outsourcing is a good thing for Americans. Americans have long complained about the loss of jobs to foreign countries where wages are lower (such as Canada). The truth is that, despite this outsourcing, Americans are still far better off than workers in other countries. Americans earn more, have more time off, and have greater choice of employers than the workers in any other country in the world. Just ask any Canadian what the number one reason to move to the States is and he'll answer: it's the salary, stupid.
So what is the average geek to do about this outsourcing problem? Retire to India. For the amount your Prius will fetch on Craigslist, you can live like a king for many years in India.
TechTarget is paid by Oracle and others to write articles like this, so keep in mind the bias when reading the linked article.
Flickr was actually acquired for $35M, or so the rumours go.
That's eight figures, not seven.
Always look to spammers and pornographers to solve the world's most challenging computational puzzles before anyone else.
Hear, hear!
.. with a large, expensive manual classification component. The problem with Pandora and all human-based music classification systems is that the classification is based on very high level song characteristics such as genre. A lot of information about the music is missed if these high level characteristics are all you look at.
I am familiar with a startup called Memotrax (http://www.memotrax.com/), which is using technology developed by a group of mathematicians and computer music experts at the University of British Columbia. Their approach works amazingly well and does not rely on high level characteristics such as genre. As a result, it's quite possible for Memotrax to find you a piece of electronica that is very well related to that Keith Jarrett piano piece you were just listening to.
There is no public demo available yet, but look out for it. It's truly amazing.