Ask Jonathan Zdziarski
You may recognize the name Jonathan Zdziarski from a recent Slashdot review of his book Ending Spam. Aside from his DSPAM spam filter, Jonathan has also contributed several other projects to the open source community under the GNU General Public License. These projects include Verizon-Compatible SMIL Multimedia Gateway, The Reactive Automated Blackhole List Server, Apache DoS Evasive Maneuvers Module, and several others. Want to know how to effectively contribute projects to the open source community? Curious to ask another programmer about his history? Now is the time to ask. Moderators will select the top few questions that we will forward on to Jonathan sometime tomorrow. The answers to the questions will be displayed next Tuesday when we will encourage Jonathan to participate in the discussion as time permits.
How do you pronounce your name?
What part of computers do you think is best? Which part is the worst?
Show this to your friends and family that don't know what a real hacker is
that's my question.
"I'd rather be a lightning rod than a seismometer." -Ken Kesey
Seeing how Jonathan has put much of his time and effort into Open Source projects over the years, it would seem he is a good candidate for this question: What do you think about the proposed change to the GPL with the upcoming GPL 3? Is it a welcome breath of fresh air to the Open Source Community, or will it just be a reiteration of the previous GPL? What are your thoughts and comments on the GPL 3?
--
Do you get those pesky Nigerian 419 emails? Post them here, and watch the database grow! : http://urgentmessage.org/
Do you have any suggestions for the enthusiastic yet inexperienced? Perhaps a listing of projects in need of developers, with some indication of the level of experience suggested (as well as languages required).
Most antispam software seems to be fairly reactive: whether it is based on keyword patterns, URLs, sender, IP, or the checksum of the message, a certain amount of spam has to first be sent and identified before additional messages will be tagged and blocked. SPF, DomainKeys, etc. require a certain percentage of the Internet to adopt them before they will be truly effective.
What do you see on the horizon as the next big technique to battle spam? How will this affect legitimate users on the Internet?
"The similarities of sysadmins and drug dealers: both measure stuff in K's, and both have users."
postfix or qmail? (i vote postfix)
char name[] = "Jonathan Zdziarski";
cout << strlen(name);  // 18
Do you feel disadvantaged in comparison to people whose last name is "Smith" or "Jones"???
Mr. Zdziarski, it appears as if you are a supporter of use of statistical methods to filter out spam. But these filtering methods have limitations, in that there are ways of getting around these filters. Since human beings can recognize spam better than any software filter, do you not believe that more emphasis should be put on developing software that facilitates DIY spam filtering?
Have you noticed any decrease in the amount of spam since a few of the hardcore spammers have finally been prosecuted? I always wonder if scare tactics will work against these guys, or if they will just move their colo to some small country offshore where it becomes harder to press charges.
Hi,
What was the first project you sent code to, what did the code do,
and where and why had you used the software from the project?
Regards
Are shameless self publicists more or less inclined to write F/OSS?
A feeling of having made the same mistake before: Deja Foobar
I guess the more serious version of this question is the tradeoff of precision and false negatives vs. overkill and false-positives. For instance, my email provider lets me pick country-blacklists, so I reject all email from China, Korea, and Nigeria, where I don't know anybody, and Japan gets accepted with extra filtering, because I know a couple people there who normally don't send me mail - it's not quite a nuke-Asia-from-orbit approach, because people who actually do want mail from people in China can accept it, but people who don't can reject it all and lose the occasional message from a friend at a cybercafe.
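The per-country policy described above can be sketched as a simple lookup. This is only an illustration: the `Action` names, the `POLICY` table, the use of ISO country codes (with `KR` standing in for "Korea"), and the default of accepting everything else are assumptions, not how any particular email provider implements it.

```python
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"
    EXTRA_FILTERING = "accept with extra filtering"
    REJECT = "reject"


# Hypothetical per-user policy keyed by ISO 3166 country code.
POLICY = {
    "CN": Action.REJECT,           # China
    "KR": Action.REJECT,           # Korea (assumed to mean South Korea)
    "NG": Action.REJECT,           # Nigeria
    "JP": Action.EXTRA_FILTERING,  # Japan: accepted, but filtered harder
}


def action_for(country_code, policy=POLICY):
    """Return the action for mail from a country; unlisted countries are accepted."""
    return policy.get(country_code.upper(), Action.ACCEPT)
```

Because the table belongs to each user, someone who does want mail from China simply leaves `CN` out of their own policy, which is the tradeoff the post is pointing at.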
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Jon, your achievements thus far are impressive. I am personally most impressed by your adherence to Open Source Solutions in a corporate environment.
I myself have had numerous interactions with less-than-technically-savvy management-types. Any time I bring up solutions that are quite obviously a better technical and financial choice over software-giant-type solutions, the conversation seems to hit a brick wall. The ignorance of these people on such topics is astounding, and I find many approaches I have tried seem to yield no results in the short term. "Well, yes, your example proves that we would save $500,000 per year using that Open Source solution. But we've decided to go the Microsoft (or what-have-you) route."
With your track record, I can only assume you have found some ways to overcome this closed-mindedness.
I would greatly appreciate any input you have on this, from the perspective of someone who has overcome this obstacle before.
A couple fans told me that my last journal entry was mint; give it a shot. Hope you like.
What punishment do YOU feel is appropriate when a government agency gets a wriggling, thrashing spammer in its pincers?
You can't talk about Wikipedia's flaws on Wikipedia
How much is your name worth at Scrabble?
You can't take the sky from me...
Could you create a "First Post" (and other crappy look-alikes) filter as well?
Which of these Don Martin sound effects is closest to the pronunciation of your last name?
{ }FAGROON KLIBBLE KLIBBLE
{ }KLOINK
{ }SKRITCH SCRITCHA SKRINK
I feel like I'm taking CRAZY pills!
How do you deal with spam checking software causing a delay at the point where you do the spam filtering? As communication backup becomes more important in the business place you have some companies dealing with literally millions if not billions of emails a day. Even an efficient filter will take time to go through that many emails. How do you deal with this?
I have two questions:
1. In your new book, you basically state that Bogofilter is not a Bayesian filter, which was news to some of the Bogofilter people I have spoken to. Can you explain why you feel that Bogofilter is not a Bayesian filter?
2. Bayesian filters have been around for some time now but there still seem to be no standardized testing methods for determining how well filters work in comparison to one another. Do you think that comparative testing would be useful and if so, how should it be performed?
Thanks Jonathan.
Ok, quick and easy question. For my master's thesis, I am testing some additions and tweaks to the common Bayesian spam filter. Without revealing my plans... what are some tweaks or additions you would like to see tested? Thanks.
74 points if you get the 50-point bonus for a word seven letters or longer, 24 if not.
*Note: There is only one Z in the tiles, so the second Z is a blank, and is pointless... just like this post.
You can't talk about Wikipedia's flaws on Wikipedia
I'd like to know how to properly pronounce that last name!
I might know what I'm talkin' about, but then again, this is Slashdot...
With malware becoming increasingly complex (from simply annoying viruses to trojans that turn boxes into zombie spam factories), do you see another product coming into play that takes antispyware/URL filtering (firewall)/antivirus to a new level? Like some sort of unified product (NOT like the 'packages' offered by Norton or McAfee; security suites are dissimilar products just grouped together). I still think that user education is first and foremost, but perhaps some kind of heuristic scanner that integrates the roles of personal firewall, email filter, anti-spyware, process control, etc. into a _single_ package. Our Windows users think that AVG+ZoneAlarm+Spybot SD+Firefox is enough, but there are still the social engineering aspects. Besides, we need more than ClamAV for Linux (and the Open Source community). It won't be too long until alternative OSes become key targets. I also say 'Thank you' for your contributions to the Open Source movement and being part of what makes that movement great.
As much as many people hate it, there's always a percentage that buys advertised items. And with their wallet, this small percentage supports the other camp. You may hate this method of doing business, but there's the other side too: products sold, bring income and jobs for people making these products. For the small percentage of buyers, some products/services may be very much appreciated, or give them things they can't easily obtain otherwise. We'll leave legalities out for now. Some things may be appreciated because they're illegal, but there may be other things that are illegal, while many feel they shouldn't be.
And of course in marketing, there's the saying "there's a new sucker born every day".
Given the fact that some percentage of people seems to want this type of marketing, do you think it will ever die out, or is there only hope of controlling it to acceptable/manageable levels?
Jonathan,
I develop and manage a lightweight Open Source Application that's used to send announce only and discussion mailing lists, similar to the Mailman and Majordomo projects. It's very popular and has a loyal following.
What advice do you have as a developer of this program to:
* Help my users send legitimate messages (either by education (specifically) or by programming techniques)
* Help Spam Filtering Software check the messages my program sends out for possible abuse
* Be a part of the solution to sending legitimate messages to many people, rather than perhaps be part of the problem.
I understand that any tool can be circumvented and abused and I do believe context always plays a part in how to judge something as Good or Bad. I'm sure like many different types of software, Spammers are a problem for my business as well.
I find myself in an interesting position, where I can change how many email messages are sent out. If I can send "better" email messages that are not filtered as spam if they are legitimate and can stop possible abuse of my program, I can help in a solution to people who would like to send out announce only and discussion email messages.
Thanks for your time.
Dada Mail - Program, Art Project or Absurdity?
The SMTP standard that we use for mail transfer was developed in the late 70's - early 80's and has, for the most part, never been updated. In that time period, the idea of hordes of spam flowing through the net wasn't even considered.
It has always been the most obvious solution to me that what we really need is SMTP 2.0, where a server only accepts mail from a user that can authenticate themselves with a name and password. A server can also accept mail from another server, but only for mail directed at legitimate users on its system. Mail servers would have to register with a central authority, and must include their active IP address in that registration. Any attempt to deliver mail from an unregistered server is bounced.
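The acceptance rule proposed above can be sketched as a small decision function. This is a toy illustration only: "SMTP 2.0" is the poster's hypothetical, and the `Delivery` type, the in-memory `REGISTERED_SERVERS`, `LOCAL_USERS` and `CREDENTIALS` tables, and the plaintext password check are all assumptions made to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical data: a real system would query the central registry and a
# proper credential store rather than in-memory tables.
REGISTERED_SERVERS = {"192.0.2.10"}     # IPs registered with the authority
LOCAL_USERS = {"alice@example.org"}     # legitimate users on this system
CREDENTIALS = {"alice": "s3cret"}       # plaintext only for illustration


@dataclass
class Delivery:
    peer_ip: str
    recipient: str
    username: Optional[str] = None
    password: Optional[str] = None


def accept(d: Delivery) -> bool:
    """Apply the two proposed rules; anything else is bounced."""
    # Rule 1: a user who authenticates with name and password may send mail.
    if d.username in CREDENTIALS and CREDENTIALS[d.username] == d.password:
        return True
    # Rule 2: a registered server may deliver, but only to local users.
    if d.peer_ip in REGISTERED_SERVERS and d.recipient in LOCAL_USERS:
        return True
    return False
```

For example, `accept(Delivery("192.0.2.10", "bob@elsewhere.net"))` is false under these assumptions, because a registered server is still not allowed to relay to non-local recipients.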
Wouldn't this simple fix stop 99% of spammers in their tracks? Isn't it about time we updated the SMTP standard?
Life, the Universe, and Everything... in my image.
For example, certain spam blacklists would censor more than was strictly necessary (a subjective opinion, I realize) to block a spammer -- sometimes blocking a whole Class C to get one individual. This would cause other innocent users in that netspace to have their e-mail to hosts using the blacklists silently dropped without any option of fixing the problem besides switching ISPs.
This is an extreme example, but most anti-spam approaches have the following characteristics:
Recently I had to fix an installation where daily messages from a particular host stopped appearing in a mailbox. This system was connecting with an ISP that had offered no spam filtering and had been using a client-based Bayesian classifier with great success, but suddenly the mail coming into the system had scaled back by a factor of ten. Sure enough, the ISP installed a server-based spam filter which took out most of the spam and a good deal of the legitimate mail -- they had a (not well publicized) means of accessing the account settings and turning off the filter, and a holding tank for mail classified as spam, but beyond the last two weeks everything was thrown out.
I'm curious about what you think about server-based approaches vs. client-based approaches to spam classification and filtering and if, maybe, the cure is worse than the disease.
Try not. Do or do not, there is no try.
-- Dr. Spock, stardate 2822-3.
Why so bitter? Can you not be happy that you contributed? Not to belittle your contributions in the Commercial Licensed Code realm, but they seem to be doing fine without, even though your experience through the transition to open source did not go the way you would have wanted.
Fred Grott(aka shareme) http://mobilebytes.wordpress.com
This is arguably out of scope for this interview, but I still feel it's something many Slashdotters would be interested in hearing about.
On your webpage you have an essay describing your Christian beliefs and why you have them. You say many things that most Slashdotters (and nerds and scientists in general) regard as utterly ridiculous. You think the earth is no more than 10,000 years old, you think Christianity is logical, you regard the Bible as a historical document, etc.
No doubt you are aware of the fact that most nerds disagree with you on these things. Indeed, they might even consider you "crazy" for holding them.
Without going into the truths of the beliefs in question, which I'm sure will be debated enough in the Slashdot thread anyway (and I hope you'll join in), what do you think the reason is that so many scientists, nerds and people otherwise rather similar to you think your beliefs are obviously incorrect? Do you think they are all deluded? Do you agree that there might be a possibility that your beliefs are not rational (again, without going into whether or not they are so)?
Best regards,
an AC
NOTE TO MODS: You can include any combination of the introduction, question 1, 2, 3, or 4. Just: if you send more than one question, try to include them in sequential order.
INTRODUCTION
--------------
Microsoft's Ryan Hamlin recently made a statement in New Zealand about "the amazing vehicle of e-mail marketing". Representing Microsoft, he essentially argued that the New Zealand Government's Unsolicited Electronic Messages Bill is too broad. He doesn't like that it
-prevents corporations from gaining any advantage out of collecting masses of information about the public, using it, and selling it to third parties.
-makes it so when a user opts out of one company's e-mail service, he opts out of all of them.
Now why this is such an issue for Microsoft I don't know. I wouldn't go so far as to venture that they might be (planning on) selling personal information gathered from their users without their knowledge. (With their right to do so hidden in some thirty page long EULA, of course.) Or perhaps granting third parties the addresses of their users with MSN e-mail accounts. I can't say for sure whether they would do something like that or not. But this calls into question multiple ideas.
So here are my questions:
*Do you think that "e-mail marketing" is an "amazing vehicle", as Microsoft's Ryan Hamlin so aptly worded it? (In other words, do you think that there is any merit to the claim that e-mail is used honestly and effectively for corporate marketing purposes? Even if unsolicited?)
*What is your position/opinion on e-mail marketing in general, whether "legitimate" or not?
*Should companies have any right to sell personal information collected from users to third parties? Or is it that it's their god-given right and if we don't want that mail, it's our duty to protect ourselves? Where are you on that spectrum?
and finally:
*Do you think penalties and anti-spam laws are too soft on spammers, or are they just right? If you could make any changes, would you, and what?
I would like to know who you are and why Slashdot is asking you anything. Did you ask Slashdot to do this? Who are you and why should we care?
Dear Jonathan Zdziarski,
I work in the credit and accounts department of Union Bank Plc,GHANA. I solicit to write you in respect of a foreign customer with a Domicilliary account. His name is Engineer Manfred Becker. Since the demise of this our customer, Engineer Manfred Becker,who was an oil merchant/contractor, I have kept a close watch of the deposit records and accounts and since then nobody has come to claim the money in this a/c as next of kin to the late Engineer. He had only $18.5mllion in his a/c and the a/c is coded. It is only an insider that could produce the code or password of the deposit particulars. As it stands now,there is nobody in that position to produce the needed information other than my very self considering my position in the bank.
Based on the reason that nobody has come forward to claim the deposit as next of kin, I hereby ask for your co operation in using your name as the next of kin to the deceased to send these funds out to a foreign offshore bank a/c for mutual sharing between myself and you.
Kindly send your reply to my private email address: ssagoe@hotmail.com
Sincerely yours,
Mr.Sabo Sagoe
No Points. Proper names are not allowed.
I get over 350 SPAM every day.
WTF is being done about it?
---
My confirmation word is : condom
I wrote a paper comparing techniques and open source tools to fight spam, and dspam wins: http://web.onda.com.br/nadal/Trabalho_Jeronimo_Zucco.pdf
What do you think about combining other methods (like SPF, DK, Reverse DNS, etc.) at the network level, before the mail arrives?
Thanks and congratulations on your work!!
Jeronimo Zucco
Pick any one you like:
1. If you had a chance to sit down with the world's biggest spammers, how would you convince them to stop?
2. Is spam worth the effort - collecting, generating, sending emails? Does it even have any positive outcome for the company or do most people simply delete it? Are there even people that read or reply to spam?
3. If #2 answer = no: why do they even bother, and do you think sooner or later they will realize it's not working and find a different solution for advertising?
Just a couple for now:
1. In your book, "Ending Spam", you are pretty harsh on commercial filters and basically anything that's not statistical filtering. You make very good points in favor of statistical filtering, but I feel that you've missed a major fact about spam. Statistical filtering requires that the end-user get actively involved in the spam filtering process. What happens when they don't (because, in general, they won't)? How does that affect the attacks you described in chapter 7, and what techniques would you recommend to mitigate apathetic users? A lot of the mitigation strategies for the attacks delineated require (at least somewhat) active end-users.
2. Why did you give so much coverage to Marty Lamb's TarProxy? The project appears to have died long before your book came out and I can't find reference to anyone who actually used it in production. I am surprised that you gave so much space to a project that was basically unproven, especially in the face of proven, commercial technologies that are in the same space, such as the SMS 8160.
I recall hearing a story that you created DSPAM as a response to the trashy emails that your religious leader was receiving. I also see that your religion plays a large role in your life. I'm curious, how a thinking, logical, Christian such as yourself feels about the "intelligent design" movement?
Is this a misinterpretation of scripture? A reaction filled with fear against science? An attempt to distance ourselves from animals so that the atrocities occurring in modern industrial-meat production can be justified? Or is it a revival of much-needed spiritual values in our country?
In addition, I'm curious what your take is on the Intelligent Falling theory?
- passion
Y'all realize Jon is really a nice fellow who is quite easy to get in touch with. If anybody really has a desire to contact him with a question, why don't you? If you wish to open a discussion with him, why don't you catch him on IRC?
Rob
From:
castlecops.com
"ABOUT THE AUTHOR:
Jonathan A. Zdziarski has been fighting spam for eight years, and has spent a significant portion of the past two years working on the next generation spam filter DSPAM. His research in algorithmic theory and neural networking has led to the development of many new approaches in language classification, and he has played a key role in designing some popular algorithms in use today, including Message Inoculation, Bayesian Noise Reduction, and the first functional Neural Networking algorithm for spam filters. Zdziarski lectures widely on the topic of spam and was a speaker at the 2004 and 2005 MIT Spam Conference.
"
My question would be: You have made loud claims about the accuracy of your DSPAM filter without backing them up. The only review of DSPAM that I can find does not support your claims.
Do you have any actual tests of DSPAM that can prove your claims of incredible accuracy?
why on earth is Slashdot interviewing such a person without that being the SOLE topic of discussion?
Not that I'm suggesting that should happen; I'm suggesting he doesn't deserve an interview with such an obviously damaged brain. Without going into anything about "God" or the Bible, the Earth is certainly more than 10000 years old; that's just about the stupidest claim I've ever heard (apart from the one that says the Earth is only 5000 years old..)
http://www.religioustolerance.org/oldearth.htm
Jonathan: Consider that if this argument were true, and "God" created the universe 10000 years ago to *look* like it were billions of years old, it would be equally valid (and equally provable) to say that he created it 5 minutes ago, with the same intentions. He created the universe with you in your chair and dumped a bunch of false memories in your head. Why do you suppose 10000 years is the "magic" number?
Even accepting that, why in the world would God purposely create things to look so much older? What interest does he have in fooling his own creation? (Yeah, yeah.. mind of God and all that.. fake answers need not apply.)
I'm sorry for the OT question, but it is challenging for people of logic and intellect to take creationists seriously in any reasoned discussion, mostly because they wouldn't know one if it hit them in the face.
Is there a Knoppix-derivative with Windows spyware tools working under Wine or with native windows-spyware-tools-for-linux coupled to a captive NTFS filesystem to tidy up boogered PC's without the Rooted Windows running?
Are you serious? Are you for real? Do you seriously believe that *you* are so much smarter than the majority of people in the world who are relying on hundreds, sometimes thousands, of years of tradition?
Even if you were right you are so obnoxious about it that you won't convince anyone of anything except to ignore you. If you had the guts to not post as an AC people could at least mark you so they would never have to see your stupid posts again.
(I'm posting AC because of the creationist hating modders. What's your excuse? They love your point of view)
They pick Microsoft because they want to ensure continuity of support after they fire you because of your massive, obvious attitude problem.
Jonathan:
A quick browse over your book left me wondering about chapter 10.
You stress the importance of continuity when testing a corpus of messages.
Others have presented the following method of testing:
shuffle ham and spam,
use some defined percentage of each to train, rest to test,
calculate results...,
delete database,
repeat,
average results.
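The shuffled procedure listed above can be sketched as follows. This is a minimal illustration, not the method from the book or from any particular filter: `ToyFilter` is a made-up word-count classifier standing in for a real filter, and the function name, `train_fraction`, `rounds` and `seed` parameters are assumptions.

```python
import random
from collections import Counter


class ToyFilter:
    """Hypothetical stand-in for a real filter: counts words seen per class."""

    def __init__(self):
        self.counts = {"ham": Counter(), "spam": Counter()}

    def train(self, text, label):
        self.counts[label].update(text.lower().split())

    def classify(self, text):
        words = text.lower().split()
        spam = sum(self.counts["spam"][w] for w in words)
        ham = sum(self.counts["ham"][w] for w in words)
        return "spam" if spam > ham else "ham"


def shuffled_trials(ham, spam, train_fraction=0.5, rounds=10, seed=0):
    """Shuffle, train on a fraction, test on the rest, repeat, average accuracy."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(rounds):
        train, test = [], []
        for label, messages in (("ham", ham), ("spam", spam)):
            messages = list(messages)
            rng.shuffle(messages)  # the original message order is discarded here
            cut = int(len(messages) * train_fraction)
            train += [(m, label) for m in messages[:cut]]
            test += [(m, label) for m in messages[cut:]]
        flt = ToyFilter()  # "delete database": start each round from scratch
        for message, label in train:
            flt.train(message, label)
        correct = sum(flt.classify(m) == label for m, label in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)
```

The `rng.shuffle` call is the step in question: the filter never sees messages in the order they were actually received.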
Obviously, continuity is lost.
Could you give some more support to why continuity is so important?
The rest is correct. OK, I am Polish, so he may have a different opinion.