If KDE/Gnome can just come up with something unique and useful, and chuck the Win98-ish crap
This is exactly the opposite direction from what is being done, and for good reason. Right now, the focus is not on re-inventing everything, but on figuring out where the GNOME and KDE HIGs have common elements that can be merged, and where they are unique. Then an effort to merge those last chunks can proceed by actually changing the two where appropriate.
Also, you may not realize just what an HIG is. It has less to do with what you *see* than with how you see it. Check out the GNOME HIG for more details. It specifies things like what buttons you should put on an alert dialog; when you should use modal vs non-modal windows; default keyboard shortcuts and menu names; etc.
If all you want is a more BeOS, MacOS, etc. looking desktop, or even a totally unique look, you can do that within the constraints of the HIG of either GNOME or KDE.
From the announcement:
Having a shared document will also allow us to start looking at commonalities
between the documents and perhaps create common chapters or sections on basic
guidelines and lessons that are desktop and toolkit-independent (e.g.,
accessibility and internationalization tips, general usability principles).
One little nit (though I agree with you, fundamentally): when you say "accurate" printing, you're making assumptions about what that means. If someone considers accurate to mean exactly what they see on the screen, then you should send that bit-for-bit to the printer with one exception: color correction, which you have to do to account for printer vs. monitor differences in color.
However, if you start re-rendering the screen in the DPI of the printer vs the screen, you will get a DIFFERENT image. That might be ok, or even desirable, but it's not quite "accurate".
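To put rough numbers on that, here is a minimal sketch. The 96 DPI screen, 600 DPI printer and 48-pixel icon are illustrative assumptions, not figures from the discussion, and the helper names are made up for the example.

```python
def printed_inches(pixels: int, dpi: float) -> float:
    """Physical length covered by a run of pixels at a given resolution."""
    return pixels / dpi


def pixels_for_same_size(pixels: int, screen_dpi: float, printer_dpi: float) -> int:
    """Device pixels the printer needs to cover the same physical length."""
    return round(pixels * printer_dpi / screen_dpi)


# Hypothetical numbers: a 48-pixel icon, a 96 DPI screen, a 600 DPI printer.
icon_px, screen_dpi, printer_dpi = 48, 96, 600

# On screen the icon is half an inch wide.
on_screen = printed_inches(icon_px, screen_dpi)

# Sent pixel-for-pixel, the same 48 pixels cover far less paper.
bit_for_bit = printed_inches(icon_px, printer_dpi)

# To keep the physical size, the image has to be scaled or re-rendered
# to many more device pixels, and that is where it stops being "the same" image.
rerendered_px = pixels_for_same_size(icon_px, screen_dpi, printer_dpi)
```

Either choice gives up something: the pixel-for-pixel copy changes the size, and the re-rendered copy changes the pixels.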
These issues are important as we move UNIX-like OSes into the pure desktop arena. We'll have to tackle issues that are purely aesthetic and don't really have a 100% right answer.
The main reason to do this is rendering speed. Storage size is also smaller, but really you care about the rendering speed of having 5 apps open, all of which use dozens of icons (I'm running galeon, and I count 16 icons in it alone... galeon tends to be light-weight when it comes to baubles compared to, say, a spreadsheet).
People complain that GNOME and KDE are memory-hogs and slow, but realistically, most of the overhead is in things like pixmap storage (not going to go away with SVG or PNG, since both have to be rendered down to an X Pixmap). Beyond that, you have to start hacking away at every bit of performance and memory use you can find. This is one such.
I assume that KDE already has or is working on SVG too. It's a logical step. Heck, they *could* just use this lib if they don't already have one.
"Recently I've had a chance to do some web design with PHP. Previously I'd used Perl because I'd heard from many people that Perl was the end all and be all of scripting languages for the web. Imagine my suprise to discover that PHP was vastly superior!"
Using Perl to do Web development is like using a hammer to build a house. I recommend using hammers when building houses, but it's not quite the be-all, end-all tool. Neither is PHP, VB, C#, Java, etc, etc.
There are Perl-based content management systems like bricolage that do a very nice job of abstracting away the routine tasks of Web development and providing some extra tools that are very useful.
I also suggest trying PHP as a templating language, while using Perl for the direct server manipulation (e.g. what mod_perl is really useful for) and for large, stand-alone back-end chunks.
A few of your points:
"Ease of use" -- You don't discuss ease of use. You discuss time to learn to program in the language. Different. It took me about 2 years to start thinking in Perl. Four years to feel that I understood it deeply enough to call myself an expert. Another 3 to realize I was wrong. :-)
"The OO of PHP is excellent" -- Don't know anything about PHP's OO model. I do know that Perl doesn't have one. Perl 6 will introduce an OO model. Perl 5 allows you to roll your own. It does have some very nice tools for building an OO model, but that's not the same thing. I see this as a strength long-term (as it allowed Perl's OO usage to mature before being set in concrete), but it is a weakness in current usage.
"Outstanding database support" -- I don't like to say that Perl "is the best" at anything, but certainly in this department, you're way off the mark. Perl has amazingly well abstracted database support (DBI) for Oracle, Sybase, MySQL, MS SQL, PostgreSQL, DB2 and just about every other relational database known to man. It also has a very nice Web-based abstraction layer over DBI which allows you to hide some of the details in ways that Web developers tend to want.
"Data Structures" -- The mind boggles. Perl's complex data structures are sufficient, to say the very least.
The rest is mostly misunderstanding and noise.
Yes, I realize the post I'm responding to was cut-and-paste from someone else's bad post by a self-professed troll, but I really don't like the idea that someone is going to see this and think it's true....
If you're running RH8.0 and want to use a version of Gnome that's a little more current, may I suggest that you check out Garnome? It's a very nice ports-based Gnome distribution based (currently) on the latest 2.2RC1 (2.1.90).
I installed it on my laptop, which is running RH8.0, and it fixed a lot of things that were pissing me off. Upgrading galeon from their site didn't hurt either.
SA calculates scores for its other tests centrally. Word-tracking is done locally based on those other rules. It's a way of weighting the centrally-managed scores to your local mail's makeup.
Re:I'm making one when I get home (on Potato Bazookas · Score: 1)
"Very well. So how about this situation: an anti-porn group decides that the best way to get rid of porno mags is to photograph everyone going into the local 7-Eleven as well as their license plates, regardless of whether or not they come out with a copy of Hustler and buys an ad in the local paper saying "these people support evil pornography by supporting the largest porn dealer in the US," with photos of them and their cars, with names and addresses, courtesy of the DMV."
It's again a weak analogy, obviously crafted to cast DNSBLs in the light of evil privacy invaders.
Facts:
1. You don't get on a DNSBL by sharing a netblock with spammers. Your netblock gets on a DNSBL by having spammers who the owning ISP refuses to squelch. So, your example would be that your city block gets published.
2. A DNSBL is not published in a forum like a newspaper. It's a stand-alone resource that is queryable. So, your example would be having your city block published in a mail-order list of porn-supporting city-blocks.
3. No one ever got a date from being on a DNSBL ;-)
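For anyone wondering what "queryable" means in practice: the usual DNSBL convention is that a client reverses the octets of the sending IPv4 address, appends the list's zone, and does an ordinary DNS lookup on that name. A rough sketch of building the name follows. The zone `dnsbl.example.org` and the function name are made up for illustration, and no network lookup is done here.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name a client would look up to ask a DNSBL about an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# 192.0.2.1 is a documentation-only address; the zone is hypothetical.
name = dnsbl_query_name("192.0.2.1", "dnsbl.example.org")
```

If the lookup on that name returns an address record, the IP is listed; if the name does not exist, it is not. Nothing is broadcast to anyone: each mail server has to go and ask.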
"I have no problem with proper use of the blacklists. Proper use would be having something like SpamAssassin (which can do it's own queries to the dbs) use whether something is on the blacklist to give it a slightly higher score on the spam count. However, such a tool must be entirely opt-in, and users should have the ability to use their own .spamassassin files, with customized weights that would include disabling the blacklist lookups."
So services like cell-phone and pager email should never have spam-filtering, since you're not given a shell via which to modify such a file? Poo! I want my ISP to dump the spam. I'm actually fine with them even dumping the ultra-rare message that looks just like spam, but that I wanted to see. I just want the 200-300 pieces of spam that I get per day (which is what you get when your email address pre-dates the existence of spam) to GO AWAY.
I use SpamAssassin for this, but for those who can't, I think the ISPs should take on that burden and do the right thing. Blacklists, Razor and other strong indicators of spammishness should be very close in score to the threshold so that they almost push a message over the threshold by themselves.
1) Ethnicity is not a contractual or consensual matter. You cannot "opt out" of your ethnicity and go with another provider. The same cannot be said for an ISP. If you do business with an ISP that supports spamming, I think you should expect service to be degraded by that activity, and be pissed with said ISP when/if you find that they've tarnished YOUR reputation by doing business with scum.
2) The phrase at the end of your mail should be "when you buy drugs from vendors known to support terrorism, you support terrorism". There are holes in that statement, just as there are holes in the logic of blacklists, but voluntary blacklists are one of our best weapons against spam, just as not buying from disreputable vendors is our best defense, as simple consumers, against the misuse of our money. In the drug example, what's often glossed over is that legal recreational drugs would not have to be purchased from disreputable vendors....
Remember also that blacklists are reputation repositories like credit reports or campaign donation lists. An ISP is just as free to use a blacklist to determine who to allocate more bandwidth to because they want to support spam as they are to use it for blocking. It's not a unilateral decision process, even when it's unilateral, and I don't think it should be. Also, I think that if my ISP decides to block such access, they should be willing to give me access to an un-blocked server if I want it.
Exactly. As I said, MS is very good at this sort of "acquire good technology -> productize -> sell" model. It's not something that a lot of companies can do well, and if you've ever seen it done badly, you'll begin to get a sense for how hard it was for MS to do this.
It was likely not "bad admins" so much as bureaucracy. Most large companies make it very hard to make any kind of change, which leads to a situation where only the scariest, hairiest bugs get patched. This one may simply have seemed too complex for the average person to exploit until it was too late.
This problem is actually a very interesting one that I've been looking at for years. It happens in everything from 300-person companies to giant mega-corps. It's not because people are stupid, but because large systems can only avoid tripping on themselves by imposing arbitrary controls.
I think that the right solution is staged anarchy, which is sort of what many large companies (e.g. Microsoft, AT&T, IBM, etc) do with their research divisions or via acquisitions or both. The idea is that you let smart people go nuts and create the unsupportable. You then get more, but different, smart people to turn THAT into the supportable. You then get more average corporate drones to convert the supportable into the existing production framework. You then present the existing production framework to the first group of smart people and let them start over again.
You get about a 6-month cycle if you do it right, and you keep reaping the benefits of wild-eyed hacking as well as stability.
Microsoft takes a lot of flak for their technology, but they do this one thing well. You may not like such things as NT, C#, etc, but they are fairly large and complex beasts that most companies would not be capable of cranking out on their own (hence the benefits of open source development so that they don't have to). MS was able to draw on (and some would say corrupt) the smart work of their research folks and of technologies that they acquired and "MS all over it" until it fit their sales and support model, which is one of the reasons that they could do something like go from "Internet-illiterate" to winning the browser war, practically overnight.
IBM does this quite a lot as well (all of their hard drive advances come from this sort of process).
But, aren't those "false positives" (usually so-called innocent open relays and people sharing netblocks with spammers) what you want?
In the case of open relays, yes, a whole company can be hosed mail-wise when they get on a list, but if multiple BLs agree, then they've got a problem that needs to be fixed.
For the case of people who share a spammer's address range, I feel for them, but... do I really want to take the pressure off of them in favor of flooding the world with spam? I'd personally be pissed at my ISP for allowing such spammers to screw over MY reputation among the BLs. ISPs should behave accordingly, but right now why would they? They get far more money from spammers than from people who will leave because a few folks listening to the BLs won't accept mail from their customers.
Spam is an ugly thing, and combating it is hard. Casualties are going to arise. The question is: how do you minimize that list of casualties and make sure that people know the safety dance ahead of time?
I think, at the Internet level, RBLs (mirrored by you, obviously for speed's sake) and such are your best weapon. The more of the net you have by the short patch-cables, the more significant you make each RBL that you listen to.
At the personal level, each of these newly "discovered" techniques (I remember a /. article about using gzip for analysis of other document structures years ago) will make a fine addition to statistical systems like SpamAssassin, which uses them to build a very accurate model of a piece of mail's "spamishness".
A fear of being too late is well placed in this case. When it comes to issues like legislation, it's often too late. Time and again throughout history different nations have had very progressive laws, often strictly conforming to public opinion. And each time in the end they have been changed or made null and void. Or do you see the laws of the ancient Greeks still in power in Greece? Laws as liberal as the bill of rights simply cannot last, arguments like "we hope there is too much profit, and they are too late" only serve the purpose of drawing more attention to the feeding frenzy and stifling change more completely, which is good in this case.
Obviously, the above doesn't scan so well, as I was trying to keep strictly to your wording. However, I think it does do a good job of illustrating your main logical failure. "Look upon their works, ye mighty and rejoice" isn't as on the money as you might think, especially if the pattern holds and change for the better only comes through pain and suffering and ultimately leads to corruption anyway (e.g. the French Revolution, Russian Revolution, etc).
Your carefully crafted message is a great worry of the people who work on SA.
The only defenses against this are:
1. In-transit info (RBLs, Received headers, forgery detection by an MTA, etc) can hit the score big-time
2. Consensus-based checks (e.g. Razor2)
3. Body analysis (as you note) lends some score
4. If all else fails, your technique will shape later updates to the scoring, and scores will adapt to the abuse automatically.
SA is not perfect, but it makes spamming MUCH more difficult, and I think ultimately will increase the difficulty to the point where spammers are thinking in terms of reaching hundreds or thousands of boxes, not millions. That changes the economics for them, and makes it likely that it's no longer profitable.
SA does exactly what you say. That's how they develop the scores for each of the tests. The new word-analysis test will then feed off of those other scores in order to train itself, becoming even more accurate over time.
Honestly, this kind of narrow-minded reading is why there are so many strange sub-genres of realistic-fantasy and so few new and interesting SF authors today. If you get branded as an SF author, you're a hack, and worse: you can't write anything else! Strange sub-genres can be good, but they should not have to exist for an author to be taken seriously.
I remember Piers Anthony writing about this (mind you, I'm not a P.A. fan, but he had a point). He wanted to do some historical fiction, but all his publisher wanted was another crappy Xanth book. He actually would have had to break back into writing all over again just to get out of his genre.
On the other hand, authors like Gene Wolfe, Neil Gaiman, Philip K. Dick, Iain Banks and Jonathan Lethem are capable of science fiction (or magical realism with an SF flavor in the case of the latter two) that I would compare with any author in any genre.
There's also great writing in many other genres including non-fiction (e.g. The Art of Eating by M.F.K. Fisher) and even technical books (e.g. Applied Cryptography is one of the most engrossing books I've ever read, and it doesn't even have a story!) and media (check out some of the many excellent films, graphic literature (aka comics) and musical story-telling that's been around for a fairly long time).
Open minded reading is essential to getting the most out of the vast body of literature out there. If you say, "oh well, robot on the cover," you're pretty much doomed from the onset.
So, I gotta get this straight.... You read Zodiac, Snow Crash and Diamond Age and you were stunned that Cryptonomicon had a non-ending?!
You must have stopped 10 pages short of the end of Diamond Age then!
Stephenson has a wonderful ability to write about technological concepts in a way that is interesting and informative to the casual reader while (at least to me) engrossing for the long-time professional as well. I read Cryptonomicon and Applied Cryptography back-to-back and I have to say that he did a good job of capturing the really interesting parts of cryptography.
The end was standard Stephenson drop-off. He's turned on by the IDEA, not the story. To him, it seems, the idea is all that's worth writing about, and when he's done, the rest is a chore. I'm just guessing, as I don't know the man, but that's the way Diamond Age came off to me, and Cryptonomicon to a lesser extent.
I still find his ideas compelling enough to keep reading. I see him as sort of the Arthur C. Clarke of this generation. A friend pointed out that while many engineers in the 50s would have said that Clarke didn't know "enough" about their field, he knew enough about several and had the vision to put them together in a way that told the story that the engineers could not.
I don't know that any of us in the trenches are telling the story that Cryptonomicon told in a way that will ever get to as many people. It's not a hugely important story, but certainly one that I think should be told.
Incorrect. SA is using that technique (and has for a fairly long time now) centrally to generate their score lists. That's important, and it's a very strong part of SA.
However, in the next release of SA (and I'm currently running it out of CVS, so it's hardly vapor), they will *also* be using full word scoring heuristics. That scoring will result in a boolean "spamishness" which will in turn be assigned a score centrally (which users can override, of course).
By way of example, here's a recent summary of one of my pieces of spam:
Content analysis details: (12.50 points, 4 required)
NO_REAL_NAME (1.3 points) From: does not include a real name
INVALID_DATE (1.6 points) Invalid Date: header (not RFC 2822)
BAYES_90 (2.0 points) BODY: Bayesian classifier says spam probability is 90 to 99% [score: 0.9645]
RAZOR2_CF_RANGE_91_100 (0.0 points) BODY: Razor2 gives a spam confidence level between 91 and 100 [cf: 100]
RAZOR2_CHECK (3.9 points) Listed in Razor2, see http://razor.sf.net/
DATE_IN_PAST_03_06 (0.2 points) Date: is 3 to 6 hours before Received: date
MSG_ID_ADDED_BY_MTA_3 (2.0 points) 'Message-Id' was added by a relay (3)
FORGED_MUA_OUTLOOK (1.0 points) Forged mail pretending to be from MS Outlook
MISSING_MIMEOLE (0.5 points) Message has X-MSMail-Priority, but no X-MimeOLE
As I said previously, the interesting part here is not the word-analysis, but the fact that the database for that word analysis is generated dynamically by looking at your mail, and applying SA's other rules. Self-training of this sort has proven highly successful in tests, and may yield the next quantum of spam-filtering effectiveness.
Notice also that while that 2.0 points from Bayes is a big push to this spam's score, it's not enough to mark it as spam on its own. This is the power of SpamAssassin. No one test says, "this is spam", and so no one test is trusted on its own.
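The arithmetic behind that report is just a sum compared against a threshold. A small sketch of the idea, using the rule names and scores from the report above. The function names and the at-or-above comparison are my assumptions, not SA's actual code.

```python
# Rule hits copied from the report above; "4 required" is the threshold.
hits = {
    "NO_REAL_NAME": 1.3,
    "INVALID_DATE": 1.6,
    "BAYES_90": 2.0,
    "RAZOR2_CF_RANGE_91_100": 0.0,
    "RAZOR2_CHECK": 3.9,
    "DATE_IN_PAST_03_06": 0.2,
    "MSG_ID_ADDED_BY_MTA_3": 2.0,
    "FORGED_MUA_OUTLOOK": 1.0,
    "MISSING_MIMEOLE": 0.5,
}
REQUIRED = 4.0


def total_score(rule_hits: dict[str, float]) -> float:
    """Sum the scores of every rule that fired (rounded to hide float noise)."""
    return round(sum(rule_hits.values()), 2)


def is_spam(rule_hits: dict[str, float], required: float) -> bool:
    """Assumed comparison: spam when the total is at or above the threshold."""
    return total_score(rule_hits) >= required


whole_message = is_spam(hits, REQUIRED)               # every rule together
bayes_alone = is_spam({"BAYES_90": 2.0}, REQUIRED)    # one test by itself
```

With these numbers the full set of hits clears the threshold easily, while the Bayes hit by itself does not.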
Good idea on its face, but the problem is that only the sophisticated spammers are using SA to pre-filter mail. A lot of the porno boys just don't care. They'd rather carpet-bomb and let the chips fall where they may....
I've been ajs@ajs.com since 1994. The spam that I get is an ever-growing mountain, but it's very manageable thanks to SpamAssassin. You should check it out. I use evolution as my MUA, and I have a single virtual folder for "Junk" that includes spam, automatic mail from systems that won't shut up, etc. I delete everything in it from time to time during the day, and never think much about it. Sometimes I go through it a bit to see if anything has gotten stuck due to black hole services and such, but mostly I just let the system do its job.
I can't imagine changing email addresses every month. People send me mail who have not communicated with me for YEARS. How would they know what to use?
Everyone but the folks at SpamAssassin has been focusing on some single technique for identifying spam, and any one technique is doomed to diminishing returns.
Over at SpamAssassin, they've been busily creating a system that collects "good enough" tests by the dozens and uses them to collectively score a message and determine its general "spamishness". The system relies on a complex scoring system that is determined, not by the whim of human programmers, but by the results of a genetic training system that pits one set of scores against another until equilibrium is reached for a given set of example spam and non-spam.
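To give a feel for what "pitting one set of scores against another" can look like, here is a toy mutate-and-select loop. It is a much simpler stand-in for a real genetic algorithm and is not how SA's trainer is written; the corpus, the rule names, the threshold and every parameter are invented for the example.

```python
import random

# Hypothetical corpus: (set of rule names that fired, whether the mail is spam).
CORPUS = [
    ({"RBL", "FORGED"}, True),
    ({"RBL", "NO_NAME"}, True),
    ({"FORGED", "NO_NAME"}, True),
    ({"NO_NAME"}, False),
    (set(), False),
    ({"FORGED"}, False),
]
RULES = ["RBL", "FORGED", "NO_NAME"]
THRESHOLD = 4.0


def errors(scores: dict[str, float]) -> int:
    """Count messages this score set would misclassify."""
    wrong = 0
    for fired, spam in CORPUS:
        total = sum(scores[rule] for rule in fired)
        if (total >= THRESHOLD) != spam:
            wrong += 1
    return wrong


def evolve(generations: int = 200, children: int = 8, seed: int = 1) -> dict[str, float]:
    """Mutate the current best score set and keep any child that is no worse."""
    rng = random.Random(seed)
    best = {rule: 1.0 for rule in RULES}
    for _ in range(generations):
        for _ in range(children):
            child = {r: max(0.0, s + rng.gauss(0, 0.5)) for r, s in best.items()}
            if errors(child) <= errors(best):
                best = child
    return best


start = {rule: 1.0 for rule in RULES}
tuned = evolve()
```

The point is only that the scores fall out of the example mail rather than out of somebody's opinion: a candidate survives when it classifies the corpus at least as well as the one it replaces.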
See my other post here for how Bayesian filtering will be used to allow this system to feed back on itself and improve as it sees more of your spam and non-spam....
The latest development version of SpamAssassin has an interesting application of Bayesian filtering. Basically, it takes all of SA's existing heuristics, uses them to develop a sense of what is and is not spam, and then pumps the results through a Bayesian filter that learns from these messages.
As with any other SA test, no single element of the chain is trusted enough to definitively call something spam, but if a message would have squeaked through before, this new filter can put the final nail in its coffin through word analysis against previous spam.
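Roughly, the feedback loop looks like the sketch below: mail that the existing rules score confidently one way or the other is used to train a word-level classifier, and the murky middle is skipped. This is a toy naive Bayes with add-one smoothing, not SA's implementation, and the cutoffs (`spam_at`, `ham_at`), class name and sample messages are all made up for illustration.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    return text.lower().split()


class TinyBayes:
    """Toy word-level naive Bayes with add-one smoothing."""

    def __init__(self) -> None:
        self.words = {"spam": Counter(), "ham": Counter()}
        self.msgs = {"spam": 0, "ham": 0}

    def learn(self, text: str, label: str) -> None:
        self.words[label].update(tokenize(text))
        self.msgs[label] += 1

    def spam_probability(self, text: str) -> float:
        vocab = len(set(self.words["spam"]) | set(self.words["ham"]))
        total_msgs = self.msgs["spam"] + self.msgs["ham"]
        log_score = {}
        for label in ("spam", "ham"):
            n_words = sum(self.words[label].values())
            score = math.log((self.msgs[label] + 1) / (total_msgs + 2))
            for tok in tokenize(text):
                score += math.log((self.words[label][tok] + 1) / (n_words + vocab + 1))
            log_score[label] = score
        # Turn the two log scores into P(spam | text).
        return 1 / (1 + math.exp(log_score["ham"] - log_score["spam"]))


def auto_train(bayes: TinyBayes, scored_mail, spam_at: float = 8.0, ham_at: float = 1.0) -> None:
    """Train only on mail the other rules scored confidently; skip the middle."""
    for text, rule_score in scored_mail:
        if rule_score >= spam_at:
            bayes.learn(text, "spam")
        elif rule_score <= ham_at:
            bayes.learn(text, "ham")


# (message text, score from the non-Bayes rules); all invented.
scored_mail = [
    ("cheap pills buy now", 12.5),
    ("buy cheap watches now", 9.1),
    ("lunch meeting moved to noon", 0.2),
    ("patch for the build script", -1.0),
    ("meeting notes and cheap lunch", 3.0),  # ambiguous, so not used for training
]

bayes = TinyBayes()
auto_train(bayes, scored_mail)
```

Because the training set is your own mail, two installations with identical rule scores end up with different word tables.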
So, why did I use a subject about "ENDING spam"? Because one of the tools that spammers have is SA itself. They can use it to score their messages and determine how "spamish" they are. The problem now is that each SA installation will have subtly different scoring, and the message may be "ok" according to the spammer's version, but my version has a better sense of the mail that *I* get.
SpamAssassin is definitely a tool worth checking out if you have not already. Install it in daemon mode (spamd) and then use "spamc -f" in your procmailrc or the equiv for your MTA.
Ditto on that front.
Cool, you get your ISP, I'll get mine.
Nuff said.
Yes, yes, no and yes :-)
A German page with a good picture of one
"It's not a unilateral descision process, even when it's unilateral"
oops. I meant "even when it's unanimous". Duh :)
1) Ethnicity is not a contractual or consensual matter. You cannot "opt out" of your ethnicity and go with another provider. The same cannot be said for an ISP. If you do business with an ISP that supports spamming, I think you should expect service to be degraded by that activity, and be pissed with said ISP when/if you find that they've tarnished YOUR reputation by doing business with scum.
2) The phrase at the end of your mail should be "when you buy drugs from vendors known to support terrorism, you support terrorism". There are holes in that statement, just as there are holes in the logic of blacklists, but voluntary blacklists are one of our best weapons against spam just as not buying from disreputable vendors is our best defense as simple consumers against the misuse of our money. In the drug example, what's often glossed over is that legal recreational drugs would not have to be purchased from disreputable vendors....
Remember also that blacklists are reputation repositories like credit reports or campaign donation lists. An ISP is just as free to use a blacklist to determine who to allocate more bandwidth to because they want to support spam as they are to use it for blocking. It's not a unilateral decision process, even when it's unilateral, and I don't think it should be. Also, I think that if my ISP decides to block such access, they should be willing to give me access to an un-blocked server if I want it.
Exactly. As I said, MS is very good at this sort of "acquire good technology -> productize -> sell" model. It's not something that a lot of companies can do well, and if you've ever seen it done badly, you'll begin to get a sense for how hard it was for MS to do this.
It was likely not "bad admins" so much as bureaucracy. Most large companies make it very hard to make any kind of change, which leads to a situation where only the scariest, hairiest bugs get patched. This one may simply have seemed too complex for the average person to exploit until it was too late.
This problem is actually a very interesting one that I've been looking at for years. It happens in everything from 300-person companies to giant mega-corps. It's not because people are stupid, but because large systems can only avoid tripping on themselves by imposing arbitrary controls.
I think that the right solution is staged anarchy, which is sort of what many large companies (e.g. Microsoft, AT&T, IBM, etc) do with their research divisions or via acquisitions or both. The idea is that you let smart people go nuts and create the unsupportable. You then get more, but different smart people to turn THAT into the supportable. You then get more average corporate drones to convert the supportable into the existing production framework. You then present the existing production framework to the first group of smart people and let them start over again.
You get about a 6-month cycle if you do it right, and you keep reaping the benefits of wild-eyed hacking as well as stability.
Microsoft takes a lot of flak for their technology, but they do this one thing well. You may not like such things as NT, C#, etc, but they are fairly large and complex beasts that most companies would not be capable of cranking out on their own (hence the benefits of open source development so that they don't have to). MS was able to draw on (and some would say corrupt) the smart work of their research folks and of technologies that they acquired and "MS all over it" until it fit their sales and support model, which is one of the reasons that they could do something like go from "Internet-illiterate" to winning the browser war, practically overnight.
IBM does this quite a lot as well (all of their hard drive advances come from this sort of process).
Interesting stuff.
But, aren't those "false positives" (usually so-called innocent open relays and people sharing netblocks with spammers) what you want?
In the case of open relays, yes a whole company can be hosed mail-wise when they get on a list, but if multiple BLs agree, then they've got a problem that needs to be fixed.
For the case of people who share a spammer's address range, I feel for them, but... do I really want to take the pressure off of them in favor of flooding the world with spam? I'd personally be pissed at my ISP for allowing such spammers to screw over MY reputation among the BLs. ISPs should behave accordingly, but right now why would they? They get far more money from spammers than from people who will leave because a few folks listening to the BLs get mail from your customers.
Spam is an ugly thing, and combating it is hard. Casualties are going to arise. The question is: how do you minimize that list of casualties and make sure that people know the safety dance ahead of time?
I think, at the Internet level, RBLs (mirrored by you, obviously for speed's sake) and such are your best weapon. The more of the net you have by the short patch-cables, the more significant you make each RBL that you listen to.
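For anyone who hasn't seen what "listening to an RBL" looks like in practice, here's a rough sketch, assuming the usual DNS-based convention where the IPv4 octets are reversed and prepended to the list's zone. The zone name `rbl.example.org` and the helper names are made up for illustration; point it at whatever list (or local mirror) you actually use.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a given list."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """True if the name resolves, which by convention means 'listed'."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No such name: the address is not on this list.
        return False


def listing_count(ip: str, zones: list[str]) -> int:
    """How many of the lists you listen to agree about this address."""
    return sum(is_listed(ip, zone) for zone in zones)


# Hypothetical zone; nothing is actually queried on this line.
print(dnsbl_query_name("192.0.2.99", "rbl.example.org"))
```

The count from `listing_count` is the "multiple BLs agree" signal: one listing is a hint, several are a problem somebody needs to fix.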
At the personal level, each of these newly "discovered" techniques (I remember a /. article about using gzip for analysis of other document structures years ago) will make a fine addition to statistical systems like SpamAssassin, which uses them to build a very accurate model of a piece of mail's "spamishness".
A fear of being too late is well placed in this case. When it comes to issues like legislation, it's often too late. Time and again throughout history different nations have had very progressive laws, often strictly conforming to public opinion. And each time in the end they have been changed or made null and void. Or do you see the laws of the ancient Greeks still in power in Greece? Laws as liberal as the bill of rights simply cannot last, arguments like "we hope there is too much profit, and they are too late" only serve the purpose of drawing more attention to the feeding frenzy and stifling change more completely, which is good in this case.
Obviously, the above doesn't scan so well, as I was trying to keep strictly to your wording. However, I think it does do a good job of illustrating your main logical failure. "Look upon their works, ye mighty and rejoice" isn't as on the money as you might think, especially if the pattern holds and change for the better only comes through pain and suffering and ultimately leads to corruption anyway (e.g. the French Revolution, Russian Revolution, etc).
Your carefully crafted message is a great worry of the people who work on SA.
The only defenses against this are:
1. In-transit info (RBLs, Received headers, forgery detection by an MTA, etc) can hit the score big-time
2. Consensus-based checks (e.g. Razor2)
3. Body analysis (as you note) lends some score
4. If all else fails, your technique will shape later updates to the scoring, and scores will adapt to the abuse automatically.
SA is not perfect, but it makes spamming MUCH more difficult, and I think ultimately will increase the difficulty to the point where spammers are thinking in terms of reaching hundreds or thousands of boxes, not millions. That changes the economics for them, and makes it likely that it's no longer profitable.
SA does exactly what you say. That's how they develop the scores for each of the tests. The new word-analysis test will then feed off of those other scores in order to train itself, becoming even more accurate over time.
Honestly, this kind of narrow-minded reading is why there are so many strange sub-genres of realistic-fantasy and so few new and interesting SF authors today. If you get branded as an SF author, you're a hack, and worse: you can't write anything else! Strange sub-genres can be good, but they should not have to exist for an author to be taken seriously.
I remember Piers Anthony writing about this (mind you, I'm not a P.A. fan, but he had a point). He wanted to do some historical fiction, but all his publisher wanted was another crappy Xanth book. He actually would have had to break back into writing all over again just to get out of his genre.
On the other hand, authors like Gene Wolfe, Neil Gaiman, Philip K. Dick, Iain Banks and Jonathan Lethem are capable of science fiction (or magical realism with an SF flavor in the case of the latter two) that I would compare with any author in any genre.
There's also great writing in many other genres including non-fiction (e.g. The Art of Eating by M.F.K. Fisher) and even technical books (e.g. Applied Cryptography is one of the most engrossing books I've ever read, and it doesn't even have a story!) and media (check out some of the many excellent films, graphic literature (aka comics) and musical story-telling that's been around for a fairly long time).
Open minded reading is essential to getting the most out of the vast body of literature out there. If you say, "oh well, robot on the cover," you're pretty much doomed from the onset.
Good luck!
So, I gotta get this straight.... You read Zodiac, Snow Crash and Diamond Age and you were stunned that Cryptonomicon had a non-ending?!
You must have stopped 10 pages short of the end of Diamond Age then!
Stephenson has a wonderful ability to write about technological concepts in a way that is interesting and informative to the casual reader while (at least to me) engrossing for the long-time professional as well. I read Cryptonomicon and Applied Cryptography back-to-back and I have to say that he did a good job of capturing the really interesting parts of cryptography.
The end was standard Stephenson drop-off. He's turned on by the IDEA, not the story. To him, it seems, the idea is all that's worth writing about, and when he's done, the rest is a chore. I'm just guessing, as I don't know the man, but that's the way Diamond Age came off to me, and Cryptonomicon to a lesser extent.
I still find his ideas compelling enough to keep reading. I see him as sort of the Arthur C. Clarke of this generation. A friend pointed out that while many engineers in the 50s would have said that Clarke didn't know "enough" about their field, he knew enough about several and had the vision to put them together in a way that told the story that the engineers could not.
I don't know that any of us in the trenches are telling the story that Cryptonomicon told in a way that will ever get to as many people. It's not a hugely important story, but certainly one that I think should be told.
Incorrect. SA is using that technique (and has for a fairly long time now) centrally to generate their score lists. That's important, and it's a very strong part of SA.
However, in the next release of SA (and I'm currently running it out of CVS, so it's hardly vapor), they will *also* be using full word scoring heuristics. That scoring will result in a boolean "spamishness" which will in turn be assigned a score centrally (which users can override, of course).
By way of example, here's a recent summary of one of my pieces of spam:
Content analysis details: (12.50 points, 4 required)
NO_REAL_NAME (1.3 points) From: does not include a real name
INVALID_DATE (1.6 points) Invalid Date: header (not RFC 2822)
BAYES_90 (2.0 points) BODY: Bayesian classifier says spam probability is 90 to 99%
[score: 0.9645]
RAZOR2_CF_RANGE_91_100 (0.0 points) BODY: Razor2 gives a spam confidence level between 91 and 100
[cf: 100]
RAZOR2_CHECK (3.9 points) Listed in Razor2, see http://razor.sf.net/
DATE_IN_PAST_03_06 (0.2 points) Date: is 3 to 6 hours before Received: date
MSG_ID_ADDED_BY_MTA_3 (2.0 points) 'Message-Id' was added by a relay (3)
FORGED_MUA_OUTLOOK (1.0 points) Forged mail pretending to be from MS Outlook
MISSING_MIMEOLE (0.5 points) Message has X-MSMail-Priority, but no X-MimeOLE
As I said previously, the interesting part here is not the word-analysis, but the fact that the database for that word analysis is generated dynamically by looking at your mail, and applying SA's other rules. Self-training of this sort has proven highly successful in tests, and may yield the next quantum of spam-filtering effectiveness.
Notice also that while that 2.0 points from Bayes is a big push to this spam's score, it's not enough to mark it as spam on its own. This is the power of SpamAssassin. No one test says, "this is spam", and so no one test is trusted on its own.
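If you want to check the arithmetic, here's a quick sketch that adds up the hits from the report above and confirms that no single test clears the bar by itself. The test names and points are copied from that report; the function names are just mine, and I'm assuming a message counts as spam once its total reaches the required score.

```python
THRESHOLD = 4.0  # "4 required" in the report above

# (test name, points) pairs copied from the report above
HITS = [
    ("NO_REAL_NAME", 1.3),
    ("INVALID_DATE", 1.6),
    ("BAYES_90", 2.0),
    ("RAZOR2_CF_RANGE_91_100", 0.0),
    ("RAZOR2_CHECK", 3.9),
    ("DATE_IN_PAST_03_06", 0.2),
    ("MSG_ID_ADDED_BY_MTA_3", 2.0),
    ("FORGED_MUA_OUTLOOK", 1.0),
    ("MISSING_MIMEOLE", 0.5),
]


def total_score(hits):
    """Sum the points, rounded to hide floating-point noise."""
    return round(sum(points for _, points in hits), 2)


def is_spam(hits, threshold=THRESHOLD):
    """Assumption: spam means the total reaches the required score."""
    return total_score(hits) >= threshold


def decisive_alone(hits, threshold=THRESHOLD):
    """Tests that would cross the threshold with no help from the others."""
    return [name for name, points in hits if points >= threshold]


print(total_score(HITS), is_spam(HITS), decisive_alone(HITS))
```

Even Razor2 at 3.9 stops just short of 4, which is the "strong indicators sit close to the threshold but never over it" idea in action.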
Good idea on its face, but the problem is that only the sophisticated spammers are using SA to pre-filter mail. A lot of the porno boys just don't care. They'd rather carpet-bomb and let the chips fall where they may....
I've been ajs@ajs.com since 1994. The spam that I get is an ever-growing mountain, but it's very manageable thanks to SpamAssassin. You should check it out. I use Evolution as my MUA, and I have a single virtual folder for "Junk" that includes spam, automatic mail from systems that won't shut up, etc. I delete everything in it from time to time during the day, and never think much about it. Sometimes I go through it a bit to see if anything has gotten stuck due to black hole services and such, but mostly I just let the system do its job.
I can't imagine changing email addresses every month. People send me mail who have not communicated with me for YEARS. How would they know what to use?
Everyone but the folks at SpamAssassin has been focusing on a single technique for identifying spam, and any one technique is doomed to diminishing returns.
Over at SpamAssassin, they've been busily creating a system that collects "good enough" tests by the dozens and uses them to collectively score a message and determine its general "spamishness". The system relies on a complex scoring system that is determined, not by the whim of human programmers, but by the results of a genetic training system that pits one set of scores against another until equilibrium is reached for a given set of example spam and non-spam.
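To give a feel for "pitting one set of scores against another", here's a toy sketch. It is emphatically not SA's actual trainer: it's a stripped-down evolutionary search (mutation and selection only, no crossover), and the rule names, the six-message corpus and every parameter are invented for the example.

```python
import random

THRESHOLD = 4.0

# Hypothetical labelled corpus: (rules that fired, is it really spam?)
EXAMPLES = [
    ({"FORGED_MUA", "IN_RBL"}, True),
    ({"IN_RBL", "ALL_CAPS"}, True),
    ({"FORGED_MUA", "ALL_CAPS"}, True),
    ({"ALL_CAPS"}, False),
    ({"NO_REAL_NAME"}, False),
    (set(), False),
]
RULES = sorted({rule for fired, _ in EXAMPLES for rule in fired})


def errors(scores, examples=EXAMPLES):
    """Count the messages this score set gets wrong."""
    wrong = 0
    for fired, is_spam in examples:
        total = sum(scores[rule] for rule in fired)
        if (total >= THRESHOLD) != is_spam:
            wrong += 1
    return wrong


def mutate(scores, rng):
    """Copy a score set and nudge one rule's score, never below zero."""
    child = dict(scores)
    rule = rng.choice(RULES)
    child[rule] = max(0.0, child[rule] + rng.uniform(-1.0, 1.0))
    return child


def evolve(generations=200, population=20, seed=0):
    """Return (best score set, best error count seen at each generation)."""
    rng = random.Random(seed)
    pool = [{rule: rng.uniform(0.0, 5.0) for rule in RULES}
            for _ in range(population)]
    history = []
    for _ in range(generations):
        pool.sort(key=errors)
        history.append(errors(pool[0]))
        survivors = pool[: population // 2]  # the fitter half lives on
        pool = survivors + [mutate(rng.choice(survivors), rng)
                            for _ in survivors]
    return min(pool, key=errors), history


best, history = evolve()
print(errors(best), best)
```

Because the fitter half always survives unchanged, the best error count can only stay level or drop from one generation to the next, which is the "until equilibrium is reached" part in miniature.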
See my other post here for how Bayesian filtering will be used to allow this system to feed back on itself and improve as it sees more of your spam and non-spam....
The latest development version of SpamAssassin has an interesting application of Bayesian filtering. Basically, it takes all of SA's existing heuristics, uses that to develop a sense of what is and is not spam, and then pumps the results through a Bayesian filter that learns from these messages.
As with any other SA test, no single element of the chain is trusted enough to definitively call something spam, but if a message would have squeaked through before, this new filter can put the final nail in its coffin through word analysis against previous spam.
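Here's roughly what that feedback loop looks like, as a minimal sketch rather than SA's real code. I'm assuming a plain naive Bayes word model with add-one smoothing, and I'm assuming it only learns from mail the other rules were already confident about (the `margin` parameter is my invention, as are the class, the function names and the sample messages).

```python
import math
import re
from collections import Counter

THRESHOLD = 4.0


def tokens(text):
    """Distinct lowercase words in a message body."""
    return set(re.findall(r"[a-z']+", text.lower()))


class WordModel:
    """Tiny naive Bayes classifier over word presence."""

    def __init__(self):
        self.counts = {True: Counter(), False: Counter()}
        self.messages = {True: 0, False: 0}

    def learn(self, text, is_spam):
        self.messages[is_spam] += 1
        self.counts[is_spam].update(tokens(text))

    def spam_probability(self, text):
        # Work in log-odds; add-one smoothing keeps unseen words finite.
        log_odds = math.log((self.messages[True] + 1)
                            / (self.messages[False] + 1))
        for word in tokens(text):
            p_spam = (self.counts[True][word] + 1) / (self.messages[True] + 2)
            p_ham = (self.counts[False][word] + 1) / (self.messages[False] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


def self_train(model, scored_mail, threshold=THRESHOLD, margin=2.0):
    """Label mail using the other rules' score; skip the uncertain middle."""
    for text, rule_score in scored_mail:
        if rule_score >= threshold + margin:
            model.learn(text, True)
        elif rule_score <= threshold - margin:
            model.learn(text, False)


# Made-up mail with the score the non-Bayes rules gave each message.
MAIL = [
    ("cheap pills buy now", 9.5),
    ("buy cheap watches now", 11.0),
    ("meeting notes for tuesday", 0.3),
    ("lunch on tuesday?", 0.0),
    ("buy tickets for the meeting", 4.5),  # too close to call, not learned
]

model = WordModel()
self_train(model, MAIL)
print(model.spam_probability("buy cheap pills"))
print(model.spam_probability("notes for tuesday"))
```

The point of the design is that nobody hand-labels anything: the word database comes from your own mail as judged by the existing tests, so each installation ends up with its own idea of what your spam looks like.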
So, why did I use a subject about "ENDING spam"? Because one of the tools that spammers have is SA itself. They can use it to score their messages and determine how "spamish" it is. The problem now is that each SA installation will have subtly different scoring, and the message may be "ok" according to the spammer's version, but my version has a better sense of the mail that *I* get.
SpamAssassin is definitely a tool worth checking out if you have not already. Install it in daemon mode (spamd) and then use "spamc -f" in your procmailrc or the equiv for your MTA.
Very nice tool, and a real time-saver for me.