I did not say the bugs would become resistant to antibiotics. Some bugs will survive better than others and those are the ones that will reproduce. Those aren't necessarily the ones you want crawling about. And the bugs that are merely wounded will come back bigger, better, stronger.
The rampant use of hexachlorophene in soap ("Dial" consumer bar-soap used to have it!) was associated with a number of nasty situations, including bacterial outbreaks in hospitals. For this reason, its use is now severely restricted.
How can it be measured? There are lots of people who have creative ideas that I haven't thought of, but here's an intervention study that one might do:
Select people working in a lab and assign them to workstations. Split the population into "treatment" and "control" groups. Irradiate, UV-treat, replace, or otherwise sanitize the treatment group's keyboards from time to time. Measure the proportion of each that get sick over a predetermined time interval.
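To make the measurement concrete, here is a rough sketch of how the outcome of such a study might be summarized. The function name and the counts are made up for illustration, and a pooled two-proportion z statistic is just one reasonable choice of analysis, not the only one.

```python
from math import sqrt

def compare_groups(sick_treat, n_treat, sick_ctrl, n_ctrl):
    """Return (risk difference, risk ratio, z statistic) for two groups.

    The risk difference is control minus treatment, so a positive value
    means fewer people got sick in the sanitized-keyboard group.
    """
    p_t = sick_treat / n_treat          # proportion sick, treatment group
    p_c = sick_ctrl / n_ctrl            # proportion sick, control group
    pooled = (sick_treat + sick_ctrl) / (n_treat + n_ctrl)
    se = sqrt(pooled * (1 - pooled) * (1 / n_treat + 1 / n_ctrl))
    z = (p_c - p_t) / se if se > 0 else 0.0
    ratio = p_t / p_c if p_c > 0 else float("inf")
    return p_c - p_t, ratio, z

# Hypothetical counts, purely for illustration.
diff, ratio, z = compare_groups(sick_treat=12, n_treat=100,
                                sick_ctrl=15, n_ctrl=100)
print(f"risk difference={diff:.3f} risk ratio={ratio:.2f} z={z:.2f}")
```

Note that nothing here counts germs or names a species; the only inputs are who got sick and which group they were in.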
There are non-intervention designs as well, but I won't elaborate. My point here is not to list all possible study designs, but to illustrate that "species of bacteria" is not a necessary component.
There should be enough epidemiologic data that we don't have to rely on bogus measures like "number of germs" to try to estimate the risk of catching something from a keyboard. I suspect it is minimal.
I have a bottle of cleaning fluid that purports to kill 99.something% of bacteria. Does that make me safer? Probably not; instead I'm helping the natural selection process to breed super-bugs that are resistant to antiseptic.
The specious "germ" argument is exactly the same as the one used to compute risk of intrusion by the number of reported exposures in a software system. What matters is infection/intrusion, not exposure. And it *can* be measured, so why bother to measure the bogus quantities?
I've been unable to find a lot of evidence of Condi's academic career. Her PhD in Sovietology has been published as a book. One that has received some less-than-complimentary reviews. She made a meteoric rise through the ranks to become Provost at Stanford. She won a teaching award at Stanford. She has also been a US administration hack since before she got her PhD.
If somebody could point me to her academic CV - not the fluff the White House publishes - showing her journal publications and other scholarly contributions, I'd be much obliged.
There are several reasons why they might hate fluorescent: flicker, noise, poor color spectrum, prejudice. Not necessarily color temperature. The first three are overcome by electronic ballast, high accuracy fluorescence, and the choice of whatever temperature you like.
It is the incandescent colour that is the wrong temperature, not the LEDs. Mid-day sun is nominally 5600K, and morning/evening higher. So why do you want to emulate candle-light?
Completeness of spectrum is another issue. Cheap fluorescent tubes have huge mercury spikes and little red - maybe 55% on the accuracy scale. Good tubes achieve 95% - a marked difference. This is independent of the colour temperature.
White LEDs (at least the ones you commonly buy today) are also fluorescent, but with pretty decent spectral accuracy. It would at least in theory be possible to build an RGB array of monochrome LEDs that would produce apparent white light.
White LEDs are already 3 times as efficient as mercury fluorescent, and fluorescent tubes are 3 times as efficient as incandescent. They (fluorescent and LEDs) can get pretty good colour accuracy, too, if they want to. The only thing holding them back is price. I'm not sure what this new invention might bring to the table in that regard.
Maybe. At this time there's still a factor of 100 or so difference in price/byte and a big performance difference, too. Flash is great for portability but it has a long way to go before being the method of choice for archival storage of videos. Hard drive is already there.
All that said, you seem uncomfortable with static rules of any kind, so if you don't buy into what I've said above, then I suggest that you stop using SA. Static rules are a giant advantage, but if you are going to defeat most of their value, then you might as well not suffer their overhead.
I wrote that paper, and the configuration I posted here is what was used in the best-scoring run.
For your convenience, here's a link to the Spamassassin code that makes the auto-learn decision. Note that sub learn calls _get_autolearn_points() which uses "score set 2" which does not include the Bayes result. Also notice the string of ad hoc tests (which cannot be disabled) based on head_only_points and body_only_points. The main negative effects are: (1) that Spamassassin fails to train on your ordinary good mail, resulting in more false positives; (2) although the Bayes filter flags a large number of spams that the ruleset would not otherwise catch, it is not reinforced on these (worse, if the ruleset says these were ham, the Bayes filter is incorrectly trained to believe this is good mail).
In summary, auto-learn re-evaluates the message using only the static rules - not the bayes rules. Then, if the static rules give an extreme score that differs from the bayes score, and a couple of extra ad hoc conditions hold (number of "hits" exceeds some threshold) the bayes filter is trained.
You can adjust the "extremeness" of the score under which Bayes is trained but training will not be on what Spamassassin reports; only on what the static rules report. It is perfectly possible for Spamassassin to report "Spam" yet train as "Ham" or vice versa. This behaviour is unacceptable in a supervised training setup. I've had it correctly classify a message, only to misclassify the next instances of nearly the same message, because of this behaviour. Auto-whitelists have a similar problem.
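The mismatch is easier to see in a toy model. What follows is a simplified sketch of the behaviour described above, not Spamassassin's actual code: the function name and the three thresholds are hypothetical, and the extra ad hoc header/body tests are left out.

```python
def autolearn_decision(reported_score, static_score,
                       report_threshold=5.0,
                       learn_ham_below=0.1,
                       learn_spam_above=12.0):
    """Toy model: return (reported_as, trained_as) for one message.

    reported_score -- the full score, Bayes included (what the user sees)
    static_score   -- the score re-evaluated with static rules only
    All three thresholds are illustrative values, not real settings.
    """
    reported_as = "spam" if reported_score >= report_threshold else "ham"

    # Training looks only at the static-rule score, never at the report.
    if static_score >= learn_spam_above:
        trained_as = "spam"
    elif static_score <= learn_ham_below:
        trained_as = "ham"
    else:
        trained_as = None  # score not extreme enough; no training

    return reported_as, trained_as

# Bayes pushes the full score over the reporting threshold, but the
# static rules alone see nothing wrong with the message.
print(autolearn_decision(reported_score=6.2, static_score=0.0))
```

In this made-up case the message is reported as spam and learned as ham, which is exactly the contradiction that hurts in a supervised setup.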
There is no Spamassassin user parameter to alter this behaviour. I have hacked Spamassassin but it is obviously not reasonable to post a solution that requires a source change. The only way I know to make Spamassassin train properly - on its own judgments - is to force feed it externally.
The reason that Spamassassin's auto-learn is set up this way is to support unsupervised learning - in a server where it is seldom, if ever, corrected. In this setup, the built-in rules work marginally better than simple self-training. But in the supervised setup they are a disaster.
Auto-learn in spamassassin is broken. In fact my mail script automatically calls sa-learn for every message, with ham or spam depending on what Spamassassin claims. Then if I want to correct it I call sa-learn over again with the correct classification. That's why the user-prefs file has it turned off.
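For anybody who wants to try the same thing, here is a minimal sketch of the force-feeding step. It is not my actual mail script. It assumes the message has already been through Spamassassin and carries an X-Spam-Status header beginning with "Yes" or "No", and that sa-learn is on the path; the function names are made up for the example.

```python
import subprocess
from email import message_from_string

def verdict_from_message(raw_message):
    """Read Spamassassin's own verdict from the X-Spam-Status header.

    Assumption: a missing header is treated as ham.
    """
    status = message_from_string(raw_message).get("X-Spam-Status", "")
    return "spam" if status.strip().lower().startswith("yes") else "ham"

def sa_learn_command(verdict, message_path):
    """Build the sa-learn command line for a given verdict."""
    if verdict not in ("spam", "ham"):
        raise ValueError(f"unknown verdict: {verdict!r}")
    return ["sa-learn", f"--{verdict}", message_path]

def train(verdict, message_path, run=subprocess.run):
    """Train on the given verdict; 'run' is injectable for dry runs."""
    return run(sa_learn_command(verdict, message_path), check=True)

# Dry run: print the command instead of executing it.
train("spam", "/tmp/example-message",
      run=lambda cmd, check: print(" ".join(cmd)))
```

To correct a mistake, call train() again on the same file with the other verdict.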
I should make this more clear in my notes. Thanks for pointing it out.
Absolutely. It is cathartic to punish spam by reporting it to your spam filter. And, of course, fully automatic systems aren't nearly as good as claimed. (Neither are learning filters - 99.9...% accuracy? pshaw! - but they're better than non-learning ones.)
If a user has to go into the SPAM box and double check that no mistakes have been made then the system is worse than not having any SPAM checking at all.
Not true. First, if the user's mailbox is cluttered with spam, the user is more likely to overlook good mail. More likely than a good spam filter. Second, it is way easier to scan a list of predominantly spam for occasional good mails (and vice versa) than to have everything jumbled together. Third, spam filters are good enough that one does not normally need to look through the quarantine list. Instead it can be searched if and when email goes missing. Almost all good mail that is misclassified by a filter is weird in some way - cold call, internet transaction, advertising. Generally one of two mitigating circumstances holds: (1) there is a secondary social mechanism whereby the missing mail will be noticed and retrieved [e.g. nobody assumes that a cold call is delivered, and a reply to an internet transaction would be expected]; (2) the user doesn't really care about the email [e.g. advertising from their frequent flyer plan].
I've found greylisting to be the best solution so far
Greylisting "works" only because spammers aren't on to it yet. And it is intrusive - adding delay and risk of non-delivery. Greater risk, I posit, than the risk of using a spam filter.
I use Spamassassin with a special user configuration file and I train it systematically. In this configuration it works pretty well (much, much, better than out-of-the box). But Bogofilter and Popfile work about as well. As does just the Bayesian component of Spamassassin, ignoring all the other cruft. DSPAM, on the other hand, doesn't work at all well for me.
There are lots of alternatives. Bogofilter, spamprobe, spambayes, popfile, dbacl, are all quite effective.
Re: Comparison to other tools (on "DSPAM v3.6 Released")
TREC's Spam Track will evaluate several spam filters. There's also a toolkit for do-it-yourself comparison.
Although DSPAM is not an official participant at TREC, three configurations will be evaluated for comparison - with tum, toe, and teft training modes. Zdziarski reported some of the preliminary results in his interview, but complete and comparative results won't be available until TREC in November.
I'll try one more time. There is no rule that says the compiler has to translate function calls and return into assembly-language (stack-based) calls and returns. There is a technique, known as continuation passing, in which returns are never used. You may educate yourself by acquiring Appel's book "Compiling with continuations" or by reading Guy Steele's master's thesis or, indeed, by reading the references already given.
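For readers who haven't met the idea, here is a small illustration of continuation-passing style. It only shows the shape of the program that such a compiler works with: every procedure takes an extra argument k (the continuation) and hands its result to k instead of returning it. The function names are made up, and CPython itself still uses a stack underneath, so this says nothing about performance.

```python
def add_cps(x, y, k):
    k(x + y)            # pass the result onward; nothing is returned

def square_cps(x, k):
    k(x * x)

def sum_of_squares_cps(a, b, k):
    # "Call square, then square, then add" becomes a chain of
    # continuations; no caller ever waits for a return value.
    square_cps(a, lambda a2:
        square_cps(b, lambda b2:
            add_cps(a2, b2, k)))

results = []
sum_of_squares_cps(3, 4, results.append)
print(results)
```

Since every call is the last thing its caller does, a compiler is free to turn each one into a jump and keep the continuations wherever it likes, including the heap.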
If you implement procedures with a stack, the incremental cost of putting additional variables on the stack is small. But that stack allocation is not "for free" and we are talking about abandoning stack allocation in favour of heap allocation.
You have replaced your magical mystical compiler with magical mystical hardware that gives you something for nothing. Not so. A hardware stack simply does some computations under the covers. You can use those same computations to allocate heap storage. Whereas with stack allocation you need a call and a ret instruction for every call, you only need the call instruction with heap allocation. You bank all the savings from not doing all those rets and the savings pay for the collection.
I'm not going to bother to respond further. Believe what you want to believe.
OK, I'll enlighten you. Well, not you, who stubbornly refuse to be enlightened, but somebody else who may be reading.
With a stack or with copying GC, you do allocation by doing a simple add (or subtract) and a check for overflow. On many architectures the check can be implemented using dynamic address translation (virtual memory) capability of the hardware.
With a stack you must pop the stack when you're done. Another simple arithmetic operation. With garbage collection, you do nothing at this time. Running score: stack 2 operations (push,pop); GC 1 operation (allocate).
But with GC you eventually run out of space so you copy all the in-use storage. Here's a formula:
in-use == allocated - freed
That is, the total number of pieces of in-use storage is strictly less (and in practice substantially less) than the number of allocations. Running score: stack 2 operations * #allocations; GC 1 operation * #allocations + 1 operation * (substantially less than #allocations).
That is, for each allocation a stack regimen does a push and a pop, and GC does an allocation and some fraction (substantially less than 1) of a copy.
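The running score can be checked with a toy simulation. This is a sketch under simplifying assumptions of my choosing: every allocation is one unit, an allocate, a pop, and a copy each count as one operation, and at any moment only the most recent "lifetime" objects are still in use. The function name and the numbers are illustrative.

```python
def count_operations(n_allocations, capacity, lifetime):
    """Return (stack_ops, gc_ops) for the same sequence of allocations.

    capacity -- units that fit in the heap before a collection is needed
    lifetime -- number of most-recent objects still in use at any time
    """
    if not 0 <= lifetime < capacity:
        raise ValueError("lifetime must be smaller than capacity")

    stack_ops = 2 * n_allocations      # one push and one pop per allocation

    gc_ops = 0
    used = 0
    for _ in range(n_allocations):
        if used == capacity:           # out of space: copy the survivors
            survivors = min(lifetime, used)
            gc_ops += survivors        # one copy per in-use object
            used = survivors
        used += 1
        gc_ops += 1                    # the allocation itself (a bump)
    return stack_ops, gc_ops

print(count_operations(n_allocations=10_000, capacity=1_000, lifetime=50))
# prints (20000, 10500)
```

With these made-up numbers the collector copies 50 objects at each of 10 collections, so the copying adds 500 operations on top of the 10,000 allocations, against 20,000 for push-and-pop.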
While I'm at it I'll point out that there are many cases in which procedures may be implemented without using dynamic allocation - stack or GC-ed - at all. Your allusion to that mystical compiler that works some sort of stack magic "for free" is simply wrong.
Well, sort of. Recall that Time has named Hitler, Khomeini, Stalin, bin Laden and Nixon as Person of the Year.
x.org has four letters and five characters.
Vista? Wine has yet to make it to the NT era. All its dll-s are win98 compatible, not winXP.
Some explanation appears here.
Mr. Coward, you are an idiot.
Uh huh. And check for stack overflow? Or do you work in Redmond?
And then are you planning to free the storage?
How many operations do you think GC takes? Did you read the reference? Of course not.
As far as I can see
Your vision is impaired.
Where in "int foo(int x, int y){return x+y;}" does it say "stack frame?"