That's just the latest name for "global warming." They realized that they had to hedge their bets because as the evidence for global warming dries up, they'd be screwed. By calling it "climate change" they can continue to press for radical liberal wealth-redistribution schemes as long as the earth's climate doesn't remain static... which, research says, is a pretty safe bet.
I wonder how brilliant the clouds really were. The result of a picture can be extremely different just by adjusting the aperture and exposure time. I do it all the time to get cool pictures of sunsets--far more cool than what I was actually looking at when I took the picture.
Furthermore, even if it were a 100% natural cycle...
If it were a 100% natural cycle, it's probable that only arrogance would lead us to believe that we could change it. In reality, it's probably more like a 90% natural cycle--in which case it's still arrogant to believe that we can substantially stop nature from doing what it wants to do.
... the direction we're headed carries disastrous risks for humanity: simultaneous worldwide coastal flooding would cost millions of lives.
You mean like in "The Day After Tomorrow?" Enough Hollywood. Any theoretical coastal flooding caused by theoretical man-made global warming is not going to kill millions of people unless they stand there like idiots as the waters rise over the course of a few decades.
Enough of the alarmism, already. If there's a case to make, make the case. Speaking idiocies like "millions of people will die from simultaneous coastal flooding" just makes people tune you out as a scaremonger. Which most of the global warming crowd is.
No-one claims to understand how the climate works completely. It's just a matter of how honest they are about how much we don't understand. Normally, one finds that the more hell-bent "scientists" are on instituting political policies aimed at their pet theory, the less honest they are about how much we really don't know.
They don't know a Goddamn thing about these clouds and already they are making potential applications connections?
Mod parent up. He's right on. They don't know anything about these clouds, but since they're apparently changing it must be global warming's fault. Priceless.
'These observations suggest a connection with global change in the lower atmosphere and could represent an early warning that our Earth environment is being changed.'"
No, that's Al Gore's job. Now you're going to have to take away his Nobel prize and give it to the clouds. :)
... the movie is about rampant commercialism destroying society
I saw the movie by accident one afternoon when I happened to be channel-surfing (which doesn't happen often, really). It was stupid the whole way through, yet I didn't change the channel.
But what exactly is the problem with commercialism? How does it really destroy society? Government-mandated use of specific products, sure. But that's not commercialism, that's government-mandated purchasing. What is wrong with commercialism? It's what drives the economy which allows us to have a society and a culture. Those things don't happen by themselves and if no-one is buying anything--including things they don't strictly need--things would be a lot darker than the future portrayed by the movie.
I do the same thing but always have found the prices at the store (Best Buy and Circuit City) DO match their website. I use their websites both to check prices and to check inventory. I then walk in, pick it up, and walk out in less than 5 minutes. That way I don't have to drive to each store and then, potentially, back to the first store that had the better price. Usually the price differences aren't that huge anyway so the only way to make it worth your time is to do your comparison shopping online.
A lot of them also got lied to. Their dealers misled them on what they could hope to afford or what the resale value of their houses might be, or what the costs of maintaining a house are.
I really have a hard time blaming anyone but the borrower. Dealers/lenders are there to give you the credit that you are asking for. They aren't financial counselors. They aren't your parents, who can or should tell you all the "hidden costs" of owning a home. They are there to evaluate your application and, if you meet the requirements, lend you money. Period.
Consumers need to be treated like grown adults, not children who need hand-holding. "I didn't think I could afford the house, but the lender guy told me I could" is something I'd hesitantly accept from a 15-year-old, but not from an adult. "I thought my 2.6% adjustable rate mortgage would never change" is also something I might believe from a 12-year-old, but not a 15-year-old and much less an adult.
People that are in trouble now got into trouble all by themselves. Blaming the person or company that loaned them the money is just passing the buck and is just another example of the trend in our country of always blaming someone else. Someone killed 5 people? Blame his parents. Someone was driving too fast in snowy weather and crashed into someone else? Blame the city for inadequate snow removal. Someone falls off the top step of a ladder? Blame the ladder company for not putting a warning sticker on the ladder. Someone can't pay the mortgage? Blame the company that loaned him the money.
We really need to get back to personal responsibility in this country. We'd be a lot more productive, have more intelligent citizens, and a lot fewer lawsuits.
Best Buy and Circuit City do fine in my area. They always have what I need and when I've had to return something, they've never had a problem.
Heck, at my wife's request when we happened to be in the area of a Best Buy, I tried to return an iPod radio transmitter thingy that we had bought about a month before but that had been flaking out ever since. I didn't have the receipt with me and I didn't have the box. The guy looked up my receipt based on my credit card number and found that I had bought it 35 days ago (instead of the 30 or 31 that they technically accept returns). The guy told me to get another one off the shelf and bring it up. He just scanned it, took back our broken unit, and told me to have a nice day. We got a brand-new one, boxed, and out of the 30-day in-store exchange period.
I don't know why, but I stopped shopping at CompUSA probably a decade ago. I can't even remember why. Maybe it was just too far away. But for the last few years, there's been a CompUSA across the street from Best Buy, which is a few blocks away from Circuit City. When I need something, I'll check Circuit City and Best Buy but virtually never stop by CompUSA. It's not an intentional decision or bad blood; somewhere along the line I just stopped thinking they'd have what I need or that their price would be any better.
What performance guarantee do or did you give your clients? That's what the 99% is. A performance guarantee. For all comers. No training required. Would that be a yawn for you?
We claim 99% as well just to be safe. In reality, I've given you the rates I experience in real life. Would their system be a yawn for me? Depends on how much further north of 99%. Personally, where I am now, I wouldn't even switch for a 99.95% performance guarantee. The difference between what I get now and that wouldn't be enough to lead me to switch to something new.
You require some training, your results are for one user only -
I prefer my spam filtering to be based on what I consider spam, not on what anyone else considers spam. Yes, there's some initial training involved. If you don't like that, don't use Bayesian... obviously.
If you want to correspond more, email me. I'm sure /. readers are bored to tears.
I am interested in *accurate* statistics. You are the one who used 4 significant figures to quote an accuracy figure 99.77% in support of your argument that your Bayesian filter was superior to a method whose accuracy was reported with much less precision: "at least 99%." If you'd said "about 99 1/2 percent" I probably wouldn't have jumped on you, but your statement wouldn't have had as much impact, would it?
TFA boasted "better than 99%," which is no big deal; my Bayesian filter does better than that, no question. While the difference between 99.8% and 99.7% may be of critical interest to you, in practice it's not much of a difference. I didn't post 4 significant digits to be a jerk or to presume I had no statistical errors. That's simply the number I got from my spam filter's summary, which is number of spams caught / total number of spams. It produced and displayed a statistic with four significant figures and I posted it here.
But when you say it is as good as it can get, or that it is better than something else, be prepared to justify your claims.
And I didn't say either. I said that "over 99% accuracy" (the claim I saw in TFA) is not that impressive anymore, and it's not. My Bayesian filter has been doing better than that for years.
Unless you searched through the 30,000 spams you received in that month, how would you know?
As a follow-up... I don't have to look through 30,000 spams. My spam filter will automatically order the spams in order of spaminess. By default, it only shows me the ones with a spam probability of 80% and below. The vast majority of spam gets 90%+ spaminess and I don't recall any false positive ever being north of 60%. So by looking at spams between 50% and 80% I actually don't have to look at much spam but at the same time I have a very, very low probability of missing false positives since the false positives that do exist don't come in north of 80%.
Now if a legitimate email came in north of 80%, that'd be some accomplishment! :)
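To make the review step concrete, here is a minimal Python sketch of the idea described above: sort messages by their spam probability and only eyeball the ones that landed between 50% and 80%. The function name `review_queue`, the message ids, and the scores are made up for illustration; the post does not say how the real filter stores or orders its results.

```python
def review_queue(scored, low=0.50, high=0.80):
    """Return (message_id, spam_probability) pairs within [low, high],
    ordered from most to least spam-like.

    `scored` is a hypothetical list of pairs produced by some filter.
    """
    in_band = [(msg_id, p) for msg_id, p in scored if low <= p <= high]
    return sorted(in_band, key=lambda pair: pair[1], reverse=True)


# Illustrative data only: ids and probabilities are invented.
scored = [("a", 0.99), ("b", 0.62), ("c", 0.51), ("d", 0.93), ("e", 0.78)]
print(review_queue(scored))
```

With the sample data, only "e", "b" and "c" come back for manual review; the 90%+ messages never need a human look.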
Have you run your filter on the TREC corpora? They simulate exactly the sort of deployment you're talking about. Under laboratory conditions the best filters get the sort of results you're talking about, but transferability to the field has yet to be established. And the best filters aren't what I'd call "Bayesian."
I have not run any diagnostic corpora against the filter, no.
As for "Bayesian," I think a big part of my filter's success is the use of extra Bayesian tokens. I don't just parse the message text into tokens; I also convert other aspects of the message into tokens that are then fed to the Bayesian calculation. For example, the lack of a subject becomes a token in and of itself. The presence of an HTTP address with an IP address instead of a domain name becomes another token. The presence of more than 5 images becomes a token, the presence of more than 10 becomes another, etc. So my Bayesian filter isn't just tokenizing incoming messages, but doing some basic analysis of them and converting noteworthy aspects into tokens that would otherwise never show up as anything the Bayesian math could use.
Perhaps that's helping the efficiency of the filter. That was the idea in doing so, of course! :)
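As a rough illustration of the extra tokens described above, a tokenizer along these lines would emit ordinary word tokens plus a marker token for each noteworthy property of the message. This is only a sketch in Python: the token names, the regular expressions, and the `extract_tokens` signature are assumptions, not the actual filter's code.

```python
import re

# Hypothetical marker-token names; the real filter's names are not given.
NO_SUBJECT = "*no-subject*"
URL_WITH_IP = "*url-with-ip*"
IMAGES_GT_5 = "*images>5*"
IMAGES_GT_10 = "*images>10*"

WORD_RE = re.compile(r"[A-Za-z0-9$'-]+")
# http(s):// followed by a dotted quad instead of a host name.
IP_URL_RE = re.compile(r"https?://\d{1,3}(?:\.\d{1,3}){3}(?=[:/\s]|$)", re.IGNORECASE)
IMG_RE = re.compile(r"<img\b", re.IGNORECASE)


def extract_tokens(subject, body):
    """Return word tokens plus marker tokens for notable message traits."""
    tokens = [w.lower() for w in WORD_RE.findall(subject + " " + body)]
    if not subject.strip():
        tokens.append(NO_SUBJECT)        # missing subject is itself a token
    if IP_URL_RE.search(body):
        tokens.append(URL_WITH_IP)       # link to a raw IP address
    image_count = len(IMG_RE.findall(body))
    if image_count > 5:
        tokens.append(IMAGES_GT_5)
    if image_count > 10:
        tokens.append(IMAGES_GT_10)      # emitted in addition to the >5 token
    return tokens


print(extract_tokens("", "visit http://192.168.0.1/buy now"))
```

The marker tokens then go through the same Bayesian counting as any word would, so the filter learns on its own how strongly each trait correlates with spam.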
So you get 30,000 spams a month, and 1.4%*30,000 = 420 legit emails/month.
I actually get "30,000+". Last month I got 36,590 messages. 35,739 (97.6%) were spam, 70 (0.1%) were virus (determined by very rudimentary conditions, basically just file extensions), and 781 (2.1%) were legitimate email.
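For anyone checking the arithmetic, the shares can be recomputed from the raw counts with a few lines of Python. The three counts are the ones quoted above; the helper name `truncated_percent` is made up, and the idea that the filter's summary truncates to one decimal place rather than rounding is only a guess that happens to reproduce the quoted percentages.

```python
def truncated_percent(count, total):
    """Percentage with one decimal place, truncated (not rounded).

    Integer math avoids floating-point surprises near a boundary.
    """
    tenths = count * 1000 // total
    return tenths / 10


counts = {"spam": 35739, "virus": 70, "legitimate": 781}
total = sum(counts.values())  # the three categories add up to 36,590

for label, n in counts.items():
    # Show the truncated figure next to the conventionally rounded one.
    print(label, truncated_percent(n, total), round(100 * n / total, 1))
```

Truncation would also explain why the quoted percentages add up to slightly less than 100.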
48 months * 420 legit/month = 20,160 emails.
I have cumulative statistics from May 27, 2004 which indicate 42,165 (4.3%) legitimate email, 904,258 (93.4%) spam, and 21,319 (2.2%) virus. I have monthly statistics prior to May 27, 2004 but I'd have to sum them manually and I'm not going to do that right now. :)
If it is really "a false positive or two" that's a 1/10,000 error rate. But if it's really "a false positive or two per year" it's 1/2500. And if it's really "a false positive or two per month" it's 1/210. Unless you searched through the 30,000 spams you received in that month, how would you know?
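The error rates in that estimate can be reproduced with a short Python sketch. It uses only the figures from the estimate itself (1.4% legitimate out of 30,000 messages a month, over 48 months) and takes "a false positive or two" as two; the variable names are made up for illustration.

```python
from fractions import Fraction

legit_per_month = round(0.014 * 30_000)   # 420 legitimate emails per month
months = 48
legit_total = legit_per_month * months    # 20,160 over four years

missed = 2  # "a false positive or two", taking the upper figure

rates = {
    "two in four years": Fraction(missed, legit_total),
    "two per year": Fraction(missed, legit_per_month * 12),
    "two per month": Fraction(missed, legit_per_month),
}

for label, rate in rates.items():
    # Fractions are reduced automatically, so the numerator is 1 here.
    print(f"{label}: 1 in {rate.denominator // rate.numerator}")
```

The exact values are 1 in 10,080, 1 in 2,520 and 1 in 210, which the estimate rounds to 1/10,000, 1/2500 and 1/210.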
Like I said, I'm not on a statistics crusade here. Do I know if there were 1 or 5 false positives I missed? No, I don't. But due to the nature of my communication, a real missed email would usually provoke a phone call inquiring as to why I didn't respond. That hasn't happened since soon after I started using the Bayesian filter and it was still learning. If I've missed any mails, they have not been critical. And the ones that I happened to see in the spam folder were always the kind that I was tempted to not even report as a false positive. I would think that if I find false positives, I do it with a statistically random probability of success--so if I was getting a lot of false positives that were real critical mail, I'd think I would have found one accidentally or it would have been reported to me by the sender.
You seem to be interested in super-exact statistics relating to spam filtering. I do not claim to have investigated it with that level of precision. What I can tell you is that my Bayesian filter works with a high enough success rate against spam and a low enough false positive rate that spam isn't a problem for me. I still avoid posting my email address on the web but I no longer bother using temporary email addresses when buying something online, etc. I have enough confidence in the spam filter that there's no reason for me to bother.
P.S. Do I know you?
I don't have any reason to think you do. But I guess I don't know that for sure. :)
If this response doesn't answer your questions, let me know and I'll email you.
99.77% is not preposterous, but I would be interested to know the methods you used to measure this, and in particular how you know that errors are not underreported.
I developed a spam filtering service which is still operating and is available to the public. I will not mention it here so as not to be accused of shameless self-promotion. But the service keeps track of the total number of messages received, how many are spam, how many false positives, and how many missed spam.
To measure 99.77% accuracy you'd need tens of thousands of messages. Did you really read them all carefully and adjudicate them without any access to (and hence influence from) the filter's opinion?
I've used the same email address since 1993. I get 30,000+ messages per month, 98.6% of which is spam. I have spam statistics on a monthly basis back to November 2002. In total, I've received 904,202 spams since May 27th, 2004--99.83% of which were caught by the Bayesian filter during that time. Last month, 99.91% of my spam was caught.
Did you do it twice so as to measure your own reliability? Did you provide totally accurate and immediate feedback to your filter in real-time, whenever it made a mistake?
I'm not on a crusade to prove accuracy so, no, I didn't do it twice. These are the real-world results I'm getting. Is it possible I have missed a false positive or two? Over the last 4 years, sure. I'm sure it must have happened. If I missed them, they're obviously not included in the statistics. But you're never going to have perfect statistics. What I can assure you is that the Bayesian filter caught 99.83% of the spam and I've never missed any email that I considered critical. The few false positives that it has committed and that I noticed were messages that were verging on spam and which I wouldn't have cared if they had been categorized as such--I marked them as false positives simply to help my filter in the future, not because its assessment was particularly wrong.
As for real-time feedback, I usually do report it immediately. If not immediately, within a few hours (if the spam was downloaded while I was away from the email program). Worst case, I always report it the same day.
I bet you are missing some errors and/or biasing your judgements to conform to those of your filter and/or not counting "gray mail" which could go either way.
It can't go either way. It's either spam or it's not.
The major difference between your filter and the one in TFA is that yours requires that you continually train it and the one in TFA doesn't.
Bayesian requires that you initially train it. Once it's reasonably trained, it essentially trains itself. Yes, I report the occasional spam that gets through. But with 99.77% accuracy, that's not very often.
A second option may be to ask users to mark their spam manually for a day or so. That should also be manageable.
TFA specifically says: "users do not need to help the system identify spam other than to express personal preferences, if they so desire." The "personal preferences" is maybe enabling different sets of keywords, but it definitely sounds like they claim that no multi-day training is necessary.
If multi-day training is necessary, it's no better than Bayesian. My Bayesian filter generally gets north of 99.8% accuracy. Good enough for me, and it works stand-alone; it doesn't require tens of thousands of other users to work.
Now, let's say that BOTH YOU AND JANE receive the same message M.
That's the problem I have with this. Spam stopped being truly mass produced years ago. Each spam is now normally sent to each user with a different mix of nonsense. The probability of two different people receiving the same message is virtually zero.
I happen to think the solution is metered billing and micropayments
I happen to think that that cure is worse than the ailment the system currently has to deal with. I'm in favor of a new mail protocol that could help *reduce* (though not entirely eliminate) the problem--but micropayments? No, absolutely not.
Was the previous technology less than 1% accurate?
I was wondering the same thing. They claim "over 99% accuracy." My simple Bayesian filter varies between 99.75% and 99.91%. "Over 99% accuracy" isn't all that amazing anymore, really. I'm still trying to understand how this new approach works, though.
I posted to my political-ish blog the other day that if we had instead spent the money we're spending on Iraq on rebuilding the tsunami-ravaged countries (several of which were strongly Muslim), that would have done a lot more good. Actually, if we had spent even 10% of that money it probably would have done more good.
Hmm, that's interesting. I'm a conservative and while I tend to disagree with just giving away money, tragedies like the tsunami actually do make for interesting opportunities where "giving" away aid like that would seem perfectly sensible to me. It's moral, it's compatible with the ideals of most conservatives (not giving away money uselessly but definitely helping those that REALLY need help), and it would definitely have some positive impact on how we'd be viewed around the world. If we helped them rebuild a little further inland, it would also start prepping them for having to deal with the rising sea levels that are supposedly expected.
I'm not sure how much it would impact terrorism, though, since most of the tsunami victims are not in the main havens of the terrorists that threaten us. It might give us brownie points with Europe. But I don't really care what Europe thinks of us.
I'm generally against hand-out foreign aid, but I'd be all in favor of setting aside a good $10, $20, or $30 billion a year that would be used precisely to give overwhelming help to countries when they have disasters like the tsunami.
Indeed, I sometimes think it is a point of pride for some immigrants to NOT learn english when they live in the US.
I would assume you don't know many immigrants from Latin America, then. Immigrants take no pride in not knowing English. Yes, believe it or not, even immigrants dislike being ignorant. Some immigrants may be more motivated than others, but I've known a lot of immigrants--many of whom don't speak English--and not a single one of them is PROUD of that fact. In fact, most are ashamed.
No complaints about Best Buy from me!
Nah, I'm bored too.
Yet that's the success rate I'm getting.
I second the "bravo." Nicely done. And, no, I didn't intentionally set that up for him. :)