Can a Bayesian Spam Filter Play Chess?
martin-boundary writes "The typical Bayesian spam filters learn to distinguish ham from spam just by reading thousands of emails, but is this all they can do?
This essay shows step by step how to teach a Bayesian filter to play chess against a human, on Linux, with
XBoard."
What a great article. Talk about lateral thinking.
I actually found it to be a decent lesson in why spam filters are only a temporary solution to a problem. If you cut out the "mumbo jumbo" portions of it, it could be used to explain why reactionary methods are only barely sufficient.
The basic premise, once you get to the very end, is one that anyone SHOULD know based on the nature of a spam filter, but some people seem to have difficulty understanding; spam filters can react, often quite well, but they can never predict. As he puts it, there is previous history but no strategy. When you are only trying to protect yourself from a limited number of bad results that are similar to other bad results, that's sufficient. However, it does not (and can not) address the problem at it's root. As long as there are thinking humnans trying to beat the filter, some will get through.
Never confuse volume with power.
the author should have the spam filter analyze the games in reverse (from victory to beginning). that would probably produce better results. of course the spam filter would need to handle more than 7 moves out.
No, it won't. This is actually what was done in the article.
The problem is in the length of the sequences. If you could learn sequences of, say, length 60 (for 30-move games), maybe your filter would become a reasonably good player. Unfortunately, the need for computational resources increases exponentially with each extra move added.
The complexity of chess is simply too high to learn this way.
Chess openings might be learned this way, but it is not very useful to do so. The results will be worse than opening libraries, which are very good at the moment.
Not only is the topic unusual and entertaining, but this article is also a good tutorial on data massaging, pattern matching, and combining disparate unix tools to accomplish a task. This article showcases how powerful and useful unix command tools can be.
If you want a good step by step tutorial to help you understand the usage of unix command line tools to accomplish a non trivial task, then you should read this.