Can a Bayesian Spam Filter Play Chess?
martin-boundary writes "The typical Bayesian spam filters learn to distinguish ham from spam just by reading thousands of emails, but is this all they can do?
This essay shows step by step how to teach a Bayesian filter to play chess against a human, on Linux, with XBoard."
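For readers who have not seen how such a filter works, here is a minimal sketch of a naive Bayes token classifier. It is not the filter or the method used in the essay; the function names `train` and `classify`, the add-one smoothing, and the sample data are all assumptions made for illustration. The point is only that the classifier counts tokens per label, so the tokens could just as well be chess moves as words.

```python
import math
from collections import Counter

def train(docs):
    """docs is a list of (tokens, label) pairs.
    Returns per-label token counts and per-label document counts."""
    token_counts = {}
    label_counts = Counter()
    for tokens, label in docs:
        label_counts[label] += 1
        token_counts.setdefault(label, Counter()).update(tokens)
    return token_counts, label_counts

def classify(tokens, token_counts, label_counts):
    """Return the label with the highest log-probability under naive Bayes."""
    vocab = {t for counts in token_counts.values() for t in counts}
    n_docs_total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        counts = token_counts[label]
        total = sum(counts.values())
        score = math.log(n_docs / n_docs_total)  # log prior
        for t in tokens:
            # Add-one smoothing (an assumption) so unseen tokens do not zero out the score.
            score += math.log((counts[t] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up training data, purely illustrative.
docs = [
    ("cheap pills buy now".split(), "spam"),
    ("meeting agenda for monday".split(), "ham"),
    ("buy cheap watches".split(), "spam"),
    ("lunch on monday".split(), "ham"),
]
token_counts, label_counts = train(docs)
print(classify("cheap pills".split(), token_counts, label_counts))
```

Swapping the word lists for move lists and the labels for game outcomes gives the general flavor of the idea, though the essay's actual setup may differ.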
What is the point of this? Why modify something to do something at a subpar level when one can just design something to do the job well? (I know the answers: to see if it can be done, to have fun, etc. But still, why?)
Wouldn't this be like modifying a Ferrari with off road tires and using it for Baja racing, when you would just be better off buying a truck?
I wish they would focus more on making the spam filter work well, rather than diddling with a chess-bot. I still find important email in my spam folder, while still having spam in my inbox...
Sort of like when I got a new cell phone: the chap at the store told me about all the bells and whistles, whilst all I wanted was a phone that worked as a phone and didn't drop calls....
And All I Ask is a Tall Ship And a Star to Steer Her By
The short answer is "yes".
Now, ask that question again, this time including the word "well".
Who cares? It is kinda cool someone even thought of this. Methinks you need to review the definition of "nerd".
See my journal for slashdot ID's by year. Mine created in 2005. http://slashdot.org/journal/289875/slashdot-ids-by-year
What a great article. Talk about lateral thinking.
I can imagine, depending on how many games are played and the available memory space, that it will develop to have a decent opening. But nothing more than that.
I actually found it to be a decent lesson in why spam filters are only a temporary solution to a problem. If you cut out the "mumbo jumbo" portions of it, it could be used to explain why reactionary methods are only barely sufficient.
The basic premise, once you get to the very end, is one that anyone SHOULD know based on the nature of a spam filter, but some people seem to have difficulty understanding: spam filters can react, often quite well, but they can never predict. As he puts it, there is previous history but no strategy. When you are only trying to protect yourself from a limited number of bad results that are similar to other bad results, that's sufficient. However, it does not (and cannot) address the problem at its root. As long as there are thinking humans trying to beat the filter, some will get through.
Never confuse volume with power.
What do you expect, your cat is an idiot! I bet your cat can't even read!
the author should have the spam filter analyze the games in reverse (from victory to beginning). that would probably produce better results. of course the spam filter would need to handle more than 7 moves out.
No, it won't. This is actually what was done in the article.
The problem is in the length of the sequences. If you could learn sequences of, say, length 60 (for 30-move games), maybe your filter would become a reasonably good player. Unfortunately, the need for computational resources increases exponentially with each extra move added.
The complexity of chess is simply too high to learn this way.
Chess openings might be learned this way, but it is not very useful to do so. The results will be worse than opening libraries, which are very good at the moment.
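To put rough numbers on the sequence-length argument above, here is a back-of-the-envelope sketch. The branching factor of 30 legal moves per position is an assumed round figure for illustration, not a measured value, and `sequence_count` is a hypothetical helper name.

```python
def sequence_count(branching_factor: int, plies: int) -> int:
    """Number of distinct move sequences of the given length,
    assuming the same number of legal moves at every step."""
    return branching_factor ** plies

ASSUMED_BRANCHING = 30  # illustrative assumption, not a measured value

for plies in (7, 14, 60):
    count = sequence_count(ASSUMED_BRANCHING, plies)
    print(f"{plies:2d} plies: about 10^{len(str(count)) - 1} sequences")
```

Each extra ply multiplies the count by the branching factor, which is why a table of learned sequences stops being practical long before it covers a whole game.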
Not only is the topic unusual and entertaining, but this article is also a good tutorial on data massaging, pattern matching, and combining disparate unix tools to accomplish a task. This article showcases how powerful and useful unix command tools can be.
If you want a good step by step tutorial to help you understand the usage of unix command line tools to accomplish a non trivial task, then you should read this.
Reading this article, I was reminded of the old children's story about "stone soup". You remember that one -- someone advertises that he can make soup from a stone, and various others gather around to watch this amazing feat. Well, the soup needs a little extra seasoning, so he gets someone to put in some carrots while the stone cooks, then he adds some onions, etc, etc... I think you can see where this is going.
Sure you can make a chess playing program from a spam filter.
You just need to throw in a legal move generator, and a game database, and some capture heuristics, and position displayer, etc, etc...
"it is a successful tiny step in a direction that no-one else has thought of going"
Except for the fact that the Bayesian filter in question was originally designed for identifying spam, this statement is incorrect. Bayesian filters have been applied to chess in a variety of ways, including analyzing move sequences and analyzing the current placement of pieces. In general, the strategic algorithms (that project the game forward) have been better competitors.
Not that chess algorithms do not contain statistics-based procedures, only that ones that also look forward in the game have provided better outcomes.
You are completely correct. Chess is a very, very complicated game to teach to a computer and has resisted all our current machine-learning methods. The reason is simple - current machine learning is not flexible (invariant). This means that if you teach it how to play in a given position, it will not be able to extrapolate what it learned and apply that knowledge to a similar but different position.
Machine learning, as we know it, involves giving a black box program large amounts of data (in this case sample games) and having that black box "remember" the data. We have given this black box no way to actually "think" about what it has been given. It has no memory that says "controlling a large number of squares is good." Instead, it has a memory saying that "in this specific position, a grandmaster would have moved his pawn forward." This is simply a terrible way to teach a computer chess, and imo until machine learning can be more flexible and "think," it will be useless for things like chess.
Just to clarify, computers *can* beat just about every human chess-player. They just don't use machine learning. Instead, they look at every possible position that can occur from the current position (given time and memory constraints), run a simple program on all the positions to determine which is best (hard-coded by a human, of course) and pick the move that will lead to the best position. Simple and requiring a lot of speed but no "intelligence," this method is perfectly suited to the computers of today.
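The look-ahead approach described above is usually called minimax search. Here is a minimal sketch on a toy game rather than chess: players alternately remove 1 to 3 stones, and whoever takes the last stone wins. The game, the function names and the scoring are all assumptions chosen to keep the example short; a real chess engine adds a move generator, a much richer evaluation function, and pruning.

```python
from typing import List

def legal_moves(stones: int) -> List[int]:
    """Toy rule: remove 1, 2, or 3 stones, never more than remain."""
    return [n for n in (1, 2, 3) if n <= stones]

def evaluate(stones: int, maximizing: bool) -> int:
    """Hand-written static score from the maximizing player's point of view.
    With no stones left, the player who just moved took the last one and won."""
    if stones == 0:
        return -1 if maximizing else 1
    return 0  # search was cut off early, so the outcome is unknown

def minimax(stones: int, depth: int, maximizing: bool) -> int:
    """Score a position by looking ahead up to `depth` moves."""
    if stones == 0 or depth == 0:
        return evaluate(stones, maximizing)
    scores = [minimax(stones - n, depth - 1, not maximizing)
              for n in legal_moves(stones)]
    return max(scores) if maximizing else min(scores)

def best_move(stones: int, depth: int) -> int:
    """Pick the move leading to the best-scoring position for the mover."""
    return max(legal_moves(stones),
               key=lambda n: minimax(stones - n, depth - 1, False))

print(best_move(5, 10))  # leaves the opponent with 4 stones
```

Nothing here is learned from past games: the program only enumerates future positions and scores them with a fixed rule, which is the contrast the comment draws with the Bayesian filter.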