Bayesian Tail
flok writes "We all know anti-spam software that uses Bayesian filtering. The results with these are amazingly good. So that got me thinking: why not create a tool that monitors logfiles and uses a Bayesian filter to decide which events to display and which to hide? That's why I created btail. Btail is just that: it monitors a logfile and filters it with a Bayesian filter. The results have exceeded my own expectations!"
Still very preliminary at this point, but shows promise. Now, to build and try it out!
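For anyone wondering what "filtering a logfile with a Bayesian filter" could look like in practice, here is a minimal sketch in Python. It is not btail's actual code: the LineFilter class, the "show"/"hide" labels, and the tokenizer are all assumptions made up for illustration. It trains a tiny naive Bayes classifier on lines you have marked and then decides whether a new line is worth displaying.

```python
import math
import re
from collections import Counter


class LineFilter:
    """Hypothetical naive Bayes classifier over log lines: 'show' vs 'hide'."""

    def __init__(self):
        self.tokens = {"show": Counter(), "hide": Counter()}
        self.lines = {"show": 0, "hide": 0}

    @staticmethod
    def tokenize(line):
        # Lowercased alphabetic words only, so timestamps and numbers are ignored.
        return re.findall(r"[a-z]+", line.lower())

    def train(self, line, label):
        self.lines[label] += 1
        self.tokens[label].update(self.tokenize(line))

    def score(self, line, label):
        # log P(label) + sum of log P(token | label), with add-one smoothing.
        vocab = len(set(self.tokens["show"]) | set(self.tokens["hide"]))
        total = sum(self.tokens[label].values())
        logp = math.log((self.lines[label] + 1) / (sum(self.lines.values()) + 2))
        for tok in self.tokenize(line):
            logp += math.log((self.tokens[label][tok] + 1) / (total + vocab + 1))
        return logp

    def should_show(self, line):
        return self.score(line, "show") >= self.score(line, "hide")


f = LineFilter()
f.train("Overheat in reactor core", "show")
f.train("Containment weakening", "show")
f.train("session opened for user root", "hide")
f.train("session closed for user root", "hide")

print(f.should_show("Overheat in plasma injector"))    # True
print(f.should_show("session opened for user flok"))   # False
```

A real tool would read the logfile as it grows and let you correct the filter's mistakes as you go; this only shows the scoring step.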
I am not your blowing wind, I am the lightning.
Why not use it to colorize, or to rebuild the logs in HTML?
01:56 Plasma injector #1 offline, switching to #2 backup.
02:23 Overheat in plasma injector #2.
02:44 Failure to shutdown plasma injector #2.
02:58 Overheat in reactor core.
03:20 Containment weakening.
03:25 Containment weakening.
03:30 Containment weakening.
03:35 Five minutes to containment failure.
03:40 FIVE SECONDS TO WARP CORE BREACH!!!
Better be careful to train the filter on those warnings that don't happen very often, but that you really want to know about when they do.
One line blog. I hear that they're called Twitters now.
I currently use CRM114, and on the mailing list someone (Evan Prodromou) has created a program that does just this using the CRM114 language. It is called "Monkeyplexer", based on the idea that you could train a monkey to sort your mailbox into folders.
Pop over to the CRM114 site and search the general list archives for monkeyplexer to find the discussions about it.
Here is the last version announcement that I could find in my mailbox:
monkeyplexer is a tool for automatically sorting incoming email messages into appropriate folders. A new version of monkeyplexer, 0.7, is now available. http://bad.dynu.ca/~evan/monkeyplexer/monkeyplexer-0.7.tar.gz
This version includes the following changes:
You can specify which mailboxes to use, instead of which mailboxes to exclude. This can save some typing and some time at runtime, at the expense of dynamically updating the list.
You can tell the monkeytrainer to only train messages that were received in the last few weeks, days, hours, minutes -- whatever.
The monkeyplexer remembers which messages have been trained for which folders. If you train a message for a different folder, the monkeyplexer will automatically forget the first folder before training for the new one.
Thanks to everyone who has installed monkeyplexer already. I hope this new version helps some people out. I find it easier and more accurate.
~ESP
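The "forget the first folder before training for the new one" behaviour in that announcement can be sketched roughly as below. This is Python, not the CRM114 code monkeyplexer is actually written in, and the FolderTrainer class, its fields, and the message ids are hypothetical names for illustration. It also assumes the message text is the same on retraining as it was the first time.

```python
from collections import Counter, defaultdict


class FolderTrainer:
    """Hypothetical bookkeeping: remember which folder each message was trained for."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # folder -> token counts
        self.trained = {}                   # message id -> folder it was trained for

    def train(self, msg_id, text, folder):
        tokens = text.lower().split()
        previous = self.trained.get(msg_id)
        if previous == folder:
            return  # already trained for this folder; nothing to do
        if previous is not None:
            # Forget the earlier training before learning the new folder.
            self.counts[previous].subtract(tokens)
            self.counts[previous] += Counter()  # drops zero and negative entries
        self.counts[folder].update(tokens)
        self.trained[msg_id] = folder


t = FolderTrainer()
t.train("m1", "weekly build report", "work")
t.train("m1", "weekly build report", "lists")  # retrain for a different folder
print(t.trained["m1"], dict(t.counts["work"]), dict(t.counts["lists"]))
```

Keeping the message-to-folder map is what makes the correction cheap: without it, a misfiled message would keep contributing its words to the wrong folder.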
Bayesian filtering could be used for lots of things besides email spam. One example could be wikis: telling spam edits from ham edits (well, yes, it is spam here too). I've had some other ideas that involve Bayesian filtering, but they've escaped me for the moment.
Before you walk a mile in someone's shoes, you should insult them so you know how they are and what they're doing.
All due respect, you're being a bit hard on the guy. He's not doing badly here.
:|
The [brackets] used in the usage message are standard in the Unix world for specifying an optional or default argument. Just look at any man page. So that, actually, is pretty straightforward. The name of the default config file would likely also be spelled out in the man page, which I would expect, so that's not confusing.
As for changing the if construct into a switch, well, I'm trusting the accuracy of your excerpt, but I didn't find his code to be very difficult to read, to be honest, and certainly not a candidate for DailyWTF, which typically contains laughably horrible code.
As far as other code may go, the guy states that this is in a nascent stage, so jumping on his source files seems like a bit of an easy shot.
Chr0m0Dr0m!C
It's 10 PM. Do you know if you're un-American?
assert(expired(knowledge));