Seeking Prior Art on Markov-Based SPAM Filters?

← Back to Stories (view on slashdot.org)

Seeking Prior Art on Markov-Based SPAM Filters?

Posted by Cliff on Thursday November 28, 2002 @03:38AM from the when-patents-stand-in-the-way-of-a-spam-free-inbox dept.

Theovon asks: "One of today's hot topics seems to be SPAM filtering. I have wanted very much to make my own contribution to this, but I have been thwarted by a patent. Probably before Paul Graham began working on his Bayesian SPAM filter, I began work on a Markov-model based filter. Things were going well until I posted to the usenet about it, and got this Google Groups response. This usenet post describes a Mitsubishi patent issued in 2000, US Patent #6,112,021. One of the key aspects of my design was that I would train the Markov model with both positive and negative examples. This patent is spot-on what I'm doing, because it deals specifically with the idea of using negative examples in Markov models to filter, among other things, 'inappropriate web content.' Well, the patent looks like a good one, assuming they really developed this idea. I mean, of course, I would think it's a good idea; I came up with it too, and it works very well. (Then again, I also thought it was 'obvious')..."

"Not too long ago, I was discussing it with a co-worker who has a degree in Electrical Engineering. She had taken an AI class in college which mentioned the idea of negative examples for Markov models, and this was well before the year 2000. The bottom line is that I think I have a great idea that could potentially add to our collective arsenal against the ever-growing SPAM problem. I would very much like to work on it and publish it under GPL, but before I can do this, I have to protect my self against the patent and the large pocketbook of Mitsubishi.

I'm not asking for legal advice. I have already consulted an attorney, and it was suggested that I should remove the SourceForge project, which I have done. I also attempted to contact the EFF (no luck so far). I'm asking for those of you out there who are familiar with this sort of thing to help me to find verifiable prior art that dates from before the 2000 patent. I would very much like to share with the world my ideas and the code I have written, but this is standing in our way."

5 of 36 comments (clear)

Min score:

Reason:

Sort:

Re:What's next, a patent on counting sheep? by the+eric+conspiracy · 2002-11-28 05:37 · Score: 4, Informative

The whole point of mathematical techniques is that they work no matter what you apply them to--which is why you didn't used to be able to patent mathematics.

Current patent law does forbid patenting natural law, including mathematical principals. The problem is not the lack of that law, but the fact that clever lawyers have been successful in shaving ever closer to effectively getting coverage of an algroithm.

In this patent what we have is the following claim:

1. A Markov model discriminator system comprising:

means for providing an input sequence to be classified;

means coupled to said means for providing an input sequence for performing two different model likelihood calculations based on two different Markov model parameters representing two different classes of examples, each class having a predetermined characteristic on which discrimination between classes is based;

means for comparing said likelihood calculations as to which of said characteristics said input sequence is likely to exhibit based on said comparison, thus to classify said input sequence; and

means for training said means coupled to said input sequence for performing said two calculations by generating said Markov model parameters taking onto account both negative and positive examples of said different classes.

Now notice that nowhere is there an attempt to claim ownership of the actual algorithm, only means of implementing that algorithm.

Another key word here is "comprising". This word is key to understanding what is covered by this invention. In this case it covers any means of implementing the process described that contains all of the steps described. Other steps can be added and the patent would still apply.

This link gives a good overview on how to read a patent claim.

http://www.tms.org/pubs/journals/JOM/matters/mat te rs-9511.html
Markov Models, Negative Reward, Negative Examples by hawkstone · 2002-11-28 05:48 · Score: 5, Informative

I'm not sure I fully understand what you're attempting and what prior art you are looking for, but let me try anyway.

The google groups message says the patent covers explicity "markov model discriminator using negative examples". Markov decision problems are often those associated with a set of states and a policy which transitions between them. For spam filtering, I would guess there are three states: unknown, spam, non-spam, and the policy needs only determine if the transition should go between the unknown and and either the spam or non-spam state.

The optimal policy in an MDP is static, and thus it would say "the optimal policy is to mark an email spam". This sounds useless. A Partially Observable MDP (POMDP) may follow more with your plans since it follows percepts and probability distributions over the current set of beliefs.

As for training for an optimal policy for MDPs, one algorithm ("Value Iteration") makes use of a reward function. This reward function is explicitly allowed to be negative, and probably should be for some decisions or else the problem is trivial. And "positive" or "negative" examples is just a boolean simplification of multiple categories. Why not have "pr0n" "ads" "personal" "work" and a bunch of other categories?

In this case, positive and negative are meaningless, but all (both) are *always* neeeded to define the optimal policy.

In other words, I'm not sure I understand enough of what you are attempting to help find prior art, but from what I can tell "negative" examples are inherently part of defining the optimal policy in an MDP. For a reference, Russel & Norvig, Artificial Intelligence, A Modern Approach (1995) has a ton of information about MDPs and the training of them.
Re:Obviosly Not by the+eric+conspiracy · 2002-11-28 05:56 · Score: 4, Informative

This obviously depends on the definition of "ordinary" and "the art"

Since this is law, not programming, definitions of terms are worked out over time. One thing to keep in mind is that in an infringement trail, if you try to assert obviousness to invalidate a patent, the burden of proof is on you.

Interestingly, the long-felt commercial need for an effective spam filter may put the use of Markov algorithms as spam filters into the 'non-obvious' category, and thus patentable.

Here is how it is normally done in an infringement case:

The court MUST determine obviousness based on four factual inquiries:

(1) the differences between the prior art and challenged claims;

(2) the level of ordinary skill in the field of the pertinent art at the time of plaintiff's invention;

(3) what one possessing that level of skill would have deemed to be obvious from the prior art reference; and

(4) objective evidence of obviousness or nonobviousness.

See B.F. Goodrich Co. v. Aircraft Braking Systems Corp., 72 F.3d 1577, 1582 (Fed.Cir.1996); Hoover Group, Inc. v. Custom Metalcraft, Inc., 66 F.3d 299, 303 (Fed. Cir. 1995). It is important to note, however, that the test endorsed by the United States Supreme Court contained only a three part inquiry, which does NOT mandate that secondary considerations be considered. Rather, the Supreme Court test, which was set forth in Graham v. John Deere Co., 383 U.S. 1, 86 S.Ct. 684, 15 L.Ed.2d 545 (1966), states that secondary considerations "might" be useful in determining obviousness. Graham v. John Deere Co., 383 U.S. at 17-18. Nevertheless, the Graham factors as set forth by the Supreme Court are:

(1) the scope and content of the prior art;

(2) the differences between the prior art and the claims at issue; and

(3) the level of ordinary skill in the pertinent art.

Graham v. John Deere Co., 383 U.S. at 17.

Notwithstanding, today secondary considerations of obviousness MUST be considered by the court before reaching a conclusion on obviousness. These secondary considerations include, but are not limited to:

(1) the commercial success of the invention;

(2) whether the invention satisfied a long-felt need in the industry;

(3) failure of others to find a solution to the problem at hand; and

(4) unexpected results.

B.F. Goodrich, 72 F.3d at 1582; see also Hybritech, Inc. v. Monoclonal Antibodies, Inc., 802 F.2d 1367, 231 U.S.P.Q. 81 (Fed. Cir. 1986), cert denied, 480 U.S. 947 (1987).
Much prior art by .@. · 2002-11-28 06:31 · Score: 3, Informative

Markov models have been used extensively in cognitive science, particularly the field of text comprehension, for the purpose of studying text comprehension and retension.

Spreading acitivation models are mathematically similar to Markov chains, and ha ve also been used for similar purposes, trained and/or designed with both positi ve and negative exemplars.

Further, concept categorization and discovery research in cognitive science has used Markov chains for automated categorization, once again using positive and n egative examples.

If you're serious about this, I'd check the literature in this area, particularl y journals like Discoure Processes and similar. Also check the Proceedings of t he Cognitive Psychology conference.

All of this is published, peer-reviewed work well before 2000. Your best bet is to hit a decent university library and start searching.

Heck, my dissertation work didn't use negative examples, but the spreading activ ation models were originally Markov chains.

Markov chains and their use in AI, cognitive psychology, speech/text recognition , categorization, and text comprehension have been in use for decades. With the interdisciplinary approach common in the 80's and 90's, there were many efforts that adapted things like Markov chains and neural network learning algorithms, simulated annealing techniques, and so forth. It shouldn't be hard to dig some up. Worst case, find a cognitive psychology professor specializing in text proc essing and ask. UCSD has a strong showing, as UC Boulder, Manitoba, U Memphis, U Chicago, and many, many others.

--
.@.
a better mousetrap? by octalgirl · 2002-11-28 13:24 · Score: 3, Informative

IANAL, but it's my understanding that just because there is a patent, it doesn't mean you can't improve on the invention. Quote from http://usgovinfo.about.com/blpatents.htm:

Utility Patents - The most common type of patent, utility patents are issued to inventors of new devices and processes or improvements to existing devices or processes. Most utility patents are issued for inventions that improve existing devices. Inventors of the proverbial "better mousetrap," seek utility patents.

The existing patent you mention says "inappropriate web page material"

It may be possible the wording can be changed to be more specific - "unsolicited electronic mail messages" and "advertisements". I've bounced around the patent site a bit, and it seems there are a lot of patents that are given with prior art in existence, they just have to find it and say 'reference this patent' or something. If someone can get a patent for swinging on a swing, it's worth a shot if you've got the $$.