Would you want to live in a castle? No running water, no heat, no insulation, hard to modify... Even most of the 100-year-old homes I've seen are awful by modern standards. The rooms just weren't designed for the way people live today. Do you really think you can predict what sort of house people would want to live in 300 years from now?
Maybe a better question is how to make your house easy to adapt to new needs and easy to dismantle and recycle once the adaptability isn't good enough.
Given how bad and dangerous some drivers are, prohibiting motor vehicles on sidewalks seems like a fine thing to do. Are people really going to be that much safer and more attentive drivers on a segway than in a car?
By leaving it out, Paul implicitly assumes that 50% of mail is spam -- that's his "prior" estimate of the spam rate.
Including the prior is only important if you want to treat the result as a probability, which, as Gary points out, isn't reasonable to do with Naive Bayes because of the incorrect independence assumption. For a classification task, it's enough to set the cutoff correctly to give you the performance you want (preferably through automated learning, but also potentially set by hand). What's really missing is the notion of a loss matrix that formally defines the value of the different types of classification errors (false positive, false negative) and that guides the cutoff selection.
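To make the point about the prior concrete, here is a minimal sketch, not Paul's actual filter. It assumes the score is a Naive Bayes log-odds sum, and the per-word log-likelihood ratios and the 80% spam rate are made-up numbers. It shows that leaving the prior out is the same as assuming 50%, and that any other prior only adds a constant, which a shifted cutoff absorbs.

```python
import math

def log_odds(p):
    """Convert a probability into log-odds."""
    return math.log(p / (1.0 - p))

def spam_score(word_log_ratios, prior_spam=0.5):
    """Naive Bayes log-odds: log prior odds plus per-word log-likelihood ratios."""
    return log_odds(prior_spam) + sum(word_log_ratios)

def is_spam(score, cutoff):
    """Classify by comparing the score against a cutoff."""
    return score > cutoff

# Hypothetical values of log(P(word|spam) / P(word|ham)) for one message.
ratios = [1.2, -0.4, 0.9]

implicit = spam_score(ratios)        # prior left out, i.e. a prior of 0.5
explicit = spam_score(ratios, 0.8)   # assumed prior: 80% of mail is spam

# The prior contributes a constant offset that does not depend on the message.
shift = explicit - implicit

# So a cutoff on the explicit score is the same as a shifted cutoff on the implicit one.
cutoff = 2.0
same_decision = is_spam(explicit, cutoff) == is_spam(implicit, cutoff - shift)
```

If the cutoff is tuned on real mail anyway, the tuning picks up that offset, which is why the prior only matters when the score is read as a probability.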
What I'd like to see, and I suspect I'm not alone here, is similar software that can sort email into any number of categories, not just spam and non-spam.
Any content-based classifier that works for spam/non-spam could also work for other categories, though the signals, and therefore the accuracy, might be different.
But enough theory. What you want is ifile. It does exactly what you describe.
ifile is a general mail filtering system that works with a mail client to intelligently filter mail according to the way the user tends to organize mail. ifile uses the machine learning algorithm Naive Bayes to classify e-mail documents.
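For a feel of how Naive Bayes extends past spam/non-spam, here is a toy sketch of a multi-folder classifier. It is not ifile's code or its interface; the class name, the whitespace tokenizer, the add-one smoothing, and the sample messages are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesFolders:
    """Toy multinomial Naive Bayes over any number of folders (categories)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # folder -> word -> count
        self.doc_counts = Counter()              # folder -> number of messages
        self.vocab = set()                       # every word seen in training

    def train(self, folder, text):
        words = text.lower().split()
        self.word_counts[folder].update(words)
        self.doc_counts[folder] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_folder, best_score = None, float("-inf")
        for folder in self.doc_counts:
            counts = self.word_counts[folder]
            total_words = sum(counts.values())
            # Log prior: fraction of training mail filed in this folder.
            score = math.log(self.doc_counts[folder] / total_docs)
            for w in words:
                # Add-one smoothing so an unseen word does not zero out a folder.
                score += math.log((counts[w] + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_folder, best_score = folder, score
        return best_folder

nb = NaiveBayesFolders()
nb.train("work", "meeting agenda for the project deadline")
nb.train("family", "dinner on sunday with grandma")
nb.train("spam", "cheap pills buy now limited offer")
print(nb.classify("project meeting moved"))  # prints "work"
```

Nothing in the scoring loop cares how many folders there are. The signals are just word counts per folder, so accuracy depends on how distinct the vocabulary of each folder is.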
I'm trying to be paranoid here, but for craps sake, all these records are already tossed out in the public domain.
The fact and amount of your transaction are essentially public record. The itemized list may not be. In any case, I don't care if marketroids know what I buy. They're just going to send me junk mail. The government might decide to throw me in jail, which is significantly more annoying.
I'm not sure that I'd buy an algorithm book if they stuffed everything
on some machine readable medium, either. Those things tend to go bad,
and to go missing.
You are allowed to make backups, you know. Besides, space is
cheap, so you should just copy those things to your hard drive. That
way they're at hand when you need them, and they get backed up with
the rest of your system.
First of all, our storage needs are pretty flexible. If we had ten times as much space, we'd stop using mp3/ogg and switch to a lossless compression scheme. And as digital cameras get better, photos will take up more space.
But the real problem isn't that no device will be able to hold my data. The problem is that such a device will be much larger than I want. They already are. Right now I need about 34GB (most of which is music). Ideally, I wouldn't need a big chunk of metal under my desk to store it. I'd much prefer to have everything on my laptop or even my iPAQ. If I could fit 100GB on a microdrive, I would use it up tomorrow.
If you're just going to use it in your car, this isn't the best thing, but the article overlooks the best reason to use an iPAQ and enclosed drive: you can use it without the car. It has its own battery, so you can drop it in your bag and walk around with 40GB of music. (IBM has a low power 40GB 12.5mm 2.5" drive.)
I'm going this route as soon as the 2-slot PCMCIA sleeve comes out so I can have a microdrive for OS and buffer and keep the big drive spun down most of the time.
the fact that it is an obvious idea to any programmer [..] is not something an examiner can use to reject the application.
Yes, it is. That's stated specifically in 35 USC 103(a).
A patent may not be obtained [..], if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains.
If this holds up, they'll soon find themselves being sued by pirates, medieval knights, firemen, and the association of people with bumpy yellow heads.
Planning and scheduling have been considered AI problems for a long time. Pretty much anything trying to solve an AI problem is considered to be AI technology. It just may not be interesting or nontrivial technology. Many people overvalue the term "AI", thinking that anything associated with that label should be able to appreciate a fine wine or cry at an opera, but there's really no objective way to decide which techniques are nontrivial enough to warrant the stamp of pretentiousness.
Originally A.I was a goal to achieve a "thinking machine", one that didn't require outside input
If by "outside input" you mean preprogramming, then that's a meaningless definition. Everything is programmed, in hardware or software, through easily understood symbolic algorithms, opaque matrices, or bundles of neurons.
as far as strictly defined by academia today, yes, you're absolutely correct, a.i. is just solving problems
The definition is vague, but it's narrower than just solving problems. A crowbar solves problems, but few would call it AI. Originally, AI was a term to describe making people out of sand. Some core problems were identified (vision, language, planning, learning) and some techniques were invented or applied (artificial neural nets, belief networks, support vector machines). Now AI describes those core problems, even in contexts not meant to result in fully-functioning people, and also describes the technology that's often applied to those problems.
Seems like the problem of classifying what is offensive is just as hard as classifying what programs will terminate.
You're saying, "Gosh, that sounds hard. I guess it must be impossible.", only you're saying it in computer science terms that you apparently read off a cereal box. Either learn the science behind the terms, or stick to shorter words so lazy people can tell how ignorant you are. Besides, they don't need to do it perfectly, they only need to do it well enough to satisfy some people, and approximate solutions to the halting problem are much less of a big deal.
You must have read a different article than I did. I missed the part where they said "Neural nets are magic, so we don't have to do any hard work.". Any classifier is only as good as the features given to it.
The big question is what sort of feature extraction they're using. The answer is: we don't know, because they haven't told us. That doesn't mean it's crap, it means we don't know.
Yes, varying standards of classification are a problem. The response of the filter providers seems to be "Bummer".
The other problem is that different communities disagree on the relative cost of the different kinds of errors. (In AI and stat terms, that's the loss matrix.)
When the classifier outputs "porn, 60% likelihood", do you block it or not? Pro-speech groups would say no, and pro-censorship groups would say yes.
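Here is a minimal sketch of how a loss matrix settles the 60% question. It assumes the classifier's output can be treated as a calibrated probability and that correct decisions cost nothing; the cost values for the two camps are invented for illustration.

```python
def should_block(p_porn, cost_false_positive, cost_false_negative):
    """Block when the expected loss of blocking is lower than that of allowing.

    cost_false_positive: cost of blocking a legitimate page
    cost_false_negative: cost of letting an offensive page through
    Correct decisions are assumed to cost nothing.
    """
    loss_if_block = (1.0 - p_porn) * cost_false_positive
    loss_if_allow = p_porn * cost_false_negative
    return loss_if_block < loss_if_allow

p = 0.60  # the classifier says "porn, 60% likelihood"

# Hypothetical loss matrices: each camp weighs one kind of error ten times heavier.
pro_speech = should_block(p, cost_false_positive=10.0, cost_false_negative=1.0)
pro_censorship = should_block(p, cost_false_positive=1.0, cost_false_negative=10.0)

print(pro_speech)       # False: wrongly blocking is the expensive error
print(pro_censorship)   # True: letting it through is the expensive error
```

Rearranging the comparison gives a cutoff of cost_false_positive / (cost_false_positive + cost_false_negative), so the two camps are running the same classifier with cutoffs of about 0.91 and 0.09.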
Cutting edge graphics, and killer AI always show up in the gaming
industry before anywhere else.
Hogwash. You just don't see it because you're not in the right place.
I don't know about graphics, but good and practical AI techniques have
been flourishing in logistics and data analysis, in manufacturing and
distribution, and dozens of military applications, plus fraud
detection, consumer incentive modeling, financial forecasting, and
dozens of other areas that consumers don't directly care about.
According to their site, RDF has been around since 1997. Why did it take six years to work out the details?
I've been running Familiar, which uses X11, on my iPAQ for over two years. It's fun to push windows to it over 802.11b ethernet.
The story mentions that EPIC was the one to file suit against DoubleClick. Now would be a good time to send them some money.
Is it compatible with the Odyssey? Can I play pong on it?
echo "127.0.0.1 slashdot.org www.slashdot.org" | sudo tee -a /etc/hosts
Wasn't this the whole point of Mozilla's XUL?
It's all documented.
AAAI has (or points to) lots of good introductory material on decision trees, and machine learning in general.
You left out Waterfall, my spectrogram viz plugin.