Artificial Intelligence Overview
spiderfarmer writes: "Well, it feels slightly odd to suggest one of my own articles, but here goes. I've recently completed a brief overview of the current state of AI. The article concept was focused on Cyc, but scope creep being what it is, I ended up doing an overview of the entire field. Some of the Slashdot gang were fairly helpful in pointing me towards experts who would talk to me and towards white papers and books I might not have otherwise found. So, I thought they might be interested in how I put all the information together."
Try A.L.I.C.E. (that's http://www.alicebot.org/ for the goatsecx paranoid). It's one of the better bots and has won awards and stuff. Sure, it isn't perfect, but it's a neat toy to play with.
Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
With an academic background in moral philosophy and today working as a developer of data mining systems, I can empathise with the author's frustration at trying to understand how (if at all) morality and AI intersect.
The main problem really is that the term 'AI' is applied to any algorithm for classification, prediction, or optimisation which operates using anything beyond a simple set of heuristics. Such algorithms seem magical to the lay-person, resulting in the over-enthusiastic application of the 'intelligence' moniker.
Summary
'AI' is a term used inappropriately for a range of algorithms that attempt to learn without having to specify an exact set of rules for every case. Although these algorithms are currently incapable of displaying real intelligence, it is possible that one day they may. This point is, however, debatable, and the interested reader should read for themselves the differing points of view of experts in the field, including Daniel Dennett, Roger Penrose, Steven Pinker, Richard Dawkins, and Douglas Hofstadter. If these algorithms do ever get to the point that they can act intelligently and flexibly, it will be important that they are trained with appropriate moral premises to ensure that their actions are appropriate in our society.
To understand these so-called 'AI' tools it is useful to develop a little structure...
Output
AI tools are used for classification, prediction, or optimisation. Classification works by showing a computer a set of cases which have a number of properties (sex, age, smoker status, presence of cancer...), and 'training' the algorithm to understand the patterns of how properties tend to occur together. Prediction can then be used to show the algorithm new cases in which one or more of the properties are blank--the algorithm can use its classification training to guess the most likely values of the missing properties. For instance, given sex, age, and smoker status, guess the probability of presence of cancer. Optimisation is a generalisation of classification--rather than training to minimise classification error, train to maximise or minimise the value of any modelled outcome. For instance, whereas an insurer could use classification algorithms to find the likelihood of someone dying by age x, an optimisation approach could be trained to find the price at which modelled profitability of an applicant is maximised.
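To make those three outputs concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the eight cases are made up, 'training' is just counting how often cancer turns up for each pattern, and the names train, predict, and best_price, the claim cost, and the acceptance curve are all hypothetical. Real tools use far richer models than a frequency table.

```python
from collections import defaultdict

# Hypothetical training cases: (sex, smoker, has_cancer). Made-up data.
CASES = [
    ("m", True, True), ("m", True, False), ("m", True, True),
    ("m", False, False), ("f", True, True), ("f", False, False),
    ("f", False, False), ("f", False, True),
]

def train(cases):
    """'Training': count how often cancer occurs for each (sex, smoker) pattern."""
    counts = defaultdict(lambda: [0, 0])  # pattern -> [cancer cases, all cases]
    for sex, smoker, cancer in cases:
        counts[(sex, smoker)][0] += int(cancer)
        counts[(sex, smoker)][1] += 1
    return dict(counts)

def predict(model, sex, smoker):
    """Prediction: guess the probability of the blank property (cancer)."""
    hits, total = model[(sex, smoker)]
    return hits / total

def best_price(prices, p_claim, claim_cost, p_accept):
    """Optimisation: choose the price with the highest modelled profit."""
    def profit(price):
        # Chance the applicant accepts the price, times the expected margin.
        return p_accept(price) * (price - p_claim * claim_cost)
    return max(prices, key=profit)

model = train(CASES)
p = predict(model, "m", True)  # 2 of the 3 male smokers in the toy data
# Assumed acceptance curve: fewer applicants accept as the price rises.
price = best_price(range(0, 2001, 100), p, 1000, lambda x: max(0.0, 1 - x / 2000))
```

Note that the optimiser reuses the classifier's probability as an input, which is one way to read "generalisation": the objective changes from classification error to modelled profit.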
Functional form
AI tools create a mathematical function from their training. For instance, for a classification algorithm this function returns the probability of a particular category for a particular case. The form of this function is an important factor in classifying AI tools. The most popular forms are 'neural networks' and 'decision trees'. Neural networks are interesting because certain types (networks with two hidden layers) can approximate any given multi-dimensional surface. Decision trees are interesting because, given a large enough tree, any surface can be approximated, and in addition a tree can be easily understood by a human, which is very useful in many applications. Other functional forms include linear (as used in linear regression, which many will remember from school) and rule-based (as used in expert systems, and similar to a decision tree). One interesting functional form is the network of networks, which combines multiple neural networks, feeding the output of one into the input of others. This form allows the training of network modules that learn to recognise specific features, which is closer to how our brains work than the single network approach.
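As a rough illustration of what 'functional form' means, here are three forms written as plain Python functions that each map a case (age, smoker status) to a score. All the weights, thresholds, and probabilities are invented for the example; in practice they are exactly the parameters that training has to find.

```python
import math

def linear(age, smoker, w=(0.01, 0.3), b=-0.2):
    """Linear form: a weighted sum of the inputs (not clipped to [0, 1])."""
    return w[0] * age + w[1] * smoker + b

def tree(age, smoker):
    """Decision tree form: nested rules a human can read directly."""
    if smoker:
        return 0.6 if age >= 50 else 0.3
    return 0.2 if age >= 50 else 0.05

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neural_net(age, smoker, hidden_w, hidden_b, out_w, out_b):
    """Neural network form with a single small hidden layer."""
    inputs = (age / 100.0, float(smoker))
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
              for ws, b in zip(hidden_w, hidden_b)]
    return sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + out_b)

# Arbitrary example weights for a network with two hidden units.
score = neural_net(60, True, [[1.0, 2.0], [-1.0, 0.5]], [0.0, 0.1], [1.5, -0.7], -0.3)
```

The tree can be read off line by line, while the network's behaviour is buried in its weights, which is the interpretability trade-off mentioned above.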
The most flexible functional form is that used by practitioners of genetic programming (which also defines a specific training function). Genetic programming creates a function which is any arbitrary piece of computer code. The code is often Lisp, although lower level outputs such as assembly language and even FPGA configurations have been used successfully.
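A small sketch of the 'function as code' idea, in Python rather than Lisp. It shows only the representation: a program is a nested tuple that mirrors a Lisp expression, and random_expr grows a random candidate. The evolutionary search itself is left out, and the three-operator set and all names here are assumptions for illustration.

```python
import operator
import random

# Assumed operator set for the toy language.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(expr, x):
    """Evaluate a Lisp-style expression tree such as ("+", "x", 1)."""
    if expr == "x":
        return x
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    return OPS[op](evaluate(left, x), evaluate(right, x))

def random_expr(rng, depth=2):
    """Grow a random candidate program; GP would evolve a population of these."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(["x", rng.randint(-3, 3)])
    return (rng.choice(sorted(OPS)),
            random_expr(rng, depth - 1),
            random_expr(rng, depth - 1))

program = ("+", ("*", "x", "x"), 1)  # same shape as the Lisp (+ (* x x) 1)
value = evaluate(program, 3)
candidate = random_expr(random.Random(0))
```

Because programs are just trees, combining two of them can be done by swapping subtrees, which is part of why tree-shaped languages like Lisp are a natural fit.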
Training function
The training algorithm looks at the past cases and tries to find the parameters of the functional form that meet the classification or optimisation objective. This is where the real smarts come in. One naive approach is to try lots of randomly chosen parameters and pick the best. Genetic algorithms are a variant of this approach that pick a bunch of random sets of parameters, find the best sets and combine features from them, introduce a bit of additional randomness, and repeat until a good answer is found. Local/global search works by picking one set of parameters and varying each parameter a tiny bit to see whether the result improves or gets worse. By doing this it locates a 'good direction' which it uses to find a new candidate set of parameters, and repeats the process from there. Hybrid algorithms are currently popular since they combine the flexibility of genetic algorithms with the speed of local search. Most neural networks today are trained with local search, although more recent research has examined more robust approaches such as genetic algorithms, Bayesian learning, and various hybrids.
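The three training approaches can be sketched side by side in Python. The objective here is a made-up two-parameter error function whose best answer is known to be (3, -1), so it is easy to see whether each method gets close. Population size, mutation size, step size, and learning rate are arbitrary assumptions, and the local search shown is one simple variant (nudge each parameter, estimate the slope, step downhill).

```python
import random

def error(params):
    """Toy objective to minimise; the best parameters are (3, -1)."""
    a, b = params
    return (a - 3) ** 2 + (b + 1) ** 2

def random_search(rng, tries=200):
    """Naive approach: try many random parameter sets and keep the best."""
    candidates = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(tries)]
    return min(candidates, key=error)

def local_search(start, step=1e-4, rate=0.1, iterations=200):
    """Nudge each parameter a tiny bit to find a 'good direction', then move."""
    params = list(start)
    for _ in range(iterations):
        base = error(params)
        slopes = []
        for i in range(len(params)):
            nudged = list(params)
            nudged[i] += step
            slopes.append((error(nudged) - base) / step)
        # Step against the slope, i.e. in the direction that reduces the error.
        params = [p - rate * s for p, s in zip(params, slopes)]
    return tuple(params)

def genetic_algorithm(rng, pop_size=30, generations=60, mutation=0.3):
    """Keep the best sets, combine their features, add randomness, repeat."""
    pop = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=error)
        parents = pop[: pop_size // 3]  # the best third survive unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            # Each parameter comes from one parent, plus a little noise.
            child = tuple(rng.choice(pair) + rng.gauss(0, mutation)
                          for pair in zip(mum, dad))
            children.append(child)
        pop = parents + children
    return min(pop, key=error)

found_random = random_search(random.Random(0))
found_local = local_search((0.0, 0.0))
found_ga = genetic_algorithm(random.Random(0))
```

On a smooth bowl like this one local search wins easily. The case for genetic algorithms and hybrids rests on bumpier error surfaces, where following the local slope can get stuck in a poor valley.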
Learning type
Supervised learning approaches take a set of cases for training and are told "here is the property we will be trying to predict/optimise, and here is its value in previously observed cases". The algorithm then uses this context, which the analyst provides, to find a set of parameters for the functional form. Unsupervised learning, on the other hand, does not specify prediction of any particular property as being the training goal. Instead the algorithm looks for 'interesting' patterns, where 'interesting' is defined by the researcher. For instance, cluster analysis is an unsupervised learning approach that groups cases that are similar across all properties, normally using simple measurements of Euclidean distance (that's just a fancy term for how far away something is when you've got more than one dimension).
Contextual learning is a far more interactive approach where the analyst interacts with an algorithm during training, constantly providing information about what patterns are interesting and where the algorithm should investigate next. Systems like Cyc use contextual learning to try to capture the rich understanding of context that humans can feed in.
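For the cluster analysis example, here is a minimal Python sketch of Euclidean distance and one well-known clustering method, k-means. The choice of k-means is my assumption (the text above does not name a specific algorithm), and the six cases and starting centres are made up.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two cases with any number of properties."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def k_means(points, centres, rounds=10):
    """Group each point with its nearest centre, then move centres to the mean."""
    for _ in range(rounds):
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda i: euclidean(p, centres[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its old centre.
        centres = [
            tuple(sum(vals) / len(vals) for vals in zip(*cluster)) if cluster else centre
            for cluster, centre in zip(clusters, centres)
        ]
    return clusters, centres

# Made-up cases: (age, cigarettes per day)
points = [(25, 0), (30, 2), (28, 1), (60, 20), (65, 25), (62, 22)]
clusters, centres = k_means(points, centres=[points[0], points[3]])
```

Nobody tells the algorithm which property matters or what the groups should be; the younger light smokers and older heavy smokers separate purely because they sit close together. In real data the properties would normally be rescaled first so that no single one dominates the distance.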
AI and moral philosophy
We are still a long way from seeing an algorithm that can interact in a flexible enough way that we could mistake it for human in a completely general setting (the Turing Test for intelligence). However, given the ability of flexible training functions such as genetic algorithms, we may find that one day an algorithm is given enough inputs, processing power, and flexibility of functional form that it passes this test. The 'morals' that it shows will depend entirely on the inputs provided during training. This is not like humans, who have some generally consistent set of moral rules encoded through evolutionary outcomes (for instance, a tendency to care for the young and related). Our moral premises are the underlying 'givens' that form the foundation of what we consider 'right' and 'wrong'. Ensuring that an AI algorithm does not act in ways we consider inappropriate relies on our ability to include these moral premises in the input that we train it with. This is why Lenat talks about teaching Cyc that killing is worse than lying--this is potentially a moral premise. Finding the underlying shared moral premises of a society is a complex task, since for any given premise you can say 'why?' But by repeatedly asking 'why?' you eventually get to a point where the answer is 'just because'--this is the point at which you have found a basic premise.
I hope that some of you find this useful. Feel free to email if you're interested in knowing more. I currently work in applying these types of techniques to helping insurers set prices and financial institutions make credit and marketing decisions.
Jeremy Howard
The Optimal Decisions Group
I would recommend re-thinking your division of AI into subfields. You are indiscriminately mixing technologies and application areas.
For example, neural networks are a technology and NLP is an application area. I know people working in NLP that use Lisp, and I know others that use neural networks. In AI, technologies and application areas are (mostly) orthogonal.
Granted, there probably isn't a perfect breakdown of AI into subfields, but making the distinction above will help you and your readers get a grip on what AI is all about faster.
Sheesh, evil *and* a jerk. -- Jade