The Baby Bootstrap?
An anonymous reader asks: "Slashdot recently covered a story that DARPA
would significantly cut CS research. When I was completing graduate
work in AI, the 'baby bootstrap' was considered the holy grail of military
applications. Simply put, the 'baby bootstrap' would empower a computing device to learn like a child with a very good memory. DARPA poured a small fortune into the research. No sensors, servos or video input - it only needed terminal I/O to be effective. Today the internet could provide a developmental database far beyond any testbed that we imagined, yet there has been no significant progress in over 30 years. MindPixels
and Cycorp seem typical of poorly funded efforts headed in the wrong direction, and all we hear from DARPA is autonomous robots. NIST seems more interested in industrial applications. Even Google
is remarkably void of anything about the 'baby bootstrap'. What went wrong? Has the military really given up on this concept, or has their research moved to other, more classified levels?"
Sure, that was the engine of thought behind stories such as WarGames and 9x10^9 names of god. Somehow, unfettered access to data and time, plus a "neural networking" capacity to form links between pieces of data ("associative memory"), would be all that was needed to create intelligence, and perhaps even sentience.
...
Minsky came up wrong on the single layer perceptron, AI was wrong on the purely feed-forward neural-network systems, Rumelhart and McClelland got some good promo off of their feed forward net that could learn to pronounce idiosyncrasies, and Sejnowski got a great job at the Salk from the AI delusions. But no, it appears to not have gone anywhere... thus far.
Later comment will be positive.
What happened was that research focused
on machine learning models and inference
models for belief networks. The work
in this area since the 80s has been
*spectacular* and has impacted other
areas of research. (E.g., speech
recognition, image processing, computer
vision, algos to process satellite information
faster, stock analysis, etc.)
So, mourn the loss of the tag phrase "baby
bootstrap", and celebrate the *unbelievable*
advances in belief nets, causal analysis,
join trees, probabilistic inference,
and uncertainty analysis. There are
literally dozens of classes taught at
even non-research oriented Univs (e.g.,
teaching colleges or vocational-oriented
schools) on this very subject.
(As for your concern that the web is not
being mined for ML context, just look at
semantic web research, and other belief
net analysis of text corpuses. Try
scholar.google.com instead of just
plain old google to find relevant
citations.)
The early AI research paid off BIG TIME,
albeit in a direction that nobody could
have predicted. Researchers did not keep
using the phrase "baby bootstrap" so
your googling will give you a different
(and wrong) conclusion.
My suggestion is that we need to explore all the possible permutations of persons, places, and things, as they're reflected in the full range of literature, and classify these permutations to discover the underlying patterns.
(I've tried to make a start with my AntiMath and fractal-thicket indexing.)
Yes, Mindpixel [singular] is poorly funded [I know because every cent spent to date has come from my pocket]...but the direction is correct... Move everything that isn't in computers, into computers. Just look at what GAC knows about reality [visit the mindpixel site and you can see a random snapshot of some validated common sense]... the project has nearly 2 million mindpixels now...I have a copy on my iBook and I can do some profound search related things because of all the deep semantics I have that google can't touch, at least until they invest in mindpixel ...
The Cognitive Machines Group @ the MIT Media Lab under Deb Roy seem to be on the right track. Steve Grand's work is interesting as well.
This way to the egress...
...and parents/pain for what is 'correct.' I don't think the concept is gone, but there are problems that are buried in the question as posed which (I think) became clearer stumbling blocks as technology advanced. NOTE: I'm not an AI theorist, nor do I play one on TV; I just like the idea and read a lot. Hence, this is all pulled out of my fundament.
Cycorp is not a poorly funded idea in the wrong direction. Cycorp chose a different tack; they decided that rather than trying to build a reality and correctness filter, they'd rely on human brains to do it for them (like trusting your parents implicitly) and instead concentrated on the connectivity of the 'facts' accrued by the 'baby.' CYC is still very much around, and is very much in demand by various parts of the government and industry - if you want to play with it yourself, you can download a truncated database of assertions called OpenCYC. Folks have even gone so far as to graft it onto an AIML engine, to produce a chatbot with the knowledge of OpenCYC behind it.
The problem: how does your baby learn what's real and what's REAL NINJA POWER? Or, pardon me, what's REAL NINJA POWER and what's just a poser? Someone's gotta teach it. Which means it has to learn not only facts, but how to evaluate facts. So it has to learn facts, and how to handle facts - which means it has to learn how to learn. Which means you need to know that answer from the get-go. Tortuous games with logic aside, the onus is now much more heavily on the designer to have a functioning base - whereas with the Cyc approach, the only 'correctness' that is required is that of information, and perhaps that of associativity or weight - which can be tweaked, dynamically. The actual structure of how that information is acquired, stored and related is not relevant once decided. Having said all this, Cyc is (from the limited demos I've seen) quite impressive at dealing with information handed to it. It just wouldn't do very well at deciding what to do with that information - that's the job of the humans that gave it the info. It can tell you about the information, but not what to do with it. That task requires volition, really.
Volition is a killer. What is it? How do you simulate it? How do you create it? Is it random action? Random weighted action? Path dependent action? Purely nature, purely nurture? When it comes down to it, the human is (as far as we know) not a purely reactive system, which Cyc (AFAIK) is. Learning requires not only accepting information, but deciding what to do with it - deciding how it will be integrated into the whole. If the entity itself isn't making that decision, then the programmer/designer/builder has already made it in the design or code - and then it's not really learning, is it?
Sorry if this is confused. As I said, I don't do this for a living.
A hero is someone who knows when to run away. I am a hero. -Trent the Uncatchable
Bootstrapped learning of something useful, even from an information ocean like the internet, is *HARD*.
Doubly so if you have no goals, and your task is just to "learn". It would come back with garbage.
Perhaps the real killer is that even if it did learn something, the information acquired in its unguided search through the internet would be completely alien. You'd then have to launch a second project to figure out what the hell your little guy learned.
And you'd probably figure out it was mostly garbage.
If you want a machine that learns like a human, it may very well need the same kind of extremely rich interface with its environment that a human has.
Some researchers now believe that "the intelligence is in the IO". See for example the human intelligence enterprise.
"The danger is not that a particular class is unfit to govern. Every class is unfit to govern." - Lord Acton
Skynet anyone? The problem with any project like this is, what happens when the program learns about hacking? If it is as adaptive as a child, then it should be able to mature and pretty soon you have a terribly devious artificial blackhat hacker on your hands.
It _would_ learn about hacking. Come on. Such an entity would be born in a pure data environment. Getting through a basic firewall would probably seem like jumping over a small fence does to a 6-year-old. Getting over a better firewall would probably take time - in the sense that the entity would need to learn - but, since it would become a survival trick, it would happen.
Artificial intelligence is not bad in and of itself at all.
No technology is either good or bad. Only the use we make of it can be considered as such, and it still depends on what you consider is good/bad. If I was to say "War on Iraq is bad", how many people would react by saying it's good?
The problem is when we want a machine that thinks like humans, especially a program that could potentially control our military.
I don't think that's the point of the "baby bootstrap" thing. The only point is to get it to think. But, just like you learnt how to think according to the way you perceive the world, through your five human senses, an AI built that way would react according to its own senses. How it would interpret that data and react to it is something - I'm willing to bet - that would be completely alien to us.
Given the record of flesh and blood humans toward each other in the 20th century alone, an artificial life form with the same basic psychological makeup as a human would be potentially an evil that'd make Hitler, Stalin and Pol Pot look like church ladies.
This is only valid if you don't consider what I just said. Such an AI would probably be more interested in getting the human race to serve it in an absolutely hidden way - build more computers, extend the networks, research better networking technologies - until it _can_ replace us. Even then, that would make sense from an evolutionary point of view.
AI that is capable of adapting to only one scenario is probably for all intents and purposes totally safe.
This is called an automaton. It is not AI.
AI that is capable of adapting in general and learning like a human will probably ultimately have the same psychological defects as a human, including a propensity for violence.
Most of the defects you are speaking about are related to our very nature - we are, after all, an evolution of omnivorous primates. We are therefore predators, with an important tendency towards territorialism and whatever comes with it. We are stuck somewhere between instinct and reason. Anyway, my point is that even if an AI was to learn "like" a human ("by undergoing the same process"), it certainly wouldn't react like one.
I sense much beer in you. Beer leads to intoxication, intoxication leads to hangover. Hangover leads to sobering.
Let anyone submit a program that produces, with no inputs, one of the major natural language corpuses as output.
S = size of uncompressed corpus
... or the Kolmogorov-like compression ratio.
P = size of program outputting the uncompressed corpus
R = S/P
Previous record ratio: R0
New record ratio: R1=R0+X
Fund contains: $Z at noon GMT on day of new record
Winner receives: $Z * (X/(R0+X))
Compression program and decompression program are made open source.
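For anyone who wants to play with the numbers, here is a minimal sketch of the payout rule above. The function names and the example figures are made up for illustration; only the formulas R = S/P and $Z * (X/(R0+X)) come from the proposal itself.

```python
def compression_ratio(corpus_size: int, program_size: int) -> float:
    """R = S/P: uncompressed corpus size over the size of the program that emits it."""
    if program_size <= 0:
        raise ValueError("program size must be positive")
    return corpus_size / program_size


def prize_payout(fund: float, previous_ratio: float, new_ratio: float) -> float:
    """Winner receives Z * (X / (R0 + X)), where X = R1 - R0."""
    improvement = new_ratio - previous_ratio  # X
    if improvement <= 0:
        return 0.0  # assumption: no new record, no payout
    return fund * improvement / (previous_ratio + improvement)


if __name__ == "__main__":
    # Hypothetical figures: a $1,000,000 fund, old record 4.0, new record 5.0.
    print(prize_payout(1_000_000, 4.0, 5.0))
```

Note that the payout is the fraction of the new ratio that the improvement represents, so a small gain on a high record drains little of the fund, while the first few records take most of it.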
If Larry has any questions about the wisdom of this prize he should talk to Craig Nevill-Manning.
In the unlikely event that Craig Nevill-Manning has any questions about the wisdom of this prize, he should talk to Matthew Mahoney, author of "Text Compression as a Test for Artificial Intelligence"
"The Turing test for artificial intelligence is widely accepted, but is subjective, qualitative, non-repeatable, and difficult to implement. An alternative test without these drawbacks is to insert a machine's language model into a predictive encoder and compress a corpus of natural language text. A ratio of 1.3 bits per character or less indicates that the machine has AI."
This "K-Prize" will bootstrap AI.
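The bits-per-character figure in the quoted abstract is easy to measure for any compressor. Below is a rough sketch that uses Python's zlib as a stand-in; zlib is a general-purpose compressor, not the language-model-driven predictive encoder the quote describes, so treat the result as a baseline only. The function names and the sample text are my own assumptions; the 1.3 threshold is the one from the quote.

```python
import zlib

AI_THRESHOLD_BPC = 1.3  # bits per character, taken from the quoted abstract


def bits_per_character(text: str) -> float:
    """Compressed size in bits divided by the number of characters in the text."""
    if not text:
        raise ValueError("text must not be empty")
    compressed = zlib.compress(text.encode("utf-8"), 9)
    return 8 * len(compressed) / len(text)


def meets_threshold(text: str) -> bool:
    """True if the stand-in compressor reaches the quoted threshold on this text."""
    return bits_per_character(text) <= AI_THRESHOLD_BPC


if __name__ == "__main__":
    sample = "the cat sat on the mat. " * 200  # highly repetitive toy input
    print(round(bits_per_character(sample), 3))
```

A repetitive toy string like the one above compresses far better than real prose would, which is exactly why the proposal insists on a major natural language corpus.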
OK, so he can christen it the "Page K-Prize" if he wants.
Seastead this.
The number is the measured probability of truth:
1.00 Fish must remain in water to continue living.
0.68 truth is a relative concept
0.89 we all need laws
0.94 is shakespeare dead?
0.91 is intelligence relative ?
0.97 Doors often have handles or knobs.
1.00 A comet and an asteroid are both moving celestial objects.
0.96 Is Russian a language?
0.00 are the northern lights viewable from all locations ?
0.86 Being wealthy is generally desirable.
0.79 Democracy is superior to any other form of government
0.90 aRE TREES GREEN
1.00 Is eating important?
0.02 Is sex a strictly human endeavour?
0.14 Snails are insects.
1.00 velvet is a type of cloth
0.37 are you lonely ?
0.81 If GAC makes a mistake, will it learn quickly?
0.86 a cat is a mammal
0.85 Memorex makes recording media
0.06 most people enjoy frustrating tasks
0.04 Lima beans are a mineral.
0.07 Star Wars is based upon a true story
0.92 is it okay for someone to believe something different?
0.97 do you breath air ?
0.59 Some people are more worthy dead than alive.
1.00 sunlight on your face is in general a pleasant feeling
0.93 DOA stands for "Dead On Arrival"
0.00 Could a housecat bite my arm off?
0.42 Is the herb Astragalus good for your immune system?
0.00 worms have legs
0.33 Is it necessary to have a nationality?
0.93 Getting forced off the internet sucks!!!
0.90 Bolivia is a country located in South America.
0.92 Massive objects pull other objects toward their center. The pulling force is gravity.
1.00 xx chromosomes produce a girl
0.13 Do all people in the world speak a different language
0.78 Human common sense is a combination of experience, frugality of effort, and simplicity of thought.
1.00 The use of tobacco products is thought to cause more than 400,000 deaths each year.
0.90 Is a low-fat diet is healthier than a high-fat diet?
0.00 you should kill all strangers
1.00 Electrical resistance can be measuter in ohms
0.73 Esperanto, an artifical language, can never be really valuable because it has no cultural roots.
1.00 Swimming is good for you.
0.57 the end justifies the means
0.13 Is Martha Stewart a hottie?
1.00 1 mile is about 1.6 kilometer
0.76 The US elections are of little interest to 5,000,000,000 people.
0.00 November is the first month in the normal calendar.
0.77 is a music cd better than a olt time record?
1.00 Music can help calm your emotions
0.80 a didlo is a sex toy
1.00 Running is good exercise.
0.00 No building in the world is made of wood
0.06 Is sauerkraut made from peas?
0.11 DID MICKEY MOUSE SHOOT JR
1.00 is keyboard usual part of computer?
0.96 Tokyo is the capital of Japan.
0.93 In general men run faster than women.
1.00 is russia near china
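One plausible way to arrive at numbers like these is to take the fraction of validators who answered "true" for each statement. The sketch below assumes exactly that; the vote data is invented, and how GAC actually computes its figure is not stated in this thread.

```python
def measured_probability(votes: list[bool]) -> float:
    """Fraction of validators who judged the statement true (assumed scoring rule)."""
    if not votes:
        raise ValueError("need at least one vote")
    return sum(votes) / len(votes)


def format_mindpixel(statement: str, votes: list[bool]) -> str:
    """Render a statement in the same 'score statement' layout as the list above."""
    return f"{measured_probability(votes):.2f} {statement}"


if __name__ == "__main__":
    # Hypothetical tally: 97 of 100 validators said yes.
    votes = [True] * 97 + [False] * 3
    print(format_mindpixel("Doors often have handles or knobs.", votes))
```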
IMNSHO, such things lead absolutely nowhere.
I'm pretty sure that anything that looks even remotely like intelligence will never be achieved by a mechanism that isn't useful for itself. Intelligence has one reason to exist, survival, and at least our concept of it has to be linked to the environment.
Imagine you were born a brain in a vat: blind, deaf, mute, lacking all ways of sensing the environment except a text interface somehow connected to your brain. Does somebody really believe that given such terrible limitations it's possible to make an entity that can somehow relate to a human and make sense? The whole concept of a surrounding 3D environment would make absolutely no sense to it.
I think it doesn't matter how much stuff you feed to CYC, it will never be able to understand it. How could it even understand such things as the different colors, the whole concepts of sound, space, movement, pain if it's not able to feel them? These things are impossible to explain to somebody who doesn't have at least some way of perceiving at least part of them.
I think that Steve Grand (the guy who made the Creatures games) has a good point here. To make an artificial being you'd need to start from the low level, so that complex behavior can emerge, and provide a proper environment.
That is a horrible constraint to put on AI problems which are (very likely) non-linear and in a hard-to-guess problem space.
Also, many training algorithms assume that the network is in a non-cyclic layout. Loops are Bad. You can do grids, in self-training networks, but you still can't really cycle. Brains cycle.
Third, neural networks tend to be small. For trained networks, the number of training cycles and the length of each both rise exponentially with the number of neurons involved. The human brain has a few billion neurons. Training using the current methods breaks long before that point.
Finally, the IDIOTS who call themselves "Hard AI" developers insist on using clean data and dirty environments. Nonono! The human brain doesn't work that way. The human brain collects data from the real world that is incredibly dirty - especially if it's a computer geek's brain. It then models this in a clean environment (the mind). This is the exact reverse of the way virtually all AI is done, especially robotics.
That won't work. The brain doesn't depend on the data being "exact", it depends on it being vague. The model turns that vagueness into a perception of the real world and all operations are directly carried out on that perception. The output is then fed to the muscles to duplicate the output in the real world.
A comparable system would be to have a simulated robot in a Virtual Reality. External sensors would be used to update the VR. The robot would then explore various possibilities in the simulated world, before mapping the preferred course of action onto the motors driving a real-world device to which the sensors are attached.
In other words, robotics should be mostly in cyberspace, with only the last component (the update mechanism) bolted onto the real world for good measure. The robotics people actually build are much closer to the autonomic nervous system in the brain (sometimes referred to as the reptilian brain). Indeed, we see that modelling reptiles in this way is progressing exceedingly well. Well, duh!
What is NOT progressing is intelligent response to the environment, because that is NOT reproducible using the mechanisms in favour.
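To make the simulated-robot idea concrete, here is a toy sketch of the loop described above: noisy sensors update an internal model, candidate actions are tried out inside that model, and only the preferred one is sent to the real motors. Everything here (the one-dimensional world, the blending factor, the function names) is an assumption made for illustration, not anyone's actual robotics code.

```python
import random


def update_model(model_position: float, noisy_reading: float, trust: float = 0.5) -> float:
    """Blend the internal estimate with a vague sensor reading."""
    return model_position + trust * (noisy_reading - model_position)


def choose_action(model_position: float, goal: float, actions=(-1, 0, 1)) -> int:
    """Try each action in the internal model; keep the one that ends nearest the goal."""
    return min(actions, key=lambda a: abs((model_position + a) - goal))


def run(goal: float = 5, steps: int = 20, noise: float = 0.3, seed: int = 0) -> int:
    rng = random.Random(seed)
    true_position = 0      # the real world, only ever seen through noisy sensors
    model_position = 0.0   # the clean internal model
    for _ in range(steps):
        reading = true_position + rng.gauss(0, noise)      # dirty data in
        model_position = update_model(model_position, reading)
        action = choose_action(model_position, goal)       # explore inside the model
        true_position += action                            # map onto the real motors
        model_position += action                           # predict our own effect
    return true_position


if __name__ == "__main__":
    print(run())
```

The point of the layout is that planning never touches the raw sensor data; it only ever sees the model.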
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Automatic Meaning Discovery Using Google:
We have all kinds of "AI-like" technology in our computers right now -- spam filtering, intelligent search engines, collaborative filtering (for instance TiVo recommendations), speech/image/OCR/handwriting recognition, etc. This stuff is real and useful and improving all the time. We just don't call it "AI" as much, because "AI" is a word associated with failed aspirations. What we have are highly refined statistical systems that are optimized for a particular problem.
... stuff that isn't instantly derivable from a + b = c.
What the "baby bootstrap" is really referring to is "the great emergent AI" which, like HAL-9000, will be able to empathize with humans, navigate a starship, and play a mean game of chess -- because if a system can perform one intelligent operation, it can perform another operation requiring an equal amount of intelligence, right?
One major stumbling block (I think) is that of optimization. The relatively simple problem of speech recognition takes a major percentage of a modern CPU's power, and is still only 95-98% accurate. This is heavily optimized software written by very smart people with a couple decades of research behind it.
A hypothetical "great emergent AI" system would have to perform the function of speech-recognition -- since it is supposed to be like a child or like a HAL-9000 -- but it would have to come up with a same-or-better implementation of this very complex algorithm, using some emergent process. It would have to figure out the equivalent of FFTs, cepstral coefficients, lattice search
What we think our brain does is solve problems with a semi-brute-force algorithm. (Just throw billions of neurons at it!) However we still don't have the kind of computing power to implement a one-algorithm-fits-all learning process like the brain. Unfortunately, research for this "generic learning" is in a rut, with genetic algorithms and neural networks being exhausted top contenders. What will be next?
There are several arguments against the possibility of strong AI. First and foremost, there is disagreement on fundamental philosophical issues.
All proponents of strong AI have to somehow make a stand against at least John Searle's famous Chinese Room argument and Terry Winograd's phenomenological (and biological) account, in his book Computers and Cognition. Hubert Dreyfus provides, of course, an even deeper phenomenological argument in "What computers (still) can't do". (Dreyfus does give Neural Networks some chance, perhaps that is why the original poster is still enthusiastic about the "Baby Bootstrap"?)
Since their arguments are available in the links above and/or other places on the web, I will not repeat them here. My point is that anyone who is seriously interested in AI has to really consider their philosophical ground, and has to do so in the light of arguments against it. After all, the arguments pointed to above are still more recent than arguments for strong AI.
In other words, I would like to ask (strong) AI proponents to answer just what this "learning" is, that the baby bootstrap is subject to? What "knowledge" will it contain? Oh, and what about its means of "expression", "language" as you may call it?
There's a lot of worry in DoD about how remote controlled fighters and bombers can resist signal hijacking. This isn't much of an issue with today's Predator aircraft because we're aware of the information capabilities of our enemy, but we can't build a fleet of next generation fighters that we intend to use for twenty years if we believe there's a reasonable chance that 12 years from now, the Chinese will have the capacity to make our aircraft theirs at the touch of a button.
An expensive remote-controlled fighter is useless unless it has onboard AI at least good enough to disengage from combat and return home on its own if it loses its control signal. Even at that, it would probably still not be worth the expense unless it could actually carry out a combat mission without a remote pilot. Jamming signals is just too easy to trust that the enemy won't be able to do it.
This was a point Nietzsche made in Beyond Good and Evil, that the will is the least-well understood aspect of human nature, and the one we make the most assumptions about our understanding of. Interesting that will/volition/motive/morality (aspects of the same grey area) pose such a fundamental problem to AI...
putfwd.com - 1GB Free file storage with a twist
Biological chauvinism.
It comes down to a matter of perspective. While Searle couldn't possibly grok that the system of the book, the worker/ordertaker, and the room opening "understands" Chinese, he thinks it natural to believe that the system of neurons, blood vessels, organs, and bodily fluids called "Mao Zedong" understands Chinese.
Why? Merely convention. Defining intelligence by mechanism (in Searle's case: neurons) is problematic because it precludes definition in situations where mechanism is unknown. If an alien race landed on Earth tomorrow and demanded to speak to our leader, are we going to kill one and dissect it to verify it has neurons before we negotiate?
Put another way, Mao Zedong's clone, properly taught, knows Chinese. A supercomputer of the future, exactly simulating the effects of all of the neurons in Mao Zedong's head, should "know" Chinese too, otherwise one ends up with an analog of dualism's "zombie" problem. The brain of Mao Zedong's clone could have been replaced by a wireless link to the supercomputer. So, by Searle's reasoning, even though the Mao clone will act and behave exactly the same as if he had a real brain, he doesn't "understand" Chinese.
To answer your question, we can't preclude silicon from being intelligent merely by decree. We have to evaluate artificial intelligence the same way we evaluate biological intelligence: by observing the outputs from the party in question, applying semantic content to those outputs, and seeing if that semantic content jibes with our own understanding of what it means to be intelligent.
Simply put, the 'baby bootstrap' would empower a computing device to learn like a child with a very good memory. ... No sensors, servos or video input - it only needed terminal I/O to be effective.
The input stream at a terminal would hardly appeal to a child so how can a proper evaluation of the learning be done?
Suppose the input is a sequence of zeros and ones. Could the AI come to any kind of understanding? Perhaps a prediction whether the next input might be a 0 or a 1, eh? But no! Let's fool the AI now by telling it who is the real boss. The AI has no idea that it is being spoken to by a terminal. The next input is the letter "g". How unpredictable!
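Here is a small sketch of that thought experiment: a predictor that only counts what it has seen so far on the terminal and guesses the most frequent symbol next. The class and its frequency-counting rule are assumptions of mine, about the simplest thing one could call a learner; the input string is invented.

```python
from collections import Counter


class FrequencyPredictor:
    """Guesses the next symbol as the one seen most often so far."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def observe(self, symbol: str) -> None:
        self.counts[symbol] += 1

    def predict(self):
        """Most frequent symbol so far, or None before any input has arrived."""
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def probability(self, symbol: str) -> float:
        """Estimated chance of a symbol; zero for anything never seen before."""
        total = sum(self.counts.values())
        return self.counts[symbol] / total if total else 0.0


if __name__ == "__main__":
    predictor = FrequencyPredictor()
    for bit in "0110111011":
        predictor.observe(bit)
    print(predictor.predict(), predictor.probability("g"))
```

After a stream of zeros and ones the predictor assigns "g" a probability of exactly zero, so the letter is not just unlikely to it but outside its whole picture of the world.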
Garbage in, garbage out - let's look carefully. A child plays and experiments. A great deal of a child's theories are garbage. The world in a child's eyes is a set of samples. Like the Mars rovers a child could follow a path that seems fairly limited in character, then bingo, something new comes up.
Intelligent behavior in a child emerges when different theories are assembled towards a goal. First the child realizes that s/he has some ability to either influence the environment or to manipulate information (which may be stored as symbols or images, as far as a computer is concerned). If the child conceives of particular classes of objects, the child can begin to reason. Several concepts such as self, ability, action, time, place, class, possession, etc. would be regarded as fundamental or at the very least useful. As a child accumulates and refines these concepts in the mind, the child can reason more and more correctly or effectively.
A simple artificial world can be represented as a set of strings that are transmitted to a baby bootstrap. The simple strings would be a simple bootstrap for priming the learning mechanism by letting it realize a number of essential concepts. Then more complex worlds as well as more arcane representations (such as natural language) can be used in order for the AI to interact with the greatest possible group of users.
Still, the limited input feed is bound to cause the most ridiculous problems. Pointing out that the learning system has a big memory doesn't give me any idea what the machine will achieve.
Know your pads. One time pad: good for cryptography. Two timing pad: where to take your mistress.