Domain: idsia.ch
Stories and comments across the archive that link to idsia.ch.
Comments · 53
-
Re: Sure, whatever
He's most likely not even the first one to suggest these things.
Most notably, the work by Konrad Zuse predates that of Wolfram:
- Rechnender Raum (PDF), original article in German, Elektronische Datenverarbeitung, 8: 336–344, 1967.
- Calculating Space (PDF), English translation of the book version, MIT Technical Translation, 1970 (2012 version (PDF)). -
Re: Sure, whatever
He's most likely not even the first one to suggest these things.
Most notably, the work by Konrad Zuse predates that of Wolfram:
- Rechnender Raum (PDF), original article in German, Elektronische Datenverarbeitung, 8: 336–344, 1967.
- Calculating Space (PDF), English translation of the book version, MIT Technical Translation, 1970 (2012 version (PDF)). -
Re:Average across models
This is true, however the current craziness about deep-learning NN is due to the fact that they are incredibly effective at some computer vision tasks, including very difficult ones, that were thought almost impossible until recently. They beat other classifiers by a large margin. However not long ago SVM were the rage, etc. No doubt in a few years we will have exported the good feature of deep-learning to other methodologies.
-
Re:Average across models
This is true, however the current craziness about deep-learning NN is due to the fact that they are incredibly effective at some computer vision tasks, including very difficult ones, that were thought almost impossible until recently. They beat other classifiers by a large margin. However not long ago SVM were the rage, etc. No doubt in a few years we will have exported the good feature of deep-learning to other methodologies.
-
Re:Toward Egan/Wolfram Territory
The idea that the universe computes comes originally from Konrad Zuse in 1967, who built the first working computer in 1941.
English translation (by MIT) of the Calculating Space book, 1969 (pdf)
-
Re:Toward Egan/Wolfram Territory
The idea that the universe computes comes originally from Konrad Zuse in 1967, who built the first working computer in 1941.
English translation (by MIT) of the Calculating Space book, 1969 (pdf)
-
Calculating Space
The idea that the universe computes comes originally from Konrad Zuse in 1967, who built the first working computer in 1941.
English translation of the Calculating Space book by MIT, 1969 (pdf)
-
Calculating Space
The idea that the universe computes comes originally from Konrad Zuse in 1967, who built the first working computer in 1941.
English translation of the Calculating Space book by MIT, 1969 (pdf)
-
Re:They want strong AIStrong vs weak is a mere philosophical distinction that is completely meaningless as far as anybody knows.
As for not knowing the state of the art, were you aware that, for example, a so-called deep-learning algorithm recently achieved super-human performance in recognizing road signs, despite being a rather general algorithm? Super-human, as in, it had a lower recognizing road signs than people did. It's stunning. And surprisingly, the algorithm(s) doing this are among those that could be most credibly called bio-inspired.
-
Re:They want strong AIStrong vs weak is a mere philosophical distinction that is completely meaningless as far as anybody knows.
As for not knowing the state of the art, were you aware that, for example, a so-called deep-learning algorithm recently achieved super-human performance in recognizing road signs, despite being a rather general algorithm? Super-human, as in, it had a lower recognizing road signs than people did. It's stunning. And surprisingly, the algorithm(s) doing this are among those that could be most credibly called bio-inspired.
-
Re:Not again!
First, the possibility of intelligent machines is glimpse. All our present technology is not able to achieve intelligence. This is mainly because we do not know what that is. Furthermore, to be dangerous they must be equipped with greed and (the illusion of) a free will. It is most unlikely that someone would build that on purpose or by accident. In short, I think it is impossible to built such machine.
A rack of IBM servers can beat the best Jeopardy players on Earth. In a few years the same level of Watson will fit in a 1U. A few years later it will be on your smartphone. But that's just anecdotal evidence of one recent achievement in AI research; the actual threat is from self-improving systems of which Watson is not a member. But nearly all the technology is available now: Goedel machines, if built, would simply try to achieve whatever goal they were programmed for while also searching for proofs that possible modifications to any of their algorithms would improve the speed of achieving the goals while still maintaining the correctness of the algorithms, and if found, implementing those changes. Self-directed, self-improving, goal-seeking software has the potential to undergo a runaway process in which it improves itself faster than humans would be able to improve it, eventually achieving greater effective intelligence (speed and efficiency at achieving goals) than the humans who created it. At that point the software doesn't need free will or greed to be dangerous; it just needs an improperly or carelessly stated goal that if fulfilled will be detrimental to humanity. Goals for intelligent software will be formal logical specifications, not things like "make people happy" or "increase the GDP" because those English phrases don't have formal definitions that an algorithm can use to plan actions to achieve goals. If the formal specification actually was close to "maximize GDP" the algorithm might find that the most efficient way of maximizing GDP was hyperinflation. Or it might simply advise the creation of billions of shell companies that could artificially increase GDP trading worthless services while producing nothing else of value. In general the problems that humans want to solve are hard problems where simple solutions that don't meet a very long list of critical requirements will have detrimental "optimal" solutions if any of the critical requirements are left out of a formal goal.
-
No really big news...
You might want to check the PhD of this guy in 1998 entitled "Ant Colony Optimization and its application to adaptive routing in telecommunication networks".
There are plenty of other ant like heuristics to network routing even older than this. Ant behavior modelization dates as far as 1989 (from J-L. Deneubourg), and routing was the first practical application for the derivative algorithms.
-
No really big news...
You might want to check the PhD of this guy in 1998 entitled "Ant Colony Optimization and its application to adaptive routing in telecommunication networks".
There are plenty of other ant like heuristics to network routing even older than this. Ant behavior modelization dates as far as 1989 (from J-L. Deneubourg), and routing was the first practical application for the derivative algorithms.
-
Re:Bit-Strings and Digital Physics
You might credit John Archibald Wheeler [wikipedia.org], H. Pierre Noyes, Ted Bastin, C.W. Kilmister, and David McGoveran before Stephen Wolfram.
You forget to mention Konrad Zuse, who wrote Calculating Space (PDF) in the late 1960s.
-
Re:Most influential on me...
And if you enjoy von Neumann's book, move on to Konrad Zuse's "Rechnender Raum" (1969), here's a PDF of the English translation.
-
Re:And this is different from OSPF how?
you should read the original papers before saying that the authors play semantic games
;-)
http://www.idsia.ch/~gianni/antnet.htmlthe differences are many, here are some important ones. Full paths are explicitly sampled by control packets (the 'ants') (wich is quite different from locally observing link costs and then flooding them). Sampling full paths also involves some
core issues that do not find any counterpart in ospf: how often ants are generated (i.e., when, how often, do I need to refresh my local routing information?), which destinations should be sampled? (some destinations are more important than others...), what kind of stochastic decision policy should be implemented in the hop by hop decisions? (how to balance exploration vs. exploitation). The routing tables adaptively constructed through ant sampling do not identify one single shortest path, but rather a bundle of paths that can be used to automatically implement stochastic multi-path routing (and have therefore automatic load balancing). Last but not least, OSPF is based on the knowledge of the full topology of the network (at the level the router is), that implies keeping flooding of up-to-date link information throughout the network. On the other hand, AntNet's routing table are just based on the knowledge of the local topology, that is, of the connected neighbors. This also makes the algorithm quite robust for possible ant losses (or 'errors' in the estimate of the quality of the sampled paths). -
Re:Interesting Paper
that was one of the original papers on the topic, I suggest also to read the (a bit later) AntNet papers I wrote with Dorigo. I have a web page on the topic (nothing much, but there are some good references there):
-
Re:Hill Climbing
it's not really a stochastic hill-climbing algorithm. aco is based on repeated solution sampling, where each solution is constructed (by an 'ant') by stochastically selecting solution components one-at-a-time (e.g., in a tsp, by adding cities to the partial solution until a feasible Hamiltonian tour is constructed). The outcome of each constructed solution is then used to update/learn pheromone variables, that in turn bias/direct the stochastic decision policy that is applied to select each individual solution component while the solution is being constructed. The main idea is to implement, through repeated solution construction, a process of collective learning of the pheromone variables, that are the local parameters of the stochastic solution construction policy. An hill-climb approach would start from a full feasible solution (e.g., a complete TSP tour) and then it would iteratively modify this solution in the direction of the best local improvement (according to a defined local neighborhood) until no improvement looks possible anymore. No learning from previous solutions happen here. No solution construction is involved here. The two approaches are in fact totally different but somehow complementary: it's normal practice to combine aco with some local search: after N (N>= 1) iterations of the ant solution construction phase, the pheromone values are used to identify a candidate solution (or multiple candidate solutions), that are taken as starting points for the local search. The local search is then executed, and the final reached point(s) (the local minimum) is(are) then used to update the pheromone values used by the ant search. The process is then iterated. A good overview of how this works can be found in the ACO definition paper from Dorigo and Di Caro (1999), frankly speaking I think it's quite easy to read: Dorigo M., Di Caro G.A., "The Ant Colony Optimization Meta-Heuristic", in Corne D., Dorigo M., Glover F., "New Ideas in Optimization", McGraw-Hill, 1999. http://www.idsia.ch/~gianni/Papers/OptBook.ps.gz A more 'massive' discussions of these topics can be found in my thesis: http://www.idsia.ch/~gianni/Papers/thesis.pdf http://www.idsia.ch/~gianni/my_thesis.html
-
Re:Hill Climbing
it's not really a stochastic hill-climbing algorithm. aco is based on repeated solution sampling, where each solution is constructed (by an 'ant') by stochastically selecting solution components one-at-a-time (e.g., in a tsp, by adding cities to the partial solution until a feasible Hamiltonian tour is constructed). The outcome of each constructed solution is then used to update/learn pheromone variables, that in turn bias/direct the stochastic decision policy that is applied to select each individual solution component while the solution is being constructed. The main idea is to implement, through repeated solution construction, a process of collective learning of the pheromone variables, that are the local parameters of the stochastic solution construction policy. An hill-climb approach would start from a full feasible solution (e.g., a complete TSP tour) and then it would iteratively modify this solution in the direction of the best local improvement (according to a defined local neighborhood) until no improvement looks possible anymore. No learning from previous solutions happen here. No solution construction is involved here. The two approaches are in fact totally different but somehow complementary: it's normal practice to combine aco with some local search: after N (N>= 1) iterations of the ant solution construction phase, the pheromone values are used to identify a candidate solution (or multiple candidate solutions), that are taken as starting points for the local search. The local search is then executed, and the final reached point(s) (the local minimum) is(are) then used to update the pheromone values used by the ant search. The process is then iterated. A good overview of how this works can be found in the ACO definition paper from Dorigo and Di Caro (1999), frankly speaking I think it's quite easy to read: Dorigo M., Di Caro G.A., "The Ant Colony Optimization Meta-Heuristic", in Corne D., Dorigo M., Glover F., "New Ideas in Optimization", McGraw-Hill, 1999. http://www.idsia.ch/~gianni/Papers/OptBook.ps.gz A more 'massive' discussions of these topics can be found in my thesis: http://www.idsia.ch/~gianni/Papers/thesis.pdf http://www.idsia.ch/~gianni/my_thesis.html
-
Re:Hill Climbing
it's not really a stochastic hill-climbing algorithm. aco is based on repeated solution sampling, where each solution is constructed (by an 'ant') by stochastically selecting solution components one-at-a-time (e.g., in a tsp, by adding cities to the partial solution until a feasible Hamiltonian tour is constructed). The outcome of each constructed solution is then used to update/learn pheromone variables, that in turn bias/direct the stochastic decision policy that is applied to select each individual solution component while the solution is being constructed. The main idea is to implement, through repeated solution construction, a process of collective learning of the pheromone variables, that are the local parameters of the stochastic solution construction policy. An hill-climb approach would start from a full feasible solution (e.g., a complete TSP tour) and then it would iteratively modify this solution in the direction of the best local improvement (according to a defined local neighborhood) until no improvement looks possible anymore. No learning from previous solutions happen here. No solution construction is involved here. The two approaches are in fact totally different but somehow complementary: it's normal practice to combine aco with some local search: after N (N>= 1) iterations of the ant solution construction phase, the pheromone values are used to identify a candidate solution (or multiple candidate solutions), that are taken as starting points for the local search. The local search is then executed, and the final reached point(s) (the local minimum) is(are) then used to update the pheromone values used by the ant search. The process is then iterated. A good overview of how this works can be found in the ACO definition paper from Dorigo and Di Caro (1999), frankly speaking I think it's quite easy to read: Dorigo M., Di Caro G.A., "The Ant Colony Optimization Meta-Heuristic", in Corne D., Dorigo M., Glover F., "New Ideas in Optimization", McGraw-Hill, 1999. http://www.idsia.ch/~gianni/Papers/OptBook.ps.gz A more 'massive' discussions of these topics can be found in my thesis: http://www.idsia.ch/~gianni/Papers/thesis.pdf http://www.idsia.ch/~gianni/my_thesis.html
-
Life Is NOT an RPG: Life IS
Yours In Astrokan,
Kilgore Trout -
Show me the runny
Schmidhuber has interesting claims, like about his Goedel machine, an algorithm that makes provably globally optimal self-modifications.
But he never seems to get around to actually writing the code, or even non-vague pseudocode to implement these algorithms to show how they actually work and that they actually work. I guess it's just an "implementation issue". Ah, the chorus of the pure theorist...
-
Konrad Zuse?
Why ascribe the idea's of Konrad Suse to Wolfram?? Calculating space, 1967 (PDF)
-
first stored-program computer ..
-
Re:Wow!
Please tell me ONE major advance in whatever field you work in.
It doesn't really work like that. Very rarely in science is there some major advance that you can specifically point to. Everything we had back then in AI we still have, but it is so much better. The neuroscience side of things is progressing, we are getting better data about the brain. Developing practical applications based on that. Voice recognition is pretty good now, automatic translation is better, computational vision is better, autonomous robots are better. And frankly, it's rather difficult to put theories on artificial intelligence when the computational capability isn't yet there, so things are getting better as computation gets faster.
What would satisfy you as a major advance in the field? The problem with AI techniques is that they very quickly leave the field of AI, or they don't yet have any practical applications. Juergen Schmidhuber's work on Recurrent Neural Nets is very impressive and he's a good name to watch for the future of AI.
-
Re:Fundamental research?
This is pretty theoretical, and arguably quite fundamental: http://www.idsia.ch/~juergen/unilearn.html Disclosure: I work with Juergen Schmidhuber
-
Zuse
You may also like to read Zuse's thesis.
-
Re:Wow...
>Jeez, Can someone please give me the short version explanation about why everyone is bagging on Wolfram?
- Wolfram makes it looks like he invented everything about cellular automata, not giving credit where it is due (especially Konrad Zuse).
- Wolfram works with stuff like Non-Disclosure Agreements on mathematical issues, which is not academic ethos. He forbade the publication of a fundamental mathematical proof by one of his employees.
- Wolfram masturbates far to much in his New Kind of Science. -
Re:Artificial Intelligence?As Matt Mahoney explained it to me when we were brainstorming the prize criteria:
Hutter's* AIXI, http://www.idsia.ch/~marcus/ai/paixi.htm makes another argument for the connection between compression and AI that is more general than the Turing test. He proves that the optimal behavior of an agent (an interactive system that receives a reward signal from an unknown environment) is to guess that the environement is most likely computed by the shortest possible program that is consistent with the behavior observed so far. In other words, the most likely outcome for any experiment is the one with the simplest explanation, where "simplest" means the smallest program that could model what you currently know about the universe.
He gives a formal proof, but it basically says that the only possible distribution of the infinite set of programs (or strings) with nonzero probability is one which favors shorter programs over longer ones. Given any string of length n with probability p > 0, there are an infinite set of strings longer than n, but only a finite number of these can have probability higher than p.
-
Universes and Universal Turing MachinesAn hypothesized (meta)algorithm running our universe has been proposed in "The New AI: General & Sound & Relevant for Physics" by Jürgen Schmidhuber of Dalle Molle Institute for Artificial Intelligence:
"Systematically create and execute all programs for a universal computer, such as a Turing machine or a CA; the first program is run for one instruction every second step on average, the next for one instruction every second of the remaining steps on average, and so on."
This actually computes all universes -- not just ours. It also computes what might be thought of as nested universes, giving rise to the idea promoted by Smolin that some universes might be more prolific than others. Among the consequences of this hypothesis is:"Large scale quantum computation will not work well, essentially because it would require too many exponentially growing computational resources in interfering 'parallel universes'".
Prof. Schmidhuber's post-doc student, Marcus Hutter, of Hutter Prize for Lossless Compression of Human Knowledge fame came up with some of the key breakthroughs in "The New AI" upon which Schmidhuber's hypothesis is based. -
Universes and Universal Turing MachinesAn hypothesized (meta)algorithm running our universe has been proposed in "The New AI: General & Sound & Relevant for Physics" by Jürgen Schmidhuber of Dalle Molle Institute for Artificial Intelligence:
"Systematically create and execute all programs for a universal computer, such as a Turing machine or a CA; the first program is run for one instruction every second step on average, the next for one instruction every second of the remaining steps on average, and so on."
This actually computes all universes -- not just ours. It also computes what might be thought of as nested universes, giving rise to the idea promoted by Smolin that some universes might be more prolific than others. Among the consequences of this hypothesis is:"Large scale quantum computation will not work well, essentially because it would require too many exponentially growing computational resources in interfering 'parallel universes'".
Prof. Schmidhuber's post-doc student, Marcus Hutter, of Hutter Prize for Lossless Compression of Human Knowledge fame came up with some of the key breakthroughs in "The New AI" upon which Schmidhuber's hypothesis is based. -
Universes and Universal Turing MachinesAn hypothesized (meta)algorithm running our universe has been proposed in "The New AI: General & Sound & Relevant for Physics" by Jürgen Schmidhuber of Dalle Molle Institute for Artificial Intelligence:
"Systematically create and execute all programs for a universal computer, such as a Turing machine or a CA; the first program is run for one instruction every second step on average, the next for one instruction every second of the remaining steps on average, and so on."
This actually computes all universes -- not just ours. It also computes what might be thought of as nested universes, giving rise to the idea promoted by Smolin that some universes might be more prolific than others. Among the consequences of this hypothesis is:"Large scale quantum computation will not work well, essentially because it would require too many exponentially growing computational resources in interfering 'parallel universes'".
Prof. Schmidhuber's post-doc student, Marcus Hutter, of Hutter Prize for Lossless Compression of Human Knowledge fame came up with some of the key breakthroughs in "The New AI" upon which Schmidhuber's hypothesis is based. -
Universes and Universal Turing MachinesAn hypothesized (meta)algorithm running our universe has been proposed in "The New AI: General & Sound & Relevant for Physics" by Jürgen Schmidhuber of Dalle Molle Institute for Artificial Intelligence:
"Systematically create and execute all programs for a universal computer, such as a Turing machine or a CA; the first program is run for one instruction every second step on average, the next for one instruction every second of the remaining steps on average, and so on."
This actually computes all universes -- not just ours. It also computes what might be thought of as nested universes, giving rise to the idea promoted by Smolin that some universes might be more prolific than others. Among the consequences of this hypothesis is:"Large scale quantum computation will not work well, essentially because it would require too many exponentially growing computational resources in interfering 'parallel universes'".
Prof. Schmidhuber's post-doc student, Marcus Hutter, of Hutter Prize for Lossless Compression of Human Knowledge fame came up with some of the key breakthroughs in "The New AI" upon which Schmidhuber's hypothesis is based. -
Re:Ohio story
The Wright brothers, who developed and flew the first airplane, were from Ohio.
Developed and flew the first _powered_ airplane.
You mean _powered_and_controllable_ airplane. Ader built a powered airplane before the Wright brothers, but the controls were not up to par. I seem to remember Lilienthal was also on the way to a powered craft, but it also would not have been as controllable as the Wright Brothers' craft. More here, if it helps. -
Wait a minute
Earth without humans... yes, but could it run GNU/Linux?
er...
Sounds funny but some physicists really investigate the possibility of universe being a big computer (and we are the bits, right? and it looks like we are just a bunch of six billion noise bits, so I wouldn't be surprised if a noise filter wipes us out of existence)
-
Wasn't Conrad Zuse first?
Actually, the title 'First Electronic Computer' is not as cut-n-dried as that. There is good evidence that the title should really go to the Z3 from Conrad Zuse. Other that Mauchly/Eckert his system is generally considered to be the best contender for first electronic computer.
http://www.idsia.ch/~juergen/zuse.html -
Fund the C-PrizeThe NSA can get what it wants via a compression prize competition. Compressing a corpus must find the most predictive patterns.
They could fund a prize competition such as the following:
Let anyone submit an open source program that produces, with no inputs, one of the major natural language corpora as output.
S = size of uncompressed corpus
P = size of program outputting the uncompressed corpus
R = S/P (the compression ratio).Award monies in a manner similar to the M-Prize:
Previous record ratio: R0
New record ratio: R1=R0+X
Fund contains: $Z at noon GMT on day of new record
Winner receives: $Z * (X/(R0+X))Compression program and decompression program are made open source.
Explanation For an idea of why the C-Prize can solve the AI problem, if it is solvable, see Matthew Mahoney's comment on it:
Matt Mahoney
Matt Mahoney is the author of Text Compression as a Test for Artificial Intelligence which states:
Jun 17, 7:18 pm show options
Newsgroups: comp.compression
From: "Matt Mahoney"
Date: 17 Jun 2005 20:18:59 -0700
Local: Fri, Jun 17 2005 7:18 pm
Subject: Re: The C-Prize
Hutter's AIXI, http://www.idsia.ch/~marcus/ai/paixi.htm makes another argument for the connection between compression and AI that is more general than the Turing test. He proves that the optimal behavior of an agent (an interactive system that receives a reward signal from an unknown environment) is to guess that the environement is most likely computed by the shortest possible program that is consistent with the behavior observed so far. In other words, the most likely outcome for any experiment is the one with the simplest explanation, where "simplest" means the smallest program that could model what you currently know about the universe.
He gives a formal proof, but it basically says that the only possible distribution of the infinite set of programs (or strings) with nonzero probability is one which favors shorter programs over longer ones. Given any string of length n with probability p > 0, there are an infinite set of strings longer than n, but only a finite number of these can have probability higher than p.
-- Matt Mahoney
It is shown that optimal text compression is a harder problem thanartificial intelligence as defined by Turing's (1950) imitation game; thus compression ratio on a standard benchmark corpuscould be used as an objective and quantitative alternative test for AI (Mahoney, 1999).
(Mahoney is also a competitor who has some winnings from The Calgary Corpus Compression Challenge -
Solve the AI problem and the world will love you.How about solving the AI problem for the good of humanity?
Let anyone submit an open source program that produces, with no inputs, one of the major natural language corpora as output.
S = size of uncompressed corpus
P = size of program outputting the uncompressed corpus
R = S/P (the compression ratio).Award monies in a manner similar to the M-Prize:
Previous record ratio: R0
New record ratio: R1=R0+X
Fund contains: $Z at noon GMT on day of new record
Winner receives: $Z * (X/(R0+X))Compression program and decompression program are made open source.
Explanation For an idea of why the C-Prize can solve the AI problem, if it is solvable, see Matthew Mahoney's comment [tinyurl.com] on it:
Matt Mahoney
Matt Mahoney is the author of Text Compression as a Test for Artificial Intelligence which states:
Jun 17, 7:18 pm show options
Newsgroups: comp.compression
From: "Matt Mahoney"
Date: 17 Jun 2005 20:18:59 -0700
Local: Fri, Jun 17 2005 7:18 pm
Subject: Re: The C-Prize
Hutter's AIXI, http://www.idsia.ch/~marcus/ai/paixi.htm makes another argument for the connection between compression and AI that is more general than the Turing test. He proves that the optimal behavior of an agent (an interactive system that receives a reward signal from an unknown environment) is to guess that the environement is most likely computed by the shortest possible program that is consistent with the behavior observed so far. In other words, the most likely outcome for any experiment is the one with the simplest explanation, where "simplest" means the smallest program that could model what you currently know about the universe.
He gives a formal proof, but it basically says that the only possible distribution of the infinite set of programs (or strings) with nonzero probability is one which favors shorter programs over longer ones. Given any string of length n with probability p > 0, there are an infinite set of strings longer than n, but only a finite number of these can have probability higher than p.
-- Matt Mahoney
It is shown that optimal text compression is a harder problem thanartificial intelligence as defined by Turing's (1950) imitation game; thus compression ratio on a standard benchmark corpuscould be used as an objective and quantitative alternative test for AI (Mahoney, 1999).
(Mahoney is also a competitor who has some winnings from The Calgary Corpus Compression Challenge.)Now, who might fund something like the C-Prize?
-
The avoidable danger: Bias'they're trying to build the machine that will pass the Turing test'
The profound danger of a biased AI here is quite avoidable. The theoretic problem of unbiased AI has been formally solved by Marcus Hutter with AIXI:
Computational AI. There are strong arguments that AIXI is the most intelligent unbiased agent possible in the sense that AIXI behaves optimally in any computable environment.
This is the reason I set up the following definition of the C-Prize:
Let anyone submit a program that produces, with no inputs, one of the major natural language corpora as output.
S = size of uncompressed corpus
P = size of program outputting the uncompressed corpus
R = S/P (the compression ratio).Award monies in a manner similar to the M-Prize:
Previous record ratio: R0
New record ratio: R1=R0+X
Fund contains: $Z at noon GMT on day of new record
Winner receives: $Z * (X/(R0+X))Compression program and decompression program are made open source.
Explanation For an idea of why the C-Prize can solve the AI problem, if it is solvable, see Matthew Mahoney's comment on it:
Matt Mahoney
Jun 17, 7:18 pm show options
Newsgroups: comp.compression
From: "Matt Mahoney"
Date: 17 Jun 2005 20:18:59 -0700
Local: Fri, Jun 17 2005 7:18 pm
Subject: Re: The C-PrizeHutter's AIXI, http://www.idsia.ch/~marcus/ai/paixi.htm makes another argument for the connection between compression and AI that is more general than the Turing test. He proves that the optimal behavior of an agent (an interactive system that receives a reward signal from an unknown environment) is to guess that the environement is most likely computed by the shortest possible program that is consistent with the behavior observed so far. In other words, the most likely outcome for any experiment is the one with the simplest explanation, where "simplest" means the smallest program that could model what you currently know about the universe.
He gives a formal proof, but it basically says that the only possible distribution of the infinite set of programs (or strings) with nonzero probability is one which favors shorter programs over longer ones. Given any string of length n with probability p > 0, there are an infinite set of strings longer than n, but only a finite number of these can have probability higher than p.
-- Matt Mahoney
Matt Mahoney is the author of Text Compression as a Test for Artificial Intelligence which states:
It is shown that optimal text compression is a harder problem thanartificial intelligence as defined by Turing's (1950) imitation game; thus compression ratio on a standard benchmark corpuscould be used as an objective and quantitative alternative test for AI (Mahoney, 1999).
(Mahoney is also a competitor who has some winnings from The Calgary Corpus Compression Challenge.)
Now a big question here is whether it might be possible to create a verifiably unbiased AI without making the compression program open source. In any case I don't think it is wise to trust any AI that hasn't at least gone through a compression competition with other purportedly unbiased AI's compressing an open source corpus.
Now, who might fund something like the C-Prize?
Well, here's a suggestion:
Since:
-
The avoidable danger: Bias'they're trying to build the machine that will pass the Turing test'
The profound danger of a biased AI here is quite avoidable. The theoretic problem of unbiased AI has been formally solved by Marcus Hutter with AIXI:
Computational AI. There are strong arguments that AIXI is the most intelligent unbiased agent possible in the sense that AIXI behaves optimally in any computable environment.
This is the reason I set up the following definition of the C-Prize:
Let anyone submit a program that produces, with no inputs, one of the major natural language corpora as output.
S = size of uncompressed corpus
P = size of program outputting the uncompressed corpus
R = S/P (the compression ratio).Award monies in a manner similar to the M-Prize:
Previous record ratio: R0
New record ratio: R1=R0+X
Fund contains: $Z at noon GMT on day of new record
Winner receives: $Z * (X/(R0+X))Compression program and decompression program are made open source.
Explanation For an idea of why the C-Prize can solve the AI problem, if it is solvable, see Matthew Mahoney's comment on it:
Matt Mahoney
Jun 17, 7:18 pm show options
Newsgroups: comp.compression
From: "Matt Mahoney"
Date: 17 Jun 2005 20:18:59 -0700
Local: Fri, Jun 17 2005 7:18 pm
Subject: Re: The C-PrizeHutter's AIXI, http://www.idsia.ch/~marcus/ai/paixi.htm makes another argument for the connection between compression and AI that is more general than the Turing test. He proves that the optimal behavior of an agent (an interactive system that receives a reward signal from an unknown environment) is to guess that the environement is most likely computed by the shortest possible program that is consistent with the behavior observed so far. In other words, the most likely outcome for any experiment is the one with the simplest explanation, where "simplest" means the smallest program that could model what you currently know about the universe.
He gives a formal proof, but it basically says that the only possible distribution of the infinite set of programs (or strings) with nonzero probability is one which favors shorter programs over longer ones. Given any string of length n with probability p > 0, there are an infinite set of strings longer than n, but only a finite number of these can have probability higher than p.
-- Matt Mahoney
Matt Mahoney is the author of Text Compression as a Test for Artificial Intelligence which states:
It is shown that optimal text compression is a harder problem thanartificial intelligence as defined by Turing's (1950) imitation game; thus compression ratio on a standard benchmark corpuscould be used as an objective and quantitative alternative test for AI (Mahoney, 1999).
(Mahoney is also a competitor who has some winnings from The Calgary Corpus Compression Challenge.)
Now a big question here is whether it might be possible to create a verifiably unbiased AI without making the compression program open source. In any case I don't think it is wise to trust any AI that hasn't at least gone through a compression competition with other purportedly unbiased AI's compressing an open source corpus.
Now, who might fund something like the C-Prize?
Well, here's a suggestion:
Since:
-
Re:15 minutes?
I think he was refering to the generally held first flight (Montgolfier in 1783, if you ignore Meerwein in 1781). These flights ranged beteween 10-20 mins.
The more you look at the Wrights flights to less important they get - they are really an example of the way the Americans will go to any length to glorify their own exploits over the rest of the world. http://www.idsia.ch/~juergen/planetruth.html gives a reasonable view of important aviation advances, though it misses out possibly the most important - Cayley in 1849-53 with the first heavier-than-air. -
Re: Wolframhttp://www.idsia.ch/~juergen/wolfram.html http://en.wikipedia.org/wiki/Stephen_Wolfram
All Stephen Wolfram did was compile 20 years of research in information theory, emergent systems, and the like, and call it a "New Kind of Science" and claim it as his own. There's a scathing letter from someone at the Santa Fe Institute documenting every claim Wolfram calls his own and a corresponding paper from the Institute published years before NKoS. There are tons of these.
Wolfram is a genius, but NKoS is no evidence of that fact.
-
Re:First Powered Flight
The reason Richard Pearse's "flights" weren't considered the first powered flight is because he basically "powered" his contraptions off the edge of a cliff and glided to a land. The Wright brothers actually lifted off the ground under their own power, as opposed to having the ground drop away (not to mention there was never any proof of his flights actually taking place). Nice conspiracy theory though, keep it up.
Check out this page: http://www.idsia.ch/~juergen/planetruth.html to see even more information on powered flight and others' accomplishments before and after the Wrights. First powered flight was a dirigible 50 years before Pearse and the Wrights, the first "heavier than air" powered flight took place in 1890, over a decade before Pearse and the Wrights... etc. etc. etc. -
A far better contest is compression.Compression is a far better basis for intelligence competition than chess, the Turing test or even SAT verbal analogy tests.
Marcus Hutter's AIXI paper provides a proof that if an agent is a good model for human behavior, and the universe is computable, that the most intelligent program is the smallest program that losslessly compresses the set of observations of the universe.
I've formalized a prize competition based on this criterion as the C-Prize, modeled after the Methusela Mouse Prize. The big difference is that instead of lifespan the metric is intelligence. Here is the currently published C-Prize criteria:
Since all technology prize awards are geared toward solving crucial problems, the most crucial technology prize award of them all would be one that solves the rest of them:
The C-Prize -- A prize that solves the artificial intelligence problem.
The C-Prize award criterion is as follows:
Let anyone submit a program that produces, with no inputs, one of the major natural language corpora as output.
S = size of uncompressed corpus
P = size of program outputting the uncompressed corpus
R = S/P (the compression ratio).Award monies in a manner similar to the M-Prize:
Previous record ratio: R0
New record ratio: R1=R0+X
Fund contains: $Z at noon GMT on day of new record
Winner receives: $Z * (X/(R0+X))Compression program and decompression program are made open source.
Explanation A very severe meta-problem with artificial intelligence is the question of how one can define the quality of an artificial intelligence.
Fortunately there is an objective technique for ranking the quality of artificial intelligence:
Kolmogorov Complexity
Kolmogorov Complexity is a mathematically precise formulation of Ockham's Razor, which basically just says "Don't over-simplify or over-complicate things." More formally, the Kolmogorov Complexity of a given bit string is the minimum size of a Turing machine program required to output, with no inputs, the given bit string.
Any set of programs which purport to be the standards of artificial intelligence can be compared by simply comparing their Artificial Intelligence Quality. Their AIQs can be precisely measured as follows:
Take an arbitrarily large corpus of writings sampled from the world wide web. This corpus will establish the equivalent of an IQ test. Give the AIs the task of compressing this corpus into the smallest representation. This representation must be a program that, taking no outside inputs, produces the exact sample it compressed. The AIQ of an AI is simply the ratio of the size of the uncompressed writings to the size of the program that, when executed, produces the uncompressed writings.
In other words, the AIQ is the compression ratio achieved by the AI on the AIQ test.
The reason this works as an AI quality test is that compression requires predictive modeling. If you can predict what someone is going to say, you have modeled their mental processes and by inference have a superset of their mental faculties.
Mechanics The C-Prize is to be modeled after the Methusela Mouse Prize or M-Prize where people make pledges of money to the prize fund. If you would like to help with the set up and/or administration of this prize award similar to the M-Prize let me know by email.
-
Re:Completely Unsurprised
This leads to all sorts of cool stuff, including things like a unification of Gödel's Proof, Turing's Halting proof, Hilbert's Tenth Problem, and chaos theory.
And provably optimal AI. Truly fascinating stuff. -
Re:MODERATORS - plagiarism / karma whore
-
Something similar
Isn't this version cooler? (scroll half way down the page to see it)
-
Re:Revisionist HistoryThere's a link to the English translation of his paper here.
Don't know whether you've read it, but I'm just about to.
-
Definition Defined
As to the first computer, I don't quite understand what you are contesting. By Wilkes' machine you do mean EDSAC completed in 1949?
- Is 'go into service' the operative phrase here? Do you mean the first computer to do something useful? In that case, I don't disagree, but I do think it's beside the point. If we are talking about the first computer to do X, X could equally be compile a program for itself from a high level language, emulate another computer, or run a multitasking operating system, these are all significant things, but Baby was still the first computer.
- Or is 'stored program computer' the operative phrase? Are you suggesting that Baby wasn't a stored program computer?
- Or since you don't actually specify EDSAC, do you mean Wilkes had already built a computer before Baby?
As to the Von Neumann Architecture, like I said, I think this is probably a more difficult question.
You seem to suggest that Atanasoff had thought of the Von Neumann Architecture in the 1930s? (I'm not disagreeing, but you don't make it absolutely clear that this is what you are saying.) I understand that Konrad Zuse had thought of the Von Neumann Architecture in the 1930s, and had written about it in a patent application (info here), so I don't know if that would have preceded Atanasoff or not.
But then, it may have crossed Charles Babbage's mind too (whether or not he though it was a worthwhile idea). And it's even possible that Archimedes had written about the Von Neumann Architecture in the Alexandrian Library. Not that I'm saying he did, but who would know?
-
Re:Wow...
-1 Ignorant. I happen to have a degree in CS.
I may not have gotten the exact idea down, but yes a very good approximative traveling salesman algorithm is based on ant behavior.
Do some research here for some undergrads that used the idea learned from here(pdf)
(Which are link i got from a two minute perusal on google for "traveling salesman ants")
Please have an idea what you are talking about next time.
Here's the abstract from the latter source.
We describe an artificial ant colony capable of solving the traveling salesman problem (TSP). Ants of the artificial colony are able to generate successively shorter feasible tours by using information accumulated in the form of a pheromone trail deposited on the edges of the TSP graph. Computer simulations demonstrate that the artificial ant colony is capable of generating good solutions to both symmetric and asymmetric instances of the TSP. The method is an example, like simulated annealing, neural networks, and evolutionary computation, of the successful use of a natural metaphor to design an optimization algorithm.