Are Computers Ready to Create Mathematical Proofs?
DoraLives writes "Interesting article in the New York Times regarding the quandary mathematicians are now finding themselves in. In a lovely irony reminiscent of the torture, in days of yore, that students were put through when it came to using, or not using, newfangled calculators in class, the Big Guys are now wrestling with a very similar issue regarding computers: 'Can we trust the darned things?' 'Can we know what we know?' Fascinating stuff."
...precisely why it's so hard to see that a pyramid of cannonballs is an optimal stack? This seems almost axiomatic.
I guess the obvious Monte Carlo simulation doesn't constitute "proof," but still, what exactly is the big question here?
I think the issue will become: can we learn anything on our own, we don't want to rely on imposibly long proofs. My calc teacher hates us using calculators, she thinks it will be the end of calculus as we know it.
When the results become too inaccurate then you have to change your tools, so unless accuracy is key, such as NASA's calculations and the likes then you work on making a computer do as much as possible to eliminate human error and humans can be there to adjust the almost mythic fudge factor
Computers are a human creation...it's not a matter of whether we can trust the computer, but rather a matter of can we trust that the people who built the computer and coded the software it runs knew what they were doing and didn't make any errors. Computers can only do what we tell them to...so really it was humans who indirectly made the proofs by producing a system capable of doing so. All it really boils down to is whether the folks who made the system and it's software knew what they were doing or not and whether they made any errors or not.
I don't see what the big deal is; not only is the general problem of proving mathematical statements undecidable (even without considering Godel's theorem) but even solvable problems require a lot of human intervention to get solved. Most problems (ie - examples out of math textbooks) aren't going to come up with a proof in any reasonable amount of time by simply dropping it into a theorem prover and pushing "go".
"Automated theorem proving" is less an automatic process (like you'd get with an automated production line) and more of a mechanical assistance to the job (like using a fork-lift to move heavy things faster than you could by hand).
It not only takes work to convert a problem into a good representation, but then you have to structure the problem statement in such a way that a theorem prover can make optimal use of it. Often times, you're forced to, upon following the output, prove lemmas (sub-proofs).
Then, when you finally get a proof, you get the joy of trying to simplify it to something that -can- be understood by a person; again, this is part of the process that can't really be automated well.
my sig's at the bottom of the page.
..if you can prove the program. I know that a lot of people look down on axiomatic semantics and model checkers; but I also know some people that started in that area a long time ago that still believe in it, and even more that are trying to get faculty appointments out of this rebounding field. If you can prove a program does what you expect it to and it, it turn, can be shown to prove what you really wanted, I don't see the problem. Maybe some of our theoretical mathematicians just need a dose of practical computer science.
The problem with computer generated proofs is that in order to trust the result of the computer, you have to trust:
And of course, our understanding of the hardware depends on how accurate our understanding of the laws of physics is. Any mistake in either the source code, compiler, or hardware, and potentially the proof produced is incorrect. That's an awful amount of stuff you have to check just to make absolutely sure the computer is correct. Then, consider how many bug-free pieces of software you've encountered. ... Yeah, I can see why mathematicians would not trust computer generated proofs.
Of course, people are not infallible either, but that's well known and expected. It's all about how much uncertainty people are willing to accept.
True. Any argument is only as good as it's assumptions.
You may find the following project interesting:
It [acl2] is a common lisp programming language and a system for proving properties about code. It has been used to prove correctness properties about microprocessors[microcode], compilers, and application code. Notably, I believe it is currently being used to prove certain properties about the Java Sandbox.
When the FDIV bug was reported, AMD hired it's creators to mechanically verify correctness of it's floating point microcode on a model of the K5 developed on the system. Details
In fact, they were unable to verify it's correctness, as it wasn't correct. Fortunately, the code was able to be changed, prior to chip fabrication, saving AMD a lot of money and a lot of embarassment.
as a graduate in the fields of mathematics, i spent a large portion of my five undergraduate years doing proofs. there are a great many ways to prove things, sometimes applicable sometimes not. (e.g. using inductive proofs for numeric theorems is all well and good but completely useless for any sort of ring-theory or spatial proofs)
there are also several levels of depth for proofs, ranging from "i've found a counter example so i can write the whole thing off as garbage" to "i have exhaustively and rigorously proved this starting with the basic axioms of number theory and worked my way on up"
the latter is really the only acceptable way to prove anything seriously. sometimes when you are reworking an already- done proof to illustrate a point, other mathematicians will allow a bit of latitude when it comes to cutting corners, but for a proof as far-reaching as the one in the article, i would only be interested in a "rigorous" proof, that is, one that started with the foundational tenets of mathematics and combined those to form and prove other postulates, etc. very much a form of abstraction, not unlike large development projects.
the problem arises when one (or several) humans have to be able to objectively check the whole thing. to use my prior example of a large development project, no one developer at microsoft understand the whole of windows. it's too big for a single human to understand. each developer knows what he needs to do to complete his part, and so on and so forth.
traditionally, for proofs, a single mathematician (or a small group) would hammer out the whole proof, so the level of complexity remained at a human-understandable level. (even if tedious) my concern, as a mathematician, with using an automated solution would be the rapidly growing order of complexity needed to properly back up increasingly complex proofs. as stated in the article, it's like trying to proofread a phonebook. (only, you must also consider that for every element of the proof (a particular listing) there are several branching layers of complexity (fundamental, underlying proofs many layers deep) underneath. this gets more complicated in an exponential fashion) obviously this approach will only remain human-checkable for relatively small problems. (programmers: think of some horrible nondeterministic- polynomial problem like the "traveling salesman" problem. systems, like humans, are a finite resource, but if you increase the size of the problem, the complexity will quickly grow far beyond your ability to compute. large proofs suffer from the same difficulties, if not quite as concrete and pronounced as NP algorithms)
in closing, i would have to agree that proofs, no matter the effort and computing time put into them, really should not be viewed as being as rigorous as those provable "by hand" and human- understandable, even if the computer has arrived at a satisfactory conclusion, because we have no way of KNOWING if the computer has built up the proof correctly, except that it says it has.
In Soviet Russia jokes are formulaic and decidedly non-humorous.
On the other hand, the classification problem for finite simple groups has been rigorously solved, but the collected proof (done in bits by hundreds of mathematicians working over 30 years) is tens of thousands of pages in many different journals. given the standards of review, it is a virtual certainty that there is an error somewhere in there that hasn't been found. So, again, the solution to this problem is *probably* right, but it has been accepted as solved.
Wouldn't this be a good use for the computer, then - attempting to verify these disjointed proofs and combine them into one unified whole?
Ewige Blumenkraft.
I think the "as we know it" is the key word here. It creates a new juggling act that teachers have to deal with. On one hand, calculators are extremely useful for the things at which humans are error prone. If students can use calculators or Mathematica or something, they can check their arithmetic much more reliably, which is great for isolating problems with learning concepts from basic mathematical errors. Granted, some people don't see this as a bonus since it's good to be able to do arithmetic reliably without aid. For smaller numbers I agree, but in the real world people use calculators for arithmetic with numbers that have a lot of sig figs. Making students do this arithmetic by hand is just distracting them from learning the concepts they are supposed to be learning.
On the other hand, many students are prone to using these devices and applications as crutches and try to get away with doing things like using their calculator's implementation of Newton's Method instead of solving the problem themselves.
Some professors have found solutions to this problem, others havent. When I was at college, I think our math department had achieved a pretty good level of harmony with Mathematica - we were expected to do a lot of stuff by hand or in our heads - Gaussian elimination, for example - but in order to make the math seem useful, we were also exptected to be able to solve real-world problems with the stuff we learned. Not contrived "real-world" problems from your high-school textbook, but stuff like interpreting large and dirty scientific datasets where the specific technique we would have to use to solve the problem was something we could figure out, but not something that had been explicitly laid out by the textbook or in lecture. We had to apply the concepts we had learned to figure out the problem, but there was no way we were going to chug out that arithmetic by hand - when was the last time you tried to work on a 16x16 matrix using a pencil and paper? How about a 100x100 one?
For those who don't understand why mathematicians are always striving for better than "probably true", the Mertens Conjecture [Mathworld] is the commonly given example. It was thought to be true. As the Mathworld article notes, as late as 1985, arguments were made that there was no counter-example for numbers less than 10^20 or even 10^30. However, in 1987, someone proved a counter-example DOES exist. It is somewhere between 10^12 and 10^65. In short, you could process billions of integers and find that the conjecture held, leading you to believe it was "probably true". But it is actually false.
Can't remember where I came across this, but for entertainment:
Men are four:
He who knows and knows that he knows; he is wise, follow him.
He who knows and knows not that he knows; he is asleep, wake him.
He who knows not and knows that he knows not; he is ignorant, teach him.
He who knows not and knows not that he knows not; he is a fool, spurn him!
I'm sure it's coloured my attitude to supporting users:-)
Cameron Simpson, DoD#743 cs@cskk.id.au http://www.cskk.ezoshosting.com/cs/
What strange timing -- I just saw a talk this morning from Hales at CMU about his work on formalizing the Kepler conjecture ("theorem") in HOL Light.
I can understand not wanting to trust a big piece of C code that purports to check a huge number of cases of a proof doing numerical analysis. It took 6 years of 4 people trying to verify his computer proof of the Kepler conjecture--before they gave up. If a program is that hard to believe, then it does indeed deserve lesser status than a handwritten proof that can be checked by mathematicians in shorter time.
On the other hand, there are other computer tools that we really should trust, and that are revolutionizing mathematics. The idea is simple: write your proofs in such detail that they can be mechanically checked by a simple (easy to verify) procedure. This is much better than paper proofs, because the potential for human error is minimized. (I don't think anyone will argue that published proofs have often been wrong, and the proofs have not been caught by the peer reviewers!) Since it's really, really bad to believe wrong proofs, there's a very real benefit that is offset by the sometimes tedious work necessary in formalization.
That's what Tom Hales is doing with flyspeck, and I think it is the future of mathematics. (In fact, I have recently become addicted to mechanizing my own proofs in Twelf--it's not only immensely satisfying, but it helps me sleep better at night and makes for stronger papers.)
I know I am splitting hairs (but who isn't in this kind of discussion), but I can't wrap around my head around the difference DK and DD to the person in question .
:P.
What i mean is,effectively not knowing you know something is the same as not knowing you don't know something, since you are ignorant of it anyway. Of course, one can remind me that I know something, (e.g. "You idiot! That's just the Laplace-Beltrami operator you learned in class!"), but then to me I am now KK, not DK.
Maybe DK is just a very confusing way of saying "forgot for a moment".
Ok, my head is spinning again. Maybe the hairs are too fine to split anyway
Mode (3) smart-aleck mode. Press * to return to main menu.
No! Definitely not! Really! :-)
Most faults are software problems, not hardware, so having different machines won't help. Further, interestingly, most major software faults (at least, of the sort that make it through serious testing) tend to be conceptual problems, not coding ones, and different people tend to make the same mistakes. This means in practice that even when you have completely independent software implementations (called n-version programming), they're frequently all wrong in the same way at the same times. See the famous Knight & Leveson paper.
The Mobius function is a number theoretic function with the value:
This is why we have to prove the Riemann hypothesis. Not only does the problem quit bugging us, but we learn new things along the way. When Andrew Wiles proved Fermat, proving Fermat gave no immediate benefit, but it also proved an important conjecture about elliptical curves and I believe it provided the mathematics community with Galois theory.
(IANAM.)
--Stephen
Did you ever notice that *nix doesn't even cover Linux?
--Stephen
Did you ever notice that *nix doesn't even cover Linux?
Perhaps I'm missing something here, but I think the approach of verifying the validity of the proofs that come out of the kind of system described in the article is fundamentally the wrong approach.
Instead, mathematicians ought to focus on formally proving the proof generator. If it could be fomally proved that the proof generator only generated valid proofs, we could automatically trust all the proofs that it generated. Program proof and verification is a complex topic, but it's a quickly maturing area of CS.
I'm not sure that there are only those four catagories.
Godels theorem pretty much says that there are things that you can NEVER know are true or false...and that in some cases you can PROVE that you can never know them.
This leads to some other states that your knowledge can be in:
CERTAINTY OF UNCERTAINTY: In some cases you can know (by solid mathematical proof) that you can never know some particular thing. For example. we know with absolute certainty - that you can't solve the 'Halting Problem' (to prove whether an arbitary computer program will eventually halt or whether it'll run forever). That's something we KNOW we'll never be able to do no matter how smart we get. That's not the same as KK/KD/DK or DD. It's tempting to lump this in with KD - but in the case of simply being aware that we are ignorant of something, we might take steps to resolve that ignorance. In this case, we know that this is something we cannot EVER know.
CERTAINTY OF POTENTIAL CERTAINTY: If you know the current "largest prime number" - then by definition, you don't know of a larger prime - but you DO know that you could definitely find a larger prime if you really wanted to.
UNCERTAINTY OF POTENTIAL CERTAINTY: Then there are other things that you might not know whether they are knowable or not - such as the proof of some of the classic mathematical problems. Before Fermat's last theorem was proved, nobody was sure whether it could be proved or not.
www.sjbaker.org
About 20 years ago, I worked for a company near Boston, and played in a co-ed softball league made up of teams from other companies in the area. One of the teams had an extremely attractive third basewoman, who was also quite friendly, which I discovered during a game where I had actually made it to third base. At the beer hall we went to after the games, I asked her what she did; she told me she was the head of Q&A for a product that would supposedly produce provably correct programs. You used some kind of GUI to draw something like a flowchart, typed in a few constraints, and then clicked on a button, and out would come bug-free code (in Fortran or something). "Sounds cool" I said, but she laughed and said that the development team had been having a really hard time getting the bugs out of the tool's parser. For the next few weeks, the joke was about the shortsightedness of her company's management - "Why don't they just do the obvious thing, and use their tool to generate the code for the parser?". They never were able to get all the bugs out, and went out of business a short time later.
While the article spaks about intentional errors, I could imagine the same to be true for bugs (while introducing a self-replicating bug by accident is unlikely, I'm not sure if it's unlikely enough to never occur).
Say, an optimizer is buggy. Say, in a certain line of code in the compiler, there's a '>' instead of a '<' sign. Unfortunately, this causes, under certain, rare conditions, the optimizer to miscompile '<' into '>'. Now, since triggering that bug is rare, it goes unnoticed for a while, and the buggy compiler is installed as default compiler for further development.
Now, at some later time, the bug is found, a developer looks at the code and finally finds the error. He replaces the '>' by '<' in the hope of fixing the bug. Unfortunately, he doesn't recognise that this very comndition now triggers the bug, having the compiler internally replacing '<' to '>' again, replicating the bug again.
Now, the test case will show that the bug is still there, but the developer will simply not find the bug in the code. After all, the bug isn't in the source any more. It's only there in the compiled program. At some later time, though, it might magically disappear by some unrelated change nearby, which causes the bug not to be triggered any more.
The Tao of math: The numbers you can count are not the real numbers.
If we can prove the Poincare Conjecture is true
OK, yes, that would definitely be worth doing. Various chunks of 3-dimensional geometric topology currently include the caveat ``... if the 3-dimensional Poincare Conjecture is true.''
(and it is now thought that it might be)
Depends who you talk to, of course. Perelman reckons he's got an outline proof of Thurston's Geometrisation Conjecture, which implies the Poincare Conjecture as a corollary. But as I understand it he's not actually proved it, just described how one would go about proving it - essentially saying ``It's over the other side of that hill''. Now Perelman is a very bright lad (cleverer than I am) but the Poincare Conjecture has a long history of almost being proved (my research supervisor (I've just finished a PhD in algebraic and geometric topology) almost did it about twenty years ago, for example, but there was a very subtle and unresolvable flaw in his argument) so I'm going to reserve judgement until the paper turns up in the Annals of Mathematics.
I wouldn't be surprised if the Conjecture turned out to be true, as the corresponding result is known to be true in dimensions 1,2,4,5,... (well, to be picky, the smooth 4-dimensional case hasn't been proved yet, but the topological and piecewise-linear ones have). But equally, I wouldn't be shocked if it weren't - 3-dimensional topology is a weird subject, as there's enough room to start discussing interesting things, but there's not so much room that everything trivialises.
Quick precis of the Conjecture:
``If it looks like a sphere, it is''.
Slightly less quick precis of the Conjecture:
Given any topological space X, we can assign a sequence of groups (`homotopy groups') pi_n(X) to it, so that pi_n(X) in some sense describes the n-dimensional structure of X. An n-manifold (a topological space which locally `looks like' ordinary, flat n-dimensional space) which has the same sequence of homotopy groups as the 3-dimensional sphere, is called a `homotopy 3-sphere'. The Poincare Conjecture is that the 3-sphere itself is the only homotopy 3-sphere.
So, to prove it, we have to show that every homotopy 3-sphere is topologically equivalent (`homeomorphic') to the 3-sphere. And to disprove it, we just have to find one counterexample - a homotopy 3-sphere which isn't equivalent to the 3-sphere itself.
Now to use a computer to prove the Conjecture, we need to find some way of verifying that every possible homotopy 3-sphere is equivalent to the 3-sphere. This is theoretically doable - there's an algorithm (the Rego-Rourke algorithm) which lists (with redundancy), all possible homotopy 3-spheres. There's another algorithm (the Rubenstein-Thompson algorithm) which, given a homotopy 3-sphere, can tell if it is equivalent to the 3-sphere. So in theory we just feed the output of the RRA into the RTA.
Except that there are an infinite number of possible homotopy 3-spheres to check, so if the Conjecture is true, this program will never terminate.
Now if you can find some way of reducing the cases under consideration to some finite subset (by, for example, showing that all but a finite number of homotopy 3-spheres obviously satisfy the conjecture) then using a computer suddenly becomes a worthwhile endeavour. This is basically what Appel and Haken did with the Four-Colour Map Theorem.
The computer approach is also useful for searching for counterexamples, and for verifying that all cases up to some level of complexity satisfy the Conjecture.
But where computers aren't currently going to help, is in the actual creative side of things - a lot of important modern mathematics (and the related theorems and proofs) has come from very clever humans saying ``OK, what happens if we try this (utterly weird and counterintuitive) thing here'' and having the aesthetic sense to tell when something's actually interesting or just irrelevant. Until computers are capable of this kind of creative/intuitive/aesthetic/etc behaviour, I doubt they're going to be replacing human research mathematicians any time soon.
nicholas