New Algorithm Provides Huge Speedups For Optimization Problems (mit.edu)
An anonymous reader writes: MIT graduate students have developed a new "cutting-plane" algorithm, a general-purpose algorithm for solving optimization problems. They've also developed a new way to apply their algorithm to specific problems, yielding orders-of-magnitude efficiency gains. Optimization problems look to find the best set of values for a group of disparate parameters. For example, the cost function around designing a new smartphone would reward battery life, speed, and durability while penalizing thickness, cost, and overheating. Finding the optimal arrangement of values is a difficult problem, but the new algorithm shaves a significant amount of operations (PDF) off those calculations. Satoru Iwata, professor of mathematical informatics at the University of Tokyo, said, "This is indeed an astonishing paper. For this problem, the running time bounds derived with the aid of discrete geometry and combinatorial techniques are by far better than what I could imagine."
You never know; I'm sure he had plenty of 1UPs.
How can I believe you when you tell me what I don't want to hear?
I guess they never heard of the "No Free Lunch" theorem for optimization, which believe it or not is the name of a proven theorem by David Wolpert that says the following is rigrorously true.
Averaged over all optimization problems every possible algorithm (that does not repeat a previous move) takes exactly the same amount of time to find the global minimum.
The collorary for this is that: If you show your algorithm outperforms other algorithms on a test set, then you have just proven it performs slower on all other problems.
That is the better it works on a finite class of problem, the worse it is on most problems.
it is even true that a hill climbing algorithm blunders into the minimum in the same average time as a hill descending algorithm. (yes that is true!).
Look it up. It's the bane of people who don't understand optimization.
The only things that distinguishes one optimization algorithm from another is:
1) if your problems are restricted to a subspace with special properties that your algorithm takes advantage of
2) your performance metric is different then the average time it takes to find the global minimum. for example, something that limits the worst case time to get within epsilon of the minimum.
Some drink at the fountain of knowledge. Others just gargle.
Math is hard.
Or, more accurately, math is believed to be hard.
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
To give a brief and hopefully laymen-friendly explanation:
It isn't an equation per se; P and NP are classes of computational problems. Roughly speaking, P is the set of problems that can be solved deterministically in polynomial time with respect to some input parameter, and NP is the set of problems that can be solved non-deterministically in polynomial time with respect to some input parameter (hence the "N"). A more approachable definition of NP for laymen is that it comprises problems for which a solution can be verified (or rejected) in polynomial time. For instance, prime factorization of an integer is in NP because, given an input integer and a list of prime factors, you can verify that the list is actually the prime factorization of the integer in polynomial time (by multiplying them together).
The significance, aside from pure theoretical interest, is that a lot of very interesting practical problems are known to be in NP, and knowing that those problems could be solved in polynomial time would be very useful.
Averaged over all problems, the time it takes to figure out which special case one has plus the time to solve that special case has to be the same as any other algorithm. The escape clause is if when you say "cutting plane" you are intrisically restricting the problem to a special case. In that case alrgorithmic speedups are plausible. Likewise is cutting plane is a class of algorithm and you are comparing variations on that particular class of algorithm then speed ups are again plausible. But if all possible problems are addressed as cutting plane problems then No it is not possible for there to be any average speed up.
But please realize that all is not lost. It is very likely that all problems of interest to humans is a tiny set of the space of all possible problems. It may well be that we are only interested in a very special subspace. I'd even bet on it. But defining what that subspace is will be hard.
Some drink at the fountain of knowledge. Others just gargle.
I read the summary. I didn't understand it. Then I read the article. I didn't understand it either.
Why, I oughtta!
Anyhow, I'm checking out the paper. Well, I will be after snuggle-time. Yay! A use-case for a tablet. I've a bit of work in this area though, specifically, modeling traffic - traffic optimization *is* hard. Throughput, where humans are concerned, is nearing an attempt at modeling chaos (and for those who think perfection is possible, I welcome their insights).
I've yet to read it but I suspect it's not a huge advance so much as it is working on prior modal work. (Not all work gets published, for better or worse, and some remains a trade-secret.) My guess is that (and this is just from a quick look, absolutely not to be taken as an authoritative statement) this is just further optimization, refinement, and it appears like it may be a good step forward.
I'm not sure that I agree with the phone analogy but, well, okay... We're not children.
It looks like, if you've got highway traffic going to A, B, and C and you have exit/merge nodes 1, 2, and 3 then you can take the current values and include the street traffic - as well as time of day data, up to and including the myriad traffic uses, that you can then more accurately predict if new merge 1.5 will increase traffic for traffic going from A to B while decreasing overall traffic (of increasing it) from B to C and if it will actually reduce surface street traffic. It also looks like you can use this to predict if an exit at 1.5 will decrease overall time for, example, delivery vehicles that need to reach surface streets or if they're better shoveled off to a new exit at 2.5 or left to exit on 3. You can also deduce predictions (never certainties) for throughput from A to C, A to B, and which exits or mergers are more likely to congest or reduce traffic on the highway as well as optimize for certain functions such as the aforementioned delivery trucks.
Basically, in programmer terms, the more subroutines you can have the more accurately you can refine your answers and the better the results *can* be. Of course this adds complexity and computational difficulty. This algorithm optimizes existing work by allowing you to use more subroutines more efficiently - perhaps think of it like calling a library or OOP (I guess?). As you were a teacher, it's akin to being able to (more easily) tailor an individualized lesson plan for each student who may, or may not, excel and then, at the end, they'd all be able to pass their test which may, or may not, be also tailored for them. Basically, you'd be able to blab a whole bunch of data at them and they'd then get the appropriate information, tailored to their needs, and may output the correct answers on the test.
This doesn't do anything (from what I see so far) to actually ensure you're inserting good data (of course). I am obviously stretching the analogies quite a bit but math is hard. Well, no... It's not hard so much but, rather, it is that most people learned via rote and not conceptually. Some idiot decided to make word problems, not a bad idea. Some idiot came along after them and told them to take the word problems apart and to ignore the words. Nobody actually showed them the concepts of the maths themselves and why the answers are the way they are. I could type a novella on the subject... Complete with examples! Nobody would listen/read it and I don't blame them.
In short, this is "just" improving on existing models and appears to be doing so by making various inputs more accessible or, more accurately, more streamlined. Again, a very cursory scan was done and I could be way off. Another quick look makes me think I'm correct. I really need to read it to be certain.
'Snot hard. *nods* Looks like good work from a quick glimpse. It'd be a shame to see it wasted on optimizing phones. ;-) Another quick peek does, indeed, note that they're claiming improvements on current models. Note: This was written in a few sporadic posts and the paragraphs are probably horrifically out of order. If they don't make sense then swap 'em around a bit until they
"So long and thanks for all the fish."
Think of a huge jigsaw puzzle. You might have thousands of pieces that could each go in thousands of possible places, and while you may know tricks to speed up the process, solving it would surely take a lot of trial and error and time. When you did finally solve the puzzle, though, it would only take you a moment to check the picture against the box and prove your solution was correct. That makes it a NP problem.
Another jigsaw puzzle has every piece labeled with column and row, so you can put the whole thing together in one sweep—no trial and error needed. Both solving the puzzle and verifying the solution would be simple. That makes it a P problem.
If P=NP, it would indicate the first jigsaw puzzle has labels on its pieces too, just harder to see. If you can find them, solving the puzzle becomes trivial.
There are a great number of NP problems in the world; if P=NP we could find the solutions to many intractable math problems within our grasp.
.
Unfortunately, one of those is encryption. Many encryption methods use a NP math problem as a lock and its solution as a key. We assume it would take an impractical amount of time to solve the math problem, so the only way in is by knowing the solution. But if P = NP, breaking that lock (by solving the math) could become as fast as using the key. Oops!
How can I believe you when you tell me what I don't want to hear?
Some very interesting results, but the two key contributions are (almost verbatim from the article):
1) With the best general-purpose cutting-plane method, the time required to select each new point to test was proportional to the number of elements raised to the power 3.373. Sidford, Lee, and Wong get that down to 3.
2) And in many of those cases (submodular minimization, submodular flow, matroid intersection, and semidefinite programming), they report dramatic improvements in efficiency, from running times that scale with the fifth or sixth power of the number of variables down to the second or third power.
So this seems to be a better cutting plane approach that improves the cutting process by reducing the time to find the next test point (in general), and for certain structured problems (like SDPs) this approach reduces the computation time significantly.
This does raise some interesting questions, such as: is there a limit to how fast you can (in a general problem, not using a specific problem's structure) find the next test point? Even if we don't know what algorithm gets us there, it would be useful to have a bound for future research.
Great, in theory. Then you end up with some drunk guy, driving backwards, on a one way street. As someone who made a career of modeling traffic, well, if we had intelligent drivers then I'd not have been able to retire at 50 after selling my company. Some dumb ass will ram the pylons and actually increase the overall time needed because emergency crews will be on-scene and the vehicle will be disabled. YOU might obey the rules but, well, you know how far that gets you.
Also, in an ideal world? We'd not need one single signal light nor stop sign. Yup. It could be done with ease. The only flaw in my world-overthrowing-plan is that humans are inherently stupid and more so when they're behind the wheel of an automobile. Trust me on this - I dare say that this is the one subject where I can speak authoritatively. Coming up with, and believing in, an ideal system that relies on the intelligence of the operators is as doomed as any political ideology that follows the same principles - meaning you're gonna need to be really draconian and authoritarian.
In short, we could just take the cars away. That'd make about as much sense as any other solution. Which, by the way, is me agreeing with you. It won't be perfect. No, it won't. However, your idea may actually mean that the overall throughput slows down and doesn't return to normal or result in an increased rate. It takes, on average, as long as THREE YEARS for them to acclimate to a new traffic pattern. Some, like the 'Magic Roundabouts' in the UK are just skipped by the locals who simply avoid them rather than adapt.
For every model of improvements you make, we'll invent a newer and dumber human. So, no... It won't be perfect. The pylons might be a stretch too far. Some places have actually opted to remove the tire puncture strips for those who go the wrong way on a one way route. Why? Idiots ended up holding up traffic. Sure, it had the desired result but there are more idiots. See the UK where they put in the pylons that rise up from the street, for example, and then lower only for certain vehicles with an attached sensor. People do exactly what you think they'll do - on a *very* regular basis, some of them even picking up speed to try to beat it. They don't. They do it again. And again... And again... And again...
Pylons probably aren't the solution - drivers education and mandatory re-licensing with strict adherence to testing standards might go somewhere but that's politically infeasible in my country.
"So long and thanks for all the fish."
I am an inconsiderate clod you... Oh, wait...
In Soviet Russia inconsiderate clods you?
"So long and thanks for all the fish."
The biggest problem with that IMHO, at least around where I live, is people cannot for the life of them zipper merge properly. The average person is a stupid shitty driver is the biggest modeling issue... They wait until the last moment to merge in, they don't equalize speed to traffic flow so they force the people on the highway to break heavily as they enter, the people on the highway don't watch the ramp and try to alleviate issues by maintaining or altering speed slightly, etc. But I'm sure you already knew that. :)
A nice explanation, but I can't resist a nitpick: Typical jigsaw puzzles are in P, because for every pair of pieces you can decide whether they fit together or not without having to consider other pieces; therefore solving this amounts to traversing an unoriented graph. An NP jigsaw must have the possibility of multiple pieces fitting one piece in general.