New Algorithm Provides Huge Speedups For Optimization Problems (mit.edu)

← Back to Stories (view on slashdot.org)

New Algorithm Provides Huge Speedups For Optimization Problems (mit.edu)

Posted by Soulskill on Saturday October 24, 2015 @03:07PM from the need-for-speed dept.

An anonymous reader writes: MIT graduate students have developed a new "cutting-plane" algorithm, a general-purpose algorithm for solving optimization problems. They've also developed a new way to apply their algorithm to specific problems, yielding orders-of-magnitude efficiency gains. Optimization problems look to find the best set of values for a group of disparate parameters. For example, the cost function around designing a new smartphone would reward battery life, speed, and durability while penalizing thickness, cost, and overheating. Finding the optimal arrangement of values is a difficult problem, but the new algorithm shaves a significant amount of operations (PDF) off those calculations. Satoru Iwata, professor of mathematical informatics at the University of Tokyo, said, "This is indeed an astonishing paper. For this problem, the running time bounds derived with the aid of discrete geometry and combinatorial techniques are by far better than what I could imagine."

5 of 129 comments (clear)

Min score:

Reason:

Sort:

Bad news for them by goombah99 · 2015-10-24 15:29 · Score: 4, Interesting

I guess they never heard of the "No Free Lunch" theorem for optimization, which believe it or not is the name of a proven theorem by David Wolpert that says the following is rigrorously true.
Averaged over all optimization problems every possible algorithm (that does not repeat a previous move) takes exactly the same amount of time to find the global minimum.
The collorary for this is that: If you show your algorithm outperforms other algorithms on a test set, then you have just proven it performs slower on all other problems.
That is the better it works on a finite class of problem, the worse it is on most problems.
it is even true that a hill climbing algorithm blunders into the minimum in the same average time as a hill descending algorithm. (yes that is true!).
Look it up. It's the bane of people who don't understand optimization.
The only things that distinguishes one optimization algorithm from another is:
1) if your problems are restricted to a subspace with special properties that your algorithm takes advantage of
2) your performance metric is different then the average time it takes to find the global minimum. for example, something that limits the worst case time to get within epsilon of the minimum.

--
Some drink at the fountain of knowledge. Others just gargle.
1. Re:Bad news for them by goombah99 · 2015-10-24 15:39 · Score: 4, Interesting
  
  To be fair, the actual article doesn't claim to violate the no free lunch theorem because it only applies to "some" problems. It's the headline that violates the no free lunch theorem. But one aspect of the no-free-lunch theorem that usually proves to be insanely diabolical is that it turns out that it's usually VERY hard to define which problems an alogorithm will do best on in general. Often you can see a specific set for which it is obvious. But as you complicate things the obviousness part goes out the window. FOr example, if you know your surface is convex and has a very simple shape like a multidimensional parabola, then it's trivial to hill descend to the bottom. But once you start putting in multiple minima and arbitrary discontinuities it becomes hard. Interestingly in high dimensional spaces know something is smooth isn't very useful as smoothness only removes a low number of degrees of freedom--leaving behind a smaller but still high dimensional space.
  Thus if there is a breakthrough here it's not the algorithm, it would be the ability to specify what kinds of problems this will work well on!
  
  --
  Some drink at the fountain of knowledge. Others just gargle.
Re:Oh... by KGIII · 2015-10-24 16:27 · Score: 3, Interesting

Why, I oughtta!
Anyhow, I'm checking out the paper. Well, I will be after snuggle-time. Yay! A use-case for a tablet. I've a bit of work in this area though, specifically, modeling traffic - traffic optimization *is* hard. Throughput, where humans are concerned, is nearing an attempt at modeling chaos (and for those who think perfection is possible, I welcome their insights).
I've yet to read it but I suspect it's not a huge advance so much as it is working on prior modal work. (Not all work gets published, for better or worse, and some remains a trade-secret.) My guess is that (and this is just from a quick look, absolutely not to be taken as an authoritative statement) this is just further optimization, refinement, and it appears like it may be a good step forward.
I'm not sure that I agree with the phone analogy but, well, okay... We're not children.
It looks like, if you've got highway traffic going to A, B, and C and you have exit/merge nodes 1, 2, and 3 then you can take the current values and include the street traffic - as well as time of day data, up to and including the myriad traffic uses, that you can then more accurately predict if new merge 1.5 will increase traffic for traffic going from A to B while decreasing overall traffic (of increasing it) from B to C and if it will actually reduce surface street traffic. It also looks like you can use this to predict if an exit at 1.5 will decrease overall time for, example, delivery vehicles that need to reach surface streets or if they're better shoveled off to a new exit at 2.5 or left to exit on 3. You can also deduce predictions (never certainties) for throughput from A to C, A to B, and which exits or mergers are more likely to congest or reduce traffic on the highway as well as optimize for certain functions such as the aforementioned delivery trucks.
Basically, in programmer terms, the more subroutines you can have the more accurately you can refine your answers and the better the results *can* be. Of course this adds complexity and computational difficulty. This algorithm optimizes existing work by allowing you to use more subroutines more efficiently - perhaps think of it like calling a library or OOP (I guess?). As you were a teacher, it's akin to being able to (more easily) tailor an individualized lesson plan for each student who may, or may not, excel and then, at the end, they'd all be able to pass their test which may, or may not, be also tailored for them. Basically, you'd be able to blab a whole bunch of data at them and they'd then get the appropriate information, tailored to their needs, and may output the correct answers on the test.
This doesn't do anything (from what I see so far) to actually ensure you're inserting good data (of course). I am obviously stretching the analogies quite a bit but math is hard. Well, no... It's not hard so much but, rather, it is that most people learned via rote and not conceptually. Some idiot decided to make word problems, not a bad idea. Some idiot came along after them and told them to take the word problems apart and to ignore the words. Nobody actually showed them the concepts of the maths themselves and why the answers are the way they are. I could type a novella on the subject... Complete with examples! Nobody would listen/read it and I don't blame them.
In short, this is "just" improving on existing models and appears to be doing so by making various inputs more accessible or, more accurately, more streamlined. Again, a very cursory scan was done and I could be way off. Another quick look makes me think I'm correct. I really need to read it to be certain.
'Snot hard. *nods* Looks like good work from a quick glimpse. It'd be a shame to see it wasted on optimizing phones. ;-) Another quick peek does, indeed, note that they're claiming improvements on current models. Note: This was written in a few sporadic posts and the paragraphs are probably horrifically out of order. If they don't make sense then swap 'em around a bit until they

--
"So long and thanks for all the fish."
The two key contributions by Kwyj1b0 · 2015-10-24 16:39 · Score: 3, Interesting

Some very interesting results, but the two key contributions are (almost verbatim from the article):
1) With the best general-purpose cutting-plane method, the time required to select each new point to test was proportional to the number of elements raised to the power 3.373. Sidford, Lee, and Wong get that down to 3.
2) And in many of those cases (submodular minimization, submodular flow, matroid intersection, and semidefinite programming), they report dramatic improvements in efficiency, from running times that scale with the fifth or sixth power of the number of variables down to the second or third power.
So this seems to be a better cutting plane approach that improves the cutting process by reducing the time to find the next test point (in general), and for certain structured problems (like SDPs) this approach reduces the computation time significantly.
This does raise some interesting questions, such as: is there a limit to how fast you can (in a general problem, not using a specific problem's structure) find the next test point? Even if we don't know what algorithm gets us there, it would be useful to have a bound for future research.
Re:Oh... by sexconker · 2015-10-24 16:40 · Score: 1, Interesting

Why, I oughtta!
Anyhow, I'm checking out the paper. Well, I will be after snuggle-time. Yay! A use-case for a tablet. I've a bit of work in this area though, specifically, modeling traffic - traffic optimization *is* hard. Throughput, where humans are concerned, is nearing an attempt at modeling chaos (and for those who think perfection is possible, I welcome their insights).
For controlled intersections, ACTUALLY control them.
Stop signs, traffic signals, etc. are all suggestions in the eyes of your typical driver.
If a yellow light meant "we're raising the fucking spiked pylon barrier" there would be a lot fewer issues.
It won't be perfect (we still have people who try to beat trains and drive round/through the arms as they're lowering) but it would be a lot better, especially after weeding out the retards.
Once you control behavior you can then look to optimizing flows of traffic, and you'll have much more success.