Article doesn't answer the question of why? Even the initial premise it would cost half a billion to digitize 90 forms and keep a data base seems absurd. What is so special that it has needs that would cost that much. A high school class project could do that it a month. Survey monkey could do it. Sure it might be shitty and scaling the backend tricky. But not very tricky. Now spend a million ir even two and you could do it well.
Indeed, small difference of large numbers is only a problem for computers who brute force things. Quad precision is for wimps. Engineers had lots of tricks for re-writing equations so that the terms would naturally sum to a small number without large intermediates. I recall learning 4 different ways to write the quadratic formula that would avoid cases where b^2-4ac was the difference of large numbers or -b + sqrt(b^2-4ac) was the difference of large numbers. Since comuters I don't think I've ever seen that used. it's always coded with the textbook -b + sqrt(b^2-4ac). This is also why many eignenvalue algorithms give signular results in modern compuations. People don't spend the time to figure out how to avoid those precision level differences.
Expensive slide rules sometimes get to 4 digit logs with very fine rulling and very long slide bars. But some manage to get extra precision using a vernier scale. You should see if yours has a vernier on it, most do. It's a clever trick.
The digital caliper replaced the analog caliper. I miss the vernier scale, a really clever invention to squeeze out one more digit of precision than one would think possible. I doubt most kids today have any idea what a vernier scale is. The differential micrometer is another very clever device which works like a mechanical version of the vernier. My guess someone thought of it after seeing a vernier scale.
there were lots of possible slide bars. Ones for trigonometry, etc.. Most of the boeing planes up to the 747 were made by engineers who still used slide rules. Sure there were calculators and computing machines too, but slide rules were still in use by old timers.
The articles says that the high res cameras and GPS will be a deterrent. One could imagine thieves wearing balaclavas, but still things like gate-analysis or stand-off iris imaging will put the thieves at risk of being spotted by other cameras later and then identified in the future. Perhaps it will also employ something like an electric eel defense.
In any case the economics of this are odd. They state that it will allow deliveries for $1 which they claim is 15 times less than a human driver. Empirically it does not cost $15 to deliver most packages. However that depends on what you are comparing too. A 40 pound package might easily cost over $25 to deliver. And if were talking same day delivery then more. On the otherhand, the last mile is not the only cost and I would assume that for heavy and same day delivery that much of the cost is in the upstream logistics. So perhaps this is meant for same day delivery of relatively heavy items where perhaps $15 makes sense.
An additional cost savings is in the return of items. Being able to return things easily even if it is not free is a major incentive to shop stores like Sierra Trading post, Amazon and others with liberal return policies. Being able to do that return for free would revolutionize the industry.
Thus I would conclude the big deal here is the low cost of returns.
Apple and Google I think won't mind this too much. I suspect they wanted to force the issue that the government has to come out and say, we will search e-mails rather than putting the squeeze on apple privately to sell out their customers with secret deals. If they get caught like AT&T did, it makes them look like crap and it doesn't hurt their competitors equally. Now if apple turns over a message they can just say every does it because its the law, and that's a fact. The "unbreakable" encryption part was probably inconvenient for gathering data. Apple I suspect still wants data, to make siri smarter, and searches more relevant. Google wants data because using it to sell improved advertising is their bussiness.
Coincidentally slashdot deals has a banner ad for the christopher walken (Pulp fiction) solution to taking your computer across the boarder privately. https://deals.slashdot.org/sal... Which is just the right size to hide any place you could fit a wristwatch.
If any one was wondering why google is pushing project Fi, behold! Google of course is best positioned not only to sell this data but correlate it across all your other tracking data. If you use their DNS or chrome or search hints at home or on their browser then they know every website you visit. Project Fi completes their mobile domination of your personal movement. No wonder the price of Project Fi is attractive.
What I would like is a device that I could connect to something like an arduino or small cortex board with lots of digital and analog I/O. It needs to run my programs very easily from some simple environment like the arduino or processing IDEs. And if possible I'd like it to just start up or wake-from-sleep into my code without a lot of overhead.
Then I'd have the ultimate interface device. Sure I can attach screens to my Pi or arduino now but with cell phones and now tablets getting ridiculously cheap, having their own WiFi and sound and very high quality multi-touch screens plus batteries it makes no sense to use anything else as the cpu. what's missing is the I/O. The problem is that usually these devices want to load a heavyweight operating environment, run your app only when a user clicks it, and generally phone home to the mothership at amazon.
So if this device is not a loss leader for Amazon then I'm hoping someone creates such an IDE so I can use it as an appliance.
THanks. I've read that. I think if you look at some of my other responses above that you will see that I agree that C.U.P problems are not of interest to humans. The real issue is not that. The issue is defining which algorithms are useful for which kinds of subsets of problems (each of which is non C.U.P. but in a different way. That is, unless you can say what your algorithm is good for (clearly) then all you are doing us fracturing the subspace of all problems into subsets but you still don't know when to use that algorithm. There can't be a universal algorithm that is best for all cases.
Yes I did explain in the follow up post to my original post. I said that the paper made no such claim. Instead it was the headlines that made the claim.
Right. I don't disagree. The sublty of the NFL theorem is not in its crass conclusion that brute force is just as good as anyhting else. it's in the realization first that algorithms only work on subspaces well. And that the real trick is not in creating the algorithm but being able to say what subspace it works well on.
TO see that consider the following thought experiment. imagine we divide the space of all problems into many subspaces and on each of these subspaces we have a very fast algorithm. Surely then this defeats the NFL theorem? No. The time it takes to decide what subspace solver to use, plus the time it takes to use it would be the same as brute force.
Thus anytime one says an algorithm is better, if you can't say when it's better, one hasn't quite solved the issue.
The escape clause is that it's very likely the class of problems humans care about is a very small subspace of all problems. But defining what that meansis hard.
This "non-repeat" issue is a red herring. The proviso on "not repeat" is not meant to exclude reversable trajectories. It simply means that going back to an earlier state will be a free-pass in terms of counting moves. We won't add that to your count. So this has nothing to do with the applicability of the no-free-lunch theorem
I totally agree with you. I indeed left out the issue of incompressible optimization functions and their infeasibility in memory. More generally, I also believe that the NFL threorem does not imply despair but rather that the problems that interest humans are a tiny subspace of all possible problems. On this subspace better average performance is entirely plausible. The rub, is that to make that claim you have to be able to specify what the subspace of interest to humans in. I think you might be able to define that in terms of algorithmic complexity. But that alone doesn't tell you if your algorithm must be better than another one on all problems of a specified maximum complexity. Thus any time someone says their algorithms outperform without telling us in what subspace they achieve this there is no proof of superior performance. Since the paper actually does make this claim it's confusing to sort out how they are limiting the problem. Likely this is my own failing at understanding their mathermatical definitions.
to be more complete. the NFL theorem ignores the computational cost of deciding the next move in the search as well. so really the NFL is counting the average number of guesses required not the time.
Your statement that NFL relies on all problems being equally likely is very close to equvalent to my statement that the subspace of problems of interest to humans is smaller than the subspace of all problems. So I think were in agreement.
The NFL does not imply optimization is hopeless. It just says a claim of improved speed is dubious till you tell us what subspace of problem you improved it on. This is what I stated to begin with. If an algorithm is faster on some problems then it's slower on most other problems.
Averaged over all problems, the time it takes to figure out which special case one has plus the time to solve that special case has to be the same as any other algorithm. The escape clause is if when you say "cutting plane" you are intrisically restricting the problem to a special case. In that case alrgorithmic speedups are plausible. Likewise is cutting plane is a class of algorithm and you are comparing variations on that particular class of algorithm then speed ups are again plausible. But if all possible problems are addressed as cutting plane problems then No it is not possible for there to be any average speed up.
But please realize that all is not lost. It is very likely that all problems of interest to humans is a tiny set of the space of all possible problems. It may well be that we are only interested in a very special subspace. I'd even bet on it. But defining what that subspace is will be hard.
To be fair, the actual article doesn't claim to violate the no free lunch theorem because it only applies to "some" problems. It's the headline that violates the no free lunch theorem. But one aspect of the no-free-lunch theorem that usually proves to be insanely diabolical is that it turns out that it's usually VERY hard to define which problems an alogorithm will do best on in general. Often you can see a specific set for which it is obvious. But as you complicate things the obviousness part goes out the window. FOr example, if you know your surface is convex and has a very simple shape like a multidimensional parabola, then it's trivial to hill descend to the bottom. But once you start putting in multiple minima and arbitrary discontinuities it becomes hard. Interestingly in high dimensional spaces know something is smooth isn't very useful as smoothness only removes a low number of degrees of freedom--leaving behind a smaller but still high dimensional space.
Thus if there is a breakthrough here it's not the algorithm, it would be the ability to specify what kinds of problems this will work well on!
I guess they never heard of the "No Free Lunch" theorem for optimization, which believe it or not is the name of a proven theorem by David Wolpert that says the following is rigrorously true. Averaged over all optimization problems every possible algorithm (that does not repeat a previous move) takes exactly the same amount of time to find the global minimum.
The collorary for this is that: If you show your algorithm outperforms other algorithms on a test set, then you have just proven it performs slower on all other problems.
That is the better it works on a finite class of problem, the worse it is on most problems.
it is even true that a hill climbing algorithm blunders into the minimum in the same average time as a hill descending algorithm. (yes that is true!).
Look it up. It's the bane of people who don't understand optimization.
The only things that distinguishes one optimization algorithm from another is: 1) if your problems are restricted to a subspace with special properties that your algorithm takes advantage of 2) your performance metric is different then the average time it takes to find the global minimum. for example, something that limits the worst case time to get within epsilon of the minimum.
Article doesn't answer the question of why?
Even the initial premise it would cost half a billion to digitize 90 forms and keep a data base seems absurd. What is so special that it has needs that would cost that much. A high school class project could do that it a month. Survey monkey could do it.
Sure it might be shitty and scaling the backend tricky. But not very tricky. Now spend a million ir even two and you could do it well.
A billion? Why?
VW engineers say they were only following orders. Where have I heard that before.
Nifty! I worked at San Jose RL in the 80s and also Yorktown with the Fellow who invented the permalloy readhead.
Indeed, small difference of large numbers is only a problem for computers who brute force things. Quad precision is for wimps. Engineers had lots of tricks for re-writing equations so that the terms would naturally sum to a small number without large intermediates. I recall learning 4 different ways to write the quadratic formula that would avoid cases where b^2-4ac was the difference of large numbers or -b + sqrt(b^2-4ac) was the difference of large numbers. Since comuters I don't think I've ever seen that used. it's always coded with the textbook -b + sqrt(b^2-4ac). This is also why many eignenvalue algorithms give signular results in modern compuations. People don't spend the time to figure out how to avoid those precision level differences.
Expensive slide rules sometimes get to 4 digit logs with very fine rulling and very long slide bars. But some manage to get extra precision using a vernier scale. You should see if yours has a vernier on it, most do. It's a clever trick.
The digital caliper replaced the analog caliper. I miss the vernier scale, a really clever invention to squeeze out one more digit of precision than one would think possible. I doubt most kids today have any idea what a vernier scale is. The differential micrometer is another very clever device which works like a mechanical version of the vernier. My guess someone thought of it after seeing a vernier scale.
there were lots of possible slide bars. Ones for trigonometry, etc.. Most of the boeing planes up to the 747 were made by engineers who still used slide rules. Sure there were calculators and computing machines too, but slide rules were still in use by old timers.
The articles says that the high res cameras and GPS will be a deterrent. One could imagine thieves wearing balaclavas, but still things like gate-analysis or stand-off iris imaging will put the thieves at risk of being spotted by other cameras later and then identified in the future. Perhaps it will also employ something like an electric eel defense.
In any case the economics of this are odd. They state that it will allow deliveries for $1 which they claim is 15 times less than a human driver. Empirically it does not cost $15 to deliver most packages. However that depends on what you are comparing too. A 40 pound package might easily cost over $25 to deliver. And if were talking same day delivery then more. On the otherhand, the last mile is not the only cost and I would assume that for heavy and same day delivery that much of the cost is in the upstream logistics. So perhaps this is meant for same day delivery of relatively heavy items where perhaps $15 makes sense.
An additional cost savings is in the return of items. Being able to return things easily even if it is not free is a major incentive to shop stores like Sierra Trading post, Amazon and others with liberal return policies. Being able to do that return for free would revolutionize the industry.
Thus I would conclude the big deal here is the low cost of returns.
Apple and Google I think won't mind this too much. I suspect they wanted to force the issue that the government has to come out and say, we will search e-mails rather than putting the squeeze on apple privately to sell out their customers with secret deals. If they get caught like AT&T did, it makes them look like crap and it doesn't hurt their competitors equally. Now if apple turns over a message they can just say every does it because its the law, and that's a fact. The "unbreakable" encryption part was probably inconvenient for gathering data. Apple I suspect still wants data, to make siri smarter, and searches more relevant. Google wants data because using it to sell improved advertising is their bussiness.
Rupert Murdoch's propaganda outlets do much the same in multiple countries.
Stand back man that SSD is whipping around.
Coincidentally slashdot deals has a banner ad for the christopher walken (Pulp fiction) solution to taking your computer across the boarder privately.
https://deals.slashdot.org/sal...
Which is just the right size to hide any place you could fit a wristwatch.
WTF? this is just a link to a logo for COreboot. no explanation of what it is or what makes it different other than just saying "its secure".
If any one was wondering why google is pushing project Fi, behold! Google of course is best positioned not only to sell this data but correlate it across all your other tracking data. If you use their DNS or chrome or search hints at home or on their browser then they know every website you visit. Project Fi completes their mobile domination of your personal movement. No wonder the price of Project Fi is attractive.
What I would like is a device that I could connect to something like an arduino or small cortex board with lots of digital and analog I/O. It needs to run my programs very easily from some simple environment like the arduino or processing IDEs. And if possible I'd like it to just start up or wake-from-sleep into my code without a lot of overhead.
Then I'd have the ultimate interface device. Sure I can attach screens to my Pi or arduino now but with cell phones and now tablets getting ridiculously cheap, having their own WiFi and sound and very high quality multi-touch screens plus batteries it makes no sense to use anything else as the cpu. what's missing is the I/O. The problem is that usually these devices want to load a heavyweight operating environment, run your app only when a user clicks it, and generally phone home to the mothership at amazon.
So if this device is not a loss leader for Amazon then I'm hoping someone creates such an IDE so I can use it as an appliance.
Maybe this has been done?
The coast of the united kingdom is exactly 43 long, but I'm still working out what undiscovered units that is in. :-)
THanks. I've read that. I think if you look at some of my other responses above that you will see that I agree that C.U.P problems are not of interest to humans. The real issue is not that. The issue is defining which algorithms are useful for which kinds of subsets of problems (each of which is non C.U.P. but in a different way. That is, unless you can say what your algorithm is good for (clearly) then all you are doing us fracturing the subspace of all problems into subsets but you still don't know when to use that algorithm. There can't be a universal algorithm that is best for all cases.
Yes I did explain in the follow up post to my original post. I said that the paper made no such claim. Instead it was the headlines that made the claim.
Right. I don't disagree. The sublty of the NFL theorem is not in its crass conclusion that brute force is just as good as anyhting else. it's in the realization first that algorithms only work on subspaces well. And that the real trick is not in creating the algorithm but being able to say what subspace it works well on.
TO see that consider the following thought experiment. imagine we divide the space of all problems into many subspaces and on each of these subspaces we have a very fast algorithm. Surely then this defeats the NFL theorem? No. The time it takes to decide what subspace solver to use, plus the time it takes to use it would be the same as brute force.
Thus anytime one says an algorithm is better, if you can't say when it's better, one hasn't quite solved the issue.
The escape clause is that it's very likely the class of problems humans care about is a very small subspace of all problems. But defining what that meansis hard.
This "non-repeat" issue is a red herring. The proviso on "not repeat" is not meant to exclude reversable trajectories. It simply means that going back to an earlier state will be a free-pass in terms of counting moves. We won't add that to your count. So this has nothing to do with the applicability of the no-free-lunch theorem
I totally agree with you. I indeed left out the issue of incompressible optimization functions and their infeasibility in memory. More generally, I also believe that the NFL threorem does not imply despair but rather that the problems that interest humans are a tiny subspace of all possible problems. On this subspace better average performance is entirely plausible. The rub, is that to make that claim you have to be able to specify what the subspace of interest to humans in. I think you might be able to define that in terms of algorithmic complexity. But that alone doesn't tell you if your algorithm must be better than another one on all problems of a specified maximum complexity. Thus any time someone says their algorithms outperform without telling us in what subspace they achieve this there is no proof of superior performance. Since the paper actually does make this claim it's confusing to sort out how they are limiting the problem. Likely this is my own failing at understanding their mathermatical definitions.
to be more complete. the NFL theorem ignores the computational cost of deciding the next move in the search as well. so really the NFL is counting the average number of guesses required not the time.
Your statement that NFL relies on all problems being equally likely is very close to equvalent to my statement that the subspace of problems of interest to humans is smaller than the subspace of all problems. So I think were in agreement.
The NFL does not imply optimization is hopeless. It just says a claim of improved speed is dubious till you tell us what subspace of problem you improved it on. This is what I stated to begin with. If an algorithm is faster on some problems then it's slower on most other problems.
Averaged over all problems, the time it takes to figure out which special case one has plus the time to solve that special case has to be the same as any other algorithm. The escape clause is if when you say "cutting plane" you are intrisically restricting the problem to a special case. In that case alrgorithmic speedups are plausible. Likewise is cutting plane is a class of algorithm and you are comparing variations on that particular class of algorithm then speed ups are again plausible. But if all possible problems are addressed as cutting plane problems then No it is not possible for there to be any average speed up.
But please realize that all is not lost. It is very likely that all problems of interest to humans is a tiny set of the space of all possible problems. It may well be that we are only interested in a very special subspace. I'd even bet on it. But defining what that subspace is will be hard.
https://en.wikipedia.org/wiki/...
To be fair, the actual article doesn't claim to violate the no free lunch theorem because it only applies to "some" problems. It's the headline that violates the no free lunch theorem. But one aspect of the no-free-lunch theorem that usually proves to be insanely diabolical is that it turns out that it's usually VERY hard to define which problems an alogorithm will do best on in general. Often you can see a specific set for which it is obvious. But as you complicate things the obviousness part goes out the window. FOr example, if you know your surface is convex and has a very simple shape like a multidimensional parabola, then it's trivial to hill descend to the bottom. But once you start putting in multiple minima and arbitrary discontinuities it becomes hard. Interestingly in high dimensional spaces know something is smooth isn't very useful as smoothness only removes a low number of degrees of freedom--leaving behind a smaller but still high dimensional space.
Thus if there is a breakthrough here it's not the algorithm, it would be the ability to specify what kinds of problems this will work well on!
I guess they never heard of the "No Free Lunch" theorem for optimization, which believe it or not is the name of a proven theorem by David Wolpert that says the following is rigrorously true.
Averaged over all optimization problems every possible algorithm (that does not repeat a previous move) takes exactly the same amount of time to find the global minimum.
The collorary for this is that: If you show your algorithm outperforms other algorithms on a test set, then you have just proven it performs slower on all other problems.
That is the better it works on a finite class of problem, the worse it is on most problems.
it is even true that a hill climbing algorithm blunders into the minimum in the same average time as a hill descending algorithm. (yes that is true!).
Look it up. It's the bane of people who don't understand optimization.
The only things that distinguishes one optimization algorithm from another is:
1) if your problems are restricted to a subspace with special properties that your algorithm takes advantage of
2) your performance metric is different then the average time it takes to find the global minimum. for example, something that limits the worst case time to get within epsilon of the minimum.