How I Completed The $5000 Compression Challenge
Patrick Craig points to this page which details (in email) the strange life of an online challenge as posed, clarified and (at least to some eyes) met successfully. This should give pause to anyone willing to pose -- or accept -- seemingly impossible problems. In short, Patrick's not $5,000 richer, but you may think he deserves to be.
Again, that depends on what you mean by "paper airplane." Admittedly, I won a similar contest involving rubber-band driven cars. The contest rules said that CDs or DVDs could not be used for tires due to the fact that they make for very narrow tires, thus having reduced friction. So I used hard drive platters. :-)
My journal has hot
However, the decompressor DOES depend on the filename numbering in order to work; it is part of the algorithm (and hence the "decompressor"), and those bytes (the bytes used to number the files) should be charged against the decompressor size. Then Patrick loses.
While not explicitly stated, those bytes ARE an integral part of the "decompressor" that was supplied (it can't work without them), and so including their size is consistent with the agreed upon rules, IMO.
"It's overkill, of course. But you can never have too much overkill." - Anonymous Slashdot Coward
Its an intriquing problem. I certainly wouldn't stake $5000 on nobody being able to compress the data I supplied, good random number generator or not. Why? because while the concept that random data in general cannot be compressed is correct, there are no patterns, it is not necessarily true that a given set of random data cannot be compressed.
For example, a file full of 0's, is perfectly possible from a true random number generator, and just as likely as any other set of numbers. As any good geek knows, a file full of 0's compresses very very well.
In this case, I think the challenge has one major problem, and that is that there is no real time limit on the entry, and no requirement that the decompressor work effectively in general, it can be entirely specialised.
This means that the problem *may* be solvable, simply by throwing ungodly amounts of CPU power at it. With the appropriate code, it would be possible to simply hunt for unintentional patterns in the data and hope. If you're lucky, there will be enough indexable patterns present that it will outweigh the cost of the indexing, and the code necessary to replace the index points.
If you're unlucky, the data will have no patterns in the space you searched, and therefore you're out of luck.
This is why the guy noted that if he got a big enough file, he could win the competition. As the filesize increases, it becomes increasingly likely that there are, accidentally, enough patterns in the data to cover the space required for the decompressor and the dictionary. Of course, as the filesize increases, you also have to burn considerably more CPU in order to locate the patterns in the first place.
Personally I think it would make an excellent project for a research team, not particularly for the money, but in order to create some really fast, specialised code for locating patterns in assumedly random data.
I am not aware of much effort in that area, most compression research goes into identifying patterns inherent in the data type, waveforms in audio, visual patterns in images, duplicate strings in text and binary files. It would be very interesting to see just how effective fast analysis code and heavy CPU is against the problem.
I suspect there may well be some crossover here with Cryptoanalysis, since one of the primary jobs in breaking crypto is finding the patterns that differentiate the ciphered data from pure randomness.
Any links appreciated.
You can't win a fight.
If you are for the challengee, then you believe that exact specificity is more important than intent. And you inadvertantly empower the legal profession to profit from being better able to specify rules, etc. Which, in turn, gives those with money even more power. They can write the rules by which your interaction with them will be governed. You can be certain that the rules will be written to the advantage of the guy with the money.
If, on the other hand, you are for the challenger, then you believe that his intent was more important. (Considering that the challenge came from the compression FAQ, the exact definition of "compression" should have not been vague. It seems clear to me that his intent was obvious, even though his challenge wasn't very specific.) Still there is a problem with valuing intent over specificity. That is it's easy to misrepresent intent after the fact. Thus the need to clearly specify our intentions prior to anything we do. That being the case, with accurate documentation, it seems possible to me that we can discern the original intent w/out having to specify everything legalistically. For example, I think that I was able to clearly determine the challenger's intent well after the fact, and I did this having read the challengee's representation of the events.
Now you may say, in rebuttal, "Who cares who I choose? I don't get to make the law." But think of yourself as the jury trying to determine to whom you would award the money. You may one day find yourself on a jury and trying to determine who should be awarded money. I see this as a microcosm of exactly how our court system has gotten into the state that it's in. Specificity is clearly more important than intent. And it's that way because juries award that way. Since juries are comprised of members of society, it's not a far stretch to say that society as a whole regards specificity as more important than intent.
Is there any chance that we can change this? And if we did, would it be better?
Key to financial independence: Spend less than you earn. Save and invest the difference. Do it for a long time.
.
Patrick I meant can I send you a compressor and several compressed files whose total file size is less than the original uncompressed file and from which I can regenerate the original uncompressed fileMike Sure.
Pay up Mike. Pay up. You were too careless to plan out the rules and think about your intent before you threw 5k into the sink. Cough up the cash or loose all honor.
Someone you trust is one of us.
When Goldman gleefully posted to the newsgroup that someone had accepted his challenge, he said "some people just don't understand information theory too well". Yeah, and some people just don't understand issuing challenges very well, either!
When you make a challenge like that you absolutely must be sure to word the rules carefully and try to avoid leaving any loopholes. In this case, all Goldman really needed to do was tack a sentence at the end of his original challenge saying "the data must actually be compressed according to the definition of 'compression' contained in this FAQ. Challengers who merely point out loopholes or exploits in the rules of the challenge will not be rewarded".
The moment Mr. Craig started iquiring about about the specifics of the rules and saying "wait and see" regarding his methods, that should have been a warning sign to Goldman that Craig had some trick up his sleeve. But no, Goldman was just so in love with the idea that he found some "impossible" task which suckers might try to undertake that he couldn't be bothered to think about such things.
What's even more embarassing to Goldman is his initial reaction to Craig's submission:
Even though Craig had already asked if more than one file was okay and Goldman had said yes!And Goldman's following point:
is even more ridiculous given that Goldman had already said that "sure" in response to Craig's question about multiple files whose "total file size" (which does not include directory entries) are less than the size of the original data file. Goldman tried to change the rules rather than admit he was sloppy in setting them up.Goldman also says "I think that I did not make a mistake", even though he admits in the same message that he needs to restate the rules:
"Perhaps"? How about "definitely"! And any "miscommunication" was entirely one-way. Craig was very sly and clever, and Goldman was too full of himself to realize what the fuck was going on until after the fact. A "pompus ass" indeed.Free Hans!
--
--
We have fought the AC's, and they have won.
Ok, I've done some information theory, I know it's impossible to create a perfect compressor. Boxes arguments and all that.
However, to make money, you don't have to.
All that is required is a compresser that saves 1 bit on better than 1 in 50 attempts.
For example, if we can save 1 bit on 50% of the attempts - even if the other 50% of the attempts results it catasrohpically large files. On average we've lost the compression war.
However, on 100 attempts, we've paid $100 x 100 or $10000 but we've won on half of these - netting us $250000 or a cool $240000 profit.
Anyone want to demonstrate this?
Only two things are infinite, the universe and human stupidity, and I'm not sure about the former. (Einstein)
> total file size is less than the original uncompressed file and from
> which I can regenerate the original uncompressed file
>
> Patrick
Sure -- but you send me a *decompressor*, I don't need the compressor.
Sure means Sure. As far as I can tell this is acceptance of Patrick's request. As Mike is the guy inventing the rules, I don't see why this wouldn't be binding.
If the guy had read the FAQ, he would know that such tricks are not compression
Did the rules as stated say that the solution had to comply with the principle laid out in the FAQ? Not that I can see. If you're going to pup up a $5,000 challenge, you damn well better post all the rules.
For instance: I had a friend who entered a paper airplane contest. He signed up for the "maximum distance" part. The only rules were that the contest organizers supplied the single piece of paper, you had five minutes to fold or tear it however you want to, and then everyone threw them at the same time. The one that stops moving farthest from the launch line wins.
He waited until 4:45 into the time, then crumpled the paper up into a ball and threw it. While everyone else's planes were circling lazily around, his went straight, landed, and rolled several more feet.
While you can debate all you want about whether his wad of paper counted as a "paper airplane" -- and we did debate (hey, it was college) -- he did saitsfy the rules of the contest. Had there been enough money on the line for someone to threaten legal action, I suspect he would have won.
Nope, no sig