A Fictional Compression Metric Moves Into the Real World
Tekla Perry (3034735) writes "The 'Weissman Score', created for HBO's "Silicon Valley" to add dramatic flair to the show's race to build the best compression algorithm, creates a single score by considering both the compression ratio and the compression speed. While it was created for a TV show, it really does work, and it's quickly migrating into academia. Computer science and engineering students will begin to encounter the Weissman Score in the classroom this fall."
A "combined score" for speed and ratio is useless, as that relation is not linear.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
I thought I read an article the other day that said their algorithm seemed plausible on the surface but would eventually begin to fall apart?
The so-called Weissman score is just proportional to (compression ratio)/log(time to compress).
I guess the idea is that twice as much compression is always twice as good, while increases in time become less significant if you're already taking a long time. For example, taking a day to compress is much worse than taking an hour, but taking 24 days to compress is only somewhat worse than taking one day since you're talking offline/parallel processing anyway.
The log() seems kind of an arbitrary choice, but whatever. It's no better or worse than any other made-up metric, as long as you're not taking it too seriously.
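To make the shape of that formula concrete, here is a minimal Python sketch. It assumes the commonly cited form W = alpha * (r / r_ref) * (log T_ref / log T), where r and T are the ratio and time of the algorithm under test and r_ref, T_ref are those of a reference compressor. The function name, the use of seconds as the time unit, and all sample numbers are illustrative assumptions, not taken from the article.

```python
import math

def weissman_score(ratio, seconds, ref_ratio, ref_seconds, alpha=1.0):
    """Sketch of W = alpha * (r / r_ref) * (log T_ref / log T).

    Times are in seconds here (an assumption). The result depends on the
    unit, and the formula breaks down when a time is at or below 1 unit.
    """
    if seconds <= 1 or ref_seconds <= 1:
        raise ValueError("log-based score needs times greater than 1 unit")
    return alpha * (ratio / ref_ratio) * (math.log(ref_seconds) / math.log(seconds))

HOUR = 3600
DAY = 24 * HOUR

# Hypothetical reference compressor: ratio 2.0, takes one hour.
ref = dict(ref_ratio=2.0, ref_seconds=HOUR)

same = weissman_score(2.0, HOUR, **ref)            # identical to the reference
double_ratio = weissman_score(4.0, HOUR, **ref)    # twice the compression
one_day = weissman_score(2.0, DAY, **ref)          # 24x slower
many_days = weissman_score(2.0, 24 * DAY, **ref)   # 576x slower

print(same, double_ratio, one_day, many_days)
```

Under these assumptions, doubling the ratio doubles the score, while each further 24x slowdown shaves off a smaller slice of it. How large those slices look depends entirely on the time unit fed to the log, which is one reason not to take the number too seriously.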
From the article:
Misra came up with a formula
An algorithm can compress data quickly and fit it into a small number of bytes, but that doesn't mean what comes out the other end is recognizable. Without adding a weighting for lossiness, this "Weissman Score" has no merit whatsoever. Using the "Weissman Score", MP3 is always better than FLAC, and that's completely untrue for anyone who cares about audio.
Additionally, new generations of video encoders would arguably be "worse" under this weighting system compared to older generations, as improvements in video encoding are currently rather incremental, generally with massive speed penalties as they require significantly higher numbers of CPU cycles to burn through the algorithms required to compress efficiently at low bitrates while maintaining very little distortion/lossiness.
Again, this score doesn't matter because in the end, a compression algorithm is only as good as what comes out the other side.
He said it did work, it's just not as effective as other existing compression solutions.
Not only does it fail to account for loss or distortion, it also fails to consider the time to decompress. If a compression algorithm with a high Weissman score is applied to a video, it is useless if the result cannot be decompressed fast enough to show the video at an appropriate frame rate.
Aside from centering around Silicon Valley, I don't see how these stories are related. That one is about a fictional compression algorithm, while this one is about a method for rating compression algorithms which is becoming nonfiction.
Two scores would be useful, one for compression_time:size and one for decompression_time:size, since the latter is more important in many compress-once, consume-many applications.
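A rough way to collect both numbers is to time each direction separately. The sketch below uses Python's standard zlib module purely as a convenient stand-in compressor; the helper name `measure` and the sample data are made up for illustration, and the timings will vary from machine to machine.

```python
import time
import zlib

def measure(data, level):
    """Return (ratio, compress_seconds, decompress_seconds) for zlib at `level`."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    compress_seconds = time.perf_counter() - start

    start = time.perf_counter()
    unpacked = zlib.decompress(packed)
    decompress_seconds = time.perf_counter() - start

    if unpacked != data:  # zlib is lossless, so the round trip must match
        raise RuntimeError("round trip failed")
    return len(data) / len(packed), compress_seconds, decompress_seconds

# Hypothetical, highly repetitive sample input.
sample = b"the quick brown fox jumps over the lazy dog\n" * 20000

for level in (1, 6, 9):
    ratio, c_time, d_time = measure(sample, level)
    print(f"level {level}: ratio {ratio:.1f}, "
          f"compress {c_time * 1e3:.2f} ms, decompress {d_time * 1e3:.2f} ms")
```

Reporting the compress and decompress times side by side, rather than folding them into one figure, lets a compress-once, consume-many user weight the second column more heavily.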
They're talking about the Score, not the compression algorithm. And your link doesn't mention anything about the Score.
The fictional compression algorithm doesn't work. The metric for rating compression algorithms does work (inasmuch as more compressed/faster algorithms achieve a better rating).
IIRC, the Drake equation was also a 'spitball' solution whipped off the cuff to address an inconvenient interviewer question. Subsequent tweaks have made it as accurate and reliable as when it was first spat out upon the world - and about as useless.
exactly. The compression algorithm is fictional; the score, while created for the show, can actually be calculated. Whether it will catch on as a metric remains to be seen.
Show About Self-Absorbed Assholes Who Think Their Stupid Ideas Are The Bee's Knees Gains Popularity By Making Their Stupid Idea Sound Like It's The Bee's Knees
Somebody should explain that to Professor Tsachy Weissman and Ph.D. student Vinith Misra, who specifically stated it doesn't really work, and then school them on it.
The compression algorithm is fictional and does not work. That is what your linked article discusses.
This is about the Weissman Score.
No metric is adequate for all purposes. This one is adequate for the task it was designed for, and is adequate for some other purposes as well. That's the best that can be expected of any tool. Always use the appropriate tools for the task at hand, of course.
"Convictions are more dangerous enemies of truth than lies."
What doesn't "really work" is the fictitious compression algorithm developed on the show. The "Weissman Score" metric, however, does work in assigning a compression algorithm a somewhat valid score.
“We had to come up with an approach that isn’t possible today, but it isn’t immediately obvious that it isn’t possible,” says Misra.
Please explain why you think that means he said "it does work".
Where's our TV show?
The compression algorithm doesn't work, the compression and speed metric does. It does give arbitrary amounts of importance to compression and to speed, but Americans are used to arbitrary metrics.
Holy shit! Math works! Somehow, I don't think you can have a discussion about if a formula really returns a result or not. I now see that the idiot who wrote the summary was trying to say that the algorithm doesn't work, but math does. Alas that idiot has no ability to write. ... oh wait, it was you! Never mind.
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
Yes. That's the point, isn't it. They didn't invent math for the show. Claiming that a score "works" has no meaning, other than to say that math "works". Therefore, the only interpretation of the hideously poor writing is that the submitter is claiming the algorithm works.
Yes. Math "works". News at 11!
Why am I reminded of this Mexican ad when I read this?
https://www.youtube.com/watch?v=vqgSO8_cRio
Sounds a bit like the F1 measure used in classification systems, where the F-score is the harmonic mean of precision and recall (and where pushing for higher precision yields lower recall, and vice versa) ;-)
However, I'm wondering how stable this Weissman score is. Compression algorithms might not all run in O(n), where n is the size of the data to compress.
Or it may actually give a very high score to something that doesn't compress at all.
public byte[] compress(byte[] input) { return input; }
I bet this gets a high Weissman score
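Whether that bet pays off depends on the time unit and the reference compressor. The Python sketch below assumes the score has the form alpha * (r / r_ref) * (log T_ref / log T) and feeds it a do-nothing compressor. The reference figures and the candidate timings are invented for illustration, not measured.

```python
import math

def identity_compress(data: bytes) -> bytes:
    """Python stand-in for the one-liner above: returns the input unchanged."""
    return data

def weissman_score(ratio, t, ref_ratio, ref_t, alpha=1.0):
    # Assumed form: alpha * (r / r_ref) * (log T_ref / log T).
    return alpha * (ratio / ref_ratio) * (math.log(ref_t) / math.log(t))

data = bytes(1_000_000)
out = identity_compress(data)
ratio = len(data) / len(out)  # exactly 1.0: nothing was saved

# Hypothetical reference compressor: ratio 2.0 in 100 ms.
REF_RATIO, REF_MS = 2.0, 100.0

# Hypothetical timings for the identity function, in milliseconds.
scores = {ms: weissman_score(ratio, ms, REF_RATIO, REF_MS)
          for ms in (50.0, 2.0, 1.1, 0.5)}

for ms, score in scores.items():
    print(f"{ms} ms -> {score:.2f}")
```

With these made-up numbers the score climbs steeply as the time approaches one unit and flips sign below it, so a no-op "compressor" can indeed outscore the reference despite saving nothing.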
Claiming that a score "works" has no meaning,
I could easily devise a CPU scoring methodology that scores CPUs based on chip area / cost * clock speed / register width.
Such a score "works" in the sense that the function can be evaluated, but it wouldn't tell you anything about whether to buy an i7 vs a xeon vs a pentium 2.
The suggestion in the article is that the particular scoring methodology that was created for the show is useful for comparing compression algorithms, to the point that it may well be adopted by industry.
Therefore, the only interpretation of the hideously poor writing is that the submitter is claiming the algorithm works.
The writing was perfectly fine, your reading comprehension is what failed here.
Yes. He failed to comprehend that the submitter was pointing out that math really works, and a ratio of compression over time really does express a ratio.
Oh boy. A useless metric!
Compression ratio: Sure. But the problem is, it's possible to increase compression ratio by "losing" data. So you can obtain a high ratio, but the images as rendered will be blurry/damaged.
Compression Speed: This is just as dumb since compression speed is partially a function of the compression ratio, partially a function of the efficiency of the algorithm and partially a function of the amount of "grunt power" hardware you throw at it. So one portion of this is a nebulous "hardware norm" factor that can be gamed. The other is a function of the other factor (compression ratio) which can ALSO be gamed (and creates a bias towards lossy compression).
Basically something with a high Weissman number would be extremely lossy compression on high-power hardware. Which basically negates the point of high-resolution viewing, as any idiot can reduce a 1920x1080 frame to 19px by 11px, and then compress it. I can already take precompressed (and lossy) JPEG files, resample down to 19x11, then back up to 1920x1080. I can wind up reducing a 930K file down to 40K (basically a 95+% savings). And the image is completely indecipherable.
Take a look at an original image versus the same image on the above-described UCCT (UltraCrappyCompressionTechnique).
http://cox-supergroups.com/The...
The above image is a PNG to prevent further compression artifacts from creeping into the sample.
The top portion of the image is the original 930K JPEG file.
The bottom portion is the resampled 40K JPEG file.
Chas - The one, the only.
THANK GOD!!!
Given that only a subset of Slashdot users are HBO subscribers, how is this relevant?
I want to delete my account but Slashdot doesn't allow it.
The reason there's no single metric available is because bandwidth isn't constant.
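I'll try to solve for a "best algorithm" given some different bandwidths, ignoring decompression time. Here X is the time to transfer the uncompressed file, and each F(X) is compression time plus the time to transfer the compressed file, all in seconds.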
F1(X): 14 + X*(1- 0.00001%)
F2(X): 20 + X*(1-15%)
F3(X): 29*60*60 + X*(1-15.1%)
solving pairwise:
F1(40 seconds) = F2(40 seconds)
F1(8 days) = F3(8 days)
F2(3.31 years) = F3(3.31 years)
If the file can be transferred in 7 seconds, algorithm 1 is the clear winner (23.6% faster than algorithm 2, and nearly 5000x faster than algorithm 3).
If the file can be transferred in 7 days, algorithm 2 is the clear winner (17.6% faster than algorithm 1, and 20.2% faster than algorithm 3).
If the file can be transferred in 7 years, algorithm 3 is a marginal winner (0.062% faster than algorithm 2, and 17.7% faster than algorithm 1). Note that 0.062% here is in the 30-40 hour range; you can get different answers depending on the number of seconds you use to compute 7 years.
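For anyone who wants to check the arithmetic, here is a short Python sketch. It reads each F(X) as compression time plus raw transfer time scaled by (1 - savings), which is my interpretation of the formulas above. The helper names and the 365.25-day year are assumptions.

```python
def total_time(compress_seconds, savings, raw_transfer_seconds):
    """Compression time plus the time to send the shrunken file."""
    return compress_seconds + raw_transfer_seconds * (1.0 - savings)

# (compression time in seconds, fraction of size saved), read off F1..F3.
ALGORITHMS = {
    "F1": (14.0, 0.0000001),        # 0.00001% savings
    "F2": (20.0, 0.15),
    "F3": (29 * 60 * 60.0, 0.151),  # 29 hours of compression
}

def crossover(a, b):
    """Raw transfer time X at which algorithms a and b take equally long."""
    (ca, sa), (cb, sb) = ALGORITHMS[a], ALGORITHMS[b]
    return (cb - ca) / (sb - sa)

def best(raw_transfer_seconds):
    """Name of the algorithm with the smallest total time at this X."""
    return min(ALGORITHMS,
               key=lambda name: total_time(*ALGORITHMS[name], raw_transfer_seconds))

DAY = 86400
YEAR = 365.25 * DAY  # assumed year length

print("F1 = F2 at", crossover("F1", "F2"), "seconds")
print("F1 = F3 at", crossover("F1", "F3") / DAY, "days")
print("F2 = F3 at", crossover("F2", "F3") / YEAR, "years")
print(best(7), best(7 * DAY), best(7 * YEAR))
```

The crossover points fall out of setting two of the linear functions equal, so the "best" algorithm is simply whichever line is lowest at the transfer time you actually have.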
Because. Everything is immediately obvious to slashdotters. QED.
No, he failed to comprehend that this particular method of calculating a ratio of compression over time is proving to be *useful*.
I couldn't watch the first episode. Quit maybe 10 minutes into it. Does anyone here actually enjoy the show and think it's any good?
C'mon now, equal rights for AMD here.
It's not the years, honey, it's the mileage. - Colonel Henry Walton Jones, Jr., Ph.D.