Slashdot Mirror


A Fictional Compression Metric Moves Into the Real World

Tekla Perry (3034735) writes "The 'Weissman Score', created for HBO's "Silicon Valley" to add dramatic flair to the show's race to build the best compression algorithm, creates a single score by considering both the compression ratio and the compression speed. While it was created for a TV show, it really does work, and it's quickly migrating into academia. Computer science and engineering students will begin to encounter the Weissman Score in the classroom this fall."
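
The summary does not give the formula. As it is commonly stated, the score is W = alpha * (r / r_ref) * (log(T_ref) / log(T)), where r and T are the compression ratio and time of the algorithm under test and r_ref and T_ref are those of a reference compressor. A minimal Python sketch under that assumption follows; the function name, parameter names, and the guard on times are illustrative, not taken from the article.

```python
import math


def weissman_score(ratio, seconds, ref_ratio, ref_seconds, alpha=1.0):
    """Assumed form: W = alpha * (r / r_ref) * (log(T_ref) / log(T)).

    ratio, seconds         -- compression ratio and time of the algorithm under test
    ref_ratio, ref_seconds -- the same measurements for a reference compressor
    alpha                  -- arbitrary scaling constant
    """
    # log(T) is zero at T == 1 and negative below it, so this sketch
    # rejects such times rather than return a meaningless score.
    if seconds <= 1 or ref_seconds <= 1:
        raise ValueError("times must be greater than 1 in the chosen unit")
    return alpha * (ratio / ref_ratio) * (math.log(ref_seconds) / math.log(seconds))


# Twice the reference ratio, but 100 time units instead of 10:
# 2 * (log 10 / log 100) = 2 * 0.5, so the score is about 1.0.
score = weissman_score(4.0, 100.0, 2.0, 10.0)
```

Because time enters only through a logarithm, a tenfold slowdown here costs exactly as much as a twofold gain in ratio earns. The result also depends on the unit used for time, which is one reason the weighting looks arbitrary.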

3 of 133 comments

  1. Re:It really works? by phoenix_rizzen · · Score: 5, Informative

    They're talking about the Score, not the compression algorithm. And your link doesn't mention anything about the Score.

  2. Re:Useless without measure of lossiness/distortion by retchdog · · Score: 4, Informative

    it's for lossless compression only.

    anyway, you can just add a term representing the lost information and throw it into this "score". hey, why not? just figure out how important the lossiness is relative to compression rate. if it's very important, take the exp() of the loss metric; if it's unimportant (like time is), take the log(); finally, if it's just kind of important, leave it linear, or maybe square or square root. whatever.

    seriously, just make some shit up and throw it in. you won't compromise anything. it's already just made-up shit.

    --
    "They were pure niggers." – Noam Chomsky
  3. Re:Bullshit.... by mrchaotica · · Score: 5, Informative

    Can you explain in more detail?

    If you have a multi-dimensional set of factors and you design a metric to collapse them down into a single dimension, what you're really measuring is a combination of the factors' values and your own weighting of them. Since the "correct" weighting is a matter of opinion and everybody's use-case is different, a single-dimension metric isn't very useful.

    This goes for any situation where you're picking the "best" among a set of choices, not just for compression algorithms, by the way.

    Like, if you're trying to compress a given file, and one algorithm compressed the file by 0.00001% in 14 seconds, another compressed the file 15% in 20 seconds, and the third compressed it 15.1% in 29 hours, then the middle algorithm is probably going to be the most useful one.

    User A is trying to stream stuff that has to have a latency of less than 15 seconds, so for him the first algorithm is the best. User B is trying to shove the entire contents of Wikipedia onto a disc to send on a space probe, so for him, the third algorithm is the best.

    You gave a really extreme[ly contrived] example, so in that case you might be able to say that "reasonable" use cases would prefer the middle algorithm. But differences between actual algorithms would not be nearly so extreme.
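
To make the weighting point concrete, here is a rough Python sketch using the three algorithms from the quoted example. The linear score (percent reduction minus a per-second penalty), the `best` helper, and the penalty value of 0.001 are all made up for illustration, which is rather the point: the winner changes with whatever weighting you pick.

```python
# (name, percent size reduction, seconds), taken from the quoted example
CANDIDATES = [
    ("first", 0.00001, 14),
    ("middle", 15.0, 20),
    ("third", 15.1, 29 * 3600),
]


def best(candidates, max_seconds=None, time_weight=0.0):
    """Return the name of the highest-scoring candidate, or None.

    Hypothetical score: reduction - time_weight * seconds.
    max_seconds, if given, is a hard limit (strictly less than).
    """
    eligible = [
        c for c in candidates
        if max_seconds is None or c[2] < max_seconds
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c[1] - time_weight * c[2])[0]


user_a = best(CANDIDATES, max_seconds=15)      # hard 15-second limit
user_b = best(CANDIDATES)                      # time does not matter at all
typical = best(CANDIDATES, time_weight=0.001)  # small penalty per second
```

With these inputs, `user_a` is "first", `user_b` is "third", and `typical` is "middle": same data, three different "best" algorithms, depending only on the weights.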

    --

    "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz