Slashdot Mirror


Ten Dropbox Engineers Build BSD-licensed, Lossless 'Pied Piper' Compression Algorithm

An anonymous reader writes: In Dropbox's "Hack Week" this year, a team of ten engineers built the fantasy Pied Piper algorithm from HBO's Silicon Valley, achieving 13% lossless compression on Mobile-recorded H.264 videos and 22% on arbitrary JPEG files. Their algorithm can return the compressed files to their bit-exact values. According to FastCompany, "Its ability to compress file sizes could actually have tangible, real-world benefits for Dropbox, whose core business is storing files in the cloud."The code is available on GitHub under a BSD license for people interested in advancing the compression or archiving their movie files.

10 of 174 comments (clear)

  1. From TFA: bit-exact or not? by QuietLagoon · · Score: 4, Interesting

    ...Horn and his team have managed to achieve a 22% reduction in file size for JPEG images without any notable loss in image quality....

    Without any notable loss in image quality.

    .
    Hmmm... that does not sound like "bit-exact" to me.

    1. Re:From TFA: bit-exact or not? by danielreiterhorn · · Score: 5, Informative

      I'm the author of the algorithm and it's bit-exact. It has no quality loss. I just committed a description of the algorithm https://raw.githubusercontent.... It is bit exact and lossless: you can get the exact bits of the file back :-)

    2. Re:From TFA: bit-exact or not? by dskoll · · Score: 4, Funny

      Compress his comment and all the redundancy will be gone.

    3. Re:From TFA: bit-exact or not? by danielreiterhorn · · Score: 5, Interesting

      Very insightful comments... let me go into detail
      I would say we have several advantages over H.264
      a) Pied Piper has more memory to work with than an embedded device (bigger model)
      b) Pied Piper does not need to seek within a 4 Megabyte block (though it must be able to stream through that block on decode) whereas H.264 requires second-by-second seekability (more samples in model).
      c) Pied Piper does not need to reset the decoder state on every few macroblocks (known as a slice), whereas H.264 requires this for hardware encoders (again, more samples per model).
      d) As opposed to a committee that designed H.264, Pied Piper had a team of 10 creative Dropboxers and guests, spending a whole week identifying correlations between the decoder state and the future stream. That is a big source of creativity! (design by commit, not committee)
      Our algorithm is, however streaming---and it's happiest to work with 4 MB videos or bigger
      Our decode window is a single previous frame--so we can pull in past information about the same macroblock-- but we only work in frequency space right now (there are some pixel space branches just to play with, but none has yielded any fruit so far) so the memory requirements are quite small.
      We are doing this streaming with just the previous frame as our state--- and it may matter--but we have a lot of work to do to get very big wins on CABAC... but given that we're not limited by the very small window and encoding parallelization requirements that CABAC is tied to, Pied Piper could well be useful soon!

    4. Re:From TFA: bit-exact or not? by danielreiterhorn · · Score: 5, Informative

      We also use arithmetic coding...but the gist of the improvement is that we have a much better model and a much better arithmetic coder (the one that VP8 uses) than JPEG did back then. I tried putting the JPEG arithmetic coder into the algorithm and compression got several percent worse, because that table-driven Arithmetic Coder just isn't quite as accurate as keeping counts as the VP8 one.

    5. Re:From TFA: bit-exact or not? by danielreiterhorn · · Score: 5, Interesting

      No one has tried to undo and redo compression of video files before. There are still doom9 forum posts asking for this feature from 12 years ago. I would say that saving lossless percentage points off of real world files is novel and important. And, since it's open source, if someone else gets more %age improvement than what we have, it could become as transformative as you describe.
      But the point is that we have something that's currently useful. It's out there and ready to be improved. It's lossless. And it has never before been tried.
      Also we did the entire algorithm in a week and aren't out of ideas!
      Besides we never claimed it was a revolution--leave that sort of spin to the marketeers...
      we're engineers trying to make things more efficient, a few percentage points at a time :-)

    6. Re:From TFA: bit-exact or not? by Megol · · Score: 4, Informative

      Interpolation isn't about adding noise.

      6 bit (per component) LCDs have for at least 10 years and probably much longer used dithering techniques to produce effective 16.2M colors (compared to a true 8 bit panel with 16.7M colors). This works very well for almost all use cases and provides smooth gradients but have the disadvantage that some image patters can produce flashing due to interference with the dithering algorithm.

      Dithering isn't about adding noise either BTW.

    7. Re:From TFA: bit-exact or not? by Bruce+Perens · · Score: 5, Insightful

      Rather than abuse every commenter who has not joined your specialty on Slashdot, please take the source and write about what you find.

      Given that CPU and memory get less expensive over time, it is no surprise that algorithms work practically today that would not have when various standards groups started meeting. Ultimately, someone like you can state what the trade-offs are in clear English, and indeed whether they work at all, which is more productive than trading naah-naahs.

  2. naysayers are missing the point by Ionized · · Score: 4, Informative

    comparing this to PNG or h.265 is missing the point - this is not a compression algorithm for creating new files. this is a way to take files you already have and make them smaller. users are going to upload JPG and h.264 files to dropbox, that is a given - so saying PNG is better is moot.

  3. Can it compress 3d videos? by leipzig3 · · Score: 4, Funny

    Can it compress 3d videos? That seems to be a real challenge.