Slashdot Mirror


Dropbox Open Sources DivANS: a Compression Algorithm In Rust Compiled To WASM (dropbox.com)

Slashdot reader danielrh writes: DivANS is a new compression algorithm developed at Dropbox that can be denser than Brotli, 7zip or zstd at the cost of compression and decompression speed. The code uses some of the new vector intrinsics in Rust and is multithreaded. It has a demo running in the browser.

One of the new ideas is that it has an Intermediate Representation (IR), like a compiler, which lets developers mash up different compression algorithms and build compression optimizers that run over the IR. The project is looking for community involvement and experimentation.
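
To make the IR idea concrete, here is a minimal sketch of what "optimizers that run over the IR" could look like. It is not the DivANS format: the `Literal` and `Copy` commands, the `merge_literals` pass, and the `decode` function are all hypothetical names chosen for illustration, assuming an LZ77-style command stream.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Literal:
    """Hypothetical IR command: emit these bytes verbatim."""
    data: bytes


@dataclass
class Copy:
    """Hypothetical IR command: repeat `length` bytes starting `distance` bytes back."""
    distance: int
    length: int


Command = Union[Literal, Copy]


def merge_literals(ir: List[Command]) -> List[Command]:
    """A toy optimizer pass: fuse adjacent Literal commands into one."""
    out: List[Command] = []
    for cmd in ir:
        if isinstance(cmd, Literal) and out and isinstance(out[-1], Literal):
            out[-1] = Literal(out[-1].data + cmd.data)
        else:
            out.append(cmd)
    return out


def decode(ir: List[Command]) -> bytes:
    """Replay the command stream to reconstruct the original bytes."""
    buf = bytearray()
    for cmd in ir:
        if isinstance(cmd, Literal):
            buf += cmd.data
        else:
            start = len(buf) - cmd.distance
            # Copy byte by byte so overlapping copies (length > distance) work.
            for i in range(cmd.length):
                buf.append(buf[start + i])
    return bytes(buf)


ir = [Literal(b"ab"), Literal(b"c"), Copy(distance=3, length=6)]
optimized = merge_literals(ir)
```

Because the pass only rewrites commands and never touches the entropy coder, any front end that emits the same command types could share it. That separation is the part of the design the summary is pointing at.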

12 of 33 comments

  1. Apache License by infolation · · Score: 3, Informative

    It's not mentioned in the article but DivANS is released under the Apache License.

  2. Re:Main upside by KiloByte · · Score: 1

    Its main upside appears to be being written in rust.

    Yay for a language with API breaks that make its compiler unbuildable with previous minor versions of itself, and such stellar portability.

    --
    The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
  3. That's all well and good but... by Hentai007 · · Score: 2, Funny

    What's the Weissman score?

  4. ultimate compression- been there, done that by swell · · Score: 1, Funny

    I like Dropbox and I'm sure they have a nice algorithm, but . . .

    Does nobody remember the ultimate compression algorithm from 1995 that could scrunch any amount of data to less than 1024 bytes? The DataFiles/16 program got quite a lot of publicity for WEB Technologies.

    As I recall there were some inconveniences; for instance, for really serious compression one had to run the software multiple times: compress, then compress the resulting file, then compress that resulting file. Nevertheless that was a lot of compression! There were minor technical glitches. For example, the decompressed file was quite unlike the original.

    --
    ...omphaloskepsis often...
  5. one man's huge ... by epine · · Score: 1

    Even a 1% improvement in compression efficiency can make a huge difference.

    Hard Drive Cost Per Gigabyte — July 2017

    Looks like we're on track for $20/TB, if you purchase in bulk.

    Let's monetize a "huge difference" at $1000 (which I regard as the smallest available value for a "huge difference").

    Thus, your 1% extra compression needs to save 50 TB to make a "huge difference" of one large.

    Correct me if I'm wrong, but I'm thinking your dataset needs to be on the order of 5 PB for a 1% compression improvement to shave 50 TB.

    5 PB works out to 200,000 single-layer Blu-ray discs.

    Nice home library. (I think we can already safely assume it's not mostly drama, unless you're Pacman-ratting a good half of the entire IMDB movie list, behind the darknet spider from hell.)
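
The back-of-the-envelope numbers above can be checked with a short sketch. The $20/TB price, the $1000 threshold and the 1% improvement come from the comment; the 25 GB single-layer Blu-ray capacity and the decimal units (1 PB = 1000 TB, 1 TB = 1000 GB) are assumptions, and the function names are made up for illustration.

```python
PRICE_PER_TB_USD = 20        # bulk hard drive price quoted in the comment
HUGE_DIFFERENCE_USD = 1000   # the comment's threshold for a "huge difference"
IMPROVEMENT_PERCENT = 1      # extra compression, in percent
BLU_RAY_GB = 25              # assumed single-layer Blu-ray capacity


def tb_to_save(threshold_usd: int, price_per_tb: int) -> int:
    """Terabytes that must be saved to be worth `threshold_usd`."""
    return threshold_usd // price_per_tb


def dataset_tb_needed(saved_tb: int, improvement_percent: int) -> int:
    """Dataset size (TB) at which the improvement frees `saved_tb`."""
    return saved_tb * 100 // improvement_percent


saved_tb = tb_to_save(HUGE_DIFFERENCE_USD, PRICE_PER_TB_USD)
dataset_tb = dataset_tb_needed(saved_tb, IMPROVEMENT_PERCENT)
discs = dataset_tb * 1000 // BLU_RAY_GB

print(saved_tb, "TB saved;", dataset_tb // 1000, "PB dataset;", discs, "discs")
```

Under those assumptions the figures in the comment hold: 50 TB saved, a 5 PB dataset, 200,000 discs.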

    1. Re:one man's huge ... by SoftwareArtist · · Score: 1

      Here are the prices for Amazon cloud storage. Depending on the type of storage, it ranges from $0.025 to $0.125 per GB per month. Yes, that's a lot more than buying a hard drive, but a huge stack of hard drives is pretty useless for storing lots of data. This gives immediate availability to all your data, backups, etc.

      Let's say a company has 1 PB of data they need to store. Depending on the type of storage they need, that will cost between $25,000 and $125,000 per month. A 1% reduction in that cost could save them over $1000 per month, which is definitely meaningful.
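
A quick sketch of that estimate, using only the per-GB prices quoted above and assuming decimal units (1 PB = 1,000,000 GB). The function name is hypothetical and `Decimal` is used just to keep the arithmetic exact.

```python
from decimal import Decimal

GB_PER_PB = 1_000_000            # assumed decimal units
LOW_PRICE = Decimal("0.025")     # USD per GB per month, cheapest tier quoted
HIGH_PRICE = Decimal("0.125")    # USD per GB per month, priciest tier quoted
IMPROVEMENT = Decimal("0.01")    # a 1% reduction in stored bytes


def monthly_cost(petabytes: int, price_per_gb: Decimal) -> Decimal:
    """Monthly storage bill in USD for `petabytes` of data."""
    return petabytes * GB_PER_PB * price_per_gb


low_cost = monthly_cost(1, LOW_PRICE)
high_cost = monthly_cost(1, HIGH_PRICE)
low_saving = low_cost * IMPROVEMENT
high_saving = high_cost * IMPROVEMENT

print(f"cost: ${low_cost:,.0f} to ${high_cost:,.0f} per month")
print(f"1% saving: ${low_saving:,.0f} to ${high_saving:,.0f} per month")
```

On these numbers the saving passes $1000 per month only at the most expensive tier; at the cheapest tier it is $250 per month per petabyte.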

      --
      "I'm too busy to research this and form an educated opinion, but I do have time to tell everyone my uninformed opinion."
    2. Re:one man's huge ... by Anonymous Coward · · Score: 1

      Correct me if I'm wrong

      You are not wrong, but you are missing the entire point of this.

      If it was about disk storage space then Dropbox would be fine with just compressing it locally. There would be no need whatsoever to compile to WASM.
      The point of having a WASM compressor is that you can compress the files on the client side without them having to install any programs for it.
      The saving isn't in disk space, it is in bandwidth.

      While compressing in the browser might be inefficient there is still an extra bonus for Dropbox here since the workload is offloaded to the user rather than being done by their servers.

      Assuming they already compressed the data they will have exactly the same disk usage as before but with less bandwidth and less CPU usage.

    3. Re:one man's huge ... by Kjella · · Score: 1

      Let's say a company has 1 PB of data they need to store. Depending on the type of storage they need, that will cost between $25,000 and $125,000 per month. A 1% reduction in that cost could save them over $1000 per month, which is definitely meaningful.

      Except 1 PB is a lot of data. Walmart, for example, has 40 PB in their data cloud, so they could save ~$40,000 on a $500,000,000,000 business. CERN has 200 PB so that'd save ~$200,000 compared to the $9,000,000,000 budget of the LHC. It's a rounding error and I think if you're working with that kind of data you've already worked on much more specific ways to compress it that won't leave much value in a general compression algorithm. Like Google working on a new video compression algorithm for YouTube makes sense. But there's little point in making yet another compression format for 1%, unless it's transparent like in an LTO tape drive or something.

      --
      Live today, because you never know what tomorrow brings
  6. Re: Main upside by phantomfive · · Score: 1

    The language, its standard library and its one and only implementation all sucking are all minor problems compared to Rust's totalitarian community. I don't intend this as an insult, but in my opinion most Rust community members suffer from some degree of autism or Asperger's

    I'd rather have that than have some party clowns or people who are obsessed with popularity. What you need for good quality is heavy focus and deep knowledge, and whether or not they have the social skills of a Kardashian is irrelevant.

    --
    "First they came for the slanderers and i said nothing."
  7. Re: Dead project by Reverend+Green · · Score: 1

    Nah. Most of them are straight up astroturfers and disinformation operators.

  8. No such thing as better universal compression by goombah99 · · Score: 1

    For every file a lossless compression algorithm makes smaller, there has to be another file that it actually makes larger. Entropy is a bitch.
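
The counting argument behind this can be sketched in a few lines. There are 2^n bit strings of length n but only 2^n - 1 strings that are strictly shorter, so no lossless scheme can shrink every input. The second half uses zlib from the Python standard library purely as a stand-in general-purpose compressor (it is not DivANS), with a fixed seed so the run is repeatable.

```python
import random
import zlib


def strings_of_length(n: int) -> int:
    """Number of distinct bit strings that are exactly n bits long."""
    return 2 ** n


def strings_shorter_than(n: int) -> int:
    """Number of distinct bit strings with fewer than n bits (including the empty one)."""
    return sum(2 ** k for k in range(n))


# Pigeonhole: there are always fewer short outputs than n-bit inputs.
n = 16
inputs = strings_of_length(n)
shorter_outputs = strings_shorter_than(n)

# Empirical check with a real compressor.
rng = random.Random(0)  # fixed seed, chosen arbitrarily
noise = bytes(rng.getrandbits(8) for _ in range(1024))
repetitive = b"a" * 1024

noise_out = len(zlib.compress(noise))
repetitive_out = len(zlib.compress(repetitive))

print(f"{inputs} inputs vs {shorter_outputs} shorter outputs")
print(f"repetitive: 1024 -> {repetitive_out} bytes")
print(f"noise:      1024 -> {noise_out} bytes")
```

The repetitive input shrinks while the pseudo-random one comes out slightly larger than it went in, which is the trade-off every general-purpose compressor has to make somewhere.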

    --
    Some drink at the fountain of knowledge. Others just gargle.
  9. Re:Main upside by nagora · · Score: 1

    I have no mod points and I must scream - yes. I am sick of new languages that are just toys for their development team to tinker with ad infinitum. A bit more thought before v0.0.1 and a lot less breaking v7.74.91 code with v7.74.91-r1 and I might bother my arse to look at these things.

    --
    "Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"