Slashdot Mirror


ZFS Gets Built-In Deduplication

elREG writes to mention that Sun's ZFS now has built-in deduplication utilizing a master hash function to map duplicate blocks of data to a single block instead of storing multiples. "File-level deduplication has the lowest processing overhead but is the least efficient method. Block-level dedupe requires more processing power, and is said to be good for virtual machine images. Byte-range dedupe uses the most processing power and is ideal for small pieces of data that may be replicated and are not block-aligned, such as e-mail attachments. Sun reckons such deduplication is best done at the application level since an app would know about the data. ZFS provides block-level deduplication, using SHA256 hashing, and it maps naturally to ZFS's 256-bit block checksums. The deduplication is done inline, with ZFS assuming it's running with a multi-threaded operating system and on a server with lots of processing power. A multi-core server, in other words."
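As a rough illustration of the block-level approach described in the summary, here is a minimal in-memory sketch. It is not ZFS code: the `DedupStore` class, its method names, and the 4096-byte default block size are assumptions made for the example. It only shows the idea of keying each fixed-size block by its SHA-256 digest so identical blocks are stored once.

```python
import hashlib


class DedupStore:
    """Toy block-level dedup table (hypothetical, not the ZFS implementation)."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        # Maps SHA-256 digest -> [block bytes, reference count].
        self.blocks = {}

    def write(self, data):
        """Split data into fixed-size blocks and return the list of digests."""
        refs = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).digest()
            entry = self.blocks.get(digest)
            if entry is None:
                # First time this content is seen: store it once.
                self.blocks[digest] = [block, 1]
            else:
                # Duplicate content: only bump the reference count.
                entry[1] += 1
            refs.append(digest)
        return refs

    def read(self, refs):
        """Reassemble the original data from a list of digests."""
        return b"".join(self.blocks[digest][0] for digest in refs)
```

Writing `b"aaaabbbbaaaa"` with `block_size=4` yields three references but only two stored blocks, because the first and third blocks hash to the same digest.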

10 of 386 comments

  1. Does that mean... by Anonymous Coward · · Score: 4, Funny

    Duplicate slashdot articles will be links back to the original one?

  2. Re:Hash Collisions by pclminion · · Score: 2, Funny

    Suppose you can tolerate an overall collision probability of 10^-18. Given a 256-bit hash, the birthday bound says it would take about 4.8e29 blocks to reach this collision probability. Supposing a block size of 512 bytes, that's 223517417907714843750 terabytes (counting a terabyte as 2^40 bytes).
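The arithmetic in this comment can be checked with the usual birthday approximation, p ≈ n^2 / (2 * 2^bits). The sketch below assumes that approximation and an ideal, uniformly distributed hash; the function name is made up for the example.

```python
import math


def blocks_for_collision_probability(p, hash_bits=256):
    """Approximate block count n at which the collision probability reaches p.

    Uses the birthday approximation p ~ n^2 / (2 * 2^hash_bits),
    which is only valid for small p and an ideal hash.
    """
    return math.sqrt(2.0 * p * 2.0 ** hash_bits)


n = blocks_for_collision_probability(1e-18)  # about 4.8e29 blocks
tib = n * 512 / 2 ** 40                      # 512-byte blocks, in units of 2^40 bytes
print(f"{n:.2e} blocks, {tib:.2e} terabytes")
```

Both figures land at the same order of magnitude as the numbers quoted above; the comment's terabyte figure comes from rounding the block count to 4.8e29 first.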

    Now, supposing you have a 223517417907714843750 terabyte drive, and you can NOT tolerate a collision probability of 10^-18, then you can just do a bit-for-bit check of the colliding blocks before deciding if they are identical or not.
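The bit-for-bit check suggested here can be sketched as follows. This is a hypothetical illustration, not how any particular filesystem does it: `store_block`, the `table` layout, and the injectable `hash_fn` parameter are all assumptions, the last one added only so a deliberately weak hash can force a collision in a demo.

```python
import hashlib


def sha256_digest(block):
    """Default hash: the 32-byte SHA-256 digest of the block."""
    return hashlib.sha256(block).digest()


def store_block(table, block, verify=True, hash_fn=sha256_digest):
    """Store a block and return its (digest, index) key.

    table maps digest -> list of distinct blocks sharing that digest.
    With verify=True, a hash match is confirmed byte-for-byte before the
    block is treated as a duplicate. With verify=False, any hash match is
    trusted, so a colliding block silently aliases the first one stored.
    """
    digest = hash_fn(block)
    candidates = table.setdefault(digest, [])
    if candidates and not verify:
        return digest, 0  # trust the hash alone
    for index, existing in enumerate(candidates):
        if existing == block:  # byte-for-byte comparison
            return digest, index
    candidates.append(block)   # new content, or a genuine collision
    return digest, len(candidates) - 1
```

With a weak stand-in hash such as `lambda b: b[:1]`, the blocks `b"ab"` and `b"ac"` collide: verification keeps them as separate entries, while `verify=False` would hand back the key of the first block for both.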

  3. Re:Hash Collisions by icebike · · Score: 2, Funny

    If blocks that are supposedly from different files have the same block data, does it really matter if it's marked redundant?

    I think the hash collision people are worrying about is when two blocks/files/byte-ranges hash to the same value but in fact differ.

    When that happens, your PowerPoint presentation contains your boss's bedroom-cam shots.

    --
    Sig Battery depleted. Reverting to safe mode.
  4. Re:This is good news... by jeffb+(2.718) · · Score: 3, Funny

    Use open source, get cutting edge things.

    The last time I tried to build an Intel box for Linux work, I lost my grip on the cheap generic case, and sustained a cut that sent me to the emergency room. One of the things I like about my Mac is the lack of cutting edges.

  5. Re:This is good news... by Anonymous Coward · · Score: 4, Funny

    Shoulda gone with a blade server, then you wouldn't have had to worry about the emergency room.

  6. Re:Next home server will be OpenSolaris (or fBSD) by buchner.johannes · · Score: 2, Funny

    Oh yeah? Well, Tux is cuter so I'm not switching.

    --
    NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
  7. Re:Open Source Cures Cancer by Anonymous Coward · · Score: 5, Funny

    Like cutting-edge CAD packages, games, financial management and office suites?

    Umm, dia, nethack, perl, emacs?

  8. Re:Open Source Cures Cancer by Anpheus · · Score: 4, Funny

    But if it breaks, or doesn't work, or you've hit a deadline on a project and can't deliver because Wine or the application broke, who are you going to call for support exactly? Not the people who made the software. Are you going to email the Wine mailing list and then, when they fail to deliver a timely solution for free, tell the client that open source is to blame?

    At least when I buy software, or make purchasing decisions from a business standpoint, knowing that the company will stand behind the product and our implementation of it is more important than trying to pursue some ideal about information and its anthropomorphized desire to be free.

  9. Slashdot on ZFS by Anonymous Coward · · Score: 1, Funny

    So ... any plans on using ZFS on slashdot to help de-duplicate stories?

  10. Infinite compression? by n9hmg · · Score: 3, Funny

    If a hash were a replacement for data, that's all we'd need... Goedelize the universe? Sometimes I just want to scream, or weep, or shoot everybody... or just drop to my knees and beg them to think - just a little tiny insignificant bit - think. Maybe it'll add up. Probably not, but it's the best I can do.