Slashdot Mirror


ZFS, the Last Word in File Systems?

guigouz writes "Sun is carrying a feature story about its new ZFS File System - ZFS, the dynamic new file system in Sun's Solaris 10 Operating System (Solaris OS), will make you forget everything you thought you knew about file systems. ZFS will be available on all Solaris 10 OS-supported platforms, and all existing applications will run with it. Moreover, ZFS complements Sun's storage management portfolio, including the Sun StorEdge QFS software, which is ideal for sharing business data."

141 of 564 comments

  1. billion billion? by michael+path · · Score: 5, Funny

    From the article:

    Unlimited scalability
    As the world's first 128-bit file system, ZFS offers 16 billion billion times the capacity of 32- or 64-bit systems.

    Microsoft immediately countered by saying WinFS will now support "twelveteen million billion times" as much storage as Sun's ZFS, and is "a bazillion times" more secure.

    When reached for comment, Sun CEO Scott McNealy replied "neener neener". Microsoft CEO Steve Ballmer responded by putting gum in Sun President Jonathan Schwartz's hair.

    1. Re:billion billion? by Paulrothrock · · Score: 4, Insightful

      Billion billion is a perfectly valid number. Or would you rather they say 6.0 × 10^18? Most people can't imagine that. But people can (kind of) visualize a billion, and then multiply that by a billion, and see it's really, really big.

      --
      I'm in the hole of the broadband donut.
    2. Re:billion billion? by michael+path · · Score: 4, Informative

      How about quintillion?

    3. Re:billion billion? by backslashdot · · Score: 4, Funny

      A file system for trillions of billions of bytes of data?

      What's it for?

      Installing Windows ?

    4. Re:billion billion? by TedCheshireAcad · · Score: 4, Funny

      16 billion billion times!!

      pinky in corner of mouth.

    5. Re:billion billion? by HerculesMO · · Score: 4, Funny

      Given the fact there are an infinite amount of numbers, any murmur from your mouth or any written gibberish can be conveyed to a number.

      For example, 'sassdfadef' is a number I think is a 2 with one thousand 3s after it. It's really moot :)

      --
      The price is always right if someone else is paying.
    6. Re:billion billion? by escher · · Score: 4, Insightful

      You don't do much video editing, do you? ;)

    7. Re:billion billion? by poot_rootbeer · · Score: 4, Insightful

      As the world's first 128-bit file system, ZFS offers 16 billion billion times the capacity of 32- or 64-bit systems.

      A 64-bit (unsigned) binary number can already store values up to 16 billion billion (actually, closer to 18, but who's counting). That's roughly 2.5 billion individually addressable locations for every man, woman, and child living on Earth.

      Shouldn't that be enough to hold us for a few generations at least?
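
      The figures in this comment can be checked with a few lines of Python. This is only a sketch: the world population constant is an assumption (a rough 2004 figure), not something stated in the article.

```python
# Assumption: world population of roughly 6.4 billion (approximate 2004 figure).
WORLD_POPULATION = 6_400_000_000

# Number of distinct values an unsigned 64-bit integer can hold.
values_64bit = 2**64

# Individually addressable locations available per person.
per_person = values_64bit // WORLD_POPULATION

# How many times larger a 128-bit address space is than a 64-bit one.
ratio_128_to_64 = 2**128 // 2**64

print(f"2^64                 = {values_64bit:,}")
print(f"locations per person = {per_person:,}")
print(f"2^128 / 2^64         = {ratio_128_to_64:,}")
```

      Note that the ratio between a 128-bit and a 64-bit address space is itself 2^64, which is where the article's "16 billion billion" (closer to 18) comes from.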

    8. Re:billion billion? by hackwrench · · Score: 5, Funny

      In fact, this page is just one big number.

    9. Re:billion billion? by jellomizer · · Score: 3, Insightful

      64 bits should be enough for everybody.
      Well, 128-bit is more a matter of coming up with something without a limit, or with a limit that no one will use up any time soon. The difference between 64-bit and 128-bit is the difference between a number that we can handle and comprehend and a number that is much too big for our minds to properly comprehend.
      How can someone fill a 64-bit file system? Well, a large company or government organization that stores all its people's files on one file system. Or, say, a program that writes its logs to separate files. Or storing uncompressed movies frame by frame. Or having an archive of data spanning hundreds of years. Yes, there are ways around it now. But sometimes having a file system that doesn't have those limits comes in handy, not necessarily for now but to expand into the future.

      --
      If something is so important that you feel the need to post it on the internet... It probably isn't that important.
    10. Re:billion billion? by sgant · · Score: 2, Funny

      Do you?

      Oh...um...just checked your web site...um...I guess you do.

      --

      "Leo Fender was in a 'state of grace' when he designed the Stratocaster." -- Paul Reed Smith
    11. Re:billion billion? by uberpeon · · Score: 2, Funny

      Dang it, my "+1 Rimshot" moderation option isn't available.... :)

    12. Re:billion billion? by Shadowlion · · Score: 5, Funny

      Even including all the world's porn.

      I dunno, man. I've got a lot of porn...

    13. Re:billion billion? by mikael · · Score: 3, Interesting

      Billion billion is a perfectly valid number. Or would you rather they say 6.0 × 10^18?

      Whenever I see or hear the word "billion" the first thing I ask is that US billion or British billion?

      "six times ten raised to the power of eighteen" seems much more clear and precise.

      --
      Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
    14. Re:billion billion? by ArsonSmith · · Score: 3, Funny

      Six point ohh times ten to the eighteenth.

      Yer right doesn't roll the way a billion billion does.

      --
      Paying taxes to buy civilization is like paying a hooker to buy love.
    15. Re:billion billion? by So_Belecta · · Score: 3, Funny

      Even easier, picture the population of China divided out, one per 1m^2, in a giant grid 1km x 1km, then turn it into a skyscraper with 1000 floors, each 3m high for a total of 3km, with a layer of people on each floor. A truly massive number, but not unvisualisable by any stretch of the imagination. Quite small, really, when you think about it: 6 of these skyscrapers could fit in a 6km square area.

    16. Re:billion billion? by Dazza · · Score: 5, Informative

      Hmm... another one who doesn't know that there's a fair amount of land outside the US borders.

      Nope. He said he'd never been outside the UK, so I'd be fairly certain he's aware of land outside the US.

      Also living in the UK, I can attest that whenever you hear '1 billion', '1000 million' is meant. The UK converted to this for accounting purposes during the 70's.

      The same I suspect is true for most of previously Europe-dominated countries (say India for example).

      India, in particular, is totally different. They don't rely on millions and billions but 'crore' and 'lakh', which are 10 million and 100k respectively.

      --
      -- "I know that this is vitriol, no solution, spleen-venting, but I feel better having screamed, don't you ?"
    17. Re:billion billion? by Jugalator · · Score: 2, Insightful

      And how many have a clue of how much that is?

      --
      Beware: In C++, your friends can see your privates!
    18. Re:billion billion? by Just+Some+Guy · · Score: 4, Funny

      Our filesystem goes to eleven.

      --
      Dewey, what part of this looks like authorities should be involved?
    19. Re:billion billion? by mikael · · Score: 2, Informative

      Our newspapers regularly like to have front page headlines like "Chancellor raids nine billion pounds from company pension schemes". In this sense it means 9 thousand million pounds. At the same time we frequently have news reports from the USA, especially with regard to budget deficits in states like California.

      --
      Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
    20. Re:billion billion? by Just+Some+Guy · · Score: 2, Informative
      I'm pretty film-ignorant, but let's say that you're talking about the equivalent of a 10000x10000 image with 64 bits of color (because you clearly want to maintain all of the information possible). That's 800,000,000 bytes (10000*10000*8) per image. Impressive, but at 24 frames per second a 64-bit filesystem will still yield 960,767,920 seconds (30.4 years) of uncompressed footage.

      Again, what exactly are you planning to film? :)

      --
      Dewey, what part of this looks like authorities should be involved?
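
      The arithmetic in the comment above can be reproduced directly. This sketch uses the comment's own assumptions (10000x10000 frames, 64 bits per pixel, 24 frames per second); the year length of 365.25 days is an added assumption.

```python
# Figures taken from the comment: 10000x10000 pixels, 8 bytes (64 bits) per pixel.
BYTES_PER_FRAME = 10_000 * 10_000 * 8
FRAMES_PER_SECOND = 24

# Capacity of a file system addressing 2^64 bytes.
CAPACITY_BYTES = 2**64

# Assumption: one year is 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

frames = CAPACITY_BYTES // BYTES_PER_FRAME
seconds = frames // FRAMES_PER_SECOND
years = seconds / SECONDS_PER_YEAR

print(f"bytes per frame : {BYTES_PER_FRAME:,}")
print(f"seconds of film : {seconds:,}")
print(f"years of film   : {years:.1f}")
```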
    21. Re:billion billion? by simcop2387 · · Score: 2, Funny

      the physicist says there are 1000 bytes in a kilobyte
      and the computer scientist says there are 1024 meters in a kilometer

    22. Re:billion billion? by stonecypher · · Score: 2, Insightful

      There's a big difference between visualizing the space containing a billion elements and visualizing the elements themselves. Try imagining all the little plastic millimeter chips that fill that half mile.

      Then, since it's actually a billion billion at stake, try to imagine that half by half mile square full of tiny plastic chips.

      Finally, put them in an oversized bathtub, surround the tub with video games, a bad pizza parlor and tired parents, and wham! You're Chuck E Cheese. Therefore, we can state firmly:

      1) Visualize Billion Billion.
      2) ??? [Which adequately describes setting up a chuck e cheese]
      3) Profit.

      In soviet slashdot, billion billion profits you.

      Pardon me; I have to find a way to convince myself that my hot grits cluster joke isn't outdated.

      --
      StoneCypher is Full of BS
    23. Re:billion billion? by escher · · Score: 2, Funny

      There's a big difference between visualizing the space containing a billion elements and visualizing the elements themselves.

      That's a very good point. I use the millimeter=1000 technique to try to get some kind of grip on large numbers of items and large distances.

      I've used it to try to grasp the large distances of space. It's still very difficult but I can get to Mars (and sometimes Jupiter) before my brain craps out. Once I was able to comprehend the size of the solar system out to Pluto but that only lasted a fraction of a second. (T'was really cool, though.)

    24. Re:billion billion? by harrkev · · Score: 3, Funny

      About the size of my student loan...

      --
      "-1 Troll" is apparently the same as "-1 I disagree with you."
    25. Re:billion billion? by harrkev · · Score: 2, Funny

      Perhaps, but then you have to start adding restrooms, chairs, elevators, etc. Soon, the whole thing gets out of hand.

      And, unless you put male and female chinese on different floors, soon you will have a lot more than a billion ;)

      --
      "-1 Troll" is apparently the same as "-1 I disagree with you."
    26. Re:billion billion? by david.given · · Score: 3, Informative
      I dunno, man. I've got a lot of porn...

      Hmm.

      If you had a filesystem 2^64 bytes wide, and your average porn jpeg was 100kB, then this means that you could store 1x10^14 images on it. That's 100'000'000'000'000 of them.

      Assuming you're male and heterosexual, this means that every woman on the planet would have to take 30'000 compromising pictures of herself to fill it up; or about 60'000 assuming you're not into the weird stuff.

      You're right, that's a lot of porn.

    27. Re:billion billion? by lee7guy · · Score: 2, Informative

      What part of "lets define a millimeter as 1000" don't you get?

      --
      Ceterum censeo Microsoftem esse delendam
    28. Re:billion billion? by Ghostx13 · · Score: 2, Funny

      Lemme guess, you usually do this after getting high right? ;-)

    29. Re:billion billion? by Trolling4Dollars · · Score: 2, Funny

      People SHOULD have a clue about a quintillion in this day and age. I'll bet that there was a time when a million was out of the mental grasp of most of the population. It's the 21st century and humanity needs to progress as a whole to bigger and better things.

    30. Re:billion billion? by lee7guy · · Score: 2, Informative

      define: 1 mm = 1000.

      1 m = 1000 mm, per definition.

      1000x1000 = ?

      --
      Ceterum censeo Microsoftem esse delendam
  2. Open source by Splinton · · Score: 4, Informative

    And it looks like it's going to be opensourced along with most of Solaris 10!

    Presumably a 32 bit machine will be able to handle a 128 bit file system, in the same way as Solaris 10 is currently destined for (at most) 64 bits.

    1. Re:Open source by i621148 · · Score: 2, Funny

      so does that mean it could be available in Fedora Core III?

    2. Re:Open source by balster+neb · · Score: 5, Interesting

      Yes, it does look like it would be open-sourced as part of Solaris 10 (it was mentioned as one of the major new features).

      Assuming Solaris 10 will be true open source (not like Microsoft's "shared source"), as well as GPL compatible, would I be able to use ZFS on my GNU/Linux desktop? Will ZFS be a viable alternative to ext3 and ReiserFS? Or is the overhead too big?

    3. Re:Open source by tolan-b · · Score: 4, Insightful

      I suspect that whatever open source license Sun release Solaris under, they'll be careful to make sure it's incompatible with the GPL.

    4. Re:Open source by CrkHead · · Score: 5, Funny

      It looks like Microsoft may have its new WinFS after all...

    5. Re:Open source by GileadGreene · · Score: 4, Interesting
      From the article:
      More important, ZFS is endian-neutral. You can easily move disks from a SPARC server to an x86 server. Neither architecture pays a byte-swapping tax due to Sun's patent-pending "adaptive endian-ness" technology, which is unique to ZFS.[emphasis mine]
      So while it might be open-sourced, you're not likely to see it migrating to Linux or the BSDs any time soon.
  3. Out of letters. by Kenja · · Score: 3, Funny

    Of course ZFS is the last word in file systems. I mean, what can come after zed?

    --

    "Have you ever thought about just turning off the TV, sitting down with your kids, and hitting them?"
    1. Re:Out of letters. by Saint+Stephen · · Score: 5, Funny

      [fs, natch

    2. Re:Out of letters. by Goodbyte · · Score: 2, Funny

      The next file-system will be developed by Swedish bikini waxers to make it possible to call it åfs.

    3. Re:Out of letters. by badriram · · Score: 4, Informative

      I just wonder how many people on slashdot would even understand that....

      To those who don't know: [ comes after Z in ASCII and Unicode Latin.
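
      For anyone who wants to check the joke, a quick sketch in Python using only the built-in `ord` and `chr` functions:

```python
# Code point of 'Z' in ASCII (and in Unicode's Basic Latin block).
z_code = ord("Z")

# The character immediately following 'Z'.
next_char = chr(z_code + 1)

print(z_code, repr(next_char))
print(next_char + "fs")
```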

    4. Re:Out of letters. by drinkypoo · · Score: 2, Insightful

      To those who don't know, no amount of explanation can make a joke funny. In fact, if you have to explain the joke, it's pretty much guaranteed not to be funny. I found it kind of amusing - I didn't know that [ was the next character but I was able to guess that it was simply by what was said. Consequently, I found it amusing. The response from someone who doesn't think about that stuff is going to be similar to "Ah. That's funny." Followed by a shaking of the head as they walk off toward the water cooler to tell everyone what an insufferable nerd you are.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  4. Two things... by rincebrain · · Score: 5, Insightful

    1) Even Sun has succumbed to recursive acronyms, now.

    2) Is it just me, or is the post surprisingly bereft of unique details? I mean, integration with all existing applications is rather assumed, given that it's a file system and all...

    --
    It's only an insult if it's not true.
    1. Re:Two things... by InadequateCamel · · Score: 4, Funny

      I especially liked:
      "Neither architecture pays a byte-swapping tax due to Sun's patent-pending "adaptive endian-ness" technology"

      Adaptive endian-ness? What a stupid thing to include in a press release...there has to be a better way to say that.

      Just announced by Sun:
      "ANMF, our new file system (Ambiguous Nomenclature FS) will be filled with file cataloguing technology stuff that allows faster-ish operations that result in application goodness".

    2. Re:Two things... by Roadkills-R-Us · · Score: 2, Funny

      Is it just me, or is the post surprisingly bereft of unique details?

      Neither. I mean, it's you, but it's not just you.

      The details are there; you just can't remember them:

      ZFS, the dynamic new file system in Sun's Solaris 10 Operating System (Solaris OS), will make you forget everything you thought you knew about file systems.

    3. Re:Two things... by viktor · · Score: 2

      Sun talked about ZFS at USENIX Technical this summer in Boston, and if memory serves, ZFS gives you some actually rather impressive features.

      Partitions are a thing of the past. You add disks in one end, and create filesystems in the other. You do not need to do anything in between; it is done automatically.

      In fact, a lot of things are automatic. When beta-testing ZFS, a customer found a bug: half of his identical disks were used twice as much as the other half - a clear bug. After looking into it, Sun's engineers determined that the problem was not with ZFS, but with the fact that half of the cheap disks were mislabeled and actually just had half the write cache that their labels said. ZFS detected this and used the ones with bigger caches more.

      The design goal, which the ZFS team were close to this summer, is to give you filesystem access at raw I/O speed. One of ZFS' design goals is that it should not incur any overhead. Sounds impossible, but they were close already.

      Now, I know that the Slashdot crowd really only trusts software which is Open Source, needs you to fix ten lines of code before it compiles and which lacks any form of documentation apart from the source code, but ZFS is actually very, very cool even though it lacks all those trust-inducing features.

      Just imagine that it was a conceptual idea done by some Open Source guy, and you'll see that it is indeed impressive. Then start implementing all ZFS' algorithms and ideas in an Open Source Linux variant, and we'll all be singing and dancing together. ;-)

  5. Hmf. by BJH · · Score: 5, Insightful

    Logically, the next question is if ZFS' 128 bits is enough. According to Bonwick, it has to be. "Populating 128-bit file systems would exceed the quantum limits of earth-based storage. You couldn't fill a 128-bit storage pool without boiling the oceans."

    So, what was the point of creating a 128-bit filesystem?

    -1, Marketing Hype.

    *Yawn*

    1. Re:Hmf. by Kenja · · Score: 5, Informative
      "So, what was the point of creating a 128-bit filesystem?"

      Getting rid of file/drive size limitations for the foreseeable future?

      --

      "Have you ever thought about just turning off the TV, sitting down with your kids, and hitting them?"
    2. Re:Hmf. by grub · · Score: 2, Funny


      To store static routes for a lot of IPv6 addresses? :)

      --
      Trolling is a art,
    3. Re:Hmf. by elmegil · · Score: 4, Interesting
      "You'll never need more than 640K of memory". The point would be to be ready as storage densities increase. In the last 8 years we've gone from a terabyte filling a room to a terabyte on a desktop, and I'm sure there are more density breakthroughs coming.

      It's your density, Luke.

      --
      7 November 2006: The day Americans realized corruption and incompetence weren't addressing 11 September 2001
    4. Re:Hmf. by Gentoo+Fan · · Score: 2, Funny

      Yes, I do understand exponentials, just not thinking terribly fast today. I forget the kilo/mega/tera scale... I think I'll just refer to the size as Wowzerbytes.

    5. Re:Hmf. by BJH · · Score: 2, Interesting

      The limitation on storage systems is, and has been for a while, the speed of transferring data in and out of the system, rather than the overall capacity.

      The highest-speed systems currently available can (maybe) transfer data at 300MB/s or so. To transfer a dataset of only 2^40 bytes (a 40-bit address space), it'd take approximately an hour. A 64-bit dataset is more than 16 million times as large - which means it'd take nearly two millennia to transfer on today's best systems.

      Even if transfer rates are increased by two orders of magnitude (effectively unthinkable for the foreseeable future without the development of entirely new and currently unknown technologies), you've still only reduced that time from 2000 years to 20 years.
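
      The transfer-time estimate can be sketched as below. The 300 MB/s rate comes from the comment; reading "MB" as 10^6 bytes and a year as 365.25 days are assumptions, and `transfer_seconds` is a hypothetical helper name.

```python
# Rate from the comment; assumption: "MB" means 10**6 bytes.
RATE_BYTES_PER_SECOND = 300 * 10**6

# Assumption: one year is 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def transfer_seconds(size_bytes, rate=RATE_BYTES_PER_SECOND):
    """Time in seconds to move size_bytes at a constant rate."""
    return size_bytes / rate


hours_for_2_40 = transfer_seconds(2**40) / 3600
years_for_2_64 = transfer_seconds(2**64) / SECONDS_PER_YEAR
years_at_100x = years_for_2_64 / 100  # two orders of magnitude faster

print(f"2^40 bytes: {hours_for_2_40:.2f} hours")
print(f"2^64 bytes: {years_for_2_64:.0f} years")
print(f"2^64 bytes at 100x the rate: {years_at_100x:.1f} years")
print(f"size ratio: {2**64 // 2**40:,}")
```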

    6. Re:Hmf. by backslashdot · · Score: 2, Interesting

      That freaking 640k quote is overused!

      It would have been ridiculous AT THE TIME to address more data... CPUs and software weren't there yet.

      Look, there are limits to the amount of stuff people need! Yeah, so 640k wasn't enough; that doesn't mean 6 billion terabytes isn't going to be enough for you tomorrow.

      You know what... why not have 512-bit file systems? Or 1024-bit filesystems? After all... they said 640k would be enough for everyone... and what happened? Global chaos and economic meltdown. Surely we need to prevent that from happening again. Oh yeah, what's that? It never happened. The world still rotates.

    7. Re:Hmf. by PaSTE · · Score: 3, Insightful
      Let's do the math, shall we?

      2^128 is about 3.4e38. Now, let's be generous and assume we can control the spin of every electron we come across and incorporate it into a quantum storage device, such that each electron represents a bit of information (either left- or right-spin). Now, because I'm still being generous, I'm going to say the Earth's oceans contain 2e9 km^3, or 2e18 m^3 (compare here). Assuming all this water is liquid, its density is 1000 kg/m^3 (abouts), so we have 2e24 g of water.

      2e24 g of water contains about 1e23 moles of water molecules, or about 6e46 individual water molecules. With about 10 electrons per molecule, that's 6e47 electrons. So if we indeed "boil the oceans" in order to harvest the electrons to feed into our massive quantum storage system, we would have well over a billion electrons available for every bit we need to store, leaving plenty spare for things like hydrogen fuel cells.

      But this does not exceed the quantum limits of earth-based storage, even by a long shot. Bonwick even admits it: You couldn't fill a 128-bit storage pool without boiling the oceans. Boiling the oceans is definitely an earth-based option for quantum storage, as we wouldn't have to import the material from space. We also have other ways of harvesting electrons, like boiling humans and evacuating the atmosphere. To give you an idea, there's something like 10^54 electrons on earth, give or take a few hundred trillion. We'd need at least a 192-bit system to approach Earth's quantum (electron-based) limits.

      --
      /*No comment*/ #No comment //No comment ;No comment 'No comment REM No comment !No
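
      The back-of-envelope estimate above can be rerun with Avogadro's number written out explicitly. This is a sketch under the comment's own assumptions (2e18 m^3 of ocean, one bit per electron); the constant names are made up for illustration, and the result agrees with the comment to within an order of magnitude.

```python
AVOGADRO = 6.022e23             # molecules per mole
OCEAN_VOLUME_M3 = 2e18          # the comment's generous ocean volume
WATER_DENSITY_G_PER_M3 = 1e6    # 1000 kg/m^3 expressed in grams
WATER_MOLAR_MASS_G = 18.0       # grams per mole of H2O
ELECTRONS_PER_MOLECULE = 10     # 8 from oxygen, 1 from each hydrogen

ocean_mass_g = OCEAN_VOLUME_M3 * WATER_DENSITY_G_PER_M3
moles = ocean_mass_g / WATER_MOLAR_MASS_G
molecules = moles * AVOGADRO
electrons = molecules * ELECTRONS_PER_MOLECULE

# One bit per electron, as the comment assumes.
bits_needed = 2**128
electrons_per_bit = electrons / bits_needed

print(f"ocean mass        : {ocean_mass_g:.1e} g")
print(f"moles of water    : {moles:.1e}")
print(f"electrons         : {electrons:.1e}")
print(f"2^128             : {bits_needed:.1e}")
print(f"electrons per bit : {electrons_per_bit:.1e}")
```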
  6. Unlimited scalability by Anonymous Coward · · Score: 3, Insightful
    Unlimited scalability
    As the world's first 128-bit file system, ZFS offers 16 billion billion times the capacity of 32- or 64-bit systems.
    But the last time I checked, 16 billion billion is still less than infinity.
    1. Re:Unlimited scalability by szo · · Score: 4, Funny

      How did you check it? Count up to it and then add one and see if you could? Just asking.

      Thanks

      Szo

      --
      Red Leader Standing By!
    2. Re:Unlimited scalability by sabinm · · Score: 2, Funny

      But the last time I checked, 16 billion billion is still less than infinity

      So what you're saying is that they offer absolutely no storage capacity at all. Taken from the absolute authority of all knowledge in the universe I quote:

      "Universe, The

      Some information to help you live in it.

      1. Area: infinite.

      2. Imports: none.

      It is impossible to import things into an infinite area, there being no outside to import things from.

      3. Exports: none.

      See Imports.

      4. Population: none.

      It is known that there are an infinite number of worlds, simply because there is an infinite amount of space for them to be in. However, not every one of them is inhabited. Therefore, there must be a finite number of inhabited worlds. Any finite number divided by infinity is as near to nothing as makes no odds, so the average population of all the planets in the Universe can be said to be zero. From this it follows that the population of the whole Universe is zero, and that any people you may meet from time to time are merely the products of a deranged imagination.

      emphasis added

      http://hhgproject.org/entries/universe.html/

      Extrapolate that to storage.
      Or to you for that matter.

      --
      http://cincyboys.blogspot.com/ Everything Cincinnati. Including the word 'Finnih'
    3. Re:Unlimited scalability by dynamo · · Score: 2, Informative

      Just because not all worlds are inhabited doesn't mean there aren't an infinite number. If you allow yourself to presume infinite space and infinite worlds, suppose 9% of them turn out to be inhabited, no matter how many you keep examining.

      Infinity is relative.

    4. Re:Unlimited scalability by UserGoogol · · Score: 2, Funny

      To reply to most people who replied to you:

      Yes, there are several errors in that population zero bit, which are almost certainly intentional. In order to properly play around with infinity, one must follow the rules of Calculus. But as Douglas Adams wisely knew, Calculus is not funny, and all jokes in the field are at best derivative. Thusly, it was integral to his success as a writer to stick to Algebra.

      --
      "Never attribute to malice that which can be adequately explained by stupidity." -- Hanlon's Razor
  7. Cool but.... by otis+wildflower · · Score: 3, Interesting

    ... it took them long enough.

    Perhaps they had to rewrite an LVM from scratch in order to opensource it?

  8. What is their disk allocation scheme? by grunt107 · · Score: 3, Informative

    Having a global pool does lessen maintenance/support, but what method are they using to place data on the disks?

    Frequently accessed data needs to be spread out on all the disks for the fastest access, so does that mean Sun has FS files/tables that track usage and repositions data based on that?

    1. Re:What is their disk allocation scheme? by Bobo_The_Boinger · · Score: 2, Interesting

      I was concerned about the ability to selectively remove a disk. Say I have 3 disks and ZFS has spread my data all over those three disks. How do I say, "I need to remove disk 2, please move all that data to other disks now."? Just a minor concern really, but something to think about.

      --
      --David
    2. Re:What is their disk allocation scheme? by dTb · · Score: 3, Informative

      According to the information given in this blog it is possible to "show how much space is used in each disk. If you want to reduce the amount of space in a pool by removing a disk, you could use this to choose the least-full disk, thus minimizing the time it will take to migrate that data to other disks".

    3. Re:What is their disk allocation scheme? by majid · · Score: 5, Interesting

      I was in a chat session with their engineers yesterday. It looks like they have adaptive disk scheduling algorithms to balance the load across the drives (e.g. if a drive is faster than others, it will get correspondingly more I/O). The scheduler also tries to balance I/O among processes and filesystems sharing the data pool.

      This is a good thing - queueing theory shows a single unified pool has better performance than several smaller ones. People who try to tune databases by dedicating drives to redo logs don't usually realize what they are doing is counterproductive - they optimize locally for one area, at the expense of global throughput for the entire system.

      ZFS uses copy-on-write (a modified block is written wherever the disk head happens to be, not where the old one used to be). This means writes are sequential (as with all journaled filesystems) and also since the old block is still on disk (until it is garbage collected) this gives the ability to take snapshots, something that is vital for making coherent backups now that nightly maintenance windows are mostly history. This also leads to file fragmentation so enough RAM to have a good buffer cache helps.
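
      To make the copy-on-write idea concrete, here is a toy sketch in Python. It is not ZFS's actual on-disk design: the class `CowStore` and its methods are hypothetical names invented for illustration. It only shows why never overwriting a block in place makes snapshots cheap.

```python
class CowStore:
    """Toy copy-on-write block store (illustrative only, not ZFS's real layout)."""

    def __init__(self):
        self.log = []        # append-only storage: stands in for the disk
        self.block_map = {}  # logical block number -> index into self.log
        self.snapshots = {}  # snapshot name -> frozen copy of a block map

    def write(self, block_no, data):
        # Always append to a fresh location; the old copy is left untouched.
        self.log.append(data)
        self.block_map[block_no] = len(self.log) - 1

    def snapshot(self, name):
        # A snapshot copies only the pointers, never the data blocks.
        self.snapshots[name] = dict(self.block_map)

    def read(self, block_no, snapshot=None):
        mapping = self.block_map if snapshot is None else self.snapshots[snapshot]
        return self.log[mapping[block_no]]


store = CowStore()
store.write(0, b"version 1")
store.snapshot("monday")
store.write(0, b"version 2")

print(store.read(0))                     # live view sees the new block
print(store.read(0, snapshot="monday"))  # snapshot still sees the old block
```

      In this sketch old blocks are never reclaimed; a real system would garbage-collect blocks that no live map or snapshot still points to.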

      Because the scheduler works best if it has full visibility of every physical disk, rather than dealing with an abstract LUN on a hardware RAID, they actually recommend ZFS be hosted on a JBOD array (just a bunch of disks, no RAID) and have the RAID be done in software by ZFS. Since the RAID is integrated with the filesystem, they have scope for optimizations that are not available if you have a filesystem trying to optimize on one side and a RAID controller (or separate LVM software) on the other side. Network Appliance does something like this with their WAFL network filesystem to offer decent performance despite the overhead of NFS.

      With modern, fast CPUs, software RAID can easily outperform hardware RAID. It is quite common for optimizations like hardware RAID made at a certain time to become counterproductive as technology advances and the assumptions behind the premature optimization are no longer valid. A long time ago, IBM offloaded some database access code in its mainframe disk controllers. It used to be a speed boost, but as the mainframe CPU speeds improved (and the feature was retained for backward compatibility), it ended up being 10 times slower than the alternative approach.

  9. There already is a ZFS. by TheLoneGundam · · Score: 5, Informative

    IBM has ZFS on their z/OS Unix Systems Services (POSIX interfaces on z/OS) component. ZFS was developed to provide improvements over the HFS (Hierarchical File System) that they ship with the OS.

  10. Re:not alphabetically by laird · · Score: 5, Funny

    Nah, the ultimate filesystem has to be xyzzyfs! Your data magically appears... :-)

  11. will it be open source by joshtimmons · · Score: 2, Interesting

    We heard earlier that solaris 10 will be open source.

    I wonder if that means that this filesystem can be included in other kernels.

  12. Why don't they just describe the capacity in by wiredog · · Score: 5, Funny

    Sagans?

    1. Re:Why don't they just describe the capacity in by Insightfill · · Score: 3, Informative
      Here's a good source.

      "Johnny Carson, America's popular talk-show host, loved to affectionately mimic Carl - one of his favorite guests - by saying "billions and billions," until everyone associated it with Carl. Yet Carl never said that precise phrase in public until years later.

      He grew quite tired of it. I remember a concert for Planetfest, a Planetary Society celebration of space exploration in 1981. He spoke about space exploration while accompanied by music conducted by John Williams, and inevitably had to use the word "billions." As soon as he did, tittering broke out in the audience. He glared at the offenders and continued."

      Seriously, I would LOVE to use "Sagan" as a unit of counting "billions" or something.

  13. UFS2/SU by FullMetalAlchemist · · Score: 3, Interesting

    I'm really happy with UFS2/SU, and have been more than happy with the original UFS in general since 1994 when I first started off with NetBSD.
    But, with ZFS, maybe we have finally found an FS to replace it with. I sure look forward to trying Solaris 10, though I'm sure that I will find that SunOS has a better feel to it, like always.

    Maybe DragonflyBSD will be the one to do this, FreeBSD is generally more restrictive to radical changes; for good reasons, you don't get that stability without reason.

  14. Re:rearchitected by sheriff_p · · Score: 3, Funny

    Sadly Google returns no hits for rearchistrated

    --
    Score:-1, Funny
  15. You just got to love the headline... by Chainsaw · · Score: 2, Funny

    "ZFS, the Last Word in File Systems?"

    The last word in file systems is "systems". And stop asking file systems these questions, you fool.

    --
    War is one of the most horrible things a human can be exposed to. And one of the worlds largest industries.
  16. If it's the last word by kick_in_the_eye · · Score: 3, Funny

    If it's the last word, why are we even talking about it?

  17. Just better than the old stuff from Sun by Ewan · · Score: 5, Insightful
    Reading the article, all I see is Sun saying how bad their old stuff was, e.g.:

    Consider this case: To create a pool, to create three file systems, and then to grow the pool--5 logical steps--5 simple ZFS commands are required, as opposed to 28 steps with a traditional file system and volume manager.

    and
    Moreover, these commands are all constant-time and complete in just a few seconds. Traditional file systems and volumes often take hours to configure. In the case above, ZFS reduces the time required to complete the tasks from 40 minutes to under 10 seconds.


    Compared to AIX or HP-UX, 28 steps is shockingly bad; both have had much simpler logical volume management for several versions now (AIX for 5 years or more? Certainly as long as I have used it). The existing Solaris 9 logical volume infrastructure is years behind the competition; this brings it up to date, but doesn't put it far ahead.

    Ewan
    1. Re:Just better than the old stuff from Sun by Zapman · · Score: 2, Informative

      Well, I'm not 100% sure that's fair. AIX and HP still have their old school 'format -> mkfs' path, and that is what Sun is comparing their 'new world order' to. Now, if you want to do cool things like Raid, then you need to either do the hardware based stuff, or you play with Disksuite or Veritas Volume Manager[1].

      Both have more interesting and pretty ways of playing with volumes. Disksuite is a free, add on package, and Veritas charges an arm and a leg for their Volume Manager.

      In addition to the other cool features, ZFS is just a way to deepen the abstraction away from physical volumes.

      As to its inherent coolness, or lack thereof, I'll let y'all know when I've actually been able to play with it.

      [1]Had Sun been wise years ago, they would have just bought Veritas, and the world would be very different. Now however, Veritas is one of the largest software companies in the world.

      --
      Zapman
    2. Re:Just better than the old stuff from Sun by drinkypoo · · Score: 2, Informative

      How is this actually different from JFS on top of a LVM? Either way it's made up of blocks, which can be added to the filesystem later, located on any physical medium available, using RAID... The only measurable difference seems to be the 128-bitness, which as described elsewhere seems like a big fat waste of time for the next hundred years or so.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    3. Re:Just better than the old stuff from Sun by sysadmn · · Score: 2, Informative

      With AIX and HP-UX, there's still 28 steps. It's just that the manuals say: 1) Run smit (IBM version) or 1) Run SAM (HP-UX version). and you're supposed to read the menus to figure out the other 27 steps.

      --
      Envy my 5 digit Slashdot User ID!
    4. Re:Just better than the old stuff from Sun by jfinke · · Score: 3, Insightful
      I have always maintained that the only reason Veritas exists is to make up for shortcomings in Sun's volume management and file systems.

      IBM has had a LVM since the early to mid 90s.

      Linux has one now.

      If Sun had bothered to keep up with the Joneses on these little things, Veritas could possibly have never become what they are.

      Last I heard, they were going to start offering VXVM and VVM on AIX. My AIX admins did not care. They figured why would they spend the money for the product when they already have a usable system that is supported by the OEM.

  18. What sort of crap is this? by LowneWulf · · Score: 4, Interesting

    COME ON! It may be a slow day, but how is this news? There's only one link, and it's to Sun's marketing info.

    Can someone please provide a link to some technical details other than it being 128-bit? What does this file system actually do that is even remotely special? What's under the covers? And, more importantly, does it actually work as described?

    -1,Uninformative

    1. Re:What sort of crap is this? by Anonymous Coward · · Score: 3, Insightful

      the funny thing about reading the article is that you get the details. you should try it sometime.

      here are some more details, but nowhere near as long a list as you'll get from reading the article (since the full list would mean quoting the article, which i suggest reading).
      - data checksums eliminate the need for fsck
      - easy to add disks to the pool
      - seems to support raid 0 and at least one real raid
      - data rollbacks (sound like netapp snapshots)
      - can mount the same filesystem on sparc or x86

      while not necessarily amazing, when you start adding all of it together it makes for a large improvement over ufs or vxvm. it's interesting, to say the least. i consider this a big announcement for the solaris platform (and, as more than one person pointed out, possibly linux and bsd since the code for it will ultimately be open source).

      as far as greater technical details, how are people even going to know it exists in order to, say, make independent performance benchmarks if there's no announcement. should everyone just discover the feature accidentally?

  19. That's a lot of storage by Gentoo+Fan · · Score: 5, Funny

    But of course you'll still have to have your boot image within the first 1024 cylinders.

  20. Oh wow! by Wakko+Warner · · Score: 2, Funny

    Does this mean the absolutely awful Disksuite/Solaris Volume Manager is finally, mercifully, dead, too?

    I'll do a dance of utter joy if so. Disksuite is 10 pounds of shit in a 5 pound bag.

    - A.P.

    --
    "Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
    1. Re:Oh wow! by elmegil · · Score: 2, Informative

      Until Veritas makes their product free, there's going to have to be SOMETHING that operates in that space that is under Sun's control, don't you think? Not to mention VxVM has plenty of warts all its own.

      --
      7 November 2006: The day Americans realized corruption and incompetence weren't addressing 11 September 2001
    2. Re:Oh wow! by Wakko+Warner · · Score: 2, Informative

      Oh, I have no problem with Sun offering a VM of its own. It's the lack of functionality that's always concerned me. It always seemed silly to pay $25k for the kind of volume management on Solaris that you get for free in AIX and HP/UX.

      Also, I'm tired of running a volume manager simply to mirror root, and a separate, expensive volume manager (with a different level of support from a different vendor) simply to manage my data volumes, and I'm distressed that this is the "standard" way to do it in Solaris.

      Hopefully, this changes things significantly.

      - A.P.

      --
      "Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
    3. Re:Oh wow! by elmegil · · Score: 2, Informative
      If someone from Sun has convinced you that this is "standard" or "necessary", you need to talk to their management. While many people do it that way, there's absolutely no reason, since you're already paying for Veritas, not to just use Veritas and be done with it.

      You're right, it'd be nice to see some regularization.

      --
      7 November 2006: The day Americans realized corruption and incompetence weren't addressing 11 September 2001
  21. Son of Jor-El by Duke+Machesne · · Score: 2, Funny

    Kneel Before Zod!

  22. Another quote to cherish by AsciiNaut · · Score: 4, Insightful
    I broke the habit of a lunchtime and RTFA. According to Jeff Bonwick, the chief architect of ZFS, "populating 128-bit file systems would exceed the quantum limits of earth-based storage. You couldn't fill a 128-bit storage pool without boiling the oceans."

    Who else instantly thought of, "640 K ought to be enough for anybody", uttered by the chief architect of twenty years of chaos?

    1. Re:Another quote to cherish by mdmarkus · · Score: 5, Informative

      From Bruce Schneier in Applied Cryptography:

      Thermodynamic Limitations

      One of the consequences of the second law of thermodynamics is that a certain amount of energy is necessary to represent information. To record a single bit by changing the state of a system requires an amount of energy no less than kT where T is the absolute temperature of the system and k is the Boltzmann constant. (Stick with me; the physics lesson is almost over.)

      Given that k = 1.38x10^-16 erg/Kelvin, and that the ambient temperature of the universe is 3.2K, an ideal computer running at 3.2K would consume 4.4x10^-16 ergs every time it set or cleared a bit. To run a computer any colder than the cosmic background radiation would require extra energy to run a heat pump.

      Now, the annual energy output of our sun is about 1.21x10^41 ergs. This is enough to power about 2.7x10^56 single bit changes on our ideal computer; enough changes to put a 187-bit counter through all of its values. If we built a Dyson sphere around the sun and captured all of its energy for 32 years, without any loss, we could power a computer to count up to 2^192. Of course it wouldn't have the energy left over to perform any useful calculations with this counter.

      But that's just one star, and a measly one at that. A typical supernova releases something like 10^51 ergs. (About a hundred times as much energy would be released in the form of neutrinos, but let them go for now.) If all of the energy could be channeled into a single orgy of computation, a 219-bit counter could be cycled through all of its states.

      These numbers have nothing to do with the technology of the devices; they are the maximums that thermodynamics will allow. And they strongly imply that brute-force attacks against 256-bit keys will be infeasible until computers are built from something other than matter and occupy something other than space.
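
      For anyone who wants to check the sun figures in that passage, here is a rough sketch that redoes the arithmetic from the constants as quoted. The variable names are mine, and taking the floor of log2 to get a "counter width" is my assumption about how the bit counts were derived.

```python
import math

K_BOLTZMANN = 1.38e-16        # erg per kelvin, as quoted above
T_BACKGROUND = 3.2            # kelvin, as quoted above
SUN_ANNUAL_OUTPUT = 1.21e41   # erg per year, as quoted above

# Minimum energy to set or clear one bit at the background temperature.
energy_per_bit = K_BOLTZMANN * T_BACKGROUND

# How many single-bit changes one year of total solar output could pay for.
flips_per_year = SUN_ANNUAL_OUTPUT / energy_per_bit

# Widest counter that could be stepped through every value (assumed: floor of log2).
bits_one_year = math.floor(math.log2(flips_per_year))
bits_32_years = math.floor(math.log2(32 * flips_per_year))

print(f"energy per bit change: {energy_per_bit:.2e} erg")
print(f"bit changes per year:  {flips_per_year:.2e}")
print(f"counter width after 1 year:   {bits_one_year} bits")
print(f"counter width after 32 years: {bits_32_years} bits")
```

      Since 32 is 2^5, the 32-year case simply adds five bits to the one-year result.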

    2. Re:Another quote to cherish by Dracolytch · · Score: 2, Interesting
      Methinks you don't understand how insanely large 128 bits is.

      340282367000000000000000000000000000000 files.
      My first computer was about.. here ^
      My system is about... here ^
      And this... ^

      A gross overestimate of every file on every computer on the internet today (250 million computers, 5 million files per computer).
      Yep. I think they might be right on this one.

      ~D
      --
      This sig has been enciphered with a one-time pad. It could say almost anything.
    3. Re:Another quote to cherish by eclectro · · Score: 2, Funny

      Given that k = 1.38x10^-16 erg/Kelvin, and that the ambient temperature of the universe is 3.2K, an ideal computer running at 3.2K would consume 4.4x10^-16 ergs every time it set or cleared a bit

      ok-ok, I get that. But can it play ogg???

      --
      Take the cheese to sickbay, the doctor should see it as soon as possible - B'Elanna Torres, "Learning Curve"
  23. Different architecture, same functionality? by perseguidor · · Score: 4, Interesting

    With traditional volumes, storage is fragmented and stranded. With ZFS' common storage pool, there are no partitions to manage. The combined I/O bandwidth of all of the devices in a storage pool is always available to each file system.


    Until now it does sound just like raid, but:


    When one copy is damaged, ZFS detects it via the checksum and uses another copy to repair it.

    No competing product can do this. Traditional mirrors can only handle total failure of a device. They don't have checksums, so they have no idea when a device returns bad data. So even though mirrors replicate data, they have no way to take advantage of it.


    I guess I just don't get it; I know they are talking about logical corruption and not a physical failure, but this is kind of like raid with something like SMART, or isn't it?

    And what kinds of corruption can there be? Journaling filesystems already work well for write errors and such, or so I thought.

    I know the architecture seems innovative and different (at least for me), but is there really new functionality?

    Sorry if I seem ignorant this time. I don't know if I was able to get my point across; the things this filesystem does, wouldn't they be better left on a different layer?
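
    To make the quoted claim concrete, here is a toy sketch of a mirror that keeps a checksum apart from the data copies, so a read can tell which copy is wrong and repair it. This is only an illustration of the general idea, not ZFS's actual on-disk design; the class name, the choice of SHA-256, and the two-copy layout are all my assumptions.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Digest used to verify a block (SHA-256 is an arbitrary choice here)."""
    return hashlib.sha256(data).hexdigest()

class MirroredBlock:
    """Toy two-way mirror whose checksum is stored apart from the data copies."""

    def __init__(self, data: bytes):
        self.copies = [bytearray(data), bytearray(data)]
        self.expected = checksum(data)

    def read(self) -> bytes:
        # Find the first copy whose contents still match the stored checksum.
        good = None
        for copy in self.copies:
            if checksum(bytes(copy)) == self.expected:
                good = bytes(copy)
                break
        if good is None:
            raise IOError("all copies fail the checksum")
        # Overwrite any copy that differs from the verified one.
        for i, copy in enumerate(self.copies):
            if bytes(copy) != good:
                self.copies[i] = bytearray(good)
        return good

block = MirroredBlock(b"important data")
block.copies[0][0] ^= 0xFF      # simulate one device silently returning bad data
print(block.read())
```

    A plain mirror without the stored checksum would see two copies that disagree and have no way to tell which one to trust, which seems to be the distinction the article is drawing.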
    --
    O make me a mask
  24. Re:The thing about that.. by robslimo · · Score: 2, Funny

    I've been working on a file system (inspired by an old Signetics memory device) that's likely to *really* be the last word. It's still in alpha because I'm having trouble verifying its functionality, but it seems to work very well so far.

    I call it WOFS.

  25. What I really want to see in a file system... by kcbrown · · Score: 5, Insightful
    ...and that I haven't seen in any file system announced to date, is a way of bundling multiple filesystem operations into a single atomic transaction that can be rolled back. This would clearly require an addition of four system calls (one to begin a transaction, one to commit it, one to roll it back, and one to set the default action, commit or rollback, on exit).

    Such a feature would rock, because it would be possible to make things like installers completely atomic: interrupt the installer process and the whole thing rolls back.
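
    As a rough illustration of the four operations, here is a user-space sketch that stages writes in a scratch directory and only moves them into place on commit. It is an emulation under my own assumptions, not a real filesystem feature: the class and method names are hypothetical, and the commit step is a loop of renames, so a crash halfway through would still leave a partial result, which is exactly the gap a kernel-level transaction would close.

```python
import os
import shutil
import tempfile

class FileTransaction:
    """Stage writes in a scratch directory; apply on commit, discard on rollback."""

    def __init__(self, root: str, on_exit: str = "rollback"):
        self.root = root
        self.on_exit = on_exit       # default action when the block exits
        self.staging = None
        self.pending = []            # relative paths written in this transaction

    def begin(self) -> None:
        self.staging = tempfile.mkdtemp(dir=self.root, prefix=".txn-")
        self.pending = []

    def write(self, relpath: str, data: bytes) -> None:
        staged = os.path.join(self.staging, relpath)
        os.makedirs(os.path.dirname(staged), exist_ok=True)
        with open(staged, "wb") as f:
            f.write(data)
        if relpath not in self.pending:
            self.pending.append(relpath)

    def commit(self) -> None:
        # Each rename is atomic on its own; the group as a whole is not.
        for relpath in self.pending:
            target = os.path.join(self.root, relpath)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            os.replace(os.path.join(self.staging, relpath), target)
        self._cleanup()

    def rollback(self) -> None:
        self._cleanup()

    def _cleanup(self) -> None:
        shutil.rmtree(self.staging, ignore_errors=True)
        self.staging = None
        self.pending = []

    def __enter__(self):
        self.begin()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Apply the default action unless commit/rollback was already called.
        if self.staging is not None:
            if exc_type is None and self.on_exit == "commit":
                self.commit()
            else:
                self.rollback()
        return False
```

    An installer would wrap its file writes in `with FileTransaction(root) as txn:` and call `txn.commit()` as the last step, so that an interruption before that point leaves the target directory untouched.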

    --
    Use 'slashdot stuff' in the subject line in any email you send me if you want to get past the spam filter.
    1. Re:What I really want to see in a file system... by FullMetalAlchemist · · Score: 2, Informative

      There are several FS like this, but you don't know of them because they require completely new FS API to work with.
      With UFS2/SU we have snapshots, which is a compromise; it doesn't require any changes in the original UNIX API, and all current apps therefore work. On the other hand, it either requires a daemon or a competent user.

      So, either you have UNIX or you have something else. Plan9 has many advantages, still, we use BSD, Solaris or whatever.

    2. Re:What I really want to see in a file system... by dominator · · Score: 4, Informative

      Reiserfs will apparently soon have what you're looking for. Already, all primitive operations are atomic, but they plan on exporting a user-space transaction interface soon.

      http://www.namesys.com/benchmarks.html

      "V4 is a fully atomic filesystem, keep in mind that these performance numbers are with every FS operation performed as a fully atomic transaction. We are the first to make that performance effective to do. Look for a user space transactions interface to come out soon....

      Finally, remember that reiser4 is more space efficient than V3, the df measurements are there for looking at....;-) "

    3. Re:What I really want to see in a file system... by kcbrown · · Score: 2, Interesting
      There are several FS like this, but you don't know of them because they require completely new FS API to work with.

      Why is that? There's nothing inherently impossible about having the OS remember, via a transaction log, the changes that have taken place to a set of files made by a process, and then either committing them all or rolling back all of them at process exit time (or whenever the process does a commit() or rollback()). The file operations themselves can be identical, so all you really need are those 4 additional operations I mentioned previously.

      --
      Use 'slashdot stuff' in the subject line in any email you send me if you want to get past the spam filter.
  26. Apparently... by qtone42 · · Score: 5, Funny

    ... ZFS will also make you forget everything you knew about English grammar.

    "We've rethought everything and rearchitected it," says Jeff Bonwick

    Rearchitected? WTF? Howsaboot "Redesigned?"

    I'm still wrapping my brain around "adaptive endian-ness" as well.

    --QTone

  27. Re:rearchitected by miike · · Score: 2, Funny

    Soon it will show one hit!

  28. Sounds really nice by mveloso · · Score: 5, Informative

    Looks like Sun went out and redid their filesystem based on the performance characteristics of machines today, instead of machines of yesteryear.

    Some highlights, for those that don't (or won't) RTA:

    * Data integrity. Apparently it uses file checksums to error-correct files, so files will never be corrupted. About time someone did this.

    * Snapshots, like netapp?

    * Transactional nature/copy-on-write

    * Auto-striping

    * Really, Really Large volume support

    All of this leads to speed and reliability. There's a lot of other stuff (varying block sizes, write queueing, stride stuff which I haven't heard about in years), but all of it leads to the above.

    Oh, and they simplified their admin too.

    It's hard to make a filesystem look exciting. Most of the time it just works, until it fails. The data checksum stuff looks interesting, in that they built error correction into the FS (like CDs and RAID but better hopefully).

    It might also do away with the idea of "space free on a volume," since the marketing implies that each FS grows/shrinks dynamically, pulling storage out of the pool as needed.

    Any users want to chime in?

    1. Re:Sounds really nice by pla · · Score: 2, Interesting

      Data integrity. Apparently it uses file checksums to error-correct files, so files will never be corrupted. About time someone did this.

      So, I take it that back in the days of DOS, you never got a CRC error trying to copy an important file off a floppy?

    2. Re:Sounds really nice by the+melon · · Score: 2, Informative

      All I can really say is if you have ever used a volume manager before
      you will rejoice at the ease of zfs.

      I have been using it on my main nfs server in my Solaris lab at Sun
      for quite a while now and it is great.

      I have a 1.6tb disk array that is allocated to a single zpool on the
      system. I can add/subtract drives/arrays to this pool at any time to
      increase or decrease the amount of storage available to the pool.

      I can then create, format and mount a zfs filesystem with one single
      command to the zpool. The filesystem will only consume as much of the
      zpool as it is actually using.

      It really is a great system.

  29. Patent-pending adaptive endianness? by yeremein · · Score: 3, Insightful
    ZFS is supported on both SPARC and x86 platforms. More important, ZFS is endian-neutral. You can easily move disks from a SPARC server to an x86 server. Neither architecture pays a byte-swapping tax due to Sun's patent-pending "adaptive endian-ness" technology, which is unique to ZFS.
    Bleh. How expensive is it to byte-swap anyway? Compared with checking whether the number you're looking at is already the right endianness? Just store everything big-endian; x86 systems can swap it in a single instruction anyway. It's not like all data needs to be byte-swapped anyway, just metadata. I can't imagine the penalty would come even close to the amount of time spent doing their integrity checksums anyway.

    Looks to me like nothing more than an excuse to put up a patent tollbooth for anyone who wants to implement ZFS.
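
    The article doesn't say how "adaptive endian-ness" works, so the following is only a guess at the two general approaches being compared: a fixed big-endian on-disk order, versus writing in the writer's native order with a tag so that a reader swaps only when its own order differs. The one-byte tag and the function names are hypothetical.

```python
import struct
import sys

def write_big_endian(value: int) -> bytes:
    """Fixed on-disk order: always store a 64-bit field big-endian."""
    return struct.pack(">Q", value)

def read_big_endian(raw: bytes) -> int:
    """Little-endian hosts pay for a swap on every read of such a field."""
    return struct.unpack(">Q", raw)[0]

def write_native_tagged(value: int, byteorder: str = sys.byteorder) -> bytes:
    """Writer-native order plus a one-byte tag (0 = little, 1 = big)."""
    tag = b"\x01" if byteorder == "big" else b"\x00"
    return tag + value.to_bytes(8, byteorder)

def read_native_tagged(raw: bytes) -> int:
    """Decode using whichever order the tag says the writer used."""
    byteorder = "big" if raw[0] == 1 else "little"
    return int.from_bytes(raw[1:9], byteorder)

value = 0x0102030405060708
print(write_big_endian(value).hex())
print(write_native_tagged(value, "little").hex())
print(read_native_tagged(write_native_tagged(value, "big")) == value)
```

    With the tagged layout, data written and read on the same architecture never needs swapping, at the cost of checking the tag on each read, which is the trade-off questioned above.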
  30. Curious points by tod_miller · · Score: 3, Interesting

    "Sun's patent-pending "adaptive endian-ness" technology"

    ok, that aside. First 128bit file system, and get this: transactional object model

    I think this means it is optimistic but they figure it has blazing fast performance, who am I to argue. Fed up with killing this indexing garbage on the work machine, bloody microsoft, disabled it and everything and every full moon it seems to come out and graze on my HDD platter.

    From the MS article : This perfect storm is comprised of three forces joining together: hardware advancements, leaps in the amount of digitally born data, and the explosion of schemas and standards in information management.

    Then I started to suspect they would rant about Moore's law and sure e-bloody-nough

    Everyone knows Moore's law--the number of transistors on a chip doubles every 18 months. What a lot of people forget is that network bandwidth and storage technologies are growing at an even faster pace than Moore's law would suggest.

    That is like saying, everyone knows the number 9 bus comes at half 3 on wednesdays, but no one expects 3 taxis sat there doing nothing at half past 3 on a tuesday.

    Can we put this madness to rest? Ok back to the articles.

    erm... lost track now....

    --
    #hostfile 0.0.0.0 primidi.com 0.0.0.0 www.primidi.com 0.0.0.0 radio.weblogs.com
  31. Re:fileless systems by gorbachev · · Score: 2, Funny

    Something like:

    SELECT * FROM storage WHERE path = '/home/gorbachev/.cshrc' :)

    --
    In Soviet Russia, I ruled you
  32. 64 bits is awfully big already by pslam · · Score: 4, Informative
    Getting rid of file/drive size limitations for the foreseeable future?

    It would take over 500 years to fill a 64 bit filesystem written at 1GB/sec (and of course 500 years to read it back again). 64 bits is already an impossibly large figure. There's absolutely nothing special or clever whatsoever about doubling the size of your pointers aside from using up more disk space for all the metadata.

    64 bits is enough for today's filesystems in much the same way that 256 bit AES is enough for today's encryption - there are far bigger things that will require complete system changes than that so called "limit". I suspect a better filesystem will come along well before those 500 years are up... I agree with grandparent:

    -1, Marketing Hype.
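
    The 500-year figure is easy to sanity-check. A minimal sketch, assuming the 64 bits address individual bytes and that 1GB means 10^9 bytes (both are my assumptions, and the function name is made up):

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_to_fill(address_bits: int, bytes_per_second: float) -> float:
    """Years needed to write 2**address_bits bytes at a constant rate."""
    return (2 ** address_bits) / bytes_per_second / SECONDS_PER_YEAR

years_64 = years_to_fill(64, 1e9)
print(f"64-bit filesystem at 1 GB/s: about {years_64:.0f} years to fill")
```

    Raising the write rate by a factor of 1000 divides the result by 1000, so the conclusion depends heavily on how fast aggregate bandwidth grows.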

    1. Re:64 bits is awfully big already by Jeff+DeMaagd · · Score: 4, Insightful

      It would take over 500 years to fill a 64 bit filesystem written at 1GB/sec (and of course 500 years to read it back again).

      One product already can transfer a terabyte per second, so that would cut the transfer down to half a year. And I imagine that transfer rate would continue to increase.

      I don't see how one would necessarily argue against such a thing for products that will go for cluster and supercomputer use. I say you might as well get the bugs out while you can, so that once the 65th bit is needed, the supercomputer suppliers are ready.

      http://www.sc-conference.org/sc2004/storcloud.html

    2. Re:64 bits is awfully big already by Too+Much+Noise · · Score: 3, Interesting

      The funny thing is, by the time a 128-bit FS is really needed, any patents Sun has on ZFS will have expired. So whatever that day's Open Source OS of choice will be, it will at least support ZFS (and probably that time's 128-bit incarnation of several of today's FS's).

      Somehow, an alternate history where 80286 was 64-bit instead of 16-bit (while everything else staying the same) comes to mind when reading Sun's marketing on this.

    3. Re:64 bits is awfully big already by pslam · · Score: 5, Insightful
      Yeah, it's probably marketing hype now, but in 5 years, what about 10? Just because we can't do it now doesn't mean that we should stop progress.

      No, precisely because we can't do it now, and for the very predictable future, we shouldn't be wasting all that disk space, access and CPU time for a boundary that no production system is likely to ever reach before they get upgraded. That's just practicality.

      Seagate apparently sold 18.3 million desktop drives last year. Assuming they're all about 120GB (which is generous of me), that would be about 17.6*10^18 bits. Guess what, that's 2^64 bits. Yes, you would have to buy every single desktop hard drive Seagate shipped in the last year to have the capacity to fill a 64 bit filesystem. And find space for 18 million drives. And a power station to deliver the several hundred megawatts you'd need.

      Even at 2 times drive capacity growth per year that's still a ridiculously unattainable figure. In 14 years time you'd only need to buy 1000 drives (which are now 2000TB each). But 14 years is a geological time scale when it comes to computers. You'd have wasted 14 years of CPU time and disk space devoted to those extra 64 bits.

      If you still think 64 bits isn't enough, how about 96 bits? It would take 46 years before hard disks were big and cheap enough so you could fill the filesystem by buying 1000 of them. But no, they chose 128 bits because it sounded good.
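
      Here is a quick sketch of that arithmetic, using the drive count and size quoted above and assuming capacity doubles every year. Note that the comparison counts bits; a filesystem whose 64-bit addresses name bytes rather than bits would hold eight times more. The function name is hypothetical.

```python
import math

drives_sold = 18.3e6          # figure quoted above
bytes_per_drive = 120e9       # 120 GB, the assumption made above

total_bits = drives_sold * bytes_per_drive * 8
print(f"bits shipped in a year: {total_bits:.3g}")
print(f"2**64:                  {float(2**64):.3g}")

def doublings_needed(drives: int, target_bits: int) -> float:
    """Annual capacity doublings until `drives` drives together hold 2**target_bits bits."""
    needed_bytes_per_drive = 2 ** target_bits / 8 / drives
    return math.log2(needed_bytes_per_drive / bytes_per_drive)

years_64 = doublings_needed(1000, 64)
years_96 = doublings_needed(1000, 96)
print(f"1000 drives reach 2**64 bits after about {years_64:.1f} doublings")
print(f"1000 drives reach 2**96 bits after about {years_96:.1f} doublings")
```

      The results land a little above 14 and 46, in line with the round numbers used in the comment.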

    4. Re:64 bits is awfully big already by dTb · · Score: 2, Informative

      The filesystem has compression built in as an option to make storage more efficient. They currently use LZJB (fast but little reduction) compression but plan to add more powerful but slower compression at a later date.

    5. Re:64 bits is awfully big already by laird · · Score: 3, Interesting

      "It would take over 500 years to fill a 64 bit filesystem written at 1GB/sec"

      This is about the same argument as IPv6 addressing: it's expensive to change the size of the address space, so make it absurdly large because bits of address space are cheap, you enable some interesting unforeseen applications, and you put off a forced migration.

      While I agree that 128-bit block addressing is overkill for a single computer, once you're going to expand past a 64-bit filesystem, there's not much point in going smaller than a 128-bit filesystem. It's not like you'd save money making it an 80-bit filesystem.

      As to your point about the speed of a hard drive vs. the addressable space in the filesystem, keep in mind that filesystems are much larger than disks. For example, it's not that unusual (in cooler UNIX environments) for everyone in a company to work in one large distributed filesystem, which may run across hundreds or thousands of hard drives. Now imagine a building full of people working with very large files (e.g. video production) where you could easily accumulate terabytes of data. Wouldn't it be nice to manage your online, nearline, and offline storage as a single, extremely large filesystem? Or, for real blue-sky thinking, imagine that everyone on the planet uses a single shared, distributed filesystem for everything. Wouldn't it be cool to address _everything_ using a single, consistent scheme no matter where you are? Cool, eh?

    6. Re:64 bits is awfully big already by mcrbids · · Score: 3, Insightful

      1) Adding more address space bits doesn't significantly slow down performance.

      2) Migrating from one address space to another is painful. Why make it more frequent by aiming low? Do you think migration would be any less painful in 14 years?

      3) New applications: Broadband didn't just result in really fast web-page downloads - the entire online music industry stems from that. The original creators of TCP/IP had no idea that they were developing media on-demand, they were making it so that you could transfer bits from one archaic machine to another.

      Building flexible, capable systems creates an environment where development isn't as constrained by limitations - resulting in new, unpredictable developments.

      --
      I have no problem with your religion until you decide it's reason to deprive others of the truth.
    7. Re:64 bits is awfully big already by julesh · · Score: 2, Insightful

      once you're going to expand past a 64-bit filesystem, there's not much point in going smaller than a 128-bit filesystem.

      Why expand past a 64 bit filesystem? 64 bits with 1k blocks as your smallest addressable unit (which is more than reasonable for a filesystem this size) gives you 2^74 bytes to play with. For reference, that's 16 * 2^70 bytes = 16 * 2^30 terabytes, or "one hell of a lot of data".
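
      The powers of two can be checked in a few lines. A small sketch, assuming "terabyte" here means 2^40 bytes:

```python
BLOCK_SIZE = 1024                      # 1k blocks, as assumed above
capacity_bytes = 2**64 * BLOCK_SIZE    # 64-bit block addresses

terabytes = capacity_bytes // 2**40    # binary terabytes (2**40 bytes each)

print(capacity_bytes == 2**74)
print(terabytes == 16 * 2**30)
print(f"{terabytes:,} terabytes")
```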

    8. Re:64 bits is awfully big already by identity0 · · Score: 2, Insightful

      You've pointed out just why we need this. The problem is, you're still thinking in terms of individual hard drives in individual computers that can only be accessed by the local machine.

      What are you going to do when you access all of your data through a network, and the whole world has their storage on the internet, using a global filesystem? You said yourself that one manufacturer makes 2^64 bits of HD space every year, so 64-bit is obviously not enough. We need 128 bits if we want to be able to make use of all the HD space that is going to waste on networked computers today.

      Hell, we could do that today, if we had - wait for it - the right filesystem.

      The fact that it's Sun that came up with this suggests they're thinking along the same lines. They would benefit greatly if people started using a massively networked filesystem, especially if they own the code to it.

    9. Re:64 bits is awfully big already by Mornelithe · · Score: 2, Interesting

      Online, nearline and offline storage aren't the same address space. I have several partitions on my computer. Some use reiserfs, some use ext3 and some use ext2 (and vfat and ntfs...). In Linux, they're all mounted to look like one large filesystem hierarchy, but they're not. Each partition has its own filesystem 'address space.'

      So you don't need larger than a 64 bit filesystem unless you're going to have a single volume (real or virtual) that uses more than 16 billion terabytes of data. That's 64 billion 250 gig hard drives. What's the population of China these days? 2.5 billion or thereabouts? If you gave everyone in China 25 250 gigabyte hard drives, you'd come close to filling up a 64 bit filesystem (you'd fall short actually).

      And that's only if everyone in China uses a single, giant RAID array for those 64 billion hard drives.

      Or everyone on the planet gets 9 such hard drives. That's 1.75 terabytes for every single human being right now, and we're still within the limits of a 64 bit filesystem.

      Your video editing analogy doesn't even come close, and the idea of a whole country using a single, centralized volume (let alone the whole planet) doesn't really make any sense. Addressing all the data in the entire world on every computer at the filesystem level seems like a very bad idea, to me.

      Maybe in 10 to 15 years we'll have individual disks large enough so that large clusters can exceed the bounds of a 64 bit filesystem, but you'll still have to buy entirely new hardware to take advantage of that capability, so a 128 bit filesystem on today's hardware offers no advantages over a 64 bit filesystem, and in fact only makes things slower. Not really very cool at all if you ask me (although the other features of the filesystem likely have merit).

      --

      I've come for the woman, and your head.

    10. Re:64 bits is awfully big already by mcrbids · · Score: 2, Insightful
      If you could suggest to me a new application that needs over 8 billion times more storage capacity than top-of-the-range current systems, please, go ahead and introduce it. Just don't ask me for financing.
      Please read what I wrote! Or, is the word "unpredictable" not in your comprehension? Try reading this, word for word, and see if your response does anything but make you sound like an idiot:

      "3) New applications: Broadband didn't just result in really fast web-page downloads - the entire online music industry stems from that. The original creators of TCP/IP had no idea that they were developing media on-demand, they were making it so that you could transfer bits from one archaic machine to another."

      How could they predict iTunes? Why would you think it reasonable to predict the usage of such a filesystem?
      --
      I have no problem with your religion until you decide it's reason to deprive others of the truth.
  33. Forgot by TheFlyingGoat · · Score: 2, Funny

    I was going to respond to the article, but I forgot everything I know about file systems.

    --
    You have enemies? Good. That means you've stood up for something, sometime in your life. --Winston Churchill
  34. The final file system, XXXfs by www.sorehands.com · · Score: 2, Funny

    Though it comes before xyzzyfs, it is the last because it automatically generates and collects porn. Most geeks would never get past it.

  35. There already is an HFS as well. by tepples · · Score: 4, Funny

    Then why didn't IBM call its improved HFS "HFS Plus"? No wait, that would collide with Apple's HFS and HFS Plus, used in Mac OS.

    It would appear that there can be only twenty-six distinct file systems. Then Microsoft went and innovated NTFS with Four-Letter-Word File System Technology, which actually was just a copy of IBM's HPFS, the first to introduce File System Named After a Competitor Technology.

  36. Silly AC by 2nd+Post! · · Score: 2, Insightful

    You organize a 128bit file system with a database.

    Why bother with folders as a root? You can create a folder hierarchy *with* a database too.

  37. Shared data pools... by vspazv · · Score: 4, Interesting

    So what are the chances that someone could accidentally wipe the shared data pool for an entire company and how hard is recovery on a volume striped across a few hundred hard drives?

  38. Patents and other Bad Signs. by twitter · · Score: 4, Interesting
    Opensource is useless when it's patent encumbered. While it's nice that the details will be available, it sucks to think that I can't use them except to serve Sun for the next 17 years. Such disclosure, of course, is what the patent system is supposed to provide but does not. What the patent is providing is ownership of ideas. How obvious those ideas are and if there's prior art is impossible to say from the linked puff piece.

    This article is shocking. I'm used to much less hype and far more technical details from Sun. Software patents and bullshit are not what I expect when I follow a link to them.

    I don't like any of this.

    --

    Friends don't help friends install M$ junk.

    1. Re:Patents and other Bad Signs. by Anonymous+Writer · · Score: 3, Interesting

      Opensource is useless when it's patent encumbered.

      The GPL states the following...

      Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all.

      I thought that if the patent holder distributes patented material under the GPL, it is a declaration that the holder has relinquished control over the patented material for as long as it is applied under the GPL.

    2. Re:Patents and other Bad Signs. by gimpboy · · Score: 2, Informative

      There are many opensource licenses. All opensource means is that the code is available for inspection and modification. Opensource is more a copyright issue and has nothing to do with patents. The GPL --- which is not the same as opensource --- addresses both copyright and patent issues.

      --
      -- john
  39. Last Word? by tedgyz · · Score: 4, Funny

    It's a 128-bit filesystem, so doesn't that make it the last 8 words?

    --
    "No matter where you go, there you are." -- Buckaroo Banzai
  40. Some snippets from the article by ChrisRijk · · Score: 2, Informative

    ZFS achieves its impressive performance through a number of techniques:
    * Dynamic striping across all devices to maximize throughput
    * Copy-on-write design makes most disk writes sequential
    * Multiple block sizes, automatically chosen to match workload
    * Explicit I/O priority with deadline scheduling
    * Globally optimal I/O sorting and aggregation
    * Multiple independent prefetch streams with automatic length and stride detection
    * Unlimited, instantaneous read/write snapshots
    * Parallel, constant-time directory operations


    ZFS has some similarities to NetApp's WAFL in that it uses "copy on write".

    One of the fun things with ZFS is that it automatically stripes across all the storage in your pool. Disk size doesn't matter - it's all used. This even works across SCSI and IDE.

    One of the important things is that volume management isn't a separate feature. Effectively, all the current limitations of volume managers are blown away:

    Just as it dramatically eases the suffering of system administrators, ZFS offers relief for your company's bottom line. Because ZFS is built on top of virtual storage pools (unlike traditional file systems that require a separate volume manager), creating and deleting file systems is much less complex. Not only does this eliminate the need to pay for volume manager licenses and allow for single support contracts, it lowers administration costs and increases storage utilization.

    ZFS appears to applications as a standard POSIX file system--no porting is required. But to administrators, it presents a pooled storage model that eliminates the antique concept of volumes, as well as all of the related partition management, provisioning, and file system sizing problems. Thousands--even millions--of file systems can all draw from ZFS' common storage pool, each one consuming only as much space as it needs. The combined I/O bandwidth of all of the devices in that storage pool is always available to each file system.
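
    To make the pooled storage model in the quote above concrete, here is a toy sketch in Python. It is not ZFS code and does not reflect the real on-disk design; `StoragePool` and its methods are hypothetical names invented for illustration. The only point is that file systems reserve nothing up front and all draw from one shared capacity.

```python
class StoragePool:
    """Toy model of a shared pool (hypothetical, not the ZFS implementation)."""

    def __init__(self, device_sizes):
        self.capacity = sum(device_sizes)  # every device contributes its full size
        self.used = 0
        self.filesystems = {}              # name -> bytes consumed so far

    def add_device(self, size):
        # New space is immediately available to every file system in the pool.
        self.capacity += size

    def create_fs(self, name):
        # No size is chosen here: a file system starts empty and grows on demand.
        self.filesystems[name] = 0

    def write(self, name, nbytes):
        if self.used + nbytes > self.capacity:
            raise OSError("pool is full")
        self.filesystems[name] += nbytes
        self.used += nbytes

    @property
    def free(self):
        return self.capacity - self.used


pool = StoragePool([100, 200])   # two devices of different sizes
pool.create_fs("home")
pool.create_fs("mail")
pool.write("home", 250)          # larger than either single device
pool.add_device(500)             # grow the pool; no per-file-system resizing
pool.write("mail", 400)
print(pool.free)
```

    Contrast this with a volume-per-file-system layout, where "home" would have been limited by whatever size was picked when its volume was created.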


    This is also part of the stuff making admin and configuration far far simpler. The thing I like is that it should be far harder to go wrong with ZFS (not available in Solaris Express yet so I haven't seen this for myself).

    The very high degree of reliability as standard is very welcome too:

    Data can be corrupted in a number of ways, such as a system error or an unexpected power outage, but ZFS removes this fear of the unknown. ZFS prevents data corruption by keeping data self-consistent at all times. All operations are transactional. This not only maintains consistency but also removes almost all of the constraints on I/O order and allows changes to succeed or fail as a whole.

    All operations are also copy-on-write. Live data is never overwritten. ZFS writes data to a new block before changing the data pointers and committing the write. Copy-on-write provides several benefits:

    * Always-valid on-disk state
    * Consistent, reliable backups
    * Data rollback to known point in time
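
    The copy-on-write steps described above (write to a new block, then switch the pointers) can be sketched roughly like this. This is a toy in-memory model in Python, not how ZFS is implemented; `CowStore` and its methods are hypothetical names, and the single root swap stands in for the atomic commit.

```python
class CowStore:
    """Toy copy-on-write store: live blocks are never overwritten."""

    def __init__(self):
        self.blocks = {}    # block id -> bytes; each block is written exactly once
        self.next_id = 0
        self.root = {}      # current pointer table: name -> block id

    def _alloc(self, data):
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = data
        return block_id

    def write(self, name, data):
        new_id = self._alloc(data)    # 1. write the data to a fresh block
        new_root = dict(self.root)    # 2. build a new pointer table beside the old one
        new_root[name] = new_id
        self.root = new_root          # 3. commit by switching the root in one step

    def snapshot(self):
        # A snapshot is a reference to the current root. Later writes build
        # new roots, so this one keeps pointing at the old blocks.
        return self.root

    def read(self, name, root=None):
        table = self.root if root is None else root
        return self.blocks[table[name]]


store = CowStore()
store.write("report", b"first draft")
snap = store.snapshot()
store.write("report", b"second draft")
print(store.read("report"))        # current state
print(store.read("report", snap))  # state as of the snapshot
```

    If a crash happened between steps 1 and 3, the old root would still describe a complete, valid state, which is the "always-valid on-disk state" property in the list above.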

    "We validate the entire I/O stack, start to finish, no guesswork involved. It's all provable data integrity," says Bonwick.

    Administrators will never again have to run laborious recovery procedures, such as fsck, even if the system is shut down in an unclean fashion. In fact, Solaris Kernel engineers Bill Moore and Matt Ahrens have subjected ZFS to more than a million forced, violent crashes in the course of their testing. Not once has ZFS lost data integrity or leaked a single block.


    For more technical info see Matt Ahrens's and Val Henson's blogs - since they're among the engineers who worked on it.

  41. Actually, Novell already made ZFS... by thehunger · · Score: 3, Informative

    The codename for the first generation of Novell's current filesystem was ZFS. Why? Because it was supposed to be "the last, or final word" in file systems.

    Novell now calls it Novell Storage System (I think it used to be NetWare Storage System).

    Apart from the obvious fact that SUN didn't manage to be very original in naming their filesystem, it's noteworthy that Novell is porting their ZFS - now NSS - to Linux. It'll be part of Novell Open Enterprise Server - on both Linux and NetWare kernels.

    Off the top of my head, here are some features of NSS that SUN needs to exceed to qualify for a new "final word..":

    - Background compression
    - Fast on-demand decompression
    - Transactions
    - Pluggable Name spaces
    - Pluggable protocols (ie. http, nfs, etc)
    - Advanced Access control model with inheritance, rights filters, etc. integrated with directory service (duh!)
    - Quotas on user, group, directory level
    - 64-bit (ok, SUN obviously got that one)
    - mini-volumes
    - journaled
    - etc.

    Oh well, I won't bother continuing, but it's worth looking out for NSS. Hopefully Novell will open source it and not make it exclusive to their distros.

  42. There are a lot of cluster file systems by anzha · · Score: 5, Informative

    Right now there are a lot of file systems that do something not all that different from what Sun is proposing. The project I am on is evaluating them as we speak for a center wide filesystem. I've had the fun (no sarcasm, honestly) of setting up a number of different ones and helping to run benchmarks and tests against each. All of them have strengths. Every single one of them has some nasty weaknesses.

    If you are looking for an open source based cluster file system, Lustre is what you want. It's supported by LLNL, PNNL, and the main writers at ClusterFS Inc. It's a network based cluster FS. We've been using it over GigE. However, we've found that there needs to be a data server:client ratio of 3:1. We have only used one metadata server. Failover isn't the greatest. Quotas don't exist. It also makes kernel mods (some good and bad) that amount to a mild fork of the Linux kernel (they put them into the newer kernels every so often). It only runs on Linux. Getting it to run on anything else looks...scary.

    GPFS runs on AIX and Linux, even sharing the same storage. It runs and is pretty stable. It has the option to run in a SAN mode or as a network based FS. In the latter form, it even does local discovery of disks via labels so that if a client can see the disks locally it will read and write to them via FC rather than going through the server. It is, however, a balkanized mess. It requires a lot more work to bring up and run: there is an awful lot of software to configure to get it to run (re: RSCT. If you haven't had the joys of HATS and HAGS, count yourself very, very lucky).

    ADIC's StorNext software is another option. This one is good if you are interested in ease of installation, maintenance, and very, very fast speeds (damn near line speed on Fibre Channel). I have set this one up for sharing disks in less than two hours from first install to getting numerous assorted nodes of different OSes to play together (Solaris, AIX, Linux). It freakin' runs on virtually everything from Crays to Linux to Windows. Its issues seem to be scaling (right now it doesn't go past 256 clients) and it has some nontrivial locking issues (writing to the same block from multiple clients, and parallel I/O to the same file from multiple clients if you change the file size).

    There are some others that are not as mature. Among them are Ibrix, Panasas, GFS, and IBM's SANFS. All of them are interesting or promising. Only SANFS looks like it runs on more than Linux at this point, though. Our requirements for the project I am on are to share the same FS and storage instance among disparate client OSes simultaneously. This might not be the same for others though and these might be worth a look. Lustre dodges this because it's open source and they're interested in porting.

    --
    Do you know why the road less traveled by is littered with the bones of the unwary?
    1. Re:There are a lot of cluster file systems by Plugh · · Score: 3, Informative
      You forgot to mention the GPLed Cluster Filesystem that Oracle released some time ago.

      You also may want to check out the ASM (Automated Storage Manager). It only works for disks that Oracle manages, but it does some pretty cool automatic load-balancing and RAIDing.

      Disclaimer:
      Yes, I do work for ORCL.
      No, I do not work on either OCFS or ASM (but I have partied with those guys :-)

  43. Technically.... by jolyonr · · Score: 5, Funny

    The last word in file systems is "systems".

    Thank you.

    --


    Please read my Canon EOS tech blog at http://www.everyothershot.com
  44. British or American? by abb3w · · Score: 2, Interesting
    Billion billion is a perfectly valid number.

    True. However, it is more ambiguous than "million million million", as absent minded Brits might interpret it as a "million million million million".

    Or would you rather they say 6.0 × 10^18?

    Yes.

    Most people can't imagine that.

    Most people can't imagine it anyway, whether you call it "six billion billion", "6.0 x 10^18", "6 x 2^60", or "1.27 x e^43". Or understand any number higher than the number of dollars they carry in their wallet, for that matter. Anyone who needs to make any decisions in life based on this ZFS number ought to be able to understand it any of those ways (although getting help from a calculator for the last one or even two is understandable). Of course, many people manage things they can't understand. This is life.
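
    For anyone reaching for that calculator, a few lines of Python evaluate the four spellings side by side. The variable names are mine, and the short-scale "billion" (10^9) is assumed.

```python
import math

# Each spelling from the paragraph above, evaluated as a plain number.
six_billion_billion = 6 * 10**9 * 10**9   # short-scale billion assumed
scientific = 6.0e18
binary_form = 6 * 2**60
natural_form = 1.27 * math.exp(43)

for label, value in [
    ("six billion billion", six_billion_billion),
    ("6.0 x 10^18", scientific),
    ("6 x 2^60", binary_form),
    ("1.27 x e^43", natural_form),
]:
    # Show each value and how far it sits from 6.0 x 10^18.
    print(f"{label:>20}: {value:.4e}  (ratio {value / scientific:.3f})")
```

    The binary form comes out roughly 15% larger than the others, since 2^60 is a little more than 10^18.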

    --
    //Information does not want to be free; it wants to breed.
  45. The proof is in the pudding by melted · · Score: 3, Informative

    As someone who's been involved with performance/stress optimizations I can tell you that for each situation you can carefully put together two types of tests: one which proves that there's a problem, another that proves the problem doesn't exist.

    The proof is in the pudding. Let Sun release it and administrators use it for a year or two, then we'll see if it's good enough. Right now I'm having doubts it's as good as they want you to believe.

  46. ZFS by BJH · · Score: 2, Informative

    Two words:

    "Patent burdened"

  47. Boil the oceans, eh? by Hukui · · Score: 3, Funny

    Logically, the next question is if ZFS' 128 bits is enough. According to Bonwick, it has to be. "Populating 128-bit file systems would exceed the quantum limits of earth-based storage. You couldn't fill a 128-bit storage pool without boiling the oceans."

    Well...I never really like the oceans anyways. They were always so wet.

  48. White Papers by dTb · · Score: 2, Informative

    If anyone wants to read more details on the "Zettabyte File System" they can view the white papers on ZFS self-tuning and QOS as they contain far more detail than the marketing article given.

  49. Bill Gates just called.... by da5idnetlimit.com · · Score: 3, Funny

    Says he needs a new wallet...

    If Bill Gates had a nickel for every time Windows crashed... ..oh wait, he does.

    --
    It takes 40+ muscles to frown, but only four to extend your arm and bitchslap the motherfucker
  50. Easy upgrades by dTb · · Score: 2, Interesting
    I am very impressed by some of the ideas coming from Sun regarding this file system:

    "We're absolutely trying to make disk storage more like memory, and often use that analogy in our presentations. For example, when you add DIMMs to your computer, you don't run some 'dimmconfig' program or worry about how the new memory will be allocated to various applications; the computer just does the right thing. Applications don't have to worry about where their memory comes from. Likewise with ZFS, when you add new disks to the system, their space is available to any ZFS filesystems, without the need for any further configuration. In most scenarios it's fairly straightforward for the software to make the unequivocally best choices about how to use the storage. If you want to tell the system more about how you want the storage used, you'll be able to do that too (e.g. this data should be mirrored but that not; it's more important for this data to be accessed quickly but that can be slower). We hope that with relatively modern hardware, all but the most complicated and demanding configurations will be handled adequately without any administrator intervention." read more

  51. Re:fileless systems by Tony · · Score: 4, Interesting

    After years of everyone saying that the relational model was the answer to all data organization needs... the hierarchical model reappeared in the form of XML, and people realized that it is convenient to organize some types of data hierarchically.

    Convenient, and flawed.

    XML isn't designed to handle changing data. It's designed to be a data markup language, which indicates it's used for presenting data, not managing data.

    So far, the relational model is the best mathematically-rigorous method of managing sets of data. There are many advantages to hierarchical data representation, but for manipulation, the relational still trumps.

    Do I want to use SQL to access my files? Not if I don't have to. There are perhaps better methods, even some transparent methods.

    But, do I want to continue to self-organize my data? Hell, no! There's just too much information stored on my computer, and on my network, these days. And, considering that much of my data has multiple relationships, the hierarchical model is growing a bit long in the tooth. Many of my documents belong in multiple hierarchies.

    But, there might be a real solution soon:

    Gnome Storage looks to be a good first step.

    --
    Microsoft is to software what Budweiser is to beer.
  52. Gibi kibi mebi fibi by snol · · Score: 2, Funny

    Speaking of numbers no one can pronounce....

  53. think logarithmatic scale by pikine · · Score: 2, Interesting

    One of the key features of ZFS is that you can create a file system over a pool of storage. Nothing stops you from building a distributed storage pool of 18.3 million desktop drives (they don't have to be locally connected). You could apply the same concept as SETI@HOME and allow end users with excess storage space to lend it. Didn't someone talk about a peer to peer backup system a while ago?

    And c'mon, don't be so against hype. Not all numbers are evil. And the overhead to process some extra bits is minuscule. The space and time required are logarithmic in the size of the number set. E.g., 128-bit is some billion billion times the size of 64-bit, but only takes 2 times more to store and process. And this time is already small compared to the actual I/O time, and the space is small compared to the combined storage space.
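
    A quick way to see the logarithmic claim, as a rough sketch in Python (the helper name `bytes_needed` is mine, and this only measures storage for the address itself, not processing time):

```python
def bytes_needed(bits):
    """Bytes required to store one address of the given width."""
    return bits // 8

# Encode the largest address of each width as raw bytes.
addr64 = (2**64 - 1).to_bytes(bytes_needed(64), "big")
addr128 = (2**128 - 1).to_bytes(bytes_needed(128), "big")

range_ratio = 2**128 // 2**64            # how many times more addresses
size_ratio = len(addr128) // len(addr64)  # how many times more bytes per address

print(f"address space grows by a factor of {range_ratio:,}")
print(f"each address grows by a factor of {size_ratio}")
```

    The address space grows by a factor of 2^64 while each stored address only doubles from 8 to 16 bytes.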

    --
    I once had a signature.
    1. Re:think logarithmatic scale by Mornelithe · · Score: 2, Insightful

      Actually, it's only 18.3 million desktop drives if you address every single byte of the filesystem. Most don't do this; they allocate space in blocks. 1k is a reasonable block size if you're talking many terabyte systems.

      With a 1k block size, you'd be addressing 16 billion terabytes of storage. Let us know as soon as every single person on earth has more than 2 terabytes to donate to your distributed filesystem project.
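
      The arithmetic behind those figures can be checked with a short Python sketch. Two things here are my assumptions rather than anything stated above: terabytes are taken as binary (2^40 bytes), and world population is a rough mid-2000s figure of 6.4 billion.

```python
BLOCK_SIZE = 1024          # bytes; the "1k" block size
ADDRESS_BITS = 64          # a 64-bit filesystem addressing blocks, not bytes
TIB = 2**40                # binary terabyte (assumption)
WORLD_POPULATION = 6.4e9   # rough mid-2000s estimate (assumption)


def addressable_bytes(address_bits, block_size):
    """Total bytes reachable when every address names one block."""
    return (2**address_bits) * block_size


total_bytes = addressable_bytes(ADDRESS_BITS, BLOCK_SIZE)
total_tib = total_bytes // TIB
per_person_tib = total_tib / WORLD_POPULATION

print(f"{total_tib:,} TiB addressable")
print(f"{per_person_tib:.2f} TiB per person")
```

      With these assumptions the total is 2^34 TiB, which is the "16 billion" above in the binary sense (16 x 2^30), and the per-person share lands between 2 and 3 terabytes.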

      --

      I've come for the woman, and your head.

  54. More technical information on ZFS by ahrens · · Score: 2, Informative

    You can find some more technical information about ZFS in my weblog. Check out the comments to my first entry about ZFS, there are a few juicy details there and I'll do my best to answer any questions posted to my blog.

    Disclaimer: I work on ZFS at Sun.