Slashdot Mirror


Is ext4 Stable For Production Systems?

dr_dracula writes "Earlier this year, the ext4 filesystem was accepted into the Linux kernel. Shortly thereafter, it was discovered that some applications, such as KDE, were at risk of losing files when used on top of ext4. This was diagnosed as a rift between the design of the ext4 filesystem and the design of applications running on top of ext4. The crux of the problem was that applications were relying on ext3-specific behavior for flushing data to disk, which ext4 was not following. Recent kernel releases include patches to address these issues. My questions to the early adopters of ext4 are about whether the patches have performed as expected. What is your overall feeling about ext4? Do you think it is solid enough for most users to trust it with their data? Did you find any significant performance improvements compared to ext3? Is there any incentive to move to ext4, other than sheer curiosity?"

289 comments

  1. Risk Vs Benefits Analysis by eldavojohn · · Score: 5, Insightful

    Is ext4 Stable For Production Systems?

    Probably.

    Is there any incentive to move to ext4, other than sheer curiosity?

    Ok so I'm guessing production = income = your ass? Let me turn your question back to you by asking, "What is driving this need to move to ext4?" Because so far, all you've told me is that you are considering risking your ass for sheer curiosity.

    I may be grossly misinformed but that is how the question sounds to me. And by "your ass" I don't mean oh-no-we-had-a-service-outage-for-five-minutes ... no, we could have a customer on the phone saying, "You mean to tell me that the modifications being made to my site for the past 24 hours are gone?!"

    If it ain't broke, don't fix it!

    I don't know about you but I'm too busy dealing with shit like this to ponder new potential problems I can put into play.

    Look through this page for a rough comparison of ext4 with other file systems. There's a better list of features for ext4 here that will tell you why you might need to switch to it. It is backward compatible with ext3 and ext2 so moving to it may be trivial. If you're dealing with more than 32000 subdirectories or need to partition some major petabytes/exabytes then you might not have a choice. Some of these benefits are probably not worth risking your ass for, but if there's a business need that cannot be overcome any easier way then back your shit up and do rigorous testing before you go live with it. If you're using Slashdot to feel out if the majority of users scream OMGNOES so you don't waste your time doing that, then that's fine. Just don't do this if you don't have to.

    I tell you what, there's a $288 desktop computer at Dell today that you can buy, put ext4 on and your OS of choice and your application(s) and whipping boy it into next century without risking anything. Where I work we have two servers in addition to our production servers. I don't think this is an uncommon scheme so if you have a development server, throw it on there and poke it with a stick. Then move it to the testing server and let your testers grape it for two weeks. Then you'll know.

    --
    My work here is dung.
    1. Re:Risk Vs Benefits Analysis by Joce640k · · Score: 4, Insightful

      > If it ain't broke, don't fix it!

      This.

      --
      No sig today...
    2. Re:Risk Vs Benefits Analysis by BrokenHalo · · Score: 3, Insightful

      A shorter approach to the question:

      What do I gain by running with ext4?
      Is that gain worth the time spent changing what I've got?

      If the answer to the first question is that ext4 is cool and shiny, and the answer to the second is unknown, the OP has his answer.

      Filesystems are one thing we need to be VERY conservative about. We need to be certain that it works reliably, because we do not need to find our work disappearing out the end of our backup cycle after having discovered problems too late. (Yes, I know, what is this "backup" of which I speak?)

      I still have drives running ReiserFS, and I still use ext2 for boot partitions mounted readonly. I pretty much trust those systems, but even so, I still take backups and test them when I can.

    3. Re:Risk Vs Benefits Analysis by stinerman · · Score: 1

      It is backward compatible with ext3

      Not if you decide to use extents, which is a major reason why you'd want to use ext4. Per your link:

      The ext3 file system is partially forward compatible with ext4, that is, an ext4 filesystem can be mounted as an ext3 partition (using "ext3" as the filesystem type when mounting). However, if the ext4 partition uses extents (a major new feature of ext4), then the ability to mount the file system as ext3 is lost.

      But then again, if you're looking at ext4 just for extents, there have been other file systems that have used extents for a while.

    4. Re:Risk Vs Benefits Analysis by identity0 · · Score: 1

      >I may be grossly misinformed but that is how the question sounds to me.

      You are. The question is clearly asking about normal users, which is NOT uber-leet production $$$$ systems.

      > My questions to the early adopters of ext4 are about whether the patches have performed as expected. What is your overall feeling about ext4? Do you think is solid enough for most users to trust it with their data? Did you find any significant performance improvements compared to ext3? Is there any incentive to move to ext4, other than sheer curiosity?

      I see no problem with migrating a desktop to a different FS out of sheer curiosity, as long as one backs up one's personal data beforehand.

      Don't let the title fool you, the body of the text makes no reference to 'production systems' and it is likely something inserted by the editors.

      But yeah, whoo smartass. All it takes is a smartass attitude to get +5 Insightful these days.

    5. Re:Risk Vs Benefits Analysis by Anonymous Coward · · Score: 0

      If it ain't broke, don't fix it!

      This.

      Fixed that for you.

    6. Re:Risk Vs Benefits Analysis by Jurily · · Score: 1

      What do I gain by running with ext4?

      And also, "What do I lose?". Ext4 is nowhere near trustworthy in my eyes. I'll probably switch about the same time I abandon KDE 3.5.

    7. Re:Risk Vs Benefits Analysis by Anonymous Coward · · Score: 0

      Two words: ad hominem.
      Another two: fuck off (see I also can do this!)

      BTW: I'm not an original poster.

    8. Re:Risk Vs Benefits Analysis by Hognoxious · · Score: 1

      Ok so I'm guessing production = income = your ass?

      It means the one where those people we all hate (salesmen and accountants) do their stuff. In other words, not the dev[1] system, not the one-of-n-different-flavours-of-test system, not the sandbox system. Nor is it the training, or the QA system. Get it?

      This has been standard terminology since pretty much forever.

      [1] before you need to guess again, that's short for "development".

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    9. Re:Risk Vs Benefits Analysis by diegocgteleline.es · · Score: 1

      What do I gain by running with ext4?

      Barriers enabled by default. The fact that most ext3 users are running without them is scary.

      If you keep using ext3, enable them. They aren't enabled by default because they have a noticeable performance hit, but if you are paranoid about corruption, you really want them.

    10. Re:Risk Vs Benefits Analysis by identity0 · · Score: 1

      Hey, he's the one who started flinging the insults, bro.

    11. Re:Risk Vs Benefits Analysis by mqduck · · Score: 1

      Is there any incentive to move to ext4, other than sheer curiosity?

      I'm not sure how you read otherwise, but it sounds to me like that question is explicitly stating that curiosity isn't a good enough reason.

      --
      Property is theft.
    12. Re:Risk Vs Benefits Analysis by slaufer · · Score: 1

      As a side note, you don't need to buy even a $288 computer to test it; you can get a free one here and here.

    13. Re:Risk Vs Benefits Analysis by Master of Transhuman · · Score: 1

      Oh, man, THANK YOU! For sending me to the "Website is down" YouTube video - I haven't laughed like that in months! That thing deserves a /. post of its own! I've sent the link to some of my clients, penis picture regardless!

      --
      Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!
    14. Re:Risk Vs Benefits Analysis by drolli · · Score: 1

      The only reason to switch away from a perfectly working fs to something which has been in the stable kernel for less than 2 years is that you need to do it (e.g. featurewise, performance). If it is about things barely measurable in normal life (e.g. speedups of around 10%) for your application, forget it. The guys who need it can test it first. Most likely they make more qualified bug reports than somebody who obviously does not even know if he needs it or not.

      I used ext3 starting from 2003 in production systems and I can not say that I suffered from using ext2 for two years more. Some people tried to tell me that it helps me reboot the fileserver I maintained (for 30-40 computers, approx 20 people) faster after a crash, but our small fileserver was not rebooted that often anyway, and crashes were so seldom that the time for checking the fs was clearly acceptable.

    15. Re:Risk Vs Benefits Analysis by tuxgeek · · Score: 1

      Insightful Post ..very good points made indeed

      My personal experiences w/ ext4
      Currently using/testing Kubuntu 9.04 with KDE 4.2.3 & ext4
      All my systems are considered as production as I am a corporate business owner contractor and use this for accounting, administration, bidding, communication, development, etc.
      Kubuntu is not as stable as Lenny w/ KDE 3.5.10, but is stable enough for my needs. KDE 4.2.3 does provide some very nice features and the eye candy kicks ass. I also develop my own business software using QT4, which is why I opted for this configuration

      EXT4 seems snappier than ext3 and I like some new features it offers although I may not be utilizing many to their fullest extent. I have also experienced files vanishing when being saved but had backups so the setback was not detrimental with exception of lost modifications.

      Overall I prefer this setup over all other OS options available. It is always a pure joy to use, fast and efficient.

      --
      "Suppose you were an idiot...and suppose you were a member of Congress...but I repeat myself." Mark Twain
  2. Ye by identity0 · · Score: 5, Funny

    I've been running ext4 on my system and everything's fi

    1. Re:Ye by dov_0 · · Score: 4, Interesting

      I've been running ext4 for / , but left ext3 for /home where any KDE apps I run could fudge writes. No problems at all.

      --
      sudo mount --milk --sugar /cup/tea /mouth /etc/init.d/relax start
    2. Re:Ye by TCM · · Score: 4, Insightful

      So you used the "riskier" fs for / where you don't actually need the features it provides and used the "more stable" fs where features could actually be useful because app/fs developers couldn't agree on semantics?

      Only on Linux...

      --
      Of course it runs NetBSD. BTC: 1NT7QvbetmANwaMzhpVL6
    3. Re:Ye by diegocgteleline.es · · Score: 1

      Use the "riskier" filesystem for the unimportant data (a desktop operating system can be reinstalled easily), use the stable filesystem for the data that matters (personal data that you won't be able to recover if you lose it). What's wrong with that?

    4. Re:Ye by DarkOx · · Score: 1

      Nothing is wrong with that at all. In other situations it might make all sorts of sense, like if ext4 offered vastly improved read speeds somehow on the relatively smallish files on most people's / partition. The only place I really see ext4 offering anything is maybe in /lib or /etc where the number of files can grow very large, otherwise what's the point. The files are not big, they don't change much (mine is mounted RO). I just don't get the point of a fancy FS on ones / part. EXT2 would really be fine, probably ideal. Now I guess if /var is not on its own partition or you don't run tmpfs or something similar on /tmp you might gain something, but if those are true you're probably not exactly optimized for file system performance in any case.

      --
      Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
    5. Re:Ye by dov_0 · · Score: 1

      So you used the "riskier" fs for / where you don't actually need the features it provides and used the "more stable" fs where features could actually be useful because app/fs developers couldn't agree on semantics?

      Or I used the potentially riskier filesystem for executables, system wide libraries and config files that are not regularly rewritten to realize a noticeable increase in system speed while leaving the older more reliable filesystem for my important documents, mp3's and personal config files where speed is not so much of a concern.

      Only on Linux...

      Yes. On Linux I can reinstall my system in 15-20 mins and all the extra packages I use in another 20mins. That time includes making myself a cup of tea. All my personal config files are in /home so I won't have to configure anything after a re-install.

      --
      sudo mount --milk --sugar /cup/tea /mouth /etc/init.d/relax start
    6. Re:Ye by H.G.Blob · · Score: 1

      I did it like that too until the last fsck took more than 15 minutes, then I switched /home too. No problems at all.

    7. Re:Ye by dov_0 · · Score: 1

      I just don't get the point of a fancy FS on ones / part.

      Basically I want to move across to ext4, but have only gone as far as it seems safe at this time to do so. I have noticed an increase in overall speed on the system.

      --
      sudo mount --milk --sugar /cup/tea /mouth /etc/init.d/relax start
    8. Re:Ye by dov_0 · · Score: 1

      Just out of interest, do you run any KDE4 apps?

      --
      sudo mount --milk --sugar /cup/tea /mouth /etc/init.d/relax start
    9. Re:Ye by swilver · · Score: 1

      Yes... cause only KDE apps did this... *groan*

    10. Re:Ye by TCM · · Score: 1

      Actually, you are right with the last part. It's just that I wouldn't use ext4 for system partitions.

      Even if you can restore the system easily, why risk it if you don't get any benefit?

      --
      Of course it runs NetBSD. BTC: 1NT7QvbetmANwaMzhpVL6
    11. Re:Ye by H.G.Blob · · Score: 1

      Yup, running the experimental Kubuntu 9.04. Never had a problem with the stock kernel.

  3. Wrong question by AmiMoJo · · Score: 5, Insightful

    You are asking the wrong question. Ext4 does not need fixing, the apps do.

    Are your apps patched yet?

    --
    const int one = 65536; (Silvermoon, Texture.cs)
    SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    1. Re:Wrong question by QuoteMstr · · Score: 5, Interesting

      Face it: your side lost. "fsync everywhere" is an infeasible, untenable, and useless position to take.

      fsync-on-rename creates a much better environment for application developers and users alike. The Right Thing happens by default, and I maintain that nobody actually wants the unsafe rename behavior. Allowing an application "choice" in this respect is a red herring.

      The only improvement I'd make is to flush the file involved on every rename, not just renames that happen to overwrite an existing file. Under the current scheme, an application doing the write-close-rename to replace a file will still be put in a bind if the file to write doesn't exist yet. (i.e., you can still end up with a zero-length file where no such file ever existed on a running system)
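
      For readers unfamiliar with the idiom, here is a minimal Python sketch of write-close-rename, with the explicit fsync that the "fsync everywhere" position asks applications to add. The name atomic_replace and the durable flag are illustrative assumptions, not anything taken from KDE or the kernel. With durable=False it is the plain write-close-rename that relies on the filesystem writing the data before the rename.

```python
import os
import tempfile


def atomic_replace(path, data, durable=True):
    """Replace the file at `path` with `data` (bytes) via a temp file and rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            if durable:
                tmp.flush()             # push Python's buffer to the kernel
                os.fsync(tmp.fileno())  # ask the kernel to push the data to disk
        os.replace(tmp_path, path)      # rename the temp file over the target
    except BaseException:
        # Remove the temp file if anything failed before the rename completed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

      Note that the sketch works the same way whether or not the target already exists, which is exactly the case where the comment above says the patched behavior may still leave a gap.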

    2. Re:Wrong question by k8to · · Score: 5, Insightful

      There was no single loser here.

      Ext4 should handle the case gracefully, but the apps will fail on other filesystems, and they *will* be run on those filesystems, so they should fix the bugs.

      --
      -josh
    3. Re:Wrong question by eldavojohn · · Score: 3, Insightful

      You are asking the wrong question. Ext4 does not need fixing, the apps do.

      Are your apps patched yet?

      At the risk of revealing just how incredibly inept I am about file systems ... shouldn't your "apps" (and by apps I am guessing you mean applications) be calling the operating system to do anything to the file system? I mean, isn't the point of operating systems to create or contain APIs and the like that allow you to interface with any file system type that the OS supports?

      I guess what I'm asking is just the technicality that only his operating system need be patched and tested for it?

      Again, I don't really do this type of coding and in all the C programming I've done, I've never seen a need or way even to get down and dirty with the file system. I can dream up cases (like Google's bigtable) where that may be desirable with benefits if well planned but I would imagine most of the time it would be unwise and unsafe and make you dependent on a type of file system.

      --
      My work here is dung.
    4. Re:Wrong question by nwanua · · Score: 4, Interesting

      Wha....? Are you seriously suggesting that applications/utilities need to be patched to deal with faulty (yes, faulty) filesystem semantics? For _every_ single filesystem they might encounter? The whole point behind a filesystem layer is to present a unified view of files to the user layer regardless of physical media or driver quirks.

      The point is really that ext4 is/was broken, and IMO, any filesystem requiring patches to applications in order not to lose data is no filesystem at all. It's unbelievable (despite the technical benefits of ext4) that this would even be up for consideration.

    5. Re:Wrong question by blueg3 · · Score: 5, Informative

      The problem is that some applications assume a behavior that is not supported by the POSIX definitions (the guarantees provided by the OS functions they're calling). However, it happens to be the behavior on existing filesystems and happens to be convenient. Now a new filesystem comes along and sticks to the POSIX definitions but does not follow this behavior. Application breaks, people complain.

      As a simplified example, imagine you create file B, then delete file A. Existing filesystems happen to do this in order, so you always have at least one of A or B. (If the system crashed partway through, you might have both A and B.) Your application fails if neither A nor B is present. POSIX doesn't require that the operations be performed in order. New filesystem comes along and sometimes does them in the reverse order, so if the system crashes at the wrong time, neither A nor B is left on the filesystem.

    6. Re:Wrong question by Anonymous Coward · · Score: 0

      If my understanding of this problem is correct, it occurs only if the system crashes shortly after the KDE apps have updated their configuration files. If so many people think this is a big problem, I'm more worried about the great number of constantly crashing linux machines.

    7. Re:Wrong question by icebike · · Score: 4, Interesting

      Face it: your side lost. "fsync everywhere" is an infeasible, untenable, and useless position to take.

      And had it been enforced, as soon as all developers went thru and added the fsync calls everywhere it would have become necessary for file system maintainers to no-op fsync calls in order to regain any approximation of prior performance.

      Flushing "one file" is not always sufficient. Calling fsync() does not necessarily ensure that the entry in the directory containing the file has also reached disk. For that an explicit fsync() on a file descriptor for the directory is also needed. And perhaps the higher level directory as well.
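
      A minimal Python sketch of that extra step, assuming a POSIX system where a directory can be opened read-only and handed to fsync (this works on Linux; it is not portable everywhere). The helper names fsync_directory and durable_rename are made up for illustration.

```python
import os


def fsync_directory(dir_path):
    """Flush a directory's entries (names) to disk, not the files inside it."""
    fd = os.open(dir_path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def durable_rename(src, dst):
    """Rename src to dst, then flush the directory that now holds dst."""
    os.rename(src, dst)
    fsync_directory(os.path.dirname(os.path.abspath(dst)))
```

      This only flushes the directory entry. The file's own data still needs its own fsync on the file descriptor before the rename if it has to survive a crash.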

      --
      Sig Battery depleted. Reverting to safe mode.
    8. Re:Wrong question by TheSunborn · · Score: 1

      But even then you might end up with a zero byte file, if your system crashes between the close and rename call. (Or between write and close, or during write, or well anytime after open).

      But I don't really think that a zero-size file possibly being left behind after a crash is such a problem.

      But what we really need is a flag to close (or open) called FLUSH_ON_CLOSE that flushes a file when it's closed. There are so few situations where you would not want to do that, so maybe it should be the default, and we could add a DO_NOT_FLUSH_ON_CLOSE.

      Bonus points for anyone who can give a realistic use case for DO_NOT_FLUSH_ON_CLOSE

      I don't think it's more effective to delay the flush, because you are not going to write anything in/behind the last flushed data block.
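
      FLUSH_ON_CLOSE is a proposal here, not an existing flag. What an application can do instead is flush by hand just before closing, roughly like this Python sketch (open_flush_on_close is a hypothetical helper name):

```python
import contextlib
import os


@contextlib.contextmanager
def open_flush_on_close(path, mode="wb"):
    """Open a file and fsync it right before it is closed."""
    f = open(path, mode)
    try:
        yield f
        f.flush()             # push Python's buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to push the data to disk
    finally:
        f.close()             # always close, even if the body raised
```

      Usage would look like "with open_flush_on_close(path) as f: f.write(data)". If the body raises, the file is closed without being flushed.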

    9. Re:Wrong question by Anonymous Coward · · Score: 5, Insightful

      Only on Linux is it the user's fault that apps have data loss because the Linux kernel people changed filesystem semantics. At least Microsoft takes some responsibility for their mistakes :-/

      I did follow the ext4 debate. Here's my quick synopsis.

      • Linux kernel hacker discovers he can make a certain microbenchmark run 50% faster if he allows reordering of filesystem metadata writes ahead of filesystem data writes. Said hacker checks in code with a "now 50% faster!!!" message.
      • A few months later, users start discovering data corruption of KDE files. Specifically, a copy of A to A', ftruncate(A'), write(A'), rename(A' to A), host crash, causes the resulting file to contain A data and not A' data despite the well-known atomic "rename" that serves as a barrier.
      • Linux kernel hacker ignored problem as not-a-bug, since the apps didn't make use of fdatasync() / fsync() correctly, which (using Posix semantics) would have prevented data corruption. The detail to note here is that Posix doesn't actually say that rename is a write barrier for data and metadata, even though everyone would assume that it is a write barrier and ALL other filesystems have treated it as a write barrier. (And in my opinion as a professional systems programmer, this is an oversight in the Posix standard and not a desired behavior). So the linux kernel hacker is technically correct but has introduced a behavior that goes against all previous implementations.
      • Linux kernel hacker (and some Slashdot posters) attack KDE developers for being incompetent because they didn't read a sub-sub-sub clause of the Posix spec that (1) isn't mentioned in the man pages, (2) only gets read by kernel programmers anyway, and (3) is about two orders of magnitude more arcane than the average desktop app developer will ever read documentation.
      • 90% of users and 80% of programmers wonder what the hell fdatasync() and fsync() and the difference between data and metadata write barriers are, and why the default behavior is to corrupt data.
      • Linux kernel hacker promises to commit a few patches to fix the problem, so as not to break software that has worked perfectly fine for the past 10 years.
      • Those of us with experience realize that since said kernel hacker didn't believe this was a problem in the first place, the patches are as likely to be half-hearted band-aids as to actually increase data integrity guarantees. Programming has a long and proud history of making a quick fix to satisfy "management" (in this case, the Linux community) that makes one symptom go away and doesn't actually fix the underlying problem.
      • We get an Ask Slashdot asking if the problem actually got fixed, because 99% of us do not have the technical expertise to understand patches to the Linux filesystem to figure out if this actually got fixed.

      I do have a moral to this story. Filesystems have one cardinal, inviolable rule. DO NOT CORRUPT THE USER'S DATA. The guarantee is that if a user makes a read, the user will get back either good data OR an error (or explicit indication of no data). Google likes filesystems that lose data - but they don't ever give back corrupt search results. Ext3 can reorder writes - but defaults to a safe 5-second flush rate to keep the window of unexpected corruptions small. Ext4 ignored this rule and allows silent data corruption so that this filesystem can be the best at certain microbenchmarks, and instead of accepting responsibility, the kernel hacker in question blames everybody else.

      The greatest danger to Linux's success is not Microsoft. It's the hubris of many Linux developers, users, and advocates, who are too busy disavowing responsibility and blaming everybody else to fix real users' problems. (And yes, I'm a follower of the Raymond Chen philosophy)

    10. Re:Wrong question by Jane Q. Public · · Score: 5, Funny

      Huh? Buddy, this is Slashdot. There are lots of single losers here.

    11. Re:Wrong question by RiotingPacifist · · Score: 3, Informative

      How should the apps behave? write, rename is the best way to do what they want. If you can't trust the filesystem to rename a file (and not just not rename it, but leave its metadata wrong so that neither the new nor the original is in the correct place), then what sort of program are you going to be able to run?

      --
      IranAir Flight 655 never forget!
    12. Re:Wrong question by RiotingPacifist · · Score: 4, Interesting

      Hmm, I think most of them are, but I'm still having problems with mv. Seriously, can we stop this bullshit? ext4 was clearly not working!
      If you can't rename a fucking file without risking total corruption of the file (at no point in renaming "settings-new" to "settings" should the file "settings" become unusable), what the fuck CAN kde4 do?

      --
      IranAir Flight 655 never forget!
    13. Re:Wrong question by Anonymous Coward · · Score: 1, Insightful

      By your logic, web standards should be changed to match the behavior of Microsoft IE. Since IE is the most popular browser, it should not be forced to conform to the incompatible ("faulty") web standards.

      This is exactly why we need precise interface specifications, along with powerful tools for checking against those specs. Otherwise, application developers will find some idiom that appears to work without regard to whether they are assuming more than the spec guarantees. As a result, their code is broken. The current OS code might not expose the error, but a future one will. The OS code should not have to include hacks for every possible interface error that could be present in application code.

    14. Re:Wrong question by QuoteMstr · · Score: 5, Insightful

      But even then you might end up with a zero byte file, if your system crashes between the close and rename call. (Or between write and close, or during write, or well anytime after open).

      This statement is incorrect. Suppose you want to atomically replace the contents of file "foo". Your application will write a file "foo.tmp", then call rename("foo.tmp", "foo"). At no time on a running system does any process observe a file called "foo" that does not have either the new or the old contents, and this invariant holds true whether or not "foo", "foo.tmp", or any other file has been flushed to the disk.

      On the filesystem level, the kernel can actually write the contents of foo.tmp to disk whenever is convenient. The only constraint is that the on-disk name record for "foo" must be updated to point to the new data blocks from foo.tmp only after these data blocks have themselves been written to disk. That's the issue here: without that ordering guarantee, the kernel can write a file's name record before its data blocks. If the system crashes after the name record is written but before the data blocks are, what's observed on the recovered system is a zero-length file.

      That's the problem here: the kernel is conjuring out of thin air a zero-length file that never actually existed on a running system.

      Forcing applications to call fsync is not only an onerous burden on application developers, but it also reduces performance because it gives the filesystem less freedom than the much looser constraint on rename above.

      Bonus points for anyone who can give a realistic use case for DO_NOT_FLUSH_ON_CLOSE

      1. Application configuration files. You don't care that they hit the disk immediately, but only that when they do hit the disk, they're not corrupt
      2. /etc/mtab

      Flushing on close is the wrong thing: it far exceeds the minimum requirements that most applications actually need, which will substantially reduce performance.

    15. Re:Wrong question by Achromatic1978 · · Score: 1

      Most insightful and informative AC. Ever.

      Where's my mod points?

    16. Re:Wrong question by Hognoxious · · Score: 1

      POSIX doesn't require that the operations be performed in order.

      [eldavojohn mode]I guess[/] it doesn't forbid it either. So what's the reason, other than pure pedantry, to do them in random order?

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    17. Re:Wrong question by Anonymous Coward · · Score: 0

      I love you more than a person ought to love an anonymous poster on a web site.

      The whole point of computers is to make them do things that the user wants them to do. Ext4 as delivered has been a clear violation of that standard.

    18. Re:Wrong question by QuoteMstr · · Score: 4, Insightful

      The problem is that some applications assume a behavior that is not supported by the POSIX definitions

      POSIX is a red herring here. It covers the behavior of a running system, and makes no guarantees about atomicity or durability following a crash. After a crash and as far as POSIX goes, it's perfectly legitimate to overwrite the entire disk with hentai. Every crash recovery technique goes beyond POSIX because POSIX says nothing about crashes.

      POSIX doesn't require that the operations be performed in order

      It most certainly does! On a running system, if you rename B over A, at no point does any process on the system observe a file called "A" that does not have either the contents of the old A or the contents of B. THIS ATOMICITY IS A FUNDAMENTAL POSIX GUARANTEE.

      Filesystems should do their best to honor this guarantee (which always applies on a running system, remember) even when the system crashes. Filesystems don't have to do that according to POSIX. Instead, they should do it because it's a sane thing to do, and doesn't violate anything POSIX guarantees. POSIX is not the arbiter of what a good system should be. It's perfectly reasonable to make guarantees that go beyond POSIX, and every real-world operating system does precisely that. POSIX guarantees are necessary but insufficient for a reasonable system in 2009.
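
      The running-system half of this can be checked with a small experiment. The Python sketch below is only an illustration under the assumption of a POSIX filesystem; the function name and file names are made up. A writer keeps renaming a finished temp file over "foo" while a reader keeps opening "foo" and recording what it sees. No fsync is used, so it says nothing about what survives a crash, which is the part POSIX leaves open.

```python
import os
import tempfile
import threading


def observe_during_renames(directory, rounds=200):
    """Rename new versions over 'foo' while a reader thread records what it sees."""
    target = os.path.join(directory, "foo")
    tmp = os.path.join(directory, "foo.tmp")
    with open(target, "w") as f:
        f.write("version 0\n")
    # Every complete version that ever legitimately existed under the name "foo".
    allowed = {"version %d\n" % i for i in range(rounds + 1)}
    seen = set()
    done = threading.Event()

    def reader():
        while not done.is_set():
            with open(target) as f:
                seen.add(f.read())

    thread = threading.Thread(target=reader)
    thread.start()
    for i in range(1, rounds + 1):
        with open(tmp, "w") as f:
            f.write("version %d\n" % i)
        os.replace(tmp, target)  # atomic swap of the name on a running system
    done.set()
    thread.join()
    return seen, allowed
```

      Everything in the returned "seen" set should be one of the complete versions in "allowed", never an empty or half-written file.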

    19. Re:Wrong question by Anonymous Coward · · Score: 0

      POSIX doesn't require that the operations be performed in order.

      [eldavojohn mode]I guess[/] it doesn't forbid it either. So what's the reason, other than pure pedantry, to do them in random order?

      Ohhhh, I think I've got a fan! :-)

      Steve Stephenson is that you?

    20. Re:Wrong question by Requiem18th · · Score: 1

      The greatest danger to Linux's success is not Microsoft. It's the hubris of many Linux developers, users, and advocates, who are too busy disavowing responsibility and blaming everybody else to fix real users' problems

      Unlike Microsoft who takes all responsibility for any malfunction in its softw--Oh that's right the EULA crowd never does.

      Come on, no Ubuntu LTS uses ext4 by default, nor Debian stable, nor OpenBSD AFAIK.

      When you are dealing with the bleeding edge it's normal for things to break. This is not disavowing responsibility, it's fixing the problem where the problem is.

      --
      But... the future refused to change.
    21. Re:Wrong question by GryMor · · Score: 2, Insightful

      Performance optimization. You can get much better write rates if you can reorder the writes to be sequential on disk, starting with whichever one the disk head can get to first.

      --
      Realities just a bunch of bits.
    22. Re:Wrong question by TCM · · Score: 1

      Come on, no Ubuntu LTS uses ext4 by default, nor Debian stable, nor OpenBSD AFAIK.

      What has OpenBSD got to do with anything in this discussion?

      --
      Of course it runs NetBSD. BTC: 1NT7QvbetmANwaMzhpVL6
    23. Re:Wrong question by Anonymous Coward · · Score: 1, Insightful

      Unfortunately, "fixing" apps to work around ext4's brokenness means you have to fsync the new version of a file before renaming it over the old one. So instead of having KDE's 500 config files being lazily flushed to disk in a single 10-millisecond disk write, each one gets written synchronously, hanging your system for 5 whole seconds. Brilliant.

      Or, I could just use ext3, which gives sane behavior (preserving either the old or new version of a file, don't care which) and doesn't require apps to be written in a way that makes you feel like you're running DOS on floppy disks.

    24. Re:Wrong question by Requiem18th · · Score: 1

      AFAIK BSDs can be run on top of any fs, including ext4, just not by default.

      --
      But... the future refused to change.
    25. Re:Wrong question by Kjella · · Score: 1

      A few months later, users start discovering data corruption of KDE files. Specifically, a copy of A to A', ftruncate(A'), write(A'), rename(A' to A), host crash, causes the resulting file to contain A data and not A' data despite the well-known atomic "rename" that serves as a barrier.

      No, it's more fucked than that. The rename has pointed A to A', but the data for A' has not been written so you have NO data, only a zero byte file. From a "high-level" perspective, and by high level I mean I want to atomically replace file A with A', then this is clearly a major WTF but apparently not for the ext4 developers. That means there's bigger chances of ice skating contests in hell than me installing ext4 on a production server.

      --
      Live today, because you never know what tomorrow brings
    26. Re:Wrong question by thsths · · Score: 1

      Nor is fsync() what you want - you want an atomic file replace operation. Rename is atomic, and it used to work, but with delayed allocation it may happen before the file is written. So what you want is an atomic file replace operation that does not happen before the data write. Rename may not be the best option for that - a special file write mode may actually be better. In any case the issue affects both sides - kernel and user space.

    27. Re:Wrong question by AmiMoJo · · Score: 1

      So, to bring it back round to the point, isn't the problem that apps break if this undefined behaviour isn't stuck to? That sounds like a flaw in the app.

      Can someone offer a specific example of a program that requires this behaviour for a good reason?

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    28. Re:Wrong question by HeronBlademaster · · Score: 1

      You're using a different situation than your parent post gave to try to prove him wrong.

      He used a two-operation sequence as an example, saying POSIX doesn't guarantee they'll happen in order: create B, then delete A. He said nothing about renaming B over A.

      Your example was one operation: rename B over A. Yes, this is one operation, and yes, POSIX guarantees it will happen atomically.

      Neither of you is wrong (as far as I can tell) and there's no reason both of you can't be right (since you're describing different situations).

    29. Re:Wrong question by QuoteMstr · · Score: 1

      Your example was one operation: rename B over A. Yes, this is one operation, and yes, POSIX guarantees it will happen atomically

      This is the operation salient to the ext4 discussion. The other operation is a distraction. (But for the record, POSIX also guarantees that they happen in order. How could it be otherwise?)

    30. Re:Wrong question by osu-neko · · Score: 2, Insightful

      At least Microsoft takes some responsibility for their mistakes

      Actually, I'll take the process you described above over what occurs at Microsoft or other closed-source shops any day. They also have their fair share of stubborn, arrogant developers with the kind of attitude displayed above. The reason you don't see the kind of detailed analysis of what happened all the time like the one above is simply that it all occurs behind closed doors. Oh, and because of that, you don't see the kind of outcry that eventually leads to patches until after the product ships, if ever. Microsoft can say "we can't help it if a hardware crash corrupts your data" as well as anyone else.

      Everything else in the post is right, it's just wrong in the implications that this is somehow unique to Linux, or indeed anything other than substantially less common in Linux than at Microsoft or other such corporate development communities. Frankly, it's more common where it's less likely to result in a public airing like the one above.

      --
      "Convictions are more dangerous enemies of truth than lies."
    31. Re:Wrong question by TCM · · Score: 1

      No idea where you got your knowledge.

      The BSDs support Ext2. Although personally, I wouldn't do anything write-related to an Ext2 fs on BSD, let alone use it for the system itself.

      As for Ext3 or even Ext4, well, just no.

      --
      Of course it runs NetBSD. BTC: 1NT7QvbetmANwaMzhpVL6
    32. Re:Wrong question by QuoteMstr · · Score: 1

      So, to bring it back round to the point, isn't the problem that apps break if this undefined behaviour isn't stuck to?

      Stop willfully misinterpreting me. The behavior is undefined as far as POSIX is concerned, not undefined in general. It's okay for programs to rely on behavior that goes beyond POSIX. A program that relied on POSIX and nothing else couldn't do anything about crash recovery anyway, since POSIX is silent on the topic of crashes.

      There is nothing here to "fix". There is no "flaw".

      As for a "good reason" --- programs don't depend on this behavior: users do. Users expect a good filesystem to maintain as many POSIX invariants as possible even after a crash. There is no purely POSIX way for a program to safeguard against a crash.

    33. Re:Wrong question by QuoteMstr · · Score: 3, Insightful

      Nor is fsync() what you want - you want an atomic file replace operation.

      Yes.

      Rename is atomic, and it used to work, but with delayed allocation it may happen before the file is written. So what you want is an atomic file replace operation that does not happen before the data write.

      Precisely.

      Rename may not be the best option for that - a special file write mode may actually be better. In any case the issue affects both sides - kernel and user space.

      NO, NO, NO. write, fsync, close, rename is how you spell "atomically replace this file" in terms of system calls. It does precisely the correct thing on a running system. You yourself admit that it "used to work". It has worked for decades, in fact. (Though before journaling filesystems, all bets were off after a crash.)

      That sequence of system calls is how applications tell the kernel to replace the given file. There is no useful interpretation of those system calls that doesn't involve an atomic replacement of the whole file. We don't need a separate system call: we already have the system calls. Nobody executing those system calls wants the dangerous interpretation of rename. At no time did an application developer sit down and think to himself, "I want to tell the kernel to perform an atomic rename, except when the system crashes. In that case, I want a zero-length file." Gods, no. Obviously, the application developer wanted to atomically replace the named file. Filesystems just need to honor the obvious intent of application developers.
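
      For anyone who wants to see that sequence spelled out, here is a minimal Python sketch of the write, (optional) fsync, close, rename pattern. The function name atomic_replace and the durable flag are made up for illustration; os.replace maps to rename(2) on POSIX systems. Whether the new contents survive a crash without the fsync depends on the filesystem, which is what this thread is arguing about.

```python
import os
import tempfile


def atomic_replace(path, data, durable=False):
    """Replace the contents of `path` with `data` (bytes) via write + rename.

    `durable=True` adds an fsync before the rename; it is only needed when
    the data must be on disk by the time this function returns.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file is created next to the target so that both names are on
    # the same filesystem; rename cannot be atomic across filesystems.
    # Note: mkstemp creates the file with mode 0600.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            if durable:
                tmp.flush()             # push Python's buffer to the kernel
                os.fsync(tmp.fileno())  # ask the kernel to push it to disk
        # On a running system, readers see either the old file or the new one.
        os.replace(tmp_path, path)
    except BaseException:
        # Best-effort cleanup so a failed write does not leave a stray temp file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

      Usage would look like atomic_replace("settings.conf", b"key=value\n"), with durable=True reserved for cases where someone else is promised the data is already on disk.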

    34. Re:Wrong question by diegocgteleline.es · · Score: 0

      but the apps will fail on other filesystems

      Not just filesystems, but also operating systems. Many other unix systems (if not all) don't protect users from the fsync/rename issue. If an app wants to be portable, it must use fsync.

    35. Re:Wrong question by mysidia · · Score: 1

      rename is not guaranteed by ISO C to be atomic.

      Apps should instead use something more like: link(), fsync(), unlink()

    36. Re:Wrong question by Anonymous Coward · · Score: 0

      Some days, a score of +5 Insightful just isn't enough. Extremely well written, and perfect analysis + explanation. You sir, are a hero.

    37. Re:Wrong question by davecb · · Score: 2, Informative

      The apps don't fail on ufs.

      --dave

      --
      davecb@spamcop.net
    38. Re:Wrong question by HeronBlademaster · · Score: 1

      Why should POSIX care which order it happens in? Allowing these two separate operations to be reordered could (in theory) allow a filesystem driver to increase performance by ordering the operations such that the drive head travels the shortest distance.

      Granted, operations on a single file shouldn't be reordered, but there's little reason that operations on unrelated files should have to remain ordered.

      My Google skills are apparently lacking, as I'm unable to locate the POSIX specs for file system operations. Anyone have a link handy?

    39. Re:Wrong question by QuoteMstr · · Score: 1

      Apps should instead use something more like: link(), fsync(), unlink()

      Not one of these functions is guaranteed to exist by either C89 or C99. Don't comment on topics you don't understand.

    40. Re:Wrong question by QuoteMstr · · Score: 1

      write, fsync, close, rename

      I meant write, close, rename, of course. The fsync is unnecessary unless you need to guarantee to some third party (an SMTP client, a user, an NFS client, etc.) that the data are on the disk after the operation is complete.

    41. Re:Wrong question by QuoteMstr · · Score: 1

      Why should POSIX care which order it happens in? Allowing these two separate operations to be reordered could (in theory) allow a filesystem driver to increase performance by ordering the operations such that the drive head travels the shortest distance.

      We're talking past each other. Of course POSIX defines an ordering of operations on a running system. You're talking about writes to the backing disk being reordered, which is outside the scope of POSIX, and about crash recovery, which is also outside the scope of POSIX. My argument is that a reasonable system ought to maintain at least some of the POSIX ordering guarantees across crashes.

      My Google skills are apparently lacking, as I'm unable to locate the POSIX specs for file system operations.

      It's not free.

    42. Re:Wrong question by QuoteMstr · · Score: 1

      By your logic, web standards should be changed to match the behavior of Microsoft IE.

      Yep. HTML5 blesses and standardizes a lot of the compatibility workarounds everyone is already using.

    43. Re:Wrong question by whoever57 · · Score: 1

      This statement is incorrect. Suppose you want to atomically replace the contents of file "foo". Your application will write a file "foo.tmp", then call rename("foo.tmp", "foo").

      And what happens if the process has write permissions for the file, but not the directory?

      --
      The real "Libtards" are the Libertarians!
    44. Re:Wrong question by BikeHelmet · · Score: 1

      The greatest danger to Linux's success is not Microsoft. It's the hubris of many Linux developers, users, and advocates, who are too busy disavowing responsibility and blaming everybody else to fix real user's problems. (And yes, I'm a follower of the Raymond Chen philosophy)

      Wonderful post. And I totally agree. Whenever I criticize features of linux, I get modded flamebait or troll, indicating that nobody on /. gives a crap unless it causes them problems like real data loss.

      Even then, if the data loss is only for me, then I must be a troll. :P

    45. Re:Wrong question by QuoteMstr · · Score: 1

      Obviously, in that case, creating foo.tmp fails --- or if you create foo.tmp elsewhere (not recommended), then the final rename from foo.tmp to foo will fail.

      You don't actually need write permission on foo at all --- only on the directory containing it. (Sticky bit aside, of course.)
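
      A quick way to convince yourself of the permission point is the Python sketch below, assuming a POSIX system and a directory you can write to. The helper name replace_readonly_file is hypothetical; it makes foo read-only and then replaces it by renaming foo.tmp over it.

```python
import os
import stat
import tempfile


def replace_readonly_file(directory):
    """Rename foo.tmp over a read-only foo inside a writable directory."""
    target = os.path.join(directory, "foo")
    with open(target, "w") as f:
        f.write("old\n")
    # 0400: even the owner may no longer open foo for writing.
    os.chmod(target, stat.S_IRUSR)

    tmp = os.path.join(directory, "foo.tmp")
    with open(tmp, "w") as f:
        f.write("new\n")

    # The rename only modifies the directory, so on POSIX it needs write and
    # search permission on `directory`, not write permission on foo.
    os.replace(tmp, target)

    with open(target) as f:
        return f.read()
```

      The reverse case (writable file, read-only directory) fails earlier, because foo.tmp cannot be created there in the first place.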

    46. Re:Wrong question by Thinboy00 · · Score: 1

      then put the temp file in /tmp instead.

      --
      $ make available
    47. Re:Wrong question by Rich0 · · Score: 3, Interesting

      Define bug.

      Here is the issue - application wants to make an atomic change to a file. The application doesn't care if the file ends up in the starting state, or the final state - only that the change is atomic.

      fsync doesn't do that. Fsync guarantees that the file ends up in the final state quickly (but not atomically). Fsync also degrades system performance.

      So, the proposed application change doesn't accomplish what the app writers actually want, and it slows down the system. It does reduce the risk of data loss.

      What we really need is transaction support for files - just like we have for databases. Now, I agree that this may not be needed for all file operations (though admins should be able to turn it on by default if they want), but this is really the "right way" of handling this sort of situation.

      If anything I find myself patching apps to remove fsyncs. MythTV forces frequent fsyncs of the video stream and it can kill performance and even lead to data loss (buffer overruns - the degraded disk performance can't keep up with recorded video demand). There is no reason a recording needs to be fsynced every 30 seconds. If power goes out I'm going to lose 5 minutes of my recorded show anyway while the system comes back up - losing the previous 30 seconds of unflushed video isn't the end of the world. I'd rather have that than have dropped frames and glitches all over the place from lost video packets.

      What we need is for apps to tell the OS what they actually need, and for the OS to figure out how to deliver it. App writers shouldn't care what filesystem you're writing to and what the approved way of modifying files on that filesystem is. They certainly shouldn't care about how the write cache works. Sure, there should be an fsync option, but it should be used to sync disk writes to operations that take place in other media or over the network (such as in a transactional database). There should also be other options like atomic file operations (make the following changes to the following files atomically). Let the app figure out what its requirements are, and let the OS figure out how to deliver it.

    48. Re:Wrong question by Jesus_666 · · Score: 1

      There's an easier fix: Just query whether the filesystem you want to write to is mounted synchronously and refuse to write to it if it isn't. That way such syncing issues can be programmer-time-efficiently avoided.

      --
      USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
    49. Re:Wrong question by Rich0 · · Score: 1

      And in my opinion as a professional systems programmer, this is an oversight in the Posix standard and not a desired behavior

      Agreed - there really needs to be a POSIX way of forcing an atomic write but not forcing an fsync. They aren't the same thing, and an atomic write doesn't carry nearly the same performance hit as an fsync. The last thing I want is every app on my system making the write cache useless just to prevent wholesale data loss.

    50. Re:Wrong question by Thinboy00 · · Score: 1

      That's funny, Mozilla and Apple are on the WHATWG but not MS. OTOH, if everyone's using them, why not document them so web developers can spend less time working on making their sites work with IE*?

      --
      $ make available
    51. Re:Wrong question by Thinboy00 · · Score: 1

      fsync(). If you don't like the performance hit, don't make 120 writes/min.

      --
      $ make available
    52. Re:Wrong question by Jesus_666 · · Score: 1

      Unlike Microsoft who takes all responsibility from any malfunction in its softw--Oh that's right the EULA crowd never does.

      To quote the GNU General Public License, Version 3 (emphasis mine):

      15. Disclaimer of Warranty.
      THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

      16. Limitation of Liability.
      IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.


      BSD contains similar language. Face it: Most of F/OSS, including the Linux kernel, explicitly denies any responsibility for the software working properly - or even for the software not overwriting your root partition with random data. Virtually all end-user software is distributed on a "maybe this will possibly work" basis.

      --
      USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
    53. Re:Wrong question by darksith69 · · Score: 0

      After a crash and as far as POSIX goes, it's perfectly legitimate to overwrite the entire disk with hentai.

      Are you implying that it is perfectly legitimate to overwrite the entire disk of hentai I have amassed through years of work, or that this POSIX thingy will overwrite an empty hard drive with the hentai you speak of?

      If it's the former, I'm going to get a job as consultant and recommend everybody to put a disk filled with hentai in their servers, since only that one will be overwritten after a crash. If it's the latter, I'm going to force the crashes myself.

      Oh man, this POSIX thing is cool! Where do I get it anyway?

    54. Re:Wrong question by Anonymous Coward · · Score: 0

      it only can be corrupted if the computer crashes between when the file is renamed, and when the data is written to disk. If the computer doesn't crash then everything always works fine.

    55. Re:Wrong question by fmayhar · · Score: 1

      The apps don't need fixing, neither does ext4. It's not an either-or. There's a third element in this little problem.

      Might I suggest that it's the buffer handling in Linux that's messed up?

    56. Re:Wrong question by DarkOx · · Score: 1

      Bonus points for anyone who can give a realistic use case for DO_NOT_FLUSH_ON_CLOSE

      Sure you don't want your thread to block on fclose just because the disk was not available at that instant.

      --
      Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
    57. Re:Wrong question by spitzak · · Score: 3, Informative

      Some corrections, although the sentiment is correct:

        copy of A to A', ftruncate(A'), write(A'), rename(A' to A), host crash, causes the resulting file to contain A data and not A'

      This is not what is wrong. If the file contained the old version of A it would be fine, this is the expected behavior. The problem is that the file contains some partially-written version of A' (usually a zero-length version).

        Posix doesn't actually say that rename is a write barrier for data and metadata

      Actually POSIX does say exactly that. The hole EXT4 weasels through is that POSIX says "anything can happen when the machine crashes".

        apps didn't make use of fdatasync() / fsync() correctly

      The apps *were* using these calls correctly, by not calling them. They are very slow and make guarantees that have nothing to do with the desired action, which is an atomic rename.

    58. Re:Wrong question by whoever57 · · Score: 1

      then put the temp file in /tmp instead.

      And if /tmp is on a different filesystem, this achieves nothing as far as ensuring integrity of the write.
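
      To make that concrete: rename(2) fails with EXDEV when source and destination are on different filesystems, so there is no atomic cutover to be had from /tmp in that case. A small Python sketch, with the hypothetical helper names same_filesystem and rename_or_report (comparing st_dev is only a rough check and can be fooled by some mount setups):

```python
import errno
import os


def same_filesystem(path_a, path_b):
    """Rough check: do both paths report the same device number?"""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def rename_or_report(src, dst):
    """Attempt an atomic rename.

    Returns True on success and False if the kernel refuses because the two
    names are on different filesystems (EXDEV). Other errors are re-raised.
    """
    try:
        os.rename(src, dst)
        return True
    except OSError as exc:
        if exc.errno == errno.EXDEV:
            return False
        raise
```

      Tools that "move" across filesystems fall back to copy and delete, which is exactly the non-atomic sequence the rename trick is meant to avoid. Creating the temp file in the target's own directory sidesteps the problem.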

      --
      The real "Libtards" are the Libertarians!
    59. Re:Wrong question by Anonymous Coward · · Score: 0

      Huh? Buddy, this is Slashdot. There are lots of single losers here.

      ha ha ha ha...

      oh.

      doh!

    60. Re:Wrong question by doug363 · · Score: 1

      Hear, hear. That's exactly the problem. The invariant of rename is useful, and it is useful to be able to get the atomic cutover without ensuring that the file is on disk. The previous ext4 behaviour is utterly useless to everyone. The reason is that there is an important case that hasn't been raised much: where the process doing the renaming isn't the process doing the writing. Consider:

      $ echo "Hello, world" > file.new # or some other operation that produces file.new
      $ mv file.new file
      $ [System crashes some time after mv completes.]

      In this case, the shell probably doesn't do an fsync before it closes file.new, so the bug might appear. Basically, the implication of not solving this in the kernel is that every application that writes a file needs to be changed to fsync before closing it, if someone else might later want to rename it over another file. That is equivalent to an enforced fsync_on_close flag, but also requires that every application is changed, and it completely wipes out the lazy allocation, write-behind optimisation that this was all supposed to allow. The only realistic possibility is for mv to open file.new and fsync it before renaming, but that's racy, so the kernel is the only place where this problem can be fixed properly.

    61. Re:Wrong question by Anonymous Coward · · Score: 0

      Entirely right, I state as the user who posted the GP but accidentally clicked the damn post-anonymously box... Serves me right for writing without consulting notes!

    62. Re:Wrong question by mikechant · · Score: 1

      Come on, no Ubuntu LTS uses ext4 by default,...

      No current Ubuntu release of any sort (LTS or not) uses ext4 by default. The next release (9.10, Karmic Koala) *may* default to ext4.

    63. Re:Wrong question by swilver · · Score: 1

      Ext4 does need fixing. It's beyond stupidity to sync meta-data to disc more frequently than the real data that programs actually call upon a filesystem to store safely. This rename problem is just the tip of the ice-berg when data and meta-data is not synced up to the same point in time.

      The problem simply is that actions taken by programs are no longer seemingly executed in order. Actions have to be separated into actions that affect meta-data, and actions that affect file contents. When you do the following actions:

      1) create file
      2) write data to file
      3) rename/delete some other file
      4) write some more data

      Programs should not have to care about the fact that only actions 2 and 4 actually modify file contents. They only need to know that in case of a crash the only situations that can arise are that steps were only completed up to a certain point (so, 0 to 4 steps). Ext4 did not guarantee this and could end up with steps 1 and 3 committed but not step 2 -- the end result is that to be absolutely safe, every program in existence has to be rewritten to sync every time your application "switches" between updating meta-data and file contents.

      Unfortunately, sync does FAR more than just tell the filesystem that step 3 is dependent on step 2 -- it tells it to drop everything, flush everything to disc right now, add a mark to the journal and wait until all that has completed -- not just your data, but all data that every application on your system might be working on -- it's a sledge hammer, one that should be used with care, and RARELY in user applications.

      People don't seem to realize that modern filesystems (in ordered mode) already do hundreds of actions before actually committing them by default -- they can get away with that because they guarantee one simple thing: it all still LOOKS sequential, because they guarantee that a situation can never arise where a later action was committed but an earlier action was not.

    64. Re:Wrong question by swilver · · Score: 1

      fsync-on-rename only fixes this specific problem I fear. There's many more where that came from when you do not sync meta-data and file contents up to the same point in time. A simple example:

      1) Write into some application's logfile: "user X deleted file Y"
      2) Delete file Y

      Step 2 is just a meta-data update and could occur before step 1 is completed. If the system crashes during these operations you could end up with file Y being deleted without a log entry showing which user deleted it. This will work fine with filesystems that seemingly execute actions in order (like ext3 in default ordered mode), but would fail on filesystems that take a more dim view of what's important.
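
      For illustration, this is roughly what an application would have to do on a filesystem that does not preserve that ordering across a crash: force the log entry to disk before issuing the delete. It is a Python sketch with a hypothetical function name (delete_with_audit), and it pays exactly the fsync cost complained about above.

```python
import os


def delete_with_audit(log_path, user, victim_path):
    """Append an audit line, force it to disk, then delete the file."""
    line = f"user {user} deleted file {victim_path}\n".encode()
    fd = os.open(log_path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        os.write(fd, line)
        # Step 1 must be durable before step 2 is even issued; without this
        # barrier the metadata update could reach the disk first.
        os.fsync(fd)
    finally:
        os.close(fd)
    # Step 2: the metadata-only operation.
    os.unlink(victim_path)
```

      On a filesystem that keeps data and metadata in order, the fsync would be unnecessary for ordering and only matter if the log entry has to be durable immediately.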

    65. Re:Wrong question by Anonymous Coward · · Score: 0

      Wait-a-minute!

      Do I understand that serious major programs -- incl kde4 --actually depend on UNDOCUMENTED fs behaviour ???

    66. Re:Wrong question by QuoteMstr · · Score: 1

      Theodore Ts'o actually suggests there be a binary that just opens a file and calls fsync on it, and that all shell scripts use it.

      Talk about being out of touch with the actual requirements of users...

    67. Re:Wrong question by josephcmiller2 · · Score: 1

      Answer - change POSIX.

    68. Re:Wrong question by ChrisMaple · · Score: 1

      What the fuck CAN kde4 do?

      cp settings settings-old
      fflush
      cp settings-new settings
      fflush
      read settings back in and check for correctness; fix as required. Do NOT remove surplus files.

      --
      Contribute to civilization: ari.aynrand.org/donate
    69. Re:Wrong question by shutdown+-p+now · · Score: 1

      fsync(). If you don't like the performance hit, don't make 120 writes/min.

      I think a much more feasible option for everyone concerned is to simply not use ext4, but any other sane FS (since they all actually do it properly).

    70. Re:Wrong question by Anonymous Coward · · Score: 0

      Hey Jane? I'd say the people who remained single ARE THE WINNERS, rather than the losers! After all - 4/5 couples get divorced it seems nowadays, and for what? So some bimbo can take 1/2 of what the guy in the marriage owns (which IS what happens 9/10 times)?? No, I'd have to say it's "the other way around", & they're the winners, & TOO SMART to get snagged by some 2-bit thieving slut who'd sell her pussy for a dollar by suckering in some "I was pussy-whipped @ 22 & got hitched" fool (or a fool that was stupid enough to knock that same slut up, & then got hooked up with that same gold-digging slut with no soul (who'd sell her own mother to get a dime)). Yu've got to be a "Jane" (a woman), with a statement like yours, since your life goals are to rob some man of whatever he has with your "perfect prison" (That removes the desire of the prisoner to escape) that is between your legs, which is what most women do, and do so at the expense of the fools who fall for it. No thanks.

    71. Re:Wrong question by Anonymous Coward · · Score: 0

      Wow, sounds like you have some deep-seated issues.

  4. Probably not (yet) by jamesmorlock · · Score: 2

    I would just wait until it becomes mainstream and all the issues are worked out; until then I'll stick with ext3.

  5. I think it's "safe enough" by buttfscking · · Score: 2, Interesting

    I moved to ext4 as soon as it became available. I haven't had any problems thus far (no data loss, etc), and the increased speed is noticeable. So - in the opinion of a very casual Linux user - I would say that yes, it's "okay." I'm not sure I'd trust it with anything super serious, though. I could be the only one without any problems, after all. As always, you should tip-toe around anything bleeding-edge.

    1. Re:I think it's "safe enough" by eldavojohn · · Score: 5, Funny

      I moved to ext4 as soon as it became available. I haven't had any problems thus far (no data loss, etc), and the increased speed is noticeable. So - in the opinion of a very casual Linux user - I would say that yes, it's "okay." I'm not sure I'd trust it with anything super serious, though. I could be the only one without any problems, after all. As always, you should tip-toe around anything bleeding-edge.

      Yeah, man, it's ok go ahead and flip your entire corporation's servers to ext4 over this weekend. A Slashdot user named buttfscking just said it is "safe enough."

      --
      My work here is dung.
    2. Re:I think it's "safe enough" by Anonymous Coward · · Score: 1, Funny

      Buttfscking is my real name you insensitive clod!

      Sincerely,

      Ray J. Buttfscking

    3. Re:I think it's "safe enough" by drinkypoo · · Score: 1

      Speaking of users with funny names, I converted to ext4 (the hard way — create a bootable backup, then repartition) as soon as Jaunty went final. So far system stability seems to be about the same as ext3. I've hung it with a couple of effective fork bombs (shell scripts accidentally spawning themselves because I am too stupid to enter a complete path) and had to force-power-cycle with no data loss or indeed problems of any kind.

      I wouldn't have done this, however, if I didn't have a full system backup. So I'd say if you have to ask, the answer is no.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    4. Re:I think it's "safe enough" by BrokenHalo · · Score: 4, Informative

      I haven't had any problems thus far (no data loss, etc)

      How do you know? Do you do md5sums on every file? Most admins I've come across don't seem to, and it could be months or years before you find out, in which case any loss might easily end up outside your backup cycle.

    5. Re:I think it's "safe enough" by Hognoxious · · Score: 2, Insightful

      Well he said not to, but don't let the facts interfere with a choleric rant.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    6. Re:I think it's "safe enough" by Anonymous Coward · · Score: 0

      Well he said not to, but don't let the facts interfere with a choleric rant.

      Uh what? None of your post makes any sense, especially since Buttfscking said it was "okay" and "safe enough."

      The last part of your post is downright trollish.

      Choleric:

      1. Easily becoming angry.

      2. Showing or expressing anger.

      Rant:

      1. A criticism done by ranting (To speak or shout at length in an uncontrollable anger).

      2. A wild, incoherent, emotional articulation.

    7. Re:I think it's "safe enough" by josephcmiller2 · · Score: 1

      md5sums no longer safe - use SHA256
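
      If you want to actually check for silent corruption, a manifest of SHA-256 digests taken now and compared later is enough. The Python sketch below is only an outline; the names sha256_of, manifest, and changed_files are invented for this example, and it ignores symlinks and files that legitimately changed between runs.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            result[os.path.relpath(full, root)] = sha256_of(full)
    return result


def changed_files(old, new):
    """Paths present in both manifests whose digests differ."""
    return sorted(p for p in old if p in new and old[p] != new[p])
```

      Store the first manifest somewhere other than the disk being checked, and diff against a fresh one before backups rotate out.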

    8. Re:I think it's "safe enough" by ChrisMaple · · Score: 1

      "How do you know? Do you do md5sums on every file?"
      tripwire? At one time this was part of a standard Red Hat installation.

      --
      Contribute to civilization: ari.aynrand.org/donate
    9. Re:I think it's "safe enough" by Anonymous Coward · · Score: 0

      And this is why those of us that used to do md5 sums on our files now run ZFS. Have the file system take care of any data corruption.

  6. Yes by Anonymous Coward · · Score: 0

    If you're producing file undelete software.

  7. Speed improvements? by wild_berry · · Score: 1

    Did you find any significant performance improvements compared to ext3?

    The extents mean that a large contiguous read is faster and files are more likely to be written in contiguous chunks, giving a bit of a boost to the filesystem. That's the explanation I have for my system and its 5400-rpm laptop disks seeming quicker (note that the appearance of greater performance isn't greater performance).
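
    To make the extent idea concrete, here is a toy illustration (not ext4's on-disk format): a file's blocks, listed in file order, collapse into a few (start, length) runs when they are contiguous. The function name blocks_to_extents and the block numbers are invented for the example.

```python
def blocks_to_extents(block_numbers):
    """Collapse block numbers (in file order) into (start, length) runs."""
    extents = []
    for block in block_numbers:
        # Extend the last run if this block directly follows it on disk.
        if extents and block == extents[-1][0] + extents[-1][1]:
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            extents.append((block, 1))
    return extents


# Six blocks in two contiguous runs need only two extent records.
print(blocks_to_extents([100, 101, 102, 103, 500, 501]))
```

    The fewer records needed to describe a file, the less mapping metadata has to be read before the data itself, which is one plausible reason large sequential reads feel quicker.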

  8. maybe by wizardforce · · Score: 0

    It depends on why you are switching from an older filesystem to ext4. It's a relatively new filesystem, so you should probably expect it to be a bit more buggy when combined with software not designed for it. From my limited recent experience with a combination of KDE and ext4, I'd wait on upgrading for a while. Ext4 looks like it could be very interesting as software matures around it. As it is currently, however, KDE seemed to me at least a bit less stable on ext4 than it was on ext3. I didn't stay with the filesystem as long as I should have, though, so take this with a bit of salt...

    --
    Sigs are too short to say anything truly profound so read the above post instead.
    1. Re:maybe by arth1 · · Score: 1

      Not only compared to older file systems. XFS is such an older file system, and it still outperforms ext4 for quite a few operations. For deletes, ext4 is far faster, but otherwise, XFS tends to win.

      Then again, it all depends on what you're going to use the system for. Some file systems are very good for running databases on top of, and others are good for small and fast create/destroy operations, and yet others favour appends.

      In an informal test I just did with large Subversion repositories, ext4 didn't score too well, being rather slower than both JFS and XFS, and slightly slower than ext3. An rsync repository gave somewhat similar results, although here it outperformed ext3. On the other hand, for a build environment ext4 fared much better, and outperformed the others.

      But anyhow, this is all irrelevant if you don't consider ext4 mature enough for production use yet. I'm not sure that I do -- there may still be critical bugs that haven't been found yet, and I'd give it a little more time before I shift it from "bleeding edge" to "cutting edge".
      It'll get there, I'm sure, and possibly even without any major bugs fixed. But I don't want my production systems to be the test case for verifying that. If ext3 and xfs work well for me, I don't see a need to change just yet.

  9. Um, yes, it's called fsck. by dandaman32 · · Score: 5, Informative

    I'm using ext4 on an encrypted partition on my tiny X41 tablet. The hard disk is 5400RPM IIRC, so when Ubuntu decides to run fsck due to a scheduled run or an unclean shutdown after a certain bug manifests itself, I don't have to sit there for 10 minutes or more waiting for fsck to run. That for me and many other casual users is probably the biggest advantage of ext4.

    Does a laptop count as production? In the eyes of an everyday user, yes. My laptop is very much "production" IMHO, and I trust ext4 enough to not magically make all my school assignments disappear.

    Digressing a bit, I haven't seen any of the data loss either, though I use GNOME and not KDE. I do think that if an application relies on specific undocumented behavior, that the application should change, not the filesystem driver. It's acceptable that the kernel developers are doing their best to get temporary workarounds into place, but the permanent solution is to fix the applications so they don't depend on undocumented behavior.

    1. Re:Um, yes, it's called fsck. by Sfing_ter · · Score: 1

      yeah, i fixed that years ago by using reiserfs.

      --
      A computer once beat me at chess, but it was no match for me at kick boxing. Emo Philips
    2. Re:Um, yes, it's called fsck. by RiotingPacifist · · Score: 1

      reiserfs, I've been using it for years for fast fsck and it can handle a file rename gracefully too :O
      It's not undocumented; the problem is that KDE was using write-then-rename to make sure there was an atomic operation and guarantee the integrity of the file. Nobody expects a rename to fail (and then ext4 came along and zeros metadata at bad times to improve performance)!

      --
      IranAir Flight 655 never forget!
    3. Re:Um, yes, it's called fsck. by hackstraw · · Score: 2, Informative

      Maybe I'm clueless, and I'll be corrected shortly, but a) didn't ext3 bring this functionality in back in 2000 or so? b) don't most distributions format their partitions with the options to not do fsck's periodically based on mount count or time?

      <insert paragraph break about here>

      I know that on every system where I have to create a filesystem manually, I remove the counts to prevent that quick reboot from being a slow reboot and a trip to the data center to babysit the thing through a fsck.

    4. Re:Um, yes, it's called fsck. by Thinboy00 · · Score: 1

      fscking occasionally is probably a good idea for (file)system stability...

      --
      $ make available
    5. Re:Um, yes, it's called fsck. by Ant+P. · · Score: 1

      ext4's fsck is faster than ext3's because it keeps track of unused areas of disk explicitly - as opposed to having to check them just in case there's something there.

      It takes seconds instead of minutes, which is good enough to convince me to stop skipping it.

  10. Maybe.... by jonnycando · · Score: 1

    ...a moot point for me....I have been using xfs for several years, and so haven't tried, nor do I think I need, the latest iteration of ext. But as was opined already, it's not ext4 but the apps that need fixing. So it seems at least.

  11. ext4 is buggy by hamanu · · Score: 4, Interesting

    Well, the fsck times are really fast compared to ext3, and thank god, because EVERY time I reboot it requires an fsck, complaining about group descriptor checksums. Even if I unmount my ext4 filesystem and remount it without rebooting it gets all fscked up. I have a 3TB ext4 fs on LVM on RAID, that was NOT converted from ext3, but built on brand new drives. My similar ext3 filesystem has had no such problems.

    ext4 takes about 7 minutes to fsck, ext3 took hours. I hope they fix this soon.

    --
    every _exit() is the same, but every clone() is different.
    1. Re:ext4 is buggy by msuarezalvarez · · Score: 4, Informative

      Maybe you should do something about whatever the cause for the constant fsck'ing is. You do realize it is quite abnormal to have a system have errors at each remount, don't you?

    2. Re:ext4 is buggy by RiotingPacifist · · Score: 1

      why were you on ext3 if you needed constant fscking? there have always been better options: reiserfs, JFS, etc

      --
      IranAir Flight 655 never forget!
    3. Re:ext4 is buggy by TCM · · Score: 5, Insightful

      But he uses R-A-I-D! R-A-I-D magically makes data bulletproof and immune to disaster as we all know.

      Seriously, running a 3TB RAID with a buggy fs and applauding faster fsck times instead of wondering why the fs gets fucked up constantly must be the peak of idiocy.

      --
      Of course it runs NetBSD. BTC: 1NT7QvbetmANwaMzhpVL6
    4. Re:ext4 is buggy by StarHeart · · Score: 1

      This sounds like a problem I have had. It isn't every time I reboot, and has gotten better with newer kernel versions. Mine is a 4TB ext4 filesystem on Linux software RAID5.

      --
      Havoc Penington, the bane of my Linux desktop.
    5. Re:ext4 is buggy by Junta · · Score: 2, Interesting

      I too had a 2TB RAID volume with ext4. I suffered the same situation. I continue to complain myself even though I have reformatted as ext3 and solved my problems, so that others will hear my issue and learn.

      And before you claim my underlying IO must be flawed, a large part of my job is storage subsystem validation, and I'm quite used to isolating which layer is inducing problems, from storage controller hardware to drivers to higher OS layers. Everything I did, every test I ran, pointed to ext4 as the culprit in this case.

      --
      XML is like violence. If it doesn't solve the problem, use more.
    6. Re:ext4 is buggy by Anonymous Coward · · Score: 0

      That is exactly why I don't use ext for large filesystems.

      XFS is the most stable and feature-filled filesystem out there right now, and it has been for the last 6+ years or so. So for me it's XFS on everything except /boot partitions. No fscks with XFS either. I have tens of terabytes on XFS and have never lost one piece of data.

    7. Re:ext4 is buggy by ion.simon.c · · Score: 1

      Would you mind pointing us to your test results?

    8. Re:ext4 is buggy by hamanu · · Score: 2, Informative

      Hey genius! It's called SARCASM! There is no need to insult me for sharing my experiences with ext4. I am not some idiot who thinks RAID is a backup either.

      I am NOT happy with ext4, but in case you needed to know more about my system, my ext4 data is a MIRROR of the ext3 data, and I have an LTO-2 Ultrium drive to back up my filesystem regularly. I have off-site backups in fireproof waterproof safes.

      I also use e2image to get the metadata off the drive in case I need to reconstruct the contents.

      --
      every _exit() is the same, but every clone() is different.
    9. Re:ext4 is buggy by hamanu · · Score: 2, Informative

      I did, and the cause of the fscks was...(drumroll)..ext4.

      --
      every _exit() is the same, but every clone() is different.
    10. Re:ext4 is buggy by hamanu · · Score: 1

      constant fscking is the bug, not the intended behaviour, dude. If I had KNOWN it would fsck every reboot I would have done that. Now that people have heard my experience maybe they will choose that option.

      --
      every _exit() is the same, but every clone() is different.
    11. Re:ext4 is buggy by jabuzz · · Score: 1

      I would add that it is the only free Linux file system with directory quotas. They call them project quotas and they can have more than one directory tree in them, but they do work.

  12. No by ducomputergeek · · Score: 2, Insightful

    We avoid anything that has less than 24 months of wide deployment unless there is some absolute pressing need to move to an unstable/untested product.

    We have test and development systems where we run latest and greatest, but generally they are used in sync with the existing system. We don't switch over until we're damn sure there aren't any unforeseen consequences. That typically means 12 months without any major hiccups and 3 months without minor ones.

    --
    "The problem with socialism is eventually you run out of other people's money" - Thatcher.
    1. Re:No by RiotingPacifist · · Score: 1

      why is this troll? ext4 hasn't been out nearly long enough!

      --
      IranAir Flight 655 never forget!
    2. Re:No by icebike · · Score: 3, Funny

      We avoid anything that has less than 24 months of wide deployment unless there is some absolute pressing need to

      Good Idea. Let's all follow this sage advice.

      --
      Sig Battery depleted. Reverting to safe mode.
    3. Re:No by sznupi · · Score: 1

      If only that was the policy at Black Mesa...

      --
      One that hath name thou can not otter
  13. It's a good file system. by 3vi1 · · Score: 4, Interesting

    I was one of the people that spoke loudly when Ext4 caused 0-byte file corruption.

    While I don't entirely agree that it's just "an application issue", because apps that work fine on every other filesystem should not need to be re-written specifically for Ext4, I am pleased at the work the devs have done to work around the problems. The kernel patches have eradicated the issues I had with corruption, and the performance is still great.

    I never did official benchmarking to determine the extent, but my perception is that there's a noticeable performance increase when using Ext4 instead of Ext3.

    If I were building a production server, I may think twice and just go with Ext3... unless the app would *greatly* benefit from Ext4. However, for a desktop system, I think Ext4 is a very good choice and ready for primetime.

    1. Re:It's a good file system. by Flammon · · Score: 3, Interesting

      ... because apps that work fine on every other filesystem should not need to be re-written specifically for Ext4

      Not quite. I believe XFS and JFS behave the same way as Ext4. Here's a good article and thread on the subject. http://lwn.net/Articles/322823/

    2. Re:It's a good file system. by QuoteMstr · · Score: 3, Interesting

      Not quite. I believe XFS and JFS behave the same way as Ext4.

      When XFS was first released, there was quite a buzz surrounding it before people realized they'd lose data. XFS, not ext3, would have been the de facto Linux standard had the developers not stubbornly refused to fix its dataloss bugs. By the time they finally got around to it (for some cases), there'd already been irreparable damage to XFS's reputation.

    3. Re:It's a good file system. by Anonymous Coward · · Score: 0

      XFS was first released in IRIX, in 1994. The buzz was that a commercial filesystem was being put into the kernel. This particular behavior was well known, and the "dataloss bugs" were not bugs, they were part of the design. XFS on Linux was killed by the same application issues now affecting ext4.

      XFS has horrible delete performance, this alone would have kept most people from using it for any amount of time anyway.

    4. Re:It's a good file system. by jabuzz · · Score: 1

      Rubbish, most users do *NOT* delete large numbers of files on a regular basis. I have file systems with in excess of 20 million files on them. If I look at the number of files that are expired in the TSM logs (and some of these are renames), I might see 20,000 files in a day tops.

  14. Anonymous Coward by Anonymous Coward · · Score: 0

    After the famous filesystem corruption due to delayed allocation I lost confidence in ext4. I've been using xfs on some partitions and it works great.

    1. Re:Anonymous Coward by grumbel · · Score: 2, Informative

      If you worry about file corruption, I wouldn't touch XFS, that thing shredded files for me on every single unclean shutdown.

    2. Re:Anonymous Coward by larry+bagina · · Score: 1

      Seconded. Earlier this year, I set up an NFS RAID box for storing videos, music, and other large files, so I went with XFS. Within 10 minutes (only copying 4-5 videos over), XFS had corrupted itself to the point it couldn't be recovered. EXT3 may be a tad slower, but it can manage to read and write files.

      --
      Do you even lift?

      These aren't the 'roids you're looking for.

    3. Re:Anonymous Coward by drinkypoo · · Score: 1

      I use XFS for my 1TB MyBook, I have frequent power failures and occasionally just rip the 1394 cable out of the side of my laptop, and have never lost any data on that volume. (I've had to force-off my system with / on ext4 and never had a problem; I DID have a problem with ext3 once but it was a long time ago.)

      Yay anecdotes!

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    4. Re:Anonymous Coward by dotgain · · Score: 1

      I can't find it anywhere by Googling, but I remember some slashdotter posting that it is a feature of xfs to guarantee that every file opened for writing at the time of a crash will be deleted after the crash. (The reason given is that those files could possibly contain state that could be dangerous, which seems understandable in some circumstances.)

      I experience exactly this problem, but can't find any documentation for it. If it's the case, every XFS mount should show in the kernel log "warning: Make sure you've read the xfs feature list. It might not be doing what you want". Does anybody know more about this?

  15. Connection Interrupted errors loading slashdot? by Anonymous Coward · · Score: 0

    Is anyone else getting a lot of these "connection interrupted" errors when clicking on stories?
    It's been going on for a week now and is making slashdot almost unreadable and annoying.

    1. Re:Connection Interrupted errors loading slashdot? by icebike · · Score: 1

      No. But this seems the wrong place to hide this question.

      --
      Sig Battery depleted. Reverting to safe mode.
    2. Re:Connection Interrupted errors loading slashdot? by CarpetShark · · Score: 1

      Blame your firewall, your proxy, your router, or your ISP, in about that order.

  16. Not for me... by petrus4 · · Score: 1

    I've never used anything other than Reiser3 with Linux. Might not be the most reliable or fast, but it has other advantages.

    - Undeletion.
    - Partition resizing.
    - Readable from within Windows via YaReG.

    1. Re:Not for me... by Anonymous Coward · · Score: 0

      ext2/3 can be resized offline, ext4 may have online resize too. i can also read ext2 partitions from windows (see http://www.fs-driver.org/)

      and undeletion should never be needed :)

    2. Re:Not for me... by The+MAZZTer · · Score: 1

      As my sibling post said, http://www.fs-driver.org/ is a Windows file system driver for ext2, and thanks to forward compatibility (as I understand it), ext3 works too. http://sourceforge.net/projects/ext2fsd is another alternative.

      You should be warned that whenever I've used the first tool to write to the partition, I've ended up with Ubuntu fscking it on boot. But I've never noticed any problems like data corruption from using it. The second one also seems OK, although when browsing the disk from the Command Prompt it shows entries for . and .. in the root, which confuses dir.

    3. Re:Not for me... by dotgain · · Score: 1

      I have to chip in for ext2fsd as well. I use it because I can't store >2GiB files on FAT32 partitions, which I need to do for transferring DV files back and forth between Linux tools and Windows/AvidXPress DV. It's wonderful: it never caused me to fsck unnecessarily, and there's been no data loss (that I've noticed), and when starting out I did calculate sha1sums just to check.

  17. Theodore Ts'o: Don't fear the fsync! by sirdude · · Score: 5, Informative

    After reading the comments on my earlier post, Delayed allocation and the zero-length file problem as well as some of the comments on the Slashdot story as well as the Ubuntu bug, it's become very clear to me that there are a lot of myths and misplaced concerns about fsync() and how best to use it. I thought it would be appropriate to correct as many of these misunderstandings about fsync() in one comprehensive blog posting.

    http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/

    FYI, Ts'o is the ext4 maintainer.

    1. Re:Theodore Ts'o: Don't fear the fsync! by mybecq · · Score: 1

      Interestingly, his blog post is titled Don't fear the fsync!

      He then gives this "advice" under the heading (Perceived) performance problems with fsync()

      An fsync() call every 15, 30, or 60 minutes, done by a thread which doesn't block the application's UI

      The lesson is thus: "Don't fear it, but use it really sparingly!"
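
      For what it's worth, the quoted advice (an occasional fsync() from a thread that doesn't block the UI) could be sketched roughly like this in Python. The class name PeriodicSyncer and the interval are assumptions for illustration, not anything from the blog post, and the sketch assumes a binary-mode file object.

```python
import os
import threading


class PeriodicSyncer:
    """Flush one open file to disk from a background thread."""

    def __init__(self, file_obj, interval_seconds):
        self._file = file_obj
        self._interval = interval_seconds
        self._stop = threading.Event()
        self._lock = threading.Lock()
        self.sync_count = 0
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _sync_once(self):
        with self._lock:
            self._file.flush()             # push Python's buffer to the kernel
            os.fsync(self._file.fileno())  # ask the kernel to write it to disk
            self.sync_count += 1

    def _run(self):
        # wait() returns False on timeout and True once stop() has been called.
        while not self._stop.wait(self._interval):
            self._sync_once()

    def stop(self):
        """Stop the thread (start() must have been called) and sync once more."""
        self._stop.set()
        self._thread.join()
        self._sync_once()
```

      The main thread keeps writing to the file as usual and only pays for the fsync() in the background; the trade-off is that anything written since the last sync can still be lost in a crash.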

  18. You're Asking Slashdot? by welshbyte · · Score: 2, Insightful

    You should be asking this question in a more authoritative forum. The majority of Slashdot readers are likely to just regurgitate their perceived status of ext4 from the last time ext4 was mentioned on Slashdot and I know for certain that ext4 has had more testing and development since then. Try asking the ext4 development team; they're very nice, helpful people in my experience. I refer you to the #ext4 channel on irc.oftc.net and the linux-ext4 mailing list.

    1. Re:You're Asking Slashdot? by ionix5891 · · Score: 1

      i wonder if /. servers make use of ext4

  19. wait until at least 2.6.30+ by xenoterracide · · Score: 2, Insightful

    Last I checked, some patches for the delalloc empty-file problem were being merged in 2.6.30. If you want to avoid it but want some other advantages like faster fscks, you could go with data=journal on your filesystems, which is a bit slower but also disables delalloc, while still having extents, barriers, and other ext4 benefits. I've been using data=journal on my /home partition without a single problem.
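
    If you want to check which journalling mode your ext4 mounts are actually using, something like the following sketch could help. It assumes text in the /proc/mounts layout (device, mount point, type, options); the function name ext4_data_modes is made up, and whether the kernel lists the default data= mode at all may vary by version, hence the "not listed" fallback.

```python
def ext4_data_modes(mounts_text):
    """Return {mount_point: data mode} for ext4 entries in /proc/mounts text."""
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expect at least: device, mount point, filesystem type, options.
        if len(fields) < 4 or fields[2] != "ext4":
            continue
        mode = "not listed"
        for option in fields[3].split(","):
            if option.startswith("data="):
                mode = option[len("data="):]
        modes[fields[1]] = mode
    return modes
```

    On a Linux box this could be fed with ext4_data_modes(open("/proc/mounts").read()); a /home mounted with data=journal would then show up as "journal".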

    It also depends a lot on what you have in 'production'. A web server that's mostly doing reads should be fine. A heavy email server... well... can you afford to lose email on a crash? I think it might be all right for a server that just does MTA work, but not as the fs for the actual mailboxes (with delalloc anyway). A database server should be fine, because the database's job is to make sure data hits the disk, among other things. DNS servers are very read-heavy, so again I would think it'd be fine. So basically you need to watch anything that's write-heavy and not going to a database, and even then only with delalloc.

    Still, as I'm sure others have said, it's a good idea to wait on new tech like this. Some tools don't yet recognize that ext4 is not ext3.

    1. Re:wait until at least 2.6.30+ by slash.duncan · · Score: 1

      While I use reiserfs(3) here and therefore have no personal bone in this, I regularly run live git kernels (tho only from -rc2 or so) and follow LWN religiously, plus kernel trap, H Online, LKML itself, and others, somewhat less religiously. I'm thus reasonably confident I've kept up with this and most other major kernel issues of late.

      The ext4 "big blowup" occurred during the 2.6.29 development cycle. Some fixes were put in immediately, but some were too big or weren't going to be ready in time, so they went into 2.6.30, which is just about ready to come out, maybe another couple weeks.

      So ext4 is still getting major "non-routine-maintenance-only" changes. By my reckoning, that makes it absolutely not-production-ready. Testing, fine, but not production. Here's what I'd consider ready to even look at for production, and even then, depending on the usage, the recommendation could be wait some more: Give ext4 one full kernel release cycle with no further issues and nothing other than routine updates and changes. Then /start/ considering it. Since some major changes went in for 2.6.30, that means that 2.6.31 would be the first possible full cycle without anything but routine maintenance changes, and it will be at least 2.6.32 before I'd consider it for anything but temporary data, for /me/. That's assuming there's no more non-routine changes still coming in 2.6.31, which would of course put it off yet another full cycle, to 2.6.33.

      Again, that's the first I'd even /begin/ to consider it for production. As you've probably already noted if you've read the other comments, many sites are way more conservative than that, requiring six months (2-3 kernel release cycles) to two full years (about 10-11, possibly 12 kernel release cycles) with nothing but routine maintenance updates before they'll trust a filesystem. That's fine and I can certainly see 6-9 months, but personally, beyond a year, I'd consider the odds of screwed backups (even with tested 3X backups, onsite and 2X offsite) coincidentally hitting at the same time as a Katrina at my main ops site more worrisome than the stability of my production operational filesystem.

      So I'd say no earlier than 2.6.32... and that's provided 2.6.31 brings no non-routine ext4 changes at all.

      I won't attempt to cover the potential advantages, etc. That's covered well enough elsewhere. If you don't already know it's going to be a performance win based on your previous research and testing, you're putting the cart before the horse and you shouldn't even be /thinking/ about ext4 for your /production/ systems yet. As I said, an OK-to-break testing system, sure, but you don't go changing a production system with no clear idea of what the benefits are going to be, and without knowing from previous testing that they are concrete, and WILL be there in your deployment scenario.

      FWIW/YMMV, but that's the policy I'd be using if it were my ass on the line.

      --
      Duncan
      "Every nonfree program has a lord, a master,
      and if you use the program, he is your master."
      R Stallman
  20. Regretting using it.. by Junta · · Score: 1

    I have installed a system and have been getting resize inode invalid and group descriptors corrupted issues on clean reboots. fsck has yet to fail me, and IO stress tests have demonstrated no general io corruption other than ext4 errors.

    On the flipside, for my applications I haven't really gained much.

    --
    XML is like violence. If it doesn't solve the problem, use more.
  21. "*^%£*(!^&*T"49! by RiotingPacifist · · Score: 1

    I'm running ext4 too, but as you can see the content of my posts is fine!

    --
    IranAir Flight 655 never forget!
    1. Re:"*^%£*(!^&*T"49! by dword · · Score: 1

      Warning: ext4 may break your joke detector.

    2. Re:"*^%£*(!^&*T"49! by RiotingPacifist · · Score: 1

      Warning: ext4 may break your joke detector.

      Ok, so it wasn't a particularly funny joke!
      But I was definitely going for a joke about how the files are fine but the metadata (in this case post title) is what gets messed up.
      yeah obviously that kind of joke bombs when I do stand-up tho!

      --
      IranAir Flight 655 never forget!
    3. Re:"*^%£*(!^&*T"49! by Anonymous Coward · · Score: 0

      But I was definitely going for a joke about how the files are fine but the metadata (in this case post title) is what gets messed up.
      yeah obviously that kind of joke bombs when I do stand-up tho!

      Two words: get laid.

  22. Its good for general use by revjtanton · · Score: 1

    I've used it for the past few months on my netbook; however, I've only recently tried the Fedora 11 build on ext4 on my desktop. I was impressed w/ the boot speed on my netbook, and for the netbook that's all that really matters. I had absolutely no problems w/ the netbook using Jaunty UNR and ext4.

    I did have some problems w/Fedora on my desktop though. It booted nice, and did all the easy stuff well (web browsing, office stuff, etc.) but it got all screwy w/Wine and Eclipse. It might be the GPU I have, or the lack of driver support for Fedora w/it, but Wine ran extremely badly with Steam and Left 4 Dead as compared to a Fedora 10 build w/ ext3. Also I had the typically reported problems w/ext4 and data loss when I was doing some Android dev in Eclipse.

    Like anything new in the open source community there are bugs to be hashed out. If nobody uses it and nobody reports the bugs then it won't get better. The boot speed gives it value. Windows 7 RC is booting only slightly slower than ext4 (at least on my system) so if Linux is going to make its stand it's got to do certain things to make it distinct. I believe simple things like boot time and broad dev support are the areas where Linux can shine, and to that end it appeals to the right clientele to take it in the right direction as a community. :)

    1. Re:Its good for general use by gilboad · · Score: 1

      ... I assume that you understand that your comment (and GPU problems) with a pre-release (...) Fedora 11 has -nothing- to do with the subject at hand (ext4 stability), right?

      - Gilboa

    2. Re:Its good for general use by revjtanton · · Score: 1

      That's not fair to say. 11 is the first distro that allows use of ext4 with Fedora, and I do mention that the problems with GPU probably have little to do with what we're talking about, but that I've also had issues with data loss etc so it is relevant. I simply thought all bugs should be noted and it wasn't my intention to stray off topic.

    3. Re:Its good for general use by Anonymous Coward · · Score: 0

      Fast boot + netbook = Moblin. 5-7 seconds from grub to working desktop, disk and cpu idle. Alternately you can patch your Fedora kernel to use sreadahead and muck around with the init scripts, but Moblin is much easier to work with.

      The beta adds a new, interesting UI. It's probably better than Ubuntu's Netbook Remix, but it may or may not be a selling point to a slashdotter.

  23. No by _LORAX_ · · Score: 1

    It still needs more time. I have played with it under both Ubuntu and RHEL 5.3 and run into strange behavior that makes me uncomfortable.

    1) Bonnie++ throws errors even on server class hardware that something is wrong when creating and deleting a large number of random files. This is with no errors in the filesystem and everything operating normally. https://www.redhat.com/archives/ext4-beta-list/2009-February/msg00000.html

    2) A crash of Ubuntu ended up removing *ALL* group and other permissions on a laptop drive. Not just those altered within 2 minutes of the crash, but those of every single file in the system, leading to a system that non-root users could not log into.

    Neither of those are acceptable. For now it's still ext3 only until ext4 has had some more time to mature.

  24. Missing the point, IMO by Weaselmancer · · Score: 1

    If my understanding of this problem is correct, it occurs only if the system crashes shortly after the KDE apps have updated their configuration files. If so many people think this is a big problem, I'm more worried about the great number of constantly crashing linux machines.

    It's not that a ton of Linux boxes are crashing. It's just that it's a computer, and sometimes they crash. ANY computer. Any machine, any OS. They're made by people, and nothing we make is ever perfect.

    In the light of that though - what we're striving for is to provide the best performance and the best results under any circumstances, even the rare ones. Worrying about these <1% corner cases is what makes a superior product. A problem handled gracefully is a problem the user hopefully never sees. And that goes a long way towards creating a satisfactory experience. People say things like, "I don't know - I just plugged the scanner in and it died." They never say "I've plugged this scanner in over 1000 times and it's never died!" People remember the negatives, so it always pays to minimize those, however rare they may already be.

    --
    Weaselmancer
    rediculous.
    1. Re:Missing the point, IMO by Repossessed · · Score: 2, Informative

      They never say "I've plugged this scanner in over 1000 times and it's never died!"

      Speaking as a help desk tech, they say that a lot. In fact, "it's always worked before" is probably the single most common form of whining the callers do.

      It's particularly amusing when someone is complaining they've never had to replace a battery/toner cartridge before.

      --
      Liberte, Egalite, Fraternite (TM)
    2. Re:Missing the point, IMO by Weaselmancer · · Score: 1

      True, but what is the occasion of the call?

      "This time it didn't."

      They'd never call you and say that it's worked 1000 times and still is, and that's just great.

      It's the crashes that stand out in memory, not uptime. At least that's been my experience.

      --
      Weaselmancer
      rediculous.
  25. Not reassuring by Junta · · Score: 3, Insightful

    He presents three common cases for 'quickie' file modifications:
    -Modify-in-place. Yes, this logically cannot be expected to leave the content intact in an unexpected interruption. You ask the OS to blow away data, then send it new data, there is a logical indeterminate state in the middle where doing things in the order you specified leaves you exposed.
    -Write new file, use rename, using fsync to ensure a low exposure of data. This forces data to disk so it's coherent.
    -Write new file and then use rename without fsync:
    *This* he claims should easily be expected to corrupt the contents. I take issue with this. The fact that this occurs is because ext4 commits the rename out-of-order ahead of the data commit. I don't understand why the rename operation cannot also be delayed until after the data has been written out. I've seen several people ask 'I don't care that the change happens *now*, but I want the changes to occur in the order I specified', and thus far have seen Ts'o miss that point (intentionally or unintentionally). I have not read any explanation of why changing hardlinks should logically be an operation to jump ahead of pending data writeout. I could be missing something, but I'm not the only one with these questions.
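
    For reference, a minimal C sketch of the second and third cases. The helper name replace_file, the ".new" suffix and the do_fsync switch are illustrative assumptions, not anything taken from Ts'o's post; with do_fsync set to 0 it becomes the rename-without-fsync case being argued about here.

    ```c
    #define _POSIX_C_SOURCE 200809L  /* expose fsync() under strict compiler modes */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* Replace `path` with `text` by writing "<path>.new" and renaming it over
     * `path`. Returns 0 on success, -1 on failure (the temp file is removed). */
    int replace_file(const char *path, const char *text, int do_fsync)
    {
        char tmp[4096];
        int n = snprintf(tmp, sizeof tmp, "%s.new", path);
        if (n < 0 || (size_t)n >= sizeof tmp)
            return -1;

        int fd = open(tmp, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0)
            return -1;

        /* write() may be partial, so loop until the kernel has taken it all */
        size_t len = strlen(text), off = 0;
        while (off < len) {
            ssize_t w = write(fd, text + off, len - off);
            if (w < 0) { close(fd); unlink(tmp); return -1; }
            off += (size_t)w;
        }

        /* Case two: force the new data to disk before the rename. */
        if (do_fsync && fsync(fd) != 0) { close(fd); unlink(tmp); return -1; }
        if (close(fd) != 0) { unlink(tmp); return -1; }

        /* Atomically swap the directory entry to point at the new file. */
        if (rename(tmp, path) != 0) { unlink(tmp); return -1; }
        return 0;
    }
    ```

    Calling replace_file("/foo/bar", text, 1) gives case two; passing 0 skips the fsync and leaves the ordering of data and rename up to the filesystem.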

    fsync gives a relatively expensive guarantee above and beyond what people require to behave sanely. He says it's inexpensive 'now' relative to the past. However, 'now' in this context only applies to ext4 users and thus the operation degrades other filesystem performance and fsync remains an expensive operation relative to not doing it at all.

    In terms of the general attitude of filesystems shrugging off data consistency so long as their indexes are intact, I find myself agreeing with Torvalds' comments on the debacle:
    http://thread.gmane.org/gmane.linux.kernel/811167/focus=811700

    --
    XML is like violence. If it doesn't solve the problem, use more.
    1. Re:Not reassuring by Anonymous Coward · · Score: 0

      The ridiculous part of this is that there are 3 distinct writes in the replace file with rename case.

      1) Open the file (modifies the directory)
      2) Write the file contents (obviously writes data)
      3) Rename file (modifies the directory)

      It is reasonable to expect that when (3) is flushed to disk, (2) has also been flushed to disk. However, that is not the case: it'll flush 1&3 fairly quickly if the directory data block has room (doesn't need to allocate more blocks). It appears to delay allocates, I assume to minimize fragmentation and maximize contiguous writes.

      While that may adhere to all of the posix guarantees, it is missing a fairly important concept to data consistency. That is, you should always 'fail-safe' (or at least provide a fail-safe ordering mode). In storage, that means creates before deletes. A similar concept in programming is create the object before updating external pointers.

      What people want is guaranteed fail-safe ordering without requiring immediate flushing.

  26. Production? by msimm · · Score: 1

    A laptop? No, that doesn't count unless you run your production system on another laptop of the same build and make. At least where I work production is business critical systems on real kit, then we have our development environment, testing environment, and after that (in terms of importance) we have the business/office network and individual workstations (and your laptop would be somewhere after that).

    Nothing a developer did on a home system would be considered production ready without, you know, doing lots of actual testing.

    --
    Quack, quack.
    1. Re:Production? by colinrichardday · · Score: 1

      So someone could actually use a laptop for hosting a crucial web site, but you wouldn't "consider" it to be production ready without actual testing. Hmm . . .

  27. Data loss by WillKemp · · Score: 1

    A couple of months ago i installed Ubuntu 9.01, which used ext4 by default. Running it, i experienced data loss for the first time since i moved from ext2 to ext3 quite a few years ago now. I've just changed back to ext3 - which has been rock solid for me since it first appeared in Redhat or whatever distro it was i was using back then.

    1. Re:Data loss by slashdevnull · · Score: 2, Informative

      A couple of months ago i installed Ubuntu 9.01, which used ext4 by default. Running it, i experienced data loss for the first time since i moved from ext2 to ext3 quite a few years ago now. I've just changed back to ext3 - which has been rock solid for me since it first appeared in Redhat or whatever distro it was i was using back then.

      There's no such thing as Ubuntu 9.01. I'm assuming you mean Ubuntu 9.04 (aka. "Jaunty"). If you installed that a few months ago, you installed it while it was still in pre-release status. It also uses ext3 by default, not ext4. See http://www.ubuntu.com/testing/jaunty/beta#Ext4%20filesystem%20support where the Ubuntu team says "Ubuntu 9.04 Beta supports the option of installing the new ext4 file system. ext3 will remain the default filesystem for Jaunty, and we will consider ext4 as the default for the next release based on user feedback. There has been extensive discussion about the reliability of applications running on ext4 in the face of sudden system outages. Applications that use the conventional approach of writing data to a temporary file and renaming it to its final location will have their reliability expectations met in Ubuntu 9.04 beta; further discussion is ongoing in the kernel community."

    2. Re:Data loss by WillKemp · · Score: 1

      Yeah, true. It was 9.04 beta. I was mixing it up with Debian's 5.01. And, yeah, maybe it's not default and i chose to install it - i don't remember, but i generally opt to test new software, and regularly file bug reports.

      But none of that's really the point. The point is that i lost data for the first time in many years - while using ext4.

    3. Re:Data loss by slashdevnull · · Score: 1

      My exposure to ext4 is so far limited to running it on a VM running Ubuntu 9.04 since a few months prior to its April, '09 release. I've noticed improved disk I/O performance, and have experienced no data loss. I'm considering running it on some non-critical production systems, but that's beside my point, which I didn't make clear, so I'll try to do so now:

      Before doing something as extreme as running a newly released filesystem on a not-yet-released distro, I read the distro's release notes to know what the issues and risks were. After doing this with Ubuntu 9.04 beta, I was expecting the possibility of data loss with ext4, so I took appropriate steps (set up regular backups, etc.). As a distro test pilot, I packed a parachute (backups), and expected to use it.

      Making a statement today (post- Ted's fix, which has been applied to the Ubuntu 9.04 Linux kernel) such as "I lost data for the first time while running ext4 on Ubuntu (pre-fix)" is unfair and misleading. The question being posed is "is ext4 stable for production systems (implied: "now", and assuming: "in a released Linux distro")?", not "was it stable for production systems a few months ago, in a beta distro release?".

      If people are only concerned with a single technical issue, and that issue has been resolved with a fix from the project maintainer, then we need to be asking questions more like, "Was this the right fix?", then: "Has ext4 been properly vetted, and is therefore trustworthy enough for production systems?", and "If not yet, then when?".

  28. Yes...I have experienced problems with ext4 by Kamphor · · Score: 1

    I nearly lost my whole filesystem. It's a good thing I had a backup core system on reiserfs to boot from and run fsck. From what I understand, it's a problem with the ext4 journaling system and metadata. This link has info on the journal problem, which may have already been patched in the current kernels: http://lwn.net/Articles/284037/ The wiki page for ext4 has a fix for the problem at the bottom: http://wiki.archlinux.org/index.php/Ext4 Essentially, mounting an ext4 filesystem with the option "data=ordered" helped my system out. Since I enabled this mount option, my filesystem has been stable even after hard reboots or power failures. Hope this helps out people as it did me! -Kamphor

  29. No by 427_ci_505 · · Score: 1

    As someone who recently had the latest ubuntu trash every inode on my ext3 partitions, I'd have to say no. Not because my case is related to ext4 in any way, but because if a kernel (2.6.28) can get ext3 wrong, I shudder to think what happens with ext4.

  30. That you tube video kicks ass! by Anonymous Coward · · Score: 0

    Funniest thing I've seen in weeks!

  31. Caution by ProteusQ · · Score: 1

    My two cents worth: if in doubt, don't. Wait a year for others to find the bugs.

  32. EXT4 is not broken? by DJRumpy · · Score: 2, Insightful

    Why does everyone keep speaking about EXT4 as if it's broken? It's working exactly as designed. It's the applications that need fixing, no?

    1. Re:EXT4 is not broken? by Jurily · · Score: 4, Insightful

      It's working exactly as designed. It's the applications that need fixing, no?

      Does it matter whose fault it is when users are losing config files? It worked fine before, and now one of my basic expectations concerning Linux is broken: that no matter what happens short of hardware failure, I will not lose the files I already have. We're disappointed, and pointing fingers does not help.

    2. Re:EXT4 is not broken? by DJRumpy · · Score: 1

      If your Linux box is crashing that often and you have no backups, the only person you have to blame is yourself. If something is that mission critical you should be using a more stable branch for one and backups should alleviate the potential for data loss if it occurs (including an FS that is either tested with known good apps that aren't exposed to this, or by using a different OS that doesn't see this issue). Crashes should be very few and far between in any case.

      If the specification allows this kind of gray interpretation it should be clarified to resolve it forcibly either in favor of the FS or in favor of the app designers, but either way it is written to spec while the apps are not.

    3. Re:EXT4 is not broken? by dotgain · · Score: 1

      You're an ext4 developer aren't you? You're why I bought a Mac last week after a decade of trying in vain to use a Linux box as my #1 machine.

    4. Re:EXT4 is not broken? by diegocgteleline.es · · Score: 1

      Because Ext4 is a mainstream filesystem. People have been getting zeroed files for _years_ with XFS, but only enthusiasts used it. Ext4 however is "mainstream" and needs to face problems that many other operating systems/filesystems don't need to fix because they aren't used very widely and their techy users will never step up and admit that their systems suck.

    5. Re:EXT4 is not broken? by davecb · · Score: 1

      Actually the summary is wrong: ext4 breaks a guarantee that was part of Unix since v6, well before Posix was written.

      It's probably fixed now.

      --dave

      [To brutally oversimplify, Posix allows a weakening of a guarantee about the creation and filling of files which required a critical order of data and metadata operations. The weakening is one which ext4 used: one can defer writes of data and metadata to improve performance, which it did, but at the cost of writing metadata before the data without a mechanism to recover the data. The filesystem is consistent, it just doesn't contain the data you expected (;-))

      This was logical, from the Posix spec, but quite startling to the users of the filesystem, who had been using ext3 or other filesystems in the past and not suffering similar problems.

      This was also true of very very old programs from Unix, Minix, BSD or other Linuxes, so blaming the developers may have been an overstatement.]

      --
      davecb@spamcop.net
    6. Re:EXT4 is not broken? by Jurily · · Score: 1, Interesting

      If your Linux box is crashing that often and you have no backups, the only person you have to blame is yourself. If something is that mission critical you should be using a more stable branch for one and backups should alleviate the potential for data loss if it occurs (including an FS that is either tested with known good apps that aren't exposed to this, or by using a different OS that doesn't see this issue). Crashes should be very few and far between in any case.

      And there we have the problem with the Linux community, boys and girls. Ext4 is not behaving like the rest of the filesystems? It's your fault, dear user.

      The files in question are not mission-critical, like Firefox and KDE config files. But they are annoying when they go poof. The crashes I experience come from me applying the power button because the reboot process is waaay too slow for my liking. And I haven't had a single issue with that since Red Hat 7.3. And now you tell me it's my fault I've come to rely on a feature that was there for 10 fucking years? In fact, the very feature that converted me to Linux?

      Do you think I give a fuck what's in the specs? The illusion of safety is now gone, and there is nothing you can say to make up for it. Telling me it's my fault does not help, either.

      In terms of "data loss upon the unexpected", ext4 ranks right there with Windows 95. Now you can turn off your computer.

    7. Re:EXT4 is not broken? by whoever57 · · Score: 2, Insightful

      Why does everyone keep speaking about EXT4 as if it's broken? It's working exactly as designed.

      But is the design any good? If the advantage of EXT4 is better performance, how much of that performance improvement will be lost once the applications are fixed?

      --
      The real "Libtards" are the Libertarians!
    8. Re:EXT4 is not broken? by Ed+Avis · · Score: 4, Interesting

      The point is that you have expressed all sorts of fear about ext4 - oh no, I'm not letting it near my production boxes - but you have not applied the same standard to the applications that trashed their config files when run on ext4. Even though, strictly speaking, it is the applications that are buggy. You should be equally enthusiastic about getting rid of KDE and any other software that trashes configuration files; otherwise it looks like you are playing favourites and blaming ext4 in order to overlook the bugs in the apps you're attached to.

      --
      -- Ed Avis ed@membled.com
    9. Re:EXT4 is not broken? by iYk6 · · Score: 2, Insightful

      Does it matter whose fault it is when users are losing config files?

      Finding out where the problem lies is a pre-requisite for fixing it.

      It worked fine before, and now one of my basic expectations concerning Linux is broken: that no matter what happens short of hardware failure, I will not lose the files I already have.

      The out-of-spec-apps-saving-files-on-ext4-loses-files bug is only a problem with hardware failure.

      We're disappointed, and pointing fingers does not help.

      Well, sure, it doesn't help now. ext4 was quickly amended to behave more like ext3, and there is no reason to bitch about the past.

    10. Re:EXT4 is not broken? by DJRumpy · · Score: 1

      Yes, it is your fault for not making a backup. NO PC is guaranteed to not require a backup no matter what OS or Filesystem you run. To think otherwise is foolish, so yes, you are to blame for that. You are also to blame if you are powering your PC off via the Power Button instead of waiting for it to do a proper shutdown. Who sits there and watches their PC power down? Why are you so concerned about how long it takes? You're just begging for disaster with such computing habits. Start the shutdown and walk away.

      Are you to blame for the data loss root cause? No, and I never said you were (unless you're an app developer, that is). The issue has been well documented for months. If the apps you're using that do critical writes are still not patched, then the app designers are to blame for your data loss. Have you contacted them to complain?

    11. Re:EXT4 is not broken? by Thinboy00 · · Score: 2, Informative

      If the specification allows this kind of gray interpretation it should be clarified to resolve it forcibly either in favor of the FS or in favor of the app designers, but either way it is written to spec while the apps are not.

      The specs are not remotely ambiguous: They are in favor of the FS. The problem is that app developers got lazy and wrote
      int bar = open("/foo/bar", O_CREAT | O_WRONLY | O_TRUNC, 0644);
      // some write operations
      close(bar);

      When the specs say they should write this (otherwise if the write operations don't make it to the disk for any reason the config file is truncated):
      int bar = open("/foo/bar.new", O_CREAT | O_WRONLY | O_TRUNC, 0644);
      // some write operations
      close(bar);
      rename("/foo/bar.new", "/foo/bar");

      Since the rename operation is atomic the config files are always in a consistent state and changes are atomic; if you need durability (per ACID) you add an O_SYNC to the flags (or follow every write with fsync(bar);) and check for the existence of a /foo/bar.new on startup. Isolation is achieved by locks, separate files, etc.

      Also interesting: unlike fsync(), rename() isn't a very intensive operation; the above code basically says to the system "make sure it's in a consistent state next time I look at it, but don't panic if it doesn't make it to disk at all, just make sure the old version is still there."
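
      The "check for the existence of a /foo/bar.new on startup" step could look roughly like the sketch below. The function name discard_stale_temp and the policy of simply deleting the leftover are assumptions for illustration: if the ".new" file still exists, the rename never happened, so the old file is treated as the authoritative copy.

      ```c
      #include <errno.h>
      #include <stdio.h>
      #include <unistd.h>

      /* Remove a leftover "<path>.new" from an interrupted update.
       * Returns 1 if a stale temp file was removed, 0 if there was none,
       * -1 on any other error. */
      int discard_stale_temp(const char *path)
      {
          char tmp[4096];
          int n = snprintf(tmp, sizeof tmp, "%s.new", path);
          if (n < 0 || (size_t)n >= sizeof tmp)
              return -1;

          if (unlink(tmp) == 0)
              return 1;                    /* leftover found and discarded */
          return errno == ENOENT ? 0 : -1; /* nothing to clean up, or a real error */
      }
      ```

      An application that wants the newer data instead would have to validate the temp file before finishing the rename itself; that choice is not covered here.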

      --
      $ make available
    12. Re:EXT4 is not broken? by Thinboy00 · · Score: 2, Insightful

      If you want to ensure your data makes it to disk, use fsync() like the specs say. If you won't use fsync(), don't complain when the FS loses your data; the specs say it MAY randomly lose for any reason, unless you fsync(). If you just want Consistency and not necessarily Durability, just make a foo.new file and rename over foo.

      --
      $ make available
    13. Re:EXT4 is not broken? by Thinboy00 · · Score: 1

      Ext4 has been around for a decade already?

      --
      $ make available
    14. Re:EXT4 is not broken? by DJRumpy · · Score: 1

      Agreed. Note I said interpretation. Apparently it's still not clear enough to some. The FS is within the spec. The apps are not.

      I just wonder why folks aren't putting pressure on the app developers to fix those apps that are still not patched for this. How long has this been a known issue?

    15. Re:EXT4 is not broken? by The+Archon+V2.0 · · Score: 1

      Why does everyone keep speaking about EXT4 as if it's broken? It's working exactly as designed. It's the applications that need fixing, no?

      (broken) = (not trusted), but it doesn't follow that (not broken) = (trusted) or even that (not trusted) = (broken). I wouldn't be the first to cross a newly-made rope bridge over a deep gorge, no matter how unfrayed the ropes looked. ext4 could follow design spec to the letter and cure cancer to boot, but that doesn't mean it's been hammered on in sundry setups 24/7 for two years straight.

    16. Re:EXT4 is not broken? by ivucica · · Score: 3, Insightful

      Didn't your mom teach you not to forcefully shut down any operating system with any file system? Just because it has measures to reduce the damage doesn't mean you can abuse it. So in this case, it is your fault.

      And here I was going around all this time, feeling sorry for ext4 users who actually experienced system crashes due to bad graphics chip drivers or some other similar and silly problems. But no, it turns out that people who complain most are those who rely on operating system being able to resuscitate itself.

      There's a reason why the filesystem syncs itself at the end of shutdown process, and why it is expected that you follow the process to the end. There's a reason why shutdown process exists in the first place. Throwing poor insults like "ext4 ranks with Windows 95" (perhaps you mean Win95's implementation of FAT?) doesn't help. Sure, it shouldn't lose stuff when the unexpected happens ... but you shouldn't rely and expect it will. Unexpected is just that -- unexpected -- and you'd better be prepared for it the next time your desktop falls over while it's turned off and your drive dies a horrible death. Because God, Buddha, Allah, Shiva or someone else will make sure that happens to you, if you've raised yourself to expect that FS will survive being constantly forcefully turned off.

      kthxbye.

    17. Re:EXT4 is not broken? by dotgain · · Score: 1
      No, but whiney, argumentative developers who just want to make an OS for the developers but still expect some kind of widespread adoption nonetheless have.

      Ext4 is not even a chapter in the book of why Linux / FOSS is relatively where it was 10 years ago. The fact that the filesystem's in the mainstream kernel in conjunction with the comments in this thread do not paint a pretty picture.

      An outsider who didn't bother acquainting himself with all the facts (possibly by way of *shock*dodgy journalism*shock* or similar) might get the impression that the Linux camp has been having problems with data corruption recently, or some yadayada, and be really glad he gotthefacts(TM) and stuck with his Windows box. Most of the exposure seems to be over people arguing who's right; there is no outward appearance of anybody giving a fuck about non-1337 types that are losing files here.

      Ext4 has been around for a decade already?

      No, ext4 is new. The situation is not.

    18. Re:EXT4 is not broken? by Anonymous Coward · · Score: 0

      You're missing the point. Yes, maybe the apps are broken. But...

      ext3+app = working

      and

      ext4+app = broken

      So why upgrade to a system configuration that, as a whole, is broken? Just wait until ext4/app/etc figure it out.

    19. Re:EXT4 is not broken? by Smallpond · · Score: 1

      It doesn't help to back up a disk if the filesystem did not write the data to it. The big change in ext4 is that it doesn't allocate the disk space for a file until it has to, and the amount of memory in modern PCs means that for small files it never has to.

    20. Re:EXT4 is not broken? by spitzak · · Score: 3, Insightful

      EXT4 is broken.

      Posix requires that writing a file and then renaming it to a new location is an ordered atomic operation. Say file B already exists. You write file A, then close it, then rename (mv) it to B. Another program running at the same time opens B and reads it. It will get one of these two results, and NO OTHER RESULT:

      1. It sees the old contents of B
      2. It sees what was written to A.

      EXT4 (before these patches) could result in the following result if your machine crashes and you start it again and look at B:

      3. B is empty (B can also be various partially-written versions of A, but empty is most common).

      Now it is true that Posix says that if the machine crashes, all bets are off. So yes EXT4 is being technically correct. But it would be equally technically correct if all the files on the disk were empty so this is pointless.

      EXT4 promises to make crashes recoverable. This implies to me that after you recover from a crash, you will be left in a state allowed by POSIX. This means either you get the old contents of B or the new full contents of A, and EXT4 by allowing a different result is breaking its design and promise.
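
      The running-system half of this (results 1 and 2) can be seen with a small C sketch. The helper names write_text, slurp and rename_views are made up for illustration, and the sketch says nothing about what is on disk after a crash, which is the part in dispute.

      ```c
      #include <fcntl.h>
      #include <stdio.h>
      #include <string.h>
      #include <unistd.h>

      /* Create or truncate `path` and write `text` to it. 0 on success. */
      int write_text(const char *path, const char *text)
      {
          int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
          if (fd < 0)
              return -1;
          size_t len = strlen(text);
          ssize_t w = write(fd, text, len);
          if (close(fd) != 0 || w < 0 || (size_t)w != len)
              return -1;
          return 0;
      }

      /* Read up to cap-1 bytes from fd into buf and NUL-terminate it. */
      ssize_t slurp(int fd, char *buf, size_t cap)
      {
          ssize_t n = read(fd, buf, cap - 1);
          if (n < 0)
              return -1;
          buf[n] = '\0';
          return n;
      }

      /* Open B, rename A over B, then read through the old descriptor and
       * through a fresh open of B. The old descriptor still sees the old
       * contents; the fresh one sees what was written to A. */
      int rename_views(const char *a, const char *b,
                       char *before, char *after, size_t cap)
      {
          int old_fd = open(b, O_RDONLY);
          if (old_fd < 0)
              return -1;
          if (rename(a, b) != 0) { close(old_fd); return -1; }
          int new_fd = open(b, O_RDONLY);
          if (new_fd < 0) { close(old_fd); return -1; }

          ssize_t r1 = slurp(old_fd, before, cap);
          ssize_t r2 = slurp(new_fd, after, cap);
          close(old_fd);
          close(new_fd);
          return (r1 < 0 || r2 < 0) ? -1 : 0;
      }
      ```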

    21. Re:EXT4 is not broken? by spitzak · · Score: 2, Informative

      The rename is precisely what is broken in EXT4!

    22. Re:EXT4 is not broken? by BrokenHalo · · Score: 1

      I just wonder why folks aren't putting pressure on the app developers to fix those apps that are still not patched for this.

      I can't say I've done much reading up on this particular issue, but common-sense would dictate to me that it's the kernel's job to stand between the application and the hardware. So if selected applications get you into trouble because they misbehave and (attempt to) bypass the kernel in writing directly to the filesystem, then one could be forgiven for saying there must be something seriously dodgy about those apps. In which case it is futile to blame the filesystem, Linus or anyone else.

    23. Re:EXT4 is not broken? by mrmeval · · Score: 1

      So they broke basic functionality again with a wonderful file system that depends on user programs to jump through its hoops. Wonderful. They've joined KDE, Gnome, Xorg and so damn many others that, to Get Shit Done, I can't bring FOSS into my company even as a server OS, and maintaining it as a desktop would be hell.

      --
      I'd go on a Vegan diet but the delivery time from Vega is too long. --brownkitty
    24. Re:EXT4 is not broken? by Zenin · · Score: 1

      Without sync you've still got nothing. rename() affects the directory (the hard links) not the file itself. rename() absolutely does not ensure your data is on disk.

      Your example ensures your code has performed all its write calls before moving the file into its final location (a check against the code crashing more than the FS), which is good. But again, without sync it's only about your own code crashing...it has nothing to do with the FS.

      If you care, sync. Yes, it has a performance penalty; Reliability is costly.
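
      Since rename() only touches the directory, a sketch of "if you care, sync" on Linux usually also fsync()s the containing directory after the rename, as the fsync(2) man page suggests for directory entries. The helper names fsync_dir and rename_durably are hypothetical, and the caller is assumed to have already fsync()ed the file's data.

      ```c
      #define _POSIX_C_SOURCE 200809L  /* expose fsync() under strict compiler modes */
      #include <fcntl.h>
      #include <stdio.h>
      #include <unistd.h>

      /* Flush a directory's entries to disk. Returns 0 on success, -1 on failure. */
      int fsync_dir(const char *dir)
      {
          int fd = open(dir, O_RDONLY);
          if (fd < 0)
              return -1;
          int rc = fsync(fd);
          close(fd);
          return rc;
      }

      /* Rename tmp over path, then flush the directory that holds the entry.
       * `dir` must be the directory containing `path`. */
      int rename_durably(const char *tmp, const char *path, const char *dir)
      {
          if (rename(tmp, path) != 0)
              return -1;
          return fsync_dir(dir);
      }
      ```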

      --
      My /. uid is better then your /. uid
    25. Re:EXT4 is not broken? by lamontg · · Score: 1

      Why does everyone keep speaking about EXT4 as if it's broken? It's working exactly as designed. It's the applications that need fixing, no?

      Everyone expects filesystems to behave transactionally these days, so that if you follow the create-write-rename pattern you either get the old contents or the new contents of the file. I just wrote this diatribe on the ubuntu bug report:

      https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/317781?comments=all

      ted ts'o:

      "You can opine all you want, but the problem is that POSIX does not specify anything ..."

      I'll opine that POSIX needs to be updated.

      The use of the create-new-file-write-rename design pattern is pervasive and expected that after a crash either the new contents or the old contents of the file will be found there, but zero length is unacceptable. This is the behavior that we saw with ext2 where the metadata and data writes could get re-ordered and result in zero-length files. With the 800 servers that I was maintaining then, it meant that the perl scripts for our account management software would zero-length out /etc/passwd, along with other corruption often enough that we were rebuilding servers every week or two. As the site grew and roles and responsibilities grew that meant that with 30,000 linux boxes, even with 1,000-day uptimes there were 30 server crashes per day ( even without crappy graphics drivers, a linux server busy doing apache and a bunch of mixed network/cpu/disk-io seems to have about this average uptime -- i'm not unhappy with this, but at large numbers of servers, then server crashes catch up with you ). And while I've never seen this result in data loss, it does result in churn in rebuilding and reimaging servers. It could also cause issues where a server is placed back into rotation looking like it is working (nothing so obvious as /etc/passwd corrupted), but is still failing on something critical after a reboot. You can jump through intellectual hoops about how servers shouldn't be put back into rotation without validation, but even at the small site that I'm at now with 2,000 servers and about 300 different kinds of servers, we don't have good validation, don't have the resources to build it, and rely on servers being able to be put back into rotation after they reboot without worrying about subtle corruption issues.

      There is now an expectation that filesystems have transactional behavior. Deal with it. If it isn't explicitly part of POSIX then POSIX needs to be updated in order to reflect the actual realities of how people are using Unix-like systems these days -- POSIX was not handed down from God to Linus on the Mount. It can and should be amended. And this should not damage the performance benefits of doing delayed writes. Just because you have to be consistent doesn't mean that you have to start doing fsync()s for me all the time. If I don't explicitly call fsync()/fdatasync() you can hold the writes in memory for 30 minutes and abusively punish me for not doing that explicitly myself. But just delay *both* the data and metadata writes so that I either get the full "transaction" or I don't. And stop whining about how people don't know how to use your precious filesystem.

    26. Re:EXT4 is not broken? by Anonymous Coward · · Score: 0

      And there we have the problem with the carpenter community, boys and girls. My new hammer is not behaving like my old hammers? It's your fault, dear hammer-er.

      The thumb bruises I experience come from me swinging the hammer too fast because careful swinging is waaay too slow for my liking. And I haven't had a single issue with my old, smaller hammers. And now you tell me it's my fault I've come to rely on hammers for 10 fucking years?

      Do you think I give a fuck about safety? The illusion of safety is now gone, and there is nothing you can say to make up for it. Telling me it's my fault does not help, either.

      In terms of "avoiding injury", hammers rank right there with lawn darts.

    27. Re:EXT4 is not broken? by Jake+Griffin · · Score: 1

      So if I write a library that is supposed to add two numbers, but as a side effect, added 1 to the result, and someone, rather than following documentation and realizing there is a bug in the lib, relied on the bug and just subtracted 1 in their application, you would blame ME when I fix the bug in the lib and have it correctly add two numbers?

      --
      SIG FAULT: Post index out of bounds.
    28. Re:EXT4 is not broken? by tepples · · Score: 1

      Didn't your mom teach you not to forcefully shut down any operating system with any file system?

      Yes, but dad is the one who pulls the plug when the child either misbehaves or has used the computer for more than 60 minutes in a week.

    29. Re:EXT4 is not broken? by Ed+Avis · · Score: 1

      Didn't your mom teach you not to forcefully shut down any operating system with any file system?

      It's 2009. If any system can't handle a power failure without soiling its pants then the system is broken. It's a quite reasonable expectation that config files should not get trashed, and if they are, then some code somewhere is buggy.

      --
      -- Ed Avis ed@membled.com
    30. Re:EXT4 is not broken? by mR.bRiGhTsId3 · · Score: 1

      And here I would put it the other way as
      ~trusted -> broken
      After all, what good is a file system everyone is afraid to use.

    31. Re:EXT4 is not broken? by ivucica · · Score: 1

      I'm having no problem with people complaining while expecting system to survive power failure without adverse effects. I expect that as well.

      I'm having a big problem with people complaining while expecting system to survive day-to-day torture through forced shutdown. No system should be expected to survive misuse.

      If you burst a hole in shuttle's cockpit, don't be surprised that there's air leaking. But still safeguard against accidental bursting (i.e. don't use cardboard for making shuttle's hull). Ext4 fixes are expected to protect people suffering accidental power loss. But I wouldn't be surprised at any data loss coming out of habit to forcefully turn off the machine.

    32. Re:EXT4 is not broken? by AvitarX · · Score: 1

      On EXT3 an application can:

      Super reliably (very slowly) write what you want
      Atomically write changes, with little risk of file loss, but changes can be lost
      Sloppily write in a way that can allow data loss

      In EXT4 an application can:
      Super reliably (very slowly) write what you want
      Sloppily write in a way that can allow data loss

      There is no in between, unless of course you use the new non-default ordered option.

      --
      Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
    33. Re:EXT4 is not broken? by AvitarX · · Score: 1

      Too bad what you list as proper does not work by default in EXT4.

      This is still within spec, the spec does not say order of operations must be maintained, so an FS driver can violate that.

      And all this discussion on whether the problem is in the FS or not is silly to me, isn't it the driver that decides to re-order things?

      --
      Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
    34. Re:EXT4 is not broken? by AvitarX · · Score: 1

      The thing is that with previous FS drivers (and optionally with EXT4, so this is really just kind of academic) one could make sure that data that existed at boot time would not be lost, with only stuff in RAM, but not synced being lost on a crash.

      EXT4 by default re-orders things so that your file that has been on disk for 100's of years can theoretically be killed by a power outage, as the rename() can occur before the writing of the data.

      The solution is to queue the order of things, so that they go to RAM for a leisurely write, but they write in the order they happen. This way data on disk at boot time is not clobbered.

      The spec for ODF spreadsheet allows for formulas to be written in pig-latin, it does not mean it is the best way to do things.

      It is a good thing for a programmer to be able to say, I don't really care if these changes get written in the case of a power failure in the next ten minutes, but please don't erase what used to be here last year if it happens.

      --
      Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
    35. Re:EXT4 is not broken? by AvitarX · · Score: 1

      EXT4 is already fixed with the ability to require ordered writing.

      --
      Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
  33. Not touching it for at least 12 months by wdef · · Score: 1

    Filesystems are mission critical for everything. Stability is the thing here. Personally, I see no reason to risk this until they iron out all the wrinkles.

  34. My opinion by Anonymous Coward · · Score: 0

    Here's my opinion on ext4:

  35. We had this problem by xiox · · Score: 4, Interesting

    Our 8TB raid system would get trashed after copying data onto it (group descriptor checksums on fsck). It looks like it was an ext4 bug. They fixed it about a week or two ago, here. Maybe it will get in your kernel soon. I'm not going to start ext4 on any production system for at least 6 months I think now.

    1. Re:We had this problem by hamanu · · Score: 1

      oh cool, thanks

      --
      every _exit() is the same, but every clone() is different.
    2. Re:We had this problem by jabuzz · · Score: 1

      Six months...

      Clearly you do not value your data. Perhaps six years might be a better idea. In the meantime there are at least two well supported extent-based file systems for Linux that have a proven history of over six years and that support file systems larger than 8TB. Namely XFS and JFS, pick one and go with it.

  36. Too cheap of an excuse.. by Junta · · Score: 2, Insightful

    His point was that POSIX doesn't speak to crash behavior. As such, if a system detects a crash and zeroes the MBR and nearby blocks, it would still be POSIX compliant, but no one would plausibly be mollified by that.

    The application isn't making a complex assertion based on undocumented behavior not contained in a spec, it's making a very simple assumption that if it writes data to a file, and then calls rename when those calls complete, that those two operations will proceed in order. It proceeds in order on the running system, and the desire expressed is that same ordering guarantee occurs to persistent storage (it is acceptable to be stale/lagged, so long as the second operation didn't jump in front of the other).

    --
    XML is like violence. If it doesn't solve the problem, use more.
  37. other operating systems? by Anonymous Coward · · Score: 0

    Face it: your side lost. "fsync everywhere" is an infeasible, untenable, and useless position to take.

    For all the people that complain about 'fsync everywhere', what do they do on other operating systems? There were issues on ext4, but what about Solaris UFS, BSD FFS, NetApp homedirs? What about applications on OS X HFS compiled via pkgsrc or fink?

    The debate seems to be between some app developers and the Linux FS devs; how about bringing in some "neutral" parties and see how the applications work on those systems?

    If the app is still "broken" on a non-Linux OS, it's probably the app; if it doesn't have data-loss issues, it's probably the behaviour of the Linux FS.

  38. Same here... by Junta · · Score: 1

    I have the *exact* same problem as the parent poster had (fsck being required with group descriptor corruption and/or resize inode issues). On clean reboots.

    This system I have done lots of stress testing outside and inside of the running ext4 without rebooting, and have had no problems with ext3 or any data miscompare at all to suggest fundamental I/O misbehavior. I have only had problems with ext4. I have not seen data loss after fsck -y, but I have had to fsck -y a lot *despite* never having resized, converted, or suffering a crash on the afflicted system.

    That being said, I have ext4 in three places. In two places (my smaller systems), I haven't observed this. I have only observed it on my 2 terabyte volume.

    --
    XML is like violence. If it doesn't solve the problem, use more.
  39. fsync performance by Anonymous Coward · · Score: 0

    And had it been enforced, as soon as all developers went thru and added the fsync calls everywhere it would have become necessary for file system maintainers to no-op fsync calls in order to regain any approximation of prior performance.

    Is this true on every POSIX FS (UFS, FFS, NetApp NFS, XFS, etc.) or just ext3/4 on Linux? Just because something sucks on Linux doesn't mean it sucks on other Unix-y operating systems.

    (It's like NFS getting a bad reputation--especially locking--because Linux's implementation sucked for so many years.)

    1. Re:fsync performance by QuoteMstr · · Score: 1

      (It's like NFS getting a bad reputation--especially locking--because Linux's implementation sucked for so many years.)

      Huh?

      fcntl/lockf works fine over NFS so long as both sides are either using NFS4 or both sides are running NLM (the NFS locking daemon -- i.e. rpc.statd)

      Dot-locking also works over NFS, though open(..., O_EXCL|O_CREAT) does not work, nor does flock. Have these worked on other systems, or started working under Linux?
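
      For reference, a rough sketch of the dot-locking recipe that avoids relying on O_EXCL, along the lines of what the Linux open(2) man page describes: create a unique file, link() it to the lock name, and check the link count if link() reports failure. The function names and the hostname-plus-PID naming scheme are my own choices for illustration, not taken from any particular mailer.

```python
import os
import socket


def try_dotlock(lock_path):
    """Try to take lock_path once; return True if we now hold the lock."""
    # Unique name on the same filesystem (hypothetical naming scheme).
    unique = "%s.%s.%d" % (lock_path, socket.gethostname(), os.getpid())
    fd = os.open(unique, os.O_WRONLY | os.O_CREAT, 0o644)
    os.close(fd)
    try:
        try:
            # link() either creates the lock name or fails because it exists.
            os.link(unique, lock_path)
            return True
        except OSError:
            # The reply can get lost over NFS, so trust the link count:
            # 2 means our link() actually succeeded on the server.
            return os.stat(unique).st_nlink == 2
    finally:
        os.unlink(unique)


def release_dotlock(lock_path):
    """Drop the lock by removing the lock file."""
    os.unlink(lock_path)
```

      A caller would normally retry try_dotlock() with a delay and some policy for stale locks; that part is left out here.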

  40. Disturbing by QuoteMstr · · Score: 2, Interesting

    Disturbingly enough, rename under OS X's HFS+ filesystem doesn't appear to be atomic even on a running system. If they can't get rename right on a running system, I'd hate to see what kind of scrambled mess the filesystem is after a crash.

  41. I haven't had a problem with it yet. by jadedoto · · Score: 1

    I've been running it for a few months now, and haven't had a single issue.

  42. "Probably?" by bonch · · Score: 1

    If you answer someone's question about a feature being ready for production systems with the word "probably" instead of "absolutely," then it's not ready for production systems.

  43. re ext4 by freddieb · · Score: 1

    I tried it briefly. I had no problems with it. However, I did not see the extreme performance promised and the rumours of data loss were surfacing so I switched back to ext3.

  44. ext4 also has space allocation issues by CarpetShark · · Score: 2, Interesting

    I tried ext4 as soon as it hit 2.6.28. I never ran into the KDE bugs, but I did notice it complaining that the filesystem was full despite many GB being free (and we're not talking about the relatively small amount reserved for root here).

    It certainly wasn't fit to be renamed from ext4dev at that stage.

    1. Re:ext4 also has space allocation issues by Anonymous Coward · · Score: 0

      Newer versions of e2fsprogs default to fewer inodes. Try running df -i to check your inode counts.

      I mount /usr/portage on a loopback device (helps to keep it from thrashing the /usr filesystem with frequent deletion/creation of hundreds of small files daily) and ran out of space for that very reason when I first tried ext4.
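
      If you would rather check from a script than eyeball df -i, something like the following should do. It is only a sketch: os.statvfs() is a real call, but the function names and the returned keys are made up here, and some filesystems report an inode total of 0, in which case the check simply says "not out of inodes".

```python
import os


def space_report(path):
    """Return free bytes and inode counts for the filesystem holding path."""
    st = os.statvfs(path)
    return {
        "bytes_free": st.f_bavail * st.f_frsize,  # space available to non-root
        "inodes_total": st.f_files,
        "inodes_free": st.f_favail,               # inodes available to non-root
    }


def out_of_inodes(path):
    """True when the filesystem reports inodes and none of them are free."""
    report = space_report(path)
    return report["inodes_total"] > 0 and report["inodes_free"] == 0
```

      A filesystem that says "full" while bytes_free is large and out_of_inodes() is True has run out of inodes, not blocks.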

  45. Why are you reading this title? by Thinboy00 · · Score: 1

    Too subtle. No one reads the titles these days.

    --
    $ make available
  46. HA HA HA by Anonymous Coward · · Score: 0

    Linux is a mess.

  47. Maybe it's just me... by fluffman86 · · Score: 1

    but it seems like I've been getting random freeze-ups since using it. Usually happens when downloading 500 MB of gmail into evolution, or when deleting/adding more than 100 or 200 MB of files in one fell swoop.

    See https://bugs.launchpad.net/bugs/327509 for more.

    1. Re:Maybe it's just me... by Virtex · · Score: 1

      It's not just you. I had the same problem on a laptop I use for work after doing a fresh install of Ubuntu 9.04 stable. For about three weeks it would freeze up almost every day, frequently two or three times a day. I thought the problem might have been with VMWare, so I removed it. I tried juggling around some of the drivers, but the problem continued until one day I backed up the entire filesystem, reformatted it as ext3, then restored all my data. The system has been completely stable since then - no crashes for the last month. From some of what I read, this may be an Ubuntu-specific problem, but I don't have experience with ext4 on any other distros so I can't say for sure.

      --
      For every post, there is an equal and opposite re-post.
  48. Ext4 is better for torrents by Nicolas+MONNET · · Score: 1

    When the torrent client creates the file (most fill it with zero to avoid fragmentation at the beginning), it's almost instantaneous, while with ext3 it can take a few dozens of seconds for large files. However I've experienced process lockups on ext4, nothing shows up in the log but the process accessing files on ext4 is unkillable (zombified).

  49. That is something I find peculiar... by Junta · · Score: 3, Insightful

    When they went to journalling filesystems, by and large a simple mount operation turned into a mini-recovery operation, a pseudo-fsck if you will. This would even happen on read-only mounts, which to me violates expectations of no disk data being modified.

    JFS had one 'quirk' that I think they got right, journal replay was an fsck-level event. A filesystem with a dirty journal could only be mounted read-only and the journal replay code was in fsck and had to be run to enable remount read-write. There are numerous reasons why I stopped using JFS, but that is one point I kinda agreed with their quirkiness on.

    --
    XML is like violence. If it doesn't solve the problem, use more.
  50. ext4 / KDE issues overblown by mpyne · · Score: 1

    The comments on this thread seem to be a bit mistaken on average on what the hubbub was regarding ext4 and KDE, so I'll try to clear it up a bit (I'm a KDE dev but I'm not speaking for KDE here of course).

    The "issue" with ext4 was that its handling of the standard write(); close(); rename(); idiom for replacing existing files by writing out a new file and then renaming it in-place over the old one could leave zero-length files lying around if the system crashed.

    ext4 never would spontaneously delete data merely because rename() was used, it was a side effect of its implementation that if the system crashed before the data had been written to disk but after the rename had taken effect then a zero-length file would be left in its place after restarting the system.

    Where KDE comes into the picture is that KDE 4 writes its updated settings to disk too frequently (which is a known bug, now fixed, KDE bug 187172 pertains). So, if you were starting up or shutting down your KDE session when the system crashed you'd likely have had quite a few config files written out in the past 60 seconds or so. ext4 is very good about writing out metadata so the renames would have taken effect. But apparently ext4 didn't force the actual file data to disk until 60 seconds had gone by (unless asked to via fsync()). So after the reboot there was a great chance that the $KDEHOME/share/config/*rc files had been effectively truncated, thus causing loss of settings.

    Many people have complained that KDE should do "the right thing" and use fsync() everywhere, but most people don't know that KDE had always done that... until ext3 became popular. ext3 suffers massive slowdown in the face of fsync() (although I guess some kernel hackers will have that mostly fixed in 2.6.30?) so KDE actually removed fsync() calls in response to user demand. And no fewer than two fsync() calls would be required: one to force the file to disk, and a second to force the update to the directory after the rename().
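
    To make the two-fsync version concrete, here is a minimal sketch of the idiom in Python (chosen for brevity). The function name and the ".new" temp suffix are hypothetical and this is not KDE's actual code; it only shows where the two fsync() calls would go relative to the rename().

```python
import os


def atomic_replace(path, data):
    """Replace path with data so a crash leaves either the old or new content."""
    tmp = path + ".new"  # hypothetical temp name; must be on the same filesystem
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # write() may be partial, so loop
            view = view[written:]
        os.fsync(fd)  # first fsync: force the new file's data to disk
    finally:
        os.close(fd)

    os.rename(tmp, path)  # swap the new file in over the old name

    # Second fsync: force the directory entry update (the rename) to disk.
    dirfd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)
```

    Dropping both fsync() calls gives the lazy variant discussed in this thread, which is only safe if the filesystem orders the data write ahead of the rename.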

    People claim that KDE violates POSIX standards but really the effect we get from using rename() without fsync() is exactly what we want, a kind of lazy "version A or version B, one or the other". At least for rc files, it is not at all important that version B be reflected system wide afterwards at the time the write() happens so fsync() is overkill. Of course ext4 isn't "violating" POSIX either, but most agree that its behavior was undesirable in this situation, so patches have been committed to ensure ext4 adds ordering in this case to ensure that metadata updates happen after data updates.

    I actually just converted all my filesystems to ext4 (from XFS) the other day, since it's at least possible with appropriate mount flags to get ext4 to act as a proper desktop FS. (I didn't know about XFS's similar issues with power loss until it was too late). So far everything is working nicely for me, although I haven't intentionally power cycled to test that case either.

    1. Re:ext4 / KDE issues overblown by fnj · · Score: 1

      I was with you until I reached the part where you said you converted "all" your file systems from xfs to ext4. Why would you switch from one busted ass corruption prone file system to another busted ass corruption prone file system? There are perfectly reliable file systems available, like ext3 and reiser3 (one could be forgiven for not considering the latter, but ext3 has been solid as a rock since just about the beginning).

      If you want to experiment with the unproven, buggy ext4, you are to be commended - but it is unwise to use it for "all" your data.

      P.S. My profound compliments to ALL the KDE developers for their remarkably competent and well thought out work over the years!

    2. Re:ext4 / KDE issues overblown by mpyne · · Score: 1

      Why would you switch from one busted ass corruption prone file system to another busted ass corruption prone file system?

      ext4 isn't busted by design. The major change (in regards to this bug) is that the default data mode changed from ordered to writeback. writeback is explicitly noted in their documentation as having increased risk of corrupted files in a power-loss situation so I wonder why they made the change, but you can switch it back and still retain the other upgrades ext4 provides over ext3 (which is what I've done).

      I guess the default may change again in later kernel releases to be safer and then corporations with fancy datacenters and UPSes everywhere could still use data=writeback but I can't speak for the kernel devs.
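
      If you want to see which data mode your ext4 mounts were given, a small sketch like this can parse text in /proc/mounts format. The function name is made up, and it assumes the mode shows up as a data=... entry in the options column; whether a default mode is listed there at all depends on the kernel, so None only means "not listed", not "unsafe".

```python
def ext4_data_modes(mounts_text):
    """Map each ext4 mount point to its data= option, or None if not listed."""
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected columns: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "ext4":
            continue
        mode = None
        for opt in fields[3].split(","):
            if opt.startswith("data="):
                mode = opt[len("data="):]
        modes[fields[1]] = mode
    return modes
```

      On a Linux box you would feed it the contents of /proc/mounts, e.g. ext4_data_modes(open("/proc/mounts").read()).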

  51. most apps already did the 2nd; still failed by Trepidity · · Score: 4, Informative

    KDE did already do the 2nd (what you list as correct), and most developers assumed that this was sufficient to keep the files in a consistent state, due to rename() being atomic. The problem is the sync issue you mention afterwards: the failure mode being encountered was that the rename() executed instantly to clobber the old file, while the new file still contained no data on disk. If the machine crashed in the window between the rename() and the sync, you have neither the old nor the new file.

    The main thing being discussed with KDE (and others) is how to fix this. Adding a sync() after every config update totally destroys performance, if you might update hundreds of small config files semi-frequently. See, for example, this discussion among Python folks for pros/cons of various options.

    1. Re:most apps already did the 2nd; still failed by Hucko · · Score: 2, Insightful

      Ah, the sync should come before the rename. I understood the problem as KDE truncating the old file before the sync. If you have the above system, why wouldn't you copy foo > foo.old, before working on foo.new? At the worst then, the user can copy foo.old back to foo, assuming there has been a crash between the foo.new rename and the sync. I thought this was the standard practice that the apps forgot to do.

      --
      Semi-automatic amateur armchair Australian philosopher; conjecture ready at any moment...
    2. Re:most apps already did the 2nd; still failed by Anonymous Coward · · Score: 0

      Is this broken on only Linux? If so then the behaviour on Linux is wrong and it needs to be fixed.

    3. Re:most apps already did the 2nd; still failed by Jake+Griffin · · Score: 1

      ...but if this is broken on Windows too, then the behavior is not a bug at all, but an undocumented feature, and Linux users should be grateful for the new, innovative technology.

      --
      SIG FAULT: Post index out of bounds.
  52. homogeneous environments first, please by Chris+Snook · · Score: 1

    As always, when considering a new technology, the best place to deploy it is in large, single-purpose setups that actually benefit from the differences relative to the thing you're replacing. You'd have to be completely nuts to migrate an existing data center full of disparate workloads to ext4, but if you're about to deploy a bunch of functionally identical streaming media servers where the improvements in handling large volumes and files will make a measurable difference, and the cost of validating the setup is amortized across several production systems, you'd really be nuts not to at least consider it. If there's an app that does something stupid, you have one bug report to file instead of 50, and you need to change one install configuration file to fix it.

    Don't be first, but definitely don't be last.

    --
    There's no failure quite as dissatisfying as a complete and total solution to the wrong problem.
  53. Fine since Jaunty Beta by Wolfger · · Score: 1

    I'm running KDE4 on EXT4 since March, and zero problems. Is it any better than EXT3? Not in any noticeable way. I think it's faster, but haven't benchmarked it. It's not the-difference-between-cable-and-dialup faster.

    1. Re:Fine since Jaunty Beta by jadedoto · · Score: 1

      I seem to notice a slight difference-between-DSL-and-cable difference.

  54. Is ext4 Stable For Production Systems? by Zero__Kelvin · · Score: 1

    Is ext4 Stable For Production Systems?

    "Earlier this year, the ext4 filesystem was accepted into the Linux kernel. "

    I love it when an ask Slashdot article has the answer to the question as the first sentence.

    --
    Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
  55. Wow ! by tbi · · Score: 1

    It does seem to be working well, even though it crashed it still managed to recognise that you were half way through doing something and posted the reply for you ! I'm presuming it does this for other data, that, for example, you're trying to save when you have a sudden power outage.

  56. To actually try to answer the question... by fmayhar · · Score: 2, Insightful

    ...the three reasons are performance, performance and performance.

    Ext4 has extents (and therefore loses indirect blocks), a better on-disk layout policy and generally better algorithms in its allocation code. Of course, performance varies depending on the app in question but we've found that it beats ext2 in almost every respect in our environment. (We don't run ext3 because journals cost performance [by buying reliability] and that's all ext3 gets you: a journal. This is why we wrote and submitted the no-journal hack for ext4.) In particular, ext4 beats ext2 for write-heavy loads by, well, lots. Yes, we've measured this stuff.

    So why would one go to ext4 over ext3? Because it's a better file system, not to mention one that's actually (a) being developed and (b) past pre-alpha.

    Of course, our environment is a tad different from most. We have *ahem* more than a _couple_ of servers.

  57. Heard of "Time Tested?" by Anonymous Coward · · Score: 0

    Ever heard the term "Time Tested?"
    EXT4 will be "time tested" in another 3 years.

    That's when production users will **know** how safe this file system is to deploy.
    Before that, you have a bunch of unproven yahoos saying "works fine for me" OR "crashes my system constantly" based on individual experience.

    I won't be touching it for another 2.5 years and then it will go into a lab for 6 months of testing with
    - our hardware
    - our operating systems
    - our patch methods
    - our applications
    - our people running the systems.

    Good luck and be careful out there.

  58. Better tool !!! by DrYak · · Score: 1

    I've never used anything other than Reiser3 with Linux. Partition resizing.

    Which by itself is a sure win of Reiser3/4(*) and Ext2/3/4 versus JFS and XFS which can only increase size.
    That's part of the reason I'm using it too.

    Readable from within Windows via YaReG.

    Another tool you might be interested in too:
    Virtual Volumes.
    - It's by the same guy who wrote Explore2fs
    - It supports ReiserFS using the same tools (rtstools) as YaReG.
    - It supports also RAID and LVM.
    - Read/Write support.

    And hopefully, they'll end up adding WebDAV support so you can mount file systems under Windows.

    (*): Now that development has been taken over by Edward Shishkin, shouldn't this get renamed as ShishkinFS ? :-P

    --
    "Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
  59. You must be joking by Burz · · Score: 1, Flamebait

    EXT4 is the newcomer and KDE has worked fine not just with EXT3, but with all the other filesystems like XFS, JFS, Reiserfs and even Reiser4.

    It's clear that EXT4 was NOT tested with crucial existing system software before release. And now that the audio and graphics architectures are a screwed up mess due to disregard for PC users and their most critical software and use cases, we can thank Linus & co for poisoning even the filesystem itself, which will get foisted on unknowing users who won't even know enough to change the default fs or to make a choice between data integrity and performance.

    Grrreeeaaaattt job!!!

    1. Re:You must be joking by BrokenHalo · · Score: 2, Informative

      we can thank Linus & co for poisoning even the filesystem itself, which will get foisted on unknowing users...

      Ext4 might have made it into the tree, but Linus hasn't made it a default. There are lots of things in the tree that are clearly marked as "experimental", and neither Linus nor anyone else expect or recommend that you use them in a production environment. If your distribution DOES make it a default, I would seriously suggest you find a different distro.

    2. Re:You must be joking by steveb3210 · · Score: 1

      It's available in Ubuntu for the current version and, last I read, will be the default in Kosmic..

    3. Re:You must be joking by Anonymous Coward · · Score: 0

      ext4 is neither marked as experimental, nor does Linux ever impose a "default" filesystem. It just provides a crapload of filesystems and the distributions pick which one will be "default".

    4. Re:You must be joking by BrokenHalo · · Score: 1

      It's available in Ubuntu for the current version and, last I read, will be the default in Kosmic..

      Any filesystem is available as an option with any distribution. All it involves is a kernel option. But to make ext4 a default this soon in (I presume you mean) Karmic Koala seems unnecessarily rash to me.

    5. Re:You must be joking by mR.bRiGhTsId3 · · Score: 1

      Actually, I think the DE developers have learned that to assume makes an ass out of you and me. Sure, ext4 broke the established behavior for most filesystems, but the fact it follows specification means that someone was making unwarranted assumptions about file system performance. It's now the DE's fault that they can't fix it without destroying performance with an I/O-bound bottleneck required to make sure that they actually save their data.
      Maybe this is a sign that millions of plain text configuration files aren't always the best idea when something like .kde can be as large as 150 mb, and all the DEs need to give some thought to optimizing their config storage so that it is both fast and robust in the face of quirky file systems.

    6. Re:You must be joking by AvitarX · · Score: 1

      I bet the ordered writing is turned on by default though.

      This will make all the whining irrelevant.

      --
      Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
  60. EXT4 is not broken, script kiddies are broken. by sowth · · Score: 1

    1. A power failure is a hardware failure.
    2. Ext4 is an experimental filesystem which just came out. You are an idiot for expecting it to be perfect and stable.
    3. If you really use the power button because proper shutdown is "too slow" (ever hear of pressing the button and walking away?), then you are an even bigger idiot. While some of the shutdown process for most Linux distros are too long because they were designed for servers, at the very least the kernel needs time to unmount the filesystems and write remaining data to disk. Cutting the power doesn't allow this to happen.

    Computers aren't magic boxes. They have real physical limits. Software engineers aren't magicians. They need time to perfect their code.

  61. ext4 at my house by FatherDale · · Score: 1

    I've been using it since, what, 2 days after Ubuntu 9.04 came out, on a 250 GB HP convertible and a 20 GB SSD eee 901. No issues. I'm impressed by the speed. It does work, and I'm very ok with that.

  62. It's working for me . . . by Anonymous Coward · · Score: 0

    I am not running what you would call a production system. But I installed the latest Ubuntu with the EXT4 file system and it works just fine.

    If you're concerned with specific applications, what is keeping you from testing those on a limited basis in a production environment with EXT4, making frequent backups on those computers as a precaution against what you are worried about?

    If you're happy with EXT3 and it works without problems in your work place, why do you feel the need to change? Yes, my Ubuntu boots faster with EXT4. Early development feedback suggested that this would be the case. But once the computer is up and running, I can't say that I notice a large speed difference, so if you're looking at productivity improvements, I wouldn't bother with changing your file system.

    Why not just wait until EXT4 has been in general use for a while, then make the switch when you upgrade your Linux distro? By then, if your concerns are valid, they will have either been addressed or not, and you will know if it's something you should avoid.

    My understanding, explained by the development people at Ubuntu, is that there were two ways of keeping track of where data was on a hard disk partition, and some programs didn't work well with one of them. That issue was addressed in Ubuntu, and I am sure it was also addressed in any recent major distro that is offering the option of using the EXT4 file system. I don't remember the specifics, as it was complicated and beyond me to completely understand it, but I grasped the basic concept and was satisfied that they pinned down the issue that caused serious problems and addressed it well enough to include the option to install that file system when installing Ubuntu.

    Whether you switch file systems or not, you should have a good backup system. Some businesses still don't have a reliable backup plan. I hope you're not one of them. Whether you're using EXT3 or 4, things can and do happen, even hardware failure that is independent of whatever file system you are using.

  63. What a lame post by marcus · · Score: 1

    How did this 'article' ever get past the filters?

    Of course the answer is 'if you have to ask, then you can't afford it'.

    While the P gave a very detailed response, the ultimate answer is: you should not be asking this question.

    --
    Good judgement comes from experience, and experience comes from bad judgement.
    - W. Wriston, former Citibank CEO
  64. Use a real, reliable filesystem by NekoXP · · Score: 1

    Forget it. ext4 is still not stable even though they took the "dev" tag off. The filesystem is going to have problems upon problems upon problems for months and months and months and you're going to be looking at updating and patching kernels in perpetuity to keep up with those patches. The only thing they stabilized was featureset and on-disk format.

    If you want the features of ext4 for production, you should already be running XFS. You can't get much better on Linux filesystems these days. By the time ext4 gets to be "production worthy" (by this I mean has spent a year in the tree and become the default on major Linux distributions meaning thousands of people have installed on it, found all the showstopper bugs and had it patched in distro downstreams at least), you can bet Btrfs will be out and "stable" anyway, and you'll be asking about whether THAT is ready for production.

    Use XFS now. Wait 18 months for Btrfs, if it is indeed any better. If not, and still looking for a filesystem challenge today, work out a way you can use ZFS - which has its problems on Linux (especially that it needs to run through FUSE - not a performance issue so much as a technical nightmare to use it as your root filesystem that no distro actually would support because of moronic non-free policies).

  65. Not for me... by shreddertomas · · Score: 1

    All I know is that I converted my laptop from ext3 to ext4. It became unstable as hell freezing over all the time. Just reinstalled it with ext3 and everything is fine and stable.
    Possibly it was because one time my comp froze up and had to be forcibly shut down; after that an fsck was run that had to fix some errors. After that all hell broke loose.
    On a laptop it should be possible that the battery runs out without having to reinstall everything, at least that works with ext3.
    So far I value the robustness of ext3 more.

  66. Switched to ext4 - Time is money by josephcmiller2 · · Score: 1

    Yeah, I have ReiserFS on my laptop, and everyone here should really switch to that for desktops at their earliest convenience. Quick fsck, reasonably reliable, extremely fast: faster than ext3, faster than ext4, very fast. On my desktop computer, however, I didn't have the time to take the system down to reformat for ReiserFS. So I switched to ext4 because it's faster. Time is money, so don't argue "don't fix it if it ain't broke" with me. I'll tell you when it's broke: it's broke when the long-abandoned ReiserFS is fast as shit on a slow machine while ext3, on more recent hardware, looks like performance wasn't even considered. Time is money, guys. Yes, I can be more productive with ext4 when I have 8 virtual desktops and multiple projects for multiple companies, each requiring a completely different application set running simultaneously on my computer, and I'm just trying not to get more than a week behind on any of them.

  67. ReiserFS FTW! by crhylove · · Score: 1

    I still use ReiserFS. It's killer!

    --
    I hold very few opinions. I hold information based on observation and fact. If you wish to disagree, please use facts.
  68. no problems with it here by mrdtr · · Score: 1

    I've been using it with Kubuntu since 9.04 came out, and haven't had any problems at all. Maybe I just got lucky, but I doubt it. I regularly back up everything anyway, so I'm not too worried.

  69. Ext4 works well by guliverk · · Score: 1

    I use ext4 on my new notebook without any problems.

    --
    JMule user : http://www.jmule.org
  70. ZFS does not have that problem. by jotaeleemeese · · Score: 1

    It does not even have an equivalent to fsck.

    --
    IANAL but write like a drunk one.
    1. Re:ZFS does not have that problem. by ivucica · · Score: 1

      Have RAID? I don't. Have Solaris? I don't. Run your home machine on server hardware, in general? I don't.

      And reiserfs also doesn't seem to have a real equivalent to fsck/chkdsk. Still not a reason to forcefully shut down. After all, your FS in general may recover, but are you sure that one of those programs you're running right now won't break its database backend and leave it inconsistent?

      Better safe than sorry when it comes to data. Would you forcefully shut down a server bringing your business $2,000 in sales every 10 minutes and risk corrupting anything? Didn't think so. You would probably recover, but why go to the trouble?

  71. Nor ZFS.... by jotaeleemeese · · Score: 1

    We must face it: ZFS ate Linux's lunch this time...

    --
    IANAL but write like a drunk one.
    1. Re:Nor ZFS.... by davecb · · Score: 1

      Sure did: a good healthy competition between the Solarii, BSD folks and Linus is a wonderful thing (;-))

      --dave

      --
      davecb@spamcop.net
  72. I have worked 18 years with UNIX. 10 with Linux by Anonymous Coward · · Score: 0

    I think it does no harm to ask questions about anything here.

    Sometimes you need people with relevant experience who are somewhat detached from a given issue but can still provide useful advice.

  73. MySQL database corruption by Anonymous Coward · · Score: 0

    On CentOS 5.3, we get data corruption with multiple versions of MySQL, but only if we use ext4. ext3 works fine, but it's the same old crappy performance.

  74. You don't need server grade machines to run Solaris by jotaeleemeese · · Score: 1

    As long as you can get things working you should be OK.

    I have installed it on Shuttle machines, and the only thing that didn't work out of the box was the network card driver, which I found quickly after googling a bit.

    --
    IANAL but write like a drunk one.
  75. Re:You dont need server grade machines to run Sola by ivucica · · Score: 1

    You're right. You don't need a server grade machine to run it. You probably don't need RAID to use ZFS either.

    But I'm a "home" user. Why would I use a platform with narrower hardware support when I'm already having issues with some hardware under Linux? My wireless card on my laptop is working ... mostly. How would it perform under Solaris?

    What about all the software? I enjoy Debian and precompiled software. Even installing packages under Debian takes too long for me; I can't imagine compiling all the software not included in Solaris. Not to mention that most of it isn't regularly tested under Solaris.

    Tell me again, why would I, as a home user, use a platform INTENDED for enterprise, and usually server, environments?