Slashdot Mirror


Correcting ext3 File Corruption?

An anonymous reader asks: "I am looking for an ext2/ext3 expert. I have a small file (1395 bytes) that appears HUGE when running ls -l (70368744179059 bytes [yes, that's 70 terabytes]). This causes a problem because tar wants to back up all those extra bytes. We have backups of the file elsewhere, but I'm afraid to delete it. When I remove it, what is going to happen to the file system (kernel version is 2.4.18 on i686)? This seems to be a pretty bad math error on the part of the file system. This is a really weird error, but could just be the issue of a corrupted sector on the drive. Has anyone else seen this before and have any ideas as to whether such files can be recovered? Is this problem just a small glitch or an omen of an impending filesystem crash?

"Here's what the files look like on the system:

[root@secure parse]# ls -l HTMLFrameSet.class
-rw-rw-r-- 1 root devel 70368744179059 Mar 20 09:05 HTMLFrameSet.class

[root@secure parse]# wc HTMLFrameSet.class
15 58 1395 HTMLFrameSet.class
...and the error message from tar:
tar: HTMLFrameSet.class: File shrank by 70368744169331 bytes; padding with zeros
No wonder my backups didn't finish! :-)"

10 of 74 comments

  1. And when you run "fsck"? by Zocalo · · Score: 5, Informative
    Since EXT3 is just EXT2 with a journal tacked on, there is no reason why you can't run the EXT2 fsck utility across it in the normal way. You are obviously worried about losing the entire file system, so you probably want to start by running fsck with the verbose (-V) and interactive (-r) options to see exactly what is going on and have the ability to prevent unwanted changes being made.

    Since you appear to use tar for backups, you could also back up the affected filesystem first, skipping the bad file with --exclude=[filename] (or -X [file listing the names to exclude]), which might be a *really* good idea. ;)

    --
    UNIX? They're not even circumcised! Savages!
    1. Re:And when you run "fsck"? by Linux_ho · · Score: 3, Informative

      I'd like to add that fsck is ext3-aware. If the journal looks OK, it might not actually check the filesystem unless you tack on the -f option to force the issue.

      --
      include $sig;
      1;
  2. dd by Yarn · · Score: 3, Interesting

    As you know how long it is supposed to be:

    dd if=[file] of=[new file] bs=1 count=[length]

    I strongly suggest rebuilding the affected filesystem, that kinda weirdness can be indicative of deeper problems.
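
    For anyone who would rather not run dd one byte at a time, here is a minimal Python sketch of the same idea: copy only the first N bytes of the file into a new one. The function name copy_prefix and the output file name in the comment are made up for illustration; the 1395-byte length is the one wc reported in the question.

```python
def copy_prefix(src_path, dst_path, length, chunk_size=65536):
    """Copy only the first `length` bytes of src_path into dst_path.

    Returns the number of bytes actually copied, which is smaller than
    `length` if the source ends early.
    """
    remaining = length
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while remaining > 0:
            chunk = src.read(min(chunk_size, remaining))
            if not chunk:  # source ran out before `length` bytes
                break
            dst.write(chunk)
            remaining -= len(chunk)
    return length - remaining

# Hypothetical usage for the file in the question:
# copy_prefix("HTMLFrameSet.class", "HTMLFrameSet.class.copy", 1395)
```

    Compare the copy against a known-good backup before trusting it.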

    --
    -Yarn - Rio Karma: Excellent
  3. Sparse file? by Tony-A · · Score: 3, Informative

    from man tar
    -S, --sparse
    handle sparse files efficiently

    I'm not really familiar with them, but haven't seen any other mention here.
    I know it's possible to put a file on a floppy that won't fit on your hard drive.

  4. hex by Merlin42 · · Score: 5, Insightful

    I don't know enough about filesystems to say what the implications are but:
    the reported size in hex is
    0x400000000573
    and the actual size in hex is
    0x573

    Looks like a single extra bit got flipped when the size was stored.
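
    The arithmetic is easy to check. A small Python sketch, using only the two sizes quoted in the question, XORs them to find which bits differ (bits are counted from 0 at the least significant end):

```python
reported = 70368744179059   # size shown by ls -l
actual = 1395               # size counted by wc

# XOR keeps only the bits that differ between the two values.
diff = reported ^ actual
flipped_bits = [i for i in range(diff.bit_length()) if (diff >> i) & 1]

print(hex(reported))   # 0x400000000573
print(hex(actual))     # 0x573
print(flipped_bits)    # [46]
```

    Exactly one bit differs (bit 46, i.e. 2**46 = 70368744177664), which is consistent with the single-bit-flip reading.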

  5. This is a sparse file.... by weave · · Score: 5, Informative
    It has holes in it. We once ran a medical package 10 years ago that did this on purpose. A 40 gig file took about 4 megs on disk.

    This is easy to simulate by writing a small program that scribbles a few bytes to offset zero, then does an fseek out to some insane high offset, then scribbles a few bytes there. Close, do an ls, see the huge file, but then note it only takes the space of two blocks on your file system. Imagine the fun you can have with this trick at parties!
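
    A minimal sketch of that experiment, in Python rather than C (so seek stands in for fseek). The function name make_sparse_file and the demo file name are made up, and st_blocks is assumed to be available, which is true on Linux and most Unix systems but not on Windows. Whether the hole is really left unallocated depends on the filesystem.

```python
import os

def make_sparse_file(path, hole_size):
    """Write a few bytes at offset 0, seek far out, write a few more.

    Returns (apparent_size, allocated_bytes). The allocated figure comes
    from st_blocks, which POSIX counts in 512-byte units.
    """
    with open(path, "wb") as f:
        f.write(b"head")      # a few bytes at offset zero
        f.seek(hole_size)     # jump to an absolute offset, leaving a hole
        f.write(b"tail")      # a few bytes at the far end
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512

# Hypothetical usage:
# apparent, allocated = make_sparse_file("sparse-demo.bin", 1 << 30)
# `apparent` is hole_size + 4; on a filesystem that supports holes,
# `allocated` should be only a few kilobytes.
```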

    Every UNIX file system I've ever dealt with handles this the same way.

    tar and other programs should have switches to deal with sparse files correctly.

    If you're concerned about what's in it, cat it to od. I believe od is smart enough to collapse zero blocks in its display. That way you can see if there is any real data at some pointer far into the file.

    If this is a commercial closed-source package where you can't verify what it's doing, I'd strongly suggest leaving it alone and contacting the vendor to see if this behavior is normal.

    1. Re:This is a sparse file.... by ivan256 · · Score: 4, Interesting

      While what you say about sparse files is generally true, that's probably not what this is. This is probably a single bit error on this guy's hard drive. There's probably more of them, but he noticed this one because it popped up in a noticeable location. The hard drive is probably on the way out, or he's got some faulty memory (if it's ECC, otherwise this could just be a fluke).

      Tar does deal with sparse files correctly, and if this were one, he wouldn't be having trouble.

  6. Try the mailing list by Outland+Traveller · · Score: 5, Informative

    Why don't you try the ext3 mailing list instead of Ask Slashdot? I lurk on the list and I've seen a number of questions extremely similar to yours, with answers. The list gurus will even help you track down the problem.

    https://listman.redhat.com/pipermail/ext3-users/2002-July/thread.html#383

  7. another ext3 question by superid · · Score: 4, Funny

    I know this isn't an ext3 help channel...but I haven't gotten a satisfactory answer elsewhere (usually it just consists of a "*shrug*" on the #redhat channel)

    I've got a ThinkPad running RH 7.3 with two ext3 partitions. Being a laptop, it has occasionally had its batteries die and been shut down improperly. Invariably, there has been a subsequent long fsck .... long....like 10 minutes....once I even was dropped to the maintenance shell to run it manually (yes, yes, yes, yes, yes, yes, yes...when the hell would I *NOT* want to fix the non-zero dtime????)

    Isn't the whole point of ext3 so I don't have to go through this pain? This was an extremely generic installation of 7.3, why am I seeing no benefit to ext3?

    Thx,

    SuperID

  8. Deletion question by obtuse · · Score: 3, Interesting

    I think you're right about the flipped bit. Copy the file with dd, specifying the right output size.

    I'd bet there are problems with the whole filesystem, but to continue with what he asked:

    It seems to me that he should be able to rm the file without any worries, after making a good copy. Only the inode that points to the falsely enlarged file will be removed, and the data blocks won't be touched, right?

    If there is other data in the misallocated blocks, that data should either have its own references, or it's already as good as deleted anyway.

    --
    Assembly is the reverse of disassembly.