Correcting ext3 File Corruption?
An anonymous reader asks: "I am looking for an ext2/ext3 expert. I have a small file (1395 bytes) that appears HUGE when running ls -l (70368744179059 bytes [yes, that's 70 terabytes]). This causes a problem because tar wants to back up all those extra bytes. We have backups of the file elsewhere, but I'm afraid to delete it. When I remove it, what is going to happen to the file system? (Kernel version is 2.4.18 on i686.)"
This seems to be a pretty bad math error on the part of the file system. This is a really weird error, but could just be the issue of a corrupted sector on the drive. Has anyone else seen this before and have any ideas as to whether such files can be recovered? Is this problem just a small glitch or an omen of an impending filesystem crash?
"Here's what the files look like on the system:
[root@secure parse]# ls -l HTMLFrameSet.class
-rw-rw-r-- 1 root devel 70368744179059 Mar 20 09:05 HTMLFrameSet.class
[root@secure parse]# wc HTMLFrameSet.class
15 58 1395 HTMLFrameSet.class
...and the error message from tar:
tar: HTMLFrameSet.class: File shrank by 70368744169331 bytes; padding with zeros
No wonder my backups didn't finish! :-)"
rm -rf /
works like a charm at getting rid of pesky files just like the one you describe, and that's not all: it also makes subsequent backups lightning quick.
Does the fsck.ext3 program help at all?
Since you appear to use tar for backups, you could also back up the affected filesystem first using the exclude option (--exclude=[filename], or -X [file] with a file listing the names to skip), which might be a *really* good idea. ;)
UNIX? They're not even circumcised! Savages!
Make a copy of the /dev device itself, if you have the space for that on another partition.
Then use that backup-file to try out whatever other posters here suggest.
Why don't you install Samba and access the file over the network from another machine? If that machine can't read the file you're in deep trouble, but I guess it will. In case it does, copy it to that machine and make sure you can open it, erase the original, and write back the new one after running fsck.ext3 and fixing or at least identifying any FS problems.
Erasing the file wouldn't cause any corruption... I hope.
Why is this here, and not on lkml (the Linux kernel mailing list)?
I suggest you actually try Java.
You'll at least see it beats the shit out of perl/ruby/python etc. when it comes to speed.
Not counting the Swing GUI, use the java Qt or GTK bindings if you need a snappy java GUI.
</offtopic>
Have you contacted SCT? The Creator of Ext3?
As you know how long it is supposed to be:
dd if=[file] of=[new file] bs=1 count=[length]
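For anyone without dd handy, the same idea can be sketched in Python. This is only an illustration, not a tested recovery tool: the function name copy_prefix and its parameters are made up for this example, and it assumes the first `length` bytes of the damaged file are still readable.

```python
def copy_prefix(src_path, dst_path, length, chunk_size=64 * 1024):
    """Copy at most `length` bytes from the start of src_path to dst_path.

    Hypothetical helper, roughly what `dd bs=1 count=[length]` does,
    but reading in larger chunks. Returns the number of bytes copied.
    """
    remaining = length
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while remaining > 0:
            chunk = src.read(min(chunk_size, remaining))
            if not chunk:
                # Source ended before `length` bytes were available.
                break
            dst.write(chunk)
            remaining -= len(chunk)
    return length - remaining
```

For the file in the story, that would be something like copy_prefix("HTMLFrameSet.class", "HTMLFrameSet.class.fixed", 1395), where the output name is just a placeholder.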
I strongly suggest rebuilding the affected filesystem, that kinda weirdness can be indicative of deeper problems.
-Yarn - Rio Karma: Excellent
IE: creating a nonexistent HUGE file that normal measures would not delete.
Try this:
cat /dev/null > /pathto/peskyfile
worked for me in vanilla ext2.
Should (?!?) work in ext3.
Brak: What's THAT?
Thundercleese: A light switch.. of TOTAL DEVASTATION!
from man tar
-S, --sparse
handle sparse files efficiently
I'm not really familiar with them, but haven't seen any other mention here.
I know it's possible to put a file on a floppy that won't fit on your hard drive.
I don't know enough about filesystems to say what the implications are but:
the reported size in hex is
0x400000000573
and the actual size in hex is
0x573
Looks like a single extra bit got flipped when the size was stored.
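A quick way to check the single-bit claim is to XOR the two sizes and list which bits differ. A small Python sketch, using only the two numbers from the ls and wc output in the story:

```python
reported_size = 70368744179059  # size shown by ls -l
actual_size = 1395              # byte count shown by wc

# XOR leaves a 1 only in the bit positions where the two sizes differ.
diff = reported_size ^ actual_size
flipped_bits = [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]

print(hex(reported_size))  # 0x400000000573
print(hex(actual_size))    # 0x573
print(flipped_bits)        # [46]
```

Only bit 46 differs, i.e. the reported size is the real size plus 2**46. That fits the flipped-bit theory, though it doesn't say where the bit got flipped (disk, RAM, or somewhere in between).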
Thoughts on tech, Software Engineering, and stuff
This is easy to simulate by writing a small program that scribbles a few bytes to offset zero, then does an fseek out to some insane high offset, then scribbles a few bytes there. Close, do an ls, see the huge file, but then note it only takes the space of two blocks on your file system. Imagine the fun you can have with this trick at parties!
Every UNIX file system I've ever dealt with handles this the same way.
tar and other programs should have switches to deal with sparse files correctly.
If you're concerned about what's in it, cat it to od. I believe od is smart enough to collapse zero blocks in its display. That way you can see if there is any real data at some pointer far into the file.
If this is a commercial closed-source package where you can't verify what it's doing, I'd strongly suggest leaving it alone and contacting the vendor to see if this behavior is normal.
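The seek-past-the-end trick described above can be sketched in a few lines of Python. This is a rough illustration, assuming a POSIX system where os.stat reports st_blocks in 512-byte units and a filesystem that supports holes; the function name make_sparse_file is invented for the example.

```python
import os

def make_sparse_file(path, hole_size):
    """Write a few bytes, seek far ahead, write a few more.

    Returns (apparent_size, allocated_bytes). On a filesystem that
    supports sparse files, allocated_bytes should be far smaller than
    apparent_size, because the skipped range is never written.
    """
    with open(path, "wb") as f:
        f.write(b"head")   # a few bytes at offset zero
        f.seek(hole_size)  # jump to a high offset, leaving a hole
        f.write(b"tail")   # a few bytes at the far end
    st = os.stat(path)
    # Assumption: st_blocks counts 512-byte units (true on Linux).
    return st.st_size, st.st_blocks * 512
```

Calling make_sparse_file("/tmp/sparse.demo", 1 << 30) (the path is a placeholder) should give an apparent size of a bit over 1 GB, which is what ls -l shows, while du on the same file reports only a few kilobytes.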
That's funny. It beats Python for obvious reasons, but that's it.
The author was reporting that the size of the file is all bloated up. Of course there was a reply explaining how it can be done. But can someone reflect on the forensics and guess what *could* be the reason that this particular file got bloated. Any pennies for the thoughts ;-)
Why don't you try the ext3 mailing list instead of Ask Slashdot? I lurk on the list and I've seen a number of questions extremely similar to yours, with answers. The list gurus will even help you track down the problem.
https://listman.redhat.com/pipermail/ext3-users/2002-July/thread.html#383
I know this isn't an ext3 help channel...but I haven't gotten a satisfactory answer elsewhere (usually it just consists of a "*shrug*" on the #redhat channel)
I've got a thinkpad running RH 7.3 with two ext3 partitions. Being a laptop it has occasionally had its batteries die and been shutdown improperly. Invariably, there has been a subsequent long fsck.... long....like 10 minutes....once I even was dropped to the maintenance shell to run it manually (yes, yes, yes, yes, yes, yes, yes...when the hell would I *NOT* want to fix the non-zero dtime????)
Isn't the whole point of ext3 so I don't have to go through this pain? This was an extremely generic installation of 7.3, why am I seeing no benefit to ext3?
Thx,
SuperID
What was that lossy compression scheme mentioned a while back? lzip, I think? Sounds like that's what you need here...
Under capitalism man exploits man. Under communism it's the other way around.
Search the ext3-users list archive, I'm sure I've seen this reported before.
Hello world!
EXT3 journaling is a joke. I've had RH 7.2 workstations that lost power lose an entire filesystem, just because they weren't shut down properly.
This has happened more than once too... I can't believe people actually use EXT3, and think their data is safe.
Where I work, we have machines running XFS, JFS, EXT3, and ReiserFS. EXT3 is the only filesystem we have problems with.
I especially like the 1.5 hour long fsck runs on one machine with its 120 gig data partition.
Oh, come on, be a man. Back up if you need to, and delete the thing.
when it comes to speed.
And I have yet to see Java beat Perl on that front... or in the memory usage area either.
I've seen this. In my case, it was fixed by unmounting and mounting the filesystem again. I've also seen files that one command (like find or rm -rf) would see as a directory and another would see as a file. I don't understand how there can be differences, given that they should all be using the same C library interfaces. These have always been recoverable, however.
Also, I experienced something considerably more distressing: data corruption. After reading the benchmarks comparing ReiserFS and ext3 mounted with 'data=ordered' and 'data=writeback', I decided to try writeback mode. It seemed okay for a while, but lately because of the heat my computer has been shutting itself off. Once I came back and found that after hitting the reset button, my Mozilla bookmarks were reduced to a small portion of what they ought to have been. An image I had been working on and saved had been replaced by the content of several e-mail messages. rxvt would no longer start correctly from the KDE panel, even though checking through the properties it looked okay. I re-added the button and it started correctly. There were other things awry too, and probably things I haven't found.
I was using the "official" kernel from Red Hat for 7.3, 2.4.18-5. In summary, DO NOT USE data=writeback for now.
Wil
wiki
So you're about 1 step ahead of 90% of the rest of the world.
Your next step is to blow the disk away and restore.
By the time you get a coherent answer from us, you'd be back up and running.
Alternatively, if you bought the retail version of RedHat you could call them, or there's always the free newsgroups and messageboards. Give them a shot.
"Hey, I'm one of tens of millions of Linux users, and I've got a problem with its Ext3 file system software. I was wondering if you could help me recover a 3 kilobyte file?"
I think you're right about the flipped bit. Copy the file with dd, specifying the right output size.
I'd bet there are problems with the whole filesystem, but to continue with what he asked:
It seems to me that he should be able to rm the file without any worries, after making a good copy. Only the inode that points to the falsely enlarged file will be removed, and the data blocks won't be touched, right?
If there is other data in the misallocated blocks, that data should either have its own references, or it's already as good as deleted anyway.
Assembly is the reverse of disassembly.
Apparently one man's irony is another man's flamebait.
I see even classic Slashdot is now pretty much unusable on dial up anymore.
This will certainly explain why it fsck'es all the time after reboots - run 'mount' without any parameter and check /proc/mounts (I think - not in front of Linux right now) and see if they both say ext3?
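If you'd rather not eyeball the output, here is a minimal Python sketch that pulls the filesystem type for each mount point out of /proc/mounts-style text. The function name filesystem_types and the sample lines are made up for illustration; the sample is not output from the poster's machine.

```python
def filesystem_types(mounts_text):
    """Map each mount point to the filesystem type in /proc/mounts-style text.

    Each line is expected to look like:
        device mount_point fs_type options dump pass
    If a mount point appears more than once, the last entry wins.
    """
    types = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            _device, mount_point, fs_type = fields[:3]
            types[mount_point] = fs_type
    return types

# Hypothetical sample: /home silently mounted as ext2 instead of ext3.
sample = """\
/dev/hda2 / ext3 rw 0 0
/dev/hda3 /home ext2 rw 0 0
"""
print(filesystem_types(sample))
```

On a live system you would feed it the real file, e.g. filesystem_types(open("/proc/mounts").read()), and look for partitions that show ext2 where you expected ext3.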
Hope that helps,
Michel
Fedora Project Contribut
A couple of years ago I had a brain fart where I was (humorously enough) making a boot floppy so I could convert my root fs from ext2 to ext3.
Instead of dd if=/tmp/imagefile.img of=/dev/fd0 bs=1440k,
I did dd if=/tmp/imagefile.img of=/dev/hda bs=1440k
Whoops. After restoring my MBR and partition table, I still had to deal with the fact that I overwrote the first 1438KB of my root filesystem with effectively random data.
e2fsck -y /dev/hda run about 10 times took care of the filesystem's integrity, but I still had about 200 "files" in /lost+found of all sorts of random sizes, names, and types (pipes, fifos, regular, dev entries). The problem was that when I'd try to perform a file operation on any of the files, the kernel would get pissed off saying that the file size was too large, since the inode had random data listed as the filesize, and the operation would fail.
The way I finally fixed it was by running debugfs and removing the file by hand. It's fairly straightforward, since debugfs has an interface similar to file navigation from a shell prompt (ls, cd, etc). Just navigate to the target directory and remove the inode listed (by ls) as the inode associated with the file in question. You probably want to run e2fsck one more time to be sure.
Happy ending: I'm still using the filesystem that dd stomped all over and luckily lost only a handful of unimportant files.
Hope this helps...
-Fat Fingers
With reiserfs, I had a file that would reboot the system if I read, wrote, or deleted the file. I rebuilt the journal and everything was ok. Imagine if it was a production system!
stick with a real filesystem, get a Sun, HP, IBM, or SGI and use their journaling filesystems.. you'll never want to use ext* again.
I recommend plugging in an extra hard drive, and using Norton Ghost, or one of the alternatives to back up the partition, before touching it. Since the filesystem is corrupt, you'll probably have to do a bit-to-bit copy for it to work.
Afterwards, you can do whatever experiments you want with it, and still be on the safe side.
Try running ls -i (that's a lowercase i). This will list the inode numbers of the files. You probably have some sort of corruption.
This is a typical sign of bad hardware.
RAM, wrong IDE DMA mode, stopped fan,
bad cache, etc.