Auditing Large Unix File Systems?
jstockdale asks: "The recent article on perpendicular recording hard drive technology brought me, as a unix(tm) admin, to reflect on the management of data systems and file servers of capacities >1TB (which exist today and tomorrow will become commonplace). Since Google for once seems useless, what suggestions does the Slashdot crowd have on methods and software to audit changes, visualize file system usage, and in general to determine the qualitative and quantitative nature of the content of large unix file systems?"
I like the idea of treemaps.
:)
http://www.cs.umd.edu/hcil/treemap-history/index.shtml
Hehe, it was originally made to see what was taking up all the room on an 80MB hard disk.
There are various programs based on this concept, most working like "du", except that you get the results graphically. You typically see one large picture on screen showing which directories and files take up the most space. It looks like a piece of Mondrian artwork, with the area of each rectangle proportional to the space taken, so you can see at a glance what is hogging the disk. You can drill down, of course, by clicking to zoom in.
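To make the "du, but graphical" idea concrete, here is a rough Python sketch of the two steps such a tool needs: totalling bytes per directory, then cutting a rectangle into strips proportional to those totals. The names `dir_sizes` and `slice_layout` are made up for illustration. The strip layout is the simplest "slice" variant, not necessarily what SequoiaView or any other tool does, and the totals are apparent file sizes rather than allocated disk blocks, so they can differ from what `du` reports.

```python
import os

def dir_sizes(root):
    """Return {directory path: total bytes of all files beneath it}."""
    totals = {}
    # Walk bottom-up so each child's total is known before its parent is visited.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        total = 0
        for name in filenames:
            try:
                # lstat: count a symlink's own size, do not follow it.
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
        for name in dirnames:
            # Symlinked directories are never walked, so they contribute 0.
            total += totals.get(os.path.join(dirpath, name), 0)
        totals[dirpath] = total
    return totals

def slice_layout(items, x, y, w, h):
    """Split the rectangle (x, y, w, h) into side-by-side strips whose
    areas are proportional to the sizes in items [(name, size), ...]."""
    total = sum(size for _, size in items)
    rects = []
    for name, size in items:
        frac = size / total if total else 0
        rects.append((name, x, y, w * frac, h))
        x += w * frac
    return rects

if __name__ == "__main__":
    sizes = dir_sizes(".")
    # Print the ten largest directories under the current one.
    for path, size in sorted(sizes.items(), key=lambda kv: -kv[1])[:10]:
        print(f"{size:>14,d}  {path}")
```

A real treemap applies the layout recursively, alternating horizontal and vertical cuts at each directory level, and drilling down is just re-running the layout with a subdirectory as the new root.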
A quick Google search revealed SequoiaView:
http://www.win.tue.nl/sequoiaview/
Unfortunately this only runs on Windows, but I'm sure there are similar Linux programs available.
Dr. Demento On The 'Net!
Just take all the data on your disks, and do a low-level scan. Basically, take each byte and perform a parity check so you get a 1 if there's an odd number of 1's in the binary representation, and a zero otherwise.
Now, here's the secret: take all these zeros and ones, and do a parity check on THEM. BLAM! Your entire array is now down to ONE status bit!!!
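For anyone who wants to try this at home, the two parity passes above can be sketched in a few lines of Python (rather than VB). The function names are invented for illustration, and `data` stands in for whatever bytes you read off the array. Keep in mind that a single parity bit only flips when an odd number of bits change, so any change that flips an even number of bits sails through unnoticed.

```python
from functools import reduce

def byte_parity(b):
    """Return 1 if the byte has an odd number of 1 bits, else 0."""
    return bin(b).count("1") & 1

def status_bit(data):
    """Parity of the per-byte parities: XOR them all into one bit."""
    return reduce(lambda acc, b: acc ^ byte_parity(b), data, 0)

if __name__ == "__main__":
    before = b"\x00\x00"
    after = b"\x01\x02"  # two bits flipped relative to 'before'
    # Both yield the same status bit, so this change goes undetected.
    print(status_bit(before), status_bit(after))
```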
Now take a big crayon and write that status bit on a piece of your favorite color paper. Put it up in the machine room for all to see. Or just slip it in your drawer if you think that letting this kind of information out is a security leak. Your call.
Then, repeat the process once an hour or so. Today's arrays are so fast that it shouldn't take long. Each time you get the digit, the zero or the one, compare it to the last output. If it's changed (for example, from 1 to 0 or 0 to 1), then WHAMO, you've got SOMETHING going on, better check it out!!!
This "early warning system" has given me a "heads up" on some serious problems more than once. You might want to check it out. So-called "storage experts" EMC didn't even have a package to do this, so you might have to do a little coding in VB, but it's worth it!