Monitoring What Files Your Applications Leave Behind?
GoRK asks: "I have to install a commercial application on one of my servers. The application refuses to locate itself anywhere other than under the /usr tree, and I am concerned with it munging with my configuration under /etc as it automatically configures its daemons and whatnot. I am also a bit concerned with the method it goes about to verify its license on install. Is there any way I can run the installer in some sort of wrapper that allows me to monitor what files and network sockets get read/written to during the change so that I can monitor what data on my machine is getting moved around and also build a catalogue of every last little bit of the app in case I ever have to remove it?" In my opinion, a sensible software installer should have some form of user-accessible package manifest included. Why should consumers trust third-party software to "do the right thing" in the right locations, especially when installing software you don't have the source to?
"I am a big fan of MD5 sums and package management as a very reliable integrity check and I value the fact that every file on my system save user documents, etc. belongs to a package and I can verify its authenticity. I need to make sure that I know about each and every system file that gets modified during the installation. It would also be nice to see and control if it accesses anything under /dev/ just out of curiosity.
I've never been a diehard security freak before, but I just feel like it's 'time to do things right' so to speak. Is there a tool that will assist me here?"
Run the application in a chroot'd sandbox.
Or, (protopkg trick of the week, kids), write a prototype that just has "sleep 10" in the compile() function. When protopkg goes to sleep, hit ctrl-z to stop it, and do whatever you want to manually. Then when you're done, give that shell an 'fg' to let protopkg finish its work.
Idunno about monitoring the network sockets... that's kinda weird.
do the install
bash# locate
bash# diff
Or, if you wanted to get fancier and check for changed files:
for i in `locate
do
echo $i:`sum $i` >>
done
install
for i in `locate
do
if grep -v $i:`sum $i`
echo $i >>
fi
done
###
Shell scripting is your friend. Learn it well.
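A minimal sketch of the before/after checksum idea, assuming find and POSIX cksum stand in for locate and sum (find sees files that are not yet in the locate database). The function names snapshot and changed_files and the /tmp file names are made up for illustration, and paths containing newlines are not handled.

```shell
#!/bin/sh
# snapshot DIR OUTFILE: record "<crc> <bytes> <path>" for every regular file under DIR.
# Hypothetical helper. Keep OUTFILE outside DIR so it does not list itself.
snapshot() {
    find "$1" -xdev -type f -exec cksum {} + 2>/dev/null | LC_ALL=C sort > "$2"
}

# changed_files BEFORE AFTER: print paths that are new or whose checksum/size changed.
# Lines found only in AFTER are files that were added or modified.
changed_files() {
    LC_ALL=C comm -13 "$1" "$2" | cut -d ' ' -f 3-
}

# Usage (as root, around the install):
#   snapshot /etc /tmp/etc.before
#   ... run the installer ...
#   snapshot /etc /tmp/etc.after
#   changed_files /tmp/etc.before /tmp/etc.after
```

Swapping comm -13 for comm -23 would list entries that exist only in the first snapshot, i.e. files that were deleted or changed.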
--
-- Slashdot sucks.
Linux strace doesn't trace fork(2)/clone(2) by default. You have to use -f for that. Read that man page, and practice by tracing a simple shell script to see what it does and what you can make of it. Use -s to show longer strings if you want. (The default is to show ... after 32 character strings, except for file names, which are always shown in full.)
I recommend you strace all system calls from the installer (i.e. don't use -e), and filter later with grep | less, so you make sure you don't miss any interesting data.
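Roughly, that workflow could look like the sketch below: trace everything into a file with -f, -s and -o, then filter the saved log afterwards. The function names, the log path and ./install.sh are hypothetical, and the list of syscalls in the grep pattern is just my guess at a useful starting set, not an exhaustive one.

```shell
#!/bin/sh
# trace_install LOG CMD...: run CMD under strace, following child processes (-f),
# printing strings up to 256 bytes (-s 256), and writing the full trace to LOG (-o).
trace_install() {
    trace_log=$1; shift
    strace -f -s 256 -o "$trace_log" "$@"
}

# file_and_net_calls LOG: cut a saved trace down to file and socket activity.
# With -f and -o each line starts with a PID, so the syscall name may follow whitespace.
file_and_net_calls() {
    grep -E '(^|[[:space:]])(open|openat|creat|unlink|rename|mkdir|socket|connect|bind)\(' "$1"
}

# Usage:
#   trace_install /tmp/install.trace ./install.sh
#   file_and_net_calls /tmp/install.trace | less
```

Because the whole trace is kept on disk, you can go back and grep for other calls later without re-running the installer.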
#define X(x,y) x##y
Peter Cordes ; e-mail: X(peter@cordes ,
1) Yes, there is closed source software. (Somewhere on other people's computers.)
2) Yes, Open-Source Software has bugs just like any other software. But having the source enables you to fix it.
3) Yes, you have to trust many people, if you use any kind of software. But it's easier to trust software engineers, who don't hide their source.
--
Raphael Wegmann
wegmann@psi.co.at
True, but then on Unix all the configuration's on the filesystem anyway. Usually all you want in the end's a list of what changes the program made. Detailed step-by-step traces are useful only rarely, in debugging arcane problems. If I need more detail than Tripwire gives by default, I usually just diff the new files against a CD-RW backup from just before the install ( yes, I'm spoiled by being able to quickly back up my configuration that way ).
Yup. Minimum for any shop that considers itself professional:
Production (it works)
Test (we think it works)
Development (we're trying to make it work)
Training (the users are trying to figure out how it works)
Our test platforms sometimes double as training platforms; nothing reveals incorrect assumptions quicker than a clueless user banging at an app.
"What do you mean I shouldn't close while it says 'Committing transaction...'? If I shouldn't do it, why does it let me?"
You can also type:
# find /etc -mmin -10
to see all the files that were changed in /etc in the last ten minutes.
For those kind of apps, it's faster to build a chroot (Debian chroots are really simple to make - unpack the base2_2.tgz to a dir somewhere, cd to it, and do chroot . bin/bash), make a copy of it, then do rsync --dry-run --verbose or diff -u --recursive on the two dirs to find out what changed.
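A sketch of the comparison step, assuming you kept a pristine copy of the chroot next to the one the installer ran in. The function name tree_changes and the /srv paths are hypothetical. -q is accepted by GNU and BSD diff but is not in POSIX.

```shell
#!/bin/sh
# tree_changes PRISTINE MODIFIED: list files that differ or exist in only one tree.
# -r recurses; -q reports only which files differ. Drop -q and add -u for full diffs.
# diff exits with status 1 when it finds differences, which is expected here.
tree_changes() {
    diff -rq "$1" "$2"
}

# Usage:
#   cp -a /srv/chroot /srv/chroot.orig      # before installing
#   chroot /srv/chroot /bin/bash            # run the installer in here
#   tree_changes /srv/chroot.orig /srv/chroot
```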
Tripwire is designed as a security tool to tell you what files have been added, deleted and changed on your system but it sounds like it would easily do the job you're looking for.
You just run it once to generate a database of files on your system and again after installation to see what has changed. Easy!
Where does the original poster say it's a Linux system?
Just as a quick side-note to the MS side of things... well actually Novell side.
Novell has this funky "new" technology which basically goes like this:
1) You Re-Image the box with your standard Base Image (ie. only OS + Drivers)
2) You start the box (WinNT 4 sp5 for me) and login with a fresh account (ie. new profile)
3) Run Novell's SNAPSHOT software (takes picture of hard drive - all files, reg, ini, etc)
4) Install your app and configure.
5) Run SNAPSHOT again (Which now takes another picture of your hard drive & configuration, then sits and compares them, throwing the differences to a file, and copying all changed files to a location you specified)
6) Go for a drink (can take a while if it's a big app)
7) Import the differences file SNAPSHOT created into NDS, then review/change/mangle the application however you see fit.
At the end of the day you've just created an "Application Object Template" which you can then import into Novell's NDS (where MS got its Active Directory idea :) ). Once you've tweaked the AOT (step 7 - and I do advise from experience that you at least review what changes the app makes - you'd be surprised how often things get "changed" that have *nothing* to do with what you want... IE setting changes comes to mind as an example), you can now push that application down to ANY NUMBER OF COMPUTERS.
This product is called ZENworks, and it really is great - note, I didn't say it doesn't have problems and glitches. It comes in a few different flavors: Server, Desktop (and 1 other?)
It's a really great idea/concept - lots of things currently available singly, now nicely packaged into 1 thing. Unfortunately I haven't seen the equivalent in Linux/*nix yet ('course that *might* have something to do with *nix not yet having a nice Directory structure like NDS)
I used to have a cool sig.
You are thinking of Reflections on Trusting Trust, and it wasn't a compromise of gcc, it was just a "what if" situation that Ken Thompson talked about once.
What makes you think that your clients can't change the microcode anyway, just because you don't give them source code?
------
No, the InstallShield way of doing things is a POS that is the reason why Windows programs can trash DLL versions and such.
Most of the modern unices have a central package management system that I would never trade for any per-package installation program. In fact, anyone could distribute a deb package, and all you'd have to do is run dpkg --force-depends -i commercialapp.deb && apt-get -f -y install, and the package and all its dependencies would get installed.
------
Does your Unix support DMAPI? Perhaps you could get a look at what it's doing using that. Maybe a plugin for ReiserFS, if you're using that, to get a look at what the changes are that it makes.
- - - - -
Napster-to-go says "Fill and refill your compatible MP3 player", which is a lie. It's not MP3. It's WMA with DRM.
InstallShield offers a Linux version now...has for a while. Too bad the Linux companies won't use it. It is Java based, and it is pretty damned cool. And, of course, you can view files and their install paths with the setup list. Would probably help in cases like this...
http://www.installshield.com/iemp/specs/default.asp
strace, which uses ptrace, can trace every system call made by an application.
Actually, I frequently use it when debugging. Program doesn't start up correctly, I run it under strace to see if there's a file it can't find.
It'll show you every subprocess, every kernel call, every file access, network access, etc.
'strace netscape'
or
'strace -eopen netscape'
I have to install a commercial application on one of my servers. The application refuses to locate itself anywhere other than under the /usr tree,
You didn't say whether the program was Open Source or proprietary, just that it was commercial. However, I'll assume that like most slashdotters you've never looked up the dictionary definition of commercial, or the Free Software Foundation's confusing words list, and mean `closed source' instead. In reality a program's status as Free Software or Open Source or otherwise has no bearing on whether it is commercial or not.
Either way, whether it's pay per license proprietary software or Open / Free software that is produced for commercial reasons (meaning a support contract is available), complain to your vendor.
The FHS specifies RPM 3.05 as the (current) standard for installing software on Linux systems. Nearly everyone who provides software on Linux (Open or proprietary) provides packages in this form, and the overwhelming majority of users use them. Get your money's worth (either from licensing if it's closed source or support agreements otherwise) and tell the vendor you want packages.
Where to put the app is another argument. I have no problem with a packaged app that wants to live in /usr. Being `part of the OS' is fairly hard to determine on a platform without a standard distribution, and such arbitrary information is useless to base a filesystem standard on.
and then you can restore whatever got wiped out from backups.
You are using tripwire, right? And keeping good backups?
*grin*
locate only finds files that are already in its database. You'll need to use find instead for this to work.
reduce(lambda x,y:x+y,map(lambda x:chr(ord(x)^42),tuple('zS^BED\nX_FOY\x0b')))
$ touch /tmp/instdate.<package-name>
$ <run the appropriate install procedure>
$ find <tree where it installed itself, or if unknown, /> -newer /tmp/instdate.<package-name> -print
:)
Works for me...
--
Yomigaeru Aiyan Geek!!!
Well, perhaps tripwire would be an option.
It's capable of detecting new files, changes to existing files and files that were deleted.
If there is hope, it lies in the trolls.
Yes, the Windows crowd are like children or savages, easily dazzled by shiny objects (shrink-wrapped software) and perpetually dependent on parents/witch doctors (boxware vendors/MS) for the talismans to ward off a scary, complex world.
But there are drawbacks to being an adult - we are burdened with the knowledge that the world is held together with duct tape and dried dung. For us there are no happy surprises in shiny packages, only the unerring certainty that software sucks and will continue to suck.
With regard to shells, I agree - it's utterly amazing that Microsoft hasn't managed to "innovate" a decent shell yet. But I have used Bash on Windows, and it wasn't pretty. Windows is sluggish in many ways that come annoyingly to the fore in shell interaction.
This is more-or-less my method. However a
couple comments.
I make a directory ( say, /root/changelog )
and place all such package installation
find-listings in that directory.
The only other change is that most times,
the installed files retain their original
dates. To overcome this, I recommend using
the find-flag -cnewer ( changed newer )
sequence ( assume 'some-new-package' is the
new software package )
touch /root/changelog/foo
# install your software now
find / -cnewer /root/changelog/foo -print \
|tee /root/changelog/some-new-package-file.lst
The file /root/changelog/some-new-package-file.lst
will contain a list of all changed files on the
system.
hth
-- kjh
p.s. This is also useful for figuring out what
files a GooEY admin menu affects.
Do the touch ; Do the GooEY ; do the find-command
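The touch-then-find sequence can be wrapped up as a small helper. This is only a sketch: the function name changed_since is made up, and -cnewer is a GNU/BSD find extension, so with a strictly POSIX find you would fall back to -newer (which compares modification times and misses files whose dates were preserved).

```shell
#!/bin/sh
# changed_since MARKER ROOT: list regular files under ROOT whose inode change
# time (ctime) is later than MARKER's modification time. ctime is updated even
# when an installer restores the original mtime on the files it unpacks.
changed_since() {
    find "$2" -xdev -type f -cnewer "$1" -print 2>/dev/null
}

# Usage:
#   touch /root/changelog/foo
#   ... install your software now ...
#   changed_since /root/changelog/foo / | tee /root/changelog/some-new-package-file.lst
```

-xdev keeps find on one filesystem, so a package that writes to a separately mounted /usr or /var needs an extra run per mount point.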
I want large tracts of the directory structure to be modified only by packages installing their standard files. To completely restore these areas, all you need know is what packages (and which versions) were installed.
The ideal is something like this: I backup /etc, /home and /usr/local only. If my hard drive goes to Silicon Heaven, I buy a new one, restore onto it, and then run a program that looks at the list (saved in /etc) of packages I had installed and helps me reinstall them all.
Quattuor res in hoc mundo sanctae sunt: libri, liberi, libertas et liberalitas.
Many software installers will leave an installer log when they are done, but by then it can be too late. It seems rare that a software installer actually tells you what it is going to do before it does it. It is things like this that lead to unneeded tedium for the end user, such as backing up your configs before an install, 'just in case.' I just don't think we should be forced to go through this hassle, and I will make sure to clue the user in during/before the install in any software I write in the future. It had previously never occurred to me to do this, and I think it is the same way with most developers, be they commercial, open source, shareware, or whatever..
Time for some tasty Shiner Bock!
Certainly, when something like this is available, production-ready, and fast enough, I would choose to use it on all my boxes...
Quidquid latine dictum sit, altum videtur.
Why not just "find / > beforeinstall.txt" and "find / > afterinstall.txt" and use "diff" to compare them afterwards?
You can use diff to check files in /etc or wherever if you have backups as well.....
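One wrinkle with diffing raw find output is that find does not promise the same ordering from run to run, so it helps to sort the listings first. A sketch along those lines, with hypothetical function names, that reports added and removed paths separately:

```shell
#!/bin/sh
# list_tree ROOT OUT: write a sorted listing of every path under ROOT to OUT.
# Keep OUT outside ROOT, or the listing files show up as "new" themselves.
list_tree() {
    find "$1" -xdev 2>/dev/null | LC_ALL=C sort > "$2"
}

# added_paths BEFORE AFTER: paths present only in the AFTER listing.
added_paths() { LC_ALL=C comm -13 "$1" "$2"; }

# removed_paths BEFORE AFTER: paths present only in the BEFORE listing.
removed_paths() { LC_ALL=C comm -23 "$1" "$2"; }

# Usage:
#   list_tree / /tmp/beforeinstall.txt
#   ... install ...
#   list_tree / /tmp/afterinstall.txt
#   added_paths /tmp/beforeinstall.txt /tmp/afterinstall.txt
```

This only catches files that appear or disappear; files modified in place need checksums or timestamps.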
Some people (notably the best UNIX admins) take things like this VERY seriously, for a good reason. After putting 600 man-hours into configuring a machine from the ground up, I know I wouldn't just go install something when I don't know what it will do to my system. Sure I have backups, but it should never come to that. In order to effectively run a production server, you have to know how _everything_ installed on it works, and interacts with everything else. If you do not know, you waste lots of time with trial and error, and any modifications you make may cause other applications to misbehave, because you simply do not know how things interact. I agree 100% with the poster's concern.
Come on now. That's just FUD. All warranties have certain things that will void them. For example, if you buy a new radio and it breaks you may be able to replace it, but if it broke because your 5 year old smashes it with a sledge hammer then you can't replace it because the warranty doesn't cover that.
You simply say something in the warranty like "If you use drivers for this device that were not written by either <your company name> or a company or person who <your company name> has certified, then the warranty on this device will be void." And presto! You are not forced to pay to replace any hardware that someone uses a 3rd party driver on.
Why open source the drivers then? So your consumers have more confidence in the quality, and they can fix bugs themselves and give them back to you.
I realize that this contract prevents users from applying bug fixes that they wrote if they want the warranty, but I think it's a fair compromise. You give the bug fix back to the vendor and they can review it and roll it out in the next version of their driver. Probably won't be as fast as you'd like, but hey, the bug could have not been discovered and fixed at all.
--
Garett
You should listen to this. This is some of the best advice you're likely to get short of wasting your money on a "consultant." Where I work, we have three stages to production. In addition to the test box, we have a more tightly controlled "production test" environment and then changes get promoted to production.
On the other hand, if you're a smaller company with limited resources and you happen to be running Linux, you may want to give serious consideration to User-mode Linux. From the site:
"User-Mode Linux gives you a virtual machine that may have more hardware and software virtual resources than your actual, physical computer."
I've played with it a bit and it gives you a complete (and completely sealed off) environment. It creates the entire environment within a file. So you could create the environment you want and then simply make duplicates using cp.
It does require quite a bit of disk space since each VM is a complete system. So, if you want a virtual system with a 2GB filesystem you'll need +2GB of disk space, but heh, disk space is one of the cheaper components, certainly cheaper than a new system.
Also keep in mind that it does "split up" your real system resources so you'll want to make sure you have plenty of RAM if you do any real work with it.
Anyway, check it out and see if it will help. Either way, you really do need to separate test and new "stuff" from production.
So you're saying that commercial applications simply cannot use a sensible installation scheme, and must instead resort to hiding everything from the system administrator? So open-source really is superior, apparently.
Oh please. You don't need to know how everything works. Microsoft made Windows easy to use and reliable, so all you have to do is use the install wizard and everything works great. If you're really curious about the inner workings, though, part of the way they achieved this unparalleled reliability and ease-of-use is through architectural decisions.
1. The Registry. This piece of utter genius is crucial in an easily-maintained system. Instead of having a bunch of separate configuration files to keep track of, you just have one big binary file which all the install wizards access and change. Sure, sometimes one application screws it up for everything else, but that's no problem. Just re-install everything, no big deal. In this day and age, it's simply ridiculous to consider going back to ASCII text configuration files; it's just too difficult for certified system administrators to read these and edit them. Besides, we can always pass the blame to one of the third-party software vendors. I know that every time I have a computer problem at work and I need to meet a deadline, I can just tell my boss how some non-Microsoft program screwed up my whole system and he'll say "no problem, take as long as you need. As long as we're using reliable MS software, we don't need to worry about deadlines."
2. A central location for DLL's. By keeping all the DLL's in one or two directories, and allowing all applications to modify or overwrite them, we achieve breathtaking gains in system efficiency and reliability. Sure, sometimes you'll have minor problems with incompatible versions of the same DLL, but we can just blame that on the non-MS vendor and reinstall everything.
Honestly, you talk like your time is precious or something. What's the big deal if you spend days setting up a system and some big-$$$ application screws it up? You just call their tech support (for $300+ per call) and tell them about your problem, then spend the next week or two redoing everything. And if it messes up that time, do it again. Using the intuitive and attractive Microsoft Windows user interface is such a pleasure, those hours (days, weeks...) will pass in no time. Plus, you can have Clippy (tm) help you! What other software company has such a fantastic, useful innovation?
what about doing the whole thing in a chroot environment and then comparing it with the "original" tree?
Just my 0.02 euros..
J.
Tongue-tied and twisted, just an earthbound misfit, I.
Checkinstall is a script that uses installwatch and rpm to build rpms from a source install. Do a search on freshmeat.net for checkinstall, the author also maintains installwatch. You only need to run installwatch as a wrapper to the install program and it will give you a listing of the files that it creates.
Bah!
We NT folk have FileMon, RegMon, and Sysdiff/SMS Packager at our disposal.
It's not often there exists something in the NT world that doesn't have a parallel in *nix land, but this is one of 'em
---
nuclear presidential echelon assassination encryption virulent strain
Whizzmo
And then install it. Make a second /usr for the install. Or try using a small shell script that will traverse the file system and tell you all the recently modified files. Or just use a virtual machine.
The Lottery:
"Not my manner of thinking but the manner of thinking of others has been the source of my unhappiness." - M
I agree that it's probably a good idea to have a look at what it does (by running in some sort of sandbox), but here's my take on the whole situation in general...
First off, you shouldn't be installing anything untested onto a production server. What your company should have is a box identical in configuration to your production box (or at least a development server. You DO have those, right?).
Install the package on this server first. See what it touches. See if other packages misbehave after this is installed. Above all else, do not touch a production box period unless you've already seen what this program does.
If your staging box blows up, so what? It's happened to me. Nobody really relies on it, and that's what the box is for anyway. No big deal. Document everything that happens when you install on this machine. Since it's exactly the same as your production box, if everything works, then all you have to do is follow your documentation to install onto your production server, and all will work fine.
Trying to install stuff (even in sandboxes or wrappers) onto a server without testing it in a closed environment or staging area first is asking for trouble.
You can accomplish anything you set your mind to. The impossible just takes a little longer.
While not wishing to fan these flames further, there are some (IMHO) good utilities from sysinternals which allow you to do this for free.
filemon will report file update/access and regmon will report registry update/access.
I haven't tried your examples, but I have found that even if the output from these is rather verbose, given some judicious regular expressions the output can be cut down to manageable size.
<-- You are here.
I used to keep a mock-up of a minimal system (or you can use a disposable separate computer) in a sub-directory and I would do test installs to that sub-directory using a chroot(ed) shell. This gave the opportunity to deconstruct things nicely. It was only used after other suspicious nonsense happened with a package or source.
If this is an RPM on a linux box use "--root somedir" to prefix things like /usr with "somedir". Other install methods like SVR4 "pkgadd" have similar relocate facilities.
If it is just a tar/cpio archive and an associated script to install, read the scripts.
In short, there is no one tool to do this stuff, but it *is* doable by a number of means (on a *nix box) where it is otherwise impossible to do it on a windoze install-wizzard bundle.
Hope this helps some.
Rob.
PS I have also had some luck by looking at the errors generated after trying to run the install as a normal, non-privileged user.
--
Innocent people shouldn't be forced to pay for inferior software development.
--"Code Complete" Microsoft Press