Recovering From apt-get Failures?
"I ran 'apt-get update; apt-get upgrade' last night to upgrade one of our machines. Nothing seemed out of the ordinary; I received no error messages indicating that something terrible had happened. But now when I try to use 'su', it returns a libpam error that I've been unable to find any reference to, either on the web or in the Debian mailing list archive.
So this poses the question: what do I do if apt-get fails and screws up the system? I've tried reinstalling/reconfiguring the affected packages to no avail. If we used my distro of choice (Slackware), I'd have an intimate understanding of my system and would know right where to look when I get an error. But with apt, most of the packages on the machine are black boxes; I don't know much about them outside their package name and function. What would cause apt-get on a debian-stable machine to actually cause problems? And what can I do to back out of its changes?"
Thus, I use RPM exclusively.
You can argue and quibble about functionality and neither installer will win decisively. However, RPM doesn't ruin your system. Ever. I don't understand why so many users tolerate an installer that's the equivalent of playing Russian Roulette.
It's insane, if you ask me.
So I sez to him, I ain't givin' you no damn three-fity.
You're running unstable. Running unstable is a really stupid thing to do if a problem this simple and tiny presents any kind of a barrier to you.
I've yet to hear of anything like this happening to anyone running the stable, i.e. Joe Average release.
The more helpful response would be - subscribe to debian-user, as any problem in unstable is discussed there straight off, or join #debian on irc.debian.org and ask "apt" about the broken package. Chances are it'll have the answer if it isn't in the channel topic.
Have you tried getting older packages and installing them over the newer ones with dpkg? Did you look at /usr/doc/<package>/changelog.* and README*? Did you try extracting the pre- and post-install scripts from the .debs (which are really ar archives containing data.tar.gz and control.tar.gz, the latter containing all the packaging stuff)?
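For anyone who wants to try the .deb inspection suggested above, here is a minimal sketch in shell. It assumes the control member is named control.tar.gz, as described in the comment (a different compression format would need a different tar invocation). The function name unpack_deb_control is made up for illustration.

```shell
# Unpack the control half of a .deb so the maintainer scripts can be read.
# Usage: unpack_deb_control /path/to/package.deb /path/to/scratch/dir
unpack_deb_control() {
    deb=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # Make the archive path absolute, since we change directory below.
    case $deb in
        /*) ;;
        *) deb=$PWD/$deb ;;
    esac
    (
        cd "$dest" || exit 1
        # A .deb is an ar archive; pull out only the control member.
        ar x "$deb" control.tar.gz || exit 1
        # This leaves control, preinst, postinst, etc. (whichever exist) in $dest.
        tar xzf control.tar.gz
    )
}
```

After running it on a cached package, the preinst and postinst scripts in the scratch directory are plain shell scripts that can be read directly. An older .deb can then be installed over the newer one with dpkg -i, as the comment suggests.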
As for myself, I don't usually use Apt directly, but the Apt method in Dselect. People say it's counter-intuitive, which it is, and difficult to use, which I don't think it is - once you get the hang of <space>, <enter>, +, -, _, =, R, Q it gets quite easy. Before upgrading a system I look at which packages are going to change; if they are critical, I try to choose a time when I know I am physically close to the machine and can undo the upgrade one way or the other.
I disagree. The packaging has a black-box look (and even then you can easily open .debs with standard tools on any UNIX system, unlike RPMs) but what matters is the files they contain. PAM configuration is still in /etc/pam.*. Use dpkg -L to see what files are there.
Furthermore, everybody messes up sometimes. Debian developers may well have broken something despite all the precautions they take before uploading a new package to Debian-stable. And there is something strange currently going on with FreeBSD 4.2-STABLE according to someone on the mailing-list.
And no, I myself have seldom if ever had problems with Apt on Debian-stable.
I'm sorry if some of the following sounds harsh but you are going about things in dangerously foolhardy ways.
So let me get this straight? You already have one bad experience of an apt-get and yet you are doing the same thing all over again? No analysis or review of what went wrong the first time and what you should be doing differently? Is it any wonder the same problem crops up again?
I haven't had this problem, but looking at /etc/pam.d/su to see if anything has changed would seem to be a good start. To answer the larger question, on any kind of production machine, do apt-get -s upgrade before actually upgrading. The -s switch simulates the upgrade and tells you exactly which packages are going to be installed/removed/whatever. For each affected package look at the changelog to see what's new. Every time a package is uploaded, the changes are sent to the debian-changes (stable) or debian-devel-changes (unstable) mailing lists. Or you can look at the packages' page on www.debian.org.
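A rough sketch of how the simulation output could be boiled down to a list of packages to review. It assumes the simulation prints one "Inst <package> ..." line per package to be installed or upgraded. The function names and the default watch-list pattern are invented for illustration, not anything apt ships.

```shell
# Read `apt-get -s upgrade` output on stdin and print the package names
# that the simulation says would be installed or upgraded.
planned_installs() {
    awk '$1 == "Inst" { print $2 }'
}

# Same, but keep only packages matching a watch-list regex.
# The default pattern is a hypothetical list of "critical" packages.
critical_installs() {
    watch=$1
    [ -n "$watch" ] || watch='^(libc6|libpam|login|passwd)'
    planned_installs | grep -E "$watch"
}

# Typical use (needs a Debian box, so it is left commented out here):
#   apt-get -s upgrade | planned_installs
#   apt-get -s upgrade | critical_installs
```

If critical_installs prints anything, that is the cue to read those changelogs and pick a time when you are near the machine.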
This won't help you if the packager has simply made a mistake (which can happen.) To cover that eventuality, keep the last known good version of each package handy somewhere. If you don't do apt-get clean, downloaded packages will remain in /var/cache/apt/archives. Or you can keep a central repository somewhere.
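To make the "last known good version" idea concrete, here is a small sketch that lists the cached .debs for one package. It assumes the usual <name>_<version>_<arch>.deb file naming in the cache directory; the function name cached_debs is hypothetical.

```shell
# List cached .deb files for a package.
# $1 = package name, $2 = cache directory (defaults to apt's archive cache).
cached_debs() {
    pkg=$1
    dir=${2:-/var/cache/apt/archives}
    for f in "$dir/${pkg}_"*.deb; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Rolling back is then a matter of picking the older file, e.g.:
#   dpkg -i /var/cache/apt/archives/<package>_<old-version>_<arch>.deb
```

The trailing underscore in the pattern keeps similarly named packages (say, libpam-modules when you asked for libpam0g) out of the listing.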
???? How are .debs any more of a black box than slackware packages? You can get the source of any one of them with apt-get source. The Debian Policy Manual explains exactly how the system is supposed to function and where things are.
Then find out! I'll admit Debian is underdocumented but certainly all the information in this message is well-known. Why are you administering Debian systems if you don't understand Debian? apt-get is magic in many ways but it will never be a replacement for human competence and common sense.
Another thing is that RPM doesn't inherently guarantee that the package is legitimate, whereas .deb does (in the form of pgp signatures).
You are misinformed. RPM does support PGP signatures. This is not a new feature of RPM. To verify my claim, please see the man page for "rpm(8)". Perhaps your complaint is that Redhat's automatic update tool, or some third party automatic update tool doesn't use the RPM PGP signature?
I would like to comment that only a small amount of research is required to compare current versions of the APT and RPM package systems and discover that they are remarkably similar and each has a few features that the other could benefit from.
Comparing apt-get to RPM is inappropriate. Comparing Redhat's, or a third party's, RPM-based autoupdate tool to apt-get would be more appropriate.
dude, i have had far worse things happen when working with unstable. granted, unstable is a risk, but there are some nice versions in there. i usually install my base system from stable and my daemons from unstable.
anyway, if you are tempted to use unstable, do your upgrades on a test box before trying it on your production server. that's a fast and easy way to try it out with a box you are sitting next to - the alternative is to screw a box which may be thousands of miles away (depending on where you colo).
another benefit is that your test/development box always has the same state as your production server. as it should.
When I said, "to compare current versions of the APT and RPM package systems", I actually meant DEB and not APT. I was referring to the package system, not the update tool. My apologies for any confusion.
Isn't this kind of situation why you should regularly back up ...? Granted, reinstalling from tape to recover from a config problem is frustrating, but you should have this option. You *do* back up, right ???
:wq
This is kinda silly.
ONLY apply security updates as fully automated. ONLY point /etc/apt/sources.list to the security site, stable. If you are going to automate updating of the machine in cron, be very very conservative. And it is also a good idea to capture the standard output and standard error, and email it to the sys admin. That way you should at least be alerted in a timely manner when the machine is hosed.
Then, manually, upgrade the machine with apt-get -u dist-upgrade while /etc/apt/sources.list points to the main debian server, and debug things immediately.
Apt-get is a great tool. So is a gun. Both can wreak havoc when used inappropriately. But remember, it is not the gun's fault. Rather, it is the jerk on the trigger.
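The "capture standard output and standard error and email it" advice could look something like the sketch below. The function names, the mail subject and the root recipient are all assumptions for illustration; it also assumes a working mail command on the box.

```shell
# Hypothetical reporting hook: mails whatever arrives on stdin to root.
# Adjust the recipient and subject to taste.
send_report() {
    mail -s "apt security update on $(hostname)" root
}

# Run a command, capture stdout and stderr together, hand the log to
# send_report, and return the command's own exit status.
run_and_report() {
    log=$(mktemp) || return 1
    status=0
    "$@" >"$log" 2>&1 || status=$?
    {
        echo "command: $*"
        echo "exit status: $status"
        cat "$log"
    } | send_report
    rm -f "$log"
    return "$status"
}

# Possible body for a cron-driven script, with sources.list pointing only
# at the security site (left commented out here):
#   run_and_report sh -c 'apt-get update && apt-get -y upgrade'
```

Because the report includes the exit status, a failed run shows up in the admin's mailbox the next morning rather than the next time somebody tries to log in.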
I run a production (24/7) server at work as well, which runs an app which we have been offering to our customers and are constantly improving (adding features, maintenance, customization etc).
Naturally, we have our down-time late at night or on weekends, when it's not as likely that there's anyone using it. Firstly, keep your data on a separate drive from your applications and system. Now, we make daily backups of all of our data.
As for your system, every time you do an upgrade or new install, make a disaster-recovery image of the apps/system drive or partition. Get a good piece of software (commercial if you must - even Ghost or DriveImage will do) to image the whole thing into a safe location. This can happen either to a network drive on another server, a tape, or another drive on the same system. In our case, we have a server dedicated to storing various images for our machines.
Just 3 weeks ago we had an upgrade destroy our system (render it unbootable). This happened at 2am while upgrading RAID device drivers. It took about 30 minutes to restore the system from our disaster recovery image, and then of course we analyzed what went wrong and completed the upgrade over the next hour, then tested, and created a new disaster recovery image with the updated configuration.
If you run a production system, you can't afford to not have a disaster recovery image that you could restore in a matter of minutes - naturally you can't afford to re-install everything. Also, keeping your data on another drive keeps it safe from whatever you're doing to the apps.
Oh yes - and DO test your disaster recovery procedures before you need it.
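If you don't have Ghost or DriveImage, a bare-bones version of the image/verify/restore cycle can be sketched with dd and gzip. This is only an illustration under some assumptions: the source partition is unmounted (or mounted read-only) while it is imaged, and the function names are invented here. It is not what the poster's shop actually runs.

```shell
# Write a gzip-compressed image of a partition (or any file) to $2.
make_image() {
    dd if="$1" bs=1024k 2>/dev/null | gzip -c > "$2"
}

# Succeed only if the image decompresses to the same bytes as the source.
verify_image() {
    [ "$(cksum < "$1")" = "$(gzip -dc "$2" | cksum)" ]
}

# Write the image back onto a partition (or file). This OVERWRITES $2.
restore_image() {
    gzip -dc "$1" | dd of="$2" bs=1024k 2>/dev/null
}

# Hypothetical use, with made-up device and path names:
#   make_image   /dev/hda1 /mnt/images/hda1.img.gz
#   verify_image /dev/hda1 /mnt/images/hda1.img.gz
```

Running verify_image right after make_image, and doing a trial restore_image onto a spare disk, is one way to test the procedure before you need it.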
You can accomplish anything you set your mind to. The impossible just takes a little longer.
If you run the unstable distribution, be prepared for problems every now and then.
Now..since you've hit a problem go to #debian on irc.openprojects.org. Every time I've hit a problem with an update to an unstable system the problem has been reported and usually a fix is in the channel topic.
And it should be repeated... If you trashed a web server doing an update, why were you updating a production system to new packages without testing them on another system? The ONLY sources in my sources.list on my production systems are for security updates, and I apply those one at a time and test them.
This is a bit basic, but I was messing about with stuff from the Red Hat Rawhide folder and screwed up my install - back into 'doze and restored the root partition from a Ghost image made a few weeks before (yes it was ext2). I stored all the images on the Samba server in the next room, so it wasn't a completely Windows solution ;-)
Ghost came with my CD re-writer, but it's cheap enough on its own and worthwhile if you're on a dual-boot system.
insignificant sig
I've gotten burned twice lately with *nix auto-updaters.
1. My parents have a ppc-based system that I installed Yellow Dog Linux on, and installed some network cards in, so it could function as their internet NAT router. I ran the "yup" program the other day, and it updated a lot of system utilities and was very helpful. However, it also upgraded to a different version of the linux kernel at the same time. Now, this wouldn't have been so bad -- but -- ** IT DELETED THE OLD KERNEL!! ** Not good, sir, not good. So along comes my dad to reboot the thing one day when he was moving it around, and when it came back up, since the boot loader hadn't been reconfigured, the old kernel tried to boot. Whoops! (Actually, the kernel itself wasn't deleted, but several other important pieces close to the kernel were, i.e. some modules that were dependencies of other modules, etc...) It sure was fun talking my dad through hours of trying to compile modules for the other kernel and getting it installed into the boot loader. (The old kernel was toast -- the "yup" program completely overwrote the old kernel source code.)
2. My system got hosed last night when I used "autoslack" to blindly upgrade my system. You'd think I would have learned my lesson -- well -- I partially did, I backed up all my config files first. (Good thing too.) Anyway, I didn't figure out exactly what autoslack hosed on my system -- it either upgraded a lot of critical system components to new ones that required a version of glibc that I didn't already have, or it upgraded my glibc, somehow leaving some critical system components confused about what the hell was going on. (Things like "ls" and "rm" were completely broken.) I ended up doing something like "cd / ; mkdir .hosed ; mv * .hosed ; mv .hosed hosed" and reinstalling a newer version of slack on the system. Ugh.