Recovering From apt-get Failures?

Posted by Cliff on Sunday March 4, 2001 @01:54AM from the blasted-package-management dept.

Ross Vandegrift asks: "Once upon a time long ago, apt-get totally trashed the company webserver while we were trying to upgrade it. Since then I have been very suspicious of apt-get. Recently, a friend from school talked me into trying it again. I slowly eased back into it and was keeping our machines up to date, and automatically applying security patches. It was cool, I was starting to trust it again. Well, today that changes..." I'm sure everyone who has ever used RPM or apt can understand the frustration one goes through when running into this problem. For those who have, what did you do to get your system functioning again?

"I ran 'apt-get update; apt-get upgrade' last night to upgrade one of our machines. Nothing seemed out of the ordinary; I received no error messages indicating something terrible happened. But now when I try to use 'su', it returns a libpam error that I've been unable to find reference to, either on the web or in the Debian mailing list archive.

So this poses the question: what do I do if apt-get fails and screws up the system? I've tried reinstalling/reconfiguring the affected packages to no avail. If we used my distro of choice (Slackware), I'd have an intimate understanding of my system and would know right where to look when I get an error. But with apt, most of the packages on the machine are black boxes; I don't know much about them outside their package name and function. What would cause apt-get on a debian-stable machine to actually cause problems? And what can I do to back out of its changes?"

The problem is between keyboard and chair by Jaldhar · 2001-03-03 23:14 · Score: 3

I'm sorry if some of the following sounds harsh, but you are going about things in dangerously foolhardy ways.

Ross Vandegrift asks: "Once upon a time long ago, apt-get totally trashed the company webserver while we were trying to upgrade it. Since then I have been very suspicious of apt-get. Recently, a friend from school talked me into trying it again. I slowly eased back into it and was keeping our machines up to date, and automatically applying security patches. It was cool, I was starting to trust it again. Well, today that changes..."

So let me get this straight: you already had one bad experience with apt-get, and yet you are doing the same thing all over again? No analysis or review of what went wrong the first time and what you should be doing differently? Is it any wonder the same problem crops up again?

I'm sure everyone who has ever used RPM or apt can understand the frustration one goes through when running into this problem. For those who have, what did you do to get your system functioning again?

"I ran 'apt-get update; apt-get upgrade' last night to upgrade one of our machines. Nothing seemed out of the ordinary; I received no error messages indicating something terrible happened. But now when I try to use 'su', it returns a libpam error that I've been unable to find reference to, either on the web or in the Debian mailing list archive.

I haven't had this problem, but it would seem that looking at /etc/pam.d/su to see if anything has changed might be a good start. To answer the larger question: on any kind of production machine, run apt-get -s upgrade before actually upgrading. The -s switch simulates the upgrade and tells you exactly which packages are going to be installed, removed, or otherwise changed. For each affected package, look at the changelog to see what's new. Every time a package is uploaded, the changes are sent to the debian-changes (stable) or debian-devel-changes (unstable) mailing lists. Or you can look at the package's page on www.debian.org.
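As a rough illustration of the dry-run check described above, the Python sketch below pulls the planned installs and removals out of the text that apt-get -s upgrade prints. It assumes the Inst/Remv/Conf line format used by current apt releases; the function name planned_changes and the package versions in the comments are made up for illustration.

```python
import re

# Assumed line shapes (versions here are illustrative only):
#   Inst libpam0g [0.72-9] (0.72-35 Debian:unstable)
#   Remv oldpkg [1.0-1]
#   Conf libpam0g (0.72-35 Debian:unstable)
ACTION_LINE = re.compile(
    r"^(Inst|Remv|Conf)\s+(\S+)"   # action keyword and package name
    r"(?:\s+\[([^\]]+)\])?"        # optional [currently installed version]
    r"(?:\s+\(([^\s)]+))?"         # optional (candidate version ...
)


def planned_changes(simulation_output):
    """Return (action, package, old_version, new_version) tuples.

    Only Inst and Remv lines are reported; Conf lines repeat packages
    already listed under Inst. Missing versions are returned as None.
    """
    changes = []
    for line in simulation_output.splitlines():
        match = ACTION_LINE.match(line.strip())
        if match and match.group(1) in ("Inst", "Remv"):
            changes.append(match.groups())
    return changes
```

The simulation text could be captured with subprocess.run(["apt-get", "-s", "upgrade"], capture_output=True, text=True).stdout and passed to planned_changes; each package name in the result is then a candidate for a changelog check before the real upgrade.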

This won't help you if the packager has simply made a mistake (which can happen). To cover that eventuality, keep the last known good version of each package handy somewhere. If you don't run apt-get clean, downloaded packages will remain in /var/cache/apt/archives. Or you can keep a central repository somewhere.
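To make the cache idea concrete, here is a small Python sketch that lists which versions of each package are still sitting in the apt cache. It assumes the usual name_version_arch.deb file naming, with an epoch colon stored as %3a; cached_versions is a hypothetical helper name, not part of apt.

```python
from collections import defaultdict
from pathlib import Path


def cached_versions(cache_dir="/var/cache/apt/archives"):
    """Map package name -> list of (version, path) for cached .deb files.

    Assumes file names of the form name_version_arch.deb. Entries are in
    file-name order, which is NOT Debian version order.
    """
    versions = defaultdict(list)
    for deb in sorted(Path(cache_dir).glob("*.deb")):
        parts = deb.stem.split("_")
        if len(parts) != 3:
            continue  # skip files that do not follow the assumed naming
        name, version, _arch = parts
        versions[name].append((version.replace("%3a", ":"), deb))
    return dict(versions)
```

A known good file found this way can be reinstalled with dpkg -i followed by its path.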

So this poses the question: what do I do if apt-get fails and screws up the system? I've tried reinstalling/reconfiguring the affected packages to no avail. If we used my distro of choice (Slackware), I'd have an intimate understanding of my system and would know right where to look when I get an error. But with apt, most of the packages on the machine are black boxes;

???? How are .debs any more of a black box than Slackware packages? You can get the source of any one of them with apt-get source. The Debian Policy Manual explains exactly how the system is supposed to function and where things are.

I don't know much about them outside their package name and function.

Then find out! I'll admit Debian is underdocumented, but certainly all the information in this message is well known. Why are you administering Debian systems if you don't understand Debian? apt-get is magic in many ways, but it will never be a replacement for human competence and common sense.

run the upgrade on a test box first by gempabumi · 2001-03-03 23:59 · Score: 3

dude, i have had far worse things happen when working with unstable. granted, unstable is a risk, but there are some nice versions in there. i usually install my base system from stable and my daemons from unstable.

anyway, if you are tempted to use unstable, do your upgrades on a test box before trying it on your production server. that's a fast and easy way to try it out with a box you are sitting next to - the alternative is to screw a box which may be thousands of miles away (depending on where you colo).

another benefit is that your test/development box always has the same state as your production server. as it should.

Re:He doesn't play with unstable. by mopsuestia · 2001-03-04 02:26 · Score: 4

That's what he said, BUT the problem he is describing was introduced within the last couple of days in unstable. There were at least 3 bug reports filed yesterday concerning this bug (see here, here, and here), and I imagine a fixed package will be available RSN. (Personally, I just reverted to the previous revision of the packages that were sitting in my apt cache.) I have had no problems with PAM in stable.

As others have said, running unstable is only for those who are ready for breakage and know enough to fix it. Use stable if the above doesn't apply, or testing if you absolutely must have newer packages.

Also, bear in mind that these are broken packages, so the blame doesn't really fall on the package manager. The packages may have been just as broken if they were distributed as tarballs. It is not the .deb's that are broken, per se; it is the files contained in those .deb's.

Finally, no one is stopping anyone from installing from tarballs on a Debian system (or any other Linux distribution). You don't get the benefits of the package manager, but if you don't trust apt/dpkg/rpm anyway, I doubt you will think that is much of a loss.

I don't trust *nix auto-updaters by blackwizard · 2001-03-04 18:29 · Score: 3

I've gotten burned twice lately with *nix auto-updaters.

1. My parents have a PPC-based system that I installed Yellow Dog Linux on, and installed some network cards in, so it could function as their internet NAT router. I ran the "yup" program the other day, and it updated a lot of system utilities and was very helpful. However, it also upgraded to a different version of the Linux kernel at the same time. Now, this wouldn't have been so bad -- but -- ** IT DELETED THE OLD KERNEL!! ** Not good, sir, not good. So along comes my dad to reboot the thing one day when he was moving it around, and when it came back up, since the boot loader hadn't been reconfigured, the old kernel tried to boot. Whoops! (Actually, the kernel itself wasn't deleted, but several other important pieces close to the kernel were, i.e. some modules that were dependencies of other modules, etc...) It sure was fun talking my dad through hours of trying to compile modules for the other kernel and getting it installed into the boot loader. (The old kernel was toast -- the "yup" program completely overwrote the old kernel source code.)

2. My system got hosed last night when I used "autoslack" to blindly upgrade my system. You'd think I would have learned my lesson -- well -- I partially did, I backed up all my config files first. (Good thing too.) Anyway, I didn't figure out exactly what autoslack hosed on my system -- it either upgraded a lot of critical system components to new ones that required a version of glibc that I didn't already have, or it upgraded my glibc, somehow leaving some critical system components confused about what the hell was going on. (Things like "ls" and "rm" were completely broken.) I ended up doing something like "cd / ; mkdir .hosed ; mv * .hosed ; mv .hosed hosed" and reinstalling a newer version of slack on the system. Ugh.
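Backing up configuration before an automated upgrade, as described above, can be scripted. The following Python sketch archives a configuration directory into a timestamped tarball; the function name snapshot_config and the default destination /var/backups are assumptions for illustration, and it needs read access to everything under the directory (typically root for /etc).

```python
import tarfile
import time
from pathlib import Path


def snapshot_config(config_dir="/etc", dest_dir="/var/backups"):
    """Write config_dir into dest_dir as <name>-<timestamp>.tar.gz.

    Returns the path of the archive that was created. Both default
    locations are assumptions; adjust them for the machine in question.
    """
    config_dir = Path(config_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = Path(dest_dir) / f"{config_dir.name}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname keeps paths inside the archive relative (e.g. etc/pam.d/su)
        tar.add(str(config_dir), arcname=config_dir.name)
    return archive
```

Running snapshot_config() right before an updater gives a copy of files such as /etc/pam.d/su to compare against if something breaks afterwards.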