Is RPM Doomed?
Ladislav Bodnar writes "This is an opinion piece offering solutions for all the ills of the RPM Package Manager. It has been written with Slashdot in mind - it is a fairly controversial topic and I would like to hear the experiences and views of other users who have tried different package formats and different Linux distributions. The conclusions are pretty straightforward - either the big RPM-based distributions get together and develop a common standard or we will migrate to distributions offering more sophisticated and trouble-free package management. Note: the main server allows a maximum of 100 simultaneous connections. To limit the /. effect, here are two other mirrors: mirror-us and mirror-hu (the second one has larger fonts). Thanks in advance for publishing the story."
I administer a few RedHat servers, mostly 6.2 and 7.2, each of which performs a different function. If an RPM is offered for a piece of software I need to install, I usually download that first.
If the rpm install fails, I will spend about 3 minutes troubleshooting the issue. If I can't get it to go, I download the source and compile from scratch. 9 times out of 10 this works without having to figure out dependencies.
RPM works great when the environment is exactly the same as the build environment. When it's not...well, it just plain sucks. Source almost always works without incident.
Really, there is nothing too difficult about:
./configure
make
su
make install
Although it only works for products where the source is openly available.
RedHat needs a compile from source package format that most people can figure out. srpms may do it, but I have no clue how to use them.
-Pete
Soccer Goal Plans
There's been quite a discussion on the installer issue in the Gentoo forums (the thread can be found here). The general consensus from the users seems to be that they like Gentoo being kind of a "niche" distro. If the idea of the source based distro really appeals to you, I would suggest giving it another go and leaning very heavily on the forums (if you need to). Gentoo's Forums have the most helpful and friendly user base I have ever seen on the internet. I have yet to see a single person give a n00b a hard time (outside of the occasional rtfm...). I realize that it's not for everyone and that it takes a little bit of work, but I think Gentoo is definitely worth it after the dust settles. It's nice to install an OS and feel like you actually accomplished something.
Oh yeah, and I don't like RPMs.
What we need is to get rid of the entire packaging system altogether. I know I'll probably get toasted for this. But software should install in linux the same way it installs in windows. There should be one file, like setup.exe. I should take that file, execute it, it will ask me what parts of the software I want, and where I want to put it, etc. From my experience there are two pieces of software for linux that do this, the Tribes 2 server, and Mozilla.
The entire packaging system is just a pain in the butt. This depends on that depends on this. urpmi, rpm -i, rpm -U, things not working with no explanation. In Windows I never have to worry about one thing relying on another thing, because just about everything uses DirectX. And DirectX COMES WITH anything that uses it. And it has a simple graphical installation.
There should be one downloadable file for each piece of software I want. It should install on its own, on any linux machine, easily and graphically. And all of my library packages, like glibc, etc., should transparently update themselves to the newest versions all the time. I don't want to have to worry about that stuff. Drivers in linux are incredibly difficult to install. They should become a simple right click, install driver. Done. I want all that other crap taken care of for me. I don't have time to change paths in config files, tinker with code, look up crazy commands and recompile crap.
I feel the package system is the real place in which linux fails. Most distros, let's use Mandrake as an example, have graphical easy installations. But when you get to the package selection phase you're stuck forever weeding through thousands and thousands of checkboxes. Not cool.
One piece of software should be one checkbox. KDE alone has like 20+ rpm files. There should be one file. KDE3setup.exe.
You know that installshield that almost every piece of windows software has? Maybe someone could code that for linux. I would, but I have no idea how to do something like that. But I know someone reading this does. And if you want to save your open source os, I suggest you do.
The GeekNights podcast is going strong. Listen!
RPM by itself isn't the real problem here. The author is complaining that installing applications in Linux is a pain in the ass, because the system often doesn't have all of the required libs installed.
I admit, RPM doesn't make this an easy problem to solve. Any normal Windows app would simply package the required libraries with it. Thus if the lib doesn't exist, it can install it. But RPM doesn't work that way. RPMs can only hold one logical unit. So one app, or one library, or one set of platform independent support files. RPM builders could include more, but doing so will likely break the RPM dependency tree.
The real problem in all of this is the distinction between applications and the system itself. Is grep part of the OS, or is it an addon app? How do you tell? Most would argue that grep is a part of the OS, but you can easily install linux without grep, so it must not be essential. But if packages expect it to be there, then it must be essential. But if it's not part of the OS, then they shouldn't have expected it to be there in the first place, so now it is their fault for not thinking ahead... This problem just goes in circles all day. The worst part about this is that my use of grep is just an example. This problem applies to literally all packages outside of the kernel itself. Don't believe me? How about init? Do you think that init is essential? I agree, but what version? Do you want a SysV init, or a BSD style init? Technically you can have either.
To solve this whole problem, we really need to take two steps. First we need to define a base Linux system. And I don't mean a completely solid, unwavering definition either. Standards that never evolve are quickly dubbed 'legacy'. The trick is to define a complete base install. Everything from the kernel, to the version of GCC (and no, RedHat, gcc 2.96 isn't going to cut it), to what version of X is installed, to what "expected unix utilities" are available, and what libraries are available. Feel free to change the standard, but each time you do so you must raise the bar somehow, either by making it more reliable, or faster, or adding features, or some combination of the above. There is only one last key item to making this system work. You must retain backwards binary compatibility for long periods of time. Feel free to completely break legacy systems, but make sure that you only do so after you've had at least 5 to 6 years of stability.
Then there is the second step. RPM is a nice system management system, but it is a shitty application packager. Mostly because of the dependency issues and the fact that each RPM package can only hold one logical unit. We really need an InstallShield-like system for applications (both gui and console installs in the same package). Feel free to keep track of what is installed, and what files belong to whom, but you really need to separate the system from the applications. Once you have a base defined, keeping the system and apps under the same packaging system no longer makes sense. The absolute need for it is removed.
I think the biggest thing we need with rpm (and other distro systems) is standardized package locations.
That's already done in the LSB.
.dll registration. On Windows, the only way the OS knows about a systemwide .dll is when you've added an entry to the registry for it. On Linux...run ldconfig, and it rebuilds the systemwide cache (ld.so.cache), which is significantly faster (contiguous, not incrementally modified, not modified by all sorts of other apps storing filename associations and the like) to read.
The problem is that each rpm is required to contain a static list of files it installs *with pathnames*. The nice thing about this is that it lets you run rpm -qlp foo.i386.rpm without executing any code (sandboxed or otherwise) to see the list of files. The stupid thing is that there then has to be a totally different rpm for every distro and every maintainer.
In addition, it means that the maintainers need to keep *two* lists of what files are in the package -- one list for "make install" and the other for rpm. This is probably the most annoying design decision of RPM I've seen. There needs to be a FILES file with a list of installed files with a gen-files script (that runs sandboxed to build FILES for not-yet-installed packages and is run at package installation time to generate FILES). Have the Makefiles read this for make install. This would make life easier for maintainers (one list of files to install), would make RPMs more reliable (no accidental adding of a file to the Makefile but not to the spec file), and would let an RPM work on any distro (if we ever get the gcc-2.7, gcc-2.96, gcc-3 stuff worked out).
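A minimal sketch of the single-list idea above, in Python. The manifest format (one "mode path" pair per line), the function names, and the sample paths are all assumptions made up for illustration; none of this is an existing rpm or make feature. It only shows one FILES list feeding both an install plan and a spec-style %files section, so the two cannot drift apart.

```python
import posixpath

# Hypothetical FILES manifest: "octal-mode relative-path" on each line.
SAMPLE_FILES = """\
0755 bin/foo
0644 share/man/man1/foo.1
0644 share/doc/foo/README
"""

def parse_manifest(text):
    """Return a list of (mode, relative_path) tuples from manifest text."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        mode, path = line.split(None, 1)
        entries.append((int(mode, 8), path))
    return entries

def install_plan(entries, prefix):
    """What a manifest-driven 'make install' would copy, and where to."""
    return [(path, posixpath.join(prefix, path), mode) for mode, path in entries]

def spec_files_section(entries, prefix):
    """The same list rendered as a spec-style %files section."""
    lines = ["%files"]
    for mode, path in entries:
        full_path = posixpath.join(prefix, path)
        lines.append("%attr({:04o},root,root) {}".format(mode, full_path))
    return "\n".join(lines)

entries = parse_manifest(SAMPLE_FILES)
print(spec_files_section(entries, "/usr"))
```

Because the prefix is a parameter rather than part of the list, the same manifest could be rendered for distros that put things in different places.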
even though the newer libraries could do the job of the older ones
This is true for minor version number increases, but for a major version number change, newer libraries cannot simply link to the program.
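To make the major/minor point concrete, here is a small Python sketch of the usual convention: a library can stand in for an older one only when the major number matches and the minor number is at least as new. The "major.minor" version format and the function names are assumptions for illustration, not how any particular linker or package manager stores this.

```python
def parse_version(version):
    """Split 'major.minor' into ints; a missing minor counts as 0."""
    parts = [int(p) for p in version.split(".")]
    major = parts[0]
    minor = parts[1] if len(parts) > 1 else 0
    return major, minor

def can_satisfy(installed, required):
    """True if the installed library version can serve a program built against 'required'."""
    inst_major, inst_minor = parse_version(installed)
    req_major, req_minor = parse_version(required)
    # A different major number signals an incompatible interface.
    return inst_major == req_major and inst_minor >= req_minor

print(can_satisfy("1.4", "1.2"))  # True: minor bump, same interface
print(can_satisfy("2.0", "1.2"))  # False: major bump
```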
Also, the registry is a fucking stupid idea. (despite the fact that GNOME and KDE are mindlessly cloning it). The registry causes more problems than anything else I've seen on a Windows system. The MacOS did things right -- let all your centralized databases just be caches for data that can be rebuilt from files around your system. If something gets borked or corrupted...that's okay. Absolutely do *not* make your single copy of data a registry -- put the masters around the system, and let the centralized db be rebuilt if necessary.
Also, registries require "installations" and "uninstallations" instead of just copying files. You can just copy appropriate files from one system to another and run code on a Linux or MacOS box. On a Windows box, you're in for running installers to poke at the registry. And finally, I've seen tons of broken Windows installers that poke at registry entries and end up completely screwing up data that some other app uses. For example, a friend once had Sonique and WinAmp installed, but couldn't associate mp3s with either. I took a look at the registry -- Microsoft's two-entry file association scheme let the extension entry point to a nonexistent application entry, IIRC. As a result, the mp3 entry didn't show up in the Folder Options dialog in Explorer, and couldn't be reassigned, and WinAmp and Sonique kept giving errors when trying to grab associations.
The day any distro starts requiring a registry is the day I never touch that distro again. Right now, I can just uninstall GNOME if I want to do so.
Oh, and another thing. The Windows registry is a *massive* shared database. As a result, tons of stuff modifies it and causes internal fragmentation and loss of physical continuity between related keys. Then all apps use the registry heavily (God, I hate apps that poll it), so you get slow app launch times, that annoying disk churning that you hear on Windows boxes...rrrgh.
Take a look at
The registry is basically a hack, because Windows *used* to have what MS considered a worse scheme (.ini files). It isn't a very well thought out system.
May we never see th
I started my Linux experience with SLS and a 0.99 kernel. Then I switched to Slackware, then flirted with Caldera. Then for a while I ran RedHat on my servers, before switching in about 1999 to Mandrake on all machines.
And then I decided to experiment with Debian on a test box, and fell in love. I now have it on my desktop, my laptop, and three out of my five servers.
Why?
The package manager. It just works. It just works reliably, installing all the right stuff, resolving all the dependencies. When there are conflicts (not often) it reports them and suggests remedies. In short, the Debian package manager is to all other UN*X package systems I've ever seen as a computer is to a tally-stick. No-one who has used dselect will ever go back to RPM.
I'm old enough to remember when discussions on Slashdot were well informed.
I don't know if you've noticed lately, but libraries _are_ packages today. GTK+ for example. Qt, ncurses, etc. And if a package creates a _new_ library, then not many people are going to depend on it. And if they _do_ depend on it, they might as well depend on the entire package being there--since the library is a _part_ of the package.
The idea of sharing arbitrary library code is a failed experiment. If I create MyProgram and then I create MyProgramLib.. not many people will ever use the library. The only case they _will_ use that library is if I _package_ it separately, and make it a coherent entity itself--with documentation. This is why, IMO, going package-only and dropping the various */lib directories can only be a Good Thing. And this is how Red Hat, etc. do it today. They create dependencies between _packages_. If I create an app in RPM format that needs, say libgimp, then my package will depend on the _entire_ gimp package being installed. Not just libgimp. Why not just handle packages naturally?
I'd also like to point out the benefits of doing this:
- Package corruption will be detected immediately. When something depends on a package and a file is missing or corrupt then the package can be determined corrupt.
- Dependencies handled naturally. When a program complains that a file doesn't exist, I can pinpoint _exactly_ which package the file is in and can simply reinstall the package. No need to hunt down which file belongs to which package.
Dijkstra Considered Dead
Say your package directory was /usr/app (or whatever, there are standards for these things, y'know) libpng would live in /usr/app/libpng, qt would live in /usr/app/qt. Things could still dynamically link them, and it would still Just Work. The only difference is that you don't have four hundred files all crammed in /usr/lib.
Almost. I think that he was really proposing that libpng version n.m would live in /usr/app/libpng/n.m, which is only a refinement, but which is much safer. In the case of large packages this would cost a lot of disk space (how many versions of KDE or Gnome do you want to keep on your computer?), but OTOH it would be a lot safer. You could keep multiple versions of even so pervasive a package as KDE or Gnome during development, and if one didn't work, you could revert to an earlier version. (Yes, something like this is done during development anyway, but that requires special fiddling, and changing the directories around when it finalizes, etc. This approach wouldn't.) And deleting an obsolete version would be nearly as easy as removing the directory (well, you *would* need to check for dependencies).
I guess that a side effect would be for the /usr/bin directory to become composed entirely(?) of links. Still, I've done that already when trying out a new version of Python, and it didn't seem to cause any problems. (I suppose that the other bin directories probably wouldn't be affected that way, especially /bin and /sbin, since they might be needed when other partitions weren't mounted.)
I think we've pushed this "anyone can grow up to be president" thing too far.
I've seen a lot of this "dependency hell" and it makes me really hate dependency on .so's: with a statically-linked build, it either works -- reliably -- or it doesn't work at all. I've heard all the .so justifications before, and from my point of view as a practicing fifty-year-old mathematician, computer scientist, and environmental modeler, it is all a lot of bunk when it comes up against the real practice of computing.
"My opinions are my own, and I've got *lots* of them!"
It isn't the packaging format really
Source Mage and Gentoo[1] are two excellent source based distros that avoid these classes of problems altogether, and unlike RPM (or debs[2]) add no burden to the upstream software developer.
Shawn Gordon of The Kompany touches on this when he says (from the article, you did read the article, right?)
Source based distros like Gentoo and Source Mage have packaging systems that automate the process of downloading, configuring, compiling, and installing all of the software on their systems from source (pedants will note there is the occasional binary package, e.g. NVidia drivers, but for the vast, vast majority of software my point holds). Indeed, this approach makes the packaging system itself less important (so long as it works properly) than the overall engineering and organization of the distro itself, and completely irrelevant to the software developer (as it should be).
This has a couple of disadvantages, and a whole bunch of real advantages. So much so that almost no one who has used a source based distro will go back to a binary based distro once they've tried it, despite the cons (in fact, of the numerous people I know who've tried Source Mage and Gentoo, both very different from one another BTW, I know of not a single person who has gone back to their old binary favorite, be it Suse, Mandrake, Red Hat, or Debian).
There are numerous other advantages I could add here, but you get the idea.
The entire article on the flaws of RPM might better be entitled "The Flaws of Binary Based Distributions" which, in the age of Free Software and source code availability, coupled with today's fast processors, really ought to become a thing of the past. In fact, it wouldn't surprise me at all to see Debian, Suse, Mandrake, and Red Hat all embracing the notion of source-based distros sometime in the future.
And the advantages in speed, stability, and ability to keep current with new software releases in a timely manner will only become more acute as time goes on.
So while binary based distros are by no means dead (despite my rather provocative headline), it is my opinion that the writing is certainly on the wall, and the observant person can already mark the shifting change in the wind.
[1]There are other source based distros as well, including Linux from Scratch and Lunar Penguin, and likely others as well.
[2]Though in fairness the Debian developers take up most if not all of that burden
The Future of Human Evolution: Autonomy
I run a system based loosely on Linux From Scratch, which adopts a link farm approach like you describe. My /usr/bin (and /usr/blah directories generally) indeed do have hundreds and hundreds of symbolic links. This probably impacts performance, but I've not noticed it on my K6-3/400 PC with old slow IDE disks. Using some simple Perl scripts to create, retarget and clean up symbolic link farms, package management is simple. The key benefit is that the metadata associating a file with its package is the symbolic link itself - it is logically incapable of becoming out of sync.
My work-around for the root file system is as follows. Each package I keep in /usr/pkg/packagename-version. Things destined for /usr/bin live in /usr/pkg/packagename-version/bin and so on. Things which need to end up in (say) /sbin live in /usr/pkg/packagename-version/root/sbin. I cp -a the contents of these root subdirectories into /.
This mechanism is a compromise, but works quite well. I can compare files in root fs directories against those in /usr/pkg/*/root to find which file came from which package. Updating is a simple cp -a.
Why not do the same for /usr, and avoid the symbolic link farms? Primary reason is that while copying into the root fs those files that need to be there might take up 30MB or so, doing the same for /usr would mean an extra 500MB or more of duplicated data. The other reason is that for those packages which aren't too tied to their location in the filesystem, differing versions can be present on the system simultaneously.
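For anyone curious what the link farm scripts amount to, here is a rough sketch in Python rather than Perl. It is not the poster's actual scripts: the function names are invented, and the demo runs in a throwaway temporary directory instead of the real /usr/pkg and /usr/bin. It shows the three jobs mentioned above (create, retarget, clean up) and how the link target itself answers "which package owns this file?".

```python
import os
import tempfile

def link_package(pkg_dir, subdir, target_dir):
    """Symlink every file in pkg_dir/subdir into target_dir, retargeting old links."""
    source_dir = os.path.join(pkg_dir, subdir)
    os.makedirs(target_dir, exist_ok=True)
    for name in sorted(os.listdir(source_dir)):
        link = os.path.join(target_dir, name)
        if os.path.islink(link):
            os.remove(link)  # retarget a link left by an older version
        # A regular file in the way raises FileExistsError, which is the safe outcome.
        os.symlink(os.path.join(source_dir, name), link)

def owning_package(link):
    """The link is the metadata: read the package directory name out of its target."""
    target = os.readlink(link)
    # target looks like <pkg_root>/<name-version>/<subdir>/<file>
    return os.path.basename(os.path.dirname(os.path.dirname(target)))

def clean_dangling(target_dir):
    """Remove links whose package directory has been deleted; return their names."""
    removed = []
    for name in sorted(os.listdir(target_dir)):
        link = os.path.join(target_dir, name)
        if os.path.islink(link) and not os.path.exists(link):
            os.remove(link)
            removed.append(name)
    return removed

# Demo in a scratch directory standing in for / (assumed layout: pkg/<name-version>/bin).
root = tempfile.mkdtemp()
pkg = os.path.join(root, "pkg", "hello-1.0")
os.makedirs(os.path.join(pkg, "bin"))
open(os.path.join(pkg, "bin", "hello"), "w").close()
bin_dir = os.path.join(root, "bin")
link_package(pkg, "bin", bin_dir)
print(owning_package(os.path.join(bin_dir, "hello")))
```

Uninstalling then comes down to deleting the package directory and running clean_dangling over each link farm.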
What I'd love to have in a package manager is a more intelligent dependency check. Like, instead of just saying "I need this version of X," it would also just check for the existence of /usr/X11R6. Or if a package requires Berkeley DB, after checking "inside" the package manager, just try and see if there's a libdb.so somewhere in the LD search path. And then mark down "inside" the package management system that the "Berkeley DB" or "XFree86" dependency seemed to be fulfilled by a manual installation.
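Something like the following Python sketch is what that check might look like. Everything here is hypothetical: the package database is just a dict, and the probe file names and search directories are passed in by the caller rather than read from any real rpm or dpkg database or from the actual linker configuration.

```python
import os
import tempfile

def check_dependency(name, package_db, probe_files, search_dirs):
    """Return how a dependency is satisfied: 'package', 'manual', or None.

    package_db  -- dict of name -> version (stand-in for the real database)
    probe_files -- file names whose presence suggests a manual install
    search_dirs -- directories to look in, e.g. the library search path
    """
    if name in package_db:
        return "package"
    for directory in search_dirs:
        for filename in probe_files:
            found = os.path.join(directory, filename)
            if os.path.exists(found):
                # Mark it down "inside" the database as manually fulfilled.
                package_db[name] = "manual:" + found
                return "manual"
    return None

# Demo: a scratch directory plays the part of a library directory.
lib_dir = tempfile.mkdtemp()
open(os.path.join(lib_dir, "libdb.so"), "w").close()
db = {"glibc": "2.2.5"}
print(check_dependency("BerkeleyDB", db, ["libdb.so"], [lib_dir]))
```

Once the manual install has been recorded, later checks for the same name are answered from the database without probing the filesystem again.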
That would be the ideal system for me.
Al Qaeda has ninjas!
Actually, there is a limitation of .rpm that hinders the APT4RPM functionality-- file dependencies. .rpm archives depend on specific files, while .debs depend on specific packages. This can be worked around, essentially by creating a list that maps files-that-are-depended-upon to packages-containing-these.
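The workaround described above is essentially an inverted index. A minimal Python sketch, with made-up package contents and function names (this is not how apt4rpm itself is implemented):

```python
def build_file_index(package_contents):
    """Invert {package: [files]} into {file: package}."""
    index = {}
    for package, files in package_contents.items():
        for path in files:
            index.setdefault(path, package)  # first provider wins in this sketch
    return index

def resolve_file_deps(file_deps, index):
    """Translate file dependencies into package names, plus any left unresolved."""
    packages, missing = set(), []
    for path in file_deps:
        if path in index:
            packages.add(index[path])
        else:
            missing.append(path)
    return sorted(packages), missing

# Illustrative contents only.
CONTENTS = {
    "bash": ["/bin/bash", "/bin/sh"],
    "perl": ["/usr/bin/perl"],
}
index = build_file_index(CONTENTS)
print(resolve_file_deps(["/bin/sh", "/usr/bin/perl", "/usr/bin/python"], index))
```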
But yes, there is at least one technical superiority of the .deb file format. I have never heard any argument that .rpms have a technical superiority to .debs, so I have to wonder: why don't RPM-based distros switch to deb? They could just adopt the .deb file format as RPM 5, make the tools speak deb, and stop worrying about it. They'd serve their users better and reduce duplication of effort.
Or perhaps users should take it into their own hands. Using tools like 'alien', it might be possible to take the apt4rpm approach one step further-- create an unofficial 'Redhat .deb' distribution-- the same packages as Red Hat, but in a different package format.
Yeah, there are a couple of problems with RPM, but:
- it's easy to do upgrades (on RedHat, don't know about others). I have been doing it for several years from a remote location, and only once did it fail, because of a bad LILO configuration...
- you always know which file belongs to which package
- you can verify checksums of all installed files
- dependencies are not a problem - they are a solution to the problem
- it's simple to locate a needed package from the distro
- if you're trying to install someone else's package, you'd better get the sources and build the rpm package yourself
- I agree that it is a bad idea to distribute rpm binaries only, so the best is to post the tar.gz source; rpm packages are optional (it is good if the source includes a .spec file) :)
- and if you don't like dependencies, you can always use --nodeps
P.S. When I started using linux in 1995, the first distribution I installed was Slackware, and after one year I switched to RedHat.
Slackware is good, but you have the same dependency problems (and you don't even know which package to install when you hit such a problem, let's say when installing some binary package). It is also much harder to upgrade....
What if, when you wanted to perform a binary installation, it checked dependencies the same way that autoconf-like programs do... tries to find them in particular locations, and creates a configuration file for that program based on what it found? It can do version checking as well, and report any mismatches to the user. In situations where there isn't a clear-cut place to put such a file, the installer could create a Bourne shell startup script instead. It would work everywhere, and wouldn't be dependent on _any_ rpm or deb databases.
I realize that this would require one new file (either a config file stored in the program's library directory, or a shell script used for startup), for each package that gets installed, but we're already looking at wasting space with the rpm or deb databases anyways.... this solution wouldn't take up any more space and has the added bonus of being completely cross-distribution!
For library packages, it shouldn't even need to store a config file... it can just check the versions of the software or libraries that it does require and report back to you. The job of actually finding the libraries as they are needed can be performed by the linker, which is presumably set up to search applicable directories. Heck, if it's not, even this information could be reported at installation time too!
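A rough Python sketch of the install-time probing described above, under stated assumptions: the variable names, candidate paths, and program path are invented for the example, and version checking is left out. It probes a list of candidate locations for each requirement, reports what it could not find, and writes the result as a Bourne shell startup script rather than consulting any rpm or deb database.

```python
import os
import tempfile

def probe(candidates):
    """Return the first candidate path that exists, or None."""
    for path in candidates:
        if os.path.exists(path):
            return path
    return None

def make_launcher(program, requirements):
    """Build a Bourne shell launcher from probe results.

    requirements -- dict of VARIABLE_NAME -> list of candidate paths
    Returns (script_text, missing_variable_names).
    """
    lines, missing = ["#!/bin/sh"], []
    for var, candidates in sorted(requirements.items()):
        found = probe(candidates)
        if found is None:
            missing.append(var)  # report back to the user at install time
        else:
            lines.append('{}="{}"; export {}'.format(var, found, var))
    lines.append('exec "{}" "$@"'.format(program))
    return "\n".join(lines) + "\n", missing

# Demo: one requirement resolves to a scratch directory, the other to nothing.
data_dir = tempfile.mkdtemp()
script, missing = make_launcher(
    "/opt/app/bin/app",  # hypothetical install location
    {"APP_DATA": ["/nonexistent-example-dir", data_dir],
     "APP_LIB": ["/nonexistent-example-dir"]},
)
print(script)
print("missing:", missing)
```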
File under 'M' for 'Manic ranting'