Petreley On Simplifying Software Installation for Linux
markcappel writes "RAM, bandwidth, and disk space are cheap while system administrator time is expensive. That's the basis for Nicholas Petreley's 3,250-word outline for making Linux software installation painless and cross-distro." The summary paragraph gives some hint as to why this isn't likely to happen anytime soon.
I tried it (gentoo) some time ago. After two weeks of frustration I moved back to debian.
For me it was more like
1. emerge
2. come back in 8 hours and then:
a. see whole bunch of compilation errors,
b. dependencies were not sorted out correctly, so nothing works
c. combination of above
I especially liked (still do) the optimization potential (where debian is stuck at i386), but it didn't work for me.
this sig has intentionally been left blank
Static linking is a seriously bad idea. Part of the job of a packager is to arrange the app so it doesn't include its own copies of packages but uses the standard ones available on the system (and states these dependencies explicitly, so the installer can resolve them automatically).
Take zlib as an example of a library that is commonly used. When a security hole was found in zlib a few months ago, dynamically linked packages could be fixed by replacing the zlib library. This is as it should be. But those that for some reason disdained to use the standard installed libz.so and insisted on static linking needed to be rebuilt and reinstalled.
(OK I have mostly just restated what the parent post said, so mod him up and not me.)
That is quite apart from the stupidity of having ten different copies of the same library loaded into memory rather than sharing it between processes (RAM may be cheap, but not cheap enough that you want to do this; consider also the CPU cache).
A similar problem applies to an app which includes copies of libraries in its own package. This is a bit like static linking in that it too means more work to update a library and higher disk/RAM usage.
Finally there is a philosophical issue. What right has FooEdit got to say that it needs libfred exactly version 1.823.281a3, and not only that exact version but the exact binary image included in the package? The app should be written to a published interface of the library and then work with whatever version is installed. If the interface provided by libfred changes, the new version should be installed with a different soname, that is libfred.so.2 rather than libfred.so.1. It's true that some libraries make backwards-incompatible changes without updating the sonames, but the answer then is to fix those libraries.
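The soname rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: `libfred` is the hypothetical library from the post, and the helpers `parse_soname` and `can_use` are invented here; they are not how any real dynamic linker is implemented.

```python
import re

# Matches versioned sonames such as "libfred.so.1"; the trailing number
# is the interface version that the soname encodes.
SONAME_RE = re.compile(r"^(lib[\w+-]+)\.so\.(\d+)$")


def parse_soname(soname):
    """Split 'libfred.so.1' into ('libfred', 1)."""
    match = SONAME_RE.match(soname)
    if match is None:
        raise ValueError(f"not a versioned soname: {soname!r}")
    return match.group(1), int(match.group(2))


def can_use(required, installed):
    """Return True if one of the installed sonames is exactly the required one.

    The app only names the interface (library name plus interface number),
    never an exact release, so any build that carries the same soname is
    acceptable.
    """
    wanted = parse_soname(required)
    return wanted in {parse_soname(name) for name in installed}
```

With this model, an app built against `libfred.so.1` keeps working across bug-fix releases of libfred, and an incompatible libfred has to show up as `libfred.so.2`, which can be installed side by side.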
-- Ed Avis ed@membled.com
First of all, RAM and disk space are NOT cheap. I spent 60 euros for 256 MB RAM, that is not cheap (it's more than 120 Dutch guilders for goodness's sake!). A 60 GB harddisk still costs more than 200 euros. Again: not cheap. Until I can buy 256 MB RAM for 10 euros or less, and 60 GB harddisks for less than 90 euros, I call them everything but cheap.
What's even less cheap is bandwidth. Not everybody has broadband. Heck, many people can't get broadband. I have many friends who are still using 56k. It's just wrong to alienate them under the philosophy "bandwidth is cheap".
And just look at how expensive broadband is (at least here): 1 mbit downstream and 128 kbit upstream (cable), for 52 euros per month (more than 110 Dutch guilders!), that's just insane. And I even have a data limit.
There is no excuse for wasting resources. Resources are NOT cheap despite what everybody claims.
The article is surprisingly dense for such a word count -- yet is easy to read.
;-)
Petreley is undoubtedly getting the grip of this "writing thing"...
Seriously, though, however smart and logical are his conclusions, one thing bothers me: the installation should be simplified but "right", too.
I mean, there are other objectives besides being easy.
Last week I tried to install Red Hat 8.0 on a Pentium 75 MHz with 32 MB RAM (testing an old machine as an X terminal). It didn't work.
The installation froze at the first package -- glibc (it was a network installation) -- probably due to lack of memory (as evidenced by free et al.).
Why? It was a textmode installation. I know from past experience that older versions of Red Hat would install ok (I used to have smaller computers).
My suspicion is that Red Hat has become too easy -- and bloated. Mind you, I opted for Red Hat instead of Slack or Debian because of my recent experiences, in which RH recognized hardware better than the others.
I hope Petreley's proposed simplification, when implemented, takes size into consideration. The way it is (using static libs, for instance), it seems the other way.
The article as a whole, though, presents neat ideas and it's one of the best I've recently read.
Assuming redhat, freebsd, windows and mac osx installers installed and setup how you like, the interfaces are a lot simpler than gentoo or debian's.
I have installed windows, redhat, and gentoo. Yes, windows and redhat have much prettier interfaces. However, I have spent countless hours trying to install windows and redhat because the install tried to do something I didn't want it to do and crashed.
Windows 2000: the box has IDE and SCSI drives. I wanted windows on the SCSI drive as C. I had to take the IDE drive out to get it to let me. I don't even know where to start installing windows 2000 on a box without a CD-ROM drive.
RedHat: Anybody ever try installing RedHat onto a new box using ReiserFS and network install when the card is listed but the module won't load? I gave up and installed a CD-ROM drive.
Gentoo's install does take a long time but I never had these problems. When I was selecting where to install, I just used /dev/sda* instead. Machine doesn't have a CD-ROM drive or network isn't supported? I made an nfsroot kernel, mounted root from another one of my gentoo boxes, and did the install from there.
Slackware has a similar install procedure (all console) but it doesn't compile everything like gentoo.
So the point is, "Assuming redhat, freebsd, windows and mac osx installers installed and setup how you like" is a very big assumption.
But that's not the point. The point is, the interface, not the process behind the interface, needs to be intuitive.
/dev/sda, to someone who doesn't understand that a / is a file separator, that /dev is where your devices are, and that sda (I'm not a Linux dude, I might be wrong) is a SCSI device, is a small hurdle as well. But for Gentoo, it's not the only small hurdle. Things like emerge, and what to do if a package doesn't compile properly, are another. I've had it happen. Just re-emerge.
I remember back in the BBS days, it took me an hour or two to realize that...
Continue (Y/n):
Meant Y was the default. Not that Gentoo does or doesn't do it. But it's guilty of the same thing OpenBSD does. The interface is very VERY simple. Just not intuitive.
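For anyone who hasn't met the convention, here is a minimal sketch of how such a prompt is usually interpreted: the capital letter marks what you get by just pressing Enter. The function names `prompt_text` and `parse_yes_no` are invented for this example and are not taken from any installer.

```python
def prompt_text(question, default=True):
    """Build the prompt; the capitalised letter shows the default answer."""
    hint = "Y/n" if default else "y/N"
    return f"{question} ({hint}): "


def parse_yes_no(answer, default=True):
    """Interpret a reply to a 'Continue (Y/n):' style prompt.

    An empty reply (just pressing Enter) selects the default.
    """
    reply = answer.strip().lower()
    if reply == "":
        return default
    if reply in ("y", "yes"):
        return True
    if reply in ("n", "no"):
        return False
    raise ValueError(f"please answer y or n, not {answer!r}")
```

Nothing in the prompt itself explains this, which is the point being made: the interface is simple but you have to already know the convention.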
For the general case, win2k and redhat have intuitive install interfaces. Skip the actual working or not working of some driver or some odd setup. It's the clicking of the buttons and the finishing of the process. You can't very well understand what you are doing if the interface is unusable.
Look at DOS. DOS was a very simple interface except for one facet. It used drive letters. Other than that small hurdle, you were fine. That's the problem gentoo has as well.
Linux and *BSD are great under the hood. Quite stable. I'm not a windows user other than at work, and I'm not fond of it. But if your interfaces suck, how can you get anything done?
--
"I'm not bright. Big words confuse me. But Wanda loves me and that should be enough for you." - Cosmo
As I see it, the following things need to happen to really make application installation be very clean under any Unix like operating system:
Too damn many times I've tried to install FOO, only to be told by the packaging system "FOO needs BAR". But FOO doesn't *need* BAR, it just works "better" if BAR is present (e.g. the XFree packages from RedHat requiring kernel-drm to install, but working just fine (minus accelerated OpenGL) without it).
Were vendors to do this, then a program install could be handled by a simple shell script - untar to
The system could provide a means to access the HTML (a simple, stupid server bound to a local port, maybe?) so that you could browse all installed apps' help files online.
As a final fanciness, you could have an automatic process to symlink apps into a
ls
and see them.
www.eFax.com are spammers
One of the greatest strengths of the UNIX platform is its diversity.
Package installation is a simple prospect on the Windows platform for the simple reason that the platform has little diversity.
Windows supports a very limited set of processors. So there's one factor that windows packaging doesn't have to worry about.
Windows doesn't generally provide separately compiled binaries for slightly different processors ("Fat binaries" are used instead, wasting space). So the packaging system doesn't have to worry about that. On linux, on the other hand, you can get separate packages for an athlon-tbird version and an original athlon version.
On an MS system, the installers contain all the libraries the package needs that have the potential to not be on the system already. This could make the packages rather large, but ensures the user doesn't have to deal with dependencies. Personally, I'd rather deal with dependencies myself than super-size every installer that relies on a shared object.
Furthermore, on windows there aren't several different distributions to worry about, so the installers don't have to deal with that either.
All of these points confer more flexibility to the unix system but have the inevitable consequence that package management can get to be rather a complex art. We could simplify package management a great deal, but it'd mean giving up the above advantages.
This same problem occurs in the windows world as well, dll hell as it is often called. Here's how it works for windows. Say your program needs vbrun32.dll. You have a choice. You can put the dll in the same folder as the executable, in which case your program will find it and load the right dll. Or you can put it in the system or system32 directory, in which case your program and others can find it and load it. However, if vbrun32.dll is already loaded into memory, your program will use that one. I remember we used to have problems with apps only working if loaded in the right order so the right dll would load.
As with Linux, if there's a bug in the library you have to update either one file or search through the computer and update all instances. But, as with linux, the update can mess up some programs; others might be poorly coded and not run with newer versions of the dll. I've seen this last problem in both windows and linux; it looks like the programmer did "if version != 3.001 then fail" instead of "if version < 3.001 then fail".
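The difference between an exact-version test and a minimum-version test can be sketched like this. It is a hedged illustration only: the version string 3.001 comes from the post, while `parse_version`, `strict_check` and `minimum_check` are names made up for the example.

```python
def parse_version(text):
    """Turn '3.001' into (3, 1) so versions compare numerically, not as text."""
    return tuple(int(part) for part in text.split("."))


def strict_check(installed, expected="3.001"):
    """The brittle test: anything other than the exact version is rejected."""
    return parse_version(installed) == parse_version(expected)


def minimum_check(installed, minimum="3.001"):
    """The forgiving test: the exact version or anything newer is accepted."""
    return parse_version(installed) >= parse_version(minimum)
```

A program using `strict_check` breaks the day the library is patched to 3.002, even if nothing it relies on has changed; `minimum_check` only refuses versions that are too old.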
If everyone is forced to use the same library, you get these problems and benefits:
--1 easy point of update
--1 easy point of failure
--older software may not run with newer versions
--programmers may insist on a specific version number
--updates to the libraries can benefit all programs; if kde or windows gets a new file open dialog box, then all programs that link to the common library can have the newer look and feel by updating just one library.
On the other hand, if you let each program have its own, you get these problems and benefits:
--difficult to update libraries when bugs are found
--can run into problems if a different version of the library is already loaded into memory (does this happen with linux?)
--guarantee that libraries are compatible with your app
--compartmentalization; everything you need for an app is in its directory. Want to uninstall? Just delete the directory. No need to worry that deleting the app will affect anything else.
--no weird dependencies. Why does app X need me to install app Y when they clearly aren't related at all? The answer is shared libraries. Which is why many people like Gentoo and building from source.
Microsoft has waffled back and forth on the issue. Under DOS, everything just went into one directory and that was it. Windows brought in the system directory for shared dlls. Now the latest versions of windows are back to having each app and all of its dlls in one directory.
Personally, I think compartmentalization is the key, provided we get some intelligent updaters. If libthingy needs to be updated, the install procedure should do a search and find all instances of the library, back up existing versions and then update all of them. This wouldn't be that hard to do.
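A rough sketch of such an updater, assuming every private copy of the library has the same file name somewhere under one root directory. The function name `update_all_copies` and the `.bak` backup suffix are choices made for this example, not an existing tool.

```python
import shutil
from pathlib import Path


def update_all_copies(root, lib_name, new_lib):
    """Replace every file called lib_name under root with new_lib.

    Each existing copy is first saved next to itself with a '.bak'
    suffix. Returns the list of paths that were updated.
    """
    new_lib = Path(new_lib).resolve()
    updated = []
    # Collect the matches up front, since we create files while looping.
    for copy in sorted(Path(root).rglob(lib_name)):
        if not copy.is_file() or copy.resolve() == new_lib:
            continue  # skip directories and the update file itself
        backup = copy.with_name(copy.name + ".bak")
        shutil.copy2(copy, backup)   # back up the existing version
        shutil.copy2(new_lib, copy)  # then overwrite it with the update
        updated.append(copy)
    return updated
```

This only covers the search, backup and replace steps. It does nothing about statically linked programs or about apps that break with the newer version, which is where the backups would come in.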
Important features of the way AmigaOS libraries worked:
* All libraries were versioned, but not on the file system level. Each library contained a version number in its header.
* Versions of the same library were always backwards compatible. This was Law. Software using an old version of a library must continue to work on future versions. This also meant that developers had to think out their library API beforehand (because you would have to maintain that API). Libraries could be extended though with extra functions.
* Application programs had to 'open' libraries before using them. When opening a library an application would specify the minimum version that it required of the library. (If no matching or later version was found then the program would have to exit gracefully).
* There tended to be few (compared to Linux anyway) libraries. Libraries tended to be biggish. A few big libraries instead of many tiny libraries. This made them manageable.
* The backwards compatibility rule/law for libraries meant that software could bring its own version of a library and update the existing version of that library, but *only* if it was a more up-to-date version.
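The rules in the list above can be modelled in a few lines. This is a toy model, not the AmigaOS API: the dictionary of installed libraries, the name `fred.library`, and the functions `open_library` and `install_library` are all hypothetical.

```python
class LibraryError(Exception):
    """Raised when no suitable version of a library is installed."""


def open_library(installed, name, min_version):
    """Return the installed version if it is at least min_version.

    Because versions are always backwards compatible, any version that is
    equal to or newer than the requested minimum is acceptable.
    """
    version = installed.get(name)
    if version is None or version < min_version:
        raise LibraryError(f"{name} version {min_version} or later is required")
    return version


def install_library(installed, name, new_version):
    """Install a bundled library only if it is newer than the existing one.

    Returns True if the installed version was replaced or added.
    """
    current = installed.get(name)
    if current is not None and current >= new_version:
        return False  # never downgrade an existing library
    installed[name] = new_version
    return True
```

An application would call `open_library` at start-up and exit gracefully on `LibraryError`; an installer would call `install_library` for each library it ships.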
As a previous poster pointed out, a lot of the problem would disappear if people (library writers) maintained compatible interfaces for the same library soname. I'm pretty sure that this is the way it was designed to work.
anyway, a FYI.
--
Simon
And, sorry, but your post is an example of "more leet than thou"! This isn't meant to flame you, but you make the common mistake of assuming that "easy for a newbie to use" MUST EQUAL "dumbed down", and that's absolutely not the case.
Look at the apps that have options to use either basic or advanced interface. Selecting the basic interface doesn't mean that the app somehow no longer knows any of its more 1337 functions; it just means they aren't in the user's face, baffling the newbie with a million options he can't make head nor tail of. And as a rule, the default is the advanced interface, but the "make it simpler" option is easy to find right off.
This really isn't much different from "custom" and "standard" options for installers. Yeah, it requires more thought on the part of the developer. Is that a *bad* thing??
~REZ~ #43301. Who'd fake being me anyway?
Some very good points and ideas, but also IMHO some misguided assumptions and directions.
1) RAM & Disk space is not always cheap, or even readily available. There are many legacy systems where users would benefit from these advantages but the users are unable or unwilling to upgrade the system. What happens to old 486 and 586 systems where the motherboard doesn't support drives larger than X? There are workarounds, but the people who need easier install processes aren't going to tackle the complex system configuration issues to implement these. What happens when you can no longer obtain RAM in your community for your old machine, or it no longer has spare slots, etc.? What happens if you have a second hand computer and simply don't have the available $$ to spend on upgrades, no matter how cheap they are? I don't like the idea of designing an easier-to-use system that excludes such people, no matter how small a portion of the market they may be. Hence redundant copies of libraries and statically linked libraries are a very inelegant solution for these people.
2) We mustn't impose requirements on application developers to use a given installer library, or code their apps to conform with particular standards that the installer requires - it is again unfeasible and undesirable in many circumstances. Developers have more than enough to worry about as it is without having to reimplement the way their app behaves to be installer friendly. The installer must exist at a level independent of the way the application has been coded, to a reasonable degree. I think that much of the problem that exists currently is that too much of the "packager" issues of making apps compatible to a hundred and one different unices has been getting dumped on developers and this both reduces their time for actual development and means that we have a hodge-podge of apps that are compatible to an unpredictable degree, because essentially developers don't want to be burdened with this.
3) Diversity is the spice of life, and it is the spice of unix. The community of unices is robust because it has adapted systems which are generally stable and reliable across a vast array of hardware and software. We want to capitalize on this tradition and expand and enhance it, not force anyone to use a particular layout for their apps & installations. This being said, I find the idea of local copies of libraries in the application directory unappealing, because it forces one to have a local directory (rather than using /usr/bin, /usr/lib, etc.). Also the idea of having configuration files that resolve dependencies forces the application to use such configurations, which is also undesirable.
5) Aside from all these criticisms, there are many things I do agree with. Particularly that dependencies should be file specific, not package specific, that an integration of installer & linker is key to the organization of such a tool. I also agree that the installer should make use of auto-generated scripts wherever possible, and should provide detailed, useful messages to the end user that will help them to either resolve the conflicts in as friendly a way as possible, or to report the conflicts to their distribution. Also the installer should have advanced modes that allow for applications to be installed in accordance with a user or administrator preferred file system. That is, one shouldn't be forced to install into /opt or into /usr/bin or /usr/local if you don't want it there.
Given all this, is there any possible way to solve all of this in one consistent system? I think so - but it may require something that many will immediately retch over. A registry. That's right, I used the foul windoze word registry. I propose a per-file database for libraries & applications that would record where given versions of given libraries are installed, under what names, in what directories, of what versions, providing what
There are a thousand forms of subversion, but few can equal the convenience and immediacy of a cream pie -Noel Godin
A good rule to follow is to never dynamically link to something that's substantially smaller than your program.
Dynamic linking tends to pull in too much. If you use "cos", dynamic linking hauls in the whole math library. Yes, the paging system will eventually figure out what the working set is. Every time the program runs.
Maybe we shouldn't have "shared libraries" at all. We should use static libraries for things that are really libraries (math, I/O), and "big objects" (CORBA, .NET, etc.) for things that have lots of internal state (GUIs, databases). The big object systems already have machinery for version management and interface incompatibilities.
For those who haven't tried it:
"The Zero Install system removes the need to install software or libraries by running all programs from a network filesystem. The filesystem in question is the Internet as a whole, with an aggressive caching system to make it as fast as (or faster than) traditional systems such as Debian's APT repository, and to allow for offline use. It doesn't require any central authority to maintain it, and allows users to run software without needing the root password."
Hmm, this seems to ignore one of the big advantages of an always-build-from-source distribution. If you were using Debian or RedHat and the links package required svgalib, I'd think 'fair enough: it was probably built with the svgalib drivers'. But if you are building from source there should be the choice of using svgalib or not.
Compile-time feature choices aren't handled very well by most package managers. The usual approach is to turn on every possible feature, but this leads to some odd dependencies as you observed (on my Slackware box I needed to install the Enlightenment sound daemon to get gdm working). Maybe a better approach would be to split the features into pseudo-packages. Something like:
% inst links
[builds and installs links with minimal features]
% inst links-svga
[rebuilds links with svgalib support and installs that - also installs svgalib if needed]
% inst links-x11
[ditto for X11 support]
% uninst links-svga
[turns off svgalib support and rebuilds links yet again, so now it has X11 support but no svga]
Alternatively the links package could be split into a bunch of libraries (links_svgalib_driver.so, etc etc) which it checks for at run time. Then the links-svga package would contain that library. But this approach requires changes to the application.
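To make the pseudo-package idea a bit more concrete, here is a small model of the bookkeeping the `inst` and `uninst` commands would need. Everything in it is hypothetical: the `PseudoPackages` class, the feature names, and the dependency name `x11-libs` are placeholders, and no real package manager is being described.

```python
BASE = "links"

# Hypothetical mapping: pseudo-package -> (feature switch, library it pulls in).
FEATURES = {
    "links-svga": ("svga", "svgalib"),
    "links-x11": ("x11", "x11-libs"),
}


class PseudoPackages:
    """Tracks which feature pseudo-packages are installed for one program."""

    def __init__(self):
        self.enabled = set()

    def inst(self, package):
        """Install the base package or a feature, then return the rebuild plan."""
        if package != BASE and package not in FEATURES:
            raise KeyError(f"unknown package: {package}")
        if package != BASE:
            self.enabled.add(package)
        return self.build_plan()

    def uninst(self, package):
        """Turn a feature off again, then return the rebuild plan."""
        self.enabled.discard(package)
        return self.build_plan()

    def build_plan(self):
        """List the features to compile in and the libraries they need."""
        features = sorted(FEATURES[p][0] for p in self.enabled)
        dependencies = sorted(FEATURES[p][1] for p in self.enabled)
        return {"features": features, "dependencies": dependencies}
```

Running the sequence from the post (inst links, inst links-svga, inst links-x11, uninst links-svga) ends with a plan that has only the x11 feature, and svgalib is no longer listed as a dependency.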
-- Ed Avis ed@membled.com
While I think the debate over static vs dynamic libraries, DLL Hell, and registry vs central vs distributed storage of program parameters and settings is all worthwhile, he didn't cover what I think is *the* most important issue in the Linux installation process, and that is device detection.
MOST of the problems I've had with installing Windows, Linux or OS X involve the fact that when I am all done, not all the components of my machine are working the way I expected them to. I end up with no sound, or bad sound, or video that isn't right, or a mouse that doesn't work, or in the really bad cases, disk drives that work well enough to boot the system but then fail after I'm in the middle of something important.
Once I get past the initial installation I feel I am home free. If the devices all work the way they are supposed to, then I can avoid most other problems by just sticking with the distro that I started with. If it was Debian Stable I stay with that, and if I need to install something that isn't part of that system I install it as a user (new version of Mozilla, Evolution, Real*, Java for example).
It would definitely be nice if developers who used shared libraries didn't seem to live in a fantasy land where they are the only users of those libraries. But I *don't* think that this is Linux's biggest problem with acceptance. What Linux needs is an agreement by all the distros to use something like the Knoppix device detection process... and then to cooperatively improve on it. A run-from-CD version of every distro would be great. Why blow away whatever you are running now just to find out if another version of Linux might suit you better?
I'd like a system that does a pre-install phase where every component of my system can be detected and tested before I commit to doing the install. The results of that could be saved somewhere so that when I commit to the install I don't have to answer any questions a second time (and possibly get it wrong).
There is nothing that can guarantee that what appears to be a good install doesn't go bad a week later, but I personally haven't had this happen. I usually know I have a bad install within a few minutes of booting up the first time, and by then, it's too late to easily go back to the system that was "good enough".
the optimisation really only comes into play if the whole system has been optimised for a particular host. often just changing gcc to a more modern version is enough; changing i386 even to pentium4 with mmx/sse/sse2 has not given me much of a performance increase for everyday use (except in numerical code), but gcc-3.2 (on LFS) was responsible for insane speedups over my old mandrake!