Why Switch a Big Software Project to autoconf?
woggo queries: "I'm a CS grad student working on a large research project (over 1 million lines of code, supported on many platforms). The project has been under development for several years, and the build system is nontrivial for end-users. We'd like to make it easier to build our software, and I'm investigating the feasibility of migrating to GNU autoconf. I need to demonstrate that the benefits of autoconf outweigh the costs of migrating a large system of makefiles with a lot of ad-hoc kludge-enabling logic. Has anyone made a similar case to their advisor/manager? Does anyone have some good 'autoconfiscation' war stories to share? (I've already seen the Berkeley amd story and the obvious links from a google search....)" Depending on the intricacies of the build process, such a conversion might take an awful lot of work. It might be easier to put a nicer face on the "nontrivial build process", although there is something to be said for the ease of "./configure; make; make install"
You don't have to migrate the whole thing at once. Just start with a nice simple configure.in that copies Makefile.in to Makefile and config.h.in to config.h and add #includes for config.h everywhere.
Make sure new tests go in configure.in and you can slowly move across with little trouble. It should therefore be pretty easy to show that the costs are very low...
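As a rough sketch of what one source file might look like partway through such a migration: the guarded include lets the file build both before and after config.h exists, assuming the Makefile passes -DHAVE_CONFIG_H once configure is in use. HAVE_STRDUP is the symbol a function check for strdup would define; dup_string is a made-up helper name used only for illustration.

```c
/* Sketch only: dup_string is a hypothetical helper, not part of autoconf. */
#ifdef HAVE_CONFIG_H
#  include "config.h"   /* written by configure from config.h.in */
#endif

#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap-allocated copy of s, or NULL if allocation fails.
   Uses the system strdup when configure found one (HAVE_STRDUP),
   and a plain malloc + memcpy fallback otherwise. */
char *dup_string(const char *s)
{
    assert(s != NULL);
#ifdef HAVE_STRDUP
    return strdup(s);
#else
    {
        size_t len = strlen(s) + 1;   /* include the terminating NUL */
        char *copy = malloc(len);
        if (copy != NULL)
            memcpy(copy, s, len);
        return copy;
    }
#endif
}
```

Files that have not been touched yet keep building exactly as before, which is what makes the file-by-file approach cheap.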
Here's an article on the subject, written by Uwe Ohse (you can read the original article here). Many of the problems were fixed in the meantime, but it makes an interesting read nevertheless.
autoconf
config.guess, config.sub
There was no source from which one could get up-to-date versions of these scripts, which are used to determine the operating system type. This often caused pain: ask the OpenBSD port maintainers about it. (btw: there is now a canonical source for them, "ftp.gnu.org/gnu/config")
autoconf takes the wrong approach
The autoconf approach is, in short:
- check for headers and define some preprocessor variables based
on the result.
- check for functions and define some preprocessor variables.
- replace functions not already found in the installed libraries.
Yes, it works, albeit some details are discouraging, to put it mildly. No, it doesn't work well enough. This approach has led to an incredible amount of bad code around the world.
Studying the autoconf documentation one learns what kinds of incompatibilities exist. Using autoconf one can work around them. But autoconf doesn't provide a known working solution to such problems. Examples:
- AC_REPLACE_FUNCS, a macro to replace a function not in the
system's libraries, leads to the inclusion of rarely used
code in the package - which is a recipe for disaster.
On the developer's system the function unportable() is available,
on another system it isn't? Oh well, just compile unportable.c
there and link it into the programs
...
- The same is true for AC_CHECK_FUNC(S). In this case there isn't
a replacement source file, but even worse, there's an #ifdef
inside the main sources, unless the programmers are careful
to use wrappers, which they often aren't, because compatibility
problems are often discovered very late in the testing process
(or even after release) and people are usually trying to make
the smallest possible changes.
In both cases you end up with rarely, if ever, used code in your programs. It's not dead code, it's zombie code - one day, somewhere, it will come alive again. Yes, this solves a problem. But it's overused, and it's dangerous. In many cases unportable.c doesn't work on the developer's system, so she can't test it. In other cases unportable.c only works correctly on _one_ kind of system, but will be used on others, too.
Yes, the often used packages _have_ been tested almost everywhere. But what about the less often used? ...
Keep in mind that there is no central repository of replacement functions anywhere.
This is surely nothing which can be avoided completely, but it's something which has to be avoided wherever possible.
There's a solution to this problem, but it is completely different from what's used now: Instead of providing bare workarounds, autoconf (or a wrapper around it) ought to provide an abstraction layer above them, and a central repository for such things.
That way a programmer wouldn't use opendir, readdir, closedir directly but call them through the wrap_opendir, wrap_readdir and wrap_closedir functions (i'm aware of the fact that the GNU C library is this kind of wrapper, but it hasn't been ported to lots of systems, and you can't rip only a few functions out of it).
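A minimal sketch of what such a wrapper layer could look like. Only the wrap_* names come from the article; the signatures, the bodies and the count_entries caller are assumptions made for illustration. CLOSEDIR_VOID is the symbol autoconf defines when closedir() does not return a meaningful value.

```c
/* Sketch of a thin portability layer for directory reading.
   The wrap_* names come from the text; everything else is assumed. */
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

DIR *wrap_opendir(const char *path)
{
    assert(path != NULL);
    return opendir(path);
}

/* Return the next entry name, or NULL at end of directory or on error. */
const char *wrap_readdir(DIR *dir)
{
    struct dirent *entry = readdir(dir);
    return entry != NULL ? entry->d_name : NULL;
}

/* Always return an int, even where closedir() itself returns nothing
   useful. The #ifdef lives here once, not in every caller. */
int wrap_closedir(DIR *dir)
{
#ifdef CLOSEDIR_VOID
    closedir(dir);
    return 0;
#else
    return closedir(dir);
#endif
}

/* Example caller: count the entries in a directory, -1 on failure. */
int count_entries(const char *path)
{
    DIR *dir = wrap_opendir(path);
    int n = 0;

    if (dir == NULL)
        return -1;
    while (wrap_readdir(dir) != NULL)
        n++;
    wrap_closedir(dir);
    return n;
}
```

The main sources only ever see the wrap_* calls, so the platform-specific branch is compiled and tested in one place.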
autoconf macros are inconsistent
For example: AC_FUNC_FNMATCH checks whether fnmatch is available and usable, and defines HAVE_FNMATCH in this case. AC_FUNC_MEMCMP checks for availability and usability of memcmp, and adds memcmp.o to LIBOBJS if that's not the case. Other examples exist.
autoconf is not namespace-clean
autoconf doesn't stick to a small set of prefixes for macro names. For example it defines CLOSEDIR_VOID, STDC_HEADERS, MAJOR_IN_MKDEV, WORDS_BIGENDIAN, in addition to a number of HAVE_somethings. I really dislike that, and it seems to get worse with every new release.
My absolutely best-loved macro in this regard is AC_FUNC_GETLOADAVG, which might define the following symbols: SVR4, DGUX, UMAX, UMAX4_3, NLIST_STRUCT, NLIST_NAME_UNION, GETLOADAVG_PRIVILEGED.
autoconf is large
I'm feeling uneasy about the sheer size of autoconf. I'm not impressed: autoconf-2.13.tar.gz has a size of 440 KB. Add automake to that (350 KB for version 1.4). Does it _really_ have to be that large? I don't think so.
The size has a meaning - for me it means autoconf is very complicated. It didn't use to be so, back in the "good old days". And it accomplished its task. I really don't see that it can do so much more today (i don't mean "so much more for me").
configure is large
Even trivial configure scripts amount to 70 KB of size. Not much?
Compressed with gzip it's still 16 KB. Multiply that by millions of copies and millions of downloads.
No, i don't object to the size. It's perfectly ok if you get something for it. But you don't: about one half or more of each configure script can be thrown away without any lossage.
- Large parts of it just deal with caching, which wouldn't be
needed if configure wasn't so slow.
- Other parts of it are the --help output, which looks so good
... but doesn't help usually (try it and find out what to do,
without reading README or INSTALL).
- Then there is the most bloated command line argument parser
i've ever seen in any shell script.
- Then there are many, many comments, but they aren't meant to
help you see what's going on inside configure, they are the
documentation for the macro maintainers (some might actually
prove to be useful, but the vast majority doesn't).
The configure scripts are an utter horror to read. There's a reason for this: configure doesn't use any "advanced" feature of the shell. But i wonder - are shell functions really unportable? And if the answer is yes: Do you really expect anything to work on that system? The problem is that a shell that old is unlikely to handle almost anything, for example large here documents. The configure scripts are an utter horror to debug. Please just try _once_ to debug 4000 lines of automatically generated shell scripts.
Note to the autoconf maintainers: The road you're on is going to end soon.
autoconf is badly maintained
Let me clarify this first: I don't think badly of the development. I'm missing maintenance of the already released versions. Now, at the end of 2000, almost two years have passed without a maintenance release of autoconf. 9 months have passed since a security problem was found (in the handling of temporary files). More bugs have been found, of course. ...
I know that nobody likes to dig in old code, but 2 years are a little bit much.
automake
My primary objection to automake is that automake forces me to use the same version of autoconf everywhere. Autoconf has a large number of inter-version conflicts anyway, but automake makes that situation worse, much worse.
I'd need the same version of both tools on all machines on which i happen to touch the Makefile.am or configure.in or any other autoconf input file. There are a number of reasons for that, one of them is that automake used to provide some autoconf macros during the years autoconf wasn't developed at all, and these macros have now moved to autoconf, where they belong. But if you happen to use, say, AM_PROG_INSTALL, and later versions declare that macro obsolete
That doesn't sound too bad? Ok, but suppose
- i update all those machines regularly (i'm not going to really do
that, i'd rather stick to what's installed, but anyway)
- i didn't touch a project for, say, 2 years, and then i need to
change something and release a new version. This involves changing
the version number in configure.in.
In more cases than not this will need considerable changes to configure.in. Some major, most minor - but even the minor ones need attention. I found that hard to deal with. Things were even worse since every CVS checkout tends to change time stamps, which can mean that autoreconf is run even if no change has been made to any autoconf input file.
Don't misunderstand me: i don't attribute that to automake. I attribute it to the internal instability of autoconf. Unfortunately you can't have automake without autoconf.
libtool
Libtool adds an additional layer of complexity. I never had any reason to look at the insides of libtool (which proves that it worked for me). But having one project which used autoconf, automake and libtool together was enough - never again. I got exactly the same problems as i got with automake, described above, but they were worse and happened more often.
One problem with libtool is that releases don't happen very often. Libtool is rarely up to date with regard to changes on some operating systems, which makes it difficult to use in packages meant to be really portable (to put it mildly).
A libtool script and a number of support files are distributed with every package making use of libtool, which ensures that any user can compile the package without having to install libtool before. Sounds good? But it isn't.
Another problem is the size of the libtool script. 4200 lines ...
summary
Autoconf is the weak link of the three tools, without any doubt.
Version 1 wasn't really designed, version 2 made a half-hearted
attempt at dealing with design problems. I'm not sure about the
next version.
Petru
I am an engineering grad student working on a similarly sized project. Our project is compiled on a variety of Unix platforms using automake, autoconf and libtool. As you are already compiling for multiple platforms you are 90% of the way there in determining the different needs for each compile. If you haven't already organized your build process, now would be a good time before it becomes 10M lines of code.
Autoconf and friends make it infinitely easier to compile our code. However, you will have to put in a fair bit of work working out the variety of tests required to determine the idiosyncrasies of each build. You are probably already doing something similar if you can build on multiple platforms.
Autoconf has been well worth the initial effort. Occasionally new compile problems crop up, but they are usually solved by the addition of another one-line check in configure.in.
Selling autoconf should be easy. Wrestle with compile problems once getting autoconf working, or have users repeatedly wrestle with the problems without autoconf.
Well first, this isn't the issue here. We're talking about compiling from source which isn't the same as deciding how to install a binary once compiled.
As for rpm, etc you just want a graphical front-end, I think.
That said, Windows' setup.exe is actually unnecessarily complicated for users. With ROX, we're using application directories. There is no setup program because the program can just run in-place. See the example at the bottom of this page. As a bonus, running a source package is done in exactly the same way as running a binary (it just takes a bit longer).
The whole business of installing is terribly arcane if you think about it (hint: the computer already has everything it needs to run the application... why the extra steps?)
I'm a novice when it comes to automated build tools, but I've been impressed by Ant, from the Jakarta project by the Apache Group. From what I've read, it seems that Ant can do almost everything autoconf can, but because it's written in Java and uses XML to store its configuration, there are no cross-platform issues. I should add that I have *very* limited experience with autoconf; I've really only *used* it, not developed with it, so my opinion is a fairly uneducated one. Has anyone else used Ant and autoconf enough to make a good argument for or against Ant?
For all the fact the libtool tag line is "Do you really want to worry about linking on AIX yourself?"...
I spent quite a bit of time this summer trying to use autoconf'd stuff on AIX (Gtk+). I played with a pile of recent and not-so-recent versions of libtool. It was a pain in the butt. (Granted, linking on AIX is ... abnormal ... but you'd think libtool would know how to do it).
I think the power of the autoconf suite to make things work across "all Unices" is a bit exaggerated. Check whether it really does a good job of supporting all the platforms you need, first.
Thanks to autoconf, some really nasty #if's in the code have been replaced by a single include line. It's also able to simplify code by removing 50 trillion OS-specific checks from the source files and inserting them only where they are needed.
Once you get over the basic learning curve hurdles, then you should be fine.
It would be nice though if there was a makefile -> autoconf converter (but that is just me).
UPS Sucks
I'm working on a relatively large project myself.
We're using a build tool called "Jam", which can be gotten from http://www.perforce.com/jam/jam.html. It does a very good job at cross platformability and is faster than make at determining include dependencies.
One open source project that uses its own version of Jam is the Boost libraries at http://www.boost.org/
Enjoy!
How about this instead?
apt-get --compile source packagename
and go have a cup of coffee...
Don't forget the Autoconf macro library and also the fact that there are thousands of free packages out there which will have configure scripts from which you can borrow - try to find packages in a similar domain to your own.
The difficulty (complexity, time taken) in maintaining a package which works on N platforms is usually proportional to N - or if you are unlucky, some small power of N like 2. What your code is really doing is trying to understand the properties of the target system and so the #ifdef __hpux__ for example in line 1249 of blurfl.c is actually trying to determine if the quux is used like this or like that. Autoconf on the other hand will produce a single preprocessor macro for driving the quux. This means that you don't have an extra 15 lines of #ifs to handle quuxes in other operating systems. Hence you may not have fewer #ifdeffed bits, but each of the #ifdeffed bits will be shorter.
Autoconf works differently from the standard approach - it allows your code to work with each feature independently, and so while the normal approach is O(N) or O(N^2) in the number of supported OSes, the complexity of maintaining an autoconfiscated program is O(N) in the number of different features supported by the operating systems between them. The great thing about this is that Autoconf keeps these orthogonal and prevents these things from interacting too heavily, and it turns out to be the case that the total number of different features basically flattens out to some constant number even when you continue to add more supported OSes (i.e. there are only a certain total number of different ways of doing things even though each of the N operating systems can choose among X ways of doing Y things).
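To make the feature-by-feature idea concrete, here is a small sketch that asks "how do I get the page size here?" instead of "which OS is this?". The HAVE_* names follow the convention of autoconf's header and function checks; FALLBACK_PAGE_SIZE, page_size and round_up_to_page are assumptions invented for the example, and the default of 4096 is only a guess that a real project would want to revisit.

```c
/* Sketch: one conditional per feature, none per operating system. */
#ifdef HAVE_CONFIG_H
#  include "config.h"
#endif
#ifdef HAVE_UNISTD_H
#  include <unistd.h>
#endif
#include <assert.h>

#define FALLBACK_PAGE_SIZE 4096L   /* assumed default, not from autoconf */

/* Return the page size using whichever interface configure found. */
long page_size(void)
{
#if defined(HAVE_SYSCONF) && defined(_SC_PAGESIZE)
    long n = sysconf(_SC_PAGESIZE);
    if (n > 0)
        return n;
#elif defined(HAVE_GETPAGESIZE)
    return (long) getpagesize();
#endif
    return FALLBACK_PAGE_SIZE;     /* neither interface is available */
}

/* Round nbytes up to a whole number of pages. */
long round_up_to_page(long nbytes)
{
    long p = page_size();

    assert(nbytes >= 0);
    return ((nbytes + p - 1) / p) * p;
}
```

Adding a new operating system to the supported list does not touch this function unless that system introduces yet another way of asking for the page size.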
So, what this means is that you should do the feasibility study as discussed above, but naturally autoconfiscating the whole system will take a while. The ideal time to do this is either
Another option to explore in combination with Autoconf is the Apache Portable Runtime. There is also Autoproject but I suspect that is a little lightweight for your needs.
Taking all the above together I suggest that you
When you are cross compiling, configure may not be able to run those tests, but you can help it out.
Set the CONFIG_SITE variable in configure's environment to point to a config.site file. Then put into your config.site file a line like:
ac_cv_func_getpgrp_void=yes
Yes, you do have to look at the configure code to find that name, but it lets you give the software the answers it needs.
HTH,
Eli
The Mozilla Project has a page on their website about migrating from their build system to autoconf. No idea how far this got, but it fits your requirements of a huge project.
nikel
This is not a solution to the current question, but is something to think about when choosing which tools to use for any project. Different languages handle portability in different ways. The C approach is a mixed blessing; the core language is portable to about any platform in existence. But almost every non-trivial program uses many many library calls, and libraries can have many inconsistencies across platforms. Even if the functionality is the same, the functions may be called different names. In the case of more advanced features such as threads or IPC, different platforms can have entirely different systems. The way I see it (I may be wrong), automake and autoconf solve the stupid inconsistencies between library functions, and tell the user if a needed library is not installed.
Other languages take different approaches. Java has a very large set of libraries that are specified to be part of the language and so must be included. The Java language is also constant between platforms. Not every platform has a conforming Java environment, but the most popular ones do. Common Lisp also has huge functionality in its standard library that is part of the language specification. OCaml has a nice standard library, and is open source. If you want your program to work in OCaml on some unsupported architecture, you can compile it yourself. This still leaves porting the library. If the target architecture has POSIX, this is easy.
Autoconf is a tool that in the end can only make portability choices for you. In order for those choices to mean anything you have to have a need for your software to be portable (to a wide number of platforms, really), and you need to understand the real issues with writing portable software.
If you're writing for FreeBSD, Solaris, and Linux, then 98% of the time you can write application software so there are no portability issues. Why have the autoconf step when you can "make;make install"? Modern systems are not all that different for high level stuff, and are converging for medium level stuff. It's really only the low level details keeping them apart.
If, on the other hand, running on an old Ultrix box, on your SCO Unix box, or on that PDP-11 in your garage is important, autoconf can give you the mechanisms to make all that work, but only if you know the differences between the platforms and what changes need to happen to your code to make it work. It does no good to have autoconf check to see if bzero exists if you don't know to use memset as an alternative, or vice versa. A check without an alternative is just a way of bombing a little sooner than the compile stage.
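A sketch of a check that does come with an alternative, using the bzero case: the code prefers memset and falls back to bzero only when configure found bzero but not memset. HAVE_MEMSET, HAVE_BZERO and HAVE_STRINGS_H are the symbols the usual function and header checks would define; zero_bytes is a hypothetical wrapper name.

```c
/* Sketch only: zero_bytes is a made-up name for illustration. */
#ifdef HAVE_CONFIG_H
#  include "config.h"
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#ifdef HAVE_STRINGS_H
#  include <strings.h>   /* often where bzero is declared */
#endif

/* Set n bytes starting at p to zero. */
void zero_bytes(void *p, size_t n)
{
    assert(p != NULL);
#if defined(HAVE_MEMSET) || !defined(HAVE_BZERO)
    memset(p, 0, n);     /* standard C, used whenever possible */
#else
    bzero(p, n);         /* only when memset is missing and bzero exists */
#endif
}
```

The rest of the program calls zero_bytes and never needs to know which branch was compiled.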
The other thing autoconf can help with is optional packages. These are not portability issues per se, but rather choices that need to be made, but often aren't worth bothering a user about. Consider the application that's all command line based except for a single X app that's not really needed, just nifty. Well, if the system doesn't have X, you don't build it, and if there's no X it's unlikely the user wanted to run X apps anyway.
As far as the mechanics go, autoconf is fairly easy. Once you understand the changes that need to happen, making autoconf make them for you is trivial.
The maintainers of the autotools (autoconf, automake, libtool) wrote a book to help explain the approach used by the tools. (Yes, it's called the goat book. Read the page to find out why.)
I've seen an amazing amount of crap posted in these comments; the parent article by jvl001 is one of the few good exceptions. NO tool can get it all; the autotools get you about 90%, and you have to help them the rest of the way. There are solutions for just about all of the problems and red herrings I've seen posted here, but you need to look a little farther than /. to find them.
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
I've used autoconf extensively when building with a cross-compiler. Your advice to ``stay far away from autoconf'' is unwarranted.
It's certainly true that building with a cross-compiler requires some extra care. It's also true that autoconf provides some tests which give errors when building with a cross-compiler. However, it's not arbitrary. Those tests cannot be run correctly if you cannot run an executable. The tests are provided as a convenience for the common case of a program which is never built with a cross-compiler.
For a program which is built with a cross-compiler, there are various ways to handle these tests. I usually write the configure.in script to check the cross_compiling variable, and, if set, perform the check at run time rather than at compile time. For example, if you have the source code to GNU/Taylor UUCP around, look at how the ftime test is handled.
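A rough sketch of the C side of that idea, using byte order rather than the ftime case. It assumes configure.in defines a made-up symbol, CHECK_ENDIAN_AT_RUNTIME, when cross_compiling is set, and otherwise relies on WORDS_BIGENDIAN from the normal test; host_is_big_endian and to_be32 are illustrative names, not anything taken from Taylor UUCP.

```c
/* Sketch: defer a test to run time when configure could not run it.
   CHECK_ENDIAN_AT_RUNTIME is a hypothetical symbol for cross builds. */
#ifdef HAVE_CONFIG_H
#  include "config.h"
#endif
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return 1 on a big-endian host, 0 on a little-endian one. */
int host_is_big_endian(void)
{
#if defined(CHECK_ENDIAN_AT_RUNTIME) || !defined(HAVE_CONFIG_H)
    /* No trustworthy configure answer: look at the first byte of 1. */
    const uint32_t probe = 1;
    unsigned char first;

    assert(sizeof probe > 1);
    memcpy(&first, &probe, 1);
    return first == 0;
#elif defined(WORDS_BIGENDIAN)
    return 1;            /* configure ran the test on the build host */
#else
    return 0;
#endif
}

/* Convert a host-order value to big-endian byte order. */
uint32_t to_be32(uint32_t v)
{
    if (host_is_big_endian())
        return v;
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}
```

The run-time probe costs a few instructions per call, which is usually an acceptable price for a package that has to build with a cross-compiler.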
There is a chapter in the book I cowrote which discusses building with a cross-compiler.
When you use the AC_CONFIG_HEADER macro, as almost everybody does, autoconf does pretty much what you suggest: the main difference is that it writes a single header file to disk, rather than many small header files.
autoconf normally only changes the Makefile to handle things like compiler options and library options. DJB handles those by creating little files which the Makefile uses. It is entirely possible to use autoconf in this fashion, and it requires no work beyond what DJB's system requires.
The autoconf system is certainly more complex than DJB's system. I think it is also more powerful and more consistent. It's far from perfect, but I don't think you've clearly identified the problems it has.
- The library dependency checking via AC_CHECK_LIB rocks. Why should I have to deal with the infinite possible locations for shared libraries? Autoconf deals with this nicely, and builds your -l and -L parameters for you.
- ./configure --prefix=[somewhere] makes it very easy for your users to customize the installation directory.
- AC_ARG_WITH is an excellent macro that lets you create compilation options with ease. Two of my favorites are creating debugging and profiling builds that can be removed in a production compile.
The only truly horrible feature of autoconf/automake is the function detection mechanism, as other posters have complained about. Since there is no viable replacement for the functions that are being checked, this is just plain dumb. I suggest not using this feature at all, and only writing code that a) is strictly ANSI, or b) only uses library functions that you checked with AC_CHECK_LIB. Following these simple rules has made it very easy for me to create sane makefiles across projects with a very large number of subdirectories and sources.
std::disclaimer<std::legalese> sig=new std::disclaimer; sig->dump(); delete sig;
My experience with autoconf- and libtool-based build processes is that they tend to either a) require using a gcc-based compiler or b) only kick in optimization flags if the end user sets CFLAGS manually (and even then, the CFLAGS may not get carried over into all parts of the project).
So, depending upon your needs and just how portable you need to make your project, you might want to look at imake. While imake isn't 'simple' by any stretch of the imagination, one can take advantage of the fact that any system that ships with X11 developer packages has a working imake system that includes a good set of optimization switches set. The only big problem with imake is that a lot of folks don't set the site and/or host configuration files to change the compiler settings if they aren't using the manufacturer's compiler. [a simple #define HasGcc2 YES or is usually all it takes!]