How to Fix the Unix Configuration Nightmare
jacoplane writes: "There's an interesting article on freshmeat talking about how sorting out some kind of standard for configuration could really help Unix systems be more user friendly. The article points out that since Apple has managed to build a quite usable system on top of NetBSD, it should be doable to do the same for open-source interfaces."
The part that makes such a system really useful, however, is a standard agreement on which information is stored and what it means. This is where the Windows Registry falls down. And Unix is even worse, because all it has is some common sort-of-agreed-upon shell variables, like $EDITOR etc.
Apple is able to do this better because they set the standards for the OS (even more than MS). They can have one central "registry" for something like default associations of MIME-types with particular applications and define an API so every application can use it and a user doesn't have to change his settings in his browser AND his mail client AND his ftp client, etc.
Given the diversity of the unix crowd, the latter seems difficult to me. Maybe they can include it as part of LSB for a start?
Idempotent operation: Like MS software, whether you run it once or often, that doesn't make it any better.
I was thinking this morning how I hadn't updated my Apache server in a while, and wondering whether I should apt-get the latest version (Apache is kind of important for security, as it's the only open port on my system from the internet). However, I've done various tweaks to config files which would get overwritten if I accepted the Debian standard file, but I don't want to miss out on any new settings that could be important. I know I can do a diff, but that's effort I'd rather not have to go through.
This kind of situation is where a registry-like interface is useful; the install program just has to do a quick 'if-not-exist then add' to any new settings and leave the rest alone (or ask if you want an overwrite of all settings, with appropriate disclaimers).
This kind of thing is difficult to do in a flat file split into different sections (if there isn't a concept of sections, you can just tag the setting on), but trivial in a registry structure, especially when the tool (dpkg-upgrade/apt/rpm) has to handle all the different file formats. However, linux/Unix users would rebel en masse if the registry got inflicted on them (with damn good reason! I like being able to fix problems from single user mode using vi!), but some form of layer between the text file and settings may provide the best of both worlds (programmatic ease and editability). apt/dpkg/rpm could use the interface to add/modify settings without splatting your custom tweaks while still adding the new required settings.
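A minimal sketch of that 'if-not-exist then add' layer, assuming an INI-style file and using Python's standard configparser. The function name, section names and keys are made up for illustration; this is not how apt, dpkg or rpm actually handle config files.

```python
import configparser
import io

def add_missing_settings(existing_text, new_defaults_text):
    """Return the existing config plus any settings found only in the new defaults."""
    existing = configparser.ConfigParser()
    existing.read_string(existing_text)
    defaults = configparser.ConfigParser()
    defaults.read_string(new_defaults_text)

    for section in defaults.sections():
        if not existing.has_section(section):
            existing.add_section(section)
        for key, value in defaults.items(section):
            # 'if-not-exist then add': never overwrite a value the admin already set.
            if not existing.has_option(section, key):
                existing.set(section, key, value)

    out = io.StringIO()
    existing.write(out)
    return out.getvalue()

# Hypothetical local tweaks versus the file shipped with a new package version.
local = "[server]\nport = 8080\n"
shipped = "[server]\nport = 80\nkeepalive = on\n[security]\ntokens = prod\n"
merged = add_missing_settings(local, shipped)
print(merged)
```

The local port stays at 8080 while the new keepalive and security settings get added. One real limitation: configparser drops comments when it writes the file back, so a serious tool would need a parser that preserves them.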
Unfortunately, we're starting from a difficult point, with thousands of applications with many different requirements for their settings. Hopefully we can get some convergence over time.
Because there is only one captain on the ship, Apple. Good luck fixing it in the Linux world. The only way it might have a chance of working IMHO is if such a proposal gets included in the Linux Standard Base. Here's a bold idea, why not copy the way Apple does it??? No need to reinvent the wheel...
-adnans
"In short: just say NO TO DRUGS, and maybe you won't end up like the Hurd people." --Linus Torvalds
"gconf" does exactly what the Freshmeat article describes: a unique way of storing configuration. Its frontend is abstracted from the backend which uses a hierarchy of small plain XML files per default but could also use a SQL database or a LDAP directory.
LDAP is nice on its own, too, with configuration being stored platform- and host-independently, as well as global, group-specific and user-specific settings. Netscape Roaming works this way.
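The global / group / user layering is easy to picture in code. This is only a sketch with plain dictionaries standing in for directory lookups; the setting names and values are invented, and no real LDAP calls are made.

```python
from collections import ChainMap

# Hypothetical settings at three scopes. In an LDAP setup each dict
# would be filled from a directory query instead of being hard-coded.
global_settings = {"editor": "vi", "browser": "mozilla", "proxy": "none"}
group_settings = {"proxy": "proxy.example.org:3128"}
user_settings = {"editor": "emacs"}

# ChainMap searches its maps left to right, so user beats group beats global.
effective = ChainMap(user_settings, group_settings, global_settings)

for key in sorted(effective):
    print(key, "=", effective[key])
```

The user only stores what differs from the group, and the group only what differs from the global defaults, which is the main appeal of the layered approach.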
Sure, installing software is easy. Unpack and compile. We are talking about configuring software.
Like getting sendmail to route and forward mail, apache to serve up the web pages you want, BIND to bind names to IP addresses, etc.
These tasks aren't "hard" or anything, but they do require a lot of reading on the part of a newbie. In the Windows world (don't know much about OSX) most of that stuff can be done via an intuitive GUI.
Flame me if you want, but I'd greatly prefer a system that didn't require me to learn different config file formats for each service I want to have running... or deal with a hodgepodge of 'easy config' program hacks.
A simple, standard configuration system is definitely the way to go.
autopr0n is like, down and stuff.
There's lots of cool stuff you can do with standardized configuration files, and dynamically generated GUIs are only one part of it. But developers are not going to change their file formats unless there's a real push for a specific format. Thus, to establish any format, you first have to develop lots of filters to support legacy software. As the FM author correctly points out, such a system would make software configuration under *x potentially much easier than under Windows.
Finally, to all the hackers who fear that this will take away their file-meddling options: If properly implemented, it won't -- and if it does, it will not get accepted as a standard, exactly because of stubborn people like you :-)
The real issue is finding the configuration file; if you're using a distro you're not used to, then one often has to resort to using find. This is exactly what the Linux Standards Base was designed to avoid, and if all distros were to follow this model then I can't see that there is any real problem.
At the end of the day, we're not talking about things like setting background pictures in the window manager, we're talking about setting up mailservers, webservers and the like. If you're not cluefull enough to edit a text file then you're not cluefull enough to be put in charge of setting one of these servers up either.
Blaming GW Bush for the Iraq war is like blaming Ronald McDonald for the poor quality of food.
I'd agree that XML is a good basis, but "XML" really doesn't provide much by itself. It's just a file format that is human readable. If you just use XML with a bunch of proprietary tags, your own XML language so to speak, you really don't gain much over the existing different syntax config files.
An automated tool has no clue what your ipaddress (or whatever) tag means at all. You need to provide additional context for tools to understand the semantics of the configuration data. To make configuration files understandable in a more intelligent sense, you need to either restrict the tags you use to your own configuration language, or you need to provide metadata of some sort.
Why is this intelligence necessary? Well there are all sorts of dependencies and relationships in configuration files. You might want a GUI to let you know if you change something that may break another setting, and so on. Plus ideally you would only allow legal values to be set. Data typing could be done with W3C XML Schema Definition Language, or RELAX NG schemas.
Which brings me to RDF, which I think would be better suited to this task than XML alone. If you use RDF (see http://www.w3.org/RDF/ ) you make it much easier to have a self-describing format that tools can do more intelligent things with than raw XML. While I don't think RDF, DAML+OIL, et al is enough to create a Semantic Web as Tim Berners-Lee is hoping, it _is_ a step in a higher-level direction that will support more intelligence in processing data.
Mozilla already uses RDF for various configuration files and I'm sure there are other applications that do too. Mozilla has a whole bunch of stuff about their RDF here.
XML is just a tree of "stuff" in human-readable format. RDF lets you set up properties and relationships in the data in a standardized way. I don't have a brilliant example to prove this to skeptics, but really it is a better way to represent a lot of types of data you want to be able to query. There are many knowledge bases, expert systems and other query engines already out there using RDF and even higher-level languages like DAML+OIL.
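Not a brilliant example either, but here is a rough sketch of the idea in plain Python: facts stored as (subject, predicate, object) triples, which is the data model RDF is built on, plus a wildcard query. The vocabulary (listensOn, requiresRoot, the vhost names) is made up, and this uses no real RDF library or syntax.

```python
# Each fact is a (subject, predicate, object) triple, as in RDF's data model.
# All names below are hypothetical, chosen only for this illustration.
triples = {
    ("vhost:www", "type", "VirtualHost"),
    ("vhost:www", "listensOn", "port:80"),
    ("vhost:mail", "type", "VirtualHost"),
    ("vhost:mail", "listensOn", "port:8080"),
    ("port:80", "requiresRoot", "true"),
}

def match(s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# Query across relationships: which virtual hosts listen on a port that needs root?
privileged_ports = {t[0] for t in match(p="requiresRoot", o="true")}
needs_root = [t[0] for t in match(p="listensOn") if t[2] in privileged_ports]
print(needs_root)
```

The point is that the query follows relationships between things rather than walking a tree of tags, so a tool does not need to know where in the file a setting happens to be nested.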
-Kevin
It took MacOS X for people to realize that this was a problem in UNIX? Please.
/etc/registry.conf. There is no reason it has to be binary.
The reason UNIX and UNIX applications are hard to configure, in most cases, is because Open Source programmers are lazy.
This is obviously a blatant generalization so I will explain.
The old adage is that an Open Source program gets written when a programmer has an "itch" they decide to scratch. The problem is that very few people are itched by configuration. You may write the best web server in the world (Apache!) but by the time it comes to writing the configuration manager for it the volunteers start falling away.
It isn't very fun writing a bunch of dialogs, windows, buttons and such to make a nice configuration for a program. It's kinda like documentation (and we all know the state of docs for many UNIX programs).
I see examples of this every day. I have a Mac OS X using friend who sends me the URL of every new program he decides to use. It's incredible how many of them are UNIX ports with a beautiful configuration manager stuck on. Mac programmers hold themselves to a higher level of user experience and UNIX people need to get on the boat.
What's needed isn't a global, all dancing, all singing configuration system. What is needed is responsibility in programming.
P.S. Everyone always whines about the Windows registry because it's binary, you can't edit it blah, blah, blah... But the fact is: It works. The average user never cares to edit it because they config their programs from WITHIN their programs. If something is truly needed, do the Windows registry in text file format. Make it
What we need is exactly what the author is proposing. A generalized configuration system that works off meta data that the software developers supply.
- All the possible configuration options in logical groupings
- descriptions of each option
- default values
- validations on input values
- being able to label options as 'experimental', etc...
- and of course option dependencies - that is if this option is turned on, then enable these 5 other options.
To do it right you'd ultimately need a very light scripting language with flow control and variables. But the important thing is to get a workable standard in place, one that the majority of developers can rally around and will be happy to develop configuration scripts for their applications. Most developers wouldn't even have to change the way their application reads the config files - documenting everything in a form that this 'configuration manager' can use would be enough. It's already been too long - we shouldn't wait any longer to get this process moving forward.
We should use actual Python code as the configuration file format. It's callable directly from C, and by extension, all other languages. It's nice and clean. It supports hierarchical inclusion of other configuration files. It has easily readable comments. In short, it's the perfect configuration file specification language.
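The metadata listed above could look something like this. It is a sketch under my own assumptions: the Option class, its field names and the sample options are all hypothetical, and the dependency handling is a single non-transitive pass.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

def accept_any(value: str) -> bool:
    """Default validator: every value is legal."""
    return True

def on_or_off(value: str) -> bool:
    return value in ("on", "off")

def valid_port(value: str) -> bool:
    return value.isdigit() and 1 <= int(value) <= 65535

@dataclass
class Option:
    name: str
    description: str
    default: str
    validate: Callable[[str], bool] = accept_any
    experimental: bool = False
    enables: List[str] = field(default_factory=list)  # options turned on with this one

def apply(options: Dict[str, Option], settings: Dict[str, str]) -> Dict[str, str]:
    """Fill in defaults, reject illegal values, then switch on dependent options."""
    result = {}
    for opt in options.values():
        value = settings.get(opt.name, opt.default)
        if not opt.validate(value):
            raise ValueError(f"{opt.name}: illegal value {value!r}")
        result[opt.name] = value
    for opt in options.values():
        if result[opt.name] == "on":
            for dependent in opt.enables:
                result[dependent] = "on"
    return result

# Hypothetical options a developer might ship alongside an application.
options = {
    o.name: o for o in [
        Option("port", "TCP port to listen on", "80", valid_port),
        Option("ssl", "Serve over SSL", "off", on_or_off, enables=["ssl_port_open"]),
        Option("ssl_port_open", "Open the SSL port", "off", on_or_off),
    ]
}
result = apply(options, {"ssl": "on"})
print(result)
```

A GUI or an installer could be generated from the same table, and the application itself would not have to change how it reads its config file.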
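For what it's worth, a sketch of what that could look like from the Python side. The setting names are invented, and note the obvious catch: executing a config file means it can run arbitrary code, and emptying the builtins as done here is not a real sandbox.

```python
# A hypothetical config file, held in a string to keep the example self-contained.
config_source = '''
# Comments are ordinary Python comments.
server_name = "www.example.org"
port = 8000 + 80          # values can be computed
virtual_hosts = {
    "mail": {"port": 8025},
}
'''

def load_config(source):
    """Execute config text and return the names it defines as a dict."""
    namespace = {}
    # Top-level assignments land in `namespace`; builtins are emptied only to
    # discourage stray calls, not to make untrusted input safe.
    exec(source, {"__builtins__": {}}, namespace)
    return namespace

config = load_config(config_source)
print(config["server_name"], config["port"])
```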
You're right about the registry, but a better designed system can give you the best of both worlds. Mac OS X's plist files are human readable and writable, easily accessible via a standard API (1 line of code to get or set any property), and do not involve a single point of failure.
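To show how little code is involved, here is a sketch using Python's standard plistlib module rather than the Mac OS X API mentioned above. The keys and values are made up.

```python
import plistlib

# Hypothetical application settings.
settings = {"DefaultBrowser": "mozilla", "Port": 8080, "Hosts": ["www", "mail"]}

xml_bytes = plistlib.dumps(settings)   # XML property list, human readable
loaded = plistlib.loads(xml_bytes)     # back to plain Python objects

print(xml_bytes.decode("utf-8"))
```

The output is ordinary XML text, so it can still be fixed by hand in vi, and each application keeps its own file instead of sharing one big database.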
How to solve most of our problems: 1.Lots of nuclear plants. 2.Cure aging.
Yes, and the discussion went to the next step and started talking about possible APIs. The two XML parsing models (DOM for an in-memory data structure and SAX for streaming access) are well understood and, most important, already implemented in innumerable existing tools.
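A quick sketch of the two models using the parsers in Python's standard library. The config document and its element names are invented for the example.

```python
import xml.sax
from xml.dom import minidom

# A hypothetical config document.
doc = '<config><option name="port">80</option><option name="host">www</option></config>'

# DOM: parse everything into an in-memory tree, then navigate it.
dom = minidom.parseString(doc)
dom_names = [el.getAttribute("name") for el in dom.getElementsByTagName("option")]

# SAX: the parser streams through the document and calls back per element.
class OptionNames(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        if name == "option":
            self.names.append(attrs.get("name"))

handler = OptionNames()
xml.sax.parseString(doc.encode("utf-8"), handler)
sax_names = handler.names

print(dom_names, sax_names)
```

For config files of the usual size DOM is probably the more convenient of the two, since the whole tree fits in memory easily.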
Wrong! Wrong! Wrong! You haven't read the spec and are obviously making assumptions based on what you've heard from others who haven't read the spec either. XML Schema allows not only the specification of integer, string, float, date, etc. values, but integer subsets (greater than and less than) and string patterns (through regular expressions). And RELAX-NG can use XML Schema's datatype support as well.
Do you have something that does validate IP addresses for example or were you just complaining about how one thing doesn't (even though it does) without showing anything that will solve the problem. If you say regular expressions, I must remind you that XML Schema supports regex matching of attributes and element data.
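To make the regex point concrete, here is a sketch of an IPv4 pattern checked with Python's re module. The same kind of expression could go into an XML Schema pattern facet, though the two regex dialects are not identical, so treat this as an illustration rather than a drop-in schema fragment. XML Schema patterns match the whole value, which is why fullmatch is used.

```python
import re

# One octet: 0-255 without leading zeros.
OCTET = r"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"
# Four octets separated by dots.
IPV4 = re.compile(rf"{OCTET}(\.{OCTET}){{3}}")

def is_ipv4(text):
    """True if the whole string is a dotted-quad IPv4 address."""
    return IPV4.fullmatch(text) is not None

for candidate in ["192.168.0.1", "256.1.1.1", "1.2.3"]:
    print(candidate, is_ipv4(candidate))
```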
Could you please enumerate these non-compliance issues? An encoding issue? Bugs in DOM or SAX support? What? I wonder about the vitriol of your argument without any specifics. Especially since I have been using both parsers that you mention (and a few others) without any noticeable issues associated with the standard. Any problems usually had to do with parser extensions to the standard, which are fairly simple to avoid.
You're right! That is a big issue. Unfortunately for your argument, it is not an issue with the leading parsers that you mentioned yourself: Xerces and MSXML. The default parser for Java (Crimson through JAXP), the default parser for Windows (MSXML), the default parser for Perl (XML::Parser I think...haven't used it yet) to name a few ALL handle entities. This is like saying that there are many web servers out there that don't support CGIs with the obvious intent of dissuading use of web servers.
I agree, but you're missing an important aspect here. INI files are usually one and at most two levels deep. They do not handle hierarchical structures well at all! Let's look at a snippet from a default Apache configuration file.
What do we see? Hmmm... Because the key/value pair format doesn't allow hierarchy, they fell back to something that looks a bit...er...XML-like. And it isn't key=value, it's key[whitespace]value. Yet another issue: one file has the equal sign and others don't. I thought we were aiming for consistency? And what if the parameter is a message and that message has a newline character? A '\' at the end of each line? A '\n' where appropriate? These are problems that XML has solved.
And let's not forget Unicode support. We English speakers may have a hard-on for English, but can't you imagine a case where a program intended for a non-English speaking audience would want -- if not the names of the parameters themselves -- parameter values in character sets other than ISO-8859-1 (Latin1)? And before you rebut with UTF-8, do you want to write the UTF-8 translator? What about the other codesets? Are you strictly limiting to UTF-8? This is another area where XML parsers keep it simpler for the programmer. Transcoding is done for you and is most likely better implemented and supported than what you or I would come up with.
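A small sketch of what "transcoding is done for you" means in practice, using Python's standard minidom parser. The element name and message are invented; the same text is stored once as Latin-1 and once as UTF-8, and the application code never deals with either encoding itself.

```python
from xml.dom import minidom

message = "Gr\u00fc\u00dfe"  # "Grüße", written with escapes to keep this file ASCII

# The same hypothetical config value, saved in two different encodings.
latin1_doc = ('<?xml version="1.0" encoding="ISO-8859-1"?><motd>%s</motd>' % message).encode("iso-8859-1")
utf8_doc = ('<?xml version="1.0" encoding="UTF-8"?><motd>%s</motd>' % message).encode("utf-8")

def read_motd(raw_bytes):
    """Let the parser honour the declared encoding and hand back Unicode text."""
    dom = minidom.parseString(raw_bytes)
    return dom.documentElement.firstChild.data

from_latin1 = read_motd(latin1_doc)
from_utf8 = read_motd(utf8_doc)
print(from_latin1 == from_utf8)
```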
And when you have multiple "Files" definitions? Well, multiple blocks right? The "Order," "Deny," and "Satisfy" parameters are associated with the "Files" definition, yes? So why aren't you associating a group of "Files" definitions together? Because that would require further hierarchy? Would you end up with something like this?
Set up a relatively complex configuration where you have virtual hosts and those virtual hosts have behavior different from the default. What's that? You include them in tags by host? Tags that look somewhat like XML anyway you say? Imagine that!
And what about the comments? For some formats, the comment is the ';' character. For others it is the '#'. And none of which I am aware differentiate between implementation comments and actual directive comments. For example, the comment above describes the configuration directive so that any configuration editor could conceivably read in that info and display it as help to the user (assuming that all comments were kept correctly before the directive and not after or on the same line). But what if you wanted to make comments about your specific installation? You are making comments so that others in your working group can see why you made certain changes to the config file, right? So how are they differentiated from the primary directive comments? This is the type of problem that XML namespaces are intended to solve -- a way of giving demarcation points to distinct pieces of (sometimes) unrelated data or simply the clean combination of multiple schemas so that your schema definition does not bloat to immeasurable levels and different schemas can easily share with one another.
Something keeping back Linux/BSD/UNIX is the stubborn, 1337 coders who spent all day figuring out a config file that some joker on the net thought would be a keen config format but forgot to comment. Let's face it, Apache is not the norm. Config parsers are a known problem with known solutions. Writing a config file parser is not the primary focus for most programmers out there. The config file is a necessary evil to them. Let's make it easier and just say, here's your DOM interface. It's well documented, works well for the limited dataset with which you are working (config files are usually less than 50K), will handle any configuration organizational type you want (hierarchical, flat, etc.) and will save you the time of writing a parser yourself.
Have a nice day.
- I don't need to go outside, my CRT tan'll do me just fine.
We will now hear from all the people who insist that "they have to do weird thing X because it's (traditional) (k00l) (needed to support their 386 machine)". They're wrong. Someone needs to be a hardass and fix this thing.