Expanding the use of XML in Linux?
elemur asks: "I was wondering if there are any projects to expand the use of XML in Linux? There are a lot of areas where XML could be more easily and consistently used than continuing to make more and stranger configuration files. Many configs could probably fall under a generalized standard application config DTD, and applications that needed something more targeted could supply their own. Some sort of DTD repository could be set up on the machine to handle this. Then, apps just need to use libxml (or whatever it would be called) to handle the reading and parsing. It would seem to make things much more consistent. Has anybody looked into this sort of thing?" It's a good thought. And a standardized configuration file format might be the thing to reduce some of the complexity most folks find in Linux. What do you all think about the capabilities of XML?
Update: 09/29 04:03 by C : Screwtape submitted this tidbit "I just saw this on MozillaZine and I'm quite impressed. Somebody has taken the XML parser from Mozilla, and written software that makes it work like an xterm - but with extra features. For example, you can write a replacement for ls where all the filenames are hyperlinks to the actual files. The site is here. "
Linux (and to a lesser degree Unix) are great because they are virtually pure meritocracies. Code has traditionally been adopted into the standard tool set because it is useful, not because it is marketable. The day the Linux community starts (more or less) collectively agreeing to and announcing paths chosen by buzzword, Linux will have the same future as Windows. The right way to do this is to let some enterprising XML advocate create a distro with XML-based configuration files and see if it catches on. A few months ago, Linux Today posed the question, "Should the kernel be rewritten in Java?" The answer is, of course, "If someone wants to!"
Here is some info on XML:
1) Scientific American - XML and the Second-Generation Web
http://www.sciam.com/1999/0599issue/0599bosak.html
2) XML Overview
http://xdev.datachannel.com/directory/xml_101/xml_overview.html
XML Overview
XML is soon to be the lingua franca for open information exchange on the World Wide Web. XML can be used for business-to-business transactions as well as for information delivery directly to a consumer (via a browser or the like). XML is all about information although XML per se is not concerned about how the information is displayed to the human reader.
What is so important, then, about XML? XML provides a significant advance in how data is described and exchanged by Web-based applications using a simple, flexible standards-based format. Hypertext markup language (HTML) enables universal methods for viewing data; XML provides universal methods for working directly with data. XML is a subset of Standard Generalized Markup Language (SGML) that is optimized for delivery over the Web. It is defined by the World Wide Web Consortium (W3C), ensuring that structured data will be uniform and independent of applications or vendors. XML interoperability kick-starts a new generation of business and electronic-commerce Web applications.
The power and beauty of XML is that it maintains the separation of the user interface from structured data, allowing the seamless integration of data from diverse sources. Customer information, purchase orders, research results, bill payments, medical records, catalog data and other information can be converted to XML on the middle tier of a three-tier enterprise IT architecture, allowing data to be exchanged online as easily as HTML pages display data today.
Data encoded in XML can then be delivered over the Web to the desktop. No retrofitting is necessary for legacy information stored in mainframe databases or documents, and because HTTP is used to deliver XML over the wire, no changes are required for this function.
XML is valuable to the Internet as well as large corporate Intranet environments because it provides interoperability using a flexible, open, standards-based format, with new ways of accessing legacy databases and delivering data to Web clients. Applications can be built more quickly, are easier to maintain, and can easily provide multiple views on the structured data.
As well, for any company adopting XML to extend their offerings, XML:
Is open, easy, and flexible to systems and application developers.
Can turn a Web-based enterprise into the ultimate database of databases.
Is platform and systems independent.
It operates in a structured data environment with seamless interchange capabilities.
Sits on top of HTTP or IP and provides the protocol and language for exchanging data or information.
Improves efficiency at the browser presentation tier and is an effective content storage mechanism.
Is a data integration technology that allows distributed systems to exchange and manipulate only required information.
Helps manage large repositories of content that are both addressable and structured.
Provides a uniform method for describing and exchanging structured information.
Allows every piece of information to be accessed in a Web XML-based infrastructure that has its own URL.
This library is used in the GNOME project by a number of programs, including gnumeric, gill, dia, libglade, etc.
It offers the following:
Documentation and code can be found from the libxml home page
As for keeping DTDs around, I have been thinking about that: I need a DTD cache that records the URL and System ID association. This would be welcome.
I'm not an AC !
Daniel
Daniel.Veillard@w3.org
XML means an infinite set of yet undefined HTML-like languages with no semantics whatsoever. With a proper DTD you have one language from this XML set of languages. You still have no semantics, except what you write down in the DTD as human-language comments.
With a style sheet, either XSL or CSS, you can define a semantic transformation where a syntactic element of any XML family language, like a tag , is given a semantically meaningful visual presentation. Instead of using XML you could as well use plain HTML 4.0 with CSS and tags like or to get the same result. XML gives you no benefit at this level.
There exists no mechanism for specifying semantics for a program which intends to use your DTD-defined language as the form of its own configuration files, or for whatever purpose. Implementing those semantics is as much hard manual work as with any other non-XML syntax.
XML is a hoax. It is actually nothing but a free licence to invent an infinite number of your own "" and just pray that somebody will agree with you on their meaning.
Of course the world can negotiate standard XML DTDs with standard semantics, and standard software to interpret them. But agreeing on them and getting the semantics implemented is exactly the same problem as it would be regardless of their syntactic family ties. The computing world is full of different formal languages, and their syntactic variance is not a big deal. Finding a good syntax for an application domain is important, and implementing the semantics is the challenge. Just _parsing_ simple specialized syntaxes for configuration files or application-specific data is trivial with modern tools anyway.
I could not care less, if the data I have to parse is comma separated fields, name=value pairs or XML compliant value.
XML is vapor, a hoax, but XSL is the most horrible thing I have ever seen in my career. I won't go into details here.
Anssi Porttikivi
app@iki.fi
currently mostly a html/http/Tcl/Oval/database programmer
While I agree this can definitely be another good way to use XML; I'm not sure everyone will be willing to abandon the ASCII config file formats they have been using for a very long time, and move to an XML-based configuration registry. But something like this has to be done sooner or later...
Why? Using XML for configuration files doesn't actually buy you anything. All XML is is a way of concisely describing the format of a file -- but the data (and more importantly the data's semantics) in each configuration file will vary between programs just as much as they do now.
If you proposed that all configuration files be written as Lisp s-expressions, people would look at you funny because they could easily see that that doesn't magically win -- but mumbling the phrase "XML" seems to escape those filters, even though it's just s-exps with typed parentheses.
[I'm not kidding about the s-exp thing, either. I started to write a little system using XML and XSL to transform some of my docs into HTML, flat text, and LaTeX forms. But when I realized (thanks to Erik Naggum) that the XML document was just a way of serializing a tree structure I changed tacks. Instead, I stored my docs as Lisp s-exps and then was able to use Common Lisp instead of XSL to write the tree-walker and was done in a quarter of the time the other way would have taken.]
In any event, the thought that might bring real benefit is the idea that you can build a hierarchical configuration structure where apps can acquire config data from the containing configuration classes. (It shouldn't be a registry tree, though, because an application might rightfully belong to several disjoint classes -- you'd want a registration directed acyclic graph.) Defining XML DTDs would be a convenient (but not essential) way of creating a lingua franca.
In essence, configuration classes would correspond to OO classes, and the configuration acquisition tree would be an inheritance graph. This could be a huge win if the graph were well-designed, because each piece of config data would be kept in one place, and program configurations would automatically adjust when that unique datum was changed. However, if the tree were poorly designed then you would lose big in the same ways that bad OO designs suck hard -- configurations would be very brittle with lots of interapplication dependencies and needed information would be scattered all over the place.
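To make the inheritance-graph idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the class names, the keys, and the "nearest definition wins" rule are hypothetical, not part of any existing registry.

```python
from collections import deque

# Hypothetical configuration classes arranged as a DAG: each class names its
# parents, and an application may belong to several disjoint classes.
CONFIG_CLASSES = {
    "site":        {"parents": [], "values": {"smtp_host": "mail.example.org"}},
    "mail-client": {"parents": ["site"], "values": {"smtp_port": "25"}},
    "news-client": {"parents": ["site"], "values": {"nntp_host": "news.example.org"}},
    "mail-news-app": {"parents": ["mail-client", "news-client"],
                      "values": {"editor": "vi"}},
}

def lookup(classes, name, key):
    """Breadth-first search from `name` up through its parents.

    The nearest class that defines `key` wins; KeyError if nobody defines it.
    """
    seen = set()
    queue = deque([name])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # a class reachable by two paths is visited only once
        seen.add(current)
        node = classes[current]
        if key in node["values"]:
            return node["values"][key]
        queue.extend(node["parents"])
    raise KeyError(key)
```

Changing smtp_host once in "site" changes what every descendant sees, which is the win described above. The brittleness shows up just as quickly: rename or move that key and every application that silently inherited it breaks.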
My personal belief is that the registry approach would tend towards the "huge lose" case -- incrementally designing a good OO architecture requires aggressive refactoring, and refactoring configuration information among dozens of software projects that aren't even necessarily aware of each other would be an interesting problem in change management.
So I see having to manually adjust individual configuration files as the price we pay for letting each project develop independently of the others.
That's a great idea. There are several places XML is being used rather efficiently already. I know that the Gnumeric spreadsheet program already stores its files in the XML format. This, to me, is one of the great things about XML....
Werd.
If this isn't defined, one of the valid options is for us to attach an electrode to your chair and electrocute you :-).
Why not
That is an interesting idea; two problems:
The tools presently available are generally parsers, and have nothing to do with the grotty work of file locking and error detection/correction.
As such, it would represent a useful inclusion into Linuxconf.
At some future time, when there is actually some useful configuration information managed in an XML repository, and when there is a scheme not only to read, but also to reliably write, XML, it would then prove to be a useful inclusion.
Until there is something of comparable functionality to libPropList for both read and reliable write, I'll remain skeptical of the usefulness of XML for storage of configuration information.
If you're not part of the solution, you're part of the precipitate.
If there is a lot of data, as might be the case for things like mail routing tables, there is also merit to having a random access mechanism so that the data doesn't have to either be stored in memory or parsed repeatedly.
This is one of the merits of the CDB system; it provides a "binary" form that is rabidly fast but which also can be rewritten from scratch with exceeding rapidity.
Approach that supports both needs:
The two merits of CDB in this regard are that:
I once "compiled" a file into hashed form, and got about a million keys inserted in 17s on my PPro box.
There is no temptation to change the binary form, as you can't modify what has already been written out to it.
This means that the text form stays as the true data source.
Noticing that the system needs to "recompile" is the one "problem issue" here; it is not exacerbated by this approach as it would be equally true for a purely text-oriented scheme.
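A rough sketch of the "text is the source, binary is compiled from it" arrangement, in Python. Note the assumptions: Python's standard library has no CDB module, so sqlite3 stands in for the indexed binary form here, and the file format ("key value" per line, # for comments) and all function names are made up for the example.

```python
import os
import sqlite3
import tempfile

def compile_table(text_path, db_path):
    """Rebuild the binary lookup file from scratch out of 'key value' lines."""
    rows = []
    with open(text_path, encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blanks and comments
            key, _, value = line.partition(" ")
            rows.append((key, value.strip()))
    # Build into a temporary file in the same directory, then rename it over
    # the old one, so readers never see a half-written database.
    directory = os.path.dirname(os.path.abspath(db_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        conn = sqlite3.connect(tmp_path)
        conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
        conn.executemany("INSERT OR REPLACE INTO kv VALUES (?, ?)", rows)
        conn.commit()
        conn.close()
        os.replace(tmp_path, db_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

def needs_recompile(text_path, db_path):
    """True if the binary form is missing or older than the text source."""
    if not os.path.exists(db_path):
        return True
    return os.path.getmtime(text_path) > os.path.getmtime(db_path)

def lookup(db_path, key):
    """Random-access read of one key without parsing the text file."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    finally:
        conn.close()
    return row[0] if row else None
```

Nothing ever edits the binary file in place; it is only ever thrown away and regenerated, so the text form stays the true data source.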
If you're not part of the solution, you're part of the precipitate.
Yes, indeed, it would be a monumental effort to get everything in /etc rewritten to use a set of XML-based data files.
The big deal is not merely that of getting something working, but also to ensure that a robust system results. As you say, "what if ... it is corrupted or incorrect?" The XML standard provides no guidance here, and there are liable to be three answers, not one, which will muddy the waters further.
If you're not part of the solution, you're part of the precipitate.
Are there some that do reliable write, e.g. with file locking, backups, and automated backout if it encounters errors?
If you're not part of the solution, you're part of the precipitate.
However. It is not all fine and dandy.
The "configuration problem" has not one issue, but several:
XML represents Yet Another Format; it is of value if it pushes out some of the existing formats. If it merely augments the population with another, there is no win here.
Result: Ambiguous. XML might provide value.
The issue here is that you need to ensure that the configuration is written out correctly.
This may require writing out the new config to a new file, validating that it is readable and correct. (Oops, made a mistake updating /etc/inet.d. Now the system won't reboot...)
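For what it's worth, the write-validate-rename dance looks roughly like this in Python. This is only a sketch under stated assumptions: "validating" here means nothing more than checking that the file is well-formed XML, the .bak backup name is invented, and there is no file locking at all, which a real tool would need.

```python
import os
import shutil
import tempfile
import xml.etree.ElementTree as ET

def safe_write_config(path, new_text):
    """Write new_text to path only if it parses; keep a backup of the old file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".new")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(new_text)
            tmp.flush()
            os.fsync(tmp.fileno())      # make sure the bytes are on disk
        ET.parse(tmp_path)              # raises ET.ParseError if not well-formed
        if os.path.exists(path):
            shutil.copy2(path, path + ".bak")   # hypothetical backup name
        os.replace(tmp_path, path)      # atomic rename over the live file
    except BaseException:
        # Back out: the live config is untouched, just drop the new file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The live file is only ever replaced by something that has already been read back successfully, so a botched edit leaves the old configuration in place.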
There is merit to having a "database form" ala IronDoc where the physical representation is a database system, which provides a somewhat different persistence model than the typical text file.
(Before people start proposing that I be shot, I tend to favor the notion of, if using a binary format, synchronizing it carefully with a text format.)
The merit of a "databased" scheme, which should provide a separate database for each facility, is that updates can be implemented "instantly" without needing to rewrite a whole file, and without a need to parse the file. Note that even in a situation where XML is used as an interchange format, there is still merit to storing the "tree" in database form. David McCusker, author of IronDoc and architect of the (regrettably failed) "Bento" database system that was part of OpenDoc, suggests this very use for IronDoc.
For those that feel religious about using text files, a system like libPropList still has merit over the "let's do something with XML" idea since it has, already debugged, the locking, parsing, and config-file-rewriting code that "let's use XML, it's k001" doesn't inherently provide.
In short, deciding to use XML merely establishes a format; it does not resolve that:
Michael Stonebraker (of fame with such developments as Ingres and Postgres) has most recently founded a company called Cohera based on the Mariposa Distributed Database Management System. This tool allows many databases to work together to process queries.
The "obvious" implication of this with this thread is that a valuable thing to be able to do is to join together many "databases" that are configuration repositories, and provide a central way of getting at the data.
The critical thing that is necessary is for configuration repositories to provide some sort of "metadata" so that they, in effect, publicize their existence.
A "federation" tool like Linuxconf, Ganymede, or such, can then be used to join together the metadata and manage it all together.
Unlike the situation with the infamous Windows Registry, this doesn't force all the configuration data into one fragile binary DB; it allows the data to stay wherever it was concluded that it should reside.
The critical factor here is not that data files all have a common format; it is that there be some way of translating their data into a common format.
XML has a lot to offer here in terms of providing a central "presentation" format. It could offer more if tools were available to make this a two-way street, where updates done to the central XML could be pushed back to the individual configuration data repositories.
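Here is a small Python sketch of the two-way street for the simplest case, a name=value file. The element layout (a root with item elements carrying an id attribute) and the function names are assumptions for illustration only, not an existing standard or tool.

```python
import xml.etree.ElementTree as ET

def keyvalue_to_xml(text, root_tag="config"):
    """Translate name=value lines into a common XML presentation format."""
    root = ET.Element(root_tag)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # comments are dropped; see the note below
        key, _, value = line.partition("=")
        item = ET.SubElement(root, "item", id=key.strip())
        item.text = value.strip()
    return ET.tostring(root, encoding="unicode")

def xml_to_keyvalue(xml_text):
    """Push (possibly edited) XML back out in the native name=value format."""
    root = ET.fromstring(xml_text)
    lines = [f"{item.get('id')}={item.text or ''}" for item in root.iter("item")]
    return "\n".join(lines) + "\n"
```

The catch is visible even in this toy: comments and layout in the original file do not survive the round trip, which is exactly the sort of thing that makes the return path the hard half.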
However. If someone writes some integration code to (say) connect Linuxconf to libPropList so that it could directly manipulate libPropList files, that would also represent a movement in the right direction.
Conclusion: XML may have value to offer in confederating config information.
That has to come along with a whole lot of coding effort to build robust configuration data repositories that may or may not use XML.
If you're not part of the solution, you're part of the precipitate.
Someone else mentioned the likelihood of, say, a /usr/lib/sgml-like directory where system-wide XML configuration files would be. An alternative could be /etc/sgml, /etc/xml, /etc/config, whatever.
If you wanted some apps to be remotely configured and some locally configured, create an /etc/sgml/remote and an /etc/sgml/local and update everyone's $XML_CONFIG_PATH appropriately.
The point is that if we establish a standard system-wide configuration path, it would be pretty easy to do what you're describing using traditional remote FS techniques.
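A lookup along such a path could be as simple as the Python sketch below. $XML_CONFIG_PATH is the variable proposed in this thread, not something any system defines today, and the fallback directories and the function name are likewise assumptions.

```python
import os

def find_config(name, env=None):
    """Return the first file called `name` found along $XML_CONFIG_PATH."""
    env = os.environ if env is None else env
    # Hypothetical default: local overrides first, then the remote-mounted tree.
    default = os.pathsep.join(["/etc/sgml/local", "/etc/sgml/remote"])
    search_path = env.get("XML_CONFIG_PATH", default)
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # ignore empty entries such as a trailing separator
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

Since the remote directory can be an ordinary NFS mount, the application does not need to know or care which entries are remote.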
In addition to the other poster's comment that you can use &lt; to represent < and &gt; to represent >, you can also simply use the "Extrans" posting method (versus "Plain Old Text"), which automatically does these conversions for you.
Like another poster said, so long as they have a basic grasp of the XML libraries that exist, it's just a matter of calling the library's parse and write functions to read and create your own application-specific XML files.
Learning XML would most certainly be a huge advantage, but it's hardly necessary. It's pretty easy for a non-XML-savvy person to edit an XML configuration file. It's not too difficult to figure out.
A libxml already exists. In fact, most languages already have some sort of library support for XML.
The support is there, the libraries are there. It's just that there's no "config file standard" yet and developers haven't really looked into it.
Not really.. With the XML wrappers out there, it'd be fairly easy to parse in a file, and get the data from it. For general configuration data, this would be rather simple, because it's normally just key=value pairs..
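As a rough illustration of how little code the key=value case takes, here is a sketch using Python's standard xml.etree.ElementTree. The layout (one child element per setting under a root element) and the function names are assumptions for the example.

```python
import xml.etree.ElementTree as ET

def load_settings(xml_text):
    """Return {element name: text} for the direct children of the root."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}

def dump_settings(settings, root_tag="config"):
    """Write a flat dict of settings back out as XML text."""
    root = ET.Element(root_tag)
    for key, value in settings.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")
```

Usage is one call each way, e.g. load_settings("<config><username>Ethan</username></config>") gives a dict with a single "username" entry.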
-- I'm the root of all that's evil, but you can call me cookie..
One project that will eventually come out of the Jabber IM project is JNX, which would be expanding certain *nix functionality with XML data storage/routing. This actually goes one step further, and allows any program to transfer data to any other, by sending XML 'streams' of data. With the right transports, one could extend it further by having a 'Configuration Repository' for the system, which could actually store configuration data in something like an XML flat file, or even a MySQL database. Imagine 100 *nix desktop systems, all controlling their configuration via XML and a centralized database. These are some of the things that we will be looking at within the next year or so. www.jabber.org
-- I'm the root of all that's evil, but you can call me cookie..
The version of awk that handles XML data is called Perl. I thought you'd heard of it :)
And then there's sgrep.
Damn all these tools for moving us into the 20th century.
perl -e 'print scalar reverse q(\)-: ,hacker Perl another Just)'
Matt. Want XML + Apache + Stylesheets? Get AxKit.
The game project I'm working on uses XML for its network protocol, its database stuff, and (some) configuration scripts. We've explored a lot of the available parsers and run into some issues but no major sticking points. The two key issues:
Streaming - To use XML across the web it is nice to be able to stream XML packets (e.g., object definitions) and collect them client side and make use of them in real time. None of the current parsers provide this adequately, although several are working on it. We had to develop our own library for streaming this stuff (libAtlasWF). It's focused mostly on real time 3D information transferral, customizable by receiver to filter out unneeded information. It's generic enough to be useful for a wide range of applications, though we're using it for game systems.
Binary - A major requirement for games (and other applications) is binary formats for performance reasons. This was a major argument against XML until we realized that the XML tags (and lots of the data) could be rendered in binary simply by replacing tags with particular bytecodes and such. Probably not as compactly efficient as a custom binary code, but extraordinarily flexible (e.g., develop in ASCII XML, then just flip a switch to go to performance-oriented binary, and redefine binary tags as needed).
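To show what the streaming requirement means in practice, here is a sketch using the incremental XMLPullParser from Python's standard xml.etree.ElementTree. It is not libAtlasWF and has nothing to do with the Atlas wire format; the "object" element name and its attributes are invented for the example.

```python
import xml.etree.ElementTree as ET

def stream_objects(chunks, tag="object"):
    """Yield the attributes of each completed <object> as data arrives.

    `chunks` is any iterable of text fragments, e.g. reads from a socket;
    a fragment may end in the middle of a tag.
    """
    parser = ET.XMLPullParser(events=("end",))
    for chunk in chunks:
        parser.feed(chunk)
        for _event, elem in parser.read_events():
            if elem.tag == tag:
                yield dict(elem.attrib)
                elem.clear()  # drop children and text we no longer need
```

Feeding it '<stream><object id="1"/><obj' and then 'ect id="2"/></stream>' yields the first object as soon as its tag closes, without waiting for the end of the document, which is the behaviour a real-time client wants.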
We're calling this real-time, 3D, binary-ready protocol "Atlas". We'd love your input (and help) in bringing it to any application that could use it. We envision client applications that understand XML-Atlas and can communicate with any server talking in this language, and a variety of specialized servers doing the same.
Here's some links to information about Atlas: Atlas Version 1.0, and WF Protocols
Bryce
It seems the poster wants XML introduced into all applications - not really the Kernel. I don't know many places where the Kernel would benefit from XML - except for the one configuration file the code itself wouldn't need XML.
I think what is being addressed is more an application issue. That would need to be addressed to many different vendors simultaneously. For instance: Apache config files, WuFtpd/ProFtd/etc files, Gnome/KDE config files...
I could see a lot of reusability for a config file parser for application development. But it would seem like the development tools would need this XML ability and not just everyone using XML. But I imagine that C/C++, Perl, Python, Java, etc already have XML parsers/creators. So really people are waiting for developers to embrace XML.
Does anyone now embrace XML? What are the advantages of XML over other config file parsers? Are there other standardized config file parsers? I know I've written my share of wheels in different languages for parsing config files.
Joseph Elwell.
IMHO, the *real* win of XML is not in replacing plain-text configuration files, but rather in replacing binary file formats and simple databases. One example would be word-processor file formats, which are usually a poorly documented, poorly structured binary mess. XML makes sense there. Another example is the Windows registry.
But XML is a significant *disadvantage* for the plain-text configuration files that dominate the Unix world. Generally, those plain-text files have a comment mechanism to clearly explain what needs to be edited. Adding XML will just add a bunch of unnecessary tags that will make it difficult to hand-edit configuration files - and the ability to easily hand-edit the human-readable configuration files is one of the most powerful advantages of the Unix Philosophy.
What I would like to see in the Unix/Linux world is a GUI that is capable of managing the entire OS via simple graphical tools, but doesn't *prevent* hand-editing of configurations. I don't see XML as a significant boon in that regard.
---
Maybe that's just the price you pay for the chains that you refuse.
Hand me that airplane glue and I'll tell you another story.
Don't know much about XML, but it would be one more thing a person would have to learn before they could begin to write an app worth using. Could go either way: it should improve the software that is developed, because you would have to know all of the tools. But it might also deter some good programmers who don't want to learn, don't have time to learn, or are too stubborn to learn a new way of doing things. It might still be easier to parse your own config rather than including a library and learning how to use it.....
Funny and I thought Perl == Paid employment recently located
I've been working on ideas for a new Linux distribution (yeah yeah, flame on) that's based around CORBA and XML: backend objects convert a normal program's configuration files into XML and back again, communicating with whatever frontend program you want to write (a java applet, a command-line program, a GUI, whatever). So Apache's httpd.conf gets rendered into XML, the XML gets edited (by whatever means), then the edits get handed back and the backend program converts it back into an httpd.conf file.
There are three specific advantages to this, as opposed to making every program use XML for its configuration:
1. Some programs are just optimized for a particular parsing style. Apache's needs differ from sendmail's, which differ from inetd's. There's no reason each program needs to use XML internally.
2. It's backward compatible. My hypothetical distribution wouldn't need to make major patches for every new release of every daemon in existence. It's also forward compatible, because it doesn't require new daemons to use XML either.
3. It doesn't piss off the "tomsrtbt and vi at 3 am" people, who either by necessity or choice want to be able to hack directly on configuration files.
Does this turn into a "Windows Registry"? No. The only differences between Linux and the Registry are that the Registry is binary, and that it's all in one place -- Linux configurations are scattered all over. This method lets you edit all configuration files using a single tool, but it doesn't give up the ability to hand-edit files if necessary.
The other application XML has in this distribution idea is for package management. Instead of a proprietary format (RPM, DEB, whatever), we have an XML "manifest" or "spec file" in a tarball along with the relevant files; the XML file has a well-defined filename, and tarballs can be created anywhere, without installing any additional software. This doesn't burden software developers with the need to make special arrangements for binary distributions; by including such a spec file with their normal source distribution, they've effectively created an "SRPM" without any additional effort. If they want to package up binaries as well, more power to em; otherwise, someone else can.
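A rough Python sketch of the tarball-plus-manifest idea follows. The manifest filename, the element and attribute names, and both function names are hypothetical; nothing here corresponds to an existing package format.

```python
import io
import tarfile
import xml.etree.ElementTree as ET

MANIFEST_NAME = "MANIFEST.xml"  # hypothetical well-defined filename

def build_package(name, version, files):
    """Return gzipped tarball bytes holding `files` plus an XML manifest.

    `files` maps archive paths to their contents as bytes.
    """
    manifest = ET.Element("package", name=name, version=version)
    for path in sorted(files):
        ET.SubElement(manifest, "file", path=path, size=str(len(files[path])))
    manifest_bytes = ET.tostring(manifest, encoding="utf-8")
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        members = [(MANIFEST_NAME, manifest_bytes)] + sorted(files.items())
        for member_name, data in members:
            info = tarfile.TarInfo(member_name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def read_manifest(package_bytes):
    """Return (name, version, [paths]) from a package built as above."""
    with tarfile.open(fileobj=io.BytesIO(package_bytes), mode="r:gz") as tar:
        data = tar.extractfile(MANIFEST_NAME).read()
    root = ET.fromstring(data)
    return root.get("name"), root.get("version"), [f.get("path") for f in root.iter("file")]
```

Because it is just a tarball with one extra text file, anyone with tar and an editor can create or inspect a package without installing packaging tools first.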
Nothing worth doing is worth doing today.
> What I want to avoid though is writing a custom
> XML parser, especially if in 2 years, every
> Linux distribution is going to have one. Does
> anyone know which one that is likely to be?
Several Linux distributions already include an XML parser that could (should?) become the standard: libxml, written by Daniel Veillard (he also runs the rpmfind system) from the W3C. It is already in use by the GNOME project.
See http://rufus.w3.org/veillard/XML/xml.html
Glade does something like this. It will produce C, C++, Ada source code or an XML file describing the GTK application. With python you parse the XML and then use a build tool to build the GTK objects as they are laid out in the XML by Glade. You just have to attach the events to code that you write.
:-).
While I see a great possibility here of having a way in which the author gives the user GUI elements and a set layout that the end user can change, the implementation suffers from the same problems that almost all XML suffers from.
First, very few people understand the full scope of XML and all the adjoining technologies. The only way I understand it is putting it into the same perspective as SGML, but that doesn't cover many of the other parts. I guess most of us are waiting for an O'Reilly book on it.
Second, part of XML's power is the way it can be used for almost anything. Part of the reason for it not being universally accepted is the way it can be used for almost anything. Has anyone seen XML used the same way twice? From what I can see this has resulted in a lack of tools to deal with XML from beginning to end. Yes, you can merge a lot of tools to do some really amazing things, but it takes a lot of time and mental effort to come up with these chains of tools (much more so than other chains of tools).
Third, the resource consumption for most XML-based projects is high. High enough that it will limit the viable uses of XML in the short term. Building a large GUI from XML may increase the start time or memory consumption beyond where it is usable for some applications. Netscape already loads slowly; if that time were doubled I'd be looking to use lynx for most everything (half
Personally I think XML would be a great way to help organize all those sometimes nasty .conf files. There is one large pitfall that needs to be avoided. XML is very easy to convert to a database system. One of the main uses it has (at least in my experience) is a great way to abstract out databases in a non-proprietary format. If XML were used with wrapper classes to access conf files some people might be tempted to port those conf files into a database and change the wrapper library to a database wrapper instead.... to a naive user (how many of these are there now?... but in 5 years?) there would be no difference. What would we end up with? A Windows like registry system. I believe this should be avoided at all costs. A database system may look nice from the outside and have lots of great features, but when something goes wrong and the settings for your favorite daemon are in a corrupt database entry what are you going to do? XML will provide the organization linux conf files need and allow the existence of the flat files everyone loves, but also presents an easy step toward database (registry) systems.
Although not "fully grown", libxml will do what you want it to do. However, it has one major drawback: it is a GPL, not an LGPL, library. This means that only other GPL programs can make use of it. Since libxml is part of the Gnome libraries, this is normally not a problem as all Gnome applications must be GPL (hmmm, are LGPL libraries allowed to link to GPL libraries?). However, the question was about broad XML support, and thus libxml won't work for other licenses such as BSD, Artistic or MPL (a lot of XML utilities are licensed under MPL).
However, there are two other C/C++ alternatives. The first is expat, which is a C library that's pretty extensive, with extremely liberal licensing. The second is still beta: it's a new feature in the Qt 2.1 snapshot. The KOffice project is using this. There are, of course, dozens of Java and Perl XML libraries as well.
I don't think that you will ever see one single XML library for all purposes, but there are enough XML libraries available now in a range of Free licensing and languages that it won't be a problem.
A Government Is a Body of People, Usually Notably Ungoverned
A common dtd is good if you want to exchange the xml data that it describes among different applications. In the case of a document or spreadsheet, this is clearly a good thing. However, I don't see people developing a pressing need to exchange their configuration files between different applications. What other application could possibly have a use for a sendmail configuration other than sendmail?
If each application has its own configuration dtd, then editors can use that dtd to help the user write a valid config file. It can specify required tags, optional tags, and describe the structure of the file. Rather than using generic tags like <item id="username">Ethan</item> you can have <username>Ethan</username>, and this way the dtd can require the username tag, so you don't forget it. A common configuration dtd would be far too generic to be of much use in this.
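To illustrate what an application-specific schema buys you, here is a hand-rolled check in Python. Python's standard ElementTree does not validate against a DTD, so the REQUIRED and OPTIONAL tuples below are a stand-in for one; the tag names other than username are made up.

```python
import xml.etree.ElementTree as ET

# Stand-in for an application-specific DTD (assumed, for illustration).
REQUIRED = ("username",)
OPTIONAL = ("realname", "shell")

def check_config(xml_text):
    """Return a list of problems; an empty list means the config is acceptable."""
    root = ET.fromstring(xml_text)
    present = [child.tag for child in root]
    problems = [f"missing required <{tag}>" for tag in REQUIRED if tag not in present]
    problems += [f"unknown <{tag}>" for tag in present if tag not in REQUIRED + OPTIONAL]
    return problems
```

With the generic <item id="username"> form, every file made of item elements looks structurally valid and the same check would have to inspect attribute values instead, which is the point made above about a common DTD being too generic.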
I don't happen to be working with such an application at the moment, but I am enhancing some Linux documentation with XML-type labels. Not that it will be directly visible, as it will only be used in producing other documentation. But you can just pick a corner you like and start painting.
XML in place of the current config files would have some advantages. For one thing, it would allow use of a single high-performance parser to parse the files. The days of writing/copying a config file parser for every application would be over. Perhaps we could create a shared library to do this. (If there is any interest in a libxml.so, please let me know. It sounds like a cool project.)
It might reduce version control issues in some cases since new/unknown tags in XML can be ignored (much like unknown HTML tags are ignored). However, a well-written config file parser would do this already.
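As a sketch of the "ignore what you don't know" idea, here is a reader built on Python's standard xml.etree.ElementTree. The setting names, the defaults and the load_settings function are invented for the example.

```python
import xml.etree.ElementTree as ET

# Settings this (hypothetical) application understands, with defaults.
DEFAULTS = {"port": "8080", "logfile": "/var/log/app.log"}

def load_settings(text):
    """Read known settings from an XML config and skip everything else."""
    settings = dict(DEFAULTS)
    for child in ET.fromstring(text):
        if child.tag in settings:
            settings[child.tag] = (child.text or "").strip()
        # Any other tag is skipped, so a file written for a newer
        # version of the program still loads in an older one.
    return settings

# A file from a "newer version" that added a <compression> setting.
NEWER_FILE = """<config>
  <port>9090</port>
  <compression>gzip</compression>
</config>"""

print(load_settings(NEWER_FILE))
```

The same few lines would serve any application that keeps its settings as children of a root element, which is the reuse argument in a nutshell.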
It would probably speed up the process of creating GUI front-end configurators, since the parser/generators could be reused. An advanced user without a GUI configurator is like a fish without a bicycle, but it would be helpful to newbies and regular desktop users. The "Linux is hard to use" argument would start to go away.
There are some big drawbacks, though. The first is that tons of applications would have to be revised in order to read XML config files. In an open-source world, this means a long painful process where some developers switch to XML immediately and others wait a while. Then there is the pain of converting your customized httpd.conf/fstab/.profile/etc (bad geek pun intentional) files to XML.
Also, there is the fact that most of the cool tools in Linux are really designed for all of the Unix world. Realistically, the Linux environment can't just switch over to XML config files unless the entire Unix community does.
Maybe future apps should use XML as their config file format. I don't see our well-worn existing tools making a switch anytime soon, however.
Just my 2 cents.
Save the whales. Feed the hungry. Free the mallocs.
But the biggest win will come from minimising the proliferation of DTDs. If the community can co-operate on the development of common DTDs then the exchange of data between software agents developed by different projects will be hugely easier. By all means have different projects - both KDE and Gnome have, in my opinion, benefited from the competition between the two - but if that competition develops the sort of bitterness which reduces communication and co-operation we all lose.
I would strongly urge anyone who is developing a new software agent - whether it's a user-level application or a new daemon - which either stores data or exchanges data with other agents to seriously consider XML as a format, but more importantly should look at the DTDs that already exist to see if any will fit, and should communicate with anyone else working on related tools.
If anyone wants to look at the XML tutorial I gave at INET99 it's here
I'm old enough to remember when discussions on Slashdot were well informed.
I've established a project to produce a lightweight structure document management system using XML. Essentially it is an XML DBMS. The project is still very very young, but is growing rather quickly.
http://www.dbxml.org
The more you stray from line-based text files that you can easily call things like awk and grep on, the more you'll alienate some people. I don't believe there's a version of awk or grep that handles XML-based records. This is pretty important to tool-minded people.
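One possible way to keep the line-based tools in play, sketched here rather than offered as an existing utility: flatten the XML into one path=value line per item and pipe that into grep or awk. The path notation and the flatten function are my own invention.

```python
import xml.etree.ElementTree as ET

def flatten(element, prefix=""):
    """Yield one 'path=value' line per attribute and per element with text."""
    path = f"{prefix}/{element.tag}"
    for name, value in element.attrib.items():
        yield f"{path}/@{name}={value}"
    text = (element.text or "").strip()
    if text:
        yield f"{path}={text}"
    for child in element:
        yield from flatten(child, path)

# Made-up sample data.
SAMPLE = """<config>
  <user shell="/bin/bash">Ethan</user>
  <mail><relay>smtp.example.org</relay></mail>
</config>"""

for line in flatten(ET.fromstring(SAMPLE)):
    print(line)
```

Saved as a script that reads a file instead of SAMPLE, its output could be filtered with something like grep relay, since every record is back on a single line.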
It should be possible to write a set of filters to convert between XML and standard configuration files. Whilst this is a lot of work to begin with (i.e. a filter for each different kind of configuration file, which is pretty much one for each configuration file), if it were possible to begin with a biggie such as the .xinitrc file, that might create the momentum to convert the other applications over. Writing filters of course obviates the need for the application writers themselves to rework existing code, which may be pretty much impossible with something like X Windows.
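A rough sketch of what one such filter pair might look like, using fstab rather than .xinitrc because its whitespace-separated columns are easy to map. The element names are invented for illustration, and comments in the original file are dropped, which a real filter would have to deal with.

```python
import xml.etree.ElementTree as ET

# Element names chosen for this example; fstab itself does not name its columns.
FIELDS = ("device", "mountpoint", "type", "options", "dump", "pass")

def fstab_to_xml(text):
    """Turn fstab-style lines into an <fstab> element tree."""
    root = ET.Element("fstab")
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue   # blank lines and comments are not carried over
        entry = ET.SubElement(root, "filesystem")
        for name, value in zip(FIELDS, line.split()):
            ET.SubElement(entry, name).text = value
    return root

def xml_to_fstab(root):
    """Turn the tree back into tab-separated fstab lines."""
    lines = ["\t".join(child.text or "" for child in entry)
             for entry in root.findall("filesystem")]
    return "\n".join(lines) + "\n"

SAMPLE = """# made-up example entries
/dev/hda1  /     ext2  defaults  1 1
/dev/hda2  swap  swap  defaults  0 0
"""

tree = fstab_to_xml(SAMPLE)
print(ET.tostring(tree, encoding="unicode"))
print(xml_to_fstab(tree), end="")
```

The application keeps reading its old format; only the filter knows about XML.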
Of course people have been saying config files should be LISP S-expressions for years; maybe the hype about XML will be enough to make this idea work...
Actually XML is a bit more general than S-exps in that it allows you to pass parameters, and specify that these parameters are optional, fixed, or have default values. Of course these can be simulated in LISP, but the way XML handles them is cleaner.
Also XML has something resembling a type system in its DTDs, which seems somewhat alien to the LISP mindset...
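Here is a small illustration of required, defaulted and fixed attributes, written as an internal DTD subset and read through Python's standard xml.parsers.expat binding. The server element and its attributes are invented for the example. Note that expat is a non-validating parser: it fills in declared defaults, but it will not complain if a #REQUIRED attribute is missing.

```python
import xml.parsers.expat

# Hypothetical document: host must be given, port has a default,
# and protocol is fixed by the DTD.
DOC = """<!DOCTYPE server [
  <!ELEMENT server EMPTY>
  <!ATTLIST server
      host     CDATA #REQUIRED
      port     CDATA "80"
      protocol CDATA #FIXED "http">
]>
<server host="www.example.org"/>"""

def read_attributes(text):
    """Return the attributes the parser reports for the elements in text."""
    seen = {}

    def start(name, attrs):
        # attrs includes values defaulted from the DTD as well as
        # those written in the document itself.
        seen.update(attrs)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.Parse(text, True)
    return seen

print(read_attributes(DOC))
```

The document only spells out host; port and protocol come from the declarations, which is the "parameters with defaults" feature in action.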
Thanks, I have written a couple of XML apps myself, and know it is text based. The distinction between what I refer to as "ASCII config file" and XML config file should be fairly obvious.
People creating DTDs on their own is not the whole point of XML. People using standardized DTDs that are widely accepted by the target community is. If you see half a dozen different DTDs for Linux application configuration files, all supported by various (different) development groups, there will be added confusion in Linux maintenance and installation, which is already being used as FUD material by Microsoft and the gang. That will not be good.
The bottom line is: With XML, you're not supposed to "create DTDs for whatever task is at hand", if your application is intended to interoperate with those of other vendors, companies, etc. Industry groups have already started to take the liberty of creating DTDs on their own with the hope that their DTD may turn out to be the de facto standard in the field; and that will cause major fragmentation in the near future. It is already happening in the e-commerce area with different DTDs being pushed by Microsoft's BizTalk, RosettaNet, etc.
Zigbee Central: A Zigbee weblog
While I agree this can definitely be another good way to use XML; I'm not sure everyone will be willing to abandon the ASCII config file formats they have been using for a very long time, and move to an XML-based configuration registry. But something like this has to be done sooner or later.
People are going ahead and creating DTDs on their own, which has the potential to cause huge fragmentation in the XML arena; so before someone tries to design a configuration DTD, there must be some concerted effort to start a group within the Linux community that will work on this and other relevant XML DTDs.
Just my 2 cents.
Zigbee Central: A Zigbee weblog
I started out using XML for simple configuration files on a Java software project. Once we started to use it, we realised that it's extremely powerful and soon started finding many uses for it.
Expect to see XML cropping up everywhere soon. Microsoft (boo, hiss) is going to be using it for document exchange (XML is very good at this) in their Office products. There are rumours that M$ is already bastardising XML, rather than stick to the standards (now where have we heard that before?).
XML initially looks daunting, but really isn't too difficult to learn. There are some standard APIs being developed (SAX and DOM), at least in Java. XML and Java work very well together. I haven't used XML with Linux, so can't comment on available libraries, if indeed there are any yet.
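For a feel of how the two API styles differ, here is a minimal sketch. It uses Python's standard xml.sax and xml.dom.minidom modules rather than the Java versions mentioned above, and the document and helper names are made up.

```python
import xml.sax
from xml.dom import minidom

# Made-up document for the comparison.
DOC = "<config><username>Ethan</username><shell>/bin/bash</shell></config>"

class TagCollector(xml.sax.ContentHandler):
    """SAX style: the parser calls back as it streams through the file."""
    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)

def sax_tags(text):
    handler = TagCollector()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.tags

def dom_username(text):
    """DOM style: build the whole tree in memory, then navigate it."""
    document = minidom.parseString(text)
    node = document.getElementsByTagName("username")[0]
    return node.firstChild.data

print(sax_tags(DOC))
print(dom_username(DOC))
```

SAX suits large files you only need to scan once; DOM is handier when you want to walk around in the document or change it.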
I certainly would encourage developers to look into using XML. It certainly beats writing your own parsers and you'll soon appreciate its flexibility. HH
Yellow tigers crouched in jungles in her dark eyes.
She's just dressing, goodbye windows, tired starlings.
I'm sure that there are a multitude of uses that XML can be put to on a GNU/Linux system. However, I don't think that XML is the panacea it's touted to be. Like Java before it, and C++ before that, and C before that, and Pascal before that, and Basic before that, XML is just another tool that's been the target of a lot of hype.
Certainly, you could implement a 'registry' in XML. The question is not "can you", but "what benefit does it provide and what are the drawbacks". The question of a "Windows registry" for Linux has often been discussed in the newsgroups and the usual response is: "What happens if your registry gets corrupted?".
By all means, develop an XML 'registry' tool, and/or enhance X to include XML attributes, and/or develop a new filesystem that uses XML as its structure, but please don't expect that your development is going to replace the existing tools and features of the system. Like the ever growing variety of scripting languages, the only thing your development will accomplish is to add another choice to the mix.
Good Luck.
"values of beta will give rise to dom!"
I am working on an Open Source project called XMLTP. XMLTP seeks to standardize the transport mechanism for XML data. By taking cues from the Linux and Apache community, XMLTP.org is in the process of developing a standard way to send, receive, and execute upon XML data. By creating a common pathway, in the form of a client and server, a protocol, programming APIs, and a standard for formatting, XMLTP.org will provide a core technology to make XML more than just a web-based data formatting standard. For more information, see http://www.xmltp.org
The LinuXML Project is "devoted to changing the UNIX de facto standard for inter-process communication (IPC) and storage from line-based ASCII records to XML."
Check out the
There is indeed a version of grep for XML-based records. It's called sgrep (structured grep) and "is a tool for searching and indexing text, SGML, XML and HTML files and filtering text streams using structural criteria." It is based on the concept of regions, i.e. nonempty text substrings that are typically occurrences of constant strings, SGML tags, or meaningful text elements recognizable via delimiting strings or the built-in SGML, XML and HTML parser.
Check out the
... back when we were doing the initial work on wiring XML into perl. I managed to get hysterical storms of laughter out of the audience by flashing up 5 or 6 well-known *nix config files in increasingly ugly and baroque syntax. I forget the details now, but there was inetd.conf and fstab and httpd.conf and so on; all of which have multiple chunks of data encoded in text with a totally arcane ad-hoc set of syntax rules for fishing them out.
XML probably wouldn't actually be as easy to read as inetd.conf or fstab for someone who's used to inetd.conf or fstab, but there are those times when you pull up a conf file that's new to you and wonder what the author was smoking. Lesson - great programmers often design hideous config syntaxes.
The one that totally brought the perl conference house down was fvwm95rc.whatever.m4, I forget the exact name, which has an absolutely hair-raising melange of positional, functional, and from-another-planet syntaxes (and then had to be run through m4, fergosshakes). (Why does m4 still exist?)
Anyhow, I think history has shown that a set of textual config files is a better way to run an OS than a set of dialogue boxes or a hierarchical binary 'registry', so it would be kinda nice if there was some common syntax out there. But even speaking as a certified XML bigot, it's hard to see how to get there from here. If I can help, let me know.
Cheers, Tim Bray (tbray@textuality.com)