Domain: docbook.org
Stories and comments across the archive that link to docbook.org.
Comments · 60
-
Old tech, and limited
I just recently reviewed the landscape of document writing systems for a client.
TeX (and LaTeX, and such) are a fine choice for specific purposes. There's a lot of functionality, it's robust and widely used. If you're writing a journal submission paper, it's a good choice.
The publishing landscape has changed. There are now many more types of document (help files, web pages, books, articles, owner's manuals, laws, contracts) that people want to write, and the TeX family is inconvenient for many of them.
XML is a more comprehensive document content specification. It easily covers all of the common document types (including those for which the TeX family is useful) and is extensible in a straightforward manner.
As a specific example, DocBook (a specific XML schema) covers all cases where TeX is useful, and many more. An XML processing system can convert to any presentation format (HTML, XHTML, PDF, Microsoft Help, Text), and it's straightforward to build converters for new formats.
(There are also other XML schemas.)
The drawback of DocBook and XML in general is that installation is a nightmare. So far, there's no "one package install" that gets the author up and running. XML processing is a series of steps, with each step served by one of several open source packages. The author must choose and install software for each step, usually without any indication which is best for his purposes. This only needs to be done once, though. (For open source - paid software packages have this sorted out.)
(For example, see how long it takes you to install DocBook 5.x on a Windows system.)
The TeX family is a good choice, but if you're not already using it consider learning a more recent solution.
-
Re:DocBook is horrible
If you start selecting tags to make the output look the way you want it to look, you don't understand XML (and subsequently shouldn't be using DocBook).
Anyone who has some experience writing documentation knows that the main objective is to write beautiful and readable documentation, not to choose the right markup...
The fact that you couldn't find the tag but could find the other tags you've mentioned is just depressing, especially when those tags are most often sub-tags of a code tag block.
The CODE tag is new in DocBook 4.3. The version of jade shipped with Ubuntu 9.10 is 1.2.1 and it does not know about the CODE tag. That's another problem with DocBook: it is a moving target with a standard that moves faster than the tools that support it.
Just wait until you need to generate HTML help, Text file documentation, a web page manual, and a printed PDF of the same core documentation. The single-source design of DocBook will be much better appreciated then, if you learn how to use it.
I doubt most people who express that belief have actually tried to publish the same documentation in HTML and PDF form. DocBook produces PDF by first converting the document to LaTeX (so one is left wondering, why not use LaTeX itself in the first place?) and then using its tools to export to PDF. The result is a document as ugly and badly typeset as an O'Reilly book. The HTML output basically looks like a raw data dump of the text, like this book for example. That's underwhelming to say the least, considering that 50%+ of a DocBook document is spent writing XML markup.
If you really want to know why DocBook sucks so much, you should check out Sphinx which is a document writing system done right. For some reason, it can manage without the overly verbose XML and idiotic semantic markup and still produce high quality documents that blow DocBook's out of the water.
-
Re:DocBook - like HTML 1.0, only dumber
Not only that, it sounds like a horrible format if you need documentation to write in the documentation language. Just looking at their What is DocBook page leaves me wondering what the hell it really is...
Even how to write English is documented in English, so why do you argue that any language which can use itself to document how to make more of itself is bad?
-
Re:DocBook - like HTML 1.0, only dumber
A short look at the Docbook element reference (about halfway down the page at http://www.docbook.org/tdg5/en/html/docbook.html ) will show some of the elements that are relevant when publishing a *book*; elements for citations, bibliographies, indexing, callouts, glossaries, etc. HTML does not provide these elements.
-
Re:DocBook - like HTML 1.0, only dumber
Not only that, it sounds like a horrible format if you need documentation to write in the documentation language. Just looking at their What is DocBook page leaves me wondering what the hell it really is...
-
Re:LaTeX
And DocBook
-
You are reinventing DocBook
You are trying to reinvent docbook. Not only is everything you want done, it is implemented in several tools (XMLMind and oXygen are two I know of), has a standard method of converting it to any form you want (XSL, XSLT, XSL-FO), and there are tools that are already written to take advantage of those standards (Apache FOP being a FLOSS one). The latest version of DocBook uses XML namespaces, so you can mix in other markup languages as well; the canonical example is DocBook + MathML + SVG, which covers 99.9% of the math/science based literature out there. BTW, if you DO plan on going down this path, I suggest picking up a copy of XSLT, 2nd edition by Doug Tidwell. The latest version of the DocBook book is supposed to be out in August; don't buy the version currently on sale, it is 10 years old, and does NOT cover the current version of DocBook.
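To illustrate the namespace point, here is a minimal Python sketch of a DocBook 5 article with an embedded MathML formula, read with the standard library's ElementTree. The sample document is invented for illustration; only the two namespace URIs are taken from the published DocBook 5 and MathML specifications.

```python
import xml.etree.ElementTree as ET

# Published namespace URIs for DocBook 5 and MathML.
NS = {
    "db": "http://docbook.org/ns/docbook",
    "m": "http://www.w3.org/1998/Math/MathML",
}

# Hypothetical sample document, made up for this sketch.
SAMPLE = """\
<article xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Mixed markup</title>
  <para>The area of a circle:</para>
  <equation>
    <m:math xmlns:m="http://www.w3.org/1998/Math/MathML">
      <m:mi>A</m:mi><m:mo>=</m:mo><m:mi>&#960;</m:mi>
      <m:msup><m:mi>r</m:mi><m:mn>2</m:mn></m:msup>
    </m:math>
  </equation>
</article>
"""


def count_by_namespace(xml_text):
    """Return {namespace_uri: element_count} for every element in the document."""
    counts = {}
    for elem in ET.fromstring(xml_text).iter():
        # ElementTree spells qualified names as "{uri}local".
        uri = elem.tag[1:].split("}")[0] if elem.tag.startswith("{") else ""
        counts[uri] = counts.get(uri, 0) + 1
    return counts


root = ET.fromstring(SAMPLE)
title = root.find("db:title", NS).text        # a DocBook element
math_elements = root.findall(".//m:math", NS)  # a MathML element inside it
counts = count_by_namespace(SAMPLE)
```

Because each vocabulary lives in its own namespace, a tool that only understands DocBook can skip the MathML subtree, and a MathML renderer can pick out just the `m:math` elements.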
-
Docbook, definitely
It has exactly what you need, an html-like format, but tagged by meaning, not presentation. The project has tools to convert it to printable formats.
The spec: http://www.docbook.org/
The tools: http://docbook.sourceforge.net/
-
And on the 8th day...
...God created DocBook and Subversion.
We use DocBook and SVN to author/edit/maintain the MySQL Manual and related documentation.
Most of us working on the MySQL docs team also use oXygenXML for editing - it's neither libre nor gratis, but it's not terribly expensive, and it works well on any platform with decent Java support (one of the few Java GUI apps I've seen that really works, and works well). Handles many common XML formats including DocBook, XHTML, DITA, and TEI. You can also supply your own DTDs/schemas for custom XML formats. Includes both code and visual editing views, as well as instant validation and a built-in Subversion client. Easy to produce HTML or PDF output from XML source. Also has some nice XQuery and XSLT tools if you need them.
-
Re:Dead in the water until file format sorted
Gosh, isn't that what XML was supposed to do?
True, there's DocBook, but it doesn't seem to have been adopted very well so far.
-
Re:XML - XSLT - *
What you're describing sounds a lot like DocBook. I had difficulty getting the tool chain set up, though, so I have no practical experience using it.
-
Digital storage
I believe digitization of our entire literature is the goal. Think big.
But please don't use MS Word or something like that in the process. When things are going to be digitized, they should stay readable for many years no matter which hardware platform or software is used.
By using formats like DocBook or TEI, much future work is saved when the book has to be converted to whatever data format is then in fashion.
-
Re:What do you need that OpenOffice doesn't provid
Just started out with Vex http://vex.sourceforge.net/. It looks to be a pretty neat XML editor, based on Eclipse, with the DocBook DTD http://www.docbook.org/ built in.
I have been a longtime user of LaTeX http://www.latex-project.org/ and have found TeXnicCentre http://www.toolscenter.org/ to be a nice front end for LaTeX. I have tried word-processors, but haven't really played with OO.org long enough to understand the sectioning and styles feature. Now, I recently re-stumbled over LyX http://www.lyx.org/
I think I will stick with LyX/LaTeX till I understand DocBook better.
On a side note, I came across NaturalDocs (http://www.naturaldocs.org/) yesterday. It looks to be a neat way to generate documentation without messing up the whole thing with tags.
Now, it would be a nice idea to take all these diverse ideas and combine them together into a single tool that can work as a driver for various formats (something like GCC, which can compile multiple languages). So, you need to know 1 tool, which can parse reST, NaturalDocs, Doxygen etc. You know, the great unified theory of text processing ...
-
Re:Not so easy
I suppose you could use DocBook and then output to whatever format you like. Being SGML (or XML) it's a bit like HTML.
OOo Writer has DocBook filters as well (bit of a work in progress apparently). -
GFDL?
I'm not sure this guy really embraces the spirit of free and open source software...if he did, perhaps he would have considered licensing his book under a free license, like the GNU Free Documentation License or one of the Creative Commons licenses. I'm starting to see this with some frequency, even with books published by commercial publishers like O'Reilly and Apress.
-
DocBook
Store the manuals in DocBook and then dynamically create PDF, HTML, etc. as needed.
-
Never use a Wiki for technical documentation.
> How useful are wikis for OS projects?
Never use a Wiki for documentation! Instead, you need a documentation maintainer to handle submissions. They will ensure that your documentation is clear, complete, correct, current, and consistent. This is hard work that goes largely unrecognized by the rest of the Open Source community.
Consider your documentation maintainer a part of your team. Give them CVS privileges. Don't disrespect them because they don't contribute massive amounts of source code. Answer their questions quickly and in a friendly manner.
If they have a problem explaining a feature, it may be a usability problem with your interface. Also, users will find a bug but will complain that the manual is wrong. So documentation maintainers are a source of good bug reports. Don't ignore their input because they're not an active programmer!
DocBook is the standard markup language for the major Open Source projects, so learn it.
-
DocBook-XSL + XSL-FO + FOP
Use XSLT to transform your XML to DocBook, then use DocBook XSL to convert to XSL-FO, then Apache FOP to generate a PDF.
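As a rough sketch of that three-step chain, the Python below only builds the command lines for `xsltproc` and Apache FOP; it does not run them. All file names are hypothetical, and the location of the DocBook XSL `fo/docbook.xsl` stylesheet depends on how the stylesheets were installed.

```python
import shlex


def build_pdf_pipeline(source_xml, to_docbook_xsl, docbook_fo_xsl, out_pdf):
    """Return the three pipeline commands, in order, as argument lists."""
    docbook_xml = out_pdf + ".docbook.xml"  # intermediate DocBook file
    fo_file = out_pdf + ".fo"               # intermediate XSL-FO file
    return [
        # 1. Custom XML -> DocBook, using a stylesheet you write yourself.
        ["xsltproc", "-o", docbook_xml, to_docbook_xsl, source_xml],
        # 2. DocBook -> XSL-FO, using the DocBook XSL "fo" stylesheet.
        ["xsltproc", "-o", fo_file, docbook_fo_xsl, docbook_xml],
        # 3. XSL-FO -> PDF with Apache FOP.
        ["fop", "-fo", fo_file, "-pdf", out_pdf],
    ]


commands = build_pdf_pipeline(
    "manual.xml",                  # hypothetical input document
    "manual-to-docbook.xsl",       # hypothetical custom stylesheet
    "docbook-xsl/fo/docbook.xsl",  # assumed install path of DocBook XSL
    "manual.pdf",
)
for cmd in commands:
    print(shlex.join(cmd))
```

To actually run the chain, each list could be passed to `subprocess.run(cmd, check=True)` once the tools are installed.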
Alternatively, skip the DocBook step and transform straight to XSL-FO. -
Docbook
Another XML-based format is DocBook, which originally was SGML based but now has an XML DTD too. From this format you can output to ps, pdf, rtf and plenty of other formats.
You could also hack one of the docbook XSL stylesheets (using XSLT? would be pretty!) to make it parse your own format.
Feel ready to own one or many Tux Stickers? -
Re:Argh, the hidden codes!
You should try Docbook http://www.docbook.org/ or Latex. You can concentrate on the text you want to write much better than in Word. It is IMHO a far superior way of writing text than Word is. Word always gets in your way and leads to more layout and less content than you originally intended.
-
Time to reconsider Wikis.
> Isn't it time for Google finally to put some work into refining their results...
Isn't it time to also reconsider the Wiki paradigm? More sites (like this) are requiring logins. "Golden Prose" indeed! IMHO, Wikis are evolving into crude Content Management Systems.
-
Re:Consistency, please
Perhaps I should have read the LDP website first. It appears that they insist that all documentation be submitted in XML or SGML DocBook formats. They even have a group of volunteers to help with that if the author is not able to provide DocBook.
A quick perusal of CPAN revealed eight modules specifically for dealing with DocBook. No doubt other languages have similar libraries.
Looks to me like more than half the work is already done. It shouldn't be a difficult matter to create a script to run the DocBook -> HTML+CSS conversions with predictable results. -
Re:Microsoft Publisher?
Yeah, true, I don't know much about Word. I don't know about Quark XPress or Adobe InDesign. Most of my documents in Publisher are really small: 1 to 5 pages at most (product brochures and data sheets).
My larger documents are done with DocBook which is a whole other nightmare for documents bigger than 600 or so pages.
Thanks for the info. Thankfully I finally finished all of my documents and now can hope that I can sell my product. If not, it's not gonna matter much to me. Sigh.
-
Re:XML
XML is to Docbook as SGML is to HTML. You wouldn't write web pages in SGML, so why write documentation in XML?
If you were to write your documentation in XML, then you would need to define a meaningful DTD/schema and all the tools that go with it to make it useful.
But why bother when someone's already done the hard work for you? eg. Docbook.
-
Docbook.. (again)
I have seen a variation of this question posted here at least two times. The unanimous answer is usually docbook, and in this case it is even more relevant, since the document is technical in nature.
A good pick is DocBook: The Definitive Guide, written by Norman Walsh (who chairs the OASIS DocBook Technical Committee) and published by O'Reilly. Of course the book is also available in HTML, PDF and plain text.
-
Re:That XML buzzword again
The cool thing about XML-based implementations like DocBook is that I can generate all the formats you mentioned (ASCII, PS, HTML, TeX) plus a few more from just one single document in XML.
-
Docbook
The format of the documentation is also important. Give docbook a try if you haven't already. Concentrate on the content.
-
Re:I'll check it out
Well first of all, legacy HTML will never go away -- not as long as millions of people are hacking out web pages by hand, or using antiquated HTML editors. XHTML will never completely replace legacy HTML, and if I still thought that was XHTML's central purpose, I would still consider XHTML a waste of effort.
The big virtue of XHTML is the big virtue of all XML document types: it's open. You can do anything with an XML document. I suppose that's also true of say TeX or RTF. Except these formats are very messy, and it's hard to extract the content from them. A good XML document type is well-structured, and thus relatively easy to access and manipulate.
If all you want to do with a document is display it as a single web page, that's not a big deal. But suppose you want to add it to some well-structured document management system? Or make it a chapter in a book? Or deliver it to a cell phone browser that uses WML or some other simplified markup language? Then all you have to do is write a filter that transforms your XHTML into the necessary XML document type. The possibilities are endless, and all of them are enabled by the simple openness of XML.
There are pitfalls, of course. A good XML application is carefully structured, and thoroughly separates presentation (layout, fonts, etc.) from content. That's why XHTML deprecates the use of formatting tags, like <center> and <font>, which act as if they designate content, but actually designate presentation. But there's nothing to prevent XHTML users from using deprecated features, or designers of other XML applications from structuring their documents carelessly. So even after you run your document through HTML Tidy, you still might have to jump through a few hoops to transform it into a more sophisticated XML document type, such as DocBook. But the openness of XML makes such hoop-jumping a lot easier.
Anybody who's interested in playing the XML transformation game needs to learn to program in the #1 XML transformation language, XSLT. This person has written some good introductory material, both online and in book form. Plus her web site neatly demonstrates the flexibility of the technology she teaches and advocates.
-
Re:turning point?
You could ditch template wars by insisting everyone in your organisation uses an industry standard. Try DocBook, an SGML standard for authoring technical documentation. I've used this for a number of years, knowing the information I'm producing will be processable in years to come. It's scalable to huge numbers of large documents, so with that covered, you can go back to arguing about which fonts to use for command names ;)
-
Re:use XML
And there is already a nice DTD for documentation called DocBook.
There are also various XSL and DSSSL stylesheets to convert the docbook xml into html, xsl-fo, pdf, latex etc.
The best thing about XML is that you can keep all of the documentation in one single place and generate different documents for each audience (user, professional user, developer, etc.) and language. There is no need to write duplicate information; you only have to add certain attributes to the XML tags.
-
Morphon
The canonical list of DocBook editors is here. The best program I've seen for editing DocBook is Morphon. It does a really good job of styling tags with CSS in real-time so that you can edit the document with various tags, but see the output in a WYSIWYG-like way. There's also a tree view. Another good program is XMLmind's XXE XML Editor. Both are Java apps, so will work cross-platform. They both come with good DocBook configurations, and are primarily used for DocBook. They've got free evaluation copies, and are reasonably priced at $100-$200.
I also looked at ArborText and FrameMaker. They claimed to support DocBook, but they supply config files only for (much) older DocBook versions. I found the out-of-the-box support for DocBook to be sorely lacking. It looked like it was possible to configure them for better support, but it would have taken many hours to do so.
XML Spy and XMetaL looked pretty good. I don't remember how well they did with DocBook, but they are geared more for data-oriented XML, whereas Morphon and XXE are more suited for document-oriented XML, such as DocBook.
-
Where to find books that are Free as in Freedom
You can find quite a few books that are published under a variety of licenses such as the GNU Free Documentation License at The Assayer.
The most popular subjects there are "Science, Math and Computing" with 289 titles. There are quite a few other subjects covered there too.
The Assayer is more than just a list of books though - it has reader-contributed reviews. For example, here is the entry for DocBook: The Definitive Guide by Norman Walsh (available at www.docbook.org). There is a review at the bottom of the entry page.
I'm writing a Free book, although it is at a very early draft stage. The ZooLib Cookbook is a tutorial for the ZooLib cross-platform application framework.
I'm also slowly creating a copylefted collection of articles on software quality at the Linux Quality Database.
-
Dude, this article is more than 2 months old.
It's a very interesting article, but it came out in February. That aside it's good that some of these are getting mainstream press.
Protocols to mention besides OpenLDAP and OAI are Whois++ and Z39.50. OAI actually is transported over HTTP. You could do the same with EAD or others.
Projects which implemented Z39.50 for the purposes of interoperability are ONE and ONE-2, EUROPAGATE, Desire and Desire II, DECOMATE and DECOMATE II, and Renardus just to touch the surface. Don't forget OHIOLINK...
Other older, but interesting, metadata activities have been SGML MARC, and the corresponding XML MARC.
Those that are interested in more detailed reading can check out the Nordic Metadata Project and Nordic Metadata Project II, which studied the practical implications of cross browsing multiple databases and especially the use of Dublin Core. Even if you get agreement on the protocol and data standard, cross searching's not as easy as it sounds. One of the tools is the Dublin Core Metadata Template (get it while you still can).
The BYTE article was exciting to see again and could have benefited further from pointing out the relative ease of use of Dublin Core. OAI uses unqualified Dublin Core, SAFARI uses qualified Dublin Core to create an up to date index over academic research in Sweden. Shoot, since it already uses some META tags, you could even tweak htdig to use Dublin Core on your own site for those high precision searches.
With the interest in structured data (XML?) maybe we'll see some sites serving up not just HTML with Dublin Core, but maybe even Docbook or even TEI / TEI Lite. There are great tools for converting from Docbook to HTML, PDF, RTF, etc. and AbiWord and Kword already have partial support for docbook. If there were more, then we could see some real changes on searching the web. Coding for SGML is more difficult, so the obvious choice would be to start from Docbook XML.
-
DocBook XML
You've got two problems here:
- Convert paper to XML
- Convert XML to HTML and PDF
As for the first, I don't think that you're going to find a completely automated solution. First of all, OCR isn't terribly accurate to begin with. Second, you have to convert OCR'd plain text to marked up XML. You'll probably have to do this by hand unless the manual you're entering is terribly structured.
However, I'd certainly recommend using DocBook as your intermediate XML format. It's a well-designed language targeted at technical manuals. Don't re-invent the wheel with your own XML format and XSLT style sheets.
DocBook supports RTF, PDF, HTML, PS, LaTeX, and other output formats. Do yourself a favor and use it.
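The easy half of the second step of problem one (plain text to markup) can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the OCR output separates paragraphs with blank lines, and every paragraph becomes a DocBook `para` inside a non-namespaced `article`. The function name and sample text are made up; real manuals would still need hand markup for sections, lists, and tables.

```python
import re
import xml.etree.ElementTree as ET


def text_to_docbook(title, plain_text):
    """Wrap blank-line-separated paragraphs of plain text in a DocBook article."""
    article = ET.Element("article")
    ET.SubElement(article, "title").text = title
    for block in re.split(r"\n\s*\n", plain_text.strip()):
        # Rejoin lines that were broken in the middle of a paragraph.
        para = " ".join(line.strip() for line in block.splitlines())
        if para:
            ET.SubElement(article, "para").text = para
    return article


# Hypothetical OCR output.
ocr_text = """Press the power button
to start the unit.

Wait for the green light."""

article = text_to_docbook("Quick start", ocr_text)
xml_output = ET.tostring(article, encoding="unicode")
print(xml_output)
```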
--Bruce
-
Re:I have documents in half a dozen formats
You've almost got the right idea. IMHO you're better off using something like DocBook (SG|X)ML, then you can generate a variety of output formats from text sources.
-
"Structured" TeX? Please, no
I'm no TeX hater. It's a great achievement, it is unsurpassed for describing complicated layouts. (Even some proprietary-format word processors use TeX for equations.) Using TeX for basic word processing makes perfect sense. Not all documents are complicated enough to bring in markup technology.
But TeX enthusiasts seem to be stuck on the idea that TeX is also useful for structured documents. Sorry, it just isn't. If you want to impose structure on a document, you can't use a format designed around layout. Even if you add constructs that describe document structure (as LaTeX does) you can't prevent the user from using non-structure elements "because it looks right". So you end up with a convoluted mixture of structure and layout that's impossible to maintain. That's why HTML is such a mess. That's why maintaining large technical documents with traditional word processors is a nightmare.
If you need to maintain a large structured document, you need to use a format that makes no attempt at all to describe layout. So the writer is forced to think purely in terms of how the document is organized. You keep layout description in a separate thing, a "style sheet". Not only does that end your document maintenance nightmare, but it allows you to deliver the same document in different ways just by providing the appropriate style sheet. You have a single source that's accessible as a set of web pages, or as a printed document, or whatever.
What formats am I talking about? Since this is 2002, I'm talking about XML. Not XML in general (most XML apps are data-centric not doc-centric) but specific appropriate XML applications, such as DocBook or DITA. For the stylesheets there's Cascading Style Sheets and/or XSL. But these are just the best technologies that happen to be available now. The basic idea has been around for a long time: in structured documents you have to separate markup and layout.
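The single-source idea can be shown with a toy Python sketch: one structured document, two interchangeable "style sheets". The style sheets here are plain dictionaries of rendering rules invented for this example, standing in for real CSS or XSL; the element names follow DocBook, but the sample text is made up.

```python
from html import escape
import xml.etree.ElementTree as ET

# The single source: structure only, no layout.
SOURCE = """\
<article>
  <title>Backup guide</title>
  <para>Run the backup nightly.</para>
  <para>Store copies off site.</para>
</article>
"""

# Two hypothetical "style sheets": element name -> rendering rule.
HTML_STYLE = {
    "title": lambda text: f"<h1>{escape(text)}</h1>",
    "para": lambda text: f"<p>{escape(text)}</p>",
}
TEXT_STYLE = {
    "title": lambda text: text.upper(),
    "para": lambda text: "  " + text,
}


def render(xml_text, style):
    """Render each top-level element with the rule the style sheet gives for its tag."""
    root = ET.fromstring(xml_text)
    rendered = []
    for child in root:
        rule = style.get(child.tag)
        if rule is not None:  # elements without a rule are skipped
            rendered.append(rule(child.text or ""))
    return "\n".join(rendered)


html_output = render(SOURCE, HTML_STYLE)
text_output = render(SOURCE, TEXT_STYLE)
```

Changing the look of either output means editing only its style dictionary; `SOURCE` is never touched.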
-
Information overload
Several of the other commenters have mentioned problems with information overload (books that are complete, but weigh about as much as I do) and books that would be useful if you could get them to stay open on the desk. All of this is a problem of the lack of balance between completeness and conciseness. My solution to this is to distribute the book in both dead tree and CD/DVD form. The dead tree is like an extended index; a concise, precise overview of the real book, which is on the CD/DVD. This way, a person can easily read the dead tree and get a feel for the interesting/useful bits, and then use the computer to go to the truly important bits. If you can develop a method where you can disclose and hide different levels of information at different times AND PRINT IT OUT IN THAT FORMAT that would be really helpful. I guess the simplest way of thinking about all of this is like how the encyclopedia is laid out. There is the first half, which has a lot of short articles, and the second half which has longer articles. The dead tree would be the first 'half' and the CD/DVD would be the second half. Also, I would only want it on the CD/DVD if it was distributed in DocBook form as that is an open format (unlike PDF, which specifically bars apps that save/modify it for free) that is designed for information categorization and not just presentation (like HTML).
-
Re:Editing not a goal
I like to think of PDF as an output format only. By using a flexible markup system like Docbook, you can export to a number of formats. PDF is excellent for (and often required by) book printers. It provides an unambiguous picture of how a book should be laid out.
To me, PDF is a lot like a system executable. You write the document in some portable source code, then compile it for a particular need. Of course, this is a very different philosophy than WYSIWYG edits. Oh well.
-
Docbook explained by KDE's team
I used to have a lot of trouble making Docbook work, until I found KDE's developer documentation.
Install the DocBook parsers and generators:
http://i18n.kde.org/doc/install/
General docbook information:
http://www.docbook.org/
SGML is the ISO standard for storing information, and Docbook is the standard for writing books/documentation in SGML or XML. IMHO, it's the way to go.
-
More Free+Online Books
I have several freely available online books in my bookmarks. They are a great alternative to carrying huge tomes everywhere I go. I have three of the below books on real paper, but I use the online editions far more frequently:
Numerical Recipes - Numerical Recipes in C, 2nd edition is the numerical methods book.
Autobook - GNU Autoconf, Automake and Libtool.
GGAD - GTK+/Gnome Application Development by Havoc Pennington. I'm not sure which is better, the book or the author's name!
WGA - Writing GNOME Applications by John R. Sheets. Not complete, which is a pity. I'm sure that will change though.
Docbook - The definitive guide to SGML.
CVS book - Open Source Development with CVS by Karl Fogel. It is not quite the complete book, but it is the interesting bits.
FreeBSD Handbook - FreeBSD documentation.
Maximum RPM - Documentation for the RedHat package manager.
Based on that list, can anybody suggest further online books that I may be interested in? (Don't bother telling me about the old O'Reilly books, I know about those) -
Re:Unreadable sites
I can't think of too many sites out there that Mosaic 1.0 can't read
I'm just having a look at the web through Netscape 1.1 - msn won't let me in (but NN1.1 is standards compliant - surely it complies with HTML1.0). At least my site works fine in it :-)
An Apache module might be useful for something like this. Strip out useless tags and format it so it is still useful, but just contains the facts.
That's approaching accessibility from the wrong way. HTML was never meant to be the root of all documents, but the end-point. Start with a syntax that allows information and content the highest priority - then transform it to the required output.
For me, I'm building a DocBook editing environment for all my content, then using simple XSL transforms that can produce any desired output from HTML for browsers or palms, to text only, to RTF, to PDF and PostScript. That way one source can be used to create multiple looks and feels.
Want to change the layout, edit one .xsl file, then run the transform - new website without changing any content.
-
AC = Not Karmawhore
For those wondering,
XSL has two parts, XSL-FO and XSLT. XSL-FO is an XML format for the printed page. XSLT is a language for transforming one XML format into another.
XSLT is an incredibly useful language to learn. Imagine being able to take a Docbook file and spit out XHTML or XSL-FO > PDF, or... yes, even plain text.
For an idea of how easy it is I wrote a Docbook to HTML converter for 100 docbook tags in four hours, and I hadn't touched XSLT before.
XSLT is based around rewriting a data tree. You can step around the tree like a file system, either in a hardcoded way like //html/body/h1, or relatively like ../body.
The language has loops and built-in functions for analysing the tree. For example, how many times does a paragraph (p) occur in this HTML document when its parent is a table cell (td)? count(td/p)
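The same kind of tree query can be tried out without an XSLT processor. The Python sketch below uses the standard library's ElementTree, which supports only a subset of XPath and has no `count()` function, so the count is taken with `len()` instead. The sample page is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed HTML page.
PAGE = """\
<html>
  <body>
    <h1>Report</h1>
    <p>Intro paragraph, not in a table.</p>
    <table>
      <tr><td><p>First cell.</p></td><td><p>Second cell.</p><p>More.</p></td></tr>
      <tr><td>No paragraph here.</td></tr>
    </table>
  </body>
</html>
"""

root = ET.fromstring(PAGE)                  # root is the <html> element
heading = root.find("body/h1").text         # fixed path, stepping down from the root
cell_paragraphs = root.findall(".//td/p")   # every <p> whose parent is a <td>
print(heading, len(cell_paragraphs))
```

The intro paragraph is not counted because its parent is `body`, not `td`.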
The downside is the bloated syntax. You know those horror stories about an XML programming language that looked like <xml:if(blah= blah)> do this </xml:if> ? XSL is that. But if you can get past that you'll find one of the most wonderful things for mangling XML into whatever format you want.
-
Re:Two Things that will Help...
Oh I don't know, Docbook seems pretty damn popular. Now if someone would write a WYSIWYM for it I'd be happy, but as is, the semantics are awkward when selecting styles and trying to make it WYSIWYG.
-
Similar problem here...
At my company (in fact it's a local branch of a US based corporation) we have a similar problem. There is a team here developing a system designed specifically for a customer. As one can expect, along with such a system goes all the documentation - everything you could expect, starting from the analysis, through functional specification and coding guidelines, to end user and administrator's manuals. To make things more complicated, part of the development - and the documentation - is being done by a subcontractor (which happens to be in another hemisphere) - and it is being prepared in English, but some parts of it (especially the manuals) have to be translated into the local language.
Up until now it has been a growing mess with documentation being written in Word (with all the usual problems Word has with large files, with lots of graphics - screens, no versioning etc.), with no standards, with people getting into one another's way while trying to update the numerous documents.
Recently together with a friend we have came up with the idea to switch all that into neat XML/SGML files, with CVS based versioning and everything based on open standards and free software as much as possible. To our surprise the management liked the idea and we got a green light to do some research. And then the problems have begun.
First, the editor. Coding XML files with vi or the like might be nice for a hacker - and is great for creating and testing XML formats then used for data storage etc. - but it is out of the question for documentation authors. And it is pretty understandable - to be able to concentrate on the content, on the text itself, the author needs to see only the contents, as nicely rendered as possible - no tags getting in the way in each sentence, no learning for years how to use the editor (thus Emacs with its psgml mode is not an option - don't flame me, it's just a fact). After a long search I have to say that there is no working, finished GNU/free editor that would match our requirement of almost-WYSIWYG presentation of an XML/SGML file. As to commercial ones, the only two that look good are XML Spy 4.0 - but it is just a poorly working beta for now - and Arbortext's Epic - which is almost exactly what we need, but is a bit expensive at around $700 a license.
Nevertheless, with no other options left we decided to go for Epic when it comes to the editing side. We got an evaluation package and began testing.
Now, we were convinced from the start that the DocBook DTD and the tools that go along with it are the best choice for the kind of problem we faced. Epic supports DocBook but comes with its own version of it, which in turn doesn't work well with the Linux SGML tools that we use for translating the XML/SGML files to useful end formats. On the other hand, not all of Epic's features can be used when one just tries to edit a document based on an "external" DTD. To enable things like seeing the graphics files inserted into the document one has to hm... "customize" Epic by creating some additional configuration files (like .FOS files) using yet another expensive tool Arbortext sells - the Epic Architect.
But that is not the end of the problem, because the stylesheets currently available for translating DocBook-based XML/SGML files into useful formats are not well documented and partially don't work (for example, tags related to inserting pictures in the document are ignored when trying to generate a printable document). There is, for example, a project on SourceForge that develops XSLT and DSSSL stylesheets for translating DocBook-based XML into various formats, but so far I have not been able to make them work - and there is no documentation. Also, the DSSSL-based machinery for translating SGML files that comes with various Linux distros is far from perfect - HTML is generated mostly OK, but printed documents (.tex and .pdf) leave much to be desired.
So, from our point of view it looks like we will have to buy an expensive editor, and then someone will have to spend a month or so tweaking the editor, modifying the stylesheets for our needs, developing procedures and so on. And that someone would have to be quite a competent person (with deep knowledge of the subject), someone who could probably be better used directly in the development project.
For now the future of our little plan of switching from a mess to a neat XML-based solution is uncertain, mainly because we would have to build that neat solution ourselves: what we can get from outside at the moment are some bits and pieces that - although nice by themselves - just don't fit together.
(And, BTW, I haven't even touched the nice catch with CVS - to be really useful in the kind of environment that we envisioned it would have to be integrated with the editor - and that doesn't seem likely).
-
Doc Book
Why not use DocBook? It is XML-based and extensible. What more could you ask?
Have Fun
-
Have a look at DocBook/XML
Have a look at the DocBook/XML system. It is used by a lot of Open Source projects, including PHP and phpOpenTracker. A variety of XSL stylesheets exist, for transformation from XML to HTML, PDF or LaTeX for instance.
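To give a feel for what such a transformation does, here is a toy sketch in Python. It is not the DocBook XSL stylesheets and not XSLT at all: it assumes un-namespaced DocBook 4-style markup, and the TAG_MAP table and to_html function are hypothetical names that cover only four elements. The real stylesheets handle hundreds of elements plus numbering, cross-references and so on.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical mapping from a few DocBook element names to HTML tags.
# Anything not listed falls back to <span>.
TAG_MAP = {"article": "div", "title": "h1", "para": "p", "emphasis": "em"}

# Made-up sample document (no DTD, no namespace).
DOC = """<article>
  <title>Sample</title>
  <para>DocBook marks up <emphasis>meaning</emphasis>, not layout.</para>
</article>"""


def to_html(elem: ET.Element) -> str:
    """Recursively rewrite one element and its children as HTML text."""
    tag = TAG_MAP.get(elem.tag, "span")
    parts = [f"<{tag}>", escape(elem.text or "")]
    for child in elem:
        parts.append(to_html(child))
        # Text that follows a child element lives in its .tail attribute.
        parts.append(escape(child.tail or ""))
    parts.append(f"</{tag}>")
    return "".join(parts)


html_out = to_html(ET.fromstring(DOC))
print(html_out)
```

The point is only that the source says what things are (title, para, emphasis) and a separate step decides how they look, which is why one source can feed HTML, PDF or LaTeX output.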
-
Re:Intro/Tutorial on DocBook?
Maybe if you had bothered to look around docbook.org a little more you would have noticed that there is an entire O'Reilly book about DocBook and how to use it available online for free. You can also purchase the dead-tree version from your local bookstore.
-
Intro/Tutorial on DocBook?
I hadn't heard of DocBook, so I went fishing on docbook.org for some basic info.
The state of the documentation for this product is fairly lacking. (Hey, it's a DOCUMENT application!) There's no "getting started with DocBook" stuff. There's no official tutorial.
The closest thing to a tutorial I found is this page: DocBook intro. I'll excerpt the front page.
- DocBook intro
Here is my tutorial on DocBook. I never completed it, but it is still useful, since others don't focus on a complete beginner tutorial.
Last modified: Mon Jul 27 11:19:57 1998
Frankly, this sums up my issue with many Open Source projects: making a technically superior tool is not enough to generate wide user acceptance. There has to be an easy migration path from what the user's already got.
DocBook needs at least ONE of the following to get people going:
RTF/DOC/FrameMaker/TeX to DocBook converters, supporting at least a good 75% of basic features,
A usable migration tutorial that assumes the user already makes RTF/DOC/FrameMaker/TeX documents,
A usable editor that shows the results, even if it has to be two-paned to show both source and results.
I'm not flaming Open Source in general, but this is not the first time I have heard of a tool that would fit my needs exactly, except they put very large barriers to entry in my path.