Embedding XML In Docs?
An anonymous reader writes "Now that XML is the de facto standard (for good or ill) for doing message passing, I find that I need to give XML examples in the documentation that we produce. We're stuck with Word and up till now I've just been doing the examples as cut and paste from the log files. We include schemas in the appendix but it seems that the clients like the 'readability' of the raw XML over other approaches we've tried. I'm wondering what everyone else is doing in the world of XML documentation."
Keep including excerpts/relevant portions in the documentation, and separately, provide supporting reference materials - full XML files, XSD, etc.
"Now that XML is the de facto standard (for good or ill) for doing message passing, I find that I need to give XML examples in the documentation that we produce."
Jabber basically is an XML bridge. Combined with Peer to Peer it's a powerful combination.
Asking how to do this in Word is like asking how to cut a board in half with a hammer. In both cases, you're using the wrong tool for the job.
That said, I can tell you what we do for documentation. We have a wiki (Confluence, though any should work) that is perfectly capable of handling XML or any of a number of languages. We then have automated processes which periodically pull certain pages, strip the navigation elements and render them to PDFs which, depending on the process, get transferred to various locations (a Samba fileshare, a couple of different intranet sites we maintain, or into our CMS workflow to be approved and added to our public site).
Since it's a wiki, the input is easy and anyone in our company can contribute (if we were larger, we might add more access controls). Yet it also produces professional-looking PDF documents.
find -name "*base*" -exec chown us {} \; ; ln -s
You may try StylusStudio's XML schema documentation generator, or the simpler, open-source and customizable dtdtoc, written in Python.
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
No trolling intended, but just having the schemas is like just having the UNIX man pages without examples.
...
Let me clarify, bear with me- The man page for 'ping', for instance, is all-encompassing but rather intimidating when it comes to every-day use:
NAME
ping - send ICMP ECHO_REQUEST packets to network hosts
SYNOPSIS
ping [-dfnqrvR] [-c count] [-i wait] [-l preload] [-p pattern]
[-s packetsize]
DESCRIPTION
Ping uses the ICM... etc
Okay, enough. At that point, they've more than lost me. All I want to know is, How do I use it?
A simple example gives much more 'instant gratification' style information:
user@host:~$ ping www.google.com
PING www.l.google.com (64.233.183.104) 56(84) bytes of data.
64 bytes from nf-in-f104.google.com (64.233.183.104): icmp_seq=1 ttl=245 time=11.3 ms
64 bytes from nf-in-f104.google.com (64.233.183.104): icmp_seq=2 ttl=245 time=69.3 ms
This is enough for everyday use. No need to bother with the gritty details at first. Once the users get to that point, they won't mind the schemas and full help descriptions.
Visit http://ringbreak.dnd.utwente.nl/~mrjb/growingbettersoftware to download your free copy of the book
I really don't know if this is a good idea
[] Leonardo Kenji Shikida
Before you flog yourself too much with XML, check out JSON: http://www.json.org/.
It's supported by every language under the sun, and really simple to use. You may end up needing the extra capabilities of XML, but if you don't, JSON is a much friendlier experience.
...XML document itself. :P
(Isn't that the beauty of it?)
It is by my will alone my thoughts acquire motion; it is by the juice of the coffee bean that the thoughts acquire speed
I thought that the point of XML was to embed the documentation in with the data, so that it was human-readable? This doesn't make any sense. If XML has to be documented anyway, then what's the point? To increase network traffic? To fill up "extra" hard drive space? Old fashioned character-delimited is a better way to go if you have to document the thing, anyway.
I don't respond to AC's.
If you need to have XML fragments in your Word document, one of your best options is to copy and paste from Visual Studio. The result is nicely indented, colorized and monospaced. If you don't have Visual Studio, you can download Visual Studio Express for free.
Just open Visual Studio and create a new XML file (don't create a project-- there's no need to do so; just use File->New->File... and select XML file). Copy and paste your XML fragment into the new file. Press Ctrl-K, Ctrl-D to reformat the document. Then just select the fragment you want and copy and paste into Word.
I hope that helps.
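If Visual Studio isn't at hand, the reformatting step (not the colorizing) can be approximated with a few lines of Python before pasting into Word. This is only a sketch using the standard library's xml.dom.minidom; the function name pretty_xml and the sample fragment are made up for illustration.

```python
from xml.dom import minidom


def pretty_xml(fragment: str, indent: str = "  ") -> str:
    """Return an indented copy of a well-formed XML fragment."""
    dom = minidom.parseString(fragment)
    pretty = dom.documentElement.toprettyxml(indent=indent)
    # toprettyxml can emit whitespace-only lines when the input already
    # contains line breaks between elements; drop them.
    return "\n".join(line for line in pretty.splitlines() if line.strip())


if __name__ == "__main__":
    # Hypothetical log excerpt, all on one line.
    print(pretty_xml("<order id='7'><item>widget</item><qty>2</qty></order>"))
```

Paste the output into a Word style that uses a monospace font so the indentation survives.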
You do good XML documentation the way you do good documentation of any kind...
1) Examples.
2) Functional examples.
3) More examples.
People learn best when they have a skeleton of knowledge to hang the meaty details on. By all means, have a detailed description of each element in the XML, but give lots of examples so people can get a sense of the big picture of what's going on. And make sure your examples are real-world enough to cut/paste and modify for people who need to get something up and running in a hurry.
There's a reason that K&R is considered one of the best language books ever written. It has tons of examples, and also has a lot of the formal stuff in a useful format.
Sometimes it's best to just let stupid people be stupid.
-1, Go Back to K5.
And XSD with a good XML editor is better than most documentation you could produce.
Throw comments into the XSD, and it's gold.
If you're already dealing with XML files, I would suggest that the main barrier to using a toolset such as DocBook (SGML or XML variants) should be gone already.
DocBook is excellent at enforcing proper structure and contains all the elements you need (really!) to write tech documentation.
Several high-profile projects such as FreeBSD, KDE, GNOME and others use DocBook as their main doc format, as do, I believe, more tech companies than want to admit it. I maintain the PF tutorial at http://home.nuug.no/~peter/pf/ as DocBook SGML myself.
The tools most people use for DocBook are free (most likely just a few mouse clicks or commands away through your package system), but some proprietary/commercial tools are available too. The main reference is at docbook.org, it certainly would not hurt to check it out.
-- That grumpy BSD guy - http://bsdly.blogspot.com/
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.
Not everything is instantly understandable.
For help on the sentence, see wikipedia.
liqbase
Using a less difficult markup... JSON.
We publish tons of documentation that has to explain XML formats. What we do is include DTD diagrams in our documentation that show the structure of the XML document graphically. The tool we use to generate them is Tibco's TurboXML and we've been using it for years. Obviously we include examples, but the DTD diagrams really show you all you need to know. I know, I know, it's commercial software. Maybe there's something open source out there that does something similar, not sure. Hope this helps!
An engineer is someone who spends 3 hours trying to solve a 2 hour problem in 1 hour - Anonymous
If I understand the question right (how to present structured documents within human-readable text), then the closest you will ever come to a right answer is YAML. Look it up on Wikipedia.
Some drink at the fountain of knowledge. Others just gargle.
Oops, here's the link. Also a word of advice: you can embed XML without modification in YAML just by indenting it, so you can have both in the same document. Unlike XML, YAML allows for some (limited) relational hierarchy and for type casting as part of the language itself. You can use this to simplify a highly nested XML document with lots of redundant entries. Just make an !!xml type-def.
We use a large XSLT file (more accurately, a series of files with xsl:includes) which documents the functions of the XML. You can transform any XML query or response with this XSL, and it will document the call for you. There's also an XML file which, when transformed with this XSL, will give you full schema documentation.
So it's your choice, you have complete documentation, or you can get documentation on any call by passing its content through the XSLT.
Slay a dragon... over lunch!
For those of us actually interested in opinions / answers to the poster's question, please actually respond to the QUESTION. Anonymous didn't ask for criticism over the choice of languages, keep that in mind.
For better or worse, where I work, tech specs are Word. I use the style just mentioned for my XML or sometimes embed XML Spy schema fragments as JPEG.
That is pretty good, but as your example is not valid XML, we need to wrap it inside a valid XML to make it actually work:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE documentation [
<!ELEMENT documentation (#PCDATA)>
]>
<documentation>
<![CDATA[<XML documentation>XML documentation</XML documentation>]]>
</documentation>
So it seems that it's a publishing cycle question. You should have a source document with placeholders that some later process replaces with code snippets. Your schemas could have foreign nodes that denote placeholder IDs, so the two can be mapped to each other.
You could use Docvert which lets you make XML Pipelines from content and build plugins and conversion stages, that way you could replace placeholders from any of your schemas. It generates HTML and DocBook so you could then convert that to PDF or whatever.
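Independent of Docvert, the placeholder idea itself can be sketched in a few lines of Python. Everything here is an assumption for illustration: the [[snippet:id]] syntax, the function names, and the convention that each snippet lives in a file named after its ID. It is not how Docvert works.

```python
import re
from pathlib import Path

# Hypothetical placeholder syntax: [[snippet:order-request]]
PLACEHOLDER = re.compile(r"\[\[snippet:([A-Za-z0-9_-]+)\]\]")


def fill_placeholders(source: str, snippets: dict[str, str]) -> str:
    """Replace each placeholder with its snippet; unknown IDs are left as-is."""
    def substitute(match: re.Match) -> str:
        return snippets.get(match.group(1), match.group(0))
    # A function is used as the replacement so that backslashes in the
    # snippet text are not interpreted as regex escapes.
    return PLACEHOLDER.sub(substitute, source)


def load_snippets(directory: Path) -> dict[str, str]:
    """Map a file such as 'order-request.xml' to the ID 'order-request'."""
    return {p.stem: p.read_text(encoding="utf-8").strip()
            for p in directory.glob("*.xml")}


if __name__ == "__main__":
    doc = "The request looks like this:\n[[snippet:order-request]]\n"
    print(fill_placeholders(doc, {"order-request": "<order><qty>2</qty></order>"}))
```

Leaving unknown placeholders untouched makes a missing snippet easy to spot in the rendered output.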
-Docvert converts MSWord to OpenDocument, clean HTML
If I have to include XML, I will typically have XMLSpy format it, then copy it from XMLSpy. Inserting in this way will retain the colorization and formatting for readability.
In Word, I will shrink the text to where it is barely readable, since anyone wanting to read it can easily cut-and-paste elsewhere or simply zoom in. In doing so, I minimize the amount of space I consume with ugly XML, and it minimizes the ugly line-wraps.
But as an alternative, I prefer using Object Diagrams (not Class Diagrams) to describe the message structure in a format-independent way that is much more readable and appealing...assuming the audience can grasp the notation. I suggest this any time the explicit description of XML tags, attributes, etc is not absolutely necessary.
The only way to win is not to play.
XML is like violence
If it doesn't work -- use more
If you're serious about doing documentation, use an XML editor with something like the DocBook DTD/Schema, not Word. Word is for shopping lists and letters, not "real" documentation. And yes, Word does actually have a real XML editor, but it's pretty crummy; and no, Save As XML (WordML or OOXML) doesn't count.
The problem is that most XML document editors suck for non-XML-gurus. They can display either plaintext with syntax colorisation (Emacs/psgml/xxml) or pseudo-WYSIWYG, but lack the interface smarts that would make them usable (see my paper to Markup last year on this topic, or wait for the full report next year :-). Both have their advantages and disadvantages but they all require a fairly deep prior knowledge of XML. In your own case this may be fine, but not if you want to hand the editing suite to your non-XML colleagues.
A good documentation system takes some effort to build, but the results in terms of usability, persistence, quality, etc are usually well worth it. In the specific case of quoting code, XML's CDATA section feature lets you embed code verbatim, and one of the possible outputs is to transform the XML to LaTeX using XSLT, and thus enable the use of things like the listings package, which makes pretty-printed code in your PDFs.
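To make the CDATA point concrete, here is a small Python sketch (standard library only; the element name "listing" and both function names are invented for the example). It shows only the verbatim-embedding part, not the XSLT-to-LaTeX step.

```python
from xml.dom import minidom


def wrap_in_cdata(tag: str, code: str) -> str:
    """Return <tag><![CDATA[code]]></tag> as a string."""
    if "]]>" in code:
        # A CDATA section cannot contain its own terminator.
        raise ValueError("code contains ']]>' and cannot go in one CDATA section")
    doc = minidom.Document()
    element = doc.createElement(tag)
    element.appendChild(doc.createCDATASection(code))
    return element.toxml()


def read_back(xml_text: str) -> str:
    """Parse the document and return the text inside the root element."""
    root = minidom.parseString(xml_text).documentElement
    return "".join(child.data for child in root.childNodes)


if __name__ == "__main__":
    sample = '<msg to="a&b">1 < 2</msg>'  # not escaped, not even well-formed
    stored = wrap_in_cdata("listing", sample)
    print(stored)
    print(read_back(stored) == sample)
```

The one thing a single CDATA section cannot hold is the sequence ]]>, which is why the sketch refuses it instead of silently mangling the code.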
Pay no attention to the neanderthalers who want you to regress to some text processing application.
Word is ideal for tech documentation, as it gives you the tools to do better-than-good typography, as well as to easily enhance the text with illustrations and inclusions—to create documentation that's tuned for the reader, not the writer.
I'm assuming you know how to set up suitable styles. For the rest, you have more than cut and paste as an option. Keep in mind that you can embed just about any file as an object linked to the file (Insert|Object|Create from File) so that any changes to the file are automagically sync'd in the document. This can be a log file, a Visio diagram, or an Excel spreadsheet—anything it takes to clearly describe what you're doing. If your life is really interesting, there are also compound documents to play with.
I spend my days (and some nights) developing technical documents. I cannot conceive of using a lesser tool. I've played some with Open Office, but it isn't there yet.
I'm a Programmer. That's one level above Software Engineer and one level below Engineer.
It's designed for doing technical documentation. It began as an SGML toolset but has since moved on to XML.
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
I wrote an XML book a decade ago and I don't understand how it is different than documenting any other programming technology. You write chapters on various topics and you cut and paste examples into appropriate places with appropriate prose around them. Word, FrameMaker, TeX, DocBook, DITA: the container documentation technology is not particularly relevant. I mean sure, there are more sophisticated things you can do (auto-testing, auto doc-generation, XML escaped within XML etc.) but you need to tell us what kind of documentation you're trying to produce and why the no-brainer techniques are not sufficient for you.
Well, the XML docs I'm in the middle of writing currently have XML snippets embedded in the doc, and pointers to XSDs and suchlike as appropriate. The version of the doc that's going to come down the pike in about 6 months will have pointers to the schemas, cross-links to "live" XML sample code (plus the raw text thereof incorporated as appendices), etc. But that's waiting on the dev-rel server to go live. Lack of a reliable hosting space can be so... problematic.
As for tools - one works with what one has/can get. You can make this work in damn near anything, despite what the [must hate Microsoft] masses crow on about.
And yes, I've been doing this for a while. Just 20 years, this summer.
Paste a graphical representation (aka UML or similar diagram)
That's what we do here at XYZ123 corp.
It's also useful for developers, since you get the feel for the structure pretty fast.
I may add that it would be good if a tool does that for you.
So far only Eclipse (with some XML plugin) does the trick, but you must chop the image to make it fit into Word or whatever.
I've been tasked with complex XML documentation in the past...
The problem as I see it is that there are multiple levels of documentation.
In large systems schemas are often used in multiple places; they all use the same schema but they use it in different ways.
An order for 1 item may be accepted from a customer; expanded by technical sales; automatically validated and corrected; translated to different parts-lists for delivery planning; have existing inventory added for connectivity; translated to vendor parts - and all in the same schema (some customers provide more data so interact lower down the stack...).
So they all have some things in common: core technical (data types, syntax rules for the address data etc)
Then there is general information which is common to multiple systems (use of a shared ID from inventory system A for systems 1,2 and 3 but inventory system B for 4 5 and 6)
Then there is stuff which is specific to an interface (even though this value Y is optional in the schema; we need it if that value is set to X; oh and you can't do this...)
Embedding anything beyond the simple stuff in the schema is asking for trouble (in this situation).
Personally I've used embedded docs in the schema where possible. Usually explaining the schema structure and general aims.
Then there's a report that describes the schema usage in eg order processing. This puts a few paragraphs around each element or logical group. This often includes or references inter-system translations.
Finally there are interface specs/reports which comment on specifics above and beyond that detailed elsewhere - the logic that determines validity; the specific range of values allowed at a point. Which optional elements are actually mandatory under what circumstances...
Create a specific paragraph style (i.e. "XML code") for the XML bits you want to put in your document. In terms of borders and background, make it as complicated or minimalistic as you wish, but I strongly suggest using Courier New or any other monospace font.
You may also find it useful to tag every document with a caption, so you can reference it later on (and do things such as "see the example on page xxx", with xxx being a reference).
I hate Word, but I use it for specifications that need to be available off-line because it provides the best printable-output of the various source-formats my organization allows.
My technique is still evolving, but I currently specify an XML format by first describing its purpose/context in plain-text, followed by a UML class diagram to visualize the information it captures, followed by an XML example, followed by its XSD. For the class-diagram, I use WSD so it will print well and I scale it so that it fits on a single page. I put both the XML example and XSD into a "table" containing exactly one cell with its background shaded to a light-grey, borders set to 0.1 inch, font set to Courier New, and manually add a line-return and/or spaces to address any line/page-wrapping that may occur. For the XSD only, I add a single-space before and after each xs:element and use bold on the "name" attributes.
Note: when designing a format, I actually start with the UML class diagram, so putting it into the spec is really a non-effort, especially since the UML tool I use (Borland Together) can copy it directly to the Windows clipboard, from which I can then Ctrl-V to get it into the Word doc.
We use XML a lot to pass business objects around. Often through web services (SOAP), but not only. We never document the XML structure itself (although there is some documentation in our WSDL files). Instead we describe the business objects. Each object has a name and a list of fields. The field has a type (primitive or Object type), a length, a multiplicity (1, 0..1, 1..n, 0..n usually) and a description. All the semantics go into the description. The meaning of some fields may depend on the performed operation. This should be described together with the description of the operation. The objects and fields are then represented as XML elements with the same name. We hardly ever use attributes.
Sample:
Object: Person
Fields:
  name   string(32)  1     The last name
  fname  string(32)  1     The first name
  dob    date        0..1  Date of birth
  kid    Person      0..n  The person's children
XML:
<Person>
  <name>Doe</name>
  <fname>John</fname>
  <dob>1974-12-31+02:00</dob>
  <kid>
    <Person>
      <name>Doe</name>
      <fname>Harry</fname>
      <dob>1991-12-01+02:00</dob>
    </Person>
  </kid>
  <kid>
    <Person>
      <name>Doe</name>
      <fname>Sally</fname>
      <dob>1993-12-02+02:00</dob>
    </Person>
  </kid>
</Person>
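For readers who think in code, the mapping from the Person object to the XML above might look roughly like this in Python. It is a sketch of the convention as described (one element per field, no attributes, nested objects wrapped in the field's element), not our actual tooling; the class and function names are invented, and length limits such as string(32) are not enforced.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Person:
    name: str                      # last name, multiplicity 1
    fname: str                     # first name, multiplicity 1
    dob: Optional[str] = None      # date of birth, multiplicity 0..1
    kid: list["Person"] = field(default_factory=list)  # children, 0..n


def to_element(person: Person) -> ET.Element:
    """Serialize a Person using one XML element per field, no attributes."""
    root = ET.Element("Person")
    ET.SubElement(root, "name").text = person.name
    ET.SubElement(root, "fname").text = person.fname
    if person.dob is not None:
        ET.SubElement(root, "dob").text = person.dob
    for child in person.kid:
        # Each <kid> wraps a nested <Person>, as in the sample above.
        ET.SubElement(root, "kid").append(to_element(child))
    return root


if __name__ == "__main__":
    john = Person("Doe", "John", "1974-12-31+02:00",
                  [Person("Doe", "Harry", "1991-12-01+02:00")])
    tree = to_element(john)
    ET.indent(tree)  # available from Python 3.9
    print(ET.tostring(tree, encoding="unicode"))
```

Optional fields (0..1) are simply omitted when absent, and 0..n fields repeat the element once per value.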
Find a text editor that can color your XML; I use Programmer's Notepad, but there are many that will do the job.
Export the XML file to HTML and then copy/paste from IE to Word. "XML now in real color!"
But you only do that in the overview documents. You also need to provide a real reference, not just a DTD: a detailed, appendix-style document in human-readable form.
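If your editor has no HTML export, a very rough stand-in can be written in Python. This sketch only escapes the markup and colors element names; the color, the font and the function name are arbitrary choices for the example, and a real syntax highlighter does much more.

```python
import html
import re

# Matches an escaped "<" (optionally followed by "/") and an element name.
TAG = re.compile(r"&lt;(/?)([A-Za-z_][\w.:-]*)")


def xml_to_html(xml_text: str) -> str:
    """Escape XML for display and wrap element names in a colored span."""
    escaped = html.escape(xml_text, quote=False)
    colored = TAG.sub(r'&lt;\1<span style="color:#800000">\2</span>', escaped)
    return f'<pre style="font-family:Courier New,monospace">{colored}</pre>'


if __name__ == "__main__":
    print(xml_to_html("<order><qty>2</qty></order>"))
```

Open the resulting HTML in a browser and copy/paste into Word as described above.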
--meh--
My company makes software that uses XML messages as the interface between a user's invoicing system and our system for tax calculation. The schema for the various messages is well-commented, but we felt the raw XML wasn't as human-friendly as we'd like.
We tried output from XML Spy, which was an improvement for some purposes, but what we really wanted was a succinct table that listed each element/attribute along with its occurrence info, valid values, and a text description. We initially did this manually in Excel, but then we developed a tool internally that reads the XML and outputs the info in a human-friendly HTML table.
Our XML documentation includes the raw schema + the HTML reference tables + sample messages. We provide this doc in HTML format with a frameset to provide table of contents, search, and index (currently WebHelp output from RoboHelp, but we're in the process of switching to Author-IT).
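Our tool is internal, but the general idea can be sketched in Python with the standard library. The following is an illustration only, not our implementation: it assumes the schema is an XSD with xs:annotation/xs:documentation comments, handles named xs:element declarations only (no attributes, no enumerated values), and all function names are made up.

```python
import html
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"


def describe_elements(xsd_text: str) -> list[dict[str, str]]:
    """Collect name, type, occurrence and documentation for each named xs:element."""
    rows = []
    for el in ET.fromstring(xsd_text).iter(XS + "element"):
        if "name" not in el.attrib:
            continue  # skip <xs:element ref="..."/>
        doc = el.find(f"{XS}annotation/{XS}documentation")
        rows.append({
            "name": el.get("name"),
            "type": el.get("type", ""),
            # XSD defaults minOccurs and maxOccurs to 1 when omitted.
            "occurs": f'{el.get("minOccurs", "1")}..{el.get("maxOccurs", "1")}',
            "description": " ".join((doc.text or "").split()) if doc is not None else "",
        })
    return rows


def to_html_table(rows: list[dict[str, str]]) -> str:
    """Render the collected rows as a plain HTML table."""
    header = "<tr><th>Element</th><th>Type</th><th>Occurs</th><th>Description</th></tr>"
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(r[k])}</td>"
                         for k in ("name", "type", "occurs", "description")) + "</tr>"
        for r in rows
    )
    return f"<table>{header}{body}</table>"
```

Feeding it a well-commented schema gives one row per element, which is close to the Excel sheet we used to maintain by hand.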
"I drive way too fast to worry about cholesterol."
Docs embed XML!
uh, no.. it is microsoft that does that when I come to think about it.
I recommend two things:
XML Sample files
Store sample documents in a \Samples subdirectory under the directory storing the Word files. The Word document must 'include' them by using the "Insert\File\Insert as Link" functionality in MS Word.
XML Structure tables
The most useful way to illustrate XML visually is to do the following:
1. Take an XML document that illustrates as much of the schema as possible.
For instance, this could be a document that includes all optional elements. (This may not be possible in some cases where a subelement can be only one of several different types - in this case, you could use different tables, or even auto-include Word tables.)
2. Pretty print the document using a text or XML editor so the XML is properly indented. Remove contents and closing tags that occur on one line so that only the structure of the XML is shown. For instance, remove '20020302T00:00:00</RequestedDeliveryDate>' from the last line below
<PurchaseOrder>
  <OrderHeader>
    <POIssuedDate>
    <RequestedDeliveryDate>20020302T00:00:00</RequestedDeliveryDate>
You may want to convert tab indentation to spaces at this point so that space is more efficiently used.
3. Paste the document into a Word table with additional columns to add usage notes and other metadata for each element. This is best done with the page setup in portrait mode.
It's difficult to illustrate this on Slashdot, but the snippet below sort of illustrates the idea using wiki lingo
XML Element | Notes | Mandatory/Optional | Mapped to Backend
<PurchaseOrder>
  <OrderHeader>
    <POIssuedDate>
    <RequestedDeliveryDate> | the date ... blah blah blah | O | Yes
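Step 2 above (stripping a sample down to its structure) is easy to automate. A minimal Python sketch, assuming a well-formed sample document without namespaces; the function names and the pipe-separated output are just for illustration, and the rows would still need to be pasted into a Word table by hand.

```python
import xml.etree.ElementTree as ET


def structure_lines(xml_text: str, indent: str = "  ") -> list[str]:
    """Return one indented opening tag per element, without content or closing tags."""
    lines = []

    def walk(element: ET.Element, depth: int) -> None:
        # Namespaced tags would show up as {uri}name here.
        lines.append(f"{indent * depth}<{element.tag}>")
        for child in element:
            walk(child, depth + 1)

    walk(ET.fromstring(xml_text), 0)
    return lines


def as_table(lines, columns=("Notes", "Mandatory/Optional", "Mapped to Backend")) -> str:
    """Lay the skeleton out as pipe-separated rows with empty cells to fill in."""
    header = " | ".join(("XML Element",) + tuple(columns))
    rows = [" | ".join([line] + [""] * len(columns)).rstrip() for line in lines]
    return "\n".join([header] + rows)


if __name__ == "__main__":
    sample = ("<PurchaseOrder><OrderHeader>"
              "<POIssuedDate>20020301T00:00:00</POIssuedDate>"
              "<RequestedDeliveryDate>20020302T00:00:00</RequestedDeliveryDate>"
              "</OrderHeader></PurchaseOrder>")
    print(as_table(structure_lines(sample)))
```

Spaces are used for indentation rather than tabs, for the reason given in step 2.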
Like most coders I've been having to do this for some time. My approach seems to allow our customers to easily understand the XML we use:
1. Data Requirements (DB Schema and Expected Values/Ranges)
2. Sample XML Without Data, Just the Schema Values From (1), i.e. [FirstName]nVarChar(15)[/FirstName]
3. Then Show the XSD File That Validates the XML.
Then a full description of each element, etc followed by some samples. True this can get lengthy for really complex schemas but even then it makes it pretty easy to read and "understand" WTF is going on.
Just my $.02
There are only 10 kinds of people in the world. Those that understand binary and those that don't.