Slashdot Mirror


Creating and Using XML-Based Internal Documents?

Richard Emberson asks: "Once again into the breech...or at least the ground floor in a new startup. This time around, I would like to have all of the Engineering documentation internally online: a unified, internal, CVS-ed, web-based, development organization document tree covering the engineering process, methodology, coding standards, nightly build/test reports, FAQs, new hire information and help pages and the documentation for each project. Recently I've written documentation (on Linux of course) using the Apache XML-stylebook tags, stylesheets, and Ant-based publishing - and I like it. So my questions are: Has anyone done this and, if so, how were the links between documents managed?" Does your workplace use XML in its internal documentation? If so, how well does your system work, and what advice would you pass on to anyone else attempting something similar?

"If you start out with only one project (product), how do you structure it so that when new projects come into existence they can easily be integrated? Are there documentation templates out there upon which I can base the various development documents (like requirements, product development plan, design, coding walk-thru standards, etc.) and not have any of this swell too be so large that no one will be able to produce, maintain or read it?"

176 comments

  1. InfoImaging and Dig35 meta data uses XML by purduephotog · · Score: 3, Informative

    Don't know if you'd consider it engineering texts, but XML is used in moving metadata from pictures around. There's an open source and binaries downloadable.... might help your implementation.

    Good luck- it's quite impressive once you get the trees set up correctly :)

  2. XML is good by teknopurge · · Score: 1, Insightful

    Standards are good. XML is good. Documentation should be in a standard format. Network traffic should be in a standard format. IPC traffic should be in a standard format. XML is good.

    Developers,Developers,Developers,Developers.....

    -teknopurge

    http://techienews.utropicmedia.com help us beta!!!

    1. Re:XML is good by Anonymous Coward · · Score: 0

      You have just proven that the way politicians operate is successful. All you said was that you support standards and XML, and that documentation, network traffic, and IPC should all be in a standard format. WOW, what content... you really deserve that +2 Insightful... NOT!!!

    2. Re:XML is good by Jules+Bean · · Score: 3

      ...and people should bear in mind that that is in fact about all that is good about XML.

      XML is simply a notation for structured information. It's not the first, it won't be the last, it's not the most compact (Lisp is far more compact), and it may well not be the best.

      But it's around, there are some robust libraries to parse it, so you might as well use it.

      It's all directed acyclic graphs, really...

      Jules

      --
      -- Any sufficiently advanced technology is indistinguishable from a perl script.
    3. Re:XML is good by the_2nd_coming · · Score: 1

      it is also the most simple

      --



      I am the Alpha and the Omega-3
    4. Re:XML is good by Anonymous Coward · · Score: 0

      It's all directed acyclic graphs, really...

      Wow...did you just learn how to use that really big word in class today? You must be very proud of yourself for being able to use it properly in a sentence.

      Acyclic Graph = Tree

    5. Re:XML is good by ftobin · · Score: 1

      It's all directed acyclic graphs, really...

      You know, you could've just said "It's all trees, really", and made it a lot simpler. Speaking of which, XML may simply be a notation for hierarchical and mixed data, but it happens to be a fairly simple one that extends well into multiple namespaces, is unambiguous with regard to parsing, and in general is human readable.

      While I agree XML may not be the best at accomplishing specific tasks, its general nature makes it a pretty good fit for general, all-around use. Uniformly using XML for a lot of things helps out us humans more than anything else.
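
The "it's all trees" point can be made concrete with a few lines of code. The following is only a minimal sketch using Python's standard-library `xml.etree.ElementTree`; the sample document and the `outline` helper are invented for illustration and do not come from the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for illustration only.
SAMPLE = """<manual>
  <chapter title="Coding standards">
    <para>Use <emphasis>consistent</emphasis> naming.</para>
  </chapter>
  <chapter title="Build reports"/>
</manual>"""


def outline(element, depth=0):
    """Return one indented line per element, visiting the tree depth-first."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(SAMPLE)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

Each element has exactly one parent, so a plain recursive walk visits every node once; mixed content (text interleaved with child elements, as in the `para` above) is still part of the same tree.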

    6. Re:XML is good by Anonymous Coward · · Score: 0

      ...Power to the people! Death to obfuscating ubergeeks.

  3. Nobody Uses XML by gamorck · · Score: 0

    XML is a buzzword. A poor one at that. XML is no more compatible and no more universal than the concept of binary data.

    XML is not a standard - rather it is the culmination of marketing hype designed to lure inexperienced web developers into the trap of full buzz compliance (which normally comes with little real experience).

    So in essence - when this guy says, "I want the company's data in XML" he may as well be saying that he wants the company's data in one of those newfangled "binary" files.

    Can't slashdot do any better than this? The only reason I read anymore is for the trolls. The quality of their postings outweighs the quality of the stories they are attached to. That is a truly sad commentary on the slashdot community.

    Gam

    "Flame at Will"

    --
    I love idealists not because I am one, but because they make life bearable for pragmatists such as myself.
    1. Re:Nobody Uses XML by pubjames · · Score: 0, Offtopic

      You are so ignorant about technology you don't deserve to consider yourself part of the 'Slashdot community'.

    2. Re:Nobody Uses XML by Anonymous Coward · · Score: 0

      And you are so ignorant of the English language you shouldn't be allowed to use it. You don't know the difference between an apostrophe and a quotation mark.

    3. Re:Nobody Uses XML by mudimba · · Score: 1

      I think that your concept of XML is exactly backwards. Instead of being equivalent to binary data, I would say that XML can be used like text data used to be.

      One of the great things about the Unix philosophy is the notion that programs should use text for their input and output. This allows for programs to be piped together, and large problems can be solved by combining many small solutions.

      As data becomes more complex and is shared among many different machines and architectures, straight text files are no longer sufficient. This is where XML comes in. Not only is it compatible and universal across machine types and OS's, but it is also able to be shared across any program.

    4. Re:Nobody Uses XML by L8Knight · · Score: 1

      I couldn't have said it better. With the time it takes a program to parse through that XML crap I could have written a binary to ascii converter to display my fuel efficient binary data nicely!

    5. Re:Nobody Uses XML by Anonymous Coward · · Score: 0

      I don't know, that sounds like a perfectly typical slashdot reader to me...

      All you need is an opinion to get through the door ;-)

  4. XML by Anonymous Coward · · Score: 1, Offtopic

    XML is a complement to HTML
    XML is not a replacement for HTML.

    It is important to understand that XML is not a replacement for HTML. In future Web development it is most likely that XML will be used to describe the data, while HTML will be used to format and display the same data.

    My best description of XML is as a cross-platform, software and hardware independent tool for transmitting information

    1. Re:XML by the_2nd_coming · · Score: 1

      this is just a nit pic but actualy XML is a compliment to XHTML, the current HTML does not support XML very well......just a nit

      --



      I am the Alpha and the Omega-3
    2. Re:XML by MobyDisk · · Score: 2, Insightful

      1) Your point is irrelevant since this discussion is not about HTML.
      2) XML is neither a replacement nor a complement for HTML. HTML has nothing to do with XML. XML is an extensible markup language that can be used to transfer an unlimited variety of data forms; HTML happens to be one of the many, many uses it has, but not nearly the most important. XML is more commonly used for databases, RPC calls, log files, EDI, or new languages than as a complement for HTML.

    3. Re:XML by Anonymous Coward · · Score: 0

      Well, I hope XHTML appreciated it.

    4. Re:XML by Tet · · Score: 3, Interesting
      XML is more commonly used for databases, RPC calls, log files


      I'll curse the brain dead moron that first suggested using XML for log files for the rest of my days. It is completely unsuited to the task. I have to wade through megabytes of pointless XML every day, searching for errors that are obfuscated by the sheer volume of crap that surrounds them.

      --
      "The invisible and the non-existent look very much alike." -- Delos B. McKown
    5. Re:XML by danboy · · Score: 0, Redundant

      It's also worth pointing out that while HTML is a markup language, and therefore good at displaying data, XML is displayed using XSL, similar to the way HTML can be styled using CSS.

    6. Re:XML by Anonymous Coward · · Score: 0

      XML is a complement to HTML
      XML is not a replacement for HTML.


      Why has this been modded as informative? This guy doesn't know what he's talking about.

    7. Re:XML by Anonymous Coward · · Score: 0

      Then you're not doing it right.

    8. Re:XML by cduffy · · Score: 2

      How much work is it to write a script (I'd use Python, but whatever) that parses the log files and extracts whatever's interesting to you, and displays that info in your preferred format?

      Really, I'm asking: how much work?

      Because from where I stand (as a programmer who does some sysadminning on the side), it doesn't look that hard.
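
As a rough illustration of how small such a script can be, here is a minimal sketch in Python using only the standard library. It assumes the log is well-formed XML; the `log`/`entry` element names, the `level` and `time` attributes, and the sample data are all hypothetical, since the thread never shows the real log format.

```python
import xml.etree.ElementTree as ET

# Hypothetical log format: the element and attribute names are invented.
SAMPLE_LOG = """<log>
  <entry level="INFO" time="2001-11-05T09:00:01">service started</entry>
  <entry level="ERROR" time="2001-11-05T09:00:07">disk quota exceeded</entry>
  <entry level="INFO" time="2001-11-05T09:01:12">request served</entry>
  <entry level="ERROR" time="2001-11-05T09:02:30">connection refused</entry>
</log>"""


def extract_errors(xml_text):
    """Return (time, message) pairs for every entry whose level is ERROR."""
    root = ET.fromstring(xml_text)
    return [
        (entry.get("time"), (entry.text or "").strip())
        for entry in root.iter("entry")
        if entry.get("level") == "ERROR"
    ]


errors = extract_errors(SAMPLE_LOG)
for time, message in errors:
    print(f"{time}  {message}")
```

This only works when the whole file parses as XML. A log that mixes XML fragments with other formats would need a pre-processing step before a parser could be applied.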

    9. Re:XML by erc · · Score: 1

      And you could write all your code in assembly, it doesn't look that hard?

      It is STUPID IN THE EXTREME to store textual data in anything other than the simplest text format you can get away with. Use the lowest common denominator, but I guess they don't teach that anymore in school, just "use the latest XYZ!" nonsense that's almost taken over this field.

      If you want to use XML, fine, translate text to XML, but it is IDIOTIC to store it that way.

      --
      -- Ed Carp, N7EKG erc@pobox.com PGP KeyID: 0x0BD32C9B What I'm up to: http://intuitives.mine.nu
    10. Re:XML by Tet · · Score: 2
      How much work is it to write a script (I'd use Python, but whatever) that parses the log files and extracts whatever's interesting to you, and displays that info in your preferred format?


      More than you'd imagine, because it's not pure XML. It's XML intermixed with other data in different formats. But still not that hard. The point is, though, how much less work would it be if they'd just logged stuff in a sensible format to start with?

      --
      "The invisible and the non-existent look very much alike." -- Delos B. McKown
    11. Re:XML by Hard_Code · · Score: 2

      Ditto for configuration files:

      <a-big-configuration>
      <a-section>
      <a-section-name>
      whoopee
      </a-section-name>
      <a-section-comment>
      here is an irrelevant comment!
      </a-section-comment>
      <a-value>
      <note-about-this-value>
      in case you didn't know, this is a value - watch out!
      </note-about-this-value>
      <a-value-name>
      bloatedness.factor
      </a-value-name>
      <a-value-value>
      <a-value-value-type>
      <a-value-value-type-value>
      integer
      </a-value-value-type-value>
      <a-value-value-type-description>
      an 'integer' is a whole number...ya moron...
      </a-value-value-type-description>
      </a-value-value-type>
      <a-value-value-value>
      1000000000000
      </a-value-value-value>
      </a-value-value>
      </a-value>
      </a-section>
      </a-big-configuration>

      --

      It's 10 PM. Do you know if you're un-American?
    12. Re:XML by cduffy · · Score: 2

      XML *is* the simplest format, because nobody ever needs to write an XML parser: every major language already has three.

      There are databases optimized for loading and running queries on XML data based only on the included metadata. There are graphical XML editors which allow a structured view of arbitrary data, given a DTD (which is also standardized and has tons of tools capable of processing it).

      Writing a custom parser for every single data file one might wish to store because one chose to use a text format that looked easier strikes *me* as stupid in the extreme.

      Btw, I didn't learn this stuff in school -- I learned it the hard way, writing conversion tools for other people's "simple" proprietary formats.

    13. Re:XML by cduffy · · Score: 1

      One man's sensible format...

  5. The problem is by EQ · · Score: 2, Insightful

    Document interchange with our customers had to be in Word, so that's what we got stuck with, even though we initially started off in the direction you are heading.

    Good luck taking on the Microsoft Monster.

    --
    Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo! http://goo.gl/J9bkO
    1. Re:The problem is by danboy · · Score: 1

      I believe OpenOffice is using XML as their standard document format. You could use it as your office suite, and just save as an MS Word doc when you need to deliver to a client (that's if MS Word is not already supporting XML and XSLT).

    2. Re:The problem is by white_owl · · Score: 1
      Although it may not meet every need, I have made sample documents in Word with the right styles and saved them as HTML. This gives you the cascading style sheet that Word needs. Then you slap that on the beginning of the documents that come out of your XML system (I used a lot of

      type tags) and then imported the files back into Word. At that point I had a styled Word file.

  6. Doc Book by jjr · · Score: 5, Informative

    Why not use DocBook it is XML based extendable what more could you ask?
    Have Fun

    1. Re:Doc Book by Anonymous Coward · · Score: 0, Offtopic

      what more could you ask?

      How about the use of periods to separate sentences?

    2. Re:Doc Book by skwog · · Score: 0, Offtopic

      Burritos. I could ask for burritos. :)

      --


      You can laugh without eating a sandwhich, but you can do both if bring one.
    3. Re:Doc Book by Anonymous Coward · · Score: 3, Informative

      Another vote for DocBook. We use it and it works great. You can process DocBook SGML (it also comes in an XML flavor, but we haven't used it) into pdf, rtf, and html. Books can come with multiple chapters so that many folks can work from a cvs repository at once. Then, you build the docs every night the same way you build the code. It takes a smart person a few hours to set up the build process, but it's no big deal. O'Reilly has a book on it, check Amazon.

      DocBook is excellent, no question about it.

    4. Re:Doc Book by asyn42 · · Score: 2, Interesting

      Doc Book is fine for tech people that can understand and value the benefits of the open nature it promotes. But at the risk of starting a flame war, is there a good editor that provides a WYSIWYG environment for editing the documents?

      We have a very common environment of mixed levels of technical expertise. MS-Word is the standard that everyone has, and everyone can use. Currently it is the dictated standard for internal documentation within R&D. The only way to get something in to replace it, is to provide a UI which is at least close to its simplicity of use

      Any solutions out there for DocBook?

    5. Re:Doc Book by Anonymous Coward · · Score: 1, Interesting

      These guys have a single-sourcing documentation app... tech writers enter content into an app that kind of looks like Word or FrameMaker, and then you output to which ever format you like.

      http://www.arbortext.com/

      It's not quite as simple as I make it sound: you'll have to develop a DTD and Stylesheets... but for the content-people, is easy-breezy.

      Don't know how well this falls into the scope of a fully XML-based internal doc environment though.

    6. Re:Doc Book by bt · · Score: 1

      DocBook is nice for managing certain types of content, but using it often means trading a high level of structure for a generic one. You're stuck using the tags that DocBook defines, which is good for books and articles, but it doesn't scale beyond that very well. For example, using MathML or SVG with DocBook isn't a lot of fun.

    7. Re:Doc Book by cenobyte · · Score: 3, Interesting

      I'd agree. DocBook is excellent and comprehensive. It is designed for applications like this (technical documentation, etc.) and is well supported (built into all of ArborText's editors, Frame+SGML, etc.). It is well documented: it has an O'Reilly book written by Norman Walsh. It is very easy to add support for it to standard text editors: I use xemacs and it works fine (no flames please).

      Linking features between documents can be achieved using <olink> and <ulink> tags. The exact details would depend on your repository setup, but the things you want to do are catered for.

      Best of luck.
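
One way to keep such links manageable in a nightly build is to scan the sources and flag cross-document links whose target is unknown. The following is only a sketch under stated assumptions: the sample fragment, the `KNOWN_DOCS` set, and the `collect_links` helper are hypothetical, and the attribute names (`url` on `ulink`, `targetdoc` and `targetptr` on `olink`) follow DocBook 4.x usage as I understand it, so check them against the DTD version you actually use.

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook fragment, invented for illustration only.
SAMPLE_DOC = """<article>
  <para>See the <ulink url="http://example.com/">project site</ulink>.</para>
  <para>Build steps are in
    <olink targetdoc="build-guide" targetptr="nightly">the build guide</olink>.</para>
  <para>Style rules are in
    <olink targetdoc="coding-standards" targetptr="naming">the standards</olink>.</para>
</article>"""

# Hypothetical set of document ids known to the repository.
KNOWN_DOCS = {"build-guide"}


def collect_links(xml_text):
    """Return (external_urls, cross_document_targets) found in the text."""
    root = ET.fromstring(xml_text)
    urls = [el.get("url") for el in root.iter("ulink")]
    targets = [
        (el.get("targetdoc"), el.get("targetptr")) for el in root.iter("olink")
    ]
    return urls, targets


urls, targets = collect_links(SAMPLE_DOC)
# A cross-document link is dangling when its target document is unknown.
dangling = [doc for doc, _ in targets if doc not in KNOWN_DOCS]
print("external:", urls)
print("dangling:", dangling)
```

Run over every file in the tree, a check like this could fail the nightly documentation build when a new project's documents are referenced before they exist.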

    8. Re:Doc Book by jon · · Score: 3, Insightful

      I just finished using docbook-xml 4.1.2 - docbook-dsssl 1.71 (contains a print and an HTML stylesheet for use with RTF and TeX backends) - openjade 1.4devel1 (which uses the stylesheets just mentioned in its RTF and TeX backends) and pdfjadetex to produce a 140 page technical manual in .pdf format. I used psgml mode with Emacs 20.7 as my editor. (I was looking at the interesting group of programs called psgmlx but didn't have time to actually try them.)

      In short, for editing, Emacs/psgml mode worked fine, but the complexity of keyboard commands, my aging brain, and deadline pressure meant that I could only learn to use a small chunk of the possible features, especially of psgml mode which, unlike Emacs, was new to me. I opened one Emacs frame on one page on a 4 x 7 desktop for each chapter and then skipped from desktop page to desktop page with a full screen for each. Hint: Make a tags table so that you can then do search and replace operations across all files that make up your book. (See 22.16 Tags Tables in the latest GNU Emacs Manual (14th edition).)

      Using docbook-xml, I found that tables were difficult, too difficult. I was unable to make a simple 3 column x 5 row table within the hour I'd allotted and finally gave up. Ended up using the tag to get literal text. Fortunately, only one or two tables were desired. Checking the archives on the docbook-related mailing lists, I was not able to put Norm Walsh's advice (given to someone with a similar table-related problem) to go read the info on the new table model to practical use within my limited timeframe.

      Also, index generation was a bitch. I used collateindex.pl in conjunction with openjade. There must be a better way, but I couldn't find it in the limited time I had.

      No graphics were included, but some of the traffic regarding the problems of others gave me a headsup that I should allow plenty of time for experimentation before actual work could begin, learning how to insert both vector and bitmap graphic files in a document that would be turned into a .pdf.

      Incidentally, I was also able to produce MIF files for FrameMaker from the same XML source with openjade, a client requirement.

      YMMV.

    9. Re:Doc Book by Anonymous Coward · · Score: 0

      XSSSL is a stylesheet language for Docbook, and XSSSL can define this. Thanks!

    10. Re:Doc Book by Anonymous Coward · · Score: 0

      I wish it were comprehensive, but so far I haven't found anything that can do math except TeX. Without high quality math you can't have real technical documentation.

    11. Re:Doc Book by cpeppler · · Score: 2, Interesting

      You may want to take a look at XML Spy as an XML Editor. It's a commercial product (about $199), but I've used it to build a DocBook file, and then used the DocBook XSLT scripts to generate HTML. I've also used it to generate custom DTDs, which worked pretty well. The product can import Word files (converts Word Styles to element types). What they really need is an XSL script to generate Word formatted files. That would be great! I'm not sure your basic office folks are quite ready for it, but with a little training, they might be weaned off of Word.

  7. Javadoc by t00tie · · Score: 0, Offtopic
    Javadoc is great for code comments - but kinda lacks "external" documentation. Especially stuff like "How to set up the development and production environments". Do consider using a similar directory tree for documentation as for code.

    t00t TooT

    --
    I asked my closed-source vendor about ubiqitous computing.
    He answered "Oh no! You-not-be-quit-us!"
  8. s/again/more by cthlptlk · · Score: 1

    It's "once more into the breech"

    1. Re:s/again/more by Vulch · · Score: 1

      If we are getting picky, Henry V's speech before Harfleur actually starts "Once more unto the breach". :-)

      Anthony

    2. Re:s/again/more by Anonymous Coward · · Score: 0
      Henry: Once more unto the breach, dear friends, once more! Consign their parts most private to a Rutland tree!

      Richard: Let blood -- Blood -- BLOOD! -- be your motto!

    3. Re:s/again/more by EnderWiggnz · · Score: 2

      sed: -e expression #1, char 12: Unterminated `s' command

      if you're going to correct someone using regex, do it right :-)

      --
      ... hi bingo ...
    4. Re:s/again/more by gorgon · · Score: 1

      His version works fine in vi. Since there's no direct reference to sed, I think he's fine and you're the one who should be careful about corrections ;).

      --

      And I'd be a Libertarian, if they weren't all a bunch of tax-dodging professional whiners.
      Berke Breathed
    5. Re:s/again/more by EnderWiggnz · · Score: 1

      oh my god...

      you do realize, that this makes us totally confirmed geeks...

      i'm gonna go do something manly now... heh.

      --
      ... hi bingo ...
  9. breech? by cheezfreek · · Score: 1

    Breech? Did that guy mean "breach" or "breeches"? Personally, I'd prefer going once more into the breeches...

  10. Why Are You Asking Me? by 4of12 · · Score: 5, Insightful

    It sounds to me like you're already a step ahead of the rest of the world, for the most part.

    My workplace uses a hodge-podge of formats including "special" ASCII text files, Framemaker, HTML and Microsoft Word. Needless to say, it's a mess. No open, standard, consistent tools to examine all of our documentation. Yeah, you can grep HTML, but the others are a pain. And don't even think about automatic script language based conversion among these formats.

    I suspect you're more advanced in your thinking than 90% of the places out there. Why not continue with your thinking and let the rest of us know what you decide?

    --
    "Provided by the management for your protection."
    1. Re:Why Are You Asking Me? by sphealey · · Score: 5, Interesting

      "My workplace uses a hodge-podge of formats including "special" ASCII text files, Framemaker, HTML and Microsoft Word. Needless to say, it's a mess"

      The question I would ask is, Is that because the tools are inadequate? Or because human thinking and creative processes are a mess?

      I have worked in highly structured engineering shops where everything is done according to procedure and every document stored in a structured manner (basically using procedures laid down in 1920!). These shops excel at delivering well-scoped projects in understood knowledge realms (the mythical bridge) on-time and on-budget. They do not do so well at handling projects in poorly understood knowledge realms, or projects where the environment and/or requirements changed rapidly.

      I have also worked in loose, no-standards, anything goes engineering shops without any structured document/knowledge processes. I wouldn't hire one of these shops to build a bridge on time, but in fast-changing environments they do much better than the first type.

      Conclusion: Be sure you understand what type of shop you are supporting before tying everything down with highly structured processes.

      sPh

    2. Re:Why Are You Asking Me? by FFFish · · Score: 2

      A common misunderstanding of the ISO900x (and similar) requirements is to think that it demands inflexible, planned-in-advance, detailed-to-the-extreme structures.

      It doesn't.

      You can run a highly creative shop and still achieve ISO9001 compliance. When documenting the processes, you "build in" the flexibility that's required to maintain the creativity, while at the same time avoid regulating it to such an extent that you squelch the creativity.

      It can -- and has -- been done. Standards are a *very* good thing in all work environments. Standards that require source-cause fixing of "bugs" (errors, mistakes, mis-steps, call it what you will) are even better.

      --

      --
      Don't like it? Respond with words, not karma.
    3. Re:Why Are You Asking Me? by Anonymous Coward · · Score: 0

      right.

      And if you are going for the procedural style, do not use CVS for it. You will be writing a lot of extra code to make sure the right things go into CVS at the right point in the procedure.
      Use Aegis for the procedural control. It supports a solid development process and enforced review. That will save you a lot of headache trying to find out what files were changed for some particular design change. Search for "Aegis peter miller" at Google to find the website...

    4. Re:Why Are You Asking Me? by Anonymous Coward · · Score: 0

      ISO-900x:

      I sure wish that the shops that adhered to ISO-900x & other quality assurance methods had understood this. They tended to end up being places where engineers were acting more as (overpaid) technical writers than engineers.

      Conversely, I have also worked for shops that had no document/design process whatsoever.

      Some of these worked well, when the project staffs were small enough, and others not at all, referring to both ISO-900x & anarchists.

      All I can say is take the reply with reference to the 1920 shop to heart. It is a truism. If your startup is working on new technologies & applications don't kill it with too much process.

      "on time": Delivering projects on time in my experience has more to do with the understanding of the subject matter by those making the timelines than any other factors. Most of the projects that I have been attached to did not come in on time because it was a manager who made the timeline w/o consultation w/ the engineers and/or with poor or no understanding of the actual scope of the project. (I.e. most of these guys made schedules that they thought that the customer wanted to see, and/or did not modify timelines to reflect additional design changes and additions.)

      BTW: I have NEVER seen good/useful ISO-900x produced documentation. It has tended to be crap produced to meet the process requirements, and the best documentation that I have seen has been ad hoc. (Part of the problem with ISO-900x process generated documents is that they rarely seem to be updated... A process fault? yes, but I think that is because most ISO-900x processes that I have seen are so cumbersome. Partially this is the fault of the process designer and the remainder the fault of the auditing agencies.)

  11. I love how MS is dealing with XML by the_2nd_coming · · Score: 2, Interesting

    They say that Office will use XML; however, I doubt that they will use it as a file format, as that would put a big leak in their desktop domination.

    They will most likely use it for communicating between their Office applications.
    Of course they don't tell you that in the press releases, they just say "Office will be moving to XML because we want to support standards". Too bad they aren't using XML in a standard way (I know I will get flamed for that last part :-p)

    --



    I am the Alpha and the Omega-3
    1. Re:I love how MS is dealing with XML by MobyDisk · · Score: 2

      Your information is dated: Microsoft Word 2000 and beyond use XML as the native file format.

    2. Re:I love how MS is dealing with XML by big.ears · · Score: 2

      It is so ironic that MS is embracing XML as the state-of-the-art cutting edge technology (evidence of their innovation?). Why? Well, what is XML? some super-encrypted file format? some highly compressed communication system? Some super-efficient binary code transfer protocol? No--its text-based markup, designed specifically to avoid the inter-operability problems associated with a long tradition of proprietary binary document formats.

    3. Re:I love how MS is dealing with XML by Anonymous Coward · · Score: 0

      I love the smell of Innovation in the morning!

    4. Re:I love how MS is dealing with XML by mfarver · · Score: 2, Informative

      Really? My copy of Word 2k doesn't seem to save native in XML, nor can I find any options concerning it. We would love to see this feature, since most of our technical documents are in word and the inability to search the documents is killing us. In fact XML is the primary reason we are examining OpenOffice.

    5. Re:I love how MS is dealing with XML by Anonymous Coward · · Score: 0

      Ok, me stupid:
      WHy can't i read word doc files in notepad then?

    6. Re:I love how MS is dealing with XML by Anonymous Coward · · Score: 0

      Ha ha ha ha ha. Now you go and load a Win 2000 *.doc in notepad and tell me what you think?

      MS uses XML for the Office to HTML integration - But the XML is used in a wonderfully wacky and completely proprietary manner.

      U surprised?

    7. Re:I love how MS is dealing with XML by coyote-san · · Score: 2

      Last I heard, the MS XML format replaced

      blob of binary data

      with

      <ms-office type="word">
      blob of binary data
      </ms-office>

      Or something like that. Just like that joke about the guy lost in the helicopter, they provided a format which is technically correct, but totally misses the point.

      --
      For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
    8. Re:I love how MS is dealing with XML by Anonymous Coward · · Score: 0

      Microsoft will never properly support XML. It is not in their interests to do so. Microsoft's proprietary formats (Word, Excel documents etc) are one of the few major barriers to companies moving to other alternatives. Whereas the whole point of XML is to make it easier for applications to exchange information.

      Microsoft say that their applications use XML, which technically they do, but it is deliberately done in a way which makes it very difficult to interpret.

      It's really just another marketing ploy. Microsoft can say that their products are XML compliant, so that the 99% of the world that don't understand these things can think 'XML is the latest thing! Microsoft supports it! Microsoft goood!'

    9. Re:I love how MS is dealing with XML by greenrd · · Score: 1
      No, this was a particularly blatant example of a Microsoft Big Lie. Try opening a .doc file in Notepad sometime. Look like XML to you?

    10. Re:I love how MS is dealing with XML by Def+Mango+Raygun · · Score: 1

      If they do it's news to the help system. Try typing in "save as XML", in Help search, you get nothing back. I also just looked at a MS2K .doc file. Looks like binary to me.

    11. Re:I love how MS is dealing with XML by FFFish · · Score: 2

      Alas, Microsoft *has not* moved to XML.

      Word2K saves its files in an ASCII format that *looks* like XML, but isn't. It's jam-packed with, you guessed it!, proprietary "extensions." It's not parseable as pure XML.

      Add in that the Word DTD isn't, to my knowledge, a publicly-available DTD, and the XML mess that Word produces is just as opaque as the hodgepodge mess that they called RTF.

      --
      Don't like it? Respond with words, not karma.
    12. Re:I love how MS is dealing with XML by Anonymous Coward · · Score: 0

      That's an old Slashdot joke, no need to take it as fact. There is actually no real XML support in Office (2000 at least).

    13. Re:I love how MS is dealing with XML by spankenstein · · Score: 2

      Ok... so i looked at a document that I created in Word 2000 and saw nothing that even resembled XML.

    14. Re:I love how MS is dealing with XML by coyote-san · · Score: 2

      It might be an old joke, but if so it's based in fact. Knowledgeable people I know and trust have told me the same thing - that early XML support (at least) was nothing but a thin wrapper around the existing file format.

      The question is if Office XP has peeled back a few layers of the onion.

      --
      For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
    15. Re:I love how MS is dealing with XML by MobyDisk · · Score: 2

      I guess you guys never heard of compression. It is a technique that allows files to become smaller, and often times a side effect is that text files become binary.

      If you ever use the Word 2000 object model to contact MS Word you will get XML streams in and out of it.

  12. Zope by DOsinga · · Score: 5, Informative

    Have you considered Zope? It is perfect for storing a document tree, it has strong support for XML, including an extension for DocBook, and supposedly it integrates with Apache. It also gives you lots of options to format the XML as HTML.

    1. Re:Zope by Anonymous Coward · · Score: 2, Interesting
      Zope is fine for some shared writing environments, but I'll take a step out on the branch and say that there aren't enough failsafe tools readily available that can handle XML conversion, storage, and group-writing to make it worthwhile to invest in Zope instead of the solution he's thought up.

      CVS is designed for the group. Revision control is handled in Zope primitives, but to be able to back out of something, while diff'ing another thing would be a pain to code for it while maintaining its (admittedly half-decent) UI.

    2. Re:Zope by hey! · · Score: 2

      Strong support for XML? A lot depends on what you mean by support.

      The Zope people have been tinkering with XML for some time, and have had some interesting capabilities, but they are a bit behind the curve (IMHO) on supporting XML standards like XSLT. Perhaps there's a bit of impedance mismatch, since Zope has always been about content/presentation separation, but does it in a different way than XML. I'd say that Zope is currently rather more web centric than most people seriously interested in XML need it to be. People will want the capabilities to generate printed manuals and PDFs from their documents. On the other hand, Zope rocks for making HTML. For example, I think that for transforming XML into various forms of HTML pages, Zope's Page Templates is going to be far simpler and better for HTML designers to deal with than Apache Turbine or WebMacro. On the other hand it's apples to oranges, because there is a lot more flexibility built into the servlet model than into Zope.

      It's not saying you can't do everything with Zope, just that you have a lot more that you will have to either develop or wait for somebody else to do before you really have a complete solution.

      --
      Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
    3. Re:Zope by dgroskind · · Score: 2, Informative

      Three relevant links to read in considering Zope for XML are:

      Creating XML Applications With Zope

      Create a XML Based Document Repository

      Cant Handle Humongous XML

      In some data management scenarios, using Zope obviates the need for XML markup. In practice, content management issues like security, revision control, and online access through a browser are bigger issues than markup. Zope provides solutions to all these problems.

      My main caveat in using Zope is that finding all the relevant documentation for XML or anything else is a veritable Easter egg hunt. The Zope API doesn't seem to be documented in one place. More than once a Zope tutorial seriously proposed that the reader read the Python source code for further information.

  13. Stylebook is dead by Lt · · Score: 3, Informative

    I use stylebook for internal documents at planetu.com, but stylebook seems to be dead. Docbook is a much better choice. I just have not got around to converting to it.

    The really nice thing about using XML is that I can automate some of the documents. Such as the list of valid form fields for a HTML/jsp page.

    For the nightly builds, change logs and javadocs, I use Alexandria.

  14. Linking between documents by kampi · · Score: 3, Informative

    In your case I'd probably choose docbook, but if you're just looking for a way to automatically setup the links between several XML documents you might take a look at w3make (which I wrote) or a similar project called XWeb which was written by Peter Becker.

    --
    -- a blessed +42 regexp of confusion (weapon in hand) You hit. The format string crumbles and turns to dust
  15. related: hire a librarian by Anonymous Coward · · Score: 1, Interesting

    No joke. Start interviewing these information flow officers now. Learn what they can do for you. Getting a sharp young librarian into your business early can be the guide or gardener your jungle will need.

    Don't associate librarians with "the library"; there's more of them in businesses than public/school libraries. These are specialists trained in information organization, retrieval, and distribution.

    And, being a recession, you may find this investment quite affordable right now.

    1. Re:related: hire a librarian by Monkius · · Score: 1

      So I guess you're a librarian.

      Cut back a little on the puffery and spin, and I might be more persuaded.

      --
      Matt
    2. Re:related: hire a librarian by ctimes2 · · Score: 1

      YES! See if you can find one that looks like those occasional guest stars in Playboy. :)

      Ctimes2

      PS - if you took offense to this... lighten up.

      --
      My cube. My friend. My solace. My prison.
  16. Maybe in the old days... by Carnage4Life · · Score: 2

    XML is a complement to HTML
    XML is not a replacement for HTML.


    I disagree rather strongly with this. I don't know what your experience is with XML but there are lots of shops that use XML for both presentation and data interchange because of its versatility. An XML document can be presented using an XSLT stylesheet or parsed using a DOM, SAX or whatever API. So the same document that can appear on a website because it has a stylesheet to transform it to HTML to make it viewable is the same document that is used by your applications as a config file, data file, database updategram, etc. with zero modifications to the file.
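
    A minimal sketch of the "same file, many consumers" point, assuming Python's standard library parsers stand in for whatever DOM or SAX API a shop actually uses. The document and its element and attribute names are invented for illustration.

```python
import xml.dom.minidom
import xml.sax

# Hypothetical config document; the names are illustrative only.
DOC = """<?xml version="1.0"?>
<settings>
  <option name="cache_size" value="64"/>
  <option name="log_level" value="debug"/>
</settings>"""


def read_with_dom(text):
    """Load the whole tree into memory and return {name: value}."""
    dom = xml.dom.minidom.parseString(text)
    return {
        node.getAttribute("name"): node.getAttribute("value")
        for node in dom.getElementsByTagName("option")
    }


class OptionHandler(xml.sax.ContentHandler):
    """Collect the same options from a stream of SAX events."""

    def __init__(self):
        super().__init__()
        self.options = {}

    def startElement(self, name, attrs):
        if name == "option":
            self.options[attrs["name"]] = attrs["value"]


def read_with_sax(text):
    """Parse the document as events, never building a tree."""
    handler = OptionHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.options


if __name__ == "__main__":
    print(read_with_dom(DOC))
    print(read_with_sax(DOC))
```

    Both readers see identical data from an unmodified file; a stylesheet could render that same file for a browser.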

    This is a very, very powerful aspect of XML. In my opinion, HTML is dead and considering that there's been an XHTML Recommendation for close to two years I wonder why people are still clinging to HTML (Yes, I know it's because the browser developers have dropped the ball).

    1. Re:Maybe in the old days... by Anonymous Coward · · Score: 0

      Oddly enough, I can think of one browser that happens to have 95% of the market share that can display XML using CSS. But that's irrelevant here, because the only point that we need to deal with is that the person to whom you have just replied is a moron. Why do you waste your time with these people, man?

    2. Re:Maybe in the old days... by FatHogByTheAss · · Score: 1
      I disagree rather strongly with this. I don't know what your experience is with XML but there are lots of shops that use XML for both presentation and data interchange because of its versatility. An XML document can be presented using an XSLT stylesheet or parsed using a DOM, SAX or whatever API.

      The 'T' in XSLT stands for 'Transformation.' By the time the data described in the XML has reached you, it might not be (and probably isn't) XML any more. It's been transformed into something else.

      --
      You sure got a purty mouth...

    3. Re:Maybe in the old days... by lectos · · Score: 2, Insightful

      You just proved his point. Displaying XML as HTML is the only way to ensure compatibility across browsers. XSLT, DOM parsing, and SAX parsing use transformations into HTML to display the XML. You notice you are still using HTML? I do!!! It's not the direct use of HTML to style the XML, but it is still HTML. XSLT's template formatting is based on HTML you know. Since the majority of web browsers only read HTML directly, HTML is the easiest and most logical way to style XML. By that definition, XML is a complement to HTML because HTML will still exist regardless. With that said, if we ditch HTML what are we going to use to style the XML? What would the web browser developers have to create to style XML in the browser? They'd have to integrate an XML based schema for styles...hey that's what XHTML/HTML is!!! XHTML is just an XML representation of HTML to ensure your style and outputs are valid and well formed. And yes, I know you can use CSS to style XML but that is not totally cross browser. We still have those people running around with Netscape 4 out there that doesn't support any of this stuff. It's silly to say HTML is dead.

    4. Re:Maybe in the old days... by Anonymous Coward · · Score: 0

      If this were true then Mozilla Quest Quest would render properly.

    5. Re:Maybe in the old days... by lectos · · Score: 1

      Mozilla and Opera support XML+CSS but Internet Explorer does not. Microsoft chose to use a proprietary form of XSLT instead. So that makes 5% of the market with this capability.

    6. Re:Maybe in the old days... by Anonymous Coward · · Score: 0

      You and the other nut above you are both wrong. IE has had support for xml+css since at least last year. The page your idiot friend there linked to is not xml. You can tell because it doesn't start "?xml ..." and it doesn't have a DTD.

  17. XSLT by dorker · · Score: 0, Redundant

    It's called XSL and XSLT. Read up on it.

  18. Our company is using XML for all documentation. by MemRaven · · Score: 5, Informative
    I asked our docs infrastructure person to pipe up, I'll see if she does before this gets to like 400 comments and nobody sees.


    We standardized all our early documentation on XML, and it's been working great. Admittedly we're using Perforce, not CVS, but we're doing something very similar to what you want to do.


    All our documentation is in XML format, in a DTD that we defined. We then have XSLT transformation scripts that convert that documentation to HTML format, and scripts that automatically update our development intranet whenever changes get checked in, along with scripts that invoke javadoc and doxygen on all the code to convert that to HTML format. We're in the process of being able to convert the same documents to PDF format to be able to publish those same documents, in the same formatting, to pretty-formatted documents for printing.


    This, aside from the simplicity of not having to worry about formatting documentation when you write it, is pretty cool. It's easier for me (as an engineer) to write a very sparse, structured XML document that will end up looking very good on screen, than to learn enough HTML to make my documents look good. And it's easier for us to enforce a standard look and feel across all documentation this way, because only the XSL transforms have to change if we change our formatting.


    But the real advantage is coming out with more advanced uses. For example, when we have configuration files, we have a special DTD that we define the documentation for the configuration files in, and then any documents that describe the configuration files are automatically converted both to the HTML documentation, AND to an example configuration file for users. We can also mark some things as only visible internally, so that the same document can have data that's visible to end-users, and data that's visible to employees (so if we have advanced configuration options that we don't want customers mucking with because they're for debugging the system, we can document them in the same place, and just hide them from customers but let our support and professional services people in on the secret).
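
    A rough sketch of the idea above, assuming a made-up `<config-doc>` format with an `audience` attribute; the poster's actual DTD is not shown, and plain Python functions stand in here for their XSLT scripts.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical schema: each <option> carries a name, a default, an audience
# marker, and its description as element text.
CONFIG_DOC = """
<config-doc>
  <option name="port" default="8080" audience="public">TCP port to listen on.</option>
  <option name="trace_sql" default="false" audience="internal">Log every SQL statement.</option>
</config-doc>
"""


def visible_options(xml_text, internal):
    """Yield options, skipping internal-only ones for customer output."""
    for opt in ET.fromstring(xml_text).iter("option"):
        if internal or opt.get("audience") != "internal":
            yield opt


def to_html(xml_text, internal=False):
    """Render the option reference as an HTML table."""
    rows = [
        "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
            escape(opt.get("name")), escape(opt.get("default")), escape(opt.text or "")
        )
        for opt in visible_options(xml_text, internal)
    ]
    return "<table>\n" + "\n".join(rows) + "\n</table>"


def to_sample_config(xml_text, internal=False):
    """Render an example configuration file from the same source."""
    lines = []
    for opt in visible_options(xml_text, internal):
        lines.append("# " + (opt.text or "").strip())
        lines.append("{} = {}".format(opt.get("name"), opt.get("default")))
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_sample_config(CONFIG_DOC))                 # customer view
    print(to_sample_config(CONFIG_DOC, internal=True))  # support view
```

    One source document yields the reference page and the sample file, and the `internal` flag decides whether debugging options appear at all.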


    The best part is that because our XML DTD is very structured, someone like me (an engineer), will actually use it because it ends up being easier for me than writing in plain text, whereas I wouldn't do it in HTML (or if I did, it would just look like crap). It also makes it much easier to do integrations across branches of code: because we know the DTD for our XML documents very well, it's more likely that integrations will go smoothly, which helps keeping multiple branches of code and docs in sync automatically. If you go with a binary format, you're not going to be able to do that, and every time you make a change, you're going to have to manually change the documentation for each and every branch. With ASCII or HTML, everybody's going to produce documents that look a little different, so you're not going to be able to have as easy a time in integrating between branches.


    Our docs infrastructure person can pipe up in terms of the particular technology that we're using, but it's turned out to be one of the best infrastructure decisions that we (actually, she) made, and it's saved uncountable hours and actually made it more likely that people will write documentation themselves, because they don't have to pull out some crazy windows tool: just edit the document in emacs, and it'll still look pretty for the customers.

    1. Re:Our company is using XML for all documentation. by Anonymous Coward · · Score: 0

      All our documentation is in XML format, in a DTD that we defined. We then have XSLT transformation scripts that convert that documentation to HTML format, and scripts that automatically update our development intranet whenever changes get checked in, along with scripts that invoke javadoc and doxygen on all the code to convert that to HTML format. We're in the process of being able to convert the same documents to PDF format to be able to publish those same documents, in the same formatting, to pretty-formatted documents for printing.

      And, of course, it's very easy to change the look-and-feel when your company is taken over (a.k.a. merges). ;-)

    2. Re:Our company is using XML for all documentation. by sien · · Score: 2

      Do you use XML for storing UML designs ? If so what tools do you use ?

    3. Re:Our company is using XML for all documentation. by Anonymous Coward · · Score: 0

      Tip: have a tag for the company name, and use it always.

      Then when you got 0wn3D, you just change the name in one place!

  19. Introducing XML documentation in the company by thbzcrt · · Score: 3, Informative

    Unless you are the CTO or the only developer in your company, you may not have complete control over the documentation format. Other people may, and probably will prefer to write documentation in Word than in XML. And I won't condemn them because Word is a good editor for documentations in plain English.

    A way to introduce XML-based documentation in a company is to prove what it can do (and not just speak about it). In a previous company, I expressed the desire to have a documentation generated from the source code, but nobody seemed really interested. So I did it myself, and when they saw it, they loved it.

    The idea was to parse our source files (which were in various languages, more or less easy to parse) and generate an XML documentation for the APIs. In a second step, other programs read the XML documentation and transformed it in RTF (Word) and HTML, using SAX and XSLT (I tried both and preferred SAX).
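
    For illustration only, here is roughly what the first step could look like, assuming Python source and the standard `ast` module; the original parsers, languages, and XML vocabulary are not described, so the `<api>`, `<function>`, `<param>` and `<summary>` element names are hypothetical.

```python
import ast
import xml.etree.ElementTree as ET

# Stand-in for a real source file.
SAMPLE_SOURCE = '''
def connect(host, port):
    """Open a connection."""

def close(handle):
    pass
'''


def api_to_xml(source, module_name):
    """Return an <api> element describing the top-level functions in source."""
    root = ET.Element("api", {"module": module_name})
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            func = ET.SubElement(root, "function", {"name": node.name})
            for arg in node.args.args:
                ET.SubElement(func, "param", {"name": arg.arg})
            # A docstring is used when present, but nothing requires one.
            doc = ast.get_docstring(node)
            if doc:
                ET.SubElement(func, "summary").text = doc
    return root


if __name__ == "__main__":
    print(ET.tostring(api_to_xml(SAMPLE_SOURCE, "net"), encoding="unicode"))
```

    The resulting XML is the intermediate form; separate programs can then turn it into HTML, RTF, or generated code.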

    The HTML version was installed on the Intranet and the developers used it as a reference documentation in their everyday work. They knew they could trust it more than any other documentation, because it was regenerated every night. They also liked it because, unlike Javadoc, the source code parser worked very hard to gather information from the code without forcing the programmer to use constraining comment conventions.

    The Word version was delivered to the clients as an API documentation.

    Other documentations were written directly in Word. The system worked very well, and ensured a good-quality and up-to-date API documentation without too much work.

    I also used the intermediary XML documentation for other purpose, including some code generation, which proved the versatility of XML.

  20. Try Conglomerate by Colin+Smith · · Score: 4, Informative

    More worthy, full document management system than the efforts being put into word processing.

    Conglomerate

    --
    Deleted
    1. Re:Try Conglomerate by sien · · Score: 2
      Just had a look at that - thanks.

      But I have to say - check the docs section for an interesting quote:
      "Sadly, Conglomerate has very little documentation at the moment, both user and developer documentation is practically nonexistent. There is, however, an introduction to conglomerate's architecture available, which should at least be of interest to potential developers."

      Now surely this is a terrible indictment of a system that is intended to be used as:
      "Software development groups who want a system integrating the entire process of authoring, storage, revision control and publishing, while using the same tools to create API documentation, release notes, tutorials, reference guides, and more, and publish them to online formats (HTML, Windows Help, custom formats) and paper formats (TeX/LaTeX) from the same sources."
    2. Re:Try Conglomerate by greenrd · · Score: 1
      Not necessarily. It takes time to develop a development environment (or similar) to the point where it can be used to "bootstrap" itself - at the beginning, non-stable early versions can't be developed with stable versions since the latter don't exist yet.

  21. XML Document Architectures by under_score · · Score: 5, Informative
    Hi. I have a little experience with this. I'm not going to bore you with the story, rather just get to a simple description of possible architectures for what you want and why you might want them. Finally, I'll conclude by saying that what you are doing is extremely ambitious: don't falter when it gets hard and overwhelming.
    1. Plain XML, without schemas
      XML is a markup language that is supposed to be human readable. Thus anyone can whip up an XML document that describes some data (e.g. documentation on software). It helps if you have standards to make the XML consistent.
    2. Plain XML, with schema (or DTD)
      Creating schemas for all your different types of documentation is probably the first big pain in the butt you will deal with, but it is pretty essential to get a project like you describe to work. It helps by setting common standards which all participants in your org can use to understand the docs they are looking at. Now you also get some tool support for creating and validating your XML documents.
    3. Database -> XML
      Store all your documentation data in a database and use common db tools to extract it and format in XML. Why bother? Tool support! Lots of software development project tools support using a db as a repository for the various work products (documentation and code and stuff). This also allows you to have somewhat easier methods for serving your content to interested parties with appropriate security constraints.
    4. Repository -> XML -(XSLT)-> HTML
      Here we add the ability to transform the human-readable-but-cumbersome syntax of XML into html for viewing on a browser. The big effort for this sort of architecture is that you have to create the XSLT for all your different document types and you need some way of linking-to/searching your documents from the html into the repository. Some application and web servers help with this. I'm most familiar with the Java space, and Tomcat with various xml libraries can be made to do this.
    5. Repository -> XML -(XSLT)-> XML -(XSLT)-> HTML
      This is the most flexible architecture in which pure data XML is transformed into an intermediate form which represents an abstract presentation of the XML and which is then transformed into HTML (or WML or PDF or whatever). For the first stage of transformation you need one XSLT style sheet for each document type to convert it into the presentation XML. Then for the second stage you need one stylesheet for each display format. The big advantage here is that if you need to publish to a new document format, you don't need to re-write _all_ of your first stage transformations, you only need to add one new second stage transformation.
    There are of course variations. Check out IS Architectures - Organizing the Web Server for more details when one of your outputs is HTML.
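
    The two-stage layout in item 5 can be sketched as follows. This is not XSLT: Python's standard library has no XSLT processor, so ordinary functions play the role of the stylesheets, and the `<faq>` and `<page>` vocabularies are invented for the example.

```python
import xml.etree.ElementTree as ET
from html import escape


# Stage 1: one converter per document type -> neutral presentation XML.
def faq_to_presentation(doc):
    page = ET.Element("page", {"title": doc.get("title", "FAQ")})
    for item in doc.iter("item"):
        section = ET.SubElement(page, "section", {"heading": item.findtext("q", "")})
        ET.SubElement(section, "para").text = item.findtext("a", "")
    return page


STAGE_ONE = {"faq": faq_to_presentation}  # keyed by root element name


# Stage 2: one renderer per output format, knowing only presentation XML.
def render_html(page):
    parts = ["<h1>{}</h1>".format(escape(page.get("title")))]
    for section in page.iter("section"):
        parts.append("<h2>{}</h2>".format(escape(section.get("heading"))))
        parts.extend("<p>{}</p>".format(escape(p.text or "")) for p in section.iter("para"))
    return "\n".join(parts)


def render_text(page):
    lines = [page.get("title").upper()]
    for section in page.iter("section"):
        lines.append("* " + section.get("heading"))
        lines.extend("  " + (p.text or "") for p in section.iter("para"))
    return "\n".join(lines)


STAGE_TWO = {"html": render_html, "text": render_text}


def publish(xml_text, output_format):
    """Run a document through both stages."""
    doc = ET.fromstring(xml_text)
    page = STAGE_ONE[doc.tag](doc)
    return STAGE_TWO[output_format](page)


FAQ = "<faq title='Build FAQ'><item><q>How do I build?</q><a>Run the nightly script.</a></item></faq>"

if __name__ == "__main__":
    print(publish(FAQ, "html"))
    print(publish(FAQ, "text"))
```

    Adding a new output format means one new entry in `STAGE_TWO`; none of the per-document converters in `STAGE_ONE` change.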
  22. it is too, and more by Anonymous Coward · · Score: 0
    XML is a complement to HTML
    XML is not a replacement for HTML.

    Actually, XML is (amongst other things) a meta-language, a means of defining sets of tags. XHTML, a particular set of XML tags, is a replacement for HTML.

  23. Re:run for your lives! it's... by stevew · · Score: 0, Offtopic

    this is a BIG jpg that is just a picture of S Ballmer sitting down. Don't bother.

    --
    Have you compiled your kernel today??
  24. Why XML? by dgroskind · · Score: 3, Informative

    A good place to start is Open Source XML Database Toolkit by Liam Quin.

    The key point is that the best approach depends upon how the data will be accessed, used and updated. There does not appear to be an off-the-shelf, one-size-fits-all solution, even if you go to a commercial platform.

    The advantage of XML is that you can start with a simple approach and migrate to a more complex approach without having to do an expensive data conversion.

    The disadvantage is that XML can be quite expensive to set up on legacy documents and expensive to maintain as well. For documents that change frequently, have multiple uses, or require precise retrieval strategies, XML is the way to go. It's particularly useful when version control must be tracked at the paragraph level.

    If version control takes place at the level of the whole document, retrieval is done by keyword, and documents are displayed in one form only, XML may not add anything but trouble.

  25. Process follows team by michaelmalak · · Score: 1
    Not all projects are the same. Not all teams are the same. When it comes to process and tools, practice YAGNI (You Aren't Gonna Need It).


    Let the team gel together. When you see a problem, then bring in some process and tools. Just as with when there is a conflict between the law and custom it isn't law that wins, so it is with process and tools dictated from on high. Process and tools need reasons for existing. By introducing them later, the purpose is clear to the team lead, and more importantly, the purpose is clear to the team. It also allows team members to bring to the table their favorite tools and processes, and the team lead may learn something.


    It sounds reactive rather than proactive, but as with XP, it's really a meta-level up where you're planning ahead to be reactive, and you have a quiver of tools and processes at the ready.

  26. Files Easy, Editing Hard by dschuetz · · Score: 4, Insightful

    Forget about how do you build the repository -- that's easy. (Well, okay, non-trivial, but with databases, cvs, and even just simple shared folders, storing the docs is the least of your worries).

    I still maintain that the biggest hurdle in any standardized document system (especially if you include multiple concurrent authors) is the front-end editor. I wrote a simple (and highly buggy, I'll admit, so you who know me keep your traps shut!) VB application that provided a multi-user front end to a database. The back-end (PHP) pulled all the appropriate rows for any given doc together and mashed it into a nice, navigable HTML document. I even had PDF support at one time (but it was even flakier than the GUI).

    However, it was not XML, so it was REALLY limited in how easy it was to create new views on the data. The biggest problem I ran into was trying to find a good GUI editor -- this thing was written for security engineers, not HTML experts, and I wanted them to concentrate on content, not tags. I eventually settled (and settled is the right word) on the Microsoft DHTML control. Worked well enough for the time (two years ago at this point), but I still think half my problems stemmed from that widget, or bad interface programming to it. The advantage? WYSIWYG (more or less) editing. Seamless multi-user editing of the same document (well, okay, we had some record locking issues. :) ) But again, the long pole of the tent was the editor widget.

    Since then, I've wanted very much to rewrite the thing to handle full XML, and I understand there's an effort underway to do just that (I've since moved to different pastures), but it's slow going. I've looked at current technology (AbiWord, for example), and I'm just not convinced that it's going to be easy to get a good semi-WYSIWYG XML editor going. At least not on the cheap.

    Some time ago an app called Conglomerate was posted here, which I still think has about the best approach to visually representing an XML document. But it hasn't been updated in forever, and was slow/buggy the one time I played with it. More recently, the XMLmind XML Editor (XXE) has shown a lot of promise, even including CSS files for editing DocBook XML. They even have source available. Again, goes a long way to letting you edit diverse XML files in a logical way -- not by forcing you to look at ugly tree-views of an XML file, like so many first-generation editors. Finally, the latest XML Spy editor beta goes a bit farther even than XXE, using a full XSLT transform to provide a WYSIWYG format for XML files. Theoretically, with this, you should be able to display any of your documents in whichever approach you like -- full WYSIWYG, tables, trees, block labels, whatever.

    Of course, neither of these latter tools work in a concurrent editing fashion. But that's a "minor" enhancement -- put together a robust DB back end, allow for good record locking, editor-to-editor communications for lock management, transaction log to allow back-out of changes, etc. Lots of possibilities. Take XXE, put this kind of capability on the back-end, an integrated login and document management system, and you've got a kick-ass document solution. Work the backend to allow for multi-stage review and publishing, and provide output engines for HTML, PDF, WAP, whatever, serve different subtrees of the system to, say, internal project web servers, external web servers, sales and marketing (for glossies), etc., and everyone can manage everything, real-time, GUI, with one tool.

    But I dream.

    (seriously -- if anyone's really working this, I'd love to help. I just wanna use it at home for my own web pages.)

    1. Re:Files Easy, Editing Hard by smallpaul · · Score: 2

      XMetaL is the leader in the XML editing category (in North America, anyhow). They've been in the structured editing business for roughly 15 years. Another strong contender is Documentor.

  27. DocBook by jdevons · · Score: 1

    I would agree that Doc Book might be a good solution.

    I think that the most important thing you can do though is make certain that people do not have to edit XML directly.

    Create something that allows them to work without remembering all the tags and brackets and everybody will be happier.

    --
    I do everything the voices in my head tell me to...
  28. find existing DTD's by Anonymous Coward · · Score: 0

    Many companies have already created or are creating DTD's (a file to define the parameters of your own brand of XML). For example, if your company made car parts, the automotive industry has a working XML data format to define all the details of an automobile. In using existing formats as a basis, you have much of your work done for you and can start with the insight of others in that same industry as to what type of information is likely to be included. This also opens the door to good interoperability and potential use of existing tools to convert your 'CarXML' document into different document formats. The best thing you can do is use existing work like this to build on, thus you are:
    --"Standing on the shoulders of giants"--

  29. Use a wiki by AveryRegier · · Score: 1
    We use a wiki for this purpose. The one we use is from devtools.org. The advantage of this versus a static site is the ease of updating the information.
    1. Updates do not have to go through any particular person(s).
    2. An engineer can update the work directly through his browser.
    3. Links amongst related documents are handled dynamically instead of explicitly. Thus pages can theoretically be updated with links to related information without actually manually changing the page.
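
    Point 3 is easiest to see in a toy renderer. The sketch below assumes the common CamelCase "WikiWord" convention and an invented `/wiki/` URL scheme; it is not the code of the wiki mentioned above.

```python
import re
from html import escape

# Two or more capitalized runs joined together, e.g. CodingStandards.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")


def render(text, existing_pages):
    """Escape text and link every WikiWord; unknown pages get a create link."""

    def link(match):
        name = match.group(0)
        if name in existing_pages:
            return '<a href="/wiki/{0}">{0}</a>'.format(name)
        return '{0}<a href="/wiki/{0}?action=edit">?</a>'.format(name)

    return WIKI_WORD.sub(link, escape(text))


if __name__ == "__main__":
    source = "See CodingStandards and BuildFaq."
    print(render(source, {"CodingStandards"}))
    # After someone creates BuildFaq, the unchanged source gains a real link.
    print(render(source, {"CodingStandards", "BuildFaq"}))
```

    Links are resolved at render time against the set of existing pages, so creating a page updates every document that mentions it.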
    1. Re:Use a wiki by Khalid · · Score: 3, Informative

      Yes I confirm this ! at work we use twiki (twiki.org) one of the best wikis I know of, really a very nice collaboration tool. It can be used as knowledge management repository too, very easy to use and to start getting people using it.

  30. Forget CVS; start a Wiki! by Sunlighter · · Score: 3, Informative

    The Wiki Wiki Web is a set of editable, cross-referenced web pages. Anybody can view them and anybody can edit them, and they are searchable. Wikis are pretty useful for internal documentation projects. It should be possible to extend the concept to add the security that is typically required and to add support for XML. Of course, all that means I am practically suggesting you write your own custom Wiki, which may take too long for you. But you could probably start with an existing Wiki and get good results. I have set up UseModWiki (which is a CGI script written in Perl) and gotten good results.

    Hope this helps!

    --
    Sunlit World Scheme. Weird and different.
    1. Re:Forget CVS; start a Wiki! by platos_beard · · Score: 1
      Ok, I've never looked at Wiki before now, so maybe I'm missing something. I've looked briefly at the TWiki and PHPWiki samples (I hate Perl). It may be true that "anybody can edit [wiki pages]", but they won't.

      Editing wiki pages looks like a HUGE STEP BACKWARDS from the editing that most of us are familiar with. I can't imagine trying to get more than one or two of the people where I work using it, and how much good can a collaborative tool do if most of the intended collaborators won't?

      --
      What's a sig?
  31. One word ! Wiki ! by Khalid · · Score: 4, Insightful

    More exactly one of the best incarnations I know of : twiki (twiki.org). Absolutely terrific ! it can be used as a collaboration medium, a knowledge base repository and much, much more, you will find new ways of using it everyday. I have installed it where I work ! and people have been ecstatic !

    1. Re:One word ! Wiki ! by Cato · · Score: 2

      I use TWiki a lot as well - it doesn't address the same problem space as XML, and in fact doesn't really have semantic markup, but it is very useful as a way of collaborating by creating web pages. The idea is that anyone with a web browser can edit any web page. There are some people working on transforming Wiki pages into documents, using TWiki, but that's not really its focus.

    2. Re:One word ! Wiki ! by 2b · · Score: 1

      TWiki is very cool. Our approach to documentation is to use TWiki for internal docs, notes, "shared whiteboard" and then migrate to DocBook for external delivery. This seems to work well since wiki is very fluid, and then DocBook (with CVS) is more controlled, and can be delivered in many different ways (e.g. pdf, html, rtf, etc).

    3. Re:One word ! Wiki ! by Winged+Cat · · Score: 2

      Agreed...for less structured data. If you've got a poorly understood problem you're documenting, where the lack of understanding precludes even putting together a DTD for the docs, Wiki works well. And given that "poorly understood" tends to mean "interesting"...

  32. we use XML for our knowledge base by valmont · · Score: 3, Informative

    I work at an ISP ... (not AOL, not MSN) We have a whole department who's in charge of writing up procedural documentation, walkthroughs, how-to's, FAQ's, to solve just about any problem you could ever encounter on almost any platform and operating systems under the sun on your way to getting connected to the internet.

    As soon as XML standards and derived technologies and languages (XSLT, DOM, and more) became firmly established nearly 2 years ago, we moved whatever existing documentation we had into XML, conforming to internally developed DTD's and specs. Before that, a couple of guys and I built a handy HTTP-based authoring tool that leverages technologies built into Internet Explorer 5.0 (which I've previously described right here), so writers don't have to know anything about XML and can simply click their way thru easy interactive forms, in a fairly compelling user interface ...

    With all of our information stored in XML, we can easily present it to various audiences, may it be our members who can search it by keywords to help themselves in our online support area or our technical support reps who can browse directory trees to specific XML documents and have access to more detailed information about hardware and platform configurations, document revision information and more.

    The bottom line is this system works really well. And we have the amazing peace of mind of having GREAT information in a format that can never become completely obsolete, and that is always a couple XSLT stylesheets away from fitting just about *any* need.

    Whether you make up your own DTD's or follow existing standard DTD's like DocBook mentioned in other posts, as long as you put some thought into structuring your XML data at the beginning, you can only win in the long run: XML documents can easily be processed into other XML DTDs/formats to represent the information in a way that better fits another application, and/or transformed into other documents made of a markup language meant for presentation like HTML or WAP.
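    To make the "a stylesheet away from any need" idea concrete, here is a minimal sketch in Python rather than XSLT, using only the standard library. The `faq`, `entry`, `question` and `answer` element names are made up for illustration; they are not the DTD described above.

```python
import xml.etree.ElementTree as ET


def faq_to_html(faq_xml: str) -> str:
    """Turn a (hypothetical) FAQ document into an HTML definition list."""
    source = ET.fromstring(faq_xml)
    dl = ET.Element("dl")
    for entry in source.iter("entry"):
        # Each question becomes a <dt>, each answer a <dd>.
        ET.SubElement(dl, "dt").text = entry.findtext("question", default="")
        ET.SubElement(dl, "dd").text = entry.findtext("answer", default="")
    return ET.tostring(dl, encoding="unicode", method="html")


if __name__ == "__main__":
    sample = (
        "<faq><entry><question>How do I dial in?</question>"
        "<answer>Use the modem settings page.</answer></entry></faq>"
    )
    print(faq_to_html(sample))
```

    The same source could be fed to a second function that emits a different presentation format, which is the point: the stored XML stays untouched.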

    yea. XML is nifty. :)

  33. Similar problem here... by Eminence · · Score: 2, Informative

    At my company (in fact it's a local branch of a US-based corporation) we have a similar problem. There is a team here developing a system designed specifically for a customer. As one can expect, along with such a system goes all the documentation - everything from the analysis, through functional specification and coding guidelines, to end user and administrator's manuals. To make things more complicated, part of the development - and the documentation - is being done by a subcontractor (which happens to be in another hemisphere) - and it is being prepared in English, but some parts of it (especially the manuals) have to be translated into the local language.

    Up until now it has been a growing mess with documentation being written in Word (with all the usual problems Word has with large files, with lots of graphics - screens, no versioning etc.), with no standards, with people getting into one another's way while trying to update the numerous documents.

    Recently, together with a friend, I came up with the idea to switch all that into neat XML/SGML files, with CVS based versioning and everything based on open standards and free software as much as possible. To our surprise the management liked the idea and we got a green light to do some research. And then the problems began.

    First, the editor. Coding XML files with vi or alike might be nice for a hacker - and is great for creating and testing XML formats used then for data storage etc. - but it is out of the question for documentation authors. And it is pretty understandable - to be able to concentrate on the content, on the text itself, the author needs to see only the contents, as nicely rendered as possible - no tags getting in the way in each sentence, no learning for years how to use the editor (thus Emacs with its psgml mode is not an option - don't flame me, it's just a fact). After a long search I have to say that there is no working, finished GNU/free editor that would match our requirement of almost-WYSIWYG presentation of an XML/SGML file. As to commercial ones the only two that look good are XML Spy 4.0 - but it is just a poorly working beta for now - and Arbortext's Epic - which is almost exactly what we need, but is a bit expensive at around $700 a license.

    Nevertheless, with no other options left we decided to go for the Epic when it comes to the editing side. We got an evaluation package and begun testing.

    Now, we were from the start convinced that the DocBook DTD & the tools that go along with it are the best choice for the kind of problem we faced. Epic supports DocBook but comes with its own version, which in turn doesn't work well with the Linux sgml tools that we use for translating the XML/SGML files to useful end formats. On the other hand, not all of Epic's features can be used when one just tries to edit a document based on an "external" DTD. To enable things like being able to see the graphics files inserted into the document one has to hm... "customize" Epic by creating some additional configuration files (like .FOS files) using yet another expensive tool Arbortext sells - the Epic Architect.

    But that is not the end of the problem, because the stylesheets currently available for translating the Docbook based XML/SGML files into useful formats are not well documented and partially don't work (for example tags related to inserting pictures in the document are ignored when trying to generate a printable document). There is for example a project on Sourceforge that develops XSLTs and DSSSLs for translating Docbook based XML into various formats, but so far I was not able to make them work - and there is no documentation. Also the DSSSL based machinery for translating SGML files that comes with various Linux distros is far from perfect - HTMLs are generated mostly OK, but printed documents (.tex and .pdf) leave much to be desired.

    So, from our point of view it looks like we will have to buy an expensive editor and then someone would have to spend a month or so tweaking the editor, modifying the stylesheets for our needs, developing procedures and so on. And that someone would have to be quite a competent person (with deep knowledge of the subject), someone, who could be probably better used directly in the development project.

    As for now the future of our little plan of switching from mess to neat XML based solution is uncertain. Mainly because we would have to build that neat solution ourselves, as what we can get from outside at the moment are some bits and pieces that - although nice by themselves - just don't fit together.

    (And, BTW, I haven't even touched the nice catch with CVS - to be really useful in the kind of environment that we envisioned it would have to be integrated with the editor - and that doesn't seem likely).

    1. Re:Similar problem here... by valmont · · Score: 1


      take a look at my comment, you could basically spend a couple days to a couple weeks max building a trivial to complex tool to edit all your documentation :) drop me a mail at valmont|at|wildstar|dot|net if u want more info. good luck :)

      -c

  34. Word to XML? by Bilbo · · Score: 3, Insightful

    MS-Word is the standard that everyone has, and everyone can use.
    True enough, but it doesn't answer the original question. Can MS-Word (or even StarOffice, if you want a non-MS option) output XML? I know it will output HTML, but it's such a bastardized mass of proprietary and font/format specific crap that it's essentially useless unless you extensively filter it first.

    The whole point here is not to create pretty formatted documents, but to leverage the power of XML to add context and meaning to the content of the documents in order to create a rich and interlinked hierarchy of information. Conventional word processors just create blobs of information -- pretty formatted blobs, but blobs nonetheless...

    --
    Your Servant, B. Baggins
    1. Re:Word to XML? by Anonymous Coward · · Score: 1, Informative

      Office 2000 includes lossless XML support. That is, if you save a Word document into XML, and read it back into Word, no formatting should be lost.

    2. Re:Word to XML? by coyote-san · · Score: 2

      But what does that XML actually look like? Early MS XML support, at least, replaced

      blob of binary data

      with

      <ms-office type="word">
      blob of binary data
      </ms-office>

      Technically a well-formed XML file (assuming that the DTD shows the ms-office tag as having CDATA content), but it's useless as a shared document format.

      --
      For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
    3. Re:Word to XML? by Anonymous Coward · · Score: 0

      Actually, Word 2000's "XML" support is pretty much horseshit. It's HTML with some pseudo-XML which allows Word to round-trip documents from HTML and back. Excel hits closer to the mark with the files it creates for the ActiveX control, but the XML is still BFing ugly.

      Maybe Office XP has improved this situation, maybe not.

    4. Re:Word to XML? by JamesOfTheDesert · · Score: 1
      Office 2000 includes lossless XML support.

      Bullshit. Office saves as bloated HTML. It throws in namespace-qualified elements so that it looks all XML-like, but deep down it's bastardized HTML. Not all elements are properly closed or nested, and many attributes aren't quoted.

      Sure, you can save it out and load it back into Word, but so what? You can't save it and load it into an XML editor.

      --

      Java is the blue pill
      Choose the red pill
  35. XLink by Jeffrey+Baker · · Score: 2

    For links between documents, use the W3C's XLink Specification.

  36. Star Office by LetterJ · · Score: 3, Interesting

    OpenOffice (the equivalent of Mozilla to StarOffice/Netscape) has gone to an XML format for its native format. It's actually several XML files zipped up in an archive, but you can easily open it and look at the plain XML. Check out the recent builds of OpenOffice. Many of the gripes against StarOffice 5.2 have been resolved.
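    Since the file is just a zip archive of XML, a few lines of script are enough to look inside. This is only a sketch in Python with the standard library; it assumes nothing about which members a given build writes and simply reports the root element of every `.xml` member it finds. The function name is my own.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def xml_members(archive) -> dict:
    """Map each .xml member of a zip archive to the tag of its root element.

    `archive` may be a file name or a binary file-like object.
    """
    roots = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith(".xml"):
                roots[name] = ET.fromstring(zf.read(name)).tag
    return roots


if __name__ == "__main__":
    # Build a tiny stand-in archive in memory so the sketch runs anywhere.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("content.xml", "<doc><p>hello</p></doc>")
    print(xml_members(buffer))
```

    Pointing the same function at a real document saved by a recent build should list its XML parts, though I have not checked how it copes with every DOCTYPE those files may carry.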

  37. biggest barrier in a small shop/startup by StandardDeviant · · Score: 2

    The biggest barrier I've encountered in small or startup environments to something like this is organizational "buy in". For example, at $workplace[-1] I wrote a defect tracking system custom tuned for us. Worked pretty well on a technical level ... but nobody used it because with < 10 people, it was easier to just turn around and hand somebody a postit-note bug report.

    Then again, that is characteristic of one of the central challenges facing a small organization, namely how to grow the structures to make a larger organization maintainable. I.e. overcoming the "this is a pain in the ass, why do we need this now?" factor. The answer to this depends strongly on the people involved, but if you can make the system about as painless as typing into an ASCII file or scribbling on a postit, you stand a better chance of success.

  38. Use XHTML by Ars-Fartsica · · Score: 2
    You will get the extensibility of introducing your own namespaces, but you can use straight HTML where it is applicable, which will allow you to use pre-built browsing and editing tools.

    You really don't want to get involved in building browsing and editing tools for an arbitrary schema; it's not worth the time.

  39. XML is huge threat to Microsoft by pubjames · · Score: 2, Interesting

    I've not seen this point of view expressed much on Slashdot, so here goes:

    Microsoft have been very clever getting where they are today. One of the principal means they have got there is using the interfaces between different functional elements like keys, to lock customers in, and lock competing technologies out.

    XML is a simple, standard way of formatting diverse information types so that it is easy to exchange data between applications, and easy to write programs for. It is brilliant in its simplicity, and anyone who has studied it will know that it is 'not just another format', but one of the most important standards ever developed in the history of computing.

    This represents a huge threat to Microsoft as it threatens one of their main strategies. I believe that everyone in the open source world should learn XML and its associated standards, and use them as far as possible in their programming work. If, for instance, the open source community adopts DocBook, or Sun's forthcoming XML standard for documents, for all open source word processors, I don't think it will be long before there are so many useful document manipulation applications available that there will be a compelling business reason to move from Microsoft Word to an open source alternative.

    Learn XML folks!!!

    (There are probably a few reading this who are thinking - 'but Microsoft says that their office file formats are now XML based'. To you all I can say is that you should learn XML, and then you'll realise that what Microsoft is doing doesn't really have anything to do with what XML is all about).

  40. XML/XSL by Anonymous Coward · · Score: 0

    The system we use here ( I won't give out a URL as it'll be slashdotted - suffice to say we're a Cambridge UK based ASP ) is to write all of the latest documentation in XML and just browse around using XSL processing instructions in the original file. It all works nicely and we even have XSL files that are processed by other XSL files to produce HTML based on the internal comments which works very nicely.

    We solved the interconnection of the files by having one uber XML structure file that works like a table of contents in tree form, so you can navigate around: the XSL generates links to the other documents as laid out in the XML file, using the document() function.

    Get into XSL it's loverly.

    As an aside we even have an XML that contains directives to run external XSLs on the source XML and stitch the results back into the original XML and re-apply. It's a great way to expand things.
    We use it for website construction.

  41. Re:XML and CVS by Zeinfeld · · Score: 2
    My biggest worry with XML and CVS is someone using a GUI editor that chooses to re-wrap hard line breaks, or change every tag in some (XML) legal, yet CVS noticeable way.

    What you really need is a source code system that can recognize that the input is XML and convert the document to canonical form before applying the diff. Unfortunately you would have to write code; the C14N for XML is, well, utterly unsuited for that task.

    Alternatively you could switch to a source code manager that used compression across the different archived versions rather than a simple diff. Unfortunately that probably involves writing the code yourself.
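    For what it's worth, the "normalize first, then diff" idea can be sketched in a few lines of Python (3.8 or later, standard library only). This is not plain C14N: the whitespace collapsing is my own crude addition so that re-wrapped lines compare equal, and it will also eat significant spaces in mixed content, so treat it as an illustration and not as something to bolt onto CVS as-is. The function names are hypothetical.

```python
import difflib
import re
import xml.etree.ElementTree as ET


def normalize(xml_text: str) -> str:
    """Collapse whitespace runs, then canonicalize (sorted attributes etc.)."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Crude: treats every run of whitespace as a single space and trims.
        if elem.text:
            elem.text = re.sub(r"\s+", " ", elem.text).strip()
        if elem.tail:
            elem.tail = re.sub(r"\s+", " ", elem.tail).strip()
    return ET.canonicalize(ET.tostring(root, encoding="unicode"))


def semantic_diff(old_xml: str, new_xml: str) -> list:
    """Unified diff of the normalized forms, one element boundary per line."""
    old_lines = normalize(old_xml).replace("><", ">\n<").splitlines()
    new_lines = normalize(new_xml).replace("><", ">\n<").splitlines()
    return list(difflib.unified_diff(old_lines, new_lines, lineterm=""))


if __name__ == "__main__":
    before = '<doc b="2" a="1"><p>hello\n   world</p></doc>'
    after = "<doc a='1'   b='2'><p>hello world</p>\n</doc>"
    print(semantic_diff(before, after) or "no meaningful change")
```

    A GUI editor that only re-wraps lines or reorders attributes would then produce an empty diff, while a real text change still shows up.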

    --
    Looking for an Information Security student project suggestion?
    Try http://dotcrimeManifesto.com/
  42. XML/XSL - You know it makes sense by Richard_J_M · · Score: 1

    The system we use here ( I won't give out a URL as it'll be slashdotted - suffice to say we're a Cambridge UK based ASP ) is to write all of the latest documentation in XML and just browse around using XSL processing instructions in the original file. It all works nicely and we even have XSL files that are processed by other XSL files to produce HTML based on the internal comments which works very nicely.

    We solved the interconnection of the files by having one uber XML structure file that works like a table of contents in tree form, so you can navigate around: the XSL generates links to the other documents as laid out in the XML file, using the document() function.
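    The master-structure-file idea is easy to picture outside XSL too. Here is a rough Python sketch of it; the `section` and `doc` elements and their `title` and `href` attributes are invented for the example and are not the actual format used at that site.

```python
import xml.etree.ElementTree as ET

# Hypothetical master structure file: one tree listing every document.
STRUCTURE = """
<site>
  <section title="Process">
    <doc href="process/coding-standards.xml" title="Coding standards"/>
    <doc href="process/builds.xml" title="Nightly builds"/>
  </section>
  <section title="Projects">
    <doc href="projects/alpha/design.xml" title="Alpha design"/>
  </section>
</site>
"""


def nav_links(structure_xml: str) -> list:
    """Walk the structure tree and emit an indented list of links."""
    lines = []

    def walk(node, depth):
        indent = "  " * depth
        for child in node:
            title = child.get("title", "")
            if child.tag == "section":
                lines.append(indent + title)
                walk(child, depth + 1)
            elif child.tag == "doc":
                href = child.get("href", "")
                lines.append(f'{indent}<a href="{href}">{title}</a>')

    walk(ET.fromstring(structure_xml), 0)
    return lines


if __name__ == "__main__":
    print("\n".join(nav_links(STRUCTURE)))
```

    Adding a new project then means adding one `section` to the structure file; no individual document has to know where its neighbours live.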

    Get into XSL it's loverly.

    As an aside we even have an XML that contains directives to run external XSLs on the source XML and stitch the results back into the original XML and re-apply. It's a great way to expand things.
    We use it for website construction.

  43. Editor? by xant · · Score: 2
    The question I keep hitting is how do you create these documents in the first place? The XML "editors" out there are badly inadequate for writing documentation. What you need is something that will let you write with no more effort than a text editor, and then go back and select a bunch of text and ctrl-c and voila, you have a comment, or voila, you have a section title. HTML editors such as Mozilla's have the right idea; unfortunately there's logically no way to write in HTML and convert it to XML.


    Clearly having the document stored as XML is superior because you can convert it to whatever you want, and the tags represent the document's meaning rather than its appearance. But producing the document using raw tags is not friendly; it discourages documenters.


    What editor do you use?

    --
    It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
    1. Re:Editor? by Petronius · · Score: 1

      XMLSpy. Runs on Windoze (yuk!) but truly is a great tool.

      --
      there's no place like ~
  44. UML by MemRaven · · Score: 2
    Actually, kinda. We use Visio to create our UML diagrams (so the few people that actually draw them have to have Windows, other than that all engineers use Linux or Solaris), but then we translate the drawing (which is just for pictorial reference) into another XML DTD.


    That one gets fed into a custom repository system which automatically generates all the code we need for using the objects, including javadocs and XML documents, which then go into the rest of the channel for documentation. The repository system also generates multiple types of storage back-ends (Oracle, Postgres, Java Serialization, In-Memory-Only, etc.) so that the engineers never have to worry about data access code, and we can change what type of storage we're using at a customer site.


    So in that respect, we're using Visio or MagicDraw just for the documentation of the UML model, but the "real" model is always in XML and only used by the repository system we've developed.

  45. Different problems by coyote-san · · Score: 3, Informative

    You're dealing with different problems here.

    DocBook, and any decent document XML DTD, gives you the ability to tag your text with some description of what it means. It might be "chapter" or "list," or it might be domain specific like "files," "bugs" and "see also" (for man pages). The presentation details are left to the processing software to handle.

    MS-Word, in contrast, is nothing but a paint tool for words. You can certainly give your styles names that have some domain meaning to you, but it's still ultimately nothing but a set of style instructions.

    For a single document, this isn't a big issue. But if you have a lot of documents and you want to reuse content, it's impossible with the MS-Word approach. With DocBook, in contrast, it's easy to set up your documents so that the same file can be reused in multiple places, but only selected content will be reused.

    IMHO, if your technical writers can't make the shift to meaningful tags, you're better off without them. (The writers, not the tags.) If they can't handle this level of structure, their writing is undoubtedly muddled and confused no matter how pretty it looks.

    --
    For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
  46. Link CVS yourself by DamienMcKenna · · Score: 1

    Epic's site says "Customization: C++, Java, Perl, TCL and Visual Basic programs can be written to customize Epic Editor operation." so you should be able to grab some of the code off the net (eg WinCVS's) and link in CVS support yourself.

  47. XML: what problems should you expect by mir · · Score: 1

    OK, time to play the devil's advocate (I am actually a big XML fan): here are the problems you should expect when switching all your docs to XML:

    • the DTD/schema will never please everybody: in an old-fashioned word processor if someone is not happy with the template you are using they usually find a way to change it and there is nothing you can do about it, with XML you can (and you will most likely have to) coerce people into sticking to the company policy. People usually don't like it.
    • editing XML is still either painful or expensive: tools like FrameMaker (you need the SGML version to do XML) or Adept editor are outrageously expensive, even more recent tools like XML-Spy are not really cheap, and not everybody wants to use emacs,
    • encoding problems: XML doesn't actually force you to use Unicode (actually UTF-8 or UTF-16) but it heavily suggests it. A lot of tools will only output UTF-8, for example. This might make it difficult for your documents to be used in the rest of your tools, as most likely few of them are Unicode friendly.
    • math is a real problem, the only sane way to represent math in XML is Math ML, which is not widely supported.
    • XSLT: you will most likely be drawn to writing XSLT "stylesheets" to transform your documents. Don't be fooled by the name stylesheet, these are programs, written in an angle-brackety user-unfriendly functional-hell language. Even if you happen to like XSLT (did you guess I don't? ;--) it _is_ a new language that you will have to learn.
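    On the encoding point, one workaround that sometimes helps is to rewrite non-ASCII characters as numeric character references before handing a document to a tool that isn't Unicode friendly. A minimal Python sketch (the function name is mine, and it assumes the input really is UTF-8):

```python
def to_ascii_xml(utf8_bytes: bytes) -> bytes:
    """Re-encode UTF-8 XML as pure ASCII using numeric character references.

    The result is still valid UTF-8, so an encoding="UTF-8" declaration
    in the document stays correct.
    """
    text = utf8_bytes.decode("utf-8")
    return text.encode("ascii", errors="xmlcharrefreplace")


if __name__ == "__main__":
    print(to_ascii_xml("<p>caf\u00e9</p>".encode("utf-8")))
```

    Any XML parser reads the references back as the original characters, so nothing is lost; it only helps with tools that choke on multi-byte input, not with tools that don't understand XML at all.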

    In short XML is probably a good choice: it gives you at least independence from the word processing software and allows you to include/retrieve data automatically in/from your documents. But don't underestimate the trouble you will have to go through to get to your "integrated-wonderful-all-encompassing system" (which doesn't exist right now, so you will get it in version 1.0...).

    --
    Look, that's why there's rules, understand? So that you think before you break 'em. (Terry Pratchett)
  48. UML in XML by pi_rules · · Score: 2

    Ever heard of the Visio-ish tool called 'Dia'? It's a GTK based application, with ports to Win32. It stores all of its diagrams in an XML format natively. It's not as robust as Visio, but it diagrams all my thoughts out rather nicely and I can send them over to all the other developers using Win32 platforms.

    Sounds like a nice system -- and a well organized team. You hiring? :)

  49. How we do it around here: by Hard_Code · · Score: 2
    --

    It's 10 PM. Do you know if you're un-American?
  50. I don't like XML by L8Knight · · Score: 1

    I'm all for standards, but I really don't like XML. I know, I know, I must be the only one. Maybe it's my exposure to it, or maybe it's what my company wants to do with it, but I'm just not a fan. For documentation at my company we use a mix of ASCII text files, Word files, or whatever the writer wants. My biggest beef with XML is that it is amazingly slow to parse (can you tell I'm a programmer?) and clunky (I hate all those tags). I think XML is a buzz word that caught on in the eyes of corporate management and they decided it sounded cool so they made everyone support it.

    Personally, I like LaTeX. I just started to get into it, but it's really easy to learn, and it converts into nice looking PostScript or PDF files. It's so easy to convert a plain text file into a LaTeX file. Who needs this Microsoft crap anyway?

  51. Writing content in XML by pdoubleya · · Score: 1

    I've been working on a project to document software engineering methodology and best practices (eventually to be published and freely available). The content is written in XML. I wrote the DTDs for the basic document types and XSL stylesheets for output to HTML. I've written a few docs already for the system (two, three dozen or more). Some notes on my experience.

    1) Editing: I find the XML editors available online to be slow to use. I don't want to click on a tree to navigate through my document--too much mouse work--and I find the format for editing individual nodes is cumbersome if you have a complex document structure. I settled on using a good text editor (EditPlus, not free) with some templates for automatically inserting large node sections that are typically edited in a block. The editor has syntax-highlighting so it's easy to differentiate the XML from the content. I also have template files for quickly starting a new document of each type, with all the basic nodes already pasted within so I can open a new template and start typing.

    2) Display: I wrote some simple XSL sheets to start with, that just displayed the XML in a readable, basic HTML format. I have my editor linked to a script that transforms the current XML on command, and opens the HTML in the same editor--so I can quickly preview the results. I use the Apache Xalan XSL processor with Java JDK 1.3. I also have an Apache/Jakarta Ant script to transform all documents on command. Since I started, I have spent some time improving the layout, but writing a basic XSL for display is not difficult--I recommend having basic ones around (no gifs, simple tables, etc.) for no-nonsense preview of the content.

    3) Navigation: I have specific tags in my docs (like reference or ext-reference) for linking documents. The XSL generates the appropriate tag for navigation, based on attributes in the tag. My newer approach is to generate either static links or a link to a Java servlet with the page name; the servlet then handles the navigation request. The servlet approach has the advantage that it can eventually transform documents on the fly, allow you to specify a transformation type (e.g. pretty, simple, text-only) and even pull content from a repository.

    4) Content management: Right now I use directories to manage the content, since I support 5 document types, and use the file names to distinguish the content. I spent a bunch of time when I started figuring out the DTDs of the various document types so I would organize content most efficiently and with little or no duplication of intent. I recommend that approach--it's been a lot easier, when I want to write, to identify the document type I want, open a template, and start typing. I think this is where the first real effort comes, not in picking an editor, etc., but in deciding exactly how the information will be categorized. It's important for searchable content, or content you need to introspect, to be pulled out from the main text. For example, in using a reference tag, I can write a verification routine that checks the reference is valid (e.g. if a file, check that the file exists). I try to avoid writing anything I would need to parse out--for example, dumping a code sample right within a text block. I'd rather have a code-sample tag for easy identification.
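    A verification routine of the kind mentioned in 4) might look roughly like this in Python. The `reference` tag name comes from the description above, but the `href` attribute holding a relative file path is an assumption made for the sketch; the real DTD may name it differently.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def broken_references(doc_path, attr="href") -> list:
    """Return the targets of <reference> tags that point at missing files.

    Targets are resolved relative to the directory of the document itself.
    `attr` is a hypothetical attribute name; adjust it to the real DTD.
    """
    doc_path = Path(doc_path)
    tree = ET.parse(str(doc_path))
    missing = []
    for ref in tree.iter("reference"):
        target = ref.get(attr)
        # A reference without a target counts as broken too.
        if target is None or not (doc_path.parent / target).exists():
            missing.append(target)
    return missing


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        for target in broken_references(path):
            print(f"{path}: broken reference to {target}")
```

    Run over the whole tree from the nightly build, something like this would catch dangling links before anyone clicks on them.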

    I'd stress that the biggest hassle has been getting the XSL to work properly and generate readable XML, especially for more involved document structures. It's easy to break and hard to debug! In that vein, I recommend starting with very simple XSLs that output really really simple HTML and building up from there.

    -Pdoubleya

    --
    "I honestly would vote libertarian if their candidates weren't usually total cooks."--slashdot poster
  52. Try AxKit or Cocoon by barries · · Score: 1

    I developed a system like this for a previous job. We started with a source browser and then began checking engineering documents into our repository and (thus) making them browsable on the web.

    It can be really be nice. The hardest things were social issues; it requires a bit more discipline to maintain documents in a repository than by file sharing and email. Establishing the taxonomy so that people know where to put things and where to look for them is critical; but getting a good search engine up[1] can help there. Using small documents to redirect browsers to the "right" place can also help, since you can take a document that spans categories and put it in multiple places. I also wanted to get a Wiki in place, but left before we got there.

    Having the revisions of design docs and test specs (which always change as implementation proceeds) tied to the appropriate revs of the software and other documentation using CVS tags, perforce change #'s/labels, or whatever is really nice.

    As far as linking, HTML and MSOffice links were always made relative to the current document, so you could browse them when you'd checked them out on to your local machine. That was another social issue: educating people to be careful, since most HTML editors and MSOffice HTML manglers of that vintage needed a little extra care to create relative links.

    Since this was software engineering, we also implemented in-browser diffs (a lot like CVS web) and source code browsing as research tools. Having the design and discovery ("hey! so that's how this code works!" "Quick write it down before you forget!") docs that can link to the source code means that a new team member can read through a design doc and jump in and see the source code (where we kept implementation documentation).

    You can modify your server-side source code browser to find things like file:../../design/index.xml in source files and convert them to http: links when browsed, then use an editor or xterm that activates embedded links and be able to refer to design docs in source code. This also allows developers to link out to web sites, though we slurped anything that important into the repo to protect against bit rot.
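    The file: to http: rewriting can be sketched in Python like this. It is only an illustration of the idea, not what our browser did: the regular expression, the function name and the base URL are all made up, and repository paths are assumed to use forward slashes.

```python
import posixpath
import re

# Matches file: followed by a relative path, e.g. file:../../design/index.xml
FILE_LINK = re.compile(r"file:([\w./-]+)")


def link_file_refs(source_text: str, source_path: str, base_url: str) -> str:
    """Rewrite relative file: references as links under base_url.

    source_path is the repository path of the file being browsed; each
    reference is resolved relative to that file's directory.
    """
    source_dir = posixpath.dirname(source_path)

    def to_url(match):
        target = posixpath.normpath(posixpath.join(source_dir, match.group(1)))
        return f"{base_url}/{target}"

    return FILE_LINK.sub(to_url, source_text)


if __name__ == "__main__":
    line = "/* design notes: file:../../design/index.xml */"
    print(link_file_refs(line, "src/net/io.c", "http://repo.example/browse"))
```

    Because the references in the source stay relative, they keep working in a local checkout while the web browser view gets clickable links.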

    Since you've chosen XML a priori, you should definitely look at AxKit (Perl-based) or Cocoon (if Java's your fave) as delivery vehicles. Both are ASF official projects, though AxKit is a recent addition and hasn't made it to an Apache webserver.

    Never used Cocoon, but AxKit can easily back-end to CVS and apply whatever transformations you like. AxKit can apply its own XPathScript and XSP (language="perl") style sheets; various C-based XSLT engines (Sablotron and libxslt come to mind); and, of course, 100% Pure Perl to thoroughly munge your docs. Then it caches the results (if you like) and optionally GZips them (which is nice for dial-up or VPN use).

    AxKit's main drawback at the moment is that its web site is down due to sluggishness of British Telecom in installing a new data line, but you can find out more by searching CPAN for the AxKit module.

    If you do it right, you will have a very cool system.

    HTH,

    Barrie

    [1] We did this before XML was all the rage, and getting meaningful searches of MSOffice files checked in to a repository was a right pain. We ended up sharing out a directory tree that was updated nightly with the head revisions and letting people search with MSOffice's built in File Find.

  53. coaxing structure from Word by Anonymous Coward · · Score: 1, Informative

    Here's a way to get and enforce structure in Word documents.

    Word allows named styles, and with View>Normal you can even show the stylenames on the screen (Tools>Options>Style area width). You can create a template with paragraph and character styles that correspond to the structure you want. The template can hack the toolbars: put your styles on a toolbar and in the pop-up menu, if you want. Have the template change the Save command to save docs in rtf.

    Users write in Word. Save into the CMS. In the CMS gui, choose validate or finish.

    The CMS gui then launches a series of processes. First, convert the rtf to xml-like tags, based on the styles used in the document. Second, run some clean-up script to make a well-formed xml document. Third, run a script to do any validations that xml can't handle. Fourth, run a standard xml validator. If the validator finds a problem, you fix it back in Word. You only edit Word files. To preview, translate down to xhtml.

    Users will have to cooperate, but then they're usually paid to cooperate. RTF is nasty, but this is as straightforward a conversion as you can get. The biggest problem is Word's lists. You could either guess at lists based on formatting, or require (hidden) begin/end list paragraph styles.
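
    A rough sketch of the style-to-tag step, assuming the RTF has already been reduced to (style name, text) pairs. The style names, the element names and the "List Begin"/"List End" marker styles are all hypothetical; they stand in for whatever the template defines.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Word style names to element names.
STYLE_TO_TAG = {"Heading 1": "title", "Body Text": "para", "List Item": "item"}


def paragraphs_to_xml(paragraphs):
    """Build an XML tree from (style_name, text) pairs.

    "List Begin" and "List End" play the role of the hidden marker styles:
    they open and close a <list> element and carry no content themselves.
    """
    root = ET.Element("doc")
    stack = [root]  # innermost open element is stack[-1]
    for style, text in paragraphs:
        if style == "List Begin":
            stack.append(ET.SubElement(stack[-1], "list"))
        elif style == "List End":
            if len(stack) == 1:
                raise ValueError("List End without a matching List Begin")
            stack.pop()
        elif style in STYLE_TO_TAG:
            ET.SubElement(stack[-1], STYLE_TO_TAG[style]).text = text
        else:
            # Unknown styles are the kind of problem the author fixes back in Word.
            raise ValueError(f"unknown style: {style!r}")
    if len(stack) != 1:
        raise ValueError("List Begin without a matching List End")
    return root
```

    Because the tree is built element by element, the output is well-formed by construction; validating against a DTD would still be a separate step.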

    I think this approach could let you escape Word. Next upgrade, you could switch to Star Office or KWord.

    A little more detail.

  54. Amaya! (was: Re:Files Easy, Editing Hard) by shibboleth · · Score: 1

    The W3's Amaya lets you open two windows onto the same document, one a nice gui editor/browser and the other xhtml plain text. Saving in one window updates the other. For collaborative editing, see their Jigsaw with WebDAV.

    For searching my docs, i use Apache and ht://Dig. And for quick, organized access to the same docs i created a PHP4 application that allows me to easily create categories, assign docs to them, and to title the documents. That app, with an ht://Dig search field, is my home page, and it works great. Basically i re-created the functionality of a Lotus Notes db i used to use at work (w/o collaboration or replication features) while keeping all the data accessible to any other app that speaks xhtml.

    If there's interest i'll post the PHP4 code somewhere. (If i can ever just get this message posted. /. has been on a /long/ lunch break.)

    --
    "Be thankful you are not my student. You would not get a high grade for such a design :-)" - Minix pro
  55. This is an easily solved problem by srussell · · Score: 1
    I set something like this up for a project I'm working on at the US Forest Service. I started with the CVS repository that we use for our software sourcecode. I wrote a 72 line XML-RPC server and a 15 line client in Ruby. Add one line to CVSROOT/loginfo, and it was done. It took all of three or four hours, including debugging. Any .xml files checked into the "fsweb" CVS project are automatically checked out on the web server, and processed with Xalan. The benefits are realtime page updates, and static web pages (so as to not bog down the server). It works really well.
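
    The original server and client were written in Ruby and are not shown. As a loose illustration only, here is what the server side of such a hook might look like in Python. The function names, host and port are assumptions, and the actual checkout and Xalan run are left out.

```python
from xmlrpc.server import SimpleXMLRPCServer

WATCHED_MODULE = "fsweb"  # the CVS project named in the post


def files_to_publish(module, committed_files):
    """Return the committed .xml files that should be re-published, sorted."""
    if module != WATCHED_MODULE:
        return []
    return sorted(f for f in committed_files if f.endswith(".xml"))


def make_server(host="localhost", port=8000):
    """Create (but do not start) an XML-RPC server exposing files_to_publish.

    host and port are placeholders. A real handler would go on to check the
    files out on the web server and run the XSLT processor over them.
    """
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(files_to_publish)
    return server
```

    A client started from the loginfo line could then call the function through xmlrpc.client.ServerProxy, passing the module name and the list of committed files.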

    The real problem is finding a good document-oriented XML editor. It is impossible to convince most of the users to edit documents in XML source, and most of the word-processor-like XML editors available are expensive and either feature-incomplete or not very user-friendly. We're looking at Morphon at the moment, primarily for the platform independence, but it has some problems.

  56. Document Library by GumbyEnder · · Score: 1

    I work for a financial/banking development company. We began a project a few months ago that takes all of our internal documentation (tech specs, user source, source descriptions, data library, design specs, database and client information) and builds an XML-based library. Obviously we aren't rewriting all of our docs to meet a formalized XML standard; however, we are attaching tag information to each document that categorizes and describes it.

    A standard for future specs (and whenever rewrites are required of old specs) is being drafted so that all documentation can be parsed and searched with the full power and blessed beauty that is XML!!

    GO UBERGEEKS!!

    --
    To code is to foo. Or.. is it to foo is to code..
  57. Yes, I have some experience with this by shepk1 · · Score: 1

    This is what we do. It works, and it solves many of the issues associated with traditional documentation "systems".

    1. Define a DTD of the elements you will need in your documents. You can go with a predefined DTD like DocBook, or you can roll your own. Because my goal was to get engineers to write as many documents as possible (thereby making my life easier), I chose to create my own DTD; there are about 60 elements. Special documents (like projects or resumes, for example) have their own DTD (and XSL) in order to keep the main documentation DTD as simple as possible.

    2. Create a general XSL sheet that transforms all documents which conform to the DTD into HTML for your internal web site. All of the presentation logic is in this one XSL sheet (although it can use xsl:include if you'd like to break it up by header, table of contents, content, footer, etc.).

    3. Set up documentation sub-branch(es) in the source control system (in the code base and QA/infrastructure/whatever else you plan to document and publish). The closer it is to the code, the more likely that engineers will add review comments, fix errors, update it, etc.

    4. Anyone and everyone writing documents does so according to the DTD and checks them into the designated directory structure in the source control repository. (As far as they are concerned, this is the end of their work -- it just magically appears on the site.)

    5. Set up Ant to use one of the TrAX processors (I use Xalan2).

    6. Write general-purpose Ant targets to convert documentation with the general XSL sheet, build indexes, etc. Write build.xml files that call these tasks and convert/index/etc. all the documentation in each sub-branch.

    7. A monkey or cron job converts the documentation and pushes it to the website.
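
    Steps 6 and 7 boil down to finding which documents need converting and where the output goes. The poster does this with Ant; the Python below is only a sketch of the same idea, and the directory layout and function name are assumptions.

```python
from pathlib import Path


def stale_documents(src_root, out_root):
    """Yield (source, target) pairs for each .xml file whose HTML output
    is missing or older than the source.

    The target path mirrors the source path under out_root, with the
    suffix changed to .html.
    """
    src_root, out_root = Path(src_root), Path(out_root)
    for source in sorted(src_root.rglob("*.xml")):
        target = out_root / source.relative_to(src_root).with_suffix(".html")
        if not target.exists() or target.stat().st_mtime < source.stat().st_mtime:
            yield source, target
```

    A nightly job would run the XSLT processor over each pair it gets back and then push out_root to the web server.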

    Benefits:

    Tables of Contents, Index pages, indexing, image-links, etc. are all generated. There is a much lower chance of links being broken.

    Once you get the XSL right, you don't have to worry about consistency with the look and feel. You really can just concentrate on content.

    In my experience, engineers are much more likely to use vi/emacs to edit/review a document than Word or FrameMaker. The GUI XML editors are getting better...
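
    To make the generated index pages concrete, here is a small sketch of building one from the documents themselves. It assumes each document has a <title> child under its root element, which is a guess about the DTD and not something the post states.

```python
import xml.etree.ElementTree as ET
from html import escape


def build_index(documents):
    """Return an HTML list linking to each document by its <title>.

    documents maps an output path (e.g. "design/index.html") to the XML
    source text. A document without a <title> is listed by its path.
    """
    items = []
    for href, xml_text in sorted(documents.items()):
        title = ET.fromstring(xml_text).findtext("title", default=href)
        items.append(f'<li><a href="{escape(href)}">{escape(title)}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

    Since the links are produced from the same file list that drives the conversion, a renamed document changes its link on the next build without anyone editing the index by hand.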

    Drawbacks:

    Someone has to own the DTD, beat their head against the wonderful syntax of xslt, and be willing to decipher the Ant stack traces when things inexplicably bail.

    It's not yet commonly done (or at least they don't post on may newsgroups), so often you are charging ahead without much feedback.

    Technical writers are difficult to hire. We haven't tried to find another one yet, but I imagine asking people to write in XML will limit our talent pool. Of course... in theory I could make all this work with Framemaker+SGML...

  58. Re:XML and CVS by coyote-san · · Score: 2

    You don't have to rewrite the entire source control software, just put in a filter. CVS has a number of hooks for this - at a minimum you should run an XML validator to ensure that only valid XML is checked in.
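
    A minimal sketch of such a filter in Python. It only checks that each file is well-formed; Python's standard library does not validate against a DTD, so a validating parser would have to be swapped in for that. How the hook receives the file names depends on your CVS configuration, so the argument handling here is an assumption.

```python
import sys
import xml.etree.ElementTree as ET


def check_well_formed(paths):
    """Return a list of (path, error message) for files that fail to parse."""
    failures = []
    for path in paths:
        try:
            ET.parse(path)
        except ET.ParseError as exc:
            failures.append((path, str(exc)))
    return failures


def main(argv):
    """Check every .xml path in argv; return a non-zero status on any failure."""
    failures = check_well_formed(p for p in argv if p.endswith(".xml"))
    for path, message in failures:
        print(f"{path}: {message}", file=sys.stderr)
    return 1 if failures else 0
```

    Wired up as a script that ends with sys.exit(main(sys.argv[1:])), a non-zero exit status is what would make the hook reject the commit.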

    --
    For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
  59. Business Correspondence - Simple DTD's and XSL's? by rodo · · Score: 1

    For my mini-freelancing business I occasionally need to write business stuff such as offers and invoices. What I have been looking for without success over the last few days is a collection of some simple, free DTDs and ideally some accompanying XSL stylesheets for business letters.

    I am familiar with docbook/emacs/psgml. Just thought there might be something that is to simple business letters what docbook is to documentation. There is also this minor mode for emacs that looks very promising: xslt-process. It would make sense to use emacs with this for correspondence, to have PDFs generated automatically.

    What I'd need is much less sophisticated than docbook though, just something that intelligently suggests some tags for - in the case of an invoice - items, price, customer etc.

    Any hints anybody? Somebody have fragments of such stuff lying around?

  60. The docs infrastructure person pipes up by shepk1 · · Score: 2, Informative

    I didn't realize this thread was already going when I posted, so I'm pasting it in again here:

    This is what we do. It works, and it solves many of the issues associated with traditional documentation "systems".

    1. Define a DTD of the elements you will need in your documents. You can go with a predefined DTD like DocBook, or you can roll your own. Because my goal was to get engineers to write as many documents as possible (thereby making my life easier), I chose to create my own DTD; there are about 60 elements. Special documents (like projects or resumes, for example) have their own DTD (and XSL) in order to keep the main documentation DTD as simple as possible.

    2. Create a general XSL sheet that transforms all documents which conform to the DTD into HTML for your internal web site. All of the presentation logic is in this one XSL sheet (although it can use xsl:include if you'd like to break it up by header, table of contents, content, footer, etc.).

    3. Set up documentation sub-branch(es) in the source control system (in the code base and QA/infrastructure/whatever else you plan to document and publish). The closer it is to the code, the more likely that engineers will add review comments, fix errors, update it, etc.

    4. Anyone and everyone writing documents does so according to the DTD and checks them into the designated directory structure in the source control repository. (As far as they are concerned, this is the end of their work -- it just magically appears on the site.)

    5. Set up Ant to use one of the TrAX processors (I use Xalan2).

    6. Write general-purpose Ant targets to convert documentation with the general XSL sheet, build indexes, etc. Write build.xml files that call these tasks and convert/index/etc. all the documentation in each sub-branch.

    7. A monkey or cron job converts the documentation and pushes it to the website.

    Benefits:

    Tables of Contents, Index pages, indexing, image-links, etc. are all generated. There is a much lower chance of links being broken.

    Once you get the XSL right, you don't have to worry about consistency with the look and feel. You really can just concentrate on content.

    In my experience, engineers are much more likely to use vi/emacs to edit/review a document than Word or FrameMaker. The GUI XML editors are getting better...

    Drawbacks:

    Someone has to own the DTD, beat their head against the wonderful syntax of xslt, and be willing to decipher the Ant stack traces when things inexplicably bail.

    It's not yet commonly done (or at least they don't post on may newsgroups), so often you are charging ahead without much feedback.

    Technical writers are difficult to hire. We haven't tried to find another one yet, but I imagine asking people to write in XML will limit our talent pool. Of course... in theory I could make all this work with Framemaker+SGML...

    1. Re:The docs infrastructure person pipes up by perfecthash · · Score: 1

      Now if said great company to work for would just OPEN SOURCE all of this satori, wouldn't the world be a better place?

  61. DOC-XML-DOC? by nickjohnson · · Score: 1

    I would love it if my company switched to an XML-based document repository, but the biggest problem we face is that certain colleagues _need_ to use Microsoft Word, and at present, we don't have the $$ to justify purchasing a mid-five-figure document conversion program. Does anyone know of maybe an open-source project that is trying to achieve document conversion between a standard XML DTD (like DocBook?) and MS Word? It would have to be able to convert both XML->DOC and DOC->XML. Maybe an inexpensive solution for smaller organizations?

    1. Re:DOC-XML-DOC? by Anonymous Coward · · Score: 0

      This may help.

      "The answer may well lie not with StarOffice as an application, but with the code that Sun makes available via the OpenOffice project. In particular, this code includes the complex file filters that convert Microsoft's proprietary formats to an open XML format. This is code that anybody writing a productivity application can use. The field is open for application vendors to write applications which work well with Microsoft's formats and yet provide the targeted alternatives that IT will want."

      see http://news.cnet.com/news/0-1273-210-7058195-1.html?tag=bt_bh for full article.

  62. Everything is the only solution by Sixty4Bit · · Score: 2

    I seriously think the Everything http://www.everything2.com/ engine would be a great way to maintain documentation on a project. I know it isn't XML-based, but I am sure one could easily connect to the database and kick out the needed files and put them into XML. The meat of this gentleman's question was "How do you link your documents?" Currently we link ours with Rational Requisite Pro, which would be very beneficial if anyone took the time to actually use it. But what better tool for linking than Everything? I can easily add a link by putting brackets around a [link]. If I want to [link] to the [requirements] or a specific requirement [req1.4.2] or a use case [Authenticate User] I just add brackets.

    I am preparing to start a new project. I may have to install [Everything] first to see if it will work. Just have to make a couple of rules like all requirements must have a [Requirements] link so that they can all be found. Or better yet, instead of using Person, Place, Idea or Thing you could use Requirements, Use-Cases, Test Cases etc... Interesting idea. Customize [E2] to be a requirements/documentation management system.
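
    The bracket convention is simple enough to sketch. This is not the Everything engine's code, just an illustration in Python of turning [name] into a hyperlink; the /node/ URL prefix and the function name are made up.

```python
import re
from html import escape
from urllib.parse import quote

# A link is any run of non-bracket characters enclosed in square brackets.
LINK = re.compile(r"\[([^\[\]]+)\]")


def render_links(text, base="/node/"):
    """Return text as HTML, with each [name] replaced by a link to base + name."""
    # re.split with one capturing group alternates plain text and link names.
    parts = LINK.split(text)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            out.append(escape(part))  # plain text between links
        else:
            out.append(f'<a href="{base}{quote(part)}">{escape(part)}</a>')
    return "".join(out)
```

    The appeal is that writers never type a URL: a link to [req1.4.2] keeps working as long as a node with that name exists.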

    --
    This is not the sig you are looking for...
  63. Tree Views are my bane by xant · · Score: 2
    There are several good and bad treelike XML editors. Merlot was the best cross-platform one I found. Unfortunately these projects all use the same interface: you must laboriously add each element, click in the right box, type the text, all the while making sure not to break the docbook specification.


    We need something better. We need an interface like kword or abiword or lyx, all three of which offer:

    • a fast, unintrusive text editing interface, and
    • poor-to-mediocre docbook export from its internal format

    but not a single one of which offers docbook/xml import, which is critical if you are writing documentation in XML. XML is designed to be the source, the font from which all formatted, renderable versions (read: pdf, html, rtf, txt, etc.) spring. Hence, it must be the saved version, the one you keep in your source/document control system. The others are just conversions from XML. Yet no editor imports XML! This ought to be the easy part! You just write an XSL stylesheet to convert XML-of-your-choice into your editor's internal format.
    --
    It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
  64. *sigh* you all make it hard on yourselves... by kuma · · Score: 1

    reading lots of this shit is so irritating... seems everybody keeps thinking of working in text or word, this sounds like hell to me. depending on microsoft tools in your workflow is asking for headaches. this from experience.

    yeah, might cost you money, but what about framemaker, indesign or quark. you will have trouble, but if you are a *programmer*, you can easily extract tagged text from documents, store them as xml in a database and translate for presentation on the way back out.

    doing this now to publish about 16000 pages per year, reducing price errors on catalog pages (saving upwards of 15 million dollars according to some marketing wonk), it just works. the editors basically get it, not hard.

    we preserve not just style, but meaning. able to handle all sorts of funky evolving business/presentation policy via simple parsing. got people around here wondering if i'm a genius.

    (if anyone wants to do something like this in another publishing shop, write me and we can talk about challenges... fair warning: this is macos based workflow, getting something going on windows should be easy though)

    dozens of workflow/asset management vendors charge upwards of 500000$ to introduce simple database technology to clients, closed solutions that put mis programmers out on the street and force client groups to work according to a prefab plan.

    the only thing some of these assholes have going is some glue code to move data in-and-out of word/quark/whatever...

    thinking of writing up a general open solution, something like the arsdigita system, for publishing workflow/asset management, give it away and sell consultant programming. do me a favor, beat me to this pipe dream!

  65. ( XML vs PDF ) vs Big Wigs by wheeda · · Score: 1

    Our company is just now starting to mandate that all released documents shall be in .pdf form (original word, solidworks, whatever still available). We tried to get XML accepted at our company, but we couldn't get the big wigs on board.

    Everyone uses .pdf (nearly). Even big wigs can probably use .pdf. I think that once a company has realized the gain from the efficiencies of a common document platform (.pdf), big wigs will be much more likely to invest money in the deployment of .xml.

  66. Re:XML and CVS by tomRakewell · · Score: 1

    This doesn't solve the problem necessarily. If you add a tag fairly high up the hierarchy, you could end up shifting a lot of content in an indentation level. The content has hardly changed, but the diff is huge.

    This assumes your standardized XML is indented, as it often is. I think the ideal CVS filter would be to have *no* indentation.
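
    A small sketch of the no-indentation idea: serialize one tag or text run per line with no leading whitespace, then diff that form. The helper names are made up, and attributes are ignored to keep it short, so treat it as an illustration and not a usable filter.

```python
import difflib
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Return the document as a list of lines: one tag or text run per
    line, no indentation. Attributes are dropped in this sketch."""
    lines = []

    def walk(elem):
        lines.append(f"<{elem.tag}>")
        if elem.text and elem.text.strip():
            lines.append(elem.text.strip())
        for child in elem:
            walk(child)
            if child.tail and child.tail.strip():
                lines.append(child.tail.strip())
        lines.append(f"</{elem.tag}>")

    walk(ET.fromstring(xml_text))
    return lines


def changed_lines(before, after):
    """Count lines added or removed between two lists of lines."""
    return sum(
        1 for line in difflib.ndiff(before, after)
        if line.startswith(("+ ", "- "))
    )
```

    Wrapping two indented <para> elements in a new <section> touches six lines in a diff of the indented text, but only the two new <section> tag lines in the flattened form.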