Domain: oasis-open.org
Stories and comments across the archive that link to oasis-open.org.
Comments · 276
-
Re:Maybe it's better...
Nice rant, but the facts say otherwise. Check out the ebXML web site for some facts and you'll find it has wide support: it was created under the chairmanship of IBM and is supported by RosettaNet (a key current electronic business project), by a range of current players like CommerceOne and Documentum, and of course by OASIS, which is a huge consortium with a wide membership including Microsoft and IBM.
Maybe closer to the truth is that the openness of ebXML is a huge threat to the market strategy of Microsoft, offering among other things a standardised XML markup for business transactions (UBL) that undermines the standard-connections, proprietary-content trojan horse that their web services strategy is creating.
-
While they are at it
If they can port Office without help from Microsoft, maybe they could also implement compatibility with open standards.
-
Some Cool Technical Stuff
One piece of this that is not getting much attention right now (that would probably be of interest to /. readers) is the registration system. I'm not getting into the politics of this, the DRM or the "right or wrong" arguments.
In this initial rollout PSU and Napster decided to limit the service to students living in the residence halls. It does not matter which of the 21 campuses you are on, just that you live in a res hall.
We also needed to ACTIVELY protect the privacy of the students, not just to comply with FERPA but because we are not in the business of providing marketing data to private institutions.
The way we went about this was to use the Internet2 Middleware Initiative's Shibboleth software. Similar to Liberty in that it is a federated single sign on system that uses SAML, it is one of the unsung heroes in this.
Without getting into TOO much low level detail of how Shib works (which is available at the above link for those interested), here is a quick overview of what we are doing:
Basically PSU students are redirected to Napster's shibboleth protected registration webpage (this shib component is an Apache auth module) which sends them back to a PSU server to do the actual authentication. The student authenticates to the web server (kerberos backended userid and password). This server is also a component of Shib and it redirects the user (actually an http post) back to the Napster reg system along with a SAML authentication assertion.
The SAML authentication assertion is a blob of XML data that contains an opaque handle for the user (used in the next step) and a URI back to the last piece of Shibboleth at PSU called the Attribute Authority. This assertion is also digitally signed with an x.509 cert (w3c's XML-Signature spec) so that Napster knows it can trust this (not tampered with, generated from a rogue "man in the middle" server, etc).
The last step is when Napster makes an SSL wrapped call to the Attribute Authority requesting attributes about the student who is trying to get in. Remember up to this point all they know is his opaque handle (long string of numbers which uniquely identifies the user, but provides no information). The Attribute Authority looks at the cert of the requesting server, sees that it is Napster and queries LDAP for the data about the user that it is allowed to release. This is configurable to be anything we have: name, email, address, department, semester standing, etc. HOWEVER we only pass TWO things to Napster. (1) an entitlement string that identifies whether or not that user is allowed to get this service, and (2) a persistent opaque handle, which is basically the userID encrypted with the name of the target site and a secret seed value.
The entitlement string is generated at PSU and is populated in the user's LDAP entry based on the criteria that were set (res hall students only for now), and the persistent opaque handle gives Napster something to look at to make sure each student registers only once, but they still have no idea who that user is or anything about them other than that they are a student at PSU in a res hall.
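To make the persistent opaque handle idea concrete, here is a minimal sketch in Python. It is not Shibboleth's actual derivation: it assumes a keyed hash (HMAC-SHA256) as a stand-in for the "encryption" described above, and the function name, the separator and the example values are all made up for illustration. The point it shows is that the same user always gets the same handle at one target site, a different handle at any other site, and the handle itself reveals nothing about the userID.

```python
import hashlib
import hmac


def persistent_handle(user_id: str, target_site: str, secret_seed: bytes) -> str:
    """Derive a stable, opaque, per-site identifier for a user.

    Assumption: HMAC-SHA256 keyed with the secret seed stands in for the
    scheme described in the post; the real Shibboleth derivation may differ.
    """
    # Mixing in the target site means two sites cannot correlate their users.
    message = f"{target_site}!{user_id}".encode("utf-8")
    return hmac.new(secret_seed, message, hashlib.sha256).hexdigest()
```

Calling persistent_handle("abc123", "napster.example", seed) twice with the same seed gives the same string, which is what lets the target site enforce one registration per student without learning who the student is.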
Now if the student chooses to use their PSU email address when creating their Napster account, or gives them their CC number because they want to purchase songs that is their decision. The doubleplus good factor here is that PSU does not give that data up. We merely assert on the user's behalf that they are allowed to sign up under this agreement.
This Shibboleth stuff is running on Linux at both places and with the exception of requiring Java at the Origin end (PSU), is entirely comprised of open source software. The Napster guys we worked with were also very clueful and were definitely down with Linux, using it except where Windows was necessary (WMA streaming)
So I are very pleased at what -
Passport does not compete against Liberty
WS:Federation does.
In the federated identity world, the showdown is going to come between Liberty and WS:Fed. Liberty currently has the advantage of actually existing, and the spec followed a very open and transparent development model that was very inclusive (as spec development goes). WS:Fed on the other hand was developed behind closed doors by Microsoft and (to a lesser extent) IBM, and is just now applying for standards body recognition.
Another noteworthy point is that Liberty by design is very similar to Shibboleth, an Internet2 Middleware initiative for higher education federated authentication/authorization that has been very successful. Both are built off of Oasis's SAML spec. Shibboleth however places far more emphasis on user privacy.
Finkployd -
Re:XAML Proprietary?
I've actually never heard of XAML before, but according to Webopedia, it sounds like the same old crap, once again.
:)
I don't think XAML as used in Longhorn (i.e., a declarative presentation language) is the same as Transaction Authority Markup Language. In fact, MS seems to be pulling a Firebird on the XAML guys. -
Re:What I want to know ....
given that we are seeing lots of governments adopting or considering adopting F/OSS, how long before document and data interchange in its current form (read: MS Office) becomes enough of a hassle that consumers and businesses will demand software that conforms to open data interchange standards?
The problem is, there isn't really a suitable format for office documents available just now. The leading candidate there is probably the OASIS Open Office XML Format standardization effort, however I have no idea if that project is progressing in a timely way. -
SWX format?
I wonder if the SWX format will ever really take hold.
Maybe it will -
Re:Exactly
There is no XML "standard" for Office documents.
KDE recently announced that KOffice would embrace the document formats of OpenOffice.org.
This means that a Windows user running OpenOffice.org could save a document, send it to a KOffice user on Linux, and expect it to open.
There is an effort to make a standard XML based office document format. Two office suites, so far, embrace it.
Article in InfoWorld
OASIS charter
XML for the masses -
not only Gate$y boy
Oasis have been working to get a whole bunch of people "talking" for years, mission overview here. The membership list is quite comprehensive. Let's hope something useful comes of it.
-
Re:Just one question...
-
That other office suite
Let's wait how long it takes that other office suite vendor to see the light. After all, they are an OASIS member themselves...
-
Re:SXC ?
it could be very cool to support the sxc (openoffice) format. what about this ?
Indeed. It's silly to have to use .xls format to move data between spreadsheets on Linux. On the other hand,
OASIS format is a lot more important strategically, at this point. But what the heck, we're spoiled, we want both, don't we. -
Re:My Wish List
I'd like to see configuration of everything move to an XML standard, and this should be coupled with flexible visual tools.
This is a very bad idea. And I say that as someone who uses XML daily, and is generally very fond of it.
A telling example: XML-based configuration files have made my working with XML quite a lot harder. As you might know, XML systems - and SGML systems before them - can use so-called "catalogs" that map public identifiers or URIs to local files, so that when you reference the official location of, say, the DocBook DTD in a file, you don't have to download it every time. In the old SGML days, that was done with a line-oriented catalog file that would contain something like
PUBLIC "-//OASIS//DTD DocBook V4.1//EN" "/usr/local/share/sgml/docbook/4.1/docbook.dtd"
Unfortunately, in their great wisdom people (namely OASIS, the organization also responsible for DocBook and lots of other, usually quite good, standards) decided that line-oriented formats are no good, and developed an XML format that looks something like this:
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
<public publicId="-//OASIS//DTD DocBook XML V4.2//EN" uri="file:///usr/local/share/xml/docbook/4.2/docbookx.dtd"/>
</catalog>
The problem: Try installing a new DTD. It will hopefully have its own catalog file, which you only have to register with one of the catalogs already known.
For the old variant, all you have to do is "echo CATALOG new_catalog_file >> /some/existing/catalog". Removing it again is easily done with grep -v or sed. Try something like that with the XML format, and it will end up unparsable. You either have to edit it by hand, or use a special program that knows about XML and XML-based catalogs.
In other words, the main effect of the new format is that you cannot use the traditional Unix tools anymore. Manipulating the config files now requires specialized programs, making things like portable install scripts very hard. And I really, really doubt that any GUI or other tool benefits from the XML format - SGML catalogs, and most config files, are damn easy to parse, the hard part is getting the semantics right - what values are legal, what options exist, how to present them to the user in a visually pleasing and intuitive way etc. XML doesn't help you one bit with that.
XML is cool for complex structured documents. Config files are neither documents, nor are they supposed to be complex.
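For comparison with the one-line echo and grep -v approach, this is roughly what the "special program that knows about XML catalogs" has to look like. It is only a sketch using Python's standard library: the function names are made up, and it assumes that registering a new catalog means adding a nextCatalog entry (the OASIS XML Catalogs element for delegating to another catalog file) to an existing catalog.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"
# Serialize with the catalog namespace as the default (no prefix).
ET.register_namespace("", CATALOG_NS)
NEXT_CATALOG = f"{{{CATALOG_NS}}}nextCatalog"


def add_next_catalog(catalog_xml: str, new_catalog_path: str) -> str:
    """Return catalog_xml with a nextCatalog entry for new_catalog_path."""
    root = ET.fromstring(catalog_xml)
    # Skip the insert if the catalog is already registered.
    known = any(el.get("catalog") == new_catalog_path
                for el in root.findall(NEXT_CATALOG))
    if not known:
        ET.SubElement(root, NEXT_CATALOG, {"catalog": new_catalog_path})
    return ET.tostring(root, encoding="unicode")


def remove_next_catalog(catalog_xml: str, old_catalog_path: str) -> str:
    """Return catalog_xml without nextCatalog entries for old_catalog_path."""
    root = ET.fromstring(catalog_xml)
    for el in list(root.findall(NEXT_CATALOG)):
        if el.get("catalog") == old_catalog_path:
            root.remove(el)
    return ET.tostring(root, encoding="unicode")
```

It works, but it needs a parser, namespace handling and a runtime on every machine that runs the install script, which is exactly the cost being complained about.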
-
Re:both PDF and HTML
What about XML and XSLT? With one stylesheet for publication and others for reading on various platforms you get best of both worlds. If you have id attributes on all your elements you can get as anal about your layout as you like in your publication stylesheet, and still have the content be fully adaptable unmodified by simply using a different stylesheet.
The trick is having everyone using the same schema so that those adaptable stylesheets can be used for multiple documents. For publishers, the trick would be getting those pesky authors to use it.
The Open eBook is one attempt at a standard, DocBook is another. DocBook is notably used by O'Reilly to produce paper and electronic versions of their books. -
Re:Good deal...
I doubt that this will be that great a deal - instead, look for the W3C to become less and less relevant going forward.
Unfortunately, this is already happening. Look at the Web Services area. OASIS has taken the lead for the standardization of most Web Services technologies working on top of SOAP (UDDI, ebXML, SAML, XACML, etc.) Is it a coincidence that OASIS has a RAND policy instead of a royalty-free policy?
Is it a coincidence that some large companies pulled out of W3C and moved to OASIS? Of course, there are other reasons than the patent policy. The high membership fees of the W3C may be working against it. Also, some political wars between IBM, Microsoft and Sun can explain why some discussions started in some W3C working groups were killed and moved to OASIS. But still, I am hoping that W3C can regain some of the influence that it has lost in recent months. Otherwise, the royalty-free policy may be largely irrelevant.
-
DocBook, sorta
DocBook is cool. I'm writing a book using it. But it's not the format for all technical content. If you're writing your basic mass-market computer book, or the web equivalent, DocBook probably has everything you need. (Though the markup for the official DocBook reference is forced to use generic tables to list element parameters -- there's no specialized element!) But I'd hate to use DocBook for a big API document base, especially one where single-sourcing is an issue. IBM's DITA framework is immature, but looks promising.
-
RSS
-
Re:How?
The project page explicitly says:
This is an open source implementation of the OASIS XACML standard, written in the Java (TM) programming language.
It doesn't say that the standard is open source. It doesn't say that Java is open source. It says the implementation in Java is open source.
Of course, that doesn't prevent anyone from creating closed-source implementations of the same standard. But the XACML standard specs themselves are openly available from OASIS.
-
flexible & powerful != difficult and confusing
Suggest you (and anyone else finding W3C XML Schema confusing and/or difficult) take a look at RELAX-NG.
RELAX-NG is an alternative schema language that is much more flexible and easy to use than W3C XML Schema. RELAX-NG also defines a non-XML syntax that makes it even easier to work with, since the non-XML syntax can be translated into XML using a tool such as Trang. There are many other tools that make it easy to work with RELAX-NG.
Eric Van der Vlist is working on a RELAX-NG book that you can read at http://books.xmlschemata.org/relaxng/
Take a look at RELAX-NG - you might never want to work with XML Schema again... -
Links
I was going to mention RELAX, but since your post is already here I'll just add a few links:
Official?? site of RELAX (RELAX earthling! we come in peace!)
OASIS on Relax-ng (much more dry).
I'm not sure it would be so bad if both standards came to be popular. A few years ago at an XML conference one of the speakers described the XML world being split into three camps - data modelers (who would be backing XML-Schema), Document-centric folks (who would back RELAX), and one other group (whose leanings I forget but I guess they don't care about typed XML documents!!). Having a data-centric and document-centric approach to XML might not be so bad, each having good uses in different scenarios. -
Re:Power
I am currently working on a series of articles on RELAX NG. In most ways, I think RELAX NG really is the best of all worlds. It is more powerful than W3C XML Schemas, while being a natural extension of the semantics of DTDs. Moreover, if you choose to use the compact syntax (non-XML), you get something very easy to read and edit by hand.
I am old, and I am wary of the ways of hype. But after reading this and other comments on this thread, I had a look at the RELAX NG tutorial. All I can say is: wow. Given that this stuff is already known to be formally correct, I am finding it very hard to believe that the W3C should not just punt on XML Schema and just adopt RELAX NG instead. It seems to have every advantage: You can understand it, it is powerful, James Clark endorses it, the tutorial is helpful...what's not to like?
-
Re:Power
This is true to a certain extent, but the complexity of XML Schema outweighs its expressiveness.
RELAX NG seems to be a much better compromise to me. YMMV.
-
Not Necessarily (XML != Open)
I'm not an XML expert, but my understanding is that MS or anyone can write proprietary code into the CDATA section of any XML document and therefore only their tools can accurately parse the document.
Of the XML code I have seen generated by MS applications, it's a mess, and lacks any adherence to well structured content, it's spaghetti xml. Same with the style sheets generated and the associated classes.
If this was put in the hands of the programmers who gave us .doc to .htm or those responsible for the code generation in FrontPage, what would you expect the results to be like?
There is also the huge debate (if my memory serves me right) that happened regarding the first W3C XSL recommendation where MS fought for a less strict implementation (because they saw their documents could not comply with a strict implementation).
Also, didn't Microsoft Corporation Selects SoftQuad XMetaL To Create XML Content ?
-
Re:InfoWorld articles
I was at the launch presentation of Office-11 by Jean Paoli at XML 2003 in Baltimore MD last week, and I'm also a late sign-up to MS's extended beta list for the product (now closed).
To clear up some points people have commented on (based on a very preliminary inspection plus a lot of discussion at the conference):
- The default save format is still .doc (ie you have to go the extra click to save in XML format)
- If you do choose it, the default XML format is MS's own office-document vocabulary, which retains all the formatting, held in attributes. Hairy but processable, and they will be shipping their schema for it so people can reprocess it externally. But this format will (of course) only represent the appearance, not any structure.
- It will also let you specify your own schema (or an industry standard one) and let you supply a binding of named styles to your element types, so you can edit using what look like styles but actually get represented in the saved file as XML markup. There is some debate as to whether this constitutes "being an XML editor" or just "being a wordprocessor that saves data in XML" (my money is on the latter).
- It will not support DTDs, so you're stuck with W3C Schemas whether you like them or not*
- The discussion over a [more?] suitable schema/DTD for handling office documents (wordprocessing, spreadsheet, presentation) continues at the OASIS TC on Open Office XML Formats **
* [Bias note] I think W3C schemas were a big mistake; provision for data content typing and validation, namespaces, and extended grouping could have been achieved by extending DTD syntax; and wimpy programmers who moan about having two syntaxes to handle should get a life - it's not a big deal, the code is free and has been in use for 15 years :-)
** Sun has donated the OpenOffice (aka StarOffice) XML file formats to the public domain. It's worth remembering that {Star|Open}Office has been saving in XML as its native format for some time now, and has a lot more experience at this than MS.
-
Re:you know...
This isn't true, in fact it has to be in our interests that as many companies as possible are competing in this market place so the rather immature technology is further improved.
The key thing is not the number of vendors, but how single sign on systems and user repositories interoperate so that trust credentials are passed between systems to enable single sign on between different vendors' products.
The key initiative behind this is SAML, the Security Assertion Markup Language. All the main vendors of SSO (RSA, IBM, Netegrity) are supporting this standard.
The major vendors had a bake off recently to test interoperability which I understand went very well, with all 12 products successfully passing credentials between each other. I couldn't find much about it, other than a list of the participating vendors which can be seen here. -
Re:Just Say NO!
There's no such thing as a perfect schema language in a general sense. It all depends on the application at hand. Actually, trying to be perfect is one of the main reasons for the W3C XML Schema bloat; they simply try to squeeze too much into a single specification.
RELAX NG is much more streamlined; it focuses on specifying grammars for XML structures. Nothing more. It doesn't try to glue on concepts like object orientation (which is another reason for the W3C XML Schema blur). It's just very pure and hence easy and intuitive to learn and use.
Also, with the recent addition of the Compact Syntax, editing and reading schemas has never been easier. Utilities for working with the compact syntax can be found here and here.
So even though there is no perfect schema language, I'd say RELAX NG is far more perfect than W3C XML Schema in many situations. If you have applications that require W3C XML Schema, you can use Trang to convert your RELAX NG schemas.
-
A nice idea, but ..
I'd love to put openoffice on my machines in my somewhat large (and unnamed for the usual reasons) organisation. We've discussed it at the executive level, and the sole reason for staying with MSOffice is *other* organisations.
We rely on communicating with government, military and corporate entities, and their standard is Office. Period.
Whilst the import functions on openoffice are very good, they have to be (from a business critical aspect) absolutely 100% compatible -- and when you're dealing with multi-chapter doc files which use 90% of Word's capability, well, from my testings inhouse, I can't guarantee that level of accuracy. Images can move around, hide text, etc.
What I've done is start a different tactic within the organisation. All documents are PDF unless they require collaboration on the document. If collaboration is required, I'm now looking into a web-based solution (via our portal). Now, this does produce new challenges, but it does break the '.doc' monopoly.
Another damn important point is XML. With MSOffice moving towards their own XML, and with movement on producing an open standard for XML documents (slashdot article | actual link), this may be the approach that ends this problem. But it's going to be some time yet.
This is going to be a slow moving issue. I recommend we all relax, keep working on this, and slow and steady will win the race ... eventually. -
Patents
I'm not sure I follow you here.
For one thing, the nastier companies seem to have a lot more say in OASIS and ISO than in W3C. Nor can I look upon XML as a subset of SGML, but maybe I'm totally wrong about that.
In ISO, it seems like that cross-licensing of patents is done to shut smaller companies out of the process. OASIS too has RAND licensing and as usual, makes no attempt to define what is "reasonable" or "non-discriminatory":
OASIS.IPR.3.3 Determination of Reasonable and Non-discriminatory Terms The OASIS Board of Directors will not make any explicit determination that the assurance of reasonable and non-discriminatory terms for the use of a technology has been fulfilled in practice. It will instead use the normal requirements for the advancement of OASIS specifications to verify that the terms for use are reasonable.
This is contrary to how hard W3C has worked to ensure royalty-free patents.
I'm not saying that James Clark's stuff isn't better, it may well be, he is certainly among the foremost in this field. But this bashing of W3C seems highly undeserved.
-
Re:Darn DTD's
-
Re:Besides
Currently, the only generic standard for document exchange using XML is HTML.
Either I'm misunderestimating your statement, or you left out DocBook. For the folks who work with XML and use a DTD (as opposed to well-formed), THAT is pretty much the standard. It's fairly thorough. For more info, see the folks at OASIS of all people.
One of us is missing something then. Could easily be me. If so, please enlighten me... -
Re:But what about...
...why wouldn't they want to be on this panel and strongarm them into using their particular metadata to describe documents?
Microsoft hates to compete on an even playing field. By defining Office.NET's XML schema on their own terms, they can make arbitrary changes as they see fit. Competitors will be forced to either chase down the changes or give up a measure of compatibility. And the terms of the Appeals Court settlement won't help, since it does nothing to counter the massive inertia of Office's existing market share.
In the past, Microsoft might have embraced-and-extended the OASIS spec. Now, they're under too much scrutiny to get away with it. I think they've had a change of heart in their PR philosophy: Better upfront arrogance than skullduggery in hindsight.
Besides, a large contingent of Microsoft's rivals are members of OASIS. The Committee is chaired by a Sun employee, and they're contributing the OpenOffice.org/StarOffice schema as the baseline. While they've just announced the Call for Participation, Corel has already joined by press release. If Microsoft tried to join with a "my way or the highway" attitude, they'd probably be pointed at the nearest interstate.
-
I would not participate if I were MS
Ignore all of the obvious issues about the value of the .DOC monopoly. Consider instead that the name of the working group is the same as the name of a product that competes with yours and that the working group has pretty much decided that the file format will be based upon the file format of that competitor.
In other words, Microsoft would be participating in the canonization of the file format of not only a competing product but an open source competing product. Can you really blame them for seeing that as a no-win situation?
I wish the standardizers and coders the best of luck and I would love to see them succeed. But I'm sure none of them are naive enough to have expected Microsoft to participate. Only the scandal-hungry hounds at CNet and Slashdot consider this news.
-
Re:ASCII Only?
What about DocBook, which features encoding for books in both SGML and XML? It was devised for computing books, but one imagines it would not be too hard to devise a standard to apply to all works of literature.
-
Re:Passport competition?
I don't know if it means much but Microsoft is a Sponsor Member.
-
Re:What we need is a ISO standard
This is already in hand. Sun are taking the OpenOffice XML file format to OASIS for standardisation. Something should be announced about the formation of a working group on this real soon now.
-
Sounds like you already made up your mind
You've obviously used LaTeX quite a bit already. That's hardly a fair comparison. You compare something with which you are already comfortable with something you haven't used at all before.
As far as markup goes, one of the reasons for using the open/close tag pair in XML was because so many people have written HTML and are used to that model.
As for complicated markup, there is a Simplified DocBook that reduces the amount of elements you have to know and keep track of while still remaining 100% DocBook compatible. Write a little now, and as your experience and comfort grows, so can your markup choice. Simplified DocBook now, full DocBook when the volume of documentation requires it later (By that time, more editors will have come out hopefully).
DocBook to PDF is handled by converting to XSL-FO (not to be confused with XSLT) syntax and serializing with something like FOP. LaTeX is actually closer to XSL-FO than to DocBook. If you're trying to convert to PDF by hand, you're expending more effort than you need to. You can find premade stylesheets for HTML and FO, and documentation about how to use them, without reinventing the wheel. The advantage of going to XSL-FO instead of a direct DocBook-to-PDF conversion is that there are serializers out there to output FO syntax to PDF, PostScript, PCL5, and RTF. It would be a shame to just make a one-trick pony.
As for emacs, there are emacs extensions written for DocBook that help you with tag choices and automatically close the tags for you. Isn't that one of the main complaints you had about the syntax? And you're comfortable with emacs, right?
Note that you are using LaTeX to drive the layout. This is not how to use DocBook. In fact, DocBook goes out of its way to avoid any layout information in the file. Say you want to search for all documents with a section title that contains "apple". Anyone with a document parser can implement this no matter who wrote the DocBook file at any organization. With LaTeX you could do this only as long as everyone agreed upon the element identifiers -- which doesn't happen at every company. DocBook is content, HTML and PDF are layout, and never the twain shall meet...except during the transformation step.
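To make the "apple" search concrete, here is a rough sketch in Python using only the standard library parser. It assumes non-namespaced DocBook 4.x-style files (DocBook 5 puts elements in a namespace, which this does not handle), and the function name and sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Element names that carry a section title in DocBook 4.x-style markup.
# Assumption: no XML namespace on the elements.
SECTION_TAGS = {"section", "sect1", "sect2", "sect3", "sect4", "sect5"}


def sections_with_title(docbook_xml: str, needle: str) -> list[str]:
    """Return the titles of all sections whose title contains `needle`
    (case-insensitive), in document order."""
    root = ET.fromstring(docbook_xml)
    hits = []
    for elem in root.iter():
        if elem.tag not in SECTION_TAGS:
            continue
        title = elem.find("title")  # direct child only
        if title is None:
            continue
        # itertext() also collects text inside inline markup such as <emphasis>.
        text = "".join(title.itertext())
        if needle.lower() in text.lower():
            hits.append(text)
    return hits


# Hypothetical sample document, for illustration only.
SAMPLE = """<article>
  <title>Fruit notes</title>
  <section>
    <title>Growing apple trees</title>
    <para>Text.</para>
    <section>
      <title>Apple <emphasis>varieties</emphasis></title>
      <para>Text.</para>
    </section>
  </section>
  <section>
    <title>Pears</title>
    <para>Text.</para>
  </section>
</article>"""

if __name__ == "__main__":
    for found in sections_with_title(SAMPLE, "apple"):
        print(found)
```

The search never has to know who wrote the file or how it will be rendered; it relies only on the element names that DocBook fixes.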
If you prefer LaTeX, peace be with you. But they cannot really be compared, as LaTeX -- while it makes the separation possible -- does not enforce a distinction between semantic content and layout presentation. DocBook does. This sometimes adds some complexity at initial startup, but it pays off when you actually have to organize and index those documents in an archive. You should talk to the folks at the Linux Documentation Project for more insight on this.
-
Re:I end up having a lot of the same questions
There is also a Simplified DocBook DTD. We used it at my last job. It is a small but useful subset of DocBook that can get you started.
All Simplified DocBook files are also completely valid DocBook documents. But there are far fewer elements and constructs to keep in your head. It's also geared toward smaller items such as articles instead of complete books. At my company, we made a couple of template documents and then just had people fill in the blanks. People ended up working faster once we got them to stop worrying about formatting and styling (non-trivial).
Start writing in SD and as the collection of documents grows, you can look into combining them into a cohesive DocBook collection as time permits and your experience level grows.
-
Re:All Sites
Perhaps they should look into Legal XML. Then they could use XSL or CSS to make it presentable.
-
Re:WHAT'S A WAP PAGE?
WML - Wireless Markup Language (here's a link)
-
Re:Free as in beer!?
I am starting a beer database...as discussed at the Extreme Markup conference in Montreal a couple of weeks ago. I committed a while back to starting a Beer "published subject" (using Topic Maps) as a use-case, and quite a lot of people have expressed an interest.
An announcement (here and elsewhere) will be up soon. Meantime mail me if you want to contribute. You will need to grok XTM if you want to get involved in the detail, but simpler submissions will be taken via a Web form.
///Peter
-
Twitch Switch & Assistive Technology Links
Check out the twitch switch and other assistive technology aids that TelSol makes.
Other related links are here.
-
Re:Ugh. DTDs?!?
XSL stylesheets won't work on DTDs because DTDs do not utilize XML syntax.
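A quick way to see this for yourself, sketched in Python with the standard library parser: feed an XML parser some DTD declarations and then a W3C XML Schema fragment. The two sample strings and the helper name are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical samples: DTD declarations versus a tiny XML Schema document.
DTD_FRAGMENT = "<!ELEMENT note (to, body)>\n<!ELEMENT to (#PCDATA)>"
XSD_FRAGMENT = (
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
    '<xs:element name="note"/>'
    "</xs:schema>"
)


def is_well_formed_xml(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print("DTD parses as XML:", is_well_formed_xml(DTD_FRAGMENT))
    print("XSD parses as XML:", is_well_formed_xml(XSD_FRAGMENT))
```

An XSLT processor needs a well-formed XML input tree, so the DTD text never gets past the parsing step.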
BTW - DTDs are not obsolete either. They are far easier to read, use, and maintain than W3C XML Schema.
Take a look at RELAX NG. RELAX NG is far more elegant (i.e. easier to learn, use and maintain) than W3C XML Schema. RELAX NG provides the capabilities of W3C XML Schema - and more.
-
Re:Schema war is not over...W3C XML-Schema is bloa
Using XML to describe XML simply makes sense.
In this case RELAX is far superior: it has both an XML and a non-XML representation and is built on top of a clean model by some brilliant fellas.
XML Schema, OTOH, is just a bloated mess.
DTD's are antiquated
Perhaps, but they are readable. XML Schema is anything but readable.
and I can't even transform against them for meta-meta-data tasks
Oh, now that's something you do every day. Using XML syntax for everything is just plain stupid. IF you have to do transforms, use RELAX, it has a cleaner model anyway... doing transforms on XML Schema is like pulling teeth.