XML Schema a W3C Recommendation
J1 writes: "The World Wide Web Consortium has officially given its Stamp of Approval to the XML Schema specification. This makes it an official W3C Recommendation. The press release has the details."
As you can clearly see, the old system was just far too unwieldy and complex. I am glad that they have made things so much simpler.
XML has needed a truly powerful schema language to enforce data constraints in data-heavy documents. This is very much akin to having a database schema for a database. With a declarative language and a common processor enforcing primary constraints on data, you free each application from having to do its own consistency checks.
XML Schema has a lot of powerful features, including the separation of types from structure, two kinds of type inheritance, modularization, default values for attributes and simple elements, and the flexibility to be as strict or as lax as the situation dictates for validation.
Having said that, the big battle brewing is whether XML Schema is going to be shoehorned into all the other XML protocols that need a data model description before a wide base of practical experience has been developed. There's already a divide between data modelers and application developers because of the specialized knowledge that SQL and relational database design demand; I think XML Schema does nothing to narrow that gap, which is unfortunate since class hierarchies and the hierarchical data model of XML seem a natural fit.
If you post it, they will read.
Not to detract from the humour value of your post, let me give a simple example of everyday XML usage where schemas are essential for XML.
You've got a database, with a 2 column table. Say "Company name"(char[40]) and "Net profit this year"(int). Ywanna get data to go in this table, in XML format, from another company. That XML's gonna look something like this:
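For illustration only, here is a guess at a document of that shape, loaded with Python's standard `xml.etree.ElementTree`. The element names (`companies`, `company`, `name`, `profit`) and the values are made up for this sketch. The thing to notice is that the parser hands everything back as plain text: nothing in the document itself says how long a name may be or that the profit has to be a number.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the tag names and values are invented for this sketch.
SAMPLE = """
<companies>
  <company>
    <name>Acme Widgets</name>
    <profit>1250000</profit>
  </company>
  <company>
    <name>Initech</name>
    <profit>-40000</profit>
  </company>
</companies>
"""

root = ET.fromstring(SAMPLE)

# Each row is (name, profit); both come back as str, even the number.
rows = [(c.findtext("name"), c.findtext("profit")) for c in root.findall("company")]

for name, profit in rows:
    print(name, profit, type(profit).__name__)
```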
Ok. Now how do you specify that the Company name should be <= 40 characters and the profit should be an integer? A DTD gives no way of doing this, it just says what order the tags can come in. Without XML schema, you're reduced to sending emails saying "Please make the Company name at most 40 chars and please make Profit a signed integer". Which is evil, cos you might have to do that for a 200 table database, and also there's no way of using that email to automatically check that XML file.
OTOH a schema lets you specify exactly what you want in a precise, even fairly simple, machine-readable format.
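As a rough sketch of what that machine-readable format could look like for the two-column example, the snippet below embeds a small XML Schema fragment and reads the two constraints back out of it with Python's standard library. The element names (`companies`, `company`, `name`, `profit`) are assumptions made for the sketch, and this is not a validator: the standard library only parses the schema as ordinary XML. Real validation needs a schema-aware processor.

```python
import xml.etree.ElementTree as ET

# XML Schema namespace in ElementTree's {uri}tag notation.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema: name is a string of at most 40 characters,
# profit is a signed 32-bit integer (xs:int).
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="companies">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="company" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name">
                <xs:simpleType>
                  <xs:restriction base="xs:string">
                    <xs:maxLength value="40"/>
                  </xs:restriction>
                </xs:simpleType>
              </xs:element>
              <xs:element name="profit" type="xs:int"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

schema = ET.fromstring(SCHEMA)

# The constraints are ordinary data that any program can read.
name_decl = schema.find(f".//{XS}element[@name='name']")
max_len = int(name_decl.find(f".//{XS}maxLength").get("value"))
profit_type = schema.find(f".//{XS}element[@name='profit']").get("type")

print("name max length:", max_len)
print("profit type:", profit_type)
```

The point is that the "at most 40 chars, signed integer" email becomes a file that both sides can feed to the same tool instead of reading by hand.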
Now do you believe me that schemas are really important? :-)
perl -e 'fork||print for split//,"hahahaha"'
First, for XML itself. What is XML? A standard way to store and describe data in a manner that is readily addressable by virtually any computing platform. I could write Vic20 programs that handle XML (to a limited degree, 4K ain't much to work with). What else offers that? Let's examine a couple of alternative data formats that, while not a comprehensive sample, illustrate the problem with non-XML formats. First, a comma-delimited format is pretty well standardized and can be addressed on virtually any computing platform -- but the data is not described. A database in Visual FoxPro provides column names that describe the data -- but it's not readily addressable on a wide variety of platforms (at least not directly). Thus, XML provides the data and the description, even including the relationship among data (e.g., the 'name' is a component of the 'customer').
So what's the Schema big deal? Well, with XML alone, you can't give someone a data format to follow which provides type checking, length restrictions, etc. If you're trading data with someone, you not only want to know the names and relationships of the data fields, and the data itself, you also want to know how the data will be formed. Is it an integer? Is it a 20 character field? You could presumably build a proprietary extension to XML that would allow you to describe those constraints, but why go through that trouble to get an end result that works only for you, when you can take a pre-built language for describing those constraints which works for everyone?
If you want to just store your own data, and you're certain that you'll never change your software, then XML doesn't offer much. It's not the most compact format. But if you exchange data with others, and/or if you are likely to change your data management software, XML becomes a valuable tool, and the Schema spec strengthens it considerably.
(Caveat: I'm relatively new to XML and am definitely in learning mode. The above describes the benefits I see from the viewpoint of someone who has several very messy data exchanges to clean up.)
No Laughing Allowed!
DTDs are rules on how the document is to be "formatted". In other words, where certain elements and tags are to be placed within a document. This refers to the document's structure. Liken it to HTML .. most HTML files have <html><body> then </body></html> tags (in that order). So long as those are in the correct order (as specified by the DTD), the document is "verified" correct by a validating parser. Yet, this has nothing to do with the data between those tags.
This is where schemas come in. They represent a validation against not only the document's structure, but also the data it contains (i.e., the data between the tags). You could liken it to the constraint on a database table's field, e.g., CustomerType = V or I (Valid or Invalid). To continue the example from above, you could specify a schema that restricts the content of the data between the html and body tags.
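To make the CustomerType comparison a bit more concrete, here is a minimal sketch of how a schema might spell out that "V or I" rule as an enumeration. The schema fragment and the `is_allowed` helper are hypothetical, and the Python only reads the permitted values out of the schema with the standard library; it does not perform real schema validation.

```python
import xml.etree.ElementTree as ET

# XML Schema namespace in ElementTree's {uri}tag notation.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment: CustomerType may only contain "V" or "I".
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="CustomerType">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:enumeration value="V"/>
        <xs:enumeration value="I"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
"""

# Collect the permitted values in document order.
allowed = [e.get("value") for e in ET.fromstring(SCHEMA).iter(f"{XS}enumeration")]

def is_allowed(text):
    """Hypothetical helper: True if text is one of the enumerated values."""
    return text in allowed

print(allowed)
print(is_allowed("V"), is_allowed("X"))
```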
Hope this helps.