Domain: w3.org
Stories and comments across the archive that link to w3.org.
Comments · 6,785
-
Wow! Slashdotted Already?
This article explores the ins and outs of XML namespaces and their ramifications on a number of XML technologies that support namespaces. What follows is a shortened version of my first Extreme XML column.
Overview of XML NamespacesAs XML usage on the Internet became more widespread, the benefits of being able to create markup vocabularies that could be combined and reused similarly to how software modules are combined and reused became increasingly important. If a well defined markup vocabulary for describing coin collections, program configuration files, or fast food restaurant menus already existed, then reusing it made more sense than designing one from scratch. Combining multiple existing vocabularies to create new vocabularies whose whole was greater than the sum of its parts also became a feature that users of XML began to require.
However, the likelihood of identical markup, specifically XML elements and attributes, from different vocabularies with different semantics ending up in the same document became a problem. The very extensibility of XML and the fact that its usage had already become widespread across the Internet precluded simply specifying reserved elements or attribute names as the solution to this problem.
The goal of the W3C XML namespaces recommendation was to create a mechanism in which elements and attributes within an XML document that were from different markup vocabularies could be unambiguously identified and combined without processing problems ensuing. The XML namespaces recommendation provided a method for partitioning various items within an XML document based on processing requirements without placing undue restrictions on how these items should be named. For instance, elements named <template>, <output>, and <stylesheet> can occur in an XSLT stylesheet without there being ambiguity as to whether they are transformation directives or potential output of the transformation.
An XML namespace is a collection of names, identified by a Uniform Resource Identifier (URI) reference, which are used in XML documents as element and attribute names.
Namespace DeclarationsA namespace declaration is typically used to map a namespace URI to a specific prefix. The scope of the prefix-namespace mapping is that of the element that the namespace declaration occurs on as well as all its children. An attribute declaration that begins with the prefix xmlns: is a namespace declaration. The value of such an attribute declaration should be a namespace URI which is the namespace name.
Here is an example of an XML document where the root element contains a namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore and its child element contains an inventory element that contains a namespace declaration that maps the prefix inv to the namespace name urn:xmlns:25hoursaday-com:inventory-tracking.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book>
<bk:title>Lord of the Rings</bk:title>
<bk:author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</bk:book>
</bk:bookstore>
In the above example, the scope of the namespace declaration for the urn:xmlns:25hoursaday-com:bookstore namespace name is the entire bk:bookstore element, while that of the urn:xmlns:25hoursaday-com:inventory-tracking is the inv:inventory element. Namespace aware processors can process items from both namespaces independently of each other, which leads to the ability to do multi-layered processing of XML documents. For instance, RDDL documents are valid XHTML documents that can be rendered by a Web browser but also contain information using elements from the http://www.rddl.org namespace that can be used to locate machine readable resources about the members of an XML namespace.
It should be noted that by definition the prefix xml is bound to the XML namespace name and this special namespace is automatically predeclared with document scope in every well-formed XML document.
Default NamespacesThe previous section on namespace declarations is not entirely complete because it leaves out default namespaces. A default namespace declaration is an attribute declaration that has the name xmlns and its value is the namespace URI that is the namespace name.
A default namespace declaration specifies that every unprefixed element name in its scope be from the declaring namespace. Below is the bookstore example utilizing a default namespace instead of a prefix-namespace mapping.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book>
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
All the elements in the above example except for the inv:inventory element belong to the urn:xmlns:25hoursaday-com:bookstore namespace. The primary purpose of default namespaces is to reduce the verbosity of XML documents that utilize namespaces. However, using default namespaces instead of utilizing explicitly mapped prefixes for element names can be confusing because it is not obvious that the elements in the document are namespace scoped.
Also, unlike regular namespace declarations, default namespace declarations can be undeclared by setting the value of the xmlns attribute to the empty string. Undeclaring default namespace declarations is a practice that should be avoided because it may lead to a document that has unprefixed names that belong to a namespace in one part of the document, but don't in another. For example, in the document below only the bookstore element is from the urn:xmlns:25hoursaday-com:bookstore while the other unprefixed elements have no namespace name.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book xmlns="">
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
This practice should be avoided because it leads to extremely confusing situations for readers of the XML document. For more information on undeclaring namespace declarations, see the section on Namespaces Future.
Qualified and Expanded NamesA qualified name, also known as a QName, is an XML name called the local name optionally preceded by another XML name called the prefix and a colon (':') character. The XML names used as the prefix and the local name must match the NCName production, which means that they must not contain a colon character. The prefix of a qualified name must have been mapped to a namespace URI through an in-scope namespace declaration mapping the prefix to the namespace URI. A qualified name can be used as either an attribute or element name.
Although QNames are important mnemonic guides to determining what namespace the elements and attributes within a document are derived from, they are rarely important to XML aware processors. For example, the following three XML documents would be treated identically by a range of XML technologies including, of course, XML schema validators.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType id="123" name="fooType"/>
</xs:schema>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:complexType id="123" name="fooType"/>
</xsd:schema>
<schema xmlns="http://www.w3.org/2001/XMLSchema">
<complexType id="123" name="fooType"/>
</schema>
The W3C XML Path Language recommendation describes an expanded name as a pair consisting of a namespace name and a local name. A universal name is an alternate term coined by James Clark to describe the same concept. A universal name consists of a namespace name in curly braces and a local name. Namespaces tend to make more sense to people when viewed through the lens of universal names. Here are the three XML documents from the previous example with the QNames replaced by universal names. Note that the syntax below is not valid XML syntax.
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
To many XML applications, the universal name of the elements and attributes in an XML document are what is important, and not the values of the prefixes used in specific QNames. The primary reason the Namespaces in XML recommendation does not take the expanded name approach to specifying namespaces is due to its verbosity. Instead, prefix mappings and default namespaces are provided to save us all from developing carpal tunnel syndrome from typing namespace URIs endlessly.
Namespaces and AttributesNamespace declarations do not apply to attributes unless the attribute's name is prefixed. In the XML document shown below the title attribute belongs to the bk:book element and has no namespace while the bk:title attribute has urn:xmlns:25hoursaday-com:bookstore as its namespace name. Note that even though both attributes have the same local name the document is well formed.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book title="Lord of the Rings, Book 3" bk:title="Return of the King"/>
</bk:bookstore>
In the following example, the title attribute still has no namespace and belongs the book element even though there is a default namespace specified. In other words, attributes cannot inherit the default namespace.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book title="Lord of the Rings, Book 3" />
</bookstore>
Namespace URIsA namespace name is a Uniform Resource Identifier (URI) as specified in RFC 2396. A URI is either a Uniform Resource Locators (URLs) or a Uniform Resource Names (URNs). URLs are used to specify the location of resources on the Internet, while URNs are supposed to be persistent, location-independent identifiers for information resources. Namespace names are considered to be identical only if they are the same character for character (case-sensitive). The primary justification for using URIs as namespace names is that they already provide a mechanism for specifying globally unique identities.
The XML namespaces recommendation states that namespace names are only to act as unique identifiers and do not have to actually identify network retrievable resources. This has led to much confusion amongst authors and users of XML documents, especially since the usage of HTTP based URLs as namespace names has grown in popularity. Because many applications convert such URIs to hyperlinks, it is irritating to many users that these "links" do not lead to Web pages or other network retrievable resource. I remember one user who likened it to being given a fake phone number in a social situation.
One solution to avoid confusing users is to use a namespace-naming schema that does not imply network retrievability of the resource. I personally use the urn:xmlns: scheme for this purpose and create namespace names similar to urn:xmlns:25hoursaday-com when authoring XML documents for personal use. The problem with homegrown namespace URIs is that they may run counter to the intent of the Names in XML recommendation by not being globally unique. I get around the globally unique requirement by using my personal domain name http://www.25hoursaday.com as part of the namespace URI.
Another solution is to leave a network retrievable resource at the URI that is the namespace name, such as is done with the XSLT and RDDL namespaces. Typically, such URIs are actually HTTP URLs. A good way to name such URLs is by using the format favored by the W3C, which is as follows:
http://my.domain.example.org/product/[year/month][ / rea]
See the section on Namespaces and Versioning for more information on using similarly structured namespace names as a versioning mechanism.
DOM, XPath, and the XML Information Set on NamespacesThe W3C has defined a number of technologies that provide a data model for XML documents. These data models are generally in agreement, but sometimes differ in how they treat various edge cases due to historic reasons. Treatment of XML namespaces and namespace declarations is an example of an edge case that is treated differently in the three primary data models that exist as W3C recommendations. The three data models are the XPath data model, the Document Object Model (DOM), and the XML information set.
The XML information set (XML infoset) is an abstract description of the data in an XML document and can be considered to be the primary data model for an XML document. The XPath data model is a tree-based model that is traversed when querying an XML document and is similar to the XML information set. The DOM precedes both data models but is also similar to both data models in a number of ways. Both the DOM and the XPath data model can be considered to be interpretations of the XML infoset.
Namespaces in the Document Object Model (DOM)The XML namespace section of the DOM Level 3 specification considers namespace declarations to be regular attribute nodes that have http://www.w3.org/2000/xmlns/ as their namespace name and xmlns as their prefix or qualified name.
Elements and attributes in the DOM have a namespace name that cannot be altered after they have been created regardless of whether their location within the document changes or not.
Namespaces in the XPath Data ModelThe W3C XPath recommendation does not consider namespace declarations to be attribute nodes and does not provide access to them in that capacity. Instead, in XPath every element in an XML document has a number of namespace nodes that can be retrieved using the XPath namespace navigation axis.
Each element in the document has a unique set of namespace nodes for each namespace declaration in scope for that particular element. Namespace nodes are unique to each element in that namespace. Thus namespace nodes for two different elements that represent the same namespace declaration are not identical.
Namespaces in the XML Information SetThe XML infoset recommendation considers namespace declarations to be attribute information items.
In addition, similar to the XPath data model, each element information item in an XML document's information set has a namespace information item for each namespace that is in scope for the element.
XPath, XSLT and NamespacesThe W3C XML Path Language also known as XPath is used to address parts of an XML document and is used in a number of W3C XML technologies including XSLT, XPointer, XML Schema, and DOM Level 3. XPath uses a hierarchical addressing mechanism similar to that used in file systems and URLs to retrieve pieces of an XML document. XPath supports rudimentary manipulation of strings, numbers, and Booleans.
XPath and NamespacesThe XPath data model treats an XML document as a tree of nodes, such as element, attribute, and text nodes, where the name of each node is a combination of its local name and its namespace name (that is, its universal or expanded name).
For element and attribute nodes without namespaces, performing XPath queries is fairly straightforward. The following program, which can be used to query XML documents using the command line, shall be used to demonstrate the impact of namespaces on XPath queries.
using System.Xml.XPath;
using System.Xml;
using System;
using System.IO;
class XPathQuery{
public static string PrintError(Exception e, string errStr){
if(e == null)
return errStr;
else
return PrintError(e.InnerException, errStr + e.Message );
}
public static void Main(string[] args){
if((args.Length == 0) || (args.Length % 2)!= 0){
Console.WriteLine("Usage: xpathquery source query <zero or more
prefix and namespace pairs>");
return;
}
try{
//Load the file.
XmlDocument doc = new XmlDocument();
doc.Load(args[0]); //create prefix<->namespace mappings (if any)
XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
for(int i=2; i < args.Length; i+= 2)
nsMgr.AddNamespace(args[i], args[i + 1]); //Query the document
XmlNodeList nodes = doc.SelectNodes(args[1], nsMgr); //print output
foreach(XmlNode node in nodes)
Console.WriteLine(node.OuterXml + "\n\n");
}catch(XmlException xmle){
Console.WriteLine("ERROR: XML Parse error occured because " +
PrintError(xmle, null));
}catch(FileNotFoundException fnfe){
Console.WriteLine("ERROR: " + PrintError(fnfe, null));
}catch(XPathException xpath){
Console.WriteLine("ERROR: The following error occured while querying
the document: "
&n bsp; + PrintError(xpath, null));
}catch(Exception e){
Console.WriteLine("UNEXPECTED ERROR" + PrintError(e, null));
}
}
}
Given the following XML document that does not declare any namespaces, queries are fairly straightforward as seen in the examples following the code.
<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre="novel">
<title>The Confidence Man</title>
<author>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>
Example 1xpathquery.exe bookstore.xml
/bookstore/book/titleSelects all the title elements that are children of the book element whose parent is the bookstore element, which returns:
<title>The Autobiography of Benjamin Franklin</title>
<title>The Confidence Man</title>xpathquery.exe bookstore.xml
//@genreSelect all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman" and returns:
<title>The Confidence Man</title>
However, once namespaces are added to the mix, things are no longer as simple. The file below is identical to the original file except for the addition of namespaces and one attribute to one of the book elements.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<bk:book genre="novel" bk:genre="fiction"
xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:title>The Confidence Man</bk:title>
<bk:author>
<bk:first-name>Herman</bk:first-name>
<bk:last-name>Melville</bk:last-name>
</bk:author>
<bk:price>11.99</bk:price>
</bk:book>
</bookstore>
Note that the default namespace is in scope for the whole XML document, while the namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore is in scope for the second book element only. Example 2
xpathquery.exe bookstore.xml
/bookstore/book/title
Selects all the title elements that are children of the book element whose parent is the bookstore element, which returns NO RESULTS.
xpathquery.exe bookstore.xml
//@genreSelects all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"
xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman," which returns NO RESULTS.
The first query returns no results because unprefixed names in an XPath query apply to elements or attributes with no namespace. There are no bookstore, book, or title elements in the target document that have no namespace. The second query returns all attribute nodes that have no namespace. Although namespace declarations are in scope for both attribute nodes returned by the query, they have no namespace because namespace declarations do not apply to attributes with unprefixed names. The third query returns no results for the same reasons the first query returns no results.
The way to perform namespace-aware XPath queries is to provide a prefix to namespace mapping to the XPath engine, then use those prefixes in the query. The prefixes provided do not need to be the same as the namespace to prefix mappings in the target document, and they must be non-empty prefixes. Example 3
xpathquery.exe bookstore.xml
/b:bookstore/b:book/b:title b urn:xmlns:25hoursaday-com:bookstoreSelect all the title elements that are children of the book element whose parent is the bookstore element and returns the following:
<title xmlns="urn:xmlns:25hoursaday-com:bookstore">The Autobiography of Benjamin Franklin</title>
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>xpathquery.exe bookstore.xml
//@b:genre b urn:xmlns:25hoursaday-com:bookstoreSelects all the genre attributes from the "urn:xmlns:25hoursaday-com:bookstore" namespace in the document that returns:
bk:genre="fiction"xpathquery.exe bookstore.xml
//bk:title[(../bk:author/bk:first-na me = 'Herman')] bk urn:xmlns:25hoursaday-com:bookstore
Selects all the titles where the author's first name is "Herman" and returns:
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>
Note This last example is the same as the previous examples but rewritten to be namespace aware.
For more information on using XPath, read Aaron Skonnard's article Addressing Infosets with XPath and view the examples at the ZVON.org XPath tutorial.
XSLT and NamespacesThe W3C XSL transformations (XSLT) recommendation describes an XML-based language for transforming XML documents into other XML documents. XSLT transformations, also known as XML style sheets, utilize patterns (XPath) to match aspects of the target document. Upon matching nodes in the target document, templates that specify the output of a successful match can be instantiated and used to transform the document.
Support for namespaces is tightly integrated into XSLT, especially since XPath is used for matching nodes in the source document. Using namespaces in your XPath expressions inside XSLT is much easier than using the DOM.
The example that follows contains:
A program for use in executing transforms from the command line.
An XSLT stylesheet that prints all the title elements from the urn:xmlns:25hoursaday-com:bookstore namespace in the source XML document when run against the bookstore document from the urn:xmlns:25hoursaday-com:bookstore namespace.
The resulting output. Program Imports System.Xml.Xsl
-
Wow! Slashdotted Already?
This article explores the ins and outs of XML namespaces and their ramifications on a number of XML technologies that support namespaces. What follows is a shortened version of my first Extreme XML column.
Overview of XML NamespacesAs XML usage on the Internet became more widespread, the benefits of being able to create markup vocabularies that could be combined and reused similarly to how software modules are combined and reused became increasingly important. If a well defined markup vocabulary for describing coin collections, program configuration files, or fast food restaurant menus already existed, then reusing it made more sense than designing one from scratch. Combining multiple existing vocabularies to create new vocabularies whose whole was greater than the sum of its parts also became a feature that users of XML began to require.
However, the likelihood of identical markup, specifically XML elements and attributes, from different vocabularies with different semantics ending up in the same document became a problem. The very extensibility of XML and the fact that its usage had already become widespread across the Internet precluded simply specifying reserved elements or attribute names as the solution to this problem.
The goal of the W3C XML namespaces recommendation was to create a mechanism in which elements and attributes within an XML document that were from different markup vocabularies could be unambiguously identified and combined without processing problems ensuing. The XML namespaces recommendation provided a method for partitioning various items within an XML document based on processing requirements without placing undue restrictions on how these items should be named. For instance, elements named <template>, <output>, and <stylesheet> can occur in an XSLT stylesheet without there being ambiguity as to whether they are transformation directives or potential output of the transformation.
An XML namespace is a collection of names, identified by a Uniform Resource Identifier (URI) reference, which are used in XML documents as element and attribute names.
Namespace DeclarationsA namespace declaration is typically used to map a namespace URI to a specific prefix. The scope of the prefix-namespace mapping is that of the element that the namespace declaration occurs on as well as all its children. An attribute declaration that begins with the prefix xmlns: is a namespace declaration. The value of such an attribute declaration should be a namespace URI which is the namespace name.
Here is an example of an XML document where the root element contains a namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore and its child element contains an inventory element that contains a namespace declaration that maps the prefix inv to the namespace name urn:xmlns:25hoursaday-com:inventory-tracking.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book>
<bk:title>Lord of the Rings</bk:title>
<bk:author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</bk:book>
</bk:bookstore>
In the above example, the scope of the namespace declaration for the urn:xmlns:25hoursaday-com:bookstore namespace name is the entire bk:bookstore element, while that of the urn:xmlns:25hoursaday-com:inventory-tracking is the inv:inventory element. Namespace aware processors can process items from both namespaces independently of each other, which leads to the ability to do multi-layered processing of XML documents. For instance, RDDL documents are valid XHTML documents that can be rendered by a Web browser but also contain information using elements from the http://www.rddl.org namespace that can be used to locate machine readable resources about the members of an XML namespace.
It should be noted that by definition the prefix xml is bound to the XML namespace name and this special namespace is automatically predeclared with document scope in every well-formed XML document.
Default NamespacesThe previous section on namespace declarations is not entirely complete because it leaves out default namespaces. A default namespace declaration is an attribute declaration that has the name xmlns and its value is the namespace URI that is the namespace name.
A default namespace declaration specifies that every unprefixed element name in its scope be from the declaring namespace. Below is the bookstore example utilizing a default namespace instead of a prefix-namespace mapping.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book>
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
All the elements in the above example except for the inv:inventory element belong to the urn:xmlns:25hoursaday-com:bookstore namespace. The primary purpose of default namespaces is to reduce the verbosity of XML documents that utilize namespaces. However, using default namespaces instead of utilizing explicitly mapped prefixes for element names can be confusing because it is not obvious that the elements in the document are namespace scoped.
Also, unlike regular namespace declarations, default namespace declarations can be undeclared by setting the value of the xmlns attribute to the empty string. Undeclaring default namespace declarations is a practice that should be avoided because it may lead to a document that has unprefixed names that belong to a namespace in one part of the document, but don't in another. For example, in the document below only the bookstore element is from the urn:xmlns:25hoursaday-com:bookstore while the other unprefixed elements have no namespace name.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book xmlns="">
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
This practice should be avoided because it leads to extremely confusing situations for readers of the XML document. For more information on undeclaring namespace declarations, see the section on Namespaces Future.
Qualified and Expanded NamesA qualified name, also known as a QName, is an XML name called the local name optionally preceded by another XML name called the prefix and a colon (':') character. The XML names used as the prefix and the local name must match the NCName production, which means that they must not contain a colon character. The prefix of a qualified name must have been mapped to a namespace URI through an in-scope namespace declaration mapping the prefix to the namespace URI. A qualified name can be used as either an attribute or element name.
Although QNames are important mnemonic guides to determining what namespace the elements and attributes within a document are derived from, they are rarely important to XML aware processors. For example, the following three XML documents would be treated identically by a range of XML technologies including, of course, XML schema validators.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType id="123" name="fooType"/>
</xs:schema>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:complexType id="123" name="fooType"/>
</xsd:schema>
<schema xmlns="http://www.w3.org/2001/XMLSchema">
<complexType id="123" name="fooType"/>
</schema>
The W3C XML Path Language recommendation describes an expanded name as a pair consisting of a namespace name and a local name. A universal name is an alternate term coined by James Clark to describe the same concept. A universal name consists of a namespace name in curly braces and a local name. Namespaces tend to make more sense to people when viewed through the lens of universal names. Here are the three XML documents from the previous example with the QNames replaced by universal names. Note that the syntax below is not valid XML syntax.
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
To many XML applications, the universal name of the elements and attributes in an XML document are what is important, and not the values of the prefixes used in specific QNames. The primary reason the Namespaces in XML recommendation does not take the expanded name approach to specifying namespaces is due to its verbosity. Instead, prefix mappings and default namespaces are provided to save us all from developing carpal tunnel syndrome from typing namespace URIs endlessly.
Namespaces and AttributesNamespace declarations do not apply to attributes unless the attribute's name is prefixed. In the XML document shown below the title attribute belongs to the bk:book element and has no namespace while the bk:title attribute has urn:xmlns:25hoursaday-com:bookstore as its namespace name. Note that even though both attributes have the same local name the document is well formed.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book title="Lord of the Rings, Book 3" bk:title="Return of the King"/>
</bk:bookstore>
In the following example, the title attribute still has no namespace and belongs the book element even though there is a default namespace specified. In other words, attributes cannot inherit the default namespace.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book title="Lord of the Rings, Book 3" />
</bookstore>
Namespace URIsA namespace name is a Uniform Resource Identifier (URI) as specified in RFC 2396. A URI is either a Uniform Resource Locators (URLs) or a Uniform Resource Names (URNs). URLs are used to specify the location of resources on the Internet, while URNs are supposed to be persistent, location-independent identifiers for information resources. Namespace names are considered to be identical only if they are the same character for character (case-sensitive). The primary justification for using URIs as namespace names is that they already provide a mechanism for specifying globally unique identities.
The XML namespaces recommendation states that namespace names are only to act as unique identifiers and do not have to actually identify network retrievable resources. This has led to much confusion amongst authors and users of XML documents, especially since the usage of HTTP based URLs as namespace names has grown in popularity. Because many applications convert such URIs to hyperlinks, it is irritating to many users that these "links" do not lead to Web pages or other network retrievable resource. I remember one user who likened it to being given a fake phone number in a social situation.
One solution to avoid confusing users is to use a namespace-naming schema that does not imply network retrievability of the resource. I personally use the urn:xmlns: scheme for this purpose and create namespace names similar to urn:xmlns:25hoursaday-com when authoring XML documents for personal use. The problem with homegrown namespace URIs is that they may run counter to the intent of the Names in XML recommendation by not being globally unique. I get around the globally unique requirement by using my personal domain name http://www.25hoursaday.com as part of the namespace URI.
Another solution is to leave a network retrievable resource at the URI that is the namespace name, such as is done with the XSLT and RDDL namespaces. Typically, such URIs are actually HTTP URLs. A good way to name such URLs is by using the format favored by the W3C, which is as follows:
http://my.domain.example.org/product/[year/month][ / rea]
See the section on Namespaces and Versioning for more information on using similarly structured namespace names as a versioning mechanism.
DOM, XPath, and the XML Information Set on NamespacesThe W3C has defined a number of technologies that provide a data model for XML documents. These data models are generally in agreement, but sometimes differ in how they treat various edge cases due to historic reasons. Treatment of XML namespaces and namespace declarations is an example of an edge case that is treated differently in the three primary data models that exist as W3C recommendations. The three data models are the XPath data model, the Document Object Model (DOM), and the XML information set.
The XML information set (XML infoset) is an abstract description of the data in an XML document and can be considered to be the primary data model for an XML document. The XPath data model is a tree-based model that is traversed when querying an XML document and is similar to the XML information set. The DOM precedes both data models but is also similar to both data models in a number of ways. Both the DOM and the XPath data model can be considered to be interpretations of the XML infoset.
Namespaces in the Document Object Model (DOM)The XML namespace section of the DOM Level 3 specification considers namespace declarations to be regular attribute nodes that have http://www.w3.org/2000/xmlns/ as their namespace name and xmlns as their prefix or qualified name.
Elements and attributes in the DOM have a namespace name that cannot be altered after they have been created regardless of whether their location within the document changes or not.
Namespaces in the XPath Data ModelThe W3C XPath recommendation does not consider namespace declarations to be attribute nodes and does not provide access to them in that capacity. Instead, in XPath every element in an XML document has a number of namespace nodes that can be retrieved using the XPath namespace navigation axis.
Each element in the document has a unique set of namespace nodes for each namespace declaration in scope for that particular element. Namespace nodes are unique to each element in that namespace. Thus namespace nodes for two different elements that represent the same namespace declaration are not identical.
Namespaces in the XML Information SetThe XML infoset recommendation considers namespace declarations to be attribute information items.
In addition, similar to the XPath data model, each element information item in an XML document's information set has a namespace information item for each namespace that is in scope for the element.
XPath, XSLT and NamespacesThe W3C XML Path Language also known as XPath is used to address parts of an XML document and is used in a number of W3C XML technologies including XSLT, XPointer, XML Schema, and DOM Level 3. XPath uses a hierarchical addressing mechanism similar to that used in file systems and URLs to retrieve pieces of an XML document. XPath supports rudimentary manipulation of strings, numbers, and Booleans.
XPath and NamespacesThe XPath data model treats an XML document as a tree of nodes, such as element, attribute, and text nodes, where the name of each node is a combination of its local name and its namespace name (that is, its universal or expanded name).
For element and attribute nodes without namespaces, performing XPath queries is fairly straightforward. The following program, which can be used to query XML documents using the command line, shall be used to demonstrate the impact of namespaces on XPath queries.
using System.Xml.XPath;
using System.Xml;
using System;
using System.IO;
class XPathQuery{
public static string PrintError(Exception e, string errStr){
if(e == null)
return errStr;
else
return PrintError(e.InnerException, errStr + e.Message );
}
public static void Main(string[] args){
if((args.Length == 0) || (args.Length % 2)!= 0){
Console.WriteLine("Usage: xpathquery source query <zero or more
prefix and namespace pairs>");
return;
}
try{
//Load the file.
XmlDocument doc = new XmlDocument();
doc.Load(args[0]); //create prefix<->namespace mappings (if any)
XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
for(int i=2; i < args.Length; i+= 2)
nsMgr.AddNamespace(args[i], args[i + 1]); //Query the document
XmlNodeList nodes = doc.SelectNodes(args[1], nsMgr); //print output
foreach(XmlNode node in nodes)
Console.WriteLine(node.OuterXml + "\n\n");
}catch(XmlException xmle){
Console.WriteLine("ERROR: XML Parse error occured because " +
PrintError(xmle, null));
}catch(FileNotFoundException fnfe){
Console.WriteLine("ERROR: " + PrintError(fnfe, null));
}catch(XPathException xpath){
Console.WriteLine("ERROR: The following error occured while querying
the document: "
&n bsp; + PrintError(xpath, null));
}catch(Exception e){
Console.WriteLine("UNEXPECTED ERROR" + PrintError(e, null));
}
}
}
Given the following XML document that does not declare any namespaces, queries are fairly straightforward as seen in the examples following the code.
<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre="novel">
<title>The Confidence Man</title>
<author>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>
Example 1xpathquery.exe bookstore.xml
/bookstore/book/titleSelects all the title elements that are children of the book element whose parent is the bookstore element, which returns:
<title>The Autobiography of Benjamin Franklin</title>
<title>The Confidence Man</title>xpathquery.exe bookstore.xml
//@genreSelect all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman" and returns:
<title>The Confidence Man</title>
However, once namespaces are added to the mix, things are no longer as simple. The file below is identical to the original file except for the addition of namespaces and one attribute to one of the book elements.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<bk:book genre="novel" bk:genre="fiction"
xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:title>The Confidence Man</bk:title>
<bk:author>
<bk:first-name>Herman</bk:first-name>
<bk:last-name>Melville</bk:last-name>
</bk:author>
<bk:price>11.99</bk:price>
</bk:book>
</bookstore>
Note that the default namespace is in scope for the whole XML document, while the namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore is in scope for the second book element only. Example 2
xpathquery.exe bookstore.xml
/bookstore/book/title
Selects all the title elements that are children of the book element whose parent is the bookstore element, which returns NO RESULTS.
xpathquery.exe bookstore.xml
//@genreSelects all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"
xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman," which returns NO RESULTS.
The first query returns no results because unprefixed names in an XPath query apply to elements or attributes with no namespace. There are no bookstore, book, or title elements in the target document that have no namespace. The second query returns all attribute nodes that have no namespace. Although namespace declarations are in scope for both attribute nodes returned by the query, they have no namespace because namespace declarations do not apply to attributes with unprefixed names. The third query returns no results for the same reasons the first query returns no results.
The way to perform namespace-aware XPath queries is to provide a prefix to namespace mapping to the XPath engine, then use those prefixes in the query. The prefixes provided do not need to be the same as the namespace to prefix mappings in the target document, and they must be non-empty prefixes. Example 3
xpathquery.exe bookstore.xml
/b:bookstore/b:book/b:title b urn:xmlns:25hoursaday-com:bookstoreSelect all the title elements that are children of the book element whose parent is the bookstore element and returns the following:
<title xmlns="urn:xmlns:25hoursaday-com:bookstore">The Autobiography of Benjamin Franklin</title>
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>xpathquery.exe bookstore.xml
//@b:genre b urn:xmlns:25hoursaday-com:bookstoreSelects all the genre attributes from the "urn:xmlns:25hoursaday-com:bookstore" namespace in the document that returns:
bk:genre="fiction"xpathquery.exe bookstore.xml
//bk:title[(../bk:author/bk:first-na me = 'Herman')] bk urn:xmlns:25hoursaday-com:bookstore
Selects all the titles where the author's first name is "Herman" and returns:
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>
Note This last example is the same as the previous examples but rewritten to be namespace aware.
For more information on using XPath, read Aaron Skonnard's article Addressing Infosets with XPath and view the examples at the ZVON.org XPath tutorial.
XSLT and NamespacesThe W3C XSL transformations (XSLT) recommendation describes an XML-based language for transforming XML documents into other XML documents. XSLT transformations, also known as XML style sheets, utilize patterns (XPath) to match aspects of the target document. Upon matching nodes in the target document, templates that specify the output of a successful match can be instantiated and used to transform the document.
Support for namespaces is tightly integrated into XSLT, especially since XPath is used for matching nodes in the source document. Using namespaces in your XPath expressions inside XSLT is much easier than using the DOM.
The example that follows contains:
A program for use in executing transforms from the command line.
An XSLT stylesheet that prints all the title elements from the urn:xmlns:25hoursaday-com:bookstore namespace in the source XML document when run against the bookstore document from the urn:xmlns:25hoursaday-com:bookstore namespace.
The resulting output. Program Imports System.Xml.Xsl
-
Wow! Slashdotted Already?
This article explores the ins and outs of XML namespaces and their ramifications on a number of XML technologies that support namespaces. What follows is a shortened version of my first Extreme XML column.
Overview of XML NamespacesAs XML usage on the Internet became more widespread, the benefits of being able to create markup vocabularies that could be combined and reused similarly to how software modules are combined and reused became increasingly important. If a well defined markup vocabulary for describing coin collections, program configuration files, or fast food restaurant menus already existed, then reusing it made more sense than designing one from scratch. Combining multiple existing vocabularies to create new vocabularies whose whole was greater than the sum of its parts also became a feature that users of XML began to require.
However, the likelihood of identical markup, specifically XML elements and attributes, from different vocabularies with different semantics ending up in the same document became a problem. The very extensibility of XML and the fact that its usage had already become widespread across the Internet precluded simply specifying reserved elements or attribute names as the solution to this problem.
The goal of the W3C XML namespaces recommendation was to create a mechanism in which elements and attributes within an XML document that were from different markup vocabularies could be unambiguously identified and combined without processing problems ensuing. The XML namespaces recommendation provided a method for partitioning various items within an XML document based on processing requirements without placing undue restrictions on how these items should be named. For instance, elements named <template>, <output>, and <stylesheet> can occur in an XSLT stylesheet without there being ambiguity as to whether they are transformation directives or potential output of the transformation.
An XML namespace is a collection of names, identified by a Uniform Resource Identifier (URI) reference, which are used in XML documents as element and attribute names.
Namespace DeclarationsA namespace declaration is typically used to map a namespace URI to a specific prefix. The scope of the prefix-namespace mapping is that of the element that the namespace declaration occurs on as well as all its children. An attribute declaration that begins with the prefix xmlns: is a namespace declaration. The value of such an attribute declaration should be a namespace URI which is the namespace name.
Here is an example of an XML document where the root element contains a namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore and its child element contains an inventory element that contains a namespace declaration that maps the prefix inv to the namespace name urn:xmlns:25hoursaday-com:inventory-tracking.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book>
<bk:title>Lord of the Rings</bk:title>
<bk:author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</bk:book>
</bk:bookstore>
In the above example, the scope of the namespace declaration for the urn:xmlns:25hoursaday-com:bookstore namespace name is the entire bk:bookstore element, while that of the urn:xmlns:25hoursaday-com:inventory-tracking is the inv:inventory element. Namespace aware processors can process items from both namespaces independently of each other, which leads to the ability to do multi-layered processing of XML documents. For instance, RDDL documents are valid XHTML documents that can be rendered by a Web browser but also contain information using elements from the http://www.rddl.org namespace that can be used to locate machine readable resources about the members of an XML namespace.
It should be noted that by definition the prefix xml is bound to the XML namespace name and this special namespace is automatically predeclared with document scope in every well-formed XML document.
Default NamespacesThe previous section on namespace declarations is not entirely complete because it leaves out default namespaces. A default namespace declaration is an attribute declaration that has the name xmlns and its value is the namespace URI that is the namespace name.
A default namespace declaration specifies that every unprefixed element name in its scope be from the declaring namespace. Below is the bookstore example utilizing a default namespace instead of a prefix-namespace mapping.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book>
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
All the elements in the above example except for the inv:inventory element belong to the urn:xmlns:25hoursaday-com:bookstore namespace. The primary purpose of default namespaces is to reduce the verbosity of XML documents that utilize namespaces. However, using default namespaces instead of utilizing explicitly mapped prefixes for element names can be confusing because it is not obvious that the elements in the document are namespace scoped.
Also, unlike regular namespace declarations, default namespace declarations can be undeclared by setting the value of the xmlns attribute to the empty string. Undeclaring default namespace declarations is a practice that should be avoided because it may lead to a document that has unprefixed names that belong to a namespace in one part of the document, but don't in another. For example, in the document below only the bookstore element is from the urn:xmlns:25hoursaday-com:bookstore while the other unprefixed elements have no namespace name.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book xmlns="">
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
This practice should be avoided because it leads to extremely confusing situations for readers of the XML document. For more information on undeclaring namespace declarations, see the section on Namespaces Future.
Qualified and Expanded NamesA qualified name, also known as a QName, is an XML name called the local name optionally preceded by another XML name called the prefix and a colon (':') character. The XML names used as the prefix and the local name must match the NCName production, which means that they must not contain a colon character. The prefix of a qualified name must have been mapped to a namespace URI through an in-scope namespace declaration mapping the prefix to the namespace URI. A qualified name can be used as either an attribute or element name.
Although QNames are important mnemonic guides to determining what namespace the elements and attributes within a document are derived from, they are rarely important to XML aware processors. For example, the following three XML documents would be treated identically by a range of XML technologies including, of course, XML schema validators.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType id="123" name="fooType"/>
</xs:schema>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:complexType id="123" name="fooType"/>
</xsd:schema>
<schema xmlns="http://www.w3.org/2001/XMLSchema">
<complexType id="123" name="fooType"/>
</schema>
The W3C XML Path Language recommendation describes an expanded name as a pair consisting of a namespace name and a local name. A universal name is an alternate term coined by James Clark to describe the same concept. A universal name consists of a namespace name in curly braces and a local name. Namespaces tend to make more sense to people when viewed through the lens of universal names. Here are the three XML documents from the previous example with the QNames replaced by universal names. Note that the syntax below is not valid XML syntax.
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
To many XML applications, the universal name of the elements and attributes in an XML document are what is important, and not the values of the prefixes used in specific QNames. The primary reason the Namespaces in XML recommendation does not take the expanded name approach to specifying namespaces is due to its verbosity. Instead, prefix mappings and default namespaces are provided to save us all from developing carpal tunnel syndrome from typing namespace URIs endlessly.
Namespaces and AttributesNamespace declarations do not apply to attributes unless the attribute's name is prefixed. In the XML document shown below the title attribute belongs to the bk:book element and has no namespace while the bk:title attribute has urn:xmlns:25hoursaday-com:bookstore as its namespace name. Note that even though both attributes have the same local name the document is well formed.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book title="Lord of the Rings, Book 3" bk:title="Return of the King"/>
</bk:bookstore>
In the following example, the title attribute still has no namespace and belongs the book element even though there is a default namespace specified. In other words, attributes cannot inherit the default namespace.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book title="Lord of the Rings, Book 3" />
</bookstore>
Namespace URIsA namespace name is a Uniform Resource Identifier (URI) as specified in RFC 2396. A URI is either a Uniform Resource Locators (URLs) or a Uniform Resource Names (URNs). URLs are used to specify the location of resources on the Internet, while URNs are supposed to be persistent, location-independent identifiers for information resources. Namespace names are considered to be identical only if they are the same character for character (case-sensitive). The primary justification for using URIs as namespace names is that they already provide a mechanism for specifying globally unique identities.
The XML namespaces recommendation states that namespace names are only to act as unique identifiers and do not have to actually identify network retrievable resources. This has led to much confusion amongst authors and users of XML documents, especially since the usage of HTTP based URLs as namespace names has grown in popularity. Because many applications convert such URIs to hyperlinks, it is irritating to many users that these "links" do not lead to Web pages or other network retrievable resource. I remember one user who likened it to being given a fake phone number in a social situation.
One solution to avoid confusing users is to use a namespace-naming schema that does not imply network retrievability of the resource. I personally use the urn:xmlns: scheme for this purpose and create namespace names similar to urn:xmlns:25hoursaday-com when authoring XML documents for personal use. The problem with homegrown namespace URIs is that they may run counter to the intent of the Names in XML recommendation by not being globally unique. I get around the globally unique requirement by using my personal domain name http://www.25hoursaday.com as part of the namespace URI.
Another solution is to leave a network retrievable resource at the URI that is the namespace name, such as is done with the XSLT and RDDL namespaces. Typically, such URIs are actually HTTP URLs. A good way to name such URLs is by using the format favored by the W3C, which is as follows:
http://my.domain.example.org/product/[year/month][ / rea]
See the section on Namespaces and Versioning for more information on using similarly structured namespace names as a versioning mechanism.
DOM, XPath, and the XML Information Set on NamespacesThe W3C has defined a number of technologies that provide a data model for XML documents. These data models are generally in agreement, but sometimes differ in how they treat various edge cases due to historic reasons. Treatment of XML namespaces and namespace declarations is an example of an edge case that is treated differently in the three primary data models that exist as W3C recommendations. The three data models are the XPath data model, the Document Object Model (DOM), and the XML information set.
The XML information set (XML infoset) is an abstract description of the data in an XML document and can be considered to be the primary data model for an XML document. The XPath data model is a tree-based model that is traversed when querying an XML document and is similar to the XML information set. The DOM precedes both data models but is also similar to both data models in a number of ways. Both the DOM and the XPath data model can be considered to be interpretations of the XML infoset.
Namespaces in the Document Object Model (DOM)The XML namespace section of the DOM Level 3 specification considers namespace declarations to be regular attribute nodes that have http://www.w3.org/2000/xmlns/ as their namespace name and xmlns as their prefix or qualified name.
Elements and attributes in the DOM have a namespace name that cannot be altered after they have been created regardless of whether their location within the document changes or not.
Namespaces in the XPath Data ModelThe W3C XPath recommendation does not consider namespace declarations to be attribute nodes and does not provide access to them in that capacity. Instead, in XPath every element in an XML document has a number of namespace nodes that can be retrieved using the XPath namespace navigation axis.
Each element in the document has a unique set of namespace nodes for each namespace declaration in scope for that particular element. Namespace nodes are unique to each element in that namespace. Thus namespace nodes for two different elements that represent the same namespace declaration are not identical.
Namespaces in the XML Information SetThe XML infoset recommendation considers namespace declarations to be attribute information items.
In addition, similar to the XPath data model, each element information item in an XML document's information set has a namespace information item for each namespace that is in scope for the element.
XPath, XSLT and NamespacesThe W3C XML Path Language also known as XPath is used to address parts of an XML document and is used in a number of W3C XML technologies including XSLT, XPointer, XML Schema, and DOM Level 3. XPath uses a hierarchical addressing mechanism similar to that used in file systems and URLs to retrieve pieces of an XML document. XPath supports rudimentary manipulation of strings, numbers, and Booleans.
XPath and NamespacesThe XPath data model treats an XML document as a tree of nodes, such as element, attribute, and text nodes, where the name of each node is a combination of its local name and its namespace name (that is, its universal or expanded name).
For element and attribute nodes without namespaces, performing XPath queries is fairly straightforward. The following program, which can be used to query XML documents using the command line, shall be used to demonstrate the impact of namespaces on XPath queries.
using System.Xml.XPath;
using System.Xml;
using System;
using System.IO;
class XPathQuery{
public static string PrintError(Exception e, string errStr){
if(e == null)
return errStr;
else
return PrintError(e.InnerException, errStr + e.Message );
}
public static void Main(string[] args){
if((args.Length == 0) || (args.Length % 2)!= 0){
Console.WriteLine("Usage: xpathquery source query <zero or more
prefix and namespace pairs>");
return;
}
try{
//Load the file.
XmlDocument doc = new XmlDocument();
doc.Load(args[0]); //create prefix<->namespace mappings (if any)
XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
for(int i=2; i < args.Length; i+= 2)
nsMgr.AddNamespace(args[i], args[i + 1]); //Query the document
XmlNodeList nodes = doc.SelectNodes(args[1], nsMgr); //print output
foreach(XmlNode node in nodes)
Console.WriteLine(node.OuterXml + "\n\n");
}catch(XmlException xmle){
Console.WriteLine("ERROR: XML Parse error occured because " +
PrintError(xmle, null));
}catch(FileNotFoundException fnfe){
Console.WriteLine("ERROR: " + PrintError(fnfe, null));
}catch(XPathException xpath){
Console.WriteLine("ERROR: The following error occured while querying
the document: "
&n bsp; + PrintError(xpath, null));
}catch(Exception e){
Console.WriteLine("UNEXPECTED ERROR" + PrintError(e, null));
}
}
}
Given the following XML document that does not declare any namespaces, queries are fairly straightforward as seen in the examples following the code.
<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre="novel">
<title>The Confidence Man</title>
<author>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>
Example 1xpathquery.exe bookstore.xml
/bookstore/book/titleSelects all the title elements that are children of the book element whose parent is the bookstore element, which returns:
<title>The Autobiography of Benjamin Franklin</title>
<title>The Confidence Man</title>xpathquery.exe bookstore.xml
//@genreSelect all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman" and returns:
<title>The Confidence Man</title>
However, once namespaces are added to the mix, things are no longer as simple. The file below is identical to the original file except for the addition of namespaces and one attribute to one of the book elements.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<bk:book genre="novel" bk:genre="fiction"
xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:title>The Confidence Man</bk:title>
<bk:author>
<bk:first-name>Herman</bk:first-name>
<bk:last-name>Melville</bk:last-name>
</bk:author>
<bk:price>11.99</bk:price>
</bk:book>
</bookstore>
Note that the default namespace is in scope for the whole XML document, while the namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore is in scope for the second book element only. Example 2
xpathquery.exe bookstore.xml
/bookstore/book/title
Selects all the title elements that are children of the book element whose parent is the bookstore element, which returns NO RESULTS.
xpathquery.exe bookstore.xml
//@genreSelects all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"
xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman," which returns NO RESULTS.
The first query returns no results because unprefixed names in an XPath query apply to elements or attributes with no namespace. There are no bookstore, book, or title elements in the target document that have no namespace. The second query returns all attribute nodes that have no namespace. Although namespace declarations are in scope for both attribute nodes returned by the query, they have no namespace because namespace declarations do not apply to attributes with unprefixed names. The third query returns no results for the same reasons the first query returns no results.
The way to perform namespace-aware XPath queries is to provide a prefix to namespace mapping to the XPath engine, then use those prefixes in the query. The prefixes provided do not need to be the same as the namespace to prefix mappings in the target document, and they must be non-empty prefixes. Example 3
xpathquery.exe bookstore.xml
/b:bookstore/b:book/b:title b urn:xmlns:25hoursaday-com:bookstoreSelect all the title elements that are children of the book element whose parent is the bookstore element and returns the following:
<title xmlns="urn:xmlns:25hoursaday-com:bookstore">The Autobiography of Benjamin Franklin</title>
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>xpathquery.exe bookstore.xml
//@b:genre b urn:xmlns:25hoursaday-com:bookstoreSelects all the genre attributes from the "urn:xmlns:25hoursaday-com:bookstore" namespace in the document that returns:
bk:genre="fiction"xpathquery.exe bookstore.xml
//bk:title[(../bk:author/bk:first-na me = 'Herman')] bk urn:xmlns:25hoursaday-com:bookstore
Selects all the titles where the author's first name is "Herman" and returns:
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>
Note This last example is the same as the previous examples but rewritten to be namespace aware.
For more information on using XPath, read Aaron Skonnard's article Addressing Infosets with XPath and view the examples at the ZVON.org XPath tutorial.
XSLT and NamespacesThe W3C XSL transformations (XSLT) recommendation describes an XML-based language for transforming XML documents into other XML documents. XSLT transformations, also known as XML style sheets, utilize patterns (XPath) to match aspects of the target document. Upon matching nodes in the target document, templates that specify the output of a successful match can be instantiated and used to transform the document.
Support for namespaces is tightly integrated into XSLT, especially since XPath is used for matching nodes in the source document. Using namespaces in your XPath expressions inside XSLT is much easier than using the DOM.
The example that follows contains:
A program for use in executing transforms from the command line.
An XSLT stylesheet that prints all the title elements from the urn:xmlns:25hoursaday-com:bookstore namespace in the source XML document when run against the bookstore document from the urn:xmlns:25hoursaday-com:bookstore namespace.
The resulting output. Program Imports System.Xml.Xsl
-
Wow! Slashdotted Already?
This article explores the ins and outs of XML namespaces and their ramifications on a number of XML technologies that support namespaces. What follows is a shortened version of my first Extreme XML column.
Overview of XML NamespacesAs XML usage on the Internet became more widespread, the benefits of being able to create markup vocabularies that could be combined and reused similarly to how software modules are combined and reused became increasingly important. If a well defined markup vocabulary for describing coin collections, program configuration files, or fast food restaurant menus already existed, then reusing it made more sense than designing one from scratch. Combining multiple existing vocabularies to create new vocabularies whose whole was greater than the sum of its parts also became a feature that users of XML began to require.
However, the likelihood of identical markup, specifically XML elements and attributes, from different vocabularies with different semantics ending up in the same document became a problem. The very extensibility of XML and the fact that its usage had already become widespread across the Internet precluded simply specifying reserved elements or attribute names as the solution to this problem.
The goal of the W3C XML namespaces recommendation was to create a mechanism in which elements and attributes within an XML document that were from different markup vocabularies could be unambiguously identified and combined without processing problems ensuing. The XML namespaces recommendation provided a method for partitioning various items within an XML document based on processing requirements without placing undue restrictions on how these items should be named. For instance, elements named <template>, <output>, and <stylesheet> can occur in an XSLT stylesheet without there being ambiguity as to whether they are transformation directives or potential output of the transformation.
An XML namespace is a collection of names, identified by a Uniform Resource Identifier (URI) reference, which are used in XML documents as element and attribute names.
Namespace DeclarationsA namespace declaration is typically used to map a namespace URI to a specific prefix. The scope of the prefix-namespace mapping is that of the element that the namespace declaration occurs on as well as all its children. An attribute declaration that begins with the prefix xmlns: is a namespace declaration. The value of such an attribute declaration should be a namespace URI which is the namespace name.
Here is an example of an XML document where the root element contains a namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore and its child element contains an inventory element that contains a namespace declaration that maps the prefix inv to the namespace name urn:xmlns:25hoursaday-com:inventory-tracking.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book>
<bk:title>Lord of the Rings</bk:title>
<bk:author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</bk:book>
</bk:bookstore>
In the above example, the scope of the namespace declaration for the urn:xmlns:25hoursaday-com:bookstore namespace name is the entire bk:bookstore element, while that of the urn:xmlns:25hoursaday-com:inventory-tracking is the inv:inventory element. Namespace aware processors can process items from both namespaces independently of each other, which leads to the ability to do multi-layered processing of XML documents. For instance, RDDL documents are valid XHTML documents that can be rendered by a Web browser but also contain information using elements from the http://www.rddl.org namespace that can be used to locate machine readable resources about the members of an XML namespace.
It should be noted that by definition the prefix xml is bound to the XML namespace name and this special namespace is automatically predeclared with document scope in every well-formed XML document.
Default NamespacesThe previous section on namespace declarations is not entirely complete because it leaves out default namespaces. A default namespace declaration is an attribute declaration that has the name xmlns and its value is the namespace URI that is the namespace name.
A default namespace declaration specifies that every unprefixed element name in its scope be from the declaring namespace. Below is the bookstore example utilizing a default namespace instead of a prefix-namespace mapping.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book>
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
All the elements in the above example except for the inv:inventory element belong to the urn:xmlns:25hoursaday-com:bookstore namespace. The primary purpose of default namespaces is to reduce the verbosity of XML documents that utilize namespaces. However, using default namespaces instead of utilizing explicitly mapped prefixes for element names can be confusing because it is not obvious that the elements in the document are namespace scoped.
Also, unlike regular namespace declarations, default namespace declarations can be undeclared by setting the value of the xmlns attribute to the empty string. Undeclaring default namespace declarations is a practice that should be avoided because it may lead to a document that has unprefixed names that belong to a namespace in one part of the document, but don't in another. For example, in the document below only the bookstore element is from the urn:xmlns:25hoursaday-com:bookstore while the other unprefixed elements have no namespace name.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book xmlns="">
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
This practice should be avoided because it leads to extremely confusing situations for readers of the XML document. For more information on undeclaring namespace declarations, see the section on Namespaces Future.
Qualified and Expanded NamesA qualified name, also known as a QName, is an XML name called the local name optionally preceded by another XML name called the prefix and a colon (':') character. The XML names used as the prefix and the local name must match the NCName production, which means that they must not contain a colon character. The prefix of a qualified name must have been mapped to a namespace URI through an in-scope namespace declaration mapping the prefix to the namespace URI. A qualified name can be used as either an attribute or element name.
Although QNames are important mnemonic guides to determining what namespace the elements and attributes within a document are derived from, they are rarely important to XML aware processors. For example, the following three XML documents would be treated identically by a range of XML technologies including, of course, XML schema validators.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType id="123" name="fooType"/>
</xs:schema>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:complexType id="123" name="fooType"/>
</xsd:schema>
<schema xmlns="http://www.w3.org/2001/XMLSchema">
<complexType id="123" name="fooType"/>
</schema>
The W3C XML Path Language recommendation describes an expanded name as a pair consisting of a namespace name and a local name. A universal name is an alternate term coined by James Clark to describe the same concept. A universal name consists of a namespace name in curly braces and a local name. Namespaces tend to make more sense to people when viewed through the lens of universal names. Here are the three XML documents from the previous example with the QNames replaced by universal names. Note that the syntax below is not valid XML syntax.
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
To many XML applications, the universal name of the elements and attributes in an XML document are what is important, and not the values of the prefixes used in specific QNames. The primary reason the Namespaces in XML recommendation does not take the expanded name approach to specifying namespaces is due to its verbosity. Instead, prefix mappings and default namespaces are provided to save us all from developing carpal tunnel syndrome from typing namespace URIs endlessly.
Namespaces and AttributesNamespace declarations do not apply to attributes unless the attribute's name is prefixed. In the XML document shown below the title attribute belongs to the bk:book element and has no namespace while the bk:title attribute has urn:xmlns:25hoursaday-com:bookstore as its namespace name. Note that even though both attributes have the same local name the document is well formed.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book title="Lord of the Rings, Book 3" bk:title="Return of the King"/>
</bk:bookstore>
In the following example, the title attribute still has no namespace and belongs the book element even though there is a default namespace specified. In other words, attributes cannot inherit the default namespace.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book title="Lord of the Rings, Book 3" />
</bookstore>
Namespace URIsA namespace name is a Uniform Resource Identifier (URI) as specified in RFC 2396. A URI is either a Uniform Resource Locators (URLs) or a Uniform Resource Names (URNs). URLs are used to specify the location of resources on the Internet, while URNs are supposed to be persistent, location-independent identifiers for information resources. Namespace names are considered to be identical only if they are the same character for character (case-sensitive). The primary justification for using URIs as namespace names is that they already provide a mechanism for specifying globally unique identities.
The XML namespaces recommendation states that namespace names are only to act as unique identifiers and do not have to actually identify network retrievable resources. This has led to much confusion amongst authors and users of XML documents, especially since the usage of HTTP based URLs as namespace names has grown in popularity. Because many applications convert such URIs to hyperlinks, it is irritating to many users that these "links" do not lead to Web pages or other network retrievable resource. I remember one user who likened it to being given a fake phone number in a social situation.
One solution to avoid confusing users is to use a namespace-naming schema that does not imply network retrievability of the resource. I personally use the urn:xmlns: scheme for this purpose and create namespace names similar to urn:xmlns:25hoursaday-com when authoring XML documents for personal use. The problem with homegrown namespace URIs is that they may run counter to the intent of the Names in XML recommendation by not being globally unique. I get around the globally unique requirement by using my personal domain name http://www.25hoursaday.com as part of the namespace URI.
Another solution is to leave a network retrievable resource at the URI that is the namespace name, such as is done with the XSLT and RDDL namespaces. Typically, such URIs are actually HTTP URLs. A good way to name such URLs is by using the format favored by the W3C, which is as follows:
http://my.domain.example.org/product/[year/month][ / rea]
See the section on Namespaces and Versioning for more information on using similarly structured namespace names as a versioning mechanism.
DOM, XPath, and the XML Information Set on NamespacesThe W3C has defined a number of technologies that provide a data model for XML documents. These data models are generally in agreement, but sometimes differ in how they treat various edge cases due to historic reasons. Treatment of XML namespaces and namespace declarations is an example of an edge case that is treated differently in the three primary data models that exist as W3C recommendations. The three data models are the XPath data model, the Document Object Model (DOM), and the XML information set.
The XML information set (XML infoset) is an abstract description of the data in an XML document and can be considered to be the primary data model for an XML document. The XPath data model is a tree-based model that is traversed when querying an XML document and is similar to the XML information set. The DOM precedes both data models but is also similar to both data models in a number of ways. Both the DOM and the XPath data model can be considered to be interpretations of the XML infoset.
Namespaces in the Document Object Model (DOM)The XML namespace section of the DOM Level 3 specification considers namespace declarations to be regular attribute nodes that have http://www.w3.org/2000/xmlns/ as their namespace name and xmlns as their prefix or qualified name.
Elements and attributes in the DOM have a namespace name that cannot be altered after they have been created regardless of whether their location within the document changes or not.
Namespaces in the XPath Data ModelThe W3C XPath recommendation does not consider namespace declarations to be attribute nodes and does not provide access to them in that capacity. Instead, in XPath every element in an XML document has a number of namespace nodes that can be retrieved using the XPath namespace navigation axis.
Each element in the document has a unique set of namespace nodes for each namespace declaration in scope for that particular element. Namespace nodes are unique to each element in that namespace. Thus namespace nodes for two different elements that represent the same namespace declaration are not identical.
Namespaces in the XML Information SetThe XML infoset recommendation considers namespace declarations to be attribute information items.
In addition, similar to the XPath data model, each element information item in an XML document's information set has a namespace information item for each namespace that is in scope for the element.
XPath, XSLT and NamespacesThe W3C XML Path Language also known as XPath is used to address parts of an XML document and is used in a number of W3C XML technologies including XSLT, XPointer, XML Schema, and DOM Level 3. XPath uses a hierarchical addressing mechanism similar to that used in file systems and URLs to retrieve pieces of an XML document. XPath supports rudimentary manipulation of strings, numbers, and Booleans.
XPath and NamespacesThe XPath data model treats an XML document as a tree of nodes, such as element, attribute, and text nodes, where the name of each node is a combination of its local name and its namespace name (that is, its universal or expanded name).
For element and attribute nodes without namespaces, performing XPath queries is fairly straightforward. The following program, which can be used to query XML documents using the command line, shall be used to demonstrate the impact of namespaces on XPath queries.
using System.Xml.XPath;
using System.Xml;
using System;
using System.IO;
class XPathQuery{
public static string PrintError(Exception e, string errStr){
if(e == null)
return errStr;
else
return PrintError(e.InnerException, errStr + e.Message );
}
public static void Main(string[] args){
if((args.Length == 0) || (args.Length % 2)!= 0){
Console.WriteLine("Usage: xpathquery source query <zero or more
prefix and namespace pairs>");
return;
}
try{
//Load the file.
XmlDocument doc = new XmlDocument();
doc.Load(args[0]); //create prefix<->namespace mappings (if any)
XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
for(int i=2; i < args.Length; i+= 2)
nsMgr.AddNamespace(args[i], args[i + 1]); //Query the document
XmlNodeList nodes = doc.SelectNodes(args[1], nsMgr); //print output
foreach(XmlNode node in nodes)
Console.WriteLine(node.OuterXml + "\n\n");
}catch(XmlException xmle){
Console.WriteLine("ERROR: XML Parse error occured because " +
PrintError(xmle, null));
}catch(FileNotFoundException fnfe){
Console.WriteLine("ERROR: " + PrintError(fnfe, null));
}catch(XPathException xpath){
Console.WriteLine("ERROR: The following error occured while querying
the document: "
&n bsp; + PrintError(xpath, null));
}catch(Exception e){
Console.WriteLine("UNEXPECTED ERROR" + PrintError(e, null));
}
}
}
Given the following XML document that does not declare any namespaces, queries are fairly straightforward as seen in the examples following the code.
<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre="novel">
<title>The Confidence Man</title>
<author>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>
Example 1xpathquery.exe bookstore.xml
/bookstore/book/titleSelects all the title elements that are children of the book element whose parent is the bookstore element, which returns:
<title>The Autobiography of Benjamin Franklin</title>
<title>The Confidence Man</title>xpathquery.exe bookstore.xml
//@genreSelect all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman" and returns:
<title>The Confidence Man</title>
However, once namespaces are added to the mix, things are no longer as simple. The file below is identical to the original file except for the addition of namespaces and one attribute to one of the book elements.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<bk:book genre="novel" bk:genre="fiction"
xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:title>The Confidence Man</bk:title>
<bk:author>
<bk:first-name>Herman</bk:first-name>
<bk:last-name>Melville</bk:last-name>
</bk:author>
<bk:price>11.99</bk:price>
</bk:book>
</bookstore>
Note that the default namespace is in scope for the whole XML document, while the namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore is in scope for the second book element only. Example 2
xpathquery.exe bookstore.xml
/bookstore/book/title
Selects all the title elements that are children of the book element whose parent is the bookstore element, which returns NO RESULTS.
xpathquery.exe bookstore.xml
//@genreSelects all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"
xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman," which returns NO RESULTS.
The first query returns no results because unprefixed names in an XPath query apply to elements or attributes with no namespace. There are no bookstore, book, or title elements in the target document that have no namespace. The second query returns all attribute nodes that have no namespace. Although namespace declarations are in scope for both attribute nodes returned by the query, they have no namespace because namespace declarations do not apply to attributes with unprefixed names. The third query returns no results for the same reasons the first query returns no results.
The way to perform namespace-aware XPath queries is to provide a prefix to namespace mapping to the XPath engine, then use those prefixes in the query. The prefixes provided do not need to be the same as the namespace to prefix mappings in the target document, and they must be non-empty prefixes. Example 3
xpathquery.exe bookstore.xml
/b:bookstore/b:book/b:title b urn:xmlns:25hoursaday-com:bookstoreSelect all the title elements that are children of the book element whose parent is the bookstore element and returns the following:
<title xmlns="urn:xmlns:25hoursaday-com:bookstore">The Autobiography of Benjamin Franklin</title>
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>xpathquery.exe bookstore.xml
//@b:genre b urn:xmlns:25hoursaday-com:bookstoreSelects all the genre attributes from the "urn:xmlns:25hoursaday-com:bookstore" namespace in the document that returns:
bk:genre="fiction"xpathquery.exe bookstore.xml
//bk:title[(../bk:author/bk:first-na me = 'Herman')] bk urn:xmlns:25hoursaday-com:bookstore
Selects all the titles where the author's first name is "Herman" and returns:
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>
Note This last example is the same as the previous examples but rewritten to be namespace aware.
For more information on using XPath, read Aaron Skonnard's article Addressing Infosets with XPath and view the examples at the ZVON.org XPath tutorial.
XSLT and NamespacesThe W3C XSL transformations (XSLT) recommendation describes an XML-based language for transforming XML documents into other XML documents. XSLT transformations, also known as XML style sheets, utilize patterns (XPath) to match aspects of the target document. Upon matching nodes in the target document, templates that specify the output of a successful match can be instantiated and used to transform the document.
Support for namespaces is tightly integrated into XSLT, especially since XPath is used for matching nodes in the source document. Using namespaces in your XPath expressions inside XSLT is much easier than using the DOM.
The example that follows contains:
A program for use in executing transforms from the command line.
An XSLT stylesheet that prints all the title elements from the urn:xmlns:25hoursaday-com:bookstore namespace in the source XML document when run against the bookstore document from the urn:xmlns:25hoursaday-com:bookstore namespace.
The resulting output. Program Imports System.Xml.Xsl
-
Wow! Slashdotted Already?
This article explores the ins and outs of XML namespaces and their ramifications on a number of XML technologies that support namespaces. What follows is a shortened version of my first Extreme XML column.
Overview of XML NamespacesAs XML usage on the Internet became more widespread, the benefits of being able to create markup vocabularies that could be combined and reused similarly to how software modules are combined and reused became increasingly important. If a well defined markup vocabulary for describing coin collections, program configuration files, or fast food restaurant menus already existed, then reusing it made more sense than designing one from scratch. Combining multiple existing vocabularies to create new vocabularies whose whole was greater than the sum of its parts also became a feature that users of XML began to require.
However, the likelihood of identical markup, specifically XML elements and attributes, from different vocabularies with different semantics ending up in the same document became a problem. The very extensibility of XML and the fact that its usage had already become widespread across the Internet precluded simply specifying reserved elements or attribute names as the solution to this problem.
The goal of the W3C XML namespaces recommendation was to create a mechanism in which elements and attributes within an XML document that were from different markup vocabularies could be unambiguously identified and combined without processing problems ensuing. The XML namespaces recommendation provided a method for partitioning various items within an XML document based on processing requirements without placing undue restrictions on how these items should be named. For instance, elements named <template>, <output>, and <stylesheet> can occur in an XSLT stylesheet without there being ambiguity as to whether they are transformation directives or potential output of the transformation.
An XML namespace is a collection of names, identified by a Uniform Resource Identifier (URI) reference, which are used in XML documents as element and attribute names.
Namespace DeclarationsA namespace declaration is typically used to map a namespace URI to a specific prefix. The scope of the prefix-namespace mapping is that of the element that the namespace declaration occurs on as well as all its children. An attribute declaration that begins with the prefix xmlns: is a namespace declaration. The value of such an attribute declaration should be a namespace URI which is the namespace name.
Here is an example of an XML document where the root element contains a namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore and its child element contains an inventory element that contains a namespace declaration that maps the prefix inv to the namespace name urn:xmlns:25hoursaday-com:inventory-tracking.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book>
<bk:title>Lord of the Rings</bk:title>
<bk:author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</bk:book>
</bk:bookstore>
In the above example, the scope of the namespace declaration for the urn:xmlns:25hoursaday-com:bookstore namespace name is the entire bk:bookstore element, while that of the urn:xmlns:25hoursaday-com:inventory-tracking is the inv:inventory element. Namespace aware processors can process items from both namespaces independently of each other, which leads to the ability to do multi-layered processing of XML documents. For instance, RDDL documents are valid XHTML documents that can be rendered by a Web browser but also contain information using elements from the http://www.rddl.org namespace that can be used to locate machine readable resources about the members of an XML namespace.
It should be noted that by definition the prefix xml is bound to the XML namespace name and this special namespace is automatically predeclared with document scope in every well-formed XML document.
Default NamespacesThe previous section on namespace declarations is not entirely complete because it leaves out default namespaces. A default namespace declaration is an attribute declaration that has the name xmlns and its value is the namespace URI that is the namespace name.
A default namespace declaration specifies that every unprefixed element name in its scope be from the declaring namespace. Below is the bookstore example utilizing a default namespace instead of a prefix-namespace mapping.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book>
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
All the elements in the above example except for the inv:inventory element belong to the urn:xmlns:25hoursaday-com:bookstore namespace. The primary purpose of default namespaces is to reduce the verbosity of XML documents that utilize namespaces. However, using default namespaces instead of utilizing explicitly mapped prefixes for element names can be confusing because it is not obvious that the elements in the document are namespace scoped.
Also, unlike regular namespace declarations, default namespace declarations can be undeclared by setting the value of the xmlns attribute to the empty string. Undeclaring default namespace declarations is a practice that should be avoided because it may lead to a document that has unprefixed names that belong to a namespace in one part of the document, but don't in another. For example, in the document below only the bookstore element is from the urn:xmlns:25hoursaday-com:bookstore while the other unprefixed elements have no namespace name.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book xmlns="">
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
This practice should be avoided because it leads to extremely confusing situations for readers of the XML document. For more information on undeclaring namespace declarations, see the section on Namespaces Future.
Qualified and Expanded NamesA qualified name, also known as a QName, is an XML name called the local name optionally preceded by another XML name called the prefix and a colon (':') character. The XML names used as the prefix and the local name must match the NCName production, which means that they must not contain a colon character. The prefix of a qualified name must have been mapped to a namespace URI through an in-scope namespace declaration mapping the prefix to the namespace URI. A qualified name can be used as either an attribute or element name.
Although QNames are important mnemonic guides to determining what namespace the elements and attributes within a document are derived from, they are rarely important to XML aware processors. For example, the following three XML documents would be treated identically by a range of XML technologies including, of course, XML schema validators.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType id="123" name="fooType"/>
</xs:schema>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:complexType id="123" name="fooType"/>
</xsd:schema>
<schema xmlns="http://www.w3.org/2001/XMLSchema">
<complexType id="123" name="fooType"/>
</schema>
The W3C XML Path Language recommendation describes an expanded name as a pair consisting of a namespace name and a local name. A universal name is an alternate term coined by James Clark to describe the same concept. A universal name consists of a namespace name in curly braces and a local name. Namespaces tend to make more sense to people when viewed through the lens of universal names. Here are the three XML documents from the previous example with the QNames replaced by universal names. Note that the syntax below is not valid XML syntax.
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
To many XML applications, the universal name of the elements and attributes in an XML document are what is important, and not the values of the prefixes used in specific QNames. The primary reason the Namespaces in XML recommendation does not take the expanded name approach to specifying namespaces is due to its verbosity. Instead, prefix mappings and default namespaces are provided to save us all from developing carpal tunnel syndrome from typing namespace URIs endlessly.
Namespaces and AttributesNamespace declarations do not apply to attributes unless the attribute's name is prefixed. In the XML document shown below the title attribute belongs to the bk:book element and has no namespace while the bk:title attribute has urn:xmlns:25hoursaday-com:bookstore as its namespace name. Note that even though both attributes have the same local name the document is well formed.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book title="Lord of the Rings, Book 3" bk:title="Return of the King"/>
</bk:bookstore>
In the following example, the title attribute still has no namespace and belongs the book element even though there is a default namespace specified. In other words, attributes cannot inherit the default namespace.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book title="Lord of the Rings, Book 3" />
</bookstore>
Namespace URIsA namespace name is a Uniform Resource Identifier (URI) as specified in RFC 2396. A URI is either a Uniform Resource Locators (URLs) or a Uniform Resource Names (URNs). URLs are used to specify the location of resources on the Internet, while URNs are supposed to be persistent, location-independent identifiers for information resources. Namespace names are considered to be identical only if they are the same character for character (case-sensitive). The primary justification for using URIs as namespace names is that they already provide a mechanism for specifying globally unique identities.
The XML namespaces recommendation states that namespace names are only to act as unique identifiers and do not have to actually identify network retrievable resources. This has led to much confusion amongst authors and users of XML documents, especially since the usage of HTTP based URLs as namespace names has grown in popularity. Because many applications convert such URIs to hyperlinks, it is irritating to many users that these "links" do not lead to Web pages or other network retrievable resource. I remember one user who likened it to being given a fake phone number in a social situation.
One solution to avoid confusing users is to use a namespace-naming schema that does not imply network retrievability of the resource. I personally use the urn:xmlns: scheme for this purpose and create namespace names similar to urn:xmlns:25hoursaday-com when authoring XML documents for personal use. The problem with homegrown namespace URIs is that they may run counter to the intent of the Names in XML recommendation by not being globally unique. I get around the globally unique requirement by using my personal domain name http://www.25hoursaday.com as part of the namespace URI.
Another solution is to leave a network retrievable resource at the URI that is the namespace name, such as is done with the XSLT and RDDL namespaces. Typically, such URIs are actually HTTP URLs. A good way to name such URLs is by using the format favored by the W3C, which is as follows:
http://my.domain.example.org/product/[year/month][ / rea]
See the section on Namespaces and Versioning for more information on using similarly structured namespace names as a versioning mechanism.
DOM, XPath, and the XML Information Set on NamespacesThe W3C has defined a number of technologies that provide a data model for XML documents. These data models are generally in agreement, but sometimes differ in how they treat various edge cases due to historic reasons. Treatment of XML namespaces and namespace declarations is an example of an edge case that is treated differently in the three primary data models that exist as W3C recommendations. The three data models are the XPath data model, the Document Object Model (DOM), and the XML information set.
The XML information set (XML infoset) is an abstract description of the data in an XML document and can be considered to be the primary data model for an XML document. The XPath data model is a tree-based model that is traversed when querying an XML document and is similar to the XML information set. The DOM precedes both data models but is also similar to both data models in a number of ways. Both the DOM and the XPath data model can be considered to be interpretations of the XML infoset.
Namespaces in the Document Object Model (DOM)The XML namespace section of the DOM Level 3 specification considers namespace declarations to be regular attribute nodes that have http://www.w3.org/2000/xmlns/ as their namespace name and xmlns as their prefix or qualified name.
Elements and attributes in the DOM have a namespace name that cannot be altered after they have been created regardless of whether their location within the document changes or not.
Namespaces in the XPath Data ModelThe W3C XPath recommendation does not consider namespace declarations to be attribute nodes and does not provide access to them in that capacity. Instead, in XPath every element in an XML document has a number of namespace nodes that can be retrieved using the XPath namespace navigation axis.
Each element in the document has a unique set of namespace nodes for each namespace declaration in scope for that particular element. Namespace nodes are unique to each element in that namespace. Thus namespace nodes for two different elements that represent the same namespace declaration are not identical.
Namespaces in the XML Information SetThe XML infoset recommendation considers namespace declarations to be attribute information items.
In addition, similar to the XPath data model, each element information item in an XML document's information set has a namespace information item for each namespace that is in scope for the element.
XPath, XSLT and NamespacesThe W3C XML Path Language also known as XPath is used to address parts of an XML document and is used in a number of W3C XML technologies including XSLT, XPointer, XML Schema, and DOM Level 3. XPath uses a hierarchical addressing mechanism similar to that used in file systems and URLs to retrieve pieces of an XML document. XPath supports rudimentary manipulation of strings, numbers, and Booleans.
XPath and NamespacesThe XPath data model treats an XML document as a tree of nodes, such as element, attribute, and text nodes, where the name of each node is a combination of its local name and its namespace name (that is, its universal or expanded name).
For element and attribute nodes without namespaces, performing XPath queries is fairly straightforward. The following program, which can be used to query XML documents using the command line, shall be used to demonstrate the impact of namespaces on XPath queries.
using System.Xml.XPath;
using System.Xml;
using System;
using System.IO;
class XPathQuery{
public static string PrintError(Exception e, string errStr){
if(e == null)
return errStr;
else
return PrintError(e.InnerException, errStr + e.Message );
}
public static void Main(string[] args){
if((args.Length == 0) || (args.Length % 2)!= 0){
Console.WriteLine("Usage: xpathquery source query <zero or more
prefix and namespace pairs>");
return;
}
try{
//Load the file.
XmlDocument doc = new XmlDocument();
doc.Load(args[0]); //create prefix<->namespace mappings (if any)
XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
for(int i=2; i < args.Length; i+= 2)
nsMgr.AddNamespace(args[i], args[i + 1]); //Query the document
XmlNodeList nodes = doc.SelectNodes(args[1], nsMgr); //print output
foreach(XmlNode node in nodes)
Console.WriteLine(node.OuterXml + "\n\n");
}catch(XmlException xmle){
Console.WriteLine("ERROR: XML Parse error occured because " +
PrintError(xmle, null));
}catch(FileNotFoundException fnfe){
Console.WriteLine("ERROR: " + PrintError(fnfe, null));
}catch(XPathException xpath){
Console.WriteLine("ERROR: The following error occured while querying
the document: "
&n bsp; + PrintError(xpath, null));
}catch(Exception e){
Console.WriteLine("UNEXPECTED ERROR" + PrintError(e, null));
}
}
}
Given the following XML document that does not declare any namespaces, queries are fairly straightforward as seen in the examples following the code.
<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre="novel">
<title>The Confidence Man</title>
<author>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>
Example 1xpathquery.exe bookstore.xml
/bookstore/book/titleSelects all the title elements that are children of the book element whose parent is the bookstore element, which returns:
<title>The Autobiography of Benjamin Franklin</title>
<title>The Confidence Man</title>xpathquery.exe bookstore.xml
//@genreSelect all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman" and returns:
<title>The Confidence Man</title>
However, once namespaces are added to the mix, things are no longer as simple. The file below is identical to the original file except for the addition of namespaces and one attribute to one of the book elements.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<bk:book genre="novel" bk:genre="fiction"
xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:title>The Confidence Man</bk:title>
<bk:author>
<bk:first-name>Herman</bk:first-name>
<bk:last-name>Melville</bk:last-name>
</bk:author>
<bk:price>11.99</bk:price>
</bk:book>
</bookstore>
Note that the default namespace is in scope for the whole XML document, while the namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore is in scope for the second book element only. Example 2
xpathquery.exe bookstore.xml
/bookstore/book/title
Selects all the title elements that are children of the book element whose parent is the bookstore element, which returns NO RESULTS.
xpathquery.exe bookstore.xml
//@genreSelects all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"
xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman," which returns NO RESULTS.
The first query returns no results because unprefixed names in an XPath query apply to elements or attributes with no namespace. There are no bookstore, book, or title elements in the target document that have no namespace. The second query returns all attribute nodes that have no namespace. Although namespace declarations are in scope for both attribute nodes returned by the query, they have no namespace because namespace declarations do not apply to attributes with unprefixed names. The third query returns no results for the same reasons the first query returns no results.
The way to perform namespace-aware XPath queries is to provide a prefix to namespace mapping to the XPath engine, then use those prefixes in the query. The prefixes provided do not need to be the same as the namespace to prefix mappings in the target document, and they must be non-empty prefixes. Example 3
xpathquery.exe bookstore.xml
/b:bookstore/b:book/b:title b urn:xmlns:25hoursaday-com:bookstoreSelect all the title elements that are children of the book element whose parent is the bookstore element and returns the following:
<title xmlns="urn:xmlns:25hoursaday-com:bookstore">The Autobiography of Benjamin Franklin</title>
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>xpathquery.exe bookstore.xml
//@b:genre b urn:xmlns:25hoursaday-com:bookstoreSelects all the genre attributes from the "urn:xmlns:25hoursaday-com:bookstore" namespace in the document that returns:
bk:genre="fiction"xpathquery.exe bookstore.xml
//bk:title[(../bk:author/bk:first-na me = 'Herman')] bk urn:xmlns:25hoursaday-com:bookstore
Selects all the titles where the author's first name is "Herman" and returns:
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>
Note This last example is the same as the previous examples but rewritten to be namespace aware.
For more information on using XPath, read Aaron Skonnard's article Addressing Infosets with XPath and view the examples at the ZVON.org XPath tutorial.
XSLT and NamespacesThe W3C XSL transformations (XSLT) recommendation describes an XML-based language for transforming XML documents into other XML documents. XSLT transformations, also known as XML style sheets, utilize patterns (XPath) to match aspects of the target document. Upon matching nodes in the target document, templates that specify the output of a successful match can be instantiated and used to transform the document.
Support for namespaces is tightly integrated into XSLT, especially since XPath is used for matching nodes in the source document. Using namespaces in your XPath expressions inside XSLT is much easier than using the DOM.
The example that follows contains:
A program for use in executing transforms from the command line.
An XSLT stylesheet that prints all the title elements from the urn:xmlns:25hoursaday-com:bookstore namespace in the source XML document when run against the bookstore document from the urn:xmlns:25hoursaday-com:bookstore namespace.
The resulting output. Program Imports System.Xml.Xsl
-
Wow! Slashdotted Already?
This article explores the ins and outs of XML namespaces and their ramifications on a number of XML technologies that support namespaces. What follows is a shortened version of my first Extreme XML column.
Overview of XML NamespacesAs XML usage on the Internet became more widespread, the benefits of being able to create markup vocabularies that could be combined and reused similarly to how software modules are combined and reused became increasingly important. If a well defined markup vocabulary for describing coin collections, program configuration files, or fast food restaurant menus already existed, then reusing it made more sense than designing one from scratch. Combining multiple existing vocabularies to create new vocabularies whose whole was greater than the sum of its parts also became a feature that users of XML began to require.
However, the likelihood of identical markup, specifically XML elements and attributes, from different vocabularies with different semantics ending up in the same document became a problem. The very extensibility of XML and the fact that its usage had already become widespread across the Internet precluded simply specifying reserved elements or attribute names as the solution to this problem.
The goal of the W3C XML namespaces recommendation was to create a mechanism in which elements and attributes within an XML document that were from different markup vocabularies could be unambiguously identified and combined without processing problems ensuing. The XML namespaces recommendation provided a method for partitioning various items within an XML document based on processing requirements without placing undue restrictions on how these items should be named. For instance, elements named <template>, <output>, and <stylesheet> can occur in an XSLT stylesheet without there being ambiguity as to whether they are transformation directives or potential output of the transformation.
An XML namespace is a collection of names, identified by a Uniform Resource Identifier (URI) reference, which are used in XML documents as element and attribute names.
Namespace DeclarationsA namespace declaration is typically used to map a namespace URI to a specific prefix. The scope of the prefix-namespace mapping is that of the element that the namespace declaration occurs on as well as all its children. An attribute declaration that begins with the prefix xmlns: is a namespace declaration. The value of such an attribute declaration should be a namespace URI which is the namespace name.
Here is an example of an XML document where the root element contains a namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore and its child element contains an inventory element that contains a namespace declaration that maps the prefix inv to the namespace name urn:xmlns:25hoursaday-com:inventory-tracking.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book>
<bk:title>Lord of the Rings</bk:title>
<bk:author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</bk:book>
</bk:bookstore>
In the above example, the scope of the namespace declaration for the urn:xmlns:25hoursaday-com:bookstore namespace name is the entire bk:bookstore element, while that of the urn:xmlns:25hoursaday-com:inventory-tracking is the inv:inventory element. Namespace aware processors can process items from both namespaces independently of each other, which leads to the ability to do multi-layered processing of XML documents. For instance, RDDL documents are valid XHTML documents that can be rendered by a Web browser but also contain information using elements from the http://www.rddl.org namespace that can be used to locate machine readable resources about the members of an XML namespace.
It should be noted that by definition the prefix xml is bound to the XML namespace name and this special namespace is automatically predeclared with document scope in every well-formed XML document.
Default NamespacesThe previous section on namespace declarations is not entirely complete because it leaves out default namespaces. A default namespace declaration is an attribute declaration that has the name xmlns and its value is the namespace URI that is the namespace name.
A default namespace declaration specifies that every unprefixed element name in its scope be from the declaring namespace. Below is the bookstore example utilizing a default namespace instead of a prefix-namespace mapping.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book>
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
All the elements in the above example except for the inv:inventory element belong to the urn:xmlns:25hoursaday-com:bookstore namespace. The primary purpose of default namespaces is to reduce the verbosity of XML documents that utilize namespaces. However, using default namespaces instead of utilizing explicitly mapped prefixes for element names can be confusing because it is not obvious that the elements in the document are namespace scoped.
Also, unlike regular namespace declarations, default namespace declarations can be undeclared by setting the value of the xmlns attribute to the empty string. Undeclaring default namespace declarations is a practice that should be avoided because it may lead to a document that has unprefixed names that belong to a namespace in one part of the document, but don't in another. For example, in the document below only the bookstore element is from the urn:xmlns:25hoursaday-com:bookstore while the other unprefixed elements have no namespace name.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book xmlns="">
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
This practice should be avoided because it leads to extremely confusing situations for readers of the XML document. For more information on undeclaring namespace declarations, see the section on Namespaces Future.
Qualified and Expanded NamesA qualified name, also known as a QName, is an XML name called the local name optionally preceded by another XML name called the prefix and a colon (':') character. The XML names used as the prefix and the local name must match the NCName production, which means that they must not contain a colon character. The prefix of a qualified name must have been mapped to a namespace URI through an in-scope namespace declaration mapping the prefix to the namespace URI. A qualified name can be used as either an attribute or element name.
Although QNames are important mnemonic guides to determining what namespace the elements and attributes within a document are derived from, they are rarely important to XML aware processors. For example, the following three XML documents would be treated identically by a range of XML technologies including, of course, XML schema validators.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType id="123" name="fooType"/>
</xs:schema>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:complexType id="123" name="fooType"/>
</xsd:schema>
<schema xmlns="http://www.w3.org/2001/XMLSchema">
<complexType id="123" name="fooType"/>
</schema>
The W3C XML Path Language recommendation describes an expanded name as a pair consisting of a namespace name and a local name. A universal name is an alternate term coined by James Clark to describe the same concept. A universal name consists of a namespace name in curly braces and a local name. Namespaces tend to make more sense to people when viewed through the lens of universal names. Here are the three XML documents from the previous example with the QNames replaced by universal names. Note that the syntax below is not valid XML syntax.
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
To many XML applications, the universal name of the elements and attributes in an XML document are what is important, and not the values of the prefixes used in specific QNames. The primary reason the Namespaces in XML recommendation does not take the expanded name approach to specifying namespaces is due to its verbosity. Instead, prefix mappings and default namespaces are provided to save us all from developing carpal tunnel syndrome from typing namespace URIs endlessly.
Namespaces and AttributesNamespace declarations do not apply to attributes unless the attribute's name is prefixed. In the XML document shown below the title attribute belongs to the bk:book element and has no namespace while the bk:title attribute has urn:xmlns:25hoursaday-com:bookstore as its namespace name. Note that even though both attributes have the same local name the document is well formed.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book title="Lord of the Rings, Book 3" bk:title="Return of the King"/>
</bk:bookstore>
In the following example, the title attribute still has no namespace and belongs the book element even though there is a default namespace specified. In other words, attributes cannot inherit the default namespace.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book title="Lord of the Rings, Book 3" />
</bookstore>
Namespace URIsA namespace name is a Uniform Resource Identifier (URI) as specified in RFC 2396. A URI is either a Uniform Resource Locators (URLs) or a Uniform Resource Names (URNs). URLs are used to specify the location of resources on the Internet, while URNs are supposed to be persistent, location-independent identifiers for information resources. Namespace names are considered to be identical only if they are the same character for character (case-sensitive). The primary justification for using URIs as namespace names is that they already provide a mechanism for specifying globally unique identities.
The XML namespaces recommendation states that namespace names are only to act as unique identifiers and do not have to actually identify network retrievable resources. This has led to much confusion amongst authors and users of XML documents, especially since the usage of HTTP based URLs as namespace names has grown in popularity. Because many applications convert such URIs to hyperlinks, it is irritating to many users that these "links" do not lead to Web pages or other network retrievable resource. I remember one user who likened it to being given a fake phone number in a social situation.
One solution to avoid confusing users is to use a namespace-naming schema that does not imply network retrievability of the resource. I personally use the urn:xmlns: scheme for this purpose and create namespace names similar to urn:xmlns:25hoursaday-com when authoring XML documents for personal use. The problem with homegrown namespace URIs is that they may run counter to the intent of the Names in XML recommendation by not being globally unique. I get around the globally unique requirement by using my personal domain name http://www.25hoursaday.com as part of the namespace URI.
Another solution is to leave a network retrievable resource at the URI that is the namespace name, such as is done with the XSLT and RDDL namespaces. Typically, such URIs are actually HTTP URLs. A good way to name such URLs is by using the format favored by the W3C, which is as follows:
http://my.domain.example.org/product/[year/month][ / rea]
See the section on Namespaces and Versioning for more information on using similarly structured namespace names as a versioning mechanism.
DOM, XPath, and the XML Information Set on NamespacesThe W3C has defined a number of technologies that provide a data model for XML documents. These data models are generally in agreement, but sometimes differ in how they treat various edge cases due to historic reasons. Treatment of XML namespaces and namespace declarations is an example of an edge case that is treated differently in the three primary data models that exist as W3C recommendations. The three data models are the XPath data model, the Document Object Model (DOM), and the XML information set.
The XML information set (XML infoset) is an abstract description of the data in an XML document and can be considered to be the primary data model for an XML document. The XPath data model is a tree-based model that is traversed when querying an XML document and is similar to the XML information set. The DOM precedes both data models but is also similar to both data models in a number of ways. Both the DOM and the XPath data model can be considered to be interpretations of the XML infoset.
Namespaces in the Document Object Model (DOM)The XML namespace section of the DOM Level 3 specification considers namespace declarations to be regular attribute nodes that have http://www.w3.org/2000/xmlns/ as their namespace name and xmlns as their prefix or qualified name.
Elements and attributes in the DOM have a namespace name that cannot be altered after they have been created regardless of whether their location within the document changes or not.
Namespaces in the XPath Data ModelThe W3C XPath recommendation does not consider namespace declarations to be attribute nodes and does not provide access to them in that capacity. Instead, in XPath every element in an XML document has a number of namespace nodes that can be retrieved using the XPath namespace navigation axis.
Each element in the document has a unique set of namespace nodes for each namespace declaration in scope for that particular element. Namespace nodes are unique to each element in that namespace. Thus namespace nodes for two different elements that represent the same namespace declaration are not identical.
Namespaces in the XML Information SetThe XML infoset recommendation considers namespace declarations to be attribute information items.
In addition, similar to the XPath data model, each element information item in an XML document's information set has a namespace information item for each namespace that is in scope for the element.
XPath, XSLT and NamespacesThe W3C XML Path Language also known as XPath is used to address parts of an XML document and is used in a number of W3C XML technologies including XSLT, XPointer, XML Schema, and DOM Level 3. XPath uses a hierarchical addressing mechanism similar to that used in file systems and URLs to retrieve pieces of an XML document. XPath supports rudimentary manipulation of strings, numbers, and Booleans.
XPath and NamespacesThe XPath data model treats an XML document as a tree of nodes, such as element, attribute, and text nodes, where the name of each node is a combination of its local name and its namespace name (that is, its universal or expanded name).
For element and attribute nodes without namespaces, performing XPath queries is fairly straightforward. The following program, which can be used to query XML documents using the command line, shall be used to demonstrate the impact of namespaces on XPath queries.
using System.Xml.XPath;
using System.Xml;
using System;
using System.IO;
class XPathQuery{
public static string PrintError(Exception e, string errStr){
if(e == null)
return errStr;
else
return PrintError(e.InnerException, errStr + e.Message );
}
public static void Main(string[] args){
if((args.Length == 0) || (args.Length % 2)!= 0){
Console.WriteLine("Usage: xpathquery source query <zero or more
prefix and namespace pairs>");
return;
}
try{
//Load the file.
XmlDocument doc = new XmlDocument();
doc.Load(args[0]); //create prefix<->namespace mappings (if any)
XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
for(int i=2; i < args.Length; i+= 2)
nsMgr.AddNamespace(args[i], args[i + 1]); //Query the document
XmlNodeList nodes = doc.SelectNodes(args[1], nsMgr); //print output
foreach(XmlNode node in nodes)
Console.WriteLine(node.OuterXml + "\n\n");
}catch(XmlException xmle){
Console.WriteLine("ERROR: XML Parse error occured because " +
PrintError(xmle, null));
}catch(FileNotFoundException fnfe){
Console.WriteLine("ERROR: " + PrintError(fnfe, null));
}catch(XPathException xpath){
Console.WriteLine("ERROR: The following error occured while querying
the document: "
&n bsp; + PrintError(xpath, null));
}catch(Exception e){
Console.WriteLine("UNEXPECTED ERROR" + PrintError(e, null));
}
}
}
Given the following XML document that does not declare any namespaces, queries are fairly straightforward as seen in the examples following the code.
<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre="novel">
<title>The Confidence Man</title>
<author>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>
Example 1xpathquery.exe bookstore.xml
/bookstore/book/titleSelects all the title elements that are children of the book element whose parent is the bookstore element, which returns:
<title>The Autobiography of Benjamin Franklin</title>
<title>The Confidence Man</title>xpathquery.exe bookstore.xml
//@genreSelect all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman" and returns:
<title>The Confidence Man</title>
However, once namespaces are added to the mix, things are no longer as simple. The file below is identical to the original file except for the addition of namespaces and one attribute to one of the book elements.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<bk:book genre="novel" bk:genre="fiction"
xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:title>The Confidence Man</bk:title>
<bk:author>
<bk:first-name>Herman</bk:first-name>
<bk:last-name>Melville</bk:last-name>
</bk:author>
<bk:price>11.99</bk:price>
</bk:book>
</bookstore>
Note that the default namespace is in scope for the whole XML document, while the namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore is in scope for the second book element only. Example 2
xpathquery.exe bookstore.xml
/bookstore/book/title
Selects all the title elements that are children of the book element whose parent is the bookstore element, which returns NO RESULTS.
xpathquery.exe bookstore.xml
//@genreSelects all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"
xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman," which returns NO RESULTS.
The first query returns no results because unprefixed names in an XPath query apply to elements or attributes with no namespace. There are no bookstore, book, or title elements in the target document that have no namespace. The second query returns all attribute nodes that have no namespace. Although namespace declarations are in scope for both attribute nodes returned by the query, they have no namespace because namespace declarations do not apply to attributes with unprefixed names. The third query returns no results for the same reasons the first query returns no results.
The way to perform namespace-aware XPath queries is to provide a prefix to namespace mapping to the XPath engine, then use those prefixes in the query. The prefixes provided do not need to be the same as the namespace to prefix mappings in the target document, and they must be non-empty prefixes. Example 3
xpathquery.exe bookstore.xml
/b:bookstore/b:book/b:title b urn:xmlns:25hoursaday-com:bookstoreSelect all the title elements that are children of the book element whose parent is the bookstore element and returns the following:
<title xmlns="urn:xmlns:25hoursaday-com:bookstore">The Autobiography of Benjamin Franklin</title>
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>xpathquery.exe bookstore.xml
//@b:genre b urn:xmlns:25hoursaday-com:bookstoreSelects all the genre attributes from the "urn:xmlns:25hoursaday-com:bookstore" namespace in the document that returns:
bk:genre="fiction"xpathquery.exe bookstore.xml
//bk:title[(../bk:author/bk:first-na me = 'Herman')] bk urn:xmlns:25hoursaday-com:bookstore
Selects all the titles where the author's first name is "Herman" and returns:
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>
Note This last example is the same as the previous examples but rewritten to be namespace aware.
For more information on using XPath, read Aaron Skonnard's article Addressing Infosets with XPath and view the examples at the ZVON.org XPath tutorial.
XSLT and NamespacesThe W3C XSL transformations (XSLT) recommendation describes an XML-based language for transforming XML documents into other XML documents. XSLT transformations, also known as XML style sheets, utilize patterns (XPath) to match aspects of the target document. Upon matching nodes in the target document, templates that specify the output of a successful match can be instantiated and used to transform the document.
Support for namespaces is tightly integrated into XSLT, especially since XPath is used for matching nodes in the source document. Using namespaces in your XPath expressions inside XSLT is much easier than using the DOM.
The example that follows contains:
A program for use in executing transforms from the command line.
An XSLT stylesheet that prints all the title elements from the urn:xmlns:25hoursaday-com:bookstore namespace in the source XML document when run against the bookstore document from the urn:xmlns:25hoursaday-com:bookstore namespace.
The resulting output. Program Imports System.Xml.Xsl
-
Wow! Slashdotted Already?
This article explores the ins and outs of XML namespaces and their ramifications on a number of XML technologies that support namespaces. What follows is a shortened version of my first Extreme XML column.
Overview of XML NamespacesAs XML usage on the Internet became more widespread, the benefits of being able to create markup vocabularies that could be combined and reused similarly to how software modules are combined and reused became increasingly important. If a well defined markup vocabulary for describing coin collections, program configuration files, or fast food restaurant menus already existed, then reusing it made more sense than designing one from scratch. Combining multiple existing vocabularies to create new vocabularies whose whole was greater than the sum of its parts also became a feature that users of XML began to require.
However, the likelihood of identical markup, specifically XML elements and attributes, from different vocabularies with different semantics ending up in the same document became a problem. The very extensibility of XML and the fact that its usage had already become widespread across the Internet precluded simply specifying reserved elements or attribute names as the solution to this problem.
The goal of the W3C XML namespaces recommendation was to create a mechanism in which elements and attributes within an XML document that were from different markup vocabularies could be unambiguously identified and combined without processing problems ensuing. The XML namespaces recommendation provided a method for partitioning various items within an XML document based on processing requirements without placing undue restrictions on how these items should be named. For instance, elements named <template>, <output>, and <stylesheet> can occur in an XSLT stylesheet without there being ambiguity as to whether they are transformation directives or potential output of the transformation.
An XML namespace is a collection of names, identified by a Uniform Resource Identifier (URI) reference, which are used in XML documents as element and attribute names.
Namespace DeclarationsA namespace declaration is typically used to map a namespace URI to a specific prefix. The scope of the prefix-namespace mapping is that of the element that the namespace declaration occurs on as well as all its children. An attribute declaration that begins with the prefix xmlns: is a namespace declaration. The value of such an attribute declaration should be a namespace URI which is the namespace name.
Here is an example of an XML document where the root element contains a namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore and its child element contains an inventory element that contains a namespace declaration that maps the prefix inv to the namespace name urn:xmlns:25hoursaday-com:inventory-tracking.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book>
<bk:title>Lord of the Rings</bk:title>
<bk:author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</bk:book>
</bk:bookstore>
In the above example, the scope of the namespace declaration for the urn:xmlns:25hoursaday-com:bookstore namespace name is the entire bk:bookstore element, while that of the urn:xmlns:25hoursaday-com:inventory-tracking is the inv:inventory element. Namespace aware processors can process items from both namespaces independently of each other, which leads to the ability to do multi-layered processing of XML documents. For instance, RDDL documents are valid XHTML documents that can be rendered by a Web browser but also contain information using elements from the http://www.rddl.org namespace that can be used to locate machine readable resources about the members of an XML namespace.
It should be noted that by definition the prefix xml is bound to the XML namespace name and this special namespace is automatically predeclared with document scope in every well-formed XML document.
Default NamespacesThe previous section on namespace declarations is not entirely complete because it leaves out default namespaces. A default namespace declaration is an attribute declaration that has the name xmlns and its value is the namespace URI that is the namespace name.
A default namespace declaration specifies that every unprefixed element name in its scope be from the declaring namespace. Below is the bookstore example utilizing a default namespace instead of a prefix-namespace mapping.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book>
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
All the elements in the above example except for the inv:inventory element belong to the urn:xmlns:25hoursaday-com:bookstore namespace. The primary purpose of default namespaces is to reduce the verbosity of XML documents that utilize namespaces. However, using default namespaces instead of utilizing explicitly mapped prefixes for element names can be confusing because it is not obvious that the elements in the document are namespace scoped.
Also, unlike regular namespace declarations, default namespace declarations can be undeclared by setting the value of the xmlns attribute to the empty string. Undeclaring default namespace declarations is a practice that should be avoided because it may lead to a document that has unprefixed names that belong to a namespace in one part of the document, but don't in another. For example, in the document below only the bookstore element is from the urn:xmlns:25hoursaday-com:bookstore while the other unprefixed elements have no namespace name.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book xmlns="">
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
This practice should be avoided because it leads to extremely confusing situations for readers of the XML document. For more information on undeclaring namespace declarations, see the section on Namespaces Future.
Qualified and Expanded NamesA qualified name, also known as a QName, is an XML name called the local name optionally preceded by another XML name called the prefix and a colon (':') character. The XML names used as the prefix and the local name must match the NCName production, which means that they must not contain a colon character. The prefix of a qualified name must have been mapped to a namespace URI through an in-scope namespace declaration mapping the prefix to the namespace URI. A qualified name can be used as either an attribute or element name.
Although QNames are important mnemonic guides to determining what namespace the elements and attributes within a document are derived from, they are rarely important to XML aware processors. For example, the following three XML documents would be treated identically by a range of XML technologies including, of course, XML schema validators.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType id="123" name="fooType"/>
</xs:schema>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:complexType id="123" name="fooType"/>
</xsd:schema>
<schema xmlns="http://www.w3.org/2001/XMLSchema">
<complexType id="123" name="fooType"/>
</schema>
The W3C XML Path Language recommendation describes an expanded name as a pair consisting of a namespace name and a local name. A universal name is an alternate term coined by James Clark to describe the same concept. A universal name consists of a namespace name in curly braces and a local name. Namespaces tend to make more sense to people when viewed through the lens of universal names. Here are the three XML documents from the previous example with the QNames replaced by universal names. Note that the syntax below is not valid XML syntax.
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
To many XML applications, the universal name of the elements and attributes in an XML document are what is important, and not the values of the prefixes used in specific QNames. The primary reason the Namespaces in XML recommendation does not take the expanded name approach to specifying namespaces is due to its verbosity. Instead, prefix mappings and default namespaces are provided to save us all from developing carpal tunnel syndrome from typing namespace URIs endlessly.
Namespaces and AttributesNamespace declarations do not apply to attributes unless the attribute's name is prefixed. In the XML document shown below the title attribute belongs to the bk:book element and has no namespace while the bk:title attribute has urn:xmlns:25hoursaday-com:bookstore as its namespace name. Note that even though both attributes have the same local name the document is well formed.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book title="Lord of the Rings, Book 3" bk:title="Return of the King"/>
</bk:bookstore>
In the following example, the title attribute still has no namespace and belongs the book element even though there is a default namespace specified. In other words, attributes cannot inherit the default namespace.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book title="Lord of the Rings, Book 3" />
</bookstore>
Namespace URIsA namespace name is a Uniform Resource Identifier (URI) as specified in RFC 2396. A URI is either a Uniform Resource Locators (URLs) or a Uniform Resource Names (URNs). URLs are used to specify the location of resources on the Internet, while URNs are supposed to be persistent, location-independent identifiers for information resources. Namespace names are considered to be identical only if they are the same character for character (case-sensitive). The primary justification for using URIs as namespace names is that they already provide a mechanism for specifying globally unique identities.
The XML namespaces recommendation states that namespace names are only to act as unique identifiers and do not have to actually identify network retrievable resources. This has led to much confusion amongst authors and users of XML documents, especially since the usage of HTTP based URLs as namespace names has grown in popularity. Because many applications convert such URIs to hyperlinks, it is irritating to many users that these "links" do not lead to Web pages or other network retrievable resource. I remember one user who likened it to being given a fake phone number in a social situation.
One solution to avoid confusing users is to use a namespace-naming schema that does not imply network retrievability of the resource. I personally use the urn:xmlns: scheme for this purpose and create namespace names similar to urn:xmlns:25hoursaday-com when authoring XML documents for personal use. The problem with homegrown namespace URIs is that they may run counter to the intent of the Names in XML recommendation by not being globally unique. I get around the globally unique requirement by using my personal domain name http://www.25hoursaday.com as part of the namespace URI.
Another solution is to leave a network retrievable resource at the URI that is the namespace name, such as is done with the XSLT and RDDL namespaces. Typically, such URIs are actually HTTP URLs. A good way to name such URLs is by using the format favored by the W3C, which is as follows:
http://my.domain.example.org/product/[year/month][ / rea]
See the section on Namespaces and Versioning for more information on using similarly structured namespace names as a versioning mechanism.
DOM, XPath, and the XML Information Set on NamespacesThe W3C has defined a number of technologies that provide a data model for XML documents. These data models are generally in agreement, but sometimes differ in how they treat various edge cases due to historic reasons. Treatment of XML namespaces and namespace declarations is an example of an edge case that is treated differently in the three primary data models that exist as W3C recommendations. The three data models are the XPath data model, the Document Object Model (DOM), and the XML information set.
The XML information set (XML infoset) is an abstract description of the data in an XML document and can be considered to be the primary data model for an XML document. The XPath data model is a tree-based model that is traversed when querying an XML document and is similar to the XML information set. The DOM precedes both data models but is also similar to both data models in a number of ways. Both the DOM and the XPath data model can be considered to be interpretations of the XML infoset.
Namespaces in the Document Object Model (DOM)The XML namespace section of the DOM Level 3 specification considers namespace declarations to be regular attribute nodes that have http://www.w3.org/2000/xmlns/ as their namespace name and xmlns as their prefix or qualified name.
Elements and attributes in the DOM have a namespace name that cannot be altered after they have been created regardless of whether their location within the document changes or not.
Namespaces in the XPath Data ModelThe W3C XPath recommendation does not consider namespace declarations to be attribute nodes and does not provide access to them in that capacity. Instead, in XPath every element in an XML document has a number of namespace nodes that can be retrieved using the XPath namespace navigation axis.
Each element in the document has a unique set of namespace nodes for each namespace declaration in scope for that particular element. Namespace nodes are unique to each element in that namespace. Thus namespace nodes for two different elements that represent the same namespace declaration are not identical.
Namespaces in the XML Information SetThe XML infoset recommendation considers namespace declarations to be attribute information items.
In addition, similar to the XPath data model, each element information item in an XML document's information set has a namespace information item for each namespace that is in scope for the element.
XPath, XSLT and NamespacesThe W3C XML Path Language also known as XPath is used to address parts of an XML document and is used in a number of W3C XML technologies including XSLT, XPointer, XML Schema, and DOM Level 3. XPath uses a hierarchical addressing mechanism similar to that used in file systems and URLs to retrieve pieces of an XML document. XPath supports rudimentary manipulation of strings, numbers, and Booleans.
XPath and NamespacesThe XPath data model treats an XML document as a tree of nodes, such as element, attribute, and text nodes, where the name of each node is a combination of its local name and its namespace name (that is, its universal or expanded name).
For element and attribute nodes without namespaces, performing XPath queries is fairly straightforward. The following program, which can be used to query XML documents using the command line, shall be used to demonstrate the impact of namespaces on XPath queries.
using System.Xml.XPath;
using System.Xml;
using System;
using System.IO;
class XPathQuery{
public static string PrintError(Exception e, string errStr){
if(e == null)
return errStr;
else
return PrintError(e.InnerException, errStr + e.Message );
}
public static void Main(string[] args){
if((args.Length == 0) || (args.Length % 2)!= 0){
Console.WriteLine("Usage: xpathquery source query <zero or more
prefix and namespace pairs>");
return;
}
try{
//Load the file.
XmlDocument doc = new XmlDocument();
doc.Load(args[0]); //create prefix<->namespace mappings (if any)
XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
for(int i=2; i < args.Length; i+= 2)
nsMgr.AddNamespace(args[i], args[i + 1]); //Query the document
XmlNodeList nodes = doc.SelectNodes(args[1], nsMgr); //print output
foreach(XmlNode node in nodes)
Console.WriteLine(node.OuterXml + "\n\n");
}catch(XmlException xmle){
Console.WriteLine("ERROR: XML Parse error occured because " +
PrintError(xmle, null));
}catch(FileNotFoundException fnfe){
Console.WriteLine("ERROR: " + PrintError(fnfe, null));
}catch(XPathException xpath){
Console.WriteLine("ERROR: The following error occured while querying
the document: "
&n bsp; + PrintError(xpath, null));
}catch(Exception e){
Console.WriteLine("UNEXPECTED ERROR" + PrintError(e, null));
}
}
}
Given the following XML document that does not declare any namespaces, queries are fairly straightforward as seen in the examples following the code.
<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre="novel">
<title>The Confidence Man</title>
<author>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>
Example 1xpathquery.exe bookstore.xml
/bookstore/book/titleSelects all the title elements that are children of the book element whose parent is the bookstore element, which returns:
<title>The Autobiography of Benjamin Franklin</title>
<title>The Confidence Man</title>xpathquery.exe bookstore.xml
//@genreSelect all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman" and returns:
<title>The Confidence Man</title>
However, once namespaces are added to the mix, things are no longer as simple. The file below is identical to the original file except for the addition of namespaces and one attribute to one of the book elements.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<bk:book genre="novel" bk:genre="fiction"
xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:title>The Confidence Man</bk:title>
<bk:author>
<bk:first-name>Herman</bk:first-name>
<bk:last-name>Melville</bk:last-name>
</bk:author>
<bk:price>11.99</bk:price>
</bk:book>
</bookstore>
Note that the default namespace is in scope for the whole XML document, while the namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore is in scope for the second book element only. Example 2
xpathquery.exe bookstore.xml
/bookstore/book/title
Selects all the title elements that are children of the book element whose parent is the bookstore element, which returns NO RESULTS.
xpathquery.exe bookstore.xml
//@genreSelects all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"
xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman," which returns NO RESULTS.
The first query returns no results because unprefixed names in an XPath query apply to elements or attributes with no namespace. There are no bookstore, book, or title elements in the target document that have no namespace. The second query returns all attribute nodes that have no namespace. Although namespace declarations are in scope for both attribute nodes returned by the query, they have no namespace because namespace declarations do not apply to attributes with unprefixed names. The third query returns no results for the same reasons the first query returns no results.
The way to perform namespace-aware XPath queries is to provide a prefix to namespace mapping to the XPath engine, then use those prefixes in the query. The prefixes provided do not need to be the same as the namespace to prefix mappings in the target document, and they must be non-empty prefixes. Example 3
xpathquery.exe bookstore.xml
/b:bookstore/b:book/b:title b urn:xmlns:25hoursaday-com:bookstoreSelect all the title elements that are children of the book element whose parent is the bookstore element and returns the following:
<title xmlns="urn:xmlns:25hoursaday-com:bookstore">The Autobiography of Benjamin Franklin</title>
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>xpathquery.exe bookstore.xml
//@b:genre b urn:xmlns:25hoursaday-com:bookstoreSelects all the genre attributes from the "urn:xmlns:25hoursaday-com:bookstore" namespace in the document that returns:
bk:genre="fiction"xpathquery.exe bookstore.xml
//bk:title[(../bk:author/bk:first-na me = 'Herman')] bk urn:xmlns:25hoursaday-com:bookstore
Selects all the titles where the author's first name is "Herman" and returns:
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>
Note This last example is the same as the previous examples but rewritten to be namespace aware.
For more information on using XPath, read Aaron Skonnard's article Addressing Infosets with XPath and view the examples at the ZVON.org XPath tutorial.
XSLT and NamespacesThe W3C XSL transformations (XSLT) recommendation describes an XML-based language for transforming XML documents into other XML documents. XSLT transformations, also known as XML style sheets, utilize patterns (XPath) to match aspects of the target document. Upon matching nodes in the target document, templates that specify the output of a successful match can be instantiated and used to transform the document.
Support for namespaces is tightly integrated into XSLT, especially since XPath is used for matching nodes in the source document. Using namespaces in your XPath expressions inside XSLT is much easier than using the DOM.
The example that follows contains:
A program for use in executing transforms from the command line.
An XSLT stylesheet that prints all the title elements from the urn:xmlns:25hoursaday-com:bookstore namespace in the source XML document when run against the bookstore document from the urn:xmlns:25hoursaday-com:bookstore namespace.
The resulting output. Program Imports System.Xml.Xsl
-
Wow! Slashdotted Already?
This article explores the ins and outs of XML namespaces and their ramifications on a number of XML technologies that support namespaces. What follows is a shortened version of my first Extreme XML column.
Overview of XML NamespacesAs XML usage on the Internet became more widespread, the benefits of being able to create markup vocabularies that could be combined and reused similarly to how software modules are combined and reused became increasingly important. If a well defined markup vocabulary for describing coin collections, program configuration files, or fast food restaurant menus already existed, then reusing it made more sense than designing one from scratch. Combining multiple existing vocabularies to create new vocabularies whose whole was greater than the sum of its parts also became a feature that users of XML began to require.
However, the likelihood of identical markup, specifically XML elements and attributes, from different vocabularies with different semantics ending up in the same document became a problem. The very extensibility of XML and the fact that its usage had already become widespread across the Internet precluded simply specifying reserved elements or attribute names as the solution to this problem.
The goal of the W3C XML namespaces recommendation was to create a mechanism in which elements and attributes within an XML document that were from different markup vocabularies could be unambiguously identified and combined without processing problems ensuing. The XML namespaces recommendation provided a method for partitioning various items within an XML document based on processing requirements without placing undue restrictions on how these items should be named. For instance, elements named <template>, <output>, and <stylesheet> can occur in an XSLT stylesheet without there being ambiguity as to whether they are transformation directives or potential output of the transformation.
An XML namespace is a collection of names, identified by a Uniform Resource Identifier (URI) reference, which are used in XML documents as element and attribute names.
Namespace DeclarationsA namespace declaration is typically used to map a namespace URI to a specific prefix. The scope of the prefix-namespace mapping is that of the element that the namespace declaration occurs on as well as all its children. An attribute declaration that begins with the prefix xmlns: is a namespace declaration. The value of such an attribute declaration should be a namespace URI which is the namespace name.
Here is an example of an XML document where the root element contains a namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore and its child element contains an inventory element that contains a namespace declaration that maps the prefix inv to the namespace name urn:xmlns:25hoursaday-com:inventory-tracking.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book>
<bk:title>Lord of the Rings</bk:title>
<bk:author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</bk:book>
</bk:bookstore>
In the above example, the scope of the namespace declaration for the urn:xmlns:25hoursaday-com:bookstore namespace name is the entire bk:bookstore element, while that of the urn:xmlns:25hoursaday-com:inventory-tracking is the inv:inventory element. Namespace aware processors can process items from both namespaces independently of each other, which leads to the ability to do multi-layered processing of XML documents. For instance, RDDL documents are valid XHTML documents that can be rendered by a Web browser but also contain information using elements from the http://www.rddl.org namespace that can be used to locate machine readable resources about the members of an XML namespace.
It should be noted that by definition the prefix xml is bound to the XML namespace name and this special namespace is automatically predeclared with document scope in every well-formed XML document.
Default NamespacesThe previous section on namespace declarations is not entirely complete because it leaves out default namespaces. A default namespace declaration is an attribute declaration that has the name xmlns and its value is the namespace URI that is the namespace name.
A default namespace declaration specifies that every unprefixed element name in its scope be from the declaring namespace. Below is the bookstore example utilizing a default namespace instead of a prefix-namespace mapping.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book>
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
All the elements in the above example except for the inv:inventory element belong to the urn:xmlns:25hoursaday-com:bookstore namespace. The primary purpose of default namespaces is to reduce the verbosity of XML documents that utilize namespaces. However, using default namespaces instead of utilizing explicitly mapped prefixes for element names can be confusing because it is not obvious that the elements in the document are namespace scoped.
Also, unlike regular namespace declarations, default namespace declarations can be undeclared by setting the value of the xmlns attribute to the empty string. Undeclaring default namespace declarations is a practice that should be avoided because it may lead to a document that has unprefixed names that belong to a namespace in one part of the document, but don't in another. For example, in the document below only the bookstore element is from the urn:xmlns:25hoursaday-com:bookstore while the other unprefixed elements have no namespace name.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book xmlns="">
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
This practice should be avoided because it leads to extremely confusing situations for readers of the XML document. For more information on undeclaring namespace declarations, see the section on Namespaces Future.
Qualified and Expanded NamesA qualified name, also known as a QName, is an XML name called the local name optionally preceded by another XML name called the prefix and a colon (':') character. The XML names used as the prefix and the local name must match the NCName production, which means that they must not contain a colon character. The prefix of a qualified name must have been mapped to a namespace URI through an in-scope namespace declaration mapping the prefix to the namespace URI. A qualified name can be used as either an attribute or element name.
Although QNames are important mnemonic guides to determining what namespace the elements and attributes within a document are derived from, they are rarely important to XML aware processors. For example, the following three XML documents would be treated identically by a range of XML technologies including, of course, XML schema validators.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType id="123" name="fooType"/>
</xs:schema>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:complexType id="123" name="fooType"/>
</xsd:schema>
<schema xmlns="http://www.w3.org/2001/XMLSchema">
<complexType id="123" name="fooType"/>
</schema>
The W3C XML Path Language recommendation describes an expanded name as a pair consisting of a namespace name and a local name. A universal name is an alternate term coined by James Clark to describe the same concept. A universal name consists of a namespace name in curly braces and a local name. Namespaces tend to make more sense to people when viewed through the lens of universal names. Here are the three XML documents from the previous example with the QNames replaced by universal names. Note that the syntax below is not valid XML syntax.
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
To many XML applications, the universal name of the elements and attributes in an XML document are what is important, and not the values of the prefixes used in specific QNames. The primary reason the Namespaces in XML recommendation does not take the expanded name approach to specifying namespaces is due to its verbosity. Instead, prefix mappings and default namespaces are provided to save us all from developing carpal tunnel syndrome from typing namespace URIs endlessly.
Namespaces and AttributesNamespace declarations do not apply to attributes unless the attribute's name is prefixed. In the XML document shown below the title attribute belongs to the bk:book element and has no namespace while the bk:title attribute has urn:xmlns:25hoursaday-com:bookstore as its namespace name. Note that even though both attributes have the same local name the document is well formed.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book title="Lord of the Rings, Book 3" bk:title="Return of the King"/>
</bk:bookstore>
In the following example, the title attribute still has no namespace and belongs the book element even though there is a default namespace specified. In other words, attributes cannot inherit the default namespace.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book title="Lord of the Rings, Book 3" />
</bookstore>
Namespace URIsA namespace name is a Uniform Resource Identifier (URI) as specified in RFC 2396. A URI is either a Uniform Resource Locators (URLs) or a Uniform Resource Names (URNs). URLs are used to specify the location of resources on the Internet, while URNs are supposed to be persistent, location-independent identifiers for information resources. Namespace names are considered to be identical only if they are the same character for character (case-sensitive). The primary justification for using URIs as namespace names is that they already provide a mechanism for specifying globally unique identities.
The XML namespaces recommendation states that namespace names are only to act as unique identifiers and do not have to actually identify network retrievable resources. This has led to much confusion amongst authors and users of XML documents, especially since the usage of HTTP based URLs as namespace names has grown in popularity. Because many applications convert such URIs to hyperlinks, it is irritating to many users that these "links" do not lead to Web pages or other network retrievable resource. I remember one user who likened it to being given a fake phone number in a social situation.
One solution to avoid confusing users is to use a namespace-naming schema that does not imply network retrievability of the resource. I personally use the urn:xmlns: scheme for this purpose and create namespace names similar to urn:xmlns:25hoursaday-com when authoring XML documents for personal use. The problem with homegrown namespace URIs is that they may run counter to the intent of the Names in XML recommendation by not being globally unique. I get around the globally unique requirement by using my personal domain name http://www.25hoursaday.com as part of the namespace URI.
Another solution is to leave a network retrievable resource at the URI that is the namespace name, such as is done with the XSLT and RDDL namespaces. Typically, such URIs are actually HTTP URLs. A good way to name such URLs is by using the format favored by the W3C, which is as follows:
http://my.domain.example.org/product/[year/month][ / rea]
See the section on Namespaces and Versioning for more information on using similarly structured namespace names as a versioning mechanism.
DOM, XPath, and the XML Information Set on NamespacesThe W3C has defined a number of technologies that provide a data model for XML documents. These data models are generally in agreement, but sometimes differ in how they treat various edge cases due to historic reasons. Treatment of XML namespaces and namespace declarations is an example of an edge case that is treated differently in the three primary data models that exist as W3C recommendations. The three data models are the XPath data model, the Document Object Model (DOM), and the XML information set.
The XML information set (XML infoset) is an abstract description of the data in an XML document and can be considered to be the primary data model for an XML document. The XPath data model is a tree-based model that is traversed when querying an XML document and is similar to the XML information set. The DOM precedes both data models but is also similar to both data models in a number of ways. Both the DOM and the XPath data model can be considered to be interpretations of the XML infoset.
Namespaces in the Document Object Model (DOM)The XML namespace section of the DOM Level 3 specification considers namespace declarations to be regular attribute nodes that have http://www.w3.org/2000/xmlns/ as their namespace name and xmlns as their prefix or qualified name.
Elements and attributes in the DOM have a namespace name that cannot be altered after they have been created regardless of whether their location within the document changes or not.
Namespaces in the XPath Data ModelThe W3C XPath recommendation does not consider namespace declarations to be attribute nodes and does not provide access to them in that capacity. Instead, in XPath every element in an XML document has a number of namespace nodes that can be retrieved using the XPath namespace navigation axis.
Each element in the document has a unique set of namespace nodes for each namespace declaration in scope for that particular element. Namespace nodes are unique to each element in that namespace. Thus namespace nodes for two different elements that represent the same namespace declaration are not identical.
Namespaces in the XML Information SetThe XML infoset recommendation considers namespace declarations to be attribute information items.
In addition, similar to the XPath data model, each element information item in an XML document's information set has a namespace information item for each namespace that is in scope for the element.
XPath, XSLT and NamespacesThe W3C XML Path Language also known as XPath is used to address parts of an XML document and is used in a number of W3C XML technologies including XSLT, XPointer, XML Schema, and DOM Level 3. XPath uses a hierarchical addressing mechanism similar to that used in file systems and URLs to retrieve pieces of an XML document. XPath supports rudimentary manipulation of strings, numbers, and Booleans.
XPath and NamespacesThe XPath data model treats an XML document as a tree of nodes, such as element, attribute, and text nodes, where the name of each node is a combination of its local name and its namespace name (that is, its universal or expanded name).
For element and attribute nodes without namespaces, performing XPath queries is fairly straightforward. The following program, which can be used to query XML documents using the command line, shall be used to demonstrate the impact of namespaces on XPath queries.
using System.Xml.XPath;
using System.Xml;
using System;
using System.IO;
class XPathQuery{
public static string PrintError(Exception e, string errStr){
if(e == null)
return errStr;
else
return PrintError(e.InnerException, errStr + e.Message );
}
public static void Main(string[] args){
if((args.Length == 0) || (args.Length % 2)!= 0){
Console.WriteLine("Usage: xpathquery source query <zero or more
prefix and namespace pairs>");
return;
}
try{
//Load the file.
XmlDocument doc = new XmlDocument();
doc.Load(args[0]); //create prefix<->namespace mappings (if any)
XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
for(int i=2; i < args.Length; i+= 2)
nsMgr.AddNamespace(args[i], args[i + 1]); //Query the document
XmlNodeList nodes = doc.SelectNodes(args[1], nsMgr); //print output
foreach(XmlNode node in nodes)
Console.WriteLine(node.OuterXml + "\n\n");
}catch(XmlException xmle){
Console.WriteLine("ERROR: XML Parse error occured because " +
PrintError(xmle, null));
}catch(FileNotFoundException fnfe){
Console.WriteLine("ERROR: " + PrintError(fnfe, null));
}catch(XPathException xpath){
Console.WriteLine("ERROR: The following error occured while querying
the document: "
&n bsp; + PrintError(xpath, null));
}catch(Exception e){
Console.WriteLine("UNEXPECTED ERROR" + PrintError(e, null));
}
}
}
Given the following XML document that does not declare any namespaces, queries are fairly straightforward as seen in the examples following the code.
<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre="novel">
<title>The Confidence Man</title>
<author>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>
Example 1xpathquery.exe bookstore.xml
/bookstore/book/titleSelects all the title elements that are children of the book element whose parent is the bookstore element, which returns:
<title>The Autobiography of Benjamin Franklin</title>
<title>The Confidence Man</title>xpathquery.exe bookstore.xml
//@genreSelect all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman" and returns:
<title>The Confidence Man</title>
However, once namespaces are added to the mix, things are no longer as simple. The file below is identical to the original file except for the addition of namespaces and one attribute to one of the book elements.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<bk:book genre="novel" bk:genre="fiction"
xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:title>The Confidence Man</bk:title>
<bk:author>
<bk:first-name>Herman</bk:first-name>
<bk:last-name>Melville</bk:last-name>
</bk:author>
<bk:price>11.99</bk:price>
</bk:book>
</bookstore>
Note that the default namespace is in scope for the whole XML document, while the namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore is in scope for the second book element only. Example 2
xpathquery.exe bookstore.xml
/bookstore/book/title
Selects all the title elements that are children of the book element whose parent is the bookstore element, which returns NO RESULTS.
xpathquery.exe bookstore.xml
//@genreSelects all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"
xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman," which returns NO RESULTS.
The first query returns no results because unprefixed names in an XPath query apply to elements or attributes with no namespace. There are no bookstore, book, or title elements in the target document that have no namespace. The second query returns all attribute nodes that have no namespace. Although namespace declarations are in scope for both attribute nodes returned by the query, they have no namespace because namespace declarations do not apply to attributes with unprefixed names. The third query returns no results for the same reasons the first query returns no results.
The way to perform namespace-aware XPath queries is to provide a prefix to namespace mapping to the XPath engine, then use those prefixes in the query. The prefixes provided do not need to be the same as the namespace to prefix mappings in the target document, and they must be non-empty prefixes. Example 3
xpathquery.exe bookstore.xml
/b:bookstore/b:book/b:title b urn:xmlns:25hoursaday-com:bookstoreSelect all the title elements that are children of the book element whose parent is the bookstore element and returns the following:
<title xmlns="urn:xmlns:25hoursaday-com:bookstore">The Autobiography of Benjamin Franklin</title>
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>xpathquery.exe bookstore.xml
//@b:genre b urn:xmlns:25hoursaday-com:bookstoreSelects all the genre attributes from the "urn:xmlns:25hoursaday-com:bookstore" namespace in the document that returns:
bk:genre="fiction"xpathquery.exe bookstore.xml
//bk:title[(../bk:author/bk:first-na me = 'Herman')] bk urn:xmlns:25hoursaday-com:bookstore
Selects all the titles where the author's first name is "Herman" and returns:
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>
Note This last example is the same as the previous examples but rewritten to be namespace aware.
For more information on using XPath, read Aaron Skonnard's article Addressing Infosets with XPath and view the examples at the ZVON.org XPath tutorial.
XSLT and NamespacesThe W3C XSL transformations (XSLT) recommendation describes an XML-based language for transforming XML documents into other XML documents. XSLT transformations, also known as XML style sheets, utilize patterns (XPath) to match aspects of the target document. Upon matching nodes in the target document, templates that specify the output of a successful match can be instantiated and used to transform the document.
Support for namespaces is tightly integrated into XSLT, especially since XPath is used for matching nodes in the source document. Using namespaces in your XPath expressions inside XSLT is much easier than using the DOM.
The example that follows contains:
A program for use in executing transforms from the command line.
An XSLT stylesheet that prints all the title elements from the urn:xmlns:25hoursaday-com:bookstore namespace in the source XML document when run against the bookstore document from the urn:xmlns:25hoursaday-com:bookstore namespace.
The resulting output. Program Imports System.Xml.Xsl
-
Wow! Slashdotted Already?
This article explores the ins and outs of XML namespaces and their ramifications on a number of XML technologies that support namespaces. What follows is a shortened version of my first Extreme XML column.
Overview of XML NamespacesAs XML usage on the Internet became more widespread, the benefits of being able to create markup vocabularies that could be combined and reused similarly to how software modules are combined and reused became increasingly important. If a well defined markup vocabulary for describing coin collections, program configuration files, or fast food restaurant menus already existed, then reusing it made more sense than designing one from scratch. Combining multiple existing vocabularies to create new vocabularies whose whole was greater than the sum of its parts also became a feature that users of XML began to require.
However, the likelihood of identical markup, specifically XML elements and attributes, from different vocabularies with different semantics ending up in the same document became a problem. The very extensibility of XML and the fact that its usage had already become widespread across the Internet precluded simply specifying reserved elements or attribute names as the solution to this problem.
The goal of the W3C XML namespaces recommendation was to create a mechanism in which elements and attributes within an XML document that were from different markup vocabularies could be unambiguously identified and combined without processing problems ensuing. The XML namespaces recommendation provided a method for partitioning various items within an XML document based on processing requirements without placing undue restrictions on how these items should be named. For instance, elements named <template>, <output>, and <stylesheet> can occur in an XSLT stylesheet without there being ambiguity as to whether they are transformation directives or potential output of the transformation.
An XML namespace is a collection of names, identified by a Uniform Resource Identifier (URI) reference, which are used in XML documents as element and attribute names.
Namespace DeclarationsA namespace declaration is typically used to map a namespace URI to a specific prefix. The scope of the prefix-namespace mapping is that of the element that the namespace declaration occurs on as well as all its children. An attribute declaration that begins with the prefix xmlns: is a namespace declaration. The value of such an attribute declaration should be a namespace URI which is the namespace name.
Here is an example of an XML document where the root element contains a namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore and its child element contains an inventory element that contains a namespace declaration that maps the prefix inv to the namespace name urn:xmlns:25hoursaday-com:inventory-tracking.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book>
<bk:title>Lord of the Rings</bk:title>
<bk:author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</bk:book>
</bk:bookstore>
In the above example, the scope of the namespace declaration for the urn:xmlns:25hoursaday-com:bookstore namespace name is the entire bk:bookstore element, while that of the urn:xmlns:25hoursaday-com:inventory-tracking is the inv:inventory element. Namespace aware processors can process items from both namespaces independently of each other, which leads to the ability to do multi-layered processing of XML documents. For instance, RDDL documents are valid XHTML documents that can be rendered by a Web browser but also contain information using elements from the http://www.rddl.org namespace that can be used to locate machine readable resources about the members of an XML namespace.
It should be noted that by definition the prefix xml is bound to the XML namespace name and this special namespace is automatically predeclared with document scope in every well-formed XML document.
Default NamespacesThe previous section on namespace declarations is not entirely complete because it leaves out default namespaces. A default namespace declaration is an attribute declaration that has the name xmlns and its value is the namespace URI that is the namespace name.
A default namespace declaration specifies that every unprefixed element name in its scope be from the declaring namespace. Below is the bookstore example utilizing a default namespace instead of a prefix-namespace mapping.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book>
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
All the elements in the above example except for the inv:inventory element belong to the urn:xmlns:25hoursaday-com:bookstore namespace. The primary purpose of default namespaces is to reduce the verbosity of XML documents that utilize namespaces. However, using default namespaces instead of utilizing explicitly mapped prefixes for element names can be confusing because it is not obvious that the elements in the document are namespace scoped.
Also, unlike regular namespace declarations, default namespace declarations can be undeclared by setting the value of the xmlns attribute to the empty string. Undeclaring default namespace declarations is a practice that should be avoided because it may lead to a document that has unprefixed names that belong to a namespace in one part of the document, but don't in another. For example, in the document below only the bookstore element is from the urn:xmlns:25hoursaday-com:bookstore while the other unprefixed elements have no namespace name.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book xmlns="">
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
This practice should be avoided because it leads to extremely confusing situations for readers of the XML document. For more information on undeclaring namespace declarations, see the section on Namespaces Future.
Qualified and Expanded NamesA qualified name, also known as a QName, is an XML name called the local name optionally preceded by another XML name called the prefix and a colon (':') character. The XML names used as the prefix and the local name must match the NCName production, which means that they must not contain a colon character. The prefix of a qualified name must have been mapped to a namespace URI through an in-scope namespace declaration mapping the prefix to the namespace URI. A qualified name can be used as either an attribute or element name.
Although QNames are important mnemonic guides to determining what namespace the elements and attributes within a document are derived from, they are rarely important to XML aware processors. For example, the following three XML documents would be treated identically by a range of XML technologies including, of course, XML schema validators.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType id="123" name="fooType"/>
</xs:schema>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:complexType id="123" name="fooType"/>
</xsd:schema>
<schema xmlns="http://www.w3.org/2001/XMLSchema">
<complexType id="123" name="fooType"/>
</schema>
The W3C XML Path Language recommendation describes an expanded name as a pair consisting of a namespace name and a local name. A universal name is an alternate term coined by James Clark to describe the same concept. A universal name consists of a namespace name in curly braces and a local name. Namespaces tend to make more sense to people when viewed through the lens of universal names. Here are the three XML documents from the previous example with the QNames replaced by universal names. Note that the syntax below is not valid XML syntax.
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
To many XML applications, the universal name of the elements and attributes in an XML document are what is important, and not the values of the prefixes used in specific QNames. The primary reason the Namespaces in XML recommendation does not take the expanded name approach to specifying namespaces is due to its verbosity. Instead, prefix mappings and default namespaces are provided to save us all from developing carpal tunnel syndrome from typing namespace URIs endlessly.
Namespaces and AttributesNamespace declarations do not apply to attributes unless the attribute's name is prefixed. In the XML document shown below the title attribute belongs to the bk:book element and has no namespace while the bk:title attribute has urn:xmlns:25hoursaday-com:bookstore as its namespace name. Note that even though both attributes have the same local name the document is well formed.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book title="Lord of the Rings, Book 3" bk:title="Return of the King"/>
</bk:bookstore>
In the following example, the title attribute still has no namespace and belongs the book element even though there is a default namespace specified. In other words, attributes cannot inherit the default namespace.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book title="Lord of the Rings, Book 3" />
</bookstore>
Namespace URIsA namespace name is a Uniform Resource Identifier (URI) as specified in RFC 2396. A URI is either a Uniform Resource Locators (URLs) or a Uniform Resource Names (URNs). URLs are used to specify the location of resources on the Internet, while URNs are supposed to be persistent, location-independent identifiers for information resources. Namespace names are considered to be identical only if they are the same character for character (case-sensitive). The primary justification for using URIs as namespace names is that they already provide a mechanism for specifying globally unique identities.
The XML namespaces recommendation states that namespace names are only to act as unique identifiers and do not have to actually identify network retrievable resources. This has led to much confusion amongst authors and users of XML documents, especially since the usage of HTTP based URLs as namespace names has grown in popularity. Because many applications convert such URIs to hyperlinks, it is irritating to many users that these "links" do not lead to Web pages or other network retrievable resource. I remember one user who likened it to being given a fake phone number in a social situation.
One solution to avoid confusing users is to use a namespace-naming schema that does not imply network retrievability of the resource. I personally use the urn:xmlns: scheme for this purpose and create namespace names similar to urn:xmlns:25hoursaday-com when authoring XML documents for personal use. The problem with homegrown namespace URIs is that they may run counter to the intent of the Names in XML recommendation by not being globally unique. I get around the globally unique requirement by using my personal domain name http://www.25hoursaday.com as part of the namespace URI.
Another solution is to leave a network retrievable resource at the URI that is the namespace name, such as is done with the XSLT and RDDL namespaces. Typically, such URIs are actually HTTP URLs. A good way to name such URLs is by using the format favored by the W3C, which is as follows:
http://my.domain.example.org/product/[year/month][ / rea]
See the section on Namespaces and Versioning for more information on using similarly structured namespace names as a versioning mechanism.
DOM, XPath, and the XML Information Set on NamespacesThe W3C has defined a number of technologies that provide a data model for XML documents. These data models are generally in agreement, but sometimes differ in how they treat various edge cases due to historic reasons. Treatment of XML namespaces and namespace declarations is an example of an edge case that is treated differently in the three primary data models that exist as W3C recommendations. The three data models are the XPath data model, the Document Object Model (DOM), and the XML information set.
The XML information set (XML infoset) is an abstract description of the data in an XML document and can be considered to be the primary data model for an XML document. The XPath data model is a tree-based model that is traversed when querying an XML document and is similar to the XML information set. The DOM precedes both data models but is also similar to both data models in a number of ways. Both the DOM and the XPath data model can be considered to be interpretations of the XML infoset.
Namespaces in the Document Object Model (DOM)The XML namespace section of the DOM Level 3 specification considers namespace declarations to be regular attribute nodes that have http://www.w3.org/2000/xmlns/ as their namespace name and xmlns as their prefix or qualified name.
Elements and attributes in the DOM have a namespace name that cannot be altered after they have been created regardless of whether their location within the document changes or not.
Namespaces in the XPath Data ModelThe W3C XPath recommendation does not consider namespace declarations to be attribute nodes and does not provide access to them in that capacity. Instead, in XPath every element in an XML document has a number of namespace nodes that can be retrieved using the XPath namespace navigation axis.
Each element in the document has a unique set of namespace nodes for each namespace declaration in scope for that particular element. Namespace nodes are unique to each element in that namespace. Thus namespace nodes for two different elements that represent the same namespace declaration are not identical.
Namespaces in the XML Information SetThe XML infoset recommendation considers namespace declarations to be attribute information items.
In addition, similar to the XPath data model, each element information item in an XML document's information set has a namespace information item for each namespace that is in scope for the element.
XPath, XSLT and NamespacesThe W3C XML Path Language also known as XPath is used to address parts of an XML document and is used in a number of W3C XML technologies including XSLT, XPointer, XML Schema, and DOM Level 3. XPath uses a hierarchical addressing mechanism similar to that used in file systems and URLs to retrieve pieces of an XML document. XPath supports rudimentary manipulation of strings, numbers, and Booleans.
XPath and NamespacesThe XPath data model treats an XML document as a tree of nodes, such as element, attribute, and text nodes, where the name of each node is a combination of its local name and its namespace name (that is, its universal or expanded name).
For element and attribute nodes without namespaces, performing XPath queries is fairly straightforward. The following program, which can be used to query XML documents using the command line, shall be used to demonstrate the impact of namespaces on XPath queries.
using System.Xml.XPath;
using System.Xml;
using System;
using System.IO;
class XPathQuery{
public static string PrintError(Exception e, string errStr){
if(e == null)
return errStr;
else
return PrintError(e.InnerException, errStr + e.Message );
}
public static void Main(string[] args){
if((args.Length == 0) || (args.Length % 2)!= 0){
Console.WriteLine("Usage: xpathquery source query <zero or more
prefix and namespace pairs>");
return;
}
try{
//Load the file.
XmlDocument doc = new XmlDocument();
doc.Load(args[0]); //create prefix<->namespace mappings (if any)
XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
for(int i=2; i < args.Length; i+= 2)
nsMgr.AddNamespace(args[i], args[i + 1]); //Query the document
XmlNodeList nodes = doc.SelectNodes(args[1], nsMgr); //print output
foreach(XmlNode node in nodes)
Console.WriteLine(node.OuterXml + "\n\n");
}catch(XmlException xmle){
Console.WriteLine("ERROR: XML Parse error occured because " +
PrintError(xmle, null));
}catch(FileNotFoundException fnfe){
Console.WriteLine("ERROR: " + PrintError(fnfe, null));
}catch(XPathException xpath){
Console.WriteLine("ERROR: The following error occured while querying
the document: "
&n bsp; + PrintError(xpath, null));
}catch(Exception e){
Console.WriteLine("UNEXPECTED ERROR" + PrintError(e, null));
}
}
}
Given the following XML document that does not declare any namespaces, queries are fairly straightforward as seen in the examples following the code.
<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre="novel">
<title>The Confidence Man</title>
<author>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>
Example 1xpathquery.exe bookstore.xml
/bookstore/book/titleSelects all the title elements that are children of the book element whose parent is the bookstore element, which returns:
<title>The Autobiography of Benjamin Franklin</title>
<title>The Confidence Man</title>xpathquery.exe bookstore.xml
//@genreSelect all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman" and returns:
<title>The Confidence Man</title>
However, once namespaces are added to the mix, things are no longer as simple. The file below is identical to the original file except for the addition of namespaces and one attribute to one of the book elements.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<bk:book genre="novel" bk:genre="fiction"
xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:title>The Confidence Man</bk:title>
<bk:author>
<bk:first-name>Herman</bk:first-name>
<bk:last-name>Melville</bk:last-name>
</bk:author>
<bk:price>11.99</bk:price>
</bk:book>
</bookstore>
Note that the default namespace is in scope for the whole XML document, while the namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore is in scope for the second book element only. Example 2
xpathquery.exe bookstore.xml
/bookstore/book/title
Selects all the title elements that are children of the book element whose parent is the bookstore element, which returns NO RESULTS.
xpathquery.exe bookstore.xml
//@genreSelects all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"
xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman," which returns NO RESULTS.
The first query returns no results because unprefixed names in an XPath query apply to elements or attributes with no namespace. There are no bookstore, book, or title elements in the target document that have no namespace. The second query returns all attribute nodes that have no namespace. Although namespace declarations are in scope for both attribute nodes returned by the query, they have no namespace because namespace declarations do not apply to attributes with unprefixed names. The third query returns no results for the same reasons the first query returns no results.
The way to perform namespace-aware XPath queries is to provide a prefix to namespace mapping to the XPath engine, then use those prefixes in the query. The prefixes provided do not need to be the same as the namespace to prefix mappings in the target document, and they must be non-empty prefixes. Example 3
xpathquery.exe bookstore.xml
/b:bookstore/b:book/b:title b urn:xmlns:25hoursaday-com:bookstoreSelect all the title elements that are children of the book element whose parent is the bookstore element and returns the following:
<title xmlns="urn:xmlns:25hoursaday-com:bookstore">The Autobiography of Benjamin Franklin</title>
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>xpathquery.exe bookstore.xml
//@b:genre b urn:xmlns:25hoursaday-com:bookstoreSelects all the genre attributes from the "urn:xmlns:25hoursaday-com:bookstore" namespace in the document that returns:
bk:genre="fiction"xpathquery.exe bookstore.xml
//bk:title[(../bk:author/bk:first-na me = 'Herman')] bk urn:xmlns:25hoursaday-com:bookstore
Selects all the titles where the author's first name is "Herman" and returns:
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>
Note This last example is the same as the previous examples but rewritten to be namespace aware.
For more information on using XPath, read Aaron Skonnard's article Addressing Infosets with XPath and view the examples at the ZVON.org XPath tutorial.
XSLT and NamespacesThe W3C XSL transformations (XSLT) recommendation describes an XML-based language for transforming XML documents into other XML documents. XSLT transformations, also known as XML style sheets, utilize patterns (XPath) to match aspects of the target document. Upon matching nodes in the target document, templates that specify the output of a successful match can be instantiated and used to transform the document.
Support for namespaces is tightly integrated into XSLT, especially since XPath is used for matching nodes in the source document. Using namespaces in your XPath expressions inside XSLT is much easier than using the DOM.
The example that follows contains:
A program for use in executing transforms from the command line.
An XSLT stylesheet that prints all the title elements from the urn:xmlns:25hoursaday-com:bookstore namespace in the source XML document when run against the bookstore document from the urn:xmlns:25hoursaday-com:bookstore namespace.
The resulting output. Program Imports System.Xml.Xsl
-
Wow! Slashdotted Already?
This article explores the ins and outs of XML namespaces and their ramifications on a number of XML technologies that support namespaces. What follows is a shortened version of my first Extreme XML column.
Overview of XML NamespacesAs XML usage on the Internet became more widespread, the benefits of being able to create markup vocabularies that could be combined and reused similarly to how software modules are combined and reused became increasingly important. If a well defined markup vocabulary for describing coin collections, program configuration files, or fast food restaurant menus already existed, then reusing it made more sense than designing one from scratch. Combining multiple existing vocabularies to create new vocabularies whose whole was greater than the sum of its parts also became a feature that users of XML began to require.
However, the likelihood of identical markup, specifically XML elements and attributes, from different vocabularies with different semantics ending up in the same document became a problem. The very extensibility of XML and the fact that its usage had already become widespread across the Internet precluded simply specifying reserved elements or attribute names as the solution to this problem.
The goal of the W3C XML namespaces recommendation was to create a mechanism in which elements and attributes within an XML document that were from different markup vocabularies could be unambiguously identified and combined without processing problems ensuing. The XML namespaces recommendation provided a method for partitioning various items within an XML document based on processing requirements without placing undue restrictions on how these items should be named. For instance, elements named <template>, <output>, and <stylesheet> can occur in an XSLT stylesheet without there being ambiguity as to whether they are transformation directives or potential output of the transformation.
An XML namespace is a collection of names, identified by a Uniform Resource Identifier (URI) reference, which are used in XML documents as element and attribute names.
Namespace DeclarationsA namespace declaration is typically used to map a namespace URI to a specific prefix. The scope of the prefix-namespace mapping is that of the element that the namespace declaration occurs on as well as all its children. An attribute declaration that begins with the prefix xmlns: is a namespace declaration. The value of such an attribute declaration should be a namespace URI which is the namespace name.
Here is an example of an XML document where the root element contains a namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore and its child element contains an inventory element that contains a namespace declaration that maps the prefix inv to the namespace name urn:xmlns:25hoursaday-com:inventory-tracking.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book>
<bk:title>Lord of the Rings</bk:title>
<bk:author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</bk:book>
</bk:bookstore>
In the above example, the scope of the namespace declaration for the urn:xmlns:25hoursaday-com:bookstore namespace name is the entire bk:bookstore element, while that of the urn:xmlns:25hoursaday-com:inventory-tracking is the inv:inventory element. Namespace aware processors can process items from both namespaces independently of each other, which leads to the ability to do multi-layered processing of XML documents. For instance, RDDL documents are valid XHTML documents that can be rendered by a Web browser but also contain information using elements from the http://www.rddl.org namespace that can be used to locate machine readable resources about the members of an XML namespace.
It should be noted that by definition the prefix xml is bound to the XML namespace name and this special namespace is automatically predeclared with document scope in every well-formed XML document.
Default NamespacesThe previous section on namespace declarations is not entirely complete because it leaves out default namespaces. A default namespace declaration is an attribute declaration that has the name xmlns and its value is the namespace URI that is the namespace name.
A default namespace declaration specifies that every unprefixed element name in its scope be from the declaring namespace. Below is the bookstore example utilizing a default namespace instead of a prefix-namespace mapping.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book>
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
All the elements in the above example except for the inv:inventory element belong to the urn:xmlns:25hoursaday-com:bookstore namespace. The primary purpose of default namespaces is to reduce the verbosity of XML documents that utilize namespaces. However, using default namespaces instead of utilizing explicitly mapped prefixes for element names can be confusing because it is not obvious that the elements in the document are namespace scoped.
Also, unlike regular namespace declarations, default namespace declarations can be undeclared by setting the value of the xmlns attribute to the empty string. Undeclaring default namespace declarations is a practice that should be avoided because it may lead to a document that has unprefixed names that belong to a namespace in one part of the document, but don't in another. For example, in the document below only the bookstore element is from the urn:xmlns:25hoursaday-com:bookstore while the other unprefixed elements have no namespace name.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book xmlns="">
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
This practice should be avoided because it leads to extremely confusing situations for readers of the XML document. For more information on undeclaring namespace declarations, see the section on Namespaces Future.
Qualified and Expanded NamesA qualified name, also known as a QName, is an XML name called the local name optionally preceded by another XML name called the prefix and a colon (':') character. The XML names used as the prefix and the local name must match the NCName production, which means that they must not contain a colon character. The prefix of a qualified name must have been mapped to a namespace URI through an in-scope namespace declaration mapping the prefix to the namespace URI. A qualified name can be used as either an attribute or element name.
Although QNames are important mnemonic guides to determining what namespace the elements and attributes within a document are derived from, they are rarely important to XML aware processors. For example, the following three XML documents would be treated identically by a range of XML technologies including, of course, XML schema validators.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType id="123" name="fooType"/>
</xs:schema>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:complexType id="123" name="fooType"/>
</xsd:schema>
<schema xmlns="http://www.w3.org/2001/XMLSchema">
<complexType id="123" name="fooType"/>
</schema>
The W3C XML Path Language recommendation describes an expanded name as a pair consisting of a namespace name and a local name. A universal name is an alternate term coined by James Clark to describe the same concept. A universal name consists of a namespace name in curly braces and a local name. Namespaces tend to make more sense to people when viewed through the lens of universal names. Here are the three XML documents from the previous example with the QNames replaced by universal names. Note that the syntax below is not valid XML syntax.
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
To many XML applications, the universal name of the elements and attributes in an XML document are what is important, and not the values of the prefixes used in specific QNames. The primary reason the Namespaces in XML recommendation does not take the expanded name approach to specifying namespaces is due to its verbosity. Instead, prefix mappings and default namespaces are provided to save us all from developing carpal tunnel syndrome from typing namespace URIs endlessly.
Namespaces and AttributesNamespace declarations do not apply to attributes unless the attribute's name is prefixed. In the XML document shown below the title attribute belongs to the bk:book element and has no namespace while the bk:title attribute has urn:xmlns:25hoursaday-com:bookstore as its namespace name. Note that even though both attributes have the same local name the document is well formed.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book title="Lord of the Rings, Book 3" bk:title="Return of the King"/>
</bk:bookstore>
In the following example, the title attribute still has no namespace and belongs the book element even though there is a default namespace specified. In other words, attributes cannot inherit the default namespace.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book title="Lord of the Rings, Book 3" />
</bookstore>
Namespace URIsA namespace name is a Uniform Resource Identifier (URI) as specified in RFC 2396. A URI is either a Uniform Resource Locators (URLs) or a Uniform Resource Names (URNs). URLs are used to specify the location of resources on the Internet, while URNs are supposed to be persistent, location-independent identifiers for information resources. Namespace names are considered to be identical only if they are the same character for character (case-sensitive). The primary justification for using URIs as namespace names is that they already provide a mechanism for specifying globally unique identities.
The XML namespaces recommendation states that namespace names are only to act as unique identifiers and do not have to actually identify network retrievable resources. This has led to much confusion amongst authors and users of XML documents, especially since the usage of HTTP based URLs as namespace names has grown in popularity. Because many applications convert such URIs to hyperlinks, it is irritating to many users that these "links" do not lead to Web pages or other network retrievable resource. I remember one user who likened it to being given a fake phone number in a social situation.
One solution to avoid confusing users is to use a namespace-naming schema that does not imply network retrievability of the resource. I personally use the urn:xmlns: scheme for this purpose and create namespace names similar to urn:xmlns:25hoursaday-com when authoring XML documents for personal use. The problem with homegrown namespace URIs is that they may run counter to the intent of the Names in XML recommendation by not being globally unique. I get around the globally unique requirement by using my personal domain name http://www.25hoursaday.com as part of the namespace URI.
Another solution is to leave a network retrievable resource at the URI that is the namespace name, such as is done with the XSLT and RDDL namespaces. Typically, such URIs are actually HTTP URLs. A good way to name such URLs is by using the format favored by the W3C, which is as follows:
http://my.domain.example.org/product/[year/month][ / rea]
See the section on Namespaces and Versioning for more information on using similarly structured namespace names as a versioning mechanism.
DOM, XPath, and the XML Information Set on NamespacesThe W3C has defined a number of technologies that provide a data model for XML documents. These data models are generally in agreement, but sometimes differ in how they treat various edge cases due to historic reasons. Treatment of XML namespaces and namespace declarations is an example of an edge case that is treated differently in the three primary data models that exist as W3C recommendations. The three data models are the XPath data model, the Document Object Model (DOM), and the XML information set.
The XML information set (XML infoset) is an abstract description of the data in an XML document and can be considered to be the primary data model for an XML document. The XPath data model is a tree-based model that is traversed when querying an XML document and is similar to the XML information set. The DOM precedes both data models but is also similar to both data models in a number of ways. Both the DOM and the XPath data model can be considered to be interpretations of the XML infoset.
Namespaces in the Document Object Model (DOM)The XML namespace section of the DOM Level 3 specification considers namespace declarations to be regular attribute nodes that have http://www.w3.org/2000/xmlns/ as their namespace name and xmlns as their prefix or qualified name.
Elements and attributes in the DOM have a namespace name that cannot be altered after they have been created regardless of whether their location within the document changes or not.
Namespaces in the XPath Data ModelThe W3C XPath recommendation does not consider namespace declarations to be attribute nodes and does not provide access to them in that capacity. Instead, in XPath every element in an XML document has a number of namespace nodes that can be retrieved using the XPath namespace navigation axis.
Each element in the document has a unique set of namespace nodes for each namespace declaration in scope for that particular element. Namespace nodes are unique to each element in that namespace. Thus namespace nodes for two different elements that represent the same namespace declaration are not identical.
Namespaces in the XML Information SetThe XML infoset recommendation considers namespace declarations to be attribute information items.
In addition, similar to the XPath data model, each element information item in an XML document's information set has a namespace information item for each namespace that is in scope for the element.
XPath, XSLT and NamespacesThe W3C XML Path Language also known as XPath is used to address parts of an XML document and is used in a number of W3C XML technologies including XSLT, XPointer, XML Schema, and DOM Level 3. XPath uses a hierarchical addressing mechanism similar to that used in file systems and URLs to retrieve pieces of an XML document. XPath supports rudimentary manipulation of strings, numbers, and Booleans.
XPath and NamespacesThe XPath data model treats an XML document as a tree of nodes, such as element, attribute, and text nodes, where the name of each node is a combination of its local name and its namespace name (that is, its universal or expanded name).
For element and attribute nodes without namespaces, performing XPath queries is fairly straightforward. The following program, which can be used to query XML documents using the command line, shall be used to demonstrate the impact of namespaces on XPath queries.
using System.Xml.XPath;
using System.Xml;
using System;
using System.IO;
class XPathQuery{
public static string PrintError(Exception e, string errStr){
if(e == null)
return errStr;
else
return PrintError(e.InnerException, errStr + e.Message );
}
public static void Main(string[] args){
if((args.Length == 0) || (args.Length % 2)!= 0){
Console.WriteLine("Usage: xpathquery source query <zero or more
prefix and namespace pairs>");
return;
}
try{
//Load the file.
XmlDocument doc = new XmlDocument();
doc.Load(args[0]); //create prefix<->namespace mappings (if any)
XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
for(int i=2; i < args.Length; i+= 2)
nsMgr.AddNamespace(args[i], args[i + 1]); //Query the document
XmlNodeList nodes = doc.SelectNodes(args[1], nsMgr); //print output
foreach(XmlNode node in nodes)
Console.WriteLine(node.OuterXml + "\n\n");
}catch(XmlException xmle){
Console.WriteLine("ERROR: XML Parse error occured because " +
PrintError(xmle, null));
}catch(FileNotFoundException fnfe){
Console.WriteLine("ERROR: " + PrintError(fnfe, null));
}catch(XPathException xpath){
Console.WriteLine("ERROR: The following error occured while querying
the document: "
&n bsp; + PrintError(xpath, null));
}catch(Exception e){
Console.WriteLine("UNEXPECTED ERROR" + PrintError(e, null));
}
}
}
Given the following XML document that does not declare any namespaces, queries are fairly straightforward as seen in the examples following the code.
<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre="novel">
<title>The Confidence Man</title>
<author>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>
Example 1xpathquery.exe bookstore.xml
/bookstore/book/titleSelects all the title elements that are children of the book element whose parent is the bookstore element, which returns:
<title>The Autobiography of Benjamin Franklin</title>
<title>The Confidence Man</title>xpathquery.exe bookstore.xml
//@genreSelect all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman" and returns:
<title>The Confidence Man</title>
However, once namespaces are added to the mix, things are no longer as simple. The file below is identical to the original file except for the addition of namespaces and one attribute to one of the book elements.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<bk:book genre="novel" bk:genre="fiction"
xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:title>The Confidence Man</bk:title>
<bk:author>
<bk:first-name>Herman</bk:first-name>
<bk:last-name>Melville</bk:last-name>
</bk:author>
<bk:price>11.99</bk:price>
</bk:book>
</bookstore>
Note that the default namespace is in scope for the whole XML document, while the namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore is in scope for the second book element only. Example 2
xpathquery.exe bookstore.xml
/bookstore/book/title
Selects all the title elements that are children of the book element whose parent is the bookstore element, which returns NO RESULTS.
xpathquery.exe bookstore.xml
//@genreSelects all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"
xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman," which returns NO RESULTS.
The first query returns no results because unprefixed names in an XPath query apply to elements or attributes with no namespace. There are no bookstore, book, or title elements in the target document that have no namespace. The second query returns all attribute nodes that have no namespace. Although namespace declarations are in scope for both attribute nodes returned by the query, they have no namespace because namespace declarations do not apply to attributes with unprefixed names. The third query returns no results for the same reasons the first query returns no results.
The way to perform namespace-aware XPath queries is to provide a prefix to namespace mapping to the XPath engine, then use those prefixes in the query. The prefixes provided do not need to be the same as the namespace to prefix mappings in the target document, and they must be non-empty prefixes. Example 3
xpathquery.exe bookstore.xml
/b:bookstore/b:book/b:title b urn:xmlns:25hoursaday-com:bookstoreSelect all the title elements that are children of the book element whose parent is the bookstore element and returns the following:
<title xmlns="urn:xmlns:25hoursaday-com:bookstore">The Autobiography of Benjamin Franklin</title>
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>xpathquery.exe bookstore.xml
//@b:genre b urn:xmlns:25hoursaday-com:bookstoreSelects all the genre attributes from the "urn:xmlns:25hoursaday-com:bookstore" namespace in the document that returns:
bk:genre="fiction"xpathquery.exe bookstore.xml
//bk:title[(../bk:author/bk:first-na me = 'Herman')] bk urn:xmlns:25hoursaday-com:bookstore
Selects all the titles where the author's first name is "Herman" and returns:
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>
Note This last example is the same as the previous examples but rewritten to be namespace aware.
For more information on using XPath, read Aaron Skonnard's article Addressing Infosets with XPath and view the examples at the ZVON.org XPath tutorial.
XSLT and NamespacesThe W3C XSL transformations (XSLT) recommendation describes an XML-based language for transforming XML documents into other XML documents. XSLT transformations, also known as XML style sheets, utilize patterns (XPath) to match aspects of the target document. Upon matching nodes in the target document, templates that specify the output of a successful match can be instantiated and used to transform the document.
Support for namespaces is tightly integrated into XSLT, especially since XPath is used for matching nodes in the source document. Using namespaces in your XPath expressions inside XSLT is much easier than using the DOM.
The example that follows contains:
A program for use in executing transforms from the command line.
An XSLT stylesheet that prints all the title elements from the urn:xmlns:25hoursaday-com:bookstore namespace in the source XML document when run against the bookstore document from the urn:xmlns:25hoursaday-com:bookstore namespace.
The resulting output. Program Imports System.Xml.Xsl
-
Wow! Slashdotted Already?
This article explores the ins and outs of XML namespaces and their ramifications on a number of XML technologies that support namespaces. What follows is a shortened version of my first Extreme XML column.
Overview of XML NamespacesAs XML usage on the Internet became more widespread, the benefits of being able to create markup vocabularies that could be combined and reused similarly to how software modules are combined and reused became increasingly important. If a well defined markup vocabulary for describing coin collections, program configuration files, or fast food restaurant menus already existed, then reusing it made more sense than designing one from scratch. Combining multiple existing vocabularies to create new vocabularies whose whole was greater than the sum of its parts also became a feature that users of XML began to require.
However, the likelihood of identical markup, specifically XML elements and attributes, from different vocabularies with different semantics ending up in the same document became a problem. The very extensibility of XML and the fact that its usage had already become widespread across the Internet precluded simply specifying reserved elements or attribute names as the solution to this problem.
The goal of the W3C XML namespaces recommendation was to create a mechanism in which elements and attributes within an XML document that were from different markup vocabularies could be unambiguously identified and combined without processing problems ensuing. The XML namespaces recommendation provided a method for partitioning various items within an XML document based on processing requirements without placing undue restrictions on how these items should be named. For instance, elements named <template>, <output>, and <stylesheet> can occur in an XSLT stylesheet without there being ambiguity as to whether they are transformation directives or potential output of the transformation.
An XML namespace is a collection of names, identified by a Uniform Resource Identifier (URI) reference, which are used in XML documents as element and attribute names.
Namespace DeclarationsA namespace declaration is typically used to map a namespace URI to a specific prefix. The scope of the prefix-namespace mapping is that of the element that the namespace declaration occurs on as well as all its children. An attribute declaration that begins with the prefix xmlns: is a namespace declaration. The value of such an attribute declaration should be a namespace URI which is the namespace name.
Here is an example of an XML document where the root element contains a namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore and its child element contains an inventory element that contains a namespace declaration that maps the prefix inv to the namespace name urn:xmlns:25hoursaday-com:inventory-tracking.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book>
<bk:title>Lord of the Rings</bk:title>
<bk:author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</bk:book>
</bk:bookstore>
In the above example, the scope of the namespace declaration for the urn:xmlns:25hoursaday-com:bookstore namespace name is the entire bk:bookstore element, while that of the urn:xmlns:25hoursaday-com:inventory-tracking is the inv:inventory element. Namespace aware processors can process items from both namespaces independently of each other, which leads to the ability to do multi-layered processing of XML documents. For instance, RDDL documents are valid XHTML documents that can be rendered by a Web browser but also contain information using elements from the http://www.rddl.org namespace that can be used to locate machine readable resources about the members of an XML namespace.
It should be noted that by definition the prefix xml is bound to the XML namespace name and this special namespace is automatically predeclared with document scope in every well-formed XML document.
Default NamespacesThe previous section on namespace declarations is not entirely complete because it leaves out default namespaces. A default namespace declaration is an attribute declaration that has the name xmlns and its value is the namespace URI that is the namespace name.
A default namespace declaration specifies that every unprefixed element name in its scope be from the declaring namespace. Below is the bookstore example utilizing a default namespace instead of a prefix-namespace mapping.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book>
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
All the elements in the above example except for the inv:inventory element belong to the urn:xmlns:25hoursaday-com:bookstore namespace. The primary purpose of default namespaces is to reduce the verbosity of XML documents that utilize namespaces. However, using default namespaces instead of utilizing explicitly mapped prefixes for element names can be confusing because it is not obvious that the elements in the document are namespace scoped.
Also, unlike regular namespace declarations, default namespace declarations can be undeclared by setting the value of the xmlns attribute to the empty string. Undeclaring default namespace declarations is a practice that should be avoided because it may lead to a document that has unprefixed names that belong to a namespace in one part of the document, but don't in another. For example, in the document below only the bookstore element is from the urn:xmlns:25hoursaday-com:bookstore while the other unprefixed elements have no namespace name.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book xmlns="">
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
This practice should be avoided because it leads to extremely confusing situations for readers of the XML document. For more information on undeclaring namespace declarations, see the section on Namespaces Future.
Qualified and Expanded NamesA qualified name, also known as a QName, is an XML name called the local name optionally preceded by another XML name called the prefix and a colon (':') character. The XML names used as the prefix and the local name must match the NCName production, which means that they must not contain a colon character. The prefix of a qualified name must have been mapped to a namespace URI through an in-scope namespace declaration mapping the prefix to the namespace URI. A qualified name can be used as either an attribute or element name.
Although QNames are important mnemonic guides to determining what namespace the elements and attributes within a document are derived from, they are rarely important to XML aware processors. For example, the following three XML documents would be treated identically by a range of XML technologies including, of course, XML schema validators.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType id="123" name="fooType"/>
</xs:schema>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:complexType id="123" name="fooType"/>
</xsd:schema>
<schema xmlns="http://www.w3.org/2001/XMLSchema">
<complexType id="123" name="fooType"/>
</schema>
The W3C XML Path Language recommendation describes an expanded name as a pair consisting of a namespace name and a local name. A universal name is an alternate term coined by James Clark to describe the same concept. A universal name consists of a namespace name in curly braces and a local name. Namespaces tend to make more sense to people when viewed through the lens of universal names. Here are the three XML documents from the previous example with the QNames replaced by universal names. Note that the syntax below is not valid XML syntax.
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
To many XML applications, the universal name of the elements and attributes in an XML document are what is important, and not the values of the prefixes used in specific QNames. The primary reason the Namespaces in XML recommendation does not take the expanded name approach to specifying namespaces is due to its verbosity. Instead, prefix mappings and default namespaces are provided to save us all from developing carpal tunnel syndrome from typing namespace URIs endlessly.
Namespaces and AttributesNamespace declarations do not apply to attributes unless the attribute's name is prefixed. In the XML document shown below the title attribute belongs to the bk:book element and has no namespace while the bk:title attribute has urn:xmlns:25hoursaday-com:bookstore as its namespace name. Note that even though both attributes have the same local name the document is well formed.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book title="Lord of the Rings, Book 3" bk:title="Return of the King"/>
</bk:bookstore>
In the following example, the title attribute still has no namespace and belongs the book element even though there is a default namespace specified. In other words, attributes cannot inherit the default namespace.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book title="Lord of the Rings, Book 3" />
</bookstore>
Namespace URIsA namespace name is a Uniform Resource Identifier (URI) as specified in RFC 2396. A URI is either a Uniform Resource Locators (URLs) or a Uniform Resource Names (URNs). URLs are used to specify the location of resources on the Internet, while URNs are supposed to be persistent, location-independent identifiers for information resources. Namespace names are considered to be identical only if they are the same character for character (case-sensitive). The primary justification for using URIs as namespace names is that they already provide a mechanism for specifying globally unique identities.
The XML namespaces recommendation states that namespace names are only to act as unique identifiers and do not have to actually identify network retrievable resources. This has led to much confusion amongst authors and users of XML documents, especially since the usage of HTTP based URLs as namespace names has grown in popularity. Because many applications convert such URIs to hyperlinks, it is irritating to many users that these "links" do not lead to Web pages or other network retrievable resource. I remember one user who likened it to being given a fake phone number in a social situation.
One solution to avoid confusing users is to use a namespace-naming schema that does not imply network retrievability of the resource. I personally use the urn:xmlns: scheme for this purpose and create namespace names similar to urn:xmlns:25hoursaday-com when authoring XML documents for personal use. The problem with homegrown namespace URIs is that they may run counter to the intent of the Names in XML recommendation by not being globally unique. I get around the globally unique requirement by using my personal domain name http://www.25hoursaday.com as part of the namespace URI.
Another solution is to leave a network retrievable resource at the URI that is the namespace name, such as is done with the XSLT and RDDL namespaces. Typically, such URIs are actually HTTP URLs. A good way to name such URLs is by using the format favored by the W3C, which is as follows:
http://my.domain.example.org/product/[year/month][ / rea]
See the section on Namespaces and Versioning for more information on using similarly structured namespace names as a versioning mechanism.
DOM, XPath, and the XML Information Set on NamespacesThe W3C has defined a number of technologies that provide a data model for XML documents. These data models are generally in agreement, but sometimes differ in how they treat various edge cases due to historic reasons. Treatment of XML namespaces and namespace declarations is an example of an edge case that is treated differently in the three primary data models that exist as W3C recommendations. The three data models are the XPath data model, the Document Object Model (DOM), and the XML information set.
The XML information set (XML infoset) is an abstract description of the data in an XML document and can be considered to be the primary data model for an XML document. The XPath data model is a tree-based model that is traversed when querying an XML document and is similar to the XML information set. The DOM precedes both data models but is also similar to both data models in a number of ways. Both the DOM and the XPath data model can be considered to be interpretations of the XML infoset.
Namespaces in the Document Object Model (DOM)The XML namespace section of the DOM Level 3 specification considers namespace declarations to be regular attribute nodes that have http://www.w3.org/2000/xmlns/ as their namespace name and xmlns as their prefix or qualified name.
Elements and attributes in the DOM have a namespace name that cannot be altered after they have been created regardless of whether their location within the document changes or not.
Namespaces in the XPath Data ModelThe W3C XPath recommendation does not consider namespace declarations to be attribute nodes and does not provide access to them in that capacity. Instead, in XPath every element in an XML document has a number of namespace nodes that can be retrieved using the XPath namespace navigation axis.
Each element in the document has a unique set of namespace nodes for each namespace declaration in scope for that particular element. Namespace nodes are unique to each element in that namespace. Thus namespace nodes for two different elements that represent the same namespace declaration are not identical.
Namespaces in the XML Information SetThe XML infoset recommendation considers namespace declarations to be attribute information items.
In addition, similar to the XPath data model, each element information item in an XML document's information set has a namespace information item for each namespace that is in scope for the element.
XPath, XSLT and NamespacesThe W3C XML Path Language also known as XPath is used to address parts of an XML document and is used in a number of W3C XML technologies including XSLT, XPointer, XML Schema, and DOM Level 3. XPath uses a hierarchical addressing mechanism similar to that used in file systems and URLs to retrieve pieces of an XML document. XPath supports rudimentary manipulation of strings, numbers, and Booleans.
XPath and NamespacesThe XPath data model treats an XML document as a tree of nodes, such as element, attribute, and text nodes, where the name of each node is a combination of its local name and its namespace name (that is, its universal or expanded name).
For element and attribute nodes without namespaces, performing XPath queries is fairly straightforward. The following program, which can be used to query XML documents using the command line, shall be used to demonstrate the impact of namespaces on XPath queries.
using System.Xml.XPath;
using System.Xml;
using System;
using System.IO;
class XPathQuery{
public static string PrintError(Exception e, string errStr){
if(e == null)
return errStr;
else
return PrintError(e.InnerException, errStr + e.Message );
}
public static void Main(string[] args){
if((args.Length == 0) || (args.Length % 2)!= 0){
Console.WriteLine("Usage: xpathquery source query <zero or more
prefix and namespace pairs>");
return;
}
try{
//Load the file.
XmlDocument doc = new XmlDocument();
doc.Load(args[0]); //create prefix<->namespace mappings (if any)
XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
for(int i=2; i < args.Length; i+= 2)
nsMgr.AddNamespace(args[i], args[i + 1]); //Query the document
XmlNodeList nodes = doc.SelectNodes(args[1], nsMgr); //print output
foreach(XmlNode node in nodes)
Console.WriteLine(node.OuterXml + "\n\n");
}catch(XmlException xmle){
Console.WriteLine("ERROR: XML Parse error occured because " +
PrintError(xmle, null));
}catch(FileNotFoundException fnfe){
Console.WriteLine("ERROR: " + PrintError(fnfe, null));
}catch(XPathException xpath){
Console.WriteLine("ERROR: The following error occured while querying
the document: "
&n bsp; + PrintError(xpath, null));
}catch(Exception e){
Console.WriteLine("UNEXPECTED ERROR" + PrintError(e, null));
}
}
}
Given the following XML document that does not declare any namespaces, queries are fairly straightforward as seen in the examples following the code.
<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre="novel">
<title>The Confidence Man</title>
<author>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>
Example 1xpathquery.exe bookstore.xml
/bookstore/book/titleSelects all the title elements that are children of the book element whose parent is the bookstore element, which returns:
<title>The Autobiography of Benjamin Franklin</title>
<title>The Confidence Man</title>xpathquery.exe bookstore.xml
//@genreSelect all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman" and returns:
<title>The Confidence Man</title>
However, once namespaces are added to the mix, things are no longer as simple. The file below is identical to the original file except for the addition of namespaces and one attribute to one of the book elements.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<bk:book genre="novel" bk:genre="fiction"
xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:title>The Confidence Man</bk:title>
<bk:author>
<bk:first-name>Herman</bk:first-name>
<bk:last-name>Melville</bk:last-name>
</bk:author>
<bk:price>11.99</bk:price>
</bk:book>
</bookstore>
Note that the default namespace is in scope for the whole XML document, while the namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore is in scope for the second book element only. Example 2
xpathquery.exe bookstore.xml
/bookstore/book/title
Selects all the title elements that are children of the book element whose parent is the bookstore element, which returns NO RESULTS.
xpathquery.exe bookstore.xml
//@genreSelects all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"
xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman," which returns NO RESULTS.
The first query returns no results because unprefixed names in an XPath query apply to elements or attributes with no namespace. There are no bookstore, book, or title elements in the target document that have no namespace. The second query returns all attribute nodes that have no namespace. Although namespace declarations are in scope for both attribute nodes returned by the query, they have no namespace because namespace declarations do not apply to attributes with unprefixed names. The third query returns no results for the same reasons the first query returns no results.
The way to perform namespace-aware XPath queries is to provide a prefix to namespace mapping to the XPath engine, then use those prefixes in the query. The prefixes provided do not need to be the same as the namespace to prefix mappings in the target document, and they must be non-empty prefixes. Example 3
xpathquery.exe bookstore.xml
/b:bookstore/b:book/b:title b urn:xmlns:25hoursaday-com:bookstoreSelect all the title elements that are children of the book element whose parent is the bookstore element and returns the following:
<title xmlns="urn:xmlns:25hoursaday-com:bookstore">The Autobiography of Benjamin Franklin</title>
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>xpathquery.exe bookstore.xml
//@b:genre b urn:xmlns:25hoursaday-com:bookstoreSelects all the genre attributes from the "urn:xmlns:25hoursaday-com:bookstore" namespace in the document that returns:
bk:genre="fiction"xpathquery.exe bookstore.xml
//bk:title[(../bk:author/bk:first-na me = 'Herman')] bk urn:xmlns:25hoursaday-com:bookstore
Selects all the titles where the author's first name is "Herman" and returns:
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>
Note This last example is the same as the previous examples but rewritten to be namespace aware.
For more information on using XPath, read Aaron Skonnard's article Addressing Infosets with XPath and view the examples at the ZVON.org XPath tutorial.
XSLT and NamespacesThe W3C XSL transformations (XSLT) recommendation describes an XML-based language for transforming XML documents into other XML documents. XSLT transformations, also known as XML style sheets, utilize patterns (XPath) to match aspects of the target document. Upon matching nodes in the target document, templates that specify the output of a successful match can be instantiated and used to transform the document.
Support for namespaces is tightly integrated into XSLT, especially since XPath is used for matching nodes in the source document. Using namespaces in your XPath expressions inside XSLT is much easier than using the DOM.
The example that follows contains:
A program for use in executing transforms from the command line.
An XSLT stylesheet that prints all the title elements from the urn:xmlns:25hoursaday-com:bookstore namespace in the source XML document when run against the bookstore document from the urn:xmlns:25hoursaday-com:bookstore namespace.
The resulting output. Program Imports System.Xml.Xsl
-
Wow! Slashdotted Already?
This article explores the ins and outs of XML namespaces and their ramifications on a number of XML technologies that support namespaces. What follows is a shortened version of my first Extreme XML column.
Overview of XML NamespacesAs XML usage on the Internet became more widespread, the benefits of being able to create markup vocabularies that could be combined and reused similarly to how software modules are combined and reused became increasingly important. If a well defined markup vocabulary for describing coin collections, program configuration files, or fast food restaurant menus already existed, then reusing it made more sense than designing one from scratch. Combining multiple existing vocabularies to create new vocabularies whose whole was greater than the sum of its parts also became a feature that users of XML began to require.
However, the likelihood of identical markup, specifically XML elements and attributes, from different vocabularies with different semantics ending up in the same document became a problem. The very extensibility of XML and the fact that its usage had already become widespread across the Internet precluded simply specifying reserved elements or attribute names as the solution to this problem.
The goal of the W3C XML namespaces recommendation was to create a mechanism in which elements and attributes within an XML document that were from different markup vocabularies could be unambiguously identified and combined without processing problems ensuing. The XML namespaces recommendation provided a method for partitioning various items within an XML document based on processing requirements without placing undue restrictions on how these items should be named. For instance, elements named <template>, <output>, and <stylesheet> can occur in an XSLT stylesheet without there being ambiguity as to whether they are transformation directives or potential output of the transformation.
An XML namespace is a collection of names, identified by a Uniform Resource Identifier (URI) reference, which are used in XML documents as element and attribute names.
Namespace DeclarationsA namespace declaration is typically used to map a namespace URI to a specific prefix. The scope of the prefix-namespace mapping is that of the element that the namespace declaration occurs on as well as all its children. An attribute declaration that begins with the prefix xmlns: is a namespace declaration. The value of such an attribute declaration should be a namespace URI which is the namespace name.
Here is an example of an XML document where the root element contains a namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore and its child element contains an inventory element that contains a namespace declaration that maps the prefix inv to the namespace name urn:xmlns:25hoursaday-com:inventory-tracking.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book>
<bk:title>Lord of the Rings</bk:title>
<bk:author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</bk:book>
</bk:bookstore>
In the above example, the scope of the namespace declaration for the urn:xmlns:25hoursaday-com:bookstore namespace name is the entire bk:bookstore element, while that of the urn:xmlns:25hoursaday-com:inventory-tracking is the inv:inventory element. Namespace aware processors can process items from both namespaces independently of each other, which leads to the ability to do multi-layered processing of XML documents. For instance, RDDL documents are valid XHTML documents that can be rendered by a Web browser but also contain information using elements from the http://www.rddl.org namespace that can be used to locate machine readable resources about the members of an XML namespace.
It should be noted that by definition the prefix xml is bound to the XML namespace name and this special namespace is automatically predeclared with document scope in every well-formed XML document.
Default NamespacesThe previous section on namespace declarations is not entirely complete because it leaves out default namespaces. A default namespace declaration is an attribute declaration that has the name xmlns and its value is the namespace URI that is the namespace name.
A default namespace declaration specifies that every unprefixed element name in its scope be from the declaring namespace. Below is the bookstore example utilizing a default namespace instead of a prefix-namespace mapping.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book>
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
All the elements in the above example except for the inv:inventory element belong to the urn:xmlns:25hoursaday-com:bookstore namespace. The primary purpose of default namespaces is to reduce the verbosity of XML documents that utilize namespaces. However, using default namespaces instead of utilizing explicitly mapped prefixes for element names can be confusing because it is not obvious that the elements in the document are namespace scoped.
Also, unlike regular namespace declarations, default namespace declarations can be undeclared by setting the value of the xmlns attribute to the empty string. Undeclaring default namespace declarations is a practice that should be avoided because it may lead to a document that has unprefixed names that belong to a namespace in one part of the document, but don't in another. For example, in the document below only the bookstore element is from the urn:xmlns:25hoursaday-com:bookstore while the other unprefixed elements have no namespace name.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book xmlns="">
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
This practice should be avoided because it leads to extremely confusing situations for readers of the XML document. For more information on undeclaring namespace declarations, see the section on Namespaces Future.
Qualified and Expanded NamesA qualified name, also known as a QName, is an XML name called the local name optionally preceded by another XML name called the prefix and a colon (':') character. The XML names used as the prefix and the local name must match the NCName production, which means that they must not contain a colon character. The prefix of a qualified name must have been mapped to a namespace URI through an in-scope namespace declaration mapping the prefix to the namespace URI. A qualified name can be used as either an attribute or element name.
Although QNames are important mnemonic guides to determining what namespace the elements and attributes within a document are derived from, they are rarely important to XML aware processors. For example, the following three XML documents would be treated identically by a range of XML technologies including, of course, XML schema validators.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType id="123" name="fooType"/>
</xs:schema>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:complexType id="123" name="fooType"/>
</xsd:schema>
<schema xmlns="http://www.w3.org/2001/XMLSchema">
<complexType id="123" name="fooType"/>
</schema>
The W3C XML Path Language recommendation describes an expanded name as a pair consisting of a namespace name and a local name. A universal name is an alternate term coined by James Clark to describe the same concept. A universal name consists of a namespace name in curly braces and a local name. Namespaces tend to make more sense to people when viewed through the lens of universal names. Here are the three XML documents from the previous example with the QNames replaced by universal names. Note that the syntax below is not valid XML syntax.
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
To many XML applications, the universal name of the elements and attributes in an XML document are what is important, and not the values of the prefixes used in specific QNames. The primary reason the Namespaces in XML recommendation does not take the expanded name approach to specifying namespaces is due to its verbosity. Instead, prefix mappings and default namespaces are provided to save us all from developing carpal tunnel syndrome from typing namespace URIs endlessly.
Namespaces and AttributesNamespace declarations do not apply to attributes unless the attribute's name is prefixed. In the XML document shown below the title attribute belongs to the bk:book element and has no namespace while the bk:title attribute has urn:xmlns:25hoursaday-com:bookstore as its namespace name. Note that even though both attributes have the same local name the document is well formed.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book title="Lord of the Rings, Book 3" bk:title="Return of the King"/>
</bk:bookstore>
In the following example, the title attribute still has no namespace and belongs the book element even though there is a default namespace specified. In other words, attributes cannot inherit the default namespace.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book title="Lord of the Rings, Book 3" />
</bookstore>
Namespace URIsA namespace name is a Uniform Resource Identifier (URI) as specified in RFC 2396. A URI is either a Uniform Resource Locators (URLs) or a Uniform Resource Names (URNs). URLs are used to specify the location of resources on the Internet, while URNs are supposed to be persistent, location-independent identifiers for information resources. Namespace names are considered to be identical only if they are the same character for character (case-sensitive). The primary justification for using URIs as namespace names is that they already provide a mechanism for specifying globally unique identities.
The XML namespaces recommendation states that namespace names are only to act as unique identifiers and do not have to actually identify network retrievable resources. This has led to much confusion amongst authors and users of XML documents, especially since the usage of HTTP based URLs as namespace names has grown in popularity. Because many applications convert such URIs to hyperlinks, it is irritating to many users that these "links" do not lead to Web pages or other network retrievable resource. I remember one user who likened it to being given a fake phone number in a social situation.
One solution to avoid confusing users is to use a namespace-naming schema that does not imply network retrievability of the resource. I personally use the urn:xmlns: scheme for this purpose and create namespace names similar to urn:xmlns:25hoursaday-com when authoring XML documents for personal use. The problem with homegrown namespace URIs is that they may run counter to the intent of the Names in XML recommendation by not being globally unique. I get around the globally unique requirement by using my personal domain name http://www.25hoursaday.com as part of the namespace URI.
Another solution is to leave a network retrievable resource at the URI that is the namespace name, such as is done with the XSLT and RDDL namespaces. Typically, such URIs are actually HTTP URLs. A good way to name such URLs is by using the format favored by the W3C, which is as follows:
http://my.domain.example.org/product/[year/month][ / rea]
See the section on Namespaces and Versioning for more information on using similarly structured namespace names as a versioning mechanism.
DOM, XPath, and the XML Information Set on NamespacesThe W3C has defined a number of technologies that provide a data model for XML documents. These data models are generally in agreement, but sometimes differ in how they treat various edge cases due to historic reasons. Treatment of XML namespaces and namespace declarations is an example of an edge case that is treated differently in the three primary data models that exist as W3C recommendations. The three data models are the XPath data model, the Document Object Model (DOM), and the XML information set.
The XML information set (XML infoset) is an abstract description of the data in an XML document and can be considered to be the primary data model for an XML document. The XPath data model is a tree-based model that is traversed when querying an XML document and is similar to the XML information set. The DOM precedes both data models but is also similar to both data models in a number of ways. Both the DOM and the XPath data model can be considered to be interpretations of the XML infoset.
Namespaces in the Document Object Model (DOM)The XML namespace section of the DOM Level 3 specification considers namespace declarations to be regular attribute nodes that have http://www.w3.org/2000/xmlns/ as their namespace name and xmlns as their prefix or qualified name.
Elements and attributes in the DOM have a namespace name that cannot be altered after they have been created regardless of whether their location within the document changes or not.
Namespaces in the XPath Data ModelThe W3C XPath recommendation does not consider namespace declarations to be attribute nodes and does not provide access to them in that capacity. Instead, in XPath every element in an XML document has a number of namespace nodes that can be retrieved using the XPath namespace navigation axis.
Each element in the document has a unique set of namespace nodes for each namespace declaration in scope for that particular element. Namespace nodes are unique to each element in that namespace. Thus namespace nodes for two different elements that represent the same namespace declaration are not identical.
Namespaces in the XML Information SetThe XML infoset recommendation considers namespace declarations to be attribute information items.
In addition, similar to the XPath data model, each element information item in an XML document's information set has a namespace information item for each namespace that is in scope for the element.
XPath, XSLT and NamespacesThe W3C XML Path Language also known as XPath is used to address parts of an XML document and is used in a number of W3C XML technologies including XSLT, XPointer, XML Schema, and DOM Level 3. XPath uses a hierarchical addressing mechanism similar to that used in file systems and URLs to retrieve pieces of an XML document. XPath supports rudimentary manipulation of strings, numbers, and Booleans.
XPath and NamespacesThe XPath data model treats an XML document as a tree of nodes, such as element, attribute, and text nodes, where the name of each node is a combination of its local name and its namespace name (that is, its universal or expanded name).
For element and attribute nodes without namespaces, performing XPath queries is fairly straightforward. The following program, which can be used to query XML documents using the command line, shall be used to demonstrate the impact of namespaces on XPath queries.
using System.Xml.XPath;
using System.Xml;
using System;
using System.IO;
class XPathQuery{
public static string PrintError(Exception e, string errStr){
if(e == null)
return errStr;
else
return PrintError(e.InnerException, errStr + e.Message );
}
public static void Main(string[] args){
if((args.Length == 0) || (args.Length % 2)!= 0){
Console.WriteLine("Usage: xpathquery source query <zero or more
prefix and namespace pairs>");
return;
}
try{
//Load the file.
XmlDocument doc = new XmlDocument();
doc.Load(args[0]); //create prefix<->namespace mappings (if any)
XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
for(int i=2; i < args.Length; i+= 2)
nsMgr.AddNamespace(args[i], args[i + 1]); //Query the document
XmlNodeList nodes = doc.SelectNodes(args[1], nsMgr); //print output
foreach(XmlNode node in nodes)
Console.WriteLine(node.OuterXml + "\n\n");
}catch(XmlException xmle){
Console.WriteLine("ERROR: XML Parse error occured because " +
PrintError(xmle, null));
}catch(FileNotFoundException fnfe){
Console.WriteLine("ERROR: " + PrintError(fnfe, null));
}catch(XPathException xpath){
Console.WriteLine("ERROR: The following error occured while querying
the document: "
&n bsp; + PrintError(xpath, null));
}catch(Exception e){
Console.WriteLine("UNEXPECTED ERROR" + PrintError(e, null));
}
}
}
Given the following XML document that does not declare any namespaces, queries are fairly straightforward as seen in the examples following the code.
<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre="novel">
<title>The Confidence Man</title>
<author>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>
Example 1xpathquery.exe bookstore.xml
/bookstore/book/titleSelects all the title elements that are children of the book element whose parent is the bookstore element, which returns:
<title>The Autobiography of Benjamin Franklin</title>
<title>The Confidence Man</title>xpathquery.exe bookstore.xml
//@genreSelect all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman" and returns:
<title>The Confidence Man</title>
However, once namespaces are added to the mix, things are no longer as simple. The file below is identical to the original file except for the addition of namespaces and one attribute to one of the book elements.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<bk:book genre="novel" bk:genre="fiction"
xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:title>The Confidence Man</bk:title>
<bk:author>
<bk:first-name>Herman</bk:first-name>
<bk:last-name>Melville</bk:last-name>
</bk:author>
<bk:price>11.99</bk:price>
</bk:book>
</bookstore>
Note that the default namespace is in scope for the whole XML document, while the namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore is in scope for the second book element only. Example 2
xpathquery.exe bookstore.xml
/bookstore/book/title
Selects all the title elements that are children of the book element whose parent is the bookstore element, which returns NO RESULTS.
xpathquery.exe bookstore.xml
//@genreSelects all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"
xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman," which returns NO RESULTS.
The first query returns no results because unprefixed names in an XPath query apply to elements or attributes with no namespace. There are no bookstore, book, or title elements in the target document that have no namespace. The second query returns all attribute nodes that have no namespace. Although namespace declarations are in scope for both attribute nodes returned by the query, they have no namespace because namespace declarations do not apply to attributes with unprefixed names. The third query returns no results for the same reasons the first query returns no results.
The way to perform namespace-aware XPath queries is to provide a prefix to namespace mapping to the XPath engine, then use those prefixes in the query. The prefixes provided do not need to be the same as the namespace to prefix mappings in the target document, and they must be non-empty prefixes. Example 3
xpathquery.exe bookstore.xml
/b:bookstore/b:book/b:title b urn:xmlns:25hoursaday-com:bookstoreSelect all the title elements that are children of the book element whose parent is the bookstore element and returns the following:
<title xmlns="urn:xmlns:25hoursaday-com:bookstore">The Autobiography of Benjamin Franklin</title>
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>xpathquery.exe bookstore.xml
//@b:genre b urn:xmlns:25hoursaday-com:bookstoreSelects all the genre attributes from the "urn:xmlns:25hoursaday-com:bookstore" namespace in the document that returns:
bk:genre="fiction"xpathquery.exe bookstore.xml
//bk:title[(../bk:author/bk:first-na me = 'Herman')] bk urn:xmlns:25hoursaday-com:bookstore
Selects all the titles where the author's first name is "Herman" and returns:
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>
Note This last example is the same as the previous examples but rewritten to be namespace aware.
For more information on using XPath, read Aaron Skonnard's article Addressing Infosets with XPath and view the examples at the ZVON.org XPath tutorial.
XSLT and NamespacesThe W3C XSL transformations (XSLT) recommendation describes an XML-based language for transforming XML documents into other XML documents. XSLT transformations, also known as XML style sheets, utilize patterns (XPath) to match aspects of the target document. Upon matching nodes in the target document, templates that specify the output of a successful match can be instantiated and used to transform the document.
Support for namespaces is tightly integrated into XSLT, especially since XPath is used for matching nodes in the source document. Using namespaces in your XPath expressions inside XSLT is much easier than using the DOM.
The example that follows contains:
A program for use in executing transforms from the command line.
An XSLT stylesheet that prints all the title elements from the urn:xmlns:25hoursaday-com:bookstore namespace in the source XML document when run against the bookstore document from the urn:xmlns:25hoursaday-com:bookstore namespace.
The resulting output. Program Imports System.Xml.Xsl
-
Wow! Slashdotted Already?
This article explores the ins and outs of XML namespaces and their ramifications on a number of XML technologies that support namespaces. What follows is a shortened version of my first Extreme XML column.
Overview of XML NamespacesAs XML usage on the Internet became more widespread, the benefits of being able to create markup vocabularies that could be combined and reused similarly to how software modules are combined and reused became increasingly important. If a well defined markup vocabulary for describing coin collections, program configuration files, or fast food restaurant menus already existed, then reusing it made more sense than designing one from scratch. Combining multiple existing vocabularies to create new vocabularies whose whole was greater than the sum of its parts also became a feature that users of XML began to require.
However, the likelihood of identical markup, specifically XML elements and attributes, from different vocabularies with different semantics ending up in the same document became a problem. The very extensibility of XML and the fact that its usage had already become widespread across the Internet precluded simply specifying reserved elements or attribute names as the solution to this problem.
The goal of the W3C XML namespaces recommendation was to create a mechanism in which elements and attributes within an XML document that were from different markup vocabularies could be unambiguously identified and combined without processing problems ensuing. The XML namespaces recommendation provided a method for partitioning various items within an XML document based on processing requirements without placing undue restrictions on how these items should be named. For instance, elements named <template>, <output>, and <stylesheet> can occur in an XSLT stylesheet without there being ambiguity as to whether they are transformation directives or potential output of the transformation.
An XML namespace is a collection of names, identified by a Uniform Resource Identifier (URI) reference, which are used in XML documents as element and attribute names.
Namespace DeclarationsA namespace declaration is typically used to map a namespace URI to a specific prefix. The scope of the prefix-namespace mapping is that of the element that the namespace declaration occurs on as well as all its children. An attribute declaration that begins with the prefix xmlns: is a namespace declaration. The value of such an attribute declaration should be a namespace URI which is the namespace name.
Here is an example of an XML document where the root element contains a namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore and its child element contains an inventory element that contains a namespace declaration that maps the prefix inv to the namespace name urn:xmlns:25hoursaday-com:inventory-tracking.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book>
<bk:title>Lord of the Rings</bk:title>
<bk:author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</bk:book>
</bk:bookstore>
In the above example, the scope of the namespace declaration for the urn:xmlns:25hoursaday-com:bookstore namespace name is the entire bk:bookstore element, while that of the urn:xmlns:25hoursaday-com:inventory-tracking is the inv:inventory element. Namespace aware processors can process items from both namespaces independently of each other, which leads to the ability to do multi-layered processing of XML documents. For instance, RDDL documents are valid XHTML documents that can be rendered by a Web browser but also contain information using elements from the http://www.rddl.org namespace that can be used to locate machine readable resources about the members of an XML namespace.
It should be noted that by definition the prefix xml is bound to the XML namespace name and this special namespace is automatically predeclared with document scope in every well-formed XML document.
Default NamespacesThe previous section on namespace declarations is not entirely complete because it leaves out default namespaces. A default namespace declaration is an attribute declaration that has the name xmlns and its value is the namespace URI that is the namespace name.
A default namespace declaration specifies that every unprefixed element name in its scope be from the declaring namespace. Below is the bookstore example utilizing a default namespace instead of a prefix-namespace mapping.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book>
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
All the elements in the above example except for the inv:inventory element belong to the urn:xmlns:25hoursaday-com:bookstore namespace. The primary purpose of default namespaces is to reduce the verbosity of XML documents that utilize namespaces. However, using default namespaces instead of utilizing explicitly mapped prefixes for element names can be confusing because it is not obvious that the elements in the document are namespace scoped.
Also, unlike regular namespace declarations, default namespace declarations can be undeclared by setting the value of the xmlns attribute to the empty string. Undeclaring default namespace declarations is a practice that should be avoided because it may lead to a document that has unprefixed names that belong to a namespace in one part of the document, but don't in another. For example, in the document below only the bookstore element is from the urn:xmlns:25hoursaday-com:bookstore while the other unprefixed elements have no namespace name.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book xmlns="">
<title>Lord of the Rings</bk:title>
<author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tra cking" />
</book>
</bookstore>
This practice should be avoided because it leads to extremely confusing situations for readers of the XML document. For more information on undeclaring namespace declarations, see the section on Namespaces Future.
Qualified and Expanded NamesA qualified name, also known as a QName, is an XML name called the local name optionally preceded by another XML name called the prefix and a colon (':') character. The XML names used as the prefix and the local name must match the NCName production, which means that they must not contain a colon character. The prefix of a qualified name must have been mapped to a namespace URI through an in-scope namespace declaration mapping the prefix to the namespace URI. A qualified name can be used as either an attribute or element name.
Although QNames are important mnemonic guides to determining what namespace the elements and attributes within a document are derived from, they are rarely important to XML aware processors. For example, the following three XML documents would be treated identically by a range of XML technologies including, of course, XML schema validators.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType id="123" name="fooType"/>
</xs:schema>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:complexType id="123" name="fooType"/>
</xsd:schema>
<schema xmlns="http://www.w3.org/2001/XMLSchema">
<complexType id="123" name="fooType"/>
</schema>
The W3C XML Path Language recommendation describes an expanded name as a pair consisting of a namespace name and a local name. A universal name is an alternate term coined by James Clark to describe the same concept. A universal name consists of a namespace name in curly braces and a local name. Namespaces tend to make more sense to people when viewed through the lens of universal names. Here are the three XML documents from the previous example with the QNames replaced by universal names. Note that the syntax below is not valid XML syntax.
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}schema>&n bsp;
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>
To many XML applications, the universal name of the elements and attributes in an XML document are what is important, and not the values of the prefixes used in specific QNames. The primary reason the Namespaces in XML recommendation does not take the expanded name approach to specifying namespaces is due to its verbosity. Instead, prefix mappings and default namespaces are provided to save us all from developing carpal tunnel syndrome from typing namespace URIs endlessly.
Namespaces and AttributesNamespace declarations do not apply to attributes unless the attribute's name is prefixed. In the XML document shown below the title attribute belongs to the bk:book element and has no namespace while the bk:title attribute has urn:xmlns:25hoursaday-com:bookstore as its namespace name. Note that even though both attributes have the same local name the document is well formed.
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book title="Lord of the Rings, Book 3" bk:title="Return of the King"/>
</bk:bookstore>
In the following example, the title attribute still has no namespace and belongs the book element even though there is a default namespace specified. In other words, attributes cannot inherit the default namespace.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book title="Lord of the Rings, Book 3" />
</bookstore>
Namespace URIsA namespace name is a Uniform Resource Identifier (URI) as specified in RFC 2396. A URI is either a Uniform Resource Locators (URLs) or a Uniform Resource Names (URNs). URLs are used to specify the location of resources on the Internet, while URNs are supposed to be persistent, location-independent identifiers for information resources. Namespace names are considered to be identical only if they are the same character for character (case-sensitive). The primary justification for using URIs as namespace names is that they already provide a mechanism for specifying globally unique identities.
The XML namespaces recommendation states that namespace names are only to act as unique identifiers and do not have to actually identify network retrievable resources. This has led to much confusion amongst authors and users of XML documents, especially since the usage of HTTP based URLs as namespace names has grown in popularity. Because many applications convert such URIs to hyperlinks, it is irritating to many users that these "links" do not lead to Web pages or other network retrievable resource. I remember one user who likened it to being given a fake phone number in a social situation.
One solution to avoid confusing users is to use a namespace-naming schema that does not imply network retrievability of the resource. I personally use the urn:xmlns: scheme for this purpose and create namespace names similar to urn:xmlns:25hoursaday-com when authoring XML documents for personal use. The problem with homegrown namespace URIs is that they may run counter to the intent of the Names in XML recommendation by not being globally unique. I get around the globally unique requirement by using my personal domain name http://www.25hoursaday.com as part of the namespace URI.
Another solution is to leave a network retrievable resource at the URI that is the namespace name, such as is done with the XSLT and RDDL namespaces. Typically, such URIs are actually HTTP URLs. A good way to name such URLs is by using the format favored by the W3C, which is as follows:
http://my.domain.example.org/product/[year/month][ / rea]
See the section on Namespaces and Versioning for more information on using similarly structured namespace names as a versioning mechanism.
DOM, XPath, and the XML Information Set on NamespacesThe W3C has defined a number of technologies that provide a data model for XML documents. These data models are generally in agreement, but sometimes differ in how they treat various edge cases due to historic reasons. Treatment of XML namespaces and namespace declarations is an example of an edge case that is treated differently in the three primary data models that exist as W3C recommendations. The three data models are the XPath data model, the Document Object Model (DOM), and the XML information set.
The XML information set (XML infoset) is an abstract description of the data in an XML document and can be considered to be the primary data model for an XML document. The XPath data model is a tree-based model that is traversed when querying an XML document and is similar to the XML information set. The DOM precedes both data models but is also similar to both data models in a number of ways. Both the DOM and the XPath data model can be considered to be interpretations of the XML infoset.
Namespaces in the Document Object Model (DOM)The XML namespace section of the DOM Level 3 specification considers namespace declarations to be regular attribute nodes that have http://www.w3.org/2000/xmlns/ as their namespace name and xmlns as their prefix or qualified name.
Elements and attributes in the DOM have a namespace name that cannot be altered after they have been created regardless of whether their location within the document changes or not.
Namespaces in the XPath Data ModelThe W3C XPath recommendation does not consider namespace declarations to be attribute nodes and does not provide access to them in that capacity. Instead, in XPath every element in an XML document has a number of namespace nodes that can be retrieved using the XPath namespace navigation axis.
Each element in the document has a unique set of namespace nodes for each namespace declaration in scope for that particular element. Namespace nodes are unique to each element in that namespace. Thus namespace nodes for two different elements that represent the same namespace declaration are not identical.
Namespaces in the XML Information SetThe XML infoset recommendation considers namespace declarations to be attribute information items.
In addition, similar to the XPath data model, each element information item in an XML document's information set has a namespace information item for each namespace that is in scope for the element.
XPath, XSLT and NamespacesThe W3C XML Path Language also known as XPath is used to address parts of an XML document and is used in a number of W3C XML technologies including XSLT, XPointer, XML Schema, and DOM Level 3. XPath uses a hierarchical addressing mechanism similar to that used in file systems and URLs to retrieve pieces of an XML document. XPath supports rudimentary manipulation of strings, numbers, and Booleans.
XPath and NamespacesThe XPath data model treats an XML document as a tree of nodes, such as element, attribute, and text nodes, where the name of each node is a combination of its local name and its namespace name (that is, its universal or expanded name).
For element and attribute nodes without namespaces, performing XPath queries is fairly straightforward. The following program, which can be used to query XML documents using the command line, shall be used to demonstrate the impact of namespaces on XPath queries.
using System.Xml.XPath;
using System.Xml;
using System;
using System.IO;
class XPathQuery{
public static string PrintError(Exception e, string errStr){
if(e == null)
return errStr;
else
return PrintError(e.InnerException, errStr + e.Message );
}
public static void Main(string[] args){
if((args.Length == 0) || (args.Length % 2)!= 0){
Console.WriteLine("Usage: xpathquery source query <zero or more
prefix and namespace pairs>");
return;
}
try{
//Load the file.
XmlDocument doc = new XmlDocument();
doc.Load(args[0]); //create prefix<->namespace mappings (if any)
XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
for(int i=2; i < args.Length; i+= 2)
nsMgr.AddNamespace(args[i], args[i + 1]); //Query the document
XmlNodeList nodes = doc.SelectNodes(args[1], nsMgr); //print output
foreach(XmlNode node in nodes)
Console.WriteLine(node.OuterXml + "\n\n");
}catch(XmlException xmle){
Console.WriteLine("ERROR: XML Parse error occured because " +
PrintError(xmle, null));
}catch(FileNotFoundException fnfe){
Console.WriteLine("ERROR: " + PrintError(fnfe, null));
}catch(XPathException xpath){
Console.WriteLine("ERROR: The following error occured while querying
the document: "
&n bsp; + PrintError(xpath, null));
}catch(Exception e){
Console.WriteLine("UNEXPECTED ERROR" + PrintError(e, null));
}
}
}
Given the following XML document that does not declare any namespaces, queries are fairly straightforward as seen in the examples following the code.
<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre="novel">
<title>The Confidence Man</title>
<author>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>
Example 1xpathquery.exe bookstore.xml
/bookstore/book/titleSelects all the title elements that are children of the book element whose parent is the bookstore element, which returns:
<title>The Autobiography of Benjamin Franklin</title>
<title>The Confidence Man</title>xpathquery.exe bookstore.xml
//@genreSelect all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman" and returns:
<title>The Confidence Man</title>
However, once namespaces are added to the mix, things are no longer as simple. The file below is identical to the original file except for the addition of namespaces and one attribute to one of the book elements.
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<bk:book genre="novel" bk:genre="fiction"
xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:title>The Confidence Man</bk:title>
<bk:author>
<bk:first-name>Herman</bk:first-name>
<bk:last-name>Melville</bk:last-name>
</bk:author>
<bk:price>11.99</bk:price>
</bk:book>
</bookstore>
Note that the default namespace is in scope for the whole XML document, while the namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore is in scope for the second book element only. Example 2
xpathquery.exe bookstore.xml
/bookstore/book/title
Selects all the title elements that are children of the book element whose parent is the bookstore element, which returns NO RESULTS.
xpathquery.exe bookstore.xml
//@genreSelects all the genre attributes in the document and returns:
genre="autobiography"
genre="novel"
xpathquery.exe bookstore.xml
//title[(../author/first-name = 'Herman')]Selects all the titles where the author's first name is "Herman," which returns NO RESULTS.
The first query returns no results because unprefixed names in an XPath query apply to elements or attributes with no namespace. There are no bookstore, book, or title elements in the target document that have no namespace. The second query returns all attribute nodes that have no namespace. Although namespace declarations are in scope for both attribute nodes returned by the query, they have no namespace because namespace declarations do not apply to attributes with unprefixed names. The third query returns no results for the same reasons the first query returns no results.
The way to perform namespace-aware XPath queries is to provide a prefix to namespace mapping to the XPath engine, then use those prefixes in the query. The prefixes provided do not need to be the same as the namespace to prefix mappings in the target document, and they must be non-empty prefixes. Example 3
xpathquery.exe bookstore.xml
/b:bookstore/b:book/b:title b urn:xmlns:25hoursaday-com:bookstoreSelect all the title elements that are children of the book element whose parent is the bookstore element and returns the following:
<title xmlns="urn:xmlns:25hoursaday-com:bookstore">The Autobiography of Benjamin Franklin</title>
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>xpathquery.exe bookstore.xml
//@b:genre b urn:xmlns:25hoursaday-com:bookstoreSelects all the genre attributes from the "urn:xmlns:25hoursaday-com:bookstore" namespace in the document that returns:
bk:genre="fiction"xpathquery.exe bookstore.xml
//bk:title[(../bk:author/bk:first-na me = 'Herman')] bk urn:xmlns:25hoursaday-com:bookstore
Selects all the titles where the author's first name is "Herman" and returns:
<bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"> The Confidence Man</bk:title>
Note This last example is the same as the previous examples but rewritten to be namespace aware.
For more information on using XPath, read Aaron Skonnard's article Addressing Infosets with XPath and view the examples at the ZVON.org XPath tutorial.
XSLT and NamespacesThe W3C XSL transformations (XSLT) recommendation describes an XML-based language for transforming XML documents into other XML documents. XSLT transformations, also known as XML style sheets, utilize patterns (XPath) to match aspects of the target document. Upon matching nodes in the target document, templates that specify the output of a successful match can be instantiated and used to transform the document.
Support for namespaces is tightly integrated into XSLT, especially since XPath is used for matching nodes in the source document. Using namespaces in your XPath expressions inside XSLT is much easier than using the DOM.
The example that follows contains:
A program for use in executing transforms from the command line.
An XSLT stylesheet that prints all the title elements from the urn:xmlns:25hoursaday-com:bookstore namespace in the source XML document when run against the bookstore document from the urn:xmlns:25hoursaday-com:bookstore namespace.
The resulting output. Program Imports System.Xml.Xsl
-
Re:Irony
The EFF, a foundation created to defend the technological freedoms and rights of individuals around the world, is promoting the use of Macromedia Flash, the success of which will fill the coffers of a pro-DMCA company at the expense of open W3C [w3.org] standards. The irony is sweet and poignant.
Please point me to the place on the W3C site where it talks about an open standard for what flash does. Are you going to try to refer me to SVG? I don't think that will fit the bill. Don't see anything about sound or interactivity.
Perhaps you could stop your whining and generate your Flash from PHP, or you could head over here and read the spec for Flash, published by Macromedia in 1998. -
IronyThe EFF, a foundation created to defend the technological freedoms and rights of individuals around the world, is promoting the use of Macromedia Flash, the success of which will fill the coffers of a pro-DMCA company at the expense of open W3C standards. The irony is sweet and poignant.
I mean, they don't have to be Free Software fanatics or anything, but I do expect the EFF to at least make a token effort at not squandering money, and keeping true to their beliefs. Surely the $1000 licensing fee for Macromedia Flash plus time spend developing the animation could have been better employed to defend peoples' rights in the real world, rather than promoting a proprietary plugin that I can't even see. I mean, the total amount I have donated to the EFF in the past is probably less than $1000, and I'd consider myself one of their more pecunative sponsorors.
Self-declared "pragmatists" may claim that Flash is the most widespread, or that it is the 'best', but he is a fool who denies that freedom from vendor lock-in and the goals of the EFF do not go hand in hand. I'll be contacting them directly over the next few days; the only right thing for them to do is to remove the proprietary Flash animation from their site. -
Now with valid HTML and correct linksA helpful reader named Jon Doyle wrote in to tell me I had a big HTML error in Why You Should use Encryption that caused half the page to be one big link. I hadn't noticed that because it didn't occur in the browsers I'd tried it with.
I ran it through the W3C HTML validator and found quite a few problems with the HTML, and have fixed them. The page now validates as HTML 4.01 transitional.
Also I have long had a bad link to a page called "Email Encryption Made Simple", and several people have written in over the last couple years to give me an updated URL, but I never got around to fixing it. Now the link works.
Finally, I urge the use of PGP on the page. But Network Associates no longer supports PGP. I thought it would be helpful to mention GNU Privacy Guard, which is actually what I use these days. I added links to it and will try to elaborate on it in the discussion sometime in the next week or so.
-
Re:I fail to see
You mentioned nothing about SSH in the first post hence I had to judge what type of use you are based on the things that you did actually remember to mention. Apparently you don't grasp web design, apache, or any of the other tools that you would need to do what you want to do (IE: using a search engine).
http://www.w3.org/International/O-URL-code.html
Thanks for playing.
Idiot. -
Re:I fail to see
W3C has a page on the subject. I don't know why it doesn't work in your case; I suspect you're dealing with character set issues, but you didn't mention what webbrowser you were using or anything, so it's hard to tell what went wrong.
-
more information...
This was discussed on RISKS some time back. They provide a link to a copy of the article.
Also, from draft-masinter-url-i18n-08:
6. Security Considerations
If IRI entry software normalizes the characters entered, but the resource names on the interpreting side are not normalized accordingly, and the interpreting software does not take this into account, there is a possibility of "spoofing". Similar possibilities turn up when interpreting software accepts URIs in various native encodings or allows accents and similar things to be ignored.
"Spoofing" means that somebody may add a resource name that looks the same or similar to the user while actually being different, or a resource name that contains the same characters, but in a different encoding. The added resource may pretend to be the real resource by looking very similar, but may contain all kinds of changes that may be difficult to spot but can cause all kinds of problems.
Conceptually, this is no different from the problems surrounding the use of case-insensitive web servers. For example, a popular web page with a mixed case name (http://big.site/PopularPage.html) might be "spoofed" by someone who obtains access to (http://big.site/popularpage.html).
However, the introduction of character normalization, of additional mappings for user convenience, and of mappings for various encodings may increase the number of spoofing possibilities. In some cases, in particular for Latin-based resource names, this is usually easy to detect because UTF-8-encoded names, when interpreted and viewed as legacy encodings, produce mostly garbage. In other cases, when concurrently used encodings have a similar structure, but there are no characters that have exactly the same encoding, detection is more difficult. A good example may be the concurrent use of Shift_JIS and EUC-JP on a Japanese server.
Administrators of large sites which allow independent users to create subareas may need to be careful that the aliasing rules do not create chances for spoofing.
-
Re:Dump all "Office" software packages
Dump propreitary formats, standardize on something even bigger and widespread.
What should that standard be that NN4-created "HTML" documents adhere to? I hope you don't talk about HTML, try feeding one of your documents to a HTML validator. -
Re:So did Apple, evidently
http 1.1 server pipeline data... that is the compress everything the server sends you... the you broswser decompresses it. Read about it here
-
Re:Heres the post everyone should read first
5) So if I find a page that Mozilla/Netscape can not render, yet IE can, how does that prove that the bug isn't with Mozilla/Netscape.
-
Re:Heres the post everyone should read first
5) So if I find a page that Mozilla/Netscape can not render, yet IE can, how does that prove that the bug isn't with Mozilla/Netscape.
-
Re:Heres the post everyone should read first
5) So if I find a page that Mozilla/Netscape can not render, yet IE can, how does that prove that the bug isn't with Mozilla/Netscape.
-
Re:Heres the post everyone should read first
5) So if I find a page that Mozilla/Netscape can not render, yet IE can, how does that prove that the bug isn't with Mozilla/Netscape.
-
Re:Heres the post everyone should read first
5) So if I find a page that Mozilla/Netscape can not render, yet IE can, how does that prove that the bug isn't with Mozilla/Netscape.
-
Re:Heres the post everyone should read first
5) So if I find a page that Mozilla/Netscape can not render, yet IE can, how does that prove that the bug isn't with Mozilla/Netscape.
-
Re:Heres the post everyone should read first
5) So if I find a page that Mozilla/Netscape can not render, yet IE can, how does that prove that the bug isn't with Mozilla/Netscape.
-
Re:Heres the post everyone should read first
5) So if I find a page that Mozilla/Netscape can not render, yet IE can, how does that prove that the bug isn't with Mozilla/Netscape.
-
Re:Heres the post everyone should read first
5) So if I find a page that Mozilla/Netscape can not render, yet IE can, how does that prove that the bug isn't with Mozilla/Netscape.
-
Re:As a Web Designer...
Well, it's not really a fair test since the site is made up of fairly broken HTML (see validator.w3.org), and the designer obviously only tested it to make sure it "works" right under IE.
-
MathML worksMathML is part of the preview release, although Netscape seems pretty quiet about it. It wasn't mentioned in the PC World article either. I tried it out on the Mozilla MathML torture test and it works fine. The only negative is that you need to separately load some math fonts
... at least on unix.Undoubtably MathML support is there because it is in Mozilla. Between Mozilla, Netscape, and IE (with MathPlayer), all of the major browsers will support MathML. That together with support from math programs such as Mathematica, it really looks like MathML will finally become real this year.
There's a conference on MathML at the end of June this year. Leslie Lamport (LaTeX fame) and Roger Sidje (who did the MathML support in Mozilla) are among the invited speakers.
-
Re:Maybe I'm missing something . . .
SOAP also uses the same port 80 (to sidestep firewalls)
Soap can use port 80 if it wants, but there is absolutely nothing stopping you having your SOAP server running on any port you want. There is nothing forcing you to even use http/https for the transport layer.
To quote from section 2.1 of the SOAP standards
A SOAP message such as that in Example 1 may be transferred by different underlying protocols and used in a variety of message exchange patterns. For example, with a Web-based access to a travel service, it could be placed in the body of a HTTP POST request. In another transport binding, it can be sent in an email message (see section 3.2). Section 3 will describe how SOAP messages may be conveyed by a variety of transports.
There are security issues with SOAP, but this whole port 80/firewall thing is a red herring. -
WS SecurityWell Microsoft, IBM and somebody else have released the WS Security "spec" (whitepaper) to address some security issues with SOAP, namely message-level digital signature and encryption. It's technically clean, if a little light on detail.
Things to note (strategic):
None of SOAP, WSDL, UDDI, and now WS Security are "Royalty Free".
SOAP isn't a de jure standard -- it's a W3C "note".
UDDI was supposed to move into an open standards body in 2001 but still hasn't.
By publishing WS Security on their websites and through no open standards body we see Microsoft, IBM and that other company abandoning even attempts to appear open.
On the technical side -- if you want to see a little deeper into the security issues left unsolved by SOAP, I recommend you look at the OASIS technical committee specification, ebXML Message Service Specification version 2.0 rev C.
-
Re:Really really bad design.
I agree. I used a couple of different versions of wget on a couple of different boxes to make sure I was getting all the page requisites. I can't believe that page renders properly on anything.
http://validator.w3.org/ says:
"Sorry, this document does not validate as XHTML 1.0 Transitional." Which is what the document claims to be. -
Re:Flash would work, right? Wrong!I'll have to disagree with using Flash. It's closed, it costs money, etc, etc. Use SVG instead. Being XML based it is simply a matter of constructing a text file and then running it through an viewer. If it turns out not to have the scope of primitives that you need you can try something else. However, I bet that you'll be able to do most of the animations that you have imagined with little difficulty.
On a related note, I've concentrated quite a bit over the past year in generating charts and graphs to multiple output formats simultaneously. Of interest here are some new versions of old standard apps, Gnuplot and Graphviz. Both applications contain SVG output capabilities in their latest builds. I've been using both to generate both PNG, but also PS and SVG. I will generally convert the PS to PDF via Ghostscript. To me this represents an incredible time savings by allowing me to generate PNG's to act as thumbnails for the SVG and PDF versions of the same graph. Consider also the ability to nest SVG objects within a larger SVG picture (or animation). To aid in technical illustration you could actually embed SVG chart animations with the other custom 2D animations that you seek to create to further clarify the idea you are trying to present.
Raster, vector and publication quality visualizations in one fell swoop without spending a dime. Schweet!
-
Re:CSS rendering bug
they do
:)
this page is their test and it complains about the missing doctype too.
You have to put the url in, slashdots crudbuster wouldn't let me post the complete addy. -
CSS rendering bug
I have zero experience with Mozilla's development, so I thought I'd ask for advice before spamming bugzilla...
Mozilla incorrectly renders this w3c CSS1 "float" test. How do I determine if this is known: what kind of bug do I search for? If it is not known, where and how should I file it, or should I report it to a Mozilla insider to file for me? -
Re:Hmmm. Photomesa...Think in terms of the real world where you can inspect your intended target from a distance and decide what the best route is to get there. That can't happen in 2D w/o alot of cumbersome reference (ala CLI).
Photomesa is based on the Jazz API (a Java class lib), that implements a ZUI (zoomable user interface), a paradigm sometimes described as 2.5 D interface.
You can use the mouse to move in xy plane und with an extra key press (or possibly via mouse wheel in J2SE 1.4) you change the height of the camera in z direction. This combined with semantic zooming, which means different levels of detail associated with the height, make for very nice user interfaces.
A similiar API has been developed at Xerox France, the VTM library which used by the W3C for the nice IsaViz tool.
Despite IMHO Java still sucks regarding performance, both APIs perform very well with large object scence graphs. Like I wrote, combined with Java's luxury 2D API, it enables us to build attractive user interfaces.
Regards,
Marc -
Re:So Abiword isn't IE compatible?
I don't have IE but, is IE HTML compatible?
http://validator.w3.org
-
Re:Mozilla not ready for ecommerceMozilla is not broken.
From the HTTP 1.1 specsNote: Because the source of a link may be private information or may reveal an otherwise private information source, it is strongly recommended that the user be able to select whether or not the Referer field is sent. For example, a browser client could have a toggle switch for browsing openly/anonymously, which would respectively enable/disable the sending of Referer and From information.
Repeat after me:
The REFERER field is OPTIONAL.
O P T I O N A L
Anyone relying on an optional field is making a mistake in their design. Just like the people that assume everyone will have images on, or javascript on, or flash, or windows media player. Just because the majority has something doesn't mean you can ignore the minority. Could support for this sort of thing being added to Mozilla as a user-selectable option, though? Sure.
And, just because IE and earlier Navigator versions did something does not mean it's correct. If it did, then PNG wouldn't support transparency. -
Tim Berners-Lee
"The ability to refer to a document (or a person or any thing else) is in general a fundamental right of free speech to the same extent that speech is free. Making the reference with a hypertext link is more efficient but changes nothing else. . . . There is no reason to have to ask before making a link to another site."
-
Bad example
I mean, a free open standard has worked pretty well for HTML.
Yes it has... but I worked as a <shame>webdesigner</shame> for a (short) while at the end of the browser wars.
And let me tall you.
For the longest time html was a mess! They (the w3c) even canned the 3.0 version and went to 3.2 because things were so confused. And 4.0 and CSS took years before most browsers implemented it in a reasonable way.
You can still run into issues created by Netscape and Microsoft in the browser wars if you don't watch out...
But you're right about things turning out ok in the end.
Html is good, css is ok, the browsers conform better to the DOM every day, and xhtml is a true blessing!
But it sure was a rough ride! -
This just in!
-
See this letter I wrote to TimBL
Once you have put a page on the Web, you need to keep it there indefinitely.
How is this possible if you happen to lose control of the domain? I wrote a letter to Tim Berners-Lee about this issue.
In "Cool URIs don't change" you wrote that URIs SHOULD never change. However, you left some questions unanswered:
In theory, the domain name space owner owns the domain name space and therefore all URIs in it.
However, what happens when ownership of the domain name is suddenly removed from under a user's feet?
Except insolvency, nothing prevents the domain name owner from keeping the name.
Wrong. A trademark owner in Yakkestonia can drag a domain name owner into WIPO court and have the domain forcibly transferred. Under ICANN's dispute resolution policy, the plaintiff gets to pick the court, and WIPO has shown itself to find for the plaintiff in an overwhelming majority of cases. This is sometimes called "reverse domain name hijacking."
And in theory the URI space under your domain name is totally under your control, so you can make it as stable as you like.
Not if a hosting provider provides only subdomains (or worse yet, subdirectories) and does not offer an affordable hosting package that lets a client use his or her own domain name. Would it be reasonable to construe the "Cool URIs don't change" article as a warning against using such providers?
John doesn't maintain that file any more, Jane does. Whatever was that URI doing with John's name in it? It was in his directory? I see.
What is the alternative to this situation for documents that began their lives hosted on an ISP's or university's server space?
Pretty much the only good reason for a document to disappear from the Web is that the company which owned the domain name went out of business or can no longer afford to keep the server running.
Or that the hosting provider pulled the document under a strained interpretation of its Terms of Service because the company didn't like the document's content.
Is there an official W3C answer to these questions?
-
Hurrah for the w3c troll!
Thank goodness, you have my blessings !
Try running any .NET aspx page (asp.net)through a validator and see what you get! , makes /. look good :p
"The great thing about standards is there are so many to choose from" - A.non -
Slashcode's HTML vs. Microsoft HTMLIntroduction
Recently there has been some controversy over Slashdot's apparent disregard for browsers other than Konqueror or Netscape (*cough* IE *cough) ability to render the page, and some unfortunate crapfloods which would appear differently in different browsers.
The "editors" (I use the term loosely) of Slashdot appear to believe that Slashcode generates perfect HTML which any browser should render correctly, else the browser must be "buggy".
Slashcode's HTML Output
Just curious, I tried running the front page of Slashdot through the W3 validator to test this claim. The results were shocking.
Lets stick to the facts and drill down into the numbers. The W3 validator found HUNDREDS of errors on the very first page of Slashdot that you view every day. It terminated with the simple line Sorry, this document does not validate as HTML 3.2..
So, what is broken? Is it IE? Or is it the amateur garage-style open source code which is at fault? You be the judge.
Apparently, Slashcode follows the open source coding and testing ethic of "it worked for me". It's just too much to ask them to try to test their code for conformance and compliance, or even just try it on a variety of platforms.
Microsoft's HTML Output
Still curious, I tried running msn.com and microsoft.com through the validator. I was totally taken aback when the validator reported ZERO ERRORS in *either* of these pages.
Conculsion
1. It may benefit the coders to attempt to adapt to some kind of acceptable process for designing, writing, and testing their own code. Perhaps some professional experience would be beneficial here. Certainly an accountability for certain quality standards must be implemented.
2. Perhaps Slashdot should consider switching to IIS 5.0 or
.NET server and rewriting their code using a stable, reliable platform like Visual C++ or .NET. Perhaps only then will the browser compatbility issues will be resolved.These are just suggestions. I am here to help.
-
Slashcode's HTML vs. Microsoft HTMLIntroduction
Recently there has been some controversy over Slashdot's apparent disregard for browsers other than Konqueror or Netscape (*cough* IE *cough) ability to render the page, and some unfortunate crapfloods which would appear differently in different browsers.
The "editors" (I use the term loosely) of Slashdot appear to believe that Slashcode generates perfect HTML which any browser should render correctly, else the browser must be "buggy".
Slashcode's HTML Output
Just curious, I tried running the front page of Slashdot through the W3 validator to test this claim. The results were shocking.
Lets stick to the facts and drill down into the numbers. The W3 validator found HUNDREDS of errors on the very first page of Slashdot that you view every day. It terminated with the simple line Sorry, this document does not validate as HTML 3.2..
So, what is broken? Is it IE? Or is it the amateur garage-style open source code which is at fault? You be the judge.
Apparently, Slashcode follows the open source coding and testing ethic of "it worked for me". It's just too much to ask them to try to test their code for conformance and compliance, or even just try it on a variety of platforms.
Microsoft's HTML Output
Still curious, I tried running msn.com and microsoft.com through the validator. I was totally taken aback when the validator reported ZERO ERRORS in *either* of these pages.
Conculsion
1. It may benefit the coders to attempt to adapt to some kind of acceptable process for designing, writing, and testing their own code. Perhaps some professional experience would be beneficial here. Certainly an accountability for certain quality standards must be implemented.
2. Perhaps Slashdot should consider switching to IIS 5.0 or
.NET server and rewriting their code using a stable, reliable platform like Visual C++ or .NET. Perhaps only then will the browser compatbility issues will be resolved.These are just suggestions. I am here to help.
-
1300 lines of eula?
maybe a scheme similar to the Platform for Privacy Preferences (P3P) Project could be used to unclutter this mess.
but i doubt that software manufacturers of popular adware products would agree to a eula, that is brought in readable form stating "all your information are belong to us - click yes to let us install trojan profile-harvesting software" -
You must learn about functional languagesI wrote this big long post about language paradigms but my browser erased it. Thank you, IE.
Anyhow, what you've got to get through your mind is that XSLT is a functional language; it's not procedural. We're not in Kansas anymore, Toto. Functional languages are a different beast than what you're used to, i.e. procedural languages.
If you've taken a CS course on programming languages and they covered functional languages well than you don't have to learn anything, just realize that XSLT is functional and program accordingly.
If you haven't taken a CS programming languages course or didn't cover functional languages then you need to learn the functional paradigm. Think transformation, not variable assignment. Think functional composition, not a row of statements. Think recursion, not looping.
To learn it yourself, you're going to need a good book on functional programming, such as Haskell or Erlang. Both have open-source implementations available. This is not something you're going to learn in a day, it will take you a month of diligent study before you understand what you're doing. If you know rigorous logic well and are adept in discrete math and set theory you'll advance more readily.
In addition to learning functional programming, you'll have to learn the quirks of XSLT. For one, you need to have mastery of the specs for XML, XPath, and of course XSLT. Al these specs are quite readable: read them through, try out examples. You also need to be aware of the limitation of XSLT, especially in these early days of the 1.0 version (2.0 should be more useful). Often you'll need to either do processing before/after XSLT with Perl or something else, or directly incorporate JavaScript into your XSLT document. Be sure to understand inclusions and use XSLT library documents with useful functions.
Once you understand functional languages in general and XSLT in particular you'll be on your way to mastery of XSLT. You'll also learn alot along the way that will be useful even if you never touch XSLT again. In addition, you'll have experience in the maxim uttered by other posters that you must use the right tool for the right job. XSLT is superb for most common XML transformations (such as XML to HTML) and is far better than any other languages for doing this. But sometimes you need something else like Perl or Java for direct manipulation. Use the right tool for the right job.