XML 1.1 Spec Hits Some Snags
oever writes "News.com reports that the new XML 1.1 specification defines a new newline character, making it incompatible with the 1.0 specification. Apparently, IBM has been pushing the new character to avoid having to modify their software, thereby invalidating everybody else's XML software."
I wonder if this will have any impact on MS plans for making the next generation of Office. AFAIK, they're planning to make all the applications work together through XML... Then again, it is "only" a newline character... :P
God does not play dice - Albert Einstein
Why don't they make new-lines overridable? Then IBM can put the override at the beginning of their files.
Considering what some other vendors have done to standards, one tiny addition (which is an improvement) proposed by IBM shouldn't be a big deal. Sure, it feeds the news hounds, but seriously, compare the scale of the impact of one desirable change to all the suffering caused by other such changes in emerging standards (Microsoft's in particular).
IBM has contributed so much, it's only natural that some changes might be characterized in the news as benefiting them more than other parties. Is anyone that worried about adding a new EOL character in 1.1 that XML 1.0 "chokes" on?
"Whoever would overthrow the liberty of a nation must begin by subduing the freeness of speech."--Benjamin Franklin
Fair point, but how many people do you think have actually used those characters in an XML1.0 document?
IBM would appear to be right, too, when they note that an application should look at the version identifier which is present at the top of the XML stream.
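To make that concrete, here is a rough sketch of what "look at the version identifier" could mean for a 1.0-only application. The names `declared_xml_version` and `can_parse` are made up for illustration, and a real parser does far more (byte order marks, encoding detection), so treat this as an outline only.

```python
import re

# Matches the version in an XML declaration such as <?xml version="1.1"?>.
# The declaration has to be the very first thing in the document, so the
# pattern is only tried at position 0. A leading byte order mark is not
# handled here.
_DECL = re.compile(r'<\?xml\s+version\s*=\s*(["\'])([0-9.]+)\1')


def declared_xml_version(text: str, default: str = "1.0") -> str:
    """Return the declared version, or `default` when there is no declaration."""
    match = _DECL.match(text)
    return match.group(2) if match else default


def can_parse(text: str, supported=("1.0",)) -> bool:
    """Hypothetical gate: refuse documents whose version we do not know."""
    return declared_xml_version(text) in supported


if __name__ == "__main__":
    print(can_parse('<?xml version="1.0"?><a/>'))
    print(can_parse('<?xml version="1.1"?><a/>'))
```

An application written this way rejects a 1.1 document up front instead of silently misreading its line ends.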
"The truth is that there are a lot of IBM mainframe systems out there, and they're very important," said Ronald Schmelzer, an analyst with ZapThink. "The truth is that this is not really for IBM's benefit, it's for IBM's customers' benefit. And I think that's fair. An international standard shouldn't change for the benefit of a company's future project, but it's clear that end-of-line characters are not a strategic business strategy for IBM."
is 0x156C in my programming area, 'nough said. EBCDIC is still alive. Did you know that about 90% of today's enterprise data is stored in EBCDIC chars? You better update the XML specs :)
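For anyone who wants to see where the EBCDIC newline ends up, here is a small sketch using Python's built-in `cp037` codec. That codec is only one EBCDIC variant, and real mainframe conversion tools may map things differently, so this is an illustration rather than a statement about any particular IBM system.

```python
# cp037 is one EBCDIC code page that ships with Python.
EBCDIC = "cp037"

# A short element followed by the EBCDIC newline byte 0x15.
ebcdic_line = "<a/>".encode(EBCDIC) + b"\x15"

decoded = ebcdic_line.decode(EBCDIC)
last_code_point = ord(decoded[-1])

# In this codec, byte 0x15 decodes to U+0085 (NEL), not to U+000A (LF).
print(f"EBCDIC 0x15 decodes to U+{last_code_point:04X}")

# LF itself lives at a different byte in this code page.
print("LF encodes to", "\n".encode(EBCDIC))
```

So a straight character-for-character conversion of such a file leaves U+0085 at the end of each line, which is the character XML 1.0 does not treat as a line end.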
Anybody care to explain to me _why_ we need so many different newline characters | sequences? I see a point in having a single \x0a character, because a newline is one character. I see a point in having \x0a\x0d and \x0d\x0a, because they represent more accurately how a typewriter does it (and conform better to the original ASCII standard, I think). However, one of these is kind of redundant, and history seems to have decided that this is \x0a\x0d. But why, for goodness's sake, do we need all those others??? Why is it that people always do things their own way instead of following standards that work fine???
Please correct me if I got my facts wrong.
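As a point of reference for the question above, this is roughly how the XML 1.1 line-end handling works as I read the spec: CR LF, CR NEL, a lone NEL, the Unicode line separator, and a lone CR all get turned into a single LF before the application sees the text. The function name `normalize_line_ends` is made up for this sketch.

```python
import re

# Order matters: the two-character sequences must be tried before the
# single characters, otherwise CR LF would turn into two newlines.
_XML11_EOL = re.compile("\r\n|\r\x85|\x85|\u2028|\r")


def normalize_line_ends(text: str) -> str:
    """Replace each recognised line-end sequence with a single LF."""
    return _XML11_EOL.sub("\n", text)


if __name__ == "__main__":
    sample = "a\r\nb\rc\x85d\u2028e\r\x85f\n"
    print(repr(normalize_line_ends(sample)))
```

Note that LF followed by CR is not one of the recognised pairs, so under this rule it comes out as two line ends.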
Does this mean that XML has reached the end of the line and it is time to start working on the next big thing?
Doesn't this make XML files uneditable with most editors, like vi, pico and gedit? They all use \n (byte 10) as the newline character.
Not really. The change isn't exactly huge; it makes XML a bit more consistent with regard to UTF, but I don't see it breaking anything other than for those who both:
TBH if you were that lax in specifying your XML version and character set, and then made use of non-printable characters that actually had known uses in the default charset, you deserve everything you get.
So I went off and read it (or at least, what appears to be it. There is a rant some way down the page you link to. Is that it?)
So anyway, I read it. Surprise, surprise, the guy doesn't offer any actual examples of where this change would cause a break in itself. All he basically does is cry that 0x85 is designated as a new line character, and how dare IBM do such a thing! Then he goes into a rant about IBM, monopolies and patents. Uh huh.
The fact is that 0x0085 is designated as NEL (NExt Line) as part of the Unicode specification. XML 1.1 treats it as a line end, which XML 1.0 did not. Therefore, if you are using XML 1.1, and you are using 0x85 and expect to see a grave a, your document isn't a Unicode-compliant document anyway, and you shouldn't be complaining that a non-compliant document doesn't work with a compliant parser.
If all these people want to use 0x85 in their XML 1.1 documents, then they'll have to properly convert them to Unicode as the specification allows. Surprising, that.
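To show why "0x85" means different things to different people, here is a quick sketch that decodes the same byte under a few codecs that ship with Python. Which code page any given poster's documents actually use is an assumption on my part; `decode_0x85` is just a helper name for this example.

```python
def decode_0x85(codec: str) -> str:
    """Decode the single byte 0x85 using the named codec."""
    return b"\x85".decode(codec)


if __name__ == "__main__":
    # latin-1 maps the byte straight to U+0085 (NEL); cp1252 gives a
    # horizontal ellipsis; the old DOS code page cp437 gives a grave a.
    for codec in ("latin-1", "cp1252", "cp437"):
        ch = decode_0x85(codec)
        print(f"{codec}: U+{ord(ch):04X}")

    # In UTF-8, NEL is two bytes, so a bare 0x85 byte is not NEL there.
    print("U+0085 as UTF-8:", "\x85".encode("utf-8"))
```

The grave a only shows up when the bytes are read with a DOS-style code page; once the text is properly converted to Unicode it is U+00E0 and has nothing to do with NEL.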
I'll admit that I don't know much about the technical side of XML (and I really can't see all of the great advantages to it, either), but since when does a parser care about whitespace? Wouldn't it make more sense to let the newline character match that of the underlying OS so people can actually TYPE those newline characters? Switching to Unicode is fine and dandy, but what about all of those legacy systems that don't support it?
Do you really need reason for beer? Wingman Brewers
There is only a rant on that page, no examples.
And you know what? I think an XML v1.1 document would be incompatible with any non-updated program, no matter what the changes in v1.1 are -- for if the program wasn't upgraded, it can't know what XML v1.1 means. And there must be some difference, otherwise it wouldn't have a different version number.
Jeroen
The Slashdot commentary has been pretty one-sided so I'll try and address the other side. First, IBM has said that this fix is for their mainframe customers, not for themselves. But nobody in the XML world has heard from these customers. As far as I know, no user has submitted a request for this NEL feature. No user has sent a message to the many XML mailing lists. No user has posted to Slashdot. Updating all of the XML parsers in the world is really expensive and if the mainframers don't care enough about the problem to storm the gates then maybe it isn't hurting them that badly. So from a democratic point of view, we're going to make life harder for the people who care enough to scream out loud in order to make life easier for the small minority who perhaps are not even that badly impacted.
Further discussion is on xml.com.
Okay...maybe I'm looking at this incorrectly, but...
If IBM's problem is that they don't want to force everyone to update their mainframes and cause them a headache...won't they still have to upgrade their mainframes to support XML 1.1 with new XML 1.1 compatible parsers?
Eric B
ebresie@gmail.com
I'm dealing with some cross-platform XML these days. It's generally pretty wonderful, but the newline character is something that drives me a bit batty. If anyone can bring some unity to this disunity, I'm sure that all of the XML world and the Java world would be better off. It's an anachronism.