W3C launches Binary XML Packaging
Spy der Mann writes "Remember the recent discussion on Binary XML? Well, there's news. The W3C just released the specs for XML-binary optimized packaging (XOP). In summary, they take binary data out of the XML, and put it in a separate section using MIME-Multipart. You can read the press release and the testimonials from MS, IBM and BEA."
I was drownding in debt. There was no where to turn. My wife left me, my friends all left me. Even my dog, he left me too. I had to do something.
That's when I found Binary XML. They were able to help with the debt. They got the creditors off my back and got me back on my feet.
Thanks Binary XML!
(I thought this was going to be about a standardization of compressing XML files that got rid of the excess bloat in the markup.)
Here's my binary XML-like file format which gives the best of both text and binary file formats. It's human readable and efficient at the same time! Finally, an end to the text-versus-binary wars. Here's an example file:
The following data is in binary.
UH)(&T^( @#t79nui**&tb x9#@ $Y*_@$ji[P{O@JIOHXIOU$HIIU#$hiuoHOP$UJ [etc.]
As a software developer I find this particularly good.
While I myself would prefer to write a binary protocol and send the data through a TCP socket I can no longer do that.
When we land big contracts at work that deal in government and health the key thing they need now is interoperability with others. What does this mean? XML. Whether or not you like it, XML is here to stay. Its what everyone is pushing.
Therefore we had to adapt and start using it. Not just for B2B, our rich desktop clients now communicate with the server using XML web services.
The problem we've encountered is sending binary data. Right now we have to encode the data in base64 XML which uses lots of resources. I will give more look at this but it looks particularly good.
Unless I'm horribly misreading the specification, it appears to be a way to package up XML documents and binary data that they reference into a neat package with MIME - not a way to convert a (text) XML document into a binary one.
Whatever happened to the virtues of simplicity, like a file containing a header record detailing the field names, and rows containing the data in either fixed-length or delimited form? Damn fast to implement, debug, read from and write to. Parsing? What parsing? Read the first line, split it to get your headers, and read 1 line per record.
Ideal for data exchange. Easy to manipulate via javascript on the client. Simple to display and manipulate via the DOM (Document Object Model). Not resource-hungry. Handles both text and binary data. Dirt easy on the server.
I ran a test to compare, and I'm able to select, format, and serve 1000 records this way in less time than 100 records in simple HTML, never mind xml. By doing this, the client can page through, say, 25 records at a time without having to hit the server every few seconds to see the next/prev pages.
And they're going to do what, say "gzip it" ? The amount of bandwidth and CPU time this wastes is abysmal.
Someone needs to stop these people.
o/~ Join us now and share the software
This is simply a way to reference binary data from within an XML document and to have that binary data included in the same payload (using MIME).
Passing binary data in XML is a big problem. Everybody just invents their own method of doing it (although most are just variations on the theme presented here).
There is a need for this specicification but it is not ground breaking or even particularly /. newsworth.
Ummm...it's "OK". This is probably the least ambitious Binary XML spec imaginable. That may actually be good, but I don't know. Lets see what's up here...
First of all, it's completely impossible to stream this format. All the binary chunks have to be read at some point in the future when the actual XML non-opaque content is complete. In a stream, that never happens. (Of course, XML isn't the most stream friendly protocol...you can't validate a stream.)
Secondly, this isn't wonderful for large files either; you're constantly seeking for binary data that can be many megabytes away. We solve this in web pages by having the images be completely separate (binary) files.
Thirdly, its telling that they used a PNG as a data type. Besides being yet another file format that needs its own custom binary parser (heh, I like PNG, I'm just complaining about it in the XML whinespace), it's big and simple and there's just one there. One of the things I really liked about the various Binary XML formats was the degree to which they expressly typed things like arrays of floating point values or little-endian integers. Converting values between binary and string format is an enormously painful process, one that frankly I'm astonished hasn't received CPU acceleration at this point. Every other Binary XML format has seriously thought about how to efficiently but correctly manage large arrays of such values. XOP just says...heh...you wanna dump alot of data efficiently? Check your typing at the door. Feel free to bring a buffer-overflow ridden parser in with you if you like, though.
Don't get me wrong, there's a fundamental simplicity to XOP that I can certainly understand how it's appealing. But it seems to go so massively against what XML represents that I'm not entirely sure XOP encoded content deserves to be compliant with the very regulations that forced XML adoption in the first place: Opaque formats are too expensive to maintain for any amount of time, therefore either self-describe or don't get deployed. A self-decribing document that says "All performance-critical content is opaque" seems rather counter to this spirit.
"Remember the recent discussion on Binary XML? Well, this has nothing to do with it, but we are proud to present a standard for larding out XML even more before attaching it to an email."
I, for one, welcome our new bandwidth eating plaintext overlords.
Dave
I write a blog now, you should be afraid.