Using XML in Performance Sensitive Apps?
A Parser's Baggage queries: "For the last couple of years I've been working with XML based protocols and one thing that keeps coming up is the amount of CPU power needed to handle 10, 20, 30 or 40 concurrent requests. I've ran benchmarks on both Java and C#, and my results show that on a 2ghz CPU, the upper boundary for concurrent clients is around 20, regardless of the platform. How have other developers dealt with these issues and what kinds of argument do you use to make the performance concerns know to the execs. I'm in favor of using XML for it's flexibility, but for performance sensitive applications, the weight is simply too big. This is especially true when some executive expects and demands that it handle 1000 requests/second on a 1 or 2 cpu server. Things like stream/pull parsers help for SOAP, but when you're reading and using the entire message, pull parsing doesn't buy you any advantages."
<?xml version="1.0" encoding="UTF-8"?>z RKiQIXOSMfOskEqGRJJIpshcyjxLokLGIoRkCplDvP/zVhe/Dv /n+77d5/5+nude92zn7LPWXmt91mevvc+++1Vl5I7AhBB0+/v6 G5rpaNAwCBRiY3SRTkwMQid8wtza1NDO3M3UBAIDLk9C0Ajglz xEB4LEwuEQJAoN0SPcBsFh0Dg48F+yEAwUhUMC/6UCIVyfhuDQ CBRwLS4OoTO1NiH0DCH5x8XO9DwdICEchoPQQX//wNCQn78h1n Q0v1qQGDjuzzYUHIMFtSGxoGexSAQS3IZFgNpQ8A3akKg/2mCA NDBwG/rPZ2EwJO5PmWEwFBz0LByHAN0Hx6FB9yFR0D/1ANrgf+ oLA4wFB7ehQc9i4CjQOzBwDEgPLHAjqA2HBr0XB4eC25BIDKgN jQHpi8NB/3wHHJD6T1ngUAT2T92At4L8BQ4Y+M/3wmFQLBTUhg DZHA5DgcYeDsNCQc8Cwvw5pnA4HA16LzDMoHfAMWD5EFDwOxBw 5J8+DkcAwQBqw8DA/eEwoP6QgHagNiQK9CwSAwM/i0OAxhkFQ4 P6QyFAfg9HoeEgPVBAwP3ZhobiQGOP3sBGaBQK9F7ArUD3YcBY AgfcGfRewByg/jAYkD/DMThQ7MOxMAzoPiwS/CwWUATUhgU/i4 OCsAmACBho/HAosB44DAgTAbcC+R8CGP0/bY6AgrETAcWAxh4B xYFiEAGDQ8FtSMSfY4qAoTF/jh8ChoOCZIHDQPEBeAHuz3hDwM GxjwAGHyQzoAioPwQC/qePIxAoUPwiEBgsaEyRUBCOI5CA94La kDjQGCAxCNA7gNtAbSgYCCeBvAvyA4LIoPeiAID+sw0NA+UKBB oBwjoEGngY1IYFjxUGCAdQGxwNkg+zwRhgMAiQjTA4UCwAaA8F 9YfdwK+waCxIDywO7Pc4GCg3InAI8Pjh0FDws1hQHkRCoSAsRg Ie/acsSCgKlOORgEuC78OBxgoJgyP/lA8JhMef44KEYU">
<session session="2003-06-27T17:03:39GMT+08:00" session-serialNumber="06302003b01" encode-version="1.8"><structure id="bzip2"><info cdate="2003-07-12T14:57:07+08:00" expiry-date="" id="OBD12" mdate="2003-07-12T14:57:07+08:00" name="" notes="" organization="Sd7+/OtxQ==" version="1.0"/><content code="H4sIAAAAAAAAAMy9CThW2xc/rpQpYxKJvIakEu88IJk
Hint: The shorter the header, the faster.
P.S. This is a joke, for humor-impaired
1. I use DOM objects, in this case the MSXML free threaded model, to handle xml strings and read out the string only at the last point.
2. I would also suggest using wstring/string in the STL library as you can reserve string buffers in advance in case you have to handle the XML as strings, that's if your using c++, don't know much about c#/java sorry.
using this method I have manage to push it to ~200 concurrent requests.
mlati
well there's your problem.
With mod_perl, XML::LibXML, XML::LibXSLT, I EASILY get 100/per second. and my code is shitty.
what do you do with the XML, do you generate HTML from it with XSLT or what?
another thing to try: intelligently cache your results in shared memory. you can easily double performance or more.
I love XML, and I use it anywhere I can get away with it, but I know from my old job, that switching to a binary protocol that is streamlined for the task at hand can give you performance gains over XML protocols that are just plain ridiculous.
I think we the results we measured were something like 1000 times as many connections on a custom binary protocol over an XML based one.
That was in C++ mind you. YMMV.
Give me liberty or give me kill -s 9
This is an example of the wrong way to use XML.
XML is great because it's extensible and a markup language. It's great for storage, configuration files, and certain forms of data transmission (which is just a sub-set of storage).
What XML is not good for is performance-critical transmission protocols. It's too verbose and too complex, and both are bad for protocols. That is the mistake made by the author of the article. Go with a structured protocol and skip the XML.
"Times have not become more violent. They have just become more televised."
-Marilyn Manson