Using XML in Performance Sensitive Apps?
A Parser's Baggage queries: "For the last couple of years I've been working with XML-based protocols, and one thing that keeps coming up is the amount of CPU power needed to handle 10, 20, 30 or 40 concurrent requests. I've run benchmarks on both Java and C#, and my results show that on a 2 GHz CPU, the upper boundary for concurrent clients is around 20, regardless of the platform. How have other developers dealt with these issues, and what kinds of arguments do you use to make the performance concerns known to the execs? I'm in favor of using XML for its flexibility, but for performance-sensitive applications, the weight is simply too big. This is especially true when some executive expects and demands that it handle 1000 requests/second on a 1- or 2-CPU server. Things like stream/pull parsers help for SOAP, but when you're reading and using the entire message, pull parsing doesn't buy you any advantages."
I love XML, and I use it anywhere I can get away with it, but I know from my old job that switching to a binary protocol streamlined for the task at hand can give you performance gains over XML protocols that are just plain ridiculous.
I think the results we measured were something like 1000 times as many connections on a custom binary protocol as on an XML-based one.
That was in C++ mind you. YMMV.
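To make the comparison concrete, here is a minimal Python sketch of the same record carried as XML and as a fixed binary layout. The message shape (`trade`, `id`, `qty`, `price`) and the `!IId` layout are made up for illustration; they are not the protocol from my old job, and the sketch says nothing about the 1000x figure.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical message: a trade with an id, a quantity, and a price.
XML_MESSAGE = b"<trade><id>1042</id><qty>300</qty><price>19.75</price></trade>"

# Assumed fixed layout: unsigned 32-bit id, unsigned 32-bit quantity,
# 64-bit float price, network byte order (no padding).
TRADE_LAYOUT = struct.Struct("!IId")


def decode_xml(payload: bytes) -> tuple:
    """Parse the XML form: tokenize, build a tree, convert text to numbers."""
    root = ET.fromstring(payload)
    return (
        int(root.findtext("id")),
        int(root.findtext("qty")),
        float(root.findtext("price")),
    )


def encode_binary(trade_id: int, qty: int, price: float) -> bytes:
    """Pack the same three fields into the fixed-size binary record."""
    return TRADE_LAYOUT.pack(trade_id, qty, price)


def decode_binary(payload: bytes) -> tuple:
    """Unpack the binary record: one call, no text scanning."""
    return TRADE_LAYOUT.unpack(payload)


if __name__ == "__main__":
    trade = decode_xml(XML_MESSAGE)
    packed = encode_binary(*trade)
    print(len(XML_MESSAGE), "bytes as XML,", len(packed), "bytes as binary")
    print(decode_binary(packed) == trade)
```

The price you pay is the one everybody already knows: the binary layout is rigid, so adding a field means versioning the protocol on both ends.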
Give me liberty or give me kill -s 9
Have you profiled your application?
Do you test on a dedicated test system?
If you're only getting 20 concurrent users regardless of platform (could be, it really depends on the setup and complexity of the problem), maybe the technology isn't the problem; it could be the network etc.
Benchmarking is fine, but if you do it on the whole system you don't know what the problem really is.
Find out precisely what the problem is (network/XML parser/your app logic/db connection/db speed). Look at your own code with a profiler to see the bottleneck.
:-)
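As a rough illustration of separating parser time from application time, here is a small Python sketch using the standard `cProfile` module. The function names (`parse_request`, `handle_request`, `serve_once`) and the sample document are hypothetical stand-ins for your own request path; the same idea applies with a Java or C# profiler.

```python
import cProfile
import io
import pstats
import xml.etree.ElementTree as ET

# Hypothetical request body: 2000 small order records.
SAMPLE = "<orders>" + "".join(
    f"<order id='{i}'><total>{i * 1.5}</total></order>" for i in range(2000)
) + "</orders>"


def parse_request(payload):
    """Stage 1: XML parsing only."""
    return ET.fromstring(payload)


def handle_request(root):
    """Stage 2: stand-in for application logic working on the parsed data."""
    return sum(float(order.findtext("total")) for order in root.iter("order"))


def serve_once(payload):
    return handle_request(parse_request(payload))


def profile_report(payload, runs=20):
    """Profile repeated requests and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(runs):
        serve_once(payload)
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(10)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_report(SAMPLE))
```

Because parsing and handling live in separate functions, the cumulative column shows how much of each request goes to the parser and how much to your own code.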
If you do end up blaming the parser, change it! (And I don't mean using a different parsing method, as most use a SAX parser to generate the tree anyway.) There are parsers that are 50% faster than those used as standard (Xerces isn't the fastest Java parser around!). Also look at the most efficient way of using the tree (Java DOM is, as already said, slow in usage), or maybe you can go from SAX directly to your object model without using a tree, by building your own SAX parser.
If you can't get a performance gain (which I really doubt), be honest with your client: "If you want to do it that way it's going to cost you" or "it can't be done on one machine." How did they get the idea they could handle 1000s of requests a second anyway? Work on your expectation management (basically, work on making their expectations more realistic). If you promise mountains, make sure you can deliver them first. If you can't deliver them, make them want molehills instead of mountains.
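A minimal sketch of going from SAX events straight to an object model, written in Python with the standard `xml.sax` module rather than Java. The `Order` class and the `order`/`customer`/`total` element names are assumptions made up for the example.

```python
import xml.sax
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical application object; not from any real schema.
    order_id: int
    customer: str = ""
    total: float = 0.0


class OrderHandler(xml.sax.ContentHandler):
    """Builds Order objects directly from SAX events, with no tree in between."""

    def __init__(self):
        super().__init__()
        self.orders = []
        self._current = None
        self._text = []

    def startElement(self, name, attrs):
        self._text = []  # start collecting text for this element
        if name == "order":
            self._current = Order(order_id=int(attrs["id"]))

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        self._text.append(content)

    def endElement(self, name):
        text = "".join(self._text).strip()
        self._text = []
        if self._current is None:
            return
        if name == "customer":
            self._current.customer = text
        elif name == "total":
            self._current.total = float(text)
        elif name == "order":
            self.orders.append(self._current)
            self._current = None


def parse_orders(payload: bytes):
    handler = OrderHandler()
    xml.sax.parseString(payload, handler)
    return handler.orders


if __name__ == "__main__":
    doc = b"<orders><order id='7'><customer>Ada</customer><total>12.5</total></order></orders>"
    print(parse_orders(doc))
```

The handler only keeps the object currently being built plus the finished list, so nothing is allocated for nodes the application never looks at.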
So what you're saying is that you stopped using XML and used something completely different that has a visual similarity to XML.
Hint: if it doesn't do Unicode, DTDs, CDATA sections and all the other crap, it's not XML.
i-name =twylite [http://public.xdi.org/=twylite], see idcommons.net
Well, I wasn't really advocating writing your own XML parser, although if enough parameters are fixed (encoding, namespaces and such) and the DTD is simple, that might be an option. I was just trying to say that the parser does not have to be slow. Just try to find a SAX-style parser, one that lets you define events associated with tags (parsing on-the-fly) instead of one that slurps an XML file and produces a DOM-tree out of it. While the tree might prove more convenient (you can traverse it in all directions), its construction and destruction might be expensive.
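For a rough feel of the construction cost, here is a Python sketch that counts the same elements once with a SAX handler and once by building a DOM tree, and compares peak memory with `tracemalloc`. The document and the `entry` element name are invented for the example, and the numbers will vary by machine and parser.

```python
import tracemalloc
import xml.sax
from xml.dom import minidom

# Hypothetical document: 5000 small log entries.
DOC = (
    "<log>"
    + "".join(f"<entry n='{i}'>event {i}</entry>" for i in range(5000))
    + "</log>"
).encode("utf-8")


class EntryCounter(xml.sax.ContentHandler):
    """Reacts to <entry> start tags on the fly; keeps no tree."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "entry":
            self.count += 1


def count_with_sax(payload):
    handler = EntryCounter()
    xml.sax.parseString(payload, handler)
    return handler.count


def count_with_dom(payload):
    tree = minidom.parseString(payload)  # builds every node up front
    count = len(tree.getElementsByTagName("entry"))
    tree.unlink()  # tear the tree down again
    return count


def peak_bytes(func, payload):
    """Return (result, peak traced memory in bytes) for one call."""
    func(payload)  # warm-up run so lazy imports are not measured
    tracemalloc.start()
    result = func(payload)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    for label, func in (("SAX", count_with_sax), ("DOM", count_with_dom)):
        count, peak = peak_bytes(func, DOC)
        print(f"{label}: {count} entries, peak about {peak // 1024} KiB")
```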
With mod_perl, XML::LibXML, XML::LibXSLT, I EASILY get 100 per second, and my code is shitty.
Amen. All of my XML processing code for the last year has been written using the above-mentioned tools, and it's been fast enough that I haven't needed to spend time performance tuning.
See the Apache AxKit project for more info.
Maybe it's time someone wrote an intelligent pre-parser. Take a cursory look at the XML and pass it on to an appropriate parser based on encoding, DTD, size, etc. Or run the document through a pipeline, where every single request takes longer to process, but you can have several in the pipe at the same time.
There's no reason there has to be a single heroic XML parser that does everything.
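A minimal sketch of what such a pre-parser could look like, in Python. Everything here is an assumption for illustration: the 64 KiB threshold, the three strategy names, and the shortcut of only inspecting the first 512 bytes. It also assumes an ASCII-compatible encoding, so a UTF-16 document would need a byte-order-mark check first.

```python
import re

# Matches an XML declaration with an encoding pseudo-attribute.
XML_DECL = re.compile(rb'^<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']')

SMALL_DOC_LIMIT = 64 * 1024  # assumed cutoff between "small" and "large"
UTF8_BOM = b"\xef\xbb\xbf"


def sniff(payload: bytes) -> dict:
    """Take a cursory look at the head of the document without parsing it."""
    head = payload[:512]
    if head.startswith(UTF8_BOM):
        head = head[len(UTF8_BOM):]
    match = XML_DECL.match(head)
    encoding = match.group(1).decode("ascii").lower() if match else "utf-8"
    return {
        "encoding": encoding,
        "has_doctype": b"<!DOCTYPE" in head,
        "size": len(payload),
    }


def choose_parser(payload: bytes) -> str:
    """Pick a (hypothetical) parsing strategy from the sniffed facts."""
    info = sniff(payload)
    if info["has_doctype"]:
        return "validating"  # document declares a DTD, take the slow careful path
    if info["size"] > SMALL_DOC_LIMIT:
        return "streaming"   # too big to hold as a tree comfortably
    return "tree"            # small and simple, a tree is convenient


if __name__ == "__main__":
    print(choose_parser(b"<?xml version='1.0' encoding='ISO-8859-1'?><a/>"))
    print(choose_parser(b"<!DOCTYPE a SYSTEM 'a.dtd'><a/>"))
```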
A Government Is a Body of People, Usually Notably Ungoverned
So compress the XML. Since it's text, and usually very regular text, it compresses nicely. A simple pretuned Huffman filter will do wonders.
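A quick Python sketch of the idea using the standard `zlib` module. Note the swap: zlib's DEFLATE combines LZ77 matching with Huffman coding, so it stands in for the pretuned Huffman filter rather than being one. The sample document is made up.

```python
import zlib


def make_sample(n=200):
    """Build a hypothetical, very regular XML document with n rows."""
    rows = "".join(
        f"<row id='{i}'><name>user{i}</name><status>active</status></row>"
        for i in range(n)
    )
    return f"<rows>{rows}</rows>".encode("utf-8")


def compress_xml(payload: bytes, level: int = 6) -> bytes:
    """Compress the serialized XML before it goes on the wire."""
    return zlib.compress(payload, level)


def decompress_xml(blob: bytes) -> bytes:
    """Restore the original bytes on the receiving side."""
    return zlib.decompress(blob)


if __name__ == "__main__":
    raw = make_sample()
    packed = compress_xml(raw)
    print(len(raw), "bytes raw,", len(packed), "bytes compressed")
    print(decompress_xml(packed) == raw)
```

This trims bytes on the wire; the receiver still has to decompress and then parse the full document.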
A Government Is a Body of People, Usually Notably Ungoverned