Java IO Faster Than NIO
rsk writes "Paul Tyma, the man behind Mailinator, has put together an excellent performance analysis comparing old-school synchronous programming (java.io.*) to Java's asynchronous programming (java.nio.*) — showing a consistent 25% performance deficiency with the asynchronous code. As it turns out, old-style blocking I/O with modern threading libraries like Linux NPTL and multi-core machines gives you idle-thread and non-contending thread management for an extremely low cost; less than it takes to switch-and-restore connection state constantly with a selector approach."
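For readers who haven't written the "old-school" style in a while, here is a minimal sketch of thread-per-connection blocking I/O with java.io. It is not the code from Tyma's presentation; the class name `BlockingEchoServer` and the helpers `start`, `echo` and `roundTrip` are made up for illustration, and the echo protocol is just a stand-in for real work.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical example class: one blocking thread per client connection.
class BlockingEchoServer {
    // Starts the accept loop on a daemon thread. Port 0 asks the OS for any free port.
    static ServerSocket start(int port) throws IOException {
        ServerSocket server = new ServerSocket(port);
        Thread acceptor = new Thread(() -> {
            try {
                while (true) {
                    Socket client = server.accept();        // blocks until a client connects
                    Thread worker = new Thread(() -> echo(client));
                    worker.setDaemon(true);
                    worker.start();                         // one thread per connection
                }
            } catch (IOException closed) {
                // accept() fails once the server socket is closed; the loop ends here
            }
        });
        acceptor.setDaemon(true);
        acceptor.start();
        return server;
    }

    // Runs on the connection's own thread; the call stack is the only connection state.
    static void echo(Socket client) {
        try (Socket s = client;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line;
            while ((line = in.readLine()) != null) {        // blocks while the client is idle
                out.println(line);
            }
        } catch (IOException e) {
            // connection dropped; this thread simply exits
        }
    }

    // Tiny blocking client used to exercise the server.
    static String roundTrip(int port, String message) throws IOException {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
            out.println(message);
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = start(0)) {
            System.out.println(roundTrip(server.getLocalPort(), "hello"));
        }
    }
}
```

The point of the sketch is that an idle connection costs a parked thread and nothing else, which is the cost the summary says NPTL has made cheap.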
Of course old school techniques are faster. We don't drop old school because we want better performance, we drop it because we're lazy, and want easier ways to get the job done!
JDK7 will bring a new IO API that underneath uses epoll (Linux) or completion port (Windows). High performance servers will be possible in Java too.
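The API being referred to is presumably the asynchronous channel family in java.nio.channels. A rough sketch of an echo server in that completion-handler style, assuming the JDK 7 `AsynchronousServerSocketChannel` API; the class `AsyncEchoServer` and its helper methods are invented for this example, and which OS mechanism sits underneath depends on the JDK and platform.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;

// Hypothetical example class: callbacks fire when an accept, read or write completes.
class AsyncEchoServer {
    static AsynchronousServerSocketChannel start() throws IOException {
        AsynchronousServerSocketChannel server =
                AsynchronousServerSocketChannel.open().bind(new InetSocketAddress(0));
        acceptNext(server);
        return server;
    }

    static int port(AsynchronousServerSocketChannel server) throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    static void acceptNext(AsynchronousServerSocketChannel server) {
        server.accept(null, new CompletionHandler<AsynchronousSocketChannel, Void>() {
            @Override public void completed(AsynchronousSocketChannel client, Void unused) {
                acceptNext(server);                           // re-arm for the next client
                readNext(client, ByteBuffer.allocate(4096));
            }
            @Override public void failed(Throwable cause, Void unused) {
                // server channel closed; stop accepting
            }
        });
    }

    static void readNext(AsynchronousSocketChannel client, ByteBuffer buf) {
        client.read(buf, buf, new CompletionHandler<Integer, ByteBuffer>() {
            @Override public void completed(Integer n, ByteBuffer b) {
                if (n < 0) { close(client); return; }         // client hung up
                b.flip();
                writeAll(client, b);
            }
            @Override public void failed(Throwable cause, ByteBuffer b) { close(client); }
        });
    }

    static void writeAll(AsynchronousSocketChannel client, ByteBuffer buf) {
        client.write(buf, buf, new CompletionHandler<Integer, ByteBuffer>() {
            @Override public void completed(Integer n, ByteBuffer b) {
                if (b.hasRemaining()) { writeAll(client, b); }   // partial write: keep going
                else { b.clear(); readNext(client, b); }         // echoed; wait for more input
            }
            @Override public void failed(Throwable cause, ByteBuffer b) { close(client); }
        });
    }

    static void close(AsynchronousSocketChannel ch) {
        try { ch.close(); } catch (IOException ignored) { }
    }

    // Plain blocking client used to exercise the server.
    static String roundTrip(int port, String message) throws IOException {
        byte[] out = message.getBytes(StandardCharsets.UTF_8);
        byte[] in = new byte[out.length];
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port)) {
            s.getOutputStream().write(out);
            new DataInputStream(s.getInputStream()).readFully(in);
            return new String(in, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        try (AsynchronousServerSocketChannel server = start()) {
            System.out.println(roundTrip(port(server), "hello"));
        }
    }
}
```

No thread is parked per connection here; the channel group's pool runs the handlers as operations complete.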
Look at the timestamp of this presentation :) It's a bit of old news.
It was discussed here: http://www.theserverside.com/news/thread.tss?thread_id=48449
And it mostly shows that NIO is deficient. I encountered similar problems in my tests. Solved them by using http://mina.apache.org/ .
I'm not sure where or when NIO got equated with lower latency. The primary benefit of NIO (from my experience designing and deploying both IO- and NIO-based servers) is that it allows better concurrency on a single box, i.e. you can service many more calls / transactions on a single machine since you aren't limited by the number of threads you can spawn on that box (and you aren't limited as much by memory, since each thread consumes a fair number of resources on the box).
For the most part (and from my experimentation), NIO actually has slightly higher latency than standard IO (especially on heavily loaded boxes).
The question you need to ask yourself is: do you require higher concurrency and fewer boxes (cheaper to run / maintain) at the expense of slightly higher latency (which would work well for most web sites), or are your transactions latency-sensitive / real-time, in which case standard IO would work better (at the cost of requiring more hardware and support)?
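To make the concurrency argument concrete, here is a minimal sketch of the selector style: a single thread servicing every connection, with per-connection state reduced to a buffer attached to the key. The class `SelectorEchoServer` and its helpers are hypothetical, and the write path is deliberately simplified (a real server would register for OP_WRITE instead of looping when the send buffer fills).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical example class: one selector thread multiplexes all connections.
class SelectorEchoServer {
    static ServerSocketChannel start() throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(0));               // port 0: any free port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        Thread loop = new Thread(() -> run(selector, server));
        loop.setDaemon(true);
        loop.start();
        return server;
    }

    static void run(Selector selector, ServerSocketChannel server) {
        try {
            while (server.isOpen()) {
                selector.select(100);                        // wait up to 100 ms for ready channels
                Iterator<SelectionKey> ready = selector.selectedKeys().iterator();
                while (ready.hasNext()) {
                    SelectionKey key = ready.next();
                    ready.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            // the attached buffer is this connection's entire state
                            client.register(selector, SelectionKey.OP_READ,
                                    ByteBuffer.allocate(4096));
                        }
                    } else if (key.isReadable()) {
                        echo(key);
                    }
                }
            }
        } catch (IOException | CancelledKeyException e) {
            // server closed while the loop was running; fall through and clean up
        } finally {
            try { selector.close(); } catch (IOException ignored) { }
        }
    }

    static void echo(SelectionKey key) {
        SocketChannel ch = (SocketChannel) key.channel();
        ByteBuffer buf = (ByteBuffer) key.attachment();
        try {
            if (ch.read(buf) < 0) { ch.close(); return; }    // client hung up
            buf.flip();
            while (buf.hasRemaining()) ch.write(buf);        // simplification, see note above
            buf.clear();
        } catch (IOException e) {
            try { ch.close(); } catch (IOException ignored) { }
        }
    }

    // Plain blocking client used to exercise the server.
    static String roundTrip(int port, String message) throws IOException {
        byte[] out = message.getBytes(StandardCharsets.UTF_8);
        byte[] in = new byte[out.length];
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port)) {
            s.getOutputStream().write(out);
            new DataInputStream(s.getInputStream()).readFully(in);
            return new String(in, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocketChannel server = start()) {
            System.out.println(roundTrip(server.socket().getLocalPort(), "hello"));
        }
    }
}
```

Ten thousand idle clients here cost ten thousand small buffers rather than ten thousand thread stacks, which is the memory argument above; the price is that every read goes through the select-and-dispatch loop.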
The entire point of asynchronous I/O is to acknowledge that you will be waiting for IO, and to do something else useful rather than just wait. Asynchronous code will obviously end up taking more time because of the overhead of managing state and performing the switches, but the tradeoff is that something useful gets done while the IO takes a little longer, instead of nothing happening until the IO completes. Which method is best is completely application-specific.
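A small sketch of that tradeoff, assuming the JDK 7 `AsynchronousFileChannel` API: the read is started, the caller does unrelated work, and only then collects the result. The class `OverlapDemo`, the method `readWhileWorking` and the summation loop (a stand-in for "something useful") are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Future;

// Hypothetical example class: overlap a file read with unrelated computation.
class OverlapDemo {
    // Returns {bytesRead, sum}, where sum is the result of the stand-in work.
    static long[] readWhileWorking(Path file) throws Exception {
        try (AsynchronousFileChannel ch =
                     AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(64 * 1024);
            Future<Integer> pending = ch.read(buf, 0);   // starts the read, returns immediately

            long sum = 0;                                // stand-in for useful work
            for (int i = 1; i <= 1_000_000; i++) {
                sum += i;
            }

            int bytesRead = pending.get();               // block only for whatever time is left
            return new long[] { bytesRead, sum };
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("overlap", ".txt");
        try {
            Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));
            long[] result = readWhileWorking(tmp);
            System.out.println("read " + result[0] + " bytes, computed " + result[1]);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The bookkeeping (the Future, the extra handoff) is the overhead being described; whether the overlapped work is worth it depends on the application.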
You'll laugh, hysterically.
On Windows, the fastest way to do multithreaded I/O with a producer/consumer queue pattern is IO Completion Ports.
The fastest way to write a bunch of buffers to disk is WriteFileGather. The fastest way to read a bunch of data from disk is ReadFileScatter.
SQL Server uses these APIs to scale.
When I used to work at MS in evangelism, there was a big debate about how Unix does things one way, and Microsoft does it a COMPLETELY different way that you just can't #define away - it's just different. A guy named Michael Parkes said "I cannot go to these clients and say REPENT! and use IO completion ports! They do thread per client, because they have fork()".
When you listen to the technical explanations, the Microsoft way actually IS better - it's just that it's totally incompatible with everything else.
Learn IOCP and watch your context switches drop.
If you have multiple cores that do nothing otherwise (as in all benchmarks), multithreading will use them and asynchronous nonblocking I/O won't, so the maximum transfer rate for static data in memory over a low-latency network will always be higher with blocking threads.
In real-life applications, if you always have enough work to distribute between cores/processors, your nonblocking I/O process or thread will depend only on the data production and transfer rate, not on the raw throughput of the combination of syscalls that it makes. If output buffers are always empty, and input buffers are empty every time a transaction happens, then data transfer speed is already maxed out, and adding more threads that perform I/O simultaneously will only increase overhead. If it is not maxed out, the same applies to data queued before/after processing -- that is, if there is processing. So if worker threads/processes do more than copy data, then giving additional cores to them is more useful than using those cores for I/O.
This may be true for Java.
It isn't true for C/C++.
With C/C++ and NPTL, the many-thread blocking IO style yields slightly lower latency at low IO rates, but shows significant latency variability and sharply decreased throughput at higher IO rates.
It seems that the Linux scheduler is largely to blame for this: the number of times that a thread is scheduled on a different CPU increases dramatically with more threads, and this trashes the caches.
I've seen order-of-magnitude decreases in performance and order-of-magnitude increases in latency as a result of what appears to be the cache trashing.
FYI: Java will run on platforms that support C. Please see GNU gcj and the Classpath project.
So from a different perspective, Microsoft had to kill off Java to get anyone to use XNA, and this is supposed to be evidence of XNA's superiority?
...But I digress...
I don't think you quite got my point. Let's try a few more examples:
As should be painfully obvious by now, placing arbitrary restrictions on a comparison makes the comparison meaningless. Your original statement was that comparisons are null if the target systems aren't equal. Limiting the discussion to a single case where you know the comparison is flawed makes the comparison useless.
Instead, let's simply compare where the two technologies can be used. Java can be targeted for many systems. XNA can run on four. For any randomly-selected non-PC target platform, it seems the chance of Java working is significantly higher than XNA (or anything else, for that matter).
A more equally-weighted comparison is Java vs. .NET. Both are based on publicly-available specifications, and both offer similar functionality. I'd argue that neither is any better than the other in theory, though in practice Java has better support.
My understanding is that it is not supposed to be faster. It is non-blocking and asynchronous which serves a different need.