Slashdot Mirror


Java IO Faster Than NIO

rsk writes "Paul Tyma, the man behind Mailinator, has put together an excellent performance analysis comparing old-school synchronous programming (java.io.*) to Java's asynchronous programming (java.nio.*) — showing a consistent 25% performance deficiency with the asynchronous code. As it turns out, old-style blocking I/O with modern threading libraries like Linux NPTL and multi-core machines gives you idle-thread and non-contending thread management for an extremely low cost; less than it takes to switch-and-restore connection state constantly with a selector approach."
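
For readers who have not written the "old-school" style the summary describes, here is a minimal sketch of a thread-per-connection echo server built on java.io and java.net. It is not code from the presentation; the class and method names (BlockingEchoSketch, start, roundTrip) are made up for illustration, and the loopback echo protocol is an assumption chosen to keep the example small.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch (names are illustrative): one blocking thread per connection.
final class BlockingEchoSketch {

    // Binds an echo server to an ephemeral loopback port and starts accepting.
    static ServerSocket start() {
        try {
            ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
            Thread acceptor = new Thread(() -> acceptLoop(server), "acceptor");
            acceptor.setDaemon(true);
            acceptor.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void acceptLoop(ServerSocket server) {
        while (!server.isClosed()) {
            try {
                Socket client = server.accept();          // blocks until a client connects
                Thread worker = new Thread(() -> serve(client), "connection");
                worker.setDaemon(true);
                worker.start();                           // the worker owns this connection
            } catch (IOException e) {
                return;                                   // listening socket closed
            }
        }
    }

    // All per-connection state lives on this thread's stack; nothing is saved or restored.
    private static void serve(Socket client) {
        try (Socket s = client;
             BufferedReader in = reader(s);
             PrintWriter out = writer(s)) {
            String line;
            while ((line = in.readLine()) != null) {      // blocks while the client is idle
                out.println(line);
            }
        } catch (IOException e) {
            // Connection dropped: the worker simply exits.
        }
    }

    private static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    private static PrintWriter writer(Socket s) throws IOException {
        return new PrintWriter(new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    // Client helper: sends one line and returns the echoed reply.
    static String roundTrip(int port, String message) {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = writer(s);
             BufferedReader in = reader(s)) {
            out.println(message);
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling BlockingEchoSketch.start() and then roundTrip(server.getLocalPort(), "hello") exercises one connection. A thread that is parked in readLine() costs the scheduler nothing until data arrives, which is the "idle-thread" case the summary refers to.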

26 of 270 comments

  1. And this is news? by Just_Say_Duhhh · · Score: 4, Insightful

    Of course old school techniques are faster. We don't drop old school because we want better performance, we drop it because we're lazy, and want easier ways to get the job done!

    --
    I need trepanation like I need a hole in the head.
    1. Re:And this is news? by bolthole · · Score: 5, Insightful

      naw, old school gets dropped simply because it's "old" (ie: not trendy/buzzword compliant).
      Many times, the "old school" way is EASIER than the newfangled way.

      Example: the 100-200 line perl scripts that can be done in 10 lines of regular oldfashion shell.

    2. Re:And this is news? by HFXPro · · Score: 4, Informative

      Except NIO is usually not as straightforward as java io. It isn't particularly hard to use either if you learn to use threads to handle the I/O and pass information through queues.
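
      One possible reading of "threads to handle the I/O and pass information through queues", sketched with java.util.concurrent: a dedicated thread blocks on the stream and hands each line to a BlockingQueue, and the consumer only ever touches the queue. The names (QueueHandoffSketch, readAll) and the end-of-input sentinel are assumptions for illustration, not anything the comment specifies.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: an I/O thread feeds a queue, a consumer drains it.
final class QueueHandoffSketch {

    // Sentinel marking end of input; a distinct object so it is compared by identity.
    private static final String EOF = new String("<eof>");

    // I/O thread: blocks on the stream and hands each line to the queue.
    private static void startReader(InputStream source, BlockingQueue<String> queue) {
        Thread reader = new Thread(() -> {
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(source, StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    queue.put(line);
                }
            } catch (IOException e) {
                // Treat a read error like end of input in this sketch.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                queue.add(EOF);   // cannot fail: readAll passes an unbounded queue
            }
        }, "io-reader");
        reader.setDaemon(true);
        reader.start();
    }

    // Consumer side: never touches the stream, only the queue.
    static List<String> readAll(InputStream source) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        startReader(source, queue);
        List<String> lines = new ArrayList<>();
        try {
            for (String item = queue.take(); item != EOF; item = queue.take()) {
                lines.add(item);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return lines;
    }
}
```

      The queue is the only shared state, so the consumer needs no locking of its own. A bounded queue would add back-pressure, but the sentinel hand-off would then need put() instead of add().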

      --
      Reserved Word.
    3. Re:And this is news? by TommydCat · · Score: 3, Funny

      select() sucked the life out of me in the 90s and I don't think I'll ever recover...

      --
      This comment does not necessarily represent the views and opinions of the author.
    4. Re:And this is news? by ShadowRangerRIT · · Score: 3, Insightful

      Asynchronous I/O is by no means easier. There's a hell of a lot more to keep track of, and a lot more work to do to make asynchronous I/O work correctly; synchronous I/O is much easier to code, and apparently it's faster on Linux to boot.

      --
      $_ = "wftedskaebjgdpjgidbsmnjgcdwatb"; tr/a-z/oh, turtleneck Phrase Jar!/; print
    5. Re:And this is news? by cosm · · Score: 4, Insightful

      Of course some old school techniques are faster. We don't drop old school because we want better performance, we drop it because we're lazy, and want easier ways to get the job done!

      Minor addition to your comment, for some may get the wrong impression if it gets modded up the chain.

      That is a bit of a generalization, and not necessarily accurate. I would say that heavily tested, tried and true techniques are faster. Libraries that fall into the aforementioned realm tend to be older, and hence have had more time for testing and refinement, but being old doesn't necessarily guarantee it will always be faster all of the time, as your comment implies.

      --
      'We are trying to prove ourselves wrong as quickly as possible, because only in that way can we find progress.' RPF
    6. Re:And this is news? by ShadowRangerRIT · · Score: 5, Insightful

      Example: the 100-200 line perl scripts that can be done in 10 lines of regular oldfashion shell.

      Clearly you're not using Perl the way it was meant to be used. This obsession with coding Perl the way you'd code Java (with classes/objects, libraries to do what shell utilities do, etc.) makes it very verbose. But if you use it the old way (quick and dirty scripts, no compunctions about calling to external shell utilities where they can do the job quicker, not bothering with use strict or use warnings, using the implicit variables shamelessly, etc.), Perl is, almost by definition, just as compact as shell. After all, if shell can do it, so can Perl, you just need to wrap it in backticks (and most of the time, Perl can do it natively with equal or greater compactness). Granted, when you code Perl like that it becomes more fragile and the code is hard to maintain. But then, so was the shell script.

      The problem with a lot of verbose Perl scripts is that the developers were taught to program Perl like C with dynamic typing (as I was initially, before I had to do it for a job and read Learning Perl and Effective Perl Programming cover to cover). I'm not completely insane, so I do code with use strict and warnings enabled, but I don't use the awful OO features, and even with the code overhead from use strict, my Perl scripts are usually equal to or less than 120% the length of an equivalent shell script (and often much shorter). Plus, using Perl means you don't need to learn the intricacies of every one of the dozens of shell utilities, most of your code can transfer to environments without the GNU tools (and heck, it doesn't explode if the machine you run on only offers csh and you wrote in bash), and most of what you're doing runs in a single process, instead of requiring multiple processes, piping text from one to another, constantly reparsing from string form to process usable form.

      --
      $_ = "wftedskaebjgdpjgidbsmnjgcdwatb"; tr/a-z/oh, turtleneck Phrase Jar!/; print
    7. Re:And this is news? by osu-neko · · Score: 3, Funny

      Perl is not New Fangled. I am sorry to say Perl is one of those .COM languages that sparked people's interest for a few years but has settled down to a niche language. So it is now an Old School Language... Sorry...

      :o

      GET OFF MY LAWN!

      --
      "Convictions are more dangerous enemies of truth than lies."
    8. Re:And this is news? by ShadowRangerRIT · · Score: 5, Informative

      .COM languages? You mean a web language? You do realize Perl was written as a replacement for sed, awk and the shell languages (csh, bash, etc.), to make systems administration easier by providing a single language that used a familiar, C-like syntax and made text parsing trivial. The web was a non-entity when Perl was created. The fact that it was an acceptable language for web development is tied to the initial design goal of parsing text quickly, but that was never the purpose of Perl, and the spread of the language was not solely (and not even primarily) due to its use on the web.

      --
      $_ = "wftedskaebjgdpjgidbsmnjgcdwatb"; tr/a-z/oh, turtleneck Phrase Jar!/; print
    9. Re:And this is news? by CODiNE · · Score: 4, Insightful

      In agreement with your post...

      As a recent article showed, traditional algorithms may be less optimal on modern systems with multiple layers of cache and various speed memory systems. New or old it's always important to benchmark and find the right tool for your particular needs.

      --
      Cwm, fjord-bank glyphs vext quiz
    10. Re:And this is news? by Ossifer · · Score: 4, Funny

      Tape a piece of cardboard over it.

    11. Re:And this is news? by Lunix+Nutcase · · Score: 3, Insightful

      In the past, successful developers were all highly skilled. It was a necessary trait for success both because development was difficult, and because there were so few ways to make money developing software. Unsuccessful developers stopped developing, and their code does not persist until today.

      You must not work with much legacy code. I've dealt with shitty code that is anywhere from a couple of years old to many decades old (a mix of C, Fortran, Ada, various assembly, etc). This notion that all old programmers were godlike gurus is mostly myth.

    12. Re:And this is news? by FoolishOwl · · Score: 3, Funny

      Some people are less afraid of SkyNet than they are of regular expressions.

    13. Re:And this is news? by Jeremi · · Score: 3, Informative

      but usually default to megabytes per thread, so if you have thousands of concurrent clients, you will soak up memory in fairly large quantities.

      There's an important distinction to make here: a thread's stack will reserve (so many) kilobytes/megabytes of address space, but it won't actually use up very much RAM unless/until the thread starts to actually use a lot of stack space (e.g. by doing a lot of recursion).

      On a 32-bit machine, starting too many threads can allocate all of your process's 2-4 gigabytes of address space, which can cause problems even though you have plenty of RAM still free.

      On a 64-bit machine, on the other hand, the amount of available address space is mind-bogglingly huge, so running out of address space isn't a problem you're likely to run into, even if you run a gazillion threads at once.
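
      The address-space arithmetic above can be sketched in a few lines of Java. The 1 MiB stack size and 2 GiB address space used in the usage note are assumed round numbers for illustration, not measurements, and StackReservationSketch is a made-up name. The four-argument Thread constructor is a real API, but its stackSize argument is only a hint that the JVM may round or ignore.

```java
// Hypothetical sketch: how many thread stacks fit in a given address space.
final class StackReservationSketch {

    // Address space reserved (not necessarily resident RAM) by `threads` stacks.
    static long reservedBytes(int threads, long stackBytes) {
        return Math.multiplyExact((long) threads, stackBytes);
    }

    // Upper bound on thread count, ignoring heap, code, and libraries that
    // also need address space in a real process.
    static long maxThreads(long addressSpaceBytes, long stackBytes) {
        return addressSpaceBytes / stackBytes;
    }

    // Runs a trivial task on a thread created with a requested stack size.
    // The JVM treats stackBytes as a suggestion and may round or ignore it.
    static boolean runWithStack(long stackBytes) {
        boolean[] ran = {false};
        Thread t = new Thread(null, () -> ran[0] = true, "small-stack", stackBytes);
        t.start();
        try {
            t.join();   // join() makes the write to ran[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ran[0];
    }
}
```

      With those assumed numbers, maxThreads(2L << 30, 1L << 20) gives 2048, which is why a few thousand threads can exhaust a 32-bit process even with RAM to spare. Requesting a smaller stack, as in runWithStack(256 * 1024), is one way to stretch that limit where the platform honors the hint.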

      --
      I don't care if it's 90,000 hectares. That lake was not my doing.
    14. Re:And this is news? by ShadowRangerRIT · · Score: 3, Informative

      I don't see how implicit variables are necessarily bad practice. It's a language convention. Programmers write for loops that iterate over variables named 'i' all the time, and it's usually accepted, if not condoned, even if it's just as lacking in descriptiveness as $_ (and $_ has a well defined role, where i is purely by convention). Perl uses $_ as the default loop variable, certain methods process it by default when provided no arguments, etc. If you know Perl well, it's quite natural. Similarly, @_ holds arguments passed to a function (having it be the default storage for return values is deprecated, so you don't see it all that often in other contexts), and shifting off it is standard. Don't assume your lack of familiarity means it's automatically poor style.

      --
      $_ = "wftedskaebjgdpjgidbsmnjgcdwatb"; tr/a-z/oh, turtleneck Phrase Jar!/; print
  2. Waiting for JDK 7 by Anonymous Coward · · Score: 4, Informative

    JDK7 will bring a new IO API that underneath uses epoll (Linux) or completion port (Windows). High performance servers will be possible in Java too.
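
    The comment does not name the API. Assuming it means the asynchronous channels that shipped in java.nio.channels with JDK 7 (AsynchronousServerSocketChannel, AsynchronousSocketChannel, CompletionHandler), a minimal echo sketch looks roughly like this. The class name AsyncEchoSketch and its helper methods are hypothetical, and error handling is reduced to closing the connection.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutionException;

// Hypothetical sketch: completion-callback echo server on the JDK 7 asynchronous channels.
final class AsyncEchoSketch {

    static AsynchronousServerSocketChannel start() {
        try {
            AsynchronousServerSocketChannel server = AsynchronousServerSocketChannel.open()
                    .bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            acceptNext(server);
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static int port(AsynchronousServerSocketChannel server) {
        try {
            return ((InetSocketAddress) server.getLocalAddress()).getPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void acceptNext(AsynchronousServerSocketChannel server) {
        server.accept(null, new CompletionHandler<AsynchronousSocketChannel, Void>() {
            @Override public void completed(AsynchronousSocketChannel client, Void att) {
                acceptNext(server);                       // re-arm for the next connection
                readThenEcho(client, ByteBuffer.allocate(1024));
            }
            @Override public void failed(Throwable exc, Void att) { /* server closed */ }
        });
    }

    private static void readThenEcho(AsynchronousSocketChannel client, ByteBuffer buf) {
        client.read(buf, null, new CompletionHandler<Integer, Void>() {
            @Override public void completed(Integer n, Void att) {
                if (n < 0) { close(client); return; }     // peer closed the connection
                buf.flip();
                writeFully(client, buf);
            }
            @Override public void failed(Throwable exc, Void att) { close(client); }
        });
    }

    private static void writeFully(AsynchronousSocketChannel client, ByteBuffer buf) {
        client.write(buf, null, new CompletionHandler<Integer, Void>() {
            @Override public void completed(Integer n, Void att) {
                if (buf.hasRemaining()) { writeFully(client, buf); }
                else { buf.clear(); readThenEcho(client, buf); }
            }
            @Override public void failed(Throwable exc, Void att) { close(client); }
        });
    }

    private static void close(AsynchronousSocketChannel ch) {
        try { ch.close(); } catch (IOException ignored) { }
    }

    // Client helper using the Future-returning overloads of the same API.
    static String roundTrip(int port, String message) {
        try (AsynchronousSocketChannel ch = AsynchronousSocketChannel.open()) {
            ch.connect(new InetSocketAddress(InetAddress.getLoopbackAddress(), port)).get();
            byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
            ByteBuffer out = ByteBuffer.wrap(bytes);
            while (out.hasRemaining()) { ch.write(out).get(); }
            ByteBuffer in = ByteBuffer.allocate(bytes.length);
            while (in.hasRemaining() && ch.read(in).get() >= 0) { /* keep reading */ }
            return new String(in.array(), 0, in.position(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

    No application thread is parked per connection here; the handlers run on the channel group's own threads when the OS reports completion. Whether that beats plain blocking threads is exactly what the article disputes, so treat this as an illustration of the API shape rather than a performance recommendation.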

    1. Re:Waiting for JDK 7 by binarylarry · · Score: 3, Informative

      Finally, all the world's enterprise systems can switch to Java... oh wait

      --
      Mod me down, my New Earth Global Warmingist friends!
  3. Old news. by Cyberax · · Score: 4, Informative

    Look at the timestamp of this presentation :) It's a bit of old news.

    It was discussed here: http://www.theserverside.com/news/thread.tss?thread_id=48449

    And it mostly shows that NIO is deficient. I encountered similar problems in my tests. Solved them by using http://mina.apache.org/ .

    1. Re:Old news. by binarylarry · · Score: 3, Informative

      Mina is great although the brains behind the project left and started a new project, Netty.

      I've heard from multiple sources that netty tends to outperform mina although I've been using mina with no problems.

      --
      Mod me down, my New Earth Global Warmingist friends!
    2. Re:Old news. by bill_kress · · Score: 4, Interesting

      I had a problem where the customer wanted to discover a class-b network in a reasonable amount of time.

      Aside from Java's lack of ping causing huge heartaches the limitation was that when using old Java IO it allocated a thread per connection while waiting for a response.

      This limited me to 2,000-4,000 outstanding connection attempts at any time. Since most didn't connect, I needed at least 3 retries on each with progressive back-off times--the threads were absolutely the bottleneck.

      I reduced the time for this discovery process from days (or the machine just locked up) to 15 minutes. With nio I probably could have reduced it significantly more (although at some point packet collisions would have become problematic).

      NIO may not be defective, it just may be solving a problem you haven't conceived of.

    3. Re:Old news. by Anonymous Coward · · Score: 5, Insightful

      Would that be the problem of never having heard about Nmap?

  4. NIO != lower latency by yvajj · · Score: 5, Insightful

    I'm not sure where / when NIO got equated to lower latency. The primary benefit of NIO (from my understanding, having designed and deployed both IO and NIO based servers) is that NIO allows you to have better concurrency on a single box, i.e. you can service many more calls / transactions on a single machine since you aren't limited by the number of threads you can spawn on that box (and you aren't limited as much by memory, since each thread consumes a fair number of resources on the box).

    For the most part (and from my experimentation), NIO actually has slightly higher latency than standard IO (especially with heavily loaded boxes).

    The question you need to ask yourself is... do you require higher concurrency and fewer boxes (cheaper to run / maintain) at the expense of slightly higher latency (which would work well for most web sites), or are your transactions latency sensitive / real-time, in which case using standard IO would work better (at the cost of requiring more hardware and support).
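
    To make the concurrency side of that tradeoff concrete, here is a minimal sketch of the selector style: one thread multiplexes every connection, and per-connection state (here just a ByteBuffer) is attached to the SelectionKey instead of living on a thread stack. SelectorEchoSketch and its methods are illustrative names, and the write loop is a deliberate simplification.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical sketch: a single selector thread serving all connections.
final class SelectorEchoSketch {

    // Starts the event loop on an ephemeral loopback port and returns that port.
    static int start() {
        try {
            Selector selector = Selector.open();
            ServerSocketChannel server = ServerSocketChannel.open();
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            Thread loop = new Thread(() -> eventLoop(selector), "selector-loop");
            loop.setDaemon(true);
            loop.start();
            return server.socket().getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void eventLoop(Selector selector) {
        try {
            while (selector.isOpen()) {
                selector.select();                        // blocks until some channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = ((ServerSocketChannel) key.channel()).accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            // Connection state travels with the key, not with a thread.
                            client.register(selector, SelectionKey.OP_READ, ByteBuffer.allocate(1024));
                        }
                    } else if (key.isReadable()) {
                        echo(key);
                    }
                }
            }
        } catch (IOException e) {
            // Selector failed or was closed: the loop ends.
        }
    }

    private static void echo(SelectionKey key) {
        SocketChannel client = (SocketChannel) key.channel();
        ByteBuffer buf = (ByteBuffer) key.attachment();  // restore this connection's state
        try {
            if (client.read(buf) < 0) { client.close(); return; }
            buf.flip();
            // Simplification: a production loop would register OP_WRITE instead of spinning.
            while (buf.hasRemaining()) { client.write(buf); }
            buf.clear();
        } catch (IOException e) {
            try { client.close(); } catch (IOException ignored) { }
        }
    }

    // Plain blocking client, used only to exercise the server.
    static String roundTrip(int port, String message) {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port)) {
            byte[] out = message.getBytes(StandardCharsets.UTF_8);
            s.getOutputStream().write(out);
            s.getOutputStream().flush();
            byte[] in = new byte[out.length];
            new DataInputStream(s.getInputStream()).readFully(in);
            return new String(in, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

    Every ready event has to look up its buffer and dispatch through the loop, which is the bookkeeping that can add latency, while the thread count stays at one no matter how many clients connect.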

  5. uh...... DUH?! by Michael+Kristopeit · · Score: 5, Insightful

    the entire point of asynchronous is to acknowledge you will be waiting for IO, and try to do something else useful rather than just wait... asynchronous will obviously end up taking more time because of the overhead of managing states and performing the switches, but the tradeoff is something useful was getting done while waiting for IO a little longer instead of doing nothing except wait for the IO to complete. which method is best is completely application specific.

  6. True for JAVA, but not generally true... by grmoc · · Score: 4, Interesting

    This may be true for Java.
    It isn't true for C/C++.

    With C/C++ and NPTL, the many-thread blocking IO style yields slightly lower latency at low IO rates, but offers significant latency variability and sharply decreased throughput at higher IO rates.
    It seems that the Linux scheduler is much to blame for this-- the number of times that a thread is scheduled on a different CPU increases dramatically with more threads, and this trashes the caches.
    I've seen order-of-magnitude decreases in performance and order-of-magnitude increases in latency as a result of what appears to be the cache trashing.

    1. Re:True for JAVA, but not generally true... by grmoc · · Score: 4, Interesting

      Unfortunately, nothing I can publish without permission.
      I can say that I'm in charge of maintaining the software that terminates all HTTP traffic for Google. Draw your own conclusions.

  7. Re:Should be using Scatter/Gather +IOCP on windows by dr2chase · · Score: 4, Interesting

    I'm afraid I have to disagree. No fan of Microsoft, but I helped build a the-Java-Programming-Language-TM Virtual Machine on Windows, with M:N threads, back before Java 1.4, and IO Completion ports worked well, and we got good performance out of them. We rewrote the network IO to work behind the curtain with threads, with the result that the one-socket-per-thread model actually did the I/O completion port thing, with as many as 32k Java threads running in a grand total of about a dozen Windows threads (stacks were small, stacks grew on demand. Certain things were tricky.).

    The largest wins of doing it this way were:

    1) got to use the underlying OS's preferred way of doing async IO (on another OS, we might do it differently)
    2) lots of threads allowed
    3) because Java "context switches" were extremely lightweight, lots of "expensive" stuff got faster (e.g., lock contention).

    I also accidentally (really -- I had to choose one of two threads to go first, and chose the right one, on a whim) built-in an anti-convoying heuristic for contended locks, that was really useful when code contained a hot lock.

    But, the rest of the system was not especially Microsoft-y; all of us came from a Unix background, and when we were done, we did Unix again. IO Completion ports, at least on Windows, were the best choice (and I tried it 2 or 3 other ways, and they sucked).