Slashdot Mirror


Java Performance under Linux

krshultz writes "IBM has posted a great technical article on Java performance on its DeveloperWorks site. I learned a lot about Java and Linux in general." This is a nice big well-indexed article. Go.

44 of 141 comments

  1. How do other OSes do it? by Telcontar · · Score: 3

    How do the *BSD schedulers cope with that problem? What alternatives to an evaluation of the "Goodness functions" have been thought of?

    Is it maybe possible that one only makes a rough (heuristic) estimate of that function, maybe based on older (exact) values, which are only updated from time to time? The same goes for the ranking of the results of these functions (apparently much time is lost here). After all, with so many threads:
    a) one does not have to select the best process to run - choosing a good one is OK
    b) having a bigger data structure in the Kernel should not be a problem - the testing machines had 1 GB of RAM...
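
    A minimal Python sketch of the idea above, not anything taken from the Linux kernel. The `Task` fields, the toy `goodness` formula and the `refresh_every` interval are all assumptions made up for illustration. Exact scores are recomputed only every few scheduling decisions, and slightly stale cached values are used in between.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    counter: int    # remaining time slice (hypothetical field)
    priority: int   # static priority (hypothetical field)


def goodness(task: Task) -> int:
    """Toy stand-in for an exact goodness calculation."""
    return task.counter + task.priority


class CachedScheduler:
    """Recompute exact goodness only once every `refresh_every` picks."""

    def __init__(self, tasks, refresh_every=4):
        self.tasks = list(tasks)
        self.refresh_every = refresh_every
        self.picks = 0
        self.exact_evaluations = 0
        self.cache = {}

    def _refresh(self) -> None:
        for task in self.tasks:
            self.cache[task.name] = goodness(task)
            self.exact_evaluations += 1

    def pick(self) -> Task:
        if self.picks % self.refresh_every == 0:
            self._refresh()
        self.picks += 1
        # Still a linear scan, but over cached integers; keeping the cache
        # in a heap would also cut the ranking cost.
        return max(self.tasks, key=lambda t: self.cache[t.name])
```

    With three tasks and `refresh_every=4`, eight picks cost six exact evaluations instead of twenty-four. The price is that a task whose score changes between refreshes is ranked by its old value until the next refresh, which is the "good, not best" trade-off in point a).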

    1. Re:How do other OSes do it? by Wesley+Felter · · Score: 2

      Some other schedulers don't do any kind of O(N) "goodness" calculation over all the runnable threads. Simple real-time schedulers (which admittedly have other deficiencies) just choose the thread on the front of the highest non-empty run queue. Other schedulers use stochastic (i.e. partly random) methods to pick the next thread without having to look at all the runnable threads.
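
      Both alternatives can be sketched in a few lines of Python. This is an illustration under assumptions, not code from any real scheduler: the class name, the "higher number means higher priority" convention and the sample size `k` are all invented here.

```python
import random
from collections import deque


class PriorityRunQueues:
    """One FIFO queue per priority level; higher index = higher priority."""

    def __init__(self, levels: int):
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, thread_name: str, priority: int) -> None:
        self.queues[priority].append(thread_name)

    def pick_next(self):
        # Scan priority levels from highest to lowest, never individual threads.
        for queue in reversed(self.queues):
            if queue:
                return queue.popleft()
        return None  # nothing is runnable


def pick_sampled(scores: dict, k: int, rng: random.Random) -> str:
    """Stochastic variant: score only k randomly chosen runnable threads."""
    candidates = rng.sample(sorted(scores), min(k, len(scores)))
    return max(candidates, key=scores.__getitem__)
```

      The cost of `pick_next` depends on the number of priority levels, and the cost of `pick_sampled` on `k`; neither grows with the number of runnable threads.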

  2. Interesting... by jd · · Score: 5
    This is not the first time someone's commented on the Linux scheduler. There have been unofficial patches for it for some time, and there have been more than a few complaints as to the way it operates.

    There seem to be three directions people want to go with the scheduler - coarse-grain, fine-grain and real-time. Instead of arguing which is "best", why don't the developers do what they've always done in the past - put the stuff in, and use menu options to let people choose! If one (or two) of the options turn out to be really redundant, back them out! Nothing's lost, but a few cycles of human time. And it's better spent with code than with a flame-thrower.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    1. Re:Interesting... by Big+Jojo · · Score: 2

      ... or maybe not that interesting.

      The interesting bit seems to be reorganizing some of the task structure so it's cache-friendly in one stress case. OK, that's healthy. Applause; it'll create some breathing room in systems overloaded in that way.

      But that doesn't really look at the hard issues. There was handwaving about multi-level thread models; good thing there was a ref to Solaris threads there, which has had that for over six years now. That "many to many" threading stuff is also called "two level scheduling", which can be nice (cheap to switch between user mode threads, like Green threads) but isn't always good (excess kernel interactions on thread wakeups).

      If this is the start of ongoing work at IBM, I'll be pleased. Not many Linux folk have access to the sort of measurement tech IBM applied here. But don't overrate this specific contribution; take the cache-line patch (assuming it doesn't slow anything else up) and move to the next problem.

      Volano, for all the press it gets, isn't a very good benchmark.

  3. Linux threading implementation questions... by Malc · · Score: 2

    "The result, however, is that each Java thread in the IBM Java VM for Linux is implemented by a corresponding user process whose process id and status can be displayed using the Linux ps command. "

    This statement from the article seems to imply that threads are implemented as separate processes. But then the following statement implies that they are all in the same address space:

    "Instead, a special version of the fork() system call (also known as clone()) has been implemented that allows you to specify that the child process should execute in the same address space as its parent. "

    That's interesting. Threads typically need access to shared data, and having a separate process per thread would cause a performance hit accessing this data. However, they're all in the same parent process space, so does that mean they can share pointers (errrr, references)? What happens if I kill one of these threads (processes) from the command line (presumably the same as terminating a thread under windows, but easier to carry out as you don't have to write a program to do it)? Is it efficient to have the main scheduler scheduling threads? How do other OS's do it? I would have thought it bad as using a multi-threaded app would affect the performance of the whole system (but as an application writer, I guess I wouldn't want thread scheduling using up some of my process's time slice!)
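
    The "can they share references" question can be checked with a small experiment. The Python sketch below assumes a Unix-like system (it calls `os.fork()`), and it uses Python's `threading` module as a stand-in for a `clone()`-style thread that shares its creator's address space; the function names are made up for the example.

```python
import os
import threading


def thread_sees_shared_write() -> bool:
    """A thread appends to a list owned by its creator; the creator sees it."""
    shared = []
    worker = threading.Thread(target=shared.append, args=("from thread",))
    worker.start()
    worker.join()
    return shared == ["from thread"]


def forked_child_write_visible() -> bool:
    """A plain fork()ed child works on a copy; its write stays private."""
    shared = []
    pid = os.fork()
    if pid == 0:            # child process
        shared.append("from child")
        os._exit(0)         # leave immediately, skipping the parent's cleanup
    os.waitpid(pid, 0)      # parent waits for the child to finish
    return shared == ["from child"]
```

    The first function reports that the write is visible and the second that it is not, which is the practical difference between sharing an address space and getting a copy of one.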

  4. Multiple NICs by aheitner · · Score: 2

    You just need 1 web server process for each NIC; it's the equivalent of having a multithreaded TCP/IP stack for multiple NICs.

    That's reasonable, since the processor can fundamentally move much more data than the combined throughput of the NICs (if that's not true, you need a faster processor/better memory bus, of course).

  5. You misunderstand me. by aheitner · · Score: 2

    I mean a 2-10x speedup in anything.

    I can't imagine the scheduler is taking up more than the 20% it's taking up in this benchmark when you run Apache with a zillion processes.

    So if the scheduler were infinitely fast, Apache would get 20% faster. BFD. I'm more interested in changing the design of the webserver itself so it's 2-10x faster.
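
    The ceiling on that gain follows from Amdahl's law. A short Python sketch, taking the 20% figure from the benchmark discussion as the only input (the function name is chosen here for illustration):

```python
def overall_speedup(fraction: float, component_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when one fraction gets faster."""
    return 1.0 / ((1.0 - fraction) + fraction / component_speedup)


# The scheduler takes 20% of CPU time and becomes "infinitely" fast.
best_case = overall_speedup(0.20, float("inf"))

# The same scheduler made merely twice as fast.
twice_as_fast = overall_speedup(0.20, 2.0)
```

    Here `best_case` comes out at about 1.25 and `twice_as_fast` at about 1.11: removing the scheduler cost entirely cuts run time by 20%, which is 25% more throughput at best. Reaching a 2x overall speedup this way would need the optimized component to account for at least half of the run time.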

  6. No by aheitner · · Score: 3

    You can't rely on user-space thread switches, it's just too messy. Remember, in theory Win3.1 did user space timesharing, and we all know how much of a joke that was. If you're going to be doing more than one thing at once (conceptually of course) in a single userland context, you should structure your code appropriately for the grain you want and build an event structure or whatever is appropriate. Calls to yield are just flaky -- you should be able to structure things much more intelligently than that.

    For example, a webserver has the fundamental building block of a packet on the transport, which has an MTU. So a webserver ought to be able to build a grainsize based on sending at least one (or perhaps several if the packet size is very small, such as ATM; this would be decided by looking at the hardware at configtime or runtime on the server, not a big deal) packet, then considering which client to serve next. Of course it helps massively if you can collect intelligence on what type of connection the client has, i.e. don't try to send more than a few kilobit to modem users, etc etc.
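
    A rough Python sketch of that grain size idea, under simplifying assumptions: `mtu` is treated as the payload size per packet (headers are ignored), responses are already in memory, and the function and parameter names are invented for the example. Each client gets a few MTU-sized chunks per turn before the server moves on.

```python
from collections import deque


def interleave_by_mtu(responses: dict, mtu: int, packets_per_turn: int = 1):
    """Yield (client, chunk) pairs, serving each client a few MTU-sized
    chunks per turn before moving on to the next client."""
    pending = deque(
        (client, memoryview(data)) for client, data in responses.items()
    )
    grain = mtu * packets_per_turn
    while pending:
        client, remaining = pending.popleft()
        yield client, bytes(remaining[:grain])
        if len(remaining) > grain:
            # Unsent data goes to the back of the line.
            pending.append((client, remaining[grain:]))
```

    Choosing a larger `packets_per_turn` for small packet sizes, or a smaller grain for slow clients, would be the place to plug in the per-connection intelligence mentioned above.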

    It's bad in the first place if there has to be a VM choosing which code to generate (slow slow slow!). I'm only talking about the "real-code" (i.e. traditional, compiled webservers and other multiconnection servers) case here. I don't believe one has the right to complain about performance at all if one chooses to use Java, so even tho "green-threads" style threads might be appropriate and effective for Java, they're not very useful for a real application.

  7. Wait a minute by aheitner · · Score: 5

    I've got a fundamental disconnect here ...

    Okay, the Linux scheduler is slower than it could be. It is taking up "up to 20% of CPU cycles" in the very process-intensive (given that native threads are no lighter weight than processes) benchmark, 400-2000 processes.

    But there's a more fundamental problem: a 20% speedup isn't significant. I'm not saying we should abandon all speedups that don't affect asymptotic complexity; I'm just saying that I'm looking for speedups of at least 2x-10x before I'm impressed with anything. 20% is small stuff.

    There's a bigger issue here: this many processes will never be fast. The cost of a context switch is high given current processor designs, and is not likely to get lower. Even assuming that on a thread switch, since you're dealing with the same data as the previous thread was using, the TLB and code/data caches remain useful (on a process switch in general they don't, and refilling the caches is very expensive), you still have to store a whole bunch of stuff to memory for the old thread and bring a whole bunch of stuff out of memory for the new thread. And you've got to leave userland for a bit to do that. Slow slow slow slow.

    It seems to me that in general we need to reconsider the approach of relying on the operating system to schedule and share resources (in the case of chatservers and ftpservers and especially webservers, where we see the real performance hits for massive thread/process expenses). Right now all this stuff is based on the Berkeley sockets API, a high level network API (i.e. one that doesn't at all consider what the transport will be). This has been a tremendously successful API; it's used on all platforms (well I can't speak for sure for Mac :) and it can be reasonably argued that Berkeley sockets paved the way for the Internet.

    But the fact remains that your ethernet card is fundamentally a serial device. I have to wonder if it wouldn't be possible to write a webserver which does know about the transport for a change, and which could in only one process sit there putting packets onto the wire at a level much closer to the hardware, and therefore save a lot of expense in making the operating system arbitrate all these zillions of threads that want to share the connection.

    It would be an interesting project to say the least.

    1. Re:Wait a minute by Ryandav · · Score: 2

      very minor question here, all respect intended:

      >Or maybe one of the network card manufacturers. If they designed a special NIC with the IP stack implemented in hardware
      >or firmware, then the OS kernel wouldn't need any modification other than a simplified device driver module. I think that
      >would eliminate the bottleneck you're referring to.

      Isn't this what a winmodem is? Didn't they offer negligible performance increase?

      I bet I'm thinking something totally stupid, so please tell me what I'm missing..

      --
      Check my Go-related blog for beginners: DGD
    2. Re:Wait a minute by ralphclark · · Score: 2

      a 20% speedup isn't significant. I'm not saying we should abandon all speedups that don't affect asymptotic complexity; I'm just saying that I'm looking for speedups of at least 2x-10x before I'm impressed with anything. 20% is small stuff.

      I think you're being unrealistic about this. This is 20% across the board for modern server apps after all, and 20% is a pretty good improvement for a single kernel tweak. You have to consider the cumulative effect of a number of such tweaks. I have substantial difficulty even imagining a single tweak that could double performance let alone multiply it by ten times!

      The cost of a context switch is high given current processor designs, and is not likely to get lower. Even assuming that on a thread switch, since you're dealing with the same data as the previous thread was using, the TLB and code/data caches remain useful (on a process switch in general they don't, and refilling the caches is very expensive), you still have to store a whole bunch of stuff to memory for the old thread and bring a whole bunch of stuff out of memory for the new thread. And you've got to leave userland for a bit to do that. Slow slow slow slow.

      As I understand it, threads on Linux are implemented via lightweight processes, using the clone() syscall which basically does a fork without copying the majority of the execution environment. i.e. it runs within the original environment and only copies those bits which have to be unique to each thread. The overhead isn't nearly what it is to do a traditional process fork.

      I'm pretty sure threads are here to stay. They do make it a lot easier to design real time applications and scalable server applications.

      It seems to me that in general we need to reconsider the approach of relying on the operating system to schedule and share resources (in the case of chatservers and ftpservers and especially webservers, where we see the real performance hits for massive thread/process expenses). Right now all this stuff is based on the Berkeley sockets API, a high level network API (i.e. one that doesn't at all consider what the transport will be).

      Sockets do help to keep interfaces simple and standardised. Doesn't the modern Unix "streams" design (which IIRC standard Linux doesn't support yet) rely on something similar? Admittedly it's not very fast compared to other IPC methods available on a single machine.

      But the fact remains that your ethernet card is fundamentally a serial device. I have to wonder if it wouldn't be possible to write a webserver which does know about the transport for a change, and which could in only one process sit there putting packets onto the wire at a level much closer to the hardware, and therefore save a lot of expense in making the operating system arbitrate all these zillions of threads that want to share the connection.

      There are enough wacky people out there looking for something unique to do, that somebody will no doubt have a go. But I'm certain it will enjoy only limited popularity. It's just not the Unix Way, and the Unix Way is a pretty important reason for the success of Unix and Unix-alikes.

      Apart from anything else, it ties the application to a particular hardware configuration, effectively making the server it runs on into a proprietary piece of kit. For that reason, it seems to me that the most likely people to try it would be a hardware firm, maybe a server vendor like IBM.

      Or maybe one of the network card manufacturers. If they designed a special NIC with the IP stack implemented in hardware or firmware, then the OS kernel wouldn't need any modification other than a simplified device driver module. I think that would eliminate the bottleneck you're referring to.

      Consciousness is not what it thinks it is
      Thought exists only as an abstraction

    3. Re:Wait a minute by SurfsUp · · Score: 2

      Even assuming that on a thread switch, since you're dealing with the same data as the previous thread was using, the TLB and code/data caches remain useful (on a process switch in general they don't, and refilling the caches is very expensive), you still have to store a whole bunch of stuff to memory for the old thread and bring a whole bunch of stuff out of memory for the new thread. And you've got to leave userland for a bit to do that. Slow slow slow slow.

      Why? I don't see that. To switch threads in user space you push all the registers onto the stack, save the stack pointer, load the stack pointer of the next thread and pop its registers from the stack. It's just a few cycles, maybe 20 or so. If your thread is using floating point you've got more work to do - you have to save the FP context if another task was using it and load the FP context of the new thread. This isn't expensive because it doesn't happen often. (You do need to do some fancy dancing to detect automatically which threads are using FP and which aren't) You don't have to enter the kernel anywhere in this process, and that's a huge win. To make the user space task switches happen you sprinkle calls to a yield() function throughout the code, and at I/O points. This just doesn't eat much CPU time, compared to the enormous, odious cost of crossing into the kernel, killing the cache at a cost of several hundred cycles. And again when you cross back out. For Java, this approach is ideal because the VM has complete control of the code that gets executed - at load time it can insert the required yields as it sees fit.
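
      The yield-based switching described above can be imitated with Python generators. This is only an analogy under stated assumptions: Python saves a generator's frame rather than pushing registers onto a stack, and the `worker` and `run_round_robin` names are made up. What it does show is that the switch happens at explicit yield points and never asks the kernel to schedule anything.

```python
from collections import deque


def worker(name: str, steps: int, log: list):
    """A user-space 'thread': does one unit of work, then yields the CPU."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # cooperative yield point; no kernel involvement


def run_round_robin(threads) -> None:
    """Switch between generator-based threads entirely in user space."""
    ready = deque(threads)
    while ready:
        current = ready.popleft()
        try:
            next(current)       # resume the thread until its next yield
        except StopIteration:
            continue            # thread finished; drop it from the queue
        ready.append(current)
```

      Running thread A for two steps and thread B for three interleaves them as A:0, B:0, A:1, B:1, B:2. The weakness is also visible: a worker that never reaches a yield keeps the CPU for good, which is why the placement of yields matters so much.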

      --
      Life's a bitch but somebody's gotta do it.
    4. Re:Wait a minute by Azog · · Score: 2

      I agree with almost everything you say... but have one minor nit to pick.

      An individual ethernet card is fundamentally a serial device, as you say. But don't many large servers have several network cards? I know the big Mindcraft server benchmarks used quad CPU, quad ethernet cards.

      If you made a webserver that knew about transport, it would have to know about dealing with multiple ethernet cards, right? And what if it's not ethernet, but FDDI or something exotic? Perhaps this would be more trouble than it's worth? Especially if the gains were small - as you point out, 20% improvements aren't that big a deal.

      On the other hand, isn't there a web server-in-a-kernel module designed for pure speed? Maybe it knows about transport, or could be extended that way...

      Someone who knows more than me can step in now...

      --
      Torrey Hoffman (Azog)
      "HTML needs a rant tag" - Alan Cox
    5. Re:Wait a minute by noom · · Score: 2

      AFAIK, Oracle8i's approach is to abandon the idea of running a database application on top of a "general purpose operating system" since, typically, a database server only runs one application -- the database itself. So, Oracle8i can be thought of as a "specific purpose OS" optimized for database queries with a database app on top of it. Much faster.

      Can someone who knows something about the GNU Hurd kernel comment on how it manages network connections? I remember reading something about how just about everything can be managed by a user-level process in Hurd -- does this include the NIC?


      -NooM

  8. Idle criticism by jbert · · Score: 3

    Whilst it seems that these people are nice and thorough, a couple of points:

    1) If you are running one heck of a lot of processes/threads (same thing on Linux) you would expect the time spent in the scheduler to be big.
    That is unavoidable overhead of *all* thread models. (You can try and reduce it - that's good...but run enough threads and it will dominate).

    2) {I am not a hacker but} If they are at the level of seeing improvements in the scheduler by tweaking things like structure layout to improve cacheline locality, then can we be sure that the "low performance impact" IBM Kernel trace patch is not having an effect? What was the throughput (i.e. the main benchmark measurement) like on a stock kernel?

    3) If you move to a many-many scheduling model you *will* reduce the time spent in the kernel scheduler. However, you *will* spend time in your user-land scheduler. Which is the win?


    I don't mean to suggest that these people don't have some good points (I hope that they develop patches and I hope that the best patch wins), but it is important not to jump to conclusions.

    PS - I only skimmed the article, so I may have got the wrong end of the stick. I'm sure someone will put me right if so :-)

  9. Good for IBM. by Psiren · · Score: 2

    In essence:

    Here's what's wrong with Linux while running Java.
    Here's a patch to fix it.
    Thank you and good night.

    Nice to see big companies being so supportive. Who would have thought this would happen a year ago?

    "Sir, I'd stake my reputation on it."
    "Kryten, you haven't got a reputation."

  10. Developer Works is pretty cool... by Booker · · Score: 2

    I've found George Lebl's Making application programming easy with GNOME libraries articles on IBM's site to be a really good introduction to Gnome programming. Maybe old hat for some, but as a beginning Gnome hacker, it was very helpful for me. Good info on Glib, and the 3rd article has a great example of using libXML to handle XML data files....

    I think it's awesome that IBM hosts this information. Kudos to whomever made that decision at Big Blue!
    ----

  11. great technical article..and send feedback! by tuffy · · Score: 3
    I hope this patch, or something equivalent hacked out between IBM and Linus/Alan/etc. will make it into the 2.3 tree prior to 2.4. IBM looks to be continuing their great Java and Linux software development, much to everyone's benefit.

    And don't forget that little feedback thing at the bottom. Let IBM know these are the sorts of things we like to hear!

    --

    Ita erat quando hic adveni.

  12. Re:Java and real-world apps by toriver · · Score: 2

    [Fud deleted]

    Java is not interpreted, it's (usually) compiled (to bytecode - just like Perl and Python). The target platform just happens to be a virtual machine sitting on top of another system instead of being hardware. But why should one spend extra time writing in "C in wolf's clothing" when a more suitable language is there, just because you want a millisecond faster response time for a mouse click? Also: The JIT mechanism makes it possible to have efficient runtime compilation, where the "second-step" compiler can make adjustments based on actual runtime behaviour, which is impossible in C++ because it's compiled one place and run somewhere else.
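
    The "compiled to bytecode" point is easy to see in Python, one of the languages named above. The sketch below uses only the standard `dis` module; the `add` function is just a throwaway example, and the exact instruction names vary between Python versions.

```python
import dis


def add(a, b):
    return a + b


# CPython compiled the function body to bytecode when `add` was defined.
bytecode = add.__code__.co_code

# Human-readable names of the virtual machine instructions.
instructions = [ins.opname for ins in dis.get_instructions(add)]
```

    `bytecode` is a plain `bytes` object that the virtual machine executes, in the same sense that a Java compiler emits bytecode for the JVM.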

    And C++ is such a mess of syntax that "efficient" and "C++" should not occur in the same sentence.

  13. Why not add non-blocking I/O to Java?? by SurfsUp · · Score: 2

    because the Java language lacks an interface for non-blocking I/O, threads are especially necessary in constructing communications intensive applications in Java

    Wouldn't it be easier to add an interface for non-blocking I/O to Java??? Err - isn't it supposed to be an inherently multitasking language? Sheesh. This is an excellent example what's wrong with Sun's dog-in-the-manger attitude to the Java spec. "We already defined that, it can't be changed, fix it some other way! No, we're too busy defining new multimedia api's to take a look at it"

    --
    Life's a bitch but somebody's gotta do it.
    1. Re:Why not add non-blocking I/O to Java?? by SurfsUp · · Score: 2

      recall that the only way to efficiently use non-blocking I/O is to select the file descriptors to wait on. This is tricky enough that making threading easier was probably the right thing to do.

      Providing an efficient, easy means of handling non-blocking I/O is no more difficult than handling any other kind of asynchronous event, such as a mouse click. The user program can register a handler, and each time an I/O is done the VM calls the handler. I don't see that as being hard, either in implementation or usage.
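
      The register-a-handler model can be sketched with Python's standard `selectors` module, which wraps the select-style calls mentioned in the quoted text. This is an illustration of the pattern, not a proposal for a Java API; `dispatch_ready`, `demo` and `on_readable` are names invented for the example.

```python
import selectors
import socket


def dispatch_ready(sel: selectors.BaseSelector, timeout: float = 1.0) -> int:
    """Call the registered handler for every socket that is ready to read."""
    events = sel.select(timeout)
    for key, _mask in events:
        key.data(key.fileobj)   # key.data holds the handler we registered
    return len(events)


def demo() -> list:
    received = []

    def on_readable(sock):
        received.append(sock.recv(1024))

    left, right = socket.socketpair()
    left.setblocking(False)
    right.setblocking(False)
    sel = selectors.DefaultSelector()
    # Registration ties the handler to the socket, like a mouse-click listener.
    sel.register(right, selectors.EVENT_READ, data=on_readable)
    try:
        left.send(b"hello")
        dispatch_ready(sel)
    finally:
        sel.unregister(right)
        sel.close()
        left.close()
        right.close()
    return received
```

      One thread calling `dispatch_ready` in a loop can service many sockets, with the file descriptor bookkeeping hidden behind the registration call.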

      --
      Life's a bitch but somebody's gotta do it.
  14. IBM "Gets it" by SurfsUp · · Score: 2

    I really can't think of enough good things to say about IBM's open-source efforts. They're spending real money, and throwing good people at it. IBM has always known what quality control is about, and they're bringing that to the party. They're putting in whole products like Jikes and (more importantly from my point of view) the Jikes parser generator. They're doing it without taking a holier-than-thou attitude, and without "cute" licenses like the SCSL. It's really hard to believe the old T.Rex of yore has turned cuddly.

    --
    Life's a bitch but somebody's gotta do it.
  15. Re:Obvious question #1 by JohnZed · · Score: 2

    The issue is that there is actually very little difference in the Linux kernel between threads and processes. They're all lumped into the same scheduler, and that's what creates the scalability problem. Other OSes, like Solaris, schedule threads within their respective processes.
    --JRZ

  16. Re:"Why threads Are A Bad Idea (for most purposes) by JohnZed · · Score: 3

    Curiously enough, two years later Ousterhout turned around and touted TCL's threading features as a major advantage that it enjoys over Perl.
    I've programmed a fair amount with both threads in Java and non-blocking I/O in C, and the one-thread-per-connection model is VASTLY easier to program, maintain, and use. Non-blocking I/O leads to code that's extremely non-linear, and much more confusing, than multithreaded implementations. It's like having to work with code that uses a million goto's; you never know where you'll be executing next. Threading, on the other hand, achieves the same benefits, but it lets the programmer work at a higher level of abstraction.
    Are C++ and Java broken because they use, for example, object-oriented representations of streams rather than a series of calls to "write" on a file descriptor? Well, this difference does cause a performance impact. But if you can get your product to market twice as quickly by using technologies that extract a 15% performance hit, isn't that worth the difference? As operating systems improve more and more to cooperate with sophisticated threading models, the performance hit for using them will continue to decline.
    Rather than sticking our heads in the sand and saying, "Well, there's another, more confusing, less modern way to do it that doesn't require us to change the way we've done things for years," let's actually try to find ways to make programming easier AND produce a high-performance result.
    --JRZ

  17. More on thread mappings by JohnZed · · Score: 5

    Interestingly enough, a heated thread on a related topic cropped up in the kernel-dev mailing list the other week. Check out Kernel Traffic for the details, but basically it had to do with some SGI engineers who wanted to make a change in a threading mechanism to facilitate 3D graphics performance on Linux. Linus explained that he felt their method was, basically, an unmaintainable, inelegant hack that has crept its way into Irix for marketing purposes but will never be in the Linux kernel.

    The relevant thing in relation to the IBM article is Linus' discussion of the philosophy of fork() and how strongly committed he is to this model. He's stated quite often, in fact, that this thread scheduling mechanism (which schedules threads as separate processes) is a very intentional part of the kernel design.

    Personally, I think this opinion will pretty much have to change over time when people are able to demonstrate very elegant patches for the many-to-many threading model discussed in the IBM article. In fact, if I remember correctly, this is the sort of threading model that TowerJ uses in their native Java compilation system to achieve such great scalability on Linux. You can find plenty of examples of in-process scheduling code if you're interested in checking it out: GNU portable threads is the first one that comes to mind, but almost every Java implementation offers this model as an option (green threads). The method IBM is talking about combines this intra-process tactic with the current, inter-process scheduler.

    It just makes sense that if you have 10,000 processes in a queue and you have to recompute goodness for each every time you enter the schedule, this will be a less scalable approach than if you'd created 100 processes with 100 threads each, so that thread_goodness only needs to be computed when that particular process is entered. Think about the management of a large corporation: does the top management allocate resources, set timetables, and otherwise schedule every single employee? No, they schedule a number of departments and projects, then the next level of managers schedules each of the employees within those.
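
    The arithmetic behind that comparison, plus a toy two-level pick, in Python. The 10,000 and 100 x 100 figures come from the paragraph above; the function names and the idea of keeping a separate goodness value per process are assumptions made for the sketch.

```python
def flat_evaluations(total_threads: int) -> int:
    """Goodness evaluations per decision with every thread in one queue."""
    return total_threads


def two_level_evaluations(processes: int, threads_per_process: int) -> int:
    """Evaluations per decision: pick a process, then a thread inside it."""
    return processes + threads_per_process


def pick_two_level(process_goodness: dict, thread_goodness: dict) -> tuple:
    """Choose the best process first, then the best thread within it."""
    process = max(process_goodness, key=process_goodness.get)
    threads = thread_goodness[process]
    thread = max(threads, key=threads.get)
    return process, thread
```

    For 10,000 threads the flat scan touches 10,000 entries per decision, while 100 processes of 100 threads each need 100 + 100 = 200, a fifty-fold reduction, at the cost of running a second scheduler inside each process.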

    So far, I think this has been much less of an issue not just because Linux hasn't been focused on the enterprise space (where scalability to tens of thousands of threads is crucial), but more because the key server-side applications in Linux (Apache, etc), have been multi-process rather than multithreaded. Now, with the increase in multithreaded apps from Java (say what you will about the language, it makes threading MUCH easier than C) and, for example, the new Apache process models, we'll start to see serious real-world performance benefits for those OSes that have the best thread scalability. Linus, being the bright guy he is, will surely pick up on this and make whatever changes are necessary. At least, that's the way I see it working out. --JRZ

  18. No one solution for everything. by brad.hill · · Score: 2
    As always, it depends on what you're doing. I'm still a Java evangelist to the nth degree, but I wouldn't think about writing a standalone GUI app in it yet.

    Nevertheless, in certain contexts, Java is the fastest thing around. In well thought out application frameworks (e.g. servlets) Java gives you better performance than you can get with C or C++ unless you're willing to invest years of work to duplicate those frameworks yourself.

    On new architectures and as SMP becomes ubiquitous, Java's pervasive thread awareness and runtime optimization have enormous potential that is only beginning to be tapped.

    As for those areas where it's just not there yet (client GUI apps, for example) you might look into Eiffel. It's a strongly OO language with a clean and simple syntax, garbage collection and compilers that target both native code directly and portable C code that can be run through a mature optimizing C compiler.

  19. Re:VM's will always be slow by noom · · Score: 3

    There are compilers available for linux (TowerJ being one) but their primary benefit is for server-side code; it'd be much more difficult (but not entirely impossible -- proof-carrying-code would work) to ensure safety if you distribute binaries to clients. Indeed, the whole point of using platform independent byte-codes is so that the JVM can ensure safety. Platform-specific machine code running on a server will probably coexist with platform-independent java byte-codes for client applications.

    -NooM

  20. KornBizkit? by FascDot+Killed+My+Pr · · Score: 2

    What if Larry Wall had called his language "BeeGeesAirSupply"? Would you want to use it?
    ---
    This comment powered by Mozilla!

    --
    Linux MAPI Server!
    http://www.openone.com/software/MailOne/
    (Exchange Migration HOWTO coming soon)
  21. AWESOME! by FascDot+Killed+My+Pr · · Score: 5

    This article gave me a hard-on.

    It's not so much about Java. It's mostly about threading under Linux. The meat of the article is about how to improve the scheduler.

    But the BEST part was the scientific attitude AND clear explanation (and proof) of the issues. This is EXACTLY what Linux needs. Maybe IBM would like to fund an idea I've had for a while:

    Set up a lab that does nothing but Linux benchmarking. This lab would research things like the scheduler issue from this article, memory access patterns, filesystem layout, etc. All of this research would be available to the public for kernel development, third-party developers, benchmarketing (and rebuttals thereof), etc. The lab could also provide patches to "fix" issues, but that would be of secondary concern. The main purpose would be to supplement the (usually excellent) intuition of the kernel programmers with some hard science.

    To do it right this should really be a separate non-profit, but it could start out as an internal project at some large company.
    ---
    This comment powered by Mozilla!

    --
    Linux MAPI Server!
    http://www.openone.com/software/MailOne/
    (Exchange Migration HOWTO coming soon)
  22. Re:VM's will always be slow by Le+douanier · · Score: 2


    It would be interesting to see how a Crusoe chip would perform with a Java bytecode instruction set.

    Normally it should be morphing it as fast as the x86 instruction set, thus giving Java applications the performance of compiled code?

    Can someone more knowledgeable than me comment on this???

    --
    "The obvious mathematical breakthrough would be development of an easy way to factor large prime numbers." Bill Gates,
  23. Re:VM's will always be slow by rullskidor · · Score: 2

    Be patient, gcj, which is a part of gcc nowadays, does that, if it gets a proper libgcj that is (libgcj = classes and stuff). I wouldn't expect anything from sun though...


    "Now if just someone could get on the stick and create a Java-like language that compiles directly to run on bare metal. Meanwhile, I'm painfully relearning C and C++ to get the kind of performance I need out of my applications. "

    --
    De lyckliga slavarna är frihetens bittraste fiender, legalisera!!!
  24. Re:Kaffe by rullskidor · · Score: 2

    It has evolved and works great!

    In the near future, in about half the time it will take Sun to release a buggy and outdated JDK for Linux ;), we'll be able to run Java in a totally free runtime, compiled either to bytecode or to native code.
    Have a look at these and see for yourselves:

    http://www.gnu.org/software/classpath/classpath.html

    http://www.japhar.org/

    http://www.transvirtual.com/kaffe.html

    http://sourceware.cygnus.com/java/

    --
    De lyckliga slavarna är frihetens bittraste fiender, legalisera!!!
  25. Java performance by harmonica · · Score: 2

    The performance decrease introduced by a VM can be important for some programs (many GUI-style apps will *not* suffer from it). In that case you can use either a source-to-native or a bytecode-to-native compiler. There are commercial and free products available (this list includes some).

    Another workaround is the JNI, which lets you include non-Java code (e.g. C). It is used for the ZIP I/O, as an example.

    There are lots of numbers out there comparing C++ to Java performance, where C++ has a mere 10% advantage. I'm not sure this is always true, but with Sun's 1.3 beta and HotSpot or IBM's VM, you do get pretty decent performance.

  26. good sign by garver · · Score: 2

    What I find most amazing about this article has nothing to do with the Linux scheduler nor Java.

    The amazing part is that in the process of porting a product to Linux, IBM has taken the time to formally look into how to make it faster. That in itself isn't earth-shattering, but what they did next is: they presented an open solution and, on top of that, a patch!

    It would have been much easier for them to simply complain that the scheduler was too slow and possibly not port Java (or quit development) until the "Linux community" fixed it. IBM's approach shows that they (or at least those who wrote this paper) consider themselves part of the "Linux community" and are willing to work within it.

  27. Re:VM's will always be slow by Bazzargh · · Score: 2
    Sure VMs will always be slow. But that is no reason to abandon platform independent binaries. Take a look at the research on Slim Binaries for Oberon: http://caesar.ics.uci.edu/oberon/research.html

    "This paper presents an alternative approach based on "slim binaries", files that contain a target-machine-independent program representation from which native code is generated on-the-fly at load-time. The slim binaries used in our implementation are based on adaptive compression of syntax trees, and not on a virtual-machine representation such as p-code or Java byte-codes. They are highly compact and can be read from a storage medium very efficiently, significantly reducing the I/O cost of loading. The time thus saved is then spent on code generation, making the implemented system fast enough to compete with traditional loaders."

    Many languages (e.g. Perl, Smalltalk, elisp, VB, Java) rely on something VM-like at an underlying level. I think a lot of brain cycles are being wasted on reinventing VMs, freeze/thaw mechanisms, portable binaries, etc. Part of the problem is that the VMs are too closely tied to the languages. It would be a Good Thing if there were a project looking at separating out the pieces for on-the-fly-compiled and interpreted languages, much as EGCS does for compiled languages.

    For such a system to catch on it would have to be a clear winner over existing VM systems. The Oberon Slim Binaries are just such a winning technology. With perl already being rewritten from scratch as Topaz, and pretenders banging at the door for elisp (CLOS and guile versions of emacs are in the works), is this such an impossible dream?

    (yeah yeah I know it is. don't mention 'eval'...)

    -Baz, living on another planet, as usual.

  28. Re:VM's will always be slow by javatips · · Score: 4

    I have been developing in Java since the end of 1995 (Java 1.0 Beta2).

    Over the years, I have seen a drastic increase in JVM performance.

    Now I have to disagree with you. In multithreaded applications I have seen Java perform better than C++. The applications were built by the same person and used the same architecture. (The guy was a beginner in Java and experienced in C++.)

    Currently I develop server-side, component-based (EJB) applications (using an application server written in Java, WebLogic) and batch processes written in Java. I can say that they perform really well.

    From my experience, by coding carefully you can achieve wonderful performance. We had a batch application that must process 6 to 12 million records per day (with a lot of processing on each record). The first version of the batch did 7 records per second (which did not meet our requirements); by optimizing the code and changing algorithms we got to > 300 records per second.
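
    To put those numbers side by side, here is a rough back-of-the-envelope check. It assumes the batch is allowed to run around the clock (the actual batch window isn't stated above, so treat the 24-hour figure as an assumption), and the function names are made up for illustration.

```python
# Assumption: the batch may run 24 hours a day; a shorter window
# would raise the required rate proportionally.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def required_rate(records_per_day):
    """Records per second needed to finish a day's load within one day."""
    return records_per_day / SECONDS_PER_DAY


def daily_capacity(records_per_second):
    """Records a batch can process in one day at a sustained rate."""
    return records_per_second * SECONDS_PER_DAY


if __name__ == "__main__":
    for load in (6_000_000, 12_000_000):
        print(f"{load:>10,} records/day needs {required_rate(load):6.1f} records/s")
    for rate in (7, 300):
        print(f"{rate:>4} records/s handles {daily_capacity(rate):>10,} records/day")
```

    Under that assumption, 7 records per second covers only about 0.6 million records a day, well short of even the 6 million lower bound, while 300 records per second covers roughly 26 million, about twice the 12 million peak.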

    Maybe we could gain 10% to 20% more speed if we rewrote the whole thing in C++. But it would take at least twice the time to develop and would not be as stable as the Java version.

    I concede that Java is a little bit slower than C++ (not in all cases), but the gain in programmer productivity and stability is really worth it.

  29. Re:Want something native with Java semantics? by Scurrilous+Knave · · Score: 2

    Cute story. Sadly, from a humor-related point of view, it's an urban legend.

    I'll tell you something that is true, and that doesn't reflect particularly well on Ada's genesis. Ada's design was commissioned, though not actually executed, by the US Department of Defense. Just as with most military equipment, the Ada language would be put in the hands of poorly trained, minimally talented military programmers, and had to function acceptably under those conditions. I don't know if that goal was ever actually achieved, but that's the source of the language attributes which cause most younger programmers to chafe under its "fascist restrictiveness". Believe me, after you've written enough code, and made enough stupid errors that any rather bright chimp would probably have avoided, you begin to realize that all those "restrictions" are actually helping you.

    Ada's real downfall was the mandate--something in human nature rebels at being forced, so the various DoD departments expended their creativity getting around the Ada mandate instead of using all that brainpower writing cool code. Now that the mandate is dropped, Ada is experiencing a resurgence in popularity, not least in the free software world.

    It's not a miracle language, regardless of what some real or imagined DoD hypemeister might have said about it, but it's damn nice, a pleasure to use when used properly, and a solid tool for the development of large, robust programs. But don't take my word for it ... see for yourself.

  30. Want something native with Java semantics? by Scurrilous+Knave · · Score: 3

    If you're looking for a portable language that compiles to native machine code and which implements much of Java's semantics, check out Ada 95. You can find information here, or download a complete GPL'ed compiler here.

    I'm totally serious, folks. Do not regale me with tales of how much Ada sucks--most originate from introductory CS classes where Ada83 was shoved down unwilling throats by indifferent or hostile educators. Please, go read and experience for yourself before replying. And for those who dispute my claim about Java semantics, please pay special attention to the links on this page before you comment.

  31. Help us, Be! by RickyRay · · Score: 2

    It seems like the best solution of all would be if Be would donate their threading source to Linux. Most of the reason Be is so fast is that it has such a good thread scheduler (they probably have several patents on their techniques, in fact).

    Of course they need to make a living too. The trick would be to figure out how they could donate the code yet still have a way to come out ahead (they've invested many millions in development). I honestly can't think of a good solution. Ideas?

  32. Thread Oriented OS Needed by Baldrson · · Score: 3
    Systems in which light-weight threads are first-order constructs, such as Mozart, illustrate why relational programming will eventually subsume functional or procedural programming:

    Functions are special cases of relations.
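
    As a toy illustration of that claim (plain sets of pairs, not Mozart's logic variables; every name here is made up for the example): a relation is a set of (left, right) pairs, and a function is simply a relation in which each left value appears with exactly one right value. Unlike a function, a relation can be queried in either direction.

```python
# A relation modelled as a plain set of (left, right) pairs.
PARENT_OF = {("alice", "bob"), ("alice", "carol"), ("dave", "bob")}
SQUARE = {(n, n * n) for n in range(-3, 4)}


def is_function(relation):
    """True if every left value is paired with exactly one right value."""
    seen = {}
    for left, right in relation:
        # setdefault records the first right value seen for this left value
        if seen.setdefault(left, right) != right:
            return False
    return True


def forward(relation, left):
    """All right values related to `left` (the usual 'call' direction)."""
    return {r for l, r in relation if l == left}


def backward(relation, right):
    """All left values related to `right` (running the relation in reverse)."""
    return {l for l, r in relation if r == right}


if __name__ == "__main__":
    print(is_function(SQUARE), is_function(PARENT_OF))
    print(forward(SQUARE, 2), backward(SQUARE, 9))
```

    SQUARE passes the function test and PARENT_OF does not, yet both answer forward and backward queries with the same two helpers.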

    It's important to build relational semantics (light-weight threads with logic variables or their equivalent) in at the kernel. Otherwise, you end up kludging around, either recreating it at the higher levels or malanalyzing your relational task to fit your functional tool.

    Open source studies like this one are increasing awareness of the need for light weight threads.

    That's good.

    The next step will be for people to recognize that what they are doing with all those threads is essentially relational in nature so they can really address the impedance mismatch between relational database and object oriented programming.

  33. VM's will always be slow by fishlet · · Score: 2


    I used to be a Java evangelist, to the 10th degree, but now I have come to have a change of heart. I still love the language, the language IMHO being the best thing I've seen in years. It's that part about it being an interpreted language that really bugs me. Though I've seen significant improvements in speed over the years, still nothing to be overly excited about. And the so-called Revolutionary HotSpot engine promised by Sun turned out to be a major disappointment. Fortunately I've come to see the error in Sun's thinking and to realize after all that Native code is indeed the way to go. Now if just someone could get on the stick and create a Java-like language that compiles directly to run on bare metal. Meanwhile, I'm painfully relearning C and C++ to get the kind of performance I need out of my applications.

    Regarding Java on Linux, I've observed that it always runs slower on Linux than on Windows, which is a real shame considering Linux is a better all-around server. IBM's VM runs quite well by comparison, but it is still topped by IBM's VM on Windows.



  34. SGI's IRIX scheduler - "less is more" by john@iastate.edu · · Score: 5
    I'm reaching way back in my memory here, but I recall a white paper (perhaps from Usenix) from SGI where they investigated how to keep their scheduler from using so many cycles - not so much from an "improve throughput" thrust, but more so to "improve responsiveness".

    Their conclusion was that what you wanted to do was have a two-level scheduler -- a real quick + dirty part that ran at interrupt level and just grabbed the next runnable process from a circular list of the highest priority processes -- in and out in just a few cycles, but perhaps not grabbing *the* highest priority process this time -- then "every so often" (in computer terms, e.g. some fraction of a second) a lower level scheduler ran which did a more thorough re-ordering of the processes.

    Of course, one immediately sees that this lower level scheduler could even be a regular process (making syscalls) which means you can plug in whatever scheduling algorithm you like.
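
    Here is a toy model of that two-level idea, written from the description above rather than from SGI's paper or the IRIX source, so every name and the `hot_size` parameter are assumptions. The fast path only rotates a small circular list; the slow path does the full re-ranking and is the part you could swap out for any policy you like.

```python
from collections import deque


class TwoLevelScheduler:
    """Toy two-level scheduler; a sketch, not any real kernel's design."""

    def __init__(self, hot_size=2):
        self.tasks = {}           # task name -> priority (higher runs first)
        self.hot = deque()        # circular list consulted by the fast path
        self.hot_size = hot_size  # how many top tasks the fast path sees

    def add(self, name, priority):
        """Register a runnable task; the fast path sees it after reorder()."""
        self.tasks[name] = priority

    def reorder(self):
        """Slow path, run "every so often": full sort over all tasks."""
        ranked = sorted(self.tasks, key=self.tasks.get, reverse=True)
        self.hot = deque(ranked[:self.hot_size])

    def pick_next(self):
        """Fast path: constant time, take the head and rotate the list."""
        if not self.hot:
            return None
        name = self.hot[0]
        self.hot.rotate(-1)
        return name


if __name__ == "__main__":
    sched = TwoLevelScheduler(hot_size=2)
    for name, prio in (("a", 1), ("b", 5), ("c", 3)):
        sched.add(name, prio)
    sched.reorder()
    print([sched.pick_next() for _ in range(4)])
    sched.add("d", 9)          # arrives between slow-path runs
    print(sched.pick_next())   # fast path still works from the stale list
    sched.reorder()
    print(sched.pick_next())
```

    The trade-off shows up when "d" is added: until the next `reorder()`, the fast path keeps cycling through the stale list and so does not pick *the* highest priority task, which is exactly the imprecision the paper was apparently willing to accept.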

    --
    Shut up, be happy. The conveniences you demanded are now mandatory. -- Jello Biafra
  35. Soundbites for the lazy by Eamonn+O'Synan · · Score: 2

    ...each Java thread in the IBM Java VM for Linux is implemented by a corresponding user process whose process id and status can be displayed using the Linux ps command...

    The striking observation here is the amount of kernel time spent in the scheduler .... between 30 and 50 percent for kernel 2.2.12-20 and between 37 and 55 percent for kernel 2.3.28...

    ...it became apparent that a significant amount of time could be spent calculating the scheduler's goodness measure...

    We wondered what would happen if the fields required by the goodness function were placed together in the task structure....

    The Linux scheduler needs to be modified to more efficiently support large numbers of processes.

    The Linux kernel needs to be able to support a many-to-many threading model.

    We look forward to working with the members of Linux community to design, develop, and measure prototypes of Linux code to support the changes described above.


    --------------------------------------

    --

    --------------------------------------
    Dere's a storm a-comin'...
  36. The right tool for the right job by Tassach · · Score: 2
    I used to be a Java evangelist, to the 10th degree, but now I have come to have a change of heart. I still love the language, the language IMHO being the best thing I've seen in years. It's that part about it being an interpreted language that really bugs me. Though I've seen significant improvements in speed over the years, still nothing to be overly excited about. And the so-called Revolutionary HotSpot engine promised by Sun turned out to be a major disappointment. Fortunately I've come to see the error in Sun's thinking and to realize after all that Native code is indeed the way to go. Now if just someone could get on the stick and create a Java-like language that compiles directly to run on bare metal. Meanwhile, I'm painfully relearning C and C++ to get the kind of performance I need out of my applications.


    I would never consider Java to be a replacement for C/C++ -- they are different tools suited to different purposes. One programming language will never meet every programmer's needs for every conceivable task. The DoD learned this lesson the hard way in the '80s when they mandated that Ada be used for all new software projects.

    All programming involves making tradeoffs: you cannot optimize EVERY variable, as many of them are inversely related to one another. If the most important design criterion of your application is raw performance, then definitely code it in C (with in-line assembly as required). If fault tolerance is your overriding concern, use Ada. And so forth...

    The vast majority of programmers in the business world are not writing performance-critical code. For most business applications (which is what the vast majority of programmers do from 9 to 5), the overriding concerns are not performance and reliability, but ease of coding and maintainability. This is where 4GLs like VB, PowerBuilder, and Delphi shine - they allow programmers of average skill to produce (at least somewhat) functional software quickly.

    If you compare the architecture of most 4GLs to Java, they are remarkably similar: you have a compiler which produces p-code/bytecode, which is then executed in a dedicated run-time environment. What makes Java different is that the language itself is vastly superior, and it's much more of an open standard than proprietary 4GLs like VB and PB. OK, so it's not GPL'ed nor totally open - big deal, it's more than M$ will ever do with VB. In my book, Good Enough Right Now beats Perfect Sometime Real Soon. For many (most?) business apps, Java is good enough.

    If I'm a business manager, and I want an application to do X, typically I want it done yesterday. A programmer of average ability can produce a working application in less time using Java than s/he could using C++, and that code will be more maintainable. If the performance is unacceptable, it's often more economical to just buy a bigger box than it is to hire a coding wizard who can wring every last ounce of performance out of the current hardware.

    (This is not to say that I advocate bloatware or mediocrity; I just realize that it's often a financial reality. Face it, not every coder is a guru; most places have to make do with the people they have.)
    --
    Why is it that the proponents of "one nation under God" are so eager to get rid of "liberty and justice for all"?