Slashdot Mirror


Help crack the Java 1.6 Classfile Verifier

pdoubleya writes "As part of the development of Mustang (Java 1.6), Sun is developing a new, smaller and faster classfile verifier which they want your help in trying to break. As Sun VP Graham Hamilton puts it in his blog entry, "As part of Mustang we will be delivering a whole new classfile verifier implementation based on an entirely new verification approach. The classfile verifier is the very heart of the whole Java sandbox model, so replacing both the implementation and the basic verification model is a Really Big Deal.... The new verifier is faster and smaller than the classic verifier, but at the same time it doesn't have the ten years of reassuring shakedown history that we have with the classic verifier." You can read about the new verifier on Gilad Bracha's blog, and join the new Crack the Verifier initiative to see if you can break it. Read all about the Crack the Verifier - Challenge."

18 of 276 comments

  1. Take Java seriously by rexguo · · Score: 4, Interesting

    Those who go on to dismiss Java for various reasons (no matter how ignorant they are) should first take a look at the presentation given by Google at this year's JavaZone conference on how Google is using Java internally at extreme scales. Among them are AdWords and GMail.

    --
    www.rexguo.com - Technologist + Designer
    1. Re:Take Java seriously by Cthefuture · · Score: 1, Interesting

      What I don't understand is exactly what advantage is Java providing on the server-side. Do you really need cross-platform bytecode at that level?

      Is it just because of the extensive Java API's? That seems unfortunate because you could have the same API's in a native compiled language and get much better performance. If it's a safety/security issue then again you could build the same thing in a native compiled language, sandbox and all. Native compiled languages are just as portable (or even more portable) as Java. The main problem is having to recompile for each platform, but on a static server that is no big deal.

      To me it just seems like a huge waste to have the massive Java environment running bytecode on a static platform.

      As a language Java is certainly not easier to use than the higher-level languages like PHP, Perl, Ruby, etc. It's very verbose and complicated (relatively speaking). I can understand using scripting languages, it's Java that doesn't make any sense.

      I mean really, is it just because Java provides a lot of easy to use API's?

      --
      The ratio of people to cake is too big
    2. Re:Take Java seriously by Anonymous Coward · · Score: 3, Interesting

      Java may not be "slow" any more, but it's an INSANE memory hog. Part of this is because the heap NEVER shrinks - as you allocate more memory, the heap just grows and grows until it reaches the heap limit (which can be user-set). The other part is because you need to load in large numbers of "support" classes, along with the virtual machine itself, just to get to something as simple as "Hello World".

      Now, I need to explain that "heap NEVER shrinks" bit, because people are going to hop in and start talking about the garbage collector. Well, yes. The garbage collector does indeed free memory. HOWEVER, it only returns the memory to the Java VM heap, NEVER to the operating system.

      So, simple real-world example. Java VM starts with 8MB heap. Your application allocates 40MB worth of objects, increasing the total heap size to 64MB. Java now has 24MB "free" memory within the VM, and 40MB used. You can determine this in code using Runtime.freeMemory() to get the memory free in the heap, and Runtime.totalMemory() to get the current size of the heap.

      Your program then "releases" 38MB worth of that data as unneeded. Eventually the GC collects that. Now you have only 2MB used, but the heap size (as reported by Runtime.totalMemory()) REMAINS at 64MB. So now Java has 62MB "free" within the VM. Even if your program never uses more than 8MB after that, Java will ALWAYS keep the 64MB heap. It will NEVER shrink it.
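
A minimal sketch of how those figures can be read at run time. `Runtime.totalMemory()` reports the heap the VM currently holds, `Runtime.freeMemory()` the unused part of it, and `Runtime.maxMemory()` the ceiling the heap may grow to. The `HeapSnapshot` class and its method names are hypothetical, chosen only for illustration. Whether the total ever drops after a collection depends on the VM and its settings, so the sketch only prints what it observes.

```java
// Hypothetical helper for illustration; only the Runtime calls are real API.
final class HeapSnapshot {
    final long free;   // bytes unused inside the current heap (Runtime.freeMemory)
    final long total;  // bytes the heap currently occupies (Runtime.totalMemory)
    final long max;    // upper bound the heap may grow to (Runtime.maxMemory)

    HeapSnapshot(long free, long total, long max) {
        this.free = free;
        this.total = total;
        this.max = max;
    }

    // Reads the three figures from the running VM.
    static HeapSnapshot take() {
        Runtime rt = Runtime.getRuntime();
        return new HeapSnapshot(rt.freeMemory(), rt.totalMemory(), rt.maxMemory());
    }

    // Bytes occupied by objects (live or not yet collected).
    long used() {
        return total - free;
    }

    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    void print(String label) {
        System.out.println(label + ": used=" + toMegabytes(used()) + "MB"
                + " total=" + toMegabytes(total) + "MB"
                + " max=" + toMegabytes(max) + "MB");
    }

    public static void main(String[] args) {
        take().print("start");

        // Allocate roughly 40MB in 1MB blocks, as in the example above.
        byte[][] blocks = new byte[40][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1024 * 1024];
        }
        take().print("after allocating");

        // Drop the references; System.gc() is only a hint to the VM.
        blocks = null;
        System.gc();
        take().print("after release");
    }
}
```

Comparing the "total" column across the three lines shows whether a given VM hands memory back after the release.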

      This is fine in server setups, but it absolutely blows for client apps, where it's not unusual for a certain process to require a lot of memory, and then be ready to release it back to the OS.

      Ultimately, on client apps, this winds up causing swapping, as chunks of unused VM heap get swapped out in favor of applications that actually need the memory they allocate. Swapping = slow.

      You could call this behavior a "memory leak" but it's almost certainly by design.

      Java may not be slow, but it's definitely bloated.

    3. Re:Take Java seriously by justsomebody · · Score: 3, Interesting

      While I agree with you on all counts, I can't help but comment.

      Both Java and .Net have the same problem: sloppy memory. As long as you don't use a lot of atomic memory blocks under higher load on the machine where it runs, everything is OK, just as you said. I ran the same tests on both and always ran into the same problem: no direct memory control, the GC waited until it was too late, and everything started to crawl. The GC somehow avoids doing work if the software is taking most of the CPU. That is also the reason why Java beats C or C++ on speed tests: every speed test that tries to prove that Java or .Net is faster than C or C++ allocates and frees memory in some loop. And while both C and C++ actually do free memory, Java and .Net just mark it as garbage and wait for the GC to clean up the mess, which doesn't happen if the load is too high. Just take any test where Java was faster and change the test loop to make 2^32 instead of 1000 or 10000 calls. This way the allocations will actually use more memory than you have.

      p.s. Not bashing, just reporting the results of my testing. If you can suggest some approach, do that, but so far not even one person has suggested something I haven't tried yet. Both Java and .Net would be a real gift for me if only I could use them for my needs.

      --
      Signature Pro version 1.13.2-3 release 83.5 beta3try7 after-breakfast edition
    4. Re:Take Java seriously by Myolp · · Score: 2, Interesting
      Perhaps it's because there are a ton of good Java developers available, compared to the number of C/C++ developers. But it could also be because Java is actually faster at things like memory allocation. I also believe that the large number of ready-to-use and stable software components available makes a difference when choosing Java for your server application. Then there are the large number of standards built on Java, like J2EE or J2ME, that allow you to focus on the application-specifics in your project and ignore all the boilerplate code that would be necessary if you had chosen C++ (for instance). There are also several very, very good IDEs for Java with features you won't find in IDEs for other languages.

      I guess there are more reasons than these, but those were the ones that came to mind at the moment.

    5. Re:Take Java seriously by 955301 · · Score: 2, Interesting


      Yes you do. The advantages of being able to develop on my local Linux notebook and deploy to a Solaris cluster should not be overlooked simply because it's important at dev time, not production time. Recompiling on another platform means retesting on that other platform. I'd rather run my unit and integration tests off the production & staging environments, load test in staging and do no testing in production. This way unit and integration testing can be part of my build process (http://maven.apache.org/) and not something I have to redo on the final production hardware.

      And you're overlooking the JIT as well.

      --
      You are checking your backups, aren't you?
    6. Re:Take Java seriously by AKAImBatman · · Score: 5, Interesting

      The reasons to use Java on the server are quite simple. The combination of factors that attracted developers to Java in the first place make them want to use it on the server. Those factors are:

      1. Cross-platform capability - Many companies still prefer to deploy applications on large Sun, IBM, or Linux (name your brand) servers. However, these companies would also like to give their developers Windows desktops so they can interact with the rest of the company. (Who most likely uses MS Office/Outlook.) As long as you avoid explicit path names, it is quite easy (and common!) to develop on a Windows machine but deploy on a Unix or Unix-like machine.

      2. Automatic Memory Management - So your server is running along, and suddenly someone generates an unexpected error. In Java you can sleep soundly because even the worst programmer would have a hard time doing anything to completely take down the application. If you use a language that allows direct memory management, you have a good chance of that new guy coding a General Protection Fault/Segfault. The result is that your entire system coredumps when you least expect it.

      3. Security - While Java is able to control the Security of the ENTIRE JVM through its security framework, most companies are happy with the lack of buffer overruns, code injection techniques, and other common attacks. That's not to say that a poor programmer can't put a security hole in the application wide enough to drive a Mack truck through, but at least you can rely on the underlying system not to betray you.

      4. Flexibility - The Java server side frameworks are exceedingly flexible in their designs. For example, the servlet framework allows you to plug in your own custom server page technology. I have seen many a programmer (including myself) implement something like Reports by simply linking the ".rpt" extension to a custom servlet. The servlet then loads the requested configuration file and executes it. Very nice.

      Another example is servlet filters. Need a security framework added in a hurry? Just add a filter servlet! It will execute before the rest of the code, allowing you to check the variables and security permissions to ensure that the client isn't trying any funny business.

      5. API - When Java was first introduced, it absolutely creamed all the competing languages in the richness of its bundled API. As time has worn on, this has changed. However, Java still enjoys a sizable lead over even C/C++ with features such as Type IV (tested cross-platform, pure Java) JDBC database drivers. Unlike ODBC, many of these drivers have been tuned for excellent performance. Similarly, there are free APIs for handling Office Documents, PDF Creation/Editing, SOAP/XML-RPC communications, Object-Relational mapping, Image Management/Creation/Editing, CORBA, XML Databases, XSL-T, etc. While these APIs are all available for C/C++, there are significant cross-platform issues with many of them, as well as a lack of common "pluggable" APIs that allow for one API to many implementations.

      Other languages have a hit/miss score with these sorts of features, often not providing these features, providing only a small subset, or only being available in an expensive commercial package.

      6. Dynamic Loading - While C/C++ can manage dynamic loading of shared objects, it's a very difficult thing to implement. Java does it out of the box, with a full reflection API and interface support, thus allowing such wonderful code as Beans, Servlets, Pluggable Drivers, self-organizing code, and a host of other features that other systems can't compete with.

      (If you don't believe me, try adding support for a feature in PHP sometime. "It's so simple! Just install the SO and recompile PHP!" Meh.)

      7. Performance - This may sound like an odd thing to say, but the performance of Java is a key selling feature. Java server applications may execute more slowly than one written in C/C++ (just as C/C++ may execute more slowly than

    7. Re:Take Java seriously by TheRaven64 · · Score: 4, Interesting
      I would refer you to some research done around a decade ago, which involved running a MIPS emulator on a MIPS machine. The emulator was doing dynamic optimisation, and got around a 10% speed increase over the same code running directly on the hardware.

      A Java VM does some things that are simply not possible with C. To inline a C function, you need the source for both - this leads to some really ugly things like putting simple functions in headers, which should be reserved for interface definitions, not implementation details. The Java VM will inline functions on the fly. This can potentially give a huge performance boost - I got almost a 50% speed increase on some C code I was recently writing by shuffling things around to allow the compiler to inline some common functions.

      The other advantage of higher-level languages is that they provide more semantic information to the optimiser. Consider the trivial example of autovectorisation. In C, if you want to do an operation on a vector, you will usually iterate over every element and perform the operation. The compiler then needs to check that there are no dependencies between loop iterations, which can be non-trivial. In a language like FORTRAN or Smalltalk, you can simply perform an operation on a vector type. The compiler then just needs to check if the operation you are trying to perform corresponds to one or more vector unit instructions, and substitute these in to your code. This is much easier to do.

      C is a fairly easy language to write optimised code in for any CPU up to and including a 386. For anything more modern, you will find yourself fighting a language which is simply not designed to deal with parallelism - and compiler writers find themselves fighting even harder.

      --
      I am TheRaven on Soylent News
    8. Re:Take Java seriously by msuzio · · Score: 2, Interesting

      Thank you. "Wannabe" is just the right term. It really burns my ass when someone who is ignorant (and I mean that in the truest sense of the word -- they might not be stupid, but they obviously do not actually know what they are talking about) just spits out the standard "Java is teh suX0r!!!! It is sl0w!!!!" stuff.

      *rolls eyes*

      LISP weenies and C++ gurus can knock Java. They've earned the right to do so. Anyone who has written code for one of the "scripting languages" people position as competitors to Java (they aren't, just different tools for different jobs) can knock Java, because they've shown they understand the compromises and internals of designing a language. ...but if you've sorta kinda read a book on Perl, or you think PHP "r0xors" because your favorite Counterstrike fansite uses it for the forum system, or you think Java is slow because an applet of Jake waving ran slow in Netscape 3.0 in 1996, you might not quite be qualified to offer an opinion on this subject.

  2. Re:Don't believe the hype! by craigmarshall · · Score: 3, Interesting

    > C is portable, fast, very complex and since 35+ years the leading standard for professional OS and APP development.

    I agree that C is portable and fast; however, I don't think it can be called very complex.

    The smallest programming language manual I have ever owned (and I've owned quite a number), has to be "The C Programming Language", often hailed as the One True Reference to the language. How can it be that complex if the manual is less than half the size of most of my other manuals? I think languages (in general) have got more complex since then. The size of the .Net Framework is huge, there's no way that's simpler than the C standard library. Then you've got to think about reflection, inheritance, dozens of things that C just doesn't have.

    If what you mean is that C programs end up looking more complex, that's probably because C is used for systems programming. If you mean that you have to write more code to do it in C, then you may have a point, but I think C is actually one of the simpler languages. The closer to assembly you get, the simpler the language has to be.

    Craig

  3. Re:Why not prove it? by fuzzy12345 · · Score: 3, Interesting
    Here's a flame to all the respondents to the parent post. They say (if I may paraphrase) "Code verification is hard. I want my MOOMMMMMY!" Well, it's certainly more difficult, in the short run, than the "throw it against the wall and see if it sticks" approach. But it has been done, it isn't as hard as the naysayers are making out, and it's one of those things that you don't improve at unless you try. Google VLISP for an example of a provably correct compiler.

    One thing's for sure: Improvements in software quality will be harder to come by if everybody's attitude continues to be "Bugs are inevitable. Formal proofs are beyond us. Let's keep doing it the old way."

    --

    Everybody's a libertarian 'till their neighbour's becomes a crack house.
  4. Java 1.5 by Craig+Ringer · · Score: 2, Interesting

    Java 1.5 introduced the two things that make me willing to consider Java as a practical language for real work (as opposed to a "safe to let untrained programmers run rampant, too bad about the 10000k LoC required to do anything" language). Those two things are collections and generics.

    I was forced to use Java 1.2 some time ago, and found it a horrific experience with my background in dynamic languages. Since then, I've learned C++ and got used to the pluses and minuses of static languages (both in the sense of "compiled" and in the sense of "statically typed"). Java also largely ceased to suck, so having to work on it again and finding that sort code that would've been hundreds and hundreds of repetitive lines can now be expressed using a short set of comparators and a collections-based sort was ... refreshing.

    After Java 1.5, I can understand why they'd want to let things settle down for a while. It seems to me that they finally got all the really important stuff into the language.
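
As a rough illustration of the comparator-plus-collections sort mentioned above, here is a minimal sketch written in the generics style that Java 1.5 made possible. The `Employee` type, the `SortSketch` class, and the two comparators are hypothetical and not taken from the poster's code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical record type used only for this example.
final class Employee {
    final String name;
    final int age;

    Employee(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

final class SortSketch {
    // Orders employees by age, youngest first.
    static final Comparator<Employee> BY_AGE = new Comparator<Employee>() {
        public int compare(Employee a, Employee b) {
            return a.age < b.age ? -1 : (a.age > b.age ? 1 : 0);
        }
    };

    // Orders employees alphabetically by name.
    static final Comparator<Employee> BY_NAME = new Comparator<Employee>() {
        public int compare(Employee a, Employee b) {
            return a.name.compareTo(b.name);
        }
    };

    // Returns a sorted copy and leaves the input list untouched.
    static List<Employee> sorted(List<Employee> input, Comparator<Employee> order) {
        List<Employee> copy = new ArrayList<Employee>(input);
        Collections.sort(copy, order);
        return copy;
    }

    // Extracts the names in list order, which makes results easy to compare.
    static List<String> names(List<Employee> list) {
        List<String> result = new ArrayList<String>();
        for (Employee e : list) {
            result.add(e.name);
        }
        return result;
    }
}
```

Each new sort order is one small comparator; the sorting loop itself is never rewritten.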

  5. Re:Optimization and late binding by Hard_Code · · Score: 2, Interesting

    "Java may, someday, with sufficient ingenuity, rival or even beat C++ in performance"

    For long-running throughput-bound server processes, doesn't it already?

    --

    It's 10 PM. Do you know if you're un-American?
  6. Re:dotNET is overrated by zootm · · Score: 2, Interesting

    C# makes a distinction between virtual and non-virtual methods (which is largely used for optimisation which is not available otherwise, as I understand it). The distinction between override (which seems a little unnecessary, but it's arguably better than ambiguity) and new stems from this.

    I wouldn't say there's a huge difference between C# and Java, certainly not of the kind you're trying to imply there is. C#'s syntax is a little closer to C++, not so much the Windows API.

  7. Re:dotNET is overrated by $RANDOMLUSER · · Score: 2, Interesting

    Actually, I was reasonably serious, I just said it silly-like. While I can see an object having multiple behaviors (in Java-speak implementing an interface or two); I cannot see an object having multiple states, which is what's really implied by MI. To me, a (non-trivial) object is about the data it contains, and/or the state it preserves. To say that a thing is both one thing and another (from a data standpoint) in the same breath, to me just smacks of bad design. Again, nothing wrong with having unrelated behaviors, just no unrelated data, please.
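
The distinction drawn above can be sketched in Java as one class with a single piece of state that implements two unrelated interfaces. All names here (`Printable`, `Resettable`, `Counter`) are hypothetical and exist only for illustration.

```java
// Two unrelated behaviors, each declared as an interface with no data of its own.
interface Printable {
    String print();
}

interface Resettable {
    void reset();
}

// One object, one piece of state, two behaviors.
final class Counter implements Printable, Resettable {
    private int count; // the only data this object carries

    void increment() {
        count++;
    }

    public String print() {
        return "count=" + count;
    }

    public void reset() {
        count = 0;
    }
}
```

Callers can hold the same `Counter` as a `Printable` or as a `Resettable`, yet there is still only one set of fields behind both views.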

    --
    No folly is more costly than the folly of intolerant idealism. - Winston Churchill
  8. you have no clue by RelliK · · Score: 3, Interesting
    You obviously have never worked on large server-side applications. Other posters have already listed some reasons for java's popularity. I'll add some more:

    * java is nearly as fast as C++ according to all the benchmarks I've seen. Yes, really. The perception of java as being "slow" is simply the legacy of the old awt apps. Yes, the awt gui was (and is) slow. Server-side java applications are not. The "much better performance" is simply not there, particularly for typical enterprise apps.

    * *All* the enterprise apps (which is the area where java is particularly successful) store stuff in a database and/or talk to remote apps. Newsflash: a database query or a remote procedure call is *orders of magnitude slower* than an in-process procedure call. Once you include DB/RPC into the equation, whatever little speed advantage C++ has is wiped out completely.

    * This is CS 101: performance of a program is largely determined by the algorithm used. You can write a linear search in assembly, and it will be very fast for small lists. But for large lists, a binary search written in shell script will beat it.

    * In an enterprise application scalability is much more important than raw speed. So what if I can write a C++ app that's 20% faster than an equivalent java app? Java has frameworks that make it easy to write an app that you can scale horizontally (i.e. by adding more boxes). Easy being the keyword.

    * Developer's time is much more expensive than runtime. It is *much faster* to write an app in java than in C++. And for all but the smallest/simplest apps it is faster to write the app in java than in PHP/perl/whatever.
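
The point about algorithm choice in the list above can be made concrete by counting comparisons. This is a minimal sketch under stated assumptions: it counts steps rather than measuring time, so it says nothing about the constant factors of assembly versus shell script, and the `SearchSketch` class and its method names are hypothetical.

```java
final class SearchSketch {
    // Linear search: number of elements examined before the target is found
    // (or the full length if it is absent).
    static int linearSteps(int[] sorted, int target) {
        int steps = 0;
        for (int value : sorted) {
            steps++;
            if (value == target) {
                return steps;
            }
        }
        return steps;
    }

    // Binary search over a sorted array: number of probes made before the
    // target is found (or the search range becomes empty).
    static int binarySteps(int[] sorted, int target) {
        int lo = 0;
        int hi = sorted.length - 1;
        int steps = 0;
        while (lo <= hi) {
            steps++;
            int mid = lo + (hi - lo) / 2; // avoids overflow of (lo + hi)
            if (sorted[mid] == target) {
                return steps;
            }
            if (sorted[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return steps;
    }

    // Builds the sorted array 0, 1, ..., n - 1.
    static int[] range(int n) {
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            values[i] = i;
        }
        return values;
    }

    public static void main(String[] args) {
        int[] data = range(1000000);
        int target = data.length - 1; // worst case for the linear scan
        System.out.println("linear steps: " + linearSteps(data, target));
        System.out.println("binary steps: " + binarySteps(data, target));
    }
}
```

For a million sorted elements the linear scan examines every element in the worst case, while the binary search needs at most 20 probes, which is the gap a faster language cannot close.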

    If it's a safety/security issue then again you could build the same thing in a native compiled language, sandbox and all.

    Uhhhm, yes. Safety and security are *big* issues in enterprise apps. Show me *one* native language and platform that does it. You are saying it like one can just wave a magic wand and have it built in no time. "You could build the same thing" is not "it's already built".

    I mean really, is it just because Java provides a lot of easy to use API's?

    Yes. among all the other things I've mentioned.

    These are just a few reasons why java is so popular in enterprise apps. Sure, I wouldn't write a game in java, but for enterprise apps, it's perfect. Why java and not PHP/perl/? Simply because java is better. It has all the advantages of compiled languages (type safety, variable declaration checking, syntax checking, etc.) without some of the disadvantages (manual memory management). Think of java compiler as a sanity checker for your code. It will catch common mistakes like typos, missing return statements, invalid function parameters, etc. A scripting language will not complain about that, but force you to spend hours tracking down the bug. That's why java is faster to develop in than any scripting language for large apps.

    --
    ___
    If you think big enough, you'll never have to do it.
  9. Cross-platform really is a big deal by JavaRob · · Score: 2, Interesting

    What I don't understand is exactly what advantage is Java providing on the server-side. Do you really need cross-platform bytecode at that level?

    Actually, yes -- the cross-platform ability is extremely useful. Speaking personally: the two biggest projects I have worked on, both for one client, are deployed (production) on an IBM iSeries server (these used to be called AS/400s -- using the OS/400 operating system), and a Solaris server respectively. Both web apps are built on the same code base, and we developed and tested them on Windows 2000 workstations (XP, now, plus I am starting to do more and more development in RedHat Fedora).

    Can you imagine if I needed my own iSeries at home to run a test server here? Those things aren't cheap. Also, because the client has more in-house iSeries experience, we're going to be moving the Solaris webapp to an iSeries as well at some point -- and guess what? The Java code doesn't need any changes whatsoever; it's only the database SQL that will need to be migrated (DB2 UDB to DB2400 SQL isn't consistent).

    When I'm starting new projects, I can get people started on architecture and writing code in most cases *without* finalizing the eventual platform, and without getting access to the big hardware yet. You aren't locking yourself into anything from the beginning -- this is actually a pretty serious power to have. It also allows me to run side by side performance testing on servers to see the *real* differences in capabilities; this is HUGE because the folks selling the big iron suddenly are a commodity, not an unquestioned master in a domain with benefits we can't actually measure usefully.

    Just my 2 cents -- I'm sure some people wouldn't actually care (e.g., "my webhost only runs RedHat, so that's all I need to care about"), but gotcha-free cross-platform code is a big deal.

  10. Here are some reasons: by DimGeo · · Score: 2, Interesting

    Java Pros:
    1. Zero memory fragmentation. The GC compacts the memory at runtime. This means indefinite uptimes. A server written in a refcounted script language might lack that.
    2. Zero chance of a buffer overflow attack anywhere. Maybe if there is a bug in the VM, however, this might become possible.
    3. All libraries in the standard distribution have been tested for almost a decade now.
    4. Incredibly powerful multithreading and synchronization.
    5. Rapid development of fast programs. Only someone well versed in Java can do that, but java is well worth it when you have the people. This can be done in other languages but at insane costs in security.
    6. The performance cost for all of the above is within the 20% margin, which is great for a server app that does not do anything computationally expensive. Most of the work is offloaded to a fully optimized DB server anyway.
    7. With the right framework, you can easily load and unload modules at runtime. Not easily done though.

    Java Cons:
    1. Incredibly slow startup time. It may take up to a minute for a large app to get fully loaded and JITed. This is a non-issue in a server environment, however.
    2. Extreme memory usage. Up to 10 x the equivalent C++ app. However, the GC makes sure that memory usage remains almost constant under similar loads for months and years of uptime, because there is no memory fragmentation.
    3. Due to 2, sometimes most of the memory gets swapped. This shouldn't happen in a server environment, but on a desktop running server apps (dev machines for instance) this is a great nuisance. It might take running a full GC manually to force your redmond-developed OS to re-load all the memory for the app. Again, a non-issue for servers.
    4. The default Sun Java VM configuration makes Java run any program with a 64 MB memory usage limit. This is ridiculous for a serious Java app. It takes passing a command-line param to fix that. People can get frustrated because of this.
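
The command-line parameter in the last point is presumably the `-Xmx` option, which sets the maximum heap size on Sun's VM. A minimal sketch for checking which limit a VM is actually running with; the `HeapLimit` class name is hypothetical.

```java
// Hypothetical helper; Runtime.maxMemory() is the real API being shown.
final class HeapLimit {
    // Maximum heap the VM will attempt to use, in whole megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Running it as `java HeapLimit` and then as `java -Xmx256m HeapLimit` shows the limit changing; the reported figure can differ slightly from the value passed on the command line.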