Decompiling Java
If you are interested in Decompiling Java, then this book tells you exactly how to do that. There's no fluff and every chapter counts. I can safely concur that Fiachra's observations are indeed correct. You'd better be prepared for some serious hard-core details, but then that's what you paid for. It is really great to read a book that doesn't end each chapter with a few links to the real material because the author couldn't be bothered to write it up.
So what do you get? As a battle-hardened Java coder of not a few years programming, I wanted to find out about the gory details of bytecodes and how to get at them. Now it's a subject I always knew I should know about, but never took the time to read up on it. Decompiling Java puts all that knowledge into one place.
Here's a quick run-through of the chapters so you know what you're getting:
Ch.1 Introduction
Decompilation isn't just another coding tool - there are other, real-world issues like ending up in jail to think about. Godfrey proposes a sort of code-of-honour for decompilers. This book could so easily have been positioned for the fr33ky kod3r skript kiddie market, and I'm glad that the author and publishers took a mature and sensible approach to the subject. I have had to decompile purchased code because of bugs and I'm glad that someone took the time to think about an ethical framework for doing this.
Ch.2 Ghost in the Machine
A good and solid introduction to the JVM and the classfile format. If you're in the market for this book, you probably already know most of this, but a refresher course is always good. For me, it definitely sorted out a lot of internal hand-waving on the subject. Just remember kids, the only thing to fear is fear itself - it's only binary data after all.
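To make the "it's only binary data" point concrete, here is a minimal sketch (mine, not taken from the book) that parses the first eight bytes of a class file: the 0xCAFEBABE magic number followed by the minor and major version. The class name `ClassFileHeader` and the sample bytes are made up for illustration; major version 52 corresponds to Java 8.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration; not code from the book.
final class ClassFileHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;

    private ClassFileHeader(int magic, int minorVersion, int majorVersion) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    // Parses the fixed 8-byte prefix of a class file: u4 magic, u2 minor, u2 major.
    static ClassFileHeader parse(byte[] classBytes) {
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        int magic = buf.getInt();
        int minor = buf.getShort() & 0xFFFF;
        int major = buf.getShort() & 0xFFFF;
        return new ClassFileHeader(magic, minor, major);
    }

    // True when the bytes start with the class file magic number.
    boolean looksLikeClassFile() {
        return magic == 0xCAFEBABE;
    }

    public static void main(String[] args) {
        // Sample header only: magic, minor version 0, major version 52.
        byte[] sample = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        ClassFileHeader header = parse(sample);
        System.out.printf("magic=0x%08X major=%d minor=%d%n",
                header.magic, header.majorVersion, header.minorVersion);
    }
}
```

Everything after these eight bytes (constant pool, fields, methods, attributes) is laid out in the same length-prefixed style, which is why a decompiler can walk it mechanically.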
Ch.3 Tools of the Trade
Although the author builds his own decompiler later in the book, it's nice to get a chapter devoted to the existing toolset and the Java decompiler scene.
Ch.4 Protecting your Source
For the honest developer, knowing how to decompile code is more about protecting your own source code than breaking someone else's (who wants to read other people's smelly code anyway!). This chapter is one of the most directly practical. I had always assumed that obfuscation was a magic fix that I could apply if necessary. In reality, good obfuscation is just like good encryption (that is, uncommon, difficult to verify, and still subject to lateral attacks). Even compiled bytecode has relatively low entropy, so the value of obfuscation must be considered carefully.
Ch.5 Decompiler Design
This is where it starts getting a wee bit technical. Decompilation, as you can imagine, is a bit of a black art, and there are many ways of doing it. Some of them involve scary maths, some involve scary coding, and the rest involve both. But that's why you don't meet many people who can write decompilers. Godfrey does a great job of taking you on a practical run through this fog of decompilers. At the end of this chapter you will be able to decide for yourself what approach is best suited to your problem domain. Again, this material can be challenging but it's like boot camp: You just gotta.
Ch.6 Decompiler Implementation
If the previous chapter hurt your brain and scared you silly then this chapter will have you weeping for joy. The author takes a practical, effective, and most importantly, understandable approach to actually implementing a decompiler. Now, as he freely admits, his design may encounter difficulties with edge effects and infrequently used idioms, but it will take you to the point where you can solve them yourself. I really had to smile at how simple and effective the approach taken here is - instead of the expected multiple passes and mind-bending parse tree manipulation, we have a single-pass, source-generating decompiler for Java. You won't follow it all first time, but it does work and you can verify it for yourself. Like I said at the start, you don't get that empty feeling from this book, and this chapter is pretty much why. I bought a book about decompiling Java, and now I can.
Ch.7 Case Studies
This chapter addresses the "why" of decompiling, returning again to the moral questions raised at the start. It's more food for thought than prescriptive preaching though, which again is refreshing. I have to admit to dipping into this chapter while reading the rest of the book - the human interest angle always works a treat!
Of course, no book is perfect. What I think could have helped a bit overall would have been an introductory chapter on bytecode. But it's not a great loss and bytecode is actually pretty simple once you get your head around it. Still, it might have lessened the learning curve somewhat.
Decompiling Java is a great addition to that section of your bookshelf dedicated to serious books that will be around for a while. The JVM specification and Java bytecode are not going to change that much, so this book is something you'll be able to use for a long time. Personally, the best thing about this book for me was that it took me to the next level. Not many books can do this. As a working coder, I pretty much put things like decompilation into the "too hard, just for academics, and I could never grok it" category. It's great when a book comes along that can take you out of that comfort zone.
You can purchase Decompiling Java from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, carefully read the book review guidelines, then visit the submission page.
So it's a book about reconstructing bytecode into human-interpretable info, but it doesn't have an intro to bytecode? That seems awfully strange. Are you sure you didn't miss something?
-dave
http://millionnumbers.com/ - own the number of your dreams
I've read both and I have to say Covert Java is slightly more in-depth, but perhaps more for people more familiar with Java.
When anger rises, think of the consequences.
Confucius (551 BC - 479 BC)
Everytime I take a piss after my morning cup of joe...
From excellent karma to terrible karma with a single +5 funny post...
in 1999 i wrote a paper on security in set-top boxes (one of my first papers); yay. one approach we had was to build a custom class loader that would actually load encrypted classes.
:) i had a number of successful prototypes built - but unless you build the class loader into hardware (ie: cannot access the .class file), it's just another hurdle, nothing more.
the details of the paper are:
1999 - Security in Set-Top boxes
European Multimedia, Embedded Systems and Electronic Commerce
EMMSEC '99, Stockholm, SWEDEN
June 21-23, 1999
COPY: (pdf)
http://www.ardiri.com/publications/emmsec99.pdf
there was a lot of interest in this topic at the time
Good review, but I have one major nit to pick.
What ethical problems? Decompiling is perfectly moral and ethical. Whether it is illegal is a separate and, for me, almost irrelevant issue. If I legally own a copyrighted work I am allowed to read it, period and end of story. Corporate licences excepted, software is SOLD, not licensed, despite the scary words on the box and the dread click-through EULA.
Hell, I learned assembly by writing a disassembler (in BASIC) and reading the Microsoft BASIC roms, then later reading the commented listings that ran in Color Computer Magazine. (To avoid a copyright fight, and because M$ refused to grant them permission, CCM ran only the comments and memory locations, leaving the reader to run their own disassembly for the opcodes.)
The only ethical problem would be lifting the code and reusing it without permission and I think we all know that is wrong.
Democrat delenda est
I had been looking through this book just the other day. Glad to see a review.
Thanks, Beannie
It has always been the case that Java (and, in general, many other interpreted or p-code-generating languages) can be decompiled. I remember, back in the old VB days, you could take a VB (pre-3.0) executable and decompile it to get the original source. Of course, variable names were changed (since the VB compiler changed them when converting to p-code).
As systems get more open/advanced, the sources are more difficult to hide. In the case of web apps, there is no need to decompile anything; the JavaScript is available for all to see in plain text. Even more advanced applications that use ASP pages that execute on the server can be seen by changing the URL to list the source rather than execute it (I don't remember the exact syntax, but I think it is related to the alternate data stream in NTFS).
That is the reason we have copyright. On a more personal note, I think it serves the community if someone can see your implementation in code, get inspiration and either correct mistakes or expand on the code.
No Java developer should be without DJ Decompiler (which sits on top of JAD, the actual decompiler, command line only). Seriously, this book may be useful, but most people are way below needing to know any of this. If you do need to know it or are just curious, fine.
Oh, and obfuscation, blah, any good IDE (like IntelliJ IDEA) is able to help you work around this junk.
Anyhow, decompiling the classfile with "javap -c" shows that a couple of instructions get eliminated by dropping the explicit comparison to "true". So the classfile gets smaller, it loads faster, and (unless the JIT compiler is smart enough to do constant propagation on that conditional) it'll run faster, too.
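The snippet under discussion is not in the thread as captured here, so the two methods below are only my guess at the kind of code being compared: one tests a boolean with an explicit `== true`, the other tests it directly. The class and method names are made up. Compile it and run `javap -c BooleanTest` to compare the instruction sequences yourself; whether they differ will depend on the compiler you use.

```java
// Hypothetical example for illustration; not the original poster's code.
final class BooleanTest {
    // First form: explicit comparison against the literal true.
    static int explicit(boolean flag) {
        if (flag == true) {
            return 1;
        }
        return 0;
    }

    // Second form: the same test without the comparison.
    static int implicit(boolean flag) {
        if (flag) {
            return 1;
        }
        return 0;
    }

    public static void main(String[] args) {
        // Both forms always agree on the result; only the emitted bytecode may differ.
        System.out.println(explicit(true) == implicit(true));
        System.out.println(explicit(false) == implicit(false));
    }
}
```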
The Army reading list
>knowing how to decompile code is more about protecting your own source code.
There are many reasons to learn about, implement and use decompilers, but I don't think "to properly protect your intellectual property" should be one of them.
I got somewhat interested in this book (never heard about it before), but I think I'm going to pass. Sounds like the decompiling described is too much of a one-trick pony -- which is fine, it's about decompiling Java after all -- but I'd really like something like an extension and update of Cifuentes' work in book form, with the lessons from the IDA team too.
You know, from the beginning; starting with machine descriptions and disassembly for a generic front-end, efficient IR, and on up through the back end.
Now that'd be a tome [worth paying for].
Belief is the currency of delusion.
I guess we got a review of that book just as an occasion to discuss real Java decompilation, didn't we?
Next time you will see entertainment industry trying to sue Sun for built-in circumvention system.
the decompiler compiles you!
Er... um...
the compiler decompiles you!
Er...
the java decompiles itself!
Ah, whatever.
- Kevin
The less confident you are, the more serious you have to act.
Let me get this straight: the author recommends that 'honest' developers obfuscate their code?
I've read programs that I thought were obfuscated, but later found out were just poorly written. Other times I've run into programmers who, tin hats firmly affixed, went to great lengths to make sure no one learned their Merlinesque techniques for getting the most out of BASIC.
In context, the author seems to be talking about obfuscating object code. Yikes! What's the opposite of debugging? Buggery?
Encrypting object code to make it harder to reverse engineer is a giant waste of time. Here are more productive ways to spend the same amount of energy:
In fact, I can't think of many worse wastes of time than making a compiled program hard to understand.
sigs, as if you care.
Did you then pass out from lack of food and sleep? The most I can manage is two days before getting mighty uncomfortable.
"Have you ever thought about just turning off the TV, sitting down with your kids, and hitting them?"
It's useful but not very effective at actually making your code unreadable.
I find that since everything resolves to a native call you can usually figure out what a coder is doing pretty easily.
In my experience most obfuscation programs are actually used more often than not for reducing code/class sizes and improving efficiency slightly.
non-stop reading for 4 days and the first thing he does is post on /.?
I might have gone to the bathroom, or perhaps had a snack. Maybe a nap.
Most techie books these days are quickie grab-bags, and you end up paying for a lot of dead trees that you aren't interested in.
And so I suggest a service like O'Reilly's Safari Bookshelf. It includes the full text of over 2,000 technical books, many not published by them. No killing trees, far less money than buying books, plus full text search.
Developers: We can use your help.
I have yet to try it on byte-code produced by non-Java languages, but I'd be interested to see the results...
(It sucks that it's no longer free. The version I've got I installed through Debian, for goodness sake, years ago. Does anyone know any free alternatives that work as well?)
Sun has put the Java bytecode specification online for free..
Reverse engineering in Java is as simple as the compile process itself. Besides there are already free tools available so why bother??
Mocha was available in 1996. Any half-serious java developer understands what decompilers and obfuscators are. They've read the JLS and the VM spec. They've probably reluctantly had to use JAD to debug some 3rdparty library. They can read license files which tell them what they can and can't do with those libraries without getting into legal trouble.
Why is this topic worthy of 280 slashdotted pages? Color me mystified.
The opposite of debugging is, of course, "embuggening".
Hat tip to Jebediah Springfield.
N4st0r, trixx0r h0bb1tz0rz! Th3y st0l3 0ur pr3c10uzz!
Sounds interesting.
No, wait, the other thing - tedious.
Decompiling Java by Godfrey Nolan on Amazon.
Another book on the subject is Covert Java : Techniques for Decompiling, Patching, and Reverse Engineering by Alex Kalinovsky... probably more targeted at those who are already pretty familiar with things and want a more in-depth look.
(Yes, Slashbots, those are affiliate links... that doesn't make them any less useful though, does it?)
SSL Certificate
Jesus fucking christ, stop with the Amazon affiliate links already.. as if we couldn't already search for the book on Amazon our own damn selves.
Yes, you could search, but the grandparent did the work for you, so now you don't have to.
But seriously, whether they are affiliate links or not, they work exactly the same way. You don't get charged more because it's an affiliate link, and it doesn't harm you in any way. Furthermore, they were upfront about the fact that those are affiliate links.
I could see your point if they were just random links, but both of them happen to be very on-topic to the conversation.
who, as a compiler hacker, would have expected an optimization pass to transform the first form into the second form before generating the bytecode.
Or more precisely, to understand that both forms are testing for the same thing, and to produce identical simplified bytecode.
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
The only experience with java decompilers I've had, was my first year of CS study. My teacher was bitching about my coding style, so I downloaded hbd for the next assignment, and decompiled the .class file, fixed up the code to the point that it actually compiled again, and handed it in.
This approach to "security" in Java is so trivially easy to circumvent that it's worthless.
There are a number of papers and articles detailing why this type of approach to "IP security" is so misguided. One such article is here: http://www.javaworld.com/javaworld/javaqa/2003-05/01-qa-0509-jcrypt.html
The crux is that at some point in time, you have to deliver the encrypted class to the JVM in an unencrypted format. Intercepting this delivery is incredibly easy (no expert knowledge required; the steps are described in the article above), at which time someone can just write the unencrypted class file out to disk (or wherever they wish). Voila! All your IP are belong to us.
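A rough sketch of why this is so: a decrypting class loader has to hand plain class bytes to `defineClass`, and that call is the interception point. This is my own illustration, not the code from the article or the paper mentioned above. `DecryptingClassLoader`, `Payload`, and the XOR "cipher" are all stand-ins; a real scheme would use a proper cipher, but the cleartext step would look the same.

```java
import java.io.InputStream;

// Stand-in for the class a vendor would ship encrypted (hypothetical).
class Payload {
}

// Sketch of a decrypting class loader. XOR is only a placeholder for a real cipher.
final class DecryptingClassLoader extends ClassLoader {
    private final byte key;

    DecryptingClassLoader(byte key) {
        this.key = key;
    }

    // XOR is its own inverse, so the same method both "encrypts" and "decrypts".
    static byte[] xor(byte[] data, byte key) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ key);
        }
        return out;
    }

    Class<?> defineEncrypted(String name, byte[] encrypted) {
        byte[] plain = xor(encrypted, key);
        // Weak point: 'plain' now holds an ordinary class file. Anything that can
        // observe this call (a patched loader, a debugger) can copy it out.
        return defineClass(name, plain, 0, plain.length);
    }

    public static void main(String[] args) throws Exception {
        byte key = 0x5A;
        // Needs Payload.class on the classpath, i.e. compile with javac first.
        try (InputStream in = DecryptingClassLoader.class.getResourceAsStream("Payload.class")) {
            if (in == null) {
                System.out.println("Payload.class not found; compile with javac and run from the output directory");
                return;
            }
            byte[] encrypted = xor(in.readAllBytes(), key);
            Class<?> loaded = new DecryptingClassLoader(key).defineEncrypted("Payload", encrypted);
            System.out.println(loaded.getName() + " defined by " + loaded.getClassLoader().getClass().getName());
        }
    }
}
```

However strong the cipher, the array passed to `defineClass` is the same class file a decompiler would read from disk.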
that was aimed to foil decompilers.
;)
It starts off with public variable names like:
public int YOU_DECOMPILING_NOOB =-1;
public int NO_SKILLZ_4U=100;
and then the obfuscator kicks in,
where a1 and al (with an L) are switched around.
The variable and method names look similar:
if (a1.b1.x.y == al.b1.xl.y2){
a1.v1.x.y &= al.b1.x1.y2 >> 0x4c;
a1.b1.x( al.b1.x2 );
}
Ouch! Also, I think every decompiler has some weaknesses and isn't able to undo all code. I know Jad has some limitations. Unfortunately, I wasn't able to get the source of the code that broke the decompilers
Hell, I learned assembly by writing a disassembler (in BASIC) and reading the Microsoft BASIC roms, then later reading the commented listings that ran in Color Computer Magazine. (To avoid a copyright fight, and because M$ refused to grant them permission, CCM ran only the comments and memory locations, leaving the reader to run their own disassembly for the opcodes.)
I too cut my programming teeth reverse engineering other people's code. And despite having had an excellent formal programming education from some of the best in the field (Galler, Riddle, Blue,
...) I still put the formal instruction third in the list of activities that taught me how to code well, with reverse-engineering from object (sometimes accompanied by very distantly related source code) first and "playing" (writing and using my own software and experimenting with the machines' behavior) second. (Fourth, and still important, was trade journals and other publications.)
Regardless of the ethics, reading other people's code is, IMHO, the single best way to learn how coding works. And decompiling from object gives you a DAMNED thorough understanding of the guts.
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
It should be ``Java bytecode decompiles YOU!''.
I support the Center for Consumer Freedom
I have been decompiling Java regularly. Just get Jode. It's very simple and effective. As long as the writers are not using obfuscation tools, the code is fully readable in its original form sans commenting.
Capture a java applet?
By which I mean, there is a java applet running in my web browser. I'd like to decompile it and look over the source code. It's small enough I believe this would be informative. Is there a good way to do this?
"Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
System.out.println ("I'll give it a read ");
The simplest version of cracking a Java program is using JAD to decompile the source, making a few changes in source (like changing the license check to always return "full enterprise version" instead of "time-limited demo"), compiling your altered class, replacing it in the JAR, and running the app.
Most obfuscators will make this track impossible, by doing things like using language keywords (while, for, if, and so on) for class/method/variable names, so that when you decompile the thing it cannot be recompiled. They also mix stuff around in the classfile enough so that figuring out what method is doing what becomes non-trivial -- stupid things mostly (like naming methods l1(), ll(), I1(), Il(), etc.), plus a few tricks to stop JAD from fully decompiling the class.
Enough of these little things add up to make the work involved in altering the decompiled class excessive and difficult.
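The look-alike naming trick is easy to picture in code. Below is a toy sketch of how a renamer might generate names like l1, ll, I1 and Il; it is my own illustration and not how any particular obfuscator works. `LookalikeNames` and its `name` method are hypothetical.

```java
// Toy generator of visually confusable identifiers (illustration only).
final class LookalikeNames {
    // Java identifiers cannot start with a digit, so '1' is allowed only after the first character.
    private static final char[] FIRST = {'l', 'I'};
    private static final char[] REST = {'l', 'I', '1'};

    // Maps an index to a name of the given length. Names are distinct for
    // indexes below 2 * 3^(length - 1) and repeat after that.
    static String name(int index, int length) {
        StringBuilder sb = new StringBuilder(length);
        sb.append(FIRST[index % FIRST.length]);
        index /= FIRST.length;
        for (int i = 1; i < length; i++) {
            sb.append(REST[index % REST.length]);
            index /= REST.length;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints the six possible two-character names.
        for (int i = 0; i < 6; i++) {
            System.out.println(name(i, 2));
        }
    }
}
```

Apply that to every private method and field in a large application and the decompiled output still compiles, but is miserable to read.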
The more sophisticated Java cracker doesn't bother. They decompile enough source to get their bearings, then edit the appropriate bytecode directly, with a classfile editor. Fortunately, most people with this level of experience can just pay for the frickin software they want.
I'm actually not obfuscating my Java code yet, but I'm going to start... it's just too easy to crack Java code without it. yGuard obfuscator is a pretty decent LGPL one that can run as an Ant task.
The books about decompiling Java are excellent advertisements for C++.
Advertisement: Want J. Random Hacker to fiddle with your code? Use Java.
well i was going to post an example, but the server said:
Lameness filter encountered. Post aborted!
Reason: Please use fewer 'junk' characters.
The answer to Java decompilation is a write-once, read-whaaa? language.
Obfuscators DO work. They're certainly not foolproof, but they definitely make it more difficult to crack a program of any size.
I'm not talking about tiny programs; but who even bothers decompiling tiny midlets? Isn't it obvious what they're doing? With tiny programs, if you know enough to be cracking Java programs, you might as well just write the thing out yourself. It's not magic.
But for larger applications, any decent obfuscator can make it very time-consuming to decompile and edit the programs. I posted more on this in another thread, so let me just say you really have to try it out before you say obfuscators don't work. They definitely DO work at foiling the average cracker who won't spend hours and hours reconstructing a $100 piece of software.
The mere fact that you've written a contract does not impose any obligations on others. The affected parties have to agree to it. Without agreement, it's just words on paper with no legal weight. It can't in any way prevent people from buying the book or reading certain chapters. Nor can it magically cause one action (buying the book) to carry other obligations (agreement to the terms). Without agreement it may as well not exist at all.
You'd better watch it, that's probably enough evidence for Dunkin Donuts to get its DMCA lawyers out after you.
Correcting myself here... When I said "the affected parties have to agree", I really meant "the parties bound by the contract have to agree". If you were to require all the book stores to agree to a contract that would require them to get the customer to agree to a contract or refuse to sell the book, and refuse to give them any books to sell if they don't agree, that would work. But if anyone gets a book without agreeing to the terms (and without the law being broken) then they would not be bound by those terms. This is pretty much how NDAs work. Good luck publishing your book under an NDA. ;)
With EULAs, in theory you could hack the installer to proceed without requiring you to click "I agree", and you'd be doing nothing wrong. Except in countries like the U.S. where there is the DMCA, which makes it illegal to circumvent technological measures that protect access to a copyrighted work (which the EULA screen seems to do).
But don't listen to me. I'm not a lawyer.
I'd agree with you that no obfuscator could really make it impossible to recreate a piece of software from the bytecode... but of course the only real aim is to make it hard enough so that it would be easier to simply purchase the software.
Though obfuscation comes with its own array of potential issues, especially in remote applications or those that rely on reflection
Obfuscators pretty much all offer you enough flexibility to exclude classes that will need to be used via reflection or with RMI... or to even save the map of random method names, etc. so that you can make updates to the source then come out with an obfuscated result that is compatible. I'd usually handle this just by NOT obfuscating method names in public interfaces... you can still obfuscate everything else, including instance variables, local variables, and all method code.
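Here is a small sketch of why reflective lookups need to be excluded from renaming. It is an illustration under my own assumptions: `Original` is the class as written, `Renamed` is a hand-written stand-in for what a renaming obfuscator might emit, and `hasMethod` plays the part of any code that finds a method by its string name.

```java
// Source as written by the developer (hypothetical).
final class Original {
    public static String greet() {
        return "hello";
    }
}

// Stand-in for the same class after a renaming obfuscator has run (hypothetical output).
final class Renamed {
    public static String a() {
        return "hello";
    }
}

final class ReflectionLookup {
    // Looks up a no-argument method by its string name, as reflective code does.
    static boolean hasMethod(Class<?> type, String name) {
        try {
            type.getDeclaredMethod(name);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The string "greet" is ordinary data, so a renamer that only rewrites
        // identifiers leaves it alone while the method itself becomes a().
        System.out.println(hasMethod(Original.class, "greet"));
        System.out.println(hasMethod(Renamed.class, "greet"));
    }
}
```

Keeping public interface names out of the renaming pass, as suggested above, avoids exactly this mismatch.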
yes. the last book i read non-stop is Advanced PHP Programming. luckily it's not a book on php syntax etc.; instead it focuses on how to design/maintain web applications written in php. php is merely used as an example.
decompiling java..what for ?
The opposite of debugging is coding.
Hmm. Sounds like an oxymoron to me.
Heard any good sigs lately?
As if we didn't have enough fun with Gentoo...
This book may well be perfectly good, but I've been put off it by Fiachra's astroturfing of it. He's a friend of the author's.
The mention of his name in the above review (for no apparent reason) makes me suspicious.
Dave.
--- These are not words: wierd, genious, rediculous
What????!!!!
Stepping. Through code. With a debugger. THAT is something you think the average programmer can't do????
Remind me never to hire any programmers you know! :o)
There are many sites who use server side java script to power their web applications. Try viewing the source at .
Don't see much do you? And those functions are not included in separate files either.
Here's a guide on server side js
C++ can be decompiled, but it is missing a layer of information that is present in byte-code decompilation. There is a higher percentage of mistakes in C++ decompilation. Those who are smart enough to find the mistakes are smart enough to earn a good living programming, they don't need to hack someone else's code.
I haven't tried to do it, but I suspect that GCJ produces easily identifiable structures. Compiled C++ is much harder to decompile.
Are you sure that's "irony"?
There's definitely a place in this world for both open and closed source software, and I work on both. I get different rewards out of open vs. closed source projects... though at the moment I pay my rent with closed source work. Because I need to be able to do that, I feel pretty strongly that I should be able to make the choice of whether my work will be open or closed.
Interestingly, a good obfuscator is a pretty obvious open source project (and there are more than the one I mentioned). Why? Because it's a fairly common need for many professional developers using Java, and a major part of open source development is scratching that itch. When enough people have the same itch, it makes more sense for them to work together and make it open source, than it does for them to work separately, then try to sell many competing (and lower-quality) implementations.
True enough, my point is that if someone wants to hack your code, the fact that it's a native executable is hardly a giant barrier (and if you really think it is, you can native compile your java code anyway)
There are two kinds of programs, basically. There are those that implement some programming or mathematical algorithm. If you have the assembly language, it is not very difficult to discover the algorithm.
Most Java, however, doesn't deal with protocols or fundamental algorithms. Most Java implements business rules. Each rule is not worth much, but the entire manner of operation of Amazon's web site, for example (which may not be written in Java), contains literally thousands of business rules which would be very valuable for a competitor to know. As a practical matter, it is very difficult to turn thousands of anything in assembly language back into higher level code.
--
100 Facts and 1 Opinion -- The Non-Arguable Case Against the Bush Administration
I agree.
Also, code for user interfaces (UI) is not worth dis-assembling, because most of the intellectual property is visible on screen anyway.
Good UI is the easiest thing to steal in a piece of software.
Hey, Adrien!
If I were to take the code and *sell* it, or
a product derived from it, that's one thing.
I have no problem disassembling any product for
my own education or enjoyment otherwise.
I'm tired of all of this "moral snafu" crap when it really boils down to fear by the company/programmer about threats to their
profit margins.
Correction: Bad UI is the easiest thing to steal in a piece of software.