Slashdot Mirror


Your Java Code Is Mostly Fluff, New Research Finds

itwbennett writes In a new paper (PDF), researchers from the University of California, Davis, Southeast University in China, and University College London theorized that, just as with natural languages, some — and probably, most — written code isn't necessary to convey the point of what it does. The code and data used in the study are available for download from Bitbucket. But here's the bottom line: Only about 5% of written Java code captures the core functionality.

31 of 411 comments

  1. Makes sense to me by Anonymous Coward · · Score: 5, Insightful

    I'll admit I just read the summary article and not the paper itself, but I wouldn't say that this is overly surprising.

    Right off the bat, thanks to the preoccupation we Java types seem to have with accessor methods (which, if we're honest, do something besides just set or get a private member variable maybe 1% of the time; why the hell we still do this I don't know) and the frequent necessity for hash, clone, and equals methods, most of which are auto-generated, you end up with a bunch of small methods that do very little but up the code count.

    Beyond that, I think good design usually works out this way. You (or at least I like to) build up in layers, each layer using the previous layer at a higher level, until you get to the top where you have a few seemingly simple bits of code that pull it all together. When you get big complex functions doing a bunch of stuff vs the described small functions adding little bits of functionality along the way, I think you are doing things wrong.

    That's not to say people (and this is common in Java) don't go way overboard and end up with huge chains of methods that just pass the buck and complex control structures where you need a debugger to figure out what's going on, but if done right it can make for easily maintained and readable code.

    1. Re:Makes sense to me by theshowmecanuck · · Score: 4, Insightful

      Any fool can write code that a computer can understand. Good programmers write code that humans can understand.
      -- Fowler

      --
      -- I ignore anonymous replies to my comments and postings.
  2. Your Article Is All Fluff, Reader Finds by kaputtfurleben · · Score: 5, Insightful

    This article uses a lot of words to say absolutely nothing.

    1. Re:Your Article Is All Fluff, Reader Finds by msauve · · Score: 5, Insightful

      I think they're advising that you remove all error checking, help messages, and logging, since that's not required for "core functionality."

      --
      "National Security is the chief cause of national insecurity." - Celine's First Law
    2. Re:Your Article Is All Fluff, Reader Finds by IamTheRealMike · · Score: 4, Insightful

      Plus other bits of code actually required to make it run.

      They also say that they think the same findings would hold for C++. So whilst it's a bit hard to know if this technique is useful without reading and pondering the paper, it isn't saying much about Java specifically.

      That said - we all know Java is a very simple and verbose language. That has some advantages like ultra-fast compiles, but lots of disadvantages too. So here I'm gonna point out Kotlin, which is a new JVM language with transparent Java interop (in both directions). It's a lot more concise and expressive than Java, whilst simultaneously having a stricter type system. The neat thing about Kotlin is, it's developed by JetBrains so you get completely seamless integration with their refactoring IDE. Also there is a Java-to-Kotlin converter feature that lets you turn a Java file into a Kotlin file instantly, and you can convert a codebase on a class-by-class basis. So you can start using the features of the new language right away. Also, it runs on Java 6, so it's Android compatible.

    3. Re:Your Article Is All Fluff, Reader Finds by goose-incarnated · · Score: 4, Insightful

      Well yes, it is about Java, the first language to mix coding with literature.

      It wasn't the first. LaTeX :-)

      --
      I'm a minority race. Save your vitriol for white people.
  3. Same for any code by Ubi_NL · · Score: 4, Insightful

    In my experience, 80% of my code deals with checking for user error and things like that (e.g. not entering a string where I expect a number, or checking that this socket really exists). This is important functionality, but indeed, it is not 'core'...

    --

    If an experiment works, something has gone wrong.
    1. Re:Same for any code by Dutch+Gun · · Score: 5, Insightful

      Agreed. As the saying goes: "The devil is in the details".

      It's often very easy and quick to write the "core" functionality, but dealing with exceptions (both in workflow and code), one-offs and special rules, shifting requirements, scope creep, etc, etc... It may not be core, but it's a huge amount of work to write it all. I remember a saying that went something like "80 percent done... now you've only got 80 percent to go", meaning that the perception of being "nearly finished" is much different than the reality.

      It's especially bad when you're racing to meet a milestone with payment tied to specific functionality (I've seen this in the videogame industry), and just barely write enough code to more or less hit that "easy" initial 80 percent, but never get to that "last 80 percent" until the end of the project. It ends up as a hellish crunch-mode disaster at the end of the project, with managers not understanding why the project seems to implode near the end.

      --
      Irony: Agile development has too much inertia to be abandoned now.
    2. Re:Same for any code by quantaman · · Score: 4, Insightful

      I agree that a certain level of fluff is essential, but some also comes from the language itself. Getters/setters are a great example: that's a lot of fluff that almost vanishes in a language like Python without detracting from maintainability or stability. Errors are a more subtle example. What kinds of errors are possible given the language and API? At what level does the API want you to handle errors? How much code do you need to handle those errors properly? This can greatly influence the volume of necessary fluff.

      --
      I stole this Sig
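
A minimal Python sketch of the getters/setters point above. The `PlainAccount` and `Account` classes are hypothetical names invented for illustration; they come from neither the thread nor the paper. The sketch assumes the usual Python idiom: start with a bare attribute, and add a property later only if validation turns out to be needed, so callers never change.

```python
class PlainAccount:
    """Starts life with a bare attribute: no accessor methods at all."""

    def __init__(self, balance):
        self.balance = balance


class Account:
    """Same public attribute, with validation added later via a property."""

    def __init__(self, balance):
        self.balance = balance  # routed through the setter below

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value


# Callers use the same attribute syntax for both classes.
for account in (PlainAccount(10), Account(10)):
    account.balance += 5
    print(type(account).__name__, account.balance)
```

Whether this counts as less "fluff" in the paper's sense is a separate question; the sketch only shows that the accessor boilerplate can be deferred until it does real work.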
  4. Peanuts by Anonymous Coward · · Score: 5, Insightful

    No. This is what happens with a language with an extremely verbose API and extreme boiler-plate requirements. The best Java developer in the universe isn't going to be able to get around this.

  5. The alternative by halivar · · Score: 4, Insightful

    Imagine a language with no fluff, no cruft, no boilerplate. Everything is essential and concise. You have something akin to either assembly or too-clever Perl. The fluff is necessary. The fluff provides context, readability, and maintainability.

    1. Re:The alternative by Trepidity · · Score: 3, Insightful

      I agree you can get too clever with concise syntax, but Java really does not seem like it's at the optimal point on that tradeoff. Some really common things are very verbose, to the extent that it harms readability imo.

    2. Re:The alternative by mean+pun · · Score: 3, Insightful

      Extra lines give the code checking and refactoring tool more information.

  6. This sounds silly ... by gstoddart · · Score: 4, Insightful

    A couple of important points to keep in mind here. First, the MINSET itself is not executable; it's merely the smallest subset of the code which characterizes the core functionality. Some of the other 95% of the code (the chaff) is required to make it run, so it's not useless.

    So, we can do a computer transform on it to make it into something a computer can express efficiently, but we ignore the fact that the other 95% of the code is the error checking and other shit which you can't do without.

    The whole premise of this "study" has nothing to do with code, how to write it, or what that entails.

    I once had a co-worker who kept telling me that lisp or scheme would magically make it so you just wrote a two line program -- something like "getReady; justDoIt".

    When I asked him who the hell would write "getReady" and "justDoit", he seemed to think it would be some magic step which sorted itself out. The hard parts don't just magically happen. I can write main() in C which says "getReady(); justdoIt();" -- that doesn't mean that I don't need to implement those parts.

    This sounds equally stupid.

    Since when have coders started subscribing to wishful thinking where you just wave your hands and the computer does all the hard stuff?

    --
    Lost at C:>. Found at C.
  7. Source Versus Machine Code? by Bob9113 · · Score: 4, Insightful

    Really? Are they just pointing out that source code is meant for human readability, and the actual instructions are more concise? Is anyone surprised by this? Even a quick compression test shows me 80% reduction without even removing the most obviously human-oriented stuff like comments and long variable names.

    Can I get some of this research grant money? I've got a theory about sparse matrices mostly containing zeros.
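
A rough sketch of the kind of "quick compression test" described above, assuming the standard-library `zlib` module as the compressor. The `compression_savings` helper and the accessor-heavy sample string are made up for illustration. Note that this measures byte-level redundancy only; it is not the paper's MINSET measure, and the artificial sample is far more repetitive than real code.

```python
import zlib


def compression_savings(source: str) -> float:
    """Return the fraction of bytes saved by zlib at maximum compression."""
    raw = source.encode("utf-8")
    if not raw:
        return 0.0  # nothing to compress, so nothing saved
    packed = zlib.compress(raw, 9)
    return 1 - len(packed) / len(raw)


# Hypothetical sample: the same getter/setter pair repeated many times.
sample = (
    "public int getX() { return x; }\n"
    "public void setX(int x) { this.x = x; }\n"
) * 50

print(f"{compression_savings(sample):.0%} smaller after compression")
```

To try it on a real file, read the file's text and pass it to `compression_savings`; the result will depend heavily on the code base.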

  8. The 90/10 law was discovered years ago by JoeyRox · · Score: 3, Insightful

    90% of the time is spent executing 10% of the code. But when something goes wrong you want that other 90% of the code to be there so that you don't lose 100% of your work :)

  9. Re:Peanuts by Altus · · Score: 5, Insightful

    Yes

    --

    "In America, first you get the sugar, then you get the power, then you get the women..." -H. Simpson

  10. Waste in Housing by lordeveryman · · Score: 5, Insightful

    Did you know that only about 5% of the average house is actually load bearing? The rest is just fluff. Why are we wasting so much valuable material in houses?

  11. The alternative by Anonymous Coward · · Score: 2, Insightful

    If every single program in the universe contains the same boilerplate strings... They are indeed unnecessary. Java is just about the worst for this. Python requires drastically less redundant meaningless fluff.
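
One possible reading of the claim above, sketched in Python: the equality, hashing, and copying methods that other commenters describe as auto-generated in Java can be requested with a single decorator. `Point` is a hypothetical class used only for illustration; `dataclass` and `replace` are from the standard-library `dataclasses` module.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)  # generates __init__, __repr__, __eq__ and __hash__
class Point:
    x: int
    y: int


p = Point(1, 2)
q = replace(p, x=5)  # a copy with one field changed, standing in for clone

print(p == Point(1, 2))       # field-by-field equality
print(len({p, Point(1, 2)}))  # equal points hash alike, so the set deduplicates
print(q)
```

This only covers value-style classes; it says nothing about the other kinds of boilerplate discussed in the thread.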

  12. Re:Peanuts by gstoddart · · Score: 4, Insightful

    Hmmm, I don't know MakeRocketLauncherGoNow() vs Foo() ... yeah, I think having the code read like sentences makes a lot of sense.

    If the onus is on human readability, that simple sentence is more than I've seen many coders put in comments.

    --
    Lost at C:>. Found at C.
  13. Re:Peanuts by sycodon · · Score: 4, Insightful

    Any decent code written to be readable and maintainable has lots of "fluff". That's what makes it readable and easy to maintain.

    Much preferable to the mishmash of one line wonders that do ten different functions.

    --
    When Fascism comes to America, it will call itself Anti-Fascism, and tell you to give up your guns.
  14. Re:Peanuts by Qzukk · · Score: 4, Insightful

    You forgot the MakeRocketLauncherGoNowFactory, the MakeRocketLauncherGoNowFactoryFactory, the MakeRocketLauncherGoNowException, the ...

    --
    If I have been able to see further than others, it is because I bought a pair of binoculars.
  15. Re:Peanuts by supton · · Score: 5, Insightful

    Ok, here's the deal, sometimes readability is in fact a function of how succinct something is, not how verbose it is. In human (verbal) languages and in cross-cultural communication we refer to this as high-context and low-context language. In code, a parallel could be applied. Succinctness is not a value in itself (read Paul Graham's defense of Lisp vs. Python, I disagree with Graham), but it can often be a good means to an end when context surrounding your identifier choice is clear as freakin' day.

  16. Java is not written like other languages by buchner.johannes · · Score: 5, Insightful

    But contrary to Python or Ruby code, for example, most Java code is not written by hand. No one ever writes import statements, for example. Eclipse is so excellent at understanding Java code structure that the writing efficiency is comparable. It brings other benefits too -- I have found refactoring of large code bases is substantially easier in Java than in any other language. This is thanks to the strong structure implied by the language, which can be exploited by tools. In other languages this is not feasible, e.g. Ruby, where every word can mean something different and you cannot know until runtime, or C when cluttered with macros.

    --
    NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
  17. Re:New research finds water wet by popo · · Score: 5, Insightful

    Yes, but the point is silly anyway.

    The notion that everything that isn't core functionality is "fluff", gives the impression that it is non-essential.

    Let's say I have a weather application that reports meteorological data for a specific zipcode. Let's say that I have a super slick user interface, and I display animated weather graphics in HD.

    Fluff?

    Not at all. A spartan application which displayed a bunch of plaintext data might have zero downloads. Sexy eye candy might equate to 20 million downloads.

    Which raises the question: What is the actual point of this app? Is it to display weather information?

    No. The point of this app is to get downloaded.

    So what's "core" again?

    --
    ------ The best brain training is now totally free : )
  18. Re:Peanuts by phantomfive · · Score: 4, Insightful

    Any decent code written to be readable and maintainable has lots of "fluff". That's what makes it readable and easy to maintain.

    In my experience with real-life code bases, the more 'fluff,' the less readable and harder to maintain it becomes. If your hypothetical example of the one-line wonder has a problem, it is easy to rewrite. If a thousand-line program has a problem, it's harder to replace, even if (especially if?) it used many design patterns.

    My point there is, the more lines of code you have, the harder it is to maintain. I don't think that's controversial.

    Flexibility and maintainability come from well-defined interfaces between sections of code. They don't come from adding fluff.

    --
    "First they came for the slanderers and i said nothing."
  19. Re:Peanuts by angel'o'sphere · · Score: 3, Insightful

    You don't need (and no one does do it) a RocketLauncherFactory if you only have one RocketLauncher type.

    However such a Factory is quite useful if you happen to find a 'rocket' used in 'rocket launchers' and you need either an instance or a description of a launcher that actually can use that rocket.

    Also, factories are quite interesting as you can usually send them 'orders'. That means if you order a few rocket launchers, you can make sure that when you unwrap them there is an assorted set of ammunition in the case as well.

    But if you hate Factories and Exceptions ... no one can help you.

    --
    Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
  20. Re:What will change now? by StormReaver · · Score: 4, Insightful

    I still have visions of layers of adapter classes, which serve absolutely no purpose other than to appease Java.

    Those adapter classes exist to make interfaces with lots of methods easier to manage. I've learned and forgotten many languages over my 30 years of programming, but Java is one of those elegant languages that makes programming pleasant. The only thing I truly hate about it is the stupid memory limits imposed by its early life for applets. That one thing makes desktop programming more irritating than it needs to be.

  21. Re:Peanuts by angel'o'sphere · · Score: 4, Insightful

    Why are people always bringing up the die-hard fanboy argument?

    The least thing to say about it: it is impolite! What do you expect me to answer now? As a non-fanboy but serious Java developer I have to say ...???

    Sorry, I don't need to defend myself all the time for why I use a certain thing.

    I use a Mac, for good reasons. I use an iPhone for other good reasons. I use Java, but I also use C++, I don't use C, for good reasons.

    And after 35 years in the industry I can tell you: I'm very disappointed. The stuff that rules the world is run by marketing. Not by fan boys.

    Not to mention doing just about anything with most Java APIs involves all kinds of intermediary wrapper objects. That is complete nonsense.

    Starting a post with a general insult and then making a wrong claim is poor sportsmanship imho.

    --
    Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
  22. Re:Peanuts by firewrought · · Score: 3, Insightful

    Most of the "modern" languages seem to have this addiction to overly verbose libraries and obscenely long syntax. Do we really need method names that could constitute a simple sentence?

    Long names are fine and even valuable. The real gremlin is in overly-abstracted API's, code generators, verbose XML configuration files, and other tools/libraries that have sacrificed usability while pursuing long feature lists and total control over a particular problem domain.

    It is, in a funny way, the opposite usability trajectory that Gnome and many others in the UX crowd followed when they went off and started zealously reducing features in the name of simplicity.

    Personally, I think that the underlying design principles should be the same whether you're designing application interfaces to be used by the general public or whether you're designing API's to be used by developers: in both cases you're trying to take something complicated and make it simpler. Sure, add those new/advanced features when you can, but do so in a way that doesn't raise the learning curve for the most common use cases.

    --
    -1, Too Many Layers Of Abstraction
  23. Re:I always knew this was the case with Java by 0123456 · · Score: 1, Insightful

    And it still runs like a pig, thanks to garbage collection, lack of unsigned types, and the need to go through Java bytecode to generate the final host code. I remember the joyous days of running software that did encryption in Java rather than calling a native library, and ran at least ten times slower than it would have in C.