Slashdot Mirror


Your Java Code Is Mostly Fluff, New Research Finds

itwbennett writes In a new paper (PDF), researchers from the University of California, Davis, Southeast University in China, and University College London theorized that, just as with natural languages, some — and probably, most — written code isn't necessary to convey the point of what it does. The code and data used in the study are available for download from Bitbucket. But here's the bottom line: Only about 5% of written Java code captures the core functionality.

36 of 411 comments

  1. Makes sense to me by Anonymous Coward · · Score: 5, Insightful

    I'll admit I just read the summary article and not the paper itself, but I wouldn't say that this is overly surprising.

    Right off the bat, there's the preoccupation we Java types seem to have with accessor methods (which, if we're honest, do something besides just set or get a private member variable maybe 1% of the time; why the hell we still do this I don't know), plus the frequent necessity for hash, clone, and equals methods, most of which are auto-generated. You end up with a bunch of small methods that do very little but up the code count.
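To make the accessor point concrete, here is a minimal sketch of the kind of class being described. The `Point` class and its `distanceToOrigin` method are hypothetical and not taken from the paper or the thread; the only line that does any real work is the last method, and everything else is the usual hand-written or IDE-generated support code.

```java
import java.util.Objects;

// Hypothetical example class: most of its lines are accessor and equality boilerplate.
final class Point {
    private int x;
    private int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Accessors that only read or write a private field.
    int getX() { return x; }
    void setX(int x) { this.x = x; }
    int getY() { return y; }
    void setY(int y) { this.y = y; }

    // Typically auto-generated equality and hashing.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Point)) return false;
        Point p = (Point) other;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    // The one method that carries the class's actual behavior.
    double distanceToOrigin() {
        return Math.sqrt((double) x * x + (double) y * y);
    }
}
```

Whether that ratio is a problem is exactly what the rest of the thread argues about; the sketch only shows where the line count comes from.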

    Beyond that, I think good design usually works out this way. You (or at least I like to) build up in layers, each layer using the previous layer at a higher level, until you get to the top where you have a few seemingly simple bits of code that pull it all together. When you get big complex functions doing a bunch of stuff vs the described small functions adding little bits of functionality along the way, I think you are doing things wrong.

    That's not to say people (and this is common in Java) don't go way overboard and end up with huge chains of methods that just pass the buck and complex control structures where you need a debugger to figure out what's going on, but if done right it can make for easily maintained and readable code.

    1. Re:Makes sense to me by theshowmecanuck · · Score: 4, Insightful

      Any fool can write code that a computer can understand. Good programmers write code that humans can understand.
      -- Fowler

      --
      -- I ignore anonymous replies to my comments and postings.
  2. Your Article Is All Fluff, Reader Finds by kaputtfurleben · · Score: 5, Insightful

    This article uses a lot of words to say absolutely nothing.

    1. Re:Your Article Is All Fluff, Reader Finds by msauve · · Score: 5, Insightful

      I think they're advising that you remove all error checking, help messages, and logging, since that's not required for "core functionality."

      --
      "National Security is the chief cause of national insecurity." - Celine's First Law
    2. Re:Your Article Is All Fluff, Reader Finds by Anonymous Coward · · Score: 5, Funny

      Comments and descriptive variable and method names should also go, we're much better with "void x(int c) { a.b(c); x.b.g.y(c) }", as the real coders do not maintain code, they just write it. And the disk space is so expensive that even linefeeds should be avoided whenever possible.

    3. Re:Your Article Is All Fluff, Reader Finds by IamTheRealMike · · Score: 4, Insightful

      Plus other bits of code actually required to make it run.

      They also say that they think the same findings would hold for C++. So whilst it's a bit hard to know if this technique is useful without reading and pondering the paper, it isn't saying much about Java specifically.

      That said - we all know Java is a very simple and verbose language. That has some advantages like ultra-fast compiles, but lots of disadvantages too. So here I'm gonna point out Kotlin, which is a new JVM language with transparent Java interop (in both directions). It's a lot more concise and expressive than Java, whilst simultaneously having a stricter type system. The neat thing about Kotlin is, it's developed by JetBrains so you get completely seamless integration with their refactoring IDE. Also there is a Java-to-Kotlin converter feature that lets you turn a Java file into a Kotlin file instantly, and you can convert a codebase on a class-by-class basis. So you can start using the features of the new language right away. Also, it runs on Java 6, so it's Android compatible.

    4. Re:Your Article Is All Fluff, Reader Finds by goose-incarnated · · Score: 4, Insightful

      Well yes, it is about Java, the first language to mix coding with literature.

      It wasn't the first. LaTeX :-)

      --
      I'm a minority race. Save your vitriol for white people.
  3. Same for any code by Ubi_NL · · Score: 4, Insightful

    In my experience, 80% of my code deals with checking for user error and things like that (e.g. not entering a string where I expect a number, or checking whether this socket really exists). This is important functionality, but indeed, it is not 'core'...

    --

    If an experiment works, something has gone wrong.
    1. Re:Same for any code by Dutch Gun · · Score: 5, Insightful

      Agreed. As the saying goes: "The devil is in the details".

      It's often very easy and quick to write the "core" functionality, but dealing with exceptions (both in workflow and code), one-offs and special rules, shifting requirements, scope creep, etc, etc... It may not be core, but it's a huge amount of work to write it all. I remember a saying that went something like "80 percent done... now you've only got 80 percent to go", meaning that the perception of being "nearly finished" is much different than the reality.

      It's especially bad when you're racing to meet a milestone with payment tied to specific functionality (I've seen this in the videogame industry), and just barely write enough code to more or less hit that "easy" initial 80 percent, but never get that "last 80 percent" until the end of the project. It ends up as a hellish crunch-mode disaster at the end of the projects, with managers not understanding why the project seems to implode near the end.

      --
      Irony: Agile development has too much inertia to be abandoned now.
    2. Re:Same for any code by quantaman · · Score: 4, Insightful

      I agree that a certain level of fluff is essential, but some also comes from the language itself. Getters/setters are a great example: that's a lot of fluff that almost vanishes in a language like Python without detracting from maintainability or stability. Errors are a more subtle example. What kinds of errors are possible given the language and API? At what level does the API want you to handle errors? How much code do you need to handle those errors properly? This can greatly influence the volume of necessary fluff.

      --
      I stole this Sig
  4. Peanuts by Anonymous Coward · · Score: 5, Insightful

    No. This is what happens with a language with an extremely verbose API and extreme boiler-plate requirements. The best Java developer in the universe isn't going to be able to get around this.

  5. The alternative by halivar · · Score: 4, Insightful

    Imagine a language with no fluff, no cruft, no boilerplate. Everything is essential and concise. You have something akin to either assembly or too-clever Perl. The fluff is necessary. The fluff provides context, readability, and maintainability.

    1. Re:The alternative by Trepidity · · Score: 3, Insightful

      I agree you can get too clever with concise syntax, but Java really does not seem like it's at the optimal point on that tradeoff. Some really common things are very verbose, to the extent that it harms readability imo.

    2. Re:The alternative by mean pun · · Score: 3, Insightful

      Extra lines give the code checking and refactoring tool more information.

  6. 95% might be good enough for most... by bigsexyjoe · · Score: 5, Funny

    But I shoot to make 100% of the code I write fluff.

  7. This sounds silly ... by gstoddart · · Score: 4, Insightful

    A couple of important points to keep in mind here. First, the MINSET itself is not executable; it's merely the smallest subset of the code which characterizes the core functionality. Some of the other 95% of the code (the chaff) is required to make it run, so it's not useless.

    So, we can do a computer transform on it to make it into something a computer can express efficiently, but we ignore the fact that the other 95% of the code is the error checking and other shit which you can't do without.

    The whole premise of this "study" has nothing to do with code, how to write it, or what that entails.

    I once had a co-worker who kept telling me that lisp or scheme would magically make it so you just wrote a two line program -- something like "getReady; justDoIt".

    When I asked him who the hell would write "getReady" and "justDoIt", he seemed to think it would be some magic step which sorted itself out. The hard parts don't just magically happen. I can write main() in C which says "getReady(); justDoIt();" -- that doesn't mean that I don't need to implement those parts.

    This sounds equally stupid.

    Since when have coders started subscribing to wishful thinking where you just wave your hands and the computer does all the hard stuff?

    --
    Lost at C:>. Found at C.
    1. Re:This sounds silly ... by Anonymous Coward · · Score: 3, Funny

      Wow, your co-worker sounds like an idiot. Everyone knows in lisp it would be (justDoIt (getReady)). It's the functional paradigm that makes it magic and that makes it ONE line not two.

  8. Source Versus Machine Code? by Bob9113 · · Score: 4, Insightful

    Really? Are they just pointing out that source code is meant for human readability, and the actual instructions are more concise? Is anyone surprised by this? Even a quick compression test shows me 80% reduction without even removing the most obviously human-oriented stuff like comments and long variable names.
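A quick compression test of the kind mentioned above might look like the following sketch. The comment does not say which tool was used; `java.util.zip.Deflater` is only a stand-in here, and the `CompressionRatio` class name is hypothetical. The ratio it reports depends entirely on the input, so no particular percentage should be read into it.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical helper: measures how much a piece of source text shrinks under DEFLATE.
final class CompressionRatio {

    // Returns the number of bytes the text occupies after compression.
    static int compressedSize(String source) {
        byte[] input = source.getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();

        byte[] buffer = new byte[4096];
        int total = 0;
        // Drain the compressor; only the byte count matters, not the output itself.
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // Compressed size divided by original size; 1.0 for empty input.
    static double ratio(String source) {
        int original = source.getBytes(StandardCharsets.UTF_8).length;
        return original == 0 ? 1.0 : (double) compressedSize(source) / original;
    }
}
```

Calling `CompressionRatio.ratio(text)` on the contents of a source file gives a rough measure of redundancy, which is a different thing from the paper's notion of a minimal distinguishing subset.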

    Can I get some of this research grant money? I've got a theory about sparse matrices mostly containing zeros.

  9. The 90/10 law was discovered years ago by JoeyRox · · Score: 3, Insightful

    90% of the time is spent executing 10% of the code. But when something goes wrong you want that other 90% of the code to be there so that you don't lose 100% of your work :)

  10. Re:Peanuts by Altus · · Score: 5, Insightful

    Yes

    --

    "In America, first you get the sugar, then you get the power, then you get the women..." -H. Simpson

  11. Waste in Housing by lordeveryman · · Score: 5, Insightful

    Did you know that only about 5% of the average house is actually load bearing? The rest is just fluff. Why are we wasting so much valuable material in houses?

    1. Re:Waste in Housing by gstoddart · · Score: 4, Funny

      So your neighbors don't have to see your junk.

      --
      Lost at C:>. Found at C.
  12. Is this a Java problem? by rubypossum · · Score: 4, Informative

    It seems like the Java ecosystem is fine-tuned for producing a low signal-to-noise ratio as far as intent of code is concerned. So much of the ecosystem stresses templates, massive IDEs and other automated tools that make the production of thousands of lines of unnecessary boilerplate incredibly easy. Besides, isn't this the nature of Java anyway? It seems like it's designed to produce the most verbose code possible in the hope that if everything is explicit more bugs can be diagnosed since the compiler has more to work with. It's almost a troll article, seriously, it's like the guy is just trying to piss people off.

    --
    I have a theory that the truth is never told during the nine-to-five hours. - Hunter S. Thompson
  13. Re:Peanuts by gstoddart · · Score: 4, Insightful

    Hmmm, I don't know MakeRocketLauncherGoNow() vs Foo() ... yeah, I think having the code read like sentences makes a lot of sense.

    If the onus is on human readability, that simple sentence is more than I've seen many coders put in comments.

    --
    Lost at C:>. Found at C.
  14. Re:Peanuts by sycodon · · Score: 4, Insightful

    Any decent code written to be readable and maintainable has lots of "fluff". That's what makes it readable and easy to maintain.

    Much preferable to the mishmash of one line wonders that do ten different functions.

    --
    When Fascism comes to America, it will call itself Anti-Fascism, and tell you to give up your guns.
  15. Re:Peanuts by Qzukk · · Score: 4, Insightful

    You forgot the MakeRocketLauncherGoNowFactory, the MakeRocketLauncherGoNowFactoryFactory, the MakeRocketLauncherGoNowException, the ...

    --
    If I have been able to see further than others, it is because I bought a pair of binoculars.
  16. Re:Peanuts by supton · · Score: 5, Insightful

    Ok, here's the deal, sometimes readability is in fact a function of how succinct something is, not how verbose it is. In human (verbal) languages and in cross-cultural communication we refer to this as high-context and low-context language. In code, a parallel could be applied. Succinctness is not a value in itself (read Paul Graham's defense of Lisp vs. Python, I disagree with Graham), but it can often be a good means to an end when context surrounding your identifier choice is clear as freakin' day.

  17. Java is not written like other languages by buchner.johannes · · Score: 5, Insightful

    But contrary to python or ruby code, for example, most Java code is not written by hand. No one ever writes import statements for example. Eclipse is so excellent at understanding Java code structure that the writing efficiency is comparable. It brings other benefits too -- I have found re-factoring of large code bases is substantially easier in Java than any other language. This is thanks to the strong structure implied by the language, which can be exploited by tools. In other languages this is prohibited, e.g. Ruby, where every word can mean something different and you can not know until runtime, or C when cluttered with macros.

    --
    NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
  18. Re:New research finds water wet by popo · · Score: 5, Insightful

    Yes, but the point is silly anyway.

    The notion that everything that isn't core functionality is "fluff", gives the impression that it is non-essential.

    Let's say I have a weather application that reports meteorological data for a specific zipcode. Let's say that I have a super slick user interface, and I display animated weather graphics in HD.

    Fluff?

    Not at all. A spartan application which displayed a bunch of plaintext data might have zero downloads. Sexy eye candy might equate to 20 million downloads.

    Which raises the question: What is the actual point of this app? Is it to display weather information?

    No. The point of this app is to get downloaded.

    So what's "core" again?

    --
    ------ The best brain training is now totally free : )
  19. Re:Peanuts by phantomfive · · Score: 4, Insightful

    Any decent code written to be readable and maintainable has lots of "fluff". That's what makes it readable and easy to maintain.

    In my experience with real-life code bases, the more 'fluff,' the less readable and harder to maintain it becomes. If your hypothetical example of the one-line wonder has a problem, it is easy to rewrite. If a thousand-line program has a problem, it's harder to replace, even if (especially if?) it used many design patterns.

    My point there is, the more lines of code you have, the harder it is to maintain. I don't think that's controversial.

    Flexibility and maintainability come from well-defined interfaces between sections of code. They don't come from adding fluff.

    --
    "First they came for the slanderers and i said nothing."
  20. Re:Peanuts by angel'o'sphere · · Score: 3, Insightful

    You don't need (and no one does do it) a RocketLauncherFactory if you only have one RocketLauncher type.

    However such a Factory is quite useful if you happen to find a 'rocket' used in 'rocket launchers' and you need either an instance or a description of a launcher that actually can use that rocket.

    Also, factories are quite interesting because you can usually send them 'orders'. That means if you order a few rocket launchers, you can make sure when you unwrap them that there is an assorted set of ammunition in the case as well.

    But if you hate Factories and Exceptions ... no one can help you.

    --
    Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
  21. Re:What will change now? by StormReaver · · Score: 4, Insightful

    I still have visions of layers of adapter classes, which serve absolutely no purpose other than to appease Java.

    Those adapter classes exist to make interfaces with lots of methods easier to manage. I've learned and forgotten many languages over my 30 years of programming, but Java is one of those elegant languages that makes programming pleasant. The only thing I truly hate about it is the stupid memory limits imposed by its early life for applets. That one thing makes desktop programming more irritating than it needs to be.

  22. Re:Peanuts by angel'o'sphere · · Score: 4, Insightful

    Why are people always bringing up the die-hard fanboy argument?

    The least thing to say about it: it is impolite! What do you expect me to answer now? As a non-fanboy but serious Java developer I have to say ...???

    Sorry, I don't need to defend myself all the time for why I use a certain thing.

    I use a Mac, for good reasons. I use an iPhone for other good reasons. I use Java, but I also use C++, I don't use C, for good reasons.

    And after 35 years in the industry I can tell you: I'm very disappointed. The stuff that rules the world is run by marketing. Not by fan boys.

    Not to mention doing just about anything with most Java APIs involves all kinds of intermediary wrapper objects. That is complete nonsense.

    Starting a post with a general insult and then making a wrong claim is poor sportsmanship imho.

    --
    Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
  23. Re:Peanuts by firewrought · · Score: 3, Insightful

    Most of the "modern" languages seem to have this addiction to overly verbose libraries and obscenely long syntax. Do we really need method names that could constitute a simple sentence?

    Long names are fine and even valuable. The real gremlin is in overly-abstracted API's, code generators, verbose XML configuration files, and other tools/libraries that have sacrificed usability while pursuing long feature lists and total control over a particular problem domain.

    It is, in a funny way, the opposite usability trajectory that Gnome and many others in the UX crowd followed when they went off and started zealously reducing features in the name of simplicity.

    Personally, I think that the underlying design principles should be the same whether you're designing application interfaces to be used by the general public or whether you're designing API's to be used by developers: in both cases you're trying to take something complicated and make it simpler. Sure, add those new/advanced features when you can, but do so in a way that doesn't raise the learning curve for the most common use cases.

    --
    -1, Too Many Layers Of Abstraction
  24. Nonsense by Anonymous Coward · · Score: 4, Interesting

    the code written, in the summer of 2012 the researchers downloaded 1,000 of the most popular Java projects from Apache, Eclipse, GitHub, and SourceForge. From that they got 100 million lines of Java code and tossed out simple methods (those with less than 50 tokens).

    So they tossed methods that were written well (methods that only do one thing)? So if you wrote a simple 2 line validation of an input field: field must be populated, field must match regex. They tossed that as chaff?
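For reference, the sort of two-line validation described here would fall well under a 50-token cutoff. A minimal sketch, assuming a hypothetical `ZipCodeField` class and a five-digit format chosen purely for illustration:

```java
import java.util.regex.Pattern;

// Hypothetical field validator: "must be populated" and "must match a regex".
final class ZipCodeField {
    // Illustrative format only: exactly five digits.
    private static final Pattern FORMAT = Pattern.compile("\\d{5}");

    static boolean isValid(String value) {
        // Rule 1: the field must be populated.
        if (value == null || value.isEmpty()) return false;
        // Rule 2: the whole value must match the expected pattern.
        return FORMAT.matcher(value).matches();
    }
}
```

A method like this does one thing and is trivially short, which is the commenter's point about what a token-count filter ends up excluding.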

    1. Re:Nonsense by lgw · · Score: 5, Interesting

      So they tossed methods that were written well (methods that only do one thing)? So if you wrote a simple 2 line validation of an input field: field must be populated, field must match regex. They tossed that as chaff?

      Why the Hell should you have to write code over and over to validate that a reference isn't null, or an int is positive, or other such cases? Surely that's all part of the interface contract anyhow, right? For that matter, why is "allowed to be null" the default rather than an exceptional special case? Why isn't there a simple operator that decorates a parameter as "nullable" with a single character?

      Why not simply

      public Foo foo;

      No getter or setter needed, by default it can't be null. For those odd cases where null actually means something useful, then just write:

      public Foo? foo;

      This goes double for C#, where "?" is already established as the "nullable" decorator.

      Worth noting that many Java coders use Lombok to effectively achieve this already, just with auto-generated getters and setters, since we lack the courage and/or authority to just have public members instead of pointless getters and setters.

      And, above all else, give us a way to declare that the returned value can't be null, and auto-throw if it is, so the caller never has to check!
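Java has no `Foo?` syntax, but something close to the wish above can be approximated with the standard library. The sketch below is only one possible approach, not what the commenter proposes: the `Account` class and its fields are hypothetical, `Objects.requireNonNull` does the auto-throw at construction time, and `Optional` marks the odd case where absence actually means something.

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical class: a public member that can never be null, plus one nullable value.
final class Account {
    // Checked once in the constructor, so callers never need a null check.
    public final String owner;

    // Allowed to be absent; exposed through Optional instead of a raw null.
    private final String nickname;

    Account(String owner, String nickname) {
        // Throws NullPointerException with this message if owner is null.
        this.owner = Objects.requireNonNull(owner, "owner must not be null");
        this.nickname = nickname;
    }

    // The return type itself tells the caller the value may be missing.
    Optional<String> nickname() {
        return Optional.ofNullable(nickname);
    }
}
```

The check still runs at runtime rather than being enforced by the compiler, so this is weaker than a nullable type in the language, but it does move the validation to a single place.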

      --
      Socialism: a lie told by totalitarians and believed by fools.