Your Java Code Is Mostly Fluff, New Research Finds
itwbennett writes In a new paper (PDF), researchers from the University of California, Davis, Southeast University in China, and University College London theorized that, just as with natural languages, some — and probably, most — written code isn't necessary to convey the point of what it does. The code and data used in the study are available for download from Bitbucket. But here's the bottom line: Only about 5% of written Java code captures the core functionality.
I'll admit I just read the summary article and not the paper itself, but I wouldn't say that this is overly surprising.
Right off the bat, we Java types seem preoccupied with accessor methods (which, if we're honest, do something besides set or get a private member variable maybe 1% of the time; why the hell we still do this I don't know). Add the frequent necessity for hash, clone, and equals methods, most of which are auto-generated, and you end up with a bunch of small methods that do very little but up the code count.
Beyond that, I think good design usually works out this way. You (or at least I like to) build up in layers, each layer using the previous layer at a higher level, until you get to the top where you have a few seemingly simple bits of code that pull it all together. When you get big complex functions doing a bunch of stuff vs the described small functions adding little bits of functionality along the way, I think you are doing things wrong.
That's not to say people (and this is common in Java) don't go way overboard and end up with huge chains of methods that just pass the buck and complex control structures where you need a debugger to figure out what's going on, but if done right it can make for easily maintained and readable code.
This article uses a lot of words to say absolutely nothing.
In my experience, 80% of my code deals with checking for user error and things like that (e.g., not entering a string where I expect a number, or checking that this socket really exists). This is important functionality, but indeed, it is not 'core'...
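A minimal Java sketch of that ratio, for illustration only. The class name `PortInput`, the method `parsePort`, and the port-range rule are all made up here; the single "core" line is the `Integer.parseInt` call, and everything around it is checking.

```java
import java.util.Optional;

// Hypothetical example: names and rules are illustrative, not from the thread.
final class PortInput {
    private PortInput() {}

    /** Returns the parsed port, or empty if the text is not a usable port number. */
    static Optional<Integer> parsePort(String text) {
        if (text == null) {
            return Optional.empty();            // nothing was entered
        }
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return Optional.empty();            // only whitespace
        }
        final int port;
        try {
            port = Integer.parseInt(trimmed);   // the one "core" line
        } catch (NumberFormatException e) {
            return Optional.empty();            // a string where a number was expected
        }
        if (port < 1 || port > 65535) {
            return Optional.empty();            // a number, but outside the port range
        }
        return Optional.of(port);
    }
}
```

Strip the checks and the method collapses to one line, which is roughly the commenter's point: the checks are not "core", but the method is not usable without them.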
No. This is what happens with a language with an extremely verbose API and extreme boiler-plate requirements. The best Java developer in the universe isn't going to be able to get around this.
Imagine a language with no fluff, no cruft, no boilerplate. Everything is essential and concise. You have something akin to either assembly or too-clever Perl. The fluff is necessary. The fluff provides context, readability, and maintainability.
But I shoot to make 100% of the code I write fluff.
There is an old phrase about code being poetry. Java's the flowery kind rather than the haiku.
A couple of important points to keep in mind here. First, the MINSET itself is not executable; it's merely the smallest subset of the code which characterizes the core functionality. Some of the other 95% of the code (the chaff) is required to make it run, so it's not useless.
So, we can do a computer transform on it to make it into something a computer can express efficiently, but we ignore the fact that the other 95% of the code is the error checking and other shit which you can't do without.
The whole premise of this "study" has nothing to do with code, how to write it, or what that entails.
I once had a co-worker who kept telling me that lisp or scheme would magically make it so you just wrote a two line program -- something like "getReady; justDoIt".
When I asked him who the hell would write "getReady" and "justDoit", he seemed to think it would be some magic step which sorted itself out. The hard parts don't just magically happen. I can write main() in C which says "getReady(); justdoIt();" -- that doesn't mean that I don't need to implement those parts.
This sounds equally stupid.
Since when have coders started subscribing to wishful thinking where you just wave your hands and the computer does all the hard stuff?
Really? Are they just pointing out that source code is meant for human readability, and the actual instructions are more concise? Is anyone surprised by this? Even a quick compression test shows me 80% reduction without even removing the most obviously human-oriented stuff like comments and long variable names.
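A quick compression test of the kind mentioned above can be sketched in Java with the standard `java.util.zip.Deflater`. The class name `CompressionRatio` and the sample text are assumptions for illustration; the ratio you get depends entirely on the input, so no particular figure is claimed here.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Illustrative sketch: measures how well a piece of source text compresses.
final class CompressionRatio {
    private CompressionRatio() {}

    /** Returns compressed size divided by original size for the given text. */
    static double ratio(String source) {
        byte[] input = source.getBytes(StandardCharsets.UTF_8);
        if (input.length == 0) {
            return 1.0; // nothing to compress
        }
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[4096];
        long compressed = 0;
        while (!deflater.finished()) {
            compressed += deflater.deflate(buffer); // count the bytes, discard the data
        }
        deflater.end();
        return (double) compressed / input.length;
    }

    public static void main(String[] args) {
        // Made-up, deliberately repetitive "source code" as sample input.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("public int getValue").append(i % 5).append("() { return value; }\n");
        }
        System.out.printf("compressed/original = %.3f%n", ratio(sb.toString()));
    }
}
```

Note that this measures textual redundancy, which is not the same thing as the paper's notion of core functionality.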
Can I get some of this research grant money? I've got a theory about sparse matrices mostly containing zeros.
90% of the time is spent executing 10% of the code. But when something goes wrong you want that other 90% of the code to be there so that you don't lose 100% of your work :)
Yes
Did you know that only about 5% of the average house is actually load bearing? The rest is just fluff. Why are we wasting so much valuable material in houses?
If every single program in the universe contains the same boilerplate strings... They are indeed unnecessary. Java is just about the worst for this. Python requires drastically less redundant meaningless fluff.
In answer to your off-topic questions, no and no.
I'm curious as to why this matters. When I write functions I write lots of other code that doesn't pertain specifically to the objective but is required to provide stable reusable code. E.g. Re-working the data that was input so it can fit within the mold that is the core isn't representative of what the program's objectives BUT is required to achieve the final goal. Same goes for the interface and the validation routines. They don't depict the core function of the software but are critical to the successful use of said core code.
Am I surprised by the 5%? The answer is no. In most of my projects, lots of work goes into presentation and input validation. After all, making machines compatible with people isn't always easy.
No. This is what happens with a language with an extremely verbose API and extreme boiler-plate requirements. The best Java developer in the universe isn't going to be able to get around this.
Well, arguably Project Lombok is a small but good start.
For Java, or any other language, removing a lot of boilerplate code would drastically increase the cost of code maintenance. If there's a linked library I already know which functions it includes and I'm free to pick and choose when modifying the code. Furthermore its inclusion allows commonality between code segments.
Of course spaghetti code is bad and plugging in arbitrary lines without understanding them tends to create spaghetti code. But what would be way worse is reducing every program to its core functionality.
Other extra lines of code serve to make a program easier to maintain. Separating functions where you don't really have to, following some expansionary coding rules, and the like create a little inefficiency to avoid creating a good deal more inefficiency for other reasons. Then there are API's, as someone else mentioned, and comments. Good code should contain a large percentage of nonfunctional lines.
Indeed.
I love Java, but not even a diehard fanboy will argue that it isn't excessively verbose and loaded with boilerplate code. The amount of code attributed to various getters, setters, and comparison methods alone often eclipses the actual functionality of a class. Not to mention doing just about anything with most Java APIs involves all kinds of intermediary wrapper objects.
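To make the proportion concrete, here is a hypothetical two-field value class written out by hand (the class `Point` is invented for this example). The information content is the two field declarations; the rest is the ceremony the comment is talking about.

```java
import java.util.Objects;

// Hypothetical value class: two ints, plus everything Java traditionally asks for.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int getX() { return x; }
    int getY() { return y; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Point)) return false;
        Point that = (Point) other;
        return x == that.x && y == that.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    @Override
    public String toString() {
        return "Point(" + x + ", " + y + ")";
    }
}
```

On Java 16 and later, `record Point(int x, int y) {}` generates a constructor, accessors (named `x()` and `y()` rather than `getX()`), `equals`, `hashCode`, and `toString` from a single line, which is one way the language has since cut this particular kind of boilerplate.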
I learned how to do 90% of the work in a week, but the other 10% you never finish learning how to do. Of course, that last 10% is the difference between a professional/expert and a rank amateur.
I think this is due to the mental capacity of human beings. If the job is so complex you can't learn how to do most of the work quickly, then we split the job into two or more sub-jobs.
The same guy that plants the food no longer transports it or cooks it. Why? Because those jobs have gotten so complex.
But back to the main point, the simpler stuff may not get the applause, but it is still a large part of the work. Just because someone doesn't think that declaring the variable counts as relevant to the program, it doesn't mean it shouldn't 'count'.
Until we can read and write in huffman encoding, that's the way programming languages will always be.
It seems like the Java ecosystem is fine-tuned for producing a low signal-to-noise ratio as far as intent of code is concerned. So much of the ecosystem stresses templates, massive IDEs and other automated tools that make the production of thousands of lines of unnecessary boilerplate incredibly easy. Besides, isn't this the nature of Java anyway? It seems like it's designed to produce the most verbose code possible in the hope that if everything is explicit more bugs can be diagnosed since the compiler has more to work with. It's almost a troll article; seriously, it's like the guy is just trying to piss people off.
Hmmm, I don't know MakeRocketLauncherGoNow() vs Foo() ... yeah, I think having the code read like sentences makes a lot of sense.
If the onus is on human readability, that simple sentence is more than I've seen many coders put in comments.
Sure, I could throw out a huge amount of the java code I write.
I could use import package.* instead of import package.classname . And for the code I'm writing I could put all of my classes in the same package and eliminate importing any of them.
Next I could also throw out all the logging and error handling.
There's also the ditching of the corner cases, which aren't "core functionality" either.
But that's not exactly useful for anything other than an academic project now is it?
Any decent code written to be readable and maintainable has lots of "fluff". That's what makes it readable and easy to maintain.
Much preferable to the mishmash of one line wonders that do ten different functions.
I am sure it depends on the chosen technology, though (partly because the technology defines a select group of authors).
This percentage would probably go up to the low 20-30% range in C++/Objective C and the like and well over 50% in C. Assembly would surely be virtually 100%.
I wonder what Perl or Python would get, though (probably would fare only a bit better than Java)
Pure speculation, of course.
Yes
Yes we do. How many times have you fat-fingered variable ll when you meant variable l1 (or something similar)?
Making code painfully clear may not be essential to the original coder, but it is vital to future users and maintainers, not to mention the benefit for learners.
So the answer is Hell Yes! we need verbose code.
You forgot the MakeRocketLauncherGoNowFactory, the MakeRocketLauncherGoNowFactoryFactory, the MakeRocketLauncherGoNowException, the ...
Ok, here's the deal, sometimes readability is in fact a function of how succinct something is, not how verbose it is. In human (verbal) languages and in cross-cultural communication we refer to this as high-context and low-context language. In code, a parallel could be applied. Succinctness is not a value in itself (read Paul Graham's defense of Lisp vs. Python, I disagree with Graham), but it can often be a good means to an end when context surrounding your identifier choice is clear as freakin' day.
But unlike Python or Ruby code, for example, most Java code is not written by hand. No one ever writes import statements, for example. Eclipse is so excellent at understanding Java code structure that the writing efficiency is comparable. It brings other benefits too -- I have found refactoring of large code bases is substantially easier in Java than in any other language. This is thanks to the strong structure implied by the language, which can be exploited by tools. In other languages this is not possible, e.g. Ruby, where every word can mean something different and you cannot know until runtime, or C when cluttered with macros.
10 PRINT "HAVE YOU TRIED BASIC?"
20 INPUT X$
30 IF X$ <> "YES" GOTO 50
40 PRINT "WHY AREN'T YOU USING IT?"
50 PRINT "YOU SHOULD, IT HAS NO LIBRARIES AND SHORT KEYWORDS."
60 END
It should be apparent to an experienced software engineer that over time, better engineering teams will reduce the fluff. That doesn't happen in a SOC project. It happens over several years of iterative development, and tightening up of requirements.
Yes, but the point is silly anyway.
The notion that everything that isn't core functionality is "fluff", gives the impression that it is non-essential.
Let's say I have a weather application that reports meteorological data for a specific zipcode. Let's say that I have a super slick user interface, and I display animated weather graphics in HD.
Fluff?
Not at all. A spartan application which displayed a bunch of plaintext data might have zero downloads. Sexy, eye candy might equate to 20 million downloads.
Which raises the question: What is the actual point of this app? Is it to display weather information?
No. The point of this app is to get downloaded.
So what's "core" again?
This sounds like the same hand-wavy BS that spawned our current infestation of Agile consultants.
They aren't even trying to be scientific here; this is just baldfaced click-bait, likely commissioned by some unproductive company who wants to look like a "thought leader." What are they even defining as "wheat" and "chaff"? Who decides which lines of code are which? Who decides who gets to decide that? What does it even mean to describe what code "does"?
Smart people can disagree about best practices and what constitutes "good" code - ultimately, I think most of it boils down to personal taste rather than any notion of objective correctness or big-picture productivity. Personally, I feel most productive in Java - but that's because of an interlocking mesh of many subtle reasons and has nothing to do with how many bytes my code files take up.
The original paper is here: http://arxiv.org/pdf/1502.0141... What they effectively do is create a set of all the tokens and punctuation in a method and then compress that set to include only those tokens that are "useful". They then compare the length of this set with the original method. I don't see how this is any more useful than compiling the method, looking at its bytes, and stating this method is mostly chaff since it can be reduced into a single 0 and 1, i.e. its BINSET is {0, 1}.
An example threshed set they provide for java bubblesort is: int, length, =, array, ., for, (, 1, 0, , ;, if, ++, ), {, [, j, 1, ., |, ], temp, }. If you're a programmer, you can probably see how silly this is just from looking at that set.
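The "set of tokens versus original length" comparison described above can be approximated with a naive tokenizer. To be clear, this is not the paper's threshing algorithm: it only collects distinct tokens and makes no attempt to decide which ones are "useful". The class name `TokenSetSketch` and the regular expression are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a naive tokenizer, not the paper's threshing algorithm.
final class TokenSetSketch {
    private TokenSetSketch() {}

    // Identifiers and keywords, integer literals, "++" and "--",
    // or any other single non-whitespace character.
    private static final Pattern TOKEN =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*|\\d+|\\+\\+|--|\\S");

    /** Splits source text into tokens, in order of appearance. */
    static List<String> tokens(String source) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    /** The distinct tokens, in first-seen order. */
    static Set<String> distinctTokens(String source) {
        return new LinkedHashSet<>(tokens(source));
    }

    public static void main(String[] args) {
        // A hand-written bubble sort body used as sample input.
        String method =
                "for (int i = 0; i < array.length - 1; i++) {\n"
              + "  for (int j = 0; j < array.length - 1 - i; j++) {\n"
              + "    if (array[j] > array[j + 1]) {\n"
              + "      int temp = array[j];\n"
              + "      array[j] = array[j + 1];\n"
              + "      array[j + 1] = temp;\n"
              + "    }\n"
              + "  }\n"
              + "}\n";
        System.out.println(tokens(method).size() + " tokens, "
                + distinctTokens(method).size() + " distinct");
    }
}
```

Even this crude version shows why the resulting set is not executable: order and repetition are exactly what gets thrown away.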
Any decent code written to be readable and maintainable has lots of "fluff". That's what makes it readable and easy to maintain.
In my experience with real-life code bases, the more 'fluff,' the less readable and harder to maintain it becomes. If your hypothetical example of the one-line wonder has a problem, it is easy to rewrite. If a thousand-line program has a problem, it's harder to replace, even if (especially if?) it used many design patterns.
My point there is, the more lines of code you have, the harder it is to maintain. I don't think that's controversial.
Flexibility and maintainability come from well-defined interfaces between sections of code. It doesn't come from adding fluff.
Yes, if you want your code to be human readable and self-documenting. If you want something with little or no fluff, maybe go the assembly language route?
Really it is a culture who have been mining in a pit for so long that they reason that getting to China is the easiest way out. They might be right.
You should have added a semicolon to one of the PRINT lines, just to confuse people.
Even comments about C++ have memory leaks.
You don't need (and no one does do it) a RocketLauncherFactory if you only have one RocketLauncher type.
However such a Factory is quite useful if you happen to find a 'rocket' used in 'rocket launchers' and you need either an instance or a description of a launcher that actually can use that rocket.
Also, factories are quite interesting because you can usually send them 'orders'. That means if you order a few rocket launchers you can make sure when you unwrap them that there is an assorted set of ammunition in the case as well.
But if you hate Factories and Exceptions ... no one can help you.
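The "find a rocket, get a launcher that can fire it" case can be sketched as a small static factory. Every name here (`RocketLauncher`, `ShoulderLauncher`, `VehicleLauncher`, `RocketLauncherFactory`, `forRocket`) and the 100 mm cutoff are invented to match the thread's running joke; this is one possible shape, not a recommended design.

```java
// All names here are hypothetical, chosen to match the running joke in the thread.
interface RocketLauncher {
    String describe();
}

final class ShoulderLauncher implements RocketLauncher {
    @Override
    public String describe() { return "shoulder-fired launcher"; }
}

final class VehicleLauncher implements RocketLauncher {
    @Override
    public String describe() { return "vehicle-mounted launcher"; }
}

final class RocketLauncherFactory {
    private RocketLauncherFactory() {}

    /** Picks a launcher type that can fire a rocket of the given diameter. */
    static RocketLauncher forRocket(int diameterMm) {
        if (diameterMm <= 0) {
            throw new IllegalArgumentException("diameter must be positive: " + diameterMm);
        }
        // The caller never names a concrete class; the 100 mm cutoff is made up.
        return diameterMm <= 100 ? new ShoulderLauncher() : new VehicleLauncher();
    }
}
```

With only one launcher type the factory would add nothing, which matches the comment's first point: the indirection earns its keep only once the choice of concrete class depends on the input.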
I wouldn't call all of that eye candy. Dynamic graphics can display a huge amount of information really quickly. Your example, ironically, is where 'eye candy' is really useful.
Your Java Code Is Mostly Fluff
Nope. Not mine. Coming from C and C++, I know what I'm doing.
The purpose of java is to keep legions of mediocre corporate coders from doing too much damage to each other, and it does pretty good at that. Tediously spelling everything out is one way to try to force some context for code you don't see often. Think COBOL.
Three things:
- First, Java is needlessly wordy - consider the necessity of explicitly writing getters/setters for any class where you want access control. What a pile of code for nothing.
- Second, you can write cryptic code or you can write understandable code. Understandable code involves a few more newlines, so what?
- Lastly, depending on your developers, yes, you can have overly long code. Someone who re-implements the same functionality 10 times instead of defining an abstract class and implementing it once - such developers exist. If you have one on your team, I do feel sorry for you. How prevalent is this? No idea...
Of course, TFA wasn't really about any of this. It is about a semantic analysis that determines the number of unique concepts in a method, reducing it to a "minset" which is no longer executable. This is an interesting theoretical analysis, but doesn't have a lot to do with real programs designed to actually perform actions with those concepts. Some methods are wordy because you want them to be clear, others are wordy because of what you are doing, and still others are wordy because of characteristics of the language you are working in.
Well, I could have told you this for free. I think in some cases, particularly legacy codebases, 5% is pretty generous too.
Well, no. They're doing none of that.
From a quick skim through the paper, they more or less conclude that java program text compresses really well, since it's full of redundancy, scaffolding, and so on, and so forth. I'd say they need quite a few words to beat around the bush and imagine all sorts of more or less related things, but this is the core of their findings.
This finding is fairly obvious, since it is well known, certainly compared to certain other languages, but now served in some light science sauce made with questionable methodology. That last bit again from skimming.
The piece written around it is equally fluffy and even the things mentioned to "improve" on this mostly involve writing more code, of which we already have a lot containing a large percentage of this "chaff".
The real question is whether or not this scaffolding is a waste of time. One might say obviously yes, yet the market says no:
There's a large market for (mediocre and therefore easily replaceable) java programmers, and by extension a lot of money in grinding out this scaffolding, since without it java programs are not complete and therefore won't do anything.
Another point: There is also a large market for PHP "programmers" grinding out execrable code in an execrable language, with lots of padding to make up for obvious deficiencies in the fabric of the language -- as in PHP such things are very rarely the result of deliberate design choices, as they not unlikely are in java, instead usually the result of some incompetent code contributor missing a point or other while adding yet another misfit misfeature.
There are other languages around that more easily facilitate much more concise code (such as lisp, mentioned as 'List' in the paper) but those aren't half as popular.
Thus, if there is wisdom in markets and crowds, then this chaff must add some desirable property to the services of (mediocre) programmers. Therefore, the obvious follow-up on noting that this here programming language is rather verbose, the search for expressivity, is not something the market puts a premium on.
IMO these people were having a good time crunching source in some number crunching tool and are mostly in search of more funding. This too is not unusual in that environment. IOW, dime-a-dozen study trotting out a well-known fact for great funding. What else is new in academia?
On the other hand, what makes Java powerful compared to C/C++ is its optimizer. Given that, among other things, there's no [directly accessible/modifiable] pointer in Java, the compiler gets much more room for deep and complex optimization.
Why are people always bringing up the die-hard fanboy argument?
The least one can say about it: it is impolite! What do you expect me to answer now? As a non-fanboy but serious Java developer I have to say ...???
Sorry, I don't need to defend myself all the time for why I use a certain thing.
I use a Mac, for good reasons. I use an iPhone for other good reasons. I use Java, but I also use C++, I don't use C, for good reasons.
And after 35 years in the industry I can tell you: I'm very disappointed. The stuff that rules the world is run by marketing. Not by fan boys.
"Not to mention doing just about anything with most Java APIs involves all kinds of intermediary wrapper objects." That is complete nonsense.
Starting a post with a general insult and then making a wrong claim is poor sportsmanship imho.
Assembly has all those mnemonics and equates and macros and directives. ITYM machine language.
"all programs can be optimized, and all programs have bugs; therefore all programs can be optimized to one line that doesn't work"
This. Of course it is only 5%. It is easy for a program to write code with zero whitespace and variable/function names that go like this (a, b, ..., a0, a1, a2, ...). But people should NEVER write code like that. Yes, it would save you at least 95% of the disk space the code would otherwise use up, but no one would ever be able to debug, change, or add anything to the code. And it would simply never even work; you would never get a finished working program.
Most of the "modern" languages seem to have this addiction to overly verbose libraries and obscenely long syntax. Do we really need method names that could constitute a simple sentence?
Long names are fine and even valuable. The real gremlin is in overly-abstracted API's, code generators, verbose XML configuration files, and other tools/libraries that have sacrificed usability while pursuing long feature lists and total control over a particular problem domain.
It is, in a funny way, the opposite usability trajectory that Gnome and many others in the UX crowd followed when they went off and started zealously reducing features in the name of simplicity.
Personally, I think that the underlying design principles should be the same whether you're designing application interfaces to be used by the general public or whether you're designing API's to be used by developers: in both cases you're trying to take something complicated and make it simpler. Sure, add those new/advanced features when you can, but do so in a way that doesn't raise the learning curve for the most common use cases.
The first response to this kind of thing is 'So what?'. They made up a metric and found that in Java it's 5%. Whoop. They didn't even examine any other languages to see if the metric varies (if they had, perhaps it would be in some way interesting, though I doubt it would be particularly enlightening.)
There's nothing you can do with this information. Total waste of time.
Your claim smacks of hyperbole, but that aside I've also had to use code from developers who like to name their methods x(), f1(), t2() you get the idea. I can't tell if they're too lazy to type more than that, or they are striving to make all code fit in a 40-column window (ala GW-Basic), or they hate the idea that anyone else would ever try to read and comprehend their code.
There's got to be a nice balance.
This is old news folks.
Frederick P. Brooks Jr. wrote an excellent paper (No Silver Bullet: Essence and Accidents of Software Engineering) back in 1987 (link: http://www.cs.nott.ac.uk/~cah/...) that highlights how so much of the complexity that exists in software is *accidental*. This problem is in no way specific to Java, but the language and the supporting eco-system of conventions, libraries and the various supporting "enterprise" tools certainly contribute to the situation. As a language that champions OOP (note: the paper calls out OOP specifically), it makes sense that Java's mainstream-status would lend itself towards being a poster-child for Accidental complexity.
Having worked in the software industry, on various Java code bases, for the past decade: I have observed this curious phenomenon first-hand, repeatedly. It really is quite unfortunate, as it is very possible to write elegant and concise Java code: one simply has to adopt a more functional programming style and limit mutability within their code. The problem is: most Java developers who appreciate the value of functional programming and immutable design have already moved on to other languages that have a syntax, standard library and an eco-system that is centered around these principles. I've moved on to Scala largely for this very reason: I grew tired of spending an hour frantically searching through a mountain of convoluted procedural code and XML configs just to see why a boolean flag I set was not "seen" by a particular class method.
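As a small illustration of what "a more functional style with limited mutability" can look like in plain Java 8, here is a sketch using the standard Streams API. The class `FunctionalStyle`, the method `longNamesUpper`, and the task itself are hypothetical; nothing here comes from the commenter's code.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical example: no loop counters, no flags, no mutation of the input.
final class FunctionalStyle {
    private FunctionalStyle() {}

    /** Returns the upper-cased names longer than minLength, in their original order. */
    static List<String> longNamesUpper(List<String> names, int minLength) {
        return names.stream()
                .filter(name -> name.length() > minLength)
                .map(name -> name.toUpperCase(Locale.ROOT))
                .collect(Collectors.toList());
    }
}
```

Because the method neither mutates its argument nor depends on shared state, the "why was my flag not seen" class of bug described above has nowhere to hide in it.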
Sure, if you want your outsourced dev team writing // add 1 to i
i = i + 1;
Yes, I've seen this.
So Java is 95% crap, and Perl is 100% delicious. I can go along with that.
Besides, the only Perl code I've seen that looks like line noise were the winners of the Obfuscated Perl Contest(s).
Well-written Perl code does not, unless you're shortsighted enough to look down upon the extremely useful and important variable sigils.
can we say that 95% Java coders are fluffers?
PKZIP.EXE and PKUNZIP.EXE, together, are about 80 kilobytes.
The current version of WinZip for Mac is 26 megabytes, or 26,000 kilobytes. That's 325 times the size, an increase of about 32,400%, for the same basic functionality.
However, I don't see a lot of people preferring the command-line versions. Why? Because it's easier to drag-and-drop a bunch of files into a dialog box and select an output location and folder, than to type all of that crap into the command line WITH the right flags AND no typos.
Things like menus, options / configuration panes, and nicely formatted help documentation are also preferable to "pkunzip.exe -?", and then remembering that you have to pipe the output to MORE in order to read the six pages of help text spewed out to your terminal window.
UI code is bulky, because it's extraordinarily detail-oriented. Think of all of the operations that your application UI has to support: windows, and resizing, and hotkeys, and scrolling, and drag-and-drop, and accessibility features and visual themes and variable text sizes and multithreaded event loops and asynchronous event handlers and standard file dialogs and child window Z-ordering and printing and saving application configuration info... etc.
If our IDEs didn't include visual UI designers and auto-generate like 99% of that code for us, app development would be horribly stunted AND much more preoccupied with hunting down bugs in UI code.
But all of this UI code is bulky and verbose and nitpicky because the UI is extremely important for any modern app. Thousands of apps exist that feature excellent functionality that is impossible or painful to utilize because the UI sucks.
My complaint about perl (and for that matter clojure too now) is that so many symbols have special meaning, and sometimes it is context-dependent too. If your code contains $#`'~_ all over the place it makes it hard to read for anyone not intimately familiar with it. Sure, there are some well used conventions like _ for anything or triangle brackets for collections of types, but there comes a point where using a symbol to convey really important and subtle meaning is far harder to read than just putting in a keyword. All I can say is thank god Unicode was not invented earlier or there would have been 1000s of other characters involved.
You can probably guess what a *correct* code does from 5% of it. But the other 95% are very much needed for those 5% to actually do that correctly.
These violent delights have violent ends
And in their triumph die, like fire and powder
Which, as they kiss, consume
->
Boy meets the wrong girl, they die for love.
->
Boy, girl, dead.
->
people.forEach(die)
I mean, sure it gets the job done, but man, might as well just pay someone in India to write and read it.
No. The point of this app is to get downloaded.
So what's "core" again?
No, the point of the app is to display ads to the user.
'having the code read like sentences' was one of the goals of COBOL wasn't it?
I studied COBOL in college, and the best Professor in the Department was a big fan, but I remember it as being generally perceived as 'uncool'. (And I was the guy who, when given a choice of which language to write a project in, would write it in Univac 1100 assembler. This was before 'C' became widely known or implemented.)
In theory, theory and practice are the same; in practice they're different. (Yogi Berra & A. Einstein)
This is Java we're talking about... we're looking at com.Vendor.Factory.Library.Sublibrary.RocketLauncher.MakeRocketLauncherGoNow();
your 'article' is as well...
(repeated because /. doesn't think my title conveys my message clearly.)
Seemed like a good idea at the time...Actually, it STILL seems like a good idea after 30 years...
Good quality code should be largely readable in and of itself, with comments included only where needed to clear up unavoidable complexity. Consider the following code:
You'll notice only one line in 10 is a comment, but the intent of each line is very clear in and of itself because of the clear choice of function names and variable names. The only line of comment is used to create a human-readable explanation of what the if statement is testing for, since it is not necessarily clear from the math itself. (Pardon the lack of indentation, I don't feel like fighting with HTML/Slashdot at the moment)
I wish I had a good sig, but all the good ones are copyrighted
A lot of the comments seem to be defending the necessity of the "chaff." That seems to miss the point of the article. The authors aren't criticizing the extra code (much of which IS necessary to make the code functional, readable, and maintainable), they're suggesting that recognizing that only a small subset of the code defines the core functionality can be used in interesting ways. Programmers already take advantage of this in a variety of ways: we have auto-complete in our IDEs, we use web frameworks that write a lot of glue code so we can focus on the problem at hand, and we (sometimes) use newer languages that remove the need for a lot of scaffolding code.
Their application section gives an idea of what they really have in mind: natural language programming for simple tasks, search for common tasks across diverse code bases, and summarizing code functionality using auto-generated "minsets." There are probably a lot of other tasks we could accomplish if we were reliably able to distill a large block of code to its semantic core.
In an ideal world where everything always works, you could easily ditch 90% of your code that deals with exceptional situations, none of that is core code.
In the real world however, that 90% of extra code isn't nearly enough to catch even half the poop the monkeys will throw at it.
Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
the code written, in the summer of 2012 the researchers downloaded 1,000 of the most popular Java projects from Apache, Eclipse, GitHub, and SourceForge. From that they got 100 million lines of Java code and tossed out simple methods (those with less than 50 tokens).
So they tossed methods that were written well (methods that only do one thing)? So if you wrote a simple 2-line validation of an input field (field must be populated, field must match regex), they tossed that as chaff?
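For a sense of scale, here is a minimal sketch of the kind of validation method described above. The class name, the constant and the five-digit rule are made up for illustration. By a rough hand count the isValid method lexes to about 30 tokens, so it would sit below the 50-token cutoff quoted above, though how the study's own lexer counts tokens is not something this sketch checks.

```java
import java.util.regex.Pattern;

class FieldValidator {
    // Hypothetical rule, chosen only for illustration: exactly five digits.
    private static final Pattern ZIP = Pattern.compile("\\d{5}");

    // Field must be populated. Field must match regex.
    static boolean isValid(String field) {
        return field != null && !field.isEmpty() && ZIP.matcher(field).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("95616")); // matches the pattern
        System.out.println(isValid(""));      // not populated
        System.out.println(isValid("9561a")); // populated but fails the regex
    }
}
```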
in the language that CS professors are oddly convinced will make your code more reliable.
What they miss is that forcing the programmer to spend 95 percent of his time jumping through pointless hoops, working around type restrictions, making up types and putting down boilerplate makes the code too full of cruft to read, too long to understand, too unwieldy to change.
This is why I prefer to write in Ruby or Lua or Python or Scheme. Scheme may have an unreadable syntax, but it's the most powerful of the bunch and you can implement the most advanced things in the shortest code. It turns out the functionality is more important than syntax... Still, its syntax is unacceptably bad. Ruby is unacceptably slow and needs more straightforward metaprogramming, but it's the second most powerful in the list.
as java
Obviously you have never heard of APL.
Greetings, Frans
Have you seen what Haskell programmers consider to be best practices?
Although fluffy code was nearly ubiquitous in all code samples examined, the researchers found that the best quality code could be found at http://www.ioccc.org/
You don't need (and no one does do it) a RocketLauncherFactor
Lies! These guys did it.
If finding the one liner that has the problem takes weeks of digging through object onions?
If half the CPU work becomes method call overhead?
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
Only a sith deals in absolutes.
I'm a Sith.
"First they came for the slanderers and i said nothing."
Just maintain all three, in synch with update locking, at all times.
It is a pentagon language. You shouldn't be surprised to do everything in triplicate.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
You should learn APL. You will love it.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
And it still runs like a pig, thanks to garbage collection, lack of unsigned types, and the need to go through Java bytecode to generate the final host code. I remember the joyous days of running software that did encryption in Java rather than calling a native library, and ran at least ten times slower than it would have in C.
Before someone says, "It takes too long to type longer names," the answer is yes, the first time you type it, but copying and pasting long names takes just as long as it does for short names; or use autocomplete if you have a good IDE. Give me a break! Don't be a dumbass coder. If you aren't going to document well or comment worth a damn, then you better use descriptive names.
Less typing isn't the point of shorter names. Typing speed is nearly insignificant in coding.
The point is screen real estate: the more code you have in your field of view, the better. Smart IDEs and larger screens help, but because it is a perception problem, they won't solve everything.
In fact it is the reason why writing clean code is hard. You have to balance descriptive names and easy to understand constructs with high code density.
Sounds like a prime set of functions to reorganize into proper classes.
At least it runs fast on all platforms!
You forgot the MakeRocketLauncherGoNowFactory, the MakeRocketLauncherGoNowFactoryFactory, the MakeRocketLauncherGoNowException, the ...
Exactly. Check out EnterpriseQualityCoding/FizzBuzzEnterpriseEdition for a "proper" example:
https://github.com/EnterpriseQ...
There's a lot of flamebait here. I wonder if there would be as much if the example language were something other than Java?
Why is Snark Required?
Surprisingly, this comment has not a single +1 Funny mod. I know I laughed.
That's for method and class access control though...
That's why I only program in Brainfuck - all wheat, no chaff. There's a 100% correspondence between lexemes and critical functionality.
my, your, his/her/its, our, your, their
I'm, you're, he's/she's/it's, we're, you're, they're
Yes, but the point is silly anyway.
The notion that everything that isn't core functionality is "fluff", gives the impression that it is non-essential.
Yep, you've got to worry about reductivist thinking like this. If that were the case, then A Tale of Two Cities would, in its entirety, be: 'Sydney Carton had a twin.'
The rest is mere extrapolation.
Crumb's Corollary: Never bring a knife to a bun fight.
There are tools that will remove all reasonable white space when you are packaging the code for distribution though. So I can't see that as a concern either way.
The problem here is that "fluff" is being used to mean anything that's not the sexy and exotic part of the algorithm. Genuine "fluff" gets in the way of comprehension and is bad code, but structuring and redundancy are extremely valuable.
No, I think it was the goal of Ada which was supposed to replace COBOL and Fortran for the DOD. It also was the only language (at the time) that met the Steelman language requirements.
Sure, nearly anything complex can be mostly "fluff" if pared down to a nominal "core." May as well have written that an automobile is 99% fluff, since you only need one cylinder, a crank, and two wheels to make a vehicle...
But casting in Ada is not flexible or automatic for a reason. The code is meant to be highly reliable and hard to make simple mistakes with. It was meant for ballistic missiles, rockets, etc.
Typing speed is nearly insignificant in GuB-42's coding.
FTFY.
I've fallen off your lawn, and I can't get up.
My issue is that context is often ignored. You already know it's a rocket launcher, that you want it to launch now is obvious and make is superfluous. It should be:
rocketlauncher.go()
Yea, foo is stupid, but your example is to the other extreme.
I think you can tell how much of a language is "fluff" by looking at the length of "Hello World"
Java is nice, robust, and great for building enterprise applications with a lot of other developers. It is, however, known for being verbose.
My favorite hello world so far:
cat Hello World
Genuine "fluff" gets in the way of comprehension and is bad code, but structuring and redundancy are extremely valuable.
Prove it. This paper suggests it's not extremely valuable. You're saying it based on your 'intuition' which is wrong.
"First they came for the slanderers and i said nothing."
Heck I don't even like using or import statements and would rather fellow programmers type out the full classpath everywhere. I started doing that years ago and found it helps learn the full api and helps other more green programmers figure it all out quicker.
There is a LOT of context missing from your little example, but never fear, a few comments would/could clear that up.
Look, the bulk of what I end up putting into my code as "comments" is what seems pretty obvious to me, but I spend a lot of time implementing code based on industry specifications (IEEE, RFC's and others). When doing such work, it's usually a good idea to leave behind a way to link back to the specification being implemented (for my sake and for the sake of the programmer behind me) so it's easy to find.
I also tend to opine about each method's reason for existence, including what inputs it expects, what exceptions it might throw and what output you can expect. One thing I make a point to document in function and class headers is any place where this code may have some hidden effect on something, especially something you didn't pass in. I do this in the header, mainly because it's easier for the programmer that comes behind me, but also because we use doxygen to convert all this stuff into documents for later reference.
No, comments are a necessity for my job and, if you think about it, for yours too. Maybe not as detailed as what I go through, but it is a very good discipline for a coder.
"File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
That is both disgusting and oh-too-accurate at the same time.
"Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
variable/function names that go like this (a, b, ...., a0, a1, a2, ....). But People should NEVER write code like that
When writing for the C-64 you HAD to write code like that.
You don't need to touch type to be productive as a software dev. I'm in my 50's and can't touch type to save my life, it has not been a problem for the last 25yrs. I have never been asked about it at interviews and I have no problem keeping up with the work. I'm not the only older dev in Oz who can't type, during the 70's boys were often not allowed to learn typing at school since "only girls grow up to be typists". When I went to uni as a mature age student in the late 80's it was simply assumed everyone could touch type. Touch typing was not taught as part of my degree (or any other CS degree I've heard of) for the reason the GP stated - it's an insignificant skill for the purpose of writing code and since I can peck away with 2-4 fingers at ~35wpm it would be a complete waste of my time (and my employer's money) to start learning it now.
And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
In human language, redundancy is important for separating the signal from the noise. Esperanto, which has less redundancy than organic languages, is harder to understand in a noisy environment.
With programming languages, a lot of the redundancy is for things like meaningful identifier names and type safety. Hey, I've developed compact programming languages before. They're impossible to read and debug.
Exactly right. Ugly one-liners, or even whole functions, eh, I'll pound my head on them for an hour, but I'll figure it out. The time-consuming part is the structure.
You can spend weeks trying to understand the structure of a program before writing a few extra lines of code. That's what really steals your time.
"First they came for the slanderers and i said nothing."
True, he adds a lot to this conversation!
And this was for Java5 back in the day a decade ago in college.
I am sure it is more complex now, and huge frameworks like Hibernate are popular. People do not have time to delve into Javadoc when they can prototype stuff fast or use a tool to generate some code for a UML app which does not use the full Java APIs.
http://saveie6.com/
Groovy
Really, I don't want to write another line of boilerplate Java again. But for those who do, Groovy doesn't stop you.
If you post it, they will read.
After actually reading a lot of the paper, I'd say the conclusions of the commenting programmers are raw ignorance. It appears some of them read the introduction (abstract) and thought they "knew" what the article was about. If one reads it, one discovers that the goal of the work is to provide a means of doing several interesting tasks (that I'd like to see done in Eiffel and placed into the IDE):
1. Code search: In my universe, is there any code that ________? -- some form of "wheat-keyword" query that can be quickly matched against a database held as metadata about the code universe.
2. Code completion: As I am hand-coding my feature (not just the line or even instruction I am typing), is there some other feature that looks like what I have already typed, so that the remainder of that feature can be applied to the one I am typing to "auto-complete" it?
3. Code reduction: Is there a language subset, such that a reduced keyword language could be hand-coded and the "fluff" or "chaff" filler be computed rather than typed, essentially making for a smaller and more powerful programming language and paradigm (when linked to #1 and #2 above)?
These are very powerful and interesting questions. The authors are not implying that Java is 5% meaningful and 95% meaningless. They are simply describing a systematic means of code reduction in an effort to make tools that do #1, #2, and/or #3 above! Fine article and good find. Thank you for sharing!!!
Scaffolding is actually what they're talking about. Lots and lots of really silly scaffolding to hold together a few dozen instructions of code. Java (I do regularly have to write Java code, unfortunately) really, really suffers from this.
The goal is not to use _many_ design patterns, but those, that fit and actually improve the design by making it easier to maintain.
or triangle brackets
I'm just curious: which brackets have 3 angles?
-IOVAR Web Dev Platform
Screw all that crap. Just use Lombok and all of a sudden, your code gets considerably more concise while (the intelligent developer) still knows precisely what's happening behind the scenes.
Bye!
No that was COBOL......
The Truth is a Virus!!!
I think they are referring to things such as malloc calls in C. Although the call is necessary, its purpose is to serve the programming language and not necessarily the purpose of the program. The point is that the most appropriate language should be abstracted sufficiently so that the programmer can focus as much as possible on solving the problem and not servicing the needs of the language.
-rd
WTF? Terrible.
Fast enough for what?
What is ZAS?
Why are you using ZASFraction when you are testing a difference? Shouldn't it be ZASdifference or a ratio test?
You require an essay on why that particular variable was used? If I write area = PI * radius ** 2, do I need to explain why I used pi instead of e?
At some point you have to accept that your comments can't start with Euclid's elements and work up from there: you have to assume the reader has some prior knowledge. Once you accept that, it's just a question of where you draw the line.
If this code is about ZASes and their staging, and you don't know what ZAS is, perhaps you need to familiarize yourself with what the code is about at a high level before you can expect to be able to understand lines picked at random.
D- at best in school, rejected from source control at code read in industry. I'd hate to be supporting your shit.
Meanwhile, in the real world, code like this gets checked in to source control in professional development environments all the time.
If you have evidence that design patterns make code easier to maintain, I'd really like to see it, but I have doubts based on reasons outlined in this comment thread.
"First they came for the slanderers and i said nothing."
It's called "salt" and it's either a good or bad thing, depending on if you prefer fast development or low incidence of bugs.
So it's you Phil Johnson—if that's even your real name—who is the last-sentence-gnome for every story posted at Slashdot for the past several years now.
I see an opportunity here to promote my Hello World! Java enterprise software suite! Features include Singleton, Factory, and Strategy. This library incorporates advanced programming paradigms with modules that can be found in many top tier major player agile dynamic business enterprise organization synergistic application software shops! (Note that this is not the thread-safe version.) I would just paste it below, but apparently Java can't get past the "lameness filter" anymore... figures.
Available also on a coffee mug. Support free software!
And it still runs like a pig, thanks (...) lack of unsigned types (...)
Not sure the lack of unsigned types shall have a huge consequence on Java performance...
Slashdot, fix the reply notifications... You won't get away with it...
At least it lets you type an s without an apostrophe before it.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Most of Java's problems lie with the fact that the designers of its original API made some unbelievably bad decisions early in its development. Like:
* specifying arguments as int or String values, instead of enums... so every method you called that explicitly needed UTF8 had to be surrounded with try/catch (just in case you couldn't remember whether Java wanted you to call it "UTF-8", "UTF_8", or "UTF8" & threw an UnsupportedEncodingException). This particular thorn in my side was FINALLY fixed as of JDK 7.
Before anyone points out that enum is a semi-recent addition to Java, I'd like to remind everyone that even in 1996, you could declare a class with a private constructor, then use it to declare public static final constants of itself that were defined by their own declarations (which, I believe, is what 'enum' actually does behind the scenes, anyway)
* Pre-JDK8, Java's handling of nearly everything related to the concept of a Date/Time (parsing, printing, calculating, the works) was completely fucked. -- http://www.oracle.com/technetw...
* Swing (Enough said. It speaks for itself... and does it almost as loudly and proudly as AWT.)
Even MORE tragic, though, is the way Android's API architects perpetuated the EXACT SAME anti-patterns (string/int constants as args) with the Android API... including brand new framework classes that didn't exist until Android did & had NO REASON to be that way.
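The private-constructor trick described above, next to the JDK 7 fix, can be sketched roughly as follows. Encoding and EncodingDemo are made-up names for illustration, not anything from the JDK; StandardCharsets.UTF_8 and Charset.forName are the real standard-library pieces. This is only a sketch of the idea, not how the JDK itself is implemented.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical pre-Java-5 "typesafe enum": a class with a private constructor
// whose only instances are its own public static final constants.
final class Encoding {
    public static final Encoding UTF_8 = new Encoding("UTF-8");
    public static final Encoding US_ASCII = new Encoding("US-ASCII");

    private final String canonicalName;

    private Encoding(String canonicalName) {
        this.canonicalName = canonicalName;
    }

    public String canonicalName() {
        return canonicalName;
    }
}

class EncodingDemo {
    // Accepts only the constants above, so a misspelled charset name
    // cannot be passed in and no checked exception needs catching here.
    static byte[] encode(String text, Encoding encoding) {
        return text.getBytes(Charset.forName(encoding.canonicalName()));
    }

    public static void main(String[] args) {
        byte[] viaPattern = encode("fluff", Encoding.UTF_8);
        // JDK 7+ equivalent using the standard library's own constants.
        byte[] viaJdk7 = "fluff".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(viaPattern, viaJdk7));
    }
}
```

The caller of encode never types a charset name as a string, which is the whole point of the complaint about "UTF-8" vs. "UTF_8" vs. "UTF8".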
And this is why C is not an appropriate language when you're developing an application where the important aspect is the result, not the amount of memory or the speed required to achieve it.
A language is a tool, and each language is a somewhat different tool. A good architect/analyst/developer knows which tool to apply to which problem, the associated "fluff" being largely nonsensical with regard to the inherent benefit of applying the right tool to the right problem.
This, my friend, is common knowledge among experienced practitioners and often flies a mile above the heads of beginners, who tend to have the "have a hammer, everything looks like a nail" mindset.
Generics killed Java.
Just an example, Java vs. JavaScript: public static <AnyType extends Comparable<AnyType>> AnyType maximum(AnyType x, AnyType y, AnyType z) vs. function maximum(x, y, z).
P.S. I must mention that Javascript is the worst language I have seen in a long time.
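For what the long Java signature buys, here is a minimal runnable sketch of that maximum method. The body and the MaximumDemo wrapper are assumptions for illustration; only the signature shape comes from the comment above (with AnyType shortened to T).

```java
class MaximumDemo {
    // The bounded type parameter says two things: all three arguments
    // share one type T, and T values can be compared with each other.
    static <T extends Comparable<T>> T maximum(T x, T y, T z) {
        T max = x;
        if (y.compareTo(max) > 0) {
            max = y;
        }
        if (z.compareTo(max) > 0) {
            max = z;
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(maximum(3, 7, 5));                // prints 7
        System.out.println(maximum("pear", "apple", "fig")); // prints pear
        // A mixed call such as maximum(3, "apple", 5.0) does not type-check,
        // which is the trade the verbose signature is making.
    }
}
```

Whether that compile-time check is worth the extra ink is exactly what the thread is arguing about.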
From the article:
Good coding style is to decompose your problem thoroughly, so your methods will be very small. Indeed, using this methodology, the more you refactor the greater proportion of so called 'chaff' you'll get.
I'm not arguing with the general propositions that
But this study doesn't show it, because it arbitrarily tossed away the better-written code and then analysed the remainder.
I'm old enough to remember when discussions on Slashdot were well informed.
, that you want it to launch now is obvious and make is superfluous
Not necessarily. If the regular usage is to schedule it for launching at optimal time calculated automatically - launching it now is a special case and function name must make it clear.
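A rough sketch of that naming argument, assuming a made-up RocketLauncher where scheduled launching is the regular case. All names, the injected Clock and the date are hypothetical and exist only to make the example runnable.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;

class RocketLauncher {
    private final Clock clock;
    private Instant launchTime;

    RocketLauncher(Clock clock) {
        this.clock = clock;
    }

    // Regular usage (assumed): launch at a time computed elsewhere.
    void go(Instant optimalTime) {
        launchTime = optimalTime;
    }

    // Special case: skip the scheduling, so the name has to say so.
    void goNow() {
        launchTime = clock.instant();
    }

    Instant launchTime() {
        return launchTime;
    }
}

class RocketLauncherDemo {
    public static void main(String[] args) {
        // Arbitrary fixed instant so the output does not depend on the wall clock.
        Instant fixedNow = Instant.parse("2015-02-04T12:00:00Z");
        RocketLauncher launcher = new RocketLauncher(Clock.fixed(fixedNow, ZoneOffset.UTC));

        launcher.go(fixedNow.plusSeconds(3600));
        System.out.println(launcher.launchTime());

        launcher.goNow();
        System.out.println(launcher.launchTime());
    }
}
```

With both methods on the same class, "Now" stops being redundant and starts carrying information, while "MakeRocketLauncher" is still noise.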
Bingo Dictionary - Pragmatist, n. A myopic idealist.
Yes people check in bad code all the fucking time.
Assuming prior knowledge of what code does is exactly what you are not supposed to do. You would know that if you weren't so full of yourself.
You were the one who posted shit code for all to see, claiming it was self commenting.
After hearing your defense grade is changed to F. You don't understand why you should comment.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
Java is here to stay, though. First off, Oracle, HP and Symantec build almost all their tools in Java, so that pretty much guarantees survival. I know Google is trying to get away from it in favor of HTML5; Chromebooks don't even support Java.
http://www.thetechnologygeek.org
Well, I don't want to miss at least a couple design patterns: Iterator, Observer, Adapter, Decorator, Dependency Injection, Object Pools (Threads and Connections in my case).
I also sometimes use Singleton, Builder, Proxy, Command, Circuit Breaker.
Iterator: I want a simple interface I know, to iterate over data structures if iterating is reasonable. Not 10 different interfaces for 10 data structures.
Observer: Often I want information to propagate. Yet the business logic should not depend on the UI, so I can easily provide different UIs. And especially I don't want circular dependencies.
Adapter/Wrapper: Every so often I have to work with slightly off interfaces. Putting a wrapper around them solves this.
Decorator: You don't have to maintain, for example, scroll functionality for every display and input type (e.g. Text, Picture) but just do it once.
Dependency Injection: My code has been so much easier to test since I started using DI. The only downside, I have to admit, is that you potentially have your dependencies defined elsewhere. But then again this allows easier changes of those (for example in case of software product lines).
Object Pools: Thread and connection pooling can drastically improve performance. No one nowadays creates 1000 new connections per second from one server to one DB.
Yet of course I don't sit there all day and try to use as many design patterns as possible. For a 100 line program I would never use DI, I use a wrapper only, if I really need to. And so on...
Forgot to note, that patterns also improve communication, as you can just say, you used the observer pattern and everybody (hopefully) knows what it means. Moreover by just harmonizing the code, so that the same problem is always solved the same way, you don't have to dig deep into the implementation to see what is going on. You see ItemsPurchasedObserver and you have a rough idea what's going on. On top you are not forced to reinvent the wheel for the millionth time.
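As a rough illustration of the Observer point (business logic that does not depend on the UI), here is a minimal sketch. ItemsPurchasedObserver is the name mentioned above; Checkout, ObserverDemo and the method names are made up, and this is one common way to write the pattern rather than the only one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Observer interface, named after the example in the comment above.
interface ItemsPurchasedObserver {
    void itemsPurchased(List<String> items);
}

// Business logic: it knows only the interface, never a concrete UI class.
class Checkout {
    private final List<ItemsPurchasedObserver> observers = new ArrayList<>();

    void addObserver(ItemsPurchasedObserver observer) {
        observers.add(observer);
    }

    void purchase(List<String> items) {
        // Notify every registered observer in registration order.
        for (ItemsPurchasedObserver observer : observers) {
            observer.itemsPurchased(items);
        }
    }
}

class ObserverDemo {
    public static void main(String[] args) {
        Checkout checkout = new Checkout();
        List<String> log = new ArrayList<>();

        // Two independent "UIs" attach without Checkout changing at all.
        checkout.addObserver(items -> log.add("console UI: " + items));
        checkout.addObserver(items -> log.add("audit: " + items.size() + " item(s)"));

        checkout.purchase(Arrays.asList("tea", "mug"));
        log.forEach(System.out::println);
    }
}
```

The dependency runs one way only: the UIs know about Checkout, while Checkout knows nothing about them, which is what avoids the circular dependencies mentioned above.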
What you're basically saying is that you like design patterns. You like them because they fit the way you think......which is probably because you learned software engineering through design patterns. Which isn't the same as 'code being easier to maintain.'
Also, you more-or-less ignored everything in the thing I linked to.
"First they came for the slanderers and i said nothing."
Java, .NET and Windows are mainly means to make your shiny 4GB Core i7 seem slow.
If Oracle cared at all about safety, Java wouldn't have so many security updates every month. It's riddled with bugs, because they never cared about making it secure. If they did, they wouldn't stuff it with so much bloat that it's pretty much impossible to inspect all of it for bugs in the first place.
I'm more of a Python guy myself, but the big issue with both is that there's no standard for browser-side Python or Perl. We need that to have any hope of those getting rid of Java. Python and/or Perl on the server side, easy.
indeed. Will it 'run' without 90% of the 'stuff' in it? Probably if you wrote it that way. "Maintaining" that pile of crap would then quickly become unworkable and then it doesn't 'run' anymore.
I'm not big on scaffolding to extremes but designs call for standard stuff that isn't truly 'functional' but is nonetheless 'required' for reasonably working applications.
People in cars cause accidents....accidents in cars cause people
The thing you linked more or less said design patterns are bad because... he doesn't like them. He says there is no scientific proof they are good and thus they must be bad (without giving any proof), and because people used them wrong and introduced unnecessary complexity. The argumentation of the GoF is as sound or unsound as his. I can always construct an argument against something if I point out how it is used incorrectly. By this argumentation Assembler, C, C++ and actually every programming language is bad, because I can write really messy code.
I wrote that design patterns are useful to _me_, yes. I can't generalize my experience, of course. Yet I learned design patterns only quite some years after I started programming. And I really prefer it this way over everyone reinventing the wheel. The code bases I worked on were many years old and more than a million lines of code or so. It just would not have worked without design patterns and code formatting rules.
The problem is that there are almost no useful experiments in software engineering. Each one I have seen so far was set up too small, ran too short, used students instead of professionals, or some combination of these. Really good experiments are just way too expensive, as you would have to watch programming efforts with and without, on the same project, over years, with professionals. Nobody will finance that, unfortunately. And case studies are just case studies and can't be generalized. Thus it is hard to say in a scientifically sound way what is bad or good. It would be great if we could, of course.
I wrote that design patterns are useful to _me_, yes. I can't generalize my experience, of course. Yet I learned design patterns only quite some years after I started programming. And I really prefer it this way over everyone reinventing the wheel. The code bases I worked on were many years old and more than a million lines of code or so. It just would not have worked without design patterns and code formatting rules.
This is the kind of thing that worries me. You won't let yourself think outside of the design pattern box. You use listeners all over the place.
Try not using design patterns. You may find yourself enlightened.
"First they came for the slanderers and i said nothing."
Unless not all "fluff" is equal.
I think you missed the point of my post and maybe even the article.
You are absolutely correct that a language is a tool. But the point of a tool is to solve a problem. I was not advocating the use of C for general-purpose business coding or even calling C an equivalent to Java. I was making a point by analogy: C has many obvious constructs that serve the language more than the problem being solved. There are also many constructs in Java that, although not as heavy as having to manage your own memory offsets, are nonetheless equally superfluous to solving the actual business problem.
Java is a great language for many things, but as the article accurately notes, the language has its own layer of cruft that only services the purpose of the language. This becomes a bigger problem as Java developers push Java into more niches where it is likely not the most appropriate language.
While you, my friend, may be an experienced practitioner who selects another language when the problem is not appropriate for Java, there are many who simply force Java into the project. So regardless of your superior professional judgment regarding the selection of languages, the premise of my statements and the article stand.
-rd
The difference between Java and good languages like Python or Ruby is huge. These guys have finally hit the nail on the head. Java is a huge fat fluffy monster that wastes a massive amount of code.
Am I missing something? Isn't all code 99% fluff before compiling?
I agree. Perl probably has a higher "functionality" factor but it's completely unmaintainable.
I write an application, be it in Java, C++, whatever. Insert Object-Oriented language of choice here.
I need to do I/O. So I use an existing class. Which is, more often than not, bundled into a large library of classes. And the classes in this library depend on classes in that other library. So I need my new class, and the classes it uses and the classes they use, all the way down.
If I'm writing an app which uses data persisted to/queried from a SQL database, there are libraries (for Java, usually .JAR files, frequently megabytes in size and that's WITH compression) which I can use. I may use a few classes from that library. And those classes may use a few more. But I can't extract the classes I need and create a new library. I have to bundle their ENTIRE library, with all of the associated classes, regardless of how many/few I actively use. And this library has dependencies in those libraries. And that library has dependencies in those libraries.
By the time you write something for a website which will use Hibernate to handle the ORM duties, Spring framework to handle the web duties, etc. you end up with an application that's > 10 MB in size. The code which I, actively, wrote is
Furthermore, if I choose to extend that other class, I may create a method which overrides a method in that class, and my method may not use theirs at all. As such, I still have to include their class, and its associated library, IN ITS ENTIRETY, but parts of it not only aren't being used but aren't even REACHABLE. Because calling foo.bar(), where foo is an object of my class, can't even reach the extended class's .bar() method.
Object Oriented programs are built in layers, by accretion. As the number of layers increases, the amount of unused and unreachable code grows. Until, voila! we're down to 5% of the code in the app actually getting used, even if you exercise ALL of the functionality in the app.
Java is merely one of the worse offenders in this. When you have a stacktrace with 30 layers in it, that's a LOT of code. A significant fraction of which is NOT getting used by your app.
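The overriding point in this comment can be sketched in a few lines. LibraryWidget, MyWidget and OverrideDemo are made-up names standing in for "their class" and "my class"; nothing here comes from a real library.

```java
// Stand-in for a class shipped inside a large third-party library (hypothetical).
class LibraryWidget {
    String bar() {
        return "library implementation";
    }
}

// Our subclass replaces bar() and never calls super.bar().
class MyWidget extends LibraryWidget {
    @Override
    String bar() {
        return "my implementation";
    }
}

class OverrideDemo {
    public static void main(String[] args) {
        LibraryWidget foo = new MyWidget();
        // Dynamic dispatch picks MyWidget.bar() even through the base-class type,
        // so LibraryWidget.bar() is never executed for this object.
        System.out.println(foo.bar()); // prints "my implementation"
    }
}
```

LibraryWidget still has to be on the classpath for MyWidget to load, even though its bar() body is dead weight for this program.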
... by the Dew of Mountains the thoughts acquire speed, the hands acquire shakes, the shakes become a warning
Most of the Java boilerplate is for the compiler, not the programmer.