Java is the shovel and "cutting-edge" languages such as Python, Ruby and even PHP are the digging machines.
[...]
Java programmers who are to stupid to experience more advanced technologies doesn't understand it either.
Hold on there for a sec. PHP is an advanced technology? Really??? Crap like this http://lwn.net/Articles/379909/ is what 'advanced' technology is about?
PHP is basically JSP with plain Java SE on it, and a rather poor version of that approach at that. It's perfectly suited for the 4-page website of Joe's Burgers, but when something more 'advanced' is needed, PHP just doesn't cut it.
For instance, the platform offers zero provisions for any form of concurrency. There is not even something like the basic threading support that Java has had since 1.0, let alone any support for simple asynchronous execution (e.g. the @Asynchronous annotation) or highly efficient parallel computing paradigms like a fork/join framework. And what about an O/R mapper that comes with Java by default? How about dependency injection? What about declarative transactions? What about transactions that automatically span multiple method calls and multiple transactional resources?
Does PHP offer anything like that? Does the standard PHP library already come with an MVC framework? Is Unicode already natively supported? Is there any namespacing support? Is there...
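To make the "basic threading support" point a bit more tangible, here is a minimal sketch using nothing but java.lang.Thread. The class and method names (BasicThreads, parallelSum) are made up for illustration, and the lambda syntax assumes Java 8 or later; on older versions an anonymous Runnable would do the same job.

```java
class BasicThreads {
    // Sums the integers in [from, to) on the calling thread.
    static long sumRange(int from, int to) {
        long sum = 0;
        for (int i = from; i < to; i++) {
            sum += i;
        }
        return sum;
    }

    // Splits the range in two halves and sums each half on its own thread.
    static long parallelSum(int from, int to) {
        int mid = from + (to - from) / 2;
        long[] partial = new long[2];
        Thread left = new Thread(() -> partial[0] = sumRange(from, mid));
        Thread right = new Thread(() -> partial[1] = sumRange(mid, to));
        left.start();
        right.start();
        try {
            // join() waits for each thread and makes its writes visible here.
            left.join();
            right.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return partial[0] + partial[1];
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(0, 1_000_000));
    }
}
```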
I can produce results ten times as fast by using Django/Python/nginx than with J2EE/Java/Glassfish
And I can produce results a hundred times as fast by using Java EE/Java/JBoss AS than with Django/Python and surely PHP. I'm not sure what this proves though... Also, don't forget that hacking stuff together and throwing it out is nice when you're 16 and in high school, but in the real world maintainability, correctness, stability, scalability, etc. all count and in the end are way more important than just the ability to get some prototype-like code out of the door that 'usually-works-but-breaks-down-every-tuesday-when-it-rains'.
also var from javascript, for declarations where you don't want to type out ArrayList<SomeObject> myList = new ArrayList<SomeObject>();. Just a few tweaks here and there would make Java so much more legible
It does indeed get a tweak a little like this. You still have to write out the declaration in full, but there's a shorthand for the generic part of the right hand side: ArrayList<SomeObject> myList = new ArrayList<>();. It doesn't look like a major improvement in this example, but when there is a lot of generic type information, it can surely help.
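A small sketch of where the shorthand starts to pay off, assuming a compiler that supports the diamond (Java 7 or later). The class name and the map contents are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DiamondExample {
    static Map<String, List<Integer>> build() {
        // Without the diamond this would read:
        // Map<String, List<Integer>> scores = new HashMap<String, List<Integer>>();
        Map<String, List<Integer>> scores = new HashMap<>();

        // The type arguments are inferred from the declared type on the left.
        List<Integer> aliceScores = new ArrayList<>();
        aliceScores.add(42);
        scores.put("alice", aliceScores);
        return scores;
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```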
They prevent you from inserting a type in a collection that shouldn't be there. Good luck trying to put a battleship in a collection of pencils. It won't fit with generics and it shouldn't fit.
They take away the need for guessing and downcasting any time you retrieve something from a collection.
There are some other reasons though; not all generic usage is for collections, but these are two major use cases.
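Both use cases fit in a few lines. The sketch below uses invented Pencil and Battleship classes; the raw-list method is only there to show what the same mistake looks like without generics, where it surfaces at runtime instead of at compile time.

```java
import java.util.ArrayList;
import java.util.List;

class GenericsExample {
    static final class Pencil {
        final String color;
        Pencil(String color) { this.color = color; }
    }

    static final class Battleship { }

    // With generics: the wrong type is rejected by the compiler,
    // and reading an element needs no cast.
    static String firstColor() {
        List<Pencil> pencils = new ArrayList<>();
        pencils.add(new Pencil("red"));
        // pencils.add(new Battleship()); // does not compile
        Pencil first = pencils.get(0);
        return first.color;
    }

    // Without generics: the battleship goes in silently and the
    // downcast blows up only when the element is retrieved.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsAtRuntime() {
        List raw = new ArrayList();
        raw.add(new Battleship());
        try {
            Pencil first = (Pencil) raw.get(0);
            return first == null; // not reached
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstColor());
        System.out.println(rawListFailsAtRuntime());
    }
}
```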
They are a band aid for the fact that not everything in Java is an object.
No they're not. Maybe unfortunately, but they don't even work with primitives in Java.
Plus any JVM I have seen is a piece of shit. Sorry, but if the official JVM takes several seconds just to start, that disqualifies it from a lot of perfectly good uses.
Java is currently primarily used on the server side. I'm not sure what kind of code you write, but if we need to restart the JVM once a month it's already a lot. A few seconds don't really matter when the uptime of your server is at least weeks, but more likely months or for 'finished & stable' software maybe even years.
The amount of times I've almost been knocked over on Damrak by a 6 foot woman cycling at 30 mph with 3 kids in trailers behind her while crossing the bike roads, only to then almost get knocked down by a tram...
The Dutch must be so bored with drunk foreigners getting in the way of their high-speed cycling.
Well, to be honest they actually seem to disrespect themselves even less;) If you can read Dutch, take a look at this: http://www.amsterdamcentraal.nl/archief/2010/3/23/je-moeder#c especially the comment made by henk @ "25 maart 2010 20:03 uur".
In addition to that, cyclists in Amsterdam are notorious for this kind of anti-social behavior. Try to visit some cities in the direct surroundings of Amsterdam (Haarlem, Utrecht) and you'll find a whole different atmosphere.
Polite Europeans huh? Guess you never had to jump away for a bicycle approaching you at high speed while you were at the middle of a pedestrian crossing in Amsterdam!:P
Yes that's true. My main point is simply that the situation isn't as black/white as saying that either using frameworks or using hand crafted code is always better for performance.
It depends really, and the ability of an engineer to choose and use a given framework wisely is an important asset. Some bigger frameworks like Oracle's Coherence can absolutely improve performance too, while some home cooked solution may never obtain the level of performance that Coherence has, simply because you are maybe just not as good a programmer as Cameron is.
But blindly applying Coherence for each and every problem simply because you heard "it's good for performance" of course doesn't work. But I doubt a programmer who thinks like that is capable of writing anything efficient himself anyway;)
Buying a new DB server is cheaper than fixing the code. Renting one (say, from Amazon Web Services or Microsoft Azure) and keeping it offline until you're actually using it is cheaper still.
It depends. Sure, there are some models out there which are quite hard to beat. I surely won't underestimate the lurking power of certain cloud or grid offerings. Yet, IMHO it's naive to think 'the cloud' will just fix all your problems for you.
For starters, you might have a single query and a DB that can't run a single query in parallel (like Postgres or MySQL) which is already executing on the fastest hardware available. We for instance have a DB running on a dual socket Nehalem based server (8 cores), running at some 3.4GHz with 4 Areca 1680's (each with 4GB of cache for a total of 16GB) and 32 Intel X25-E SSDs (8 per Areca). I dare you to come up with a faster machine without stepping into a completely different price category. You know, the kind of category where prices aren't listed anymore but sales representatives visit your office to negotiate a deal.
At the moment, there simply isn't a faster DB available and even the hugely expensive IBM POWER offerings are only a factor or so faster for our kind of workload. So simply buying a faster machine is absolutely not an option, even if we had money to burn (which, incidentally, we haven't). So, if I can optimize some piece of code in 30 minutes, maybe even less, why shouldn't I do that? It's basically as simple as turning:
    for (Customer customer : customers) {
        // Loop-invariant call: the same company-wide value is fetched on every iteration.
        BigInteger companyRevenue = rewardService.getMonthlyRevenue();
        // ... customer specific calculations here
    }
into
    // Fetched once, before the loop.
    BigInteger companyRevenue = rewardService.getMonthlyRevenue();
    for (Customer customer : customers) {
        // ... customer specific calculations here
    }
Why shouldn't I grab that low-hanging fruit, and instead go on a mission to find a faster machine because Blakey Rat on /. told me that software optimizations, no matter how trivial, never pay off? That would be rather silly, wouldn't it? On top of that, you seem to forget that a new & faster machine, even if it existed, doesn't magically fly into service. Real people have to order it. Real people have to install it with something, real people have to test its stability, real people have to place it into some rack, etc. All of that costs money, time and potentially worries too.
But to blithely come in here and say "you're wrong, because what you said about batch processes doesn't apply to websites!" that's just insulting to me and all your readers.
It is surely not my intention to insult anyone, but if you read back my post then you might see that I mentioned what I said was true particularly for web sites, not exclusively for web sites. Although there is a big difference between interactive and non-interactive payloads, at the end of the day a batch job should not recklessly consume resources either. Just because batch jobs can be queued and can be run right after each other, doesn't mean there isn't someone waiting for the result. Worse yet, if batch jobs are submitted to your queue at a certain rate and your system can only process them at a lower rate because of some completely unnecessary resource hogging, then eventually your queue will spill over. Your DB may still not be offended, but a slow turnaround and a denial of accepting new jobs surely will offend customers again.
Now you'll probably say that I simply need to throw more hardware at it and that batch jobs by their very nature can always run in parallel on multiple machines, but this is certainly not true in general. In order to make that possible you need to carefully craft your software architecture, which takes... guess what... engineering effort. If you'r
I'm a bit of both an algorithmics and programming language design geek with leanings for high level programming, and I never quite understood
and a programming technique compatible with the underlaying hardware and/or OS structure,
this part. A good language that allows for flexible design doesn't necessarily have to do much with the hardware, but it still allows you a very good access to the algorithmic part of the problem.
It depends really. The right algorithm and especially the complexity of that algorithm obviously matters a lot, but for true high performance computing, knowing your hardware can make a huge difference too. Regarding space/time optimization, it simply matters to know if you have more memory or more CPU cycles to burn in your actual machine. Very few to no algorithms can optimize for both space and time, so you have to know your target hardware in order to make the necessary tradeoff.
Next to such general choices, it can matter immensely if your hardware has special provisions that you can exploit. For instance some CPUs have hardware support for BCD arithmetic, which can speed up calculations a great deal if your software can take advantage of it. Knowledge about alignment of your memory can matter a lot too. Mathematically identical algorithms may cause your target machine to resort to thrashing if you misalign every memory access and/or access 'adjacent' memory cells in the wrong order. If you have to process some matrix, it normally doesn't matter for the algorithm whether you process it row-wise or column-wise. If the matrix is stored in memory in such a way that cells in a single column are adjacent in memory, while the next cell in a row is a full column length of cells away, then processing row by row may cause a page fault for every new cell access, which completely blows your performance away.
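The matrix case can be sketched as follows. This assumes a flat int array in column-major layout (my choice for the illustration, since Java's own two-dimensional arrays are arrays of arrays), and the class and method names are invented. Both methods compute the same sum; only the order in which memory is touched differs. The timings printed by main depend heavily on the machine, the matrix size and the JIT, so treat them as a rough indication at best.

```java
class MatrixTraversal {
    // Column-major layout: element (row, col) lives at index col * rows + row,
    // so the cells of one column are adjacent in memory.

    // Walks the array in storage order: consecutive indices.
    static long sumColumnWise(int[] data, int rows, int cols) {
        long sum = 0;
        for (int col = 0; col < cols; col++) {
            for (int row = 0; row < rows; row++) {
                sum += data[col * rows + row];
            }
        }
        return sum;
    }

    // Walks the array row by row: every access jumps `rows` elements ahead.
    static long sumRowWise(int[] data, int rows, int cols) {
        long sum = 0;
        for (int row = 0; row < rows; row++) {
            for (int col = 0; col < cols; col++) {
                sum += data[col * rows + row];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int rows = 2000;
        int cols = 2000;
        int[] data = new int[rows * cols];
        for (int i = 0; i < data.length; i++) {
            data[i] = i % 7;
        }

        long start = System.nanoTime();
        long a = sumColumnWise(data, rows, cols);
        long columnWiseNanos = System.nanoTime() - start;

        start = System.nanoTime();
        long b = sumRowWise(data, rows, cols);
        long rowWiseNanos = System.nanoTime() - start;

        System.out.println("same result: " + (a == b));
        System.out.println("column-wise ns: " + columnWiseNanos);
        System.out.println("row-wise ns:    " + rowWiseNanos);
    }
}
```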
Then there are such things as certain CPU instructions like an atomic test and set, or certain addressing modes that your language may or may not be capable of taking advantage of. Using such things can matter a lot too. Then there is a whole category of special hardware, like DSPs, vector units and lately GPUs. Taking advantage of such hardware might make more of a difference than choosing between a bad and a good 'normal' algorithm that only runs on a general purpose CPU. A simplified example is hardware H264 decoding. If your CPU is not powerful enough, then no algorithm, no matter how cleverly constructed, will be able to decode say an HD H264 stream. But if your machine happens to have an HD H264 hardware decoder, then obviously simply using it in your software will yield much better results. On a more fine-grained scale, it's the same with all those little things your hardware offers. Is your language able to take advantage of that, or does it just let it sit idle?
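As one example of a language exposing such an instruction: Java offers atomic compare-and-set through java.util.concurrent.atomic. The sketch below builds a deliberately naive spin lock on AtomicBoolean.compareAndSet. The SpinLock class is my own illustration, not something from the standard library, and for real code java.util.concurrent.locks would be the saner choice.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class SpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    // Atomically flips false -> true; spins while another thread holds the lock.
    void lock() {
        while (!locked.compareAndSet(false, true)) {
            Thread.yield();
        }
    }

    void unlock() {
        locked.set(false);
    }

    // Increments a plain int from several threads, guarded by the spin lock.
    static int countWithThreads(int threadCount, int incrementsPerThread) {
        SpinLock lock = new SpinLock();
        int[] counter = new int[1];
        Thread[] workers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10_000));
    }
}
```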
I basically agree with your point that making the right choices regarding performance matters immensely. It's just that I don't think this necessarily means that you should write all code yourself instead of intelligently using a library or framework.
You can either write slow and inefficient code yourself, or you can use an existing library in such a way that it will be slow and inefficient and vice versa. I've written high performance software capable of processing 6000 fairly complex transactions per second on moderate hardware, while still using highly convenient library classes like Java's ConcurrentHashMap. Some would perhaps even say that just using Java is already a 'framework' and I should have used C++ or maybe assembly... where does it stop?
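To give an idea of what "using a convenient library class" looks like in practice, here is a rough sketch of per-account totals kept in a ConcurrentHashMap. It is not the actual transaction code I mentioned; TransactionCounter and the account name are invented, and merge() assumes Java 8 or later.

```java
import java.util.concurrent.ConcurrentHashMap;

class TransactionCounter {
    private final ConcurrentHashMap<String, Long> totals = new ConcurrentHashMap<>();

    // merge() performs the read-modify-write atomically per key,
    // so no explicit locking is needed here.
    void record(String account, long amount) {
        totals.merge(account, amount, Long::sum);
    }

    long total(String account) {
        return totals.getOrDefault(account, 0L);
    }

    // Four threads each record 1000 transactions of 1 on the same account.
    static long concurrentDemo() {
        TransactionCounter counter = new TransactionCounter();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    counter.record("acct-1", 1L);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.total("acct-1");
    }

    public static void main(String[] args) {
        System.out.println(concurrentDemo());
    }
}
```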
I'm also implementing some high performance (beta) code using prereleases of Java's upcoming fork/join framework and the level of performance I can squeeze out of this is fairly impressive to say the least. I'm not sure if I would be able to write my own "work-stealing work queue" that offers better performance. Actually, I'm pretty sure I wouldn't.
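For those who haven't seen it, a minimal fork/join sketch looks roughly like this. It is written against the API as it later shipped in java.util.concurrent (the prerelease package names differed), the threshold is an arbitrary guess rather than a tuned value, and ForkJoinPool.commonPool() assumes Java 8 or later.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkJoinSum extends RecursiveTask<Long> {
    // Below this size the task sums sequentially instead of splitting further.
    private static final int THRESHOLD = 10_000;

    private final long[] values;
    private final int from;
    private final int to;

    ForkJoinSum(long[] values, int from, int to) {
        this.values = values;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += values[i];
            }
            return sum;
        }
        int mid = from + (to - from) / 2;
        ForkJoinSum left = new ForkJoinSum(values, from, mid);
        ForkJoinSum right = new ForkJoinSum(values, mid, to);
        left.fork();                         // queue the left half; idle workers may steal it
        long rightResult = right.compute();  // work on the right half in this thread
        return left.join() + rightResult;
    }

    static long sum(long[] values) {
        return ForkJoinPool.commonPool().invoke(new ForkJoinSum(values, 0, values.length));
    }

    public static void main(String[] args) {
        long[] values = new long[1_000_000];
        java.util.Arrays.fill(values, 1L);
        System.out.println(sum(values));
    }
}
```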
The home grown solution is often faster to develop than the time it takes to learn the intricacies of a monolithic framework. It is also easier to teach other employees the logical to-the-point framework that does only what is required by the problem you are trying to solve.
I'm not entirely sure about that last thing you're saying. The problem is that teaching your framework over and over again to new developers takes a fair amount of time too. You'll also have to make damn sure you write good documentation and keep it up to date. Depending on the size of your development team and the amount of time dedicated to such a framework this may be doable, but more often than not this is precisely the part that's lacking.
A standard well known framework has the advantage of having heaps of documentation already written. When stuck, programmers can post questions in forums, google away for answers, buy some books or even follow some course. This won't be possible with your proprietary in-house developed framework. When stuck, programmers will come directly to you and if maintaining this framework is not your core job, this can get annoying pretty fast.
Don't get me wrong, it can definitely pay off to build something of your own. After all, most best-of-breed open source frameworks on the market today were once started because some programmer in some company wasn't satisfied with the readily available stuff. I'm just saying that you should really think twice before starting development of some framework and carefully consider your options.
For instance, if you feel the need to develop yet another web framework, logging framework or ORM for Java, my first advice would be to simply not do it. The number of web frameworks already available is simply getting ridiculous and don't think lightly of the pressure you put on your fellow programmers who have to learn a new g*dd*mn framework for each and every new Java job they take. It would be far better to start with one of the excellent web frameworks already available (e.g. JSF, Wicket), and extend them where necessary. Most good frameworks have a lot of options for customization and extension. Going that route will not only save you and your fellow programmers time, it will also allow you to leverage third party offerings for that framework. In the case of JSF, there are a couple of high quality component sets emerging (like RichFaces and PrimeFaces). With your own web framework, you simply won't be able to use any of that.
The database doesn't get offended if you ask it the same thing over and over.
The database maybe not, but those other customers on whose behalf your code is executing other queries that now run slower because you needlessly consume a big chunk of the DB's resources surely will get offended. Offend them enough and they will run away. This will in turn offend your manager or boss and when it becomes clear that YOU were the one whose careless code caused the server to grind to a halt, you might find yourself without a job.
Particularly for (public) websites you have to be careful to never consume an unreasonable and unnecessary amount of resources. The database might not groan when you run your single user test executing 10,000 queries in a loop for one user. However, just a little peak in your traffic and you suddenly find some odd 10,000,000 queries being fired at your DB, where 1000 could have sufficed. Believe me, there is a HUGE difference between being able to process 1000 queries per second and 10,000,000.
A mentality such as yours is frequently the reason why websites completely fail to scale up as traffic increases. We shouldn't overzealously optimize in advance, but executing 10,000 queries where 1 would perfectly suffice only 'because the DB doesn't get offended' is just silly.
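A toy sketch of the difference, with no real database involved: FakeDatabase below is a made-up in-memory stand-in that merely counts round trips, and findName/findNames are hypothetical methods, not any real driver's API. The point is only the query count.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class QueryBatching {
    // Hypothetical stand-in for a database; every method call is one round trip.
    static final class FakeDatabase {
        private final Map<Integer, String> names = new HashMap<>();
        int queryCount = 0;

        FakeDatabase(int userCount) {
            for (int id = 0; id < userCount; id++) {
                names.put(id, "user-" + id);
            }
        }

        String findName(int id) {
            queryCount++;
            return names.get(id);
        }

        Map<Integer, String> findNames(Collection<Integer> ids) {
            queryCount++;
            Map<Integer, String> result = new HashMap<>();
            for (Integer id : ids) {
                result.put(id, names.get(id));
            }
            return result;
        }
    }

    // One query per user, fired from inside the loop.
    static int queriesOneByOne(int userCount) {
        FakeDatabase db = new FakeDatabase(userCount);
        for (int id = 0; id < userCount; id++) {
            db.findName(id);
        }
        return db.queryCount;
    }

    // A single query that asks for all users at once.
    static int queriesBatched(int userCount) {
        FakeDatabase db = new FakeDatabase(userCount);
        List<Integer> ids = new ArrayList<>();
        for (int id = 0; id < userCount; id++) {
            ids.add(id);
        }
        db.findNames(ids);
        return db.queryCount;
    }

    public static void main(String[] args) {
        System.out.println(queriesOneByOne(10_000));
        System.out.println(queriesBatched(10_000));
    }
}
```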
I've said this before and I'll say it again, because I've created plenty of unoptimized "batch jobs" that take longer than they should.
COMPUTERS ARE CHEAPER THAN HUMANS
A cubicle filled with racks of computers running inefficient batch jobs costs a tiny fraction of a competent person sitting in the same cubicle and optimizing every little thing by hand.
The best results are often obtained by going for a hybrid approach. Either extreme is not good. I've seen places where people tried to save a mere 50 bucks on some simple memory upgrade, and instead let a team of programmers optimize code for months. On the other hand, downright sloppy programming like asking the database the same question a thousand times in a loop and just throwing all the hardware at it that you can buy "since programmers shouldn't be bothered to think cause that costs money" is equally stupid.
Just make sure you have reasonably fast hardware. You don't need 300 servers with 128GB of memory to run the website of Joe's Sandwich shop, but you surely don't have to artificially limit yourself to a 256MB server for running that huge enterprise app your 100+ staff makes daily use of. Likewise, let programmers spend some time on optimizing code. Lifting some invariant from a loop is an easy win which every programmer worth their salt should be able to do. Knowing how to operate a profiler and having some basic idea of what part of the app gets hit most frequently, so some focussed optimization can be directed to hot spots, never hurts either.
Premature optimization surely is the root of all evil, but some common sense optimizations in proven hotspots really should be among the standard tasks of every developer. Don't mindlessly change all your instances of Double to double in Java, since that is supposedly faster, but in that tight loop of 10,000,000 iterations that executes every few minutes it might not be a bad idea to simply change 1 letter in order to prevent 10,000,000+ unnecessary boxing/unboxing operations.
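The one-letter change looks like this in a stripped-down form. The class and method names are invented, and whether the difference is measurable in a given program depends on the JIT and on how hot the loop really is, so profile before and after.

```java
class BoxingExample {
    // Accumulator declared as Double: each += unboxes the old value,
    // adds, and boxes the result into a new Double object.
    static double sumBoxed(int iterations) {
        Double total = 0.0;
        for (int i = 0; i < iterations; i++) {
            total += i * 0.5;
        }
        return total;
    }

    // Same loop with a primitive accumulator: no boxing at all.
    static double sumPrimitive(int iterations) {
        double total = 0.0;
        for (int i = 0; i < iterations; i++) {
            total += i * 0.5;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(10_000_000) == sumPrimitive(10_000_000));
    }
}
```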
They are only "global variables" when they are JVM (or whatever) scoped. The problem with simple global variables that they are always global scoped
Not true. "simple global variables" (public static fields of some class) can be isolated in Java by means of class loaders. Although class loaders can be a PITA themselves, they actually work for this.
Given two bugs, one can be as simple as adding a forgotten quote somewhere, while the other can amount to weeks of digging through the lowest levels of some code base.
There's also a third category, my favorite: weeks of digging through the lowest levels of some (old, undocumented, messy) codebase, which is ultimately followed by a fix that adds a forgotten quote. How do you even quantify that kind of thing?
That's a good question. It's actually not an uncommon situation at all. Some years back we discovered that posting a page to a Tomcat based server would sometimes just disappear. We spent day and night researching this, had discussions via mail with some Tomcat developers, set up probes and loggers just about everywhere, even spent a lot of time reading huge amounts of Tomcat's source code looking for a possible clue, when eventually we found the culprit: an overambitious system administrator had changed some timeout value in the AJP connector between Apache and Tomcat to an extremely low value. It was a sheer miracle that there even were requests at all that made it through. The ultimate fix, which had cost several weeks' worth of time, a couple of emergency meetings and the start of drafting migration plans to just about anything else that possibly wouldn't have this problem, consisted of adding I believe no more than three zeroes to this particular timeout value...
This is indeed somewhat of a problem in our profession. It's in general hard to find good metrics that quantify the performance of a programmer. Lines of code, number of closed tickets, or years of experience are all sometimes used but even though these might be indicative of performance, they all don't necessarily have to mean much.
Lines of code has been discussed quite often over the years, but it's typically not seen as a good indicator. People may use a lot of white space, or write a bunch of spaghetti code based on blindly copy-pasting stuff around. This blind copy pasting will result in extremely bad code that's often impossible to maintain. A better performing developer may actually refactor all this duplicate code and abstract it into some common class or method, in which case the LOC produced by said developer may actually be negative! Worse yet, people may check in stuff like .dia files to their source code repository, which might boost your supposed LOC productivity with thousands of lines, while all you did was draw a box with an arrow pointing to it.
On the other hand, LOC also doesn't mean nothing. I've seen developers reading Slashdot all day instead of coding and as a result their daily, monthly and even yearly LOC count was extremely low. We use among others statsvn http://www.statsvn.org/ and though not perfect it does give a very crude indication of who's very active and who's basically doing nothing all day long.
Number of closed tickets is an indicator too, but just as with lines of code it is hard to really use for measuring someone's performance. Tickets (issues/bugs) can vary wildly in complexity and the "estimated amount of hours" and "impact" are hardly ever accurate. Given two bugs, one can be as simple as adding a forgotten quote somewhere, while the other can amount to weeks of digging through the lowest levels of some code base. Yet, if tickets are assigned to developers without really taking into account their abilities, then over a longer period of time all developers should on average get an equal amount of quick & easy and hard tickets. In that case, the number of closed tickets might be indicative again. Someone who barely ever closes a ticket might not be that top performer, despite the inaccuracy of the ticket measurement.
Years of experience, which is I think used the most, is maybe also the most debatable of them all. It's a very natural measurement tool which takes no personal stuff into account. It's a very basic and easy to measure number. But here too, it can be deceiving. I've seen programmers who had some 8 years of Java experience, but appeared to be totally unable to pass a basic Java test and produced nothing but WTFs in their code like concatenating strings to each other with commas in between instead of storing them into a list, simply because they didn't grasp how a simple list actually worked! (I kid you not, I actually encountered this). In contrast with this, there's the guy (or gal) taking up some part-time job while still studying, who understands even complex stuff in the blink of an eye and produces nothing but exemplary code. But here too, given a group of all reasonably knowledgeable programmers, the ones with the most experience typically win out. When I look at my own code that I produced 10 years ago and compare it with what I produce now, I most definitely see a vast improvement.
Even though management might often have difficulties with measuring the performance of a programmer, there is one group of people who are true experts here: the team mates of said programmer; his or her fellow programmers! If you have worked in a team for some time, everybody knows who's the ace, who's the simply capable one and who is obviously trailing behind. As a programmer you actually work with the code of that other programmer. You are either able to extend that code with the greatest ease because of the elegant design and clear names being used, or you curse every minu
music is also a distraction; you should be thinking about the problems and coding rather than focusing on the deep beats of the music:).
Well, it might be a distraction sometimes, but it might also help. Specific kinds of music might get you in some flow, where you just become more alert. Often at the end of the day, things can be a bit slow and developers can become a little sleepy, which is certainly not helping with thinking about problems. Music, as long as it's not of the distracting kind, can actually help here.
Next to that, not all coding requires deep thoughts about complex problems. There always is some mundane amount of work to be done. Be it moving that button from left to right, changing those Strings to Integers, replacing some hardcoded text with i18n keys, etc. I often find that putting on some music gets me through this mundane stuff. Eventually I stumble on something interesting which does require some good thinking and then I might indeed turn the music off.
Our office is one large open space without any cubicles whatsoever. The developers used to be mixed with accounting, sales, etc. but this was abandoned after complaints started to pile up. Now the developers all sit together in one corner of the office, which is an improvement although you can still hear some slight prattle from the other guys.
Our music policy is somewhat relaxed though. A while back a co-worker of mine and I won an iPhone dev competition, where the first prize was a speaker set that we installed at the office. I can play music on this, provided it's not too loud. There basically is only one woman in accounting who can't stand music, but we either ignore her complaints or we wait till she is in a meeting or something before turning on the music.
Headphones are okay too at our place. The guy who's sitting next to me turns up the volume so much though, that we can all enjoy what he's listening to. No need to use headphones really, he might as well use the speaker set :P
These can be comments in code, but SVN (or any SCM) commit comments are also extremely useful for this.
I fully agree with that. That's why I usually go out of my way to persuade the developers in my team to write good commit comments. Of course, source code typically can't be attached to an SCM forever, but in most production environments it typically is (or well, should be...).
Next to commit comments, don't forget a link to an issue, ticket or bug (e.g. Jira, Trac, Bugzilla) that was the reason for the code to be written or modified. I see well written tickets and commit comments as an integral part of the code's documentation. Now some people only see a ticket as an order to do some work and thereafter discard the ticket. With such a view, you don't write your tickets in such a way that you still understand a year later what the issue was. E.g. tickets like: "Fix the prod. problem that bothered bob this afternoon". For me such tickets are worthless.
The required use of design anti-patterns, such as the Facade pattern, or the Endless-Wrapper pattern, where every OS call must be slowed down by an additional layer of indirection
The Facade pattern is not necessarily an anti-pattern. It can greatly simplify access to a complex sub-system, but as with all things this pattern should be used moderately and with care.
Wrappers can be overused indeed, but "every OS call must be slowed down" is exaggerating. For starters, it's just a call, you don't need to qualify it with "OS". Secondly, modern compilers and runtime optimizers (e.g. the JVM), can inline calls. Inlining calls effectively nullifies any overhead. Furthermore, this overhead is very minimal on modern systems and finally, it only really matters when doing many calls in a tight loop. The average business request which runs in say 100ms is completely unaffected by the 'overhead' of 10 extra calls.
I am the asshole with the 40-character variable names. Yes, I need to split statements over multiple lines sometimes, yes I have more lines of code and it takes me longer to type
Luckily, most of today's IDEs offer you variable name completion. Just enter a few letters, hit something like ctrl+space and the IDE completes the name for you. Using long variable names doesn't necessarily mean you have to type more.
We deal with this by using something that can be described as code editors.
These are people that edit raw code committed by other programmers and refactor it in order to comply with a set of standards. Editing may entail just changing a few variable names, but it can also be almost a complete rewrite of the code. Of course, if code rewrites become the rule rather than the exception, something is wrong and some serious talks with the original programmer will be necessary.
As long as the code editors are knowledgeable people and agree among themselves about the specific style, best practices and architecture to be used, this can greatly improve the quality and consistency of the code.
The point of writing code is not to have great code. It's to produce something that benefits the end user. The users don't care about the code or how cool it is. It's a tool they have to use in order to accomplish whatever their real goals are.
The point of building a house is not to have a great foundation. It's to produce something that benefits the end user. The users don't care about the foundation or how cool it is. It's a thing they have to live in, in order to accomplish whatever their real goals are.
Sounds insane? That's because it is. We wouldn't want to live in a house with a rotten foundation, why then would we want to accept that for software? Without a reasonably sound foundation of the code, your product will wither, fall over and simply cease to exist. I've seen too many examples of code bases that were rotten to the core, precisely because of people like you who think it's ONLY about external appearances. I suggest reading thedailywtf a little and you'll be surprised about the horrors that result from sloppy programming.
On top of that, Joda-Time (or rather a slightly modified version of it) is slated to be included in the standard library of Java 7.
also var from javascript, for declarations where you don't want to type out ArrayList<SomeObject> myList = new ArrayList<SomeObject>();. Just a few tweaks here and there would make Java so much more legible
It does indeed get a tweak a little like this. You still have to write out the declaration in full, but there's a shorthand for the generic part of the right hand side: ArrayList<SomeObject> myList = new ArrayList<>();. It doesn't look like a major improvement in this example, but when there is a lot of generic type information, it can surely help.
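To make the "a lot of generic type information" case concrete, here is a minimal sketch of the same declaration written both ways. The class and method names are made up for illustration, and the second method assumes a Java 7 or later compiler.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DiamondExample {
    // Without the shorthand: the type arguments are spelled out on both sides.
    static Map<String, List<Integer>> verbose() {
        Map<String, List<Integer>> scores = new HashMap<String, List<Integer>>();
        List<Integer> values = new ArrayList<Integer>();
        scores.put("alice", values);
        return scores;
    }

    // With the shorthand: the compiler infers the arguments from the declared type.
    static Map<String, List<Integer>> concise() {
        Map<String, List<Integer>> scores = new HashMap<>();
        List<Integer> values = new ArrayList<>();
        scores.put("alice", values);
        return scores;
    }
}
```

Both methods build the same map; only the amount of repeated type information differs.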
> OTOH, Java has support for generics
How are generics a good thing?
Because:
There are some other reasons though, not all generic usage is for collections, but these are two major use cases.
They are a band aid for the fact that not everything in Java is an object.
No they're not. Maybe unfortunately, but they don't even work with primitives in Java.
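A small sketch of what generics buy you with collections, and of the primitives limitation. These are the usual textbook arguments (compile-time checking and no casts), not necessarily anyone's exact list, and the class name is invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class GenericsExample {
    // Raw list: anything can go in, and every read needs a cast that may fail at run time.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int sumRaw() {
        List numbers = new ArrayList();
        numbers.add(1);
        numbers.add(2);
        int sum = 0;
        for (Object o : numbers) {
            sum += (Integer) o; // a String in the list would throw ClassCastException here
        }
        return sum;
    }

    // Generic list: the element type is checked by the compiler and no cast is needed.
    static int sumGeneric() {
        List<Integer> numbers = new ArrayList<Integer>();
        numbers.add(1); // the int is autoboxed to an Integer
        numbers.add(2);
        // numbers.add("three"); // would not compile
        // List<int> bad;         // would not compile either: type arguments must be reference types
        int sum = 0;
        for (int n : numbers) { // each Integer is unboxed again
            sum += n;
        }
        return sum;
    }
}
```

The commented-out lines are the point: the first mistake is caught at compile time, and the second shows that a primitive cannot be used as a type argument, so the values end up boxed.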
Plus any JVM I have seen is a piece of shit. Sorry, but if the official JVM takes several seconds just to start, that disqualifies it from a lot of perfectly good uses.
Java is currently primarily used server side. I'm not sure what kind of code you write, but if we need to restart the JVM once a month it's already a lot. A few seconds doesn't really matter when the uptime of your server is at least weeks, but more likely months or for 'finished & stable' software maybe even years.
The amount of times I've almost been knocked over on Damrak by a 6 foot woman cycling at 30 mph with 3 kids in trailers behind her while crossing the bike roads, only to then almost get knocked down by a tram... The Dutch must be so bored with drunk foreigners getting in the way of their high-speed cycling.
Well, to be honest they actually seem to disrespect themselves even less ;) If you can read Dutch, take a look at this: http://www.amsterdamcentraal.nl/archief/2010/3/23/je-moeder#c especially the comment made by henk @ "25 maart 2010 20:03 uur".
In addition to that, cyclists in Amsterdam are notorious for this kind of anti-social behavior. Try to visit some cities in the direct surroundings of Amsterdam (Haarlem, Utrecht) and you'll find a whole different atmosphere.
The Dutch are great though - ik hou van jullie. :)
Hey, thanks man! We houden ook van jullie hoor. ;)
Polite Europeans huh? Guess you never had to jump out of the way of a bicycle approaching you at high speed while you were in the middle of a pedestrian crossing in Amsterdam! :P
Yes that's true. My main point is simply that the situation isn't as black/white as saying that either using frameworks or using hand crafted code is always better for performance. It really depends, and the ability of an engineer to choose and use a given framework wisely is an important asset. Some bigger frameworks like Oracle's Coherence can absolutely improve performance too, while some home cooked solution may never obtain the level of performance that Coherence has, simply because you are maybe just not as good a programmer as Cameron is. But blindly applying Coherence for each and every problem simply because you heard "it's good for performance" of course doesn't work. But I doubt a programmer who thinks like that is capable of writing anything efficient himself anyway ;)
Buying a new DB server is cheaper than fixing the code. Renting one (say, from Amazon Web Services or Microsoft Azure) and keeping it offline until you're actually using it is cheaper still.
It depends. Sure, there are some models out there which are quite hard to beat. I surely won't underestimate the lurking power of certain cloud or grid offerings. Yet, IMHO it's naive to think 'the cloud' will just fix all your problems for you.
For starters, you might have a single query and a DB that can't run a single query in parallel (like Postgres or MySQL) which is already executing on the fastest hardware available. We for instance have a DB running on a dual socket Nehalem based server (8 cores), running at some 3.4 GHz with 4 Areca 1680s (each with 4GB of cache for a total of 16GB) and 32 Intel X25-E SSDs (8 per Areca). I dare you to come up with a faster machine without stepping into a completely different price category. You know, the kind of category where prices aren't listed anymore but sales representatives visit your office to negotiate a deal.
At the moment, there simply isn't a faster DB available and even the hugely expensive IBM POWER offerings are only a factor or so faster for our kind of workload. So simply buying a faster machine is absolutely not an option, even if we had money to burn (which incidentally, we haven't). So, if I can optimize some piece of code in 30 minutes, maybe even less, why shouldn't I do that? It's basically as simple as turning:
into
Why shouldn't I grab that low hanging fruit, and instead go on a mission to find a faster machine because Blakey Rat on /. told me that software optimizations, no matter how trivial, never pay off? That would be rather silly, wouldn't it? On top of that, you seem to forget that a new & faster machine, even if it existed, doesn't magically fly into service. Real people have to order it. Real people have to install it with something, real people have to test its stability, real people have to place it into some rack, etc. All of that costs money, time and potentially worries too.
But to blithely come in here and say "you're wrong, because what you said about batch processes doesn't apply to websites!" that's just insulting to me and all your readers.
It is surely not my intention to insult anyone, but if you read back my post then you might see that I mentioned what I said was true particularly for web sites, not exclusively for web sites. Although there is a big difference between interactive and non-interactive payloads, at the end of the day a batch job should not recklessly consume resources either. Just because batch jobs can be queued and can be run right after each other, doesn't mean there isn't someone waiting for the result. Worse yet, if batch jobs are submitted to your queue at a certain rate and your system can only process them at a lower rate because of some completely unnecessary resource hogging, then eventually your queue will spill over. Your DB may still not be offended, but a slow turnaround and a refusal to accept new jobs surely will offend customers again.
Now you'll probably say that I simply need to throw more hardware at it and that batch jobs by their very nature can always run in parallel on multiple machines, but this is certainly not true in general. In order to make that possible you need to carefully craft your software architecture, which takes... guess what... engineering effort. If you'r
I'm a bit of both an algorithmics and programming language design geek with leanings for high level programming, and I never quite understood
and a programming technique compatible with the underlaying hardware and/or OS structure,
this part. A good language that allows for flexible design doesn't necessarily have to do much with the hardware, but it still allows you a very good access to the algorithmic part of the problem.
It depends really. The right algorithm and especially the complexity of that algorithm obviously matters a lot, but for true high performance computing, knowing your hardware can make a huge difference too. Regarding space/time optimization, it simply matters to know if you have more memory or more cpu cycles to burn in your actual machine. Very few to no algorithms can optimize for both space and time, so you have to know your target hardware in order to make the necessary tradeoff.
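As a toy illustration of that space/time tradeoff, here is a sketch that counts the set bits of an int in two ways: by looping over the bits on every call (no extra memory, more cycles), and by looking up precomputed answers in a 256-entry table (a little memory, fewer cycles per call). The class and method names are invented for the example, and which variant actually wins depends on the target machine.

```java
final class SpaceTimeTradeoff {
    // Time-heavy variant: inspects the bits one by one on every call, uses no extra memory.
    static int bitsByLoop(int value) {
        int count = 0;
        while (value != 0) {
            count += value & 1;
            value >>>= 1; // unsigned shift, so negative inputs terminate too
        }
        return count;
    }

    // Space-heavy variant: 256 precomputed answers, one per possible byte value.
    private static final byte[] TABLE = new byte[256];

    static {
        for (int i = 0; i < 256; i++) {
            TABLE[i] = (byte) bitsByLoop(i);
        }
    }

    // Splits the int into four bytes and adds up the four table entries.
    static int bitsByTable(int value) {
        return TABLE[value & 0xFF]
                + TABLE[(value >>> 8) & 0xFF]
                + TABLE[(value >>> 16) & 0xFF]
                + TABLE[value >>> 24];
    }
}
```

On a memory-starved device the loop may be the right choice, while on a machine with plenty of cache the table may be. The JDK's own Integer.bitCount is yet another option.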
Next to such general choices, it can matter immensely if your hardware has special provisions that you can exploit. For instance some CPUs have hardware support for BCD arithmetic, which can speed up calculations a great deal if your software can take advantage of it. Knowledge about alignment of your memory can matter a lot too. Mathematically identical algorithms may cause your target machine to resort to thrashing if you misalign every memory access and/or access 'adjacent' memory cells in the wrong order. If you have to process some matrix, it normally doesn't matter for the algorithm whether you process it row wise or column wise. If the matrix is stored into memory in such a way that cells in a single column are adjacent in memory, while each row in this matrix is the column length of cells away, then processing row by row may cause a page fault for every new cell access which completely blows your performance away.
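The matrix case can be sketched as follows. Note that this example uses the mirror image of the layout described above: the matrix sits in one flat array row after row (row-major), so here it is the column-wise walk that jumps through memory. All names are made up for the illustration, and no timings are claimed, since those depend entirely on the machine.

```java
final class TraversalOrder {
    // Element (r, c) of a row-major matrix lives at index r * cols + c.

    // Walks memory in storage order: each access is right next to the previous one.
    static long sumRowWise(int[] matrix, int rows, int cols) {
        long sum = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                sum += matrix[r * cols + c];
            }
        }
        return sum;
    }

    // Same result, but each access lands `cols` elements away from the previous one.
    static long sumColumnWise(int[] matrix, int rows, int cols) {
        long sum = 0;
        for (int c = 0; c < cols; c++) {
            for (int r = 0; r < rows; r++) {
                sum += matrix[r * cols + c];
            }
        }
        return sum;
    }
}
```

The two methods are mathematically identical. On a large enough matrix the second one tends to be the slower of the two because of cache misses and, in the worst case, page faults.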
Then there are such things as certain CPU instructions like an atomic test and set, or certain addressing modes that your language may or may not be capable of taking advantage of. Using such things can matter a lot too. Then there is a whole category of special hardware, like DSPs, vector units and lately GPUs. Taking advantage of such hardware might make more of a difference than choosing between a bad and a good 'normal' algorithm that only runs on a general purpose CPU. A simplified example is hardware H264 decoding. If your CPU is not powerful enough, then no algorithm, no matter how cleverly constructed will be able to decode say an HD H264 stream. But if your machine happens to have an HD H264 hardware decoder, then obviously simply using it in your software will yield much better results. On a more fine-grained scale, it's the same with all those little things your hardware offers. Is your language able to take advantage of that, or does it just let it sit idle?
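For what it's worth, Java does expose this kind of instruction through the java.util.concurrent.atomic classes. A minimal sketch of a test-and-set style try-lock, with a class name invented for the example (on common platforms the JIT typically turns getAndSet into a single atomic exchange, but that is an implementation detail, not a guarantee):

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class TestAndSetLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    // Atomically sets the flag to true and returns true only if it was false before,
    // i.e. only for the one caller that actually acquired the lock.
    boolean tryLock() {
        return !locked.getAndSet(true);
    }

    void unlock() {
        locked.set(false);
    }
}
```

A second tryLock() before unlock() returns false, however many threads race for it.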
I basically agree with your point that making the right choices regarding performance matters immensely. It's just that I don't think this necessarily means that you should write all code yourself instead of intelligently using a library or framework.
You can either write slow and inefficient code yourself, or you can use an existing library in such a way that it will be slow and inefficient and vice versa. I've written high performance software capable of processing 6000 fairly complex transactions per second on moderate hardware, while still using highly convenient library classes like Java's ConcurrentHashMap. Some would perhaps even say that just using Java is already a 'framework' and I should have used C++ or maybe assembly... where does it stop?
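To give an idea of what "highly convenient" means here, a minimal sketch of several threads counting transactions per account in a ConcurrentHashMap without any explicit locking. This is not the software mentioned above. The class, method and account names are invented, and merge() only exists since Java 8, so this assumes a newer JDK than the one current at the time of writing.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class TransactionCounter {
    private final ConcurrentHashMap<String, Long> counts = new ConcurrentHashMap<>();

    // merge() performs the read-modify-write atomically for the given key.
    void record(String account) {
        counts.merge(account, 1L, Long::sum);
    }

    long countFor(String account) {
        return counts.getOrDefault(account, 0L);
    }

    // Hammers one key from several threads and returns the final count.
    static long demo(int threads, final int perThread) {
        final TransactionCounter counter = new TransactionCounter();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.record("acct-1");
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.countFor("acct-1");
    }
}
```

With a plain HashMap the same code would lose updates. Here no increment is dropped, and the locking strategy is the library's problem instead of yours.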
I'm also implementing some high performance (beta) code using prereleases of Java's upcoming fork/join framework and the level of performance I can squeeze out of this is fairly impressive to say the least. I'm not sure if I would be able to write my own "work-stealing work queue" that offers better performance. Actually, I'm pretty sure I wouldn't.
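For readers who haven't seen it, a minimal sketch of the fork/join style, written against the API as it ended up in java.util.concurrent (Java 7 and later). It is not the code mentioned above. The task, its threshold and the array-summing problem are just an illustration.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

final class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 10_000; // arbitrary cut-off, tune per workload

    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    SumTask(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        // Small enough: just do the work sequentially.
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Otherwise split in two; the forked half can be stolen by an idle worker thread.
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();
        long rightSum = right.compute();
        return left.join() + rightSum;
    }

    static long parallelSum(long[] data) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new SumTask(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }
}
```

All the work-stealing happens inside ForkJoinPool. The task only says how to split the problem and how to combine the results.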
The home grown solution is often faster to develop than the time it takes to learn the intricacies of a monolithic framework. It is also easier to teach other employees the logical to-the-point framework that does only what is required by the problem you are trying to solve.
I'm not entirely sure about that last thing you're saying. The problem is that teaching your framework over and over again to new developers takes a fair amount of time too. You'll also have to make damn sure you write good documentation and keep it up to date. Depending on the size of your development team and the amount of time dedicated to such a framework this may be doable, but more often than not this is precisely the part that's lacking.
A standard well known framework has the advantage of having heaps of documentation already written. When stuck, programmers can post questions in forums, google away for answers, buy some books or even follow some course. This won't be possible with your proprietary in-house developed framework. When stuck, programmers will come directly to you and if maintaining this framework is not your core job, this can get annoying pretty fast.
Don't get me wrong, it can definitely pay off to build something of your own. After all, most best-of-breed open source frameworks on the market today were once started because some programmer in some company wasn't satisfied with the readily available stuff. I'm just saying that you should really think twice before starting development of some framework and carefully consider your options.
For instance, if you feel the need to develop yet another web framework, logging framework or ORM for Java, my first advice would be to simply not do it. The number of web frameworks already available is simply getting ridiculous and don't think lightly of the pressure you put on your fellow programmers who have to learn a new g*dd*mn framework for each and every new Java job they take. It would be far better to start with one of the excellent web frameworks already available (e.g. JSF, Wicket), and extend them where necessary. Most good frameworks have a lot of options for customization and extension. Going that route will not only save you and your fellow programmers time, it will also allow you to leverage third party offerings for that framework. In the case of JSF, there are a couple of high quality component sets emerging (like RichFaces and PrimeFaces). With your own web framework, you simply won't be able to use any of that.
The database doesn't get offended if you ask it the same thing over and over.
The database maybe not, but those other customers on whose behalf your code is executing other queries that now run slower because you needlessly consume a big chunk of the DB's resources surely will get offended. Offend them enough and they will run away. This will in turn offend your manager or boss and when it becomes clear that YOU were the one whose careless code caused the server to grind to a halt, you might find yourself without a job.
Particularly for (public) websites you have to be careful to never consume an unreasonable and unnecessary amount of resources. The database might not groan when you run your single user test executing 10,000 queries in a loop for one user. However, just a little peak in your traffic and you suddenly find some odd 10,000,000 queries being fired at your DB, where 1000 could have sufficed. Believe me, there is a HUGE difference between being able to process 1000 queries per second and 10,000,000.
A mentality such as yours is frequently the reason why websites completely fail to scale up as traffic increases. We shouldn't overzealously optimize in advance, but executing 10,000 queries where 1 would perfectly suffice only 'because the DB doesn't get offended' is just silly.
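To make the difference tangible, a small sketch with a fake in-memory "database" that does nothing but count round trips. FakeDb and its two lookup methods are entirely hypothetical stand-ins for a real data access layer, and the SQL in the comments only hints at what a real implementation might send.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class QueryBatching {
    /** Hypothetical stand-in for a database: it only counts round trips. */
    static final class FakeDb {
        private final Map<Integer, String> rows = new HashMap<>();
        int roundTrips = 0;

        FakeDb(int size) {
            for (int id = 1; id <= size; id++) {
                rows.put(id, "user" + id);
            }
        }

        // Think: SELECT name FROM users WHERE id = ?
        String findName(int id) {
            roundTrips++;
            return rows.get(id);
        }

        // Think: SELECT id, name FROM users WHERE id IN (...)
        Map<Integer, String> findNames(Collection<Integer> ids) {
            roundTrips++;
            Map<Integer, String> result = new HashMap<>();
            for (Integer id : ids) {
                result.put(id, rows.get(id));
            }
            return result;
        }
    }

    // One round trip per id: the "DB doesn't get offended" approach.
    static Map<Integer, String> oneQueryPerId(FakeDb db, List<Integer> ids) {
        Map<Integer, String> result = new HashMap<>();
        for (Integer id : ids) {
            result.put(id, db.findName(id));
        }
        return result;
    }

    // One round trip for the whole list.
    static Map<Integer, String> oneQueryForAll(FakeDb db, List<Integer> ids) {
        return db.findNames(ids);
    }

    static List<Integer> idsUpTo(int n) {
        List<Integer> ids = new ArrayList<>();
        for (int id = 1; id <= n; id++) {
            ids.add(id);
        }
        return ids;
    }
}
```

Both methods return the same map. For 10,000 ids the first one costs 10,000 round trips and the second one costs 1, and that is per user request.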
I've said this before and I'll say it again, because I've created plenty of unoptimized "batch jobs" that take longer than they should.
COMPUTERS ARE CHEAPER THAN HUMANS
A cubicle filled with racks of computers running inefficient batch jobs costs a tiny fraction of a competent person sitting in the same cubicle and optimizing every little thing by hand.
The best results are often obtained by going for a hybrid approach. Either extreme is not good. I've seen places where people tried to save a mere 50 bucks on some simple memory upgrade, and instead let a team of programmers optimize code for months. On the other hand, downright sloppy programming like asking the database the same question a thousand times in a loop and just throwing all the hardware at it that you can buy "since programmers shouldn't be bothered to think cause that costs money" is equally stupid.
Just make sure you have reasonably fast hardware. You don't need 300 servers with 128GB of memory to run the website of Joe's Sandwich shop, but you surely don't have to artificially limit yourself to a 256MB server for running that huge enterprise app your 100+ staff makes daily use of. Likewise, let programmers spend some time on optimizing code. Lifting some invariant from a loop is an easy win which every programmer worth their salt should be able to do. Knowing how to operate a profiler and having some basic idea of what part of the app gets hit most frequently, so some focussed optimization can be directed to hot spots, never hurts either.
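Lifting an invariant out of a loop looks like this in its simplest form. The scenario (projecting balances with a fixed growth factor) and all names are invented for the example. A JIT compiler may sometimes do this for you, but it cannot when the invariant is, say, a database lookup.

```java
final class InvariantLifting {
    // Recomputes the same growth factor on every pass through the loop.
    static double[] projectNaive(double[] balances, double yearlyRatePercent, int years) {
        double[] out = new double[balances.length];
        for (int i = 0; i < balances.length; i++) {
            out[i] = balances[i] * Math.pow(1.0 + yearlyRatePercent / 100.0, years);
        }
        return out;
    }

    // The factor does not depend on i, so it is computed once, before the loop.
    static double[] projectLifted(double[] balances, double yearlyRatePercent, int years) {
        double factor = Math.pow(1.0 + yearlyRatePercent / 100.0, years);
        double[] out = new double[balances.length];
        for (int i = 0; i < balances.length; i++) {
            out[i] = balances[i] * factor;
        }
        return out;
    }
}
```

The results are identical. The second version simply stops asking the same question once per element.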
Premature optimization surely is the root of all evil, but some common sense optimizations in proven hotspots really should be among the standard tasks of every developer. Don't mindlessly change all your instances of Double to double in Java, since that is supposedly faster, but in that tight loop of 10,000,000 iterations that executes every few minutes it might not be a bad idea to simply change one letter in order to prevent 10,000,000+ unnecessary boxing/unboxing operations.
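The one-letter change in question, as a minimal sketch (the class and method names are made up, and how much the JIT manages to optimize away on a given JVM is something only a profiler can tell you):

```java
final class BoxingInLoop {
    // Capital D: every += unboxes the Double, adds, and boxes the result into a new object.
    static double sumBoxed(int iterations) {
        Double total = 0.0;
        for (int i = 0; i < iterations; i++) {
            total += i * 0.5;
        }
        return total;
    }

    // Lowercase d: plain arithmetic on a primitive, no objects involved.
    static double sumPrimitive(int iterations) {
        double total = 0.0;
        for (int i = 0; i < iterations; i++) {
            total += i * 0.5;
        }
        return total;
    }
}
```

Both methods return the same value. The only difference is the amount of garbage produced along the way.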
They are only "global variables" when they are JVM (or whatever) scoped. The problem with simple global variables that they are always global scoped
Not true. "simple global variables" (public static fields of some class) can be isolated in Java by means of class loaders. Although class loaders can be a PITA themselves, they actually work for this.
Nothing like shooting the shit with your pals after a long day of watching Star Trek VHS tapes.
Drag yourself into 2010 man, Star Trek is on Blu-ray now...
(and yes, 1985 called, they do want their VHS tapes back)
Given two bugs, one can be as simple as adding a forgotten quote somewhere, while the other can amount to weeks of digging through the lowest levels of some code base.
There's also a third category, my favorite: weeks of digging through the lowest levels of some (old, undocumented, messy) codebase, which is ultimately followed by a fix that adds a forgotten quote. How do you even quantify that kind of thing?
That's a good question. It's actually not an uncommon situation at all. Some years back we discovered that pages posted to a Tomcat based server would sometimes just disappear. We spent day and night researching this, had discussions via mail with some Tomcat developers, set up probes and loggers just about everywhere, even spent a lot of time reading huge amounts of Tomcat's source code looking for a possible clue, when eventually we found the culprit: an overambitious system administrator had changed some timeout value in the AJP connector between Apache and Tomcat to an extremely low value. It was a sheer miracle that there even were requests at all that made it through. The ultimate fix that had cost several weeks worth of time, a couple of emergency meetings, starts of drafting migration plans to just about anything else that possibly wouldn't have this problem, consisted of adding I believe no more than three zeroes to this particular timeout value...
This is indeed somewhat of a problem in our profession. It's in general hard to find good metrics that quantify the performance of a programmer. Lines of code, number of closed tickets, or years of experience are all sometimes used but even though these might be indicative of performance, they all don't necessarily have to mean much.
Lines of code has been discussed quite often over the years, but it's typically not seen as a good indicator. People may use a lot of white space, or write a bunch of spaghetti code based on blindly copy-pasting stuff around. This blind copy pasting will result in extremely bad code that's often impossible to maintain. A better performing developer may actually refactor all this duplicate code and abstract it into some common class or method, in which case the LOC produced by said developer may actually be negative! Worse yet, people may check in stuff like .dia files to their source code repository, which might boost your supposed LOC productivity with thousands of lines, while all you did was draw a box with an arrow pointing to it.
On the other hand, LOC also doesn't mean nothing. I've seen developers reading slashdot all day instead of coding and as a result their daily, monthly and even yearly LOC count was extremely low. We use among others statsvn http://www.statsvn.org/ and though not perfect it does give a very crude indication of who's very active and who's basically doing nothing all day long.
Number of closed tickets is an indicator too, but just as with lines of code it's hard to really use for measuring someone's performance. Tickets (issues/bugs) can vary wildly in complexity and the "estimated amount of hours" and "impact" are hardly ever accurate. Given two bugs, one can be as simple as adding a forgotten quote somewhere, while the other can amount to weeks of digging through the lowest levels of some code base. Yet, if tickets are assigned to developers without really taking into account their abilities, then over a longer period of time all developers should on average get an equal amount of quick&easy and hard tickets. In that case, the number of closed tickets might be indicative again. Someone who barely ever closes a ticket might not be that top performer, despite the inaccuracy of the ticket measurement.
Years of experience, which is I think used the most, is maybe also the most debatable of them all. It's a very natural measurement tool which takes no personal stuff into account. It's a very basic and easy to measure number. But here too, it can be deceiving. I've seen programmers who had some 8 years of Java experience, but appeared to be totally unable to pass a basic Java test and produced nothing but WTFs in their code like concatenating strings to each other with commas in between instead of storing them into a list, simply because they didn't grasp how a simple list actually worked! (I kid you not, I actually encountered this). In contrast with this, there's the guy (or gal) taking up some part-time job while still studying, who understands even complex stuff in the blink of an eye and produces nothing but exemplary code. But here too, given a group of all reasonably knowledgeable programmers, the ones with the most experience typically win out. When I look at my own code that I produced 10 years ago and compare it with what I produce now, I most definitely see a vast improvement.
Even though management might often have difficulties with measuring the performance of a programmer, there is one group of people who are true experts here: the team mates of said programmer; his or her fellow programmers! If you have worked in a team for some time, everybody knows who's the ace, who's the simply capable one and who is obviously trailing behind. As a programmer you actually work with the code of that other programmer. You are either able to extend that code with the greatest ease because of the elegant design and clear names being used, or you curse every minu
(and yes, some people prefer to talk rather than email - mainly because it actually gets a result)
Maybe THEY do, but for US it's a lot easier to discuss code via email than over the phone.
Ever tried to orate a code fragment of non trivial size? It ain't pretty I tell ya :P
music is also a distraction; you should be thinking about the problems and coding rather than focusing on the deep beats of the music :).
Well, it might be a distraction sometimes, but it might also help. Specific kinds of music might get you in some flow, where you just become more alert. Often at the end of the day, things can be a bit slow and developers can become a little sleepy, which is certainly not helping with thinking about problems. Music, as long as it's not of the distracting kind, can actually help here.
Next to that, not all coding requires deep thoughts about complex problems. There always is some mundane amount of work to be done. Be it moving that button from left to right, changing those Strings to Integers, replacing some hardcoded text with i18n keys, etc. I often find that putting on some music gets me through this mundane stuff. Eventually I stumble on something interesting which does require some good thinking and then I might indeed turn the music off.
Our office is one large open space without any cubicles whatsoever. The developers used to be mixed with accounting, sales, etc but this was abandoned after complaints started to pile up. Now the developers all sit together in one corner of the office, which is an improvement although you can still hear some slight prattle from the other guys.
Our music policy is somewhat relaxed though. A while back a co-worker of mine and I won an iPhone dev competition, where the first prize was a speaker set that we installed at the office. I can play music on this, provided it's not too loud. There basically is only one woman in accounting who can't stand music, but we either ignore her complaints or we wait till she is in a meeting or something before turning on the music.
Headphones are okay too at our place. The guy who's sitting next to me turns up the volume so much though, that we can all enjoy what he's listening to. No need to use headphones really, he might as well use the speaker set :P
These can be comments in code, but SVN (or any SCM) commit comments are also extremely useful for this.
I fully agree with that. That's why I usually go out of my way to persuade the developers in my team to write good commit comments. Of course, source code typically can't be attached to an SCM forever, but in most production environments it typically is (or well, should be...).
Next to commit comments, don't forget a link to an issue, ticket or bug (e.g. Jira, Trac, Bugzilla) that was the reason for the code to be written or modified. I see well written tickets and commit comments as an integral part of the code's documentation. Now some people only see a ticket as an order to do some work and thereafter discard the ticket. With such a view, you don't write your tickets in such a way that you still understand a year later what the issue was. E.g. tickets like: "Fix the prod. problem that bothered bob this afternoon". For me such tickets are worthless.
The required use of design anti-patterns, such as the Facade pattern, or the Endless-Wrapper pattern, where every OS call must be slowed down by an additional layer of indirection
I am the asshole with the 40-character variable names. Yes, I need to split statements over multiple lines sometimes, yes I have more lines of code and it takes me longer to type
Luckily, most of today's IDEs offer you variable name completion. Just enter a few letters, hit something like ctrl+space and the IDE completes the name for you. Using long variable names doesn't necessarily mean you have to type more.
We deal with this by using something that can be described as code editors.
These are people that edit raw code committed by other programmers and refactor it in order to comply with a set of standards. Editing may entail just changing a few variable names, but it can also be almost a complete rewrite of the code. Of course, if code rewrites become the rule rather than the exception, something is wrong and some serious talks with the original programmer will be necessary.
As long as the code editors are knowledgeable people and agree among themselves about the specific style, best practices and architecture to be used, this can greatly improve the quality and consistency of the code.
The point of writing code is not to have great code. It's to produce something that benefits the end user. The users don't care about the code or how cool it is. It's a tool they have to use in order to accomplish whatever their real goals are.