Don't Shoot Me, I'm Only the Software
ctwxman writes "How often have you heard about some massive crash and then the blame was placed on the software? "Disasters are often blamed on bad software, but the cause is rarely bad programming." If you've been looking to blame your boss, this article from MSNBC says your ship has come in! Poor planning, poor execution and poor leadership are more likely to blame than bad code when it comes to systems that fail. "
How ironic that MSN(BC) is pushing a story about 'don't blame the programming'.
Although legitimate in the concept, I would say that poor programming is most definitely a cause for system failures.
If you've been looking to blame your boss, this article from MSNBC says your ship has come in!
I think this little gem says it all. Strangely enough, it's today's Dilbert. Thing is, the buck-passers are the ones protecting their own image or the image of those who write their cheques. The result? Too many failed projects are blamed on interns or programmers, rather than the truth coming to light.
Why? I think it's simple, really. Management often has no clue what they are doing in terms of managing a technical project so they make decisions about things like the exact features, and they often fight to get things a certain way -- unwittingly forcing programmers to code all the way around the block to get to the house next door, leaving problems in the wake.
The best case is when a programmer is given design autonomy. That's why Open Source is such a threat to large companies like Microsoft... because the guys who know what *can* be done, are the same guys doing it -- the result is 1111x better, and cheaper too.
I am so lucky to be working now for a company that allows me to have full autonomy with my projects. They tell me what the customer wants and I do it the way I think is best. Every single project done in this manner has resulted in happy customers and excellent systems.
The dangers of knowledge trigger emotional distress in human beings.
I have worked at CIMM level -3 and at CMMI level 5 groups. Starting at level 5, you're more likely to win the lottery while on vacation on the moon than to get fired for bad software; at level 1 you're highly likely to get fired for a bad programming mistake; at level -3 you try to point the finger for anything.
Now there's a mathematical formula (let me see if I can derive one): for each level you go down, the half-life of bad software divided by the software engineer goes up a log base 10 (4 - 95%, 3 - 90%, 2 - 75%, 1 - 50%, 0 - 25%, -1 - 10%, -2 - 2%, -3 - .01%). Thus, if you want management to point fingers, go down in levels, but if you want the group to be aware of problems, then look for a high CMM level group to work for.
Disclaimer: this is in no way scientific but used for illustrative purposes; objects may be closer than they appear; no left turn on red; do not pass Go.
Big projects require organization or shit happens.
Uh, that's it. Thrilled?
Great.
Bugs and errors do not necessarily mean BAD programming. Bad means that it sucks and it's of no quality level.
There may be minor flaws in things that an application relies on i.e. external code libraries or methodologies which have not been proven and tested.
Speaking of tested, how many coders here can testify that they are great programmers, but so many times have not been given appropriate amounts of time or resources to write something that works perfectly?
That to me is not bad programming, because many times under these situations programmers do an amazing job of writing amazing code which actually works for the most part.
There are too many managers out there who like things to work 90% and say that the other 10% (which usually ends up being crucial to end users) can be dealt with after the initial release.
The more things change, the more they stay the same. I fought in the Brain Wars with management 30 years ago, and it was the same thing. The Powers wanted X, but system capabilities were Y. They did not want the issue confused with facts, they just wanted what they wanted, and wanted it yesterday. My peers and I coded it as close as we could, implemented it, and crossed our fingers. We kept the app running for about a week (with frequent baling wire and Band-Aid patches), but the system eventually melted down due to data overload (fancy-speak for filled up the disk).
Management skated, two programmers fired.
That's why I raise cattle and write hunting articles these days.
Ignorance is curable, stupid is forever.
The story was that windows had to be rebooted regularly or simply would stop working and reboot on its own.
Now of course you are right that some admin forgot the fortnightly reboot and that led to the problems, but I simply can't totally dispute the notion that a server OS that has to be regularly rebooted should at least take a share of the blame.
The article cites as an example,
Last month, a system that controls communications between commercial jets and air traffic controllers in southern California shut off because some maintenance had not been performed.
As I recall, the system in question has to be rebooted every thirty days, which is a software problem! The very fact that there were ridiculous procedures to fail to carry out is due to the poor software in the first place.
I beg to differ slightly.
Software projects seem to be primarily constrained by time/money which is usually controlled by management (read: boss)
If one wants to test software properly then you will need lots of the constraints (i.e. time and/or money). Just before a coder is testing his block, he/she will generally say something like:
"I'm finished the block, just need to test it a bit more"
Generally that is not what management will hear, they hear:
"I'm finished"
So they think "it's ready". I've seen several first generation projects get hit by this problem (in commercial environments). In the IC design world (where it's not generally possible to just flash the firmware to fix a bug) it's accepted that at this point - i.e. when primary design is finished - you are only 50% of the way through. We spend at least half the time verifying the blocks. Management in IC design has accepted that this is just as important as the implementation and so doesn't go off making wild assumptions.
So rather than just pawn off the blame onto your boss, it really is (partially anyway) your fault as well for not highlighting the fact that your block is not as tested as you would like it to be.
The philosophy of open source seems to limit the "it's ready" effect to a good degree, hence the perception of better code quality. When mainstream commercial coding picks up the slack, it should get better as well. But generally a lot of these messes can be attributed to communication (person to person) failure rather than coder/boss failures.
[ Monday is a terrible way to spend one seventh of your life. ]
how many large companies think that they can still be successful by programming their way out of problems.
If you work at a company that places some value on requirements and design development before you start cranking out code, consider yourself fortunate. And for those of you that have a consistent process for development and deployment, you're not that common either. There are still a considerable number of large companies with a presence on the web that rely on individual heroics to keep their business running.
In most cases, it's management's reliance on a few people within development that keeps them from making any improvements. That and the lack of understanding that spending some money could make (or save) them significant amounts of money.
The fact that YOUR SOFTWARE shuts down after 49.7 days "to prevent data overload" is YOUR FAULT and BAD SOFTWARE DESIGN, no matter how much you use your pet news outlet to spin otherwise.
You're right about one thing, though. The FAA guys were idiots for deploying your software to replace an (eminently more reliable by all accounts) UNIX-based system. Call it a compound failure.
-Isaac
I am not a lawyer, and this is not legal advice. For Entertainment Purposes Only.
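For anyone wondering where a number like 49.7 days comes from: it is what you get when a 32-bit counter of milliseconds runs out of room. A minimal sketch of the arithmetic, assuming (the thread doesn't actually say this) that uptime is kept as an unsigned 32-bit millisecond tick count. The function names are made up for illustration.

```python
# Assumption: uptime is tracked as an unsigned 32-bit count of milliseconds.
MS_PER_DAY = 24 * 60 * 60 * 1000


def days_until_wrap(bits: int = 32) -> float:
    """Days until an unsigned millisecond counter of `bits` bits wraps to zero."""
    return 2 ** bits / MS_PER_DAY


def elapsed_ms(start: int, now: int, bits: int = 32) -> int:
    """Elapsed ticks between two readings, tolerating one wrap of the counter."""
    return (now - start) % (2 ** bits)


if __name__ == "__main__":
    print(f"32-bit ms counter wraps after {days_until_wrap():.1f} days")
    # One reading taken 500 ms before the wrap, one taken 250 ms after it.
    print(elapsed_ms(2 ** 32 - 500, 250))
```

Doing interval math with modular subtraction stays correct across a single wrap, which is one common way to avoid needing a scheduled reboot at all. Whether that would have applied to the system in the article is a guess.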
Sorry Microsoft, it's the software. When I go to the local airport and see a kiosk displaying a Windoze 2000 screen saver instead of information, something is wrong with the software running the kiosk. I'm sure that the kiosk owner followed all of the directions given and the stupid thing did not work anyway. A box that has to be restarted once a month and crashes when it's not has a software problem. Having two of them will simply multiply the problem by a factor of two.
How am I so sure that software, not people, is to blame? It's easy, I started using non-Microsoft software and most of my problems went away. I've got the same old hardware, it just works better under Linux. It does more for me too.
Why is that? It might be that there's no nasty registry that's designed to keep me from "stealing" software. It might be that sane networking models really do eliminate most problems with worms and viruses. It might be that free software really works to make better code. Who cares?
The bottom line is obvious. No amount of blame shifting will change it.
Friends don't help friends install M$ junk.
However, I do take issue with the following quote:
"Another common theme in failures lies in the ranks of employees who actually must use the systems. Often they're not given proper training. There's also a chance that they don't want the project to succeed, especially if they see it as a threat to employment."
Never give the credit so quickly to evil intent if you can chalk it up to simple laziness instead. I doubt many employees consciously try to cause software crashes, in comparison with the number who just don't have a clue what they're doing.
And, naturally, programmer error will always cause a certain amount of crashes... we are human too. Testing's just a way of minimizing that.
Support more choices in government - Vote 3rd party.
So in other words the lives of airplane passengers depend on whether a computer is rebooted manually or not. Thank god nothing really bad happened during this radio outage, otherwise some smartass would have blamed it on the tech that forgot to reboot.
The main problem is obviously we're relying on systems and procedures that never have been tested under emergency conditions.
So far I was never scared to board a plane, but now I am. Especially after learning that air traffic communication relies on something that I abandoned at home for security reasons.
If someone is to blame, it's the authority that gave permission to run such systems without proper testing. The remaining question is whether this will have consequences. AFAIK there were 5 incidents where the safety distance between planes was violated... shouldn't the FAA investigate this and enforce procedures to avoid those sorts of incidents in the future?
is that IT tasks have been highly compartmentalized - to the point where coders are actually versed in only a limited set of coding languages (or just one).
And coders cannot be designers, DBAs, or possess much business knowledge. Interaction with the end user is done with a 'business designer'.
As with the childhood game of post office, some of the information gets lost at every node in the SLCD (software life-cycle design) chain.
One of the best fixes is to allow direct interaction of coder/end-user, and merge the designer/developer roles for a better industry understanding.
Translation: they didn't hire enough analysts
Translation: They didn't hire enough consultants from SAP.
"Developers are least qualified to validate a business requirement. They're either nerds and don't get it, or they're people in another culture altogether," said Michelsen,...
Translation: It's not our fault the developers couldn't understand our brick of a business case.
Another common theme in failures lies in the ranks of employees who actually must use the systems.
Translation: It's not our fault the interface sucks - it's the stupid users too dumb to adapt to our software.
From the article: "Developers are least qualified to validate a business requirement. They're either nerds and don't get it, or they're people in another culture altogether,"
I used to think this. Then I realized that at least the developers knew one end of it -- they knew what the software can do. The other end, what the customer wants out of the system, is usually not known by anyone. Not management, certainly not sales, and not the customer either.
A customer with an existing system will often try to write requirements which amount to "do exactly what the existing system does in exactly the way it does it", which is not what they want or they wouldn't be replacing the system. Or, whoever is providing the business requirements will be so out of touch with their own business that the requirements will be incomplete or wrong. Or on the flip side they'll be so familiar with the system that they'll leave out things which are obvious to them -- but so obscure outside their field that no one on the software side will even notice the omission.
Of course, these problems will be discovered very late in the development cycle, resulting in a scramble to redesign and redevelop, a bunch of fingerpointing, mandatory overtime, and a host of other ills all of which lead to bad and buggy software.
...isn't actually the fault of MS programmers? In that case, given that leadership is one of the factors, then I can legitimately blame Bill Gates personally. So that's alright, then.
I've bitched about this before, but why can't news sites provide links to their sources? This is the internet, after all; we have the technology. It would take seconds, and obviously the journalist already has the information. Yes, I know I can search it myself, but I shouldn't have to - the supporting documentation should be provided by the person writing the article. (And, yes, I'm aware of the irony of saying that without providing a link. :) But I'm stating an opinion, not a fact.)
http://www.nist.gov/public_affairs/releases/n02-1--RJ
Not surprising that a CEO would make this remark. I can't count the times I've asked the business community I'm working with for clarification of a business rule or requirement, and then gotten a 'sigh' or other look that says - "I'm too busy to worry about this".
And on the contract I'm working on now, they consider a 30 min phone meeting enough information to build a full blown app - trying to get documentation is like pulling teeth. And of course we know where the finger will be pointed if there's any issues.
To say we're nerds who don't "get it" is just an asinine, condescending remark; a) I'm perfectly able to learn about the business involved, b) If you explain the rules properly most developers I know have no problem at all coding the solution. I find most of the developers I work with brighter than the business community they're working with. The CEO's remark has a Dilbert-like quality to it imo, and this guy's one of the 'experts' on the problem in the article... ha!
'The unexamined life is not worth living' - Socrates
- software projects are usually done in concert with business process changes,
- business process changes are often poorly managed, and
- the resulting problems are usually not due to software implementation issues
In other words, if you want your software projects to succeed, recognize that the management and executives are where a company's resources should be concentrated. The programmers are usually unimportant to the success of a project, and businesses can safely spend fewer resources on them without negatively affecting most projects.
The Pittsburgh Post-Gazette has a closely related story: Software disasters are often people problems. Well, duh: "Garbage in; garbage out."
What I find really interesting is that this story, or various versions of it, while hardly "new," starts popping up on news sites all at once? It sounds like some organization is running a PR campaign, but it isn't quite astroturfing.
Contrary to popular belief here on /., MS does not hire idiots to write their code
Amen to that. I don't know where this idea that MS doesn't hire skilled people to design and develop software came from, but it's wrong.
It has always appeared to me that MS hires top students from the very best schools.
bhj
Irony would be if MSN(BC) blamed windows. For instance, here they were saying that the problem with the FAA UNIX to windows migration was not software (windows) but the lack of testing and maintenance. They say that the system required regular maintenance (in windows I think this means rebooting) but I am sure there is probably more to it than that. However, I don't see that MSNBC is being Ironic - there is nothing anti-Microsoft or anti-windows that could be taken from this. In fact, it is right on the correct spin factor you would expect. Here they are saying the recent bad press associated with a public outage from a UNIX to windows migration was not a problem with buggy windows but a problem with management of the system.
Nevertheless, it's those poor planners, poor executors, and poor leaders who are in charge. You really think they are going to take the blame? No, of course not! It's so much easier, more fun, and better for your career to tell upper management that it was just the programmers who couldn't follow their instructions correctly.
Programmers will then get blamed, the poor managers will get a bonus for "correctly" identifying the problem, and corporate America will sail on as it always has: giving the big bucks to the managers and sales folks, while ignoring the programmers.
Who me, bitter?
...that's the reason why bad code is written and why systems crash.
.. patch time!
..
:-)
I have, time and again, been asked to cut corners in the design during the implementation phase of a project. The result is that too much is cut in order to meet the deadline, another project sucks out key resources after the deadline, and the product is rushed into production.
Everybody is happy until things start falling apart
44% of the employees (a couple of hundred) in my department are consultants, employed on time-limited contracts. Some slam things together knowing they won't be present when "patch time" starts
Bad testing, bad deadlines and rushed projects are the cause of a lot of evil!
Luckily, I can still express myself in the cvs comments and the random comments in the code
The article is a bunch of malarkey. Well, I suspect it is, but I stopped after the first couple paragraphs, after I read this:
Last month, a system that controls communications between commercial jets and air traffic controllers in southern California shut off because some maintenance had not been performed.
Yeah. That maintenance they failed to perform? It was their mandated once-a-month reboot of their windows system, because it locks up after 43 days.
This was the result of bad programming.
Anyway, as a QA guy, I can assure you that bad programming abounds. It's my job to make sure you never see it. Part of that job is trying to drill into programmers' heads the concept that performing to spec when used as directed is not sufficient.
This is just like television, only you can see much further.
IT budgets are still shrinking....
We need to hire MORE managers.
Considering that you deal with users who don't really know what they are doing in the first place, I would have to place the majority of the blame on them. However, you could also retrospectively place the blame on IT for not having the systems locked down in the first place, but then you would have to blame the CEO and the board for not putting more technology in the budget. Yeah, we won't go there.
They bought a Yugo (windows) to do the job of a truck (UNIX). The Yugo needed more maintenance than the truck, and they had an accident. They fired the 'state of the industry' execs who decided to replace trucks with Yugos. This is actually good news, in a way. Now all they need to do is get the trucks back.
Hmm... I wonder if the execs running nuclear power plants have finished installing windows to run them....
Better yet, we can put windows in charge of the ICBM fire control systems. We'll be *so* state of the industry.
Last week, my WinXP box locked up in the middle of a game. It was so bad, smoke poured out of the case. The software, probably WinXP but maybe the game, had overused one of the RAM modules so hard that two of the leads were charred black.
%^$#@& SOFTWARE!!!!
I've always wanted to design a system that gets nastier every time a user repeats an error.
In the generic sense, it would start with "Could not do xyz. Please check what you intended to do and try again."
Then, it would progress through "I can't do that. Try again." "You're starting to wear on my nerves. Can't you do anything right?"
Then, it would start to get more down to the source of the problem, beginning with "DOES NOT COMPUTE" and ending, finally, with "You fucking moron, my program works, read the manual before i cut you."
Stupid users always bothering me with crap.
ACs are modded -6. I don't read you, I don't mod you, I don't see you. Don't like it? Don't be a coward.
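The escalating-nastiness idea above is easy enough to sketch. This is only a rough illustration: the class and method names are hypothetical, the messages are the ones quoted in the post (minus the final one), and it assumes escalation is tracked per user and per error.

```python
from collections import Counter

# Messages taken from the post, mildest first (the final one is left out here).
MESSAGES = [
    "Could not do xyz. Please check what you intended to do and try again.",
    "I can't do that. Try again.",
    "You're starting to wear on my nerves. Can't you do anything right?",
    "DOES NOT COMPUTE",
]


class EscalatingErrors:
    """Hypothetical helper: returns a nastier message each time an error repeats."""

    def __init__(self, messages=MESSAGES):
        self.messages = list(messages)
        self.counts = Counter()  # (user, error) -> times already reported

    def report(self, user: str, error: str) -> str:
        key = (user, error)
        # Clamp to the last message once the escalation levels run out.
        level = min(self.counts[key], len(self.messages) - 1)
        self.counts[key] += 1
        return self.messages[level]


if __name__ == "__main__":
    errors = EscalatingErrors()
    for _ in range(5):
        print(errors.report("bob", "xyz"))
```

Keying the counter on both user and error means a different user, or the same user making a new mistake, starts back at the polite message.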
from tfa: "It becomes a major role of (management) to kind of herd the cats in and make them all line up in a reasonable way," said Barry Wilderman
yea it becomes much harder when you have to work with people who not only have bad communication skills but may not know the subject matter.
(sarcasm) enjoy the saved cash you paradigms shifting fucks
let alone the fact that many times you don't know if you're getting someone who has made 10 hello world programs and counts themselves as a pro in each.
ah and the images in tfa of people holding each other watching their software fail was priceless.
One major thing that comes in between coding near-perfect software (perfect software is never going to be possible) is also the demand the customers place on the team. Of course, they know very little about the technology, so we cannot blame them totally.
In India, software companies treat the customer as God, accepting his/her unreasonable targets. I wouldn't blame the customers alone for it... the managers too are responsible. They agree to whatever the customer says even though the actual development team asks them not to. And then, the normal work timings stretch to 10 AM to 3-4 AM next day... Now, does anybody think anyone can write quality code when they are working these hours?
Well, the only advantage that comes here is that we get to read all the /. stories
"Microsoft owns MSNBC, so this is clearly an evil plan to blahblahblah.."
Actually, now that I think about it, that's probably closer to the truth than anything else...
Not a Twitter sockpuppet... but I wish I was.
This is why Free Software tends to be more secure. The project managers tend to be programmers, not non-techy businessmen. They understand the concepts of "still needs work" and "not ready yet" even if a product is late. Commercial software vendors would rather release a program on time and hide any last-minute security flaws that pop up (to be fixed in some patch, which is perhaps another profit generator). Open Source projects, led by the programmers themselves, will usually prefer to hold back a new version if they feel it's not reliable enough for release. Besides, that's what developer versions are for.
If it weren't for fog, the world would run at a really crappy framerate.
We have a user here that sent out an e-mail to 30 people that were definitely not supposed to get it. This came about because she opened up a distribution group and was pulling out the three names from that list and adding them to the e-mail message. But in the process of all of this, she also added the group as a whole (double-clicked to open it, even though that adds it to the message, but a button opens it to retrieve names).
There was then supposedly a program crash and magically the message went out.
I was of course blamed because as the network admin I somehow failed by being unable to bring back all of those e-mails, even though there are a million things wrong with that train of thought.
Clearly they couldn't imagine that:
1) software crashes don't cause mail to send
2) why was she removing names from a group instead of selectively adding them
3) she didn't use the software correctly on multiple counts
4) if she is clearly not competent enough to handle this and it was such an important e-mail, why was she given the task and not someone higher up?
In the end, yet one more reason I hate my job.
There are some odd things afoot now, in the Villa Straylight.
Assuming all my hardware is behaving nicely if a crash occurs that means a piece of software somewhere has failed, be it OS, network or what have you.
I boycott signatures
Sorry boss, you're getting paid to know. Spend some time (gasp! outside of work if you need to) and read up. While you're not expected to know every last implementation detail, you should understand the capabilities of your chosen platforms completely.
A mission-critical system should be interrupted exactly when you want, not on a schedule dictated by a calendar. The original "BS!" poster was right: if there are memory leaks, garbage collection problems, etc., then that's evidence of sloppy design work.
Saying you need regular reboots is the same as saying you need a firewall to protect against viruses: both show flaws in the design of the OS.
And as far as "fscking their disks every day" goes, that's more sloppy design. You shouldn't have to do that. Fsck fixes file system errors resulting from poor application behavior, environmental problems, and (sometimes) hardware troubles. You shouldn't have those every day in mission-critical systems, but even if you do then putting in place a system of daily fsck is not the way to fix it.
I've had a production application server running for the last 288 days. It's due to come down for OS updates, but it will do so on my schedule, not because its operating system is poorly designed.
sigs, as if you care.
"The Mythical Man Month" should be required reading for every six figure mouth breather out there. Of course, it's thicker than "Who moved my cheese" and can't be purchased in an airport gift shop, so I suppose there's no hope...
*** Sigs are a stupid waste of bandwidth.
The assumption that MS hires "idiots" is unfair to be sure. However, those in the know who have seen some of the colossal kludges in MS software, and recently almost all Windows users who have been impacted by the repeated, massive virus/worm attacks base their knowledge on the only thing they know about Microsoft--their products.
It has always appeared to me that MS hires top students from the very best schools.
That is true--unfortunately they have been known to hire them AWAY from the best schools too (ie. before they graduate). It doesn't matter if they are top five percentile students--if they have zero practical experience and are thrown into a situation beyond their capabilities the result can be less than ideal. Nonetheless, I think that by now MS has figured out how to select and place recent grads and students hired before graduation. I think the problem is now deeper than that.
Microsoft triumphed over other tech companies that were prominent in its early days because BillG learned it had to become a marketing company (the same reason Apple still exists today--Jobs knew that from the start and Gates is a very quick study). Other tech companies remained software companies--they toiled away to make their next killer app the best it could be and marketing was an afterthought.
Microsoft, from 1980 on at least, has been a marketing company first, with software development second. The most important technology it markets was invented elsewhere and merely extended by Microsoft. Only in the company's latter life have they been truly serious about research. The long time "thinkers" are brilliant but historically little has come out of Microsoft's research that has been commercially successful given the potential funding power MS has had.
Therein lies the problem. The article is right--software isn't the root cause of the vast majority of failures (even when the failure is the direct result of a software bug). At Microsoft, software design is driven by marketing--time deadlines, customer requests for features, backwards compatibility/legacy support etc. The result is the house of cards we build our systems upon today.
That result is unavoidable without EXTREMELY skilled planning and throttling the pace of change. Unfortunately, The MS Ship sails where the winds take it, and the pace of change has been rapid and relentless until now. I once thought the problems with MS products were because too many drop-outs were running the show. After seeing this blog I can see what the development teams have had to cope with. They have to do the impossible and try to get it done before the deadline slips yet again and MS market cap slips a few million and BillG comes down to yell at them. In some cases you have to be brilliant just to survive at MS.
So anyways, I think software bugs are the immediate cause of a lot of disasters, but the ROOT cause definitely is poor planning and project management that leads to unstable system development.
If you're programming with other programmers, you are operating in an environment that has constraints built in. You are constrained by the quality of your teammates, by the amount of time available, by the list of desired features, and so on.
Now imagine that managers are faced with constraints. They have to deal with the insane deadlines imposed on them by the C-level people in the company. Middle managers in particular are often in a very unenviable position, in that they have to try to make impossible demands possible. But just as there are varying levels of programming skill, there are varying levels of management skill. Some managers can push back on their bosses enough to give the project a chance of succeeding, but many are ill-equipped to do so.
Those that are ill-equipped to do so are in this position primarily because unlike the field of programming, where specialized education is seen as a necessary prerequisite to employment (i.e. - "He's got a bachelor's in Computer Science from MIT, we'll hire him") most managers either have no specialized management training, or they have an MBA (a degree that sometimes offers real management training but often provides no practical hands-on management training at all), or even worse, they've been in the same company or types of companies for years, learning the same bad management habits over and over.
What businesses need to do is pay more attention to actual real-world leadership experience and training. "Manager" is a term that reeks of 19th Century automated factories. When you're dealing with abstract concepts, creativity, and continually-shifting requirements, you need to have leadership skills.
You also need to have people skills, and while it's easy to berate salespeople and managers because they often seem defined by their "touchy-feely" capabilities, the flip side is that without those abilities, it's very very difficult to lead people.
Read the EFF's Fair Use FAQ
If you want to know what a true fiasco is like, just Google "CoreFLS" and read the results.
At the U.S. Department of Veterans Affairs, some of the payroll systems date back to 1964 (that's right - no joke, they were bought when Lyndon Johnson was President), so they decided to replace them with a new system based on Oracle Financials. The new system is called CoreFLS. It has been a fiasco. So far VA has already spent over $270 Million out of an expected $472 Million total budget for the project. The project has been a failure laregly because of mis-management and plain-old stupidity.
First, they decided to do test trials at one of the busiest hospitals (that's right, they first went live at one of the *BUSIEST* hospitals) instead of a smaller test location. The user training for a critical system consisted of a self-paced web-based distance training as detailed here. No hands-on training was provided until a month after deployment and only after problems were apparent because the whole operation ground to a halt. So finally the senior managment decide to commission a $500,000 study from Carnegie Mellon to find out why it failed. The study concluded that CoreFLS was "an exemplary case study in how not to do technology transition." Yeah, they needed to spend a half-million to find the obvious.
Finally Congress got involved and all the senior managers including the Secretary himself were put on the "hot-seat" to testify. Lots of heads rolled (even senior managers like Assistant Secretaries) and lots of people were forced to resign or were fired. Now the place is crawling with federal investigators looking to put people in jail.
So now the project gets cancelled. The sad thing is that VA really needed this program to succeed. I suspect that the technology has been made a scapegoat for mismanagement (not that the technology was perfect). Well.. back to 1964.
More like, the plane was known to continuously leak oil and/or other safety fluids to the point where it became dangerous or unreliable. They could have either replaced the plane or fixed the problem for greater cost, but chose to ignore the problem until one day missing that critical oil change caused a near crash.
This isn't about a standard maintenance procedure, since a server should not have to be rebooted constantly in order to maintain stability/functionality. That's like saying it's ok to swap the oil every second flight because it's cheaper than fixing the actual problem (that there's a leak in the first place).
And actually, considering that many earlier Windows problems were caused by memory leaks... not such a bad analogy now...
Back in the late 60's and early 70's Texas Instruments started hiring lots of new college graduates to help them stay abreast of the latest technology. The object was to put them on a project with lots of unpaid overtime, work them at new-hire salary for four years and then, if they didn't leave on their own, gently "boot" them out the door and hire fresh, new replacements. After four years and lots of unpaid overtime, a lot of well trained engineers were ready for better jobs at other companies, taking TI's technology with them. TI trained a lot of engineers. By the mid 70's they realized what was happening and the policy was reversed.
No they don't. It's been studied for decades, but in all that time we've still not settled on anything that's actually demonstrated over long periods of time to be good. Hardware, materials and other available resources are continuously evolving and changing, meaning that software design research has nothing to reliably settle on before things change again.
We don't even have consistent and proven programming languages. Today it's Java, C#, VB and a variety of imperative scripting languages. Yesterday it was C and C++, before that it was Fortran, and before that there have been variants of assembler. And as we use these languages, we're constantly discovering more and more about language design and developing new languages.
HCI is still in very early stages of development, and that's a major part of software engineering. (If people can't use software then what's the point?) The vast majority of software development shops -- particularly smaller ones -- don't even employ HCI experts, and substantial proportions of developers still don't respect them or understand what the point is.
Something like bridge building, for instance, has been studied for centuries (if not millennia). It relies on consistent physics, consistent tools and well understood environments. Organisations that build bridges have well established experience, procedures and regulations that are put in place throughout their organisation. Software development's been studied for a few decades, with the existing materials, resources and expectations constantly changing underneath it.
Organisations that build software still don't have any reasonable idea of how to arrange themselves, or what procedures they should be using. There have certainly been some pretty good ideas from relatively recent ongoing studies, but the fact that managers, developers, marketers and whoever else frequently don't gel together very well, usually with bad results, is just an ongoing consequence of it being a very new field.
Just because software engineering has been studied for a few decades doesn't mean we know what we're talking about, or even that we know what we're studying.