IT Infrastructure As a House of Cards

All comes down to budget by Admodieus · 2010-05-24 10:23 · Score: 5, Informative

In most organizations, the IT department is treated as pure cost instead of something that provides strategic value. These IT departments have no chance of getting a budget approved that will allow them to "start over" on any part of their implementation; hence the constant onslaught of temporary fixes and patches.

--
"It's a reverse vampire...they....they crave the sun!"

Re:All comes down to budget by Opportunist · 2010-05-24 10:27 · Score: 4, Insightful

Budget and the lack of ability to see ahead, on the side of the decision makers.
Far too often decision makers are not the people who also have to suffer, I mean work with the tools they bought. They are often easily swayed by a nifty presentation from a guy who doesn't know too much either but promises everything, and of course the ability to cut cost in half, if not more, so they buy. Only to find out that the solution they bought is not suitable to the problem at hand. And then the bandaids start to pop up.

--
We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
Re:All comes down to budget by Megaweapon · 2010-05-24 10:40 · Score: 4, Insightful

They are often easily swayed by a nifty presentation from a guy who doesn't know too much either but promises everything, and of course the ability to cut cost in half, if not more, so they buy.
If you've worked in a huge shop, you know that the big software vendors send reps out to IT managers for golf outings and the like. Screw it if the software works or not, just fluff up the guy with the budget rubber stamp.

--
I'm sure "SlashdotMedia" will improve on all the wonders that Dice Holdings blessed us all with
Re:All comes down to budget by eln · 2010-05-24 10:42 · Score: 5, Informative

The problem is not with kludges themselves, but with the fact that IT management does not stress documentation and proper change control procedures enough. If a kludge works, is documented, was implemented with proper change controls, and can be repeated, is it really a kludge anymore? IT has to screw around with stuff to make it work, that's what they (we) get paid for. If all we ever had to do was click on an install button and have everything work perfectly from there, what would be the purpose of an IT department at all? Off-the-shelf software and hardware can never be made to work perfectly for everyone's requirements. IT folks are paid to get non-unique components to work for unique requirements.

The problem is not with these fixes, it's that nobody ever documents what they did, and documentation is not readily available when needed. So, these kludges become tribal knowledge, and people only know about them because they were around when they were implemented or they've heard stories. When this happens, these wacky fixes can come back and bite you in the ass later when something mysteriously crashes and no one can get it to work like it did because nobody remembers what was done to make it work before. As people come and go, and institutional knowledge of older systems slowly erodes, we end up in a situation where everyone thinks the current system is crap, nobody knows why it was built that way, and everyone figures the only way out is to nuke the site from orbit and start over. The trick is keeping it from getting to that point.

Of course, nobody likes jumping through all these hoops like filing change control requests or writing (and especially maintaining!) documentation, so it gets dropped. IT management is more worried about getting things done quickly than documenting things properly, so there's no incentive for anyone to do any of it. Before long, you get a mass of crap that some people know parts of, but nobody knows all of, and nobody knows how or where to get information about any of it except by knowing that John Geek is the "network guru" and Jane Nerd is the "linux guru".

We will never get hardware and software that works together exactly the way we want them to. We will always have to tweak things to get them to work right for us. Citing lack of budgets or bug-ridden software may be perfectly valid, but those problems are never really going to be solved. Having our own house in order does not mean fixing all the bugs or being able to refresh our technology every 6 months. Having our own house in order means we know exactly what we did to make each system work right, we can repeat what we did, and everyone knows how to find information on what we did and why.
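
One way to picture the "house in order" standard described above is a record kept for every workaround. This is only a minimal sketch: the `WorkaroundRecord` class, its field names, and the `needs_attention` helper are hypothetical illustrations, not anything named in the thread or taken from a real change-control tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WorkaroundRecord:
    """Hypothetical record for one kludge; every field name is an assumption."""
    system: str                            # which system was touched
    what: str                              # what was done
    why: str                               # why it was needed
    steps: List[str]                       # exact steps, so the fix can be repeated
    change_request: Optional[str] = None   # change-control ticket, if one was filed

    def is_repeatable(self) -> bool:
        # Repeatable only if both the reasoning and the steps were written down.
        return bool(self.why.strip()) and len(self.steps) > 0


def needs_attention(records: List[WorkaroundRecord]) -> List[WorkaroundRecord]:
    """Return the records that are still tribal knowledge:
    either not repeatable or never put through change control."""
    return [r for r in records
            if not r.is_repeatable() or r.change_request is None]
```

Running `needs_attention` over the full list would give a rough picture of how much of the environment only exists in someone's head.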
Re:All comes down to budget by mlts · 2010-05-24 11:01 · Score: 3, Interesting

Isn't this taught to death in ITIL 101 that every MBA must go through in order to get their certificate in an accredited college? It sort of is sad that the concepts taught in this never hit the real world in a lot of organizations. Not all. I've seen some companies actually be proactive, but it is easy for firms to fall into the "we'll cross that bridge when we come to it" trap.
Re:All comes down to budget by BigSlowTarget · 2010-05-24 11:04 · Score: 2, Insightful

To be blunt, most IT departments act like cost centers and don't provide any strategic value. Business units help by shorting the budgets and whining about band-aid technology instead of seeing how IT can build the business. It takes an exceptional move by IT or amazing insight from a business unit to raise IT above the slog and allow it to provide a competitive advantage to the business units. Projects that do this get firehosed with funding.
Consultants take advantage of this catch 22 situation when they sell new projects. It lets them get the new implementations and cutting edge development. This situation also causes application oriented mini-IT organizations to pop up in the business units from time to time. That, in turn, causes more headaches for central IT.
Re:All comes down to budget by Grishnakh · 2010-05-24 11:21 · Score: 3, Interesting

I don't think budget is a problem at all here. The problem as described by the article is with vendor-provided software being crufty and having all kinds of problems. The author even mentions that normal free-market mechanisms don't seem to work, because there's little or no competition: these are applications used by specific industries in very narrow applications, and frequently have no competition. In a case like this, it doesn't matter what your budget is; the business requirement is that you must use application X, and that's it. So IT has a mandate to support this app, but doing so is a problem because the app was apparently written for DOS or Windows 95 and has had very little updating since then.
The author's proposed solution is for Microsoft to jettison all the backwards-compatibility crap. We Linux fans have been saying this for years, but everyone says we're unrealistic and that backwards compatibility is necessary for apps like this. Well, it looks like it's starting to bite people (IT departments in particular) in the ass.
Re:All comes down to budget by almitchell · 2010-05-24 11:30 · Score: 3, Insightful

That is very true. Unless you are working for a highly visible technology company or high-profile corporation, most companies simply want you to keep the mess you've got going, no matter if it means bandaids and soldering irons. Over the course of my 20-year career at four different companies, and from talking with colleagues, it is much the same story - the steering committee says, we initially invested hundreds of thousands of dollars, but we sure as hell aren't overhauling and starting over. The best boss I ever had did what I now refer to as guerrilla network administration. We had an aging infrastructure supporting hundreds of banks that we wanted to migrate off Wintel based systems, because they were end of lifing and we were sick of holding it together with bandaids and baling wire. It kept breaking, we sysadmins were sick of sitting through post-mortem meetings, and we were sick of upper management's refusal to acknowledge that it wasn't that the sysadmins sucked, it was that we didn't have a magic wand to keep the old nag on its feet to pull the plow. We repeatedly added a new plan into the budget to change out the system, and just as repeatedly got turned down because of cost. One day we had what was a near catastrophic hardware failure on one of those systems and we had to wait three days to get the parts on it because no one supported it any more. My boss told us to let it sit, and we did, chewing Tums the whole time because it's not like he was the one who would get fired over it. When upper management asked why we were still down after 24 hours, he took a PO in to them and said that when they signed off on new and functional systems, the problem would be fixed. When they balked, he asked them whether the cost of a new system was really so exorbitant compared to the manhours and effort it took to nurse a sick system well beyond its years. They signed, we had it changed out in two months, and we went another four years without so much as a hiccup.
If he hadn't moved out of state I would have followed that man anywhere.

--
Baseless self confidence kills more people each year than bathtubs.
Re:All comes down to budget by Vellmont · 2010-05-24 12:08 · Score: 4, Insightful

If a kludge works, is documented, was implemented with proper change controls, and can be repeated, is it really a kludge anymore?

Yes.
You either don't know what a kludge is, or don't have enough ability to see how fixing things or implementing something the wrong way can really be a horrible mistake that feeds on itself and creates other mistakes. Kludges aren't something you can simply document around. The rest of your post isn't really worth responding to, since it makes the false assumption that kludges are simply poorly documented behavior. If that's the worst you've seen, you're lucky.

--
AccountKiller
Re:All comes down to budget by HangingChad · 2010-05-24 12:11 · Score: 4, Insightful

the IT department is treated as pure cost instead of something that provides strategic value.
I can't count the times I've gone in somewhere and seen major deficiencies in their IT infrastructure. I mean really bad, O-M-G size problems. And when you point them out they act like you're trying to pad your billing. Just fix whatever isn't working that day. One of them was a doctor's office.
Imagine if their patients acted that way. I don't care if I have cancer, just remove that lump in my underarm.
That's what you get when the problem is dictating the solution.

--
That's our life, the big wheel of shit. - The Fat Man, Blue Tango Salvage
Re:All comes down to budget by Grishnakh · 2010-05-24 12:18 · Score: 2, Insightful

To be honest, though, Linux is generally very good at backwards compatibility if you statically-link everything when you compile (as is frequently the case with commercial software). The Linux system calls never change, except to add new ones once in a while, so it should be very rare that something doesn't run.
Of course, if something is compiled with dynamic links, this isn't the case, as many of the dependencies will change over the years, but that's why static linking is available, to avoid this problem. Dynamic linking is better for software that's distributed by the distro, as they can make sure all dependencies are in place. Boxed commercial software doesn't have this luxury, so it needs to stick with static linking.
The main place where people complain a lot about Linux's backwards compatibility is with drivers, but that's a design decision. In Linux, drivers are supposed to be included with the kernel. If you don't want to do that, then you'll suffer the consequences. Application software doesn't have this problem as it doesn't link directly to the kernel.
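
To make the static-versus-dynamic distinction above concrete, here is a rough Python sketch that inspects an ELF executable and reports whether it asks for a dynamic loader. It relies on one assumption stated plainly: a binary with a `PT_INTERP` program header needs the system's dynamic linker at startup, while a fully statically linked one normally has no such header. The function name `requests_dynamic_loader` is made up for illustration, and this is a heuristic, not a replacement for tools like `file` or `ldd`.

```python
import struct

PT_INTERP = 3  # ELF program header type that names the dynamic loader


def requests_dynamic_loader(data: bytes) -> bool:
    """Return True if the ELF image in `data` has a PT_INTERP segment."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = data[4] == 2                  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"    # EI_DATA: 1 = little, 2 = big endian

    # Offsets of e_phoff, e_phentsize and e_phnum differ between ELF32 and ELF64.
    if is_64bit:
        (phoff,) = struct.unpack_from(endian + "Q", data, 32)
        phentsize, phnum = struct.unpack_from(endian + "HH", data, 54)
    else:
        (phoff,) = struct.unpack_from(endian + "I", data, 28)
        phentsize, phnum = struct.unpack_from(endian + "HH", data, 42)

    # p_type is the first 4-byte field of every program header entry.
    for index in range(phnum):
        (p_type,) = struct.unpack_from(endian + "I", data, phoff + index * phentsize)
        if p_type == PT_INTERP:
            return True
    return False
```

Reading a file with `open(path, "rb").read()` and passing the bytes in is enough to try it on a binary of your own; a result of `False` suggests the program carries its libraries with it, which is the property the comment argues boxed commercial software should have.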
Re:All comes down to budget by Cryacin · 2010-05-24 13:00 · Score: 4, Funny

Yeah, that's why the sane firms have rules on accepting gifts.
Yes, and both of them have never looked back!

--
Science advances one funeral at a time- Max Planck
Re:All comes down to budget by TheRealMindChild · 2010-05-24 16:33 · Score: 2, Insightful

I've seen some companies actually be proactive, but it is easy for firms to fall into the "we'll cross that bridge when we come to it" trap.

To be honest, in all of my years as a programmer eventually becoming a full software engineer (meaning I design, implement, and maintain software solutions), doing it "The Right Way" has always led to bankruptcy. Always. Of course correlation is not causation, but for the times I've seen companies fail when "following the process" vs. "Release early and often", the latter half were the ones to stay in business.

--

"When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
Re:All comes down to budget by Some+Bitch · 2010-05-24 20:27 · Score: 2, Insightful

To be honest, in all of my years as a programmer eventually becoming a full software engineer (meaning I design, implement, and maintain software solutions), doing it "The Right Way" has always led to bankruptcy. Always. Of course correlation is not causation, but for the times I've seen companies fail when "following the process" vs. "Release early and often", the latter half were the ones to stay in business.
You can do "release early, release often" within an ITIL framework, just because most places implement it poorly doesn't mean it can't be done well.
Re:All comes down to budget by Opportunist · 2010-05-24 20:41 · Score: 2, Insightful

Odd, ain't it? Those sales, I mean, training meetings are always in a holiday resort. When your boss is at a "business meeting" at some place near the sea or high up in the mountains (Summer and Winter, respectively), you better make some room in the next few weeks in your schedule, you're gonna get some new hard- or software.

--
We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
Re:All comes down to budget by wintermute000 · 2010-05-24 21:01 · Score: 3, Insightful

This is because IT is managed by managers, not engineers.
If all managers had coalface IT backgrounds at least (even to the point of just helpdesk) the problem would not be there.
As usual strategic and policy decisions are being made by people who don't understand the nuts and bolts.
Would you design a car by having a committee of non-engineers approve every major decision? No. But that is how IT infrastructure seems to be built...
Re:All comes down to budget by TheRaven64 · 2010-05-24 23:10 · Score: 2, Interesting

The correct strategy is to get someone whose job it is to take bribes. You pay them a small salary, which they can add to with any extra gifts that they receive. You tell all of the sales reps that they are the ones with the final purchasing authority. Then you let someone competent actually make the decision.
Alternatively, you can use a downwards delegation strategy, where the people at the top have to justify purchasing decisions to the tier below them (recursively). You're free to take as many kick-backs as you want, as long as the people implementing your decision agree with it.

--
I am TheRaven on Soylent News
Re:All comes down to budget by ckaminski · 2010-05-25 01:25 · Score: 2, Funny

Yeah. That only took 15 years.

I don't believe in a lot of things by Culture20 · 2010-05-24 10:30 · Score: 5, Funny

...but I believe in Duct Tape.
As long as your backup and tertiary machines have different kludges keeping them running, there's no problem...

As a non-developer, this is what I see by Em+Emalb · 2010-05-24 10:30 · Score: 4, Insightful

Maintaining code is boring.

Everyone wants to work on the latest and greatest stuff, no one wants to maintain or even release patches.

It sucks, especially since it isn't limited to just software development.

I've seen companies where their "core switch" was a Cisco 2548. This wasn't 10 years ago, this was last year! Unreal.

--
Sent from your iPad.

Re:As a non-developer, this is what I see by drachenstern · 2010-05-24 10:37 · Score: 4, Interesting

As a dev, what's the problem with a 24 port gigabit switch as the "core" on a medium sized office? Aside from the fact that 10Gb is becoming popular (has become popular?) in the datacenter? Most desktops are only at the 1Gb level (and most users at below 100Mb), and most inbound internet pipes are much smaller. I don't understand the downfall here.
Can you elaborate?

--
2^3 * 31 * 647
Re:As a non-developer, this is what I see by Em+Emalb · 2010-05-24 10:56 · Score: 2, Interesting

No redundancy, is the biggest one. No real layer 3 switching is another.

--
Sent from your iPad.
Re:As a non-developer, this is what I see by oatworm · 2010-05-24 10:58 · Score: 2, Interesting

Ditto this - plus, in a medium-sized office, you're probably not getting 10x24Gb/sec out of your server infrastructure anyway. Your network is only as fast as the slowest component you rely upon; at 10Gb/sec, you're starting to bump into the limits of your hard drives, especially if you have more than a handful of people hitting the same RAID enclosure simultaneously.
Re:As a non-developer, this is what I see by JerkBoB · 2010-05-24 10:59 · Score: 4, Interesting

As a dev, what's the problem with a 24 port gigabit switch as the "core" on a medium sized office?
If all you've got is 24 hosts (well, 23 and an uplink), then it's fine. I suspect that the reality he's alluding to is something more along the lines of multiple switches chained together off of the "core" switch. The problem is that lower-end switches don't have the fabric (interconnects between ports) to handle all those frames without introducing latency at best and dropped packets at worst. For giggles, try hooking up a $50 8-port "gigabit" switch to 8 gigabit NICs and try to run them all full tilt. Antics will ensue... The cheap switches have a shared fabric which doesn't have the bandwidth to handle traffic between all the ports simultaneously. True core switches are expensive because they have dedicated connections between all the ports (logically, if not physically... I'm no switch designer), so there's no fabric contention.

--
A host is a host from coast to coast...
Unless it's down, or slow, or fails to POST!
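
The fabric argument above comes down to simple arithmetic, sketched here in Python. The assumption (a common convention, not something stated in the thread) is that a non-blocking switch needs fabric capacity equal to ports times port speed times two, since each full-duplex port can send and receive at line rate at once. The 4 Gb/s fabric figure in the usage note is a made-up number for a cheap switch, not a measurement of any real product.

```python
def required_fabric_gbps(ports: int, port_speed_gbps: float) -> float:
    """Fabric capacity needed for every port to run full duplex at line rate."""
    return ports * port_speed_gbps * 2


def oversubscription_ratio(ports: int, port_speed_gbps: float,
                           fabric_gbps: float) -> float:
    """How many times the worst-case demand exceeds the fabric's capacity.
    A value of 1.0 or less means the switch is non-blocking."""
    return required_fabric_gbps(ports, port_speed_gbps) / fabric_gbps
```

For the 8-port gigabit example, `required_fabric_gbps(8, 1)` gives 16 Gb/s; with a hypothetical 4 Gb/s shared fabric, `oversubscription_ratio(8, 1, 4)` is 4.0, which is where the latency and dropped packets would come from. A 24-port gigabit switch would need 48 Gb/s by the same reckoning.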
Re:As a non-developer, this is what I see by Anonymous+Struct · 2010-05-24 12:25 · Score: 2, Insightful

Absolutely nothing. A 24 port gigabit switch makes a great foundation for a small to medium-sized network with typical business use. It's a stretch to call it a 'core', but anybody who tells you that you need some kind of crossbar fabric chassis switch at the center of your average branch office is just trying to sell you hardware and service contracts.
Re:As a non-developer, this is what I see by Em+Emalb · 2010-05-24 12:33 · Score: 2, Informative

The network it was running was not a small network. Not at all. It was a travesty that this poor switch was running the network. Well over 200 devices plugged into other 2548s all bridged back to the poor "core" switch.

--
Sent from your iPad.

Summary by dangitman · 2010-05-24 10:35 · Score: 3, Funny

There are a lot of people writing shitty software. Film at 11.

--
... and then they built the supercollider.

Re:Summary by Rob+Riggs · 2010-05-24 11:24 · Score: 3, Insightful

You obviously do not belong here.
Nerds only have one time zone: UTC

--
the growth in cynicism and rebellion has not been without cause

Take responsibility and stop the magical thinking by DragonWriter · 2010-05-24 10:36 · Score: 3, Informative

The constant need to apply temporary fixes that end up becoming permanent are fast pushing many IT infrastructures beyond repair. Much of the blame falls on the products IT has to deal with.

Well, sure, IT departments place the blame there. The problem, though, is not so much with the products that IT "has to deal with" as with the fact that IT departments either actively choose the penny-wise-but-pound-foolish course of action of applying band-aids rather than dealing with problems properly in the first place, or -- when the decision is not theirs -- they simply fail to properly advise the units that are making decisions of the cost and consequence of such a short-sighted approach.

When IT units don't take responsibility for assuring the quality of the IT infrastructure, surprisingly enough, the IT infrastructure, over time, becomes an unstable house of cards, with the IT unit pointing fingers everywhere else.

And yet breaking this 'vicious cycle of bad ideas and worse implementations' by wiping the slate clean is no easy task. Especially when the need for kludges isn't apparent until the software is in the process of being implemented. 'Generally it's too late to change course at that point.'

If your process -- whether it's for development or procurement -- doesn't discover holes before it is too late to do anything but apply "temporary" workarounds, then your process is broken, and you need to fix it so you catch problems when you can more effectively address them.

If your process leaves those interim workaround fixes in place once they are established without initiating and following through on a permanent resolution, then, again, your process is broken and needs to be fixed.

You don't fix the problems with your infrastructure that have resulted from your broken processes by "wiping the slate clean" on your infrastructure and starting over. You fix the problems by, first, improving your processes so your attempts to address the holes you've built into your infrastructure don't create two more holes for every one you fix, then by attacking the holes themselves.

If you try to throw the whole thing out because it's junk -- blaming the situation on the environment and the infrastructure without addressing your process -- then:

(a) you'll waste time redoing work that has already been done, and
(b) you'll probably make just as many mistakes rebuilding the infrastructure from scratch as you made building it the first time, whether they are the same or different mistakes.

Magical thinking like "wipe the slate clean" doesn't fix problems. Problems are fixed by identifying them and attacking them directly.

Don’t patch bad code - rewrite it by D4C5CE · 2010-05-24 10:40 · Score: 4, Interesting

Don’t patch bad code – rewrite it.

Kernighan & Plauger
The Elements of Programming Style
1974; 2nd edition, 1978 (exemplified in FORTRAN and PL/I!)

Re:Don’t patch bad code - rewrite it by eggoeater · 2010-05-24 12:32 · Score: 4, Insightful

I couldn't agree more, but that's very expensive and very very dangerous. Why? Two factors:
1. Rewriting means rethinking; most legacy code is functional and is usually rebuilt in OOP. Whenever you rethink how something works it tends to change the entire behavior, to say nothing of all the new bugs you'll have to hunt down. Your customers will definitely notice this.

2. Scope creep!! Rebuilding it? Why not throw in all that cool functionality we've been talking about for the past 10 years but couldn't implement because the architecture couldn't handle it. You get the idea.

Want an example? Netscape 5

--
$7.95/mo, 200 GB disk, 2TBxfer, MySQL, PHP, RoR.

implemented by convolvatron · 2010-05-24 10:41 · Score: 2, Insightful

i guess it's ok that the sysadmins co-opted the word 'implemented' where one would normally
say 'installed'

but that kind of leaves the actual implementors without a word now

and in this particular usage, it's kind of odd, because usually the best time to
find and fix these problems is exactly when it's being implemented, rather than
when it's being installed

Written for a P-II 300Mhz? by damn_registrars · 2010-05-24 10:43 · Score: 5, Funny

Wait, you mean there have been newer and faster processors released since then? So Mordac really has been hiding something from me...

--
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.

Re:Take responsibility and stop the magical thinki by FooAtWFU · 2010-05-24 10:43 · Score: 4, Interesting

"they simply fail to properly advise the units that are making decisions of the cost and consequence of such a short-sighted approach."

In the defense of IT, those people they're trying to advise aren't always the best at taking advice. (But then again, neither are IT admins always the best at giving it.)

--
The World Wide Web is dying. Soon, we shall have only the Internet.

pay off your credit cards? by Matthew+Weigel · 2010-05-24 10:45 · Score: 5, Informative

This is the essence of technical debt. Whether you're programming or deploying IT infrastructure, it's inescapable that sometimes you're going to have to include kludges to work around edge conditions, a vocal 1% of your users, or whatever. These kludges are eyesores, and fragile, but they're also as far as you could go with the time and budget you had.

Sometimes, accruing debt like this enhances your liquidity and ability to respond to change, so avoiding all kludges introduces other more obvious costs that slow you down and make you seem unresponsive to users or customers. But you can't just go on letting your debt grow all the time and not eventually come up technically bankrupt. Let it grow when you have to, but just as importantly make time to pay it down. A lot of this stuff can be paid down a little at a time, as you come across it a few months later. The pay-off if you're vigilant is that the next ridiculously urgent fix to that system can often be handled much more easily, without dipping down further... with patience and attention to maintaining this balance, you can reduce your technical debt and make the whole system hum.

The downside is that there isn't a quick fix when you find yourself deep in technical debt. You can't just spend all your time reducing it; your highest aspiration at that point should be maintaining the level of technical debt, rather than letting it grow, but it's generally been my experience that altering the curve of debt growth even a little can set you on the right path.

--
--Matthew
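
The point about bending the debt curve can be shown with a toy model in Python. Everything here is an assumption made for illustration: debt is counted in arbitrary "units of kludge", a fixed amount is added each period, and a fixed amount is paid down. Real technical debt is not this linear, so treat it as a sketch of the argument rather than a forecasting tool.

```python
from typing import List


def simulate_debt(start: int, accrued_per_period: int,
                  paid_per_period: int, periods: int) -> List[int]:
    """Track a debt balance over time under constant accrual and paydown.
    Returns the balance at the start and after each period."""
    debt = start
    history = [debt]
    for _ in range(periods):
        # New kludges add to the balance; cleanup work subtracts from it.
        debt = max(0, debt + accrued_per_period - paid_per_period)
        history.append(debt)
    return history
```

Starting from 100 units and adding 10 per period, `simulate_debt(100, 10, 0, 12)` ends at 220, while paying down just 12 per period, `simulate_debt(100, 10, 12, 12)`, ends at 76. Merely matching the accrual rate holds the balance at 100, which is the "maintain the level" aspiration described above.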

like bubblegum under a desk... by Thud457 · 2010-05-24 10:48 · Score: 4, Insightful

There's nothing more permanent than a temporary fix.

--

the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff

Re:like bubblegum under a desk... by Tridus · 2010-05-24 13:05 · Score: 3, Informative

Yeah, I saw that line and immediately thought about some of the "temporary solutions" people have proposed over the years. The statement is an oxymoron. It's either not a solution to the problem, or it's not temporary.
We've got less of those being made now, because I've taken to listing the previous "temporary solutions" every time someone proposes a new one.

--
-- "So they told me that using the download page to download something was not something they anticipated." - Bill Gates
Re:like bubblegum under a desk... by Hognoxious · 2010-05-24 23:57 · Score: 2, Insightful

There's nothing more permanent than a temporary fix.
[PHB] So you're saying that quick and dirty fixes in the past worked ... and some of them are still working? Must be good, then! [/PHB]

--
Confucius say, "Find worm in apple - bad. Find half a worm - worse."

Solution is obvious - Linux by seyfarth · 2010-05-24 11:04 · Score: 3, Interesting

From the original message we read that the "code was also written to interact with a completely different set of OS dependencies, problems, and libraries." This seems to imply that the IT organizations are allowing outside interests to dictate the rules of the game. If there were a stable set of operating system calls and libraries to rely on, then the software vendors would have an easier time maintaining software. I recognize that Linux changes, but the operating system calls work well and the API is quite stable. I have used UNIX for a long time and I have compiled programs from 25 years ago under Linux. There have been some additions since then, but the basics of Linux work like the basics of UNIX from 25 years ago.

At present there are some applications available only on Windows and some only on Windows/Mac OSX. This might be difficult to change, but going along with someone's plan for computing which is based on continued obsolescence seems inappropriate. At least those who are more or less forced by software availability to use Windows should investigate Linux and negotiate with their vendors to supply Linux solutions.

Computers are hard to manage and hard to program. It is not helpful to undergo regular major overhauls in operating systems.

--
Ray Seyfarth, ray.seyfarth@gmail.com, http://rayseyfarth.blogspot.com

Re:Solution is obvious - Linux by Anonymous Coward · 2010-05-24 11:31 · Score: 2, Insightful

I've been saying exactly the same thing since about 1994, since I got into the Linux thing. Every program I've written since just "works" without changes (granted, I don't write many GUI apps; mostly data management stuff). My Windows counterparts (same corp, doing semi-related apps) have to release a "new version" every time .NET is patched, or something along those lines. Your environment shouldn't make your things break or not work right.

Software = untouchable mentality by Stiletto · 2010-05-24 11:11 · Score: 3, Insightful

This happens in commercial software development, too. There's this belief (often held all the way up the management chain to the top) that software, even bad software, represents some kind of massive, utterly permanent investment that must never be thrown away and re-written.

I've worked with managers who would think nothing of throwing away a million dollar manufacturing machine to replace it because it's old, yet cling with all their might to ancient software code that represents a similar level of investment.

Re:Software = untouchable mentality by Ichijo · 2010-05-24 12:28 · Score: 3, Informative

There's this belief (often held all the way up the management chain to the top) that software, even bad software, represents some kind of massive, utterly permanent investment that must never be thrown away and re-written.
Ah yes, the sunk cost fallacy.

--
Any sufficiently unpopular but cohesive argument is indistinguishable from trolling.

Re:Kludges are short-time fixes and long-time prob by Grishnakh · 2010-05-24 11:24 · Score: 4, Insightful

It doesn't look like "doing it right the first time" is an option here. RTFA. They're talking about vendor applications being crappy and crufty, and IT departments being required to support them. The IT department didn't pick the app, and isn't allowed to not support it. They can't switch to another app (usually apps like this have little or no competition, and they're probably locked-in anyway).

So there's really nothing they can do but complain as long as they're required to support some shitty application on the latest version of Windows, as these are the requirements set down by upper management.

Re:Take responsibility and stop the magical thinki by Bunny+Guy · 2010-05-24 11:49 · Score: 4, Insightful

I'm going to tackle some of the conceptual problems that are hinted at above, which is usually where the difficulties lie, usually in trying to use the wrong software and expecting to somehow "make everything better" if you just make it work "my way" - the true "Magical Thinking".

I tend to agree with your conclusions, "wipe the slate clean" is a drastic action. I disagree with some of the approach you use to arrive at them:

a.) Problems are solved by people being invested in solving them, not process. This requires the antithesis of "Units" - Ownership; Ownership in the company, Ownership of the mission, and a direct heartfelt connection to the success of the company. Until you have staff, from the CEO down, that own problems, from the mess in the coffee room to server down time, you will have a "business house of cards" no matter how good the process. In fact, most of the time, fixing things involves re-writing and/or reconsidering process - usually starting with asking the question - "Do we really need that?"

b.) Sometimes you really do have a train wreck on your hands. If you have mastered a.), b.) follows almost effortlessly, because now you can *talk* about this behemoth that is eating your company and everybody sees the discussion for what it is, not empire building or managerial fingerprinting.

When you run into a train wreck, assess your tech problem: is the fix easily found? Are your processes using the software at cross purposes? If so, which is cheaper to fix? No amount of bug fixing will repair using the wrong software. It won't even fix using the right software in the wrong way.

In the end, reassess often and be frugal, not cheap; if it truly is a requirement to run your business, buy the most appropriate. If you've made the mistake of buying a Kenworth long hauler when you needed 3 old UPS trucks - admit it, sell it back, take your loss and get what you really need.

That's not "magical thinking"; it's just common sense.

The meaning of Quality by bartwol · 2010-05-24 11:53 · Score: 4, Insightful

More than any other type, businesses are run by salesmen. These are people whose strongest attributes are the ability to build relationships, to communicate value, and a strong inclination to increase their personal wealth.

Increasingly, the stuff salesmen sell is based on complex technologies that, really, are beyond the reach of their comprehension. They kind of understand the products they sell, but really, they don't. If the world only had salesmen, there wouldn't be any sophisticated products.

Say hello to the engineer...a person who builds products. His strongest attributes are a desire to solve problems, a willingness to absorb the tedious but essential details needed to build a complex system, and a personality that derives gratification from doing so.

We now begin the business cycle. The salesman says, "Build me something I can sell."

The engineer says, "I will build you something that works well."

And therein begins a lifetime of the two, symbiotically, talking past each other. The engineer serves the salesman, and the salesman serves himself. But make no mistake about it: the salesman is in control.

For a salesman, QUALITY means it works well enough for him to sell more, and most importantly, to make more money for himself. For an engineer, QUALITY means it works reliably and efficiently. To be sure, QUALITY is an abstract and moving target that varies according to the eyes of the beholder. But to understand why we have the predicament described in this article, we need only understand the SIGNIFICANCE OF QUALITY TO A SALESMAN.

I would continue to expound, but then, most readers here need only reflect on their already frustrated pasts to understand the mechanics of this convenient but often vacuous relationship.

Re:Take responsibility and stop the magical thinki by DragonWriter · 2010-05-24 12:10 · Score: 2, Insightful

Problems are solved by people being invested in solving them, not process.

Both are, IMO, essential, which is why, while I pointed at particular areas of process, my big picture message was about IT shops taking "responsibility for assuring the quality of the IT infrastructure."

Neglect of process is a symptom of people not being invested in solving problems that leads to bad results on its own, but even a good (nominal) process isn't going to work well if people aren't invested.

This requires the antithesis of "Units" - Ownership; Ownership in the company, Ownership of the mission, and a direct heart felt connection to the success of the company.

I prefer "responsibility"; "ownership" is, IMO, misapplied here. (Though, arguably, one of the reasons people do not take responsibility is because they don't, in fact, have ownership -- but ownership is a material relationship, and responsibility is the relevant attitude.)

But I think in substance we generally agree.

Until you have staff, from the CEO down, that own problems, from the mess in the coffee room to server down time, you will have a "business house of cards" no matter how good the process. In fact, most of the time, fixing things involves re-writing and/or reconsidering process - usually starting with asking the question - "Do we really need that?"

You kind of contradict yourself there: if fixing things usually requires changing the process, then "how good the process" is obviously has a fairly direct bearing on success. The key thing is that processes aren't good (or bad) in a vacuum, they are good or bad based on the effects they have in your organization, in achieving your mission; the same nominal process that is good for a group of people when considered against one mission is going to be bad for the same group of people when considered against different goals, and the same process that is good for one group of people with a given mission will suck for another group of people with the same mission, because people matter.

I was torn between modding this up and commenting. by tlambert · 2010-05-24 12:19 · Score: 3, Insightful

I was torn between modding this up and commenting.

I picked commenting.

This statement:

Everyone wants to work on the latest and greatest stuff, no one wants to maintain or even release patches.

is very, very true. We (Apple) have a hard time getting applicants who want to do anything other than work on the next iPhone/iPad/whatever. Mainline kernel people are difficult to hire, even though the same kernel is being used on the iDevices as is being used on the regular Macs. Everyone wants to work on the new sexy. For some positions, that works, but for most of them, you have to prove yourself elsewhere before you get your shot.

I think that, for the most part, we see the same thing in marketing for higher education (with the exception of one track, one of the universities I went to has become a diploma mill for Flash game programmers; sadly, I would not hire recent graduates from there unless they have an experience track record). There are video game classes at most universities, but while it might be sexy, you are most likely not going to be getting a job doing video games, 3D modeling for video games, or anything video game related, really, unless you get together with some friends and start your own company, and even then it's a 1 out of 100 chance of staying in business.

I don't really know how to address this, except by the people who think they are going to be the next great video game designer remaining unemployed.

-- Terry

No! by Greyfox · 2010-05-24 12:44 · Score: 3, Insightful

The current patchwork of duct tape and glue that works today is much better than the pie-in-the-sky "let's build it from scratch" architecture that IT is pitching that will be late, over budget and eventually have its feature set scaled down until it's less functional than what you have today when it finally is delivered.

There is constant pressure to re-implement existing architecture. Most of the time, the people who want to do this do not have a clear understanding of the business process involved, don't realize that the existing frameworks represent years of bug fixes and are at least stable for that reason. They only think "Wow this sucks, a new one would HAVE to be better."

I'm not saying that you should never rebuild something from the ground up, but the scope of the project should be limited and the entire endeavor should be well documented and well understood from the beginning. And if the guy who's pushing for a rewrite can't demonstrate a deep and fundamental understanding of the business flows being automated, he should be taken out and shot (Or at least pummeled soundly.)

--

I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

Good luck by PPH · 2010-05-24 13:07 · Score: 2, Interesting

Been there, done that.

If you've got even a small or medium sized enterprise application (whatever that buzz word means) at a larger company (Boeing, for example), it might have its hooks into a dozen or more peer systems/hosts/databases/whatever. They are all 'owned' by different departments, installed and upgraded over many years, each on their own schedule and budget. When one group gets the funds together to address their legacy ball of duct tape and rubber bands, they roll the shiny new hardware in and install the spiffy new app. But everyone else is a few years away from affording new systems. And so the inter-system duct tape is simply re-wrapped.

The IT department tried selling everyone on architecture standardization. But due to the gradual pace of system upgrading, the plan was out of date before everyone got caught up to the old one. And today's 'standard' architecture wouldn't play nicely with what was state of the art a few years ago (thanks Microsoft). The whole architecture standard ploy is a salesman's pitch to get management locked into their system. Unless you've got a small enough shop that you can change out everyone's desktop and the entire contents of the server room over a holiday weekend (another salesman's wet dream), it ain't gonna work.

The solution is to bite the bullet and admit that your systems are always going to be duct-taped together. And then make a plan for maintaining the bits of duct tape. There's nothing wrong with some inter-system glue as long as you treat it with the same sort of respect and attention to detail that one would use for the individual applications.

--
Have gnu, will travel.

Just how much documentation can you read? by hsthompson69 · 2010-05-24 13:12 · Score: 5, Insightful

The problem with the whole idea of "if we only had enough documentation and change control" is that it becomes a non-trivial event to actually read through the documentation. Let's take an imaginary system that's been in production for 5 years...assume every last drib and drab of change has been documented...now you've got a 2000 page document and several hundred change records that tell you *everything*. Except, when it comes right down to it, mastering that 2000 pages of documentation and all the changes made afterwards is a months-long, if not years-long, project - hardly effective for dealing with production problems that need to be solved in minutes or hours.

The illusion being perpetrated here is that people are interchangeable, and if you just have enough documentation, you can replace Mr. Jones with 20 years of hands on experience with the system with Mr. Vishnu living in Bangalore (or even Mr. Smith in the next cube, for that matter), with a net cost savings.

Now, I'm not saying documentation is a bad thing -> lord knows, it helps to have a knowledge base you can search...but knowing what to search for is knowledge you only get by real world experience with maintaining a production system. This is not digging ditches, boys and girls, this is skilled, if not essentially artistic labor.

Simply put, people matter more than process.

Re:I was torn between modding this up and commenti by by (1706743) · 2010-05-24 13:13 · Score: 2, Funny

We (Apple)...

...

...has become a diploma mill for Flash game programmers; sadly, I would not hire recent graduates from there...

Sounds about right to me!

Odds of +1 funny over flamebait/troll/offtopic: slim to none. I just hope your 2548 dies before you can mod me down!

Re:Take responsibility and stop the magical thinki by AK Marc · 2010-05-24 13:20 · Score: 3, Insightful

The problem, though, is not so much with the products that IT "has to deal with" as with the fact that IT departments either actively choose the penny-wise-but-pound-foolish course of action of applying band-aids rather than dealing with problems properly in the first place, or because -- when the decision is not theirs -- they simply fail to properly advise the units that are making decisions of the cost and consequence of such a short-sighted approach.

I've found the problem to almost always be the last thing listed. It's the contractor syndrome. "If you give me $1,000,000 now, I'll save you $500,000 a year for the rest of the time you'd have used that." Well, they think you are lying. They think that you wouldn't actually save the $500,000 a year, but would take the $1,000,000 this year and add it to your budget as a permanent line item, costing them $1,500,000 a year, rather than saving $500,000.
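To put rough numbers on that pitch, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and the "feared" case is one reading of the figures above (the upfront spend recurs every year and the saving never shows up), not anything a real budget model would stop at.

```python
def payback_year(upfront, annual_saving):
    """First whole year in which cumulative savings cover the upfront spend."""
    if annual_saving <= 0:
        return None  # the spend never pays for itself
    # Ceiling division on integers, avoiding float rounding.
    return -(-upfront // annual_saving)


def cumulative_net(upfront, annual_saving, years):
    """Net position at the end of each year; negative means still out of pocket."""
    return [annual_saving * year - upfront for year in range(1, years + 1)]


# Figures taken from the comment above.
UPFRONT = 1_000_000
ANNUAL_SAVING = 500_000

# What IT is promising: one-time spend, then steady savings.
promised = cumulative_net(UPFRONT, ANNUAL_SAVING, 3)

# What management fears (assumption): the spend becomes a yearly line item
# and the saving never materializes, so the yearly gap versus the promise is:
feared_annual_gap = UPFRONT + ANNUAL_SAVING

print(payback_year(UPFRONT, ANNUAL_SAVING))
print(promised)
print(feared_annual_gap)
```

On the promised numbers the spend breaks even at the end of year two, which is probably why the pitch sounds too good to a jaded audience.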

You can blame the IT director/manager/CIO/whatever for not being convincing enough, but there seems to be a pattern where people bid low then have massive overruns where the highest bidder would have been cheaper. As such, the people the IT person is talking to are often so jaded they don't trust anyone with price estimates.

When IT units don't take responsibility for assuring the quality of the IT infrastructure, surprisingly enough, the IT infrastructure, over time, becomes an unstable house of cards, with the IT unit pointing fingers everywhere else.

And when the IT units have the responsibility, but not the authority to fix things, what then? Most all places tie the hands of IT then complain when the solution isn't perfect.

--
Learn to love Alaska

Re:I was torn between modding this up and commenti by lonecrow · 2010-05-24 15:51 · Score: 2, Funny

I am shocked! Shocked I tell you. Apple applicants are only attracted to shiny new things? And all along I thought it was just the customers.

Re:I was torn between modding this up and commenti by gillbates · 2010-05-24 16:04 · Score: 4, Insightful

Everyone wants to work on the latest and greatest stuff, no one wants to maintain or even release patches.

I don't really know how to address this, except by the people who think they are going to be the next great video game designer remaining unemployed.

Here's how you address it: you hire one of those 9 out of 10 CS graduates who "Just got in it for the money". Had you offices in the Midwest, you'd have no problem finding programmers whose only ambition is to crunch out brain-dead code until they can move into management. Trust me, I work with these people and they're even worse than the primadonnas interested only in the "cool" things. Naturally, not everyone can be the next game programmer, or work on cool things, but you probably don't want to hire those whose only ambition is to do the grunt work.

Typically, the primadonna has to have his ego coaxed into doing the grunt work. But you can usually count on him to do it fast, and not to make a total mess of things. Granted, some people have a higher estimation of their abilities than their peers. But at least someone passionate about coding can be inspired to improve their code; they'll actually accept coding standards once reasonably explained. But here's a short list of problems with the typical "career type":

Because they don't have the intelligence or the initiative to do things right, they'll happily plod along, even when the given design can't possibly work, or can't be delivered on time. And when it does fail, rather than trying to understand *why* it failed, or *what* they could do differently next time, they blame their coworkers/subordinates, etc.
They are more sensitive to the political implications than the technical ramifications of their decisions. Consequently, they'll often run with an inefficient, or sometimes even incorrect design so as to placate their superiors. And once again, the blame always lies with *someone else*.
And speaking of blame, they'll frequently blame others when things go wrong, and even sometimes when they don't. There are *certain people* at the office around whom I can't have a technical discussion with coworkers, because they understand neither what we are talking about nor that such conversations are a normal part of the job. I've actually been reprimanded for discussing architectural decisions, because "we've already decided on the architecture..." Which is great, but the fact that you've decided doesn't help me understand it better. Supposedly, we're all mind readers here, and no discussion is necessary.
The career types usually promise unrealistic deadlines, and write horribly unmaintainable code. After all, writing code is just a stepping stone into management, and maintaining that code will soon become *someone else's* problem, not theirs...
And perhaps the worst part is that they have a corrosive effect on teamwork and morale. With a politician in the office, *no one* wants to do the grunt work out of fear that it will adversely affect their career.

It's easier to convince a rock-star programmer that documentation is necessary than it is to convince the career-track political programmer that a race condition is a problem, that architecture matters, that maintainability and scalability are important. Just the other day, I had a department manager question the value of writing reusable code - in fact, he was so hostile as to suggest that it wasn't worth our time to make code reusable... (And not only that, but reported to my boss that my suggestion otherwise was "distracting to what we're trying to accomplish here"...)

I know the starry-eyed programmers can be a handful at times, but those indifferent to technical issues will lay a minefield in your company. Suddenly, years after they've moved on, you'll find your new hires telling you the projects they built aren't worth salvaging, that you'll have to start over, etc... I've seen these types move into management and turn an otherwise fun profession into a death march. You don't want the stupid, or the political, types of people writing code. They'll set your company up for failure every time.

--
The society for a thought-free internet welcomes you.

Re:I was torn between modding this up and commenti by Animats · 2010-05-25 03:51 · Score: 2, Informative

Some of the concurrency stuff needs a complete rewrite - acquiring synchronization primitives is painful, the new 'amazingly fast' locking that they use for GCD is marginally better than a FreeBSD mutex, and between one and three orders of magnitude (depending on load) faster than a Darwin mutex. Part of this is a userspace problem (not optimising for the uncontended case, which is the most common in good code), but a lot of it comes from the route down through the myriad kernel layers when sleeping a thread.

That problem in Mach is part of what gave microkernels a bad name. QNX, which is a real microkernel (about 65K of code) does thread dispatching, locking, and message passing very fast, in constant time, and without long interrupt lockouts. Those are the functions which must go fast in a microkernel, because they're used so much. In QNX, locking a mutex in the uncontested case is about three instructions in-line, with no system call. Those three functions are most of what the QNX kernel really does. In Mach, they were an afterthought, written on top of BSD.
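For anyone who hasn't seen the fast-path idea spelled out, here is a rough sketch of its shape. It is not QNX or Darwin code: a real implementation does the fast path as an in-line atomic operation in user space and only enters the kernel when it has to sleep. The class name and counters below are invented for illustration, and Python's `threading.Lock` stands in for both paths.

```python
import threading


class FastPathLock:
    """Illustrative lock that records how often the cheap path was enough."""

    def __init__(self):
        self._lock = threading.Lock()
        self.fast_hits = 0  # acquired on the first, non-blocking attempt
        self.slow_hits = 0  # had to wait for another holder

    def acquire(self):
        # Fast path: one non-blocking attempt. It succeeds whenever nobody
        # else holds the lock, which is the common case in good code.
        if self._lock.acquire(blocking=False):
            self.fast_hits += 1  # safe to update: we hold the lock
            return
        # Slow path: block until the current holder releases.
        self._lock.acquire()
        self.slow_hits += 1

    def release(self):
        self._lock.release()


# Single-threaded use: every acquisition is uncontended.
lock = FastPathLock()
for _ in range(1000):
    lock.acquire()
    lock.release()

print(lock.fast_hits, lock.slow_hits)
```

With one thread the slow path is never taken, which is the whole point of optimizing for the uncontended case: most acquisitions should never pay for the expensive route.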

This really belongs in the "when is it time to rewrite" thread.
