Are There Limits to Software Estimation?
Charles Connell submitted this analysis on software estimation, a topic that keeps coming up because it affects so many programmers. Read this post about J.P. Lewis's earlier piece as well, if you'd like more background information.
There are always things you won't consider until something's being developed. If you've done something a thousand times and have the libraries developed, then you can probably estimate the time required very accurately. If the request is something completely new to your team, you'll never be able to accurately estimate the time required until after analysis (which takes its own time as well).
Luck favors the prepared, darling.
There is only one way to make a good estimate on a software project:
Experience
It looks to me like someone just had too much time on their hands, and decided to say that in a very, very complex manner.
Sheesh.
Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
Rapid Development: Taming Wild Software Schedules
by Steve C McConnell
In a software engineering class in college, I remember a professor joking around that the catch-all equation for software estimation is 2x+7, where x (in any units: minutes, hours, days, weeks) is your estimate for how long you think the component will take. So for example, if one of your developers estimates that developing some component will take 4 hours (so x = 4), in *reality* it will take them 2x+7 = 15 hours to complete.
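The professor's joke formula is easy to write down. Here is a minimal sketch in Python; the function name `padded_estimate` is made up for illustration and is not from any real estimation tool.

```python
def padded_estimate(x):
    """Apply the tongue-in-cheek 2x + 7 rule to a raw estimate x (any time unit)."""
    return 2 * x + 7


# The example from the comment: a 4-hour guess becomes 15 hours.
print(padded_estimate(4))  # prints 15
```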
After gaining a few years of "real world" experience in software engineering (and I know that the very term real world experience is debatable :-), I'm realizing that this professor wasn't that crazy, and his crude estimation mechanism (which is a joke) isn't any more or any less accurate than a lot of modern techniques I have seen people use in the field.
"My mother never saw the irony in calling me a son-of-a-bitch." - Jack Nicholson
I have been in this industry for what oftentimes seems too long. My father was in from the beginning, 1962. When I was younger, he asked me how long I thought it would take to write something. I blurted out my answer and he said no, X. I said, "Noooo, that's way too long. How did you arrive at that?"
Here was his answer. I have ALWAYS found it accurate to +/- 10% so far, on hundreds of small to massive projects.
1. Once you know all, or most, of the foreseeable estimates, take that number. Say 10 hours.
This number is an instinctual reaction to a perfect environment, a little experience, and some ego on your part about what might be accomplishable in a vacuum.
2. Take that number and double it.
This takes into account all the real world distractions. Events, etc.
3. Take that number and double it again. This takes into account unseen variables and events beyond mortal control.
40 Hours.........
I use this on EVERY single estimate I provide. WHY? It works. It's not too high, not too low, just right.
I tell people this and they laugh. Then I tell them that there are MANY legacy applications (SSI, IRS, FBI, you name it) that were quoted by my father in this EXACT manner.
There is NO practical limit to estimation, as long as you have the information necessary to determine what the job you're actually doing is.
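The three steps above amount to multiplying the gut estimate by four. A rough sketch in Python, with a hypothetical function name (`double_twice`) chosen only for this illustration:

```python
def double_twice(gut_estimate_hours):
    """Turn an ideal-conditions estimate into a quoted estimate."""
    # Step 2: double once for real-world distractions and events.
    with_distractions = gut_estimate_hours * 2
    # Step 3: double again for unseen variables.
    return with_distractions * 2


# The example from the comment: a 10-hour gut estimate is quoted as 40 hours.
print(double_twice(10))  # prints 40
```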
Sig went tro...aahemmm.....fishing........
I'm glad there's finally a resource to help the folks who insist on accurate estimates understand why my response to the inevitable inane question is always a cynical "two weeks", regardless of the complexity of the problem.
In real life it's rare to be asked for an estimate of the time required.
What usually happens is you get told roughly what to build and the final date by which it needs to be ready. A series of negotiations and compromises on the scope of work then takes place until everyone is "happy".
I suppose that doesn't really invalidate the point of the article at all, it's just an observation for those who think that estimation is the nice science that it is sometimes presented as being.
Sig is taking a break!
In the real world, any effort estimations are irrelevant anyway. I am sure everyone working in the business knows this situation:
Project manager says: "We have to add line item X to the project. What's the effort estimate for that?"
Me: "Twelve weeks."
PM: "But we need it in three weeks."
Me: "No way."
PM: "We have to. Shoot for" (names target date in three weeks).
Me: "Sure."
The due date is fixed, and the software development effort is determined by the available time afterwards.
Yes, you are right there. -- Another glass of champagne?
Every article I've read on this overlooks one thing that every programmer requires a small amount of.
Creativity.
It's something that's hard to be measured. Sadly, programming is not like assembling a car, where it can be broken down into infinitesimally smaller chunks, then added back together to get a whole.
For example: it takes six seconds to put this screw in place, so we'll stop the assembly line for 8 seconds, then the car moves on regardless, under the assumption that the screw was inserted.
Programming is not like that. I know I've stared for an hour at the screen trying to figure out why one line of code wasn't working.
Or sat there for a while trying to figure out how to approach a problem before writing another line of code.
Likening programming to a production line is not good. There's no way to know in advance how many lines of code there are going to be, nor how long each line is going to be. If you knew this, you could add up how long it would take the average person to key in the strokes, and there's your estimate. That doesn't work in software.
For time usage, software needs to be compared to any other creative process as opposed to a mechanical one. How long did it take da Vinci to paint the Mona Lisa? An hour? Two? 3 days? Could he have guessed from the outset that it was going to take x amount of time? Probably not. He might have been able to give a ballpark based on how fast he'd painted similar stuff in the past, but he couldn't nail it down exactly.
Now, granted, as you gain time and experience, your estimations get better. In addition, your time to completion gets better. (How long do you think it would have taken da Vinci to paint a _second_ Mona Lisa? A lot less time than the first one, because he's done one, and he remembers how he solved various problems, like how much of each color to mix to make a certain tone.) This is where talent and experience come in.
But until software becomes similar to assembling Lego bricks (which it will, one day, and has in some places), then it's going to be hard to quantitatively determine how long a given project will take. And even if it becomes like Lego stacking, there's still going to be some fudge factor because how to solve the problem has to be solved before solving the problem.
And sometimes you have to tear apart and start over because a brick is out of place, or it's just poorly designed.
Reeses
In my experience, the biggest snags in all time estimates have to do with the under-determination of what a project is and what it involves. Given any project F which has only F(x) parts to it, you usually have some rough intuitive estimate that there will be G( F(x) ) bugs to work out. Given that you are familiar with the type of project involved the estimations are generally fairly decent.
The big problem is that in real-world applications, x is always changing. I have found that the culprit is usually one of several things:
1) You're not as familiar with the project as you thought you were - or there are some aspects that are familiar, but the unfamiliar ones have ramifications you don't foresee because you're not familiar with them. This adds to both your estimations of F(x) and G(F(x)).
2) Users are dumber than you thought. The difference in mindset between the user and the engineer is real and very significant. There are things that as an engineer (especially one who has been working closely with a piece of code for months on end) you would never try to do with a particular application, and yet a user who has never seen it before will do them out of ignorance or confusion or both. Just when you think you've made something idiot proof - they invent a bigger idiot. This throws off your estimates of G( F(x) ) because you have whole classes of bugs you never thought of as bugs before. Sometimes this requires reworking core components, making estimates of F(x) go wrong.
3) The client either doesn't know what (s)he wants, or doesn't know how to explain it, or doesn't even realize that it needs to be explained. This is the most frustrating of problems, and can be fatal to entire projects. Often clients don't think of software engineering like real engineering. One cannot ask an architect to redesign a building after it's already 3/4 built. But this has happened to me with software projects, and even on occasion prompted me to quit a job in frustration. When this happens, all bets on estimates are off.
Either that or I'm just really lousy at doing time estimates =)
There are a thousand forms of subversion, but few can equal the convenience and immediacy of a cream pie -Noel Godin
As someone who has to provide estimates to different clients for different types of jobs on a frequent basis, I have to say that I don't think it is as difficult as some people make out.
The secret is to base your estimate on a detailed specification. Specify in detail, break down the big task into smaller ones, estimate for each smaller task, add up, add 10% for contingency.
I think the problem is that too many estimates are made on the basis of poor specifications, then you get a shock when you discover a problem you haven't anticipated. So, my top tips:
1) detailed spec agreed with client.
2) breakdown into smaller tasks.
3) estimate for smaller tasks.
4) add up and add 10%.
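The four tips boil down to a sum plus a contingency margin. Here is a small sketch in Python; the task names and hours are invented for illustration, and `bottom_up_estimate` is a hypothetical helper, not part of any real tool.

```python
def bottom_up_estimate(task_hours, contingency=0.10):
    """Sum the per-task estimates, then add a contingency percentage."""
    subtotal = sum(task_hours.values())
    return subtotal * (1 + contingency)


# Hypothetical breakdown of an agreed spec into smaller tasks (hours each).
tasks = {"login form": 6, "report export": 10, "database schema": 4}

# 20 hours of tasks plus 10% contingency.
print(round(bottom_up_estimate(tasks), 1))  # prints 22.0
```

The estimate is only as good as the breakdown: a task missing from the spec contributes nothing to the sum, which is why the first tip comes first.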
All this stuff about doubling etc. - what are you people like? If you have to do things like that then perhaps project estimation isn't something you should be doing...
I figure out how much time it will take me to just sit down and do it without any interruptions.
Then I multiply that by the number of DBA's I have to go through to have a table get created for me divided by two.
Then I add to that the 10 times the number of project branches I need to request the PVCS administrator to create.
Then I count up the number of consultants sitting within 50 feet of my desk and multiply by that number times 20.
Then I multiply that number by the number of status reports I have to submit per week.
Finally, I add to that the number of games of foosball I play per day on average * 10.
That number is the final number of days it will take to complete the project.
I Heart Sorting Networks
Now, what good is f? On most software projects, f wouldn't be worth much. Why? Because nobody knows what X is. X is a specification of the work to be done (i.e., software requirements), and most such specifications are woefully incomplete, imprecise, and erroneous.
That's why development processes that are repeatable and emphasize increased formalism allow for better estimates. They provide higher-quality X values, not to mention better approximations of f based on past performance. Therefore, if long-term estimates are important to your business, climb the formalism ladder.
On the other hand, good long-term estimates are often unnecessary. Many businesses need only to know where the project is now and to be able to change direction with reasonable efficiency when business needs change or realities are better understood. Witness the effectiveness of so-called agile development processes in turbulent business environments.
So, in the end, the only real lesson is to pick your software development (and estimating) process to support your business. Doing it the other way around usually doesn't work.
Easy, automatic testing for Perl.
The glaring flaw of the paper is that the main argument can be applied equally to any human endeavor, not just to programming. The argument is essentially a rigorous version of the statement, "You can't (in general) know how hard (complex) it's going to be, until you do it". The author supports this by pointing out that the purpose of any program is equivalent to generating a string that is a complete, precise description of the problem. Complexity theory tells us you can't predict the length of that program (without a formal system bigger than the program).
But it's not hard to cast any problem into this form. Take baking a cake. The problem can be thought of as generating a precise description of how to turn some inputs into an output within the range of what we consider a cake. In a reductionist sense, that process is incredibly complex (much more than any computer program), involving gazillions of elementary particles and their interactions. But nonetheless it's pretty easy to estimate how long it will take to bake a cake.
Complexity theory shows us that complexity is indeed pervasive in general; but everyday experience shows us that it is usually encapsulated within simple abstractions. Most things we plan and do have relatively simple descriptions in terms of objects whose properties we are familiar with, and things we have done countless times before. So while estimating complexity may not be possible in general, it is usually not very hard for the things we care about.
In order for the paper to be persuasive, Lewis must show that computer programming is, in practice, more complex than most other activities--that new problems can't be easily stated in terms of already solved problems. (He does begin to address this, but only as a side-note.) I think most practitioners would essentially agree (and I'm not going to argue this, unless someone challenges it). What does this mean for the relevance of complexity theory? It's a deep and difficult question, but I suspect that some insights can be drawn. In particular, I do believe that there are problems that can't be estimated without effectively solving them.
Regardless, there are more obvious, intuitive reasons that complex activities are difficult to estimate. One is that humans vary wildly in their efficiency at complex tasks. We all know the experience of cracking nut after nut one day, and being stumped the next. Sometimes, to be sure, this is due to misestimation of difficulty, but just as surely it is often purely psychological. Another is that teams working on complex problems are prone to miscommunication and other group dysfunctions. A third is simply that the flesh is weak--we often lack the discipline and concentration to plan our projects in sufficient detail.
And this list only considers the difficulties that derive from complexity. Software development faces a host of additional "accidental" challenges, such as bugs in third-party software, clients (and marketers) that change their minds, changing fashions in tools and methodologies, etc. In short, you don't need a fancy theory to conclude that predicting development time is quite hard!
The evaluation of an action as 'practical' . . . depends on what it is that one wishes to practice.
Most code is utter drudgery. It's predictable in an informal manner to a very high degree.
Quick tip: If most of your coding is utter drudgery, you're doing the wrong coding.
The potential drudgery I see tends to come from two sources: 1) users tend to ask for very similar things (e.g., a zillion slightly different reports), and 2) tools that are a poor match to the problem domain.
For the first case, users with similar requests, you give the user control and tools to support that control. So instead of wasting your life writing report after report, write report-generating tools.
For the second case, you gotta buy or build better tools. Or, possibly, learn to use the ones you have better. For example, if you're using an OO language, stop using your copy and paste keys. (Why? When you copy and paste chunks of code, you're saying that two things are very similar. Instead of copy-and-paste, abstract the problem using, say, inheritance or containment or delegation. Copy-and-paste yields maintenance nightmares.)
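As one possible illustration of containment and delegation replacing copy-and-paste, here is a short Python sketch. The class names (`TableFormatter`, `Report`) and the report data are hypothetical, invented for this example only.

```python
class TableFormatter:
    """Shared formatting logic lives in one place instead of being pasted around."""

    def format(self, title, rows):
        lines = [title, "-" * len(title)]
        lines += [f"{name}: {value}" for name, value in rows]
        return "\n".join(lines)


class Report:
    """A report contains a formatter and delegates the layout work to it."""

    def __init__(self, title, formatter):
        self.title = title
        self.formatter = formatter  # containment

    def render(self, rows):
        return self.formatter.format(self.title, rows)  # delegation


fmt = TableFormatter()
sales = Report("Sales", fmt)
inventory = Report("Inventory", fmt)

print(sales.render([("widgets", 3)]))
print(inventory.render([("gadgets", 12)]))
```

A change to the layout now happens once, in `TableFormatter`, rather than in every pasted copy.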
Computers do drudgery without complaining, and they do it much faster than you. Make them do your donkey work!
And generally the way we accomplish something in impossible times is to cut corners. Sure, it works in three weeks, but the code is snarled, there is no documentation, and you took advantage of a security hole to make it go.
Now of course you tell the manager, "If I spend three weeks on a temporary hack, I'm still going to have to spend another twelve weeks later doing it right."
And they say, "Sure! As soon as this crisis is past."
Of course, as soon as crisis A is done, crisis B is looming. And after B, then C, D, and E. So a lot of 'temporary' code gets written. Eventually, the project is just a big heap of steaming turds with some pretty contact paper covering most of the surface. And then the good programmers catch on and leave; the bad ones spend the rest of their lives sticking on more contact paper.
And the manager, of course, has long since moved on; he met his deadlines, after all, so he must be a good manager. And the person who's now in charge of that group? Well he must be a bad manager, because his team has lots of bugs and never makes deadlines anymore.
It's enough to make me cry.