Are There Limits to Software Estimation?
Charles Connell submitted this analysis on software estimation, a topic which keeps coming up because it affects so many programmers. Read this post about J.P. Lewis's earlier piece as well, if you'd like more background information.
We all know that software schedules, etc. can be estimated, but not with a large degree of accuracy. It has always been, and always will be, just a case of risk management, and of whether you want to release early to market, or release late and have a better product.
In the real world, we don't go by some estimation or rigid schedule, and we wouldn't have to if not for the accountants and marketing people that have to prove their usefulness. THEY are the people who want estimates, and incredibly, they are also the people who have the least idea as to what is required.
Moon Macrosystems. Sun's biggest competitor.
There are always things you won't consider until something's being developed. If you've done something a thousand times, and have the libraries developed, then you can probably estimate the time required very accurately. If the request is something completely new to your team, you'll never be able to accurately estimate the time required until you've done the analysis (which takes its own time as well).
Luck favors the prepared, darling.
There is only one way to make a good estimate on a software project:
Experience
It looks to me like someone just had too much time on their hands, and decided to say that in a very, very complex manner.
Sheesh.
Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
Can be best based on these.
Metrics and processes worry me to some extent on this particular topic, because often times it seems that managers think that anyone can apply a few algorithms to a set of data and come up with an estimate.
What's truly important is that intuitive feel that people develop over time for what the bottlenecks will be, how their particular organization operates, etc, etc...
You can teach number-crunching, but you can't develop that intuition without experience.
Rapid Development : Taming Wild Software Schedules
by Steve C McConnell
This has been posted before here.
"We all know that Crap is King" - Don Henley
the number of anime festivals during the development time, what new PS2 games are soon to be released, whether there is a Magic: the gathering convention - all of these things affect the time needed for those smelly geeks in software dev to get off their butts and get the job done.
In a software engineering class in college, I remember a professor joking around that the catch-all equation for software estimation is 2x+7, where x (can be in any units like hours, days, weeks, minutes) is your estimate for how long you think the component will take. So for example, if one of your developers estimates that developing some component will take 4 hours (so x = 4), in *reality* it will take them 2x+7 = 15 hours to complete.
After gaining a few years of "real world" experience in software engineering (and I know that the very term real world experience is debatable :-), I'm realizing that this professor wasn't that crazy, and his crude estimation mechanism (which is a joke) isn't any more or any less accurate than a lot of modern techniques I have seen people use in the field.
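For anyone who wants to play with it, the professor's joke rule is a one-liner. This is only a sketch of the 2x+7 formula quoted above; the function name is made up for illustration.

```python
def professor_estimate(x: float) -> float:
    """Apply the joke rule 2x + 7 to a developer's own estimate x.

    x can be in any time unit (hours, days, ...); the result is in the same unit.
    """
    return 2 * x + 7


# The 4-hour example from the comment above.
print(professor_estimate(4))  # 15
```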
"My mother never saw the irony in calling me a son-of-a-bitch." - Jack Nicholson
Getting a price tag for software development is like knowing how much you're going to pay to build a new house. Software is incredibly expensive to build. Any professional needs to be able to say: it will take so and so long and that means such and such a price tag.
The risk and uncertainty stem IMHO from two factors: the importance and rarity of talent and skill (a really good programmer can work ten times faster and produce a finer result than a 'normal' programmer); secondly, the inventive nature of much software development. When you make something new it's impossible to know what surprises you will get.
The more one works with standard pieces and the less one depends on extremes of talent and skill, the more predictable software development is.
My blog
my last company i worked for... we developed a proprietary program for internal use only to catalogue what we had in a "nice, friendly" environment (it was a procurement company).. of course the code would never fully be completed, but it was a very long and tedious procedure due to QA... in my opinion, i usually think QA has probably one of the biggest roles with development.. without them we'd be releasing buggy programs left and right
"The ones who dont do anything are always the ones who try to pull you down" -- Henry Rollins
I have been in this industry for what often seems too long. My father was in it from the beginning, in 1962. When I was younger and he asked me how long I thought it would take to write, I blurted out my answer and he said, no, X. I said, noooo, that's way too long, how did you arrive at that?
Here was his answer. I have ALWAYS found it accurate to +/- 10% so far, on hundreds of small to massive projects.
1. Once you know all, or most, of the foreseeable estimates, take that number. Say 10 hours.
This number is an instinctual reaction to a perfect environment, a little experience, and some ego on your part about what might be accomplishable in a vacuum.
2. Take that number and double it.
This takes into account all the real world distractions. Events, etc.
3. Take that number and double it again. This takes into account unseen variables and events beyond mortal control.
40 Hours.........
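The three steps above boil down to multiplying the gut number by four. A minimal sketch, with a made-up function name, using the 10-hour example:

```python
def doubled_twice(gut_hours: float) -> float:
    """Take the gut-feel number, double it, then double it again."""
    with_distractions = gut_hours * 2   # step 2: real world distractions
    with_unknowns = with_distractions * 2  # step 3: unseen variables
    return with_unknowns


print(doubled_twice(10))  # 40
```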
I use this on EVERY single estimate I provide. WHY?? It works. It's not too high, not too low, just right.
I tell people this and they laugh, then I tell them that there are MANY legacy applications (SSI, IRS, FBI, you name it) that were quoted by my father in this EXACT manner.
There is NO practical limit to estimation, as long as you have the information necessary to determine what the job you're actually doing is.
Sig went tro...aahemmm.....fishing........
Look at the initial proposal for at least a few days if it's major.
Take your gut feeling about the development time required.
Multiply that by 20.
Divide the result by the number of years you've been developing similar projects.
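Written out, the recipe above is gut feeling times 20, divided by years of experience. A rough sketch follows; the function and parameter names are illustrative, and the guard against zero years is my own assumption, since the original recipe doesn't say what a first-timer should do.

```python
def experience_adjusted(gut_estimate: float, years_experience: float) -> float:
    """Multiply the gut-feel estimate by 20, then divide by years of experience.

    The result is in the same time unit as gut_estimate.
    """
    if years_experience <= 0:
        # Assumption: the recipe is undefined with no experience at all.
        raise ValueError("years_experience must be positive")
    return gut_estimate * 20 / years_experience


# Hypothetical numbers: a 5-day gut feeling from someone with 10 years in.
print(experience_adjusted(5, 10))  # 10.0
```

Note that with 20 years of similar projects the gut feeling comes back unchanged, which seems to be the point.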
I work on a very large software project: 6 million+ lines of active source code, with 400,000 new development hours per year and growing. And we are on our estimates well over 80% of the time (if we don't hit it, we are under).
How is this possible? It has evolved over time. Some of the same people who started this project 9 years ago are still here and they know the system very well. That knowledge, combined with good project management, leads us to several categories. During a requirements phase, designers assign a complexity to the changes for a module, and based on the type an hours estimate is generated.
Now, Lewis is right, no algorithm can be developed to figure out the complexity, but a human can, and the computer can figure out how many hours should be devoted to documentation, coding, and testing.
My overall point: as a software product matures, estimates are easier to make and project dates are easier to meet. But you already knew that...
What I would like to know is, how is this going to affect expectations from non-technical people in charge of projects that demand "accurate" estimates? I've had good and bad managers; maybe these kinds of articles will help make developers' lives a little less stressful and more flexible.
I'm glad there's finally a resource to help the folks who insist on accurate estimates understand why my response to the inevitable inane question is always a cynical "two weeks", regardless of the complexity of the problem.
A problem that seems to come up in scheduling and time estimation is that the people producing the estimates aren't the people doing the actual work. Add onto that the customer giving additional requirements, changing requirements mid-project, putting together a team that doesn't have the skills necessary to produce on-time deliverables, etc... that's a LOT of variables.
I don't want to sound like that programmer who makes excuses for why their project isn't delivered on time ("That other guy was a moron", "Management is horrible", "We didn't have solid requirements") but IMHO, if you want a program delivered on time, pick a good team and then try to estimate the amount of time it will take...then reduce that by 20%. It seems like every project is late by about 25% or more, so if you reduce it initially, perhaps it will be delivered closer to when you really expected it.
--trb
It is certainly hard to produce an estimate, since the bigger the project gets and the longer it is in development, the more likely it is that new ideas or improvements to the project will come up. It is not uncommon to be reaching a deadline and saying, damn, it would be nice if I wrote this new module.
Accurate estimates are somewhat unreal; it's just a matter of human timing, and it doesn't really have to be perfect. Yeah, you can create the shortest project, but who wants small (possibly untested) software?
This reminds me of a paper I came across on the limits of formal methods (http://www.kneuper.de/a-limits.htm).
You can prove philosophically the limits of mathematical methods, but that doesn't make them useless. A formally-proved system, when put in contact with an informal world, may show itself to have limits, but it'll probably perform better than a system that's not been formally proven, and if it does fail, the reason for the failure will be glaringly obvious.
We build systems of ever-increasing complexity with tools that are constantly playing catchup. Does that mean we ignore the tools? I don't think so. Instead, we reflect and improve.
668: Neighbour of the Beast
When you have a customer suing you for failure to meet a projected deadline, then you can tell him he should be ecstatic that your other 4 projected deadlines were correct.
Information theorists, and what they are trying to solve is interesting, I will agree. What they might be happy with is also interesting. But in the real world, people want real answers, and don't want to find out, in October of 1995 that Windows95 won't be shipping for maybe another 3 months, if lucky.
Real world consumers and service providers do want 100% clinical accuracy in estimates. It seems that most people have forgotten what quotes and estimates are for. To most people in the real world, 80% accurate on an estimate is just as good as 0% accurate. The lawsuits and penalties for going over budget, over time, or failing altogether on 20% will outweigh the profits on the 80%.
So be happy that you can know that 80% of your school homework will be done when predicted. In the real world, no one cares about what you did in school.
"It will be done, whenever it is done"
People who are against human cloning must be bitter they are not good enough to be cloned.
In real life it's rare to be asked for an estimate of the time required.
What usually happens is you get told roughly what to build and the final date by which it needs to be ready. There then takes place a series of negotiations and compromises on the scope of work until everyone is "happy".
I suppose that doesn't really invalidate the point of the article at all, it's just an observation for those who think that estimation is the nice science that it is sometimes presented as being.
Sig is taking a break!
I'm going to have a great reply to this important story. It's going to have all the latest stuff - it will be broken down into paragraphs and have a high degree of relevancy. My reply will be ready in two weeks, give or take a month or so, if the powers that be decide it also must contain links and be spelled correctly.
The kind of development being done is going to have a large part to play in how well the time and budget can be estimated. Projects that build applications automating known systems using strong toolkits should be more estimateable than leading edge mathematics and science driven projects.
When this comes up I always think of the evil officer pointing his gun at the scientist and saying "You will launch the new rocket by midnight or I will kill you." As if somehow stress will make the careful scientific work go faster.
I also wonder if this is a chaos problem. If someone could make really good estimates, would knowledge of the estimate affect how the project is carried out, causing the estimate to be wrong?
Or would being able to make good estimates cause management to underestimate even more often?
I would like to see results of projects estimated by an independent party that does not tell the primary parties the results until after the project is finished. Would these estimates be correct more often?
With all the hype surrounding XP/Agile/Name of the Week type development, I've been looking for hard numbers on how much better it is against older development styles. So far, I haven't been able to find anything accurate. It really comes down to multiple projects vs. performance. There is no hard data yet on the speed of XP against all project/component types. My biggest concern with this is that some manager will read up on XP, read a line that says it cuts your development time/cost by some percent and then draft a memo and adjust all targets.
Do these sorts of numbers exist out there yet, and is it even worth producing them given the theme of the article? Thanks.
And that's the point of a healthy pessimism in estimates; when the estimates are good, it's a matter of experience, not methodology. As you read through the comments on this article, you'll notice that everyone who has a method that sounds really sensible is relying on experience and the input of programmers, not on a pure methodology.
Expanding a vast wasteland since 1996.
The best way to estimate is to give a shorter time frame; this makes the developers work harder (faster?). If you give a large time frame, they work slowly. No, I'm not management, I'm a developer, I just call it like it is. You always see people scramble and whip out code right before a deadline.
I SURVIVED THE GREAT SLASHDOT BLACKOUT OF 2002!
The trouble is that people always leave things out of the schedule. For instance maybe 30% of those reading this post are supposed to be writing software right now, but nowhere in the schedule does it say "time spent pissing about on /.: 2 weeks".
Stupid topic, it depends what you're doing, duh: nth ecommerce site - predictable,
anything interesting (which by my definition means something that hasn't been done before) - unpredictable
The first rule of software schedules is things always take at least twice as long as you think, even when you allow for the first rule.
Or to put it another way, the first 90% takes 90% of the time, and the remaining 10% takes the other 90%.
So, it's actually stupid to try to produce a valid schedule. If you estimate 2 weeks it will take about 4. You might think it smarter to change the estimate to 4 weeks, but then it will take 8, so you may as well estimate 2 weeks and be done with it.
http://rareformnewmedia.com/
I was working for a company a few years ago as an intern. My managers were not programmers by any means. I had such a blast at that job, because they would give me a programming task, and then give me a week for every day that it would actually take me to complete it. So for a project that would have taken me 3 days, they gave me 3 weeks. Of course, I always finished my projects early, but I took my time about it. I didn't want to be the one to make things worse for the next guy. I knew it was a short term gig, as it was technically an internship. I was always dead on with how long I thought it would take me, though.
I think when you're programming with something that you're familiar with, and have a pretty good idea of how to go about it, it's pretty easy for you to estimate how long it will take. I think for anyone else who is not as familiar with everything involved it would be harder, though.
It's easy to stand out when the general level of competence is so low.
Companies get certified in SEI-CMM (the Software Engineering Institute Capability Maturity Model) to get that government contract -- and then they quickly abandon or pay lip service to the CMM principles. The whole point of SEI-CMM is that you have to have a non-dilbert organizational structure in order to achieve "maturity" resulting in the organization's being "capable" of developing more stable code and of achieving more control over project costs and schedules. The irony is, the companies who need SEI-CMM certification the most, the big government and defense contractors, tend to be the same companies who foster immature corporate policies such as frequent mass layoffs, no training, illusory stock option programs, a culture of blame, and lousy HMO plans that don't cover anything.
First hint is to collect reasonable metrics - even if you can't estimate a project, at least make sure you have some data to go on for the next one. Like defect rates, how much code is being generated or fixed per day, and so on.
Secondly, get programmers on the team to provide some tight resolution of how much they expect to get done...not in a month, but in a day. After a few days they'll start to understand how quickly they actually work and their estimates will get better.
Most of all, attach dollar amounts to things you do. Don't spend $1000 in engineering time to save $10 in computer time. Learn what resources are cheap and what resources are expensive.
There are many other tidbits which are common sense to most working programmers, but it doesn't seem that anyone employs them.
Boss: "I need in 2 at the latest!"
(6 weeks later) Me: "It was rough, but I'm finished!"
Boss: "You're a genius!"
Actually, you tell him the real time, and he assumes you are padding and wants it twice as fast!
"Da ist ein Technölüst in mein Unterpanten!"
If the specifications change in the middle of the development phase, all predictions will be useless... and when the client isn't sure of what he wants, then specifications will change... a lot!
The net-net is that human factors are far more important - and it's really hard to plug these into an estimate. One of Cockburn's contentions is that people aren't linear or predictable. But he also identifies items that can help a project run more efficiently. An excellent read at any rate.
we practise extreme programming (XP). To give you some context - we break software development down into 1 or 2 week iterations and break the work down into stories. These stories are written by people like myself, customers, and are estimated by developers/engineers. Estimates range from half a day to about 8 days for the work. The benefit to the customer in XP is visibility. Doing these small chunks of work we are always able to have a snapshot of the project and can massage it into place if things are looking bad.
so now to the point...
we are still finding that estimates, even though in bite/byte-size chunks, can be inaccurate. what we are looking at making the programmers do is give us a high and a low estimate for the work. We then roll a dice. If you roll a 1, you take the lower limit, a 6 the upper and so on.
this may sound totally crazy, but - i dunno - i've seen people spend months analysing a workschedule and still be MILES off the mark in terms of an estimate.
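The dice idea above can be sketched in a few lines. The comment only pins down the two ends (a 1 gives the low estimate, a 6 gives the high one); spacing the rolls in between evenly is my assumption about what "and so on" means, and the function name is made up.

```python
import random
from typing import Optional


def dice_estimate(low: float, high: float, roll: Optional[int] = None) -> float:
    """Pick an estimate between low and high based on a six-sided die roll.

    A roll of 1 returns low and a roll of 6 returns high. Rolls 2 to 5 are
    spaced evenly in between (an assumption, not stated in the comment).
    """
    if roll is None:
        roll = random.randint(1, 6)  # inclusive on both ends
    if not 1 <= roll <= 6:
        raise ValueError("roll must be between 1 and 6")
    return low + (high - low) * (roll - 1) / 5


# Hypothetical story estimated at 2 to 7 days.
print(dice_estimate(2, 7, roll=1))  # 2.0
print(dice_estimate(2, 7, roll=6))  # 7.0
print(dice_estimate(2, 7))          # somewhere from 2.0 to 7.0
```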
The author of this article is not disinterested in trying to convince an audience that he can teach others how to come up with reasonable (within 20%) estimates of development processes.
The point of the original paper was: There will never be a mechanical way to generate estimates of software development projects. I've never been part of a software project that was correctly estimated. How many have you been part of?
In the real world, any effort estimations are irrelevant anyway. I am sure everyone working in the business knows this situation:
Project manager says: "We have to add line item X to the project. What's the effort estimate for that?"
Me: "Twelve weeks."
PM: "But we need it in three weeks."
Me: "No way."
PM: "We have to. Shoot for" (names target date in three weeks).
Me: "Sure."
The due date is fixed, and the software development effort is determined by the available time afterwards.
Yes, you are right there. -- Another glass of champagne?
*Half a year* after this article was published, this guy finally comes around to say "Yes, you *can* meet deadlines"?
I suppose it'd have been more ironic if he immediately produced an article that agreed with the original.
With a bit of experience under your belt, you can approximate up front, but anything claiming to be more accurate than an order of magnitude is somebody blowing smoke.
That said, an honest and honorable programmer will always do one of two things: (1) swallow his or her pride and give the high end of the above estimate, or (2) knock as much time off the high estimate as he or she is willing to compensate for by putting in the extra hours UP FRONT to deliver in the timeframe promised.
In my last shop we spent a lot of our time working on achieving CMM level 2, which was harder than it sounds for a bunch of hackers. As we were approaching our level 2 goal, Carnegie Mellon offered us their Personal Software Process class. It ties in closely with the rules and guidelines set by the CMM, but it focuses on your own abilities. By keeping metrics on time spent coding or time spent documenting you can learn the rates at which you are able to work, and can apply that to estimates with reasonable accuracy.
So, in general, the time necessary to program a piece of software of minimal size is well-known. Thanks to Kolmogorov complexity.
As far as I know there are no specific K-complexity proofs about the (time/space) complexity of programs larger than minimal size.
This invalidates the conclusions in the paper. So keep on creating and testing software metrics!
Any formal estimation method, if possible, even if only partial, would still require a formal description of the task, which is, in my experience, the first and foremost problem/art/craft in software engineering. Once the task is adequately defined, the remaining work is, by and large, downhill.
Ok, the definition of "adequate" may kick off a few debates, especially with management... which, in my experience, is the central "problem" in software engineering. Management, that is.
Hmmm...
For the sufficiently clueless, even trivial applications of common sense are indistinguishable from wisdom
Every article I've read on this overlooks one thing that every programmer requires a small amount of.
Creativity.
It's something that's hard to be measured. Sadly, programming is not like assembling a car, where it can be broken down into infinitesimally smaller chunks, then added back together to get a whole.
For example: it takes six seconds to put this screw in place, so we'll stop the assembly line for 8 seconds, then the car moves on regardless, under the assumption that the screw was inserted.
Programming is not like that. I know I've stared for an hour at the screen trying to figure out why one line of code wasn't working.
Or sat there for a while trying to figure out how to approach a problem before writing another line of code.
Likening programming to a production line is not good. There's no way to know in advance how many lines of code there are going to be, nor how long each line is going to be. If you knew this, you could add up how long it would take the average person to key in the strokes, and there's your estimate. That doesn't work in software.
For time usage, software needs to be compared to any other creative process as opposed to a mechanical one. How long did it take daVinci to paint the Mona Lisa? An hour? Two? 3 days? Could he have guessed from the outset that it's going to take x amount of time? Probably not. He might have been able to give a ball park based on how fast he's painted similar stuff in the past, but he couldn't nail it down exactly.
Now, granted, as you develop time and experience, your estimations get better. In addition, your time to completion gets better. (How long do you think it would have taken daVinci to paint a _second_ Mona Lisa? A lot less time than the first one, because he's done one, and he remembers how he solved various problems, like how much of each color to mix to make a certain tone.) This is where talent and experience come in.
But until software becomes similar to assembling Lego bricks (which it will, one day, and has in some places), then it's going to be hard to quantitatively determine how long a given project will take. And even if it becomes like Lego stacking, there's still going to be some fudge factor because how to solve the problem has to be solved before solving the problem.
And sometimes you have to tear apart and start over because a brick is out of place, or it's just poorly designed.
Reeses
All development and estimation methodologies are going to rely on human estimation at some point. A realistic Work Breakdown Structure diagram needs to be drilled down until the tasks are less than 5% of the total project time. Even still, the estimates of those tasks are still going to be based on human perceptions of how long that particular task will take to accomplish. The estimates of each individual task may be inaccurate especially if they involve doing something that's never been done before. This is where the errors crop up. It's those tasks that I agree can't be predicted with a high level of accuracy.
However, this doesn't imply that you shouldn't try to plan your project instead of diving right in with coding. There are plenty of tasks that can be predicted accurately. Even if they do take longer than you expected, at least you broke the project into small chunks. You can manage the slip on a week-long task much more easily than you can do damage control after a month of development and no results. Furthermore, if your milestones aren't met on the dates you predicted and they aren't on the critical path, you can let them slide without impacting the project. If you never plotted out the project beforehand, you'd never know.
Developers should issue release dates at various points in the project. An initial release estimation which is basically worthless but gives you an idea of what the company has in mind for release. A point at which the project has reached beta stage and how long they will expect it to be in beta. This can be taken with a grain or two of salt, but should give a more accurate estimation on the true release date.
I'm sure there can be other points along the line, but you get the idea. Granted this is kind of how things are done now, but not standardized at all.
And the response later is, "Oh! You thought I meant two calendar weeks? I meant two CPU weeks!"
In my experience, the biggest snags in all time estimates have to do with the under-determination of what a project is and what it involves. Given any project F which has only F(x) parts to it, you usually have some rough intuitive estimate that there will be G( F(x) ) bugs to work out. Given that you are familiar with the type of project involved the estimations are generally fairly decent.
The big problem is that in real-world applications, x is always changing. I have found that the culprit is usually one of several things:
1) You're not as familiar with the project as you thought you were - or there are some aspects that are familiar, but the unfamiliar ones have ramifications you don't foresee because you're not familiar with them. This adds to both your estimations of F(x) and G(F(x)).
2) Users are dumber than you thought. The difference in mindset between the user and the engineer is real and very significant. There are things that as an engineer ( especially one who is working closely to a piece of code for months on end ) you would never try to do with a particular application, and yet a user who has never seen it before will do out of ignorance or confusion or both. Just when you think you've made something idiot proof - they invent a bigger idiot. This throws off your estimates of G( F(x) ) because you have whole classes of bugs you never thought of as bugs before. Sometimes this requires reworking core components making estimates of F(x) go wrong.
3) The client either doesn't know what (s)he wants, or doesn't know how to explain it, or even that it is necessary to be explained. This is the most frustrating of problems, and can be fatal to entire projects. Often clients don't think of software engineering like real engineering. One cannot ask an architect to redesign a building after it's already 3/4 built. But this has happened to me with software projects, and even on occasion prompted me to quit a job in frustration. When this happens, all bets on estimates are off.
Either that or I'm just really lousy at doing time estimates =)
There are a thousand forms of subversion, but few can equal the convenience and immediacy of a cream pie -Noel Godin
As someone who has to provide estimates to different clients for different types of jobs on a frequent basis, I have to say that I don't think it is as difficult as some people make out.
The secret is to base your estimate on a detailed specification. Specify in detail, break down the big task into smaller ones, estimate for each smaller task, add up, add 10% for contingency.
I think the problem is that too many estimates are made on the basis of poor specifications, then you get a shock when you discover a problem you haven't anticipated. So, my top tips:
1) detailed spec agreed with client.
2) breakdown into smaller tasks.
3) estimate for smaller tasks.
4) add up and add 10%.
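The four tips above amount to a sum plus a contingency. A minimal sketch, assuming hours as the unit; the task names and numbers are invented purely for illustration.

```python
from typing import Dict


def total_estimate(task_hours: Dict[str, float], contingency: float = 0.10) -> float:
    """Add up the per-task estimates, then add a contingency on top (10% by default)."""
    subtotal = sum(task_hours.values())
    return subtotal * (1 + contingency)


# Hypothetical breakdown from an agreed spec.
tasks = {
    "login form": 8,
    "report export": 12,
    "data import": 20,
}
print(f"{total_estimate(tasks):.1f} hours")
```

Here the tasks add up to 40 hours, so the quote with contingency comes to 44.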
All this stuff about doubling etc. - what are you people like? If you have to do things like that then perhaps project estimation isn't something you should be doing...
Wow.
It's this kind of thing that makes me really worry for the future of the world. The first guy (Lewis) says there's no rigorous mathematical method for estimating software development times, and the second guy calls him on it and then goes on to prove Lewis correct. A statistical method that produces 50% correct answers 50% of the time is not a method that I would claim fits the model that Lewis "proves" can't be developed.
As far as I can tell Lewis doesn't claim that the search for a method for estimating development times is useless but only that you'll never find one that gives 100% accuracy 100% of the time, i.e. a method that is deterministic.
Then the second guy goes "oh yeah, I'll show you, we'd be happy with a statistical method that's 50% correct 50% of the time". Huh? Let's play that again?
Statistical methods for estimating software development times are basically the aggregated knowledge of history, i.e. a semi-formal method of canning a good experienced developer. Again, that's Lewis' point.
The time spent in testing is inversely proportional to the quality of the software which tends to be proportional to the time allowed for development.
ie. The worse (more rushed, etc) the SW project then the longer the QA cycles.
You're not alone, there are plenty of pointy headed bosses out there that don't get it.
Note that QA should be distinguished from develop/test/use iterations.
Jayfang
cycling sig- So what is in your waterbottle comrade?
is PHBs who want the software done yesterday.
I develop web applications in a small town. My boss comes to me and gives me specs on some new project. I look over them and give him a quote, say 40 hours, he then proceeds to laugh and say that the client will never pay that much for the app. So we spend an afternoon looking at what we can cut, trying to reuse code, maybe take out a feature or two here and there and come up with half the quote (20 hours) which I tell my boss we can make unless problems arise.
As with all development, problems arise: the client complains about feature X, stuff gets redone, the code ends up being a huge mess, and the job usually takes one and a half times the original quote (60 hours). Yet my boss still doesn't figure it out. Why? Probably because his boss keeps breathing down his neck to cut development times as well.
What's worse is when a sales person or my boss talks to a client and gets them to agree on a list of features and the time it takes to develop before even consulting me. Last month a client wanted a content management system for a website, discussion forums, polls, etc. Because of certain features it couldn't just be downloaded and I ended up just writing it. The client was charged 25 hours; it took closer to 80.
Anyway, it's the PHBs that cause the problem.
The Anti-Blog
I would agree to this statement, but I would define what we are solving as a much simpler problem altogether. And I am also suspicious of claims that experience can allow neural networks (or brains) to do something no algorithm can do.
The reason software estimation can be made to work in the real world is that the estimates can be made into a self-fulfilling prophecy. We prototype the major risks early, and we negotiate to drop features that won't make the schedule. It makes the whole thing more of a craft than a science, but it works.
[-- Trust the Monkey --]
I figure out how much time it will take me to just sit down and do it without any interruptions.
Then I multiply that by the number of DBAs I have to go through to have a table get created for me, divided by two.
Then I add to that 10 times the number of project branches I need to request the PVCS administrator to create.
Then I count up the number of consultants sitting within 50 feet of my desk and multiply by 20 times that number.
Then I multiply that number by the number of status reports I have to submit per week.
Finally, I add to that the number of games of foosball I play per day on average * 10.
That number is the final number of days it will take to complete the project.
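Taken literally, the recipe above can be sketched in Python. This is only one reading of the steps: the function name, the parameter names, and the sample inputs are all made up for illustration, and the operations are simply applied in the order the steps are listed.

```python
def padded_estimate_days(base_days, dbas, branches, consultants,
                         status_reports_per_week, foosball_games_per_day):
    """Apply the tongue-in-cheek padding steps in the order they are listed."""
    days = base_days * (dbas / 2)          # DBAs to go through, divided by two
    days += 10 * branches                  # 10 per PVCS branch request
    days *= consultants * 20               # consultants within 50 feet, times 20
    days *= status_reports_per_week        # status reports submitted per week
    days += foosball_games_per_day * 10    # average daily foosball games, times 10
    return days


# Hypothetical inputs: 2 uninterrupted days, 2 DBAs, 1 branch,
# 1 nearby consultant, 1 status report per week, 1 foosball game per day.
print(padded_estimate_days(2, 2, 1, 1, 1, 1))
```

With those made-up inputs, a two-day job comes out at 250 days, which is presumably the point of the joke.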
I Heart Sorting Networks
By speeding up development the estimation of time it takes will be easier to get a grip on.
I don't claim to be a programming language creator, instead 3000+
languages in less than 50 or so years should be enough to figure out that
the limitations of programming languages are not going to be solved by
creating another one. But rather in making use of the various languages
where they best fit, thru an action set that enables the creation of
automation of language use.
Comments from the LL1 article
USPTO Article specific reference is here.
Three Primary User Interfaces
The need for speed and language barrier to break:
What's beyond the language barrier:
What I have found odd about the Virtual Interaction Configuration as I've
attempted to explain it to others over the years, is that there is an
extremely strong tendency to perceive it in terms of their individual and
specific mindset focus. i.e. if one is focused into Prolog, they perceive
it as a Prolog function set, which causes problems in correctly
understanding the actual general action set.
It's possible that communication of the VIC to Carl Sassenrath triggered
off the creation of what is now called REBOL. And it's also very possible
that SHEEP has as well gotten inspiration from the VIC.
Noodle baking...
SHEEP article
Another SHEEP article
If the algorithmic complexity is the length of the shortest program which can produce a given string, would its value not depend on the universal Turing machine the program is expected to run on? E.g., say A and B are two strings, and T1 and T2 are Turing machines. Let K1() and K2() be the Turing numbers of the shortest programs you could use to specify a string using T1 and T2 respectively.
Surely you could artificially define T1 and T2 so that
K1(A) = 1
K1(B) = 2
K2(A) = 2
K2(B) = 1
Here the algorithmic complexity of A is smaller than that of B if T1 is used, and vice versa if T2 is used.
If I am not talking utter crap here, wouldn't that mean that the algorithmic complexity of a program does not only depend on the algorithm itself, but on an arbitrary choice of Turing machine against which to measure it, making it pretty useless as a measure of how long it would take to develop the software?
Plan for everything to go wrong, then revise it against stuff that goes right.
You get a project and say, this will be done in 5 years.
In 3 months you get 50% done, and you say, "Good news, it should only take 2 years total."
Then when you're done in 6 months: "Great news, we came in 4.5 years ahead of schedule, and under budget! Where's my bonus?"
You cannot solve without all the variables, and as long as there are people writing software, and people requesting software, there will always be unknowns.
Of course if there was one global class/function repository where everyone in the world could get a function/class in their language of choice, and it was open, time management would become very easy, and development time would drop.
Of course, this won't happen, or will it...
The Kruger Dunning explains most post on
Your method is certainly better than just doubling. However, one thing you haven't taken into account is that on large projects the detailed specification is a significant proportion of the work. Also, if the project lasts for many months the specs invariably change during that time - sometimes a little, sometimes a lot.
Don't get me wrong I'm not criticizing your method if it works for you. But there's no getting around the fact that for large (tens of person-years of effort or more) software projects - estimation is a tricky task.
I'm sure we've all learned humorous little estimation tricks, and here's mine:
:)
1) Estimate how long it will take.
2) Double it.
3) Use the next highest measure of time.
Here's an example: You estimate that it will take about 5 hours to make a program change.
1) Estimate is 5 hours.
2) Double it to 10 hours.
3) The next highest unit of measurement is days.
The end result is that it will take 10 days to finish the job. I've always liked this one.
P.S.
Yes, I know it's nowhere near accurate. But it's still fun.
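For what it's worth, the three-step rule above fits in a few lines of Python. The unit ladder and the function name are assumptions made for this sketch; the rule itself only says "the next highest measure of time".

```python
# Assumed ladder of time units, smallest to largest.
UNITS = ["seconds", "minutes", "hours", "days", "weeks", "months", "years"]


def pessimistic_estimate(amount, unit):
    """Double the number, then move up to the next larger unit of time."""
    index = UNITS.index(unit)              # raises ValueError for unknown units
    if index == len(UNITS) - 1:
        raise ValueError("no unit larger than " + unit)
    return amount * 2, UNITS[index + 1]


# The example from the comment: a 5-hour change becomes 10 days.
print(pessimistic_estimate(5, "hours"))
```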
One of the things I've always noticed about estimation of software projects is very often there's a lack of formal feedback loop. I've never personally experienced a project 'post-mortem' where the accuracy of estimates was assessed. I've spoken to others who say "well we had something a bit like that but no-one takes it seriously, after all by then the project is over"
Surely if estimation is based on experience (and we know it is) then that experience needs to be recorded in some formal manner?
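One very simple way to record that experience is to keep (estimated, actual) pairs from finished projects and derive a correction factor from them. The sketch below is just one possible scheme, not an established method; the function name and the numbers in `history` are invented for illustration.

```python
def calibration_factor(history):
    """Return total actual time divided by total estimated time."""
    estimated = sum(est for est, _ in history)
    actual = sum(act for _, act in history)
    if estimated <= 0:
        raise ValueError("need at least one positive estimate")
    return actual / estimated


# Made-up post-mortem records: (estimated hours, actual hours).
history = [(40, 60), (25, 50), (10, 10)]
factor = calibration_factor(history)       # 120 actual / 75 estimated

# Scale the next gut-feel estimate by what history says about past optimism.
next_raw_estimate = 30
print(round(next_raw_estimate * factor, 1))
```

The factor is only as good as the records behind it, which is exactly why the post-mortem has to be taken seriously.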
I wrote an article about the method we use at Fog Creek for making software schedules which I've seen work very precisely, consistently, on projects from a week to two years. The basic approach is to make sure that the granularity of the tasks that you are estimating is fine. If you itemize the tasks at the procedure level (write subroutine x), where each task is less than 2 days, your schedule will work. The reason most people's estimates don't work is because they pull them out of the air, instead of actually thinking about what tasks they will need to complete. Getting down to the procedure level forces you to figure out what you're actually going to do, which is how you get a real estimate.
Joel Spolsky
spolsky@panix.com
Joel On Software
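A rough sketch of the fine-granularity idea described above: refuse to total a schedule until every task is small enough to have been thought through. The 16-hour cutoff is an assumption (reading "less than 2 days" as two 8-hour days), and the function and task names are hypothetical; this is not Fog Creek's actual tooling.

```python
MAX_TASK_HOURS = 16  # assumption: "2 days" taken as two 8-hour working days


def schedule_hours(tasks):
    """Sum per-task estimates, rejecting any task that is still too coarse."""
    too_coarse = [name for name, hours in tasks.items()
                  if hours >= MAX_TASK_HOURS]
    if too_coarse:
        raise ValueError("break these down further: " + ", ".join(too_coarse))
    return sum(tasks.values())


# Hypothetical procedure-level task list, in hours.
tasks = {
    "write parse_header()": 4,
    "write validate_input()": 6,
    "write unit tests for parser": 5,
}
print(schedule_hours(tasks))
```

An item like "build the backend: 80 hours" would be rejected, which forces the breakdown that produces a real estimate.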
(Almost?) everybody here seems to know what they are going to build before they start. If that were really true, building the software would just be hammering out what you already know. This is a weird assumption. There is a radically different approach to this, which in practice works remarkably well. Build what you can in a reasonable timeframe. Time up? Project finished.
If the result is promising, start a new project to improve. Stop starting new projects when the improvements don't bring enough benefits anymore.
Just my 2 Eurocents
In other words, they will hear this as "close to all of our software projects will be within estimates if we follow method X."
However, because of their own perceived business needs (which may even be correct to an extent; remember, just as we're the presumed software experts and should be given the benefit of the doubt as far as understanding software engineering principles, they are the presumed business experts and should be presumed to understand *their* business and markets), the likelihood of actually *rigorously* following method X gets considerably lower. This goes primarily to time-to-market considerations and changing requirements. Changing requirements are *inevitable*, particularly in initiatives where a non-IT company is trying to use technology to enhance their traditional business. Additionally, if we accept that a good understanding of the problem domain is one of the complexity factors that affect the likelihood of success of software projects, staff turnover and the loss of people within the IT infrastructure of the company who have a good understanding of the problem domain will also tend to have a negative effect on the predictive success of a methodology in such an environment.
So when the inevitable failure occurs, the method (and by extension the profession) will still be perceived to be unreliable. This will especially be the case if this is an early effort in the organization. The reaction of the business people is likely to be (intuitively, even if they realize the illogic of their interpretation of statistics) "hey, your method predicted 80% success rate, but this is our second project, and it FAILED. That means we only got a 50% success rate. Your method sucks."
Finally, even the criteria for evaluating the "successfulness" of a software project will differ between sponsors of a project and the architects of said project in this environment. In the evaluation of the software engineering industry, a project that was delivered on time, within budget and with a high quality but too late for a market which changed underneath it, is a "success" according to the terms of the methodology, but to the business people who sponsored the project, it will likely be viewed as an unmitigated failure.
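To put a number on that scenario: even if the 80% figure were exactly right, an early failure is not rare. A minimal sketch, assuming projects succeed or fail independently (a simplification) and using a made-up function name:

```python
def chance_of_early_failure(success_rate, projects):
    """Probability that at least one of the first `projects` projects fails,
    assuming each succeeds independently with probability `success_rate`."""
    return 1 - success_rate ** projects


# With an 80% success rate, how likely is a failure within the first two projects?
print(round(chance_of_early_failure(0.8, 2), 2))
```

That works out to roughly a one-in-three chance of seeing a failure within the first two projects, so the "your method sucks" reaction will be triggered fairly often even when the method performs as advertised.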
The process I am familiar with involves a general "Requirements" document, followed by a "Design" document. In the requirements, all the inputs, transforms, and outputs are listed, which in effect "solves" most of the problem they are trying to get at BEFORE it is estimated. By the time the Design is done, all the I/T/Os are detailed.
The hardest part is always getting the user or engineering committee to agree on what the inputs or outputs are. Do we need sensor X? Is Y going to be fully articulated, or have a limited range of motion? How do we handle error cases (BSOD?, dialogs?, Ignore, log, and reset since the user won't understand anyway?).
I've often been accurate on my estimates (over the 80/80 given above) - but I insist on defining things comprehensively first. And I know my "velocity" (in the Extreme Programming sense of that word).
Extreme Programming bypasses the argument because it breaks the problem down to very small pieces which can be estimated. Just do small sets of I/T/Os and get feedback.
That is often much more effective than trying to get a person or persons to agree on what the inputs and outputs and other specifications are, especially when the user probably doesn't know what they want.
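As a rough illustration of estimating from "velocity": take the average amount of work finished per iteration so far and divide the remaining work by it. The names and numbers below are hypothetical, and real XP teams track this in their own ways.

```python
import math


def iterations_remaining(remaining_points, completed_per_iteration):
    """Estimate iterations left from the average velocity observed so far."""
    if not completed_per_iteration:
        raise ValueError("need at least one finished iteration")
    velocity = sum(completed_per_iteration) / len(completed_per_iteration)
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(remaining_points / velocity)


# Made-up history: 8, 10 and 12 points finished in the last three iterations,
# with 30 points of small I/T/O-sized stories still to do.
print(iterations_remaining(30, [8, 10, 12]))
```

The estimate gets revised after every iteration, which is where the feedback comes in.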
Economics is the science of modeling non-linear phenomena with linear systems. The entire economic context of this question is dubious to begin with, but we're all intimately familiar with those distortions (cf. Dilbert). The purpose of estimating is to fit the activity of developing software into an economic context. If we weren't doing that we'd just go ahead and implement the software without the tea leaves.
What the estimation business needs to justify their existence is a formula that tells people before the fact that the linear assumptions have broken down. Otherwise it's just the science of dividing by zero.
It's the age-old question: would you rather have someone who always insists that hse's right, but is only right most of the time, or someone who rarely insists that hse's right but always is right on those occasions?
There's subtlety here in the invocation of Kolmogorov complexity. KC is measured with respect to an assumed universal computer. That can be as simple as a Turing machine with a handful of states. Or you can define the machine as including all the software already written that you have available for use. In other words the shortest string is the string that maximizes software reuse.
Estimation is never going to produce satisfying results until we have a better foundation of code built for maximal reuse.
Given that the majority of programmers view the coding structures that promote reuse as clutter (templates anyone?), it'll be a long time yet before KC has any bearing on this subject matter.
The world is neither black nor white nor good nor evil, only many shades of CowboyNeal.
I find this problem very similar to problems facing compression software: you can prove that it's impossible to have a compression algorithm that compresses everything (read the compression FAQ), but on the other hand a compression program (ZIP, ARC...) will work great on real world examples: text files, images... It just won't work on random data.
I strongly suspect it's the same here: if you have vague random specs from your boss, it's impossible to give an estimate. If you have precise specs and know your stuff well, then here you go...
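The compression half of the analogy is easy to try out. The impossibility result is a counting argument: there are more strings of length n than there are shorter strings, so no lossless compressor can shrink every input. The sketch below uses Python's standard `zlib` module; the sample inputs and the helper name are made up for the demonstration.

```python
import random
import zlib


def compression_ratio(data):
    """Compressed size divided by original size (below 1.0 means it shrank)."""
    return len(zlib.compress(data, 9)) / len(data)


# Structured, repetitive "real world" input.
text_data = b"Estimate how long it will take, then double it. " * 2000

# Pseudo-random bytes; seeded so the run is repeatable.
rng = random.Random(0)
random_data = bytes(rng.getrandbits(8) for _ in range(100_000))

print("text:  ", round(compression_ratio(text_data), 3))
print("random:", round(compression_ratio(random_data), 3))
```

The repetitive text shrinks to a small fraction of its size, while the random bytes do not shrink at all (they typically come out slightly larger because of format overhead).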
Now, what good is f? On most software projects, f wouldn't be worth much. Why? Because nobody knows what X is. X is a specification of the work to be done (i.e., software requirements), and most such specifications are woefully incomplete, imprecise, and erroneous.
That's why development processes that are repeatable and emphasize increased formalism allow for better estimates. They provide higher-quality X values, not to mention better approximations of f based on past performance. Therefore, if long-term estimates are important to your business, climb the formalism ladder.
On the other hand, good long-term estimates are often unnecessary. Many businesses need only to know where the project is now and to be able to change directions with reasonable efficiency when business needs change or realities are better understood. Witness the effectiveness of so-called agile development processes in turbulent business environments.
So, in the end, the only real lesson is to pick your software development (and estimating) process to support your business. Doing it the other way around usually doesn't work.
The glaring flaw of the paper is that the main argument can be applied equally to any human endeavor, not just to programming. The argument is essentially a rigorous version of the statement, "You can't (in general) know how hard (complex) it's going to be, until you do it". The author supports this by pointing out that the purpose of any program is equivalent to generating a string that is a complete, precise description of the problem. Complexity theory tells us you can't predict the length of that program (without a formal system bigger than the program).
But it's not hard to cast any problem into this form. Take baking a cake. The problem can be thought of as generating a precise description of how to turn some inputs into an output within the range of what we consider a cake. In a reductionist sense, that process is incredibly complex (much more than any computer program), involving gazillions of elementary particles and their interactions. But nonetheless it's pretty easy to estimate how long it will take to bake a cake.
Complexity theory shows us that complexity is indeed pervasive in general; but everyday experience shows us that it is usually encapsulated within simple abstractions. Most things we plan and do have relatively simple descriptions in terms of objects whose properties we are familiar with, and things we have done countless times before. So while estimating complexity may not be possible in general, it is usually not very hard for the things we care about.
In order for the paper to be persuasive, Lewis must show that computer programming is, in practice, more complex than most other activities--that new problems can't be easily stated in terms of already solved problems. (He does begin to address this, but only as a side-note.) I think most practitioners would essentially agree (and I'm not going to argue this, unless someone challenges it). What does this mean for the relevance of complexity theory? It's a deep and difficult question, but I suspect that some insights can be drawn. In particular, I do believe that there are problems that can't be estimated without effectively solving them.
Regardless, there are more obvious, intuitive reasons that complex activities are difficult to estimate. One is that humans vary wildly in their efficiency at complex tasks. We all know the experience of cracking nut after nut one day, and being stumped the next. Sometimes, to be sure, this is due to misestimation of difficulty, but just as surely it is often purely psychological. Another is that teams working on complex problems are prone to miscommunication and other group dysfunctions. A third is simply that the flesh is weak--we often lack the discipline and concentration to plan our projects in sufficient detail.
And this list only considers the difficulties that derive from complexity. Software development faces a host of additional "accidental" challenges, such as bugs in third-party software, clients (and marketers) that change their minds, changing fashions in tools and methodologies, etc. In short, you don't need a fancy theory to conclude that predicting development time is quite hard!
The evaluation of an action as 'practical' . . . depends on what it is that one wishes to practice.
Grumble grumble. I mistyped the ending italics tag.
If not editing a post, maybe we can get a warning when a tag is not closed, and give us the chance to close it??
- - - - - - - - - - -
I am a programmer. I am paid to produce syntax not grammar. Deal with it.
Posted this earlier as an AC, but I'd quite like to know the answer so here it is again with that extra mod point.
If the algorithmic complexity is the length of the shortest program which can produce a given string, would its value not depend on the universal Turing machine the program is expected to run on? E.g., say A and B are two strings, and T1 and T2 are Turing machines. Let K1() and K2() be the Turing numbers of the shortest programs you could use to specify a string using T1 and T2 respectively.
Surely you could artificially define T1 and T2 so that
K1(A) = 1
K1(B) = 2
K2(A) = 2
K2(B) = 1
Here the algorithmic complexity of A is smaller than that of B if T1 is used and vice-versa if T2 is used (assuming the program is just the turing number written in binary, ie 1 or 10).
If I am not talking utter crap here, wouldn't that mean that the algorithmic complexity of a program does not only depend on the algorithm itself, but on an arbitrary choice of Turing machine against which to measure the complexity, making it pretty useless as a measure of how long it would take to develop the software?
Incidentally, when I refer to the Turing machine here, I am not talking about the Turing machine the software we are developing will run on, but the one on which the program that generates its string runs, i.e. the one used to define its algorithmic complexity. They may be the same thing, but I don't see how making them the same would make the algorithmic complexity a more valid indication of software development time.
Somebody is actually concerned about not pissing off the customer? What next, tea and sympathy for the poor end-user?
"Lewis also is incorrect in his criticism of the Software Capability Maturity Model (SW-CMM) from the Software Engineering Institute at Carnegie-Mellon. This method, while far from perfect, has helped many organizations improve their software processes"
So their processes have improved. The real question is whether their software products have improved. I believe in the UL testing model. Each product is evaluated on its own without regard to how it was developed.
All these new age quality models involve grading an organization rather than a product. It's pretty clear that most companies produce products with varying quality even though they were produced by the same organization.
Connell: "no serious researcher in software engineering is trying to find a guaranteed method for producing estimates of time and effort that are certain to be correct. No one is even trying to find methods that produce estimates guaranteed to be correct within a known error range."
By saying this he can ignore those who claim more than subjective estimation because they are not "serious researchers". Fair enough, but you shouldn't leave with the impression that no such claims have ever been made.
and [references in the original paper]
Re the CMM, read the quote in the original paper carefully and judge for yourself. It's from the beginning of what I call their "manifesto", a white paper that motivates and explains the CMM. Indeed, it stops just short of saying that they claim to estimate objectively, but it sure leaves that impression if you're reading quickly.
More generally, many people in the field don't hesitate to call what they're doing software engineering, some call their methods "scientific". A reasonable definition of engineering is:
Is software estimation engineering of this sort? No.
I think of my paper as putting a stake in the ground at one edge of the field, saying "not beyond here", but the rest of the field is open.
An awareness of that 'stake' will help debunk unsustainable claims (if such continue to arise) and also help assign value to methods that really do work, even if they rely on human experience.
And once more, the original paper does NOT say that software estimation is not possible or not valuable. Claiming that it does is a straw man, or more likely, a reflection of the fact that you didn't read it.
No, don't throw up your hands; use extreme programming. In effect, break the problem down into smaller and smaller chunks until its estimation is achievable, and then correct and refine estimates as you proceed through a series of iterations. Release cycles are short, perhaps two months, and you can release a fully tested project at any time.
---- David Phipps david@infiniteresource.net
This is mostly wrong: AC is defined asymptotically. A translator from any language/machine to any other is a fixed size, not even particularly large (50k perhaps). In the limit of larger objects this 50k becomes insignificant. [This is described in AC textbooks and mentioned in the referenced paper.]
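For reference, the standard statement behind this reply is the invariance theorem. The notation below (K_U for complexity measured on machine U, and the constant c_{U,V}) is chosen here for illustration and is not taken from the paper.

```latex
% Invariance theorem: for any two universal machines U and V there is a
% constant c_{U,V}, depending on U and V but not on the string x, with
\[
  K_U(x) \le K_V(x) + c_{U,V} \qquad \text{for all strings } x .
\]
% Applying the same bound in both directions, with c = \max(c_{U,V}, c_{V,U}):
\[
  \lvert K_U(x) - K_V(x) \rvert \le c \qquad \text{for all strings } x .
\]
```

The constant plays the role of the fixed-size translator mentioned above. For tiny strings like the A and B example it can dominate, which is why the objection is only "mostly" wrong; for large objects it becomes negligible.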
The best way to get accuracy would be the same way my wife does with my estimates of home repair projects. First, double the numerical part of the estimate. Second, increment the units. Thus, if I estimate it will take 2 hours, she knows it will probably take 4 days. This takes account of all the problems that Murphy guy throws in.
I'm sure that asking the programmers how long they think it will take and following the rule above will work for software too!
This comment was posted twice, see response below (AC deals with the choice-of-language/machine issue)
"you can't test your way to quality" - Larry L. Constantine
Cause Larry doesn't make any money on testing techniques, he's in the silver bullet business.
Few people are familiar with the term "Kolmogorov complexity". It is basically the length of the shortest possible solution (sequence of symbols), sometimes referred to as "algorithmic complexity". It can be proved that, except for a constant term, the complexity of a problem is independent of what method or language is used to process the symbols. Except for a constant term, Lisp, C++, Basic, and Perl all yield the same complexity for any problem.
Lewis's proof is based on a mathematical proof that the Kolmogorov complexity is impossible to predict (without actually solving the problem).
One objection was that for some "Kolmogorov simple" problems it may take a human a long time to find the short solution, and that for some "Kolmogorov complex" problems the long solution may be obvious to a human.
It got me thinking. If we fudge the definitions a bit, Kolmogorov complexity still applies. "Thinking" is just another method or language for processing symbols (thoughts). So the Kolmogorov complexity is the length of the shortest sequence of thoughts required to solve a problem. In the general case it is impossible to predict the length of the shortest sequence without actually solving the problem.
-
- - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
...that the limit was when the marketing people told the customer that it would be done! At least that's how it works here... We were asked once how long we needed from start to finish to do custom software packages for clients (including protocol analysis, specs development, software development, software V and V, and FDA and site-level documentation) and we told them six weeks. We've gotten that ONCE in the past year and a half, even though it states in our contracts with our clients that they need to give us six weeks. Marketing just charges them an expediting fee (which rarely seems to trickle down) and we work a bunch of weekends (Our marketing department's motto on Friday is "Two working days until Monday!"). And God help us if the software is a day late in shipping, 'cause according to the marketing people if the software is late we'll lose the client! (Yes I'm being whiny--I just finished up a project that, start to finish, we had a week and a half to do and I'm TIRED.)
And generally the way we accomplish something in impossible times is to cut corners. Sure, it works in three weeks, but the code is snarled, there is no documentation, and you took advantage of a security hole to make it go.
Now of course you tell the manager, "If I spend three weeks on a temporary hack, I'm still going to have to spend another twelve weeks later doing it right."
And they say, "Sure! As soon as this crisis is past."
Of course, as soon as crisis A is done, crisis B is looming. And after B, then C, D, and E. So a lot of 'temporary' code gets written. Eventually, the project is just a big heap of steaming turds with some pretty contact paper covering most of the surface. And then the good programmers catch on and leave; the bad ones spend the rest of their lives sticking on more contact paper.
And the manager, of course, has long since moved on; he met his deadlines, after all, so he must be a good manager. And the person who's now in charge of that group? Well he must be a bad manager, because his team has lots of bugs and never makes deadlines anymore.
It's enough to make me cry.
I've been directly and indirectly involved in two CMM efforts in Silicon Valley.
Both failed. One achieved CMM Level 2 in 14 months and then reverted. I've
met with people from other companies that started down this path and then backed away.
In both cases, the software estimation techniques proposed produced inaccuracies and required
too much rigidity in perspective, resulting in frustrated engineering, management and program
management team members, i.e. it did not stick.
Secondly, the efforts required by individual contributors did not enhance day-to-day
work experiences, i.e. no return on investment for the individual contributor.
The people I've encountered that are CMM experts base much of their training on theory
instead of day-to-day experiences that do occur in this industry and in this valley.
CMM, like Deming's work in Japan, requires a homogeneous culture to succeed. There are
few places where this exists in Silicon Valley (some government contractors).
Where does CMM certification pay off? When you are doing business with a company like
Motorola. But there are other ways to demonstrate control of your development processes,
IMO. In this valley, trying to turn software development teams into a manufacturing assembly
line is a waste of time. Put time and effort into empowering each and every individual by providing
the environment they need to be creative, accurate, and efficient.
Regards,
Kramer
We use the Disney system (based on how Disney overestimates the time spent waiting in line). We come up with a best case/worst case estimate. We tell the worst case to the management and the customer, then we tell the best case to the programming staff. Barring major goofs, the real time falls in between, which means we get enough pressure on the programming staff to induce urgency, but we always look really good to management and the customer.
Until computer programs can, without significant human tailoring, write other computer programs, it will be impossible to mathematically provide reliable estimates. Even then it will only provide a reasonable estimate 80% of the time. Why? Because of human involvement. The lackadaisical system admin who never upgraded the hardware that caused the system to go down for a week. The disgruntled employee who yanks his libraries off the builder box in protest. The rival manager who prevents your team from access to the builder box so he can increase his power. The vindictive maintenance man who cuts the power to the outlet your power strip is in. The secretary who loses the new combination to the systems room. The power-tripping network tech who changes your network password because you heedlessly cut him off in traffic. The inexperienced intern who throws off the default builder box settings. Your programmer colleague who confuses the libraries because there was no documentation. Your own well-meaning PHB who needs a chunk of your time to look good in front of his boss before the next round of layoffs. Your girl's troubles that sap your mind so that you can't concentrate. The cold you caught that keeps you from getting the sleep you need for top form. The car that won't start, the bus that runs late, the mandatory training seminar, the emergency meeting, the system outage, the grim realization that you haven't any purpose in all you do and the collapse of motivation that comes with it. Don't think this stuff is far-fetched; it goes on daily. These are the things that throw off estimates today because these details are always ignored. You cannot admit them publicly and maintain a cohesive company; people will rebel. These will continue to throw off estimates as long as humans are involved in the process.
Programmers that I've worked with have almost always intuitively known this to be true, and non-programmers (in particular, product managers responsible for scheduling) have almost never understood this.
Those in the "Programming is an Art" camp tend to agree that there is no real way to estimate how long doing something new is going to take.
Those who think of programming as simply bulk engineering, repetitive, boring, or just "coding" tend to be frustrated by this seeming fact. It is almost irreconcilable with normal business practices not to know how long a job will take until it is actually done. This makes it extremely difficult to make close-ended contracts, and to predict budgets.
Asking how long a particular software job will take is often equivalent to asking how long a research job will take.
I'm sure the scientists would be amused if a suit walked down into R&D and asked them when they would be "done".
There is also a tendency for the QA manager to be asked by the pointy-haired boss at the beginning of the project "give us an optimistic estimate of time to test if everything goes right."
The horror, the horror.
From the analysis:
Lewis claims there simply cannot be any objective method for arriving at a good estimate of the complexity of a software development task prior to completing the task. He uses "objective" to mean a formal, mechanical method that does not rely on human intuition.
Okay, so Lewis doesn't conclude that good estimation isn't possible. He simply says that it's always going to require human intuition. So software engineers can't easily be replaced by some good AI in an app or by a robot. Big deal. Many critical tasks in many professions fit this definition. Doctors, lawyers, chefs, investment managers, etc. The best ones often distinguish themselves with intuition.
First you say that no-one is trying to find methods that produce relatively accurate estimates, and you use that to dismiss a large fraction of his theory: Then you say that you *are* trying to produce relatively accurate estimates: So which is it?
To tell you the truth, I would tell customers/superiors that I can give them very accurate software estimates as long as they don't change project parameters on me after I start.
This whole estimation thing assumes that the project parameters do not change during development, which I have never come close to seeing happening on any of the projects I've been exposed to. Ahh.. to be able to work on a project on a fixed set of parameters..
There are the changes that people can never seem to stop making during product development, and they originate from: marketing, sales, superiors, customers, warehouses and factories, just to name a few.
Of course, there are also the factors that you can't predict ahead of time (and consequently, cannot quantify besides adding a qualitative factor) such as changing: product costs, product availability, product specifications, competition, benchmarking, and tool quality/availability.
The best thing I've found is to keep software simple, sweet and very amenable to changing design and specifications. Software estimation is very much an intuitive feel based on past experience; there are also certain characteristics that you know will throw uncertainty into the schedule. For example, not only do I give my superiors at work a "time estimation", but I also give them an "uncertainty" or "risk factor" that tells them how close I feel my time estimation to be. They can learn a lot when you tell them "4 weeks give or take a couple of days" or "4 weeks if it's feasible to do at all".
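To make the "time estimation plus risk factor" idea concrete, here is a minimal Python sketch. The `Estimate` class, its field names, and the 5-day work week are all made up for illustration; the comment itself doesn't prescribe any particular format.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """A time estimate paired with a stated uncertainty (hypothetical structure)."""
    weeks: float                     # best-guess duration
    spread_days: float = 0.0         # the "give or take" margin
    feasibility_known: bool = True   # False when it's unclear the job can be done at all

    def describe(self) -> str:
        # Phrase the estimate the way it would be said to a manager.
        if not self.feasibility_known:
            return f"{self.weeks:g} weeks if it's feasible to do at all"
        return f"{self.weeks:g} weeks give or take {self.spread_days:g} days"

    def range_days(self, workdays_per_week: int = 5):
        # Low and high bounds in working days (assumes a 5-day week by default).
        base = self.weeks * workdays_per_week
        return (base - self.spread_days, base + self.spread_days)


if __name__ == "__main__":
    print(Estimate(4, 2).describe())
    print(Estimate(4, feasibility_known=False).describe())
```

The point of keeping the two numbers together is that the second one carries as much information for the schedule as the first.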
Spot on. That has to be one of the most moronic questions you could ask a QA manager. As if the manager can tell the PHB "Oh well we expect to be able to do a full run test of all components in three weeks flat"
I'm sure you already know, but what invariably happens is that the QA Manager takes a stab in the dark at a figure, the Development Manager takes that figure and removes a week, the development teams run over by a week, and testing ends up being a bag on the side "Getting in the way" of the release. Then the build is released anyway, because "The customer has been told by [Sales|Management] it would be ready!"
Maybe I'm just cynical. Maybe I'll post as an AC so I don't reveal who I work for (Not a big company, none of you would know them anyway, but still).
Become a tester. See the world!
"Oh. you want this in 3 weeks. No problem." >)
Too bad you forgot my raise last spring. Guess I'll work on it after 7 more hours of Team Arena.
-J
You may look at it as four hours, but I look at it as 1/10 of a work week. Does that mean that in reality it will take 7.2 weeks?
Particularly for larger projects (like this one), not only do you have to add new features, but rewriting old code is necessary. The larger the program, the more has to be recoded in order to add new features. In short, these guys may write 24 lines of fresh code per hour, but they probably have to rewrite significant blocks in order to shoehorn in this new code.
It's a catch-22. You can plan all you want and have an estimate that is within, say, ±5%. The problem is, people want to know how long your estimate is going to take. Well, based on what I just told you, how long will it take to do the estimate? Exactly. But this doesn't stop people from thinking like that. Now, you're back to ballpark guessing. When was the last time that someone wanted to pay you to estimate your estimate? Plus, people get upset when all they have is paper to show for their money. For some reason, if a software engineer shows people his design on paper, it has no value to them. On the other hand, if he shows them prototype code or a proof of concept, they feel like things are moving ahead. The problem is, if you design it right the first time, it will more than likely be a strong enough design to be able to evolve into anything that's needed. If it's kludged together, it sure does show, in money and time, now and later.
Because the guy at the repair shop can give you an estimate pretty much on the spot, many people expect the same to happen with everything else. Buildings would no doubt be the same way, EXCEPT, management won't tolerate this because of the liability issues. If management would grow up, support their people, and of course, stop pulling numbers out their butt, estimates in the software world might actually mean something. Until management fixes the problems that THEY created, they need to just shut their hole and write the check because it's THEIR screwup more often than not that they're paying for.
I've been playing around with the bitkeeper source control system for the last week. After reading this article I suddenly recalled that bitkeeper treats 2-way merge and 3-way merge as entirely separate features. N-way is not even discussed.
In some ways N-way merge is merely a simple generalization of 2-way. The algorithmic complexity is not much different. The problem here is human scale. Humans cope well with two-way merge as a daily activity, cope with 3-way merge at the level of focus required by air-traffic control, and don't cope with 4-way merge under any sane circumstance.
Bitkeeper solves the problem by designing the architecture so that merges can be performed hierarchically. This is a feature that CVS sorely lacks.
Everyone knows that the success of projects is to a large measure determined by whether the architecture obviates the need to delve into N-way hell.
I also recall a project where a database supported two processes which concurrently updated the same dataset. During the design process we found a way to define the system so that each process was permitted to update a distinct set of columns, with maybe a column or two where one process was allowed to set a value and the other process was allowed to clear it. Months of potential development effort slashed at the stroke of a pen. The first design dealt with the concurrency problem in a different way. Getting everyone to respect everywhere the subtle rules required by that design just doesn't happen on most projects.
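As a rough illustration of that column-ownership rule, here is a small Python sketch. The process names, column names, and the convention that `None` means "cleared" are all hypothetical; the project's actual schema isn't described in this comment.

```python
# Hypothetical ownership table: each ordinary column has exactly one writer.
COLUMN_OWNER = {
    "payload": "proc_a",
    "received_at": "proc_a",
    "result": "proc_b",
    "processed_at": "proc_b",
}

# Hypothetical shared flag: one process may only set it, the other may only clear it.
SHARED_FLAGS = {
    "needs_processing": {"set": "proc_a", "clear": "proc_b"},
}


def may_update(process: str, column: str, new_value) -> bool:
    """Return True if `process` is allowed to write `new_value` to `column`."""
    if column in SHARED_FLAGS:
        # Assumption for this sketch: writing None means clearing the flag.
        action = "clear" if new_value is None else "set"
        return SHARED_FLAGS[column][action] == process
    # Unknown columns have no owner, so nobody may write them.
    return COLUMN_OWNER.get(column) == process


if __name__ == "__main__":
    print(may_update("proc_a", "payload", "hello"))        # proc_a owns payload
    print(may_update("proc_b", "payload", "hello"))        # proc_b does not
    print(may_update("proc_b", "needs_processing", None))  # proc_b may clear the flag
```

Because no column ever has two writers for the same kind of change, the two processes never need to coordinate on a write, which is presumably where the saved months came from.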
The best book on the subject is The Psychology of Everyday Things.
What people tend to forget is that nuances of a software design create affordances with respect to the coding effort. When the pressure is on, people tend to grab onto the nearest handle. The handles hidden in the design have a momentous impact.
Some of the most important affordances are second order effects.
The C++ language is often criticized for having a model of class protection which is easily violated. Yes, that's true as a first order criticism. However, C++ makes it fairly easy to figure out a way to manipulate the source code to find all the violations if you decide to look. These manipulations might be a temporary modification for the sole purpose of determining whether a certain kind of integrity exists. The C++ community doesn't lose any sleep over the first order weakness of the class protection model. We all know that violators are playing a dangerous game.
On the other hand, there are certain kinds of abuse in the C language where it's practically impossible to turn up the smoking gun short of a complete source code audit.
The difference is not that C++ prevents programmers from abusing abstractions, but that it provides the necessary affordances to catch the people who do. The importance of these second order effects is vastly underestimated by those who plan.
You can see the extent of the problem wherever mouthy mights thrive. You know the people who always shout "it might happen" when the downside of anything they oppose is mentioned, as if "might" were an adverb of quantity. The implicit logic is that only a first order guarantee is sufficient, yet the recent study shows what everyone already knows: second order affordances generally suffice.
My experience is that projects are a morass of non-quantifiable psychology, experience, and intuition. The second order effects are never discussed on paper. It's left up to the cohesion of the team to impose the second order effects that make the first order effects possible.
It would be far more useful for the estimation people to spend their time figuring out the conditions under which a project becomes non-viable. Offer the programmers some kind of handle for coming back to management with their concerns about faulty second order effects, in language a whole lot less vague than what I'm using.
Wouldn't it be a fine start just to be able to limit ourselves to projects where the outcome is somewhat proportional to the effort expended? If we had proportionality already, the kind of estimation we have now would be a second order concern in its own right, rather than a masturbatory mission impossible.
Both the author and Lewis agree on what is arguably the one, main point: there is no substitute for an experienced, knowledgeable programmer to estimate the time required. Beyond that, they appear to come from different camps. The author appears to come from the camp that believes that objective estimation is possible, but isn't there yet. Lewis, it appears, is from the camp that believes that coding is art, science and experience and not subject to quantification.
This is the age old programming clash of elegance, skill, hard work and experience (admittedly my persuasion) vs. managed, predictable, code generation by technique, hard work and experience. One camp says it's a learnable skill that can be harnessed, quantified and predicted like any other process of the machine age. The other says that there are intangible qualities that elude the cold, hard science of process efficiency.
In practice, we've used lone programmers on our projects. They had various personalities, some flamboyant and extravagant, some quiet and staid. All were hard working, all successful, and all on time and under budget. In a couple of cases, their efforts were deployed enterprise-wide. These programs were translated by people working in teams with processes, milestones, deliverables and other project management buzzwords. All took more time than the teams said, took over twice as much time to deliver as our single programmers did, cost more, delivered fewer of the original features, and were not as responsive or usable as the originals. Most of the users that used both forms of the programs were very unsatisfied with the newer, enterprise-wide implementation and we get constant requests for the old versions to be re-implemented. For political reasons, this is not going to happen. This covers my experience for the last 4 years. Your mileage may vary. We're in the middle of another one of these conversions and it's going just as badly as all the previous ones, with no hope in sight. These projects cover various OSes and languages, so I can't take sides in any of the religious wars about platform or language, only approach. None of ours are even in the ballpark of the FAA system that Lewis refers to in his writings.
Given all this, the author of this article appears to have years of experience and knowledge in this field. Lewis is admittedly new to this field. Refer to my first point.
You must be the change you wish to see in the world - Gandhi
LaForge: Of course
Scotty: You never tell them how long it will really take. Captains are like babies, they want everything right now.
LaForge: But isn't that wrong?
Scotty: How else do we get the reputation of being miracle workers?
(With apologies to Star Trek)
make Linux, not Microsoft. sin(beast) = -0.809016994374947424102293417182819
This paper seems to indirectly support Agile Development, and the methodologies that fall under it (XP for instance.)
http://www.agilealliance.org
"Suppose someone discovers a formal, mechanical method to estimate software effort that is 80% correct 80% of the time. For the other 20% of its estimates,"
And what about the other 16% of occasions?
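For anyone puzzling over where 16% comes from, here is one possible reading of the arithmetic, sketched in Python with exact fractions. This is only a guess at what the commenter meant: it treats "80% correct 80% of the time" as 0.8 × 0.8 of all cases, then subtracts "the other 20%".

```python
from fractions import Fraction

# "80% correct 80% of the time", read (perhaps uncharitably) as a product.
accuracy = Fraction(80, 100)
frequency = Fraction(80, 100)
covered = accuracy * frequency            # 16/25, i.e. 64% of occasions

# "For the other 20% of its estimates ..."
other = Fraction(20, 100)

# Whatever is left over under this reading.
unaccounted = 1 - covered - other         # 4/25, i.e. 16% of occasions

if __name__ == "__main__":
    print(f"covered: {float(covered):.0%}, other: {float(other):.0%}, "
          f"unaccounted: {float(unaccounted):.0%}")
```

Under the more usual reading (the estimate lands within 20% of the actual value on 80% of occasions), nothing is left over and the remaining 20% is the whole story.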
Almost every one of my projects, both large and small, has been done without formal specs, requirements, or even scribbles on a napkin.
It is virtually impossible to estimate accurately. I have started developing my own tools to aid in reuse. This has helped out a lot, but nevertheless it is virtually impossible to estimate how long it will take you to track down a bug or design an aesthetically pleasing interface.
So I usually give two quotes. The first is the estimated time for **BASIC** functionality, meaning it does what it was supposed to do (that is, what I was asked to do without specs), absolutely nothing more. The second part of the estimate is how long it will take to get it to a more useful state. This is usually double or triple the original estimate, because the poor upfront planning leads to poor architecture and "quick-fixes" that I try to clean up in the "revision" phase.
For Large projects I expect to go through several of these cycles. . .
Though I have found that as I matured with respect to the tools (PHP and Perl) and as I have been able to build libraries of common tasks (generating forms based on DB tables, error checking, login code, etc.) I have tremendously increased my programming productivity.
The other variable that affects my estimates is that I wear a lot of hats at the office. I am the programmer/architect, tech support, and recently the sales and billing department as well, since we laid some people off. Not to mention I have multiple bosses who each have their own projects.
In short, in most real world situations I think the problem of software estimation is impossible since there are so many outside factors that contribute.
my $.02
-MS2K
Despite Lewis's claims to the contrary, no serious researcher in software engineering is trying to find a guaranteed method for producing estimates of time and effort that are certain to be correct. No one is even trying to find methods that produce estimates guaranteed to be correct within a known error range. The real-world problem of software estimation is much less strict than Lewis states. We are just trying to get somewhere close a reasonable percentage of the time!
Maybe you're correct: maybe no serious researcher claims these things, or is trying to get to a point to claim these things. But it's sure happening in the workplace. There are many "process" consultancies that do all but promise accurate, repeatable software development. Most of these consultancies advocate a process that gets you to ISO 9000 certification, or SEI's CMM Level N or some combination.
I've been victimized by just such a consultancy (used to have a name starting with "Bell" and ending with "Core") getting corporate upper management to buy into becoming CMM Level 3 certified. The method that this consultancy pushed claimed exactly what J.P. Lewis said: using their method got you to repeatable, predictable estimation.
Naturally, all this system actually required you to produce was "Word" documents, modified individually by highly paid programmers, architects and analysts.
Many people have suggested that in order to properly estimate a project one must have completed a design phase of some kind. How do you complete the design phase of a project that requires a client commitment before you can begin? How can a client commit without an accurate estimate as to time and budget? Are real world clients willing to pay for the design phase of a project they don't know they can afford?
... for most software. Most software projects are late because:
1) The customers or product manager(s) can't decide what features they want and the project keeps changing.
2) Internal organizational politics.
3) Poor project management practices.
4) Internal organizational politics.
5) Inadequate staff.
6) Internal organizational politics.
Most programming projects are IT-related rather than "Computer Science"-related and have modest algorithmic complexity. It isn't figuring out the algorithm that's the problem; it's figuring out the problem to be solved itself, in many many cases. And, even in shrinkwrap (not my area of expertise), I think that dreaming up killer features is a lot harder than implementing them.
Politics kills projects all the time. Boss A has a vendetta against Boss B and sticks him with a project that requires 10 stud programmers for a year, but only gives him 5 weak programmers. Or, Bob is jealous that Dave is going out with Cindy so he won't cooperate and does a lot of passive aggressive crap. Or... well, you get the idea. People matter more than everything else and so politics matters a lot.
Many failed projects exhibit a "random walk" across the solution landscape, as programmers fart around with ideas that are fun to program but only modestly related to the problem at hand. Good project managers do not let the project drift.
And I've seen a lot of cases where I was committed 50% on project A, 50% on project B, 50% on project C, 40% on project D, ... etc. Needless to say, this leads to late projects.
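Adding up those commitments shows how bad the overbooking is. A quick Python sketch using only the four percentages quoted above (the list trails off, so the real total was presumably higher); the proportional rescaling is a hypothetical illustration, not something the comment claims actually happened.

```python
# Percentages quoted in the comment for projects A through D.
commitments = {"A": 50, "B": 50, "C": 50, "D": 40}

# Total promised share of one person's time, and how far past 100% it goes.
total = sum(commitments.values())
overcommitted_by = total - 100

# Hypothetical rescaling: what each project really gets if one person's time
# is split in proportion to what was promised.
actual_share = {
    name: round(100 * pct / total, 1) for name, pct in commitments.items()
}

if __name__ == "__main__":
    print(f"promised: {total}%  (over by {overcommitted_by} points)")
    for name, share in actual_share.items():
        print(f"project {name}: promised {commitments[name]}%, gets about {share}%")
```

Each project was scheduled as if it had roughly twice the attention it could actually get, so every one of them slips even if the individual estimates were sound.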
But it isn't that most of these projects are that hard. Database programming (or web development, or building an editor, or...) is not rocket science.
In one of my doctorate classes, I did an informal study of the relationship between software size and the accuracy of its COCOMO estimate on a data set of government projects. In short, I found that the model was _completely_ wrong estimating the duration of projects under 10,000 lines of code, +/- 20% for projects between 10,000 and about 200,000 lines of code, and it got really accurate after that. Just went looking for the paper but couldn't find it, so the above assertion is only a recollection. Point is, it's probably easier to be "close enough" on really big projects, and don't waste your time on small ones. I might try to get one of my grad students to duplicate or redo the study with a better project set.
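For readers who haven't seen COCOMO, here is a minimal Python sketch of the Basic COCOMO formulas in "organic" mode, using the coefficients published by Boehm (1981). The comment doesn't say which COCOMO variant or mode the study used, so treat that choice as an assumption made for illustration only.

```python
def basic_cocomo_organic(kloc: float):
    """Basic COCOMO, organic mode (assumed variant; not confirmed by the comment).

    kloc: estimated size in thousands of delivered source lines.
    Returns (effort in person-months, schedule in calendar months).
    """
    effort = 2.4 * kloc ** 1.05        # person-months
    schedule = 2.5 * effort ** 0.38    # calendar months
    return effort, schedule


def within_tolerance(estimate: float, actual: float, tolerance: float = 0.20) -> bool:
    """True if the estimate is within +/- tolerance of the actual value."""
    return abs(estimate - actual) <= tolerance * actual


if __name__ == "__main__":
    # The size boundaries mentioned in the comment: 10,000 and 200,000 lines.
    for kloc in (10, 200):
        effort, schedule = basic_cocomo_organic(kloc)
        print(f"{kloc} KLOC: about {effort:.0f} person-months over {schedule:.1f} months")
```

Size is the only input to the model, which may be part of why it fares so badly on small projects, where everything the formula ignores tends to dominate.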
How often has your car taken one more day at the shop, or the pizza place got a mad rush right after your call?
Estimation, by definition, is inaccurate because it has to account for the randomness of reality.
Project managers are rewarded for being on time and under budget. It is the ability to organize, communicate, and marshal reality into compliance with the schedule that makes a project manager successful.
My favorite estimate is a year, regardless of the complexity, platform, budget, or need. Coincidentally, my current project was spec'd in May 2001. At that time I said it would take a year (I meant it then) to write a fully customized JSP app on top of a vendor's API and pre-existing schema. My boss insisted we only had until September because what she needed was a showpiece (in November). Fortunately, I convinced her a mockup would be sufficient for her needs.
Since then we have written functional requirements, design docs, architecture specs, and a static html mockup. In November, when the mockup was delivered, the vendor said, "Hey, you're not gonna believe this but...look at the mockup of our next major upgrade." It turns out we had 80%+ overlap, and we canned the customization project, because the vendor upgrade is due in, you guessed it, May 2002.
The wiseacre would point out that our original estimate was infinitely inaccurate. The pragmatist would point out that the delivery of 80% of our functional requirements will occur in lock-step with the estimate--one year from approval.
My favorite estimate is the same as the answer to "How long is the server going to be down?"
which is "How long does it take to catch a fish?"
No one really knows until they are well into the coding process... No matter how well defined all the components are...
morturii
so shove it
Heh heh. The "rule" that I learned from my first year engineering professor was that you take the estimate, x, and the actual time (and/or cost) will be kx, where k is some number between e and pi (e ~= 2.7183, and pi ~= 3.1416).
It's not a bad rule. Engineers (and programmers) tend to think that things cost a lot less and take a lot less time than they actually do.
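For fun, the rule as a few lines of Python. The function name is made up; the only things taken from the comment are the two bounds, e and pi.

```python
import math


def padded_range(estimate: float):
    """Apply the tongue-in-cheek rule: actual = k * estimate, with e <= k <= pi."""
    return (math.e * estimate, math.pi * estimate)


if __name__ == "__main__":
    low, high = padded_range(4)   # e.g. a "4 week" estimate
    print(f"plan for somewhere between {low:.1f} and {high:.1f} weeks")
```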
Cryptnotic
My other first post is car post.
People forget delays, but they will always remember failures. It's human nature. Do you remember how long it took for Apple to get OS X out? Chances are, you don't. Do you remember Apple's pre-1997 "next generation OS", Copland? Utter failure.
There are tons of other examples.
Cryptnotic
My other first post is car post.
Take any accredited field in engineering. It is like saying "We cannot architect/design a building because we cannot exactly figure out how much wood/brick/rebar/concrete we need."
Yet, we build many buildings by both factored and quantitative measures. Those of course depend on metrics gathered through experience. Factored methods can be off by 30-40 percent, where quantitative ones are off by 10-20.
But to discount planning (forethought?), and those who treat software like construction, just because you cannot exactly figure out the minutiae?
If you use proof by induction, I am sure that you could prove no software project could ever be built.
/\/\icro/\/\uncher