Outstanding Objects (Developed Dirt Cheap)
Mark Leighton Fisher writes "Some readers might be interested in
Outstanding Objects (Developed Dirt Cheap); or "Why Don't Developers Search the Literature?" It seems like I still see a lot of wheel reinvention going on, even with the wealth of code and information now available on the Net."
It's because all us developers think our way is the better way :P
Rick
Making something out of nothing : MD5 ("") = d41d8cd98f00b204e9800998ecf8427e
1) There's no pride in using someone else's code.
2) GPL'd code can't be used in commercial apps (blah blah, technicalities, yeah yeah, just try to get it past your boss)
3) If you're paid by the hour, what's the rush?
I wasn't aware that AC/DC was involved in software development till now.
If it's generic enough to be scratch your particular itch, you'll need to do a lot of work to implement the specifics of your case. If it's very highly specialised, you'll need to do a lot of work to adapt it to the specifics of your case.
;)
Given the choice, would you rather work on adapting someone else's code for your situation - or would you rather write your own from scratch?
(it's a rhetorical question
"None are more hopelessly enslaved than those who falsely believe they are free." -- Goethe
library uses wrong language
library has the wrong license
library pulls in too many external dependencies
library not threadsafe
But it's worth the search - occasionally you find a real gem.
It's far easier for me to spend time recreating code that exists already than to hunt down what's out there, read the documentation, figure out what drugs the developer was on, and customizing it. I make use of perl modules and bits of code on Usenet, just to save time, but that's the extent of it.
You have enemies? Good. That means you've stood up for something, sometime in your life. --Winston Churchill
I always hit the net before emarking on coding. There's no way I'm gonna spend 6 hours throwing together code that someone else has spent 20 lovingly moulding for me :-) I just too damn lazy... What can I say, Wally is my hero :-)
-- Gaxx
Done Dirt Cheap.
It doesn't matter if the code is available from somewhere "out there", from inside your company, or even from inside your group. The reality is that developers in general don't play well with others. Why? For a number of reasons.
First, it is no fun to use someone elses code. This is why at one time Apple computer (many years ago) had 13 different (yes, I counted them) memory managers being written. It was fun to write a memory manager, to solve the problems involved, etc.
Second, people don't trust one another. How do I know that you have implemented this code correctly? How do I know that you will deliver the modifications that I need? That you will deliver them on time? I can't, so it is better to do it myself.
Bottom line, we don't play well with one another, because we want all the fun for ourselves and because we don't trust the other folks (called flipping the bozo bit in some corners).
Most developers probably don't even know how to search CPAN or install a module from it (or PEAR for PHP). So they roll their own inferior solution. Those who have spent the three minutes reading the docs are getting an incredible benefit.
We reinvent the wheel because new wheels look sexy, not because they roll any better.
I have absolutely no idea whether there's a point to be proved with that, but it's kinda interesting.
If your problem is trivial, it's faster to write your own code. If your problem is not that trivial, it takes a lot of time to try to understand someone else's non-trivial solution. More than it would take to write your own code.
I write a program, and part of it needs to simply read a
Do I _REALLY_ want to pull in libpng and libSDL just to do this? What kind of risks does pulling these libraries in add to my project? How much will this bloat my code? Will users be confused from the different versions of these libraries? What if I one day want to port to a platform that these libraries work on?
Turns out it's usually simpler, easier, and less risky to just roll your own.
I develop in Delphi and I use a lot of stuff from the net (if you want to learn how to create reusable components, and use already made components, this is THE development environment, and there is even a free linnux version! and it is PAscal, not this joke of a language called C or C++).
anyway, as I work that way (for my company), I then get nailed down by the legal team because most stuff on the net doe not have a licence attached to it, or has a wrong licence, or the company wants to kee 100% copyright on stuff, but we can not contact the authors or something like that.
ie: if you develop for a company, you do not have the choice, you have to re-invent the weel (or hide it from your superior and legal teams). what a shame....
Most of us are lazy by nature.
I completely disagree. As a matter of fact, I have an article that totally disproves this, but I don't feel like finding it right now.
Maybe after my nap.
Best Windows Freeware
I'm not sure I that the 3 reasons Mark Fisher gives for lack of code-reuse are the main issue. Usually I think programmers are just too lazy to search. I had one predecessor that I always suspected had this disease. I've noticed in myself at times too. Usually you think that if you (re)write it, it'll be easier than trying to understand someone else's code (often but not always, you are wrong. :).
That said, let me pass on a little practical story. Having built solutions myself that were quick and dirty, for version 2.0 of a recent project I worked on, I decided to dump most of my code and try building on an existing, well-known open source project in my area. I've spent 4-6 weeks trying to take a well-known piece of open source code that performs a similar function better than my quick&dirty approach. I'm not finished, but with the deadline past and with significant obstacles remaining, I'm really questioning my well-intentioned attempt at re-use.
So let me toss out some more reasons why developers may not "search the literature":
1) the (time)-cost of doing the search,
2) the cost of figuring out the implementation details of what you do find so that you can effectively use it (which can be anything from understanding perl module documentation to understanding the concepts behind lex/yac or some protocol),
3) the time/development-cost of integrating the open source codebase into your codebase; this includes porting or handling dependency chains
4) the risk that, because of some detail that you won't understand until you fully invest in #2 (above), it may end up that this tool you are reusing actually doesn't solve your problems for some unexpected underlying implementation reason (something you can avoid if you fully develop your own solution with methods you *know* will work)
5) the risk of choosing the wrong alternative (e.g. picking one templating system out of the dozen alternatives that then gets orphaned)
I'd like to reuse code more, but rationally there are a bunch of reasons why I don't do more. These need to be addressed more satisfactorily for more code sharing to flourish.
--LP
Back in 1990 I worked for a small company that built
graphics boards and my first task was to debug
the "polygon fill" routine in their firmware.
It turns out they use their own "home brewed"
algorithm that was slow, memory hungry, and didn't
handle degenerate cases correctly. If anyone in
the company would have taken the time to pull
any one of the graphics textbooks off their shelf
(e.g., Foley, van Dam) they would find a much better
solution.
I ended up rewriting the module myself using
the classic solution -- it was faster, used little memory,
and handled degenerate cases reasonably.
It was my experience that everything was a badly
reinvented wheel when I worked there.
I have been involved in software reuse since the mid-1980s and possibly even earlier. There has been lots of energy expended on the problem of making existing implementations extensible, one of the strengths of OO technology, though not requiring OO. The big piece that has always been missing has been a major concerted effort focused on facilitation matching a developer's needs with existing software.
... I have long felt that CASE tools, yup those tools that are totally out of vogue right now, would be of greatest value if they had a dual function. Their primary function would continue to be as a means of describing architecture, design, or code, but a secondary function would be to, in the background, perform a continuous search of existing work looking for matches. I have never seen a tool that does this, yet this seems a tremendously valuable function.
There are many mechanisms that can assist such as:
1 - technical reviews. When these happen, you get a number of co-workers together to review your work. Not only does this assist in ensuring that direct work (architecture, design, code) is correct, but it also provides an opportunity for all those involved in the review to search their knowledge of pre-existing "parts", be they architectural, design, or actual code, and to suggest you investigate them. Of course, if you're like me, then actual review meetings where a number of people sit down and examine your work just do not happen any more. Thus this form of identifying existing work that can be reused no longer functions.
2 - CASE tools
3 - personal memory - only works for those items you already are familiar with, which frequently gets voided when changing jobs.
4 - institutional memory - this is similar to the technical review mechanism, yet is less well defined. The real question here is HOW does an individual tap into an institutional memory? Documentation search? This is far less than perfect even if all work was well documented. Code search? Even worse at turning up matches to needs.
So... the bottom line is that it truly is VERY difficult to match up needs of a software development effort with the existing software that is available.
Once case in point... I worked on a very large project for an FAA (Federal Aviation Administration) contract. One mechanism I needed was a circular buffer/queue. These seem very straight forward to implement, and an obvious place to use an existing piece of design/code. Well, even after extensive search and review I could not find such a part and built my own. Later, I discovered there were at least six independent implementations of a circular buffer/queue in this single project team. All of them were general enough to meet the other implementation's needs, yet somehow none of us knew of the others' overlapping work. If we couldn't coordinate the reuse of these six independent efforts (and that means we all built the same basic algorithm, found the same set of bugs... and yes, using our code management tool I was able to see the same bugs being fixed in each implementation... and thus a total and unnecessary duplication of effort), how in the world will we ever solve the problem of reusing work outside the single project team, or outside a company?
There are some examples of wild success with reuse... though they seem to me to be more success though definition. All of those shell scripts that are built from individual command line tools are examples of reuse, where each command line tool represents a unit of software available for reuse. But, I think we all think of reuse more at the code module level... a function, or class, or small library. And it is at this level that I think we fail miserably, and it is my contention that we fail because we can't easily find the candidates for reuse.
Yes, I'm paid by the hour, but I also care about the quality of that hour. If the problem is interesting, I tend to research other solutions (to scope out the pitfalls and features I might not have thought of), then I'll often implement my own solution, because learning someone else's code tend to be pretty high on the tedium scale.
If the problem is NOT interesting, I have a lot more motivation to find someone else's code that I can use; if I find a quality solution, I can plug it in, spend hopefully minimal time debugging and testing it, and move on.
And there CAN be pride in using someone else's code, actually; I really get a kick out of using libraries and sending back elegant enhancements or bugfixes back to the authors ("Your library was excellent. I improved it.").
Also, if the code you found is really good stuff, it might help you to finish up a complex feature in record time, which also feels nice ("Oh, I almost forgot to mention it, but that new report we scoped out yesterday is out on the test server").
There are only 10 types of people: those who understand decimal, those who don't, and, uh, 8 other types I forget.
I often don't reuse for reasons described very well by other posters, but I wanted to mention some cases where I either did reuse or wanted to.
Two years ago I was developing online courseware for a company that trained/certified future medical transcriptionists. We needed to develop a typing test. Now, a typing test is all about doing two things -- (1) noting when someone types something the shouldn't be there and (2) noting when someone doesn't type something that should. So you're comparing for absensces or additions between a given text and a key. Sound like anything else? My first thought was 'diff'. My second thought was Perl (after all, this is text slinging). My third thought was CPAN. And sure enough, Mark Jason-Dominus' excellent Algorithm::Diff saved me at least days of time and quite possibly weeks.
Now, this was possible in part because I was working as a contractor, and so was probably trusted a bit more, and also, in part because my supervisor/contact with the company was pretty savvy. I can contrast this with some other experiences. Like the company I worked for that wanted a webserver log file analysis package. Again, lots of text slinging, but perl or any other scripting language was out because we wanted the source as closed as possible. Nope, it had to be in C, and I was discouraged from trying to find a regex library to use. I essentially ended writing my own regex engine. It was buggy. It needed optimization. The syntax was less powerful . The stats package itself was good, especially for 1997 (it could do things I've only seen other log analyzers do in the last two years), but because it all ran on top of this flaky regex engine, it couldn't fly. I think it got canned after I left... nobody wanted to touch it. I seriously think I lost months of my life on this, and the company lost a good product. All from trying to reinvent the wheel...
Tweet, tweet.
when programmers are taught to a computer language, we're taught to write. When everybody else is taught any other kinda language, they're taught to read. If I learn Russian, I'll read great literature.
Trouble is twofold.
1) we con't have a corpus of great computer code we can show folks howto read.
2) reading code is most often associated with Maintenance and maintenance programmers aren't highly regarded.
call it NIH.