Myths About Open Source Development
jpkunst writes "A thought-provoking article by chromatic on oreillynet, listing eight "myths" that Open Source developers tell themselves. For example: Myth: Publicly releasing open source code will attract flurries of patches and new contributors. Reality: You'll be lucky to hear from people merely using your code, much less those interested in modifying it."
Nearly all of the article's "myths" are relevant for all software development, not just FOSS. As for the first myth, and the one cited in the posting, that's just a troll. I don't think anyone believes that just releasing code makes it useful or desirable. In other words, this article should have been titled: 7 Myths about Software Development. As such, it's not bad, although I didn't find any deep insights in it.
----------------
Mythical Man Month Methodology
http://fourm.info/
writing open source software will get me laid!
This may be true for a minority of widely used projects, but for most applications, I've never bought this argument. Bug swatting, and especially code inspection, is and always will be a tedious process, not well-suited for a volunteer-only development community. The only advantage I see for open source in this area is that bugs can be fixed as they are encountered -- but this only works where the end user has the required skills to do the fixing in the first place.
Roving Web-Teleoperated Robot
My limited experience with open source is summed up by this sentence from the article:
~~~
Not all open-source projects are alike, however. A small number of open-source projects have become well known, but the vast majority never get off the ground, according to Scacchi.
~~~
Open source is obviously faster/better/cheaper when thousands of people donate their time to a single project. The only open source project I've been involved in was a collaboration among several corporations, all of which wanted to leverage each other's resources, but none of which could really contribute their own.
There's nothing like money to motivate people to work on a project for which people aren't willing to donate their time.
Personally, I'm not convinced speed is related to developer quantity. There's too big a variation in productivity between experienced and amateur developers.
I'm also not convinced open-source is right for all types of software. How many open-source developers do you know who conduct large-scale usability tests? How many open-source developers go around interviewing end users? When the developer and the product consumer are the same, open-source makes much more sense to me.
The linux hacker
Myth: The GPL is the only open source license
Truth: Although it's the most popular, it's not the only license.
Sadly, I think this is what most people think of when they think of open source.
Fortress of Insanity
Myth: Publicly releasing open source code will attract flurries of patches and new contributors.
Reality: You'll be lucky to hear from people merely using your code, much less those interested in modifying it.
In my experience, this is not the case. I wrote a little rip-encode-and-tag script called choad and listed it on Freshmeat for the hell of it. This was two years ago, and I've received over 20 patches -- for a crappy little Perl script!
I wrote it to solve my problem, and I continue to be pleasantly surprised when I get emails with feature enhancements, bug fixes, or just plain thanks and encouragement from people who had the same problem as me.
Cretin - a powerful and flexible CD reencoder
I mean, does anyone really think that how they package their product won't affect how many people start using it? Are there really a lot of people out there who assume that they'll have an instant dedicated following of skilled developers spring from nowhere the moment they publish their source?
I really doubt it, somehow. Charitably, I'd file the advice in this article under the "Obvious but sometimes in need of restating" category, in that sometimes people will lose the forest for the trees. Still, no real revelations here.
Every year during my review, I just pray the words "slashdot.org" aren't mentioned.
They bring up an important point about warnings-- that if you don't fix warnings, even if the thing you're being warned about is fine, you'll miss important warnings later.
Their solution is, always fix warnings.
My solution is, GCC needs some way to suppress warnings!
Yes, GCC can already suppress *classes* of warnings. But I want to be able to suppress warnings on a per-line basis. What if in function x, there is a variable that I have defined but do not use for some specific reason-- but I still want to be warned if I do the same by accident in function y?
In Codewarrior, we had something called #pragma unused which worked like this. But that was just for that one case. Something generalized would be cool, something like "#pragma gcc.sw typecast" that would suppress typecast warnings for the next block, for example...
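The per-block suppression being asked for can be illustrated outside of GCC. The sketch below uses Python's standard `warnings` module, not any GCC syntax; the names `legacy_value`, `function_x`, `function_y` and `count_warnings` are made up for the illustration. The warning is silenced inside one function only and still fires everywhere else.

```python
import warnings


def legacy_value():
    # Hypothetical helper that always emits a warning we have decided to accept.
    warnings.warn("legacy_value is deprecated", DeprecationWarning)
    return 42


def function_x():
    # Suppress DeprecationWarning for this block only.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return legacy_value()


def function_y():
    # No suppression here, so the same warning is still reported.
    return legacy_value()


def count_warnings(func):
    # Record every warning that escapes func and return how many there were.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func()
        return len(caught)
```

Calling `count_warnings(function_x)` and `count_warnings(function_y)` shows the difference: the suppression stays local to the block that asked for it.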
Oh my God, this sounds exactly like my last job. 10,000 lines of Tcl, with not a shred of documentation in sight. Running a financial system that processed millions of dollars a day. And I know to this day, my old boss is still trying to figure out why she keeps losing employees left and right, and why it takes so long for new people to come up to speed.
I Am My Own Worst Enemy
It's not worth writing good design documents because everyone will read the code.
Read Epic the first RPG novel.
"I am sure that everyone will want to install Apache/mod_perl/mod_ssl and mysql and perl 5.8.3 and 17 non standard perl modules (8 of which are not available on CPAN), ImageMagick, python, zlib, libpng and glib2.1 and zend and php) to be able to use my practically useless and very buggy digital picture management system."
If you write something that is useful and/or fun, people are going to use it. For example, I use the Spreadsheet::WriteExcel module at work. Yes, Perl writing Excel documents. I used it because there was a need. I fixed a bug in one of the optional modules because that was a feature we use and need to work correctly. Would I ever have picked up and used that module on my own? Maybe, if I came across it and wanted to create a spreadsheet for some silly reason, but I highly doubt it. But I had a need to create an Excel spreadsheet on a Unix server, so I filled that need.
Get Movie Posters
It's a myth. And here's a proof: /usr/src/linux/Documentation/networking/arcnet.txt :
A few words from a desperate open source coder...
Since no one seems to listen to me otherwise, perhaps a poem will get your
attention:
This driver's getting fat and beefy,
But my cat is still named Fifi.
Hmm, I think I'm allowed to call that a poem, even though it's only two
lines. Hey, I'm in Computer Science, not English. Give me a break.
The point is: I REALLY REALLY REALLY REALLY REALLY want to hear from you if
you test this and get it working. Or if you don't. Or anything.
ARCnet 0.32 ALPHA first made it into the Linux kernel 1.1.80 - this was
nice, but after that even FEWER people started writing to me because they
didn't even have to install the patch.
Come on, be a sport! Send me a success report!
(hey, that was even better than my original poem... this is getting bad!)
WARNING:
--------
If you don't e-mail me about your success/failure soon, I may be forced to
start SINGING. And we don't want that, do we?
(You know, it might be argued that I'm pushing this point a little too much.
If you think so, why not flame me in a quick little e-mail? Please also
include the type of card(s) you're using, software, size of network, and
whether it's working or not.)
My e-mail address is: apenwarr@worldvisions.ca
45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
What most developers don't think is "Hey, I didn't contribute anything. Nobody I know has contributed anything. Why will my project be any different?"
Myth 3: Reading code
I've tried to read large bodies of code before. It's damn hard, even if it is documented. And when it isn't documented, your beginning developers don't have a chance.
Myth 4: Packaging
Um...duh? Of course it needs to be properly packaged. And dependency lists? If someone can't get it to compile, they definitely won't use it.
Myth 5: Start from scratch
Don't start from scratch if the code isn't clean. Make new code clean, and go back to clean up existing code. Make sure you have those regression tests ready.
Myth 7: Perfection
Developers are humans. Humans are fallible. I'll make a perfect program - when Bullwinkle pulls a rabbit out of his hat.
Myth 8: Ignore warnings
If the warnings were ignorable, they wouldn't be there. My profs would take marks off if you got warnings in compilation, unless your documentation explained exactly why you let the warning stand (and it had better be a good reason).
Myth 9: Tracking CVS
Users don't track CVS. Developers track CVS. Users want quick-and-easy, working code.
Either I miscounted, or there's more than 8 entries on the site (they aren't numbered)
I can't say that I don't give a fuck. I've just run out of fuck to give.
Congrats to chromatic for offering several points about ease of use, especially regarding installation, which are often missed. In particular:
- "Packaging Doesn't Matter"
- "Programs Suck; Frameworks Rule!"
- "Warnings Are OK"
- "End Users Love Tracking CVS"
I appreciate the difficulties involved for open-source developers in making their programs easy to download and play. At the end of the day, it's their choice whether they make it accessible to the masses. Many of them just want to give something to the world that they would have otherwise kept for themselves.
But it is clear from the number of ambitious projects that many developers do aspire to hit prime time. In those cases, I hope they will take the advice in Chromatic's article, and think very carefully about the experience of an end-user who just wants to have a look.
For one thing, provide some screenshots so they don't even have to download the thing to see it. Next, read your installation instructions and consider whether they might not be better represented as an actual installation script. And finally, have an automated test facility to make sure the installation procedure works correctly.
An example of a problematic open-source package is Subversion, the "sequel" to CVS. Because of the decision to bootstrap version control, you have to go through some painful procedure (last time I looked), just to see if it's worth bothering about yet. I have better things to do than jump through hoops to try out a bit of fresh meat. I'm sure it will be great when it hits 1.0, but I'll save my energy until then.
Remember: the risk of a crap product is high when it comes to picking one of the thousands of packages on SF. Therefore, the pain threshold for most people is very low: if it doesn't work after a few minutes, most people will give up and try one of the dozen alternatives.
Myth: Publicly releasing open source code will attract flurries of patches and new contributors.
;D
Myth: Stopping new development for weeks or months to fix bugs is the best way to produce stable, polished software.
Myth: New developers interested in the project will best learn the project by fixing bugs and reading the source code.
Myth: Installation and configuration aren't as important as making the source available.
Myth: Bad or unappealing code or projects should be thrown away completely.
Myth: It's better to provide a framework for lots of people to solve lots of problems than to solve only one problem well.
Myth: Even though your previous code was buggy, undocumented, hard to maintain, or slow, your next attempt will be perfect.
Myth: Warnings are just warnings. They're not errors and no one really cares about them.
Myth: Users don't mind upgrading to the latest version from CVS for a bugfix or a long-awaited feature.
For explanations of each, RTFA.
Granted, I don't think all of those are myths. But one really irks me as being false for any software developers:
Myth: New developers interested in the project will best learn the project by fixing bugs and reading the source code. Reality: Reading code is difficult. Fixing bugs is difficult and probably something you don't want to do anyway. While giving someone unglamorous work is a good way to test his dedication, it relies on unstructured learning by osmosis.
I work for a very niche market/profitable software company and that's exactly how the developers get their feet wet, by fixing minor bugs.
Seems like the only way to "learn a project" is to fix bugs and therefore read the code.
I find that open source is not so valuable in that people inspect my code and provide feedback. Instead I find the following realizable benefits:
A) I can build upon other people's code. It's effectively stealing their ideas, BUT since I'm GPLing my code as well, there is no net loss, and they are free to resteal my ideas back (if they are so inclined). I do often refer original authors to my new code.
B) I recognize that people MIGHT secretly build upon my code, so I get a warm fuzzy.
C) I can fix problems with open source drivers (postgres jdbc driver, GNU file-utils, etc. are some of my examples). Moreover, my debugger can jump straight to the line of malicious code.
D) When I am about to release code publicly, I feel self conscious, and thus I put a TREMENDOUS amount of effort into cleaning up the code: making sure various platforms work, making sure there is no embarrassing spaghetti code, etc. Thus the mere possibility of people reading my code causes me to exert effort that I wouldn't otherwise. The end positive is a lower propensity for bugs, AND more modular/reusable code (especially with anything in Perl).
The end-end result is therefore that Open source facilitates greater code reuse; less re-inventing of the wheel.. And more importantly code extensibility.
Now this raises the question of the distinction between modules and outright applications. Open source is great for producing millions of reusable modules, but we often get chastised about the availability of abundant QUALITY applications. Well, in my view, the merging of these two is twofold:
A) Open source applications tend to be more "pluggable"
B) Commercial sites will often pay developers to use open source modules and customize them to the particular needs of the corporation. In doing so, serious feedback is provided to the various open source projects (because it is in their mutual interest to refine the modules). I, as part of such a corp, have contributed (in various small ways) to several open source projects on the corp's dime, and with full authorization. This is, of course, a completely unreliable source of income for a project, but it is definitely a facilitator.
-Michael
regardless of whether the project is an open source (or not).
We (popular IT community) are re-learning the lessons of IBM in the 60s which Fred Brooks distilled in his famous The Mythical Man-Month.
I think the bigger misunderstanding is that new developers/IT types/CS academics think that everything is new. Most computer security issues were first discussed in the 1960s or 1970s. Much of Distributed Computing is now being "re-discovered" as Grid Computing.
I've found a few other misconceptions in open source development that have irked me over the years.
1. Using autoconf/automake will make my code portable.
TRUTH: You need to know which system calls are portable, which ones aren't, and the nuances in using each on different platforms. The auto* tools only make detecting and using the correct versions easy. It's up to you to identify and code for them in the first place. (Ditto for compiler flags, shared libraries, linker options, etc.)
2. Network programming is easy.
TRUTH: I've seen a lot of projects that implement their own network communication using TCP sockets and sprintf text messages. A number of others transmit little endian integers around. And others still use a blocking style request->response form of communication.
Good network programming is really hard, and unless you take the effort to design and implement something robust from the start, this kind of ad-hoc, inflexible networking will become embedded into the application and require significantly more rework later down the road.
And PLEASE reuse something that might fit before even attempting to write your own layer. The gnutella protocol is a great example of this problem.
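As a rough illustration of what "designed from the start" can mean at the smallest scale, here is a sketch of length-prefixed message framing with an explicit network (big-endian) byte order, using Python's standard `struct` module. The frame layout (4-byte unsigned length, then payload) and the names `encode_frame` and `decode_frames` are assumptions made for this example, not anything taken from the article or from a particular protocol.

```python
import struct

# 4-byte unsigned length in network (big-endian) byte order.
HEADER = struct.Struct("!I")


def encode_frame(payload: bytes) -> bytes:
    # Prefix the payload with its length so the receiver knows where it ends.
    return HEADER.pack(len(payload)) + payload


def decode_frames(buffer: bytes):
    """Split buffer into complete payloads plus any leftover partial bytes."""
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # frame is incomplete; keep the bytes until more data arrives
        frames.append(buffer[offset + HEADER.size:end])
        offset = end
    return frames, buffer[offset:]
```

The receiver feeds whatever bytes it has into `decode_frames` and keeps the returned remainder for the next read, so it never depends on one `recv` matching one message, and the byte order is fixed by the format string rather than by the host CPU.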
3. Threading is as simple as using pthreads and mutexes.
TRUTH: Good threading code is difficult to develop and difficult to debug. It is always preferable to use an event based model where possible, and rely on threads only when you need scalability on SMP, workarounds for blocking system calls (gethostbyname_r), or background tasks that you don't want delaying interaction with a user or network app (there are many other reasons, but these give you the general idea of where threading is appropriate).
Synchronizing access to shared resources between threads is also very tricky. The level of granularity of locking, and the structure of your data structures themselves, will have a significant impact on performance. Too fine a granularity and you end up with extremely complex locking hierarchies that are difficult to debug and more prone to deadlock. Too coarse a granularity and you get lots of contention for these shared resources.
Finding the sweet spot is tricky, and often requires lots of experience or tuning to get right. The lack of tools to provide visibility into lock contention and latency also makes this difficult.
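To make the granularity trade-off concrete, here is a small sketch using Python's standard `threading` module. `StripedCounter` and `hammer` are hypothetical names for this illustration: with `stripes=1` every key shares one coarse lock, while a larger value spreads keys over several locks to reduce contention. It only demonstrates the structure and makes no claim about measured performance.

```python
import threading


class StripedCounter:
    """Counts keys using one lock per stripe instead of one global lock."""

    def __init__(self, stripes: int = 8):
        # stripes=1 degenerates to a single coarse lock.
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._tables = [{} for _ in range(stripes)]

    def _index(self, key) -> int:
        # Each key always maps to the same stripe, and therefore the same lock.
        return hash(key) % len(self._locks)

    def increment(self, key) -> None:
        i = self._index(key)
        with self._locks[i]:
            table = self._tables[i]
            table[key] = table.get(key, 0) + 1

    def get(self, key) -> int:
        i = self._index(key)
        with self._locks[i]:
            return self._tables[i].get(key, 0)


def hammer(counter, keys, threads=4, rounds=1000):
    # Run several threads that all increment every key `rounds` times.
    def work():
        for _ in range(rounds):
            for key in keys:
                counter.increment(key)

    pool = [threading.Thread(target=work) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
```

Each operation takes exactly one lock, so there is no lock ordering to get wrong. Once an operation needs two stripes at the same time, the hierarchy and deadlock concerns described above come back.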
I'm sure there are others, but these are the big ones that come to mind.
then please make it easier to contribute.
Show us your roadmap for development,
where you want us to contribute time,
and how we can get started helping you.
Make it easy to understand your software,
maybe by creating help files, diagrams,
real examples of how to use your software,
even comparisons to related software.
Source code comments are good;
technical overviews are even better.
Above all, get FEEDBACK from developers
on your source code and your documentation.
Is it clear? easy? How could it be easier?
The more you improve your documentation,
and your process for contributing code,
the more we can help you. Thanks!
Cheers, Joel
If the oil light comes on, and you don't stop immediately, you will stop in a very expensive way seconds later.
False.
The oil light in my 1990 Trooper used to come on regularly because of low oil pressure. After a while, I quit topping it off, always thinking "I'll take care of it tomorrow." The situation went on for weeks before the engine finally seized.
Of course, the above is strong evidence that I am an idiot.
While I agree that the best way to write good software is to not introduce bugs in the first place, I don't believe that this is an entirely avoidable problem.
There are certain types of necessary changes that inherently destabilize a codebase no matter how careful you've been. It's inevitable. Oftentimes, things like this are checked in to amortize the cost of producing, fixing, and improving said code. There are the unforeseen interactions that your new subsystem has, which none of the regression or unit tests have picked up. I know - "write more/better tests" is a better solution. But omniscience is an impossible goal.
To continue the author's "home" analogy, releasing software is like preparing a meal. The pots and spoons simply must get dirty when you're cooking. Many try to "clean as you go," but at the end, you're still left with your dirty casserole dish. You can either choose to clean things up before your guests get there (feature freeze), or you can leave the dirty dishes lying on the counter for all to see.
I might be inclined to say that the shorter the feature freeze, the better. But I don't have any evidence to back this up - nor does Chromatic cite any evidence (except anecdotal) to support or refute this claim. Maybe people by nature are better at fixing a slew of bugs at once. Maybe not.
Freezes, milestones (alpha, beta) and the like are inevitable parts of producing quality software fit for public consumption, short of "papal infallibility." We're only human.
Dom
As Linux and other Open Source software get used more and more by less tech-savvy users, e.g. non-programmer types, contributions as a percentage of the community will seem to decline.
I think most people, tech savvy or not, appreciate the work that has gone into the free software that they use from day to day. When was the last time you took stock of just how incredible that Linux box with its flashy GUI is, when you consider that it has been brought to you by the hard work of the OS community?
I think people need to find their niche, as to what they can and can't do in order to contribute. Many people think that because they are not hard-core coders they can't do anything to help. I've only contributed to a couple of things since I've been using Open Source stuff (the past 4 years). But when I do fix a bug or create something a project might find useful, I usually send any files or useful info over to the project maintainers. It is the least I can do when I owe my Redmond-free world to so many dedicated geeks!
I wonder just how many regular Open Source users feel that if they could, they would help, but maybe don't know how.
I would say project maintainers should encourage people to help out in other ways. There are loads of things people can do: artwork, documentation, website maintenance, heck, even giving free support to people if they are nice enough.
I've been helping a few newbies through their first forays into Linux, as indeed friends helped me when I got started. If you plant the right seeds in those newbie minds, they most certainly will grow a giving and generous attitude.
There is one more way people can support Open Source. Let's introduce a "Send Your Favorite Project a Beer Day": send 'em some beer money!
Nick !
Electronic Music Made Using Linux http://soundcloud.com/polyp
According to that link, Alan has a BSc in CS. Linus Torvalds has a bachelor's degree in CS, and an honorary Ph.D. from the same school in Finland. I'm too lazy to dig up links for that. It's in several of the books about his life.
Kirby
Here's another common myth... "You can't sell open-source software". Not true! In fact, the FSF encourages people who distribute free software to charge as much as they want.
Throwing away code blindly is a mistake, especially if it is working code. Then again, keeping crufty, bad code around is an equally large mistake. The larger point that Chromatic misses is that making uninformed code decisions is playing Russian Roulette. Throwing away code (or keeping code around) is only a mistake when one has no concrete rationale for doing so.
The important part is to have a good understanding of the problem scope, previous attempts (if any) at solving the problem, and what their advantages and drawbacks are.
You have to remember that code doesn't exist for code's sake alone. We write code to solve problems. Code is a window into how someone solved a problem. And not all solutions are created equal.
What is important is to understand the "whys" and "hows" of these previous attempts, and then chart the best course you see toward success. It may well be that the best solution is to scrap another's design. It may be the best solution to build off of another's success. However, it's probably a bad decision to build off of another's failures.
Dom
Programs Suck; Frameworks Rule!
Myth: It's better to provide a framework for lots of people to solve lots of problems than to solve only one problem well.
Reality: It's really hard to write a good framework unless you're already using it to solve at least one real problem.
Really-Real-World Reality: Frameworks that are developed in conjunction with one specific project are likely to produce lousy results when used in a different project.
I've seen a number of "generalized" frameworks that came out of one large project, only to wreak havoc when they were forced upon the developers of another project. When people are writing support code for a project, a lot of project-specific design decisions get mistaken for generic architecture because the developers are only looking at it from an insider's perspective.
Solve your real problem first. Generalize after you have working code. Repeat. This kind of reuse is opportunistic...
This is sheer idiocy. If anyone disputes this, I've got some code I'd like to show you...
(Trying not to flame) This guy doesn't know what he's talking about. The proverbial "reinvention of the wheel" is not really reinvention. The problem is that programmers do just what he suggests - rather than think through the problem, and how they can create reusable code, they proceed to cobble together some garbage which solves only the specific problem at hand. Which leads to other programmers having to "reinvent the wheel" because the first programmer didn't make his code reusable!
You can't have it both ways. Either you reinvent the wheel every time, or you write reusable code. It's a discipline, folks - sometimes you have to put forth the extra effort up front to make gains in the long run.
The first three years as a programmer, I must have written at least half a dozen linked list implementations. It wasn't until I had worked on some large projects that I learned that writing reusable code is well worth the extra effort. I was the guy who "just coded the solution". It took me a long time to learn that the more time I spent thinking about the problem, the less time I spent on coding and debugging.
The society for a thought-free internet welcomes you.
The surest way to guarantee involvement in a project is to create a community around it. Forums, user/contributor publishing, blogs. Anything that will let your contributors express themselves regarding the project.
Let people get involved, encourage them, provide a forum... hopefully provide the tools (SourceForge) but also provide a unique community experience. Create a brand (read a book on marketing) and you will reap the rewards for years... think about Aibo for instance...
A fool throws a stone into a well and a thousand sages can not remove it.