Myths About Open Source Development
jpkunst writes "A thought-provoking article by chromatic on oreillynet, listing eight "myths" that Open Source developers tell themselves. For example: Myth: Publicly releasing open source code will attract flurries of patches and new contributors. Reality: You'll be lucky to hear from people merely using your code, much less those interested in modifying it."
Nearly all of the article's "myths" are relevant for all software development, not just FOSS. As for the first myth, the one cited in the posting, that's just a troll. I don't think anyone believes that just releasing code makes it useful or desirable. In other words, this article should have been titled: 7 Myths about Software Development. As such, it's not bad, although I didn't find any deep insights in it.
----------------
Mythical Man Month Methodology
http://fourm.info/
writing open source software will get me laid!
My limited experience with open source is summed up by this sentence from the article:
~~~
Not all open-source projects are alike, however. A small number of open-source projects have become well known, but the vast majority never get off the ground, according to Scacchi.
~~~
Open source is obviously faster/better/cheaper when thousands of people donate their time to a single project. The only open source project I've been involved in was a collaboration among several corporations, all of which wanted to leverage each other's resources, but none of which could really contribute their own.
There's nothing like money to motivate people to work on a project for which people aren't willing to donate their time.
Personally, I'm not convinced speed is related to developer quantity. There's too big a variation in productivity between experienced and amateur developers.
I'm also not convinced open source is right for all types of software. How many open-source developers do you know who conduct large-scale usability tests? How many open-source developers go around interviewing end users? When the developer and the product's consumer are the same person, open source makes much more sense to me.
The linux hacker
Myth: The GPL is the only open source license
Truth: Although it's the most popular, it's not the only license.
Sadly, I think this is what most people think of when they think of open source.
Fortress of Insanity
I mean, does anyone really think that how they package their product won't affect how many people start using it? Are there really a lot of people out there who assume that they'll have an instant dedicated following of skilled developers spring from nowhere the moment they publish their source?
I really doubt it, somehow. Charitably, I'd file the advice in this article under the "Obvious but sometimes in need of restating" category, in that sometimes people will lose the forest for the trees. Still, no real revelations here.
Every year during my review, I just pray the words "slashdot.org" aren't mentioned.
It's not worth writing good design documents because everyone will read the code.
"I am sure that everyone will want to install Apache/mod_perl/mod_ssl and mysql and perl 5.8.3 and 17 non standard perl modules (8 of which are not available on CPAN), ImageMagick, python, zlib, libpng and glib2.1 and zend and php) to be able to use my practically useless and very buggy digital picture management system."
If you write something that is useful and/or fun, people are going to use it. For example, I use the Spreadsheet::WriteExcel module at work. Yes, Perl writing Excel documents. I used it because there was a need. I fixed a bug in one of the optional modules because that was a feature we use and need to work correctly. Would I ever have picked up and used that module on my own? Maybe, if I came across it and wanted to create a spreadsheet for some silly reason, but I highly doubt it. But I had a need to create an Excel spreadsheet on a Unix server, so I filled that need.
What most developers don't think is "Hey, I didn't contribute anything. Nobody I know has contributed anything. Why will my project be any different?"
Myth 3: Reading code
I've tried to read large bodies of code before. It's damn hard, even if it is documented. And when it isn't documented, your beginning developers don't have a chance.
Myth 4: Packaging
Um...duh? Of course it needs to be properly packaged. And dependency lists? If someone can't get it to compile, they definitely won't use it.
Myth 5: Start from scratch
Don't start from scratch if the code isn't clean. Make new code clean, and go back to clean up existing code. Make sure you have those regression tests ready.
Myth 7: Perfection
Developers are humans. Humans are fallible. I'll make a perfect program - when Bullwinkle pulls a rabbit out of his hat.
Myth 8: Ignore warnings
If the warnings were ignorable, they wouldn't be there. My profs would take marks off if you got warnings in compilation, unless your documentation explained exactly why you let the warning stand (and it had better be a good reason).
Myth 9: Tracking CVS
Users don't track CVS. Developers track CVS. Users want quick-and-easy, working code.
Either I miscounted, or there are more than 8 entries on the site (they aren't numbered).
I can't say that I don't give a fuck. I've just run out of fuck to give.
I think one needs to differentiate between small and big projects. It's certainly easier to write a patch for a relatively short script, simply because it's easier to understand what it does. Try to write a useful patch for a big project like Mozilla and you'll spend quite some time trying to even understand which file you need to patch. It's obvious that smaller projects attract more patches while bigger projects attract more bug-reports.
You can use GCC's attribute system:
int foo __attribute__ ((unused));
GCC supports all kinds of cool attributes, both for functions and variables. For example, the ((deprecated)) attribute marks a variable as deprecated, and will produce a warning if any code uses that variable.
However, these methods are not portable. On nearly any compiler I can imagine, the cleanest and simplest way to suppress an unused variable warning is to assign the variable to itself:
int x; /* shut up compiler warning */
x = x;
Run 'info gcc' to get the full documentation. Go to the "C Extensions" section. GCC is littered with HUNDREDS of very cool extensions. Just make sure it's worth giving up portability...
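For what it's worth, here is a minimal sketch of how the attributes discussed above might be wrapped so the same source still builds on other compilers. The macro names ATTR_UNUSED and ATTR_DEPRECATED and all the function names are made up for illustration. Casting the unused name to void is another widely used standard C idiom, shown alongside for comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper macros: expand to GCC attributes when available,
   and to nothing on compilers that do not define __GNUC__. */
#if defined(__GNUC__)
#  define ATTR_UNUSED     __attribute__((unused))
#  define ATTR_DEPRECATED __attribute__((deprecated))
#else
#  define ATTR_UNUSED
#  define ATTR_DEPRECATED
#endif

/* The declaration carries the attribute, so GCC warns at each call site. */
int old_scale(int x) ATTR_DEPRECATED;
int old_scale(int x) { return x * 2; }

/* GCC-specific: mark the parameter itself as intentionally unused. */
static int on_event(int code, void *context ATTR_UNUSED)
{
    return code + 1;
}

/* Standard C alternative: cast the unused name to void. */
static int on_event_portable(int code, void *context)
{
    (void)context;
    return code + 1;
}

/* Both handlers should behave identically; returns 1 when they agree. */
static int handlers_agree(int code)
{
    assert(code >= 0);
    return on_event(code, NULL) == on_event_portable(code, NULL);
}
```

Calling old_scale() from new code should produce a deprecation warning under GCC, while neither handler should warn about its unused context parameter.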
Congrats to chromatic for offering several points about ease of use, especially regarding installation, which are often missed. In particular:
- "Packaging Doesn't Matter"
- "Programs Suck; Frameworks Rule!"
- "Warnings Are OK"
- "End Users Love Tracking CVS"
I appreciate the difficulties involved for open-source developers in making their programs easy to download and play. At the end of the day, it's their choice whether they make it accessible to the masses. Many of them just want to give something to the world that they would have otherwise kept for themselves.
But it is clear from the number of ambitious projects that many developers do aspire to hit prime time. In those cases, I hope they will take the advice in chromatic's article, and think very carefully about the experience of an end-user who just wants to have a look.
For one thing, provide some screenshots so they don't even have to download the thing to see it. Next, read your installation instructions and consider whether they might not be better represented as an actual installation script. And finally, have an automated test facility to make sure the installation procedure works correctly.
An example of a problematic open-source package is Subversion, the "sequel" to CVS. Because of the decision to bootstrap version control, you have to go through some painful procedure (last time I looked) just to see if it's worth bothering about yet. I have better things to do than jump through hoops to try out a bit of fresh meat. I'm sure it will be great when it hits 1.0, but I'll save my energy until then.
Remember: the risk of a crap product is high when it comes to picking one of the thousands of packages on SF. Therefore, the pain threshold for most people is very low: if it doesn't work after a few minutes, most people will give up and try one of the dozen alternatives.
Not really sure that this is a myth. Anybody can write crappy, buggy code. People do it every day. Same thing with stability. Whether Unix is a better platform than Windows might be debatable, but I don't think anybody denies that crappy code is written on both platforms.
The only thing that open source brings to the table is that people might look at it, and might point out problems. But if you are relying on both of those to happen you are making two big assumptions.
I find that open source is not so valuable in that people inspect my code and provide feedback. Instead I find the following realizable benefits:
A) I can build upon other people's code. It's effectively stealing their ideas, BUT since I'm GPLing my code as well, there is no net loss, and they are free to steal my ideas back (if they are so inclined). I do often refer original authors to my new code.
B) I recognize that people MIGHT secretly build upon my code, so I get a warm fuzzy.
C) I can fix problems with open source drivers (postgres jdbc driver, GNU file-utils, etc. are some of my examples). Moreover, my debugger can jump straight to the line of malicious code.
D) When I am about to release code publicly, I feel self-conscious, and thus I put a TREMENDOUS amount of effort into cleaning up the code: making sure various platforms work, making sure there is no embarrassing spaghetti code, etc. Thus the mere possibility of people reading my code causes me to exert effort that I wouldn't otherwise. The end positive is a lower propensity for bugs, AND more modular/reusable code (especially with anything in perl).
The end-end result is therefore that open source facilitates greater code reuse and less re-inventing of the wheel. And, more importantly, code extensibility.
Now this raises the question of the distinction between modules and outright applications. Open source is great for producing millions of reusable modules, but we often get chastised about the availability of abundant QUALITY applications. Well, in my view, what merges the two is twofold:
A) Open source applications tend to be more "pluggable".
B) Commercial sites will often pay developers to use open source modules and customize them to the particular needs of the corporation. In doing so, serious feedback is provided to the various open source projects (because it is in their mutual interest to refine the modules). As part of such a corp, I have contributed (in various small ways) to several open source projects on the corp's dime, and with full authorization. This is, of course, a completely unreliable source of income for a project, but it is definitely a facilitator.
-Michael
I've found a few other misconceptions in open source development that have irked me over the years.
1. Using autoconf/automake will make my code portable.
TRUTH: You need to know which system calls are portable, which ones aren't, and the nuances of using each on different platforms. The auto* tools only make detecting and using the correct versions easy. It's up to you to identify and code for them in the first place. (Ditto for compiler flags, shared libraries, linker options, etc.)
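To make that concrete, here is a rough sketch of the division of labour. It assumes configure.ac contains AC_CHECK_FUNCS([strlcpy]), which defines HAVE_STRLCPY in config.h when the function is found. The detection is automatic, but the fallback branch is still code you have to know to write. The name copy_string is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* config.h is normally generated by configure; skipped when absent. */
#ifdef HAVE_CONFIG_H
#  include "config.h"
#endif

/* Copy src into dst (capacity size), always NUL-terminating when size > 0.
   Returns strlen(src), so the caller can detect truncation. */
static size_t copy_string(char *dst, const char *src, size_t size)
{
    assert(dst != NULL && src != NULL);
#ifdef HAVE_STRLCPY
    /* Platform provides strlcpy (the BSDs, for example). */
    return strlcpy(dst, src, size);
#else
    /* Hand-written fallback for platforms without it. */
    size_t len = strlen(src);
    if (size > 0) {
        size_t n = (len < size - 1) ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
#endif
}
```

A return value greater than or equal to the buffer size means the copy was truncated, on either branch.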
2. Network programming is easy.
TRUTH: I've seen a lot of projects that implement their own network communication using TCP sockets and sprintf text messages. A number of others transmit little endian integers around. And others still use a blocking style request->response form of communication.
Good network programming is really hard, and unless you take the effort to design and implement something robust from the start, this kind of ad-hoc, inflexible networking will become embedded into the application and require significantly more rework later down the road.
And PLEASE reuse something that might fit before even attempting to write your own layer. The gnutella protocol is a great example of this problem.
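As one small illustration of the integer problem, here is a sketch that writes a 32-bit value in network byte order with shifts, so the bytes on the wire do not depend on the sender's CPU. The length-prefixed frame layout and the names put_u32_be, get_u32_be and encode_frame are assumptions made up for this example, not any particular protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store value as 4 bytes, most significant byte first (big-endian). */
static void put_u32_be(uint8_t *out, uint32_t value)
{
    out[0] = (uint8_t)(value >> 24);
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)value;
}

/* Read 4 big-endian bytes back into a host integer. */
static uint32_t get_u32_be(const uint8_t *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Hypothetical frame layout: 4-byte big-endian length, then the payload.
   Returns the number of bytes written, or 0 if buf is too small. */
static size_t encode_frame(uint8_t *buf, size_t cap,
                           const void *payload, uint32_t len)
{
    assert(buf != NULL);
    if (cap < 4 || cap - 4 < len)
        return 0;
    put_u32_be(buf, len);
    if (len > 0)
        memcpy(buf + 4, payload, len);
    return (size_t)len + 4;
}
```

An explicit length prefix also lets the receiver know where a message ends, which sprintf-style text messages over a TCP stream do not give you for free.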
3. Threading is as simple as using pthreads and mutexes.
TRUTH: Good threading code is difficult to develop and difficult to debug. It is always preferable to use an event-based model where possible, and rely on threads only when you need scalability on SMP, workarounds for blocking system calls (gethostbyname_r), or background tasks that you don't want delaying interaction with a user or network app (there are many other reasons, but these give you the general idea of where threading is appropriate).
Synchronizing access to shared resources between threads is also very tricky. The level of granularity of locking, and the structure of your data structures themselves, will have a significant impact on performance. Too much granularity and you end up with extremely complex locking hierarchies that are difficult to debug and more prone to deadlock. Too little granularity and you get lots of contention for these shared resources.
Finding the sweet spot is tricky, and often requires lots of experience or tuning to get right. The lack of tools to provide visibility into lock contention and latency also makes this difficult.
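A small sketch of the granularity trade-off, assuming POSIX threads: counters are split into buckets with one mutex each, so threads touching different buckets do not contend, but anything needing a consistent view of all buckets must take every lock in a fixed order to stay clear of deadlock. NUM_BUCKETS, MAX_THREADS and all the function names are illustrative choices, not tuned values.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_BUCKETS 8   /* illustrative; 1 would mean a single global lock */
#define MAX_THREADS 16

struct striped_counters {
    pthread_mutex_t locks[NUM_BUCKETS];
    long counts[NUM_BUCKETS];
};

static void counters_init(struct striped_counters *c)
{
    for (size_t i = 0; i < NUM_BUCKETS; i++) {
        pthread_mutex_init(&c->locks[i], NULL);
        c->counts[i] = 0;
    }
}

static void counters_destroy(struct striped_counters *c)
{
    for (size_t i = 0; i < NUM_BUCKETS; i++)
        pthread_mutex_destroy(&c->locks[i]);
}

/* Only the bucket owning this key is locked. */
static void counters_add(struct striped_counters *c, unsigned key, long delta)
{
    assert(c != NULL);
    size_t b = key % NUM_BUCKETS;
    pthread_mutex_lock(&c->locks[b]);
    c->counts[b] += delta;
    pthread_mutex_unlock(&c->locks[b]);
}

/* A consistent total needs every lock, always taken in ascending order. */
static long counters_total(struct striped_counters *c)
{
    long total = 0;
    for (size_t i = 0; i < NUM_BUCKETS; i++)
        pthread_mutex_lock(&c->locks[i]);
    for (size_t i = 0; i < NUM_BUCKETS; i++)
        total += c->counts[i];
    for (size_t i = NUM_BUCKETS; i-- > 0;)
        pthread_mutex_unlock(&c->locks[i]);
    return total;
}

struct worker_args {
    struct striped_counters *c;
    unsigned key;
    int iterations;
};

static void *worker(void *p)
{
    struct worker_args *a = p;
    for (int j = 0; j < a->iterations; j++)
        counters_add(a->c, a->key + (unsigned)j, 1);
    return NULL;
}

/* Start up to MAX_THREADS workers, wait for them, return the grand total. */
static long run_demo(int num_threads, int iterations)
{
    struct striped_counters c;
    pthread_t threads[MAX_THREADS];
    struct worker_args args[MAX_THREADS];
    int started = 0;

    if (num_threads > MAX_THREADS)
        num_threads = MAX_THREADS;
    counters_init(&c);
    for (int i = 0; i < num_threads; i++) {
        args[i].c = &c;
        args[i].key = (unsigned)i;
        args[i].iterations = iterations;
        if (pthread_create(&threads[i], NULL, worker, &args[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    long total = counters_total(&c);
    counters_destroy(&c);
    return total;
}
```

Build with -pthread. Changing NUM_BUCKETS and timing run_demo() under load is a crude way to see where contention starts to hurt on a given machine.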
I'm sure there are others, but these are the big ones that come to mind.
You might be surprised, but I agree. It usually takes me finding three instances of similar code before I can generalize it correctly.
This article was talking about the open source world, though. There seems to be a penchant for writing frameworks without any projects that actually use them. That's the myth I was trying to address. Extracting a framework from only one project isn't spectacular, but it's much, much better than extracting a framework from zero working projects.