Open Sourcing with (Imperfect) Revision History?
AArnott asks: "My company is open-sourcing a private project that has been in development for 4 years. Its history is all in our internal Subversion server. The history of the project includes dependencies on source code that we are not open-sourcing. Should we just publish the latest version (now that we've removed the dependencies) and leave out the old history? Or should we publish the history, even though no previous revision will build, due to the dependencies that we are not including?"
Why bother dealing with old or historical releases? Unless the current release has lost functionality or features that someone could resurrect by going through the historical code, there really wouldn't be any advantage.
The only thing releasing the full Subversion history is going to get you is complaints from idiots that you're violating the GPL by not open-sourcing the dependencies. I applaud your concern for thoroughness, but just go with the current version.
Just start a new public repository with the latest good version. Keep the history, but don't worry too much about making sure that everyone has access. Long term storage is fine.
CM systems improve communication between developers by letting them synchronize their work, and they keep simple developer mistakes from turning into massive code rewrites (but you don't need more than two weeks of history to accomplish these goals). The reasons you usually carry around all of the extra baggage of old versions are (1) establishing legal ownership (copyright information), (2) simultaneously maintaining multiple versions in the field, and (3) showing some history of how you got to where you are.
Legal ownership is important, but you get that by keeping a few backups in your long-term storage. You don't have versions in the field (not of the open-sourced version anyway) so that's a moot point. The "how we got here" argument is also of minimal value as long as someone who knows the code is still around. The knowledge of how things were developed in a decent developer's head will be much easier to use than attempting software archeology on a stale file repository.
Regards,
Ross
I say publish the history. The worst case scenario is that it's not useful to anyone. If that's true, there's no loss on your part. The advantage of providing the history is that if there was, say, a bug in your dependency removal, someone can go look at the history and fix the bug. When you're fixing bugs, extra information never hurts...
When learning about certain code bases, I find it extremely valuable to start with whatever beginning code there is because it illustrates the core concepts while not being a thicket of code. It also helps to see what design decisions were made and then rescinded.
It took me a long time to come around to this view, but the value of information is always positive. Storing and managing it might be costly or take time, but all information, by itself, always has positive value. I argued with my decision analysis prof in grad school for a whole semester on this point, and after losing miserably I finally came around.
So... release as much as you can.
I went through this with my current company a few years ago, and we decided to publish only the current revision. I wish we hadn't.
Consider one more thing. Geniuses are extremely rare these days, and I doubt that nothing in the code was left in "for historical reasons" or as a drop-in replacement for code you do not wish to open-source. The history would at least explain such things if someone runs into a bug caused by a replacement; moreover, there is a better chance that such kluges will be improved if people know what they are for.