The Power Behind the SCO Nuisance
akahige writes "Forbes has a fairly detailed story about the sordid history of The Canopy Group and all the various companies they've sued -- Microsoft (who they beat) and CA (this case is still pending), among them. Before joining Caldera, Darl McBride sued IKON Office Solutions, for whom he worked -- and won. And it also seems that a bunch of Canopy power players also sit on SCO's board of directors. The short summary is, 'these guys are professional litigious bastards -- be exceptionally wary.'" A local users' group is planning a protest for tomorrow. Reader myst564 writes: "After reading all of this SCO press I remembered that SCO once offered up all of their 'Ancient UNIX' (their words, not mine) source to the world while retaining all copyrights (i.e., no OSS license). Interestingly enough it WAS located here but isn't any longer: SCO's Ancient Unix. What's more, you can read about the original release here at: Linux Today. I downloaded the source myself way back then but never did anything but delete it! Anyway, check out this comment. It's interesting that this was predicted in 2000!"
You did not count on the Wayback Machine, Herr Doktor SCO?
Here's a working link..
Enjoy!
Perhaps there was meant to be a NOT in there somewhere?
The thing about things we don't know is we often don't know we don't know them.
It depends, really. An MD5 hash will only tell you if entire files were misappropriated verbatim. So throwing on a GNU header, adding a changelog entry for a bug fix, etc. would all invalidate the MD5 hash. I do not believe that there is any truth to the SCO claims, but MD5 hashes wouldn't be proof in favour of Linux either.
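To make the point concrete, here is a minimal sketch in Python (my choice of language, and the file contents are made up for illustration): the same code with a header bolted on hashes to something completely different.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of a whole file's bytes as a hex string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical file contents, invented purely for illustration.
original = b"int add(int a, int b)\n{\n    return a + b;\n}\n"
with_header = b"/* Licensed under the GNU GPL */\n" + original

# The function body is byte-for-byte identical, yet the file hashes differ,
# so a whole-file comparison reports "no match".
print(md5_hex(original))
print(md5_hex(with_header))
print(md5_hex(original) == md5_hex(with_header))
```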
A first step would be to use a regexp to spit out all the comments into a file sorted by some key. Do this for both the SCO and Linux code bases. Toss out all the comments which aren't in both lists and you now have a file with common comments. This would be where to start looking: if you see non-trivial verbatim comments, then further investigation would be needed.
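A rough sketch of that comment comparison in Python, assuming C-style comments. The function names, the whitespace normalisation and the `min_words` cutoff for "non-trivial" are my own assumptions, and a bare regexp like this will be fooled by comment markers inside string literals.

```python
import re

# Matches /* ... */ block comments and // line comments (C-style only).
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def extract_comments(source):
    """Return the set of comments in `source`, with whitespace normalised."""
    comments = set()
    for match in COMMENT_RE.finditer(source):
        text = match.group(0)
        # Strip the comment delimiters.
        text = text[2:-2] if text.startswith("/*") else text[2:]
        # Drop the leading '*' that decorates multi-line block comments.
        lines = [ln.strip().lstrip("*").strip() for ln in text.splitlines()]
        normalised = " ".join(word for ln in lines for word in ln.split())
        if normalised:
            comments.add(normalised)
    return comments


def common_comments(source_a, source_b, min_words=4):
    """Comments present in both sources, sorted, skipping very short ones."""
    shared = extract_comments(source_a) & extract_comments(source_b)
    return sorted(c for c in shared if len(c.split()) >= min_words)
```

Anything this returns is only a lead for a human to look at, not evidence by itself.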
Chris Kuivenhoven is a thief, beware
No, no, this was gone over before; you MD5 hash each consecutive five-line set (including overlapping ones) for each set of source, sort the list of hashes, do the same for Linux, and then run through the list of MD5s looking for matches.
That'll give you hits for any five-line segment of code that matches anywhere between the two.
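Something like the following Python sketch would do it. Collapsing whitespace and skipping blank lines before hashing are my own assumptions (so that reindented code still matches), and I use a set intersection on the hashes rather than sorting and walking the two lists, which finds the same matches.

```python
import hashlib


def window_hashes(source, window=5):
    """Map the MD5 of every run of `window` consecutive non-blank lines
    to the line numbers (1-based) where such a run starts."""
    numbered = [
        (number, " ".join(line.split()))  # collapse whitespace (assumption)
        for number, line in enumerate(source.splitlines(), start=1)
    ]
    numbered = [(number, text) for number, text in numbered if text]
    hashes = {}
    for i in range(len(numbered) - window + 1):
        chunk = numbered[i:i + window]
        joined = "\n".join(text for _, text in chunk)
        digest = hashlib.md5(joined.encode("utf-8")).hexdigest()
        hashes.setdefault(digest, []).append(chunk[0][0])
    return hashes


def matching_windows(source_a, source_b, window=5):
    """Return sorted (line_in_a, line_in_b) pairs where a window matches."""
    hashes_a = window_hashes(source_a, window)
    hashes_b = window_hashes(source_b, window)
    return sorted(
        (line_a, line_b)
        for digest in hashes_a.keys() & hashes_b.keys()
        for line_a in hashes_a[digest]
        for line_b in hashes_b[digest]
    )
```

Unlike a whole-file hash, this survives an added header or changelog entry, since only the windows touching the changed lines stop matching. Renamed variables would still defeat it.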
IAABAAP (I am a biologist and a programmer), and the two processes are not really similar. Most higher-organism genomes are chock full of very highly repetitive genetic filler/rubbish/crap, which makes the gene assembly *way* more difficult.