IBM Says SCO Willfully Failed To Detail Evidence
Robert wrote to mention a piece on CBR Online covering the latest volley in the SCO case. IBM is now accusing SCO of having acted in bad faith when it brought its case against IBM, by being purposefully vague in its evidence. From the article: "All in all, according to IBM, SCO's evidence filing makes it impossible for the company to defend itself. 'By failing to provide adequate reference points, SCO has left IBM no way to evaluate its claims without surveying the entire universe of potentially relevant code and guessing ... Since only SCO knows what its claims are, requiring such an exercise of IBM would be as senseless and unfair as it would be Herculean.'"
In college, my professor had a class of a couple hundred freshmen and the problem of making sure no one was copying anyone else's code for trivial homework assignments. It's a similar problem, so how do we solve it?
His solution was a simple edit distance program that compared the source code of every pair of homework submissions. You could thus find the areas of greatest similarity between two pieces of code, or even two documents. A simple algorithm--it's the engineer's way.
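For the curious, here is a minimal sketch of that pairwise check. I never saw the professor's actual program, so everything here is an assumption: the function names `edit_distance` and `rank_pairs` are made up for illustration, the distance is plain Levenshtein over characters, and scores are normalized by the longer submission so that short and long files can be compared.

```python
from __future__ import annotations

from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, keeping only two rows in memory."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def rank_pairs(submissions: dict[str, str]) -> list[tuple[float, str, str]]:
    """Score every pair of submissions; the most similar pairs come first."""
    results = []
    for (name_a, code_a), (name_b, code_b) in combinations(submissions.items(), 2):
        longest = max(len(code_a), len(code_b)) or 1  # avoid dividing by zero
        results.append((edit_distance(code_a, code_b) / longest, name_a, name_b))
    return sorted(results)


if __name__ == "__main__":
    # Hypothetical submissions, purely for illustration.
    homework = {
        "alice": "for i in range(10): print(i)",
        "bob": "for j in range(10): print(j)",
        "carol": "while True: pass",
    }
    for score, first, second in rank_pairs(homework):
        print(f"{first} vs {second}: {score:.2f}")
```

A score near 0 means the two files are nearly identical. Checking every pair is quadratic in the number of submissions, which is fine for a couple hundred students but worth keeping in mind for anything bigger.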
When I took a course in computational biology (or bioinformatics), I was introduced to the BLAST and FASTA algorithms, which could be useful in this case. Basically, you can compare by global alignment (one whole sequence against another) or by some form of local alignment (the best-matching stretches within each), and BLAST and FASTA are fast heuristics for the latter. These algorithms already work on protein chains and DNA, so they are more than capable of processing large sets of data quickly and effectively.
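To make the local alignment idea concrete, here is a rough sketch using Smith-Waterman, the exact dynamic-programming method that BLAST and FASTA approximate with faster heuristics. This is not BLAST itself. The function name `local_alignment` and the scoring values (match +2, mismatch -1, gap -1) are my own assumptions, picked only for illustration.

```python
def local_alignment(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman score of the best local alignment between a and b.

    Returns (score, end_in_a, end_in_b), where the end positions are
    1-based indices of the last aligned element in each sequence.
    """
    previous = [0] * (len(b) + 1)
    best = (0, 0, 0)
    for i in range(1, len(a) + 1):
        current = [0]
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score = max(
                0,                     # start a fresh alignment here
                previous[j - 1] + step,  # extend along the diagonal
                previous[j] + gap,     # gap in b
                current[j - 1] + gap,  # gap in a
            )
            current.append(score)
            if score > best[0]:
                best = (score, i, j)
        previous = current
    return best


if __name__ == "__main__":
    # Two made-up strings sharing one short stretch.
    print(local_alignment("xxxxhelloyyyy", "zzhellozz"))
```

Because the function only indexes and compares elements, you can pass lists of source lines or tokens instead of strings, which is probably closer to what you would want for comparing code. The floor at zero is what makes it local: a bad stretch resets the score instead of dragging down a good match found later.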
The article says SCO submitted 45,000 pages of evidence and materials--which I assume are SCO's own work. IBM could choose to have them scanned and provide the court with the allegedly infringing documents to check them against. The localized areas that score the highest could then be inspected by IBM, giving its lawyers ample time to prepare a defense on the points in the documents that SCO will probably attack. In fact, it's entirely possible that SCO used this method to quickly identify what it thought to be points of infringement in code.
But of course, like most Slashdot posters, I'd rather just see the judge turn to SCO and say, "Bullshit, case dismissed..." and proceed to tell them off like Judge Judy giving a deadbeat father a taste of the back o' her hand.
My work here is dung.