Slashdot Mirror


ESR to Shred SCO Claims?

webmaven writes "According to this article in eWEEK, ESR has released a utility called comparator for analyzing the similarity of source code trees. The technical details are interesting: ESR says he is using a refined implementation of the 'shred' algorithm that outperforms other versions on machines with enough RAM. ESR won't say whether he intends comparator to be used to compare older Unix code with Linux in order to refute SCO's claims, but it is obviously well suited to that purpose. Interestingly, because the shred algorithm can run reports on source trees using only the MD5 signatures of the shreds (once generated), it can compare trees without direct access to the source code itself. That opens up a possible use in comparing proprietary source trees with each other and with freely available code bases such as Linux and *BSD, without requiring disclosure of the proprietary source: a neutral third party could generate the shreds on a company's premises and leave without taking a copy of the source. I'll be interested to see whether (and which of) the proprietary vendors allow their source trees to be 'shredded' for such comparisons, and whether this becomes a standard forensic technique in source-code copyright and trade-secret disputes."
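
The idea of comparing trees using only MD5 signatures can be sketched roughly as follows. This is not ESR's code: the shred size of 3 lines, the whitespace normalization, and the names `shred_digests` and `common_shreds` are all assumptions made for illustration.

```python
import hashlib


def shred_digests(text, shred_size=3):
    """Return the set of MD5 hex digests, one per window ("shred") of
    `shred_size` consecutive non-blank lines.

    Assumption: lines are whitespace-normalized and blank lines are
    dropped before hashing. The real tool may normalize differently.
    """
    lines = [" ".join(line.split()) for line in text.splitlines()]
    lines = [line for line in lines if line]
    digests = set()
    for i in range(len(lines) - shred_size + 1):
        window = "\n".join(lines[i:i + shred_size])
        digests.add(hashlib.md5(window.encode("utf-8")).hexdigest())
    return digests


def common_shreds(digests_a, digests_b):
    """Compare two trees using only their digests, never the source."""
    return digests_a & digests_b


# Two hypothetical files that share four consecutive lines.
file_a = "int a;\nint b;\nint c;\nint d;\nreturn 0;\n"
file_b = "void f();\nint a;\nint b;\nint c;\nint d;\n"

shared = common_shreds(shred_digests(file_a), shred_digests(file_b))
print(len(shared), "shreds in common")
```

Only the digest sets need to leave the premises, which is what would let a third party run the comparison without carrying away the source.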

2 of 554 comments

  1. Re:Is there really that much data there? by RexHowland · Score: 0, Redundant

    But what would the hash be of? Would each line of code be a separate hash, or would lines be combined?

    Wouldn't altering one letter in the code completely change the hash? If so, all you would need to do to avoid detection would be to make a few changes to minor things, and you would appear to have different hashes, even if the source code were essentially the same.
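
A small experiment suggests how this plays out if hashes are taken over overlapping windows of lines rather than over a whole file. The 3-line window and the name `line_window_digests` are assumptions for illustration, not details taken from comparator.

```python
import hashlib


def line_window_digests(lines, size=3):
    """MD5 digest of each window of `size` consecutive lines, in order.

    Assumption: a fixed 3-line window; the real tool's shred size
    may differ.
    """
    return [
        hashlib.md5("\n".join(lines[i:i + size]).encode("utf-8")).hexdigest()
        for i in range(len(lines) - size + 1)
    ]


original = [
    "int i;",
    "int total = 0;",
    "for (i = 0; i < n; i++)",
    "total += v[i];",
    "return total;",
    "}",
]
edited = list(original)
edited[0] = "int j;"  # change a single letter on the first line only

before = line_window_digests(original)
after = line_window_digests(edited)
unchanged = sum(1 for b, a in zip(before, after) if b == a)
print(unchanged, "of", len(before), "window hashes unchanged")
```

Under these assumptions, a one-letter edit changes only the windows that contain the edited line. A rename applied to every line would change every window, so the concern holds for systematic edits unless the tool normalizes the text first.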

  2. Read the article by Bananenrepublik · Score: 1, Redundant

    From ESR's README:
    Besides the production C code, the distribution also includes working Python versions. These were used to prototype the concept.
    So the answer to your question is yes and no.