Slashdot Mirror


Tracking Code to Its Origins?

openbear writes "While doing a code review for a closed-source project at work I came across a few files that were stolen from an open-source project. The individual who did this was dumb enough to leave the original license in one of the files, but smart enough to remove all traces of where the code came from. He has since quit the organization, so we (the developers) can't get to him to find out where he got this code. Now management wants us to ship the product as is (with the stolen code intact) because we can't point to the original source of his questionable code. A few of us scoured SourceForge and several Apache projects but couldn't find anything matching. My question is: what is the best way to track down where this code originated? Is there an organization that would help? A tool? A website?"

9 of 59 comments

  1. Tried Google? by rtaylor · · Score: 5, Informative

    Find a line or 2 of code that look non-standard.

    Run them through Google, Google Groups, etc. If it's from a popular project, it will have web-based CVS, and Google will have sucked up the source.

    Other than that, I really don't know.
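One way to choose "a line or 2" worth searching for is sketched below. The class name QueryLinePicker and its scoring heuristic (prefer long lines, especially ones with string literals, and skip imports, comments, and short lines) are assumptions made for illustration, not an established tool.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: ranks source lines by how "searchable" they look.
final class QueryLinePicker {

    // Crude distinctiveness score. Short lines and common boilerplate score
    // zero; longer lines score higher, with a bonus for string literals.
    static int score(String line) {
        String t = line.trim();
        if (t.length() < 20) return 0;
        if (t.startsWith("import ") || t.startsWith("package ")) return 0;
        if (t.startsWith("//") || t.startsWith("/*") || t.startsWith("*")) return 0;
        int s = t.length();
        if (t.contains("\"")) s += 20;
        return s;
    }

    // Returns up to n trimmed lines, highest score first.
    static List<String> pick(List<String> lines, int n) {
        List<String> candidates = new ArrayList<>();
        for (String line : lines) {
            if (score(line) > 0) candidates.add(line.trim());
        }
        candidates.sort((a, b) -> Integer.compare(score(b), score(a)));
        return new ArrayList<>(candidates.subList(0, Math.min(n, candidates.size())));
    }
}
```

Each returned line can then be pasted into a search engine as an exact phrase. Renamed variables would defeat this, so it is only a first pass.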

    --
    Rod Taylor
  2. This is a first... by infonography · · Score: 2, Informative

    Someone at Micro$oft actually admitting to stealing code? (Kidding.) But seriously, if you could tell us in very rough detail what the code does, we might be able to help. You already told us it's a web app (Apache sites?). You'll still get the kudos for trying to be a sport about it, without violating your NDA.

    --
    Sorry about the writing. Robot fingers, you know? Cliff Steele in DOOM PATROL #23
    1. Re:This is a first... by openbear · · Score: 2, Informative

      The code that he forgot to remove the original comments from was doing base64 encoding/decoding. It was Java code (a class named Base64) with only the following two methods:

      public static String encode(String data)
      public static String decode(String data)

      Most implementations of base64 that I have seen use byte arrays instead of Strings. I have tried searching Google using the filename "Base64.java" and the various method signatures, but no luck. The original stolen code is dated (in the comments he forgot to remove) from about two years ago. This is probably why I can't find it on Google or SourceForge.

      I realize that this isn't much to go on, but like you stated, I don't want to violate the NDA and lose my job.
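For comparison, the two API shapes mentioned above can be put side by side. This is a minimal sketch using java.util.Base64, which was added in Java 8 and so postdates this thread. The wrapper class Base64Shapes is hypothetical, and ISO-8859-1 is an assumed charset: a String-only API has to pick one, whereas a byte-array API leaves that choice to the caller.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical wrapper contrasting byte-array and String signatures.
final class Base64Shapes {

    // Byte-array shape: what most Base64 libraries expose.
    static byte[] encode(byte[] data) {
        return Base64.getEncoder().encode(data);
    }

    static byte[] decode(byte[] data) {
        return Base64.getDecoder().decode(data);
    }

    // String shape, matching the two signatures quoted above.
    // The charset used to turn characters into bytes is an assumption.
    static String encode(String data) {
        byte[] raw = data.getBytes(StandardCharsets.ISO_8859_1);
        return new String(encode(raw), StandardCharsets.US_ASCII);
    }

    static String decode(String data) {
        byte[] raw = decode(data.getBytes(StandardCharsets.US_ASCII));
        return new String(raw, StandardCharsets.ISO_8859_1);
    }
}
```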

  3. Re:Try www.google.com by kilrogg · · Score: 2, Informative
    Except it's not as easy as just feeding in the file and saying "find it", partly because Google only allows you to feed in a few search terms and partly because it sounds like the files have been modified from their original form.

    Assuming the code hasn't been too modified, he can try searching for function or variable names.

    Another problem is that it's very likely that the source files will only be stored within tarballs,

    True, but many open-source projects have HTML front ends to their CVS trees, and Google sometimes indexes these. The same goes for mailing list archives: they'll sometimes contain patches or discussions of the code that include parts of it.
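Since the files may have been modified, an exact match against a candidate found in a tarball, a web CVS page, or a mailing-list patch is unlikely. A comparison that ignores whitespace and comments can still show how much of the suspect file survives. The following is a rough sketch; the class SourceOverlap and its normalization rules are assumptions, not a standard tool, and the comment stripping is naive (it would mangle string literals that contain "//" or "/*").

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: line-level overlap between two source files.
final class SourceOverlap {

    // Removes block and line comments, then all whitespace within each line,
    // so reindented or recommented copies yield the same set of lines.
    static Set<String> normalizedLines(String source) {
        String noBlock = source.replaceAll("(?s)/\\*.*?\\*/", "");
        Set<String> out = new HashSet<>();
        for (String line : noBlock.split("\\R")) {
            String t = line.replaceAll("//.*$", "").replaceAll("\\s+", "");
            if (t.length() > 3) out.add(t); // skip bare braces and similar
        }
        return out;
    }

    // Fraction of the suspect's normalized lines also found in the candidate.
    static double overlap(String suspect, String candidate) {
        Set<String> a = normalizedLines(suspect);
        if (a.isEmpty()) return 0.0;
        Set<String> b = normalizedLines(candidate);
        int shared = 0;
        for (String line : a) {
            if (b.contains(line)) shared++;
        }
        return (double) shared / a.size();
    }
}
```

A score near 1.0 suggests the candidate is worth a closer manual look; renamed identifiers would lower the score, so a low value proves nothing.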

  4. Re:what about rewriting the code? by openbear · · Score: 3, Informative

    Yes the code could be rewritten, but the project is at the stage where it takes a show-stopping** bug or management approval to modify any code. The next version of this product will NOT have the questionable code in it, but there will still be customers running this version (with the stolen code) for about a year or so.

    ** And by show-stopping bug, I mean broken core functionality or something deemed important by management.

  5. Re:what about rewriting the code? by little_fluffy_clouds · · Score: 3, Informative

    ** And by show-stopping bug, I mean broken core functionality or something deemed important by management.

    I call getting the pants sued off you something "deemed important by management".

    Several of you fucked up: this code got into the project without anyone checking where it came from or who wrote it. Now rewrite, reintegrate, and retest, and remember this lesson.

    --
    What were the skies like when you were young?
  6. link to the code. by gonar · · Score: 3, Informative

    http://java.sun.com/j2se/1.4/docs/api/java/net/URLEncoder.html
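Note that the class behind this link, java.net.URLEncoder, performs application/x-www-form-urlencoded (percent) encoding rather than Base64. A minimal sketch of the difference follows. The class name EncodingComparison is hypothetical, and java.util.Base64 (Java 8 and later, so not available when this thread was written) is used only as a reference encoder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper showing the two encodings on the same input.
final class EncodingComparison {

    // Percent-encoding as used in HTML form submissions and query strings.
    static String urlEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Base64 of the UTF-8 bytes of the same string.
    static String base64(String s) {
        return Base64.getEncoder().encodeToString(s.getBytes(StandardCharsets.UTF_8));
    }
}
```

For the input "a b&c", urlEncode gives "a+b%26c" while base64 gives "YSBiJmM=".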

    --
    The difference between Theory and Practice is greater in Practice than in Theory.
  7. Here are two methods ... by openbear · · Score: 3, Informative

    Ok, I thought about it a bit and I think I can post some of the source without violating my NDA. Here are two methods from code that I know is stolen. It is only doing Base 64 encoding and decoding so it is not giving away any company secrets. I removed all comments and package names so it is just the bare code. If anyone can locate the origins please reply to this post. Remember this particular code is dated about two years old. Thanks to all of those who put effort into giving ideas and opinions. I still haven't been able to locate the origins of this code, so if nothing more comes out of this last post then I suppose I will just accept the fact that sometimes sleazy people get away with thievery and walk away without a care. Thanks again.

    public class Base64 {
        public static String encode(String data) {
            int c;
            StringBuffer ret = new StringBuffer();
            try {
                byte[] arr = data.getBytes("iso-8859-1");
                int len = arr.length;
                for (int i = 0; i < len; ++i) {
                    c = (arr[i] >> 2) & 0x3f;
                    ret.append(cvt.charAt(c));
                    c = (arr[i] << 4) & 0x3f;
                    if (++i < len)
                        c |= (arr[i] >> 4) & 0x3f;
                    ret.append(cvt.charAt(c));
                    if (i < len) {
                        c = (arr[i] << 2) & 0x3f;
                        if (++i < len)
                            c |= (arr[i] >> 6) & 0x3f;
                        ret.append(cvt.charAt(c));
                    } else {
                        ++i;
                        ret.append((char) fillchar);
                    }
                    if (i < len) {
                        c = arr[i] & 0x3f;
                        ret.append(cvt.charAt(c));
                    } else {
                        ret.append((char) fillchar);
                    }
                }
            } catch (Exception e) {}
            return(ret.toString());
        }
        public static String decode(String data) {
            int c;
            int c1;
            StringBuffer ret = new StringBuffer();
            byte[] arr = data.getBytes();
            int len = arr.length;
            for (int i = 0; i < len; ++i) {
                c = cvt.indexOf(arr[i]);
                ++i;
                c1 = cvt.indexOf(arr[i]);
                c = ((c << 2) | ((c1 >> 4) & 0x3));
                ret.append((char) c);
                if (++i < len) {
                    c = arr[i];
                    if (fillchar == c)
                        break;
                    c = cvt.indexOf((char) c);
                    c1 = ((c1 << 4) & 0xf0) | ((c >> 2) & 0xf);
                    ret.append((char) c1);
                }
                if (++i < len) {
                    c1 = arr[i];
                    if (fillchar == c1)
                        break;
                    c1 = cvt.indexOf((char) c1);
                    c = ((c << 6) & 0xc0) | c1;
                    ret.append((char) c);
                }
            }
            return(ret.toString());
        }
        private static final int fillchar = '=';
        private static final String cvt = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            + "abcdefghijklmnopqrstuvwxyz"
            + "0123456789+/";
    }
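One thing worth checking in code of this shape is how Java's signed byte type interacts with the right shifts. The sketch below isolates an expression of the same form as the one in the posted encode method and compares it with a variant that masks to 8 bits first. The class ShiftCheck and its method names are hypothetical, and the concern only applies under the assumption that the input can contain bytes of 0x80 or above (for example accented ISO-8859-1 characters).

```java
// Hypothetical helper isolating the shift-and-mask step.
final class ShiftCheck {

    // Same form as in the posted encode(): an arithmetic shift on a signed
    // byte followed by a 6-bit mask. When b is negative, two sign-extension
    // bits survive the mask.
    static int highNibbleAsPosted(byte b) {
        return (b >> 4) & 0x3f;
    }

    // Masking to 8 bits before shifting discards the sign extension.
    static int highNibbleUnsigned(byte b) {
        return ((b & 0xff) >> 4) & 0x3f;
    }
}
```

For the byte 0xE9 the first method yields 0x3E and the second 0x0E; for ASCII bytes such as 0x41 both yield 4, which may be why a routine like this appears to work on plain-ASCII input.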