Slashdot Mirror


Can Watermarking Help Find GPL Violations?

bitkid writes "I recently ran across techniques that can be used to watermark program code. While I have yet to see some source code for this to play with, the authors claim that the watermarks can be introduced into the source code and can be found in the compiled executable. My question for the Slashdot crowd is: Do you think free software (GPL or other viral licenses) should be watermarked? This could help to find GPL violations (think Everybuddy or Linksys) or could be used in court someday against the next SCO to prove authorship. What might be the ramifications of this?"

8 of 265 comments

  1. Just an extra step by Moeses · · Score: 3, Interesting

    I think this would only help against the most blatant copying. If the watermark code is embedded in the data structures of the source code, either it would be fairly easy to remove or the software would end up in such a state that it would be hard to maintain and evolve. The attempt to avoid piracy would have a negative long-term effect on the project.

    I can still see this being useful if blatant copying of the software is the biggest problem the project faces; however, I'm having trouble envisioning a scenario where that's the case.

  2. It's for Java and it's a binary watermark, not source by Anonymous Coward · · Score: 3, Interesting

    Caveat - I haven't read the paper, but from the description it looks like you apply your watermark to the class files after compilation.

    So,
    1) It only protects binaries, not source ... so in its current form it is not applicable to source code, which would be required for it to be of any use for the GPL.

    2) It's for Java, which is easier because of the canonical form (bytecode) that the watermarking tool can manipulate. You could probably do this to protect GPL binaries, but with less portability.

    IMHO, not useful for source, but sure, if you're worried that some of your precompiled binaries are being ripped, then maybe.

    For source, you need to detect common code patterns and use source tools that have been discussed elsewhere on /.

  3. Re:Useful, but easy to get around. by LostCluster · · Score: 2, Interesting

    You may be on to something... create an organization that accepts code and stores it with a datestamp forevermore. No need for random-access hard disk, just archive the material to tapes, CDs, or DVDs and properly maintain them. If your ownership of the code is ever called into question, you can obtain 3rd party proof that the code was on that time period's record, proving you had the code as of that time.
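    A rough sketch of what such an archive's record might look like. Everything here is hypothetical (the record fields, the function names, the example author), and it swaps in one variation: it stores a SHA-256 digest of the code rather than the code itself, whereas the comment above proposes archiving the material directly. The third-party proof then amounts to showing that the tree you produce today hashes to the digest written down back then.

```python
import hashlib
import json
from datetime import datetime, timezone


def digest_source(files: dict[str, bytes]) -> str:
    """Return one SHA-256 digest covering every file name and its contents."""
    h = hashlib.sha256()
    for name in sorted(files):  # sorted so the digest does not depend on dict order
        data = files[name]
        h.update(name.encode("utf-8") + b"\0")
        h.update(len(data).to_bytes(8, "big"))  # length prefix keeps file boundaries unambiguous
        h.update(data)
    return h.hexdigest()


def make_record(author: str, files: dict[str, bytes], when: datetime) -> str:
    """Build the JSON record a (hypothetical) archive would write to tape or disc."""
    record = {
        "author": author,
        "sha256": digest_source(files),
        "received_utc": when.astimezone(timezone.utc).isoformat(),
    }
    return json.dumps(record, sort_keys=True)


def matches_record(record_json: str, files: dict[str, bytes]) -> bool:
    """Check whether a source tree produced later is the one that was deposited."""
    return json.loads(record_json)["sha256"] == digest_source(files)
```

    The record is only as trustworthy as the party keeping it; the sketch says nothing about how the archive proves its own datestamps.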

  4. Pointless. by Clinoti · · Score: 2, Interesting

    Does this not go against what open source is all about? It's open code given without the extremes of ownership like water/copy/trade-marks. Where does this apply with CVS and open project developments?

    I like the idea behind it, but I don't think it's the answer. It would be easier and more applicable to have a 3rd party database that held published code, rather than having to graph and mark my work every time I released, etc. This way I (1) have it in the public domain and (2) have a published reference for it. (For smaller works.)

    And borrowing code, despite our hatred for it, is one of the tools of software development: not so much word-for-word copying and ctrl-V (that's a whole separate discussion) but capturing the methods and innovating on them, then re-releasing the result into the wild for the next innovator or janitorial white hat. That's what open source coding is for me anyway: not the profit or the credit but the goal.

    --

    Let's keep in mind that patents are in place to keep lawyers employed and keep them litigating. -CatGrep

  5. Not easy -- story submitter is confused by 0x0d0a · · Score: 5, Interesting

    Look at the techniques. This stuff is designed for use on binary-only software (with the sole exception of the comment embedding, which is easy to strip, and the embedded strings, which are easy to remove/modify).

    The approaches they're talking about are done at the compilation phase or post-compilation on Java bytecode.

    It's *extremely* difficult to produce good, reliable watermarks, because different compilers will build software differently, as will different optimization options.

    I'd essentially say that source-based watermarks are a lost cause (at least with C, and with the current constraints of readability and simplicity on code).

    A much better approach would be a project that does fuzzy comparisons on binaries, and is somewhat aware of ELF. Basically, you'd have a program that would have a set of known GPL code (a compiled Linux system would work well) and compare it to a set of compiled code.
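    A minimal sketch of the fuzzy-comparison idea, under heavy assumptions: it treats both binaries as raw byte strings and measures how many fixed-size byte windows they share (Jaccard similarity). It is not ELF-aware, and the function names and the window size of 8 are arbitrary choices for illustration, not anything from the linked material.

```python
def ngrams(data: bytes, n: int = 8) -> set[bytes]:
    """Return the set of all n-byte windows that occur in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def similarity(a: bytes, b: bytes, n: int = 8) -> float:
    """Jaccard similarity of the two n-gram sets, from 0.0 to 1.0."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:
        return 0.0  # at least one input is shorter than one window
    return len(ga & gb) / len(ga | gb)


# Usage: known GPL code embedded in a larger blob still scores high.
known = bytes(range(256)) * 4
suspect = b"\x90" * 64 + known + b"\x90" * 64
print(similarity(known, suspect))
```

    Raw byte windows only survive if the same compiler and options were used, which is exactly the weakness a different compiler would exploit.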

    This is still not perfect if the person is malicious and just tries using a different compiler. This has happened before with xvid and use of icc. However, there aren't *too* many compilers out there.

    Hmm...this is an interesting problem.

    A more interesting approach that just occurs to me now -- in general, the proportions of compiled code should be roughly the same, independent of compiler -- adding padding, etc. Generate a call graph of the function tree in a set of GPL code. Then your checker would do fuzzy matching on chunks of that call graph against the suspicious code. It'd take a bit of massaging. It'd also still need some manual looking at the target once identified. However, this should be able to run in a pretty automated manner (even if it takes a long time to run) and could potentially turn up some interesting goodies. It'd certainly discourage commercial folks from ripping off GPL-using authors and companies.
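    One way the call-graph matching could be sketched, assuming the graphs have already been extracted somehow (that extraction is the hard part and is not shown). Each graph is a plain dict from function to callees; since a stripped binary has no names, each function is described only by its fan-out and the fan-outs of its callees. The signature scheme and all names here are made up for illustration.

```python
from collections import Counter


def shape_signatures(graph: dict[str, list[str]]) -> Counter:
    """Describe each function by its fan-out and the fan-outs of its callees.

    Function names are ignored on purpose, since a stripped binary has none.
    """
    def fan_out(fn: str) -> int:
        return len(graph.get(fn, []))

    sigs = Counter()
    for fn, callees in graph.items():
        sigs[(len(callees), tuple(sorted(fan_out(c) for c in callees)))] += 1
    return sigs


def graph_overlap(known: dict[str, list[str]], suspect: dict[str, list[str]]) -> float:
    """Fraction of the known graph's signatures that also occur in the suspect graph."""
    k, s = shape_signatures(known), shape_signatures(suspect)
    total = sum(k.values())
    if total == 0:
        return 0.0
    return sum((k & s).values()) / total  # Counter & Counter keeps the minimum counts


# Usage: the same code with renamed functions and an extra GUI layer on top.
known = {"parse": ["lex", "emit"], "lex": ["read"], "emit": [], "read": []}
suspect = {"f1": ["f2", "f3"], "f2": ["f4"], "f3": [], "f4": [],
           "gui": ["f1", "draw"], "draw": []}
print(graph_overlap(known, suspect))
```

    Leaf functions all look alike under this signature, so unrelated programs still score well above zero; a real checker would need richer signatures and, as noted above, a manual look at anything it flags.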

    Try taking a Windows system with a lot of installed (non-GPL) software and a Linux system with a lot of (GPL) installed software. Start a comparison running. See what turns up.

  6. Is this a big problem? by NanoGator · · Score: 2, Interesting

    Pardon my naivete. I just wanted to ask, are GPL violations a big problem?

    If it's happening all the time and this is a method to slow the progress of it, then I don't see a huge issue with it. But if it is a once in a while type of thing, then how could this have anything but a negative impact on GPL? The potential is there (reality could tell a different story) for people to shy away from it, worrying that they haven't quite got all their ducks in a row. If it's easy to automatically scan their code and say they're in violation, well then what? I guess what I'm trying to say is that it could be mishandled, thus treating the users of GPL code like they're potentially thieves. It strikes me that one of the compelling factors of GPL is their reliance on the honor system. Whatever you do, don't play games that can damage that bright point of GPL.

    Maybe I'm looking at this the wrong way. I suppose it could be used to defend against an accusation not unlike what SCO has claimed. "You copied our code!" "No, we used GPL'd code, see?" In that case, my previous comment about disrupting GPL's trust might not be as likely. "Well, we're just doing it so that this sort of thing doesn't happen again." I can see people nodding their head in agreement in that case.

    In short, it's one thing to do it if your aim is to defend yourself from SCO-esque accusations, it's another to use it to look for victims to sue. Whatever is implemented, be very careful about damaging GPL's image to the community that values it.

    --
    "Derp de derp."

  7. Not possible with open source by HoleNdaBitBucket · · Score: 3, Interesting

    Read the presentation. Although complete sentences aren't exactly present, there seems to be the indication that access to the source can provide an attack on the watermarking scheme: well, duh, if it's open source just modify the source to eliminate the watermark.

    But what's the likelihood a lazy company/individual will actually do this before violating the GPL? Probably slim, but more of the world seems to be going GPL anyway; and if the whole world did GPL, why would you need watermarks?

    Point is: if the monopolies of the world insist on using GPL code without releasing the source, they'll expend the effort to remove the watermark.

  8. ubiquitous GPL code == BAD? by natron8080 · · Score: 4, Interesting

    Ok, assume a corporation CAN successfully steal GPL code, with or without watermark. Let's say M$ paints an IE browser look on top of the Mozilla Firebird codebase:

    1. Is it a bad thing that their software just got better, faster, and more standards compliant?
    2. Doesn't this even out the playing field, as far as proprietary technology goes? Everyone starts at 0.
    3. The Mozilla developers would have real speed/memory/feature competition from M$, as opposed to the "we'll never touch IE code again" stance of M$.
    4. More company coders would be familiar with and able to develop on open source projects in their spare time (or convert even!).
    5. GPL projects aren't really in competition with corporate firms. GPL software doesn't lose profit margins if there's better software out there.

    So aside from ethical issues, why should the GPL community really care?