Slashdot Mirror


Can Watermarking Help Find GPL Violations?

bitkid writes "I recently ran across techniques that can be used to watermark program code. While I have yet to see any source code for this to play with, the authors claim that the watermarks can be introduced into the source code and can be found in the compiled executable. My question for the Slashdot crowd is: Do you think free software (GPL or other viral licenses) should be watermarked? This could help to find GPL violations (think Everybuddy or Linksys) or could be used in court someday against the next SCO to prove authorship. What might be the ramifications of this?"

15 of 265 comments

  1. details about watermarking techniques by gripdamage · · Score: 5, Informative

    The paper cited in the first link is from a professor I once had.

    On his website I found his full article, if you want some details about watermarking techniques. It has a lot more meat than the presentation slides.

    1. Re:details about watermarking techniques by Theoria · · Score: 4, Informative

      The original poster made a comment about never getting to play with watermarking code. Along with some informative papers about software watermarking, obfuscation, tamperproofing, etc., and uses for such techniques, there is an implementation on the SandMark website.

  2. Re:Watermark? by Naerbnic · · Score: 5, Informative

    Perhaps this is true for static data (as in a bunch of source code), but you can insert a watermark into code that creates a dynamic watermark (i.e. something that depends on the runtime operation of the program). To make a long story short, you cannot easily remove it by rearranging binary code, and it's difficult (i.e. NP-complete, for those in the know) to analyze the software well enough to remove it. Tack on the fact that you can tamperproof the code (i.e. make the behavior of the program depend on the existence of the watermark), and you have a pretty difficult path to walk if you want to remove it.

    More info can be found in this paper, if you're into reading that sort of thing.

    --


    So there I was, juggling apples and small animals, when I accidentally bit into the wrong one...
  3. Ummm... by praedor · · Score: 0, Informative

    Couldn't the watermark be very easily defeated simply by copy-pasting the code text into a new file and recompiling? You could also simply manually copy word for word the code anew and, poof!, no watermark.

    --
    In Bushworld, they struggle to keep church and state separate in Iraq as they increasingly merge the two in America.
  4. Re:Watermark? by NicenessHimself · · Score: 2, Informative

    Golly, I had no idea asymmetric cryptography was involved!

    I envisage a 'watermarker' as being some program you run your app through and it records a signature, which you can treat as a 'fingerprint'. You can then run that watermarked program through a checker, and it will tell you how close (100%) the match is?

    There are commercial programs which translate binary applications from one instruction set to another, sometimes as a simulator, sometimes outputting a compiled program.

    A program is just a flowchart. It can be translated into any other equivalent form quite easily.

    I imagine that translating an x86 binary to ppc and then back again using a different off the shelf tool would be pretty effective.

    Things like 'make the app rely on the logic!' type hints are used by shareware authors all the time.. and they are routinely cracked.

  5. RTFA by NivekEnterprises · · Score: 2, Informative
    OK, I'll make it easy on all of you. Here is the article:


    A Practical Method for Watermarking Java Programs

    Akito Monden, Hajimu Iida, Ken-ichi Matsumoto, Koji Torii, Nara Institute of Science and Technology
    Katsuro Inoue, Osaka University

    Java programs distributed through the Internet are now suffering from program theft. This is because Java programs can be easily decomposed into reusable class files and even decompiled into source code by program users. In this paper, we propose a practical method that discourages program theft by embedding Java programs with a digital watermark. Embedding a program developer's copyright notation as a watermark in Java class files will ensure the legal ownership of class files. Our embedding method is indiscernible by program users, yet enables us to identify an illegal program that contains stolen class files. The result of the experiment to evaluate our method showed most of the watermarks (20 out of 23) embedded in class files survived two kinds of attacks that attempt to erase watermarks: an obfuscator attack, and a decompile-recompile attack.
  6. Re:Useful, but easy to get around. by kasperd · · Score: 5, Informative

    randomising white space, replacing variable names

    Those are things that cannot be seen in the resulting executable, yet the watermark is claimed to be found even in the resulting executable. (Yes, I know in some cases variable names can be visible in the executable, but you can easily prevent that.) I somehow doubt this watermarking is possible at all. With optimizing compilers it is hard to find any resemblance between source and executable. Finally, knowing how the watermarks are made in the code, it is probably easy to write another, slightly similar algorithm that will remove the watermark.
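
As a rough illustration of the first point, here is a minimal Python sketch. Using CPython bytecode as a stand-in for "the resulting executable" is an assumption made only for this example, and `compile_function` and `area` are hypothetical names. It compiles two versions of a function that differ only in whitespace and variable names, then compares the instruction bytes.

```python
def compile_function(source, name):
    """Compile a source string and return the named function object."""
    namespace = {}
    exec(source, namespace)
    return namespace[name]

# Same logic, different whitespace and different variable names.
original = compile_function(
    "def area(width, height):\n    return width * height\n", "area")
renamed = compile_function(
    "def area( a,b ):\n\n        return a   *   b\n", "area")

# Locals are addressed by index, so the instruction bytes do not change.
same_instructions = original.__code__.co_code == renamed.__code__.co_code

# CPython still stores the names in a separate table, which matches the
# caveat that variable names can sometimes be visible in the output.
names_differ = original.__code__.co_varnames != renamed.__code__.co_varnames

print(same_instructions, names_differ)
```

A mark that lives only in spacing or local names would not show up in the instruction stream here, so anything claimed to survive compilation has to be carried by the instructions themselves.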

    --

    Do you care about the security of your wireless mouse?
  7. Re:as usual by dspeyer · · Score: 4, Informative
    From the GPL (section 3):
    You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following:
    * a) Accompany it with the complete corresponding machine-readable source code,
    ...
    The source code for a work means the preferred form of the work for making modifications to it. (emphasis added)

    So, unless you plan to do maintenance on obfuscated code, this is no good for GPL software. In fact, it's no good for Open Source software of any kind.

    Admittedly, you could use unobfuscated code and refuse to reveal the watermark, but it's kind of tricky to keep things secret in the OSS world.

  8. Re:Not easy -- story submitter is confused by sICE · · Score: 4, Informative

    It's not that fuzzy - i mean you seem to look like you know what all this stuff is about, and no offense is intended here - but, sadly, you underestimate the power of modern cracking and reverse engineering tools you have at your disposal.

    Even with compiler optimizations and processor-specific instructions AND EVEN different compilers, you can actually find and detect "similar HLL code" (there's a tool called DATING that can do that - contact me for a copy, it's hard to find - and its name is a pun on IDA's FLIRT feature). I don't know about different CPUs, but I guess it would be resource-hungry, and I don't know of a tool that can catch those for now. Try having a look at the VMWARE binaries - win32/linux - with it anyway; you'd probably be surprised.

    blah, dunno what i wanted to say next it's late here... ~<:(

  9. Such an organization already exists by Anonymous Coward · · Score: 1, Informative

    It's called the US Copyright Office.

    You deposit your code with the Copyright Office. It costs a nominal amount of money ($20 IIRC). At a later date, the copyright holder can obtain a certified copy from the Copyright Office, with a certificate that says what day it was filed. This can be used as legal evidence.

    1. Re:Such an organization already exists by yerricde · · Score: 2, Informative

      The UK doesn't have a Copyright Office. Neither does Australia. I guess they'd have to use their traditional notary channels.

      --
      Will I retire or break 10K?
  10. Re:Useful, but easy to get around. by sholden · · Score: 2, Informative

    Or you could just register the copyright and use the existing institution (or the equivalent in your jurisdiction) that has been doing that task since before computers were invented.

    Of course registering every cvs checkin is going to get expensive :)

  11. UK Method by Gonoff · · Score: 3, Informative

    Put a copy in an envelope - printed or CD, whatever you like. Post it to your solicitor and have them put it in their safe unopened.

    Later, when Parasitesoft tries to claim you stole it from them, the solicitor can produce this as legally acceptable evidence of its date of existence.

    --
    I'll see your Constitution and raise you a Queen.
    1. Re:UK Method by malthusan · · Score: 3, Informative

      Address it on the backside of the envelope (the side with the flap) and place the postage over the flap once it's sealed. When the post office postmarks it, the stamp will cross the flap onto the envelope. The intact postage and postmark serves to show the envelope hasn't been opened since it was posted.

      I do this with my own writing (that is, I post it to myself) so I have the means to prove creation date should it ever become an issue.

  12. How does this help GPL? by scdeimos · · Score: 5, Informative

    Having read the .PDF paper and then skimmed the /. comments it would seem few people have taken the time to actually read (or understand) the paper before commenting on it. Hats-off to those who have.

    What is the essence of this watermarking technique?:
    - For embedding copyright information into individual .class files, as opposed to signing .cab's for whole Java apps/applets.
    - It modifies compiled Java bytecode, shuffling eight bytecode operators in targeted "dummy" class methods. The shuffling is able to encode only three bits per operation, so watermarks need to be short or dummy methods need to be large.
    - It relies on the watermarked dummy method(s) appearing in stolen (decompiled/recompiled) .class, which is achieved by pretending to call the dummy method(s) from other methods using always-false logic constructs.
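
The three-bits-per-operator figure follows from having eight interchangeable operators, since log2(8) = 3. Below is a minimal Python sketch of that encoding idea. The operator list, the `embed` and `extract` names and the bit packing are assumptions made for illustration, not the paper's actual tables or tool.

```python
# Hypothetical group of eight interchangeable operators (illustrative only).
OPERATORS = ["iadd", "isub", "imul", "idiv", "irem", "iand", "ior", "ixor"]
BITS_PER_OP = 3  # log2(8): one of eight choices carries three bits

def embed(message):
    """Turn a byte string into a sequence of operators, three bits each."""
    bits = "".join(f"{byte:08b}" for byte in message)
    bits += "0" * (-len(bits) % BITS_PER_OP)  # pad to a multiple of three
    return [OPERATORS[int(bits[i:i + BITS_PER_OP], 2)]
            for i in range(0, len(bits), BITS_PER_OP)]

def extract(ops, length):
    """Recover `length` bytes from a sequence of operators."""
    bits = "".join(f"{OPERATORS.index(op):03b}" for op in ops)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, length * 8, 8))

notice = b"(C) ACME"
ops = embed(notice)
print(len(ops), extract(ops, len(notice)))
```

Even this 8-byte notice needs 22 operator slots, which is why the watermark has to be short or the dummy method has to be large.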

    What are its downfalls?:
    - The technique is specific to Java. Forget about using it for other languages which output platform-specific machine code binaries, although it might be possible to modify it for use in .NET and other bytecode environments.
    - If an intelligent thief (or smart optimizing compiler) is able to detect the always-false condition used to shield the dummy method(s) the watermark(s) will be removed.
    - The larger your watermark, the larger you need to make your dummy method(s), or you need to embed more of them. The larger you make your dummy methods, the more obvious it will be that there's something strange about them.
    - Optimizing compilers could still destroy the modified operators used to form the watermarks.
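
To make the always-false weakness concrete, here is a minimal Python sketch of a dead-code pass, using Python's `ast` module as a stand-in for a Java optimizer. The guard is a literal `False` and the names `real_work` and `dummy_watermark_method` are hypothetical; a real always-false construct would be harder to recognise than this.

```python
import ast

SOURCE = """
def real_work(x):
    if False:
        dummy_watermark_method()
    return x + 1
"""

class DropConstantFalse(ast.NodeTransformer):
    """Remove `if` statements whose condition is the constant False."""

    def visit_If(self, node):
        self.generic_visit(node)
        if isinstance(node.test, ast.Constant) and node.test.value is False:
            # Keep the else-branch if there is one, otherwise drop the node.
            return node.orelse or None
        return node

tree = DropConstantFalse().visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)
optimized = ast.unparse(tree)
print(optimized)
```

Once the guarded call is gone nothing references the dummy method any more, so a later pass that strips unused methods would take the watermark with it.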

    The paper also claims it protected more .class files from decompile/recompile attacks than *I* feel it should have: five of the ten .class files crashed their test decompiler (Mocha), thereby "protecting" their watermarks. If someone is keen to re-source your .class file, particularly if there's money to be made, I'm fairly certain they'd try another decompiler instead of giving up after just one crash. I suspect that these five .class files could be decompiled by another utility, so the question of their watermark protection remains unanswered. Potentially this could result in up to 18 (instead of 3) of their 23 watermarks actually being defeated. This is entirely feasible, since 3 of the 8 watermarks that were fully tested were defeated (the other 15 being in the five .class files which crashed Mocha).
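
The arithmetic behind the 18 figure can be spelled out in a few lines of Python. The numbers are the ones quoted above and in the abstract; the variable names are just for illustration.

```python
total_watermarks = 23
reported_survivors = 20   # "20 out of 23" from the abstract
in_crashed_files = 15     # watermarks in the five files that crashed Mocha

reported_defeated = total_watermarks - reported_survivors   # 3
fully_tested = total_watermarks - in_crashed_files          # 8

# Worst case: every watermark that was never actually decompiled is lost too.
worst_case_defeated = reported_defeated + in_crashed_files  # 18
worst_case_survivors = total_watermarks - worst_case_defeated

print(fully_tested, worst_case_defeated, worst_case_survivors)
```

In that worst case only 5 of the 23 watermarks survive, rather than the 20 reported.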

    How does this technique benefit GPL? I'm not sure that it would. Even if the above problems were fixed:
    - To submit "source code" for your protected .class, you'd have to compile it, watermark it, decompile it and then post the decompiled version. Not very pretty and what about comments? I suppose you could have a Perl script reinsert comments from the original source, or copy-and-paste the watermarked dummy methods back in.
    - It's really designed to embed personal/corporate copyrights into code, protecting the IP of the submitter, not the GPL community. I suppose the GPL community could design a community-wide watermark policy, but then that would become public knowledge, so thieves would be aware of its existence and be inclined to search harder to remove it.