Slashdot Mirror


Debian Working on Reproducible Builds To Make Binaries Trustable

An anonymous reader writes: Debian's Jérémy Bobbio, also known as Lunar, spoke at the Chaos Communication Camp about the distribution's efforts to reassert trustworthiness for open source binaries after it was brought into question by various intelligence agencies. Debian is "working to bring reproducible builds to all of its more than 22,000 software packages," and is pushing the rest of the community to do the same. Lunar said, "The idea is to get reasonable confidence that a given binary was indeed produced by the source. We want anyone to be able to produce identical binaries from a given source (PDF)."

Here is Lunar's overview of how this works: "First you need to get the build to output the same bytes for a given version. But others also must be able to set up a close enough build environment with similar enough software to perform the build. And for them to set it up, this environment needs to be specified somehow. Finally, you need to think about how rebuilds are performed and how the results are checked."
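
The last step in that overview, checking the results of a rebuild, can be sketched roughly as follows. This is a minimal illustration only, not Debian's actual tooling: the function names `sha256_of` and `builds_match` are hypothetical, and the artifacts are assumed to be already loaded into memory as bytes.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of one build artifact."""
    return hashlib.sha256(data).hexdigest()


def builds_match(artifacts: list[bytes]) -> bool:
    """True only when all independently built artifacts are byte-identical.

    An empty list yields False, since there is nothing to confirm.
    """
    digests = {sha256_of(artifact) for artifact in artifacts}
    return len(digests) == 1
```

For example, `builds_match([mine, yours])` would be true only if two rebuilders got exactly the same bytes from the same source.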

16 of 130 comments

  1. Seems like a little random build size by Revek · · Score: 2

    Would make it harder for them to exploit.

    1. Re:Seems like a little random build size by Desler · · Score: 2

      But that isn't the point of this. It's how to verify that your binary doesn't have tampered-with source code.

    2. Re:Seems like a little random build size by cold fjord · · Score: 2

      That's a tricky problem.

      Countering "Trusting Trust"

      --
      much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
    3. Re:Seems like a little random build size by Desler · · Score: 3, Insightful

      Yes it is difficult. That's why they are trying to solve the problem.

    4. Re:Seems like a little random build size by MyAlternateID · · Score: 3, Interesting

      But that isn't the point of this. It's how to verify that your binary doesn't have tampered-with source code.

      I care about this, too. That's one reason I run a source-based distribution. It's not the only reason. It's not even the main reason. But it's one reason.

      Anyone who really needs this kind of assurance was probably also building from source. You can do it once on-site, then make your own binary packages and push those to all of your other machines so it's really not bad. I think a much more insidious threat comes from malicious yet innocent-looking source, like what you find in the Underhanded C Contests.

      It doesn't do much good to have a reproducible build of a program when it contains an innocent-looking yet malicious piece of code. Just consider Heartbleed. Whether Heartbleed was intentional or not, it proves that people can run vulnerable code for a very long time before it's found out, and that was a program intended to be secure.

    5. Re:Seems like a little random build size by Bing Tsher E · · Score: 2

      I wonder, though, if an always-connected build machine could have compromised object files pushed onto it mid-build. It's a theoretical risk, but not one that couldn't be accomplished by a determined foe. The build process is well characterized for many projects, so knowing when to push a rogue object file into the build directory wouldn't be that difficult.

      It means the penetrating entity would need to already have access to your system, but 'object pushing' would be a useful technique for escalating security breaches.

  2. Awesome by trawg · · Score: 4, Informative

    I was thinking about this being a problem a while back - how to deal with building something from source and knowing I was getting the same output that the developers wanted me to have. Coincidentally about the same time, this article (http://developers.slashdot.org/story/13/06/20/1548228/are-you-sure-this-is-the-source-code) popped on Slashdot and introduced me to Ken Thompson's article Reflections on Trusting Trust - a great read and something that really opened my eyes (in that wide-open-because-of-terror kind of way).

    Also from that thread came this email from one of the Tor developers talking about their deterministic build process to do the same thing.

    I think this is a problem that would be really great to solve as soon as possible. I very much hope that once we start seeing more reproducible builds we don't suddenly find out that certain compilers have been compromised long ago.

  3. Re:This seems like a job for Virtual Box by wolrahnaes · · Score: 4, Insightful

    On the other hand I don't quite understand why, if one can compile the source, one needs to worry about untrusted binaries. Perhaps the intent here is for some master agency to watch for tinkered binaries or to post its own checksums apart from Debian. Then everyone has two sources for validated checksums.

    Almost right, except without the master agency. This isn't for the incredibly paranoid types who would already be compiling from source. This is for the rest of us, the lazy people who would rather "apt-get install foo" and just assume the distro's doing things right. If the builds are reproducible then eventually someone's going to verify them. If no variations are discovered, the rest of us lazy masses can be a lot more confident that we're not running anything unexpected.

    --
    I used to get high on life, but I developed a tolerance. Now I need something stronger.
  4. Awesome - on trusting trust by trawg · · Score: 4, Interesting

    I was thinking about this being a problem a while back - how to deal with building something from source and knowing I was getting the same output that the developers wanted me to have. Coincidentally about the same time, this article popped on Slashdot and introduced me to Ken Thompson's article Reflections on Trusting Trust - a great read and something that really opened my eyes (in that wide-open-because-of-terror kind of way).

    Also from that thread came this email from one of the Tor developers talking about their deterministic build process to do the same thing.

    I think this is a problem that would be really great to solve as soon as possible. I very much hope that once we start seeing more reproducible builds we don't suddenly find out that certain compilers have been compromised long ago.

  5. Re:Build timestamps mess this up by wolrahnaes · · Score: 4, Informative

    Pages 6 and 7 of the PDF linked cover time-related issues and basically agree: anything that builds the time/date into the binary is a problem that needs to be fixed.

    Git revision on the other hand is a recommended solution, since it points at a specific state of the code and will always be the same if the code is unchanged.
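
A rough sketch of the timestamp point, assuming a build step that packs files into a tar archive. The helper `deterministic_tar` is hypothetical; the idea is only that every file's mtime is pinned to a caller-supplied value (for instance one derived from the commit being built, or from the `SOURCE_DATE_EPOCH` environment variable used by the reproducible-builds effort) instead of the wall clock, and that entries are written in sorted order.

```python
import io
import tarfile


def deterministic_tar(files: dict[str, bytes], epoch: int) -> bytes:
    """Pack files into an uncompressed tar whose bytes depend only on the inputs."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):          # fixed ordering, not dict/filesystem order
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = epoch              # pinned timestamp, never "now"
            info.mode = 0o644
            info.uid = info.gid = 0         # no builder-specific ownership
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()
```

Two people calling this with the same files and the same `epoch` should get identical bytes; change the timestamp and the archive changes, which is exactly why an embedded build date breaks reproducibility.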

    --
    I used to get high on life, but I developed a tolerance. Now I need something stronger.
  6. Re:This seems like a job for Virtual Box by Anonymous Coward · · Score: 2, Informative

    From the article, the issue was that the CIA had found a way to own the *compiler* binaries and each program it compiled would have a vulnerability added at build time.

  7. Compromised hardware by fabrica64 · · Score: 3, Interesting

    What about compromised CPUs? If you are the NSA I think it's easier to build a backdoor into the CPU than try to keep up with ever changing software builds. Isn't it? CPUs are totally controlled by three or four U.S. companies and are closed source; nobody has ever seen inside them...

    1. Re:Compromised hardware by caseih · · Score: 4, Interesting

      A partial answer to this is to build your own CPU and system in software. Like Bochs. But you could build this virtual system on any number of other completely incompatible platforms for verification. Would be slow. But at least it would be consistent and verifiable. You couldn't use hardware virtualization for this. Would have to be completely implemented in software. And if different people implemented the same reference platform independently (using their own preferred language and programming techniques) that would add an additional layer of verification. Even the deepest NSA compromise would have a hard time completely influencing this.

  8. Diverse double compiling (thanks dwheeler) by tepples · · Score: 4, Interesting

    So long as two or more independently developed, self-hosting compilers for a language exist, with at least one as publicly available source code, a Ken Thompson attack on the public-source one is infeasible. David A. Wheeler proved it; here's the gist:

    1. Use Visual C++, Intel C++, and Clang++ to compile g++. The binaries you get in this stage will differ, but if VC++, Intel C++, and Clang++ are uncompromised, they will have exactly the same behavior.
    2. Use each of the three copies of g++ you compiled earlier to compile g++, disabling timestamps in the output. Because they all have the same behavior (the behavior of g++), they should all produce the same output. Thus the binaries you get in this second stage will be identical unless one of the first compilers is compromised.
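
The two stages above can be played out in a toy model. This is an illustrative sketch only: no real compilers are run, and `Compiler`, `compile_gpp`, `ddc_check`, and the `style` and `trojaned` fields are all made-up names. The model assumes that a binary's bytes depend on the code-generation style of whatever compiled it, and that a trojaned compiler passes its trojan on to any compiler it builds.

```python
import hashlib
from dataclasses import dataclass

GPP_SOURCE = "placeholder for the g++ source tree"  # hypothetical stand-in


@dataclass(frozen=True)
class Compiler:
    style: str              # how this compiler lays out the code it emits
    trojaned: bool = False  # whether it injects a self-propagating backdoor


def compile_gpp(compiler: Compiler) -> tuple[str, Compiler]:
    """Compile GPP_SOURCE; return (digest of the output bytes, resulting g++)."""
    emitted = f"{compiler.style}|{compiler.trojaned}|{GPP_SOURCE}"
    digest = hashlib.sha256(emitted.encode()).hexdigest()
    # The output always behaves like g++, but inherits any trojan.
    return digest, Compiler(style="g++", trojaned=compiler.trojaned)


def ddc_check(bootstrap: list[Compiler]) -> bool:
    """Stage 1: build g++ with each compiler. Stage 2: rebuild and compare."""
    stage1 = [compile_gpp(c)[1] for c in bootstrap]
    stage2_digests = {compile_gpp(g)[0] for g in stage1}
    return len(stage2_digests) == 1
```

In this model three honest compilers give differing stage-1 digests but a single stage-2 digest, while one trojaned bootstrap compiler makes the stage-2 digests diverge. If every bootstrap compiler carried the same trojan, the check would still pass, so the model only helps when the compilers really are independent.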
    1. Re:Diverse double compiling (thanks dwheeler) by Paul Jakma · · Score: 2

      No, he didn't prove it is infeasible. For one, that would require a method to prove that the compilers are indeed wholly independent, which hasn't been provided. Also, note that people in some sub-field of technology tend to move around. An engineer who has worked on one compiler is *more* likely to also work on another compiler at some stage than any random engineer. The DDC technique *assumes* that diverse compilers are independent - it takes it on trust. Wheeler's work, if anything, reinforces the essence of Thompson's philosophical point, that we must either completely build and control every aspect of our system OR we must trust to at least some degree in someone else. Note also that someone can frustrate this technique by deliberately making their software not build reproducibly, for apparently innocent reasons (e.g. D Wheeler had such issues with using tcc for DDC). A fuller version of my critique of "Diverse Double-Compiling".

      That sounds like I'm being very dismissive of DDC, but I'm not. It could be really useful, *if* it is feasible to actually regularly reproduce builds. Debian is working on this, and hopefully they'll get there - but it's not a trivial task either. However, DDC does not fully counter Thompson's attack - not in the normal absolute sense of the word "fully" at least.

      --
      I use Friend/Foe + mod-point modifiers as a karma/reputation system.
  9. Easy enough to handle trusting trust by raymorris · · Score: 2

    Since you mentioned Reflections on Trusting Trust, that issue is easy enough to avoid. There are some simpler and more clever methods, but consider this:

    Use Borland 1.0 to compile llvm.
    Use this new llvm binary to compile gcc.
    Chain a few more if you want to.

    You don't need to trust the first compiler. It could be trojaned so as to trojan new copies of itself. You'd only be concerned if you thought that Borland 1.0 was trojaned in such a way as to add a trojan to the code of a compiler that didn't yet exist, llvm, AND that trojan wasn't for Borland or llvm, but for the current version of gcc - another compiler quite different from anything that existed when Borland 1.0 was created.

    The perpetrators would have to not only be astronomically clever, they'd also have to see into the future, twice, in order to build such a trojan.