Slashdot Mirror


Are You Sure This Is the Source Code?

oever writes "Software freedom is an interesting concept, but being able to study the source code is useless unless you are certain that the binary you are running corresponds to the alleged source code. It should be possible to recreate the exact binary from the source code. A simple analysis shows that this is very hard in practice, severely limiting the whole point of running free software."

17 of 311 comments

  1. Bogus argument by Beat+The+Odds · · Score: 5, Insightful

    "Exact binaries" is not the point of having the source code.

    1. Re:Bogus argument by Anonymous Coward · · Score: 5, Informative

      The guy who submitted that article is the person who wrote it. Awesome "work", editors.

    2. Re:Bogus argument by arth1 · · Score: 5, Informative

      To borrow from The Watchmen:

      Who compiles the compiler?

      Your attribution isn't just a little off, it's way off.
      Try Iuvenalis, around 200 AD.

    3. Re:Bogus argument by oGMo · · Score: 5, Insightful

      Simply having the source code doesn't mean you have the ability to actually use the source code to make bug fixes should the need arise.

      And yet, it still means that you can fix it, or even rewrite it in something else, if you want. Not having the source code means this is between much-more-difficult and impossible. The lesson here should be that everything we use should be open source, including compilers and libraries, not "well in theory I might have problems, so screw that whole open source thing .. proprietary all the way!"

      --

      Don't think of it as a flame---it's more like an argument that does 3d6 fire damage

    4. Re:Bogus argument by Lumpy · · Score: 5, Informative

      There are very talented people that can hide things in only a few lines of code. See http://ioccc.org/ for some examples that will make your skin crawl.

      --
      Do not look at laser with remaining good eye.
    5. Re:Bogus argument by Andy+Dodd · · Score: 5, Informative

      Yeah. Unfortunately, the issues he presents here DO make it more difficult to prove that someone is providing a binary that could NOT have possibly originated from the provided source code.

      As an example, the kernel source initially released for the Samsung GT-N8013 (USA Wifi Note 10.1) was not what was used to build the binaries in question.

      The "difficult to prove but obvious" - Any kernel built from the provided source had a massively broken wifi driver that would completely stop functioning, usually within 5-10 minutes, requiring the module to be removed and reinserted. Pulling the wifi module source from a different Samsung tarball (such as a GT-I9300 release) would result in a working driver. But how do you prove the source provided is correct?
      In the case of the N8013, we were lucky - Samsung changed a bunch of debug printk()s slightly in their released binary. Small stuff, not functionally relevant, such as typo fixes and capitalization differences in their touchscreen driver's debug printk()s - but at least provable to be different.

      So we could prove that the kernels didn't match, but couldn't necessarily prove that the biggest functional problem was due to a source difference.

      We asked Samsung to provide source that corresponded to the UEALGB build for that device, and their response was, "That build is a leak and hence we are not obligated to provide source for it." Effectively admitting that the provided source was not meeting the requirements imposed by the GPL for that build, and then claiming that the software build preinstalled on every device sold in the USA for the first 1-2 months after launch was a "leak" and thus they didn't have to provide source for it.

      Needless to say, between that and other situations, that was my last Samsung device.

      --
      retrorocket.o not found, launch anyway?
    6. Re:Bogus argument by Hatta · · Score: 5, Informative

      But unless and until he reads AND UNDERSTANDS every line of the source he is
      always going to have to be trusting somebody somewhere.

      Even if he reads and understands every line of the source, he's still trusting someone. He has to read and understand every line of the source code of the compiler he is using, and the compiler that compiled that compiler, and so on.

      "Reflections on Trusting Trust" is almost 30 years old now. It should be well known.

      --
      Give me Classic Slashdot or give me death!
    7. Re:Bogus argument by frost_knight · · Score: 5, Informative

      For true malice there's also The Underhanded C Contest.

      From their home page: "The goal of the contest is to write code that is as readable, clear, innocent and straightforward as possible, and yet it must fail to perform at its apparent function. To be more specific, it should do something subtly evil."

      --
      It always takes longer than you expect, even when you take into account Hofstadter's Law. --Hofstadter's Law
  2. touch o' hyperbole by ahree · · Score: 5, Insightful

    I'd suggest that "severely limiting the whole point of running free software" might be a touch of an exaggeration. A huge touch.

  3. Incorrect suppositions. by Microlith · · Score: 5, Insightful

    A simple analysis shows that this is very hard in practice, severely limiting the whole point of running free software.

    No it doesn't. The whole point of running free software is knowing that I can rebuild the binary (even if the end result isn't exactly the same) and, more importantly, freely modify it to suit my needs rather than being beholden to some vendor.

    1. Re:Incorrect suppositions. by Shoten · · Score: 5, Insightful

      A simple analysis shows that this is very hard in practice, severely limiting the whole point of running free software.

      No it doesn't. The whole point of running free software is knowing that I can rebuild the binary (even if the end result isn't exactly the same) and, more importantly, freely modify it to suit my needs rather than being beholden to some vendor.

      There's another point too...which incidentally is the whole point of running a distro like Gentoo...that you can compile the binary exactly to your specifications, even sometimes optimizing it for your specific hardware. I don't get at all this idea he has about "reproducible builds;" if he builds the same way on the same hardware, he'll get the same binary. But what he's doing is comparing builds in distros with ones he did himself...and the odds that it's the same method used to create the binary are very low indeed.

      If he's concerned about precompiled binaries having been tampered with, he's looking at the wrong protective measure. Hashes and/or signing are what is used to protect against that...not distributing the source code alongside the compiled binary files. If you look at the source code and just assume that a precompiled binary must somehow be the same code "just because," you're an idiot.
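As a rough illustration of the hash check mentioned above, here is a minimal Python sketch. The helper names (`sha256_of_file`, `matches_published_digest`) are made up for this example, and it assumes the publisher's SHA-256 digest reached you over a channel you already trust (for instance, a signed release announcement). It only shows that a file matches what the publisher hashed; it says nothing about whether that file came from the published source.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, published_hex: str) -> bool:
    """Compare the file's digest with the publisher's value (hypothetical helper)."""
    actual = sha256_of_file(path)
    # Normalize whitespace and case, then compare in constant time.
    return hmac.compare_digest(actual, published_hex.strip().lower())
```

Usage would be something like `matches_published_digest("package.bin", digest_from_release_notes)`, where both arguments are placeholders for whatever you actually downloaded.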

      --

      For your security, this post has been encrypted with ROT-13, twice.
  4. Problems with verifying the binaries from source by tooslickvan · · Score: 5, Funny

    I have recompiled all my software from the source code and verified that the binaries match but for some reason there's a Ken Thompson user that is always logged in. How did Ken Thompson get into my system and how do I get rid of him?

  5. Trust by bunratty · · Score: 5, Insightful

    I took a graduate-level security class from Alex Halderman (of Internet voting fame) and what I came away with is that security comes down to trust. To take an example, when I walk down the street, I want to stay safe and avoid being run over by a car. If I think that the world is full of crazy drivers, the only way to be safe is to lock myself inside. If I want to function in society, I have to trust that when I walk down the sidewalk that a driver will not veer off the road and hit me.

    When you order a computer, you simply trust that it doesn't have a keylogger or "secret knock" CPU code installed at the factory. It's exactly the same with software binaries, of course. In the extreme case, even examining all the source code will not help. You must trust!

    --
    What a fool believes, he sees, no wise man has the power to reason away.
  6. Re:What a problem by h4rr4r · · Score: 5, Funny

    Hey now, you have to be pretty IT savvy to type ./configure, make and make install all in the same day. Some of us make good money doing that, don't just go suggesting everyone should be doing it.

  7. Diverse Double-Compiling by David A. Wheeler by tepples · · Score: 5, Interesting

    If you've compiled the compiler with competitors' compilers (try saying that ten times fast), you should be fairly safe from Trusting Trust.

  8. Re:What a problem by TheRaven64 · · Score: 5, Insightful

    Most of the time, even that isn't enough. C compilers tend to embed build-time information as well. For Verilog, they often use a random number seed for the genetic algorithm for place-and-route. Most compilers have a flag to set a specified value for these kinds of parameters, but you have to know what they were set to for the original run.
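To make the embedded build-time information concrete, here is a small Python sketch. It is not a compiler: it uses gzip as a stand-in, because the gzip header stores a timestamp, and the `mtime` argument of `gzip.compress` (Python 3.8 and later) plays the role of the "set a specified value" flag. `PAYLOAD` and `build` are names invented for the example.

```python
import gzip
import hashlib

# Stand-in for identical compiler output produced by two separate builds.
PAYLOAD = b"int main(void) { return 0; }\n"


def build(payload: bytes, build_time: int) -> bytes:
    """Pretend build step: gzip records build_time in its header."""
    return gzip.compress(payload, mtime=build_time)


def digest(blob: bytes) -> str:
    """Hex SHA-256 of a build artifact."""
    return hashlib.sha256(blob).hexdigest()


# Same input, built one second apart: the artifacts differ.
first = build(PAYLOAD, build_time=1_000_000_000)
second = build(PAYLOAD, build_time=1_000_000_001)
print(digest(first) == digest(second))  # False

# Same input with the timestamp pinned: the artifacts are identical.
pinned_a = build(PAYLOAD, build_time=0)
pinned_b = build(PAYLOAD, build_time=0)
print(digest(pinned_a) == digest(pinned_b))  # True
```

The catch is the same as with a real toolchain: whoever rebuilds has to know which value the original build pinned.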

    Of course, in this case you're solving a non-problem. If you don't trust the source or the binary, then don't run the code. If you trust the source but not the binary, build your own and run that.

    --
    I am TheRaven on Soylent News
  9. Required in some industries by mrr · · Score: 5, Interesting

    I work in the gaming (Gambling) industry.

    Many states require us to submit both the source code and build tools required to make an exact (and I mean 'same md5sum') copy of the binary that is running on a slot machine on the floor, to an extent that would blow you away.

    They need to be able to go to the floor of a casino, rip out the drive or card containing the software, take it back to THEIR office, and build another exact image of the same drive or SD card.

    md5sum from /dev/sda and /dev/sdb must match.

    I can tell you the amount of effort that goes into this is monumental. There can be no dynamically generated symbols at compile time. The files must be built, compiled, and written to disk exactly the same every time. The filesystem can't have modify or creation times because those would change.
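A toy version of the "same md5sum every time" requirement, as a Python sketch. It packs files into a tar archive rather than a real disk image, and `build_image` and `md5_hex` are hypothetical helpers written for this example, not anything from an actual gaming build system. The idea it illustrates is the one described above: fix the member order and strip timestamps and ownership so nothing varies between builds.

```python
import hashlib
import io
import tarfile
from typing import Mapping


def build_image(files: Mapping[str, bytes], mtime: int = 0) -> bytes:
    """Pack files into a tar image with normalized order and metadata."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w",
                      format=tarfile.USTAR_FORMAT) as archive:
        for name in sorted(files):       # fixed member order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime           # no real modification times
            info.mode = 0o644
            info.uid = info.gid = 0      # no builder-specific ownership
            info.uname = info.gname = ""
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def md5_hex(blob: bytes) -> str:
    """Hex MD5 of an image, comparable to md5sum output."""
    return hashlib.md5(blob).hexdigest()


image_a = build_image({"game.bin": b"\x01\x02", "config.txt": b"reels=5\n"})
image_b = build_image({"config.txt": b"reels=5\n", "game.bin": b"\x01\x02"})
print(md5_hex(image_a) == md5_hex(image_b))
```

Passing a different `mtime` to one of the two calls is enough to make the checksums diverge, which is why the timestamps have to go.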

    This is a silly idea for open source software; the only industry I've seen apply it is perhaps the least-open one in the world.