Are You Sure This Is the Source Code?
oever writes "Software freedom is an interesting concept, but being able to study the source code is useless unless you are certain that the binary you are running corresponds to the alleged source code. It should be possible to recreate the exact binary from the source code. A simple analysis shows that this is very hard in practice, severely limiting the whole point of running free software."
"Exact binaries" is not the point of having the source code.
Given the scale of most modern programs' codebases, good luck actually reviewing the code meaningfully in the first place. That said, if you're really that concerned about the binary matching the source, run a source-based distro like Gentoo or Funtoo. For most practical purposes, though, users find binary distributions like Debian/Ubuntu or the various Red Hat-based systems to be a better use of their time.
In SOVIET RUSSIA... erm...NSA AMERICA, the Internet logs onto YOU!
If you are that paranoid, study the source code, then recompile.
I'd suggest that "severely limiting the whole point of running free software" might be a touch of an exaggeration. A huge touch.
No it doesn't. The whole point of running free software is knowing that I can rebuild the binary (even if the end result isn't exactly the same) and, more importantly, freely modify it to suit my needs rather than being beholden to some vendor.
...or just using a binary that you compiled from source yourself.
For a lot of projects, that's not nearly as hard as some people like to make it sound.
A Pirate and a Puritan look the same on a balance sheet.
If you need to be sure, just compile it yourself. If you suspect foul play, you need to do a full analysis (assembler-level or at least decompiled) anyways.
The claim that this is a problem is completely bogus.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
I have recompiled all my software from the source code and verified that the binaries match but for some reason there's a Ken Thompson user that is always logged in. How did Ken Thompson get into my system and how do I get rid of him?
1) Submitter is the one who wrote the blog post 2) No cross-reference, no references, no differing opinions at all 3) "severely limiting the whole point of running free software" is more than a bit of an exaggeration
Religious speak to God. Insane are spoken to by God. When all shut up, one can finally hear Shostakovich in peace
I took a graduate-level security class from Alex Halderman (of Internet voting fame) and what I came away with is that security comes down to trust. To take an example, when I walk down the street, I want to stay safe and avoid being run over by a car. If I think that the world is full of crazy drivers, the only way to be safe is to lock myself inside. If I want to function in society, I have to trust that when I walk down the sidewalk that a driver will not veer off the road and hit me.
When you order a computer, you simply trust that it doesn't have a keylogger or "secret knock" CPU code installed at the factory. It's exactly the same with software binaries, of course. In the extreme case, even examining all the source code will not help. You must trust!
What a fool believes, he sees, no wise man has the power to reason away.
..are a bitch. The number of hoops that, e.g., the Bitcoin developers jump through to prove they didn't mess with the build is large. Running specific OS builds in emulators with fake system time and whatnot. No easy task.
Hey now, you have to be pretty IT savvy to type ./configure, make and make install all in the same day. Some of us make good money doing that, don't just go suggesting everyone should be doing it.
If you've compiled the compiler with competitors' compilers (try saying that ten times fast), you should be fairly safe from Trusting Trust.
Has anybody thought about recompiling the source and seeing if you get the same binary?
Has anybody thought of reading the article before posting questions like this?
That said, this particular "article" isn't worth the waste of bytes it takes up. It's like seeing a 6 year old trying to explain a combustion engine.
Binaries will almost always differ - if nothing else because your entire environment has to be exactly like the binary builder's. That goes beyond the time stamps, compile paths, hostnames and account names, which are the obvious ones.
If your compiler or linker is a minor version off what he used, the results can be very different, even if using the same compile options.
But that's not enough: If your hardware is different, randomization of functions in a library will be different.
To flesh out his article a bit more, the author could have done a test with two different Gentoo systems. Different but mostly compatible hardware, and a slight difference in the toolchain. That might have opened his eyes.
Then again, probably not.
This is a problem that doesn't exist. You establish a chain of evidence and authority for the binaries via signing and checksums, starting with the upstream. Upstream publishes the source and signs the announcement, which contains checksums. The package maintainer compiles the source. The generated package includes checksums. Your repo's packages are signed by the repo's key.
You can, at any point in time with most packaging systems, verify that every single one of your installed binaries' checksums match the checksums of the binaries generated by the package maintainer.
If you don't trust the maintainer to not insert something evil, download the distro source package and compile it yourself.
If you suspect the distro source package, all you have to do is run a checksum of the copy of the upstream tarball vs the tarball inside the source package, and then all you need to do is review the patches the distro is applying.
If you suspect the upstream, you download it and spend the next year going through it. Good luck...
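For the tarball comparison step, here is a minimal sketch in Python. The function names and the file paths in the usage note are made up for illustration, and SHA-256 is just one reasonable choice of checksum; use whatever algorithm the upstream announcement actually publishes.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in fixed-size chunks so large tarballs don't need to fit in RAM.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True if both files hash to the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)
```

You would call something like `same_contents("upstream/foo.tar.gz", "srcpkg/foo.tar.gz")` (hypothetical paths). A match only tells you the two tarballs are identical; it says nothing about whether either one is trustworthy.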
Please help metamoderate.
Most of the time, even that isn't enough. C compilers tend to embed build-time information as well. For Verilog, the tools often use a random number seed for the genetic algorithm for place-and-route. Most compilers have a flag to set a specified value for these kinds of parameters, but you have to know what they were set to for the original run.
Of course, in this case you're solving a non-problem. If you don't trust the source or the binary, then don't run the code. If you trust the source but not the binary, build your own and run that.
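To illustrate the embedded build-time information point with something you can run without a toolchain, here is a sketch that uses gzip as a stand-in for a compiler. This is an analogy, not what any C compiler does internally: gzip stores a modification time in its header, so the same input produces different bytes on different days unless you pin that value. The helper name and the sample payload are invented for the example.

```python
import gzip
import io


def gzip_bytes(payload, mtime):
    """Compress payload, stamping the gzip header with the given mtime."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as gz:
        gz.write(payload)
    return buf.getvalue()


payload = b"int main(void) { return 0; }\n"

# Same input, two different embedded timestamps (one day apart).
built_monday = gzip_bytes(payload, mtime=1_000_000_000)
built_tuesday = gzip_bytes(payload, mtime=1_000_086_400)
print("different timestamps, same bytes?", built_monday == built_tuesday)

# Same input, timestamp pinned to a known value on both runs.
pinned_a = gzip_bytes(payload, mtime=0)
pinned_b = gzip_bytes(payload, mtime=0)
print("pinned timestamp, same bytes?", pinned_a == pinned_b)
```

The catch is the same one as above: pinning only helps if you know which value the original builder used.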
I am TheRaven on Soylent News
I work in the gaming (Gambling) industry.
Many states require us to submit both the source code and the build tools required to make an exact (and I mean 'same md5sum') copy of the binary that is running on a slot machine on the floor, to an extent that would blow you away.
They need to be able to go to the floor of a casino, rip out the drive or card containing the software, take it back to THEIR office, and build another exact image of the same drive or SD card.
md5sum from /dev/sda and /dev/sdb must match.
I can tell you the amount of effort that goes into this is monumental. There can be no dynamically generated symbols at compile time. The files must be built, compiled, and written to disk exactly the same every time. The filesystem can't have modify or creation times because those would change.
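A toy version of the "written to disk exactly the same every time" requirement, as a rough sketch: pack files into a tar archive with the order, timestamps and ownership all pinned, then hash the result. This is not our actual process and a tar archive is not a drive image; the function names are made up, and SHA-256 stands in for md5sum.

```python
import hashlib
import io
import tarfile


def build_image(files):
    """Pack {name: bytes} into a tar archive with all variable metadata pinned."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):          # fixed file order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = 0                  # no modification time
            info.uid = info.gid = 0         # no builder account IDs
            info.uname = info.gname = ""    # no builder account names
            info.mode = 0o644               # fixed permissions
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def image_digest(files):
    """Hex SHA-256 of the archive built from files."""
    return hashlib.sha256(build_image(files)).hexdigest()
```

Two builds from the same file contents give the same digest regardless of when, by whom, or in what order the files were added. Change one byte of one file and the digest changes.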
This is a silly idea for open source software, the only industry I've seen apply it is perhaps the least-open one in the world.
One example is a Philips TV or Blu-ray player built on Linux. When asked for source code, it is provided, but there is no way to ensure that the source code is for the device, because the provided binaries are encrypted and signed.
Bad choice of target - .Net does actually have multiple compilers available, including open source. But more to the point for this discussion, it has multiple DEcompilers available, including open source.
Want to know what that nasty MS compiler put in your .Net binary ? - run it through ILSpy.
Don't trust the ILSpy binary - decompile it with itself, or with a.n.other decompiler.
In fact, because .Net decompiles so well, the problem of this article (binaries don't compare) just doesn't occur. Want to check your .Net binary against the supposed source ? - easy (well, a hell of a lot easier than with C++). Build your binary from the source, decompile both binaries and compare the two sets of decompiled source. It works, it is consistent and reliable, and it is one hell of a lot more useful at showing up differences than comparing two binaries.
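The comparison step at the end can be sketched in Python, assuming you have already run your decompiler of choice on both binaries and saved the output as text. Running the decompiler is not shown, and the function name and the labels in the diff header are placeholders.

```python
import difflib


def diff_listings(reference, rebuilt):
    """Return unified-diff lines between two decompiled source listings."""
    return list(difflib.unified_diff(
        reference.splitlines(),
        rebuilt.splitlines(),
        fromfile="shipped-binary.decompiled",
        tofile="rebuilt-binary.decompiled",
        lineterm="",
    ))
```

An empty list means the two listings are textually identical. Anything else is a list of the exact lines to go and look at, which is the whole advantage over a byte-for-byte compare of the binaries.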