Are You Sure This Is the Source Code?

Posted by timothy on Thursday June 20, 2013 @06:20AM from the not-as-simple-as-md5-sum dept.

oever writes "Software freedom is an interesting concept, but being able to study the source code is useless unless you are certain that the binary you are running corresponds to the alleged source code. It should be possible to recreate the exact binary from the source code. A simple analysis shows that this is very hard in practice, severely limiting the whole point of running free software."

311 comments

Bogus argument by Beat+The+Odds · 2013-06-20 06:22 · Score: 5, Insightful

"Exact binaries" is not the point of having the source code.
1. Re:Bogus argument by Anonymous Coward · 2013-06-20 06:29 · Score: 5, Informative
  
  The guy who submitted that article is the person who wrote it. Awesome "work", editors.
2. Re:Bogus argument by Anonymous Coward · 2013-06-20 06:30 · Score: 0, Redundant
  
  Thank you for adding so much insight to the discussion.
  I mean, uhh... yeah, this!
3. Re:Bogus argument by CastrTroy · 2013-06-20 06:33 · Score: 4, Insightful
  
  Ok, maybe not exact binaries, but what if you can't even make a binary at all, or if you do make one, how do you ensure it's functioning the same? That's the problem that many people have with open source code that exists in languages that you can only compile with a proprietary compiler. Take .Net for instance. It's possible to write a program that is open source, and yet you're at the mercy of Microsoft to be able to compile the code. Even when I download Linux packages in C, it's often the case that I can't compile them, because I'm missing some obscure library that the original developer just assumed I had. "What good is code if you are unable to compile it" is right up there with "what use is a phone call, if you are unable to speak". Some code only works with certain compilers, or with certain flags turned on in those compilers. Simply having the source code doesn't mean you have the ability to actually use the source code to make bug fixes should the need arise.
  
  --
  
  Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
4. Re:Bogus argument by chuckinator · 2013-06-20 06:34 · Score: 1
  
  It definitely is. Discounting difference in hardware of their build machine and your build machine and difference in versions of compiler, libraries, etc, it's still a bogus argument. I've had the same compiler on the same machine produce different binaries on two consecutive builds on the same day due to changing memory addresses of values throwing the checksum completely off.
  
  Also, the author needs to install redhat-rpm-config on his system if he's trying to generate stripped binaries with separate debuginfo packages.
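
A minimal Python sketch of the checksum point above, assuming two builds that differ only in an embedded timestamp. The byte strings are made-up stand-ins, not real compiler output, and `differing_offsets` is a hypothetical helper name:

```python
import hashlib

def sha256_hex(data):
    """Hex SHA-256 of a byte string."""
    return hashlib.sha256(data).hexdigest()

def differing_offsets(a, b):
    """Return the byte offsets at which two equal-length blobs differ."""
    if len(a) != len(b):
        raise ValueError("blobs differ in length; compare section by section instead")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Stand-ins for two builds: same "code" bytes, different embedded build time.
build_a = b"\x7fELF...code..." + b"2013-06-20 06:34:01"
build_b = b"\x7fELF...code..." + b"2013-06-20 06:34:59"

# The whole-file checksums disagree even though only two bytes changed.
print(sha256_hex(build_a) == sha256_hex(build_b))
# Listing the offsets shows whether the difference is confined to metadata.
print(differing_offsets(build_a, build_b))
```

A checksum mismatch by itself says nothing about where the difference is, which is why a byte-level (or section-level) comparison is the more useful first step.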
5. Re:Bogus argument by tloh · 2013-06-20 06:37 · Score: 2
  
  These are not the code you are looking for.....
  
  *ducks*
  
  --
  Stay sentient. Don't drink bad milk.
6. Re:Bogus argument by Anonymous Coward · 2013-06-20 06:44 · Score: 3, Insightful
  
  To borrow from The Watchmen:
  Who compiles the compiler?
7. Re:Bogus argument by lgw · 2013-06-20 06:44 · Score: 1
  
  "Exact binaries" is not the point of having the source code.
  The use case is "we're using this binary in production, which we didn't build ourselves". That's how open source is generally used in practice, after all - you download the binaries for your platform, and you (maybe) archive the source away somewhere just in case.
  Isn't that the strongest practical use case for Open Source in the business world? Sure, you don't plan on maintaining it yourself but you could if you have to. The problem is, if the source doesn't match the object, you can't just fix a bug - you have to requalify this whole new software package which just happened to come from the same place, and which may be a very different version of the software.
  I've had to spend time maintaining binaries where we didn't have matching source and we couldn't migrate to what would be built from the source. Maintaining raw binaries with the source as just a guideline blows goats, and you never want to be the guy stuck doing it.
  
  --
  Socialism: a lie told by totalitarians and believed by fools.
8. Re:Bogus argument by jythie · 2013-06-20 06:50 · Score: 2
  
  Yeah.. it really strikes me that the person is exaggerating the importance of a narrow set of use cases. Reproducible builds are nice, and in some cases important, and in an ideal case compiling should be sufficiently deterministic that one should be able to recreate any given binary, but I would not say that is the 'point' of having access to source code.
9. Re:Bogus argument by Chuckstar · 2013-06-20 06:50 · Score: 2
  
  No. The strongest practical use case for Open Source in business is that the Open Source version is some combination of better/cheaper than alternate versions, with "better" including the fact that Open Source projects often get updated faster when security bugs (and sometimes other bugs) are found. The possibility of bringing development fully in-house is not a practical solution for 99.99% of businesses. (I'm exaggerating a little, but not much).
10. Re:Bogus argument by ZahrGnosis · 2013-06-20 06:51 · Score: 4, Insightful
  
  If you're worried about the lineage of a binary then you need to be able to build it yourself, or at least have it built by a trusted source... if you can't, then either there IS a problem with the source code you have, or you need to decide if the possible risk is worth the effort. If you can't get and review (or even rewrite) all the libraries and dependencies, then those components are always going to be black-boxes. Everyone has to decide if that's worth the risk or cost, and we could all benefit from an increase in transparency and a reduction in that risk -- I think that was the poster's original point.
  The real problem is that there's quite a bit of recursion... can you trust the binaries even if you compiled them, if you used a compiler that came from binary (or Microsoft)? Very few people are going to have access to the complete ground-up builds required to be fully clean... you'd have to hand-write assembly "compilers" to build up tools until you get truly useful compilers then build all your software from that, using sources you can audit. Even then, you need to ensure firmware and hardware are "trusted" in some way, and unless you're actually producing hardware, none of these are likely options.
  You COULD write a reverse compiler that's aware of the logic of the base compiler and ensure your code is written in such a way that you can compile it, then reverse it, and get something comparable in and out, but the headache there would be enormous. And there are so many other ways to earn trust or force compliance -- network and data guards, backups, cross validation, double-entry or a myriad of other things depending on your needs.
  It's a balance between paranoia and trust, or risk and reward. Given the number of people using software X with no real issue, a binary from a semi-trusted source is normally enough for me.
11. Re:Bogus argument by Anonymous Coward · 2013-06-20 06:53 · Score: 1
  
  how do you ensure it's functioning the same?
  You run a test suite.
12. Re:Bogus argument by icebike · 2013-06-20 06:54 · Score: 4, Insightful
  
  But to his credit, he did say a "simple analysis" although when reading TFA he omitted the word "minded" from the middle of that phrase.
  Virtually all of his findings are traced to differences in date and time and chosen compiler settings and compiler vintage.
  Unless he can find large blocks of inserted code (not merely data segment differences) he is complaining about nothing.
  He is certainly free to compile all of his system from source, and that way he could be assured he is running
  exactly what the source said. But unless and until he reads AND UNDERSTANDS every line of the source he is
  always going to have to be trusting somebody somewhere.
  It's pretty easy to hide obfuscated functionality in a mountain of code (in fact it seems far too many programmers pride
  themselves on their obfuscation skills). I would worry more about the mountain he missed while staring at the
  mole-hill his compile environment induced.
  
  --
  Sig Battery depleted. Reverting to safe mode.
13. Re:Bogus argument by Anonymous Coward · 2013-06-20 06:55 · Score: 0
  
  When someone can submit their Slashdot Journal entries as articles and have them accepted, I fail to see the problem with someone submitting his own blog post.
14. Re:Bogus argument by arth1 · 2013-06-20 06:58 · Score: 5, Informative
  
  To borrow from The Watchmen:
  Who compiles the compiler?
  Your attribution isn't just a little off, it's way off.
  Try Iuvenalis, around 200 AD.
15. Re:Bogus argument by briancox2 · 2013-06-20 06:59 · Score: 3, Funny
  
  This looks like the shortest, most concise piece of FUD I've ever seen.
  
  I wonder if next week I could get a story published that says, "I don't know if Microsoft is spying on you through your webcam. So it could be true."
  
  --
  We should learn what we need to know about issues, before we decide what we need to feel about them.
16. Re:Bogus argument by oGMo · 2013-06-20 07:00 · Score: 5, Insightful
  
  Simply having the source code doesn't mean you have the ability to actually use the source code to make bug fixes should the need arise.
  
  And yet, it still means that you can fix it, or even rewrite it in something else, if you want. Not having the source code means this is between much-more-difficult and impossible. The lesson here should be that everything we use should be open source, including compilers and libraries, not "well in theory I might have problems, so screw that whole open source thing .. proprietary all the way!"
  
  --
  Don't think of it as a flame---it's more like an argument that does 3d6 fire damage
17. Re:Bogus argument by CastrTroy · 2013-06-20 07:01 · Score: 3, Insightful
  
  I'm not really even talking from a trust point of view, but more the other point of open source software, which is, "if there's a bug in the code, you can fix it yourself". Without even going down that whole tangent of recursively verifying the entire build chain, there's the problem of being able to even functionally compile the source code so that you can make fixes when you need to.
  
  --
  
  Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
18. Re:Bogus argument by arth1 · 2013-06-20 07:03 · Score: 1
  
  You run a test suite.
  Which is one reason why important open source programs make sure that the test suite and its sources are also available and up to date.
  Or, you examine the source, and then compile it with a compiler from a different source.
19. Re:Bogus argument by OakDragon · 2013-06-20 07:04 · Score: 1
  
  Already at 5, Insightful, so please enjoy this virtual "+1 Insightful"...
  
  --
  Dark Reflection
20. Re:Bogus argument by NoNonAlphaCharsHere · 2013-06-20 07:05 · Score: 4, Funny
  
  To borrow from the Tao Te Ching: "The Source that can be told is not the Source."
21. Re:Bogus argument by Aaron+B+Lingwood · 2013-06-20 07:07 · Score: 4, Interesting
  
  "Exact binaries" is not the point of having the source code.
  You are correct. However, it is a method to confirm that you have received the entire source code.
  The point being made is that a binary could always contain functions that are malicious, buggy or infringe on copyright while the supplied source does not.
  Case Study:
  A software company (let's call them 'Macrosift') takes over project management of a GPL'd document conversion tool. Macrosift contribute quite a bit of code and the tool really takes off. Most users are obtaining this tool as a pre-compiled binary from either the Macrosift-controlled repository or a Macrosift partner-controlled repository. It can even convert all kinds of documents flawlessly into Macrosift's Orifice 2015 new extra standard format which no other tool seems to be able to do.
  Newer versions of OpenOffice, LibreOffice, JoeOffice come out and this tool just doesn't seem to be doing the job. Sure, it converts perfectly from everything into MS .xsf but doesn't work so well the other way and won't work at all between some office suites. The project gets forked by the community to make it feature complete. The project managers start by compiling the source, and to their surprise, the tool will not work as well as the binary did. After a year passes, the community realizes they've been had. By painstakingly decompiling the binary, they discover that the function that converts to MS proprietary .xsf is different to that in the source. Another hidden function is discovered in the binary that introduces errors and file bloat after a certain date if the tool is being used solely on non-MS documents.
  How else can I ascertain whether you have supplied me with THE source code for THIS binary if I can not produce said binary with provided source code?
  
  --
  [Rent This Space]
22. Re:Bogus argument by Lumpy · 2013-06-20 07:07 · Score: 5, Informative
  
  There are very talented people that can hide things in only a few lines of code. See http://ioccc.org/ for some examples that will make your skin crawl.
  
  --
  Do not look at laser with remaining good eye.
23. Re:Bogus argument by 14erCleaner · 2013-06-20 07:08 · Score: 4, Informative
  
  Who compiles the compiler?
  I guess it's time to introduce another generation to the devious genius of Ken Thompson.
  
  You can't trust code that you did not totally create yourself. (Especially code from companies that employ people like me.)
  
  --
  Have you read my blog lately?
24. Re:Bogus argument by xanclic · 2013-06-20 07:09 · Score: 1
  
  I think, if you can't freely compile the source code, the software is not exactly free. Freedom 1 at http://www.gnu.org/philosophy/free-sw.en.html says that you need to be able to change the program (at source code level) and to incorporate these changes into the running program. This basically requires the ability to easily compile a working binary from source code.
25. Re:Bogus argument by houghi · 2013-06-20 07:11 · Score: 2
  
  how do you ensure it's functioning the same?
  
  If you have any reason to doubt that it might not do what it says it does, do not use the binary, but compile it yourself.
  If you are unable to compile it AND you do not trust the binary, don't run it. Read and rewrite the code so it does work.
  If you are unable to rewrite the code, do not trust the binary AND are unable to compile. Look for an alternative.
  Or hire somebody who is able to write the code for you so that you are able to read it, compile it and change it.
  Open source does not mean that it must be easy or even gratis.
  What good the code is depends on you. To me most code is utterly useless. I could not compile it if my life depended on it. This is due to my technical skills, but the reason is irrelevant. To me that code is as unusable as it is for those who do not have the right compilers.
  And a phone call is very useful for those who can not speak. A fax is nothing but a phone call. Just because you are using things in one way does not mean that is the same for others.
  
  --
  Don't fight for your country, if your country does not fight for you.
26. Re:Bogus argument by Andy+Dodd · 2013-06-20 07:14 · Score: 5, Informative
  
  Yeah. Unfortunately, the issues he presents here DO make it more difficult to prove that someone is providing a binary that could NOT have possibly originated from the provided source code.
  As an example, the kernel source initially released for the Samsung GT-N8013 (USA Wifi Note 10.1) was not what was used to build the binaries in question.
  The "difficult to prove but obvious" - Any kernel built from the provided source had a massively broken wifi driver that would completely stop functioning, usually within 5-10 minutes, requiring the module to be removed and reinserted. Pulling the wifi module source from a different Samsung tarball (such as a GT-I9300 release) would result in a working driver. But how do you prove the source provided is correct?
  In the case of the N8013, we were lucky - Samsung changed a bunch of debug printk()s slightly in their released binary. Small stuff, not functionally relevant, such as typo fixes and capitalization differences in their touchscreen driver's debug printk()s - but at least provable to be different.
  So we could prove that the kernels didn't match, but couldn't necessarily prove that the biggest functional problem was due to a source difference.
  We asked Samsung to provide source that corresponded to the UEALGB build for that device, and their response was, "That build is a leak and hence we are not obligated to provide source for it." Effectively admitting that the provided source was not meeting the requirements imposed by the GPL for that build, and then claiming that the software build preinstalled on every device sold in the USA for the first 1-2 months after launch was a "leak" and thus they didn't have to provide source for it.
  Needless to say, between that and other situations, that was my last Samsung device.
  
  --
  retrorocket.o not found, launch anyway?
27. Re:Bogus argument by trum4n · 2013-06-20 07:14 · Score: 1
  
  But, you can't trust the test suite if you didn't write it!
28. Re:Bogus argument by Anonymous Coward · 2013-06-20 07:15 · Score: 0
  
  to borrow from Iuvenalis:
  whoosh!
29. Re:Bogus argument by mic0e · 2013-06-20 07:18 · Score: 1
  
  For this reason, building instructions are usually provided, e.g. in the form of a Makefile.
  Furthermore, almost every distribution provides you with all dependencies and the full packaging script which was used to create the distribution's binary in the first place.
  Source-based distributions such as Gentoo even go as far as to do the actual creation of the binary on your local machine.
  On Windows, however, this is admittedly a problem, since _everybody_ simply downloads an exe file from somewhere, without even checking the md5 hash that is usually provided (however, in most cases, in vain because the website is not even SSL secured). Most software probably can't even be compiled on Windows.
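
For anyone who does want to check a download against a published hash, here is a rough Python sketch. It assumes the site publishes a hex digest; it defaults to SHA-256 rather than MD5 (my choice, not something from the post), and `file_digest` / `matches_published` are hypothetical helper names:

```python
import hashlib
import hmac
import os
import tempfile

def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Hash a file in chunks so large downloads need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_published(path, published_hex, algorithm="sha256"):
    """Compare the file's digest with the digest copied from the website."""
    actual = file_digest(path, algorithm)
    return hmac.compare_digest(actual, published_hex.strip().lower())

# Demo on a throwaway file standing in for a downloaded installer.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"pretend this is setup.exe")
    demo_path = tmp.name
try:
    published = hashlib.sha256(b"pretend this is setup.exe").hexdigest()
    print(matches_published(demo_path, published))  # True
    print(matches_published(demo_path, "0" * 64))   # False
finally:
    os.remove(demo_path)
```

As the post says, this only helps if the published digest arrived over a channel the attacker could not also tamper with.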
30. Re:Bogus argument by Hatta · 2013-06-20 07:20 · Score: 5, Informative
  
  But unless and until he reads AND UNDERSTANDS every line of the source he is
  always going to have to be trusting somebody somewhere.
  Even if he reads and understands every line of the source, he's still trusting someone. He has to read and understand every line of the source code of the compiler he is using, and the compiler that compiled that compiler, and so on.
  Reflections on trusting trust is almost 30 years old now. It should be well known.
  
  --
  Give me Classic Slashdot or give me death!
31. Re:Bogus argument by Anonymous Coward · 2013-06-20 07:36 · Score: 0
  
  That might be the original source, but that doesn't change the fact that he was borrowing from the Watchmen.
32. Re:Bogus argument by K.+S.+Kyosuke · 2013-06-20 07:39 · Score: 2
  
  Who compiles the compiler?
  You hand-translate it to machine code to create the first binary. Then you apply it to itself. It's a time-proven technique.
  
  --
  Ezekiel 23:20
33. Re:Bogus argument by Anonymous Coward · 2013-06-20 07:41 · Score: 2, Funny
  
  Yes, but how can you be SURE that's the original source?
34. Re:Bogus argument by Anonymous Coward · 2013-06-20 07:43 · Score: 0
  
  Yeah, really... If you want an exact binary copy you can use cp. You want to have the source so you can produce DIFFERENT binaries! :-P
35. Re:Bogus argument by frost_knight · 2013-06-20 07:47 · Score: 5, Informative
  
  For true malice there's also The Underhanded C Contest.
  From their home page: "The goal of the contest is to write code that is as readable, clear, innocent and straightforward as possible, and yet it must fail to perform at its apparent function. To be more specific, it should do something subtly evil."
  
  --
  It always takes longer than you expect, even when you take into account Hofstadter's Law. --Hofstadter's Law
36. Re:Bogus argument by aristotle-dude · 2013-06-20 07:55 · Score: 4, Informative
  
  "Exact binaries" is not the point of having the source code.
  Uh, you must not have worked in a shop that does continuous integration automated builds? Do you really think QA should be handed binaries that you compile and have them trust them?
  The problem is that GCC will always give you a different binary every time you compile from the same source. This makes it impossible to verify that the binary you received comes from the source you claim to have used. You can get around this by never receiving binaries from anywhere but the automated build machine but it would still be useful to be able to test that a build that you received was built from the code you expect.
  There were several reasons why Apple moved away from the GCC tool chain to LLVM and Clang but one of the abilities of the LLVM stack is that you can actually get identical binaries from the same source compiled on different machines at different times.
  
  --
  Jesus was a compassionate social conservative who called individuals to sin no more.
37. Re:Bogus argument by UnknownSoldier · 2013-06-20 07:59 · Score: 3, Insightful
  
  Exactly.
  I've recompiled Vim because I wanted to fix Vim's broken design of being unable to distinguish between TAB and Ctrl-I, doesn't support CapsLock remap, and wanted a smaller executable not needing all the bells and whistles of the kitchen sink.
  I've recompiled Notepad++ due to a bug (couldn't select a font smaller than 8 pts because the array was hard-coded in two different places. WTF?)
  If you want to be able to quickly tell the quality of an open source project, see how easy it is to follow the directions to even produce an executable. Most open source projects have shitty docs on how to even compile it.
38. Re:Bogus argument by Anonymous Coward · 2013-06-20 08:03 · Score: 0
  
  nah, it's easy.
  use the source as a guideline, but don't read it, hand compile it. you can fix bugs, and catch vulns and malware as you go.
  this technique will spell the end of zero days, because it will take much longer than that to compile stuff.
39. Re:Bogus argument by Anonymous Coward · 2013-06-20 08:04 · Score: 0
  
  Open source/source code is not really meant for a "non computer person". I don't expect my mom (who is an accountant) to be able to figure out how to build a source file (but then I can't do my taxes). It's meant for a person with knowledge about computers and software engineering. Any software professional worth his salt will be able to look at the source and figure out how to build it. If a person has problems figuring out compiler switches or dependencies he/she should probably start with something easier than trying to build unknown software on a system they know nothing about.
40. Re:Bogus argument by Anonymous Coward · 2013-06-20 08:05 · Score: 0
  
  Coincidentally, "What If?" #34 -- an all-joke issue -- had a page of Watchers watching the Watchers watching the Watchers watching the Watchers, ad nauseam, before the Watchmen was written. I remember Elmer Fudd was one of the Watchers.
41. Re:Bogus argument by bn557 · 2013-06-20 08:35 · Score: 1
  
  yacc, I do believe
  
  --
  Humans are slow, inaccurate, and brilliant; computers are fast, accurate, and dumb; together they are unbeatable
42. Re:Bogus argument by Anonymous Coward · 2013-06-20 08:37 · Score: 0
  
  ever hear of test cases? even better: include automated test suites as source included as part of the source for the thing in question. The test can be run on either the retrieved binary or the locally built binary.
43. Re:Bogus argument by jimbolauski · 2013-06-20 08:55 · Score: 1
  
  It's been a while since I did any parsing but I remember some saying about Lex having to go feed Yacc tokens. So wouldn't you need Lex first?
  
  --
  Knowledge = Power
  P= W/t
  t=Money
  Money = Work/Knowledge so the less you know the more you make
44. Re:Bogus argument by mrogers · 2013-06-20 09:08 · Score: 4, Informative
  
  The latest alpha release of the Tor Browser uses a deterministic build process for exactly that reason: users of open source software (or the small minority of users with the necessary technical skills) should be able to check that the published binaries match the published source exactly - no malware, no easter eggs, no backdoors. If someone detects a mismatch, they can alert the rest of the community.
  Mike Perry, who spent six weeks getting deterministic builds working for Tor, has some interesting thoughts on why this is an important issue for security tools, even if the users completely trust the developers.
  I'd like to see more open source projects following Tor's lead. Gitian is a deterministic build tool that might help - it enables multiple people to build a binary from the same source and check that they get identical results.
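
To make the "multiple people build and compare" idea concrete, here is a small Python sketch. It is not Gitian's actual interface; the builder names, digests, and the `compare_builds` function are all hypothetical:

```python
from collections import Counter

def compare_builds(digests):
    """Given {builder name: hex digest of their binary}, return the most
    common digest and the sorted list of builders who got something else.
    On a tie, the digest seen first in the mapping is treated as the majority."""
    if not digests:
        raise ValueError("no builds to compare")
    counts = Counter(digests.values())
    majority_digest, _ = counts.most_common(1)[0]
    dissenters = sorted(b for b, d in digests.items() if d != majority_digest)
    return majority_digest, dissenters

# Hypothetical reports from three independent builders of the same tag.
reports = {
    "alice": "9f2c" * 16,
    "bob": "9f2c" * 16,
    "carol": "41aa" * 16,
}
majority, dissenters = compare_builds(reports)
print(dissenters)  # ['carol']
```

A mismatch does not say who is wrong, only that someone's toolchain, inputs, or machine differs and is worth a closer look.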
45. Re:Bogus argument by Anonymous Coward · 2013-06-20 09:11 · Score: 0
  
  To borrow from the Bible: "The ways of the Real Programmer are inscrutable."
46. Re: Bogus argument by Anonymous Coward · 2013-06-20 09:13 · Score: 0
  
  Have you tried turning off optimization and all security features like stack randomization? In reality you do NOT want two builds to produce the same binary. That is one of the reasons to build it yourself. (And who patches binaries anymore?)
47. Re:Bogus argument by flimflammer · 2013-06-20 09:26 · Score: 1
  
  I'm so glad that contest returned.
48. Re:Bogus argument by Anonymous Coward · 2013-06-20 09:44 · Score: 1
  
  That constitutes a GPL violation for sure. Their best hope would be that none of the kernel contributors for the pieces that they're using in those Android devices decide to sue in a manner like Actiontec and Verizon got with the first FiOS routers over Busybox. All it'll take is ONE over that stunt. If they shipped it on the devices, they can claim "leak" all they want to- it's still a willful Copyright violation on their part.
  There's a reason Verizon and Actiontec CAVED. You don't want to be on the business end of that sort of lawsuit. It means a statutory damage of at least $150,000 per each work so violated (for each time it's violated...). Samsung should be told that they don't want to GO down that path. It's not pretty and it pretty much will break the back of any company that gets found guilty of it- the party, so found for, becomes a primary creditor in any bankruptcy proceedings. Basically, Samsung loses if it gets to court at that point because it's a clear violation with millions of units in the field with the willful infringement.
  I know I'm disinclined to have a Samsung phone after the dismal showing they did with the Galaxy Nexus and the fact that the SOBs allowed Verizon to convince them to lock it down so I couldn't load my own firmware on the Galaxy S4. With this...I'm even more disinclined.
49. Re:Bogus argument by donaldm · 2013-06-20 09:47 · Score: 1
  
  "Exact binaries" is not the point of having the source code.
  With a Linux distribution you have repositories which have checksums on their binaries. However, the lead-up article says this is "severely limiting the whole point of running free software". Nice statement, but is this not a bit "trollish", since you can really say the same about all vendors who produce binaries? At least with Open Source you can look at the code or get someone to do it for you; however, at some stage it still comes down to trust.
  
  --
  There ain't no such thing as proprietary standards only proprietary formats. Standards are by definition open.
50. Re:Bogus argument by Carewolf · 2013-06-20 09:47 · Score: 1
  
  Bullshit! I know of speed optimizations that are specifically disabled in GCC because they would make it less deterministic. Not less deterministic on the same machine, but giving different results between different machines (it controlled the cache and search depth for some optimization).
  Second. There is a nice tool called icecream or icecc. It is used to distribute gcc jobs across many different machines. One funny detail of it is that 1% of all jobs are run on two machines and have the results compared, just to detect rogue or broken machines. While using this tool for more than 10 years, I have still NEVER seen gcc produce two different binaries even when run on different machines.
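
The spot-check idea is easy to sketch in Python. This is not icecream's real code; the worker functions, the `run_with_spot_check` name, and the default 1% rate as a parameter are assumptions for illustration:

```python
import random

def run_with_spot_check(job, workers, rng, check_rate=0.01):
    """Run a job on one worker; with probability check_rate, also run it on a
    second worker and raise if the two outputs are not byte-identical."""
    primary, secondary = rng.sample(workers, 2)
    result = primary(job)
    if rng.random() < check_rate:
        if secondary(job) != result:
            raise RuntimeError("workers disagree on job %r" % (job,))
    return result

def honest_worker(source):
    """Stand-in for a deterministic compile: same input, same output."""
    return b"obj:" + source

def broken_worker(source):
    """Stand-in for a rogue or faulty machine."""
    return b"obj:" + source[::-1]

rng = random.Random(2013)  # seeded only so the demo is repeatable
print(run_with_spot_check(b"int x;", [honest_worker, honest_worker], rng, check_rate=1.0))
```

The scheme only works because the compile is expected to be deterministic, which is the point being argued here.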
51. Re:Bogus argument by Beat+The+Odds · 2013-06-20 09:49 · Score: 1
  
  "Exact binaries" is not the point of having the source code.
  Uh, you must not have worked in a shop that does continuous integration automated builds? Do you really think QA should be handed binaries that you compile and have them trust them?
  The problem is that GCC will always give you a different binary every time you compile from the same source. This makes it impossible to verify that the binary you received comes from the source you claim to have used. You can get around this by never receiving binaries from anywhere but the automated build machine but it would still be useful to be able to test that a build that you received was built from the code you expect.
  There were several reasons why Apple moved away from the GCC tool chain to LLVM and Clang but one of the abilities of the LLVM stack is that you can actually get identical binaries from the same source compiled on different machines at different times.
  You've confused my post and gotten it EXACTLY BACKWARDS. But thanks for playing....
  The point that I was making is that source code does not always build "exact binaries". Now if someone is giving you both a binary and the source code and claiming that the source produced the binary, there is really no way to prove that one way or another. But at least you know that you can build a binary from the source code and know what is in THAT binary.
52. Re:Bogus argument by RabidReindeer · 2013-06-20 09:57 · Score: 2
  
  Who compiles the compiler?
  You hand-translate it to machine code to create the first binary. Then you apply it to itself. It's a time-proven technique.
  Yeah, you could do that.
  On the other hand, the early Unix compilers were actually translators that converted C to assembler. You could simply check the assembly code to ensure no surprises, then scan the object code to make sure it matched the assembly source. I spent enough time reading core dumps in my misspent youth that I could disassemble object modules in my head.
  More recent compilers generate a generic assembler which they then reduce, optimize and generate code from, but the same tactics can be used.
  Having spent a certain amount of time twiddling the innards of gcc, however, I can say that the theoretical Evil Compiler isn't likely to go unnoticed for very long in the world. Proprietary compilers may be able to hide malfeasance, but when you've compiled the toolchain from scratch and are running a debug process on it, the number of ways that you can do things that the malware-hiding gremlins didn't anticipate is pretty large.
  You might be able to get away with it in a "Mission Impossible" scenario, but not as a general open-to-the-public situation.
53. Re:Bogus argument by c++0xFF · 2013-06-20 10:36 · Score: 2
  
  First off, [Citation Needed]. This is simply not true from my experience. I've done this many times with GCC and produced identical output (or so diff says). One caveat: make sure you start from a clean directory structure each time, because your Makefile might list dependencies in different orders for the linker if not everything is recompiled, and I think that can produce different results. But this is the build system presenting different input to the compiler, not the compiler itself producing different output. Please provide a citation to set me straight, if my experience does not cover some other situation.
  However ... why should we be relying on the compiler to produce identical output in the first place? Shouldn't we instead be fingerprinting the source code and shipping that alongside (or even inside) the binary? That fingerprint would be enough to verify that their binary matches a certain revision of the code, regardless of whatever magic the compiler might do. (For extra bonus points, include the compile arguments in the fingerprint!)
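  A minimal sketch of that fingerprinting idea, assuming the "fingerprint" is a SHA-256 digest over every file in the source tree plus the compile arguments. The function names and the tagging scheme are made up for illustration and do not come from any real build tool.

```python
import hashlib
from pathlib import Path


def _feed(digest, tag, payload):
    # Tag and length-prefix each field so field boundaries are unambiguous.
    digest.update(tag + len(payload).to_bytes(8, "big") + payload)


def source_fingerprint(root, compile_args):
    """Return a SHA-256 hex digest of all files under root plus the compile arguments."""
    root = Path(root)
    digest = hashlib.sha256()
    # Sort by relative path so the result does not depend on directory listing order.
    files = sorted(
        (p for p in root.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(root).as_posix(),
    )
    for path in files:
        _feed(digest, b"N", path.relative_to(root).as_posix().encode("utf-8"))
        _feed(digest, b"D", path.read_bytes())
    for arg in compile_args:
        _feed(digest, b"A", arg.encode("utf-8"))
    return digest.hexdigest()
```

  A build script could print this value and ship it next to the binary. Note that it only ties the binary to a source revision if you trust whoever ran the build, which is the same caveat the other replies raise.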
54. Re:Bogus argument by TapeCutter · 2013-06-20 10:51 · Score: 1
  
  Same thing happens with windows compilers. Assuming the people involved trust each other, the simplest method is to always build from a tag and embed the tag into the binary. Use something like the unix "strings" command to find the tag when comparing two binaries.
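  Roughly what the "strings" lookup amounts to, as a sketch. The `BUILD_TAG=` prefix is a hypothetical convention chosen for this example, and the scanner only approximates what strings(1) does.

```python
import re


def printable_strings(data, min_len=4):
    """Yield runs of printable ASCII at least min_len bytes long, roughly like strings(1)."""
    for match in re.finditer(rb"[\x20-\x7e]{%d,}" % min_len, data):
        yield match.group().decode("ascii")


def find_build_tag(data, prefix="BUILD_TAG="):
    """Return the text following the first occurrence of prefix, or None if absent."""
    for text in printable_strings(data):
        pos = text.find(prefix)
        if pos != -1:
            return text[pos + len(prefix):]
    return None
```

  Reading two binaries with `open(path, "rb").read()` and comparing the two `find_build_tag` results tells you whether they claim the same tag. It does not prove they were built from it.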
  
  --
  And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
55. Re:Bogus argument by donaldm · 2013-06-20 10:58 · Score: 3, Insightful
  
  There are very talented people that can hide things in only a few lines of code. See http://ioccc.org/ for some examples that will make your skin crawl.
  True, but any programmer who works in a professional way should document their code so that it is maintainable. Those programmers who think that their code should be hard to read because that is a good way of keeping their job eventually come down to earth with a thud when their manager tells them, "The door is over there, please watch your fingers on the way out". Usually hard-to-read code is thrown out and a fresh start is made, since it is sometimes much quicker to do this, especially if the System Designer (not the programmer) has documented the concept properly. On a more serious note, companies that don't have a well-documented overview design and code are asking for trouble down the line.
  
  --
  There ain't no such thing as proprietary standards only proprietary formats. Standards are by definition open.
56. Re:Bogus argument by Anonymous Coward · 2013-06-20 11:15 · Score: 0
  
  LLVM can't provide the same output and use the OS' include files.
57. Re:Bogus argument by Anonymous Coward · 2013-06-20 11:32 · Score: 0
  
  Mainly because different environments produce different binaries.
  For example, take the ffmpeg source code from here, with the ./configure options shown here. Using GCC 4.7.3 will produce different binaries than the ones produced by GCC 4.8.0.
  Another good example would be this one. Binaries produced by VS2010, VS2012 and Intel C++ varied widely by size, even if you have not changed a single line at all.
58. Re:Bogus argument by Anonymous Coward · 2013-06-20 11:40 · Score: 0
  
  Another option is to just write the first compiler supporting a minimum of features in a different language. A compiler is just a parser that maps symbols to machine code or some meta language.
59. Re:Bogus argument by rthille · 2013-06-20 12:15 · Score: 1
  
  How do you get the bytestream into the computer? Are you waving your arms and affecting cosmic rays, or are you relying on software, many many layers of software?
  
  --
  Awesome furniture, accessories and cabinetry in Santa Rosa, CA: http://humanity-home.com/
60. Re:Bogus argument by Anonymous Coward · 2013-06-20 12:16 · Score: 0
  
  Even then, truly underhanded things can happen in the compiler. The source can be *completely* clean and the binary can still end up with a backdoor:
  http://en.wikipedia.org/wiki/Trusting_trust#Reflections_on_Trusting_Trust
61. Re:Bogus argument by BrokenHalo · 2013-06-20 12:19 · Score: 1
  
  I know I'm disinclined to have a Samsung phone after the dismal showing they did with the Galaxy Nexus
  As a matter of interest, what do you consider a dismal showing? I have a GNex, (rooted, running stock 4.2.2) and have on the whole been pretty happy with it, despite the fact that it was dropped by my telco only 6 weeks after I took out the contract. The only serious failing as far as I'm concerned is the lack of provision for more SD storage.
62. Re:Bogus argument by Anonymous Coward · 2013-06-20 12:26 · Score: 0
  
  Well boy howdy aren't you a genius? Sometimes I just wanna get down and lick your kneecaps.
63. Re:Bogus argument by Anonymous Coward · 2013-06-20 12:26 · Score: 1
  
  Huh?
  First, a.c:
  int main(){ return 0;}
  $ gcc -v
  Using built-in specs.
  COLLECT_GCC=gcc
  COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.7.2/lto-wrapper
  Target: x86_64-redhat-linux
  Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --disable-build-with-cxx --disable-build-poststage1-with-cxx --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
  Thread model: posix
  gcc version 4.7.2 20121109 (Red Hat 4.7.2-8) (GCC)
  $ gcc -o a.out a.c
  $ gcc -o b.out a.c
  $ diff a.out b.out
  $ echo $?
  0
  There are things that you can do that will make the output differ. One example I've seen, with another compiler, is enabling additional warning messages changed the output of the optimizer. The explanation is that the compiler limited the amount of memory it would use, for all operations. With extra warning output, less memory was available for the optimizer, which would in turn reduce the number of optimizations attempted.
  Second, I think something was dropped from the sentence: "This makes it impossible that the binary you received comes from the source you claim to have used."
  Perhaps the intention was something along the lines of: This makes it possible that the binary you received comes from the source you claim to have used, even though a subsequent build produces different output.
64. Re:Bogus argument by Anonymous Coward · 2013-06-20 13:01 · Score: 1
  
  The precise term is called formal verification. We use this all the time in the logic design industry to prove a given Verilog or VHDL source text matches the output of a synthesis tool. Conformal and Formality are the main tools in this area.
  This is possible for hardware design languages because arbitrary input is not allowed, even though both of the above languages are Turing complete. Any looping constructs must be unrolled and assigned physical resources statically, which forces constant loop iteration counts (usually less than 64). This essentially reduces the problem to comparing directed data-control flow graphs below a given size, but comparison can still take days.
  C/C++ on the other hand can potentially include long-running loops and other constructs which cannot easily be proved equivalent in a reasonable amount of time. I'm only aware of them because there has been 20+ years of effort to bring 'C++ to gates' tools to production quality and lack of formal verification is a major problem.
65. Re:Bogus argument by Anonymous Coward · 2013-06-20 14:39 · Score: 1
  
  Wasn't there a backdoor in the GCC that someone fessed up to that basically inserted itself into any compiler compiled with it?
66. Re:Bogus argument by Anonymous Coward · 2013-06-20 14:49 · Score: 0
  
  Source-based distributions such as Gentoo even go as far as to do the actual creation of the binary on your local machine.
  Notice also that oftentimes the various Linux distros make their own patches to packages, so what you get in the binary reflects not only the upstream source and the various configuration and compiler settings that the distribution provides, but also the distro's patches.
  Gentoo (which I use on all of my machines) also has patches to go along with many of its ebuilds. It's very easy to look at the patches yourself as well as the tarballs that come from the upstream. You can go a step further and make local versions of ebuilds and apply your own patches. I do that every once in a while.
  With Gentoo, I don't have to worry about where the binary comes from.
67. Re:Bogus argument by Yvanhoe · 2013-06-20 15:27 · Score: 0
  
  I rely on debian to do that.
  
  Debian provides binaries but their binaries are automatically compiled from sources. If you are unable to create a source package that compiles correctly, it cannot be part of debian.
  
  --
  The Wise adapts himself to the world. The Fool adapts the world to himself. Therefore, all progress depends on the Fool.
68. Re:Bogus argument by int19 · 2013-06-20 15:38 · Score: 1
  
  The problem is that GCC will always give you a different binary every time you compile from the same source. This makes it impossible that the binary you received comes from the source you claim to have used.
  This is not true.
  Where I work I generate embedded firmware using GCC. When the code is ready, QA is provided a tag of the sources and CRCs I generate from my build, and then they build the source code and verify their CRCs against mine to ensure they can build and release what I intended. These CRCs are then published along with the binary when released to the customer, who has the ability to pull the code and build tools from escrow in case we go bankrupt. They know they can build code 100% identical to what we provided them.
  Where you may get into trouble is using macros like __DATE__ and __TIME__. One solution is to run a binary diff against the two executables and manually verify that only this changed. The other is of course to just not use them.
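  The "binary diff and verify that only this changed" step could be automated along these lines. This is a sketch that assumes you already know the byte ranges where the date and time strings land in the image; the function names are invented for the example.

```python
def differing_offsets(a, b):
    """Return the byte offsets at which a and b differ; any length mismatch counts too."""
    shared = min(len(a), len(b))
    offsets = [i for i in range(shared) if a[i] != b[i]]
    offsets.extend(range(shared, max(len(a), len(b))))
    return offsets


def differs_only_in(a, b, allowed_ranges):
    """True if every differing offset lies inside one of the half-open (start, end) ranges."""
    return all(
        any(start <= off < end for start, end in allowed_ranges)
        for off in differing_offsets(a, b)
    )
```

  For instance, if an 11-byte `__DATE__` string such as "Jun 20 2013" sits at offset 4, passing `[(4, 15)]` accepts a rebuild made on another day and still rejects a change anywhere else.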
69. Re:Bogus argument by int19 · 2013-06-20 15:43 · Score: 1
  
  You may be interested in this article. We have had mixed success with it.
70. Re:Bogus argument by greg1104 · 2013-06-20 15:48 · Score: 1
  
  Even when I download Linux packages in C, it's often the case that I can't compile them, because I'm missing some obscure library that the original developer just assumed I had.
  But when you're using a packaged Linux distribution like RedHat, Debian, or SuSE, all of this information is part of the package metadata in the source code packages. On Debian for example, if you request:
  apt-get build-dep xxx
  You'll get all of the packages needed to build package xxx. The same package metadata also contains the other things you were worried about, like compiler information. Having the source package metadata is why it's feasible to recreate the packages that came with many Linux distributions with only a bit of drift from the original. If you have a source RPM/DEB package, you just ask for it to be built and all of the requirements/compiler stuff is taken care of for you.
71. Re:Bogus argument by Anonymous Coward · 2013-06-20 16:06 · Score: 0
  
  Last time I checked, gcc gave me perfectly reproducible results (when controlling for version). I think this is bullshit and certainly not the reason Apple uses LLVM.
72. Re:Bogus argument by SpaceLifeForm · 2013-06-20 16:17 · Score: 1
  
  Yeah, but. Still a massive troll article.
  
  --
  You are being MICROattacked, from various angles, in a SOFT manner.
73. Re:Bogus argument by Bing+Tsher+E · 2013-06-20 16:17 · Score: 1
  
  You solder it into a diode array, silly.
74. Re: Bogus argument by Anonymous Coward · 2013-06-20 16:22 · Score: 0
  
  they can do that?
75. Re:Bogus argument by icebike · 2013-06-20 16:24 · Score: 0
  
  As opposed to other distros whose binaries are compiled from used windows exe files and yesterday's underwear?
  
  --
  Sig Battery depleted. Reverting to safe mode.
76. Re:Bogus argument by MikeBabcock · 2013-06-20 16:45 · Score: 1
  
  Actually no, it's a real issue that people should understand when they go about their lives using software. Software may not be what you think it is on any given day, and the importance of that cannot be overstated after the number of worms, infections, trojans and other malice computers have been exposed to over the years.
  
  --
  - Michael T. Babcock (Yes, I blog)
77. Re:Bogus argument by MikeBabcock · 2013-06-20 16:47 · Score: 1
  
  At no point did compiling the software himself fail in the article. I've only rarely had packages fail to build when I use distro source packages.
  
  --
  - Michael T. Babcock (Yes, I blog)
78. Re:Bogus argument by AJWM · 2013-06-20 16:52 · Score: 2
  
  Sometimes it's exactly the point of having the source code.
  Take voting machines for example. I used to work for a company that certified same. This involved obtaining everything that the vendor didn't write (compilers, OS, libraries, etc) from the 3rd party vendors (Microsoft, etc) including Linux from Scratch for the linux-based systems, then compiling it all (thus creating a "trusted build") and comparing the binaries.
  No exact match, no certification. (This was after the vendor's source code went through a line-by-line inspection -- and where necessary, correction -- for all kinds of crap.)
  
  --
  -- Alastair
79. Re:Bogus argument by AJWM · 2013-06-20 16:56 · Score: 2
  
  Not the GCC, but Ken Thompson's original C compiler. And it was Thompson who fessed up. The other thing that compiler (allegedly) did was insert a back door any time it compiled the "login" program.
  
  --
  -- Alastair
80. Re:Bogus argument by gawbl · 2013-06-20 17:58 · Score: 3, Informative
  
  I used to work on GCC, and the randomness you describe would have made it impossible to find bugs.
  GCC is deterministic. If you feed it the same input and launch it with the same options, it generates the same output. GCC developers would never tolerate random behavior.
  Is it possible that you have address randomization turned on in your OS? I used to use watchpoints & similar in the heap, and this would only work if randomization (ASLR/PAX) is disabled.
81. Re:Bogus argument by bill_mcgonigle · 2013-06-20 18:14 · Score: 1
  
  The problem is that GCC will always give you a different binary every time you compile from the same source.
  Baloney - look at CentOS, an entire distribution that's binary-identical to its upstream while compiled on a completely separate system.
  
  --
  My God, it's Full of Source!
  OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
82. Re:Bogus argument by stretch0611 · 2013-06-20 18:30 · Score: 1
  
  It's pretty easy to hide obfuscated functionality in a mountain of code (in fact it seems far too many programmers pride themselves on their obfuscation skills). I would worry more about the mountain he missed while staring at the mole-hill his compile environment induced.
  Many "programmers" do it so well now, that they even fool themselves. And it comes so naturally to them, they do not even realize it is happening.
  
  --
  Looking for a job?
  Want your resume written professionally?
  DON'T USE TUNAREZ!!!
83. Re:Bogus argument by jones_supa · 2013-06-20 20:17 · Score: 1
  
  On Windows, however, this is admittedly a problem, since _everybody_ simply downloads an exe file from somewhere, without even checking the md5 hash that is usually provided (however, in most cases, in vain because the website is not even SSL secured).
  On the other hand, on Windows the binaries often are signed which helps a bit...I guess?
84. Re:Bogus argument by lourd_baltimore · 2013-06-20 20:20 · Score: 1
  
  The problem is that GCC will always give you a different binary every time you compile from the same source.
  
  I tried compiling "Hello World" using GCC 4.4.3 and then building it again five minutes later. The executables were binary identical. Is what you said only true for non-trivial cases, unlike "Hello World"?
85. Re:Bogus argument by Anonymous Coward · 2013-06-20 20:22 · Score: 0
  
  Just a quick note, please don't put "here" or "this" as your link texts, as this forces the readers to hover over them to see what you mean. :)
86. Re:Bogus argument by Anonymous Coward · 2013-06-20 20:32 · Score: 0
  
  Unfortunately I couldn't read it due to the font used - very hard on the eyes.
87. Re: Bogus argument by garaged · 2013-06-20 22:41 · Score: 1
  
  Oh, come on, I know I will be modded troll, but this is a fact of life: when was the last time you felt 100% confident in politicians' decisions? Almost every aspect of life is subject to trust betrayal, some of them having betrayal more often than not.
  
  --
  I'm positive, don't belive me look at my karma
88. Re:Bogus argument by K.+S.+Kyosuke · 2013-06-21 00:31 · Score: 1
  
  Who compiles the compiler?
  You hand-translate it to machine code to create the first binary. Then you apply it to itself. It's a time-proven technique.
  Yeah, you could do that.
  On the other hand, the early Unix compilers were actually translators that converted C to assembler. You could simply check the assembly code to ensure no surprises, then scan the object code to make sure it matched the assembly source. I spent enough time reading core dumps in my misspent youth that I could disassemble object modules in my head.
  More recent compilers generate a generic assembler which they then reduce, optimize and generate code from, but the same tactics can be used.
  I completely forgot an even more obvious idea: You write an interpreter. If the compiler is complex enough, it should be simpler than hand-translating the compiler (and less error-prone). Then, you use the interpreter to compile the first binary of the compiler. Then, you use the first binary of the compiler to compile the second binary of the compiler. As a quick check, you compare the binary files. If the interpreter faithfully emulated the evaluation of the compiled programs, the binaries should be identical.
  
  --
  Ezekiel 23:20
89. Re:Bogus argument by ConceptJunkie · 2013-06-21 03:03 · Score: 1
  
  I'm sure there are very, very few programmers that think that their code should be hard to read. On the other hand, there is a never-ending stream of programmers who think they know what they are doing, but produce the same result. I work with that type's legacy every day.
  
  --
  You are in a maze of twisty little passages, all alike.
90. Re: Bogus argument by ConceptJunkie · 2013-06-21 03:10 · Score: 1
  
  The difference is that we expect not to be able to trust politicians, and not just because they tend to be immoral.
  Code is precisely definable and precisely defined. If I give you the source code to my program, then you can reproduce it exactly. And I'm assuming that by "source code", I mean not just the code, but the precise details used to build it.
  The only way you get a different binary than me, assuming you build it correctly, etc., is if I'm lying or mistaken about what I give you. It's not a matter of human judgement, information I have that you don't (because source code by definition is all the information), imprecision of language or any of the other uncertainties that govern pretty much every human interaction and enterprise, especially those of politicians.
  
  --
  You are in a maze of twisty little passages, all alike.
91. Re:Bogus argument by Andy+Dodd · 2013-06-21 03:21 · Score: 1
  
  Unfortunately, as I understand it, a copyright holder has to actually go forward with a suit.
  The Busybox guys are quite willing to do this, but as I understand it, most of the kernel guys don't bother. For most companies, just the bad PR alone is enough to ensure compliance (unless the company is Chinese - Oppo is one of the only Chinese manufacturers to bother with GPL compliance, the rest just don't give a shit), but Samsung seems to think they can override that with their marketing budget. Nearly all GPL enforcement suits are over Busybox, not the kernel.
  
  --
  retrorocket.o not found, launch anyway?
92. Re:Bogus argument by Anonymous Coward · 2013-06-21 04:34 · Score: 0
  
  So what vintage of the compiler were they using in 200 AD?
  Probably version .00000000000000000000101
93. Re:Bogus argument by Anonymous Coward · 2013-06-21 04:44 · Score: 0
  
  All these issues go away with FoxPro.
  Laugh all you want, it's solid and it's free of most of the bullshit that programmers in other languages have to go through that has nothing to do with the reason for writing the software in the first place. And it's self-contained - decompilers are available at a reasonable price and it doesn't even depend on the Windows registry - move it anywhere and it'll just run.
  Yes, there are some limitations - I wouldn't try to write a commercial application like Excel in it - but if you're working with the end user to solve a business problem, and you're getting paid for producing business logic, not writing compilers, it's pretty hard to beat.
94. Re: Bogus argument by MikeBabcock · 2013-06-21 06:01 · Score: 1
  
  This has no correlation to politics. This is the software that you depend on every day to run your computer. Those of us who make an effort from a security standpoint to actually review source code realize what a real issue this is in the world. That medical device you'll depend on if you're in hospital, are you sure it's not open to random worm attacks? Because some of them are. How would you review it? The answer 'make it open source' doesn't make it better if the article's points are relevant.
  
  --
  - Michael T. Babcock (Yes, I blog)
95. Re:Bogus argument by hazeii · 2013-06-21 07:33 · Score: 1
  
  >The use case is "we're using this binary in production, which we didn't build ourselves"
  ...
  >Isn't that the strongest practical use case for Open Source in the business world? Sure, you don't plan on maintaining it yourself but you could if you have to.
  Bzzzt. You can't be sure you can maintain it if you didn't compile it yourself in the first place (the binary you have may not have been built from the available source).
  
  --
  All your ghosts are just false positives.
96. Re:Bogus argument by david_thornley · 2013-06-21 08:34 · Score: 1
  
  Thompson's device only works if there's only one compiler available. Given two independent ones, it's fairly easy to tell if either is Thompsoned. You don't actually have to trust either of them, only that they don't share that code.
  Take compilers A and B. Compile B with A and also with B. You'll get different binaries, but if the compilation actually works they'll do the same thing. Now, take both those binaries, and compile A. The results should be almost identical.
  
  --
  "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
97. Re:Bogus argument by redlemming · 2013-06-21 11:38 · Score: 1
  
  I'd like to see more open source projects following Tor's lead.
  
  It's not just open source projects that should be doing this. There should be a legal requirement that all software companies eventually make their source code available in a well documented form that can be built with readily available tools to exactly match the released binaries.
  Long term public oversight over business is an important public goal in its own right (consider all the environmental disasters that businesses have been involved in when this oversight didn't happen), and also something that arises as part of the right to long term public oversight over government. Historically, governments that have not been able to do certain things directly have hired (or even coerced) third parties -- such as businesses -- into doing the very thing the government is not allowed to do, in an attempt to do an "end-run" around the rules. This sort of thing can result in serious violations of fundamental rights, and public oversight over business is necessary to prevent it from happening (or to catch the government in the act). For software companies, oversight to be practical must take the form of examining the source code.
  It is for this reason, of course, that software companies who put clauses in their licenses prohibiting disassembly or reverse engineering are acting contrary to the public interest. In the USA, such clauses are appropriately considered to violate fundamental rights arising under the 9th Amendment (rights retained by the people) and the 10th Amendment (rights reserved to the people).
98. Re:Bogus argument by romons · 2013-06-21 11:59 · Score: 2
  
  link to story about trojan compiler
  
  --
  Go to Heaven for the climate, Hell for the company -- Mark Twain
99. Re:Bogus argument by Rakarra · 2013-06-25 10:29 · Score: 2
  
  Oh man. I sort of feel sorry for the runner-up of the 2009 contest. From the evaluation: "The bug is plausibly deniable as poor coding, and rests on your caffeine-addled inability to notice a ‘0’ instead of a ‘\0’ when testing for end-of-string. The comparison in safe_strcmp has unnecessary terms, which achieves two evil goals: first, it sets up a pattern that fools your eyes, and second, it looks just amateurish enough that the bug, if found, looks like a sophomoric mistake rather than an intentional backdoor."
  Now the problem is.. any time he makes a minor coding glitch, he'll be accused of putting in a super-secret backdoor, because he did that once for some contest. :-)
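  To see why a '0' in place of a '\0' matters, here is a small illustration. It is emphatically not the contest entry's code, just a hypothetical comparison loop written in Python over byte buffers, where the only thing that changes is which byte value counts as end-of-string.

```python
NUL = 0          # the byte value of '\0', the real C string terminator
ZERO = ord("0")  # 48, the printable digit that is easy to mistake for it


def c_style_equal(a, b, terminator):
    """Compare two byte buffers the way a C loop would, stopping at terminator."""
    i = 0
    while i < len(a) and i < len(b):
        if a[i] != b[i]:
            return False
        if a[i] == terminator:
            return True  # both buffers reached the "end of string" together
        i += 1
    return len(a) == len(b)
```

  With `NUL` as the terminator, `b"s3cret0-real-tail\x00"` and `b"s3cret0-anything\x00"` compare as different. With `ZERO`, the loop stops at the digit and calls them equal, so everything after the first '0' is never checked.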
100. Re:Bogus argument by MasterPatricko · 2013-06-25 11:12 · Score: 1
  
  On Linux, packages (rpm, deb) are almost always signed by a distribution key, which needs root access to accept.
  On Windows binary signing just gives you a company name associated with the exe, which I think is regularly ignored by users ...
  
  --
  I'd tell a UDP joke, but you may not get it. I'd tell a TCP joke, but I'd have to keep repeating it until you got it.
WTF... by Anonymous Coward · 2013-06-20 06:23 · Score: 0

I think I'm done with slashdot. The "articles" have just become tweets in disguise.
1. Re:WTF... by Anonymous Coward · 2013-06-20 06:33 · Score: 0
  
  How would anyone know? "Articles" are hypothetical. We read summaries, if that.
2. Re:WTF... by KGIII · 2013-06-20 12:42 · Score: 1
  
  The thread, the comments specifically, are actually pretty educational, informative, and potentially valuable. To me that is one of the main reasons that I frequent this site and this thread isn't, to me at least, anywhere near worth complaining about as compared to some of the others that have been approved.
  So, having said that, this thread is much better than it could be - which is an excellent thing. The subject, headline, article content, summary, and the likes aren't really that bad. I guess that it is, of course, subjective and all that but it seems like a perfectly acceptable topic for Slashdot and is fitting for this site. I don't think the source of content is of great importance so long as the information imparted is appropriate, contextually related to the general interests of the site at large, and informative.
  In this case it appears to have raised legitimate questions which people are interested in. The evidence for this is in the number of replies and in the number of dialogues being had because of this posting.
  I'm not trying to talk you out of leaving Slashdot, not at all - I think you should, as pessimism isn't really helpful no matter what, but I'm trying to point out that your post is illogical in a variety of ways. I think that I should point out that the most important error in reasoning your post includes is that it assumes that people have any regard for your opinion on the subject and that you are assuming we value your presence enough to be concerned with your continued participation. It is, shall we say, unlikely that your opinion(s) are going to influence the content of the site.
  But, well, I digress. I think it would have been more simple (and effective) for me to have simply posted, "Well... Bye."
  
  --
  "So long and thanks for all the fish."
3. Re:WTF... by kermidge · 2013-06-20 21:39 · Score: 1
  
  Well said. Also, being one of the nuts that often RTFA, including when asked to drink from the firehose, I value the posts written by people who at least seem to know what they're talking about on technical issues. This thread is a case in point; I may never compile my own stuff from source, but now I have some excellent background of things to be aware of when doing so.
What a problem by Anonymous Coward · 2013-06-20 06:23 · Score: 0

Has anybody thought about recompiling the source and seeing if you get the same binary?
1. Re:What a problem by jedidiah · 2013-06-20 06:28 · Score: 3, Insightful
  
  ...or just using a binary that you compiled from source yourself.
  For a lot of projects, that's not nearly as hard as some people like to make it sound.
  
  --
  A Pirate and a Puritan look the same on a balance sheet.
2. Re:What a problem by Qzukk · 2013-06-20 06:29 · Score: 1
  
  Has anybody thought about recompiling the source and seeing if you get the same binary?
  The article says you can try, but you don't.
  
  --
  If I have been able to see further than others, it is because I bought a pair of binoculars.
3. Re:What a problem by cold+fjord · 2013-06-20 06:29 · Score: 1
  
  Has anybody thought about recompiling the source and seeing if you get the same binary?
  That doesn't necessarily work unless you have the exact same build environment (libraries, compilers, etc.), and compiler settings.
  
  --
  much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
4. Re:What a problem by gweihir · 2013-06-20 06:31 · Score: 1
  
  That is what the OP is talking about.
  Suddenly it becomes obvious what the AC posting possibility is really about...
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
5. Re:What a problem by Synerg1y · 2013-06-20 06:32 · Score: 1
  
  I thought so... the build environment does affect the final hash. However, thinking about this logically most places you can get the source code and executable from the same place... and if the executable matches... how paranoid can you be?
  If you're getting the alleged source code to Windows 9 from some guy in Nigeria though, set your expectations accordingly.
6. Re:What a problem by X0563511 · 2013-06-20 06:33 · Score: 1
  
  Differing library, linker, compiler versions, configurations, and parameters would all change the output. You'd have to use the exact same system for the two builds, or you are not guaranteed to get a byte-for-byte duplication.
  
  --
  For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
7. Re:What a problem by h4rr4r · 2013-06-20 06:37 · Score: 5, Funny
  
  Hey now, you have to be pretty IT savvy to type ./configure, make and make install all in the same day. Some of us make good money doing that, don't just go suggesting everyone should be doing it.
8. Re:What a problem by arth1 · 2013-06-20 06:44 · Score: 4, Insightful
  
  Has anybody thought about recompiling the source and seeing if you get the same binary?
  Has anybody thought of reading the article before posting questions like this?
  That said, this particular "article" isn't worth the waste of bytes it takes up. It's like seeing a 6 year old trying to explain a combustion engine.
  Binaries will almost always differ - if nothing else because you need the entire environment exactly like the binary builder. Not just the time stamps, compile paths, hostnames and account names, which are the obvious.
  If your compiler or linker is a minor version off what he used, the results can be very different, even if using the same compile options.
  But that's not enough: If your hardware is different, randomization of functions in a library will be different.
  To flesh out his article a bit more, the author could have done a test with two different Gentoo systems. Different but mostly compatible hardware, and a slight difference in the toolchain. That might have opened his eyes.
  Then again, probably not.
9. Re:What a problem by TheRaven64 · 2013-06-20 06:52 · Score: 5, Insightful
  
  Most of the time, even that isn't enough. C compilers tend to embed build-time information as well. For Verilog, they often use a random number seed for the genetic algorithm for place-and-route. Most compilers have a flag to set a specified value for these kinds of parameters, but you have to know what they were set to for the original run.
  Of course, in this case you're solving a non-problem. If you don't trust the source or the binary, then don't run the code. If you trust the source but not the binary, build your own and run that.
  
  --
  I am TheRaven on Soylent News
10. Re:What a problem by Anonymous Coward · 2013-06-20 07:01 · Score: 0
  
  No make test? The flying screens of white text on black it produces are tangible evidence of your "productivity"! Never make packages without it!
11. Re:What a problem by cold+fjord · 2013-06-20 07:09 · Score: 2
  
  However, thinking about this logically most places you can get the source code and executable from the same place... and if the executable matches... how paranoid can you be?
  How paranoid do you want to be? Reflections on Trusting Trust - Ken Thompson
  Today, in day-to-day practice, you are on "reasonably safe ground" if you get the executable from either the authoritative source or an associated mirror, and it matches the published cryptographic checksum/hash value (md5, SHA, etc.). Of course, if you can build from source, after checking the checksum of the source archive and of any libraries you need to add, you should be in good shape as well. (And it isn't necessarily a bad thing to plan ahead and grab a copy, and then wait a little bit for either source or patches. Sometimes a patch turns out to break things. Rarely, you will find out there was an intrusion last week at the site you grabbed your software from. Not being on the bleeding edge sometimes gives you an added buffer.) And I would avoid building and testing on production systems - use separate build & test systems, even if they are virtual machines under something like VMware or VirtualBox.
  It is a good practice to make use of checksums to check the validity of important files being copied or archived as well since sometimes the process can go badly for various reasons.
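  As a minimal sketch of the checksum check described above, assuming the project publishes a SHA-256 hex digest next to the download (the function names `sha256_of_file` and `matches_published` are made up for illustration, and a real release may use a different hash or a signed checksum file):

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large archives do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """True if the file's digest equals the published one (case-insensitive)."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

  This only shows that the file you have is the file the site published. It says nothing about whether that binary was built from the published source.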
  Your point about Windows source from Nigeria is spot on. Dealing in stolen code, more generally, is seldom an aid to doing anything legal, and may cause enormous problems.
  
  --
  much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
12. Re:What a problem by h4rr4r · 2013-06-20 07:22 · Score: 1
  
  Only the hourly employees use that. Us salaried folks just leave early since we earned it by skipping that and increasing our productivity.
13. Re:What a problem by Anonymous Coward · 2013-06-20 08:00 · Score: 0
  
  Will they ever fix the mess that is C++ compilation? Whenever I try Java or C# I can install a dev environment and start writing working code before noon, while doing the same thing for C++ takes a day just to set up the IDE. And then I've not yet even figured out which binary versions of libraries I need, how to install them, and how to actually link them into the application.
  God help you if you want to do cross-platform development on the big three. Setting up MinGW on Windows is a bitch and every downloadable binary seems built for Visual Studio and incompatible with other compilers, unless you go down the rabbit hole of compiling every lib and its dependencies from source. What could be so hard about standardizing the structure of C++ libraries so that they can talk to one another as long as they run on the same architecture?
  On Linux installing all that stuff is rather easy in comparison, if you have the right distro. However there always seems to be at least one package in the repos that is of the wrong version, and the whole OS depends on it and installing the older/newer one you need is somehow impossible without messing up your whole system.
14. Re:What a problem by Anonymous Coward · 2013-06-20 09:16 · Score: 0
  
  > set up the IDE
  You're doing it wrong.
15. Re:What a problem by Anonymous Coward · 2013-06-20 09:28 · Score: 0
  
  If you trust the binary, but not the source, too bad. That's the problem here. Companies can use GPL software, release the source code, which may or may not be the actual code to build the binary, and not get caught. The binary isn't "bad." The company just didn't fulfill its GPL obligations.
16. Re:What a problem by Anonymous Coward · 2013-06-20 11:27 · Score: 0
  
  Too bad this 6 year old article is above your head. He should have written on a 5 year old level. Maybe you would have gotten the point then.
Being able to is nice, but who has the time? by intermodal · 2013-06-20 06:24 · Score: 4, Interesting

Given the scale of most modern programs' codebases, good luck actually reviewing the code meaningfully in the first place. That said, if you're really that concerned about the binary matching the source, run a source-based distro like Gentoo or Funtoo. For most practical purposes, though, users find binary distributions like Debian/Ubuntu or the various Red Hat-based systems to be a more effective use of their time.

--
In SOVIET RUSSIA... erm...NSA AMERICA, the Internet logs onto YOU!
1. Re:Being able to is nice, but who has the time? by marcello_dl · 2013-06-20 07:12 · Score: 1
  
  You can also get the source packages from Debian/Ubuntu and compile them yourself, all in one command:
  apt-get -b source packagename
  Source debs also have the good habit of putting the modifications to the upstream package in a separate diff.
  
  --
  ---- MISSING MISCELLANEOUS DATA SEGMENT --- [sigdash] trolololol
The obvious thing is by Chrisq · 2013-06-20 06:24 · Score: 4, Insightful

If you are that paranoid study the source code then recompile
1. Re:The obvious thing is by gl4ss · 2013-06-20 06:55 · Score: 1
  
  If you are that paranoid study the source code then recompile
  yeah if he is bothering to read through it he should quite easily be bothered enough to compile it as well.. that's what he was going to do anyhow to compare.
  also, you could clone the compile chain of popular linux distros as well, without fuss. it's not like they hide their build system behind closed doors.
  
  --
  world was created 5 seconds before this post as it is.
2. Re:The obvious thing is by Jon+Stone · 2013-06-20 07:00 · Score: 1
  
  That's not guaranteed to address the problem. http://cm.bell-labs.com/who/ken/trust.html To compile the source code you used the binary compiler...
3. Re:The obvious thing is by Lumpy · 2013-06-20 07:11 · Score: 1
  
  If you are truly paranoid you write it yourself.
  
  --
  Do not look at laser with remaining good eye.
4. Re:The obvious thing is by MrEricSir · 2013-06-20 07:14 · Score: 1
  
  If you're really going to be paranoid, how do you know your machine isn't compromised? I hope you're doing a bit-for-bit comparison on your hard drive twice a day to make sure there are no file changes you didn't approve, and that you've soldered the top off your CPU and put it under a high power microscope to ensure the circuits haven't been changed.
  
  --
  There's no -1 for "I don't get it."
5. Re:The obvious thing is by arth1 · 2013-06-20 07:28 · Score: 1
  
  also, you could clone the compile chain of popular linux distros as well, without fuss. it's not like they hide their build system behind closed doors.
  That's not a guarantee. You need to replicate the environment as it was at the time the binary was compiled, not their current environment. That, at least, is doable, if they keep good records. (Never mind that most of the distros don't keep exact track, but only track release changes, and don't track exactly when minor changes are rolled out to the build environments. There are exceptions, but those are few and far between.)
  However, you may also need to duplicate things like the paths the source was compiled in, hostnames, user names, timestamps, contents of ccache, and the exact behavior of /dev/random on that system.
  You might as well disassemble the binaries and compare them to your own disassembled binaries. It's an exercise in futility, and to prove what?
6. Re:The obvious thing is by Anonymous Coward · 2013-06-20 08:32 · Score: 0
  
  If you are that paranoid study the source code then recompile
  That's obviously pointless unless you read and understand the entire codebase of the source for the program, the libraries it uses, the compiler, the tool chain et al. In the real world, you'll be dead from old age before you can do a comprehensive study.
7. Re:The obvious thing is by Anonymous Coward · 2013-06-20 12:29 · Score: 0
  
  "It's an exercise in futility, and to prove what?"
  So far as I can tell, to prove that you've got far too much time on your hands.
  (Not you, arth1. Whoever would actually do this.)
Timestamps make a difference by Anonymous Coward · 2013-06-20 06:24 · Score: 0

Lots of builds include a timestamp or use it so this isn't always guaranteed.
I like to use auto-generated hash signatures of code in my builds when I want to know an exact version or even exact build of the same source tree.
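One possible way to do this, sketched in Python and not necessarily what my own builds do: hash every file's relative path and contents in a fixed order, so the same source tree always yields the same identifier no matter when or where it is built. The name `source_tree_hash` is hypothetical.

```python
import hashlib
import os


def source_tree_hash(root):
    """Hash relative paths and contents of every file under root, in sorted order."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # fix the traversal order so the result is deterministic
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            # Include the path so that renaming a file changes the hash.
            digest.update(rel.encode("utf-8") + b"\0")
            with open(full, "rb") as handle:
                digest.update(hashlib.sha256(handle.read()).digest())
    return digest.hexdigest()
```

Embedding this value in the build instead of a timestamp identifies the exact source tree without making two builds of the same tree differ.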
1. Re:Timestamps make a difference by Samantha+Wright · 2013-06-20 06:44 · Score: 1
  
  In TFA, that was the major source of difference. Debian, Fedora, and OpenSUSE packages were tested; Debian differed only in the timestamps, OpenSUSE had a few lingering debug features, and the Fedora binary was a little weirder (perhaps the result of a different compiler version?)
  
  --
  Bio questions? Ask me to start a Q&A journal. Computer analogies available for most topics!
Exact match by Anonymous Coward · 2013-06-20 06:24 · Score: 0

The trouble with trying to get an exact match is there are so many variables. Do you have the same operating system, the same architecture, the same versions of the same libraries, the same version of the same compiler? What about the same compiler flags? Unless all of those things are an exact match, the odds of getting a matching binary are slim. Really, though, it becomes a bit of a moot point because, once you have the source code, you can create your own binary and don't have to wonder if the previous binary was a match.
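To make the list of variables concrete, here is a rough sketch that treats each build environment as a plain dictionary and reports which fields differ. The field names and version strings are invented for illustration; a real record would need many more entries (paths, locale, library versions and so on).

```python
def environment_mismatches(ours, theirs):
    """Return the sorted names of build-environment fields whose values differ."""
    keys = set(ours) | set(theirs)
    # A field missing on one side counts as a mismatch.
    return sorted(k for k in keys if ours.get(k) != theirs.get(k))


if __name__ == "__main__":
    ours = {"compiler": "gcc 4.7.2", "flags": "-O2", "arch": "x86_64"}
    theirs = {"compiler": "gcc 4.8.1", "flags": "-O2", "arch": "x86_64"}
    print(environment_mismatches(ours, theirs))
```

An empty result is necessary but not sufficient for a byte-identical build, since timestamps and similar build-time data can still differ.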
Then recompile it by Anonymous Coward · 2013-06-20 06:24 · Score: 0

Now you have a binary which "corresponds" to the source code.
Gentoo by Anonymous Coward · 2013-06-20 06:26 · Score: 0

Thus narrowing the issue to binaries in stage3 archive.
1. Re:Gentoo by Bill,+Shooter+of+Bul · 2013-06-20 07:04 · Score: 1
  
  -funroll-loops, the breakfast of champions.
  
  --
  Well.. maybe. Or Maybe not. But Definitely not sort of.
2. Re:Gentoo by jones_supa · 2013-06-20 21:22 · Score: 1
  
  Fun Roll Loops would also be a great name for a rollercoaster.
touch o' hyperbole by ahree · 2013-06-20 06:27 · Score: 5, Insightful

I'd suggest that "severely limiting the whole point of running free software" might be a touch of an exaggeration. A huge touch.
1. Re:touch o' hyperbole by Anonymous Coward · 2013-06-20 06:39 · Score: 1
  
  Call me stupid, but if you bother to build your own binary, why would you download the binary at all instead of running the one you compiled?
2. Re:touch o' hyperbole by MozeeToby · 2013-06-20 06:59 · Score: 4, Interesting
  
  The issue the author is bringing up is that you have no way to easily determine that the published binary is, in fact, functionally identical to the published source code. Imagine you write an app that accesses private data and open source it, saying "check the source, the only thing we use the data for is X". And if you look at the source, that's certainly true. But there's no way to verify that the binary download was built from the published source; especially if the resulting binary is different every time you build it and different if you build it on different machines with different configurations. So, everyone who grabs the binary instead of building from source is taking it on trust, just like proprietary software, that the program does what it claims.
3. Re:touch o' hyperbole by Anonymous Coward · 2013-06-20 07:04 · Score: 1, Informative
  
  This would be true if an executable binary was some kind of quantum black box, like the inside of a proton or whatever. In actual fact, a binary is machine code that can be disassembled, and you can compare the version you compiled with the published version in as much detail as you like. The article writer found that usually the "difference" was a build timestamp, because duh.
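  A rough sketch of that kind of comparison, assuming the two builds have the same length (real binaries often shift by a few bytes, in which case you would compare section by section or diff the disassembly instead). The function name `differing_ranges` is made up for this example.

```python
def differing_ranges(a, b):
    """Return (start, end) half-open ranges where two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs differ in length; compare section by section instead")
    ranges = []
    start = None
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y and start is None:
            start = i  # a run of differing bytes begins here
        elif x == y and start is not None:
            ranges.append((start, i))  # the run ended just before i
            start = None
    if start is not None:
        ranges.append((start, len(a)))
    return ranges
```

  If the only ranges reported fall inside a known timestamp field, the two builds are the same in every way that matters.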
4. Re:touch o' hyperbole by idontgno · 2013-06-20 07:06 · Score: 0
  
  OK, stupid.
  Well, it was your idea.
  Anyway, sometimes the binaries and the sources are downloaded together. This happens a lot in single-tarball releases, for instance.
  But yeah, more often, if you don't want the binaries, you can certainly avoid downloading them, or unpackaging them and running them if you did download them.
  Source-only tarballs, for instance, or just source packages.
  Of course, you still have to run the binaries of your operating system. And your toolchain.
  But no one would ever corrupt those.
  
  --
  Welcome to the Panopticon. Used to be a prison, now it's your home.
5. Re:touch o' hyperbole by gmuslera · 2013-06-20 07:16 · Score: 2
  
  It is a big point anyway: independent auditing. Someone, somewhere, could say that the binary my distribution gave me had a backdoor that isn't in the code they published (e.g. because they were forced by law to add it and not disclose it), and I could even check or rebuild it. With closed source you don't have that freedom; it is even against the law to try to find out. And in the current US-pushed cyberwar state of things (they are already trying this kind of thing), the possibility of independent auditing of the code you are running is what makes the difference against non-open-source software.
  It is not that I will actively recompile all the software I use, but that if something wrong is happening, I will have the opportunity to know.
Hyperbole? by Anonymous Coward · 2013-06-20 06:27 · Score: 0

"severely limiting the whole point of running free software"
Yet somehow we survive!
Incorrect suppositions. by Microlith · 2013-06-20 06:28 · Score: 5, Insightful

A simple analysis shows that this is very hard in practice, severely limiting the whole point of running free software."
No it doesn't. The whole point of running free software is knowing that I can rebuild the binary (even if the end result isn't exactly the same) and, more importantly, freely modify it to suit my needs rather than being beholden to some vendor.
1. Re:Incorrect suppositions. by Shoten · 2013-06-20 06:48 · Score: 5, Insightful
  
  A simple analysis shows that this is very hard in practice, severely limiting the whole point of running free software."
  No it doesn't. The whole point of running free software is knowing that I can rebuild the binary (even if the end result isn't exactly the same) and, more importantly, freely modify it to suit my needs rather than being beholden to some vendor.
  There's another point too...which incidentally is the whole point of running a distro like Gentoo...that you can compile the binary exactly to your specifications, even sometimes optimizing it for your specific hardware. I don't get at all this idea he has about "reproducible builds;" if he builds the same way on the same hardware, he'll get the same binary. But what he's doing is comparing builds in distros with ones he did himself...and the odds that it's the same method used to create the binary are very low indeed.
  If he's concerned about precompiled binaries having been tampered with, he's looking at the wrong protective measure. Hashes and/or signing are what is used to protect against that...not distributing the source code alongside the compiled binary files. If you look at the source code and just assume that a precompiled binary must somehow be the same code "just because," you're an idiot.
  
  --
  
  For your security, this post has been encrypted with ROT-13, twice.
2. Re:Incorrect suppositions. by Anonymous Coward · 2013-06-20 06:49 · Score: 0
  
  I usually want the binary that I compiled to be different from the original. There may be a new x86 chip with a few nice op-codes, or a GPU chip that will do vector arithmetic ONLY IF I re-compile.
  Sometimes, the package maintainer failed to build it right. There's a Python lxml connector library for RPM that can seg-fault using the SAX parser. Its GCC optimization-level doesn't match the binary "lxml.so", and linkage fails. I rebuilt it & now I can have SAX rather than a big DOM memory footprint.
3. Re:Incorrect suppositions. by Anonymous Coward · 2013-06-20 06:55 · Score: 0
  
  yes, exactly.
  how is it that the 'open source' movement got so fixated on shipping binaries instead of shipping an environment that made it possible to work with the _source_?
  note that spending an hour trying to replicate the version of 'ifconfig' that i'm running, with the exact path set and no guarantee that it was built in the same way, isn't really the same level of access as, for example, BSD, where i can always recompile from the exact source if i choose
4. Re:Incorrect suppositions. by Anonymous Coward · 2013-06-20 07:06 · Score: 0
  
  A simple analysis shows that this is very hard in practice, severely limiting the whole point of running free software."
  No it doesn't. The whole point of running free software is knowing that I can rebuild the binary (even if the end result isn't exactly the same) and, more importantly, freely modify it to suit my needs rather than being beholden to some vendor.
  +1. Just ask any business owner who would rather not "upgrade" (Ha!) to Windows 8.
  Free software always lasts exactly as long as every tool should: Until it is no longer useful.
5. Re:Incorrect suppositions. by cecom · 2013-06-20 07:50 · Score: 2
  
  The whole point is that the distro build is supposed to be 100% reproducible, with the exception of things like timestamps and signatures. And it is with Debian, as he found out. But not the other distros he tried. And that is a real problem.
  Why, naive people might ask? Because that is the only way to verify that a binary is what it claims to be. And it is the only way to reliably support and diagnose something. It is shocking how few people on Slashdot realize that.
6. Re:Incorrect suppositions. by tibman · 2013-06-21 05:42 · Score: 1
  
  The environment that lets you build from source is still there. The real downside is most package managers don't let you do that in a managed way. Gentoo started as source only and has now built out the ability to install binaries. But most of Gentoo is still built entirely from source, including the kernel.
  
  --
  http://soylentnews.org/~tibman
Not a concern by gweihir · 2013-06-20 06:29 · Score: 4, Insightful

If you need to be sure, just compile it yourself. If you suspect foul play, you need to do a full analysis (assembler-level or at least decompiled) anyways.
The claim that this is a problem is completely bogus.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
1. Re:Not a concern by bunratty · 2013-06-20 06:37 · Score: 2
  
  Compiling it yourself does not necessarily help.
  
  --
  What a fool believes, he sees, no wise man has the power to reason away.
2. Re:Not a concern by wisnoskij · 2013-06-20 07:18 · Score: 1
  
  With the optimization going on at compile time I do not see an assembler level analysis necessarily giving you any more information than a binary compare of the binaries.
  
  --
  Troll is not a replacement for I disagree.
3. Re:Not a concern by gweihir · 2013-06-20 08:33 · Score: 1
  
  I am well aware of that paper. It is a rather academic problem though, as keeping this going and hidden is in practice hugely complicated, with a high risk of producing bizarre errors that some people will actually investigate at the assembler level. I have personally looked into several bizarre things over the years and would likely have found such an attack. Turned out it was other things, like incorrect overflow handling (gzip checksum mangled by a GCC version, data was fine), a 64-bit float implementation instead of the required 80-bit one for standard IEEE 754 (qemu FPU emulation), and some others. There will be a lot of people around that can do this and will do it from time to time. Somebody would notice, unless this was a targeted attack. To avoid a targeted attack, just use a signed compiler package, e.g. from Debian.
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
4. Re:Not a concern by gweihir · 2013-06-20 08:35 · Score: 2
  
  That is not the point. The point is that comparing binaries will just give you a mismatch, unless you re-create exactly the same build environment. That is often infeasible.
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
5. Re:Not a concern by wisnoskij · 2013-06-20 09:23 · Score: 1
  
  So will comparing assembly level code, unless you re-create exactly the same build environment. That is often infeasible.
  
  --
  Troll is not a replacement for I disagree.
6. Re:Not a concern by gweihir · 2013-06-20 09:56 · Score: 1
  
  So? What has that to do with analyzing assembly code?
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
7. Re:Not a concern by wisnoskij · 2013-06-20 10:13 · Score: 1
  
  "If you suspect foul play, you need to do a full analysis (assembler-level or at least decompiled) anyways."
  
  --
  Troll is not a replacement for I disagree.
8. Re:Not a concern by Xtifr · 2013-06-20 10:21 · Score: 1
  
  Assuming you're dealing with a very simple compiler and a very simple version of init, then yes, Thompson's hack is worth worrying about. But in the real world, where init and the compiler have been rewritten almost completely from scratch several times, and change drastically even when they're not being rewritten from scratch, you'd need a degree of AI simply unavailable in this world in order to "recognize" the relevant code the way Thompson's hack did. Simple pattern matching isn't going to get you there after decades of hacking and rewriting.
  But for the record, I have loaded gcc source code into a C interpreter, and used that to compile gcc, and confirmed that my resulting binary compiler was identical to one built with a normally-compiled version of gcc. I did it to stress-test the interpreter, but checking for Thompson's bug was a secondary motivation.
9. Re:Not a concern by gweihir · 2013-06-20 14:05 · Score: 1
  
  Yes, but an assembler-level analysis has nothing to do with comparing it to some other code. In fact, you do not need any other code, you just read the assembler code and figure out what it does. Whether you could recreate a compiling process that creates the same assembler code is completely immaterial as you neither need nor want it for such an analysis.
  I think you have no clue what you are talking about. Read up on the "Dunning-Kruger Effect", it seems you are on the far left side of their curves.
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
10. Re:Not a concern by wisnoskij · 2013-06-20 14:36 · Score: 1
  
  I do not actually think it would be possible to know everything that a modern binary converted to assembly does. It is hard enough reading someone else's code, but we are talking about something 5000 times longer, and infinitely more complex than that. So I doubt it is even theoretically possible at that stage. We are talking about medium sized programs becoming 50 million lines long, we are talking about functions and classes not existing, and execution jumping all over the place wherever and whenever it wants. And code not written by a human being, but spit out by a computer.
  
  --
  Troll is not a replacement for I disagree.
11. Re:Not a concern by gweihir · 2013-06-21 02:53 · Score: 1
  
  It is not something everybody can do, but it is definitely possible. How do you think people analyze malware? Also note that you can look at suspicious points first, and work backwards from them.
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
12. Re:Not a concern by Anonymous Coward · 2013-06-21 05:09 · Score: 0
  
  I think that malware is likely a lot smaller than say an OS or a word substitute.
  but more to the point I think that ALL (and I mean every single one) analysis of malware is looking for specific things, and uses search functions followed by analysis of short pieces of code. and at no time does anyone try to follow program flow and understand the program fully. ---wisnoskij
This is my darkest fear... by erroneus · 2013-06-20 06:30 · Score: 2

It's a fair argument. If you are not compiling your binaries, how do you know what you have is compiled from the source you have available?
Truth? You don't. If you suspect something, you should investigate.
1. Re:This is my darkest fear... by SirGarlon · 2013-06-20 06:43 · Score: 1
  
  If you suspect something, you should investigate.
  And on an open-source OS, you can.
  
  --
  [Sir Garlon] is the marvellest knight that is now living, for he destroyeth many good knights, for he goeth invisible.
2. Re:This is my darkest fear... by jdunn14 · 2013-06-20 06:47 · Score: 2
  
  Sorry to tell you, but Ken Thompson talked about how you pretty much have to trust someone back in 1984: http://cm.bell-labs.com/who/ken/trust.html
  If no one else, you have to trust the compiler author isn't pulling a fast one on you....
3. Re:This is my darkest fear... by Bearhouse · 2013-06-20 06:48 · Score: 1
  
  It's a fair argument. If you are not compiling your binaries, how do you know what you have is compiled from the source you have available?
  Truth? You don't. If you suspect something, you should investigate.
  You're right, of course. But that's not quite the (non) argument he was making, I think.
  My understanding was that he wanted to check how easy it was to get the same result if he compiled the publicly available source and compared it to the objects.
  Turns out the results are slightly different, due to datestamps etc., but no biggie.
  Anyway, in a production environment you should be compiling from source, since - security concerns aside - that's the only way to be sure you've got the correct source for your objects.
Not the whole point of free software. by Anonymous Coward · 2013-06-20 06:31 · Score: 0

The point of free software isn't that you can know that a particular binary is from particular code.
The point is that you have the code available for inspection and that you can modify and build it yourself.
If your build behaves differently it will soon become clear that that binary is not the same.
Problems with verifying the binaries from source by tooslickvan · 2013-06-20 06:32 · Score: 5, Funny

I have recompiled all my software from the source code and verified that the binaries match but for some reason there's a Ken Thompson user that is always logged in. How did Ken Thompson get into my system and how do I get rid of him?
Are You Sure This Is the Source Code? by jedidiah · 2013-06-20 06:32 · Score: 2

> Are You Sure This Is the Source Code?
Yes. Yes I am sure. I built it myself. It even includes a few of my own personal tweaks. It does a couple of things that the normal binary version doesn't do at all.

--
A Pirate and a Puritan look the same on a balance sheet.
1. Re:Are You Sure This Is the Source Code? by wisnoskij · 2013-06-20 07:20 · Score: 1
  
  But given that the optimization phase of compiling/building can be significant, and there are lots of different optimization options, why would you not be better off leaving that up to the code maintainers?
  
  --
  Troll is not a replacement for I disagree.
2. Re:Are You Sure This Is the Source Code? by Anonymous Coward · 2013-06-20 12:35 · Score: 0
  
  Well, ain't you just the damndest cutest little button out there? I sure do hope it was that kernel you were editing and I'll slap me vittles if it were.
3. Re:Are You Sure This Is the Source Code? by viperidaenz · 2013-06-20 13:44 · Score: 1
  
  Because the code maintainers use the same compilers and build scripts, hence the same optimisations.
  Compiling/building is not a manual process.
4. Re:Are You Sure This Is the Source Code? by Anonymous Coward · 2013-06-20 17:32 · Score: 0
  
  No. Because I don't have the same hardware they have.
  Right?
  Because when there are optimizations for everything in the binary, I'm no longer running an "optimized for my system" binary.
  May as well be on Windows at that point.
Poor testing, waste of time by Anonymous Coward · 2013-06-20 06:33 · Score: 0

Most distributions use mostly identical software, so chances are you end up with identical gcc and so are comparing identical behaviours. Not very useful.
FreeBSD now has a binary patch system and to that end someone worked out how to create binary diffs from freshly built packages against older ones. One of the major pitfalls is timestamps inserted by the compiler. Adjust for that and re-creating suddenly gets a lot more predictable.
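A toy sketch of the "adjust for timestamps" idea, not how the FreeBSD tooling actually works: zero out the byte ranges known to hold volatile data before hashing, so two builds that differ only there produce the same digest. The function name `masked_digest` and the idea that the offsets are already known are assumptions; in a real binary you would have to locate them from the file format.

```python
import hashlib


def masked_digest(data, volatile_ranges):
    """SHA-256 of data with each (start, end) half-open range zeroed out."""
    buf = bytearray(data)
    for start, end in volatile_ranges:
        if not 0 <= start <= end <= len(buf):
            raise ValueError("range (%d, %d) is outside the data" % (start, end))
        buf[start:end] = bytes(end - start)  # overwrite the volatile bytes with zeros
    return hashlib.sha256(bytes(buf)).hexdigest()
```

With the timestamp range masked, a mismatch in the digests points at a difference in the actual code or data.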
Apparently this "tester" hasn't taken a very close look at what is really happening, but thought it more important to wax lyrical about his dreams and then moan that he couldn't make them reality.
Well, he hasn't really tried, I say. Consequently, his blogged moaning is a waste of time.
Some things wrong with TFA by vikingpower · 2013-06-20 06:33 · Score: 3, Informative

1) Submitter is the one who wrote the blog post 2) No cross-reference, no references, no differing opinions at all 3) "severely limiting the whole point of running free software" is more than a bit of an exaggeration

--
The religious speak to God. The insane are spoken to by God. When all shut up, one can finally hear Shostakovich in peace
1. Re:Some things wrong with TFA by Mike+Frett · 2013-06-20 07:02 · Score: 1
  
  I honestly don't understand the blog post, I'm not severely limited in any way. I somehow feel the user doesn't even know how to compile software and doesn't know anything about Open Source. It doesn't matter if the binary is the same, maybe his is compiled with different flags than mine or maybe I added a patch.
  This honestly smells of someone out to discourage usage of Open Source.
2. Re:Some things wrong with TFA by arth1 · 2013-06-20 07:13 · Score: 1
  
  This honestly smells of someone out to discourage usage of Open Source.
  Please run this statement through Hanlon's Razor.
  (Or, to put it another way, I don't think you're deliberately misleading when you use the word "honestly". The alternative is much more likely.)
Trust by bunratty · 2013-06-20 06:33 · Score: 5, Insightful

I took a graduate-level security class from Alex Halderman (of Internet voting fame) and what I came away with is that security comes down to trust. To take an example, when I walk down the street, I want to stay safe and avoid being run over by a car. If I think that the world is full of crazy drivers, the only way to be safe is to lock myself inside. If I want to function in society, I have to trust that when I walk down the sidewalk that a driver will not veer off the road and hit me.
When you order a computer, you simply trust that it doesn't have a keylogger or "secret knock" CPU code installed at the factory. It's exactly the same with software binaries, of course. In the extreme case, even examining all the source code will not help. You must trust!

--
What a fool believes, he sees, no wise man has the power to reason away.
1. Re:Trust by jdunn14 · 2013-06-20 06:49 · Score: 1
  
  So very true. In the end it all comes down to trust and as I posted above (before noticing yours) Thompson explained it extremely well.
2. Re:Trust by saveferrousoxide · 2013-06-20 07:07 · Score: 1
  
  Maybe those just aren't good examples, but both have way more than simple trust involved. There's a huge disincentive to perpetrate either of those actions. In the case of a driver, there's car repairs, court costs, plus the downstream effects; running down a pedestrian, especially one on a sidewalk is a life altering action that no sane individual would perform just on a lark. In the case of an insecure computer, the company would be ruined if it came out that they were doing this to all the systems it sold, and targeting specific individuals would be prohibitively expensive.
  No, the ones to worry about are those who have a reward that outweighs the risk. Voting is an excellent example of this.
3. Re:Trust by mbone · 2013-06-20 07:10 · Score: 2
  
  You have an odd notion of trust. And of security, for that matter.
  Blindly trust nothing except the laws of physics. Everything else is subject to investigation and verification. Just because verification is difficult or may fail is no excuse for not trying. By being vigilant, you can approach security, although you will never fully get there.
  When I walk down the sidewalk, for example, I pay attention to the surroundings. How much attention is based on prior experience and knowledge of how likely drivers (bicyclists, say) are to be using the sidewalk, my observations of the current situation, and just how full that part of the world is with crazy drivers. This is informed by an implicit or explicit threat model (for example, is there a reason to expect someone would want to harm me). The threat model is, for example, different for me and the US President, which implies that we should have different ideas of the level of diligence required to walk down the same street, and indeed we do.
  It's no different for a new computer, software, or anything else. I have found "phone home" malware on machines, by looking at network traffic. And, looking at the source code may not help, but, then again, it may, so that doesn't mean it is useless to look at it, any more than the success of previous assassination attempts means that the President's security is worthless and should go home.
4. Re:Trust by Kjella · 2013-06-20 07:30 · Score: 3, Interesting
  
  So your argument is that there will always be risk, so there's no point in managing or minimizing it? To continue your car analogy, even if I'm at a pedestrian crossing I don't really trust cars to stop and I always throw a glance to make sure they've noticed me. An uncle of mine was witness to a horrible accident, old lady got run over in broad daylight in the middle of a well-marked crossing, perpetrator was an old half-blind fool who should have lost his license already or had and didn't care. Doesn't help the old lady one bit no matter how much they punish him anyway. You always trust lots of people, you trust the factory that built the brakes on your car and the mechanic who serviced them, you trust the people who built the bridge that it won't collapse from under you, but only because you lack any other practical alternative.
  With software you do have more and better choices, not perfect choices but it's a helluva lot harder for the NSA to place a spy bug in Linux than in Windows where they can just show up with a national security letter that is both instructions and gag order and violating either can land you in jail. If there are reasonable ways to prove that these are the exact versions and compiler settings used to produce this binary, then that is much stronger than trust. Trust is something that can be betrayed, while reproducible steps are something you can verify. In science, if one scientist told you here are the steps of my experiment, feel free to reproduce my results and the other said "I can't show you the data but the results are correct, trust me", who would you trust?
  
  --
  Live today, because you never know what tomorrow brings
5. Re:Trust by bunratty · 2013-06-20 08:48 · Score: 1
  
  No, that is not my point at all. Please re-read my post.
  
  --
  What a fool believes, he sees, no wise man has the power to reason away.
6. Re:Trust by Anonymous Coward · 2013-06-20 09:20 · Score: 0
  
  The point is whom do you trust. If you tell me that anyone who has proper skill can validate that that binary came from that source, I trust the binary more than if just the person who did the build can. For example, John did the build I trust John. But wouldn't it be better if Alice, Bob and Joe could also confirm that John did what he said he did?
  I have been wondering about this for a while. Is it possible that during the build an intermediate file could be produced that contained symbolic information that could link the binary and the source, allowing anyone to check that all three correspond?
Deterministic builds.. by 0dugo0 · 2013-06-20 06:33 · Score: 3, Interesting

..are a bitch. The number of hoops that e.g. the bitcoin developers jump through to prove they didn't mess with the build is large: running specific OS builds in emulators with fake system time and whatnot. No easy task.
Why not just use a source based distro like Gentoo by Anonymous Coward · 2013-06-20 06:33 · Score: 1

If this means that much to you, why not just use a source based distro like Gentoo (You can have the added bonus of it being tuned to your system)?
Logical Equivalency Checking by RichMan · 2013-06-20 06:37 · Score: 2

I do IC design. Logical Equivalency Checking is a well-worn tool. You can futz about with the logic in a lot of different ways. LEC means we can do all sorts of optimization and still guarantee equivalent function. We can even move logic from cycle to cycle and have it checked that things are logically equivalent.
You run two compilers on the same source code you won't get the same code. You run two different versions of the compiler on the same code you won't get the same code. You run the same compiler with different options you won't get the same code. They should however all be logically equivalent.
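To make "logically equivalent" concrete, here is a toy sketch in Python. It is not how real LEC tools work (it just brute-forces every input combination, which blows up exponentially), and the function names and the multiplexer example are made up for illustration.

```python
from itertools import product
from typing import Callable


def equivalent(f: Callable[..., bool], g: Callable[..., bool], n_inputs: int) -> bool:
    """Return True if f and g agree on every combination of n_inputs boolean inputs."""
    return all(
        bool(f(*bits)) == bool(g(*bits))
        for bits in product((False, True), repeat=n_inputs)
    )


# Hypothetical "reference" description of a 2-to-1 multiplexer.
def mux_reference(sel, a, b):
    return a if sel else b


# Hypothetical "optimized" form built from AND/OR/NOT gates.
def mux_optimized(sel, a, b):
    return (sel and a) or (not sel and b)


# A deliberately wrong rewrite: it forgets to gate b with NOT sel.
def mux_broken(sel, a, b):
    return (sel and a) or b


if __name__ == "__main__":
    print(equivalent(mux_reference, mux_optimized, 3))
    print(equivalent(mux_reference, mux_broken, 3))
```

Two differently structured implementations pass the check as long as they compute the same function, and the broken rewrite is caught because it differs for at least one input combination.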
1. Re:Logical Equivalency Checking by Anonymous Coward · 2013-06-20 06:59 · Score: 0
  
  I do IC design. Logical Equivalency Checking is a well-worn tool. You can futz about with the logic in a lot of different ways. LEC means we can do all sorts of optimization and still guarantee equivalent function. We can even move logic from cycle to cycle and have it checked that things are logically equivalent.
  You run two compilers on the same source code you won't get the same code. You run two different versions of the compiler on the same code you won't get the same code. You run the same compiler with different options you won't get the same code. They should however all be logically equivalent.
  Logical Equivalency Checking for software with unbounded memory is not possible (i.e., undecidable) because software with unbounded memory is Turing complete.
2. Re:Logical Equivalency Checking by Anonymous Coward · 2013-06-20 07:03 · Score: 0
  
  guarantee equivalent function
  Isn't that undecidable (in general)?
3. Re:Logical Equivalency Checking by Anonymous Coward · 2013-06-20 07:51 · Score: 0
  
  Parent is correct. A few more:
  You run two versions of the same compiler on the same code using the same options on slightly different hardware, you won't get the same code (if the compiler optimizes for things such as CPU cache size and speed, number of CPU cores etc.).
  Your code and/or your compilation options ask the compiler to initialize something in a random way, at compile time.
  You are using parallel make, and your build environment is structured such that the outcome will be different depending on the order in which files are actually compiled.
4. Re:Logical Equivalency Checking by slew · 2013-06-20 12:19 · Score: 1
  
  guarantee equivalent function
  Isn't that undecidable (in general)?
  It is decidable, but some cases are exponentially hard to decide (and the Logic Equivalence Checking or LEC tools will barf).
  Quick background. The task of converting a logical representation of a chip function to a physical representation is a very time consuming process, so often if a bug is found or a timing problem is found, you don't want to restart the process, but instead you probably want to make a small local change (equivalent to patching a binary to fix a bug rather than recompiling). But when you create that patch, how do you know that it does what you want?
  That's why they make LEC tools (like Formality). If you made a timing/retiming fix, you can be assured it doesn't change the functionality; if you made a small bug fix, you can fix the source code, run your tests, and know that the results apply to the patched version as well.
  If you tried to run LEC tools on a whole chip, it would generally barf. You generally have to run it on small blocks of the chip and build up your equivalence check for the whole thing hierarchically.
  LEC tools run in a reasonable computing resource configuration because big chunks of the design are exactly the same and the primitive binary operations in IC design have simple-to-define properties that are easy to describe using efficient structures like ROBDDs. If you tried to take this approach on a typical software executable, you would likely quickly find that either there is no efficient data structure to evaluate equivalence, or that the primitives that are reasonably efficient don't have simple-to-define properties (e.g., FP math is not associative, deltas in non-live/don't care parameters/registers/stack-values).
Yeah they're right! by Anonymous Coward · 2013-06-20 06:38 · Score: 0

I thought I knew Slashdot's source code... then Boom! I find this:
meta http-equiv="refresh" content="600"
Diverse Double-Compiling by David A. Wheeler by tepples · 2013-06-20 06:44 · Score: 5, Interesting

If you've compiled the compiler with competitors' compilers (try saying that ten times fast), you should be fairly safe from Trusting Trust.
1. Re:Diverse Double-Compiling by David A. Wheeler by bunratty · 2013-06-20 06:45 · Score: 3, Funny
  
  But nuking it from orbit is the only way to be sure.
  
  --
  What a fool believes, he sees, no wise man has the power to reason away.
2. Re:Diverse Double-Compiling by David A. Wheeler by Anonymous Coward · 2013-06-20 07:16 · Score: 0
  
  ::applause::
  Exactly the reference I was checking for before making a post.
  Mods, please mod up.
3. Re:Diverse Double-Compiling by David A. Wheeler by gweihir · 2013-06-20 08:38 · Score: 2
  
  Nice! While I think the threat is mostly academic, it is nice that somebody competent looked into defeating it.
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
4. Re:Diverse Double-Compiling by David A. Wheeler by Anonymous Coward · 2013-06-20 23:34 · Score: 0
  
  Don't be so fucking rational and keep the scared chickens hiding in the cupboard ! I really urge you as the Ballmer In Chief !!
Compiler flags make this ridiculously nitpicky. by Dputiger · 2013-06-20 06:45 · Score: 1

Unless I'm missing something pretty profound, even having the exact *source* won't always result in the exact binary. My understanding (and I could be wrong about this) is that you can take a well written program and plug it into multiple compilers. GCC may be one of the most popular options, but it's not the only one.
But compilers all optimize differently. GCC 3.x optimizes somewhat differently than GCC 4.x. You can tweak this behavior by manually setting compiler flags, or you can compile binaries that explicitly target different CPU architectures. A binary compiled to target all x86 processors may run differently on Haswell than a binary that's compiled specifically for Haswell.
In other words, flags set at compile time will change performance characteristics, even if the source code is identical, and while some projects may publish the exact details of every compiler flag they set, this doesn't seem to be the norm. Most projects I've seen say "Here are some binaries, and here's the source code if you want to play with it."
Clearly, the point of source code isn't to exactly duplicate every binary in every situation but to give you the data that goes *into* the compiler before the executable is compiled.
Or am I missing something?
1. Re:Compiler flags make this ridiculously nitpicky. by Anonymous Coward · 2013-06-20 06:55 · Score: 1
  
  Yes, you are. "A cherished characteristic of computers is their deterministic behaviour: software gives the same result for the same input. This makes it possible, in theory, to build binary packages from source packages that are bit for bit identical to the published binary packages. In practice however, building a binary package results in a different file each time. This is mostly due to timestamps stored in the builds. In packages built on OpenSUSE and Fedora differences are seen that are harder to explain. They may be due to any number of differences in the build environment. If these can be eliminated, the builds will be more predictable. Binary package would need to contain a description of the environment in which they were built.
  Compiling software is resource intensive and it is valuable to have someone compile software for you. Unless it is possible to verify that compiled software corresponds to the source code it claims to correspond to, one has to trust the service that compiles the software. Based on a test with a simple package, tar, there is hope that with relatively minor changes to the build tools it is possible to make bit perfect builds."
2. Re:Compiler flags make this ridiculously nitpicky. by Dputiger · 2013-06-20 07:47 · Score: 1
  
  Sure. But that's not my point.
  The question isn't "Can you get a bit-perfect result if you perfectly replicate everything?" The point is that real-world people working with real-world compilers are likely to see results obfuscated by different compiler versions, flags, and optimization options long before they get to the question of time stamps and offsets.
  Nothing you said indicates this *isn't* true.
Regulators by Anonymous Coward · 2013-06-20 06:45 · Score: 1

I've dealt with a case where a regulatory authority must review code and perform the build to match compiled artifacts with distributed binaries in a (large, linux based) embedded system. You can do it if you have absolute control over the build environment.
Funny things come up when you start analyzing compiled or archived build output. I had to modify squashfs tools to prevent uninitialized superblock struct members from causing unreproducible file systems... there are unused members in the struct that just pick up whatever happens to be on the stack at the time and put it in the file archive. In another case I wrote a cpio archive normalizer to 'fix' things like the device major/minor number that gets recorded in the archive. Also, readdir(3) does not sort, which matters when making reproducible archives. There are GCC macros (__TIME__, for instance) that will embed a timestamp in an object file that can be trouble as well. Also, gzip has an undocumented flag (-m, i believe) to prevent it from sticking a timestamp in a compressed file.
Hexdump, diff and md5sum are your friends. It's possible to do this but you have to go deep.
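A rough illustration of the kind of normalizing described above, as a Python sketch rather than the actual tooling used. It packs a directory into a .tar.gz with entries visited in sorted order, mtime/uid/gid/mode pinned to fixed values and the gzip header timestamp zeroed, so the same names and contents should give the same bytes. The name normalized_tar_gz is invented for this example; it only handles regular files and assumes Python 3.8+ for the mtime argument of gzip.compress.

```python
import gzip
import io
import os
import tarfile


def normalized_tar_gz(src_dir: str) -> bytes:
    """Pack src_dir so the output depends only on file names and file contents."""
    tar_buf = io.BytesIO()
    with tarfile.open(fileobj=tar_buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for root, dirs, files in os.walk(src_dir):
            dirs.sort()  # directory listing order is not guaranteed, so pin it
            for name in sorted(files):
                path = os.path.join(root, name)
                arcname = os.path.relpath(path, src_dir).replace(os.sep, "/")
                info = tarfile.TarInfo(arcname)
                info.size = os.path.getsize(path)
                info.mtime = 0          # drop the real modification time
                info.mode = 0o644       # drop the real permission bits
                info.uid = info.gid = 0  # drop the real owner
                info.uname = info.gname = ""
                with open(path, "rb") as f:
                    tar.addfile(info, f)
    # mtime=0 keeps the build time out of the gzip header.
    return gzip.compress(tar_buf.getvalue(), mtime=0)
```

Hashing the returned bytes from two independent builds is then a meaningful comparison, which it would not be with default archive settings.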
err.. WHAT? by magistrat · 2013-06-20 06:46 · Score: 1

err.. WHAT?
Re:Problems with verifying the binaries from sourc by tepples · 2013-06-20 06:47 · Score: 1

I have recompiled all my software from the source code and verified that the binaries match
How many different compilers did you use? Did you try any cross-compilers, such as compilers on Linux/ARM that target Windows/x86 or vice versa?

How did Ken Thompson get into my system
See bunratty's comment.

and how do I get rid of him?
See replies to bunratty's comment.
^^THIS by Anonymous Coward · 2013-06-20 06:48 · Score: 0

And even building in Linux with GNU, I have come across problems with source that wouldn't compile and the endless chase of dependencies and libraries. And having problems with libraries no longer supported or not supported on my platform - *cough*Ubuntu*cough*.
There is no problem; complete chain exists by SuperBanana · 2013-06-20 06:51 · Score: 3

This is a problem that doesn't exist. You establish a chain of evidence and authority for the binaries via signing and checksums, starting with the upstream. Upstream publishes source and there's signing of the announcement which contains checksums. Package maintainer compiles the source. The generated package includes checksums. Your repo's packages are signed by the repo's key.
You can, at any point in time with most packaging systems, verify that every single one of your installed binaries' checksums match the checksums of the binaries generated by the package maintainer.
If you don't trust the maintainer to not insert something evil, download the distro source package and compile it yourself.
If you suspect the distro source package, all you have to do is run a checksum of the copy of the upstream tarball vs the tarball inside the source package, and then all you need to do is review the patches the distro is applying.
If you suspect the upstream, you download it and spend the next year going through it. Good luck...
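The checksum step in that chain can be sketched in a few lines of Python. This assumes a manifest in the "hash, whitespace, filename" layout that sha256sum prints; sha256_of and verify_manifest are names made up for the example, and checking the signature on the manifest itself is a separate step not shown here.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large tarballs need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_manifest(manifest_text: str, base_dir) -> dict:
    """Map each file named in the manifest to True (matches) or False."""
    results = {}
    for line in manifest_text.splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # sha256sum prefixes binary-mode names with '*'
        path = Path(base_dir) / name
        # A missing file counts as a failed check rather than raising.
        results[name] = path.is_file() and sha256_of(path) == expected.lower()
    return results
```

Note that this only proves the files are the ones the manifest's author hashed. It says nothing about whether a binary was really built from the matching source, which is the part the article is complaining about.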

--
Please help metamoderate.
1. Re:There is no problem; complete chain exists by Anonymous Coward · 2013-06-20 07:32 · Score: 0
  
  This is a problem that doesn't exist.
  Because TFA is written by someone who does not understand what he's examining and criticizing.
  There are any number of reasons why you can't compile bit-for-bit identical local versions of distro binary packages as he discovered.
  It's very difficult to even get completely identical binaries from the pristine source code if you compile on a different platform with different compiler optimization.
  As has been stated, it's all about TRUST.
  IMPORTANT POINT
  However, if you trust no one at all, FOSS allows you to build a trusted working system from scratch. A proprietary, closed-source OS can not offer that.
2. Re:There is no problem; complete chain exists by Anonymous Coward · 2013-06-20 08:15 · Score: 1
  
  Yes, that exists !
  The only problem is : it's not automated yet.
  For that we need some details :
  - compiler / distro standardization
  - standardized timestamps (to source file ? to fixed value ?)
  - signatures/hashes incorporated in the output of compilers/packagers
  - automated recompiling + checking of these hashes
  - for extra paranoids : normalized "root compiler" to compile the first compiler ...
3. Re:There is no problem; complete chain exists by Anonymous Coward · 2013-06-20 08:54 · Score: 0
  
  That's fairly much irrelevant. Trust-but-verify - and in this case, you would like to verify that the actual build is correct.
  If there's a reproducible build, anybody can verify that all the maintainers are doing their job - and in fact, you most likely won't have a maintainer doing the build in the first place, you'll have a build cluster (which again can be verified by another build cluster).
  If you don't have reproducible builds, you have to trust the sum of all the maintainers, and you don't have a simple engineering-wise solution to deal with it.
more difficult in practice by Chirs · 2013-06-20 06:53 · Score: 2

./configure, make, make install assumes you're building on the target machine. Many times you want to build on one machine and deploy on another. Even now, there are a lot of packages that don't work properly when cross-compiling. So you end up hardcoding config files, overriding options, patching the source/Makefiles, etc.
Also, in our environment we need to isolate the build system from the host environment to avoid contamination from the host libraries, and we need to version-control the build system so that we can go back and build the same product we built three years ago for the purposes of fixing a bug for a paying client.
So while open-source helps a lot, many times it takes significant effort to bring in some arbitrary package and build it from source.
1. Re:more difficult in practice by h4rr4r · 2013-06-20 06:55 · Score: 1
  
  Whoosh!
  That was the joke passing over your head.
  I of course agree with everything you said. I was merely being flippant for the sake of humor.
2. Re:more difficult in practice by thoriumbr · 2013-06-20 07:39 · Score: 1
  
  Nothing that the mighty checkinstall package cannot solve. Install it on your compiling machine, ./configure && make && checkinstall make install
  
  It will create a shiny native package, compatible with your distro, ready to be installed with dpkg, yum, or whatever package manager you happen to have...
  
  Or go full source and get a Gentoo distro...
3. Re:more difficult in practice by kermidge · 2013-06-20 21:42 · Score: 1
  
  Many thanks for "checkinstall" from one who might never have known.
nothing new here, please move along... by Nightshade · 2013-06-20 06:54 · Score: 1

Even if you have the source, it doesn't mean you can confirm what the binary is doing. See the classic "Trusting Trust" attack which is decades old. In my experience the most common reason for binaries that are not reproducible is due to build timestamps being embedded into the binary. For example, the ar command added the D flag in the past few years exactly for the purpose of being able to output reproducible results. (see the man page at http://linux.die.net/man/1/ar) It's true that reproducible binaries are probably a good thing from a security standpoint, but in practice it can be a lot of work to make sure the build produces these. And even then, as Thompson showed, that doesn't always guarantee that what you see is what you get.
bugfixes, not paranoia by Chirs · 2013-06-20 06:57 · Score: 1

We frequently discover a bug and need to fix it without upversioning the whole package (which could result in other incompatibilities with the rest of the system).
So we track down the code for the version we're using, get it building from source with suitable config options, and then fix the bug. In the simple case the bugfix is present in a later version and we can just backport it. In the tricky case you need to get familiar enough with the code to fix it (and hopefully in a way that the upstream maintainers will accept).
1. Re:bugfixes, not paranoia by intermodal · 2013-06-20 07:14 · Score: 1
  
  See, that's fine and well. But that's not what OP was referring to, nor is it what I was addressing.
  
  --
  In SOVIET RUSSIA... erm...NSA AMERICA, the Internet logs onto YOU!
only if the code is 100% valid by Chirs · 2013-06-20 07:00 · Score: 1

Depending on compiler options, some code that isn't completely valid (valid meaning no overflow, underflow, etc.) can end up logically completely different when you turn on optimization.
Tah da by ElitistWhiner · 2013-06-20 07:03 · Score: 2

Finally, someone gets it. The backdoor is never where you're looking for it.
1. Re:Tah da by GovCheese · 2013-06-20 14:28 · Score: 1
  
  My federal masters have long argued that the validation of open source code by many eyes was a losing argument. It turns out what they meant by many eyes wasn't what we thought they meant.
  
  --
  "He's using a quantum encryption scheme! That'll take hours to break!"
That's funny by Anonymous Coward · 2013-06-20 07:04 · Score: 0

I thought the point of open source software (from the user end) is so that you can get it for free to do some trivial task that you only need to do a few times where buying some commercial software would be a waste of money (or so you don't have to find cracks or keys for the commercial software).
Meh. by Anonymous Coward · 2013-06-20 07:04 · Score: 0

> It should be possible to recreate the exact binary from the source code. A simple analysis shows that this is very hard in practice, severely limiting the whole point of running free software...
I want the binaries to be different, because my PC is a particular combination of parts.
Binaries being different is the whole point of having a single source code for a Free Software app.
Alas, who cares about binaries? We're reaching a point where things are compiled just-in-time!
PS: All this is my personal opinion.
Required in some industries by mrr · 2013-06-20 07:05 · Score: 5, Interesting

I work in the gaming (Gambling) industry.
Many states require us to submit both the source code and build tools required to make an exact (and I mean 'same md5sum') copy of the binary that is running on a slot machine on the floor.. to an extent that would blow you away.
They need to be able to go to the floor of a casino, rip out the drive or card containing the software, take it back to THEIR office, and build another exact image of the same drive or SD card.
md5sum from /dev/sda and /dev/sdb must match.
I can tell you the amount of effort that goes into this is monumental. There can be no dynamically generated symbols at compile time. The files must be built, compiled, and written to disk exactly the same every time. The filesystem can't have modify or creation times because those would change.
This is a silly idea for open source software, the only industry I've seen apply it is perhaps the least-open one in the world.
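For anyone who wants to picture the final comparison step, here is a minimal Python sketch, not the regulators' actual procedure. It reads each image once and feeds every chunk to several hash algorithms at the same time. The names digests and images_match are invented for the example, and the default of md5 plus sha1 is an assumption based on the "both must match" practice a reply further down mentions.

```python
import hashlib


def digests(path, algorithms=("md5", "sha1"), chunk_size: int = 1 << 20) -> dict:
    """Compute several digests of one file (or block device) in a single pass."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for h in hashers.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}


def images_match(path_a, path_b, algorithms=("md5", "sha1")) -> bool:
    """True only if every requested digest is identical for both images."""
    return digests(path_a, algorithms) == digests(path_b, algorithms)
```

On Linux, something like images_match("/dev/sda", "/dev/sdb") would need read permission on the devices. The comparison is the trivial part; getting two builds to produce identical images in the first place is where the effort goes.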
1. Re:Required in some industries by Anonymous Coward · 2013-06-20 10:37 · Score: 0
  
  They're that strict about software verification, but they use md5 hashes?
2. Re:Required in some industries by GrahamCox · 2013-06-20 12:24 · Score: 1
  
  I work in the gaming (Gambling) industry.
  
  How do you sleep? I'm curious.
3. Re:Required in some industries by mrr · 2013-06-20 14:01 · Score: 1
  
  Well I was illustrating a point about the process more than naming specific technologies. I don't deal with specifics.
  That being said, we are talking about an industry that still primarily depends on serial communications.
4. Re:Required in some industries by mrr · 2013-06-20 14:02 · Score: 1
  
  On a gigantic pile of money.
5. Re:Required in some industries by Anonymous Coward · 2013-06-20 14:04 · Score: 0
  
  Interesting post.... Thanks
6. Re:Required in some industries by int19 · 2013-06-20 15:51 · Score: 1
  
  This is a silly idea for open source software, the only industry I've seen apply it is perhaps the least-open one in the world.
  This requirement also exists in the rail transit industry.
7. Re:Required in some industries by AJWM · 2013-06-20 17:17 · Score: 1
  
  I would expect they use both md5 and sha1 hashes, both of which must match. That's what we did with voting systems, and if anything the security for gambling machines is tighter. (Sure, with some effort an md5 can be spoofed, but good luck creating a different file where both the md5 and sha1 sums match those of the target.)
  (The other side of the systems testing/qa house where I worked on voting systems dealt with games. Not gambling machines, but console and PC games. Their security was tighter than ours.)
  
  --
  -- Alastair
Philips multimedia devices and GPL by taara · 2013-06-20 07:10 · Score: 4, Interesting

One example being Philips TV or BluRay built on Linux. When asked for source code, it is provided, but there is no way to ensure that the source code is for the device, because the provided binaries are encrypted and signed.
Analogous trouble in the embedded world by Stenboj · 2013-06-20 07:13 · Score: 1

I write embedded control firmware for MSP430 processors, building and debugging with IAR Embedded Workbench. In production I build each version to two targets with identical source files but with the single change of different loader output file formats, one for the TI gang programmer used in production, another for the field update loader that we must sometimes distribute to update customers' systems. A third output format (with debug information) is needed if I am going to go in through the JTAG port to do any debugging. Surprise: the resulting memory images from any two of these builds using the same source files have not been identical any time that I have checked. There is no hash nor any date field by the time the image is loaded and I make the comparison with the contents of target hardware memory. In this case, the linker does not always place modules in the same order, and that seems to account for the difference. As far as I can tell they are always linked correctly and so far the program images always seem to have identical functionality, but it means that I cannot use the memory compare function of the JTAG debugger to verify a memory image that was loaded with either the Gang programmer or our field update loader. I asked IAR about this, and they said that yes, the module order was not guaranteed to be consistent between loader output file formats. So I can be sure that each of these build output files does correspond to a known source, and the same source, and all of them work if any of them do, but the memory images they produce fail comparison. Grumble, grumble.
1. Re:Analogous trouble in the embedded world by int19 · 2013-06-20 15:55 · Score: 1
  
  This is curious. I use IAR EW for ARM and do not have any problems doing build verification checksums.
2. Re:Analogous trouble in the embedded world by Stenboj · 2013-06-22 16:48 · Score: 1
  
  Curious indeed! If you are loading the memory image from the same linker output file format as the debugger uses, for instance if you are loading with the debugger itself, then I would not expect the problem to occur. If you are really loading with one file format and verifying from another, and still have no trouble, please post that fact. It would be good to know that EW for ARM is better than EW for MSP430 in that way. The simplest way to account for my observations is to assume that there is an entirely separate linker for each output format at least for the MSP430, and that the code that chooses link order is not always the same from one to another. The way standards were set and enforced during development could then be very different for the two flavors, and indeed the algorithm might be always the same for the ARM flavor. I am about to decide what tools to use for an ARM project. I like most things about IAR, and if you can verify the memory image successfully using different linker output formats for original load and for the debugger when verifying I would almost certainly use IAR on ARM instead of something else.
3. Re:Analogous trouble in the embedded world by int19 · 2013-06-28 14:46 · Score: 1
  
  I have dual output formats. One is a binary image (.bin) that is loaded via custom PC software to a custom bootloader (glorified flash programmer) running on the target. This is the same .bin as we deliver to the customer. The other is an ELF file (default IAR EWARM .elf output I think) which I believe is what is used by the JTAG. I believe the .bin is auto-generated after link-time from the .elf; I had to set an option for this. The .bin is then further post-processed to add some special data such as CRC at well-defined locations.
  Generally I try to load the .bin with our PC tools (the "normal" update path) and then debug with JTAG from that rather than auto program/load/execute the .elf. If I'm lazy I program directly with the JTAG.
  I don't know the MSP430 at all and couldn't comment on how our ARM approach differs from it, or if there are any other funny technical gotchas involved. If it's of any interest, we are using TI's Luminary Micro ARMs.
4. Re:Analogous trouble in the embedded world by Anonymous Coward · 2013-06-28 15:52 · Score: 0
  
  This sounds quite different in detail; for instance, nothing at all here with suffix .elf or .bin. We have .obj files produced by compilation or assembly of a module that are linked to make d43 files (stored in a folder named BIN inside a folder labelled with the target name) for the various load schemes to use. Just the mirror image of you, I have never used ARM so can't compare the two from my own experience either.
Are you sure it matters? by WaffleMonster · 2013-06-20 07:17 · Score: 1

What difference does it make?
Do you think you're smart enough to detect tampering by reading source code?
To detect tampering run strings on the binary and pipe it to grep. If the following string appears 1.3.6.1.4.1.981 you are fucked.
Been bitten by this by Anonymous Coward · 2013-06-20 07:23 · Score: 1

I was once part of a startup whose project was distributed by a big outfit. They naturally wanted to archive our source, for which they needed to do a proof build. Unfortunately the archiving company's builds didn't match ours. We eventually discovered that our toolchain running on our "official" build machine (an ancient AMD K6 whitebox) didn't generate exactly the same bits as what they were using (~20 bytes were different.)
We never found a functional difference, but they had already accepted our version and would have had to re-QA the new one, which is unbelievably expensive in the commercial software world. IIRC we finally gave them our build machine and bought whatever the archive company was using for the next time.
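For anyone chasing a mismatch like the ~20 bytes above, a minimal sketch of how the differing offsets between two builds could be listed. The function name and the sample bytes are made up for illustration; this is not the tool anyone in the story used.

```python
from typing import List


def differing_offsets(a: bytes, b: bytes) -> List[int]:
    """Return every byte offset at which the two images disagree."""
    shared = min(len(a), len(b))
    offsets = [i for i in range(shared) if a[i] != b[i]]
    # If one image is longer, treat every trailing offset as a difference.
    offsets.extend(range(shared, max(len(a), len(b))))
    return offsets


if __name__ == "__main__":
    # Hypothetical stand-ins for "our build" and "the archive company's build".
    ours = bytes([0x90, 0x01, 0x02, 0x03])
    theirs = bytes([0x90, 0x01, 0xFF, 0x03])
    print(differing_offsets(ours, theirs))
```

Knowing where the bytes differ (a header timestamp versus the middle of the code section) is usually the first clue to whether the difference is functional.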
Re:Problems with verifying the binaries from sourc by Guy+Harris · 2013-06-20 07:26 · Score: 1

How did Ken Thompson get into my system
See bunratty's comment.
I hope that wasn't a whooshing sound I just heard....
DispatchServlet.properties is why we need code! by Anonymous Coward · 2013-06-20 07:35 · Score: 0

Why we need code: I was setting up Spring MVC in Tomcat using Fedora 18's package manager version of Spring. As a Struts person, I had never used Spring MVC and was following samples of how to set it up. And I kept getting this error about DispatchServlet.properties not being found. None of my tutorials mentioned that, it was all controller-servlet.xml stuff. (And Java annotations, but I like my ugly warts in XML files where they belong, not in my code.) What the crap is DispatchServlet.properties and why the crap can't it be found? After THREE HOURS I finally figure out that DispatchServlet.properties comes with Spring, but isn't in the JAR file. I downloaded a 300MB-ish distribution from Spring Source and dug around until I found DispatchServlet.properties and put it on my classpath, and Spring MVC started working.
So, no, I don't want to make identical binaries, but I do need the source.
Makes as much sense as my 'Ultimate Terror Weapon' by kawabago · 2013-06-20 07:36 · Score: 1

Ultimate Terror Weapon
Why running the binary if you care ? by Meeni · 2013-06-20 07:37 · Score: 1

Why are you running the binary, if you care about having a version that is faithful to the source code? Just compile your own, never use a precompiled binary, problem solved.
1. Re:Why running the binary if you care ? by Anonymous Coward · 2013-06-20 08:19 · Score: 0
  
  This approach is good, but it does not scale.
  Compiler / packager proving would prevent backdoors for the rest of the world.
Using sub-projects is a common problem ... by perpenso · 2013-06-20 07:38 · Score: 1

Virtually all of his findings are traced to differences in date and time and chosen compiler settings and compiler vintage. Unless he can find large blocks of inserted code (not merely data segment differences) he is complaining about nothing.
Using sub-projects is a common problem. Consider a project A that builds upon independent projects B and C. A, B and C are independently developed by three different developers. The source to all three are publicly hosted. A's available source does not include B and C's source, rather it has a link to their respective repositories. A reasonable thing to do.

The problem comes in that the daily snapshots of B and C that A used to build his binary are not known tags or otherwise identified. Happens all the time. Even in projects from Google itself.
Gitian by Anonymous Coward · 2013-06-20 07:40 · Score: 0

No one, especially TFA, mentions gitian? Really?
https://gitian.org/
https://github.com/bitcoin/bitcoin/tree/master/contrib/gitian-descriptors
Point of open source software by Anonymous Coward · 2013-06-20 07:41 · Score: 0

I thought the point of open source software was so that some unscrupulous guys could charge morons for the privilege of downloading it (a la Open Office) or so that junk sites could trick users into installing their worthless installer and bundle crapware and spyware with it.
Compile your own source by Nyder · 2013-06-20 07:41 · Score: 1

Dude is saying if you download a binary dist that you won't be able to compile the source code to match it? Ya, no shit Sherlock. That is why you download the source code and compile it yourself. While there are trusted sources, you never know what is in binary dist. At least, when you compile it yourself, you can examine the source code.
So, is this guy going to tell us that binary dist can have malware next?

--
Be seeing you...
MUCH more important issue: Is the compiler clean? by mic0e · 2013-06-20 07:43 · Score: 2

I can only recommend you to read this: http://cm.bell-labs.com/who/ken/trust.html
When source is not safe by Anonymous Coward · 2013-06-20 07:44 · Score: 0

Even if you can build your executable from source, small differences in compiler or library versions can change the signature significantly. Add to that, that if your compiler has been compromised, you may still be pwned even if the source is clean.
1. Re:When source is not safe by viperidaenz · 2013-06-20 13:50 · Score: 1
  
  So compile your compiler from source.
2. Re:When source is not safe by int19 · 2013-06-20 16:00 · Score: 1
  
  With what (trusted) compiler?
3. Re:When source is not safe by viperidaenz · 2013-06-24 12:42 · Score: 1
  
  The one you compiled from source... that you compiled with a compiler you compiled yourself...
And that's not all! by WOOFYGOOFY · 2013-06-20 07:44 · Score: 2

Not only is it limited in that way, which itself is an interesting fact, but it's limited in a lot of other ways also.
For one, source code is often bad, as in impenetrable, just off the top of my head-
* Realms of private, non-API / SPI code which is effectively *how the program actually works* which is also completely undocumented.
* Grotesque architectural errors made by (affordable) beginners which have nevertheless been cast in stone by exposing them publicly (God classes filled with global variables, etc. )
* Telegraphic and/or misleading method and variable names, e.g. VariablesWithMissingVowels, also known as Varwmvwls, which nevertheless often serve as the ONLY documentation for that variable or method,
* Unfortunate architectural decisions made early on by experienced programmers who may be proud of those decisions (tunneling package private methods out to "friend classes", thus subverting the purpose of package private classes and making the source code scope modifiers an unreliable indicator of source code scope, for instance)
* 500-1000 line methods with some or all of the above characteristics.
* Just massive code bases- I am facing one with literally half a million classes right now...That's right almost 450,000 classes, in a code base that is deliberately architected to defy built-in scoping rules of the language, so virtually anything could call anything ...
And on and on.
All of these things will never be fixed for reasons we all understand, I presume, but reflect on what this implies for open source. It implies that the much vaunted idea that more developers will iteratively make the code base better over time is a fiction with respect to the actual quality of the code base itself.
No team is going to stop adding features and create more work for itself in the form of resolving conflicts for the sake of enabling their program to do what it already can do.
This doesn't even get into the whole ego thing.
Worse still, anything exposed as public in any way may have a million clients depending on it and change effectively becomes impossible, open source or not. All things public, or even more precisely all things reachable in the code base by "outsiders" through any device found in the host language whatsoever, intended or otherwise, are effectively unchangeable.
In lieu of a successful campaign to stop development and do a rewrite, only a fork will make any of the above better. Forks are becoming more common, but they fail to sustain their branching a high percentage of the time (57%) and anyways presume the power TO fork, and on large projects this is harder to achieve.
The net effect is, open source code bases fail to live up to one of the major promises of open source: iterative improvement of the code base.
It's true that some people may fix bugs that they are motivated for external reasons to correct, and it's helpful to look at the code base if you're writing a plugin through a public API, but the code itself is often awful, and this awfulness, often produced because of limited time and resources, has the ironic effect of driving away many times those resources in the form of all the would-be developers who are just turned off. For those who do partake, the existing code has the effect of wasting many multiples of the time originally *saved* as each new developer struggles to make sense of the impenetrable code base.
In my experience there is no easy fix or even pricey one. Original authors are quick to fix on the (self serving) idea that whatever documentation which exists *ought* to be enough and anyone who still has questions must be an *idiot*. Wasting time incrementally slogging around this code becomes some sort of test that the dev is *serious* and *smart* when the reality is more like smart, serious devs came, saw and left without saying a word.
Code quality is only subjective at the edges. Undocumented code should not exist. F
Re:Problems with verifying the binaries from sourc by tepples · 2013-06-20 07:45 · Score: 1, Interesting

The whooshing sound was David A. Wheeler flushing Ken Thompson down the drain.
What? by HaZardman27 · 2013-06-20 07:47 · Score: 1

I don't understand the problem. If you have the source code and are concerned about the authenticity of the binary, why not just build it yourself and use your own binary?

--
Apparently wizard is not a legitimate career path, so I chose programmer instead.
huh by fisted · 2013-06-20 07:48 · Score: 1

> unless you are certain that the binary you are running corresponds to the alleged source code.
Yes, i am.
> It should be possible to recreate the exact binary from the source code.
Obviously it is, since that is how i obtained the binary in the first place
> A simple analysis shows that this is very hard in practice, severely limiting the whole point of running free software.
wat?

--
CLI paste? paste.pr0.tips!
That's it by scarboni888 · 2013-06-20 07:56 · Score: 1

I'm going back to Windows, then.
valid argument... by Anonymous Coward · 2013-06-20 08:07 · Score: 0

but one that is virtually eliminated by simply compiling your own code.
Out of curiosity... by fuzzyfuzzyfungus · 2013-06-20 08:12 · Score: 1

Obviously, as a practical matter, you aren't going to get 100% identical binaries from a given chunk of source unless your build environment is very carefully set up to achieve that end (something that people don't typically bother with).
However, as a matter of theory, I'm left with a question: If I give you a piece of source code and a complete build environment, you can compile and produce a binary in a certain number of operations. If I were to give you a piece of source code, a build environment, and a binary, would there be any general algorithm more efficient than just compiling it and checking whether the output is identical to answer the question "Is this binary a product of that source and build environment?"
Is there any property that you can exploit, if provided with the alleged binary output, to perform a 'verification' operation that is less computationally expensive than a naive 'compilation', or would that be possible only in certain special cases, with no useful general method?
There is a simple solution . . . by Anonymous Coward · 2013-06-20 08:20 · Score: 0

Just download the source and compile it yourself!
Bad choice of target by ray-auch · 2013-06-20 08:25 · Score: 4, Informative

Bad choice of target - .Net does actually have multiple compilers available, including open source. But more to the point for this discussion, it has multiple DEcompilers available, including open source.
Want to know what that nasty MS compiler put in your .Net binary ? - run it through ILSpy.
Don't trust the ILSpy binary - decompile it with itself, or with a.n.other decompiler.
In fact, because .Net decompiles so well, the problem of this article (binaries don't compare) just doesn't occur. Want to check your .Net binary against the supposed source ? - easy (well, a hell of a lot easier than with C++). Build your binary from the source, decompile both binaries and compare the two sets of decompiled source. It works, it is consistent and reliable, and it is one hell of a lot more useful at showing up differences than comparing two binaries.
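A rough sketch of the last step (comparing the two sets of decompiled source), assuming you have already run both binaries through a decompiler and have the output as text. The function name and the sample snippets are hypothetical; it only uses Python's standard difflib and does not drive ILSpy itself.

```python
import difflib
from typing import List


def source_diff(published_src: str, rebuilt_src: str) -> List[str]:
    """Unified diff of two decompiled sources; an empty list means no difference."""
    return list(
        difflib.unified_diff(
            published_src.splitlines(),
            rebuilt_src.splitlines(),
            fromfile="published",
            tofile="rebuilt",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Hypothetical decompiler output for the shipped and the self-built binary.
    published = "int F() {\n    return 1;\n}"
    rebuilt = "int F() {\n    return 2;\n}"
    for line in source_diff(published, rebuilt):
        print(line)
```

The point of diffing text instead of bytes is that a changed line shows up as a changed line, not as an opaque offset.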
1. Re:Bad choice of target by bryonak · 2013-06-20 11:43 · Score: 1
  
  It only works because .Net is "comparatively" niche (with regards to C/C++, but then again everything is). Bananas to bananas would be more like: does decompiling across many different .Net versions yield the same results (since you don't know which framework versions and exact libraries were originally used)? Does running the whole roundtrip except for the untrusted binary compilation on Mono yield the same results (aka different vendor implementations)?
  But i'll give you that, it'll always be easier with byte code compared to machine code, as Java and .Net demonstrate. Likewise it'll always be easier with interpreted code[0] compared to byte code (what an insight... just read the code).
  So hardware distance respectively level of abstraction gives us advantages in verifying specific executables from one untrusted source. With regards to the big picture .Net is as "bad" as everything else, because in any case you have to trust the runtime (libc, virtual machine, ...) which at some point must be machine code, nowadays usually generated from C/C++. And let's not get started about trusting the hardware.
  [0] When only distrusting that specific program, otherwise see third paragraph.
2. Re:Bad choice of target by martijn+hoekstra · 2013-06-20 23:01 · Score: 1
  
  Aha, but then you are assuming the .NET runtime actually executes the IL, and doesn't wrap it in an abstraction layer of evil, webcam spying, password stealing, voting machine hacking and bitcoin mining. You sheeple are so gullible.
And is the compiler compromised? by Anonymous Coward · 2013-06-20 08:28 · Score: 0

1984 called and they want their problem back! https://en.wikipedia.org/wiki/Reflections_on_Trusting_Trust#Reflections_on_Trusting_Trust
It's a good problem nevertheless.
What I want to confirm is ... by Skapare · 2013-06-20 08:32 · Score: 2

... not only is this the source code for the binary I am running, but also that the build system actually works. This is because not only might I want to make changes to the source to improve it, but I might want to do so in a hurry to fix a security hole. Since I might need to rebuild and run the built binary, I might as well test and make sure what the build system built really runs. So I just install the binary I built. Then I know for sure. Who needs the distributed binary (it might have a root kit in it).

--
now we need to go OSS in diesel cars
Recompiling is not enough by Aidtopia · 2013-06-20 08:51 · Score: 1

Recompiling is not enough because you can't trust the compiler either, unless you write your own bootstrapping compiler to compile the compiler.
Wrong tools by Anonymous Coward · 2013-06-20 08:59 · Score: 1

He says "using the tools that are recommended by the distributions". No idea about the rest, but in openSUSE we use "osc". Nobody in his right mind uses rpmbuild to build an RPM which is supposed to be distributed.
And part of the openSUSE build system happens to be "build-compare". Every time a package changes we automatically rebuild every dependent package (not very efficient, but ensures binary compatibility), so we are very interested in having reproducible builds to avoid unneeded rebuilds. If some code uses __DATE__ or __TIME__ we change it for the modification time of the changelog file.
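To illustrate the idea of swapping a wall-clock build stamp for the changelog's modification time, a minimal sketch. This is not how build-compare or the openSUSE tooling is implemented; the function names and the changelog path are assumptions for illustration.

```python
import os
import time


def stamp_from_mtime(mtime_seconds: float) -> str:
    """Format a modification time as a fixed UTC stamp, independent of build time."""
    return time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(mtime_seconds))


def changelog_stamp(changelog_path: str) -> str:
    """Derive the build stamp from the changelog file instead of 'now'."""
    return stamp_from_mtime(os.path.getmtime(changelog_path))


if __name__ == "__main__":
    # One day after the Unix epoch, as a fixed example input.
    print(stamp_from_mtime(86400))  # prints 1970-01-02T00:00:00Z
```

Two builds of the same source then embed the same stamp no matter when or where they run, which removes one common source of spurious differences.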
1. Re:Wrong tools by Anonymous Coward · 2013-06-20 09:39 · Score: 0
  
  Actually I just checked. If you really use the recommended tools by openSUSE the tar binaries are bit identical. The only difference in the package is in the date from the man pages (because build-compare is clever enough to ignore it).
Meme picture is missing by Anonymous Coward · 2013-06-20 09:11 · Score: 0

And here should be the picture of that guy in funny hat that says [something]? Tell me about it.
Gentoo by Anonymous Coward · 2013-06-20 09:11 · Score: 0

This article is just screaming for this.. if you want to run exact source code every time, you need to run Gentoo (or ports or alike)!
Oblig by Tablizer · 2013-06-20 09:22 · Score: 1

No, but it's the droid I've been looking for.

--
Table-ized A.I.
Dumbass by Anonymous Coward · 2013-06-20 09:32 · Score: 0

Of course the binary is going to be different on different platforms, especially if it is compiled with platform-specific optimizations.
That's the WHOLE FUCKING POINT of running free software.
Unless you're compiling the source with the exact same versions of everything from the IDE to the underlying headers and libraries, you're not going to get a bit-for-bit binary match. Which is, again, the WHOLE FUCKING POINT of running free software.
It should NOT be easy to get the exact same binary on your system as was packaged with the software you're using. Again, whole point.
Compile it yourself by Chris+Mattern · 2013-06-20 09:33 · Score: 1

And run the resulting binaries. Voila, problem solved: you know the binaries you are running correspond to your source code.
That's why Open Source isn't Free Software. by Anonymous Coward · 2013-06-20 09:34 · Score: 0

If the binary is open source, how do you know that? You can't compile it if it's only open source, you need a license to that.
And what if it WERE allowed to be compiled, but was compiled using some weird flags or libraries that they didn't distribute? That's allowed by a non-GPL Open Source license.
With BSD, how do you know that the version you have is the one that was used in the closed version?
Again, you don't.
If it were GPL, the license is such that you are given what's needed to compile.
Well, as long as it's GPL3. GPL2 didn't *specifically* require signing keys, so Tivo's GPL2'd OS you cannot check to see that the version it is running is the same one you get from compiling the source they gave you.
Idiot. by Anonymous Coward · 2013-06-20 09:40 · Score: 0

A. I don't need to use the provided binary.
B. None of the principles of Free Software involve 'ability to produce identical binaries'.
C. It is possible to determine whether a binary corresponds to source code if malfeasance is sufficiently suspected.
Now there is a bogus argument. by Anonymous Coward · 2013-06-20 09:40 · Score: 0

The claim is that you CAN fix it yourself.
And that is true, even if you have to trust the code elsewhere. You have the code, you write a fix, you compile the fix and see if it works.
When it does, you have fixed the bug.
No need to compile the compiler (which used to be the way I installed Linux back in the Yggdrasil and Slack1.0 days...)
No, you're asserting incorrectly by Anonymous Coward · 2013-06-20 09:53 · Score: 0

The halting problem is taken as "you can't decide if a program will halt". But it doesn't. It indicates that you cannot guarantee any given program will halt. But a "Hello World" program WILL be proven to be haltable in its environment. And the vast majority will also be provably haltable.
However, you cannot prove haltability for ALL programs.
You can guarantee equivalent function on programs where you can guarantee the function of one. When you re-compile with different options, if you could guarantee the halting of the original program, you can guarantee this second version will be guaranteed equivalent function (or not, thereby prove that the program you compiled is not the same program).
Is there any point to the binary? by Anonymous Coward · 2013-06-20 10:02 · Score: 0

If you gave me the source, the build environment and a binary, what point is there to the binary if I don't trust you to think the source is relevant to the binary?
If that were the case, I'd ignore your binary, compile my own and run that.
Why, then, would I care if your binary was built from that source? I'm not running it.
Missing the point? by ThisIsNotAName · 2013-06-20 10:04 · Score: 1

I can think of two issues (aside from the malicious code issue which is being beaten to death).
First, we can't tell if the binary matches the source, so we can't tell if they're fully complying with the GPL.
Second, since we can't tell if the binary matches the source, if we try to hack around in the source we have the potential to be working in a different build than the published binary and getting wildly different results.
As for the malicious code, if you can compile the build from source and have a byte for byte match, you can be sure that you have the correct source. If there is malicious code, you'll be able to find it later. Or better yet, maybe someone else is verifying it. Does anyone question the value of being able to go back and look at malicious source code to see what it's done?
Maybe we should make it easier to make reproducible binaries?
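If builds were reproducible, the byte-for-byte check above reduces to comparing digests. A minimal sketch using Python's hashlib; the function names and file paths are placeholders, and it assumes the published binary and your rebuild are both available as files.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """SHA-256 digest of an in-memory blob, as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """SHA-256 digest of a file, read in chunks so large binaries fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_build(published_path: str, rebuilt_path: str) -> bool:
    """True only if the two files are byte-for-byte identical (up to hash collision)."""
    return file_sha256(published_path) == file_sha256(rebuilt_path)


if __name__ == "__main__":
    print(sha256_hex(b"abc"))
```

A matching digest only tells you the binary came from that source with that toolchain; it says nothing about whether the source or the compiler is trustworthy.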
1. Re:Missing the point? by ThisIsNotAName · 2013-06-20 10:36 · Score: 1
  
  Also, if you could build identical binaries, you could do something like have a separate machine on a separate domain, separate location, ... do a daily build of the source to check if the compiled version matches the binary. If they don't match, either they've drifted out of sync or maybe the machine has been compromised. If the machine is compromised, you'll be able to detect, externally, regardless of environment that something has changed when it shouldn't have and see if an attacker has replaced the binary or the source.
*facepalm* by Anonymous Coward · 2013-06-20 10:51 · Score: 0

Dependencies - do you know what they are? Everything right down to the OS heap layout can affect the layout of a binary. The chances of you being able to 100% audit your build machine and reproduce the exact same binary are slim to zero. Especially with things like randomization we're DELIBERATELY randomizing the built executable.
Good luck with that guys, give me a call when you've fixed it.
Chicken Littles all of them by Zynder · 2013-06-20 10:58 · Score: 1

I agree 100%. This is just another guy who, after hearing about the NSA spying, suddenly wants to see boogeymen in everything. While I do not in any way advocate, and actually detest, what the NSA has done, I am most pissed that proving the events happened has suddenly given every conspiracy theory nutcase legitimacy, and they can all run around screaming "See I told ya so!" And now, since they have something to point a finger at, other people who may have been fence-sitting or just not knowledgeable of the conspiracy culture are actually listening to these guys. Alex Jones does NOT need any more influence on America than he already has. I bet Art Bell has been jumping for joy for weeks now too. They both should have epic ratings.
Every binary is a unique little snowflake. by Anonymous Coward · 2013-06-20 11:11 · Score: 0

Fight the binary monoculture.
The whole point of running free software by Anonymous Coward · 2013-06-20 11:24 · Score: 0

Did anyone understand the bit about how not generating the exact same binary as another "severely limits the whole point of running free software"?
It's maybe somewhat interesting that we don't all build the same binaries, but I'm not sure I get how it prevents the maintenance that proprietary software users have to learn to live without, and surely that's the biggest point of Free Software.
Indeed, presumably the more I take advantage of Free Software's special powers, the more likely that my binary would be different than some arbitrary stale pre-maintenance one.
If you RTFA, the weird way he's looking at the situation is slightly less weird, but still pretty weird. And I don't think he ever really explains his warped view of "the whole point of running free software." But he does have some kind of quasi-security question ... framed just wrong enough to make things hard.
He is asking, "I have this here binary, and I want to know whether or not it matches this source. Is it a match? Yes or no." and he's having a hard time.
But if you're really worried about the binary (i.e. if this is really important), you don't ever ask that question. Instead, you utter the statement, "I want a binary compiled from this source." And once you look at it that way, everything gets really easy. Type "make."
(Is this all really about Ken Thompson's old story?)
Debian is not just binary by emblemparade · 2013-06-20 11:41 · Score: 1

I just want to point out an often overlooked difference between Debian and other free OSes: Debian is actually a very comprehensive build system for complete OSes, it's not just a set of packages. A .deb file may seem equivalent to an .rpm, but actually the toolset behind the two formats is as different as night and day.
To the point, a Debian-based OS is not a "binary" distribution as opposed to "source" distributions like Gentoo. In Debian, every package is actually available as both source (.dsc files) and possibly binaries for various architectures (.deb files). The final vendor can opt to create an installation CD for a particular architecture, but there's nothing stopping anyone from creating a source-based CD, too. Debian is a build system designed specifically for free software operating systems, and despite the clunkiness of its toolset, it does its job very well.
So, it's unsurprising that the author found a strong equivalence in Debian. Indeed, the .deb files we get are produced from .dsc files using the equivalent build process he used, but on the vendor's build farm.
Unsurprising, but still worthwhile that he checked this to make sure. So much of computer security is based on trust, and what looks obvious may not be. So, we now have some evidence about Debian's reliability in this particular matter.
1. Re:Debian is not just binary by Anonymous Coward · 2013-06-20 14:02 · Score: 0
  
  So much of computer security is based on trust, and what looks obvious may not be. So, we now have some evidence about Debian's reliability in this particular matter.
  We only have "some evidence", if we trust some random guy on the Internet.
2. Re:Debian is not just binary by bingoUV · 2013-06-20 23:34 · Score: 1
  
  How is that different from source RPMs available for rpm based distros? It gives comprehensive build information for the RPMs for various architectures.
  
  --
  Bingo Dictionary - Pragmatist, n. A myopic idealist.
3. Re:Debian is not just binary by emblemparade · 2013-06-20 23:44 · Score: 1
  
  The difference is that .dsc is integrated into Debian standard repositories. In fact, a repository is also a continuous build machine. (LaunchPad does this best.) The whole toolset is oriented around handling upstream sources and patches together to create a dsc with a clearly traceable past. The RPM toolset is almost like a toy in comparison!
4. Re:Debian is not just binary by bingoUV · 2013-06-21 00:54 · Score: 1
  
  And srpm is integrated into redhat standard repositories. I don't see any technical differences mentioned in either of your posts, just buzzwords. Traceability might be useful though, thanks, but not directly relevant here.
  Continuous build is not a requirement for most distros, toolset is not being discussed.
  
  --
  Bingo Dictionary - Pragmatist, n. A myopic idealist.
5. Re:Debian is not just binary by cecom · 2013-06-21 05:30 · Score: 1
  
  You only see buzzwords because you don't understand the technical differences, not because the technical differences are not there. Debian guarantees a fully reproducible build environment. I can rebuild anything in Debian, even the whole distribution, without any special effort and be confident that I will get exactly the same binaries (modulo timestamps and signatures).
  That may or may not be important to you personally, but it is a big deal technically and there is an extraordinary amount of technical details and additional work that goes into achieving it.
6. Re:Debian is not just binary by MasterPatricko · 2013-06-25 11:24 · Score: 1
  
  Really, that isn't specific to debian.
  https://build.opensuse.org/
  http://koji.fedoraproject.org/
  
  --
  I'd tell a UDP joke, but you may not get it. I'd tell a TCP joke, but I'd have to keep repeating it until you got it.
Not sure the article author/submitter gets it... by Anonymous Coward · 2013-06-20 12:26 · Score: 0

If you're concerned about a compromised binary and you have source that you can verify as genuine...you build the package from source and remove the pre-built binary. Just about every modern distribution I can think of has some facility to do this. It's as if he/she is looking at the issue from entirely the wrong angle and creating an issue where there isn't one. If the source from everything down to the compiler used to create that package is available then it can all be built from source, from verified sources...it would take a long time but can be done, and I'm sure there's more efficient ways of hunting down potential compromises.
I just don't see what the point of his/her argument is. I run Fedora; if I thought that a particular package in Fedora might have a key logger built into it...I'd get the source RPM, read through it to the best of my ability, look it up on forums to see if I missed anything obvious...and if everything looked good, I'd use the source RPM instead of the pre-built binary. Any program used to build that binary, you follow the same process down the chain. It doesn't matter if the pre-built, distro-offered binary matches what you get when you build it from source -- it just matters that what you can build from source is secure and has the same functionality. Besides, as soon as you add in different compiler flags per build the signature of the binary is going to be substantially different anyway, etc.
Slow news day?
absolutely sure it isn't the source code by Anonymous Coward · 2013-06-20 12:32 · Score: 1

Transfusion (the recreation of Blood in Quake) is advertised as open source software, and it is a GPL derived work. Years ago the lead programmer admitted to hiding the source code because released executables somehow don't count as a release.
Sometimes allegedly open source projects just blatantly violate open source licenses. The real question is: who cares enough to sue?
And if my compiler signs my code? by Anonymous Coward · 2013-06-20 14:05 · Score: 0

If the snapshot of your code at compile time is signed and linked to the binary output which is also signed then you can be sure that the two correspond. i.e. You can take your arrow of time and shove it up your straw man's....
Bit for bit identical binary packages? by dgharmon · 2013-06-20 18:04 · Score: 1

"in theory, to build binary packages from source packages that are bit for bit identical to the published binary packages"

Only if I have the exact same development environment as the published binary.

--
AccountKiller
You forgot Gentoo Linux .. by dgharmon · 2013-06-20 18:09 · Score: 1

Gentoo Linux would suit your needs, as it compiles from source at the install stage ...

--
AccountKiller
Gentoo by Anonymous Coward · 2013-06-20 21:01 · Score: 0

What? Did someone say Gentoo?
Re:Why not just use a source based distro like Gen by Hypotensive · 2013-06-20 21:42 · Score: 1

Or, indeed, *BSD.
make world!
Re:Problems with verifying the binaries from sourc by Anonymous Coward · 2013-06-20 23:36 · Score: 0

Ken is a former USG employee. He seeded the fields they now exploit with that "C" thing.
Debian has been and will be 0wn3d by tepples · 2013-06-21 00:32 · Score: 1

To avoid a targeted attack, just use a signed compiler package, e.g. from Debian.
Unless Debian happens to be compromised at the time you download packages, as it was in October 2003 and July 2006.
having a slow day, by Anonymous Coward · 2013-06-21 05:20 · Score: 0

tim?
Re:Makes as much sense as my 'Ultimate Terror Weap by OneAhead · 2013-06-21 05:39 · Score: 1

What's the point? Are you imitating Alan Sokal? Trying to lure extraordinarily stupid wannabe terrorists into wasting a huge amount of time? Or did you just want to show the world you have way too much time on your hands yourself? Or is it that confusing concept called "humor"?