Slashdot Mirror


GCC 4.3.0 Exposes a Kernel Bug

ohxten sends news from earlier this month that GCC 4.3.0's new behavior of not clearing the direction flag before a string operation on x86 systems poses problems with kernels — such as Linux and BSD — that do not clear the direction flag before a signal handler is called, despite the ABI specification.

256 comments

  1. Yep, by EkriirkE · · Score: 5, Funny

    That's what happens when you don't clear that STD...

    --
    from 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
    to 45 2F 6E 40 3C DF 10 71 4E 41 DF AA 25 7D 31 3F
    1. Re:Yep, by Creepy+Crawler · · Score: 3, Funny

      ---That's what happens when you don't clear that STD...

      And the answer is to.... use condoms?

      And I thought we were here discussing bugs between GCC and LK.

      --
    2. Re:Yep, by EkriirkE · · Score: 2, Funny

      Some CLD will clear that STD, silly!

      --
      from 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
      to 45 2F 6E 40 3C DF 10 71 4E 41 DF AA 25 7D 31 3F
    3. Re:Yep, by orkysoft · · Score: 1

      Woosh!

      --

      I suffer from attention surplus disorder.
    4. Re:Yep, by __aaclcg7560 · · Score: 1

      No, no, no. Take a knife and cut the wing-ding off. Intel does it all the time.

    5. Re:Yep, by cralewyth · · Score: 2, Funny

      All it really needs is some TLC.

      --
      "Women are just like ninjas; They lie even when it is more convenient to tell the truth." ~ Unknown
    6. Re:Yep, by gfxguy · · Score: 1

      I think the CDC might have something to say about what'll clear an STD.

      --
      Stupid sexy Flanders.
    7. Re:Yep, by allcoolnameswheretak · · Score: 1

      The developers were obviously having some THC.

  2. so what by Brian+Gordon · · Score: 5, Insightful

    OK so the kernel developers add a single line of code, the bugzilla ticket is closed, and we get on to real news?

    1. Re:so what by OverlordQ · · Score: 5, Insightful

      FTFA:

      This problem has existed for 15 years; GCC has always emitted code that worked correctly on kernels that did not follow the ABI, until now.

      Part of the problem is that there are an enormous number of installed kernels that are vulnerable to this problem, but only if GCC 4.3 is installed.


      That's, quite literally a fuckton of systems. So simply patching new kernels isn't going to make the problem go away.

      --
      Your hair look like poop, Bob! - Wanker.
    2. Re:so what by Anonymous Coward · · Score: 0

      Then any binaries compiled with GCC 4.3 will not work correctly on any version of Linux prior to 2.6.25 (assuming the fix is present in the next version). That's hardly acceptable, is it? Nor is the potential security risk involved in using GCC 4.3 itself, or any software compiled with it, on any machine with any currently existing kernel.

    3. Re:so what by Creepy+Crawler · · Score: 4, Insightful

      Over-reacting a bit, aren't we?

      This bugfix is easily regressed, and has already been done.

      If somebody wants to stick with a buggy kernel, they can use an older version of GCC. It's not like older stable ones put out horrible binary or anything (we need to exempt RH using 2.96, cause that was ages ago).

      --
    4. Re:so what by evanbd · · Score: 4, Insightful

      Unless, of course, it turns out to be a security hole. The sysadmin installed GCC isn't the only way code gets on to systems. Besides, a lot of packages are shipped as binaries built with modern GCC, whatever that may be. This is going to be a pain to fix, even though the fix is simple.

    5. Re:so what by William+Robinson · · Score: 2, Interesting

      OK so the kernel developers add a single line of code, the bugzilla ticket is closed, and we get on to real news?

      Yes, Probably, a single line of code might fix it. (And I won't even call it a bug.)

      But before getting over this, I want to say kudos to gcc developers who have taken care to warn about this.

    6. Re:so what by Anonymous Coward · · Score: 0

      arch/x86/ia32/ia32_signal.c | 4 ++--
      arch/x86/kernel/signal_32.c | 4 ++--
      arch/x86/kernel/signal_64.c | 2 +-
      3 files changed, 5 insertions(+), 5 deletions(-)

      From the kernel patch.

      Oh OK then, it really is one line. Very exciting indeed.
    7. Re:so what by Brian+Gordon · · Score: 1

      Oh I see the problem.. now that GCC isn't turning out broken binaries, old kernels will be unable to run them. Everyone will be forced to upgrade, or more likely everyone will still make broken binaries.

    8. Re:so what by jlarocco · · Score: 0

      If I've read correctly, the bug only occurs when old kernels (without this newest little patch) are compiled with the brand new GCC 4.3.

      How likely is it that a person who compiles their own kernels with brand new versions of GCC won't be running the newest kernel that has this patch? How likely is it they won't be able to backport this patch if they can't fully upgrade the kernel? How many people will say "Time to compile my old, stable, trusted kernel, with this brand spanking new, relatively untrusted compiler"?

      I'm not saying those people don't exist, but they're a very tiny subset of an already very tiny set of Linux users.

    9. Re:so what by RML · · Score: 5, Informative

      You have read incorrectly. The bug occurs when applications compiled with the brand new GCC 4.3 are run on old kernels, regardless of what compiler was used to compile the kernel.

      --
      Human/Ranger/Zangband
    10. Re:so what by Anonymous Coward · · Score: 0

      That's, quite literally a fuckton of systems. So simply patching new kernels isn't going to make the problem go away.

      I just about shat my pants when I read this; that is quite possibly the most ingenious use of the word fuck I have ever seen. I wonder, though, what is the exact measurement of a "fuckton"? Is it a unit or is it a constant?

      Oh, and were this 4chan, I would have lost the game.
    11. Re:so what by Profane+MuthaFucka · · Score: 2, Funny

      I'm a consultant, and I'm wondering what the billing rate times a fuckton is going to total out to.

      --
      Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
    12. Re:so what by serviscope_minor · · Score: 3, Funny

      Is a fuckton more or less than a metric assload?

      --
      SJW n. One who posts facts.
    13. Re:so what by rsidd · · Score: 1

      If I've read correctly, the bug only occurs when old kernels (without this newest little patch) are compiled with the brand new GCC 4.3.

      You read wrongly, or more likely did not read at all. And nor did the moderators. The bug exists no matter what compiler was used to compile the kernel.

    14. Re:so what by Duhavid · · Score: 0

      Yes. Next question, please.

      --
      emt 377 emt 4
    15. Re:so what by nategoose · · Score: 1

      I don't think that this problem is very likely since to my knowledge few applications ever flip the direction flag.

    16. Re:so what by chgros · · Score: 1

      I don't think that this problem is very likely since to my knowledge few applications ever flip the direction flag.
      I believe the problem happens if the kernel flips the direction flag: it will stay flipped when calling back to your application.

    17. Re:so what by und0 · · Score: 5, Insightful

      Nope.

      It's related to how GCC assumes the kernel sets the state of a flag before calling a function (a signal handler), and it shows up in userland applications compiled with the newer GCC (4.3.0).

      I don't recall the gory details, on Sid with the latest (of today) version of libc6, SBCL exposes the bug (crashes). There aren't big differences between libc 2.7-8 and 2.7-9, but the second was compiled with the newer GCC. Kudos to Aurelien Jarno, a Debian developer, who isolated the bug and pushed a patch upstream. http://lkml.org/lkml/2008/3/5/207
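
      To make the mechanism concrete, here is a minimal sketch in portable C. It is a toy model, not real x86 code: `cpu_model`, `model_rep_movsb` and `handler_copy_ok` are hypothetical names invented for illustration, and the direction flag is just an int. It only shows why a string copy inside a signal handler goes wrong when the flag was left set at the moment of the interrupt and neither the kernel nor the compiler clears it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of the x86 direction flag (DF); not real register state. */
struct cpu_model {
    int df; /* 0 = forward (as after CLD), 1 = backward (as after STD) */
};

/* Models `rep movsb`: copies n bytes, stepping +1 or -1 depending on DF. */
static void model_rep_movsb(const struct cpu_model *cpu,
                            char *dst, const char *src, size_t n)
{
    ptrdiff_t step = cpu->df ? -1 : 1;

    while (n--) {
        *dst = *src;
        dst += step;
        src += step;
    }
}

/*
 * Simulates a 4-byte copy performed inside a signal handler.
 * Returns 1 if the destination received the expected bytes, 0 otherwise.
 */
static int handler_copy_ok(int df_at_interrupt,
                           int kernel_clears_df,
                           int compiler_emits_cld)
{
    /* Payload sits at offset 4, with padding so a backward walk stays in bounds. */
    char src[12] = "....abcd...";
    char dst[12] = "...........";
    struct cpu_model cpu;

    assert(df_at_interrupt == 0 || df_at_interrupt == 1);
    cpu.df = df_at_interrupt;

    if (kernel_clears_df)
        cpu.df = 0; /* what the ABI expects on handler entry */
    if (compiler_emits_cld)
        cpu.df = 0; /* the conservative behaviour of GCC before 4.3 */

    model_rep_movsb(&cpu, dst + 4, src + 4, 4);
    return memcmp(dst + 4, "abcd", 4) == 0;
}
```

      Under this model, `handler_copy_ok(1, 0, 0)` returns 0 because the copy walks backwards from its start address, while clearing the flag on either the kernel side or the compiler side makes it return 1.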

    18. Re:so what by Codifex+Maximus · · Score: 5, Interesting

      Ok, I read the article and a lot of the comments.

      Seems to me the easy and correct thing to do would be to use deprecation. i.e. keep the old functionality for a bit longer and also patch or make the new kernels properly set the flag right now. This way, we move in the right direction and when it's no longer an issue then we drop the functionality in the compiler and rely on the kernel setting the flag like it's supposed to do.

      Now, I see why the kernels have not been setting the flag. Why should they when the compiler was doing it? Time to set things right though... in the interests of portability with other environments and compilers. Having the kernels setting the flag starting now would satisfy ABI compatibility with the other compilers AND having gcc continue to cover the flag, by default for a time, would prevent breakage of a lot of existing code.

      Seems like a no brainer to me. After all, isn't that what deprecation is for?

      That's my take on it...

      --
      Codifex Maximus ~ In search of... a shorter sig.
    19. Re:so what by Psychotria · · Score: 1

      Ok. So I was wrong. Interesting link -- thanks.

    20. Re:so what by HeroreV · · Score: 1

      That's, quite literally a fuckton of systems. So simply patching new kernels isn't going to make the problem go away.

      Other compilers, like ICC from Intel, do not clear the flag either. That's, quite literally a fuckton of binaries already out in the wild. So simply patching GCC isn't going to make the problem go away either.

      The problem is in the kernel, and GCC cannot solve that. This problem will exist whether GCC adds an ugly hack or not. Even if GCC had never changed their behavior, this would still be a problem for other compilers.
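
      For code that has to run on unpatched kernels regardless, one defensive option is for the handler to clear the flag itself. A minimal sketch, assuming a GCC-compatible compiler and x86 inline assembly (on other targets the asm is compiled out); `defensive_handler` and `run_defensive_handler_demo` are hypothetical names, and this is only a workaround idea that narrows the window, not what the kernel patch or GCC actually does:

```c
#include <assert.h>
#include <signal.h>

static volatile sig_atomic_t handler_ran = 0;

/* Hypothetical handler that does not trust the kernel to have cleared DF. */
static void defensive_handler(int signo)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* CLD is idempotent: harmless if the flag was already clear. */
    __asm__ __volatile__("cld" ::: "cc", "memory");
#endif
    (void)signo;
    handler_ran = 1;
}

/* Installs the handler, raises SIGINT once, and returns 1 if the handler ran. */
static int run_defensive_handler_demo(void)
{
    void (*prev)(int);

    handler_ran = 0;
    prev = signal(SIGINT, defensive_handler);
    assert(prev != SIG_ERR);

    if (raise(SIGINT) != 0)
        return -1;

    signal(SIGINT, prev); /* restore whatever was installed before */
    return handler_ran;
}
```

      This only helps handlers you can rebuild; it does nothing for third-party binaries, which is why the fix still belongs in the kernel.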
    21. Re:so what by torstenvl · · Score: 2, Interesting

      Actually - and I attribute this to good ol' BK - GCC *could* make the problem go away, by recognizing when it is compiling the kernel, and inserting the code itself.

      Just sayin'.

      Read this -- http://cm.bell-labs.com/who/ken/trust.html

    22. Re:so what by Vlad_the_Inhaler · · Score: 4, Interesting
      From what I saw of TFA, this is being done. An updated GCC is being pushed and I suppose that this reversion to the previous behaviour will be backed out again at some point.

      Interesting was:
      • GCC was the exception in this case - other C compilers always did it this way
      • While it affects some programs running under Linux or BSD, this GCC update appears to nuke Hurd completely.
      --
      Mielipiteet omiani - Opinions personal, facts suspect.
    23. Re:so what by qbwiz · · Score: 2, Insightful

      Of course, the security holes will only be in programs that were compiled with GCC 4.3.0. It's not as if some unprivileged user could cause problems merely by compiling something with a new version of GCC, but it will still be a problem if a trusted person uses GCC 4.3.0 to compile and run a program which would become exploitable.

      --
      Ewige Blumenkraft.
    24. Re:so what by dargaud · · Score: 2, Insightful

      Maybe it needs an entry for us regular programmer...

      --
      Non-Linux Penguins ?
    25. Re:so what by RupW · · Score: 2, Informative

      now that GCC isn't turning out broken binaries, old kernels will be unable to run them

      GCC never turned out broken binaries. It turned out overly-conservative binaries that cleared the direction flag even when the ABI spec said it could assume the flag was already clear.
    26. Re:so what by makomk · · Score: 1

      Yeah - I think basically the only OSes that follow the ABI on this are SCO Unix (probably because they wrote the ABI in question) and possibly Solaris. Ones that don't include every single past version of Linux and *BSD (all variants).

    27. Re:so what by petermgreen · · Score: 2, Informative

      Well afaict the debian developers plan to modify gcc 4.3 so it behaves in the old way to reduce the risk of crashes when upgrading from one version of debian to the next. Dunno if gcc upstream will agree on that reasoning though. This isn't perfect though, even before gcc's behaviour changed there was still a risk that a signal handler would break the code that it interrupted.

      Afaict this bug only affects a relatively small number of apps because little code messes with the direction flag in the first place

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
    28. Re:so what by larry+bagina · · Score: 5, Funny

      at least nothing of value is affected.

      --
      Do you even lift?

      These aren't the 'roids you're looking for.

    29. Re:so what by Eunuchswear · · Score: 3, Informative

      You never use memmove(3)?
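
      For anyone who has not looked inside it: an overlap-safe copy has to run backwards when the destination starts above the source, and on x86 an implementation may do that backward pass with string instructions after setting the direction flag. A minimal sketch in plain C, assuming both pointers point into the same buffer; `sketch_memmove` is a hypothetical name, not libc's implementation:

```c
#include <assert.h>
#include <stddef.h>

/* Behaves like memmove(3) for pointers into the same object, one byte at a time. */
static void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (n == 0 || d == s)
        return dst;

    assert(dst != NULL && src != NULL);

    if (d < s) {
        /* Destination below source: a forward copy never reads a clobbered byte. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else {
        /* Destination above source: copy from the end, the "direction flag set" case. */
        for (size_t i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    return dst;
}
```

      With `char buf[] = "abcdef";`, calling `sketch_memmove(buf + 2, buf, 4)` takes the backward branch and leaves `"ababcd"` in the buffer.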

      --
      Watch this Heartland Institute video
    30. Re:so what by xaxa · · Score: 4, Funny

      It depends, the US Fuckton is less than a metric assload, but the Imperial Fuckton, previously used in the UK, was more.

      NB The use of 'assload' without the 'metric' qualifier is discouraged, the customary US assload being a much greater mass.

    31. Re:so what by Bert64 · · Score: 1

      There are other compilers than GCC...
      People could still write the affected code in assembly...

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    32. Re:so what by Eponymous+Bastard · · Score: 1

      The buggy code would run in user mode, not kernel mode. The only way to compromise a system with this security hole would be to have the sysadmin install a 4.3-compiled daemon running as root or a 4.3-compiled suid root program on an older kernel. A user can copy a 4.3-compiled program to his home directory, but all he'll get is a crashing program.

      Yes, another variable to worry about when upgrading compilers and kernels, but not as easy as you make it sound.

    33. Re:so what by bytesex · · Score: 1

      There are about 200 assloads to the fuckton (well, 213.134). Assloads, in turn, are subdivided into shitloads (7 and a half shitloads per assload, to be exact). Shitloads come apart in normal 'loads' (23 loads to the shitload), which subdivide into 'lots' (3.2 lots to the load). Things become more humanly measurable here; 'lots' subdivide into 'much'es (8), which subdivide into 'some's (2.2), into 'a bit's (1.7). Bits have their own proper conversions to mmol, eV, picoliter and Angstrom, but this margin is too narrow to describe it.

      --
      Religion is what happens when nature strikes and groupthink goes wrong.
    34. Re:so what by bosef1 · · Score: 1

      I'm out of practice in ass-packing so I could be misremembering, but I seem to recall that the weight of an assload varied with the contents. Kinda like how a bushel of wheat weighs differently than a bushel of oats. So you should check with one of the standard engineering references before specifying assloads.

      Incidentally, a butt-load is around 126 American gallons.

    35. Re:so what by nategoose · · Score: 1

      Yipes. I've never really looked at memmove's code before. Thanks for pointing that out.

    36. Re:so what by pizzach · · Score: 1

      That's, quite literally a fuckton of systems. So simply patching new kernels isn't going to make the problem go away.

      GCC 4.3.0 was released March 5, 2008. Few if any distros even have an option for installing it as they are still testing it for bugs. Most sane distros do not throw experimental packages on their users unless they specifically want it (at their own risk.)

      Furthermore, it's not very difficult to make a dependency for a specific kernel version against a GCC version in most package managers.
      --
      Once you start despising the jerks, you become one.
    37. Re:so what by courtarro · · Score: 1

      They cannot be directly compared since the prior is a measure of weight, while the latter is a measure of volume.

    38. Re:so what by pongo000 · · Score: 1

      That's, quite literally a fuckton of systems.


      Is that less than or greater than a shitload?

      Just want to make sure I get this right...
    39. Re:so what by ultranova · · Score: 1

      More interesting than the bug itself is this discussion from lwn.net:

      I think you got it backward. I claim it's standard because Linux does it that way. Linux is what violates the prescribed standard.

      I also didn't state the de facto standard as precisely as I could have, because Linux clearly should change to clear the DF flag. But Gcc should continue to clear it too, because old Linux exists.

      That guy is claiming that GCC should produce sub-optimal code rather than best possible within standards just to work around a Linux bug, and one where the patch is both trivial and already exists on top of that. Seriously, WTF ?

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

    40. Re:so what by ultranova · · Score: 1

      They cannot be directly compared since the prior is a measure of weight, while the latter is a measure of volume.

      However, one imperial fuckton per one metric assload is the density of Joe Public.

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

    41. Re:so what by Midnight+Thunder · · Score: 1

      NB The use of 'assload' without the 'metric' qualifier is discouraged, the customary US assload being a much greater mass.

      Well don't forget there are also other variations off the metric assload, such as the one used for measuring liquid quantities.

      --
      Jumpstart the tartan drive.
    42. Re:so what by sjames · · Score: 1

      This problem has existed for 15 years; GCC has always emitted code that worked correctly on kernels that did not follow the ABI, until now.

      And that's why it's not really a big deal. All of the tests worked, all of those binaries did (and do) the right thing. The bug lasted 15 years because all of those kernels DO have the correct ABI when compiled with the compilers available at the time. The issue is that the source doesn't assure that correctness in the new compiler.

      Part of the problem is that there are an enormous number of installed kernels that are vulnerable to this problem, but only if GCC 4.3 is installed.

      Of those, the vast majority were compiled with an earlier GCC that happened to do the right thing anyway, so they have no worries. For 2 examples, Fedora Core 8 and Debian Etch are on GCC 4.1.x.

      The rest of the systems were built by people who enjoy using the latest and greatest of everything (with all the extra risks that involves) and would probably have grabbed the patches and updated their kernel anyway.

      The part that DOES matter will be for the very few who down the road choose to use a new compiler on an old kernel. They will need to be aware of this issue and backport the relevant patch. Of course, those cases are always a bit of a risk since necessarily the old kernel was never tested with the new compiler.

      Really, this shows that the Open development model works. A new compiler came out, people tested, the problem was found, and appropriate remedies are underway. This is all well in advance of a production system ever seeing the problem.

    43. Re:so what by jwiegley · · Score: 1

      "much greater ass" did you say?

      Farce = ass * ??

      --
      I will never live for sake of another man, nor ask another man to live for mine.
    44. Re:so what by maestroX · · Score: 1

      .reffid ot geb I

    45. Re:so what by Z00L00K · · Score: 1
      There are a few interesting factors/questions here:

      1. What does the flag actually do? Will it cause kernel panics, segmentation faults or is it that you leave access to kernel memory open for a non-privileged process to access?

      2. Are there any performance benefits with this change? OK, one instruction less makes a tiny bit.

      3. Why just remove a feature that was used, why not add a flag that allowed the programmers to keep that feature?

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
    46. Re:so what by tyrione · · Score: 1

      Actually - and I attribute this to good ol' BK - GCC *could* make the problem go away, by recognizing when it is compiling the kernel, and inserting the code itself.

      Just sayin'.

      Read this -- http://cm.bell-labs.com/who/ken/trust.html

      That's a workaround and covers the short-coming of the kernel. I agree with providing the workaround while the kernel gets fixed.
    47. Re:so what by Anonymous Coward · · Score: 0

      That's a metric arseload, you insensitive clod!

    48. Re:so what by fatphil · · Score: 1

      Your post needs to be made in all caps, it seems to have been overlooked. (Hmmm, I'll see if my troller account has mod points...)

      Not just assembly, of which there are at least 3 commonly used dialects; even a different C compiler such as icc or tcc. And of course, let's not forget that there are other high-level languages entirely.

      The ignorance of some of the loudmouths on the kernel discussion forum is exemplified by this post by "khim" near the end of the page:
      """
      Yes, from formal POV kernel is wrong and GCC is right, but in reality you can fix either GCC or kernel - it does not matter which:
      """
      That's a completely blinkered attitude; 'khim' obviously is completely oblivious to the fact that there are any other languages apart from C and any other C compiler apart from gcc. The sensible people more closely involved with the kernel, such as HPA, have a diametrically opposite opinion, fortunately.

      --
      Also FatPhil on SoylentNews, id 863
    49. Re:so what by awrowe · · Score: 1

      That's, quite literally a fuckton of systems. So simply patching new kernels isn't going to make the problem go away.

      Is fuckton a new word?

      DAMN, I wish I could invent new words like that. The best I ever got was a unit of measurement called the poofteenth, which is a bit bigger than a gnats dick but a bit smaller than a tad.

      In a scramble to get on topic, releasing a patch will make the problem go away. Anyone who is going to use GCC 4.3 in an environment where it matters (i.e. not GarageNerd writing his new killer localhost 2.0 app) is going to check the situation out before going ahead with it.

      --
      A.I. Research. The peculiar science in which we know the question and we know the answer, but can't show the working
    50. Re:so what by Codifex+Maximus · · Score: 1

      maestroX said:
      ".reffid ot geb I"

      Now why did no one find this funny? Sorry MaestroX, we forgot to flip your flag.

      Sorry bro,

      --
      Codifex Maximus ~ In search of... a shorter sig.
    51. Re:so what by mhall119 · · Score: 1

      Unless of course you're referring to a metric fuckton.

      --
      http://www.mhall119.com
    52. Re:so what by cant_get_a_good_nick · · Score: 1

      This is not the first kernel bug exposed by gcc. The egcs fork of 2.9 series compilers exposed some bugs, and distros had to have a kgcc package that was back versioned to 2.8 to compile the kernel, the rest of the OS was the mainline egcs.

      It's not as if 4.3 is going to magically end up in a distro, entering ninja style without warning. Any distro vendor who adds 4.3 and does not have a patched kernel will have some kgcc equivalent.

      If you add a new compiler, put it in /opt, and keep using the system compiler for your kernel.

  3. GCC is wrong by BadAnalogyGuy · · Score: 0, Troll

    Rule #1: Don't break existing stuff

    GCC breaks this cardinal rule. It should be reverted.

    1. Re:GCC is wrong by Anonymous Coward · · Score: 5, Insightful

      "Rule #1: Don't break existing stuff"

      The ABI wasn't being followed correctly, hence GCC, Linux and the BSD kernels were already broken.

      "GCC breaks this cardinal rule. It should be reverted."

      It is not a wise idea to revert corrections to long standing issues.

    2. Re:GCC is wrong by bkaul01 · · Score: 5, Insightful

      So, are we going to get on GCC's case for enforcing standards compliance and thus breaking backwards compatibility while insisting that Microsoft should take the opposite approach with IE8?

    3. Re:GCC is wrong by Anonymous Coward · · Score: 5, Informative

      "Rule #1: Don't break existing stuff"

      GCC is in the business of creating new and better optimizations. It is pretty much impossible to make optimizations without assuming things in the ABI. As more and more stuff from the ABI is assumed in the optimizations, people get away with fewer violations of the ABI, but without assuming more stuff, faster optimizations wouldn't happen.

      Because the newest versions of GCC are necessary to improve the state of the art in C compiler optimizations in the open source world, the appropriate reaction to this is to have the compiler people follow the spec, and assume the spec, and if assuming the spec breaks something, the people affected by the breakage don't upgrade their compilers.

      This is why there are still people using GCC versions from the stone age.

    4. Re:GCC is wrong by BadAnalogyGuy · · Score: 2, Insightful

      I suppose this might be a longstanding issue if Linux was Unix.

    5. Re:GCC is wrong by Score+Whore · · Score: 0

      The ABI wasn't being followed correctly, hence GCC, Linux and the BSD kernels were already broken.


      I'm curious, why would you think that the BSD kernels were/are broken? Why would they be following the Sys V ABI? You do know that there are two general flavors of unix right? Sys V and BSD. Guess which one the BSDs are?
    6. Re:GCC is wrong by Anonymous Coward · · Score: 5, Informative

      Check the BSD mailing lists for yourself, they are affected. I'll give you one example below:

      http://leaf.dragonflybsd.org/mailarchive/commits/2008-03/msg00072.html

      Before flaming people next time, at least try and learn about what you're talking about.

    7. Re:GCC is wrong by Anonymous Coward · · Score: 0

      "I'm curious, why would you think that the BSD kernels were/are broken? Why would they be following the Sys V ABI? You do know that there are two general flavors of unix right? Sys V and BSD. Guess which one the BSDs are?"

      You do know that the SysV ABI goes along with ELF, and that the BSDs have adopted it along with the file format?

    8. Re:GCC is wrong by MostAwesomeDude · · Score: 1

      It's the x86 ABI, so it has nothing to do with the lineage of the code and everything to do with the architecture. (Unless you're going to tell me that BSD has its own double secret x86 ABI!)

      --
      ~ C.
    9. Re:GCC is wrong by Anonymous Coward · · Score: 0

      The name "SysV ABI" is misleading. All the BSDs have adopted it along with ELF, also originally a SysV standard. It's just historical. It's now pretty much the "everybody running on x86 Unix" ABI.

    10. Re:GCC is wrong by burgundysizzle · · Score: 1

      Of course it does. Try copying /dev/zero into a file to have a read of their double secret ABI! Once you're done you can save a copy by copying it back to /dev/null then removing the original file.

    11. Re:GCC is wrong by SeaFox · · Score: 3, Insightful

      Rule #1: Don't break existing stuff
      GCC breaks this cardinal rule. It should be reverted.


      Using that logic Microsoft shouldn't try to improve security in Windows since it breaks many third party applications that depend on exploits and other silly behavior to function.
    12. Re:GCC is wrong by Orion · · Score: 1

      It's not just GCC. The bug is actually in the kernel... you can propose that GCC be extra careful not to trigger this bug, but if someone wants to make a binary that triggers it, they don't need GCC to do that.

      It's a potential security hole, and an almost certain memory corruption-waiting-to-happen, and needs to get fixed in the kernel. GCC reverting to the old behaviour will, at best, prevent people from accidentally finding this.

    13. Re:GCC is wrong by mrmeval · · Score: 1

      Linus is a cave man?

      --
      I'd go on a Vegan diet but the delivery time from Vega is too long. --brownkitty
    14. Re:GCC is wrong by Anonymous Coward · · Score: 0

      No.

    15. Re:GCC is wrong by pembo13 · · Score: 1

      We darn well aren't. This is a fair bug, which was found out, and deserves to be fixed by someone who understands it. There should be no nitpicking, except to come to conclusion on how much of current systems are affected.

      --
      "Thanks for all the money you paid to us. We've used it to buy off ISO among other things" -Microsoft
    16. Re:GCC is wrong by Anonymous Coward · · Score: 1, Insightful

      No, that's silly. GCC development has a track record of doing good things, so we can assume what they're doing is good. Microsoft has a record of doing bad things (to put it mildly), so we can assume that, whatever they decide to do, it's probably the wrong choice.

    17. Re:GCC is wrong by ukyoCE · · Score: 1

      I think you have it backwards. Shouldn't you be saying:

      "So, we ARE going to get on GCC's case, right? For breaking compatibility with millions of systems, just like Microsoft intentionally broke Firefox, Opera, and Safari?"

      Standards compliance is generally a good drum to bang, but what's REALLY important is what you're breaking. It seems to me GCC has a fix in search of a problem. If they really want to meet the standard here, I think it would be reasonable to request the fix from the broken kernels and wait a reasonable amount of time for proliferation before releasing the fix.

      I don't care much either way, mainly wanted to point out that the problem with IE has been the fact that it BREAKS other web browsers. The standards are just an easy place to point to determine which browser is the problem. And of course blatant abuse of a monopoly to squash competitors gets some of us a little peeved too.

    18. Re:GCC is wrong by n3tcat · · Score: 1

      SHHH!!! This is slashdot! An evangelist might hear you!

    19. Re:GCC is wrong by evanbd · · Score: 3, Interesting

      Silly question time...

      If this managed to affect both Linux and BSD despite no relevant common code, is Windows affected? I'm guessing OSX is, thanks to its BSD heritage. Has anyone tested either of them, though? How about other OSes?

    20. Re:GCC is wrong by Vlad_the_Inhaler · · Score: 2, Interesting

      It is not quite as bad as that. It causes problems between two threads, but both threads have to be from the same program. If someone has such a specially crafted program running on their system, they have been breached already.

      No privilege escalation, only DOS.

      --
      Mielipiteet omiani - Opinions personal, facts suspect.
    21. Re:GCC is wrong by badfish99 · · Score: 3, Funny

      On the other hand: the instructions affected by this aren't used very much, so if you want optimizations, a good candidate would be to not clear the flag unless it is needed. If the ABI were simply changed to allow this, no existing code would break (obviously), and future code could both conform to the new ABI *and* avoid the overhead of unnecessary instructions to clear the flag when it is not being used.

      I suppose the only barrier to this optimization would be the political effort needed to get everyone to agree to change the ABI.

    22. Re:GCC is wrong by Bazer · · Score: 1

      Or let them add an opt-in compatibility flag which will tell GCC to clear that flag manually and be done with it.

    23. Re:GCC is wrong by WK2 · · Score: 2, Interesting

      1) Nobody is getting on gcc's case. As I understand it, they are doing the right thing, and reverting to the older, safer, although slightly slower, behavior.

      2) Perhaps you haven't gotten the news, but IE8 is doing the right thing too, by using their "less broken" mode by default. This is a switch from what they announced earlier, where you would have to opt-in to better standards compliance.

      3) The difference between IE and gcc is that IE is broken, and gcc is not. Clearing the DF does not break standards in any way. In fact, according to the ABI, it needed to be done anyway (although the kernel is supposed to do it). Guess what happens when you clear the DF twice?

      --
      Write your own Choose Your Own Adventure. http://www.freegameengines.org/gamebook-engine/
    24. Re:GCC is wrong by Lonewolf666 · · Score: 2, Interesting

      Enforcing standards compliance will be a pain in the short run, but pay off in the long run. Because you can get away with accommodating old bugs (or bad designs, but that gets offtopic) for a while, but eventually the difficulty in maintaining all the quirks grows to a point where it is no longer doable.

      I think Windows Vista is a good example of what happens when you try to maintain backwards compatibility to the assorted bugs and mis-designs of decades. See the various Vista articles on /. on how that worked out ;-)

      If Microsoft takes the opposite approach with IE8, I consider that a good move and a sign that they are capable of learning.

      --
      C - the footgun of programming languages
    25. Re:GCC is wrong by larry+bagina · · Score: 1

      as the year of the linux desktop becomes a reality, we've already seen malicious and trojaned programs make their way into repositories. Not to mention distros that contain root exploits with a default installation.

      --
      Do you even lift?

      These aren't the 'roids you're looking for.

    26. Re:GCC is wrong by Anonymous Coward · · Score: 0

      You think they try?

    27. Re:GCC is wrong by Lumpy · · Score: 1

      The funny part.

      Last I knew you could download ANY older version of GCC you wanted. Who cares if GCC breaks compiling Linux Kernel, BSD and Hurd? It's not like there is a law that states "thou must use the latest GCC or you will be killed and eaten by the grue!"

      Use the older compiler until you fix your broken code, THEN upgrade to the latest. It blows my mind how many geeks think they MUST HAVE the latest revision.

      The GCC guys should NOT release a new revision that puts this back in. It's a fix; everyone else needs to catch up with the fix or use an older compiler.

      --
      Do not look at laser with remaining good eye.
    28. Re:GCC is wrong by EvanED · · Score: 1

      Windows doesn't have signals as such, so I'm going to say not really. It might affect the POSIX subsystem and perhaps even Cygwin, but almost certainly not "native" Windows apps.

    29. Re:GCC is wrong by norton_I · · Score: 1

      Windows uses a different ABI, the windows ABI, so the definition of how the DF should be set is based on a different standard. Furthermore, in the windows world it is a fairly strong argument that the actual behavior of windows+MSVC is the most authoritative guide, rather than what any document may or may not say about what should happen. Finally, windows doesn't have unix signal handlers, so the particular point in question is moot, though there may be alternate ways where the same issue would show up in windows.

    30. Re:GCC is wrong by Eponymous+Bastard · · Score: 2, Informative

      Windows does not have signal handlers natively (or actually, only a few, now that I google it: SIGABRT, SIGFPE, SIGILL, SIGINT, SIGSEGV, SIGTERM). There is the whole SEH C-language exception mechanism, which takes over some of the uses, but no other signals natively. So you won't write a signal handler that gets called on a timer.

      Full signals for GCC-compiled programs would be implemented by Cygwin which should give you timer signals and so on. Since the standard way to upgrade GCC under cygwin is to use the cygwin upgrade/package manager, they can just make the new GCC package depend on an updated cygwin DLL which could set the correct flag for you in a thunk before passing on the signal.

      Don't bother trying to compile GCC yourself under cygwin, it's quite painful. Or at least time-consuming, the slower process spawning makes configure take an hour or more last time I tried it a few years ago. And then you have to wait for make bootstrap to finish.

      Then again, MS isn't notorious for following standards. If this does show up under windows (say when starting an SEH handler) they'll just say that that's the windows ABI and ignore it.

      Hell, it might even be different under win98/XP/Vista, as they are different kernels.

    31. Re:GCC is wrong by Anonymous Coward · · Score: 0
      Using that logic Microsoft shouldn't try to improve security in Windows since it breaks many third party applications that depend on exploits and other silly behavior to function.

      So you totally agree with the logic then.

    32. Re:GCC is wrong by dwheeler · · Score: 1
      The problem here is that even though the standard said something, neither kernels nor compiled programs complied with it. There's no need to rush to make this change - current code is working! By rushing to remove this "extraneous" setting, code generated with the new gcc will silently screw up on older kernels. Users often run older kernels, and/or need to revert to them when things go wrong. Suddenly switching, when there's no need to rush the change, is a terrible idea. Especially since there's no real advantage to the change (not even a real performance advantage for real programs under usual usage models).

      There's an easy solution: A grace period. Let the kernel developers know (done), let the kernel developers fix their kernels (done), and give time for such fixed kernels to spread. Change the compiler so that it won't generate the "unnecessary" code, but DISABLE that by default for a while. After a long time (two years? More?), switch to enable-by-default. This change has no useful upside for users, and lots of downsides (in terms of broken programs), so there's no reason to hurry it. It's fine to do it, but just do it slowly.

      Vista complaints have nothing to do with "too much backwards compatibility"; one of the key Vista problems is a LACK of adequate backwards compatibility: http://www.iexbeta.com/wiki/index.php/Windows_Vista_Software_Compatibility_List#Heavy_Problems.2C_Currently_Incompatible http://www.pcmag.com/article2/0,1759,2104022,00.asp

      The only reason to run Windows (any version) is to be able to use the hardware and software that is only compatible with Windows. I know of no one who claims that Windows is the "best operating system on Earth", by any measure; people choose Windows for application compatibility, not for its "innovation". A "grossly incompatible Windows" is completely worthless.

      Similarly, even if it's in the spec, it's absurd to change a system and make it so user programs just break. Think of the users. Instead, figure out how to make the change WITHOUT harming the users. That user might be you.

      --
      - David A. Wheeler (see my Secure Programming HOWTO)
    33. Re:GCC is wrong by Goaway · · Score: 1

      They are used in memmove(), which you can hardly argue is not "used very much".

    34. Re:GCC is wrong by Goaway · · Score: 1

      That is not an argument that should ever be made. This kind of messy, unpredictable problem involving the kernel is exactly the kind of thing that may suddenly turn out to be exploitable after everyone's assumed it's not, when somebody clever enough sets their mind to it. It's happened many times before.

      You really want to err on the side of caution in these cases.

    35. Re:GCC is wrong by bytesex · · Score: 2, Funny

      You lose one CPU cycle ?

      --
      Religion is what happens when nature strikes and groupthink goes wrong.
    36. Re:GCC is wrong by Anonymous Coward · · Score: 0

      I'm curious, why would you think that the BSD kernels were/are broken?

      Because it says so in the summary?

    37. Re:GCC is wrong by Vlad_the_Inhaler · · Score: 1

      Fixing asap (in the kernel) is a necessity, but 'all' it really does is to cause affected programs to hang or abort.

      What did Douglas Adams say? DON'T PANIC.

      --
      Mielipiteet omiani - Opinions personal, facts suspect.
    38. Re:GCC is wrong by Goaway · · Score: 1

      All that we know it does. That was the point. We can not tell that that is all it can be used to do with any high degree of certainty.

    39. Re:GCC is wrong by edwdig · · Score: 1

      On the other hand: the instructions affected by this aren't used very much, so if you want optimizations, a good candidate would be to not clear the flag unless it is needed.

      That's how the old code worked - it cleared the flag before using instructions impacted by it. However, you can't say the affected instructions aren't used much.

      All the string & bulk data instructions care about the direction flag. Unless your compiler generates really bad code, I'd expect memcpy, memmove, memcmp, strcpy, strlen, strcat, etc all to be impacted. realloc most likely would either call memmove or would use similar code as well if it has to move your memory to enlarge the allocation.

      Also, things like assigning one struct to another is essentially an inline memcpy, hence that could be affected as well.

      With the compiler no longer clearing the direction flag, any of those functions would misbehave if called with the direction flag set.

      and future code could both conform to the new ABI *and* avoid the overhead of unnecessary instructions to clear the flag when it is not being used.

      The flag only ever gets set when it's needed. Needing the flag set is rather rare, whereas most code needs it clear. Hence why the idea is that the code that sets the flag should be responsible for clearing it, rather than having to constantly clear the flag every time it comes into play.

    40. Re:GCC is wrong by Majikk · · Score: 1

      I'd say it's very safe to assume Windows is compiled with Visual Studio and thus has nothing to do with any of this.

    41. Re:GCC is wrong by ultranova · · Score: 2, Interesting

      It is not quite as bad as that. It causes problems between two threads, but both threads have to be from the same program.

      Actually, no. Two threads will work just fine, because the state of the CPU in its entirety (all flags) is saved and restored when switching between them - indeed, if it wasn't, simply clearing the flag before using it wouldn't help any, because a task switch can occur between any two instructions, including the one clearing the flag and the one immediately following, which makes use of the now-cleared flag.

      No, the problem is in signal handlers, which are the software-level equivalent of interrupts. When a thread receives a signal, and a handler has been registered, it immediately interrupts what it was doing and executes the handler function - or, more precisely, the kernel switches the point of execution to the start of that function. Now, the problem is that the spec says that a certain flag should be cleared whenever a function starts, and the kernel doesn't make sure it is. It didn't matter previously, because GCC generated code to clear it anyway; however, this is redundant according to the spec, so it was dropped.

      So, to sum it up: this has nothing to do with threading and can affect single-threaded programs just fine.
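
      A minimal sketch of that sequence, as a toy Python model rather than real kernel or compiler code. The `cpu` dict, `deliver_signal`, and both handler functions are made-up names for illustration; the only behaviour taken from the discussion is that older GCC emitted a clear before string operations, GCC 4.3.0 does not, and a fixed kernel clears the flag on handler entry.

```python
# Toy model: `cpu` is a plain dict standing in for the real flags register.
# All names here are hypothetical; nothing below is kernel or GCC source.


def deliver_signal(cpu, handler, kernel_clears_df):
    """Model the kernel interrupting a thread to run a signal handler."""
    saved_df = cpu["df"]      # the interrupted context is saved
    if kernel_clears_df:
        cpu["df"] = 0         # a fixed kernel enters the handler with DF clear
    direction = handler(cpu)
    cpu["df"] = saved_df      # the interrupted context is restored on return
    return direction


def string_op_direction(cpu):
    """Report which way a string instruction would walk memory right now."""
    return "backward" if cpu["df"] else "forward"


def handler_old_gcc(cpu):
    """Handler as older GCC built it: clear DF, then do the string operation."""
    cpu["df"] = 0
    return string_op_direction(cpu)


def handler_gcc_430(cpu):
    """Handler as GCC 4.3.0 builds it: trust the ABI and skip the clear."""
    return string_op_direction(cpu)


def outcomes():
    """Run every compiler/kernel pairing, interrupted while DF happens to be set."""
    table = {}
    for name, handler in (("old gcc", handler_old_gcc), ("gcc 4.3.0", handler_gcc_430)):
        for fixed in (False, True):
            cpu = {"df": 1}   # main code was in the middle of a backward copy
            table[(name, fixed)] = deliver_signal(cpu, handler, fixed)
    return table


for (compiler, kernel_fixed), direction in outcomes().items():
    print(compiler, "| kernel fixed:", kernel_fixed, "|", direction)
```

      In this model only the pairing of the new compiler with an unfixed kernel walks memory backward, and no second thread is involved anywhere.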

      No privilege escalation, only DOS.

      This bug could conceivably cause parts of a program's memory to be overwritten by the contents of a string. It isn't unthinkable that this might cause a foreign code execution attack in the program.

      Although it does seem pretty unlikely that anyone would do string copying in a signal handler...

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

    42. Re:GCC is wrong by Ungrounded+Lightning · · Score: 1

      Using that logic Microsoft shouldn't try to improve security in Windows ...

      Isn't that already their business plan?

      --
      Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
    43. Re:GCC is wrong by EkriirkE · · Score: 1

      Not acceptable!

      --
      from 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
      to 45 2F 6E 40 3C DF 10 71 4E 41 DF AA 25 7D 31 3F
  4. Kernel bug by Harmonious+Botch · · Score: 4, Funny

    Better than a general fault.

    1. Re:Kernel bug by clickety6 · · Score: 3, Informative


      but not as good as a major screw-up or even a private error

      --
      ----------------------------------- My Other Sig Is Hilarious -----------------------------------
    2. Re:Kernel bug by Anonymous Coward · · Score: 0

      It's still a major pain though.

  5. WOW!! by EEPROMS · · Score: 0, Flamebait

    An error was found and patched, now remind me why this is news ?

  6. Linux Replies by Anonymous Coward · · Score: 0

    "A bug? In the Kernel? BWAHAHA, a trivial matter" [Snaps random developer in half, ingests]

  7. EVERYBODY PANIC!!! by Anonymous Coward · · Score: 5, Funny

    GCC 4.3.0's new behavior of not clearing the direction flag before a string operation on x86 systems poses problems with kernels -- such as Linux and BSD -- that do not clear the direction flag before a signal handler is called, despite the ABI specification.

    Oh my GOD! If this is true, that means- that means-- it... the-

    Uh, what does it mean exactly?

    1. Re:EVERYBODY PANIC!!! by EkriirkE · · Score: 5, Informative

      When scanning strings for, say, a null terminator, the direction flag determines if the current memory register gets incremented or decremented after each byte check. It could mean strlen returns 0 if your strings are grouped together in a segment of memory, or it just plain returns the wrong result. Also memory copy routines could copy the wrong part of memory to the wrong place and overwrite executable code (or just cause a page/segment fault).

      --
      from 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
      to 45 2F 6E 40 3C DF 10 71 4E 41 DF AA 25 7D 31 3F
    2. Re:EVERYBODY PANIC!!! by Anonymous Coward · · Score: 5, Funny

      I'm sorry, I'll need a car analogy on that one.

    3. Re:EVERYBODY PANIC!!! by EkriirkE · · Score: 5, Informative
      In x86 (assumed from here on) assembly, there are some 'quick' operations to read, write, and test memory (LODS*, STOS*, SCAS* respectively - there are probably more). The CPU has registers, or variables that are counters, or hold the memory addresses in question - in these cases a source memory position and a destination memory position. When you perform these commands, the memory registers either increment or decrement in value (position) depending on how the direction flag is set. GCC is assuming the flag is clear and the pointers will increment - go forward after each call. If the direction flag is set incorrectly upon calling these string or memory functions, the pointers could go backwards and thus copy (or scan) the wrong chunk of memory to the wrong destination.

      Say our source memory contains:

      Address:  0123456789ABCDEFGHIJKLMNOPQRSTUV
      Contents: XXXXXXXXA car is heavy.-XXXXXXXX


      Let's pretend the hyphen is a null (the string terminator or "stop" in most languages and OSes). If I want to perform a strlen on that string at position '8', it should return 15 characters because it found the null at 'N'. If the direction flag is wrong, it will not scan 8, 9, A, ... but 8, 7, 6, ... until it finally finds that null or crashes with an access violation.

      And with memory, I want to copy 5 bytes from '8' to position 'P'. If that works correctly, we get this in memory:

      Address:  0123456789ABCDEFGHIJKLMNOPQRSTUV
      Contents: XXX-!@#$A car is heavy.-XA carXX


      However, if the direction is wrong, we will get:

      Address:  0123456789ABCDEFGHIJKLMNOPQRSTUV
      Contents: XXX-!@#$A car is heav!@#$AXXXXXX


      See how '8' copied to 'P' as expected, but decrementing we then get '7' to 'O', etc.

      We now have corrupt memory. If we do a strlen, strcat or other null-expecting function on that string located at '8' we will see garbage where the memory copy wrote the wrong data to the wrong position. For the nitpickers, this example used per-byte operations; there are 16, 32, 64 bit variants of the functions that would cause similar problems but in 2, 4, 8 byte chunks.
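
      The walkthrough above can be replayed with a small simulation. This is a toy Python model, not machine code: `scan_for_nul` and `copy_bytes` are hypothetical stand-ins for REPNE SCASB and REP MOVSB, `df` is just a function argument, and the starting memory is the layout with `-!@#$` at addresses 3 to 7 that the two result rows above imply.

```python
# Toy model of x86 string instructions; DF=0 steps forward, DF=1 steps backward.

NUL = ord("-")  # the example uses '-' as a stand-in for the NUL byte


def scan_for_nul(mem, start, df):
    """Model of REPNE SCASB: return the address where the terminator is found."""
    step = -1 if df else 1
    pos = start
    while mem[pos] != NUL:
        pos += step
        if pos < 0 or pos >= len(mem):
            raise IndexError("ran off the buffer (an access violation on real hardware)")
    return pos


def copy_bytes(mem, src, dst, count, df):
    """Model of REP MOVSB: copy count bytes, stepping both pointers by DF."""
    step = -1 if df else 1
    for _ in range(count):
        mem[dst] = mem[src]
        src += step
        dst += step


memory = bytearray(b"XXX-!@#$A car is heavy.-XXXXXXXX")
start, dest = 8, 25  # addresses '8' and 'P' in the example

print(scan_for_nul(memory, start, df=0) - start)  # forward scan: string length
print(scan_for_nul(memory, start, df=1))          # backward scan: where it stops

good = bytearray(memory)
copy_bytes(good, start, dest, 5, df=0)
bad = bytearray(memory)
copy_bytes(bad, start, dest, 5, df=1)
print(good.decode())
print(bad.decode())
```

      Run as written, this prints 15, then 3 (the backward scan stops at the stray terminator at address '3'), then the same two memory rows shown above.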
      --
      from 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
      to 45 2F 6E 40 3C DF 10 71 4E 41 DF AA 25 7D 31 3F
    4. Re:EVERYBODY PANIC!!! by EkriirkE · · Score: 2, Informative
      Oops, source memory was supposed to be (better aligned, too):

      Address: 0123456789ABCDEFGHIJKLMNOPQRSTUV
      Content: XXX-!@#$A car is heavy.-XXXXXXXX
      --
      from 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
      to 45 2F 6E 40 3C DF 10 71 4E 41 DF AA 25 7D 31 3F
    5. Re:EVERYBODY PANIC!!! by Neon+Spiral+Injector · · Score: 5, Insightful

      The rules of the road say that you should check that the car is in drive before setting out on your trip. The older version of GCC used to put the car into drive for you. But the new version lets you leave it in reverse if you don't check, making you exit out the rear wall of your garage.

    6. Re:EVERYBODY PANIC!!! by Anonymous Coward · · Score: 0

      That was excellent. Thank you!

    7. Re:EVERYBODY PANIC!!! by teh+loon · · Score: 1

      Ahh yes.. another brilliant car analogy on slashdot. :)

    8. Re:EVERYBODY PANIC!!! by SL+Baur · · Score: 1

      I'm sorry, I'll need a car analogy on that one.

      It means that you are never sure if your car is in gear or in reverse. So you don't know which direction you will go when you step on the gas.
    9. Re:EVERYBODY PANIC!!! by dido · · Score: 1

      I wonder if anyone still actually uses the old LODS/STOS/MOVS/CMPS instructions; string instructions like these are the only ones affected by the direction flag. As far as I can tell, on modern x86 systems they are significantly slower than the equivalent multi-instruction versions that read/write/compare via register indirection, i.e. RISC-style code, and they are even slower yet than using MMX or SSE instructions to copy data, if they are available.

      I don't think that compilers are smart enough to use, say, a MOVSD instruction when they see a *p++ = *q++ in someone's code, as that would require setting up the direction flag, setting the ESI and EDI registers correctly, and possibly ECX as well, to do a REP MOVSD properly. CISC-style instructions often have strange requirements like this. The only way that these instructions that do care about the direction flag could plausibly appear in actual code is if someone wrote them in assembly explicitly, and it may be that the glibc code for something like memcpy uses them, but then on a Pentium or more recent processor the four-instruction equivalent to movsd (mov eax,[esi]; add esi,4; mov [edi],eax; add edi,4, for some suitable ordering of the add instructions) would be faster than a movsd because of instruction pipelining.

      Correct me if I'm wrong, but these instructions haven't been worth using on x86 since at least the Pentium, where instruction-level parallelism can blow the performance of these older instructions out of the water.
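
      One side effect of the four-instruction sequence is worth spelling out: it never reads the direction flag at all. Here is a toy Python model (not assembly, and saying nothing about speed) with hypothetical `movsd` and `explicit_copy` helpers and a `regs` dict standing in for ESI and EDI.

```python
def movsd(mem, regs, df):
    """One MOVSD: copy 4 bytes from [ESI] to [EDI], then step both by +4 or -4."""
    src, dst = regs["esi"], regs["edi"]
    mem[dst:dst + 4] = mem[src:src + 4]
    step = -4 if df else 4
    regs["esi"] += step
    regs["edi"] += step


def explicit_copy(mem, regs):
    """mov eax,[esi] / add esi,4 / mov [edi],eax / add edi,4: DF is never read."""
    eax = bytes(mem[regs["esi"]:regs["esi"] + 4])
    regs["esi"] += 4
    mem[regs["edi"]:regs["edi"] + 4] = eax
    regs["edi"] += 4


def run(copy_one, steps=2):
    """Copy `steps` dwords starting at "ABCD" and return the final memory and registers."""
    mem = bytearray(b"abcdABCDEFGH" + b"." * 12)
    regs = {"esi": 4, "edi": 16}  # source points at "ABCD"
    for _ in range(steps):
        copy_one(mem, regs)
    return mem.decode(), regs


print(run(explicit_copy))
print(run(lambda m, r: movsd(m, r, df=0)))
print(run(lambda m, r: movsd(m, r, df=1)))
```

      The first two runs agree; the third drags "abcd" from below the source into the destination, which is the failure mode discussed elsewhere in the thread.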

      --
      Qu'on me donne six lignes écrites de la main du plus honnête homme, j'y trouverai de quoi le faire pendre.
    10. Re:EVERYBODY PANIC!!! by faragon · · Score: 1

      These "quick operations" are not quick anymore; on modern -out of order- x86 processors (P4, PentiumM/Core, K7, K8), explicit string search (still without using SIMD tricks) is from 2x to 3x -using SSE2 prefetch- faster than the microprogrammed code, as you can unroll loops without conditional jump penalty.

    11. Re:EVERYBODY PANIC!!! by EkriirkE · · Score: 1

      MS Windows uses REP MOVSD and a topoff with REP MOVSB in its internal string memory copy (RtlCopyMemory/RtlMoveMemory), as does the MSVCRT... And I do as well :)
      When Windows has to backwards copy on overlap it does STD then immediately CLD after the REP operation, and also ensures the CLD on a forward copy. MSVCRT does not ensure it on the forward copy.
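
      The set-then-clear discipline described here can be sketched as a toy Python model. This is not the Windows or MSVCRT code, which I have not reproduced; `move_bytes` and the `cpu` dict are hypothetical names, and the model only shows why a backward copy is needed on overlap and why the flag is cleared again afterwards.

```python
def move_bytes(mem, src, dst, count, cpu):
    """Toy memmove: copy backward only when a forward copy would clobber the source."""
    if src < dst < src + count:
        cpu["df"] = 1                       # STD: walk from the last byte down
        s, d, step = src + count - 1, dst + count - 1, -1
    else:
        cpu["df"] = 0                       # CLD: ordinary forward copy
        s, d, step = src, dst, 1
    for _ in range(count):
        mem[d] = mem[s]
        s += step
        d += step
    cpu["df"] = 0                           # CLD again so later code sees DF clear


cpu = {"df": 0}

shifted_up = bytearray(b"abcdef--")
move_bytes(shifted_up, 0, 2, 6, cpu)        # overlapping, destination above source
print(shifted_up.decode(), cpu["df"])

shifted_down = bytearray(b"--abcdef")
move_bytes(shifted_down, 2, 0, 6, cpu)      # overlapping, destination below source
print(shifted_down.decode(), cpu["df"])
```

      The window in which the flag is set is just the copy loop itself, which is the span a badly timed signal would have to hit.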

      --
      from 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
      to 45 2F 6E 40 3C DF 10 71 4E 41 DF AA 25 7D 31 3F
    12. Re:EVERYBODY PANIC!!! by Anonymous Coward · · Score: 0

      It means a bunch of programs compiled with gcc-4.3 just plain don't work on x86 Linux.

      For example, libc on Debian unstable switched to being built with gcc-4.3 a couple weeks ago. Ever since then, SBCL (arguably the most popular free Lisp compiler) has been completely broken: it can't even install itself (it drops you into a debugger with "GC invariant broken").

      I suspect that just about any program that wants to talk to libc on its own (i.e., not through C code compiled by gcc) is going to barf now. Lisp compilers, sadly, tend to be easy victims, because (a) they're native compilers, (b) they usually provide easy access to some useful libc calls, and (c) they compile their own code (i.e., not through gcc or any C compiler).

      If only Lisp programmers were satisfied with an interpreter written in C, like the Ruby guys! Why oh why did we insist on having a dynamic language with good performance? (Yeah, I could use Clisp, which is byte-compiled and slow. It works fine still. But it doesn't even support threads, which my program needs.)

    13. Re:EVERYBODY PANIC!!! by faragon · · Score: 3, Informative

      Some examples, actual benchmarks (2 years old, but pretty much the same with K8 and Core2Duo):


      REPNE SCASD: (look up an element in a sequential dword vector)

      Pentium II @300MHz: 133 MB/s (100MHz FSB, 100MHz SDRAM)
      Pentium IV @3GHz: 2.3 GB/s (800MHz FSB, 400MHz DDR SDRAM)


      256-bit unrolling: (process 8 elements in a row)

      Pentium II @300MHz: 233MB/s (100MHz FSB, 100MHz SDRAM)
      Pentium IV @3GHz: 3.3 GB/s (800MHz FSB, 400MHz DDR SDRAM)


      256-bit unrolling w/ SSE2 prefetch to increase data cache hit: (process 8 elements in a row)

      Pentium II @300MHz: -no SSE2- (100MHz FSB, 100MHz SDRAM)
      Pentium IV @3GHz: 4.0 GB/s (800MHz FSB, 400MHz DDR SDRAM)
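
      For anyone who wants the ratios rather than the raw rates, here is a short Python calculation over the figures listed above. The only assumption is treating 1 GB/s as 1000 MB/s; the dictionary and function names are made up for the example.

```python
# Throughput figures copied from the list above, in MB/s (1 GB/s taken as 1000 MB/s).
rep_scasd = {"Pentium II": 133, "Pentium IV": 2300}
unrolled = {"Pentium II": 233, "Pentium IV": 3300}
unrolled_prefetch = {"Pentium IV": 4000}  # no SSE2 on the Pentium II


def speedup(variant, baseline=rep_scasd):
    """Ratio of each variant's throughput to REPNE SCASD on the same CPU."""
    return {cpu: round(rate / baseline[cpu], 2) for cpu, rate in variant.items()}


print(speedup(unrolled))
print(speedup(unrolled_prefetch))
```

      On these numbers the unrolled loop comes out at roughly 1.75x on the Pentium II and 1.43x on the Pentium IV, and about 1.74x on the Pentium IV once prefetch is added.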



      P.S. Both REP MOVSB and REP MOVSD are slow: the performance per clock is between 1/8 and 1/16 in the first and between 1/2 and 1/4 in the second. There is no reason for using the microcoded instructions other than backwards compatibility, and it seems nonsense to me to skip writing unrolled and/or prefetched memcpy/memmove/scan variants just to save 16KB.

    14. Re:EVERYBODY PANIC!!! by RupW · · Score: 4, Informative

      The rules of the road say that you should check that the car is in drive before setting out on your trip. The older version of GCC used to put the car into drive for you. But the new version lets you leave it in reverse if you don't check making you exit out the rear wall of your garage.

      That's not quite right. In this case:
      • the rules of the road say that you can assume you'll find your car in drive
      • the old version of GCC used to always check anyway and put the car in drive for you; the new version just assumes the car is already in drive, because that's what the rules say.
      The problem comes when an affected kernel temporarily hands your car over to a signal handler - let's say "parking valet". The valet now doesn't bother checking the car is in drive when he gets in, because the rules of the road say the kernel should have given him the car in drive. In the past GCC looked over his shoulder to make sure the kernel had really left the car in drive for him. But now no-one bothers checking for him and he might then accidentally crash your car.

    15. Re:EVERYBODY PANIC!!! by petermgreen · · Score: 1

      Do you know of anything that was broken besides SBCL (which was how this was discovered in the first place)?

      BTW I believe the intention of debian is to attack this problem from all sides. Afaict SBCL is being changed to keep the direction flag set for as short a time as possible, gcc is being changed to return to the older, less-likely-to-fail behaviour, and linux is being changed to do what it should have done in the first place.

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
    16. Re:EVERYBODY PANIC!!! by faragon · · Score: 1

      You're right. These instructions became useless (because a much faster implementation was possible) with the Pentium Pro, which introduced out-of-order execution and enhanced branch prediction. I'm not sure whether unrolling can actually be much faster on the original Pentium, but I'm convinced that you could get a notable speed-up, if not in the string scan case, at least in the memcpy case using the FPU for 64-bit transfers.

    17. Re:EVERYBODY PANIC!!! by Anonymous Coward · · Score: 0

      Simply because Microsoft uses it doesn't mean it's any good. Depending on CPU, prefetch and using FPU (pre-MMX)/MMX/SSE registers is a clear performance win compared to rep movsd. Heck, rep movsd was only hot in pre-486 days!

      Currently movq/movdqa beats rep movsd by 1.05 - 3x depending on CPU model and manufacturer. Newest Intel CPUs (P4 & anything 'core') get about 5-50%. Newest AMD models can get up to about 100% boost. Smaller benefit on small transfers due to extra setup cost. You can of course mitigate that by doing only 128-bit aligned transfers. Most everything is 16-byte aligned these days anyways, might just as well take advantage of that...

    18. Re:EVERYBODY PANIC!!! by MMC+Monster · · Score: 1

      (I am not a kernel/gcc developer) This is what it means in a practical sense: All the userland applications that come precompiled on websites for various distributions (including (binary only) .rpm, .deb, and .tar.gz files) will not work with the newer versions of the distributions (once they take up this gcc/kernel patch). This means a cleaning house of a lot of binaries created in the last couple years.

      --
      Help! I'm a slashdot refugee.
    19. Re:EVERYBODY PANIC!!! by TheRaven64 · · Score: 1

      Note that this is only true if strlen uses the string instructions. Since using the string instructions for anything is just a plain bad idea, if this is actually a problem for anyone then it's likely to be a compiler bug.

      --
      I am TheRaven on Soylent News
    20. Re:EVERYBODY PANIC!!! by Anonymous Coward · · Score: 0

      ... the parking valet with someone more in charge of the situation. Voila, problem solved.

      If all things are as cars, we would have a beautiful world.

    21. Re:EVERYBODY PANIC!!! by Bert64 · · Score: 1

      That assumes code compiled to target a pentium pro or newer...
      A lot of code is compiled to target much older processors, so it may well still use these outdated instructions.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    22. Re:EVERYBODY PANIC!!! by Anonymous Coward · · Score: 0

      Damn Gentoo users -- always trying to eke out a measly performance gain!

    23. Re:EVERYBODY PANIC!!! by RupW · · Score: 1

      (I am not a kernel/gcc developer) This is what it means in a practical sense: All the userland applications that come precompiled on websites for various distributions (including (binary only) .rpm, .deb, and .tar.gz files) will not work with the newer versions of the distributions (once they take up this gcc/kernel patch). This means a cleaning house of a lot of binaries created in the last couple years.

      No, not at all. Any existing binary compiled with GCC 4.2.x or earlier will never show this issue whether it's running against a kernel without the fix or not.

      The issue only occurs when a binary compiled with GCC 4.3 or later is run against a kernel that hasn't been fixed. Even then, things will only go wrong if a signal handler that uses string operations is triggered in the middle of a similar operation in the code that's running with the direction flag set, e.g. an overlapped memmove. It's very rare.
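
      Those conditions can be written down as a single predicate. This is only a restatement of the paragraph above in Python, with a hypothetical function name and parameters, and it takes "GCC 4.3 or later" at face value.

```python
def can_misbehave(gcc_version, kernel_fixed, handler_uses_string_ops, df_set_when_interrupted):
    """True only when every condition listed above holds at once."""
    compiler_skips_clear = gcc_version >= (4, 3)
    return (
        compiler_skips_clear
        and not kernel_fixed
        and handler_uses_string_ops
        and df_set_when_interrupted
    )


print(can_misbehave((4, 2, 4), False, True, True))   # older compiler, unfixed kernel
print(can_misbehave((4, 3, 0), False, True, True))   # every condition met
print(can_misbehave((4, 3, 0), True, True, True))    # fixed kernel
print(can_misbehave((4, 3, 0), False, True, False))  # signal arrived with DF clear
```

      Only the second call returns True; knocking out any one condition makes the problem disappear, which is why it shows up so rarely.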
    24. Re:EVERYBODY PANIC!!! by faragon · · Score: 1

      That kind of unrolling has to be handwritten in assembly, because current compilers, though good, are not good enough to squeeze out the last bit of performance.

      I know you were joking, but these days, on both in-order and out-of-order CPUs, there is very little code that is worth handwriting in assembly (only vector code, and it may be a matter of time before compilers produce good enough vectorized code). Good handmade C/C++ optimization, helping the compiler with the strength reduction and constant propagation phases, with some explicit/manual unrolling, and trying to avoid data cache misses, is almost as fast as writing in assembly (one data cache miss costs more than any superb assembly optimization gains, as current RAMs have huge latencies, from 40 to 200 cycles).

    25. Re:EVERYBODY PANIC!!! by faragon · · Score: 1

      I posted a benchmark that without using special instructions, just some unrolling, gives about 2x speed-up on Pentium Pro and above, without losing backward compatibility. For extra speed-up, you can use SSE2 prefetch, but it is not necessary.

    26. Re:EVERYBODY PANIC!!! by LarryWest42 · · Score: 1

      You're driving in your new gcc-4.3.0-built car, when you come to a traffic light that's red. When it changes to green, your car may have silently slipped into reverse, but you won't know until you press the gas pedal.

      For standard transmissions: your car was written in assembly language, so no problem here.

    27. Re:EVERYBODY PANIC!!! by LarryWest42 · · Score: 1

      Actually, of course, assembly language would have the same problem, but it would be just as relatively unlikely to show up, unless your code set the direction flag.

      Note that the kernel problem actually will affect any program, pre-gcc-4.3.0 or other compilers: there is a very small but finite probability that an interrupt will occur after setting the direction flag and before executing the memory copy or scan. And (with this bug) there's a non-zero probability that the interrupt handler will leave the direction flag in a different state.

      It seems like an extremely unlikely event, but one that's probably caused some number of unexplained (and irreproducible) glitches over the past several years.

    28. Re:EVERYBODY PANIC!!! by Anonymous Coward · · Score: 0

      > I'm sorry, I'll need a car analogy on that one

      Your wife won't leave the car in neutral anymore, so you must get used to checking the lever or you'll make a new door in the garage wall anytime soon!

    29. Re:EVERYBODY PANIC!!! by Toonol · · Score: 1

      You pull up and stop at a red light. You notice that you are about two feet over the line, so you put your car in reverse and back up slightly. You wait. Eventually, the light turns green, so you accelerate, and crash into the car behind you.

      The kernel (the driver) is supposed to verify that the direction flag (gearshift) is in the right position during every acceleration. But the driver is lazy, and wasn't checking. He assumed he didn't need to check, since the gearshift is generally always in "drive" at a redlight.

      I think the GCC compiler is represented by "stopping at an intersection" in that instance.

    30. Re:EVERYBODY PANIC!!! by Anonymous Coward · · Score: 0

      there is very little code that it is worth to handwrite in assembly (only vector code, and it may be matter of time to get good enough vectorized code).

      Don't forget atomic primitives for parallel code also. If you have a correct algorithm that only uses compare-and-swap it's quite likely to out-scale code that uses a higher-level construct such as locks to synchronize.
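
      To make the compare-and-swap point concrete, here is a minimal C11 sketch of a lock-free counter increment. The function name cas_increment is made up for illustration; only the <stdatomic.h> calls are standard. It is not taken from any project discussed in this thread.

```c
#include <stdatomic.h>

/* Lock-free increment built only on compare-and-swap.
 * Returns the value this call stored. */
static int cas_increment(atomic_int *counter)
{
    int seen = atomic_load(counter);

    /* If another thread changed the counter in the meantime, the call
     * fails and writes the current value into 'seen', so the next
     * attempt retries with fresh data. */
    while (!atomic_compare_exchange_weak(counter, &seen, seen + 1)) {
        /* retry */
    }
    return seen + 1;
}
```

      A caller would initialise the counter with atomic_init(&c, 0) and then call cas_increment(&c) from any number of threads; no lock is taken at any point.
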
    31. Re:EVERYBODY PANIC!!! by faragon · · Score: 1

      Well, yes, but for parallel code the best synchronization is the one that can be delayed or completely avoided. I would rather prefer to use an algorithm that requires 100 critical region access per second using the OS primitives rather than a 100,000 accesses using "optimized" critical region code (i.e. favor better algorithms rather than optimize a low frequency operation).

    32. Re:EVERYBODY PANIC!!! by Anonymous Coward · · Score: 0

      do you know of anything that was broken besides SBCL (which was how this was discovered in the first place).

      I searched with apt for "compiler", and one of the first ones I looked at had a gcc-4.3 bug: gcl.

      BTW I believe the intention of debian is to attack this problem from all sides. Afaict SBCL is being changed to keep the direction flag set for as short a time as possible. gcc is being changed to return to the older less likely to fail behaviour and linux is being changed to do what it should have done in the first place.

      Yes, at breakneck Debian speed. 2 weeks ago today they had patches for SBCL and Linux 2.6.24, and fixing gcc (revert) or libc6 (build with gcc-4.2) are trivial -- and yet none of these fixes has appeared in sid yet.
    33. Re:EVERYBODY PANIC!!! by Anonymous Coward · · Score: 0

      > EVERYBODY PANIC!!!
      Especially the Kernel.

    34. Re:EVERYBODY PANIC!!! by petermgreen · · Score: 1

      I searched with apt for "compiler", and one of the first ones I looked at had a gcc-4.3 bug: gcl
      While the cause of that bug hasn't been tracked down yet it doesn't look like the bug this article is about to me.

      Yes, at breakneck Debian speed. 2 weeks ago today they had patches for SBCL and Linux 2.6.24, and fixing gcc (revert) or libc6 (build with gcc-4.2) are trivial -- and yet none of these fixes has appeared in sid yet.
      Yeah, debian can be a bit slow at times, especially when it isn't an immediate problem for debian testing.

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
    35. Re:EVERYBODY PANIC!!! by 644bd346996 · · Score: 1

      Almost doubling the throughput of a memory to memory copy is not a measly performance gain. Sure, it may not be noticeable for desktop usage, but for gaming or scientific computation, that's a pretty big gain.

    36. Re:EVERYBODY PANIC!!! by ignavus · · Score: 1



      It just means that the kernel could think it was in forward gear when it was really in reverse gear. ("Somebody moved the gear lever when I wasn't looking!")

      That *could* make a difference to the outcome.

      And yeah, I would prefer a kernel that knew which gear it was in before it stepped on the gas.

      --
      I am anarch of all I survey.
    37. Re:EVERYBODY PANIC!!! by CTachyon · · Score: 1

      But what about performance for shorter strings? Which has the greater setup overhead, REP or vector? And on which CPUs?

      These are the sorts of questions that GCC developers have to code for, and they've already decided that the REP instructions are better in some situations.

      --
      Range Voting: preference intensity matters
    38. Re:EVERYBODY PANIC!!! by dido · · Score: 1

      Read my earlier post. I don't think I ever saw any compiler, that would have been smart enough to convert a typical C idiom such as for (i=0; i

      --
      Qu'on me donne six lignes écrites de la main du plus honnête homme, j'y trouverai de quoi le faire pendre.
    39. Re:EVERYBODY PANIC!!! by dido · · Score: 1

      Oops, seems there's a bug in Slashdot. There was a snip of C code that Slashdot misparsed. Here's the rest of what I said, with the code converted into prose.

      Read my earlier post. I don't think I ever saw any compiler, that would have been smart enough to convert a typical C idiom for moving memory into a rep movsd, with all of its odd requirements. I don't think Gcc was ever smart enough to do this kind of optimization (which is a space optimization at best; in practice it can actually result in code expansion because of the register save and set up overhead). There's a reason why Intel has, ever since the Pentium class of processors, been saying that these complex instructions are evil. The only way such an instruction sequence could plausibly appear in any code would be if a human put it there in assembly language somehow. I think few real-world compiler writers have ever bothered to do the work necessary to satisfy the demands of such idiosyncratic CISC instructions. It could appear in a library routine somewhere, such as an assembly optimized version of memmove or strcpy, but again, why would you still be using a library routine that uses a technique that has been suboptimal for nearly a decade?

      --
      Qu'on me donne six lignes écrites de la main du plus honnête homme, j'y trouverai de quoi le faire pendre.
    40. Re:EVERYBODY PANIC!!! by faragon · · Score: 1

      Well, you can check for string length before applying one or another snippet.

    41. Re:EVERYBODY PANIC!!! by CTachyon · · Score: 1

      Well, you can check for string length before applying one or another snippet.

      GCC already does that; that's my entire point. You can't just look at one benchmark on copying long strings and say "Well, there's obviously no situation where the REP instructions are faster", which is how I read your previous post.

      Instead, they've got compile-time logic that spots fixed sizes (e.g. memcpy(dst, src, sizeof(struct foo));) and inlines the appropriate code for the target CPU, and then they've got more flexible logic for when the length is variable that does a few quick size comparisons at runtime. (They already need to check for word alignment anyway, so it's not outrageous overhead.)
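
      A rough C sketch of the two cases described above. This is not GCC's actual expansion logic: struct foo's fields, the helper names copy_foo and copy_variable, and the SMALL_COPY_LIMIT threshold are all invented for illustration.

```c
#include <stddef.h>
#include <string.h>

struct foo { int id; double weight; char tag[8]; };

/* Case 1: the size is a compile-time constant, so the compiler is free
 * to replace the memcpy call with inline code chosen for the target CPU. */
static void copy_foo(struct foo *dst, const struct foo *src)
{
    memcpy(dst, src, sizeof(struct foo));
}

/* Hypothetical cut-off; a real compiler tunes this per CPU model. */
enum { SMALL_COPY_LIMIT = 16 };

/* Case 2: the size is only known at run time, so a quick comparison
 * picks between a simple byte loop and the general-purpose routine. */
static void *copy_variable(void *dst, const void *src, size_t n)
{
    if (n <= SMALL_COPY_LIMIT) {
        unsigned char *d = dst;
        const unsigned char *s = src;
        while (n--)
            *d++ = *s++;
        return dst;
    }
    return memcpy(dst, src, n);
}
```

      The run-time branch costs one comparison, which is the kind of overhead the parent calls "not outrageous".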

      I saw a previous Slashdotter claim that the CLD instruction can take ~50 cycles on a modern processor (presumably because the pipeline has to clear before %EFLAGS can change). If that's true, then it's very understandable that the GCC folks would be chomping at the bit to strip those CLD's out wherever they can, because then the REP loops are reasonable in many more scenarios.

      --
      Range Voting: preference intensity matters
  8. What this really exposes... by suck_burners_rice · · Score: 2, Insightful

    What this really exposes is not a bug in any kernel. Indeed, the story states that the "bug" exists in both the BSD and Linux kernels. It really exposes something fascinating about the development process: Code is written based on certain assumptions and a working theory of how the code will function once put into use, but the only way to really know how well it works is to hand it over to the ultimate judge of code correctness--the computer--by running the code. If it works, case closed. Now it's entirely possible that the kernel developers never heard of this obscure nuance of the Intel processor. Then one day, the compiler changed, and with it, the assumptions changed. Mature code that has been declared good years ago seemingly breaks. Now it's easy to blame the code, but really this is a deletion of a feature from the compiler. Nevertheless, it exposes the fact that ultimately, no matter what tools we use and no matter how well we think our code through, you can only consider the code good once it runs and appears to do what it's supposed to.

    --
    McCain/Palin '08. Now THAT's hope and change!
    1. Re:What this really exposes... by noidentity · · Score: 1

      In summary, it's a bug in the ABI documentation; apparently the direction flag must be considered undefined in this case. Fixing the documentation won't break any current code.

    2. Re:What this really exposes... by HonIsCool · · Score: 2, Funny

      Hehe, I'm going to try that approach the next time I'm assigned a bug: "No, it's not the code that's wrong, it's the specification."

      --
      "Give me six lines of C++ code written by the most competent programmer, and I will find enough in there to hang him."
    3. Re:What this really exposes... by Alex+Belits · · Score: 5, Informative

      It really exposes something fascinating about the development process: Code is written based on certain assumptions and a working theory of how the code will function once put into use, but the only way to really know how well it works is to hand it over to the ultimate judge of code correctness--the computer--by running the code. If it works, case closed.

      Please don't ever again offer your great insight into software development process. If everything was stuffed into the kernel (or other software projects) once it compiles and runs, we would drown in unstable, crashing, insecure, impossible to debug code. Without any doubt, there are plenty of geniuses (some of them in Northwestern US) who develop in this manner, but I can assure you, neither Linux kernel, nor GCC, glibc or other major open source projects use this procedure. If you want to discuss this method further I recommend you to send your opinion to a friendly individual at djb@cr.yp.to.

      Before anything is released, people have to LOOK AT THE CODE and make sure that the source gives them a reason to think, it will run correctly when used with interfaces that it is supposed to utilize or provide. There are plenty of things in the kernel that would require massive amount of testing to be verified with any certainty, so people write usable code not because they are testing it until their hardware breaks but because they know what they are doing.

      Now it's entirely possible that the kernel developers never heard of this obscure nuance of the Intel processor. Then one day, the compiler changed, and with it, the assumptions changed. Mature code that has been declared good years ago seemingly breaks. Now it's easy to blame the code, but really this is a deletion of a feature from the compiler. Nevertheless, it exposes the fact that ultimately, no matter what tools we use and no matter how well we think our code through, you can only consider the code good once it runs and appears to do what it's supposed to.

      What the hell are you talking about?

      Code generated by a C compiler remains consistent regardless of the version, unless you mix binaries built with different versions of GCC. When code that kernel uses to pass control to applications' signal handlers does not keep the direction flag as it is supposed to according to ABI, then userspace code -- ANY CODE THAT CONTAINS SIGNAL HANDLERS -- compiled by a new compiler will not work correctly. In other words, kernel provides an interface that is incompatible with binaries made by a new GCC, and since the standard is on the side of the new GCC behavior, it's kernel that has to be changed. That's all. Nothing else is involved -- some code compiled with a new compiler will not work on an old kernel. Code compiled with an old compiler remains usable with a new kernel, no sources except for five lines in the kernel have to be changed. It's not even something that a C programmer has any control over unless he writes pieces of his program in assembly -- and then he should know. I don't even believe, any for a C programmer who knows how to write a signal handler it's possible that he "never heard of this obscure nuance of the Intel processor" -- both are very rarely used directly -- however this is completely irrelevant, the only sources that have to be changed are five lines in the kernel, not in signal handlers.

      The only real problem this "exposes" is that for some reason everyone who used x86 SysV ABI for anything that matters (Linux and BSD), decided to change the interface to exclude the requirement to clear the direction flag, even though that "official" standard said otherwise -- however it was known from the very beginning, and this is why older C compilers took it into account in the first place. It's not a bug or someone's lack of knowledge, it's a violation of a standard, and GCC developers decided to get things back to the letter of a standard because the compiler's optimization benefits from it.
      --
      Contrary to the popular belief, there indeed is no God.
    4. Re:What this really exposes... by mav[LAG] · · Score: 1

      Now it's entirely possible that the kernel developers never heard of this obscure nuance of the Intel processor.

      Far from being an obscure nuance, CLD and STD are just ordinary instructions which tell the processor which direction the next SCAS, LODS or STOS instruction must go. They are explained very early on in most assembly tutorials that I've come across.

      A kernel developer who's never heard of the processor's direction flag has no business writing kernel code.

      --
      --- Hot Shot City is particularly good.
    5. Re:What this really exposes... by Anonymous Coward · · Score: 1, Funny

      What the hell are you talking about?

      You two are on different wavelengths. Consider that before you open your little can of whoop-ass next time. BTW, your attitude is a real turn-off which makes hearing your point much more difficult.

    6. Re:What this really exposes... by Alex+Belits · · Score: 1

      You two are on different wavelengths.

      Did FCC allocate a special wavelength for things that are utterly and hopelessly wrong?

      Consider that before you open your little can of whoop-ass next time. BTW, your attitude is a real turn-off which makes hearing your point much more difficult.

      Oh noes, my popularity among Anonymous Cowards is getting dangerously low! What should I do? What should I do?
      --
      Contrary to the popular belief, there indeed is no God.
    7. Re:What this really exposes... by Anonymous Coward · · Score: 0

      Prunes. Try Prunes.

    8. Re:What this really exposes... by gnasher719 · · Score: 1

      In summary, it's a bug in the ABI documentation; apparently the direction flag must be considered undefined in this case. Fixing the documentation won't break any current code.

      That is wrong. There are by definition no bugs in the ABI documentation: The ABI is what the ABI documentation says. The ABI says the direction flag is always cleared on entry of any function. Since it is the OS that calls signal handlers, it is the duty of the OS to make sure that everything is set up according to the ABI when a signal handler is called. The OS doesn't do that, so it is a bug in the OS.

      A bit off topic, that is one of the major problems with OOXML. Some people think that OOXML creates a standard that describes the file format of Microsoft Office documents. They are wrong. OOXML tries to create a standard where the standard is whatever the OOXML document says. According to the problems that people have found in the proposed standard, MS Office users can only hope that their documents are not compatible with OOXML, because if they are, then many things will stop working.
    9. Re:What this really exposes... by Anonymous Coward · · Score: 0

      I don't even believe, any for a C programmer who knows how to write a signal handler it's possible that he "never heard of this obscure nuance of the Intel processor" -- both are very rarely used directly
      Do you know what a signal handler is?
    10. Re:What this really exposes... by springbox · · Score: 1

      I agree that the original poster's comments were particularly absurd at points, but seriously, you don't need to rip into them like that to prove your point.

    11. Re:What this really exposes... by Anonymous Coward · · Score: 0

      Works as coded!

    12. Re:What this really exposes... by Alex+Belits · · Score: 1

      Yes.

      People hate to write their own signal handlers because they are called completely asynchronously to the process, leaving very little room for doing anything complex in there without creating a race condition. Usually libraries configure signal handlers for things like errors, timers or I/O, so the only places where application programmer has to make his own signal handler are setting flags for daemons exit or reload, or custom cleanup procedures. That's about once per project, and usually done by someone who is well aware of how CPU state looks like.
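
      The "set a flag and get out" style of handler mentioned above usually looks something like the following ISO C sketch. The names exit_requested, on_sigterm and install_exit_handler are hypothetical; a production daemon would more likely use sigaction() than signal().

```c
#include <signal.h>

/* The only state the handler touches: a flag of a type that can be
 * written atomically with respect to signal delivery. */
static volatile sig_atomic_t exit_requested = 0;

/* Handler does nothing complex: no allocation, no stdio, no locks. */
static void on_sigterm(int signo)
{
    (void)signo;
    exit_requested = 1;
}

/* Returns 0 on success, -1 if the handler could not be installed. */
static int install_exit_handler(void)
{
    return signal(SIGTERM, on_sigterm) == SIG_ERR ? -1 : 0;
}
```

      The main loop then polls exit_requested and does the real cleanup outside the handler, which avoids the race conditions the comment describes.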

      --
      Contrary to the popular belief, there indeed is no God.
    13. Re:What this really exposes... by jonaskoelker · · Score: 1

      It really exposes something fascinating about the development process: Code is written based on certain assumptions and a working theory of how the code will function once put into use, but the only way to really know how well it works is to hand it over to the ultimate judge of code correctness--the computer--by running the code. If it works, case closed.

      Please don't ever again offer your great insight into software development process. If everything was stuffed into the kernel (or other software projects) once it compiles and runs, we would drown in [bad] code. [...] Before anything is released, people have to LOOK AT THE CODE and make sure that the source gives them a reason to think, it will run correctly when used with interfaces that it is supposed to utilize or provide.


      It's clear to me that your parent doesn't suggest "it compiles; ship it". Successful compilation and execution is a necessary, not sufficient, condition for code to be considered shippable (:=? correct). That is, "shipped implies verified in testing".

      Also, isn't it reasonable to assume that the part where you LOOK AT THE CODE is the part where you form a working theory of how the code will function? I think that's what P meant. Once you've looked at the code, you then proceed to testing.

      Say you want to argue that the parent's proposition is wrong. This means code is shipped without being verified in testing. Do you propose we don't test our code, or do you propose we ship code that fails our test suite? Or have I missed a third option?

      Mature code that has been declared good years ago seemingly breaks.

      Userspace code -- ANY CODE THAT CONTAINS SIGNAL HANDLERS -- compiled by a new compiler will not work correctly.

      It seems to me that you agree with your parent. Code breaks when built with the new compiler and ran against a faulty kernel.
    14. Re:What this really exposes... by Alex+Belits · · Score: 1

      It's clear to me that your parent doesn't suggest "it compiles; ship it". Successful compilation and execution is a necessary, not sufficient, condition for code to be considered shippable (:=? correct). That is, "shipped implies verified in testing".

      If the test is "ultimate", it means it must be sufficient.

      Also, isn't it reasonable to assume that the part where you LOOK AT THE CODE is the part where you form a working theory of how the code will function? I think that's what P meant. Once you've looked at the code, you then proceed to testing.

      Both looking and testing are done AFTER the code is written, so at best they can confirm that someone indeed wrote something usable. To make sure that programmers follow a standard they have to actually understand it, and write in a manner compatible with it, not rely on testing catching their mistakes and misunderstandings.

      Say you want to argue that the parent's proposition is wrong. This means code is shipped without being verified in testing. Do you propose we don't test our code, or do you propose we ship code that fails our test suite? Or have I missed a third option?

      Most interfaces are too complex to actually show all incompatible code in tests, and security problems pretty much defy any usable testing procedure. Tests don't really verify anything, they at best provide a really, really coarse net to fish out most egregious mistakes, so the most important part of development is writing code correctly, not testing it.

      It seems to me that you agree with your parent. Code breaks when built with the new compiler and ran against a faulty kernel.

      No. There are no "bugs", and nothing is "faulty". All code acts exactly in the same manner as what accepted version of a standard prescribes, the only difference is that accepted version of a standard used by BSD and Linux was not the same as SysV ABI specification. Neither Linux nor BSD are System V, and they don't run System V binaries, so there is nothing unusual in the difference between their standards. If Linux and BSD were expected to follow SysV ABI, GCC would not clear the direction flag in the first place.
      --
      Contrary to the popular belief, there indeed is no God.
    15. Re:What this really exposes... by ignavus · · Score: 1

      "Now it's entirely possible that the kernel developers never heard of this obscure nuance of the Intel processor."

      I am a really amateur assembly language programmer. I know that the setting of the direction flag changes the direction of the string operation. That is what the flag is there for.

      I really, really, REALLY would hope that the kernel hackers knew what I know.

      And more. Lots more.

      --
      I am anarch of all I survey.
  9. Telling your age by symbolset · · Score: 1, Funny

    1991 was a long time ago. Linux is old.

    --
    Help stamp out iliturcy.
    1. Re:Telling your age by Anonymous Coward · · Score: 0

      If you think 1991 was a long time ago, what does that say about your age? :)

    2. Re:Telling your age by Anonymous Coward · · Score: 0

      17 years is a long time in high-tech years regardless of your age. If you think computers or computing haven't changed much in that time, you might want to think again.

  10. [LWN subscriber-only content] by Chris+Pimlott · · Score: 4, Insightful

    This article is not yet public for non-subscribers. The link given is supposed to be for a subscriber to forward to a friend; putting it up on Slashdot goes against the intended spirit and does not help support Linux Weekly News, which deserves the community's support.

    1. Re:[LWN subscriber-only content] by Anonymous Coward · · Score: 1, Interesting

      Whatever happened to "information wants to be free"?

    2. Re:[LWN subscriber-only content] by totally+bogus+dude · · Score: 2, Insightful

      Alternatively it's a good way to get additional exposure for LWN, as clearly this article is of some value. Maybe 0.0001% of slashdot readers will subscribe because of this.

      Besides, we're all friends here, aren't we?

    3. Re:[LWN subscriber-only content] by rsidd · · Score: 0, Flamebait

      Indeed, Slashdot is becoming a disgrace. They could have waited another day: the article becomes freely available on March 20.

    4. Re:[LWN subscriber-only content] by gambolt · · Score: 5, Funny

      Information wants to be free. Bandwidth wants to cost money.

    5. Re:[LWN subscriber-only content] by martin-boundary · · Score: 2, Funny

      They could have waited another day: the article becomes freely available on March 20.
      Bloody northern hemisphere drongos! Some of us have the shrimps on the barbie a day earlier than the rest of youse insensitive clods!
    6. Re:[LWN subscriber-only content] by Cro+Magnon · · Score: 2, Funny

      Well, they could always link again in the dupe.

      --
      Slow down, cowboy! It has been 4 hours since you last posted. You must wait another few hours.
    7. Re:[LWN subscriber-only content] by Corbet · · Score: 3, Informative

      FWIW, I originally posted the subscriber link in question to reddit yesterday. I'm surprised to see it show up here, but I also don't mind that it has happened. I'd just as soon not see all LWN content on Slashdot as subscriber links (Slashdot readers probably agree), but this one has brought some attention and, I think, some subscribers. And that's where LWN content comes from in the first place.

      --
      Jonathan Corbet, LWN.net
    8. Re:[LWN subscriber-only content] by Hatta · · Score: 1

      If LWN didn't want the world to see it, they could have implemented access controls. How is this any different than the MobiTV silliness from a couple weeks ago?

      --
      Give me Classic Slashdot or give me death!
  11. Re:Linux is full of critical bugs by Psychotria · · Score: 1

    With all due respect, an application that uses strcpy will not necessarily bring a system down (nor introduce buffer overflows, or whatever). strcpy used properly is quite safe. Sure, if you strcpy to some unknown memory address of some unknown size then this will cause problems. That is not a strcpy fault, but a programmer fault. strncpy is not inherently "better". To say that "Any application that performs a simple strcpy brings linux down" is FUD.

  12. Re:Linux is full of critical bugs by chromatic · · Score: 1

    Microkernel? I think so...

    Microkernels have to follow processor ABIs too.

  13. Re:EVERYBODY PANIC!!! (Car Version) by Anonymous Coward · · Score: 1, Funny

    It's like you got a bunch of cars at a stoplight and you want to walk by each to panhandle for money but instead of starting at the first car in line and the walking down to the back, you start at the first then head out into cross traffic and get run over and something crashes.

  14. mod parent up by sydneyfong · · Score: 1

    I also wrongly assumed GP's view until I actually RTFM-ed....

    --
    Don't quote me on this.
  15. Translation? by Raul654 · · Score: 0, Redundant

    Can someone please explain that in terms that non-LKML subscribers can understand?

    --


    To make laws that man cannot, and will not obey, serves to bring all law into contempt.
    --E.C. Stanton
    1. Re:Translation? by rdebath · · Score: 1
      Probable local root exploit (Normal local user gets root access) with ability to install rootkit.
      Likely to also give 'real root' access on 'vserver' machines that have fake root accounts.
      Unlikely to directly give a remote exploit, but would likely mean that any remote exploit becomes a remote root exploit.

      However, as at present no exploit is known and it's 'only' a local exploit the Microsoft evaluation of this would be patch in the next service pack.

  16. Gnearly Perfect by rhinokitty · · Score: 1

    See, I told you we shoulda' used the Hurd!

    1. Re:Gnearly Perfect by Vlad_the_Inhaler · · Score: 1

      Taking that joke seriously, this GCC level totally breaks Hurd.
      It affects (breaks) some applications running under Linux or the BSDs but it kills Hurd directly.

      --
      Mielipiteet omiani - Opinions personal, facts suspect.
  17. History repeating by Brett+Johnson · · Score: 2, Informative

    I seem to recall the MS-DOS 2.x suffered this same problem with either the Int 21 or Int 13 interfaces. (Hey it was 20 years ago, I don't remember the details.) If you made certain BDOS calls with the direction flag set, the message "A evird rorre etirw daeR" ("Read write error drive A" backwards) would be displayed on the console. It wasn't fixed for years. I remember we rigorously enforced the "Clear the direction flag before calling into MS-DOS" rule.

  18. Feel the power (of Google) by Mathinker · · Score: 1

    http://www.urbandictionary.com/define.php?term=metric+fuckton

    > ... 4chan ...

    Now I know you're trolling --- who could be familiar with 4chan and not "fuckton"?

    1. Re:Feel the power (of Google) by Anonymous Coward · · Score: 0

      who could be familiar with 4chan and not "fuckton"?

      Cancer etc.

  19. One more thing. by EkriirkE · · Score: 1

    This is assuming the flag is unmodified from the kernel call, saying the string function is called or entered from the kernel. But if the string functions get called mid-code and the flag is changed by some other function, say a memmove that has an overlapping source and destination, the direction flag is set (STD) and the memory copied backwards end-to-start to prevent the beginning being copied over and over by the overlap.
    Does GCC's memmove clear the flag (CLD)?
    What if someone writes some custom inline assembly with a STD and no CLD (yes, this does violate asm practice - flipping a flag and not resetting it when done) then a string function sometime after that during the same procedure? GCC will fail.
    GCC should not rely on the kernel to have the flags in a particular state upon entry, as the functions will not always be called immediately.
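
    For readers who have not seen why an overlapping move has to run backwards, here is a portable C sketch. It models in plain loops what an STD followed by REP MOVS would do; move_bytes is a hypothetical name and this is not the code of any real memmove.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy n bytes from src to dst, correct even when the ranges overlap. */
static void *move_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    uintptr_t da = (uintptr_t)dst;
    uintptr_t sa = (uintptr_t)src;

    if (da > sa && da - sa < n) {
        /* dst starts inside the source range: a front-to-back copy
         * would overwrite source bytes before reading them, so copy
         * from the last byte down to the first. */
        while (n--)
            d[n] = s[n];
    } else {
        /* No harmful overlap: ordinary front-to-back copy. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    }
    return dst;
}
```

    Starting from "abcdefgh", moving 6 bytes from offset 0 to offset 2 yields "ababcdef"; a naive forward loop would instead smear "ab" across the buffer.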

    --
    from 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
    to 45 2F 6E 40 3C DF 10 71 4E 41 DF AA 25 7D 31 3F
    1. Re:One more thing. by Alex+Belits · · Score: 1

      This is only relevant if function is a signal handler, so kernel may call it with a set direction flag instead of cleared one.

      If it's in a regular program, all flags remain consistent regardless of what kernel does, because flags are initialized when the process starts, saved when the process sleeps, and restored when the process is awakened from sleep, so continuity is preserved. The whole problem with signal handlers is that they may be called while the process had the flag set, so they inherit the flag, and the next string function called from them is executed backwards, wreaking havoc on process' stack or heap. This is what the kernel patch fixes -- flag is now cleared before calling a signal handler.
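
      The fix described above amounts to masking one bit when the kernel builds the register state for the handler. The following user-space C model only illustrates the idea; struct fake_regs, EFLAGS_DF_MASK and prepare_handler_flags are invented names, not the kernel's actual code. The one hard fact used is that the direction flag is bit 10 of EFLAGS.

```c
#include <stdint.h>

/* Direction flag: bit 10 of the x86 EFLAGS register. */
#define EFLAGS_DF_MASK 0x400u

/* Hypothetical stand-in for the saved user-mode register frame. */
struct fake_regs {
    uint32_t eflags;
};

/* Build the flags the signal handler will start with: inherit the
 * interrupted code's flags, but force DF clear as the ABI requires
 * on function entry. The interrupted frame itself is left untouched
 * so the original flags come back when the handler returns. */
static void prepare_handler_flags(struct fake_regs *handler,
                                  const struct fake_regs *interrupted)
{
    handler->eflags = interrupted->eflags & ~EFLAGS_DF_MASK;
}
```

      With this in place, code that was interrupted between an STD and its string operation still resumes with DF set, while the handler always starts with DF clear.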

      --
      Contrary to the popular belief, there indeed is no God.
    2. Re:One more thing. by Anonymous Coward · · Score: 0

      What if someone writes some custom inline assembly with a STD and no CLD (yes, this does violate asm practice - flipping a flag and not resetting it when done) then a string function sometime after that during the same procedure? GCC will fail.

      No, it won't. GCC's inline assembler scanner is pretty damn smart. Read the chapter on argument substitution. The flags are an "argument" even though they are not directly addressable. If you clear the direction bit, GCC definitely knows about it.

  20. most appropriate by Anonymous Coward · · Score: 0

    Maybe it's time to break that. -- Larry Wall in 199710311718.JAA19082@wall.org

    An appropriate quote for the bottom of the page ;)

  21. What about other compilers? by oglueck · · Score: 1

    That means that all other compilers behave like the old GCCs in this case. Otherwise they would have exposed this bug already. So GCCs new behaviour could be seen as either non-standard or "innovative".

    1. Re:What about other compilers? by Vlad_the_Inhaler · · Score: 1

      Nope, other compilers always (?) did it this way - at least according to TFA.
      (There is a list of 'other compilers' in there somewhere)

      --
      Mielipiteet omiani - Opinions personal, facts suspect.
  22. Yes. And? by jimicus · · Score: 1

    Debian, RedHat et al aren't going to release new packages compiled with GCC 4.3.0 for every damn binary. Instead, they'll hold back on providing an update to GCC and they won't compile any updated packages with the updated GCC until the next major release.

    Of course, that's not very helpful if you depend on closed-source software and the vendor won't tell you what compiler they use. Neither is it particularly helpful if you run Gentoo (which sooner or later will expect you to upgrade compiler) or if you're in the habit of compiling packages from scratch using a compiler other than the one that shipped with your distro. But for most of us in the real world, that's not really a huge deal.

  23. That's no GNU'd! by lumbercartel.ca · · Score: 2, Insightful

    Most experienced assembler programmers know better than to assume the direction flag will be set or cleared unless this is specifically documented.

    1. Re:That's no GNU'd! by RupW · · Score: 1

      Most experienced assembler programmers know better than to assume the direction flag will be set or cleared unless this is specifically documented.

      That's the whole point - it *is* explicitly documented but the old GCC used to explicitly clear it anyway. The new GCC assumes everyone's following the documentation and doesn't bother with the extra clear.
    2. Re:That's no GNU'd! by LanceUppercut · · Score: 1

      It has very little to do with "assuming" anything. It has everything to do with the nature of 'signal handlers'. As long as the kernel bug is there, "non-assuming" will not help you at all. This can only be fixed by fixing the kernel.

    3. Re:That's no GNU'd! by marcosdumay · · Score: 1

      Yes, but now an experienced compiler is assuming its state is the documented one, instead of losing time checking.

  24. .0 versions suck, anyway by Anonymous Coward · · Score: 0

    I still use GCC 2.95.3 to compile my kernel, but the developers don't allow it to build 2.6 versions. They're too dumb to fix it and I have to use 2.4.

    Anyway, why would you use a .0 version? It's like running Windows Vista RC0.

  25. Brings linux down - I don't think so by Chrisq · · Score: 1

    OK, I challenge you to find a user-space program that brings linux down using strcpy (as opposed to just crashing that particular program). If you are talking about kernel modules then the same is true of any OS.

    1. Re:Brings linux down - I don't think so by amorsen · · Score: 1

      OK, I challenge you to find a user-space program that brings linux down using strcpy (as opposed to just crashing that particular program).

      This is extremely easy if the program runs as root.
      --
      Finally! A year of moderation! Ready for 2019?
    2. Re:Brings linux down - I don't think so by Chrisq · · Score: 1

      It is extremely easy for a program run as root/admin to bring down any system without having bugs. If not how can you upgrade system files or even shut down.

    3. Re:Brings linux down - I don't think so by amorsen · · Score: 1

      It is extremely easy for a program run as root/admin to bring down any system without having bugs. If not how can you upgrade system files or even shut down.

      Not all programs with "root" privileges need the ability to shut down the system. There is no fundamental need to have an all-powerful root account, or indeed a root account at all. It's just hard to remove the root account and still feel Unix-like to administrators -- e.g. SELinux tries to make a compromise that doesn't trip up too many people, but basically everyone turns it off anyway.

      Anyway, I was just trying to fight the myth that userspace can't kill a Linux kernel.
      --
      Finally! A year of moderation! Ready for 2019?
  26. Re:Yes. And? by Vlad_the_Inhaler · · Score: 1

    They don't need to. All they need to do is release an updated kernel.

    --
    Mielipiteet omiani - Opinions personal, facts suspect.
  27. I fixed this bug in 1989 too by flyingfsck · · Score: 2, Interesting

    I fixed this bug in 1989 in an Intel C compiler. That was some years before the GCC project was started. Some people never learn...

    --
    Excuse me, but please get off my Pennisetum Clandestinum, eh!
    1. Re:I fixed this bug in 1989 too by X3J11 · · Score: 3, Funny

      I fixed this bug in 1989 in an Intel C compiler. That was some years before the GCC project was started. Some people never learn...

      From http://en.wikipedia.org/wiki/GNU_Compiler_Collection:

      Originally named the GNU C Compiler, because it only handled the C programming language, GCC 1.0 was released in 1987, and the compiler was extended to compile C++ in December of that year.

      Perhaps the error in your assertion is a side effect of an uncleared direction flag.

    2. Re:I fixed this bug in 1989 too by LanceUppercut · · Score: 1

      .. but never got through "reading 101" apparently. Once again, Earth to Mars: it is a bug in kernel implementations, not in GCC compiler.

  28. Random DF value by flyingfsck · · Score: 1

    Yup, and another problem is that there are instructions that leave the direction flag undefined, a random value of either 0 or 1. Therefore one has to always explicitly set the direction flag before using it.

    --
    Excuse me, but please get off my Pennisetum Clandestinum, eh!
    1. Re:Random DF value by edwdig · · Score: 1

      And what would these instructions be? It doesn't make any sense for there to be instructions leaving it undefined.

      I've never seen any reference in the Intel manuals to anything that can change it other than STD (Set Direction) and CLD (Clear Direction). (And for the pedantic ones out there, popf, the context switch related instructions, etc... but it's still defined behavior.)

    2. Re:Random DF value by EkriirkE · · Score: 1

      Damn, I was going to say POPF with unknown stack
      STD
      CLD
      POPF
      IRET/IRETD

      --
      from 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
      to 45 2F 6E 40 3C DF 10 71 4E 41 DF AA 25 7D 31 3F
  29. Assembler code by hemanhedman · · Score: 2, Interesting

    Does this mean that you could hand-craft some assembler code that exploits virtually all Linux and BSD-kernels out there?

    1. Re:Assembler code by Alex+Belits · · Score: 1

      No. But you can make code that only has security holes if you compile it with a new compiler (which is absolutely useless).

      --
      Contrary to the popular belief, there indeed is no God.
    2. Re:Assembler code by cyba · · Score: 1

      No

    3. Re:Assembler code by LanceUppercut · · Score: 1

      You mean one piece of code that exploits all kernels? No, it doesn't mean that.

      However, it means that it might be perfectly possible to exploit each and every kernel that suffers from this issue by writing specific code for it.

  30. Re:Yes. And? by WK2 · · Score: 1

    True. Major distros will hold back on upgrading to gcc 4.3.0. Unless they already upgraded. For the most part, this bug will only cause headaches (and possibly suicides) to people trying to diagnose issues in their code, either because they didn't get the memo, and are using gcc 4.3.0, or because they are helping someone with run-time issues, who are using gcc 4.3.0. If I remember correctly, we had similar problems with gcc 4.0.x. I don't recall any reported deaths.

    --
    Write your own Choose Your Own Adventure. http://www.freegameengines.org/gamebook-engine/
  31. I don't get it by OneSmartFellow · · Score: 1

    Why would GCC make an assumption about a register, shouldn't it (GCC) set the register to a known value if it needs it ?

    1. Re:I don't get it by gnasher719 · · Score: 1

      Why would GCC make an assumption about a register, shouldn't it (GCC) set the register to a known value if it needs it ?

      Apparently the ABI says: Yes, you can make an assumption about the setting of the direction flag.

      There are also cases where a compiler can make assumptions about the setting of certain floating point states, and obviously the compiler can make an assumption that there is a valid stack pointer and where the return address of the stack pointer can be found. The compiler can also make assumptions that after a call to another function, certain registers will be unchanged.
    2. Re:I don't get it by LanceUppercut · · Score: 1

      In this particular case it is the kernel that makes assumptions, not GCC.

  32. TGIOS (thank God is Open Source) by cabazorro · · Score: 1

    OMG OMG OMG! My kernel is vulnerable!!

    - regs->flags &= ~(X86_EFLAGS_TF);
    + regs->flags &= ~(X86_EFLAGS_TF | X86_EFLAGS_DF);

    make

    done.
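
For anyone wondering what that one-line change actually does, here is a rough C sketch of the masking step. The bit values (0x100 for TF, 0x400 for DF) are the architectural EFLAGS bit positions; the function name `flags_for_handler` is made up for this example and is not taken from the kernel source.

```c
#include <stdint.h>

/* EFLAGS bit positions as defined by the x86 architecture. */
#define X86_EFLAGS_TF 0x00000100u /* trap flag, bit 8 */
#define X86_EFLAGS_DF 0x00000400u /* direction flag, bit 10 */

/* Hypothetical helper (not kernel code): compute the flags word a
 * signal handler should start with, given the interrupted flags.
 * Both TF and DF are cleared; every other bit is left untouched. */
uint32_t flags_for_handler(uint32_t user_flags)
{
    return user_flags & ~(X86_EFLAGS_TF | X86_EFLAGS_DF);
}
```

With DF masked out, a handler compiled by a compiler that trusts the ABI would run its string instructions in the forward direction no matter what the interrupted code was doing.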

    --
    - these are not the droids you are looking for -
    1. Re:TGIOS (thank God is Open Source) by LanceUppercut · · Score: 1

      Perfect, Einstein. Now why don't you go and get a clue about what a signal handler is?

    2. Re:TGIOS (thank God is Open Source) by cabazorro · · Score: 1

      er.. a program that executes when a signal (e.g. SIGINT) is caught. So what's your excuse, Poindexter?

      --
      - these are not the droids you are looking for -
  33. Whose ABI is it? by ebcdic · · Score: 1

    The document referred to is (old) SCO's ABI for System V. If Linux and BSD have not been following this ABI in some respects, perhaps the solution is to have a Linux or BSD ABI that reflects real practice, rather than having a gcc that causes problems because it adheres to a System V ABI that is not being followed.

  34. Re:Yes. And? by petermgreen · · Score: 1

    Well debian already packaged the latest glibc in sid using gcc-4.3. That is how this issue was discovered in the first place.

    --
    note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
  35. Re:Yes. And? by Anonymous Coward · · Score: 0

    Neither is it particularly helpful if you run Gentoo (which sooner or later will expect you to upgrade compiler)
    You really have absolutely no idea how Gentoo works, do you?
  36. It's not a bug... by Anonymous Coward · · Score: 0

    It's a feature! Now we know who's been lazy all this while.

  37. Re:Yes. And? by jimicus · · Score: 1

    You really have absolutely no idea how Gentoo works, do you?

    Yes, actually. I've run a whole bunch of servers running it.

    I found the amount of handholding required in order to turn it into a serious system for server use relative to Debian (such as setting up your own private portage repository, custom holding back of packages, updates which are known in forums to break functionality but don't have the good grace to warn you in the ebuild first, updates which completely restructure a package into different component parts, updates which haven't been tested for backward compatibility and so break things subtly, meaning that what should be a simple emerge -U <package name> winds up becoming a complex mix of emerge --sync, editing /etc/portage/package.use and package.keywords, editing USE flags, finally emerging the updated package, the fact that you can't easily avoid this unless you're prepared to forsake security updates) really didn't gel with my idea of running a solid system.

    Granted, a major update to GCC will almost certainly wind up in a slot of its own. But sooner or later the version that you're using now will be obsoleted and removed from portage altogether, at which point you either have to put the ebuild in your own private portage repository lest future emerge --update's break things or recompile anything which is at risk.

    Now, most of these issues can be minimised by following practices that any good sysadmin should be anyway - for instance, setting up a test environment and making changes there first before putting them live. All this does, however, is move the risk from the live system to the test environment. It doesn't eliminate any of the work.

  38. Re:Yes. And? by jimicus · · Score: 1

    Maybe, but sid's the unstable repository and is intended for exactly this kind of thing. Even the Debian maintainers strongly recommend against using Sid on a production system because things are far more likely to break and they may stay broken for some time.

  39. Apparently you don't get it by LanceUppercut · · Score: 1

    Apparently you still don't get it. It is not a problem with GCC. It is a bug in the kernel. GCC just helped to detect it. Now, as it's been detected, it is no longer connected to GCC in any way.

  40. Let us blame the correct entity here... by bwalzer · · Score: 1

    Intel is to blame. The original 8086/8088 instruction set was just dumb in this respect. Having a global value (the direction bit) that can determine the behaviour of a class of powerful instructions is a great way to generate all sorts of subtle intermittent bugs. I have been personally burned by this (badly) as have many others.

    There is a policy you can enforce to try to improve things. You can try to make everyone leave the direction bit in the most common state after they are done with their less common use of the string instructions. This can work if the policy is enforced by something like a compiler. It won't work if the program is for instance called by another entity outside the control of the compiler. ...such as, for instance, a kernel calling a signal handler... You end up with a state of affairs where you have to depend on having some other programmer remember to set the bit to the right state to have your string instructions work right. You can't test for this as the bit might be right almost all the time. This is simply a poor approach.

    The only fix that can work reliably for your code is to have the compiler ensure that the state of the direction bit is known before any string instructions are executed. If I am for instance using a C compiler I should not have to hear about the 8086/8088 string direction bit. ...ever...

    The kernel people should fix their failure to respect the ABI policy. The GCC people should revert to the old more deterministic handling of string instructions. The almost negligible optimization here is simply not worth generating a lot of intermittent, hard to find problems (there are likely more out there). If other compilers do not make their string functions entirely deterministic in the face of all external influences then those other compilers are doing it wrong. We can't fix the hardware architecture so this is a case where defensive programming is the best that can be done.

    Bruce

    1. Re:Let us blame the correct entity here... by LanceUppercut · · Score: 2, Interesting

      You don't get it either. In a signal-enabled environment there can't be any policy that would ensure the deterministic state of the flag, even if you set it explicitly before each flag-dependent operation. The only way to fix the problem is to make sure that the signal-routing environment meticulously stores and restores the value when handling interruptions. This was not done. This is the problem in question. It was there all along and it is not related to any compilers. The current version of GCC was simply more likely to reveal it (and it did reveal it), but the problem itself was there since the beginning of time and can lead to problems with any version of GCC, or any other compiler.

    2. Re:Let us blame the correct entity here... by bwalzer · · Score: 1

      ... In a signal-enabled environment there can't be any policy that would ensure the deterministic state of the flag, even if you set it explicitly before each flag-dependent operation. ...

      My understanding of this is that the problem occurs when the kernel fails to clear the direction flag before calling the user space handler. I have heard nothing about anything to do with the hardware interrupt that might have originally caused the signal (which seems to be related to the sort of thing you are referring to).

      Bruce

    3. Re:Let us blame the correct entity here... by LanceUppercut · · Score: 1

      This is just how it happened when the bug was caught. The whole problem is more general: the kernel fails to properly save and restore the direction flag when calling the signal handler. And once one sees the whole picture, one understands that it immediately (and most importantly) applies to asynchronous signals.

  41. Oh No by fluffykitty1234 · · Score: 3, Funny

    I just heard that this has seriously set back the release date of Duke Nukem Forever!

  42. Who uses that anyway ? by billcopc · · Score: 1

    I used to do assembler, and I can't think of any one time I actually used STD. I often issued CLD when writing interrupt handlers, because that was the safe practice, but is it really that useful to reverse string scans at the opcode level ? I can't think of that many places where it would be useful, easily replaced with a manually decremented loop that's not much slower. It always seemed like a risky thing to do in the first place, and I was never fond of issuing CLD all the time "just in case".

    My rant doesn't solve the kernel issue, we'll have to deal with legacy code forever :/

    --
    -Billco, Fnarg.com
    1. Re:Who uses that anyway ? by EkriirkE · · Score: 1

      I can't think of too much for reverse scanning other than to, say, separate a filename from the directory, it's faster to just go backwards from the known length than to go forward and keep record of each path break encountered and take the last occurrence...
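
A minimal sketch of that backwards walk in plain C, for illustration only. It uses an ordinary loop rather than STD plus a string instruction, and the name `file_name_part` is invented for this example.

```c
#include <stddef.h>

/* Hypothetical helper: given a path and its known length, walk
 * backwards from the end until a '/' is found and return a pointer
 * to the file name that follows it. If there is no separator, the
 * whole path is the file name, so the start of the path is returned. */
const char *file_name_part(const char *path, size_t len)
{
    size_t i = len;

    /* Stop as soon as the character just before position i is '/'. */
    while (i > 0 && path[i - 1] != '/')
        i--;

    return path + i;
}
```

Called as `file_name_part(p, strlen(p))`, this only touches the tail of the string, which is the saving over a forward scan that has to remember every separator it passes.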

      --
      from 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
      to 45 2F 6E 40 3C DF 10 71 4E 41 DF AA 25 7D 31 3F
    2. Re:Who uses that anyway ? by billcopc · · Score: 1

      Yep I thought the same, though I mostly just "pop(split('/',file))". Lazy, a bit slower than it could be, but filename processing is the least of my concerns when developing an app.

      --
      -Billco, Fnarg.com
  43. Units... by Anonymous Coward · · Score: 0

    Sorry, I know I need to brush up; how many libraries of congress to a fuckton on average?

    1. Re:Units... by Mr+Z · · Score: 1

      Somewhere around 1 / (thickness of a human hair).

  44. Its not a kernel bug by Frozen+Void · · Score: 1

    If it was "exposed" by changing the compiler, the bug is in the compiler.
    The kernels compiled with earlier GCC versions don't have the bug, right?
    If someone changes the rules, and expects everyone to know it beforehand, it would be a fault in that guy, not the people abiding by the old rules.

    1. Re:Its not a kernel bug by LanceUppercut · · Score: 1

      Wrong. It is like saying that if the program fails on this particular test input, then the bug must be in the input.

      The problem has absolutely nothing to do with the compiler used to compile the kernel itself. You can compile the kernel with any compiler - it will not change anything. The bug is in the kernel implementation itself. And it was there all the time. And it is still there.

      The _user_ code compiled with the new version of GCC just helped to expose it.

      The rules for the kernel did not change in any way at all. The rule has always been there: in order to process [potentially asynchronous] signals properly you have to save (and restore) the immediate state of the user execution environment. The kernel did that. They just forgot to save and restore the direction flag.
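
To make the save/restore idea concrete, here is a toy model in C. It is only a sketch under stated assumptions: `toy_context`, `toy_deliver_signal` and `toy_record_handler` are invented names, a single flags word stands in for the whole saved user state, and nothing here is real kernel code.

```c
#include <stdint.h>

#define X86_EFLAGS_DF 0x00000400u /* direction flag, bit 10 of EFLAGS */

/* Toy stand-in for the interrupted user context; not a kernel structure. */
struct toy_context {
    uint32_t flags;
};

typedef void (*toy_handler)(struct toy_context *ctx);

/* Flags value observed by the example handler below. */
uint32_t toy_seen_flags;

/* Example handler: records the flags it was entered with, then sets DF
 * to show that whatever the handler does is undone afterwards. */
void toy_record_handler(struct toy_context *ctx)
{
    toy_seen_flags = ctx->flags;
    ctx->flags |= X86_EFLAGS_DF;
}

/* Save the interrupted flags, run the handler with DF cleared, then put
 * the original flags back so the interrupted code resumes unchanged. */
void toy_deliver_signal(struct toy_context *ctx, toy_handler handler)
{
    uint32_t saved = ctx->flags;          /* save interrupted state */

    ctx->flags = saved & ~X86_EFLAGS_DF;  /* known state on handler entry */
    handler(ctx);
    ctx->flags = saved;                   /* restore on return */
}
```

The point of the model is that the handler sees a predictable DF even when the interrupted code had it set, and the interrupted code gets its own DF back no matter what the handler did.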

  45. Re:Yes. And? by petermgreen · · Score: 1

    yeah, on the other hand it is only because some people do use sid that bugs get spotted before they get a chance to make it into testing.

    --
    note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
  46. The FreeBSD guys appear to have already fixed it by drclaw007 · · Score: 1

    Did some poking around and it looks like FreeBSD have fixed it for the -CURRENT builds : http://groups.google.com/group/mailing.freebsd.current/browse_thread/thread/3df4366ff396a60b/bfc90b9b0a478628/

  47. Career Aptitude Test by Software+Geek · · Score: 2, Insightful

    Please choose the statement that best describes you:
          A) I want to develop programs that are, theoretically, infinitesimally faster, even though they crash whenever I run them in practice.
          B) I want to force those annoying kernel developer fucktards to follow the damn specification.
          C) I want my software to work reliably, even though it means sacrificing performance and putting up with fucktards.

    If you chose A, academia might be right for you.
    If you chose B, consider the public sector.
    If you chose C, you might be suitable for a career in software development.

  48. I am having a hard time understanding why? by Douglas+Goodall · · Score: 1

    I have been writing assembler code for x86 since the beginning. It has always been the coder's responsibility to ensure the direction flag is set appropriately before using a repeating instruction. My favorite was "repne scasb". In the old days we would use pushf ! cld ........ popf to set the direction flag and then put it back the way it was. This worked back in the 8086 days. When reviewing assembler code and I see a repeating instruction, I often ask where the direction flag is set. Not setting it explicitly is risky coding.

  49. What did Microsoft ever do for us? by Chrisq · · Score: 1

    Having separate privileges assigned to user accounts rather than a global root is one of the ways that windows is better than traditional unix/linux. W