Slashdot Mirror


Errata Prompts Intel To Disable TSX In Haswell, Early Broadwell CPUs

Dr. Damage writes: The TSX instructions built into Intel's Haswell CPU cores haven't become widely used by everyday software just yet, but they promise to make certain types of multithreaded applications run much faster than they can today. Some of the savviest software developers are likely building TSX-enabled software right about now. Unfortunately, that work may have to come to a halt, thanks to a bug—or "errata," as Intel prefers to call them—in Haswell's TSX implementation that can cause critical software failures. To work around the problem, Intel will disable TSX via microcode in its current CPUs — and in early Broadwell processors, as well.

131 comments

  1. LOL ... Pentium 4? by gstoddart · · Score: 0

    Chips don't add?
    Transactions don't sync?
    Don't be sad,
    don't be a dink.

    Burma Shave!!

    --
    Lost at C:>. Found at C.
  2. Not all that surprising... by K.+S.+Kyosuke · · Score: 5, Interesting

    So, basically, they've just been forced to get rid of the most complex (that's why it's not all that surprising) yet also most beneficial feature with regards to server loads? I'm sure there are some Opterons laughing right now.

    --
    Ezekiel 23:20
    1. Re:Not all that surprising... by Rockoon · · Score: 2

      What of the folks that purchased these chips for these specific instructions? Surely many optimization experts (...assembler gurus) are going to feel quite burned...

      --
      "His name was James Damore."
    2. Re:Not all that surprising... by Predius · · Score: 1

      A feature that has yet to appear in the Xeon line, and Intel claims to already have a fix to bake into the next steppings so... Opterons can go back to being scared of the future.

    3. Re:Not all that surprising... by BitZtream · · Score: 0

      Considering that even with TSX disabled, the chips will still perform above and beyond a comparable AMD CPU in almost every way, I doubt anyone other than fanboys are laughing.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    4. Re:Not all that surprising... by gstoddart · · Score: 5, Funny

      What of the folks that purchased these chips for these specific instructions?

      Same as happens to all early adopters -- the feature may or may not work, and even if it does, there's no guarantee it will be supported (or the same) in the next version.

      This is a pretty big 'errata', which is an awesome marketing speak for "really bad QA".

      Engineers Release Really Awful Tech. Awesome!

      --
      Lost at C:>. Found at C.
    5. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      Bummer one of the features I was looking forward to in mainstream laptops and desktops. Oh well maybe next cpu.

      The big one coming up is the doubling of the integer registers from 16 to 32 in Skylake. Which should be interesting in emulation scenarios and code that is kind of gnarly.

      This should be interesting too.
      http://en.wikipedia.org/wiki/Intel_MPX

    6. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      Sorry, no current AMD cpus are experiencing errata. Just you boi.

    7. Re:Not all that surprising... by K.+S.+Kyosuke · · Score: 2

      I almost became one of those people, that's why I'm mentioning it.

      --
      Ezekiel 23:20
    8. Re:Not all that surprising... by K.+S.+Kyosuke · · Score: 1

      Except that even if the future Xeons with this are going to work, on what machines are developers going to develop the software for them in advance? That's just awesome, isn't it?

      --
      Ezekiel 23:20
    9. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      They can not upgrade their drivers, and the chips will continue to work as they do now. Which may be not very well.

    10. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      See also Pentium 5 and the FDIV bug. It falls under "too bad, so sad, try your luck with the next revision".

    11. Re:Not all that surprising... by K.+S.+Kyosuke · · Score: 0

      Awesome, yet another complicated feature forced by the use of unsafe languages. Isn't that spiffy... (I'd expect errata for that as well.)

      --
      Ezekiel 23:20
    12. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      Except with the FDIV bug, there was a whole body of S/W that quit working
          (unless you were using it for govt work where close is good enough.)

    13. Re:Not all that surprising... by gman003 · · Score: 4, Informative

      I'm sure there are some Opterons laughing right now.

      Yes, but some of them take a while to get the joke because their TLB had to be disabled.

      (Certain releases of the "Barcelona" Opterons had a bug that could lock up the system. A workaround would prevent it, but had a stiff performance penalty. Later steppings had it fixed.)

    14. Re:Not all that surprising... by ShanghaiBill · · Score: 4, Informative

      See also Pentium 5 and the FDIV bug. It falls under "too bad, so sad, try your luck with the next revision".

      No. Intel offered to replace any P5 with the FDIV bug upon request. Most customers did not request a replacement, but the option was available.

    15. Re:Not all that surprising... by Impy+the+Impiuos+Imp · · Score: 1

      I heard none of the ultranerd devs had a tightly-coupled mammary.

      --
      (-1: Post disagrees with my already-settled worldview) is not a valid mod option.
    16. Re:Not all that surprising... by BitZtream · · Score: 1

      ... Yes, even in perfect operating condition, they still don't compete with the current line of Intel chips. If you want to argue on price per buzzword, AMD is fine, but they are in no way 'the fastest' x86 chips.

      And let's not pretend AMD has never had CPU bugs, even if you're too stupid to know about them.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    17. Re:Not all that surprising... by Anonymous Coward · · Score: 1

      Only because IBM et al threw up a stink about it, Intel knew about the problem for months before but tried to downplay it. Everybody makes mistakes, fine, but trying to sweep them under the carpet is bad form. At least they're being open about it up-front this time round, there's some solace in the reliability of Intel hardware.

    18. Re:Not all that surprising... by CajunArson · · Score: 2

      Uh.. given that sort of standard, no Android application has ever been developed since the x86 PCs that are used to develop 100% of Android applications lack practically all features of the ARM SoCs that run those applications (the only exceptions being the newer Baytrail Android tablets that are also x86).

      Also: There's a space of about a million miles between "TSX ALWAYS FAILS EVERY SINGLE TIME NO EXCEPTIONS AND CAN NEVER BE USED EVAR!!" with "Oh, we found through extensive testing that under certain conditions TSX can cause issues. Don't use it for your nuclear power plant control system, but it's perfectly fine for non-critical testing. Oh, and just to be safe, we've made a microcode update to disable it."

      --
      AntiFA: An abbreviation for Anti First Amendment.
    19. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      What was that, Mr. Pot?

    20. Re:Not all that surprising... by bill_mcgonigle · · Score: 1

      Intel offered to replace any P5 with the FDIV bug upon request.

      Fortunately most of the P5's were socketed with a trivial heatsink. People with i7 48xx and 49xx laptops are going to be caught up in this - those could have been a really nice portable KVM machine with TSX.

      Then again, Intel chips are so expensive they must have the cost of a possible recall built into each one.

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    21. Re:Not all that surprising... by Anonymous Coward · · Score: 2, Insightful

      I, for one, would love to know how your 'safe' language manages to avoid deadlocks, priority inversion, race conditions or guarantee lock-free processes on anything more complex than a singly linked list. Please enlighten me, I'm clearly ignorant.

    22. Re:Not all that surprising... by viperidaenz · · Score: 1

      the language being Intel 64 machine code?

    23. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      Yeah, but didn't they downplay it because they believed it wasn't that bad?

      Then IBM PR jumped on it?

      I think I remember that the common use scenario that would require precision (and really, all it was, was a precision bug) was an iterative use of the instruction. In other words, one run would have incorrect precision, but iterative usage would still converge on the same result.

      Does any one really know if this was the case? Isn't 20 year old errata fun?

    24. Re:Not all that surprising... by viperidaenz · · Score: 1

      They were going to get together and laugh about it, but turned up to the wrong address.

      According to AMD, "a very specific sequence of consecutive back-to-back pops and (near) return instructions, can create a condition where the processor incorrectly updates the stack pointer"

    25. Re:Not all that surprising... by K.+S.+Kyosuke · · Score: 1

      Except that at least the Android SDK provides you with a reasonable substitute, and given the performance ratio between a developer box and the typical target machine, simulation with dynamic translation is bearable for the purpose. I just can't see reasonably testing and tuning large TSX apps (which is to say all TSX apps you can expect) on any sort of simulation. There's a reason why they put a transactional memory controller into those chips. Now it turns out some people who bought it for that reason have been robbed. They must be very happy now.

      --
      Ezekiel 23:20
    26. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      Sorry, price per compute you're still being smoked.

    27. Re:Not all that surprising... by K.+S.+Kyosuke · · Score: 1

      You can easily avoid that issue by not generating code like that. A simple compiler update resolves the issue. The same kind of fix won't work for suddenly missing memory transactions, though.

      --
      Ezekiel 23:20
    28. Re:Not all that surprising... by CajunArson · · Score: 5, Insightful

      Nobody has been robbed.
      TSX today works exactly as well as TSX worked yesterday, and considering that Haswell has been on the market for over 1 year, I assure you that anybody who has been chomping at the bit to use TSX has been using TSX.

      If the TSX erratum were trivially easy to trigger, then this article would have been posted last spring before Haswell even launched.

      Intel has done the responsible thing by acknowledging the bug (trust me son, AMD & Nvidia often don't bother with that part of the process) and giving developers the OPTION to either use TSX as-is or disable it to ensure that it cannot cause instability no matter what weird operating conditions can occur.

      Tell ya what, why don't you take all your nerd-rage over to AMD or ARM where they won't rob you of all kinds of advanced features that they just don't bother to implement at all.

      --
      AntiFA: An abbreviation for Anti First Amendment.
    29. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      These are not the instructions you're interested in. Move along, developer. Move along.

    30. Re:Not all that surprising... by EvilJoker · · Score: 4, Informative

      I know this was a troll, but I feel compelled to reply in case someone doesn't know.

      ALL CPUs have errata. Some of it more significant than others.

      A quick Google for "AMD errata" revealed Revision Guide for AMD Family 16h Models 00h-0Fh, published June 2013, and applying to AMD's Mobile A,E, and G series, and Opteron X1100/X2100 (These are modern CPUs)

      There are 21 entries, with descriptions, system impact, and suggested workaround (if any)

      Haswell's errata has 131 entries

    31. Re:Not all that surprising... by Anonymous Coward · · Score: 4, Informative

      See also Pentium 5 and the FDIV bug. It falls under "too bad, so sad, try your luck with the next revision".

      No. Intel offered to replace any P5 with the FDIV bug upon request. Most customers did not request a replacement, but the option was available.

      Not at first they didn't.

      My friend was doing his master's on neural networks (?) at the time and some of his algorithms were giving back hinky results, especially when he compared them to some of the SPARC systems.

      He had to actually provide documentation that it affected him, and I think sign an NDA, before Intel would give him anything. He jumped through their hoops to get a replacement, and then the very next week Intel announced their carte blanche replacement program.

      It took much screaming in the industry before Intel became "generous".

    32. Re:Not all that surprising... by K.+S.+Kyosuke · · Score: 1

      Quite obviously, C and its derivatives. Given that vector/array size is effectively subsumed by type systems, it's difficult for me to see why a type-safe language and compilation environment would require this kind of feature without the machine providing full typed assembly as its interface, since you can't typecheck just *some* type properties of the program - well, in C, you have to, that's the problem.

      --
      Ezekiel 23:20
    33. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      It is already in the Xeon line. The question is if it can be fixed in time for the coming Haswell E5s.

    34. Re:Not all that surprising... by mwvdlee · · Score: 2

      CPUs with TSX were first released in June 2013. Not really "early adopter" terrain any more.

      --
      Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
    35. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      The offer to replace Pentium CPUs with the FDIV bug is still valid, and Intel actually still honors it.

    36. Re:Not all that surprising... by fnj · · Score: 1

      Sorry to say, I flat out don't believe you.

    37. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      No. Intel offered to replace any P5 with the FDIV bug upon request. Most customers did not request a replacement, but the option was available.

      WE ARE PENTIUM OF BORG
      DIVISION IS FUTILE
      YOU WILL BE APPROXIMATED.

    38. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      Yes, I agree with this. For some applications the microcode update that kills TSX is going to measurably reduce performance.

    39. Re:Not all that surprising... by ghettoimp · · Score: 2

      The FDIV bug was actually relatively limited in scope. Quoting Wikipedia, "Though rarely encountered by average users (Byte magazine estimated that 1 in 9 billion floating point divides with random parameters would produce inaccurate results),[3] both the flaw and Intel's initial handling of the matter were heavily criticized. Intel ultimately recalled the defective processors."
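
For anyone who wants to see what "inaccurate results" meant in practice: the operand pair most often quoted for the FDIV flaw is 4195835 / 3145727. Below is a minimal sketch of that check in Python. The function names are made up for illustration, and the tolerance is an arbitrary choice that only needs to sit well above ordinary rounding noise.

```python
def fdiv_residual(x: float = 4195835.0, y: float = 3145727.0) -> float:
    """Return x - (x / y) * y, which should be (near) zero on a correct FPU."""
    return x - (x / y) * y


def looks_like_fdiv_bug(tolerance: float = 1e-6) -> bool:
    # A correct divider leaves at most rounding noise here; flawed Pentiums
    # were widely reported to leave a residual of 256 for this operand pair.
    return abs(fdiv_residual()) > tolerance


if __name__ == "__main__":
    print("residual:", fdiv_residual())
    print("FDIV-style error detected:", looks_like_fdiv_bug())
```

On any current machine this should report no error. It is only a spot check of one known-bad operand pair, not a general test of the divider.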

    40. Re:Not all that surprising... by Anonymous Coward · · Score: 2, Informative

      Huh? TSX shipped with Xeon-E3 v3 CPUs. I bought one LAST YEAR so I could play around with TSX.

      Note the RTM at the end of the flags. That signals support for the new TSX instructions. RTM means "Restricted Transactional Memory", as opposed to the other half of TSX, HLE, which is a backwards compatible change in semantics.

      $ cat /proc/cpuinfo | head -n25
      processor : 0
      vendor_id : GenuineIntel
      cpu family : 6
      model : 60
      model name : Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz
      stepping : 3
      microcode : 0x10
      cpu MHz : 800.000
      cache size : 8192 KB
      physical id : 0
      siblings : 8
      core id : 0
      cpu cores : 4
      apicid : 0
      initial apicid : 0
      fpu : yes
      fpu_exception : yes
      cpuid level : 13
      wp : yes
      flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm
      bogomips : 6585.24
      clflush size : 64
      cache_alignment : 64
      address sizes : 39 bits physical, 48 bits virtual
      power management:
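
If you would rather not eyeball that flags line, here is a small sketch in Python that looks for the two TSX-related flags mentioned above, hle and rtm. The helper name tsx_flags is hypothetical, and it only reports what the kernel advertises in /proc/cpuinfo; it does not exercise the instructions themselves.

```python
def tsx_flags(cpuinfo_text: str) -> dict:
    """Report whether the 'hle' and 'rtm' flags appear in /proc/cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Split into whole words so 'rtm' is never confused with 'mtrr'.
            flags.update(value.split())
    return {"hle": "hle" in flags, "rtm": "rtm" in flags}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(tsx_flags(f.read()))
    except OSError:
        print("no readable /proc/cpuinfo on this system")
```

Fed the dump above, it would report both flags as present.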

    41. Re:Not all that surprising... by thestuckmud · · Score: 1

      Modern type safe languages have a lot going for them, but they don't solve the hard problems of concurrency. (n.b. purely functional languages allow easy parallelization of some mathematical functions, but do not solve the hard problem, either). Highly efficient threading, especially at the system level, is not made easier by type safety.

      This instruction set extension offers transactional memory access, so a thread can begin speculative execution that modifies a block of memory, and roll back on a conflict, rather than stalling on a semaphore lock.
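
The begin / speculate / roll-back-on-conflict pattern described above can be imitated in software. The sketch below is an analogy only: it uses a version counter and a short commit lock in Python, not TSX instructions, and the class and method names (VersionedCell, try_transaction, transact) are invented for illustration. Real hardware tracks conflicts per cache line and needs no version counter.

```python
import threading


class VersionedCell:
    """A value plus a version counter (software stand-in for tracked memory)."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self._commit_lock = threading.Lock()  # held only for the brief commit step

    def try_transaction(self, update):
        """Run update() speculatively; return True on commit, False on abort."""
        seen_version = self.version          # read the version before the value
        tentative = update(self.value)       # work on a private result
        with self._commit_lock:
            if self.version != seen_version:
                return False                 # conflict: discard the tentative result
            self.value = tentative
            self.version += 1
            return True

    def transact(self, update, max_retries=8):
        """Retry speculatively, then fall back to updating under the lock."""
        for _ in range(max_retries):
            if self.try_transaction(update):
                return
        with self._commit_lock:
            self.value = update(self.value)
            self.version += 1


if __name__ == "__main__":
    cell = VersionedCell(0)

    def worker():
        for _ in range(1000):
            cell.transact(lambda v: v + 1)

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(cell.value)
```

The fallback path matters: speculation is never guaranteed to succeed, so there always has to be a conventional locked route to fall back on.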

    42. Re:Not all that surprising... by Anonymous Coward · · Score: 1

      the use of unsafe languages
      Why wouldn't the supposedly 'safe' languages use these features?

      http://lmgtfy.com/?q=java+exploits
      http://lmgtfy.com/?q=python+exploits
      http://lmgtfy.com/?q=javascript+exploits

      The x86 arch is inherently unsafe. It does not have any sort of built in mechanism for bounds checking other than sprinkling if between conditions everywhere. Just like they do in 'safe' languages. It only has a rudimentary no execute bit which was just a stumbling block to the exploiters.

      This describes some of the subtle flaws in the arch we use. http://www.emulators.com/docs/nx03_10fixes.htm Specifically section 2 'Lessons Forgotten'. Basically they are putting functionality back in that we had in the 286 era and other CPUs have built in such as the 68k and powerpc.

      But your way works better I guess. Sprinkling if/between everywhere seems to be working very well. Right? /sarc

      The flaws are deeper than what language you use. Stop being such a snob.

      Like I said, these seem exciting to me. You, however, seem to want to stand still. Forgive me, I didn't realize you wanted your tech to stand still. You will have to pardon me; I found it exciting.

    43. Re:Not all that surprising... by gl4ss · · Score: 1

      this is the first I hear of the option being available.

      point being, back in the p5 days, you would hear the switch possibility pretty late.. and I'm fairly sure the local pc magazines didn't cover the replacement possibility either, not even in the articles discussing the problem and showing how to find out if you had the fault or not.

      --
      world was created 5 seconds before this post as it is.
    44. Re:Not all that surprising... by Sun · · Score: 3, Informative

      I have a friend who came to me, eyes all glowing, about this new feature his shining new CPU has. I listened in and was skeptical.

      He then tried, for over a month, to get this feature to produce better results than traditional synchronization methods. This included a lot of dead ends due to simple misunderstandings (try to debug your transaction by adding prints: no good - a system call is guaranteed to cancel the transaction).

      We had, for example, a lot of hard times getting proper benchmarks for the feature. Most actual use cases include a relatively low contention rate. Producing a benchmark that will have low contention on the one hand, but allow you to actually test how efficient a synchronized algorithm is on the other is not an easy task.

      After a lot of going back and forth, as well as some nagging to people at Intel (who, surprisingly, answered him), he came to the following conclusion (shared with others):
      Many times a traditional mutex will, actually, be faster. Other times, it might be possible to gain a few extra nanoseconds using transactions, but the speed difference is, by no means, mind blowing. Either way, the amount you pay in code complexity (i.e. bugs) and reduced abstraction hardly seems worth it.

      At least as it is implemented right now (but I, personally, fail to see how this changes in the future. Then again, I have been known to miss things in the past), the speed difference isn't going to be mind blowing.

      Shachar

    45. Re:Not all that surprising... by Shinobi · · Score: 1

      Based on my experience, having learned from the FDIV bug, Intel acknowledges errors much more readily than AMD does. There are still some issues where AMD engineers are stonewalling us in regards to cache coherency in NUMA mode, causing major stalls that force us to reset state. (And these are issues that Cray/Silicon Graphics solved in the 90's already...)

    46. Re:Not all that surprising... by crbowman · · Score: 1

      Suppose I bought chips specifically for this feature and now you've disabled that feature in firmware. Can you say class action lawsuit?

    47. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      It may become interesting when you have a compiler for a language that will produce the correct code every time. That will remove the code complexity issue.

      For example, PyPy is a Python just-in-time compiler; they are working on emulated transactional memory built into the language so that basic types become safe to use by multiple threads. Maybe they can use TSX to do it for real.

      This is also exactly the kind of load where transactional memory shines, since all mutable objects need to be protected but there is rarely any contention, or even sharing.

      It would be even better if the compiler could figure out which objects are shared and which are not shared; I am sure they are working on that too.

    48. Re:Not all that surprising... by viperidaenz · · Score: 1

      There's all these issues too.. http://support.amd.com/TechDoc...

      And these ones http://support.amd.com/TechDoc...

      And these http://support.amd.com/TechDoc...

      And probably many more, but these are just the first 3 Google hits.

      All chip manufacturers have problems with their chips; Opterons are no exception.

    49. Re:Not all that surprising... by rrohbeck · · Score: 4, Informative

      Singular: Erratum
      Plural: Errata

    50. Re:Not all that surprising... by viperidaenz · · Score: 1

      You've got your chicken and egg around the wrong way.
      CPU's don't have separate levels of memory that require synchronisation between threads because of C.
      Also, TSX has nothing to do with types.

      Any "safe" language that supports multiple threads requires synchronisation even more so than a low level language like C.

    51. Re:Not all that surprising... by TheRaven64 · · Score: 1

      The big one coming up is the doubling of the integer registers from 16 to 32 in Skylake. Which should be interesting in emulation scenarios and code that is kind of gnarly.

      The only source I can find for that is Wikipedia, with the edit made by an anonymous user. Do you have anything a little bit more authoritative than that? 16 seems to be close to the sweet spot for integer registers, with enough that modern register allocation algorithms can do a good job, but not so many that context switches are overly expensive.

      --
      I am TheRaven on Soylent News
    52. Re:Not all that surprising... by TheRaven64 · · Score: 1

      MPX doesn't help here either. It provides bounds checking by storing the bounds in a look-aside table that must be atomically updated if you have pointers that are visible to multiple threads, which means that every pointer store to a global must be wrapped in a transactional region (ooops, TSX doesn't work anymore, hopefully they'll have fixed it by the time MPX ships). It looks like the overhead of actually using MPX will be more - in terms of both speed and memory usage - than pure software approaches.

      --
      I am TheRaven on Soylent News
    53. Re:Not all that surprising... by TheRaven64 · · Score: 1

      Possibly, although the described sequence sounds exactly like what happens in a ROP attack. In this case, giving the attacker the opportunity to incorrectly update the stack pointer may make the attack easier.

      --
      I am TheRaven on Soylent News
    54. Re:Not all that surprising... by TheRaven64 · · Score: 3, Informative

      It depends a lot on the data structures. There were a number of papers using TSX at EuroSys this year. The main conclusion was that TSX lets you get similar performance from simple approaches as you can get already from complex approaches. For example, you can protect a long linked list in a single lock and use HLE to get a big speedup with lots of concurrent insertions and accesses, but you can achieve similar performance with a fine-grained locking scheme. There was a nice paper about Cuckoo hashing where they initially found that TSX gave them a performance win, but then were able to get a similar speedup without it.

      The big win with TSX is that it's pretty easy to reason about coarse-grained locking and much harder to reason about fine-grained locking. If you can make coarse-grained locking almost as fast as fine-grained, then that's a huge saving on testing and debugging time.
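
To make the coarse-versus-fine trade-off concrete, here is a rough sketch in Python of the same counter map written both ways. Both classes and all their names are hypothetical, and Python's global interpreter lock means this shows the difference in reasoning effort, not in speed. The coarse version is the kind of code lock elision is meant to speed up; the striped version is what you write by hand to get parallelism without it.

```python
import threading


class CoarseCounterMap:
    """One lock guards everything: easy to reason about, serializes all writers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def add(self, key, amount=1):
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) + amount

    def total(self):
        with self._lock:
            return sum(self._counts.values())


class StripedCounterMap:
    """One lock per stripe: more parallelism, but multi-key operations must
    now decide which locks to take and in what order."""

    def __init__(self, stripes=16):
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._shards = [{} for _ in range(stripes)]

    def _index(self, key):
        return hash(key) % len(self._locks)

    def add(self, key, amount=1):
        i = self._index(key)
        with self._locks[i]:
            shard = self._shards[i]
            shard[key] = shard.get(key, 0) + amount

    def total(self):
        # Take every stripe lock in a fixed order so the snapshot is
        # consistent and two callers cannot deadlock each other.
        for lock in self._locks:
            lock.acquire()
        try:
            return sum(sum(shard.values()) for shard in self._shards)
        finally:
            for lock in self._locks:
                lock.release()
```

Even in this toy, total() in the striped version already needs a lock-ordering rule that the coarse version never has to think about.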

      --
      I am TheRaven on Soylent News
    55. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      They are actually not amazingly beneficial to server loads, particularly while nobody has taken much advantage of them.

      But either way, they're still far ahead of Opterons in terms of performance, and of course Opterons don't have transactional memory capability. So not too many of those will be laughing.

      IBM's BlueGene, mainframe, and POWER8 CPUs, on the other hand, all support transactional memory (and POWER8 at least has far higher throughput on server loads than Intel's x86 core).

    56. Re:Not all that surprising... by gnasher719 · · Score: 1

      I thought TSX would work best with zero contention? You execute code that supposedly does a transactional operation, but because of a prefix code it doesn't actually do anything transactional - unless things go wrong, it rolls back what it has done, and does the same code properly transactional.

      So when there is no contention (which is most of the time), that's when TSX is most efficient. An example would be the gcc library std::string code. std::string doesn't need to be thread safe, but gcc's implementation needs to be. However, it will almost never happen that two threads access the same string data. So TSX should be perfect there.

    57. Re:Not all that surprising... by K.+S.+Kyosuke · · Score: 1

      As far as "this instruction set extension" is concerned, are you talking about MPX or about TSX? Because it seems to me that you're responding with a comment on the utility of TSX to my comment on the ad-hocness of MPX, which doesn't make sense to me. Unless I missed something, these two features have nothing in common (except for both intercepting memory accesses in some way).

      --
      Ezekiel 23:20
    58. Re:Not all that surprising... by K.+S.+Kyosuke · · Score: 1

      Doesn't using a method that appears to behave in a non-deterministic way seem to make the exploiter's job harder, rather than easier?

      --
      Ezekiel 23:20
    59. Re:Not all that surprising... by K.+S.+Kyosuke · · Score: 1

      Uh, I was referring to the mention of MPX, not to TSX.

      --
      Ezekiel 23:20
    60. Re:Not all that surprising... by K.+S.+Kyosuke · · Score: 2

      It has always been my understanding that HTM may not necessarily increase execution performance (outright), but always offers one huge win in terms of operation composability, which is something that individual locks are never going to have. In other words, even if it doesn't make identical programs faster, it ought to make the programming process faster, which is what modern programming seems to be about. An interesting question is what percentage of performance increase can one expect from significant restructurings of complex programs. I have no answer to that, though. (But the things you were saying seem to indicate to me that you haven't explored that path, and I'm not really sure why you claim that this increases code complexity and decreases abstraction when it is really the purpose of this HW design to work in the opposite way, at least once compilers and application libraries will be able to deal with the feature. Have RDBMS people ever complained that dealing with transactions is less abstract than dealing with low-level locks manually?)

      --
      Ezekiel 23:20
    61. Re:Not all that surprising... by K.+S.+Kyosuke · · Score: 1

      What is there not to believe? If memory transactions (for any random programming language) were implementable by a compiler fix, why did Intel bother to add all that complex circuitry? Perhaps you could point it out to them how to do it.

      --
      Ezekiel 23:20
    62. Re:Not all that surprising... by AmiMoJo · · Score: 2

      The law in most European countries requires that defective products be replaced. If a feature was advertised but doesn't work, the vendor (not the manufacturer) can either replace it with one that does work or give a refund. The refund can be either full or partial, negotiated with the buyer and depending on how useful the product is without that feature.

      If I had one of these chips I'd be looking for a full refund or replacement with a fixed version as soon as a fix was available.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    63. Re:Not all that surprising... by TheRaven64 · · Score: 1

      If it's nondeterministic, yes. If it's 'you can change the stack pointer in this way with this sequence of instructions' then it may be easier.

      --
      I am TheRaven on Soylent News
    64. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      So it's ok if AMD has a bug in their hardware, because all a developer has to do is completely change their code to work on the 5% of processors out there with the bug?

      Fanboy much?

    65. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      Opterons don't come anywhere close to the performance of even a mid-range, consumer grade Intel CPU, with or without these instructions.

      Nice try at a misinformation plant and troll though.

    66. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      While this case is
      we are haswell of borg
      transactions are futile
      your work will be locked frequently

    67. Re:Not all that surprising... by viperidaenz · · Score: 1

      The problem goes much further down than C. Assembly has the same problems. So does working with the native machine code.
      The cause of the problem is the use of a von Neumann architecture.

    68. Re:Not all that surprising... by K.+S.+Kyosuke · · Score: 1

      Except that compilers of many languages can avoid the problem that MPX solves by simply generating correct code from a safe language. I don't see what von Neumann has to do with this. (It is even possible (using software isolated processes) to remove such things as memory protection and CPU privilege modes if the OS/runtime handles things correctly, and code is presented to it in a safe fashion, but that won't work with legacy applications. Nevertheless, MPX is one of the "because-of-C" features.)

      --
      Ezekiel 23:20
    69. Re: Not all that surprising... by Anonymous Coward · · Score: 0

      And Intel's stock price falling.

    70. Re:Not all that surprising... by viperidaenz · · Score: 1

      Bounds checking is expensive. Safe languages do it in software.
      I wouldn't be surprised if VMs that run managed code (or the code produced by compilers) make use of MPX.

      von Neumann is relevant, as overflowing a buffer with data can result in that data being executed as code - arbitrary remote code execution. In a Harvard architecture, buffer overflows result in data corruption - that's privilege elevation at best.

    71. Re:Not all that surprising... by Bengie · · Score: 1

      Compute per watt, AMD is lacking. When you're limited by power, you can get almost 2x the compute from Intel. Even in the ideal situations, AMD is almost only faster than Intel for compute loads with little sharing that resemble parallel work similar to GPU loads.

    72. Re:Not all that surprising... by stoatwblr · · Score: 1

      1: There are no comparable AMD CPUs to i7-anything.

      2: Where AMD do compete (down at i5 level) they're significantly cheaper.

      3: Horses for courses. Unless you've been optimising for TSX it doesn't matter.

    73. Re:Not all that surprising... by K.+S.+Kyosuke · · Score: 1

      This erratum requires you to change your code if you use RTM because it won't work anymore. I'm not aware of AMD errata that require code changes (certainly beyond recompilation) just to make things work, but I'm willing to do my research on that.

      --
      Ezekiel 23:20
    74. Re:Not all that surprising... by Anonymous Coward · · Score: 0

      If you want people to click on your links, make them links. Put "<URL:" in front and ">" in back. Six extra characters. Not much work. Then "<URL:http://www.emulators.com/docs/nx03_10fixes.htm>" becomes "http://www.emulators.com/docs/nx03_10fixes.htm". Easy, no?

  3. Well, we call them... by ThatsDrDangerToYou · · Score: 2, Funny

    "Featurata"

    1. Re:Well, we call them... by NotFamous · · Score: 1

      I predict there will be future problems in this area - futurrata.

      --
      Some settling may occur during posting.
    2. Re:Well, we call them... by wonkey_monkey · · Score: 3, Funny

      It's okay, Intel are setting a new subdivision to undo these problems. And to maximise employee happiness, it's being built in the Canary Islands.

      I think I'd enjoy being a Featurata Reverter in Fuertaventura.

      --
      systemd is Roko's Basilisk.
    3. Re:Well, we call them... by Anonymous Coward · · Score: 0

      Wow, clever.

    4. Re:Well, we call them... by wonkey_monkey · · Score: 1

      No it's not. It's rather silly, really.

      --
      systemd is Roko's Basilisk.
  4. Can I have a refund? by Anonymous Coward · · Score: 2, Informative

    In some countries I would be entitled to get the product that was advertised or get a refund.

    1. Re:Can I have a refund? by Anonymous Coward · · Score: 0

      To borrow a car analogy, the Fight Club "a times b times c equals x" formula would apply here, but I'm not sure I trust Intel to do the math accurately.

    2. Re:Can I have a refund? by Anonymous Coward · · Score: 0

      In some countries I would be entitled to get the product that was advertised or get a refund.

      That is one reason why such chips are not designed by companies from such countries.

      Trade offs - they're required in the real world.

    3. Re:Can I have a refund? by jones_supa · · Score: 1, Flamebait

      In some countries I would be entitled to get the product that was advertised or get a refund.

      You probably didn't even know about the TSX instruction set before reading this article.

    4. Re:Can I have a refund? by Rashdot · · Score: 2

      Of course. According to my Pentium you're entitled to $0.99989960954

      --
      This is not the sig you're looking for.
    5. Re:Can I have a refund? by Anonymous Coward · · Score: 0

      Wow, are you trolling? If the compiler designers and firmware guys are not here any more, where are they hanging out these days?

    6. Re:Can I have a refund? by viperidaenz · · Score: 1

      Write a letter with proof of purchase to Intel.
      http://www.intel.com/content/w...

  5. Strange Timing... by Anonymous Coward · · Score: 0
    1. Re:Strange Timing... by ssam · · Score: 1

      19.802874743326488 years ago

  6. a bug != errata by Ecuador · · Score: 3, Insightful

    You either say "bugs - or errata" or "a bug - or erratum", since bug is singular and errata plural. At least the error - or "erratum" (see what I did here) in this case was in TFA and not introduced in the /. summary.

    --
    Violence is the last refuge of the incompetent. Polar Scope Align for iOS
    1. Re:a bug != errata by tepples · · Score: 1

      "A notice of errata", on the other hand, is singular.

    2. Re:a bug != errata by Anonymous Coward · · Score: 0

      Yes, a notice of errata is a notice of bugs/errors. Your point?

    3. Re:a bug != errata by tepples · · Score: 1

      My point is that "an errata" is probably short for "a notice of errata".

    4. Re:a bug != errata by TeknoHog · · Score: 1

      My point is that "an errata" is probably short for "a notice of errata".

      "Trolling is a art" is probably short for "Trolling is a form of art".

      --
      Escher was the first MC and Giger invented the HR department.
    5. Re:a bug != errata by Anonymous Coward · · Score: 0

      I like the fact that the summary writer doesn't know what errata is. He puts it in quotes like it's some magical new word only Intel uses.

      He must never have gone to college or even high school, where this term is always used to note errors or corrections to the textbook after it is published. It's all over the science and technology community. Just not the iDevice community.

      This site has gone so far down hill since Taco was around. It's like this place is full of script kiddies who think they know science and technology, but really have no clue.

  7. skip broadwell 2015 by Anonymous Coward · · Score: 0

    I use Skylake for the TSX.

  8. Bought a 4770 instead of 4770K because of TSX by Anonymous Coward · · Score: 1

    The only reason I got a 4770 instead of a 4770K was to play with this instruction in assembler code. To me this sounds like a reason for a partial reimbursement or a fixed chip, not just a BS "fix" that disables the whole feature.

    1. Re:Bought a 4770 instead of 4770K because of TSX by Anonymous Coward · · Score: 0

      Write a friendly message to Intel and ask for a reimbursement, who knows if you might actually get one. Explain that you bought your chip specifically for TSX.

    2. Re:Bought a 4770 instead of 4770K because of TSX by viperidaenz · · Score: 1

      But apparently it's much more fun to bitch anonymously on a website about it.

      But considering the 4770 is cheaper than the 4770k, I'm not sure how you would calculate the partial reimbursement.
      $404.20NZ for i7-4770
      $435.85NZ for i7-4770k

    3. Re:Bought a 4770 instead of 4770K because of TSX by CajunArson · · Score: 3, Informative

      You can still "play with this instruction" all you want.

      What happened here is that a third party developer managed to uncover a corner case where certain interactions with TSX can lead to instability. In order to be safe, Intel acknowledged the bug (a refreshing response) and is now giving you the OPTION to disable TSX if you feel that it could impinge the stability of a production load.

      So basically: Go ahead and play with TSX all you want, but be aware of the errata and that it's theoretically possible to hang your machine in some corner cases.

      --
      AntiFA: An abbreviation for Anti First Amendment.
    4. Re:Bought a 4770 instead of 4770K because of TSX by Anonymous Coward · · Score: 1

      If broken interrupt remapping on the 55xx chipset does not qualify for a new stepping and recall, why the hell do you think TSX would?

      Without interrupt remapping, the IOMMU is so severely crippled that you lose any protection it could give you against malicious attacks between VMs over PCI. It still provides isolation, but it is badly crippled and trivial to bypass.

  9. Is this TSX function bad for security? by Anonymous Coward · · Score: 0

    Letting my imagination run, I couldn't help but wonder, clueless as I am, if this TSX function could be bad for security. :)

    1. Re:Is this TSX function bad for security? by BronsCon · · Score: 1

      You're thinking of the TSA.

      --
      APK quotes people (including myself) without context and should not be trusted. Just thought you should know.
    2. Re:Is this TSX function bad for security? by Anonymous Coward · · Score: 0

      I think you're confusing RSA/TSA (public/private key cryptography)
      with TWA or SWA (Security of Winged Aircoaches)

      /lol tla

  10. So how does one find out /apply "fix" with linux? by Ungrounded+Lightning · · Score: 2

    It would have been nice if TFA had told us what chips were affected, or how to determine that, rather than saying "haswell" and expecting everybody reading it to do their own research.

    I just spent ten minutes looking around the web, trying to determine if the processor in my laptop is one of those affected - preparatory to perhaps trying to figure out, if it is, how to apply the "disable the broken feature" fix - without installing windows - to avoid the memory corruption bogeyman if somebody distributes software that uses, or abuses the feature.

    No joy. The documentation seems to say that:
      - Core i7 is Haswell
      - TSX is NOT supported on versions up to something BEFORE the processor version in my laptop (i7-4700MQ)
      - But the descriptions of that processor I've found so far don't say, one way or another, whether it does or doesn't have TSX. B-b

    The "flags" field in /proc/cpuinfo doesn't include a "tsx". But would it?

    Can anyone tell us a simple way to check?

    --
    Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
  11. Workaround possible? by Anonymous Coward · · Score: 0

    Million-dollar question: is disabling the only viable solution?

    Could the problem be worked around with clever microcode patching?

  12. Re:So how does one find out /apply "fix" with linu by heezer7 · · Score: 2

    Check the Intel ARK page for your model number Ex: http://ark.intel.com/products/...

  13. Re:So how does one find out /apply "fix" with linu by Anonymous Coward · · Score: 1

    If you have never updated your firmware, then you don't have to apply a fix.
    I think the fix is only for people who update their firmware constantly.

  14. Re:So how does one find out /apply "fix" with linu by cheese_boy · · Score: 2

    Can anyone tell us a simple way to check?
    Intel has on their website info on the processors.
    For example, for yours (i7-4700mq) you would look at:

    http://ark.intel.com/products/75117/Intel-Core-i7-4700MQ-Processor-6M-Cache-up-to-3_40-GHz

    Or you can look for all products that were "formerly haswell":
    http://ark.intel.com/products/codename/42174/Haswell#@All

    how to apply the "disable the broken feature" fix - without installing windows

    I would do some searches for updating BIOS from linux - ex:
    https://wiki.archlinux.org/index.php/Flashing_BIOS_from_Linux

    Or doing a microcode update:
    https://wiki.archlinux.org/index.php/Microcode

    Until there is a chip for sale that really supports TSX I wouldn't expect anyone to be distributing software that uses it. So I wouldn't be too worried about it yet.

  15. Re:So how does one find out /apply "fix" with linu by Anonymous Coward · · Score: 3, Informative

    Wikipedia has very detailed information on Intel processors. This page does not list TSX for your processor and does list it for others.

    Most Linux distros automatically handle Intel microcode patches (which I assume is how this errata will be handled). See Debian wiki or Arch wiki for details.

  16. Re:So how does one find out /apply "fix" with linu by BitZtream · · Score: 2

    ARK is your friend if you don't have the CPU. dmesg, kernel boot showing feature flags, or CPU-id or whatever the windows app is will all tell you what your CPU supports.

    Your Linux box will probably just have an update with new microcode for the issue and you'll never need to know anything about it, or it will fiddle with the cpu flags to show it as disabled anyway.

    Basically 'if you don't know, it doesn't affect you'

    --
    Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  17. Point of order by Anonymous Coward · · Score: 0

    "Errata" is plural, with the singular being "erratum". Also, not a bug, nor a feature: It's a notice of error with correction. The errata to a book is then the list of errors found with corrections.

  18. when is a 'bug' a 'feature' by Mister+Liberty · · Score: 1

    If anyone can tell, it's ' Intel '.

  19. Actual details of the bug? by rbarreira · · Score: 1

    Are there any actual details of how the bug works?

    --

    The AACS key is NOT 0xF606EEFD628B1CA427BEA93A9CA9773F
  20. With apologies to Joan Jett by Anonymous Coward · · Score: 0

    Hello, Intel! Hello, Dell! I'm your C-C-C-C-C-C-C-CLASS ACTION BOMB!

  21. Re:So how does one find out /apply "fix" with linu by EvilJoker · · Score: 1

    Honestly, if you're asking, it probably doesn't affect you. This really only affects a tiny percentage of users, who are specifically coding with this feature.

  22. Problem and possible alternatives by enriquevagu · · Score: 5, Informative

    This is a real pity for the TM community. This is not the first chip with transactional memory support in hardware: The Sun Rock was announced to have hardware TM support, and the IBM Blue Gene/Q Compute chip also supports it. Unlike other proposals for unbounded transactional memory, all these systems employ Hybrid Transactional Memory (ref, ref, ref), in which restricted hardware transactions are designed to correctly coexist with unbounded software transactions, so a software transaction can be started in case a hardware transaction fails for some unavoidable issue (such as lack of cache size or associativity to hold speculative data from the transaction, not because of a conflict). Note that, in any case, very large transactions should arguably be very uncommon, since they would significantly reduce performance (similar to very large critical sections protected by locks).

    The problem with the hardware implementation of transactional memory is that they are not simply a new set of instructions which are independent from the rest of the processor. HTM implies multiple aspects, including multiversioning caching for speculative data; allowing for the commit of speculative (transactional) instructions, which could be later rolled back (note that in any other speculative operation such as instructions after branch prediction, the speculation is always resolved before instruction commits because the branch commits earlier); a tight integration with the coherence protocol (see LogTM-SE for an alternative to this very last issue, but still...); a mechanism to support atomic commits in presence of coherence invalidations... From the point of view of processor verification, this is a complete nightmare because these new "extensions" basically impact the complete processor pipeline and coherence protocol, and verifying that every single instruction and data structure behaves as expected in isolation does not guarantee that they will operate correctly in presence of multiple transactions (and non-transactional conflicting code) in multiple cores. There are some formal studies such as this or this, and the IBM people discuss the verification of their Blue Gene TM system in this paper (paywalled).

    As some others commented before, the nature of the "bug" has not been disclosed. However, since it seems to be easy to reproduce systematically, I would expect it to be related to incorrect speculative data handling in a single transaction (or something similar), rather than races between multiple transactions.

    Regarding the alternatives, Intel cannot simply remove these instruction opcodes because previous code would fail. I assume that the patch will make all hardware transactions fail on startup, with a specific error (EAX bit 1 indicates if the transaction can succeed on a retry; setting this flag to 0 should trigger a software transaction). In such case, execution continues at the fallback routine indicated in the XBEGIN instruction, which should begin a software transaction. Effectively, this will be similar to a software TM (STM) with additional overheads (starting the hardware transaction and aborting it; detecting conflicts with nonexistent hardware transactions) that would make it slower than a pure STM implementation.
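
    To make the retry-or-fallback decision above concrete, here is a rough sketch in Python of how a runtime might interpret the abort status that RTM leaves in EAX. The bit positions follow Intel's documented abort status layout; the function names and the retry budget are made up for illustration, and nothing here claims to describe what the microcode patch actually does.

```python
# Sketch only: interprets an RTM abort status word (the value left in EAX
# when a transaction aborts). Helper names are hypothetical.

XABORT_EXPLICIT = 1 << 0  # aborted by an XABORT instruction
XABORT_RETRY = 1 << 1     # the transaction may succeed on a retry
XABORT_CONFLICT = 1 << 2  # another logical processor touched our data
XABORT_CAPACITY = 1 << 3  # an internal buffer overflowed
XABORT_DEBUG = 1 << 4     # a debug breakpoint was hit
XABORT_NESTED = 1 << 5    # abort happened inside a nested transaction

_NAMES = [
    (XABORT_EXPLICIT, "explicit"),
    (XABORT_RETRY, "retry"),
    (XABORT_CONFLICT, "conflict"),
    (XABORT_CAPACITY, "capacity"),
    (XABORT_DEBUG, "debug"),
    (XABORT_NESTED, "nested"),
]


def decode_abort_status(eax):
    """List the names of the abort status bits set in eax, lowest bit first."""
    return [name for bit, name in _NAMES if eax & bit]


def xabort_code(eax):
    """Return the 8-bit XABORT argument (only meaningful if bit 0 is set)."""
    return (eax >> 24) & 0xFF


def next_step(eax, attempts_left):
    """Retry in hardware only if the retry bit is set and budget remains."""
    if eax & XABORT_RETRY and attempts_left > 0:
        return "retry"
    return "fallback"
```

    If every transaction aborted with a status of 0, as assumed above, next_step would always answer "fallback" and the program would run entirely on its software path.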

    1. Re:Problem and possible alternatives by johndoe42 · · Score: 1

      Regarding the alternatives, Intel cannot simply remove these instruction opcodes because previous code would fail. I assume that the patch will make all hardware transactions fail on startup, with a specific error (EAX bit 1 indicates if the transaction can succeed on a retry; setting this flag to 0 should trigger a software transaction). In such case, execution continues at the fallback routine indicated in the XBEGIN instruction, which should begin a software transaction. Effectively, this will be similar to a software TM (STM) with additional overheads (starting the hardware transaction and aborting it; detecting conflicts with nonexistent hardware transactions) that would make it slower than a pure STM implementation.

      This seems unlikely to me. I'd expect that the patch will clear the cpuid bit for TSX and cause #UD (undefined opcode) on XBEGIN, etc.
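
      For reference, the cpuid bits in question live in CPUID leaf 7 (subleaf 0), register EBX: bit 4 is HLE and bit 11 is RTM. A small Python sketch of decoding them follows. It assumes you already have the raw EBX value from somewhere (for example a raw dump from a cpuid tool); Python cannot issue CPUID itself, and the function name is made up.

```python
# Sketch only: decodes the TSX-related feature bits from CPUID.(EAX=7,ECX=0):EBX.
# The caller is assumed to supply the raw EBX value.

HLE_BIT = 4   # Hardware Lock Elision
RTM_BIT = 11  # Restricted Transactional Memory


def tsx_features(ebx):
    """Return which TSX features the given leaf-7 EBX value advertises."""
    return {
        "hle": bool(ebx & (1 << HLE_BIT)),
        "rtm": bool(ebx & (1 << RTM_BIT)),
    }
```

      Well-behaved code checks these bits before executing XBEGIN, so clearing them would steer such code onto its non-TSX path without it ever hitting the #UD.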

  23. Look on the bright side... by ewhenn · · Score: 1

    Look on the bright side... at least it performs addition correctly, I know for a fact as I recently upgraded to a Haswell based desktop. This isn't like that other 0.99912656367 time when they had the Pentium FDIV bug.

  24. CPUs should be replaced upon request, or... by KonoWatakushi · · Score: 1

    Alternatively, Intel should stop artificially segmenting their product line on every last instruction set extension or feature. ECC and VT-D should be standard features, yet are intentionally crippled on other Intel chips. If I paid extra for a Xeon, then I expect those to work and TSX is no different.

    It is infuriating that developers and users alike must face such a mishmash of arbitrarily enabled functionality just so Intel can extract further profit, even while bragging about their low defect rate on the 22nm process. I'm not saying that processors shouldn't be binned, only that it should be done on the basis of defects. It is criminal to arbitrarily destroy value in the pursuit of profit, and maybe the law should reflect that.

    1. Re:CPUs should be replaced upon request, or... by Z00L00K · · Score: 1

      Add to it that it's not obvious in easily accessible documents what the differences are between the processor models aside from cache size and other features that are easy to show to customers but when you have two processors with vastly different price but same basic specs (Clock, Cache, addressable memory) it's hard to understand why one is more expensive than the other.

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
    2. Re: CPUs should be replaced upon request, or... by Anonymous Coward · · Score: 0

      Doubt they can fix it on Haswell. Horse has left the barn a long time ago. Gotta see if there is remuneration available.

    3. Re:CPUs should be replaced upon request, or... by kav2k · · Score: 1

      ark.intel.com qualifies as "easily accessible", no?

    4. Re:CPUs should be replaced upon request, or... by Z00L00K · · Score: 1

      Fails in the obvious part, the hard thing is to know that it exists, then it comes down to that the web page doesn't work unless you select Internet Explorer.

      So they could do more on the accessibility of the information. The documentation is also hard to get a grip on unless you read through it before you can decide if it's useful for a specific application or not.

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
  25. official list of processors that support tsx-ni by Anonymous Coward · · Score: 0

    You asked: Can anyone tell us a simple way to check? [if my laptop's CPU supports TSX-NI]

    Here is a list (as of November 2013), scroll down for an Intel reply:
    Where are the Haswell laptops with TSX-NI ?
    https://communities.intel.com/message/211616

    The list starts with i5-4200H, i5-4350U, i5-4300U, i5-4300M, ... and continues up to the i7 chips

  26. As a side note by Z00L00K · · Score: 1

    This article at least provided more information about the existence of the feature than any release note provided.

    --
    If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
  27. Re:So how does one find out /apply "fix" with linu by johndoe42 · · Score: 1

    If you have a recent version of the cpuid tool, you can run:

    cpuid |grep RTM

    and you'll see something like:

    RTM: restricted transactional memory = false
    RTM: restricted transactional memory = false
    RTM: restricted transactional memory = false
    RTM: restricted transactional memory = false

    /proc/cpuinfo doesn't show it, presumably because no kernel support is needed at all for this feature. (And that's why, if this is indeed a privilege escalation issue, it won't be easily fixed with a kernel change.)

  28. Re:So how does one find out /apply "fix" with linu by Anonymous Coward · · Score: 0

    Include the intel microcode update packages for your distro, keep them up-to-date as well as the kernel, and stop worrying. BTW, the cpuinfo flag to search for is "rtm".
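
    A quick way to script that flag check, as a rough Python sketch. It assumes your kernel is new enough to report "rtm" and "hle" in the flags line of /proc/cpuinfo at all; the function name is made up.

```python
# Sketch only: looks for the TSX flags in text formatted like /proc/cpuinfo.
# Assumes the kernel reports them as "rtm" and "hle" on the "flags" line.


def tsx_flags(cpuinfo_text):
    """Return the subset of {'rtm', 'hle'} listed on any 'flags' line."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Split into whole words so that e.g. "rtmx" would not match.
            found |= {"rtm", "hle"} & set(value.split())
    return found
```

    On a Linux box you would call it as tsx_flags(open("/proc/cpuinfo").read()). An empty result means either the CPU lacks TSX, the microcode has turned it off, or the kernel does not know the flag.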

    There are so many ways to crash your box, it is not even funny, so don't worry about TSX.

    Also, for someone who bought a box without support for ECC memory, you're caring too much about memory corruption. There are two things you can be certain about: 1. the box will, eventually, break down as all things do, and 2. you will be a victim to silent memory corruption due to lack of ECC memory.

  29. Phonology vs. morphology: compare "data" by tepples · · Score: 2

    That's different. I'll explain for the benefit of ESLers reading Slashdot:

    The use of "a" or "an" in modern English is always conditioned by the phonology. The rule is that "an" becomes "a" when followed by a phoneme with a sonority below "vowel". Hence "a hedgehog" in standard or "an hedgehog" (pronounced "an edge Ogg") in voiced-aitch dialects such as Cockney. I've seen only one consistent exception to this rule: "an hero" referring to one who commits suicide, which retains "an" even in voiceless-aitch dialects.

    By contrast, the reanalysis of a plural first as a mass noun and eventually as a singular referring to the collection is closer to morphology. The behavior of "errata" has loosely paralleled that of "data", which has already become a mass noun taking a singular (such as "the data is..."), with "datum" having become archaic in favor of "data point" or "piece of data". The step after a mass noun is a collective, which can lead to a double plural; "erratas" refers to what would be called "collections of errata" under the older convention.