Errata Prompts Intel To Disable TSX In Haswell, Early Broadwell CPUs
Dr. Damage writes: The TSX instructions built into Intel's Haswell CPU cores haven't become widely used by everyday software just yet, but they promise to make certain types of multithreaded applications run much faster than they can today. Some of the savviest software developers are likely building TSX-enabled software right about now. Unfortunately, that work may have to come to a halt, thanks to a bug—or "errata," as Intel prefers to call them—in Haswell's TSX implementation that can cause critical software failures. To work around the problem, Intel will disable TSX via microcode in its current CPUs — and in early Broadwell processors, as well.
Chips don't add?
Transactions don't sync?
Don't be sad,
don't be a dink.
Burma Shave!!
Lost at C:>. Found at C.
So, basically, they've just been forced to get rid of the most complex (that's why it's not all that surprising) yet also most beneficial feature with regards to server loads? I'm sure there are some Opterons laughing right now.
Ezekiel 23:20
"Featurata"
In some countries I would be entitled to get the product that was advertised or get a refund.
Almost exactly 20 years since the infamous Pentium arithmetic problems.
You either say "bugs - or errata" or "a bug - or erratum", since bug is singular and errata plural. At least the error - or "erratum" (see what I did here) in this case was in TFA and not introduced in the /. summary.
Violence is the last refuge of the incompetent. Polar Scope Align for iOS
I use Skylake for the TSX.
The only reason I got a 4770 instead of a 4770K was to play with this instruction in assembler code. To me this sounds like a reason for a partial reimbursement or a fixed chip, not just a BS "fix" that disables the whole feature.
Letting my imagination run, I couldn't help but wonder, clueless as I am, if this TSX function could be bad for security. :)
It would have been nice if TFA had told us what chips were affected, or how to determine that, rather than saying "haswell" and expecting everybody reading it to do their own research.
I just spent ten minutes looking around the web, trying to determine if the processor in my laptop is one of those affected - preparatory to perhaps trying to figure out, if it is, how to apply the "disable the broken feature" fix - without installing windows - to avoid the memory corruption bogeyman if somebody distributes software that uses, or abuses, the feature.
No joy. The documentation seems to say that:
- Core i7 is Haswell
- TSX is NOT supported on versions up to something BEFORE the processor version in my laptop (i7-4700MQ)
- But the descriptions of that processor I've found so far don't say, one way or another, whether it does or doesn't have TSX. B-b
The "flags" field in /proc/cpuinfo doesn't include a "tsx". But would it?
Can anyone tell us a simple way to check?
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
Million-dollar question: is disabling the only viable solution?
Could the problem be worked around with clever microcode patching?
Check the Intel ARK page for your model number. Ex: http://ark.intel.com/products/...
If you have never updated your firmware, then you don't have to apply a fix.
I think the fix is only for people who update their firmware constantly.
Can anyone tell us a simple way to check?
Intel has on their website info on the processors.
For example, for yours (i7-4700mq) you would look at:
http://ark.intel.com/products/75117/Intel-Core-i7-4700MQ-Processor-6M-Cache-up-to-3_40-GHz
Or you can look for all products that were "formerly haswell":
http://ark.intel.com/products/codename/42174/Haswell#@All
how to apply the "disable the broken feature" fix - without installing windows
I would do some searches for updating BIOS from linux - ex:
https://wiki.archlinux.org/index.php/Flashing_BIOS_from_Linux
Or doing a microcode update:
https://wiki.archlinux.org/index.php/Microcode
Until there is a chip for sale that really supports TSX I wouldn't expect anyone to be distributing software that uses it. So I wouldn't be too worried about it yet.
Wikipedia has very detailed information on Intel processors. This page does not list TSX for your processor and does list it for others.
Most Linux distros automatically handle Intel microcode patches (which I assume is how this errata will be handled). See Debian wiki or Arch wiki for details.
ARK is your friend if you don't have the CPU. dmesg, kernel boot showing feature flags, or CPU-id or whatever the windows app is will all tell you what your CPU supports.
Your Linux box will probably just have an update with new microcode for the issue and you'll never need to know anything about it, or it will fiddle with the cpu flags to show it as disabled anyway.
Basically 'if you don't know, it doesn't affect you'
Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
"Errata" is plural, with the singular being "erratum". Also, not a bug, nor a feature: It's a notice of error with correction. The errata to a book is then the list of errors found with corrections.
If anyone can tell, it's ' Intel '.
Are there any actual details of how the bug works?
The AACS key is NOT 0xF606EEFD628B1CA427BEA93A9CA9773F
Hello, Intel! Hello, Dell! I'm your C-C-C-C-C-C-C-CLASS ACTION BOMB!
Honestly, if you're asking, it probably doesn't affect you. This really only affects a tiny percentage of users, who are specifically coding with this feature.
This is a real pity for the TM community. This is not the first chip with transactional memory support in hardware: The Sun Rock was announced to have hardware TM support, and the IBM Blue Gene/Q Compute chip also supports it. Unlike other proposals for unbounded transactional memory, all these systems employ Hybrid Transactional Memory (ref, ref, ref), in which restricted hardware transactions are designed to correctly coexist with unbounded software transactions, so a software transaction can be started in case a hardware transaction fails for some unavoidable issue (such as lack of cache size or associativity to hold speculative data from the transaction, not because of a conflict). Note that, in any case, very large transactions should arguably be very uncommon, since they would significantly reduce performance (similar to very large critical sections protected by locks).
The problem with the hardware implementation of transactional memory is that it is not simply a new set of instructions independent from the rest of the processor. HTM implies multiple aspects, including multiversion caching for speculative data; allowing for the commit of speculative (transactional) instructions, which could be later rolled back (note that in any other speculative operation, such as instructions after branch prediction, the speculation is always resolved before the instruction commits because the branch commits earlier); a tight integration with the coherence protocol (see LogTM-SE for an alternative to this very last issue, but still...); a mechanism to support atomic commits in the presence of coherence invalidations... From the point of view of processor verification, this is a complete nightmare because these new "extensions" basically impact the complete processor pipeline and coherence protocol, and verifying that every single instruction and data structure behaves as expected in isolation does not guarantee that they will operate correctly in the presence of multiple transactions (and non-transactional conflicting code) in multiple cores. There are some formal studies such as this or this, and the IBM people discuss the verification of their Blue Gene TM system in this paper (paywalled).
As some others commented before, the nature of the "bug" has not been disclosed. However, since it seems to be easy to reproduce systematically, I would expect it to be related to incorrect speculative data handling in a single transaction (or something similar), rather than races between multiple transactions.
Regarding the alternatives, Intel cannot simply remove these instruction opcodes because existing code would fail. I assume that the patch will make all hardware transactions fail on startup, with a specific error (EAX bit 1 indicates if the transaction can succeed on a retry; setting this flag to 0 should trigger a software transaction). In that case, execution continues at the fallback routine indicated in the XBEGIN instruction, which should begin a software transaction. Effectively, this will be similar to a software TM (STM) with additional overheads (starting the hardware transaction and aborting it; detecting conflicts with nonexistent hardware transactions) that would make it slower than a pure STM implementation.
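To make the abort-status reasoning above concrete, here is a rough Python sketch of how a runtime might interpret the value RTM leaves in EAX after an abort. The bit positions follow the documented RTM abort status layout (the same one the `_XABORT_*` macros in the compiler intrinsics header use); `decode_abort_status` and `next_step` are hypothetical helper names, and the retry policy is only an illustration. Whether the microcode update really reports an all-zero status is the assumption made in the comment above, not documented behaviour.

```python
# Bit positions of the RTM abort status returned in EAX after an abort.
XABORT_EXPLICIT = 1 << 0   # aborted by an XABORT instruction
XABORT_RETRY = 1 << 1      # the transaction may succeed on a retry
XABORT_CONFLICT = 1 << 2   # another logical processor touched the same data
XABORT_CAPACITY = 1 << 3   # internal buffer overflowed
XABORT_DEBUG = 1 << 4      # a debug breakpoint was hit
XABORT_NESTED = 1 << 5     # abort happened inside a nested transaction

_NAMES = [
    (XABORT_EXPLICIT, "explicit"),
    (XABORT_RETRY, "retry"),
    (XABORT_CONFLICT, "conflict"),
    (XABORT_CAPACITY, "capacity"),
    (XABORT_DEBUG, "debug"),
    (XABORT_NESTED, "nested"),
]


def decode_abort_status(eax):
    """Return the names of the abort-status bits that are set in eax."""
    return [name for bit, name in _NAMES if eax & bit]


def next_step(eax, attempts_left):
    """Hypothetical policy: retry in hardware only if the retry bit is set
    and the retry budget is not exhausted; otherwise use the software path."""
    if eax & XABORT_RETRY and attempts_left > 0:
        return "retry-hardware"
    return "software-fallback"


if __name__ == "__main__":
    # Assumed post-patch behaviour: every hardware transaction aborts with
    # the retry bit clear, so the policy goes straight to software.
    print(decode_abort_status(0), next_step(0, attempts_left=3))
    # A transient conflict with the retry bit set would be retried instead.
    status = XABORT_RETRY | XABORT_CONFLICT
    print(decode_abort_status(status), next_step(status, attempts_left=3))
```

Under that assumption, every transaction pays for one failed hardware attempt before taking the software path, which is the extra overhead described above.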
Look on the bright side... at least it performs addition correctly. I know that for a fact, as I recently upgraded to a Haswell-based desktop. This isn't like that other 0.99912656367 time when they had the Pentium FDIV bug.
Alternatively, Intel should stop artificially segmenting their product line on every last instruction set extension or feature. ECC and VT-d should be standard features, yet are intentionally crippled on other Intel chips. If I paid extra for a Xeon, then I expect those to work and TSX is no different.
It is infuriating that developers and users alike must face such a mishmash of arbitrarily enabled functionality just so Intel can extract further profit, even while bragging about their low defect rate on the 22nm process. I'm not saying that processors shouldn't be binned, only that it should be done on the basis of defects. It is criminal to arbitrarily destroy value in the pursuit of profit, and maybe the law should reflect that.
You asked: Can anyone tell us a simple way to check? [if my laptop's CPU supports TSX-NI]
Here is a list (as of November 2013), scroll down for an Intel reply:
Where are the Haswell laptops with TSX-NI ?
https://communities.intel.com/message/211616
The list starts with i5-4200H, i5-4350U, i5-4300U, i5-4300M, ... and continues up to the i7 chips
This article at least provided more information about the existence of the feature than any release note provided.
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
If you have a recent version of the cpuid tool, you can run:
cpuid |grep RTM
and you'll see something like:
RTM: restricted transactional memory = false
RTM: restricted transactional memory = false
RTM: restricted transactional memory = false
RTM: restricted transactional memory = false
/proc/cpuinfo doesn't show it, presumably because no kernel support is needed at all for this feature. (And that's why, if this is indeed a privilege escalation issue, it won't be easily fixed with a kernel change.)
Include the intel microcode update packages for your distro, keep them up-to-date as well as the kernel, and stop worrying. BTW, the cpuinfo flag to search for is "rtm".
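For anyone who would rather script the check, a minimal Python sketch along these lines should work on Linux. It reads the "flags" line of /proc/cpuinfo and looks for "rtm"; it also looks for "hle", the flag for the other half of TSX, which is an addition not mentioned above. The function names `cpu_flags` and `tsx_support` are made up for this example. Once a microcode update disables TSX, the expectation is that these flags simply stop appearing.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def tsx_support(cpuinfo_text):
    """Report whether the 'rtm' and 'hle' flags are advertised."""
    flags = cpu_flags(cpuinfo_text)
    return {"rtm": "rtm" in flags, "hle": "hle" in flags}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(tsx_support(f.read()))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

This only tells you what the CPU currently advertises; it says nothing about whether the chip had the feature before a microcode update turned it off.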
There are so many ways to crash your box, it is not even funny, so don't worry about TSX.
Also, for someone who bought a box without support for ECC memory, you're caring too much about memory corruption. There are two things you can be certain about: 1. the box will, eventually, break down as all things do, and 2. you will be a victim of silent memory corruption due to lack of ECC memory.
That's different. I'll explain for the benefit of ESLers reading Slashdot:
The use of "a" or "an" in modern English is always conditioned by the phonology. The rule is that "an" becomes "a" when followed by a phoneme with a sonority below "vowel". Hence "a hedgehog" in standard or "an hedgehog" (pronounced "an edge Ogg") in voiced-aitch dialects such as Cockney. I've seen only one consistent exception to this rule: "an hero" referring to one who commits suicide, which retains "an" even in voiceless-aitch dialects.
By contrast, the reanalysis of a plural first as a mass noun and eventually as a singular referring to the collection is closer to morphology. The behavior of "errata" has loosely paralleled that of "data", which has already become a mass noun taking a singular (such as "the data is..."), with "datum" having become archaic in favor of "data point" or "piece of data". The step after a mass noun is a collective, which can lead to a double plural; "erratas" refers to what would be called "collections of errata" under the older convention.