Linus Torvalds Says Intel Needs To Admit It Has Issues With CPUs (itwire.com)
troublemaker_23 shares an article from ITWire:
Linux creator Linus Torvalds has had some harsh words for Intel in the course of a discussion about patches for two bugs that were found to affect most of the company's processors... Torvalds was clearly unimpressed by Intel's bid to play down the crisis through its media statements, saying: "I think somebody inside of Intel needs to really take a long hard look at their CPUs, and actually admit that they have issues instead of writing PR blurbs that say that everything works as designed... Or is Intel basically saying 'we are committed to selling you shit forever and ever, and never fixing anything'?" he asked. "Because if that's the case, maybe we should start looking towards the ARM64 people more."
Elsewhere Linus told ZDNet that "there's no one number" for the performance drop users will experience after patches. "It will depend on your hardware and on your load. I think 5 percent for a load with a noticeable kernel component (e.g. a database) is roughly in the right ballpark. But if you do micro-benchmarks that really try to stress it, you might see double-digit performance degradation. A number of loads will spend almost all their time in user space, and not see much of an impact at all."
His point is more likely the fact that ARM didn't do any sort of PR-bullshit and instead produced a very, very in-depth whitepaper, example-code and whatnot on the whole thing. Their behaviour here is pretty much everything one would hope for in a case like this.
So you decide to speculate a future instruction.
It happens to be a load.
The address is [ebp+eax]. A recent instruction had the same address field, so you speculate that it remained the same.
Now you need to translate the address. The translation might be in the TLB, but you check, and for some reason it isn't.
So you decide to speculatively trigger TLB load.
Finally, you get a physical address back. A previous write instruction is not yet translated, but it seems unlikely it will translate to the same address, so you decide to speculate the load and you make a cache line request from L1.
It might be in L1, but it isn't. So you decide to speculate again, and request it from L2. Not there, and not in L3, either, so finally you speculate the load all the way to external memory. When the cache line returns, you speculatively cache this at all levels. Then you speculatively store the value into the target register. The final step was the least dangerous, because you can dump this later, no harm to the abstract state. But the concrete side effects on the TLB and the three layers of cache are not so easily reversed. In theory, the concrete state doesn't leak into the abstract state, because we simply don't like to think about time (time, above all things, being never simple; hint: functional programming has no time, only progress).
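To make the abstract-versus-concrete distinction above tangible, here is a minimal toy sketch in Python. It is not a model of any real CPU: the `ToyCore` class, its single register, and the set-of-lines "cache" are all made up for illustration. It only shows that squashing a mis-speculated load can restore the register while leaving the cache fill behind.

```python
class ToyCore:
    """Hypothetical toy core: one register, and a cache modeled as a set of line numbers."""

    LINE_SIZE = 64  # assumed cache line size in bytes, for illustration only

    def __init__(self):
        self.reg = 0          # abstract (architectural) state
        self.cache = set()    # concrete (microarchitectural) state
        self.memory = {}      # address -> value

    def load(self, addr):
        """Return (value, hit) and fill the cache line as a side effect."""
        line = addr // self.LINE_SIZE
        hit = line in self.cache
        self.cache.add(line)
        return self.memory.get(addr, 0), hit

    def speculate_load(self, addr, prediction_correct):
        """Run a load speculatively; squash the register write if the guess was wrong."""
        saved = self.reg
        value, _ = self.load(addr)
        self.reg = value
        if not prediction_correct:
            self.reg = saved  # abstract state is rolled back...
            # ...but nothing here undoes the cache fill

    def is_cached(self, addr):
        return addr // self.LINE_SIZE in self.cache


core = ToyCore()
core.memory[0x1000] = 42
core.reg = 7
core.speculate_load(0x1000, prediction_correct=False)
print(core.reg)                # 7: the register looks untouched
print(core.is_cached(0x1000))  # True: the cache remembers the squashed load
```

On real hardware the leftover cache line is observed through timing (a later access to that line is faster), which is the "time" the abstract model leaves out.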
Not all speculative architectures are created equal. There are many opportunities for an architecture to Just Say No.
With cache coherence, you have the MESI protocol (and its bewildering shoe full of second cousins).
One could apply the same concept of "exclusive" to the page tables, an exclusively mapped page being one mapped only into the current process and security context. If TLB speculation hits a different kind of beast, abandon speculation. Same thing with cache fill. Concrete side effects thereby only accrue from speculation to exclusive resources. Share-nothing usually solves most problems in computer science (except performance, which is mainly defined in the time domain).
I'm going to abandon the back of my envelope here. One has to think really damn hard to take this to the next logical level, and frankly, I don't have a damn to spare right this very minute.
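A back-of-the-envelope sketch of that "exclusive only" rule, in Python. Everything here is hypothetical: the `Page` type, the `owners` set, and the `may_speculate` check are invented names for illustration, not any real hardware or OS interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    number: int
    owners: frozenset  # contexts that currently have this page mapped (assumed bookkeeping)


def may_speculate(page, current_context):
    """Allow speculative side effects only on pages mapped exclusively by the current context."""
    return page.owners == frozenset({current_context})


private = Page(1, frozenset({"proc_a"}))
shared = Page(2, frozenset({"proc_a", "kernel"}))

print(may_speculate(private, "proc_a"))  # exclusive page: speculation may proceed
print(may_speculate(shared, "proc_a"))   # shared page: abandon speculation
```

The design choice is the share-nothing one from the comment: if a resource is visible to any other context, speculation simply stops there, trading some performance for not leaving side effects on shared state.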
But please, advance the conversation beyond:
[_] has speculation
[_] does not have speculation
Because that is Intel's diabolical trap, for as long as their PR department can continue to get away with tugging their wool in broad daylight.
It's not a pure monopoly, but it has a lot of monopoly power. Monopoly is not a binary state, as most lay pedants assume.
Were that I say, pancakes?
Chinese companies just put in backdoors for the Chinese government, organised crime, your Chinese competitors and so on.
https://thehackernews.com/2015...
http://www.zdnet.com/article/f...
http://www.securityweek.com/ap...
http://www.businessinsider.com...
https://tvnewswatch.blogspot.c...
echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
BS. There are already proofs of concept that can be run and are in the hands of a select few for testing purposes. We have no idea if these exploits have been used - only that we have no visibility on it. The only real visibility we have is when a whitehat reports it, or when someone is caught. While personal computers are less impacted, the fact that the browsers will all also have to be patched, since it can also be exploited through javascript, is problematic.
The issue is that by using the exploits you can get access to things like passwords used in kernel code, certificates, etc. -- and you can get this by pilfering the cache -- which breaks the isolation between user applications and the operating system.... While already bad on a personal computer, it is horribly bad for shared hosting environments -- where some actor can get access to a common computing environment and attack from the inside.
Actually, 83% is often used as a cutoff in both the US and Canada, derived from (US) judge Learned Hand's opinion that a market share of ninety percent 'is enough to constitute a monopoly; it is doubtful whether sixty . . . percent would be enough; and certainly thirty-three percent is not.' [ United States v. Aluminum Co. of Am., 148 F.2d 416, 424 (2d Cir. 1945)]
davecb@spamcop.net
MIPS was bought by Imagination Technologies who also own PowerVR (and, oddly enough Pure, a wonderfully geeky DAB radio company)
https://en.wikipedia.org/wiki/...
MIPS/Imagination is heading resolutely for embedded platforms and probably the plughole.
Still the original MIPS architecture is probably patent free. And Loongson make MIPS compatible chips. Unlicensed as far as I know. Not that there is much to licence in the original MIPS architecture
https://en.wikipedia.org/wiki/...
So it's possible for third parties to build MIPS compatible chips. Not MIPS32/MIPS64 but the original 64 bit MIPS III architecture.
https://en.wikipedia.org/wiki/...
Hell, skip the patented bits and make them NOPs. Lexra got in trouble not for implementing them but for making them illegal instructions. MIPS's lawyers argued successfully that a system integrator could write an illegal instruction trap handler that implemented the missing instructions in software, in perhaps the most amazing abuse of the patent system ever.
https://en.wikipedia.org/wiki/...
In 1999 MIPS Technologies sued Lexra again, but this time for infringing its patents on unaligned loads and stores. Though Lexra's processor designs did not implement unaligned loads and stores, it was possible to emulate the functionality of unaligned loads and stores through a long series of other instructions. In the opinion of Lexra, the ability to emulate the function of unaligned loads and stores in software predated the grant of the patent in question and could not be viewed as an infringement of the hardware patent by any reasonable interpretation. Also, much earlier than any MIPS Technologies processor, IBM mainframes supported unaligned memory operations. In these earlier IBM processors, unaligned memory operations and partial access to registers were available through microcode and the instruction set architecture. These aspects of earlier IBM processors posed the much greater threat of patent invalidation to MIPS Technologies, compared to the seemingly vacuous MIPS Technologies infringement claim against Lexra.
http://probell.com/Lexra/
If a Lexra processor encountered an unaligned load or store instruction in a program then it did the same thing that it would do for any other invalid opcode, it took a reserved instruction exception. In the second lawsuit between MIPS Technologies and Lexra, filed November 1999, MIPS Technologies claimed that because exception handler software could be written to emulate the function of unaligned load and store hardware, using many other instructions, Lexra's processors infringed the patent. Upon learning of this broad interpretation of the patent, Lexra requested that the US Patent and Trademark office (USPTO) reexamine whether the patent was novel when granted. Almost every microprocessor ever designed can emulate the functionality of unaligned loads and stores in software. MIPS Technologies did not invent that. By any reasonable interpretation of the MIPS Technologies' patent, Lexra did not infringe. In mid-2001 Lexra received a preliminary ruling from the USPTO that key claims in the unaligned load and store patent were invalid because of prior art in an IBM CISC patent. However, MIPS Technologies appealed the USPTO ruling and, in the mean time, won a favorably broad interpretation of the language of the patent from a judge. That forced Lexra into a settlement that included dropping the reexamination request before MIPS Technologies might have lost its appeal.
It was never determined that processors that execute the MIPS-I instruction set, but treat unaligned loads and stores as reserved instructions, infringed the '976 patent. The patent exp
Meanwhile, enjoying my Ryzen, largely unaffected by Meltdown or Spectre in spite of some well meaning or self-serving FUD to the contrary. Yes, I got an early part with the segfault bug, but AMD RMAed without fuss when presented with appropriate test data (https://github.com/suaefar/ryz...) to eliminate the possibility of bad motherboard, memory or overclocking. Quite different attitude compared to Intel! And the Ryzen is sweet - 16 high performing CPU threads, tiny power consumption at idle and respectable under full load. Integer performance, iow, compiling is stellar and floating point is not shabby. Basically, Ryzen out-cores Intel's competing i7 parts by a wide margin, acquits itself well in single-core too and draws so little power that the CPU fan is off or barely turning for most normal desktop usage. And when all 16 threads are going full blast, iow doing real work, total system power is around 120 watts, the system still runs nearly silent. Can't say enough good things about it.
If you do step up to Ryzen, be aware of two things: 1) Check the production week stamped on the CPU, it has the form 17xx where xx is the week... make sure this is higher than week 25, otherwise run kill-ryzen.sh to verify the segfault bug and get an RMA promptly from AMD's only support site, if you see it. Windows users need to boot Linux to do this, get a live iso on a usb stick to do this in maximum comfort, and preferably, just overwrite Windows when done :-) Most of that early production is sold out already, so the chance of getting a bad part is slim, but be aware. Windows users for the most part don't seem to see any issue even with the early parts. Good for them, but it goes along with significantly lower performance without the upgrade to Linux :-) 2) Be aware that Ryzen has no on-board GPU, in spite of the fact that your Ryzen motherboard has video connectors... these are for AMD's APUs, which use the same socket. Respectable chips in their own right especially in terms of value for money, but when you run Ryzen you need to run a discrete GPU too. This is what you want anyway, because what is the point of crippling your high end desktop processor with a mickey mouse embedded GPU? To be specific: AMD's fattest APU has eight compute units (512 stream processors) vs 64 compute units in the current Vega part, plus uses processor memory instead of higher bandwidth dedicated graphics memory.
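For what it's worth, the production-week check described above boils down to a two-digit comparison. A minimal Python sketch, assuming the stamp really is a four-digit year-plus-week code as described (the function names are made up here, and the week-25 cutoff is just the one quoted in the comment):

```python
def production_week(date_code):
    """Split a four-digit 'YYWW' code such as '1730' into (year, week)."""
    if len(date_code) != 4 or not date_code.isdigit():
        raise ValueError(f"expected a four-digit YYWW code, got {date_code!r}")
    return int(date_code[:2]), int(date_code[2:])


def needs_segfault_test(date_code, cutoff_week=25):
    """True if the part is from 2017 week 25 or earlier, i.e. worth testing."""
    year, week = production_week(date_code)
    return year == 17 and week <= cutoff_week


print(needs_segfault_test("1720"))  # early 2017 part: run the test
print(needs_segfault_test("1730"))  # later part: probably fine
```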
Of course, what I really want is a threadripper... that's next.
When all you have is a hammer, every problem starts to look like a thumb.
It's not a pure monopoly, but it has a lot of monopoly power. Monopoly is not a binary state, as most lay pedants assume.
There is no such legal concept as "pure monopoly". There is only anti-competitive behavior as defined in America by the Sherman, Clayton and FTC acts, which include such concepts as market power. There is endless confusion about this simple fact: a monopolist need not control 100% of a market to violate anti-trust laws. Usually much less than that; less than 50% is not at all uncommon. What matters is breaking the law or not.
>you heard me. he may be a great programmer, but he doesn't know DICK about how hard it is to make a CPU
Did you forget that Linus worked at Transmeta?
It seems you're not clear on the details, actually.
Literally any logical person can see that Intel's suck-it-and-see approach is terrible. AMD's (and everyone else's) engineers specifically addressed the issue by using the correct logic. That's not luck. That's called doing it right.
Logical order of operations:
1) Begin speculative execution
2) Encounter ring-0 request
3) Check for ring-0 permission
4) Only allow speculative operation to be processed if permission is allowed
Literally everyone but Intel does it this way because they're smarter than Intel.
Intel's janky-ass order of operations:
1) Begin speculative execution
2) Encounter ring-0 request
3) Allow request to be processed, no matter what
4) When resolving branching, check if requester has ring-0 access and invalidate the results if not
The trouble comes with: 3.5) Unprivileged requester inspects artifacts (registers, cache values, etc.) of processing prior to speculative branch resolution.
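The two orderings above can be put side by side in a toy Python sketch. This only models the sequences as the comment describes them, not how any actual CPU is built; the function names, the set-based "cache", and the `KERNEL_ADDRS` set are all made up for illustration.

```python
KERNEL_ADDRS = {0xFFFF0000}  # hypothetical privileged address


def check_first(cache, addr, is_ring0):
    """Permission check happens before the speculative access has any side effect."""
    if addr in KERNEL_ADDRS and not is_ring0:
        return False          # speculation abandoned, cache untouched
    cache.add(addr)
    return True


def check_late(cache, addr, is_ring0):
    """Speculative access happens first; permission is only checked at resolution."""
    cache.add(addr)           # side effect lands before the check (step 3)
    if addr in KERNEL_ADDRS and not is_ring0:
        return False          # result invalidated (step 4), but the cache keeps the trace
    return True


cache_a, cache_b = set(), set()
check_first(cache_a, 0xFFFF0000, is_ring0=False)
check_late(cache_b, 0xFFFF0000, is_ring0=False)
print(0xFFFF0000 in cache_a)  # False: nothing left to inspect
print(0xFFFF0000 in cache_b)  # True: the artifact from step 3.5 is still there
```

Both functions refuse the unprivileged request in the end; the only difference is whether anything observable is left behind in between.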
The blame for this lies at the feet of Intel's engineers as much as anyone else. Granted, it's possible they found this a while ago and wanted to fix it, but were thwarted by management. Everybody knows that management sucks in every company, and they probably were the most recent roadblock to progress here. But this design got to production in the first place, and that's the engineers' fault.