'Kernel Memory Leaking' Intel Processor Design Flaw Forces Linux, Windows Redesign (theregister.co.uk)
According to The Register, "A fundamental design flaw in Intel's processor chips has forced a significant redesign of the Linux and Windows kernels to defang the chip-level security bug." From the report: Programmers are scrambling to overhaul the open-source Linux kernel's virtual memory system. Meanwhile, Microsoft is expected to publicly introduce the necessary changes to its Windows operating system in this month's Patch Tuesday: these changes were seeded to beta testers running fast-ring Windows Insider builds in November and December. Crucially, these updates to both Linux and Windows will incur a performance hit on Intel products. The effects are still being benchmarked, however we're looking at a ballpark figure of five to 30 per cent slow down, depending on the task and the processor model. More recent Intel chips have features -- specifically, PCID -- to reduce the performance hit. Similar operating systems, such as Apple's 64-bit macOS, will also need to be updated -- the flaw is in the Intel x86 hardware, and it appears a microcode update can't address it. It has to be fixed in software at the OS level, or buy a new processor without the design blunder. Details of the vulnerability within Intel's silicon are under wraps: an embargo on the specifics is due to lift early this month, perhaps in time for Microsoft's Patch Tuesday next week. Indeed, patches for the Linux kernel are available for all to see but comments in the source code have been redacted to obfuscate the issue. The report goes on to share some details of the flaw that have surfaced. "It is understood the bug is present in modern Intel processors produced in the past decade," reports The Register. "It allows normal user programs -- from database applications to JavaScript in web browsers -- to discern to some extent the contents of protected kernel memory. The fix is to separate the kernel's memory completely from user processes using what's called Kernel Page Table Isolation, or KPTI."
At one point, Forcefully Unmap Complete Kernel With Interrupt Trampolines, aka FUCKWIT, was mulled by the Linux kernel team, giving you an idea of how annoying this has been for the developers.
About par for Intel's course. Make it fast at the expense of horrible bugs.
Your hair look like poop, Bob! - Wanker.
Intel guys are doing the bulk of the work for the linux kernel changes, and I'm sure to be fair they'll equally cripple all processors with the changes not just their own.
Sorry for the lack of imagination, but if the user space process can only read kernel memory, and can't write to it, how could one make use of this?
I find it hard to believe that a virtual memory change will result in a 5-30% slowdown for Intel processors. Maybe for a few extremely specific (likely edge-case) tasks, but if there was a legitimate 5-30% performance decrease, you can bet there would be a far different solution in the works that would suitably fix the problem.
The developers behind the GRSecurity project measured up to 63% performance loss. If most common tasks are equally affected, Intel is sure fucked. Home users might not need to bother, but large cloud providers might be seriously affected.
Meanwhile the Linux kernel has received the largest incremental minor patch in its history (229KiB) - perhaps kernel 4.14.11 already contains all the required fixes.
I have a sneaking suspicion Intel shares will fall through the floor in the next few weeks because Intel CPUs might suddenly have become considerably slower than their AMD Zen-based counterparts.
sounds like one way to fix this would be to implement the operating system as some sort of message-passing microkernel...
I used to use Linux in some small VMs without swap. Even though there was no swap space, if memory even started to become scarce some kernel process, I think called kswapd0, would go stupid and would use 100% of the CPU. Clearly it wasn't doing anything useful, because there was no swap space at all to be used. Since it was acting in an unwanted and wasteful way, I consider this to be a severe bug. Do these changes fix this bug?
Linux Weekly News has been covering this for quite a while.
5% slowdown on average, with up to 30% for some particularly bad network operations.
ARM64 is also affected, so it's not just Intel
-- Sometimes you have to turn the lights off in order to see.
And what is interesting, AMD is immune to that, proof: https://lkml.org/lkml/2017/12/...
The summary is not fully explicit: this is not a flaw in Intel x86 ISA, but specific to CPUs made by Intel. AMD processors don't have the problem, so they should not need the patch.
https://lkml.org/lkml/2017/12/...
This could be a huge win for AMD, because the patch incurs a measurable slowdown. At the moment, though, the Linux fix doesn't seem to distinguish between manufacturers. I expect the distinction will appear later -- better safe than sorry.
Escher was the first MC and Giger invented the HR department.
you cut 30% off the performance of my CPU expect to hear about it.
Hi! I make Firefox Plug-ins. Check 'em out @ https://addons.mozilla.org/en-US/firefox/addon/youtube-mp3-podcaster/
or heck if you've just got a low end laptop?
some of my sys admin friends posted this on a slack channel i'm in, apparently it's a big deal
http://pythonsweetness.tumblr.com/post/169166980422/the-mysterious-case-of-the-linux-page-table
I came to the datacenter drunk with a fake ID, don't you want to be just like me?
Anyone remember the TLB bug that also resulted in huge performance penalties in the first generation Phenoms? Guess it's Intel's turn.
An older link, about the KAISER patch set
Up above
The fix separates the memory layout in kernel mode from the memory layout in user mode. The page tables used to be the same, but there appears to be an access method that bypasses the CPU protections, so the kernel no longer keeps the kernel pages mapped when a process leaves kernel mode. This means that every time a program calls into the kernel (to read a file, send a packet, etc.), the memory layout changes and the CPU has to flush the Translation Lookaside Buffer twice per syscall, once when the process enters kernel mode and once when it leaves kernel mode. This is what causes the slowdown. It is more severe for loads that use many system calls, but unless you have highly computationally intensive loads that rarely use system calls, you will see significant slowdowns.
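To put rough numbers on why syscall-heavy loads get hit hardest, here is a toy model in Python. It just charges a fixed penalty for each of the two flushes per syscall. The function name `estimated_slowdown`, the 500 ns flush cost, and the syscall rates are all made-up illustration values, not measurements, and the model ignores things like PCID and TLB refill behavior.

```python
def estimated_slowdown(syscalls_per_sec, flush_cost_ns, flushes_per_syscall=2):
    """Fraction of extra wall time spent on the added TLB-flush overhead.

    Simplifying assumptions: the workload used exactly one CPU-second per
    second before the fix, and every flush costs a fixed flush_cost_ns.
    """
    extra_ns = syscalls_per_sec * flushes_per_syscall * flush_cost_ns
    return extra_ns / 1e9


# Two hypothetical workloads with the same assumed per-flush cost.
compute_bound = estimated_slowdown(syscalls_per_sec=1_000, flush_cost_ns=500)
syscall_heavy = estimated_slowdown(syscalls_per_sec=200_000, flush_cost_ns=500)

print(f"compute-bound: {compute_bound:.1%}")
print(f"syscall-heavy: {syscall_heavy:.1%}")
```

With these invented inputs the compute-bound job barely notices, while the syscall-heavy one lands in the range the article talks about. The real cost per crossing is something you would have to measure on your own hardware.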
I'm curious how much Cannon Lake and Ice Lake CPU architectures are going to be delayed. Since Cannon Lake is basically SkyLake on a 10nm node, Intel cannot release it with such a glaring hole which causes such a significant performance loss.
I've been running a Sandy Bridge CPU for seven years now, and now I'm really looking forward to the second gen Zen CPUs. Viva, competition. I'm really glad AMD is still around.
https://www.fool.com/investing...
Less than a month before we knew the Linux kernel was being patched for this bug.
Does Intel still have shares of AMD stock?
This is why we run our mission-critical workloads on SPARC and Power alongside Linux. Solaris and AIX. Diversity -- in operating system, in processor, in manufacturer -- is healthy. The SPARC T8s are blazing fast, secure, and don't have this nonsense. Neither do our POWER8s. Having all your eggs in the Intel+Linux basket could be a major shitshow here... meanwhile, we'll keep chugging along.
To which I am sure will be modded down due to relevance.. But..
Why don't we see this crap coming from the AMD side of things, or moreover from the ARM side as well..
Makes ya wonder, if Intel is such a big conglomerate, overshadowing most.. Why the hell can't they get their shit straight..
The first Pentium blunder I can understand. It was their first major blunderfuck, which they should have learned from.
Now, several blunderfucks later, we get this load. Wouldn't it be funny if it turns out that INTEL killed INTEL?? due to various blunderfucks..
Have to ask, does this also affect Intel-Macs? I infer "yes", but have not read many of the detailed articles yet...
This has long been a concern of direct rendering, giving direct access to video hardware to applications opens up a lot of possible bugs that did not exist.
It might be better to simply use GLX and send all of the OpenGL commands to the X server and have the X server handle the video card. Remember that the X server is being ported to use Mesa drivers with the Glamor project so the X server can send OpenGL commands through Mesa. I asked about this before; the geniuses that run X.org seem to be refusing to upgrade the GLX support to the latest OpenGL specifications, even though for situations like this it could really help and might be important for people who want to avoid the risks of direct rendering, or who want to use network transparency. It's quite foolish.
Where do we apply for the refund?
CPU design is complex, that's why. Getting everything right, all the time, or even getting the testing fully comprehensive... "hard" doesn't even begin to describe the problem. The word you'd be looking for is "impossible."
--fyngyrz *
* Anon due to mod points, because Slashdot rules are stupid. Soylent News does it better. A lot better.
Even a 15% performance drop, not to mention 60%, is a huge hit. This is the kind of performance increase you buy a new processor for. So everyone just paid for a performance increase they did not get. False marketing to the tune of billions.
Glad my latest computer is a Raspberry Pi. Glad to be on an ARM processor. Perhaps this will help more ARM based computers become more mainstream this year.
The notion that Intel even has the capability of producing new fixed CPUs to match other than the latest packaging/pin requirements seems fanciful. In which case we'll just have to live with any slowdown. As buying all new systems is just too expensive.
Are we gonna get a list of affected Intel CPUs, so we can avoid buying them?
I haven't bought buggy, flawed, unsafe, expensive and slower Intel chips for the past decade, and I won't start now.
I will pick the AMD options before visiting the Christmas shops.
I will buy RISC-V chips if they are polished and better than AMD/Intel chips.
I hate the broken escalation prevention.
Stock option time!
Just because ARM processors don't have this security bug it doesn't mean that there aren't any Broadcom ARM processor hardware (or its kernel) security issues lurking out there that are as bad or worse.
Mimetics Inc. Twitter
No slowdown caused by Intel because they will SELL you a new CPU.
And no slowdown on your iPhone because Apple will SELL you a new battery,
In both cases: THESE ARE DEFECTIVE PRODUCTS.
In both cases you must pay a 2nd time for performance you already paid for.
Silicon Valley and American Quality have reached a new low.
What a joke!!!
It's not just the banks that are TOO BIG TO FAIL.
Won't there be people who decide that fixing this is not worth the slowdown? After all, if it is run on an internal machine where users can't cause a buffer overflow or provide code, why should there be a risk?
Avantgarde Hebrew science fiction
From the AMD commit:
this can probably be rewritten in the inverse like:
Intel processors ... allow memory references, including speculative references, that
access higher privileged data when running in a lesser privileged mode, [including]
when that access would result in a page fault.
So it seems like: set up a speculative memory reference to a kernel memory structure, cause a page fault, and then get a bit of kernel memory out (and back in?). That could get you root before long. Some people have been saying this can be leveraged to get a guest into its hypervisor too.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
not you. that's what the Linux team wanted to call this bug.
I read the El Reg article but I still don't understand what it is saying. At all levels. I don't understand if this means all intel processors or just the new ones. I don't understand if the 20% slowdown is for a tiny fraction of operations in the OS or if it means that things like e-mail, firefox or general python programming will be slowed down 20% overall. The latter would be a disaster. (could I ask intel to refund 20% of my computer costs?). And what's the consequences of not patching? Is the OS unstable or not use memory efficiently or "just" a security hole?
Some drink at the fountain of knowledge. Others just gargle.
Does this affect cryptomining, I wonder.
I have an idea, Intel. Stop putting stupid shit INSIDE the chips. I don't need ME, I don't need whatever the hell this "feature" was, I just need it to do math and optimize stuff like multi-thread resource sharing. If you want to do some dumb shit, PUT IT IN THE CHIPSET.
In all fairness, Occam's razor says Intel was slowing down AMD. Why? Because they have been caught doing this in the past, with compilers and libraries that deliberately anti-optimized code for AMD.
From TFA "It is understood the bug is present in modern Intel processors produced in the past decade. It allows normal user programs – from database applications to JavaScript in web browsers – to discern to some extent the contents of protected kernel memory.".
It could explain why Intel put the brakes on CPU production, and why some of the 2017 models are very hard to find.
Slashdot, fix the reply notifications... You won't get away with it...
Intel CEO Brian Krzanich apparently sold a bunch of shares on Nov. 29. While that's not unusual in and of itself, apparently Intel corporate bylaws require its CEO to maintain a minimum of 250,000 shares, and that's exactly how many shares Mr. Krzanich has left. Despite predicting future market growth, the guy dumped his stock for some reason.
https://www.fool.com/investing...
Design failure caused by negligence? Check.
The 'fix' massively reduces the utility and value of your hardware? Check.
Millions and millions of people nailed by it? Check.
It's the great problem for the cloud servers as Amazon's, Google's, Microsoft's, etc: 30% of slowdown is equivalent to 30% of their cloud servers shutdown but consuming 30% electrical power for nothing.
It's a waste of 30% of money in electricity each year.
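A quick back-of-the-envelope check of that arithmetic, assuming the "slowdown" means each server delivers that fraction less throughput: to get back to the original capacity you need 1 / (1 - slowdown) times as many servers, which is worse than the headline percentage. `extra_capacity_needed` is a hypothetical helper for illustration only.

```python
def extra_capacity_needed(slowdown):
    """Fractional increase in server count needed to restore the original
    throughput, assuming each server loses `slowdown` of its throughput."""
    if not 0 <= slowdown < 1:
        raise ValueError("slowdown must be in [0, 1)")
    return 1 / (1 - slowdown) - 1


# The two ends of the ballpark range quoted in the summary.
for s in (0.05, 0.30):
    print(f"{s:.0%} slowdown -> {extra_capacity_needed(s):.1%} more servers")
```

So under this assumption, a 30% throughput loss means roughly 43% more machines (and their electricity) to stand still, not 30%.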
It may be that over time, more efficient work-arounds will be devised. The first pass mostly just focuses on plugging the hole, while later patches may be more efficient because they can take time to study and test more efficient fixes.
Table-ized A.I.
That is not a memory leak.
I don't want a 30% slowdown to my workloads and I don't care if games hack each other to death.
Intel has really blown it with this bug. No argument from me there. On the other hand, Apple has not produced a defective product. Everyone’s batteries wear out. It is physics pure and simple. In Apple’s case they’ve done the right thing by offering a very inexpensive battery replacement.
amd ryzen gen 2 will crush intel now!
not patching? = no other updates as well. unless you compile your own Kernel
https://www.fool.com/investing/2017/12/19/intels-ceo-just-sold-a-lot-of-stock.aspx
a bitch ain't it? Artificially slow down anyone running AMD processors with your shit-tastic compiler... now your customers get a permanent 5-35% slowdown. I hope you enjoyed your cake, what's it taste like now?
Intel's CEO just sold all the stock he legally could sell.
If video games influenced behavior the Pac Man generation would be eating pills and running away from their problems.
Forgive me if I am being naive, but won't this mean we are now going to have a slower processor than we paid for, making Intel owners eligible for participation in a class-action lawsuit involving not getting what we paid for or false advertising or something like that? If I bought a car with X horsepower, and suddenly something was found wrong with it and it had to be modified to work, and was suddenly X - 10%, I'd expect compensation. Their flaw is essentially going to make it so people will have to upgrade to keep up with current tasks. How is this at all fair? They can be lazy, then expect more sales?
For a year. Because they've fixed the battery problem right?
It's like the 4's rubber bumper thing. Band-aid it for a ridiculously short time for cheap or free and people like you eat it all up
I sincerely hope the last time you bought their device was over 2 years ago, so you can actually take advantage of this... Otherwise, you'll still be throttled and battery replacement will still cost you 90+
You should've stuck with PowerPC.
Now if Linux wasn't so damn slow on my Raspberry Pi, or at least had decent 3D acceleration.
I don't know, but maybe he runs a high-performance computing (HPC) cluster.
With compute nodes segregated on a separate network that might not even have internet access,
and certainly not running random JavaScript downloaded from random websites.
And in this context, performance matters a lot,
while security is handled in a "perimeter" fashion.
In those cases, it makes sense to have an option to disable the fix.
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
The OS devs patching a security hole in the CPU should not leave them open to a lawsuit when the only way to patch it slows down the performance of the system.
Well, not exactly the only way.
One way is to run the upgraded kernel, which will use the fix to circumvent the big CPU flaw, but will take a massive performance hit.
The other way is to pass the "nopti" boot param to this new kernel, which will run it as-is with no fixes. That leaves performance untouched, but it is something you would only do on a machine that never, ever runs foreign untrusted code (which, according to TFS, also includes JavaScript).
The OS devs just give you a way to work around the CPU flaw and, cautious as usual, enable the fix by default (for security purposes) while giving you an option to disable it.
It's still optional, up to the user (even if the devs have sided with security by default).
On the other hand, Intel is the one who delivered a piece of hardware that doesn't work as advertised. (Or that can be made to sort of work if you use a performance-killing workaround.)
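If you want to check whether a given box was booted with the fix switched off, a minimal Python sketch is to look for the parameter on the kernel command line. `nopti` is the parameter named above; `pti=off` is, as far as I know, an equivalent spelling, so treat that one as an assumption. The helper names are hypothetical.

```python
def pti_disabled_on_cmdline(cmdline):
    """Return True if the kernel command line asks for page table isolation
    to be turned off. 'nopti' comes from the comment above; 'pti=off' is an
    assumed equivalent spelling."""
    params = cmdline.split()
    return "nopti" in params or "pti=off" in params


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    try:
        disabled = pti_disabled_on_cmdline(read_cmdline())
        print("PTI disabled via boot params:", disabled)
    except OSError:
        print("could not read /proc/cmdline (not Linux?)")
```

Note that a False here only means nobody asked for the fix to be disabled. It does not tell you whether the running kernel actually contains the fix.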
a) that advice "just download Ubuntu 12.04 and run it on your old PC" is not so good... even on old hardware, a new, safer Linux is best;
b) the idea of not supporting old AMD ones may backfire if a supported old Pentium 4 becomes slower than the unsupported AMD;
c) ditto for newer 64-bit CPUs losing any (small) advantage they have to an older, non-buggy 32-bit CPU (even if it's Intel).
I'd like to thank all those who work on the problem, and make Linux as wonderful as it is -- or rather, thanks to all those who make Free software (and likewise open source, too) wonderful.
These people quietly work to make the world better... that's the definition of good human beings.
accreditation of those high-security systems that are left. Most of the ones I worked on relied on Intel's x86 ring mechanisms to protect trusted processes from untrusted processes at the hardware level. Of course that wasn't the only means they used to ensure process isolation, but it was always part of their repertoire.
When there was the infamous FPU (floating point unit) problem in the original Pentium CPU, Intel provided 1-on-1 replacement to anybody who requested it for about 7 years. (Bring one faulty tile and get one good at same or higher speed. Early Pentium chip had red ceramic casing.)
We can expect the same to happen again or else Intel will be class-actioned by the seven horsemen of Apocalypse.
For once, this has nothing to do with the current lack of capability based security....
whew
It is rather clear that Intel screwed up. Intel should recall and replace all processors at no cost to the customers. I sure hope Microsoft and the Linux organisation send a bill to Intel for the damages caused.
Now I can say, "Hah!" in addition to, "I don't have Intel money".
The first thing I thought of when seeing the headlines was "Why can't the fix be for specific processors and available as a separate port?" since Linux already has separate versions for different processors.
After reading many comments, I'm glad to see that AMD contributed a patch so their processors won't be negatively affected where unnecessary.
I can't wait to see the newer Tom's Hardware benchmarks.
It was not a battery problem. There was no battery problem. But they tried to throttle the CPU to "balance" the effects of normal battery wear. But with enough shrills it created a PR nightmare, so they gave in.
Still hate Apple though.... batteries should not have cost $79 in the first place. That's a bigger sin than this episode, but whatever....
Could you explain the exact negligence that occurred here and why it took over ten years for 3rd parties to discover and which 3rd parties should we also hold responsible for negligence?
Change is certain; progress is not obligatory.
The fix is adding hundreds or even thousands of cycles of overhead to each and every crossing from user to kernel space. That means every system call, every page fault, and every interrupt.
Operating systems have gone to extraordinary lengths to optimize that system call fast path. x86 chips have added special SYSCALL/SYSENTER instructions to make this faster. I know in Linux, people count the number of cache lines touched and reorganize structures to reduce this number.
If your application does a lot of cheap system calls (such as reading from files cached in RAM), the overhead can exceed the "real work" of the system call.
If you want an extreme edge case, try calling getppid() in a loop. (That's the cheapest system call whose result can't be cached in user space.) I predict you'll find the time per iteration is more than doubled, i.e. >50% slowdown.
On the other hand, if your application is CPU-bound and makes relatively few system calls, there will be very little penalty.
I expect that once the emergency fix is released, people will search for ways to optimize it, but I doubt there are any huge gains available. Intel are quite aware that this is a serious public embarrassment and have sent it all the way back to their chip designers to see if there's a cheaper way to fix it.
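For anyone who wants to try the getppid() experiment without writing C, here is a rough Python sketch. Interpreter overhead is included in every iteration, so the relative slowdown it shows will be smaller than what a tight C loop would show; the iteration count is an arbitrary choice. Run it on the same machine before and after the patched kernel (or with and without the fix enabled) and compare.

```python
import os
import time


def time_getppid(iterations=1_000_000):
    """Return the average time in nanoseconds per os.getppid() call,
    including Python loop and call overhead."""
    getppid = os.getppid  # local alias to keep attribute lookup out of the loop
    start = time.perf_counter_ns()
    for _ in range(iterations):
        getppid()
    elapsed = time.perf_counter_ns() - start
    return elapsed / iterations


if __name__ == "__main__":
    print(f"{time_getppid():.0f} ns per getppid() call (interpreter overhead included)")
```

A single run is noisy, so take the best of several runs on an otherwise idle machine before drawing conclusions.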
Is there a chance that people running Linux in a VM will end up getting hit twice as hard due to fixes in both the Linux kernel and the hypervisor?
Slashdot your i and slashcross your t.
Uhh, throttling the phone down to the point of unusability without telling you the battery is the reason?
This is your expected behavior of design? Can't wait for the i car to come out. Your desired car will have no dashboard indicators or gauges to tell if you're running out of oil or battery. The car's performance will just slow down and you'll be told it's expected behavior.
Lol
It's a feature.
For the NSA, etc...
There is no "battery problem". It's simply the nature of battery chemistry that they degrade with age. All rechargeable batteries do this, just in different ways depending on the chemistry. Apple's "slow down" was an effort to make devices with marginal batteries *last longer*, which is something you'd think the consumer would want. Their failing was not keeping people informed.
~Any apparent grammatical or typographic errors are caused by defects in your display device.
I am relieved for Brian Krzanich, Intel's CEO. He was lucky enough to sell all the stock he could right before this made the news:
https://www.fool.com/investing...
Otherwise God forbid, he would have lost a lot of money.
Pestilence, War, Famine, Death... Larry, Curly, and Moe?
Why is this being fixed via a software patch? Intel should be required to perform a recall and replace all affected chipsets.
Can you imagine if Takata's response to their airbag fiasco was to tell car manufacturers that they need to install bomb-shielding in their cars between the airbag and the passengers?
No other industry gets away with fucking the public like the tech companies do. Well, except for the financial industry.
No, their failing was not designing a battery with capacity enough to offer a day's worth of use without charging after one year of normal use.
It's a design problem Apple tried to mask, and fanboys refuse to see it.
Takes 2 minutes to change an iPhone battery, and they don't cost 90$.
I've got better things to do tonight than die.
One for kernel-privilege code and one for user code, and a lightning fast way (use of a bit) that each instruction selects which of the two contexts it is using. I guess that would mean two separate caches etc.
This would enable efficient execution of true micro-kernel OSes wouldn't it?
Not sure if what I said there actually makes sense, in context of current chip architectures.
Basic idea is one particular kind of context switch, from kernel back and forth to user is not context caching/replacing but is just context selection from two.
Comments? Cure my ignorance?
Where are we going and why are we in a handbasket?
This update for kernel-firmware fixes the following issues:
- Add microcode_amd_fam17h.bin (bsc#1068032 CVE-2017-5715)
This new firmware disables branch prediction on AMD family 17h processor
to mitigate a attack on the branch predictor that could lead to
information disclosure from e.g. kernel memory (bsc#1068032 CVE-2017-5715).
lets them have access to us secretly whenever they want.. kind of like hidden AMD passwords, and NSA Key's, and holes put in things deliberately..
the only reason to know why this happened is if you did psychic warfare probing on the bastards that designed all these components.
DO NOT TRUST ANY AMERICAN MADE PIECES OF SHIT.
https://www.trumpsweapon.com/
Next gen Intel CPUs will display 5-30% performance increase!!!
a random binary from the Internet. And so is everything Turing-complete.
So unless you never *ever* enable JavaScript/WebAssembly, or have your OSI stack parse anything like a language, maybe even anything context-sensitive, ... yes you do.
That is already a "Battle of 'tards".
Math, sadly, is 90% religion, and 10% science. Like philosophy. ... In reality it is just a man-made tool to work with the patterns detected in reality ... that is massively abused in a schizophrenic (aka religious) way, to act like it *defines* those patterns, aka reality. And to play with nonsense that has no basis in reality, and is hence useless.
See: Goedel's incompleteness theorem, and math insisting that it is the origin and basis of reality anyway.
And engineering is robotic work destined for people with zero creativity and loads of OCD busywork.
Although math has its share of the latter too. But at least it is creative. (But so are bible stories.)
I prefer something creative that is useful in reality.
Every asshole ever has played the "Whoops, sorry, stoopit me" card. Because black-eyed sheep love to reinforce that anticonspiracy theory.
Stupidity is diverging. It by definition fails to reach a certain goal. It is more chaotic and random.
Evilness (and goodness) is converging. It progresses towards a goal. (And if a person is stupid, it does not matter if she is evil too, as she will be too ineffective.)
That is how you tell them apart.
In most cases I have seen, it was clearly converging. I mean all governments of the world trying to turn their countries into totalitarian surveillance states in the last ~10-15 years? If that isn't convergence, then what is?
(It does not take any backroom conspiracies or planning for evil convergence to happen. This was exactly what pre-CIA-impostor-"Anonymous" (the mindset, not a group) was.
Unplanned, unorganized convergence. A damn interesting concept for system theory researchers.)
So never attribute to stupidity, before thoroughly checking for convergence that can attribute it to malice.
(And there's more than this dichotomy!)