Slashdot Mirror


ARM TrustZone Hacked By Abusing Power Management (acolyer.org)

"This is brilliant and terrifying in equal measure," writes the Morning Paper. Long-time Slashdot reader phantomfive writes: Many CPUs these days have DVFS (Dynamic Voltage and Frequency Scaling), which allows the CPU's clock speed and voltage to vary dynamically depending on whether the CPU is idling or not. By turning the voltage up and down with one thread, researchers were able to flip bits in another thread. By flipping bits while the second thread was verifying the TrustZone key, the researchers were granted permission. If number 'A' is a product of two large prime numbers, you can flip a few bits in 'A' to get a number that is a product of many smaller numbers, and more easily factorable.
"As the first work to show the security ramifications of energy management mechanisms," the researchers reported at Usenix, "we urge the community to re-examine these security-oblivious designs."
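The factoring claim in the summary can be checked with a toy example. The sketch below is only an illustration of the arithmetic, not the researchers' method: it takes the product of two known Mersenne primes as a stand-in for 'A' (real RSA moduli are far larger), flips one bit at a time, and reports which flips produce a number that plain trial division can start to break down. The names `smallest_factor` and `weakening_flips` and the limit of 1000 are assumptions made for this example.

```python
def smallest_factor(n, limit=1000):
    """Trial division: smallest prime factor of n that is <= limit, else None."""
    if n % 2 == 0:
        return 2
    d = 3
    while d <= limit and d * d <= n:
        if n % d == 0:
            return d
        d += 2
    return None


# Toy stand-in for an RSA modulus: the product of two Mersenne primes.
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q


def weakening_flips(n, limit=1000):
    """Bit positions whose single flip yields a number with a prime factor <= limit."""
    return [
        bit
        for bit in range(n.bit_length())
        if smallest_factor(n ^ (1 << bit), limit) is not None
    ]


if __name__ == "__main__":
    print("intact modulus has a small factor:", smallest_factor(N) is not None)
    flips = weakening_flips(N)
    print(f"{len(flips)} of {N.bit_length()} single-bit flips expose a small factor")
```

The intact modulus resists trial division because both of its factors are large, while a corrupted copy is essentially a random number, and random numbers usually have small prime factors.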

60 comments

  1. Every time by DontBeAMoran · · Score: 4, Funny

    Every time I hear about security, viruses and hacks, it's done via "opcodes", "registers" and "bits". Isn't it time we design more secure processors without these flaws?

    --
    #DeleteFacebook
    1. Re:Every time by Anonymous Coward · · Score: 0

      If they were based on nibbles, none of this could happen.

    2. Re:Every time by jellomizer · · Score: 2

      Normally at this level for the hack we start to cross the line from the digital to the analog. While most of us coders just worry about 0 and 1, on the processor we are looking at values on either side of a threshold, where wires are so close that a power change could cause a little static arc that in theory can change a bit.
      However these hacks normally need to be timed perfectly and with intimate knowledge of what is going on at that time. Such a hack would most likely cause a program to fail, or some bad data to be processed, which is bad, however no worse than the bugs in most applications or OSes, or just generic hardware failure.
      While I could see ARM would want to fix this, I don't see it currently as a major concern for security, as if the hacker were to get to that level, they would have access to a lot more on the PC.

      --
      If something is so important that you feel the need to post it on the internet... It probably isn't that important.
    3. Re:Every time by elrous0 · · Score: 5, Funny

      Better yet, don't name your product with a name that can later become ironic, like "TrustZone." Try naming your product "ShitStorm" or "ClusterFuck" instead. That way, when it gets hacked or turns out to be buggy as hell, you can say "What did you expect? We told you so upfront."

      --
      SJW: Someone who has run out of real oppression, and has to fake it.
    4. Re:Every time by Anonymous Coward · · Score: 0

      Actually, TrustZone is an excellent architecture. Problem is, most SOC vendors / core software folk do not fully or properly implement the architecture. Because, proper implementation does take quite a bit of time and work. The architecture does include features that could have prevented the said attack.
      Experience says, managerial and product schedules are usually to blame. Ask me how I know...

    5. Re:Every time by glitch! · · Score: 1

      Better yet, don't name your product with a name that can later become ironic, like "TrustZone." Try naming your product "ShitStorm" or "ClusterFuck" instead.

      "With a name like that, you know it HAS to be good!" Like that old Saturday Night Live skit where they come up with bad names for the jelly. "Fruckers! You know it must be good!" Followed by "Monkey Pus!", "Painful Rectal Itch!", and "Death Camp! Look for the barbed wire on the label!" Then "10,000 Nuns and Orphans! What's wrong with that? They were eaten by rats!"

      --
      A dingo ate my sig...
    6. Re:Every time by TechyImmigrant · · Score: 1

      Other large semiconductor companies seem to be able to implement a secure enclave structure with dynamic voltage wobbling and managed to take fault injection seriously and they don't have these problems. It's heavy lifting to do a proper job, but in the case of RSA, it really isn't. Just sprinkle in some TMR and integrity testing with maybe a rail monitor and you will be good. I wonder why that isn't a part of TrustZone as standard. It should be.
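As a rough software analogy for the TMR (triple modular redundancy) and integrity testing mentioned above, here is a minimal sketch. Real TMR is done in hardware with three independent copies of the logic; this version just repeats a computation three times and takes a majority vote, then checks an RSA signature before releasing it. The function names are hypothetical and the key is the usual textbook toy (p=61, q=53), not anything from TrustZone.

```python
from collections import Counter


def tmr(compute, *args):
    """Run compute three times and return the majority result.

    Raises if no two runs agree, which suggests a fault was injected.
    """
    results = [compute(*args) for _ in range(3)]
    value, votes = Counter(results).most_common(1)[0]
    if votes < 2:
        raise RuntimeError("no majority: possible fault")
    return value


def checked_rsa_sign(m, d, e, n):
    """Sign m, then verify with the public exponent before releasing the result."""
    s = pow(m, d, n)
    if pow(s, e, n) != m % n:
        raise RuntimeError("integrity check failed: refusing to release signature")
    return s


if __name__ == "__main__":
    # Textbook toy key: n = 61 * 53, e = 17, d = 2753.
    n, e, d = 3233, 17, 2753
    sig = tmr(checked_rsa_sign, 65, d, e, n)
    print("signature verified and released:", pow(sig, e, n) == 65)
```

The point of the verify-before-release step is that a faulted signature never leaves the secure side, so an attacker does not get a bad block to compare against a good one.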

      You are right. Management can't see security problems when you're building it and unless they've had some bitter experience, they don't know how to cost it.

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    7. Re:Every time by gtall · · Score: 1

      I think if we could just have pink unicorns, we could ask their magical advice on how to design new processors.

    8. Re: Every time by Anonymous Coward · · Score: 0

      You racist against white ponies bro?

      WhitePonyLivezMatter

    9. Re:Every time by arglebargle_xiv · · Score: 1

      Actually, TrustZone is an excellent architecture.

      TrustZone is a terrible architecture. It started as a hash-for-secure-boot and then had more and more crap bolted onto it without rhyme or reason as the marketing folks sold it as all things to all people, with most of what was bolted on only partly finished or debugged, if that. The OP's suggestion that it be rebranded as ClusterFuck isn't too far off the target, because that's what it's turned into.

    10. Re: Every time by Anonymous Coward · · Score: 0

      What about "Crack Spackle?"
      That was a good one, and it's about as helpful as a "trust zone."

    11. Re:Every time by Anonymous Coward · · Score: 0

      Would the ClusterFuck products have hot and cold aisles? Hopefully ShitStorm products are not required to access them for maintenance.

    12. Re:Every time by Anonymous Coward · · Score: 0

      Exactly so.

      Causing bad data is a far cry from causing specific data to be created on a computer you do not otherwise have full control of -which is what would be needed in order to gain control of the system.

      Still, it is a first step.

    13. Re: Every time by Anonymous Coward · · Score: 0

      #FluttershyIsBestPony

    14. Re:Every time by Anonymous Coward · · Score: 0

      It is better to let software handle this; hardware is more expensive and software can be changed more easily.

    15. Re:Every time by denis-The-menace · · Score: 0
      --
      Obama's legacy: (N)othing (S)ecure (A)nywhere and (T)error (S)imulation (A)dministration
  2. Would the Rust programming language help? by Anonymous Coward · · Score: 0

    Would using the Rust programming language help to avoid these problems that are happening between two different threads of execution? As is stated on the Rust web site front page, one of Rust's benefits is that it has "threads without data races" and it also has "guaranteed memory safety". Both of those sound like they could go a long way toward helping prevent one thread of execution interfering with another.

    1. Re:Would the Rust programming language help? by Anonymous Coward · · Score: 1

      No, this is a hardware design flaw. It really has nothing to do with security other than it causing a security issue as a byproduct. Really there is no reason raising and lowering the voltage should flip bits at all other than someone made a big boo boo in the design.

    2. Re:Would the Rust programming language help? by radish · · Score: 5, Informative

      Not in this case. Rust (and similar programming approaches) prevent accidental interference between threads (of the same application) at the code execution layer - i.e. they prevent bugs due to programming errors. This attack is happening at the hardware level - the threads in question could be completely different applications and could be written in any language.

      --

      ---- One button is the power button, the other is the Bender voice button "Bite My Shiny Metal Ass"

    3. Re:Would the Rust programming language help? by jellomizer · · Score: 1

      No, Rust isn't a magical device that makes all your computers secure.
      It does help enforce better coding practices to make your code more secure.
      However on some level the point of the programming language is to interact with the system hardware.
      An immutable data type will prevent threads from changing the data. However, it doesn't stop the CPU from changing the data in the value, because something needs to clear the memory when the variable is no longer needed (such as on leaving the scope or at program end).

      --
      If something is so important that you feel the need to post it on the internet... It probably isn't that important.
    4. Re:Would the Rust programming language help? by Hognoxious · · Score: 1

      Could it be stopped by making appropriate amendments to the Code of Conduct?

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    5. Re:Would the Rust programming language help? by arglebargle_xiv · · Score: 2

      No, you don't understand. You're thinking of Rust as a programming language when in fact it's a religion. Every time there's some post about bugs, flaws, or bad code, the Rust flakes turn up to enlighten the heathens with their religion/language with its guarantee of perfect, error-free code and operation. All hail the mighty Rust! Lead us into the light of your perfection!

    6. Re:Would the Rust programming language help? by Anonymous Coward · · Score: 0

      +100 best comment. -PCP

    7. Re: Would the Rust programming language help? by Anonymous Coward · · Score: 0

      Nope, yet the legacy 'rust as an oxidation service' could potentially prevent proper execution of this attack vector.

  3. Easy fix by Anonymous Coward · · Score: 4, Insightful

    Don't allow non operating system code to muck with the system clock. Problem solved. Why would this functionality ever be exposed? This is something that non-OS code should NEVER be able to do.

    1. Re:Easy fix by Anonymous Coward · · Score: 0

      Except that doesn't address the issue. It still allows a lower privilege level to get information about/control over a higher privilege level, which is the problem. I'm not an ARM guy by trade, but in x86 parlance it really shouldn't matter whether it's possible to trick a processor to switch execution to system management mode from ring-3 or ring-0. It's privilege escalation.

    2. Re: Easy fix by sound+vision · · Score: 1

      Even if the program isn't given direct access to change the speed and voltage, it can trigger those changes indirectly.

    3. Re:Easy fix by thegarbz · · Score: 1

      Non-OS code doesn't need that capability directly; it can still get those actions performed by proxy.

    4. Re:Easy fix by gweihir · · Score: 1

      And then somebody hacks the OS and can compromise the Trust Zone anyways. No, what we need to do is secure the OS, because this is just one more case where anybody that owns the OS owns everything.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    5. Re:Easy fix by Anonymous Coward · · Score: 0

      It isn't a privilege escalation, it's creating faulty access to TrustZone so that bad crypto blocks are generated. This gives the attacker known information (plaintext + bad crypto block + good crypto block) allowing them to brute force the key used in the crypto.

    6. Re:Easy fix by BadDreamer · · Score: 1

      Problem not solved, because breaking TrustZone means breaking the machine BENEATH the OS level.

  4. You're kidding right? by Ayano · · Score: 3, Interesting

    These Goldilocks voltages will vary by small margins.. too small to be accurately predicted for an actual attack.

    TFA tries to make the argument that this physical hack can be done remotely despite the highly controlled conditions by relying on the power and energy management utilities...

    Now i've got news as an embedded developer, that sh*t isn't accurate for anything this sensitive.

    --
    I don't read AC
    1. Re:You're kidding right? by Anonymous Coward · · Score: 0

      ISIS has claimed responsibility for hacking your power management. We need more fascist pig sh1t to keep us safe and secure.

      ae911truth dot org

    2. Re:You're kidding right? by Anne+Thwacks · · Score: 5, Interesting
      This is the same, or very similar, to an Intel bug described about a month ago:

      The issue in both cases is either:

      a) The device can be set to operate under conditions that are known to cause it to be unreliable (be out of spec)

      b) The device fails to operate reliably when operated within spec

      If it is (a) then perhaps the manufacturer should test devices more thoroughly - and then blow fuses to limit operation within spec. If it is (b) the manufacturer should test the devices more thoroughly.

      You may know that (e.g.) Intel sells processors "locked" to prevent overclocking. This prevents (a). It obviously fails to prevent (b). Either the manufacturer chose not to lock the device (or the buyer chose not to buy locked ones) and the user was "free" to use the devices out of spec, or the article describes devices where the tests were inadequate.

      In reality, device performance is not consistent within a batch, and devices are sorted for performance - hence processors with different speed and power options. This has been true since the beginning of TTL. As devices have higher part count (see Moore's law) they have a higher probability of failure - since there are more failure modes, there is a much higher time-to-test. Time to test maps directly to device cost. Because time-to-test adds to cost, semiconductor devices are not tested 100%*: some parameters are, and others are only sampled to ensure that the tests are identifying the duds. The problem here is that the parameters tested by sampling may not be as reliably characterised as they are believed to be. If you assume that (for example) all static ram cells in the chip have essentially the same logic levels and speeds within a certain margin, and that margin has a wider spread between devices under circumstances that have not been identified, then testing some sample registers won't tell you that others are not reliable on chips with this unknown and unidentified problem.

      Complexity does not scale linearly with transistor count - it is partly that, but it also scales with number of modules, module complexity, and number of interfaces between modules (the hardware equivalent of APIs, not API instances). A more complex CPU has more of all three of these factors. Any way you look at it, a more complex chip will be more likely to fail in modes that are hard to identify.

      About 15 years ago, I was part of a team that identified a problem in a CPU of fairly low complexity caused by data leakage between pipeline stages in a processor used in safety critical applications (AFAIK, no one actually died as a result of these failures). These failure modes are very hard to find. This one took about a man-year of very expensive engineers using very expensive equipment.

      I predict that Moore's law will eventually hit the Thwacks Barrier: Processor complexity will reach the stage where a processor cannot be adequately tested within a timescale that makes it worth producing.

      I therefore hereby, formally pronounce that testability will be the barrier that ends Moore's law.

      *Some /. users who are old enough to afford lawns may recall the National Semiconductor Mil-spec scandal: devices were sold as 100% tested when in fact they were only sampled, because the failure rate was "very low". No aircraft carriers or space rockets were actually lost, but crimes were found to have been committed anyway.

      --
      Sent from my ASR33 using ASCII
    3. Re:You're kidding right? by Zorpheus · · Score: 1

      You can run any number of tries though, until you manage to change a bit.
      I don't know, but you can probably also use any number of tries of getting a corrupted trustzone key?
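The "any number of tries" point can be put in numbers. A minimal sketch, assuming each attempt is independent and succeeds with the same probability p (both are assumptions; the 1% figure is made up for illustration): the chance of at least one success in n attempts is 1 - (1 - p)^n.

```python
import math


def p_at_least_one(p, n):
    """Probability of at least one success in n independent attempts."""
    return 1 - (1 - p) ** n


def attempts_needed(p, target):
    """Smallest n for which the chance of at least one success reaches target."""
    return math.ceil(math.log(1 - target) / math.log(1 - p))


if __name__ == "__main__":
    p = 0.01  # hypothetical per-attempt chance of a useful bit flip
    for target in (0.5, 0.9, 0.99):
        n = attempts_needed(p, target)
        print(f"{target:.0%} chance of success needs about {n} attempts")
```

So even a glitch that works only rarely becomes close to certain if nothing limits or notices the retries.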

    4. Re:You're kidding right? by Anonymous Coward · · Score: 0

      There is one area where this would apply perfectly:

      Situations where local access is forbidden or restricted. A.K.A DRM. *

      As all of that data is handled locally, including access restrictions and authorization, successfully flipping a bit or two is all that's needed to get what the attacker wants. In particular, this type of attack is bad for DRM implementations where the actual encryption is done once and the decryption key is then encrypted via some other mechanism. One successful compromise, compromises the data for everyone, so this type of attack (if it can be repeated reliably), ensures that all such implementations will fail regardless as to how well the code is written.

      Just another example of DRM failing utterly at what it's designed to do, and the reason why these types of hacks can be so beneficial. Even if you can't hack the Pentagon with this, you can easily get past the restrictions on media.

      * Of course this type of hack is also good for getting past locked bootloaders, bypassing forgotten passwords, jailbreaking a machine, etc.

    5. Re:You're kidding right? by Anonymous Coward · · Score: 0

      Hanging by a thread but posts like this are why I keep coming back to /.

    6. Re:You're kidding right? by izzo+nizzo · · Score: 1

      This is fascinating. I'm curious if the bulk of the testing techniques are things that could eventually be automated. If AI could bite off some of the burden, perhaps the chips could still be tested in a feasible time frame.

  5. Geez by Anonymous Coward · · Score: 0

    It's getting to the point where I kind of yawn when the daily data heist is posted. Like Equifax's sheer idiocy and incompetence is not surprising any more.

    But this one, yeah, it's so devious that it makes you sit up and say oh shit, is there really anything that can be made practically secure these days? We rest on our laurels by saying 1024 bits is perfectly fine until quantum computers come around, yada yada.

    But this one is like taking down Hoover dam by twisting a screwdriver in one strategic crack. It's using physics to coax math.

    With a ton of smart, dedicated crackers all around the world, how many truly devious exploits have been thought up for attacking secure enclave, for example?

  6. Sefcurify by JBMcB · · Score: 2

    It would all be more secure if there were a backdoor engineered into the design so the government could have unfettered access to our data. You know, to make sure it's secure.

    --
    My Other Computer Is A Data General Nova III.
  7. Targeted by JBMcB · · Score: 1

    It would probably work for a *very* targeted attack. A specific rev of a specific device running a specific OS.

    Useful for spooks, not much for anyone else. There were a bunch of these kinds of hacks in the NSA leaks - a MITM attack given a specific version of Apache, OpenSSL, and a specific version of a particular web browser.

    --
    My Other Computer Is A Data General Nova III.
    1. Re: Targeted by Entrope · · Score: 1

      The voltage variations in question are driven by the random defects in the silicon and in the fabrication process, not so much the CPU design or the OS (or even firmware) running on the chip.

    2. Re: Targeted by Anonymous Coward · · Score: 1

      But it's the same kind of vulnerability where you take advantage of a race condition and multithreading. In software, you set up some handlers to catch the segmentation faults, and whatever. But just keep trying again and again until you get lucky.

  8. Broken Hardware by Anonymous Coward · · Score: 1

    If the power management can change the state of the processing engine then the power management methodology is broken. There should be no way to flip bits or change any of the processing state by manipulating the power state. That it is possible shows a serious flaw in the design.

    1. Re:Broken Hardware by Anonymous Coward · · Score: 0

      The ARM manufacturers must have gone all sub-threshold before the circuit designs can take it. And they like it.

  9. RTFM by Anonymous Coward · · Score: 0

    Don't allow non operating system code to muck with the system clock. Problem solved. Why would this functionality ever be exposed? This is something that non-OS code should NEVER be able to do.

    RTFM, idiot, the clock is manipulated by the order of executed instructions, problem is NOT solved.

    1. Re:RTFM by Anonymous Coward · · Score: 1

      RTFM indeed, it isn't manipulated by the order of the executed instructions but by telling the DVFS system to change the voltage and frequency. So limiting access to the DVFS should mitigate the issue enough such that they'll already have root access on the system before they can mount the attack.

  10. Entrope is an idiot by Anonymous Coward · · Score: 1

    The voltage variations in question are driven by the random defects in the silicon and in the fabrication process, not so much the CPU design or the OS (or even firmware) running on the chip.

    RTFM, idiot:

    Thus any frequency or voltage change initiated by untrusted code inadvertently affects the trusted code execution.

  11. Yes its Targeted by johnjones · · Score: 2

    The claim was made that you cannot manipulate the keys, and clearly that's not the case... The team at Columbia University (Adrian Tang, Simha Sethumadhavan, and Salvatore Stolfo) deserves credit for showing that was not always the case...

    I wonder how many side attacks the PLA have...

    john.jones.name

  12. It was hackers that hacked by hacking with hacks! by Anonymous Coward · · Score: 0

    Thank you so much for being your lovable talented self, EditorDavid.

    Now if you refrain from tacking on empty scareword headlines while trying to find content we might actually have some indication the linked article might be worthy of reading instead of clickbait.

    But then, you're all about the clickbait, aren't you?

  13. Rowhammer all over again by Dwedit · · Score: 2

    This looks just like Rowhammer all over again. Flipping bits by messing with something nearby.

    1. Re:Rowhammer all over again by viperidaenz · · Score: 1

      It's flipping bits by gaining root access, profiling the system, crashing it many times in the process, then messing with something nearby.

    2. Re:Rowhammer all over again by swillden · · Score: 1

      It's flipping bits by gaining root access, profiling the system, crashing it many times in the process, then messing with something nearby.

      True, but that doesn't mean it's not bad.

      The whole point of TrustZone and similar technologies is to provide a place for computations that you wish to remain secure even in the event of complete compromise of the main operating system. Note that I'm not claiming that the attack is practical, it may or may not be sufficiently automatable to carry out remotely, on a large number of devices. That's for future research to determine. But it does make me nervous (my main project for the last four years is an Android subsystem that runs in TrustZone, SGX, etc.).

      Well, I should say it would make me nervous if there weren't much easier ways to attack TrustZone already, due not to weaknesses in TrustZone but to the operating systems that run there.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  14. If that's all... by johannesg · · Score: 1

    I'm actually more terrified by the notion that activities in one core can cause bits to flip for another, completely randomly. We have a _lot_ of important stuff riding on the correct calculations happening in all those CPUs, worldwide, and the idea that you can pretty trivially cause random results is not a happy one.

    1. Re:If that's all... by Anonymous Coward · · Score: 0

      I'm actually more terrified by the notion that activities in one core can cause bits to flip for another, completely randomly. We have a _lot_ of important stuff riding on the correct calculations happening in all those CPUs, worldwide, and the idea that you can pretty trivially cause random results is not a happy one.

      Yes.. that is why you should not vary voltage beyond spec :)

  15. Not too hard to fix by Anonymous Coward · · Score: 1

    Making things secure is much harder than breaking into things. Given that, this one is an easy fix. The hypervisor controlling security can make sure the security states are stable before granting access (across DVFS variations). The security software can monitor voltage variations beyond what is allowed and lock down the system/user program (red alert).
    Btw, voltage can be varied from outside without power management commands, to bypass the PM control software. So the best solution is a voltage monitor (most chips have one).
    Lastly, any time you mess with DVFS beyond what the hardware was designed for, you crash the system most of the time. This is not a reliable hack outside a lab-controlled environment.

  16. Not quite so simple. by viperidaenz · · Score: 1

    You need software access to the registers that control the core voltage regulators.
    So you first need to gain root access.
    They changed the DVFS tables to make the SoC run outside its operating areas.

    They had to profile the DVFS operating points for the specific device they used to find the right values to use. The profiling causes device reboots or freezes. Not something you can do without being noticed.

    Step 1: probe DVFS tables, profile system to find points where it causes bit flips without rebooting or freezing.
    Step 2: use performance counters to profile the victim code to find exactly when you need to trigger a fault
    Step 3: load new values to DVFS table
    Step 4: trigger a spin loop at the precise time in a core that shares the same clock-voltage values as the core the victim thread is running in, causing the system to change to the altered voltage/frequency point.
    Step 5: profit?
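Step 1 above amounts to sorting frequency/voltage pairs into "works", "flips bits" and "locks up". A toy sketch of why such bands exist, under heavy assumptions: a register only latches correctly if the clock period is at least the logic delay plus the setup time, and logic delay grows as the supply voltage drops. The delay formula is a generic alpha-power-law shape, and every constant here (k, v_th, alpha, setup_ns, hang_margin_ns) is invented for illustration, not measured from any real SoC.

```python
def path_delay_ns(voltage, k=0.3, v_th=0.4, alpha=1.3):
    """Toy critical-path delay: rises sharply as voltage nears the threshold v_th."""
    return k * voltage / (voltage - v_th) ** alpha


def classify(freq_ghz, voltage, setup_ns=0.05, hang_margin_ns=0.15):
    """Classify one operating point by its timing slack (all thresholds hypothetical)."""
    period_ns = 1.0 / freq_ghz
    slack = period_ns - (path_delay_ns(voltage) + setup_ns)
    if slack >= 0:
        return "stable"      # data arrives before the clock edge
    if slack > -hang_margin_ns:
        return "bit-flips"   # marginal violation: occasional wrong bits
    return "hang"            # gross violation: reboot or freeze


if __name__ == "__main__":
    # Sweep a small grid the way a profiling run might.
    for volts in (1.0, 0.85, 0.7):
        row = [classify(f, volts) for f in (1.0, 1.4, 1.7, 2.5)]
        print(f"{volts:.2f} V:", row)
```

In this model both overclocking at a fixed voltage and undervolting at a fixed frequency push a stable point into the faulty band, which matches the profiling described in the steps: most points either work or crash, and only a narrow band in between is useful.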

    The easiest way to mitigate this is to implement power saving better. Separate all the core frequencies and voltages, like Intel does already. The way it's done in ARM chips seems wasteful to me. Why would you raise the frequency and voltage of 4 cores when you only need one?

    You could also not allow the performance counters to be used to profile code running at a higher privilege level.

    1. Re:Not quite so simple. by Anonymous Coward · · Score: 0

      Your mitigation won't work, in fact it's already implemented like this and it actually helps the attacker, because the attacking thread can run on a different core than the victim thread and make the victim run out of specification while the attacker itself keeps running reliably within specification. A good mitigation would be to redesign the frequency and voltage regulation in such a way that it's impossible to pick an unsafe frequency and that for a given frequency a sufficient voltage will always be selected.
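The mitigation proposed above (never allow an unsafe frequency, and always pair a frequency with a sufficient voltage) could look roughly like the following check. This is a sketch with a made-up table: the frequencies, millivolt floors, and the names `required_mv` and `clamp_request` are all hypothetical, and since the attacker in this scenario already has root, a real check would have to sit in hardware or in a layer the normal OS cannot rewrite.

```python
import bisect

# Hypothetical safe operating table: for each frequency step (MHz),
# the minimum supply voltage (mV) at which the core is specified to work.
SAFE_FREQS_MHZ = [300, 960, 1500, 2000]
SAFE_MIN_MV = [700, 800, 950, 1050]


def required_mv(freq_mhz):
    """Minimum voltage for freq_mhz, using the next table step at or above it."""
    i = bisect.bisect_left(SAFE_FREQS_MHZ, freq_mhz)
    if i == len(SAFE_FREQS_MHZ):
        raise ValueError(f"{freq_mhz} MHz is above the highest specified frequency")
    return SAFE_MIN_MV[i]


def clamp_request(freq_mhz, requested_mv):
    """Return a (frequency, voltage) pair that never undervolts the request."""
    return freq_mhz, max(requested_mv, required_mv(freq_mhz))


if __name__ == "__main__":
    print(clamp_request(1500, 600))   # undervolt request gets raised to the floor
    print(clamp_request(300, 900))    # already safe, passed through unchanged
    try:
        clamp_request(2500, 1200)
    except ValueError as err:
        print("rejected:", err)
```

This sketch only enforces a lower voltage bound and an upper frequency bound; it says nothing about overvolting or about transients during the switch itself.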

      As for the impact of this problem, it's an escalation of privilege from root to TrustZone. In x86 parlance, that would be from ring 0 to ring -2 (or system management mode or whatever you want to call it) so the impact is similar to the Memory Sinkhole problem. I personally as a user actually like these problems because I believe that on a personal computing device, like a desktop computer or a laptop or a phone, there should be no parts that aren't accessible to the end user (when logged in with sufficient privileges) and lower levels below root / ring 0 / call it what you will make it too easy for device manufacturers to shield part of the device's software from the end user in a way that is very hard to get around. I've heard the usual arguments, about microcode patches that the OS shouldn't change and power management and so on, and maybe these things should be made to be hard to tinker with accidentally, but they should be accessible from ring 0, which should be the lowest level. Otherwise it's too easy to hide part of the software from the end user and I consider that a threat to our digital freedoms. Also, if ring -2 is compromised without the end user's knowledge, it's an excellent place to put a rootkit that is almost impossible to discover.

  17. Is anyone beginning to get the idea by Sqreater · · Score: 1

    that you can't have computer security yet? That it is not possible? That what a man can make, a man can take apart?

    --
    E Proelio Veritas.
  18. Semi-practical vs entrenched flaws by EndlessNameless · · Score: 1

    Apps cannot be granted permission to control DVFS, which is necessary to induce the faults, but they can manipulate it because Android responds to the application's load/behavior.

    However, the application has no specific knowledge of the overall system load and therefore it cannot consistently induce faults. The scenario in a lab is probably far, far easier than real life---it eliminates the effect of other apps, network state changes, etc on the power state.

    Very clever proof of concept, but it will take a Herculean effort to turn this into an effective attack in the wild.

    Fixing the problems will require all parties. There are elements under the control of ARM directly as well as the SoC designers. Android may be able to mitigate the issues at the OS level---but I assume that would penalize battery life, system performance, or both.

    --

    ---
    According to the latest ruleset, this post should be modded as Vorpal Flamebait +5.