Slashdot Mirror


Misinterpretation of Standard Causing USB Disconnects On Resume In Linux

hypnosec writes "According to a new revelation by Sarah Sharp, misinterpretation of the USB 2.0 standard may have been the culprit behind USB disconnects on resume in Linux all along rather than cheap and buggy devices. According to Sharp the USB core is to blame for the disconnections rather than the devices themselves as the core doesn't wait long enough for the devices to transition from a 'resume state to U0.' The USB 2.0 standard states that system software that handles USB must provide for 10ms resume recovery time (TRSMRCY) during which it shouldn't attempt a connection to the device connected to that particular bus segment."

29 of 280 comments

  1. A bug in Linux? by Anonymous Coward · · Score: 5, Funny

    Clearly the whole thing is broken, and we should transition to a newer, more open and transparent system than even open source.

    I will call it OPENER Source. You aren't just able to read the source, you're required to read it!

  2. Update by Anonymous Coward · · Score: 5, Informative

    "Update: Looks like this is an xHCI specific issue, and probably not the cause of the USB device disconnects under EHCI. "

  3. linux has bugs? by Anonymous Coward · · Score: 5, Interesting

    Could have fooled me. I end up spending upwards of 3 months a year fixing bugs in the base OS that we ship to run our appliance on. Some of the Linux subsystems still read like they were written in someone's basement, even after a decade of most of the maintainers being paid a yearly salary to maintain them. God forbid you actually fix some of the crap and post fixes that are more than ten lines long, though. It's a fine way to get blacklisted.

    1. Re:linux has bugs? by TheGavster · · Score: 5, Insightful

      It's true. Linus has been quite vocal about whose fault it is when a kernel change breaks an application...

      --
      "Because Science" is one step from "Because old book". Try "Because of my experiment testing my falsifiable assertion".
    2. Re:linux has bugs? by LordLimecat · · Score: 3, Insightful

      I'm not a software dev, but I have never agreed with this:

      If a change results in user programs breaking, it's a bug in the kernel. We never EVER blame the user programs.

      What happens when some program has been using a privilege escalation bug to get around sudo, and it breaks when the kernel is patched to fix the vulnerability? Is that still "a kernel bug", should they not patch it? It seems to me that, yes, you try not to break applications, but this is why you have an official, supported API, and if bad developers want to rely on buggy kernel behavior for their programs, you have to choose between either screwing them, or screwing everyone else.

      If anyone can enlighten me as to why that's wrong, I'd appreciate it.

    3. Re:linux has bugs? by Anonymous Coward · · Score: 5, Insightful

      For all practical purposes there's no way, I repeat, no way to "heat the whole apartment block" to eradicate bed bugs.

      So you don't know what you are talking about.

      If anyone can enlighten me as to why that's wrong, I'd appreciate it.

      1. syscall returns -EFOO to report condition A
      2. hmm, someone notices that -EFOO is too generic. That syscall should return the more specific -ECOND_A_ERROR instead. They change it.
      3. ALL SOFTWARE suddenly works *different* and perhaps does not work at all on the modified kernel that uses that syscall vs.older kernel.

      Kapish??

      Do not change API internals. Fixing undocumented features (ie. bugs, like overflows) is one thing. Modifying documented and established API on a whim is a bad bad bad thing.

      If you want to modify it like that, you do the following,

      1. syscall returns -EFOO to report condition A
      2. hmm, someone notices that -EFOO is too generic. That syscall should return the more specific -ECOND_A_ERROR instead. SO MAKE A NEW SYSCALL THAT RETURNS CORRECT! Leave old one as deprecated for removal in some years.
      3. ALL SOFTWARE continues to work.

      If #2 is too much effort for reward, then do nothing. But above all, do not break userland with kernel changes. Ever.
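
      A minimal sketch of the two lists above, written as a Python toy rather than kernel C. EFOO and ECOND_A_ERROR are the made-up names from this comment, mapped here onto arbitrary numbers; old_syscall, new_syscall and legacy_app are hypothetical as well.

```python
# Toy model of the point above: changing an established error code breaks
# callers that were written against the old one. All names and numbers here
# are made up for illustration; this is not real kernel code.

EFOO = 71            # hypothetical generic error number
ECOND_A_ERROR = 72   # hypothetical, more specific error number


def old_syscall(condition_a: bool) -> int:
    """Established behaviour: report condition A as -EFOO."""
    return -EFOO if condition_a else 0


def new_syscall(condition_a: bool) -> int:
    """Separate, newer entry point: report condition A as -ECOND_A_ERROR."""
    return -ECOND_A_ERROR if condition_a else 0


def legacy_app(syscall) -> str:
    """An old program that only knows how to recover from -EFOO."""
    rc = syscall(True)
    if rc == 0:
        return "ok"
    if rc == -EFOO:
        return "recovered"
    return "crashed"


# The old program on the unchanged syscall keeps working.
print(legacy_app(old_syscall))   # recovered
# The same program if the old entry point silently started returning the new code.
print(legacy_app(new_syscall))   # crashed
```

      New programs can call the new entry point and test for the specific code, while old binaries never see it.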

    4. Re:linux has bugs? by hedwards · · Score: 5, Informative

      Basically, the kernel has an Application Binary Interface which is a bit like a contract. If the application gives the kernel something formatted in a specific way, the kernel promises to give it back something in a specific way and the other way around. Any software that is written to respect the contract should never be broken by a change to the kernel as the application has no knowledge of how the kernel performs its obligation.

      Changes to the ABI are not supposed to be common events. They're supposed to be changed only when lesser changes can't work. FreeBSD handles it using compatibility libraries which maintain the ABI for various kernel revisions so that applications can continue to use older ones if need be. AFAIK, Linux doesn't do that, and as a result, the kernel maintainer and the developers writing the code have to be even more careful about changes made not messing up the ABI.

      Also, because Linux is just a kernel without a userland, a change to the Linux kernel that was permitted to break the ABI could hose all of the distros all at once requiring the rewrite of hundreds of little bits of software that are cobbled together to make the distros function as complete OSes.

      There's more to it, but that's basically why Linus takes the stance that the kernel is to blame and not the developer. But, he undoubtedly doesn't consider it to be the kernel's fault if a developer does things that don't comply with the normal ABI specifications.

    5. Re:linux has bugs? by You're+All+Wrong · · Score: 3, Insightful

      Strange. I'll admit that Linux has some drivers that are full of bugs, but the ones that are most full of bugs seem to be ones thrown over the wall by large hardware vendors. You know the ones - the drivers with 20000-line C files, that create 2000 checkpatch warnings. Those drivers were written by salaried employees, not sitting in their basement.

      --
      Your head of state is a corrupt weasel, I hope you're happy.
    6. Re:linux has bugs? by minus9 · · Score: 3, Insightful

      Yet you still choose it as the base OS to run your appliance on. Presumably it's still better than any of the alternatives.

      All software has bugs. I'm sure plenty of device drivers written in brightly lit offices by people with smart haircuts and shiny shoes have some absolutely horrific code too. I doubt which floor you work on has any significant effect.

    7. Re:linux has bugs? by Forever+Wondering · · Score: 4, Informative

      The ABI has been changed upon occasion. If a struct passed to a syscall (or ioctl) has some spare option bits that were reserved (and therefore zero), using one of them can be the way to go (e.g. turning on the bit indicates that the program is aware of the new semantics).

      Otherwise, a new syscall (or ioctl) number is assigned. For example, the stat syscall originally had a syscall number of 18. When the "struct stat" was modified [added some new fields and/or expanded the size of others], the syscall number was bumped to 106. Old programs that were not recompiled issued stat with 18 and worked unchanged. If you recompiled, you got syscall 106 and the new semantics.
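
      Both techniques can be sketched with a toy dispatcher in Python. The numbers 18 and 106 are the ones quoted above; the handler names, result fields, FLAG_NEW_SEMANTICS and sys_example are hypothetical stand-ins, not the real kernel code.

```python
# Toy illustration of the two compatibility techniques described above.
# Only the syscall numbers 18 and 106 come from the comment; every other
# name and value is a simplified, hypothetical stand-in.
import errno

SYS_OLDSTAT = 18
SYS_STAT = 106


def sys_oldstat(path):
    # Old, smaller result layout, frozen for binaries that were never rebuilt.
    return {"size": 1234}


def sys_stat(path):
    # Newer layout with extra fields (field names are illustrative).
    return {"size": 1234, "blocks": 8, "blksize": 4096}


SYSCALL_TABLE = {SYS_OLDSTAT: sys_oldstat, SYS_STAT: sys_stat}


def do_syscall(number, *args):
    """Dispatch on the syscall number; unknown numbers fail with -ENOSYS."""
    handler = SYSCALL_TABLE.get(number)
    if handler is None:
        return -errno.ENOSYS
    return handler(*args)


# Technique 2: a formerly reserved bit lets a caller opt in to new semantics.
FLAG_NEW_SEMANTICS = 0x1          # hypothetical flag bit
KNOWN_FLAGS = FLAG_NEW_SEMANTICS


def sys_example(flags, value):
    if flags & ~KNOWN_FLAGS:
        return -errno.EINVAL      # still-reserved bits must stay zero
    if flags & FLAG_NEW_SEMANTICS:
        return value * 2          # placeholder for the new behaviour
    return value                  # placeholder for the old behaviour


print(do_syscall(SYS_OLDSTAT, "/tmp"))  # old binary, old layout
print(do_syscall(SYS_STAT, "/tmp"))     # recompiled binary, new layout
```

      Rejecting unknown flag bits is what keeps the remaining reserved bits usable for later extensions.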

      --
      Like a good neighbor, fsck is there ...
    8. Re:linux has bugs? by iserlohn · · Score: 3, Interesting

      I'm not sure which world you live in, but leading the project which produces the OS kernel that is used in more computing devices than any other - well, that's not a bad result really.

  4. Maybe not all the disconnects? by AdamHaun · · Score: 5, Informative

    Sarah's Google+ post has an update:

    Update: Looks like this is an xHCI specific issue, and probably not the cause of the USB device disconnects under EHCI. To everyone who commented with other USB issues (none of which really sounded related), please email the linux-usb mailing list with a description of your issue.

    --
    Visit the
  5. USB sucks by Skapare · · Score: 4, Informative

    USB as a whole is already a silly design, having all these silly details and ambiguities. For example, where it has a minimum time (10ms in this case), it should also have a maximum time (for example 50ms). Devices should be able to communicate after that maximum time or they are broken. Actually, there should be a maximum time when powered up ... how is a minimum even useful for anything?

    This only needs to specify controller communication, not device function. For example a hard drive might take several seconds to spin up and get in sync. But the controller should be able to do basic communication in 50ms, even if all it can say about the actual hard drive is "spinning up but not ready". USB has a lot of other stuff that is far from the KISS principle.
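
    A rough sketch of that split between controller communication and device function, as a Python toy. UsbDriveBridge, its status method and the 3-second spin-up time are all hypothetical; nothing here is taken from the USB spec.

```python
# Toy model of the suggestion above: the bridge controller answers a status
# query immediately, even while the drive behind it is still spinning up.
# Class name, method name and timings are hypothetical.


class UsbDriveBridge:
    def __init__(self, spin_up_ms: int):
        self.spin_up_ms = spin_up_ms

    def status(self, ms_since_power_on: int) -> str:
        """Cheap query that never blocks on the mechanical drive."""
        if ms_since_power_on < self.spin_up_ms:
            return "spinning up but not ready"
        return "ready"


bridge = UsbDriveBridge(spin_up_ms=3000)
print(bridge.status(50))     # spinning up but not ready
print(bridge.status(3000))   # ready
```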

    --
    now we need to go OSS in diesel cars
  6. Re:not surprising by Anonymous Coward · · Score: 4, Informative

    You just said power management worked well on windows 98 and 95.

    I am calling you a liar.

  7. Re:not surprising by TheGavster · · Score: 4, Informative

    15+ years is a stretch. Even in the 2006-07 era at the end of XP's development, there were brand new machines that couldn't return from sleep correctly. It was particularly vexing since a lot of them were laptops factory configured to sleep when left unattended. I will say that I haven't had any complaints with S3 sleep since the advent of Windows 7, however.

    --
    "Because Science" is one step from "Because old book". Try "Because of my experiment testing my falsifiable assertion".
  8. Resume? What's that? by UltraZelda64 · · Score: 4, Insightful

    Back in the mid-1990s to the mid-2000s when I used Windows, I realized sleep mode was a complete joke, unreliable, and just stopped using it by the time I upgraded to Windows XP or shortly after. In Linux, I am still not a fan of waiting for the damn thing to "wake up" for 5-10 seconds before it will even accept my password, so the only component that ever even enters standby on my machines is the monitor (and this has been the case for over a decade, even dating back to my last years in Windows). Windows, Linux--doesn't matter what the OS is, not putting the system into standby makes the whole experience much smoother, faster and hassle-free.

    On the other hand, though--it is a good thing this was fixed for those laptop users out there.

  9. Re:not surprising by Qzukk · · Score: 3, Informative

    Spoken like someone who's never had to reboot a computer from coma mode.

    --
    If I have been able to see further than others, it is because I bought a pair of binoculars.
  10. Welcome to EE by Ignacio · · Score: 4, Insightful

    The 10ms is for the software. The flip side of this is that the hardware has a maximum of 10ms to get its shit together so that it can be connected to. And 10ms is forever in hardware.

    1. Re:Welcome to EE by Ignacio · · Score: 5, Funny

      The 10ms is for the software. The flip side of this is that the hardware has a maximum of 10ms to get its shit together so that it can be connected to. And 10ms is forever in hardware.

      Dear Linux kernel, i'll be ready when my disk is done spinning up. kthanksbye

      Dear USB hard drive, that's fine, but don't go and disconnect from the USB bus in the meantime. Forever waiting, Linux kernel.

  11. Re:not surprising, since there are few docs by dltaylor · · Score: 5, Informative

    Far too many vendors are only willing to provide chip documentation under a Non-Disclosure Agreement (NDA), which prevents a knowledge-based, as opposed to empirically derived, Linux driver. This allows them to kludge around chip deficiencies in a Windows driver without the user being aware of any issues. Even Intel has started making it harder to get the real manuals for their CPUs and bridges (they used to ALL be published on Intel's FTP and HTTP sites). Frequently, in System-on-Chip (SoC) implementations, even the CHIP vendors don't know anything; they just pass along whatever quick and dirty proof of concept the designers of some feature of the chip provided and call it a "working driver", while it is nothing that would pass even a cursory QA process.

    The first Linux code I wrote was a "quirk" handler for a parallel ATA PCI chip that came up programmed to the same default I/O addresses as the South Bridge's internal ports, and a BIOS that didn't properly perform PCI enumeration on it, since it already had PCI addresses.

  12. So I can close my laptop now? by kriston · · Score: 3, Informative

    So I can close my laptop now instead of carrying it around like a sort of open pizza box for fear of never having a working mouse until the next reboot? How annoying to start a meeting by rebooting a Linux laptop.

    --

    Kriston

    1. Re:So I can close my laptop now? by tlhIngan · · Score: 4, Insightful

      Why do you need a mouse ?

      Because most laptops generally have terrible pointing devices. If they have touchpads, they're usually far too tiny to be useful (Apple ones excluded - why can't others put big ass touchpads on their laptops?)

      The rubber trackpoint ones are nice for PCs, though the rubber tips wear down way too quickly and you end up with a slippery lump in short order.

      And practically all are pathetic at scrolling. Unless it's an Apple trackpad where the double finger scroll works (once you fix the ()#@% scroll direction).

      Life's just generally easier with an external pointing device.

  13. Re:not surprising by philip.paradis · · Score: 4, Funny

    The other one is my wife's so it isn't running Linux.

    My wife's laptop is running Debian 7. What's up with your wife? :)

    --
    Write failed: Broken pipe
  14. It explains a lot! by gagol · · Score: 3, Insightful

    That is why my laptop sd-card reader is not working when I close the lid... until reboot. F!*$&%" usb...

    --
    Tomorrow is another day...
  15. Re:Misinterpretation *By Linux* by FrankSchwab · · Score: 4, Insightful

    There is no ambiguity in the USB spec, and Sarah has an incorrect interpretation. The spec requires that the host provide at least 10 ms of recovery time coming out of suspend; a device is required to be able to communicate after this minimum time. Any device which isn't ready for communications after 10 ms of resume recovery time is broken. A host is permitted to provide more than this, but isn't required to.

    So, yes, it's perfectly valid for the host to blindly attempt to communicate with the device after 10 ms - presuming that the host KNOWS precisely when the recovery period began. If the host requested that the bus resume, set a timer for 10 ms, and then tried talking, the HOST is at fault because it didn't check with the hardware as to when the resume period began. I think the 17 ms that they reference in the article is related to this - there is a delay between the request to resume the bus and the actual time that the hardware does resume the bus, so they were trying to talk with devices before the 10 ms period was up.

    The device is perfectly within the spec if it ignores communications prior to 10 ms, or if it responds to them - it has complete flexibility. After 10 ms, however, it MUST be ready to communicate.
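
    The timing argument above can be sketched as a toy timeline in Python. TRSMRCY is the 10 ms figure from the spec; the 7 ms hardware latency and all function names are assumptions chosen for illustration, not measurements or kernel code.

```python
# Toy timeline (all times in milliseconds) for the argument above.
# TRSMRCY_MS is the 10 ms resume recovery time; the 7 ms latency between
# requesting a resume and the hardware finishing it is a made-up number.

TRSMRCY_MS = 10


def naive_first_access(resume_requested_at: int) -> int:
    """Start the 10 ms timer when software *requests* the resume."""
    return resume_requested_at + TRSMRCY_MS


def correct_first_access(resume_completed_at: int) -> int:
    """Start the 10 ms timer when hardware reports the resume as done."""
    return resume_completed_at + TRSMRCY_MS


def device_must_respond(access_at: int, resume_completed_at: int) -> bool:
    """A compliant device only has to answer once recovery time has elapsed."""
    return access_at - resume_completed_at >= TRSMRCY_MS


requested_at = 0
completed_at = requested_at + 7   # hypothetical hardware latency

# Host that timed from the request: the device may legitimately stay silent.
print(device_must_respond(naive_first_access(requested_at), completed_at))    # False
# Host that timed from actual resume completion: the device must answer.
print(device_must_respond(correct_first_access(completed_at), completed_at))  # True
```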

    --
    And the worms ate into his brain.
  16. Re:not surprising by Barsteward · · Score: 3, Insightful

    I've got a laptop that blue screens when you pull out the power cord.

    If I remember correctly, a lot of the power management problems are due to the manufacturers not implementing the standards correctly; they implemented them to the broken Windows implementation in order to keep Windows working.

    --
    "The hands that help are better far than lips that pray." - Robert Ingersoll (1833-1899)
  17. Re: not surprising by FireFury03 · · Score: 4, Interesting

    FWIW, I now have a policy of avoiding Acer like the plague and advising my customers to do the same, owing to their appalling customer support when advised that an entire product line had a BIOS bug.

    http://www.nexusuk.org/~steve/acer.xhtml

    TL;DR: one of their lines of laptops has a DSDT bug. I informed them, they weren't interested. I even sent them a patch, still not interested (and decided that completely ignoring my emails was the best approach). To this date they haven't released an updated BIOS.

  18. Re:not surprising by Alsee · · Score: 4, Funny

    Oh heavens, it must be happening again. I'm obviously experiencing a relapse of those terrible hallucinations that have plagued me for years.
    Oh, wait, she's real after all.

    Dude, you posted a photo of a laptop sitting on the armrest of an empty couch ;)

    --
    - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
  19. Re: not surprising by Muad'Dave · · Score: 4, Funny

    ... stubbornly refuses to sleep with win7. Works fine with Linux.

    At least she has some standards.

    --
    Tiller's Rule: Never use a word in written form that you've only heard and never read. You will end up looking foolish.