Misinterpretation of Standard Causing USB Disconnects On Resume In Linux
hypnosec writes "According to a new revelation by Sarah Sharp, misinterpretation of the USB 2.0 standard may have been the culprit behind USB disconnects on resume in Linux all along rather than cheap and buggy devices. According to Sharp the USB core is to blame for the disconnections rather than the devices themselves as the core doesn't wait long enough for the devices to transition from a 'resume state to U0.' The USB 2.0 standard states that system software that handles USB must provide for 10ms resume recovery time (TRSMRCY) during which it shouldn't attempt a connection to the device connected to that particular bus segment."
Sarah's Google+ post has an update:
"Update: Looks like this is an xHCI specific issue, and probably not the cause of the USB device disconnects under EHCI."
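For anyone wondering what "provide for 10ms resume recovery time" looks like in practice, here is a rough userspace sketch in C. It is not the kernel's actual code or Sharp's fix; the names `now_ms`, `resume_recovery_remaining_ms` and `wait_for_resume_recovery` are made up for illustration, and the only value taken from the summary is the 10 ms TRSMRCY figure. The idea is simply that software records when resume signaling finished and refuses to touch the device until the window has passed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* TRSMRCY: resume recovery time quoted in the summary, in milliseconds. */
#define TRSMRCY_MS 10

/* Hypothetical helper: current time in ms from a monotonic clock. */
static int64_t now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* How many ms must still pass before the device may be accessed.
 * resume_done_ms is the time at which resume signaling ended. */
static int64_t resume_recovery_remaining_ms(int64_t resume_done_ms, int64_t now)
{
    int64_t elapsed = now - resume_done_ms;
    if (elapsed < 0)
        elapsed = 0;            /* treat a timestamp in the future as "just resumed" */
    return elapsed >= TRSMRCY_MS ? 0 : TRSMRCY_MS - elapsed;
}

/* Block until the recovery window has elapsed. The loop re-checks the
 * clock, so an early wake-up from nanosleep (e.g. a signal) is harmless. */
static void wait_for_resume_recovery(int64_t resume_done_ms)
{
    for (;;) {
        int64_t remaining = resume_recovery_remaining_ms(resume_done_ms, now_ms());
        if (remaining == 0)
            break;
        assert(remaining <= TRSMRCY_MS);   /* keeps tv_nsec below one second */
        struct timespec delay = { .tv_sec = 0, .tv_nsec = (long)(remaining * 1000000) };
        nanosleep(&delay, NULL);
    }
}
```

A caller would store `now_ms()` at the moment resume completes and call `wait_for_resume_recovery()` with that value before the first transfer. The timestamps here have millisecond granularity, so a real implementation would probably want some extra margin on top of the bare 10 ms.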
Far too many vendors are only willing to provide chip documentation under a Non-Disclosure Agreement (NDA), which prevents a knowledge-based, as opposed to an empirically derived, Linux driver. This allows them to kludge around chip deficiencies in a Windows driver without the user being aware of any issues. Even Intel has started making it harder to get the real manuals for their CPUs and bridges (they used to ALL be published on Intel's FTP and HTTP sites). Frequently, in System-on-Chip (SoC) implementations, even the CHIP vendors don't know anything; they just pass along whatever quick and dirty proof of concept the designers of some feature of the chip provided and call it a "working driver", even though it would not pass even a cursory QA process.
The first Linux code I wrote was a "quirk" handler for a parallel ATA PCI chip that came up programmed to the same default I/O addresses as the South Bridge's internal ports, on a board whose BIOS didn't properly perform PCI enumeration on it, since it already had PCI addresses.
Basically, the kernel has an Application Binary Interface (ABI), which is a bit like a contract. If the application gives the kernel something formatted in a specific way, the kernel promises to give something back in a specific way, and vice versa. Any software that is written to respect the contract should never be broken by a change to the kernel, as the application has no knowledge of how the kernel performs its obligation.
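To make the "contract" idea concrete, here is a tiny Linux-only sketch in C. It asks the kernel for the process ID twice: once through the C library's `getpid()` wrapper and once through the raw `syscall()` interface with the number `SYS_getpid`. The helper names `raw_getpid` and `check_getpid_contract` are invented for this example. The point is only that the application relies on the call number and the meaning of the return value, and knows nothing about how the kernel produces the answer.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel for our PID through the raw system call interface.
 * The application's side of the contract: pass the agreed call number.
 * The kernel's side: return the PID of the calling process. */
static long raw_getpid(void)
{
    return syscall(SYS_getpid);
}

/* Both routes go through the same kernel interface, so the answers
 * must agree regardless of how the kernel implements the call. */
static void check_getpid_contract(void)
{
    assert(raw_getpid() == (long)getpid());
}
```

A binary built against this interface keeps working across kernel upgrades only as long as `SYS_getpid` keeps its number and its meaning, which is the kind of promise the comment above is describing.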
Changes to the ABI are not supposed to be common events. The ABI is supposed to change only when lesser changes can't work. FreeBSD handles it using compatibility libraries which maintain the ABI for various kernel revisions, so that applications can continue to use an older ABI if need be. AFAIK, Linux doesn't do that, and as a result, the kernel maintainer and the developers writing the code have to be even more careful that the changes they make don't mess up the ABI.
Also, because Linux is just a kernel without a userland, a change to the Linux kernel that was permitted to break the ABI could hose all of the distros at once, requiring the rewrite of hundreds of little bits of software that are cobbled together to make the distros function as complete OSes.
There's more to it, but that's basically why Linus takes the stance that the kernel is to blame and not the developer. But, he undoubtedly doesn't consider it to be the kernel's fault if a developer does things that don't comply with the normal ABI specifications.