Slashdot Mirror


The Ugly State of ARM Support On Linux

jfruhlinger writes "Power-efficient ARM processors are moving up the food chain, to the extent that even Windows will soon see an ARM port. Linux, which has long been cross-platform, should have a long head start in this niche, right? Well, blogger Brian Proffitt explains just how messy the state of Linux support for ARM is right now, partially as a result of mutually conflicting kernel hacks from ARM manufacturers who just wanted to get their products out the door and weren't necessarily abiding by the GPL obligations to release code. Things are improving now, not least because Linus is taking a personal hand in things, but sorting the mess out will take time."

15 of 94 comments

  1. Problem is simple by JamesP · · Score: 4, Insightful

    ARM manufacturers are idiots

    Intel gets open source, most ARM manufacturers don't.

    Hence, most BSPs rely on proprietary drivers, they don't have up-to-date support for devices in the mainline kernel, etc

    Also, there's a lack of a 'standard platform', even though ARM is pretty much homogeneous

    Things are beginning to change, though. And ARM is still miles ahead of SH, embedded MIPS, etc

    --
    how long until /. fixes commenting on Chrome?
    1. Re:Problem is simple by Microlith · · Score: 4, Interesting

      most BSPs rely on proprietary drivers

      Not true. Almost every device released today has full driver support in the kernel sources that are dropped. Userspace components notwithstanding, the kernels released are fully capable of supporting other OSes when recompiled (assuming the device will boot them.)

      What does happen, however, and I stated this elsewhere, is the drivers are released ONLY into those tarballs with no revision history, full of android-specific code and are never merged upstream into the kernel. This makes porting newer kernels to the device even harder, which you can see in the 2.6.36 and 2.6.37 changeup in how some sound drivers are structured. As a result, you've got tons of drivers for hardware sitting, and rotting, in obscure folders on corporate websites.

      And all this mess is before the schism created in the userspace by Android.

    2. Re:Problem is simple by JamesP · · Score: 3, Informative

      I wasn't talking about Android, but the point stands.

      If wireless controllers on Android devices don't depend on proprietary drivers, great! That's a start

      But try Hw accelerated video playback, 3D drivers, etc

      And some products absolutely depend on those. Think set-top-boxes, multimedia players, etc.

      --
      how long until /. fixes commenting on Chrome?
    3. Re:Problem is simple by serviscope_minor · · Score: 3, Interesting

      Also, there's a lack of a 'standard platform', even though ARM is pretty much homogeneous.

      Kind of. Actually things are not that bad. There are a lot of SoCs out there which bundle an arm core with a few other cores (ethernet, usb, etc). There are actually staggeringly few vendors for the peripheral cores. The SoC vendors don't generally mention who the core vendor is, but they provide a datasheet and stick the core at some random place in the address space.

      As a result, there are a lot of reimplementations of the same drivers. This has been recognised and people are now trying to spot duplicate drivers and replace them with a single unified one.

      --
      SJW n. One who posts facts.
    4. Re:Problem is simple by Microlith · · Score: 4, Informative

      But try Hw accelerated video playback, 3D drivers, etc

      Working on MeeGo makes me all too keenly aware of that mess. None of it really applies to the kernel though, since all interesting bits are in userspace. And the graphics core IP vendors (Qualcomm most notably) have already been refused entry into the kernel because of this.

    5. Re:Problem is simple by Sun · · Score: 3, Interesting

      Here's my experience. I did a project for a company that were producing a SoC themselves. We were using the designware SPI peripheral. We wrote the driver ourselves (don't remember right now why - the dw_spi module was not for the right chip or something along these lines. I didn't do the original development).

      Turns out this chip doesn't have proper peripherals support. No NAND controller and no integrated MAC, so we use SPI for both persistent storage and for networking. Except the chip isn't fast enough to service the "SPI queue is almost empty" interrupt, despite the designware having a huge queue (256 bytes), and no matter how high we place the watermark, so we do some serious trickery in order to get things working (in essence, routing the SPI chip select to a GPIO and manually controlling activation and deactivation). The result is poor SPI throughput. Worse, the driver is now unsubmittable, as it contains hacks which really only make sense for this particular chip.
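
      To put rough numbers on the watermark problem described above, here is a back-of-the-envelope sketch. The 256-byte queue comes from the comment; the SPI clock rate, the watermark value, and the interrupt latency are assumptions picked purely for illustration, and the function names are hypothetical.

```python
def refill_deadline_us(watermark_bytes, sclk_hz, bits_per_word=8):
    """Microseconds until the TX queue runs dry once it has drained to the watermark."""
    return watermark_bytes * bits_per_word * 1_000_000 / sclk_hz


def underruns(irq_latency_us, watermark_bytes, sclk_hz):
    """True if the CPU answers the 'almost empty' interrupt too late to refill in time."""
    return irq_latency_us > refill_deadline_us(watermark_bytes, sclk_hz)


# Assumed figures: 25 MHz SPI clock, watermark at half of the 256-byte queue.
deadline = refill_deadline_us(128, 25_000_000)

# Assumed 100 us interrupt latency on a slow core: the queue empties first.
too_slow = underruns(100, 128, 25_000_000)

# Even with the watermark at 255 bytes the budget is well under 100 us.
best_case = refill_deadline_us(255, 25_000_000)
```

      Under these assumed numbers the queue depth caps the time budget, which would be consistent with raising the watermark not helping once interrupt latency exceeds it.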

      So I come along, and suggest to hook the SPI driver to the existing on board DMA controller. Get the whole buffers through without the CPU needing to do anything. A bit of hard work, and the DMA is working (not improving performance, but that's another story). Except neither the DMA infrastructure nor the actual hardware are generic enough to do such a thing so that I don't care which DMA controller is hooked into the SPI controller. So, more hacks. In theory, I could rework the infrastructure so that it is more generic, but that's a project that will cost (man hours) about as much as the original SPI driver rewrite.

      The project wound up being canceled, so things never progressed any further, but you can understand that none of that code was ever released. This is not due to the client's desire not to release. Search for Baruch Siach's contributions in the enc286 code for an example of vanilla, integrated code that was done on that client's dime and with their consent. It's just that there is a limit to how much time a company can authorize merely so that the code is generic enough to go into mainline.

      Shachar

  2. The GPL remarks in the article are nonsense. by MatanZ · · Score: 4, Informative

    The ARM vendors (TI, Samsung, etc.) do release their kernel changes. What they do not do is work with Linus and RMK on getting their code merged upstream. The GPL does not require that they do that.

  3. AMEN by synthesizerpatel · · Score: 5, Informative

    Having worked on bring-up on three custom ARM projects, I can personally attest to how gnarly it can be. But it's not necessarily something that Linus will be able to fix, or the Linux kernel community at large.

    The main problem is the custom board support. Even though the source code is GPL, and they give you full source code and even submit it back into the ecosystem, it's just haphazard code that was pushed out the door too quickly. Linus can't stop people from writing bad kernel code. He can stop them from submitting it back into the mainline, but that's kind of what we have right now: if your code isn't up to snuff, it doesn't make it into mainline. That doesn't stop them from shipping a product and giving that code to customers.

    In one case, the documentation for the ARM chip I received was a password-protected PDF that you can't even cut text out of, describing how to use the features by writing your own device driver. In that case, they had minimal Linux support, but for all the bells and whistles you had to do it yourself.

    The problem is as dense and layered as the chips themselves. What really needs to happen is a standardized method for publishing SoC features in a structured format (XML?) where common features (FIFO registers with a bytes_remaining field? write-only configuration registers, read-only configuration registers, etc.) could be defined and the code could in many cases just be automatically generated.

    Need to set reg A to all f's, reg B to all zeros, flip bit 12 of reg C and then your PHY is configured - done.

    More complex interlocking mechanisms would be difficult or impossible to communicate in a cure-all DSL, but even if you could eliminate 80% of the problems, that'd be great.

    Which brings me to the other problem: a lot of what you do to get ARM systems up and running happens way before you run Linux, in U-Boot/RedBoot or whatever else is out there. And that's a whole other kettle of fish.
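
    As a rough illustration of the "describe it as data, generate the code" idea above, here is a minimal sketch. The register names A, B and C and the three steps come from the comment; the `RegOp` type, the `apply_ops` helper, and the dict standing in for memory-mapped registers are all hypothetical.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF  # assume 32-bit registers


@dataclass(frozen=True)
class RegOp:
    reg: str     # register name (hypothetical)
    action: str  # "write" or "toggle_bit"
    value: int   # value to write, or index of the bit to toggle


# The PHY bring-up sequence from the comment, expressed as data.
PHY_INIT = [
    RegOp("A", "write", 0xFFFFFFFF),   # reg A to all f's
    RegOp("B", "write", 0x00000000),   # reg B to all zeros
    RegOp("C", "toggle_bit", 12),      # flip bit 12 of reg C
]


def apply_ops(regs, ops):
    """Apply each operation to a dict that stands in for memory-mapped registers."""
    for op in ops:
        if op.action == "write":
            regs[op.reg] = op.value & MASK32
        elif op.action == "toggle_bit":
            regs[op.reg] = (regs.get(op.reg, 0) ^ (1 << op.value)) & MASK32
        else:
            raise ValueError(f"unknown action: {op.action}")
    return regs


regs = apply_ops({"A": 0, "B": 0x1234, "C": 0}, PHY_INIT)
```

    A vendor-published description would only need to supply something like the `PHY_INIT` list; the interpreter stays generic. Anything with ordering constraints or timing between steps would not fit this simple shape.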

    1. Re:AMEN by bgat · · Score: 4, Interesting

      You think it's gnarly now? You should have seen it a couple of years ago! Things have improved by light-years since then.

      It's true that ARM isn't as cleanly supported as, say, x86. But the simple explanation is that there is significantly more diversity in the ARM world than in the x86 world, so comparisons between the two are a bit like comparing apples to orangutans.

      There are limits to what can be done to address the problem. I prefer having a diversity of ARM chips to having a BIOS, and a BIOS would be the only way to tame this beast long-term. I think most platform developers (those who do both hardware and software) would agree with me: it's easier to port Linux to a good chip for your end application than it is to use a less-than-ideal chip in the platform just because it has a mature Linux port. So while we should continue refactoring Linux on ARM, we should also accept that things will never be as clean as they are on x86. It isn't in anyone's best interest to even strive for that goal.

      In parallel with all of this, we must be careful not to kill the goose that lays the golden eggs. ARM is the singular reason why Linux owns the embedded space for 32-bit CPUs that run OSes. Nobody else is even close. So despite all of Linux's warts for ARM, it still works really, really, REALLY well. Vendors of ARM SoC's should recognize this, and pony up some funding to clean up the mess as an investment in their futures.

      --
      b.g.
    2. Re:AMEN by JackDW · · Score: 3, Interesting

      I don't know if such a standard SoC description format could ever exist in a useful form. Anything even moderately complex would require driver code, not just descriptive data. Descriptions produced by vendors would inevitably be buggy, like ACPI data. This solution would probably just make the problem worse.

      It would be much better to simply standardise the SoC, so that every ARM system has the same basic elements. Just like a PC, where the interrupt controller and memory are always in the same place, and the timer always has the same register map.

      I assume that SoC vendors do not do this because (1) they don't need to, (2) they want to have "value-added" features like their own custom power management subsystem, and (3) the diversity makes it harder to use a different SoC as a drop-in replacement.

      But they should standardise. There's no advantage to the user, the OEM, or the OS developers in having so many different SoCs.

      --
      You're an immobile computer, remember?
  4. weak ARM support is not surprising by Anonymous Coward · · Score: 3, Interesting

    weak ARM support is very much related to the constantly moving target of ARM hardware. there are several series of ARM cpus in use today and as soon as one becomes commonplace, it is phased out in favor of a "cheaper and better" cpu, sometimes in the same series, sometimes not.

    this phenomenon is related to wireless providers having an economy of scale that doesn't make sense in an end-user context. for them, having a team of skilled programmers that cost > USD 10 mln / yr is nothing and they leverage the hell out of this fact. expect this sort of stuff to continue despite ARM cpus comprising the majority of cpus on the planet.

  5. Wow some errors in this article by Anonymous Coward · · Score: 3, Insightful

    >> a threat that could effect dozens of companies' livelihoods
    A lot of semiconductor companies were releasing linux-based SoCs way before the mainline kernel started consolidating code from vendors. If Linus stopped pulling ARM code, no business would shut down. I personally don't know any companies that rely on Linus' tree to ship to their customers.

    >> To make matters worse, even though the GPL v2 license on the Linux kernel requires these changes to be released back upstream to the main Linux kernel, often they were not.
    This doesn't make any sense to me. GPL requires the changes to be released to the person who purchases your device/code. The vendors have zero responsibility to the mainline.

    >> ...this is entirely the reason why the non-profit Linaro consortium (...) was put together...
    One thing I wonder about Linaro is how they are going to be the leader and not play catch up. There are a lot of board-specific drivers they can consolidate, but as they consolidate, the vendors are coming out with even more.

    >> [a]s an indication of the scale of this problem, each new kernel release sees about 70,000 new lines of ARM code, whereas there's roughly 5,000 lines of new x86 code added."
    I find this comparison very unfair. Yes, that 70K number could be more like 20-25K but the devices with ARM processors have very different structures, designs, and end goals. One code can't fit them all. On the flip side, most x86 implementations are on either desktop or server side.

    I'm surprised Likely didn't talk about the device-tree support for the ARM tree. I've implemented a few (ppc-based) boards with device trees. The initial learning curve was a bit painful, but once you understand it, it enables a lot of common code and cuts down development time too. synthesizerpatel above mentioned "a standardized method for publishing SoC features in a structured format", and the device trees are exactly it (except they're not XML! so, even better!)
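
    To show why a device tree enables common code, here is a toy model of the idea. Real device trees are written as .dts source and compiled to a binary blob; this sketch only mimics the two properties that matter here, `compatible` and `reg`. The "acme,..." strings, the addresses, the driver classes, and the `probe` helper are all made up for illustration.

```python
# Two hypothetical boards described purely as data.
BOARD_A = [
    {"name": "serial0", "compatible": "acme,uart", "reg": (0x10000000, 0x100)},
    {"name": "spi0", "compatible": "acme,spi", "reg": (0x10001000, 0x100)},
]
BOARD_B = [
    {"name": "serial0", "compatible": "acme,uart", "reg": (0x4000C000, 0x100)},
]


class UartDriver:
    def __init__(self, base, size):
        self.base, self.size = base, size


class SpiDriver:
    def __init__(self, base, size):
        self.base, self.size = base, size


# One driver per "compatible" string, with no board-specific code in it.
DRIVERS = {"acme,uart": UartDriver, "acme,spi": SpiDriver}


def probe(board):
    """Bind a driver to every node whose compatible string is known."""
    bound = {}
    for node in board:
        driver = DRIVERS.get(node["compatible"])
        if driver is None:
            continue  # no driver registered for this node
        base, size = node["reg"]
        bound[node["name"]] = driver(base, size)
    return bound


devices_a = probe(BOARD_A)
devices_b = probe(BOARD_B)
```

    The same `UartDriver` serves both boards even though the peripheral sits at a different address on each; only the data changes. That is the sense in which the description replaces per-board code.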

    My preference as a lowly bring-up guy would be if the desktop/server kernel split up from the embedded kernel completely. Embedded kernel devs then can emphasize what's important to them (cut down development time, wide variety of device support, aggressive power mgmt) while the desktop/server devs can focus on their stuff.

  6. Re:NSLU2 by petermgreen · · Score: 5, Interesting

    That is because the slug is old hardware, wasn't exactly high end when it was released and was bought in large numbers by linux hobbyists. So it's well-known but slow. The shortage of ram doesn't exactly help either (it's possible to upgrade it but it's not for the faint-hearted). Modern arm hardware is faster, though there are speed issues caused by the floating point mess.

    AIUI the big issue on ARM is lack of a standard platform.

    On a PC you can assume you have a BIOS that can load stuff from HDD and execute it in an environment with basic disk access services. You can assume the addresses of most of the basic hardware (real time clock, interrupt controllers etc). You can generally assume there is a PCI bus for auto-configuration of other devices and that the PCI bus has its configuration space mapped to the processor in a standard way. There is a standard way of reading out how much ram there is and how it's mapped and so on. These things mean you can build one kernel and use it with one bootloader on pretty much any PC.

    On arm afaict there is no standard platform. Therefore each arm processor and sometimes each arm board needs specific support to tell the kernel things like how to find out where stuff is mapped in the processor's address space, how to find out how much ram there is and all the other quirks of the new system. Often these things are hacked up as quickly as possible by vendors who want to get a working system out, which appears to be what is pissing linus off*.

    There is also the floating point mess. ARM has been used with many floating point units over the years. Right now there is one that is most common, and Debian at least seems to have decided that the way to go is to build two ports: armel for systems without FPUs (or systems with unsupported FPUs) and armhf for systems with VFP. But if VFP falls out of favour, they will be left with either adding yet another port or trying to hack something up. Also, afaict there is no easy way to migrate between different Debian ARM ports without reinstalling.

    * and afaict pissing linus off is bad because if he doesn't merge code then it tends to bitrot unless it has very active maintainers.

    --
    note: i'm known as plugwash most places but i screwed up registering that here somehow in the past and now can't register
  7. The Ugly State of ARM Support on GCC by Suiggy · · Score: 3, Informative

    The kernel isn't the only thing suffering from shoddy support. The ARM backend and code generator for GCC is suboptimal. The GCC __sync_* builtin functions for atomic memory access are unoptimized and call into kernel functions, which isn't always necessary; hopefully this will be fixed with the new C1x/C++0x atomics and memory model. And then the ARM NEON intrinsics/builtins implementation is in an absolutely horrendous state; I'm surprised the NEON register allocator is even functional.

    I'd fix it myself, but then I'd have to spend 2 months learning how to make changes to GCC, and wait another 6 months for my patches to be accepted.

  8. Re:NSLU2 by EETech1 · · Score: 4, Informative

    I imagine it's very similar to what I find rewriting libraries for microcontrollers from various vendors, and even for different micros from the same vendor. While they all have similar hardware, e.g. a CAN interface, there is no standard way of configuring the hardware for bit timing, message IDs, or acceptance masks and filters; the number of available mailboxes and their functionality differs; message TX/RX signaling, interrupt types, error reporting, register descriptions: it's all different! ADCs are the same way: timing, triggering, re-triggering, addressing, configuring, accessing, input scaling, reference source, result scaling, register access, all different for essentially the same hardware (e.g. a 10-bit successive-approximation ADC).

    Every single one of the various little tidbits of IP that gets added is different from each and every manufacturer!
    No two vendors do anything the same. And one would probably be sued by the other if they did. We had to get special approval from Motorola to have Infineon replicate similar functionality in one of their DSP's to allow us to use the same code output from Simulink across multiple ECU families.

    You have to be different to be better, and all these vendors implement features attempting to be the best so you have a reason to purchase their device over the other 10 that are essentially just like it.

    Makes it very difficult on the person developing the API to have consistency across multiple platforms without dumbing it down to lose some features striving for a common set, or having slightly different API's or slightly different usage per micro, or designing them around an application, and hiding much of the other functionality.
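
    A small sketch of the API problem described above, using CAN bit timing as the example. The bit-rate relation (one sync quantum plus the two timing segments, divided down by a prescaler) is standard CAN; the two "vendor" register layouts below are invented for illustration and do not correspond to any real part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BitTiming:
    prescaler: int  # clock divider producing one time quantum
    tseg1: int      # quanta before the sample point (propagation + phase 1)
    tseg2: int      # quanta after the sample point (phase 2)
    sjw: int        # synchronisation jump width, in quanta


def bitrate(clock_hz, t):
    # One bit = 1 sync quantum + tseg1 + tseg2 quanta.
    return clock_hz // (t.prescaler * (1 + t.tseg1 + t.tseg2))


def encode_vendor_x(t):
    """Invented layout: one 32-bit register, every field stored minus one."""
    return ((t.prescaler - 1)
            | ((t.tseg1 - 1) << 16)
            | ((t.tseg2 - 1) << 20)
            | ((t.sjw - 1) << 24))


def encode_vendor_y(t):
    """Invented layout: two 16-bit registers holding the raw values."""
    return (t.prescaler, (t.tseg1 << 8) | (t.tseg2 << 4) | t.sjw)


# Assumed 16 MHz CAN clock, 16 quanta per bit, giving 500 kbit/s.
timing = BitTiming(prescaler=2, tseg1=13, tseg2=2, sjw=1)
rate = bitrate(16_000_000, timing)
reg_x = encode_vendor_x(timing)
regs_y = encode_vendor_y(timing)
```

    The common `BitTiming` type is the "dumbed down" shared subset: it works for both layouts, but any feature only one vendor offers has nowhere to live in it without making the API diverge per micro.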

    Cheers!