Slashdot Mirror


Nailing the Cause of Recent Linux Power Issues

An anonymous reader writes "For the Linux kernel power regressions that were found a few months ago, and hit in Ubuntu 11.04, Phoronix has found the regression that's still present in the Linux 3.0 kernel. The power regression is caused by a change in ASPM, the Active-State Power Management, for PCI Express support."

33 of 156 comments

  1. Carpentry and computer power failures by mpoulton · · Score: 3, Funny

    Interesting headline. I was trying to figure out how old-school manual construction work would be responsible for tricky power supply problems on Linux machines only.

    --
    I am a geek attorney, but not your geek attorney unless you've already retained me. This is not legal advice.
    1. Re:Carpentry and computer power failures by WrongSizeGlass · · Score: 5, Funny

      The headline demonstrates a skill that the Linux community seems to lack: the modern corporate marketing mindset. What the Linux community should have done is use this extra power consumption to its advantage: Linux, now more powerful than ever!*



      * more powerful based on the amount of energy used to perform the same tasks

  2. Summary: not a Linux problem, but a BIOS problem by ArsenneLupin · · Score: 5, Informative
    To sum up the article in 3 sentences:

    It's due to some buggy BIOSes not properly advertising power-saving features of PCIE cards. Older kernels didn't honor those BIOS hints, and disabled power to unused PCIE cards anyways (causing hangs in rare cases), whereas new kernels do the right thing (causing power wastage in lots of cases). The workaround is to specify pcie_aspm=force on the boot (Grub) command line, to tell the kernel to forge ahead, and just use power management on these cards regardless of the BIOS advice.
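    A rough sketch, not taken from the article: one way to check whether the running kernel was actually booted with this option is to look for pcie_aspm= in the kernel command line, which Linux exposes at /proc/cmdline. The helper name aspm_boot_option is made up for illustration.

```python
from pathlib import Path
from typing import Optional


def aspm_boot_option(cmdline: str) -> Optional[str]:
    """Return the value given to pcie_aspm= on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        if token.startswith("pcie_aspm="):
            # Assumption for this sketch: if the option is repeated, keep the last one.
            value = token.split("=", 1)[1]
    return value


if __name__ == "__main__":
    # /proc/cmdline holds the parameters the running kernel was booted with.
    proc = Path("/proc/cmdline")
    if proc.exists():
        print(aspm_boot_option(proc.read_text()))
```

    If this prints None, the kernel is using whatever ASPM policy it chose on its own, which on the affected machines means following the BIOS hint.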

  3. Re:No more Moronix, please! by qinjuehang · · Score: 5, Informative

    As bad as some of the Phoronix articles can be, they have contributed a lot to the community. After all, they played a pivotal role in setting up openbenchmarking.org, and are pretty much the only source of Linux hardware reviews.

  4. "serious bug" my ass by KiloByte · · Score: 5, Interesting

    The article is full of sensationalism like "serious bug", "major regression" to promote Phoronix and its "wonderful test suite". If you read it closely, you'll see they have seen a 10% increase in power consumption on just one of their test laptops that depends on BIOS settings. That particular laptop has a bug in its BIOS where it claims it wants to manage configuration of a particular piece of hardware, and new kernels obey that request. You can even tell the kernel to disregard BIOS and force power settings anyway.

    For me, improving power efficiency everywhere but that particular laptop is a major win. If you feel nice, you can even detect this particular buggy BIOS and ignore its request. But then, even after thorough fiddling, Phoronix guys weren't able to improve power usage by more than 15% even on this laptop, so it's not a big issue anyway.

    --
    The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
    1. Re:"serious bug" my ass by gbjbaanb · · Score: 2

      I think you're being a little harsh; there isn't much in the way of hardware reviews for Linux, so these guys doing them provide some service to the community.

      And if they'd detected a 10% decrease in power consumption, the article would be just as sensationalist, only this time considered good. I never knew about the kernel option, now I do.

    2. Re:"serious bug" my ass by fuzzyfuzzyfungus · · Score: 4, Interesting

      While this article is a touch overblown, stories like this make me profoundly pessimistic about the advent of EFI...

      Yeah, the BIOS pretty much sucks, and its horrible backwards-compatibility hackery makes purists cry; but the very fact that it sucks so much has had the basically positive effect of keeping vendors from trying to get too clever with it. Given the results of their trying to do so (like everybody's favorite problem child, ACPI) this is probably a good thing.

      EFI, especially in conjunction with CPUs that have hardware level virtualization support, is pretty much an entire second OS, moonlighting as a bootloader, that you either have to perform coreboot-platform-port level black magic to replace(if the board even allows you, you might also have to defeat some sort of firmware integrity check) or lament unto your motherboard vendor in hope of getting fixed. If buggy BIOSes are an issue now, buggy EFI will be a fucking nightmare. The last thing we want is more and more stuff going on under the surface, with development handled by motherboard OEMs with, to put it charitably, no OS-development experience worth putting on a CV...

      At least the suckitude of the classic BIOS created a de-facto pressure toward "let the bootloader bootload and then GTFO so that the OS can handle things". Ideally, we could have just had a modern, lessons-learned, minimal bootloader, that could skip the brief sojourn to the 80s; but still bugger off as fast as possible. Instead, we are facing the looming advent of having every computer running two OSes with hardware access, even after the bootloading is done, the resultant messy(but model/firmware-revision specific) infighting of which is going to make ACPI look like an architecturally elegant story of idyllic peaceful cooperation...

    3. Re:"serious bug" my ass by dnaumov · · Score: 3, Insightful

      The user is not going to give a shit. The user will see that Windows doesn't suffer from this increase in power consumption and will decide that Linux is inferior.

    4. Re:"serious bug" my ass by David+Gerard · · Score: 5, Informative

      You are entirely correct. See Matthew Garrett's blog for the icky details of EFI on Linux. He makes this hideous piece of shit work for a living.

      --
      http://rocknerd.co.uk
    5. Re:"serious bug" my ass by hitmark · · Score: 2

      Heh, I think it was Torvalds that decried ACPI as insanity. Not helped by Microsoft having a test suite that deviates at various places from the Intel equivalent. But what will the OEMs use? Why, the Microsoft one...

      --
      comment first, facts later. http://chem.tufts.edu/AnswersInScience/RelativityofWrong.htm
    6. Re:"serious bug" my ass by fuzzyfuzzyfungus · · Score: 3, Interesting

      This behavior is by design:

      "One thing I find myself wondering about is whether we shouldn’t try and make the "ACPI" extensions somehow Windows specific.
      It seems unfortunate if we do this work and get our partners to do the work and the result is that Linux works great without having to do the work.
      Maybe there is no way to avoid this problem but it does bother me. Maybe we could define the APIs so that they work well with NT and not the others even if they are open."

      William H. Gates III

    7. Re:"serious bug" my ass by Artem+S.+Tashkinov · · Score: 3, Insightful

      Phoronix guys weren't able to improve power usage by more than 15% even on this laptop, so it's not a big issue anyway.

      15% of 6 hours is roughly one hour, so I cannot say this issue is really minor. I'd even dare to say that every such detail does count since most hardware vendors tailor their products exclusively for Windows and the fact that Linux even works is a wonder.
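      A quick back-of-the-envelope check of that figure, as a sketch only. It assumes a 6-hour baseline and a fixed battery capacity, so runtime is inversely proportional to average power draw; the function names are made up for illustration.

```python
def naive_extra_hours(baseline_hours: float, fraction: float) -> float:
    """The comment's estimate: simply take the fraction of the baseline runtime."""
    return baseline_hours * fraction


def runtime_after_power_cut(baseline_hours: float, power_cut: float) -> float:
    """Runtime when average power draw drops by power_cut (0.15 means 15% less).

    Assumes a fixed battery capacity, so runtime scales as 1 / power.
    """
    return baseline_hours / (1.0 - power_cut)


if __name__ == "__main__":
    print(round(naive_extra_hours(6.0, 0.15), 2))
    print(round(runtime_after_power_cut(6.0, 0.15), 2))
```

      Either way the gain works out to about an hour: roughly 0.9 hours by the simple estimate, or a total of a little over 7 hours if the 15% is a cut in power draw.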

      And please don't judge Phoronix harshly. It's one of a very few websites which actually drive Linux development. Yes, Michael likes sensational style, but then again he wants to eat, buy hardware to test Linux on, pay for other people's work.

  5. tl;dr by OliWarner · · Score: 3, Informative

    Add pcie_aspm=force to your boot options.

    Test it by editing grub (which is a temporary edit that will be lost next boot) first and test out suspend, hibernate, etc.

    If that works, edit your grub configuration files. For Ubuntu users this means editing /etc/default/grub and adding the option to the GRUB_CMDLINE_LINUX_DEFAULT variable. Then run sudo update-grub.
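    For anyone who would rather script the edit than do it by hand, here is a minimal sketch of the text change only. It assumes the variable is written on one line with double quotes, as in a stock Ubuntu /etc/default/grub; add_kernel_option is a hypothetical helper, and it deliberately works on a string instead of writing the file.

```python
import re


def add_kernel_option(grub_text: str, option: str = "pcie_aspm=force") -> str:
    """Append option to GRUB_CMDLINE_LINUX_DEFAULT unless it is already there."""
    pattern = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)

    def repl(match: re.Match) -> str:
        options = match.group(2).split()
        if option not in options:
            options.append(option)
        return match.group(1) + " ".join(options) + match.group(3)

    return pattern.sub(repl, grub_text)


if __name__ == "__main__":
    sample = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
    print(add_kernel_option(sample), end="")
```

    Writing the result back to /etc/default/grub and running sudo update-grub is left as a manual step, so a bad edit cannot leave the machine unbootable without you noticing.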

    1. Re:tl;dr by AlexiaDeath · · Score: 3, Interesting

      It can't be a quick test either. Some machines start randomly hard locking with ASPM managed by the kernel. I have one of those. Uptime can vary from 5 minutes to 3 hours.

  6. Re:Summary: not a Linux problem, but a BIOS proble by nagnamer · · Score: 2

    Is it possible that unused PCIE cards waste that much power? On Linux I drain my laptop's battery in under 2 hours, sometimes 1.5. On Win7 it used to take 3+ hours with brightness at 100% (because I was outdoors).

    DISCLAIMER: Author of this post is currently using Linux because of superior performance and availability of tools not available on Windows platform.

    --
    Every harsh word you utter has the right address. It only sounds harsh because the one on the envelope is the wrong one.
  7. Re:Summary: not a Linux problem, but a BIOS proble by Manip · · Score: 5, Interesting

    That is an accurate summation of the article; but calling things "right" and "wrong" is a little naive. Windows treats this information very differently from Linux, and BIOS manufacturers are caught between the two. Simply advertising ASPM sounds good, unless it causes Windows to treat cards without ASPM support as if they have it just because the BIOS advertised that the system supported it. Now current versions of Windows might act rationally in this regard, but XP and older are still highly prevalent, particularly amongst corporate clients and governments.

    So I guess my point is - it isn't a simple right or wrong/black or white scenario. It is a messy, ugly, undocumented hack, that ultimately leaves nobody happy. Linux will likely wind up having to implement a hack too to fix this, which makes them no better or no worse than the bios manufacturers who did exactly the same thing.

  8. Re:Summary: not a Linux problem, but a BIOS proble by daid303 · · Score: 3, Informative

    The article points out that there is also a power regression in the scheduler. Which is the next thing that the writer will look at.

  9. Re:Summary: not a Linux problem, but a BIOS proble by fuzzyfuzzyfungus · · Score: 5, Interesting

    Hard to say without the exact specs of the machine, and probably a bunch of test-probes clipped in awkward places inside the laptop; but the overall trend in hardware does seem to have been toward ever higher theoretical maximum-if-we-felt-like-burning-that-much power draw(remember back when a ~50-80 watt CPU was considered a howling-mad-danger-to-self-and-others overclock/overvolt insanity demanding nerves of steel and custom cooling? Now boring retail CPUs have TDPs in the ~130 watt range); but a corresponding increase in the ability of hardware to throttle various clocks(CPU, GPU, high-speed buses), sometimes cut Vcore as well, and turn off(or very nearly so) unused peripherals.

    Exactly where the delta exists vs. Windows seems to be a matter of some confusion; but unless Linux is just plain burning more CPU time for housekeeping purposes(which, one assumes, is the sort of things that the Big Serious Corporate users of 1000+ node commodity server/compute setups would have noticed by now), it likely rests largely in the hands of a (no doubt alarmingly large and ever changing) set of hardware-specific power throttling stuff whose responsibilities were designed to be divided between the buggy BIOS and the vendor's Windows drivers. If it were Just One Mistake, it'd likely have been quashed by now...

  10. Why not use a database? by DoofusOfDeath · · Score: 2

    Would it work to have the kernel default to using whatever the BIOS indicated, but also have a database of overrides based on the exact card model?
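    To make the idea concrete, a toy sketch of such a lookup. Nothing here reflects how the kernel actually handles this: the table, the placeholder PCI vendor/device IDs, and the function name aspm_enabled are all invented for illustration.

```python
from typing import Dict, Tuple

# Hypothetical override table keyed by PCI (vendor ID, device ID).
# The IDs below are made-up placeholders, not real hardware.
ASPM_OVERRIDES: Dict[Tuple[int, int], bool] = {
    (0x1234, 0x0001): True,   # known to handle ASPM even if the BIOS says no
    (0x1234, 0x0002): False,  # known to hang with ASPM even if the BIOS says yes
}


def aspm_enabled(vendor_id: int, device_id: int, bios_allows_aspm: bool,
                 overrides: Dict[Tuple[int, int], bool] = ASPM_OVERRIDES) -> bool:
    """Use the per-card override when one exists, otherwise trust the BIOS hint."""
    return overrides.get((vendor_id, device_id), bios_allows_aspm)


if __name__ == "__main__":
    print(aspm_enabled(0x1234, 0x0001, bios_allows_aspm=False))
    print(aspm_enabled(0xABCD, 0x0042, bios_allows_aspm=False))
```

    The lookup itself is trivial; the hard part is who collects and maintains the table.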

  11. Re:Sigh... by AlexiaDeath · · Score: 2

    The machine crashing randomly is CERTAIN to drive them away while power issue may or may not.

  12. Re:Sigh... by AlexiaDeath · · Score: 2

    Trouble with this is that there is no defined list of BIOSes that will crash or BIOSes that work and no automated way to gather it, plus it would require maintenance. That cannot happen in the kernel. The distro's installer is the perfect place to detect and configure grub accordingly, but I doubt the maintainers are willing to shoulder the burden this brings.

  13. Re:Summary: not a Linux problem, but a BIOS proble by drinkypoo · · Score: 4, Interesting

    That is an accurate summation of the article; but calling things "right" and "wrong" is a little naive. Windows treats this information very differently to Linux, and BIOS manufacturers are caught between the two.

    In other cases this has been because Microsoft wrote the tools and designed them to be hostile to Linux, e.g. ACPI. Is there any of that here?

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  14. Re:Summary: not a Linux problem, but a BIOS proble by jonamous++ · · Score: 3, Interesting

    I'm using a Vaio S that gets 7+hr battery life in Windows, and under 2hr battery life in Fedora. The big problem that I see with this laptop is that Fedora is not utilizing the "hybrid" graphics system, and it is constantly running off of the graphics card instead of the integrated graphics (in Windows, this brings the battery life to under 2 hours, as well). It would be nice to be able to switch that permanently to integrated to get the battery life.

  15. Re:Summary: not a Linux problem, but a BIOS proble by SuricouRaven · · Score: 3, Informative

    Linux does things the way they should be done according to the standard. Windows does things the way they actually are done in the real world. The reason is simple: BIOS vendors noticed Windows doesn't follow the standard well, and made the reasonable assumption that the vast majority of users would run Windows. Thus they deviated from the standard in order to better support it.

  16. Re:ACPI has ALWAYS favoured Windows... by SuricouRaven · · Score: 2

    ACPI vendors always favored windows, because that is just what most of the users will run. With the exception of server boards, non-windows users are a vanishingly tiny percentage, and scarcely worth the time to test for even briefly. It's a self-sustaining business advantage, as is seen so often in the technology sector: The dominant platform is the most widely supported, which helps to ensure it's continuing dominance.

  17. Re:Summary: not a Linux problem, but a BIOS proble by drolli · · Score: 2

    That has happened before so many times you can't count.

  18. Re:Never upgrade your Linux... by jedidiah · · Score: 4, Interesting

    Yes. 30 years of Microsoft sabotaging competitors great and small does make it hard for anyone else to get a toe hold.

    As always, this situation depends on how demanding your expectations are and whether or not you can put up with crap you're forced to put up with.

    Microsoft thinks it needs dirty tricks and that its product can't survive on its own merit.

    --
    A Pirate and a Puritan look the same on a balance sheet.
  19. Re:Why is it now assumed everyone uses grub? by swalve · · Score: 2

    Who are these pansies that use a boot loader at all? I enter in the machine code by hand, that's the only way to be sure.

  20. Re:ACPI has ALWAYS favoured Windows... by drinkypoo · · Score: 4, Informative

    ACPI implementors (what is an ACPI vendor? can I buy it by the pound, or is it sold by the unit?) favored Windows, because Microsoft built a tool for creating ACPI tables that intentionally craps on all other operating systems, INTENTIONALLY building an invalid table for use with non-Windows operating systems. Linux now claims to be Windows in order to get a table that works. Bill Gates proposed this "feature" personally.

    The dominant platform is the one supported by fraud and deceit, which helps to ensure its continuing dominance, and the proper use of apostrophes. No wait, that was me.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  21. Re:Um.. Why cant it be something simple? by drinkypoo · · Score: 2

    Why does it have to be that they are hostile to Linux?

    it doesn't have to be, that's why I'm asking the question. It COULD be, and it HAS BEEN IN THE PAST, specifically in the case of ACPI Microsoft DELIBERATELY created a tool that would make an invalid ACPI table for use with non-Windows operating systems.

    Why do you tin foil nuts always make it out to be some conspiracy against Linux?

    Because it so often is. BTW, tin foil hats concentrate radio signals at the center of the skull, I guess you aren't keeping up though. There was a test at MIT.

    why should any OEM give a fuck whether their desktop products (which are going to require good suspend/resume/battery support etc) work with OS that is an economically insignificant portion of the market?

    Because Linux is continually gaining market share. And in any case, again in the case of ACPI, Microsoft did it deliberately. It's stuff that would have worked fine without their influence. I want to know if the same thing is happening all over again.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  22. Re:Summary: not a Linux problem, but a BIOS proble by GooberToo · · Score: 2

    I think you're spot on. Over the last decade I've constantly read articles about broken hardware whereby the manufacturer simply hides the problems in their Windows drivers. Chances are extremely high any power regression is actually a case of extremely broken hardware more dramatically exposed because of bug fixes and/or compliance improvements in the Linux implementation.

    Based on what I've read over the last decade, I definitely get the impression hardware bugs, specifically in power management, are fairly common. As a whole, manufacturers just don't give a shit about pumping out broken, non-compliant hardware specifically because 1, they hide their shame in their drivers, and 2, non-windows systems likely represent a fraction of their overall sales. Which means, who cares because who's actually going to know they can't properly follow a specification.

    Unless someone has a smoking gun which proves Linux is doing the wrong thing, chances are the regressions are actually shit-poor hardware implementations with full knowledge of the manufacturers.

  23. Re:Never upgrade your Linux... by arth1 · · Score: 2

    Obligatory quote:

    <@insomnia> it only takes three commands to install Gentoo
    <@insomnia> cfdisk /dev/hda && mkfs.xfs /dev/hda1 && mount /dev/hda1 /mnt/gentoo/ && chroot /mnt/gentoo/ && env-update && . /etc/profile && emerge sync && cd /usr/portage && scripts/bootsrap.sh && emerge system && emerge vim && vi /etc/fstab && emerge gentoo-dev-sources && cd /usr/src/linux && make menuconfig && make install modules_install && emerge gnome mozilla-firefox openoffice && emerge grub && cp /boot/grub/grub.conf.sample /boot/grub/grub.conf && vi /boot/grub/grub.conf && grub && init 6
    <@insomnia> that's the first one

  24. Re:Summary: not a Linux problem, but a BIOS proble by Just+Some+Guy · · Score: 4, Informative

    Did you read the linked PDF at all? Here's what the rest of it said:

    From: Bill Gates
    Sent: Sunday, January 24, 1999 8:41 AM
    To: Jeff Westorinen; Ben Fathi
    Cc: Carl Stork (Exchange); Nathan Myhrvold; Eric Rudder
    Subject: ACPI extensions

    One thing I find myself wondering about is whether we shouldn’t try and make the “ACPI” extensions somehow Windows specific.

    It seems unfortunate if we do this work and get our partners to do the work and the result is that Linux works great without having to do the work.

    Maybe there is no way to avoid this problem but it does bother me.

    Maybe we could define the APIs so that they work well with NT and not the others even if they are open.

    Or maybe we could patent something related to this.

    In summary, Bill Gates explicitly wanted to break ACPI on Linux.

    --
    Dewey, what part of this looks like authorities should be involved?