Slashdot Mirror


Why Don't Servers Support Power Management?

Cerlyn asks: "I am the network administrator of three server grade machines purchased from three separate companies. The recent power problems in California reminded me of the fact that none of these servers seem to support power management. The operating systems these systems run (Linux 2.2, 2.4, and FreeBSD 4.2) are compiled to support power management, but do not detect any power management capabilities at all. Granted, no one wants a server sleeping on the job. But the way things seem to be coded, processors cannot even sleep while idle without known hardware support. Lightly loaded machines are often idle 75% of the time or more. Sleeping while idle could make them save a significant amount of power. For many companies, the extra ten seconds it would take to spin up a backup server's hard drive(s) likely would be a non-issue. So, why don't server grade computers support advanced power management (APM), ACPI and the like?" And in the land of the rolling blackout, one has to wonder if the potential power saved could help the situation, assuming a good percentage of the big iron in Silicon Valley were configured to conserve what power it could (as opposed to adding on to the drain as it is now).

29 of 286 comments

  1. Re:Windows 2000 by the+red+pen · · Score: 3
    • Do I even have to mention that Windows 2000 comes complete with a robust power management system?
    ...and when we can run it on a Sun E10000, an IBM RS/6000, or an HP K380 we'll call you up and ask how to enable the power management.
    • Microsoft covers all the bases, you people are stuck in the outfield.
    Seeing as how no one is playing baseball right now, this analogy is ironically apt.
  2. Re:It wont save any power ... by FFFish · · Score: 3

    Why should a hard drive stop spinning? There's a great amount of inertia to overcome when it's stopped, plus a lot of static friction.

    Seems to me that the HD should spin at 15000rpm when it's in use, and operate on a sliding scale when not in use: the longer unused, the slower it spins, down to perhaps 1000rpm.

    But keeping it spinning: I should think that's important in achieving fast spin-up times and reducing the power demand during spin-up.

    --
    Don't like it? Respond with words, not karma.
  3. Re:interesting... by Barbarian · · Score: 3

    I like your load balanced server idea... on large installations, you could have an authoritative server that's always on, and in a light traffic situation, tells, say, 7 of the other 9 to go to sleep, then wakes them up when /. links..

  4. Re:Not needed--already done--idle @ HLT by alhaz · · Score: 3

    Well, the issue is, the APM specification does not cover multiple cpu systems.

    As Alan Cox said, "If making that APM call reformats your disk and plays tetris on an SMP box the bios vendor is within spec (if a little peculiar). No APM call of any kind is SMP safe."

    --
    This is just like television, only you can see much further.
  5. Re:It wont save any power ... by rnturn · · Score: 5
    ``drive spinup seems to be the time when a borderline drive will fail''

    Not to mention that it's the time when the drive's current requirements are the greatest. These inrush spikes are not a big problem for a system with a drive or two but I've seen places with systems with large RAID arrays attached to servers where they popped breakers if the power came back on while the drive cabinets were sitting there with their power switches in the ON position. Apparently, not all setups allow you to or are configured to use the SCSI start command to sequence the drive's startups like they used to do in the days of yore. Happily, newer drives are not as power hungry (I can remember some old 5.25 inch disks that used 40+W of power) but now that these 15,000 RPM drives are coming out...

    If you're trying to save power, turning off the monitors when no one's actually sitting in front of them helps enormously. Where I used to work, whenever there was a power outage and we switched over to the UPS (no generator while I was there), standard procedure was to immediately turn off any monitors that no one was actively working on. That gave us well over another half hour of battery time. Switching to KVM boxes to handle, say, eight servers with a single monitor helped out a lot too.


    --
    CUR ALLOC 20195.....5804M
  6. A few bits of info by arivanov · · Score: 5

    First, I agree power management in a server makes sense. But not because of California legislation; rather because the most important server parameter is MTBF. Power management can increase the MTBF and efficiency of the cooling subsystem. This in turn increases the MTBF of the disks and the entire system. One degree away from the optimum operating temperature can decrease a disk's life by a year or more.

    Also, you do not spin down disks on servers for both business and reliability reasons. The business reason is server latency. The reliability reason is that most server HDUs hate to be spun down and their MTBF decreases (which is again business in a sense). Also, the biggest power eaters in most modern servers are the cooling systems and the CPUs, not the disks. Disks hardly go above 2-10W nowadays while a PIII with the fans can go up to 100W. Alpha goes even beyond that. Also, spinning disks up and down to 7200-10000 RPM can actually generate more heat and consume more power than keeping them running.

    Some bits of info by platform:

    • x86 - APM does not work at all or has only limited functionality with SMP systems and any newer boards. Which means that only an ACPI supporting system will have working power management. ACPI is a new addition in Linux and BSD. Neither ACPI nor APM exists in Solaris. NT is not really using it for power management in servers to the extent of my knowledge. So only a very up-to-date installation can actually use power management. But it will be only the CPUs. I have yet to see an x86 server where the fan is actively controlled by chassis temperature. Usually servers have them hardwired at MAX. Which means the entire exercise is meaningless as you are not actually improving your MTBF that much.
    • Alpha - Only recently someone (forgot who) modified the original DGUX PAL code to do power management on the newer CPUs. This is hardly used and unusable in all AlphaBios installs. Which is a pity as the Alphas have always had the fan speed controlled by CPU temperature.
    • MIPS - never heard of power management. Server lines of PPC derived (u)Sparc - same.

    So overall the situation is that for one of the most popular platforms the power management is hardly used due to the fact that the OS support just came in. For the second most popular platform (Sun) the power management was never there. The others are pretty much there as well.

    And to conclude: I do not feel comfortable installing Linux 2.4.0 or the ACPI support for BSD on real production machines yet.

    --
    Baker's Law: Misery no longer loves company. Nowadays it insists on it
    http://www.sigsegv.cx/
  7. dont forget building cooling by peter303 · · Score: 3

    Not only do the servers consume kilowatts of power,
    they also require kilowatts of air conditioning.

  8. Efficient Design at Appropriate Size w/Builtin UPS by kentborg · · Score: 3
    Three Points.

    First, APM itself might not be a good idea for serious servers, but building (and configuring) servers with some consideration of power efficiency would be smart. The power use by server farms is a horrible expense. The cooling costs of server farms are horrible. But up to now it seems that getting a computer to work at all is the only point; how many watts it takes and how many BTUs it dumps is mostly ignored. Being Biggest and Baddest is used to sell, efficiency is not. I expect this will soon change...

    Second, most servers are not on server farms. My basement server might be on a DSL connection that is faster than most leased lines of yore, but it is still IO-limited. So it works quite well for me to run a little hacked Think NIC box (www.thinknic.com): I added an otherwise missing hard disk and underclocked (!) the CPU, and the result takes very little power--it has to, the power supply on the thing is too small to draw much. I keep the CRT off when I am not using it. I also bought a little UPS--but the server takes so little power my backup time should be very good. Certainly I am a minimal case, but I suspect that many servers out there are over powered and misused.

    Third, why don't computers and related equipment have small builtin UPSs? They already have DC power supplies, and DC is what is needed to charge most batteries. DC is what the computer actually needs, and DC is what batteries produce. Doing some battery backup inside each box would be pretty easy. How much battery does a little ethernet hub need? External UPSs need to make AC from DC (which is never terribly efficient) and they themselves become single points for potential failure. Sure, if you need a survivable facility, buy big UPSs and generators, but the failover and resistance to tripping over power cords would be so much better if each piece of equipment had a few minutes of backup built in. A well maintained generator should be able to start up and be running smoothly within just a few minutes. If the equipment itself could last a dozen minutes or so, there would be no need for any external UPSs other than for a few CRTs. As most power problems are very short, even home users would like a few minutes of backup time.

    -kb, the Kent who thinks computers are in a brute-force '50s "muscle car" era and that there is a lot of room for a little design and deployment efficiency.

    P.S. Don't forget that most so called "screen savers" are really just entertainments that don't save anything.

  9. Why should a server save power? by funkman · · Score: 5
    They're not meant to. If a server is in a position where it can go into a power saving mode, then someone has not done a good job on the server farm. Consolidate any boxes that have a load so light that they may frequently go to sleep. With consolidation you save on administration (fewer boxes), you should be more secure since there are fewer boxes to administer, AND you'd be saving power because you are using fewer boxes.

    But let's say you need a box that has to be on its own and has the ability (time) to spin down. I personally would not want this, not because of the extra time for the spin-up, but because the spin-up is hard work on a motor. For a server, once that hard drive is spinning, keep it spinning. There is much less wear on the motor in keeping it spinning than in the spin-up process. This should give a more predictable life to the drive.

    1. Re:Why should a server save power? by dubl-u · · Score: 4

      They're not meant to. If a server is in a position where it can go into a power saving mode, then someone has not done a good job on the server farm.

      This just isn't true. I dunno about you folks, but even with the nominally 24x7 web sites I work on, the difference between lowest and peak is more than 2x. So if you have a load balanced configuration, it's plausible that half the servers could leave the active set and sleep at nights. And in an office setting, the peak-to-valley gap should be even higher.

      As many posters mention, stopping a running hard drive is asking for trouble. But it would be nice if all processors could drop speed when idle, which apparently works well in laptops. And in power saving modes, it would seem to make sense to gently drop a disk drive's rotational speed. If it never stops and the spin-up to full speed is very gradual, this might extend disk lifetimes rather than reducing them.

    2. Re:Why should a server save power? by clare-ents · · Score: 3

      "
      Consolidate any boxes that have a load so light that they may frequently go to sleep.
      "

      What if there is only one box? E.g. the router / server machine in my house does nothing for 95% of the day and only has stuff to do when I'm actively using it. I'm fine with it going to sleep / powering off hard disks etc., since it's a couple of seconds of wakeup time. I can't power it off since it requires me to have physical access to bring it up again [I can't be bothered to try Wake-on-LAN], plus every time it boots it seems to force a disk check on me, taking about an hour.

      This is a valid point for server farms though - the main servers I use have an obvious 24-hour periodic cycle [loaded while US + Europe is awake - empty at other times] and it would be great to bring up additional machines as required.

      --
      Only two things are infinite, the universe and human stupidity, and I'm not sure about the former. (Einstein)
  10. Server Performance and Blackouts by Null_Packet · · Score: 3

    Most of the wear on servers is typically on their drives, so sleeping the disks would increase failures and shorten their lifespan. For example, the large corporation that I did contract work for in San Diego had about 100 PC/NT servers, with another 100 HP-UX servers. For Y2K, they had checked for possible issues with disks, but they only restarted servers and left the disk arrays going. This is because the spinning up/down of the disk increases wear and opportunity for failure (motor bearings, etc).

    The second issue that is slightly incorrect is the state of California's power problem. The state deregulated and totally fscked up the way power was sold by allowing people to sell power at open market prices. Power plants were then purposely shut down, decommissioned, and reduced in capacity to raise the price of power. For example, you have 2 power plants, PPA and PPB. They each create 1000 MW of power at 1 cent per megawatt, for a total income of $20. You create an excuse to shut down PPB, causing a shortage of available power. This in turn raises the selling price of power to 2 cents. You have just kept your same income but have halved your operating costs.
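The two-plant example above can be checked with a few lines of arithmetic. This is only a sketch of the comment's own hypothetical numbers (the variable and function names are made up here, and the prices are illustrative, not real market data); it works in integer cents to avoid rounding noise.

```python
# Hypothetical figures taken from the example above, not real market data.
CAPACITY_MW = 1000          # output of each plant (PPA and PPB)

def income_cents(plants_running, price_cents_per_mw):
    """Total income in cents for the given number of running plants."""
    return plants_running * CAPACITY_MW * price_cents_per_mw

before = income_cents(plants_running=2, price_cents_per_mw=1)
after = income_cents(plants_running=1, price_cents_per_mw=2)

print(f"both plants at 1 cent: ${before / 100:.2f}")
print(f"one plant at 2 cents:  ${after / 100:.2f}")
```

Both cases come to the same income, which is the point of the example: the revenue is unchanged while only one plant's operating costs remain.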

    There is a shortage of power, but not because California's usage suddenly went insane. This problem started back in the early summer in San Diego, and no one took action until the end of the summer.

    If you really wanted to conserve power, then have all the Slashdot readers retire Seti@Home until it all blows over and let their boxes sit powered off or do Wake-on-LAN, as I am sure far more power is consumed by Seti@Home users in CA than by not-sleeping server processors.

  11. Power Saving Matters by Alan+Cox · · Score: 5

    Compiling in power management support on the test boxes I use cut the power bill by 20%. A lot of that actually seems to come from monitor powerdown rather than CPU idling, but with an Athlon drawing 60 watts of power at peak (or 240W once we all have nice quad Athlon boxes) it's still a substantial saving.

    For most boxes the cpu halting BSD and Linux do will actually give almost as good results as the APM bios. On laptops APM bios is often measurably better as it is able to reconfigure SDRAM timings and the like in ways only practical for box specific code.

  12. Not needed--already done--idle @ HLT by redelm · · Score: 3

    APM is basically useless for servers. You certainly don't want to be spinning down their disks (wear and high start-up power) and they don't have monitors attached.

    The server OSes (*BSD, Linux, OS/2, and even MS-WindowsNT) all have HLT in their idle thread. When the machine has no tasks to run, it runs the idle thread. For x86 CPUs after the 486sl this automatically drops the CPU into powersavings. Typically a CPU that will draw ~20-30W will drop to less than 1 Watt at HLT. That's all you want.
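As a rough illustration of what halting at idle is worth, the sketch below estimates average CPU draw. The 75% idle figure is the one from the original question; the 25 W busy draw is simply the midpoint of the ~20-30W range above, and 1 W stands in for the HLT draw. All three are assumptions for illustration, not measurements, and the function name is hypothetical.

```python
def average_cpu_watts(busy_watts, halted_watts, idle_fraction):
    """Time-weighted average draw of a CPU that halts whenever it is idle."""
    busy_fraction = 1.0 - idle_fraction
    return busy_fraction * busy_watts + idle_fraction * halted_watts

# Assumed inputs: midpoint of the 20-30 W range, 1 W at HLT, 75% idle.
avg = average_cpu_watts(busy_watts=25.0, halted_watts=1.0, idle_fraction=0.75)
print(f"average draw: {avg:.1f} W")
```

Under these assumptions most of the possible CPU saving is already captured by the idle thread, without any APM involvement.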

    APM is more targeted at desktops where it's especially important to turn off that power-hungry monitor (100+W) and to compensate for the failings of MS-Windows9*|Me which idles in a busyloop.

    For non-x86 CPUs I cannot speak. I would hope that Sun & Alpha have something equivalent to x86 HLT powersavings by now. But my older Alpha 21066 does not. Perhaps the thinking is the machine will be busy all the time.

  13. Incredible DOS Attack on California Power Grid by lildogie · · Score: 5

    SACRAMENTO - The California power grid was taken down today by a so-called "packet storm," where script kiddies coordinated themselves to ping every sleeping server in California to wake it up ...

  14. Ahem [Because APM and ACPI are incredibly buggy] by Nailer · · Score: 4

    (Sorry, the spelling on that was atrocious).

    ...and always have been. The specifications aren't always fully implemented and don't perform reliably even in consumer environments - e.g. every shipping copy of Win98SE is unable to recover once the machine goes into suspend unless a patch is applied. In terms of (SME) servers, the support hasn't existed. Windows NT4 didn't support the full capabilities of either spec, and while Win2K does, it is still not in widespread use. As for Unix-likes, Linux has supported APM fairly reliably for some time now, but some applications (specifically poorly written FTP servers) still have some issues with it. Anyone know about ACPI?

    Powering down hard disks does indeed cause wear and tear, but there are other components - ie, monitors (if you use monitors on servers), KVMs, and even switches which aren't in use during certain hours which won't be significantly harmed by powering down.

  15. Ehm, no.... by Lion-O · · Score: 3
    Maybe this is an option for Windows machines (no trolling intended since I really would not know) but I'd consider this a big NO NO for *nix based machines. Sure, maybe you can save some power in theory by letting some hardware sleep or spin down for some time. But how much power would it cost to get everything back up & running every hour of the day?

    Getting things back up usually costs more power than letting them spin & run. I don't know about you and your servers but mine cannot give up on those hourly cronjobs since some jobs simply have to be done. So basically I think it would end up consuming even more power than it does now. Not to mention the extra wear and shortened lifetime on the hardware, which will surely not please my boss.

  16. Re:Spinning down any HDD is a bad idea. by zensonic · · Score: 5

    There are too many things that can go wrong when drives spin up and down. Particularly if the drive hasn't been idle for a large period of time.

    I have had disks which have spun for years without problems in a server, but when the server is taken down in order to upgrade the machine in some way, some disks don't survive. Why, you might ask? The above post is one point. Another is that the heat of old drives really degrades the components of the HDD, and thus it can't stand the powerup cycle, which normally puts more stress on the components.

    Other HDDs have used up all the lubricant inside over the years, and when powered up after a server upgrade the HDD doesn't have the power to compensate for the not so smooth motion anymore.

    If it ain't broken don't stop it :)

    --
    Thomas S. Iversen
  17. what would REALLY help the CA power crisis ☺ by ChristTrekker · · Score: 3

    I'm surprised nobody's hit on this yet. Ditch the x86 boxen and use PPC instead. The power savings could be significant, maybe 50% or better. (Unless you've been using the waste heat from your servers to supplement the office toaster and microwave.) No need to spin the drive up and down from sleep, just use an efficient processor.

    Those millions of dollars that the gov't loaned out to the power companies could have been put to better use helping finance this upgrade. At least you'd have a better architecture to show for it, instead of 17 days later with no improvement.

    for the humor-impaired.

  18. Spinning down SCSI HDDs is a bad idea. by Trepalium · · Score: 3
    There are too many things that can go wrong when drives spin up and down. Particularly if the drive hasn't been idle for a large period of time. There's actually a problem where the drive heads can accumulate material from the disk itself, and by powering down the drive, the heads come to rest and the residue either falls on the drive platters or virtually glues the head to the platter. Not good.

    CPU and video card power management would be far less worrisome, since there's no mechanical wear and tear that can be caused by them going in/out of sleep, with the exception of the monitor, which can be externally replaced without downing a server. However, I'd be rather concerned that they don't go completely to sleep; otherwise a power failure may cause the server to completely drain the UPS without trying to shut itself down.

    --
    I used up all my sick days, so I'm calling in dead.
  19. Power Managment vs Network Management by vchoy · · Score: 3
    Maybe one of the reasons why we do not see power management is that most critical servers, even though they are not used most of the time, are monitored all of the time.

    I have some critical servers that I am responsible for, and I run network monitoring agents on these machines that poll for disk utilization, capacity, CPU load, network ping status, etc. Big Brother is one such network monitoring system I use. Others I have tried: What's Up, BMC Patrol, HP OpenView. I know these do not take up much system resources, but the server needs to be 'awake' in order to collect and report network management data. Servers that are polled often do not get a chance to sleep, unfortunately. Wake-on-LAN enabled servers might never go to sleep if a network monitoring program is pinging the server every 5 minutes to see if it is alive.

  20. computers aren't California's only power drain by abde · · Score: 3

    And in the land of the rolling blackout, one has to wonder if the potential power saved could help the situation, assuming a good percentage of the big iron in Silicon Valley were configured to conserve what power it could (as opposed to adding on to the drain as it is now).
    Oh come on. California has huge heavy industries, gigantic metropolitan areas, teeming millions of people with personal computers, dishwashers, air conditioners, televisions... it's a huge, populous, and industrial state. The impact of the Information Industry is still a minor fraction of the total costs. Even if you turned out the lights on Silicon Valley completely, California would still have a problem, because they haven't built enough new plants to supply the demand (of a huge, populous, industrial state).

    If you had the exact same deregulation fiasco in Texas, or New York, or Illinois, you'd see the same thing, and there aren't Silicon Valleys of even comparable size in any of those states.

    Is it just Geek Hubris to assume that our industry is the most important and central over all others? I see this same thing on reports on how our economy is supposedly tanking right now, just because of the NASDAQ. There's an entire world out there beyond our walled garden, you know..

    --
    Don't blame me - I voted for Howard Dean. http://dean2004.blogspot.com
  21. Bravo on the wear and tear issues, plus a few more by ma11achy · · Score: 3

    Yep, I'm fully in agreement with the wear and tear issues of power management. The spin-up and spin-down of hard drives is the one area that causes most wear and tear. I personally would not like to power down any drives for any reason on one of our systems running in the field. These machines (and the drives they contain) are DESIGNED to run 24x7 if configured properly. They are not designed (and in most cases, not desired) to spin down and spin up several times during a week, or a day. Why, in the world of high end UNIX servers for example, would you want your backend database/web/application server to even THINK of powering down one of its drives even for a SECOND? Especially with the high volume of hits in today's I.T. industry.

    --
    Eagles may soar, but weasels don't get sucked into jet engines
  22. wouldn't have mattered much. by sootman · · Score: 3
    I used to live in California, and back when there were water shortages there, everyone was asked to put bricks and half milk cartons in their toilet tanks, water their lawns every other day, etc. Later, I learned that agriculture in CA uses 85% of the state's water. So, if every urban person in the whole state (about 30 million people) had cut their water usage in *half* (not bloody likely, or even possible), that would have only made a difference of about 7%.
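The percentages work out as follows. This is a quick sketch using only the figures quoted above; the 85% agricultural share is the poster's number and is taken as given, and everything that is not agriculture is assumed to be urban use.

```python
agriculture_share = 85.0                  # percent of state water use, as quoted above
urban_share = 100.0 - agriculture_share   # everything else, treated as urban use

# Urban users cutting their own usage in half saves half of the urban share.
total_saving = urban_share * 0.5

print(f"urban share: {urban_share:.1f}%")
print(f"saving if urban use is halved: {total_saving:.1f}% of total")
```

Half of the urban share is consistent with the "about 7%" figure in the comment.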

    (Besides, at the same time, I got a job cleaning up a gov't construction site. The boss, at one point, took a running hose and stuck the nozzle into a urinal to save himself from having to walk to turn it off. I mentioned the drought. "The government," he said, "does not have a water shortage.")

    Similarly, a quick look at the laws of thermodynamics tells us that, for example, it takes more energy to cool a room because of a computer than the computer itself gives off. Air conditioning, lighting, and utilities for new residents are some of the reasons behind the brownouts in the golden state. A few idling CPUs and spun-down hard drives, while a Good Thing, wouldn't make much difference.

    --
    Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
  23. interesting... by Helix150 · · Score: 3

    good point. IMHO another good idea would be for load-balanced servers to put one in suspend when there is low traffic

    --
    --IronHelix
  24. Re:Bravo on the wear and tear issues, plus a few m by Helix150 · · Score: 3

    well if you have one app/db/http/etc server running then you have neither the want nor the need for it to go offline. In such a case, the first user that wants something is going to be waiting 10-45 seconds for it to come back online.

    However, if you have cluster servers, redundant servers or load balancing where several servers do the same job, but not all the time, then APM is good. For example, say you have a website with a load-balanced HTTP cluster and a redundant backup cluster. You would want one of the main machines to be always on. When it got overloaded it would wake up one of the others, and grab more of them as load increased in peak hours. Then when everyone went to bed it would suspend those it woke.
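A minimal sketch of that wake/suspend policy, assuming each server can handle a fixed number of requests per second. The function name, the capacity figure, and the cluster size are all hypothetical; a real load balancer would also want some hysteresis so machines are not woken and suspended on every small fluctuation.

```python
def servers_awake(load, capacity_per_server, cluster_size, always_on=1):
    """How many servers should be awake for the current load.

    At least `always_on` machines stay up, and the answer never exceeds
    the size of the cluster.
    """
    needed = -(-load // capacity_per_server)   # ceiling division on integers
    return max(always_on, min(cluster_size, needed))

# Hypothetical cluster: 10 servers, each good for 100 requests/second.
for load in (0, 250, 5000):
    awake = servers_awake(load, capacity_per_server=100, cluster_size=10)
    print(f"load {load:>4} req/s -> {awake} awake, {10 - awake} suspended")
```

The always-on machine plays the role of the main server described above; the rest are woken only as the load requires.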

    For the redundant cluster, they should be kept in a constant state of suspension, but be ready to wake up should the main cluster fail.

    As for reliability of the drives, that's why you have RAID arrays. I would rather periodically weed out the weak ones and have the RAID re-distributed than shut them all down and have half not come back on. It's absurdly unlikely that two drives are going to fail at the same powerup. If one dies, then that's OK. If two die it's not. Give them all ample opportunity to fail and they will do so one at a time. Give them few opportunities and they will die in clumps.

    --
    --IronHelix
  25. Servers like to just run by JamesGreenhalgh · · Score: 3

    Spinning drives up and down will (depending obviously on your load and spindown configuration) actually end up using more electricity, in the worst-case scenario. It's like people who turn their engine off in only light traffic jams - starting it up costs a lot more fuel than just leaving it running at idle. With desktop PCs it makes sense, since they tend to honestly do nothing at all for long periods of time unless people are sat using them (let's discount seti@homers ;-) ), so you can make a big power saving.

    The other thing here is that as various people have pointed out, the most common failure point on drives is during spinup - the time at which most stress is being exerted on the drive. If you've got a big ultrareliable server, you'll want it to stay that way, and the best way to do this is by keeping everything the same. The car analogy fits here too. A car that does 100,000 miles in its life with lots of stopping and starting and small journeys, will be considerably worse off than one that just ran for long distances.

    In case anyone was wondering - the adverse wear+tear of power up/down also affects PCs (cpu/PSU/drives), and even monitors. I've seen it happen often enough, and it would be interesting to know how much money the PC hardware industry makes out of components failing early due to "power saving" measures (be they system controlled or little Johnny turning his PC off at night).

    If we move over to better renewable resources, like vast farms of hamsterwheels - this won't be a problem :-)

    --
    ALL YOUR BASE ARE BELONG TO US!
  26. It wont save any power ... by Decado · · Score: 4

    I assume that power saving only works if there aren't frequent power downs and power ups. If these machines were power saving for 1 min and then had to power up again, there probably wouldn't be much (if any) saving, whereas the wear on the servers would be a lot greater.
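One way to frame this is as a break-even time: powering down only pays off if the machine stays down long enough to repay the extra energy spent powering back up. The sketch below uses made-up numbers purely for illustration; none of them are measurements of real hardware, and the function name is hypothetical.

```python
def break_even_seconds(restart_joules, running_watts, sleeping_watts):
    """Shortest sleep that saves energy, given the one-off cost of restarting."""
    saved_per_second = running_watts - sleeping_watts
    if saved_per_second <= 0:
        raise ValueError("sleeping must draw less power than running")
    return restart_joules / saved_per_second

# Hypothetical drive: 70 J extra to spin up, 8 W spinning, 1 W spun down.
threshold = break_even_seconds(restart_joules=70.0, running_watts=8.0, sleeping_watts=1.0)
print(f"sleeping saves energy only beyond {threshold:.0f} s of idle time")
```

This covers the energy side only; it says nothing about the wear argument, which would have to be weighed separately.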

    --

    Slashdot: Proof that a million monkeys at a million typewriters can create a masterpiece

  27. The extra wear and tear isn't worth the power risk by rigor6969 · · Score: 3

    Power supply fans, hard drive motors: they take on the most wear and tear during power up. Always have. They are 99% of your power utilization. That's where all the power is: the redundant power supplies, the RAID unit. I certainly wouldn't want my n+1 power supply to be sleeping when the primary fails. Besides, a nice RAID 0+1, which I use at work, takes upwards of a few minutes to power up correctly. Stage that with some sleeping power supplies, and you're talking minutes of down time. I'm sure most Unix apps won't tolerate that. Most NOCs use polling software like whatsup and the redalert.com service, which test your SQL server etc., and those won't work. You Cali folks just need to have off-site backups in states where there is no issue (hint: I have an empty NOC, msg me :) or just keep the lights off. I highly doubt "the internet" is truly eating up all of your power. How many light bulbs are there per computer in California? Why don't ya turn some of those off?

    --
    ===sam=== free nessus vulnerability scan = www.vulnerabilities.org