Round Robin Scheduling Not Power-Efficient

Posted by kdawson on Friday May 9, 2008 @03:01AM from the toward-cooler-server-farms dept.

Via_Patrino writes "When load has to be distributed across several servers, round robin, or any other technique that balances load equally, is the most common approach because of its simplicity. But a recent study shows that concentrating load on some servers can improve energy efficiency, because the other servers will be mostly unused during off-peak periods and can then make better use of power-saving methods. Especially where load involves lots of concurrent, power-consuming TCP connections, which was the case in the study, a new load-balancing algorithm resulted in an overall 30% power savings. Here's the paper (PDF)."
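
The trade-off in the summary can be illustrated with a rough back-of-the-envelope model. This is only a sketch: the linear power curve and the wattages (`IDLE_W`, `PEAK_W`, `SLEEP_W`) are made-up assumptions, not figures from the paper, and the function names are hypothetical.

```python
# Hypothetical linear power model; every wattage here is an illustrative assumption.
IDLE_W = 200.0   # awake server doing no work
PEAK_W = 300.0   # server at 100% utilization
SLEEP_W = 10.0   # server parked in a low-power state


def server_power(utilization: float) -> float:
    """Power draw of an awake server at a utilization between 0 and 1."""
    return IDLE_W + (PEAK_W - IDLE_W) * utilization


def round_robin_power(n_servers: int, total_load: float) -> float:
    """Spread total_load (measured in fully busy servers) evenly over all servers."""
    return n_servers * server_power(total_load / n_servers)


def packed_power(n_servers: int, total_load: float) -> float:
    """Fill servers one at a time and put the untouched ones to sleep."""
    full = int(total_load)            # servers running flat out
    partial = total_load - full       # utilization of one partly filled server
    awake = full + (1 if partial > 0 else 0)
    power = full * PEAK_W
    if partial > 0:
        power += server_power(partial)
    return power + (n_servers - awake) * SLEEP_W
```

Under these assumed wattages, a load worth 2 of 10 fully busy servers costs 2200 W when spread evenly and 680 W when packed onto two machines. The real ratio depends entirely on how much an idle server draws and how deep its sleep state is.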

Logical conclusion by DoofusOfDeath · 2008-05-09 03:04 · Score: 4, Insightful

So if we're willing to sacrifice speed for energy savings, shouldn't we just use the bare minimum number of computers that can handle the workload without crashing?
1. Re:Logical conclusion by TooMuchToDo · 2008-05-09 03:12 · Score: 4, Insightful
  
  What this means is someone needs to architect an intelligent loading system. Ideally, it would manage the load on your base load servers (that are on all the time), and when those servers reach 85-95% of capacity (numbers from my ass) other servers should be brought out of low power/sleep mode to start serving.
  Of course, if you use Amazon EC2, this is all moot, as they can shift load around to have their cluster run at peak efficiency.
2. Re:Logical conclusion by Anonymous Coward · 2008-05-09 03:22 · Score: 1, Insightful
  
  Not that I'm going to read the article... but surely the situation is more along the lines of:
  
  You have a load that varies, the peak load requires 10 servers to keep performance acceptable (where the definition of acceptable can vary according to redundancy requirements, etc). But the usual load is 20% of the peak.
  
  With round robin each server would be running at 20% load in the usual case. From a power perspective it might be better to have 2 servers running at 100% and 8 servers idle, or 4 servers at 50% and 6 servers idle or whatever.
  
  I can't see a sacrifice in speed there. You are just moving the idle CPU around so that some servers are essentially completely idle and hence use less power, instead of all servers being partly idle.
  
  If a server running at 100% and another idle uses less power than 2 servers running at 50% then why not?
3. Re:Logical conclusion by TooMuchToDo · 2008-05-09 03:28 · Score: 1, Insightful
  
  Unless you're running a hugely complex ASP.net application, in which case, it doesn't work so well.
4. Re:Logical conclusion by Anonymous Coward · 2008-05-09 04:56 · Score: 1, Insightful
  
  Hm, maybe if people programmed things like they did in the 70s and 80s instead of relying on twelve layers of abstraction and gobs of RAM and clock cycles to move numbers around?
  How come it's always on the hardware team's side to reduce power? Why don't you software monkeys come down to earth and do some assembly?
5. Re:Logical conclusion by kesuki · 2008-05-09 06:13 · Score: 1, Insightful
  
  'The environmentalists would have a hissy fit'
  
  "Only the stupid ones."
  
  So your logic is that, 'since we're using all this energy, what we need to do is heat up the oceans, instead of the atmosphere, because it takes less electricity to do that, thus making us use 5% less energy making us 5% more environmentally friendly'
  
  the difference is huge though, if we heat up the atmosphere, it radiates into space (if it didn't LA would be about 6000 degrees Fahrenheit at noon)
  
  If we heat up the oceans, less CO2 and less O pass through the membrane, and certain life forms that are considered parasitic tend to benefit from the warming of the ocean as well...
  
  so basically, not only are we killing off ocean life, we're hitting it with a three-pronged attack: 1. less algae can propagate, because the ocean is warming in that 'heat island'; 2. parasitic life forms propagate fast, killing off other aquatic life; and 3. with less algae and a disrupted food chain, there is now less food in the ocean for all aquatic life.
  
  Sure it takes more energy to heat air than to take cold water and make it warm, but most of that heat radiates away from earth, without causing harm, if everybody within a mile of water used bodies of water for cooling to 'save electricity' there would be a horrendous mass extinction even in the oceans, much like the one occurring on land by human action.
  
  There is a halfway solution though; it's called 'evaporative cooling'. Water, as it evaporates, takes energy into the Earth's atmosphere, where it can then radiate into space, and eventually the water falls back to earth as rain... The problem is, because so many plants and animals rely on evaporative cooling, and because heat creates humidity from wet things, evaporative cooling only works constantly in a dry climate.
  
  you can also take cold water and heat it all the way to steam, which can be done no matter the humidity level, but then you lose a lot of the energy savings that you were aiming for in the first place.
  
  I know you personally feel that using less energy does less harm to the environment, but in this situation you're comparing apples to oranges. heating up air and heating up water have different effects and different consequences... if they didn't everyone would be using seawater today, since it's been in use at least 20 years if not longer.
  
  --
  https://www.gnu.org/philosophy/free-sw.html
Managed power supplies... by Colin+Smith · 2008-05-09 03:11 · Score: 4, Insightful

Just switch them off...

If the load on your boxes is below a threshold, remove one of them from the load balance list, wait for connections to end, or migrate the processes off to another machine, and switch it off. When the load is above a certain threshold, you power on an additional node, configure it for whichever service and add it to the load balancer.

Oh come on people, you call yourselves engineers? It really isn't that difficult.
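
For what it's worth, the procedure described above might be sketched as a small control loop. This is a minimal sketch under assumptions: the thresholds `LOW` and `HIGH`, the `Pool` structure and the action names are all hypothetical, and actually powering a box on or off is left to whatever tooling you have.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

LOW, HIGH = 0.30, 0.80  # assumed thresholds; the gap between them avoids flapping


@dataclass
class Pool:
    active: List[str] = field(default_factory=list)    # in the load-balance list
    draining: List[str] = field(default_factory=list)  # removed, waiting for connections to end
    off: List[str] = field(default_factory=list)       # powered down


def step(pool: Pool, avg_load: float, open_conns: Dict[str, int]) -> List[Tuple[str, str]]:
    """Run one control tick and return the actions taken as (action, node) pairs."""
    actions = []
    if avg_load > HIGH:
        if pool.draining:
            # Cheapest capacity: put a node that is still up back into rotation.
            name = pool.draining.pop()
            pool.active.append(name)
            actions.append(("undrain", name))
        elif pool.off:
            name = pool.off.pop()
            pool.active.append(name)
            actions.append(("power_on", name))
    elif avg_load < LOW and len(pool.active) > 1:
        # Stop sending new connections to one node, but keep at least one active.
        name = pool.active.pop()
        pool.draining.append(name)
        actions.append(("drain", name))
    # Switch off drained nodes once their last connection has ended.
    for name in list(pool.draining):
        if open_conns.get(name, 0) == 0:
            pool.draining.remove(name)
            pool.off.append(name)
            actions.append(("power_off", name))
    return actions
```

Calling `step` once per polling interval with the measured average load and per-node connection counts gives the drain, power-off and power-on sequence described above. It does nothing about boot delay or load prediction.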

--
Deleted
1. Re:Managed power supplies... by russotto · 2008-05-09 03:18 · Score: 4, Insightful
  
  If the load on your boxes is below a threshold, remove one of them from the load balance list, wait for connections to end, or migrate the processes off to another machine, and switch it off. When the load is above a certain threshold, you power on an additional node, configure it for whichever service and add it to the load balancer.
  
  Sure, that's not too difficult to do. But it does add complexity. And it does mean your system can't respond to increased load as quickly, as you have to wait for your additional boxes to boot up. If the increased load is predictable, you can anticipate, but that adds more complexity. It doesn't save you on capital costs as you still have to size your power and A/C systems for peak load. Powering the boxes on and off may shorten their lives or reduce their reliability. The question isn't whether it can be done; it's whether it's worth it.
2. Re:Managed power supplies... by PenguiN42 · 2008-05-09 13:29 · Score: 2, Insightful
  
  Oh come on people, you call yourselves engineers? It really isn't that difficult.
  
  You'd be surprised how much of engineering is taking "obvious" ideas and banging your head against them for months/years trying to get all the details to work out right.
  
  --
  The following sentence is true. The preceding sentence was false.
Pound, haproxy by QuoteMstr · 2008-05-09 03:21 · Score: 3, Insightful

We're running a no-frills OpenBSD load balancer at work. Right now, it's running Pound (the quickest thing we could get up once traffic spiked a few weeks ago), but we're considering other approaches too. haproxy's load balancing knobs look interesting. It looks like you can configure it so the maximum number of clients scales with the current load. The problem is that there's no feedback system.

Some kind of loadavg-based, or even response-time, feedback mechanism would be great! Pound has that (I believe), but since Pound requires downtime for every configuration change, we want to move away from it ASAP.
Very cool, but obvious now that we see it by mlwmohawk · 2008-05-09 03:27 · Score: 4, Insightful

This is a very cool idea, and I don't think it will affect usability too much either. As long as the load balancer keeps tabs on system loading, via snmp or something, it can turn on/off machines based on need.

This assumes your system scales smoothly, i.e. gets proportionally slower as the system load starts to exceed processing capacity. For example, a process will always take 100ms as long as there is CPU time to spare, but once the CPU gets to 100% utilization, you have to start time-slicing more processes, and that 100ms starts to be 150ms. The load balancer can then spin up a new server and start bringing down the processing times.

This is an obvious solution to an obvious problem, but until now, we've just never had to examine it.
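
The 100ms-to-150ms example can be put into a toy model. This is a sketch that assumes ideal time slicing with no scheduling overhead; the function and parameter names are hypothetical.

```python
import math


def response_time_ms(base_ms: float, runnable: int, cores: int) -> float:
    """Idealized time slicing: no slowdown until runnable processes outnumber cores."""
    return base_ms * max(1.0, runnable / cores)


def servers_needed(total_runnable: int, cores_per_server: int,
                   base_ms: float, target_ms: float) -> int:
    """Fewest servers that keep the modeled response time at or below target_ms."""
    # A server can hold cores * (target / base) runnable processes within the target.
    per_server = cores_per_server * target_ms / base_ms
    return max(1, math.ceil(total_runnable / per_server))
```

In this model, 12 runnable processes on an 8-core box turn a 100ms request into 150ms, and a second server brings it back to 100ms. Real systems degrade less smoothly, which is the caveat the comment already makes.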
Re:IT discovers boiler scheduling by russotto · 2008-05-09 03:27 · Score: 3, Insightful

Operators of multiple steam boilers have been dealing with this problem for a century. The number of boilers fired up is adjusted with demand, with the need for some demand prediction because it takes time to get steam up. This was done manually for decades; now it's often automated.

Which, alas, won't stop someone from patenting it with respect to servers. Even if it's already been done with computers too.

Incidentally, I've seen descriptions of currently available HVAC control systems for office buildings which take into account the season, the direction the building faces, the thermal mass of the building, demand, etc, and even learn some of these parameters while running, rather than forcing the installer to calculate them. But every office building I've worked in has had crappy systems which amount to running the compressors on a timer and using individually controlled dampers to provide even cooling (poorly). It seems that we have the technology, but not the will (or the capital) to use it.
The real question is what "fully loaded" means by Animats · 2008-05-09 05:00 · Score: 2, Insightful

The only thing that makes this hard is a metric of what "fully loaded" means for a server. With generators and boilers, you have a single number which represents output, and you know what the capacity of each unit is, so you know when to start up the next unit. Computer servers are more difficult to characterize.
So you have to measure some values of server load, convert that to a single number, and use it for load measurement purposes. Then it all works just like boiler scheduling.
You don't even need to do much advance planning, as you have to do with boilers and generators, since you can usually start up another server in a minute or so. It takes hours to fire up a big boiler, so you need serious prediction capability.
(The classic power company approach was a chart recorder recording system load. Every day, somebody took the day's load graph and cut out a piece of cardboard to match. The cardboard pieces were accumulated in a rack, and the result was a 3D load graph for the year. It looks like a mountain range. There's an Internet Archive film showing this. That's a worthwhile exercise for your server farm, and you can probably do it without glue and scissors today. I've seen some of Amazon's server load graphs, which have a huge peak entering the Xmas buying season. In fact, the real reason Amazon is selling "cloud computing" is that their plant is sized for the holiday season and 80% idle the rest of the year.)
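
One possible way to boil server load down to a single number, sketched under assumptions: taking the busiest resource as the score is just one choice among many, and the metric names and the 0.85 target are hypothetical.

```python
import math
from typing import Iterable, Mapping


def load_score(metrics: Mapping[str, float]) -> float:
    """Collapse per-resource utilizations (each 0..1) into one number: the bottleneck."""
    return max(metrics.values())


def units_required(scores: Iterable[float], target: float = 0.85) -> int:
    """Servers to keep running so the same total load sits at or below target each."""
    return max(1, math.ceil(sum(scores) / target))
```

For example, `load_score({"cpu": 0.4, "net": 0.9, "disk": 0.2})` treats that server as 90% loaded because the network is the limiting resource. Once every server reports such a score, deciding when to start the next unit works like the boiler case.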
Could apply to CPUs as well by bill_kress · 2008-05-09 05:15 · Score: 2, Insightful

I think it's probably simplistic to simply distribute a load to all cores of a CPU evenly. Although asymmetrical might be tougher, I could see a system with one low-power always-on core to deal with system requests and organization (maybe even low enough power to remain on during a suspend), one to handle all GUI threads and interact with the GPU on a private bus, a couple of normal cores to handle typical user threading, one of which doesn't come on until the first is like 50% loaded, and one or two high-speed high-power cores that run all-out when the system is plugged in and needs them for intensive processing.

It would take some targeted software design to take advantage of this, but I think we could be looking at a Moore's-Law-style increase in power...
It's new because it's smarter by Steve+Hamlin · 2008-05-09 12:04 · Score: 2, Insightful

More smarts, I think.
Does your setup allocate ZERO connections to certain servers over some length of time, with those servers set up to reduce energy use when they have zero connections? If not, this looks like it might help.
They're claiming real-world energy efficiency gains, so it looks like it's an improvement somehow.
I would assume it's because this now adds dynamic adjustment, which could be based on total system-stack metrics of peak_load_capability, energy_minimization, acceptable_response_time, etc. Something that seems to be lacking in the current load-balancing system that you describe.
Future:
- Allow for tuning based on diverse hardware, each with different energy and load capability profiles.
- Smart managing of a large population of these systems, based on varying load.
- Add real-time upstream energy cost data into the mix
- Dynamic scheduling of administrative tasks based on energy efficiency vs. hard deadlines.
- If energy starts becoming a significant cost of hosting, go back to selling system time based, in part, on total energy used - track CPU, disk, network energy requirements by watt-hour, by user. Add those to account plans, side-by-side with Mb/s and GB/month. Serving .cn? Host your site out of a data center in Iceland and take advantage of cheap, midnight power!