Round Robin Scheduling Not Power-Efficient
Via_Patrino writes "When load has to be distributed between several servers, round robin, or any other technique that balances load equally, is the most common approach because of its simplicity. But a recent study shows that trying to accumulate load on some servers can improve energy efficiency because the other servers will be mostly unused during off-peak periods and then able to make better use of power saving methods. Especially where load involves lots of concurrent power-consuming TCP connections, which was the case in the study, a new load-balancing algorithm resulted in an overall 30% power savings. Here's the paper (PDF)."
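For anyone who wants to see the difference the summary is describing, here is a minimal sketch in Python. It is not the paper's algorithm, just a toy comparison of even spreading against "fill one server before touching the next". The function names and the per-server capacity of 100 connections are made up for illustration.

```python
def round_robin(num_connections, num_servers):
    """Spread connections evenly; return per-server connection counts."""
    counts = [0] * num_servers
    for i in range(num_connections):
        counts[i % num_servers] += 1
    return counts


def fill_first(num_connections, num_servers, capacity):
    """Fill each server up to `capacity` before using the next one."""
    counts = [0] * num_servers
    remaining = num_connections
    for s in range(num_servers):
        take = min(capacity, remaining)
        counts[s] = take
        remaining -= take
    if remaining:
        raise ValueError("not enough capacity for the offered load")
    return counts


def idle_servers(counts):
    """Servers with zero connections are candidates for power saving."""
    return sum(1 for c in counts if c == 0)


if __name__ == "__main__":
    # Hypothetical off-peak load: 250 connections, 10 servers.
    rr = round_robin(250, 10)
    ff = fill_first(250, 10, capacity=100)
    print("round robin:", rr, "idle:", idle_servers(rr))
    print("fill first: ", ff, "idle:", idle_servers(ff))
```

With these made-up numbers, round robin keeps all ten servers lightly busy, while filling in order leaves seven of them with nothing to do.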
So if we're willing to sacrifice speed for energy savings, shouldn't we just use the bare minimum number of computers that can handle the workload without crashing?
This problem shows up in many places.
technical writing / development
I don't think that we should go down this road again - why don't we talk about religion or politics, instead?
More
For he's got a life and doesn't focus on computer stuff.
I salute you, Sir.
Smile, don't click...
Just switch them off...
If the load on your boxes is below a threshold, remove one of them from the load balance list, wait for connections to end, or migrate the processes off to another machine, and switch it off. When the load is above a certain threshold, you power on an additional node, configure it for whichever service and add it to the load balancer.
Oh come on people, you call yourselves engineers? It really isn't that difficult.
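A rough sketch of the threshold rule described above, in Python. The thresholds (0.3 and 0.8), the function names, and the load trace are assumptions picked for illustration. Draining connections and actually powering nodes on or off are left out.

```python
def desired_active(active, total, avg_load, low=0.3, high=0.8):
    """Return the new active-node count given average per-node load (0..1).

    Using two thresholds gives hysteresis, so the pool does not flap
    when the load hovers around a single cut-off value.
    """
    if avg_load > high and active < total:
        return active + 1   # power on one more node
    if avg_load < low and active > 1:
        return active - 1   # drain one node and switch it off
    return active


def simulate(offered, total, active=1, low=0.3, high=0.8):
    """Replay a demand trace (in units of one node's capacity)."""
    history = []
    for demand in offered:
        avg = min(1.0, demand / active)
        active = desired_active(active, total, avg, low, high)
        history.append(active)
    return history


if __name__ == "__main__":
    # Hypothetical trace: load ramps up, then falls off again.
    print(simulate([0.5, 1.0, 2.0, 2.0, 0.5, 0.5, 0.5], total=4))
```

The gap between the two thresholds matters: after a node is removed, the load on the remaining nodes should still sit below the upper threshold, otherwise the node gets powered straight back on.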
Deleted
Load balancing is where you actually check the load and then make an informed decision about where to allocate the work.
OK, rant over. Now back to your scheduled programming.
"It doesn't cost enough, and it makes too much sense."
Operators of multiple steam boilers have been dealing with this problem for a century. The number of boilers fired up is adjusted with demand, with the need for some demand prediction because it takes time to get steam up. This was done manually for decades; now it's often automated.
The same thing applies to multiple HVAC compressors. Usually there's a long-term round-robin switch so that the order of compressor start is rotated on a daily or weekly basis to equalize wear.
More and more, IT is becoming like stationary engineering.
We're running a no-frills OpenBSD load balancer at work. Right now, it's running Pound (the quickest thing we could get up once traffic spiked a few weeks ago), but we're considering other approaches too. haproxy's load balancing knobs look interesting. It looks like you can configure it so the maximum number of clients scales with the current load. The problem is that there's no feedback system.
Some kind of loadavg-based, or even response-time, feedback mechanism would be great! Pound has that (I believe), but since Pound requires downtime for every configuration change, we want to move away from it ASAP.
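For what it's worth, the feedback part itself is not much code. Below is a minimal sketch in Python that turns measured response times into backend weights. It does not use any Pound or haproxy interface; the backend names, the latency numbers, and the weight range of 1 to 100 are assumptions, and pushing the weights into the actual balancer is left out.

```python
def weights_from_latency(latency_ms, max_weight=100):
    """Map measured response times to integer weights.

    The fastest backend gets `max_weight`; slower ones get a weight
    inversely proportional to their response time, never below 1.
    All latencies must be positive.
    """
    fastest = min(latency_ms.values())
    return {
        name: max(1, round(max_weight * fastest / ms))
        for name, ms in latency_ms.items()
    }


if __name__ == "__main__":
    # Hypothetical measurements, e.g. from periodic health-check requests.
    measured = {"web1": 50, "web2": 100, "web3": 200}
    print(weights_from_latency(measured))
```

Run periodically, this gives a crude closed loop: a backend that slows down under load gets a smaller share of new connections until it recovers.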
more obvious statements
A cluster of computers doing a job is less efficient than a single server doing the same job. On top of that, a cluster creates more points of failure and more overhead from communication between its machines.
If you have the option to run the DB & the application on the same server, try to do so.
No comprende? Let me type that a little slower for you...
This is a very cool idea, and I don't think it will affect usability too much either. As long as the load balancer keeps tabs on system loading, via snmp or something, it can turn on/off machines based on need.
This assumes your system scales smoothly, i.e. gets proportionally slower as the load starts to exceed processing capacity. For example, a process will always take 100ms as long as there is CPU time to spare, but once the CPU gets to 100% utilization and you have to start time slicing more processes, that 100ms starts to become 150ms. The load balancer can then spin up a new server and start bringing the processing times back down.
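The 100ms to 150ms example can be written down as a very simple model. This is a sketch in Python that assumes ideal time slicing (latency stretches in proportion to how far demand exceeds capacity); real systems add queueing and context-switch overhead on top. The function name and units are made up for illustration.

```python
def response_time_ms(base_ms, demand, capacity):
    """Ideal time-slicing model.

    Below saturation the request takes `base_ms`. Once demand exceeds
    capacity, every request is stretched by the factor demand/capacity.
    `demand` and `capacity` are in the same unit, e.g. "servers' worth of CPU".
    """
    return base_ms * max(1.0, demand / capacity)


if __name__ == "__main__":
    # One server, 1.5 servers' worth of work: requests slow down.
    print(response_time_ms(100, demand=1.5, capacity=1))
    # Spin up a second server: back to the base time.
    print(response_time_ms(100, demand=1.5, capacity=2))
```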
This is an obvious solution to an obvious problem, but until now, we've just never had to examine it.
Deleted
Redundant systems are not efficient? You don't say.
Redundant systems are redundantly redundant. That's why they are robust.
This message brought to you by the Department of Redundancy Department.
Everybody knows that it should be Round Batman, it is soo much better :P
how long until
This is not a new idea. VMware is making a killing around this same concept of consolidating load from many servers onto fewer servers. People tend to forget that an idle server still uses 50% of its peak power utilization. There is a good write up here on Green Data Center
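To put numbers on the idle-power point, here is a back-of-the-envelope sketch in Python. It assumes a linear power model with idle draw at 50% of peak (the figure quoted above), a hypothetical 200 W peak per server, and that a powered-off server draws nothing, which is optimistic.

```python
def server_power(util, peak_w=200.0, idle_fraction=0.5):
    """Linear model: idle draw plus a part proportional to utilization (0..1)."""
    idle_w = peak_w * idle_fraction
    return idle_w + (peak_w - idle_w) * util


def cluster_power(utils, **kw):
    """Total draw in watts; a utilization of None means the server is off."""
    return sum(server_power(u, **kw) for u in utils if u is not None)


if __name__ == "__main__":
    # Same total work, two placements on a four-server cluster.
    spread = cluster_power([0.25, 0.25, 0.25, 0.25])
    packed = cluster_power([1.0, None, None, None])
    print(spread, packed, 1 - packed / spread)
```

Under these assumptions the spread-out cluster draws 500 W and the consolidated one 200 W for the same work, because three servers' worth of idle draw disappears.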
Colin McNamara - CCIE #18233 "The difficult we do immediately, the impossible just takes a little longer"
The general approach of stacking load efficiently to minimize power consumption is not a new concept. There's already a commercial implementation of this idea: VMware Distributed Power Management (part of DRS: http://www.vmware.com/products/vi/vc/drs.html). It will move virtual machines around to the minimum number of servers, then power down unneeded machines. When the workload increases, DPM will automatically bring the powered-down machines back online.
(This is not meant to disparage the paper, which does a good job of measuring the various trade-offs involved in power management, with a focus on network-related issues.)
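The "minimum number of servers" part is essentially bin packing. The sketch below uses first-fit decreasing, a standard heuristic, in Python. It is not a description of how DPM or DRS actually place virtual machines; the load figures and the host capacity of 100 are invented, and real placement also has to account for memory, affinity rules, and migration cost.

```python
def pack_first_fit_decreasing(vm_loads, host_capacity):
    """Place VM loads on as few hosts as the heuristic can manage.

    Returns a list of hosts, each a list of the VM loads placed on it.
    """
    hosts = []
    for load in sorted(vm_loads, reverse=True):
        if load > host_capacity:
            raise ValueError("a single VM exceeds host capacity")
        for host in hosts:
            if sum(host) + load <= host_capacity:
                host.append(load)   # fits on an already-running host
                break
        else:
            hosts.append([load])    # nothing fits: power on another host
    return hosts


if __name__ == "__main__":
    # Hypothetical VM loads as a percentage of one host's capacity.
    placement = pack_first_fit_decreasing([50, 30, 20, 40, 10, 60], 100)
    print(len(placement), placement)
```

Any host that ends up with no VMs on it is one that could be powered down until the load comes back.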
BigIP's can use round robin and use prioritizing, in other words one server receives the most connections over the others. So how is this new?
Amazing that all these discoveries can now be repeated with Green Tech phrasing & sound like they're new. Now a new discovery. Busy waits R not energy efficient. Where's my Nobel prize?
So yes, there's nothing to see here, move on!
The only thing that makes this hard is a metric of what "fully loaded" means for a server. With generators and boilers, you have a single number which represents output, and you know what the capacity of each unit is, so you know when to start up the next unit. Computer servers are more difficult to characterize.
So you have to measure some values of server load, convert that to a single number, and use it for load measurement purposes. Then it all works just like boiler scheduling.
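One way to get that single number, sketched in Python: divide each measured value by its limit and take the worst ratio, so the score says how close the server is to its tightest bottleneck. The metric names, the limits, and the 0.8 start-up threshold are all assumptions for illustration, and other ways of combining the metrics (for example a weighted sum) are possible.

```python
def load_score(metrics, limits):
    """Fraction of capacity used by the scarcest resource (1.0 = fully loaded)."""
    return max(metrics[name] / limits[name] for name in limits)


def need_another_server(scores, start_threshold=0.8):
    """Start the next unit only when every running server is near its limit."""
    return all(score >= start_threshold for score in scores)


if __name__ == "__main__":
    # Hypothetical per-server limits and one server's current readings.
    limits = {"cpu": 1.0, "connections": 50000, "net_mbps": 1000}
    server_a = {"cpu": 0.6, "connections": 45000, "net_mbps": 300}
    score = load_score(server_a, limits)
    print(score, need_another_server([score]))
```

Here the server is only at 60% CPU but at 90% of its connection limit, so connections are what decide that it counts as nearly full.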
You don't even need to do much advance planning, as you have to do with boilers and generators, since you can usually start up another server in a minute or so. It takes hours to fire up a big boiler, so you need serious prediction capability.
(The classic power company approach was a chart recorder recording system load. Every day, somebody took the day's load graph and cut out a piece of cardboard to match. The cardboard pieces were accumulated in a rack, and the result was a 3D load graph for the year. It looks like a mountain range. There's an Internet Archive film showing this. That's a worthwhile exercise for your server farm, and you can probably do it without glue and scissors today. I've seen some of Amazon's server load graphs, which have a huge peak entering the Xmas buying season. In fact, the real reason Amazon is selling "cloud computing" is that their plant is sized for the holiday season and 80% idle the rest of the year.)
Take a load off Fannie, take a load for free;
Take a load off Fannie, And (and) (and) you can put the load right on me.
I think it's probably simplistic to simply distribute a load to all cores of a CPU evenly. Although asymmetrical might be tougher, I could see a system with one low-power always-on core to deal with system requests and organization (maybe even low enough power to remain on during a suspend), one to handle all GUI threads and interact with the GPU on a private bus, a couple of normal cores to handle typical user threading, one of which doesn't come on until the first is like 50% loaded, and one or two high-speed high-power cores that run all-out when the system is plugged in and needs them for intensive processing.
It would take some targeted software design to take advantage of this, but I think we could be looking at a Moore's law style increase in power...
14 soccer moms are taking the team of 14 kids to a game. They have two options:
A. Spread the kids among all the cars, and drive all the cars (14 cars)
or
B. Fill up a car, and send off. Repeat until done. (6 cars)
What is more energy efficient?
Soccer moms have solved this without statistical analysis or engine torque curves.
don't cut it off www.mgmbill.org
Load balancing is typically used primarily for performance and scaling (active+active+active ...), not redundancy (active+passive). Better availability is more of a side-effect.
HA clustering typically involves redundant, highly resilient nodes with a high level of internal redundancy, tightly coupled, often in active/passive roles. The idea is to minimize the likelihood of failure, even given a higher cost.
Load balancing configurations typically rely on many active nodes of inexpensive hardware with minimal if any internal redundancy (power supplies, etc), so that although the likelihood of failed nodes is significant, the cost of replacing a failed node is minimal.
I prefer rogues to imbeciles because they sometimes take a rest.
"Round Robin Scheduling Not Power-Efficient when using Windows Live Messenger"
RTFA, in the abstract, "In this paper, we characterize unique properties, performance, and power models of connection servers, based on a real data trace collected from the deployed Windows Live Messenger."
The research itself appears pretty solid. I'd be interested if they publish a followup paper where the model was based off of a variety of applications which utilize round-robin, not just one.
'The tyrant will always find pretext for his tyranny.' - Aesop's Fables
Texas electric company will hate this.
Here is the solution. In the winter, run your web farm in the Northern Hemisphere. In the summer, migrate to the Southern Hemisphere. Run it in the basements of large apartment complexes. Charge for the heating. Heating oil prices are going through the roof.
- these are not the droids you are looking for -
the future belongs to parallel execution, more threads, lower clock. Today's chips will be considered overclocked. Compare a chip with a brain. More processors with lower performance each will beat a few "overclocked" processors. I'm talking about the future, not the current crappy power management.
It made me sick.
After all we've learnt about sharing server resources for desktops from the people at LTSP, HP turn around and sell a setup where each user gets their own virtual OS. During the presentation they laughed at/looked down on Linux and instead spoke about XP and Vista.
But really, how many instances of Vista could you get up and running on each box? Not very many at all I think.
So, you've got this big mother f*cking cluster serving a Vista per user who in turn run the remote client on ... a thin client? Maybe, but I'd bet most would be running Windows on their local PC too, you know, you just can't survive without multimedia and local devices in the office.
Add that to the HP and VMware licences you'd be paying.
And then add on the hardware costs. I can see why HP wants to sell the idea. One nice fat HP server / 5 users? That's good business.
Environmental issues are not important to these people.
...you set necessary goals and then find the most efficient way(s) to go about them.
OTOH I think the kind of study summarized by the Yahoo link gives science a bad name in human rights circles. In this case they treated a necessity as if it were a luxury where efficiency could become the paramount consideration. So we now know about a bit of human nature within an either/or false dichotomy (which is not very useful), plus we have the nasty suggestion that feeding everyone simply won't do from an efficiency standpoint.
If the researcher wanted to be logically consistent with the choices they offer, s/he probably should have an option for distributing food among farmers only. Letting the rest of society starve would qualify as exceedingly efficient by this study's criteria, but I suspect such options would have thrown the author's foolishness into high relief.
More smarts, I think.
Does your setup allocate ZERO connections to certain servers over some length of time, with those servers set up to reduce energy use when they have zero connections? If not, this looks like it might help.
They're claiming real-world energy efficiency gains, so it looks like it's an improvement somehow.
I would assume it's because this now adds dynamic adjustment, which could be based on total system-stack metrics of peak_load_capability, energy_minimization, acceptable_response_time, etc. Something that seems to be lacking in the current load-balancing system that you describe.
Future: .cn? Host your site out of a data center in Iceland and take advantage of cheap, midnight power!
- Allow for tuning based on diverse hardware, each with different energy and load capability profiles.
- Smart managing of a large population of these systems, based on varying load.
- Add real-time upstream energy cost data into the mix
- Dynamic scheduling of administrative tasks based on energy efficiency vs. hard deadlines.
- If energy starts becoming a significant cost of hosting, go back to selling system time based, in part, on total energy used - track CPU, disk, network energy requirements in watt-hours, by user. Add those to account plans, side-by-side with Mb/s and GB/month. Serving