Writing High-Availability Services?

Posted by Cliff on Tuesday April 15, 2003 @10:23AM from the keeping-your-services-from-falling-over dept.

bigattichouse asks: "I have a project coming up that will require some serious load capabilities accepting socket connections. While I have a design that can be distributed over multiple servers (using queued reads/writes to the DB) and is as low-overhead as I can make it, I am concerned about falling into common problems that have already been overcome in many other projects. What strategies (threading, forking, etc.) give the best capability? What common pitfalls should I avoid?"

2 of 21 comments

Beware slow connections by linuxwrangler · 2003-04-15 10:42 · Score: 4, Informative

In a former job we totally hammered an app on our internal LAN and got many times the request rate we would need in the real world.

Fat, dumb, and happy, we figured that the real world couldn't hammer us as hard as we could internally. Wrong! Slow connections tie up connection resources much longer than connections on an internal network, where the response can be created and dispensed with almost instantly.

Maintaining all those simultaneous connections depleted our resources and the app went into full meltdown mere seconds after being released on the public servers.

We beat a hasty retreat to the old code, licked our wounds, and learned a valuable lesson.
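
The effect described above can be estimated with simple arithmetic (Little's law): the average number of connections open at once is the arrival rate multiplied by how long each connection stays open. The sketch below is only an illustration; the function name and every number in it are hypothetical, not figures from the incident.

```python
def connections_in_flight(requests_per_second, ms_per_connection):
    """Average number of simultaneously open connections (Little's law).

    Both arguments are illustrative inputs, not measurements.
    """
    return requests_per_second * ms_per_connection / 1000


# Hypothetical LAN load test: very high request rate, but each
# response is delivered in about 10 ms.
lan = connections_in_flight(500, 10)

# Hypothetical public traffic: a tenth of the request rate, but slow
# clients keep each connection open for about 10 seconds.
public = connections_in_flight(50, 10_000)

print(f"LAN test:   {lan:.0f} connections open at once")
print(f"Real world: {public:.0f} connections open at once")
```

With these made-up numbers the public site sees a tenth of the request rate yet holds a hundred times as many connections open, which is why a load test should include deliberately slow clients.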

--

~~~~~~~
"You are not remembered for doing what is expected of you." - Atul Chitnis

The C10K problem by Panoramix · 2003-04-15 10:55 · Score: 4, Informative

You probably know about this paper already, but just in case you don't:

The C10K problem

The paper deals with web servers handling ten thousand simultaneous TCP connections. Most of it is not specific to HTTP or the web, though; it covers general socket I/O, particularly the ways of dealing with readiness/error notifications (e.g. select(), poll(), asynchronous signals, etc.). It also discusses other kinds of limits (threads, processes, descriptors).

It is quite enlightening. It may be a bit outdated (I remember reading it around the time Mindcraft was making all that noise about Windows being faster than Linux as a web server), but I'm sure most of it is still very relevant.
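
For anyone who hasn't seen the readiness-notification style, here is a minimal single-threaded sketch using Python's standard `selectors` module, which wraps select()/poll()/epoll/kqueue depending on the platform. It is not taken from the paper; the helper names (`make_listener`, `accept`, `echo`, `serve_once`) are made up for illustration, and it assumes a trivial echo protocol with small messages.

```python
import selectors
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a non-blocking listening socket (port 0 = let the OS pick)."""
    listener = socket.socket()
    listener.bind((host, port))
    listener.listen()
    listener.setblocking(False)
    return listener


def accept(sel, listener):
    """Listener is readable: accept the new connection and watch it too."""
    conn, _addr = listener.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, echo)


def echo(sel, conn):
    """Connection is readable: echo the data back, or clean up on EOF."""
    data = conn.recv(4096)
    if data:
        # Sketch only: assumes the whole reply fits in the send buffer.
        # A real server would queue unsent bytes and wait for EVENT_WRITE.
        conn.send(data)
    else:
        sel.unregister(conn)
        conn.close()


def serve_once(sel, timeout=1.0):
    """Wait for ready sockets, dispatch each one, return how many fired."""
    events = sel.select(timeout)
    for key, _mask in events:
        handler = key.data  # the callback stored at register() time
        handler(sel, key.fileobj)
    return len(events)


# Typical use (not run here):
#   sel = selectors.DefaultSelector()
#   sel.register(make_listener(port=9000), selectors.EVENT_READ, accept)
#   while True:
#       serve_once(sel)
```

One thread watches every socket and only touches the ones the kernel reports as ready, so an idle or slow connection costs a descriptor and a little memory rather than a whole thread or process.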