DIY Carrier Grade Linux with Debian
An anonymous reader writes "Carrier Grade Linux, once the domain of big-bucks Bells and commercial software vendors, just became more attainable for universities, companies running high-availability web services, and average Linux hackers interested in learning what goes into the world's most reliable, maintainable, and available systems. The Debian project, backed by HP, has launched the Debian-Carrier Grade Linux subproject, and registered Debian-CGL with version 2.02 of the CGL spec. LinuxDevices has created a simplified version of the registration form that lets you see which Debian packages to apt-get, and which packages you'll have to download and compile outside of Debian, in order to get your own Carrier Grade Linux setup."
Debian has long been 'the example' IMHO. RedHat got all the fame and glory, but Slackware and Debian really showed what Linux should be like.
I just wish all these projects (e.g. Ubuntu) that are based on Debian would give it more credit.
do() || do_not();
Hmmm... Let me know when someone finds a "carrier-grade" carrier. I have yet to find any carrier with 5 minutes or less of downtime per year. Our current carrier is at approximately 24 hours of downtime per quarter-year.
Telcos feel they need 99.999% uptime from their equipment in order to provide you with a much lower level of service - typically 99.9% for a T1 or an analog voice line, occasionally 99.99% for a set of redundant circuits.
Our current carrier is at approximately 24 hours of downtime per quarter-year.
That's roughly 99%. If this is a T1, you should be able to do ten times better. Your SLA should provide a clause to escape your contract if it's really that bad. However, find out what the downtime is caused by - if it's local loop issues, then it's not the CLEC's fault, it's the ILEC's fault and you're still stuck with their wiring no matter who you choose. The best thing to do in that circumstance may be to demand a different physical circuit from your existing CLEC.
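For anyone who wants to check the arithmetic behind these percentages, here is a minimal sketch. The function names are made up for illustration, and it assumes an average year of 365.25 days with a quarter being one fourth of that.

```python
# Assumption: an "average" year of 365.25 days; a quarter is one fourth of it.
HOURS_PER_YEAR = 365.25 * 24


def availability(downtime_hours, period_hours):
    """Fraction of the period during which the service was up."""
    return 1.0 - downtime_hours / period_hours


def allowed_downtime_minutes(avail, period_hours=HOURS_PER_YEAR):
    """Downtime budget, in minutes, implied by an availability target."""
    return (1.0 - avail) * period_hours * 60


quarter = HOURS_PER_YEAR / 4

# The case from the comment above: 24 hours of downtime in one quarter.
print(f"24 h down per quarter: {availability(24, quarter):.2%} available")

# Yearly downtime budgets for the usual "nines".
for label, target in [("99.9%", 0.999), ("99.99%", 0.9999), ("99.999%", 0.99999)]:
    print(f"{label}: {allowed_downtime_minutes(target):.1f} minutes per year")
```

Under those assumptions, 24 hours per quarter comes out just under 99%, and five nines leaves a budget of a little over five minutes per year.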
[A]pparently "Carrier-Grade" refers to telecommunications carriers, which can typically accept no more than 30 seconds to 5 minutes of downtime per year ...
Well, if we measure this in a way comparable to the way that phone companies measure uptime, it'll mean measuring the time that the OS responds to pings. A machine that is a total zombie, with no processes making any progress, will be considered "up" if you can ping it from a nearby machine.
After all, we are all familiar with phone systems that give a dial tone (i.e., are "up") but can't make calls, or make them but don't transmit sound in one direction, or have so much noise that the speech is unintelligible. But none of these problems is considered "downtime"; the most common definition of "up" is providing a dial tone within N seconds.
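To make the difference between the two definitions of "up" concrete, here is a small sketch. The probe samples are invented for illustration, not measurements, and the field names "ping" and "service" are hypothetical.

```python
def uptime_fraction(samples, is_up):
    """Share of probe samples that count as 'up' under the given definition."""
    if not samples:
        raise ValueError("no samples")
    return sum(1 for s in samples if is_up(s)) / len(samples)


# Hypothetical probe log: each sample records whether the host answered a
# ping and whether a real request (completing a call, logging in) succeeded.
samples = [
    {"ping": True, "service": True},
    {"ping": True, "service": False},   # zombie: answers pings, does no work
    {"ping": True, "service": True},
    {"ping": False, "service": False},  # really down
]

# "Dial tone" style measurement: the box is up if it answers at all.
ping_uptime = uptime_fraction(samples, lambda s: s["ping"])

# User-visible measurement: the box is up only if real work gets done.
service_uptime = uptime_fraction(samples, lambda s: s["ping"] and s["service"])

print(f"ping-based uptime:    {ping_uptime:.0%}")
print(f"service-based uptime: {service_uptime:.0%}")
```

With these made-up samples the ping-based figure is higher than the service-based one, because the zombie sample counts as "up" under the first definition only.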
Since a recent upgrade, my wife's Mac Powerbook has repeatedly gotten into a state where it doesn't respond to any input except mouse motion. We can show it's alive by moving the mouse and watching the pointer move on the screen. But button clicks or keyboard input have no effect. I can ping it from another machine, but I can't telnet or ssh to it. The on-screen clock changes once a minute. I'm sure that Apple would consider this to be "uptime" for OS X, along the lines of the phone companies' way of measuring their 6-nines "uptime". And when we finally give up and reboot it, that's not considered "downtime" either, since it was done intentionally by the user.
Something very similar happened on my RH Linux box a year or so back. But I can't replicate it.
Those who do study history are doomed to stand helplessly by while everyone else repeats it.
Seems like a good idea at first, but if you have 5-30 minutes of downtime per year, that means one very quick kernel patch per year. If you are really concerned about uptime, applying patches in a timely fashion is just as important as hardening the system to start with.
Obviously, starting with solid, proven code should mean fewer patches are needed, but nobody is perfect, and what about new functionality?
That kind of uptime is IMHO more a function of your hosting environment and the hardware you choose; this is going to be a waste of time for anyone but the carriers who can afford it. You would do better to have multiple servers in separate locations, a nifty routing/caching setup, and a sensible development/production regime.
Still, it's a nice stick to beat Microsoft with, even if it is a bit too bendy.