Motorola Releases HA Linux
A reader sent us word that there's been yet another entry in the Linux Distro Population Index. Yes, Motorola has released a distro they are calling High Availability Linux. It's released for x86 and PowerPC platforms and is intended for embedded systems that need "99.999% uptime". They've also released details on their Web site about the system. Their main target is telecom development, according to their press release.
I think it's great that the giant Motorola/Micro$oft techno-machine is about to come toppling down with the aid of one of the players! Does anybody have any info on Micro$oft'$ reaction?!
Since when has there been a Motorola/Microsoft techno-machine? Sure, Motorola uses Microsoft technology in some of their products (like most major technology companies), but they also support a number of competing technologies such as:
Psion (Symbian) - competes against Windows CE for the PDA market
MacOS X - Motorola's biggest external semiconductor customer is Apple
Linux - supported for embedded applications
While not on unfriendly terms with the folks in Redmond, Motorola is definitely not as tied to them as Intel, AMD or any of the x86 chip makers!
Oh shit, that's why I'm already running Linux.
This is the proof that Linux is ready for the major leagues. Telephone systems provided by a major company are running Linux with a promise of more than 99% uptime! This is very good news indeed. Next we will see companies (big ones, with names we all recognize) deciding to use it in-house for their data processing applications. Not because it is less expensive but because it is best in class. The IS types will look at the Motorola release as proof that Linux is indeed "good enough" for them.
Hmmm. I suppose that's true. As viewed from Earth, Jupiter (or Saturn, in the book) -is- up. So, by definition, HAL's clockchip -is- an uptime. :)
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
There are a few reasons. The simplest is that they are generally selling a metered service and don't make any money during downtime. Depending on the bandwidth of the link, and the price of the service, this can be thousands or even hundreds of thousands of dollars a minute (this is especially true of under-sea cables: see Neal Stephenson's article Mother Earth, Mother Board).
Second, these are industries that are or have been highly regulated, and while overregulation breeds inefficiency, at least part of that inefficiency is overengineering.
Finally, while Joe Average doesn't care too much about a busy signal for 5 minutes, big customers do and have quality of service and availability guarantees built into the contract. These customers will want to see that appropriate hardware is being used, since no matter how good the contract, it is unlikely to cover actual customer losses caused by an extended downtime, say for a financial institution.
As you point out, however, it doesn't necessarily make sense to route all calls over this kind of network, which is why voice over IP is growing more popular. It is cheaper because there are fewer guarantees for availability and quality of service.
--
"L'IT c'est moi!"
Agreed. HA solutions are mostly a matter of money. Specify your required amount of downtime, and then throw enough money at the problem until that target is reached. From the base level, you start by adding redundant storage, and move up, adding N+1 power, memory, processors etc., and then start into clusters with hot-spares and finally distributed clusters. By the time you're reaching that point, though, it's costing you a lot of money, and you have to weigh up whether the potential losses from downtime justify the amount you need to spend to guarantee the uptime.
Of course, to do it properly, you need to have kernel support, and at the five nines level and above, probably hardware support as well. For most, though, a software-only solution will be more than adequate, and will provide a suitable balance between cost and reliability demands.
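The "add redundancy until the target is reached" idea can be put into rough numbers. The sketch below is only an illustration, not anything from Motorola: it assumes each redundant unit fails independently (real systems often share power, software bugs, or operators, so this is optimistic), and the function names are made up for the example.

```python
def combined_availability(unit_availability: float, units: int) -> float:
    """Availability of `units` redundant parts when any single working part
    is enough. Assumes failures are independent (an optimistic assumption)."""
    if not 0 < unit_availability < 1:
        raise ValueError("unit_availability must be strictly between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    # The group is down only when every unit is down at the same time.
    return 1 - (1 - unit_availability) ** units


def units_needed(unit_availability: float, target: float) -> int:
    """Smallest number of redundant units that reaches the target availability."""
    if not 0 < target < 1:
        raise ValueError("target must be strictly between 0 and 1")
    units = 1
    while combined_availability(unit_availability, units) < target:
        units += 1
    return units


if __name__ == "__main__":
    # Hypothetical 99%-available boxes aiming for five nines.
    for n in range(1, 4):
        print(f"{n} unit(s): {combined_availability(0.99, n):.6f}")
    print("units needed for 99.999%:", units_needed(0.99, 0.99999))
```

Under these assumptions each extra unit multiplies the cost but also multiplies the "nines", which is the money-versus-downtime trade-off described above.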
"The invisible and the non-existent look very much alike." -- Delos B. McKown
Now that it has hot swappable CPUs you won't be able to shut it down when it goes psycho and kills everybody!
What are you doing, Dave?
I'm pulling your processors so you won't kill me!
I'm sorry, Dave, but my HA Linux system allows me to hot swap CPUs - that won't save you, Dave...
=tkk 8)
Bill Gates - Creationist?!?
For some obscure reason, this sounds like an OS designed for men with vests and Harley-Davidsons...
That's almost like plugging in an ISA card while the computer is compiling a kernel. *shiver* I actually pulled an ISA NIC card out of a running computer by accident once. The computer locked up, but I was astounded that after a reboot, everything worked fine (Including the NIC!)
Next they're gonna tell me the machine supports direct-circuit contact Water Cooling. (RUN! ACK!)
-- Give him Head? Be a Beacon? :P)
(If you can't figure out how to E-Mail me, Don't.
How do they handle this; do you have to take the CPU offline first, or does it survive a CPU crash?
In other words, is it like Sun's servers where a CPU crash means a reboot, or like Reliant's servers where a CPU crash means a delay of a couple of milliseconds and a service call?
I would think there would have to be hardware involved, not just software, for the latter.
They need their own distribution to maintain configuration control over the software. Any operating system that is expected to maintain 99.999% availability requires extensive testing and iron-clad configuration control. All changes must be justified, the new/modified code must be reviewed, tested and integrated into a new build. The new build must be extensively tested before it is released into production. Building fault-tolerant hardware is relatively easy. Getting the same reliability out of the software is much more difficult.
Mea navis aericumbens anguillis abundat
Many American telephone companies are pioneering the provision of second-rate service. It's the bit about lower prices that they have trouble with.
Mea navis aericumbens anguillis abundat
You hit the nail on the head there... guess I hadn't given enough thought to what a distribution for an embedded system really means.
However, I have to disagree on one thing. If you're doing an embedded system, then you should be doing your OWN distro... not use Motorola's... since an embedded system can have anywhere from 10 KB up to 128 MB of flash memory... Motorola should only give the necessary guidelines to build an embedded Linux system... provide GPL'ed drivers, and leave the rest to the developers.
When did Motorola buy Metrowerks?
LK
"Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
In the case of Motorola they were actually correct in releasing a completely separate distribution. Since this is a very specialized version of Linux, it simply makes sense to package it differently. It also makes life easier for people who actually want to use it, since there is a lot less to worry about than there would be if they had to rebuild a standard distribution.
Contrary to what many people have said, I think it is a good thing that hardware manufacturers are building their own distributions and kernels for Linux. Thinking that there ought to be a single Linux distro that should work on all systems equally well means it will work equally crappy on all systems. Every processor and chipset is a different beast and should have customized code to run it at its highest level of efficiency. A single kernel recompiled to run on different types of processors doesn't make a lot of sense to me because some processors and chipsets behave so differently. What I would not like to see is the fragmentation that UNIX suffered between the versions the colleges were releasing and the versions Bell Labs continued to work on. There are already arguments about which is a better distro (which is two people arguing that the sky is blue); what isn't needed is different design philosophies.
I'm a loner Dottie, a Rebel.
Actually, the x86 version is Red Hat-based, while the PPC version is LinuxPPC-based. So, you can expect that any software running on either of those systems will work fine on the Motorola distribution.
I love seeing new, specialized Linux distributions which excel in their niches. What I hate is the "produce another general-purpose distro" idea, and I can't wait until some of them shake out.
--JRZ
yeah.. they're here (http://www.mcg.mot.com/cfm/templates/swdetail.cfm?PageID=682&PageTypeID=10&SoftwareID=6). ....
How is this any different from a Sun E10K or some mainframes and Crays (amongst others) that support the ability to dynamically add/remove not only CPUs, but memory and cards? Even most new Intel servers (and NT) support at least some ability to dynamically add/remove PCI cards.
As the news article states, Motorola plans to begin shipping HA Linux in May.
I think they were single machines, servicing different areas of the country. 2 went down, leaving one machine to service the whole country, which naturally struggled under the weight. It was a software problem in the other two machines, AFAIK, and the machines were down for many many hours.
Forgive me if I sound like a hopeless math luser, but how did you come to this number? 99.999% uptime tells me nothing. 99.999% of *what*? A year? A day? A nano-second?
The number he arrived at was using the time period of a year.
99.999% is rather general, but it's supposed to be. You want the product to spend 99.999% of the time it is supposed to be in use in a running state. If you're designing software that will only be run for an hour once a week, then having the software not work for that hour, even once, will pretty much guarantee the software isn't 99.999% because of how long it takes to make up the downtime. How long? One hour of downtime requires about 100,000 hours, or roughly 11.4 years, of uptime to balance out.
It's EXTREMELY hard to do, because upgrades, maintenance, and even failures all have to be handled without the software going down.
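For anyone who wants to check the arithmetic, here is a small sketch. It assumes the measurement period is a year of continuous operation (365.25 days, to average in leap years); the function names are invented for this example.

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours, averaging in leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes of downtime allowed per year at the given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * HOURS_PER_YEAR * 60


def years_to_absorb_outage(outage_hours: float, availability_percent: float) -> float:
    """Years of total running time over which an outage of the given
    length still averages out to the target availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return outage_hours / unavailable_fraction / HOURS_PER_YEAR


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(target)
        print(f"{target}% uptime allows {minutes:.1f} minutes of downtime per year")
    years = years_to_absorb_outage(1, 99.999)
    print(f"A 1-hour outage at 99.999% takes {years:.1f} years to average out")
```

Five nines works out to a bit over five minutes a year, while three nines allows closer to nine hours.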
---
"You know your god is man-made when he hates all the same people you do."
I work on a database application which catalogs circuit designs for an RBOC's 911 and FAA circuits; some of these systems are so important their data is routed over four redundant trunk groups--you can have three simultaneous circuit failures and the data will still get through. If it doesn't, well, in an air traffic corridor where planes less than a mile apart are closing on each other at over a thousand miles an hour, five minutes without radar data is an amazingly long period of time, odd/even thousands be damned.
Telephone switches themselves are astonishingly robust pieces of equipment, as has been pointed out above. They are designed to handle tens of thousands of simultaneous connections and dynamically shunt traffic from overloaded or unavailable trunk groups. If a switch crashes, which happens once every three or four years, it can reboot within twelve seconds and existing calls aren't interrupted. New attempts made during those seconds are quietly rerouted somewhere else (sometimes during the last second of boot it just pretends to be ringing the line) and you'd never know the thing went down.
It ain't just phone calls; it's stuff where five minutes of down time could cause a catastrophe.
--
This is not my sandwich.
Whoo hoo! Hot swappable CPUs! Here goes!
# umount-eject /proc
This equates to about 5 minutes of downtime a year, which is not too bad.
:)
Just for the curious,
--
M
This craziness needs to stop!!
Now, I think it is great that Linux is coming out all over the place and runs on just about every CPU and architecture out there. However, all of these variant distros are going to have a serious impact on software development.
We see every day on Freshmeat that this package or that package has been fixed to work with "Insert Distro Here". There is a fix for Debian this day, Red Hat on another, Slackware later on the second day. Talk about a nightmare for anyone who creates an app with a moderate amount of interest.
---
Just my 2 cents. Oh wait, my thoughts ain't worth even that much in the end.
The Motorola page is down now (slashdotted), but when I saw it a second ago, it looked like they were pushing *TWO* different distros. One distro was three-nines availability for embedded equipment, and the other was 5-nines availability for HA equipment.
Well, you can get the EGCS/GNU tools with the CodeWarrior IDE for Linux from Metrowerks. Annoyingly, there are separate Red Hat and SuSE versions.
I'd like to see the source code for this High Availability distro, to diff against others. Can't find it on their web site, after some looking. They have some downloads, but they're just patches for PCI bus support.
... then why did they put CodeWarrior 5 for Linux on hold?
Sure, their distro is for the embedded market and CodeWarrior is for the desktop, but Linux is Linux..
Ordinarily, I'd agree, *BUT..
This is a very hardware-based solution, and hence requires a very specific distro. This isn't for running WordPerfect and Quake III. This is high availability. Granted, if they stuck with a given distro's standards, that'd be nice, but this is a VERY specific solution..
-- I'm the root of all that's evil, but you can call me cookie..
Secondly, how usable is their HA code? I know that Sun's HA code is buggy and phenomenally unreliable to the point of being completely useless.
Lastly, what (if any) code of their own are they going to release?
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
-- Are you an EFF member yet?
With major hardware makers and software companies wanting to release their own distributions, I think we should really call again for a Linux "standard" distribution, in order to make things such as Oracle distributions possible.
Otherwise we risk going the way of the UNIX splits, where basically each UNIX is its own platform, and this is going to scare people off real quick.
I'm all for innovation and stuff, but not at the cost of losing my favorite current platform. This is why I think SGI did the right thing by creating a Linux Powerpack that comes on top of a Red Hat distribution. It's less work for them (they don't have to take care of the whole distribution) and it's more reassuring for the users and the software makers, who don't have to worry too much about yet another distribution.
Hey, I've hot-plugged ISA cards before.. Just make sure no software is expecting to use it, the IRQ is clear, engage the ground contacts first, and cross your fingers.
It really freaks PC people out when you yank cards out of a working system, doesn't it? During the course of testing a system, we beat on the chassis and cards with our fists, wiggle and pull on the connectors, bodycheck the entire bay of equipment, pull each card, reset the processor, interrupt any signals that are supposed to be protected, and if at any time traffic fails for more than 50ms, we have to go back and figure out what went wrong. Computer users just don't understand it.
It's difficult, to say the least, to get people to understand that when we apply power to something, it never goes down, for any reason, until it's obsolete and ready to be removed.
Do you have an example, or some good way of explaining this sort of stuff to people outside the industry? I seem to get blank stares when I describe telco availability standards to PC people.
Sorry I don't have anything serious to add to this discussion. But how about some possible advertisements for HA Linux:
* * * *
"Argh! The Blue Screen of Death again! Where can I get an OS with 99.999% uptime? aHA! HA Linux."
Or
"Laugh at the competition: HA HA HA. Use HA Linux."
Or
"After seeing 2001 Space Odyssey: So that's what H.A.L. stands for: High Availability Linux"
Now back to your regularly scheduled posts.
***********************************************
"I did try to found a heresy of my own; and when
I had put the last touches to it, I discovered
that it was orthodoxy." G.K. Chesterton in
Orthodoxy
***********************************************
I know this may be moderated down, supposedly :) because it's "redundant", but it's something that has to be known.
All right, Motorola does their own distro...
is it Debian or Red Hat based? Or none of those?
What will happen if you want to work with embedded controllers, using Motorola chips, but don't want to use their own distro? Will you lose technical support?
IMHO, I think hardware manufacturers should just test distros that work correctly with their hardware, issue a hardware driver patch to the kernel so Alan and his gang can merge it into the general kernel, and be done with it. That way, in case anyone is using another XYZ distro and wants to work with Motorola embedded chips, Motorola simply says "Just patch the kernel... the patch is available right here!" And, if they don't know how to do that, they shouldn't be programming an embedded controller, for Christ's sake!
Hardware manufacturers should NOT create their own distros... I repeat... Hardware manufacturers should NOT create their own distros... just release a kernel patch, binary driver module, or whatever is best for them. It's the best solution for them (they don't have to manage their own distro), and for everybody else (just download the patch, patch the kernel, recompile it, reboot, and voilà! You're done)
If I hurt somebody's feelings, sorry... had to calm down my temper.
theonion.com releases Ha-Ha Linux, just because they could.
--
+&x
Hot-Plug CPUs? Jesus, that's scary. I'd be afraid to do it, even if I had a Motorola technician walk me through it. It's just so...so WRONG.
;-)
No, this is just so COOL, if you've ever watched it done.
This is a mandatory test for TELCO equipment; yank the ACTIVE processor card out of its slot and make sure the inactive side took over, and correctly noted the event. That's with an 80% of maximum traffic load applied.
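As a toy illustration of what that test checks (the standby takes over and the event gets logged), here is a minimal sketch. It is purely a hypothetical model: the class, the card names "A" and "B", and the log format are invented, and real equipment does this in hardware and firmware with traffic flowing.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProcessorPair:
    """Hypothetical model of a redundant processor pair: one active, one standby."""
    active: str = "A"
    standby: Optional[str] = "B"
    event_log: List[str] = field(default_factory=list)

    def pull_active(self) -> None:
        """Simulate yanking the active card: the standby must take over
        and the event must be recorded."""
        if self.standby is None:
            raise RuntimeError("no standby card left: service would be lost")
        removed = self.active
        # Promote the standby; there is no spare left until a card is reinserted.
        self.active, self.standby = self.standby, None
        self.event_log.append(f"card {removed} removed; card {self.active} took over")


if __name__ == "__main__":
    pair = ProcessorPair()
    pair.pull_active()
    print("active card:", pair.active)
    print("events:", pair.event_log)
```

Pulling a second card in this model raises an error, which is the software equivalent of failing the test.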
When you're dealing with ENTERPRISE class equipment and service levels, you don't reboot even to upgrade the kernel. It stays live all the time. Probably make a killer e-commerce (or portal) server, as well.
In the TELCO environment, 99.999% uptime means just that. Too much time outside of normal operation and you're writing inch-thick reports to the FCC.
And believe you me; there's a huge difference between 99.9% (just reboot and get back to work) and 99.999%.
A new kind of meat designed to appeal to vegetarians.
Hardware manufacturers should not create their own PROPRIETARY distributions. Hardware manufacturers should not create their own PROPRIETARY distributions. If the modifications are appropriately GPLed, and later merged into other distributions, then great! Motorola, however, can't wait around while other companies (which may have interests opposed to Motorola's -- these things happen) kowtow to Motorola's demands and release what Motorola wants to exist right now. Hopefully these modifications will make it into the general kernel, but one of the chief (for businesses) benefits of GPLed software is they don't have to wait around for the OS manufacturer (*cough* Microsoft *cough*) to do what needs to be done now.
You make it seem like managing one's own distro is harder than trying to manage someone else's distro. You also make it seem like achieving these levels of uptime is just a matter of inserting a new driver. I'd suggest otherwise in this particular case.
And using reverse psychology like yours on moderators, while effective, is beneath my contempt. No need (+1 Insightful) to insert (+1 Informative) subliminal messages (+1 Funny) in my posts. No sir-ee.
"If one is really a superior person, the fact is likely to leak out without too much assistance" -- John Andrew Holmes
Version 9000 of High Availability Linux has some incredible uptime features.
HAL-9000# shutdown -r now
I'm sorry root, I can't do that.
George
HA Linux provides:
And this Linux seems destined for the telco market, designed to run in telecom systems that require very high uptime (carrier grade networking etc). After 2 of the 3 computers that service 0800 and 0845 etc numbers in the UK crashed at the same time a couple of weeks ago, this uptime is required.
I've been waiting for this... TTC has been using Solaris in their Centest offerings for a while, and their TestPad 2000 series products actually run DOS and Windows(!). These are only test instruments, so their accuracy and ease of use are more important than uptime; customer traffic isn't affected.
Nortel runs HP-UX in some of their transport equipment, but again it's a non-service-affecting application. Failure of the overhead processor means that performance monitoring and protection switching are lost, but it doesn't immediately affect traffic. I don't know what the DMS-series switches run at the core, but the user interface looks the same as on their TransportNodes.
Tellabs runs their Titan series cross-connect systems on PowerPC processors. As in the Nortel equipment, the traffic itself is carried on dumb electronics; loss of the processor only affects fault recovery, system provisioning, and performance monitoring.
So far, nobody's using Linux for mission-critical stuff, processing customer calls in real-time. This is probably about to change! Slashdot readers know that Linux is more stable than the average desktop OS. But most people don't realize the extreme requirements of the telecom industry.
For instance: When a tornado ripped the roof off a central office and half the switch was soaked, the parts which weren't physically destroyed by water kept running.
This is an industry where there's (hopefully) no such thing as downtime. I've been in offices where data circuits have been functioning continuously since before I was born. A few bit errors here and there due to the occasional lightning strike, but no real interruptions. From the switches that actually handle your calls, to the transport systems that move data from one office to another, everything has backups. Commercial power fails? No problem, the office runs on batteries anyway. They go from charging to discharging, and you've got 12 hours to get the diesel generator running in case it doesn't start itself. After that, you've got a week's worth of fuel in an underground tank. Let's say some knucklehead throws a wrench into a power board. Instant pink slip, but the customers never know, because everything has two power feeds. Down to the individual card level, every circuit in a piece of telecom equipment has a backup that takes over in the event of a failure.
In the PC world, RAID comes close to this level of reliability in terms of a drive failure, but how many of them can give you access to your data even if a controller or bus fails?
Is your desktop box ready for this?
HA Linux IS a significant development. I haven't had a chance to check out the specs yet, (Slashdotted -- how's that for availability?) but from the quick blurb here, I can say that this will seriously change some things in the carrier market. Your ESS or DMS or EWSD might not run Linux any time soon, but some enormous routers and call-processing systems might.