AMD Demos Live Migration Across Three Opterons
bigwophh writes "Advanced Micro Devices has just revealed to the public the first video and images demonstrating live migration across three generations of AMD Opteron processors on VMware ESX 3.5, including the six-core AMD Opteron processor, often referred to as 'Istanbul.' For those unaware of the strains in a server environment, live migration of virtual machines across physical servers is crucial to providing flexibility for managing data centers. AMD is also taking this opportunity to highlight its continued, cooperative development efforts with Microsoft as evidenced in Windows Server 2008 R2 Hyper-V, which just also happens to be available today in beta form, that adds support for AMD-V technology with Rapid Virtualization Indexing."
VMware ESX has been able to do live migrations for a while now. I'm not sure what makes the Opteron special in this regard.
They can call me when they've demonstrated seamless live migration between Intel and AMD chips, not just generations of their own hardware. Nobody wants to build a large-scale cloud if they're going to be locked to one vendor forever once they get started.
I was curious how they migrate active network connections though. Does the old host act as a proxy/router? Can anyone shed some light?
It's been around 9 years since I did the project in college, but it is possible to transfer active network connection state from one computer to another. It was a "connection-aware seamless backup server", where our hacked Linux kernels would exchange state about an active TCP/IP connection at regular intervals, and when the primary dropped (in our demo we yanked the Ethernet cable out of the hub in the middle of streaming an MP3), the other would take over, pretending to be the same IP and picking up where the other left off. The best part was that TCP/IP already deals with redundant, missing, or out-of-order packets, so anything sent or received since the last update to the backup would be handled automagically.
That was just a stupid semester project on a 3-computer Ethernet LAN, but I imagine the big boys have figured out how to make it work. Besides, they're literally transferring the memory image of the guest OS over to the other machine, so all the state update is already done. The hard part is probably making the IP address migrate along with it, but I'm sure they've figured that out too.
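For anyone who wants to see why the periodic state exchange is enough, here is a toy sketch in Python. It is not the project's kernel code, and every name in it (ConnState, Receiver, serve) is made up for illustration. It assumes the only state worth syncing is the next byte offset to send, and it models the client's TCP stack as something that silently drops bytes it has already seen.

```python
from dataclasses import dataclass


@dataclass
class ConnState:
    """Hypothetical per-connection state a server would sync to its backup."""
    next_seq: int = 0  # offset of the next byte the server intends to send


class Receiver:
    """Client side: keeps only unseen bytes, like TCP's sequence-number check."""

    def __init__(self):
        self.data = bytearray()

    def deliver(self, seq, payload):
        have = len(self.data)
        if seq > have:
            return  # gap: a real stack would wait for a retransmission
        # Skip whatever prefix of the segment overlaps bytes we already hold.
        self.data += payload[have - seq:]


def serve(source, receiver, state, chunk=4, fail_after=None,
          sync_every=3, backup_state=None):
    """Stream `source` from state.next_seq onward.

    Returns False if the (simulated) server dies after `fail_after` segments,
    True if the whole stream was sent.
    """
    sent = 0
    while state.next_seq < len(source):
        if fail_after is not None and sent == fail_after:
            return False  # the cable got yanked
        segment = source[state.next_seq:state.next_seq + chunk]
        receiver.deliver(state.next_seq, segment)
        state.next_seq += len(segment)
        sent += 1
        # Checkpoint to the backup only every few segments, not every one.
        if backup_state is not None and sent % sync_every == 0:
            backup_state.next_seq = state.next_seq
    return True
```

If the primary dies after five segments with a checkpoint every three, the backup resumes from a stale offset and re-sends a couple of segments. The receiver throws the duplicates away, so the stream still comes out intact.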
Nothing that complex. It simply passes the TCP state from one NIC to the other. I don't know the gory details of how that all happens behind the scenes, but it's pretty impressive to watch it happen - in our lab at work we tested this functionality on multiple guest OSes simultaneously and didn't lose a single packet, even when pulling the power cords out of a number of components (including one of the ESX servers).
That was just a stupid semester project on a 3-computer Ethernet LAN, but I imagine the big boys have figured out how to make it work
Check Point's FireWall-1 product has been able to transfer the current firewall state to a backup firewall for quite some time (at least 1998 or so), but it originally required the use of RIP to implement failover, which introduced delays. In about 1999, Stonesoft produced an add-on product, StoneBeat, which added ARP spoofing, so the failover was virtually instantaneous. These days, the failover is done using VSRP.
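To make the ARP trick concrete: the usual way for a standby box to grab an IP is to broadcast a gratuitous ARP announcing "this IP now lives at my MAC". I don't know what StoneBeat actually put on the wire, so treat the following Python as a sketch of the generic technique only. The function name gratuitous_arp is made up, and the MAC and IP in the usage line are placeholders. It only builds the frame; sending it would need a raw socket and root.

```python
import socket
import struct


def gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request.

    The sender and target protocol addresses are both `ip`, so every host
    on the segment updates its ARP cache to map `ip` to `mac`.
    """
    hw = bytes.fromhex(mac.replace(":", ""))
    addr = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our MAC, EtherType 0x0806 (ARP).
    eth = b"\xff" * 6 + hw + struct.pack("!H", 0x0806)

    # ARP header: Ethernet (1), IPv4 (0x0800), lengths 6 and 4, opcode 1.
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += hw + addr          # sender hardware / protocol address
    arp += b"\x00" * 6 + addr  # target hardware unknown, target IP = our IP

    return eth + arp
```

Usage would be something like `frame = gratuitous_arp("02:00:00:aa:bb:cc", "192.0.2.10")`. Because switches and neighbours relearn the mapping as soon as they see the frame, there is no routing protocol convergence to wait for, which fits the "virtually instantaneous" failover described above.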
As far as the NICs are concerned, it's all just Ethernet packets. There is no state to handle. The state is maintained by the guest OS, and is transferred when the OS memory is transferred.
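A rough illustration of "the state goes with the memory": the textbook way to move a running guest is iterative pre-copy, where you copy every page while the guest keeps running, re-send whatever it dirtied in the meantime, and pause it only for the last few pages. Whether ESX does exactly this is my assumption, not something stated in this thread. The names live_migrate and shrinking_writer are hypothetical, and a Python list stands in for guest RAM.

```python
def live_migrate(src, dirty_after_round, max_rounds=10, stop_threshold=2):
    """Copy `src` pages to a new host while a simulated guest keeps writing.

    `dirty_after_round(round_no, src)` mutates `src` and returns the set of
    page indexes the guest touched during that copy round.
    Returns the destination memory and the number of pre-copy rounds used.
    """
    dst = [None] * len(src)
    to_send = set(range(len(src)))  # first round sends everything
    rounds = 0
    while len(to_send) > stop_threshold and rounds < max_rounds:
        for i in to_send:
            dst[i] = src[i]
        rounds += 1
        # Pages written while we were copying must be sent again.
        to_send = dirty_after_round(rounds, src)
    # Stop-and-copy: the guest is paused here, so nothing changes under us.
    for i in to_send:
        dst[i] = src[i]
    return dst, rounds


def shrinking_writer(round_no, mem):
    """Toy guest whose working set halves each round: 4 pages, then 2, ..."""
    touched = set(range(8 >> round_no))
    for i in touched:
        mem[i] += 100
    return touched
```

Running `live_migrate(list(range(16)), shrinking_writer)` finishes after two pre-copy rounds with the destination identical to the source. The guest's TCP control blocks are just more pages in that list, which is why nothing connection-specific has to be handled separately.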
Supposedly Double-Take software is good at "mirroring" one box with another... and then when box1 dies box2 very quickly takes over, stealing the IP, etc. I have the software, but I haven't had the balls or the weekend time to test it for real yet.