MBONE for Software Distribution?
Warren Vosper asks: "As I sit here twiddling my thumbs, waiting for the RedHat mirror sites
to finish pulling down RH7, I ponder the need for this. Why can't we
use the MBONE to update the mirrors? I could satisfy my burning
need for instant gratification *so* much sooner. Hell, why couldn't
I tune in to an MBONE broadcast from RedHat and get it at the
same time as the mirror sites? As I looked over the ancient (5-6 years
ago) online info regarding MBONE I understand that it's used mostly for video and audio, but why not software distribution?"
Just a couple of points. You were using multicasting, not MBONE. There's a difference: one is a protocol, the other is a network established to test large-scale multicasting.
And secondly, Ghost killed your network because your switch is dumb, misconfigured or both. In order for multicasting to work well on a switched network, the switch has to listen in on the multicast group membership traffic (usually referred to as IGMP snooping) to determine which ports are part of the multicast. If your switch is unable to do this, whether due to design deficiency or misconfiguration, it floods the multicast out of every port and the session devolves into a bandwidth-sucking broadcast storm.
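For reference, a host joins a multicast group by asking its IP stack to add a membership, and that is what makes the host send the IGMP report a snooping switch listens for. Here is a minimal Python sketch of the receiver side. The group address 239.1.2.3 and port 5007 are made up for illustration and do not come from this thread.

```python
import socket
import struct

GROUP = "239.1.2.3"  # hypothetical group address, chosen for illustration
PORT = 5007          # hypothetical UDP port, chosen for illustration

def make_mreq(group, iface="0.0.0.0"):
    """Build the 8-byte ip_mreq structure: group address + local interface."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface))

def join_group(group=GROUP, port=PORT):
    """Open a UDP socket and join the group (the OS then sends the IGMP report)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, make_mreq(group))
    return sock
```

A switch that does IGMP snooping sees the join and forwards the group's traffic only to that port. A switch that does not will typically flood it everywhere.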
Fcast seems way more complicated than it needs to be.
Say that you have a file that you want to send to a lot of people. These people are going to want to get the file as fast as possible, but they are also all going to have differing speed connections.
Now, as the sender of the file, we would like to minimize the number of packets we send, but we don't have to ensure that we only send each packet once, we just need to be better than sending every packet once to every recipient.
So, instead of using one multicast channel, use a bunch. Each channel broadcasts at some lowest common denominator speed which can be picked based on your intended recipients' networks (if you don't know, assume 14.4kbps or something like that). Then, compute the time it will take to transmit the entire file at that speed. Time shift each channel by channelNumber*totalTime/numberOfChannels and start broadcasting all of them continuously, at the prechosen speed.
Now, as a receiver, you know how much bandwidth the sender has to you (or at least you can figure it out). Simply subscribe to the largest number of channels you can without dropped packets going over some threshold. You might get some duplicate packets from wraparound between the beginning and ending of the transmission, but those can be tossed.
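To make the time shifting concrete, here is a rough Python sketch. The function names are mine, and I'm assuming each channel starts from the beginning of the file after its offset and then loops forever at the base speed.

```python
def channel_offsets(file_bytes, base_rate_bps, num_channels):
    """Start delay in seconds for each channel:
    channelNumber * totalTime / numberOfChannels."""
    total_time = file_bytes * 8 / base_rate_bps
    return [ch * total_time / num_channels for ch in range(num_channels)]

def packet_on_air(ch_offset, t, total_time, num_packets):
    """Index of the packet a looping channel is sending at time t.

    Assumes the channel began at the start of the file at ch_offset
    and wraps around every total_time seconds.
    """
    position = (t - ch_offset) % total_time  # seconds into the current loop
    return int(position / total_time * num_packets)
```

For example, a 1000 byte file at 8000 bps takes 1 second, so four channels start at 0, 0.25, 0.5 and 0.75 seconds, and at any instant they are a quarter of the file apart.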
If packets are lost, you could either request specific packets from the server (if you have only a few and the server isn't too loaded), or you could just jump on the channel that will have that packet soonest (and onto another channel if you miss it again, rinse, wash, repeat).
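Picking the channel that will have the lost packet soonest is just a bit of modular arithmetic. A sketch in Python, using the same assumptions as the scheme above (evenly shifted channels that loop continuously); the function name and parameters are made up for the example.

```python
def soonest_channel(packet, t, total_time, num_packets, num_channels):
    """Return (channel, wait_seconds) for the channel that will next
    send `packet`, given the current time t.

    Channel ch is assumed to start the file at ch * total_time / num_channels
    and to repeat it every total_time seconds.
    """
    best = None
    for ch in range(num_channels):
        offset = ch * total_time / num_channels
        # Time within the loop at which this channel sends the packet.
        send_time = offset + packet * total_time / num_packets
        wait = (send_time - t) % total_time
        if best is None or wait < best[1]:
            best = (ch, wait)
    return best
```

With evenly spaced channels the wait should never be more than totalTime/numberOfChannels, which is where the trade-off on the number of channels comes from.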
Assuming a constant base channel speed (which seems reasonable until broadband access is more widespread), the trade-off here is the number of channels. By increasing the number of channels, the sender has to repeat each packet more times, but the clients can have better maximum throughput and less time to wait to replace dropped packets.
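Back-of-the-envelope numbers for that trade-off, as a Python sketch. The names are mine, and the figures assume a receiver fast enough to listen to every channel with no loss, which is the best case and not what most people would see.

```python
def channel_tradeoff(file_bytes, base_rate_bps, num_channels):
    """Rough cost/benefit of running num_channels shifted copies of the file."""
    total_time = file_bytes * 8 / base_rate_bps
    return {
        # The sender pushes every channel at the base rate simultaneously.
        "sender_bps": num_channels * base_rate_bps,
        # One pass of the file on a single channel.
        "single_channel_seconds": total_time,
        # Best case: a receiver on all channels collects a full copy this fast.
        "all_channels_seconds": total_time / num_channels,
        # Worst-case wait until some channel resends a given lost packet.
        "max_repair_wait_seconds": total_time / num_channels,
    }
```

Doubling the number of channels doubles what the sender has to push, and halves both the best-case download time and the worst-case wait for a replacement packet.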
There is probably some additional cost at the routing layer for all these people subscribing and unsubscribing from extra channels, but I assume (maybe incorrectly) that the routing layer would be able to handle this problem since it would be distributed across a whole bunch of routers.
Because with multicast, there is no flow control?
And no guaranteed delivery?
This is perfectly acceptable for media broadcast, where the codec can deal with dropped bytes.. but..
|-| == bandwidth to move it.
|----| == tarball size. 80s
|--| == bandwidth to move it.
|--------| == tarball size. 90s
|----| == bandwidth to move it.
Over the last 30 years of computing, we've always had more program than pipe to push it through. The only way to overcome this slowly increasing speed of affordable bandwidth is to pay big bucks for a line that will be outdated in a few years.
In the not too distant past, today's game emulator ROMs used to be moved across the country in a game console containing a huuuuge amount of graphic hardware. For the day, having a game that totaled more than 1 Meg in run time size was just gigantic, and now, we zip these same ROM images around the net in seconds.
In the very near future IPv6 over multi-Gbps fixed wireless will make mirroring Linus' balls of tar a trivial task. But, of course, by then, the "kernel" will be 200G ;).
The lesson here is that affordable bandwidth has slowly and steadily increased over the history of computing, and I see no reason why it should jump.
Most multicast-native customers that make use of the MBONE either have quite a bit of bandwidth to toss around for video and data broadcasts, or it is part of their business model (broadcast.com, NASA JPL, US DOE, etc.).
Now in regards to software distribution, it would not be feasible for RedHat to multicast a 600MB ISO using the Internet multicast backbone, as each provider that wanted access to that data would also subject their providers, and their providers' providers, to receiving that data as well. So essentially you would have 600MB flying through 6 transit networks to reach you. Imagine the waste of bandwidth. Do you think multicast providers would take this with an enthusiastic grin?
Currently, there are a few providers that use multicast for stream distribution to multiple servers on live events. You can be assured this is the case for large scalable video distribution houses like broadcast.com, possibly Akamai and others. Hope that provides some insight. I'm not an expert, I've just been performing a lot of multicasting research as of late. Cheers.
The data then branches off from there. This would be quite suitable for updating mirror sites, since
One problem I could see is that this method of distribution for data files (versus video and audio) wouldn't scale well. Imagine one site drops a packet. Well it can't very well start over, since that same packet did possibly reach all the other listening parties. They are all expecting the NEXT packet, not a retransmit.
On a fast, fault-tolerant network (major backbones, and obviously intranets) this works great. We use ImageCast at work to simulcast drive images to multiple systems, and the bandwidth used is no more than if a single system was done one at a time. But on any network where packet loss and latency are a problem, this would seriously hamper the practicality of the system.
So I say, multicast to a few hundred major FTP mirrors from the master server (redhat in this case), and then good ol' traditional FTP from there.
... and published in Dr. Dobb's Journal (yes, the source code is available). See http://www.ddj.com/articles/2000/0005/0005i/0005i.htm
The guy who wrote it works for Microsoft (so of course his implementation is Windows-dependent) but he makes some pretty good points on using multicast for file distribution, and naturally the idea and/or algorithms could be reimplemented in some D-O-S (Decent Operating System) like Linux...
Best Regards,
--
Durval Menezes.
I have never met a computer that didn't like me.