Microsoft Worms and Global Routing Instability
James Cowie writes: "Fresh analysis here indicates that worm propagation periods correlate very strongly with global BGP routing instability, as measured by sustained exponential increases in the number of prefix announcements and withdrawals seen in BGP message traces."
Net instability can also be predicted if Slashdot links to a .... well anything.
I am Jack's HTTP Server
of contributing to global worming. They need to cut back their toxic emissions immediately before it's too late to save the planet.
try { do() || do_not(); } catch (JediException err) { yoda(err); }
OK, everyone knows that word association is a powerful marketing tool. Example: Microsoft Office. When you say "office suite of programs" to the average person, they automatically think Microsoft Office. Well this article sure gives us a great one:
In this online note, we summarize our preliminary analysis of the surprisingly strong impact of the Internet propagation of Microsoft worms (such as Code Red and Nimda) on the stability of the global routing system.
Look on AP, Yahoo, MSNBC, CNN, and you always see "the Nimda virus" or "the Code Red virus," but I prefer the way the article said it. So from now on in your conversations with others, refer to each virus in this category as a "Microsoft Virus" and hopefully, through word of mouth and word association, we can sway public opinion away from this crappy MS software.
~ now you know
So...on a related note.
If it is true that viruses create BGP instability, one can extrapolate that this is a form of
terrorism, by disrupting international communications.
Now - as Microsoft has done almost nothing to effectively eliminate the threat of viruses, and
hence a form of terrorism, MS can then be seen as "harbouring terrorism".
Didn't George W himself say that those who harbour terrorists will receive the same fate?
It's therefore in the international community's best interests to destroy Microsoft!
Sparks:Gadget:Beer Maker
first off, i'd just like to say, i love it when a hardcore networking article gets posted to slashdot: the number of responses is so much lower due to the userbase having no experience with the subject, and mindless pontificating and chest beating (as in anti microsoft/pro linux articles) doesn't cut it with this subject matter.
as an aside, i don't mean the above preamble as a negative statement about the specific poster i'm responding to.
"Consequently, since routes time out after a while
...This would logically increase the load on route discovery protocols such as BGP."
well...not exactly. when 2 routers are set up in BGP partnership they exchange an initial set of routes which are statically set by the AS administrator; there's no dynamic discovery process. those routes are only changed under a few specific conditions: explicit changes announced by the BGP partner, or the loss of connectivity to the partner (too many missed keepalive messages). BGP route exchange is not based on some kind of dynamic route timeout/refresh algorithm, as that would be horrifyingly inefficient.
a few words on how routing and route caching work (this is assumed to be on a defaultless internet backbone router):
a packet enters the router destined for some ip address, a lookup against the routing table is done, and the appropriate outbound interface is selected (this step is known as path determination). the packet is then sent to the appropriate outbound interface, re-framed, and sent out to the next hop (this step is known as switching). route caching associates a destination ip address with a next hop interface, thus bypassing the redundant route table lookup. a definite gain in efficiency; cisco makes a number of advanced caching/switching engines that are used in their high end core routers.
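to make the lookup-then-cache idea concrete, here is a minimal sketch in Python. it is not how any real router does it (real boxes use tries and hardware); the prefixes, the interface names and the function names are all made up for illustration.

```python
import ipaddress

# Hypothetical routing table: prefix -> outbound interface (all names invented).
ROUTES = {
    ipaddress.ip_network("10.0.0.0/8"): "serial0",
    ipaddress.ip_network("10.1.0.0/16"): "serial1",
    ipaddress.ip_network("192.0.2.0/24"): "ether0",
}

# Route cache: destination address -> outbound interface.
route_cache = {}


def path_determination(dest):
    """Full table lookup: the longest matching prefix wins.

    Returns None when nothing matches, since a defaultless router
    has no default route to fall back on.
    """
    matches = [net for net in ROUTES if dest in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]


def forward(dest_str):
    """Return (interface, cache_hit) for a destination address string."""
    dest = ipaddress.ip_address(dest_str)
    if dest in route_cache:
        # Cache hit: skip the routing table lookup entirely.
        return route_cache[dest], True
    iface = path_determination(dest)
    if iface is not None:
        route_cache[dest] = iface
    return iface, False
```

the first packet to a destination pays for the full lookup; later packets to the same address come straight out of the cache.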
to summarize/explain the BGP/worm paper: worms generate excessive traffic; the traffic overwhelms some routers and wan links; thus, BGP keepalive messages get lost or never sent, depending upon traffic or router load; consequently the BGP routes are being announced/withdrawn at a high rate (this is known as route flapping). this is bad. having a route fail is not a problem, as long as it stays failed; rapidly changing states is what creates extra load on the router. route dampening policies help, but with a worm creating these conditions everywhere at once the cumulative effect is instability.
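for anyone who hasn't seen dampening before, the usual idea is a per-route penalty that goes up on every flap and decays exponentially over time. below is a rough Python sketch of that idea only. the class name, the method names and the default numbers are assumptions picked for illustration, not taken from the paper or from any particular vendor's configuration.

```python
class DampenedRoute:
    """Tracks a flap penalty for one route (illustrative sketch only)."""

    def __init__(self, penalty_per_flap=1000.0, suppress_limit=2000.0,
                 reuse_limit=750.0, half_life=900.0):
        # All four defaults are assumed values; half_life is in seconds.
        self.penalty_per_flap = penalty_per_flap
        self.suppress_limit = suppress_limit
        self.reuse_limit = reuse_limit
        self.half_life = half_life
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        # The penalty halves once per half_life of quiet time.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now

    def flap(self, now):
        """Record one announce/withdraw flap at time `now` (seconds)."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty >= self.suppress_limit:
            self.suppressed = True

    def is_suppressed(self, now):
        """True while the route should be ignored because of past flapping."""
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse_limit:
            self.suppressed = False
        return self.suppressed
```

a route that flaps three times in twenty seconds gets suppressed, and is only used again once it has stayed quiet long enough for the penalty to decay below the reuse limit. that works fine for one misbehaving route; it doesn't save you when half the table is flapping at once.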
check these sites out to learn networking :
http://www.cisco.com/univercd/cc/td/doc/cisintwk/ito_doc/index.htm
http://www.merit.edu/mail.archives/nanog/
anyone who writes a wise ass follow up to this had better include a CCIE number.
One of the inherent problems with all routing protocols is that they rely on inband announcements and updates, and communicate state purely by reachability. This is clearly a flawed approach on heavily loaded links and routers. This problem has already been addressed worldwide on the telephone network with the introduction of SS7. One of the key aspects of SS7 is that it is transported over an Out of Band network (the actual transport may be on a dedicated timeslot on a SONET link, but the basis is that the link is dedicated to management).
By implementing a low throughput (say 64K-256K; this requires more analysis) management network, the ISPs could be certain that the state of the BGP peering sessions and the integrity of the UPDATE messages are always intact.
One of the key aspects/benefits of BGP is that, unlike other routing protocols, it does not advertise routes in simple "here's my routing table" messages, as RIP does and, to a lesser extent, OSPF and IS-IS do. BGP relies on TCP sessions between peers. On connection, the entire known (or policy-filtered) shortest path routing table is exchanged. After this the link stays idle, with the exception of BGP keepalives, until an UPDATE message is sent to communicate that a new route has been added to, or an existing route removed from, that peer's routing table. Also, BGP does not assign any significance to the port that receives the information, merely the peer. This all makes BGP inherently scalable, stable and reliable, as long as resources (CPU, memory, buffers or links) are available. TCP is the reliability mechanism here. The presence of the TCP session validates all the routes learned via that session. The absence of the TCP session invalidates all the routes and causes them to be withdrawn for that TCP session.
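The "session validates the routes" behaviour can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a BGP implementation: the class and method names are hypothetical, and the 90-second hold time is just an assumed value for the example.

```python
class PeerSession:
    """Toy model of one BGP peering session and the routes learned over it."""

    def __init__(self, hold_time=90.0):
        self.hold_time = hold_time  # seconds of silence tolerated (assumed value)
        self.last_heard = 0.0
        self.up = True
        self.routes = set()

    def receive_update(self, now, announced=(), withdrawn=()):
        """Apply an UPDATE: add announced prefixes, drop withdrawn ones."""
        if not self.up:
            return
        self.last_heard = now
        self.routes |= set(announced)
        self.routes -= set(withdrawn)

    def receive_keepalive(self, now):
        """A keepalive carries no routes; it only proves the peer is alive."""
        if self.up:
            self.last_heard = now

    def tick(self, now):
        """Check the hold timer; return the prefixes lost if the session drops."""
        if self.up and now - self.last_heard > self.hold_time:
            self.up = False
            lost = sorted(self.routes)
            self.routes.clear()
            return lost
        return []
```

With a full table learned over one session, a single expired hold timer turns into tens of thousands of withdrawals at once, which is why lost keepalives on a congested link are so expensive.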
Maintenance of the TCP session stability is key to the stability of the routing table. With over 80,000 routes on any BGP full update, the processing needed to cope with multiple TCP sessions failing or starting is immense (and probably better served by a UNIX platform than by a router, to be honest).
SS7 uses a mechanism whereby UNIX servers process the routing information and create the core routing table. Note that "table" is the key word: it is not the path the data or calls follow. Building a similar architecture within the Internet would allow routers to have one or two TCP sessions to BGP servers (a concept already grasped with route reflector servers) and dedicate their CPU to forwarding packets etc. The dedicated servers never need to see a packet to be forwarded, since that's just not important to BGP, so they have no need to be on the same physical cables/links as user packets. This architecture would take some rethinking but would not be outside the plans of most ISPs, and definitely not outside their skillsets.
Clearly the next problem then becomes low speed customer connections. Again the Telco industry has addressed this problem with ISDN - with the B channels. For these lower speed connections, there is no need to change the existing model. Losing one customer here or there is nothing (UPDATEs on BGP are typically well over 100 a second at NAPs) and would be catered for simply.
The NAPs could merely serve as routing table peering points, and not data transfer points - again another area of congestion.
The Internet is proving to be a reliable and trustworthy international communications medium; the next step is to make it even more robust, and truly scalable. Using OOB management is the obvious next step toward this goal.
GMPLS is being touted as the next step for ISPs in terms of exchanging routing information in an OOB network. This is only one aspect of the work that is being done there.
Nope, sorry, a tobacco virus is a tobacco virus because it destroys tobacco crops. These worms are MS worms because they destroy MS boxes which then attempt to destroy everything. It's time the world knew about it.
You won't hear the popular press referring to "another MS worm", however. They would not risk losing their piece of the $1,000,000 advert budget MS has for XP. As you see, "professionals", and those writing formal papers, are free to call the thing what it is, and should. The popular press will get it sooner or later.
You and I should not censor our own speech for MS and their sloppy wares.
Friends don't help friends install M$ junk.
Very shortly after the beginning of Code Red this ceased to be about server admins. The boxes being infected by these viruses now are home or non-power business users who have IIS enabled by default. Why by default? Because MS doesn't care about security. Why not throw in features most users won't need by default? What's the harm? Oh, we're destroying the stability of global routing? Oopsie.
The majority of the IP addresses spreading these viruses show the default homepage if you go to them. Because the home or casual business users running these boxes DON'T KNOW what IIS is, or that they have it enabled, they DON'T KNOW that they're vulnerable or infected. These are the people that criticalupdate would reach. These are the people that need the patches. By NOT pushing this patch, MS is leaving the situation as it is, and it will never get better. To repeat - security conscious server admins are having their network hammered by this virus not because other server admins are lazy - but because many non server admins have operating systems with IIS enabled by default, and MS is making no attempt at all to reach those people despite the fact that the situation has not improved.
It's easy to say this, but speaking as one who works for an enterprise, it's not easy to do.
We've got tens of thousands of PCs running hundreds of applications - some internally developed, some externally developed.
For MS security patches (or anything else) that we release into "production" we need to engineer the build to make sure it works with our OS build, then test against Tier 1 applications.
Once that is complete, the development groups need to sign off saying that their application runs with that code.
Specifically in terms of IE 5.5 SP2, Quicktime is no longer compatible. Sure, there's an update to Quicktime, but my point is this - how many other things stop working? Which of our internal apps are dependent on IE or subcomponents that no longer work with IE5.5 SP2?
We don't know. Frankly, even if we thought that we knew, we couldn't be sure outside of testing.
IE has seen 7 security patches in the last 8 months. Particularly in this economy, we can't afford the testing staff to nail each of these as they are released.
Of course we're at risk. Now is the time to question our continued use of MS products. I'm doing that.
Regards,
Anomaly
But Herr Heisenberg, how does the electron know when I'm looking?