Rerouting the Networks
prostoalex writes "Scientific American looks at a new approach to clearing networking jams, in research funded by the US military. Instead of using routers to route the packets from point A and point B, thus making some hop in the sequence critical for delivering the message, researchers are exploring a new approach called 'network coding.' (Here is the illustration cited in the article.)" Quoting: "[Four researchers] then at the University of Hong Kong published groundbreaking work that introduced a new approach to distributing information across shared networks. In... network coding, routers are replaced by coders, which transmit evidence about messages instead of sending the messages themselves. When receivers collect the evidence, they deduce the original information from the assembled clues. Although this method may sound counterintuitive, network coding, which is still under study, has the potential to dramatically speed up and improve the reliability of all manner of communications systems and may well spark the next revolution in the field. Investigators are, of course, also exploring additional avenues for improving efficiency; as far as we know, though, those other approaches generally extend existing methods.'"
What happens when the second node goes down?
If you automatically broadcast the knowledge of the message across all available channels, if it gets to the other end its won.
This is the not putting all your eggs in one basket scenario which attempts to address bandwidth limitation issues.
It was only recently I was thinking about global communications on a wireless mesh without ISP involvement (reroute the backbone across home networks), I wonder if this could be the kickstart it needs?
liqbase
You're right, their example is not very clear. It works best when there are extra resources available to the system that *can't* be used for other purposes; the best example of this is wireless communication, where all transmissions are broadcast. For example, if A sends a packet to B, but C also hears it, we can't possibly use the A-C transmission for some other data.
Tsunami -- You can't bring a good wave down!
Note that this is a distribution network for multicasting. Nothing wrong with that, but it doesn't do anything for ordinary point to point Internet communications. The cable TV people might find it useful, though.
For their system to work, it seems that A has to send to B and C
which blocks C from sending anything
but C must be sending to D at the same time so that the messages can be merged at the "coder" point.
And the coder would then use TWO transmissions to send the "hint" and the merged data.
It's still sending two data streams (one for merged stream and one for hints) when they were originally complaining about how it was a single channel.
Its a pitty that 99% of the internet's traffic is unicast not multicast so this won't actually help. Plus this gets really hard computationally quickly.
There is a good explaination in this reference. Although that paper mostly treats the case of quantum networks (which is ubber cool) it gives the background. The standard example (not explained well in TFA) is the http://en.wikipedia.org/wiki/Network_coding>butter fly network and the wikipedia article does a reasonable job explaining it. Basically, even with the three different routes, there is still a bottleneck if you just reroute information. But by coding the information, you get it all through. The example of the butterfly network is very simple and worth learning -- just google it if the wikipedia entry or the arxiv paper doesn't help.
this already happens on really lossy transmissions -- especially things like satellite tv or, say, deep-ish space missions don't want to waste time retransmitting (in satellite tv, if you didn't get it, you can't ask for it again, your picture just looks terrible for a split second).
so, they use error correcting codes, like, for example, parity (like raid 4,5) or reed-solomon (i think that's used in raid 6 or 7 or something), or turbo-codes.
fyi, for the latter i don't actually know how it works.
anyway, this is a pretty good idea (even though it's certainly been thought before); i'd be happy to see tcp replaced with something a bit smarter and just as functional.
the privacy of one's mind is important.
you do have something to hide.
Consider using the described method instead of routers, which use the ISO/OSI stack. The message uses an inference, in some ways like how striping works in a RAID 5 drive failure; the failed drive can be inferred from data on the surviving striped drives. Not a big deal. Chunks/nibbles of the messages have a low occupancy time within a cloud, but many receivers must evaluate many messages-- not all are intended for them and must be discarded or validated (imagine spoofing in such a concept as described in the diagrams!).
But in the message diagram, there's a talker, and a receiver that gets messages through inference. Between the two, the cloud between them becomes unbelievably saturated with diffuse messages-- the same reason that ATM looks good on paper with its 53-byte packets, but in reality can't deal with traffic jams. This is why MPLS and other deterministic routing methods were invented-- to qualify transmission routes, not just shoot them like a shotgun into a cloud, hoping that the intended receiver gets the message and deciphers it through receive-side heuristics.
This has been done before, and it didn't work. I'd love to hear comments on why this should succeed.
---- Teach Peace. It's Cheaper Than War.
Network coding is far from a brand new idea. It is introduced in a paper by Ahlswede http://pdos.csail.mit.edu/decouto/papers/ahlswede0 0.pdf published in 2000 and has ever since been a very popular research topic in the networking world (http://www.ifp.uiuc.edu/~koetter/NWC/index.html). These "clues" are linear equations where actual packets can be retrieved by applying a gaussian reduction on the equations.
Its most obvious applications are with multicasting where utilization of network links can indeed be increased in an informational theoretical perspective. The tradeoffs are increased CPU load on the intermediate and end nodes. The research so far is a bit from getting into the practical stages but it has promise. As for "99% of internet traffic being unicast"; even though that might be true, one needs to think outside the box. If it turns out that multicasting will be much cheaper than now (multicasting is now in most cases basically multiple unicasting), broadcasting TV through packet networks might become much more efficient and this proportion could change.
Finally, don't forget that research is still in early stages. I believe there are some years, even decades (if ever), until we will see any of this in practice. Maybe it turns out to be useless for computer networks, who knows. Even so, the basic principle might still prove useful for other applications such as routing inside solid state chips.
I worked it out, coding is still better.
For the example before:
1. A->B: pAB
2. C->B: pBA
3. B->A, B->C: pAB xor pBA
Assume that at time=3 a bird flies between B & C. The sequence becomes:
1. A->B: pAB
2. C->B: pBA
3. B->A, B->C: pAB xor pBA [B->A SUCCESS, B->C ERROR]
4. B->C: pAB xor pBA [retry]
B resends the packet, but addresses it only to C. This is still quicker than routing, which would take 5 time slices.
Exactly. Their diagram is pretty and has unused resources in the overly simplified "before" case ... but it does NOT scale when your have multiple senders and not every receiver wants every message.
So each point in their diagram will either have to be aware of every other point (you think routing tables are bad right now) or they have to send MASSIVE amounts of extra data.
Just from their diagram, they went from 3 lines to 6 lines at the first router. Then they fed those into 6 routers each with TEN lines.
That is a total of 7 routers and and 66 lines to deliver 3 messages to 20 recipients (60 transmissions). Or, 10% more than would be needed if you just ran 1 each.
Now add 2 more senders and 20 more recipients. But 5 of those only want 1 message, 10 want both the new messages and the other 5 want all 5 of the messages from all five of the senders.
You end up with congestion just from the packets telling the receivers which streams are merged and which streams they need to unlock them.
And the machines that only want 1 message have to decode 5 messages to get it.