Content-Centric Networking & the Next Internet
waderoush writes "PARC research fellow Van Jacobson argues that the Internet was never designed to carry exabytes of video, voice, and image data to consumers' homes and mobile devices, and that it will never be possible to increase bandwidth fast enough to keep up with demand. In fact, he thinks that the Internet has outgrown its original underpinnings as a network built on physical addresses, and that it's time to put aside TCP/IP and start over with a completely novel approach to naming, storing, and moving data. The fundamental idea behind Jacobson's alternative proposal — Content Centric Networking — is that to retrieve a piece of data, you should only have to care about what you want, not where it's stored. If implemented, the idea might undermine many current business models in the software and digital content industries — while at the same time creating new ones. In other words, it's exactly the kind of revolutionary idea that has remade Silicon Valley at least four times since the 1960s."
Did he just reinvent magnet links?
Give me Classic Slashdot or give me death!
Why does he say "it will never be possible to increase bandwidth fast enough to keep up with demand"?
When I want to watch streaming video, I fire up Netflix and watch streaming video. When I want to download a large media file, I find it on bittorrent and download it. The only time I've noticed any internet slowdowns, it's been in my ISP's network, and it's just a transient problem that eventually goes away.
Sure, Netflix has to do some extra work to create a content delivery network to deliver the content near to where I am, but it sounds like the internet is largely keeping up with demand.
Aside from the IPv4->IPv6 transition (we've been a year away from running out of IP addresses for years), is there some impending bandwidth crunch that will kill the internet?
it will never be possible to increase bandwidth fast enough to keep up with demand.
I've been hearing that since I got on the net in '91. Tell me a new lie.
It's an end time message. "Repent, for the end is near". Yet, stubbornly, the sun always rises tomorrow.
"Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
See http://en.wikipedia.org/wiki/Uniform_resource_name . This is a very old [and good] idea.
For example: urn:isbn:0451450523 is the URN for The Last Unicorn (1968 book), identified by its [ISBN] book number.
Of course [as the dept. notes] you still need to figure out how to get the bits from place to place, which requires a network of some kind, and protocols built on that network which are not so slavishly tied to one model of data organization that we can't evolve it forward.
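For what it's worth, here is a minimal sketch of the "name first, location second" split that a URN implies. The `parse_urn` and `locate` helpers and the `CATALOGUE` table are made up for illustration; a real resolver would be a network service, not a dict.

```python
def parse_urn(urn):
    """Split a URN into (namespace identifier, namespace-specific string)."""
    scheme, nid, nss = urn.split(":", 2)
    if scheme.lower() != "urn":
        raise ValueError(f"not a URN: {urn!r}")
    # The "urn" scheme and the namespace identifier are case-insensitive.
    return nid.lower(), nss


# Hypothetical catalogue: a name maps to zero or more places the bits might live.
CATALOGUE = {
    ("isbn", "0451450523"): ["shelf-3/last_unicorn.txt"],
}


def locate(urn):
    """Return the known locations for a name, or an empty list."""
    return CATALOGUE.get(parse_urn(urn), [])


print(parse_urn("urn:isbn:0451450523"))
print(locate("urn:isbn:0451450523"))
```

The name never changes when the copy moves; only the catalogue entry does.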
Koans and fables for the software engineer
Arguably, this is just a CAS. Now, of course, Freenet/Entropy have been trying their hand at this in an anonymizing setting, much as Tahoe-LAFS has been trying to do this in an encrypted fashion. A well-functioning CAS with sufficient FEC, positioned more towards usability and less extremely towards anonymity, may be just what we need; single-hop anonymity with lots of storage (a DHT on short I2P tunnels, say) may make a distributed, safe-enough Usenet possible.
If this had come out of almost anyone else's mouth, I'd be the first to say they were full of it.
But... Van Jacobson!
Any idiot can have a pile of ideas. The implementation is what matters.
Too bad the idea pays 95%, the implementation 5%
So back in the day, we had a thing called the mbone, which was multicast infrastructure which was supposed to help with streaming live content from a single sender to many receivers. It was a bit ahead of its time, I think, streaming video just wasn't that common in the 1990s, and it also really only worked for actually-simultaneous streams, which, when streaming video did become common, wasn't what people were watching.
The contemporary solution is for big content providers to co-locate caches in telco data centers, so while you still send multiple separate streams of unsynchronized, high-demand streaming content, you send them a relatively short distance over relatively fat pipes, except for the last mile, which however only has to carry one copy. For low-demand streaming content, you don't need to cache, it's only a few copies, and the regular internet mostly works. It can fall over when a previously low-demand stream suddenly becomes high-demand, like Sunday night when NASA TV started to get slow, but it mostly works.
TFA (I know, I know...) doesn't address moving data around, but it seems like this is something that a new scheme could offer -- if the co-located caches were populated based purely on demand, rather than on demand plus ownership, then all content would be on the same footing, and it could lead to a better web experience for info consumers. That's a neat idea, but I think we already know how both the telcos and commercial streaming content owners feel about demand-based dynamic copy creation...
2*3*3*3*3*11*251
I suppose it makes sense. The smarter the intermediate nodes are about deciding what to cache (based on popularity, size, speed of original request, who's nearby and what they have cached), the better this would work.
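A rough sketch of what "smart about what to cache" could mean at one node, assuming popularity is the only signal (size, request speed, and neighbours' caches are ignored here). The `PopularityCache` class and its `fetch` callback are hypothetical, not anything from the article, and capacity is assumed to be at least 1.

```python
from collections import Counter


class PopularityCache:
    """Keep the most-requested named items; evict the least-requested one."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed >= 1
        self.store = {}           # content name -> data
        self.hits = Counter()     # content name -> number of requests seen

    def get(self, name, fetch):
        self.hits[name] += 1
        if name in self.store:
            return self.store[name]
        data = fetch(name)  # cache miss: go upstream
        if len(self.store) >= self.capacity:
            coldest = min(self.store, key=lambda n: self.hits[n])
            if self.hits[coldest] >= self.hits[name]:
                return data  # newcomer is not more popular, so don't cache it
            del self.store[coldest]
        self.store[name] = data
        return data
```

A one-off request passes straight through without displacing anything; an item only earns a slot once it has been asked for more often than the coldest item already held.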
it's exactly the kind of revolutionary idea that has remade Silicon Valley at least four times since the 1960s."
Well, that's settled.
(The point of SQL is that you say what you want, not where to find it - hence the concept of "NoSQL" is just silly)
Sent from my ASR33 using ASCII
A query for information goes where... broadcast.
Think about how many packets that requires... Now think about how many search engines that assumes...
And then think about the returning packet storm.
Now consider millions of queries...
Nope. Not gonna happen.
there's already too much TCP/IP infrastructure bought, paid for and in use.
"I don't know, therefore Aliens" Wafflebox1
"not where it's stored."
So we should make the Internet into Plan 9?
deleting the extra space after periods so i can stay relevant, yeah.
“We can sit here and speculate about where the tollbooths will go, but to me, it’s more about whether there are pockets of money out there ready to address problems that people have now. The tollbooths will go where they need to be.”
I'm pretty sure where the tollbooths will be - embedded in your local ISP. They will be put there by the music and movie industries so that when you in this new future request a tune or a clip by name rather than by IP address you can be either billed or denied.
If Slashdot were chemistry it would look like this:Cadaverine
There is not only a cost of deploying the new tech, but also the cost of change. That cost of change is REALLY high as the current methods are deeply seated. IPv6 isn't "there" yet... and the experience has been dizzying for many. Now there's another new approach? It may be better, but people don't want the change. Something catastrophic will have to cause such change and even then, people will gravitate to the solution with the least amount of change possible.
But it's good that someone who was involved in the early Internet realizes that it's a good one.
And no, it doesn't mean throwing TCP/IP away.
But really, Slashdotting should be impossible. To me, the fact that it is possible indicates a fundamental problem with the current structure of the Internet. If you come up with something other than content-addressing and it doesn't solve the Slashdotting problem for everybody (even someone serving up content from a dialup), then it doesn't really solve the problem.
Need a Python, C++, Unix, Linux develop
Bittorrent and other p2p protocols. Even if -all- content were distributed this way, you would still need an underlying network, link, and transport mechanism. TCP/IP serves that very well; then hopefully you have no hotspots of traffic or failure because of the distributed nature of the content. Another interesting facet is that if all content is truly distributed and redundant, with no single point of storage, master copy, or decryption, there is no way to EVER remove content completely.
Silence is a state of mime.
Let's consider Freenet. Don't they store and retrieve data based on some cryptographic keys? Of course, data is distributed across all participants, and communications still piggy back on top of IP. But that's what I'd call content-centric networking. The content isn't located by location, but by its nature (hash/key/...).
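To make the hash/key idea concrete, here is a minimal content-addressed store, assuming SHA-256 as the key function. This is not Freenet's actual key scheme, just the general shape; the `ContentStore` class is made up for the example.

```python
import hashlib


class ContentStore:
    """Store blobs under the hash of their own bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data):
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key):
        data = self._blobs[key]
        # Anyone holding the key can verify the bytes, whoever served them.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("content does not match its key")
        return data


store = ContentStore()
key = store.put(b"hello, world")
print(key[:16], store.get(key))
```

Because the key is derived from the content, it doesn't matter which participant hands you the bytes: a copy that doesn't hash to the key is rejected.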
cpghost at Cordula's Web.
And the list goes on....
The Pirate Bay/BitTorrent.
Have gnu, will travel.
Magnet links only use the hash, so there's a possibility of hash collisions. He's proposing an identifier + resolver scheme ... which again, has been done many, many times already.
Eg, ARK or OpenURL
Or, we get to the larger architecture of storing & moving these files, such as the various Data Grid implementations. (which may also allow you to run reduction before transfer, depending on the exact infrastructure used).
Build it, and they will come^Hplain.
Didn't really read TFA, but this is what DNS is for. I don't care /where/ kernel.org is, or even if it's in the same place every time I access it.
Van, Sally Floyd and Lixia Zhang have been talking about this for a while; how much of this is tied back to the adaptive web caching project from the late 90s would give one a sense of how long the idea has been kicking around. One of the neat things that fell out of that project was routing and forwarding on decomposed URNs or URLs... well, there was a paper on that that came out of the adaptive web caching project in IEEE INFOCOM 2000.
At the risk of sounding snarky, but, Snap! That's the basic idea behind content-centric networking! And that basic idea is patented.
But here's the problem: You still need a network to transport packets. That was the big win in Internet engineering: the ability to create a level of indirection to hide the nastiness of bridging across different media. Sometimes this indirection layer worked well (cf. Ethernet), sometimes it was really clunky and nasty (cf. ATM). And you still need to choose a transport style, connectionless or connection-oriented. And you still need... feel free to add more to the list.
Van's proposal doesn't invent a new internet. It's a new indirection layer and possibly a replacement transport layer.
Any time someone talks about Content Centric networking or routing, there are always a bunch of people saying that it's basically the same as distributed hash tables, multicast, a cache, etc.
However, it may use such technologies, but it isn't the same.
Content Centric is all about having distributed publish/subscribe, usually on a lower network layer.
The content part of the name means that routing looks at the content of the message itself, not at some explicit address. For instance, to give a very simple example, you can send out a message [type=weather; location=london; temperature=21], and anyone subscribing to {location==london && temperature>15} will receive it.
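The weather example can be sketched in a few lines, assuming a single in-process broker (a real system would spread the matching across many nodes). The `Broker` class is hypothetical and only shows the matching rule: subscribers register a predicate over message fields, and publishers never name a recipient.

```python
class Broker:
    """Deliver each message to every subscriber whose predicate matches it."""

    def __init__(self):
        self.subscriptions = []  # list of (predicate, callback) pairs

    def subscribe(self, predicate, callback):
        self.subscriptions.append((predicate, callback))

    def publish(self, message):
        delivered = 0
        for predicate, callback in self.subscriptions:
            if predicate(message):
                callback(message)
                delivered += 1
        return delivered


broker = Broker()
inbox = []

# Subscription {location==london && temperature>15}
broker.subscribe(
    lambda m: m.get("location") == "london" and m.get("temperature", 0) > 15,
    inbox.append,
)

# Message [type=weather; location=london; temperature=21]
broker.publish({"type": "weather", "location": "london", "temperature": 21})
print(inbox)
```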
The network is typically decentralized, and using this kind of method can give a number of interesting efficiency benefits.
This is currently mostly being used in some business middleware; ad hoc networking stuff and some grid solutions. None of those particularly large.
The real problems with widespread use of this technique are the following:
* It's unnecessary: IPv6 is completely necessary, somewhat doable in terms of upgrading, and almost nobody is using it even now. This is someone suggesting a whole new infrastructure for large parts of the internet. The fact is, this would possibly be more efficient than many things that are being done now, but in reality nobody cares about it. Facebook and youtube (ok Google) would rather just pay for the hardware and bandwidth than give up control.
* Security is still unclear; it's easy to do some hand-waving about PKI, but it's hard to come up with a practical solution that works for many.
you should only have to care about what you want, not where it's stored.
Isn't that what Google is for?
This has been proposed before. It's already obsolete.
The Uniform Resource Name idea was supposed to do this. So was the "Semantic Web". In practice, there are many edge caching systems already, Akamai being the biggest provider. Most networking congestion problems today are at the edges, where they should be, not at the core. Bulk bandwidth is cheap.
The concept is obsolete because so much content is now "personalized". You can't cache a Facebook page or a Google search result. Every serve of the same URL produces different output. Video can be cached or multicast only if the source of the video doesn't object. Many video content sources would consider it a copyright violation. Especially if it breaks ad personalization.
As for running out of bandwidth, we're well on our way to enough capacity to stream HDTV to everybody on the planet simultaneously. Beyond that, it's hard to usefully use more bandwidth. Wireless spectrum space is a problem, but caching won't help there.
The sheer amount of infrastructure that's been deployed merely so that people can watch TV over the Internet is awe-inspiring. Arguably it could have been done more efficiently, but if it had been, it would have been worse. Various schemes were proposed by the cable TV industry over the last two decades, most of which were ways to do pay-per-view at lower cost to the cable company. With those schemes, the only content you could watch was sold by the cable company. We're lucky to have escaped that fate.
And it has the same issues. 15 years ago everyone said that we'd move past using files to store stuff and just go for the stuff we want. Microsoft had WinFS, for example (part of Longhorn).
But then the question comes where do you actually store the stuff?
The real change came not by eliminating using files to store stuff, but by changing how we retrieve stuff.
And this is the same way. Changing how you locate stuff on the internet is not going to remove the need for TCP/IP. You're still going to have to contact a machine to get the data and it'll have to send it back to you and the internet will have to route it between the two.
And not to put down Van Jacobson, but we're already well along the path. I remember when URLs first started appearing in ads. Some day in the future, we'll look back and remember the days of URLs in ads as quaint.
Why go through the trouble of creating a URL, or even a short URL (http://bit.ly/itsmbmam), to the sampler for My Brother, My Brother and Me when, if you search for "mbmam sampler", the sampler is the first result? Some day we'll stop even bothering. At least it seems like it to me.
http://lkml.org/lkml/2005/8/20/95
You have to care where it is stored. It isn't TCP/IP that is holding you back - it is physics.
Working out where your content is, relative to where you are, and how to reach it is no different in model from someone finding a path to groceries, gasoline or any other resource that requires some sort of addressing and a path to get there. Whether it is addressing for storage protocols (Fibre Channel, SATA or other disk tech) or MAC addresses, IP is an addressing tech - changing it will not fix an oversubscription of data across a fixed infrastructure.
The issue of bandwidth is more of an issue of physical infrastructure technologies than it is an issue of protocols used in those technologies.
Until all of the data you ever need is always with you where ever you are - you still need to care about where you are, where what you want is and how to get there.
I've been hearing that since I got on the net in '91. Tell me a new lie.
It's an end time message. "Repent, for the end is near". Yet, stubbornly, the sun always rises tomorrow.
Your well-reasoned multi-year in-depth technical analysis and reams of substantiating data have me convinced that it's all just a "lie" (as you put it).
May I mod you super-genius? I was afraid your post was going to be just some typical uninformed anecdotal horse-manure that provided all the insight of a dead skunk.
It's just peer to peer networking. Which you can do on top of TCP/IP, and you want to do that because the "who" is frequently more important than the "what."
Anyway, the problem with this is that the "who" that the content starts out with originally is afraid to trust it to anyone but a limited set of trusted sources. The solution will end up being large media providers with servers close to most of their customers.
The fundamental idea behind Jacobson's alternative proposal — Content Centric Networking — is that to retrieve a piece of data, you should only have to care about what you want, not where it's stored.
So he wants to re-invent Xanadu?
#naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
Because it is not possible to censor something that exists everywhere and nowhere!
Simple isn't it?
"PARC research fellow Van Jacobson argues that the Internet was never designed to carry exabytes of video, voice, and image data to consumers' homes and mobile devices, and that it will never be possible to increase bandwidth fast enough to keep up with demand" The internet was never designed to carry exabytes? Who is this guy kidding? It's not the "internets" fault or how it was designed. Blame the ISPs that provide the terrible bandwidth. Google fiber seems to be the answer and the image other ISPs need to follow. Greed is what powers todays slow bitrate not the "internet". The reason "it will never be possible to increase bandwidth" is because the ISPs refuse to.
tonyaldo.com
"not where it's stored."
So we should make the Internet into Plan 9?
Your stupid minds. Stupid! Stupid!
Provenance is more important on the Internet than most think - it is one of those I.myths, like being anonymous. Well... that goes for first class Internet citizens; if you're just a sharecropper (i.e. Apple chattels) then I guess provenance doesn't matter - at least it isn't your problem.
Our transformed relationship with content is one in which individual users are the gravitational center and content floats in orbit around them. This “orbital content,” built up by the user, has the following two characteristics:
Liberated: The content was either created by you or has been distilled and associated with you so it is both pure and personal.
Open: You collected it so you control it. There are no middlemen apps in the way. When an application wants to offer you some cool service, it now requests access to the API of you instead of the various APIs of your entourage. This is what makes it so useful. It can be shared with countless apps and flow seamlessly between contexts.
The result is a user-controlled collection of content that is free (as in speech), distilled, open, personal, and—most importantly—useful. You do the work to assemble a collection of content from disparate sources, and apps do the work to make those collections useful. These orbital collections will push users to be more self-reliant and applications to be more innovative.
insensitive clod overlords obligatory xkcd car analogy russian reversals whoosh pedant fanbois ftfy in 3...2...1..PROFIT
In fact it sounds identical to what CORBA promised. In fact, CORBA will take the world by storm! It will... um...
*headscratch* Hmm....
You know, this sounds too much like the way freenet works. It's not a "new" idea, at least not to the internet, but I'm sure we could benefit from that. You'd still need the bandwidth to move the data from source to other places where you would store and serve as local hubs for people to download faster from, but I admit a single transfer between hubs and then using local infrastructure would be nice...
FROM specifies which tables (or views), not which server, or network, or storage device.
That in itself isn't the point of SQL, rather it's non-procedural, meaning you don't specify how to get the data, you only describe the data you want (in terms of how it relates to others). If your data doesn't have that sort of structure, the "NOSQL" strategy is fine (and can be done in SQL anyway).
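A small illustration of the non-procedural point, using Python's built-in sqlite3 module with an in-memory database. The tables and rows are made up for the example; note that the query names tables and a relationship, and says nothing about where or how the rows are stored or fetched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES authors(id)
    );
    INSERT INTO authors VALUES (1, 'Peter S. Beagle');
    INSERT INTO books VALUES (1, 'The Last Unicorn', 1);
""")

# Describe the data wanted; the engine decides how to retrieve it.
rows = conn.execute("""
    SELECT books.title, authors.name
    FROM books JOIN authors ON books.author_id = authors.id
    WHERE authors.name LIKE 'Peter%'
""").fetchall()

print(rows)
```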
SQL's main problems are the inconsistent and sometimes misleading syntax, and the complexity of the where clauses. There are unpopular alternatives to the former (set based syntax is nice), but I'd really like to see deductive databases help with the latter. Foreign key constraints mean that the database can deduce much of the where clause itself, in the same way that Prolog resolves queries (I've seen a deductive database that uses a Prolog syntax, but there's no reason SQL can't be used instead). They're slower (but only for the first deduction, if it's cached); I don't know why they've never caught on.
That's a tangent, but at least it's irrelevant.
This stuff has been around for a while, and I have the following problems with it:
1. We already pretty much have CCN. They're called URLs, and companies like Akamai and others do a great job of dynamically pointing you to whatever server you should be talking to using DNS, HTTP redirects, etc. When I type www.slashdot.org, I already don't care what server it lives on. When I type https://www.slashdot.org/ I still don't care what server it is on, and I have at least some indication that the content is from someone authorized to speak on behalf of www.slashdot.org (PKI crap aside)
2. The article mentions that this tech would be used to relieve load at the core -- which I'm not sure I buy. The core is well known to be overprovisioned, and a recentish survey http://techcrunch.com/2011/05/17/netflix-largest-internet-traffic/ has shown that netflix and youtube consume 40% of downstream bytes -- both services already serviced by major CDNs pushing at least some traffic away from the core.
3. I'm unclear on the value proposition for us to redesign every router to be, effectively, an HTTP proxy cache. These devices are well studied and even if we got a higher cache-hit-rate using CCN, I'm not convinced it would help anything. After all, we are doing just fine.
4. I think this approach is in the end, fundamentally wrong. Regardless of how much magic we use to find out what machine to get data from, we will always be transferring data from one computer to another (a caching router is effectively a computer). It seems to me that until we no longer need to move packets from some machine A to some other machine B, it makes sense to have host-centric primitives, and build our abstractions on top of them. That's what we've been doing, and it's been working pretty well.
A facebook page does not need to be cached. It needs to be sent to 10 people. It should be stored on your own personal machine with secure access handed to your "friends". Of course this requires actual peer-to-peer networking which doesn't really exist - just try to get a fixed URL from your ISP, and then try to find a common app that uses it. I'd like to see an IPv6 subnet where the addresses correspond to GPS location - that's just plain easy to route, and it helps with identification.
"you should only have to care about what you want, not where it's stored."
where it's stored is very important. we forget in the age of fiber networks that the internet DOES have topology. there was an age where traceroute was a very necessary tool for setting up IRC networks to determine how you linked servers to your hubs, and how you formed your backbone of linked hubs.
Given that computers on the internet are owned and operated by a variety of different interests, many of which view each other with suspicion, it's very relevant to know which computer, and hence which operator, you are dealing with.
this creates NATURAL boundaries to keep individual and group online spaces within their own realms, and some form of sovereignty for system operators/owners.
I also disagree, the internet was meant to scale, and it's done a marvelous job at that. I can't see his ideas scaling nearly as well.
I think he's hinting at bringing back the Seller/Consumer model of the internet, where people are fed information from a single source, and a gross distinction between who gets to host content and who gets to view it, and how it's allocated.
Isn't it illegal?
I have a feeling that the current crop of PARC researchers are not as bright as their peers 20 or 30 years ago.
They do not give us any new insight into what's beyond the horizon, nor demonstrate to us what their visions are leading to, unlike what their peers did 20, 30 or 40 years ago.
Muchas Gracias, Señor Edward Snowden !
URNs will never be the de facto standard until the security peace of mind that location-based communication brings can be brought to the table. Say for example I have to send an application or OS image to a remote client. I want to know that he gets the resource from me, and not a hacker in Sweden.
Not entirely true. Some Facebook pages (like mine: http://facebook.com/lannocc) are publicly viewable to anyone (as long as you're logged in). I actually wish FB would remove the logged-in restriction so my page could be searched and accessed by any person or web spider.
However, your other idea about hosting your own personal data is something I like and have thought about frequently. I imagine a social web of providers where you can pick a storage provider (or provide your own) from a marketplace. Some would be free, probably ad-supported. Others might take a small payment but guarantee an encrypted store with options for key delegation in the event of death, etc.
-IOVAR Web Dev Platform