Researchers Map Locations of 4,669 Servers In Netflix's Content Delivery Network (ieee.org)
Wave723 writes from a report via IEEE Spectrum: For the first time, a team of researchers has mapped the entire content delivery network that brings Netflix to the world, including the number and location of every server that the company uses to distribute its films. They also independently analyzed traffic volumes handled by each of those servers. Their work allows experts to compare Netflix's distribution approach to those of other content-rich companies such as Google, Akamai and Limelight. To do this, IEEE Spectrum reports that the group reverse-engineered Netflix's domain name system for the company's servers, and then created a crawler that used publicly available information to find every possible server name within its network through the common address nflxvideo.net. In doing so, they were able to determine the total number of servers the company uses, where those servers are located, and whether the servers were housed within internet exchange points or with internet service providers, revealing stark differences in Netflix's strategy between countries. One of their most interesting findings was that two Netflix servers appear to be deployed within Verizon's U.S. network, which one researcher speculates could indicate that the companies are pursuing an early pilot or trial.
Why is Netflix's server architecture so interesting? Or is science just miles behind the industry on this subject and now desperately trying to catch up?
The BitTorrent approach is the wrong one for two reasons:
1. A lot of people have asymmetrical connections with a very slow upload speed
2. A lot of people have monthly data caps with hefty fees for going over
Netflix will send a CDN server to more or less any ISP that requests one and is willing to pay the power bill. Do you not remember when many ISPs were loudly refusing to install these free machines, even though the machines would save them money, because they objected to "free" colocation on principle?
USA peak bandwidth was about 60Tb/s back in 2012, and average bandwidth growth has been at least 50% year over year and steady since the 80s. That gives us 400% growth since then, or about 300Tb/s. Netflix represents about 1/3rd of peak bandwidth, or 100Tb/s. Spread across 4,669 servers, that gives an average of 21Gb/s per server, which sounds ballpark correct seeing that they're moving to 40Gb/s and 80Gb/s uplinks on their servers.
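To make the back-of-envelope estimate above easy to check, here is a minimal sketch of the same arithmetic. The four-year span (2012 to the time of the article) is my assumption, and the function names are made up for illustration; the other inputs are the rough figures quoted in the comment, not measured data.

```python
# Back-of-envelope estimate of average traffic per Netflix server.
# All inputs are the rough figures from the comment above; the 4-year
# span is an assumption, and the function names are hypothetical.

def projected_peak_tbps(base_tbps: float, annual_growth: float, years: int) -> float:
    """Compound a starting peak bandwidth forward by a yearly growth rate."""
    return base_tbps * (1.0 + annual_growth) ** years


def per_server_gbps(total_tbps: float, share: float, servers: int) -> float:
    """Average Gb/s per server if `share` of `total_tbps` is split evenly."""
    return total_tbps * share * 1000.0 / servers


peak = projected_peak_tbps(base_tbps=60.0, annual_growth=0.50, years=4)
average = per_server_gbps(total_tbps=peak, share=1.0 / 3.0, servers=4669)

print(f"projected peak: {peak:.2f} Tb/s")
print(f"average per server: {average:.1f} Gb/s")
```

Without the intermediate rounding to 300Tb/s and 100Tb/s, the result lands slightly above the 21Gb/s quoted, but in the same ballpark.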
Regardless of how many people are actually watching, 20Gb/s average is pretty cool. Another interesting note is that Netflix servers barely benefit from caching data in memory. Each server is handling so many requests per second from so many different customers, almost no customers are at the same point in the same show, and requests from any one customer are so far apart in time, that almost all requests are effectively random access. It's also interesting to know that Netflix is beyond the 80/20 rule. They're in the 90/10 rule, in that 10% of their data represents 90% of their requests. Predicting which 10% is important, and they can't use normal evict-least-used algorithms because that would cause cache thrashing. They algorithmically predict what will be watched every night, upload the data to be cached, and logically "pin" it so it doesn't get evicted.
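As a rough illustration of the "pin it so it doesn't get evicted" idea, here is a minimal sketch of a cache that evicts in least-recently-used order but skips pinned entries. This is not Netflix's implementation; the class name `PinnedLRUCache`, its methods, and the choice of LRU as the eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict


class PinnedLRUCache:
    """Toy LRU cache in which pinned keys are never chosen for eviction.

    Hypothetical sketch: the names and the LRU policy are assumptions,
    not a description of any real CDN software.
    """

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first
        self._pinned = set()

    def pin(self, key, value) -> None:
        """Pre-load content predicted to be popular and protect it."""
        self._pinned.add(key)
        self.put(key, value)

    def unpin(self, key) -> None:
        """Make a previously pinned entry evictable again."""
        self._pinned.discard(key)

    def put(self, key, value) -> None:
        self._items[key] = value
        self._items.move_to_end(key)  # mark as most recently used
        self._evict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)
        return self._items[key]

    def _evict(self) -> None:
        while len(self._items) > self.capacity:
            # Oldest entry that is not pinned, if there is one.
            victim = next((k for k in self._items if k not in self._pinned), None)
            if victim is None:
                break  # everything is pinned; allow the overshoot
            del self._items[victim]


cache = PinnedLRUCache(capacity=2)
cache.pin("popular-show", "bytes-1")   # predicted hit, loaded ahead of time
cache.put("obscure-film-a", "bytes-2")
cache.put("obscure-film-b", "bytes-3")  # evicts obscure-film-a, not the pinned entry
```

With plain LRU, a burst of requests for rarely watched titles could push the popular title out; pinning keeps the predicted 10% resident regardless of the request pattern.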
Another interesting thing they support for syncing the servers: each server can be configured to use a different route to pull down its data, and even the amount of bandwidth can be configured. The servers within a location can then sync with each other in a kind of P2P setup. This helps load-balance routes. Their SSD servers hold quite a bit less than their mechanical-drive servers, so the SSDs are typically hit first but hold only the most requested data. Last I knew, their SSD servers did not support acting as a cache while loading, because mixed sustained heavy reads and writes are an IO pattern that doesn't play well with SSDs. That may have changed or may be changing in the near future. I know the biggest reason for this was that the way most SSD firmware handles garbage collection could cause long pauses of no activity under sustained heavy writes. One of the changes was for FreeBSD to have a target latency for reads/writes and throttle the writes until latency came down.
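The "target latency, throttle writes until latency comes down" idea can be sketched as a simple feedback rule. This is only an illustration of the general technique and not FreeBSD's actual scheduler; the function `next_write_limit`, its parameters, and the halve-on-overshoot, add-one-on-recovery policy are all assumptions.

```python
def next_write_limit(
    current_limit: int,
    observed_latency_ms: float,
    target_latency_ms: float,
    min_limit: int = 1,
    max_limit: int = 64,
) -> int:
    """Return the new cap on concurrent writes for the next interval.

    Hypothetical policy: halve the cap while latency is over target,
    otherwise raise it by one until it reaches max_limit.
    """
    if observed_latency_ms > target_latency_ms:
        return max(min_limit, current_limit // 2)
    return min(max_limit, current_limit + 1)


# Example run with made-up latency samples (in milliseconds).
limit = 32
history = []
for latency in [20.0, 12.0, 6.0, 3.0, 2.0]:
    limit = next_write_limit(limit, latency, target_latency_ms=5.0)
    history.append(limit)
```

Backing off quickly and recovering slowly is one common way to keep reads responsive while a drive works through a write backlog.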
I wish BitTorrent was better at taking advantage of my fast symmetrical connection. If more users uploaded to me instead of to others, I could upload more to others and it would be faster overall for everyone. Easier said than done, I know. Simple attempts to do this would result in people gaming the system or DoSing it by wasting others' upload bandwidth.