Making BitTorrent Clients Prioritize By Geography?
Daengbo writes "While I live in S.Korea and have virtually unlimited bandwidth in and out of the country, not all my Asian friends are so lucky. Many of the SE Asian and African countries have small international pipes. Even when a user has a high-speed local connection, downloads from abroad will trickle in.
Bittorrent clients apparently don't prioritize other users on the same ISP or at least in the same country. Why is that? Is it difficult to manage? If I were to write a plug-in for, say, Deluge, what hurdles would I be likely to come across? If this functionality is available in other clients or through plug-ins, please chime in."
There is already a plugin for Azureus that does this. I downloaded it about a year ago. I'm at work right now; otherwise I would look at my installation and tell you the exact name of it.
IPs could, theoretically, be prioritized based on a database of known general geographies associated with certain digits. Just remember - prioritizing is one thing, but it's a slippery slope to peer exclusion.
I was under the impression that uTorrent already did this, at least to a 'country' degree.
uTorrent has a feature called local peer discovery that does that exactly. It was even able to discover other people at my university sharing the files.
There is a plugin for Vuze (formerly Azureus) called Ono which does exactly that. Not sure what problems they ran into, but as it is a college project I am sure they would be willing to discuss some of it with you. http://www.aqualab.cs.northwestern.edu/projects/Ono.html
One would not even need to prioritize by geographic location: the client could easily give extra priority points by network class: C first, then B, then A, then the rest. The odds of having a very fat pipe to another machine in the same class C are far better than having a fat pipe to a random machine across the planet.
And that would also alleviate the load on backbone links.
http://www.dieblinkenlights.com
That way they can take advantage of the tendency of IP packets to flow downhill.
What you're looking for is an Azureus plugin called "Ono". It prioritizes based on router hops. Theoretically, this would make peers connected to the same ISP preferred. After that, it would prefer ISPs with direct connections to your ISP. After that, peers reasonably close geographically, i.e. in the same country.
... and in the DRM, bind them.
How good is latency or hop count as an indicator of distance from a peer? The idea is that if one peer is 5 hops away and another is 10, the peer reachable in the fewest hops is the closest.
Jumpstart the tartan drive.
For Vuze, formerly Azureus, there are Ono and P4P, which should do what you're looking for, although for different reasons. Unfortunately, they both rely on people in your region being interested in the same torrents you are, while P4P additionally benefits from an iTracker, an ISP provided tracker that's topology aware (they did some work to prioritize based on ping latency, using that as a distance estimate, but I don't know if it's a fallback mechanism). Due to the iTracker infrastructure and possibly conflicting supporters, there are some privacy concerns.
"Bittorrent clients apparently don't prioritize other users on the same ISP or at least in the same country. Why is that? Is it difficult to manage?"
The reason BitTorrent doesn't prioritize other users on the same ISP or the same country is that it doesn't know which ones are part of the same ISP or the same country. For ISPs, since the introduction of CIDR addresses, ISPs can have multiple blocks of IPs. Can you honestly tell me what all of, say, Comcast's IP blocks are with any degree of certainty?
For countries, you either need to know which IP blocks IANA has allocated to which IP registry or use a geolocation library.
MaxMind's GeoIP seems to be the de facto geolocation library, but they charge money for the "good" version. There is a free version now, but it has some annoying requirements, such as having to include "This product includes GeoLite data created by MaxMind, available from http://maxmind.com/" in all advertising materials and documentation. It also only has a 99.5% accuracy as claimed by its creators, which means the accuracy is probably considerably lower than they claim. Even if it were 99.5%, that means it's wrong for 1 out of every 200 people.
GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011
I've always wondered why ISPs can't give higher speeds if you stay within their network. You'd get your download faster. You'd use less peering bandwidth, costing the ISP less money. Everybody wins.
Prioritizing by network topology is a better way to put it; that just happens to coincide with physical AND political geography in many cases. In the case where you can get 10Mb over a 10-hop connection, or 8Mb over a 3-hop connection, which do you pick? If you pick the latter, there is a good chance that two other users can utilize the other 70% of that 10-hop connection, making total throughput (theoretically) 24Mb.
Oh, you meant prioritize by politics, not geography.
No. You can try reading the summary, asshole. Here, I'll repost it here in case you were too lazy to read it above:
"While I live in S.Korea and have virtually unlimited bandwidth in and out of the country, not all my Asian friends are so lucky. Many of the SE Asian and African countries have small international pipes. Even when a user has a high-speed local connection, downloads from abroad will trickle in.
Bittorrent clients apparently don't prioritize other users on the same ISP or at least in the same country. Why is that? Is it difficult to manage? If I were to write a plug-in for, say, Deluge, what hurdles would I be likely to come across? If this functionality is available in other clients or through plug-ins, please chime in."
For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
So big pipeline! We American have tiny pipeline. Not big pipeline like you!
Actually, reading the summary, the submitter is concerned with people in countries which have small international pipes. If they can prioritize peers who aren't constrained by the international bottleneck, then those people might see a speed boost with BitTorrent.
No. Prioritize by ASN. A smart tracker would get a BGP feed and then hook users together based on locality of network connectivity.
Any other approach is "wrong."
If your ISP's international pipe is flooded then BitTorrent will automatically prefer local peers, as they'll be the only people who can send you data at a fast enough rate. If you notice local peers who you're not connected to, then it's most likely just because they've already reached their connection limit.
Most BitTorrent clients will connect to many peers and try to saturate your downstream bandwidth. They don't care where in the world those peers are as long as they're fast. If, in your part of the world, local peers are faster, then that means you should just automatically connect to more local peers.
Nick
This should be solved in the tracker. It should know about Internet topology (a full BGP table) and fill 2/3 of its answers with clients in the same AS number or at most one AS hop away from you.
Unfortunately, the tracker programmers I have spoken to so far have seen this as a performance problem and a complication of their software, and have shown little interest in it. I am sure the ISP business would be able to save a substantial amount of money with this enhancement, and at the same time end users would see improved download times. This only holds for larger torrents, though, as the peers in a torrent with 50-100 peers are less likely to be close to you.
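A minimal sketch of that tracker-side selection, assuming the tracker has already labelled every peer with an AS number (in practice that mapping would come from a BGP feed, which is not shown here). The function name, the `want` parameter and the AS numbers are hypothetical, and the "one AS hop away" case is left out to keep it short.

```python
import random

def pick_peers(requester_asn, swarm, want=30, local_share=2/3, rng=random):
    """Pick up to `want` peers, about `local_share` of them from the requester's AS.

    swarm is a list of (peer_address, asn) pairs; the ASN labels are assumed
    to have been derived elsewhere (e.g. from a BGP table).
    """
    local = [p for p, asn in swarm if asn == requester_asn]
    remote = [p for p, asn in swarm if asn != requester_asn]
    # Take the local share first, capped by how many local peers exist.
    n_local = min(len(local), round(want * local_share))
    # Fill the rest with random peers from other networks so the swarm
    # does not split into isolated islands.
    n_remote = min(len(remote), want - n_local)
    return rng.sample(local, n_local) + rng.sample(remote, n_remote)

if __name__ == "__main__":
    # Illustrative swarm: 10 peers in each of two documentation-range ASNs.
    swarm = [("10.0.0.%d" % i, 64500) for i in range(10)]
    swarm += [("10.1.0.%d" % i, 64501) for i in range(10)]
    print(pick_peers(64500, swarm, want=6))
```

With a small swarm the local list is often too short to fill its share, which matches the point above that the approach only pays off on larger torrents.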
All of this would go away if (and maybe some do already) BitTorrent clients just sought out (and always continued to seek out) the highest bandwidth peers they can find.
That would even give ISPs the opportunity to promote "in-network" sharing by allowing in-network clients huge amounts of (what is essentially free for the ISP -- when compared to peering bandwidth costs) bandwidth while limiting out-of-network bandwidth.
At Napster I wrote a system to weight peers that were closer to the person searching by using network distance.
It was mostly because universities were complaining and so we weighted everyone on Internet 2 towards each other, but it also worked quite well for service providers like @Home and AOL. Since ISPs don't seem to care as much when their own bandwidth is used, a lot of complaints about our bandwidth consumption disappeared overnight. Indiana state university and someone else helped out if I remember correctly.
It was a rather simple system that used BGP routing tables from a number of routers to build a graph of network connectivity. It wasn't perfect, but it didn't have to be.
That said, with IPv6 weighting is *much* easier because of how the IP space is divided up. You can do a super naive implementation just by prefix.
An Azureus plugin, Ono, does something similar, though I believe they just look up the IP address for a CDN and weight people that look up the same IP towards each other. It is a decent solution, but it only works between people who are running the plugin.
If people are able to get souped-up downloads from their peers based on their location, could this ever lead to extra implications with copyright infringement?
That approach is wrong as well. Who says I want or only want to connect to certain pre-selected peers? I don't need or want a tracker doing that for me, I'll do that myself.
It only takes one man to change the Wisdom of the Crowd to Tyranny of the Masses.
No. Prioritize by ASL. A smart tracker would get a BGP feed...
There fixed it for you...
Seriously though, I'm not that clued up on network acronyms. I know I'll be told to google it, but why can't we just type the whole set of words out? It's not that hard is it?
I have determined that my sig is indeterminate.
The biggest speed issue facing Asia/Australia is the latency of traffic to the rest of the planet. The (Windows) TCP Receive Window is tuned too small for the distances required. If you change the receive window to the maximum, you can get 4x more data in the same period using any client (P2P, browsers, etc...).
Refer to:
http://cable-dsl.navasgroup.com/index.htm#IncreasingWindow
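The reasoning behind the receive window tip is the bandwidth-delay product: a single TCP connection can have at most one window of data in flight per round trip. A rough sketch of the arithmetic follows; the 65535-byte window (the classic limit without window scaling) and the 300 ms round-trip time are illustrative assumptions, not measurements.

```python
def max_throughput_mbps(window_bytes, rtt_ms):
    """Upper bound on one TCP connection's throughput: one window per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6

def window_needed_bytes(link_mbps, rtt_ms):
    """Bandwidth-delay product: bytes that must be in flight to keep the link full."""
    return link_mbps * 1e6 / 8 * (rtt_ms / 1000)

if __name__ == "__main__":
    # Assumed values: 65535-byte window, 300 ms intercontinental round trip.
    print(round(max_throughput_mbps(65535, 300), 2), "Mbit/s ceiling per connection")
    # Window needed to fill an assumed 10 Mbit/s link at the same RTT.
    print(round(window_needed_bytes(10, 300)), "bytes in flight needed")
```

The ceiling applies per connection, so a client with many simultaneous peers is less affected than a single browser download over the same path.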
I'm, I'll, can't, it's?
Autonomous System Number. I don't think it helps much either way. You either know what it means or you don't. Also, Border Gateway Protocol.
I suppose it depends on what problem you are solving. If your goal is to get a particular piece of data as quickly as possible, then if it exists on a peer on your local network, that is where you should pull it from.
Suppose you want a file A, broken up into parts A1, A2, A3, ... A6, where parts A1 - A4 are on computers on your local network (50 Mbps) and parts A1 - A6 are on computers across the ocean (1.5 Mbps). The quickest result will be to begin fetching A5 & A6 from the slow, distant peers while fetching A1 - A4 from the fast, local peers. Given the general pointlessness of providing a full BGP feed to every participant in the network, putting it at the tracker is the best location. What's more, it would be an optimization that could be done now without making any changes at the client end.
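To put rough numbers on the A1 - A6 example, here is a small worked calculation. The 50 Mbps and 1.5 Mbps rates come from the example; the piece size (1 MB, i.e. 8 Mbit) is an assumption made purely for illustration, and the helper name is hypothetical.

```python
def transfer_seconds(n_pieces, piece_mbit, rate_mbps):
    """Time to fetch n_pieces of piece_mbit megabits each at rate_mbps."""
    return n_pieces * piece_mbit / rate_mbps

PIECE_MBIT = 8.0     # assumed piece size: 1 MB = 8 Mbit
LOCAL_MBPS = 50.0    # local network rate from the example
REMOTE_MBPS = 1.5    # transoceanic rate from the example

# Strategy 1: fetch all six pieces from the distant peers.
all_remote = transfer_seconds(6, PIECE_MBIT, REMOTE_MBPS)

# Strategy 2: fetch A1-A4 locally and A5-A6 remotely, in parallel.
# The download finishes when the slower of the two streams finishes.
split = max(transfer_seconds(4, PIECE_MBIT, LOCAL_MBPS),
            transfer_seconds(2, PIECE_MBIT, REMOTE_MBPS))

if __name__ == "__main__":
    print("all remote: %.1f s, split: %.1f s" % (all_remote, split))
```

Under these assumptions the split finishes in about a third of the time, and the remote link is still the bottleneck, which is why starting the remote-only pieces immediately matters.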
In fact, I would not be surprised to find that ISPs would raise the cap on protocols that encouraged transfers within the ISP's network.
Basically a self configuring akamai.
When I was in my first year at college, we were asked to produce a questionnaire about using ATMs, including the question: "If you could change one thing about your bank's ATMs, what would it be?"
The most popular answer I managed to get was "if the machine's running out of money, they should restrict the cash withdrawal function to customers of this branch".
Does anyone see a parallel here?
Yeah, I don't see talking about mountains and canyons and rivers and oceans.
I see talking about countries and borders and pipes.
Countries. Politics.
Geography = mountains and canyons and rivers and oceans and etc.
There is little to no support for multicast by last-mile ISPs.
It would be nice - ISPs keep bitching about how P2P is eating their bandwidth, but they don't bother implementing multicast which would make P2P use a fraction of the bandwidth it currently does.
Admittedly, in addition to lack of support, IPv4 multicast is pretty "meh" - there aren't many multicast addresses available and I have yet to see a good way of choosing/assigning them on a global network.
retrorocket.o not found, launch anyway?
It's called 'political geography'. You're both right. Now shut the fuck up uneducated troll.
Support my political activism on Patreon.
The problem is not "getting the software developed" but getting its deployment okayed by the legal department. The risk of being seen as "helping the pirates" will keep ISPs away from this kind of "optimisation".
ISPs in America are against locals serving content. This is very obvious from the fact that your upload allowances are an order of magnitude smaller than your download allowances. For this reason, BitTorrent clients are far better off prioritizing a larger pipe than a shorter hop.
Remember, ISPs are NOT your friends. They are a contract partner. Their interest is not to make your experience better; it's only to make your experience slightly better than the competition.
For torrents to coexist with ISPs would require:
1. Extending a business partnership with them, and convincing them that they CAN allow users to serve content without choking their already oversold bandwidth.
2. Proving to their salespeople that doing so would be an advertisable asset, thus bringing them more customers.
3. Proving to their lawyers that they would be safe from litigation, both from the media conglomerate and from Uncle Sam controlled by the media conglomerate, for encouraging the spread of unregulated data copying (or copyright infringement to the aforementioned parties).
Always going forward, 'cause we can't find reverse.
Just because there's a term for it doesn't make it true.
I bet you think male and female refer to behavioral traits and societal norms. You know, just ignore all that biology and science.
We have male and female RJ-45 connectors, are these biological? Are you going to tell me they're not really "male"?
Ono uses statistical data from CDNs to be a little bit smarter about picking peers in certain cases. In most cases the random solution is fine; your client can just randomly pick peers then stick with fast ones and drop slow ones. Ono aims to improve performance in certain cases where that strategy isn't very good.
Just in case anyone reading doesn't notice, Ono aims to find peers that are close to you on the network. That doesn't necessarily mean close to you geographically, and so it doesn't answer this Ask /. question.
Nick
I'd prefer one that deprioritizes (or outright bans) my ISP and geographical region.. but then again, I'm a Comcast subscriber in the US. As a Comcast subscriber, I know that I'm not likely to get decent speed from other Comcast subscribers (unless they happen to be on my node), and as a US citizen, I'm well within the reach of anyone who decides they want to sue me because my kids torrented some MP3s when I wasn't looking.
https://www.eff.org/https-everywhere
It's not explicit, but the tit-for-tat algorithm which is at the core of the self-organization mechanism of BitTorrent already favours fast connections. These generally translate to geographical closeness, but they also end up preferring links that are otherwise quick: for instance, in Finland I get quite a bit of stuff from Sweden thanks to the interconnections and the Swedish Bredbandsbolaget.
Adding a GeoIP style thing to BitTorrent can only make the algorithm perform worse, as it would prefer addresses by geographical locality rather than locality as defined by network topology.
Because ISPs need to pay for traffic that leaves their network, they try hard to keep p2p traffic within their network. If anyone big were to pay for something like this, it would be a large ISP. And I think they are already working on something like this. If I am not mistaken, it is for BitTorrent. So keep checking.
That doesn't work well with networks split with CIDR
It does if the algorithm ignores "class" and just prefers hosts with a longer common IP address prefix. Imagine three hosts: you, A on the same /22 as you, and B which is only on the same /9. You're more likely to be close to A than B.
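The longest-common-prefix idea can be sketched in a few lines. This is only an illustration under stated assumptions: IPv4 only, hypothetical function names, and private example addresses chosen so that one peer shares a /22 with "you" and the other only a /9.

```python
import ipaddress

def common_prefix_len(a, b):
    """Number of leading bits two IPv4 addresses share (0 to 32)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    # bit_length() is the position of the highest differing bit.
    return 32 - diff.bit_length()

def rank_peers(me, peers):
    """Order peers so those sharing the longest prefix with `me` come first."""
    return sorted(peers, key=lambda p: common_prefix_len(me, p), reverse=True)

if __name__ == "__main__":
    me = "10.0.0.1"
    peer_a = "10.0.3.1"    # shares a /22 with me
    peer_b = "10.64.0.1"   # shares only a /9 with me
    print(common_prefix_len(me, peer_a), common_prefix_len(me, peer_b))
    print(rank_peers(me, [peer_b, peer_a]))
```

A long shared prefix is only a hint, not proof of proximity, since an ISP can hold several unrelated blocks.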
Labeling three islands A, B, and C doesn't change the fact that the cable connecting B and C is small while the one between A and B is larger. Politics has nothing to do with this.
They're called male and female colloquially.
They're called such because they mate (physically couple, much as most male and female organisms do during sexual intercourse).
And you don't have a male RJ-45 connector.
RJ-45 is registered jack 45.
Only the female end is RJ-45.
Keep trying though.
Politics has a lot to do with it, and in fact is often the reason why countries put big pipes to some countries and small pipes to others. Politics has a lot to do with filtering. Politics has a lot to do with why a person would want to prefer/block certain peers/hops in certain areas.
Geography has very little to do with it.
Last I checked, undersea cables were pretty reliable when they weren't being cut "accidentally".
Mountains don't exactly block packets.
Rivers don't make my bits come out soggy on the other end.
Throughput is what matters, and network topology is the best thing to consider (aside from raw throughput, of course).
I heard some people are working on an advanced file transfer protocol that solves that problem. They call it HTTP. I don't remember what the letters stand for. Combine that with something called (sorry if I get jargony) a "caching proxy" at the ISP, and it works beautifully.
"Believe me!" -- Donald Trump
BT's tit-for-tat transfer algorithm already favors trading with the peers that you have the fastest connection to. Trying to make it geo-specific will just slow it down.
You have confused "geography" with "topography" and "terrain". Network topology is certainly constrained by both politics and terrain, both of which are things that can be studied with geography.
Aha, troll mode: excessive pedant. Got it.
Really, the charting of the earth?
And I proposed it for two popular BitTorrent clients, only to be told, "we don't need that.."
Simple enough to do. Start with Class C addresses: fewer than 2^24 of them, as some are reserved and not routable. Make a bitmap that big, so divide by 8: only a two meg file. Then just watch your connections, total by bytes received, then divide by 8. If the result is greater than, say, 1 kb/sec sustained, then set the appropriate bit to true. Allow the bitmap to be zeroed if you move or whatever.
After getting the list of peers, prioritize connection attempts towards those that have a useful sustainable rate. Nothing worse than seeing 80% of my connections saying connected 30 minutes, 100 KB transferred. Sigh.
While in China, I had a 2 mb connection. But with too many Chinese hammering it, I could sustain 2 mbit up and 1 mbit down. Asymmetric the wrong way..
Seems blindingly obvious to me, yet I still see no clients with this feature.
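A rough sketch of that bitmap, one bit per /24 network (2^24 bits, i.e. 2 MiB). The class and method names are hypothetical, and the threshold is an interpretation: "1 kb/sec sustained" is taken here as 1024 bytes per second averaged over the connection.

```python
import ipaddress

class Slash24Bitmap:
    """One bit per /24 network: set when a peer there sustained a useful rate."""

    def __init__(self):
        self.bits = bytearray(2**24 // 8)  # 2 MiB

    @staticmethod
    def _index(ip):
        # Drop the host octet so every address in a /24 maps to the same bit.
        return int(ipaddress.IPv4Address(ip)) >> 8

    def mark_good(self, ip):
        i = self._index(ip)
        self.bits[i >> 3] |= 1 << (i & 7)

    def is_good(self, ip):
        i = self._index(ip)
        return bool(self.bits[i >> 3] & (1 << (i & 7)))

    def record(self, ip, bytes_received, seconds, threshold=1024):
        """Set the bit if the average rate exceeded `threshold` bytes/sec (assumed unit)."""
        if seconds > 0 and bytes_received / seconds > threshold:
            self.mark_good(ip)

    def clear(self):
        """Zero the bitmap, e.g. after moving to a different network."""
        self.bits = bytearray(len(self.bits))

def prioritize(peers, bitmap):
    """Try peers from known-good /24s first; order is otherwise preserved."""
    return sorted(peers, key=lambda ip: not bitmap.is_good(ip))
```

A client could persist `bits` to disk between sessions and call `prioritize` on each peer list the tracker returns.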
The law is a weapon of the government, not a protection for the likes of you. Surely you understand that.
Last I checked, undersea cables were pretty reliable when they weren't being cut "accidentally".
Mountains don't exactly block packets.
Rivers don't make my bits come out soggy on the other end.
Very true; however, getting around all of these geographical obstacles can be very expensive. That's less political discrimination and more simple economics: they either can't (or won't) spend the money where others will. They might have different priorities, or different revenue/expenses.
Does the Akamai you refer to relate to Akamai Technologies, or is this some obscure networking term that I am unaware of?
Random Thoughts From A Diseased Mind (Not For Dummies)
It refers to AKAM.
Er, what country is where in relation to you has always been a part of geography. India's on the far side of the world, the friendly Canucks live to my north, and Mexico is down to the south. As I've already said, the prioritization isn't based on politics at all; it's preferring in-country peers because they're less liable to be slow in certain parts of the world. It would make it political if it preferred them out of a sense of patriotism. But it's just reflecting the realities of telecommunications in certain parts of the world where physical proximity becomes much more important.
I brought this up on the bittorrent list in, lessee.... 2005 and there was some brief discussion in the thread about factors that might be in play.
I've written here before that Comcast could've saved a hell of a lot of money (and now time before the FCC) by optimizing bittorrent rather than fighting (and denying fighting) it. Who knows, maybe they contributed to the recent Azureus/uTorrent plugins.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
Maybe I should read the article, but it seems to me that throughput is more important than how close peers are. Latency isn't really an issue with big downloads, so if you are getting 10 MB/s from China and 2 MB/s from your neighbor next door, why not go for the throughput instead of the neighbor? You could test all the peers for average throughput through a short test, then select the fastest ones...
LS
There is a fine line between being a cultivated citizen and being someone else's crop. - A. J. Patrick Liszkie
I think they've got it all wrong. The software shouldn't be connecting people to nodes in their area. It should be connecting people to nodes in regions that can't sue them for uploading.
The main reason is you can't detect 'local peers' with zero cost.
If you go by hop count, (which is not necessarily reliable), then you have to spend extra time at the start of each session varying the TTL field of your packets.
If you go by database, like GeoIP, you require large databases for each client and obviously these change regularly. And as previously stated, these are not 100% reliable.
The ideal route is ISP participation, so that mechanisms are available which make distributed systems topology aware. However, this involves a cost to the ISP.
One of the other important factors is that the tracker will return a random subset of peers anyway. So to make this efficient at all you'd really need tracker participation.
Finally, on the more theoretical side of things, you can not have a swarm which is entirely divided geographically. Otherwise you risk losing pieces of the torrent to a subset of peers who are only communicating with one another. So even if you did do this, you would probably want to make sure a percentage of your peers were of a random geographical location.
Rivers don't make my bits come out soggy on the other end.
Heh, soggy bits.
If I'm on ADSL-2 from a great ISP, and a neighbour (next door) is with the same company but has bought their 256K ADSL service, that's going to be fewer hops than another person with ADSL-2 or a 1Gbit corporate link. Latency is slightly better than hops, but not much. The only way that makes sense to me is some sort of algorithm that works out the optimal number of download peers, and the minimum acceptable peer speed based on what your link can handle, the number of peers available, and the expected average transfer speed. If any peer is slower than that limit, it should be disconnected, and then a faster one found. A slight improvement on that would be to find another, connect to it as well, see if it reaches the remaining bandwidth for that slot, and if it perhaps overtakes the previous slot holder in terms of delivery. If so, the previous holder could be dropped in favor of the new one.
I was actually thinking of this very topic a few days ago. I had been using Ono for quite a while with limited results, but the most recent version of Ono allows us to put in our own "nearby" IP lists.
The idea is to get the IP list from the BGP routes of your ISP. The way I did it was to go to www.robtex.com and put in my ISP. My ISP is large and they have multiple regions. I pick out my region using the AS number. Then I look at all the routes within the AS number. A bit of copying and pasting later, I have a list of close IP subnets that I can put into Ono.
As an example, speakeasy.com only has one AS number (AS4355). It has almost 700 subnets which it considers close.
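For anyone curious what matching against such a "nearby" list involves, here is a small sketch using Python's standard ipaddress module. It is not how Ono itself does it; the function names are hypothetical and the subnets are documentation-range placeholders standing in for the routes copied from an AS listing.

```python
import ipaddress

def load_nearby(cidrs):
    """Parse a list of CIDR strings (e.g. pasted from an AS route listing)."""
    return [ipaddress.ip_network(c) for c in cidrs]

def is_nearby(ip, nearby):
    """True if the peer address falls inside any of the listed subnets."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in nearby)

if __name__ == "__main__":
    # Placeholder subnets; a real list would hold the ISP's announced routes.
    nearby = load_nearby(["192.0.2.0/24", "198.51.100.0/24"])
    for peer in ["192.0.2.55", "203.0.113.9"]:
        print(peer, "nearby" if is_nearby(peer, nearby) else "not nearby")
```

A linear scan is fine for a few hundred subnets; a much larger list would call for a prefix tree.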
I'm not a BGP expert so hopefully someone can explain how this works.
The author of http://homepages.xnet.co.nz/~createcoms/ tried to do the same thing, but they rely on manual discovery.
Hopefully this method can be validated and then published on Ono's webpage. Maybe /. will be /.'d
In terms of connecting two countries, the economics takes a back seat to the politics. It's not so much "how much will it cost?" as it is "how much do we want to spend?", "how connected do we want to be with them?" and "which of us is paying?".
No, some land mass is on the far side of the world.
...how routing decisions and packet switching through backbone ISPs works. It's like a brand new generation of ignorant /. readers!