Errr, what? Seriously, I'd do some research first - unless you're using really small TCP packets, you should easily be able to manage 20:1 if not 50:1. With a non-acknowledged protocol such as UDP, you can increase that to over 100:1.
Just because you're using a vastly inefficient method to download your "must-have" illegal TV-rips doesn't mean we have to blindly accept your facts.
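For what it's worth, the ratio is easy to sanity-check. This is only a back-of-envelope sketch: the 1500-byte data packet (a typical Ethernet MTU) and the 40-byte ACK (20-byte IP header plus 20-byte TCP header, no options) are assumptions, not measurements.

```python
# Assumed sizes, not measured values:
DATA_PACKET_BYTES = 1500  # full-size packet on a typical Ethernet MTU
ACK_BYTES = 40            # IP header (20) + TCP header (20), no options


def download_to_upload_ratio(data_packets_per_ack: int) -> float:
    """Bytes downloaded per byte of ACK traffic sent back upstream."""
    return (DATA_PACKET_BYTES * data_packets_per_ack) / ACK_BYTES


# One ACK for every data packet.
print(download_to_upload_ratio(1))  # 37.5
# Delayed ACKs: one ACK for every second data packet.
print(download_to_upload_ratio(2))  # 75.0
```

Smaller data packets or TCP options on the ACKs push the ratio down, which is why the figure only holds if you aren't using really small packets.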
I've done exactly the same as you, used every single tool under the sun, eventually settling on Unison until I realised I was being silly...
Let's put it this way - just set up each computer how you want it, and sync the *data*, not the whole home directory.
For instance, my Documents are synced with Dropbox (though I'm tempted to move them to Ubuntu One), my development directories are generally stored in some kind of revision control (svn/bzr/git) and either not synced or, at worst, unison-ed, and everything else just stays on the machine it was created on, backed up with duplicity to a central fileserver hosted in France.
When you realise that syncing home is *not* good, it suddenly becomes clear that what you need and what you want are completely different.
And the whole £ vs # issue. And the placement of the \ key is a whole lot more sensible. And the return key is double height, which is VERY useful!
Well, the 10G WAN PHY is 9.953Gb/s but the 10G LAN PHY is certainly 10Gb/s.
Wait, you can't have the TLD "ALLAH"? Why not? I'm sure there will be a "GOD" one...
That's because it's the American market.
I have a Honda Civic (08 plate) in the UK, and it looks like a Cylon. It's fantastic, I love it, and I get 50-52 mpg on a 2.2 litre Diesel.
The American version of the Honda Civic looks like it's been pulled out of a scrapheap, it ain't pretty.
Where do you get your 250GBP servers from? And do they have hot-swap drive bays? =)
Here's the harsh truth - Ubuntu will never gain mainstream acceptance in the developing world. They might gain 10% or even 20% share, but there's an important fact to consider:
Your average person with a computer wants to run what everyone else is running - and that's Windows. They're not going to accept an alternative because they want the same skillset the developed world has. That doesn't mean they'll spend $$$ on Windows, that means they'll get a copy from the local market trader, and will run it illegally.
I run Ubuntu on my servers and my laptop because it's better, and because I appreciate the free software ideals. I still run Windows at work, and I can't see myself moving anytime soon.
Here's a scenario for you, that will cater to your needs:
Buy the most powerful machine money can buy - up to about £3000 in terms of CPU power, lots of RAM, and every storage slot filled with high-capacity storage - stick with SATA if available, otherwise SAS disks will do.
Then, go to Viglen, and buy their crappy little £79 PCs that go on the back of the monitor with a VESA mount. They're shockingly underpowered - 400MHz, but they make fantastic thin clients.
You can run about 100 thin clients on such a system, and it'll work really nicely.
However, it being a school - there's no chance it'll take off, and you'll be stuck with the same rubbish everyone else has.
As an IT professional, I actually am against computers in schools. Typing is all well and good, but kids these days already know Google and Word; anything they actually need for modern business is pretty much self-taught or taught at their first place of employment.
Computers are the bane of the modern UK school system.
I'd like to add my thanks too - I'm absolutely awestruck by this, I'm incredibly envious of those that get to visit the ISS, and I truly hope that one day we'll get to do the same.
Humbling, it truly is!
"You can't possibly conclude that the entire engine is 30MB in size."
I've seen better physics engines in 64kB. Seriously.
30MB is a HUGE code base for what is essentially a simple physics and 3D engine. Now, I don't have much experience with D3D, but with OpenGL, you can get something very respectable from an engine (with test code) in well under 1MB using procedural texture generation. Even with complex textures, models and so on, 30MB is a *lot*.
Better than that, I've actually purchased games on Steam that I considered "cheap" with the intention of playing them when I'd finished with something else, and never got around to actually playing them.
When you consider the cost of distribution is in the pennies, Steam actually works out like a really good deal - and the auto-update feature is very nice too!
Put simply, Steam may have its foibles, but it's revolutionised the PC game industry. I won't buy games in the shops any more, unless it's for a console - and the Xbox360 method of being able to download is swinging that way too.
I can honestly say that while I may own 30 games, I have zero pirated software on my computer, and I put that solely down to Steam and high-quality open-source software such as OpenOffice and Ubuntu (although I do have a legit copy of Windows Vista Business too).
SUCS is the Swansea University CompSoc. It came to fame by being credited in the kernel boot sequence for a while.
Milliways is their BBS, which is quite cool, but is essentially IRC chat with a few legacy BBS features patched on.
Oh, I see what you're saying now - make a big fat array and tag the bits 1 for use, 0 for don't use.
Sorry, that doesn't work in practice.
Plus, you're patently optimising for bulk file transfer - most file transfers are under 100 packets long, and the majority of those are under 10 packets long.
Here's a better solution that follows your idea:
An 8-bit array of every AS number on the internet.
Increment the number by 16 for every AS hop.
Increment the number by 64 for every hop through a known transit provider.
For tiebreakers, add on single digits for end-point congestion, on a sliding scale.
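A minimal sketch of that scoring scheme, with a few assumptions spelled out: `as_hops` counts hops through non-transit ASes, `transit_hops` counts hops through known transit providers separately, the congestion tiebreaker is a single digit 0-9, and the score saturates at 255 so it fits in one byte. The function name and the ASNs (taken from the range reserved for documentation) are purely illustrative.

```python
def as_path_score(as_hops: int, transit_hops: int, congestion: int = 0) -> int:
    """Score a remote AS; lower is better. Result fits in one unsigned byte."""
    if not 0 <= congestion <= 9:
        raise ValueError("congestion tiebreaker must be a single digit (0-9)")
    score = 16 * as_hops + 64 * transit_hops + congestion
    # Assumption: saturate rather than wrap when the path is very long.
    return min(score, 255)


# One byte per AS number. Assumption: 16-bit AS numbers only.
scores = bytearray(65536)
scores[64496] = as_path_score(as_hops=2, transit_hops=0)                # 32
scores[64497] = as_path_score(as_hops=1, transit_hops=1, congestion=3)  # 83
scores[64498] = as_path_score(as_hops=0, transit_hops=4)                # 255 (saturated)

# Prefer the AS with the lowest score.
best = min((64496, 64497, 64498), key=lambda asn: scores[asn])
print(best)  # 64496
```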
Just downloading a list of "slow vs fast" will be local to your branch of your ISP - if we're both on ABC ISP, but you're in LA and I'm in London, you're going to get faster speeds to Tokyo than I am, but I will get faster speeds to Amsterdam than you are, even though we're on the same ISP.
Does that make sense?
Three points:
Why are you still dividing by 8?! IP addresses are 4 bytes long, not one bit!
You need to know the AS-path for this situation. It doesn't change that often (unlike a traceroute) and generally allows you to find out easily where good peering relationships are, which gets you the fastest speeds (usually).
If throughput is greater than 1kB/s, flag it? What are you on? "Throughput" (bandwidth) does *not* matter. You're thinking as though the internet connects to you - it doesn't. You are part of a collective, which connects to other collectives which exchange data.
There is no point in thinking of things from the point of view of the user for this, it's all to do with network effects - who's closest in terms of AS path count, what's in the way (China has firewalls which slow things down) and most importantly, how oversubscribed is the remote side!
All routes "work" - that's the whole point. You see slow speeds because the very last hop is congested, not the networks in-between.
This is a "simple to understand" point, like you say.
You just haven't understood it.
Ok, let's say you're right, and there are a maximum possible number of prefixes at 30 million.
Why do I divide by 8? Surely an IP address is 4 bytes, plus a 4 byte origin AS number, and a path length, and some community information and costs and so forth...
The current BGP table weighs in at 70MB. Compressed.
The full routing information base on 29th September (the last copy I downloaded) is 740MB. And that's only with AS path information, no costings or communities.
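To put rough numbers on the divide-by-8 argument: the sketch below compares a one-bit-per-prefix bitmap with a minimal per-prefix record. The record layout is an assumption for illustration only (4-byte prefix, 1-byte prefix length, 4-byte origin AS, 1-byte path length), with no communities or costs, so a real table would be bigger.

```python
PREFIXES = 30_000_000  # the hypothetical maximum from the discussion

# One bit per prefix: this is what "divide by 8" assumes.
bitmap_bytes = PREFIXES // 8

# Assumed minimal record: prefix (4) + prefix length (1)
# + origin AS (4) + path length (1) = 10 bytes per entry.
RECORD_BYTES = 4 + 1 + 4 + 1
table_bytes = PREFIXES * RECORD_BYTES

print(bitmap_bytes / 1e6)  # 3.75 (MB)
print(table_bytes / 1e6)   # 300.0 (MB)
```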
Put simply, the best way to make P2P efficient is to prefer peers in the following order, from most preferred to least preferred:
Is the peer local (i.e. on the same subnet)?
Is the peer in the same prefix range?
Is the peer in the same AS number (a simple lookup)?
Is the peer in a peering relationship with the ISP (no transit link means faster and cheaper - again, a simple WHOIS lookup)?
Is the peer two hops away (i.e. can you see it through a transit link without passing through someone else)?
Is the peer in the same RIR (RIPE for Europe/Middle East, ARIN for America/South Africa etc)?
Is the peer reachable at all?
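The list above can be sketched as a simple tier function. Everything here is illustrative: the `Peer` fields are hypothetical and are assumed to have been filled in beforehand from registry data, and the addresses and AS numbers come from the ranges reserved for documentation.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass
class Peer:
    ip: str
    asn: int
    peers_with_my_isp: bool  # hypothetical: derived from registry data
    two_hops_away: bool      # hypothetical: visible via one transit link
    rir: str
    reachable: bool = True


def preference_tier(peer, my_subnet, my_prefix, my_asn, my_rir):
    """Return 0 (most preferred) to 6 (merely reachable); 7 means unusable."""
    if not peer.reachable:
        return 7
    addr = ip_address(peer.ip)
    checks = [
        addr in ip_network(my_subnet),  # same subnet
        addr in ip_network(my_prefix),  # same prefix range
        peer.asn == my_asn,             # same AS
        peer.peers_with_my_isp,         # peering, no transit link
        peer.two_hops_away,             # one transit link away
        peer.rir == my_rir,             # same RIR
    ]
    for tier, matched in enumerate(checks):
        if matched:
            return tier
    return 6  # reachable, but nothing better to say about it


me = dict(my_subnet="192.0.2.0/26", my_prefix="192.0.2.0/24",
          my_asn=64496, my_rir="RIPE")
peers = [
    Peer("203.0.113.9", 64499, False, False, "ARIN"),
    Peer("198.51.100.5", 64496, False, False, "RIPE"),
    Peer("192.0.2.200", 64496, False, False, "RIPE"),
    Peer("192.0.2.10", 64496, False, False, "RIPE"),
]
ranked = sorted(peers, key=lambda p: preference_tier(p, **me))
print([p.ip for p in ranked])
```

Sorting by tier puts the same-subnet peer first and the far-away one last, without needing a live BGP feed.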
No BGP feed necessary, it can all be done with a copy of the DBs from RIPE, ARIN, AFRINIC etc, which could be downloaded *once* and updated once a year or something, rather than getting "live" data with all the bouncing etc.
Put simply, it doesn't matter WHERE in the world the other side is, ping doesn't matter with file transfers. All that matters is packet loss and congestion.
The reason your internet sucks is that if you were given good upload bandwidth, people would run servers from their homes - and that's a premium service someone can charge you a lot extra for. If you want guaranteed bandwidth, with no contention and speeds that don't suck, you'll pay for it, and trust me, it ain't cheap. Especially not to your home.
That's because Class C addresses haven't existed since 1994.
On "the internet" right now are address sizes all the way from /8 down to /25 - meaning that there are potentially more than 30 million entries, so you're looking at a 300 MB file, that would have to be updated fairly regularly.
The best way of dealing with it is just to do a WHOIS lookup on your IP, find out which AS it is associated with, and what the next-hop AS is - if it's declared as peering, use it. If it's transit, don't prefer it, but keep it anyway.
As many have alluded to, the best way to do this is on the tracker, where the tracker has access to a BGP feed, and a copy of the RIPE/ARIN/APNIC etc databases, to save on DB load.
Or, just use HTTP with caching proxies set up by the ISP, and stop downloading the latest episode of Strictly Come Dancing on BitTorrent.
When the IPv4 pool runs out, I can see carrier-grade NAT becoming a real option, and caching proxies would then be "easy" to implement, and would save a lot of transit costs for the smaller ISPs. Except in Europe, because we're all about the peering.
I nearly woke up half the street HAH-ing to that one ;)
I've been a member of the organising team at Assembly since 2003, and attending since 2002. As a Brit, I can say that Assembly is unique in its atmosphere, and that I've never heard of anything stolen. PLENTY of things lost though, over the years.
Yes, unfortunately, they do. Copyright infringement doesn't matter which border it crosses. Although Cliff Richard's songs of the 50s are becoming out-of-copyright (on the recordings) in the UK, I still can't get US music from the 40s as it's under copyright. Go figure.
This is using cut-through, not checking the CRC etc. It's just using a vessel with a known return time to "store" the packet while the route is chosen.
I guess technically it's a hybrid of S&F and C-T.
Here's an idea, what if you used one of these ideas at a fibre termination point? You "read" the signal's header using part of the light ray, and divert the rest of it to the "slow buffer". After a few microseconds, the path is chosen, and the light ray is reflected back into the system, outputting to another fibre... Or is that just silly? Sounds like optical switching to me, although it does lead to degradation of the signal; presumably a repeater could be integrated into the system?
Microsoft's dildonics programme finally useful. Except the "rips off your genitals and faxes them to the FBI" is a feature not a bug.
Same here, I'm keeping my mirror running forever more :(
Will have to turn off bashpodder updating it though!
The way I understood it, and I'm probably very wrong in this, was that with Rate Limiting:
Several packets arrive.
Some make it through to the physical line, up to the agreed limit.
The rest are dropped - and as so many packets are lost, many RSTs occur, meaning you get about 10-25% less throughput.
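A minimal sketch of the rate-limiting behaviour as described in the steps above, assuming a hard byte budget per interval with everything over it dropped. The function name and numbers are made up for illustration, and it does not model how TCP reacts to the loss.

```python
def police(packet_sizes, limit_bytes):
    """Forward packets until the interval's byte budget is spent; drop the rest."""
    sent, dropped = [], []
    budget = limit_bytes
    for size in packet_sizes:
        if size <= budget:
            budget -= size
            sent.append(size)
        else:
            dropped.append(size)
    return sent, dropped


# Five full-size packets arrive, but only 4000 bytes are allowed this interval.
sent, dropped = police([1500] * 5, limit_bytes=4000)
print(len(sent), len(dropped))  # 2 3
```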
With QoS, I thought:
Several packets arrive.
The router determines which to send first.
The router drops some and sends back ICMP congestion messages so that the sending speed is reduced.
Over a number of packets, the flow rate maxes out the connection.
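The "determines which to send first" step can be sketched as a strict-priority queue. This is only an illustration of the description above, assuming a lower number means higher priority and that packets arriving at a full queue are simply tail-dropped; the congestion feedback to the sender is not modelled.

```python
import heapq


def schedule(packets, queue_limit):
    """packets: list of (priority, name); lower priority number is sent first."""
    queue, dropped = [], []
    for seq, (priority, name) in enumerate(packets):
        if len(queue) < queue_limit:
            # seq keeps arrival order among packets of equal priority.
            heapq.heappush(queue, (priority, seq, name))
        else:
            dropped.append(name)  # assumption: tail drop when the queue is full
    send_order = [heapq.heappop(queue)[2] for _ in range(len(queue))]
    return send_order, dropped


arrivals = [(2, "bulk1"), (0, "voip1"), (2, "bulk2"), (1, "web1"), (0, "voip2")]
send_order, dropped = schedule(arrivals, queue_limit=4)
print(send_order)  # ['voip1', 'web1', 'bulk1', 'bulk2']
print(dropped)     # ['voip2']
```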
Is that wrong?