craig:~$ psql
Welcome to psql 8.3.3, the PostgreSQL interactive terminal.
craig=> set statement_timeout=1000;
SET
craig=> SELECT generate_series(0,100000000000000000);
ERROR: canceling statement due to statement timeout
Another comment here revealed part of why someone might think a tool like this was useful:
In MySQL, EXPLAIN apparently works more like PostgreSQL's EXPLAIN ANALYZE (and related features in other RDBMSs). MySQL's EXPLAIN actually executes the query rather than just running it through the query planner. The documentation even warns that data modification is possible with EXPLAIN in some circumstances.
If your database gives you no way to ask the query planner what it will do without actually executing the query, something like this begins to look faintly useful. Personally, though, I can't imagine voluntarily using such a database.
Most PostgreSQL users don't seem to use the existing, and superior, tools like EXPLAIN, EXPLAIN ANALYZE, PgAdmin-III's graphical explain, etc. I'm sure the same is true for users of many other databases.
It's not like these tools are particularly difficult to use or understand. No training is required, though being willing to think and read a little documentation helps if you want to get the most out of them. Understanding at least vaguely how databases execute queries is handy for any database user anyway. The same understanding is required to get anything useful out of this just-posted tool.
Anyway, as I've noted elsewhere, the existing tools for this do a much better job due to integration with the RDBMS and superior knowledge of how the DB will execute the query.
I don't see what this has over EXPLAIN and an appropriate graphical display tool like PgAdmin-III. There are large numbers of tools that display graphical query plans - and unlike this simple SQL parser, they know how the database will actually execute the query once the query optimiser is done with it.
Furthermore, a simple SQL parser has no idea about what indexes are present, available working memory for sorts and joins, etc. It can't know how the DB will really execute the query, without which it's hard to tell what performance issues may or may not arise.
See comment 24461217 for a more detailed explanation of why this whole idea makes very little sense.
Execution of SQL statements can require the RDBMS to perform nested loops over parts of the query execution.
This can be an issue if the DBMS is forced to do something like perform a sequential scan of one table for each record matched in another table. That gets expensive *fast*.
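To put rough numbers on "expensive fast", here is a toy sketch in Python. It is not how any real RDBMS implements its joins; the table sizes and key values are made up, and the hash-based alternative just stands in for "any plan that avoids rescanning the inner table".

```python
def nested_loop_seq_scan(outer_rows, inner_rows):
    """Join by rescanning the whole inner table for every outer row."""
    comparisons = 0
    matches = []
    for o in outer_rows:
        for i in inner_rows:  # full sequential scan, repeated per outer row
            comparisons += 1
            if o == i:
                matches.append((o, i))
    return matches, comparisons


def hash_join(outer_rows, inner_rows):
    """Join by building a lookup table on the inner side once, then probing it."""
    lookup = {}
    work = 0
    for i in inner_rows:  # one pass to build the lookup table
        work += 1
        lookup.setdefault(i, []).append(i)
    matches = []
    for o in outer_rows:  # one probe per outer row
        work += 1
        for i in lookup.get(o, []):
            matches.append((o, i))
    return matches, work


outer = list(range(0, 2000, 2))  # 1000 made-up keys
inner = list(range(1000))        # 1000 made-up keys
slow_matches, slow_work = nested_loop_seq_scan(outer, inner)
fast_matches, fast_work = hash_join(outer, inner)
print(slow_work, fast_work)
```

Both strategies return the same rows, but the rescanning version does work proportional to the product of the two table sizes, while the other does work proportional to their sum.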
There are many other possible performance issues, of course.
However, I don't see how SQL parsing can tell you much about the performance characteristics of the query. The database's query optimiser makes choices about how to execute the query, and is free to change its mind depending on configuration parameters, available resources, system load, disk bandwidth, present indexes, statistics gathered about data in the table, etc. PostgreSQL's planner for example does make heavy use of table statistics, so query plans may change depending on the quantity and distribution of data in a table.
Any decent database can already tell you how it will execute a query (and usually give you a performance readout from an actual execution of the query). There are plenty of GUI tools for displaying the resulting query plan output graphically. PgAdmin-III can do it, for example.
A simple SQL parser can have no idea about what indexes are configured, the distribution of the data, how much working memory the database has available for sorts and joins, etc. The database knows these things - and can already tell you how it will, or did, execute a query - so why not let it do its job?
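As a small illustration of that point, here is a sketch using Python's bundled sqlite3 module. SQLite is only a stand-in here (I'm assuming nothing about PostgreSQL's output format), and the `orders` table, its contents, and the index name are all made up. The query text never changes, but the plan the database reports does once an index exists, which is exactly the information a standalone parser doesn't have.

```python
import sqlite3


def plan(conn, sql):
    """Ask the planner how it would run `sql`, without fetching the query's rows."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the description


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("customer%d" % (n % 50), n * 1.5) for n in range(1000)],
)

query = "SELECT total FROM orders WHERE customer = 'customer7'"
plan_before = plan(conn, query)  # no usable index yet: a full table scan
conn.execute("CREATE INDEX orders_customer_idx ON orders (customer)")
plan_after = plan(conn, query)   # same SQL text, different plan

print(plan_before)
print(plan_after)
```

The exact wording of the plan text varies between SQLite versions, so treat the printed strings as informative rather than something to parse.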
Interesting. I almost always find the opposite - disks that are practically dead (horrible scraping noises, continuously developing bad sectors, can't even read sector 0, etc) that the SMART health check reports as just fine.
SMART is wonderful for its self-testing capabilities. It's also very handy for access to the raw vendor attributes like uncorrectable sector counts and reallocated sectors so you can make your own judgement about disk health. The simple health check seems to be a pointless waste of time, though.
Note also that some of those "false positives" will have actually been faulty. A disk may develop bad sectors that're "corrected" (reallocated) through repeated overwrite attempts. However, those disks usually just keep on developing more bad sectors, and while each one can be "fixed" until the disk runs out of reserve sectors to reallocate to, each bad sector still results in corrupt files and data loss. At any given time the disk might pass a self test, but it's nonetheless quite defective.
The SMART health check, which is all most BIOSes ever access, is almost completely useless anyway. I've seen it pass on disks so dead that they made horrible scraping noises on spin-up and couldn't even read sector zero.
SMART is wonderful for its self-testing capabilities. It's also very handy for access to the raw vendor attributes like uncorrectable sector counts and reallocated sectors so you can make your own judgement about disk health.
The simple health check, though, is a pointless waste of time. One might be forgiven for suspecting that vendors set the thresholds very high to minimize RMA rates...
For a VPN to be useful you need somewhere to terminate it. Public free or subscription VPN termination endpoints don't seem to be too common (unsurprisingly, given the potential for abuse and relatively limited utility).
There are places that offer virtual servers or simple shell accounts where you can set up your own VPN termination, though. Even an SSH tunnel is quite sufficient, really.
However, you also need to be sure that your termination point doesn't have a deal with the Chinese gov't to permit monitoring of traffic of Chinese origin. I would want to be pretty careful about this myself after Yahoo, Google, etc's recent dealings. I wouldn't be too worried about virtual server providers or shell services, but a dedicated VPN termination service might need to be checked out carefully.
Finally, blocking and content filtering are only half the problem. It's still easy to tell that you are using a VPN. This might not be a very good thing, especially if you somehow otherwise draw official attention.
I know that pain well - I'm in Western Australia myself.
Three offers laptop-based HSDPA broadband for AU$15/month, which seems to be the cheapest 'net access around given that there's no need for a phone line, line rental, etc. However, if you want to do anything that needs real bandwidth (and download cap) ADSL still seems to be the only option.
At least with the unbundled local loop services Telstra can be forced to let their competitors have direct access to the subscriber's copper tail - hence all those "Naked" DSL services. The situation has improved a lot since Telstra was forced by the ACCC to set sensible pricing for ULL.
Telstra, by the way, charges $10/MB (as opposed to Three's $0.10/MB) for mobile data accessed from a phone without a data plan configured. That's how they ship their phones by default - and at advertised HSDPA data rates, that works out to $32,000/hour. They ship phones with no data plan and data unbarred. Nice guys.
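For anyone wanting to check the hourly figure, here is the arithmetic as a quick Python sketch. Two assumptions of mine: the quoted prices are per megabyte, and the "advertised HSDPA data rate" is 7.2 Mbit/s (a commonly advertised HSDPA tier; the comment above doesn't say which rate was meant).

```python
HSDPA_MBIT_PER_S = 7.2  # assumed advertised rate, not stated in the comment
PRICE_PER_MB = {"Telstra": 10.00, "Three": 0.10}  # AU$ per megabyte, as quoted


def cost_per_hour(price_per_mb, mbit_per_s=HSDPA_MBIT_PER_S):
    """Cost of running the link flat out for one hour."""
    megabytes_per_hour = mbit_per_s / 8 * 3600  # 8 bits per byte, 3600 s per hour
    return price_per_mb * megabytes_per_hour


hourly = {carrier: round(cost_per_hour(price), 2) for carrier, price in PRICE_PER_MB.items()}
print(hourly)  # {'Telstra': 32400.0, 'Three': 324.0}
```

Under those assumptions the default Telstra rate comes to about $32,400 for an hour at full speed, in line with the roughly $32,000 quoted.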
Speaking of Telstra, the whole FTTN thing looks like a gigantic fiasco in the making - even now. Telstra's amazing demands that they be able to use the new - largely taxpayer funded - network in an anticompetitive and exclusive manner are just incredible. At least so far their arguments aren't being taken too seriously.
Well, that's covering the salaries of the team of people who'll be assigned to monitor and hand-filter the connection, including your email, web browsing, and IP phone calls;-)
More likely it's an attempt to extract money from rich media companies - who'll just knock it off their taxable income anyway - but the censor army isn't as far fetched as I'd like to think.
It's a little scary that satellite or UMTS/HSDPA 'net access might actually be cheaper than local ADSL circuits, though.
Yep. I have no idea why they're being permitted to get away with it. It's *way* more than "reasonable network management". Reasonable network management would include dropping packets, using ICMP destination host/port unreachable messages to ask the remote peer to terminate the connection, and many other things that are not forged RST packets.
I don't even understand why they chose this method. ICMP destination-port-unreachable would do the job just as well, and with way less legal ambiguity.
I agree that what Comcast is doing is unacceptable. It's deeply dodgy to go spoofing packets so traffic appears to come from someone else. It violates the trust the user has in the routers between them and the endpoint - the assumption that they'll carry the data and not mess with it.
As for user-flagged low priority traffic: Implementation is trivial. IP headers already contain ToS flags that're intended for exactly this sort of routing priority control. The "throughput" flag is ideally suited to bulk, lower priority traffic that's not sensitive to latency or occasional packet loss.
Routers can already prioritize based on ToS flags and intelligently queue traffic. The Linux router that runs my multihomed ADSL at work does it - it's really not difficult.
Incentives: If you live somewhere where your use of your connection is limited already and you have traffic allowances, then incentives aren't hard to come by. Australia, say. Or most of the rest of the world. An increased allowance for some classes of data in exchange for identifying that data as low priority (and letting the ISP drop it to preserve services for other traffic when load spikes) is an obvious first step.
I'd agree that your expectations are fair, in that the way US ISPs do their traffic management is deeply flawed. They sell you an "unlimited" service with small print that lets them disconnect heavy users, throttle traffic, etc. At least here (Australia) they're honest and specifically identify the limits on your service - and let you pay (lots) to increase them.
As ISPs here already meter traffic, enforcement is trivial. They already often have two meters per user - one "on peak" and one "off peak" - so all they need to do is class ToS=throughput traffic into the off peak class no matter when it occurs. If the user goes over their limit the usual action is to throttle all their traffic down to something miserly like 128kbps or 64kbps.
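The accounting rule really is that small. A rough Python sketch, with the caveat that the off-peak window (01:00 to 07:00) and the packet tuples are invented for illustration and no ISP's billing system is being described:

```python
IPTOS_THROUGHPUT = 0x08  # "maximize throughput" bit of the IPv4 ToS byte (RFC 1349)


def meter_for(tos, hour, off_peak_hours=range(1, 7)):
    """Pick which of the two per-user meters a packet is charged to."""
    if tos & IPTOS_THROUGHPUT:
        return "off_peak"  # flagged bulk traffic: off-peak meter at any hour
    return "off_peak" if hour in off_peak_hours else "on_peak"


def account(packets):
    """Sum bytes per meter for (size_bytes, tos_byte, hour_of_day) tuples."""
    usage = {"on_peak": 0, "off_peak": 0}
    for size, tos, hour in packets:
        usage[meter_for(tos, hour)] += size
    return usage


usage = account([
    (1500, 0x00, 20),              # ordinary traffic in the evening
    (1500, IPTOS_THROUGHPUT, 20),  # flagged bulk traffic in the evening
    (1500, 0x00, 3),               # ordinary traffic at 3am
])
print(usage)  # {'on_peak': 1500, 'off_peak': 3000}
```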
It's not my idea by the way. It's been around for quite some time, and looks like the way that Internet-wide QoS will probably land up working in the long run.
You identify what's important. You do so not by identifying what is particularly high priority, but by identifying what is *lower* priority than anything else. You get incentives, such as lower traffic costs, for doing so.
If you live somewhere where ISPs still don't limit your traffic then this won't mean much to you. I know they're starting to in the US though, so you'll probably soon be seeing monthly or daily traffic caps if you're not already.
If setting your bulk traffic to low priority means that you can use other services more, and can run your bulk downloading tools for longer (for more overall data transferred even if at a lower rate) then that's got to be a win, right?
Many bulk transfer apps already set TCP/IP ToS flags or let you configure them. I'm actually surprised that ISPs aren't already offering to let users count traffic with the throughput ToS flag set against their off-peak traffic allowances.
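On the application side, setting the flag is a one-liner with the standard sockets API. A minimal Python sketch, assuming a platform that exposes `IP_TOS` (Linux and most Unix-likes do; the helper name `open_bulk_socket` is mine):

```python
import socket

IPTOS_THROUGHPUT = 0x08  # "maximize throughput" bit of the IPv4 ToS byte


def open_bulk_socket():
    """Create a TCP socket whose outgoing packets are marked as bulk traffic."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, IPTOS_THROUGHPUT)
    return sock


sock = open_bulk_socket()
tos = sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)  # read the marking back
sock.close()
print(hex(tos))
```

Whether any router along the path honours the marking is entirely up to the networks involved.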
Certainly what comcast is doing isn't QoS. It's network management, sure, but very much sledgehammer style. It's extremely dodgy.
QoS, however, does *not* imply that all traffic gets delivered eventually. In fact, the primary mechanism for QoS with TCP/IP is to drop packets when traffic flows are too fast. This causes the TCP stack on the sender to throttle back the send rate when it notices that ACKs aren't coming back for those packets from the other end, so they're presumably getting lost along the way.
My QoS configuration on my work's network uses exactly this mechanism. It also drops packets to ensure that our ADSL links are never more than 90% utilized, which makes sure that I can keep the (deep) packet queue at my ISPs' routers empty and handle packet queuing with appropriate scheduling and priority on my router.
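The "never more than 90% utilized" part can be sketched as a token-bucket policer. This is a simplified Python model, not my actual router configuration (which lives in the kernel's traffic control layer); the link speed, burst size, and packet size are hypothetical.

```python
class Policer:
    """Token-bucket policer: drop packets that would exceed a target rate."""

    def __init__(self, link_bits_per_s, utilisation=0.9, burst_bytes=3000):
        self.rate = link_bits_per_s * utilisation / 8  # allowed bytes per second
        self.capacity = burst_bytes                    # largest permitted burst
        self.tokens = burst_bytes
        self.last = 0.0

    def allow(self, now, size):
        """Return True to forward a packet of `size` bytes arriving at time `now`."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False  # dropped; a TCP sender will back off when the ACK never comes


LINK_BITS_PER_S = 1_000_000  # hypothetical 1 Mbit/s uplink
PACKET_BYTES = 1500
policer = Policer(LINK_BITS_PER_S)
interval = PACKET_BYTES * 8 / LINK_BITS_PER_S  # packet spacing at full line rate

offered = forwarded = 0
for n in range(1000):  # offer traffic at 100% of line rate
    offered += PACKET_BYTES
    if policer.allow(n * interval, PACKET_BYTES):
        forwarded += PACKET_BYTES
ratio = forwarded / offered
print(round(ratio, 3))
```

With traffic offered at full line rate, roughly one packet in ten gets dropped, which keeps the link just under saturation so the queue builds on your own router rather than at the ISP's.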
Er, that blocks *all* TCP packets with the RST flag set that're destined to your BitTorrent port. That'll cause some interesting problems, though probably nothing worse than your treacherous ISP is doing to you already.
In particular, you might have to increase kernel limits on open TCP/IP connections, decrease connection timeouts, etc.
Blocking RST packets with iptables is trivial, but an ugly hack at best. It also won't stop more thorough blocking methods like corrupting BT traffic (so your machine eventually blacklists the sender), injecting fake data packets, or simply dropping traffic.
That sounds nice, but it relies on ISPs not overselling capacity.
You can get service with ISPs that don't oversell, and actually have enough upstream bandwidth to service all their customers downloading and uploading at max speed all the time. It costs 20-30 times as much, but it's available. After all, most ISPs operate at a contention ratio of between 10:1 and 30:1, where they have enough bandwidth for 1 fully utilized connection for every 10-30 signed customers.
What might be a more reasonable compromise is for ISPs to reserve a fixed 64kbps or so per user. Even that, though, will quickly get expensive. They really need to be allowed to use QoS to provide acceptable performance for latency-sensitive applications while continuing to service bulk traffic - and doing it all cheaply.
Hmm. I'm not convinced. What about VoIP? I *like* my low-latency reliable VoIP, and I like the fact that my ISP is able to prioritize it over bulk traffic like BT. Ditto small HTTP traffic bursts, DNS requests, etc.
Rather than force all traffic to be treated equally, the more sensible approach would seem to be to provide incentives to flag bulk traffic as such.
Here in Australia, for example, we have small download quotas - often 5GB or less, but up to 40GB or so for "premium" connections. ISPs also generally offer extra download allowances during off-peak times to encourage file-sharers etc to mostly hammer the network when nobody else cares. Why not treat all IP traffic with the IP TOS throughput flag set as low-priority traffic to be sent only if nothing else of a higher priority is waiting, and charge it to the off-peak allowance at all times?
The only issue I really see with that is that ISPs might not feel the need to expand capacity when they're "only" dropping low priority traffic. However, that's when commercial incentives come into play - if your ISP doesn't have the bandwidth, find a better one that does.
Legislation will be counterproductive in the long run and will impair services like VoIP - and even basics like ensuring that DNS responses are fast. If legislation tries to include exceptions then they'll always be 5 years out of date and will be inconsistent around the world, so they won't really be much good.
Making it in the end users' best interests to flag their bulk traffic as such just seems to make so much more sense. That's the direction where Internet QoS is headed already.
The FreeS/WAN guys were working on transparent IPSec negotiation for just this reason. It prevents many types of traffic analysis, spoofing, packet injection, etc just as you want.
Such technology works even with encrypted BitTorrent. It doesn't need to know what's *in* the data streams, only that a given IP endpoint is communicating in patterns that match BitTorrent traffic. If such traffic is detected, spoofed RST packets can be sent to cause the host to treat the connection as half-open and respond with its own RST,ACK to close it completely.
Perhaps the particular implementation ComCast uses is easily tricked by encrypted payloads. Don't worry - even if that's so, it won't last.
Now, IP-level security like IPSec would do the trick, because you could identify fake RST packets by their lack of, or invalid, signatures. There is, however, no standard way to negotiate IPSec with a remote peer, despite the best efforts of the FreeS/WAN project.
Thus, in a world where the routers along the way are fundamentally trusted to do their job and route packets, you're not going to have much luck protecting yourself against this sort of attack by your provider.
I hadn't actually considered round robin DNS with very short TTLs, which is an... unusual... oversight.
Round robin DNS is also appealing in that it'd help hide the user visible aspects of the multiple links when using SSL/TLS services. As my network provides almost all services via SSL/TLS + client certificates that's appealing.
Presumably your DNS server should ideally be on a machine outside the links to be load balanced.
Having an NS record pointing at each link would work, but might result in annoying DNS timeouts and delays when one link is down. (This might be acceptable if the links are almost always up, though, as they are in my case).
The bigger issue is that it appears that ISPs often ignore very short TTLs, clipping everything to a minimum of (say) three hours. I've had issues with this before when making DNS changes where I've dramatically shortened the TTL three or four days before making the change, but found that users on some ISPs don't see the updated details for hours or days anyway.
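The effect of that clipping is easy to model. A toy Python sketch of a caching resolver with a TTL floor follows; the class, the hostname, and the addresses (from the documentation ranges) are all made up, and real resolvers are considerably more involved.

```python
class ClampingCache:
    """Caching resolver that refuses to honour TTLs below a floor."""

    def __init__(self, min_ttl):
        self.min_ttl = min_ttl
        self.entries = {}  # name -> (address, expiry time in seconds)

    def resolve(self, name, now, authoritative_lookup):
        cached = self.entries.get(name)
        if cached and now < cached[1]:
            return cached[0]  # still "fresh" by the cache's own rules
        address, ttl = authoritative_lookup(name)
        self.entries[name] = (address, now + max(ttl, self.min_ttl))
        return address


records = {"vpn.example.org": ("192.0.2.1", 60)}  # published with a 60 second TTL


def authoritative(name):
    return records[name]


honest = ClampingCache(min_ttl=0)           # honours the published TTL
clamping = ClampingCache(min_ttl=3 * 3600)  # clips TTLs to three hours
for cache in (honest, clamping):
    cache.resolve("vpn.example.org", 0, authoritative)

records["vpn.example.org"] = ("192.0.2.2", 60)  # the DNS change goes live

seen_honest = honest.resolve("vpn.example.org", 120, authoritative)
seen_clamped = clamping.resolve("vpn.example.org", 120, authoritative)
seen_later = clamping.resolve("vpn.example.org", 3 * 3600, authoritative)
print(seen_honest, seen_clamped, seen_later)
```

Two minutes after the change, users behind the honest cache already see the new address, while users behind the clamping cache keep getting the old one until the three hours are up.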
I guess you might say "too bad for them, they use a bad ISP" - but when they're your roaming users and you have to support them this doesn't go down well. I have enough trouble already with dodgy satellite ISPs that use symmetric NAT and aggressive port blocking.
As for outgoing: I already mentioned "sticky" connection-level load balancing. I'm already using the multiple-table approach shown in the LARTC to ensure that outgoing replies are routed correctly according to source IP. Adding multipath routing won't gain me much because of the traffic patterns on my site (because of the route cache it equates pretty neatly to sticky connection-level load balancing). This might change if the site's use of VoIP continues to increase, though.
Nonetheless, thanks for the suggestion. I think I'll have to do some testing with short-TTL round robin DNS to see if there are issues with any of the users' commonly used ISPs.
I have quite a bit of experience with this, as I use two consumer ADSL circuits to provide very reliable 'net services at my office.
To an extent you either get to use two different services (for reliability) or combine them into one service for improved performance. Not both.
If you're going for reliability, you'll be using two different providers. That eliminates the use of multilink PPPoE to bond the two services into a single logical service with a single public IP address. It also eliminates ATM channel bonding, which is the other way to achieve the same end. This isn't such a great loss as you might think since I've *NEVER* found a provider (at least here in Australia) that knows what either is, let alone supports even one of them.
So, you're stuck with two ADSL circuits, each with separate PPPoE connections (or direct IP over ATM links; either way) and separate public IP addresses.
This sucks. You can't even load balance across them properly without the cooperation of a router/proxy on the other side of your ADSL links.
Load balancing your transmissions on a per-packet basis is obviously hopeless because any sane ISP has egress filtering based on source IP address, and even if they don't you'll still get replies back on the official source IP (so you won't gain much). SNAT won't help because if you SNAT some packets in a connection the recipient will have no idea they're part of the same connection as the unmodified packets leaving on the other connection. The only way that packet-level load balancing across multiple links with different IPs will work is if you're only talking to an endpoint (probably a VPN termination point) that is aware that you're using multiple connections and can combine them. You can use tricks like multilinked PPTP for this, or iptables trickery on each end. In any case, you're going to need access to a server with enough bandwidth to service both connections that's willing to route traffic for you. You probably don't have this.
So, packet-level load balancing is out. What's left? Connection-level, and per-protocol.
Connection level load balancing works well for some services. Outgoing SMTP, for instance, is well suited to being randomly allocated between multiple ADSL links (if you're unfortunate enough to have users who think that 100MB attachments are a good idea). Unfortunately most home user services like HTTP web browsing are not. You'll find that websites like to store session data with your IP address, so if you do connection load balancing with HTTP you'll find that websites keep on forgetting your login. To work around this you need to use "sticky" load balancing that remembers which connection was used to talk to a given host - but that, of course, reduces the benefits of the load balancing.
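The "sticky" part amounts to a lookup table in front of whatever allocation policy you use. A minimal Python sketch, assuming round-robin allocation for previously unseen hosts (link names and addresses are placeholders, and a real implementation would also expire old entries and skip links that are down):

```python
import itertools


class StickyBalancer:
    """Spread new remote hosts across links, but keep each host on one link."""

    def __init__(self, links):
        self._next_link = itertools.cycle(links)  # round-robin for new hosts
        self._assigned = {}                       # remote host -> chosen link

    def link_for(self, remote_host):
        if remote_host not in self._assigned:
            self._assigned[remote_host] = next(self._next_link)
        return self._assigned[remote_host]


balancer = StickyBalancer(["adsl1", "adsl2"])
first = balancer.link_for("198.51.100.10")   # new host: gets adsl1
second = balancer.link_for("198.51.100.20")  # new host: gets adsl2
again = balancer.link_for("198.51.100.10")   # known host: stays on adsl1
print(first, second, again)
```

The website at 198.51.100.10 always sees the same source address, at the cost that one busy host can no longer be spread across both links.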
In the end, all you can really do is a bit of sticky connection-level load balancing when establishing new outgoing connections for some protocol types. If you want more than that, you need to do ugly things like say "all FTP connections go out ADSL1, and all SIP and other VoIP connections go out ADSL2" etc.
Personally, I don't bother even with that. I have both ADSL services listed as MXes for the company's DNS, so if one is down we still get mail. The A record points at a colocated server elsewhere on the Internet, so that's not a worry, but if it didn't I'd have to use some sort of ISP-level or colo load balancing to reroute traffic down whichever link was currently available.
Outgoing connections just all use the primary link when it's up, and fail back to the secondary link if/when the fast one is down. The secondary link is the primary MX, so when both links are up mail will tend to come in one link and everything else in the other.
If I wanted more than this, I'd probably have to route everything through another server colocated at an ISP or peering point. Unless I could get free traffic between it and both my ADSL circuits this would get expensive fast - and it'd also reduce the benefits of the redundant ADSL links.
The problem *I* have with it is that people usually offer a pittance. This is particularly frustrating when it's someone using your work commercially.
Offering someone $50 for a feature that'll take them weeks or months to implement isn't a bribe, it's an insult. It says "I think your time is worthless".
I'm sure people don't intend it that way, but it still exposes their attitudes and values.
I like to be charitable and assume that either (a) $50 is a lot of money to them or (b) they simply have no idea how hard what they're asking is and how long it'll take. I respect an offer of $50 from someone in Brazil rather more than $50 from someone in France, because from the person in Brazil it says "I'm offering what to me is a significant sum that, if it was me doing the work, I might accept as payment for the job". From the person in France it says "I consider days of your work worth no more than what I get paid in an hour or two." Similarly, I respect an offer of $50 from an individual almost infinitely more than $50 from a company using my work in a product.
It's not a matter of expecting people to somehow suffer or make a sacrifice; rather it's the expectation that they be willing to offer something that, if the positions were reversed, they'd be willing to accept. It's saying "I consider your time around about as valuable as my own".
The WITH clause was apparently introduced in SQL:1999. It's still not particularly widely supported, though.
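For anyone who hasn't met it, here is a small example of what a WITH clause (common table expression) looks like, run through Python's bundled sqlite3 module as a convenient stand-in. The `sales` table and its rows are invented, and this assumes an SQLite build recent enough to support WITH.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("north", 30), ("south", 5), ("south", 15), ("west", 100)],
)

# The WITH clause names an intermediate result so it can be referenced
# twice below without repeating the aggregation.
rows = conn.execute("""
    WITH regional AS (
        SELECT region, SUM(amount) AS total
        FROM sales
        GROUP BY region
    )
    SELECT region, total
    FROM regional
    WHERE total > (SELECT AVG(total) FROM regional)
    ORDER BY region
""").fetchall()
print(rows)  # [('west', 100)]
```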
Yes - exactly my point. Please read my post and its parent.
If it was, such a query analysis tool would be provably incapable of handling all queries because of the halting problem.
Thankfully most SQL dialects are limited to expressing queries that can be executed in finite time with a defined end point.
The whole project doesn't make much sense.
Interesting. I almost always find the opposite - disks that are practically dead (horrible scraping noises, continuously developing bad sectors, can't even read sector 0, etc) that the SMART health check reports as just fine.
SMART is wonderful for its self testing capabilities . It's also very handy for access to the raw vendor attributes like uncorrectable sector counts and reallocated sectors so you can make your own judgement about disk health. The simple health check seems to be a pointless waste of time, though.
Note also that some of those "false positives" will have actually been faulty. A disk may develop bad sectors that're "corrected" (reallocated) through repeated overwrite attempts. However, those disks usually just keep on developing more bad sectors, and while each one can be "fixed" until the disk runs out of reserve sectors to reallocate to, each bad sector still results in corrupt files and data loss. At any given time the disk might pass a self test, but it's nonetheless quite defective.
The SMART health check, which is all most BIOSes ever access, is almost completely useless anyway. I've seen it pass on disks so dead that they made horrible scraping noises on spin-up and couldn't even read sector zero.
SMART is wonderful for its self testing capabilities . It's also very handy for access to the raw vendor attributes like uncorrectable sector counts and reallocated sectors so you can make your own judgement about disk health.
The simple health check, though, is a pointless waste of time. One might be forgiven for suspecting that vendors set the thresholds very high to minimize RMA rates...
For a VPN to be useful you need somewhere to terminate it. Public free or subscription VPN termination endpoints don't seem to be too common (unsurprisingly, given the potential for abuse and relatively limited utility).
There are places that offer virtual servers or simple shell accounts where you can set up your own VPN termination, though. Even an SSH tunnel is quite sufficient, really.
However, you also need to be sure that your termination point doesn't have a deal with the Chinese gov't to permit monitoring of traffic of Chinese origin. I would want to be pretty careful about this myself after Yahoo, Google, etc's recent dealings. I wouldn't be too worried about virtual server providers or shell services, but a dedicated VPN termination service might need to be checked out carefully.
Finally, blocking and content filtering are only half the problem. It's still easy to tell that you are using a VPN. This might not be a very good thing, especially if you somehow otherwise draw official attention.
I know that pain well - I'm in Western Australia myself.
Three offers laptop-based HSDPA broadband for AU$15/month, which seems to be the cheapest 'net access around given that there's no need for a phone line, line rental, etc. However, if you want to do anything that needs real bandwidth (and download cap) ADSL still seems to be the only option.
At least with the unbundled local loop services Telstra can be forced to let their competitors have direct access to the subscriber's copper tail - hence all those "Naked" DSL services. The situation has improved a lot since telstra was forced to set sensible pricing for ULL by the ACCC.
Telstra, by the way, charges $10/Mb (as opposed to Three's $0.10/Mb) for mobile data accessed from a phone without a data plan configured. That's how they ship their phones by default - and at advertised HSDPA data rates, that works out to $32,000/hour. They ship phones with no data plan and data unbarred. Nice guys.
Speaking of Telstra, the whole FTTN thing looks like a gigantic fiasco in the making - even now. Telstra's amazing demands that they be able to use the new - largely taxpayer funded - network in an anticompetitive and exclusive manner is just incredible. At least so far their arguments aren't being taken too seriously.
Well, that's covering the salaries of the team of people who'll be assigned to monitor and hand-filter the connection, including your email, web browsing, and IP phone calls ;-)
More likely it's an attempt to extract money from rich media companies - who'll just knock it off their taxable income anyway - but the censor army isn't as far fetched as I'd like to think.
It's a little scary that satellite or UMTS/HSDPA 'net access might actually be cheaper than local ADSL circuits, though.
Yep. I have no idea why they're being permitted to get away with it. It's *way* more than "reasonable network management". Reasonable network management would include dropping packets, using ICMP destination host/port unreachable messages to ask the remote peer to terminate the connection, and many other things that are not forged RST packets.
I don't even understand why they chose this method. ICMP destination-port-unreachable would do the job just as well, and with way less legal ambiguity.
I agree that what Comcast is doing is unacceptable. It's deeply dodgy to go spoofing packets so traffic appears to come from someone else. It violates the trust the user has in the routers between them and the endpoint - the assumption that they'll carry the data and not mess with it.
As for user-flagged low priority traffic: Implementation is trivial. IP headers already contain ToS flags that're intended for exactly this sort of routing priority control. The "throughput" flag is ideally suited to bulk, lower priority traffic that's not sensitive to latency or occasional packet loss.
Routers can already prioritize based on ToS flags and intelligently queue traffic. The Linux router that runs my multihomed ADSL at work does it - it's really not difficult.
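On the sending side, flagging a connection as bulk traffic can be as small as one socket option. The following is a minimal sketch, assuming a platform that exposes `IP_TOS` (Linux does); the helper name `open_bulk_socket` is made up for illustration.

```python
import socket

# RFC 1349 "maximize throughput" ToS value (IPTOS_THROUGHPUT in <netinet/ip.h>),
# defined explicitly here.
IPTOS_THROUGHPUT = 0x08


def open_bulk_socket() -> socket.socket:
    """Create a TCP socket whose outgoing packets carry the throughput ToS flag.

    Hypothetical helper: a bulk-transfer tool would call this instead of
    creating a plain socket, then connect() as usual.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, IPTOS_THROUGHPUT)
    return sock
```

Whether routers along the path honour the flag is a separate matter; this only marks the packets.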
Incentives: If you live somewhere where your use of your connection is limited already and you have traffic allowances, then incentives aren't hard to come by. Australia, say. Or most of the rest of the world. An increased allowance for some classes of data in exchange for identifying that data as low priority (and letting the ISP drop it to preserve services for other traffic when load spikes) is an obvious first step.
I'd agree that your expectations are fair, in that the way US ISPs do their traffic management is deeply flawed. They sell you an "unlimited" service with small print that lets them disconnect heavy users, throttle traffic, etc. At least here (Australia) they're honest and specifically identify the limits on your service - and let you pay (lots) to increase them.
As ISPs here already meter traffic, enforcement is trivial. They already often have two meters per user - one "on peak" and one "off peak" - so all they need to do is class ToS=throughput traffic into the off peak class no matter when it occurs. If the user goes over their limit the usual action is to throttle all their traffic down to something miserly like 128kbps or 64kbps.
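The two-meter accounting described above could look roughly like this. It is a sketch only: the off-peak window, the `Meters` class and the `account` function are hypothetical, not any ISP's actual billing logic.

```python
from dataclasses import dataclass
from datetime import time

IPTOS_THROUGHPUT = 0x08  # RFC 1349 "maximize throughput" bit

# Hypothetical off-peak window; real ISPs pick their own hours.
OFF_PEAK_START = time(1, 0)
OFF_PEAK_END = time(7, 0)


@dataclass
class Meters:
    """Per-user byte counters, one for each allowance."""
    on_peak_bytes: int = 0
    off_peak_bytes: int = 0


def account(meters: Meters, tos: int, size: int, at: time) -> None:
    """Charge one packet to the right meter.

    Packets flagged ToS=throughput always go to the off-peak meter,
    whatever the time of day; everything else is charged by the clock.
    """
    flagged_bulk = bool(tos & IPTOS_THROUGHPUT)
    in_off_peak_window = OFF_PEAK_START <= at < OFF_PEAK_END
    if flagged_bulk or in_off_peak_window:
        meters.off_peak_bytes += size
    else:
        meters.on_peak_bytes += size
```

For example, a flagged 1000-byte packet sent at midday is charged to `off_peak_bytes`, while an unflagged packet at the same time goes to `on_peak_bytes`.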
It's not my idea by the way. It's been around for quite some time, and looks like the way that Internet-wide QoS will probably land up working in the long run.
You identify what's important. You do so not by identifying what is particularly high priority, but by identifying what is *lower* priority than anything else. You get incentives, such as lower traffic costs, for doing so.
If you live somewhere where ISPs still don't limit your traffic then this won't mean much to you. I know they're starting to in the US though, so you'll probably soon be seeing monthly or daily traffic caps if you're not already.
If setting your bulk traffic to low priority means that you can use other services more, and can run your bulk downloading tools for longer (for more overall data transferred even if at a lower rate) then that's got to be a win, right?
Many bulk transfer apps already set TCP/IP ToS flags or let you configure them. I'm actually surprised that ISPs aren't already offering to let users count traffic with the throughput ToS flag set against their off-peak traffic allowances.
Certainly what Comcast is doing isn't QoS. It's network management, sure, but very much sledgehammer style. It's extremely dodgy.
QoS, however, does *not* imply that all traffic gets delivered eventually. In fact, the primary mechanism for QoS with TCP/IP is to drop packets when traffic flows are too fast. This causes the sender's TCP stack to throttle back the send rate when it notices that ACKs aren't coming back for those packets from the other end, so they're presumably getting lost along the way.
My QoS configuration on my work's network uses exactly this mechanism. It also drops packets to ensure that our ADSL links are never more than 90% utilized, which makes sure that I can keep the (deep) packet queue at my ISPs' routers empty and handle packet queuing with appropriate scheduling and priority on my router.
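The drop-to-throttle idea can be illustrated with a token-bucket policer capped at 90% of the link rate. This is a minimal sketch, not the router configuration described above; the class name, burst size and link speed are assumptions chosen for illustration.

```python
class Policer:
    """Token-bucket policer: packets exceeding the allowed rate are dropped."""

    def __init__(self, link_bps: float, utilisation: float = 0.9,
                 burst_bytes: int = 15000):
        self.rate = link_bps * utilisation / 8   # allowed bytes per second
        self.burst = burst_bytes                 # maximum stored credit
        self.tokens = float(burst_bytes)
        self.last = 0.0                          # time of the previous packet

    def allow(self, size: int, now: float) -> bool:
        """Return True to forward the packet, False to drop it."""
        # Earn credit for the time elapsed, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        # Dropped: the sender sees missing ACKs and slows down.
        return False
```

Because the policer never forwards more than 90% of the link rate on average, the queue at the far end of the link has little chance to build up, which is the point of the 90% cap mentioned above.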
Er, that blocks *all* TCP packets with the RST flag set that're destined to your BitTorrent port. That'll cause some interesting problems, though probably nothing worse than your treacherous ISP is doing to you already.
In particular, you might have to increase kernel limits on open TCP/IP connections, decrease connection timeouts, etc.
Blocking RST packets with iptables is trivial, but an ugly hack at best. It also won't stop more thorough blocking methods like corrupting BT traffic (so your machine eventually blacklists the sender), injecting fake data packets, or simply dropping traffic.
That sounds nice, but it relies on ISPs not overselling capacity.
You can get service with ISPs that don't oversell, and actually have enough upstream bandwidth to service all their customers downloading and uploading at max speed all the time. It costs 20-30 times as much, but it's available. After all, most ISPs operate at a contention ratio of between 10:1 and 30:1, where they have enough bandwidth for 1 fully utilized connection for every 10-30 signed customers.
What might be a more reasonable compromise is for ISPs to reserve a fixed 64kbps or so per user. Even that, though, will quickly get expensive. They really need to be allowed to use QoS to provide acceptable performance for latency-sensitive applications while continuing to service bulk traffic - and doing it all cheaply.
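To put rough numbers on overselling versus a reserved floor, here is a small calculation. The customer count and line speed in the usage note are hypothetical; only the contention ratios and the 64kbps figure come from the text above.

```python
def upstream_needed_mbps(customers: int, line_mbps: float,
                         contention_ratio: float) -> float:
    """Upstream capacity an ISP provisions at a given contention ratio."""
    return customers * line_mbps / contention_ratio


def reserved_floor_mbps(customers: int, reserved_kbps: float = 64) -> float:
    """Capacity needed just to guarantee every user a fixed minimum rate."""
    return customers * reserved_kbps / 1000
```

For a hypothetical ISP with 10,000 customers on 8 Mbit/s lines, `upstream_needed_mbps(10000, 8, 1)` gives 80,000 Mbit/s with no overselling, `upstream_needed_mbps(10000, 8, 20)` gives 4,000 Mbit/s at 20:1, and `reserved_floor_mbps(10000)` gives 640 Mbit/s tied up by the 64kbps guarantee alone.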
Hmm. I'm not convinced. What about VoIP? I *like* my low-latency reliable VoIP, and I like the fact that my ISP is able to prioritize it over bulk traffic like BT. Ditto small HTTP traffic bursts, DNS requests, etc.
Rather than force all traffic to be treated equally, the more sensible approach would seem to be to provide incentives to flag bulk traffic as such.
Here in Australia, for example, we have small download quotas - often 5GB or less, but up to 40GB or so for "premium" connections. ISPs also generally offer extra download allowances during off-peak times to encourage file-sharers etc to mostly hammer the network when nobody else cares. Why not treat all IP traffic with the IP TOS throughput flag set as low-priority traffic to be sent only if nothing else of a higher priority is waiting, and charge it to the off-peak allowance at all times?
The only issue I really see with that is that ISPs might not feel the need to expand capacity when they're "only" dropping low priority traffic. However, that's when commercial incentives come into play - if your ISP doesn't have the bandwidth, find a better one that does.
Legislation will be counterproductive in the long run and will impair services like VoIP - and even basics like ensuring that DNS responses are fast. If legislation tries to include exceptions then they'll always be 5 years out of date and will be inconsistent around the world, so they won't really be much good.
Making it in the end users' best interests to flag their bulk traffic as such just seems to make so much more sense. That's the direction where Internet QoS is headed already.
The FreeS/WAN guys were working on transparent IPSec negotiation for just this reason. It prevents many types of traffic analysis, spoofing, packet injection, etc just as you want.
They've given up because nobody cared :S
Such technology works even with encrypted BitTorrent. It doesn't need to know what's *in* the data streams, only that a given IP endpoint is communicating in patterns that match BitTorrent traffic. If such traffic is detected, spoofed RST packets can be sent to cause the host to treat the connection as half-open and respond with its own RST,ACK to close it completely.
Perhaps the particular implementation Comcast uses is easily tricked by encrypted payloads. Don't worry - even if that's so, it won't last.
Now, IP-level security like IPSec would do the trick, because you could identify fake RST packets by their lack of, or invalid, signatures. There is, however, no standard way to negotiate IPSec with a remote peer, despite the best efforts of the FreeS/WAN project.
Thus, in a world where the routers along the way are fundamentally trusted to do their job and route packets, you're not going to have much luck protecting yourself against this sort of attack by your provider.
I hadn't actually considered round robin DNS with very short TTLs, which is an ... unusual ... oversight.
Round robin DNS is also appealing in that it'd help hide the user visible aspects of the multiple links when using SSL/TLS services. As my network provides almost all services via SSL/TLS + client certificates that's appealing.
Presumably your DNS server should ideally be on a machine outside the links to be load balanced.
Having an NS record pointing at each link would work, but might result in annoying DNS timeouts and delays when one link is down. (This might be acceptable if the links are almost always up, though, as they are in my case).
The bigger issue is that it appears that ISPs often ignore very short TTLs, clipping everything to a minimum of (say) three hours. I've had issues with this before when making DNS changes where I've dramatically shortened the TTL three or four days before making the change, but found that users on some ISPs don't see the updated details for hours or days anyway.
I guess you might say "too bad for them, they use a bad ISP" - but when they're your roaming users and you have to support them this doesn't go down well. I have enough trouble already with dodgy satellite ISPs that use symmetric NAT and aggressive port blocking.
As for outgoing: I already mentioned "sticky" connection-level load balancing. I'm already using the multiple-table approach shown in the LARTC to ensure that outgoing replies are routed correctly according to source IP. Adding multipath routing won't gain me much because of the traffic patterns on my site (because of the route cache it equates pretty neatly to sticky connection-level load balancing). This might change if the site's use of VoIP continues to increase, though.
Nonetheless, thanks for the suggestion. I think I'll have to do some testing with short-TTL round robin DNS to see if there are issues with any of the users' commonly used ISPs.
I have quite a bit of experience with this, as I use two consumer ADSL circuits to provide very reliable 'net services at my office.
To an extent you either get to use two different services (for reliability) or combine them into one service for improved performance. Not both.
If you're going for reliability, you'll be using two different providers. That eliminates the use of multilink PPPoE to bond the two services into a single logical service with a single public IP address. It also eliminates ATM channel bonding, which is the other way to achieve the same end. This isn't such a great loss as you might think since I've *NEVER* found a provider (at least here in Australia) that knows what either is, let alone supports even one of them.
So, you're stuck with two ADSL circuits, each with separate PPPoE connections (or direct IP over ATM links; either way) and separate public IP addresses.
This sucks. You can't even load balance across them properly without the cooperation of a router/proxy on the other side of your ADSL links.
Load balancing your transmissions on a per-packet basis is obviously hopeless because any sane ISP has egress filtering based on source IP address, and even if they don't you'll still get replies back on the official source IP (so you won't gain much). SNAT won't help because if you SNAT some packets in a connection the recipient will have no idea they're part of the same connection as the unmodified packets leaving on the other connection. The only way that packet-level load balancing across multiple links with different IPs will work is if you're only talking to an endpoint (probably a VPN termination point) that is aware that you're using multiple connections and can combine them. You can use tricks like multilinked PPTP for this, or iptables trickery on each end. In any case, you're going to need access to a server with enough bandwidth to service both connections that's willing to route traffic for you. You probably don't have this.
So, packet-level load balancing is out. What's left? Connection-level, and per-protocol.
Connection level load balancing works well for some services. Outgoing SMTP, for instance, is well suited to being randomly allocated between multiple ADSL links (if you're unfortunate enough to have users who think that 100MB attachments are a good idea). Unfortunately most home user services like HTTP web browsing are not. You'll find that websites like to store session data with your IP address, so if you do connection load balancing with HTTP you'll find that websites keep on forgetting your login. To work around this you need to use "sticky" load balancing that remembers which connection was used to talk to a given host - but that, of course, reduces the benefits of the load balancing.
In the end, all you can really do is a bit of sticky connection-level load balancing when establishing new outgoing connections for some protocol types. If you want more than that, you need to do ugly things like say "all FTP connections go out ADSL1, and all SIP and other VoIP connections go out ADSL2" etc.
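"Sticky" connection-level balancing can be sketched as follows. This is an illustration of the idea only, not what a Linux router does internally; the class and the link names `adsl1`/`adsl2` are hypothetical.

```python
import itertools


class StickyBalancer:
    """Pick an uplink for each new outgoing connection, remembering the
    choice per destination host so a site keeps seeing one source IP."""

    def __init__(self, links):
        self.links = list(links)
        self.up = set(self.links)              # links currently usable
        self._rr = itertools.cycle(self.links)
        self._sticky = {}                      # destination host -> link

    def set_link_state(self, link, is_up):
        if link not in self.links:
            raise ValueError(f"unknown link: {link}")
        if is_up:
            self.up.add(link)
        else:
            self.up.discard(link)

    def choose(self, dest_host):
        # Reuse the remembered link while it is still up.
        link = self._sticky.get(dest_host)
        if link in self.up:
            return link
        if not self.up:
            raise RuntimeError("no uplink available")
        # Otherwise round-robin over the links, skipping any that are down.
        for link in self._rr:
            if link in self.up:
                self._sticky[dest_host] = link
                return link
```

The stickiness is also what limits the gain: a user who spends the whole day on one website never benefits from the second link.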
Personally, I don't bother even with that. I have both ADSL services listed as MXes for the company's DNS, so if one is down we still get mail. The A record points at a colocated server elsewhere on the Internet, so that's not a worry, but if it didn't I'd have to use some sort of ISP-level or colo load balancing to reroute traffic down whichever link was currently available.
Outgoing connections just all use the primary link when it's up, and fall back to the secondary link if/when the fast one is down. The secondary link is the primary MX, so when both links are up mail will tend to come in one link and everything else in the other.
If I wanted more than this, I'd probably have to route everything through another server colocated at an ISP or peering point. Unless I could get free traffic between it and both my ADSL circuits this would get expensive fast - and it'd also reduce the benefits of the redundant ADSL links.
The problem *I* have with it is that people usually offer a pittance. This is particularly frustrating when it's someone using your work commercially.
Offering someone $50 for a feature that'll take them weeks or months to implement isn't a bribe, it's an insult. It says "I think your time is worthless".
I'm sure people don't intend it that way, but it still exposes their attitudes and values.
I like to be charitable and assume that either (a) $50 is a lot of money to them or (b) they simply have no idea how hard what they're asking is and how long it'll take. I respect an offer of $50 from someone in Brazil rather more than $50 from someone in France, because from the person in Brazil it says "I'm offering what to me is a significant sum that, if it was me doing the work, I might accept as payment for the job". From the person in France it says "I consider days of your work worth no more than what I get paid in an hour or two." Similarly, I respect an offer of $50 from an individual almost infinitely more than $50 from a company using my work in a product.
It's not a matter of expecting people to somehow suffer or make a sacrifice; rather it's the expectation that they be willing to offer something that, if the positions were reversed, they'd be willing to accept. It's saying "I consider your time around about as valuable as my own".