CloudFlare Wants Tor To Change Or Risk CAPTCHA Blockades (thestack.com)
An anonymous reader writes: CloudFlare's co-founder Matthew Prince has publicly appealed to work with the Tor Project on implementing a solution that will stop the high incidence of Tor users being challenged by CAPTCHAs whilst browsing. Prince proposes the implementation of a Tor plugin that would communicate with CloudFlare servers to provide temporary, anonymous identification to bypass the CAPTCHAs, and has presented the code on GitHub. Other possibilities mooted include the adoption of higher-level encryption, which would be likely to adversely influence a network which already has native (and inevitable) latency issues. CloudFlare's public post on the matter comes after five turbulent weeks of comments-section debate between CloudFlare and Tor, and seems to be an appeal for public arbitration on the matter. Prince further noted that 94% of the requests CloudFlare sees across the Tor network are "per se malicious." From his blog post: That doesn't mean they are visiting controversial content, but instead that they are automated requests designed to harm our customers. A large percentage of the comment spam, vulnerability scanning, ad click fraud, content scraping, and login scanning comes via the Tor network. To give you some sense, based on data from Project Honey Pot, 18% of global email spam, or approximately 6.5 trillion unwanted messages per year, begin with an automated bot harvesting email addresses via the Tor network.
CloudFlare's captcha thingy is ostensibly in aid of DDoS protection. Tor can't muster anything like the bandwidth needed for a DoS attack in one place at one time, therefore CloudFlare should just white-list suspected exit nodes.
No new code (on Tor's part anyway), no dodgy pseudo-anonymous IDs to be exploited, everything works transparently, and if they hadn't told anybody they'd done it, in all likelihood nobody would have ever noticed.
Wonderful 'can do' attitude except that even without tor, their 'solutions' are offensively dysfunctional and their feedback is at least as bad. How about not requiring javascript just to view a website, eh? And obviously, sod off with your plugins. This is just another poetteringesque asspull: You broke it, someone else gets to fix it.
I DO NOT AGREE.
Really? People still use tor? WHY?!?
There's a reason Tor exit nodes are blacklisted from just about any service.
Cloudflare hides piracy sites and child porn sites. Tor users are the ones who visit those sites. Put two and two together. Likewise Tor users visit the anonymous image boards (e.g. 4chan, 8chan) and piracy sites that are all protected by Cloudflare.
Cloudflare is in the wrong to begin with and wouldn't have the problem if they would just pull DNS records down for the illegal sites in the first place.
Prince proposes the implementation of a Tor plugin that would communicate with CloudFlare servers to provide temporary, anonymous identification to bypass the CAPTCHAs, and has presented the code on GitHub.
Brilliant!!!!!
A Tor user is clearly hiding something illegal. Don't pander to their deviant ways.
If you want to do something beneficial to society, just forward all their traffic to the FBI for analysis.
There are two simple technical solutions:
The motivation for choosing between these solutions is based on whether Tor users, who use server resources, are returning value (product sales, other calls to action) to the people that provide those resources.
Therefore the solution is simply to inform each Cloudflare client and let them individually decide the correct course.
-- I was raised on the command line, bitch
Hey web master, there's a man in the middle preventing me from seeing your site. Thought you might want to know.
If the alternative is some temporary identity token which might be abused by 'bots, I'm OK with CAPTCHAs.
Have gnu, will travel.
Only a copyright owner can lawfully order DNS records to be pulled down because only a copyright owner knows whether a particular use is licensed. Have you tried reporting the results of your investigation of piracy sites to the legitimate copyright owners of the affected works so that they can act?
From the proposal:
As somebody who has to deal with these regularly, this is a strange comment.
* Until about a year ago, non-Javascript users would see the well-known, miserable two-word OCR puzzles, which were difficult but by no means impossible to solve. I don't remember what Javascript users were seeing at this point.
* From about a year ago to about 2 months ago, non-Javascript users would see a puzzle that was literally impossible to solve, i.e., the server-side code was broken and rejected you regardless of what you typed. It wasn't a matter of "humans frequently cannot solve them", it was "you cannot pass". Javascript users during this period would see a variety of puzzles, the most common of which was to find house numbers in blurry photos.
* For the last 2 months, non-Javascript users have been seeing puzzles consisting of 9 images and are asked to select the ones containing X (street signs, flowers, grass, etc.) This is certainly annoying but much less difficult than any of the puzzles that were used in the past, including the previous generation of Javascript-based puzzles.
So honestly it seems really strange to me that people are now complaining so vociferously about Javascript - it hasn't "degraded", it's gotten better recently. And while the entire concept of requiring a CAPTCHA in order to view a site is an abomination, the current generation of puzzles is probably the easiest set of non-Javascript puzzles I've ever seen.
on the whole of Internet, and was created on that premise by recommendations from the DoJ. Don't for a second think they are being entirely sincere on this matter.
I've noticed this trend as well and I am in 100% agreement with you. The new pick-the-pics are much easier to solve.
This doesn't make sense. Sure, it's a great idea, but it entirely breaks the purpose of CAPTCHAs in the first place. Why? This is why:
Token stockpiling
An attacker who wishes to bypass many CAPTCHAs in the future could
intentionally trigger CAPTCHAs (e.g. by first running attacks through a
particular IP) and save the resulting tokens for later.
[TODO: We don't have a great answer for this. Halp!]
From the GitHub readme
In other words, sure this is a gaping nullification of any security pretense, opening the system to anyone who wants to add a few extra lines to their tor robots.
Then again, I'm not sure it is THAT much of a problem. Existing tor-bots can simply funnel the captchas to the user anyway with enough ingenuity, just not ahead of time as this new technique would allow.
No code, just another brainstorming "project", yay!
This just in: Buggy Whip Maker pleads with Automobile Industry to expose throttle plates on the hood so their whips can still accelerate the carriage.
Just Die Cloudflare. Your days have been numbered since we realized that decentralization is the future. HTTP over bittorrent (or other protocol like Disruption Tolerant Networking or Named Data Networking where we add caches to all the nodes) is being worked on right now by everyone from NASA to Google. Go search those terms up and see for yourself. Basically: Nearly all the routers should also have caches, then you refer to data by hash rather than by file name or URL (instant deduplication, even if you "rename" it), and everyone gets free "colocation" because my request might be served by my neighbor's browser cache. The hash ensures the data's not tampered, and means that an encrypted page can actually play nice with caches since it can pull in the resource with a hash.
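To make the "refer to data by hash" idea concrete, here is a minimal sketch in Python. It is not how DTN or NDN actually work on the wire. `ContentStore` and `fetch` are hypothetical names made up for illustration, and SHA-256 is an assumed choice of hash. It only shows the two properties claimed above: identical bytes deduplicate to one entry no matter what they are called, and any cache (even an untrusted neighbor's) can serve the data because the requester re-checks the hash.

```python
import hashlib


class ContentStore:
    """Toy content-addressed cache: blobs are keyed by their SHA-256 digest."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the bytes themselves, so storing the
        # same content twice (under any "name") yields a single entry.
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        # Raises KeyError if this cache does not hold the blob.
        data = self._blobs[digest]
        # Re-hash on the way out: a tampered blob no longer matches its key.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("cached blob does not match its hash")
        return data

    def __len__(self) -> int:
        return len(self._blobs)


def fetch(digest: str, caches) -> bytes:
    """Try each cache in order (nearest first) and return the first valid copy."""
    for cache in caches:
        try:
            return cache.get(digest)
        except (KeyError, ValueError):
            continue  # missing or corrupted here, try the next cache
    raise KeyError(digest)
```

A request would then look like `fetch(digest, [neighbor_cache, router_cache, origin])`, with the caller never needing to trust any single cache in the list.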
Some people are highly sceptical, but they shouldn't be because you can already see the pieces coming together bit by bit. Resource URIs can already have hashes in HTML5 to authenticate linked data. The existence of Cloudflare and other colocation providers is another example of the shift in the ultimate direction of reengineering. They remind me of BBS operators who provided Internet gateways before we had our own Internet connections in our homes... BBSs are dead. Such "interim services" make money only while our network capabilities lag behind the next upgrade.
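The hashes on resource URIs mentioned here are presumably Subresource Integrity, where a script or link tag carries an `integrity` attribute of the form `<algorithm>-<base64 digest>`. A rough sketch of computing and checking such a value in Python follows. The function names `sri_value` and `matches` are made up for this example, and it handles only a single hash per attribute, which is a simplification.

```python
import base64
import hashlib


def sri_value(content: bytes, algorithm: str = "sha384") -> str:
    """Build an integrity string such as 'sha384-<base64 of the digest>'."""
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches(content: bytes, integrity: str) -> bool:
    """Check fetched bytes against a single-hash integrity string."""
    algorithm, _, _ = integrity.partition("-")
    return sri_value(content, algorithm) == integrity
```

Since the check depends only on the bytes, it does not matter which server or cache delivered them, which is the point being made about caches above.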
Current decentralized solutions suck because we only have our caches at the endpoints, not in the interim. So the packets must hop across so many intermediary switches and routers. If the generic caches were distributed across the middle of the network, then you solve the problem. Storage is cheap. New improved network architecture will have switches with caches built in, because that's just faster. We'll have to do it eventually because of the lightspeed barrier, so might as well get out ahead of the issue now.
Once we have DTN and/or NDN then Tor and Cloudflare will be irrelevant. Your request won't have to funnel into a pinch point "server" (or name server) to return the data. This move is inevitable because our bandwidth is limited by the speed of light. Eventually we'll have to move all the data closer to the endpoints, and DTN / NDN are attempts to solve this issue. DTN is used for NASA's space Internet already, because light speed is already an issue for them. The content producers don't need to know what content you requested from the (mesh) network. Everyone is getting hooked on tracking users, but they forget how radio, TV, and newspapers got by without tracking end users for centuries.
The "automated requests designed to harm our customers" are just crawlers, building indexes, searching for vulnerabilities, or feeding SEO content farms. The customers don't care about vulnerability databases, non-monopoly search engines, or scammy SEO. Their cares are defined entirely by their role within the ads ecosystem because that's what signs their checks and permits their existence. They are publishers, and they want to promise advertisers that a real human saw the ad.
The solution is to let cloudflare run the ad ecosystem, or send a signal in to it. When Google shows ads, there is a middle category of maybe-scammy stuff that gets to see the web page, and in retrospect they decide it probably wasn't a human and refund the advertiser days or months later. You can see it on your statement. Cloudflare knows what traffic falls in this grey area. They could refund the advertiser if they ran the network themselves like Google, or send a signal to the publisher.
For the subset of the publisher's ads that go through Google, the problem is already solved, but many publishers have armies of salespeople with nothing to do but make silly deals with one another, so they banter and bicker and get each other drunk trying to beat the algorithmic auction with crudely targeted contracts. In this "legacy" ad system, that they position as "premium" because it's so high touch and complimentary to them, they show all the ads themselves and don't have access to Google's despamming, so they're trying to get it by proxy from Cloudflare even though they're a ddos company not an adspam company.
The problem is most people aren't as big-picture as Google, and Google tricks others into being big-picture by applying cognitive load on the ecosystem cleverly so that things are basically working, but if you start asking questions nobody is certain of the answers. Cloudflare sounds like they're either more honest or less clever about what they make complicated, so these greedy small-businessmen types are like, "whauy am I payin' for them rack servers just to feed content farms. ain't you s'psed to block that thar shit?"
phew.
HTH.
Fuck yeah! It beats the Google distorted words captcha tests. I breeze through the street sign tests.
And then they can run HTTP trackers over BitTorrent, oh wait!
You do realize that Cloudflare does a lot of this, right?
I suspect that Cloudflare will be one of the leaders in such re-engineering, actually.
You've not seen a modern BGP setup of an ISP, have you?
Try finding decent VPS providers with large amounts of storage for cheap. Hell, try finding online storage solutions that go into petabytes that are cheap... Because that's what the demand is.
You're assuming that Cloudflare will look the same as it does today when that happens.
The problem here is that this sort of thing breaks down where end-to-end encryption is involved, and we are seeing a massive shift towards that with HTTPS becoming a lot more prominent. The days when providers were happy to leave even small things unencrypted are over.
While it's entirely possible static content may be requested from cached resources, there is no reason why dynamic requests won't go through first before requesting static content. You're not really thinking any of this through, are you?
Change is certain; progress is not obligatory.
A GET with a SQL injection attack perhaps?
If there are more attacks launched via Tor than there is legitimate traffic, then perhaps we need more people to use Tor.