81% of Tor Users Can Be De-anonymized By Analysing Router Information
An anonymous reader writes A former researcher at Columbia University's Network Security Lab has conducted research since 2008 indicating that traffic flow software included in network routers, notably Cisco's 'Netflow' package, can be exploited to deanonymize 81.4% of Tor clients. Professor Sambuddho Chakravarty, currently researching Network Anonymity and Privacy at the Indraprastha Institute of Information Technology, uses a technique which injects a repeating traffic pattern into the TCP connection associated with an exit node, and then compares subsequent aberrations in network timing with the traffic flow records generated by Netflow (or equivalent packages from other router manufacturers) to individuate the 'victim' client. In laboratory conditions the success rate of this traffic analysis attack is 100%, with network noise and variations reducing efficiency to 81% in a live Tor environment. Chakravarty says: 'it is not even essential to be a global adversary to launch such traffic analysis attacks. A powerful, yet non-global adversary could use traffic analysis methods [] to determine the various relays participating in a Tor circuit and directly monitor the traffic entering the entry node of the victim connection.'
is to maximize bandwidth utilization with junk traffic between all connected nodes, substituting junk data for legitimate data as needed.
Viable Slashdot alternatives: https://pipedot.org/ and http://soylentnews.org/
By "can be" De-anonymized, we mean "have been".
Sincerely,
The NSA
I've been repeatedly told I was paranoid regarding Tor traffic analysis by the /. hive mind. So this can't be true.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
While I haven't read the paper, the article seems to have a reasonably big "correlation for non-victim" bar. If this means false positives, it makes this technique at least a lot less useful than the "81%" deanonymization rate that they claim. It might make it useless for anything really.
Honestly, this all seems like more headline and less news. But I do still have to read the paper.
The whole point of Tor, for those who are morally and ethically sane, is that it makes monitoring the populace orders of magnitude more expensive!
Forcing the NSA and their ilk to actually target people individually, instead of just passively collecting plain text data on everyone, is exactly what needs to happen!
Use Tor as much as possible, it is the only thing stopping complete internet surveillance.
"Ideation" is my own pet "are you serious?" word.
Basically what they are saying is that you should not use Tor at home or at work, but in other places, where you don't do your normal browsing. Keep normal and Tor browsing on mutually exclusive networks!
You can add a fingerprint without changing the data. One way is by timing. A 10 Mbps cable modem, for example, can send at maybe 50 Mbps for 100 ms, then stop for 400 ms to average 10 Mbps, the speed you paid for. If I want to mark a traffic flow I'm relaying, I can send the packets out in bursts of 120KB, 60KB, 120KB, 60KB. Assuming a sufficiently uncongested network, that pattern will be visible several routers further down the line.
I've relayed precisely the data I was sent, I just modulated the rate at which I sent it.
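To make the idea concrete, here is a rough sketch of that kind of burst-size marking. Everything named here (`average_rate_mbps`, `watermark_schedule`, treating 120KB as 120,000 bytes) is made up for illustration, and it only splits a buffer into bursts; a real relay would also have to pace the bursts in time.

```python
from typing import Iterator, List


def average_rate_mbps(burst_mbps: float, on_ms: float, off_ms: float) -> float:
    """Mean rate of a sender that bursts for on_ms, then idles for off_ms."""
    return burst_mbps * on_ms / (on_ms + off_ms)


def watermark_schedule(payload: bytes, burst_sizes: List[int]) -> Iterator[bytes]:
    """Split payload into bursts whose sizes cycle through burst_sizes.

    The bytes themselves are untouched; only the chunking (and, in a real
    sender, the timing between chunks) carries the mark.
    """
    offset = 0
    i = 0
    while offset < len(payload):
        size = burst_sizes[i % len(burst_sizes)]
        yield payload[offset:offset + size]
        offset += size
        i += 1


# Hypothetical 360,000-byte transfer, marked with the 120KB/60KB pattern.
payload = bytes(360_000)
bursts = list(watermark_schedule(payload, [120_000, 60_000]))
sizes = [len(b) for b in bursts]

# Reassembling the bursts gives back exactly what was sent in.
assert b"".join(bursts) == payload
```

With the numbers from the cable modem example, `average_rate_mbps(50, 100, 400)` works out to 10 Mbps, and `sizes` alternates between 120,000 and 60,000 bytes.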
It's clear that there are significant limitations to the tested identification methods. Firstly, it requires that the server endpoint be under the control of the entity attempting identification. Secondly, the TOR *entry* node being used must be identified (if you have the resources, I guess you could monitor traffic flows from *all* entry nodes) in order for the Netflow data to be compared between the Server-->Exit Node and the Entry Node-->potential target client. Thirdly, in order to generate enough traffic to have enough collected data for correlation, large (the authors' term, they do not identify the size of the file/data required, only that downloads must last ~seven minutes to collect enough data) amounts of data must be downloaded from the server.
It's an interesting piece of work, but pulling off an identification like this requires the anonymized client to both connect to a server specifically configured to generate traffic flows that can be identified, and once connected, the client must be induced to download a "large" file/dataset. What is more, those attempting the identification must also be able to gather Netflow records from the interface(s) associated with the specific (and likely unknown) TOR entry node as well, or monitor flows from *all* TOR entry nodes.
It seems to me, that while the above scenario is certainly feasible, if you can get a potential target to visit a server that's under your control and download a large file, you can probably infect the client with malware from that server, and have said malware phone home without TOR, producing a specific identification without false positives or negatives. Which would be much less resource intensive and more useful, IMHO.
No, no, you're not thinking; you're just being logical. --Niels Bohr
We do that for years with just a requirement for all ISPs to keep netflow data for 3 years.
Best regards,
The FSB.
In other words, you're only "anonymous" if you don't matter.
I do not fail; I succeed at finding out what does not work.
Security researcher proves that knowing your plaintext password greatly increases the speed of cracking its hashed value.
So if you can spy on the traffic from the user to the tor entry node, and can spy on the traffic leaving the tor exit node at the same time... then you can tell that the traffic you saw going to the entry node is linked to the traffic leaving the exit node?
NO FREAKING DUH!?
Good luck being able to sniff traffic on *both* ends.
You're misunderstanding the methodology. The trick isn't to sniff the actual data being transferred, which is why it can be used even with encrypted traffic.
The identification uses traffic analysis (using data generated from Netflow and similar management tools), not packet sniffing.
The way it works is that you get the target client to initiate a file transfer from a server specifically set up for this, then you modulate the data rate (2 seconds at 1Mb/sec, 5 seconds at 3Mb/sec, 5 seconds at 750kb/sec, etc., etc. in a specific pattern) at which the data is being transmitted. You then compare the data flows from the server to the Tor exit node and the data flows from the Tor entry node to the potential targets.
If you can correlate the server-->exit node flow to a specific entry node-->client flow, you've just identified the client outside of Tor.
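A toy version of that comparison step, in case it helps. This is only a sketch under my own assumptions: per-second throughput samples, a plain Pearson correlation, and invented candidate flows (`client_a` and so on). The paper works from Netflow records, which are coarser, and I'm not claiming this is the authors' exact statistic.

```python
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def expand(pattern: List[Tuple[int, float]]) -> List[float]:
    """Turn (seconds, rate) steps into one rate sample per second."""
    series: List[float] = []
    for seconds, rate in pattern:
        series.extend([rate] * seconds)
    return series


# Injected pattern in kb/sec: 2 s at 1000, 5 s at 3000, 5 s at 750.
server_side = expand([(2, 1000.0), (5, 3000.0), (5, 750.0)])

# Invented entry-node-to-client flows, one sample per second (kb/sec).
candidates: Dict[str, List[float]] = {
    "client_a": [900, 1100, 2800, 3100, 2900, 3000, 2950, 800, 700, 760, 740, 790],
    "client_b": [1500] * 12,  # steady stream, no pattern at all
    "client_c": [2000, 500, 900, 1200, 400, 2500, 600, 3000, 2800, 700, 1900, 1000],
}

scores = {name: pearson(server_side, flow) for name, flow in candidates.items()}
best = max(scores, key=scores.get)  # the flow that tracks the injected pattern
```

Here `best` comes out as `client_a`, the only flow that follows the injected steps. On a live network the unrelated flows won't score as conveniently low as these made-up ones, which is where false positives would come from.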
No, no, you're not thinking; you're just being logical. --Niels Bohr
Better than 'performant'.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
Just my thought when I read about Facebook's dark site. A way to massively try new tools to deanonymize people, and such. Can't trust them
Except that society seems to function perfectly fine, even if not necessarily ideally, without everyone following the golden rule everywhere... which is what any kind of ubiquitous expectation of privacy actually generalizes to.
File under 'M' for 'Manic ranting'
Uhh, from the Constitution:
The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated, and no Warrants shall issue, but upon probable cause, supported by Oath or affirmation, and particularly describing the place to be searched, and the persons or things to be seized.
Where do people get the idea that privacy is some sort of inalienable right? I'll agree that it's a civic courtesy, and certainly it's impolite to disregard another person's privacy, but to that end, I see it as more of a social contract than any sort of actual right. I would suggest that any appearance of privacy we might seem to have is actually just an illusion offered by the fact that other people are either making a deliberate choice to be polite in that regard, or else they are simply not interested enough in what we think is private for others to be bothered with it. Either way, it's not something that you can actually control... it's largely determined by what other people do or want.
I don't know. I'm a private person, but not a secretive one. I don't mind sharing personal information with the folks I want to share with. I feel it's incumbent on me to keep things to myself. That may include encryption or access controls or just keeping my mouth shut.
Yes, there are those out there who want to know all about everyone, for their purposes. That doesn't mean I have to roll over and give it all up to anyone who wants it. If I take steps to protect data, ideas, information or anything else for that matter, it's in poor taste to attempt to circumvent those steps. Those who just throw it all out there without any concern will be (in my estimation) victims of their own carelessness. That's their choice.
That said, in a truly free society, those with the monopoly on organized violence (i.e., government) should be restricted from encroaching on the personal spheres of that society's members, unless there is a compelling reason to violate that sphere (i.e., Probable Cause in the US). Unfortunately, we don't live in such a society, and both the government (seeking control and power -- whether for noble or ignoble reasons) and a variety of corporate entities (for profit) are unable or unwilling to limit themselves. As such, if we want those folks "off our lawns" we need to take affirmative steps to make that happen.
No, no, you're not thinking; you're just being logical. --Niels Bohr
Everything can be hacked and/or de-anonymized sooner or later. What is the point in using anti-virus and firewalls, Tor and the like? Seems everything is flawed by design.
Exactly. Those windows on your house are vulnerable to rocks, so there's no point in locking your door. Safes can be cracked or blown open, so why keep your valuables in one? It's just going to get broken into, so why bother?
A peeping Tom can look in your window or drill a hole in your wall to watch you, so put up cameras everywhere in your house and broadcast the output on the Internet and a large screen TV outside your house. They're going to see it anyway, so why risk damage to the house?
No, no, you're not thinking; you're just being logical. --Niels Bohr
Which, in turn, is still better than "compute" (noun)
I agree completely, and further I think the law should require everyone keep their windows and curtains open day and night, and the door to the shitter open. At least until the telescreen is invented.
See also Griswold v. Connecticut, 381 U.S. 479 (1965).
I'm pretty sure reddit, probably through Google Analytics, may have started doing this around eighteen months ago. I tested trolling them with sock puppets and they could identify my house through Tor but could not differentiate between individual computers in the house. So pretty much anybody that uses Google Analytics probably has this capability.
Just because I don't think people should really have any expectation of privacy at any time doesn't mean I think people should not have any right to do whatever is within their own personal power and ability to directly control to preserve whatever privacy they feel they might be able to secure for themselves, to the extent that such efforts do not infringe on anyone else's freedoms or rights.
File under 'M' for 'Manic ranting'
I'm not surprised. I wrote a paper back in 2003, Techniques for Cyber Attack Attribution, that listed a LONG list of ways to do attribution. This sounds like a variant combining "modify transmitted messages" and "matching streams" via timing (see the paper).
Real anonymity is HARD. If someone wants to attribute you, it's hard to prevent.
- David A. Wheeler (see my Secure Programming HOWTO)
No one shall be subjected to arbitrary interference with his privacy, family, home or correspondence, nor to attacks upon his honour and reputation. Everyone has the right to the protection of the law against such interference or attacks.
It's one of our fundamental human rights, right up there with other inconvenient courtesies such as right to life, freedom from slavery, freedom from arbitrary detention, freedom from torture, right to asylum, and freedom of thought and religion. Everyone should know their rights. If you don't know your rights, you won't know what you risk losing.
The United States voted in favor of the declaration at the time. How times have changed...
Human Rights, Article 12: Freedom from Interference with Privacy, Family, Home and Correspondence
Even then.
I'm not suggesting if you haven't done anything wrong you have nothing to hide, because that's actually a completely misleading argument that can be easily shown to be a false notion anyways.
Privacy, as I said, is created by two things, neither of which one is really in direct control of. The first thing is how polite other people are making a deliberate choice to be... invading someone else's privacy, for any reason, almost invariably amounts to rude behavior. Privacy is a courtesy that as civilized human beings, we should always extend to those around us. The world, however, has more than its share of rude people, nor can you really legislate that people not be rude to other people, so the measure of confidence you can have in privacy in this factor is entirely out of your control.
The other thing that creates privacy is something that you may have a small, indirect level of control over, which is how disinterested other people are liable to be in whatever it is you are doing. But the only way you really can influence this is by taking efforts to try and secure some measure of privacy for yourself, to the extent that you do not harm other people or infringe on their rights, and to a degree that the efforts that must be taken by others to overcome the efforts you have put in to secure some privacy are likely to outweigh how interested other parties might be in knowing about whatever it is that you are keeping private. Such measures might give you a greater feeling of confidence or security, but since you actually do not have any real control over what other people might want or how badly they might want it, I would still suggest that any appearance of privacy you may seem to achieve for yourself is still going to largely be illusionary. Certainly, if the efforts required to overcome whatever barriers you try to put in place to give yourself some privacy amount to needing to break the law, then you can probably have a high degree of confidence in how much privacy you have, as long as whoever might be interested in your private affairs has not been offered any legal immunity... and you certainly deserve to have legal recourse when someone infringes on your privacy in that regard... not because they infringed on your privacy, per se, but because of whatever law it was that they actually broke.
File under 'M' for 'Manic ranting'
It seems to me that you just said the same thing as the parent post.
It seems to me, that you don't know the difference between packet sniffing and traffic analysis using Netflow and similar tools.
The links are there for your edification. You're welcome.
No, no, you're not thinking; you're just being logical. --Niels Bohr
There is no need to be rude or presumptive about my level of education. I shall explain what I meant in more depth to clear up any misunderstandings.
OP said: "So if you can spy on the traffic from the user to the tor entry node, and can spy on the traffic leaving the tor exit node at the same time... then you can tell that the traffic you saw going to the entry node is linked to the traffic leaving the exit node"
You said: "If you can correlate the server-->exit node flow to a specific entry node-->client flow, you've just identified the client outside of Tor."
Distinction Without a Difference - The assertion that a position is different from another position based on the language when, in fact, both positions are exactly the same -- at least in practice or practical terms.
Your provided links show that "packet sniffing" and "traffic flow analysis" are not different concepts in practice. The difference is in how the collected data is analyzed or for what purpose. For the purposes of this discussion where analysis of collected packets is for identical purposes, this is also a distinction without a difference. "A packet analyzer...is a computer program or a piece of computer hardware that can intercept and log traffic passing over a digital network or part of a network." "NetFlow is a feature that was introduced on Cisco routers that provides the ability to collect IP network traffic as it enters or exits an interface."
If you feel I have misinterpreted your statements, I would appreciate additional feedback.
My points were literal, rather than pejorative. Sniffing packets is gathering the *actual* packets. Netflow collects statistics about packets being transmitted/received. Do you see the difference?
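If it helps to see the difference, here is a simplified sketch of what a flow exporter keeps compared with what a sniffer keeps. The `Packet` and `FlowRecord` types and the `summarize` function are my own illustration, only loosely modelled on the kind of fields a flow record carries (addresses, ports, protocol, packet and byte counts, first and last timestamps); it is not the actual Netflow export format.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Packet:
    """What a sniffer captures: headers plus the actual payload bytes."""
    ts: float
    src: str
    dst: str
    sport: int
    dport: int
    proto: str
    payload: bytes


@dataclass
class FlowRecord:
    """What a flow exporter keeps: counters and timestamps, no payload."""
    packets: int = 0
    octets: int = 0
    first_ts: float = 0.0
    last_ts: float = 0.0


FlowKey = Tuple[str, str, int, int, str]


def summarize(packets: Iterable[Packet]) -> Dict[FlowKey, FlowRecord]:
    """Collapse packets into per-flow statistics keyed by the 5-tuple."""
    flows: Dict[FlowKey, FlowRecord] = {}
    for p in packets:
        key = (p.src, p.dst, p.sport, p.dport, p.proto)
        rec = flows.get(key)
        if rec is None:
            rec = FlowRecord(first_ts=p.ts)
            flows[key] = rec
        rec.packets += 1
        rec.octets += len(p.payload)
        rec.last_ts = p.ts
    return flows


# Made-up capture using documentation addresses (192.0.2.0/24).
capture = [
    Packet(0.0, "192.0.2.1", "192.0.2.9", 40000, 443, "tcp", b"x" * 1400),
    Packet(0.5, "192.0.2.1", "192.0.2.9", 40000, 443, "tcp", b"y" * 600),
    Packet(0.7, "192.0.2.2", "192.0.2.9", 40001, 443, "tcp", b"z" * 100),
]
flows = summarize(capture)
first_flow = flows[("192.0.2.1", "192.0.2.9", 40000, 443, "tcp")]
```

The three packets collapse into two records, and `first_flow` holds nothing but a packet count, a byte count and two timestamps. That is all the timing correlation needs, which is why encrypted payloads don't get in its way.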
GP stated "Good luck being able to sniff traffic on *both* ends." Firstly, traffic isn't being "sniffed." Secondly, with Netflow, it's not necessary to have packet sniffers on the specific links used in order to gather packet statistics.
What is more, since context is everything, GP was responding to my assessment of the paper (you know, the point of the article) and misunderstood the methodology used by the researchers. I explained.
If I (here and in my original post) have been unable to explain to you both the difference between packet sniffing and Netflow analysis and/or why GP misunderstood the methodology employed by the researchers, I suggest you read the paper yourself.
TL;DR : Packet sniffing != Netflow. Methodologies have impact on results and should be understood.
Should you want to criticize me, my reasoning or my tone (or at least your interpretation of it) for any other reasons, by all means, go right ahead.
No, no, you're not thinking; you're just being logical. --Niels Bohr
I understand where you were/are coming from now. Thanks.
Your provided links show that "packet sniffing" and "traffic flow analysis" are not different concepts in practice. The difference is in how the collected data is analyzed or for what purpose.
This is an incorrect conclusion. Packet sniffing and Netflow analysis are significantly different in both theory and practice, both from the standpoint of data collected, as well as the method(s) of collection. Granted, if you are sniffing packets, you can perform a similar analysis, but that's both completely impractical and (in the context of the research) self-defeating. Attempting to sniff all packets off an IX Node requires mirroring all packets, which would almost certainly cause serious congestion problems and be detected almost immediately. Collecting Netflow data from the same node wouldn't have a noticeable effect on the IX Node's network links.
Just to clarify that point. Collecting Netflow (or similar management protocol) data is significantly and demonstrably different (in the attack mechanisms posited by and the methodology employed by the researchers) in both theory and practice.
Yes, in a scenario with network links that carry much less data and where both endpoints are known, packet sniffing and Netflow data collection *can* provide similar analytical results (I've done both myself). But identifying data flows across large portions of the Internet (i.e., encompassing all or at least a significant fraction of Tor entry nodes -- in that the goal is identification of a device at an unknown location anywhere in the world) is a completely different animal.
I could go on, but those are the high points. The above should be obvious to anyone who has a reasonable amount of experience with IP networking. Perhaps I should have been more explicit, but given that this is a tech site and the article concerns a scholarly paper about networking, I assumed a certain level of working knowledge. My mistake.
No, no, you're not thinking; you're just being logical. --Niels Bohr
Distinction Without a Difference - The assertion that a position is different from another position based on the language when, in fact, both positions are exactly the same -- at least in practice or practical terms.
To clarify once again. The distinctions drawn are not based on nomenclature. There are specific and important technical differences which have real impact on the discussion.
As I read your post again, I'm sorely tempted to respond in kind. However, I understand that you thought I was assigning ignorance of this particular area of knowledge to you as an insult (although you did do so in your original reply -- note that I simply repeated what you said first), rather than as a simple statement of fact. In your position, I would likely have responded similarly.
No, no, you're not thinking; you're just being logical. --Niels Bohr
My apologies. I mis-stated both what you and I posted. The above paragraph should read:
No, no, you're not thinking; you're just being logical. --Niels Bohr