Slashdot Mirror


81% of Tor Users Can Be De-anonymized By Analysing Router Information

An anonymous reader writes: A former researcher at Columbia University's Network Security Lab has conducted research since 2008 indicating that traffic flow software included in network routers, notably Cisco's 'Netflow' package, can be exploited to deanonymize 81.4% of Tor clients. Professor Sambuddho Chakravarty, currently researching Network Anonymity and Privacy at the Indraprastha Institute of Information Technology, uses a technique which injects a repeating traffic pattern into the TCP connection associated with an exit node, and then compares subsequent aberrations in network timing with the traffic flow records generated by Netflow (or equivalent packages from other router manufacturers) to individuate the 'victim' client. In laboratory conditions the success rate of this traffic analysis attack is 100%, with network noise and variations reducing efficiency to 81% in a live Tor environment. Chakravarty says: 'it is not even essential to be a global adversary to launch such traffic analysis attacks. A powerful, yet non-global adversary could use traffic analysis methods to determine the various relays participating in a Tor circuit and directly monitor the traffic entering the entry node of the victim connection.'
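The correlation step described in the summary can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the summary does not say which statistic the researchers use, so Pearson correlation is assumed here, and the square-wave pattern, the per-interval byte counts, and the names `client_a`, `client_b`, `client_c` are all hypothetical stand-ins for real flow records.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def most_correlated(injected, candidates):
    """Score every candidate flow against the injected pattern."""
    scores = {name: pearson(injected, series)
              for name, series in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


rng = random.Random(0)

# Hypothetical injected pattern: server throughput alternates between
# high and low every 5 sampling intervals, over 60 intervals.
injected = [1000.0 if (i // 5) % 2 == 0 else 200.0 for i in range(60)]

# Hypothetical per-interval byte counts seen near the entry node.
# client_b carries the pattern plus noise; the others are unrelated.
candidates = {
    "client_a": [rng.uniform(100, 1100) for _ in injected],
    "client_b": [v * 0.9 + rng.gauss(0, 150) for v in injected],
    "client_c": [rng.uniform(100, 1100) for _ in injected],
}

best, scores = most_correlated(injected, candidates)
print(best, {name: round(score, 2) for name, score in scores.items()})
```

In this toy setup the flow carrying the pattern stands out clearly. Real flow records are sampled and aggregated much more coarsely, which is presumably part of why the reported live success rate falls short of the laboratory figure.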

3 of 136 comments

  1. After Reading The Paper by NotSanguine · · Score: 5, Informative

    It's clear that there are significant limitations to the tested identification methods. Firstly, the server endpoint must be under the control of the entity attempting identification. Secondly, the TOR *entry* node being used must be identified (if you have the resources, I guess you could monitor traffic flows from *all* entry nodes) so that the Netflow data can be compared between the Server-->Exit Node and the Entry Node-->potential target client. Thirdly, to generate enough traffic for correlation, "large" amounts of data must be downloaded from the server (that is the authors' term; they do not give the size of the file/data required, only that downloads must last ~seven minutes to collect enough data).

    It's an interesting piece of work, but pulling off an identification like this requires the anonymized client both to connect to a server specifically configured to generate traffic flows that can be identified and, once connected, to be induced to download a "large" file/dataset. What is more, those attempting the identification must also be able to gather Netflow records from the interface(s) associated with the specific (and likely unknown) TOR entry node, or monitor flows from *all* TOR entry nodes.

    It seems to me that, while the above scenario is certainly feasible, if you can get a potential target to visit a server that's under your control and download a large file, you can probably infect the client with malware from that server and have said malware phone home without TOR, producing a specific identification without false positives or negatives. That would be much less resource-intensive and more useful, IMHO.

    --
    No, no, you're not thinking; you're just being logical. --Niels Bohr
  2. Re:Where does the right to privacy come from? by Maltheus · · Score: 3, Informative

    Uhh, from the Constitution:

    The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated, and no Warrants shall issue, but upon probable cause, supported by Oath or affirmation, and particularly describing the place to be searched, and the persons or things to be seized.

  3. Re:The only solution I can think of by Carnildo · · Score: 3, Informative

    Not really. Random jitter can be dealt with statistically: collect more data, compute the mean, and use the mean where you would have used the exact timing.

    In order to defeat timing analysis through noise injection, you need to introduce a large amount of variation compared to the number of packets being sent; for any realistically-sized data transfer, this requires jitter on the order of minutes to hours.
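The averaging argument in this comment can be sketched with a small simulation. Everything here is an assumption made for illustration: the two "true" delays (60 ms and 20 ms), the uniform 0 to 100 ms jitter, and the helper names are hypothetical, not taken from the paper or from Tor.

```python
import random
import statistics


def observed_delays(true_delay_ms, jitter_max_ms, n, rng):
    """Return n observations of a delay with uniform random jitter added."""
    return [true_delay_ms + rng.uniform(0, jitter_max_ms) for _ in range(n)]


def estimated_gap(n, rng, slow_ms=60.0, fast_ms=20.0, jitter_max_ms=100.0):
    """Estimate the gap between two timing phases from n noisy samples each.

    Both phases receive the same jitter distribution, so its mean cancels
    when the two sample means are subtracted.
    """
    slow = observed_delays(slow_ms, jitter_max_ms, n, rng)
    fast = observed_delays(fast_ms, jitter_max_ms, n, rng)
    return statistics.fmean(slow) - statistics.fmean(fast)


rng = random.Random(1)
gap_small = estimated_gap(10, rng)       # few packets: estimate is noisy
gap_large = estimated_gap(100_000, rng)  # many packets: jitter averages out

print(f"true gap: 40.0 ms, n=10: {gap_small:.1f} ms, "
      f"n=100000: {gap_large:.2f} ms")
```

Here the jitter is larger than the 40 ms signal, yet with enough samples the mean recovers the gap closely. The error of the mean shrinks roughly with the square root of the number of samples, which is why the jitter has to be large relative to the number of packets sent before it hides the timing.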

    --
    "They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.