Anyone Can Play Big Brother With BitTorrent
An anonymous reader writes "I was at the 3rd USENIX Workshop on Large-Scale Exploits and Emergent Threats yesterday, and there were people from the French Institute for Computer Science who have continuously spied on most BitTorrent users on the Internet for 100 days, from a single machine. They've also identified 70% of all content providers; yes, those guys that insert the new contents into BitTorrent. As a BitTorrent user, I was shocked that anyone with a box connected to the Internet can spy on what everyone is downloading on BitTorrent."
Looks like a good way to earn a paycheck from the RIAA.
If copyright law were more sane we wouldn't have to argue so much about privacy.
Shh.
As a BitTorrent user, I was shocked that anyone with a box connected to the Internet can spy on what everyone is downloading on BitTorrent."
Really? All you have to do is be on the torrent and connect to them.
Did you know when reading you really only look at the first and last letter? Your mind fills in the rest. So that comment just shows where your mind is.
It is an important reminder of just how ignorant most technology users are of the very tools they're using.
As a BitTorrent user, I was shocked that anyone with a box connected to the Internet can spy on what everyone is downloading on BitTorrent."
Seriously? BitTorrent is a completely open, unsecured protocol. Yes. Anybody can be listening in. The only difficulty is finding the trackers, and it's not like that is THAT hard...
Whether or not the list created is ACCURATE, however, is another matter. It's also incredibly easy to 'poison' those lists with fake addresses, as in the case of the music-sharing printer...
[This post removed under the first rule of USENET.]
https://www.eff.org/https-everywhere
As a BitTorrent user, I was shocked that anyone with a box connected to the Internet can spy on what everyone is downloading on BitTorrent.
This is good news. It means BitTorrent is no longer relegated to those who are even remotely user savvy. This means more sharing!
Hint: BitTorrent is a protocol that relies on users talking to each other about what they're downloading. This, strangely enough, provides users with information on what everyone is downloading on BitTorrent.
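To make that concrete: a tracker's announce response commonly hands back a "compact" peer list, 6 bytes per peer (a 4-byte IPv4 address followed by a 2-byte big-endian port). Here is a minimal Python sketch of decoding one, assuming you already have the raw bytes in hand. The function name and the sample addresses are made up for illustration, not taken from any particular client.

```python
import socket
import struct


def decode_compact_peers(blob: bytes) -> list:
    """Decode a compact peer list: 4 bytes IPv4 + 2 bytes port per peer."""
    if len(blob) % 6 != 0:
        raise ValueError("compact peer list length must be a multiple of 6")
    peers = []
    for offset in range(0, len(blob), 6):
        # "!4sH" = network byte order, 4 raw bytes, one unsigned 16-bit int
        ip_bytes, port = struct.unpack_from("!4sH", blob, offset)
        peers.append((socket.inet_ntoa(ip_bytes), port))
    return peers


if __name__ == "__main__":
    # Hypothetical sample: two peers from documentation-only address ranges.
    sample = bytes([192, 0, 2, 1, 0x1A, 0xE1, 198, 51, 100, 7, 0xC8, 0xD5])
    for ip, port in decode_compact_peers(sample):
        print(ip, port)
```

Anyone who joins the swarm gets a list like this, which is the whole point: the protocol cannot work unless peers learn each other's addresses.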
You mean to tell me when I connect to a large pool of people, there is a large pool of people there?
from 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
to 45 2F 6E 40 3C DF 10 71 4E 41 DF AA 25 7D 31 3F
This must mean my IP address is being BROADCAST TO THE WORLD! And I thought I had punched the monkey to prevent this.
First day on the internet? Welcome.
As a BitTorrent user, I was shocked that anyone with a box connected to the Internet can spy on what everyone is downloading on BitTorrent."
Why? Have you been downloading really compromising porn?
WTF? It's peer to peer. All they need to do is have a copy and other people download stuff from you... so you know what they're downloading...
Deleted
How could they possibly spy on me if I'm using a private tracker with DHT disabled?
MABASPLOOM!
As a BitTorrent user, I was shocked that anyone with a box connected to the Internet can spy on what everyone is downloading on BitTorrent.
Really? I guess you never looked at the protocol. I can't find a reference, but I remember a news article from a few years ago in which Bram Cohen responded to a reporter who asked if he felt responsible for the piracy enabled by BitTorrent. Bram pointed out that BitTorrent is a terrible protocol to use for piracy, because anyone can see who is doing the pirating.
It's P2P, you can't hide your IP from someone when they ask for a bit of movie file and your computer cheerfully sends it! It's the equivalent of the police walking down your street shouting "Are there any thieves here?", and you sticking your head out the window to shout back "Yes Me me me! I'm a thief!!" ;-)
The best you can do is not respond to requests from IPs on a block list ... or steal Wifi from a poorly secured neighbour.
What if I set my BitTorrent client to connect only via encrypted connections? Is it possible to tell what torrent I am downloading without getting all torrent files that are tracked by the tracker (which is obviously easy to identify)?
or steal Wifi from a poorly secured neighbour.
That's not theft, it is only theft if you take a physical object... ;)
[sarcasm]
Or is that completely wrong and sooooo 2009?
The Invisible Hand of the Free Market is what punches workers in the nuts.
"I was shocked that anyone with a box connected to the Internet can spy on what everyone is downloading on BitTorrent."
Really, you were shocked? What were you doing at a tech conference, then? You did not belong there. Back in '01, when the BitTorrent protocol was released, the #1 thing about it was that 'you could see everyone and everyone could see you'. That's pretty much the definition of BitTorrent; that's what the tracker does: it connects you to other people. How can someone be so ignorant and apathetic as to not even have a basic understanding of how a technology they use works?
Awesome. Meet any chicks?
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Well thank God TOR is transparent.
- JackSpratts
Founder, Society for the Elimination of Opacity. ;)
That you can view peers on a BT network is not shocking. What deserves more attention is the fact that they were able to identify the IPs of even those users who used Tor. Of course, BT and Tor should never be mixed (to protect the network for those who need privacy for something other than piracy). This just proves it.
Saying you "can spy on what everyone is downloading on BitTorrent" and TFA stating "major privacy threat" are over-the-top and fear-mongering exaggerations.
A more accurate way to state this is: Using BitTorrent will make your IP address public, along with what content is downloaded and shared online from that IP address. When someone monitors the same content, they can log your IP address. This is obvious from how the protocol works to anyone who looks into privacy questions seriously. Yes, there is less privacy with what you download with BitTorrent compared to a direct download, as other people also sharing the same content can see your IP address.
But remember, with every download method online someone else knows you have downloaded it, with direct downloads and with all the different peer-to-peer distribution options. If you go to Adobe and download the latest Photoshop demo, they know, they log your IP, and they usually ask for even more information about you.
The only real privacy problem (a "major threat") is for people using BitTorrent for illegal redistribution of content; it is not a major problem for distribution of open licensed or public domain content, for businesses or organizations using BitTorrent for distribution to lower costs, or for distributing free content for viral or marketing purposes.
(Disclaimer: our company, ClearBits, does exactly this, offers distribution as a service to others, and we use BitTorrent extensively)
Free month of netflix + dvd decrypt > bit torrent. Sure netflix knows where you are, but the French never will!
Joey: No, I had sex a couple of days ago.
Rachel: No no, U-N-I-Sex...
Joey: Well...I can't say no to that...
As a BitTorrent user, I was shocked that anyone with a box connected to the Internet can spy on what everyone is downloading on BitTorrent.
Your geek card, hand it in!
Only on /. would this be modded as Insightful.
If you can read this, it means that I bothered to log in.
The scary part is that it doesn't have any of the torrents I've actually downloaded in the past few days, like BT4 which was a huge swarm and came in at like 3 MB/s.
When you're afraid to download music illegally in your own home, then the terrorists have won!
No, it means their conclusion is highly flawed. Or, at least, 2obvious4u's interpretation of their conclusion.
1. Host TOR exit node
...
2. Eavesdrop on traffic
3. Post results
4. Profit!
I'm sure the traffic coming out of TOR is far more interesting than BitTorrent traffic (unless you're a media company).
Seriously, this is news? Come on, 2 minutes looking at how the protocol works is enough to know this wouldn't be hard to do.
There's a reason they chose to use the word "most" to describe their success: in order for their snooping to be successful they have to connect to the offending machine(s). Deny them that connection, and they have nothing. Deny them that connection, and anonymizing proxies are irrelevant.
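For what "deny them that connection" looks like in practice, here is a rough Python sketch of a blocklist check using the standard `ipaddress` module. The network ranges are documentation-only placeholders, not a real list of monitors, and a real client would need to hook a check like this into its own connection handling.

```python
import ipaddress

# Hypothetical blocklist: placeholder ranges, NOT real monitoring networks.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(ip: str, networks=BLOCKED_NETWORKS) -> bool:
    """Return True if the peer address falls inside any blocked network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


if __name__ == "__main__":
    for peer in ("192.0.2.55", "203.0.113.9"):
        print(peer, "refuse" if is_blocked(peer) else "accept")
```

This only helps against monitors whose addresses you already know. A monitor connecting from an unlisted address still sees you, which is presumably part of why the authors could say "most".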
Yeah, that has been disproven.
There exist pairs of words which are anagrams of each other while still having the same first and last letter. Thus you would not be able to distinguish them if the intervening letters were scrambled. Two examples are protuberantial/perturbational and, even more on point, undefinability/unidentifiably.
Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?
Y'ruoe rghit, it's cpetlelmoy libelge. Fantacising!
Loosely yes, though I don't think it's quite so simple. First, the paragraph they gave has a lot of context and flow going for it, and it's mostly comprised of short, easy words. Also, the other letters matter quite a bit. With even a few random letters not in the original word jumbled in to each word, I suspect it'd be substantially harder to read. And as for the power of context, my mind originally turned "slelinpg" into "sleeping" rather than "spelling" and "aulaclty" into "audacity" rather than "actually". I still figured them out quickly enough on a second pass when it was clear they made no contextual sense, but, the first and last letters aren't the only important part of the pattern recognition.
For most situations the first and last letter plus the context are enough. Occasionally it doesn't work. Give it a shot, it's pretty cool.
You do have to be familiar with the words you're reading (i.e. a strong vocabulary), and beyond sub-vocalizing, so if you suck at reading it won't work. As long as you can sight read though, it works just fine. It's also easier the faster you read.
Security is mostly a superstition... Avoiding danger is no safer in the long run than outright exposure. - Helen Keller
The router is physical, and the GP presumably doesn't have the owner's permission to interact with it and thereby influence its behavior. The radio signals are also physical, albeit transient. Apart from that, you're quite right (disregarding the sarcasm): if there were nothing physical involved then no theft could have taken place.
Copyright infringement, on the other hand, concerns patterns, which may describe an arrangement of physical matter (fixed or transient) but are not themselves physical. It's the difference between the shape of a wave and a mass of water (or sand or photons or whatever) of that shape. The former is abstract, whereas the latter is physical (concrete).
"The state is that great fiction by which everyone tries to live at the expense of everyone else." - Bastiat
That doesn't disprove anything, because you'll pick the correct word based on context, even if the incorrect word was stated.
The only times it won't work are the rare occasions where two such anagrams can also be applied in the same context. For example, it would be hard to come up with a scenario where protuberantial and perturbational can be used interchangeably in a sentence and still be correct. Same with undefinability and unidentifiably.
Such occasions are extremely rare, and you're just as likely to infer the incorrect word when it is spelled correctly as you are when it is jumbled, for exactly the reason this technique works.
Just look at how many times someone on slashdot says "At first I thought it said X", almost always with odd acronyms that are very similar to other acronyms. The exact same thing is going on there, they aren't looking at the whole word, just cues, and the cues were the same for both words, so at first glance they picked the one they are most familiar with. When that seemed odd they re-read it more carefully and realized the mistake.
The fact that this is a problem at all is strong evidence that the brain really does work this way.
For example, it would be hard to come up with a scenario where protuberantial and perturbational can be used interchangeably in a sentence and still be correct. Same with undefinability and unidentifiably.
Sure you can: you just did it twice.
And you can have a lot more fun coming up with words that aren't in the English language, especially since sentences that attempt to teach you the correct spelling of words are rendered farcical when you scramble them. Much of science fiction would make little sense when you scramble words like joojooflop, swut, and turlingdrome, let alone unusual proper names.
It is reasoned that only a subset of scrambling, i.e. the exchange of pairs of adjacent letters, is readily discernible, and not the full-fledged scrambling of the word. That much is fairly provable, since the distribution of letters follows enough rules about which adjacent letters and phonemes are common.
It is entirely dependent on your familiarity with the words and your ability to put them into context. Basically, if you subvocalize at all it won't work - so unfamiliar words will throw you, as will an unfamiliar context.
First, the paragraph they gave has a lot of context and flow going for it, and it's mostly comprised of short, easy words.
Context is vital to reading in general, if there is no context you can follow it doesn't matter if the words are spelled correctly, it will simply be a confusing mess. Large words are no problem as long as you recognize them on sight. If you don't recognize it such that you need to sound it out, then it isn't going to work, obviously. It relies on you already knowing the words you're reading.
With even a few random letters not in the original word jumbled in to each word, I suspect it'd be substantially harder to read.
You'd be wrong, the first time I saw this the letters were substituted randomly instead of jumbled, and it worked fine. Jumbling is actually harder than substituting for a single letter - for example replacing the middle of the words with all x's makes what is going on abundantly clear, and actually easier to read. Your brain is not un-jumbling it in your head, it's comparing cues in the word with what you already know - for example "audacity": The first and last letters are easiest to recognize, a and y, and you see it's 8 letters long. There are maybe 15-20 common 8 letter words that start with "a" and end with "y", and given the context of a sentence using "audacity", there generally isn't any other option to confuse it with. Your brain is very good at this kind of estimation, so this process lets you read twice as fast or more as someone who sub-vocalizes.
If you had to read every single letter, you'd never be able to read faster than 300 words a minute or so, which is pretty slow. That's the whole idea behind breaking out of the sub-vocalizing stage. The next step up is to read whole phrases at a time, instead of just looking at words. After that are blocks of paragraphs, which you read in quick succession and put together the sentences in your head - you essentially read whole paragraphs at a time. That's extremely fast reading there, and not many people can do it. Reading phrases is not rare, and reading individual words by sight is common.
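The "audacity" cue-matching idea above can be sketched in a few lines of Python. This is only an illustration of matching on first letter, last letter and length; the tiny vocabulary is a made-up stand-in for whatever words a reader actually knows, and it makes no claim about how the brain really does it.

```python
# Hypothetical mini-vocabulary; a real reader's lexicon is vastly larger.
VOCAB = ["audacity", "actually", "accuracy", "activity",
         "anything", "sleeping", "spelling"]


def candidates(jumbled, vocabulary=VOCAB):
    """Words sharing the jumbled word's first letter, last letter and length."""
    cue = jumbled.lower()
    if not cue:
        return []
    return [w for w in vocabulary
            if len(w) == len(cue) and w[0] == cue[0] and w[-1] == cue[-1]]


if __name__ == "__main__":
    print(candidates("aulaclty"))
    print(candidates("slelinpg"))
```

With these three cues alone, "aulaclty" still matches several words in the sample list, so context (or the middle letters) has to do the rest of the work.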
http://www.google.com/search?hl=en&defl=en&q=define:theft&ei=c8bYS_30C4KsM7uVrI4K&sa=X&oi=glossary_definition&ct=title&ved=0CAwQkAE ...
In criminal law, theft is the illegal taking of another person's property without that person's freely-given consent.
en.wikipedia.org/wiki/Theft
Anonymous comments are as pathetic as the anonymous "sources" that contaminate gutless journalism from the New York Time
"As a BitTorrent user, I was shocked that anyone with a box connected to the Internet can spy on what everyone is downloading on BitTorrent."
Shocked? Why? I use BitTorrent myself and it is pretty obvious that anyone can do exactly what you state. The peers and associated seeders are clearly listed in the "peers" tab by IP address. BitTorrent even logs them for you. By necessity, that data (IP addresses) must be available to the clients and is thus available to anyone with the client.
It is rather naive to think that someone wouldn't put this data to use on a large scale. Creating an app to track new torrents, and the users of that data, doesn't seem too difficult to me (although I don't know how and could be wrong about the ease of doing so).
There is BIG money in data-mining. It should be of no surprise that data-mining would be applied to such an open network.
As with any online activity, use equates to a certain amount of exposure. Accept it, or don't use the resource.
Let me tell you a true story very much like the theoretical example you posted. When I was a kid there was a Rolling Stones song I loved, but I had no money to buy the album and my parents hated rock music. Our neighbors had that album, and I used to run to the backyard to listen when they played it. Was I stealing?
Most people don't act like you claim.
In most cases, if someone really wants to watch back seasons of Lost and they can't get it off the Pirate Bay, they'll spring the few bucks to rent it from the local Blockbuster or from Netflix.
Personally, I don't see why the media corporations don't just release their own torrents. I think most people here would be willing to live with watching the same amount of advertisements you would get on TV in return for a high quality torrent of their favorite shows that was seeded the second the show ended on primetime TV.
Remember folks, slashdot doesn't have a -1 "disagree" moderation!
Well, how about a shorter example: form/from. There's no altering the scrambling rules that can exclude that pair form ambiguity. And there's nothing wrong with the previous sentence either.
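The form/from point can be checked mechanically. Below is a small Python sketch, assuming the scrambling rule is "keep the first and last letter, shuffle everything in between": two words are indistinguishable under that rule exactly when they share the same first letter, last letter, and multiset of middle letters. The helper names and the word list are just for illustration.

```python
from collections import defaultdict


def scramble_key(word):
    """First letter + sorted middle letters + last letter."""
    w = word.lower()
    if len(w) <= 3:
        return w  # nothing to shuffle in words this short
    return w[0] + "".join(sorted(w[1:-1])) + w[-1]


def ambiguous_groups(words):
    """Groups of distinct words that collapse to the same scramble key."""
    groups = defaultdict(set)
    for w in words:
        groups[scramble_key(w)].add(w.lower())
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


if __name__ == "__main__":
    sample = ["form", "from", "audacity", "actually",
              "protuberantial", "perturbational"]
    for group in ambiguous_groups(sample):
        print(group)
```

Any group this reports cannot be told apart from the letters alone once the middle is scrambled, so only context is left to disambiguate.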
You didn't consider that your brain can see the first and last letter, then if your brain didn't understand it, go back and look at some more letters. I mean, that's what a time-saving accuracy-needing brain would do...
From the PDF it says the scanner downloaded pieces of data from all of the 1.2 Million torrents it listened in on. Shame Shame!
As a BitTorrent user, I was shocked that anyone with a box connected to the Internet can spy on what everyone is downloading on BitTorrent."
That's nothing! Imagine how shocked were content providers, when they discovered that anyone with a box connected to the Internet can insert the new contents into BitTorrent!
Thor is a God, you insensitive clod! And he is not transparent!
Incidentally, the CLI interface is fragile, and it can break out into a standard apache directory listing. It also occasionally redirects to an RFC document for some reason. Anyway, there's a log of all tried passwords there. But more interestingly, there's a lot of other stuff elsewhere in the tree, an 18MB text file with a Twitter social connection graph (just a list of name pairs), and a monitor/ directory with what looks like GSM/email/p2p monitoring stuff. Can't access most of it except an auto-refreshing IRC monitoring page though.
Somebody is using it for something it seems.
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
You'd be wrong, the first time I saw this the letters were substituted randomly instead of jumbled, and it worked fine. Jumbling is actually harder than substituting for a single letter - for example replacing the middle of the words with all x's makes what is going on abundantly clear, and actually easier to read.
Not for my girlfriend, who I just tried this on, and who gave up after a minute on "Yxx'd be wxxxg, txe fxxxt txxe I sxw txxs txe lxxxxxs wxxe sxxxxxxxxxd rxxxxxxy ixxxxxd of jxxxxxd, axd it wxxxxd fxxe." without working it out. For the record, her relaxed, typical reading rate is awfully fast, such that she routinely finishes 300+ page novels in an evening (normally, I wouldn't consider this pertinent, but I figured I'd anticipate a potential response). Furthermore, when I see something like jxxxxxd, I want to think "juxtaposed" because of the x, but if I were to see jlemubd, instead, I'd read the word in context without slowing.
My point was not that you read individual letters but that you recognize patterns of letters. That's why I pointed out that I parsed aulaclty as audacity rather than actually. Contextually, audacity made no sense in the sentence, but it had the right number of letters and it started and ended with the right letters. If that was *all* that mattered, I'd have chosen the word that made more contextual sense. But it isn't all that matters. It's just the most important thing.
Incidentally, I don't see what this has to do with speed reading or subvocalization. I didn't mistake aulaclty for audacity because it *sounds* similar but rather because it *looks* similar. If I were stumbling around on trying to sound out all of the jumbled words in that paragraph, I'd have read it very slowly indeed, and I misread aulaclty in an instant.
I was also kind of horrified to see a paper talking about trying to use BitTorrent over TOR.
It's widely considered to be abuse of the network.
If it's for-profit but free, you're not the customer -- you're the product (e.g., the Slashdot Beta's "audience").
"As a BitTorrent user, I was shocked that anyone with a box connected to the Internet can spy on what everyone is downloading on BitTorrent."
All of my BitTorrent clients show the IP addresses of the people I'm downloading from, and those I'm uploading to. Just click the tab labeled "Peers". Yeah, some of them are spoofed - no big deal, I can't identify EVERYONE. But I'll just bet that more than 80% are real addresses, and I could send a "cease and desist" on any letterhead I chose to each of the ISPs.
It would be no big deal to maintain a log of the data in the peers tab. When a new torrent appears, one IP address alone will have a complete file. Logging that will indeed give you the majority of people who actually provide content. Again, some of THOSE are probably spoofed - but you can identify most of them.
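A rough sketch of that logging idea in Python, assuming you have already collected peer sightings somehow. The `Sighting` record, its fields, and the sample addresses are hypothetical, and as noted above some addresses may be spoofed, so treat the result as a guess rather than proof.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Sighting:
    seconds: float   # time since the torrent first appeared (hypothetical field)
    ip: str          # peer address as shown in the "Peers" tab
    complete: bool   # True if the peer advertised having the whole file


def likely_first_seeder(sightings: Iterable[Sighting]) -> Optional[str]:
    """Return the IP of the earliest peer seen with a complete copy, if unique."""
    complete = sorted((s for s in sightings if s.complete),
                      key=lambda s: s.seconds)
    if not complete:
        return None
    first = complete[0]
    # If several peers were already complete at that instant, we cannot tell.
    tied = {s.ip for s in complete if s.seconds == first.seconds}
    return first.ip if len(tied) == 1 else None


if __name__ == "__main__":
    log = [
        Sighting(5.0, "198.51.100.7", False),
        Sighting(2.0, "192.0.2.1", True),
        Sighting(900.0, "198.51.100.7", True),
    ]
    print(likely_first_seeder(log))
```

The heuristic only works if you start watching a torrent very soon after it appears; show up late and plenty of peers will already be complete.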
It didn't take an astrophysicist to figure this out.
"Windows is like the faint smell of piss in a subway: it's there, and there's nothing you can do about it." - Charlie Br
you forgot the real part.
You then have to download the entire thing to find out if those blocks that are part of IronMan2.avi are actually part of the Iron Man 2 movie or some dumb student's project on feeding excessive iron to a man.
----
If you have the first part, the consecutive parts after that are automatically watchable -- immediately. And about 50% of the time you only have to download 50% of the torrent to get the 1st piece...
But even if you don't, I don't see why it wouldn't be possible for a computer algorithm to be developed to 'synchronize' with an active stream in progress. Surely the beginning is not the only place to sync with a video stream -- how else do you seek?
If you can synchronize with the video at any point, you can see the video from that point on. You can prioritize those chunks after a synchronization point to watch a bit of the video from that point on...
I know you don't have to download an entire BT stream to watch any part of it -- it's a feature of BT that it can be divided into files, and you can specify which parts of the stream you want to prioritize or download at all. You don't have to download the whole thing to get partials.
I take it you've never actually used BT?
Just use a linux box as home router....should be able to handle sufficient clients to saturate most links.
The article goes into a lot of detail about how they identify those users who are on VPN, proxy, Tor, etc. They've also identified over 10,000 IPs that "monitor" only, from a few data centers in the United States. If you're using BT, you should definitely read this article.
A friend of mine (I'm a Debian sysadmin) has developed a tool to find who is downloading a given file in real time. It is used by police to track child porn downloaders in several European countries. It's quite effective and has a long track record of success. I wanted to sell that product to the police of my (European) country to track those guys, but they wanted to use it to track music downloaders (our license restricts the use of the tool to tracking child porn downloaders). We did not reach an agreement here, but in other places we did. So I know for sure that these kinds of tools have been around for years...
I don't know why, but your post made me think about this XKCD cartoon.
Don't know, just sayin'...
I have no problem with your religion until you decide it's reason to deprive others of the truth.
If you'd like to remain anonymous (in the most effective way), use VPN services like ipredator.se.
Internet is not anonymous... wow they are smart! :P
What they don't say: IPs can be poisoned, and computers behind some IPs could be compromised. So they may identify some IPs which share some content. But do they check the content? And above all, they cannot link what happened with an IP to its owner. They cannot prove it with inexpensive systems.
Sounds like they might be!
Please read my Canon EOS tech blog at http://www.everyothershot.com
I'm surprised a nasty worm hasn't propagated via torrent client exploits. Get a list of IPs from a tracker AND the client/version they are using. Not only that: all the users would've opened the port on their router.
I just sent the paperwork in to copyright my IP address. I hope the RIAA will use it, thus exposing themselves to liability for the unlicensed use of my work.
Hey, if I copyright all my torrent packets, I'd get them on multiple violations!
1 - copyright IP address
2 - copyright all my IP packets
3 - wait for RIAA action
4 - ?
5 - PROFIT!
Place nail here >+
It's been known for a long time that the bittorrent protocol is not anonymous. Just use freenet.
Did you mount a military-grade, variable-focus MASER on an unlicensed artificial intelligence?
No, that's domesticated turkeys.
That is untrue! Just ask the newly formed Farmers Union and the Framers of the Constitution! ;)
But while the examples scramble every word, the thesis is that not every word need be scrambled and that any scrambled word can be derived from context.
The second sentence in my (the GP) post contains no misspellings. I wrote "pair form" and I meant "pair form". If you misread it as "pair from" that's an error on your part, not a valid correction.
I don't discount the phenomenon, only that it is stated too simplistically as a general case in a way that is easily disproven. If it were that easy, even a machine should be able to correct anything so scrambled without any special logic to even understand what is being written (and just as rapidly as the human brain provided the right relational database).
I've submitted my observations to snopes.com on this topic, including a link to this subthread, as an update to their article on the topic.
The "paper" is cited as saying correct spelling is unimportant. However, it doesn't say that the letters in the middle are unimportant, only their order. They still contain the correct letters. (Otherwise, no one w3d h2e any p6m u11g t2s p4e and we'd have discovered a new efficient standard for compressing English text.)
It boils down to saying that having only the correct first and last letter makes solving a scrambled word easy. Apart from the three example anagram pairs I've cited, that is correct in English. But those three examples disprove it as a rule with no exceptions as it is generally taken.
And it makes light of the plight of dyslexics. Not that I have a horse in that race.
Umm... This isn't news. I've known this since the protocol first got big back in 2003-2004-ish.
The MPAA and RIAA are both well aware of it and take advantage of the more stupid users regularly. A friend of mine in college even got a couple DMCA letters and the IT group punished him by shutting off his network access for a couple weeks (only in his dorm room, he could still use lab computers).
My question is: how did this get the green light and end up on the front page as "news"?
"You can't "steal" the expectation of income."
Let's say I find myself a man to play the guitar at dinnertime each night. It's now the end of the week, and he has the "expectation" of income. He was deprived of the use of his time, and I enjoyed the fruits of his labour. If I choose to not pay him, have I not stolen from him?
Instead, let's say he recorded the dinnertime performances for me in advance. I have not agreed in advance to buy them, but he expects that if I do wish to listen to them during dinner I will pay him. Let's say I copy them, listen to them, and I do not pay. He has the "expectation" of income, he was deprived of the use of his time, and I enjoyed the fruits of his labour.
If I'm not stealing in the second case, I'm not stealing in the first.
To the fucking asshole who modded this troll: get a brain. Words have meaning, moron.