P2P Through Firewalls

Text of interview by Anonymous Coward · 2004-11-23 06:39 · Score: 5, Informative

p2pnet.net News:- Freenet author Ian Clarke is developing Dijjer, a new open source p2p content distribution tool, and he's looking for people to test drive it before it goes online in beta.

"Dijjer is a peer-to-peer HTTP cache, designed to allow the distribution of large files from Web servers while virtually eliminating the bandwidth cost to the file's publisher," he told p2pnet.

"Dijjer is designed to be simple, elegant, and to cleanly integrate with existing applications where possible. Dijjer uses "UDP hole punching" to allow it to operate from behind firewalls without any need for manual reconfiguration.

"Dijjer's distributed and scalable content distribution algorithm is inspired by Freenet."

Below is a brief Q&A.

p2pnet: When did you start working on this?

Clarke: Several months ago. It's hard to pinpoint a specific time because it's a combination of a variety of ideas that have been at the back of my mind for quite some time.

p2pnet: What prompted you?

Clarke: Dissatisfaction with apps like BitTorrent, and a desire to demonstrate that the ideas behind Freenet could be applied to solve other problems.

p2pnet: When do you expect (hope) it'll be completed?

Clarke: Well, I'm sure that development will continue for quite some time, but I hope to release a beta version in four to eight weeks that will be suitable for large-scale adoption.

p2pnet: Who do you see as the principal users?

Clarke: Anyone who needs to distribute large files to large numbers of people but who can't afford to pay for the bandwidth that this would normally require.

The download site says features include:

"No Firewall configuration
With many P2P applications you must reconfigure your firewall to get the most out of them. Not so with Dijjer, we use state-of-the-art "NAT2NAT" techniques to get the most out of your internet connection without any reconfiguration.

"Sequential downloads
If you tried to download a video through Dijjer you may have noticed that you could start watching the video before the download completed. This is because Dijjer behaves like a web server: pieces of a file are downloaded in order and fed to your web browser when they arrive, allowing your browser to start displaying content before it has completely downloaded.

"No "Tracker" necessary, works with virtually any URL
This is a big one, Dijjer will work with almost any direct URL, the content publisher doesn't need to lift a finger - they may not even realise that people are using Dijjer to save their bandwidth costs!

"Cross platform and native compilable
Dijjer is implemented in Java, meaning that it will run on Windows, Linux, and Macs. Those who don't wish to install the Java Runtime Environment (JRE) will be pleased to note that Dijjer can be compiled with the GNU Compiler for Java (GCJ) to native code, thus eliminating the need for a JRE. Native compiled versions of Dijjer will be available from this site in due course.

"Free as in Speech
Dijjer will be released under the GNU General Public License.

"No cumbersome clients
Dijjer downloads through your web browser or preferred HTTP download application. You don't need to learn to use yet another P2P client user interface.

"Advanced scalable distributed caching algorithm
Dijjer uses a highly scalable distributed caching algorithm inspired by Freenet. This will allow it to deliver faster download speeds while placing less burden on the web server, and will be better able to handle sudden increases in demand for content."

"Now all I need are some people to help me test it," says Clarke.

Re:Reliable... udp... transfers? by Anonymous Coward · 2004-11-23 06:44 · Score: 5, Informative

TCP's slow start and retransmit back-off can keep it from fully using the pipe on long transfers over links with some packet loss.

Most modern UDP transfer systems use NACKing, where the receiver just tells the sender if it didn't get a packet (the packets are numbered sequentially) and that it should put it in the retransmit queue. The sender just goes about its business spewing out packets until it's informed the receiver didn't get one.
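
To make the NACK idea concrete, here is a minimal simulation sketch rather than any real transfer system. The function name `nack_transfer`, the `loss_rate` parameter, and the assumptions that the receiver knows the total packet count and that NACKs themselves are never lost are all illustrative.

```python
import random

def nack_transfer(packets, loss_rate=0.2, seed=1):
    """Simulate a sender that streams numbered packets and only
    retransmits the sequence numbers the receiver reports as missing."""
    rng = random.Random(seed)
    received = {}                        # seq -> payload, receiver side
    queue = list(range(len(packets)))    # sender's (re)transmit queue
    sends = 0
    while queue:
        for seq in queue:                # sender spews packets without waiting
            sends += 1
            if rng.random() >= loss_rate:    # packet survived the lossy link
                received[seq] = packets[seq]
        # Receiver NACKs every gap in the sequence numbers
        # (assumption: the NACKs always arrive).
        queue = [seq for seq in range(len(packets)) if seq not in received]
    data = [received[seq] for seq in range(len(packets))]
    return data, sends

if __name__ == "__main__":
    data, sends = nack_transfer([b"a", b"b", b"c", b"d"], loss_rate=0.3)
    print(data, "delivered using", sends, "transmissions")
```

A real implementation would also need pacing or congestion control, which this sketch leaves out entirely.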

bittorrent behind a firewall by ArbitraryConstant · 2004-11-23 06:44 · Score: 5, Informative

I have bittorrent behind my firewall. Rather than statically allowing ports, I set up a "torrent" user, and told the firewall to let it listen for connections. This also has two beneficial side effects. First, if there's an arbitrary code vulnerability, an attacker can be somewhat contained. Second, bittorrent doesn't always use the common range of ports, so prioritizing by port is problematic. Having a separate user lets me throttle the bittorrent connections so that interactive traffic has priority.

While I imagine this is possible with Linux, I have no specific knowledge of how to do it. I did it with PF on OpenBSD.

--
I rarely criticize things I don't care about.

Re:bittorrent behind a firewall by Moloch666 · 2004-11-23 07:14 · Score: 2, Informative

Let me make sure I understand this. You can take:
pass in on $ext_if inet proto tcp from any to $ext_if \
port $btorrent flags S/SA keep state queue (p2p_bit, low_ack)

Change to:
pass in on $ext_if inet proto tcp from any to $ext_if \
user torrent flags S/SA keep state queue (p2p_bit, low_ack)

Not only will it assign the appropriate queue, but automatically open the ports without specifically defining them?

--
Understanding is a three-edged sword. -- Kosh Naranek

Cross between Coral and BitTorrent? by Bert690 · 2004-11-23 06:46 · Score: 3, Informative

This looks like an interesting hybrid of Coral and BitTorrent. Coral is nice in that you don't need to install any client-side software to take advantage of it. With this one, it appears you do need to install a client-side proxy, which is a little scary.

This system seems to utilize a client that takes on roles of both the BitTorrent tracker and the Coral caching nodes. I wonder how the client caches coordinate? Any centralized server involved here?

Another firewall-busting HTTP serving system is YouServ (coral link), though geared more at sharing personal content instead of content requiring "super distribution".

Re:Cross between Coral and BitTorrent? by Sanity · 2004-11-23 08:03 · Score: 2, Informative

Any centralized server involved here?
Only for the initial introduction of a peer to the global network (as with all P2P apps) - then it's entirely decentralised, just like Freenet.

Re:I don't know about you by sameb · 2004-11-23 06:48 · Score: 5, Informative

LimeWire doesn't have any spyware, at all. In fact, it has absolutely zero bundled software. Even with the free version.

limewire = spyware free by Anonymous Coward · 2004-11-23 06:50 · Score: 2, Informative

Wrong.

Besides the official site stating categorically no adware, spyware, or bundled software, have an actual read of the page you linked to. It's written by a desperately technologically impaired writer who probably just got these from another source.

Re:How? by GregBildson · 2004-11-23 06:51 · Score: 5, Informative

I wrote most of the udpconnect code in LimeWire, and the basic idea is fairly simple. The downloader starts pinging the desired uploader with UDP SYN messages. At the same time it uses what we call a PushProxy in Gnutella to tell the uploader to start doing the same thing. So then both computers are sending UDP SYNs. This makes the NAT/firewall open up to this traffic, and the LimeWire hosts on both ends respond to the UDP SYNs with UDP ACKs in order to identify their connection ID.

After receiving the ACKs, the connections can send UDP data messages in both directions safely. The only trick is you need to ensure that a message is sent every so often so the NAT/firewall doesn't close. If nothing else is sent, a special KEEPALIVE message is sent. Beyond this, the communication is somewhat similar to TCP with a FIN message shutting things down at the end.
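
The message flow described above can be sketched as a small state machine. This is not LimeWire's code: the class name `PunchedConnection`, the method names, and the 10 second keepalive interval are assumptions made for illustration; only the message names SYN, ACK, KEEPALIVE and FIN come from the description.

```python
class PunchedConnection:
    """Toy model of one endpoint of a UDP connection set up by hole punching."""

    def __init__(self, conn_id, keepalive_interval=10.0):
        self.conn_id = conn_id
        self.keepalive_interval = keepalive_interval  # assumed value, in seconds
        self.state = "CONNECTING"
        self.peer_id = None
        self.last_sent = 0.0

    def on_message(self, kind, payload=None):
        """Handle one incoming message; return the reply to send, if any."""
        if kind == "SYN":
            return ("ACK", self.conn_id)   # identify our side of the connection
        if kind == "ACK":
            self.peer_id = payload         # peer's connection ID
            self.state = "OPEN"
        elif kind == "FIN":
            self.state = "CLOSED"
        return None

    def tick(self, now, outgoing):
        """Choose what to send at time `now`; `outgoing` is a list of payloads."""
        if self.state == "CONNECTING":
            msg = ("SYN", None)            # keep pinging until the peer answers
        elif self.state == "OPEN" and outgoing:
            msg = ("DATA", outgoing.pop(0))
        elif (self.state == "OPEN"
              and now - self.last_sent >= self.keepalive_interval):
            msg = ("KEEPALIVE", None)      # stop the NAT mapping from expiring
        else:
            return None
        self.last_sent = now
        return msg
```

The keepalive interval has to be shorter than the NAT's UDP mapping timeout, which varies between devices, so a real client would pick it conservatively.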

Re:VPN-mesh? by Cyfun · 2004-11-23 06:57 · Score: 2, Informative

Check out Virtual Native Network. "VNN is a platform which provides the peer to peer's transparence. The peer that is behide either NAT devices or a SOCKS server can communicate with another peer transparently. Also the applications run on the peer can ignore the NAT devices' existence. Enter the world of VNN. Get over the lack of IPv4 address. Construct our own convenient and easy-using VPN."

--
In Soviet Russia, dot slashes YOU!

Re:Reliable... udp... transfers? by crow · 2004-11-23 07:04 · Score: 3, Informative

UDP has advantages and disadvantages.

UDP is connectionless--you just send a packet to a given IP/port and it goes there. This means that you can forge the from address to make it impossible to tell who is sending the file (provided your ISP doesn't filter those as bogus packets). Of course, you still need some way to get the request from the recipient to the sender (along with re-requests for lost packets).

UDP has no flow control--the sender sends as fast as he likes without any knowledge as to what the maximum bandwidth on the connection is. If the sender's direct upstream connection is the bottleneck, then that should be fine, but otherwise there may be huge packet loss. Also, because of the lack of flow control, it tends to hog the bandwidth instead of sharing it.

Re:security? by Hobbex · 2004-11-23 07:07 · Score: 5, Informative

"UDP hole punching" is a simple technique, already used by many games, to allow two computers behind NAT firewalls to talk directly to one another.

Basically it works because UDP doesn't work very well with NATs, and so the NAT has to have a very general policy on what it forwards. UDP is a packet (datagram) based protocol. Each UDP packet is actually just an IP packet with two extra header fields added - the source port and the destination port, and then just the data. So how can a NAT know which host on the local network it should send a UDP packet to? It can't really, so it is forced to guess, and the classical way to do this is simply to forward incoming UDP packets addressed to a given port to a host that recently sent an outgoing UDP packet from that port.

This allows hosts behind the NAT to open something like a server port, by simply sending packets from a certain source port out to the Internet regularly, thus making sure that packets sent to that destination port from the Internet will be sent to them. Note though that this also reveals the scalability problem with UDP and NATs: if you have many machines sending UDP packets from the same ports you get a problem.

On modern, stateful firewalls, the NATs are slightly smarter, and will only forward the UDP packet to a node in the internal network if that node recently sent a packet from the destination port of the incoming packet, and to the host that the incoming packet was sent from. This makes it impossible to act as a general "server", but UDP hole punching is still possible if you have an intermediary who can tell two NATed hosts to start sending UDP packets to each other with certain port values. This means that a non-NATed host is still needed, but it doesn't need to forward all the traffic between the two others, like it would with a proxy solution.
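
The difference between the classical and the stateful forwarding policy can be sketched as a single decision function. This is a simplified model, not how any particular NAT is implemented; `nat_accepts` and the tuple layouts are hypothetical names chosen for the example.

```python
def nat_accepts(outbound, incoming, stateful):
    """Decide whether a NAT forwards an incoming UDP packet to the inside.

    outbound: set of (local_port, remote_host) pairs for packets that an
              inside machine recently sent out.
    incoming: (dest_port, src_host) of the packet arriving from outside.
    stateful: False for the classical "port match is enough" policy,
              True for the stricter policy that also checks the sender.
    """
    dest_port, src_host = incoming
    for local_port, remote_host in outbound:
        if local_port != dest_port:
            continue                    # nobody inside is using this port
        if not stateful:
            return True                 # classical NAT: any sender gets in
        if remote_host == src_host:
            return True                 # stateful NAT: only the host we contacted
    return False

if __name__ == "__main__":
    sent = {(30000, "B")}               # inside host sent to B from port 30000
    print(nat_accepts(sent, (30000, "B"), stateful=True))
    print(nat_accepts(sent, (30000, "C"), stateful=True))
    print(nat_accepts(sent, (30000, "C"), stateful=False))
```

Under the classical policy a third party can reach the open port, which is what lets a host behave like a server; under the stateful policy only the contacted host can, which is why a matchmaker is needed.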

Blah, I meant this to be short, but instead I wasted my time writing a long slashdot post, and now there is probably already a +5 with a shorter description. Everybody mock me...

UDP Hole Punching explained by Otto · 2004-11-23 07:23 · Score: 3, Informative

UDP Hole Punching won't work on an actual "firewall". Instead it's meant to get through these home-type NAT boxes that people are calling firewalls but which really are not.

The problem with getting stuff across a NAT gateway is that communication must go through the NAT, and the NAT is generally configured to block incoming traffic unless it's expecting it.

See, NAT works by pretending to be you. When you go get a web page, you send a packet to a webserver. The NAT box, being your gateway, gets this packet, then sends out a reformatted packet of its own to that webserver. It opens a return port to get the data from that webserver and this gets forwarded along to the receiving system. Basically you're changing the addresses used in both ways, so as to munge the thing between the private and public IP address space.

UDP works in a similar way, it's just modifying addresses going through the gateway. However, with UDP, usually the port number doesn't change. Meaning that when I send a packet out, I don't get to specify what port the responding host sends a return packet back to. I'm expected to know that it'll be coming back on the same port. So NAT deals with UDP pretty simply. The outgoing port and incoming port are the same. This is open to possible abuse, so most NAT boxes only forward packets from the original host back to the private network.

That's potentially confusing, so I'll use an example:

Computer A is behind a NAT. He sends a UDP packet to computer B on the public internet, on port 30000. The NAT munges the outgoing address and forwards it to computer B. Computer B sends back a UDP packet on port 30000. The NAT verifies that he is only allowing B to respond on that port, and sends the packet back to computer A. If computer C were to send something to the NAT on port 30000, it would be discarded by the NAT (not all NATs do this, some allow anything in for a short time instead).

In the case where only one system is behind a NAT, this is easy to solve. The computer behind the NAT must initiate the connection. That's all there is to it. Computer A initiating the connection makes it possible for the NAT to send stuff back to computer A, and so all is good.

In the case where both computers A and B are behind their own NAT, suddenly they have no way to talk to each other. Anything A sends to B gets dropped by B's NAT, and anything B sends to A gets dropped by A's NAT. The only fix for this has been port forwarding, which manually punches a hole in one or both NAT devices.

UDP Hole Punching exploits the UDP behavior of NAT devices to allow A and B to communicate directly without any port forwarding being needed. It works like this:

Computer A sends a UDP packet to computer B on port 30000. This act opens the hole in the NAT for B to talk to A on port 30000. At the same time, A sends a packet to Server S on the P2P network. This packet basically asks computer B to send something to computer A on port 30000. Server S routes the packet to computer B over the already set up P2P network. Computer B then sends something to computer A on the given port, and they can now talk directly and set up other ports if they like by this single channel of communication that they have gotten open.

And that's all it is, really. Just a way of using an intermediary that can talk to both A and B (via the already established P2P routing) to allow them to talk directly. Nothing particularly tricky.
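
The A, B and S sequence can be walked through with a toy model of two NATs. It is only a sketch under strong assumptions: the `Nat` class and `punch` function are hypothetical, each NAT keeps port numbers unchanged, holes never expire, and the relay through Server S is represented by a comment rather than real networking.

```python
class Nat:
    """Toy NAT: admits a packet only from a host that the machine behind
    it has already sent to on the same port."""

    def __init__(self, name):
        self.name = name
        self.holes = set()              # (port, remote_name) pairs

    def send(self, port, remote):
        """Send from behind this NAT; returns True if the packet got through."""
        self.holes.add((port, remote.name))     # outgoing packet opens a hole
        return remote.receive(port, self.name)

    def receive(self, port, sender_name):
        return (port, sender_name) in self.holes


def punch(a, b, port=30000):
    """Replay the hole punching sequence; returns (direction, delivered) pairs."""
    log = []
    # 1. A sends straight to B. B's NAT has no hole yet, so the packet is
    #    dropped, but A's NAT now expects traffic from B on this port.
    log.append(("A->B", a.send(port, b)))
    # 2. A asks Server S to tell B "send something to A on this port".
    #    Both peers can already reach S, so the relay is not modeled here.
    # 3. B sends to A. A's hole admits it, and B's own hole opens.
    log.append(("B->A", b.send(port, a)))
    # 4. A sends again; B's NAT now lets the packet through.
    log.append(("A->B", a.send(port, b)))
    return log

if __name__ == "__main__":
    for direction, delivered in punch(Nat("A"), Nat("B")):
        print(direction, "delivered" if delivered else "dropped")
```

The first packet is expected to be lost, which matches the point that the application has to tolerate loss on top of UDP.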

Why UDP? Because UDP doesn't get the port changed by the NAT. TCP connections over a NAT usually get ports munged by the NAT without informing the computer behind the NAT. That's part of the "transparency" portion of NAT. The less tricky behavior of UDP on a NAT device makes this possible.

--
- Give a man a fire and he's warm for a day, but set him on fire and he's warm for the rest of his life.

Not likely by Wesley+Felter · 2004-11-23 07:28 · Score: 2, Informative

Lots of ISPs implement egress filtering these days to reduce forged IP source addresses.

Re:How to Initiate Connection? by sameb · 2004-11-23 07:35 · Score: 2, Informative

It's still peer to peer. Or more appropriately, peer to peer using another peer (P2PUAP). The flow is like this:

- Someone sends a request that gets routed through the network.
- Someone sends a reply that gets routed back through the network. The reply contains the address of a few [directly connectable] people the replier is connected to.
- The requestor sends a message to the directly connectable folks telling them to tell the replier to start sending UDP packets.
- Requestor & Replier both start sending UDP packets to each other. The NATs open, the transfer begins.

robots.txt is for search engines by Sanity · 2004-11-23 07:38 · Score: 2, Informative

Dijjer respects the various no-cache HTTP headers; a robots.txt file is intended for search engines, not caches.
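
As a rough illustration of what respecting no-cache headers can look like, here is a hedged sketch. Which headers and directives Dijjer actually checks is not stated here, so the function `may_cache` and its list of directives are assumptions, and `no-cache` is treated conservatively as "do not store".

```python
def may_cache(headers):
    """Return False if the response headers ask caches not to keep a copy.

    headers: dict of HTTP response header names to values.
    """
    normalized = {name.lower(): value.lower() for name, value in headers.items()}
    # Split "no-store, max-age=0" into bare directive names.
    directives = {
        part.strip().split("=")[0]
        for part in normalized.get("cache-control", "").split(",")
    }
    if directives & {"no-cache", "no-store", "private"}:
        return False
    if "no-cache" in normalized.get("pragma", ""):   # HTTP/1.0 style opt-out
        return False
    return True

if __name__ == "__main__":
    print(may_cache({"Cache-Control": "public, max-age=3600"}))
    print(may_cache({"Cache-Control": "no-store"}))
```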

Re:How to Initiate Connection? by Hobbex · 2004-11-23 07:40 · Score: 2, Informative

Yes, it is impossible for two hosts behind stateful NAT firewalls to communicate if they do not have some third party "matchmaker" to tell them: "start sending packets from this port to that port at that host". But the point is that this matchmaker still has a very low load, and can exit once the connection is established, so that is not that bad compared to what would happen if he served as a proxy for all the data instead.

This is definitely an ugly hack, but all NAT is really just an ugly hack, so it isn't that surprising.

Trivially solved by Sanity · 2004-11-23 07:41 · Score: 3, Informative

Just make your web server reject or redirect links that do not report Dijjer as their HTTP client. Easy.
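
A minimal sketch of that server-side check, written as a plain function rather than for any particular web server. The User-Agent token `dijjer` and the gateway prefix `http://dijjer.example/get/` are hypothetical placeholders, not values taken from Dijjer itself.

```python
def route_request(user_agent, url, gateway="http://dijjer.example/get/"):
    """Return (status, location) for a request for a large file.

    Requests that identify themselves as the P2P cache are served
    directly; everything else is redirected through the cache.
    """
    if "dijjer" in (user_agent or "").lower():
        return 200, None                # the cache itself: serve the file
    return 302, gateway + url           # anyone else: send them via the cache

if __name__ == "__main__":
    print(route_request("Mozilla/5.0", "http://example.com/big.iso"))
    print(route_request("Dijjer/0.1", "http://example.com/big.iso"))
```

Since a client can send any User-Agent it likes, this only steers well-behaved traffic; it is not an access control.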

Wondered when.... by GoRK · 2004-11-23 07:44 · Score: 2, Informative

I have been really wondering when someone was going to do this for P2P apps. Compared to how much other software actually uses the same techniques, it's long overdue. There seem to be some misconceptions on how it works though in the comments here, so I'll try to do a simple explanation:

UDP is stateless. There is no connection setup like there is with TCP, so there's really no way for a firewall or gateway to statefully track where to send UDP packets. The typical implementation for NAT'ing UDP is therefore something of a 'best guess' scenario, redirecting certain packets based on port numbers and IPs. These new applications take advantage of this synchronous behavior of NAT devices to permit direct connection between client computers where both are behind NAT firewalls.

NAT of UDP is generally implemented like this: If you begin sending UDP from source port 2000 on your computer to a remote host on port 5000, then the router doing NAT will automatically open up a 'hole' that allows any UDP packet from the remote host from source port 5000 to destination port 2000 on your machine to pass through to you. This is sort of how it works with TCP too; however the firewall only opens up the 'holes' when connections are first set up and only allows packets with correct sequence numbers to pass back through.

Essentially how it works is that two clients decide to "connect" and agree on port numbers, etc. through some third host that both can reach via TCP. They then begin sending UDP data to each other. Once a packet goes out from both hosts, the two 'holes' in the firewall will open up. Probably at least one packet will not actually arrive at its intended destination; however, the software can implement its own robust transfer protocol over UDP.

Games have been doing this forever. QuakeWorld (the Quake 1 client tailored to internet play) was one of the first to implement it. Most implementations of SIP support this type of connection.

Problem with Dijjer by brunes69 · 2004-11-23 08:03 · Score: 4, Informative

Just tried it out, so this is speaking from actual experience. Dijjer doesn't limit itself to sharing files you have already downloaded - it will *actively* download files other people are requesting, so that it can share them.

This is similar to Freenet, and indeed will maximize everyone's bandwidth. But it has grave issues when not combined with Freenet's huge anonymity factors like encryption and hiding IPs, and will open you up to all sorts of legal problems.

I don't want the FBI knocking down my door because my Dijjer client has been downloading kiddie porn for someone else without my knowledge. Sure, I *may* be able to argue in court that it was not me, and hey, I may even be able to prove it. But is that potential trouble worth my saving on some bandwidth? I think not.

Re:please dont by Jugalator · 2004-11-23 08:36 · Score: 2, Informative

The page has now been updated:

Welcome Slashdotters

Ok, I guess the "please don't submit to high traffic websites" in red wasn't enough, perhaps I should have used <blink> tags ;-) Since you are here, please heed the warning that this is at an early stage of development, if you are interested please sign up to our announcement mailing list so that we can let you know when it's ready for primetime. Otherwise, we do need testers, so feel free to poke around.

--
Beware: In C++, your friends can see your privates!

DMCA explicitly makes caching legal by Sanity · 2004-11-23 08:49 · Score: 2, Informative

will open you up to all sorts of legal problems.

Care to be more specific? It seems to me that Dijjer is pretty-much exactly what the system caching exemption of the DMCA was intended for.

Dijjer does not create any more liability for its users than a HTTP cache creates for an ISP, and note that virtually all ISPs run HTTP caches, so far as I know, without encountering legal problems.
