Slashdot Mirror


FTP: Better Than HTTP, Or Obsolete?

An anonymous reader asks "Looking to serve files for downloading (typically 1MB-6MB), I'm confused about whether I should provide an FTP server instead of / as well as HTTP. According to a rapid Google search, the experts say 1) HTTP is slower and less reliable than FTP and 2) HTTP is amateur and will make you look a wimp. But a) FTP is full of security holes. and b) FTP is a crumbling legacy protocol and will make you look a dinosaur. Surely some contradiction... Should I make the effort to implement FTP or take desperate steps to avoid it?"

36 of 870 comments

  1. do both... by jeffy124 · · Score: 4, Informative

    But in my experience, HTTP for whatever reason goes faster (not entirely sure why), and FTP doesn't work for some because of firewalls.

    Try both - see which gets used more.

    --
    The One Rule Of Chess You'll Ever Need: Don't play someone who carries a kit in their bookbag.
  2. how about rsync? by SurfTheWorld · · Score: 5, Informative

    rsync is a great protocol: fairly robust, can be wrapped in ssh (or not), supports resuming transmission, and operates over one socket.

    seems like the best of both worlds to me.

    the real question is - do you control the clients that are going to access you? or is it something like a browser (which doesn't support rsync).

    --
    Do it for da shorties
    1. Re:how about rsync? by Dr.+Awktagon · · Score: 5, Informative

      Agreed.. I've had enough headaches with FTP and firewalls/NAT, let's just let it die. For robust downloading of large files rsync is the protocol to use.

      For those not familiar: rsync can copy or synchronize files or directories of files. It divides the files into blocks and only transfers the parts of the file that are different or missing. It's awesome for mirrored backups, among other things. There is even a Mac OS X version that transfers the Mac-specific metadata of each file.

      Just today I had to transfer a ~400MB file to a machine over a fairly slow connection. The only way in was SSH and the only way out was HTTP.

      First I tried HTTP and the connection dropped. No problem, I thought, I'll just use "wget -c" and it will continue fine. Well, it continued, but the archive was corrupt.

      I remembered that rsync can run over SSH and I rsync'd the file over the damaged one. It took a few moments for it to find the blocks with the errors, and it downloaded just those blocks.
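
      A rough Python sketch of that "find the bad blocks, fetch only those" idea. This is not rsync's actual algorithm: real rsync uses rolling checksums so it can match blocks at any offset, while this toy version only compares blocks at fixed positions. The block size and function names are made up for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical; tiny so the example is easy to follow


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def damaged_blocks(good: bytes, local: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return indexes of blocks in `local` that differ from `good` or are missing."""
    good_d = block_digests(good, block_size)
    local_d = block_digests(local, block_size)
    return [
        i for i, digest in enumerate(good_d)
        if i >= len(local_d) or local_d[i] != digest
    ]


def repair(good: bytes, local: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the file, copying only the damaged blocks from the good copy."""
    bad = set(damaged_blocks(good, local, block_size))
    n_blocks = len(block_digests(good, block_size))
    out = bytearray()
    for i in range(n_blocks):
        start = i * block_size
        source = good if i in bad else local
        out += source[start:start + block_size]
    return bytes(out)
```

      For example, with good = b"aaaabbbbccccdd" and a damaged local copy b"aaaaXbbbcccc", damaged_blocks reports blocks 1 and 3, and repair takes only those two blocks from the good copy.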

      Rsync should be built into every program that downloads large files, including web browsers. Apple or someone should pick up this technology, give it some good marketing ("auto-repair download" or something) and life will be good.

      Rsync also has a daemon mode that allows you to run a dedicated rsync server. This is good for public distribution of files.

      Rsync is the way to go! I guess this really doesn't 100% answer the poster's question, but people really should be thinking about rsync more.

  3. Http/Ftp which is slower? by emf · · Score: 3, Informative

    "HTTP is slower and less reliable than FTP"

    I would think FTP is slower since with FTP you have to login and build the data connection before the transfer begins. With HTTP it's a simple GET request.

    As far as the actual data being sent, I believe that the file is sent the same way with both protocols. (just send the data via a TCP connection). I could be wrong though.

    1. Re:Http/Ftp which is slower? by treat · · Score: 4, Informative

      an FTP session has two connections, the control which is TCP/IP and data which is UDP.

      This is not true. FTP does not use UDP for any purpose.

    2. Re:Http/Ftp which is slower? by DaveBarr · · Score: 5, Informative

      The data connection is most assuredly NOT UDP. It is a TCP connection just like the control connection. But yes, the latency required to initiate a transfer (due to more handshakes) generally makes FTP slower.

    3. Re:Http/Ftp which is slower? by Edgewize · · Score: 5, Informative

      FTP supports a single connection (Passive, or PASV in the actual protocol), which is what most web browsers use by default.

      No, no, no. Jesus. Everyone always gets this wrong. FTP in any mode uses two TCP connections. Passive or not, there is a channel for data and a separate channel for commands.

      The difference is that passive-mode means that the client initiates the data connection. The default FTP behavior is for the client to connect to port 21 on the server, and then the server initiates a data connection to the client.

      Non-passive FTP clients are very hard for firewalls to keep track of, especially when NAT is involved. Passive is a little better because both connections are outgoing.

      But at the same time, passive mode makes the server firewall's job tougher, because it requires a large range of incoming ports for the data connections.

      No matter what the mode, FTP is not very firewall-friendly.
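
      To make the passive-mode data connection concrete: the server answers PASV with a 227 reply naming an address and port, and the client then opens the second TCP connection to it. A minimal Python sketch of decoding that reply (the function name is hypothetical and the address in the example is made up):

```python
import re


def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Extract (host, port) from a '227 ... (h1,h2,h3,h4,p1,p2)' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError(f"not a PASV reply: {reply!r}")
    numbers = [int(n) for n in match.groups()]
    host = ".".join(str(n) for n in numbers[:4])
    # The port is sent as two bytes: high byte first, then low byte.
    port = numbers[4] * 256 + numbers[5]
    return host, port
```

      Given "227 Entering Passive Mode (192,168,1,10,195,80)", this yields host 192.168.1.10 and port 195*256 + 80 = 50000. Each such port has to be reachable through the server's firewall, which is where the large incoming range comes from.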

  4. For what it's worth by dunedan · · Score: 3, Informative

    Those of your customers who don't have fast access to the internet may appreciate even a slightly faster standard.

  5. HTTP is fine by ahknight · · Score: 4, Informative

    HTTP does not have firewall issues, does not need authentication, does not (by default) allow directory listings, and is the same speed as FTP. It's a good deal for general file distribution.

    FTP is quickly becoming a special-needs protocol. If you need authentication, uploads, directory listings, accessibility with interactive tools, etc. then this is for you. Mainly useful for web designers these days, IMO, since the site software packages can use the extra file metadata for synchronization. Other than that, it's a lot of connection overhead for a simple file.

    FTP does have one nice advantage that HTTP lacks: it can limit concurrent connections based on access privileges (500 anonymous and 100 real, etc.). Doesn't sound like you need that.

    Go with HTTP. Simple, quick, anonymous, generally foolproof.

    1. Re:HTTP is fine by Voytek · · Score: 5, Informative

      [SNIP]
      does not (by default) allow directory listings
      [SNIP]

      That is a dangerous and very incorrect assumption which has nothing to do with http and everything to do with your http server.

    2. Re:HTTP is fine by kasperd · · Score: 4, Informative

      The HTTP protocol may or may not recommend DIR listings by default

      No, the HTTP protocol does not even specify the concept of a directory listing. Some servers can generate an HTML file from the directory listing, but that is all up to the server; it can generate that file as it likes or even just serve an error.
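
      As an illustration of "up to the server", here is a minimal Python sketch of how a server might turn a list of file names into an HTML index page. The function name and page layout are assumptions for the example; real servers each produce their own markup, or none at all.

```python
import html
from urllib.parse import quote


def render_index(directory_url: str, names: list[str]) -> str:
    """Build a minimal HTML page linking to each file name, sorted by name."""
    items = "\n".join(
        # quote() makes the name safe inside a URL; html.escape() makes it safe as text.
        f'<li><a href="{quote(name)}">{html.escape(name)}</a></li>'
        for name in sorted(names)
    )
    title = html.escape(f"Index of {directory_url}")
    return (
        f"<html><head><title>{title}</title></head>"
        f"<body><h1>{title}</h1><ul>\n{items}\n</ul></body></html>"
    )
```

      Nothing in the protocol requires this page to exist; to the client it is just another HTML response.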

      --
      Do you care about the security of your wireless mouse?
  6. What do you want to do? by fwankypoo · · Score: 5, Informative

    The question is, "what do you want to do?" I run an FTP server (incidentally affiliated with etree.org, lossless live music!) and I need what it can give me. Namely I need multiple classes of login, each with a different

    1) number of available slots
    2) speed limit
    3) permission set

    Some people can only read files at 60KB/s, some can read and write (to the upload dir) at the same speed, some can only browse, etc. etc. For this kind of a setup, FTP is great _IF_ you keep your software up to date; subscribe to bugtraq or your distro's security bulletin or both.

    On the other hand, HTTP is great when you want to give lots of people unlimited ANONYMOUS access to something. I'm sure there is a way to throttle bandwidth, but can you do it on a class by class basis? In proftpd it's a simple "RateReadBPS xxx" and I'm set.
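
    The per-class setup described above can be pictured as a small policy table. The following Python sketch is only an illustration of the idea, not how proftpd implements it; the class names, slot counts, and function names are all made up, and only the 60KB/s figure comes from the comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginClass:
    max_slots: int            # concurrent sessions allowed for this class
    read_bytes_per_sec: int   # download speed cap
    can_upload: bool
    can_download: bool


# Hypothetical classes; names and limits are invented for illustration.
CLASSES = {
    "anonymous": LoginClass(max_slots=20, read_bytes_per_sec=60_000,
                            can_upload=False, can_download=True),
    "member": LoginClass(max_slots=10, read_bytes_per_sec=60_000,
                         can_upload=True, can_download=True),
    "browse_only": LoginClass(max_slots=5, read_bytes_per_sec=0,
                              can_upload=False, can_download=False),
}


def admit(class_name: str, active_sessions: dict) -> bool:
    """Admit a new session only if the class still has a free slot."""
    policy = CLASSES[class_name]
    return active_sessions.get(class_name, 0) < policy.max_slots


def min_download_seconds(class_name: str, file_bytes: int) -> float:
    """Lower bound on download time implied by the class's speed cap."""
    policy = CLASSES[class_name]
    if not policy.can_download:
        raise PermissionError(f"{class_name} may not download")
    return file_bytes / policy.read_bytes_per_sec
```

    With these made-up numbers, a 6 MB file takes a "member" at least 100 seconds at the 60KB/s cap.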

    As always, choose the tool that fits _your_ purpose, not the one that everyone says is "best"; they both have good and bad qualities. And http can be just as secure/insecure as any other protocol.

    --
    The time of day is 29:33.
  7. Re:Forget them both.... by Karamchand · · Score: 4, Informative

    I guess that's not what s/he wants. It sounds like anonymous downloading of publicly available files, so what do we need encryption for? There are no passwords to secure, no sensitive data to secure. You'd only get hassles from MSIE users who have never heard of sftp.

  8. Re:well, what're you trying to do? by Fastolfe · · Score: 5, Informative

    Furthermore, FTP allows for features such as resume, etc...

    So does HTTP. With the 'Range' header, you can retrieve only a portion of a resource.

    I agree that it really depends on the application, but for most practical "view directory, download file" purposes, there's no significant difference.

    If you wanted to interact with a directory structure, change ownerships, create directories, remove files, etc., it's generally easier to do this with FTP.
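
    For the resume case, a minimal Python sketch of what a client sends: a normal GET plus a Range header asking for everything from the first missing byte onward. The function name and URL are hypothetical; nothing is actually fetched here.

```python
import urllib.request


def resume_request(url: str, bytes_already_saved: int) -> urllib.request.Request:
    """Build a GET request asking only for the bytes we do not have yet."""
    request = urllib.request.Request(url)
    if bytes_already_saved > 0:
        # "bytes=N-" means "from offset N to the end of the resource".
        request.add_header("Range", f"bytes={bytes_already_saved}-")
    return request
```

    A server that supports ranges answers with 206 Partial Content; one that does not answers 200 with the whole file, so a client should check the status before appending to a partial download.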

  9. Re:hmm by cbv · · Score: 3, Informative

    If it starts loading it usually finishes, and I haven't run into any corruption problems.

    You may (just may) run into a routing or timeout problem, in which case the download will stop and you are forced to do the entire download again. Using the right client, e.g. ncftp, you can continue downloading partially downloaded files. An option, HTTP doesn't offer.

    With respect to the original question, I would set up a box offering both HTTP and FTP access.

  10. Re:hmm by toast0 · · Score: 5, Informative

    Using the right client, e.g. wget, you can resume HTTP downloads provided the server supports it (and I think most modern ones do).

  11. Re:hmm by tom.allender · · Score: 5, Informative

    you can continue downloading partially downloaded files. An option, HTTP doesn't offer.

    Plain wrong. RFC2068 section 10.2.7.
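
    The section cited is the one for the 206 Partial Content status. As a sketch of the server side, here is a simplified Python function that honors a single "bytes=start-end" range; suffix ranges and multi-part ranges are deliberately not handled, and the function name and return shape are assumptions for this example.

```python
from typing import Optional


def serve_range(body: bytes, range_header: Optional[str]):
    """Return (status, headers, payload) for at most one simple byte range."""
    if not range_header or not range_header.startswith("bytes="):
        return 200, {}, body
    spec = range_header[len("bytes="):]
    start_text, _, end_text = spec.partition("-")
    if "," in spec or not start_text.isdigit() or (end_text and not end_text.isdigit()):
        return 200, {}, body  # unsupported form: fall back to the whole body
    start = int(start_text)
    if start >= len(body):
        # Nothing to send from that offset onward.
        return 416, {"Content-Range": f"bytes */{len(body)}"}, b""
    end = min(int(end_text), len(body) - 1) if end_text else len(body) - 1
    if end < start:
        return 200, {}, body  # invalid range: ignore the header
    headers = {"Content-Range": f"bytes {start}-{end}/{len(body)}"}
    return 206, headers, body[start:end + 1]
```

    With a ten-byte body b"0123456789" and the header "bytes=4-", this returns status 206, a Content-Range of "bytes 4-9/10", and the payload b"456789".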

  12. FTP _MUCH_ faster than HTTP by trandles · · Score: 3, Informative

    It is possible to get approximately 80% of the theoretical maximum throughput of your pipe using a single FTP connection, whereas HTTP can hope for around 60% max for a single connection. The only thing faster than an FTP-based protocol (tftp, pftp) is a raw socket, and they rarely get better than 90%. Most schemes like pftp (parallel ftp, see this paper) are implemented to get as close to theoretical maximum throughput by having multiple data connections transfer the file. Of course you'll see the difference in performance more for large file transfers. The previous comment about HTTP being OK for small files is right on the mark...you will hardly notice a 20% gain when the transfers are only taking a few seconds.
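
    Taking the 80% and 60% figures above at face value, a quick Python calculation shows why the gap matters more for large files. The 10 Mbit/s link and the 650 MB file size are assumptions chosen for the example, and MB is taken as 1,000,000 bytes.

```python
def transfer_seconds(file_bytes: int, link_bits_per_second: float, efficiency: float) -> float:
    """Time to move file_bytes over a link that achieves `efficiency` of its raw rate."""
    return (file_bytes * 8) / (link_bits_per_second * efficiency)


LINK = 10_000_000         # hypothetical 10 Mbit/s pipe
SMALL = 6 * 1_000_000     # 6 MB, the upper end of the asker's file sizes
LARGE = 650 * 1_000_000   # hypothetical CD-sized image

for name, size in (("6 MB", SMALL), ("650 MB", LARGE)):
    ftp = transfer_seconds(size, LINK, 0.80)
    http = transfer_seconds(size, LINK, 0.60)
    print(f"{name}: FTP {ftp:.0f} s, HTTP {http:.0f} s, difference {http - ftp:.0f} s")
```

    Under those assumptions the 6 MB file differs by about two seconds (6 s versus 8 s), while the 650 MB file differs by several minutes.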

  13. OR, How about... by Anenga · · Score: 5, Informative

    P2P?

    I've written a tutorial on how you can use P2P on your website to save bandwidth, space etc. An obvious way to do this would be to run a P2P client and share the file on a simple PC & Cable Modem. This works, but it is a bit generic and unprofessional. A better way to do this may be to run a P2P client such as Shareaza on a webserver. You could then control the client using some type of remote service (Terminal Services, for example).

    P2P has its advantages. Such as:
    - Users who download the file also share it. This is especially useful if the client/network supports Partial File Sharing.
    - When you release the file using the P2P client, you need to upload to only a few users. Those users can then share the file using Partial File Sharing etc.
    - Unlike FTP and HTTP, they aren't connecting to your webserver. Thus, it saves bandwidth for you and allows people to browse your website for actual content, not media. (Though, media is content). In addition, there is usually a "Max # of Connections" allowed to a server or FTP. Not so on P2P.
    - P2P Clients have good queuing tools. At least, Shareaza does. It has a "Small Queue" and a "Large Queue". This basically allows you to have, say, 4 Upload slots for Large Files (Files that are above 10MB, for example) and one for Small Files (Under 10MB). Users who are waiting to download from you can wait in "Queue", instead of "Max users connected" on FTP.

    Though, at its core, all of the P2P I know of uses HTTP to send files etc. But the network layer helps file distribution tremendously.

  14. Depends on the situation. by SWPadnos · · Score: 3, Informative

    As many people have said, it depends.

    FTP has a great advantage in that you can request multiple files at the same time: mget instead of get. Additionally, you can use wildcards in the names, so you can select categories / directories of files with very short commands. (mget *.mp3 *.m3u ...)
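
    What mget does with those wildcards can be sketched in a few lines of Python: take the directory listing and keep the names that match any pattern. The function name is hypothetical; in a real script the listing might come from ftplib's FTP.nlst(), but here it is just a list of strings.

```python
from fnmatch import fnmatchcase


def mget_select(listing: list[str], patterns: list[str]) -> list[str]:
    """Return the names from `listing` that match any pattern, in listing order."""
    selected = []
    for name in listing:
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            selected.append(name)
    return selected
```

    For a listing of a.mp3, b.m3u, c.txt and d.mp3, the patterns *.mp3 and *.m3u select a.mp3, b.m3u and d.mp3, which a client would then fetch one after another.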

    Modern browsers allow you to transfer multiple files simultaneously, but they don't queue files for you - FTP will. This may be important if connections might get dropped - the FTP transfer will complete the first file, then move on to the next. In the event of an interruption, you will have some complete files, and one partial (which you can likely resume). For multiple simultaneous transfers - from an http browser - you may have some smaller files finished, but it's likely that all larger files will be partials, and will need to be retransmitted in their entirety, since http doesn't quite support resuming a previous download.

    So, if you're going to have a web page with many individual links, and you think that most people will download one or two files, http will probably suffice. If you expect people to want multiple files, or that they will want to be able to select groups of files with wildcards (tortuous with pointy-clicky things), then you should have FTP.

    It's not that hard to set up both, and that's probably the best solution.

    --
    - The Sigless Wonder
  15. Re:Forget them both.... by ZoneGray · · Score: 4, Informative

    This is slightly off-topic and sftp isn't what he should be using, but you can change the user's shell to /usr/bin/sftp and add it to /etc/shells. I've only tried it with OpenSSH under Linux, so YMMV. I got the idea from an OpenBSD list, though, so it should work most anywhere.

    To answer the original question, when given a choice, I always download by http. It usually takes less time to set up the connection, probably because of those ident lookups that most ftpd's still run by default.

  16. Re:Forget them both.... by Daytona955i · · Score: 4, Informative

    sftp is not the way to go if you want public access to files. sftp would be the way to go if you required an account to download/upload files.

    If the files you are serving are large then use ftp. If the files are smaller (less than 10MB) use http.

    http is great, I sometimes throw up a file on there if I need to give it to someone and it is too big to e-mail. (Happened recently with a batch of photos from the car show)

    Since I already have a web page it was easy to just throw the file in the http directory and provide the link in an e-mail.

    I like http for the most part. I doubt anyone will call you lame for using it, unless the files are huge.
    -Chris

  17. HTTP, hands down by Percy_Blakeney · · Score: 5, Informative
    As I understand it, your requirements are:

    a. Download only
    b. 1-6 MB files

    I also assume the following:

    c. You don't need intricate access controls
    d. Non-technical to somewhat-technical users

    I would say that you should go with HTTP for sure. Of course, you can provide both, but there are some key reasons for using HTTP.

    Easier Configuration: Perhaps I'm just not that swift, but I've found that web servers (including Apache) are easier to configure. This is especially true if you have any previous web server experience. Of course, the FTP server is more complex due to its additional features that HTTP doesn't have, but assuming that (c) is true, then you won't need to mess with group access control rights and file uploads.

    Speed: This whole "FTP is faster" stuff is not true. HTTP does not have a lot more overhead than FTP; it may even have less overhead than FTP in certain cases. Even when it does have more overhead, it is in the order of 100-200 bytes, which is too small to care about. HTTP always uses binary transfers and just spits out the whole file on the same connection as the request. FTP needs to build a data connection for every single data transfer, which can slow things down and even occasionally introduce problems.
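
    To put the 100-200 byte figure in proportion, a tiny Python calculation of overhead as a fraction of everything sent. It takes the comment's upper figure of 200 bytes at face value and assumes MB means 1,000,000 bytes.

```python
def overhead_fraction(overhead_bytes: int, file_bytes: int) -> float:
    """Share of the total bytes on the wire that is protocol overhead."""
    return overhead_bytes / (file_bytes + overhead_bytes)


for size_mb in (1, 6):
    fraction = overhead_fraction(200, size_mb * 1_000_000)
    print(f"{size_mb} MB file: {fraction:.4%} overhead")
```

    Even for the smallest file in the 1-6 MB range, this comes to roughly two hundredths of a percent.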

    Easier for Users: Given assumption (d), your users will be much more familiar with HTTP URLs than FTP addresses. You could just use FTP URLs and let their web browsers download the files, but then you lose the benefit of resuming partial downloads.

    Simple Access Controls: Though some people need to have complex user access rules, you may very well just need simple access controls. HTTP provides this (look at Apache's .htaccess file), and you can even integrate Apache's authentication routines into PAM, if you are really hard core.

    There are a few main areas where FTP currently holds sway:

    Partial Downloads: Web browsers typically don't support partial downloads, but the fact of the matter is that the HTTP protocol does support it (see the Range header.) The next generation of web browsers may very well include this feature.

    User Controls: Addressed above.

    File Uploads: Again, HTTP does support this feature but most browsers don't support it well. Look to WebDAV in the future to provide better support.

    In summary, just use HTTP unless you need complex access rules, resumption of partial download, or file uploading. It will be easier both on you and your users.

  18. My experiences with FTP and HTTP downloads by argonaut · · Score: 5, Informative

    Being in IT for a large Fortune 500 company that sells an operating system among other things (no, not Microsoft), I can share some of my experiences with you. So take it for what it is worth.

    Our FTP servers run both HTTP and FTP providing the same content in the same directory structure. There are five servers that transfer an average of 1-2 TB (terabyte) per month each, so they are fairly busy. On a busy month each server can go as high as 7 TB of data transferred. File sizes range from 1 KB to whole CD-ROM and DVD-ROM images. I think the single largest file is 3 GB.

    The logs show HTTP becoming steadily more popular over the last several years, with no sign of stopping. Currently 70% of all downloads from the "FTP" servers are via HTTP and the remaining 30% are via FTP. Six years ago (I lost the logs from before this time, they are on a backup tape but I am way too lazy to get that data), it was completely reversed: 75% of downloads were via FTP and 25% were via HTTP. 90% of all transfers are done with a web browser as opposed to an FTP client or wget or something.

    One thing we learned was that many system administrators will download via FTP from the command line directly from the FTP server, especially during a crisis they are trying to resolve. They do this from the system itself and not a workstation. The reasons for this are a bit of a mystery. Feedback has shown that we should never get rid of this or we might be assassinated by our customers. We thought about it once and put out feelers.

    I would say if you don't need to deal with incoming files and your file size is not too large then stick with HTTP. Anything over about 10 MB should go to the FTP server. An FTP server can be more complicated. It seems like the vulnerabilities in FTP daemons have died down in the past year or so. Also, fronting an FTP server with a Layer 4 switch was a lot more tricky because of all the ports involved. If you want people to mirror you then go with FTP or rsync for private mirroring. In reading the feedback, most power users seem to prefer FTP, perhaps because that is what they are used to. Also, depending on the amount of traffic you might need to consider gigabit ethernet.

    The core dumps being uploaded are getting to be huge. Some of those systems have a lot of memory!

  19. Re:My opinion by snol · · Score: 3, Informative

    It'd be nice if Phoenix and Mozilla would acquire that ability. For some reason the developers' stated position is that it won't happen anytime soon, but one can always vote for the bug anyway.

  20. HTTP simultaneous connections are expensive. by androse · · Score: 3, Informative

    The problem with using HTTP for large file downloads is that, in most cases, it's cheaper resource-wise to spawn multiple simultaneous FTP connections than HTTP connections. Of course, this only becomes a real problem if you have more than a few hundred virtual hosts on a single box. So save your httpd processes, and use FTP for large files.

  21. The reason is simple: congestion! by ZorinLynx · · Score: 5, Informative

    Starting multiple TCP connections for a single file download can be advantageous, because of congested network paths.

    If there are 500 TCP downloads occurring, each download will theoretically get 1/500th the bandwidth.

    Therefore, by opening multiple TCP connections, you will increase the amount of bandwidth for your transfer, at a cost to everyone else using the connection. This is because you've effectively doubled the size of your receive window (one for each connection), causing the host you are downloading from to stuff that many more packets down the pipe.

    The problem is, when everyone does it, it completely negates any advantage to using this method. It also leads to packet loss, since you have that many more TCP connections (each with its own receive window) fighting for pieces of the pie.
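
    The arithmetic behind this, as a small Python sketch. It assumes an idealized link where every TCP connection gets an equal share, which real networks only approximate; the function name is made up.

```python
def share_of_link(my_connections: int, other_connections: int) -> float:
    """Fraction of the link one user gets if every connection gets an equal share."""
    return my_connections / (my_connections + other_connections)


alone = share_of_link(1, 499)          # one of 500 single-connection downloads
greedy = share_of_link(4, 499)         # I open 4 connections, nobody else does
everyone = share_of_link(4, 4 * 499)   # all 500 users open 4 connections each
print(f"{alone:.4f} {greedy:.4f} {everyone:.4f}")
```

    Opening four connections while the other 499 users stay at one raises the share from 1/500 to 4/503, almost four times as much. Once all 500 users do the same, everyone is back at 1/500, only now with four times as many connections competing.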

  22. Re:hmm by mvdw · · Score: 3, Informative

    Especially since http is faster to connect to than ftp.

    I disagree. Sure, it's easy to browse via http and get one or two files, but when you're trying to suck down the entire directory, http blows (excuse the pun).

    What's faster for getting a whole directory than:

    wget -t 0 -c ftp://ftp.server.name/path/to/dir/*

    Doesn't work with http, because the directory listing doesn't work with wget, at least the version I have.

  23. Re:Forget them both.... by mr.+methane · · Score: 5, Informative

    I provide a mirror for a couple of largish open-source sites, and several of them specifically request that sites provide FTP service as preferred over HTTP. A couple of reasons:

    1. Scripts which need to get a list of files before choosing which ones to download - automated installers and the like - are easier to implement with FTP.

    2. FTP generally seems to chew up less CPU on the host. I can serve 12mb/s of traffic all day long on a P-II 450 box with only 256mb of memory.

    3. "download recovery" (after losing connection, etc.) seems to work better in FTP than HTTP.

  24. FTP The Easy Way by l0gic_f0x · · Score: 3, Informative

    I run an FTP server for similar file sizes (1-6 meg) using a Windows 2000 Pro box (yeah, I know I should stick to my preachings about the wonders of Linux, but I'm not 100% confident in my ability to lock down Linux yet), and I'm using Bulletproof FTP server, which is hella cheap but has every feature you could need and is very secure. I highly recommend it. It handles beautifully.

    --
    "Self-destruction might be the answer" --Tyler Durden
  25. Re:Different, not better or worse by sir99 · · Score: 5, Informative

    lynx, wget, and fetch, all work over http.

    Wget (don't know fetch, but assuming it's like Wget) doesn't let you browse to a file; you have to know the full path in advance, or use recursive downloading, or guess with pattern matching.

    Lynx lets you browse, but you can't do globbing, so you see lots of irrelevant crap, and you have to select files to download one at a time.

    For getting (possibly multiple) files whose location you don't know in advance, FTP is more flexible and efficient.

    --
    The ocean parts and the meteors come down
    Laid out in amber, baby.
  26. Apply these three questions... by almaw · · Score: 5, Informative
    You should use FTP if you answer yes to any of the following questions:
    1. Do you have bandwidth issues? If you are serving files to many people, FTP servers allow maximum concurrent users, which can be useful. I know you can do this with HTTP, but it's difficult to segment the downloading >1Mb files traffic from the normal site traffic. A separate service also allows you to use all the Quality of Service stuff in the 2.4 kernel nicely.
    2. Do you have a large array of files that the user might want to download, such that using an FTP client to ctrl+select multiple files is the right answer compared to having your users click on twenty links and have to cope with twenty dialog boxes?
    3. Do your users need to be able to upload files to you? This can be done with HTTP, but you'll need some PHP processing or similar on the server, it doesn't support resuming, and it won't work through many company firewalls, and therefore isn't a good option. HTTP uploading is particularly hopeless for large files, as it provides no user feedback.
    However, you should NOT use FTP if you answer no to either of these:
    1. Are you running some flavour of unix? There just aren't any robust Windows FTP servers. Yes, I'm prepared for the flame war about this. :)
    2. Can you be bothered to keep your FTPd patched? ProFTPd and WU-FTPd are both frequent appearers on bugtraq. You need to stay on top of the patches, or you will be 0wn3d.
    Simple, see? :)
  27. FTP is slower due to TCP Window Size by Anonymous Coward · · Score: 5, Informative

    FTP implementations frequently use a fixed, small window size. HTTP on the other hand will honor the system limit, almost always larger even without tuning.

    Dramatically simplified, it means that the connection can send a lot more packets without hearing back from the far end, enabling the connection to reach higher speeds (imagine a phone call where you had to say 'okay' after every word the other person said. Now imagine only having to say it after every sentence. Much faster.)

    The tiny window size of (most crappy legacy implementations of) FTP starts to affect download speed at just 25ms latency, and has a huge effect over 50ms.

    A properly tuned system with HTTP can make a single high-latency transfer hundreds or even thousands of times faster than FTP.
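
    The underlying limit is that a sender can have at most one window of unacknowledged data in flight per round trip, so throughput is bounded by window size divided by round-trip time. A small Python sketch; the 8 KiB and 64 KiB window sizes are assumptions picked for illustration, not measurements of any particular FTP or HTTP implementation.

```python
def max_throughput_bytes_per_sec(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on one TCP connection's throughput: one window per round trip."""
    return window_bytes / rtt_seconds


SMALL_WINDOW = 8 * 1024    # hypothetical fixed, small window
LARGE_WINDOW = 64 * 1024   # hypothetical larger system default

for rtt_ms in (25, 50, 100):
    rtt = rtt_ms / 1000
    small = max_throughput_bytes_per_sec(SMALL_WINDOW, rtt)
    large = max_throughput_bytes_per_sec(LARGE_WINDOW, rtt)
    print(f"{rtt_ms} ms: {small / 1024:.0f} KiB/s vs {large / 1024:.0f} KiB/s")
```

    With these numbers the small window caps a connection at 160 KiB/s on a 50 ms path, however fast the link is, and the cap halves every time the latency doubles.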

    Relevant links:
    http://www.psc.edu/networking/perf_tune.html
    http://www.nlanr.net/NLANRPackets/v1.3/windows_tcptune.html
    http://dast.nlanr.net/Projects/Autobuf/faq.html

  28. Re:Different, not better or worse by CmdrWass · · Score: 3, Informative

    I tend to agree with this, but for different reasons.

    If you are downloading a file off of a remote server, then there are two possibilities:

    1) you know the exact address to the file you are looking for... in this case ftp provides no superior advantage over using lynx or wget since in either case you could have been given the direct URL... either provided as an http url or an ftp url. Basically my point here is that an ftp url is no more or less useful or easy to remember than an http url.

    2) you don't know the address of the file you are looking for... therefore you are pretty much required to browse via http, to find the site (or page) you want to download from... so since you are already forced to browse for the site, then you might as well use the browser to download. For most people that use graphical browsers, this is great... for those of us (myself included) that use shell browsers (ie lynx and links), this poses little problem as well (unless javascript is required to download a file... I friggen hate javascript... people who use javascript in their websites and have a choice should be fired [note, I use javascript in my work's website... but they make me.. I don't have a choice]).

  29. Re:hmm by grolim13 · · Score: 4, Informative

    wget -r -l1 http://http.server.name/path/to/dir/ will suck down all the files in that directory; wget -r -np http://http.server.name/path/to/dir/ will pull it down recursively.
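
    Recursive fetching over HTTP works by reading the links out of the server's HTML index page. A rough Python sketch of that step, not wget's actual implementation: the class and function names are hypothetical, and the filtering rules are simplified guesses at what a typical generated index needs.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.hrefs.append(value)


def files_in_listing(base_url: str, page: str) -> list[str]:
    """Return absolute URLs of links that stay below base_url and are not directories."""
    collector = LinkCollector()
    collector.feed(page)
    urls = []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)
        # Skip parent links, sort-order links such as "?C=N;O=D", and subdirectories.
        if absolute.startswith(base_url) and "?" not in absolute and not absolute.endswith("/"):
            urls.append(absolute)
    return urls
```

    Refusing to follow links that leave base_url is roughly the role -np plays in the second command, and skipping subdirectories corresponds to stopping at depth one.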

  30. Resource usage by MattBurke · · Score: 3, Informative

    I used to run a server which distributed ~3TB/month. Initially I served these files via proftpd, but it soon became apparent that ftp daemons are far too bulky for high-volume serving.

    Enter apache. On the same hardware which keeled under around 30-50 ftp sessions, I could handle over 400 concurrent http sessions, with plenty of ram left over for the vital caching :)