Slashdot Mirror


Good POP3 Server for Huge Mailboxes?

brainchill asks: "I've got about 10,000 users split between a couple of quad 550 xeon machines. The machines have 2GB of ram. The problem is that the UW POP3 server takes a huge hit in both cpu and memory utilization when a 40+MB mail spool is requested via POP3. Sometimes it's bad enough to drag the monster boxes to their knees. What other POP3 daemons do you guys have experience with and how do they perform with large mailboxes"

10 of 57 comments

  1. MS by Anonymous Coward · · Score: 5, Funny

    MS Exchange.

    You know you want to.

  2. Stop using mbox and switch to Maildir by Electrum · · Score: 5, Insightful

    You won't get good performance with mbox, period. You need to switch to Maildir. qmail-pop3d works great with Maildir. Maildir scales far better than mbox since it doesn't have to parse out the individual messages. It also doesn't have to use locking. This also makes Maildir inherently more reliable than mbox. There are many tools available to convert between mbox and Maildir.
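
    For what it's worth, one of those conversions can be sketched with nothing but Python's standard `mailbox` module. This is a rough illustration, not a production migration tool: the function name `mbox_to_maildir` and the paths are made up for the example, and it assumes the spool is not receiving deliveries from an MTA that ignores the lock.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from an mbox file into a Maildir; return the count."""
    src = mailbox.mbox(mbox_path, create=False)
    dest = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    try:
        # Lock the spool so a cooperating delivery agent does not append mid-copy.
        src.lock()
        for message in src:
            # Wrapping in MaildirMessage carries over mbox status flags
            # (e.g. "read") instead of dropping everything into new/.
            dest.add(mailbox.MaildirMessage(message))
            copied += 1
    finally:
        src.unlock()
        src.close()
    return copied
```

    Something like `mbox_to_maildir("/var/mail/someuser", "/home/someuser/Maildir")` would be the call, with both paths being placeholders. Note that iterating the mbox is exactly the "parse out the individual messages" cost described above; you just pay it once.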

    1. Re:Stop using mbox and switch to Maildir by phaze3000 · · Score: 5, Informative

      Actually, maildir *can* be just as bad, if not worse.

      I work for an ISP where we have ~ 50 000 email users. Maildir's great when you have a few messages, and if one of these messages happens to be big then it doesn't matter. However, if a user has tens of thousands of emails of whatever size in their mailbox (happens far, far more often than you might think) then just getting a list of files in the directory can take an age. In the scenario where a user has masses of small messages (sub 2k) then mbox would probably be faster.
      Whilst I'd certainly recommend using Maildir over mbox, it's certainly not going to solve all the problems.

      --
      Blaming GW Bush for the Iraq war is like blaming Ronald McDonald for the poor quality of food.

    2. Re:Stop using mbox and switch to Maildir by Electrum · · Score: 5, Informative

      However, if a user has tens of thousands of emails of whatever size in their mailbox (happens far, far more often than you might think) then just getting a list of files in the directory can take an age.

      This is a filesystem problem. Use a better one. On FreeBSD, enable dirhash. On Linux, use ReiserFS or ext3 with htree.
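
      To make the cost concrete: POP3's STAT reply reports the number of messages and their total size, so a server working from a plain Maildir has to touch every directory entry to answer it. A minimal Python sketch of that work, assuming a standard new/ and cur/ layout (the function name `maildir_stat` is made up here, and real servers may cache or index this instead):

```python
import os


def maildir_stat(maildir_path):
    """Return (message_count, total_bytes) across a Maildir's new/ and cur/."""
    count = 0
    total = 0
    for subdir in ("new", "cur"):
        # One directory scan plus one stat() per message: cheap for a few
        # hundred files, slow on a filesystem with linear directory lookups
        # once there are tens of thousands.
        with os.scandir(os.path.join(maildir_path, subdir)) as entries:
            for entry in entries:
                if entry.is_file():
                    count += 1
                    total += entry.stat().st_size
    return count, total
```

      Timing this against a test directory with a few tens of thousands of small files is a quick way to see whether your filesystem's directory handling is the bottleneck.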

  3. mbx by zsmooth · · Score: 5, Informative

    I use mbx personally (NOT mbox) and it scales wonderfully. The mailbox is fully indexed to speed up searching and can be accessed simultaneously by various processes. See here for more information.

  4. Yes, qmail by Scarabaeus · · Score: 5, Informative

    I can only second that. qmail runs like a charm and scales.

    Check out cr.yp.to/qmail.html and www.qmail.org

  5. Cyrus? by Pathwalker · · Score: 5, Informative

    Have you looked at Cyrus? It is probably best known as an IMAP server, but it has very nice pop3 support as well.

    Cyrus stores messages in a variation of the maildir format - it maintains a database of the flags, headers, etc for the messages in a folder to speed up access.

    Notable features include shared mail folders (with independent views), quotas, multiple mail partitions (with the ability to move users across partitions on the fly), duplicate email checking, and a server side filtering language (sieve).

    Most of this would probably be most useful if you were using IMAP, but it should scale quite well as a POP server.

    1. Re:Cyrus? by Matthew Weigel · · Score: 5, Informative

      Mostly right, in a very broad non-technical way.

      Cyrus's mailstore system is actually quite different from Maildir, in particular because it doesn't need to play games with user processes (Maildir handles read/unread messages the way it does so that multiple processes can manipulate messages at the same time, for instance).

      Also, most of the abilities you list are simply unavailable via POP; Cyrus is massive overkill for a POP server, and would require even more resources (particularly disk: the users that have 40MB spool files now could probably find themselves with 2GB of mail if you let them... and even the non-abusive users would require more storage for IMAP than for POP).

      Incidentally, we use qpopper to handle POP - and quite a few users go over 40MB without killing our (not particularly beefy, and not dedicated mail) servers. I suspect the real problem is that the guy is using uw-imap's POP server - the author of which is notoriously unconcerned with the performance (or lack thereof) of spoolfiles being served over POP. Which is perfectly reasonable - he writes an IMAP server, he should be concerned with IMAP performance, and if he writes a better mailbox format (he has) then he should also concern himself with that and not a 20+ year old format.

      Actually, if one were so inclined, IMAP makes a better POP than POP3 - just disable the ability to create new folders, and use a better mailbox format (mbx, Maildir, ...).

      --
      --Matthew

  6. Re:Ever think of FTP? by 0x0d0a · · Score: 5, Insightful

    I have never understood the hostility of mail admins to large emails. Simply because a number of mail servers have piss-poor performance with large emails is not a reason to go crazy over them. Fix the mail server.

    Oh, and there's the fact that some people still use POP3, and their lives are made miserable when they're working with large files over a modem. People should use IMAP.

    There are quite legitimate uses for file transfer via email. Most people (i.e. not UNIX geeks) do not want to maintain a file server and keep their system up 24/7. The other person may not be at the computer...this puts it in their "queue of things to deal with".

    If you mean "why don't people use ftp to transfer files to a third, intermediary system that acts as a drop box"...well, that's doing exactly what you're doing with SMTP. Why *not* do it with SMTP?

    Finally, from a user perspective, mail is much more convenient to use than dedicated file transfer protocols. Most people constantly use a mail program and know how to use it reasonably well. Everyone has an email address (a more useful mapping to users than an IP address that FTP would require), and there are no worries about different companies having different places to drop files. Email lets users sort and date emails, and tag files as being from some user. It makes it accessible from anywhere they can get at their email.

    Another thing that mail admins should live with is large mailboxes -- not just a single mail, but people leaving mail on the server, or keeping old mail around on the server. This is one of the *best* things to happen to IT. It's been the holy grail of NC designers for years. Centralize data storage to reduce costs, allow reuse of hardware, and facilitate backup.

    Frankly, if anything, mail should be extended to have *better* support for this (like resumable transfers, etc). The FTP model -- where you have machines that are always up 24/7, users that associate well with "computers" rather than "other people", users that are familiar with a larger number of programs, and a network that has no firewall or other restrictions -- simply doesn't fit the reality of what's going on at businesses today. It's fantastic for techies who want to work with their own systems, but less good for your average end users.

  7. Mailbox format by GoRK · · Score: 5, Informative

    Good christ, you'd think that by the time you outgrew a QUAD XEON mailserver with only 5000 users, you'd have been reevaluating performance before plunking down what must have been close to 10-15 grand or more at the time on a second one!

    Your mailbox format is all wrong. Storing all messages in a single file is pretty much the worst way to do anything useful. You want to explore some alternative storage format such as mbx or maildir. I personally use maildir on ReiserFS on Linux and have good luck. (The filesystem is VERY important for maildirs. ReiserFS's block tail support and directory indexing give it major disk space and speed advantages for a maildir mailserver application, while running something like maildirs on XFS would instantly kill your server.) I hear mbx is pretty good too if you're stuck on some sort of standard filesystem, since it uses indexing and fewer files than maildir. The downside is that it's not as immediately parseable as maildir or mbox, i.e. you couldn't write a script to, say, delete extremely high-scoring spam messages from any user who hasn't checked their mail in over 3 months, or do the other things ISPs might routinely do to maintain their servers.
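
    As a rough idea of the kind of maintenance script meant here, this is a Python sketch against a Maildir. Everything specific in it is an assumption for illustration: the `X-Spam-Score` header name depends on your spam filter, the thresholds are arbitrary, and using the mtime of cur/ as a stand-in for "hasn't checked their mail" is a crude heuristic, not how any particular POP3 server records logins. It only reports candidate files and leaves the deleting to you.

```python
import os
import time
from email.parser import BytesParser

SCORE_HEADER = "X-Spam-Score"  # hypothetical header name; depends on your filter


def stale_spam(maildir_path, min_score=15.0, idle_days=90, now=None):
    """Return paths of high-scoring spam in a Maildir whose owner looks idle."""
    now = time.time() if now is None else now
    # Assumption: an untouched cur/ directory means nobody has read mail lately.
    cur_mtime = os.stat(os.path.join(maildir_path, "cur")).st_mtime
    if now - cur_mtime < idle_days * 86400:
        return []
    parser = BytesParser()
    hits = []
    for subdir in ("new", "cur"):
        with os.scandir(os.path.join(maildir_path, subdir)) as entries:
            for entry in entries:
                if not entry.is_file():
                    continue
                # One message per file, so only the headers need to be read.
                with open(entry.path, "rb") as fh:
                    headers = parser.parse(fh, headersonly=True)
                try:
                    score = float(headers.get(SCORE_HEADER, ""))
                except (TypeError, ValueError):
                    continue  # header missing or not a plain number
                if score >= min_score:
                    hits.append(entry.path)
    return hits
```

    Feeding the returned paths to `os.remove` would do the actual cleanup. The point is that with one file per message this is a directory walk and a header read, with no spool locking or rewriting involved.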

    Finally, if you plan to scale way up there (60,000+), you need to start looking at better cluster systems than just a couple machines. Specialize the tasks of several machines to do mail storage or talk POP3. Look at something like POPular for specialized POP3 server clustering software.

    ~GoRK