Good POP3 Server for Huge Mailboxes?
brainchill asks: "I've got about 10,000 users split between a couple of quad 550 Xeon machines. The machines have 2GB of RAM. The problem is that the UW POP3 server takes a huge hit in both CPU and memory utilization when a 40+MB mail spool is requested via POP3. Sometimes it's bad enough to drag the monster boxes to their knees. What other POP3 daemons do you guys have experience with, and how do they perform with large mailboxes?"
MS Exchange.
You know you want to.
You won't get good performance with mbox, period. You need to switch to Maildir. qmail-pop3d works great with Maildir. Maildir scales far better than mbox since it doesn't have to parse out the individual messages. It also doesn't have to use locking. This also makes Maildir inherently more reliable than mbox. There are many tools available to convert between mbox and Maildir.
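For anyone who wants to see what a conversion involves, here is a minimal sketch using Python's standard mailbox module. It is not one of the existing conversion tools, just an illustration; the function name and the paths are made up, and a real migration would also need to stop mail delivery while it runs.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from an mbox file into a Maildir; return the count."""
    src = mailbox.mbox(mbox_path, create=False)
    dst = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    src.lock()  # mbox needs a lock while we read; Maildir delivery does not
    try:
        # Iterating an mbox means scanning the whole file for "From " separator
        # lines, which is the parsing cost that grows with spool size.
        for message in src:
            dst.add(message)  # each message becomes its own file in the Maildir
            copied += 1
    finally:
        src.unlock()
        src.close()
        dst.close()
    return copied
```

Something like mbox_to_maildir("/var/mail/someuser", "/home/someuser/Maildir") would be the typical call, with both paths being placeholders for wherever your spools actually live.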
I use mbx personally (NOT mbox) and it scales wonderfully. The mailbox is fully indexed to speed up searching and can be accessed simultaneously by various processes. See here for more information.
I can only second that. qmail runs like a charm and scales.
Check out cr.yp.to/qmail.html and www.qmail.org
Have you looked at Cyrus? It is probably best known as an IMAP server, but it has very nice pop3 support as well.
Cyrus stores messages in a variation of the maildir format - it maintains a database of the flags, headers, etc. for the messages in a folder to speed up access.
Notable features include shared mail folders (with independent views), quotas, multiple mail partitions (with the ability to move users across partitions on the fly), duplicate email checking, and a server side filtering language (sieve).
Most of this would probably be most useful if you were using IMAP, but it should scale quite well as a POP server.
I have never understood the hostility of mail admins to large emails. Simply because a number of mail servers have piss-poor performance with large emails is not a reason to go crazy over them. Fix the mail server.
Oh, and there's the fact that some people still use POP3, and their lives are made miserable when they're working with large files over a modem. People should use IMAP.
There are quite legitimate uses for file transfer via email. Most people (i.e. not UNIX geeks) do not want to maintain a file server and keep their system up 24/7. The other person may not be at the computer...this puts it in their "queue of things to deal with".
If you mean "why don't people use ftp to transfer files to a third, intermediary system that acts as a drop box"...well, that's doing exactly what you're doing with SMTP. Why *not* do it with SMTP?
Finally, from a user perspective, mail is much more convenient to use than dedicated file transfer protocols. Most people constantly use a mail program and know how to use it reasonably well. Everyone has an email address (a more useful mapping to users than an IP address that FTP would require), and there are no worries about different companies having different places to drop files. Email lets users sort and date emails, and tag files as being from some user. It makes files accessible from anywhere they can get at their email.
Another thing that mail admins should live with is large mailboxes -- not just a single mail, but people leaving mail on the server, or keeping old mail around on the server. This is one of the *best* things to happen to IT. It's been the holy grail of NC designers for years. Centralize data storage to reduce costs, allow reuse of hardware, and facilitate backup.
Frankly, if anything, mail should be extended to have *better* support for this (like resumable transfers, etc). The FTP model -- where you have machines that are always up 24/7, users that associate well with "computers" rather than "other people", users that are familiar with a larger number of programs, and a network that has no firewall or other restrictions -- simply doesn't fit the reality of what's going on at businesses today. It's fantastic for techies who want to work with their own systems, but less good for your average end users.
May we never see th
Good christ, you'd think that by the time you outgrew a QUAD XEON mailserver with only 5000 users, you'd have been reevaluating performance before plunking down what must have been close to 10-15 grand or more at the time on a second one!
Your mailbox format is all wrong. Storing all messages in a single file is pretty much the worst way to do anything useful. You want to explore some alternative storage format such as mbx or maildir. I personally use maildir on ReiserFS on Linux and have good luck. (The filesystem is VERY important for maildirs. ReiserFS's block tail support and directory indexing give it major disk space and speed advantages for a maildir mailserver application, while running something like maildirs on XFS would instantly kill your server.) I hear mbx is pretty good too, if you're stuck on some sort of standard filesystem, since it uses indexing and fewer files than maildir. The downside is that it's not as immediately parseable as maildir or mbox, i.e. you couldn't write a script to, say, delete extremely high scoring spam messages from any user who hasn't checked their mail in over 3 months, or other things ISPs might routinely do to maintain their servers.
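To give an idea of the kind of maintenance script I mean, here is a rough Python sketch against a maildir. Everything specific in it is an assumption: the X-Spam-Score header (your filter may write something different), the 15.0 cutoff, 90 days standing in for "3 months", and using the mtime of cur/ as a stand-in for "last time the user checked mail".

```python
import mailbox
import os
import time

IDLE_SECONDS = 90 * 24 * 3600  # "over 3 months", approximated as 90 days
SPAM_THRESHOLD = 15.0          # hypothetical cutoff for "extremely high scoring"


def is_idle(maildir_path, now=None):
    """Heuristic: treat the mtime of cur/ as the last time a client touched the box."""
    now = time.time() if now is None else now
    last_touch = os.path.getmtime(os.path.join(maildir_path, "cur"))
    return now - last_touch > IDLE_SECONDS


def spam_score(message):
    """Read the assumed X-Spam-Score header; return 0.0 if absent or malformed."""
    try:
        return float(message.get("X-Spam-Score", "0"))
    except (TypeError, ValueError):
        return 0.0


def purge_spam(maildir_path, threshold=SPAM_THRESHOLD):
    """Delete messages scoring above threshold; return how many were removed."""
    box = mailbox.Maildir(maildir_path, create=False)
    # Collect keys first so we are not deleting while iterating.
    doomed = [key for key, msg in box.iteritems() if spam_score(msg) > threshold]
    for key in doomed:
        box.remove(key)  # one message is one file, so this is a plain unlink
    box.close()
    return len(doomed)
```

You would loop over your user directories and call purge_spam(path) only where is_idle(path) is true. The point is that with one file per message, none of this needs to rewrite or lock a giant spool file.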
Finally, if you plan to scale way up there (60,000+), you need to start looking at better cluster systems than just a couple machines. Specialize the tasks of several machines to do mail storage or talk POP3. Look at something like POPular for specialized POP3 server clustering software.
~GoRK