Good POP3 Server for Huge Mailboxes?
brainchill asks: "I've got about 10,000 users split between a couple of quad 550 Xeon machines. The machines have 2GB of RAM. The problem is that the UW POP3 server takes a huge hit in both CPU and memory utilization when a 40+MB mail spool is requested via POP3. Sometimes it's bad enough to drag the monster boxes to their knees. What other POP3 daemons do you guys have experience with, and how do they perform with large mailboxes?"
I use mbx personally (NOT mbox) and it scales wonderfully. The mailbox is fully indexed to speed up searching and can be accessed simultaneously by various processes. See here for more information.
I can only second that. qmail runs like a charm and scales.
Check out cr.yp.to/qmail.html and www.qmail.org
I work for an ISP where we have ~ 50 000 email users. Maildir's great when you have a few messages, and if one of these messages happens to be big then it doesn't matter. However, if a user has tens of thousands of emails of whatever size in their mailbox (happens far, far more often than you might think) then just getting a list of files in the directory can take an age. In the scenario where a user has masses of small messages (sub 2k) then mbox would probably be faster.
Whilst I'd certainly recommend using Maildir over mbox, it's not going to solve all the problems.
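To make the directory-listing cost concrete, here is a rough sketch of the work a POP3 server has to do to answer STAT on a Maildir: enumerate every file in new/ and cur/ and look up its size. The function name `maildir_stat` is made up for illustration and is not taken from any real POP3 daemon.

```python
import os

def maildir_stat(maildir_path):
    """Return (message_count, total_bytes), roughly what POP3 STAT reports.

    Hypothetical helper: one directory entry and one stat() per message,
    so the cost grows with the number of messages, not their size.
    """
    count = 0
    total = 0
    for sub in ("new", "cur"):
        folder = os.path.join(maildir_path, sub)
        with os.scandir(folder) as entries:
            for entry in entries:
                # Skip dotfiles and anything that is not a regular file.
                if entry.name.startswith(".") or not entry.is_file():
                    continue
                count += 1
                total += entry.stat().st_size
    return count, total
```

With tens of thousands of messages this loop is where the time goes, which is why the per-message size matters so little here.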
Have you looked at Cyrus? It is probably best known as an IMAP server, but it has very nice pop3 support as well.
Cyrus stores messages in a variation of the maildir format: it maintains a database of the flags, headers, etc. for the messages in a folder to speed up access.
Notable features include shared mail folders (with independent views), quotas, multiple mail partitions (with the ability to move users across partitions on the fly), duplicate email checking, and a server side filtering language (sieve).
Most of this would probably be most useful if you were using IMAP, but it should scale quite well as a POP server.
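The general idea of keeping a side index next to one-file-per-message storage can be sketched as below. This is NOT Cyrus's actual on-disk format or code; the SQLite table and the names `open_index`, `refresh_index`, and `index_stat` are assumptions made up purely to show why an index lets the server answer size queries without touching every message file.

```python
import os
import sqlite3

def open_index(db_path):
    """Open (or create) a hypothetical per-folder index database."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS msgs ("
        "name TEXT PRIMARY KEY, size INTEGER, seen INTEGER DEFAULT 0)"
    )
    return db

def refresh_index(db, folder):
    """Bring the index in line with the files currently in the folder."""
    known = {row[0] for row in db.execute("SELECT name FROM msgs")}
    on_disk = set(os.listdir(folder))
    for name in on_disk - known:
        # Only new messages need a stat(); known ones are already indexed.
        size = os.stat(os.path.join(folder, name)).st_size
        db.execute("INSERT INTO msgs (name, size) VALUES (?, ?)", (name, size))
    for name in known - on_disk:
        db.execute("DELETE FROM msgs WHERE name = ?", (name,))
    db.commit()

def index_stat(db):
    """Answer a STAT-style query from the index alone."""
    row = db.execute("SELECT COUNT(*), COALESCE(SUM(size), 0) FROM msgs").fetchone()
    return row[0], row[1]
```

A real server would update the index at delivery time rather than rescanning the directory, so the refresh step here is only a simplification.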
However, if a user has tens of thousands of emails of whatever size in their mailbox (happens far, far more often than you might think) then just getting a list of files in the directory can take an age.
This is a filesystem problem. Use a better one. On FreeBSD, enable dirhash. On Linux, use ReiserFS or ext3 with htree.
Good christ, you'd think that by the time you outgrew a QUAD XEON mailserver with only 5000 users, you'd have been reevaluating performance before plunking down what must have been close to 10-15 grand or more at the time on a second one!
Your mailbox format is all wrong. Storing all messages in a single file is pretty much the worst way to do anything useful. You want to explore some alternative storage format such as mbx or maildir. I personally use maildir on ReiserFS on Linux and have good luck. (The filesystem is VERY important for maildirs. ReiserFS's tail packing and directory indexing give it major disk space and speed advantages for a maildir mailserver application, while running something like maildirs on XFS would instantly kill your server.) I hear mbx is pretty good too if you're stuck on some sort of standard filesystem, since it uses indexing and fewer files than maildir. The downside is that it's not as immediately parseable as maildir or mbox, i.e. you couldn't easily write a script to, say, delete extremely high-scoring spam messages from any user who hasn't checked their mail in over 3 months, or do the other things ISPs might routinely do to maintain their servers.
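As a rough idea of how simple that kind of maintenance script is with maildir, here is a sketch. It rests on several assumptions that you should check against your own setup: the spam filter writes a SpamAssassin-style `X-Spam-Status: ... score=N` header, the modification time of cur/ is a fair stand-in for "last time the user checked mail", and the names `spam_score` and `purge_idle_spam` plus the thresholds are made up for the example.

```python
import os
import re
import time
from email.parser import BytesHeaderParser

# Assumed header shape: "X-Spam-Status: Yes, score=22.5 required=5.0 ..."
SCORE_RE = re.compile(r"score=(-?\d+(?:\.\d+)?)")

def spam_score(path):
    """Return the spam score found in the message headers, or None."""
    with open(path, "rb") as f:
        headers = BytesHeaderParser().parse(f)
    match = SCORE_RE.search(str(headers.get("X-Spam-Status", "")))
    return float(match.group(1)) if match else None

def purge_idle_spam(maildir_path, min_score=15.0, idle_days=90, now=None):
    """Delete high-scoring spam from a maildir nobody has touched lately.

    Returns the list of removed file names. `now` exists only so the idle
    check can be exercised without waiting three months.
    """
    now = time.time() if now is None else now
    cur = os.path.join(maildir_path, "cur")
    # Assumption: cur/ changes whenever a client fetches or flags mail.
    if now - os.stat(cur).st_mtime < idle_days * 86400:
        return []
    removed = []
    for sub in ("new", "cur"):
        folder = os.path.join(maildir_path, sub)
        for name in os.listdir(folder):
            path = os.path.join(folder, name)
            score = spam_score(path)
            if score is not None and score >= min_score:
                os.remove(path)
                removed.append(name)
    return removed
```

Doing the same against mbx would mean going through a library that understands its index, which is the trade-off being described.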
Finally, if you plan to scale way up there (60,000+), you need to start looking at better cluster systems than just a couple machines. Specialize the tasks of several machines to do mail storage or talk POP3. Look at something like POPular for specialized POP3 server clustering software.
~GoRK