Ask Slashdot: Building a Large Email Service
Rewd asks:
"I'm looking at implementing a large-scale email server (cluster) to handle POP3 and IMAP4 for about 25,000 people, including a lot of attachments. I'd like to go for an Open Source solution, but a lot of people around here want to go for Microsoft Exchange on NT.
Has anyone here successfully built anything like this? Can you recommend any combinations and components which are particularly efficient, capable, secure and reliable?"
I say with relative authority: Puh-leez!
MS would like people to believe that Exchange is an enterprise-level communications tool, when in fact it is a butchered and bloated descendant of a mediocre 1992 X.400 email system from Data Connection Limited (check out http://www.datcon.co.uk/press/messserv.htm). Don't believe the version number; Exchange is in its second major release (4.x really is 1.x, 5.x = 2.x, etc.) and still has significant stability problems.
In my experience, Exchange can support 300 users per server happily on commonly acceptable x86 corporate server hardware (say, a 2-processor PII with 512 MB RAM). It seems that (in my limited experience, lest MS lawyers take this to be a declaration of fact, which it is not) once you've reached this level, doubling the RAM and adding more CPUs has only a minimal effect, which means that you really have to add more servers to add capacity.
Let's do the math. 25,000 users at 500 users per server (to be quite generous) means that you're going to need a Windows NT server farm of about 50 systems just to do email. Again, being generous bargain hunters, let's say you can buy one of these servers for $10kUS. That means you're out $500,000 just for hardware. In my experience, you can support 500 POP users easily on a SPARC 2 or IPX, which can be had these days for about $500 decked out (including a 17" monitor). You could support the same (probably many more) on a $500 x86 box running any of the free *nixes. Assume you blow $500 on disk storage for these boxen just to level the starting line, bringing the total cost to $1000 per. That's still only $50,000.
One less zero usually gets the accountants' attention on an expenditure like this.
But let's talk about administrative support. IMHO you're going to need 1:1 admin per NT server at that usage level, given that remote admin of NT is difficult, and 500 users per server is going to prompt more than the occasional pretty blue interface. (Never mind the security team you're going to need for a major NT installation.) Say a cheap NT admin costs $50kUS including benefits & overhead. With 50 servers, you're looking at an HR budget of $2,500,000US. On the other hand, say you splurge and spend $150kUS per *nix admin. If they couldn't handle 10 little boxen apiece, I'll eat the electrons this was posted with. Fifty boxen at 10 apiece is 5 admins, for an HR budget of $750,000US.
That's 1/10th the hardware expense and about 1/3 the maintenance expense of using Exchange. And that's (a) making some wild assumptions that benefit the Exchange argument, and (b) assuming that you're running *nix on shit hardware. Spend 5 times as much on hardware for new, supported stuff (say $250,000US, which would buy you a couple of well-outfitted Sparc 4500s, or 10 really gorgeous systems from VA Research). Your downtime will become next to nothing, you'll still have spent only half of what you would have for NT and Exchange, and your ongoing yearly administrative cost will be 1/3 of the other option. The *nix administrative savings alone will pay for the *nix hardware in a few months.
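For anyone who wants to plug in their own numbers, the back-of-the-envelope math above can be sketched in a few lines of Python. The prices, salaries and users-per-server figures are the ones from this post; the function name and structure are just made up for illustration, and licence costs are left out.

```python
import math

USERS = 25_000
USERS_PER_SERVER = 500  # the "quite generous" figure from the post


def farm_cost(server_price, admin_salary, servers_per_admin):
    """Return (servers, hardware cost, yearly admin cost) for one option."""
    servers = math.ceil(USERS / USERS_PER_SERVER)
    admins = math.ceil(servers / servers_per_admin)
    return servers, servers * server_price, admins * admin_salary


# NT/Exchange: $10k per server, $50k per admin, one admin per server
nt = farm_cost(10_000, 50_000, 1)
# Cheap *nix boxes: $1k per server, $150k per admin, ten boxes per admin
nix = farm_cost(1_000, 150_000, 10)

print(nt)   # (50, 500000, 2500000)
print(nix)  # (50, 50000, 750000)
print(nix[1] / nt[1], nix[2] / nt[2])  # hardware and admin cost ratios
```

Change `USERS_PER_SERVER` to the 300 I actually see in practice and the NT side only gets worse.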
Oh yeah. I forgot the expense of 50 copies of Windows NT, 50 copies of Exchange Server, and 25,000 client licenses... (*erk*!!)
I think not...(*poof*)
I'm short on time, but I wish to submit what may be the ultimate Exchange story:
A sysadmin at, ahem, a "large jeans manufacturer" was put in charge of Exchange on hundreds of NT servers. He dutifully logged and reported dozens of bugs, system outages, etc., to MS support, as the thing crashed and burned like the Hindenburg II. After a few months of this, Microsoft decided to act on the problems. The solution was simple: they sent a letter to his boss saying he was a troublemaker.
---- "If we have to go on with these damned quantum jumps, then I'm sorry that I ever got involved" - Erwin Schrodinger
Listen to this advice; it's obviously born of hard experience, and I'll reiterate the same line: do not use Exchange.
For example:
This is only a start, but I'm sure other people have many of their own reasons as well...
I remember our migration of a mere 750 (users) with extreme horror. We had to manually create each user.
You can create mailboxes in Exchange via a config file with the mailbox import tool, although I figured it out by looking at files it created and not via any documentation. With Exchange 5.5 I'm pretty sure you can create mailboxes with LDAP (although this was far from documented last I looked).
As to solutions, I haven't used any open source email solutions with more than ~5000 users, for which sendmail and the UW pop3d and imapd worked well for the users that I had (many were very light on email). It'd be really neat to integrate an MTA and an IMAP server with LDAP to support IMAP referrals and smart mail redirection. I know some of this is done, as sendmail has LDAP patches and example rules for this, but I'm not so sure about the IMAP side.
When you have that many users you have to have a nice structure for the usernames, which isn't the /etc/passwd file. And you need a mailbox format that isn't linear, like the normal mbox. The rest of the problems can usually be solved with hardware (think about using RAID).
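To make the "non-linear mailbox" point concrete, here is a minimal sketch using the Maildir support in Python's standard `mailbox` module, where every message is its own file instead of being appended to one big mbox. The `deliver` helper, the first-letter directory split and the example.com address are assumptions for illustration only, not how any of the servers discussed here lay out their storage.

```python
import mailbox
from email.message import EmailMessage
from pathlib import Path


def deliver(root: Path, user: str, subject: str, body: str) -> str:
    """Store one message as its own file in the user's Maildir; return its key."""
    # Hypothetical layout: <root>/<first letter>/<user>, so no single
    # directory has to hold all 25,000 mailboxes.
    user_dir = root / user[0] / user
    user_dir.parent.mkdir(parents=True, exist_ok=True)

    box = mailbox.Maildir(str(user_dir), create=True)
    msg = EmailMessage()
    msg["To"] = f"{user}@example.com"  # placeholder domain
    msg["Subject"] = subject
    msg.set_content(body)

    key = box.add(msg)  # written to tmp/, then moved into new/
    box.close()
    return key
```

Deleting or flagging a message then touches one small file, where a linear mbox can mean rewriting the whole mailbox.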
I know of three potential semi-free solutions.
Carnegie Mellon's Cyrus IMAP server. (Go to the FTP site and download the latest version. Don't rely on the way-out-of-date web page to link to it.)
University of Washington's imapd. This seems to be under more active development, and supports a nice range of features, mailbox formats, and security mechanisms. However, it uses the passwd file (although you might be able to get around this using PAM) and it doesn't natively support quotas (although you can do this at the OS level).
Dartmouth's Blitzmail Server: This has been ported to Linux, and is *wonderfully* scalable across multiple machines. It includes its own directory services too. The only problem is that it doesn't support IMAP (although some work has started on that front), and the only database it supports as a backend is Oracle. I would love it if someone hacked it to use MySQL or PostgreSQL with IMAP support, but that's a tall order. The client is also under-featured.
All of these have their drawbacks though. You might wish to go with a commercial IMAP/POP server on Linux. There are a few good ones that exist. You definitely don't want to go with Exchange. A lot of people go that route because they are forced to. My experience with Exchange 5.5 was so bad that I would not recommend it to anyone.
-OT
Sendmail's the answer for us. The only thing that hasn't scaled well is plaintext aliases files: we've got some 20K mail lists, and it's beginning to get somewhat messy, so we're having to go to the non-plaintext solution. But for all the rest, it's stock sendmail with various GUI backends for end-user ease-of-use (and security). Note that we don't have 25K users, but 17K isn't that far off, and we do a *lot* of e-mail.
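The reason plaintext aliases stop scaling is that every lookup means scanning the file, whereas a keyed database goes straight to the entry. A rough sketch of the idea with Python's standard `dbm` module follows. This is NOT sendmail's on-disk map format (sendmail builds its own database with `newaliases`), and the function names are invented for the example.

```python
import dbm


def build_alias_db(aliases_text: str, db_path: str) -> int:
    """Parse 'name: target1, target2' lines into a keyed db; return entry count."""
    count = 0
    with dbm.open(db_path, "n") as db:  # "n" = always create a fresh database
        for line in aliases_text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blanks and comments
            name, sep, targets = line.partition(":")
            if not sep:
                continue  # not an alias line
            members = ",".join(t.strip() for t in targets.split(","))
            db[name.strip().lower()] = members
            count += 1
    return count


def lookup(db_path: str, name: str) -> list[str]:
    """Fetch one alias by key without reading the other entries."""
    with dbm.open(db_path, "r") as db:
        raw = db.get(name.lower().encode())
        return raw.decode().split(",") if raw else []
```

With some 20K lists, the win is in `lookup`: it never has to parse the entries it doesn't need.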