Infrastructure for One Million Email Accounts?
cfsmp3 asks: "I have been asked to define the infrastructure for the email system of a huge company which, fed up with Exchange, wants to replace its entire system with something non-Microsoft. I have done this before, but not for anything of this scale. Suppose you are given a chance to build from scratch an email system that has to support around one million accounts. Some corporate, some personal, some free. POP, IMAP, webmail, etc. are requirements. The system must scale perfectly, 99.9% uptime is expected... where would you start?"
I'd start by submitting a question to Ask Slashdot.
gmail.google.com
Support the First Amendment. Read at -1
At IBM we use Lotus Notes which has saved us LOTS of virus hassles. Every employee has an account and we're something like 320,000 worldwide. The mail "databases" are spread among Domino servers but I don't know what platform these run on, or what hardware specs they have. I imagine it's either Windows or Linux... but who knows, maybe we're using some of our PowerPC-based iSeries servers. These are the boxen formerly known as AS/400.
I'd ask for six bullets. Why would you want to risk getting the empty chamber?
Here's Slidey's post. (Disclaimer: Copyright blahblahblah appropriate people yadda yadda fair use etc etc don't sue me, thank you)
---
ok i work for a large uk isp in the messaging (email) operations dept. we currently have 2.5-3 million active accounts (and a load of suspended), and manage anywhere up to 12-16 million mails per day
our setup is like this (this is simplistic though):
front line - anti abuse mta's - these do dnsbl type lookups (spamcop, spamhaus and sorbs). we have 9 incoming
next we have mta's. they farm mail off to brightmail servers, which do similar to spamassassin. we have 6 incoming mtas, and 8 brightmail servers (not enough - high load)
after that they farm off to vscans (6)
after that any mail that gets through is delivered to mail stores (8 + 2 hot spares)
what you want to be doing is similar to the above - chaining the mail from one level to the next. the first level should be the rbl's - these are less processor intensive, and can remove a fair whack of your mails in one swoop. spamassassin is going to be more cpu intensive, since it has to open each mail and read the first x many bytes
i'd have separate machine(s) holding your master directory, and if you can get directory caches then do that too (to take the load off the master directory) - ours run oracle
i dont know what your budget is, but split up the different tasks as much as possible. that way if you need to add more to any pool (rbl lookups, spamassassin etc) you just add another machine..
one last thing - we also have a separate box just for postmaster mail (with exim + spamassassin funnily enough) - it tends to get busy
Last edited by Slidey on 09-08-2005 at 11:19 PM
--
(end of quote)
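For anyone wondering why the front-line DNSBL stage in Slidey's post is so cheap: a DNSBL check is just one DNS lookup per connecting IP. A minimal sketch in Python, assuming IPv4 only; the zone name `dnsbl.example.org` is a placeholder and not a real list, and the function names are made up for illustration:

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets followed by the list's zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the list publishes an A record for this IP.

    Any resolution failure is treated as "not listed" here, which is a
    simplification: a real MTA would tell NXDOMAIN apart from a timeout.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


# 192.0.2.99 is a documentation address; the zone is a placeholder.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
# prints "99.2.0.192.dnsbl.example.org"
```

No message body is read at this stage, which is why it sits in front of the heavier content scanners.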
Gmail is beta.
:)
Gmail does not have guaranteed uptime.
You do not pin your company's communications system on something you cannot sign an SLA with.
need I go on?
1) It'll run on anything - Win32, Linux, BSD, Solaris, x86, XServers, Alphas, Power5
2) It'll scale as big as you can dream - over 5 million accounts with clustering
3) MAPI support
I'd ask for six bullets. Why would you want to risk getting the empty chamber? I see that you are familiar with the subtle nuances of Polish Roulette.
My job is building systems like this. Current mailserver system I designed and built is hosting 80,000 email accounts, and will scale out to a million quite cheaply by just adding more machines.
OpenLDAP
You need a central configuration repository to store the email accounts, their passwords, etc. OpenLDAP is perfect for this, and you can replicate it out for scalability. Be prepared to learn about LDAP schemas.
Exim
Use Exim because it has a simple process model (a single binary that does all the work, like sendmail) but has a human readable configuration file and has to be the most flexible MTA out there. You will have customers with weird requirements sometimes, and Exim will be able to meet those. Plus, it has Exiscan-ACL built-in these days, which allows you to do virus scanning and spam scanning at the DATA stage, before the mail is actually accepted by the MTA. It means you can make the sending MTA deal with the bounces if the mail is a virus or is obvious spam.
Courier-IMAP for POP3 and IMAP access.
Yeah, it's written by a sociopath, but nothing else works as well in the field. It works out of the box with sensible LDAP schemas and is fast, reliable and secure. Handles SSL, all the different authentication methods, what have you. Maildir compatible.
Maildir message store.
Store the mail in maildirs. Don't put them in /maildirs/domain.com/user/Maildir - split the domains up with a 2 level deep hashing algorithm (if you're virtual hosting domains, which is what it sounds like to me), so make it something like /maildirs/xx/xx/domain.com/user/Maildir, where xx/xx might be something like 3f/6b (depending on the hash). Use MD4 for the hash because it's more balanced than MD5.
NFS mount the maildirs from a fast NFS device like a Netapp. Netapps are recommended because you can plug them in, and they just work, plus they are easy to scale by adding more trays.
Linux NFS servers set up with heartbeat and shared disk also make a nice HA NFS, and would be cost effective, but you'll have to buy an array anyway (probably fiber channel) so it might be better to just get something that's completely integrated like the Netapp.
Spamassassin.
Can be configured to scan mail at DATA time in the SMTP conversation. A LOT of configuration work here to make it play nice on a massively scaled platform, but it can be done. Mostly it needs to have things like the auto whitelisting and Bayesian filtering turned off, as the extra DB file work is a bit excessive.
Actually, I'm sure there is a way to make it work with a less resource intensive repository, but using the standard SA rules seems to work well for my environment. *shrug*
ClamAV.
Free antivirus, it works, and integrates well with Exiscan-ACL. Set it up to scan via the daemon, and configure it to update every couple of hours from cron, and bob's your uncle.
Scaling out
Make every box the same. Make every box an MTA, a POP3/IMAP server, etc. Use something like Kickstart to automate builds so that you can build a machine in 10 minutes, and all you have to do is configure the IP address and plug it in. If you want to be REALLY sexy, you could make the machines boot off the network, and mount / from a shared NFS area, and make /var/spool/exim the internal mirrored disks. DHCP them, then all you do is plug a machine in and set it to PXE boot. Pretty trivial to do.
:)
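The two-level hashed maildir layout from the Maildir notes in this post (/maildirs/xx/xx/domain.com/user/Maildir) could be sketched roughly like this in Python. Note the swap: the post recommends MD4, but MD4 is not available in every Python/OpenSSL build, so this sketch uses MD5 from `hashlib` instead. Hashing the lowercased domain name and the function name `maildir_path` are also assumptions for illustration, not the poster's actual code:

```python
import hashlib
from pathlib import PurePosixPath


def maildir_path(domain: str, user: str, root: str = "/maildirs") -> PurePosixPath:
    """Map domain + user to root/xx/xx/domain/user/Maildir.

    The two xx levels are the first four hex digits of a hash of the
    domain, so domains spread over up to 256 * 256 directories.
    MD5 stands in for the MD4 suggested in the post.
    """
    domain = domain.lower()  # assumption: treat domains case-insensitively
    digest = hashlib.md5(domain.encode("utf-8")).hexdigest()
    return PurePosixPath(root, digest[0:2], digest[2:4], domain, user, "Maildir")


print(maildir_path("domain.com", "user"))
```

The point of the hash is only to keep any single directory from holding too many domains, so any reasonably uniform hash would serve; what matters is that every box computes the same path for the same domain.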
Load balancing
Hardware load balancers are pretty much a necessity. Don't touch Cisco stuff. It's not very good. Go with Foundry Networks ServerIrons. The XLs can handle 1 billion requests/day if you configure them in Direct Server Return mode (also known as DSR/Foundry switchback). Use it. It makes all the return traffic go directly out to the net, meaning your ServerIrons have to switch less traffic and track fewer sessions. For a million users, however, I would recommend a pair of the ServerIron 450GTs, or bigger. Maybe one per VIP/Service.
Now, if this is all looking pretty daunting, you could always hire me to build it for you
... Is anyone wondering what's going on at Microsoft right now?
It starts with a Slashdot geek working in the email department spitting up his coffee, followed by a few rumors which make it up to a guy in accounting and customer service, followed by frantic management emails, including some inappropriate language, from Steve and Bill. Then a few good geeks start tracing who this cfsmp3 guy is and try to trace him to a company while the sales reps begin cold-calling any customers running around 1 million customers.
And Microsoft will botch it because they have no experience in kowtowing and bootlicking, which are important skills for any company that wants to humbly keep its customers.
"All great wisdom is contained in .signature files"
Resign. You're obviously in way over your head if you have to resort to asking Slashdot readers for advice like this.
Hi Cliff;
:)
;)
Sounds like a fantastic design opportunity here. The 5% of the project that is Enterprise architecture is what I enjoy the most as well. I'm assuming money probably isn't an object in terms of how much gear and bandwidth you may have to feed to this.
I'm happy to let my fingers type away below, I'd love to keep in touch and see how you end up shaping this system. my email is allowmx at hotm...
Before I ask, are there actually a million accounts? Or is that just a ceiling that you have to show proof of concept with?
I've only implemented systems of up to about 250,000 accounts of any kind, but as I'm sure you're aware, the base transactional resource costing is essentially the same.
For me, I would look at this for sure from at least these two angles:
1) Knowing your transactional costs: how much of your hard resources (bandwidth, CPU and disk space) will each type of transaction in your system take? I mostly use this approach to get not an exact number, but an idea of magnitude, and detail where it happens on its own to make sure the proper attention is applied to them.
2) Failsafe intelligence & capacity in the infrastructure, as well as failsafe intelligence & capacity at the application layer. You have to know that your hardware, software, OS, business logic and applications are all monitorable, internally and externally, for availability and actual "can I use it". Transactional logs, etc., so that information is available when the inevitable problems come up.
Also, having a capacity for as many of these layers as possible to be self-healing, and fungible to the point that your service delivery is homogeneous in as many ways as possible. If your network finds something doesn't work or route, with mail, you can find another way to route it. Having a transactional manager of some kind, direct or not, could be useful in this case depending on what the client wants.
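As a rough illustration of the "idea of magnitude" in point 1, here is a back-of-the-envelope sketch in Python. It uses the 12-16 million mails per day figure from the UK ISP post quoted earlier in the thread; the function names and the `PEAK_FACTOR` value are made up for illustration, and a real busy-hour multiplier would have to be measured:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(messages_per_day: float) -> float:
    """Average messages per second if traffic were spread evenly over a day."""
    return messages_per_day / SECONDS_PER_DAY


def peak_rate(messages_per_day: float, peak_factor: float) -> float:
    """Average rate scaled by an assumed busy-hour multiplier."""
    return average_rate(messages_per_day) * peak_factor


# Daily volumes taken from the ISP post quoted earlier in the thread.
for daily in (12_000_000, 16_000_000):
    print(f"{daily:,} msgs/day is about {average_rate(daily):.1f} msgs/s on average")

PEAK_FACTOR = 3.0  # hypothetical placeholder, not a measured value
print(f"sizing target at {PEAK_FACTOR}x peak: {peak_rate(16_000_000, PEAK_FACTOR):.0f} msgs/s")
```

The averages come out to roughly 139 and 185 messages per second. Multiply by whatever each message costs you in CPU, disk writes and bandwidth at each stage, and you have the order-of-magnitude picture.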
99.9% uptime equates to about 526 minutes, or 87.6 hours you _could_ be down each year. That's about 7.3 hours a month, or one day a month.
Based on that, having flexible, redundant tools set up in a high-availability arrangement at their respective operating capacities is key. I'm not sure if your current Exchange problems are due to not enough equipment, bandwidth, or other stability issues, so I'll just assume that it's all of them.
I apologize if anyone else has already mentioned some of this, but here's some of what I've found to help me where email has become as crucial to a business as their cell phone.
On the hardware level:
- STORAGE: Everything goes on a SAN, if not more than one. Don't waste your time with anything less.
- SERVERS: All servers have redundant hot swappable parts, at the very least power and hard drives. I'd even suggest making the servers iSCSI bootable so they can boot off the backbone. Beyond this, I like to buy my servers in piles of identical ones. Have 1-2 spare servers of each kind sitting there, ready to throw hot swap drives into from a failed server. That way if a server dies, you can address the power supplies, or get the HDs in that machine into another identical server and get it up and running while you diagnose the hardware problem independently. My approach to any kind of problem is FIX, DETECT and REPAIR. Get it up and running, find out what was wrong, make sure it's fixed for good. Too many of us stop at the first too
The idea I have in mind is a smaller scale of a Google beige box army. Linux/BSD offer so many more transactions for each piece of hardware, so that works very much in your favor. Obviously something enterprise grade to satisfy the client, such as the Compaq/HP Proliants, etc. (I feel these servers have the best overall support, manageability and information tools, and their openlinux drivers interface wonderfully with open source operating systems.)
Networking/Communication level:
- Entire mail processing architechture communi
A single server? For one million users?
Insert "imagine a Beowulf of those" joke here, except it isn't a joke.
I think you might be underestimating the requirements for this large a project that "must scale perfectly". The "99.9% uptime is expected" requirement alone requires multiple internet connections, a large cluster of front end servers, and redundant database servers, preferably located in different states. (ie: "What do you mean our only server is in New Orleans?")
I don't think the average Dell dual Xeon box is up to the task for this large a project...
Tequila: It's not just for breakfast anymore!
Ah! Sendmail!
"I think it would be a good idea!"
Gandhi, about Internet Security
Definitely agree on point 9. I maintain a mail server of over 2,000 users. Currently running Qmail with the following patches:
chkuser-2.0.8b-release.tar.gz
doublebounce-trim.patch
netqmail-1.05-tls-20050329.patch
outgoingip.patch
qmail-smtpd-auth-0.31.tar.gz
qmail-smtpd-auth-close3.patch
qmail-smtpd_gmfcheck.patch
qmail-spf-rc5.patch
Most of these patches require hand editing the sources and Makefiles to successfully merge them all into the stock qmail or netqmail base. Lots of manually reading through *.rej files to make it all work.
In order to simplify new installations I've created my own personal CVS repository for my Qmail sources. I commit changes to the tree whenever a new patch comes out with functionality I need. Hence on a new install I simply check out my custom tree and compile.
The initial work was a royal pain in the ass, however, once it is all up and running the stability and performance has been excellent.
AKO (www.us.army.mil) is the Army's official intranet portal. We provide email for over 1.72M users, and we move almost 3 million messages a day. We do it all with Sun Messaging Server ver5.2 (soon to be Jes3) and we have exactly 2 (count 'em) two mail administrators. Sun mail is rock solid and scales great. We offer POP, SMTP, enterprise spam and virus filtering, and personal address books besides. We don't get the rich Outlook fat client, but then we want to be all web-based anyway. Can't say enough about Sun mail. If we had to do this with Exchange, I'd have to hire prolly 50 admins and deploy an order of magnitude more machines.
No. 0.1% != 0.1
365 days * 24 hrs/day = 8760 hours per year
0.1% downtime = 0.001 downtime
8760 * 0.001 = 8.76 hrs
You're off by two orders of magnitude.
8.76 hrs / 12 months = 0.73 hrs/month = 43.8 minutes/month
One 45 minute scheduled downtime (assuming it's scheduled) per month isn't terrible. It's not great, but costs really start to go up as you add nines beyond those 3.
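The arithmetic above is easy to check, and to repeat for more nines, with a few lines of Python. A minimal sketch, assuming a 365-day year and twelve equal months as in the calculation above; the function name is made up for illustration:

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_budget(availability_percent: float) -> tuple[float, float]:
    """Return (hours of downtime per year, minutes of downtime per month)."""
    fraction_down = (100.0 - availability_percent) / 100.0
    hours_per_year = HOURS_PER_YEAR * fraction_down
    minutes_per_month = hours_per_year / 12 * 60
    return hours_per_year, minutes_per_month


for target in (99.0, 99.9, 99.99):
    hours, minutes = downtime_budget(target)
    print(f"{target}% uptime: {hours:.2f} h/year, {minutes:.1f} min/month")
```

For 99.9% this gives 8.76 hours per year and 43.8 minutes per month, matching the figures above. Each extra nine cuts the budget by a factor of ten.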