Intel and BlueArc Set New Mail Server Record
louismg writes "With e-mail traffic continuing to explode, Intel and BlueArc announced this morning that the two companies have set a new SPECmail benchmark record in cooperation with CommuniGate Pro, offering a solution that can serve 30 million messages per day - 67% ahead of the previous record, owned by Sun Microsystems. Rather than clustering a lot of smaller servers together, large ISPs can now use fewer systems to handle massive traffic load."
Constructing templated emails and blasting them out to multiple servers doesn't require a full mail server.
They are better using customised software which doesn't care about inbound mail.
liqbase
Interestingly enough, they set their record using a message store mounted on NFS. It had 140 FC/AL attached disks, and 14Gb of RAM.
Virtually every file handle an MTA writes to is opened "O_SYNC". One of the quickest ways to make Sendmail or other common MTA's go fast is to mount their delivery queues on a solid state disk. I'm betting this disk array is turning around the queues without ever committing the data to the platters. (Not that there's anything wrong with this...) I am left wondering if there isn't some bit if NFS trickery not reported in the config.
But looking at the Sun entry, the old record was set using 2 year old software, and a much smaller disk configuration. Sun will probably update their entry in the near future, just to reclaim the crown. Email is much more an I/O problem than a CPU problem. Sun used to push their mail server on much larger HW, but most ISP's don't want to buy big boxes these days. The small to medium sized boxes, connected to a SAN are more cost effective, permit redundancy and easier maintenance.
With SPECmail, the issue isn't how many messages per
second can your kick ass SMTP server handle; it's how many (simulated) concurrent & slow modem dial up POP connections (many), IMAP connections (some), and
other dross can the overall system handle. Again,
concurrently. And SPECmail does lean very heavily
towards simulating a world with the overwhelming
majority of users looking like POPers with dialup
modem connections.
So, while being able to handle lots of inbound messages per second and delivering them to your
message repository is important, there are other
factors too and some of them more heavily
weighted.
I did a little "reliability analysis" for some telecom equipment a while ago, so I am semi-informed (not a guru).
There are two terms that often get interchanged, when they shouldn't. Reliability is the ability of the system to run without repairs. Availability is the ability of the system to do its job.
So the large monolithic system can be built out of very good (and expensive) components that do not fail as much as commodity hardware. This will lead to fewer failures and better reliability.
The commodity hardware can be arranged so that redundancy ensures that if on component fails, then another will take its load. Since the damaged component needs to be serviced, the reliability is lower, although the availiability of the system is the same.
Reliability is used by planners to determine the labor costs in keeping a system running. Availability is used by planners to make uptime predictions and to take measures to provide a certian level of service.
Two similiar numbers that are used for different purposes.
Remember, You are unique...just like everyone else.
So that includes users connecting, picking up email, deleting from their data store etc etc etc.
Disclaimer: I have two friends who work for Bluearc but have no other connection to the company
Just thought I'd add a few details and address some of the questions here. My name is Thom O'Connor and work for CommuniGate Systems (CGS), and was the one who put together and ran these tests - you can (mostly) verify this by looking at the comments in the source on the results page.
First off, on the SPECmail test itself. SPECmail is a standardized test (the only one I'm aware of for email) that attempts to closely regulate a level playing field for measuring email performance. It is critical to understand that this is not just measuring SMTP. The 30 million message a day text is a little vague, but it is important that this includes a distribution of delivery, relayed, and retrieved email. Sure, anyone can just relay many millions of messages an hour.
SPECmail does POP and SMTP, so the test measures not just MTA behaviour but also local delivery and then retrieval of the messages. The SPECmail test also uses Quality of Service (QOS) measurements such that a message injected via SMTP to the system MTAs (the CommuniGate Pro Frontend servers in this diagram) must then be delivered locally into the users' account, then be retrieved within 60 seconds. Satisfying the QOS criteria during the benchmark is often the most difficult part.
So, SPECmail itself just does POP and SMTP, which is a little 1990s I agree, but SPEC is coming out with a SPECimap test in the near future, and CGS is also very interested in seeing a SPEC VoIP/SIP test for measuring CommuniGate Pro's Real-Time capabilities.
A few others questions I've seen raised here:
1. The CommuniGate Pro Dynamic Cluster described in this test is fully and completely appropriate for production use in all aspects. In fact, if you're running a 2+ million user ISP on a CommuniGate Pro Dynamic Cluster, we'd recommend you to use these results as a guide for your architecture (although load balancers should be added to the gateway point for all inbound connections). In fact, CGS has ISP customers running architectures which match the layout of the described system almost exactly. All systems in the Cluster service all accounts - you could lose 4 Frontend Servers and 3 Backend Servers, and all users could still access their email (albeit with decreased capacity).
2. HyperThreading was disabled in the BIOS because the downloadable Solaris 10 x86 operating system would not (yet?) support the Intel x86_64 Potomoc chipset properly. That said, on top of the recent security vulnerabilities on the topic, we have also discovered miscellaneous threading and even NFS issues related to having HyperThreading enabled on Linux 2.6, FreeBSD 5.4, and Solaris 10 x86 systems.
3. On NFS...NFS is used safely and securely in this test. The integrity of data storage is one of the major criteria that the SPEC organization closely evaluates when reviewing a SPECmail submittal. Obviously, there are many ways to cheat and/or cut corners using Solid State Disks, unsafe RAM for message queueing, and other techniques that you would never want to use on your production message system. However, the test described here was performed using a standard (albeit excellent) BlueArc Titan Storage System with write caching only in NRRAM and using proper mount options and layout for security, redundancy, and data integrity.
Hope this clears up any misconceptions. Obviously, I'm clearly biased about the work here, but assembling and then passing a SPECmail test of this size is a gigantic effort. If anyone thinks