Slashdot Mirror


Infrastructure for One Million Email Accounts?

cfsmp3 asks: "I have been asked to define the infrastructure for the email system for a huge company which, fed up with Exchange, wants to replace their entire system with something non-Microsoft. I have done this before, but not for anything of this scale. Suppose you are given a chance to build from scratch an email system that has to support around one million accounts. Some corporate, some personal, some free. POP, IMAP, webmail, etc. are requirements. The system must scale perfectly, 99.9% uptime is expected... where would you start?"

1,216 comments

  1. Obviously by SpiffyMarc · · Score: 5, Funny

    I'd start by submitting a question to Ask Slashdot.

    1. Re:Obviously by CDMA_Demo · · Score: 5, Funny

      I'd start by submitting a question to Ask Slashdot.

      Upon which the global "wankfest" will commence, leading to solutions ranging from Novell to qmail based solutions, upon which the OP will look for someone else for advice, upon which the OP will end up paying an IBM consultant to set up his company's email.

    2. Re:Obviously by WarPresident · · Score: 5, Funny

      I'd start by submitting a question to Ask Slashdot.

      Ah, a proof by contradiction, eh?

      --
      Here come da fudge!
    3. Re:Obviously by dzelenka · · Score: 2, Funny

      Or complain loudly enough to be an embarrassment to Microsoft and they will supply equipment and support to get Exchange running smoothly!

      --
      Bah!
    4. Re:Obviously by nofx_3 · · Score: 0, Offtopic

      First posts are overrated. After reading Slashdot daily since 1999, I finally got my first chance at a first post a couple weeks ago, but in the time I needed to post in I didn't have time to read the article and therefore my post was less than insightful. Better to have an insightful post at post 39 than a crappy post at post 1.

      --
      Visualize Whirled Peas
    5. Re:Obviously by CDMA_Demo · · Score: 1

      the best one liner on Slashdot ever...sad the moderators don't understand. What has the world come to!

    6. Re:Obviously by Cruxus · · Score: 1

      Nonsense, the beauty of first posts is that they are witty at best and may not even seem relevant at all after reading the article. The lords of Slash wouldn't have it any other way.

      --
      On vit, on code et puis on meurt.
    7. Re:Obviously by interiot · · Score: 1

      Though there are three or four posts in this article that SOUND like they're not completely making shit up, so who knows. Keep your windshield wipers ready on the commute home in case you run into any flying pigs.

    8. Re:Obviously by kryonD · · Score: 5, Interesting

      Or maybe this is a legitimate cry for help from EDS who duped the US Navy into thinking they could actually outsource IT on the exact scale that the poster is talking about. Mind you, no one has ever provided ubiquitous support for an organization as large as the Department of the Navy, but they somehow convinced Congress that they could do it for $6B.

      Just so you know. Most of us out in South East Asia refer to NMCI (Navy-Marine Corps Intranet) as the Not Mission Capable Intranet.

      --
      I've dirtied my hands writing poetry, for the sake of seduction; that is, for the sake of a useful cause. --Dostoevsky
    9. Re:Obviously by whackco · · Score: 5, Insightful

      Actually, I was going to use "Obviously" as my subject line... so I'll just respond to yours.

      I work with Exchange, and think that the chances are better that they just had shitty architecture to begin with. Exchange is a great platform and scales well, so if the original people wouldn't do it, well then f*ck em.

      Still convinced to migrate? Well, something with multiple datacenters, large scale, compressed SAN backend, and a lot of clustering will do it. Shit, you could do the entire thing with MySQL if you REALLY wanted to. Moving the existing data over will be a huge pain no matter what you migrate to though.

      My suggestion? Don't just jump off Exchange, do a proper requirements analysis and you might find it is a lot cheaper to just redesign the existing architecture.

    10. Re:Obviously by EnderWiggnz · · Score: 4, Interesting

      WalMart runs the world's biggest Exchange install. They and msft are quite proud of it, actually...

      The Navy may want to take a page out of Walmart's book, if they're having that much trouble.

      --
      ... hi bingo ...
    11. Re:Obviously by AKAImBatman · · Score: 5, Funny

      upon which the OP will end up paying an IBM consultant to set up his company's email.

      At which point the highly paid consultant will post a question to Ask Slashdot...

    12. Re:Obviously by 88NoSoup4U88 · · Score: 5, Funny

      The obvious answer is of course: send all those thousand employees a Gmail invite!

    13. Re:Obviously by Moofie · · Score: 1

      Angling for a new job, eh? Clever. Evil, but clever.

      --
      Why yes, I AM a rocket scientist!
    14. Re:Obviously by HotNeedleOfInquiry · · Score: 1

      A true geek must set aside logic and reason and revert to 3rd-grade humor and logic when faced with the opportunity of a first post.

      It's a sacrifice that many of us are willing to make.

      --
      "Eve of Destruction", it's not just for old hippies anymore...
    15. Re:Obviously by Stephan+Schulz · · Score: 5, Funny
      Or complain loudly enough to be an embarrassment to Microsoft and they will supply equipment and support to get Exchange running smoothly!
      Yes, but who can afford the space, electricity and cooling for 500000 servers (generously assuming that Exchange can handle 2 users per server)?
      --

      Stephan

    16. Re:Obviously by Karl+Cocknozzle · · Score: 5, Informative
      I work with Exchange, and think that the chances are better that they just had shitty architecture to begin with. Exchange is a great platform and scales well, so if the original people wouldn't do it, well then f*ck em.

      Your point about putting more effort up-front into design is well taken, but that advice applies to any platform...

      With that said, and without turning this thread into an Exchange bitchfest...

      Why in the hell can't you restore a mailbox from backup using only the tools you already have if the user is no longer present in Active Directory? You can't even export the mailbox with EXMERGE... Your choices are 1) 3rd party recovery tool (like Quest Recovery for Exchange) or 2) Build an ENTIRE OTHER SERVER and do a normal, full restore of the entire mail store so you can extract one measly mailbox.

      Obviously, the "Recovery Storage Group" feature is a VAST improvement over the old Exchange 5.5 way of bringing back just one mailbox (that being setting up another server) but this is a MAJOR duh situation on Microsoft's part. They seem to think that since their "best practice" is to never ever erase any user account ever ever ever, it's okay to leave this gaping flaw in their enterprise groupware product. Sorry, but I think that sucks. We paid out the ass for "Enterprise" edition (to avoid the arbitrary 16GB limit on the mail store) and goddammit, I should be able to bring back a mailbox without its corresponding AD account without wasting a whole day setting up another server... I've only had to do it once (today) but the whole time I was thinking how much easier a mailbox restore on my OS X Server at home would be... Just restore the frickin' files and move on with your life.
      --
      Who did what now?
    17. Re:Obviously by HalWasRight · · Score: 2, Insightful

      Obviously school just started.

      --
      "This mission is too important to allow you to jeopardize it." -- HAL
    18. Re:Obviously by CProgrammer98 · · Score: 1, Funny

      DUDE!!!! You CANNOT have the word "Microsoft" and the phrase "running smoothly" in the same sentence.

      Please go and beat yourself up severely...

      --
      And the people shall be oppressed, every one by another, and every one by his neighbour Isaiah 3:5
    19. Re:Obviously by Anonymous Coward · · Score: 0

      And what kind of server doesn't let you set auto responders on people's accounts? You have to log in as them to do it on Exchange (either in Outlook or OWA), or buy Sybex Out of Office manager (very useful.. if you're stuck with Exchange, you should have it). It should be right click, Out Of Office. Enable/Disable, Enter Message: and that's it.

      It's a horrible horrible pos that should never have been.

    20. Re:Obviously by xenocide2 · · Score: 1

      Interestingly though, I hear Google does sell its search services to companies. I wonder if Google thinks it can provide gmail to companies for a fee. It's no wonder that Microsoft hates and fears everybody; each piece of software that a company becomes reliant on that isn't reliant on MS loosens the chains that keep Microsoft safe.

      --
      I Browse at +4 Flamebait

      Open Source Sysadmin

    21. Re:Obviously by jrockway · · Score: 4, Interesting

      > you could do the entire thing with MySQL if you REALLY wanted to

      I am so tired of people shoving everything into relational databases. What queries are you going to run against your database, anyway? SELECT * FROM messages WHERE read=0? Try "ls new" in your maildir. The reason things never scale right is because people design things to be "new" and "cool" like putting their e-mail into a relational database. No. Just use the filesystem. It, and its supporting tools, have been around for 30 years! It Just Works! It doesn't use any userspace memory! There are no permissions issues, because the kernel controls the permissions. It's the optimal solution.

      The filesystem is really really efficient (for e-mail) and really really reliable.

      Please, don't use a database!
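
      For what it's worth, the "ls new" idea can be sketched in a few lines of Python using the standard library's mailbox module. This is only an illustration under assumptions: the Maildir lives in a temp directory, and the addresses and helper names (unread_count, demo) are made up here, not part of any real mail server.

```python
import mailbox
import os
import tempfile


def unread_count(maildir_path):
    """Rough equivalent of `ls new | wc -l`: one file in new/ per unseen message."""
    return len(os.listdir(os.path.join(maildir_path, "new")))


def demo():
    """Deliver three test messages into a scratch Maildir and count the unseen ones."""
    root = tempfile.mkdtemp()
    path = os.path.join(root, "Maildir")
    box = mailbox.Maildir(path, create=True)  # creates tmp/, new/ and cur/
    for i in range(3):
        # Hypothetical message text; plain strings are delivered into new/.
        box.add(
            "From: sender@example.com\n"
            "To: user@example.com\n"
            f"Subject: test {i}\n"
            "\n"
            "hello\n"
        )
    return unread_count(path)
```

      Calling demo() delivers three messages and counts the files in new/; no query engine is involved, just a directory listing.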

      --
      My other car is first.
    22. Re:Obviously by superpulpsicle · · Score: 4, Interesting

      The Walmart exchange site was not properly backed up for "years". Mostly because Exchange was not 3rd party software friendly at all, and M$ didn't have much of their own backup software to offer. Veritas and Legato couldn't bend over enough for a million users.

      Walmart invited countless consulting firms and data backup experts. They deployed Exchange strictly because M$ was willing to "support" them. To say they were vulnerable to a major IT disaster was an understatement. The Navy wants nothing to do with Walmart's IT.

    23. Re:Obviously by mollymoo · · Score: 5, Funny

      I didn't think there was anything more tragic to do on /. than boast about a first post. But the idea of boasting about a first post you didn't even make had never occurred to me. Kudos.

      --
      Chernobyl 'not a wildlife haven' - BBC News
    24. Re:Obviously by pjbgravely · · Score: 5, Funny

      WalMart runs the world's biggest Exchange install. They and msft are quite proud of it, actually...


      Thanks, another reason to never shop there.

      --
      Star Trek, there maybe hope.
    25. Re:Obviously by Anonymous Coward · · Score: 0

      God damn you are lame. Serious and interesting topic, and you are worried because you missed your chance to become a first post luser because you were wanking off?

      Die, please. It's dumbasses like you that keep lowering the signal to noise ratio here.

    26. Re:Obviously by calmdude · · Score: 1

      You should run a brick-level backup of mailboxes that are going to be orphaned (active directory account being deleted). A brick-level backup can be done with all sorts of backup software (Arcserve, Backup Exec, Networker) and you can restore just a single mail item if you wish, or an entire mailbox. Of course in large environments (even smaller ones), I wouldn't recommend running a brick-level backup on all the mailboxes (slow), but it's definitely a handy feature.

    27. Re:Obviously by MarkGriz · · Score: 4, Funny

      Whoa there cowboy...

      He said "up".... beat yourself *up*

      --
      Beauty is in the eye of the beerholder.
    28. Re:Obviously by AKAImBatman · · Score: 1, Interesting

      And a database file system would give you the best of both worlds. One message per file, yet the ability to quickly query for messages, and organize with a label system similar to GMail.

    29. Re:Obviously by pigwin32 · · Score: 1

      Check out http://aftermail.com/. These guys know how to manage Exchange data. I've seen the presentation and I'm sold and I don't even look after my own mail well let alone anyone else's. Seriously this looks like a solid product for exactly the nasty Exchange issues you're talking about (except for the nasty Exchange itself).

    30. Re:Obviously by Alascom · · Score: 1

      I would send an email asking Google to create a Google Appliance for email using the Gmail interface...

    31. Re:Obviously by cecil_turtle · · Score: 3, Interesting

      I don't know if you actually have experience running a mail server or not or if you just wanted to go off on your relational db rant, but mail data tends to be created and deleted A LOT with varying size files, and file-based structures on a mail server create serious fragmentation problems. If you do decide to go this way, allow plenty of free drive space - well above normal recommendations - like 80% free or more.

      Also many people have their mail clients set with ridiculously frequent mail check times (like every minute), and on a file based system each check requires a trip to the drive and back. Even with the data on a RAID array with a decent read/write cache, you're still going through the disk subsystem, whereas with a database it would all be in memory.

      What's wrong with SELECT * FROM messages WHERE userid=xyz and read=0? That is a cakewalk for a properly indexed dbms. On a medium sized server (say, quad processor w/ 8-16GB RAM) there is more userspace memory than os memory space.
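
      To make the "properly indexed" part concrete, here is a small sketch using SQLite from Python's standard library. It is an assumption-laden toy: the table layout, the index name idx_user_read and the helper names are invented for illustration, and SQLite stands in for whatever DBMS a real deployment would use.

```python
import sqlite3


def build_db():
    """Create an in-memory messages table with a composite index on (userid, read)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE messages ("
        "id INTEGER PRIMARY KEY, userid TEXT, read INTEGER, subject TEXT)"
    )
    # The composite index lets the lookup below avoid scanning the whole table.
    db.execute("CREATE INDEX idx_user_read ON messages (userid, read)")
    rows = [
        (f"user{i % 100}", (i // 100) % 2, f"subject {i}")
        for i in range(10000)
    ]
    db.executemany(
        "INSERT INTO messages (userid, read, subject) VALUES (?, ?, ?)", rows
    )
    return db


def unread(db, userid):
    """Return (id, subject) for every unread message of one user."""
    return db.execute(
        "SELECT id, subject FROM messages WHERE userid = ? AND read = 0",
        (userid,),
    ).fetchall()


def query_plan(db, userid):
    """Return SQLite's description of how it would run the unread query."""
    rows = db.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id, subject FROM messages WHERE userid = ? AND read = 0",
        (userid,),
    ).fetchall()
    return " ".join(str(row[-1]) for row in rows)
```

      query_plan() is there so you can check that the lookup is served by the index rather than a full table scan.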

    32. Re:Obviously by killjoe · · Score: 1, Insightful

      grep searches the files really fast.

      --
      evil is as evil does
    33. Re:Obviously by Not+The+Real+Me · · Score: 3, Insightful

      What does Hotmail run these days?

      I am under the impression that if Hotmail were running clusters of Exchange servers, Microsoft would be quite vocal about the enterprise scalability of Exchange.

    34. Re:Obviously by jerkychew · · Score: 4, Insightful

      Since you've taken things off topic, I'll grab the wheel and pull it right off a cliff.

      The reason Exchange uses a database can be summed up in three words: Single Instance Store.

      Say you send one 1MB Word document to 100 of your colleagues. In a relational database-based, Single Instance Store-driven mail server, that document takes up exactly 1MB on the server. If somebody in the organization forwards the Word doc to the remaining 900 people in your organization, how much space does it take on the server? 1MB.

      Send a 1MB document to 1000 users on a flat, mbox-style mail server, and how much space is taken up on the server? 1000MB.

      I see your point about some things, sure. Being able to jump in and restore a mailbox from tape by just dumping a folder somewhere is nice, but it just doesn't scale in terms of storage the way a db-driven mail system does.

      Don't flame me as an MS advocate. There are times when an SIS-based email system is good, and there are times when a flat email system is good. I've run Exchange environments for 500+ people, and I've run Linux-based mail systems for 1000+ people. I'm just saying that your particular argument is one-sided and flawed.

    35. Re:Obviously by cc.Scotty · · Score: 3, Funny

      Get your company signed up as an early adopter on the next beta version of Exchange. It will surely solve all your problems!

    36. Re:Obviously by neckdeepinspecialsau · · Score: 1

      You write a simple bot that sends out gmail invites to a mail server it has access to (the only thing this email server is used for is to collect invites). It keeps track how many invites it has to give out and dispenses invites to itself sets up the email accounts for users then harvests all invites in that account. For every user you set up you get 50 more invites. If invites expire you factor that into the bots behavior.

      PS: Dear Google staff I would never consider doing this please don't lock down my gmail account. I do love it so.

    37. Re:Obviously by AnyoneEB · · Score: 4, Insightful

      Or you could just use a filesystem that supports hard-linking files (see: man ln), so you do not have to worry about that even when using a filesystem for this purpose. Since such a file is read-only, it could just be linked to all of those people's mail boxes. If you do not know what a hard link is, it is basically the same thing you are describing, except done in the filesystem and handled transparently by the kernel. Basically, every "file" you see in an Ext 2/3 filesystem is really just a pointer to where the file is stored, and any actual file can have as many of these links as you want. When there are no remaining links to a file, it is allowed to be deleted.
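
      A minimal sketch of that hard-link delivery in Python, assuming a POSIX filesystem with all mailboxes on one partition. The spool layout, file names and the deliver_shared helper are hypothetical, chosen only to show the mechanism.

```python
import os
import tempfile


def deliver_shared(attachment_bytes, recipients, spool):
    """Write the attachment once, then hard-link it into each recipient's directory."""
    master = os.path.join(spool, "attachment.master")
    with open(master, "wb") as f:
        f.write(attachment_bytes)
    paths = []
    for user in recipients:
        box = os.path.join(spool, user)
        os.makedirs(box, exist_ok=True)
        dest = os.path.join(box, "attachment.doc")
        os.link(master, dest)  # new directory entry, same inode, no data copied
        paths.append(dest)
    os.unlink(master)  # the data survives: every recipient still holds a link
    return paths


def demo():
    """Deliver one 1 KB attachment to 100 users; report link count and distinct inodes."""
    spool = tempfile.mkdtemp()
    users = [f"user{i}" for i in range(100)]
    paths = deliver_shared(b"x" * 1024, users, spool)
    inodes = {os.stat(p).st_ino for p in paths}
    return os.stat(paths[0]).st_nlink, len(inodes)
```

      demo() reports the link count on one delivered copy and the number of distinct inodes across all 100 mailboxes, which is how you can tell the data exists only once on disk.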

      --
      Centralization breaks the internet.
    38. Re:Obviously by doshell · · Score: 3, Insightful

      Say you send one 1MB Word document to 100 of your colleagues. In a relational database-based, Single Instance Store-driven mail server, that document takes up exactly 1MB on the server. If somebody in the organization forwards the Word doc to the remaining 900 people in your organization, how much space does it take on the server? 1MB. Send a 1MB document to 1000 users on a flat, mbox-style mail server, and how much space is taken up on the server? 1000MB.

      Speaking of which, is there any filesystem around that "automagically" detects redundancy and avoids storing the same data twice (i.e. two files with the same content end up being stored only once)? (I don't mean hardlinks. Suppose I download some file for the second time without knowing the first instance exists). I suspect this would add a lot of overhead to the filesystem driver, but it'd certainly be a cool feature.

      --
      Score: i, Imaginary
    39. Re:Obviously by Majestix · · Score: 1

      ROFL....let me guess you are an exchange admin like myself. BROTHER!!!!

      Try adding a document management add-in for outlook and see how well your exchange performance takes a hit just because its not well behaved.

      I miss my recovery server. Though i think i'll be building a new one soon.

      --
      --- I was far from home, and the spell of the Eastern sea was upon me. -Lovecraft-
    40. Re:Obviously by tolkienfan · · Score: 1
      Someone pointed out one way your comment was flawed.

      Here's another:

      DBMSs were designed for high volume transactional systems, and are far more scalable than most filesystems (XFS an exception). They can be a very good fit for email serving. They are also good for handling and querying by metadata.

      Of course, screw Exchange, screw MS.

    41. Re:Obviously by colenski · · Score: 1

      OK dude, what the FUCK. He was talking about Exchange 5.5 + ADC and he's absolutely right, you have to restore to a separate server, exmerge the data out, recreate the ADC entry with an associated mailbox, and then remerge the data from the .pst. I just did it last week.

      ADC is fun stuff with 5.5, smoke a user in Active Directory and their mailbox is gone, sayonara.

    42. Re:Obviously by the+real+darkskye · · Score: 3, Interesting

      The mods are on crack, the meta-mods are on pot

      --
      Music is everybody's possession.
      It's only publishers who think that people own it.
      Fuck Beta
      ~John Lenno
    43. Re:Obviously by LurkerXXX · · Score: 1

      That's why I miss using BeOS for my email. :(

    44. Re:Obviously by milhous3030 · · Score: 1

      Why not ask Google!

    45. Re:Obviously by CDMA_Demo · · Score: 0, Redundant

      HawtMaule ruins on FreeBSHD

    46. Re:Obviously by Anonymous Coward · · Score: 0
      Please point to a mail system that actually uses the file system to hard link 1000 emails like the grandparent proposed.

      Yeah, that's what I thought. I didn't think you could.

    47. Re:Obviously by Anonymous Coward · · Score: 0

      Where have you been? Databases, and their supporting tools have been around for nearly the same period of time as your beloved filesystem.

      I don't believe that databases are the best solution for everything, but I'm tired of people thinking that NFS mounted crap is simpler or somehow less expensive than a database simply because you get that familiar file-handle feel from a network mounted drive.

      You still need to open a socket to the remote machine and NFS caching can create concurrency nightmares if more than one client is attempting to access the same data. Databases have well-defined mechanisms for dealing with these sorts of issues without having to play games like using directory creation as a substitute for true atomic operations.

      Get over it. Use a database, but realize it's not going to solve all your problems.

    48. Re:Obviously by Aceto3for5 · · Score: 2, Insightful

      Amen to that. I support a base that is one of the last holdouts against NMCI. (IBM was involved in the bidding process originally, and once they saw the scale of the project, laughed and walked away.) As it is, we pay millions a year towards NMCI for the limited email-only version, which no one uses because it never works. Now that it's going to come into full bloom, the talk of the town here is that we will end up with two networks, two jacks at each desk, one NMCI and one functional. Talk about wasting tax money!

      The biggest infrastructure problem plaguing EDS right now is constructing a building large enough to hold all the money they are bilking out of us.

    49. Re:Obviously by AKAImBatman · · Score: 2, Informative

      No it doesn't. Grep searches are horribly slow. If you're sorting through 2 gigabytes of email (a fairly common amount per user in corporations), you're going to be heavily limited by the disk speed and processor time, i.e. searches could take on the order of minutes. Not good when you want to show a list of emails and the user attempts to sort by something, or search for that email from three years ago.

    50. Re:Obviously by Anonymous Coward · · Score: 0

      Sure, WinFS. But you're probably one of those mindless Slashdot reading linux zealots who thinks that Microsoft can never have anything good....

    51. Re:Obviously by Anonymous Coward · · Score: 0

      My company works with Lotus Notes. It's nothing more than a bundle of databases. The great thing about it is that your ability to customize the user's experience is only limited by your ability to write their code and your imagination.

    52. Re:Obviously by Anonymous Coward · · Score: 1, Insightful

      NMCI Blows goats. I could take $6B to give the military shitty service too. I have ~200 users and we get as much space as we need. Up to 20GB. The navy already had an experienced cadre of admins, but took all of the power to fix things from the people in an organization, and gave it to people half a world away. If it is not in the SLA, they won't fix it.

    53. Re:Obviously by Anonymous Coward · · Score: 0

      Ahh, reminds me of my summer internship as a Microserf. The summer of Code Red, the Windows XP RTM, and of course my email being inaccessible for 36 hours mid-week. Pretty lame.

    54. Re:Obviously by Anonymous Coward · · Score: 1, Funny

      Automatically deleting the 5.5 object when you delete the AD object (and vice-versa,) is an optional setting for each individual connector. Not only that, but I'm pretty sure that it's disabled by default.

      Don't you have a help desk to get back to? My grandmother's AOL account isn't working.

    55. Re:Obviously by brakk · · Score: 1

      Why would you leave orphaned mailboxes sitting on the server after you delete the user? If you have Exchange Standard they would still be eating into your 16gig limit.

    56. Re:Obviously by Karl+Cocknozzle · · Score: 3, Informative
      You run the cleanup agent which shows you the tombstoned mailbox, you can then right click that and reconnect it to any Active Directory user.
      ...right up until the 30-day default and then your "tombstoned" mailboxes are gone, never to return--without the achingly painful "restore server" scenario. Hope you weren't counting on being able to bring them back until the end of time... Because unless you changed the default setting from 30-days, that is all the time you get. Sorry I didn't mention the 30+ days timeframe earlier, but I was on my way to the pub and didn't realize some exchange fanboy would be mortally offended by my least favorite feature of an otherwise decent product.
      --
      Who did what now?
    57. Re:Obviously by eh2o · · Score: 1

      Yup. It's called a compressed filesystem. Just like those stupid RAM doublers..... unless you have cycles to burn or a *LOT* of redundant data, it's a waste of time.

    58. Re:Obviously by afidel · · Score: 2, Informative

      GE runs Exchange. I don't know of any company that has more employees likely to use email than GE (Walmart has more employees but a LOT of them are minimum wage drones who are unlikely to need email access). If they can make it work, and work well, I don't think anyone can deny that it's enterprise ready =)

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    59. Re:Obviously by Queer+Boy · · Score: 1
      Speaking of which, is there any filesystem around that "automagically" detects redundancy and avoids storing the same data twice (i.e. two files with the same content end up being stored only once)?

      I'm pretty sure The HFS+ Volume Format has that ability but it's just not used in the file manager.

      HFS+ is a very cool volume format but it's pretty much abstracted from file management which, incidentally, makes it a perfect format for something like Mac OS X because of the way OS X, Classic, and the Unix side work.

      --
      Not since Marie-Antoinette played milkmaid has looking simple and honest been so fake and complicated.
    60. Re:Obviously by zaphod123 · · Score: 1


      Hmmm... I just configured mysql to listen on port 25 in my.cnf. It doesn't appear to accept mail very well...

      telnet localhost 25
      helo thesysadmins.com
      Bad handshakeConnection to localhost closed by foreign host.

      Ahh... it must be my clever spam hack, mysql-assassin. Just a little more tweaking and I will have this...

      --
      :q!
    61. Re:Obviously by CharlieHedlin · · Score: 1

      You have a flawed assumption in that the file is read only. Exchange/Outlook will let you modify the attachment in place and keep it in your mailbox.

      Now, it wouldn't be hard for the email server to check the link count and recreate the file as Exchange does, I just wanted to point out the flaw.
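
      The link-count check could look roughly like this in Python. It is a sketch of the idea only, assuming a POSIX filesystem; modify_attachment and the file names are hypothetical, and this is not a description of how Exchange actually does it.

```python
import os
import tempfile


def modify_attachment(path, new_bytes):
    """Overwrite path; if other mailboxes share its inode, break the link first."""
    if os.stat(path).st_nlink > 1:
        # Shared: write a private copy and atomically swap it in, so the
        # other links keep pointing at the untouched original data.
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(new_bytes)
        os.replace(tmp, path)
    else:
        # Not shared: safe to rewrite in place.
        with open(path, "wb") as f:
            f.write(new_bytes)


def demo():
    """Share one file between two users, edit one copy, and inspect both."""
    d = tempfile.mkdtemp()
    alice = os.path.join(d, "alice.doc")
    bob = os.path.join(d, "bob.doc")
    with open(alice, "wb") as f:
        f.write(b"original")
    os.link(alice, bob)
    modify_attachment(bob, b"edited by bob")
    with open(alice, "rb") as f:
        alice_data = f.read()
    with open(bob, "rb") as f:
        bob_data = f.read()
    return alice_data, bob_data, os.stat(alice).st_nlink
```

      After demo(), the first user's copy still holds the original bytes and its link count is back to 1, while the second user's file has the edit.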

    62. Re:Obviously by Jubal+Kessler · · Score: 1

      > The reason Exchange uses a database can be summed up in three words: Single Instance Store.

      Or, on Unix, you could just use hard links for a given message to multiple recipients -- as long as their mailboxes reside on a single partition. (Hard links don't span filesystems.)

    63. Re:Obviously by Anonymous Coward · · Score: 0
      "Recover Exchange Mailbox [msexchange.org]"
      Wtf? You need a sex change to run Microsoft's email crap?
    64. Re:Obviously by theflemsta · · Score: 1

      I wanted to give you props for the sweet signature.

    65. Re:Obviously by Timshel · · Score: 1

      Except that searching the entire mail system for duplicate msgids to hard link to every time a message is received is not exactly a scalable solution!

      Waiting for someone to design a relational, multi-index filesystem for that! :-)

      --
      killall -HUP .sig
    66. Re:Obviously by Anonymous Coward · · Score: 0

      There's an appliance that does this (targeted for back-up storage). http://www.datadomain.com/

    67. Re:Obviously by Anonymous Coward · · Score: 0

      Forget WinFS. Single Instance Store can actually be done in NTFS using the groveler since 2000.

      Install RIS on your server if you don't believe me. VERITAS even has an option for dealing with SIS NTFS volumes.

    68. Re:Obviously by joib · · Score: 4, Informative
    69. Re:Obviously by Fulcrum+of+Evil · · Score: 2, Interesting

      Exchange/Outlook will let you modify the attachment in place and keep it in your mailbox.

      Are you saying that I can send a file to 100 people, then edit it after I send it and leave the 100 people with no audit trail? That's horrible!

      --
      "We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
    70. Re:Obviously by AvitarX · · Score: 1

      I would gladly pay for Gmail accounts.

      2+GB, webmail as fast as a local app (on broadband), good searching, POP/SMTP access.

      Worth at least a few dollars a month per account if there are no ads in the POP messages (I don't know if there are anyway).

      --
      Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
    71. Re:Obviously by Soko · · Score: 1

      Moving the existing data over will be a huge pain no matter what you migrate to though.

      This is the entire reason I'm killing Exchange where I work. Moving to anything else from Exchange is way too painful to even consider again.

      Soko

      --
      "Depression is merely anger without enthusiasm." - Anonymous
    72. Re:Obviously by GoRK · · Score: 1

      While your point is well taken and I generally agree with you, I personally run into big problems using Maildirs and the filesystem as my email storage medium.

      Of course I have nearly 1,000,000 e-mails in my IMAP store and over 100,000 in a couple of mailboxes. While the filesystem does generally provide a good means of storing these, I have had to muck a lot with the filesystem itself to get low overhead and decent performance. It's currently running on Reiser v4, which is ironically almost as close to a relational database as you can get in a free Linux filesystem right now. Trying to index or search these messages is still a huge pain in the ass though, and downloading headers, checking for new mail, etc. takes unnecessarily long due to the way the IMAP daemon works with the files (it can't just look things up in an index or something).

    73. Re:Obviously by Electrum · · Score: 1

      When there are no remaining links to a file, it is allowed to be deleted.

      A file is deleted when the link count is 0 and no process has it open. You can delete (unlink) a file in UNIX and not affect any processes that are currently using it.
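
      You can watch this happen from Python on a POSIX system (this will not behave the same way on Windows). The demo function below is just an illustration with a scratch temp file.

```python
import os
import tempfile


def demo():
    """Unlink a file while it is still open and read it through the descriptor."""
    fd, path = tempfile.mkstemp()
    os.write(fd, b"still here")
    os.unlink(path)                 # link count drops to 0, name is gone
    name_exists = os.path.exists(path)
    os.lseek(fd, 0, os.SEEK_SET)    # rewind the still-open descriptor
    data = os.read(fd, 100)         # the data is still readable
    os.close(fd)                    # only now can the kernel free the blocks
    return name_exists, data
```

      The name disappears from the directory immediately, but the open descriptor keeps the data alive until it is closed.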

    74. Re:Obviously by Electrum · · Score: 2, Informative

      Please point to a mail system that actually uses the file system to hard link 1000 emails like the grandparent proposed.

      http://asg.web.cmu.edu/cyrus/download/imapd/overview.html#singleinstance
      http://doc.powerdns.com/powermail/indepth.html#AEN824

    75. Re:Obviously by Anonymous Coward · · Score: 0

      one big question is whether the company/organisation (and this is one big-ass organisation, considering the number of mailboxes needed) is going to be happy with their email/comms being handled by an external company. all the guys taking the parent post as something other than a (funny - i laughed) joke obviously don't see it this way. (not to mention ads)

      it'd be interesting if google could ship out some kind of "product" that "did" email. it could be tricky to do if the gmail system is very distributed over their gigantic google farm...

    76. Re:Obviously by surprise_audit · · Score: 1

      According to this page, the Navy is getting Outlook and Outlook Web Access, so I'm betting there's an Exchange server behind the scenes. Maybe that's part of the problem??

    77. Re:Obviously by vginders · · Score: 1

      FYI, Exchange 2003 Service Pack 2 will allow for a 75GB store.

      --

      Serge
    78. Re:Obviously by mpe · · Score: 1

      You have a flawed assumption in that the file is read only. Exchange/Outlook will let you modify the attachment in place and keep it in your mailbox.

      Which is going to lead to all sorts of "fun" if several users try to alter the file. Not just in terms of file handling, but also in terms of people not realising that something they have read is now changed.

    79. Re:Obviously by SnowZero · · Score: 1

      Except that searching the entire mail system for duplicate msgids to hard link to every time a message is received is not exactly a scalable solution!

      You can do this efficiently by naming objects by their content hash, as in git.
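
      Roughly like this. A minimal sketch, assuming a POSIX filesystem with hard links and SHA-256 as the content hash; the `deliver` helper, the directory layout and the user names are all made up for illustration:

```python
import hashlib
import os
import tempfile

def deliver(store_dir, mailbox_dir, message):
    """Store the message once under its SHA-256 name, then hard-link it into a mailbox."""
    digest = hashlib.sha256(message).hexdigest()
    blob = os.path.join(store_dir, digest)
    if not os.path.exists(blob):            # first copy: actually write the bytes
        with open(blob, "wb") as f:
            f.write(message)
    os.link(blob, os.path.join(mailbox_dir, digest))  # every delivery: just a link
    return blob

root = tempfile.mkdtemp()
store = os.path.join(root, "store")
os.mkdir(store)

boxes = []
for user in ("alice", "bob", "carol"):      # hypothetical recipients
    box = os.path.join(root, user)
    os.mkdir(box)
    boxes.append(box)

msg = b"Subject: report\n\n(imagine a 1MB attachment here)\n"
for box in boxes:
    blob = deliver(store, box, msg)

link_count = os.stat(blob).st_nlink         # the store name plus one link per mailbox
stored_objects = len(os.listdir(store))
print(link_count, stored_objects)
```

      No searching for duplicates is needed: the hash of the incoming message is the lookup key. A real server would also have to handle two deliveries racing on the same blob, which this sketch ignores.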

    80. Re:Obviously by chrisd · · Score: 2, Funny
      There are worse ideas.

      /me runs...

      --
      Co-Editor, Open Sources
      Open Source Program Manager, Google, Inc.
    81. Re:Obviously by raynet · · Score: 4, Interesting

      Plan 9 OS has a filesystem that does just this. I think it was called Venti. Basically it hashes the datablocks on the filesystem and only stores each unique block once. There was (is?) a project where the filesystem was being ported to Linux.
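
      The block-level idea can be sketched in a few lines. This is only an illustration of hash-addressed block storage, not Venti's actual code or on-disk format; the `BlockStore` class, the 4096-byte block size and the choice of SHA-1 are assumptions for the example:

```python
import hashlib

BLOCK_SIZE = 4096  # arbitrary block size for the example

class BlockStore:
    """Toy in-memory store that keeps each unique block exactly once."""

    def __init__(self):
        self.blocks = {}   # hex digest -> block bytes

    def put(self, data):
        """Split data into blocks and return the list of block addresses."""
        addresses = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            addr = hashlib.sha1(block).hexdigest()
            self.blocks.setdefault(addr, block)   # already present: store nothing new
            addresses.append(addr)
        return addresses

    def get(self, addresses):
        """Reassemble the original data from its block addresses."""
        return b"".join(self.blocks[a] for a in addresses)

store = BlockStore()
file_a = b"A" * 8192 + b"B" * 4096
file_b = b"A" * 4096 + b"C" * 4096
addrs_a = store.put(file_a)
addrs_b = store.put(file_b)
unique_blocks = len(store.blocks)
print(len(addrs_a), len(addrs_b), unique_blocks)
```

      The two files reference five blocks between them, but the all-"A" block is shared, so only three blocks end up stored.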

      --
      - Raynet --> .
    82. Re:Obviously by minion · · Score: 1

      Say you send one 1MB Word document to 100 of your colleagues. In a relational database-based, Single Instance Store-driven mail server, that document takes up exactly 1MB on the server. If somebody in the organization forwards the Word doc to the remaining 900 people in your organization, how much space does it take on the server? 1MB
       
      Sounds like Cyrus IMAP to me.

      --

      -- If we don't stand up for our rights, now, there will be no right to stand up for them later.
    83. Re:Obviously by misio413 · · Score: 1

      Ever tried opening a 2GB mailbox with pine? When you have a 2GB mailbox, with messages up to 100MB each, it really is much better to have them parsed, broken into little pieces, tagged, and put away in a database.

    84. Re:Obviously by sum1 · · Score: 0, Redundant

      dude... you rule...

      batman beyond was definitely the best one.

    85. Re:Obviously by anandp · · Score: 1

      Sometimes relational databases are overused in email systems (very silly to put your JPG attachments inside the DB). But databases are faster than doing readdir(3) and read(2) all the time, irrespective of how much the buffer cache in the kernel helps, especially when you are designing to scale and trying to cram as many busy users as you can into a box. Even cyrus-imap maintains an index file, as does Gnus/nnml, both of which can be file per message. These systems (not just the two examples) cannot afford to walk directories and open a lot of files when you want to list your inbox. Some folks (including us) choose to put this index in a relational database for various reasons: they're doing groupware and a database is so much easier (or required), they don't want to reinvent sorting and indexing, they want to formalize their data structures, and they don't want to write new tools if the schema has to change for some reason.

      Leaving the message on the filesystem, but keeping message metadata in the database (of some kind, relational or otherwise) is the way to go. "Email don't need no databases" is not a feasible option when writing mail servers.
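
      As a minimal sketch of that split, assuming SQLite for the metadata and plain files for the message bodies (the `message_index` table, its columns and the `store` helper are hypothetical, not taken from any real mail server):

```python
import os
import sqlite3
import tempfile
from email import message_from_bytes

spool = tempfile.mkdtemp()          # message bodies live here as plain files
db = sqlite3.connect(":memory:")    # metadata lives in the database
db.execute("""CREATE TABLE message_index (
                  path TEXT PRIMARY KEY,
                  mailbox TEXT,
                  sender TEXT,
                  subject TEXT,
                  size INTEGER)""")
db.execute("CREATE INDEX by_mailbox ON message_index (mailbox)")

def store(mailbox, name, raw):
    """Write the raw message to disk and record its headers in the index."""
    path = os.path.join(spool, name)
    with open(path, "wb") as f:
        f.write(raw)
    headers = message_from_bytes(raw)
    db.execute("INSERT INTO message_index VALUES (?, ?, ?, ?, ?)",
               (path, mailbox, headers["From"], headers["Subject"], len(raw)))

store("INBOX", "1.eml", b"From: jon@example.com\nSubject: hi\n\nbody one\n")
store("INBOX", "2.eml", b"From: boss@example.com\nSubject: status\n\nbody two\n")

# Listing the inbox touches only the index, never the message files.
listing = db.execute(
    "SELECT sender, subject FROM message_index WHERE mailbox = ? ORDER BY path",
    ("INBOX",)).fetchall()
print(listing)
```

      Opening a message still means opening one file, but listing a folder is a single indexed query instead of a directory walk.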

    86. Re:Obviously by Anonymous Coward · · Score: 0

      There's enough comments on the rest, so I'll take a stab at the last bit. If all you want from the table is to find out if a userid has any unread messages, instead of select *'ing the whole fscking table, just do a select userid,read from messages where userid=someone and read=something. It'll make it more readable to someone who has to deal with the damned thing later, and it'll execute a bit faster. Which matters if you start to have tons of those things going at once.
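
      Something along these lines, as a sketch only. The schema is made up (the flag column is called `is_read` here), SQLite is standing in for whatever database is really in use, and the `LIMIT 1` plus the index go a step beyond what is described above:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE messages (
                  id INTEGER PRIMARY KEY,
                  userid TEXT,
                  is_read INTEGER,
                  body TEXT)""")
# An index on the two columns in the WHERE clause lets the lookup skip the bodies.
db.execute("CREATE INDEX messages_user_read ON messages (userid, is_read)")
db.executemany(
    "INSERT INTO messages (userid, is_read, body) VALUES (?, ?, ?)",
    [("someone", 1, "old news"), ("someone", 0, "new mail"), ("other", 1, "read")])

def has_unread(userid):
    """Ask only for what is needed: does at least one unread row exist?"""
    row = db.execute(
        "SELECT userid, is_read FROM messages "
        "WHERE userid = ? AND is_read = 0 LIMIT 1",
        (userid,)).fetchone()
    return row is not None

print(has_unread("someone"), has_unread("other"))
```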

    87. Re:Obviously by Anonymous Coward · · Score: 0

      what about the sentence "Microsoft software is never running smoothly"?

    88. Re:Obviously by tacocat · · Score: 1

      Yup! Pretty cute.

      I think this is a pretty goofy place to start working on a project like that. But having a requirement of 1 million users isn't the issue. I can do that with one sub optimum pc. How many email messages are you running every day?

      Considering that email is a non-realtime process you could conceivably do this on one machine with a secondary MX server if you were willing to wait for a period of time, but how long is too long? Every minute will cost you a lot of money, especially as you approach zero.

      It's easy to say the hardware should be SCSI Raid and use multi-processor CPU's. Clustering sounds like the better solution but I personally don't know anything about it.

      99.9% uptime is more interesting: That's 8.76 hours per year of unavailable system or 43 minutes a month. That would allow you one systemic upgrade/reboot per month. I would consider this to be a bigger issue if you were also expected to manage security patches and other upgrades in a timely manner and yet only have 43 minutes a month to apply them. Given you have a million accounts it might be difficult to have the mail system restart itself in a short enough timeframe to allow you a patch a week.
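
      The downtime arithmetic for a given availability target is easy to check. A small sketch (the `downtime_budget` helper is just for illustration, and it assumes a 365-day year split into 12 equal months or 52 weeks):

```python
HOURS_PER_YEAR = 365 * 24  # 8760

def downtime_budget(availability):
    """Return the allowed downtime as (hours per year, minutes per month, minutes per week)."""
    per_year_hours = (1 - availability) * HOURS_PER_YEAR
    per_month_minutes = per_year_hours * 60 / 12
    per_week_minutes = per_year_hours * 60 / 52
    return per_year_hours, per_month_minutes, per_week_minutes

year_h, month_min, week_min = downtime_budget(0.999)
print(f"{year_h:.2f} h/year, {month_min:.1f} min/month, {week_min:.1f} min/week")
```

      For 99.9% that is about 8.76 hours a year, just under 44 minutes a month, and only about 10 minutes a week, which is what makes a weekly patch-and-restart cycle so tight.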

      And Scale Perfectly? What are you? Marketing or Management? Is there such a thing as "To Scale Perfectly"?

    89. Re:Obviously by caluml · · Score: 1

      Hmmm. 1 million gmail accounts, and a corporate PGP keyserver so that all corporate email is encrypted....
      Me strokes chin...

    90. Re:Obviously by ramblin+billy · · Score: 1


      SOLUTION:
      Scalability.....virtually unlimited
      Ease of Use.....simple
      Support.....free practically 24/7/365
      Platform.....any
      Cost.....zero

      IMPLEMENTATION:
      First, everybody open a Yahoo account...

      billy - Groups....anyone?

    91. Re:Obviously by Anonymous Coward · · Score: 0

      Really smart, because with ONE MILLION people using their email, which probably means on the order of a BILLION individual mails (insert movie reference here), hash collisions are unheard of...

      That's probably why the poster even asked, because it starts to take different design approaches to scale to millions of users. Certain things scale very easily, such as outgoing queues, but other things such as storage do not.

    92. Re:Obviously by Anonymous Coward · · Score: 0

      So I suppose you'd use POP3 clients to access gmail with PGP conveniently.
      Now the question is, how would you track outgoing email?

      ps: I think google will provide Gmail as a service for companies - but only running off Google's internet servers.

    93. Re:Obviously by ataddei · · Score: 1

      There are 3 types of message store backends:
      - Those with a database
      - Those with a file system
      - Those with a mix

      A store with a database backend is NOT a condition for having a single copy message store. Look for example at the Cyrus daemon, Mirapoint, Critical Path and Sun ... Still they 'of course' do support single copy message store. In addition, Sun's message store supports single APPEND store, which is of KEY help when performing migrations from MS Exchange, for example!

    94. Re:Obviously by sco08y · · Score: 2, Interesting

      I am so tired of people shoving everything into relational databases.

      What relational DBMSs? All I've heard discussed are SQL products.

      The filesystem is really really efficient (for e-mail) and really really reliable.

      I'm tired of everyone shoveling everything into a filesystem.

      How are you going to run queries against your contacts? Or your appointments?

      How does a filesystem guarantee referential integrity? Can a filesystem guarantee an appointment doesn't exist for a bogus contact?

      *Any* kind of integrity? Can a filesystem guarantee that a message is well formed?

    95. Re:Obviously by superiority · · Score: 1

      I'd start by doing the exact opposite of what any /.er suggests.


      Unless it is the reply to that suggestion.

    96. Re:Obviously by Klanglor · · Score: 1

      or wait until they come up with a gmail blade.
      so you can plug it right below your google blade.

    97. Re:Obviously by OAB_X · · Score: 1

      No wonder Outlooks' searches are so slow!

    98. Re:Obviously by asciiRider · · Score: 1

      The 400 Wintel servers we run at 3 hospitals beg to differ.

    99. Re:Obviously by QuietLagoon · · Score: 2, Insightful
      Moving the existing data over will be a huge pain no matter what you migrate to though.

      Yup, that's a big problem with Microsoft Exchange's proprietary datastore.

      Like the roach motel, data goes in, but you can't get it out.

    100. Re:Obviously by Wdomburg · · Score: 1

      It's less taxing in terms of memory, but having the metadata for each message in individual files means you have to open each file to extract. For some common operations, like bringing up a message index, this means a ton of disk I/O.

      For a small scale server this doesn't really show up, but as you add users the disk contention will kill your performance. You can offset this by adding more memory to the machine, so frequently accessed mailboxes are likely to be pulled into memory, but then "doesn't use any userspace memory" seems kind of an empty benefit.

      Pure file-based message formats are great for smaller installations, but the larger one scales the more benefit one can reap from some form of intelligent indexing. Relational databases are overkill for this sort of thing, but they have the benefit of being well understood, widely deployed, allow for easy manipulation with existing tools, and have established intelligent backup methods.

    101. Re:Obviously by Procrastin8er · · Score: 0

      Just because you "just did it last week" doesn't mean there aren't alternative ways of doing it.

      --
      Slashdot - Where the slash is most definitely to the left.
    102. Re:Obviously by Anonymous Coward · · Score: 0

      Am I missing something here? Why is this funny?

    103. Re:Obviously by Anonymous Coward · · Score: 0

      The mods are on crack, the meta-mods are on pot

      Re:Obviously (Score:2, Insightful)

      Insightful

      adj : exhibiting insight or clear and deep perception; "an insightful parent"; "the chapter is insightful and suggestive of new perspectives"-R.C.Angell

      Mod on crack, someone metamod as fair so they can be on pot, but why is dictionary.com saying "exhibiting insight" is the definition of insightful? They must be on acid (and I mean the car battery kind)

    104. Re:Obviously by sammy+baby · · Score: 1
      Say you send one 1MB Word document to 100 of your colleagues. In a relational database-based, Single Instance Store-driven mail server, that document takes up exactly 1MB on the server. If somebody in the organization forwards the Word doc to the remaining 900 people in your organization, how much space does it take on the server? 1MB.

      Well... technically, there's going to be a little bit of space taken up by the mailbox-to-message relationship. But otherwise, I'm with you 100%.

      I wonder, though, if there isn't a good way of hybridizing the two systems. If you used a maildir setup, but only stored a reference to a message stored elsewhere on the system in the actual maildir, you could get the single instance store, but get to keep it in the filesystem in a format only marginally more complicated to manage... ...maybe. At the very least, it'd be a good way to burn up some spare inodes you have lying around.
    105. Re:Obviously by sammy+baby · · Score: 1

      And now I see downthread that this is basically how Cyrus works. Doh.

    106. Re:Obviously by Anonymous Coward · · Score: 0

      Obviously it's a masturbation joke.

    107. Re:Obviously by Anonymous Coward · · Score: 0

      grep --mmap

      maybe that will speed things up a bit :)

    108. Re:Obviously by CkB_Cowboy · · Score: 0

      You totally can, as long as you negate one of the phrases or lay the irony on super-thick. For example:

      Our linux/apache web server, you know - the one that isn't running a Microsoft OS - it's running smoothly.

      or

      This Microsoft XP workstation is not running smoothly. (very common)

      or

      Yeah, SURE.. Microsoft Office 2004 is running SO smoothly on my Mac. It's SO smooth, you know - the way Entourage crashes EVERY time I do a search? That's DELIGHTFUL! Don't even get me started on how MUCH I LOVE that it polls the network without my permission to see if other people are running other copies. It's SO GREAT!

      --
      what, what?
    109. Re:Obviously by Anonymous Coward · · Score: 0

      Siemens is probably larger than Wal*Mart -- they do cover six continents.

      -- Catonic

    110. Re:Obviously by smbarbour · · Score: 0

      Sure you can. Just to prove the point:

      The server that used to run Microsoft Exchange, now has liquid coolant running smoothly out of the reservoir.

      See "Microsoft" and "running smoothly" in the same sentence.

    111. Re:Obviously by SBrickWork · · Score: 1

      actually, NTFS can handle hard links internally... you just don't ever have an interface to do it (learn to love the console!).

      Anyway that said, when installing RIS (a PXE boot server for windows installations), it installs a service called "Groveler" which does exactly what you're asking... it searches through all contents and automagically converts duplicate files into SIS and NTFS-style links.

      Unfortunately, i've yet to find any registry settings/etc in which i can redirect the service to ANY other location.

      -Scott

      food for thought: Why did MS make "shortcuts" when NTFS has built-in support for true links (aside from FAT16/32 compatibility)

    112. Re:Obviously by Anonymous Coward · · Score: 0

      Interesting +1
      Informative +1
      Not Linux -10,000,000,000

    113. Re:Obviously by StillNeedMoreCoffee · · Score: 1

      But then if it really is the Armed services, I would think they would want several other features not typical of commercial email systems, like the ability to monitor mail for terrorist and treason monitoring, to have secure mail for classified communications with mappings of the 20 some clearance levels to email classification levels, and possibly the ability to purge information from the system, but that last requirement would have been added since they found that Corporate Shredding was defendable if there was a policy in place for it.

      If it is government there are so many other requirements we usually don't think about for a system like this.

    114. Re:Obviously by cenobyte40k · · Score: 1

      Thank you. This is exactly why I don't understand the use of IBM Notes (which we switched to from Exchange last year). Now we only get 1 meg attachments (compared to some size beyond what I would send when we used Exchange). Databases have their place, and that includes mail. I am sure you could do all this with file system stuff as well, but why would you, given that you would need to put together a DB to keep track of what is where and who would need to get to it in the file system.

    115. Re:Obviously by Tuna_Shooter · · Score: 4, Insightful

      One BIG issue between what people are running now and what they will HAVE to run soon is the little item of SOX compliance. Be VERY careful that your little million user mail system is compliant or the implementation costs will double. Believe me, I do this for a living and just saw one of our financial clients get stung big time.

      --
      *--- Sometimes a majority only means that all the fools are on the same side. ---*
    116. Re:Obviously by lgw · · Score: 2, Insightful

      Veritas and Legato couldn't bend over enough for a million users.

      I'm pretty sure that both Veritas and Legato can scale to a million Exchange mailboxes, but as it happens Walmart used Tivoli (which should scale that large as well, given its mainframe background). It's strange that they didn't have Exchange backups with a high-end backup product in place corporately - but I know next to nothing about Tivoli. Was Walmart just being cheap?

      --
      Socialism: a lie told by totalitarians and believed by fools.
    117. Re:Obviously by MadAhab · · Score: 1
      So... you need a relational database system because Pine sucks at handling huge mailboxes? Wow, what a great reason.

      You know, there are mail clients that index mailboxes. You know, parse, break out key data into little pieces, put it away in a little database file.

      And anyone designing a mail system for 1,000,000 users who actually lets 100MB messages through needs to be put down like a foamy dog.

      But go ahead, design a system for 1,000,000 users of Pine. That makes sense.

      --
      Expanding a vast wasteland since 1996.
    118. Re:Obviously by MadAhab · · Score: 1
      I'm tired of everyone shoveling everything into a filesystem.

      Well, when you've made the filesystem obsolete, you just let us know.

      --
      Expanding a vast wasteland since 1996.
    119. Re:Obviously by EvilNebby · · Score: 1

      NMCI Is considered total crap for those of us outside of Asia as well. At least EDS is operating at a loss as a result of it.

      --
      --- Nebulous
    120. Re:Obviously by dimator · · Score: 1

      Do you have a link for this? It sounds interesting.

      --
      python -c "x='python -c %sx=%s; print x%%(chr(34),repr(x),chr(34))%s'; print x%(chr(34),repr(x),chr(34))"
    121. Re:Obviously by Anonymous Coward · · Score: 0

      Yes, I have. This is exactly why I started using email clients that aren't total piles of shit. Pine is so slow because it read()s in the entire mailbox and parses it every time. Any decent mail client will keep an index of the mailbox file so it can quickly see what messages are where, and then mmap() the mailbox file.
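
      The index-plus-mmap approach looks roughly like this. A simplified sketch: it treats every line starting with "From " as a message boundary (real mbox parsing is pickier), and the `build_index` and `fetch` helpers are made up for the example:

```python
import mmap
import os
import tempfile

# Build a tiny mbox-style file: each message starts with a "From " separator line.
fd, path = tempfile.mkstemp()
os.write(fd, b"From a@example.com Mon Jan  1 00:00:00 2005\nSubject: one\n\nfirst\n\n"
             b"From b@example.com Mon Jan  1 00:01:00 2005\nSubject: two\n\nsecond\n\n")
os.close(fd)

def build_index(mbox_path):
    """One sequential pass: record the byte offset where each message starts."""
    offsets = []
    pos = 0
    with open(mbox_path, "rb") as f:
        for line in f:
            if line.startswith(b"From "):
                offsets.append(pos)
            pos += len(line)
    return offsets

def fetch(mm, offsets, n):
    """Slice message n straight out of the mapping without reading the rest."""
    start = offsets[n]
    end = offsets[n + 1] if n + 1 < len(offsets) else len(mm)
    return mm[start:end]

offsets = build_index(path)   # a real client would cache this on disk
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    second = fetch(mm, offsets, 1)
    mm.close()
os.unlink(path)
print(offsets, second)
```

      The expensive full scan happens once, when the index is built; after that, opening any one message only touches the pages that message lives on.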

    122. Re:Obviously by AK+Marc · · Score: 1

      Build an ENTIRE OTHER SERVER and do a normal, full restore of the entire mail store so you can extract one measly mailbox.

      If you have a million accounts and uptime guarantees, then you will have more than one server sitting around that you can do this on. With proper regular backups with a good program, it may take a day to get the info, but it will only take you a few minutes of time for the button pushing to start the restore from backup. In a proper environment, the only constraint is if the person demanding them doesn't want to wait a few hours.

      But the real question is, why aren't you using a 3rd party tool anyway? Use BackupExec and all your problems are solved. Exchange servers should have no file shares, even if it annoys the admins (no need for file virus scans which would impact performance, or put one on and exclude every exchange folder from scans), they should have a 3rd party email virus scanner (even if there is one on the gateway and every client PC), and they should have a 3rd party backup utility. All other non-necessary services should be stopped. Unless there is some server monitoring software necessary for the environment or some really good reason, nothing else should be installed on the server. Yes, it takes 3rd party tools to properly use an Exchange server, but you just have to factor that into the cost of Exchange.

    123. Re:Obviously by Anonymous Coward · · Score: 0

      Does WalMart actually have close to 1 million email accounts established though? I can imagine them having a large employee base, but are they really providing email accounts to every employee or just the ones who work in management level positions?

      My employer also runs MS Exchange and we have over 100,000 employees (with email). I don't know if they have a standard configuration from Microsoft or if they've had to rig anything up though.

    124. Re:Obviously by bluGill · · Score: 3, Insightful

      No they cannot. Microsoft does not want you backing up mailboxes. You backup mailstores, which are several (hundred - however many will fit on a single disk partition) mailboxes. This works great for disaster recovery, you restore the failed disk.

      It is worthless for a single user who just deleted some important message. You end up building a new Exchange server, and then restoring the entire mailstore, then going into that box and grabbing the one message. Veritas (I presume Legato as well) has an option to go in and grab each message from the mailbox one at a time. However this is slow - 1/5th the speed of a normal backup.

      I work for a company that competes with Veritas and Legato (though we try for much smaller accounts, big enterprises need things we don't provide). We do Exchange backup, and are pretty sure that Veritas is doing it exactly like us. I strongly doubt anyone can scale mailbox level backup to millions of users.

    125. Re:Obviously by cloudmaster · · Score: 1

      A filesystem guarantees file integrity. A mail system guarantees mail integrity. Do one thing and do it well. Or write a plugin for Reiser4. :)

    126. Re:Obviously by CProgrammer98 · · Score: 1

      That humour bypass surgery you had... Was it painful?

      --
      And the people shall be oppressed, every one by another, and every one by his neighbour Isaiah 3:5
    127. Re:Obviously by bhsx · · Score: 1

      And we'll all scream DUPE!

      --
      put the what in the where?
    128. Re:Obviously by Anonymous Coward · · Score: 0

      The Armed Forces already have a system for this and it is known as the Defense Messaging System (DMS). It is available as an add-on to Exchange, Notes and others. Basically we can add additional information on classification and if your hardware token does not have the appropriate information you cannot receive the message.

    129. Re:Obviously by lgw · · Score: 1
      • Bad: mailbox level backups.
      • Worse: restoring the whole store to a dummy server to move over one mailbox or email.
      • Good: restoring individual mailboxes from a normal mailstore backup.
      • Better: restoring individual mailboxes from an Exchange 2003 VSS Snapshot, or backup thereof (no backup window).
      Slow backups and building new exchange servers is so 1999. :)
      --
      Socialism: a lie told by totalitarians and believed by fools.
    130. Re:Obviously by dow · · Score: 1
      Say you send one 1MB Word document to 100 of your colleagues. In a relational database-based, Single Instance Store-driven mail server, that document takes up exactly 1MB on the server. If somebody in the organization forwards the Word doc to the remaining 900 people in your organization


      If somebody forwards a 1Mb word doc to 900 people they are a fucking idiot. Lart.
    131. Re:Obviously by bluGill · · Score: 1

      We have plenty (too many...) customers still using exchange 5.x. 2003 requires active directory (IIRC 2000 does as well?), which many customers see no reason to use. (Remember our customers are not big business where active directory is useful)

    132. Re:Obviously by superflyguy · · Score: 1

      Each server, running an instance of Exchange handling 1 user, should also be able to run another app handling at least 1 more user, so you're guaranteed the ability to handle 2 users per server. And microsoft still supplies tech lack-of-support for half the capacity. Or you can assume that you only need microsoft to set up 1000 servers, because immediately upon installation, you wipe them, install linux, and ignore the capacity of Exchange.

    133. Re:Obviously by jrockway · · Score: 1

      > *Any* kind of integrity? Can a filesystem guarantee that a message is well formed?

      This is the whole point of a maildir. Each message is in its own file. Compare that with mbox, and consider a message like:

      -- cut --
      From: jon
      To: you
      Subject: hi

      Hi, this message is going to ruin your mbox:

      From: your boss
      To: you
      Subject: you're fired

      Please clean out your desk immediately.

      Regards,
      H.G. Pennypacker
      Wealthy Philanthropist
      -- cut --

      When you view this in your mbox-reading program, you'll see two messages, even though only one message actually arrived. Oops.

      The MTAs have various ways of averting this problem, like adding ">" in front of lines that look like headers. But, some won't escape already-escaped lines (like "> From: jon"), and screw up the content when the mbox is viewed. Also, try using mbox over NFS. Bad.

      Maildir guarantees that each message that is received is in its very own file. This means that everything that the sender sent is in the e-mail verbatim, AND that anything else isn't.

      This is probably not what you asked, but yes, the filesystem does do a good job ensuring mail integrity. A database might be able to do that, but not as easily as "cat message > file" does.
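
      Here is a small sketch of the same point using the `mailbox` module from Python's standard library. One assumption to flag: the separator that mbox readers actually look for is a line beginning with "From " (no colon), so that is what the test message embeds in its body; the message text itself is made up:

```python
import mailbox
import os
import tempfile

# One message whose body contains a line that looks like an mbox separator.
raw = (b"From: jon\nTo: you\nSubject: hi\n\n"
       b"Hi, this message is going to ruin your mbox:\n\n"
       b"From your-boss Mon Jan  1 00:00:00 2005\n"
       b"Subject: you're fired\n\n"
       b"Please clean out your desk immediately.\n")

maildir_path = os.path.join(tempfile.mkdtemp(), "Maildir")
box = mailbox.Maildir(maildir_path, create=True)

key = box.add(raw)             # delivered as exactly one file under new/
count = len(box)
stored = box.get_bytes(key)    # read the bytes back from that file

print(count, stored == raw)
```

      One delivery, one file, and the bytes come back unmodified: nothing had to be escaped, because nothing in the body can be mistaken for a message boundary.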

      --
      My other car is first.
    134. Re:Obviously by Yonzie · · Score: 1

      WalMart runs the worlds biggest Exchange install. They and msft are quite proud of it, actually...

      Dude, this is obviously a guy from walmart asking for help to get rid of that exact system... sheesh...

    135. Re:Obviously by doshell · · Score: 1

      Perhaps they don't want to concede all of a sudden that they're copying Unix, so they're trying to do it very slowly? ;) (Vista is supposed to have Unix-like permission bits too, do we have a trend here? :))

      On the serious side, I think compatibility would be a strong enough reason. It seems to drive a lot of Microsoft's decisions (and unfortunately, a lot of the problems with their software too).

      --
      Score: i, Imaginary
    136. Re:Obviously by loyukfai · · Score: 1

      I wonder if it really helps that much, because when people just add a few words and then forward their emails to others, it makes having SIS irrelevant.

      Maybe having a central place (SharePoint or Public Folders, in MS shop jargon?) that does the file sharing and that kind of stuff, leaving just the important messages to be sent through email, would reduce the need for >2GB mailboxes?

    137. Re:Obviously by hlygrail · · Score: 1
      It is worthless for a single user who just deleted some important message. You end up building a new exchange server, and then restoring the entire mailstore, than going into that box and grabbing the one message.

      Wrong.

      Many products exist to restore single Exchange mailboxes (Ontrack's Power Controls being one of them), or even specific content within a mailbox (such as an individual e-mail), without having to build a new Exch server. In fact, Power Controls only needs an Outlook client and its MAPI client installed on a separate workstation, and can restore to the same mailbox, a different or new mailbox, or export to something else, like a .PST file.

      Couple that with high-end back-end storage like a NetApp SAN/iSAN or EMC equivalent that supports snapshots, and you can do this completely behind the scenes. People do it every day...

      [Full disclosure: I work for Network Appliance, and we are a rebranded reseller of Ontrack Power Controls...]

    138. Re:Obviously by crucini · · Score: 1

      Neither a filesystem nor an RDBMS will solve the real problems in building a scalable mail system. However, let me point out a weakness in the filesystem approach. It is a tree-shaped structure, aka a network database. That makes it easy to divide between read and unread. What if I want all messages from billybob@aol.com? Will the fs-based system be reduced to a linear search? Remember, the RDBMS can index on multiple columns.

      Of course, with enough cleverness, you can simulate this in the filesystem with directories full of symlinks. For each sender, create directory "from_$SENDER" with symlinks to messages from that sender. How about showing the index page? Would we open every message to read in the headers, or maintain a separate index file?
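
      The symlink trick can be sketched like this, assuming a POSIX filesystem; the `deliver` helper, the `messages` store directory and the file names are made up for illustration:

```python
import os
import tempfile

root = tempfile.mkdtemp()
store = os.path.join(root, "messages")
os.mkdir(store)

def deliver(name, sender, body):
    """Write the message once, then add a symlink under from_<sender>/."""
    path = os.path.join(store, name)
    with open(path, "w") as f:
        f.write(f"From: {sender}\n\n{body}\n")
    index_dir = os.path.join(root, "from_" + sender)
    os.makedirs(index_dir, exist_ok=True)
    os.symlink(path, os.path.join(index_dir, name))

deliver("1", "billybob@aol.com", "hello")
deliver("2", "alice@example.com", "lunch?")
deliver("3", "billybob@aol.com", "hello again")

# "All messages from billybob" is now a directory listing, not a linear search.
billybob_dir = os.path.join(root, "from_billybob@aol.com")
from_billybob = sorted(os.listdir(billybob_dir))
with open(os.path.join(billybob_dir, "3")) as f:
    third = f.read()
print(from_billybob)
```

      Each extra "column" you want to search on costs another directory of symlinks that has to be kept in sync on every delivery and delete, which is the bookkeeping a database index does for you.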

      Of course, until recently, with ReiserFS, systems that create many small files were frowned upon.

    139. Re:Obviously by markwalling · · Score: 1

      ge's exchange is on the department level. each department has its own subdomain (Power systems for ex is ps.ge.com) with its own exchange server. if someone in power systems was to send an email to someone in capital it goes out of the intranet and back in at the capital head office.

      --
      ...For the beast had been reborn with its strength renewed, and the followers of Mammon cowered in horror.
    140. Re:Obviously by Anonymous Coward · · Score: 0

      It's not a flaw, anymore than copy-on-write (look it up) is a flaw in modern programming languages. If Microsoft implemented Exchange on a database just to get the feature mentioned here, then they are cretins one-and-all. I suspect they did it that way because y'know... relational database and all that... it'll like... make stuff simpler. Instead it's led to years worth of backup headaches and corrupted mail stores that can't be accessed directly -- Exchange in a nutshell.

    141. Re:Obviously by Anonymous Coward · · Score: 0

      Actually it's not the biggest Exchange implementation but it's one of the largest PRIVATE customer Exchange installs.

      As for backup... YES, Microsoft does ship Exchange backup FREE, integrated with Windows, so if you can't back up Exchange you're an idiot. Granted, it's not the most feature-rich if you're talking about mailbox-level recovery, but for that and a company that big you should probably be using snapshots of some sort.

    142. Re:Obviously by rthille · · Score: 1

      Single-instance store is good for small to middling groups that use email to exchange large documents among large percentages of the group. I'd say that's a pathological use of email (but unfortunately common :-). But in larger groups (I'd say that 1M users probably qualifies) the single-instance store doesn't buy you much. This is from analysing real data from real-world email installs. I used to work at Openwave (nee Software.com) where our bottom-end product scaled to 250,000 users on a single server and our top end software was used by the likes of Verizon, AT&T and others in the "carrier class" realm. In that sort of arena, the only mails which are likely to go to more than 50 users are spam, and you want to recognize them and kill them.

      --
      Awesome furniture, accessories and cabinetry in Santa Rosa, CA: http://humanity-home.com/
    143. Re:Obviously by Anonymous Coward · · Score: 0

      The USAF uses Exchange as well. PACAF's system runs stunningly well under Exchange 2003 (migration recently completed). We had the same issue w/ individual mail box retrieval, but then that was one of those things where you just have to tell some stupid Colonel that, sorry sir, but you shouldn't have erased that email completely if you really really wanted it. It's not like the people on top have quotas or anything. As for the Navy's system... well, there's a reason the AF does a decent chunk of the DoD's deployed comm in the sandbox now. ;)

    144. Re:Obviously by SnowZero · · Score: 1

      Collisions resulting from false positives will be exceedingly rare when using a large cryptographic hash. However if you are really worried about that, just compare the object contents after a collision is found. 10^6 is a lot smaller than 2^256 ~= 10^77, so in other words it won't happen in our lifetimes.

      Hashing is really not that advanced an approach, and is commonly applied all over the place in scalable systems. How do you think P2P systems scale the way they do? They are indexing a *lot* of content... millions of users and billions of files isn't that far off for the bigger ones.

      Indexed object storage scales just fine if you're willing to pay for it. Want to scale out to a farm of 256 servers? Use the first 8 bits of the hash to pick the server, and store the object under the remaining bits on that server.
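
      A minimal sketch of that 256-way split, assuming SHA-256 as the hash (the post does not name one). The `place` helper and its return shape are hypothetical, just to show the idea.

```python
import hashlib
from typing import Tuple

NUM_SERVERS = 256  # 2**8, so exactly one byte of the digest selects the server


def place(message: bytes) -> Tuple[int, str]:
    """Return (server index, per-server key) for a stored object."""
    digest = hashlib.sha256(message).digest()
    server = digest[0]            # first 8 bits of the hash: 0..255
    local_key = digest[1:].hex()  # remaining 248 bits, used as the key on that server
    return server, local_key


server, key = place(b"Subject: quarterly report\r\n\r\nSee attachment.")
print(server, key)
```

      Because the server index is derived from the content alone, any front end can find an object without a central lookup table. The trade-off is that changing the number of servers means re-placing objects, which this sketch does not address.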

  2. qmail ? by Anonymous Coward · · Score: 0

    n/t
    Oh FP.

  3. Easy. by Chess_the_cat · · Score: 5, Funny

    gmail.google.com

    --
    Support the First Amendment. Read at -1
    1. Re:Easy. by Anonymous Coward · · Score: 2, Insightful

      Assuming you don't mind Google scanning your internal email archives looking for interesting business information!

    2. Re:Easy. by double-oh+three · · Score: 1

      Though meant as a joke, I wonder what google uses, and can it be bought/hacked together?

      --
      "For years, I struggled with reality... but I'm happy to say I finally won out over it." -- Elwood P. Dowd
    3. Re:Easy. by Anonymous Coward · · Score: 0

      You are missing the point! Hotmail is finally moving to an open source solution!

    4. Re:Easy. by nherm · · Score: 2, Funny

      Obviously this cfsmp3 guy is one of those PhDs that Google hired for creative solutions... so Google asked him how to expand its array of mail servers by one million accounts, and guess what is the cheapest solution that this brilliant cs phd discovered?

      Ask slashdot, of course!

      Nice try, google, you evil overlord...

      /tinfoilhat

    5. Re:Easy. by jp_fielding · · Score: 1

      i've got the million invites to give you, email at....

    6. Re:Easy. by campers · · Score: 1

      I don't think you'd meet your 99.9% uptime requirement with Gmail. I've often found the service is unavailable.

    7. Re:Easy. by killjoe · · Score: 1

      Why not use Google? They provide POP access and web access. Maybe you could build a local LDAP address book that would hand out emails to Gmail addresses.

      Oh, and Gmail also provides an SMTP auth server for you too.

      --
      evil is as evil does
    8. Re:Easy. by ComputerSherpa · · Score: 2, Informative

      I've had GMail go down once or twice in the year I've had my account. Problem might be on your end. Holy frick, I've had my GMail account for a year. And three days. O.O

      --
      Information wants to be anthropomorphized!
    9. Re:Easy. by Anonymous Coward · · Score: 0
      Assuming you don't mind Google scanning your internal email archives looking for interesting business information!

      Whoever modded the parent of this post as "flamebait" is a flocking moron. The post clearly has intelligent reasoning and probably should be modded as "insightful".

      This post, on the other hand, is flamebait but I don't care.

      So bite me

      signed the raving AC

    10. Re:Easy. by nherm · · Score: 1, Funny

      ... this brilliant cs phd...

      So here I was, reading again the same thread I'd just submitted a reply to about one hour ago... and reading my _own_ comment, I wondered about the existence of a post-graduate level in Counter-Strike.

      No more coffee for nherm! /me goes to sleep

    11. Re:Easy. by Anonymous Coward · · Score: 0
      Ah, now we have had an intelligent modder - the post has been modded up as 'insightful'.

      signed the raving AC

    12. Re:Easy. by campers · · Score: 1

      I can pretty much always get to the gmail site, but numerous times I get a 'this service is temporarily unavailable' message

  4. OS? by dakirw · · Score: 1

    On the OS front, I'm assuming that you'd be allowed to use the OS of your choice as part of the design. Is that correct?

  5. go to gmail by Argonne · · Score: 1

    Why not gmail?

    1. Re:go to gmail by Guspaz · · Score: 1

      Or contact Google about licensing GMail. They've done this with their main product (search) with the Google Search Appliance and Google Mini. Google might be willing to allow his company to use GMail to handle their email domain, for the right price.

      If Google themselves were providing the service from their own infrastructure, stability and scaling are already solved, plus it supports webmail and POP3. IMAP is missing, but if one were to secure a deal with Google that'd be a minor concern that could likely be solved.

    2. Re:go to gmail by Anonymous Coward · · Score: 0

      99.99% uptime is why. Have you paid attention to how much Gmail has been down this month?

    3. Re:go to gmail by Chmarr · · Score: 5, Insightful

      Gmail is beta.

      Gmail does not have guaranteed uptime.

      You do not pin your company's communications system on something you cannot sign a SLA agreement with.

      need I go on? :)

    4. Re:go to gmail by Bun · · Score: 1

      Gmail is beta.

      Gmail does not have guaranteed uptime.


      No kidding. I've had TONS of problems with Gmail, ranging from "Oops, try again later", to emails not being delivered for 17 hours +. No way should a company rely on something so flakey.

      --
      "Anyone that has ever gotten an idea based on any of my work and done something better with it-good for you."--J.Carmack
    5. Re:go to gmail by AJWM · · Score: 1

      You do not pin your company's communications system on something you cannot sign a SLA agreement with.

      What, you mean like Exchange?

      --
      -- Alastair
    6. Re:go to gmail by einhverfr · · Score: 1

      You do not pin your company's communications system on something you cannot sign a SLA agreement with.

      What do you bet that Google has no SLAs regarding their Linux servers?

      Seriously, high availability comes first from good design. SLAs are entirely secondary. Yes, PHBs like them and they have some merit, but I will choose Linux/Postfix or Linux/Qmail without an SLA long before I will choose Microsoft Exchange with an SLA.

      My company is starting to build a highly available infrastructure for a very substantial expansion. It will be based entirely on FOSS, and we will not have any SLAs with anyone.

      --

      LedgerSMB: Open source Accounting/ERP
    7. Re:go to gmail by peragrin · · Score: 1

      I think you need to update yourself.

      Gmail went public on the same day they released Google Talk.

      In order to use Google Talk you need a Gmail account.

      Of course your last statement is true. I wouldn't trust it for corporate data. Heck, it gives me headaches sending large files on a regular basis. I did say sending; it doesn't always like uploading them.

      --
      i thought once I was found, but it was only a dream.
    8. Re:go to gmail by chill · · Score: 1

      Gmail is beta. Gmail does not have guaranteed uptime.

      Oh wise prophet, how did you foresee this?

      Server Error

      Gmail is temporarily unavailable. Cross your fingers and try again in a few minutes. We're sorry for the inconvenience.
      :-)

        -Charles

      --
      Learning HOW to think is more important than learning WHAT to think.
    9. Re:go to gmail by mmkkbb · · Score: 1

      huh? google went public months ago.

      --
      -mkb
    10. Re:go to gmail by cecil_turtle · · Score: 1

      It went "public" in that you no longer need to be invited, but that doesn't mean it's out of beta. In fact I have GMail open in another browser tab right now and it still says beta.

      Also, read their terms of use:
      Personal Use. The Service is made available to you for your personal use only.

      So it would be illegal to use it for your company's email anyway. Well, whether it's illegal I don't know, but they could at least sue you / stop you. IANAL.

    11. Re:go to gmail by plazman30 · · Score: 2, Informative

      Not to be a smart ass, but it's not SLA agreement. It's an SLA. SLA stands for service level agreement. SLA agreement would be service level agreement agreement.

    12. Re:go to gmail by plazman30 · · Score: 1

      Just because you can get an SLA does not mean it will be honored. Having worked with Exchange in the past, I would have to say any uptime guarantees are subject to two major factors:

      1. Will you need to pull the plug on the server due to an e-mail worm?
      2. Do you need to deploy a whole Active Directory infrastructure and trust the AD infrastructure not to take a nosedive and hose your e-mail environment?

      Now something you need to remember here is that Exchange itself is not vulnerable to mass e-mail worms. It's the Outlook client that is the issue, so if you eliminate the Outlook client, then Exchange may be an option.

      If all you need is POP3, IMAP, webmail and LDAP address lookups, then I would buy some SuSE Linux boxes and run Novell Netmail, which is actually a very, very nice program that scales quite well. If Novell was still giving out accounts at myrealbox.com, you could see the product in action.

      If you need calendaring, give Novell GroupWise a look. Very slick package with a ton of features you wish Exchange had...

      The other nice thing about Novell is they can provide you with an on-site resource to help support this stuff. We have one at my work. They're called DSEs (Dedicated Support Engineers) and they can bypass a boatload of call queues and actually get you on the horn with a developer if need be.

    13. Re:go to gmail by Plaid+Phantom · · Score: 1

      Oh, sure. Next thing you're gonna tell me is that "ATM machine" is wrong!

      --
      All comments are properties and trademarks of the voices in my head. Not like I'm gonna claim them.
    14. Re:go to gmail by Anonymous Coward · · Score: 0
      Just because you can get an SLA does not mean it will be honored

      True, but it's nice to be able to "throw a bone" to management types when things go wrong.

      If your system got screwed up for a period of time and it doesn't cost the company any money, nobody will really care. They'll blow smoke but it's not likely that firings will occur. If your system throws up and it costs money, you better be able to recover that money somehow. If an SLA guarantees you a certain amount of uptime that your system has not realized, you possibly have a legal ground to recover money that was lost due to outages from the software vendor.

      For example, if your system realizes only 99.8% uptime and your SLA guarantees 99.99% uptime, you could possibly recover money to cover monies lost during the 0.19% that you were guaranteed to be up but weren't. Bosses like that sort of thing. If the system goes down unexpectedly and the bosses come to you to explain the lost money and your excuse is, "nope, no recourse, but I used open source!", you better start dusting off your resume.
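
      For a sense of scale, here is a small Python sketch that turns the gap between guaranteed and realized uptime into minutes. The 30-day month and the function name are assumptions for illustration; a real SLA defines its own measurement period and remedies.

```python
from decimal import Decimal

MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes (assumed billing period)


def shortfall_minutes(guaranteed_pct: str, actual_pct: str,
                      period_minutes: int = MINUTES_PER_30_DAY_MONTH) -> Decimal:
    """Minutes of downtime beyond what the guarantee allowed."""
    gap = Decimal(guaranteed_pct) - Decimal(actual_pct)  # in percentage points
    if gap <= 0:
        return Decimal(0)  # the guarantee was met, nothing to claim
    return gap / Decimal(100) * period_minutes


print(shortfall_minutes("99.99", "99.8"))
```

      With the figures from the post (99.99% guaranteed, 99.8% realized) the gap is 0.19 percentage points, which works out to a little over 82 minutes in a 30-day month.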

    15. Re:go to gmail by James_Aguilar · · Score: 1

      You should go on to the part where you get the joke then you can stop there. =)

    16. Re:go to gmail by 3D+Lover · · Score: 1

      But my server is "Powered by NT Technology". Are you telling me that's wrong as well????

    17. Re:go to gmail by chris_mahan · · Score: 1

      Gmail is written in Java on the back end.

      It may seem trollish, but, let me tell you, the Java VM does some funny things under load. And Gmail is under load 24x7.

      --

      "Piter, too, is dead."

    18. Re:go to gmail by Anonymous Coward · · Score: 0

      how about PIN number? ...it may be incorrect, but it's certainly part of the common vernacular these days, and probably also appears in dictionaries (but i'm too lazy to check).

    19. Re:go to gmail by Marthisdil · · Score: 0

      Nor do you pin your company's communications to some tech in over his head and coming and asking Slashdot for help....

      Please. If he has a clue, he wouldn't be here to begin with.

    20. Re:go to gmail by Loether · · Score: 1

      How about PIN Number???

      --
      TODO create witty sig.
    21. Re:go to gmail by Chmarr · · Score: 1

      My attention span isn't that long :)

    22. Re:go to gmail by plazman30 · · Score: 1

      Since NT doesn't stand for "New Technology" like everyone thinks it does, the only thing wrong with your box is that it runs something based on an NT kernel. Hope you're not the one carrying the pager... :-)

  6. Umm.. by DraKKon · · Score: 0, Redundant

    With an Ask Slashdot Question?

    --
    "It's not like your minds are as open as the source you love..." - Me to the majority of Slashdot.
    1. Re:Umm.. by uofitorn · · Score: 1

      Certainly Mr. Openminded, it can't hurt?

      --
      "What kind of music do pirates listen to?" -Paul Maud'dib
      "Yeeeaaarrrrr n' Bee!!" -Stilgar, Leader of Sietch Tabr
  7. Kerio by Anonymous Coward · · Score: 2, Informative

    I would start by talking to Kerio; their mail server is very scalable. www.kerio.com

    1. Re:Kerio by Anonymous Coward · · Score: 0

      Are you serious? I have never had a worse email experience than when my company switched to Kerio. Exchange is the holy grail compared to Kerio, on the same hardware.

    2. Re:Kerio by epiphani · · Score: 3, Informative

      Or outsource the whole damn thing. There are dozens of providers out there that could drop a rack worth of gear into your datacenter and maintain the whole thing with plenty more experience in handling mail systems of that size. And at that level, I'm sure you'd have no problem getting it branded however you like.

      disclaimer: I work for one of those companies.

      --
      .
    3. Re:Kerio by Anonymous Coward · · Score: 0

      Run, don't walk, from Kerio. Their server software has more bugs than a well-used brothel, their technical support is spotty, and their technical design is questionable at best. (For example, under Kerio's current release, you can only create full backups of your mail store up to three times a week. Backups, as we all know, are highly overrated.)

      Two tin cans and a piece of string would be a better enterprise-class communication system.

    4. Re:Kerio by ocbwilg · · Score: 1

      Or outsource the whole damn thing. There are dozens of providers out there that could drop a rack worth of gear into your datacenter and maintain the whole thing with plenty more experience in handling mail systems of that size. And at that level, I'm sure you'd have no problem getting it branded however you like.

      Nonsense. How many companies do you think are actually out there that have experience with email systems that support over a million users? I used to work for a Fortune 50 company and we only had about 100,000 users to support. The number of mailsystems in the world that currently support 1 million or more is probably quite small (outside of the AOL/Hotmail/Yahoomail group).

      Secondly, do you really think that they're going to design a system and drop a rack into their datacenter to support all of these users? It's not like all 1 million users will be working on the same campus (parking alone would be a nightmare...also the fact that that would make it as large as many cities). Those remote site users are gonna have a headache over slow WAN links.

      Any system of that scale is going to be large. There will probably be a hundred or more servers distributed at company sites around the world. Those would require multiple WAN links between sites. There will probably be multiple SMTP gateways/DMZs for inbound and outbound email. There will have to be lots of redundancy. And putting it all together will require coordinating the efforts of an email team, a sysadmin team, and a networking team.

      If the OP wasn't just yanking someone's chain by asking the question, then I truly feel sorry for his company. He seems to have no idea of the scope of the project that he's tasked with, and posting to Ask Slashdot just underlines the point. He should do himself a favor and do three things:

      1. Contact the Microsoft TAM that is assigned to your current installation. You do have one, right? Because if I had an Exchange installation of that size I would want a TAM that I could go to for assistance. At any rate, explain to the TAM what it is you want to accomplish. Make sure that they understand where Exchange is failing you. See if there are ways that they could address the issues. If they think that your system or architecture is hosed/could be improved, have them put together a project proposal to fix it. Microsoft has consultants for just such an occasion.

      2. Start contacting other consultants with experience in this area. Explain to them what you have, explain the problems, what you want, etc. Have them put together a project proposal to fix it (including migrating you to another mail system).

      3. Sit down, compare notes, and try to decide which solution is the best (considering cost, features, time to implement, etc). It may be that your Exchange installation can be remediated to meet your needs with a few architectural changes (you didn't say why your company was sick of it). It may require an upgrade. It may have to be pitched and rebuilt as something else. But either way you will at least have good options.

      I would be extremely leery of anyone proposing that you pitch the entire Exchange installation without trying to fix it first. If you have 1 million Exchange users, you have already made an investment in the range of $30,000,000 for software licensing. Anyone that cavalierly tells you to throw that away before trying to salvage the situation may not have your best interests in mind.

  8. I'd start by by technoextreme · · Score: 4, Funny

    bashing my head up against a desk.

    --
    Ooo man the floppy drive is broken. No wait. The computer is just upside down.
    1. Re:I'd start by by moranar · · Score: 5, Funny

      Ah! Sendmail!

      --
      "I think it would be a good idea!"
      Gandhi, about Internet Security
    2. Re:I'd start by by Viper233 · · Score: 2

      Ah! Sendmail!
      I think the bashing of the head into my desktop would result in far less pain (and brain damage) than trying to run sendmail....
      Yes, I'm bashing sendmail

    3. Re:I'd start by by jqh1 · · Score: 1

      yeah, yeah, sendmail.

      Spamgourmet ( http://www.spamgourmet.com/ ) uses sendmail for the mail server and is handling 1,741,490 disposable email addresses as I write, on tinker toy hardware.

      It's a pain in the ass to configure, etc., etc., but it works very well and probably has more flight time than any other choice.

      --
      who's moderating the meta-moderators?
    4. Re:I'd start by by McSpace · · Score: 1

      ...looking for another profession.

    5. Re:I'd start by by jdunn14 · · Score: 2, Funny

      Come on, just move the keyboard under your head and bash away.... it'll make a valid sendmail config....

    6. Re:I'd start by by Anonymous Coward · · Score: 0

      I don't see how .NET comes into this discussion??

    7. Re:I'd start by by mrmeval · · Score: 1

      There are 20 shot revolvers. They were made over 100 years ago.

      --
      I'd go on a Vegan diet but the delivery time from Vega is too long. --brownkitty
    8. Re:I'd start by by cloudmaster · · Score: 1

      It's impossible for a human to make a valid sendmail config - but the head bashing will certainly make something that a human can't distinguish from a conf that's valid.

  9. Um... by Stevyn · · Score: 3, Informative

    I'd start by contacting people who know how to do it and can actually help you. A few responses on slashdot aren't going to help you along the entire process. Maybe even bring in a consultant.

    1. Re:Um... by ugo · · Score: 3, Funny

      I think he is the consultant.

    2. Re:Um... by croddy · · Score: 1

      maybe, and maybe not, but they will help the READERS who -- believe it or not -- are often interested in the same answers as the original poster!

    3. Re:Um... by supabeast! · · Score: 1

      I was about to suggest calling IBM. If anybody would know how to do it, it would be them.

    4. Re:Um... by Infernal+Device · · Score: 1

      It would help the readers more if the question was of a nature that the readers could actually implement.

      This question reads like someone just hasn't had the inclination to do their own research and wants a cheap way out of it by asking around.

      --
      "My God...it's full of trolls!"
  10. cyrus by Anonymous Coward · · Score: 1, Interesting

    i believe that cyrus imap was designed specifically for large scalable systems. it can scale to multiple servers and uses a database for hashing the email... (afaik)

  11. qmail by tadauphoenix · · Score: 2, Insightful

    I've always favored it, and with some scripting/automation, I wouldn't see why you couldn't scale that large with inexpensive hardware.

  12. Highly paid consultants or....Ask Slashdot by Phoenixhunter · · Score: 1, Insightful

    I have a feeling you're not going to find the answer you're looking for, as the scale you're talking about is indeed beyond the scope of work that most of us work in.

    1. Re:Highly paid consultants or....Ask Slashdot by thunderbee · · Score: 0, Troll

      LOL. Actually, it's my job :)

      I designed a system, based on FOSS, that could handle this kind of load, scales nicely, is standard and buzzword compliant, elegant, flexible, and is tested (although not on this scale).

      Guess what - I'm not posting the howto here :)

      --
      In my opinion, Scientology is a cult you should avoid.
    2. Re:Highly paid consultants or....Ask Slashdot by Anonymous Coward · · Score: 0

      LOL thats so funny.

      thanks for the USELESS inane post you FUCKING troll cunt.

      god, if you werent a jew..... hail hitler!

    3. Re:Highly paid consultants or....Ask Slashdot by timmarhy · · Score: 1

      Rubbish, you speak for yourself there, mate. The first thing to home in on is the fact that 1 million email accounts can generate a hell of a lot of traffic, so you're either going to need to be on a LAN or have a beefy internet connection (home DSL won't cut it, etc.). Secondly, if it's a 99.9% SLA then you'll need REDUNDANT connections.

      Next is hardware. You'll need caching and secondary DNS servers. Naturally you will have to lash out on high-end RAID cards and decent disks (size determined by how much quota you will give them). Email doesn't really require a massive amount of CPU speed; I reckon you could get away with a single Xeon, to be honest, although it might be wise to allow for future growth and just buy dual-capable boards. The next step is to buy 2 of the same setups, so that when one fails your backup MX record will go to the 2nd server.

      Software: BIND, sendmail, FreeBSD (the latter is just my personal choice). Don't kid yourself, sendmail IS the most flexible and powerful MTA out there. People are going to want all kinds of weird and wonderful setups, and sendmail can give them to you. Antivirus: install a good MTA-based antivirus product.

      --
      If you mod me down, I will become more powerful than you can imagine....
    4. Re:Highly paid consultants or....Ask Slashdot by Anonymous Coward · · Score: 0

      I have done that. With qmail + ldap + cheap storage + cheap frontend.

      I have more than a million accounts, with more than 99.99% availability for over a year.

      If you want, drop me a line at hsg@aconectarse.com and I'll send you all the information.

    5. Re:Highly paid consultants or....Ask Slashdot by Atzanteol · · Score: 1

      Email alone may not take too much CPU, but how about spam and virus filtering? Spamassassin can really chew up the CPU.

      --
      "Ignorance more frequently begets confidence than does knowledge"

      - Charles Darwin
    6. Re:Highly paid consultants or....Ask Slashdot by Anonymous Coward · · Score: 0

      You, sir, have never built a mail system that big, and it shows through quite a bit. Either that, or you're trying to be funny... because, to those of us who have - it's funny.

    7. Re:Highly paid consultants or....Ask Slashdot by PlasticMetal · · Score: 1

      Try http://www.zmailer.org/ for SMTP; it will scale by its design (two processes communicating with each other). It is deployed on a few large free mail sites that I know of.

      --
      Plastic & Metal. Is this sh*t worth livin' 4?
      Is diz sh*t worth dyin' 4?
    8. Re:Highly paid consultants or....Ask Slashdot by Kadin2048 · · Score: 1

      Are you joking?

      I can't tell if this is just sarcastic understatement, or if you actually think that "a single xeon" server (or two of them, with one as a redundant backup) can handle a million corporate email accounts. That's ridiculous.

      I'd wager that the whole thing would fail the first morning you had a tenth of the employees in the corporation all attempt to log in and check their email simultaneously. When he means 1M accounts, in a corporate situation, it's not like Hotmail where 90% of the accounts are just idle and never see any activity. The usage is going to be extreme at some points and nearly zero at others, but users are going to expect the same (excellent) performance constantly.

      The requirements plainly dictate a distributed, highly redundant system with multiple levels of soft fallbacks. It's a major project, not something you're going to run out of your basement on a dual T1.

      --
      "Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
  13. For starters... by cached · · Score: 3, Interesting

    For starters, uptime should usually be higher than 99.9% for this large a site. 99.9% uptime means 40-45 minutes of downtime a month. Try going for 99.99% at least, though this usually increases the cost by about 250% according to what I have seen a few years back.
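
    The 40-45 minute figure follows from simple arithmetic on the length of the month. A minimal Python sketch (the function name is made up for illustration):

```python
from decimal import Decimal


def downtime_budget_minutes(uptime_pct: str, days: int) -> Decimal:
    """Minutes of downtime permitted in a period of `days` days at a given uptime %."""
    allowed_fraction = (Decimal(100) - Decimal(uptime_pct)) / Decimal(100)
    return allowed_fraction * days * 24 * 60


# Compare three nines and four nines over the shortest, typical and longest month.
for pct in ("99.9", "99.99"):
    for days in (28, 30, 31):
        print(pct, days, downtime_budget_minutes(pct, days))
```

    At 99.9% the budget runs from about 40 minutes (28-day month) to about 45 minutes (31-day month); at 99.99% it shrinks to roughly 4 to 4.5 minutes, which gives an idea of why the extra nine costs so much more.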

    --
    +1 funny, -2 overrated. Life isn't fair.
    1. Re:For starters... by Anonymous Coward · · Score: 0

      Hell,

      The last time my corporate email server went down was about 2 years ago and only for 23 minutes!

      In fact, in 7 years with this $#*%$ company, I have never allowed for significant work-hour downtime.

      I must be doing something right!

      I certainly wish I got paid well for this level of success!

    2. Re:For starters... by Anonymous Coward · · Score: 0

      Who needs 4 nines?

      Around here, we're delivering 12 eights. That's like, 3 times better.

  14. for spam... by file+cabinet · · Score: 2, Informative

    take a look here: http://www.webhostingtalk.com/showthread.php?threadid=441925 .. the post by slidey is possibly the most useful.

  15. Easy by nyquil+superstar · · Score: 0, Troll

    Exchange!

    1. Re:Easy by Foofoobar · · Score: 1

      Exchange? What a great idea!

      Sir, I'd like to Exchange this crappy mail server for something decent.

      --
      This is my sig. There are many like it but this one is mine.
    2. Re:Easy by Anonymous Coward · · Score: 2, Insightful

      Why is this marked as funny? It should be marked as informative.

      Unless the person wanted to start an Exchange flame war with his post, he clearly has no idea how to design an enterprise email infrastructure.

      All the technology in the world can't help you if you don't understand what you are doing and based on his broad sweeping question, it would be easy to assume that he doesn't.

      If he is the amateur email administrator that he has made himself out to be, no amount of advice or technology can help him.

      If he can't design the email infrastructure he definitely won't be able to properly implement and manage it either.

      Better leave this kind of work to the professionals.

    3. Re:Easy by superdude72 · · Score: 1

      Resign. You're obviously in way over your head if you have to resort to asking Slashdot readers for advice like this

      Is the Bush administration hiring Heritage Foundation interns to run FEMA's IT Dept., or something?

    4. Re:Easy by Anonymous Coward · · Score: 0

      You know, sometimes when I know EXACTLY how to do something, I play dumb and ask other people. Just to see what they'll say.
      Maybe I will learn something.
      ;]

    5. Re:Easy by JimBobJoe · · Score: 1

      Resign.

      It's great advice. So good, I'm sure you can find a lot of people who have done the same in a similar situation.

      Of course not.

      So many projects in the world have gone bad because people either think they know what they are doing, and don't have the humility to ask questions when they start seeing things go wrong, or they already don't know what they are doing, but are too meek to admit to it.

      At least the questioner has the humility to ask the question. With any luck, he can put together a good system, and become a fountain of advice for the next sys admin who needs to build an email system to deal with a million accounts.

    6. Re:Easy by Anonymous Coward · · Score: 0

      It doesn't work that way in egoland. If you can't blow your own horn loud enough you are not fit for any job.

    7. Re:Easy by Anonymous Coward · · Score: 0

      Wow.. What a dick. He asks around, and you ASSume his experience. Maybe you are way over your head for being mature.

    8. Re:Easy by gunnarstahl · · Score: 0

      This is one of the strangest answers I've read in a long long time.

      If he did _not_ ask while not knowing how to do his job, that might justify telling him to quit.

      But it is not justified to tell this to a person who seeks knowledge. He asks a community of people where at least some of them have a decent knowledge of computers. And I guess /. is not the only addressee for his questions.

      I wish more people would ask instead of hiding their incompetence behind a wall of arrogance and ignorance.

  16. NO GMAIL by Anonymous Coward · · Score: 2, Informative

    I would have to say use Qmail on a FreeBSD/Linux system. If you look at Yahoo, they have millions of email accounts and use qmail, which is very stable and very portable.

    1. Re:NO GMAIL by bani · · Score: 1

      qmail was designed primarily to be secure. it's not designed to be fast.

      yahoo doesn't use qmail anymore. they replaced it with an inhouse-built mta.

    2. Re:NO GMAIL by Alan+Hicks · · Score: 4, Interesting
      I would have to say use Qmail

      My God no! Friends don't let friends use qmail. Want reasons why?

      1) It's a bitch to install. Won't even compile on modern Linux distributions. You have to patch it to compile it and the patch isn't even hosted on qmail's site.
      2) It's a bitch to configure. Rather than parsing a single configuration file, qmail relies heavily on the presence of individual files in a directory.
      3) Not not not not scalable! That's a myth. Doesn't properly batch jobs together. Hell! qmail was originally designed to be run from inetd!
      4) Heavy reliance on other daemontools.
      5) Breaks well-known and understood UNIX standards.
      6) Security through lack-of-functionality.
      7) Not really secure despite the claims.
      8) No longer maintained.
      9) No features. Adding them requires patching, and patching, and more patching.

      Serious sysadmins don't use qmail and for damn good reason. I don't give a damn if Yahoo did manage to string it together and make it work well. In short, qmail isn't particularly suited for deployment in any capacity.

      --
      Slackware, what else when it must be secure, stable, and easy?
    3. Re:NO GMAIL by ¡ · · Score: 0, Flamebait

      Be careful. Because of your post, Dan Bernstein is sure to post a long, detailed rant about how you're obviously retarded because won't listen to him, you hate babies, like to kill kittens, etc. Ugh. Man. Qmail _is_ a pile of shite.

    4. Re:NO GMAIL by Denis+Lemire · · Score: 5, Interesting

      Definitely agree on point 9. I maintain a mail server of over 2,000 users. Currently running qmail with the following patches:

      chkuser-2.0.8b-release.tar.gz
      doublebounce-trim.patch
      netqmail-1.05-tls-20050329.patch
      outgoingip.patch
      qmail-smtpd-auth-0.31.tar.gz
      qmail-smtpd-auth-close3.patch
      qmail-smtpd_gmfcheck.patch
      qmail-spf-rc5.patch

      Most of these patches require hand-editing the sources and Makefiles to successfully merge them all into the stock qmail or netqmail base. Lots of manually reading through *.rej files to make it all work.
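
      To make the *.rej hunt a little less manual, a small sketch like the following could list the reject files that patch leaves behind when a hunk fails to apply. The function name is hypothetical, and it only finds the files; merging the rejected hunks is still hand work:

```python
from pathlib import Path


def find_rejects(source_tree):
    """Return the *.rej files under source_tree, relative to it, sorted."""
    root = Path(source_tree)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.rej"))
```

      Running it after each patch in the series, rather than once at the end, makes it easier to tell which patch produced which reject.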

      In order to simplify new installations I've created my own personal CVS repository for my Qmail sources. I commit changes to the tree whenever a new patch comes out with functionality I need. Hence on a new install I simply check out my custom tree and compile.

      The initial work was a royal pain in the ass, however, once it is all up and running the stability and performance has been excellent.

    5. Re:NO GMAIL by Russ+Nelson · · Score: 1

      So, do you think smtpd-auth should be in netqmail? So far we've only fixed real bugs and compatibility problems. We could start adding features.
      -russ

      --
      Don't piss off The Angry Economist
    6. Re:NO GMAIL by mrjackson2000 · · Score: 1

      John Simpson has a combined patch set for qmail. He has a few of his own patches in the mix as well.
      I've been running a qmail server using his patch set for almost a year and I love it.

    7. Re:NO GMAIL by Anonymous Coward · · Score: 0

      I call bullshit on this. qmail is a very capable server and scales really well. Hotmail use(d) it! And it's the most secure design I've ever seen in any software; no one has managed to make an attack that has any practical significance, ever. The part about patching, patching, patching is true, however. If you are new to it and just want it to work, you are probably better off with Postfix.

    8. Re:NO GMAIL by Anonymous Coward · · Score: 0

      1. emerge qmail
      2. text files; oh yea, that is so tricky
      3. As scalable as your system
      4. Simple and functional
      5. Implements good ideas in Linux well
      6. Security implemented by operating system and system administrator
      7. Crap
      8. Crap
      9. Crap

    9. Re:NO GMAIL by Denis+Lemire · · Score: 1

      I'm not sure, it would be convenient for me. I don't know if every qmail user, however, or every qmail installation would require SMTP auth.

      What would be really nice is if someone could clean up the code base in such a way that you can enable or disable all these features at compile time.

      That however would be a significant effort.

    10. Re:NO GMAIL by Russ+Nelson · · Score: 1

      enable or disable all these features at compile time.

      No, that's the wrong solution. First, because out of respect for djb, we shouldn't be adding features to his code. Second, you really, really, really, oh so painfully, oh the pain, make it stop, NO, NO, not at compile time, NOOOOOOOOOOOOOOOO!!!!!!!!

      It is ineffably painful to have a program which has to be recompiled to change the way it operates. We left that behind, oh, about the time 1GB hard drives appeared, I'd say. Instead, the proper way to add features to qmail, or indeed any program, is to put them into separate programs. If you want smtp-auth, you run qmail-smtpd-auth.

      And if we add smtp-auth to netqmail, it will be in qmail-smtpd-auth. qmail-smtpd will remain as djb wrote it. Ought else will happen over my twitching corpse.
      -russ

      --
      Don't piss off The Angry Economist
    11. Re:NO GMAIL by Denis+Lemire · · Score: 1

      Whoops, looks like I struck a nerve with that last concept! No harm or stress intended. ;)

      Having a separate binary certainly works to an extent, though it would break down at some point, i.e. having qmail-smtpd-auth-spf-goodmailfrom-tls would be a little overboard, methinks.

      I do need to humbly disagree with the concept of adding features to someone else's code being disrespectful or otherwise offensive. While you and/or djb are entitled to believe in this concept (it is his original code, after all), I think this attitude has offended some OSS purists. In contrast, the communities behind most projects have the rationale that many contributors enhancing the feature set, when done correctly, is what allows projects to evolve and become much more valuable.

      The qmail community seems to be less open to direct outside contribution. I think this has held qmail back to an extent.

      All that being said, I'm an experienced admin. I am capable of merging patch-soup into the code base and getting the system running the way I need it to run. There are however many more admins who either do not want to, or are unable to go through such hassle to install an MTA.

      One of the primary reasons I use Open Source software is the ability it gives me to strip out features I don't need and add features I require. I don't care if I have 1 terabyte of storage and 10 gigabytes of RAM. If I can strip 2 MB from my kernel or my qmail binaries I will gladly do so.

  17. 1 Million Users! by joesucks · · Score: 2, Informative

    Wow, that is a pretty huge scale, but Google, MSN and Yahoo have supported that many users and many more all along, so why not look behind the scenes to see what they are doing? If it were me: Linux, obviously, high-availability clusters, and some kind of solid indexing. It's still email :)

  18. POP? by lseltzer · · Score: 4, Funny

    A million users and they want POP3? Add a gun and a single bullet to your administration requirements.

    1. Re:POP? by JoshWurzel · · Score: 5, Funny

      I'd ask for six bullets. Why would you want to risk getting the empty chamber?

    2. Re:POP? by tktk · · Score: 4, Funny

      I'd ask for enough bullets to handle the department that's making you do this.

    3. Re:POP? by Anonymous Coward · · Score: 2, Funny

      I'd ask for 900,000 bullets

    4. Re:POP? by mre5565 · · Score: 5, Interesting
      A million users and they want POP3? Add a gun and a single bullet to your administration requirements.
      No doubt a well-deserved +5 for humor, but for those of us less in the know (and a chance at another +5 for informative), what is so bad about POP3? Thx.
    5. Re:POP? by euxneks · · Score: 3, Funny

      I'd ask for six bullets. Why would you want to risk getting the empty chamber?

      Exactly!

      Remember, redundancy is good! ;P

      --
      in girum imus nocte et consumimur igni
    6. Re:POP? by Mr.+Underbridge · · Score: 5, Funny

      I'd ask for six bullets. Why would you want to risk getting the empty chamber?

      I see that you are familiar with the subtle nuances of Polish Roulette.

    7. Re:POP? by Antique+Geekmeister · · Score: 1

      Use an automatic.

    8. Re:POP? by QuasiEvil · · Score: 2, Insightful

      If my company would only go BACK to POP3, my life would be so much easier. First, we moved from POP3 to IMAP - no big deal, but I don't care for IMAP and the whole remote folder thing. However, it just required me to modify fetchmail to dump it in the mail spool on my linux box, same as always. Then set Windows box with Eudora to leave mail on my linux box for 2 days. Then, I can use Eudora as I want, mail is stored on my Windows box, and I can read it using pine over SSH for 48 hours. Worked great, did everything I needed for five years.

      As of six months ago, we have Exchange/Outlook, and no POP3/IMAP access to the server at all. You're stuck with Outlook or webmail based on how it's configured. After much reconfiguration, I finally got Outlook to behave mostly the way I want - including delivering mail locally rather than leaving it on some server a thousand miles away (literally, not joking here). Now if I didn't hate everything about Outlook...

      All I want, and all I've ever wanted, is to be able to grab my messages easily and put them on my machine, not stored on a server somewhere. POP3 is great for that. It does absolutely everything I want and need for mail, and it's dead simple. Even if you don't make it the standard implementation, it'd be nice if admins everywhere left those of us who know what we're doing the option of using it.

    9. Re:POP? by Anonymous Coward · · Score: 0

      or perhaps, to ask in a different way, what should he be using instead?

    10. Re:POP? by Anonymous Coward · · Score: 0

      First off, how many companies with 1 million employees do you know of? IBM does even come close to 100K. And if your company is even close to the 100K-employee mark and can't run an Exchange environment, you've got bigger problems on your hands.

    11. Re:POP? by elliotCarte · · Score: 1

      I'd ask for six bullets. Why would you want to risk getting the empty chamber?

      Okay, good idea, but I say 3 should do it. You can position the cylinder such that you think the hammer will strike the round in the center (the middle one). If you're off by more than one chamber then you either;
      1. don't know how a revolver works and thus shouldn't be handling one.
      or
      2. weren't really trying, in which case you shouldn't bother going and being all dramatic and stuff. You could accidentally hurt yourself or someone else.

      My $00.02, thanks for playing.

      --
      If you can't just be yourself, then be more like me, ok?
    12. Re:POP? by lukewarmfusion · · Score: 3, Insightful

      I was curious about that, too...

      Wal-mart has an estimated 1.6 million employees. (source)

      General Motors, by contrast, has approximately 360,000 employees.

      The post says "around one million accounts" which is very different from one million employees. I have over ten email accounts that I actively use for receiving mail and four to six for sending.

      An ISP could easily have millions of accounts. But since he said "huge" company, they were using Exchange, and because he's asking Slashdot my guess is that he's not at an ISP. Instead, I'd guess he's at a medium-sized company that might offer email accounts to its customers or at a large company that also contains many subsidiaries (but wants one email domain for all of those).

    13. Re:POP? by billsoxs · · Score: 1
      I'd ask for six bullets. Why would you want to risk getting the empty chamber? I see that you are familiar with the subtle nuances of Polish Roulette

      Is this in case you miss the first 5 times?

      --
      This message was brought to you by "Lack of Sleep."
    14. Re:POP? by sgt_doom · · Score: 2, Funny
      SO! China's Department of Public Security has finally gotten around to developing its own email.

      At last!

    15. Re:POP? by OAB_X · · Score: 1

      My company uses Exchange (which I hate, but live with) and I do not have a PC to call my own (and therefore cannot properly use Outlook). So, I set up Thunderbird to use POP3 (yes, it's possible) and run the USB key drive version. So, if you want Exchange, you do not need to forgo POP3; it is still there, just a setting that needs to be enabled.

    16. Re:POP? by Anonymous Coward · · Score: 0

      As opposed to American roulette where you play with a semi-auto handgun.

      Yeehaw!

    17. Re:POP? by Anonymous Coward · · Score: 0

      One small particle of information...

      Pine can connect to IMAP folders. It can even cache the messages (like most IMAP clients I know of) so the only messages that need to be downloaded are the new ones (just like POP3).

      And you still have the benefit of the messages staying on the server miles away in case you need to use webmail somewhere else.

      That is the huge advantage of IMAP over POP... You can view the messages from anywhere (because the messages stay on the server).

    18. Re:POP? by jschoenberg · · Score: 1

      You should check out the RPC over HTTP feature in Exchange Server 2003. It does exactly what you want with a few other features built in.

      http://www.outlookexchange.com/articles/HenrikWalther/RPC_over_HTTP.asp

    19. Re:POP? by Anonymous Coward · · Score: 0

      Check out cached IMAP. It copies IMAP mail to your local computer and also syncs it. Thunderbird and KMail can both do it.

    20. Re:POP? by Anonymous Coward · · Score: 0

      When did pine add caching? Last year it didn't have caching...

    21. Re:POP? by Anonymous Coward · · Score: 0

      All I want, and all I've ever wanted, is to be able to grab my messages easily and put them on my machine, not stored on a server somewhere. POP3 is great for that. It does absolutely everything I want and need for mail, and it's dead simple. Even if you don't make it the standard implementation, it'd be nice if admins everywhere left those of us who know what we're doing the option of using it.

      That's nice for you, but I'm sure the company wouldn't be happy if your email isn't on their server, as they now don't control backups, nor do they have a copy of emails in case any legal procedures require it (for good or bad I don't care, but in some places it's the law).

      You know what you're doing? The easy response to that is "when was your last backup"? I'm sure most Slashdot users know about the importance of backups, but I'm willing to bet over half of them would lose something invaluable should their system blow up.

      You've mentioned multiple machines, so you do seem to have redundancy, but a fire in your house would still negate that (it'd even kill CD backups and the like). A business (especially one that services emails for thousands/millions of people) should have better protection against fires/floods/whatever.

      The business may not have very good protection at all, but then it's their problem, not yours. You won't be fired or sued for breaching company policies.

      -Steven

    22. Re:POP? by Anonymous Coward · · Score: 0

      I almost wonder if you work for the same company I do...

    23. Re:POP? by Anonymous Coward · · Score: 4, Insightful

      what is so bad about POP3

      Having never been near a computer, I have no idea. If I had to guess, I'd suppose that with a million users, 100,000 of them will have to be constantly reminded to delete their mail off the servers. 25,000 of them won't EVER delete their mail no matter what you do, and 5,000 will bitch and whine when you cap their fucking mailboxes. One of them will be the CEO, and he'll berate you in front of his smarmy suspender-wearing jerkoff golf buddies because you're a dumb hick that can't fit a terabyte of mp3s and porn (most of it redundant for chrissakes) into only 500 gigs of disk. You will also get to deal with countless issues involving different email clients. You would give almost anything to have a massive natural disaster wipe everything out so you didn't have to go to work tomorrow, but there's the wife and kids, so y'know, there it is.

    24. Re:POP? by Cumstien · · Score: 1

      Imagine a ...

    25. Re:POP? by QuasiEvil · · Score: 1

      Actually, there's no policy that I have to leave the mail on their server. They think that's a "feature" and can't understand why I don't want it. My email needs to be where I am to do my job, and often that's in some faraway corner of BFE with lousy connectivity to corporate headquarters (read: 56k frame relay or worse...) or at my house when VPN won't fire up and we have a production emergency.

      As far as my last backup, about 45 minutes ago - runs at 2000h. Both of my machines are backed up nightly (incremental during the week, full on Saturday) to a network tape backup in the server room downstairs, and the weekly full backups are kept offsite by the corporate lan goons and retained for six months. I'm meticulous about my backups, especially since my primary machine is a laptop that goes through hell when I'm on the road. Of course that doesn't work when the laptop is not connected, but 80% of the time I leave it at work overnight. Add to that my occasional (probably twice monthly) burns of all my data to DVD-R, which I have stacked up at home, and you have a pretty robust backup system.

      I won't get fired or sued for losing my email, nor will my company, but my life will suck and I won't have the history or resources to do my job efficiently and effectively. I have the utmost interest in its longevity and functionality, and after the corporate servers have lost my mail several times, I've given up trusting them.

      As for what I do, let's just say I'm an electrical engineer working in scanning system design for the transportation sector.

    26. Re:POP? by level_headed_midwest · · Score: 1

      At least it's not Retard Roulette. A semi-automatic with a round in the clip.

      --
      Just "gittin-r-done," day after day.
    27. Re:POP? by timbo234 · · Score: 1

      I use fetchmail to get my email from various IMAP accounts and it works fine - just like POP3, you don't have to go in for the whole 'remote folders' thing if you don't want to.

      --
      Pre-canned Evolution Links for all those Slashdot holy wars.
    28. Re:POP? by Bake · · Score: 1

      Funny, because I can't remember a single POP3 client that I have used throughout the years that did NOT delete mail off the server after retrieving it, by default.
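
      For anyone who hasn't watched it happen, the download-then-delete default looks roughly like this. It is a sketch, assuming an already logged-in session object that behaves like Python's poplib.POP3; fetch_all is a made-up name, not part of any real client:

```python
def fetch_all(conn, delete_after=True):
    """Download every message from an authenticated POP3 session.

    conn is expected to behave like poplib.POP3 (stat, retr, dele, quit).
    """
    count, _size = conn.stat()
    messages = []
    for number in range(1, count + 1):
        _resp, lines, _octets = conn.retr(number)
        messages.append(b"\r\n".join(lines))
        if delete_after:
            # DELE only marks the message; the server removes it on QUIT.
            conn.dele(number)
    conn.quit()
    return messages
```

      Passing delete_after=False is the "leave mail on server" setting, which is exactly the option that lets mailboxes grow without bound.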

    29. Re:POP? by Skapare · · Score: 1

      Doesn't necessarily need to be just one domain. There could be many domains being handled by one server (or a cluster of servers). All the well known mail servers handle them just fine.

      --
      now we need to go OSS in diesel cars
    30. Re:POP? by eno2001 · · Score: 1

      Hmmm... I hate POP. Users constantly storing their e-mail on their local machine with NO backup or concept of why they should back up if they want it local? No thanks. When you're dealing with users who aren't too clear on where backups happen (at the server), then you don't want their mail on their workstations. IMAP rules. I'd never look back to POP as a suitable mail protocol ever again. I even use IMAP here on my mail server at home. POP? What were you thinking?

      --
      -"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o
    31. Re:POP? by Anonymous Coward · · Score: 0

      ... probably sometime between last year and today, I'm guessing.

    32. Re:POP? by Anonymous Coward · · Score: 0

      I can tell you that GM does not use Exchange. At all. They use something else, with lots of other products upstream to actually deal with the real incoming stuff. It shouldn't be difficult for an enterprising geek to figure out, though.

    33. Re:POP? by shaitand · · Score: 1

      Last I checked 99.9% of all pop clients that have more than 5 users auto-delete mail from the server after a period of time or delete it immediately upon retrieval.

      IMAP on the other hand...

    34. Re:POP? by demachina · · Score: 1

      Well, the obvious yin and yang here is this: if you are a corporation that wants to spy on all your employees' email, and you want to ensure all your email is around for an attorney to twist and use against you in a lawsuit, then yes, I think you should leave it all on a corporate server someplace and back it up really well.

      Personally, I cherish the concept of getting my email off the server at the first opportunity, and I hope whoever is running the server isn't spying on it or backing it up as it comes in. The ideal is a system focused on IMAP where someone left POP enabled and the admin doesn't entirely grok that fact. I also periodically move the old mail on my machine into PGP so no a**shole snake of a lawyer or FBI agent can use it against me some day, at least not without some serious work.

      I guess it just depends on whether you want your email to be transient and yours, or you want it to hang around forever and come back to haunt you like it did Bill Gates in the DOJ antitrust suit. If I ran a big corporation, I think I would make sure all the corporate email got flushed on a regular basis.

      Bottom line: I love POP and wish it transferred my mail even more instantaneously out of the grip of whoever is running the server.

      --
      @de_machina
    35. Re:POP? by Ryan+Amos · · Score: 1

      Better than German roulette. They play that with a Luger.

    36. Re:POP? by Idaho · · Score: 1

      No doubt a well deseved +5 for humor, but for those of us less in the know (and a chance at another +5 for informative), what is so bad about POP3?

      Mainly the fact that it sends your password over the line in plaintext. While this might be acceptable for your home Linux box with 5 users, it is asking for problems (hacked accounts) in such a large setup. From a security perspective it is as bad as using telnet or ftp.

      However, it's a protocol that every mail client supports, so it's not like you have much choice, probably...

      --
      Every expression is true, for a given value of 'true'
    37. Re:POP? by Anonymous Coward · · Score: 0

      Yes, but you have made it quite clear from this post that you aren't the average user. The average user, as has already been stated, is nothing short of a walking disaster when it comes to responsible storage/recovery.

    38. Re:POP? by Anonymous Coward · · Score: 0

      For POP3, I suggest taking a look at http://freshmeat.net/projects/popular/ .
      It's open source, scales very well. Easily handles a million pop accounts.

      However, an email system takes more than just software. It also requires hardware and good systems people.

    39. Re:POP? by mrdogi · · Score: 1
      Having never been near a computer, I have no idea...

      You're one of those people mentioned in the old Apple II manual that could understand the program cassette tapes by listening to them, aren't you?

    40. Re:POP? by se7en11 · · Score: 1
      I wouldn't imagine every Wal-Mart employee having an email address. I would estimate more like only 15% of those do.

      I don't think little Jonny bagging my items needs one. :)

    41. Re:POP? by ibennetch · · Score: 1
      with a million users, 100,000 of them will have to be constantly reminded to delete their mail off the servers. 25,000 of them won't EVER delete their mail no matter what you do, and 5,000 will bitch and whine when you cap their fucking mailboxes
      How does IMAP fix this(1)? Or what is your preferred solution?

      Also, with POP3 you run into the problem of backing up messages...much easier to back up a single server than each person's computer's message store.

      1 - maybe I'm missing a joke or something. Or maybe I don't 'get it'. But seems to me that shouldn't be 'insightful'.
    42. Re:POP? by Anonymous Coward · · Score: 0

      Also with POP3 you run in to the problem of backing up messages...much easier to back up a single server than each person's computer's message store.

      That depends. In a corporate environment you probably already back up all of the users' data anyway. If they have a roaming profile, that's already stored on another server; if it's a Samba server, that's a piece of cake. It's also a very robust 'million cockroaches' approach, in that you don't have "the Exchange server is down so no one can get their email" problems. Not only that, but each user's computer comes with more HD space than they'll ever really need (usually), so that will free up the server quite a bit.

      I'm not saying it'll scale THAT far, but I think POP3 can do quite a bit with some planning.

    43. Re:POP? by eth1 · · Score: 1

      "One of them will be the CEO, and he'll berate you in front of his smarmy suspender-wearing jerkoff golf buddies because you're a dumb hick that can't fit a terabyte of mp3s and porn (most of it redundant for chrissakes) into only 500 gigs of disk."

      And as soon as I get back to my office, I'll^H^H^H^Hhe'll 'accidentally' forward most of that porn to the PR dept. of our chief rivals... :)

    44. Re:POP? by ibennetch · · Score: 1

      Fair enough (and thanks for the reply). I agree that POP3 is a fine solution in many cases. For my personal account I prefer IMAP: I can use mutt on the server, webmail from an internet cafe, and Outlook Express at home, and maintain synchronized folders everywhere. But in a corporate environment I won't disagree with you.

      I don't work in corporate IT, but I do have an account at a rather large national corporation where My Documents is mapped to a network drive and anything that's on one's computer's hard drive is not IT's responsibility. My Documents is backed up because it's on the server, but anything on the local disc isn't their problem. I'm not sure where the Outlook message store is, but the messages stay on the server until deleted (but there's a size and time limit, messages older than 2 or 4 weeks are deleted unless moved to a local folder).

      That's how I tend to look at things; keep things on the server, let the admins back up the server, and if a machine blows up, re-image it and keep on working. Makes things nice if you want to work from someone else's desk, too.

      But yeah, I see how this all can quickly become a hassle no matter what protocol is used or where the messages are stored.

    45. Re:POP? by Anonymous Coward · · Score: 0

      It takes all of 15 seconds to create a personal folder inside of Outlook and change the delivery location...

      Frustration caused by inexperience with Outlook might be a better complaint. :)

    46. Re:POP? by darkone · · Score: 1

      2 Words:

      Relaying Denied.

    47. Re:POP? by kryzx · · Score: 1

      God damn, near-suicidal work-induced bitterness is funny. You are killin me.

      Just helped our CEO migrate to a new laptop. Got Outlook configured and then it started downloading from the server. Every time it did a "send and receive" it was trying to download 1239 messages and would grind for an hour or more. Turned out he had been set up to not delete from the server, so the new installation was trying to download all 39,000 messages, but Outlook must have some limit, so it would only try to grab 1239 at a time. Oh, we had good fun that day.

      --
      "I don't know half of you half as well as I should like, and I like less than half of you half as well as you deserve."
    48. Re:POP? by eno2001 · · Score: 1

      That's why...

      1. You shouldn't use your work e-mail address for private stuff. Everything that goes through your employer's server should be considered suspect
      2. You should run your own mail server and make it accessible via secure means

      I don't run a local mail client at all. I use VPN to get to a workstation back at home and that's where all my e-mail and IM happens. I also run my own IM server so that my friends and I can be sure our communications are secure. It's a simple process really. Running servers at home (if you use open source) isn't that hard.

      --
      -"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o
    49. Re:POP? by elemental23 · · Score: 1

      But isn't IMAP the same? Unless you use (or require!) SSL/TLS, which you can do for POP3 just as easily.
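
      A rough sketch of what requiring TLS could look like from the client side, using Python's standard poplib, imaplib and ssl modules. The helper names and the TLS 1.2 floor are my own assumptions, not anything from this thread:

```python
import imaplib
import poplib
import ssl

# Standard implicit-TLS ports for the two protocols.
TLS_PORTS = {"pop3": 995, "imap": 993}


def make_tls_context():
    """Build a client TLS context that verifies the server certificate."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed policy floor
    return context


def connect(protocol, host):
    """Open an encrypted mailbox connection; login still has to follow."""
    context = make_tls_context()
    if protocol == "pop3":
        return poplib.POP3_SSL(host, TLS_PORTS["pop3"], context=context)
    if protocol == "imap":
        return imaplib.IMAP4_SSL(host, TLS_PORTS["imap"], ssl_context=context)
    raise ValueError(f"unknown protocol: {protocol}")
```

      Either way the password only travels inside the encrypted channel, so the plaintext complaint applies to how the server is configured rather than to POP3 in particular.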

      --
      I like my women like my coffee... pale and bitter.
    50. Re:POP? by demachina · · Score: 1

      "1. You shouldn't use your work e-mail address for private stuff. Everything that goes through your employer's server should be considered suspect"

      Except in the case of lawsuits, what I was saying applies to work email, not personal email, and the FBI investigation could apply to either work or personal.

      Of course, in the case of the FBI or someone who has a warrant, they are going to capture it all at a telecom or ISP office, so you are screwed at that point and should probably just stop using email for anything you don't want the whole world to read. Or maybe PGP-encrypt it, though it's open to debate whether the NSA can crack that, and it's a red flag that you are hiding something.

      --
      @de_machina
    51. Re:POP? by demachina · · Score: 1

      "1. You shouldn't use your work e-mail address for private stuff. Everything that goes through your employer's server should be considered suspect"

      And I forgot to add: the bigger problem is that you can't control what someone malicious or clueless who has your work email sends you. You obviously can't ensure the corporate man doesn't look at it, but leaving it on a mail server and having it backed up ensures it's there forever if the corporate man wants to go on a digging expedition.

      --
      @de_machina
  19. ~ 320K accounts by Anonymous Coward · · Score: 5, Informative

    At IBM we use Lotus Notes which has saved us LOTS of virus hassles. Every employee has an account and we're something like 320,000 worldwide. The mail "databases" are spread among Domino servers but I don't know what platform these run on, or what hardware specs they have. I imagine it's either Windows or Linux... but who knows, maybe we're using some of our PowerPC-based iSeries servers. These are the boxen formerly known as AS/400.

    1. Re:~ 320K accounts by Anonymous Coward · · Score: 0

      You cannot be serious. Lotus Notes is the worst product on the face of the planet.

    2. Re:~ 320K accounts by Anonymous Coward · · Score: 0

      "Blotus Notes"

    3. Re:~ 320K accounts by Anonymous Coward · · Score: 0

      And you have 319,999 users who despise it (speaking as a current notes user).

    4. Re:~ 320K accounts by thatoneguyfromphoeni · · Score: 1

      We use Lotus Notes at my office and it may be the worst product ever made. We only have around 1000 users and it takes 6 people to maintain it! It's down for 5-10 minutes a few times per day, not to mention at least one multiple-hour outage every 3-4 weeks. We've tried upgrading from 4 to 5 to 6.5, new servers, IBM onsite to help config and troubleshoot. It is a hassle, a pain and not worth the effort. If you need an enterprise-level solution, sadly Exchange is the best solution. However, if you can do without all the bells and whistles, there are numerous *nix-based solutions using IMAP/POP3 that will suit your needs. You might also consider using something like Exchange for the employees and then using something less featured, but more robust and inexpensive, for your "free" accounts.

    5. Re:~ 320K accounts by dogugotw · · Score: 1

      Domino runs on Win, Linux, and the AS stuff. It scales like a charm because you put servers where you need them. Users can access via native Notes client, browser, or POP. Server mail can be replicated to local laptops, do the mail off-line, then replicate when you get to the hotel. With the newer versions you can set execution control features which help keep the bad guys from hosing your system. Add to that that Notes is a great development platform and you've got an excellent package.

      Have fun!

    6. Re:~ 320K accounts by Anonymous Coward · · Score: 0

      (I posted the grandparent message about IBM using Notes/Domino internally)

      Why would I make up something like that? If you don't believe me ask any IBM employee or read this article:

      http://www-306.ibm.com/software/success/cssdb.nsf/CS/EHON-5JRLM9?OpenDocument&Site=software

      I made no comments about the quality of Lotus Notes, but having used it for almost 8 years with thousands of messages and often huge e-mail attachments, I'm pretty satisfied with it.

      Sure it has its quirks, but the fact that we were unaffected when lots of companies went to their knees (including Research in Motion, makers of the Blackberry) with viruses like Melissa made me appreciate Notes. I'll gladly use this instead of anything from Microsoft, thank you very much.

    7. Re:~ 320K accounts by Maradine · · Score: 1

      ...and I'm going to guess that a large part of that decision is because IBM owns the Lotus product line?

      --

      trustedworlds.net - gaming, security, and the gunk that lives in between

    8. Re:~ 320K accounts by GregoryD · · Score: 1

      My wife hates Lotus Notes. She dreaded it when they switched from Outlook. My wife is geeky enough to use portable Firefox from her network drive. She would rather use a Microsoft product than Lotus Notes.

      She also believes Lotus Notes to be the worst product ever.

    9. Re:~ 320K accounts by Anonymous Coward · · Score: 0, Troll
      As someone working for said company:

      Please, please, please! Whatever you do, do NOT use Notes. Any value gained by not having as many viruses is very quickly lost to all of Notes' quirks:

      • Lost mail. I get severe disconnects between my local and server replicas. Server has stuff I don't and vice-versa.
      • Calendar very frequently forgets things, gets out of sync (it will say a meeting is at 10:00 when it is at 8:00).
      • Crashes several times per month.
      • Will NOT alert you of upcoming meetings when it is locked. My company requires a timeout and then it locks the screen.
      • Archiving and backup sucks. Be prepared to copy lots of .nsf files that you might not even be able to open any more or otherwise Notes will complain.


      Please, I beg you, DO NOT use Notes!
    10. Re:~ 320K accounts by Anonymous Coward · · Score: 1

      My company also uses Lotus Notes. I'm happy to hear that IBM uses it, too, so you suffer at least as much as we do.

    11. Re:~ 320K accounts by DaveCar · · Score: 5, Funny

      The mail "databases" are spread among Domino servers

      Yeah, but we all know what happens when one of these Domino servers falls over ...

    12. Re:~ 320K accounts by Anonymous Coward · · Score: 0

      Well as the old saying goes, if no one else will eat your dogfood..

    13. Re:~ 320K accounts by jarod670 · · Score: 1

      We've been using Lotus for several years in a hospital environment. Around 4,000 accounts. Lotus is a nightmare to keep up and going all the time. No matter how much time we spend in preparation to upgrade clients, including getting IBM in to help, the upgrades always screw us. We are going to move to Exchange in the next year or so. Not sure if it is much better, but it has to be better than Lotus.

    14. Re:~ 320K accounts by Anonymous Coward · · Score: 0

      Linux on zOS, I believe.

    15. Re:~ 320K accounts by metamatic · · Score: 1
      We only have around 1000 users and it takes 6 people to maintain it! It's down for 5-10 minutes a few times per day; not to mention at least one multiple hour outage every 3-4 weeks.

      So stop running it on Windows servers.

      --
      GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
    16. Re:~ 320K accounts by rbenech · · Score: 1

      I concur 100%... Lotus Domino scales well (it can easily run in a cluster to 1 million users) and runs on Linux! Security has been #1 since the beginning (thank you IBM) and it has POP3, IMAPv4, and webmail support out of the box... There are great templates at OpenNTF.org... Plus there are hundreds of professional consultants to help you do it.

      --
      Perspective is to Science what Interpretation is to Religion. Obama + Paul FTW
    17. Re:~ 320K accounts by Anonymous Coward · · Score: 1, Funny

      > ...and I'm going to guess that a large part of that decision is because IBM owns the Lotus product line?

      Of course! Why would we use a competitor's product which is little more than a virus/trojan processing center? Sure we have to pay MS for using Windows, Office and some other tools but if we own an e-mail/collaboration/messaging software division that can scale to the size we need, might as well eat our own dog food.

    18. Re:~ 320K accounts by Falshire · · Score: 1

      Actually, they were all running on AIX (4.3.something IIRC) when I left IBM in 2000.

      --
      "Meddle not in the affairs of Dragons...for thou art crunchy and taste good with ketchup."
    19. Re:~ 320K accounts by Anonymous Coward · · Score: 0

      Lotus Notes is the most absurd, user-unfriendly, EXPENSIVE, flakey, unpredictable, and unreliable mail client and MTA I've ever encountered in my 16 years in IT.

      The only good part of Lotus Notes is that the maintenance costs are so inflated it is a cinch to write a business case that shows a reduction in cost by migrating to Exchange with brand new hardware. ... and I say this as an avid MS Exchange hater. IBM makes even Mickysoft mail look good.

    20. Re:~ 320K accounts by Anonymous Coward · · Score: 0

      Nice to know Slashdot keeps up its reputation for accuracy. IBM was using Notes long before they bought Lotus.

    21. Re:~ 320K accounts by Nefarious+Wheel · · Score: 2, Informative
      I've designed and administered Exchange, Notes, DEC All-In-One, a few *nix based mail systems and a few others, some of them quite large (water utilities, national postal systems among them). Notes took over the role of most egregiously unpleasant mail system to set up or administer when MS Mail died. Very admin-hostile.

      Argue for your favorite all you want, but friends don't specify Lotus Notes to friends.

      --
      Do not mock my vision of impractical footwear
    22. Re:~ 320K accounts by Fred+Nerk · · Score: 1

      When I was at IBM, Domino was running on RS/6000 unix servers. A *lot* of them.
      It supports lots of features, but they could only get about 1000 users per server and they are expensive servers.

      Don't quote me on that, it's been a few years.

      --
      Anything is possible, except skiing through revolving doors.
    23. Re:~ 320K accounts by aktzin · · Score: 1

      Nice to know slashdot keeps up its reputation for accuracy. IBM was using Notes long before they bought Lotus.

      IBM bought Lotus in 1995. Lotus products were already in use there, but it wasn't until around '98 or '99 that every e-mail account at IBM was transferred to Notes. Before that a lot of people did their e-mail on mainframes (VM) and I think some employees didn't even have their own addresses.

      --
      Quantum mechanics: the dreams that stuff is made of.
    24. Re:~ 320K accounts by aktzin · · Score: 1
      Yeah, but we all know what happens when one of these Domino servers falls over ...

      Absolutely. When a server "falls over" the user's e-mail fails over to a backup server. You have maybe a few seconds of inconvenience while you switch over if you were in online mode. If you were in offline mode you don't even notice when the switch happens on your next replication.

      --
      Quantum mechanics: the dreams that stuff is made of.
    25. Re:~ 320K accounts by Nefarious+Wheel · · Score: 1

      As much as I despise Lotus Domino (see earlier post) I will suggest that if you want scale, scale up rather than out. I think you can get the dog to run on a mainframe, so if you have some big iron to recycle go that way. But I still would not recommend the product to anyone -- I'd be worried about that heart-weighing moment in the afterworld.

      --
      Do not mock my vision of impractical footwear
    26. Re:~ 320K accounts by DaveCar · · Score: 1

      Dude! Sense of humour bypass?

    27. Re:~ 320K accounts by Moofie · · Score: 1

      OK, you might not have any virus problems with Notes. I think it's also worth observing that you don't have computer virus problems when using trained sled dogs to drag around clay tablets that have been carefully scribed with pointy sticks.

      I'll take the pointy sticks over Notes.

      --
      Why yes, I AM a rocket scientist!
    28. Re:~ 320K accounts by Anonymous Coward · · Score: 0

      I believe that if you read carefully you might notice that the grandparent was making a joke.

    29. Re:~ 320K accounts by MasterB(G)ates · · Score: 1

      so... what are you trying to say exactly?

      --
      In the Slashdot moderating system, humourless based offenses are considered especially heinous.
    30. Re:~ 320K accounts by gr8dalmatian · · Score: 1

      Domino has advantages, and security is the best feature. But administering the damn thing is a nightmare. Especially the end users: replication, ACLs, archiving, person docs, desktops and NSFs, just to name a few. Multiply that by a thousand users, a couple of sales guys on the road and a pissed off manager and there goes your day. The real fun begins when you start playing with ID files: name changes, accessing email on different computers and lost IDs (yes, it does happen). And I won't even mention corrupt databases.

    31. Re:~ 320K accounts by vwjeff · · Score: 1

      Yeah, but we all know what happens when one of these Domino servers falls over ...

      Of course we all know what happens. If the server is not up in 30 minutes or less, next year's SLA is free.

      Duh.

    32. Re:~ 320K accounts by Anonymous Coward · · Score: 0

      I support Notes/Domino for IBM and our customers. There are thousands of companies big and small (IBM being one of the biggest, then GM, Ford, ExxonMobil, etc.) who rely on Lotus Notes as an enterprise e-mail solution.

      The problem is not that it's a nightmare to administer -- it's that most companies have untrained IT staff thrown into the sysadmin/developer role. It's not hard to manage, especially if you don't have to patch or do virus/worm repairs every week. Domino 7.0, which just came out this week, supports something like 25% more users while using 50% fewer resources. With the older version, Domino 6, I think the record was 65,000 users on a single iSeries (AS/400) server.

      AFAIK, IBM uses 12 or 16 way pSeries POWER4/5 boxes that each support thousands of users. They're clustered in a failover configuration so that in the event your primary server goes down, you've still got an up-to-date failover server to go to.

      Plus, for the Slashdot crowd, Domino Web Access officially supports Firefox 1.0.x!

    33. Re:~ 320K accounts by aktzin · · Score: 1

      Dude, I got the joke, just wanted to let anybody who didn't get it know that we do in fact plan for these contingencies. :-)

      --
      Quantum mechanics: the dreams that stuff is made of.
    34. Re:~ 320K accounts by Loopy · · Score: 1

      Lotus....ack. I have three words to describe my utter hatred of Lotus Notes:

      1) Formatting.

      2) User Interface.

      Thank God my company configured our servers for POP3 access.

    35. Re:~ 320K accounts by tengu1sd · · Score: 1
      I used to think Bloated Goats was vile until we merged with a new corp-rat group that forced Exchange down on us. I repent for every curse and snide comment.

      Lotus Notes understands commuting and laptop users who don't have "always on" network (VPN or otherwise) connections. Outlook regularly wigs out if the server isn't there. Because of our size, the corp-rats run multiple Exchange servers and we can't see free time if someone's on another Exchange server. Allegedly we're a technology company.

      Replication and backups on a MSFT product? Just kidding, hope you backed up your local file before trying to synch with server, we deleted all your mail during the last outage and if you synch your local copies get deleted too. . .

    36. Re:~ 320K accounts by Anonymous Coward · · Score: 0

      I don't want to be rude, but do you have any training in Domino? Everything above is easy, piss easy in fact.
      Name changes, lost IDs... Jesus, not being able to do that is the equivalent of not being able to reset a password in AD.
      I've admined 2500 users by myself; Domino allows me to be a chilled and relaxed admin.
      Get some training, and have a look at the forums on notes.net.

    37. Re:~ 320K accounts by I_LV_MSFT · · Score: 0

      Notes server sucks big time on Windows. All I can say is: crashes, crashes, crashes. On AIX it is rock solid; however, this will cost you an arm and a leg. It also consumes a lot of resources. You will need a 200K cluster to support 50K users. The Notes client is odd at best. And yes, it runs on Linux (crashes, crashes...). Bottom line: unless you are sitting on a stockpile of cash and are willing to subscribe to IBM robbery services, stay away from Notes.

    38. Re:~ 320K accounts by Alioth · · Score: 1

      I worked at IBM both when all email was on VM mainframes, and when it was Notes.

      I much preferred VM. People didn't keep sending me damned MSWord attachments!

    39. Re:~ 320K accounts by cavemanf16 · · Score: 1

      Well, if you weren't using the email server that YOUR OWN COMPANY SELLS then I'd be even more afraid of that bloated goat known as *IBM* Lotus Notes than I am already!

    40. Re:~ 320K accounts by CheeseTroll · · Score: 1
      we all know what happens when one of these Domino servers falls over ..

      The other servers turn Communist?

      --
      A post a day keeps productivity at bay.
    41. Re:~ 320K accounts by Anonymous Coward · · Score: 0

      Then you have a configuration problem. I'm a road warrior and have no problems with being offline. We use Outlook 2003 & Exchange 2003. Also, I have no problems seeing the calendars of folks on other servers or in a different domain.

      I don't know anything about back-ups though...

    42. Re:~ 320K accounts by Anonymous Coward · · Score: 0

      I've been a Notes / Domino Administrator for 10+ years. This is a generic reply to the thread, as opposed to the individual post above.

      Over the years, I've seen more than a few crappy environments. There are a few things you need to remember:

      -Notes/Domino is a collaboration environment - not just an email server. Calling it email is like calling a computer a video game. Sure, it does that, but it's designed to do so much more. That's why all these other commenters think the client sucks - they've only used it for email. Well duh. (Analogy: Linux is a piece of crap because I've only used it once to edit a text file with vi.)

      -It's flexible to a fault. Wanna use Outlook to read email - no prob. IMAP, POP, Web, all are options. And not just for email - any application written in Domino can be used via the Notes client or a browser. Of course, now the system architect needs to take these things into consideration.

      -Distributed administration. Domino has the ability to be managed in compartments. First, by splitting up an organization into separate domains, and then by delegating and limiting admin rights via an x.500 (Standards! Oh my!) directory.

      -Oh yeah, there's a Directory built in. A really friggin' good one in my opinion. One of my positions used it as the reference point directory for the HR system.

      -Backups and restores are cake in Domino. I did recoveries every day for one gig a few years back. Like everything else, it's all about planning ahead.

      -Not scalable?! My last position (still) has a single directory of 300,000+ users. Lotus said it couldn't be done. We got creative and proved them wrong - and showed them how to add even more capacity. And that was on R6 - now R7 is out. I'd feel comfortable being responsible for a 1,000,000 user rollout with R7. Wanna hire a consultant?

      ~~~

      I'll sum up this rant: Like any "development environment", the sysadmins, programmers, and architects need to work together and listen to the users to produce a user-friendly system. That's been truth for a long time. But Domino itself is one of the most robust and scalable products I've ever worked with.

    43. Re:~ 320K accounts by bittmann · · Score: 2, Insightful
      320,000 accounts on a single iSeries? Child's play. I doubt that IBM is only on "one box" though, given the wide-ranging network that Big Blue maintains.

      1 million total users at 99.9% uptime as per the original request? Not exactly "child's play", but honestly, not much harder.

      Domino on iSeries does seem to be a reasonable option for a deployment of this size, especially given the rather generous uptime allocation that is being offered..."3 nines" being EXTREMELY generous for an iSeries shop (you'd even be able to schedule monthly downtime on purpose and still meet this uptime goal.)
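
      As a rough check on the "3 nines" remark, here is a minimal Python sketch of the downtime an availability target permits. The function name and the 365-day year are illustrative assumptions, not anything from IBM or the original request.

```python
# Downtime allowed by an availability target (illustrative arithmetic only).
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_budget_hours(availability_percent: float) -> float:
    """Return the hours of downtime per year permitted by the target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Compare the "3 nines" in the question with "5 nines".
    for target in (99.9, 99.999):
        yearly_hours = downtime_budget_hours(target)
        monthly_minutes = yearly_hours * 60 / 12
        print(f"{target}% uptime: {yearly_hours:.2f} h/year, "
              f"{monthly_minutes:.1f} min/month")
```

      At 99.9% that works out to about 8.76 hours a year, or roughly 44 minutes a month, which is why a scheduled monthly maintenance window can still fit inside the target.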

      I do note that IBM has benchmarked Domino on a 16-way Power5-based iSeries at a 33ms response time for 175,000 concurrent users (details here: http://www-03.ibm.com/servers/eserver/iseries/domino/scalerecord.html)...and given the limited usage pattern of POP3 (yuck!), a properly-deployed solution should be able to meet the published needs with just one server. AND provide backup. AND enable the user to restore an individual mail store, mail box, or object on-demand. If high-availability or higher performance is necessary, 2 servers could be deployed in several different configurations (mirrors, clusters, HA failover, etc.).

      And if the moans of "Outlook-only users" get to be too much of a problem, IBM offers a "connector" that can offer MAPI access to Domino's mail store.

      Hell yes, I'm an iSeries fanboy. Those machines have proven themselves to be reliable, capable, economical systems over the long haul. Now, while (due to price) I wouldn't suggest deploying an iSeries to be a simple file, print, web, or small-database server, true...but when you need to move freight and *lots* of it, but you don't want to spend hours every week in operating and administering the system, it's hard to beat the venerable System/38 ne AS/400 ne iSeries systems.

    44. Re:~ 320K accounts by Anonymous Coward · · Score: 0

      Ouch. Down once per day? That's really, really bad.

      And most notes systems - even the huge ones - are very stable. Unlike Exchange.

      Sounds like the admins need to be fixed first, then the notes environment.

      Lots of good folks around who can come in and make it perform like it should.

      ---* Bill

  20. Whatever myspace uses. by jZnat · · Score: 1

    I'd have to go with whatever system MySpace uses. I can't believe that any system could sustain such a heavy flow of pointless "X updated!" emails to hundreds of millions of users...

    --
    'Yes, firefox is indeed greater than women. Can women block pops up for you? No. Can Firefox show you naked women? Yes.'
    1. Re:Whatever myspace uses. by FinestLittleSpace · · Score: 1

      They don't sustain it. The email notifications are regularly 2-3 days late, or just never arrive.

      The code behind that site is atrocious.

  21. The drug-store by jd_esguerra · · Score: 1

    Specifically, the pain-killer isle.

    1. Re:The drug-store by Anonymous Coward · · Score: 0

      Yes, you will need an entire island of aspirin to deal with the headaches you will doubtless start to have.

  22. It's obvious by gulfan · · Score: 4, Informative
    Your first bet would be Ask Slashdot.

    However, I'd personally ask Google. They've done it and even their search engine has information. I found an interesting link from there detailing the deployment of a large hundred thousand user mail system, from the architecture to the software located on Linux Journal.

    1. Re:It's obvious by Jugalator · · Score: 1

      However, I'd personally ask Google.

      And maybe better yet for tricky questions requiring expert input: Google Answers.

      --
      Beware: In C++, your friends can see your privates!
  23. Who to talk to by Effugas · · Score: 2, Informative

    I've heard surprisingly good things about Communigate Pro, though I have no idea if it scales that high.

    Mirapoint is probably _the_ vendor to speak to, though.

    1. Re:Who to talk to by akac · · Score: 1

      It clusters and can easily handle 50,000 email clients on a fairly inexpensive server.

    2. Re:Who to talk to by NtroP · · Score: 1
      We use CommuniGate Pro. We don't have that many users, but I know that it scales very well, has excellent clustering capability, runs on every platform (we run ours on RHEL3) and high-uptimes. It's supposed to be an "ISP-class" email server. I'm fairly sure they already have customers who have more than a million email accounts and I'm pretty sure you can get five 9's out of it. I know we've been very happy with ours for the last 6 years or so.

      Even though we are a small customer, I've almost never failed to get one of their "top" engineers (like Vlad) on the phone for support immediately. In my experience they don't run you around to "help-desk" level guys first like so many other companies. The very few times we've actually had issues, they were solved immediately (and, were almost always something we screwed up :-)

      The best part, IMHO, is they make it a point to be standards-compliant. They point you directly to the RFC's for every item right from their web site. They also have support for Mailing lists, MAPI, ACAP, SIP, RADIUS, etc. WebMail address books are stored in vCard format. Their calendaring solution uses iCal/vCal, their users are stored in LDAP, and they have a good Perl module and CLI interface. Their plugin architecture is simple and flexible enough that I was able to sit down and write a plugin in Perl that allowed our users to be authenticated off both their internal system and a proprietary, in-house database we have (which-ever set of credentials match, works). It only took me about 20 minutes to get it all working. This sort of thing allows excellent opportunities for scripting management tasks. In fact they encourage it.
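
      The "whichever set of credentials matches, works" idea can be sketched in a few lines. This is a toy Python illustration, not CommuniGate Pro's plugin interface (the original plugin was written in Perl); the dictionaries standing in for the internal system and the in-house database, and every name below, are hypothetical.

```python
import hmac
from typing import Callable, Dict, Iterable

# An authenticator takes (user, password) and says whether they match.
Authenticator = Callable[[str, str], bool]


def make_table_authenticator(table: Dict[str, str]) -> Authenticator:
    """Build an authenticator backed by a plain dict (toy stand-in for a real store)."""
    def check(user: str, password: str) -> bool:
        stored = table.get(user)
        if stored is None:
            return False
        # Constant-time comparison; real systems would compare password hashes.
        return hmac.compare_digest(stored.encode(), password.encode())
    return check


def authenticate(user: str, password: str, sources: Iterable[Authenticator]) -> bool:
    """Accept the login if any credential source accepts it."""
    return any(source(user, password) for source in sources)


# Hypothetical data: the mail server's own accounts and an in-house database.
internal_accounts = make_table_authenticator({"alice": "mail-secret"})
in_house_db = make_table_authenticator({"alice": "hr-secret", "bob": "bob-secret"})
SOURCES = [internal_accounts, in_house_db]
```

      With these sources, `authenticate("alice", "hr-secret", SOURCES)` succeeds because the second source matches even though the first does not.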

      A million email addresses won't be cheap, but I can't think of too many other "supported packages" that can handle something like that "out of the box". Check them out at http://www.stalker.com/ You can try it for free even.

      --
      "terrorism" and "pedophilia" are the root passwords to the Constitution
    3. Re:Who to talk to by afidel · · Score: 1

      Communigate has a SPEC entry with a cluster system capable of hosting roughly a million users (10K emails a minute, SPEC says 2M users, I cut that in half =) Mirapoint has a much more modest cluster capable of 5K emails a minute, so I'm sure they could make you one capable of scaling to 1M users. SPEC benchmarks for email solutions are available here.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  24. Not Gmail by FatalChaos · · Score: 1, Funny

    I think they want an actual company email, so the email reads john@company.com.

    1. Re:Not Gmail by Guspaz · · Score: 1

      Take a look at this and tell me why Google wouldn't do something similar for email?

      I'm referring of course to the Google Mini (which powers Anandtech's search above) and the Google Search Appliance. For all we know Google might be persuaded to sign a deal to provide GMail services directly from Google's own network for the company's domain.

    2. Re:Not Gmail by Alan+Hicks · · Score: 1
      I think they want an actual company email. so the email reads john@company.com.
      root@darkstar~# echo "mail.company.com. 1D IN CNAME gmail.com." >> /var/named/company.com-hosts :^)
      --
      Slackware, what else when it must be secure, stable, and easy?
  25. openwave's email server does this but it's $$$ by Serveert · · Score: 2, Informative

    I'm sure other commercial vendors have it but I do know that large companies like ATT et al use it to handle their email. It's a shrinkwrap product that does it all and then some but it's very pricy.

    I'm sure you could hack together something to do this much like what google did. Might take some time but it's totally doable.

    --
    2 years and no mod points. Join reddit. Because openness is good.
    1. Re:openwave's email server does this but it's $$$ by Anonymous Coward · · Score: 0

      Actually, AT&T for the past several years has used an internally developed product called Maillennium for its large (.5 to over 5 million mailbox) customers. OpenWave was simply too expensive.

      Maillennium has a banner that looks like:

      220 gateway.isp.net - Maillennium ESMTP/MULTIBOX

    2. Re:openwave's email server does this but it's $$$ by sho-gun · · Score: 1

      I'll second this opinion. Both nationwide ISPs I've worked for over the past 10 years use Openwave's platform, and I know of a few other biggies (AT&T included) that use it too.

      The product is Intermail, which was developed by another company (software.com I think) that Openwave bought out. Openwave might call it something different these days.

      I know it scales to really large capacities, a million+ mailboxes or more I'd bet. And it has capabilities for webmail, POP/SMTP over SSL, spam filtering etc. And the versions I've seen run on an LDAP back-end. Not sure about IMAP access, but considering the webmail has folder manipulation, I wouldn't be surprised if it's under the hood somewhere. I think Solaris is the OS.

      I'm not in a server admin role so I can't really be more specific but you may want to look into this solution.

      Of course the open source advocate in me says use sendmail/qmail but those would probably be more painful to scale to the size you are talking about than going out and buying a platform that can already handle something on that big a scale.

    3. Re:openwave's email server does this but it's $$$ by zerocool^ · · Score: 1

      Of course the open source advocate in me says use sendmail/qmail but those would probably be more painful to scale to the size you are talking about


      I'm forced to agree with you here. Qmail is a fantastic MX; it is lightweight, secure, and easy to understand.... but...

      As much as I love qmail, and I do, I can see it scaling very well up to about 40,000 (and maybe 100k if you stretch your hardware) users on your average box (say, dual Xeon 2.0GHz, 4GB RAM). Unfortunately, I don't remember any blindingly easy way to split the duties of qmail up across multiple servers, so the only way to scale it up by a factor of 10 would probably be to get a SunFire or some comparable 8 or 16 (or more) processor / 32GB RAM system and compile it to be multithreaded. Qmail should scale to multiple threads easily (it's nothing if not a series of tiny programs doing small jobs), but throwing down that kind of hardware and the time to get it configured right for that arch is going to be cumbersome.

      And even at that, you're probably looking at a separate database server that's equally as beefy, and a fiberchannel SAN. And then, another set of those for redundancy.

      I thought about that, and how much I would love to admin a qmail system, but... high availability and a million users, even given qmail's traditionally lightweight footprint... we're talking probably a quarter million worth of hardware ($40,000 X 3 servers X 2 installations, probably), and a lot of headaches. If you're going to spend that cash, you might as well ask IBM or Unisys to come in and do it for you, and then you can blame them if it screws up.

      Or, for a quarter million, you could probably just buy 200 $1000 1-U cheapie servers, cluster them, use arrowpoint routers, and use any MX (and by any MX, I mean any MX other than sendmail).

      ~Will

      --
      sig?
    4. Re:openwave's email server does this but it's $$$ by dougnet · · Score: 2, Interesting

      I ran an InterMail MX system for about 3 years for a national ISP. The company that sells InterMail was called Software.com at the time... and then they merged with phone.com and the combined entity was renamed Openwave. They provide many of the browsers used on cell phones... check an old phone and it probably says "phone.com" and a newer one will say "openwave". I used version 4.x of their InterMail Mx product primarily and had a little experience with version 5.0.

      It is a fairly complex system but is obviously very powerful. The system used an Oracle database for all user information (LDAP on the front-end, with the data stored in an Oracle DB on the back-end) and also used an Oracle database for each Message Store server. For example, if an e-mail message was sent to 2000 users on your system, one instance of the message was saved to disk (in a hashed directory structure) and 2000 "links" were stored in the Oracle DB. Once all 2000 links were deleted (i.e. all users deleted the message) then a garbage collection process would remove the message file. This can obviously save a lot of space on a busy system.

      The server scaled by adding Message Store Servers (MSS) and front-end POP/IMAP/Web servers. The front-end servers are typically set up for load-balancing with F5 BigIPs or the like. The back-end servers (directory/LDAP server, MSS servers) are less redundant and require a cluster/HA solution. We had a 3 to 1 fail-over for our directory server and two MSS servers to one stand-by system. This was at least US $2M of hardware by the time you added an EMC Symmetrix for multiple TB of storage. This was a while ago and you may not need to use a tier 1 storage vendor... but when you're talking 1 million users and 99.9% uptime, you can't just throw something together and cross your fingers. OpenWave also offered an InterMail Kx solution (thousands of users rather than millions of users) that was less complicated. Below that was post.office.

      The price at the time was negotiable and was generally based on the number of users. Their support was generally quite good. They appear to call the product Email MX now: http://www.openwave.com/us/products/wireline/email_mx/index.htm

      The main reason companies choose (or stay with) MS Exchange really comes down to these two things:
      1) Integration of the Windows Domain with the e-mail account (often single sign on).
      2) Integrated calendar.

      I'm not sure if Openwave offers something comparable now with their product, but I'd much rather run a system with that many users on a Unix platform than on a ton of Windoze systems. As other posters have mentioned, if it is properly architected... many different options are possible.
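
      The single-instance storage described above (one stored copy, one "link" per recipient, garbage collection once no links remain) can be sketched roughly as follows. This is a toy in-memory Python model, not InterMail's implementation: the class and method names are hypothetical, dictionaries stand in for the on-disk files and the Oracle link table, and identifying a message by a SHA-256 of its body is an assumption made for the example.

```python
import hashlib
from typing import Dict, Iterable, Set


class SingleInstanceStore:
    """Toy model: store each message once and track which mailboxes link to it."""

    def __init__(self) -> None:
        self._bodies: Dict[str, bytes] = {}    # message id -> body (stands in for a file)
        self._links: Dict[str, Set[str]] = {}  # message id -> mailboxes still linked

    def deliver(self, body: bytes, recipients: Iterable[str]) -> str:
        """Save one copy of the body and add a link for every recipient."""
        message_id = hashlib.sha256(body).hexdigest()
        self._bodies.setdefault(message_id, body)
        self._links.setdefault(message_id, set()).update(recipients)
        return message_id

    def delete(self, message_id: str, mailbox: str) -> None:
        """Remove one mailbox's link; the stored body is left for the collector."""
        self._links.get(message_id, set()).discard(mailbox)

    def collect_garbage(self) -> int:
        """Drop bodies that no mailbox links to; return how many were removed."""
        orphaned = [mid for mid, links in self._links.items() if not links]
        for mid in orphaned:
            del self._links[mid]
            del self._bodies[mid]
        return len(orphaned)

    def stored_copies(self) -> int:
        """Number of message bodies currently held."""
        return len(self._bodies)
```

      Delivering one message to 2000 mailboxes leaves `stored_copies()` at 1, and the body only disappears after the last mailbox deletes its link and `collect_garbage()` runs.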

    5. Re:openwave's email server does this but it's $$$ by Anonymous Coward · · Score: 0

      Not sure about IMAP access but considering the webmail has folder manipulation, I wouldn't be surprised if it's under the hood somewhere

      I actually work at Openwave on their email products, and yes, they do support IMAP (the webmail does not use this to communicate with the message store though, too heavyweight)



    6. Re:openwave's email server does this but it's $$$ by Serveert · · Score: 1

      I used it at a couple ISPs myself and was impressed.

      I really liked the distributed config.

      I'm not sure why they didn't officially port it to linux / ditch oracle and its expensive license completely and instead use something like postgresql. These things would drive costs down. In that respect they were out of touch with their customers - now even sun is adopting linux. They seem to have missed the boat re: cheap x86 hardware and inexpensive DBs. I heard they outsourced a lot of stuff to India yet still the price tag is so high.

      In any case, I also heard intermail hasn't changed much since the 90's, just more spam checking and other bells + whistles.

      Here's to hoping openwave goes bankrupt and open sources intermail. ;)

      --
      2 years and no mod points. Join reddit. Because openness is good.
    7. Re:openwave's email server does this but it's $$$ by Sandman1971 · · Score: 1

      In Intermail MX6 & MX7, you can use a Sleepycat backend instead of Oracle, and I'm 99.9% sure they now run on Linux as well. As far as outsourcing to India, considering I deal with them on an almost daily basis, I have never heard this. All our support still comes from the US & Australia.

      --
      It's better to burn out than to fade away
    8. Re:openwave's email server does this but it's $$$ by Serveert · · Score: 1

      Interesting that they use that from Kx, Paul Falstad, author of zsh, wrote the Sleepycat replacement. Good to see they've used that in mx.

      But a good portion of their engineering and QA is done in India, from the mta to the mss to the directory. The mobile side of things is outsourced to India and Ireland.

      The support people you deal with are of course based in the US.

      --
      2 years and no mod points. Join reddit. Because openness is good.
  26. Let the vendors do the work. by liquidzero4 · · Score: 1

    I'd call IBM, Red Hat, or any other large vendor and ask them. If there are big $$$$ involved, which there probably are, the vendors will jump through hoops for you. You'll have your own little circus.

    1. Re:Let the vendors do the work. by joe_bruin · · Score: 4, Insightful

      Seriously. If high availability systems is not your company's core competency, call IBM, Red Hat, Sun, Oracle, Novell. Tell them you have a million users. Tell them you have a very fat checkbook and that you want them to provide you with a complete solution. Tell them that nothing but 5 nines of uptime will do.

      DO NOT implement a half-assed solution. Unless you really know what you're doing (and if you did, you wouldn't be asking this question), don't assume that a million Linux servers strewn about a million offices and data centers is the best solution, even if it is easiest to set up and administer. Maybe it is; come up with a proposal with hard numbers and see how they compare to the vendors. A million dollars spent on a Sun E10000, an Oracle Grid subscription (scales perfectly, right?), or a million IBM engineers flown into your site when an emergency happens may be worth paying for.

    2. Re:Let the vendors do the work. by SparafucileMan · · Score: 1

      Seconded!!!!

      A million people. So you have money. And this is email. It HAS to work all the time. Plus remote folders. Plus lots of backups for legal purposes. Plus supporting web front-end. Plus Outlook. Plus *nix. Plus generic. Plus accessible from around the world 24/7.

      Outsource it, no question. Have the company sign a contract that assures they have to make it work and then forget about it. This is NOT something for a homegrown solution. SPEND THE MONEY, OUTSOURCE IT.

      lol. a million people. you must have billions of income per year. spend a couple million of it already and forget about it. email is ridiculously vital in an organization of 1 mil. i work at a company of say 150k-300k and really, if i couldn't access my email at any time NO MATTER WHAT i'd shit a brick.

      and i'm not even management. wait till your VPs/CEO can't get their email. you'll want to overdose on painkillers.

    3. Re:Let the vendors do the work. by eSims · · Score: 1

      ah... i see YOU haven't supported the 10k... 5 9s on a 10k? What're you smokin'? A week at a time would be nice!

      --
      I .sig therefore I am!
    4. Re:Let the vendors do the work. by twiddlingbits · · Score: 1

      Five nines on 1 E10K box? I doubt it. If you ever have to patch Solaris, there goes your downtime budget for years to come. Oracle Grid is a great idea, but I've not seen it implemented yet on the scale you are mentioning and it's a bit new. I think I would go for more proven technology. I notice you didn't mention an email package that can USE the Grid Database; there are very few out there that can even use a plain old relational DB like Oracle or DB2 to store mail. A DB-based mail system could be nice, as DBs are designed for high volumes of inserts and deletes: threads could be generated easily, forwarding a document would just be a SQL query to grab it from the table and it would not be sent 1000 times, logging would be done automatically, and backups/restores are easy. The only issue might be that you would need to set up the Data Architecture and Storage such that neither lots of small mailboxes nor lots of large ones cause performance issues. You need to know more about the "average user" to configure the system right.

    5. Re:Let the vendors do the work. by Anonymous Coward · · Score: 0

      1. Using Solaris LiveUpgrade to apply patches to an alternative root brings the time required to switch to another patch level to that required for a reboot (assuming you're applying patches requiring a reboot).

      2. Oracle sells Oracle Collaboration Suite, their own database-based mail/calendar/file sharing system.

    6. Re:Let the vendors do the work. by EvilTwinSkippy · · Score: 1

      I love Linux as much as the next guy, but when push comes to shove I always go with a Vendor's brand Unix on Vendor made hardware. The stuff just plain works, and when it doesn't, you have one number to call.

      --
      "Learning is not compulsory... neither is survival."
      --Dr.W.Edwards Deming
    7. Re:Let the vendors do the work. by QuietLagoon · · Score: 1
      Seriously. If high availability systems is not your company's core competency, call IBM, Red Hat, Sun, Oracle, Novell. Tell them you have a million users. Tell them you have a very fat checkbook and that you want them to provide you with a complete solution. Tell them that nothing but 5 nines of uptime will do.

      Finally, a sane recommendation. If for no other reason than to get the requirements for this very large project documented into a usable form. Without the latter, this project is a disaster waiting to happen.

    8. Re:Let the vendors do the work. by gwayne · · Score: 1

      Ah! 5-nines:

      First three nines: $1.5 million
      Fourth nine: $5 million
      Fifth nine: pricele$$
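
      For scale, the downtime budget behind each "nine" is simple arithmetic. A minimal Python sketch, assuming a 365-day year (the function name is made up for illustration):

```python
def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the allowed downtime in minutes per 365-day year."""
    minutes_per_year = 365 * 24 * 60  # 525,600 minutes
    return minutes_per_year * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        print(f"{target}% -> {downtime_minutes_per_year(target):.1f} min/year")
```

      Three nines (what the original question asks for) leaves roughly 8.8 hours a year; five nines leaves a little over 5 minutes.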

  27. CommuniGate by Anonymous Coward · · Score: 2, Informative

    www.stalker.com

    Is able to run clusters, and clusters of clusters, and theoretically scale into the hundreds of millions of accounts. Offers all the things you want, and more. LDAP, ACAP, etc, etc, integrated webmail. Intelligent directory creation structures, etc.

    1. Re:CommuniGate by p0rkmaster · · Score: 3, Informative

      I second that recommendation. I've been running CommuniGate Pro for many years now, and I love it. There's a cellphone provider in Sweden that is hosting over a million accounts on a single 8-processor server - but for your requirements I'd probably recommend looking into CommuniGate's clustering solutions.

      --
      ... I like to keep an open mind, but not so open that my brains fall out. - Judge Harry Stone, Night Court
    2. Re:CommuniGate by Anonymous Coward · · Score: 0

      Same here, great product. I replaced exchange in my corp with communigate a few years ago and it is rock solid. Runs at 20% processor utilization vs 95% for Exchange 5.5 on the same hardware and load.

      And while it was SMS accounts, how about 4.5 million accounts on a single server! http://www.stalker.com/Papers/Uboot_press_rel.pdf

    3. Re:CommuniGate by Anonymous Coward · · Score: 0

      I wholeheartedly agree. CommuniGate is by far the most stable and robust email server we have used. Beyond that, it is probably the best performing application we own. 5 stars. I imagine you'll want to check out the clustering capabilities. Free trial, too -- check it out!

  28. earthlink's setup by Triumph+The+Insult+C · · Score: 2, Interesting

    earthlink's mail server complex has come up on freebsd-isp a few times

    this guy used to work at both sendmail and earthlink and he has links to some good resources

    --
    vodka, straight up, thank you!
  29. beowulf??? by snatchitup · · Score: 1

    Definitely beowulf cluster of "dead" BSD clusters...

    No, but the answer is simple in two words....

    Geographic Cluster

  30. Please Please Please by xactuary · · Score: 2, Funny
    Let me send your peeps a million .mac invites. Then I'd be set for life! Mmmmwwwhhaaaaa!

    If that's too rich for ya, how about gmail invites? Slashdotters could come up with a million of those I bet.

    --
    Say hello to my little sig.
  31. Lotus Domino by jesseraf · · Score: 1

    Lotus Domino is a viable alternative to Exchange, although it's probably not very popular with the /. crowd.

    Whatever you do, I think the most important part is to think through the migration process. It's good you've already done it before, but 1 million people could mean a lot of angry phone calls.

    Good Luck.

    1. Re:lotus domino by Koda · · Score: 1

      I have to agree. It might not be popular with the slashdot crowd, but Lotus Notes/Domino is quite robust, extremely secure, and offers an immense amount of functionality outside of just e-mail and calendaring. Sometime midway through the R5 codestream, its web interface for mail became world class, and the Domino server will run on Windows, Linux, AS400, AIX, etc.

      It's an extremely mature product, now at version 7 (http://www.lotus.com/lotus/general.nsf/wdocs/nd7content). There's a huge install base, and you should be able to easily find a number of IBM business partners and consultants that can get you on the right path. Speaking of the right path... I would recommend you avoid IBM Global Services. They pay poorly and no longer attract the talent they used to. Their administrators also tend to drag their feet on upgrades. YMMV.

      Good luck!

  32. Vendors by XorNand · · Score: 4, Interesting

    I'd start with talking to vendors. Consult with some sendmail gurus, Notes guys, etc. Any of these people/companies would salivate at the thought of being a part of a project this large. First, talk to the client and hammer out the real needs with solid performance requirements, timeframes, growth expectations, (meaning real numbers) etc. Put together a well thought-out Request For Proposal and send it out to as many applicable vendors as interest you. Then just stand back and play the role of ringmaster. The vendors will give you all the ideas you need.

    Just do one thing, please: make sure that the client is honest-to-goodness serious about this. I absolutely hate getting pie-in-the-sky RFPs from people who are just kicking the tires. It's a good way to burn bridges by not looking professional.

    --
    Entrepreneur : (noun), French for "unemployed"
    1. Re:Vendors by ewg · · Score: 1
      Just do one thing, please: make sure that the client is honest-to-goodness serious about this. I absolutely hate getting pie-in-the-sky RFPs from people who are just kicking the tires. It's a good way to burn bridges by not looking professional.

      I've been on the other side of this, cluelessly wasting the time of potential vendors discussing plans that worked from an engineering standpoint, but not a business one.

      I didn't just look unprofessional, I was unprofessional. An awful experience.

      --
      org.slashdot.post.SignatureNotFoundException: ewg
    2. Re:Vendors by This+is+outrageous! · · Score: 3, Funny
      hammer out the real needs with solid performance requirements, timeframes, growth expectations, (meaning real numbers)
      Integers, kid. INTEGERS.

      Those newfangled "real numbers" are nothing but bullet-point creeping featuritis. Integers, on the other hand, have been around since at least Kernighan & Ritchie. They do one thing and do it well. Keep true to the Unix philosophy! Real numbers in information technology? Just say NO.

      --
      This is...

      O
      U
      T
      R
      A
      G
      E
      O
      U
      S

      !

    3. Re:Vendors by Ilgaz · · Score: 1

      Such large numbers (million users etc) always remind me of one company: Novell

      Thanks to Myrealbox they are constantly testing their stuff in real life 24/7 and now with Suse it will be much more scalable.

      http://www.novell.com/products/netmail/

      http://www.myrealbox.com/ (their testing)

    4. Re:Vendors by Dunkirk · · Score: 1

      Where's the +1 "Ate up" mod when you need it? (Sorry, southern Indiana dialect here. "That's just ate up" means it's bizarre, but in a funny way.)

      --
      Acts 17:28, "For in Him we live, and move, and have our being."
  33. Ask Slashdot? by gromitcode · · Score: 1, Redundant

    If I was your boss and found out your idea of architecting what will be a large investment with high uptime demands and a large user base was to ask Slashdot, your arse would feel my boot followed closely by the pavement. This sounds like a pretty poorly run place; if you need to ask Slashdot for this scale of thing then you are far better off not touching it.

    1. Re:Ask Slashdot? by R3D · · Score: 2, Insightful

      Well, they're currently using Exchange.

    2. Re:Ask Slashdot? by shmlco · · Score: 1

      This is probably the CEO's second cousin, who said he can do the job for a LOT less money than those nasty overpriced consultants were going to charge...

      --
      Any sect, cult, or religion will legislate its creed into law if it acquires the political power to do so.
  34. Where to start? by Anonymous Coward · · Score: 0

    dunno, but ask slashdot is probably the worst choice...

  35. Oracle E-mail Server. by Anonymous Coward · · Score: 0

    Oracle E-mail Server. Oracle can easily handle your type of data volume and up time requirements.

    This assumes that you'll have the hardware for it, of course.

  36. Where do I start? by Anonymous Coward · · Score: 0

    I'll ask my boss to hire someone with a clue instead of me because going and asking ./ is all I could think of.

  37. Split up the tasks by jgardn · · Score: 4, Informative

    There are three parts to your system: sending mail, receiving mail, and storing mail. Keep them separate.

    Your receivers will be a bank of servers running sendmail. They will do appropriate spam processing to reduce the amount of mail actually received. They feed the data into the storage servers.

    The storage system has the data partitioned out so that all the data for one user would go to one server while all the data for another will go to a different one. The storage system also has to provide POP and IMAP access. You may want a special setup where the IMAP or POP service knows which server to go to. Investigate having one giant virtual filesystem so that the system isn't too complicated.
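
    One way to picture the "which server holds this user" step is a deterministic hash of the mailbox name. This is only a sketch under stated assumptions: the server names and the function are hypothetical, and nothing in this thread says any particular product works this way.

```python
import hashlib

# Hypothetical storage back ends; real names would come from your own config.
STORAGE_SERVERS = ["store01", "store02", "store03", "store04"]


def storage_server_for(user: str, servers=STORAGE_SERVERS) -> str:
    """Map a mailbox name to one storage server, the same one every time."""
    digest = hashlib.sha256(user.lower().encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    for name in ("alice@example.com", "bob@example.com"):
        print(name, "->", storage_server_for(name))
```

    Plain modulo hashing remaps most users whenever the server count changes, so a per-user lookup kept in a directory is the more flexible design once mailboxes have to be moved around.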

    Your webmail access will use IMAP to access the actual mail. It can be a completely different system.

    The sending system will be a chokepoint for all outgoing mail. You are going to scan it as it goes out to look for virus-sent emails or unauthorized messages. For instance, you may want marketing email to be processed differently than inter-office email and such.

    All of these systems will be running sendmail. I know sendmail has a bad rap for being insecure, but the insecurities have been found and since fixed. It is by far the most manageable system when it comes to large-scale deployments with heavy customization.

    --
    The radical sect of Islam would either see you dead or "reverted" to Islam.
    1. Re:Split up the tasks by harlows_monkeys · · Score: 1
      All of these systems will be running sendmail. I know sendmail has a bad rap for being insecure, but the insecurities have been found and since fixed

      Just the known insecurities have been found and fixed. What about the unknown ones?

      Sendmail is asking for trouble, until they completely throw out the old code and rewrite it from the ground up, with security in mind.

    2. Re:Split up the tasks by michelcultivo · · Score: 4, Insightful

      And please don't forget to use Maildir for email storage, it's very good for backup and very easy to manage.
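
      For anyone who hasn't looked inside one, a Maildir is just three directories with one file per message. A small sketch using Python's standard `mailbox` module; the addresses and the `deliver` helper are made up for illustration:

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def deliver(maildir_path: str, sender: str, recipient: str, body: str) -> str:
    """Deliver one message into a Maildir and return its key."""
    box = mailbox.Maildir(maildir_path, create=True)  # creates tmp/, new/, cur/
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "test"
    msg.set_content(body)
    key = box.add(msg)  # written to tmp/ first, then moved into new/
    box.close()
    return key


if __name__ == "__main__":
    root = os.path.join(tempfile.mkdtemp(), "Maildir")
    deliver(root, "a@example.com", "b@example.com", "hello")
    print(os.listdir(os.path.join(root, "new")))  # one file per message
```

      Because every message is its own file, ordinary file-level backup tools can copy or restore a single message without touching the rest of the mailbox.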

    3. Re:Split up the tasks by njm · · Score: 1
      Sendmail is asking for trouble, until they completely throw out the old code and rewrite it from the ground up, with security in mind.
      I'm pretty sure that's already been done. And while many will argue that sendmail scales better than either, both Postfix and qmail are in use at some awfully large sites themselves--I'd maintain that scalability is a non-issue. Still, sendmail admins are (paradoxically) easier to come by, and its code has been scrutinized very thoroughly, at least enough so that one could be reasonably comfortable with its security.
    4. Re:Split up the tasks by Alan · · Score: 1

      "Just the known insecurities have been found and fixed. What about the unknown ones? "

      Yea, don't you know that you're supposed to report all the unknown bugs to the developers as well as the known ones?! :)

    5. Re:Split up the tasks by UndeadDude · · Score: 2, Interesting

      Having dealt with sendmail at scale, I would definitely say no. And if you think that it is the most configurable, sounds like there are some MTAs you still need to check out. I recommend Exim.

      I agree that you want to split things up-- make farms of large numbers of servers to make horizontal scaling easy. Store your user info in LDAP (OpenLDAP works very well, with very good data replication in 2.3.x). Most common server software will support LDAP and it scales very well.

      You need "layer-4 switching" to load balance across machines, and automatically disable systems/services that are down. You need something that will cluster. I recommend Foundry ServerIron switches. F5 BigIP is another common alternative.
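
      The idea behind that kind of switch (rotate across a farm, skip whatever is down) can be sketched in a few lines of Python. This is an illustration only, not how ServerIron or BigIP are implemented; the class, host names, and the simple TCP-connect health check are all assumptions.

```python
import itertools
import socket


def is_up(host: str, port: int, timeout: float = 1.0) -> bool:
    """Health check: can we open a TCP connection to the service?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class RoundRobinPool:
    """Rotate through backends, skipping any that fail the health check."""

    def __init__(self, backends, check=is_up):
        self.backends = list(backends)
        self.check = check
        self._cycle = itertools.cycle(self.backends)

    def pick(self):
        # Try each backend at most once per call.
        for _ in range(len(self.backends)):
            host, port = next(self._cycle)
            if self.check(host, port):
                return host, port
        raise RuntimeError("no healthy backends")
```

      A real load balancer runs its health checks in the background rather than on every request, but the effect on traffic is the same: a dead server simply stops being chosen.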

    6. Re:Split up the tasks by Fujisawa+Sensei · · Score: 1

      Please, they're already using Exchange. The security in Sendmail, and whatever OS its running under would be an upgrade.

      --
      If someone is passing you on the right, you are an asshole for driving in the wrong lane.
    7. Re:Split up the tasks by BlackStar · · Score: 1

      In a previous position, our ISP set up a half million mailbox system scalable upwards to multiple millions. The outline above is exactly on the mark, although I would add that we had inbound proxy directors as the mailboxes were spread across various redundant server pairs. Additionally, look into using an LDAP system for the user admin and settings, not the internal databases most email servers use. If the system is externalized, that lookup is offloaded, the admin on the users is much easier and many different systems can be configured to take the load from it without overhauling the user accounts.

    8. Re:Split up the tasks by Anonymous Coward · · Score: 1, Insightful
      Two ways to end the war: (1) Kill all terrorists. (2) Convert to Islam. Unfortunately, diplomacy is not a part of either
      In regards to your sig, you are implying which of the following:
      • Terrorists don't attack muslims.
      • All terrorists are muslims.
      • Those killing terrorists are not muslims.
      • Only religious conversion and killing ends war.
      It's hard to take advice on sendmail from someone who displays such an obvious lack of deep thought on other issues.
    9. Re:Split up the tasks by BlackStar · · Score: 1

      In a previous position, our ISP set up a half million mailbox system scalable upwards to multiple millions. The outline above is exactly on the mark, although I would add that we had inbound proxy directors as the mailboxes were spread across various redundant server pairs.

      Additionally, look into using an LDAP system for the user admin and settings, not the internal databases most email servers use. If the system is externalized, that lookup is offloaded, the admin on the users is much easier and many different systems can be configured to take the load from it without overhauling the user accounts.

    10. Re:Split up the tasks by Antique+Geekmeister · · Score: 2, Informative

      You wrote: Your receivers will be a bank of servers running sendmail. They will do appropriate spam processing to reduce the amount of mail actually received.

      That's 2 tasks. This requires absolutely robust, absolutely lightest weight email servers, with serious caching. Sendmail can do it: Postfix can do it, and is vastly easier to manage. The syntax of configuring sendmail configurations is just too arcane for most of us to deal with.

      Definitely add blacklist filtering and SPF on the front end, to reduce the load on all your other servers of handling and processing the spam, and very definitely create an SPF record for your own domain: much of your email will be to and from people inside your own domain, and being able to throw out all forgeries and inappropriately sent emails before wasting time on sophisticated virus or spam checking is a huge, huge, huge CPU win.

      This is in fact a big enough project that I'd contact Novell: they have support for those hardcase Outlook clients, they have good calendaring for Linux with their latest Evolution email clients and matching servers, and they've worked very hard on things this scale.
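
      Blacklist filtering is cheap because it is a single DNS query: reverse the octets of the connecting IP, append the list's zone, and see whether a record comes back. A hedged Python sketch; the zone `dnsbl.example.org` is a placeholder, not a real list, and the function names are made up.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name used to look an IPv4 address up in a blacklist zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone has an A record for the address (i.e. listed)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    print(dnsbl_query_name("192.0.2.1", "dnsbl.example.org"))
```

      The SPF side is a DNS TXT record on your own domain; an illustrative strict policy would be `v=spf1 mx -all`, meaning only the domain's MX hosts may send its mail.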

    11. Re:Split up the tasks by QuantumG · · Score: 1
      You forgot:
      • It's not a "war".

      As long as you continue to think of it as such you have to keep fighting which rules out any chance of fixing the social problems that cause the phenomenon in the first place.
      --
      How we know is more important than what we know.
    12. Re:Split up the tasks by schave · · Score: 2, Informative

      Many people are now putting e-mail security devices in front of the "receivers".

      Products such as Ironport, Openwave Edge Gx, and Symantec Mail Security use technologies such as traffic shaping, reputation services, directory harvest attack detection, etc. to help keep spam out of your network.

    13. Re:Split up the tasks by Anonymous Coward · · Score: 0

      You forgot to hold a watch out and let it swing like a pendulum. You're getting sleepy... Sendmail is the one true MTA....

    14. Re:Split up the tasks by Anonymous Coward · · Score: 0

      I agree, but I'd use postfix - so reliable....oh and here's a howto:

      http://workaround.org/articles/ispmail-sarge/

      postfix on sarge is _very_ reliable. the only issue is if you're getting some big iron in to do the job, make sure your distro is supported....but the above howto should give you a good idea of how to set up postfix.

      Oh, and as a mail client goes - it is soooo important - thunderbird is great - absolutely no hassles and all the features a mail prog needs.

    15. Re:Split up the tasks by Kafka_Canada · · Score: 1

      Yes, free markets and the right not to wear a burqa caused millions of Muslims and non-Muslims to believe in an atavistic, fascistic, nihilistic global war on infidels.

      Makes sense to me. Death to infidels!

      (Oh, sorry.. that should be "subsidized day care and free yoga classes for terrorists! That'll stop them from wanting to kill me, won't it?")

      --
      Fuck it
    16. Re:Split up the tasks by fwc · · Score: 3, Informative
      This is right on the mark. I would differ in a few implementation details (aka I hate sendmail with a passion), but this is the way we do it at a medium-size ISP with a mail server "cluster" running in the thousands of mailboxes category.

      In short, we have mail servers accepting the mail and dropping it on a shared NFS server which stores all the mail. The incoming servers run spam and virus filtering and are responsible solely for delivering the mail to the customer's mail directory which lives on the NFS server.

      On the client side, we run IMAP and POP3 servers which access the stored mail on the NFS server to deliver it to the clients.

      The exact software used for both of these functions is somewhat irrelevant. Once you split this up this way, you can also split the selection process, i.e. "which is the best server for accepting SMTP mail and dumping it in customers' mail directories" can be answered with a completely different answer than the question of "what is the best NFS (or SAN) server to use to store the mail", or "what IMAP server should we be using", or "what webmail front end should we be using", and so on.

      It also makes changing your mind down the road on any piece easier since you can actually run and test any one of these components in the live system as a final test before moving a replacement into the system.

      FWIW, I would *love* to consult on something this scale.

    17. Re:Split up the tasks by JChris · · Score: 2, Informative
      Your receivers will be a bank of servers running sendmail. They will do appropriate spam processing to reduce the amount of mail actually received.

      You might give serious consideration to outsourcing your spam and virus filtering.

    18. Re:Split up the tasks by dumeinst · · Score: 1

      Stupid question maybe... but don't you run out of inodes rather quickly?

    19. Re:Split up the tasks by indigoid · · Score: 1

      yep, bigips are lovely, lovely things. easily worth the small number of beans you hand over for them

      --
      P-plate adventurer
    20. Re:Split up the tasks by drsmithy · · Score: 2, Informative
      There are three parts to your system: sending mail, receiving mail, and storing mail. Keep them separate.

      I would argue that should be sending, receiving, accessing, storing. I'm not so sure sending and receiving need to be separated either.

      The storage system has the data partitioned out so that all the data for one user would go to one server while all the data for another will go to a different one.

      Uh, sounds to me like you're suggesting a storage system for every user. I'm sure that's not what you meant, but it's what you wrote :).

      The storage system also has to provide POP and IMAP access. You may want a special setup where the IMAP or POP service known which server to go to. Investigate having one giant virtual filesystem so that the system isn't too complicated.

      You should separate where the mail is stored from where the mail is accessed. Ie: your IMAP and POP servers access a mail store on a SAN or NAS. Depending on load, things like Webmail might require yet another layer of separation (ie: Webmail <-> IMAP <-> Storage).

      It's really important to separate the mail access system(s) from where the mail is actually stored, otherwise you are building a system with single points of failure and performance bottlenecks.

      All of these systems will be running sendmail.

      That would be an absolute nightmare. Postfix is just as functional and orders of magnitude easier to administer.

      Although, as I've said elsewhere, if this place really does have a million-seat Exchange environment, they're almost certainly not going to be able to replace that with Squirrelmail, IMAP and Postfix. Exchange does a hell of a lot more than just sending emails back and forth.

    21. Re:Split up the tasks by Niten · · Score: 2, Informative

      Good point. One thing to be aware of when using Maildir, though, is that since each message is stored in its own file you'll have to make sure you configure your filesystem so that it can handle holding a massive number of files/messages. If you configure an ext3 partition with the default number of inodes, for example, then with one inode per message you might find yourself running out of inodes before you run out of disk space.
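
      A back-of-envelope check makes the inode point concrete. The sketch below assumes one inode per message plus four directories per user, and a bytes-per-inode ratio of 16384 (a common mke2fs default, but check your own); the user and message counts are purely illustrative.

```python
import os


def inodes_needed(users: int, avg_messages_per_user: int, dirs_per_user: int = 4) -> int:
    """One inode per message, plus the Maildir directories themselves."""
    return users * (avg_messages_per_user + dirs_per_user)


def inodes_available(partition_bytes: int, bytes_per_inode: int) -> int:
    """Inodes created at mkfs time for a given bytes-per-inode ratio."""
    return partition_bytes // bytes_per_inode


def inode_usage(path: str = "/") -> float:
    """Fraction of inodes in use on the filesystem holding `path` (Unix only)."""
    st = os.statvfs(path)
    return 1 - st.f_ffree / st.f_files if st.f_files else 0.0


if __name__ == "__main__":
    need = inodes_needed(users=1_000_000, avg_messages_per_user=2_000)
    have = inodes_available(partition_bytes=10 * 2**40, bytes_per_inode=16384)
    print(f"need {need:,} inodes, have {have:,}")
```

      With those numbers the store needs about 2 billion inodes but a 10 TiB partition provides only about 671 million, so the inode ratio has to be chosen at mkfs time, before any mail lands.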

    22. Re:Split up the tasks by QuantumG · · Score: 1

      I think it's more the invading and bombing their homes that did it. If terrorists were logical they'd look at comments like yours and discover that as far as public education goes, terrorism really doesn't get their message across too effectively. But the "send a message to the world" justification for terrorism is just a pretext, it's really about revenge and retribution. Probably the most heinous part of human nature.

      --
      How we know is more important than what we know.
    23. Re:Split up the tasks by glenebob · · Score: 1

      I'd agree with the parent for the most part.

      Sendmail sucks (configuration nightmare), but it works. YMMV.

      I would be looking to store email in a proper SQL database.

      On the plus side, you get reliability, backup/restore, scalability, etc., depending on what you spend. You also don't tax your file system with too many files, or have scalability issues with huge mail boxes or huge user counts (a good SQL server laughs at a million user records). You also get easy navigation through the email stores; standard tools allow you to delete old messages, whatever you can dream up. It's easy; install the server with whatever options you need, and point your software at the server.

      The problem with this approach is lack of software availability. You will almost certainly have to modify or write new software to handle mail delivery (to storage), POP3, and IMAP. I know of one solution and it's pretty immature still.

      PostgreSQL would probably be fine for what you need, so the cost factor could be low.

      My opinion is that you WILL have to write software to make this work. The building blocks are out there, but no package that I'm aware of will scale like you want and use a database. I recently tried to revamp our mail server using a database and web management tools, and gave up. I just didn't have time to write or rewrite and support the software needed. If the server was large enough to warrant a full time admin, I probably would have done it.
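
      To give a feel for how little schema the storage side needs, here is a toy sketch using Python's built-in `sqlite3` as a stand-in for PostgreSQL. The table layout and function names are assumptions for illustration, not taken from any existing mail package.

```python
import sqlite3

SCHEMA = """
CREATE TABLE messages (
    id        INTEGER PRIMARY KEY,
    mailbox   TEXT NOT NULL,
    received  TEXT NOT NULL,      -- ISO 8601 timestamp
    raw       BLOB NOT NULL       -- full message as received
);
CREATE INDEX idx_mailbox_received ON messages (mailbox, received);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (and initialise) a message store."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def store_message(conn, mailbox: str, received: str, raw: bytes) -> int:
    """Insert one message and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO messages (mailbox, received, raw) VALUES (?, ?, ?)",
            (mailbox, received, raw),
        )
    return cur.lastrowid


def expire_older_than(conn, cutoff: str) -> int:
    """Delete messages received before `cutoff`; return how many were removed."""
    with conn:
        cur = conn.execute("DELETE FROM messages WHERE received < ?", (cutoff,))
    return cur.rowcount
```

      The "delete old messages" housekeeping really is one statement here; the hard part remains the delivery, POP3 and IMAP software that has to sit in front of the table.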

    24. Re:Split up the tasks by mathrock · · Score: 1

      For the storage system consider the HP RISS (HP StorageWorks Reference Information Storage System). It scales well and handles Exchange, and other mail servers. Provides transparent access to your archived email. Allows users to have GB's of email but not have to worry about saving .pst files, etc. Basically an "unlimited" inbox.

      To scale the system you add additional "smart cells" which are the storage components of the system (2U servers, 900GB RAID, mirrored onto a second smart cell). To scale throughput, you add additional portal servers which handle the requests.

      http://h18006.www1.hp.com/products/storageworks/riss/index.html

    25. Re:Split up the tasks by harlows_monkeys · · Score: 1
      Yea, don't you know that you're supposed to report all the unknown bugs to the developers as well as the known ones?! :)

      I see the smiley, but this is a good place to point out that it is possible to estimate how many unknown bugs are in a program, from the distribution of bugs that have already been found.

      So, for example, if someone says they think that all bugs in TeX have been found, there is a good chance they are right. If someone says all bugs in Sendmail have been found, they are probably wrong.

    26. Re:Split up the tasks by thogard · · Score: 1

      Yes, it will eat up lots of inodes. And you waste huge amounts of space.
      It's better if most people are sending huge attachments but not as good if most users check messages frequently and use mostly plain text messages.

    27. Re:Split up the tasks by thogard · · Score: 2, Interesting

      All of these systems will be running sendmail.

      That would be an absolute nightmare. Postfix is just as functional and orders of magnitude easier to administer.


      If it's a million seats, it's not going to be easy to admin at all. It will require several people that know MTAs inside and out and sendmail has a track record in very large systems.

      Remember that in this case, the job will be 100% running an email system so the best tool for the job should be used, not the best tool for the admin.

    28. Re:Split up the tasks by yuri+benjamin · · Score: 1

      It's hard to take advice on sendmail from someone who displays such an obvious lack of deep thought on other issues.

      Some of the most technically competent people I know have totally fucked up (IMHO) views on other (unrelated) matters.

      Have you never heard the term Idiot savant?

      --
      You make the mistake of thinking you can educate the fundamental stupidity out of people. You can't.
    29. Re:Split up the tasks by thogard · · Score: 1

      One key thing to look at is how many race conditions and operating system bugs your MTA works around. Sendmail has many, many patches to work around other broken systems. At least one other MTA claims it's "an OS problem" and doesn't even try to fix it.

    30. Re:Split up the tasks by ipjohnson · · Score: 1

      You also forget that it's much, much easier for distributed file systems to handle ;-) The maildir format has a much higher fault tolerance. If something goes wrong, there is only one message, not a whole mail file, to f' up.
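The fault-tolerance argument comes from how Maildir delivery works: each message is written to its own file under tmp/ and then renamed into new/. A minimal sketch of that sequence follows, assuming a local POSIX filesystem; the `deliver` helper and the demo mailbox path are made up for illustration, and real MTAs use more careful unique-name schemes.

```python
import os
import socket
import tempfile
import time


def deliver(maildir: str, message: bytes) -> str:
    """Write one message into a Maildir and return its final path."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)

    # Unique file name built from time, process id and host name.
    name = f"{time.time_ns()}.{os.getpid()}.{socket.gethostname()}"
    tmp_path = os.path.join(maildir, "tmp", name)
    new_path = os.path.join(maildir, "new", name)

    # Write the complete message under tmp/ and force it to disk.
    with open(tmp_path, "wb") as f:
        f.write(message)
        f.flush()
        os.fsync(f.fileno())

    # The rename is atomic, so readers of new/ see either the whole
    # message or nothing; a crash can only damage this one file.
    os.rename(tmp_path, new_path)
    return new_path


box = tempfile.mkdtemp(prefix="maildir-demo-")
path = deliver(box, b"Subject: hello\n\nOne file per message.\n")
print(path)
```

With mbox, by contrast, every delivery appends to one shared file, so it needs locking and a failed write can corrupt the whole mailbox.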

    31. Re:Split up the tasks by Etyenne · · Score: 0

      And precisely what makes it the "best tool for the job", except a warm fuzzy feeling about it?

      I built a mail system for 100K accounts on Postfix. I have absolutely no doubt it would scale to 1M given appropriate hardware.

      Anecdote: to make Postfix deliver mail via LMTP, you just add a "mailbox_transport = lmtp:unix:/path/to/socket" directive (or a transport table entry saying approximately the same). Easy as pie. I once watched an experienced sendmail admin (think > 10 years experience) fight for over 15 minutes compiling m4 macros to achieve the same result with sendmail. I know nothing of sendmail except the very basics, and this incident taught me it was not worth wasting my time on it either.

      Sendmail should be dead and buried. Its time has passed. Both Exim and Postfix are modern and well-maintained MTAs that have been built with a clear idea of what *not* to do, given sendmail's history. And they succeed pretty well at it too.

      --
      :wq
    32. Re:Split up the tasks by Etyenne · · Score: 1
      Sendmail is asking for trouble, until they completely throw out the old code and rewrite it from the ground up, with security in mind.

      It's been done already; it's called "Postfix".

      --
      :wq
    33. Re:Split up the tasks by thogard · · Score: 2, Informative

      Lots of things should be dead and buried (like :wq in your sig, where did that come from? there are very few people who were using ex in the days before :x or ZZ).
      However sendmail isn't one of them. Just because there was an issue with m4 (also something that should be gone) doesn't mean the core app is broken. M4 use started over a decade ago when even awk wasn't consistent on all unix systems.

      On one setup that has both sendmail and postfix, I know postfix loses far more mail than sendmail (which has lost none).

      I use sendmail because I can make it do everything I want it to and sometimes I have to have an MTA that does odd or unusual things. I've spent time and learned how to make it do very unusual things (at the .cf macro level) and it is very powerful since it has a full programming language built in. I've used it in very large installations and it works and it keeps on working. It hasn't ever let me down. I can't say that for the other MTAs.

      Also, a complete rewrite of sendmail is being done right now. Too bad it's taking away all the cool low-level macros, but I expect most people will find that an advantage.

    34. Re:Split up the tasks by RollingThunder · · Score: 1

      Amen. LDAP may be a bit of a bear to learn at first, but the way you can tie everything together really can't be beat.

    35. Re:Split up the tasks by Nefarious+Wheel · · Score: 1

      A mail system of a million or so users is likely to see them geographically distributed, too -- it might not be a great idea to have everyone pipe through the same ISP without a bit of soul-searching. Consider the network, and strategies for which bits reside where, or you'll be up for some fairly hairy replication costs. You have been warned. An institution of that size will probably have Sev-1 failover to an alternate geographic location as a post-911 business risk mitigation policy. Good high-end NAS with a high-bandwidth off-site replica will need to be part of your storage infrastructure, whatever tool you use. You'll need to consider how you'll handle single-instance storage too, to avoid attachment bloat. Internet usage is key to an email system but primary usage will generally always be person to person within the organisation, and you'll need to optimise that.

      --
      Do not mock my vision of impractical footwear
    36. Re:Split up the tasks by Kafka_Canada · · Score: 1

      You just said two seconds ago it was about social problems. Or is it about revenge and retribution for social problems? Damn you, Amerikkka, our illiterate society has theocrats running our countries into the sewer, state-controlled economies, and none of the freedoms you take for granted! We must fly planes into your buildings!

      For fuck's sake, you (like the terrorists you're feebly attempting to defend) are completely demented.

      --
      Fuck it
    37. Re:Split up the tasks by Anonymous Coward · · Score: 0

      Ah yes. Lies, damned lies, and statistics. I have a hard time figuring out how thinking people can fall into this trap.
       
      It's not wrong, but it's a very far cry from being right.

    38. Re:Split up the tasks by QuantumG · · Score: 1
      like the terrorists you're feebly attempting to defend

      Typical war attitude, you're either with us or you're against us. Remember when W. said that for the very first time and the world recoiled in horror? Now it's considered the norm.

      --
      How we know is more important than what we know.
    39. Re:Split up the tasks by 4of12 · · Score: 1

      I'd heard Maildir was a higher performance storage system, too. It reminds me some of how MH would store messages in individual files.

      But, then, what motivated the Evolution team to drop Maildir support for mbox?

      --
      "Provided by the management for your protection."
    40. Re:Split up the tasks by Etyenne · · Score: 1
      On one setup that has both sendmail and postfix, I know postfix loses far more mail than sendmail (which has lost none).

      That smells a lot like FUD to me. Hopefully, you have been filing bug reports with Postfix about it. Losing a single email is a serious bug, and I am pretty certain it would have been acted upon in earnest.

      Personally, I never lost a single mail (that I know of) with Postfix. But I guess my anecdotes are no more worthy than yours.

      I've spent time and learned how to make it do very unusual things (at the .cf macro level) and it is very powerful since it has a full programming language built in.

      That's basically the only reason left to use sendmail: you have spent time learning it and you hope to get a return on that investment. Some misplaced nostalgia that should be written off, IMHO. In 2005, there is no reason for a fledgling admin to go through the hardship of learning the sendmail mess, except if he has to support a legacy installation.

      And maybe I lack imagination, but I can't think of anything that could not be done in Postfix, given a content filter as a last resort. What's the point of learning a specialized mini-language when you could use one that you already know, and that is more generally useful anyway? And Exim is apparently even more programmable.

      Preaching about sendmail's programmability sounds to me like a case of every problem looking like a nail to a hammer holder.

      --
      :wq
    41. Re:Split up the tasks by sco08y · · Score: 1

      ...you're either with us or you're against us. Remember when W. said that for the very first time...

      "You're either with us or you're against us" is a cliche that predates Bush by at least 50 years, and a concept that probably predates written history.

    42. Re:Split up the tasks by QuantumG · · Score: 1

      Yeah, but there hasn't ever been a statesman who has used it seriously in referring to a war on an emotion.

      --
      How we know is more important than what we know.
    43. Re:Split up the tasks by EjayHire · · Score: 1

      My mail implementation is quite a bit smaller, by a factor of 1000 to be specific, and I accept that there are limitations to what I'm about to suggest. With that disclaimed, here is where I would start. I agree with splitting up the tasks, and I see more than three roles.

      Receivers - responsible for receiving, spam/virus filtering, and writing messages to the user stores.

      Storage servers - responsible for holding the users' mail. NFS has been chastised in this thread, but it probably is the right tool for the job.

      User servers - database servers that hold the master user information for the email system: username, password, quota, NFS path to mail.

      Retrieval servers - these provide the end-user interface to the users, including POP/IMAP/webmail.

      Delivery servers - these are the outbound mail servers, with a set of requirements that could be a thread unto itself. Anti-virus at a minimum, preferably with some sort of tracking/shutdown/rate-limit capabilities to prevent owned machines from being used for nefarious purposes.

      To me, it starts with the user server. I store my user data in MySQL, and don't know if that specific database will scale to 1 million users. It's a simple indexed query, so it should be okay, with replicas. Simple task: store the username, password, email address(es) and path to the mail stores.

      Receiver servers. These pull a replica of the MySQL database from the user server to a local instance of MySQL for speed. They receive emails via Postfix, and write them to the NFS storage. Redundancy and scalability are available through round-robin DNS.

      Storage servers. Straightforward. The mail is stored as Maildirs; the NFS boxes need speed and reliability. The underlying filesystems need to be optimized for large numbers of small files. Scalability: using a SAN and blade servers would allow for quick failover from one server to another if a specific server failed.

      Retrieval servers. These provide the end-user interfaces. For my small implementation, I use courier-imap for POP and IMAP access, and it can already talk to MySQL for the user and store information. Scalability: round-robin DNS and layer 4 load balancers should fit the bill.

      A note on webmail: I'm torn between a webmail frontend that speaks IMAP and abstracts from the mail server, or one that connects to MySQL and the NFS path. Courier Webmail supports the latter, and any number of webmail packages support the former. This is left as an experiment for the reader. Personally, I don't trust a webserver with read/write access to the mail stores, and I use an IMAP webmail package for abstraction.

      Delivery servers. This is a whole other thread. At a minimum, you'd want to virus scan outbound mail. Scalability is through round-robin DNS and layer 4 load balancers.

      Ejay Hire
      EjayHire@hotmail.com
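The "simple indexed query" on the user server could look roughly like the sketch below. It uses Python's built-in sqlite3 as a stand-in for MySQL so that it runs anywhere; the table layout, column names, sample row and the `lookup_maildir` helper are all assumptions for illustration, not the poster's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        address   TEXT PRIMARY KEY,   -- primary key gives the indexed lookup
        username  TEXT NOT NULL,
        pw_hash   TEXT NOT NULL,      -- store a hash, not the password
        quota_mb  INTEGER NOT NULL,
        maildir   TEXT NOT NULL       -- path on the NFS mail store
    )
    """
)
conn.execute(
    "INSERT INTO users VALUES (?, ?, ?, ?, ?)",
    ("alice@example.com", "alice", "<hash>", 250, "/mail/store03/a/alice/Maildir/"),
)


def lookup_maildir(address: str):
    """Return the Maildir path for an address, or None if it is unknown."""
    row = conn.execute(
        "SELECT maildir FROM users WHERE address = ?", (address,)
    ).fetchone()
    return row[0] if row else None


print(lookup_maildir("alice@example.com"))
print(lookup_maildir("nobody@example.com"))
```

Because every receiver and retrieval server only reads this table, local read-only replicas (as described above) keep the lookup off the master.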

    44. Re:Split up the tasks by Anonymous Coward · · Score: 0
      Have you never heard the term Idiot savant?

      Relating to the grandparent (he of the silly .sig), it's more like "Idiot Savant" without the "Savant" part.

  38. New Google Appliance by Anonymous Coward · · Score: 3, Interesting

    I agree. The Google appliance should implement Gmail and a web front end for administration. Like the Cobalt machines of yore, only better. Google-ified.

    It really is the best email.

    1. Re:New Google Appliance by smileyy · · Score: 1

      I imagine that it's only a matter of time before this is the case. Gmail kicks the snot out of every other email client.

      --
      pooptruck
  39. Ask t35.com by Anonymous Coward · · Score: 0

    They know everything, they are the "uber" in ubersmart.
    Or was it the "goober" in goobersmart?
    anyways....

  40. Novell? by lorien420 · · Score: 2, Informative

    www.myrealbox.com is a tech demo of NetMail and eDirectory.

    --
    "[We'll be] really getting inside your head and making it an unpleasant place to be" -- Trent Reznor
  41. If theyre using exchange by gad_zuki! · · Score: 2, Insightful

    they're probably using the groupware too. Are they also willing to ditch outlook?

    If you're looking for a groupware replacement, then you've got a big job ahead of you. Scalix is a mess, Bynari is a hack, etc. Once you do get them running, the things end users end up buying, like PDAs and apps that hook into Outlook, are going to cause more problems.

    If it's just POP/IMAP you really can't go wrong. A good webmail option is kind of a catch. Squirrelmail is nice, but compared to OWA it's really out of its league.

    If your post told us what they were fed up with and how they used their system you'd get some real advice. Expect the usual postfix vs qmail vs sendmail vs whoever mini-flamewars.

    1. Re:If theyre using exchange by killjoe · · Score: 1

      I don't see what kind of a choice they have. They are already using exchange and are fed up with it. I can't even imagine trying to handle a million mailboxes in exchange either.

      They know for sure exchange won't work for them so it seems obvious that if they need to ditch outlook they will. It would be a bonus if outlook could connect to whatever they will end up with but it has to be ditched if push comes to shove.

      Their choice is not exchange or something else. Exchange is not an option, they tried it and it does not work. They just need to find something that will actually work and scale.

      --
      evil is as evil does
  42. bring in a consultant? by Khashishi · · Score: 1

    What if he IS the highly paid consultant?

    1. Re:bring in a consultant? by psyon1 · · Score: 1

      Then his customer is getting ripped off.

    2. Re:bring in a consultant? by airjrdn · · Score: 1

      And that would be different than every other consultant/customer relationship how?

    3. Re:bring in a consultant? by MrKahuna · · Score: 2, Funny

      Well, he seems aware that he doesn't, in fact, know everything.

    4. Re:bring in a consultant? by subterfuge · · Score: 2, Funny

      so he can't be management...

    5. Re:bring in a consultant? by airjrdn · · Score: 1

      Heh, you got me there.

    6. Re:bring in a consultant? by bladernr · · Score: 1
      And that would be different than every other consultant/customer relationship how?

      Because many consultants are hard-working, honest, and do bring in unique expertise?

      I know it's not in fashion to say, but many people become consultants because they are really good at some specific thing. I once had some serious Oracle scaling issues in an application, and hired an Oracle consultant (independent, not employed by Oracle). He fixed me right up, although my team of certified and experienced DBAs could not (and they were really sharp - this was the only major issue we had that they didn't solve).

      Now, it turns out this guy was one of the lead architects of the original Oracle SMP on UNIX implementation. Before that, he worked at Bell Labs on A/SMP research. You felt smarter just from being around this guy, but he never treated anyone like they were anything other than his equal.

      And he was a highly paid, full-suit-and-tie wearing consultant.

      --
      Sarcasm and hyperbole are the final refuges for weak minds
    7. Re:bring in a consultant? by airjrdn · · Score: 1

      Obviously there are good ones out there. My comment reflected the approximate 99.999% of the ones I've dealt with who've typically had to be taught both business rules as well as technical information, turned out products that didn't even meet the initial goals, got paid about 3 times what other staff did, and were seen by upper management as the 2nd coming.

      The company I work for typically hires a consultant or third party vendor to get something "done". About 2/3 of the way through the development, they have us start writing version 2 due to the bugs, lack of attention to detail, and nonexistent foresight for how to make the product scalable. While one can say that a consultant's job is to get it done, not get it done right, or make it scalable, that still doesn't make them seem any more attractive. Unfortunately, during our creation of version 2, we're generally having to fix all the crap they designed/coded into their version.

      We generally hire them to get a product out the door (first to market) so staff can work on the real version. The staff hate it because if the company would simply offer a 5th of the money going out to the consultants, they'd be willing to do it in off hours. Upper management have heard this numerous times yet still choose not to do it.

      We've got a consultant in the door right now. Our work hours are 8 to 5. She shows up somewhere between 8:15 and 8:45 and leaves whenever she wishes. On more than one occasion she has spent over 6 hours of the day on her cell phone in our break room. She has been seen 3 times on Monster.com or an equivalent, signing up, etc., and finally had to be told to leave existing staff alone. She was spending a lot of time simply asking how to do things she should obviously know how to do. Oddly, she went through a technical interview just fine. She seemed confident in her answers, they weren't questions she could guess on, and yet when she actually had to sit down to do the work, it's like she was brain dead. Since she's a consultant, I can't have her work particular hours; all I can do is make sure she meets deadlines and delivers the product (regardless of how sadly it's designed or coded). Unfortunately, staff don't exactly love watching her do whatever she wants while making more money than them.

      Consultants have their place, I just wish it wasn't where I work.

  43. exchange by Anonymous Coward · · Score: 0

    Don't count out the possibility of Exchange. If set up properly it can be very powerful and scalable, and it is easier to administer than most Unix alternatives.

    1. Re:exchange by Anonymous Coward · · Score: 0

      Still sucks hard though.

  44. Unless It's A Very Old Exchange System... by zentec · · Score: 2, Insightful

    ...they need to think about this very carefully.

    I'm sure someone, somewhere within the enterprise is using features of Exchange that they won't get anywhere else. Not to sound like a Microsoft fan-boy sock puppet, but there's some features that Exchange has that people in a business environment just love.

    However, since you asked. I'd run Exim or Qmail and Cyrus IMAP.

    1. Re:Unless It's A Very Old Exchange System... by Anonymous Coward · · Score: 0

      CommuniGate Pro with MAPI/Groupware functionality. Completely emulates Exchange, sans Instant Messenger, but plus things like integrated Voicemail/PBX support.

    2. Re:Unless It's A Very Old Exchange System... by Anonymous Coward · · Score: 0

      I'm sure someone, somewhere within the enterprise is using features of Exchange that they won't get anywhere else.

      ...and if that someone, somewhere is willing to cough up the annual TCO difference between Exchange and exim et al then you might have a good point. Before you sign up to be that guy, realize that the difference could be on the order of hundreds of millions of dollars.

    3. Re:Unless It's A Very Old Exchange System... by WilliamSChips · · Score: 1

      I'm sure you could add Jabber to get IM functionality...

      --
      Please, for the good of Humanity, vote Obama.
    4. Re:Unless It's A Very Old Exchange System... by Anonymous Coward · · Score: 0

      Actually, it is rarely *exchange* that corporations like--it's the functionality in Outlook that combines group calendaring and email.

      Fortunately, it is not necessary to have Exchange servers to run Outlook clients if that is what the customer wants. In fact, the only "feature" of Exchange that is occasionally difficult to replace is its ability to incorporate VBAScript to customize it. Of course, that is also the source of much of its security woes.

      Several companies have licensed the former HP Openmail and offer it with a few new wrinkles--and that would be an excellent way to go. There are other Exchange replacements, too, that would also work. As several have pointed out, Novell has mail solutions that would actually offer some advantages of their own.

      In another year or so, I would look seriously at Chandler, as it may transform the use of email for many. It simply isn't quite there yet--but keep an eye on it. http://www.osafoundation.org/

    5. Re:Unless It's A Very Old Exchange System... by killjoe · · Score: 1

      "I'm sure someone, somewhere within the enterprise is using features of Exchange that they won't get anywhere else. "

      They have exchange now. They don't like it. It doesn't work for them. They want to get rid of it.

      At this point it makes no sense to talk about Exchange. They tried that and it doesn't work for them.

      --
      evil is as evil does
  45. its a trade-off by jokach · · Score: 1

    My opinion, for what it's worth, is that you'll have a hard time meeting all those requirements perfectly in one product. You usually have to make a trade-off, because some systems are more scalable but may not provide great webmail, while others may not manage free accounts as well but might have a great webmail interface. I think you have to pin down which requirements matter most for the email system and meet those, instead of looking for the perfect product; I doubt that you'll find it ...

  46. Easy as W33T by Anonymous Coward · · Score: 0

    GPL all the way!!!!

    Linux of course. 2 machines. You can use wimax for the interweb.

    Did I mention FSF only? Corrupt software sux.

    LAMP has to be used, it is sooo much better. Mysql can scale to 1Billion users so its best bet.

    Perl should be used for front end as it is fast.
    Maybe ajax rendering, but need GPL component as AJax - GPL = SUX.

  47. What is wrong with exchange? by pw700z · · Score: 1

    What are they fed up with? I think it would be easier to recommend something if we understood the current problems. I mean, Exchange is fairly awesome:

    - MAPI for a mail client protocol is hard to beat
    - The webmail client is quite good
    - The integration of calendars and such

    Isn't Exchange the benchmark everyone is trying to reach? Why go backwards? I know not everyone loves Microsoft, but Exchange is really good stuff.

    1. Re:What is wrong with exchange? by Anonymous Coward · · Score: 0

      Fuck Microsoft - the threadstarter is making progress.

  48. For the lazy... by Spy+der+Mann · · Score: 5, Informative

    Here's Slidey's post. (Disclaimer: Copyright blahblahblah appropriate people yadda yadda fair use etc etc don't sue me, thank you)

    ---
    ok i work for a large uk isp in the messaging (email) operations dept. we currently have 2.5-3 million active accounts (and a load of suspended), and manage anywhere up to 12-16 million mails per day

    our setup is like this (this is simplistic though):

    front line - anti abuse mta's - these do dnsbl type lookups (spamcop, spamhaus and sorbs). we have 9 incoming
    next we have mta's. they farm mail off to brightmail servers, which do similar to spamassassin. we have 6 incoming mtas, and 8 brightmail servers (not enough - high load)
    after that they farm off to vscans (6)
    after that any mail that gets through is delivered to mail stores (8 + 2 hot spares)

    what you want to be doing is similar to this above - chaining the mail from one level to the next. the first level should be the rbl's - these are less processor intensive, and can remove a fair whack of your mails in one swoop. spamassassin is going to be more cpu intensive, since it has to open each mail and read the first x many bytes

    id have separate machine(s) holding your master directory, and if you can get directory caches then do that too (to take the load off the master directory) - ours run oracle

    i dont know what your budget is, but split up the different tasks as much as possible. that way if you need to add more to any pool (rbl lookups, spamassassin etc) you just add another machine..

    one last thing - we also have a separate box just for postmaster mail (with exim + spamassassin funnily enough) - it tends to get busy

    Last edited by Slidey on 09-08-2005 at 11:19 PM
    --
    (end of quote)
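The first level in that chain, the RBL/DNSBL check, is cheap because it is only a DNS lookup: reverse the sender's IPv4 octets, append the blocklist's zone, and see whether the name resolves. A minimal sketch follows; the zone `dnsbl.example` and the fake resolver are placeholders so that the example runs offline, and a real deployment would query an actual list such as the ones named in the post.

```python
import socket
from typing import Callable


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the lookup name: reversed IPv4 octets followed by the zone."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str,
              resolve: Callable[[str], str] = socket.gethostbyname) -> bool:
    """True if the list returns an address for the IP, False if lookup fails."""
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


def fake_resolver(name: str) -> str:
    """Stand-in for DNS with invented data, so no network is needed."""
    if name == "2.0.0.127.dnsbl.example":
        return "127.0.0.2"
    raise socket.gaierror("name not found")


print(dnsbl_query_name("192.0.2.10", "dnsbl.example"))
print(is_listed("127.0.0.2", "dnsbl.example", resolve=fake_resolver))
print(is_listed("192.0.2.10", "dnsbl.example", resolve=fake_resolver))
```

Rejecting on this check before the message body is ever accepted is what keeps the heavier content scanners further down the chain from seeing most of the junk.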

    1. Re:For the lazy... by erikharrison · · Score: 1

      Of course, I'm criticizing a C and P post, but . . .

      I'm sure Slidey's thought of this, but there have to be some checks against filtering out too much mail at the early levels, and not providing human confirmation at higher ones.

      16 million messages a day. Even at a 99.99 percent success rate of tracking spam from ham, that's over 16 thousand incorrectly classified messages.

      Remember, this description is simplistic.

    2. Re:For the lazy... by LurkerXXX · · Score: 1
      1,600 incorrectly classified.

      Check your math.

    3. Re:For the lazy... by therus121 · · Score: 3, Informative
      I work with Slidey, but in the Solutions side of the team (i'm the guy who architects the infrastructure of the platform). Here's a few additions:

      1. Storage - "Disks, lots of Disks" - we use EMC DMX3000's for the stateful machines (~180TB raw) which work very nicely.

      Your back end needs to handle lots of small random writes - this makes storage vendors cringe when mentioned, as it makes a mockery of their lovely benchmarks.

      2. Clustering - you'll need that also on your master directory and message stores. Veritas is nice.

      3. Load balancing - For the front end boxes (pop, imap, web). Cisco CSS's are pretty good for this.

      4. OS - We run Solaris. It might not be the fastest thing around, but it works pretty much non-stop; has good vendor support and is very mature. RedHat might be on the horizon as well as Solaris for x86. Windows? don't be daft.

      5. Test environment. Have a scaled down exact copy of the production system to test things on. i can't stress how important this is.

      6. Proper automated server build procedure. One word - Jumpstart. All OS and application configs and builds in Jumpstart. So if you lose a box, it's no big deal to rebuild it at 3am on Saturday morning when you've had a bevvy or two the night before, and all you feel like doing is chundering (i speak from experience - a SunFire 6800 does not respond well to projectile vomit)

      One correction to Slidey's post: we now have 16 brightmail boxes (10 in, 6 out) and it's not enough.

      Cheers.

    4. Re:For the lazy... by Anonymous Coward · · Score: 0

      And how many of those 1,600 are spam getting in vs honest mail getting blocked?

    5. Re:For the lazy... by LurkerXXX · · Score: 1

      If their ratio is anything like I get with the filter I use at home (I eventually get around to taking a look at what's fallen into my spam bucket), about 16. ~99% of the mislabeled mail from my filter is spam that makes it through the filters. Not too bad out of 16 million.
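For anyone following the numbers in this sub-thread, the arithmetic can be checked in a few lines. The inputs are the figures quoted above (16 million messages a day, 99.99 percent accuracy, and roughly 99% of mistakes being spam that slips through); integer arithmetic is used to avoid floating-point noise.

```python
messages_per_day = 16_000_000
misses_per_10k = 1   # 99.99% accuracy means 1 miss per 10,000 messages

misclassified = messages_per_day * misses_per_10k // 10_000

# Assumption from the last comment: about 1% of the mistakes are
# legitimate mail wrongly blocked; the rest is spam getting through.
legit_blocked = misclassified * 1 // 100

print(misclassified, legit_blocked)
```

That gives 1,600 misclassified messages a day, of which about 16 would be legitimate mail, matching the figures in the replies.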

  49. actual suggestions by Anonymous Coward · · Score: 0

    There are companies that specialize in this, haven't followed them recently, so don't know who is still in business.

    Check criticalpath.net, another possibility might be commtouch. I once dealt with these and other companies when looking to outsource the email portion of an internet service.

  50. 1 mil users, huh? by Anonymous Coward · · Score: 0

    commercial sendmail on veritas clustered front end, fiberchannel storage on SAN for spools, probably with an ldap layer providing internal routing and backend for user profile data? -jms

  51. Has anyone tried Zimbra software? by coder_96 · · Score: 1

    http://www.zimbra.com/ The flash demos look nice anyway.

    1. Re:Has anyone tried Zimbra software? by marcmac · · Score: 1

      I have. It definitely supports everything listed in the requirements (pop, imap, webmail) plus calendar, etc.

      Is open source a requirement, or just not MSFT?

  52. Been there by Gr8Apes · · Score: 1

    Done it. With Exchange, believe it or not. 2.5M seats, in a single Exchange/NT environment (not single server farm though - it was distributed...)

    You haven't defined your real requirements, nor what 99.9% uptime really means. For such a large site, 99.9% uptime is generally defined in terms of full site responsiveness, outside of maintenance windows. Anything less is suicidal, and I'd walk away from it. Maintenance windows should more than cover your backup windows, planned upgrades, etc. This doesn't mean that you'll use each available window on, say, Saturday night from 8pm to 4am or something, but it gives you a nice window for major planned events.
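To put a number on what 99.9% allows, here is a small calculation of the yearly downtime budget, with and without excluding maintenance windows from the measured time. The weekly 8-hour window is taken from the Saturday-night example above; the function name is hypothetical, and how uptime is actually measured would be set by the contract.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_budget_hours(availability_percent: float,
                          maintenance_hours: float = 0.0) -> float:
    """Unplanned downtime allowed per year at the given availability.

    Maintenance hours are excluded from the measured period first.
    """
    measured = HOURS_PER_YEAR - maintenance_hours
    return measured * (100 - availability_percent) / 100


no_windows = downtime_budget_hours(99.9)
weekly_windows = downtime_budget_hours(99.9, maintenance_hours=52 * 8)

print(f"{no_windows:.2f} h/year with no maintenance windows")
print(f"{weekly_windows:.2f} h/year unplanned, plus {52 * 8} h of planned windows")
```

Without maintenance windows, everything (backups, upgrades, failures) has to fit into under nine hours a year, which is why the definition matters so much.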

    --
    The cesspool just got a check and balance.
  53. Try Hula by cplim · · Score: 1

    Novell's created an open-source mail server project called Hula that's based in part on their original NetMail codebase. Its aim is to provide a mail server that's easy to use and also scalable. Disclaimer: I haven't tried it, but have only heard about it.

    1. Re:Try Hula by papplegate · · Score: 1

      I run Hula on my small domain, but it supports plenty. The post above about www.myrealbox.com is really talking about Hula, as it is using code from NetMail. Give Hula a try, you won't be let down.
      http://www.hula-project.com/Hula_Server

      Paul

    2. Re:Try Hula by Anonymous Coward · · Score: 0

      Yeah, it may be good for this guy... except for the little fact that it's not complete yet.

    3. Re:Try Hula by papplegate · · Score: 1

      It is not complete, but try it. It won't let you down. It is more stable than any other beta mail server I have seen.

  54. Homework? by jeffChuck · · Score: 0

    Are we doing your homework for you? One would have to think that a company of such size in the real world would hire somebody who doesn't have to ask slashdot how to do his job.

  55. Still Have to Engineer it by DavidDPD · · Score: 3, Interesting

    I'm not sure that there is any commercial solution that can support 1 million email accounts well. Hence Yahoo and Google have built their own custom systems. Some engineering may be required.

    For pop3 & imap4rev1, look at:
    http://www.dbmail.org/index.php?page=overview

    You still need an MTA. I think qmail is the fastest and best, but I'd use exim, as it's easier.

    Database - not sure if MySQL and PostgreSQL will scale with dbmail.

    I'd say use FreeBSD, because of the ports collection (Don't linux Flame me). However, something like Solaris 10 x86 (or Solaris+Sun Hardware) might provide a bit better scaling, and HA hardware, SAN support, support in general, etc. Though, a bit tougher on the OSS software installs (In My Experience)

    1. Re:Still Have to Engineer it by bani · · Score: 1

      qmail isnt the fastest. it may be the most secure, but it's not fastest.

      the qmail license scares a lot of people away though, the rest get scared off by djb's ego.

    2. Re:Still Have to Engineer it by QuasiEvil · · Score: 2, Insightful

      I'd strongly consider exim and maybe postfix if you're not looking to go with good ol' sendmail. That's the voice of a five year qmail user talking.

      I currently run qmail in a small production environment, handling about 20k messages a day. It's small, but enough to point out the cracks.

      qmail does many things well, but it also is a product of DJB-bizarroworld. The worst of the offenses, in my book, is that due to his security model, the smtp receiver will accept messages to any recipient, not just valid ones. Then, if it can't figure out what to do with it, it generates a bounce message - which usually bounces. This can kill a machine and a network connection during a dictionary spammer attack. Implementing SMTP-AUTH with qmail is a royal, gigantic, immense, overwhelming pain in the ass. It took me several hours to get it all patched together and working.

      Want any of the above to work? Patch. Want a blacklist of users that shouldn't get mail? Patch. Want SPF support? Patch. Want the non-POSIX use of errno to be fixed? Patch. Usually, the patches don't go together smoothly, so you wind up spending hours figuring out the rejected chunks and how to properly patch them together. And this is a modern MTA?

      While I've patched qmail to deal with a host of issues, there's no reason a modern MTA should need to be patched for most of these. The rcpt authentication thing is just downright dumb, and smtp-auth is reasonably widely supported with the ESMTP standard.
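
      For what it's worth, the recipient check is not complicated in principle. A minimal sketch in Python, assuming a hypothetical in-memory user table and a made-up `rcpt_response` helper (neither is qmail code nor any real patch's API):

```python
# Hypothetical mailbox table; a real system would query LDAP or a database.
VALID_RECIPIENTS = {"alice@example.com", "bob@example.com"}


def rcpt_response(address: str) -> str:
    """Return the SMTP reply line for a RCPT TO command."""
    if address.strip().lower() in VALID_RECIPIENTS:
        return "250 2.1.5 Recipient OK"
    # Refusing here means the message is never accepted,
    # so no bounce has to be generated later.
    return "550 5.1.1 User unknown"


if __name__ == "__main__":
    for rcpt in ("alice@example.com", "nosuchuser@example.com"):
        print(rcpt, "->", rcpt_response(rcpt))
```

      Rejecting during the SMTP conversation leaves the failure with the sending server, which is what keeps a dictionary attack from filling the local queue with undeliverable bounces.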

      I'm testing exim right now, and I'm pretty happy with it. It's fairly light, does everything I want and need, and isn't the configuration quagmire of sendmail. As soon as I rebuild the mail server, I'm switching the production environment away from qmail.

      If you're a hard-core qmail adherent, that's great. It's fast and reasonably easy to configure in its basic form. However, I prefer something that's more standards-compliant and feature-rich right out of the tarball.

      My advice to anybody considering qmail for the first time is to try it, but consider other popular MTAs like exim and postfix as well, including the 800lb. gorilla, sendmail. It's a pain, but get the O'Reilly book and you can do positively anything (and I do mean anything) you want with it.

    3. Re:Still Have to Engineer it by pbhj · · Score: 1

      >>> "I'm not sure that there is any commerical solution that can support 1 million emails well. Hence why Yahoo and Google have built there own custom systems. Some engineering may need to be required."

      Well if Yahoo and Google have already done it, then I'd approach them for a quote on using their engineers to implement your system. They could say f***-off. They might give a price.

      They might come over all generous and hand you their email system implementation docs ... but I doubt it!

      If you can't get the corps to play ball, find out who the lead engineers were and head hunt them. I reckon a one-million strong corporation could do this without blinking.

      One other thought. Check for recent patent publications on email systems from either corp. Long shot, but it may give you a start. This patent application from Google (http://v3.espacenet.com/textdoc?DB=EPODOC&IDX=AU2003299904&F=0) talks of an elision module - for removing header or repeating elements from mail. Which sounds like it might be a system for storing tokens in place of spam mails.

      Or there's http://v3.espacenet.com/textdoc?DB=EPODOC&IDX=WO2005046111&F=0 from Yahoo ... but neither of these help particularly. One thing corps seem to do is use abstractions of descriptions of real systems for the patent disclosure. It might help. I think it was the Google one that mentioned the three-fold sub-division suggested elsewhere on this page.

    4. Re:Still Have to Engineer it by bani · · Score: 4, Insightful

      if you need another reason not to use qmail, this is a good one.

    5. Re:Still Have to Engineer it by Anonymous Coward · · Score: 0

      Though, a bit tougher on the OSS software installs (In My Experience)

      Check out NetBSD's pkgsrc. It works fairly well with Solaris (though X11 stuff can get a bit funky, especially if you compile against /usr/openwin).

    6. Re:Still Have to Engineer it by Kadin2048 · · Score: 1

      Huh? You're not making any sense. There are too commercial solutions that support one million accounts (which means a lot more than one million emails), because there are corporations which sell these products and use them, which by themselves have more than one million accounts. Witness IBM, and AT&T.

      I would bet that Yahoo and Google built their own systems not because there wasn't a commercial product out there that would scale to the size they wanted, but because there wasn't a commercial product that would scale to the size they wanted at the price they were willing to pay. They're high-tech, in particular internet, companies. So the "build it or buy it" equation looks very different for them than it would for, say, the Navy. Or a big consulting firm. Or Wal-Mart. Or frankly, for anyone whose business ISN'T servers and software.

      Just because Yahoo and Google went the DIY route doesn't mean that anyone else should; it only means that if you somehow ended up in a position identical to theirs, that it might be worth considering. But if servers and massive distributed systems aren't your company's (in managerese) "core competency," then you ought to go hire someone for whom it is.

      --
      "Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
    7. Re:Still Have to Engineer it by ^chuck^ · · Score: 1

      I've played with dbmail and have to say it's very kickass. It's been about a year and a half since I last played with it, but what I really enjoyed was the fact that all user email was stored in mysql/postgres...

      this allowed me to hack apart a slightly broken web mail client for it, which allowed MUCH faster email delivery than the standard IMAP based webmail programs. It also made searches fly!!!

      I also hacked together a simple XUL (mozilla) based email client, where the entire layout and "application" was stored on the server (saving a huge amount of time for "updates" to the client).

      Basically, I wanted to make a better exchange. Their API is abstracted from the database calls, so I played around with porting to Oracle as well with limited success (they depend on a couple of column elements that are present in MySQL and Postgres which have no equivalent in Oracle).

      Either way, if someone's interested in grabbing the code I started with reply to this message.

      I also have a openbsd + dbmail doc available
      http://www.lemure.net/~chuck/openbsd_help/dbmail.html

      don't bother emailing chuck@lemure.net, dead due to spam. Send questions/comments about it here.

      Either way, dbmail is a very clever idea.

      --

      Lemure, wtf! Don't you mean Lemur?
  56. Zimbra by Anonymous Coward · · Score: 0

    http://zimbra.com/ look at Zimbra

    Try the hosted demo. These guys, and their work, are/is awesome.

    disclaimer: I do not work for them, but it would be cool if I did.

    1. Re:Zimbra by anandp · · Score: 1

      If you've been to our website and you believe you're a fit, you know where to send your resume. :)

      http://www.zimbra.com/careers.html

  57. Book: Sendmail Performance Tuning by Anonymous Coward · · Score: 0

    http://www.jetcafe.org/~npc/book/sendmail/

    A good book on sendmail performance tuning, although a lot of it covers the OS.

    Then get The Practice of System and Network Administration.

    http://www.everythingsysadmin.com/

  58. Groupwise doesn't suck by Anonymous Coward · · Score: 1, Informative

    I know it'll get blasted, and I thought it did suck originally, but I am surprised by its scalability and reliability.

    It's not free, but it's not dependent on the Linux community either. There is a concentrated and very very dedicated support and development crew. Message store size can be up to 1/3 the size of Exchange, and moving servers around is a cinch.

    I'm not a Groupwise admin or anything, but I have been an Exchange guy, and I feel your pain.

    1. Re:Groupwise doesn't suck by Anonymous Coward · · Score: 0

      Agreed. GroupWise is a very stable, mature product that is easy to administer.

  59. pyramid scheme by krelyk · · Score: 0

    at the top of the pyramid - you have your mx servers/clusters... slap on your postfix+amavisd there to filter unwanted crap or cluster some barracudas... those would pass the wanted email to your routing servers/cluster at the middle of the pyramid. Then those would pass the email off to the pop3/imap servers (maybe one for each dept) at the bottom of the pyramid - enterprise-grade, connected to fibre-channel san ... Open source should do the trick if you have the $ to buy the hardware needed but not the software - I hear communigate pro (sp?) is nice if you are looking for something commercial. google for postfix or qmail for some nice howto's on the free stuff...

  60. When all else fails... goto spec.org by pci · · Score: 2, Informative

    Using this as a reference point (and from recommendations I've heard)...
    I recommend CommuniGate.

  61. Didn't We Just Have This Question? by vigilology · · Score: 2, Informative
  62. IBM + VMware ESX, RHEL, Postfix, Horde by Semireg · · Score: 1

    As a VCP/LPIC (VMware Certified Professional, Linux Professional Certified), I'd suggest considering a blade solution from IBM or Dell with VMware VIN (Virtual Infrastructure Node) installed to keep your server OS installations abstracted from hardware. Use RHEL as your guest OS, which will run your specific software applications.

    I'd be more than happy to consult on a large-scale VM installation.

  63. Courier! by temojen · · Score: 1

    Not just IMAP, but the whole shebang (MTA, webmail, POP3, IMAP, mailinglists, etc), plus you'd want OpenLDAP for storing all those passwords. I'm not sure how to set it up redundant and distributed, etc, but I'd wager that someone at the courier-mta website could point you in the right direction.

  64. qmail by Anonymous Coward · · Score: 0

    qmail is secure and scales wonderfully

    maybe first post?? :)

  65. lotus domino by Anonymous Coward · · Score: 0

    lotus is great. As a poor BOFH who normally admins Domino, and who is now, due to the crap employment market, admining an Exchange setup, let's say I feel pain, lots of pain.
    Domino - good security model, easy to implement, easy to keep secure, built like a truck and can take abuse, i.e. a mail file size that would annihilate Exchange, Domino does not break into a sweat.
    you will handle that many users without hassle
    cheaper than Exchange
    fewer security hassles
    simple and logical to set up
    call IBM; at the amount of users you have they will be selling their firstborns to get your business.
    good luck and enjoy migrating from Exchange

  66. just what is being replaced? by dAzED1 · · Score: 1

    I find it hard to believe that Exchange Server supports a million accounts in any sort of configuration that wouldn't barf on itself every 30 seconds.

    Just what is it this is replacing?

  67. One million accounts, but no Exchange? by ip_freely_2000 · · Score: 1


    I would have to think that you want support for a setup like this. Your options realistically probably boil down to one choice.

    You'll need a vendor with proven big time support and, unfortunately, OSS is not something you may be able to look at.*

    With proven installations as large as 400,000 users in a single organization, your only choice is.....Lotus Domino. Pricey though.


    * Wasn't Hotmail originally running on BSD? You may want to check its history.

  68. Gmail accounts... by slashname3 · · Score: 2, Funny

    I have several gmail accounts I can give you. Once you have several of these you can assign gmail accounts to the rest of your users. :)

  69. MDaemon by ScrewMaster · · Score: 1

    It's a good mailserver ... a million accounts though ...

    --
    The higher the technology, the sharper that two-edged sword.
  70. Toaster by Anonymous Coward · · Score: 0

    Might want to have a chat with Matt Simerson over at http://www.tnpi.biz/

    http://www.tnpi.biz/internet/mail/toaster/intro/features.shtml

    Good luck!

  71. I would use postfix by shunk · · Score: 2, Informative

    From my experience postfix scales the best for sending and receiving email. Use postfix+(mysql or ldap) + amavisd-new + clamav (or some proprietary alternative) + spamassassin. Cyrus is probably the best for pop and imap access. Squirrelmail for webmail.

  72. CommunigatePro from Stalker.com by ejoe_mac · · Score: 5, Informative

    1) It'll run on anything - Win32, Linux, BSD, Solaris, x86, XServers, Alphas, Power5
    2) It'll scale as big as you can dream - over 5 million accounts with clustering
    3) MAPI support

    1. Re:CommunigatePro from Stalker.com by Anonymous Coward · · Score: 0

      I have to definitely second this recommendation. It's late and I'm sick, so I won't go into details, but CommunigatePro would be the first place I would look for this kind of thing. Superb support, very competitive pricing and a product that is just rock solid. There's an article from a few years ago that compared various mail servers for different size installations and Stalker's won by a landslide.

    2. Re:CommunigatePro from Stalker.com by David+McBride · · Score: 1

      You may wish to reconsider that:

      "Flamers roast Stalkers for 'timebomb' shut-down"
      -- http://www.theregister.co.uk/2005/02/04/stalkers/

      Friend of mine at an ISP got bitten hard by this.

    3. Re:CommunigatePro from Stalker.com by msblack · · Score: 2, Informative

      You want to blame the makers of CommuniGate Pro for enforcing the terms of their license? I take it that you believe customers should be entitled to infinite upgrades at no charge. CGP users are always able to use the version of CGP they purchased or their last upgrade before the license expiration for as long as they please.

      This so called time bomb applies only to FORMER customers who upgrade without a current license. Sounds fair to me.

      --
      signature pending slashdot approval
  73. I'd check out Open-Xchange by thehunger · · Score: 1
    Obviously, check out open source options. For users that require collaboration features (ie. calendar, appointment scheduling, tasks etc) I'd give Open-Xchange a serious look.

    A scalable, open-source based email server particularly well suited if you have multiple domains etc. is Limacute, developed by Linpro, a Linux experts company in Norway. It is GPL and in use by at least one large mail-centric ISP.

    There's also the Hula Project. It is based on Novell's NetMail. Novell used to claim that a single server easily could handle 100.000 users. The Hula project is adding calendar and other features.

    1. Re:I'd check out Open-Xchange by Anonymous Coward · · Score: 0

      Take a look at @Mail - http://atmail.com/ a much more polished and professional WebMail interface over Hula/Open-exchange.

  74. I worked at a company that did this... by curunir · · Score: 1

    I worked at a company that hosted mail for other companies. We had POP/IMAP/Webmail plus a bunch of other services.

    Our secret to making it work? Qmail.

    For an installation like this, Maildir format seems like a must to me. Plus, almost all of the free webmail clients support or require it.

    Like others have said, you have to separate your storage, inbound mail, outbound mail and webmail services onto different hardware. Since we had many millions of mailboxes, we also had proxies in front of all of the client-facing servers to help minimize the impact of an individual server having issues. But these are all basic network design issues. The key is a secure, configurable MTA like Qmail that stores its mail in a friendly format that other apps can understand. De-couple everything and you should be able to scale up to AOL size, if need be.

    --
    "Don't blame me, I voted for Kodos!"
    1. Re:I worked at a company that did this... by TheBracket · · Score: 2, Informative
      I work with a similar setup, only rather than plain Qmail we use Qmail-LDAP. It works wonderfully, and has some nice (but really not amazing) clustering capabilities.


      We have account data stored in an LDAP store, mirrored to a second (read-only) store for redundancy/scaling when busy. LDAP scales wonderfully for read-heavy tasks such as this one.


      As has been mentioned separately, separating recipient (edge), storage, and outbound mail servers is really important. Our edge servers perform RBL checks, greylisting (on some domains that want it), SPF (ditto), reject various attachment types, perform a reverse-MX check to try to accept from valid addresses only, and perform a recipient address check to quickly reject incorrectly addressed messages. That cuts down 80% of incoming mail (with very few false positives). Mail is then forwarded to a second set of edge servers that run SpamAssassin (set to flag spam, not stop it) and ClamAV on attachments. Finally, it goes into the storage servers. POP3/IMAP/Webmail points at the mail directories on these servers. Our outgoing servers are quite a simple setup, with SMTP Auth (also hooked to LDAP). We also have a few listservs set up, but they are a side issue.
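
      For anyone who hasn't met greylisting, here is a rough sketch of the idea in Python. It is not the Qmail-LDAP implementation; the `Greylist` class, the 300-second delay, and the in-memory dictionary are all assumptions for illustration. The first delivery attempt from an unseen (client IP, sender, recipient) triplet gets a temporary failure, and a retry after the delay is accepted:

```python
import time


class Greylist:
    """Temporarily reject the first attempt from each unseen triplet."""

    def __init__(self, min_delay=300.0, clock=time.time):
        self.min_delay = min_delay   # seconds a sender must wait before retrying
        self.clock = clock           # injectable so the logic can be tested
        self.first_seen = {}         # triplet -> time of first attempt

    def check(self, client_ip, sender, recipient):
        key = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Record the first attempt; later attempts reuse the stored time.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "250 OK"
        return "451 4.7.1 Greylisted, please try again later"
```

      Legitimate mail servers queue and retry after a 4xx reply, while a lot of spamware never does, which is why the technique cuts volume. A production version would also need to expire old triplets and share state between edge servers.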


      Qmail is a bear to set up, and asking the author for advice is a good way to get flamed. Other than that, it works very well, we haven't had any security issues, and it's adequately fast - especially if you apply the "silly qmail todo" patch, fixing concurrency problems under high load. It's part of the Qmail-LDAP distribution (as is almost everything else I listed).


      For servers, we use FreeBSD. I'm sure other OSes would do a fine job, but FreeBSD has been rock solid for us.

      --
      Lead developer, http://wisptools.net
  75. 1m? by Jeffrey+Baker · · Score: 1

    A million used to be a lot of accounts, but now it really isn't. The real questions are: how many mails will be sent and received every day, how many during the peak minute of the day, and how much long-term storage is needed. At [name of company removed] which hosted zillions of email accounts, we found many unexpected problems with the storage, such as NetApp being unhappy with hundreds of billions of tiny files, Solaris NFS being unable to deal with filenames longer than 32 characters, and so forth. But handling the front-end tasks of SMTP, POP, and IMAP for 1m accounts wasn't so difficult.
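
    Those questions turn into simple arithmetic once you have measurements. A back-of-the-envelope sketch in Python; every input in the usage note below (messages per account, peak factor, average message size) is a made-up placeholder, not a number from any real system:

```python
def peak_load(accounts, msgs_per_account_per_day, peak_factor, avg_msg_kb):
    """Estimate daily volume, peak-minute rate, and daily stored data."""
    daily_msgs = accounts * msgs_per_account_per_day
    avg_per_minute = daily_msgs / (24 * 60)
    # peak_factor = how many times busier the peak minute is than average
    peak_per_minute = avg_per_minute * peak_factor
    daily_gb = daily_msgs * avg_msg_kb / (1024 * 1024)  # KB -> GB
    return {
        "daily_msgs": daily_msgs,
        "avg_per_minute": avg_per_minute,
        "peak_per_minute": peak_per_minute,
        "daily_gb": daily_gb,
    }
```

    For example, `peak_load(1_000_000, 20, 3.0, 50)` models a million accounts at 20 messages a day each, a peak minute three times the average, and 50 KB per message. Swap in figures from the existing Exchange logs before trusting any of it.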

    If I were in your position I'd just call up HP and outsource the whole thing. If it was truly necessary to keep it in-house, I'd probably throw together three separate beefy machines: one to deal with the IMAP/POP clients, one to deal with the inbound queue, and one to deal with the outbound queue. Probably qmail or any other standard mailer would work fine. For the storage, you could use a small SAN with GFS.

  76. Hire Matt Simerson, the creator of MailToaster by ChrisKnight · · Score: 3, Informative

    My number one suggestion is hire someone who has built scalable mail systems, and written tons of code to support them: Matt Simerson

    You can learn about him, and his mail projects at http://www.tnpi.biz/internet/mail/toaster.shtml

    -Chris Knight

    --
    -- This sig is only a test. If this were a real sig it would say something witty. --
  77. Openwave by Anonymous Coward · · Score: 0

    Openwave is definitely one way to go. Of course, I say this as someone who worked on the system, but nonetheless, it is designed to scale to millions of users across many hosts while still maintaining a single point of administrative interface. At the time I worked there, it was the ONLY mailserver that could scale that high OR offer a single interface for the entire cluster. That could be different now. Yes, it costs, but if you are supporting millions of users, the money is nothing compared to the costs of maintaining a cobbled together system. It is the email server used by many of the tier 1 ISP's and webmail systems.

  78. If you can't do your own job by Anonymous Coward · · Score: 0

    Please let your superiors know so that someone more qualified can be hired to take your place.

  79. I'd start with a professional... by Anonymous Coward · · Score: 0

    ...that is, someone who doesn't need to ask the motley masses of slashdat for advice.

    In other words, YOU'RE FIRED!

    however, kudos to your comp for dropping Crapland Brand Software (aka MS)

  80. 13 Million mail infrastructure by Anonymous Coward · · Score: 1, Informative

    I once installed a 13 Million account mail system on a Linux infrastructure. As far as I know, it is working nowadays (I left that company).

    The keys were:

      - qmail (but postfix will work better nowadays)
      - smtp (4 machines)
      - pop/imap (4 machines)
      - separate webmail: 1 is enough (2 for high availability)
      - NDS (Netscape Directory Server) which is now owned by RedHat and opensourced.

    Hope that helped.

    1. Re:13 Million mail infrastructure by benjamindees · · Score: 1

      NDS (Netscape Directory Server) which is now owned by RedHat and opensourced.

      That's a good point. Authentication will quickly become a huge factor in the success and ease of your project. Build your system around the authentication infrastructure.

      --
      "I assumed blithely that there were no elves out there in the darkness"
    2. Re:13 Million mail infrastructure by HermanAB · · Score: 1

      Hmm, only 10 machines to handle the mail, but you forgot about the 50 machines running Spam Assassin and ClamAV...

      --
      Oh well, what the hell...
    3. Re:13 Million mail infrastructure by louissypher · · Score: 2, Insightful

      I built and admin mail for around 100k users. There is no f'ing way that you can run 13 million accounts on 10 machines. One webmail server for 13 million people?

      --
      www.bleepyou.com
  81. Scalable e-mail systems? by shub · · Score: 3, Informative
    Try Googling for "Scalable E-mail Systems" and "Scalable IMAP services". Of course, I'm biased since most of the top hits are from the slides from the presentations that I've done at LISA 2000, LISA 2002, etc....

    My slides relevant to this discussion can be found at http://www.shub-internet.org/brad/papers/dihses/ and http://www.shub-internet.org/brad/papers/sistpni/.

    And yes, Nick Christenson has been a long-time friend and co-author of mine.

    Feel free to contact me directly if you want some referrals.

    --
    Brad Knowles
    http://daily.daemonnews.org/ -- if you're not
  82. Re:qmail by kerrle · · Score: 1

    Agreed.

    QMail should be able to handle it fine, though he should of course expect to have the load distributed to quite a few machines.

    I'd probably also set up a sizeable group of mail gateways on incoming mail, to filter the mass amounts of spam and viruses that a million email addresses are going to bring.

  83. AOL by MoogMan · · Score: 1

    Ahh, you need AOL Mail. They have over 1 million [l]users :-P

    1. Re:AOL by Jay+L · · Score: 1

      Yep, and it handles about 4,000 messages per second (not counting all the spam that gets filtered)... :) But it's not available, not standards-compliant, and had to be designed to support a lot of special AOL features that don't exist in the rest of the world, and that make a distributed mail system hellish: unsend, check status, instant validation instead of bounces, etc.

      But one thing we did learn: If you've got a high volume of messages, managing outbound queues is going to be a full time job for you. Mail will back up in your queue for downed sites, and that slows down sendmail, slowing down the rest of your outbound mail in a vicious circle. Newer versions of sendmail let you partition by outbound host; you're going to want to use that.
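
      The idea behind partitioning by outbound host can be sketched in a few lines of Python. This is an in-memory toy, not sendmail configuration, and the `OutboundQueues` class and its method names are invented for the example:

```python
from collections import defaultdict, deque


class OutboundQueues:
    """One FIFO queue per destination domain, so a dead site blocks only itself."""

    def __init__(self):
        self.queues = defaultdict(deque)
        self.down = set()

    def enqueue(self, recipient, message):
        domain = recipient.rsplit("@", 1)[-1].lower()
        self.queues[domain].append((recipient, message))

    def mark_down(self, domain):
        self.down.add(domain.lower())

    def mark_up(self, domain):
        self.down.discard(domain.lower())

    def next_batch(self):
        """Take one message from every queue whose destination is reachable."""
        batch = []
        for domain, queue in self.queues.items():
            if domain in self.down or not queue:
                continue
            batch.append(queue.popleft())
        return batch
```

      Mail for a downed site stays parked in its own queue and gets retried on its own schedule, while delivery to every other destination carries on at full speed.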

      Other than that, make sure you use a file system that can handle lots of inodes without slowing down logarithmically.

  84. Simple. by Anonymous Coward · · Score: 0

    Sun Microsystems JES (Messaging, LDAP, Calendar).

    Problem solved.

    No, not free. No, not open sourced. Great performance, full, robust, integrated enterprise level systems that can handle 1 million accounts like cake (I've dealt with JES/iPlanet deployments in the tens of millions of users).

  85. Dear Slashdot by hobotron · · Score: 0, Flamebait



    How do I do my job again?

    Thanks
    -Hobotron

    --
    There is truth in humor.
  86. Google services by naoursla · · Score: 2, Funny

    I bet Google would be willing to sell you a solution.

    1. Re:Google services by MP3Chuck · · Score: 1

      You're modded funny, but it'd be smart of them, no? A searchable, scalable email solution. POP3 support, filtering, etc... I dunno, maybe I'm underestimating corporate email needs but it'd be pretty slick to be able to buy a few Gmail rack boxes, put them on the network, and have them Just Work.

    2. Re:Google services by naoursla · · Score: 1

      Exactly what I was thinking. It wasn't intended to be funny. They already sell search appliances. I think a plug and go email server appliance would be a natural next step and I really would be surprised if it isn't already in the works. It should also allow you to add another box for more capacity and do redundant backups and failovers and everything automagically.

      Heck, maybe I should make this if they aren't offering it.

  87. Start with universities by dubl-u · · Score: 2, Insightful

    I'd start seeing what universities near you use. They won't be as big, but a large school should have circa 100k accounts and a lot of the same issues you'll face. They may already describe their infrastructure somewhere on the web. And offering to take two or three of the mail guys out to lunch or dinner will get you a ton of the nitty-gritty details and smart questions to ask yourself (and vendors).

    Then once you think you have a solution, budget plenty of time for extensive testing against simulated load. Make sure you simulate failures by, e.g., pulling plugs randomly. Buy the hardware and software *after* you're 100% sure it works, not before. And where possible, roll your solution out gradually, so that small problems don't turn into MCFs.

  88. IBM Z990 by 1c3mAn · · Score: 2, Informative

    Contact IBM. A mainframe running z/VM is your solution here.

    99.9% reliability is more than normal for those machines. It is modular enough to expand to whatever you may need in the future, and it has the data-processing horsepower to actually handle the 20k or so concurrent users at a time, and the hard drive space to match that many users as well.

    Run linux or unix on top of VM and you should be fine.

    Product Page for Z990:
    http://www-03.ibm.com/servers/eserver/zseries/z990/

    1. Re:IBM Z990 by kcbrown · · Score: 1
      He needs to nail down his requirements a bit more, but for his situation, it sounds like a mainframe solution would make a LOT of sense:

      • It can handle the sheer amount of data he'll need in one place (a million users' email, at a measly 10 megabytes of storage each, comes to 10 terabytes of storage, and he should probably regard that as a minimum)
      • It can handle the data moving capability he needs (out of a million users, he should probably count on spikes of 1% or more of them hitting the system at a time -- so 10,000 users or more at a time)
      • If he wants 99.9% uptime for the service, then he'll need something like 4 times (I'd shoot for an order of magnitude myself) better uptime out of the OS and hardware, so now we're talking something like 99.99% uptime from the OS and hardware. That's 52 minutes a year of downtime for the OS and hardware. A mainframe should be able to do that without breaking a sweat -- they typically have uptimes of years, from what I've been told.
      • It gives you centralized administration and control, which includes things like doing backups.
      • Because it's a highly reliable machine in a central location, reliability now becomes primarily an issue of network reliability. With a distributed solution, you have to worry about network reliability and the total reliability of all nodes put together -- the reliability of the multinode solution isn't likely to be as good, depending on your network topology.
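
      The figures in the list above are easy to sanity-check. A small Python sketch, assuming decimal units (1 TB = 1,000,000 MB) and a 365-day year; the function names are just for illustration:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def storage_tb(users, mb_per_user):
    """Total mailbox storage in decimal terabytes."""
    return users * mb_per_user / 1_000_000


def concurrent_users(users, fraction):
    """Users hitting the system at once for a given fraction of the base."""
    return round(users * fraction)


def downtime_minutes_per_year(uptime_percent):
    """Allowed downtime per year for a given uptime percentage."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


if __name__ == "__main__":
    print(storage_tb(1_000_000, 10), "TB")
    print(concurrent_users(1_000_000, 0.01), "concurrent users")
    print(downtime_minutes_per_year(99.9), "minutes/year at 99.9%")
    print(downtime_minutes_per_year(99.99), "minutes/year at 99.99%")
```

      For comparison, the 99.9% service target itself allows roughly 526 minutes a year (about 8.8 hours), against about 52.6 minutes at 99.99%.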

      The downside, of course, might be the expense. I say "might" because it may end up being the cheapest solution after you factor in everything.

      So whether or not the mainframe is a good solution depends primarily on the company's network. If the company is spread all over the globe, then a distributed solution may make more sense. Setting up a widely-distributed solution, where the mailboxes themselves are distributed, so that any given user can get at his email from any location is likely to take some custom programming (the user database would, for instance, have to be replicated everywhere and would have to store the user's "home site" location so that the connector would know where to connect the IMAP request, for instance).
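
      A rough sketch of the "home site" lookup described above, in Python. The user table, site names, and hostnames are all hypothetical; a real deployment would read them from the replicated directory rather than from dictionaries:

```python
from typing import NamedTuple


class Backend(NamedTuple):
    host: str
    port: int


# Hypothetical replicated user database: username -> home site.
HOME_SITE = {"alice": "london", "bob": "tokyo"}

# Hypothetical map of sites to the IMAP servers holding their mailboxes.
SITE_BACKENDS = {
    "london": Backend("imap.london.example.com", 143),
    "tokyo": Backend("imap.tokyo.example.com", 143),
}


def route_imap_login(username: str) -> Backend:
    """Return the IMAP backend that holds this user's mailbox."""
    site = HOME_SITE.get(username.lower())
    if site is None:
        raise KeyError(f"unknown user: {username}")
    return SITE_BACKENDS[site]
```

      The connector at any location does this lookup at login time and then proxies the session to the returned backend, so the user never needs to know where the mailbox physically lives.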

      I will say this: a distributed solution is almost certainly going to require more in the way of administrative manpower to manage than a centralized solution (like a mainframe).

      This sounds like it would be a very interesting project to work on...

      --
      Use 'slashdot stuff' in the subject line in any email you send me if you want to get past the spam filter.
    2. Re:IBM Z990 by Anonymous Coward · · Score: 0

      The TPF Mail server (runs on IBM Mainframes) claims to be designed to support 250 Million mailboxes:

      http://search390.techtarget.com/tip/1,289483,sid10_gci802989,00.html

      Typical TPF availability is above 99.99%

    3. Re:IBM Z990 by Horus1664 · · Score: 1
      I certainly agree that a mainframe solution is required for this size of job and brings with it the reliability and operability necessary to provide decent service.

      The choice of OS should be Transaction Processing Facility (TPF) however, as this is much faster than VM, or any other mainframe OS.

      IBM have already proclaimed that a TPF system was capable of supporting at least 250,000,000 email addresses (approximately one for each US citizen) when they developed their mail server for TPF.

      People that understand TPF have been saying for some time that many heavy internet sites would probably be more manageable if they used TPF, which now supports Apache as well.

      For those unaware of TPF it is the software that has powered the huge airline computer systems for years.

      In fact the biggest handicap TPF has is IBM's internal politics that have historically relegated it to a poor relation behind CICS/IMS, which it outperforms as a TP system comfortably.

      For the curious: http://www-306.ibm.com/software/htp/tpf/

      or http://www.blackbeard.com/tpf

  89. Sun's Java Messaging Server (AKA Netscape/iPlanet) by Zocalo · · Score: 1
    Don't let the word "Java" put you off and click here. It definitely scales that high with ease, does all the mail transfer protocols that you require, has webmail plus it can interface to SMS, SpamAssassin and AV tools. Not to mention the rest of the Sun Java Enterprise System of course, especially the LDAP server which makes delegated account administration much easier. Most importantly for you it scales very well indeed and supports clustering which should help with your uptime requirements.

    That said, it's a beast of a system, not the easiest thing in the world to administrate by a long shot and Sun's commitment to further development seems a little "lacking" lately. It's also not especially cheap, but you should be able to negotiate some massive discounts on a deployment of that scale (well, what did you expect from Sun?). You should definitely also be thinking about getting a few people on Sun certification courses if you do go down that route.

    --
    UNIX? They're not even circumcised! Savages!
  90. Since you are coming from exchange... by jsnipy · · Score: 1

    Since you are coming from Exchange, is it just the email that you are replacing? Is the customer expecting calendars, pub folders, and all of the other knick-knacks in Exchange?

    --
    -- if you mod me down, I will become more powerful than you can possibly imagine
    1. Re:Since you are coming from exchange... by ArtStone · · Score: 1

      The most insightful post on this thread - once a company starts using Exchange, the people using the system will get used to things like expecting "Read receipts", integration with scheduling meetings and sending out the meeting invitations, emailing Office documents from inside Office... searching years worth of archived Exchange stored emails to say "See, I told you your idea wouldn't work"...

      I vote for hiring a person with experience in managing large Exchange installations and identifying and fixing the existing problems.

      --
      Final 2006 "Proof of Global Warming" US Hurricane Count -> 0
  91. qmail-ldap is best suited to this task by Lost+Found · · Score: 2, Interesting

    qmail-ldap is best suited to this task. Reasons:

    1. You can sleep at night knowing that you're running the only MTA in widespread deployment that has never once had its security compromised; in fact, qmail's author Dan Bernstein still offers cash to the first one to be successful...

    2. You can sleep at night knowing that the core MTA, qmail, has reliably handled some of the largest e-mail operations in the history of the internet. Its design is such that on a properly configured system, you'll never lose a single e-mail. Hotmail actually used qmail for a long time, even after Microsoft bought them - Microsoft repeatedly tried to replace it with Exchange, which kept buckling under the load.

    3. Qmail is very modular, allowing you to pick and choose your components wisely.

    4. Qmail uses the Maildir format its author pioneered. Maildir is NFS safe, not proprietary/complicated (often binary formats like PST are subject to corruption), etc.

    5. LDAP makes it easy to manage massive amounts of accounts.

    In any case... qmail-ldap is already running large sites with millions of users. Info:

    http://www.qmail-ldap.org/wiki/Documentation

    I've set one of these systems up on an IT cluster at my current office, and I must say that it is not only very robust but also really easy to manage.

    1. Re:qmail-ldap is best suited to this task by Anonymous Coward · · Score: 0

      And if anyone has doubts about if qmail is scalable - yahoo.com runs on it (no idea how many gazillion mailboxes are there)

  92. I run one capable of scaling that high by Anonymous Coward · · Score: 0

    We run 10 front end mx boxes that run postfix and deliver via lmtp to 10 lmtp servers that deliver the mail to netapps. lmtpd handles virus and spam filtering. Works like a charm. ipvsadm is a godsend.

  93. Plan. Test. Spec. Deploy. by MattW · · Score: 4, Informative

    (1) Plan a server setup which can handle the load. The requirements may change, but one million users is a fair bit. How many incoming and outgoing emails is that on average? Figure that out, using a network sniffer or sniffers on existing traffic if need be (although logs should work). Then use this to calculate the number of servers needed for an outgoing smtp farm and an incoming MX farm. Figure out how much storage space is to be provided per user, and then figure out how you want that storage space to be accessible. Probably your best bet is to have a round-robin DNS farm of imap/pop servers which proxy connections based on the user's login to a backend farm of actual mailservers responsible for storage. Plan the ability to move users from server to server to rebalance as needed. Outgoing smtp is a lot easier since you're not really storing things long term. Plan a web farm for webmail. (And pick software.) Don't forget to plan some sort of backup, and make sure your system is flexible as far as email retention; chances are the email retention policy will change at some point and your setup should be able to change with it.

    (2) Test. For each server, hammer it. Test its load under as close to real world circumstances as you can. Then create unreal punishing loads and see how it handles them. Plan in advance for how your server farm handles something like virus-generated mass emails causing 1000% spikes in load.

    (3) Using your testing results, spec out the actual hardware. RAID, cheap hardware, redundancy, etc. If you have control over the network choice, plan a location with multiple fiber trunks coming into the building and provider redundancy. Remember backhoes in concert? Don't get hit by that. Plan for server failures, drive failures, network failures, power failures, and security compromises.

    (4) Deploy! If you did the rest right, this is the easy part. You'll have redundant network connections, HSRP, redundant switches, a proxy farm, an imap/pop farm the proxies connect to, an smtp farm for outgoing emails, and a web server farm for serving up webmail (depending on how you choose to architect the disk space, the web farm and the pop/imap farm may be one and the same; depends on how you set things up.)
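
    The "calculate the number of servers" step in (1) is simple arithmetic once you have measured numbers. A minimal sketch, assuming you already know your peak messages per minute and have benchmarked what one box can sustain; the headroom factor, the spare count, and every figure in the usage note are illustrative assumptions, not measurements from any real deployment:

```python
import math


def servers_needed(peak_msgs_per_min, per_server_msgs_per_min,
                   headroom=2.0, spares=1):
    """Rough server count for one tier (e.g. the incoming MX farm).

    headroom and spares are hypothetical planning knobs: headroom scales
    the measured peak to absorb spikes, spares covers machine failures.
    """
    if per_server_msgs_per_min <= 0:
        raise ValueError("per-server capacity must be positive")
    # Scale the measured peak by the safety factor, then round up.
    base = math.ceil(peak_msgs_per_min * headroom / per_server_msgs_per_min)
    # Add spare machines so that losing one box does not eat the headroom.
    return base + spares
```

    For example, with a made-up peak of 20,000 messages per minute and boxes that benchmark at 5,000 each, servers_needed(20000, 5000) gives 9. Run it once per tier (incoming MX, outgoing smtp, imap/pop, web) with that tier's own numbers from step (2).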

    Here's a starter link to a setup which is smaller but, in principle, fairly similar:

    http://www.itd.umich.edu/umce/features/2004/cyrus.html

    Finally, if you don't want to screw it up, ask someone who has done it before. Paying someone $300/hr for a 10-30 hour review of your plan is dirt cheap compared to horking the setup. Someone who has worked in huge email environments (a la, hotmail) could show you gotchas before they bite you. (If you need help figuring out who to ask, I could even point you to some of the appropriate people)

    1. Re:Plan. Test. Spec. Deploy. by More+Trouble · · Score: 1
      Here's a starter link to a setup which is smaller but, in principle, fairly similar:

      http://www.itd.umich.edu/umce/features/2004/cyrus.html

      Finally, if you don't want to screw it up, ask someone who has done it before. Paying someone $300/hr for a 10-30 hour review of your plan is dirt cheap compared to horking the setup.

      FYI, I am the architect of the above deployment (thanks for the props). I am 100% in agreement about the 10-30 hour review idea, $/hr negotiable. :)

      :w
    2. Re:Plan. Test. Spec. Deploy. by MattW · · Score: 1

      Haha. On any given day, on Slashdot, anyone is watching, eh?

      I was actually thinking of a certain hotmail engineer, or another friend developing a commercial antispam toaster based on research one webmail provider had with a hundred-million-per-day-plus spam problem. Both have the mail chops to probably independently benchmark the services on proposed hardware, catch gotchas, and okay a design in a 30 hr timeframe.

      But hey, how did the umich upgrade turn out?

    3. Re:Plan. Test. Spec. Deploy. by More+Trouble · · Score: 1

      But hey, how did the umich upgrade turn out?

      Peachie! Cooking like gas. Around 70,000 accounts, all in heavy use with no quotas.

      :w

  94. No Homework Assignments Allowed by Sin(O)+Cos(O) · · Score: 1

    Slashdot should not be used to solve your homework assignments!

  95. YIKES! Tossing out the groupware?! by Dark+Coder · · Score: 4, Informative

    Gee whiz... I'm surprised that the groupware is getting tossed out. If as few as 20% of the users are accustomed to Outlook Calendaring, they'll represent 95% of the complaints in a new system. An advance warning to all existing accounts should be mailed out (both paper and email) so that nothing falls through the cracks.

    Now to the mega-infrastructure that I set up for an undisclosed company for under 50K (and also didn't want groupware).

    1. Transport Sender (sendmail). That's right! Good ol' plain sendmail scales. It does require some pretty savvy tweaking, so get a Sendmail.Com consultant onboard just for this. Use SleepyCat DB for speed for all sendmail setups. For one million, I had about 23,000 transactions per minute during the day. You'll require 10 servers for this for cushion (against some idiots sending an ISO attachment).

    2. Payload receiver (sendmail). A second group of machines to handle the reception of SMTP payloads.

    3. IMAP4S/POP3S - Hey what's with the "S"? Nothing like sending your user's password in the clear. Unless you enforce VLAN in your corporate environment and limit all IMAP4/POP3 to VLAN, the "S" is a mandatory security feature, inside and outside. Guess what "S" stands for?

    4. Webmail - SquirrelMail - Yet another dedicated server (to which I had to add two more load-balanced servers to handle the growing pains). Use https for login only.

    5. AntiVirus (ClamAV) - It was the best back then, now it's just running in the middle of the pack. sendmail has milter, which allows extensions such as MIMEDefang, wilter, rureal (reverse-DNS check), SpamAssassin, and SPF.

    6. Support - Half the effort is put into those webpages that would 'hand-hold' these newbies into reconfiguring their machines. Worth the effort if you have over 20 expert PC users that can do their boxen. Otherwise do it yourself at each PC. These pages should cover Thunderbird, Evolution, as well as Outlook and Outlook Express.

    7. Learn to spin 11 plates, one on each pole. Keep them spinning... If they start to drop and break, bring in some more Unix dudes.

    1. Re:YIKES! Tossing out the groupware?! by whoever57 · · Score: 1
      I would add Perdition to the list of tools. Perdition can be used to distribute POP/IMAP connections across a bunch of servers, while providing a single point of entry (single machine which proxies the incoming connections).

      Using Perdition, one can send the actual POP/IMAP session to a specific machine using regex type matches on the username. The Perdition server(s) require little processing power per connection.
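
      To make the regex idea concrete, here is a minimal sketch of username-to-backend routing. This is not Perdition's configuration format or API; the patterns, the hostnames, and the fallback are all hypothetical and only illustrate the lookup a proxy would do before handing off the session:

```python
import re

# Hypothetical routing table: the first matching pattern wins.
ROUTES = [
    (re.compile(r"^[a-m]", re.IGNORECASE), "imap-1.example.com"),
    (re.compile(r"^[n-z]", re.IGNORECASE), "imap-2.example.com"),
]
# Hypothetical catch-all for usernames that match no pattern.
DEFAULT_BACKEND = "imap-3.example.com"


def backend_for(username):
    """Return the backend host that should serve this user's session."""
    for pattern, host in ROUTES:
        if pattern.search(username):
            return host
    return DEFAULT_BACKEND
```

      The lookup itself is cheap, which fits the point about low processing power per connection: the proxy only picks a host and relays the traffic.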

      --
      The real "Libtards" are the Libertarians!
    2. Re:YIKES! Tossing out the groupware?! by Anonymous Coward · · Score: 0

      As you can see, setting up MTAs is a task you definitely want to seek experienced, professional help in doing, whichever MTA you use.

      For the web-based mail solution, I think you can't go past http://www.openwebmail.org/. It's feature rich and flexible.

    3. Re:YIKES! Tossing out the groupware?! by Stinking+Pig · · Score: 1

      Yeah, throwing out the calendar is pretty unusual and makes me think that the OP's boss isn't high enough up the ladder to be making the decision (or is so high up the ladder that they have no clue what makes the company go).

      I've been looking for a free calendar that doesn't suck for a long, long time, and it ain't here yet. A mod_webdav server hosting .ics files accessed by Apple iCal and Mozilla Sunbird is about as good as it gets, but iCal's UI sucks and Sunbird occasionally gets confused and loses all the data.

      Double-check on that calendaring requirement, then look into commercial solutions when they say that they didn't mean it and they really did want groupware.

      --
      "Nothing was broken, and it's been fixed." -- Jon Carroll
    4. Re:YIKES! Tossing out the groupware?! by Craig+Ringer · · Score: 1

      What POP3/IMAP server did you use? Personally, I've been using, and have been very happy with, Cyrus IMAPd.

      Re IMAPs and POP3s, I agree that encrypted mail is crucial. You should offer SMTPs too. All the SSL-wrapped protocols are, however, somewhat obsolete. They're needed for older clients, but where client support exists the STARTTLS variants should be preferred. In particular, the SSL-wrapped protocols are impractical for virtual domains because the server has to pick and send out a certificate before it has any way of knowing what hostname the client thinks it has.

      You should also offer standard IMAP, POP, and SMTP but enforce STARTTLS, so the server won't let the user authenticate without first negotiating SSL. The advantage here is primarily virtual domain support, but you can also cover more clients if you support both variants of each protocol (IMAPs and IMAP+TLS, etc).

      If it's for corporate mail, I strongly suggest requiring the submission of a client certificate signed by your company's CA when connecting from the outside world. You can make your own CA and use it on your network without problems (just install the CA cert in the client by bundling it in the PKCS12 file with the user's client cert).

      I couldn't agree more about the importance of mail filters like amavisd-new or MIMEDefang to apply ClamAV and SpamAssassin checks. It's just too bad you'll need a stack of quad Opterons with appalling quantities of RAM to handle the load :S

    5. Re:YIKES! Tossing out the groupware?! by Nailer · · Score: 1

      1. Transport Sender (sendmail). That's right! Good ol' plain sendmail scales.

      From the sound of your mail, you have more experience in large mail systems than I do. So you may know something I don't here. But I'd like to point out that Postfix scales too (primary MTA at AOL) and so does Exim (primary MTA at Google). The latter two MTAs also don't install transports for rarely used protocols waiting around for someone to exploit them, and reduce the chance of misconfiguration by shipping a standard Unix human readable config file.

    6. Re:YIKES! Tossing out the groupware?! by Dark+Coder · · Score: 1

      I can't agree with you more, but at the time a decision was made to sweep in a guru Sendmail consultant, but only after a carefully detailed RFP had been sent to him.

      I knew I could handle sendmail.conf, but given the time and money, it was cheaper to contract that one out.

      Google/Yahoo wasn't in full force at the time the primary MTA decision was made, so it was extremely difficult to compare MTAs without actually deploying selected candidates (no time).

  96. Cyrus. by JadeSky · · Score: 1

    Postfix + Cyrus IMAP and Cyrus POP3. Separate your systems out (MTA v. Final Delivery). Use Cyrus Murder (as in a murder of ravens, or a cluster for us normal people.)

    Back it up with LDAP for all the joyful goodness it bears (authentication, address books, etc.) If you want stronger authentication, add in Kerberos V.

    I'm sure others will suggest all-in-one packages, but most of the ones I have seen are really some combination of Postfix or Sendmail combined with OpenLDAP and Cyrus, anyway.

    Take your time to think about load-balancing, storage, and test, test, test!

    It'll be a couple weeks of work (assuming you already have hardware and networking and storage gear), but you'll likely end up with a bulletproof mail system.

    --
    I used to think printing on Unix sucked. Then I figured it out. Printing on Unix *does* suck. Like a Kirby.
    1. Re:Cyrus. by Anonymous Coward · · Score: 0

      there was a post today on the cyrus mailing list about someone who was just about to implement a system with 750K accounts after extensive testing, so it's definitely in the ballpark for what you are looking for.

      cyrus is extremely scalable, but it doesn't implement mirroring itself, you need to choose a storage solution for it that will do that for you

  97. Re:Obviously - use zombie machines by Anonymous Coward · · Score: 0

    use zombie machines

  98. Re:Here's my plan and it's the best one you'll get by Reality+Master+101 · · Score: 4, Insightful
    So how will people get all their mail rather than a twentieth of it? Easy, you set up a round robin DNS on mail.DOMAIN.com.

    This is the best advice he'll get? Sheesh.

    Think this through -- a lot of e-mail programs check every 20 minutes. Assuming I actually hit any without duplications, I could potentially need 400 minutes or over six hours to get all my mail. Since it's random, it could take days.
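
    To put a number on "it could take days": if each check lands on one of the 20 servers uniformly at random (my assumption about how that round robin would behave from a single client's point of view), this is the classic coupon-collector problem. A quick sketch of the arithmetic:

```python
from fractions import Fraction


def expected_checks(servers):
    """Expected number of random checks needed to hit every server once.

    Coupon-collector result: n * (1 + 1/2 + ... + 1/n), assuming each
    check picks a server uniformly and independently at random.
    """
    harmonic = sum(Fraction(1, k) for k in range(1, servers + 1))
    return float(servers * harmonic)


def expected_hours(servers, minutes_between_checks):
    """Expected wall-clock time to have visited all servers, in hours."""
    return expected_checks(servers) * minutes_between_checks / 60
```

    With 20 servers and a check every 20 minutes that works out to roughly 72 checks, or about 24 hours on average, versus the 400-minute best case. And that is only the average; an unlucky user waits longer.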

    And that's just for starters with this lame scheme. If I want to check mail, say, from the field on a dial-up once a day... hopefully you can see how badly this would suck.

    What the guy should do is buy an e-mail system that can handle 1,000,000 users and not screw around trying to chewing-gum together his own solution.

    --
    Sometimes it's best to just let stupid people be stupid.
  99. Only 99.9% uptime? by Radak · · Score: 2, Insightful

    If my email system designer were satisfied with almost nine hours of downtime per year, I'd find a new designer.

    1. Re:Only 99.9% uptime? by Trailer+Trash · · Score: 1

      Especially when that represents 9,000,000 man hours without email.

    2. Re:Only 99.9% uptime? by mshiltonj · · Score: 1

      If my email system designer were satisfied with almost nine hours of downtime per year, I'd find a new designer.

      If 99.9% uptime costs X, and 99.99% uptime costs X x 2.5 (according to an earlier post) and 99.999% uptime costs god knows how much more -- at what point do you reach a point of diminishing returns?

      Assuming that 100% is not achievable, how does one determine what degree of availability is "good enough"? Alternatively, why are you so certain that 99.9% is *not* good enough?
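
      For anyone weighing those tiers, the downtime each one allows is easy to work out. A small sketch, assuming a 365-day year and, for the user-hours figure, that an outage hits every account at once (both are simplifying assumptions):

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability_percent):
    """Hours of downtime per year allowed by an availability target."""
    return HOURS_PER_YEAR * (100 - availability_percent) / 100


def lost_user_hours(availability_percent, users):
    """Worst-case user-hours without mail if every outage hits all users."""
    return downtime_hours_per_year(availability_percent) * users
```

      99.9% allows about 8.76 hours a year, 99.99% about 53 minutes, and 99.999% a bit over 5 minutes. At one million accounts, 99.9% is roughly 8.76 million user-hours in the worst case, which is where the "nine hours" and "9,000,000 man hours" figures above come from.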

    3. Re:Only 99.9% uptime? by sharkey · · Score: 1

      Well, since he is moving from Exchange, 9 hours/year of downtime may seem like a golden dream. Hell, one on-site visit from a DELL Gold Support tech will likely run you past 9 hours.

      --
      "Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
    4. Re:Only 99.9% uptime? by Anonymous Coward · · Score: 0

      gimmie a break, your email goes down for 9 hours christmas night, and you change providers.... riiight. 99.9% is great.

    5. Re:Only 99.9% uptime? by dtfinch · · Score: 1

      99.9% is better than a lot of ISPs I know.

  100. Its not that bad. by Anonymous Coward · · Score: 0

    You get to spend hours each day playing FreeCell while you are waiting for Lotus to open a single email. I've gotten quite good.

    1. Re:Its not that bad. by Anonymous Coward · · Score: 0

      You must have a terribly slow or over worked server. In this case you can just replicate your mail to your local hard drive in the background. You will then be able to open your email in no time... so long as your workstation is not 5 years old.

  101. Re:Here's my plan and it's the best one you'll get by Anonymous Coward · · Score: 0

    You're an idiot.

  102. contract! by gnixdep · · Score: 1

    Hire someone who has done something along that scale
    I hear poaching MS staff is all the rage these days.

  103. fusemail? by Wabbit+Wabbit · · Score: 1

    What about fusemail?

    This is exactly the kind of service they offer.

    Anyone have experience with them?

    --
    Nothing is inexplicable; only unexplained -Tom Baker, Doctor Who
  104. Re:Here's my plan and it's the best one you'll get by techno-vampire · · Score: 1
    Easy, you set up a round robin DNS on mail.DOMAIN.com. This way whenever a user checks their mail, they'll randomly end up on a different mail server, therefore collecting more of their mail.

    Earthlink uses that in part, or did when I worked there. However, instead of getting a random chunk of your email, all the servers connect to the same file servers so that you get all your mail no matter what machine you hook into. Done right, it's just as fast and you don't have to worry about missing a time-sensitive email because you never logged into the one and only server it's on.

    --
    Good, inexpensive web hosting
  105. Novell Groupwise or Lotus Notes by trboyden · · Score: 2, Informative

    Chances are you're not going to be just turning off those Exchange servers, you're going to need to migrate the data. That being the case, you're going to want something with good migration tools that can handle that much migration in a relatively short amount of time. I just completed an Exchange to Groupwise migration and there are some really great migration tools out there for it. Groupwise also meets all your requirements out of the box. Not to mention by buying Novell you're (at least indirectly) supporting open source. I'm not as sure about Lotus Notes, but regardless, if you're going to have that many users, you want big name vendor support.

    1. Re:Novell Groupwise or Lotus Notes by nuckin+futs · · Score: 1

      i find the groupwise client very unintuitive. it's the most confusing email/calendar/etc. client i've ever used.

    2. Re:Novell Groupwise or Lotus Notes by M3number3 · · Score: 1

      Have you seen the latest release? Not bad at all.

    3. Re:Novell Groupwise or Lotus Notes by nuckin+futs · · Score: 1

      what's the latest? we've been using 6.5 and it's awful, IMHO.

  106. Ballmer will not accept this! by Anonymous Coward · · Score: 0

    At best, you have to consider how your face will look with a chair shaped dent in it. At worst, he may bury you, then start throwing chairs at your grave. Either way, you should probably just stick with his shitty Exchange until he requires you to replace it with MMail, just like GMail, but ... better, more ... expensive.

  107. Architecture? by wulfbyte · · Score: 1

    1 million accounts doesn't really begin to explain what kind of solution you need. How many data centers do you have, what is your existing infrastructure, what kind of support will you be expected to provide to end users and to administrators, do you have special security needs to address (HIPAA, etc.), and so on.

    You really need to determine what your needs are for the immediate future, then figure in growth, before you can start thinking about which solution or set of solutions (more likely I would think) is appropriate.

    I could of course go into far greater detail, but then I would be looking for a piece of the action...

    Good luck on finding what you need.

  108. Groupwise doesn't^H^H^H suck by Anonymous Coward · · Score: 0

    You misspelled it. The correct phrase is "Groupwise does suck." Majorly. At work, the FLAIM databases are always corrupt and in need of repair; the client is slower than molasses; the server is always crashing and must continually be rebooted. Oh, I'm sure Exchange is probably worse, but don't fall for Novell's worthless claims that Groupwise is a capable mail-handling platform!

    1. Re:Groupwise doesn't^H^H^H suck by hb253 · · Score: 1

      Then you have monkeys for admins. My colleague and I administer the NY portion (about 5000 users) of our global mail system. Server software upgrades take all of 5 minutes. Servers never crash and performance is zippy. As with most products you have to know what you're doing.

      --
      Self awareness - try it!
  109. www.stalker.com by not_sleepy · · Score: 1
    I have used Communigate Pro with great success but not on your scale. They have a great forum at www.stalker.com; I would search there for how large some of the sites are and ask the question if you don't find an answer. Last time I looked, it worked on Windows, Macs, Solaris and Linux - probably more. There are a number of add-ons to handle groupware, virus protection, etc... I never had to use their support, so I can't tell you about that (but, since I didn't have to use them that should tell you something!)

    I was told by a friend of a friend that this is what a lot of ISPs use, so we gave it a try. We downloaded the fully functional demo, installed, ported users over and started using it in about 30 minutes. Spent another couple hours customizing the web front end (which I'm sure could have been done much faster by anyone with a little graphics talent).

    Good luck, don't forget to reply yourself on what you choose.

  110. You should be fired. by gatkinso · · Score: 1

    I am 100% serious.

    --
    I am very small, utmostly microscopic.
  111. Split up the tasks by Precision · · Score: 1

    Split everything.

          - Incoming MX's (exim)
              - Spam checking
              - sender verification
              - greylisting
              - route mail to the IMAP stores

          - IMAP stores (exim plus maildir + dovecot)
              - break out people on different servers (a = imap-1, b = imap-2, etc..) trivial to do w/ exim and dovecot, also break out the maildirs by letters /var/maildir/f/foo/)

          - LDAP/Mysql
              - some kind of directory to store username, passwords, which imap store, etc.. on.

          - outgoing MX's (postfix)
              - postfix queue handling can't be beat
              - smtp auth for users outgoing mail.

          - IMAP proxy's (perdition)
              - http://www.vergenet.net/linux/perdition/
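
    The letter split for the IMAP stores can be sketched in a few lines. This only illustrates the naming scheme (a = imap-1, b = imap-2, and /var/maildir/f/foo/), not exim or dovecot configuration; the "imap-other" bucket for names that don't start with a letter is my own assumption:

```python
import string


def store_for(username):
    """Map a username to its IMAP store by first letter (a -> imap-1)."""
    if not username:
        raise ValueError("empty username")
    first = username[0].lower()
    if first in string.ascii_lowercase:
        return "imap-%d" % (string.ascii_lowercase.index(first) + 1)
    # Hypothetical bucket for usernames that start with a digit or symbol.
    return "imap-other"


def maildir_for(username, root="/var/maildir"):
    """Build the per-letter maildir path, e.g. /var/maildir/f/foo/."""
    if not username:
        raise ValueError("empty username")
    name = username.lower()
    return "%s/%s/%s/" % (root, name[0], name)
```

    A plain first-letter split will be lopsided (far more users start with "s" than "x"), so in practice you would probably keep the user-to-store mapping in the LDAP/MySQL directory and treat the letter only as a default.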

    --
    - U
  112. POstfix + Mysql by bubulubugoth · · Score: 1

    Look at postfix + mysql
    http://www.sweeney.demon.co.uk/pfix_imap_virtual.html

    Mostly, U will need a cluster for everything.
    If you are looking for an all-around open source solution, start with this link; later, to use LVS, the tool for making load balancing clusters, go here:
    http://www.linuxvirtualserver.org/

    And if you really are looking for open source with cheap software costs (not very cheap TCO), you can also build your OWN SAN with ATA over Ethernet:
    http://sourceforge.net/projects/aoetools/

    And for webmail, a useful but also light interface:
    http://www.squirrelmail.org/

    With all the licence cost savings, you can Invest a lot of time, and have a fair amount of flexibility.

    Sendmail inc, has high availability solutions:
    www.sendmail.com

    Also, you can spend a lot of money and buy a very big IBM machine with lots and lots of Lotus Notes licenses; with that kind of money spent, you can put IBM at your knees if a lawyer makes a good contract..

    Also, to complete the solution you can setup nagios and mrtg for monitoring.
    http://www.nagios.org/
    http://people.ee.ethz.ch/~oetiker/webtools/mrtg/

    I think, to set up the whole thing, U will need about 50 good servers (maybe u can try IBM OpenPower with virtualization, it IS a RISC CPU), and, hmm, a month of technical tests...

    The mysql backend will give you centralized administration, LVS will provide scalability and good servers will give you uptime...

    And if you like, you can even make Linux routers using Sangoma hardware:
    http://wwww.sangoma.com/

    Everything can be done with Linux by now... The question is how much responsibility you want to have regarding the stability and overall functionality of the solution.

    IBM, HP, RedHat, SuSE, and ANY Linux consulting firm would be interested in having you as a success story.

    Good Luck, and May the Source be With You

  113. Only one thing... by dtdns · · Score: 1

    If it's a large company, they have money. If they want software with a corporate vendor behind it, look no further than Vircom's ModusMail software. It can authenticate against a wide variety of sources (AD, SQL, Radius, LDAP, etc.). The user mailboxes can be stored on a SAN array to deploy multiple front-end servers to increase uptime. Supports POP/IMAP/webmail, etc. EXCELLENT spam and virus filtering built in with automatic updates every 15 minutes. Admin is Windows GUI app or web-based. Cost is higher than you would pay for a FOSS solution, but uptime and ease of management make it a great option to look at. I don't work for Vircom, but I am a satisfied customer. See: http://www.vircom.com/

  114. Ironmail by feydrus · · Score: 1

    It's used by many of the largest and most successful companies across the world. With good reason - it works!

    Now keep in mind this is a gateway solution: It screens incoming and outgoing mail for spam and viruses plus a whole lot more.

    You still need a solution for storing the mail for 1 million accounts, but Ironmail will interface with any of them.

    www.ciphertrust.com

  115. Hire Chertoff & Brown! by jpellino · · Score: 1

    They'll be looking for jobs soon enough, and they're as qualified for this as they are for their current jobs.

    (he opined with karma to burn...)

    --
    "Win treats sysadmins better than users. Mac treats users better than sysadmins. Linux treats everyone like sysadmins."
  116. postfix and dovecot by Geekboy(Wizard) · · Score: 1

    postfix and dovecot-1.0alpha (the alpha label is a misnomer, it's been very stable for me for a long time).

    make sure you have high quality hardware, and don't forget to secure the system. *make* everyone use tls/ssl and smtp-auth. have your users use port 587 (and 465 for those that need it), so you don't get blocked by all of the port 25 blocks (which exist just about everywhere).
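
    From the client side, "tls/ssl and smtp-auth on port 587" looks roughly like this. A minimal sketch using Python's standard smtplib; the host, user, and password are whatever your setup uses, and the smtp_factory parameter is only there so the function can be exercised without a live server (my addition, not part of any mail client):

```python
import smtplib
import ssl


def submit(msg, host, user, password, port=587, smtp_factory=smtplib.SMTP):
    """Send msg via the submission port, refusing to log in without TLS."""
    conn = smtp_factory(host, port)
    try:
        conn.ehlo()
        if not conn.has_extn("starttls"):
            # Never send credentials over a plaintext connection.
            raise RuntimeError("server does not offer STARTTLS")
        conn.starttls(context=ssl.create_default_context())
        conn.ehlo()  # re-identify over the encrypted channel
        conn.login(user, password)
        conn.send_message(msg)
    finally:
        conn.quit()
```

    The important part is the order: STARTTLS is negotiated before login, so the password never crosses the wire in the clear. The server should enforce the same thing on its end rather than trusting clients to behave.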

  117. Re:Here's my plan and it's the best one you'll get by HansF · · Score: 1

    I don't quite follow the round robin solution.
    Wouldn't that mean that I'd have to press send/receive 20 times and there could still be a chance that some of my mail is left on a missed server?

    --
    --> Insert Funny Sig Here
  118. Blackberry? by koa · · Score: 1

    Does anyone have any ideas on what one would do if they had users who depend on a Blackberry? I'm sure that if you have that many users it is quite possible that some of them already rely pretty heavily on them.

    AFAIK (I could be wrong) there doesn't seem to be any effort by Research In Motion to include sendmail (or equiv.) support for their Enterprise Server product.. Not to mention real-time calendaring and contacts synchronization..

    --
    ....move along....nothing to see here....
  119. Segregate and think about the IO's by silas_moeckel · · Score: 1

    OK, having done this before, there are a few tips. Separate everything: inbound MTAs, outbound MTAs, web, IMAP, datastores and filtering. To scale, you need servers with very different requirements at every stage. Glue everything together with LDAP or SQL; just remember it needs to be dynamic. Assume that you need to seamlessly move users from one data store to another on a regular basis and expand partitions on the fly (LVM is your friend here). The ability to alter the data flow on the fly is a must to perform maintenance etc. SANs can be your friend here, as they make moving data around easy, but they can also up the cost (10k gets you a nice server with 4-5TB of disk; 10k doesn't buy squat for SAN gear). If I could do it all again, I would look at iSCSI rather than IMAP/POP proxies/NFS, along with a cluster FS.

    As to the software side, go with what you know unless it's just incapable of doing the job. I like sendmail, the next guy likes qmail (programmers like it, lots of easy SQL hooks), but overall having the techs be knowledgeable in it matters the most. For some cool hardware bits that can speed things up, look at solid state (RAM, not flash) disks for temp spools, as just about everything besides MS respects the committed-to-disk requirement for SMTP and that's a big performance issue.
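
    The "glue it together with LDAP or SQL and keep it dynamic" point boils down to one lookup table that every tier consults. A minimal sketch using an in-memory SQLite database as a stand-in for the real directory; the table layout and store names are hypothetical, and actually copying the mailbox data between stores is out of scope here:

```python
import sqlite3


def open_directory():
    """Create an in-memory stand-in for the LDAP/SQL directory."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE users (username TEXT PRIMARY KEY, store TEXT NOT NULL)"
    )
    return db


def add_user(db, username, store):
    db.execute(
        "INSERT INTO users (username, store) VALUES (?, ?)", (username, store)
    )


def lookup_store(db, username):
    """Return the data store holding this user's mail, or None if unknown."""
    row = db.execute(
        "SELECT store FROM users WHERE username = ?", (username,)
    ).fetchone()
    return row[0] if row else None


def move_user(db, username, new_store):
    """Repoint a user at another store; returns True if the user exists."""
    cur = db.execute(
        "UPDATE users SET store = ? WHERE username = ?", (new_store, username)
    )
    return cur.rowcount == 1
```

    Because the MTAs, proxies, and webmail all ask the directory where a user lives, a rebalance is a data copy followed by a one-row update, with no config pushed to individual servers.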

    --
    No sir I dont like it.
  120. where would you start by just-a-stone · · Score: 1

    divide et impera

    like any project, the best way to start is still thinking about dividing the original problem into small chunks. set up some rough timeline and find the correct person to take over each task.

    one big point would be all kind of administration and distribution of settings. maybe, LDAP will be the account storage of choice, but maybe, there are some other reasons against it (e.g. the need for a possibility to bulk change user properties fast and easily).

    another big point is mail storage, affecting all kinds of clustering you think of. huge and redundant NFS boxes are open for changes, but other storage solutions may be faster.

    or your network design? your smtp hosts will receive tons of mail every day, should they be the same as the ones doing virus/spam checks and bouncing of non-existent accounts? it's not very useful to check inbound mails you don't have a "local" recipient for. atm, at work, i use a setup where the first 2 smtp hosts (load balanced, but could be via dns too) just check against ldap whether the account exists, plus some minor checks, and relay internally to the next step (this way, there are free resources for larger worm and virus waves). step 2 is virus scanning and spam detection (just adding headers, bigger machines, the cpu intensive part). finally, step 3 is the correct delivery system that stores the message itself in the user's mailbox.
    at the same time, networking issues including firewall, internal interfaces and administrative access should be part of these considerations.
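
    the three steps can be sketched as plain functions. this is only an illustration of the flow, not real MTA code: the account set stands in for the ldap lookup, the scanners are hypothetical callables, and the header names are made up.

```python
# Stand-in for the LDAP account lookup done by the front smtp hosts.
KNOWN_ACCOUNTS = {"alice@example.com", "bob@example.com"}


def stage1_accept(recipient, directory=KNOWN_ACCOUNTS):
    """Cheap front-line check: is there a local account for this recipient?"""
    return recipient.lower() in directory


def stage2_scan(message, scanners):
    """CPU-heavy step: run each scanner and only add headers, never drop."""
    headers = dict(message.get("headers", {}))
    for name, check in scanners.items():
        headers["X-%s-Flag" % name] = "YES" if check(message["body"]) else "NO"
    tagged = dict(message)
    tagged["headers"] = headers
    return tagged


def stage3_deliver(message, mailboxes):
    """Final delivery: store the message in the recipient's mailbox."""
    mailboxes.setdefault(message["to"].lower(), []).append(message)


def handle(message, scanners, mailboxes, directory=KNOWN_ACCOUNTS):
    """Run a message through all three stages; False means rejected early."""
    if not stage1_accept(message["to"], directory):
        return False  # rejected before any expensive scanning happens
    stage3_deliver(stage2_scan(message, scanners), mailboxes)
    return True
```

    the point of the ordering is that mail for unknown recipients never reaches the expensive scanners, so a worm wave mostly burns the cheap front hosts.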

    the easy setup combination point: after some research, the best for the last delivery and user access i found was qmail-ldap and courier. in your case, that depends on your requirement specification. comparatively easy questions like the mailbox format of choice should also be solved by this group.

    security has to be an issue for all parts. how are you planning to secure your outer smtp? are there requirements that people from other networks (e.g. at home) use your smtp for sending mails (not using webmail, that'd be easier)?

    monitoring & backups - 99.9% is a hard task, find the correct way of monitoring parts where you expect problems.
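
    to put a number on that 99.9%, here is a minimal sketch of the downtime budget it leaves. the function name is made up for illustration, and the 30-day month is an assumption:

```python
# Downtime budget implied by an availability target.
# Assumption: a "month" is 30 days; a "year" is 365 days.

def downtime_budget_hours(availability: float, period_hours: float) -> float:
    """Hours of allowed downtime in a period for a given availability (0..1)."""
    return (1.0 - availability) * period_hours


HOURS_PER_YEAR = 365 * 24
HOURS_PER_MONTH = 30 * 24

yearly_hours = downtime_budget_hours(0.999, HOURS_PER_YEAR)
monthly_minutes = downtime_budget_hours(0.999, HOURS_PER_MONTH) * 60

print(f"per year:  {yearly_hours:.2f} hours")
print(f"per month: {monthly_minutes:.1f} minutes")
```

    roughly nine hours a year, or well under an hour a month, covering every outage and every maintenance window combined.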

    and in the end, the group where you will mostly spend your time: side effects of unclear specification.

  121. Deep water by Anonymous Coward · · Score: 0

    If you are in charge of setting up an e-mail system for 1 million users and you are asking Slashdot for advice then you are in way over your head.
    Tell them to use Google gmail and then start working on your resume.

  122. Infrastructure, infrastructure, infrastructure... by Anonymous Coward · · Score: 0

    Start by listing the requirements. You've got some protocols, uptime, and number of accounts, but you need more.

    Storage requirements? Simultaneous usage? Likely peak load times? Account management? Backups? Disaster Recovery? etc...

    The mail part is actually fairly trivial in comparison to building the necessary underlying infrastructure.

  123. What you need by Anonymous Coward · · Score: 0

    Computer Hacking Skills, Bow Staff skills, Bike Jumping skills.........

  124. Commercial Package Options by schave · · Score: 2, Informative

    If you are interested in commercial packages, either Sun's Java System Messaging Server or Openwave's Mx product will easily scale to a million accounts and beyond. Many of the larger ISPs are using these packages or have their own custom mail server. Other possibilities may be Mirapoint (which offers an appliance-type solution) or Sendmail.com.

    If you are into benchmarks, the folks at SPEC have published results from several packages.

  125. Huh? Are most of you posters teenagers? by Anonymous Coward · · Score: 0

    Suggestions like using a public email service such as Gmail are just plain silly. Asking on Slashdot is almost as silly.

    No offense but if you have to ask hear, you are probably over your head on this job. I would seriously consider securing the assistance of someone who has done this before.

    There are many highly scalable and non-ms servers available... examples: www.tnsoft.com, netwinsite.com, www.kerio.com, etc...

    BUT the question is...
    - what features do you require?
    - why is the client fed up with Exchange? (which happens to be one of the premier and among the most ubiquitous corporate email servers)
    - what OS will this email system operate on?
    - What is your budget?

    1. Re:Huh? Are most of you posters teenagers? by Anonymous Coward · · Score: 0

      I meant here, not "hear"... so if you want to nitpick about that, bite me! :-P

  126. Scalix by sien · · Score: 1
    Check out scalix. It is basically a lower cost exchange replacement that runs on Linux. It scales, it works. There is a community edition that you can test.

    The CEO, Julia Hanna Farris, has 20 years of experience working on messaging systems for Bell, then for Lotus Notes, and then in a few other startups, and she is a babe as well. There is an interview with her over at IT Conversations that you might want to listen to.

    With the paid for edition you get all the features of exchange without the cost and without the security risks of running Windows servers.

    1. Re:Scalix by andreyw · · Score: 1

      Scalix server-side is pretty neat. But just say no to -
      a) The scalix connector. Outlook is a piece of crap by itself, but the scalix connector is simply rancid icing on an already spoiled cake. It is slow. It is buggy. It crashes haphazardly. It consumes gobs of memory. It does not play nice with Palm conduits or HotSync. It does not play nice with out-of-office.
      b) The web interface. It looks good on paper. It is a trainwreck in practice. The web interface should be used as a case study of how *not* to design interfaces with Javascript. Slow, buggy and unresponsive. I've had less negative vibes getting my teeth pulled at the dentist.

    2. Re:Scalix by Anonymous Coward · · Score: 0

      Have you used the latest version of scalix? The product is much improved, though there is still a problem with using palm sync software. The outlook connector works nicely and webmail is really shaping up too. I have most of my users using webmail only and they seem to really like it so far.

  127. FUBAR by Aslan72 · · Score: 1

    It's a stupid question for two reasons:

    1) Instead of saying 'No _________' (insert evil monopoly here), you need to ask what exchange isn't doing. What specifically is wrong with exchange, other than that its parent has the initials b.g.? Any management that would start a requirement by casting a curse on a particular company is just asking for trouble. It's a no-win situation, because in two years, if sun one, sendmail, whatever bombs on you, where will you go? Part of IS's duty in an organization is to help management ask better questions.

    2) Why are they putting this on one person? I have several friends that work for a large corporation in town that runs a large installation of e-mail, and no single person has responsibility over the entire system; there's probably a staff of 30+ that run their e-mail (and that's just a guess).

    Anyone that asks one person to take on a project of that scope is asking not for an answer but a scapegoat.

    Shop your resume.

    --pete

  128. Re:Here's my plan and it's the best one you'll get by kashani · · Score: 2

    Wow. You've got no idea just what it would take to do this, do you? Or you're being extremely funny.

    1. users should be in a db.
    2. imap servers should be their own cluster
    3. pop servers should be their own cluster
    4. smtp servers should be their own cluster
    5. spam filtering should be its own cluster
    6. round robin DNS should be ditched in favor of hardware load balancing.

    kashani

    --
    - Why is the ninja... so deadly?
  129. What's your budget and requirements? by mveloso · · Score: 1

    How much money are you willing to spend?

    Do you need to keep the existing functionality of Exchange (ie: calendaring)?

    The latter will limit your choices more than the former.

    Funny, the first answer that came to mind was "Solaris." Sun gets a bad rap here, but really, their hardware is pretty good.

    Is your organization decentralized? Be sure to think about administration and infrastructure issues associated with your particular topology. Ideally you'd want to keep your current processes, and slide another product underneath them.

    Anyhow, be sure to phase it in. Run a pilot in one section. Better yet, run multiple pilots with multiple vendors if you can manage it.

    Insist that the vendors do the pilot for free. If possible try and run within your existing infrastructure somehow, with live users. Find a department or two that's willing to volunteer, and use them as part of the test.

    Vendors won't lie to you, but their data is obviously biased. Be sure to understand exactly what you're trying to figure out from each vendor. Keep metrics now, and while you're doing the pilot, so you have hard data as part of the pilot.

    If you have requirements, be sure to share them with the vendors when they design their pilot. List -everything-, including every feature in webmail, every administrative capability that you'd like, etc. Be specific. "Supports IMAP" isn't good enough. "Supports the entire IMAP feature suite" is better. Listing every IMAP feature that you want is the best. It's not overkill, because inevitably the feature you need will be implemented in a later release and you'll be screwed.

    The idea isn't to beat one vendor with the requirements. Nobody will be able to do everything. The idea is that at least you know what each product/suite can do and what it can't, so you can adjust the expectations.

    Be sure that the people in the food chain forward all external entities inquiring about the project back down to you. Vendors will attempt to affect the requirements by talking to your boss, your boss' boss, etc. Make sure that everyone knows that you are in charge, and that you are the point of contact.

    Hmmm. Understand what your infrastructure is now. You can't replace it if you don't know what it is. Understand as much as you can about it, so you can spew statistics out the wazoo at meetings. If you know, for example, that 8% of everyone checks their email after business hours, you'll overpower everyone else in the meeting with your obvious technical Mojo.

    Be nice, but firm. Save being a dick for vendors. Your job is to keep your internal customers happy. The Sales Engineers are not your friends, they're there to make sure you buy their product.

    You can yell at sales people. It is a negotiating tactic, and can be very effective. Don't yell at the SEs, because they don't have any real authority.

    And whatever you do, don't change everything at once!

    Lastly, keep your boss (and your boss' boss) in the loop as to your progress. When the s**t hits the fan, they have to back you up. They can't do that if they have no idea what's happening.

    Good luck!

  130. Re:Here's my plan and it's the best one you'll get by Anonymous Coward · · Score: 0

    I would like to shoot every person who modded this crap up as Informative.

    It should be Funny, or modded down to oblivion.

  131. Secretaries by Anonymous Coward · · Score: 0

    Do away with email. Everyone then gets a hot secretary. When you want to send a message to someone, she memorizes it and then runs over to that person's cube and repeats it. If the message is too long, she can write it on her clothing and just drop it off for the other employee.

    It definitely saves money on equipment, and I know I'd be a much happier person when I got an email.

  132. Simplicity is key. by chrome · · Score: 5, Informative

    My job is building systems like this. Current mailserver system I designed and built is hosting 80,000 email accounts, and will scale out to a million quite cheaply by just adding more machines.

    OpenLDAP

    You need a central configuration repository to store the email accounts, their passwords, etc. OpenLDAP is perfect for this, and you can replicate it out for scalability. Be prepared to learn about LDAP schemas.

    Exim

    Use Exim because it has a simple process model (a single binary that does all the work, like sendmail) but has a human readable configuration file and has to be the most flexible MTA out there. You will have customers with weird requirements sometimes, and Exim will be able to meet those. Plus, it has Exiscan-ACL built-in these days, which allows you to do virus scanning and spam scanning at the DATA stage, before the mail is actually accepted by the MTA. It means you can make the sending MTA deal with the bounces if the mail is a virus or is obvious spam.

    Courier-IMAP for POP3 and IMAP access.

    Yeah, it's written by a sociopath, but nothing else works as well in the field. It works out of the box with sensible LDAP schemas and is fast, reliable and secure. Handles SSL, all the different authentication methods, what have you. Maildir compatible.

    Maildir message store.

    Store the mail in maildirs. Don't put them in /maildirs/domain.com/user/Maildir - split the domains up with a 2 level deep hashing algorithm (if you're virtual hosting domains, which is what it sounds like to me), so make it something like /maildirs/xx/xx/domain.com/user/Maildir, where xx/xx might be something like 3f/6b (depending on the hash). Use MD4 for the hash because it's more balanced than MD5.
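
    A minimal sketch of that layout, for illustration only. The function name and the /maildirs root default are hypothetical, and it uses MD5 rather than MD4 purely because many current Python/OpenSSL builds no longer ship MD4; swap in whichever digest you settle on:

```python
import hashlib

# Sketch of a two-level hashed maildir path: /maildirs/xx/xx/domain/user/Maildir.
# Assumption: MD5 stands in for MD4 here because MD4 is often unavailable
# in modern hashlib builds. Only the first 4 hex characters are used.

def maildir_path(domain: str, user: str, root: str = "/maildirs") -> str:
    """Return the hashed Maildir location for user@domain."""
    domain = domain.lower()  # DNS names are case-insensitive
    digest = hashlib.md5(domain.encode("utf-8")).hexdigest()
    level1, level2 = digest[0:2], digest[2:4]
    return f"{root}/{level1}/{level2}/{domain}/{user}/Maildir"


print(maildir_path("example.com", "alice"))
```

    Every user in a domain lands under the same xx/xx prefix, so the path only has to be computed once per account and can then be stored alongside the account record.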

    NFS mount the maildirs from a fast NFS device like a Netapp. Netapps are recommended because you can plug them in, and they just work, plus they are easy to scale by adding more trays.

    Linux NFS servers set up with heartbeat and shared disk also make a nice HA NFS, and would be cost effective, but you'll have to buy an array anyway (probably fiber channel), so it might be better to just get something that's completely integrated like the Netapp.

    Spamassassin.

    Can be configured to scan mail at DATA time in the SMTP conversation. A LOT of configuration work here to make it play nice on a massively scaled platform, but it can be done. Mostly it needs to have things like the auto whitelisting and Bayesian filtering turned off, as the extra DB file work is a bit excessive.

    Actually, I'm sure there is a way to make it work with a less resource intensive repository, but using the standard SA rules seems to work well for my environment. *shrug*

    ClamAV.

    Free antivirus, it works, and integrates well with Exiscan-ACL. Set it up to scan via the daemon, and configure it to update every couple of hours from cron, and bob's your uncle.

    Scaling out

    Make every box the same. Make every box an MTA, a POP3/IMAP server, etc. Use something like Kickstart to automate builds so that you can build a machine in 10 minutes, and all you have to do is configure the IP address and plug it in. If you want to be REALLY sexy, you could make the machines boot off the network, and mount / from a shared NFS area, and make /var/spool/exim the internal mirrored disks. DHCP them, then all you do is plug a machine in and set it to PXE boot. Pretty trivial to do.

    Load balancing

    Hardware load balancers are pretty much a necessity. Don't touch cisco stuff. It's not very good. Go with Foundry Networks ServerIrons. The XLs can handle 1 billion requests/day if you configure them in Direct Server Return mode (also known as DSR/Foundry switchback). Use it. It makes all the return traffic go directly out to the net, meaning your ServerIrons have to switch less traffic and track fewer sessions. For a million users, however, I would recommend a pair of the ServerIron 450GTs, or bigger. Maybe one per VIP/Service.

    Now, if this is all looking pretty daunting, you could always hire me to build it for you :)

    1. Re:Simplicity is key. by msimm · · Score: 1

      OpenLDAP

      Great product, nightmare to learn. SLES happens to make it a little easier, and will even help you set up domain logins with Samba without losing all your hair (highly recommended).

      Spamassassin

      Ew.

      ClamAV

      ClamAV is a wonderful tool. However keep in mind it can be set up a number of ways, it can scan incoming traffic, it can scan out-bound traffic, it can scan both incoming and outgoing traffic. It can use a lot of cycles, consider your outbound scanning and disable it if you feel comfortable with that (I work at a smaller shop and we have a pretty solid (read: managed) A/V policy on the workstations).

      I didn't see anything mentioned about multi-queueing, maybe this isn't as important in your situation or isn't supported with a lot of mta's (we use sendmail). But adding queue limits and multiple queues can help speed things up. Disk speed (for the queues specifically) can also help reduce bottlenecks as I/O increases.

      .02

      --
      Quack, quack.
    2. Re:Simplicity is key. by Anonymous Coward · · Score: 1, Interesting

      Have you found courier imap scalable? I have found that when a user gets 1000+ messages in a box, certain imap queries (select, sort) become very slow. This is because courier can't index any data and has to open each file on these queries. This was a real deal breaker for us; I would be interested to hear if you found a way around this (without forcing users to divide their folders).

    3. Re:Simplicity is key. by Matt+Perry · · Score: 2, Interesting
      split the domains up with a 2 level deep hashing algorithm
      Could you please elaborate on this point and why you do it?
      --
      Slashdot: Failed Car Analogies. Amateur Lawyering. Anecdote Battles.
    4. Re:Simplicity is key. by Anonymous Coward · · Score: 0

      so crackers don't get to guess where other people's mail lives.

      (assuming they penetrate that far)

      no idea on md4 vs. md5

    5. Re:Simplicity is key. by chiph · · Score: 2, Informative

      While I am not a mail guru, I expect it's because you've got a million users, and having that many directories all in one subdirectory means that the file system will be consuming a lot of cpu doing lookups. With a decent hash algorithm, you're down to approx. 15-16 directories in the 2nd directory level, which is quite manageable.
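
      The 15-16 figure falls out of simple arithmetic, sketched here under the assumption of an xx/xx layout (two levels of two hex characters each) and a perfectly even hash. The function name is just for illustration:

```python
# Average number of entries per leaf directory for a hashed layout.
# Assumptions: two hex characters per level, and a hash that spreads
# entries perfectly evenly (real hashes will vary around this mean).

def mean_entries_per_leaf(total_entries: int, levels: int = 2,
                          hex_chars_per_level: int = 2) -> float:
    leaves = 16 ** (levels * hex_chars_per_level)  # 256 * 256 = 65536
    return total_entries / leaves


print(f"{mean_entries_per_leaf(1_000_000):.2f} entries per leaf directory")
print(f"{mean_entries_per_leaf(1_000_000, levels=1):.0f} with only one level")
```

      One level of hashing would still leave thousands of entries per directory; the second level is what brings it down to a handful.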

      We did something similar at my last job, where we had to maintain 9 million+ smallish files. We originally had one level of indirection and NTFS choked (huge amount of time spent in the kernel, not enough time running our app). Adding another directory level made it happy again.

      There may yet need to be another level of indirection at the folder level to handle those few people who never deleted any email over their 15-year career with the company.

      Chip H.

    6. Re:Simplicity is key. by chrome · · Score: 1

      Couple of reasons,

      1) So you don't have 50,000 domains sitting in one directory. Can get a little slow.

      2) So if you need to have more than one NFS server, which you will if you have a large customer base, you can separate the base level into different NFS servers.

    7. Re:Simplicity is key. by Anonymous Coward · · Score: 2, Interesting

      (This is not a troll, all the following questions are honest.).

      > OpenLDAP

      IIRC, the replication feature was pretty buggy in some versions of OpenLDAP (2.2.x). Has it been really fixed in the latest versions ?

      > Exim

      What about qmail ? Have you ever tried it ?

      > MD4 [is] more balanced than MD5.

      Do you have evidence to back up this claim ?

      > NFS mount the maildirs from a fast NFS device like a Netapp.

      How do you provide data redundancy with such devices ? Do you replicate data on different NFS servers ? Why not use FreeBSD or Linux boxes as NFS servers ?

      > Hardware load balancers are pretty much a necessity.

      Why not use standard software load-balancing facilities provided by Linux and BSD systems ?

    8. Re:Simplicity is key. by chrome · · Score: 1

      Spamassassin is actually fine, you just have to tweak it a lot to get it running well on a busy system.

      With regard to configuration of the MTA, that's where an experienced Exim admin helps. Or at least, someone who is able to read the Exim docs. Not a big task really.

      I have 8 machines doing 500,000 messages a day, and their load is under 1 even during the busiest periods, using the above config.
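
      For a sense of scale, a back-of-the-envelope sketch of what that works out to as an average rate. It assumes traffic is spread evenly over the day and over the machines, which real mail flow is not, so peaks will be well above this:

```python
# Average message rate implied by a daily volume.
# Assumption: perfectly even spread over 24 hours and across all machines.

SECONDS_PER_DAY = 24 * 60 * 60


def avg_rate_per_machine(messages_per_day: int, machines: int) -> float:
    """Average messages per second handled by each machine."""
    return messages_per_day / SECONDS_PER_DAY / machines


total_rate = 500_000 / SECONDS_PER_DAY
per_machine = avg_rate_per_machine(500_000, 8)

print(f"total:       {total_rate:.2f} msg/s")
print(f"per machine: {per_machine:.2f} msg/s")
```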

      I'm pretty certain it would scale out to 1,000,000 users just by adding more hardware. I don't think it would scale linearly, either; I saw a more than double increase in performance going from 4 to 8 machines. Though, I wouldn't expect that curve to continue all the way through to 1M.

    9. Re:Simplicity is key. by adrianmonk · · Score: 1
      1) So you don't have 50,000 domains sitting in one directory. Can get a little slow.

      That's an OS-specific problem. For instance, on Solaris 8 and later, the DNLC (directory name lookup cache) has been enhanced so that it can deal with directories with thousands of entries efficiently. So, 50,000 might be slow on one operating system but perfectly fine on another.

      2) So if you need to have more than one NFS server, which you will if you have a large customer base, you can separate the base level into different NFS servers.

      That's a better reason. This problem could be solved with automount maps, but that might be more pain than just doing two-level hashing.

    10. Re:Simplicity is key. by chrome · · Score: 1

      Courier-IMAP seems fine here.

      The servers are IBM x306s with 2GB RAM, 3.2GHz CPUs and dual 80GB SATA drives, mirrored.

      I think the 2GB ram and the fast back-end NFS is key here. Lots of cache.

      I *think*, from memory, the Courier-IMAP indexes a lot of data now. At least, it leaves a bunch of cache looking files in the Maildirs. I have mailboxes with 1000s of messages and they don't seem to cause lots of NFS IO.

      Check out the newer versions; maybe they've addressed the issues you described.

      In my environment, its never been an issue.

    11. Re:Simplicity is key. by Matt+Perry · · Score: 1

      But what are you generating the hash from? The directory name? And why generate it at all rather than just use the first two characters of the domain name?

      --
      Slashdot: Failed Car Analogies. Amateur Lawyering. Anecdote Battles.
    12. Re:Simplicity is key. by Zak3056 · · Score: 1

      so crackers don't get to guess where other people's mail lives.

      I'm fairly certain we're talking for performance reasons here. 20k files in a single directory==slooooow, so you break them up into reasonably well distributed chunks. As far as security through obscurity goes, it dies with the following simple sequence of commands:

      cd /maildirs
      find . -name "domain\.com" -print

      --
      What part of "shall not be infringed" is so hard to understand?
    13. Re:Simplicity is key. by Zak3056 · · Score: 1

      And why generate it at all rather than just use the first two characters of the domain name?

      Because there are a thousand times as many domains that start with the letter T than with the letter X. A hash gives you a far more even distribution than just using the first few characters of the domain.
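
      A quick way to see the difference, using a deliberately skewed synthetic list. The domain names are made up, and MD5 is used only as a convenient stand-in hash:

```python
import hashlib
from collections import Counter

# Compare how evenly two bucketing schemes spread a skewed set of names.
# Assumptions: the domain list is synthetic (900 names starting with "th",
# 100 starting with "xy"), and MD5 stands in for whatever hash you choose.

def prefix_bucket(domain: str) -> str:
    """Bucket by the first two characters of the name."""
    return domain[:2]


def hash_bucket(domain: str) -> str:
    """Bucket by the first two hex characters of the name's hash."""
    return hashlib.md5(domain.encode("utf-8")).hexdigest()[:2]


def largest_share(domains, bucket_fn) -> float:
    """Fraction of all names that land in the single fullest bucket."""
    counts = Counter(bucket_fn(d) for d in domains)
    return max(counts.values()) / len(domains)


domains = [f"the{i:04d}.example" for i in range(900)]
domains += [f"xyz{i:04d}.example" for i in range(100)]

print("first two characters:", largest_share(domains, prefix_bucket))
print("hash prefix:         ", largest_share(domains, hash_bucket))
```

      With the prefix scheme, one directory ends up holding 90% of the names; with the hash prefix, the fullest of the 256 buckets holds only a small fraction.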

      --
      What part of "shall not be infringed" is so hard to understand?
    14. Re:Simplicity is key. by PapaZit · · Score: 4, Insightful

      All of the parent's suggestions are decent, but there are a few alternatives that may make sense:

      -Cyrus IMAP, while a monster to build and configure, can handle a pretty heavy load, and the latest versions can handle a lot of load-balancing internally.

      -Exim's nice. I'm a Postfix man, myself. Sendmail is king, though. I'm not going to claim to like it, but it's up to the task, and there's something to be said for using a standard tool.

      -While things like MD4 are okay for hashing, they're kind of CPU-intensive. Consider something like "second and third letter of username" that takes less CPU time. The right answer here depends a lot on the relative speed of CPU versus disk. If you can get dedicated hardware to do this (rare, but it exists), use whatever hashing the hardware supports.

      -Consider some sort of cache (maybe even separate machines) between incoming SMTP and SpamAssassin/ClamAV. When the 2am spam run hits, your incoming SMTP machines can become overloaded. The downside: deciding what to do with mail that's not rejected the moment it's received.

      -Set up a "mail machine" configuration with whatever OS and tools you use, and make it possible to create a disk image quickly. You're going to need a lot of hardware, which means that you'll have enough random failures to make building machines by hand impractical. This also means "have at least one extra built machine/disk array/etc. powered-on and waiting at all times" for those 4am hardware failures.

      -You may find that things like NFS just aren't fast enough. Be ready to look at SAN or shared "direct-looking" storage. The tough part: this is hard to discover during testing. It may be overkill, but don't lock it out as a possibility.

      -I/O is king. CPU speed won't matter as much as bus speed, disk speed, and memory speed. This is why a lot of companies use banks of big proprietary unix machines for their mail, even if they use commodity PCs elsewhere.

      -I don't trust hardware load balancers. Sometimes they're necessary (and they do make life better when they work), but they're a big single point of failure. Consider other ways to split the load, or at least ways to work around the load balancer if it should fail. The Cyrus aggregator can handle some of this.

      --
      Forward, retransmit, or republish anything I say here. Just don't misquote me.
    15. Re:Simplicity is key. by Zak3056 · · Score: 2, Interesting

      OpenLDAP

      You need a central configuration repository to store the email accounts, their passwords, etc. OpenLDAP is perfect for this, and you can replicate it out for scalability. Be prepared to learn about LDAP schemas.


      I know this won't be a popular opinion, but given that he's migrating from Exchange, it's fairly likely that they're already an Active Directory shop... it doesn't make sense to abandon it for OpenLDAP, especially given that they're almost certainly windows only on the desktop and will still need AD even if they ditch Exchange.

      --
      What part of "shall not be infringed" is so hard to understand?
    16. Re:Simplicity is key. by chrome · · Score: 1

      OpenLDAP: Replication has worked fine for me since 2.0 and up.

      Qmail:

      Tried it, loathe it. I won't go into it here for fear of DJB suing me ... ;)

      MD4:

      I did tests, asked around, general opinion was to use MD4 over MD5 if you're going to use the first 4 characters as a directory hash.

      Netapp:

      I won't sell you on them. I'll leave Netapp's salesmen to do that for you. I will say that any issues about redundancy or scalability are pretty much addressed.

      WRT using FreeBSD/Linux NFS servers - I guess it's the same reason why I don't buy a machine with many PCI-X slots and fill it with quad fast ethernet cards to build my own switch. It's just easier (and more reliable) to get that component from a vendor that's done the work on it.

      Hardware load balancers:

      Same reason as above. They are cheap, they work, and they can do cool things like DSR which I'm pretty sure that the software load balancers can't do. DSR is pretty much a requirement for a high performance environment.

    17. Re:Simplicity is key. by chrome · · Score: 1

      Its for performance reasons, not security.

    18. Re:Simplicity is key. by chrome · · Score: 1

      Give the man a cigar! :)

      Or a cookie, if you don't smoke.

    19. Re:Simplicity is key. by chrome · · Score: 2, Informative

      Cyrus: No opinion on this. When I looked at it 3 years ago, it wasn't where I wanted it.

      Exim: I've tried the rest, Exim's the best. :)

      MD4: You do it once, when the account is created, and put the location of the Maildir into the LDAP directory. No CPU hit.

      Spam/ClamAV: I've found separating this stuff out makes it worse, not better. Having all the machines equal, and having lots of them, seems to work better. Don't ask me why, I'm not a professor at this stuff, I just know what works and what doesn't.

      Disk images: Don't do it. It's a dark road. I use Fedora Core 4 and Kickstart. I build RPMs of everything, including configs, and build it all with the kickstart. You could do something similar with Debian if that's your poison.

      NFS: NFS is good. Get a fast NFS server and you won't have problems. Use gig for the interconnect. SAN based Global File Systems are not there yet. They are too buggy and unreliable.

      IO: CPU does help a lot, actually, if you're doing the spam/antivirus thing. If you don't do that, then fine.

      Hardware load balancers: Foundry kit is trustworthy. I've been using their stuff for years and never had any major problems with it. I've got ServerIrons that have been running for 3 years without a reboot and without a problem. The key is: understand how they work, and you won't have any problems.

    20. Re:Simplicity is key. by thogard · · Score: 3, Insightful

      Current mailserver system I designed and built is hosting 80,000 email accounts, and will scale out to a million quite cheaply by just adding more machines.
      80,000 is trivial. I was running a 12 node system with 87,000 users 12 years ago on hardware that was slower than a play station.

      The complexity of going from 100,000 to 1,000,000 isn't just 10 times harder; you start to get into that area where a sigma 4 system works with few problems at 100k but dies horribly at 1000k users. There is a line past which, instead of one broken machine being unusual, at least one machine is always broken, and it will often be broken in a way that is hard to diagnose.
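
      A rough sketch of why that line exists. It assumes each machine is independently broken some fixed fraction of the time; the 1% figure is an illustrative assumption, not a measurement:

```python
# Probability that at least one machine in a fleet is broken at a given moment.
# Assumptions: failures are independent, and every machine is broken with the
# same probability p_broken (1% below is purely illustrative).

def prob_any_broken(machines: int, p_broken: float) -> float:
    return 1.0 - (1.0 - p_broken) ** machines


for fleet in (12, 120, 500):
    print(f"{fleet:4d} machines: {prob_any_broken(fleet, 0.01):.1%}")
```

      At a dozen machines a broken box is an occasional event; at a few hundred it is the normal state of the system.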

    21. Re:Simplicity is key. by chrome · · Score: 1

      Ah yes, but I stole this design off a large ISP in the UK that I used to work for, and they had it scaled up to 500,000 users :)

      So, I know that it works.

      But, I guess that's why you need to employ smart people that can adapt the design as you need to scale out.

    22. Re:Simplicity is key. by msimm · · Score: 1

      Ya, we just started beefing up our mail system, single machine. It gets hammered sometimes but generally the load is doable (from time to time we send out large quantities of mail, but not the kind you'd think).

      I'll take a look at Exim; right now Sendmail is king, mainly because that's what was installed previously and change can be... well, hard to encourage.

      Spamassassin mainly annoys me because I don't think it's so much the best answer as the most well known. But again, it's what we use and our users are happy.

      Thanks.

      --
      Quack, quack.
    23. Re:Simplicity is key. by spir0 · · Score: 1

      Netapp:I won't sell you on them. I'll leave Netapp's salesmen to do that for you.

      Funny. We would have bought a NetApp if it wasn't for their head salesman in New Zealand. He has completely turned me off their entire company. I've never met a more brash and abusive person who was trying to sell me something.

      He abused me when I wanted to shop around to see other vendors' offerings, because NetApp is the best, goddamnit. Any idiot can see that. Why even look at anything else?

      Fucktard.

      I'm sure he would have had me at gunpoint if we were face to face.

      --
      The reason girls and Windows users don't understand UNIX is because all the documentation is in Man files.
    24. Re:Simplicity is key. by vmcto · · Score: 1

      The other thing to keep in mind is that 80,000 users 12 years ago versus 80,000 users today probably represents a 100x (conservatively) increase in the amount of raw mail traffic you will have to handle.

      In other words your 12 year old solution would die horribly today.

    25. Re:Simplicity is key. by Eric_Utah · · Score: 1

      I would agree with this design and choice of tools. I also build large scalable systems using most of the software and hardware mentioned here. My last project had more than 150,000 users on less than 20 generic Intel server platforms w/RH7.3.

      I use an OpenLDAP back end with a master read/write server and replication to a couple of slaves that all the lookups run against. User LDAP records contain fields such as their name/pw, public email addresses and back-end internal addresses.

      I use Postfix as the MTA for receiving and moving mail rather than Exim, but that's just because I'm comfortable with it and happy with its performance.

      I use Courier-IMAP with Maildirs as it runs well and plays nice with LDAP. Additionally, I use Perdition as the front end POP/IMAP proxy so that users always automatically connect to their own back end server for message retrieval.

      My setups involve 3 tiers: front end platforms for external facing services such as SMTP/POP/IMAP (low powered single cpus/standard ide io), a second layer of servers for filtering and spam markup (moderately powered SMPs w/scsi io power), and back end servers for message storage (strong SMPs with strong scsi io power & raid).

      Inbound mail arrives at the front end MTAs. It's accepted and readdressed with its internal back end address, then is delivered to a group of spamassassin/clamav filtering/tagging servers for markup. From there, mail is transported to its appropriate back-end server running Postfix & Courier-IMAP for storage.

      To fetch mail users hit Perdition on a front end VIP; it looks up the real back-end server LDAP record and proxies their POP/IMAP requests to that machine.
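
      As a rough illustration of the readdress-and-proxy steps above (not the actual setup): a plain dict stands in for the LDAP directory, and the attribute names `mailhost` and `internal`, the addresses, and the host names are all hypothetical.

```python
# Sketch: resolve a user's back-end server the way a POP/IMAP proxy might.
# DIRECTORY is a stand-in for LDAP; every name in it is hypothetical.

DIRECTORY = {
    "alice@example.com": {
        "mailhost": "store01.internal",
        "internal": "alice@store01.internal",
    },
    "bob@example.com": {
        "mailhost": "store02.internal",
        "internal": "bob@store02.internal",
    },
}


def lookup(public_address):
    """Fetch the directory record for a public address (case-insensitive)."""
    record = DIRECTORY.get(public_address.lower())
    if record is None:
        raise KeyError(f"unknown user: {public_address}")
    return record


def backend_for(public_address):
    """Host the proxy should connect to for this user's POP/IMAP session."""
    return lookup(public_address)["mailhost"]


def readdress(public_address):
    """Internal address the front-end MTA would rewrite the recipient to."""
    return lookup(public_address)["internal"]
```

      Because both the MTA and the proxy consult the same record, moving a user to another back end would, under these assumptions, only require changing one directory entry.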

      I use Foundry ServerIron XLs for hardware load balancing the world/user facing services. I just can't give those boxes enough praise.

      The simplicity with the design is that each server group runs only a couple of services, and if spam/AV filtering or a back end mail system fails, the messages spool up on the front end or filtering systems until you clear the problem. Nothing is lost and everything arrives asap.

      As a side note/gloating point; my uptimes on the Courier-IMAP servers are well over 400 days without a restart of any services. Postfix is equally stable. Everything just works.

      Of course, the tricky part will be cutting some code to make some web admin pages or utilities, configuring all the transport maps, figuring out and creating your LDAP schema, working out clamav/spamassassin, the fun with IMAP, etc. ;)

      But that's why they pay you the big bucks.

    26. Re:Simplicity is key. by Deven · · Score: 1

      I wasn't planning to weigh in on this complex topic, but I can't let this bad advice pass without comment:

      Maildir message store.

      Store the mail in maildirs. [...]


      Don't use Maildir! It does not scale well at all. Maildir's claimed greatest strength is the ability to use an NFS-mounted spool without locking. While this is true, Maildir has a fatal flaw that more than outweighs this benefit.

      Because Maildir encodes flag information into the filename, constant rescanning of the mail spool directories is inevitable, and the network traffic for these directory scans will kill the performance. Worse yet, NFS caching can cause different servers to have different views of the same mailbox at the same time.
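
      To make the filename point concrete, here is a small sketch. Maildir stores flags after `:2,` in the file name (for example S for seen, R for replied), so changing a flag means renaming the file. The sample file name and helper names are made up for illustration.

```python
# Sketch: Maildir keeps message flags in the file name after ":2,".
# Adding a flag therefore produces a new name, i.e. a rename on disk.


def parse_flags(filename):
    """Split 'unique:2,FLAGS' into (unique, set_of_flags)."""
    unique, sep, info = filename.partition(":2,")
    flags = set(info) if sep else set()
    return unique, flags


def with_flag(filename, flag):
    """Return the name the file must be renamed to when a flag is added."""
    unique, flags = parse_flags(filename)
    flags.add(flag)
    # Flags are written in ASCII order.
    return f"{unique}:2,{''.join(sorted(flags))}"


old_name = "1117000000.M1P2.host:2,S"   # hypothetical message, already seen
new_name = with_flag(old_name, "R")      # mark it as replied
```

      Any other server holding a cached listing still has `old_name`, which is why it has to rescan the directory to find the message again.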

      How do I know this? I once tried to implement a scalable mail architecture for a former employer using Linux servers and NetApp NFS servers. On paper, Maildir sounded like a great solution, since NFS locking would not be required. After implementation, we discovered how expensive all those directory scans are -- the performance was a fraction of what it might have been with a better mailbox format.

      Now, a format similar to Maildir (one file per message, no NFS locking), which NEVER renames files once created, might have a lot of potential. Of course, this would require either an index file or a database for metadata, and perhaps some direct server-to-server notification of new email, but Maildir's promised advantages might be achievable without its Achilles' heel of the massive overhead of constant directory scanning.
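
      A minimal sketch of that idea, under stated assumptions: the class and table names are invented, and sqlite3 is used only to keep the example self-contained. A SQLite file on NFS would bring the locking problem right back, so a real deployment would presumably keep the index in a networked database instead.

```python
import os
import sqlite3
import tempfile
import uuid


class IndexedStore:
    """One file per message, never renamed; flags live in a side index."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)
        self.db = sqlite3.connect(os.path.join(root, "index.db"))
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS msgs (name TEXT PRIMARY KEY, flags TEXT)"
        )

    def deliver(self, body):
        """Write the message once under a name that never changes."""
        name = uuid.uuid4().hex
        with open(os.path.join(self.root, name), "wb") as f:
            f.write(body)
        self.db.execute("INSERT INTO msgs VALUES (?, '')", (name,))
        self.db.commit()
        return name

    def flags(self, name):
        row = self.db.execute(
            "SELECT flags FROM msgs WHERE name = ?", (name,)
        ).fetchone()
        return row[0]

    def add_flag(self, name, flag):
        """Update only the index row; the message file is left untouched."""
        new = "".join(sorted(set(self.flags(name)) | {flag}))
        self.db.execute("UPDATE msgs SET flags = ? WHERE name = ?", (new, name))
        self.db.commit()


store = IndexedStore(tempfile.mkdtemp())
name = store.deliver(b"Subject: hi\r\n\r\nhello\r\n")
store.add_flag(name, "S")  # no rename: only the index changes
```

      Listing a mailbox then becomes a single index query rather than a directory scan over the network.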

      Another thing I learned from that project: Qmail sucks for this! It may work well on small systems with a few user accounts on a single server, but it is nearly impossible to adapt to a scalable multi-server environment where the users have no system accounts. The source code is so convoluted and obfuscated that it's incomprehensible, and it's so poorly modularized that customization is a serious nightmare.

      Against my better judgment, I agreed to use qmail for the project after heavy lobbying by a new employee who left the company soon thereafter. I spent several frustrating weeks fighting with the horrible qmail codebase, trying to integrate it with an external database system to control delivery of the email -- it was ridiculous.

      After the qmail evangelist quit, I immediately dumped qmail and started from scratch with sendmail. Integrating the database into sendmail was a snap -- I just created a new "map" type, which was a very cleanly modularized piece of code, then I made a few modifications to the config file and it Just Worked. In a day or two, I was able to accomplish the goal that wasn't even close after 2-3 weeks of dealing with the morass of qmail's source code.

      I'm now convinced that qmail's vaunted "security" is merely "security through obscurity", and I just don't trust the software. But as long as you don't want to do something different than qmail expects, it seems to work fine for many people with small, simple systems. Forget about using it for an enterprise email system.

      I don't want to try to describe the email system that I would design in this situation, but I would avoid "one big server" designs and focus more on multi-server designs where platform reliability depends on server redundancy. (Think Google!)

      Ideally, you should be able to rip any given server out of the rack without notice (simulating a crash), rebuild a replacement from scratch and put it into service -- all without affecting the users in any noticeable way. Designing "stateless" servers where the service data is not local can help here. (For inspiration here, read Earthlink's white papers: A Scalable News Architecture on a Single Spool and A Highly Scalable Electronic Mail Service Using Open Systems)

      Good luck designing this system. It's not a trivial task, and what some people evangelize (because it works on a small scale) will be inappropriate and fall apart on a large scale...

      --

      Deven

      "Simple things should be simple, and complex things should be possible." - Alan Kay

    27. Re:Simplicity is key. by Nailer · · Score: 1

      You need a central configuration repository to store the email accounts, their passwords, etc. OpenLDAP is perfect for this, and you can replicate it out for scalability.

      You clearly know more about large mail systems than I do, but I couldn't resist responding to this. OpenLDAP is actually pretty poor when it comes to replication: ACLs on directory entries aren't stored in the directory last time I checked, and aren't replicated at the same time as the data. This is, frankly, dumb.

      Red Hat Directory Server (formerly AOL Directory Server, formerly iPlanet) is OSS, stores ACLs in the directory, supports multi master replication and has existing large scale setups at both AOL and the US DOD.

      Disclaimer: I work for Red Hat.

    28. Re:Simplicity is key. by obender · · Score: 1

      What about the webmail interface? Does a Gmail like FOSS solution exist?

    29. Re:Simplicity is key. by ktwombley · · Score: 1

      AD implements ldap v3 anyhow, so with a few minor modifications (like schema updates) or tweaks in the source of the application, anything that works with ldap can work with active directory.

      I'm a *nix guy so I'd prefer to use FOSS, but the boss says we use Active Directory. So, we do.

    30. Re:Simplicity is key. by dTb · · Score: 1

      LVS on linux can be configured for maximum throughput through the network as return traffic does not pass back through the director. You can install a pair of directors that sync state in case one fails.

    31. Re:Simplicity is key. by Anonymous Coward · · Score: 0

      ext3... only 32000 subdirectories for you... ouch. Nice that RH EL3/4 only support ext3.

    32. Re:Simplicity is key. by Anonymous Coward · · Score: 0

      Go for Dovecot imapd! It has even better protocol support than Courier, is written by a really nice person, and has an extremely good security record. (It's the same dude who wrote irssi...)

    33. Re:Simplicity is key. by Zak3056 · · Score: 1

      AD implements ldap v3 anyhow, so with a few minor modifications (like schema updates) or tweaks in the source of the application, anything that works with ldap can work with active directory.

      Right, that was my point--putting in OpenLDAP would just be reinventing the wheel with the added issue of now having to manage both it and AD.

      --
      What part of "shall not be infringed" is so hard to understand?
    34. Re:Simplicity is key. by Anonymous Coward · · Score: 0

      There should be no difference between MD4 and MD5. If you have found one you probably should write a paper about it and submit it to the crypto community.

      I say use a fast hash instead. You don't need crypto security for dir hashes. Waste of CPU.

      See
      http://burtleburtle.net/bob/hash/evahash.html
      for a reasonably fast hash func. Heck you could probably even use adler32.
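
      A quick sketch of the cheap-checksum idea, using `zlib.adler32` from the Python standard library. The function name, the 256-bucket layout, and the path format are assumptions for illustration; whether the spread is even enough for real usernames would need measuring.

```python
import zlib


def spool_dir(username, buckets=256):
    """Map a username to a hashed spool directory using a cheap checksum."""
    bucket = zlib.adler32(username.encode("utf-8")) % buckets
    return f"{bucket:02x}/{username}"


paths = [spool_dir(u) for u in ("alice", "bob", "carol")]
```

      The same username always maps to the same directory, so no lookup table is needed, and the checksum costs far less CPU than MD4 or MD5.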

    35. Re:Simplicity is key. by Dunkirk · · Score: 1
      -While things like MD4 are okay for hashing, they're kind of CPU-intensive. Consider something like "second and third letter of username" that takes less CPU time.

      Just from having set up my own mail server -- with all of 2 accounts! -- I'm loosely following this discussion. The main thing I'm not getting here is how to get the "frontline" SMTP servers to send mail to the "backend" POP/IMAP servers in some load balancing manner. There was the suggestion of "hashing" in the grandparent post, then yours about basing it on the users' names, but how does one go about configuring the MTA to actually do this? Since you're a "Postfix man" yourself, as am I, could you please tell us how you would go about doing that specifically with Postfix?
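
      One way this could be approached, offered as a sketch rather than anyone's production config: Postfix's transport table accepts per-address entries of the form `user@domain transport:nexthop`, so a script can generate one line per user. The host names and the CRC-based assignment rule below are hypothetical; the resulting file would be compiled with postmap and referenced from `transport_maps`.

```python
# Sketch: emit Postfix transport(5) table lines that pin each address
# to a back-end host. Host names and the hashing rule are hypothetical.
import zlib

BACKENDS = ["store01.internal", "store02.internal", "store03.internal"]


def backend_for(address):
    """Pick a back end deterministically from the (lower-cased) address."""
    key = address.lower().encode("utf-8")
    return BACKENDS[zlib.crc32(key) % len(BACKENDS)]


def transport_line(address):
    """Format one 'user@domain<TAB>smtp:[host]' transport map entry."""
    return f"{address}\tsmtp:[{backend_for(address)}]"


def transport_map(addresses):
    """Build the whole table, sorted so regenerated files diff cleanly."""
    return "\n".join(transport_line(a) for a in sorted(addresses))


table = transport_map(["bob@example.com", "alice@example.com"])
```

      For a very large user base a generated flat file may be unwieldy; storing the back-end host in the directory and having the MTA look it up there is the alternative other posters in this thread describe.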
      --
      Acts 17:28, "For in Him we live, and move, and have our being."
    36. Re:Simplicity is key. by Anonymous Coward · · Score: 0

      Using characters from usernames is not optimal, as the distribution of characters in most userids is not random (you'll find that there are 'clumps' of commonly used two-letter groups in names). If CPU overhead is an issue and your mail server doesn't cache hash lookups, it's better to use something that has a guaranteed uniform distribution; a good example is the last two digits of their UID.

      Alternatively, you could alter the table structure for the account details (SQL, LDAP, etc.) to store the pre-computed hashes and include them in the query when you do id lookups.
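
      A small sketch comparing the two bucketing rules. It assumes UIDs are handed out sequentially, which is what makes the last-two-digits rule come out even; the function names are invented for the example.

```python
from collections import Counter


def bucket_by_letters(username):
    """Second and third letter of the username (skewed for real names)."""
    return username[1:3].lower()


def bucket_by_uid(uid, buckets=100):
    """Last two decimal digits of a numeric UID."""
    return f"{uid % buckets:02d}"


def largest_bucket(keys):
    """Size of the fullest bucket; smaller means a more even spread."""
    return max(Counter(keys).values())


# 1000 sequentially allocated UIDs spread over 100 buckets
fullest = largest_bucket(bucket_by_uid(uid) for uid in range(1000, 2000))
```

      Running `largest_bucket` over `bucket_by_letters` for a real user list would show how lopsided the name-based rule is by comparison.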

    37. Re:Simplicity is key. by Matt+Perry · · Score: 1

      But what are you generating the hash from? The directory name? Are you just taking the first two (or four) characters from the hash?

      --
      Slashdot: Failed Car Analogies. Amateur Lawyering. Anecdote Battles.
    38. Re:Simplicity is key. by towermac · · Score: 1

      I admit I skipped a page or two of the thread, but read at least half of it, and am surprised not to see anyone suggest an Apple solution. It uses OpenLDAP, can bind to existing LDAP or AD or be a master. Uses Cyrus, Postfix and squirrelmail, but would run Communigate Pro if you wanted. It clusters out of the box, has a nice plug-in raid box using fibrechannel, with a double-click install SAN solution. It's already configured to use the blacklists and anti-spam stuff. Needless to say, it's dead-easy to set up (compared to a lot of what I've read), and once it is, your granny could manage it. I can't imagine 5 xserves and an xraid not being able to handle this just fine, and that would set you back what, $25,000 -$30,000? That includes a fibrechannel switch. You'd have real support, and yet you have all your open standards and BSD. Just a thought.

    39. Re:Simplicity is key. by stickyc · · Score: 1
      Spamassassin.

      Can be configured to scan mail at DATA time in the SMTP conversation. A LOT of configuration work here to make it play nice on a massively scaled platform, but it can be done. Mostly it needs to have things like the auto whitelisting and Bayesian filtering turned off, as the extra DB file work is a bit excessive.

      Actually, I'm sure there is a way to make it work with a less resource intensive repository, but using the standard SA rules seems to work well for my environment. *shrug*

      I'm no professional at this, but wouldn't you want to start looking at adding some dedicated anti-spam pre-filtering hardware when you start getting into > 100k recipients? A recent bit in Infoworld had even the lowest reviewed appliance cutting server loads in half.

  133. hasnt this problem been solved? by Anonymous Coward · · Score: 0

    cant you just buy any one of a number of
    email appliances that can do this? why does
    everyone insist on building from scratch?
    also, the number of boxes doesnt really matter
    so much as the actual volume of messages.

  134. Re:Here's my plan and it's the best one you'll get by Anonymous Coward · · Score: 0

    your solution would be ten times easier if you bought a load balancer such as a cisco css or bigip, and stored mail centrally.

  135. PTSD? by finelinebob · · Score: 1

    For 1 million accounts, Post Traumatic Stress Disorder seems more likely.

  136. Morons, morons, all around... by Mr.+Underbridge · · Score: 1
    Think this through -- a lot of e-mail programs check every 20 minutes. Assuming I actually hit any without duplications, I could potentially need 400 minutes or over six hours to get all my mail. Since it's random, it could take days. And that's just for starters with this lame scheme. If I want to check mail, say, from the field on a dial-up once a day... hopefully you can see how badly this would suck.
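
    The "could take days" estimate can be checked with a little arithmetic. Assuming 20 servers (implied by 400 minutes at one check per 20 minutes) and that each check lands on a uniformly random server, this is the coupon collector's problem, and the sketch below computes the expected number of checks needed to visit every server at least once.

```python
from fractions import Fraction


def expected_checks(servers):
    """Expected number of uniformly random checks needed to hit every
    server at least once: n * (1 + 1/2 + ... + 1/n)."""
    harmonic = sum(Fraction(1, k) for k in range(1, servers + 1))
    return float(servers * harmonic)


checks = expected_checks(20)      # assumed: 20 servers
hours = checks * 20 / 60          # one check every 20 minutes
```

    Under those assumptions the average comes out to roughly 72 checks, or about a day, and that is only the mean; an unlucky user would wait longer.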

    That's why I'm pretty sure the aforementioned post was a rather good troll. Either that, or he has a $100/day crack habit.

    What the guy should do is buy an e-mail system that can handle 1,000,000 users and not screw around trying to chewing-gum together his own solution.

    Think that's retarded? Think about the idiot company that trusted the design of their 1M-account server to some clown that's never set one up before, and thinks slashdot is the best place to start looking. They're absolutely fucked, and they deserve all of it.

    1. Re:Morons, morons, all around... by Reality+Master+101 · · Score: 1
      That's why I'm pretty sure the aforementioned post was a rather good troll.

      Hmmm. When I first read it, it was just stupid enough that I figured some Slashdot Open Source fanboy might've advocated it as a "Hyuck" solution, but reading it again, I think I might've been taken by a troll. :)

      Ah well.

      --
      Sometimes it's best to just let stupid people be stupid.
    2. Re:Morons, morons, all around... by Mr.+Underbridge · · Score: 1
      Hmmm. When I first read it, it was just stupid enough that I figured some Slashdot Open Source fanboy might've advocated it as a "Hyuck" solution, but reading it again, I think I might've been taken by a troll. :)

      Don't feel bad at all, I'm not completely sure it is a troll myself. If it was a troll, it was a beauty, as it was simultaneously ridiculous and credible. The idea was dumb, but not more so than most of the other dumb shit I hear around here.

      The tipoff (to me) was combining the multiple MX server idea with random storage to create a situation where you might never get all your email, because I think you'd have to be fairly knowledgeable about mail systems to come up with something so completely insane.

      Ya gotta love a great troll, even if you feed it.

  137. Where I'd start by Anonymous Coward · · Score: 0

    First you need to think about your storage. Storage will be the key to the uptime, scalability, etc. Email is inherently tough on storage, just like OLTP work, so whatever solution you use has to be able to handle high loads of small, random I/O. Once you get your storage needs determined (how big will each mailbox be?, will there be size restrictions to attachments?, what will likely be the read:write ratio?) you'll have to decide on a system to run it on. I'm not familiar enough with non-Microsoft solutions to make a suggestion, but for that many users it will be interesting to see what you decide.

    My personal opinion, though, is you have to start with your storage systems. Everything relies on that in email solutions.

  138. Where to start? by Anonymous Coward · · Score: 0

    I'd start by firing your incompetent arse.

  139. helpdesk.ibm.com could help by Anonymous Coward · · Score: 0

    ummm, whats the budget on this project?

  140. While we answer this question... by hellfire · · Score: 5, Funny

    ... Is anyone wondering what's going on at Microsoft right now?

    It starts with a slashdot geek working in the email department spitting up his coffee, followed by a few rumors which make it up to a guy in accounting and customer service, followed by frantic management emails, including some inappropriate language, from Steve and Bill. Then a few good geeks start tracing who this cfsmp3 guy is and try to trace him to a company while the sales reps begin cold-calling any customers running around 1 million accounts.

    And Microsoft will botch it because they have no experience in kowtowing and bootlicking, which are important skills for any company who wants to humbly keep its customers.

    --

    "All great wisdom is contained in .signature files"

    1. Re:While we answer this question... by stevesliva · · Score: 1
      Hmm. Huge company some free email. What kind of company ends up offering a mail solution for free that it must buy?

      Could be a telecom providing DSL or a cableco providing broadband. I wouldn't put it past Adelphia, Comcast, Qwest...

      --
      Who do you get to be an expert to tell you something's not obvious? The least insightful person you can find? -J Roberts
    2. Re:While we answer this question... by Nintendork · · Score: 1
      "What kind of company ends up offering a mail solution for free that it must buy?"

      A smart company.

      Acquisition is better than organic growth on almost all fronts. Ask anyone with a business degree and I'm sure they'd agree.

      -Lucas

    3. Re:While we answer this question... by stevesliva · · Score: 1

      I don't believe that licensing software to provide free services counts as an acquisition. Buying the company that's selling you the software-- that's an acquisition. But I don't think they're ready to acquire Microsoft. Perhaps they could afford Novell.

      --
      Who do you get to be an expert to tell you something's not obvious? The least insightful person you can find? -J Roberts
  141. Scalix by kntgtaid · · Score: 1

    I recently listened to an interesting IT Conversation about Scalix, a Linux-based e-mail solution that can handle large volumes. http://www.itconversations.com/shows/detail654.html/ http://www.scalix.com/products/index.html/ Bryan

  142. Worst. Email. Client. EVER! by Greyfox · · Score: 3, Funny
    I've been subjected to bloated goats every time I've contracted out to IBM and I've hated the experience every time. There are a number of projects going on inside the company to try to avoid having to use it, but no one's ever had a whole lot of success at it. IT steadfastly refuses to enable imap on the servers, ostensibly because the mail servers would not be able to handle the load of EVERY SINGLE IBM employee on the planet saying "OH THANK GOD!" at once and migrating to a mail client that doesn't SUCK DONKEY BALLS.

    Don't get me wrong. Notes isn't just a crappy E-mail client. It's also a crappy database access client that provides user-definable forms which can be used to populate rows in the database. When you start getting a LOT of rows, the performance really goes to shit unless you replicate the database down to your local hard drive.

    Rather than the Notes based solution, I would suggest an old 386 running BSD and Sendmail. That'd save you a lot of pain in the long run, versus dealing with Notes.

    --

    I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

    1. Re:Worst. Email. Client. EVER! by Otter · · Score: 1

      Seconded. I know nothing about the backend aspects, but the Lotus Notes client is *the* worst piece of major software in existence. There's nothing else even close. I'd rather use a version 0.0.1 email app picked at random off Freshmeat.

    2. Re:Worst. Email. Client. EVER! by StarsAreAlsoFire · · Score: 1

      "Lotus Notes for Dummies" is surely a single page pull out with "don't" printed on it.
      Unknown

      http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html

  143. CommuniGate Pro by DarkFencer · · Score: 1

    For that amount of users I would look into CommuniGate Pro from Stalker software. I have trouble recommending it as much as I used to since their price increases but for as large as what you're doing it would be rock solid.

    It is very easy to cluster, does all the standard IMAP/POP/SMTP/Webmail along with some more uncommon features in an email server, Radius, LDAP, SIP, and MAPI (for an extra fee which can let Outlook connect to it as if it were Exchange with full Groupware functionality).

    I know I sound like a sales guy for Stalker, but I have been very happy with their mail server for a few years now.

    1. Re:CommuniGate Pro by bvdbos · · Score: 1

      Cgate-pro is software which I didn't mind buying. I've been using it for years in one company now and it's much easier than the qmail-solution I put in another company. If you calculate the hours you're busy configuring, C-gate is actually cheaper with much more options (like calendaring, SIP etc). A complete solution (and no, I don't work there either).

  144. Where to start (seriously) by einhverfr · · Score: 4, Informative

    First, you need to start by drafting real requirements. What do you need exactly? Antispam? Antivirus? Try to have it fill up at least a page.

    Once you have that done, you can start looking at solutions. You will have two parts to your solution:
    1) The DMZ email relays (possibly including other antispam/antivirus functions) You really need high availability here.

    2) Your email storage and retrieval systems. These may be a little more tolerant to downtime on an individual basis. But if you need to have redundancy here, there are ways to do it.

    I think Hotmail did fine with BSD and Qmail.* I am sure Postfix is equally capable.

    * Although Qmail itself has never had a security vulnerability discovered, you should be careful. TCPRules (on which qmail relies) has a vulnerability that can lead to root access for local users. This is not a problem on systems with no local users, however. I am not aware of any patch for the TCPRules vulnerability.

    --

    LedgerSMB: Open source Accounting/ERP
    1. Re:Where to start (seriously) by spotter · · Score: 1

      just wondering what that vulnerability is, none of the binaries from ucspi-tcp on my debian box are suid anything (let alone root)

    2. Re:Where to start (seriously) by einhverfr · · Score: 1

      Apparently under certain circumstances it is possible to have a buffer overrun when it checks for the presence of the RELAYCLIENT environment variable. Maybe it is a problem in Qmail's smtpd..... Some of the mailing lists seem to indicate that this could lead to root access, but the simple answer seems to be not to have it on the same systems where users have interactive access.

      --

      LedgerSMB: Open source Accounting/ERP
    3. Re:Where to start (seriously) by Russ+Nelson · · Score: 2, Interesting

      Yer blowin' smoke, of course. Everybody loves to claim that they've found a vulnerability in djb's code, but when it comes down to details, there are none.
      -russ

      --
      Don't piss off The Angry Economist
    4. Re:Where to start (seriously) by Lost+Found · · Score: 1

      I'm a bit suspicious of how exploitable that is myself, but in any case, as shocking as it would be, it certainly wouldn't be root. qmail-qmtpd is run as an unprivileged user. Qmail's separation model rocks... if this is true, it's a chink in the armor but it's not a root exploit.

    5. Re:Where to start (seriously) by jnf · · Score: 1

      funny, what are these?
      http://osvdb.org/searchdb.php?action=search_title&vuln_title=qmail&Search=Search
      Additionally, about two years ago I saw code floating around for qmail.
      Just because Denial Bernstein claims there are no bugs and finds technicalities to justify his position, doesn't mean they don't exist... but then I imagine most of the people who believe that type of stuff also believe openbsd is ahead of the curve.

    6. Re:Where to start (seriously) by foksoft · · Score: 1

      Where to start? By asking which of Exchange's functions your users are actually using. Remember, Exchange is NOT an e-mail server. It is a groupware server with many more capabilities than regular e-mail servers. Who in the company is fed up with Exchange? Users, the boss, or the admins? I also hate "features" of Exchange like POP3 localization, or its trouble handling e-mails with large attachments: I was unable to download such an attachment over IMAP or POP3, but once I showed that I could also connect to that server using Outlook, IMAP then appeared to download the attachment. But taken as a combination (contacts, calendar) it is a powerful tool that would be hard to replace.
      I know only one replacement for exchange. Lotus Notes/Domino.

  145. ...and it crashes a lot by HBI · · Score: 1

    That said, it does have healthy clustering support. That was the only thing that made Domino tolerable for us as a mail server.

    They really have to do something about all the panics and task shutdowns that Domino suffers, no matter what the equipment. There's something screwy in there somewhere - something with buffer handling or whatever, pointers getting mangled. It's probably bad databases ultimately but, after you run Lotus' consistency checks nightly on close to a terabyte of databases (spread over 10 servers) you'd expect valid data, right?

    The need for something like Cassetica's NotesMedic to restart your client after a crash is kind of lame too. That should have been fixed ages ago.

    I have to say that Exchange is better, sadly. I hate Exchange, but it suffers from none of these issues.

    --
    HBI's Law: Frequency of calling others Nazis is directly correlated with the likelihood of the accuser being Communist.
  146. Big Problem, Big Guns by bdaehlie · · Score: 1

    Seriously, if someone asked me to do that, I'd be on the phone with IBM and Novell immediately. I can almost guarantee you they'd be more than happy to help and your job would be safe. Don't try to do this yourself...

  147. suggestions by erase · · Score: 2, Informative

    quick 15 minute brainfart:

    in order to increase reliability, you want to adopt a clustered design - if a machine or two fail, nothing should happen to the service.

    in order for all the machines to be able to find the user preferences/passwords/etc, you'll want some sort of common storage for them. it could be on a shared filesystem, in ldap, mysql, etc. ldap is common and a good choice (it has very fast read/query performance) - make sure you use replication so an ldap server failure doesn't take you down (or better yet, a multi-master setup). if you use ldap or sql, make sure you are indexing correctly on the data you most commonly pull up.

    in order for all the machines to access the user's mail, you'll want some sort of shared message storage. a shared filesystem is easiest, you could choose from nfs, redhat gfs, veritas cluster fs, etc. if you use nfs, make sure the nfs server can failover to a backup system if the nfs master dies (netapps are great for this).

    rather than using round-robin dns, i'd invest in a load balancer. there are some free options for bsd and linux, but the commercial products are very nice and easy to use. f5 labs bigips are very nice, cisco CSSes are garbage.

    other suggestions about breaking the services into different groups are spot on. personally, i'd have 3-4 inbound smtp servers inside a loadbalanced pool that handled inbound mail and passed the messages to virus and spam scanning services before delivering them to the shared message store (your load might dictate you need more servers, but if you design right you can just add more as time goes on). i'd probably put pop3 and imap services on those hosts as well, and possibly only allow pop3s and imaps (the ssl encrypted variants).

    i'd also have a set of outbound mail servers that users would connect to to relay outbound mail. they would require smtp auth, and possibly only allow connections on smtps ports. spam/virus scanning would be performed before the message was accepted by the server, so users would get immediate feedback if their message didn't go through. the outbounds would not do any local delivery, so they would not mount the shared message store (you'll get proper bounces for all invalid mail addresses this way, instead of smtp rejections for invalid email addresses in local domains).

    i'd have another set of servers that did virus and spam scanning for both the inbound and outbound smtp servers. you'd want these machines to have faster cpus than the rest, as virus and spam scanning are usually quite cpu intensive. again, if your load increased (or was more than you had anticipated), the system is easy to grow just by adding more machines.

    another set of servers would handle the shared filesystem (if nfs, or gfs exported via gnbd), and possibly also the shared preferences store (ldap).

    the final set of servers would handle webmail.

    each set of servers should be firewalled from the others (especially the webmail servers, which are probably the most vulnerable to attack), with only the necessary allowed traffic going through.

    qmail and postfix can easily read ldap, i'm sure sendmail can also (as can commercial solutions). anything will work for the smtp daemon.

    since you are supporting pop3 users, maildir is a better choice over mailbox for your message stores. courier or cyrus would be a good choice, and come with pop3, imap, and MDA (message delivery agent) components.

    i'd have the inbounds accept mail from remote sources immediately (assuming the user being delivered to was valid) and have them hand off the message to an MDA, which would perform spam scanning, virus checking, and any user filtering configured before delivering the message to the user's mailstore. (scanning after the message is accepted uses more resources, but grants you more flexibility - users can have their own spamassassin settings, or you can add any number of filtration steps).
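
    a rough sketch of that post-acceptance pipeline, just to show the shape of it. every name here is hypothetical, and the two filter functions are placeholders for calls out to real scanners such as clamav and spamassassin.

```python
# sketch of a post-acceptance delivery pipeline; all names are hypothetical.


def virus_scan(msg, prefs):
    # placeholder check standing in for a call to a real virus scanner
    return "drop" if b"EICAR" in msg["body"] else "ok"


def spam_scan(msg, prefs):
    # per-user threshold: the flexibility gained by scanning after acceptance
    score = msg.get("spam_score", 0.0)
    if score >= prefs.get("spam_threshold", 5.0):
        msg["folder"] = "Junk"
    return "ok"


PIPELINE = [virus_scan, spam_scan]  # add more filtration steps here


def deliver(msg, prefs, mailstore):
    """run each filter in order, then file the message in the user's store."""
    msg.setdefault("folder", "INBOX")
    for step in PIPELINE:
        if step(msg, prefs) == "drop":
            return False
    mailstore.setdefault(msg["folder"], []).append(msg)
    return True
```

    since the message was already accepted at smtp time, a dropped message here cannot be rejected back to the sender, which is the trade-off mentioned above.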

    for virus scanning, check out ClamAV. for spam scanning, look at spamassassin (

  148. Step 1: by Anonymous Coward · · Score: 0

    find someone with mail experience. Large scale mail experience. They're out there.

    I'm giving that advice 'cos I've been to the 1m mark, and it's not as simple as it sounds. You need clue, and you need experience, and /. does not count.

    Hire someone (with clue and relevant background) who has the knowledge before you make ANY decisions. After that it's easy.

  149. Re:Here's my plan and it's the best one you'll get by Anonymous Coward · · Score: 0

    Sendmail is fine

    Sorry, lost you right about here. Sendmail is unacceptable crap. I have these high-volume servers (tens of thousands of users) and sendmail literally locked up trying to process the queue (yes I'd split the queues and optimized sendmail with what I thought were good settings). When it DID work, it left cruft in the queue directory that left me wondering "did it even deliver those messages..what the hell?"

    Postfix has been working a LOT nicer. I also use qmail a lot in "set and forget" situations. Sendmail is junk. Microsoft could've made it better.

    I can't even imagine 1,000,000 users on sendmail, even if it is spread over 20 servers. *shiver*

    Assign usernames and passwords to all users and create all the accounts on every single machine (more on this later).

    You do have a feel for how many users 1,000,000 is, don't you?

    Easy, you set up a round robin DNS on mail.DOMAIN.com. This way whenever a user checks their mail, they'll randomly end up on a different mail server, therefore collecting more of their mail.

    Are you making this shit up as you go along? Have you ever dealt with real users? A MILLION of them?

    Them: "How come my important file hasn't arrived in my inbox yet? My client sent it ten minutes ago!"

    "Just check your mail twenty times, it should show up after about 10 tries. kthxbye!"

    Yeah, that'll go over well.

    Better idea, partition the users over the 20 machines and put a gateway in front that proxies them to the correct machine based on their username or some other criteria. You can write a FAST POP proxy (probably IMAP too, never tried though) in Python using event-driven techniques.
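
    A minimal sketch of the routing decision such a proxy would make, assuming 20 backends and a plain hash of the username. The host names and the backend_for function are made up for illustration; the POP/IMAP proxying itself is not shown.

```python
import hashlib

# Hypothetical backend names; the comment above only says "20 machines".
BACKENDS = [f"mailstore{n:02d}.example.internal" for n in range(20)]


def backend_for(username: str, backends=BACKENDS) -> str:
    """Map a username to one backend, the same one on every login."""
    # Normalise so "Alice" and "alice " land on the same machine.
    key = username.strip().lower().encode("utf-8")
    # Use a stable hash: Python's built-in hash() is randomised per process.
    digest = hashlib.sha1(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

    Plain modulo hashing moves most users to a different machine when the number of backends changes, so a lookup table (LDAP or a small user database) is probably the safer choice if the cluster is expected to grow.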

    Set up all the accounts there and write a Perl script which logs into all the other boxes on POP3 for every account, then puts the messages into the folders on the IMAP server. Get this script to run (with crontab) every minute.

    Okay, okay, I get it. You're just fucking with this guy, right? You don't REALLY do it this way????

    Good luck!

    Yeah you'll need it...

  150. Don't make the same mistake your boss did by geekoid · · Score: 1

    hire someone who doesn't have to ask slashdot how to do their job.

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  151. www.raqport.com by NAACPsupporter · · Score: 0

    Check them out, they are doing what SUN destroyed. www.raqport.com They have a nice email appliance...

  152. Whatever you do, don't forget connection control by ttul · · Score: 1

    My firm has recently consulted for an email service provider that handles mail for about ten million end user accounts. Until recently, they were running everything through a large and growing bank of content filtering servers. As traffic has increased, the load on their filtering machines has increased exponentially, as have the storage requirements for their anti spam quarantine system.

    Whatever you do, please add some kind of throttling-style connection control in front of the content filtering systems to limit the rate at which spammers can connect to your content filters. With content filtering and blacklists alone, you will only get about 95% of the spam and your infrastructure costs will know no limit. Add connection control and you can get the last five percent under control while also significantly reducing the amount of mail that ends up wasting time and space in the quarantine.
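
    To make "throttling-style connection control" concrete, here is a generic per-IP token bucket sketch. It is not how any of the products mentioned in this comment work; the class name, the rate and burst parameters, and the injectable clock are all assumptions for illustration.

```python
import time


class ConnectionThrottle:
    """Token bucket per client IP: `rate` new connections per second,
    with bursts of up to `burst` connections."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock  # injectable so the logic can be tested
        self._buckets = {}  # ip -> (tokens left, time of last update)

    def allow(self, ip: str) -> bool:
        """Return True if this IP may open another connection right now."""
        now = self.clock()
        tokens, last = self._buckets.get(ip, (float(self.burst), now))
        # Refill for the time that has passed, capped at the burst size.
        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[ip] = (tokens - 1.0, now)
            return True
        self._buckets[ip] = (tokens, now)
        return False
```

    The front-end listener would call allow() with the peer address before handing a connection to the content filters, and defer or drop the connection when it returns False. This sketch never evicts idle entries, so a real deployment would need some expiry for the bucket table.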

    My company sells a traffic shaping connection control system. Fancier appliance-based options such as the Symantec 8160 are also available if you have large amounts to spend and a propensity to go with the big name.

    To my knowledge something like this is not yet available in open source -- probably because it has only recently made sense for large mail receivers such as your client.

    Our home page: http://www.mailchannels.com/

  153. Thank you by superspaz · · Score: 1

    Am I the only person in the world who's actually gotten a message saying gmail is down. This happens to me every few weeks and usually lasts a minute or two. Not a big deal, but certainly not going to meet your reliability requirements.

    Guess there is a reason they are still in beta.

    1. Re:Thank you by ipxodi · · Score: 0

      This happens to me every few weeks and usually lasts a minute or two. Not a big deal, but certainly not going to meet your reliability requirements.
      I'm not so sure. the poster said he needed 99.9% uptime. If I did my math right, that means he's willing to deal with over two 40 hour work weeks of downtime! (87.6 hours) I haven't had any where near that amount of downtime with Gmail since I got it back in July of 2004.

      --
      load "windows7" ,8,1
    2. Re:Thank you by Anonymous Coward · · Score: 0

      Jesus, are half of slashdot posters math fucktards? Check the discussion above.

  154. Cyrus IMAPd is your friend by Craig+Ringer · · Score: 1

    This article:
    www.linuxjournal.com/article/7323
    may be helpful to you. It's not on quite the same scale, but it may be helpful. I know quite a few universities and companies run enormous Cyrus clusters with LDAP and a good UNIX MTA.

    You might also want to ask on the info-cyrus mailing list.

  155. Start at step one... by nutbar · · Score: 1
    Well, I've read a load of other comments that all seem to be touting the poster's favourite email system. Good for them - I'm sure they all have their features and points. But that's way too far ahead.

    The place to start is at the beginning - step one. What are the requirements? Without those, any product being touted is pointless. Really, you need to consider things like:

    • Mail storage capacity per user
    • Geographical diversity - where are the users
    • Functions - are calendaring et al. required, or is it just email?
    • What anticipated growth is there for the system?
    • What business continuity requirements are there? Is it acceptable for a single site to be out, or is absolutely everything mission critical?
    That's a start. Clearly, the outcome of these questions will help you determine what the business requirements are for the system, and from there you can build an RFP and start talking to vendors to determine the most cost-effective option that meets your requirements. Suggesting any particular solution at this stage is academic.

    PS: To all the posters that said "look at gmail!" - gmail has an entirely different purpose than a corporate email system!

  156. I'd start by... by Anonymous Coward · · Score: 0

    Hiring someone smarter than you. "Must scale perfectly" BWAHAHAHAHAHA go away.

    1. Re:I'd start by... by sadomikeyism · · Score: 1

      ...sticking my organ in a desk drawer, then slamming it closed several times. The dopamine rush should last a few minutes, though make sure your office/cubical opening is closed/vacant to avoid UWE issues. Now, I assume your engineering dept works on PCs, the art/graphics/advertising dept works on Macs, and the manufacturing dept works on some sort of big iron, while the IT dept works on linux boxes shelling into the big iron. Your executives all get their email printed for them, so you need massively reliable printers too. Find somebody like gmail or yahoo who is willing to license their webmail interface and stay away from mail client applications.

      --
      "Necessity is the plea for every infringement of human freedom. It is the argument of tyrants; it is the creed of slaves
  157. Easy by xihr · · Score: 5, Insightful

    Resign. You're obviously in way over your head if you have to resort to asking Slashdot readers for advice like this.

  158. Lotus Notes (works with Linux) by gueido · · Score: 1

    Last year, my company migrated from Exchange to Lotus Notes/Domino. (Domino is the name of the server product, Notes is the name of the client). Migration of data from Exchange to Notes was fairly easy. The best thing was the server runs on Linux, Solaris, and all IBM mainframes. Of course, Windows 2000/2003 Server are supported, but I prefer not to purchase CALs. For 50 users, I think we spent $3000. That included webmail, pop3 server, imap server, everything. I really don't think Notes gets the credit it deserves sometimes. Did I mention that it doubles as a development environment, where you can build applications fairly easily? Good luck!

    1. Re:Lotus Notes (works with Linux) by Belial6 · · Score: 1

      Me and my tester crank out ~3 large scale applications a year. Just the two of us. I haven't seen an application development environment that can match it. As with any project, just make sure the tools fit the project.

  159. FreeBSD Case Study by NovaX · · Score: 1

    You may find this interesting and useful: Argentina.com: A Case Study. They supposedly built a large-scale, low-cost email system with high reliability at a company with less than 15 employees.

    --

    "Open Source?" - Press any key to continue
  160. Re:NO Domino by Shalda · · Score: 2, Insightful

    Well, on the subject of what not to use, avoid Lotus Domino & Notes as well. Take your favorite horror story involving Exchange and substitute Domino for Exchange and Notes for Outlook and that's what it's like. Only Outlook is a much better mail client.
     
    There are dozens of perfectly good mail servers out there. The more features they have the more likely you are to have problems. It's a pretty simple equation.
     
    And if all else fails, you can write your own. I've written one, it's not very difficult (hacked it out in C# in a weekend). It's a very simple plain text protocol. But I wouldn't run the company on something I wrote in C# in a weekend. I don't even use it myself anymore. I'm running Exchange now for my personal mail server as that's what we run at work.

  161. Intelligent Architecture by Anonymous Coward · · Score: 5, Informative

    Hi Cliff;

    Sounds like a fantastic design opportunity here. The 5% of the project that is Enterprise architecture is what I enjoy the most as well. I'm assuming money probably isn't an object in terms of how much gear and bandwidth you may have to feed to this.

    I'm happy to let my fingers type away below, I'd love to keep in touch and see how you end up shaping this system. my email is allowmx at hotm...

    Before I ask, are there actually a million accounts? Or is that just a ceiling that you have to show proof of concept with?

    I've only implemented up until about 250,000 accounts of any kind, as I'm sure you're probably aware, the base transactional resource costing is essentially the same..

    For me, I would look at this for sure from at least these two angles:

    1) Knowing your transactional costs: how much of your hard resources (bandwidth, CPU and disk space) will each type of transaction in your system take? I mostly use this approach to get not an exact number, but an idea of magnitude, and to detail each cost on its own so the proper attention is applied to it.

    2) Failsafe intelligence & capacity in the infrastructure, as well as failsafe intelligence & capacity at the application layer. You have to know that your hardware, software, OS, business logic and applications can all be monitored, internally and externally, both for availability and for actual "can I use it". Keep transactional logs and the like, so the information is available when the inevitable problems come up.

    Also, having a capacity for as many of these layers to be self-healing, and fungible to the point that your service delivery is homogenous in as many ways possible. If your network finds something doesnt work or route, with mail, you can find another way to route it. Having a transactional manager of some kind, direct or not, could be useful in this case depending on what the client wants.

    99.9% uptime equates to about 526 minutes, or 87.6 hours you _could_ be down each year. Thats about 7.3 hours a month, or one day a month.

    Based on that, having flexible, redundant tools setup in a high-availabily arrangement at their respective operating capacities is key. I'm not sure if your current exchange problems are being aided by not enough equipment, bandwidth, or other stability issues, so I'll just assume that it's all of them :)

    I apologize if anyone else has already mentioned some of this, but here's some of what I've found to help me where email has become as crucial to a business as their cell phone.

    On the hardware level:

    - STORAGE: Everything goes on a SAN, if not more than one. Don't waste your time with anything less.
    - SERVERS: All servers have redundant hot swappable parts, at the very least power and hard drives. I'd even suggest making the servers iSCSI bootable so they can boot off the backbone. Beyond this, I like to buy my servers in piles of identical ones. Have 1-2 spare servers of each kind sitting there, ready to throw hot swap drives into from a failed server. That way if a server dies, you can address the power supplies, or get the HDs in that machine into another identical server and get it up and running while you diagnose the hardware problem independently. My approach to any kind of problem is FIX, DETECT and REPAIR. Get it up and running, find out what was wrong, make sure it's fixed for good. Too many of us stop at the first too ;)

    The idea I have in mind is a smaller scale version of a google beige box army. linux/bsd offer so many more transactions for each piece of hardware, so that works very much in your favor. Obviously you want something enterprise grade to satisfy the client, such as the Compaq/HP Proliants, etc. (I feel these servers have the best overall support, manageability and information tools, and their openlinux drivers interface wonderfully with open source operating systems.)

    Networking/Communication level:

    - Entire mail processing architechture communi

    1. Re:Intelligent Architecture by Anonymous Coward · · Score: 0

      "99.9% uptime equates to about 526 minutes, or 87.6 hours you _could_ be down each year. Thats about 7.3 hours a month, or one day a month."

      526 minutes, or 87.6 hours? I sure don't want this guy designing my e-mail setup.

      C'mon, people, check your math for sanity.

    2. Re:Intelligent Architecture by Anonymous Coward · · Score: 0

      99.9% uptime equates to about 526 minutes, or 87.6 hours you _could_ be down each year. Thats about 7.3 hours a month, or one day a month.

      uh, i'm not good with figures, but off the top of my head, one day per month is around 1/30, which is close to 3%. i could be wrong, they taught the "new math" at my high school, so what do i know?

      250,000 e-mail accounts and no math skills required? is your company hiring? :)

    3. Re:Intelligent Architecture by X · · Score: 1

      - STORAGE: Everything goes on a SAN, if not more than one. Don't waste your time with anything less.

      I'm sorry, but I gotta disagree with this. I'd say don't waste your money on the SAN unless you know you need it. A cheaper approach for a mail solution is to have a good replication setup.

      --
      sigs are a waste of space
    4. Re:Intelligent Architecture by Anonymous Coward · · Score: 0

      hah. an excellent cover letter. best of luck on getting the job. :*)

    5. Re:Intelligent Architecture by Anonymous Coward · · Score: 0

      Hello Darlings,

      If you noticed, i calculated my downtime/yr by minutes per year and then converted to hours.

      I put the decimal in the wrong space while typing, and then it carried forward.

      99.9% downtime calculation

      60 minutes/hr x 24 hrs a day x 365 day a year (take a breath and let the leap year slide)
      = 525600 min/yr
      .999 uptime x 525600 min/yr = 525074.4 mins of guaranteed uptime/yr

      so, that means 525600 mins/yr total - 525074.4 mins/yr uptime =

      525.6 minutes of downtime/year

      526 minutes a year = 8.76 hours and the universe is all stable again. I put the decimal one space over in my typing. Thanks for pointing out the confusion.

      Lol@ cover letter. I'm more than gainfully employed already, learning never stops though.

      adamgeek; you calculated with a 99.0% uptime instead of a 99.9% as i think was in the original post.

      re: SAN; enterprise work isn't about how much you can accomplish with how little. seeing how far you can push your 386 to save money doesn't cut it with all clients, as much as I enjoy it too. I like SANs given their huge pricing. HP's SATA san's are a superb value relative to the bloated prices of the scsi ones and would be a good fit. Personally, if it were my company, I'd probably make do with replication, enterprises demand fungibility.

      Have a nice day fellas. :)

      And no, I'm not hiring, unless you have extensive experience with the decimal police!

  162. Cyrus imapd or Communigate by thule · · Score: 1

    I ran Cyrus-imap on a production server for a .com back in the day. It had .5 million accounts on the box with around 100-200 simultaneous web users hitting the daemon constantly. This was back in cyrus-imap 1.6 days. Cyrus performed very well except for logins. This was due to a flatfile that no longer exists in the 2.x release. Cyrus is probably the fastest most scalable opensource imapd/pop server out there.

    If you don't mind a commercial solution, I can't imagine anything more scalable than CommuniGate from Stalker Technologies.

  163. Just forward everything... by Anonymous Coward · · Score: 0

    Create a mail system for 5,000 users. Create 200 subdomains. Run a forwarder which sends mail from the main address to the appropriate subdomain.

  164. I can see the conversation that started this.... by geekoid · · Score: 1

    Boss: Damn email server is down again
    ThisClown: Too bad we don't use an open source solution we set up ourselves..it would always work AND we would save money.

    Boss: Really?

    ThisClown: Sure Open source is great (blah blah boring open source diatribe)

    Boss: sounds great, make it happen

    ThisClown: [drops brick realizing his boss was taking him seriously] uhhhh sure, but I have these other priorities...

    Boss: Never mind them, make us an email system that can handle [holds pinky to mouth] 1 MILLION users..

    ThisClown: Goes to computer and asks slashdot what to do.

    I like OS, but you know how some people can go on and on about it. Especially people who don't really understand the magnitude of a task they're being given.

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  165. Re:Here's my plan and it's the best one you'll get by pyrrhonist · · Score: 1
    Sendmail is fine, but...

    ...liquor is quicker. ;)

    --
    Show me on the doll where his noodly appendage touched you.
  166. A long process ahead by Anonymous Coward · · Score: 0

    A req analysis for sure is the start, with a point by point list of functions....we currently run 29 MTAs with 16 storage racks. qmail definitely figures in your design. Others vary. Have fun!

  167. I'd start... by Anonymous Coward · · Score: 0

    with the backbone.

    load balancers, fiber channel, and shared storage

    look into a clustering file system on a NAS or SAN

    store your mailboxes there, I recommend looking at coraid.com

    otherwise just daisychain a set of load balanced networks together

    smtpd->spam/AntiVirus->mailman/postoffice delivery to SAN/NAS->POP3/IMAP/Web client->spam/antivirus->smtpd

    each a separate network, load balancers in between, separate your receiving smtpd daemon network from your sending smtpd daemon network, and your antispam and antivirus networks should include more than one product

  168. If you're brave. by imbroken3a · · Score: 1

    You could give Oracle Collaboration Suite a try.

  169. You need a staff of 10 to 20 to run this... by Temkin · · Score: 2, Informative


    One million email accounts is quite a lot. You're getting into the big league ISP category with something like this. It's not a one person operation to put something like this together. You're going to need a substantial number of well trained people to do this. There are only a couple of players in the field at this level. Sun's JES Messaging system owns a sizeable chunk of the market, followed by OpenWave and a small gaggle of fly-by-nights with unproven track records.

    Some of the larger email systems however are homegrown using open source parts. Yahoo and Google immediately come to mind, and they do work quite well. But you probably don't have the resources that they do to engineer & test something like this. Yahoo is rumored to have more than 200 people working on email alone.

    Sun has a deployment like this canned, sitting on a shelf in Santa Clara. Tell them what you need, write a check, and they'll show up with the kit. 99.999% uptime if you write a big enough check. Make them throw in the Waveset stuff.

    1. Re:You need a staff of 10 to 20 to run this... by anopres · · Score: 1

      Strange that this is the first post I've seen that mentions Sun's system. I used to run Netscape Messaging Server and Exchange side by side for two groups in the same building. It was amazing how much more mileage we got out of our Netscape setup than we could with the MS box. I'd say we must have been supporting 5 times the number of users on the ns box and spent less time doing it. Of course that was many years ago. I'd still have to give Sun a crack at something like this though. With their stock trading at about $4, you could probably just buy them if they gave you any trouble (just kidding).

      --
      Strong Mad - 2008: "I PRESIDENT!"
  170. Oh, dear God, you RECOMMEND Notes? by Anonymous Coward · · Score: 3, Informative

    Of course, everyone should note that recommendation is coming from an IBM employee.

    Sorry, but Lotus Notes sucks; it's an abomination in almost every way. It's bloated, slow, buggy and has what is arguably the worst user interface ever (The User Interface Hall Of Shame said they could have based their entire site on this one app!) Sure, it does group meeting notes and can let you check other people's calendars but it falls flat as an email system. If it can't do the basics, who cares about the "advanced" features.

    Doubt me? Okay. Let's try a little experiment.
    First, sort your inbox by subject. Oh, I forgot. YOU CAN'T. Well, let me take that back. You can if you simply follow these simple instructions...
    First, you need to have Domino Designer installed. In Designer, open Folders in left pane, then open folder $Inbox, highlight the Subject column. In the window with Columns properties in second tab you can check-in the "Click on column header to sort..." checkbox. Close $Inbox folder window. To prevent design refresh, in Folders view, right-click on $Inbox folder, choose Design properties and on third tab check-in "Prohibit design refresh or replace to change".
    [blinks eyes in disbelief]

    Un. Fucking. Believable.

    Oh, and the feature I like the best is the pop-up dialog that tells you you have new mail. So you click to make that go away, switch over to LN to read the new mail and it's not there... Oh, yes, that's right, you have to press F9 to actually download the email to your client, even after being notified by an obnoxious popup that you have new mail.

    Want to know another neat little feature related to that F9 key? According to our LN System Admin, get a few dozen people to all press and hold the F9 key for a few seconds at the same time and you can crash the Domino server backend requiring the server to reboot. Nice.

    I could go on but I think I've made my point. I have never, ever, encountered anyone who has switched from Notes and been pleased with the change.

    1. Re:Oh, dear God, you RECOMMEND Notes? by Anonymous Coward · · Score: 0

      I've been using Blotus Notes for over 2 years now. I do hate the UI, and the enormous size and slowness, but you can sort by subject in a single click at the top of the subject column (and there's probably some kludgey threaded view), and I don't get popups notifying me of new mail.

    2. Re:Oh, dear God, you RECOMMEND Notes? by the+bluebrain · · Score: 1

      mmmmhyeah. But what if you want to change the $Inbox to show emails categorised by the first letter of the sender's last name? Or the first two letters? Or first categorised by the year and month, then by the first letter of the last name? Or with a separate category for mails flagged as "important", or "encrypted"? Or any other parameter?
       
      And what about if you want to make this change for all users - or only a selection of users? Or you want your users to be able to select from a list of possibilities?
       
      It's all as hard as the switch you described, no harder. The only problem is that you probably have *too many* choices - but once you're in there ... weeee!
       
      /yah, written from deep within an IBM shop

      --
      yes, we have no bananas
    3. Re:Oh, dear God, you RECOMMEND Notes? by Anonymous Coward · · Score: 0

      can somebody confirm this or the GP? is the GP a troll?

    4. Re:Oh, dear God, you RECOMMEND Notes? by AaronLawrence · · Score: 1

      Hm. Been a while since I used it, and it probably depends on version. I do remember having to create a special view (using the designer, as the GP says) to have sorting by subject. That was Notes 4.6; they are up to 6 now.

      Notes is pretty much a flat file database with blob support. This works well for certain things, but is disastrous for others.

      --
      For every expert, there is an equal and opposite expert. - Arthur C. Clarke
    5. Re:Oh, dear God, you RECOMMEND Notes? by pullmyfinger · · Score: 1
      First, sort your inbox by subject. Oh, I forgot. YOU CAN'T. Well, let me take that back. You can if you simply follow these simple instructions...

      What the hell are you talking about? You can sort by subject, at least in version 6.0 and above (I'm at 6.5), by clicking the subject title, and it works. Sure, if you're running some archaic 4.5 LN client it probably won't work, but any later version of LN works fine.

      Oh, and the feature I like the best is the pop-up dialog that tells you you have new mail.

      Its a freaking option like anything else. if you dont like it, Click File->Preferences->User Preferences->mail->General and then uncheck "show pop up"

      So you click to make that go away, switch over to LN to read the new mail and it's not there... Oh, yes, that's right, you have to press F9 to actually download the email to your client, even after being notified by an obnoxious popup that you have new mail.

      Dude, make a local freaking replica of your server copy. Why the hell are you running off the server copy anyway? If you run off the server replica and not a local replica, you get this issue. Once you or your scheduled replication runs, the server copy gets placed locally and voila, no need for F9!

      I agree that the UI is one of the most clunky/unintuitive things I've dealt with. I'm no LN fanboy but just bear in mind that LN has options that can be changed!

      PMF

  171. Re:Here's my plan and it's the best one you'll get by keepr · · Score: 1

    Oh come on !!

    Just get yourself one of those ultra fast AMD servers and put exchange on it with a 1,000,000 user license. I have been told by countless AMD fans that the new AMD processor is XX times faster than anything on the planet.

    I think 1 server will do it for you, but you might want to raid the server just in case!

    --
    Slashdot taught me how to use the preview button!
  172. You said from scratch by xodiak · · Score: 1

    If you are expecting a million users you should expect to put some cash into this. Develop the majority of it from scratch, get up to date on RFCs and how different web mail services and servers vary from them (this is honestly the hardest part).

    First, develop an incoming server that is specific to your aims, have it read from your user database (the only thing you should use any database server for in this instance is a user database) have it dump the raw message to files in some sort of predetermined directory structure (find out the particular limitations for your OS, and test them, you will regret it if you don't). Don't do anything with the incoming mail server more than verifying the user and dumping the raw information.
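
    One possible "predetermined directory structure", sketched under the assumption that usernames are already validated and that a two-level hashed fan-out is acceptable. The root path and the mailbox_dir function are hypothetical, not something the comment specifies.

```python
import hashlib
from pathlib import PurePosixPath


def mailbox_dir(username: str, root: str = "/var/spool/mailstore") -> PurePosixPath:
    """Return a two-level hashed directory for one user's raw messages."""
    name = username.strip().lower()
    # Refuse names that would escape or collapse the directory layout.
    if not name or "/" in name or name in (".", ".."):
        raise ValueError(f"unsafe mailbox name: {username!r}")
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    # 256 x 256 buckets: about 15 mailboxes per leaf directory
    # at a million users, well under typical per-directory limits.
    return PurePosixPath(root) / digest[:2] / digest[2:4] / name
```

    The incoming server would write each raw message into the directory this returns, so no single directory ever has to hold a million entries.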

    Next, you write the POP server, the web interface, and if you are going to have an internal mail client then you write that (look into IMAP for ideas on protocol). Have only these things do any sort of indexing of messages, and when they do make them write to a universally (among your software) readable index file so you don't do any work more than you have to.

    Whatever you do don't use a database to handle your mail. It may work fast initially but I can promise you with any regularly available database that you will start getting very slow with more than 200000 users.

    Use sendmail for your outgoing mail servers. Have dedicated servers compiled with only what is needed to function as a mail server. Rotate what smtp servers your mail clients use and expect to have a whole lot of mail that is going to wrong addresses and mail servers that have really slow connections, so have a few fallback servers that have much larger timeouts that the primary servers send the mail to once they are unable to send it themselves. That way the majority of the mail gets out fast.
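
    The fast tier / slow tier idea can be sketched as below. The host names, the timeout values and the send callable are assumptions; in practice send would wrap whatever actually talks SMTP to the relay.

```python
from typing import Callable, Sequence, Tuple

Tier = Tuple[str, float]  # (relay host, timeout in seconds)

# Hypothetical tiers: primaries give up quickly, fallbacks wait much longer.
TIERS = [
    ("smtp-fast.example.internal", 30.0),
    ("smtp-slow.example.internal", 600.0),
]


def relay(message: bytes,
          send: Callable[[str, float, bytes], None],
          tiers: Sequence[Tier] = TIERS) -> str:
    """Try each tier in order and return the host that took the message."""
    last_error = None
    for host, timeout in tiers:
        try:
            send(host, timeout, message)
            return host
        except OSError as exc:  # includes TimeoutError and connection errors
            last_error = exc
    raise RuntimeError("all relay tiers failed") from last_error
```

    Because the primaries give up after a short timeout, slow or dead destinations only tie up the fallback tier and the bulk of the mail still leaves quickly.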

    You can honestly write all of this from scratch in a few months. Writing it from scratch gives you the functionality required for your situation. Look at bug databases for similar programs and make sure your software doesn't have similar exploits. Learn from the mistakes of others. Don't use windows for any servers (not trying to be biased) except maybe on your RAID arrays. Read the writeup (don't have the link) about the issues that the hotmail team had when converting hotmail to windows.

    Remember, a lot of people don't check their mail frequently, don't waste CPU time on something that is not immediately necessary.

    --
    ---------
    Swearing is the crutch of inarticulate mother fuckers.
  173. Easy... by Anonymous Coward · · Score: 0

    A couple of V880's can handle 1,000,000 email users easy.

  174. Qmail!! by mnmn · · Score: 1, Interesting

    Qmail is best. Preferably on a FreeBSD server. So hard to kill it in any way.

    Get a server with RAIDed SCSI disks preferably hot-pluggable. Install FreeBSD, Qmail and other packages you might need as you go.

    Ideally keep the emails in a Maildir format.

    I dont know where the Novell idea came from.

    --
    "Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
    1. Re:Qmail!! by Pharmboy · · Score: 5, Insightful

      A single server? For one million users?

      Insert "imagine a beowolf of those" joke here, except it isn't a joke.

      I think you might be underestimating the requirements for this large a project that "must scale perfectly". The "99.9% uptime is expected" requirement alone requires multiple internet connections, a large cluster of front end servers, and redundant database servers, preferably located in different states. (ie: "What do you mean our only server is in New Orleans?")

      I don't think the average Dell dual Xeon box is up to the task for this large a project...

      --
      Tequila: It's not just for breakfast anymore!
    2. Re:Qmail!! by Cruithne · · Score: 0

      I'm sorry but it just bugs me how many people throw around uptime figures without knowing what they mean.

      99.9% uptime allows for almost 15 minutes of downtime a day. Even for a mom n pop business, that is becoming unnacceptable.

      I'm questioning who is submitting the post and if they are really "in charge" of this project... an email system this large should be designed to have "five nine's" of reliability - 99.999% or less than 8.64 seconds of downtime average per day.

      There really isnt much of an excuse to not design a system for five nine's, as the costs of designing a system to that level of reliability are 1) In the short term, negligible, and 2) In the long term, FAR outweighed by the benefits, especially when you consider the lower labor costs due to the system being largely self-sufficient - at least to a higher degree than a system designed to be any less reliable.

    3. Re:Qmail!! by Allador · · Score: 5, Informative

      No. 0.1% != 0.1

      365 days * 24 hrs/day = 8760 hours per year

      0.1% downtime = 0.001 downtime

      8760 * 0.001 = 8.76 hrs

      You're off by two orders of magnitude.

      8.76 hrs / 12 months = 0.73 hrs/month = 43.8 minutes/month

      One 45 minute scheduled downtime (assuming its scheduled) per month isnt terrible. It's not great, but costs really start to go up as you add nines beyond those 3.
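
      For anyone who wants to check the numbers in this thread, the same arithmetic as a small function. It assumes a 365-day year and twelve equal months; the function name and the dictionary keys are made up for illustration.

```python
def downtime_budget(uptime_percent: float) -> dict:
    """Allowed downtime for an uptime target, using a 365-day year."""
    down_fraction = 1.0 - uptime_percent / 100.0
    hours_per_year = 365 * 24 * down_fraction
    return {
        "hours_per_year": hours_per_year,
        "minutes_per_month": hours_per_year * 60 / 12,
        "seconds_per_day": 24 * 3600 * down_fraction,
    }
```

      Calling downtime_budget(99.9) gives roughly 8.76 hours a year, 43.8 minutes a month and 86.4 seconds a day; downtime_budget(99.999) gives roughly 0.864 seconds a day.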

    4. Re:Qmail!! by Zarel · · Score: 3, Informative
      The "99.9% uptime is expected" suggests a fixation with Windows NT on flaky servers. 99.9% equates to 876 hours of outage a year. Quite frankly the requirement for 99.9% availability suggests the enquirer does not know what they are talking about.
      That's not right... 99.9% uptime means 1/1000 of the time is outage. 1/1000 is less than 1/365, so 99.9% uptime is less than 24 hours a year, not 876. It's actually somewhere around 8. I suppose you meant 8.76 hours of outage?
      --
      Want a high quality FOSS RTS game? Try Warzone 2100!
    5. Re:Qmail!! by LurkerXXX · · Score: 2, Informative
      What???

      Check your math before telling him he doesn't know what he's talking about.

      It's 8.76 hours of outage a year.

    6. Re:Qmail!! by Kyosuke77 · · Score: 1

      Hmm...

      365(days/year) * 0.999 * 24(hours/day) = 8.76(hours/year)

      I can't pass judgement without knowing how you did your calculations, but there seem to be some issues with the orders of magnitude you're dealing with. Perhaps you should check them.

      --
      GET THEM INSIDE THE VAULT!
    7. Re:Qmail!! by Kyosuke77 · · Score: 1

      Oops, guess I should have checked mine too. That should be:

      365(days/year) * 0.001(downtime) * 24(hours/day) = 8.76(hours of downtime/year)

      --
      GET THEM INSIDE THE VAULT!
    8. Re:Qmail!! by ObjetDart · · Score: 3, Funny
      99.9% uptime allows for almost 15 minutes of downtime a day. Even for a mom n pop business, that is becoming unacceptable.

      Yeah. Well, if 1 minute, 26 seconds is "almost" 15 minutes, anyway.

      --
      I read Usenet for the articles.
    9. Re:Qmail!! by Shambhu · · Score: 1

      Your argument would be better if you could do the math. Look at the sibling posts to get a clue.

      99.9% uptime = .999 uptime = .001 downtime

      Similarly,

      99.999% uptime = .00001 downtime

      Or, 86.4 and .864 seconds a day, respectively.

      --
      Rome wasn't bilked in a day.
    10. Re:Qmail!! by Cramer · · Score: 1

      The "99.9% uptime is expected" requirement alone requires...

      For something of this size, yes, it'll take some planning and more than one machine. But just for 3 nines, a single machine can do that. I have one mail server that's been running continuously for almost two (2) years (years)... it hasn't been rebooted since the day it was installed, and has been unavailable to its users for a total of a few hours (2, 3, ? -- the longest was about 30min to rebuild the mail store after a db file was damaged.) And btw, it's a dell 4600 DESKTOP PC.

    11. Re:Qmail!! by Cruithne · · Score: 1

      Correction (missed a 0 when calculating) - 99.9% uptime allows for 1.5 minutes of downtime a day. Still not acceptable for an operation of this magnitude, so we're looking at four nines, or less than 8.64 seconds, blah blah blah..

    12. Re:Qmail!! by Anonymous Coward · · Score: 1, Funny

      I think you mean "What do you mean our ***CARRIER LOST***"

    13. Re:Qmail!! by Hork_Monkey · · Score: 1

      And you are trying to compare a Go-Kart to a Freight Train.

    14. Re:Qmail!! by Anonymous Coward · · Score: 0

      ahoy!

    15. Re:Qmail!! by illumina+us · · Score: 0, Redundant

      99.9% Uptime allows for 86.4 seconds of downtime daily. So that's approximately 1.5 minutes of downtime a day or 8.76 hours of downtime a year. I doubt this would get noticed unless all or a good deal of those 8.76 hours happened in one day or one week. You are suggesting a 99.999% uptime rating or 0.864 seconds of downtime a day, or just 5.256 hours of downtime a year. Sorry, that's just not realistic. Your UPS will most likely fail before that kind of uptime is achieved. 3 nines are sufficient. Check my math if you want:

      100.00% - 99.90% = 0.1% = 0.001
      24 * 60 * 60 = 86400 (seconds in a day)
      0.001 * 86400 = 86.4 (seconds of downtime a day)
      86.4/60 = 1.44 (minutes of downtime a day)
      1.44 * 365 = 525.6 (minutes of downtime a year)
      525.6/60 = 8.76 (hours of downtime a year)

      --
      -illumina+us "I put on my robe and wizard hat..."
    16. Re:Qmail!! by Loconut1389 · · Score: 4, Funny

      You used to work for NASA right?

    17. Re:Qmail!! by MyEyesTheyBurn · · Score: 1, Interesting
      I have to agree with QMail, I've seen it scale nicely - but not on one machine. You would really need a large cluster of machines - Perhaps the following:

      - 4-5 core machines all running heartbeat, and DRBD or NFS
      - Then several Machines for POP, IMAP, and Webmail (NFS the maildirs)
      - Then several SMTP servers.

      Something similar, but greatly scaled, like this: http://shupp.org/maps/ispcluster.html

    18. Re:Qmail!! by nagizli · · Score: 2, Insightful

      While debating how much time the downtime takes, which is completely worthless, I'd rather you skim through the specs of FreeBSD & Qmail if they exist. I'd also look for companies which provide installation and support of FreeBSD and consult them on subject of how much this installation could cost or something. I'd also look for successful projects with Qmail & FreeBSD.

      I'd take into consideration the fact that UNIX-based solutions are far more lightweight than ones of MicroSoft so you have no idea of what you're talking about unless you managed one yourself. Before debating on how long 0.01% downtime is, I'd rather you consider other numbers which are of much more importance to you now.

    19. Re:Qmail!! by Anonymous Coward · · Score: 0

      ...except that shit happens and because it worked for you doesn't mean it won't happen to someone else. Also saying something can do something is kind of worthless; falling out of an airplane from a few thousand feet without a parachute may not kill you, however it's not something I'd advocate doing.

    20. Re:Qmail!! by Anonymous Coward · · Score: 0

      from experience with huge amounts of accounts and then multiplying those situations to fit yours i'd say FreeBSD or NetBSD cluster of at least 150 machines with at least 3GHz xeons in each and 4GB of ram in each

      sendmail or qmail doesn't matter but i honestly don't know what to say about pop and imap, been using cyrus myself but i have no experience of cyrus working in those large environments

    21. Re:Qmail!! by nickco3 · · Score: 1

      One server? I've seen an email infrastructure that was designed for 1.5 million people and with scalability in mind. The authentication sub-system alone was a 6 server LDAP cluster.

      --
      -- Nick "Hallo this is Beel Gates, und I pronounce weendows as ... WEENdows"
    22. Re:Qmail!! by Sinus0idal · · Score: 1

      And in that case, it must be either due to some hotfixes (NT) or a kernel upgrade (unix).

    23. Re:Qmail!! by Anonymous Coward · · Score: 0

      0.864 seconds of downtime a day, or just 5.256 hours of downtime a year

      Check my math if you want

      OK then...
      0.864 seconds * 365 = 315.36 seconds or 5.256 minutes a year.

      Sorry to be a pedant, but you asked for that.

      Oh, and if your UPS(es) can't maintain that, you've got a problem.

    24. Re:Qmail!! by -brazil- · · Score: 1

      1.5 minutes of downtime a day. Still not acceptable for an operation of this magnitude

      What does "magnitude" have to do with acceptable downtime? I don't know about you, but I and everyone I know can most certainly live a couple of minutes without email. Considering what it costs to really get 99.9% uptime, I'm quite sure most companies would decide that 99% is actually peachy keen.

      --

      The illegal we do immediately. The unconstitutional takes a little longer.
      --Henry Kissinger

    25. Re:Qmail!! by Anonymous Coward · · Score: 0

      It's not great, but costs really start to go up as you add nines beyond those 3.

      Rule of thumb being: costs double for each nine being added.
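      To see what that rule of thumb implies, here is a rough sketch that pairs each extra nine with its yearly downtime budget. The doubling factor is only the rule of thumb quoted above, not measured data, and the three-nines baseline and function names are assumptions for illustration:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525600, ignoring leap years


def relative_cost(nines: int, baseline_nines: int = 3) -> float:
    """Cost relative to the baseline, doubling for each nine added."""
    return 2.0 ** (nines - baseline_nines)


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime for an availability with the given number of nines."""
    # Three nines means 99.9% uptime, i.e. a downtime fraction of 10**-3.
    return MINUTES_PER_YEAR * 10.0 ** (-nines)


for n in (3, 4, 5):
    print(
        f"{n} nines: {downtime_minutes_per_year(n):.2f} min/year of downtime, "
        f"{relative_cost(n):.0f}x the cost of three nines"
    )
```

      Under that assumption, each step buys a tenfold cut in downtime for twice the money.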

    26. Re:Qmail!! by sim82 · · Score: 1

      Well, I said this in the past and I have to say it again: let google do the math before you post any calculations on slashdot:
      (100 - 99.9)% of (1 day) = 1.44 minutes

    27. Re:Qmail!! by Wdomburg · · Score: 1

      That's something we in the industry like to call luck. Not the best reliability plan.

    28. Re:Qmail!! by Wdomburg · · Score: 1

      Erm, I've run mail for 13 million users on about 60 machines. The average configuration was dual Pentium II 450s and 256MB of RAM.

    29. Re:Qmail!! by Anarke_Incarnate · · Score: 1

      No, a dual Xeon box from Dell is surely not the answer. However, Linux or BSD running on about 6 HP DL585s should be plenty. You would want to have at least 1 spare sitting on a shelf too, so make it 7. Then you would need some sort of fibre box for it to store the data, and a tape backup system. I would suggest LTO2 or LTO3 for that. An ML6000 tape backup unit would do it for you. Choose the backup software solution of your choosing and figure on another 2 DL385s for backups alone. You would also want gigabit switches for that type of load. Like you said, however, it would be definitely nice to have the same setup in at least another location. If it has to be the same state, then make sure you have a long distance between the two. You would probably also want a dedicated pipe to them. This shit does not look cheap.

    30. Re:Qmail!! by Anarke_Incarnate · · Score: 1

      Btw, this was just for the back end.

    31. Re:Qmail!! by arkanes · · Score: 1

      If you've had 3 hours of downtime in less than 2 years, that's not even 99.9% reliability. 3 9s is about 52 minutes of downtime per year. Of course, since you have no plans or redundancies in place for failures, your current uptime is basically luck rather than the result of expertise and planning. Note that if you'd had even 1 other machine clustered in, you could have made 99.9%, because you wouldn't have lost service while you rebuilt the mail store.

    32. Re:Qmail!! by HuguesT · · Score: 1

      1day = 86400s * 0.001 = 86.4s i.e 1mn 24s.

      '5 nines' would be 0.9s a day or roughly 5 minutes a year. You'll find that '5 nines' systems are in fact quite expensive.

      Please check your math before attempting to lecture everybody.

    33. Re:Qmail!! by Cramer · · Score: 1

      Like I said, for millions of users, you're asking for trouble putting them all on one box; but one does not need million dollar "enterprise" hardware to have stability and reliability. Just look at google's search engine farm...

      For my little mail server, even if it did catch fire, I could build a new one in under an hour. This is no different than a cluster of machines... if one dies, you simply replace it and move on. (with a properly oversized cluster nobody will notice one machine's failure.)

      I don't know what "industry" you live in, but I've not seen anyone build a mail cluster[*]. NCSU is the closest, but that was simply departmentalized servers; the only reason any department had more than one server was due to storage not scalability or reliability. (And that was for ~25k people.) [The largest I've worked with had just shy of 70k mailboxes across about 5k domains... on ONE server. It could've handled much more than that.]

      [*] Well, not one that actually works... RR has several that significantly delay messages passing through it. Network Solutions has a bunch of "mail appliances" that lose 30% of the messages and delay the other 70% for hours (multiple documented cases of 13-15hrs.)

    34. Re:Qmail!! by Bubba-T · · Score: 1

      There is a BIG difference between server downtime and SERVICE downtime. If you have enough servers you could have every server down at some point through the day and not affect the service.

      Make it a webmail system, front end a bunch of web servers with a load-sharing device (ArrowPoint or similar). Back end it with a bunch of mail servers with NAS attached disk.
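      The server-versus-service distinction can be put in numbers. A minimal sketch, assuming servers fail independently and the service stays up as long as at least one of them is up; it ignores shared components such as the load-sharing device and the NAS, which would need their own redundancy:

```python
def service_availability(server_availability: float, servers: int) -> float:
    """Availability of a pool in which any single server can carry the service.

    Assumes independent failures: the service is down only when every
    server is down at the same moment.
    """
    all_down = (1.0 - server_availability) ** servers
    return 1.0 - all_down


for count in (1, 2, 3):
    pooled = service_availability(0.99, count)
    print(f"{count} server(s) at 99% each -> {pooled * 100:.4f}% service availability")
```

      Real failures are rarely fully independent (shared power, shared storage, the same bad patch), so treat the result as an upper bound.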

    35. Re:Qmail!! by onepoint · · Score: 1

      Qmail is most likely the best option, since it is very scalable.
      the web site for qmail is :
      http://www.qmail.org/top.html

      you are going to have to add this patch for more than 256 connections ( which you will need for safety's sake and scalability )
      http://www.qmail.org/big-concurrency.patch

      You are going to need to add preventive measure ...double email bouncing script http://www.30below.com/~zmerch/qmail/spambad.cfm

      there are tons of patches and how-tos for spam reduction.

      read this http://www.lifewithqmail.org/ldap/
      to get some better understanding of qmail

      Now onto the server side .... well I use the basic thinking that each user will use 1 to 3 meg of space before downloading to there outlook account. You have some history, so check what the average file space used per user is. next don't forget to find out what the company's e-mail policy is ( do they have to save e-mail for xyz amount of time, back-up policies ... ).

      next don't forget that no matter what, each user gets 3 pieces of e-mail per day ( that's my number that I use for configuring the server ) ... so with your needs you'll require a 2 cpu system ( of which you'll share the spam software ) and an excess of ram ( to run the dns blacklisting or other cpu/ram intensive operation ).
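      As a rough sanity check on that sizing number, here is a sketch that turns messages per user per day into a message rate. The one million users come from the original question; the peak factor of 5 is purely an assumed value for illustration, since real traffic bunches up during business hours:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def average_messages_per_second(users: int, messages_per_user_per_day: float) -> float:
    """Average inbound rate if mail arrived evenly around the clock."""
    return users * messages_per_user_per_day / SECONDS_PER_DAY


def peak_messages_per_second(average_rate: float, peak_factor: float) -> float:
    """Scale the average by an assumed peak-to-average ratio."""
    return average_rate * peak_factor


average = average_messages_per_second(users=1_000_000, messages_per_user_per_day=3)
peak = peak_messages_per_second(average, peak_factor=5)  # peak_factor is a guess
print(f"average: {average:.1f} msg/s, assumed peak: {peak:.1f} msg/s")
```

      Spam is not counted here, so the real inbound rate would likely be a good deal higher.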

      File server... that's open, my thinking would be a true raid 5 system, hot swappable, build it yourself. here is a link to a do it yourself terabyte server for under 10K way back from 2002 and posted at that time on slashdot http://home.fnal.gov/~yocum/storageServerTechnical Note.html or http://www.accs.com/p_and_p/TeraByte/index.html that should help you along the way

      best of luck and enjoy

      Onepoint

      --
      if you see me, smile and say hello.
    36. Re:Qmail!! by hensley · · Score: 2, Informative

      Just do the math:

      1 yr = 24 * 365 h = 8760 h
      99.9% reliability => 8760h * 0.999 uptime = 8751.24 hours uptime or 8.76 hours downtime
      similarly 99.99% leads to 0.876 hours downtime = 52.56 minutes

      you're off by one magnitude!

    37. Re:Qmail!! by Cramer · · Score: 1

      It's not NT (NT would've rebooted on its own when the "uptime" counter rolled over.)

      At any rate, this is a common misconception. The UNIX(tm) kernel does not require upgrading/patching every 3.2 days. Obviously, the box is working just fine with the 2yo kernel. Yes, there are some kernel issues one could trip if one could actually get to the box... firewalls, idiots!

      "Don't fix it unless it's broke."

      [BTW, I have other machines running MUCH older kernels without any issues (it runs a lot of stuff)...
      [root:ttyp0]dominion:~/[9:10am]:uname -a
      Linux dominion 2.3.42-SMP #11 SMP Sun Feb 6 20:06:02 EST 2000 i686
      ]

    38. Re:Qmail!! by Anonymous Coward · · Score: 0

      That math is in German. Could someone please translate to English?

    39. Re:Qmail!! by Anarke_Incarnate · · Score: 1
      A NAS for 1mil users? I guess they can go make a pot of coffee while they try and pull up their email.

      Email is almost always (almost, almost) held in a database type format. They do NOT scale well on a NAS

    40. Re:Qmail!! by dildatron · · Score: 1

      I would get scared just thinking about putting one million users on a home-made storage system, on IDE drives. It is a potential disaster given the reliability. You can't go cheap on the storage here, as the databases would require some decent bandwidth. A mid-range array would probably fit the bill, attached to a small SAN with two redundant fabrics to meet the 3 nines uptime.

      --


      If you had nuts on your chin, would they be chin nuts?
    41. Re:Qmail!! by codeguy007 · · Score: 1

      While Qmail is probably the fastest sending email server going, how do you expect it to replace Exchange? Exchange has features Qmail doesn't touch, like collaboration and calendaring.

    42. Re:Qmail!! by onepoint · · Score: 1

      I would use those links as a guideline. Building a 99.9% file server is easy; with the ability to find hard drive reviews and other system card reviews, a simple guideline would make it worthwhile. Worst case, it becomes a backup server.

      --
      if you see me, smile and say hello.
    43. Re:Qmail!! by codeguy007 · · Score: 1

      hmm last time I checked 86.4s is not 1min 24s but 1 minute 26.4 seconds.

    44. Re:Qmail!! by pyite · · Score: 1

      For my little mail server, even if it did catch fire, I could build a new one in under an hour. This is no different than a cluster of machines... if one dies, you simply replace it and move on. (with a properly oversized cluster nobody will notice one machine's failure.)

      What happens when you're half a world away and your server dies? Who fixes it? It's completely different from a cluster. Clusters are designed to have redundancy built in. Sorry, but anyone with your view of "stability" and "reliability" hasn't a clue about running services that need to stay up.

      --

      "Nature doesn't care how smart you are. You can still be wrong." - Richard Feynman

    45. Re:Qmail!! by Anonymous Coward · · Score: 0

      you fucking homo.

      0.1% is 0.1

      0.001 = 0.001

      how the fuck are you able to survive on a daily basis?

    46. Re:Qmail!! by Cramer · · Score: 1

      You forget people are part of the equation (as is money.) It doesn't matter how many machines there are in the cluster when there's only one admin capable of managing the thing. (We in the industry refer to this as the bus problem... "... and what happens when you are hit by a bus?" (eaten by a moose, whatever.))

      We're talking about email systems here, so it's extremely likely, even in a clustered environment, users will have their spool on a single server -- it's difficult (and cumbersome) to have mirrored redundancy for email... (multiplied by 1 million users...) This isn't a web farm where it's ok for updates in one place to take several minutes to appear across the farm. [A mirrored SAN is a f'ing expensive solution to a "100 year" event.]

      (Actually, I have quite a bit of clue. However, I've been around long enough to be frugal in planning and building systems.)

    47. Re:Qmail!! by pyite · · Score: 1

      Yes, I know of the "bus problem," which is why redundancy is useful so that in the event that someone doesn't at least have a vague idea of how you have things set up (and someone should in a place that needs 1,000,000 accounts), they have some time to figure out what it is you're doing rather than watching your "nines" trickle away as they stare at a burning heap of plastic and solder.

      I guess we come from different scenarios. I'm accustomed to things like a database being stored on a SAN with its transaction log on another disk which is also mirrored miles away (yay dark fiber) and then tapes of the database are taken offsite on a regular basis.

      No doubt things are difficult. No one said managing 1,000,000 email accounts was supposed to be easy. But if it needs to be reliable, it needs to be reliable.

      --

      "Nature doesn't care how smart you are. You can still be wrong." - Richard Feynman

    48. Re:Qmail!! by Cramer · · Score: 1

      That's 99.983% reliability, btw. (People really need to get out pencil and paper to do their math 'cause they are getting everything wrong when they aren't looking at it.)
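      For anyone who wants to check that figure, here is the pencil-and-paper version as a short sketch. It assumes the 3 hours of downtime over 2 years mentioned earlier in the thread and a 365-day year:

```python
def observed_availability_percent(downtime_hours: float, period_years: float) -> float:
    """Availability actually achieved, given total downtime over a period."""
    total_hours = period_years * 365 * 24
    return 100.0 * (1.0 - downtime_hours / total_hours)


achieved = observed_availability_percent(downtime_hours=3, period_years=2)
print(f"{achieved:.3f}% availability")
```

      Three nines over the same two years would have allowed about 17.5 hours of downtime.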

      I said no such thing... There are indeed "plans" in place for failures. They don't involve dropping thousands of dollars on hardware and software that are not necessary. The fact that the system has been working for 2 years is proof it's not necessary.

      If the server dies, move the drive(s) to another box. There's a mirrored set of drives inside there -- so a dead drive doesn't take the files with it. Linux/FreeBSD is very forgiving of being moved to wildly different base hardware; and any incompatibilities are quickly and easily corrected. Plus, the server is backed up to tape (as are a number of other systems), so at worst, one day of "stuff" would be lost. The entire cyrus/sendmail combination can be dropped on just about any system in the building (even inside a vmware if necessary.)

      Btw, replication doesn't necessarily prevent replicating errors. The major downtime was due to a dbm error, so it would've happened on every box.

      [The only "luck" part is that its drives are still good. Unlike all the other machines...]

    49. Re:Qmail!! by Cramer · · Score: 1

      And such things are very Not Cheap(tm). Only the bean counters can say if it's a good investment against possible downtime. Lemme ask, how often has that backup SAN and off-site tape storage been of utility?

      Of course, I've lived in a world where "uptime" and "reliability" aren't always absolutes... if one mail server is down and 1000 of those million users cannot get their mail, is that "down"? (answer: no. the 'system' is up; your mail is 'unavailable'.) And the number of nines is measured within "business hours" -- if the client uses the system between 8am-8pm, we could turn the datacenter off between 8pm-8am and still have "five nines". (maint. windows don't count as downtime.)

    50. Re:Qmail!! by Anonymous Coward · · Score: 0

      0.1% is 0.1

      wow, moron alert!

      yo, baby boy, what do you think "%" stands for? literally, "divided by 100" or "parts from 100" - that would make 0.1% = 0.1 * % = 0.1 * 1/100 = ... guess how much?

      How the fuck are you able to survive on a daily basis? My 2% of 1$ is on "poorly"

    51. Re:Qmail!! by Anonymous Coward · · Score: 0

      owned

    52. Re:Qmail!! by Anonymous Coward · · Score: 0

      step one is not listening to anyone who responds that doesn't know the difference between "there" and "they're" or suggests a single server (or building it yourself).

      -j

    53. Re:Qmail!! by Wdomburg · · Score: 1

      For my little mail server, even if it did catch fire, I could build a new one in under an hour. This is no different than a cluster of machines... if one dies, you simply replace it and move on. (with a properly oversized cluster nobody will notice one machine's failure.)

      An hour of downtime? That's incredibly unacceptable.

      I don't know what "industry" you live in, but I've not seen anyone build a mail cluster[*]. NCSU is the closest, but that was simply departmentalized servers; the only reason any department had more than one server was due to storage not scalability or reliability.

      Among other things, large scale e-mail hosting.

      NCSU is the closest, but that was simply departmentalized servers; the only reason any department had more than one server was due to storage not scalability or reliability. (And that was for ~25k people.)

      That's great. Some of us have SLAs.

      [The largest I've worked with had just shy of 70k mailboxes across about 5k domains... on ONE server. It could've handled much more than that.]

      We've had mail domains as large as a million users. I know how much you can shove on a single box. That doesn't change the fact that if your single point of failure goes, you lose everything.

    54. Re:Qmail!! by Sinus0idal · · Score: 1

      Yes, it's not to say that it shouldn't be done, just that security is all about as many barriers as possible. Being up to date means if someone bypasses your firewall or your firewall malfunctions, there is another line of defense, surely?

    55. Re:Qmail!! by Robert+The+Coward · · Score: 1

      Sorry but I don't think so. Granted I could use more beefed up servers, but I have 3 SMTP servers, 2 imap/pop servers and 1 ldap server for fewer than 10,000 users. I would assume based on the web-mail and imap requirement that mail will be stored on the server. Some users will keep their mailboxes nearly empty and use only a few megs, but some users will use every drop of space you give them. My company's servers have limits set to 1 Gig with a few users set larger. Avg users sit just under 500 Megs of mail after 1 year of use; that is up from 300 Megs six months ago. 1st thing you need to do is get a handle on how much space is in use now and how much is likely needed in the next 18 Months. At one place I worked 75 Megs was fine for everyone but they had 3 Month auto purge on all users; at my current place emails date back years and email is kept for long periods of time. You need to figure out what your max is now and add some for good measure or you will find a world of hurt after the conversion.
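      To get a feel for the scale, here is a back-of-the-envelope sketch that projects those per-user numbers forward 18 months and multiplies by one million accounts. It assumes linear growth, a hard 1 Gig quota, decimal units, and that usage at a sub-10,000-user site carries over to the larger system, none of which is guaranteed:

```python
def projected_mailbox_mb(
    current_mb: float, mb_six_months_ago: float, months_ahead: int, quota_mb: float
) -> float:
    """Extrapolate average mailbox size linearly, capped at the quota."""
    monthly_growth_mb = (current_mb - mb_six_months_ago) / 6
    return min(current_mb + monthly_growth_mb * months_ahead, quota_mb)


def total_storage_tb(users: int, mailbox_mb: float) -> float:
    """Raw mail store size in decimal terabytes (1 TB = 1,000,000 MB)."""
    return users * mailbox_mb / 1_000_000


today = total_storage_tb(users=1_000_000, mailbox_mb=500)
future_mailbox = projected_mailbox_mb(
    current_mb=500, mb_six_months_ago=300, months_ahead=18, quota_mb=1000
)
future = total_storage_tb(users=1_000_000, mailbox_mb=future_mailbox)
print(f"today: {today:.0f} TB, in 18 months: {future:.0f} TB (before RAID and backups)")
```

      Free and personal accounts would likely carry much smaller quotas, so measure the existing system before trusting any of this.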

      I have a few basic questions: what is your budget? I looked around and found lots of choices out there. Some of those choices are very expensive.

      How much in house knowledge do you have. Are they going to need to pay for service contracts or hire several people in house to handle these systems?

      Outside of email and web-mail, what extras are you going to need? Support for Outlook calendars? etc.

      How many offices are involved? Bandwidth inside and outside of the office. How much mail with their current system stays inside the site vs. inside the rest of the company vs. the Internet as a whole? It might make sense to put several systems in remote sites and have local admins that do more than email, compared to one larger site with several people who do only email.

      And no matter what, be prepared for the complaints, as some people will complain even if you mailed them their checks at home and never made them come to work and/or do any work.

    56. Re:Qmail!! by Cyberdyne · · Score: 1
      I don't know what "industry" you live in, but I've not seen anyone build a mail cluster[*]. NCSU is the closest, but that was simply departmentalized servers; the only reason any department had more than one server was due to storage not scalability or reliability. (And that was for ~25k people.) [The largest I've worked with had just shy of 70k mailboxes across about 5k domains... on ONE server. It could've handled much more than that.]

      The guys behind Exim (the University of Cambridge Computing Service) have such a cluster, known as Hermes - a couple of years ago (when I was there) it entailed four Sun E450s (named after colours - red, yellow etc) in front of a pair of NetApp Filers (named black and white IIRC) and an authentication master (named prism). There was a bit of downtime when one of the Filers had a multiple disk-failure (or rather, wrongly believed it had), but that was it - everything else had no single point of failure available, except if prism was down you couldn't change passwords.

      Expensive, of course, but reliable and fast despite enormous load and a plethora of services (being geek-friendly, you could get your mail via anything from Pine over SSH, SCP, FTP, Telnet, IMAP, POP3 or webmail). Very nice setup - beats the ... out of the GroupWise mess my current employer (another university) uses...

  175. Outbound queues by dskoll · · Score: 2, Insightful

    You probably want a FallbackMX host (or a bank of them) so backed-up outbound queues don't interfere with normal outbound processing.

    The FallbackMX hosts can use a file system optimized for directories with lots of files in them (and can of course themselves be tuned as the parent poster suggested.)

  176. Netscape Directory Server? by KidSock · · Score: 1

    Perhaps you should look at hanging vanilla SMTP and IMAP servers off Netscape Directory Server (now Fedora Directory Server)? I think it supports multimaster replication and the code is OSS so if it doesn't work you can get inside and adapt it.

  177. Re:I can see the conversation that started this... by QuantumG · · Score: 1

    I'd shorten the conversation to this:

    ThisClown: Hmm.. what stupid question can I post to Slashdot to get me on the front page.

    [Light bulb goes on over his head]

    ThisClown: Aha! I can ask how to make an email system to support a large number of users! Yes!

    [Posts to Slashdot]

    Editor: Shit, I've gotta do 15 minutes of work or my fellow "editors" will know I've been slacking off all week. What's in the queue? Hmm... slightly technical question about email systems. Well, I'm not too familiar with this whole "email" thing, but what the hell, seems like it would be good for some page hits.

    [Posts to Front Page]

    --
    How we know is more important than what we know.
  178. See what others are using by henry.thorpe · · Score: 2, Insightful

    I'd start by seeing what the big ISPs are using.

    That's a matter of doing an mx lookup, telneting to one of their gateways on port 25, and seeing if you can infer from their banners what mail system they are running (for the inbound smtp gateways, anyway-- since there's nothing to prevent them from layering different products). Look to mailing list archives for messages sent from the various domains, and see what the headers tell you about their outbound mail path.

    Example: Inbound Comcast HSI:

    $ dig comcast.net mx
    ;; ANSWER SECTION:
    comcast.net. 250 IN MX 5 gateway-r.comcast.net.
    comcast.net. 250 IN MX 5 gateway-r.comcast.net.

    $ nc -vv smtp.comcast.net 25
    Connection to smtp.comcast.net 25 port [tcp/smtp] succeeded!
    220 comcast.net - Maillennium ESMTP/MULTIBOX sccrmhc14 #274

    So, they use something claiming to be 'Maillennium'.
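    If you'd rather script the banner check than type it by hand, a minimal sketch using only the Python standard library might look like this. The function names are made up for illustration, the MX lookup is left to dig, and the host you pass in is up to you:

```python
import socket


def parse_banner(line: str) -> tuple[int, str]:
    """Split an SMTP greeting line into (reply code, remaining text)."""
    line = line.strip()
    if len(line) < 3:
        raise ValueError("no SMTP greeting received")
    # The first three characters are the reply code, e.g. 220.
    return int(line[:3]), line[4:].strip()


def grab_banner(host: str, port: int = 25, timeout: float = 10.0) -> tuple[int, str]:
    """Connect to an SMTP port and return the first greeting line, parsed."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        with conn.makefile("r", encoding="ascii", errors="replace") as reader:
            first_line = reader.readline()
    return parse_banner(first_line)


code, text = parse_banner("220 comcast.net - Maillennium ESMTP/MULTIBOX sccrmhc14 #274\r\n")
print(code, text)
```

    Calling `grab_banner()` on a gateway name from the MX answer does the same thing as the nc session above. Many networks block outbound port 25, so a timeout does not necessarily mean the host is down.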

    If you do this for AOL, you'll see some weird-looking, probably custom AOL gateway. Earthlink says something like:
    'ESMTP EarthLink SMTP Server', AT&T WorldNet is also Maillennium, Verizon.net declares 'MailPass SMTP server v1.2.0', and so on.

    If you really wish to probe to see if this is opensource-ish stuff with obfuscated banners, you can try fingerprinting them using smtpscan <http://www.greyhats.org/outils/smtpscan/> to find out that it's really just Postfix or Sendmail hiding behind that custom 220 banner. Actually, the smtpscan fingerprint file is an interesting read all by itself...

  179. 99.9% uptime?!? by Anonymous Coward · · Score: 0

    That's over EIGHT HOURS of downtime a year! Shit, Exchange should be fine for that job! I doubt any other webserver could have downtime even in the vicinity of what you're talking.

    1. Re:99.9% uptime?!? by Anonymous Coward · · Score: 0

      We get 99.999 uptime. We are allowed about five minutes a year.

  180. I question your taste in women. by Anonymous Coward · · Score: 0

    I'm not turning to stone at the sight of her, but Julie Farris also does not come anywhere close to my definition of a "babe".

    Maybe she cleans up nice, but based on everything Google Images can dig up... not a babe.

  181. More specific? by Grendel+Drago · · Score: 2, Interesting

    Could you be a bit more specific on the following items?

    5) Breaks well-known and understood UNIX standards.

    Which standards are these? Are you talking about the errno fiasco?

    6) Security through lack-of-functionality.

    What sort of functionality is provided by, say, postfix, that qmail simply won't do?

    7) Not really secure despite the claims.

    How's that? Do you have $500? If not, what's the security vulnerability that the author refuses to acknowledge?

    Which of these problems that you enumerate are not addressed by netqmail?

    --grendel drago

    --
    Laws do not persuade just because they threaten. --Seneca
    1. Re:More specific? by killjoe · · Score: 2, Interesting

      "What sort of functionality is provided by, say, postfix, that qmail simply won't do?"

      Qmail has almost no features out of the box. It can't talk to LDAP, it can't handle multiple domains, it does not reject mail for unknown users (instead it queues up a bounce message which means each spam message generates one outgoing message).
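      The bounce problem is easier to see side by side. This is a toy sketch, not qmail or postfix code: it only contrasts rejecting an unknown recipient during the SMTP conversation with accepting the message and queueing a bounce later. The user list and function names are hypothetical:

```python
KNOWN_USERS = {"alice", "bob"}  # hypothetical local mailboxes


def rcpt_response(local_part: str, known_users: set[str]) -> tuple[int, str]:
    """SMTP reply to RCPT TO when the server checks recipients up front."""
    if local_part.lower() in known_users:
        return 250, "OK"
    return 550, "No such user here"


def bounces_queued(recipients: list[str], known_users: set[str], reject_at_rcpt: bool) -> int:
    """Count outgoing bounce messages the server ends up generating."""
    if reject_at_rcpt:
        # The sending side is told 550 during the session; nothing is queued.
        return 0
    # Accept-then-bounce: one outgoing message per unknown recipient.
    return sum(1 for r in recipients if r.lower() not in known_users)


flood = ["nobody"] * 100_000
print(rcpt_response("nobody", KNOWN_USERS))
print(bounces_queued(flood, KNOWN_USERS, reject_at_rcpt=False))
print(bounces_queued(flood, KNOWN_USERS, reject_at_rcpt=True))
```

      With forged sender addresses, each of those queued bounces also lands on some innocent third party.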

      in order to get qmail to do what exim and postfix do you have to apply half a dozen patches and recompile.

      Of course unless the guy who did the compile took very careful notes you have no idea what your particular installation of qmail is capable of either.

      I inherited a qmail install one time and it was a nightmare to maintain. When somebody decided to start sending me 100 thousand emails a day to unkownuser@mydomain.com and my message queue got to be hours long, I only had two options.

      1) Gather all the patches used to build the original qmail (again no real way of knowing) and then add yet another patch and recompile.

      2) Install postfix.

      Guess what I did?
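
      The difference between rejecting during the SMTP session and queuing a bounce afterwards can be sketched roughly like this. It is not how any particular MTA is implemented; `rcpt_response` and `known_users` are hypothetical names, and lowercasing the whole address is a simplification:

```python
def rcpt_response(recipient, known_users):
    """Decide the SMTP reply to RCPT TO while the sender is still connected.

    Refusing here means the server never accepts the message, so it never
    has to queue a bounce for it.
    """
    if recipient.lower() in known_users:
        return "250 2.1.5 Ok"
    return "550 5.1.1 User unknown"


if __name__ == "__main__":
    users = {"postmaster@example.com", "alice@example.com"}
    for rcpt in ("alice@example.com", "nobody@example.com"):
        print(rcpt, "->", rcpt_response(rcpt, users))
```

      With a check like this, 100 thousand messages a day to a nonexistent user cost 100 thousand short refusals instead of 100 thousand queued bounces.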

      --
      evil is as evil does
    2. Re:More specific? by Russ+Nelson · · Score: 1

      qmail can't handle multiple domains?? Sonny, qmail was handling multiple domains when postfix was just a gleam in its father's eye.
      -russ
      p.s. Hmph, kids these days.

      --
      Don't piss off The Angry Economist
    3. Re:More specific? by lintux · · Score: 1

      How's that? Do you have $500? If not, what's the security vulnerability that the author refuses to acknowledge?

      I remember that there was an integer overflow DJB didn't want to acknowledge some time ago. IIRC it could only be exploited on machines with more than 4GB of RAM, not sure though. I know someone who did an audit on some DJB code, and it drove him mad. There are many dirty and dangerous constructions, but they're all done in such a way, that they're just exactly safe, or at least not dangerous. Continuously walking on the edge, but never falling off.. So far...

      It's kind of annoying that DJB thinks his software is so superior that he has refused to update it for years, even though there are enough problems with it. (Considering the number of patches that most qmail users have to apply before they can use (or even compile) it...)

      BTW, Grendel, aren't you the one whose survey-system I kind-of beta tested a couple of years ago? ;-)

  182. I'd start by looking at this website .. nrg78.com by Anonymous Coward · · Score: 0

    There's been some mind-blowing stuff on it recently re: corporate memory and technology, and I really enjoyed reading it. The guy is kinda funny as well. He's really looking at things in an abstract/philosophical way. He said this morning that Google's been visiting too.

    Ever heard of TAI/MAI/NAI?

    Technology augmented intelligence, machine augmented intelligence, network augmented intelligence?

    PC2?

    Me neither, before I found this blog a couple of days ago. This guy is talking like he invented them, and it sounds pretty interesting. I want to see what he says next.

    http://nrg78.com/
  183. Mod Parent Up by afree87 · · Score: 1

    I command thee!

  184. MOD PARENT UP! by lamp540 · · Score: 1

    hahahaha... that's some funny shit.

  185. Postfix+Dovecot+Squirrelmail=TheWin by Anonymous Coward · · Score: 0

    Postfix+Dovecot+Squirrelmail=TheWin

  186. Call Microsoft... by Anonymous Coward · · Score: 0

    ...and tell them they're in danger of losing your company's business.

    Ballmer will be on a plane to your location a couple hours later, and he'll have his negotiation hat on when he shows up at your door. You'll get a serious discount on upgrading your entire organization to Exchange 2003, and The Powers That Be at your company will take it, since migrating from Exchange would have been a painful mess anyway.

  187. Cyrus IMAP by Rheingold · · Score: 2, Informative

    Cyrus IMAP is designed for this size of installation. You can split the backends up with Murder on the front-ends to distribute load; divide mailboxes on each host between filesystems (which you'd presumably spread over multiple disks); use a SAN and GFS or other shared-storage cluster filesystems and share the spool among servers; or use the new pre-release 2.3 code with mailbox replication and use more discrete, commodity components. There are lots of other features that are designed for large-scale implementations.

    For authentication, of course you have choices among LDAP, Kerberos (both of which are usable even if you're stuck with a Windows domain for authentication), PAM and other things. Very flexible; too flexible for some and it can be a bit confusing.

    I've been working on rewriting the HOWTO, although I haven't made a ton of progress, it may still be useful to you: http://nakedape.cc/info/Cyrus-IMAP-HOWTO and here's a presentation I put together for Linuxfest Northwest: http://nakedape.cc/info/Cyrus-IMAP-Intro.

    You mention a million mailboxes, but that doesn't really mean much--that is just an estimate of storage requirements. What is more important to determine is how many concurrent users you will have and how much actual traffic--storage is cheap, memory not so much.

    --
    Wil
    wiki
    1. Re:Cyrus IMAP by Xenna · · Score: 1

      Of course Cyrus uses a database to store the IMAP metadata as well as index files within the 'maildirs'.

  188. start by hiring someone with experience and by bxbaser · · Score: 1

    expertise, will definitely save money in the long run.

  189. Re:Here's my plan and it's the best one you'll get by Antique+Geekmeister · · Score: 1

    You missed an important one. Round-robin DNS doesn't work that way: presented with a set of IP addresses for one hostname, it's almost entirely a client software decision on which IP address to reach out to. Couple that with DNS caching on the clients or their local DNS servers, and round-robin and DNS based failover servers can easily take more than 24 hours to reach even the next IP address of a round-robin set.

  190. Re:Obviously this idea has a problem.... by quadra23 · · Score: 1

    The obvious answer is of course : Send all those thousand employees an Gmail invite !

    So, you have 1,000 invites in your Gmail?! I have a Gmail account, but that's really amazing! I only got 100 invites; even though the number was increased, it's still 10x less than you suggest. Or did you mean invite 100 people expecting they'd each invite other people in the company?...Swell idea, but how are you going to keep track of who was invited and who wasn't?! After all that, I think you're best off doing the solution yourself :)

  191. First Class by kakkak · · Score: 1

    I hate to say this, but I would seriously consider http://www.firstclass.com/casestudies/Business/ with some sort of anti-spam http://www.barracudanetworks.com/ns/?L=en in front of it.

    The only serious problems I have with it:
    -lack of true RIM support
    -hard to find quality administrators

    But it has all the functionality you could possibly need and it Just Works.

    1. Re:First Class by dhandler · · Score: 1

      Don't hate yourself for saying it - First Class is an excellent package. I have a single dualie pizza box handling 3000 accounts with less than one FTE managing it. I use their mirroring tool to keep a warm box in sync. I am using ProofPoint out in front for spam filtering. I know many whole school systems using it for all teachers and students. Thousands and thousands of accounts - and like you said, it just works. Great fat client and a good web client. IM features, VB-like back-end scripting engine, and VM integration with your Nortel if you want. They also have an ActiveX control that essentially puts the fat client in a web page. Started using it when the post office was running on an old PowerMac with AppleTalk and a DigiBoard, and moved to the Windows 2003 server it is on now 9 years later - with many of the original mailboxes still there, carried over through the years. Palm Pilot conduit too!

  192. obviously pSeries with AIX by Anonymous Coward · · Score: 0

    IBM does not run the corporate Domino servers on Windows! they use pSeries boxes with AIX

  193. Obligatory Movie Quote by kakashiryo · · Score: 1

    I want one meeeeeeeeeeeeeellllllllioen doll... err, email accounts!

  194. How about AOL? by spewey · · Score: 1
    Yeah, AOL. You've probably got half a million free AOL diskettes already, so you only need to ask a neighbor or two. Plus only AOL e-mail announces YOU'VE GOT MAIL! with every incoming message.

    It would be awesome to send an all-employees memo out and have 1 million computer speakers announce YOU'VE GOT MAIL! simultaneously.

    If you think this is a good idea, just write at the bottom of this post

    >>Me Too!!!!!!!!

  195. Get Lotus Notes... by drsmithy · · Score: 1
    Then they'll realise how good they had it with Exchange.

    Which highlights another issue - I'd struggle to believe an existing implementation of that scale was using Exchange _only_ for email, so you're not really looking for /just/ an email system, you're probably looking for a groupware solution as well.

    This is not a trivial thing to implement, and you're highly unlikely to get much worthwhile advice from Slashdot.

    That said, the place to start is a *real* requirements specification. You need to figure out what services you need to provide, to how many users and at what availability level(s) (note that different services might have completely different userbases and requirements). Once you've done that, you have all the information you need to either research everything yourself (without using things like Ask Slashdot), or hire someone else to do it for you. But until you know exactly what it is you're trying to build, you shouldn't be asking for advice on how to build it.

  196. The very first thing I would do... by Anonymous Coward · · Score: 0

    If I were you is quit.

    I've been in over my head before. But man, you've taken "OH SHIT" to a whole new level.

    quit, you obviously can't handle this one.

  197. What version of Exchange? by bigcreek · · Score: 1

    Not advocating for Microsoft, but Exchange 2003 on the right hardware does run and scale very well, if you need the groupware features.

  198. Re:Here's my plan and it's the best one you'll get by lamp540 · · Score: 1

    How could you NOT know that was a joke?

  199. The best solution by Mungkie · · Score: 1

    I haven't read all the comments, so someone may have beaten me to the answer. But here is my answer based on the current state of the art and numerous industry studies for reliability and lowest TCO. Call Microsoft and ask for a copy of Exchange Server!!!!!!!!!!!

  200. Re:Here's my plan and it's the best one you'll get by HostGeekZ · · Score: 1

    I can do nothing but sit back and admire the humor:)

    I think I will get myself some AMD's if they are that good.

    -Scott

  201. ISPs do it by tcampb01 · · Score: 1

    There are ISP-grade products that do it. Sun has one. See http://www.sun.com/software/products/messaging_srvr/home_messaging.xml

    You need to break up the jobs of message storage, client connections, and mail transfer into isolated components that can scale independent of each other and be clustered for scalability and high-availability.

    Message Transfer Agents (MTAs) are often dedicated to either inbound or outbound mail and also interface to scanning software (e.g. BrightMail Anti-Spam & Anti-Virus, see: http://enterprisesecurity.symantec.com/products/products.cfm?ProductID=642%20) to check for the usual suspects. For inbound mail, they leverage directory servers (which replicate with ease) to find the specific message store used to host the mailbox for the inbound message, and then route it correctly. These are load balanced for availability and scalability.

    A user's mailbox will only exist on a single message store, but the message stores can be clustered for high-availability.

    Client connections similarly allow an array of "message multiplexors" to scale that end of the problem. The multiplexors speak webmail, IMAP, and POP. Similar to the MTAs, they are load balanced. A user can connect to any multiplexor and a directory server is used to find that user's proper message store to connect them to their mailbox.

    To the end user it looks like a single server that does POP, IMAP, and WebMail. In truth it's broken into components to achieve high scalability and availability.

    A single message store can usually store a few hundred thousand mailboxes -- for a million mailboxes you'd probably only need a handful of them.
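
    To make the directory lookup concrete, here is a rough sketch of the routing idea. It is not Sun's API or anyone's real implementation; the dictionary stands in for an LDAP directory, and every name and hostname in it is invented for illustration:

```python
# Hypothetical directory: in a real deployment this would be an LDAP lookup.
DIRECTORY = {
    "alice@example.com": "store1.example.com",
    "bob@example.com": "store2.example.com",
}


class UnknownMailbox(Exception):
    """Raised when the directory has no entry for an address."""


def locate_store(address, directory=DIRECTORY):
    """Return the single message store that hosts this mailbox."""
    try:
        return directory[address.lower()]
    except KeyError:
        raise UnknownMailbox(address) from None


def route_inbound(recipient, directory=DIRECTORY):
    """What an inbound MTA does: pick the store to deliver to."""
    return ("deliver", locate_store(recipient, directory))


def route_client(login, protocol, directory=DIRECTORY):
    """What a multiplexor does: proxy a POP/IMAP/webmail session."""
    if protocol not in {"pop", "imap", "webmail"}:
        raise ValueError(f"unsupported protocol: {protocol}")
    return ("proxy", protocol, locate_store(login, directory))
```

    Because the MTAs and multiplexors hold no mailbox state of their own, any of them can serve any user, which is what lets you load balance them and add more as needed.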

  202. HP NonStop by Anonymous Coward · · Score: 0

    If you want that type of uptime and have a budget to support it, I'd skip the administration and technical nightmares a server farm brings and look at an HP NonStop. Install HP OSS and use one of HP's ported packages or hire some coders to port a package to the platform. It's worth a look.

  203. Can you split the users up? by IGnatius+T+Foobar · · Score: 1

    If you can split the users up, perhaps by suborganization or by geographic area, you might (and I say might because no one answer is right for everyone) be well served by having lots of different servers handling email for each group, and then aggregating it all together with a head end that handles email routing and directory services.

    If you're putting more than a few hundred users into an Exchange environment, that's how Microsoft would have wanted you to do it. Although notoriously unreliable, the concept is sound. In the non-Microsoft world, you could build each area as a subdomain, deploy the usual tools (such as the SMTP and IMAP daemons of your choice), and then use OpenLDAP to tie it all together, and add some sort of Postfix or Sendmail routing system to make the subdomains invisible to the outside world.

    Some organizations might even consider an open source email/groupware system like Citadel that can handle a distributed network like this; it can tie together lots of servers using its own peer-to-peer protocol and share a global address book without the need to use subdomains (and any individual server is capable of being an MX for the whole network, so you might not even need a hub server at the head end -- although you might want to use one anyway in order to centralize your border services like spam/virus filtering, archival for Sarbanes-Oxley auditing, etc.).

    In summary, if you distribute and/or federate the email services, you gain the benefit of removing the single points of failure, and you can potentially put the servers closer to where the users are, reducing your bandwidth expenses.

    --
    Tired of FB/Google censorship? Visit UNCENSORED!
  204. Re:Obviously this idea has a problem.... by Anonymous Coward · · Score: 0

    I think those 100 invites get refreshed everyday.

  205. I agree. by game+kid · · Score: 1

    I just wonder how the employees will handle all the "bum bum"s and "top female" pictures floating around.

    Not that I'm complaining. I bet it's hit-or-miss but I've seen some extremely interesting pictures there...

    --
    You can hold down the "B" button for continuous firing.
  206. What are your migration options by Anonymous Coward · · Score: 0

    I would first look at the migration requirements.
    - What do you have now?
    - How fast could it be migrated
        - to what?

    Possible scenarios:
    A lot of good suggestions so far. I've never done it at this scale; I've done an average of 200k. But from a senior Unix admin point of view, I would suggest you split your systems by function, like many have suggested.

    Frontend: I would take a look at the Barracuda networks appliances that do Spyware+Spam firewalling.
    DB: LDAP (if you have the option to migrate easily)
    MTA(s): postfix
    IMAP/POP(s): Cyrus
    Webmail/via IMAP: HORDE/IMP

    Good luck with the migration! That's what will take the most planning.

  207. program similar to Outlook Web Access? by krunk4ever · · Score: 1

    What I found great about Exchange servers is the fact they have this awesome Outlook Web Access component. No other software I've ever seen comes close. Is there any chance that a similar program exists?

  208. Sendmail works fine for our 1.33 million users by Anonymous Coward · · Score: 0

    We have a vanilla Pentium III 450 with a pair of 15k SCSI harddrives running software RAID. The OS is Debian stable with kernel 2.4.18. The load average gets high at times, but it works fine. In 8 days it will have been up one year. One year ago we took it down to upgrade the drives. The LDA is procmail.

    The vast majority of the users use a simple web-based PHP mail client we wrote. It runs on the mail server and manipulates the mailbox files directly so it doesn't have to create any POP3 connections.

    What makes it all work so well is our constraints, which you may or may not be able to use. We limit users to 5,000 messages in their mailbox. Every Saturday, I run a script I found about six years ago called expire_mail.pl to get rid of messages older than 60 days. The system is noticeably slower when the mailboxes are larger. It takes Linux much longer to append to a file on ext3 when the file is larger. The incoming max message size ('O MaxMessageSize=64000' in sendmail.cf) is 64,000 bytes. That's what saves us. With the system-wide /etc/procmailrc, we automatically delete messages that contain attachments that end in exe|vbs|shs|lnk|com|pif|bat|src|dll|vb|osx|hlp|scr|zip.

    It all just works.
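
    For anyone not running sendmail and procmail, the two filtering constraints above can be sketched in Python. This is only an illustration of the policy, not the poster's actual setup: it folds the size limit and the attachment check into one hypothetical function, `should_drop`, using the standard `email` package.

```python
from email import message_from_bytes

MAX_MESSAGE_BYTES = 64000  # same figure as the MaxMessageSize quoted above
BLOCKED_EXTENSIONS = {
    "exe", "vbs", "shs", "lnk", "com", "pif", "bat", "src",
    "dll", "vb", "osx", "hlp", "scr", "zip",
}


def should_drop(raw_message):
    """Return a reason string if the message should be dropped, else None."""
    if len(raw_message) > MAX_MESSAGE_BYTES:
        return "too large"
    message = message_from_bytes(raw_message)
    # walk() visits the message itself and every MIME part inside it.
    for part in message.walk():
        filename = part.get_filename()
        if not filename or "." not in filename:
            continue
        extension = filename.rsplit(".", 1)[1].lower()
        if extension in BLOCKED_EXTENSIONS:
            return f"blocked attachment: {filename}"
    return None
```

    Checking the size before parsing keeps oversized messages cheap to refuse, which is the point of the limit in the first place.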

  209. i have two words... by Anonymous Coward · · Score: 0

    LOTUS NOTES!!!

  210. Start Here by Anonymous Coward · · Score: 0

    www.postini.com

    Postini rocks: they manage spam and antivirus. If something is caught, they hold it for you, so your processing requirements are lower. They will also process outgoing mail. This way you only have to accept mail from and send mail out to Postini, making your servers more secure. They will also mail-bag your mail if your site goes down and notify you when that happens.

  211. Why pop3 causes admin headaches by Anonymous Coward · · Score: 0

    Couple of things. Most of them have to do with the user deleting their data inadvertently; however, the largest issue with POP3 is that it's a big-ass security hole. Passwords are transmitted in plain text.

    1. Re:Why pop3 causes admin headaches by Anonymous Coward · · Score: 0

      uh, users can delete their own data regardless of what protocol is used. and passwords do not have to be transmitted in plain text if you force tunneling over ssh.

      don't worry, life gets better after high school.

      oh, snap.

  212. Outsource, and other advice by Charles+Dodgeson · · Score: 1
    There are companies like fastmail (no personal connection, just a happy customer) which are set up to do this sort of thing.

    If you really want to do something like this in house, hire someone like Nigel Metheringham (old friend of mine, haven't had any contact with him in years) who set up the mail system for freeserve.co.uk when they first got started. Look at what others have done.

    Crucially, you will want several inbound MXes, several outbound SMTPs, and your IMAP server on the most robust hunk of metal and silicon that you can get your hands on.

    Years ago I would have recommended UW-IMAP with mbx format, not mbox format. Now-a-days, I'd be more inclined to use Cyrus IMAP. As for sendmail, postfix or exim, I've got my personal favorites. Your choice will have to be based on more than my prejudices and biases. But do take a look at exim, many things were built into it for freeserve.co.uk. (Freeserve went from zero to more than a million users in a few short months when it started.)

    --
    Prime numbers are exactly what Alan Greenspan says they are -S. Minsky
  213. Start where hotmail did: FreeBSD by dameatrius · · Score: 1

    Hotmail was great until MS converted it to windows.

  214. I/O by Graymalkin · · Score: 4, Informative

    While not quite a million users, HEC Montréal switched from Netscape Messaging Server running on AIX to Postfix/Cyrus/SquirrelMail running on Linux. Linux Journal ran a really nice article and a follow-up about their transition.

    One of the first things the school did was figure out how exactly their current system was failing them. Their old AIX boxes were being stressed just by the volume of mail coming through the system, they had little power left over to do any sort of filtering. This led to users getting drowned in unwanted e-mail which only exacerbated the existing load issues. This is one of the first things you need to do, figure out why your current system isn't working properly. You'll be better equipped to fix the problems when they've actually been identified.

    HEC Montréal also went for heavy redundancy and specialization. Instead of a handful of servers sharing all of the tasks equally each node in the cluster has its own job with every class of job having a backup server. Every job is going to take a beating with so many users, even if only a fraction of them are using the system at any given time.

    I'd say the most important part of what you're doing will be modeling your current use. Are you getting a ton of traffic from viruses and worms spreading over your internal network? Do you get huge amounts of spam traffic to users? In such cases filtering at your SMTP servers will relieve the rest of the system from extraneous traffic. While you might need really beefy external SMTP servers you won't need nearly as much storage space on a SAN or NAS.

    --
    I'm a loner Dottie, a Rebel.
    1. Re:I/O by Anonymous Coward · · Score: 0

      I studied in HEC Montréal for a semester and I just gotta say that overall they had their IT set up real nice. I only wish my own school would be half as good as them in this area...

  215. Ask Russ Nelson by Anonymous Coward · · Score: 0
  216. Shenanigans! by barfy · · Score: 1

    I call Shenanigans!
    This is not a true story. Companies with millions of clients do not ask ignorant IT people who have to look to Slashdot for answers.
    If it is done, the CTO should be fired, The entire IT departments should be retrained, etc.

    Frankly this is a CTO level decision. One does not just get "sick" of Exchange. The fact is exchange given enough money and effort works very well for most people that use the system. Companies with millions of clients can put enough money and effort into it.

    This is almost too dumb to talk about... Shenanigans I say, Shenanigans!

  217. Autoloader by nurb432 · · Score: 1

    He didn't say 'revolver', you are just assuming..

    --
    ---- Booth was a patriot ----
  218. The Hula Project by mattdev121 · · Score: 1

    Taken from the hula-project.org FAQ: How well does it scale? Insanely well. Scalability was the primary design parameter for the original codebase. Anecdotally, people have run 200,000 registered users on a single $4,000 PC, with a 25% concurrency rate (that's over 50,000 concurrently-connected users). Of course it will be more practical when it's finished, but even now it seems stable enough to consider deployment.

    --
    mattdev@server$ touch /dev/genitals
    cannot touch `/dev/genitals': Permission denied
  219. qmailrocks.org by ledbetter · · Score: 1

    Check out qmailrocks.org for a fantastic full featured mail server install based around Qmail. Support for database users and ldap are options, and it includes spam filtering, web mail, and even an admin web interface. The website walks you through every single little step, and has paths for various linux flavours plus the BSD's and even Solaris.

    In terms of scalability you're going to want to start with some honkin' hardware. You will also need to separate the sending (SMTP) servers from the receiving servers and the mail storage servers, in order to distribute your load. qmail.org has a ton of info as well about the Qmail system.

  220. Still convinced to migrate? by Joseph_Daniel_Zukige · · Score: 1

    Read nothing in your post that I can see as being persuasive.

    You did swear a lot.

    I'm sure management has already asked those questions and more, maybe even complete with expletives.

  221. Try Backup Exec for Single Mailbox Restorations by LazloToth · · Score: 2, Informative

    Not defending Microsoft here, but I have to take care of an Exchange 2003 Enterprise server, and I wouldn't think of trying to do it without Symantec (formerly Veritas) Backup Exec with the additional Exchange agent. Yes, you can back up and restore individual mailboxes, and even individual messages. Backup Exec has its quirks, but it's the best thing going if you have to take care of Outlook users. Over the years, starting with Exchange 5.5, Backup Exec has saved my rear when information stores got corrupted, log files were deleted accidentally, and so on. Combined with a nice, fast AIT tape library, it's a great data preservation product for the small- to medium-size enterprise.

    --
    It's only funny until someone gets hurt. Then, it's hilarious.
    1. Re:Try Backup Exec for Single Mailbox Restorations by lgw · · Score: 1

      Not to plug any one company (though Backup Exec does the Exchange thing quite well), but the parent post makes a good point - look carefully at features of *current* backup software. The ability to back up Exchange efficiently and do mailbox-level restores has come along significantly in the past few years. In migrating from Exchange, make sure you don't lose the ability to back up quickly and restore individual mailboxes.

      Does anyone know what options you have for protecting qmail? Does anyone offer mailbox restores from file backups?

      --
      Socialism: a lie told by totalitarians and believed by fools.
    2. Re:Try Backup Exec for Single Mailbox Restorations by cloudmaster · · Score: 1

      Mailboxes *are* files. Using qmail or any other sane mail system, mail backups are just like any other file backup.

    3. Re:Try Backup Exec for Single Mailbox Restorations by lgw · · Score: 1

      Excellent! Thanks!

      --
      Socialism: a lie told by totalitarians and believed by fools.
    4. Re:Try Backup Exec for Single Mailbox Restorations by cloudmaster · · Score: 1

      I suppose I should add some detail to that. Mailboxes are either single files, or directories full of files, depending on whether you're using maildir, mbox, or something else. I'm partial to Maildirs, where a mailbox is actually a directory with three folders (cur, new, tmp) and those three folders contain one file per message. It's good for lock-free mailbox editing, and moderate volume, but the absolutely huge number of files can tax some filesystems. Using mbox gets you one file containing all of the messages, which requires actual locking, and increases the odds that some bad process could mess the file up (I specifically worry about message deletions).
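
      If you want to see the Maildir layout for yourself, Python's standard `mailbox` module can create one. A minimal sketch; `demo_maildir` is just an illustrative helper name and the message is a throwaway:

```python
import mailbox
import os
import tempfile


def demo_maildir(root):
    """Create a Maildir under root, deliver one message, return the layout."""
    path = os.path.join(root, "Maildir")
    box = mailbox.Maildir(path, create=True)  # creates cur/, new/ and tmp/
    box.add("From: a@example.com\nSubject: test\n\nhello\n")
    box.close()
    # Map each subdirectory name to the message files it contains.
    return {name: sorted(os.listdir(os.path.join(path, name)))
            for name in sorted(os.listdir(path))}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for folder, files in demo_maildir(root).items():
            print(folder, len(files))
```

      The delivered message ends up as a single file under new/, so backing up or restoring one mailbox is an ordinary directory copy.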

  222. Re:Sun's Java Messaging Server (AKA Netscape/iPlan by Anonymous Coward · · Score: 0

    And Sun has stated in several places that their goal with the "Outlook Connector" is that a user should not be able to tell the difference between an Exchange backend and a Sun JES backend. And if you roll out Sun's IM, you get a Jabber/XMPP server, too.

    We have deployed it where I work supporting 75K accounts and the few boxes don't sweat.

    You really should look at it. (I *think* that I have seen mentioned that Verizon and Telstra use it for their customers.)

  223. Not even qualified by Anonymous Coward · · Score: 0

    If you have to ask that question, then you must not be qualified to even do this job? Someone qualified to do the job would already know the solution.

  224. No one has mentioned James yet by ChiralSoftware · · Score: 1
    I would start with Apache James. This is a 100% pure Java mail engine. Why Java, you ask? It's so easy to modify it because it's based on plug-ins, just like Java Servlets. So if you decide that you need database-backed user ID storage, you can do that. If there's no plug-in that does it, you can write your own. If you need clustered mail storage, you could write a plug-in to do that. You can run it on Linux, Solaris or Windows Server.

    Also you can be sure there will never be buffer overflows or similar security problems.

    I'm sure you know this, but you're going to need a clusterable technology. You need to have multiple redundant servers for this kind of load. Much better to be able to handle load by adding cheap PC-hardware servers than basing it on one huge server. James would let you build that if you want to.

    Of course, James only makes sense if you're a Java type of person.

    1. Re:No one has mentioned James yet by Anonymous Coward · · Score: 0

      Java == CPU, memory resource hog

      no thanks.

      qmail or anything else written by DJB has never had an exploit

  225. my MTA of choice by Crock_Lobster · · Score: 1

    For high volume sites, my recommendation for an MTA would be Ironport C series appliances. They are really pretty bulletproof, and can handle a large volume of mail. You could use something along the lines of Exim for your mailbox servers (or exchange for that matter if your users liked the groupware aspects), fronting the mailboxes with the Ironports. Some of the larger mail installations on the net use Ironport as their MTA.

  226. Novell will handle 1,000,000 accounts by clarkeb · · Score: 1

    Novell's eDirectory will easily scale to handle 1 million accounts. In addition Novell has ported Groupwise to run on Linux. Of course Groupwise runs on NetWare as well which is still an awesome reliable OS and neither needs to be patched anywhere near as often as M$. Good luck!

  227. Re:Here's my plan and it's the best one you'll get by keepr · · Score: 1

    I am glad someone got it.

    --
    Slashdot taught me how to use the preview button!
  228. Checkout Mirapoint by pug916 · · Score: 1

    Guys, I have worked on a project that had to deploy about 1 million accounts, and you simply cannot fault the Mirapoint solution. Check them out: they have arguably the fastest mail platform on the planet, and also the most stable and cost effective. PS: I don't work for Mirapoint, so this isn't a blatant plug, but I have seen it do its thang first hand.

    1. Re:Checkout Mirapoint by Anonymous Coward · · Score: 0

      I completely agree here. I work for a smaller ISP that started off with 10,000 mail accounts using Mirapoint's solution, and we have easily scaled to 35,000 accounts using their scalable appliance approach. The implementation has a super-hardened BSD base with no known vulnerabilities (we've used them for 5 years with no successful hacks). Their new Commtouch-based anti-spam and highly reliable Sophos antivirus solution are some of the best in the industry.

      In fact, if you want to see a hardware solution that will handle 1,000,000+ users check out Spec Mail (as well as compare to others, but do keep in mind the price and actual cost per user):

      http://www.spec.org/mail2001/results/res2004q1/mail2001-20040126-00034.d.html

      And a SAN based solution :

      http://www.spec.org/mail2001/results/res2003q2/mail2001-20030519-00031.d.html

      I don't work for Mirapoint, but I've been putting my trust and customers in their hands for email for several years. Considering security, easy scalable upgrades, and performance that can be managed by a couple of admins (rather than a large team), you would be hard pressed to find a solution that beats them for the money on a project of this scale.

    2. Re:Checkout Mirapoint by gnushell · · Score: 1

      I agree. I had an 84K user setup and it NEVER crashed or caused any problems. They have good spam filtering, LDAP, and fantastic hardware design. The next time I have to implement a large scale email system, I won't even hesitate and will go with them.

      --
      home != /dev/null
  229. Re:Obviously this idea has a problem.... by yuri+benjamin · · Score: 1

    First, you migrate users in batches of 100 at a time (the invites do refresh regularly).
    Then, when you engage new staff/contractors, you send a gmail invite as part of the recruitment process.

    You could send invites to the heads of each department, and have them send invites to each of their direct reports, and so on right down the chain to the lowest intern.

    --
    You make the mistake of thinking you can educate the fundamental stupidity out of people. You can't.
  230. Gmail's setup by dacarr · · Score: 1

    While gmail itself is a non-option (it's like distributing hotmail accounts for your company), what about what they have set up? You might consult with them. Another user had an architecture you could also work with. Hope you have a budget.

    --
    This sig no verb.
  231. Run Lotus Domino on this.... :-) by Anonymous Coward · · Score: 0
  232. Try asking large companies for advice by davidwr · · Score: 1

    For a big fee, companies like Yahoo and Google might lend you some expertise. Heck, it may even pay off to pay them to help you get up and running.

    Assuming you aren't competing with them of course.

    Also try contacting universities and large companies, they may have 6-figure mailbox-counts to keep track of.

    I'm sure companies that sell competing products e.g. IBM etc. will also help you get up and running, for an appropriate consulting fee.

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
  233. sysadmin magazine... by buhatkj · · Score: 1

    there was an article in an issue of sysadmin magazine about 1 year ago about something like this. Not quite the same scale you mentioned, but more like 70,000 users. but their solution's biggest strength was its ability to scale, so it's a start.

    the gist of it was they had a fibre channel SAN, which was shared by multiple headless servers at several levels
    -squirrelmail based webmail
    -postfix
    -LDAP on postgreSQL
    -cyrus IMAP/POP3
    If you look back in the issues for the last year or so I bet you could find it...might be a start...

    My company has high usage for the number of people we have, but it's still only ~500 users, so we just got a beefy redhat enterprise box...

    --
    sometimes, i wonder if i'm the only conservative on teh intarweb. ah well, back to mah hogs and warmongerin'....
  234. Re:I would start by... by Khan · · Score: 1

    I agree. That's where I usually start at. Helps clear my mind and relax me. I think for a project this large, you're going to need to get laid on a regular basis if you're planning on surviving this.

    --

    "Klaatu, verada, necktie!" -Ash

  235. Can't Be Serious by (eternal_software) · · Score: 1

    This can't be real.

    YOU (as in one person) have been asked to figure out how to provide email access for ONE MILLION ACCOUNTS?

    There is another comment on here saying that the entire IBM corporation only has around 300,000 email accounts. Do you know how many people they probably have running their email system?

    And you need to replace Microsoft Exchange, probably the most capable corporate email system available? Do they require all the features that Exchange offers??

    Sorry, but I can't imagine this is real.

  236. simple. by deviator · · Score: 1

    Novell GroupWise on Linux.

    All of the heavy lifting has been done for you; it's a scalable, secure, battle-tested solution running on an open-source platform.

  237. Ironport + Communigate Pro by Spazmania · · Score: 1

    Get Ironports for the front end SMTP processing and spam/virus filtering. Then get a Communigate Pro cluster for the back end with a SAN for storage.

    It'll cost you about a third of what Exchange does and it's so, so painless, even for 1M accounts.

    --
    Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
  238. Doh! by Anonymous Coward · · Score: 1

    "Get a server with RAIDed SCSI disks preferably hot-pluggable. "

    And how do you achieve server failover for 99.9% uptime?

    Dude, if you don't have a clue, don't even bother to post nonsense like that.

    For 1m users and 99.9% uptime, you're going to need multiple servers and shared storage. That implies a level of experience that neither you nor the guy who asked the question has.

    Your kind of homegrown solution causes more problems than anything. It will never scale and it will never achieve the uptime required.

    Cripes. Amateurs.

  239. Can you really avoid Exchange/Outhouse? by lohphat · · Score: 1

    Not unless your non-technical user-base can be treated for withdrawal. People in marketing, sales, consulting, management can not/will not relearn another groupware system unless forced to by upper management.

    Yes, you can have a robust IMAP4/webmail solution but without the integrated calendar and task delegation, you're in for a world of pain. We've tried. The Outhouse 2003 smack is too alluring for the mobile Crackberry/PDA use where their mail is at their fingertips.

    What you need to ask is what do you want? A robust mail system with low cost of ownership or happy fratboys who pay the bills (including your salary).

  240. Argentina.com: 500k users with $75k budget by Anonymous Coward · · Score: 0

    Argentina.com has half a million accounts with POP3, SMTP, and dial up access using FreeBSD.

    They did this with a hardware budget of US$75,000 over two years. Each user gets 300MB of mail space, and the cost for disk came out to US$3/GB. POP3 and SMTP uptime is about 99.95% (down 5 hours per year), while webmail access is about 99.5% uptime (down 2 days per year).
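
    To put those uptime figures next to the 99.9% in the original question, here is a quick sketch of the arithmetic in Python. The function name is made up for illustration and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, assuming a non-leap year


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Return how many hours per year a service may be down at a given uptime."""
    return (100.0 - uptime_percent) / 100.0 * HOURS_PER_YEAR


if __name__ == "__main__":
    for pct in (99.95, 99.9, 99.5):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% uptime -> {hours:.1f} h/year ({hours / 24:.2f} days)")
```

    That works out to roughly 4.4 hours a year at 99.95%, 8.8 hours at 99.9%, and about 1.8 days at 99.5%, which matches the rounded numbers above.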

  241. Setup I have used for that scale by jmazzi · · Score: 0

    FreeBSD 5
    Dual 3.4 XEON Procs
    4GB RAM
    4 x 300GB Drives (Lotta space For Imap users)
    Hardware RAID 5

    FreeBSD 5
    Dual 2.0 Xeon DB mysql server(for vpopmail)
    2 x 70GB RAID 1
    2GB RAM

    Qmail, Vpopmail, Courier IMAP, apache, qmailadmin, vqadmin

    That system would handle it and allow for redundancy

  242. Info for Notes Haters.. by Anonymous Coward · · Score: 0



    Notes Security Framework - Awesome. You can set access to the server, the mail database, the actual records and can even drill down to and encrypt a particular field in the record.

    Viruses - No worries. When melissa came out companies on Exchange were down for hours if not days. Notes shops kept on trucking...

    Notes Client UI Sucks - Use the Web INotes client, use Outlook (you have options).

    Servers crash - Run it on a stable OS like OS390, Solaris, Linux, etc. (you have options).

  243. IMAP-based solution as a core by parc · · Score: 1

    Start with a frontend of servers that ONLY forward mail to your spam filters.

    In the middle are your spam filters. Run SpamAssassin in daemon mode. These guys will forward mail to your delivery machines, using a DB-backed forwarding table.
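
    As a rough illustration of what a DB-backed forwarding table could look like, here is a minimal Python sketch. It uses an in-memory SQLite database only so it runs anywhere; the table name, column names and hostnames are all invented for the example and are not what we actually ran.

```python
import sqlite3
from typing import Optional


def build_table(conn: sqlite3.Connection) -> None:
    """Create a toy forwarding table: one row per mailbox address."""
    conn.execute(
        "CREATE TABLE forwarding (address TEXT PRIMARY KEY, backend TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO forwarding (address, backend) VALUES (?, ?)",
        [
            ("alice@example.com", "imap1.example.com"),
            ("bob@example.com", "imap2.example.com"),
        ],
    )
    conn.commit()


def lookup_backend(conn: sqlite3.Connection, address: str) -> Optional[str]:
    """Return the delivery host for an address, or None if it is unknown."""
    row = conn.execute(
        "SELECT backend FROM forwarding WHERE address = ?",
        (address.lower(),),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_table(conn)
    print(lookup_backend(conn, "Alice@example.com"))
    print(lookup_backend(conn, "nobody@example.com"))
```

    The point is only that the filter tier needs one cheap keyed lookup per recipient to decide which delivery machine gets the message; any database or directory that can answer that quickly would do.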

    The delivery machines run IMAP. While I haven't used it in about 2 years now, Cyrus IMAPd was a great system when I used it. It will do virtualization for you. Your users can connect to any of the IMAP servers and their requests will get forwarded to the correct server.

    With this setup, we supported about 45k mail accounts using 5 servers. The load never got above .5.

    Note that you're still SOL in the case of a server failure. However, if you study up on IMAP and POP3, you'll discover that they can't handle server failures at a design level. There's simply no way to handle losing a server.

  244. I've done this by CaptainTux · · Score: 1

    I've done something like this for a government entity with about 1.7 million accounts with similar constraints. Email me at anthony@adctech.biz or papillion@gmail.com or call me at (918) 926-0139. I can recommend, setup, do reports, etc for them for a reasonable price.

    --
    Anthony Papillion
    Advanced Data Concepts, Inc.
    "Quality Custom Software and IT Services"
  245. What? by Qbertino · · Score: 1

    You're running a million email accounts on exchange? You didn't say that, but still I'm asking. 'Cause I'd say that would be impossible. ...
    But if your company plans to scale to 1 million live and running mail accounts these are the things that come to my mind:

    1st: You need serious Iron for this.
    Either up to a few hundred beefed rack pc's (depends on how the mean usage of those 1 million accounts is) for load balancing, admin, fault tolerance, data and automation or some uber-special sun server solution for this. PCs are probably better. Scaling/maintaining is cheaper in the end.

    2nd: I only know my way about things the size of a beercase so here goes my 2 cents for the PC solution:
    Something stable (Linux or BSD) and an MTA that doesn't get in the way. Maintainability goes over speed at this point - I'd guess Postfix or Exim would be the ticket.

    3rd: Consider a DB setup for storage. Also here I'd go OSS all the way. Postgres and Firebird scale very well, but even MySQL could pull this if set up correctly. You have no big relational stuff, just a 'give data, take data and shut up' scenario here. And MySQL is fast. As in f*cking fast. If you have a good admin policy and automate that you're going to have zero fuss, zero slowpoking in your storage. MySQL might even be the best way.

    4th: Process automation/admining. Pick a good PL and/or appserver for this. This should be scalable in itself and also shouldn't get in the way (again: maintainability over speed). I recommend checking out Python and Zope. Zope loadbalancing is a piece of cake and although it's a slowpoke unsuitable for the grunt work, its object relational DB is like sex with Claudia Schiffer when working with it. You could set up a handful of boxes with that and have your backend covered (no pun intended :-) ). Zope could maybe even do parts of the webfrontend. But you'd have to test that for speed. Python alone is perfect for frontend though. The PL doesn't matter that much, but you should stick to one for all that you're doing. You're starting with a clean slate, you might as well honor that without building a messy bloat of 10 technologies.

    5th: Your team. You need a team for this. A handful of people who help you build the system and document it and know what's going on. The "OSS expert but no-foam-around-mouth" type is good for this. All should use the same PL for automation and generally know what's going on, even if they specialize (storage, automation, web-frontend, etc).

    6th: Facilities. This is the ballpark where you think about that as well. Fire safety, spare power with large UPS and maybe even generators. Fat lines. Do the math, add 40% and then build it. Same 40% rule goes for cost and final rollout schedule.

    7th: Offsite backup for internal accounts. Check your requirements. If the company is large you'll need a remote backup site and some overturning backup policy.
    Backup could also be done with two or three suitcases of external encrypted HDDs that are carried around if you want to save bandwidth for account access. Few think of a solution like this, but it actually is feasible, safe and cheap. And spare HDDs for replacement aren't a problem either.

    8th: Politics.
    External Contracts: Get nice-like with your ISP(s). You wanna have had a few beers with the guy you're explaining to that you've misjudged your requirements and need an extra 4 lines. Now. Or would like to switch off a few for the time being.
    Your Boss and his superiors: Keep them informed but make the decisions yourself. Don't pester them with techno babble. They wanna know you can do it yourself. At this scale they are more your partners than your superiors. You need to be up to it. Naturally. Rethink that matter before you give your people a thumbs up. Nothing bad about coming to the conclusion that external contractors would be the better solution. Be a professional, not a jerk.

    Good luck.

    --
    We suffer more in our imagination than in reality. - Seneca
    1. Re:What? by vidarh · · Score: 1
      1st: You need serious Iron for this. Either up to a few hundred beefed rack pc's (depends on how the mean usage of those 1 million accounts is) for load balancing, admin, fault tolerance, data and automation or some uber-special sun server solution for this. PCs are probably better. Scaling/maintaining is cheaper in the end.

      You can easily do this with 10-20 relatively low end servers, a few (4-5 of them perhaps) either talking to a SAN box or to some reasonably low end RAIDs. Probably significantly less if he's willing to spend some time optimizing the system properly.

      1 million accounts is small these days. I've worked on a mail system that at one point had around 300,000 active user accounts on a webmail solution run off of a single 5 year old quad CPU backend with four 5 year old 1U rack mounted servers and a single 4U SCSI RAID box, with significant capacity to spare.

      It was scaled up dramatically after that to handle massive growth, and this was in the days when a few MB of storage was what was normal for mail storage, so the storage would need to be bigger today, but you'd be hard pressed to find servers for sale today as slow as the stuff we were running at the time.

      The other thing to keep in mind is that IO needs have NOT scaled up to match the storage sizes, as e-mail access patterns are very focused on new messages and summary data (that any decent system will keep in indexes or cache files), which means that you can work with significantly lower end RAID/SAN systems today than what was needed when we did our system 5 years ago.

      IMAP would increase IO bandwidth needs quite a bit over a properly optimised non-IMAP based webmail system, but again, you'd be hard pressed to find anything as slow as what we were running on back then.

  246. TPF by Anonymous Coward · · Score: 0

    Anyone ever hear of TPF? They have the TPF Internet Mail Server...would work perfectly for this.

    1. Re:TPF by morryveer · · Score: 1

      I've been using TPF for 15 years, and they basically advertise exactly what this person is asking for. 99.9% uptime. Airlines and Banks use TPF to achieve this on a regular basis. TPF is designed to stay up and handle large numbers of TRANSACTIONS. I don't have access to the numbers, but the largest system out there - the Sabre system - does I believe above 20,000 transactions PER SECOND. The Internet mail server is built exactly for these large mail situations, although the number I remember is 200,000. Still, I see no reason why 1,000,000 couldn't be achieved. If this is for the EDS Navy contract, EDS should be ashamed of themselves. They boast they are the largest TPF shop in the world, and operate the Sabre system for American. But TPF is the unspoken child of the mainframe world. Contact me if you actually are interested in it.

  247. I'd go hire Matt Simerson... by cymen · · Score: 1

    http://www.tnpi.biz/

    I'd hire him if he was interested. Even if you don't go his route he'd be a good candidate to evaluate the proposals if you don't have someone who can do that on staff.

  248. Build it like you would build a multitier web site by east-steve · · Score: 1

    You should look at the following:
    - A good L4 Load balancer (Foundry, F5, Cisco, etc...)
    - A good platform for inbound and outbound email filtering and relay to front Exchange; appliances exist that do this well (Ironport, Barracuda) and you can scale these via the aforementioned Load Balancer
    - Build a good LDAP Directory Infrastructure; Sun makes a very good one - it will scale into 10's of millions of users
    - Run your Exchange servers in VMware (the Data Center version) where you can quickly recover and backup your servers since the image is just a file
    - Implement Exchange on a good SAN such as EMC or Hitachi

  249. recommendations by hedrick · · Score: 1

    In the end you're going to have to ask vendors (or advocates for free solutions) to give you examples of configurations. There are certainly candidates. I believe Sun's JES is used by some large ISPs. Novell has a couple of products that might be interesting. Mirapoint makes appliances which are very much worth looking at. I'm sure you know what to ask these folks: give us examples of configurations that actually are used in large-scale installations. What I liked about Communigate is that their multisystem setups looked more symmetrical than some of the alternatives. The main reason I didn't look at them was that I was looking to do mail and calendaring and at the time they didn't have calendaring. They do now. Also look at what other services you're going to need: spam protection, support for Blackberry and other mobile access. Vendors ought to be able to tell you how to integrate these support services.

  250. Just outsource by Anonymous Coward · · Score: 0

    HINT

    Just outsource to a company http://www.loftmail.com/ They have the best corporate email setup. ..

  251. why are you asking?? by Ravenrage · · Score: 1

    of course have everyone sign up for a hotmail account. duh *ducks*

  252. Outsource by zenwarrior · · Score: 1

    Outsource it to the Chinese. You'll get a great product and Gates' goat for another snub all at once.

    --
    /.'s Psychic-in-Residence: Psychic to the Geeks
  253. working around an exchange server by doug · · Score: 1

    I also work at a company that thinks Exchange is the ultimate tool, and hosts it about 1000 miles away from where I work. But it serves IMAP and LDAP, so I just use tbird from my Linux box and it works like a champ. I only fire up Outlook because of meeting notices. Poke around a bit and you might get tbird to work for you too.

    - doug

    1. Re:working around an exchange server by HermanAB · · Score: 1

      If you would switch to Evolution, then you'd get your meeting notices too.

      --
      Oh well, what the hell...
    2. Re:working around an exchange server by Sketch · · Score: 1

      You get meeting notices with any email client, as they are standard mail messages. Evolution and Outlook (possibly others) just know how to read the times from them and insert them into your local calendar.
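
      To make that concrete, here is a small Python sketch of the "read the times" step, using only the standard library. It assumes the notice carries an iCalendar (text/calendar) part with a plain UTC DTSTART line; the sample invite, addresses and function names are made up, and real invites can use time zone parameters and folded lines that this does not handle.

```python
from datetime import datetime, timezone
from email import message_from_string, policy
from email.message import EmailMessage
from typing import Optional

# Hypothetical invite body; real ones carry many more properties.
ICS = "\n".join([
    "BEGIN:VCALENDAR",
    "VERSION:2.0",
    "METHOD:REQUEST",
    "BEGIN:VEVENT",
    "SUMMARY:Mail migration kickoff",
    "DTSTART:20050615T140000Z",
    "DTEND:20050615T150000Z",
    "END:VEVENT",
    "END:VCALENDAR",
]) + "\n"


def build_invite() -> str:
    """Build a sample meeting notice: plain text plus a text/calendar part."""
    msg = EmailMessage()
    msg["From"] = "organizer@example.com"
    msg["To"] = "attendee@example.com"
    msg["Subject"] = "Mail migration kickoff"
    msg.set_content("You are invited to a meeting.")
    msg.add_alternative(ICS, subtype="calendar")
    return msg.as_string()


def meeting_start(raw_message: str) -> Optional[datetime]:
    """Return the start time from the first text/calendar part, if any."""
    msg = message_from_string(raw_message, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() != "text/calendar":
            continue
        for line in part.get_content().splitlines():
            # Only the simple "DTSTART:<UTC timestamp>" form is recognized.
            if line.startswith("DTSTART:"):
                stamp = line.split(":", 1)[1].strip()
                parsed = datetime.strptime(stamp, "%Y%m%dT%H%M%SZ")
                return parsed.replace(tzinfo=timezone.utc)
    return None


if __name__ == "__main__":
    print(meeting_start(build_invite()))
```

      A client without calendar support just shows the plain text part, which is why the notice still arrives everywhere.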

      --
      -- OpenVerse Visual Chat: http://openverse.com
    3. Re:working around an exchange server by HermanAB · · Score: 1

      Well yeah, you can get meeting notices with mutt and mail, but inserting them into a calendar is the useful part...

      --
      Oh well, what the hell...
  254. Army Knowledge Online does it for 1.72 million use by kenblakely · · Score: 5, Informative

    AKO (www.us.army.mil) is the Army's official intranet portal. We provide email for over 1.72M users, and we move almost 3 million messages a day. We do it all with Sun Messaging Server ver5.2 (soon to be Jes3) and we have exactly 2 (count 'em) two mail administrators. Sun mail is rock solid and scales great. We offer POP, SMTP, enterprise SPAM and Virus filtering as well as personal address books besides. We don't get the rich Outlook fat client, but then we want to be all web-based anyway. Can't say enough about Sun mail. If we had to do this with Exchange, I'd have to hire prolly 50 admins and deploy an order of magnitude more machines.

  255. Did your company employ these women? by microcars · · Score: 1
    The ones that started a huge email flame war over a sandwich?

    If so, they've been sacked and your Exchange servers can cool down now.

    --
    I like microcars
  256. My vote is for Notes by mferrare · · Score: 2, Interesting
    I'd put my vote in for Notes also. Its architecture should scale to meet your requirements what with distributing your setup across many servers and using replication. Granted the client isn't the best by any means (more on this later) but the application itself is quite good. Your laptop users can replicate their e-mail locally which is a simple procedure. I replicate my notes locally just so I can index my mailbox on my local drive.

    But the real advantage of Notes is as a distributed applications platform. If you want to expand past e-mail and start writing applications such as leave management or room booking or technical documentation databases then this is where Notes really shines. And they're all databases and they can all be replicated so they take advantage of the same redundancy that your e-mail will use. And if you need to travel then you just replicate the databases you want onto your notebook and take them with you. It's fantastic.

    Ah, the mail client
    Why oh why does the client suck SO MUCH!! At my previous company the management were looking at moving to exchange simply because Outlook is so much a better client than what Notes (even R6) is. It's a big fat piece of bloatware (as has been discussed many times here). My main peeve is that if you edit an attachment inside an e-mail you can't save it back into the e-mail! eg: here's a typical scenario:
    Not using Notes (outlook, thunderbird, mail.app all let you do this)

    • Receive e-mail with an attachment
    • dbl-click on the attachment, edit it, save it
    • forward the e-mail, including the saved attachment, to someone else
    Simple huh?
    With Notes:
    • Receive e-mail with an attachment
    • Detach the attachment from the e-mail message. Save it somewhere
    • Use windows explorer (or whatever) to find the attachment, edit it and save it
    • Forward the message
    • before sending, delete the original attachment and replace it with the copy you have saved on your hard drive somewhere
    • send the message
    • delete your copy of the attachment
    Sigh!!!

    WHY!?!?!?!?

    But despite all that crap I still think it's an excellent platform and one you should consider. It has support for encryption and also supports IMAP (although not very well I hear). A lot of large corporations run it. I've worked for 2 large investment banks, both of which run it. You can also integrate IM into it (with sametime) and remote meetings also (with sametime meeting). Also, IBM PS are good at setting it up. For something this scale you'll be up for $$$ anyway so I'd be looking at having someone come in to help you and they're pretty good (I don't work for IBM!).

    --
    Why would anyone want to use a text editor that is not vi?
    1. Re:My vote is for Notes by sam1am · · Score: 1
      My main peeve is that if you edit an attachment inside an e-mail you can't save it back into the e-mail! eg: here's a typical scenario...
      My theory is that if I edit an attachment someone sends me, I should not be able to save it *in* the email they sent. The email they sent should have the attachment they sent, as they sent it. (unless I'm missing something here)...
    2. Re:My vote is for Notes by Anonymous Coward · · Score: 0

      But you can - you can do both.

      Please read your online help documentation. Ensure you are on version 6 (Version 7 is the current version now shipping).

  257. Just give them gmail accounts by Anonymous Coward · · Score: 0

    What could be simpler?

  258. I'm going to repeat what someone else has said. by Inoshiro · · Score: 1

    You want Postfix + Cyrus IMAPD. These are your core elements; Postfix is easy to chroot, run SSL, SMTP auth, and will work with SQL or OpenLDAP, etc. Ditto for Cyrus. There exist ways to interface it with several account management sides. The fine folks at CMU have designed it to scale out the ying-yang. I've never had a peep of trouble out of either piece of software.

    Cyrus will also provide you with POP3, in addition to IMAP.

    If you want to extend the system, there are many ways to do it. On top of my Cyrus/Postfix setup, I have a procmail glue layer which runs DSPAM and any custom rules. I use MySQL for the aliases, auth table, etc. I have mod_php and Apache setup with Squirrelmail. Email is the most complex suite of applications that my Linux server does, and it does it flawlessly. I have never lost a single bit of data. I'm using a RAID array with regular backups, though :)

    --
    --
    Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
  259. www.horde.org by jayed_99 · · Score: 1

    A good webmail option is kinda a catch. Squirrelmail is nice, but compared to OWA it's really out of its league.

    I recently went through the quest for a decent webmail client for my home network. I have seen the promised land, and it is The Horde.

    PHP front end. Multiple storage backends from filesystems to the standard gaggle of databases. An interesting web-accessible VFS that I can see being really useful in a corporate environment.

    IMP (the mail component) can read mail from multiple sources -- either POP3, IMAP or IMAP/SSL (maybe more, those are just the ones that I know). It also deals with spam management at the individual client level.

    Consolidated bookmarks that are web-accessible; notes; tasks; calendars; address book.

    It can use LDAP (as well as about three dozen other things) for user authentication -- an important consideration when contemplating 1,000,000 users.

    A little apache magic, and it's all SSL secured.

    I don't know how it would work in a large, large environment, but with Postgresql for a backing store, I imagine it could scale as far as you wanted it to.

    I've only been using it for a few days, but I'm really impressed.

    1. Re:www.horde.org by bobv-pillars-net · · Score: 1
      I have just one thing to say about IMP and Horde.

      Anybody who loves it hasn't read the source code.

      Shudder!

      --
      The Web is like Usenet, but
      the elephants are untrained.
    2. Re:www.horde.org by techwolf · · Score: 1

      I did similar but for a much lower level of requirements. Eventually, I was pointed here:
      http://www.roundcube.net/

      --
      I don't do this for karma, I do it for cash. It's much better.
  260. Software choices by Craig+Ringer · · Score: 1

    I find your design interesting.

    The only large mail system I'm familiar with uses Cyrus IMAPd's clustering facility, OpenLDAP, and postfix.

    I'm particularly curious about the choice of Courier, and of NFS.

    Courier I have little enough experience with to comment on. I was under the impression that it was a bit old and crufty, didn't have header caching for IMAP or other useful performance enhancements, and wasn't overly well suited to "sealed server" operation (rather than servers with direct user logins too). I would be interested to know why you passed up Cyrus IMAPd, as in my experience it's fantastic software that "just works" and I know there are sites that use it for gigantic volumes of mail.

    I'm also interested in knowing what platform you're using given your use of NFS, though I guess maildir might be safe even on Linux NFS.

    1. Re:Software choices by chrome · · Score: 1

      The Courier-IMAP part is fine. I don't know about the MTA side, as I wouldn't touch it. But its a solid Maildir IMAP/POP3 server.

      We currently use an EMC NAS, but if it were up to me we'd be using Netapps. Had some Solaris NFS servers in the past. Linux would work just as well, as would FreeBSD - as long as you were willing to sort out the HA stuff yourself.

  261. It sounds to me... by pjdepasq · · Score: 1

    It sounds to me like you work AT Microsoft and have all finally seen the light!

    Best of luck dude. Be sure to post your solution on the web so we can all learn from your experience (if you can).

  262. Step one: reliable storage by Blasphemy · · Score: 1

    First things first, you need reliable storage. We have a NetApp filer doing the job. The delivery mail spool is shared amongst servers over NFS on a dedicated GB LAN.

    Second: Load balancing. We use an F5 BigIP to balance incoming mail connections (smtp, imap and pop3).

    Third: separation of duties. We have a set of externally connected mail servers. These systems route all mail. Mail destined for local delivery gets transferred to a second set of internal mail servers. The external mail servers block spewing, ban known spammers and virus check smaller mail files. The internal mail servers accept mail only from the external mail servers. They run spam assassin and clamav to stop spam and viruses. We use exim as the mta, but anything will do.

    That's it.

    We use maildir for mail storage, cyrus-imapd for imap and a custom system for pop delivery (which sucks, but I won't get into that here).

  263. I'd start here... by crankyspice · · Score: 1

    The Exchange Replacement How-To. LDAP, IMAP, POP3 (if you must), etc., etc. Open tech that works together and scales as high as you can add servers for it to...

    --
    geek. lawyer.
  264. Sounds like your admin suck. by CFD339 · · Score: 1

    I have a server that handles 130,000 web access users -- has for something like 7 years. I'm 3000 miles away. I never get called. It's all automated.

    --
    The problem with quotes on the internet, is that nobody bothers to check their veracity. -- Abraham Lincoln
  265. Are you qualified for this kind of work? by Anonymous Coward · · Score: 0
    "where would you start?"

    I would start by hiring someone who knows what the hell they're doing, which you obviously don't if you have to ask here.

  266. I'm late to the party, but... by abulafia · · Score: 1
    I didn't see this elsewhere here, but it is a great read: A Highly Scalable Electronic Mail Service Using Open Systems. Probably a bit dated now, but it will get you started thinking about things.

    Part of the abstract:

    In the design of any of our service architectures, we have several requirements that must be met before we would consider deployment. For email, the first of these is message integrity. It is absolutely essential that messages, once they are accepted by our system, be delivered to their proper destination intact. Second, the system must be robust. That is, in as much as is possible, the system should survive component outages gracefully. Additionally, the entire system design should minimize the number of single points of failure. Third, the system must be scalable. When EarthLink began deployment of the current architecture, in January of 1996, we had about 25,000 subscribers. In September of 1997, EarthLink provided email service for over 350,000 subscribers with a 99.9+% service uptime record. In fact, we expect the current system to scale to well over 1,000,000 users without significant alteration of the architecture as presented here. Moreover, one should be able to accomplish the scaling of any service with a minimum of outage time, preferably with none. In all cases the performance of the service must be at least adequate, and the service must be maintainable. Problems must be easily recognizable, and it should be obvious, whenever possible, what is the cause of the outage. Further, its solution should be easy to implement and, in the meantime, the impact of the outage should be small and locally confined. Finally, we would like the service architecture to be cost-effective, not just in terms of equipment acquisition, but, more critically, in terms of maintenance.
    --
    I forget what 8 was for.
  267. Try this... by drasfr · · Score: 2, Informative

    Just an idea... if you want to go with open sources products in your company.

    First, the most important is the backend storage.

    - I would try using a SAN for storage, like a small Clarion for example. I would carve the storage for the mail there on a volume.
    - I would create a set of export servers that would connect directly to the SAN and re-export the volumes to a set of front end servers using a combination of gnbd, gfs, etc...
    See this document:
    - http://www.redhat.com/magazine/008jun05/features/gfs/
    - configure a set of servers that will act as the mail servers themselves (frontends). I would strongly suggest using maildir. Courier-IMAP is great for the pop3/imap accounts. Install this on all the machines. For the SMTP agent you could use Courier but I usually prefer Exim.
    - run both the IMAP/POP/SMTP servers on all the servers, using maildir only.
    - use a mysql database to store the users' information (passwords, email addresses, etc...). You might want to configure 2 mysql servers: one as the master, which receives all the writes, and the other as a slave that serves reads, balanced with the first, since reads of user information and accounts will probably be 99% of the database activity.
    - use a load balancer to put in front of all the frontend servers, do a load balancing for all the services (POP3/IMAP/SMTP) with sticky session that will try to keep the same users on the same machines when they try to download their mail.

    When you are running out of capacity, simply add new frontends, put them behind the load balancers and voila...
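
    One way to picture the "same user, same machine" part is to hash the account name onto a frontend. The sketch below uses rendezvous hashing, which is my own choice for illustration and not something the setup above requires. It assumes whatever does the balancing can see the login name, and the frontend names (fe1 to fe4) and the pick_frontend function are made up.

```python
import hashlib
from typing import Sequence


def pick_frontend(user: str, frontends: Sequence[str]) -> str:
    """Rendezvous hashing: score every frontend for this user, keep the best."""
    def score(frontend: str) -> int:
        digest = hashlib.sha256(f"{frontend}|{user}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")
    return max(frontends, key=score)


# Hypothetical frontend names and accounts, for illustration only.
frontends = ["fe1", "fe2", "fe3"]
users = [f"user{i}" for i in range(1000)]

before = {u: pick_frontend(u, frontends) for u in users}
after = {u: pick_frontend(u, frontends + ["fe4"]) for u in users}

# Users only change machine when the new frontend wins them over.
moved = [u for u in users if before[u] != after[u]]
print(len(moved), "of", len(users), "users moved to the new frontend")
```

    With this scheme, adding a frontend only relocates the users that land on the new box; everyone else keeps hitting the machine that already has their mail warm in cache.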

    Of course I would advise going right away with powerful 2x3.6GHz P4 servers with something like 4GB of memory. That is powerful and can certainly serve a LOT of users per server already.

    My 2c, written quickly. I apologize if it's not complete, but I am pretty sure the general idea is there and sound.

    open to comments

  268. You are wrong in every way. by Some+Random+Username · · Score: 4, Insightful

    There is absolutely no reason at all to leave 80% free space, 15% is more than enough to ensure you don't have fragmentation problems (I am assuming you are using a reasonable filesystem of course).

    Second, people with ridiculously frequent mail check times are not any more of a problem. Modern operating systems use file system caches. You do not have to touch the disk subsystem in any way; frequently accessed data will be in RAM.

    And finally, a database has a lot of extra overhead, and there are a lot of deletes going on. Sure, such a select statement would work, but reading the files in one directory is an order of magnitude faster. And the deletes will really hammer your database. FFS+softupdates makes file deletion extremely fast. A relational database is not the answer for everything, stop trying to pretend it is. Use the right tool for the job, and for storing files, a filesystem is the right tool. It's not relational data, it doesn't need to be queried in arbitrary, complex ways, so it doesn't belong in a relational database.

    1. Re:You are wrong in every way. by cecil_turtle · · Score: 2, Informative

      With such a strongly worded title you'd think you'd actually have some experience to back up your claims. Memory access is faster than disk access, period. I don't care what file system you're on or what kind of caches the OS implements, fact is it's going to go to the disk almost immediately to store the change. And we're not talking about one user checking every minute. We're talking about tens of thousands of users checking every minute or few minutes. That's a continuous load on the disk - not a desirable situation for a server. Also remember that access logs are written to disk as well.

      I'm not arguing that relational db's are the way to store everything; I'm totally about the right tool for the job. But file systems are good for storing files, they're not intended for the level of data updates (new files / deleted files) that a high use email server generates. Databases are. Disk writes from databases are also optimized if your database is well designed and resistant to paging. If you don't want a RELATIONAL database, fine. There are other types of databases you know. Mail servers don't have anything in common with file servers in terms of resource usage.

    2. Re:You are wrong in every way. by cecil_turtle · · Score: 1

      Sorry for the double response but you also forgot that the OS will only take a small portion of the memory on a server, so again with the number of accounts we're talking about here the filesystem cache you speak of will become insufficient very quickly. The userspace memory is the proper place for that amount of data. You might not notice the difference in performance on a small implementation but when you get a capable server with this kind of load on it you will find that databases scale much better - anything that regularly goes to the disk is asking for trouble. From your comments it sounds like your experience is probably on a much smaller scale.

    3. Re:You are wrong in every way. by Some+Random+Username · · Score: 0, Flamebait

      Again, you are completely and totally wrong. Your experience is clearly lacking; anyone with ANY experience with filesystems and with RDBMSs will tell you that you are wrong. Trying to pretend I am making this up for some reason only makes you look even less knowledgeable.

      You don't care what kind of cache the OS implements? Maybe that's why you have no idea what you are talking about? Start caring, and start learning. Filesystem reads DO NOT go to disk, that's the fucking point, that's what a cache is, it's storing it IN RAM. Filesystem reads go to the buffer cache; if something is not in the cache, the kernel reads it into the cache. Those tens of thousands of users checking every minute are NO LOAD ON THE DISKS AT ALL. The database would cache this data in RAM to prevent disk access, and so does the filesystem. I suggest you read a book on filesystem or even just general OS internals so you understand that what you are saying is total nonsense.

      And again, filesystems are much, much, much, a huge fucking enormous order of magnitude faster at performing data updates than a database, especially since the database also has to update its transaction log and indices, and flush them to disk. And like I said, FFS + softupdates makes deletes even faster. Mail servers don't have anything in common with file servers in terms of resource usage? Maybe you should tell that to all the hundreds of huge mail system admins out there who will tell you that you are completely and totally full of shit. It's creating, reading and deleting files, sounds pretty fucking fileserver-like to me. Sticking it in a database makes it database server-like, sure. It also makes it much, much slower.

      And for your second post, maybe you use Windows for your mail servers or something insane like that? But in the real world, the filesystem's buffer cache can and will be as big as it can. Run top on a Linux machine with a 2.4 or later kernel, and notice how free memory is almost none? It's using all the spare RAM for the buffer cache. So quit talking complete bullshit and then lying more when you get called out on it.

    4. Re:You are wrong in every way. by paskie · · Score: 1

      What OS are you talking about anyway? If you take a reasonable OS (let's assume Linux), it will give you as much cache as you have of free memory - there is no partitioning of memory to "kernelspace" and "userspace" - the cache size is dynamic. So either you have the memory used by the database, with all its overhead, or the same amount of memory is going to the caches of the same data managed by the kernel. Your post implies that your memory is big enough to hold all the mails in the database's core, without accessing the disk - but in that case, why wouldn't it be big enough to hold all the mails in the kernel's caches?

      Your unclepost's first paragraph makes even less sense - if there is a modification to record, you have to go to the disk _anyway_ if you want it stored. Access logs are irrelevant; they have no relation to where you store your mails.

      --
      It's not the fall that kills you. It's the sudden stop at the end. -Douglas Adams
    5. Re:You are wrong in every way. by eh2o · · Score: 1

      Finding a row in an indexed table is at least as fast as finding a file in a directory. Most network DB connections involve converting binary data to string, but that is only a constant factor in space/performance. Your assertion that reading a file is really an order of magnitude faster than reading a row from an indexed table is dubious at best.

      Caching, low priority deletes, etc are also found in most modern DBs. Many DBs are also quite resource efficient (I can't speak for Oracle though....).

    6. Re:You are wrong in every way. by Some+Random+Username · · Score: 1

      No, finding a row is not at least as fast as reading a file. Lying will not change anything. Databases have additional overhead: handling the connections, parsing the SQL, scanning the indexes to find the row, fetching and formatting the row, and returning it. This even involves multiple trips across the kernel/userland boundary as many syscalls happen. Try tracing a database that is doing nothing, and see the tons of calls made just to do a single simple query. Then do the same thing with a program that just reads the contents of a file, and notice how far fewer calls are being made.

      The simple, undeniable fact is that every RDBMS has more overhead than the filesystem, and offers NOTHING in return when dealing with simple file storage which needs none of the features of a database. A database is great for being a database, it is not great for being a filesystem, that's not what it was designed for, but it is what the filesystem was designed for (surprise!).

    7. Re:You are wrong in every way. by nazzdeq · · Score: 1, Informative


      I would disagree with the filesystem approach as high availability and disaster recovery aren't as good. If you think the relational database approach won't work, you're not using Oracle 10g RAC.

      Billion row tables are common. Deletes on these tables aren't an issue as you're deleting by primary key and you can store that in an index-organized table partitioned however you want. I manage TB-size databases daily, I know.

      Using a real relational database (not lame ass MySQL) you can take advantage of all of the high availability & disaster recovery features of Oracle too, like backups, RAC and standby databases.

    8. Re:You are wrong in every way. by Some+Random+Username · · Score: 1

      I didn't say a database wouldn't work, I said it's slower and provides no benefits compared to a filesystem. You never need any of the power of SQL to figure out what data to fetch; it's not even relational data. It's a simple "read this file in that dir".

      I am not sure what planet you live on where filesystems don't have HA and can't be backed up. Believe it or not, files can in fact be stored in multiple places at the same time, much like data in a database can. And if you want really good HA file storage, get a Netapp.

    9. Re:You are wrong in every way. by Anonymous Coward · · Score: 1, Interesting

      I think your comments are true to a point. I have been very happy feeding mail to 4000+ users on a single server using FFS+softupdates. However, there has been a study on this and although it's one study, it is data:

      http://www.google.com/url?sa=t&ct=res&cd=2&url=http%3A//www.usenix.org/events/lisa03/tech/full_papers/elprin/elprin_html/&ei=rQghQ4zwNLPyYLHzgaIN

    10. Re:You are wrong in every way. by Some+Random+Username · · Score: 1

      Well, that does show pretty clearly that MySQL is slower, WAY slower for deletes. It doesn't even test normal operations like reads, unfortunately. All it really did show was that searching is faster if you have indices to search. This didn't really need to be shown, everyone already knows this. However, you don't need a database to get an index; any IMAP server could implement indexing if it wanted to, and in fact Dovecot has:

      http://www.dovecot.org/

    11. Re:You are wrong in every way. by eh2o · · Score: 2, Interesting

      As I said before, all of those things add up to a constant overhead. (but maybe you never took a class on algorithms so you don't know what I'm talking about...)

      In order to say that an RDBMS is an order of magnitude slower, one must show that as load increases the overhead of the DB grows faster than that of a FS doing the same task. (and, generally, to say that this difference is "an order of magnitude" the spread between them should increase at least linearly).

      Doing a trace on a DB for a simple query tells you absolutely nothing about its scalability.

    12. Re:You are wrong in every way. by csirac · · Score: 2, Informative
      there is no partitioning of memory to "kernelspace" and "userspace"

      Yes, there is. Try some kerneltrap articles to learn more about Linux (and OS internals in general :-). This article describes how systems with > 1GiB "big memory" work on Linux on ia32, which is reminiscent of the himem.sys days of MS-DOS ;).

      2^32 = 4 "GiB". That's all you can address with 32 bits.

      On ia32, Linux allocates 1GiB of virtual address space to the kernel, and the remaining 3GiB to user space.

      Thus, the maximum amount of physical memory that can be mapped to a stock ia32 kernel is 1GiB.

      This is enabled via the PAE (Physical Address Extension) extension
      of the PentiumPro processors. PAE addresses the 4 GB physical memory
      limitation and is seen as Intel's answer to AMD 64-bit and AMD
      x86-64. PAE allows processors to access physical memory up to 64 GB
      (36 bits of address bus). However, since the virtual address space is
      just 32 bits wide, each process can't grow beyond 4 GB. The mechanism
      used to access memory from 4 GB to 64 GB is essentially the same as
      that of accessing the 1 GB - 4 GB RAM via the HIGHMEM solution
      discussed above.


      There are awkward things you can do at kernel compile-time to get more than 1GiB accessible to the kernel on ia32, but it's not as pretty as you seem to be thinking.
    13. Re:You are wrong in every way. by whmac33 · · Score: 1

      unclepost's!

      Never saw that on slashdot before. Nice

    14. Re:You are wrong in every way. by ibbey · · Score: 1

      I'm NOT an expert on the subject, but I think you're misinterpreting him. You don't need to store the entire database in memory-- only the indexes & possibly a few additional fields (subjects, senders & dates). The vast majority of read access will be users checking their mail-- and most of the time, they won't have any new mail. A simple check of the index will tell you that. If they do have new mail, you go out to the disk & read in the actual mail. The overhead of reading the mail will be more than just reading the disk, but you should make up for it with the savings from caching the indexes.

      In the file system, you can't tell the system to only cache the relevant memory. It will automatically cache what it thinks is relevant. (as far as I know-- like I said, I'm not an expert-- please be polite if I'm wrong)

    15. Re:You are wrong in every way. by csirac · · Score: 1

      Doing a trace on a DB for a simple query tells you absolutely nothing about its scalability.

      How remarkable these algorithms guys are. The thing about SCALING in this scenario is all about the ABSOLUTE terms of the physical implementation.

      It's sometimes known as 'DIMENSIONING'.

      WHY IS IT SO IMPOSSIBLE FOR SOFTWARE ENGINEERS TO UNDERSTAND THE CONCEPT OF "COUNTING CYCLES"?

      If you find that 100 syscalls are being made per DB query and only 10 per FS query, that tells you quite a lot. It means that the FS implementation is using 1/10th the number of syscalls to get the same amount of work done for this particular scenario.

      Does that prove in any way it's going to be 10x more efficient in every scenario based on this test alone? NO, but considering the expense of switching contexts and so on I would be VERY surprised to see anyone who knows what they're talking about find this metric totally worthless in the same way that you have.

      I'm a micro-EE, and having worked on a couple of software projects with some software eng. guys I can really recognise your mindset.

      There are two things that seem almost incomprehensible to them:

      1. That a whole bunch of crufty redundant code in a tight loop, "doesn't matter because it doesn't add to the complexity because it's constant time"

      2. That ADDING an additional nested loop in place of complex state-machine type logic could possibly be more efficient... "but it's O(n^3) now instead of O(n^2)!!!"

      Sure, that's a counter-intuitive example - but my thinking was not limited by their "n^3 is bad" mantra they had "learnt". I was able to easily recognise that their stupid, massive state-machine logic was way more cycle-heavy for the current problem parameters than simply replacing it with another very simple loop. It shrunk that particular cpp file by 500 lines and resulted in about a 6x speedup for that section of code.

      Was my solution going to be 6x faster for all possible data sets? No, in fact it would have been much SLOWER given large data sets but the problem parameters were clearly defined (not to mention we were already taking worst-possible-case data due to limitations of other parts that were beyond our control).

      COUNT THE CYCLES. Do the maths. Big-O is a useful TOOL but most people seem to be completely missing its point.

      You can NOT just throw those CONSTANTS away and forget about them. At some point, you have to plug in those co-efficients, because you may be surprised to learn that the O(n^3) solution works quicker than the O(n^2).
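
      To put numbers on that, here is a toy sketch. The coefficients (100 for the quadratic version, 1 for the cubic one) are invented purely for illustration and are not measurements from the project described above; the point is only that the crossover depends on them.

```python
def cost_quadratic(n, coeff=100):
    """Cycle estimate for an O(n^2) routine with a heavy constant factor."""
    return coeff * n ** 2


def cost_cubic(n, coeff=1):
    """Cycle estimate for an O(n^3) routine with a light constant factor."""
    return coeff * n ** 3


def crossover(max_n=10_000):
    """Smallest n at which the cubic routine becomes the more expensive one."""
    for n in range(1, max_n + 1):
        if cost_cubic(n) > cost_quadratic(n):
            return n
    return None


for n in (10, 50, 100, 200):
    print(n, cost_quadratic(n), cost_cubic(n))
print("cubic only loses from n =", crossover())
```

      With these made-up coefficients the "worse" O(n^3) routine is cheaper for every n below 100, so if the problem parameters guarantee small inputs, it is the better choice.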

    16. Re:You are wrong in every way. by swillden · · Score: 2, Informative

      There are awkward things you can do at kernel compile-time to get more than 1GiB accessible to the kernel on ia32, but it's not as pretty as you seem to be thinking.

      Well, if setting CONFIG_HIGHMEM=y counts as "awkward". The kernel docs say that's the correct setting for machines with between 1 and 4 GiB of RAM.

      From my laptop (with 1.5GB RAM and a Linux 2.6.13 kernel, with CONFIG_HIGHMEM=y, which is actually the default setting on most distros these days):

      %cat /proc/meminfo
      MemTotal: 1555496 kB
      MemFree: 32000 kB
      Buffers: 137900 kB
      Cached: 1228596 kB

      1228596 KiB == 1199.8 MiB == 1.1717 GiB

      So my kernel is currently using more than 1GiB for caching disk storage. In fact, my kernel can address up to 4GiB of RAM. I have another 1GiB DIMM on the way (which will push my laptop to 2 GiB RAM), so in a few days I'll be able to show my machine caching around 1.7GiB. (Yes, there is a reason I need 2 GiB RAM in my laptop, and it's actually not file caching).
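
      For anyone who wants to redo that arithmetic, a small sketch that parses the same kind of output and converts the Cached figure. The sample text is the output quoted above; the parse_meminfo helper is made up for illustration. On a live Linux box you could feed it the contents of /proc/meminfo instead.

```python
def parse_meminfo(text):
    """Turn 'Name: value kB' lines into a dict of integer KiB values."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


sample = """MemTotal: 1555496 kB
MemFree: 32000 kB
Buffers: 137900 kB
Cached: 1228596 kB"""

info = parse_meminfo(sample)
cached_mib = info["Cached"] / 1024
cached_gib = cached_mib / 1024
print(f"{cached_mib:.1f} MiB == {cached_gib:.4f} GiB")  # 1199.8 MiB == 1.1717 GiB
```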

      For machines with more than 4GiB of RAM you have to use PAE. That will allow the system to use up to 64GiB of RAM, but each process (including the kernel, even though it's not really a process) can only access 4GiB. So, your argument holds some water in the case that:

      • The machine has more than 4GiB to use for caching.
      • The machine has a 32-bit processor
      • The database engine runs multiple cache daemons, each of which caches up to the 4GiB of data it can address, and the actual server process has some mechanism for querying the correct cache daemon for a particular chunk of data.

      In that case, the database can cache more than the kernel. I'm not aware of any database engine that has such cache daemon processes. IMO, if you're putting more than 4 GiB in the box, you should probably go ahead and buy an Opteron for it also, avoiding the whole issue.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
    17. Re:You are wrong in every way. by The+Clockwork+Troll · · Score: 1, Funny

      You're very angry and I appreciate that! I hope you equally appreciate hearing that your argument is a fucking failure because it fucking assumes that implementing a fucking scalable fuckfuckinging mail server reduces to a few fopen(); read(); fclose() calls.

      Anyone who's actually done more than tried to fucking lecture somebody the fuck on why they're not using the right fucking approach to implefuckmenting a fucking cock ass fuck shit mail server would realize that if you don't use some sort of database back-end, you still end up creating data fuckstructures and cockindexes on top of this magical filesystem of yours, and fucking code to manipulate those fucking data structures, and code to update the fuck out of those data structures and fucking indexes, and fucking code to fucking partition the godly file system directories so they are fucking balanced, and lo and behold you've fucking implemented something that looks like a fucking database! The difference being, you wrote it so it's got newer and more interesting fucking bugs than My-fucking-SQL, but god fucking damn it, you used the FILE SYSTEM so eat a fuck!

      --

      There are no karma whores, only moderation johns
    18. Re:You are wrong in every way. by csirac · · Score: 1

      Hey, don't get me wrong. I have no idea why I would want any email in an SQL database either. As was pointed out, a filesystem storage scheme suits email perfectly.

      I'm developing an app that involves image archiving. The SQL DB contains image IDs which translate to files on a JFS filesystem that hold the actual (1MiB or so) images.

      There's a time and place for everything... perhaps I'd put email in an SQL DB if I was adding plenty of searchable meta information to each message for statistical analysis and stuff...

      Was just trying to be informative ;)

    19. Re:You are wrong in every way. by koniosis · · Score: 1

      From that paper:

      The obvious performance differential between the database options and both UW and Courier indicates that email storage is indeed a problem well suited for a database solution.

      But you are right that file deletions take longer in MySQL. In almost all the other tests, though, it appears to outperform the file-based stores, especially when the tests are repeated (because the results of those tests are cached).

      --
      I spent ages trying to think of sig, but never did :(
    20. Re:You are wrong in every way. by rahard · · Score: 1
      Filesystem reads DO NOT go to disk, that's the fucking point, that's what a cache is, it's storing it IN RAM. Filesystem reads go to the buffer cache, if something is not in the cache, the kernel reads it into the cache. Those tens of thousands of users checking every minute are NO LOAD ON THE DISKS AT ALL.
      I am a fan of flat filesystems, but I think it's a different story for 1 million users. How big is your RAM if you want to cache the whole (mail) thing?

      I have a large mailbox (> 500 MB) and I don't think I am the only one with big mailboxes. I could only imagine if there are 1 million users accessing their mailboxes (say with IMAP).

      Even running scripts with many `find' calls can sometimes make my system slow to a crawl ... A friend did this:

      for i in `seq -w 1 1000000`; do mkdir $i; done

      mkdir: cannot create directory `33147': Too many links
      mkdir: cannot create directory `33148': Too many links
      mkdir: cannot create directory `33149': Too many links

      Still thinking of creating 1 million files?

    21. Re:You are wrong in every way. by TheRaven64 · · Score: 1
      Real world example: Quicksort. This algorithm is O(n^2) in the worst case, while everyone knows that good comparison-based sorting algorithms are O(n log(n)). Therefore, Quicksort must be rubbish, right?

      Quicksort works well in the real world, because in the majority of cases it doesn't come close to its worst-case performance.
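
      A small sketch of that gap, counting pivot comparisons for a naive first-element-pivot Quicksort. The pivot rule, the input size of 200 and the function name are assumptions chosen for illustration; real library sorts pick pivots more carefully.

```python
import random


def quicksort_comparisons(items):
    """Count pivot comparisons made by a first-element-pivot Quicksort.

    One comparison is counted per element per partition step. The sorted
    result itself is not returned, only the count.
    """
    comparisons = 0
    pending = [list(items)]  # explicit stack, so deep recursion is not an issue
    while pending:
        chunk = pending.pop()
        if len(chunk) < 2:
            continue
        pivot, rest = chunk[0], chunk[1:]
        comparisons += len(rest)
        pending.append([x for x in rest if x < pivot])
        pending.append([x for x in rest if x >= pivot])
    return comparisons


n = 200
already_sorted = list(range(n))                      # worst case for this pivot rule
shuffled = random.Random(0).sample(range(n), n)      # a typical input

print("sorted input:  ", quicksort_comparisons(already_sorted))
print("shuffled input:", quicksort_comparisons(shuffled))
```

      On already-sorted input this pivot rule does n(n-1)/2 comparisons; on the shuffled input it does far fewer.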

      --
      I am TheRaven on Soylent News
    22. Re:You are wrong in every way. by greenhide · · Score: 2, Insightful

      In a related vein:

      I'm a lowly web programmer, not nearly as brilliant in the programming field as these other geniuses here, but I find it interesting that almost all web programming books tell you that if you can move processing into the database query instead of running it in the machine code, that it'll be faster.

      This is so rarely the case. Unless you have a very powerful database server, odds are good that quite a lot of the various aggregate functions you might want to run will go much, much faster if you simply do a simple select in the database and then loop through the processing in the web app code. Not sure why this is true, but it is.

      A month or two ago I heard a great quote on Car Talk that I think should be plastered to every programmer, scientist, and engineer's bulletin board:

      "Reality often astonishes theory."

      In all honesty, though, I think that a database *would* be up to the task, even for 1M+ users. Consider Amazon, which probably gets several thousand simultaneous hits each second. And each page they pull up involves much more complex data searches than a simple mailbox.

      I'd say the key concerns here aren't surrounding efficiency of processing. Mail servers, no matter how configured, are relatively low on the scale of computational complexity. It's more a size issue than anything else. The main problem will be determining how to store the data in a way that is safe, secure, fast, and reliable. Because the data needs to be redundant and widely dispersed (as in the New Orleans example someone pointed out above), it may be that a database, while not the fastest tool, may be the best tool for the problem.

      I'll admit; I know nothing about how one would go about making identical file systems available simultaneously on many distant servers. But I'm guessing once you start doing that, you're starting to increase the complexity for the system in any case.

      --
      Karma: Chevy Kavalierma.
    23. Re:You are wrong in every way. by dwandy · · Score: 1
      lovely rant you've got going here... but
      it's not even relational data.
      really?
      How is "I want messages sent to XYZ that are unread" any less relational than "I want cars at dealer XYZ that are unsold" ?
      It seems to me that e-mail is a collection of related data - perfect candidate for a relational database.

      just curious...
      --
      If you think imaginary property and real property are the same, when does your house become public domain?
    24. Re:You are wrong in every way. by Wdomburg · · Score: 1

      Reading files from a directory is faster than an indexed database query? Interesting theory, and perhaps true with a small number of files that happen to be hot in the buffer-cache.

      If we're talking about a reasonably loaded mailserver, with a lot less memory than disk space (we average about 512MB per 100GB of storage), no. The machine will choke on a flood of itsy bitsy teeny weeny disk ops. This is, however, where a database shines - whether we're talking about something fully relational, or something embedded (like a dbm or bdb file).

      Databases are not only appropriate for when data is being queried in "arbitrary, complex ways". The majority of databases out there are used to provide simple, well-defined access to indexed persistent data. This use pattern suits mail just dandy.

    25. Re:You are wrong in every way. by timjdot · · Score: 1

      Please don't attribute to SW Engineers what is correctly attributed to inexperience. A lot of DB advocates never worked much with filesystems. Anyways, I think ORCL plops in its own file management instead of the OS's, so that underscores your point. BTW, IME ORCL is a bit of stone soup, as the query optimization is really a manual process of determining indexes et cetera, so it does not really offer savings versus a properly designed file-based system. Of course, DBAs are more prevalent, and the tools to hook up to DBs are well established and taught in the industry. BTW, email is inherently easy to scale as the work load is inherently not interconnected. A tools vendor who tells you email serving is a hard problem should be crossed off the list. It's just not that hard of a problem, and the only hard parts are the load balancing and management of cheap resources and backups. And, of course, choosing proper hardware and design to avoid issues with HW failures and continue operating robustly. Best wishes, TimJowers

      --
      Expect Freedom.
    26. Re:You are wrong in every way. by WinterSolstice · · Score: 1
      (Yes, there is a reason I need 2 GiB RAM in my laptop, and it's actually not file caching).

      Let me guess... you are running Lotus Notes AND Firefox on your desktop? Or is Notes only a pig under Windows?

      -WS

      --
      An operating system should be like a light switch... simple, effective, easy to use, and designed for everyone.
    27. Re:You are wrong in every way. by Anonymous Coward · · Score: 0

      That's more of a hardware limitation than something Linux would impose if it had the choice. On x86 every process's page table must have the kernel memory mapped. The user-process-after-1GB thing is just an arbitrary limit someone picked. Most (all?) 386+ kernels already do this.

    28. Re:You are wrong in every way. by BorisAmmerlaan · · Score: 2, Insightful
      A friend did this:
      for i in `seq -w 1 1000000`; do mkdir $i; done

      So you took the nearest LART and Enlightened him.

      Seriously, though - is there ever a reason to stick 1,000,000 objects into one container without any regard whatsoever to the type of objects or container? (Ignorance doesn't count.)
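
      The usual way around that is to fan the objects out over hashed subdirectories. A minimal sketch, where the mailbox_path helper, the /var/mail root and the two-level, two-hex-character layout are all assumptions picked for illustration (md5 is used here only to spread names evenly, not for security):

```python
import hashlib
import os
from collections import Counter


def mailbox_path(root, user, levels=2, width=2):
    """Map a user to root/<hash bucket>/<hash bucket>/user instead of root/user."""
    digest = hashlib.md5(user.encode("utf-8")).hexdigest()
    buckets = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *buckets, user)


# 100,000 made-up account names; nothing is created on disk here.
paths = [mailbox_path("/var/mail", f"user{i}") for i in range(100_000)]
per_leaf = Counter(os.path.dirname(p) for p in paths)

print(mailbox_path("/var/mail", "user42"))
print("leaf directories used:", len(per_leaf))
print("most mailboxes in one directory:", max(per_leaf.values()))
```

      With two levels of 256 buckets each, no directory ever holds more than 256 subdirectories plus a handful of mailboxes, which stays well clear of the "Too many links" error quoted above.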

    29. Re:You are wrong in every way. by MadAhab · · Score: 1

      No, storing mail in a database is still stupid. Backups are now unnecessarily complicated. You have to write your own mail server rather than scaling up an existing one. Etc. But by all means, continue wanking.

      --
      Expanding a vast wasteland since 1996.
    30. Re:You are wrong in every way. by swillden · · Score: 1

      Let me guess... you are running Lotus Notes AND Firefox on your desktop? Or is Notes only a pig under Windows?

      :-)

      Actually it's because I need to simulate a half-dozen machines using VMWare. Running Linux plus six copies of Windows Enterprise Server 2003 consumes a little bit of RAM, even when you give the VMs the bare minimum they can run with.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
    31. Re:You are wrong in every way. by Some+Random+Username · · Score: 1

      If car dealers have associated data, and cars have associated data, and they are related, then it's relational data. Email is not like this. Messages sent to xyz isn't relational because who it's sent to is an attribute of the mail; there is no separate "table" containing "people" objects and their associated data.

      Unread mail is in the "new" directory. All you would need to do is scan the new directory's index to find messages sent to xyz, exactly the same as what a database would have to do, without the additional overhead of handling connections, parsing SQL, dealing with locking and contention from so many queries on the same table at the same time, updating transaction logs and flushing them to disk (making more disk access instead of less), etc.
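
      As a rough sketch of the cheapest part of that check (is there anything in new/ at all), assuming a standard Maildir layout with cur/, new/ and tmp/ subdirectories. The helper names and the placeholder file names are made up, and the demo builds a throwaway Maildir in a temporary directory rather than touching real mail.

```python
import os
import tempfile


def count_new_messages(maildir):
    """Count regular files in <maildir>/new without opening any message."""
    new_dir = os.path.join(maildir, "new")
    with os.scandir(new_dir) as entries:
        return sum(1 for entry in entries if entry.is_file())


def make_demo_maildir(root, unread):
    """Create a throwaway Maildir layout with `unread` empty files in new/."""
    for sub in ("cur", "new", "tmp"):
        os.makedirs(os.path.join(root, sub), exist_ok=True)
    for i in range(unread):
        # Placeholder names, not real Maildir unique file names.
        with open(os.path.join(root, "new", f"msg{i}"), "w"):
            pass
    return root


with tempfile.TemporaryDirectory() as tmp:
    maildir = make_demo_maildir(tmp, unread=3)
    print(count_new_messages(maildir))  # 3
```

      The check is a single directory listing; no message body is read unless the count is non-zero.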

    32. Re:You are wrong in every way. by henni16 · · Score: 1

      odds are good that quite a lot of the various aggregate functions you might want to run will go much, much faster if you simply do a simple select in the database and then loop through the processing in the web app code. Not sure why this is true but it is.

      Do you have an example for this?
      I only know counterexamples for that, especially if there is some latency between the web- and db-servers (or in general: the DB and the Client, be it a Webapp or another program).
      The only things I can think of are:
      * you always use "serializable" transactions and have lots of users
      * you have heavy load on the database server(s) (maybe because of complex joins) and less load on the webserver(s) and that heavy load slows down the simple selects
      * someone did a horribly stupid job implementing the aggregate functions ;-)

      But if you only have simple selects and aggregate functions I can't think of an example;
      especially since there are some functions (max, min) where the database doesn't even have to do calculations if the fields are indexed (I know you didn't say _all_ functions).

      I am honestly curious about examples for the other case

    33. Re:You are wrong in every way. by eh2o · · Score: 1

      It is well known that for various small problems the less efficient algorithms are actually faster, e.g., sorting when n < 5 or so is faster using a naive approach. Big-O analysis includes worst/best/average case analysis as well as input constraints. Maybe your software engineers partied too much in school (or didn't go at all) because Big-O done right is not just a matter of drilling some mantra.

      Rules of thumb: 1) optimize constant factors last (simple economics... witness the massive popularity of scripting languages). 2) benchmark and use a profiler under conditions as close as possible to those expected in the real world. To do otherwise is irrational hubris or just plain naive.
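
      To make the small-n point concrete, here is a rough sketch of a merge sort that falls back to insertion sort below a cutoff. The names and the CUTOFF value of 5 are assumptions for illustration; per rule 2, the real threshold has to be measured on your own data.

```python
CUTOFF = 5  # assumed threshold; measure before trusting it


def insertion_sort(items):
    """Naive O(n^2) sort that tends to win on very small inputs."""
    out = list(items)
    for i in range(1, len(out)):
        cur = out[i]
        j = i - 1
        while j >= 0 and out[j] > cur:
            out[j + 1] = out[j]
            j -= 1
        out[j + 1] = cur
    return out


def hybrid_sort(items):
    """Merge sort that hands small slices to insertion_sort."""
    items = list(items)
    if len(items) < CUTOFF:
        return insertion_sort(items)
    mid = len(items) // 2
    left, right = hybrid_sort(items[:mid]), hybrid_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]
```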

    34. Re:You are wrong in every way. by greenhide · · Score: 1

      Unfortunately I can't remember the exact instances where this has been true. To be honest, it really is 50-50; I find that it's been less true recently, possibly because I've become a little wiser when setting up the databases to begin with.

      Actually, a good example is when you need to return a recordset with only one row per "real" record, but you need to pull the first value only from a one-to-many relationship with that record -- like, say, details of the first transaction for that record.

      So say the query that takes too long looks like this:

      SELECT
            ( SELECT order_total FROM Orders WHERE Orders.order_person=People.person_id AND Orders.order_id = ( SELECT MIN(order_id) FROM Orders WHERE Orders.order_person=People.person_id)
      ) as earliest_total,
            People.first_name,
            People.last_name,
            [etc]

      FROM
            People
      ORDER BY
              People.last_name, People.first_name

      Depending on the DB server, it may be faster to just do:

      SELECT
            People.person_id,
            People.first_name,
            People.last_name,
            [etc]

      FROM
            People
      ORDER BY
              People.last_name, People.first_name

      LOOP OVER QUERY
            SELECT order_total
            FROM Orders
            WHERE order_person=[person_id]
            ORDER BY order_id ASC

            earliest_total=order_total[0]

      Of course, keep in mind I'm just pulling out an example that I can remember. Sometimes, especially with a lot of large tables all joined together, some of these subqueries can take a lot of processing time -- we're talking 15-30 seconds. On the other hand, most generic selects can take just 1-2ms, even with database connection overhead.
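
      For anyone who wants to try both shapes side by side, here is a small sketch using Python's built-in sqlite3 module. The table layout follows the example above; the sample rows, the function names, and the LIMIT 1 in the loop are assumptions added for illustration. It only shows the two approaches return the same answer and says nothing about which is faster on a given server.

```python
import sqlite3


def build_db():
    """Create a tiny in-memory database with made-up sample rows."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE People (person_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
        CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, order_person INTEGER, order_total REAL);
        INSERT INTO People VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Kim');
        INSERT INTO Orders VALUES (10, 1, 5.0), (11, 1, 9.0), (12, 2, 7.5);
    """)
    return db


def earliest_totals_subquery(db):
    """One statement with a correlated subquery per person."""
    sql = """
        SELECT People.person_id,
               (SELECT order_total FROM Orders
                WHERE Orders.order_id = (SELECT MIN(order_id) FROM Orders
                                         WHERE Orders.order_person = People.person_id))
        FROM People
        ORDER BY People.last_name, People.first_name
    """
    return dict(db.execute(sql).fetchall())


def earliest_totals_loop(db):
    """One simple select for the people, then one small select per person."""
    result = {}
    people = db.execute(
        "SELECT person_id FROM People ORDER BY last_name, first_name").fetchall()
    for (person_id,) in people:
        row = db.execute(
            "SELECT order_total FROM Orders WHERE order_person = ? "
            "ORDER BY order_id ASC LIMIT 1", (person_id,)).fetchone()
        result[person_id] = row[0] if row else None
    return result
```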

      --
      Karma: Chevy Kavalierma.
    35. Re:You are wrong in every way. by rahard · · Score: 1
      Seriously, though - is there ever a reason to stick 1,000,000 objects into one container without any regard whatsoever to the type of objects or container? (Ignorance doesn't count.)

      Like you said, it was just a (silly) oneliner attempt. Well, another person tried to do it more hierarchical with subdirectories, eg:

      /spool/b/o/r/i/s/a/m/m/e/r/l/a/a/n/
      ...
      /spool/b/u/d/i/r/a/h/a/r/d/j/o/
      /spool/u/s/e/r/n/a/m/e/

      We just wanted to test whether creating 1 million mailboxes is problematic when implemented directly in the filesystem. (I still want to try, though, since I could write tools to mangle mailboxes without having to go through the database.)
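
      A quick sketch of how such paths could be computed. per_letter_path reproduces the one-directory-per-character layout shown above; hashed_path is a different scheme (a fixed number of levels taken from a SHA-256 digest) that I am adding purely as an assumption for comparison, since nobody in this thread described it.

```python
import hashlib
import posixpath


def per_letter_path(username, root="/spool"):
    """One directory level per character, as in /spool/u/s/e/r/n/a/m/e/."""
    return posixpath.join(root, *username) + "/"


def hashed_path(username, root="/spool", levels=2):
    """Fixed-depth layout: two hex digits of a digest per level, then the name."""
    digest = hashlib.sha256(username.encode("utf-8")).hexdigest()
    parts = [digest[2 * i:2 * i + 2] for i in range(levels)]
    return posixpath.join(root, *parts, username) + "/"
```

      With two hashed levels there are 256 * 256 = 65,536 buckets, so a million mailboxes would average around 15 per directory.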

    36. Re:You are wrong in every way. by Anonymous Coward · · Score: 0

      I hope you equally appreciate hearing that your argument is a fucking failure because it fucking assumes that implementing a fucking scalable fuckfuckinging mail server reduces to a few fopen(); read(); fclose() calls.

      Of course not - why, silly, your mailserver would never *store* any mails without write() calls. There - open, read, write, close - and I'll give you seek as a bonus. Let the bloody C runtime figure out how to turn that into a mailserver. Wait, that didn't sound quite right. Damn!

    37. Re:You are wrong in every way. by henni16 · · Score: 1

      Okay, correlated subqueries and stuff are more than a simple select for me. ;-)

      So I think the problem for that example won't be the DB and the aggregate function, but the nested SQL query:
      If I'm not mistaken, a totally non-optimized execution of that query could result in doing the innermost join of "orders" and "person" (for "min(order_id)") a total of "(numOfPersonsInDB * numOfOrdersInDB) * numOfPersonsInDB" times.

      Okay, in reality it won't be (numOfPersonsInDB*numOfOrdersInDB) but (numOfPersons+x) because throwing away all the tuples that don't match "order_person=person_Id" before executing the innermost query is an almost certain optimization.
      But you still end up with "(numOfPersons+x)*numOfPersons" joins for the "min". In comparison, your pseudo-code does only one joinless (but maybe big) select to get the persons and later does one join of orders and person for each person
      => only numOfPersons Joins of orders and person - and only some slight overhead for possible unneeded data transfer and sorting (not really big if order_id has a non-hash index).
      On a side note:
      That's an example of what I meant with possible problems relating to network latency if you do things in the client instead of the DB:
      if each of these queries in "LOOP OVER QUERY" needs 50-100ms to travel..
      (I recently had much "fun", "using" an application with the exact same "get all FOO, for every FOO get BAR"-code on a DSL line instead of the university network ;-) )

      I suck at SQL if I don't have lots of time, so the following non-correlated version might or might not get the same results :

      SELECT order_total, first_name, last_name
      FROM Orders
      JOIN
      (SELECT MIN(order_id) AS minOrder, person_id AS pId, last_name, first_name
      FROM People, Orders WHERE person_id=order_person
      GROUP BY person_id, last_name, first_name) AS firstOrders
      ON order_id = minOrder

      I *guess* that this will be faster than both of the other versions:
      only two joins and some groping..aeh grouping :-)
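
      A hedged way to check a rewrite like this is to run it against a throwaway database. The sketch below uses Python's sqlite3 module with made-up sample rows and the People/Orders table names from the earlier example; the derived-table alias firstOrders is my own addition.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE People (person_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, order_person INTEGER, order_total REAL);
    INSERT INTO People VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Kim');
    INSERT INTO Orders VALUES (10, 1, 5.0), (11, 1, 9.0), (12, 2, 7.5);
""")

# Non-correlated version: find each person's lowest order_id once,
# then join back to Orders to pick up that order's total.
JOIN_SQL = """
    SELECT order_total, first_name, last_name
    FROM Orders
    JOIN (SELECT MIN(order_id) AS minOrder, person_id AS pId, last_name, first_name
          FROM People JOIN Orders ON person_id = order_person
          GROUP BY person_id, last_name, first_name) AS firstOrders
      ON order_id = minOrder
    ORDER BY last_name, first_name
"""

rows = db.execute(JOIN_SQL).fetchall()
print(rows)
```

      One difference to keep in mind: because this is an inner join, a person with no orders drops out of the result, whereas the correlated-subquery version would return that person with a NULL total.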

    38. Re:You are wrong in every way. by crucini · · Score: 1

      Actually, to say that X is an order of magnitude greater than Y is to say X=KY, that is X is a constant factor times Y. Or more precisely, Y < X < K*K*Y.

      So if algorithm X is always an order of magnitude slower than algorithm Y, we can assume they share the same complexity.

    39. Re:You are wrong in every way. by Wdomburg · · Score: 1

      Whether the idea is implemented is irrelevant to whether it's good. (And for that matter, there are database driven MTAs already.)

      And whether something "complicates backups" is not always the major concern for a particular implementation. If that were true, no one would be using databases at all.

      In the case of mailbox indexing, there's a simple solution to that concern anyways - don't put mail in the database; just shove metadata in there. Voila. Problem solved. You can even leave the actual mailbox in any format you'd like, whether your preference is mbox, mbx, maildir, or whatever. And then your index can be regenerated if it gets corrupted, or the on-disk data changed since last access.

      Have a problem with using an external process for indexing? Fine, use an in-process database - sqlite, or maybe bdb if you don't insist on SQL semantics.

      The approach certainly doesn't make sense for every instance, but as usual, simplistic approaches only make sense up to a point. Hell, even "plain filesystem" approaches have become more complex over the years because the original ones were simple because they were unoptimized (try putting a gig of mail in mbox format, and then deleting an old message for a good example).
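
      As a rough illustration of the "metadata only" idea, the sketch below rebuilds a throwaway SQLite index from a Maildir using only Python's standard library. The table name msg_index, its columns, and the rule that a missing "S" (seen) flag means unread are assumptions for this example; the mail itself never leaves the filesystem.

```python
import mailbox
import sqlite3


def rebuild_index(maildir_path, db):
    """Drop and regenerate the metadata index from the on-disk mailbox."""
    db.execute("DROP TABLE IF EXISTS msg_index")
    db.execute(
        "CREATE TABLE msg_index ("
        "msg_key TEXT PRIMARY KEY, sender TEXT, subject TEXT, unread INTEGER)")
    box = mailbox.Maildir(maildir_path, create=False)
    count = 0
    for key, msg in box.iteritems():
        unread = int("S" not in msg.get_flags())
        db.execute(
            "INSERT INTO msg_index VALUES (?, ?, ?, ?)",
            (key, str(msg.get("From", "")), str(msg.get("Subject", "")), unread))
        count += 1
    db.commit()
    return count
```

      Because the index is derived data, losing or corrupting it only costs a rescan, which is the backup argument in a nutshell.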

  269. Insightful?? by Black-Man · · Score: 1

    Did you ever think that organizations with customer care need their messages stored in a relational database because they need to reference the threads, i.e. the history of the communication? And you could be talking 100's of thousands of inbound emails *every* day?

  270. Re:Obviously - RTFM!!! by elrick_the_brave · · Score: 2, Informative

    Exchange 2003 - any edition. You can scavenge the restored database and bind it to any account that doesn't have any Exchange, i.e. a new temporary account... RTFM!!!!!

    --
    (1st sig) If this were a snappy sig, you'd be reading it right now. (2nd sig) I'm a karma whore. >Insert FUD here
  271. One Word -- Openwave by Anonymous Coward · · Score: 0

    While Sendmail's the only other MTA with the flexibility and performance, even it can't keep up with Openwave Email Mx. And Sendmail is a beast to manage on a large scale, and it's only one piece of the puzzle. Trust me, the price is worth the performance and manageability. It has all the features you ask for, plus. Obviously, hardware is the other part. Netapps are a good mid-range product, but they love their stuff. Get ready to pay twice for it. There's a range of load balancers out there that fit the bill. If it's a high traffic site, go with Foundry's ServerIron. Otherwise, you'll want to just stick with Cisco.

    www.openwave.com

  272. Erm, non-existent free calendaring on *nix? by Vlad_Drak · · Score: 1

    Sorry, but calendaring is not there yet on *nixes, try back later. Yeah, iCal, WebDAV, blah.. nobody has done it yet for free; at least not in the users' eyes.

    Hula's calendaring is looking swank these days, though. If the codebase ever becomes more modular (split out SMTP/POP/IMAP/Calendar), then, Hula might make some big waves.

    You can roll your own inbound/outbound MXs and filtering boxes but the creamy filling that the people demand is going to come shrinkwrapped.

  273. stay away from Windows by Anonymous Coward · · Score: 0

    99.9% uptime and Windows are mutually exclusive so, yes, you need something non-Microsoft anyway.

  274. IBM and their Notes, the way it was 7 years ago. by 4iedBandit · · Score: 1

    As of 7 years ago, when I worked for IBM, the Notes installation for all of IBM's West Geoplex consisted of SP Nodes. Silver thins running AIX. About 6 frames worth (12 nodes per frame), using high nodes for backup with TSM. Everything was connected to SSA, I can't even remember how many drawers. I do remember when I was cold all I had to do was go stand behind the SSA racks. ;-) High nodes suck, just for the record.

    Everything was tied in with the SP high speed switch, which was connected to two Ascend switch routers. (If I remember the company was bought by Lucent). The Notes complex for mail was tied in to the campus network via the switch routers, and also tied to the Notes Database complex (which was a similarly large SP installation.)

    We were using gated to dynamically change the default routes if one of the campus network connections died.

    We also used pman to monitor the health of the complex. pman notifies you instantly when something goes wrong, whereas Tivoli monitoring only polls every few minutes. There were several occasions when I worked there where we detected and resolved problems before Tivoli ever noticed.

    Is this relevant to what you want to do? Probably not, just reminiscing a bit. I'm one of the few I know who actually liked the SP.

    What they do now, I have no idea. It wouldn't surprise me though if it was Regatta LPARs with GigE and Shark disk.

    Now as far as Notes is concerned, RUN, do not walk, away. I can't stand it, but hey, it's your mail system. ;-)

    --
    "The avalanch has already started, it is too late for the pebbles to vote." -Kosh
  275. Murder by Conception · · Score: 1

    I did some investigating on expandable mail systems and the only one I found was cyrus' murder project, http://asg.web.cmu.edu/cyrus/ , and so far it's worked quite well for me. The campus supports 10's of thousands of users. I don't know of any reason why it could not be expanded to hundreds of thousands and beyond. It supports high availability and is quite fast. Also, unlike other servers it allows a single namespace, so no imap1.domain.com, imap2.domain.com, everyone is just imap.domain.com. Check it out.

  276. Ok then by Yakasha · · Score: 1

    And another. DBMSs make integration with web applications a lot easier. Odds are there are already tools and classes and languages (like php) to easily mess with the db.

    Also, if you want your email in a db without using exchange, try dbmail.

    I've set up a dbmail/postfix installation to use with my company's web application we're developing. Though I have no idea how well these work with 1000000 users, they're worth exploring.

  277. Really want to know why the client isn't as slick? by CFD339 · · Score: 2, Informative

    Simple. It's cross platform. The entire product is cross platform. Yeah, like Java. Only they did it before Java was a pipe dream. Late 80's.

    It has this thing called a separation layer. All the code except the UI is the same on all the platforms. Clients used to be for OS/2, Mac, Win16, Win32, and Solaris. On the client side that got scaled back because nobody paid for the others -- the client is Win32 and Mac now -- soon with code under Linux as part of the next generation client. Lots of people are using it on Wine.

    Now, the server is still cross platform. Win32, Linux, Aix, iseries (as/400), zseries.

    The problem with making something cross platform is, you don't use all the nifty little Windows specific integration and custom pretty things. You don't get something for nothing -- you have to make all those bits.

    Oh, the other thing? Outlook feels integrated because everything automatically does the Windows automatic-launch ActiveX thing. Just highlight a message subject, bingo! Embedded code launches! That's why viruses and worms.

    If stuff wants to run in Notes, it has to have a signature. OHHH, public/private key signatures and encryption. When? 1991. Hunh? Yeah, since 1991.

    If something wants to run in Notes -- it needs PERMISSION to run. Thus, no viruses or worms unless you're stupid enough to tell them "OK, sure, go screw up my machine".

    Yes -- the development environment is weird and pretty unsophisticated. It takes a lot of time to learn because it's not like other things. BUT -- I can make it do cool, secure, reliable things at a tenth of the cost you can in J2EE or MS .NET.

    Excited about JSR170? Ah, me too. The Notes database internals match it almost perfectly. Domino will make a great JSR170 back end. Hell, it's almost that already.

    Meantime, you trolls are whining about a product that runs in Linux as a server and (using Wine) as a client. Runs on Mac. Has a fully functional JAVA environment for development and a remote API through CORBA and DIIOP. No no, instead you'll use a proprietary only -- Windows Only, Active Directory Only, Virus Distribution Engine from Microsoft.

    ahahahahahaha. Enjoy it!

    --
    The problem with quotes on the internet, is that nobody bothers to check their veracity. -- Abraham Lincoln
  278. Clustered Groupwise system. by robpoe · · Score: 1

    Either run it on Netware or OES. Running it on Netware gives you full clustering capability .. which means very little, if any downtime. (I run GW7 on a NW6.5 cluster, and if a node fails or is taken down for maintenance, it's back and running within about 5-10 seconds on a different node. The clients never see it being down, they only notice a few second delay in sending/opening an email).

    GW 7 has NICE webmail, pop3/imap/pop3s/imaps/SOAP, is LDAP compliant, has PDA connect software, and the ability to do webmail with light devices (i.e. your phone).

    --
    = Grow a brain...
    1. Re:Clustered Groupwise system. by M3number3 · · Score: 1

      I concur - there is no reason not to consider the GroupWise system. I remember a few years back that Novell showed NDS/eDirectory scaling to 1 billion (said with inverted pinky to mouth) objects. A clustered GW7 system, well planned, will provide you with exactly what you're looking for. Best of all there are clients for just about everything, AND you can run Post Offices on 'doze or Linux based on your per-location needs. Assuming all the boxes won't be in the same physical location, you should also take some time to carefully plan/analyze your WAN environment. NDS/eDirectory will be far more efficient from a traffic perspective than will Microsoft AD, so you will see a small WAN performance gain right there. Lastly, it goes without saying these days, but employ a reliable SPAM and AntiVirus gateway to make sure that junk isn't sapping up valuable bandwidth. BTW - I would think that with 1 million users (no pinky necessary) that Novell would be willing to pony up some consulting support and aggressive pricing in return for the right to do a Case Study on your deployment. Good luck!

    2. Re:Clustered Groupwise system. by halgorithm · · Score: 0

      I couldn't agree more - Novell has made huge strides in GroupWise. If anything out there could be considered an 'Exchange killer', it would have to be GroupWise.

  279. Consider Openwave or Sun Enterprise Mail Platform by sbrath · · Score: 1

    I'd check out the Sun Mail platform or the Openwave platform. They are pay-for, but scale very large! Sun also licenses per employee instead of per mailbox, which is a plus for email providers. The Sun platform has a plugin for Outlook that allows it to totally mimic Outlook back-end things like calendar and such. http://www.openwave.com/ http://www.sun.com/software/javaenterprisesystem/index.xml Just my opinion, for what it's worth.

  280. A few comments. by Some+Random+Username · · Score: 1

    Sendmail or postfix or even qmail will do the job just as well as exim. Just say "use whatever MTA you like" instead of trying to pretend your MTA of choice is the only way to go.

    I found dovecot to be faster than courier, and use less RAM. It also does ldap, ssl, maildir, etc, etc.

    Making a mess of ugly directories is not needed if you are using a decent filesystem. I know the BSD's FFS has dirhash to make handling tens of thousands of files/dirs in a single directory work just fine, Solaris has something too. I'm sure one of the dozens of linux filesystems has this dealt with. And don't bother with a linux or BSD NFS server for something this size, just go netapp.

    Spamassassin!?!? Good lord man, you will need dozens of servers just to run that. It's incredibly resource intensive; it needs its own server just for a few thousand users. Perl and tons of string mangling is not a resource-efficient spam filtering solution. Use a statistical filter written in C.

    Don't make all the boxes the same. It's a much more effective use of resources to dedicate these X boxes as MTAs, these as POP, these as IMAP, LDAP over there, etc, etc. You don't want all that stuff on every server, or you are wasting lots of RAM with identical processes on separate servers that can't share resources. It's also easier to tune the OS to fit exactly what the box is doing, which doesn't work so well when it's doing everything.

    Hardware load balancers are not at all needed. Throw a couple OpenBSD machines running CARP and PF in front of the servers. It will be cheaper, and gives you firewall + load balancer in one.

    1. Re:A few comments. by chrome · · Score: 1

      This is the recipe for *my* cake. In *my* cake, I like Exim. I like hardware load balancers. I even like spamassassin - properly configured.

      You don't have to like my cake. You might like a different recipe, and that's fine.

      The OP wanted opinions.

      In my opinion, *my* cake tastes pretty good. It's working for me. You might not agree with it, but unless you build that exact system and try to run it with more than a handful of domains, you really don't know how it would work.

      I've had 3 years with this system, migrated away from qmail, changed back-end NFS, changed the way virus/spam scanning is done ... and it's working well.

      You eat your cake, I'll eat mine. We can both be happy then ;)

    2. Re:A few comments. by Anonymous Coward · · Score: 0

      I agree with this mostly up until the OpenBSD machines as firewalls/load balancers. The last thing you want on your way to the internet is a PC. With spinning disks and fans. Sure you could eliminate them, but the advantages of a purpose-built firewall are numerous. Load balancing wise I would still go with a hardware based solution like a Foundry, Cisco, or one of the myriad of others.

    3. Re:A few comments. by Some+Random+Username · · Score: 1

      Uh, did you miss the part where I said "a couple" and mentioned CARP? OpenBSD comes out of the box with redundant, synchronized-state, failover firewall functionality. CARP is like VRRP or HSRP, only free. You run two, or three, or four, etc. OpenBSD machines; if one dies, or you want to take it down, it doesn't matter at all.

      And openbsd doesn't have to be on a PC, it runs on real hardware too, sparc64 and alpha work fine. You can also get no moving parts i386 hardware, not everything with an intel chip is a PC.

    4. Re:A few comments. by Some+Random+Username · · Score: 1

      No need to get all defensive, it's just "a few comments" like I said. Because the OP wanted opinions, and your post was close to what I found works well, it's faster for me to just point out how he can change your recipe than repeat everything.

      And actually, I did bake almost exactly "your" cake, sans exim. Hence my comments about how to improve it and make it taste better. It's not a perfect cake, it has room to be improved. In future iterations, I refined my original recipe. This is a good thing, not something to be afraid of or angry about.

      Courier was a bit of a pig, so I tried dovecot and found it worked better. SA was an enormous pig, and several statistical filters are faster, use less RAM, and are more effective. If you've only made one cake, and have been looking at it for 3 years, how do you know you couldn't bake a better cake?

      You won't know if your cake can be improved until you try. Don't be hostile to constructive criticism, you might improve your baking skills. You didn't sound happy with courier, have you tried dovecot? I am not telling you that you must change your mail servers or else, I am just offering advice, you don't have to take it, nor does the OP.

    5. Re:A few comments. by Kaoslord · · Score: 1

      Care to share the names of these statistical filters? I'm a bit curious.

      --
      Kaoslord [quote goes here] define("slashdot purity","67.5");
  281. Re:NO Domino by Anonymous Coward · · Score: 0

    Agreed. Notes is horrible. My employer uses it. It is almost the worst e-mail client imaginable. Only good thing about it is that it's better than i-notes. That isn't exactly setting the bar very high though...

  282. Re:Here's my plan and it's the best one you'll get by Mr.+Underbridge · · Score: 1

    Dude, that was fucking hilarious. Excellent troll, sir.

  283. no, it will not be sendmail by Doktor+Memory · · Score: 3, Interesting

    All of these systems will be running sendmail.

    You're high. Building a massive production email system on Sendmail 9 is slow-motion suicide. If the security holes don't get you, the terrible configuration methods and complete lack of scalability will, never mind the fact that Sendmail Inc is trying desperately to replace the product.

    "Most managable with [...] heavy customization?" I'd laugh if I wasn't crying. And I'm crying because I used to work for a company that deployed a massively customized sendmail infrastructure -- and I was one of the poor bastards who had to maintain it. Trust me, you don't want to do this. Ever.

    Yes, milter is cool. No, it's not cool enough to justify burning CPU cycles on sendmail in 2005.

    Even Sendmail Inc tacitly admits that Sendmail's design is garbage: take a look at the design document for Sendmail X, and note carefully how much it resembles Postfix and Qmail. There are very good reasons for this.

    --

    News for Nerds. Stuff that Matters? Like hell.

    1. Re:no, it will not be sendmail by Doktor+Memory · · Score: 1

      (er, where by "9" I mean "8" -- damn fingers)

      --

      News for Nerds. Stuff that Matters? Like hell.

  284. backend architecture... by rhaig · · Score: 1

    There have been several good answers for the front end. Here's a good backend architecture.

    Make sure you virtualize. VMware ESX and VMotion are very cool. They have tons of info on their site for using virtualization to increase uptime and it's all true. I thought it was a load of BS until I started using it. It's great for DR and multiple sites.

    flat backend filestore....
    OK, I have nothing to do with these people. I have no stock in the company, I just think they have a cool product. http://datadomain.com/

    Check out Data Domain if you're going to use a flat filesystem for the filestore. They use bit-pattern matching to provide a pseudo single-instance store, and (they claim) 20x compression (though with this technology and something like mail, you could probably approach 8x easily).

    Their products do remote replication so you can have your multiple sites with the same mailstores.

    Also figure storage.... 1M users, figure average message size is 15K in a single-instance store system (with no SIS, figure 75K). Figure everyone is going to have 1000 messages in their inbox.

    So that's 15-75TB. If you could limit mailbox size reasonably, you could probably get away with the DataDomain DD460 without too much hassle. Put one at each site, set up asynchronous replication, buy 2 extras for backups in different locations and offset their synchro schedules. If you need a message deleted more than a few weeks ago, tell whoever wants it to go ask the corporate lawyers why you shouldn't keep email on tapes. If they really want to back it up to tape, email me and I'll build you an architecture on paper.
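
    The 15-75TB figure is just users times messages times average size. A back-of-the-envelope sketch, assuming decimal units (1K = 1,000 bytes, 1TB = 10^12 bytes), which is my assumption and not something stated above:

```python
USERS = 1_000_000
MESSAGES_PER_USER = 1_000
BYTES_PER_KB = 1_000  # assumed decimal kilobytes


def mailstore_tb(avg_message_kb):
    """Total mail store size in terabytes for a given average message size."""
    total_bytes = USERS * MESSAGES_PER_USER * avg_message_kb * BYTES_PER_KB
    return total_bytes / 1e12


print(mailstore_tb(15), "TB with single-instance storage")
print(mailstore_tb(75), "TB without it")
```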

    If you don't use a DD product for your backend, look at pillardata.com. You could build a 20TB system for about $6/GB and, when you fill it up, expand it for about $3.50/GB (in 4TB chunks).

    I do storage management for a living. I have about 160TB of accessible storage spinning right now. Backend I can do off the top of my head. Front end is best left to others, but the backend is always the same.

    --
    "We are not tolerant people. We prefer drastically effective solutions"
  285. Is that you Malcolm? by Anonymous Coward · · Score: 0

    Malcolm Turnbull has called for the Government to give every Australian their own email address for life.

    http://www.abc.net.au/news/newsitems/200509/s1456674.htm

  286. The question is bogus. Hypothetically..... by CFD339 · · Score: 2, Informative

    Hypothetically (since nobody is dumb enough to believe this is a real life case of a million users being defined by someone betting his career on slashdot trolls)

    If it were me starting from scratch -- the model for a million users is the internet itself. SMTP, DNS, and maybe a big LDAP directory tool. For calendaring, you're SOL, but nobody calendars with a million people. That's meaningless. Calendaring is only useful at the workgroup level anyway. Look to any good workgroup calendaring tool and let users define their own working groups.

    Now, backing off the big million user stupid number. In the real corporate world, you have two real players and a ton of also-rans. The two real players are IBM/Lotus with Notes and Microsoft with Exchange.

    The market is split roughly evenly. In the US Microsoft leads a bit, in Europe and EMEA IBM/Lotus leads. How much and actual numbers are hard as hell to track down. IBM doesn't release them and Microsoft likes to count every copy of Office as an Outlook seat. Suffice it to say both companies own about a hundred million actual users.

    The basic trade-off between the two: with Exchange you get tighter integration with Active Directory and smooth look-and-feel integration on Windows. It feels like it's all part of the operating system. On purpose. On the other hand, you're forced to use Active Directory, forced to use Win32, and all that integration without any real security means viruses are unstoppable. With Notes you get a bulky client that many users find hard to understand. You also get almost 100% prevention of virus spread (it has built-in security) and other goodies. It's also a development platform and it's cross platform. The client is Win32 and Mac, and users have written howto docs for WINE. The server is Linux, Win32, AIX, zSeries, and iSeries (AS/400).

    You may not know this, but BOTH can use the Outlook client. Yes, the outlook client is supported with a Domino mail infrastructure. Who'd have thunk it?

    Oh, and Domino supports other mail clients too. Pop3, IMAP, and a very good Web Browser -- all at once for the same person if you like. Its got native SMTP support, as well.

    What Notes isn't is pretty. Most people say Outlook is prettier. Ok. Easy to do if you own the OS and make software that only runs in one environment.

    So, I hear rants about Notes. I hear trolls whining about a product that runs in Linux as a server and (using Wine) as a client. Runs on Mac. Has a fully functional JAVA environment for development and a remote API through CORBA and DIIOP.

    No no, instead they'll use a proprietary only -- Windows Only, Active Directory Only, Virus Distribution Engine from Microsoft.

    You gotta love that. Why? Well, it's pretty.

    --
    The problem with quotes on the internet, is that nobody bothers to check their veracity. -- Abraham Lincoln
  287. Google It! by DieBase99 · · Score: 2, Informative

    I searched on Google for "email system" for "1 million users" ... this page came up: @Mail with large user bases -> it even gives you a case study of Hotmail!!! The company is called @Mail. It is the exact same solution that seeqmail.com uses, and they have over a million users. Read it... Find out more... and Google some more. Don't pay over-priced consultants unless it is something you have absolutely no expertise in. It is your job to figure out how to get it done.

    1. Re:Google It! by Da+w00t · · Score: 1

      No, No no no no no no no no no no no no no no no no please please please not @Mail. I had the displeasure of getting rockcool.com (Hah! No link because it's a dot bomb!) to purchase a copy of @Mail because it was backed in perl ... to find out that the codebase sucked, it corrupted inboxes, and oh, you could log into any account with any password. Yes, their shit was something fierce for broke. This was 5 years ago. I don't know if it's any better or not.

      --

      da w00t. mtfnpy?
  288. Set your standards higher! by callipygian-showsyst · · Score: 1
    The system must scale perfectly, 99.9% uptime is expected... where would you start?"

    What?! You think that being down about 43 minutes every month is "scaling perfectly?"

  289. Check your math by Anonymous Coward · · Score: 0

    365 (days) * 24 (hours) / 1000 = 8.76 (hours)
    No?
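
    For reference, the arithmetic can be sketched as below; the function name and the choice of a 30-day month are assumptions. At 99.9% uptime it works out to roughly 8.76 hours of allowed downtime per year, or about 43 minutes in a 30-day month.

```python
def allowed_downtime_hours(uptime_percent, period_hours):
    """Hours of downtime permitted over a period at the given uptime percentage."""
    return period_hours * (100.0 - uptime_percent) / 100.0


per_year = allowed_downtime_hours(99.9, 365 * 24)
per_month_minutes = allowed_downtime_hours(99.9, 30 * 24) * 60
print(f"{per_year:.2f} hours/year, {per_month_minutes:.1f} minutes per 30-day month")
```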

  290. Re:LOOK WHO IS REALLY DUMB! [IT'S YOU!] by Cruithne · · Score: 0, Flamebait

    Good job. You wasted quite a bit of time proving in excruciating detail that I messed up, after everyone else alerted me about it.

    Somehow I'm not surprised that, in your calc.exe fest, you missed the obvious - I missed a decimal place.

  291. In 6.5, you can edit attachments in received email by rhsatrhs · · Score: 1

    I wouldn't normally do it your way, though. I would click the "Forward" button, then double-click the attachment, select the "Edit" option, edit and save, and then send. But I could do it your way. The only requirement is that you have to put the received message into edit mode before you try to edit the attachment. There is no "Edit" command on the menu or action bar for received messages (because face it... the received message is supposed to stay the way the sender intended it), but you can do it either by ctrl-e or by double-clicking anywhere in the message body.

    -rhs

  292. Not to be picky... by tattoi.nobori · · Score: 0, Offtopic
    You said,

    "...the email system for a huge company, which fed up of Exchange, wants to replace their entire system..."

    I think perhaps that a parenthetical rephrasing, such as,

    "...a huge company (the subject), which, fed up with (not of) Exchange, wants to replace its (not their, as we want the possessive pronoun to match the subject) entire system..."

    ...might work better. Unfortunately, Mr. Language Pedant has no advice for your email woes. ^_^

  293. GroupWise by uisqebaugh · · Score: 0

    I administer GroupWise at one of the largest health care organizations in the United States, and I can say that I've been very pleased with its reliability, scalability, quality and its ability to run on either NetWare or Linux.

    Whenever other companies are having problems with email viruses, we don't seem to have the same problems. On top of this, our uptime is very impressive even in non-clustered sites.

    Though it's not free, Novell's technical support is superb. Give it a try!

  294. Ask slashdot? by highwaytohell · · Score: 1

    Ask a pigeon breeder how quickly he can round up a million birds.

  295. well don't ask slashdot by Anonymous Coward · · Score: 0

    I'm sure there are some very competent people here, but for a project that size you need professional help. You need to look at either Sun's messaging product or software.com's Intermail. I've implemented both and had issues with both. At 1mil accounts you shouldn't hit either product's scalability issues. Although I will say I was just involved in a major ISP's migration from Intermail to Sun's messaging for approx 7mil users because Intermail simply wasn't scaling. 130+ servers with lots of folks architecting the various layers.

  296. Re: including getting IBM in to help by rhsatrhs · · Score: 1

    Well, there's your problem. Instead of bringing in IBM, you need to bring in a qualified consultant, preferably an IBM Lotus Business Partner, with actual real-world experience in the trenches making Notes and Domino really work for organizations similar to yours, and who makes it their business to understand your problems, fix them, and transfer knowledge to you. You'll never get that from IBM. You're a small customer for them. You barely register on their radar screen. And contrary to what most people assume, the people IBM Global Services have out in the field supporting Notes and Domino simply don't have vast amounts of special inside knowledge and direct connections to development.

  297. Novell Netmail or Hula by AngryElmo · · Score: 1

    Netmail for added bells and whistles (plus "support") or Hula for the FOSS version of Netmail (I think under GPL). Netmail was supposedly designed for large numbers of users.

  298. Re:LOOK WHO IS REALLY DUMB! [IT'S YOU!] by Anonymous Coward · · Score: 0

    you mean decimal *place*. Maybe you're "not a History or English degree" after all and you just made a small calculation error. Or maybe you just made a spelling mistake. Either is UNACCEPTABLE.

  299. On building a better mousetrap by Koutarou · · Score: 0

    Having been through this sort of thing before on a smaller-scale (we've done 3 mail system re-engineerings in the last 8.5 years, the last 2 LDAP-based and are just breaking into 6-figure mailbox count), I can make one big recommendation: Think VERY long and hard about your LDAP schema and make sure you get it right the first time. Do this LONG before you even think about other software/hardware.

    If done properly and you get proper triggers done to push updates from your backend database to LDAP (or just make the LDAP your canonical data store for technical data if you can), everything else can be implemented using off-the-shelf software and hardware (think SAN backend for mailbox store with redundant switched paths between servers and storage and load-balancers in front of most of your components).

    And as stated before, separate your roles and put them on discrete redundant clusters of machines.

    Our architecture actually had the mailboxes in private space, with access coming via front-end MX for incoming mail, and pop/imap proxy for reading.
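To make the "directory decides where the mailbox lives" idea concrete, here is a rough sketch of the lookup a POP/IMAP proxy would do at login. It is only an illustration: the dict stands in for LDAP, and the field names (uid, mail, mail_host, active) and host names are hypothetical, not our actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MailAccount:
    uid: str
    mail: str
    mail_host: str       # private-space server that holds the mailbox
    active: bool = True


# Stand-in for the LDAP directory; a real proxy would query the LDAP servers.
DIRECTORY = {
    "alice": MailAccount("alice", "alice@example.com", "store01.internal"),
    "bob": MailAccount("bob", "bob@example.com", "store02.internal"),
    "carol": MailAccount("carol", "carol@example.com", "store01.internal", active=False),
}


def backend_for(login: str) -> Optional[str]:
    """Return the backend host the proxy should connect to, or None to refuse."""
    account = DIRECTORY.get(login.lower())
    if account is None or not account.active:
        return None
    return account.mail_host


print(backend_for("alice"))   # store01.internal
print(backend_for("carol"))   # None (disabled account)
```

If the schema carries the mailbox location from day one, moving a user between stores becomes a directory update rather than a client reconfiguration.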

  300. zmailer for outgoing by r00t · · Score: 1

    Assuming you need performance, zmailer is your
    best choice for outgoing email.

    1. Re:zmailer for outgoing by ladybugfi · · Score: 1

      And you want to make your 1M mailbox corporate infrastructure dependent on a piece of software that is minimally maintained by a couple of persons?

      IMHO, sometimes there are other considerations than performance.

  301. Re one million ? by Anonymous Coward · · Score: 0

    one million accounts. i'd ask for some proof of that first before even considering taking on the project.

  302. Talk to Iowa Telecom by adaminnj · · Score: 1

    A friend of mine at Iowa Telecom just did this.
    Poke around and you might find him. He's a bit fed up with /., so he probably won't read this, but if you really want to know how to do this, you will find him.

    --
    I'd Tell you all my secrets but I lie about my past
  303. exmerge by Anonymous Coward · · Score: 0

    you could use exmerge on 5.5 to do mailbox-level backups and restores. it would stop single-instance storage for that user's old mail, though.

  304. I cant believe I forgot the best part... by Some+Random+Username · · Score: 1

    When using the OpenBSD machines for load balancing and firewalling, you can run spamd on them. It's OpenBSD's very small and very efficient greylisting daemon. Doesn't matter what MTA you use, uses almost no resources, and cuts out the vast majority of spam before it even touches the mail servers.
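For anyone who hasn't met greylisting, a rough sketch of the general idea follows. This is not spamd's code: the class, the timing values and the per-triplet whitelist are assumptions chosen for illustration.

```python
import time
from typing import Dict, Optional, Set, Tuple

Triplet = Tuple[str, str, str]  # (client IP, envelope sender, envelope recipient)


class Greylist:
    """Temp-fail the first delivery attempt; accept a retry after a delay."""

    def __init__(self, min_delay: float = 25 * 60, retry_window: float = 4 * 3600):
        self.min_delay = min_delay          # seconds a sender must wait (assumed)
        self.retry_window = retry_window    # seconds a pending entry stays valid (assumed)
        self._first_seen: Dict[Triplet, float] = {}
        self._whitelist: Set[Triplet] = set()

    def check(self, ip: str, sender: str, recipient: str,
              now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        key = (ip, sender.lower(), recipient.lower())
        if key in self._whitelist:
            return "accept"
        first = self._first_seen.get(key)
        if first is None or now - first > self.retry_window:
            # New (or expired) triplet: remember it and ask the sender to retry.
            self._first_seen[key] = now
            return "tempfail"
        if now - first < self.min_delay:
            return "tempfail"               # retried too quickly
        self._whitelist.add(key)            # a real MTA retried after the delay
        del self._first_seen[key]
        return "accept"


g = Greylist(min_delay=60, retry_window=3600)
print(g.check("192.0.2.1", "a@example.org", "b@example.com", now=0))     # tempfail
print(g.check("192.0.2.1", "a@example.org", "b@example.com", now=120))   # accept
```

Most spam zombies never retry, which is why something this simple gets rid of so much junk without looking at message content.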

  305. MSFT is BAD by MrArmyAnt · · Score: 1

    Do whatever. And how did this get slashdotted? Didn't know a question could do that. Tried forever to get something slashdotted, my site for one when it opened, and here you go and get a question up in the highest rank everywhere.

  306. Duh by Anonymous Coward · · Score: 0

    Personally, I'd start by asking for a raise.

  307. Domino by Yakman · · Score: 1

    IBM / Lotus Domino sounds like a good fit. Supports webmail, POP3 and IMAP out of the box, servers are extremely reliable if administered properly. Scalable, supports clustering (and real clustering, not like Exchange). Runs on Linux, Windows, and big iron from IBM. Sounds pretty much like what you're looking for, since you didn't say "free".

  308. Communigate Pro worth looking at by Hellboy0101 · · Score: 1

    I'd start by looking at Communigate Pro from Stalker Software. They have a Dynamic Cluster solution which taps out at 5 million accounts, and includes everything you are asking for. They have a Super Cluster that will handle 5 million+ if need be. Their prices are very reasonable, and they have won numerous awards. Their Network Computing stress test did something like 160,000 e-mails per hour with zero errors. They have a free unlimited trial to download, and it runs on 21 platforms from Windows and Linux to QNX and BeOS!! http://www.stalker.com/content/solutions.htm

    --
    Because teenage pranks are fun when you're about to die!
  309. You might try . . . by as400tek · · Score: 1

    I know this is going to come out all wrong, but I have been a mail admin for a while and if you could use multipal servers under one domain Lotus Domino on iSeries is by far the most stable mail collaberation product on the market. I know no one likes Notes, but it is stable and the iSeries is by far the most stabel production machine on the earth right now next to a mainframe and solaris. You can run Domino on Solaris too and you should be ok there too. There is great fail over services. Don't think because it's an IBM product too that it's big and crappy. Remeber that IBM purchased Lotus and it's stilll a great product and just keeps getting better. You can put tons of user on each Domino server and agin it's stable too. That is all for now. Yes I said Lotus Domino and Notes......what you going to do about it? No I did not use spell check.

    --
    David Vasta iSeries(AS/400) Admin & Junkie
  310. Rediffmail by Russ+Nelson · · Score: 1

    Rediffmail uses qmail, and they have upwards of 30 million users. But I wouldn't dare to contrast your rhetoric with facts.
    -russ

    --
    Don't piss off The Angry Economist
    1. Re:Rediffmail by bani · · Score: 1

      certainly goes a long way to explaining why rediffmail sucks so hard.

      what about the fact yahoo abandoned qmail? too embarrassing?

  311. CommuniGate Pro by halightw · · Score: 1

    It's not free, but the best, most scalable non-Microsoft solution has got to be CommuniGate. It could scale to millions of accounts easily. It supports just about any OS and includes almost any messaging protocol you can think of. Check it out at www.stalker.com.

  312. Communigate Pro in dynamic cluster mode by kRutOn · · Score: 1

    It's been said before, but I'll have to throw my lot in with Communigate Pro. There have been installations of over 4.5 million.

    Check out their page on dynamic clusters. I use it every day and must say it's the best investment I've made in commercial software.

  313. Yahoo Business Email by dvanatta · · Score: 1

    For $.99 per user you can use Yahoo's Business Email using your company's domain. From Yahoo: http://smallbusiness.yahoo.com/email/faq.php#8

    Why should I outsource my email system to a Yahoo! Business Email plan?

    Our platform is a well-established system with millions of satisfied customers and billions of emails successfully delivered every month. With a Yahoo! Business Email plan, you can expect:

    • Cost savings: Yahoo! Business Mail can provide email for your entire company at just a fraction of the cost of mail systems you host yourself.

    • Immediate implementation: Yahoo! Business Email plans are so easy to set up and use, you can start configuring your email system minutes after your initial purchase. Other email systems may take weeks or months to implement.

    • Greater reliability: Yahoo! Business Email systems are monitored 24 hours a day to ensure performance and reliability.

    • Automatic Technology Upgrades: You don't have to worry that your email system will be outdated or need upgrading. Yahoo! Business Email plans automatically upgrade your email system with the latest hardware and software improvements.

  314. Re:Build it like you would build a multitier web s by Sevidrac · · Score: 1

    I would not use the term "good" with Hitachi. Maybe I am biased toward EMC products, but they sell more than the next three big SAN vendors combined.

    --
    What luck for rulers, that men do not think. - Adolph Hitler
  315. For WebMail check out @mail - atmail.com by Anonymous Coward · · Score: 0

    Stop using SQ or Horde/IMP. @Mail (http://atmail.com/) is a much more reliable and "Outlook"-looking WebMail client.

  316. References: by einhverfr · · Score: 1

    Computer Associates

    addict3d (more info than the CA link above)

    It doesn't look very exploitable, but it is worrisome.

    --

    LedgerSMB: Open source Accounting/ERP
    1. Re:References: by Anonymous Coward · · Score: 0

      That's in the QMTP server, not the SMTP server. That's the protocol described here, not the one described here.

    2. Re:References: by TheLink · · Score: 1

      Uh you should worry if someone unauthorized can set an arbitrary RELAYCLIENT on your system.

      Only an admin user should be able to do that.

      So the bug isn't a problem in practice (like most other bugs with qmail or other software DJB guarantees).

      --
    3. Re:References: by Russ+Nelson · · Score: 1

      If you have process limits in place that prevent processes LARGER THAN 2 GIGAfuckingBYTES, then you have no reason to worry. All of Guninski's "vulnerabilities" require similar misconfiguration.
      -russ

      --
      Don't piss off The Angry Economist
  317. Custom or Commercial suggestions by Hardware · · Score: 1
    You have two options. Create a custom solution or use a commercial one. I ended up having to create a custom solution for the 85k mailboxes I manage.

    Having investigated scalable mail systems I would recommend at least taking a look at Mirapoint. They aren't perfect but they are professional and have a very nice solution, though it will cost you.

    If you're going to do things yourself I'd suggest looking at some of the following:

    • RedHat Enterprise Linux 4
      Has most of the following already installed and you'd want to subscribe to a version of RHN that let you rapidly roll out new servers or upgrades/security patches to existing servers

    • OpenLDAP
      Can be used for authentication and directing mail and pop/imap or even webmail session to the appropriate backend mail stores.

    • Perdition
      POP/IMAP proxy can use LDAP

    • Postfix
      Again can use LDAP

    • Apache with PHP
      I used this to proxy/redirect webmail logins

    • Webmin
      Its cluster feature is actually quite handy, and its monitor scripts along with some Perl make for a quick and easy monitoring solution.

    Using the above you can set up front end mail exchangers doing various anti-spam and anti-virus work in a load balanced setup with dynamic banning of IPs based on logs of refused mail. They should make use of LDAP so you don't allow any mail in that is destined for a nonexistent user.

    Then you can use this to balance multiple back end servers of virtually any description. You could even have multiple vendor solutions used for the backend servers. Of course you'll need to tie it all together with custom administration scripts, etc.
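As a rough illustration of the front end behaviour described above (reject unknown recipients at RCPT time, ban clients that keep hitting refused addresses), here is a small sketch. The recipient set stands in for an LDAP lookup, and the class name, thresholds and reply strings are made up for the example; it is not the scripts I actually run.

```python
from collections import defaultdict, deque

# Stand-in for the LDAP directory of valid addresses.
VALID_RECIPIENTS = {"alice@example.com", "bob@example.com"}


class RefusalTracker:
    """Refuse mail for unknown users and ban IPs that rack up refusals."""

    def __init__(self, max_refusals: int = 5, window_seconds: float = 600.0):
        self.max_refusals = max_refusals
        self.window_seconds = window_seconds
        self._refusals = defaultdict(deque)   # client IP -> refusal timestamps
        self.banned = set()

    def handle_rcpt(self, client_ip: str, recipient: str, now: float) -> str:
        if client_ip in self.banned:
            return "554 banned"
        if recipient.lower() in VALID_RECIPIENTS:
            return "250 ok"
        refusals = self._refusals[client_ip]
        refusals.append(now)
        # Drop refusals that have aged out of the sliding window.
        while refusals and now - refusals[0] > self.window_seconds:
            refusals.popleft()
        if len(refusals) >= self.max_refusals:
            self.banned.add(client_ip)
        return "550 no such user"


tracker = RefusalTracker(max_refusals=3, window_seconds=60)
for t in range(3):
    print(tracker.handle_rcpt("198.51.100.7", f"guess{t}@example.com", now=t))
print(tracker.handle_rcpt("198.51.100.7", "alice@example.com", now=10))  # 554 banned
```

In practice the ban list would be pushed to the load balancer or firewall rather than held in one process, and bans would expire.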
  318. Stalker Communigate by Anonymous Coward · · Score: 0

    Stalker Communigate is very good but for that amount of traffic it will be expensive.

    It's a great product though.

  319. PARENT IS WRONG! by Anonymous Coward · · Score: 0

    The parent poster couldn't be more wrong. It's actually hard to even read his post entirely.

    1) qmail is VERY EASY to install, and yes, due to its age you must patch it. This is why you should use netqmail from qmail.org.
    2) qmail is the easiest MTA to configure PERIOD. Single config files with multiple data lines (domains) or even a single line of configuration (server name)? Sounds damn easy to me.
    3) Not scalable? Are you insane? qmail scales with the operating system. It uses mostly system calls to complete its tasks, so it's very very quick. But again, it depends on what you are doing. You can't run _everything_ on one machine, regardless of MTA used.
    4) "on other daemontools"? That doesn't even make sense. First, daemontools is a software package by that name, there are no "others". Second, in #3 you said it was designed for inetd, no, it was designed for tcpserver and not daemontools. Running it under daemontools ensures it's always running and started by init itself. I think you need to reread your installation files, djb blasts inetd and merely recommends daemontools.
    5) It does? Such as? OH, I know, the installation setup. Well you can change that very easily when you install -- conf-home.
    6) What? qmail is one of the most flexible MTAs designed (postfix might kick original qmail's ass out-of-the-box though). With something like netqmail with the QMAILQUEUE intercept patch, you can get in between any operations qmail does. And yes, thank you, it is very secure.
    7) Oh snap! You just said it was secure and now you say it isn't!?! Pick a side! But ok, please enlighten us how a program written in 1997 hasn't been updated due to security issues (I am aware of the 64bit 4gb ram remote root exploit though, but good luck with that exploit code; datasize anyone?)?
    8) Yes, you are right. To make matters worse, it's not a fair open source license either. You are 1 for 8 so far, good work!
    9) Yes, also a very annoying problem. But many have taken the initiative to compile the most common and required features into nice toasters. Just hit google for a qmail toaster. I personally roll my own qmail :). 1.5 for 9...

    In the end you turn out to be nothing more than a poorly educated troll. Please be quiet, adults are speaking. ;)

    -mo

  320. Re:Obviously this idea has a problem.... by ibbey · · Score: 1

    Ummm... I think it was a joke. Must be over your head.

  321. You've already made an excellent start by msblack · · Score: 1
    By dumping Exchange, you've already made an excellent start. As of a couple years ago, MS Hotmail was running on Sendmail and various add-ons. Microsoft couldn't scale Exchange for their flagship e-mail portal. Does anyone know or will any Microsoft employee admit to what is being used for Hotmail?

    My company employs a combination of several technologies which provide almost 100% uptime. Although no system will be perfect, I believe you can achieve that 99.9% service level.

    Our e-mail enterprise product is CommuniGate Pro (CGP) from an unfortunately-named company called Stalker Software. CGP is in use by many ISPs, scales very well, and is high performance. We're much smaller than 1 million users with around 40,000 accounts. CGP supports SMTP, IMAP, POP, Webmail, LDAP, and has plug-ins for antispam and antivirus. As these functions require a lot of I/O and CPU horsepower I would configure a separate e-mail security appliance. Our CGP servers have a Unix load factor of about 1.00 or less.

    For e-mail security, we use a pair of IronPort C60s as our border SMTP gateway. The C60s run Sophos antivirus and Symantec Brightmail. Brightmail has a false positive rate of 1 in 1,000,000 which is very important in large organizations. These C60 systems can each process several hundred thousand messages per hour, which is ideal for peak demands and are great for blocking zombie hosts. No system will block all spam or viruses. However, you can expect to catch roughly 98% of spam and 99.9% of viruses with no effort from the users. Power users can always employ additional spam filters with their e-mail client, such as Thunderbird.

    DO NOT skimp on hardware. Buy high-end Intel or "Unix" servers (Sun, HP, IBM, etc.) and install your favorite flavor of Unix/Linux. Did someone else mention hot-pluggable redundant systems? DO NOT store e-mail messages on your e-mail system. Get yourself a real NAS or SAN server, such as the Network Appliance FAS series. Don't skimp on low-cost imitations. Our NetApp servers have a record of 100.00% uptime for the past five years. Honest! Our only downtime on the NetApp servers was for UPS or power maintenance, or filesystem migration. We have not experienced any downtime. Can I say it again?

    Experts will argue whether you should run iSCSI or NFS. NFS is just as fast as iSCSI and can be shared across multiple servers. iSCSI and SAN volumes cannot be shared across multiple servers so scaling an iSCSI volume to 1,000,000 users is out of the question. Because CGP manages account and file sharing mitigation, you don't have to worry about silly and incompatible NFS file locking utilities.

    Good luck with your "project" and please let us know what you decide to use.

    --
    signature pending slashdot approval
    1. Re:You've already made an excellent start by fimbulvetr · · Score: 1

      I don't know how they got you to believe a C60 can handle several hundred thousand messages per hour. Best case, they can handle 80,000. Plan for 45-60k. I run 4.
      You may be running the newest Async beta, but I still doubt that will get you 300,000 msgs/hr with two.

      Outside of that, no comment on the CGP, agree on the NetApp. Disagree on the high-end Intel (these things don't need CPU, they need I/O). Agree on the iSCSI vs. NFS.

  322. Re:LOOK WHO IS REALLY DUMB! [IT'S YOU!] by Anonymous Coward · · Score: 0

    Perhaps you shouldn't have made this statement:

    I'm sorry but it just bugs me how many people throw around uptime figures without knowing what they mean.

    Easy to point out someone else's mistake, eh? Oh... That taste in your mouth is crow.

    The taste in my mouth is your mom's pussy juice.

  323. No...they would sell the client Lotus Domino... by FatSean · · Score: 1

    I mean...what else?

    --
    Blar.
  324. Cyrus IMAP by joib · · Score: 1

    For example, the Cyrus IMAP server supports single instance store using file system storage (Maildir-style IIRC). You don't need a database to do it.

  325. Delegate and decentralize by phutureboy · · Score: 1

    Instead of running one mega-honking-big, water-cooled mission critical mail system, consider breaking it down by division. Set up subdomains so that users have addresses such as "username@research.organization.com" or "username@london.organization.co.uk"

    Then, each division can handle its own mail. Or, you can set up different mail clusters to handle each division. Still centrally managed, but easier if you break the load into smaller chunks.

    Of course, none of this is possible if you're talking about, like, email for mobile phone customers, where all the addresses are in the format 3015551212@messaging.mobileprovider.com.
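A toy sketch of the per-division routing idea, in case it helps: the address's domain picks the cluster, and anything unrecognised falls through to a central default. The cluster names are invented for the example.

```python
# Hypothetical mapping from mail subdomain to the cluster that serves it.
ROUTES = {
    "research.organization.com": "cluster-research",
    "london.organization.co.uk": "cluster-london",
}
DEFAULT_CLUSTER = "cluster-central"


def cluster_for(address: str) -> str:
    """Pick the mail cluster for an address based on its domain part."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a usable address: {address!r}")
    return ROUTES.get(domain.lower(), DEFAULT_CLUSTER)


print(cluster_for("username@research.organization.com"))   # cluster-research
print(cluster_for("username@london.organization.co.uk"))   # cluster-london
```

With flat addresses like the mobile phone case there is no domain to split on, so every lookup lands in the default and you are back to a per-user directory.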

    Anyway, that's my 2 cents. Don't spend it all in one place.

  326. Novell GroupWise should be able to handle it by Anonymous Coward · · Score: 0

    GW claims 10K+ users per (single CPU) server - we have about 6K accounts on a box that's also doing web access, smtp & anti-spam and it hardly gets into double-digit CPU so I think that is conservative. ~40K is the theoretical limit I think.

    Clustering would help reliability and scalability, and you can run it on Linux or NetWare.

    GW7 lets you run an Outlook client against the GW backend if you don't like the native client. POP3, IMAP etc. are supported, and there's also a Linux client.

    Design (logical and physical) would depend on factors like the average mailbox size, the percentage of users likely to be online concurrently, and where the users are geographically.

  327. Re:Sun's Java Messaging Server (AKA Netscape/iPlan by Anonymous Coward · · Score: 0

    I'm a UNIX Systems Admin from a University who runs the Sun Java Enterprise Messaging software (which has had several name changes over the years), and if you're looking for massive scaling, reliability and flexibility- I think this product is what you're after.

    I think there are still a few issues with the Outlook connector for calendaring, but I haven't looked at it for a while. Though we don't run 1,000,000 users, we happily support over 50,000 on a not-so-big machine. With the right hardware behind it, high performance LDAP servers and Sun Cluster I think you've got your answer.

    I don't agree with it being difficult to administer. You set it up properly, and once you get your head around how it works (if you understand PMDF, this is a no-brainer), it's very manageable and goes like a train. It may take a little time to get it all up to scratch, and if you're not in the know Sun Pro Services should be able to sort it out for you.

  328. GroupWise for big projects by Anonymous Coward · · Score: 0

    As an admin of GW with 250K users I can personally say that it is the correct way to go for large-scale projects. The company I work for is spread out around the globe and our CIO demands 99.99% uptime (it's hard...I admit) or our jobs are on the line. We have a great agreement with Novell that allows us to do this with clusters (we've never lost an entire cluster at one time with OES Linux or NetWare even during updates, upgrades, etc). The new GroupWise client is loved by our users (easy to use) and the WebMail is just awesome. Administration is a cinch and has minimal stress (considering the requirements). As a technology that's been around and proven for years it is great. Integrating with eDirectory is great as well because of the marvel of partitioning. Assuming we don't have problems this year with an entire cluster we'll hit 100% in just a few more months for uptime but we still have the holiday rush to survive so here's hoping...

  329. CommuniGate Pro by msblack · · Score: 1
    Suppose you are given a chance to build from scratch an email system that has to support around one million accounts. Some corporate, some personal, some free. POP, IMAP, webmail, etc are requirements. The system must scale perfectly, 99.9% uptime is expected... where would you start?

    I just reread the parent after posting a long description of our configuration. Is this system in your basement? I ask because you mention personal and free accounts. You could put these special non-corporate accounts on a different system. Why burden a corporate system with non-corporate users?

    If you must, CommuniGate Pro is great for this function because you can create multiple e-mail domains within a CGP cluster. CGP lets you delegate administrative functions for each e-mail domain. From our 40,000 user system, I would bet the farm that CGP easily scales to 1,000,000+ users on a fairly small cluster.

    As others and I have emphasized, separate the functions on different systems. Install a front-end e-mail security appliance to handle blacklists, block zombies, LDAP attacks, antivirus, and antispam. Do NOT run these using MailScanner or SpamAssassin. Use a commercial appliance such as IronPort (major competitors are MiraPoint and MailFrontier).

    Put your mailstore on a real NAS server such as a Network Appliance FAS-960 which can handle up to 32 terabytes and handle a large number of simultaneous transactions (NFS ops). Cheap RAID systems cannot support the load of 1,000,000 users. NetApp servers automatically generate instantaneous snapshots of the file system every hour thereby permitting easy restore of messages without going to backup tape or secondary storage.

    Someone else mentioned installing a bunch of redundant fibre, etc. I would hope your e-mail system is installed in a data center with these features.

    --
    signature pending slashdot approval
  330. Re:LOOK WHO IS REALLY DUMB! [IT'S YOU!] by Anonymous Coward · · Score: 0

    You know, corporate accounts is sure as hell gonna notice $305,326.13 Michael!

  331. [OT] You cannot honestly believe your sig can you? by Anonymous Coward · · Score: 0
    Two ways to end the war: (1) Kill all terrorists. (2) Convert to Islam. Unfortunately, diplomacy is not a part of either

    Neither of your proposed solutions work. (1) fails because your big scary overused buzzword is not a buzzword at all. It's a person, with a family. And if you were to kill my kid/parent/sibling, I'd resolve to kill you. I wouldn't expect any less from the friends and family of someone I'd killed. For every person you kill, you create many more 'terrorists.' Ultimately, you are advocating genocide. But you're a smart cookie, so I have to assume you know and are ok with that. Kill 'em all, along with the niggers, gypsies, and jews, right Adolf? (2) They don't give a damn what religion you practice, they just want you to a) stop killing their friends, family, and elected leaders, and b) stop trying to tell them how to run their own damned country. Saudi Arabia is an Islamic state and that hasn't stopped Bin Laden and friends from attempting to topple that nation has it?

    Killing people is a tarbaby. The harder you struggle, the more bogged down in it you become. If I had mod points, I'd mod you down based on your fascist sig alone.

  332. Errrr.... by raulaswipee · · Score: 1

    I hate to throw a commercial solution out here....

    but, our company uses Lotus Notes and we have been satisfied with it. Not sure about its IMAP capabilities, but throw enough resources (servers) at it and take some time to learn how to manage it and things work pretty smoothly.

  333. Plain Text by Craig+Ringer · · Score: 2, Informative

    Yes, passwords are transmitted in plain text. The same is true of IMAP and SMTP. You do make your users authenticate for SMTP, right? Picking another protocol will not help in this regard.

    What you need to do is support STARTTLS for these protocols. That lets the client connect then negotiate an encrypted connection with the server before sending passwords. It's easy to configure the server to refuse to authenticate the client unless an SSL session has been set up if that's what your security policy dictates. It's also possible to have the server demand a client certificate from the client before setting up the SSL connection, adding an extra layer of authentication.

    You'll probably also have to support the old IMAPs, POP3s, and SMTPs standards, but they should be considered deprecated and only in place for crap clients that don't know about STARTTLS.
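From the client side, the STARTTLS sequence looks roughly like the sketch below, using Python's standard smtplib. The helper name, host and credentials are placeholders; the point is only the order of operations, and that the client declines to send a password if the server doesn't offer STARTTLS.

```python
import smtplib
import ssl
from typing import Optional


def login_with_starttls(smtp: smtplib.SMTP, user: str, password: str,
                        context: Optional[ssl.SSLContext] = None) -> None:
    """Authenticate only after the session has been upgraded with STARTTLS."""
    smtp.ehlo()
    if not smtp.has_extn("starttls"):
        raise RuntimeError("server does not offer STARTTLS; not sending credentials")
    smtp.starttls(context=context or ssl.create_default_context())
    smtp.ehlo()                 # re-identify over the encrypted channel
    smtp.login(user, password)


# Usage against a hypothetical server (commented out so nothing connects on import):
# with smtplib.SMTP("mail.example.com", 587) as smtp:
#     login_with_starttls(smtp, "alice", "not-a-real-password")
```

The server-side policy (refuse AUTH until TLS is up) is configured in whatever MTA or IMAP server you pick; this only shows what a well-behaved client does.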

  334. Needs to be said: by StarsAreAlsoFire · · Score: 3, Funny

    From ASR ( http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html )
    Re : Mail Transfer Agents

    Qmail : a small office of neatly dressed clerks, delivering short clipped remarks to queries, and handling mail with a rude impersonality, except in the case of failure where they let their hair down and have an after-hours beer and let you know about it, pointing to the pertinent header sections.

    MMDF: A jumped up mailroom boy with a chip on his shoulder. Loves the bureaucracy and takes great pride in stamping "illegal address" in red ink on any mail it passes. Unpacks all the mail and repacks it in his own special envelopes before delivery to end users.

    PP: MMDF gone mad with standards fever. Think "Brazil".

    No, PP is... well, see, when it receives a letter, it chops it into small pieces, then translates bits of it using an English-Hungarian phrasebook and puts all the bits into various pigeon-holes. When it gets round to delivering the message, it collects all the bits, translates them back using a Hungarian-English phrasebook, tapes them together, and loses the letter. Some time later, you get a bounce message:

          ----- The following addresses had permanent fatal errors -----

          ----- Transcript of session follows ----- ... while talking to bloat.example.com.:
    >>> RCPT To:
      550 My hovercraft is full of eels

    PP is John Cleese.

    Sendmail: Shiva as a postman. Many arms delivering mail, dancing, taking drugs, destroying as it sees fit. Often makes creative changes to the mail for kicks, but ultimately can be persuaded to do anything with the right incantation...and that includes giving you other people's mail.

    VMail: No experience yet, but I'd guess something like a wizened old man sitting on the porch outside the postoffice. Looks at everyone who passes by with deep suspicion, but turns out to be friendly and helpful once he realises you're not there to rob the place.

    Micro$oft IMC: The Scarlet Pimpernel of postmen. Hard to find, impossible to order about, but every once in a while it saves a piece of mail from disaster. Sometimes even with its head(ers) intact.

    cc:Mail SMTPLINK: A 5 year old child left in charge of a large sorting office. Can't reach over the counter properly, can't handle more than one letter at once and has to go looking for a grownup whenever it wants to deliver mail to other towns. Often opens parcels to look for shiny things inside then just delivers the wrapping paper onwards.

    cc:mail UUCPLINK: an insane madman sitting in a box. Mail is thrown into a box where unknown things happen to it.. sometimes mail actually leaves the box.. usually to be delivered to the administrator of a totally unrelated postoffice and containing a complaint that the madman could not find the recipient in his dark box and would you please contact the person with the key of the box. Of course, the only way to reach that person is by mail and even if the box is opened the madman cannot be persuaded to actually send mail to unknown addressees to the person with the key anyway...
    Gus, Pete Bentley, Malcolm Ray, Perry Rovers

  335. Backups by Craig+Ringer · · Score: 4, Informative

    Backups.

    With POP3, the client downloads mail and deletes it off the server. Without a significantly butchered POP3 server there's no way to hold copies of that mail for a period of time (say, to ensure it goes on to your archival tapes, or to make sure you can recover mail the user deleted accidentally). Keeping mail on the server, as IMAP does, is also one less thing to worry about if their workstation / laptop dies - just give 'em another one. If more mail clients supported LDAP address books and WebDAV calendars this would be even nicer; as it is I still have to keep their mail folders in their network home dir so I can back up their address book.

    You can back up POP3 boxes if you're on a corporate network, by forcing the client to keep its spools on the user's homedir. That tends to be slow and inefficient, though, and it doesn't let you do things like transparently split out attachments and store only one copy of an identical attachment for everybody.

    It's also easy to lose mail with POP3 if your client does something silly. Most clients seem pretty decent now, but I remember old Eudora versions used to DELE mail off the server then crash, corrupting their mailboxes. Woohoo.

    IMAP gives admins much more control over user mail. You can back up their mail folders, including their outbox and filed mail. You can enforce mail lifetime limits if your information retention policy requires it. You can store single copies of duplicate messages and attachments. You can give users access to shared mailboxes, and to each other's mailboxes where necessary. You can manage their mail folders remotely ("I can't delete $message, help!"). You can set up filters that deliver mail into sub-mailboxes automatically. Good clients automatically sync the IMAP mailbox so it can be used when the client is offline, like POP3. You can have your anti-spam software learn from their mail client's Junk folder. It's just much saner for business environments, in much the same way that network home directories and thin clients are much saner than a bunch of desktops with local storage are.

    IMAP also permits you to give the user a single view of their mailboxes from their desktop and when they're on the road, or accessing their mail from home. Don't even talk about "leave mail on server" for POP3 - users WILL misconfigure it and suck all their mail down onto one of their machines, then come to you looking for help cleaning up the resulting awful mess.

    Now, for an ISP, things are the opposite. You want to get the users' mail through your system and get rid of it. Most ISPs only offer POP3 and have small mailbox caps, so the user can't set their client to never delete mail off the server. They don't want to be responsible for user mail, they want it off their hands ASAP. An ISP can just tell a user who deleted a message then wants it back "well, that was silly then wasn't it?". An ISP doesn't want to back up 5 years' worth of mail for 500,000 users.

    My point is that for corporate environments IMAP is so superior that it's almost nuts to offer anything else, but for an ISP POP3 is a much more viable option. So what's so bad about POP3 depends entirely on what your needs are.

    1. Re:Backups by EvilTwinSkippy · · Score: 1
      Or you could just be agnostic and support both. We do at my place. Sure, we give folks a song and dance about why IMAP is so much better than POP, but you get a few folks on laptops who insist on carrying everything with them.

      Of course, once they see that webmail doesn't work right and they discover they can only check their mail from one location, they change. Sure they are a pain in the ass about it, and I get tired of answering the same questions over and over again about "bugs in the server" and "my messages are missing". I think it's when they stop by my office and see the answer to "I checked my mail at home and the messages were gone when I got to the office" scrawled in coffee and blood on the wall behind me, they get the idea that POP is bad.

      All except the CEO. But he only ever checks his email on his laptop.

      --
      "Learning is not compulsory... neither is survival."
      --Dr.W.Edwards Deming
    2. Re:Backups by Craig+Ringer · · Score: 1

      Decent IMAP clients support offline IMAP (they sync with the server when they can connect). This solves the one major downside of IMAP compared to POP3 for roaming users, and means that I simply see no need to even offer users the opportunity to shoot themselves in the foot with POP3.

      It also means I don't have to talk users through recovering mailboxes from their home machines and sending them to me. Once I manage that, I have to restore them to some semblance of standards compliance by fixing whatever idiotic mangling their mail client has done (*cough*Eudora*cough*), and upload their local mail files back to the server where they belong. Ugh. Way better to just give the user the option of IMAP, IMAP, or IMAP.

      I'm currently looking at how hard it'll be to make a customised Thunderbird installer for the user that comes bundled with a pre-configured IMAP account (with offline IMAP enabled) and has their client cert pre-installed. Preferably an install that they can simply copy off a CD and run - no setup process required. If only Thunderbird had some sort of writeable network address book support...

  336. a variety of solutions by adownie · · Score: 1

    There are a few things you are going to want to consider here. First, you really need to define what you want the mail system to be. If your requirements are simply POP/IMAP, then you can go with a variety of vendors or, if you are big on building it yourself, some open source offerings. However, with a million accounts, you aren't going to want to revisit all the Outlook clients to point them at POP/IMAP. For a direct Exchange replacement you might want to talk with Communigate (formerly Stalker).

    You will also want to look at the mail gateways for inbound and outbound traffic. While open source is a great thing, managing the infrastructure for a system this large on open source can be a bit of a pain. So I would look at commercial vendors like IronPort, as they provide all the filtering and traffic management through a single interface. The vendors will provide all the design assistance you need, so there really isn't a need to bring in a consultant.

    So, my design would be an MTA layer facing the net (potentially with dedicated inbound and outbound MTAs), the mailstore layer in the protected net, and an LDAP master server with a couple of replicas for the mail systems to hit. I know IronPort and Communigate would do this well. While I am a big fan of open source, for a system this large you really want support and companies backing up the whole thing... Let the flames begin.

  337. Notes and Outlook -- nope, not kidding by sydbarrett74 · · Score: 1

    Use Notes/Domino on the backend and set up Outlook as the client. That way, people get to keep the same look and feel, but it's being handled by a much more scalable solution on the backend.

    --
    'He who has to break a thing to find out what it is, has left the path of wisdom.' -- Gandalf to Saruman
  338. DBMail by Althazzar · · Score: 1

    The company I work for has a tool for exactly this kind of situation called DBMail (available, open source, at http://www.wodan.net./). It does exactly this: stores mail in a database, making your system as scalable as your database of choice. We currently have a few setups of half a million users, and there is no reason why a bigger setup wouldn't be possible. If you need help or want to know more, mail me :)

  339. Plan. Test. Spec. Deploy... by cgenman · · Score: 1

    (5) Get yourself another job before all of the problems can be found. When they're found, come back as an even higher-paid consultant to fix them.

  340. You should quit the Job by I_LV_MSFT · · Score: 0

    Resign. Go work for Google. After you get some experience with how real large-scale projects work, go back to the same company to fix whatever mess the guy after you has made.

  341. One million accounts? by Nailer · · Score: 1

    One million accounts, huh?

    So one out of every 5000 humans in the universe will have an account on your mail infrastructure?

    Even accounting for dupes, I don't believe you'll ever serve a million email accounts. This isn't a technical thing. It's one of those 'gee there aren't enough people in the world for me to believe you' things.

    But I might be wrong. People seem to estimate their customers and users in such a way that suggests a good chunk of everyone on the planet is using their stuff all the time. I just don't believe them until I've seen real evidence. Nothing personal.

    1. Re:One million accounts? by EvilTwinSkippy · · Score: 1
      Did you miss the part about this being the U.S. Navy?

      They have 368,000 active duty personnel, 142,000 reservists, and 177,000 civilians. Total: 687,000. With standard allowances for safety and growth, you DO need to plan for well over 1 million accounts.

      --
      "Learning is not compulsory... neither is survival."
      --Dr.W.Edwards Deming
    2. Re:One million accounts? by Nailer · · Score: 1

      Did you miss the part about this being the U.S. Navy?

      Er, no. I missed nothing, as the article does not mention who the system is for at all, you pompous git.

    3. Re:One million accounts? by EvilTwinSkippy · · Score: 1
      Got me there. I have no idea where I pulled that nugget from, save the folks in the opening few comments bitching about EDS.

      (Commence self flagration.)

      --
      "Learning is not compulsory... neither is survival."
      --Dr.W.Edwards Deming
  342. Simple by LivingSacrifice · · Score: 1

    Two words:

    Qmail and FreeBSD.

    You will not find a more stable, scalable solution anywhere.

    1. Re:Simple by Anonymous Coward · · Score: 0

      Nice try, next try.

  343. I hate to be the thousandth person to say it, but by Dread+Pirate+Shanks · · Score: 1

    What this company should do is fire this tool and hire someone that doesn't need to ask Slashdot how to handle their huge and apparently very important mail system.

  344. Communigate by Custard · · Score: 1

    I'm browsing at +3 so I'm not sure how many people have already suggested it but you really should give Stalker a call and talk about Communigate

    Dan

  345. Quick setup by mseeger · · Score: 4, Insightful
    Hi,

    my recommendations:

    • Budget about 20-30 man days for the initial design. You'll need about 30-50 man days of software development and 100 man days for setup, testing and fine tuning. Figures may vary with skill and LWF. Time for integration into your backup service is not included.
    • Use a directory service with replication mechanism (preferred LDAP, we've done it with MySQL too). Every system except the load balancers will get a replica.
    • The user data is stored on machines with Cyrus. Depending on machine size, user profile, mbox size etc., you put between 5,000 and 50,000 users on each system.
    • The directory service knows which user is on which system. Prepare a script to move users from one server to another (including the mbox).
    • Incoming IMAP connections go through a load balancer to frontend systems running the perdition proxy. Those relay the requests to the responsible IMAP server according to the directory.
    • Incoming HTTP requests go through the load balancer to an Apache with SquirrelMail on the frontend systems. Those convert the requests into IMAP requests and connect to the local perdition.
    • Generate a web frontend for the user to setup auto reply, vacation and anti-spam settings.
    • From those settings you can create SIEVE scripts for the user.
    • Incoming and outgoing SMTP traffic is handled by systems with sendmail. Local delivery is handled by LMTP connects directly to the IMAP servers (cyrus can handle LMTP).
    • Antivirus and antispam are handled through the milter interface and appropriate plugins. Plan for individual settings per user (these can be generated from the data in the directory server).
    • Load balancing SMTP is trivial.
    • Add monitoring (e.g. Nagios), Backup and Restore (last one most important, nobody wants backup, all everyone wants is restore).
    • If desired, use a cluster file system for those IMAP servers to have even more redundancy.
    • Make sure you have access to the internal DNS of your company. If you can set up "mail.acmecompany.com" to point to several IPs (depending on location), this may ease your job a lot. If you cannot, this may be hard (and expensive) for your load balancers.
    • You can scale everything horizontally in this concept. The choke point may be the load balancers.
    • You can distribute the system easily onto several locations. Distribution over several continents is only recommended if you can either manage the DNS or the mail agent settings per continent.
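
    The "settings to SIEVE script" step in the list above can be sketched roughly as follows. This is only an illustration in Python, not what we actually run: the `MailSettings` fields, the `X-Spam-Flag` header and the `Junk` folder name are assumptions. Only the Sieve syntax itself (`require`, `fileinto`, `vacation :days`) comes from the Sieve standards.

```python
from dataclasses import dataclass
from typing import Optional


def sieve_quote(text: str) -> str:
    """Quote a string for Sieve: escape backslashes and double quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


@dataclass
class MailSettings:
    # Hypothetical per-user settings as a web frontend might store them.
    vacation_message: Optional[str] = None
    vacation_days: int = 7
    spam_to_junk: bool = False
    spam_header: str = "X-Spam-Flag"  # assumed header set by the spam filter
    junk_folder: str = "Junk"         # assumed folder name


def build_sieve(settings: MailSettings) -> str:
    """Render a Sieve script from the stored settings."""
    extensions = []
    rules = []
    if settings.spam_to_junk:
        extensions.append("fileinto")
        rules.append(
            f'if header :contains {sieve_quote(settings.spam_header)} "YES" {{\n'
            f"    fileinto {sieve_quote(settings.junk_folder)};\n"
            "    stop;\n"
            "}"
        )
    if settings.vacation_message:
        extensions.append("vacation")
        rules.append(
            f"vacation :days {settings.vacation_days} "
            f"{sieve_quote(settings.vacation_message)};"
        )
    if not rules:
        # Nothing configured: deliver to the inbox as usual.
        return "keep;\n"
    require = "require [" + ", ".join(sieve_quote(e) for e in extensions) + "];"
    return require + "\n\n" + "\n\n".join(rules) + "\n"


if __name__ == "__main__":
    print(build_sieve(MailSettings(vacation_message="Back on Monday.", spam_to_junk=True)))
```

    The spam rule comes first and ends with `stop`, so junk mail never triggers an auto reply. Uploading the generated script to the mail store is a separate step and is not shown here.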
    Please forgive me if I'm not completely correct. I'm only the sales rep ;-). But we've done it several times for ISPs. OSS software usually does the biggest part of the work. Usually some components (depending on existing contracts and knowledge) are commercial software (e.g. anti virus, load balancers, cluster file system). Typical operating systems are Solaris or Linux.

    With backup support you should be able to set up such a system in 6 to 12 months (the latter being more realistic for big companies).

    Most probably users will complain about the lacking calendar.

    Most troublesome will be the migration phase (hope you realized I didn't mention it above). This depends so much on your current scenario that it is very difficult to give general advice.

    > where would you start?

    Contacting me ;-). Perhaps get a budget first. As I said, I'm sales....

    Regards, Martin

  346. Re:qmail by drbill28 · · Score: 1

    We have a fairly large scale email system. There are 180,000 email accounts on the system. All can be used. But only 8,000 are really truly active users. We use maildirs to store the emails. Adding vpopmail, spamassassin, virus scanning to the pipeline does not slow anything down. Performance is great. It is not even a good machine. Plus I have had 100% uptime so far on it. If they go this route, vpopmail takes full advantage of relational databases for authentication. It speeds that up dramatically. It is a pretty neat system though.

  347. I would start by partitioning the problem by Anonymous Coward · · Score: 0
    Rather obviously, I would start by partitioning the problem. Rather than looking for a system that can take 1M users, I would take for granted that you can build systems that can handle 10,000-100,000 users. Divide the problem into having a large number of autonomous mail systems handling some fraction of the userbase.

    I would prefer partitioning the users by having a compact lookup map on the MTAs handling incoming mail, rather than distributing them across mail servers by hashing or other means. This way you can control the distribution directly: you can move specific users to specific machines, and when you scale up the system you have more control over which users you migrate where.

    (If you have 1M users, aim for 20M email addresses. A map mapping each user to an autonomous mail system can be put into a very compact data structure which can be kept in RAM. This data structure need not be mutable -- you can rebuild and reload it periodically to include updates)
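
    A minimal Python sketch of the "rebuild and reload" idea, under loud assumptions: a plain dict stands in for the compact data structure, the class and function names are made up for illustration, addresses are treated as case-insensitive, and the SMTP reply strings are just examples of what an MTA hook might return.

```python
import threading
from types import MappingProxyType
from typing import Iterable, Mapping, Optional, Tuple


class RecipientMap:
    """Read-mostly lookup table from email address to backend mail system."""

    def __init__(self) -> None:
        self._table: Mapping[str, str] = MappingProxyType({})
        self._lock = threading.Lock()  # serializes rebuilds, not lookups

    def reload(self, rows: Iterable[Tuple[str, str]]) -> int:
        """Build a fresh table from (address, backend) rows and swap it in."""
        fresh = {addr.strip().lower(): backend for addr, backend in rows}
        with self._lock:
            # The old table is never modified; readers see either the old
            # table or the new one, never a half-built mix.
            self._table = MappingProxyType(fresh)
        return len(fresh)

    def backend_for(self, address: str) -> Optional[str]:
        """Return the backend for an address, or None if it is unknown."""
        return self._table.get(address.strip().lower())


def smtp_rcpt_reply(rmap: RecipientMap, address: str) -> str:
    """Reject unknown recipients during the SMTP dialogue, before delivery."""
    if rmap.backend_for(address) is None:
        return "550 5.1.1 No such user"
    return "250 2.1.5 OK"


if __name__ == "__main__":
    rmap = RecipientMap()
    rmap.reload([("alice@example.com", "store03"), ("bob@example.com", "store07")])
    print(rmap.backend_for("alice@example.com"), smtp_rcpt_reply(rmap, "nobody@example.com"))
```

    Because the table is immutable once built, lookups need no locking; a periodic job would call `reload()` with rows exported from whatever database holds the user records.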

    Ideally you would have some proxying for POP and IMAP so that users do not need to be aware of what mail system they need to connect to. They just go to pop.mail.example.com or imap.mail.example.com and you can use DNS to route them to a local proxy. The local proxy knows which mail system the user belongs to and connects the user there.

    This scheme would require some modest amount of development work, but I think it would be justified in this case. None of it is too complex. For instance there are several ways to solve the user-mapping in the MTA, from the naive Perl kludged LMTP-backend doing central-database-lookup (takes about 30 minutes to implement) to an atomically updatable compressed trie-based map in the MTA, so you can reject mail to invalid recipients before delivery takes place. It would surprise me if there are no proxy servers for IMAP or POP available which can do the same mapping, but I guess you can have someone write a fairly scalable one in just a few weeks.

    If you go for a monolithic solution for 1M users you will experience a lot of downtime and frustration and to keep a beast like that up is going to cost a lot more than having a bunch of cheap stand-alone systems.

  348. Try DBMail (www.dbmail.org) by duredhel · · Score: 1

    I'm still surprised that this one is still under most people's radar... It uses a DB as its backend (mysql, postgresql). You've got to use something like Postfix, Exim, etc. as SMTP. This thing seriously ROCKS. I'm still waiting for it to hang. It scales very well and is definitely worth a check.

  349. Obviously by Anonymous Coward · · Score: 0

    I would use a few servers as incoming SMTP servers in a round-robin DNS or with load balancers. I would also set up a few backup IN MX for the domains.

    I would store mail on backend servers which would have an smtp, pop and imap daemon running.

    For the mail checking I would put POP and IMAP proxies, which can also be in round robin DNS or with load balancers.

    You would then need a system to redirect from the SMTP servers and POP/IMAP proxies to the right backend. This can be done with a database or an LDAP server. Or email addresses could be rewritten to @backend1.domain.com, etc...

  350. Bug is not a security hole by Lost+Found · · Score: 1

    It looks like that buffer overflow might be there, but it depends on stuffing lots of data into the RELAYCLIENT environment variable. Because qmail-qmtpd does not have the setuid bit, RELAYCLIENT must be set by root or the daemon user prior to dropping root. Hence this bug is totally unexploitable.

    http://lists.grok.org.uk/pipermail/full-disclosure/2004-March/018191.html

    But I agree with the sentiment - oh so close!

  351. Outsource or IMS by petienne · · Score: 1

    I'd agree with the comment of looking at outsourcing.
    There are excellent email providers that could provide a customized email solution that can easily scale to your needs.
    Should that not be an option, one solution that has scaled to that size for me in the past is the iPlanet Messaging Server from Sun. It is also used by a lot of ISPs for their customers. Very versatile, customizable and solid.

    Good luck

  352. Re:MailDir vs. MySQL by shurdeek · · Score: 1
    Actually, I used to have performance problems with MailDir, because I have a pretty slow mail server (PII/400, 128MB) and some folders have tens of thousands of messages. I had already been using reiserfs, which should be optimised for this type of data, but apparently it wasn't enough. So about 4 months ago, I added a MySQL layer onto MailDir, adapted maildrop and wrote a webmail that can utilize this (I also use FCGI and persistent database connections for further optimization). The performance was increased at least tenfold in large folders. In particular, finding new messages (in all folders together) usually takes under a second, previously it was measured in minutes. The database now holds somewhat more than a million messages.
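
    To give a rough idea of what such an index layer looks like, here is a small Python sketch. It is not the maildrop/MySQL setup described above: SQLite stands in for MySQL so the example is self-contained, and the table layout and function names are assumptions made up for illustration. The idea is the same, though: scan the Maildir once, then answer "how many new messages?" from the database instead of walking the directory.

```python
import mailbox
import sqlite3


def build_index(maildir_path: str, db: sqlite3.Connection) -> int:
    """Scan a Maildir once and record per-message metadata in the database."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " key TEXT PRIMARY KEY, subdir TEXT, flags TEXT, subject TEXT)"
    )
    md = mailbox.Maildir(maildir_path, create=False)
    rows = []
    for key in md.iterkeys():
        msg = md.get_message(key)
        rows.append(
            (key, msg.get_subdir(), msg.get_flags(), str(msg.get("Subject", "")))
        )
    with db:  # commit all rows in one transaction
        db.executemany("INSERT OR REPLACE INTO messages VALUES (?, ?, ?, ?)", rows)
    return len(rows)


def count_unseen(db: sqlite3.Connection) -> int:
    """Count messages still in new/ or lacking the Maildir 'S' (seen) flag."""
    (n,) = db.execute(
        "SELECT COUNT(*) FROM messages"
        " WHERE subdir = 'new' OR instr(flags, 'S') = 0"
    ).fetchone()
    return n
```

    In a real setup the index would be updated at delivery time rather than by rescanning, and anything else that touches the Maildir directly will put the index out of sync.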

    Of course there are certain drawbacks, like when you use IMAP, the database loses sync, so you shouldn't mix it on individual accounts. But I don't use IMAP now so it's perfect for me. This could be fixed by modifying the IMAP server, which shouldn't be that difficult, I just don't have the need.

    My point is, don't criticize if you don't know what you're talking about ;-)

    Yours sincerely,
    Peter

  353. /. answers for these questions are 99% BS by Anonymous Coward · · Score: 0
    I would love to help you with this one, but my experience has been limited to setting up ISPs handling much less than 1M accounts.

    You're going to get the normal 99% BS from /. "experts" that have NEVER set up a mail server in an environment with REAL traffic. I don't know why so many people must render opinions about things they don't know anything about. Skimming a HOW-TO doesn't make you an expert.

    I would suggest you find a professional with proven experience in this arena. Let's also hope your company has given you enough of a budget to do this properly.

    Good Luck

  354. Wisdom follows, pay attention! by Anonymous Coward · · Score: 0

    You cannot replace Exchange with Qmail or anything like that. Exchange is a workgroup collaboration system, which is much much more functionality than plain e-mail.

    If you want to implement for a big COMPANY, you must go IBM-Lotus Notes/Domino, because that is the only honestly full featured collaboration suite alternative (Novell's one lacks big time).

    A single big IBM iSeries AS/400 or zSeries mainframe iron can serve 15k CONCURRENT Lotus users, either over the native OS, or by running Linux in virtual partitions. And those machines are rock solid, both by hardware and software. Just don't forget to reboot every decade.

    You could use Sun iron also, but Sun is about to turtle in one-two years, so don't count on support. You also don't want to run Domino on Windows Server, do you? Reboot twice a month just because of security hotfixes? You are smarter than that!

    The idea to implement a million mailbox system on PC iron with free software is only viable for webmail systems, that give mailboxes for free, because you do not have to give uptime warranty there.

    Companies buy big black IBM iron, because their IT HQ staff do not want to work extra night shifts to fix all kinds of weirdest problems, they want to go home at 5pm and spend all their big income on their kids, wife, dog and car.

    Geeks should not be allowed to go anywhere near a corporate data center, they should be shot on sight!

  355. Re:Here's my plan and it's the best one you'll get by FF3451 · · Score: 1

    What the guy should do is buy an e-mail system that can handle 1,000,000 users and not screw around trying to chewing-gum together his own solution.

    He who never tries never learns anything either... and he who learns how to achieve something like this is subsequently worth a lot of money.

  356. Re:I hate to be the thousandth person to say it, b by Getfunky · · Score: 0

    Why not open source the project?? get some geekers to code some mutant linux based mammother system to send ooogles of spam(cough) mail.

  357. here's a mail from someone doing this on cyrus now by Anonymous Coward · · Score: 1, Informative

    > > > FWIW, I've experimented with 750k mailboxes on a single system with 8GB RAM and we
    > > > plan to put that number in production in a couple of months here.
    > >
    > > Ouch, 750k? How many concurrent accesses?
    > >
    >
    > We currently have 1.6M, 1.2M and 940k mailboxes in 3 boxes with fiber to a single emc storage, all boxes
    > dual Xeon 3.4Ghz EMT64T with 4G.

    We tend to have quite large mailbox lists, but not as large as this. The biggest issues we've found with large mailbox lists are:

    1. Number of concurrent connections.

    If you support/encourage IMAP usage, then you tend to end up with quite a few more connections than POP.

    Although technically IMAP connections can be very long lived, we find there are lots of short connections (mostly due to things like Outlook Express, which when doing a "sync" pass does a logout and login for each *folder* in a user's account!) and some long ones. With about 650,000 folders on one machine (about 130,000 users), at peak times we see about 3500 imapd processes. We use linux 2.6, and find that this is a good number of maximum processes to have. Although the kernel is just about O(1) for everything these days, we find that there does seem to be a bit of an elbow point around the 5000 process mark where things just seem to start showing higher latency and average loads on the server.

    2. Size of mailboxes.db file

    With a large mailbox file, you probably want to use the skiplist format. Part of the implementation of the skiplist db however is that the entire file is mmap'ed into memory. While this is generally fine since each process shares the same mmap file backing, with really large mailboxes.db files you can end up with just huge page tables.

    For instance, the above 650,000 folders mailboxes.db is about 100M in size. With pages being 4k each, that means each process needs 25,600 pages just to mmap that file into its process space. If you have > 4GB of RAM, you have to use x86_64 or PAE mode in linux. Both of these mean that each page requires a 64bit page table entry (8 bytes). If you have 3500 processes then...

    3500 * 25600 * 8 = 716800000 = 683M

    Yes, that's 700M of memory just to hold the memory map of all your processes, no actual real data at all!!
    This also means that you MUST use the high-PTE option in linux, or else you'll have lots of low memory pressure.
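
    The page table arithmetic above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the function name is made up, and it assumes the worst case in which every process ends up touching every page of the mapped file.

```python
def page_table_bytes(file_bytes: int, processes: int,
                     page_size: int = 4096, pte_bytes: int = 8) -> int:
    """Worst-case page table memory needed to map one file in every process."""
    pages = -(-file_bytes // page_size)  # ceiling division
    return pages * pte_bytes * processes


if __name__ == "__main__":
    total = page_table_bytes(100 * 1024 * 1024, 3500)  # 100M file, 3500 imapd processes
    print(total, "bytes =", round(total / 2**20, 1), "MiB")
```

    With the figures from the mail (100M file, 4k pages, 8 byte entries, 3500 processes) this gives 716,800,000 bytes, about 683.6 MiB. Doubling either the size of mailboxes.db or the number of processes doubles the overhead.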

    3. IO

    CPU isn't an issue. IO definitely is. Cyrus uses minimal CPU on today's hardware, but it still is an IO hog.

    That's part of the reason we sponsored the meta-data split patches that have gone into 2.3 so that you can separate out the email store part and the cyrus.* files onto separate partitions/spindles to improve overall performance. Where possible, split out:

    user.seen state files
    quota files
    cyrus.* files
    email spool files

    Onto separate spindles/partitions. At least that way you'll be able to use something like "iostat -p ALL 120" to see which parts of your system are generating the largest IO.

  358. ISPMAN by mangaramblo · · Score: 1

    I think the best Open Source solution is ISPMAN.

    It uses LDAP as its backend database and is an all-in-one ISP solution with mail+web+dns+etc. You don't have to use everything if you don't want to.

    The best thing is that "you may start with a single server to manage user's mailboxes and add more as you grow. ISPMan can manage this and allow you to create user's accounts and mailboxes on different servers. This does not affect the user at all but allows the system administrator to balance the load of mails on different machines."

  359. I suggest MS Exchange.... by grahamsz · · Score: 1

    /running right after you

  360. the ultimate server for 1 million users by millst · · Score: 1

    I have an old 386 in my shed, 66Mhz DX with 24Mb of RAM and a 2Gb HDD that has been running as my mail server for 6 years since its last reboot. Something like this should do the trick for you so long as all your users don't want to check their mail all at the same time. ps, try beating that kind of uptime $1m data center. pps, i'm worried i may have to take it down soon to replace the power supply as the fan is getting slower :(

    1. Re:the ultimate server for 1 million users by ohjethuth · · Score: 0

      ps, try beating that kind of uptime $1m data center.
      The scale of the service delivered by a $1m data centre (minus a bit of downtime) would probably beat your single user box :)

      --
      Oh s**t!
  361. You could... by ohjethuth · · Score: 0

    ...get everyone a Gmail account!

    --
    Oh s**t!
  362. Check out Mirapoint by JBird · · Score: 1

    Take a look at the Mirapoint mail appliances at http://www.mirapoint.com/. 99.999% reliability and scalable up the wazoo.

  363. Stop right now by biglig2 · · Score: 3, Insightful

    What you have here is an opportunity for a tremendous open source win against exchange, and you are about to stuff it up because you do not have a clue how to do it.

    So, what you do right now is you go find someone who does know how to do it. And by that I mean someone who can demonstrate they know how. Which does not equate to having a low slashdot id; it equates to having done real projects of this scale.

    So, how do you start? You ring IBM and get them to come in and talk to you. You ring Red Hat. You ring Accenture.

    If you want impartial advice from someone who isn't a vendor (which is a good idea), then you go find some companies that have a million seat open source e-mail deployment in place and you see if you can get their messaging admin to talk to you.

    --
    ~~~~~ BigLig2? You mean there's another one of me?
    1. Re:Stop right now by vidarh · · Score: 2, Informative
      1m mailboxes isn't much, and doesn't require a big complex system. Been there done that, and learned a lot (both about what to do and not to do) in the process, but the short story is that it isn't particularly hard.

      The main challenge when I was doing it 5 years ago (I designed and wrote most of the prototype of a free webmail system, and managed the development team that completed it) was lack of good open source webmail solutions and lack of scalable mail storage systems, and hardware limitations.

      Today there's a huge number of GOOD IMAP based webmail packages, such as IMP, and mail storage isn't much of a problem anymore - you can get a couple of TB of storage relatively cheaply.

      Today, if I was going to do this in a corporate setting, I'd buy 3-4 small cheap servers to process inbound/outbound mail, 2-3 reasonably high powered machines with good IO capacity and RAID5 to split the users' mail storage and POP/IMAP access across (IO is more or less the ONLY thing that really matters - whenever you need to make a choice, always choose higher IO capacity over almost anything else), 2 machines for an LDAP directory of which server each user is on, and 2-3 cheap servers to run the web frontend on.

      All in all for that kind of scale, if your total cost pans out to more than 20-30 cents per user in hardware these days you're doing something very, very wrong (and you can manage for MUCH less depending on usage patterns of your users and how much time you're willing to spend on tweaking the software).

    2. Re:Stop right now by milimetric · · Score: 1

      Good post, good post. Except for one thing.

      ACCENTURE????

      Do you like... work for them or something? You've got to be kidding, right? I have one thing to say about Accenture. Cingular Wireless. Accenture was hired to do the merger with AT&T. Now, I know that this is a tough task to ask of anyone but I've never in my life seen a more screwed up website than Cingular's. It looks like someone was amazed that the project actually compiled and just released it out there.

      Seriously, Accenture is all business, 0 quality. Get yourself a real quality consulting firm. IBM and RedHat are good examples of companies that can provide quality.

    3. Re:Stop right now by biglig2 · · Score: 1

      Well, the thing about the big consultancy firms that don't make product is that they all suck, so when you try to pick an example for a slashdot post, your odds of finding one that isn't useless are not good.

      I bet their suits are nice though.

      --
      ~~~~~ BigLig2? You mean there's another one of me?
    4. Re:Stop right now by milimetric · · Score: 1

      lol, yeah, those rich bastards : )
      Open Source consulting, now there's something I'd be into. Basically making one company all over the world and charging for the service but releasing all source code for others to benefit / improve.

  364. GMX is a very large provider wich fits the profil by metasepp · · Score: 1

    Have a look at GMX. They are one of the biggest e-mail providers in Europe (mostly Germany and Austria).
    http://www.gmx.de/
    As far as I know, they have a 100% non-MS solution.
    Mostly Linux clusters.
    They seem to scale pretty well.

    Hope this helps.

    Greetings

  365. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  366. Re:Army Knowledge Online does it for 1.72 million by joib · · Score: 2, Informative

    My university also uses the Sun Messaging server. But we're only about 15000 students, so it's not a huge deal. But it works really well, at least compared to the old system with NFS-mounted mailboxes; there were constant problems with that, and it was overloaded and slow too.

  367. Downtime by shani · · Score: 1

    At firms I worked with (telephony companies, usually), scheduled downtime is not included in downtime numbers. Of course, it depends on the SLA, but this is how it worked in the "5 nines" days of Ma Bell. 5 nines (99.999% uptime) was basically a myth. :)

  368. Rewrite, anyone? by jlehtira · · Score: 1
    As someone who uses Qmail and likes it a lot I might recommend you do just that. There are some features that people expect from a modern mta package. IMAP would be one of them, and my humble opinion is that binc-imap would go well with qmail.

    I just did a qmail install yesterday and even when it's a good program it has a long steep learning curve. Every time I install qmail I need to google around, read a lot of documents and understand new things to decide what I need. Netqmail is a good but insufficient starting point while qmailrocks.org's version is completely overblown (at least for me). While figuring out the patches and other tools I'd need I couldn't help plotting yet another qmail package of my own.

    Improving netqmail and making it what qmail should be would be great.

    As an aside, my personal feeling is that if DJB sticks to his licensing (and continues to ignore all the patching) we need to eventually rewrite qmail. It's getting worse by the day and the patches are already starting to conflict with each other.

    1. Re:Rewrite, anyone? by Gaima · · Score: 1
    2. Re:Rewrite, anyone? by Anonymous Coward · · Score: 0

      I think Dovecot is much nicer than Binc. I would be interested in why you think otherwise!

    3. Re:Rewrite, anyone? by Russ+Nelson · · Score: 1

      Well, that's what Courier is intended to be: a GPLed qmail.

      So what patches do you feel are necessary?
      -russ

      --
      Don't piss off The Angry Economist
    4. Re:Rewrite, anyone? by jlehtira · · Score: 1

      I think Dovecot is much nicer than Binc. I would be interested in why you think otherwise!

      Why, of course. First, it is slightly cheaper; and secondly it has the words qmail-pop3d and www.qmail.org inscribed in large friendly letters on its homepage.

      Seriously though, only the latter had a meaning when I made my choice. I had originally installed qmail because a friend recommended it and I installed an imapd later when I started to need it. I first tried courier-imapd but couldn't figure out how to make it work at the time, and binc happened to be the next one to try; maybe it was mentioned on qmail.org. It matched my needs perfectly - installation was a snap and it has worked flawlessly ever since. Thus, I never got to try Dovecot or any other imapd.

      If you have experience from both, I would be interested in why you think Dovecot's better!

    5. Re:Rewrite, anyone? by Doktor+Memory · · Score: 1

      The set currently implemented by the freebsd qmail port might be a good start...

      (hi russ!)

      --

      News for Nerds. Stuff that Matters? Like hell.

    6. Re:Rewrite, anyone? by TemporalBeing · · Score: 1

      At least making it able to compile with GCC 3.x or later would be a good start as well. For example, adding the errno patch.

      --
      Truth is like the sun. You can shut it out for a time, but it ain't goin' away. - Elvis Presley (source: imdb.com)
  369. Some info on how it works by Steeltoe · · Score: 1

    Of course not. If implemented properly, it should be transparent. That means you dissolve the link when somebody performs a mutation on their instance. If 1000 people do this, it means you will have to store the 1000 variations. But you could also detect similarities with, e.g., hashing, and only store the unique objects and link to them for each instance.
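
    The idea can be sketched in a few lines of Python. This is a toy in-memory model, not how Exchange or any real mail store implements it; the class and method names are invented for illustration.

```python
import hashlib


class InstanceStore:
    """Toy single-instance store: identical bodies are kept once, keyed by hash."""

    def __init__(self):
        self._blobs = {}   # digest -> message bytes (stored once)
        self._links = {}   # (user, msg_id) -> digest

    def deliver(self, user, msg_id, body):
        """Link a message into a user's mailbox, storing the body only if new."""
        digest = hashlib.sha256(body).hexdigest()
        self._blobs.setdefault(digest, body)
        self._links[(user, msg_id)] = digest

    def read(self, user, msg_id):
        """Return the body this user's instance currently points at."""
        return self._blobs[self._links[(user, msg_id)]]

    def modify(self, user, msg_id, new_body):
        """Dissolve the link: point this user's instance at the new content.

        Other users keep pointing at the old blob. If the new content
        matches an existing blob, it is shared again instead of duplicated.
        """
        self.deliver(user, msg_id, new_body)
        self._collect_garbage()

    def _collect_garbage(self):
        """Drop blobs that no instance links to any more."""
        live = set(self._links.values())
        for digest in list(self._blobs):
            if digest not in live:
                del self._blobs[digest]

    def blob_count(self):
        return len(self._blobs)
```

    Delivering one message to 1000 users stores one blob; if several of them later make the same edit, the edited version is also stored only once.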

    Ironically, Microsoft is developing WinFS which is supposed to be able to automatically hardlink files transparently, thus the filesystem will automatically support Instance Store for every application. This is actually a pretty neat feature!

    Yes, people will seem stupid when you assume they are. It is most usually about your assumptions, not them..

    Instead of jumping on the problem, just think the obvious solution, and then patent that ;-)

    1. Re:Some info on how it works by TheLink · · Score: 2, Insightful

      "Ironically, Microsoft is developing WinFS which is supposed to be able to automatically hardlink files transparently, thus the filesystem will automatically support Instance Store for every application. This is actually a pretty neat feature!"

      Not if you really want a copy.

      For most normal users, disk space isn't a big problem. If it is, duplicate files aren't usually the cause of the problem.

      When I make a copy of a file, I don't want the O/S to just add a link to the same file.

      I want a frigging copy.

      If there's a bad sector or something goes wrong the chances are higher that I can recover the data if I have a _real_ copy.

      I use a file system for storing data. If disk storage was such a big problem, Google etc wouldn't be giving out GBs to users for _free_.

      I/O is a bigger problem. Disks store a lot more nowadays, but are not that much faster.

      --
    2. Re:Some info on how it works by Fulcrum+of+Evil · · Score: 1

      Yes, people will seem stupid when you assume they are. It is most usually about your assumptions, not them..

      You described a software process in a vague manner. Had that been the actual claim for exchange, it would have been stupid, even if they did the right thing.

      As it is, you may just be being vague - I don't know, and I'm not much interested in MS software.

      --
      "We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
  370. Hula by Norny · · Score: 2, Informative

    It's unfortunate you got so many junk answers to your query (e.g. "resign", gmail, .mac, etc). I had a server running ~15,000 accounts on a Pentium 133 with IMail 7 a while back. It wasn't pretty, but mail got sent and received as it should.

    Hula claims to scale pretty well, integrate with ClamAV and SpamAssassin, and have lots of other cool gimmicks for calendars and such. For 1 million accounts, I'd get some sort of dedicated spam/virus filter, though.

  371. Complete tosh. by jotaeleemeese · · Score: 1

    High level concepts can be outlined in a few paragraphs.

    Even if eventually the poster calls the consultants he can get enough ideas here to at least be properly informed of the overall direction a solution could take.

    Why are there so many people around with a "don't bother" attitude?

    --
    IANAL but write like a drunk one.
    1. Re:Complete tosh. by Anonymous Coward · · Score: 0

      For real, jotaeleemeese, most of the posts so far have been: blah blah blah, why would you even ask such a question, dummy! Mostly because I don't think they have an answer. He obviously came here for some initial direction, not a fully thought out plan of action. It reflects badly on the Slashdot community that most of the posters act like this.

      BTW, some of us would like to know what is better than Exchange in this scenario, including myself.

  372. Over here by guruevi · · Score: 2, Insightful

    We do it with a bunch of Postfix servers and MySQL. The MySQL is going to be clustered soon but currently runs separately on each server. Each server has MySQL and Postfix and generates statistics. Currently the most heavily loaded machine (10,000 mail accounts) uses about 1-5% of CPU (single Xeon with 3x72G SCSI RAID5). We estimate you can push about 100,000 accounts per server given enough disk space (we are planning to put it on an Apple SAN solution) and a separated-out MySQL database. About 10 mails/sec pass through the server (in and out).

    By comparison, an environment with 1000-2000 Exchange e-mail accounts takes up 2 dual-processor servers for the frontend and 2 single-processor servers for the backend (storage), and needs to be migrated to a $70,000 storage solution because the current one does not give enough throughput. The problem is that each time a secretary opens a calendar (e.g. to schedule an appointment with the management), all those mailboxes, schedules, calendars and notes are opened, searched through and synced (about 2000MB of data transfer in a few seconds), while the IMAP protocol doesn't do that and provides the same functionality.
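
    A back-of-the-envelope sketch of what the Postfix figures above imply for one million accounts. It assumes load scales linearly with account count and that the 10 mails/sec figure belongs to the 10,000-account machine; both are assumptions, and the function names are illustrative.

```python
import math


def servers_needed(total_accounts, accounts_per_server, spares=0):
    """Number of mail servers for a given per-server account limit."""
    return math.ceil(total_accounts / accounts_per_server) + spares


def scaled_mail_rate(total_accounts, observed_accounts, observed_rate):
    """Linearly extrapolate a mails/sec figure to a larger account base."""
    return observed_rate * total_accounts / observed_accounts


if __name__ == "__main__":
    total = 1_000_000
    print("servers:", servers_needed(total, 100_000))
    print("mails/sec:", scaled_mail_rate(total, 10_000, 10))
```

    Linear extrapolation is optimistic for disk IO, so the real number of machines would likely need to be confirmed by load testing.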

    --
    Custom electronics and digital signage for your business: www.evcircuits.com
  373. Re:Army Knowledge Online does it for 1.72 million by ataddei · · Score: 2, Informative

    Sun has the Sun Outlook Connector that allows MS Outlook to behave normally while there are Sun Messaging, Calendar, Addressbook and Directory servers instead of MS Exchange. In addition Sun has a SAFE methodology and toolkit to migrate out of MS Exchange.

  374. What a shame. by jotaeleemeese · · Score: 2, Insightful

    Somebody that obviously has never been trusted with a challenge on his job.

    Sad.

    --
    IANAL but write like a drunk one.
  375. Re: Infrastructure for One Million Email Accounts? by ataddei · · Score: 0

    Concerning the question from Cliff from yesterday about an alternative to MS Exchange, very clearly Sun has 3 reasons for you to move:

    - Backend software servers to support one million and many more users for messaging, directory and groupware. Sun software servers are OEMed by many other vendors in the telco space for normal email, unified messaging, etc., AND for some huge corporate installations. I also won't list how many telcos serve B2B customers and do messaging hosting. Sun's software is the clear leader here.
    - The Sun Outlook Connector, which allows a user to keep the MS Outlook client but see all Sun servers for mail, calendar, tasks, global addressbook, private addressbook, notes, journals, etc. as if this were an MS Exchange server. So there is no need to retrain your users, and I can tell you it was exposed to the fire of real users!
    - The Sun Groupware Migration Toolkit (SGMT), which allows you to SAFELY migrate all user data from MS Exchange (any version) with NO NEED to force users to change their passwords and no disk expansion on the mail side, with coexistence at the user, password, mail and even public folder level. Again, from real projects.

    Let me know for an offline discussion if you need more details.

  376. Nonsense. by jotaeleemeese · · Score: 1

    Heavy /. users at my company are in charge of systems providing email services in that range.

    Actually I think I have spotted their reply already.

    Many people underestimate the /. readership for I don't know what reason; several of the replies are pure gold for somebody in the situation of the poster.

    --
    IANAL but write like a drunk one.
  377. Have you looked at Decimail? by Anonymous Coward · · Score: 0

    I agree that there is not much point in using a database if what you need is a "traditional" email system. But a relational database allows you to go beyond what a traditional email system is capable of: I wrote Decimail (http://decimail.org/) to investigate these possibilities. Basically, Decimail is a PostgreSQL database with IMAP and SMTP daemons.

    In Decimail, an IMAP mailbox is defined by an SQL query. So you can group messages by date, subject, sender, content and so on. The main contrast with other systems is that this categorisation is done retrospectively: there is no need to file messages when they arrive, but instead you find them when you want to read them. And each message can appear in more than one mailbox if it matches multiple queries.
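
    To make the query-defined mailbox idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not Decimail's actual schema or code (Decimail uses PostgreSQL); the table layout, mailbox names and sample messages are all invented for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a few messages and two mailboxes."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE messages (
            id INTEGER PRIMARY KEY,
            sender TEXT, subject TEXT, received TEXT, body TEXT
        );
        -- Each "mailbox" is just a name plus a WHERE clause over messages.
        CREATE TABLE mailboxes (name TEXT PRIMARY KEY, where_clause TEXT);
    """)
    db.executemany(
        "INSERT INTO messages (sender, subject, received, body) "
        "VALUES (?, ?, ?, ?)",
        [
            ("boss@example.com", "Budget 2005", "2005-08-01", "numbers"),
            ("list@example.org", "qmail patches", "2005-08-02", "errno fix"),
            ("boss@example.com", "qmail migration", "2005-08-03", "plan"),
        ],
    )
    db.executemany(
        "INSERT INTO mailboxes VALUES (?, ?)",
        [
            ("From boss", "sender = 'boss@example.com'"),
            ("qmail", "subject LIKE '%qmail%'"),
        ],
    )
    return db


def list_mailbox(db, name):
    """Return the subjects of all messages matching the mailbox's query."""
    row = db.execute(
        "SELECT where_clause FROM mailboxes WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    # The clause comes from the trusted mailbox table, never from client input.
    query = f"SELECT subject FROM messages WHERE {row[0]} ORDER BY id"
    return [subject for (subject,) in db.execute(query)]
```

    Note that the third message shows up in both mailboxes without being filed anywhere, which is the retrospective categorisation described above.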

    Decimail is not particularly efficient: it's emphatically not what this questioner wants for his millions of users! But if you sometimes find the organisational features of your existing email system a bit limiting, it might be of interest.

  378. Re:Army Knowledge Online does it for 1.72 million by jsnipy · · Score: 1

    Yeah, it's not the quickest thing (spop) and you virtually can't send any attachments :(

    The portal is nice though ... if you are just dealing with other AKO users.

    --
    -- if you mod me down, I will become more powerful than you can possibly imagine
  379. Enterprise-wide scalability by Anonymous Coward · · Score: 1, Informative

    If you want to guarantee anything beyond 99.0% availability, you must have complete redundancy at least 200 miles apart. This distance makes all the MPLS links unusable, significantly increasing system complexity.

    You never mentioned what your RTO and RPO were. If you can lose 24 hours worth of data, there are fairly standard methods. 12 hours is doable. Less than that and you need to spend a ton more money. SRDF/RA is interesting when you get down to the 5 minute area and don't want to write across the WAN for both locations.

    Probably the easiest solution is to get 4 mainframes, 2 per site, create linux partitions on them and use some commercially supported MTA. Use all the mainframe replication facilities to do the remote replication daily.

    Or you could use email like it was meant with federation and each dept or location having their own local server.

    Don't forget about spam filtering, SOX compliance, and automatic encryption of external communications. IronMail merged with a PGP product can do this. The free PGP implementations make the data the individual's, not the company's. I'll just say that commercial PGP has "other solutions available" so the company still can get access to encrypted information.

  380. NMCI Blows by HangingChad · · Score: 3, Informative
    Just so you know. Most of us out in South East Asia refer to NMCI (Navy-Marine Corps Intranet) as the Not Mission Capable Intranet.

    When it works at all it's slow. Sometimes you can hit the Send button and just sit there and wait a while.

    When we have to work on a Navy project we had to start bringing our own equipment and hubs. Even their developer machines come loaded with 10 year old software and you can't get your email and be logged in as a developer at the same time. To check mail you have to log out, log back in under a different account, then log back in as a developer. The NMCI machines are boat anchors.

    NMCI is the worst defeat the US Navy has ever suffered.

    --
    That's our life, the big wheel of shit. - The Fat Man, Blue Tango Salvage
  381. Lack of reading comprehension skills... by jotaeleemeese · · Score: 1

    Poster: we have Exchange, it does not work, we are moving to something else.

    drsmithy: you should use Exchange. It is the rock0rz!

    Me: doh!

    --
    IANAL but write like a drunk one.
  382. So which delivery agent automatically hardlinks? by Nailer · · Score: 1

    Your idea is good, but is it implemented anywhere?

  383. Re:Split up the tasks carefully by davecb · · Score: 1
    Make sure the mechanism used to distribute the mail directories to the pop/imap/whatever servers is not NFS.

    Over my objections, a colleague tried to have the mail directories on one machine and the pop servers on four others. At light loads he got acceptable performance, and so put it in production. With several thousand accounts, 30-minute (not seconds, minutes!) delays between messages were common.

    NFS (v2 and 3) is pessimal for constantly-updated files with ad-hoc locking mechanisms.

    As suggested in the parent, distribute the mail files for a given user to a machine which provides the pop and imap services from local disk.
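
    One way to picture that layout: a directory records each user's home machine, and every POP/IMAP connection is routed there so the mailbox is always read from local disk. The sketch below is a hypothetical in-memory stand-in for such a directory (a real deployment would more likely use LDAP); host names and class names are invented.

```python
import hashlib

# Hypothetical storage hosts, each serving POP/IMAP from its own local disk.
BACKENDS = ["store1.example.com", "store2.example.com", "store3.example.com"]


class Directory:
    """Stand-in for an LDAP-style directory mapping user -> mail store host."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._home = {}

    def assign(self, user):
        """Pin a user to one backend; the choice is stored, not recomputed.

        Storing the assignment means adding hosts later does not move
        existing mailboxes.
        """
        if user not in self._home:
            digest = hashlib.sha256(user.encode("utf-8")).digest()
            index = int.from_bytes(digest[:8], "big") % len(self._backends)
            self._home[user] = self._backends[index]
        return self._home[user]

    def lookup(self, user):
        """Return the host a POP/IMAP proxy should connect this user to."""
        return self._home[user]

    def add_backend(self, host):
        """Add capacity; only users assigned afterwards can land on it."""
        self._backends.append(host)
```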

    --dave

    --
    davecb@spamcop.net
  384. Don't build your own, rent it. by Anonymous Coward · · Score: 0

    Don't set yourself up for huge sustainment costs by building and maintaining your own e-mail system. Contract with someone who knows how to do it correctly and cheaply. Contact Google, Yahoo or MSN to have your million users added to existing infrastructure. I am sure they can give you your own domain and would be considerably cheaper. If all else fails, take a look at Oracle Mail. You will be pleasantly surprised.

  385. Large-scale POP3 by blorg · · Score: 1

    A million users and they want POP3? Add a gun and a single bullet to your administration requirements.

    An ISP, etc. could quite reasonably require a million POP3 accounts. Given that the submitter mentions Some corporate, some personal, some free I am guessing that employee users are in the minority here.

  386. Maybe Nemesis? by c0l0 · · Score: 1

    A few weeks ago, I read an interesting article about Schlund&Partner, one of the largest internet-related companies in Germany these days, developing an eMail solution of their own because they weren't able to find anything suiting their demands. They're the guys behind GMX, arguably the most popular eMail service in Central Europe. Their eMailing system, Nemesis, is designed to provide scalability and redundancy all the way. Maybe you can get them to license it to your company (I don't know which license the project is actually under, sorry), or at least let you evaluate their solution so you can decide what you can actually expect from today's enterprise eMail systems.

    The article was published in the German-language Linux-Magazin, issue "August 2005".

    Good luck! :)

    --
    :%s/Open Source/Free Software/g

    YTARY!
  387. Lotus Domino on an iSeries by G1975a · · Score: 1

    Merging Lotus Domino with the power and stability of an IBM iSeries (aka AS/400) would give you the stability and robustness you require. Unfortunately, the cost isn't cheap.

    The iSeries also runs Linux if you are looking for a stable, high-performance (but not cheap) 'server'. You may already have an iSeries if you have 1 million people working for you.

    Lotus Domino runs on Linux so you could run it on a Lintel box and get the stability of Linux with the Domino feature set. Alternatively, an IBM mainframe (or any other that would run Linux) could be a Domino server, if you wanted to re-use existing equipment you may have.

    As for the client, Lotus Notes runs on Windows and Linux (in WINE). Their web client, iNotes, while not perfect, performs nicely and has some security features built-in that you'd need for roaming users.

    (no, I don't work for or sell IBM products)

  388. Doable by jav1231 · · Score: 1

    The thing is, 1 million accounts is not simultaneous. There will be a substantial number who won't be on the server at any given time. The number of accounts isn't as much a concern as the size of the inboxes involved. If you run quotas and keep them down, you can have a decent machine handle these accounts even with one server. That is, until you add webmail. With that number, webmail would have to be a serious package. I know for my implementation, the speed of the webmail package dropped with large inboxes, so again quotas help. I'm with the suggestion above about a Beowulf cluster. Clusters are nice for redundancy, speed, and the ability to stay up indefinitely.

  389. And you're mad because? by Anonymous Coward · · Score: 0

    So you're mad because you can easily customize the mailbox interface in Notes? You've just shown that you can sort mail by subject, so where does the problem come in?

    If you're angry with the company you work for for not making the UI enhancements you want you should be complaining to them about it, not the /. crew

  390. What I would do. by MrJerryNormandinSir · · Score: 1

    I'd go with a unix based operating system and build HA active/active clusters that consist of at least 3 nodes each. If you are an Intel shop I'd go with 6 Dell 6650 PowerEdge servers. Some form of SAN or NAS storage. 1M users must mean a budget, so I would go with an EMC CX600. I'd use Linux HA. Configure the first three servers as your mail hub; configure your accounts, imap, ldap, pop, webclient, etc. on your next three servers. Sendmail gets bashed but I'll tell you what... It is stable, and when configured properly it is secure. LDAP can be populated via your Active Directory... or why not rewrite and centralize your authentication on a modified OpenLDAP server!

    If you don't have the time to roll your own system, check out Scalix. Roger Williams University (my son attends there) uses Scalix. www.scalix.com

    1. Re:What I would do. by Intron · · Score: 1

      After 5 years of pissing away my time with sendmail, I am looking at a different MTA on the next server (scheduled to be deployed next month). So I did the obvious, clearly scientific test: running the sucks-rules-o-meter on MTAs. The results:

      I did "sucks" vs "rocks". The word "rules" appears in too many places that refer to mail configuration rules.

      MTA        sucks  rocks
      Exchange    1819    188
      Exim          29     33
      Postfix       49    140
      Qmail         59    250
      Sendmail     229     84

      --
      Intron: the portion of DNA which expresses nothing useful.
  391. if it were me ... by psbrogna · · Score: 1

    Make the spam filtering function somebody else's problem - either a service (Mailwise) or an appliance.

    Software? Cyrus IMAP + Postfix.

    Hardware? Something high density ... perhaps opteron based IBM blades? Clustered on some linux distro that IBM would provide a support contract for. I've also been looking at 64bit HPC stuff from Terrasoft solutions on Mercury HW- very intriguing.

  392. Cluster System by thed00d · · Score: 1

    At a previous company I was with, we ran CommunigatePro, with roughly 500,000 users on it. Our cluster had only 4 machines to support this, and we had plenty of room to grow. They offer great support, and have several installation services. Everything is customizable in it, and has all the features you requested.

    --
    http://www.accelerateglobalwarming.com
  393. Where would I start? ..... by kwandar · · Score: 1

    GMAIL!!

  394. Fuck Stalker by Anonymous Coward · · Score: 0


    Stalker fucked their customers real hard a year ago.

    My guess is that they have a new leadership with clueless MBAs and maybe planning for an IPO.

    It is really a pity since Stalker was the best player, but my advice is to stay well away from them.

    http://www.theregister.co.uk/2005/02/04/stalkers/

  395. Re: Wrong on all points... by Anonymous Coward · · Score: 0

    1) It's a bitch to install. Won't even compile on modern Linux distributions. You have to patch it to compile it and the patch isn't even hosted on qmail's site.

    Yes, it's not for newbies like you. It's for people who know what they're doing. On which "modern distro" did it "not even compile", anyways?

    2) It's a bitch to configure. Rather than parsing a single configuration file, qmail relies heavily on the presence of individual files in a directory.

    Guess what, many admins prefer that approach over
    lengthy, nested config files.

    3) Not not not not scalable! That's a myth. Doesn't properly batch jobs together. Hell! qmail was originally designed to be run from inetd!

    qmail was designed to run from daemontools.
    Where do you take all that bullshit from that you're spilling here?
    Oh, and it scales quite well, quote from qmail.org:
    USA.net's outgoing email, Address.com, Rediffmail.com, Colonize.com, Yahoo! mail, Network Solutions, Verio, MessageLabs (searching 100M emails/week for malware), listserv.acsu.buffalo.edu (a big listserv hub, using qmail since 1996), Ohio State (biggest US University), Yahoo! Groups, Listbot, USWest.net (Western US ISP), Telenordia, gmx.de (German ISP), NetZero (free ISP), Critical Path (email outsourcing service w/ 15M mailboxes), PayPal/Confinity, Hypermart.net, Casema, Pair Networks, Topica, MyNet.com.tr, FSmail.net, Mycom.com, and vuurwerk.nl.

    4) Heavy reliance on other daemontools.

    Yes, because

    1. it's UNIX
    2. all other tcpserver implementations were broken at the time (and afaik are still broken)

    What are you gonna criticize next, qmails low memory footprint?

    5) Breaks well-known and understood UNIX standards.

    Which "well-known and understood UNIX standards" are you referring to? Do you have *any* clue what you're talking about? NO.

    6) Security through lack-of-functionality.

    Actually it's secure by design.
    If you had the slightest clue about software architecture you'd have realized that on first glance.

    7) Not really secure despite the claims.

    So, you have found a vulnerability and collected the $500 USD from djb? Oh, you didn't?
    Then shut the fuck up.
    Your FUD is not appreciated, kiddy.

    8) No longer maintained.

    Says who? You? haha!

    9) No features. Adding them requires patching, and patching, and more patching.

    Yes, which you do *once* and then roll it into a nice package for future deploys.
    Last time I had to change something in my qmail-tarball was over a year ago.
    That tarball is all I need to pull up new installations with all the patches that I need.
    Deploy takes exactly one line:

    tar zxf qm.tgz && cd qmaili && make && make setup check

    And, guess what, the Makefile even sets up the config for the box at hand which is easy because all it takes is stuff like echo `hostname -f` >/var/qmail/control/me etc.
    With postfix I'd have to replace tokens in a config-file-template (which will of course change syntax in newer versions) and other shit.

    But from your statements I can tell that all this is way beyond your little head already. So next time you see the adults talking about stuff you don't grok you'd better shut up and listen instead of humiliating yourself like you just did.

    And if you still didn't get it:
    All your points are either outright *wrong* or display an embarrassing lack of clue. Go figure.

  396. Danger, Will Robinson.. math skills lacking! by adamgeek · · Score: 1

    I didn't bother to read all of your comment. I stopped right around the point I noticed that your math skills are worse than mine.. and that's saying a lot, haha. Basic math lesson below..

    24(hrs/day) x 7 (days/week) x 52 (weeks/yr) = 8736 HOURS IN A YEAR.

    "99.9% uptime equates to about 526 minutes, or 87.6 hours you _could_ be down each year"

    8736 / 100 = 87.36. Therefore, 1% downtime (99.0% uptime) is 87hrs lost per year! 0.1% downtime (MUCH DIFFERENT THAN 1.0%!) is 1/10th of that number.. 8.7hrs/year.. roughly 43 minutes per month. You got tripped up because you multiplied 8700 by .01.. which [in this case] is the mathematical equivalent of "1 percent" .. NOT the remainder of "99.9 subtracted from 100" (which is what you THOUGHT you were multiplying it by).
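
    The arithmetic is easy to check with a few lines of Python. This sketch assumes a 365-day year (8760 hours), which is slightly more than the 52-week figure used above; the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760; the 52-week approximation gives 8736


def allowed_downtime_hours(availability_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime permitted per period at the given availability."""
    return period_hours * (100.0 - availability_percent) / 100.0


if __name__ == "__main__":
    for availability in (99.0, 99.9, 99.99, 99.999):
        hours = allowed_downtime_hours(availability)
        print(f"{availability}% uptime: {hours:.2f} h/year, "
              f"{hours * 60 / 12:.1f} min/month")
```

    With these inputs, 99.0% works out to about 87.6 hours a year and 99.9% to about 8.76 hours a year, or just under 44 minutes a month.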

    I hate to sound so derisive.. but seriously, you start off saying you've built enterprise-class systems.. yet you don't know something as fundamental as how much uptime "3 nines" equates to? I can understand crappy math skills (I have them too), but I can't understand not knowing something so fundamental by heart. I play with cameras for a living, so maybe I've misjudged the amount of off-the-cuff knowledge an admin/architect of a 250k acct email system must have regarding uptime.. but WOW haha.

  397. Re:LOOK WHO IS REALLY DUMB! [IT'S YOU!] by Anonymous Coward · · Score: 0

    hey dude, fyi you messed up ;-)

  398. Forget it... by madman101 · · Score: 1

    Of course, none of the solutions offered here are in the least bit compliant with the myriad of regulations that are going to need to be addressed in an enterprise this large. Sarbanes-Oxley is just one of your problems.

    Sorry, but if you need to ask this question on /., you need to outsource this project to a company that specializes in projects this size and will guarantee compliance.

  399. Well by Shads · · Score: 3, Insightful

    In my opinion you're going to need a cluster of servers or at least round-robin MX records for the servers. I personally think sendmail scales the best of the MTA packages and offers the best set of features and ease of maintenance, although a lot of people would argue it's intrinsically insecure... I've never had problems, but I kept our mail servers up to date. I would separate the SMTP machines the outside world uses to deliver mail to your space from the servers used by users of your service to send mail. I would also move delivery services (IMAP, POP, webmail) to their own machines instead of having them on the SMTP machine, and you would probably be best off using a NAS for the actual storage medium. This is actually a really interesting project. Good luck and let us know how it turns out :)
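
    For anyone unfamiliar with how round-robin MX records spread load: a sending server tries the lowest preference value first and picks at random among hosts that share a preference. A small Python sketch of that ordering follows; the host names are invented and the function is illustrative, not taken from any MTA.

```python
import random


def order_mx(records, rng=random):
    """Return MX hosts in the order a sender would try them.

    records is a list of (preference, host) pairs. Lower preference
    values are tried first; hosts with equal preference are shuffled,
    which is what spreads inbound load across them.
    """
    by_pref = {}
    for pref, host in records:
        by_pref.setdefault(pref, []).append(host)
    ordered = []
    for pref in sorted(by_pref):
        hosts = by_pref[pref][:]
        rng.shuffle(hosts)
        ordered.extend(hosts)
    return ordered


if __name__ == "__main__":
    mx = [(10, "mx1.example.com"), (10, "mx2.example.com"),
          (10, "mx3.example.com"), (20, "backup.example.com")]
    print(order_mx(mx))
```

    Here the three preference-10 hosts share inbound traffic, and the preference-20 host is only tried when all of them are unreachable.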

    --
    Shadus
  400. My 2p on where to start.. by chewitt · · Score: 2, Interesting
    My 2p, based on experience of designing, managing and being commercially responsible for large scale messaging systems for the last 6-8 years (where large scale covers 500k users to 9m users) is that you don't want to use OSS as the core for projects this size. This may sound somewhat heretical to the /. audience, but if you're serious about the uptime constraints (99.9% is light - 99.999% is where you need to be and 100% is what you should be aiming at) and weighing in that someone's business somewhere is going to heavily depend on the success of this system, you *need* the focussed support and SLAs that you will only get from a commercial vendor. You're still going to glue the system together with a number of open technologies and there will be substantial customisation to meet your needs, but the core of the system needs to be rock-solid. In general my experience has been that much OSS Mail componentry is fantastic at lower scales both technically and commercially; however the admin burden rises unacceptably when the collective sum of all those components needs maintaining - even when in the hands of highly skilled administrators. Mail platforms at these scales constantly have problems/issues in them somewhere due to the unpredictability of a million users alone, so one of your biggest concerns is how you overcome them. Being dependent upon the OSS community or internal resources to perform a root cause analysis and fix a code bug when your system is running live is not a situation you can afford to be in.

    Some things to consider: MS Exchange is a lot more than just mail. If Calendaring and other forms of group-working are involved then the task at hand is substantially more complex than for a mail only system. Also, these days with viruses and spam being endemic the platform needs to incorporate a framework that handles them as well as policy driven content management controls at its core rather than have them as bolt-ins or bolt-ons. Are you bound by any regulatory requirements? Geography is a major influence, and if this is a business platform how does this affect your strategies for resilience, disaster recovery and backup of the platform? In a perverse way most of the decisions you have to make when building systems of this size are business decisions (what's the cost of retraining users to use new mail clients is a favourite of CTOs) and not specifically about the products/technologies involved.

    So, exactly what type of hardware/software and surrounding infrastructure you need to assemble to create 'the whole' is a somewhat open-ended question without going into a decent level of detail on your requirements and the drivers behind them. However, once you go north of about 500k users the number of commercial vendors tails off dramatically. If you include group-working as a factor it reduces further. I'll not start suggesting names (I currently work for a vendor in this space and self-plugging's not in the spirit that /. operates on), but I'd recommend starting out by talking to some of the analyst groups that have staff researching this end of the messaging market (Radicati, Gartner, Butler Group) and then opening dialogue with vendors appropriately.

  401. Check out Communigate Pro by Anonymous Coward · · Score: 0

    I believe Communigate Pro by Stalker Software has had a 1 million account single server system running for a couple of years now. They also have cluster configurations to eliminate that single point of failure. Very good product, very stable.

  402. oracle collaboration suite by mij_zelf · · Score: 1

    You can implement this in a scalable and redundant way; it offers all the services mentioned, plus more. With this http://www.oracle.com/technology/products/cs/index.html you can also use the Oracle database as a datastore for all kinds of docs, effectively replacing fileservers. Don't forget things like backup & recovery ... I don't know what others think about this or what experiences they have had with it, but I think it's worth some investigation. It has a price tag per user, but with a full implementation it could be a nice price... http://homepage.mac.com/ik_zelf/oracle

    1. Re:oracle collaboration suite by Anonymous Coward · · Score: 0

      Do not deploy OCS. It is a nightmare, backup and recovery is a huge problem, and Oracle is terrible with support. We just rolled it out at my job and without going into details that might get me canned, the higher ups in IT and management have been grumbling amongst themselves that we should have got Exchange. It might also be worth noting that both the consulting firms that were hired to help us deploy OCS (supposed OCS experts) use Exchange at their companies.

  403. i have a good solution by afrest · · Score: 1

    can you get in touch with me at afrest@hotmail.fr

  404. Jettison your AV load! by lwriemen · · Score: 1

    Move all your users to eComStation clients, so you won't have to be concerned with AV software or any existing trojans.

  405. The Obvious Thing To Do... by PHanT0 · · Score: 1

    Start porting your favorite e-mail server to an Atari... what fun is this project unless you're going to run it on old arcane hardware which was never designed for anything remotely close?

  406. Ask an expert... by davecb · · Score: 1
    Depending on the degree of migration work involved, you may want to engage someone who's managed large moves away from Microsoft solutions.

    If this were my company's move, I'd try to get John Terpstra of the Samba team to consult with us. See http://us1.samba.org/samba/team/ for contact information.

    --dave

    --
    davecb@spamcop.net
  407. A good question by HitScan · · Score: 1

    I'm wondering: if they didn't know how to use Exchange properly, what makes them think that "Anything But Microsoft" is going to be any easier? Are they just going to try each one until they find one with default settings that most closely match what they want to do? Inquiring minds want to know.

    --
    HitScan
  408. I don't think the average Dell dual Xeon box is up by Anonymous Coward · · Score: 0

    Well if the users get a 100 Kb mail box it might.

  409. Has anyone suggested Gmail? by mrlatito · · Score: 2, Funny

    Gmail is open to everyone now right....just sign up for 1,000,000 gmail accounts and go on vacation! Let the engineers at google do it.

  410. Notes/Domino by hey! · · Score: 3, Interesting

    Of course nearly everyone who uses it hates it, because it seems unnecessarily complicated. But this is precisely the kind of situation Domino was designed to handle: scaling. If you can get by with Sendmail, you don't need or want Domino, but if you want to manage a million email accounts, this is one of the first places I'd look.

    This is exactly what Notes was designed to do: scale. People have been building systems on this scale with notes for nearly twenty years. You can not only scale it by moving parts of your email system onto mainframe class iron, but you can distribute it and provide all kinds of flexibility and redundancy into your system to meet virtually any messaging requirement (e.g. choose an alternate MTA for high priority traffic when there are Internet disruptions). Naturally there's some complexity involved, but if you can get by with sendmail you probably shouldn't be using Notes.

    What's more important is the management of accounts and identity, which is distributed, delegatable, and backed up by robust cryptographic certificate management. You can let a subsidiary manage its own accounts; they can subdelegate that to a division, and the division can subdelegate that to the IT staff on site. At each level, policies can be set, enforced, and changed for lower levels.

    --
    Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
  411. Cyrus by DanFluidMind · · Score: 2, Informative

    I would seriously look at Cyrus (http://asg.web.cmu.edu/cyrus/), which is designed to be scalable for huge numbers of email accounts. And the email users don't have to have accounts on the Unix boxes. It stores the messages in the file system but sets up index databases so that accessing the mailboxes is fast. It can also handle single-instance storage of the messages sent to multiple mailboxes.

  412. Mail by ladule · · Score: 1

    There are three answers to this. One is Qmail, which, as long as you are familiar with this program and are willing to invest the technical time in it, would give you the basics for mail. One drawback here is that you have no protection and, due to the size of the network, would have to add an appliance such as Symantec, which would only increase cost and IT time. The second would be to again use Qmail and run LDAP; you could then proxy your mail through a scrubber, lowering CPU usage and some maintenance time. The third solution, and the one I would highly suggest as your best bet, is to outsource this service to a subscription-based provider that also includes mail protection, i.e. Anti-virus, Anti-Spam, Anti-Spyware, as well as the ability to control outgoing mail, protecting sensitive company documents. There is only one company that I know of that provides this complete mail-with-protection service, and that is Sarron www.saron-corp.com.

  413. I just did this yesterday, too. by No-op · · Score: 1

    Ugh. I just did this crap yesterday.

    We don't do brick-level backups of mailboxes because it's too much overhead, so to restore a bunch of deleted contacts from a user's mailbox I had to go back and restore the whole mailstore. Conveniently, though, I was able to pop it onto the RSG on my new Exchange 2003 box, which hurt much less than I thought it would.

    I only wish the user had been on our new Exchange 2003 box, which is backended onto a NetApp filer. So far the snapshot backup/filer management for Exchange seems to work pretty well, and you can restore arbitrary mailboxes, emails, whatever, from any point in time that you have backups for.

    Sometimes, though, Exchange just makes me want to start smoking again.

    --
    EOM
  414. gmail.com by neo · · Score: 1

    Don't dismiss this out of hand. I'm willing to bet that Google would license the technology they use to private corporations, and you already know it can handle over a million users. Google is in the business of making money... (or that's what I've heard).

    I would certainly give them a call and see what they are asking for a private version of Gmail.

  415. Heh, that's easy. by Pizentios · · Score: 0

    I would use qmail, with courier and vpopmail running over top of it. For the web mail check out http://www.horde.org/. It's got some really great features.

    For spam and virus check out a barracuda unit, simply amazing. http://www.barracudanetworks.com/ns/?L=en

    --
    -Pizentios
  416. ...if correctly implemented by stonetony · · Score: 1

    Exchange 2003 is the best option because of available support and scalability, BUT it must be correctly implemented. The company I work for has ~200,000 mailboxes that used to stretch across several Exchange 5.5 environments, TAO mail, Lotus Notes and GroupWise. We started a consolidation effort four years ago that is in its final stages (mostly because we've acquired other companies during that time that had to be migrated.) This was accomplished by building a brand new native mode Win2K infrastructure along with Exchange 2000 then migrating clone user accounts then mailboxes in phases. We went from a problematic patchwork of mail platforms with diverse support to one large Exchange 2000 (since upgraded to Exchange 2003 in preparation for Win 2K3) environment spread out all over the globe that is highly available, clustered and 99% up time. If you're planning this you definitely need Exchange 2003 installed from scratch and involve Microsoft. I've seen it work first hand.

  417. I'd consider consulting with the following... by Bourdain · · Score: 1

    I'm by no means a developer though I read up on email technology and providers all the time.

    I'd consider contacting the good folks at:
    http://www.fastmail.fm/
    They provide one of the fastest and most standards-compliant IMAP, POP3, and SMTP services I've ever used.

    They support lots of bandwidth and storage using, AFAIK, all open-source software, at a seemingly low cost per user.

    Also, the individual who maintains the following website might be of good assistance to you:
    http://www.ii.com/internet/messaging/imap/isps/
    and
    http://www.ii.com/

  418. HMail by neoweb · · Score: 1

    I would look into HMail.
    It comes with webmail (or you can use a different one), IMAP, POP, SMTP, external accounts, antivirus (ClamAV), blacklists, MySQL or MSSQL support, a web-based admin control panel that you can let users use to control their account, individual domains or the whole server, multihoming, action scripting and more. Oh and it's FREE. A good choice if you have to use MS as your server OS. Now of course it's never been tested for that many email accounts, but I do know of people using it for multiple domains with 1000s of accounts on each domain.

  419. Gmail by Anonymous Coward · · Score: 0

    I think Slashdot readers have enough spare Gmail invites to help this guy out, right?

  420. I've actually done this. by shin0r · · Score: 1

    Earlier this year me and my team rolled out the largest email system in Europe for $UK_ISP (not BT).

    It caters for 4 million current users and can scale to an estimated 10 million.

    We use Openwave MX software to do this - it was the only thing that would scale. I mean the *only* thing, nothing else could cope. We looked, trust me.

    You need *lots* of hardware. This isn't a full list, but to give you an idea:

    24 MTA machines
    12 FEP (front end processing) machines
    16 queue machines
    48 mail storage machines
    16 virus-scanning machines
    2 dedicated DNS boxes
    4 directory servers (to look up mailboxes)
    16 webmail machines

    Numerous other boxes including logservers, terminal servers and a jumpstart environment for quick rebuilds.

    Typical box stats: SunFire V440/480, quad processor, 8GB / 16GB RAM where possible. All run Solaris 8/9.

    These are hooked up by fibre to a couple of enormous EMC arrays, and a bunch of HP EVA storage also. Total capacity? Currently ~48TB.

    It's a massive project, and it's not perfect, or (ever) completely finished, but it works!

    Good luck with your project, if I could give you one bit of advice it would be to take whatever spec you think you need and double it.

    cheers

    1. Re:I've actually done this. by prgrmr · · Score: 1

      This sounds like the solution, but I'd tweak it:

      IBM pSeries servers instead of SUN. Better price/performance, better support, hot-clustering and fail-over available, and they'll be around longer.

      I'd also go exclusively with EMC Clariion SANs. With the new 330GB Fiber drives at 15 per shelf and 10 shelves per rack, you'll have approximately 40TB per rack (allowing for 1 hot spare drive per every 2 shelves and using 300GB as the usable drive space to allow for formatting). You can link CXs together or not, depending on how you structure the servers on the front-end. You can also mirror 2 CXs across remote sites for Disaster Recovery, HIPAA, Sarbanes-Oxley concerns, etc.
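
      The per-rack estimate above can be checked with a few lines of arithmetic. The figures are the ones quoted in this comment; the constant and function names are made up for the example, and RAID overhead is deliberately not modelled.

```python
# Usable capacity of one rack, using the figures quoted in the comment.
DRIVES_PER_SHELF = 15
SHELVES_PER_RACK = 10
USABLE_GB_PER_DRIVE = 300    # usable space per drive after formatting
SHELVES_PER_HOT_SPARE = 2    # one hot spare for every two shelves


def usable_rack_capacity_tb() -> float:
    """Return usable capacity per rack in decimal terabytes."""
    total_drives = DRIVES_PER_SHELF * SHELVES_PER_RACK       # 150 drives
    hot_spares = SHELVES_PER_RACK // SHELVES_PER_HOT_SPARE   # 5 spares
    data_drives = total_drives - hot_spares                  # 145 drives
    return data_drives * USABLE_GB_PER_DRIVE / 1000


if __name__ == "__main__":
    print(f"{usable_rack_capacity_tb()} TB usable per rack")
```

      That gives 43.5TB before any RAID parity or mirroring is taken out, which is presumably why the figure is rounded down to roughly 40TB.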

    2. Re:I've actually done this. by shin0r · · Score: 1


      IBM pseries are nice, but they don't run solaris :)

    3. Re:I've actually done this. by hogans-hero · · Score: 1

      Have you looked at CommuniGate Systems (aka Stalker Software)? There is a deployment in Sweden with 4-5 million users using their cluster. I guess it is too late for your choice but there is indeed an alternative to Openwave. http://www.stalker.com/CommuniGatePro I'm not directly affiliated with them. I just built a few clusters using their software. ;-) Regards..

  421. and then you need a copy by diegocgteleline.es · · Score: 2, Interesting

    You have a flawed assumption in that the file is read only. Exchange/Outlook will let you modify the attachment in place and keep it in your mailbox.

    ....and then, Exchange WILL have to write a new copy of the data, because you just modified it and the data is not the same as before - you can't use the same copy. If the 1000 users keep the same file it's fine; if they modify it you need 1000 copies of it.

    Sharing something with people (which for some reason database people call "single instance store", I've learned today) can be done in both a filesystem and a database. Databases are "one-size-fits-all" kinds of tools: not always the "best" solution, but one that you have a good chance of making work even if it's not the best solution. Linus said something similar when it was suggested that he develop GIT on top of MySQL... if you really know what you're going to do with the data, and you KNOW that a filesystem is enough, why use a database? It's like buying a 900HP car for your mother - STUPID. The "let's do it just because we can" approach is a good step if what you want is to write overengineered, bloated software.

    Because a filesystem IS a database. Except that instead of having a SQL-ish interface, you've a "read(), write(), readdir()" kind of interface. Which happens to be really fast (filesystems are implemented inside the kernel, they're reliable, they're much simpler, easy to manage, etc).

    When you use a database like MySQL, you're just using a database on top of, uh, another database (the filesystem). Which makes no sense. It WILL work, but that doesn't mean it is the "best possible solution".

    Despite all this, BTW, hardlinks are NOT the solution for the "share a file between 1000 users" problem. They can be, but remember that you can't make hardlinks between different filesystems. I have no idea if you can use LVM to solve this, or if ACLs + symbolic links can be used to implement this in a delivery agent. And if you can't (I don't really know), someone really should think about adding something to filesystems to allow it, like plan9 did, because it makes sense.
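
    For what it's worth, the hardlink idea is easy to sketch. The following is a minimal illustration only, not how Exchange, Cyrus or any particular delivery agent actually does it; the function name and the "msg-0001" file name are hypothetical, and every mailbox is assumed to live on the same filesystem as the master copy.

```python
import os
import tempfile


def deliver_single_instance(store_dir, message_bytes, mailboxes):
    """Write the message once, then hardlink it into every mailbox.

    All mailboxes must be on the same filesystem as store_dir;
    os.link raises OSError (EXDEV) across filesystems.
    """
    master = os.path.join(store_dir, "msg-0001")  # hypothetical naming scheme
    with open(master, "wb") as fh:
        fh.write(message_bytes)
    for box in mailboxes:
        os.makedirs(box, exist_ok=True)
        os.link(master, os.path.join(box, "msg-0001"))
    return master


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        boxes = [os.path.join(root, f"user{i}") for i in range(3)]
        master = deliver_single_instance(root, b"Subject: hi\n\nbody\n", boxes)
        # One inode, several names: the link count covers master + mailboxes.
        print("link count:", os.stat(master).st_nlink)
```

    Note that writing to any one of the links in place changes the data seen through all of them, so a user who modifies an attachment needs a fresh copy written under a new file, which is exactly the "then you need a copy" point made above.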

  422. MOD PARENT UP by Chuckalo · · Score: 1

    or he will kill you

  423. Branded Yahoo Webmail by sjbjava · · Score: 1

    I'd seriously consider Yahoo Webmail if it could be branded for the company.

    1. Re:Branded Yahoo Webmail by sher1harris · · Score: 1

      This paragraph taken from the qmail website.

      "A number of large Internet sites are using qmail: USA.net's outgoing email, Address.com, Rediffmail.com, Colonize.com, Yahoo! mail, Network Solutions, Verio, MessageLabs (searching 100M emails/week for malware), listserv.acsu.buffalo.edu (a big listserv hub, using qmail since 1996), Ohio State (biggest US University), Yahoo! Groups, Listbot, USWest.net (Western US ISP), Telenordia, gmx.de (German ISP), NetZero (free ISP), Critical Path (email outsourcing service w/ 15M mailboxes), PayPal/Confinity, Hypermart.net, Casema, Pair Networks, Topica, MyNet.com.tr, FSmail.net, Mycom.com, and vuurwerk.nl."

      Sounds like all of these companies think you should use qmail

  424. The real fear of IMAP by iMacorIBM · · Score: 1

    Well, now that we've cleared up the benefits of IMAP, I'd like to add the real reason, which nobody seems to mention, of why people like POP3 over IMAP is PRIVACY. The idea is 'Pull it down, and off those company servers' before you get fired because some friend forgot to use your gmail account. When you host your own mail, IMAP rox. A Debian Cyrus/Squirrelmail solution has speedily pushed out my mail for years with much better virus and spam protection than most. Evolution (client, nearing the end of its REALLY buggy days) allows me to merge my corporate exchange and personal IMAP on my desktop, at home and at work (via OWA). Inappropriate emails are quickly drag 'n dropped out of Work->Inbox and into Personal->Funny. Rather than craft yourself some POP3 hack around IMAP, spend the time to setup your own personal IMAP system, and get yourself an ISP that lets you do this. (One that allows incoming port 25, authenticated outbound SMTP via ISP) Over the years I've beefed mine up using a VPN to a second/multiple mx locations for redundancy, as I have family members in the area - I use Rogers who leave your IP alone, when on UPS. Well, not a million users, but family and friends and an old Dell D233, on software RAID IDE. Debian packages with inherent security.. I sleep well on long vacations, and don't waste valuable company time trying to maintain my privacy. And YES, I might consider a Cyrus based solution, if you are looking for FREE and FAST. Postfix+Courier might work too, not sure how Courier scales, but if you like Maildir, GO FOR IT. iMac.

    1. Re:The real fear of IMAP by stu42j · · Score: 1

      If you move or delete a message from an IMAP account it will be deleted from the server. Unless the company has made special effort to keep permanent copies which they could just as easily do with POP3.

    2. Re:The real fear of IMAP by iMacorIBM · · Score: 1

      Agreed, In your Utopian world where they run rsync every 5 minutes and perform backups every hour... Then you'd have to be quick :) I think for the most part we aren't quite there yet :) iMac.

  425. Re:LOOK WHO IS REALLY DUMB! [IT'S YOU!] by Anonymous Coward · · Score: 0

    And yet somehow the fact remains you were still wrong. Hard to say Ooops and move on?

  426. I would start with a distributed setup like this.. by ides · · Score: 1

    I wrote an article on a system I use in production for about 20,000 accounts that should scale up to what you're looking for. Obviously you'd want to add in more servers, nice RAID setups, etc. The nice thing about this setup is it separates onto different servers the inbound MX traffic from the POP3/IMAP traffic. Here is a link to the article:

    http://www.samag.com/documents/s=8920/sam0311b/0311b.htm

    Another benefit is if there is a hardware failure it either doesn't impact the system at all ( if you lose an MX server ) or it only impacts a small subset of your total addresses which makes it more manageable.
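
    The linked article's exact scheme is not reproduced here, but the idea that a failed storage server only affects a subset of addresses can be illustrated with a simple deterministic mapping. Everything below (backend names, function names, the use of a hash) is an assumption for the sake of the sketch.

```python
import hashlib

# Hypothetical storage backends, for illustration only.
BACKENDS = ["store1", "store2", "store3", "store4"]


def backend_for(address: str, backends=BACKENDS) -> str:
    """Map a mailbox address to one storage backend, deterministically."""
    digest = hashlib.sha256(address.lower().encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


def affected_by_outage(addresses, failed_backend):
    """Return the addresses whose mailboxes live on the failed backend."""
    return [a for a in addresses if backend_for(a) == failed_backend]


if __name__ == "__main__":
    users = [f"user{i}@example.com" for i in range(1000)]
    down = affected_by_outage(users, "store2")
    print(f"{len(down)} of {len(users)} mailboxes affected")
```

    A plain modulo hash like this reshuffles most mailboxes whenever a backend is added, so a production setup would more likely keep the address-to-server mapping in a directory or database and look it up at delivery time.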

  427. Do it like Hotmail -- use BSD by Anonymous Coward · · Score: 0

    Do it like Hotmail does and use BSD.

  428. Lotus Domino by metamatic · · Score: 1

    You'll see a lot of trolling and flamebait regarding the Notes client; but the fact is, IBM Lotus Domino might meet your requirements.

    It provides full e-mail with calendaring and scheduling. It supports POP3 and IMAP, and webmail, as well as S/MIME, MIME file attachments, HTML, and LDAP for directory lookup. It has been demonstrated dealing with 300,000+ simultaneous users on a single server. (In fact, the network bandwidth gave out before the server did.)

    It scales from a single old x86 (I have some old quad Pentium 2 200MHz boxes running it fine) to a zSeries mainframe. It has clustering, so if you set it up right, when a server goes down users' clients move to a clustered replica and don't even notice a problem.

    I'm not wild about the Notes UI either; but you owe it to yourself and your company to check out Domino as a possible solution, because you don't have to run the Notes client at all. You can even keep using the Exchange client, and just replace all the servers. There are tools to help migrate from Exchange. I can probably put you in touch with some people experienced with Exchange migrations if you like; e-mail me (address on personal web site, at bottom of page).

    Another option might be IBM Workplace Messaging. That's focused around web mail. I have to confess I don't know how far it has been scaled up. (I'm not in sales.)

    (Opinions mine, not IBM's.)

    --
    GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
  429. Big Mail by fade · · Score: 1

    I just built a mail system that was intended to scale out to about 4 million users. We replaced over 70 windows machines running iMail with a small array of linux boxes behind load balancers. Mail is largely IO bound, so you pretty much have to get a kickass storage system. I specified fibrechannel connections to a generic SATA array with a lot of spindles. Because you're going to get a lot of delivery and read access concurrency, use Maildir as the spool storage. Select a good fast filesystem. We used Reiser for historical reasons, but XFS would also work well. We used Qmail as the transport with ldap patches and courier-imap(ssl) as the imap server, which practically gives you the webmail component for free. Spend your money on the SAN backend, and give each mail node a lot of ram. Apart from the Maildir requirement, you could build out a like architecture with postfix as well. To date, over 2 million accounts have been transitioned to the new system, and those 70 windows machines have been retasked as webservers. Here's the machine breakdown:

    2 * dual xeon/1G ram/small boot drive -- SMTP
    6 * dual xeon/4G ram/small boot drive -- imap/pop
    2 * dual xeon/4G ram/internal raid -- LDAP
    4 Terabyte SATA RAID backend connected by fibrechannel cards

    *note, if you have a requirement to save virus prone boxes from committing seppuku over foreign agents coming in on the mail channel, you'll have to scale the SMTP requirements nearly arbitrarily to account for processing each message for malignant components at that stage. Personally, I think it's kind of distasteful to use Free (speech) systems to make Windows secure, as it's sort of defeatist, but that's my political baggage. YMMV. ;)
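
    As a small illustration of why Maildir copes well with concurrent delivery and reads, here is a sketch using Python's standard mailbox module rather than Qmail or Courier themselves. Each message becomes its own file, so no mailbox-wide lock is needed. The function name, addresses and paths are made up for the example.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def deliver(maildir_path: str, sender: str, subject: str, body: str) -> str:
    """Deliver one message into a Maildir and return its key."""
    box = mailbox.Maildir(maildir_path, create=True)
    msg = EmailMessage()
    msg["From"] = sender
    msg["Subject"] = subject
    msg.set_content(body)
    # The message is written under tmp/ and then moved into new/,
    # so readers never see a partially written file.
    return box.add(msg)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "alice")
        deliver(path, "bob@example.com", "first", "hello")
        deliver(path, "carol@example.com", "second", "hello again")
        print(sorted(os.listdir(path)))
        print(len(os.listdir(os.path.join(path, "new"))), "files in new/")
```

    Because every delivery creates a separate file, the filesystem's handling of many small files matters a great deal, which fits the advice above to pick a fast filesystem and spend the money on the storage backend.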

  430. Re:qmail-ldap can do it by JacobKreutzfeld · · Score: 2, Interesting

    I used qmail-ldap to build a service which has had zero downtime in over a year, planned or unplanned. I had a handful of 1U servers offering SMTP(S), IMAP(S), POP(S), WebMail, and local DNS and LDAP caches. They stored mail on a backend NetApp accessible to all servers via NFS. One master LDAP server was where accounts were added, and it replicated to the cache slaves on each 1U server. I can add capacity to the NetApp, and add servers to handle load with no downtime. The 1U servers are fronted by a redundant pair of F5 load balancers.

    We were able to apply OS patches box-by-box, taking them out of service individually, but without any downtime to the service. Very nice.

    Others are using qmail-ldap for large ISPs, of the size you are asking about. Check out their mailing list.

  431. This is how we do it by Anonymous Coward · · Score: 0

    The company I work for (which shall remain anonymous, just like me) does it thus:

    Use cisco content switching modules (CSMs) for load balancing all of the clusters, we no longer use POP and IMAP, but we used to. Just webmail now, I'll explain the old style, just the hardware side since that's pretty much all I was privy to. Filling in the rest though is just a matter of reading how to tie it all together, which is well documented.

    LDAP - OpenLDAP:
    2xDell 1850

    IMAP - Courier IMAP:
    8xDell 1850

    POP - Can't say that I remember what software, courier possibly:
    8xDell 1850

    SMTP - DJB's Qmail:
    12xDell 1850

    Webmail - Squirrel Mail (we created our own later):
    8xDell 1850

    NAS - NetApp Filer:
    2x960 (I believe) in a failover configuration

  432. a mountain of not-even-close by Doktor+Memory · · Score: 1

    1) It's a bitch to install. Won't even compile on modern Linux distributions. You have to patch it to compile it and the patch isn't even hosted on qmail's site.

    Look, it's annoying that Bernstein and the GLIBC authors have decided to take their mutual pissfest out on us, but "echo gcc -O2 -include /usr/include/errno.h > conf-cc; make" is not exactly going to kill you, now is it?

    If it is going to kill you, there's always the net-qmail distribution.

    2) It's a bitch to configure. Rather than parsing a single configuration file, qmail relies heavily on the presence of individual files in a directory.

    A matter of taste, I guess. The one-setting-per-file method makes it very easy to build kickstart/rpm profiles that add or remove certain features without having to carefully parse/edit a monolithic configuration file, but I can see how for a junior sysadmin it's a little more confusing than just "look in main.cf."

    3) Not not not not scalable! That's a myth. Doesn't properly batch jobs together. Hell! qmail was originally designed to be run from inetd!

    You really have no idea what you're talking about, do you? (Hint: qmail isn't sendmail, and qmail-smtpd isn't "qmail" any more than inetd is "unix".)

    4) Heavy reliance on other daemontools.

    You can use daemontools to manage qmail if you want to. It's not a requirement, and the official docs don't even suggest it.

    5) Breaks well-known and understood UNIX standards.

    Really? Which ones?

    6) Security through lack-of-functionality.

    It's an MTA. It transports mail. Securely, as it happens. This is a feature, not a bug.

    7) Not really secure despite the claims.

    Really? Care to enlighten us?

    8) No longer maintained.

    This is as close to an actual valid complaint as you've got here: it's certainly been a good long time since the last release. And yet, it still works.

    9) No features. Adding them requires patching, and patching, and more patching.

    Look, if you need an MTA that speaks LDAP, SQL and UUCP, has hooks into an integrated calendar, and polishes the bumpers on your car, it's probably true that qmail is not the tool you want to use. Have fun trying to manage whatever monstrosity it is that does.

    It does one thing, and it does that one thing extremely well: some of us still consider that to be a virtue.

    Serious sysadmins don't use qmail and for damn good reason.

    I rather doubt you'd recognize a serious sysadmin if one bit you.

    --

    News for Nerds. Stuff that Matters? Like hell.

  433. Empirical evidence in favour of IMAP+RDBMs by revisitor · · Score: 1

    A little bit of research reveals this:

    Many features of a DBMS are highly advantageous from the point of view of an IMAP server. The obvious performance differential between the database options and both UW and Courier indicates that email storage is indeed a problem well suited for a database solution. Indexing capabilities give Cyrus and mySQL an advantage over Courier and UW when scanning headers and searching header fields. mySQL's full-text index provides a particularly expedient method for searching through message text, although it adds significant maintenance cost to operations such as adding and removing messages. A server-side buffer cache also improves performance by speeding up searches on recently accessed data. Although UW outperforms Cyrus by a small margin on some full-text searches, mySQL demonstrates clearly that a DBMS can search email much more quickly than a file-based solution. Most importantly, these results offer desperately needed empirical data comparing the performance of these three storage implementations.

    http://www.usenix.org/events/lisa03/tech/full_papers/elprin/elprin_html/

    I don't suppose anyone's come across any newer research (or implementations using this approach)?!?
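
    To make the indexing argument in the quoted passage concrete, here is a toy inverted index. It is not how MySQL's full-text index or Cyrus is implemented; it only shows why a lookup against an index avoids scanning every message, and why adding or removing messages carries a maintenance cost. All names and sample messages are invented.

```python
import re
from collections import defaultdict


def build_index(messages):
    """Map each lowercase word to the set of message ids containing it."""
    index = defaultdict(set)
    for msg_id, text in messages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(msg_id)
    return index


def search(index, query):
    """Return ids of messages containing every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


if __name__ == "__main__":
    mail = {
        1: "Quarterly report attached",
        2: "Lunch on Friday?",
        3: "Report for Friday meeting",
    }
    idx = build_index(mail)
    print(search(idx, "friday report"))
```

    A search touches only the entries for the query words, whereas a file-based store without an index has to read every message; the price is that the index must be updated on every delivery and deletion.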

  434. Security? by Anonymous Coward · · Score: 0

    Don't forget there may be some sort of security required too. I'm not just talking about using SSL and TLS either. There may be some requirement to use two-factor auth and email encryption. Plus the potential privacy legislation issues to deal with. You could be up for some extra headaches here and possibly a free sports car or a lifetime of golf days from Vasco or RSA.

  435. Lotus Notes by Anonymous Coward · · Score: 0

    I would look at Lotus Notes on the biggest pSeries server you can find, or an i5 iSeries if you can. Infinitely manageable and configurable, with native or web interfaces (you do, however, get *much* more functionality with the native client) it additionally lends itself to considerable use in application design within Notes itself, as well as integration with the DB2 database as seen on the i5. Very very nice and I believe bullet-tested by the folks over at IBM.

    I am not a Notes sales guy, just a happy admin.

  436. Easy answer! by jabber01 · · Score: 1

    I have a BUNCH of Gmail invites. Ya wan'em?

    --

    The REAL jabber has the user id: 13196
    What you do today will cost you a day of your life

  437. Who in their right mind... by cow-orker · · Score: 1

    ...would put such a giant system in a single place? Multiple mail servers don't have scalability problems.

    Put one mail server in each department. Universities have been doing this for nearly thirty years and it still works. It also saves bandwidth, as intra-department email doesn't need to be routed. Everything works out of the box; they invented SMTP back in the '80s to make this sort of thing possible.

    If you really must give people their firstname.surname@company.com address, put the mapping into a database and have central routers forward accordingly.
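
    The mapping database can be very small. Below is a minimal sketch using an in-memory SQLite table; the table layout, function names and addresses are all hypothetical, and a real central router would consult something like this from its MTA's alias or virtual-user lookup.

```python
import sqlite3


def build_alias_db(rows):
    """Create an in-memory table: public address -> departmental mailbox."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE aliases (public TEXT PRIMARY KEY, internal TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO aliases VALUES (?, ?)", rows)
    return conn


def route(conn, recipient: str):
    """Return the internal address to forward to, or None if unknown."""
    row = conn.execute(
        "SELECT internal FROM aliases WHERE public = ?", (recipient.lower(),)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    db = build_alias_db([
        ("jane.doe@company.com", "jdoe@mail.sales.company.com"),
        ("john.smith@company.com", "jsmith@mail.eng.company.com"),
    ])
    print(route(db, "Jane.Doe@company.com"))
    print(route(db, "nobody@company.com"))
```

    The central routers then only need this lookup and a forward; the mailboxes themselves stay on the departmental servers.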

    Honestly, what's wrong with the tried and proven low tech solutions?

  438. I dunno. What are they? by TheLink · · Score: 1

    More than half the bugs listed are for nonDJB code. qmailadmin and masqmail are not by DJB.

    Much of the rest are for running in a 64 bit environment. If you want to port some 32 bit apps to a 64 bit environment, no surprise that you might have to change a few things first before things run properly.

    The "rcpt to" overflow DoS thingy isn't a problem in practice, because your qmail processes should have sane ulimits on them. You might want ulimits even if you run postfix or some other mailserver.

    The other one happens if you allow users to send 2GB messages. If you don't and you have a ulimit limiting the amount of memory qmail-smtpd uses to <2GB, I think the process would die first (not sure what happens if you try the exploit without ulimits on a box with < 2GB free RAM...).

    The other thing of course is: qmail-smtpd runs as qmaild and not root. So even if you do allow > 2GB messages and there is an overflow, the attacker only gets qmaild permissions.

    The attacker needs to work a lot more to get further.

    AFAIK DJB doesn't claim there are no bugs. He just gives a security guarantee. And so far, I don't see how these bugs will allow an attacker remote root on a qmail system. Even without ulimit controls, you'd just get DoS or qmaild.

    I don't believe openbsd is ahead of the curve though ;). Ahead in some areas perhaps. But quite behind especially in performance.

    It'll be fun to get another DJB vs Theo thing. When was the last one? ;)

    --
    1. Re:I dunno. What are they? by jnf · · Score: 1

      Yes I only counted 4 of them as being DJB bugs. As for the 32 to 64 bit thing, yes naturally. My biggest argument is as a professor of CS and generally an asshole to people about their coding, this is something that should have been coded in a manner that would not break when switching between processors, even if it's far into the future and we are using 256b processors. I mean, even if 64b didn't exist at the time of the code writing, the implications of such things were clearly understood.
      Yes, all of the bugs are unrealistic in practice, and you should have ulimit controls in place regardless, but for a programmer to depend on this is a mistake, this holds doubly true if the programmer makes claims as to being security minded.
      I mean, it's still a bug in the code either way, you should not rely on the OS to protect you.
      Even with qmaild permissions, most boxes out there are pretty cushy once on the inside, and most root compromises happen from a user who already has local access-- whether it be an account or a compromised service that didn't run as root.
      DJB claimed at least the top three were not bugs because simply put, no one runs qmail like that, which is incredibly silly on his part to me, I mean, okay so this changes your code how? It's still a bug, even if the OS protects you from it, you haven't fixed the bug, just masked it.
      In regards to OpenBSD, I suppose in a few areas they are ahead of the curve, for instance the way they just changed their malloc/free routines (although we shall see over time if they were ahead of the curve or off hitting the pipe again), but in many areas they are quite behind in comparison to say PaX/Grsec.. and that's just in security .. outside of that, they just got ELF support in i386 as of what? 3.4?

      I'd like to see DJB, Theo and RMS in a boxing ring with sickles attacking each other to the death, and then we should shoot whoever is the winner.

    2. Re:I dunno. What are they? by TheLink · · Score: 1

      What I'd like to know is whether the bug appears when qmail is compiled on 64 bit powerpc or alpha code. If it does, then the bug is with qmail. Otherwise, it'll take a bit more to convince me it's a qmail bug.

      I did a test, the ulimit thing won't shut down the 2GB single line overflow thing. So it might be a way in. If it is genuinely exploitable DJB should pay up :). I run qmail on freebsd so I'm not sure how to exploit it.

      BTW I have the freebsd server (my firewall etc) running on vmware so if anyone hacks into it, I can suspend the virtual machine, make a copy and maybe see what happened. Lot more options that way...

      --
  439. FirstClass Groupware by darkone · · Score: 1

    Since there are 100 posts about Linux solutions, I'll make my other suggestion.

    FirstClass ( www.firstclass.com ).

    FirstClass is a great piece of groupware software that has been around since before the Internet. Calendars, Shared Conferences, Web Pages, VoiceMail/Fax, Shared Folders (available through CIFS/Windows File Sharing), IMAP, POP, Auth SMTP, EASY EASY EASY administration. The Mac, Linux, and Windows clients look identical, EASY setup for users, a server name, user name, password. No ugly SMTP/POP/delete messages and all of that. Different web mail templates, one looks and acts JUST like the client.

      How many concurrent users? 1mil users is not hard, 1mil concurrent users is. Also 99.9% uptime is not THAT hard, it's about 9 hours of downtime per year.

      The downfall of FirstClass is that though you can have multiple "clustered" Internet boxes (http, smtp, pop, etc), you can only have one main server. Also everything (unless you POP) is kept ON the server, so you NEED an internet connection (or modem) to work with EMail.

      Another bonus is that web pages are EASY, NO HTML required. Create a document, change the fonts and colors, drag image, BAM, the web page is done.

      FirstClass is definitely worth a look.

      -Ben

  440. Qmail Rocks with slight mod's by Anonymous Coward · · Score: 0

    http://www.qmailrocks.org/ and a few other patches for much larger setups

  441. distributed by rfisher · · Score: 1

    This doesn't sound like a problem for a single, monolithic solution to me.

    The "free" accounts--for one--should be a separate system.

    Each team should have its own email system. Each team should have the option of having one of their own manage their email instead of IT. IT should define standards to use for interoperability.

    Webmail should probably be a generic web IMAP interface with the requirement that each team's email system have IMAP turned on so it'll interop with the webmail.

    This is the kind of setup that I've seen work best. But YMMV.

  442. Re:Really want to know why the client isn't as sli by RzUpAnmsCwrds · · Score: 1

    "Oh, the other thing? Outlook feels integrated because everything automatically does the windows automatica launch active-x thing. Just highlight a message subjet, bingo! Embedded code launches! that's why viruses and worms."

    Stop spreading FUD. Outlook hasn't executed scripting in messages for years. While it's true that Outlook uses Trident ("IE") for displaying messages, it runs it in a mode with Javascript and ActiveX disabled. Even Trident flaws are unlikely to cause a compromise.

    All recent email worms have been of the "download this executable and run it" variety. And Outlook 2003, by default, won't even let you download executables.

  443. I'm in the Navy by Anonymous Coward · · Score: 0

    I am in the navy and we have horrific information technology problems. Basically the problem is that we have little internal expertise and have tried to make up for it with an unruly mass of contractors. If you are an IT contractor just be sure to use the phrase "web enabled" and the Navy will bite. We now have an absurd array of web pages that we are supposed to access for training, pay, medical, promotions, retirements....the list goes on. Guess what - each one is provided by a different contractor and requires its own usernames and passwords. I literally have 14 (not kidding) usernames and passwords written down that I need to do my job. What is particularly shocking is how many of these sites use your social security number as part of the sign-in process.

  444. Guninski by Russ+Nelson · · Score: 1

    Oh, yeah, Guninski. He's a crank. Sure, if the sysadmin doesn't apply any process limits, an attacker can deny himself service. That's like saying that if you have a gun you can shoot yourself in the foot.
    -russ

    --
    Don't piss off The Angry Economist
    1. Re:Guninski by fimbulvetr · · Score: 1

      Note: You and I have argued qmail's security before, quite a while ago, so I don't want to get into that again:)

      At any rate, Guninski seems to be a competent person and his work appears 100% legit in everything I see. While some exploits might require odd configurations, they are still a bug in the software, and that's what Guninski is after and that's exactly what people _should_ be after: bugs in software.

      The fact that DJB denies his exploits is absurd, but it is par for the course seeing as he does have a record of that.

      Your ad hominem attacks are depressing.

    2. Re:Guninski by jnf · · Score: 1

      This changes the code how?
      I thought we were conversing over whether there were bugs in the code or not?
      Not whether OS dependent configurations masked the bug or not?
      This is more like saying, certain versions of gun model X have been known to misfire and sometimes shoot out the back of the gun, but it's not a bug if you point the gun backwards.

    3. Re:Guninski by jnf · · Score: 1

      You pinpointed my argument exactly.
      Whether or not the OS protects you from these bugs, they still remain bugs.
      The fact that DJB refuses to fix his code speaks volumes about his character and is reason enough to not use his code.
      Do his interests lie in security, or upholding his 'never had a bug found' reputation. I think this instance clearly shows where he stands.

    4. Re:Guninski by Russ+Nelson · · Score: 1

      Not whether OS dependent configurations masked the bug or not?

      I guess it depends on whether you think the application or the operating system should be in charge of resource limits. It seems to me that, since an operating system needs resource limits to protect itself from rogue programs, a cooperating program may just as reasonably rely on the existence of resource limits. djb's programs do.
      -russ

      --
      Don't piss off The Angry Economist
    5. Re:Guninski by jnf · · Score: 1

      I guess it depends on whether you think the application or the operating system should be in charge of resource limits. It seems to me that, since an operating system needs resource limits to protect itself from rogue programs, a cooperating program may just as reasonably rely on the existence of resource limits. djb's programs do.
      I would agree with you, if it didn't take a programming error, i.e. int overflows, to cause the situation. This is the case in all 3 of Guninski's bugs, i.e. integer overflow causes the problem, OS ulimits/etc stop you from exploiting it, but the bug still exists nonetheless.
      So in the end, it has nothing to do with who controls resource limits, but rather who caused the bug that causes the need for resource limits.
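
      For anyone who wants to see what "integer overflow" means here, a rough Python sketch of a signed 32-bit byte counter hitting 2GB. This is NOT qmail's code; to_int32 and count_bytes_32bit are made-up names, and the wraparound models typical two's-complement hardware (in C, signed overflow is formally undefined).

```python
def to_int32(n):
    """Wrap n the way a signed 32-bit int wraps on two's-complement hardware."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


def count_bytes_32bit(chunk_sizes):
    """Add up chunk sizes in a simulated signed 32-bit counter (hypothetical helper)."""
    total = 0
    for size in chunk_sizes:
        total = to_int32(total + size)
    return total


ONE_GIB = 1 << 30

# Below 2GB the counter is still sane.
print(count_bytes_32bit([ONE_GIB]))
# At exactly 2GB the counter flips negative, so any later "total < limit"
# style check passes when it should not.
print(count_bytes_32bit([ONE_GIB, ONE_GIB]))
```

      A memory ulimit below 2GB kills the process before the counter ever gets that far, which is the masking effect being argued about. The counter itself is wrong with or without the limit.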

  445. Not a good idea by Steeltoe · · Score: 1

    If your disk is flakey and your data is important, you will back it up on removable media which you store in a separate location.

    With internet, it is even easier. You can use one of the online backup services, or do it yourself. What can be achieved rather painlessly today in backup is amazing: In one day I mirrored a Linux PC over internet, with full mirroring of the whole Debian-install, making for a complete redundant backup-machine which can be put online in minutes. Pretty fun, and I can clone the entire install anywhere I like, never having to install everything by hand again.

    If you're relying on copies of files for backup, then I guess you never had a HD die on you. It's not fun, and copies won't help you. Solutions exist to get the data, but they're too pricey for individuals.

    If you're into local redundancy, I guess RAID could be a cheap option, or just copy the files to another filesystem.

    I think your gripe is with Microsoft usually not giving enough options to their customers, because I can't see why you don't like this solution, which of course in a sane system should be possible to turn off. For a workstation this is probably not needed anyways, but imagine what this can do for a CVS-sandbox fileserver.

    I'm surprised that the filesystems for Linux haven't been doing this for years already. It might have performance issues, but those should be solvable.

  446. Get the storage "right" first by ChrisA90278 · · Score: 1

    I would seriously start the design process with the storage system. Email is mostly just "storage" that is accessed by many different servers all at once. It needs to be fast, fault tolerant and easy to back up while in use; for this, a "point in time snapshot" feature is good. You may want to talk to Sun about their "ZFS" filesystem on Solaris 10. If not, read up on what it does and then get something else like it.

    Once you have bomb-proof storage running on a cluster of servers, RAID, hot spares, a transactional file system and all that, then you add smtp, imap, pop and webmail servers. Use lots of MX records, round robin DNS or whatever to load balance. If the storage doesn't work, the system will not work, so get that right first.

    Lots more details but "this is /." and people here think for only 2 seconds (if that) before they type an answer

    BTW 99.9 is setting the sights a bit low; that would allow almost 9 hours of downtime per year. Shoot for another "9".
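
    The downtime budget is quick to check. A back-of-the-envelope sketch, assuming a 365-day year and counting every outage (planned or not) against the target; the function name is made up for illustration:

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability_percent):
    """Hours per year a service can be down and still meet the uptime target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99):
    hours = downtime_hours_per_year(target)
    print(f"{target}% uptime allows {hours:.2f} hours of downtime per year")
```

    Three nines works out to 8.76 hours a year; the extra "9" cuts that to under an hour.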

  447. Look at it a little closer. by Some+Random+Username · · Score: 1

    "All the other tests" are 2 other tests, searching and selecting all headers. This is not indicative of actual use, and doesn't demonstrate that mail storage should be a database. As I said previously, it's just because they are comparing mysql with indices, to imap servers without indices. Throw dovecot in there with its indexing and all of a sudden mysql isn't faster at searching.

  448. Go with Sun by Anonymous Coward · · Score: 0

    http://www.sun.com/software/products/messaging_srvr/home_messaging.xml

    The Sun Java System Messaging Server is a high-performance, highly secure messaging platform--the leader in the service provider messaging market. Scaling from thousands to millions of users, the Java System Messaging Server is suitable for both service providers and enterprises interested in consolidating email servers and reducing total cost of ownership of communications infrastructure. It also provides extensive security features that help ensure the integrity of communications through user authentication, session encryption, and the appropriate content filtering to help prevent spam and viruses.

  449. What does that have to do with anything? by Some+Random+Username · · Score: 1

    First of all, no, your RAM will not be enough to cache everything. Just like it won't with a database. You will end up with more RAM dedicated to caching with just filesystem, since a database takes up a bunch of RAM for other things.

    And your last part makes no sense at all. Of course doing find will be slow. Just like select * would be slow on a table with a million rows. You shouldn't be trying to access everything at the same time no matter where you stored it.

    And your mkdir problem is likely just because you don't have enough inodes. If you are creating a filesystem to store a lot of files and directories, you will create one with enough inodes to have them all.

    And of course, you don't want to make a million directories in the same dir. Make a few thousand directories (one for each domain) and then have each of those contain a few hundred of thousands maildirs (the users for that domain).

    1. Re:What does that have to do with anything? by Some+Random+Username · · Score: 1

      Blech, that should say "contain a few hundred or thousand maildirs", not "few hundred of thousands maildirs". Time for a coffee.
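
      Roughly what I mean by the layout, as a sketch. The /var/mail root and the maildir_path name are just assumptions for illustration, not any particular MTA's convention:

```python
from pathlib import PurePosixPath


def maildir_path(root, address):
    """Map user@domain to root/domain/user/Maildir (hypothetical layout)."""
    local, _, domain = address.lower().rpartition("@")
    if not local or not domain:
        raise ValueError(f"not a full address: {address!r}")
    for part in (local, domain):
        # Refuse anything that could escape the mail root.
        if "/" in part or part in (".", ".."):
            raise ValueError(f"unsafe address component: {part!r}")
    return PurePosixPath(root) / domain / local / "Maildir"


print(maildir_path("/var/mail", "Alice@Example.com"))
```

      One directory per domain at the top, one per user under that, so no single directory ever holds a million entries.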

  450. Nicely done. by Some+Random+Username · · Score: 1

    Feel free to ignore reality and pretend you need to build a database. Those "data structures" are called files, and the filesystem is already written, and it takes care of them for me. Pretty handy huh? And decent imap servers already have indexing, so that's taken care of too. Oops, you suddenly get all the benefits a database would give you, without the huge overhead.

  451. Do you even read what you write? by Some+Random+Username · · Score: 1

    If the files are not in the buffer cache using fs storage, then they would also not be in the DBs cache using a db for storage. You will have LESS RAM available for caching data if you use a database, since you now have all the other stuff you don't need from a database using up RAM too.

    If a fileserver is going to "choke on a flood of ... disk ops" then so will a database server, or does it have magic powers to avoid accessing disks? Or are you trying to compare a fileserver with 512MB of RAM vs a database server with 16GB of RAM?

    And the majority of mysql installations out there might well be used to provide an SQL frontend to simple, non-relational data, but that is definitely not the case with real databases. As I have explained repeatedly now, the only thing a database will get you is faster searches (SEARCHES, not ACCESS), and that is entirely because of the indexing. Use an imap server that does indexing and suddenly the database is offering you nothing.

    1. Re:Do you even read what you write? by Wdomburg · · Score: 1

      If the files are not in the buffer cache using fs storage, then they would also not be in the DBs cache using a db for storage. You will have LESS RAM available for caching data if you use a database, since you now have all the other stuff you don't need from a database using up RAM too.

      The difference here is that in a database you're caching the index - i.e. the information you need for message operations - and not the message itself. Filesystem caching typically works on a block level only. With ext3, for example, the minimum block size is 1k (and defaults to 4k). The relevant metadata is much much smaller.

      On top of that, most operating systems will read ahead a few blocks on top of the requested information.

      Having an index implemented by the IMAP daemon certainly solves some problems, if the underlying message format encodes metadata into the filename. That way you're only stat-ing the directory and inspecting unchanged messages. The big problem with that approach, though, is that you're now indexing on access, not on delivery, which means you're pushing the machines harder during peak use periods, rather than taking advantage of spare cycles throughout the day.

    2. Re:Do you even read what you write? by Some+Random+Username · · Score: 1

      "The difference here is that in a database you're caching the index - i.e. the information you need for message operations - and not the message itself."

      No, that's not a difference, that's exactly how it is with an imap server with indexing. Your indexes can be in RAM if you want, or they can be in files, which will be in the filesystem cache if they are accessed often. The actual message itself could very well be in RAM with either a filesystem or a database if it has been accessed recently.

      "With ext3, for example, the minimum block size is 1k (and defaults to 4k). The relevant metadata is much much smaller."

      So, if your indexes really are that tiny, keep them in memory, not in files. And I wouldn't suggest using ext3 either.

      "The big problem with that approach, though, is that you're now indexing on access, not on delivery, which means you're pushing the machines harder during peak use periods, rather than taking advantage of spare cycles throughout the day."

      I'm not sure if you are familiar with maildirs, but there is a "new" directory where all the new mail goes. Once someone reads it, it is moved to the cur directory and indexed. Indexes are updated when the files are altered, just like with a database. If you really need your unread mail indexed as well, you could certainly change your LDA to do that for you.
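
      A toy version of that flow, as a sketch only: deliver into tmp/ then new/, and index the message when it gets moved to cur/. The function names and the dict index are my own illustration, not how any real IMAP server stores its indexes. The ":2," suffix is the standard maildir info marker (so this wants a Unix filesystem).

```python
import email
import os


def make_maildir(path):
    """Create the three standard maildir subdirectories."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(path, sub), exist_ok=True)


def deliver(path, unique_name, data):
    """Write the message to tmp/, then rename it into new/ in one step."""
    tmp = os.path.join(path, "tmp", unique_name)
    with open(tmp, "wb") as f:
        f.write(data)
    os.rename(tmp, os.path.join(path, "new", unique_name))


def collect_new(path, index):
    """Move everything in new/ to cur/ and record each Subject in the index."""
    moved = []
    new_dir = os.path.join(path, "new")
    for name in sorted(os.listdir(new_dir)):
        src = os.path.join(new_dir, name)
        with open(src, "rb") as f:
            subject = email.message_from_binary_file(f).get("Subject", "")
        cur_name = name + ":2,"  # maildir info suffix, no flags set yet
        os.rename(src, os.path.join(path, "cur", cur_name))
        index[cur_name] = subject
        moved.append(cur_name)
    return moved
```

      If you want unread mail indexed too, the same index update can be called from deliver() instead, which is the "change your LDA" option.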

  452. Stalker's CommuniGate Pro by nazgul@somewhere.com · · Score: 1

    They call it a mail server, but it includes web, calendar, ftp, radius, pop3, imap, wap, smtp and just about every other relevant RFC you can think of. It includes Outlook compatibility (calendaring as well) and runs on just about anything. And, more importantly for what you want, it scales very well, supporting large volumes of incoming email, millions of users, and multi-machine clustering.

    I've used it for many years on somewhere.com. Not many accounts, but it handily bounced (and spamtrapped) as many as a million messages a day.

  453. Just give everyone. . . by etn991 · · Score: 1

    a Gmail account.

  454. Only search within the user's subdirectory (duh) by UnapprovedThought · · Score: 1

    you're going to be heavily limited by the disk speed and processor time

    Adding a database layer makes it even worse, typically increasing the chance the box will start swapping, while helping to drain the CPUs, and eat most of the memory you could have used for an index cache.

    Not good when you want to show a list of emails and the user attempts to sort by something, or search for that email from three years ago.

    A poorly written SELECT statement can also be a very effective way of slowing a large email database to a crawl.

    I'd like to introduce you to (my little friend...) the concept of creating subdirectories as a way of organizing data. :)

    Each user could have their own subdirectory. There is no need to store everything in a single directory though -- the subdirectories could be further subdivided based on month or even day. The filenames themselves could be chosen so that commonly searched fields are available without needing to search the contents of the files. You don't have to search through the whole 2G of emails (just because Deus-Ex-change does that doesn't mean that you have to do it that way).
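
    Something like this, as a rough sketch. The year/month/day layout and the function names are only one possible convention I'm assuming for illustration:

```python
from datetime import date, timedelta


def bucket(day):
    """Relative directory for one day's mail, e.g. 2006/01/02 (assumed layout)."""
    return f"{day.year:04d}/{day.month:02d}/{day.day:02d}"


def buckets_between(start, end):
    """Directories a search has to open for an inclusive date range."""
    days = (end - start).days
    return [bucket(start + timedelta(days=i)) for i in range(days + 1)]


# A "last few days" search only opens a handful of directories
# under the user's own subdirectory, not the whole mail store.
print(buckets_between(date(2005, 12, 30), date(2006, 1, 2)))
```

    A search with no date hint still has to walk every bucket for that user, so this only pays off when the query narrows the range.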

    Since 99% of searches are looking for something that happened the same day or a previous week, I don't think it would get bogged down that easily (but I'm willing to listen if you can find an exception).

    (And this is without commenting on some of the bloat monstrously hypocritical idiots have tried to add to some of the common Unix utilities, but of course there are non-bloated versions around that run much faster. No forced obsolescence for me, thank you.)

  455. Re:Only search within the user's subdirectory (duh by AKAImBatman · · Score: 1

    Adding a database layer makes it even worse, typically increasing the chance the box will start swapping, while helping to drain the CPUs, and eat most of the memory you could have used for an index cache.

    Um no. Adding an index to the data (what a DBFS really is) would speed up the search, not slow it down. The idea is to find the information you need as fast as you possibly can. There is no way that walking an index is going to be slower than churning through 2 gigabytes of data.

    I'd like to introduce you to (my little friend...) the concept of creating subdirectories as a way of organizing data. :)

    Um, no again. When I refer to 2GB of mail, I mean 2GB per user. When I was in tech support (back when drives weren't much larger than 2GBs!) we constantly had to repair the PST files of some poor sap who had inadvertently gone beyond the storage capacity of a single PST file. (FYI, PSTs corrupt silently instead of complaining about being full. It's quite annoying.)

    There is no need to store everything in a single directory though -- the subdirectories could be further subdivided based on month or even day.

    To what end? If I'm searching my entire mailbox, I still need to churn through all those subdirectories. Not to mention that your organization scheme may be counter-intuitive to me. I may prefer to organize my mail by project instead of date. (In fact, I really don't know of anyone who orders their mail by date.)

    The filenames themselves could be chosen so that commonly searched fields are available without needing to search the contents of the files.

    Or, the meta-data of a database file system could extract the necessary components, index them, and store them with the file itself. No need to munge one type of meta-data (filename) to support other types of meta-data (subject, from, to, etc.). Not to mention that your scheme hangs by a very loose thread. What happens if a user decides to rename the file?

    Question, did you read the link I gave in the great grandparent post? You may find it informative.

    Since 99% of searches are looking for something that happened the same day or a previous week,

    No, that's not the normal pattern I see. Most searches are an attempt to find some obscure piece of information that's been lying dormant for years. For example, if a coworker gives me a username and password for use when they're unavailable, it could be anywhere from months to years before I need that information. Other examples include procedure documents, URLs, code documentation, and project information needed by follow-up projects.

    That's what makes GMail so effective. None of that info is lost. It's all indexed and tagged so that you can easily search for it in the future. Filesystems should be able to replicate that experience.

  456. RE: Infrastructure for One Million Email Accounts? by Anonymous Coward · · Score: 0

    Look at Samsung Contact
    http://www.samsungcontact.com/

  457. IMAP & privacy by Craig+Ringer · · Score: 1

    That seems justified.

    I'm in the useful position of being the BOFH and mail admin at work - at a fairly flexible and reasonable workplace - so I'm able to use my IMAP mailbox without fuss.

    I can imagine that may not be true for some. However, POP3 won't help you - it's still gone through your company's mail filters, been logged, and if the company is really dodgy been scanned for "flag" words / analysed or even had a copy stored. The download protocol doesn't matter - either your company is not reading your mail, or they are.

    The point is that if your company doesn't permit the use of work email for personal stuff, you're generally better off following that. Even if it's not reasonable - because they're not likely to be reasonable about it if there's a problem, either.

    1. Re:IMAP & privacy by iMacorIBM · · Score: 1
      Agreed, I have no expectation of privacy with corporate email. Still under 30, I have some silly friends, or rather don't autodelete based on the words 'Warning' in the Subject line. Rather I relocate, have a good laugh, knowing that things aren't being watched, and the next time my quota comes up short, there's no excuses sitting around in my mail folders.

      I guess in my experience, assisting with mail administration here and there, and having watched a couple of dismissals, often the 'mail review' comes long after the fact.

      An employee who wants a full defense to any allegation regarding work would like the benefit of knowing that the 'ol IMAP folders don't have any of those funny jokes from the competition.

      iMac

  458. Re:Army Knowledge Online does it for 1.72 million by Anonymous Coward · · Score: 0

    I have to call bullshit. 1.72 million users and less than 3 million messages a day? So each user on average only sends and/or receives less than 2 messages per day?
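
    The arithmetic, as a quick sketch, taking 3 million as the upper bound on daily messages and spreading it evenly over all accounts (which is an assumption, since nobody said how many accounts are active):

```python
def messages_per_account_per_day(messages_per_day, accounts):
    """Average daily messages per account, assuming an even spread."""
    return messages_per_day / accounts


average = messages_per_account_per_day(3_000_000, 1_720_000)
print(f"at most {average:.2f} messages per account per day")
```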

  459. Re:Army Knowledge Online does it for 1.72 million by Eol1 · · Score: 1

    It's not bullshit and you obviously haven't been around the Army. Every troop has an AKO account, maybe (and I say maybe) 20% of them log on a day and maybe 20% of them actually send emails with it. Every unit in the army still runs its own independent unit email system (usually m$ exchange) which handles the vast bulk of US Military email traffic. NETCOM tried to force the issue with PKI CAC implementation but the Army resisted and DA G6 never backed them up. AKO is a good idea done badly and lacks serious command support but that is OT.

    AKO really does have ~2 million accounts and maybe ~3 million emails daily ... active accounts is a different item though.

    --
    De Oppresso Liber
  460. Non-obvious by Anonymous Coward · · Score: 0

    Except when the document is opened, changed, and version control added you DO have 900 different instances. Gee, aren't you glad that feature saved you?

  461. Samsung Contact by brainee28 · · Score: 1

    I run Samsung Contact, which is an implementation of HP Openmail. It was designed to scale at least 32000 users per server, and can be used with a webclient, with POP3, IMAP, and Outlook via MAPI. My users think it's exchange, but it isn't. All running off a Linux server

  462. Nothing Scales like this baby.. by Anonymous Coward · · Score: 0
  463. Cyrus IMAP by NekoXP · · Score: 1


    Designed for it, aggregate servers and load sharing, the works.

  464. How we do it... by wolf31o2 · · Score: 1

    Where I work, we have a system similar to this. I noticed that nowhere do you list virus scanning and spam blocking. Of course, offering these services will *dramatically* increase the needs for infrastructure.

    We use several sets of servers. The first servers are our MX servers. We have 5 of these and they process all incoming mail from the Internet at large. These servers are responsible for two things, delivery to local accounts, and processing incoming mail. They are all behind a Cisco CSS load balancer and have a limit set on their connections. There's a good reason for this, to be explained later. Besides the MX servers, we have 3 MQ servers. The MQ servers have no limits on their connections, as they are spill-over servers only, designed to queue up mail. This ensures that our spam and virus scanner servers do not become overloaded, even during a large spam attack, as the MQ servers must traverse their own, even more limited ingress on the load balancer. We also have SMTP servers, which are used by internal customers for sending mail. We have 3 of these. These servers also support SMTP AUTH over SSL/TLS for customers when they are off-network.
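
    The spill-over idea, boiled down to a sketch. This is a simplification for illustration, not our actual load balancer config; the server names, the limit value and pick_server itself are all made up:

```python
def pick_server(mx_connections, mx_limit, mq_connections):
    """Choose where an incoming SMTP connection goes.

    mx_connections / mq_connections map server name -> active connections.
    MX servers are capped at mx_limit; MQ servers take whatever is left over.
    """
    available = {name: n for name, n in mx_connections.items() if n < mx_limit}
    if available:
        # Least-loaded MX server that still has headroom.
        return "mx", min(available, key=available.get)
    # Every MX server is at its cap: spill over to the least-loaded MQ server.
    return "mq", min(mq_connections, key=mq_connections.get)


print(pick_server({"mx1": 40, "mx2": 12}, 50, {"mq1": 3}))
print(pick_server({"mx1": 50, "mx2": 50}, 50, {"mq1": 3, "mq2": 1}))
```

    The point of the cap is that a spam flood fills up the MQ queues instead of the scanners; the MQ boxes then drain slowly through their own, tighter ingress.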

    All of these servers are served by a load balanced set of SV servers, or our spam and virus scanning servers. These servers are running your usual concoction of mimedefang/spamassassin/clamav and are used by both the SMTP and MX servers. We currently have 12 of these to keep up with peak loads. All mail servers are running Sendmail, as its milter interface has performed much better in our tests than any other MTA. Of course, the exact configuration of the servers is a bit of a secret, but we have separated queues to keep emails from filling up the queues. Each of our 3 queues has 10 sub-directories, to keep the number of actual files in each directory down and to limit disk I/O on such large directories. Filesystem choice makes a big difference here, so you'll want to figure out your average email type and determine what filesystem to use based on this. The more RAM you have, the better.
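
    One way to spread queue files over sub-directories, as a sketch only. This is not Sendmail's own mechanism and not our real configuration; the CRC32 hashing, the "q0".."q9" names and the example queue ID are assumptions for illustration:

```python
import zlib


def queue_subdir(queue_id, n_subdirs=10):
    """Pick a queue sub-directory for a message, stable for a given queue ID."""
    return f"q{zlib.crc32(queue_id.encode('ascii')) % n_subdirs}"


def queue_path(queue_root, queue_id, n_subdirs=10):
    """Full path of the queue file under its hashed sub-directory."""
    return f"{queue_root}/{queue_subdir(queue_id, n_subdirs)}/{queue_id}"


print(queue_path("/var/spool/mqueue", "k0A1b2C3012345"))
```

    Ten sub-directories means each one holds roughly a tenth of the files, which keeps directory scans short on filesystems that search directories linearly.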

    For mailbox checking, we have 3 sets of servers: our POP3 servers, of which we have 5, our 4 IMAP servers, and our 3 webmail servers. The webmail servers are running an IMAP proxy. Of course, all of these services are behind a load balancer. We even use the load balancers between services, such as webmail/IMAP. We use Courier IMAP, squirrelmail, and nupop for these, though all are heavily modified to support features which aren't necessarily needed outside our environment, such as automatic username munging based on originating IP. Backend storage is provided by NetApp Filers.

    This services about 500,000 email customers.

  465. Use Mirapoint by michaelamdavies · · Score: 1

    Appliance
    Reliable
    Low overhead
    No downtime
    Blindingly fast

    A very very happy customer over several years...

  466. Re:Only search within the user's subdirectory (duh by UnapprovedThought · · Score: 1

    First, let me say that this argument may have already been carried out quite eloquently and in greater depth in other threads. You may want to check them out. Maybe your view is already represented. Not that I want to duplicate many of the answers already given, but...

    Adding an index to the data (what a DBFS really is) would speed up the search, not slow it down

    How does a DBFS index materially differ in any way from the existing indexing system of filesystem i-nodes (index nodes anyone? :) directories, caches and buffers, in a way that would matter to this application?

    When I refer to 2GB of mail, I mean 2GB per user

    Sorry, somehow I misread that. But, my line of reasoning still applies. It's not the size of your attachments but the wisdom of your indexing method :)

    What happens if a user decides to rename the file?

    It may be useless to argue about the details of specific implementations, but is this a critically useful feature for an email user? And why would you want to have your front end permit renaming files? That would seem to be a strange feature to have for an email client, let alone an email server, especially since the email has already been delivered. And why do you see the use of naming conventions as hanging on a loose thread? Metadata seems to work pretty well for websites, and the naming conventions get pretty arcane there.

    Most searches are an attempt to find some obscure piece of information that's been lying dormant for years

    I reject this. I think the most frequent search is to scroll up and down the page with the most recent emails of that day to look for a recently received email, and then to click on it and read it. That most frequent of searches is right under your nose -- perhaps that's why you didn't consider it. In wider searches, searching for data that's being automatically categorized for you is by far easier and faster than "look under every rock to search everything for me." If I know it was from last year, why should I wait for Clippy to search through all of this year's stuff? We don't have to work like Deus-Ex-Change on this one. Especially when you know where they're headed with it, and that they're indifferent (by design) to the impact of their relentless bloating.

    Whether or not a DBFS is the right tool for the job depends more on how well tested it is, and less on the other incidentals. If it permits a bunch of extra, marginally useful indices but the performance doesn't change much, and instead it adds another layer of library bloat and new unknown bugs to place the 99.9% uptime requirement at risk, then why should anyone use it? Maybe a better question would be -- what does it do best, and is this really a case of it?

    Some final questions to ponder: who is watching out that all of the new science being created isn't simply a renaming of old science principles? Is there any incentive to recognize the contributions that have come before, or is it more profitable to convince people that it is "new" somehow? If the old science isn't being taken into account, then, is the new science really science or just namespace cruft?

  467. Re:So which delivery agent automatically hardlinks by AnyoneEB · · Score: 1

    Another poster mentioned two mail servers which use hard links for this purpose.

    --
    Centralization breaks the internet.
  468. OpenMail^H^H^H^H^H^H^H^H Scalix by richi · · Score: 1
    Lots of users? Some corporate, some personal, some free? POP, IMAP, and webmail? High uptime? Sounds like Scalix.

    OpenMail (on which Scalix was based) scaled to insane levels compared with Exchange, and Scalix should be the same. If we're talking consumer ISP-style workloads, you should be able to approach 100K users on a smallish Intel server. The key is to have a decent SAN, as previous posters have pointed out.

    Scalix can support just about every Outlook feature that Exchange can (forms being the notable exception). Any mailbox can be used with POP, IMAP, Outlook/MAPI, or the Scalix web client (SWA). SWA is an AJAX client, with a look'n'feel close to Outlook.

    Scalix quotes 99.99% uptime, and I saw even better in OpenMail days. Again, a good SAN is a must.

  469. groupwise by Anonymous Coward · · Score: 0

    GroupWise could be the solution: no viruses, and there are versions for Linux, Windows, NetWare, etc.

    It's from Novell.

  470. News to me. by CFD339 · · Score: 1

    It's news to me. Of course, less than 56% of the Exchange market has gone past version 5.5, so I suppose that could be the reason I'm unaware of the change.

    --
    The problem with quotes on the internet, is that nobody bothers to check their veracity. -- Abraham Lincoln
  471. Re:Only search within the user's subdirectory (duh by killjoe · · Score: 1

    "Um no. Adding an index to the data (what a DBFS really is) would speed up the search, not slow it down."

    There is no reason why the plain old files in the filesystem could not be indexed. See Spotlight for an excellent example.

    --
    evil is as evil does
  472. oh please by Anonymous Coward · · Score: 0

    We have 1000 users with one admin who devotes maybe half of her time to maintaining Domino. Outage? Try every 12 months when we successfully upgrade to the latest version of Domino. BTW, Domino 7 came out this week - we'll be upgrading by year end. Why are you still fumbling around with old versions?

  473. LOOK WHO IS REALLY DUMB! [IT'S ALBERT P.] by Anonymous Coward · · Score: 0


    Yes, look who really is dumb. Albert, what is wrong with you??? Can't you just tell him politely that he made an error like a civilized person? No, you can't. You have to be ubercranky and shout in his face. You barbaric twit.

    And BTW, what about your sig? You don't use "tonight" in the past tense. "DID YOUR MOM SERVE YOU AN EXTRA HELPING OF DUMB THIS EVENING" maybe, or "WILL YOUR MOM SERVE YOU AN EXTRA HELPING OF DUMB TONIGHT" (although that one doesn't really make much sense).

    You're an idiot and a troll, Albert. Go away and never come back, or, as you said, go commit suicide.

    --
    Bonk the Zonk! TMM for editor!
    Trolling all trolls since 2001.

  474. Re:Sun's Java Messaging Server (AKA Netscape/iPlan by larsu · · Score: 1

    We use it here at our university. The JES Suite is pretty expensive for a million users, but much more affordable with .edu pricing. Scales _really_ well.

  475. Recommendation for determining a specification by TIK_ · · Score: 1

    A significant set of comments here approach this problem from the server and sysadmin side. None of them approach the problem from the user interface or usability side. How can your implementation be successful if your users (up to 1 million) are unhappy with the way they are required to use it, or if it's unusable to them? (That is very common among people who are not as technically inclined as you are, and you are almost certain to have some with that many email accounts.)

    First you need to determine how the users will access their email, and how they are going to use it. Will this be webmail, a client app, a PDA, etc.?

    Then you need to determine what the user requirements are:

    • Calendaring support (this will be nasty if application integration is required)
    • Shared contacts
    • Webmail (and what browsers will be used)
    • SSL/TLS
    • SMTP Authentication
    • Roaming users (anyone not on the intranet)
    • IMAP, IMAPS
    • POP3, POP3S

    Also determine what applications they will be using to connect to this system, such as: emacs, pine, mutt, Thunderbird, Outlook Express, MS Outlook, Opera, Evolution, Mail.app, etc.
    If possible, try to enforce a policy restricting the use of email clients to a small subset, but remember that there may be users on Macs, *nix PCs, Windows PCs, and potentially others. (NB: Avoid allowing Outlook Express if you wish to use IMAP.)

    Determine your security requirements for the mail system. Is everyone required to connect using SSL encrypted links?

    Determine your minimum service levels required (99.9% uptime or higher; note that every 9 beyond the first three can be expected to double the cost of the solution).
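
    To put numbers on the service-level point, here is a minimal sketch of the downtime budget each "nine" allows per year, together with the cost-doubling rule of thumb from the paragraph above. The function names are made up for illustration, and the cost multiplier is only that rule of thumb, not a pricing model.

```python
# Downtime budget per year for a given number of "nines" of availability,
# plus the post's rule of thumb that each nine beyond three doubles the cost.
MINUTES_PER_YEAR = 365 * 24 * 60  # ignores leap years


def downtime_minutes_per_year(nines):
    """Allowed downtime in minutes per year, e.g. nines=3 means 99.9%."""
    return MINUTES_PER_YEAR * 10 ** (-nines)


def relative_cost(nines, baseline_nines=3):
    """Cost relative to a three-nines system (rule of thumb, an assumption)."""
    return 2 ** max(0, nines - baseline_nines)


for n in (3, 4, 5):
    print(n, "nines:", round(downtime_minutes_per_year(n), 1),
          "min/year, cost x", relative_cost(n))
```

    Under these assumptions, 99.9% leaves a little under nine hours of downtime a year, and each further nine cuts that budget by a factor of ten.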

    Determine support levels for hardware with respect to warranty, part availability, technician availability etc.

    Determine backup requirements. Are you required to be able to restore individual emails, individual mailboxes, or all mailboxes, and how many levels of backups are required? Do you need to be able to restore emails deleted 4 months ago?

    Quota requirements: are there limits on the size of a person's mailbox, can this be customized, and are there limits on the size of an email a user can send or receive? Will you allow a user to store 2+GB of email on your system?

    Determine other legal requirements, such as a requirement to be able to retrieve any email sent through the system for auditing/legal purposes

    Determine effectiveness of antivirus filtering and how many levels of antivirus filtering will you require to ensure robustness and the correct level of user protection?

    Determine level of spam filtering required (generic, user specifiable, with headers, without headers).

    Do you require mailing lists or distribution lists (Mailman?)

    How many physical sites will be accessing this mail system (one office? multiple branches?)

    Will you be requiring a support ticketing system? (example: RT from http://www.bestpractical.com/)

    Will users be able to customize their mail settings (enable/disable Bayesian spam filtering, custom antispam rules, setting of spam thresholds, autoresponder messages, out of office replies, disable/enable spam filtering, disable/enable antivirus filtering)?

    What level of redundancy are you required to have? Do you need to provide redundant systems even if one datacentre is disconnected (somehow)?
    For example, if the main datacentre you use in the UK is disconnected for some reason outside your control, do your roaming users in the UK still need to be able to access their email without any loss through an alternative backup mail system in the US?

    Can your users be split up into multiple sub-domains, i.e. production, hr, finance, lists, support, technical, development, etc.? And will they notice, or can you hide it from the user with a simple server-side rewrite?

    How are you going to measure the performance of the system once in place. wrt disk space, amount of connections, upti

  476. Oracle/Sun Solution by BlueQuark · · Score: 1


    Hi,

    I would start looking at

    Hardware:
    two Sun Fire 2900s or 6900s
    Hitachi (HDS) Storage Server
    EMC DMX 1000

    Consider Solid Data solid-state disks for message queues.

    Software:

    Oracle 10g, collaboration suite.
    Veritas Cluster Server RAC Edition
    Veritas VVR - EMC's SRDF is over priced
    Veritas Foundation Suite DB Edition for Oracle.

    Result:

    Plenty of horsepower. Highly scalable and reliable.

    Cost: See your Sun, EMC, Veritas and Oracle dealers.

    Seriously, this is what I would do and I've had quite a bit of experience building large messaging systems. The above combination usually works well.
    Avoid Linux, unless you like a lot of overtime.

  477. Re:Only search within the user's subdirectory (duh by AKAImBatman · · Score: 1

    How does a DBFS index materially differ in any way from the existing indexing system of filesystem i-nodes (index nodes anyone? :) directories, caches and buffers, in a way that would matter to this application?

    The "index node" provides an index tree for a very simple type of query: The filesystem hierarchy. (This was addressed in the article I linked to.) However, the inodes provide no real information about a file other than its location, unless they are extended to include meta-data attributes. Using the email example, "To", "From", "CC", "Received", "Sent", and "Subject" are all meta-data fields you might expect to find in an email. With the meta-data, I can ask the index, "from:UnapprovedThought@gmail.com" and get back an instant result. Without the index, I have to churn through every email file on disk. Not to mention that I need to *parse* each file to find the info I'm looking for.
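
    A rough illustration of the difference being described: the sketch below keeps an in-memory index keyed on header fields and compares it with re-parsing every message. It is not a DBFS, only a toy using Python's standard `email` module; `HeaderIndex`, the field list, and the sample addresses are invented for the example.

```python
from collections import defaultdict
from email import message_from_string


class HeaderIndex:
    """Toy metadata index: (field, value) -> set of message ids."""

    FIELDS = ("from", "to", "cc", "subject")  # assumed set of indexed headers

    def __init__(self):
        self._index = defaultdict(set)
        self._raw = {}

    def add(self, msg_id, raw):
        """Parse a message once at delivery time and index its headers."""
        self._raw[msg_id] = raw
        msg = message_from_string(raw)
        for field in self.FIELDS:
            value = msg.get(field)
            if value is not None:
                self._index[(field, value.strip().lower())].add(msg_id)

    def lookup(self, field, value):
        """Indexed query: one dictionary lookup, no parsing."""
        key = (field.lower(), value.strip().lower())
        return sorted(self._index.get(key, ()))

    def scan(self, field, value):
        """Unindexed query: parse every stored message."""
        wanted = value.strip().lower()
        hits = []
        for msg_id, raw in self._raw.items():
            found = message_from_string(raw).get(field)
            if found is not None and found.strip().lower() == wanted:
                hits.append(msg_id)
        return sorted(hits)


idx = HeaderIndex()
idx.add("m1", "From: UnapprovedThought@gmail.com\nSubject: DBFS\n\nbody one")
idx.add("m2", "From: someone@example.com\nSubject: DBFS\n\nbody two")
print(idx.lookup("from", "UnapprovedThought@gmail.com"))
```

    Both `lookup` and `scan` return the same ids; the difference is that `scan` does work proportional to the number of stored messages on every query.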

    The scheme of using the filename does alleviate the situation somewhat, but it still is not tremendously fast (lots of I/O here), you still need to parse each filename, and it is limited in the number of fields it can contain.

    It's not the size of your attachments but the wisdom of your indexing method :)

    But you are advocating no index. I'm advocating an index built into the file system. Which is it?

    And why would you want to have your front end permit renaming files?

    You have no choice. If the user can see a file on disk, they can rename it. Your options are to extend the OS GUI to prevent the user from taking such action, or work with the user so that the client and the file system show consistent views. In a DBFS, the two *will* be consistent. So much so, that you may not even need an email client, or the client may be nothing more than a specialized version of the file browser. :-)

  478. Re:Only search within the user's subdirectory (duh by UnapprovedThought · · Score: 1

    I had scanned your article a while back looking for the technical guts of DBFS but lost interest in it for some reason.

    In any case, an inode also provides a degree of locality, sort of like a stake in the ground. Files described by the inode are likely to huddle physically near that inode on the disk, and therefore quick to access if the disk head (a relatively slow moving object) is already hovering around that spot. Perhaps the "query result" is even within the same block or track that has been already read by accessing the inode, so that no further physical disk activity is needed. Replacing this with a non-locality based system may not result in similar performance, especially if a logical to physical hash is used to compute an actual physical location, and the location is far away from where the index is stored. It also opens the door to poor worst-case performance. In short, I would need to see comparative benchmarks of best-case and worst case before I believe this would be as you claim, both in single case "queries" and for a heavily loaded server having to handle a real-world load of multiple queries, inserts, deletes and updates at the same time, on different parts of the disk.

    With the meta-data, I can ask the index ... and get back an instant result.

    This sounds simplistic, as if the disk hardware and its natural latency were somehow absent from the picture. If it is instant, as you say, then there is also a fair chance that the same query would return instantly for a non-DBFS. Namely, because the block the information is in, is already cached...

    Without the index, I have to churn through every email file on disk.

    The filename solution avoids most of the churnings. As do a finite number of indices. But, at some point the DBFS will also have to churn through the disk for a randomly chosen string in the body of the email message. Or, you could technically index everything, but I doubt you'll do that, based on the fact that these are large chunks of mostly random data coming in, and you'll be spending all of your time updating the index with each email that comes in and wondering why your new server is so slow. Under these two (at minimum) competing processes, physically, the disk head will be flying to-and-fro from the location of the index to the location of the data and back, and it's easy for me to think that this could be implemented improperly, and that the index itself would become fragmented, or inconsistent in a forced shutdown... in short, it could get very complicated, unreliable, unrecoverable, and yes, even slow.

    Parsing using a non-bloated language built on a minimum of non-bloated libraries is still going to be faster than the disk speed, especially since these are roughly linear time searches. (Obviously, I'm not talking about a Windoze design here, where the purpose is to see how much RAM can be consumed inside of buzzword-du-jour subsystems so as to get you to replace your computer that much sooner.)

    The scheme of using the filename does alleviate the situation somewhat...

    You have a gift for understatement.

    ...but it still is not tremendously fast (lots of I/O here),

    Once again you seem to ignore that this information will typically be in the cache. Your DBFS will have a cache, and any other filesystem will also have its cache. You will want to use cache space for your index, right? Thus, there won't be "lots" of difference, except maybe in the reverse if you have lots of indices that no one actually uses.

    ...you still need to parse each filename...

    Parsing a line of text isn't going to bring the heavens down. Especially when it is already in-cache and you can split the work up among several processors.

  479. Re:Army Knowledge Online does it for 1.72 million by sandwormusmc · · Score: 1

    Most Army installations only use their AKO webmail accounts for forwarding to their installation e-mail servers, most of which as far as I know use Exchange. Most of the time the only use of AKO webmail comes when the installation specific e-mail system is down for maintenance ... so the response is probably relevant for those who use AKO webmail on a regular basis, but not Army-wide.

    Judging by those numbers, though, I would say the Sun setup is great for forwarding, but not for high scale groupware.

  480. Re:Only search within the user's subdirectory (duh by AKAImBatman · · Score: 1

    The filename solution avoids most of the churnings.

    And also limits the amount of information that can be stored. You're going to bump up against the upper-limit of the filename length, just from the subject. Add an email with a large number of To's or CC's, and your filename solution breaks down.

    This sounds simplistic, as if the disk hardware and its natural latency were somehow absent from the picture. If it is instant, as you say, then there is also a fair chance that the same query would return instantly for a non-DBFS. Namely, because the block the information is in, is already cached...

    Of course it's not truly instant, but it's close enough. Take the Spotlight search system as an example. It produces a list of files, as you type. For a comparison of performance, go to '/usr/bin' on a Unix system and attempt to use tab completion in BASH. Notice how slow BASH is at retrieving the results?

    But, at some point the DBFS will also have to churn through the disk for a randomly chosen string in the body of the email message. Or, you could technically index everything,

    Hallelujah, he finally gets it! Yes, index everything. Or more precisely, the hook for handling email files would divide the file up into keywords which would be added to the index. The amount of storage for this would be the one-time cost of the word plus 4-8 bytes for each instance of the word found across the entire filesystem.
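
    The storage estimate above (each distinct word stored once, plus 4-8 bytes per occurrence) can be tried on a small scale. This is a sketch only: the tokenizer, the function names, and the two sample bodies are assumptions, and a real index would carry extra overhead that is not counted here.

```python
import re
from collections import defaultdict


def build_keyword_index(bodies):
    """Map each word to a list with one entry per occurrence (file id)."""
    index = defaultdict(list)
    for file_id, text in bodies.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].append(file_id)
    return index


def estimated_index_bytes(index, bytes_per_posting=8):
    """One-time cost of each word plus a fixed cost per occurrence."""
    word_cost = sum(len(word.encode("utf-8")) for word in index)
    posting_count = sum(len(postings) for postings in index.values())
    return word_cost + posting_count * bytes_per_posting


bodies = {1: "buy viagra now", 2: "viagra viagra"}
index = build_keyword_index(bodies)
print(estimated_index_bytes(index, 4), estimated_index_bytes(index, 8))
```

    Scaling the same formula up to a real mailbox needs the total word count and vocabulary size for that mailbox, which are not given in this thread.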

    You will want to use cache space for your index, right?

    Correct. But in the absence of the data being cached, it's much faster to burst read the index than it is to read through every file. Ideally, the head would never leave the platter while reading in the index, as opposed to studying each file on disk. Worst case for head movement in this situation would be O(fragments) for the index and O(files*fragments) for studying each file.

    Unless these are modelled as subdirectories so that they don't take up space in the filename.

    Mess, mess, mess. Not to mention that file systems still often have limits on the size of the full path.

    I doubt you'll do that, based on the fact that these are large chunks of mostly random data coming in, and you'll be spending all of your time updating the index with each email that comes in and wondering why your new server is so slow

    Who's talking about servers? I'm talking about clients. (Although servers would work just fine as well.) And updating the index is a minor amount of data to commit. As I said, the cost of the word, plus a 4-8 byte cost for each instance on the filesystem.

    Also, the DBFS has to read an extra index, a storage area a normal filesystem doesn't have to maintain, physically translate to, update, or use up cache space to store.

    Except that a Database File System would be built to maintain, physically translate to, update, and use cache space to store these indexes. Remember, I'm not advocating the use of a database on top of a file system. I'm advocating a more advanced file system that extends the current indexing capabilities.

    FYI, BeFS was a full database file system, HFS+ now has DBFS features, and NTFS has a great deal of DBFS features (despite Microsoft ignoring the features in OSes prior to Vista).

    You were worried about a parser, but here you've got a larger, more complicated filesystem driver taking up space in precious RAM, butting up against kernelspace and single-threadedly stuck to one CPU.

    None of the above. I realize that you decided my article was a snore-fest, but I actually suggested using FUSE to stick the file system in userspace. Which means that the process can be multi-threaded, multi-processor, and freely use pagable memory.

    I still don't see why the webpage/GUI writer, for instance, would be forced to explicitly throw in a rename button on the client (this is supposed to be a record of delivery, not a workgroup editing effort).

    All the user has t

  481. Re:Only search within the user's subdirectory (duh by UnapprovedThought · · Score: 1

    You're going to bump up against the upper-limit of the filename length...and your filename solution breaks down

    I can think of several ways to solve this offhand. If your filesystem has a fixed filename limit, or you can't tune your filesystem to increase the filename length, you can still store each email as a directory, with each field as a separate file within it, side-by-side with the message. If you don't want to do that, you can store as much as will fit and spill over longer fields to the message itself. Or, just have shorter fields. You could have a client-side address book that maps a hash id to recipient email address, etc. (There are a bunch more ways but I'm not going to bore you with them.)
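
    The first workaround (one directory per email, one file per field) might look roughly like the sketch below. The layout, the function names, and the use of a temporary directory are all assumptions made for the example, not a description of any existing mail store.

```python
import tempfile
from pathlib import Path


def store_message(root, msg_id, fields, body):
    """Store one email as a directory: one small file per header field."""
    msg_dir = Path(root) / msg_id
    msg_dir.mkdir(parents=True)
    for name, value in fields.items():
        (msg_dir / name).write_text(value, encoding="utf-8")
    # The message itself sits side by side with its fields.
    (msg_dir / "body").write_text(body, encoding="utf-8")
    return msg_dir


def read_field(root, msg_id, name):
    """Read a single field without touching the body or other fields."""
    return (Path(root) / msg_id / name).read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as root:
    # A recipient list far longer than a typical filename limit.
    long_to = ", ".join(f"user{i}@example.com" for i in range(500))
    store_message(root, "msg-0001", {"to": long_to, "subject": "hello"}, "hi")
    stored_subject = read_field(root, "msg-0001", "subject")
    stored_to_length = len(read_field(root, "msg-0001", "to"))
    print(stored_subject, stored_to_length)
```

    Field values are file contents here, so their length is not bound by the filename limit; the trade-off is more small files per message.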

    go to '/usr/bin' on a Unix system... Notice how slow BASH is at retrieving the results?

    No :) It was quite fast for me. You must be doing something wrong.

    Even if it is slow for you, your client won't typically be looking at a directory filled with a million little files (a disingenuous example if I may say so), because it will be smartly broken down into subcategories.

    Hallelujah, he finally gets it! Yes, index everything... The amount of storage for this would be the one-time cost of the word plus 4-8 bytes for each instance of the word found across the entire filesystem.

    Everything, eh? Well, if your entire filesystem is on a SAN, that's one huge global index that has to span a single immense partition for 1 million users. The index would be so big that you would likely not be able to fit the entire thing within the RAM of a single server. If that happens, your system is in even more trouble than I thought earlier. I figure a 1 million user system will have to handle 100 million incoming emails a day, most of them during the morning peak. So, the SAN drives have to keep revisiting the index even for reads, not just for writes. And all that just so you don't have to store the word viagra in a million different places. That is, DBFS will be slow unless it has some form of distributed filesystem capabilities.

    Ideally, the head would never leave the platter while reading in the index, as opposed to studying each file on disk

    You want to keep re-reading the index from disk, rather than from cache? That doesn't sound like the ideal or best case. The ideal is to avoid as much disk access as possible.

    I will glean from this that there are no benchmarks yet...

    updating the index is a minor amount of data to commit

    But the disk head has to move far away from where it was just to update a tiny thing. That's fine for a one-user system, but for a large email system it can gum up the works very quickly.

    Remember, I'm not advocating the use of a database on top of a filesystem

    Don't worry, I may be dense, but that much has sunk in. I can sympathize -- freeing you from your subdirectory inhibition has been just as difficult. :)

    I actually suggested using FUSE to stick the file system in userspace

    Yeah, it's either run in userspace or you will have a huge index (filled with stuff like viagra1, v1agra, etc.) that you can't swap out. But the downside is that you've replaced the base of your information pyramid (that ought to be stable enough to build on) with an unproven component. It also doesn't improve matters that this new filesystem is multithreaded if you're trying to debug a file or index corruption issue. While that's happening all 1 million users will be waiting for the system to come back up, because for the global index to remain consistent, the entire system has to go down if only a part of it has failed.

    DBFS sounds great for a small syste

  482. You don't need that much hardware..do the math... by nazzdeq · · Score: 0

    1,000,000 email users

    About 70% of users use system daily (if this is a web-based email like Hotmail, this is a high estimate)

    Each user reads 30 emails and sends 10

    A grand total of 19,600,000 emails per day (this is a high estimate)

    Only 226 messages either read or sent per second per day

    @ 100k avg. per message storage is 1.8 gb per day or 666 gb per year if you don't compress

    An Apple xServe dual G5 w/16gb of RAM and xRAID 5 TB will work just fine

    For uptime, two xServes w/ a load balancer both can send and receive, filter spam

    For the webmail piece, you can cache the whole website or poor man's way - make a ramdisk and copy the website to the ramdisk, whole site is cached.

    QMail or Sendmail take your pick, parse incoming mail via php/shell/perl script and round robin mail to diff mailbox dirs on the xRaid, cakewalk.

    Add a separate web server for added scalability

    Replace filesystem storage w/ an additional 2 node Oracle RAC cluster w/ email data partitioned over the xRaid to take advantage of additional high availability & disaster recovery features.

    Bottom line, if you have 20 admins for this and a room full of hardware, you need some more skills...hehe.
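
    The back-of-envelope inputs at the top of this post (1,000,000 accounts, 70% active daily, 30 reads and 10 sends per active user, 100k per message) can be plugged into a short script as a check. This is a sketch under stated assumptions: it treats 100k as 100,000 bytes and counts every message read or sent as stored once, and the function name is made up.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_load(accounts, active_percent, reads_per_user, sends_per_user,
               avg_message_bytes):
    """Return (messages per day, messages per second, bytes stored per day)."""
    active_users = accounts * active_percent // 100
    messages = active_users * (reads_per_user + sends_per_user)
    per_second = messages / SECONDS_PER_DAY
    bytes_per_day = messages * avg_message_bytes  # assumes each is stored once
    return messages, per_second, bytes_per_day


messages, per_second, bytes_per_day = daily_load(
    accounts=1_000_000,
    active_percent=70,
    reads_per_user=30,
    sends_per_user=10,
    avg_message_bytes=100_000,
)
print(messages, round(per_second), bytes_per_day / 1e12, "TB/day")
```

    With these inputs the script gives 28,000,000 messages a day, about 324 per second averaged over 24 hours, and roughly 2.8 TB per day at 100,000 bytes each, so storage sizing is very sensitive to the average message size and to how much of the traffic is actually retained.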