State Dept E-mail Crash After "Reply-All" Storm
twistah writes "It seems that a recent 'reply-all storm' at the State Department caused the entire e-mail infrastructure to crash. A notice sent to all State Department employees warned of disciplinary actions which will be taken if users 'reply-all' to lists with a large number of users. Apparently, the problem was compounded not only by angry replies asking to be taken off the errant list, but also by the e-mail recall function, which generated further e-mail traffic. One has to wonder if capacity planning was performed correctly: should an e-mail system be able to handle this type of traffic, or is it an unreasonable task for even the best system?"
Nope :)
Why do they have people sending to a list that anyone can reply to in the To: or Cc: lines? It should work one of three ways:
1) it should limit the number of recipients (after list expansion) in the To: or Cc: lines, so those mailing a large list must put it in Bcc:
2) it should only allow certain people to send to large lists (implemented as a whitelist)
3) it should massage things on the server so that a list called 'all-company-list' would show up in the To: or Cc: lines of recipients as 'all-company-list-reply', and the list admin and sender are the only ones who see the replies to all
Honestly, mailing lists are not new technology and this has been a solved problem for years. Because the mail admins are incompetent, they are forced to threaten employees!
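For what it's worth, here is a rough Python sketch of the three options above. It is not how any real mail server implements them; the cap of 100, the `Message` class, the `lists` and `allowed_senders` tables, and the example addresses are all made up for illustration.

```python
from dataclasses import dataclass, field

MAX_VISIBLE_RCPTS = 100  # hypothetical cap on expanded To:/Cc: recipients


@dataclass
class Message:
    sender: str
    to: list = field(default_factory=list)
    cc: list = field(default_factory=list)
    bcc: list = field(default_factory=list)


def expand(addresses, lists):
    """Replace each list name with its members (one level only, for brevity)."""
    out = []
    for addr in addresses:
        out.extend(lists.get(addr, [addr]))
    return out


def check_policy(msg, lists, allowed_senders):
    """Return None if the message is accepted, else a rejection reason."""
    # Option 1: cap the visible recipients after list expansion.
    if len(expand(msg.to + msg.cc, lists)) > MAX_VISIBLE_RCPTS:
        return "too many visible recipients; use Bcc"
    # Option 2: only whitelisted senders may post to a restricted list.
    for addr in msg.to + msg.cc + msg.bcc:
        if addr in allowed_senders and msg.sender not in allowed_senders[addr]:
            return f"{msg.sender} may not post to {addr}"
    return None


def rewrite_for_delivery(addresses, lists):
    """Option 3: show list names as '<name>-reply' so replies go to a moderated alias."""
    return [a + "-reply" if a in lists else a for a in addresses]
```

With a 500-member 'all-company-list', an ordinary user putting the list in To: is bounced by option 1, the same user putting it in Bcc: is bounced by option 2, and a whitelisted sender using Bcc: gets through.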
Dear State Department,
I'm sorry to hear about your recent trouble.
There is a brand new invention on the internet which has the ability to ease the strain on your mail servers. It is called a mailing list manager. One is called Mailman and can be found at: http://www.gnu.org/software/mailman
There are several others, some free and some non-free, but they exist for most server platforms. If you don't have the expertise in house to set it up correctly, you can get any number of consulting companies to help you out.
Yours faithfully,
Almost anonymous coward
OpenNet, from a very quick look on Google, seems to be their network name for the non-classified bits and pieces. Supposedly Microsoft + Cisco stuff.
Feel free to disagree, but please provide a URL reference to the OpenNet email server software vendor if doing so.. ;-)
According to this article, they were migrating to Exchange in 2001. If it was set up by admins who knew what they were doing, they could have set the perms on the distribution list so only authorized users could use it.
Read up on Bedlam. In annoying pathological cases, the user (agent) can't know who's on the DL or how big it is.
For some cases, it's probably possible for the user agent to do something slightly more intelligent, but it's a hard problem.
Yes, it's Exchange internally.
OpenNet is what they brand it as.
Feel free to correct me with evidence that this is no longer the case, but I know 2 Exchange servers there, and this says otherwise.
Exchange has the recall ability and so does Lotus Notes.
Most other servers do not have this feature, for very good reasons.
regards
John Jones
www.johnjones.me.uk my blog about email and digital communication
This is a configuration error, not a newsworthy event.
For sendmail, it would be a configuration directive in their sendmail.mc (or whatever theirs is called):
define(`confMAX_RCPTS_PER_MESSAGE', `100')dnl ... or a modified line in sendmail.cf:
O MaxRecipientsPerMessage=100
In MSExchange it would be a registry change
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\MSExchangeIS\ParametersSystem\Max Recipients on Submit
DWORD Value 100
Serious? Seriousness is well above my pay grade.
dude, transsexualism has nothing to do with being gay. most homosexuals aren't transsexuals. they're just males/females who are attracted to their own sex.
the city you are looking for is Trinidad, Colorado, which has been dubbed the Sex Change Capital of the U.S.
http://www.hanselman.com/blog/HowToEasilyDisableReplyToAllAndForwardInOutlook.aspx
2 simple lines that you can include in your Outlook client to prevent this action internally on your exchange server.
Note this does not include any macros in the email.
So he can just append a line to his users' .mozilla/thunderbird/chrome/userChrome.css and all works well.
You assumed that they mass emailed the notice and are incorrect.
As the article states, the notice was sent by "cable" which is the old telegram system and still the only official means of communication between the Department and US Missions overseas.
The cable system is on a completely separate classified network.
As the unfortunate recipient of the mail storm emails I will say that many people included information in their replies that referenced the cable (and subsequent Department Notice) telling people to stop hitting reply to all so you are not entirely incorrect. It is just that the Department was smart enough not to send out a blanket email to everybody.
The other thing that seemed to compound this problem was that the To: line didn't have any names or mailing list groups listed. People (idiots) didn't realize that they were emailing almost everyone in the Department.
I would also point out that the email servers slowed but I never experienced any lost email or service interruption. Some emails were delayed by as much as two hours.
Read the comments in the Exchange blog above. In a more complex environment it isn't always possible to determine how many users a DL will expand to. The DL may be stored in a different place (especially in multi-organization situations) and/or the ability to view the membership of the DL may be restricted (i.e. you can send to 'Google Legal Team' but you aren't allowed to see who that is).
So the client can *sometimes* warn about the number of recipients, but not in a guaranteed way. It is possible to restrict DL send permission to a certain set of users, but that requires the administrator to be thinking about that.
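To illustrate why the client can only sometimes warn, here is a small Python sketch of DL expansion where some lists refuse to reveal their members. The `directory` and `hidden` structures, the threshold, and the warning strings are hypothetical and do not reflect how Outlook or Exchange actually resolve lists.

```python
def expand_dl(name, directory, hidden, seen=None):
    """Return (known member addresses, exact), where exact is False if any
    nested list could not be expanded because its membership is hidden."""
    if seen is None:
        seen = set()
    if name in seen:
        return set(), True          # already visited: guards against list cycles
    seen.add(name)
    if name in hidden:
        return set(), False         # membership not visible to this sender
    if name not in directory:
        return {name}, True         # a plain address, not a list
    members, exact = set(), True
    for member in directory[name]:
        sub_members, sub_exact = expand_dl(member, directory, hidden, seen)
        members |= sub_members
        exact = exact and sub_exact
    return members, exact


def warning_for(recipients, directory, hidden, threshold=50):
    """Return a warning string for the sender, or None if nothing looks risky."""
    known, exact = set(), True
    for recipient in recipients:
        members, ok = expand_dl(recipient, directory, hidden)
        known |= members
        exact = exact and ok
    if len(known) > threshold:
        return f"This will reach at least {len(known)} people."
    if not exact:
        return "Recipient count unknown: some lists cannot be expanded."
    return None
```

The best the client can offer for a partially hidden list is a lower bound or an "unknown" notice, which is why restricting send permission on the server side is the more dependable fix.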
If, like approximately 1/2 of the American population, you currently had no health care at all, your attitude would probably be different.
A) Nobody has "no health care". There's always the emergency room -- in the US they can't turn you away for lack of ability to pay. This isn't ideal but it disproves your statement of "no health care at all"
B) There's ~47,000,000 Americans without health insurance. Out of a population of ~300,000,000. That's 15.67%, not 50%
Neither of those is ideal, but if you are going to post on a subject, at least get your facts straight.....
I want peace on earth and goodwill toward man.
We are the United States Government! We don't do that sort of thing.
Having been a witness to the incident in question, here's what happened:
1) Around December 30th a blank e-mail (with receipt request) went out to almost all users. Apparently it was from a single user with some malware etc. (we didn't get any further details).
2) The next day, the same blank message was sent out again (from the same user).
3) As people came back from vacations, we got a few "Please remove me from this list" and "What is this message" replies, sent as reply-all.
4) Then followed a bunch of "Me Too".
5) Then, a bunch of "Please, don't reply all" (sent, of course, reply-all).
6) Followed by a bunch of "remove me from this list".
and so on, and so forth, with no end in sight...
The initial message didn't have any virus or other "payload"; just a blank message that caused a bunch of confusion. The whole incident was actually pretty hilarious to watch.
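A back-of-the-envelope Python sketch of why this snowballs: every original, every reply-all, and every recall attempt is one more message that has to be delivered to the whole list. The list size and the reply and recall counts below are invented for illustration; only the two original blank messages come from the account above.

```python
def storm_deliveries(list_size, originals, reply_alls, recalls=0):
    """Count mailbox deliveries when every message goes to the whole list.

    Each original, reply-all, and recall attempt is one message that the
    servers must fan out to list_size mailboxes.
    """
    messages = originals + reply_alls + recalls
    return messages * list_size


# Hypothetical numbers: a 20,000-person list, the 2 blank originals,
# 100 reply-alls, and 10 recall attempts.
total = storm_deliveries(list_size=20_000, originals=2, reply_alls=100, recalls=10)
print(f"{total:,} deliveries")
```

Under those assumed numbers the servers have to queue over two million deliveries, which would be consistent with slow, delayed mail rather than lost mail.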
The guy who wants to quit but doesn't because he'll only get unemployment benefits if he's fired :)
Um...which goes to show how little you know about unemployment. At least in MA, you don't get shit if it is "termination with cause", i.e. fired. If you're laid off, great, but even then, your employer gets a phone call from the unemployment department asking whether you were fired or laid off. Nothing stops them from lying and saying you were fired with cause, and then you've got a legal battle on your hands, which you can't afford.
Other fun facts about unemployment in MA: you don't get paid for two full weeks after you FILED- not after you were laid off, but after you FILED. You get a pittance compared to your normal salary; you'd be lucky to make rent on a studio apartment in Boston based off an entire month's unemployment checks.
Any income is deducted from your UA check. Say for example you find a 2-3 hour consulting thing on CL and make $150 helping someone fix their computer. Guess what? Your unemployment check for that week will be $150 smaller. This basically means that you have no incentive to find any kind of income while you're on UA.
Last but certainly not least: you have to pay taxes, medicare, medicaid, etc on your unemployment benefits. It's not bad enough that you're basically on welfare; you have to fork over a portion of the money the government is giving you, BACK to the government. Cute, eh?
Please help metamoderate.
Message recall. Oh dear.
Years ago, I wrote the bulk of this feature. It is not an Exchange feature, but an Outlook feature. It works by sending a custom MAPI message that Outlook recognizes and processes. Of course, this only works if all recipients are using Outlook. It also, after we did some usability testing, only deletes unread email, or email that has not been moved to a subfolder (the original version was quite determined and would hunt down and kill the message even if it had been moved to a subfolder, renamed or entered the email protection program). In this way, it did not violate the UI dictum that the computer not move things around when you haven't given it instructions to do so.
So yes, it is Outlook only. If sent to a non-Microsoft mail system, it degrades to a simple notification that the message is being recalled. And it is not a good choice for getting rid of flames you shouldn't have been sending. But within its expected use as a feature, correcting mistakes in email that should have been caught before pressing send, it works fairly well.
But because it is client based, rather than an Exchange feature, it does cause a new mail message to be sent to each original recipient and, combined with a reply-all storm, could greatly exacerbate things.
And, preemptively, for those who have philosophical objections to me having written the code in the first place, I'll just have to live with your disapproval and hope my steady paycheck somehow soothes my guilty conscience.
Easy there, cowboy. He did not say he would remove reply-all functionality from the system, just remove the reply-all button from the toolbar.
Thus you'll still be able to right-click a message and select reply to all, or use the hot key Ctrl+Shift+R to reply to all, just not accidentally or ignorantly click the 'letter with return arrow button that does something' and generate a reply to all.
The summary is a little misleading, but from the article, the "notice" was in response to the reply-alls taking down their server, not the cause of it. And it doesn't sound like the notice was sent via email. TFA describes it as a "cable".
If I don't put anything here, will anyone recognize me anymore?
There's ~47,000,000 Americans without health insurance. Out of a population of ~300,000,000. That's 15.67%, not 50%
I don't know anything about those source numbers, so I'll just go ahead and believe them, but I've gotta call you on those sig figs there. 15.67? 4 sig figs? How about just 20%.
(I'm not sure whether to thank or to blame all of my physics teachers for drilling us in sig figs)
coding is life
A properly designed mail server would accept mails to named distribution groups and just drop the mails into each of the associated mailboxes for internal mailing lists. I know this is how it works with our mail server, and yes, the physical IMAP machines are on different continents, but each IMAP server receives only one copy of the mail over the network.
"Linux is for noobs"-The new MS fud strategy
Looks like the pathetic one is you, and the Submitter. If you RTFA, it clearly says
He said the result was "effectively a denial of service as e-mail queues, especially between posts, back up while processing the extra volume of e-mails."
It never says they actually crashed, merely that the high volume generated large queues, exactly what you would expect to happen in a properly engineered system. But hey, this is Slashdot, so making up reasons to hate Exchange (and there are plenty of LEGITIMATE reasons to hate Exchange) is the norm.
nntp anyone? This is not a new problem you know... And yes you can configure your client to periodically refresh, show you just the new items and use it for offline reading.
Show a man some news, distract him for an hour. Show a man some mod points, distract him for the rest of his life.
Actually, this wouldn't have mattered so much if they were using Novell GroupWise.
GroupWise would store the message only once in the database and then put a pointer in every user's mailbox referring to that message. If you recalled the message, it would just remove the pointers in mailboxes where the message had not yet been read, in order to reflect the current situation.
One of the reasons I avoided Exchange like the plague is that Microsoft implements stuff like a hack job instead of doing things properly.
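A toy Python sketch of the store-once-plus-pointers idea described above, including a recall that only touches unread copies. This is an illustration of the concept, not GroupWise's actual storage format; the class and method names are invented.

```python
class SingleInstanceStore:
    """Toy message store: one body per message, one pointer per mailbox."""

    def __init__(self):
        self.bodies = {}      # message id -> body, stored exactly once
        self.mailboxes = {}   # user -> {message id: has_been_read}
        self._next_id = 0

    def deliver(self, body, recipients):
        """Store the body once and add an unread pointer for each recipient."""
        msg_id = self._next_id
        self._next_id += 1
        self.bodies[msg_id] = body
        for user in recipients:
            self.mailboxes.setdefault(user, {})[msg_id] = False
        return msg_id

    def read(self, user, msg_id):
        """Mark the user's pointer as read and return the shared body."""
        self.mailboxes[user][msg_id] = True
        return self.bodies[msg_id]

    def recall(self, msg_id):
        """Remove the pointer from every mailbox where it is still unread."""
        removed = 0
        for box in self.mailboxes.values():
            if box.get(msg_id) is False:
                del box[msg_id]
                removed += 1
        return removed
```

Note that even in this toy version `deliver` and `recall` still loop over every mailbox, so the body is stored once but the per-recipient work does not go away.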
Microsoft has described this in excruciating detail before, because at one point even they managed to crash their Exchange server - through mail list reply all spam.
http://msexchangeteam.com/archive/2004/04/08/109626.aspx
Sounds like the State Department might not have upgraded to Exchange 2003.
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
The problem is MS Exchange. Proper mail servers will only save ONE copy of a message and attachment sent to any number of users.
Exchange does use that kind of system. The problem was not that Exchange was replicating the message, it was that it had to process all of the requests to deliver the message. The mail server still has to add the pointer to each user's mailbox for each message, and if the message contains every user, that takes time to process.
That is what separates the men from the boys. Unfortunately, Exchange is one of the boys.
So the men must all be using mbox format?
The clash of honour calls, to stand when others fall.