Massive Email Crash Hits Canadian ISP Shaw
rueger writes "One of Canada's biggest cable/Internet providers has their customers in an outrage. '... after an interruption of Shaw's email services Thursday led to millions of emails being deleted ... About 70 per cent of Shaw's email customers were affected when the company was troubleshooting an unrelated email delay problem and an attempted solution caused incoming emails to be deleted ... Emails were deleted for a 10-hour period between 7:45 a.m. and 6:15 p.m. Thursday, although customers did not learn about the problem until Friday, and only then by calling customer service or accessing an online forum for Shaw Internet subscribers.' To top it off, when Shaw did send out notices about this, they looked so much like every day phishing spam that many people deleted them unread."
Not that Shaw is particularly noted for this... but just sayin' in general.
I've got mail through one account that also automatically forwards to another account (not gmail or google, thank god), so even if one provider loses my email dataset, the other still has a good copy. I also POP my mail down to my own machine, so I've got a local archive. I wonder what the details of this Canadian mishap really are... :>(
Who?
Back bacon and beer, eh? This what happens when Bob and Doug McKenzie are the network troubleshooters.
www.chihuahuarescue.com- Help to end dog abuse, abandonment and cruelty
WAIT.... "deleted?"
As in, "spilled the seed on the ground?"
I can understand if things maybe don't get delivered for a few hours, or maybe a few got munged up somewhere during the repairs, but to blissfully direct the firehose into the abyss that is /dev/null...
Who pays for that?
Oh, wait. Canadians are rather accepting of abuse on the part of their phone/cable/broadband suppliers, and the Tories back up the big businesses.
"Unacceptable! Unacceptable! Mepps, mepps!"
---------------------------------------
Rotate the pod, please, HAL....
Are one of the reasons I don't use ISP hosted email. Main reason is portability.
There will be more mail tomorrow.
I've hosted my own mail server for about 15 years and I regularly think to myself, 'I'm tired of worrying about hardware and my circuit. Maybe I should let somebody else host it.'
Then it seems there's always an article like this that clears my head.
Cox over the years has had some spectacular email outages and fuckups. To the point where I now use Gmail via IMAP and a private domain via IMAP.
isn't a holding-bay?
Like, ones that make a backup before messing with critical data? As an elementary precaution known to anybody halfway competent in IT?
This just demonstrates a massive, massive management screwup, as they allowed unqualified personnel to work on their systems. Save a buck, lose a million.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
From TFA : "The mistake was an “isolated event,” Lakshman said, and promised a detailed review, which would include a discussion about compensation."
Except it isn't. A few years ago I had a business's domain email hosted with Shaw (it was included with the internet service and they provided IMAP), and they lost all of it. They wouldn't return my calls about it, and the third time I called, a week or so later, I was told it would not be recoverable, that there is no backup for their business email service at all, and that instead they would credit the account ~5 days of internet service. I was floored, but was too busy to get into it with them and I had our email backed up, so I just moved on with an email hosting provider I felt more confident about.
Shaw's internet service has been decent, but I wouldn't trust them as anything more than a data pipe.
Yeah, messing up is normal. Failing to make a snapshot before messing with a million emails is incompetence. With a snapshot, human error might have resulted in losing three minutes' worth of emails.
"To err is human, to fuck up the whole system requires root."
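For anyone wondering what "snapshot first" looks like in the small, here is a minimal sketch. It assumes the spool is an ordinary directory and uses a plain recursive copy; a provider of Shaw's size would more plausibly use filesystem or volume snapshots, and nothing here reflects how Shaw's systems actually work. The function names and paths are made up for illustration.

```python
import shutil
import time
from pathlib import Path


def take_snapshot(spool_dir, snapshot_root):
    """Copy the spool to a timestamped directory before any risky change."""
    spool = Path(spool_dir)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    target = Path(snapshot_root) / f"{spool.name}-{stamp}"
    shutil.copytree(spool, target)  # fails loudly if the target already exists
    return target


def restore_snapshot(snapshot_dir, spool_dir):
    """Replace the current spool with the contents of a snapshot."""
    spool = Path(spool_dir)
    if spool.exists():
        shutil.rmtree(spool)
    shutil.copytree(snapshot_dir, spool)
```

The order matters: stop accepting new mail, call `take_snapshot`, try the fix, and call `restore_snapshot` if the fix goes wrong. Anything that arrived between the snapshot and the restore is still lost, which is why the server should defer incoming mail while this is going on.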
I've been working with a Vancouver based retailer's email newsletter for years. Around here, Shaw is by far the worst of the bigger email domains on the list for deliverability - at least on any of the big webmail providers, recipients can white-list the email address we use to send out emails. Further, some emails will be completely deleted and not put into the junk folder at all. And emails that are suspected spam, will be deleted after only 7 days - don't go on a 10 day vacation.
I could go on, but suffice it to say that if I notice an associate or friend or family member using a shaw.ca email address, I will often strongly suggest that they move to any of the big webmail providers.
BTW: Gmail provides IMAP and POP access, which is a stumbling block for those who want a desktop email client. I'm not sure about Yahoo or Hotmail.
Also, gmail exists.
with your comment, I modded you down because it began in the title
Haven't we all fantasized about just deleting the goddamn queue and going home?
Imagine that - a registered member logs in as AC to tell us that he's a douche. Wow - at least he knows he's a douche!! There is hope for him. Not much, but some hope.
"Windows is like the faint smell of piss in a subway: it's there, and there's nothing you can do about it." - Charlie Br
Indeed. And with preventing the server from accepting emails, and a snapshot, there would have been no loss at all. (Emails in the 3 minutes going to the secondary...)
Those truly incompetent are those not aware that they can make mistakes. Seems management is trying hard to make the engineers more like them.
http://it.slashdot.org/story/12/07/13/2050234/citys-it-infrastructure-brought-to-its-knees-by-data-center-outage
BTW: Gmail provides IMAP and POP access, which is a stumbling block for those who want a desktop email client. I'm not sure about Yahoo or Hotmail.
I'm sorry, I don't follow your logic. How is providing the option for POP and IMAP -- in addition to webmail -- considered a "stumbling block"?
only a douche's hope.
Yeap, the whole mail system is designed from the core so mail should never be lost as I learnt in my young days.
Managing to lose a single email, never mind millions, is quite an achievement.
Everything I write is lies, read between the lines.
...if your email is not in at least two physically separate places, you are at risk of losing all of it, forever.
It's weird Shaw can't restore from a backup - the article is a bit weird on the exact details about what happened and just ends with "the emails were not backed up".
If your online mail provider does not allow you to access or export your data to your own PC (via IMAP, POP, or whatever) then you should switch to one that does - and start backing up your own email if you want to be more confident that it's going to survive catastrophes.
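A rough sketch of the "back up your own email" advice, using only the Python standard library. It assumes your provider offers IMAP over SSL; the host, user name, and file name in the usage note are placeholders. `fetch_all_raw` pulls every message from one folder read-only, and `archive_messages` appends to a local mbox file, skipping anything whose Message-ID is already archived. It is a single-user illustration with no file locking, not a hardened backup tool.

```python
import email
import imaplib
import mailbox


def fetch_all_raw(host, user, password, folder="INBOX"):
    """Download every message in one IMAP folder as raw bytes (read-only)."""
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(user, password)
        conn.select(folder, readonly=True)  # do not change flags on the server
        _typ, data = conn.search(None, "ALL")
        raw_messages = []
        for num in data[0].split():
            _typ, parts = conn.fetch(num, "(RFC822)")
            raw_messages.append(parts[0][1])
        return raw_messages
    finally:
        conn.logout()


def archive_messages(raw_messages, mbox_path):
    """Append messages to a local mbox, skipping Message-IDs already present."""
    box = mailbox.mbox(mbox_path)
    try:
        seen = {msg["Message-ID"] for msg in box if msg["Message-ID"]}
        added = 0
        for raw in raw_messages:
            msg_id = email.message_from_bytes(raw)["Message-ID"]
            if msg_id and msg_id in seen:
                continue
            box.add(raw)
            if msg_id:
                seen.add(msg_id)
            added += 1
        box.flush()
        return added
    finally:
        box.close()
```

Usage would be something like `archive_messages(fetch_all_raw("imap.example.com", "me", "secret"), "mail-archive.mbox")` from a cron job, with the mbox file itself copied to a second disk or machine so the archive is not the only other copy.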
Looks like one man's stumbling block is another man's cornerstone ;).
How do you not notice all emails being dumped until 11 hours later?
Damn, isn't there anybody here but me who has been locked out of their gmail account for about 2 weeks now? I have not changed a thing in my fetchmailrc or mailfilterrc's, and have been sucking my gmail account dry at 3 minute intervals with fetchmail for damned near 5 years.
2 weeks ago, both fetchmail and mailfilter started reporting password failures. It worked about 30 minutes a day for 5 or 6 days, but has not worked since the last week of February.
I call them up, get some yahoo whose command of English sucks dead toads through soda straws, he leaves to go get someone who speaks English, but the next guy isn't a hell of a lot better, and he finally speaks clearly enough to tell me the account is blocked because my machine is compromised. I object: it's a Linux box, behind a router running DD-WRT. Doesn't mean squat to him; my machine is compromised.
Seeing as how everything that comes in here has to run the clamav gauntlet, and that this is a Linux machine which has not had java enabled anywhere near firefox in months, currently at V-19.0.2, AND that it's behind a router running DD-WRT, AND neither chkrootkit nor rkhunter can find anything to complain about, I seriously doubt it has been compromised.
I had been gradually weaning my mailing list activities, moving them to other servers precisely because of their no dups policy, so that was all the impetus I needed to just move all my subs. I still scan them on schedule just in case they actually get someone who reads English wondering why a fetchmail instance is failing to log in, telling fetchmail the password is toast when it's the same pw I've been using for years, and it's long enough that John didn't get it in 6 hours of grinding on it when I last checked with john the ripper.
Until that happens, screw gmail, and the camel that rode in on them.
Cheers, Gene
"To top it off, when Shaw did send out notices about this, they looked so much like every day phishing spam that many people deleted them unread."
Erm. No they didn't? I'm looking at one right now and it doesn't look remotely like 'every day phishing spam'. It doesn't offer me anything, threaten me with anything, or ask me to click on anything. It doesn't include any links except to a forum thread, which the text doesn't make any special effort to make you click on. It didn't trigger my mental 'phishing detector' in the slightest.
I got the email notification late Saturday, two days after the event happened, I guess. That's not a horrible delay. I also saw a bunch of delayed mails come through around that time - 10 or so - and they notified me of the sender and subject line of three mails that were lost, so looks like they managed to recover quite a lot.
I dunno, I guess I'm not TOTALLY OUTRAGED at this. As another commenter said, you know, admins screw up sometimes. Lord knows I have. The fact that they're at least able to identify the subject lines of all the lost mails makes a big difference; you could get any really vital ones re-sent.
There's a lot of incompetence about, especially bullshit such as the secondary being /dev/null itself as some sort of stupid anti-spam bandaid. I was stung by that one when I had the situation where the primary that was accepting mail for a company I was working for was getting congested and the host they were paying the ISP to supply as the secondary had a management imposed policy of just dropping everything. Probably 2/3 of incoming mail during working hours was never delivered in that four month window before they admitted that we'd been paying to let them throw our incoming email away.
MS Exchange lowered the bar. Yes, I know it's supposed to do a dozen other things, but its MTA was crap for years and still seems to generate a lot of panic on sysadmin mailing lists.
While it's utterly trivial to alias everything incoming (or even outgoing) to another host that's another bit of infrastructure and often seen as an unnecessary expense. Their backups will be system files, whatever is in the mail spool at any given day is beneath their care factor and anything that arrives after the last backup is gone anyway.
Remember this folks before considering outsourcing, it's not their email so they don't care about it as much as you do. While you may want to keep stuff in two places they are not going to bother to go to the extra expense unless it's to their advantage.
You don't even need a secondary. If your SMTP server goes off-line, the senders should retry for up to 4 hours. So you can quite literally unplug a mail server, do what you've got to do within 4 hours, plug it back in and no mail will be lost.
Excuse me, but please get off my Pennisetum Clandestinum, eh!
I apparently got hit by this.
I say apparently because I didn't notice anything had happened. A few people had to resend mail once they bothered to phone me, and then it was business as usual. I figured an MTA was just acting up somewhere along the line. No big deal.
I don't normally do this, but frankly, Shaw doesn't deserve any extreme bad press over this. They're a pretty good company. I've never had an issue with their service before. I continue to pay my bill, and they continue to provide me with the service for which I'm paying.
Frankly, they're so "we don't care what you do with your pipe" that I probably wouldn't complain even if I had a reason to. No copyright alert system bullshit, no throttling (nothing that has affected me, anyways), no monthly limits. Well, they claim there's bandwidth caps but I've blown by them every month for the past two years and nothing has ever happened (and I'm not talking about by a few megs- I'll happily do 250GB/mo+ when my "cap" is supposedly 80GB). They never complain. And I keep paying my bill on time.
So really, as far as Shaw goes, cut them some slack. Shit happens. If you feel the need to complain, they'll usually do something for you with minimal effort (I had to call them once about a glitch with my PVR, they gave me a brand new unit and refunded me $30 on my next bill). There are far, far, FAR worse companies out there to deal with.
Did you ask for your money back?
I'd been using SSH to get to my website for ages. And then it suddenly stopped working. I wondered, was it my password? But no. It turns out that, without telling me, my hosting company had changed things. Specifically, I now have to whitelist IP addresses if I wanted to use SSH. (This little fact is still not mentioned in the help documentation.) This is particularly frustrating as I have a dynamic IP address.
So, maybe Google went and changed something that your setup depends on, like requiring whitelisting or some such.
Well, if everybody has their mail servers configured correctly, the incoming mail should be flagged for redelivery by the sending MTA for at least 2-4 days, so hopefully nothing is lost. I believe Sendmail is 4 days. You would think with so many users Shaw would also have at least secondary MX records for failover. Despite being a horrible protocol, email does have its delivery protections. The problem these days is that everybody *expects* immediacy with a technology that was designed with broken connections in mind. Just think about it like this: "They don't deliver on Saturdays anymore.. don't worry you will get it on Monday"
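To make the retry argument concrete, here is a small model of a sending MTA's queue. All the numbers are illustrative assumptions, not the defaults of any particular MTA: the commenters here quote anything from 4 hours to 4 days, and RFC 5321 says the give-up time generally needs to be at least 4-5 days. The sketch only covers a server that is down or deferring mail. It does not help in the case from the story, where the server accepted messages and then deleted them, because the sender considers the job done once the message is accepted.

```python
from datetime import timedelta


def retry_schedule(initial=timedelta(minutes=30), factor=2,
                   max_interval=timedelta(hours=4),
                   lifetime=timedelta(days=4)):
    """Offsets (from the first failed attempt) at which the sender retries."""
    offsets = []
    elapsed = timedelta(0)
    interval = initial
    while elapsed + interval <= lifetime:
        elapsed += interval
        offsets.append(elapsed)
        # Back off, but never wait longer than max_interval between tries.
        interval = min(interval * factor, max_interval)
    return offsets


def first_retry_after(outage, **schedule_options):
    """First retry that lands after the outage ends, or None if the
    message would be returned to the sender before that."""
    for offset in retry_schedule(**schedule_options):
        if offset >= outage:
            return offset
    return None
```

With these assumed settings, `first_retry_after(timedelta(hours=10))` finds a retry after a 10-hour outage, while the same call with `lifetime=timedelta(hours=4)` returns `None`, which is the difference between the "4 hours" and "several days" claims in this thread.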
You have to log in with the gmail interface and answer a captcha. Then your account's back on.
Shutting down free speech with violence isn't fighting fascism. It IS fascism!
You are right. I was thinking of IMAP servers for clients sending outbound mail, but they should be separate and a secondary would not help.
Although the postfix/sendmail default for delivery failure is 2 days, I believe.
There's a lot of incompetence about, especially bullshit such as the secondary being /dev/null itself as some sort of stupid anti-spam bandaid.
How stupid is that? Incredible! The whole reason for secondaries seeing more spam is that some of them do not have spam filters because of incompetent mailadmins. The fix is to either have the secondaries forward to the primaries (when they are back up and storing for some time before that) or to have the same spam filter on the secondaries.
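A toy illustration of how a sender walks the MX list, since that is what makes a misbehaving secondary so damaging. Lower preference numbers are tried first, and the sender moves on to the next host only if the better one is unreachable. The host names are made up, and ties are broken by name here only to keep the output deterministic (real senders are expected to spread load across equal-preference hosts).

```python
def delivery_order(mx_records):
    """Sort (preference, host) pairs so the lowest preference comes first."""
    return [host for _preference, host in sorted(mx_records)]


def pick_host(mx_records, reachable):
    """Return the first host in MX order that the sender can reach."""
    for host in delivery_order(mx_records):
        if host in reachable:
            return host
    return None  # nothing reachable: the sender queues the mail and retries


records = [(20, "mx2.example.net"), (10, "mx1.example.net")]
print(pick_host(records, {"mx1.example.net", "mx2.example.net"}))
print(pick_host(records, {"mx2.example.net"}))
```

Once the secondary answers and accepts the message, the sender stops retrying. If that secondary then drops the mail instead of holding it for the primary, the message is gone and nobody is told, which is exactly the failure described above.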
panic on sysadmin mailing lists
Hehehe, good old Microsoft. Never transparent, never clear, always good for surprises and gets more obscure than Linux kernel hacking when you have to fix problems MS did not anticipate. In German we call these "Schoenwettersysteme" (translates as "nice-weather only systems"). Toys, not fit for any real-world use.
Same PHB that let their data center fire take out 911 and other stuff in Calgary. Now I can see a fire killing all power or tripping the master power switch, but not having a back up data center?
HR sees = cs degree as competent in IT while passing over people who went to tech schools or have years of experience
But, if anything happens, It's my own fault. I don't have to trust my ISP to do anything but provide the pipe.
I would never trust exchange as a relay. Everywhere I worked where I had the power to do so, Exchange did not sit in the DMZ, and relayed through proper unix mail servers. I prefer sendmail, because I am familiar with it and know how to properly extend and secure it. Use postfix if you prefer, but again, I'd never trust a Microsoft Exchange mail relay.
Nearly a week ago I contacted Shaw's customer support and requested one of their service reps contact me regarding setting up a client's ISP account. I received an automated e-mail later that day confirming they got the message and I was told they'd get back to me within 48 hours. Six days later and they still haven't contacted me. The client is now using a different provider for their Internet connection. A shame as Shaw really is one of the nicer ISPs to deal with in Canada, but they seem to have dropped the ball this week.
I don't believe 911 was affected, though many other government services were affected, including all registry services, and all electronic health records in the hospitals. To be fair, I'm not sure how much blame Shaw has in that one. The government contracted IBM (I think) to do the data centre, and IBM (if that's who it was) hosted it in the Shaw building. Without knowing the contracts involved, it's equally likely that this was a government screw up, an IBM screw up, or a Shaw screw up. Ok, the amount of damage from the fire itself seems excessive, indicating a poorly designed data centre, which was Shaw's fault. But I honestly put the lack of any redundant systems for such critical infrastructure down as a government screw up, as it is likely their contract that specified only a single data centre. (For such mission critical stuff, I can't figure why they wouldn't have a minimum of two completely redundant systems in two different cities running with live fail-over capabilities. There's no reason that outage should have lasted more than a minute or two, let alone the week plus that it did.)
Summary:
"A bunch of elderly users lost photos of kittens and grandchildren in massive email purge."
Shaw is still a better ISP than Telus. In fact Shaw could drop ALL email ALL the time and still beat Telus. Telus sucks - it's true. At least I don't have to sue Shaw in court to get customer service from them. Eat shit and die, Telus!
Oh, sorry, I meant that in a good way. I meant that a lot of people need a desktop email client, and in the past Hotmail and Yahoo didn't offer that, whereas Gmail has had it for years.
Yeap, the whole mail system is designed from the core so mail should never be lost as I learnt in my young days.
There were always ways to lose mails. One obvious way is if a mail server dies then you lose all mail between the last backup and the mailserver dying. Another is if both the original mail and the bounce suffer a delivery failure but these circumstances were rare afaict. The majority of the time mails were either delivered successfully or bounced to the sender.
Then spam and virus mails with faked from addresses came along. If you bounce such mails you create backscatter for an unrelated user. If you reject them during the SMTP session then there is less chance of backscatter than if you bounce them yourself but it can still happen if the spam/virus mail is being sent to you by another MTA rather than directly by the virus/spamming tool. To avoid backscatter and keep things simple many filters just discard mails that they identify as spam or virus mails without attempting to inform the sender. If a mail is misidentified as spam or a virus mail in such a system, either due to imperfect heuristics or a configuration screwup, then it will be silently lost.
In summary the deluge of spam and virus mails has led to reactions that destroyed the reliability of internet email.
note: i'm known as plugwash most places but i screwed up registering that here somehow in the past and now can't register
My university (I graduated long ago) gives a complimentary email account to all alumni. Very useful for maintaining a constant address regardless of ISP.
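The trade-off described above can be written down as a small decision table. This is a simplified model with made-up names, not the behaviour of any specific filter: it assumes legitimate mail carries a genuine sender address, and that a rejection during the SMTP session only turns into a bounce message when the connecting client is a relaying MTA.

```python
from enum import Enum


class Action(Enum):
    DELIVER = "deliver"
    REJECT = "reject"    # refuse with a 5xx reply during the SMTP session
    BOUNCE = "bounce"    # accept, then send a failure notice later
    DISCARD = "discard"  # accept, then drop without telling anyone


def outcome(action, legitimate, forged_sender=False, via_relay=False):
    """Who finds out, and who gets hurt, for one filtering decision."""
    if action is Action.REJECT:
        # The connecting client sees the error directly; only a relaying
        # MTA converts it into a bounce sent to the envelope sender.
        notice_generated = via_relay
    else:
        notice_generated = action is Action.BOUNCE
    return {
        "delivered": action is Action.DELIVER,
        "sender_informed": legitimate and action in (Action.REJECT, Action.BOUNCE),
        "backscatter": notice_generated and forged_sender,
        "silently_lost": legitimate and action is Action.DISCARD,
    }
```

For example, `outcome(Action.DISCARD, legitimate=True)` reports `silently_lost` as true, while `outcome(Action.BOUNCE, legitimate=False, forged_sender=True)` reports backscatter. Rejecting in-session avoids both problems for directly connected spam tools, which is why it is usually the least bad option.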
Tell that to google. I have no access by any method. End of discussion. I didn't even call them until after my username and passwd known to be good, was rejected trying to login via FF.
I don't use webmail. Ever. It's a solution promulgated because they can wrap it up in so damned much advertising that you sometimes can't find the frigging message. Why folks, mostly winders users I suppose, use it and put up with the hassle of spending 5 minutes to log in using a browser, when that is an automatic function of fetchmail that takes less than 100 milliseconds when committed to a background script, is beyond me. If the login is successful, then that waiting mail is downloaded to my hard drive at 400kb/sec & 30 seconds later I'm gone. I hit the + key and read it.
Now, if they wanted to cull the accounts that are not seeing their advertising, that's fine by me, as I have access to other mail servers. But no, they can't be honest, they have to lie like a used car salesman, telling me my machine is infected. There are 2 or 3 mailing lists, one of them a 500 msgs/day list, still being fed into it. But they'll probably not notice as they probably have an old message culler that kicks in when the mailbox is at 95%. And I have no clue how much space that is.
In short, but at length in this reply, it is googles problem. They can fix it. If they were changing something that required I change a fetchmail option, they could have issued a broadcast to all users message. They did not.
Cheers, Gene
I've run enterprise email servers before, and every now and then a connector will break. For example, piping email traffic to Exchange via Mail Marshal will infrequently result in mail delivery failures because something is broken. It gets fixed shortly after its reported, but it isn't always reported in a timely fashion. So, isn't this fairly common?
The error message you get when you can't log in directs you to a web page where it explains you have to log in by web. Maybe those "winders" users follow documentation better than you?
Not from fetchmail:
But I had to turn it on in .fetchmailrc & when I did, without the prescan by mailfilter, it worked; it's sucking over 100 old mails dating back to March 1 now. So we wait and see if it will accept the next pull request. This gives me a list of lists whose subscriptions I need to move. lkml and mplayer for starters. Now I have re-enabled mailfilter too.
fetchmail's latest does have a new error message though, which for here makes zero sense, as this is not multidrop. Everything goes to me, although I do have a few /dev/null destinations in my procmailrc.
fetchmail: awakened at Tue Mar 12 11:46:52 2013
fetchmail: restarting fetchmail (/home/gene/.fetchmailrc changed)
fetchmail: warning: multidrop for pop.gmail.com requires envelope option!
fetchmail: warning: Do not ask for support if all mail goes to postmaster!
fetchmail: starting fetchmail 6.3.9-rc2 daemon
And the docs for 6.3.9-rc2 do not appear to discuss this. In any event, if mailfilter doesn't nuke it on the server before fetchmail pulls it, it's handed off to procmail, SA and clamav. I see what survives that.
Anyway after 12 days, it's working again, until gmail gets another fart stuck crossways I guess. As to when that might be, I haven't the foggiest.
Cheers, Gene.
Too bad it refuses to talk IMAP properly.