Yahoo's Amazing Disappearing Mail Servers
Golygydd Max writes "A Techworld story reveals that the reason Yahoo email has delivery problems is that the company's mail servers mysteriously close once in a while." From the article: "According to trimMail's Email Battles site, which recently monitored 16 of the company's advertised email hosts 240 times over a half hour period, only 133 of its probes were answered. Many of the servers were closed and unavailable. Overall availability ranged from 25 percent to 75 percent over the admittedly short test period. The average availability was 55 percent, with the worst of the servers available only 7 percent of the time."
Duh! That is why they have multiple redundant servers. When one server goes down, the email is routed to another server. Personally, I have never encountered a situation where an email sent to my Yahoo account did not reach me. Yahoo Groups is a different story. Emails used to disappear frequently when they merged with eGroups. Things have stabilized now, but sometimes emails sent to a group do not reach all the participants, and it is not a receiver issue but a routing issue on the Yahoo Groups servers. Overall the uptime should be close to 100%. Nobody cares what is happening behind the scenes, whether one server has 100% or 7%.
Email is DESIGNED to handle failures of this kind. Assuming Yahoo is running some form of clustering, it's quite reasonable to think that systems will start/stop as load fluctuates. Availability of individual servers is largely irrelevant - it's the availability of the system at large that matters.
I have no problem with your religion until you decide it's reason to deprive others of the truth.
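To put a rough number on "the availability of the system at large", here is a back-of-the-envelope sketch. It assumes each server fails independently of the others, which is almost certainly not true of a real cluster, and it plugs in the article's 55 percent average for all 16 hosts because the article gives no per-host list. The function name is made up for illustration.

```python
def system_availability(per_server):
    """Chance that at least one server answers, assuming independent failures."""
    all_down = 1.0
    for p in per_server:
        all_down *= 1.0 - p  # every server must be down at the same moment
    return 1.0 - all_down

# Two servers that are each up half the time
print(system_availability([0.5, 0.5]))  # 0.75

# 16 hosts, each at the reported 55 percent average (an assumption, see above)
print(system_availability([0.55] * 16))
```

Under those assumptions the second figure comes out well above 99.99 percent, and that is before counting the sender's own retry queue.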
240 times over a half hour period is a high rate of connections per server (8 per minute per server), especially for email servers, so is it possible that Yahoo!'s servers were simply defending themselves against a perceived threat? Connection throttling was the first thing that came to mind on reading the blurb.
Yahoo is actually doing the right thing here, from a technical point of view. The worst thing you can do is have an MX that accepts connections but is not responsive enough to actually handle accepting a message at that point -- it's far better to stop accepting SMTP connections when you detect you're at your maximum capacity.
This is because SMTP clients that fail to get a connection will immediately try the next MX. If they get a connection but can't send the message, they may back off and try again later, delaying the message further.
--
Twoflower
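The "try the next MX" behaviour described above can be sketched roughly like this. It is a simplification, not any particular mail server's code: the hostnames are hypothetical, `accepts_connection` stands in for a real TCP connect to port 25, and real senders also shuffle hosts that share the same preference value.

```python
def pick_mx(mx_records, accepts_connection):
    """Return the first MX host that accepts a connection, or None.

    mx_records: list of (preference, hostname); lower preference is tried first.
    accepts_connection: callable(hostname) -> bool, a stand-in for a TCP connect.
    """
    for _pref, host in sorted(mx_records):
        if accepts_connection(host):
            return host
    return None  # nothing reachable: the sender queues the message and retries later

# Hypothetical hosts; two of the three are "closed" right now
records = [(10, "mx1.example.com"), (10, "mx2.example.com"), (20, "mx3.example.com")]
closed = {"mx1.example.com", "mx2.example.com"}
print(pick_mx(records, lambda h: h not in closed))  # mx3.example.com
```

A refused connection costs the sender one failed connect; an accepted connection that then stalls costs it a timeout and possibly a trip back to the retry queue.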
Yahoo is a heavy user of greylisting. I would expect any of their servers to break connections, refuse connections and even deploy firewall rules, including tarpitting, against anything their greylisting algorithm finds annoying. In fact I am pretty sure about the first two, dunno about the last item. I am planning on doing it on the servers I run, and I would be surprised if they do not have it already. After all, they have a huge department that does nothing else but mail for themselves and their resale customers.
Move along, people; the dot.bomb times are simply back. Yet another metrics company making big noises about the fact that someone BIG looks bad on their metric. The reason is most likely that the metric is badly designed and does not take current large-scale mail handling practices into account. We were all there a few years ago, when everybody and his dog was pushing metrics around just before the bubble collapsed. Move along, nothing new here.
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/
Indeed, as soon as I read the headline of the article I knew some dork that didn't understand greylisting was behind it. I've implemented greylisting with MDaemon (along with its other 9 or so anti-spam layers) with great success, and if you use decent monitoring tools, everything works just fine.
My name is coaxeus, and I approve this message. In fact, I think it is awesome.
Yeah, greylisting works great, if the sending mail servers behave too. My employer (a small ISP) uses it, and every now and then a remote server has some weird retry patterns that fuck everything up. Try explaining THAT to a customer.
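For anyone who hasn't run it, greylisting boils down to something like the sketch below. This is a toy version, not MDaemon's or Yahoo's implementation: the class name, the 300-second delay and the (client IP, sender, recipient) key are assumptions picked for illustration, and the addresses are from the documentation ranges.

```python
class Greylist:
    def __init__(self, min_delay=300):
        self.min_delay = min_delay  # seconds a new sender must wait (assumed value)
        self.first_seen = {}        # triplet -> time of its first attempt

    def check(self, client_ip, sender, recipient, now):
        key = (client_ip, sender, recipient)
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "accept"
        return "tempfail"  # SMTP 4xx: a well-behaved sender retries later

g = Greylist()
print(g.check("192.0.2.1", "a@example.org", "b@example.com", now=0))    # tempfail
print(g.check("192.0.2.1", "a@example.org", "b@example.com", now=600))  # accept
# The retry arrives from a different outbound IP: treated as a brand-new sender
print(g.check("192.0.2.2", "a@example.org", "b@example.com", now=600))  # tempfail
```

The last call is one example of a retry pattern that goes wrong: a sender that retries from a pool of outbound addresses keeps looking new, and the message keeps getting deferred.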
"240 times over a half hour period, only 133 of its probes were answered."
Well, if a single host tried to ping me once every 7.5 seconds for a half-hour, I'd want my hardware to ignore a few of them, too.
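For what it's worth, the arithmetic on the quoted figures works out as below. The blurb doesn't say how the 240 probes were divided among the 16 hosts, so the per-host numbers assume an even split; treat them as one possible reading, not as what trimMail actually did.

```python
probes = 240
answered = 133
hosts = 16
window_s = 30 * 60  # half an hour, in seconds

interval_s = window_s / probes    # seconds between probes overall
answer_rate = answered / probes   # fraction of probes answered

# Assumption: probes spread evenly across the 16 hosts
per_host_probes = probes / hosts
per_host_interval_s = window_s / per_host_probes

print(interval_s)             # 7.5
print(round(answer_rate, 3))  # 0.554
print(per_host_probes)        # 15.0
print(per_host_interval_s)    # 120.0
```

So it is one probe every 7.5 seconds in total, but under an even split each individual host would only see one every two minutes.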