A Note On Thursday's Downtime

Posted by Soulskill on Saturday July 18, 2015 @04:25PM from the slashdotting-ourselves dept.

If you were browsing the site on Thursday, you may have noticed that we went static for a big chunk of the day. A few of you asked what the deal was, so here's a quick follow-up. The short version is that a storage fault led to significant filesystem corruption, and we had to restore a massive amount of data from backups. There's a post at the SourceForge blog going into a bit more detail, and describing the steps our Siteops team took (and is still taking) to restore service. (Slashdot and SourceForge share a corporate overlord, as well as a fair bit of infrastructure.)

While you're at it, add some modern features by the_humeister · 2015-07-18 16:49 · Score: 4, Interesting

like unicode support and ipv6.

Cause?? by scsirob · 2015-07-18 18:56 · Score: 4, Interesting

It's great to see how you responded to the failure and got services resumed pretty quickly. However, I'd rather like to see a follow-up sometime, describing a root cause analysis. With all the clustered, distributed servers and filesystems you use today, such an outage shouldn't be possible, right?

--
To Terminate, or not to Terminate, that's the question - SCSIROB