Amazon Explains Why S3 Went Down
Angostura writes "Amazon has provided a decent write-up of the problems that caused its S3 storage service to fail for around 8 hours last Sunday. It providers a timeline of events, the immediate action take to fix it (they pulled the big red switch) and what the company is doing to prevent re-occurrence.
In summary: A random bit got flipped in one of the server state messages that the S3 machines continuously pass back and forth. There was no checksum on these messages, and the erroneous information was propagated across the cloud, causing so much inter-server chatter that no customer work got done."
Since their drive towards Amazon Prime, their deliveries have been appalling (at least in the area of the UK I'm in), YMMV. As much as I appreciate their candour, the delivery and delay problems are what is driving me away at the moment.
A thistle is a fat salad for an ass's mouth...