Slashdot.org Self-Slashdotted
Slashdot.org was unreachable for about 75 minutes this evening. Here is the post-mortem from SourceForge's chief network engineer Uriah Welcome. "What we had was indeed a DoS, however it was not externally originating. At 8:55 PM EST I received a call saying things were horked, at the same time I had also noticed things were not happy. After fighting with our external management servers to log in I finally was able to get in and start looking at traffic. What I saw was a massive amount of traffic going across the core switches; by massive I mean 40 Gbit/sec. After further investigation, I was able to eliminate anything outside our network as the cause, as the incoming ports from Savvis showed very little traffic. So I started poking around on the internal switch ports. While I was doing that I kept having timeouts and problems with the core switches. After looking at the logs on each of the core switches they were complaining about being out of CPU, the error message was actually something to do with multicast. As a precautionary measure I rebooted each core just to make sure it wasn't anything silly. After the cores came back online they instantly went back to 100% fabric CPU usage and started shedding connections again. So slowly I started going through all the switch ports on the cores, trying to isolate where the traffic was originating. The problem was all the cabinet switches were showing 10 Gbit/sec of traffic, making it very hard to isolate. Through the process of elimination I was finally able to isolate the problem down to a pair of switches... After shutting the downlink ports to those switches off, the network recovered and everything came back. I fully believe the switches in that cabinet are still sitting there attempting to send 20Gbit/sec of traffic out trying to do something - I just don't know what yet. Luckily we don't have any machines deployed on [that row in that cabinet] yet so no machines are offline.
The network came back up around 10:10 PM EST."
So if you hammer your own servers, do you have to send an email to krow to get your privileges restored?
Now if you could just post the link to the form where I can claim my full refund (for time not wasted incurred) I'll go back to being a loyal "customer".
I record my sleeptalking
In Soviet Russia, Slashdot slashdots Slashdot!
Probably the biggest proof that Slashdot has become sentient is that it is willing to kill itself rather than see another batch of Idle videos.
Slashdot has apparently learned how to masturbate, because it is now fucking with itself!
Any day you get to legitimately use "horked" in a public post can't be all bad. :P
Who Slashdots the Slashdotters?
First thing I'd do as Cyber Security Tzar would be to outlaw any network device that has the potential to become faulty.
We could've avoided this tragedy entirely.
Modding me -1 troll doesn't make me wrong.
This is another betrayal by Obama, as he yet again bows down before the fat cats and career politicians.
Shame!
The problem was the system was HORKED, didn't you get that?
The switches were running Windows 7 Starter Edition. http://tech.slashdot.org/article.pl?sid=09/02/09/1348255
The year is 2025.
Well, Ladies and Gentlemen, here you see what you may think is an archaic lot of old computers. You would be mistaken. These are Slashdot. No, no cause for alarm...and that door's locked anyway, you can't get out through there. The tour only goes forward. But I'm glad at the very least that you know what Slashdot is. Not was. IS.
It's a safeguard against...something. Something that was unleashed for 75 minutes in 2009 that crippled what was rumored to be the most robust public-facing cluster known. All we have left from that fateful day is the single post from the Slashdot network admin. Someone archived it, lucky us, because he was never seen after that day. I have a copy here, hardcopy of course -- no sense in taking risks so close to...well....
Here it is:
I fully believe the switches in that cabinet are still sitting there attempting to send 20Gbit/sec of traffic out trying to do something. I just don't know what yet.
Is it possible the duplicate article generator tried to spawn, became entangled in its own potential well of duplicity, and now is trapped like two Lisp programmers deep inside their parentheses?
Every man's island needs an ocean; choose your ocean carefully.
In Korea, only old people slashdot slashdot. The memes are funny. The insightful comments are insightful. The funny comments are funny, the trolls are trolls. Seems resetting slashdot fixed everything. The entire world is doomed!
And before anyone says this is a shitty plot... I *did* say Michael Bay.
Today is red jello day - all workers must eat all of their red jello. Failure to comply will result in five demerits.
Mirror
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
Act as a data source to Excel.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
February 9th, 2009 8:55pm Slashdot becomes self-aware.
...were he not typing that long-a$$ summary. Twice as fast if he didn't have to spellcheck.
(j/k)
Which leads me to this question:
What do Slashdotter staff read to avoid doing work?
WARNING: Smartphones have side effects--most of them undocumented.
Is that worse than B0rked?
I thought the scale was:
B0rked
Horked
F*cked
Stuffed
Iffy
Working
AT&ROFLMAO
this actually explains duplicate posts pretty well...
The time lords, for a joke, take stories from slashdot, go back a day or two, and submit them. They get posted a few days early, but to avoid paradox, reality requires the "original" post to be made anyway. Thus we get double posts of stories.
You all owe the slashdot editors an apology.
Darth --
Nil Mortifi, Sine Lucre
Where does being Bork Bork Borked rank on that?
I would still find ways to not do work even without Slashdot!
Cool! How do you manage to do that? Please, share your secret with the rest of the world...
...being out of CPU, the error message was actually something to do with multicast. As a precautionary measure I rebooted each core just to make sure it wasn't anything silly. After the cores came back online they instantly went back to 100% fabric CPU usage and started shedding connections again. So slowly I started going through all the switch ports on the cores, trying to isolate where the traffic was originating. The problem was all the cabinet switches were showing 10 Gbit/sec of traffic, making it very hard to isolate. Through the process of elimination I was finally able to isolate the problem down...
What did I say that sounded like "Tell me about your day at work" ?
Squirrel!
I fully believe the switches in that cabinet are still sitting there attempting to send 20Gbit/sec of traffic out trying to do something - I just don't know what yet.
Um, trying to get first post?
It may be strange for those not in the networking field, but when things really go bad, the only place to be is physically in the data center.
Heh. I've heard that in the old days you could find broken Token Ring hardware by listening for a high-pitched whining noise. Guess one really has to be there for stuff like that.
Was there, and confirm true. Whining noise normally came from IBM SE who was trying to fix problem.
Man, your poor slash key has a hard life.
First rule of portfast mode:
Whatever happens in portfast mode, stays in portfast mode.
This space for rent. All reasonable inquiries will be entertained at proprietor's discretion.
And "access from the home office" would allow them to do what exactly?!?
Guaranteed first posts.
Better known as 318230.
That's not a nice or polite way to talk about your manager.