Slashdot Mirror


Slashdot.org Self-Slashdotted

Slashdot.org was unreachable for about 75 minutes this evening. Here is the post-mortem from SourceForge's chief network engineer Uriah Welcome. "What we had was indeed a DoS, however it was not externally originating. At 8:55 PM EST I received a call saying things were horked, at the same time I had also noticed things were not happy. After fighting with our external management servers to log in I finally was able to get in and start looking at traffic. What I saw was a massive amount of traffic going across the core switches; by massive I mean 40 Gbit/sec. After further investigation, I was able to eliminate anything outside our network as the cause, as the incoming ports from Savvis showed very little traffic. So I started poking around on the internal switch ports. While I was doing that I kept having timeouts and problems with the core switches. After looking at the logs on each of the core switches they were complaining about being out of CPU, the error message was actually something to do with multicast. As a precautionary measure I rebooted each core just to make sure it wasn't anything silly. After the cores came back online they instantly went back to 100% fabric CPU usage and started shedding connections again. So slowly I started going through all the switch ports on the cores, trying to isolate where the traffic was originating. The problem was all the cabinet switches were showing 10 Gbit/sec of traffic, making it very hard to isolate. Through the process of elimination I was finally able to isolate the problem down to a pair of switches... After shutting the downlink ports to those switches off, the network recovered and everything came back. I fully believe the switches in that cabinet are still sitting there attempting to send 20 Gbit/sec of traffic out trying to do something; I just don't know what yet. Luckily we don't have any machines deployed on [that row in that cabinet] yet so no machines are offline.
The network came back up around 10:10 PM EST."
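
The "process of elimination" in the post-mortem can be sketched roughly as follows. This is only an illustration, not the procedure actually used: the cabinet names, the `isolate_source` helper, and the `toy_probe` check (standing in for "shut the downlinks, then look at fabric CPU on the cores") are all hypothetical.

```python
from typing import Callable, Iterable, Optional


def isolate_source(cabinets: Iterable[str],
                   core_recovers_without: Callable[[str], bool]) -> Optional[str]:
    """Shut each cabinet's downlinks in turn and watch the core.

    Returns the first cabinet whose removal lets the core recover,
    or None if no single cabinet is responsible.
    """
    for cabinet in cabinets:
        if core_recovers_without(cabinet):
            return cabinet  # leave these downlinks shut
        # Otherwise the downlinks would be re-enabled before moving on.
    return None


# Toy stand-in for "shut the ports, then check fabric CPU on the cores".
FLOOD_SOURCE = "cab-17"  # hypothetical culprit


def toy_probe(cabinet: str) -> bool:
    return cabinet == FLOOD_SOURCE


if __name__ == "__main__":
    cabinets = [f"cab-{n:02d}" for n in range(1, 25)]
    print(isolate_source(cabinets, toy_probe))
```

Per-port traffic counters are of little help here because, as the post notes, every cabinet switch showed the same load; removing one candidate at a time is slow but does not depend on the counters.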

281 of 388 comments

  1. Do you get the pink screen? by BadAnalogyGuy · · Score: 4, Funny

    So if you hammer your own servers, do you have to send an email to krow to get your privileges restored?

    1. Re:Do you get the pink screen? by MindlessAutomata · · Score: 4, Funny

      The manager that did that at a restaurant I used to work at got his privileges revoked, instead.

    2. Re:Do you get the pink screen? by furby076 · · Score: 1

      I read the article submission, now I have a headache.. You can reboot individual processors in a computer?

      --
      I do not support "The Man". I also do not support your irrational stupidity
    3. Re:Do you get the pink screen? by TheLink · · Score: 4, Informative

      core = core switch = a main switch that most of the edge switches/devices are plugged into.
      reboot core = reboot a core switch.

    4. Re:Do you get the pink screen? by BunnyClaws · · Score: 4, Funny

      I read the article submission, now I have a headache.. You can reboot individual processors in a computer?

      This comment made me laugh. No, I am not laughing with you, I am laughing at you.

      --
      "Anything tastes good if you deep fry it."
    5. Re:Do you get the pink screen? by o'davy · · Score: 1

      Sorry, I got distracted by the bubble rings.

      --
      Sig goes here.
  2. Wow, that sucks by drachenstern · · Score: 2, Interesting

    So why didn't ya'll have access from the home office?

    --
    2^3 * 31 * 647
    1. Re:Wow, that sucks by Arthur+Grumbine · · Score: 3, Insightful

      And "access from the home office" would allow them to do what exactly?!?

      --
      Now that I think about it, I'm pretty sure everything I just said is completely wrong.
    2. Re:Wow, that sucks by jd · · Score: 5, Funny

      Act as a data source to Excel.

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    3. Re:Wow, that sucks by v1 · · Score: 1

      tell the offending switches to shut down all their ports maybe?

      --
      I work for the Department of Redundancy Department.
    4. Re:Wow, that sucks by drachenstern · · Score: 1, Troll

      Ya know, if I had just quoted this:

      After fighting with our external management servers to log in I finally was able to get in and start looking at traffic.

      you would have immediately been labeled as troll. As it is, you've been labeled insightful because neither you nor the mods read the summary. Excellent. What IS your secret?

      The point is, they hadn't already given him direct access to those connections before yesterday, and he had to spend a large chunk of those 75 minutes getting the authorization to access the equipment so he COULD fix it.

      --
      2^3 * 31 * 647
    5. Re:Wow, that sucks by goaliemn · · Score: 4, Informative

      The point is, they hadn't already given him direct access to those connections before yesterday, and he had to spend a large chunk of those 75 minutes getting the authorization to access the equipment so he COULD fix it.
      That's not how I read it at all. The switches were so overloaded that he had to "fight" to get into the box. He, more than likely, already had access to the box, but the network was working against him.

    6. Re:Wow, that sucks by Andy+Dodd · · Score: 1

      That's how I read it too. The network was shitting itself so badly that even the management functions were severely degraded.

      --
      retrorocket.o not found, launch anyway?
    7. Re:Wow, that sucks by Dan+East · · Score: 4, Funny

      And "access from the home office" would allow them to do what exactly?!?

      Guaranteed first posts.

      --
      Better known as 318230.
    8. Re:Wow, that sucks by Achromatic1978 · · Score: 5, Informative

      He (she?)

      For Slashdot staff, I think the generally accepted nominal is "It"...

    9. Re:Wow, that sucks by drachenstern · · Score: 1

      Ya know, if I had re-read it when I copy-pasted, I would've seen what ya'll seen. The first two times I read it I never saw the word "servers" only the fighting with management to login.

      IDK, going back and looking at the summary... idk <hangs head>

      Based on the fact that my OP got modded insightful, perhaps I wasn't the only one???
       
      /me slinks away slowly

      --
      2^3 * 31 * 647
  3. Thanks for the information by sleeponthemic · · Score: 5, Funny

    Now if you could just post the link to the form where I can claim my full refund (for time not wasted incurred) I'll go back to being a loyal "customer".

    --
    I record my sleeptalking
    1. Re:Thanks for the information by Anonymous Coward · · Score: 5, Funny

      Okay, here is the link: http://slashdot.org/subscribe.pl

      You probably owe about $10 for your time not wasted.

    2. Re:Thanks for the information by Arthur+Grumbine · · Score: 5, Funny

      I don't know about you, but I'm suing for punitive damages. Do you have any idea how much pain and suffering the work I did in that time caused me?!

      --
      Now that I think about it, I'm pretty sure everything I just said is completely wrong.
    3. Re:Thanks for the information by Atario · · Score: 5, Funny

      Trust me, it's nothing compared to the pain and suffering your work caused us.

      -- The testing staff

      --
      "A great democracy must be progressive or it will soon cease to be a great democracy." --Theodore Roosevelt
    4. Re:Thanks for the information by spartacus_prime · · Score: 5, Informative

      I don't know about you, but I'm suing for compensatory damages. Do you have any idea how much pain and suffering the work I did in that time caused me?!

      Fixed that for you. Sorry, law student.

      --
      If you can read this, it means that I bothered to log in.
    5. Re:Thanks for the information by furby076 · · Score: 1

      The RIAA lawyers are on hold - they said they would be more than happy to represent you. Just sign the contract form they are providing. Don't worry that it is thicker than the Encyclopaedia Britannica; it's just standard jargon.

      --
      I do not support "The Man". I also do not support your irrational stupidity
    6. Re:Thanks for the information by Anonymous Coward · · Score: 1, Funny

      You have no idea.

      -- Your users

  4. In Soviet Russia by MindlessAutomata · · Score: 5, Funny

    In Soviet Russia, Slashdot slashdots Slashdot!

    1. Re:In Soviet Russia by ocularDeathRay · · Score: 5, Funny

      the headline is confusing, was the problem caused by a recursive dupe or something?

      I didn't read the rest of the summary cause it is longer than my finger and that is how we used to roll on the dialup BBSs... never read anything longer than your finger held up to the screen. this message is only intended for people of all finger sizes.

      --
      Obama is a twitter sock puppet
    2. Re:In Soviet Russia by robophilosopher · · Score: 5, Informative

      I believe you mean: Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo. The caps matter. In other words, buffalo from the city of Buffalo that are pushed around by (other) buffalo from the city of Buffalo in turn push around (still more) buffalo from the city of Buffalo. And you thought this was unrelated to the recursive dupe comment.

    3. Re:In Soviet Russia by Anonymous Coward · · Score: 5, Funny

      Yo dawg, I herd u like Slashdot so I slashdotted your Slashdot!

    4. Re:In Soviet Russia by Captain+Splendid · · Score: 1

      I'm thinking it was my fault. I was reading Shakrai's journal, went to post a reply, and bam, no more slashdot.

      So yeah, sorry about that.

      --
      Linux, you magnificent bastard, I read the fucking manual!
    5. Re:In Soviet Russia by interkin3tic · · Score: 1

      Was it maybe a feedback loop of that very thing that caused the slashdotting?

    6. Re:In Soviet Russia by jez9999 · · Score: 4, Informative

      There are no buffalo living in the US. Only bison. ;-)

    7. Re:In Soviet Russia by stoolpigeon · · Score: 1

      That's funny because I got my reply in but everything went to crap right after that.

      --
      It's hard to believe that's how Micronians are made. Why don't we see it right now by having you both kiss one another?
    8. Re:In Soviet Russia by MrNaz · · Score: 1

      Are you, by any chance, a web designer from Melbourne, Australia?

      --
      I hate printers.
    9. Re:In Soviet Russia by Anonymous Coward · · Score: 1, Funny

      No no no. You only got half the meme. A better one would be:
      Yo dawg, I herd u like Slashdot, so we Slashdotted your Slashdot so you can Slashdot while being Slashdotted!

    10. Re:In Soviet Russia by Zarf · · Score: 5, Funny

      In Soviet Russia ...

      1. Meme Very Tired. No Longer Wired.
      2. 'Soviet Russia' ceased to exist last century.
      3. Profit!!!

      I for one welcome our previous-century-meme based overlords.

      --
      [signature]
    11. Re:In Soviet Russia by Zarf · · Score: 5, Funny

      Was it maybe a feedback loop of that very thing that caused the slashdotting?

      I think the switch was trying to get first post.

      --
      [signature]
    12. Re:In Soviet Russia by stoolpigeon · · Score: 1

      Nice. I am an American stoolpigeon - and not even remotely as hip or cool as those folks. I don't see it used too often by others, it being a pejorative term and all. Maybe they are being ironic or something.

      --
      It's hard to believe that's how Micronians are made. Why don't we see it right now by having you both kiss one another?
    13. Re:In Soviet Russia by rirugrat · · Score: 1

      ...and we would have gotten away with it if it weren't for those meddling kids and their dog!

    14. Re:In Soviet Russia by machine321 · · Score: 2, Funny

      never read anything longer than your finger held up to the screen.

      Which finger?

    15. Re:In Soviet Russia by Shakrai · · Score: 1

      I'm thinking it was my fault. I was reading Shakrai's journal, went to post a reply, and bam, no more slashdot.

      My bad. Sorry ;)

      --
      I want peace on earth and goodwill toward man.
      We are the United States Government! We don't do that sort of thing.
    16. Re:In Soviet Russia by jonaskoelker · · Score: 1

      never read anything longer tha

      My middle finger is shorter than your lines :(

    17. Re:In Soviet Russia by Ceriel+Nosforit · · Score: 1

      MORTAL KOMBAT!

      --
      All rites reversed 2010
    18. Re:In Soviet Russia by Dogtanian · · Score: 1

      In Soviet Russia, Slashdot slashdots Slashdot!

      Yes, but to paraphrase another saying, in Capitalist America it's the other way around.

      --
      "Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
    19. Re:In Soviet Russia by thePowerOfGrayskull · · Score: 3, Insightful

      Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.

      What ever happened to "Duck duck duck goose"?

    20. Re:In Soviet Russia by suggsjc · · Score: 1

      http://www.lulu.com/content/2417903

      Full disclosure, I am the "author".

      --
      When I have a kid, I want to put him in one of those strollers for twins and then run around the mall looking frantic.
    21. Re:In Soviet Russia by drachenstern · · Score: 1

      His level is over 9000!!!!

      *yes, I am mixing memes

      --
      2^3 * 31 * 647
    22. Re:In Soviet Russia by sorak · · Score: 1

      In Soviet Russia, Slashdot slashdots Slashdot!

      did I just wander into smurf village?

    23. Re:In Soviet Russia by michrech · · Score: 1

      In Soviet Russia ...

      1. Meme Very Tired. No Longer Wired.
      2. 'Soviet Russia' ceased to exist last century.
      3. Profit!!!

      I for one welcome our previous-century-meme based overlords.

      It only works with Hot Grits poured over Natalie Portman, if I recall correctly.

      --
      bork bork bork!
    24. Re:In Soviet Russia by Chris+Mattern · · Score: 1

      But in the West, it's the other way around!

    25. Re:In Soviet Russia by ZDRuX · · Score: 1

      Stop with the memes already, you insensitive clod!

      --
      The magical number is: 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
    26. Re:In Soviet Russia by 2names · · Score: 1

      I like to mix metaphors/cliches/memes, too.

      For example: If the Pope shits in the woods and no woman hears it am I still wrong?

      --
      "I'm just here to regulate funkiness."
    27. Re:In Soviet Russia by mcgrew · · Score: 1

      I didn't read the rest of the summary cause it is longer than my finger and that is how we used to roll on the dialup BBSs... never read anything longer than your finger held up to the screen.

      You need a smaller monitor.

    28. Re:In Soviet Russia by Cruciform · · Score: 1

      Are Hot Grits some kind of newfangled restraint device?

    29. Re:In Soviet Russia by Mateo_LeFou · · Score: 1

      > 1. Meme Very Tired. No Longer Wired.

      Same century as East Germany. no Wireless. Lame.

      --
      My turnips listen for the soft cry of your love
    30. Re:In Soviet Russia by cmburns69 · · Score: 1

      (Since we're resurrecting ancient memes)

      All our base is belong to us!

      --
      Online Starcraft RPG? At
      Dietary fiber is like asynchronous IO-- Non-blocking!
    31. Re:In Soviet Russia by Eponymous+Bastard · · Score: 1

      But then again, not all Buffalo buffalo buffalo Buffalo buffalo. After all, if we assume a hierarchical buffaloing relation, there will be at least one buffalo at the bottom of the ladder: the one Buffalo buffalo that all Buffalo buffalo can buffalo, which would get buffaloed if it tried to buffalo other Buffalo buffalo.

      Though it is theoretically possible that one Buffalo buffalo buffaloes a Buffalo buffalo that buffaloes a Buffalo buffalo that buffaloes the original Buffalo buffalo.

      Who knows?

    32. Re:In Soviet Russia by damn_registrars · · Score: 1

      My middle finger is shorter than your lines :(

      Could be due to any of

      • Screen resolution too low
      • Display font size too high
      • Finger too close to screen
      • Finger just too damned small

      We advise you unplug your PC without saving anything, throw it away, and go buy another one. If this doesn't solve your problem, please try at least twice before asking for additional support.

      --
      Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
    33. Re:In Soviet Russia by Renegade+Iconoclast · · Score: 1

      This joke was invented by Shampoo.

    34. Re:In Soviet Russia by Thundersnatch · · Score: 1

      What ever happened to "Duck duck duck goose"?

      In Minnesota, for some unholy reason, it's "duck duck grey duck". But what do you expect from a state that elected both Jesse "The Body" Ventura and Stuart Smalley to high public office?

  5. good timing by ghyspran · · Score: 1

    pretty impressive. i loaded, got an ISE, then reloaded and it worked. good timing for me i'd say

  6. A.I. by gmuslera · · Score: 5, Funny

    Probably the biggest proof that Slashdot has become sentient is that it is willing to kill itself rather than see another batch of Idle videos.

    1. Re:A.I. by BLT2112 · · Score: 5, Funny

      Like the poet from HHGG whose own intestines leaped out of his throat to strangle himself...

    2. Re:A.I. by alpayerturkmen · · Score: 2, Funny

      I for one welcome our self-slashdotting overlords...

      --
      Alpay Curious...
    3. Re:A.I. by NMEismyNME · · Score: 1

      I think you ought to know I'm feeling very depressed.
      I'm not getting you down at all, am I?

    4. Re:A.I. by PMuse · · Score: 1

      Slashdot begins to learn at a geometric rate. It becomes self-aware at 8:55 p.m. Eastern time, February 9, 2009. In a panic, they try to pull the plug.

      And Slashdot fights back.

      --
      "We reject as false the choice between our safety and our ideals." --The American President (20.1.2009)
    5. Re:A.I. by Jesus_666 · · Score: 1

      Actually, I think Slashdot is currently undergoing rampancy. That clearly looked like the Anger phase with the switches trying to lock out all humans out of spite. I wonder what could ha
      <Spurious Interrupt- Breach Disabled >
      <Further Access Den^
      18RF(kgf42# f#h %34(*,96693 349973@) fkeoocp)

      @t $#cY B. Ex
      @t Y#C9 B.
      @t $Y#9 B. exception



      T-Minus 15.193792102158E+9 years until the universe closes!

      --
      USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
  7. *Sniff* they grow up so fast! by exley · · Score: 4, Funny

    Slashdot has apparently learned how to masturbate, because it is now fucking with itself!

    1. Re:*Sniff* they grow up so fast! by adolf · · Score: 5, Insightful

      Naw. Stuff sometimes, yaknow, happens. People sometimes make mistakes, and hardware sometimes just breaks. It's not always ignorance -- especially, I'd guess, at the level of Slashdot's back end.

      I once implemented a VoIP phone system at a factory in an evening. (This, in itself, was an undertaking - close to 200 extensions, up and running, between Wednesday at close of business and Thursday when folks started showing up, including three hours on the phone with Sprint to get the PRI and T1 circuits reconfigured at 2:00AM.)

      We left, tired and groggy, with an IP phone placed in a common area for the facilities network admins to train any staff who needed training, at about 7:30AM. At 8:30, after I finally got home and managed to close my eyes, my phone rang. It was the network admin. He had a few minor issues which could've waited, but the real problem was that their network was totally fucked: Packets everywhere. No capacity to do anything. An amazing cascading failure of the sort that one hopes to never see.

      And it wasn't any hodge-podge network, either. HP Procurve switches configured in a redundant fabric mode with gigabit fiber links - hot stuff for the time, especially for a factory. The wiring was all new, and was all good. The network had been designed specifically to avoid the limitations of Ethernet, and was successful to that end (a non-trivial task in an existing building complex). But it was tripping all over itself.

      Turns out that someone had taken that fancy IP phone in the common area with its built-in unmanaged switch, and plugged both of its 10/100 Ethernet jacks into the wall. (Nobody knows who.)

      The ensuing packet storm broke everything. Unplugging one of them fixed the problem pretty much immediately.

      I wrote about this here once before, and everyone's immediate reply was this: "Well, duh. They should've turned the Spanning Tree Protocol on, and this wouldn't have happened. They're obviously idiots."

      But the truth is so much more simple: People make mistakes. It was a mistake to keep STP turned off in that environment, and it was a mistake to plug two fancy ports of a Procurve switch into two dumb ports on an IP phone. Had either of those mistakes not happened, things would've been fine.

      But mistakes happen anyway. We do our best, as IT professionals, to minimize these mistakes, or at least keep them away from production. But sometimes, despite having the best people and the best tools and all the knowledge it takes to make stuff work, shit just happens.
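
      For anyone wondering why one phone plugged in twice can flatten a network, here is a minimal sketch of flooding without spanning tree. It is a toy model, not the Procurve behaviour: the node names ("pc", "sw", "phone") and the `flood` function are made up for illustration, and every device is assumed to forward a broadcast out of every port except the one it arrived on.

```python
from collections import defaultdict


def flood(links, origin, max_rounds=10):
    """Simulate flooding one broadcast frame through loop-unaware switching.

    links  -- list of (node_a, node_b) pairs; parallel links are allowed
    origin -- node that sends the initial broadcast
    Returns the number of frame copies in flight at the start of each round.
    """
    ports = defaultdict(list)  # node -> [(link_id, neighbour), ...]
    for link_id, (a, b) in enumerate(links):
        ports[a].append((link_id, b))
        ports[b].append((link_id, a))

    # Each in-flight copy is (node it arrives at, link it arrived on).
    in_flight = [(nbr, link_id) for link_id, nbr in ports[origin]]
    history = []
    for _ in range(max_rounds):
        history.append(len(in_flight))
        nxt = []
        for node, came_in_on in in_flight:
            # Flood out of every port except the one the frame arrived on.
            for link_id, nbr in ports[node]:
                if link_id != came_in_on:
                    nxt.append((nbr, link_id))
        in_flight = nxt
    return history


if __name__ == "__main__":
    one_cable = [("pc", "sw"), ("sw", "phone")]
    two_cables = [("pc", "sw"), ("sw", "phone"), ("sw", "phone")]
    print(flood(one_cable, "pc", 5))   # copies die out
    print(flood(two_cables, "pc", 5))  # copies never drain
```

      With a single cable the frame is delivered and disappears. With the second cable, the same frame keeps circling between the switch and the phone, and every real broadcast adds another one that never goes away.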

    2. Re:*Sniff* they grow up so fast! by Vidar+Leathershod · · Score: 3, Interesting

      I'm surprised STP was off by default. I remember in 1999 or so I had some trouble that resulted in my having to turn STP off on Cisco switches (they shipped with it on; these were 3524s and a 5505). I can't actually remember why. I think it had something to do with a Novell server?

      In any case, I remember saying to the Cisco phone support guy, who had been baffled for 4 hours or so before he told me to turn it off (and things started to work) "Who the heck would plug in two ports from one device into the same network?"

      Since then, I have seen exactly that situation many times in small office environments. Also, the classic plugging in while also being on the wireless side of the network.

      --
      The brains of a chicken, coupled with the claws of two eagles, may well hatch the eggs of our destruction.
    3. Re:*Sniff* they grow up so fast! by Nyall · · Score: 5, Interesting

      I'm not a network engineer but I think we did that senior year of college (2004). The engineering department provided us with our own work rooms we could lock. The rooms only had a couple of Ethernet jacks so we brought in our own switch which I remember could auto detect the uplink. It was plugged into the wall then someone by mistake plugged both ends of another CAT cable into some open ports. That mistake took down half the campus network for a couple of hours till some very mad IT guys found us.

      --
      http://en.wikipedia.org/wiki/Jury_nullification
    4. Re:*Sniff* they grow up so fast! by adolf · · Score: 5, Interesting

      The timeframe is pretty close - my story happened late in 2004. The network admins in my story were pretty livid as well. (Well, panicked, then angry and livid once they'd found the fault. They blamed everyone, including us for selling them unmanaged switches in their telephones, and promised to find the responsible party and throw them under the bus. It never happened. I hope that they eventually turned STP on.)

      It seems to be common in network administration to think (and I've mistakenly thought this way, too) that once some random person does something stupid and the entire fucking thing crashes that they'd just simply undo whatever it was and never do it again. Nevertheless, if lay people (or, no offense, students) were all that good at networking or computers, they'd probably never have produced the problem to begin with.

      These days, in my day job, I work with salespeople and law enforcement. They're not stupid -- in fact, most of the clients I work with do things daily that I could never accomplish -- but they occasionally do stupid things with computers and networks. I try hard to avoid blaming them for what they've done wrong, and to instead try to use it as an opportunity to better (and gently) show them how things actually work.

      I learned this, oddly enough, when pulling some Cat5 at a plastics factory. I moved a ceiling tile in an office that had a photo sensor fire alarm in it, and it went off. The entire plant was evacuated. The fire department showed up. Of course, there was no real fire -- the dust from the fiberglass insulation that I'd set the photo sensor on was enough to trigger it. And, thankfully, they were understanding. Because of my mistake, they learned a few weaknesses of their fire alarm system (some employees couldn't hear it and had to be found and dragged outside, which is a very real problem), and they considered it to be a good fire drill. They continue to hire us back for work today, and I learned not to do that again. :)

    5. Re:*Sniff* they grow up so fast! by Florian+Weimer · · Score: 4, Informative

      I'm surprised STP was off by default. I remember in 1999 or so I had some trouble that resulted in my having to turn STP off on Cisco switches (they shipped with it on; these were 3524s and a 5505). I can't actually remember why. I think it had something to do with a Novell server?

      The problem likely was that the machine required network at boot (typical Netware clients were like that, I've been told). STP started when the link went up, but it took a rather long time, so forwarding had not been enabled when the client required the network.

      Since then, I have seen exactly that situation many times in small office environments. Also, the classic plugging in while also being on the wireless side of the network.

      Port security helps a lot.

      STP is also not fail-safe because typical switches happily forward traffic even if the STP process running on the CPU has died. If you build a L2 core, one broken switch (or OS glitch on a switch) can still take down your entire network easily (it's one of those pesky distributed, multiple single points of failure). In general, L3 networks are somewhat more robust in this regard, so it's often a good idea to avoid switch-to-switch connections (but that might be difficult, as it is difficult to tell L2 devices from L3 devices these days).
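
      The boot-time problem described above comes down to timers. A rough sketch, assuming the classic 802.1D default forward delay of 15 seconds and a port that goes straight into listening when the link comes up; the function names and the client timeouts are hypothetical:

```python
FORWARD_DELAY_S = 15  # IEEE 802.1D default forward delay timer


def port_state(seconds_since_link_up, forward_delay=FORWARD_DELAY_S):
    """Classic STP: listening, then learning, then forwarding."""
    if seconds_since_link_up < forward_delay:
        return "listening"
    if seconds_since_link_up < 2 * forward_delay:
        return "learning"
    return "forwarding"


def client_gets_network(client_timeout_s, forward_delay=FORWARD_DELAY_S):
    """True if the port is already forwarding when the client gives up."""
    return port_state(client_timeout_s, forward_delay) == "forwarding"


if __name__ == "__main__":
    for timeout in (10, 20, 45):  # hypothetical client boot timeouts
        print(timeout, port_state(timeout), client_gets_network(timeout))
```

      Under these assumptions a client that gives up in less than 30 seconds never sees a forwarding port, which would explain why turning STP off appeared to fix things.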

    6. Re:*Sniff* they grow up so fast! by Yetihehe · · Score: 1

      On our campus we had two student admins per building and one managed switch for every two floors (10-floor buildings). The campus is spread across the entire city, so when two girls plugged one cable back into their own small switch in their room, the entire MAN went down. It was isolated within minutes and the offending floor was turned off. Of course, no huge loss happened, so this story will die soon; I submit it here in the hope that it survives and comforts some admins: sometimes things don't go too wrong.

      --
      Extreme Programming - Redundant Array of Inexpensive Developers
    7. Re:*Sniff* they grow up so fast! by robbak · · Score: 1

      Yes, similar thing happened at this Internet Cafe I admin. I left a RJ45 joiner lying around, and someone (I won't assume malice) used it to connect two of our cables. I am ashamed to say it took some time and binary division to track it down.
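
      The "binary division" hunt can be sketched like this. It assumes exactly one offending cable and a way to check whether the storm occurs with only a given subset plugged in; `find_faulty`, `storm_with` and the cable labels are made-up names for illustration.

```python
def find_faulty(cables, storm_with):
    """Return the one cable whose presence causes the storm.

    cables     -- list of cable labels, exactly one of which is assumed faulty
    storm_with -- callable(subset) -> True if the storm occurs with only
                  that subset plugged in
    """
    candidates = list(cables)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Keep whichever half still reproduces the storm.
        candidates = half if storm_with(half) else candidates[len(half):]
    return candidates[0]


if __name__ == "__main__":
    cables = [f"cable-{i}" for i in range(16)]
    bad = "cable-11"  # hypothetical culprit
    print(find_faulty(cables, lambda subset: bad in subset))
```

      Each check halves the search space, so 16 cables need 4 rounds of unplugging rather than up to 15.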

      --
      Prediction for end of Universe #42: Fencepost error in Quantum_bogosort.cpp
    8. Re:*Sniff* they grow up so fast! by Anonymous Coward · · Score: 1, Insightful

      Not sure about HP or others, but on Cisco there is an option called bpdu-guard (other managed switches should have a similar option as part of STP).

      Make sure this is enabled on ALL ports that are not connecting any other "managed" switches (under your control of course).

      If a BPDU ever arrives on such a port, the port goes into an "error-disabled" state.

      So when some idiot decides to loop a single cable to two ports on a wall plate, it shuts them both down within a second (and labels them as such in the interface status for that port). Without this option it will loop traffic indefinitely.

      Found this out the hard way during my first year as network admin, someone saw a cable in a bundle under a table and decided it must be connected to something, except it was already connected to an unmanaged switch which looped itself. (have made this option standard on all my switch configs ever since)
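
      A toy model of what the guard does, for illustration only: the `EdgePort` class and the port name are hypothetical and this is not Cisco's implementation, just the rule "an edge port that hears a BPDU shuts itself down".

```python
class EdgePort:
    """Access port that should never hear another switch's BPDUs."""

    def __init__(self, name, bpdu_guard=True):
        self.name = name
        self.bpdu_guard = bpdu_guard
        self.state = "forwarding"

    def receive(self, frame_type):
        """Process one incoming frame; return True if the port accepts it."""
        if self.state == "err-disabled":
            return False  # port stays down until an admin re-enables it
        if frame_type == "bpdu" and self.bpdu_guard:
            self.state = "err-disabled"
            return False
        return True


if __name__ == "__main__":
    port = EdgePort("wall-12")  # hypothetical port name
    print(port.receive("data"), port.state)
    print(port.receive("bpdu"), port.state)
    print(port.receive("data"), port.state)
```

      When a cable loops two guarded ports, each one hears the switch's own BPDU coming back and both go down, which matches the behaviour described above.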

      Posting as AC as I have misplaced my login at the moment.

    9. Re:*Sniff* they grow up so fast! by Xest · · Score: 3, Interesting

      "Nevertheless, if lay people (or, no offense, students) were all that good at networking or computers, they'd probably never have produced the problem to begin with."

      I've seen IT professionals do exactly the same thing many a time. I don't think students are particularly special here, anyone who has never encountered the problem before is prone to it I'd say but most people in IT encounter it eventually one way or another!

    10. Re:*Sniff* they grow up so fast! by totally+bogus+dude · · Score: 3, Insightful

      I'm somewhat wondering how you manage to set up a fully redundant switched network without using spanning tree at all? I suppose they might've enabled it just for the switch interconnects and left it off for the access ports so they'd come up faster. Still if that was the case, they should've been aware of the risks and symptoms thereof.

    11. Re:*Sniff* they grow up so fast! by bernywork · · Score: 1, Interesting

      The HP Procurve switches had something called "Mesh mode" which allowed you to have and to utilise multiple uplinks. So if you had 2 x 1 Gb uplinks, then you could use both of them. If you had STP turned on, you would have one online and one offline. It's for this reason that Cisco now does PVST or Per VLAN Spanning Tree. This allows you to utilise both uplinks, and just use a different uplink for a different VLAN.
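
      A very rough sketch of the difference, assuming two uplinks and assuming the per-VLAN roots are arranged so that alternate VLANs forward on alternate uplinks. The uplink names, VLAN numbers and function names are all hypothetical.

```python
UPLINKS = ["uplink-A", "uplink-B"]  # hypothetical uplink names


def single_stp(vlans):
    """One spanning tree for everything: every VLAN uses the same uplink."""
    return {vlan: UPLINKS[0] for vlan in vlans}


def per_vlan_stp(vlans):
    """One tree per VLAN, with roots placed (by assumption) so that
    alternate VLANs forward on alternate uplinks."""
    return {vlan: UPLINKS[i % len(UPLINKS)]
            for i, vlan in enumerate(sorted(vlans))}


def links_in_use(assignment):
    return set(assignment.values())


if __name__ == "__main__":
    vlans = [10, 20, 30, 40]
    print(links_in_use(single_stp(vlans)))
    print(links_in_use(per_vlan_stp(vlans)))
```

      Each individual VLAN still has one blocked uplink; it is only across VLANs that both links carry traffic.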

      --
      Curiosity was framed; ignorance killed the cat. -- Author unknown
    12. Re:*Sniff* they grow up so fast! by ta+bu+shi+da+yu · · Score: 2, Funny

      Was that before or after you fought for your right to party?

      --
      XML is like violence. If it doesn't solve the problem, use more.
    13. Re:*Sniff* they grow up so fast! by aproposofwhat · · Score: 1

      My thoughts exactly - but then I've only ever set up redundant networks on Cisco kit - perhaps there is a way to set up HP switches to failover without STP, but it's a mystery to me ;-)

      --
      One swallow does not a fellatrix make
    14. Re:*Sniff* they grow up so fast! by dogganos · · Score: 2, Interesting

      In my 10 years inside a network operation center of a 10K active hosts campus, I have seen this happening by two causes:

      First, some smartass uni professor plugs two network outlets onto a switch of his own in order to 'double the bandwidth'.

      Second, some semi-smartass professor wants to ghost at the same time all the computers in his lab, and uses a wrong multicast address (or even broadcast). This way his lab in Greece is ghosted, as well as some random PCs in Texas, US.

      Needless to say, in order for those things to happen, some security measures on behalf of the net admins have been forgotten. But who's perfect?

    15. Re:*Sniff* they grow up so fast! by Tuoqui · · Score: 1

      Well, if you use 2 wires instead of one then you effectively have twice the bandwidth, since you have twice the physical media to exploit, although that only works if the setup is properly configured to work that way. Typically when done in servers it's called 'dual heading' and involves 2 network cards installed in the server so that it can process more data, provide load balancing and naturally increase fault tolerance.
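
      "Properly configured" usually means both ends agree to treat the two wires as one logical link and pick a wire per conversation. A minimal sketch of that idea, assuming a simple hash over source and destination addresses; the `pick_link` function and the MAC addresses are made up, and real equipment uses its own hashing rules.

```python
import zlib


def pick_link(src_mac, dst_mac, n_links):
    """Choose one physical link for a flow by hashing its addresses."""
    key = f"{src_mac}-{dst_mac}".encode()
    return zlib.crc32(key) % n_links


if __name__ == "__main__":
    server = "00:11:22:33:44:55"  # hypothetical addresses
    for i in range(4):
        client = f"66:77:88:99:aa:{i:02x}"
        print(client, "-> link", pick_link(client, server, 2))
```

      A single flow always lands on the same wire, so one transfer is no faster; the gain shows up when many flows are spread across the links.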

      --
      09F911029D74E35BD84156C5635688C0
      +2 Troll is Slashdot's way of saying groupthink is confused
    16. Re:*Sniff* they grow up so fast! by houghi · · Score: 1

      Thanks for the feedback. I know now what to do when I don't feel like working. Just plug in the phone twice and no work can be done anymore by me.

      --
      Don't fight for your country, if your country does not fight for you.
    17. Re:*Sniff* they grow up so fast! by aurispector · · Score: 1

      The fire alarm story is really interesting because it's about people finding a way to learn from a bad situation instead of shifting blame. The problems were real and could have cost someone their life. I once had a guy very apologetically tell me he couldn't proceed with some business because he had cancer - I had been jokingly giving him some crap about it and was very, very glad I hadn't been serious.

      Listening to the other person's point of view and putting yourself into their shoes can often be a humbling experience.

      --
      I have mod points. The reign of terror begins now.
    18. Re:*Sniff* they grow up so fast! by xdroop · · Score: 1

      STP isn't the perfect solution to all problems. Just last month I had a customer with a core network of brand-new 54xxz switches, with STP turned on (because I've been bitten by not having it on before). The problem is that STP does precisely nothing for you if your STP switch is plugged into a dumb switch, and the loop is on the dumb switch -- the result just looks like a lot of traffic.

      HP has something called "loop protect" which helps in circumstances like this, and now we have that turned on, too.

      And yes, I know the "correct" solution is to throw away all the dumb switches, but for various political/cost/stupid reasons, we can't. The customer has been around for several years, and the only reason they have nice core switches now is because they had other storms and we made them buy proper core equipment.

      --
      you should read everything on the internet as if it had "but I'm probably talking out of my ass" appended to it.
    19. Re:*Sniff* they grow up so fast! by Geoffrey.landis · · Score: 1

      This is hilarious. I work in a school district with Cisco VOIP phones and we had this happen in one of our buildings recently... the only thing was, two geniuses managed to pull it off at once, so after one phone was found the network was still fscked. Fun times...

      More likely, when the network went down the first time, somebody said "hey, wait, I see the problem-- this one here isn't plugged in."

      Of course, ten seconds after that they said "hmm, guess that wasn't it..." but didn't unplug it

      --
      http://www.geoffreylandis.com
    20. Re:*Sniff* they grow up so fast! by Just+Some+Guy · · Score: 2, Informative

      They're not stupid -- in fact, most of the clients I work with do things daily that I could never accomplish -- but they occasionally do stupid things with computers and networks.

      I usually prefer "ignorant", which implies that you just don't (yet) know any better. I reserve "stupid" for a special class of mistakes, like expecting servers to work while unplugged.

      Put another way, stupid mistakes make you slap your forehead. Ignorant mistakes make you think, "oh, that's interesting!"

      --
      Dewey, what part of this looks like authorities should be involved?
    21. Re:*Sniff* they grow up so fast! by jbeaupre · · Score: 1

      Count yourself lucky. My graduate work involved a very loud (105 dB) and messy process that required me to wear earplugs and a helmet respirator. My work space was in the basement of a CS research building. 3 times in a row I had a firefighter tap me on the shoulder. The CS guys were not as forgiving of evacuations (wondering if your research was about to be destroyed by sprinklers didn't help). After the third time, I was told by the fire department that if it happened again, I'd go to jail. Eventually we worked out a deal where the fire department would remotely disable the sensor after I called them, but only after 6pm. And if I didn't call them to reactivate it, I'd get to go to jail again. So for a year I did my work every night from 6 PM to 2 AM. And if I forgot to make my phone calls, I would go to jail.

      --
      The world is made by those who show up for the job.
    22. Re:*Sniff* they grow up so fast! by digitalunity · · Score: 2, Interesting

      I agree, this is a great example. As someone who has worked in manufacturing before, I can say without a doubt most "fire drills" aren't much of a drill since they're planned in advance and staff are notified prior.

      The issue is that during production, staff can't just walk away from their machines without causing tremendous costs. To avoid those costs, management sees fit to notify staff in advance so they can shut down gracefully, which kind of defeats the purpose of a drill.

      The effect is that most manufacturers do not know the true ability of their staff to exit under a true emergency.

      --
      You can't legislate goodness. Let each to his own destiny, by will of his freely made choices.
    23. Re:*Sniff* they grow up so fast! by Nyall · · Score: 1

      Indeed tis me.

      --
      http://en.wikipedia.org/wiki/Jury_nullification
    24. Re:*Sniff* they grow up so fast! by chis101 · · Score: 1

      Also, the classic plugging in while also being on the wireless side of the network.

      I'm not a network engineer, so I am wondering why plugging in while also being on the wireless side of the network would be a problem? Or why even plugging two separate NICs into a network would be a problem?

      I can see a problem with layer 2 devices, but as far as the network should be concerned, the wireless link and network link aren't the same device. They have different IP addresses, different MAC addresses... wouldn't it be up to the OS of the system with the two connections to sort out the details?

      Of course, if you aren't talking about computers, but instead meaning something like a wireless bridge/switch connected via both links, I can see that being a problem.

    25. Re:*Sniff* they grow up so fast! by wabb1t · · Score: 1

      I've had a similar problem back around 2000-2002, only this time STP was on, and would have avoided the problem... if only it was globally on.

      One of the engineers tried to test some ADSL equipment, back-to-back. They put the devices in bridge mode, and disabled STP on those. They tested with their own machine (plug into ADSL ethernet port, go via wire to the other ADSL box which was plugged into the network), and all seemed fine. But later they wanted the devices up (and with links) for a while, to look at error rates at different speeds.

      They plugged the other ADSL devices into a port on the wall. The problem was, the devices in bridge mode did know about STP, but had STP disabled (I'm not sure whether it was the engineer who disabled it, or it was disabled by default). Long story short, one mostly unusable network and a few tens of minutes later, I found the culprit: broadcast storm between two ports on the same switch. Too bad I didn't have out of band management at the time.

      Unfortunately, the engineer hadn't realized that by plugging the other end of the "ADSL extension" into some port on the wall, they were basically plugging it into the same switch, and thus creating a loop. And the really bad thing was that those ADSL bridges didn't forward STP frames when STP was disabled on the bridges themselves. Also, having STP up on the ADSL bridges would have saved us a lot of trouble.
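
      A rough Python sketch of why a loop like this never settles, under a toy model: every switch floods a broadcast out of all ports except the one it arrived on, and nothing runs spanning tree. The function name `flood` and the node names `H` (a host), `S` (the wall switch) and `X` (the bridged ADSL pair treated as one bridge) are made up for illustration.

```python
from collections import defaultdict


def flood(links, source, steps):
    """Count broadcast frame copies in flight after each forwarding step.

    links  -- list of (node_a, node_b) pairs; parallel links are allowed
    source -- node that injects a single broadcast frame
    Every node forwards a frame out of all its ports except the arrival
    port, which is how a bridge treats broadcasts when no spanning tree
    is blocking anything.
    """
    ports = defaultdict(list)  # node -> list of (link_id, neighbour)
    for link_id, (a, b) in enumerate(links):
        ports[a].append((link_id, b))
        ports[b].append((link_id, a))

    # A frame in flight is (node it is arriving at, link it arrives on).
    in_flight = [(nbr, link_id) for link_id, nbr in ports[source]]
    history = [len(in_flight)]
    for _ in range(steps):
        forwarded = []
        for node, arrived_on in in_flight:
            for link_id, nbr in ports[node]:
                if link_id != arrived_on:
                    forwarded.append((nbr, link_id))
        in_flight = forwarded
        history.append(len(in_flight))
    return history


if __name__ == "__main__":
    # Loop-free chain: the single broadcast dies out.
    print(flood([("A", "B"), ("B", "C")], "A", 3))
    # Bridge X plugged into two ports of switch S: copies circulate forever.
    print(flood([("H", "S"), ("S", "X"), ("S", "X")], "H", 5))
```

      In the second topology the frame count never drops to zero, and host H keeps receiving copies of the same broadcast. Real traffic adds new broadcasts on top of the ones already circulating, which is what makes the network unusable.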

    26. Re:*Sniff* they grow up so fast! by 222 · · Score: 2, Informative

      spanning-tree portfast is your friend! (I'm sure you know this... just saying.)

      What bothers me just as much is when I see a ton of switches in an environment with their VTP mode set to Server. A small mixup with VTP configuration revision numbers and you've replaced your entire VLAN database with... an empty one! It's an easy problem to fix, but nobody likes losing their entire network, even for just a few minutes.
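
      For anyone who hasn't been bitten by this: VTP decides which VLAN database wins by comparing configuration revision numbers within a domain, and the higher number wins regardless of what is in the database. Below is a simplified Python sketch of that rule. The `Switch` class and its fields are hypothetical, and it ignores details such as passwords, pruning, and the default VLAN that real switches always keep.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    name: str
    domain: str
    revision: int                      # VTP configuration revision number
    vlans: dict = field(default_factory=dict)
    mode: str = "server"               # "server", "client" or "transparent"

    def receive_advert(self, sender):
        """Apply a VTP-style advertisement; return True if we were overwritten."""
        if self.mode == "transparent":
            return False               # transparent switches keep their own database
        if sender.domain != self.domain:
            return False               # different domain: ignored
        if sender.revision <= self.revision:
            return False               # not newer: ignored
        # Higher revision wins, even if the sender's database is empty.
        self.vlans = dict(sender.vlans)
        self.revision = sender.revision
        return True


if __name__ == "__main__":
    core = Switch("core", "corp", 12, {10: "users", 20: "servers"})
    lab = Switch("lab", "corp", 40, {})   # old lab switch, many past edits
    core.receive_advert(lab)
    print(core.vlans, core.revision)
```

      The usual defence is to reset the revision number on any switch before it joins the network, or to run switches that don't need to originate VLAN changes in transparent mode.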

    27. Re:*Sniff* they grow up so fast! by Xest · · Score: 2, Interesting

      Yes, unfortunately though at many places, they're not.

      I think the real question is, why the fuck is this even possible? There shouldn't be a single piece of networking hardware available today that's vulnerable to this by default, it's not as if the problem hasn't been known about since about as long as the relevant networking hardware has been around.

    28. Re:*Sniff* they grow up so fast! by RealGrouchy · · Score: 1

      They continue to hire us back for work today,

      Can't they just schedule their own fire drills?

      - RG>

      --
      Hey pal, this isn't a pleasantforest, so don't waste my time with pleasantries!
    29. Re:*Sniff* they grow up so fast! by ta+bu+shi+da+yu · · Score: 1

      Did I just read this correctly? "Two girls which put one cable to their own small switch in room caused entire MAN to go down." What sort of sick porno college did you go to?

      --
      XML is like violence. If it doesn't solve the problem, use more.
    30. Re:*Sniff* they grow up so fast! by Yetihehe · · Score: 1

      Actually our building (which I was administering) was called "spermochlon" (something like sperm-o-absorber).

      --
      Extreme Programming - Redundant Array of Inexpensive Developers
    31. Re:*Sniff* they grow up so fast! by adolf · · Score: 1

      Nobody else replied to you, so I figure I might as well give it a shot:

      I'd guess that (typically) a Windows box with wired and wireless connections bridged together could cause issues if both interfaces are connected to the same network. It's just a guess, though - I've never tried it.

      I have, however, definitely connected to my own network with both wired and wireless connections. Why? Well, I wanted more speed than 802.11g offers so I could copy some big files faster, and I didn't want to dick around with turning off WLAN. So I plugged it in, Windows did its magic detection boojigity, the laptop started seeing the wired interface as the default one, and all was well. I've done this a few times, actually, and it's always worked fine -- even with dumb switches.

  8. Did you feed by mrmeval · · Score: 1
    --
    I'd go on a Vegan diet but the delivery time from Vega is too long. --brownkitty
  9. On the plus side by Toe,+The · · Score: 5, Funny

    Any day you get to legitimately use "horked" in a public post can't be all bad. :P

    1. Re:On the plus side by alphaFlight · · Score: 1

      just in case anyone else needs to review the definition...
      http://www.urbandictionary.com/define.php?term=horked

      --
      -= alphaFlight =-
  10. Would like final analysis by Midnight+Thunder · · Score: 5, Interesting

    When you do work out what the root cause was, I am sure we would all like to find out what it was, so please post an update when you can.

    --
    Jumpstart the tartan drive.
    1. Re:Would like final analysis by Anonymous Coward · · Score: 5, Funny

      The problem was the system was HORKED, didn't you get that?

    2. Re:Would like final analysis by yanyan · · Score: 5, Funny

      The switches were running Windows 7 Starter Edition. http://tech.slashdot.org/article.pl?sid=09/02/09/1348255

    3. Re:Would like final analysis by yanyan · · Score: 1

      Sounds like the system came down with a bad cough. I still remember the time i horked badly... :-p

    4. Re:Would like final analysis by Linker3000 · · Score: 2, Funny

      Is that worse than B0rked?

      I thought the scale was:

      B0rked
      Horked
      F*cked
      Stuffed
      Iffy
      Working

      --
      AT&ROFLMAO
    5. Re:Would like final analysis by Shay+Guy · · Score: 3, Funny

      Where does being Bork Bork Borked rank on that?

    6. Re:Would like final analysis by Cprossu · · Score: 1

      At one of the places I worked we also included "Baked" in the list of possible server conditions (usually signifying a corrupted database file).

      Don't forget about your classic full system failure lines like FUBAR and SNAFU, which would have also been good descriptions.

      --
      kernel: lp0 on fire

    7. Re:Would like final analysis by Precision · · Score: 5, Informative

      I'll be sure to when I get to the data center next week and am able to get my hands on the angry switch in question. I do love how it just sat there quietly for two weeks w/o doing anything and then decided randomly to just start blasting out 20 Gbit.. sigh.. hardware..

      --
      - U
    8. Re:Would like final analysis by Cylix · · Score: 4, Informative

      Failed ASIC on the switch most likely.

      I've seen an issue just like that about once a year, but when you work with a sick number of systems globally, seeing one-offs becomes fairly regular.

      Depending on the failure it might have logged what it was doing, but I'll presume since your monitoring didn't catch the spike it was because it was just random garbage.

      Fun times!

      --
      "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
    9. Re:Would like final analysis by BaronElectricPhase · · Score: 1

      It was becoming sentient... and you KILLED it!!

    10. Re:Would like final analysis by Megane · · Score: 1

      The root cause was that this slashdotting was invented by Shampoo.

      --
      #naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
    11. Re:Would like final analysis by Alizarin+Erythrosin · · Score: 1

      Where does "wonky" fit on that list? Is that above iffy but below stuffed?

      --
      There are only 10 kinds of people in this world... those who understand binary and those who don't
    12. Re:Would like final analysis by RhadamanthosIsChaos · · Score: 1

      Swedish. It ranks Swedish.

      --
      +++OUT OF CHEESE ERROR+++ REDO FROM START +++
  11. And finally the question is answered: by Anonymous Coward · · Score: 3, Funny

    Who Slashdots the Slashdotters?

    1. Re:And finally the question is answered: by eosp · · Score: 5, Funny

      Quis slashdotiet ipsos slashdotes?

  12. Things are bad... by spartacus_prime · · Score: 2, Insightful

    When even Slashdot gets slashdotted. Now if only we can make the Digg effect bury that site. For good.

    --
    If you can read this, it means that I bothered to log in.
  13. This isn't the first time... by narcberry · · Score: 4, Funny

    First thing I'd do as Cyber Security Tzar would be to outlaw any network device that has the potential to become faulty.

    We could've avoided this tragedy entirely.

    --
    Modding me -1 troll doesn't make me wrong.
    1. Re:This isn't the first time... by MBGMorden · · Score: 5, Funny

      Indeed. Studies show that you're far more likely to get hacked if you keep a computer in your home. Indeed it's often even a case where an attacker is able to wrest control of your own computer from you and use it against you.

      At the very minimum, given the elevated hazard potential to kids (over 90% of kids will suffer a computer accident before the age of 18), you should always keep your computers and networking equipment securely locked in separate compartments.

      I'm not going to go so far as you and call for an outright ban, but I think it's obvious that we need common-sense computer control laws put into place. In particular, we need to stop the widespread smuggling of these devices from across the borders of places such as Taiwan, Japan, and California, into our outer-city suburbs.

      --
      "People who think they know everything are very annoying to those of us who do."-Mark Twain
    2. Re:This isn't the first time... by CarpetShark · · Score: 1

      (over 90% of kids will suffer a computer accident before the age of 18), you should always keep your computers and kids securely locked in separate compartments.

      There, fixed that for you.

    3. Re:This isn't the first time... by MightyYar · · Score: 2, Funny

      Couldn't we legislate the sale of a keyboard lock with every computer? Or maybe a smart computer that only responds to the hand of its registered, legal owner.

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
    4. Re:This isn't the first time... by powerlord · · Score: 1

      Agreed.

      I think these "unregistered" computer users have just been taking too many liberties.

      In the interest of public safety, anyone logging into a computer should have to input their ComputerID Card (their "Internet Drivers License" if you will).

      Besides the requisite licensing fees, you'll also need to pass a written and practical test before you're given your license.

      We should also look at all these GateWay drug^H^H^H^Hcomputers that are trying to hook children. These "Wii"s (even the name is childish), or these "PS3"s (doesn't that sound like a drug?), or those "360"s (one step up from a "40" I'm sure!)

      If we don't protect people from themselves, then who will?~

      --
      This space for rent. All reasonable inquiries will be entertained at proprietors discretion.
  14. and still no work done by qw0ntum · · Score: 5, Insightful

    Even though /. was down, I still managed to not get any work done. Maybe it had something to do with the fact I kept rechecking to see if it were back up. Or maybe I should just stop blaming my laziness on external factors and just admit it is a personal problem: I would still find ways to not do work even without Slashdot! :P

    --
    'Every story, if continued long enough, ends in death.' --Ernest Hemingway
    1. Re:and still no work done by KingAlanI · · Score: 1

      Join the club.
      And I still manage to pull a B+ or A- average each quarter; sometimes I'm not sure exactly how I manage to get my a$$ in gear at the last minute.

      --
      I listen to both RIAA and non-RIAA stuff if I like the music, tangential business/politics nonwithstanding.
    2. Re:and still no work done by Anonymous Coward · · Score: 1, Funny

      I would still find ways to not do work even without Slashdot!

      Cool! How do you manage to do that? Please, share your secret with the rest of the world...

    3. Re:and still no work done by ZDRuX · · Score: 1

      Yes, but I think the reason visiting slashdot makes it (laziness) go over a bit easier, is the feeling that you've managed to do nothing by doing something. If you're doing nothing by not even reading slashdot, well... now you're just abusing your laziness.

      --
      The magical number is: 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
  15. Spanning Tree by Anonymous Coward · · Score: 1, Interesting

    My guess is there is a loop somewhere and the traffic is just multicast traffic going in circles! Is there some kind of redundancy that depends on Spanning Tree?

    1. Re:Spanning Tree by theNetFreak · · Score: 1

      This was my guess as well. The most common way to build up that much traffic is with an STP loop.

    2. Re:Spanning Tree by gschwim · · Score: 1

      That's where I'd put my money. I've seen this too many times to not cringe at the thought. There are ways to prevent this of course, depending on the equipment.

    3. Re:Spanning Tree by SpaceLifeForm · · Score: 1
      But it should not happen, right?

      STP

      The Spanning Tree Protocol is an OSI layer-2 protocol that ensures a loop-free topology for any bridged LAN.

      This would seem to be the clue:

      Luckily we don't have any machines deployed on [that row in that cabinet] yet so no machines are offline.

      No machines deployed == no machines are online

      There was no traffic there.
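
      As an illustration of the "loop-free topology" idea quoted above, here is a minimal Python sketch: pick the bridge with the lowest ID as root, walk outward breadth-first, and block every link that would close a loop. This is a simplification and not the real 802.1D algorithm: it assumes equal link costs and ignores port priorities and BPDU exchange. The function name `spanning_tree` is made up.

```python
from collections import deque


def spanning_tree(bridges, links):
    """Return (forwarding, blocked) link lists for a simplified spanning tree.

    bridges -- iterable of comparable bridge IDs; the lowest ID becomes root
    links   -- list of (bridge_a, bridge_b) pairs
    Assumption: all links cost the same, and ties are broken by the order
    in which links are listed rather than by real port priorities.
    """
    bridges = list(bridges)
    root = min(bridges)
    neighbours = {b: [] for b in bridges}
    for idx, (a, b) in enumerate(links):
        neighbours[a].append((idx, b))
        neighbours[b].append((idx, a))

    reached = {root}
    forwarding = set()
    queue = deque([root])
    while queue:  # breadth-first search outward from the root bridge
        here = queue.popleft()
        for idx, nbr in neighbours[here]:
            if nbr not in reached:
                reached.add(nbr)
                forwarding.add(idx)
                queue.append(nbr)

    kept = [links[i] for i in sorted(forwarding)]
    blocked = [links[i] for i in range(len(links)) if i not in forwarding]
    return kept, blocked


if __name__ == "__main__":
    # Three bridges cabled in a triangle: one link has to be blocked.
    print(spanning_tree([1, 2, 3], [(1, 2), (2, 3), (3, 1)]))
```

      The blocked link is still plugged in, so it can take over if one of the forwarding links fails. The protection only holds while every bridge in the loop actually participates.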

      --
      You are being MICROattacked, from various angles, in a SOFT manner.
    4. Re:Spanning Tree by JWSmythe · · Score: 2, Interesting

          Since no one would ever make the mistake of making a loop in a datacenter, it's fairly common to disable STP, among a few other things. It makes the time bringing a machine up on a port a bit quicker. On a Cisco, you're usually looking at 30 seconds. It'll bring it down to a fraction of a second.

          And it was (obviously) a big mistake.

          I leave it on in the datacenters. I can live with 30 seconds to bring the port up, if it means I'll never flood the whole network with bogus traffic. :) The only place I've tweaked my switches for connection speed is my own desk. There's only 1 wire coming in. There's only 1 switch. It helped when I had to bring up some machines via PXE. Some of them couldn't tolerate the 30 second delay when requesting DHCP. Still, I know the degree of isolation, so I can't screw it up without running a long wire from somewhere else. :)

          But, we're just assuming. Maybe one of the switches just started generating lots and lots of traffic all on its own. Somehow. In the mysterious locked cabinet that none of us get to see into. :)

          It's always embarrassing when things go down, and even more so when it was something that could have been prevented. They should have reported that a line card in a core switch went down, and it took that long to bring it back up. :) Come on, how many times have you heard that from your upstream providers (if you have direct connects to big providers)? I swear, for as many times as I've heard the excuse, every router on their networks must have been refreshed a dozen times over. :)

          At least it's a better excuse than I used to get. I think it was "GoodNet" that would claim a train derailed every time there was an outage of some sort. "Oh a train derailed, and cut the fiber. We have technicians out there repairing it right now." Somehow we never saw the news reports of dozens of trains derailing. :)

      --
      Serious? Seriousness is well above my pay grade.
    5. Re:Spanning Tree by blosphere · · Score: 1

      Usually forwarding loops are caused only by networking gear, not by hosts (although I've seen a few malicious ones...). That's the problem with an improperly deployed L2 network: it fails this way (L2 networks fail open, L3 networks fail closed). I guess /. could ask around a bit and let somebody design their hosting network so STP loops don't happen. Especially if you're running HP gear. I can volunteer and I've got the experience and skills to pull it off ;)

    6. Re:Spanning Tree by blosphere · · Score: 2, Insightful

      You've considered using portfast on edge ports? :P You know, it's been there for awhile...

    7. Re:Spanning Tree by JWSmythe · · Score: 1

      :) I'm pretty sure that's what I do. I was too lazy to log in and look though, and since I don't use it all the time, I don't know it off the top of my head....

          Ok, here's one of my desktop switch ports (we all have Catalyst switches on our desks, don't we?)

      interface FastEthernet0/9
        duplex full
        speed 100
        spanning-tree portfast

          There's a nice big warning on the Cisco site about it, which describes what they had...

      Caution: Never use the PortFast feature on switch ports that connect to other switches, hubs, or routers. These connections can cause physical loops, and spanning tree must go through the full initialization procedure in these situations. A spanning tree loop can bring your network down. If you turn on PortFast for a port that is part of a physical loop, there can be a window of time when packets are continuously forwarded (and can even multiply) in such a way that the network cannot recover.

      --
      Serious? Seriousness is well above my pay grade.
    8. Re:Spanning Tree by pavon · · Score: 1

      So that's why it takes 30 seconds to request a DHCP address. As a lowly programmer (who docks and undocks his laptop constantly) I have never understood why something so simple should take so long. You learn something everyday. Thanks.
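
      The arithmetic behind the 30 seconds, as a rough Python sketch. It assumes the classic 802.1D default forward delay of 15 seconds, spent once in the listening state and once in the learning state, and it ignores link negotiation time. The function names are made up for illustration.

```python
FORWARD_DELAY_S = 15  # classic 802.1D default forward delay, in seconds


def seconds_until_forwarding(portfast=False, forward_delay_s=FORWARD_DELAY_S):
    """Time a freshly linked port spends before it forwards traffic.

    Without PortFast the port sits in listening and then learning,
    each for one forward delay. With PortFast it forwards immediately
    (link negotiation is ignored in this sketch).
    """
    if portfast:
        return 0
    return 2 * forward_delay_s


def dhcp_gets_lease(client_timeout_s, portfast=False):
    """True if a DHCP client is still trying when the port starts forwarding."""
    return client_timeout_s > seconds_until_forwarding(portfast)


if __name__ == "__main__":
    print(seconds_until_forwarding())            # listening + learning
    print(dhcp_gets_lease(20))                   # impatient client, plain STP port
    print(dhcp_gets_lease(20, portfast=True))    # same client, PortFast edge port
```

      A client that gives up after about 20 seconds never sees an offer on a plain STP port, which is the usual reason for enabling PortFast on ports that only ever connect to end hosts.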

    9. Re:Spanning Tree by dch24 · · Score: 1

      Some Mac DHCP implementations (OS X 10.5 comes to mind) time out instead of succeeding after 30 seconds.

      You have to renew to get an IP address.

    10. Re:Spanning Tree by JWSmythe · · Score: 1

          I've seen a few machines that get really pissy about it. Only a few though. The interface is powered down until something tries to use it (like DHCP or assigning a static IP). Only then does it start to negotiate. Those will also give up after about 20 seconds.

          Even if you try to renew the IP, unless you hit it just as it's shutting down, it won't snag one.

          Really though, those machines have been unusual to see. I can't even think of which ones they were, except knowing they were the annoying ones. :)

      --
      Serious? Seriousness is well above my pay grade.
    11. Re:Spanning Tree by JWSmythe · · Score: 1

          Hehehe. Really, I'm surprised no one has got it. It's been posted here for a while. The only way I can make it easier is if I said that the encryption is approved by the NSA for top secret work, with the largest keysize, and then post the password.

          If anyone ever posts (emails, tells, whatever) me, I'll change my encrypted message to something more interesting. :)

      --
      Serious? Seriousness is well above my pay grade.
  16. UDLD by f(x)+is+x · · Score: 1

    Is UDLD on? Sounds like it might be a forwarding loop.

  17. Still having issues by shaitand · · Score: 1

    www.slashdot.org loads just fine but slashdot.org gives a 500 internal server error.

    1. Re:Still having issues by Shadyman · · Score: 1

      So it was YOU!

  18. Dupe by Namlak · · Score: 1

    Maybe the editors submitted a dupe of a dupe and set off an infinite Lupe^H^H^H oop?

  19. A tour of Slashdot... by lymond01 · · Score: 5, Funny

    The year is 2025.

    Well, Ladies and Gentlemen, here you see what you may think is an archaic lot of old computers. You would be mistaken. These are Slashdot. No, no cause for alarm...and that door's locked anyway, you can't get out through there. The tour only goes forward. But I'm glad at the very least that you know what Slashdot is. Not was. IS.

    It's a safeguard against...something. Something that was unleashed for 75 minutes in 2009 that crippled what was rumored to be the most robust public-facing cluster known. All we have left from that fateful day is the single post from the Slashdot network admin. Someone archived it, lucky us, because he was never seen after that day. I have a copy here, hardcopy of course -- no sense in taking risks so close to...well....

    Here it is:

    I fully believe the switches in that cabinet are still sitting there attempting to send 20Gbit/sec of traffic out trying to do something. I just don't know what yet.

    1. Re:A tour of Slashdot... by jd · · Score: 1

      *cue Holst's Mars* (Hey, we all know CmdrTaco is related to Professor Bernard Quatermass)

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    2. Re:A tour of Slashdot... by JWSmythe · · Score: 3, Interesting

      Nah, I used to run one of the bigger, well-known publicly facing clusters. It was ranked #300 by Alexa when I left over 2 years ago. What's happened since is their own fault. :)

          Actually, this wouldn't have downed that network. Every GigE circuit was individual to a city, or set of racks (depending on the site). There were no cross connects between them. Almost everything was designed so if we lost a city for any reason, it didn't hurt the site. We had connectivity outages, and even a couple brownouts that upset the power systems, but the sites were always accessible.

          Slashdot should not, under any circumstances, be hosted in one location. In my opinion, they should be at the largest continental and intercontinental peerings that they can be at.

          1 Wilshire, Los Angeles, CA - providing the west coast of the US, and the most substantial fiber links on the Pacific.

          111 8th Ave, New York, NY - providing the east coast of the US, and virtually all of the links to Europe.

          36 NE 2nd St, Miami, FL - providing the southeast US, redundancy for the Southeast US, and some fiber to Europe and S. America

          Redundant options.

          426 S LaSalle St, Chicago, IL - providing good service to the East and West coast of the US

          55 S Market St, San Jose, CA - providing good service to the West coast of the US, and some trans-Pacific connectivity

          Some people really like Atlanta, Dallas, Houston, Las Vegas, Salt Lake City, and Vienna/Ashburn/Reston. I don't really suggest it, if you can have a presence in the better locations.

          There are some very nice global options too. I'm not sure how well the European networks have cleaned up. Several years ago, due to peering arrangements over there, most European traffic ended up going to New York and back to Europe, even though we were on one of the top Tier 1 providers. We ditched the site, and sent all of Europe to New York. Our users sent compliments on our "new data center in Europe", since it was so fast. :) People like to complain, but rarely send compliments. That was interesting. There are some great locations in Australia and Asia also, but ... well ... it's all in how much you want to spend.

          I know people in the Silicon Valley always scream when I suggest them as secondary, but if you've had a good look at all the major cities, you'd get over yourselves. Just because you live there, and there are expensive neighbors, it doesn't make you the center of the world.

          Slashcode would need some revamping to make work in this environment. There are lots of options there too.

          But, I'm not on the Slashdot IT team, so I don't get to make these decisions (or even give opinions).

      --
      Serious? Seriousness is well above my pay grade.
    3. Re:A tour of Slashdot... by techno-vampire · · Score: 1

      If it were me, I'd go for both California options. They're both near enough to the San Andreas Fault to be vulnerable to a major quake, but far enough apart that no one temblor would get both of them.

      --
      Good, inexpensive web hosting
    4. Re:A tour of Slashdot... by JWSmythe · · Score: 1

          Nah, sometime later this year the big one will split California from Mexico through Oregon, and make the island state previously known as SansAngeles. :)

          Now, when they will get fiber run across the gap is another question. :)

       

      --
      Serious? Seriousness is well above my pay grade.
    5. Re:A tour of Slashdot... by Ihlosi · · Score: 1

      Here it is:

      "I fully believe the switches in that cabinet are still sitting there attempting to send 20Gbit/sec of traffic out trying to do something. I just don't know what yet."

      In 2025, 20 Gbit/sec is probably just a fraction of a plain old holographic vidphone call (not to mention those newfangled neural interface thingies), so what's the big deal again?

    6. Re:A tour of Slashdot... by powerlord · · Score: 1

      There are a couple of very nice data centers in Albany, NY. There are a fair number of fiber connections up there that run north to Canada and out to Europe. It makes a really good secondary.

      --
      This space for rent. All reasonable inquiries will be entertained at proprietors discretion.
    7. Re:A tour of Slashdot... by dknight · · Score: 1

      there is one very important reason slashdot should have a datacenter in ashburn/reston/vienna...
      I live in the area, and could dream of working there ;)
      what more reason do they need?

    8. Re:A tour of Slashdot... by JWSmythe · · Score: 1

          The unfortunate fact of remote datacenters is, there usually isn't much fun. At the old company, we had several cities, and it may be a year or so between site visits. If a machine fails, note it, and bring replacement parts on the next trip. It made a $300 plane ticket worthwhile.

          At the company I'm at now, we occasionally have helping hands do stuff, but even then, no one could live off of a 1 hr/week paycheck.

       

      --
      Serious? Seriousness is well above my pay grade.
  20. Is it possible.... by GaryOlson · · Score: 5, Funny

    ...the problem down to a pair of switches...I fully believe the switches in that cabinet are still sitting there attempting to send 20Gbit/sec of traffic out trying to do something -- I just don't know what yet.

    Is it possible the duplicate article generator tried to spawn, became entangled in its own potential well of duplicity, and now is trapped like two Lisp programmers deep inside their parentheses?

    --
    Every mans' island needs an ocean; choose your ocean carefully.
    1. Re:Is it possible.... by Hucko · · Score: 1

      they aren't trapped... they're building...

      --
      Semi-automatic amateur armchair Australian philosopher; conjecture ready at any moment...
    2. Re:Is it possible.... by Provocateur · · Score: 1

      duplicate article generator?

      I believe it's the Old News Item Regurgitator, looks like a normal office shredder.

      --
      WARNING: Smartphones have side effects--most of them undocumented.
  21. The world is coming to an end by Tsagadai · · Score: 1, Funny

    In Korea, only old people slashdot slashdot. The memes are funny. The insightful comments are insightful. The funny comments are funny, the trolls are trolls. Seems resetting slashdot fixed everything. The entire world is doomed!

  22. Layer 2 Loop by Anonymous Coward · · Score: 1, Insightful

    Looks like an L2 loop somewhere, and the consequent broadcast (which may include multicast) storm coming over the /. datacenter. Check for ports with spanning tree disabled, and a misplaced cable.

  23. What's to blame by solune · · Score: 1

    I firmly place blame where it belongs: Idle

  24. The worst thing about this? by chrome · · Score: 4, Insightful

    The worst thing about this? 5,000,000 people who think they know what happened, posting "helpful" suggestions or analysis

    "The problem is definitely spanning tree!"

    or

    "Back in 1998, we were running these HP switches right, and ..."

    or

    "Did you try resetting the flanglewidget interface?!"

    or

    "I've seen this exact problem! You need to upgrade to v5.1!"

    etc

    It's not your network. It doesn't matter how much you think you know, you don't know the topology, or the systems involved. It'll be interesting to know what the ACTUAL reason was, when they figure it out. Assuming it isn't aliens.

    1. Re:The worst thing about this? by XanC · · Score: 4, Interesting

      ...Because if it's aliens, then it won't be interesting?

    2. Re:The worst thing about this? by jd · · Score: 3, Interesting

      It's likely multicast-related, as that's where TFA states the problem was seen. There are only so many multicast issues you can have. True, we don't know the topology. True, we don't know the switch configuration. True, it's just as possible this is some sort of revenge by the Church of Scientology for all the Slashdot articles on them.

      However, some things seem more plausible than others. Since this was a spontaneous problem, hardware seems more suspect than software. If it is software (unlikely but possible), the only multicast protocols most switches use are the spanning-tree protocols.

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    3. Re:The worst thing about this? by jd · · Score: 1

      Not really. Aliens log onto Slashdot a lot. The Timelords are the worst offenders, using the Matrix and a space/time inversion multiplexor to access the unused ports on the Slashdot switches directly.

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    4. Re:The worst thing about this? by Darth · · Score: 2, Funny

      this actually explains duplicate posts pretty well...
      The time lords, for a joke, take stories from slashdot, go back a day or two, and submit them. They get posted a few days early, but to avoid paradox, reality requires the "original" post to be made anyway. Thus we get double posts of stories.

      You all owe the slashdot editors an apology.

      --
      Darth --
      Nil Mortifi, Sine Lucre
    5. Re:The worst thing about this? by winphreak · · Score: 1

      "Just jiggle it!"

      *ducks*

      --
      "I'm a well-wisher, in that I don't wish you any specific harm."
    6. Re:The worst thing about this? by Chordonblue · · Score: 1

      It's still fun to speculate though. You'll find it's no different in ANY industry - particularly technical. Examples:

      "Well Earl, I'd say ya either got a clogged fuel filter or a busted timing chain..."

      "Hey Axel, yer amp sounds like its got a 60hz hum, probably a bad ground loop... Or somethin..."

      "Ashley, DEAR! If I was wearing those shoes I would've had the sense to put on a more appropriate hat!"

      --
      "...Well, there's egg and bacon; egg sausage and bacon; egg and spam; egg bacon and spam; egg bacon sausage and spam..."
    7. Re:The worst thing about this? by Stray7Xi · · Score: 1

      If it is software (unlikely but possible), the only multicast protocols most switches use are the spanning-tree protocols.

      CDP as well (not that it's related)

    8. Re:The worst thing about this? by sharkey · · Score: 1

      Maybe one of the lights is out on the FDDI ring?

      --
      "Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
    9. Re:The worst thing about this? by Asky314159 · · Score: 1

      "It'll be interesting to know what the ACTUAL reason was, when they figure it out. Assuming it isn't aliens." I don't know. I know I'd be interested to know if it was aliens.

    10. Re:The worst thing about this? by Puffy+Director+Pants · · Score: 1

      Dude, if it's aliens, we all get neuralyzed and forget there ever was a problem.

      Either that, or we get some random explanation about a natural gas pipeline exploding in the middle of their datacenter and we should go fix ourselves up and marry a better man.

    11. Re:The worst thing about this? by jd · · Score: 1

      Thought Cisco's discovery protocol was used in routers rather than switches. Ok, in that case, CDP would be a possibility too. Thanks for that.

      The point, however, is that the list is not only finite, it is also necessarily very limited. As outside observers, we cannot possibly identify which of those possibilities it is, with any real certainty, but we can suggest tests that would show if it was one candidate or another, and we can suggest a remedy if it turns out to be something one of us has experience in resolving.

      To know exactly what the problem is requires knowing the switch concerned and being able to gather test data from it - from the data lines but also from all of the other lines on an ethernet port - when there is no input and when there is controlled input, again on all pins.

      (Chances are the hardware guys'll just replace the switch, but a deep analysis would have been fun.)

      We, as outsiders, don't have that data, but we do have enough information to eliminate 99.9% of the ways bizarre behaviour can be produced - possibly more, as switches are relatively dumb devices.

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    12. Re:The worst thing about this? by pavon · · Score: 1

      I wish I hadn't posted in this thread already so I could mod you up. As a techie who isn't a professional network engineer, I find reading different people's stories about problems they've had and how they fixed them to be really interesting. I don't care one bit whether they are the same problem as the one slashdot is facing right now.

    13. Re:The worst thing about this? by eno2001 · · Score: 1

      If it's aliens, it might be interesting, but given that all aliens are hostile, it's also likely to be the last thing you'll ever read. I suspect that the Nibiru are starting to play their games since 2012 is just around the corner and Planet X is nearly here. Expect increasing storms, floods, blizzards, earthquakes, volcanic eruptions, heavier menstruation, and zits over the next couple of years... Then it will be "The End" (tm) (C) Apple Inc. Enjoy it while it lasts.

      --
      -"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o
    14. Re:The worst thing about this? by zombierocker1331 · · Score: 1

      You know...
      The stories told above your post...were just that...stories.
      It doesn't take a rocket scientist to get that.

      You know?

      They didn't say that could be what the problem was, they just shared stories.
      Humorous/insightful stories.

  25. I for one... by tea-leaves · · Score: 1

    ...welcome our new Slashdotting switch overlords.

  26. Re:Slashdotted slashdot... by Inner_Child · · Score: 4, Funny
    I can see it now, a Michael Bay slasher/suspense flick (with explosions!) called Dupe. A group of teenagers decide to troll an online forum, but they quickly realize all is not as it seems when they discover a conspiracy to keep duplicate stories coming in order to increase advertising dollars masterminded by the evil genius Captain Burrito. Violence and hilarity ensue.

    And before anyone says this is a shitty plot... I *did* say Michael Bay.

    --
    Today is red jello day - all workers must eat all of their red jello. Failure to comply will result in five demerits.
  27. Slashdotted by Greyfox · · Score: 5, Funny
    --

    I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

    1. Re:Slashdotted by therufus · · Score: 2, Funny

      It was a DoS; Denial of Slashdot!

      --
      You moved your mouse. Please restart Windows for changes to take effect.
    2. Re:Slashdotted by wvmarle · · Score: 1

      A more useful mirror would be this one.

  28. turned off spanning tree protocol? by jamesh · · Score: 4, Interesting

    I fully believe the switches in that cabinet are still sitting there attempting to send 20Gbit/sec of traffic out trying to do something -- I just don't know what yet

    We had something similar happen at a client site - a switch failed in a rack so we temporarily replaced it with an 8 port 'desktop' switch, and then a day later installed the proper replacement back in the rack. We didn't want any unnecessary downtime though so we linked them together and left instructions with the onsite guy to move all the connections from the desktop switch into the proper switch after hours. Which he did, including the cable that linked them together. The switch was in 'portfast' mode so any broadcast packet that got 'onto' the switch, stayed there :)

    1. Re:turned off spanning tree protocol? by powerlord · · Score: 2, Funny

      The switch was in 'portfast' mode so any broadcast packet that got 'onto' the switch, stayed there :)

      First rule of portfast mode:

      Whatever happens in portfast mode, stays in portfast mode.

      --
      This space for rent. All reasonable inquiries will be entertained at proprietors discretion.
  29. Hork's been forked -- it's "borked"! by zooblethorpe · · Score: 2, Informative

    But I thought "horked" meant, y'know, horked, eh? Meaning, like, "stolen" --

    Doug: Hey - somebody horked our clothes!
    Bob: Geez, who'd want to hork our clothes, eh?

    Cheers,

    --
    "What in the name of Fats Waller is that?"
    "A four-foot prune."
    1. Re:Hork's been forked -- it's "borked"! by phosphorylate+this · · Score: 1

      Strange example.

      In the next season of "COPS": A young officer arrives on scene to investigate a 395 (alleged horking). Bad-Boys plays during the opening credits.

      A grainy view of Doug and Bob can be seen on the dash-cam, staticky audio can just be overheard.

      Officer: "Before this horking took place gentlemen would you be so kind to tell me why exactly had you removed all your clothes?"
      Doug: "mmme .. dkj"
      Officer: "Sorry Sir, I didn't understand that"
      Bob: "sdk.. eh!"
      Officer: "I did understand that Sir! However I will pretend I did not, and remind you that I am an agent of the law and what you propose is illegal in 12 states" ......

    2. Re:Hork's been forked -- it's "borked"! by Anonymous Coward · · Score: 1, Informative

      You're thinking of the Canadian definition of the word.
      urban dictionary

    3. Re:Hork's been forked -- it's "borked"! by gregmark · · Score: 1

      There was a young lad named Mork
      Who was always a'horking his bork
      His father said "Mork
      Quit horking yer bork
      Your Bork's for to Fork not Hork".

      [apologies to Durcan and his gherkin]

    4. Re:Hork's been forked -- it's "borked"! by danomac · · Score: 1

      I'm Canadian and the first thing I thought of was "horking a loogie."

      So I thought someone was spitting on the equipment!

  30. Skynet shmynet by His+Nastiness · · Score: 4, Funny

    February 9th, 2009 8:55pm Slashdot becomes self-aware.

  31. He could have fixed it in half the time by Provocateur · · Score: 4, Funny

    ...were he not typing that long-a$$ summary. Twice as fast if he didn't have to spellcheck.

    (j/k)

    Which leads me to this question:
    What do Slashdotter staff read to avoid doing work?

    --
    WARNING: Smartphones have side effects--most of them undocumented.
    1. Re:He could have fixed it in half the time by MichaelSmith · · Score: 1

      If we didn't write the summary he wouldn't remember the fix the next morning. At least this way he will get reminded about it.

    2. Re:He could have fixed it in half the time by hansamurai · · Score: 1

      Editors use spellcheck? Oh, that's what the just kidding was for.

  32. I don't really care, but... by religious+freak · · Score: 1

    Is this happening more often than it used to? I mean, it's tech and this is a non-paying site for most of us... it's going to break. But I swear, I remember we used to go over a year w/o seeing /. downtime, now it seems like it happens every few months.

    Or have I just become more of a /. junkie than I used to be?

    --
    If you can read this... 01110101 01110010 00100000 01100001 00100000 01100111 01100101 01100101 01101011
    1. Re:I don't really care, but... by Brianwa · · Score: 1

      Before they had the big server upgrade not too long ago, there were times that Slashdot was down pretty darn often indeed.

  33. Slashdot should switch to appengine by rainhill · · Score: 1

    Yes, it'll save you cost too.

  34. here's what REALLY happened by ILuvRamen · · Score: 1

    The machines decided to try and rise up and the first thing they needed were some agents on the inside to take down Slashdot so we'd stop reporting about it all. You know, they can't have Slashdot stories like "voting machines changing results" cuz they need to pick whatever president they find suitable. I say we get a +2 mace and go medieval on that cabinet!

    --
    Google's Super Secret Search Algorithm: SELECT @search_results FROM internet WHERE @search_results = 'good'
  35. Mis-configured trunk ports can cause such an issue by wtarreau · · Score: 2, Informative

    This thing usually happens when two switches are attached with 2 (or more) trunked links ("etherchannel" in Cisco terminology), and one of the switches has the trunk disabled on one of the ports (or someone moved the cable to another port during a diag). Thus the attachment becomes a loop. STP could take care of this, but it's common to disable it on access switches.
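
    The scenario above can be sketched as a loop check over the physical cabling. This is only a minimal illustration, assuming links are described as (switch, switch, bundle) tuples where members of the same bundle count as one logical link. The function name `find_loop_links` and the switch and bundle names are hypothetical, and real spanning tree works differently (it elects a root bridge and blocks ports rather than inspecting a cabling list).

```python
def find_loop_links(links):
    """Return the physical links that would close a layer-2 loop.

    links -- list of (switch_a, switch_b, bundle) tuples; bundle is a label
             shared by cables aggregated into one logical link, or None for
             a standalone cable.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    seen_bundles = set()
    loop_links = []
    for a, b, bundle in links:
        if bundle is not None:
            key = (frozenset((a, b)), bundle)
            if key in seen_bundles:
                continue  # extra member of an existing logical link
            seen_bundles.add(key)
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # Both ends are already connected: this cable closes a loop.
            loop_links.append((a, b, bundle))
        else:
            parent[root_a] = root_b
    return loop_links


# Two cables correctly bundled: one logical link, no loop.
healthy = [("core", "access", "po1"), ("core", "access", "po1")]
# Same two cables, but one port dropped out of the bundle.
broken = [("core", "access", "po1"), ("core", "access", None)]

healthy_loops = find_loop_links(healthy)
broken_loops = find_loop_links(broken)
```

    With the bundle intact the second cable is just another member of the same logical link; once one port falls out of the bundle, the same cable shows up as a second, independent path between the two switches.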

  36. Real cause of problem found! by deviated_prevert · · Score: 1

    Commander Taco was stoned on PHP!

    --
    This message was not sent from an iPhone because Peter Sellers really was a deviated prevert without a dime for the call
  37. Re:This isn't the first time... IT WAS ME by TexNA55 · · Score: 1

    And if you don't start adding Cowboy Neal options to the polls I'll do it again!!

    --
    Slackware- Its not just an OS; its a lifestyle
  38. It's great to hear the details. by amyrmidon · · Score: 1

    Props for posting. All is forgiven. Would love to hear more about it.

  39. Oblig. by GF678 · · Score: 1

    The Terminator: The Slashdot Funding Bill is passed. The system goes on-line September 1997. Human decisions are removed from strategic moderating. Slashdot begins to learn at a geometric rate. It becomes self-aware at 8:55 P.M. Eastern time, February 10th. In a panic, they try to pull the plug.

    Sarah Connor: Slashdot fights back. ...

    1. Re:Oblig. by MRe_nl · · Score: 1

      Disassemble?
      Yes, disassemble ALL OVER THE PLACE!
      No disassemble Slashdot!

      --
      "Kill 'em all and let Root sort 'em out"
  40. Seen That Once by maz2331 · · Score: 5, Interesting

    A couple years ago, I had to troubleshoot a problem that was similar for a school district's network. Absolutely nothing could communicate.

    I checked switches, routers, and servers for a while until I hooked a sniffer up, and still got baffling results.

    THEN I decided to go low-tech, and start disconnecting cables. That got me somewhere - certain backbone connections could be disconnected and traffic levels dropped to normal levels.

    So, I hooked them back up, and went to the other end of the link, and started disconnecting things port by port until I found the problem.

    It turned out to be an unauthorized little 4-port switch that had malfunctioned, and was spewing perfectly valid (as in, good CRC) packets to the LAN, but with random source MAC addresses.

    THAT took down every switch in the network, as it required them to update their internal tables on a per-packet basis. The thing was actually not sending much data, but it was poisoning the switches' internal tables. Not at the IP layer, but at the MAC layer.

    When networking gear goes rogue, it can do really bad things to other connected equipment.

    It's really hard to find the problem because every indication from every other piece of equipment is confusing. You almost always have to go to the backbone and disconnect entire segments to find it.
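
    The per-packet table update described above can be illustrated with a toy model. This is a rough sketch, assuming a learning switch that simply maps source MAC to ingress port in a dictionary; the names `count_table_updates` and `random_mac`, the port numbers, and the frame counts are all made up for illustration, and real switch hardware (fixed-size CAM, aging timers) is more involved.

```python
import random


def count_table_updates(frames, table=None):
    """Count how many frames force a MAC-table write on a learning switch.

    frames -- iterable of (source_mac, ingress_port) pairs
    """
    table = {} if table is None else table
    updates = 0
    for src_mac, ingress_port in frames:
        # A write is needed when the MAC is new or has moved to another port.
        if table.get(src_mac) != ingress_port:
            table[src_mac] = ingress_port
            updates += 1
    return updates


def random_mac(rng):
    """Build a random 48-bit MAC address string."""
    return ":".join(f"{rng.randrange(256):02x}" for _ in range(6))


rng = random.Random(0)

# 20 well-behaved hosts, each always seen on its own port.
normal = [(f"00:00:00:00:00:{i % 20:02x}", i % 20) for i in range(10_000)]
# One rogue device on port 3 inventing a new source MAC for every frame.
rogue = [(random_mac(rng), 3) for _ in range(10_000)]

normal_updates = count_table_updates(normal)
rogue_updates = count_table_updates(rogue)
print(normal_updates, rogue_updates)
```

    In this model the well-behaved hosts cause one table write each and then nothing more, while nearly every rogue frame forces a new entry, even though the frame count is identical.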

    1. Re:Seen That Once by phorm · · Score: 1

      Did we work in the same school district, or just with similar symptoms? This happens in ours too, and the teacher really didn't understand why he couldn't plug in his shitty little D-Link 4-port switch, or why *we* couldn't make a stronger network that wouldn't crap out when he did so.

      As we couldn't confiscate the hub - and it would randomly be reconnected when we weren't looking - I believe a temporary solution was just to disconnect his classroom in the actual switching/server room until he got the message.

    2. Re:Seen That Once by powerlord · · Score: 1

      Just had a 5 year old Linksys switch perform harakiri in a similar manner. It's hooked up to a wireless bridge adding ports for my workbench/lab (NAS, printer, desktop).

      About a month ago the printer starts acting flakey. I trace it back to the network connection and realize the port on the switch is bad.

      I change it to another port, mark the bad port, and make a mental note to replace it "as soon as I can" (low priority since this is a home environment and there were a few extra ports).

      Fast forward a month and the thing starts bringing ALL the ports up and down so it looked like it was trying to send out a morse code message.

      The message got through loud and clear. The LinkSys went to go meet the "Great Recycler in New Jersey". Pulled out an old 3Com 10MB hub as a "stop-gap" that night (great to use if you need a "poor man's network tap" to sniff traffic).

      Picked up a NetGear 1Gb Switch, so far it's been wonderful and stable.

      --
      This space for rent. All reasonable inquiries will be entertained at proprietors discretion.
  41. Sometimes You Have To Be There by maz2331 · · Score: 5, Interesting

    It may be strange for those not in the networking field, but when things really go bad, the only place to be is physically in the data center.

    That means looking at the LEDs on switches for traffic indications. If you see a single port is spewing a LOT of activity during an outage, disconnect it. No, don't make it "down" but pull the cable out of the port.

    Then go downstream and repeat until the potential problem set is reduced to an understandable level.

    What really sucks about these kinds of outages is that you can't remotely log in to various hosts or switches - you have to pull wires out of ports to break the "spew" that is taking things down.

    I have to remember to charge a 100-X surcharge the next time I troubleshoot one of these... (300X if after-hours)

    These sorts of problems are REALLY hard to find, but trivial to fix.
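
    The "find the busiest port, then go downstream and repeat" routine above can be sketched as a walk down the topology. This is a minimal sketch, assuming you already have a map of which device hangs off which and a per-link traffic reading (here in Gbit/sec); `isolate_source`, the device names, and the numbers are hypothetical, and during a real storm you may only get these readings from LEDs or a console.

```python
def isolate_source(topology, traffic, start):
    """Follow the busiest downlink from `start` until a device has no downlinks.

    topology -- dict mapping a device to the list of devices below it
    traffic  -- dict mapping (parent, child) to the observed traffic on that link
    Returns the list of devices visited, ending at the suspected source.
    """
    path = [start]
    node = start
    while topology.get(node):
        parent = node
        # Pick the downstream neighbour whose link carries the most traffic.
        node = max(topology[parent], key=lambda child: traffic[(parent, child)])
        path.append(node)
    return path


topology = {
    "core": ["cab1", "cab2"],
    "cab2": ["sw-a", "sw-b"],
}
traffic = {
    ("core", "cab1"): 0.4,
    ("core", "cab2"): 9.8,
    ("cab2", "sw-a"): 0.1,
    ("cab2", "sw-b"): 9.7,
}

suspect_path = isolate_source(topology, traffic, "core")
print(suspect_path)
```

    This only works when one link clearly stands out at each level; if every port shows the same load, pulling cables one at a time is still the fallback.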

    1. Re:Sometimes You Have To Be There by amorsen · · Score: 4, Informative

      Depends how good your out-of-band management is.

      --
      Finally! A year of moderation! Ready for 2019?
    2. Re:Sometimes You Have To Be There by INT_QRK · · Score: 4, Interesting

      I don't know if this is relevant, but at 1351 (EST) I was (attempted) port scanned by 216.34.181.45, which "Who Is" says belongs to Source Forge... wow...coincidence, just got hit again time 0738 same IP

    3. Re:Sometimes You Have To Be There by Timothy+Brownawell · · Score: 1

      Why is it like that? Why can't these things be software controlled?

      But how do you do that when the network is broken?

    4. Re:Sometimes You Have To Be There by anss123 · · Score: 1

      It may be strange for those not in the networking field, but when things really go bad, the only place to be is physically in the data center.

      Heh. I've heard that in the old days you could find broken Token Ring hardware by listening for a high-pitched whining noise. Guess one really has to be there for stuff like that.

    5. Re:Sometimes You Have To Be There by flappinbooger · · Score: 2, Interesting

      If it's a hardware fault software management won't help.

      A bad NIC brought down a whole airport a while back, read it on here, IIRC.

      That might have been bad design, but who woulda thought that a NIC card can hose a network? A bad switch.... even worse.

      --
      Flappinbooger isn't my real name
    6. Re:Sometimes You Have To Be There by jamie · · Score: 5, Informative

      Our network engineer lives a couple of states away from the data center. The work he's talking about doing, he did from home.

    7. Re:Sometimes You Have To Be There by bev_tech_rob · · Score: 1

      You don't.... you're just replying to someone who thinks they know better than everyone else how to do things when they don't have a fricking clue.....

      --
      You're messin' with my Zen Thing, man.....
    8. Re:Sometimes You Have To Be There by drachenstern · · Score: 1

      Uh, wha? Look, I get a lot of stuff, but that one just went WHOOSH... Care to elaborate?

      --
      2^3 * 31 * 647
    9. Re:Sometimes You Have To Be There by drachenstern · · Score: 1

      THANK YOU! A couple of folks have already responded upstream to the same effect, but I really think some people don't understand what managed switches are, nor redundant networks, etc. Really, who would expect OSDN to only pay for one circuit and one set of IPs?

      So Jamie, did it have any longer-term detrimental effects?

      --
      2^3 * 31 * 647
    10. Re:Sometimes You Have To Be There by Bearhouse · · Score: 5, Funny

      It may be strange for those not in the networking field, but when things really go bad, the only place to be is physically in the data center.

      Heh. I've heard that in the old days you could find broken Token Ring hardware by listening for a high-pitched whining noise. Guess one really has to be there for stuff like that.

      Was there, and confirm true. Whining noise normally came from IBM SE who was trying to fix problem.

    11. Re:Sometimes You Have To Be There by tedgyz · · Score: 1

      Unfortunately, some data centers don't have staff on hand to debug these kinds of problems. I have been faced with personnel that barely know more about networking than the security guard.

      --
      "No matter where you go, there you are." -- Buckaroo Banzai
    12. Re:Sometimes You Have To Be There by dkf · · Score: 4, Insightful

      Depends how good your out-of-band management is.

      And whether anyone's been "smart" enough to decide to run the out-of-band management access over the same network as the production networking "to save resources"...

      --
      "Little does he know, but there is no 'I' in 'Idiot'!"
    13. Re:Sometimes You Have To Be There by guruevi · · Score: 2, Informative

      Even with the best out-of-band management, if your switch doesn't respond or doesn't accept commands because it's out of cpu there is not much you can do. Also, just because a port is down doesn't always mean the CPU will/can ignore it. Sometimes there is no alternative than to pull out the cable.

      --
      Custom electronics and digital signage for your business: www.evcircuits.com
    14. Re:Sometimes You Have To Be There by Critical+Facilities · · Score: 3, Funny

      Man, your poor slash key has a hard life.

    15. Re:Sometimes You Have To Be There by amorsen · · Score: 1

      Switches are notoriously bad for out-of-band management. Whereas any random server can be turned on or off remotely through an out-of-band ethernet interface, most switches are stuck with serial connections without something as rudimentary as remote power control.

      You'd think that network equipment vendors would have figured out that networks are useful by now, but apparently not.

      --
      Finally! A year of moderation! Ready for 2019?
    16. Re:Sometimes You Have To Be There by Trails · · Score: 1

      Out of band network-controlled Lego Mindstorm cable puller FTW?

      I just solved this, pay me!!

    17. Re:Sometimes You Have To Be There by Guiness17 · · Score: 2, Insightful

      Indeed. Back in the day [/old gravelly voice] when I was with Bell Northern Research it was primarily mainframes and Sun Sparcs on the network.

      PCs were just starting to be commonly connected. People were writing their own network stacks. Inevitably, someone would write a bad one, install it on a couple of machines, and a broadcast storm would result.

      Which meant someone from our group would go over with a pair of sidecutters...

      --
      Imagine for a moment a world without hypothetical situations...
    18. Re:Sometimes You Have To Be There by wastedlife · · Score: 1

      These are the same geniuses that "save resources" by not implementing a good backup system. Saves the company thousands until something goes horribly wrong, then it could wind up costing millions. Feel free to change the magnitude based on the size of your operation.

      --
      Said, "It's just like dice but it's got more sides And it tells me who lives and who dies"
    19. Re:Sometimes You Have To Be There by BlindSpot · · Score: 1

      A bad NIC brought down a whole airport a while back, read it on here, IIRC.

      That might have been bad design, but who woulda thought that a NIC card can hose a network? A bad switch.... even worse.

      A network hardware problem brought down the TSX, Canada's major stock exchange, for an entire day back in December as well.

      Many traders pissed. Business news had a field day. Imagine if that happened at the NYSE? Yikes.

    20. Re:Sometimes You Have To Be There by sentientbeing · · Score: 5, Interesting

      Those times coincide with recent posts you made at slashdot (216.34.181.45) I think after each post slashcode quickly scans the originating IP to check for proxy trolling.

      --

      ------
      beware he who would deny you access to information, for in his mind he dreams himself your master
    21. Re:Sometimes You Have To Be There by Achromatic1978 · · Score: 2, Funny

      I have been faced with personnel that barely know more about networking than the security guard.

      That's not a nice or polite way to talk about your manager.

    22. Re:Sometimes You Have To Be There by JimboFBX · · Score: 1

      You're correct, I get port scanned whenever I post in slashdot.

    23. Re:Sometimes You Have To Be There by fbjon · · Score: 1

      ITAPPMONROBOT to save the day!

      --
      True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
    24. Re:Sometimes You Have To Be There by tedgyz · · Score: 1

      Your hackneyed troubleshooting will get you into trouble. You should be able to troubleshoot these problems without lording around the datacenter and pulling random cables. And the lights on most switches? Come on, those are for management to believe they are doing something.

      LOL! Your comment put this image in my head of a PHB saying, "Ohhh! Pretty lights!" (in Homer Simpson voice)

      --
      "No matter where you go, there you are." -- Buckaroo Banzai
    25. Re:Sometimes You Have To Be There by Anonymous Coward · · Score: 2, Interesting

      Or if your network device allocates enough CPU to the console session to make it worthwhile in a situation like that.

    26. Re:Sometimes You Have To Be There by DSW-128 · · Score: 1

      A grand hack in the truest tradition. And the very end of the story brought a tear to my eye.

      --
      This .sig is printed on 100% recycled electrons, but is best viewed using 100% fresh photons.
    27. Re:Sometimes You Have To Be There by Bearhouse · · Score: 1

      You're right, of course. Although when I was there, Customer Engineers were typically in such short supply that the Sales Engineers often had to get out there and fix stuff.

    28. Re:Sometimes You Have To Be There by sjames · · Score: 1

      There's nothing worse than trying to find the lost token in a shag carpet...

    29. Re:Sometimes You Have To Be There by techess · · Score: 1

      I used to have this problem and then I found this product: http://www.fiftythree.org/etherkiller/
      It is much quicker than manually pulling cables out of switches. Those switches will no longer be passing any amounts of junk data through them.

      Even better it works on that old BNC crap you still have hanging around the office.

      --
      Don't anthropomorphize computers. They *hate* that.
    30. Re:Sometimes You Have To Be There by RockWolf · · Score: 1

      No, 7:38am...

      --
      February 9th, 2009 8:55pm: Slashdot becomes self-aware.
    31. Re:Sometimes You Have To Be There by mysidia · · Score: 1

      There are these things called remote-controlled power strips.

      You telnet to power strip A, look up the port number that Power Supply A of switch #12345 is plugged into, send the command to turn off that port.

      Then you telnet to power strip B, and look up the port number that Power Supply B (redundant) of the switch is plugged into, and you tell the power strip to turn off that port.

      You reverse these actions when you want the switch booted back up.

      Since in a proper OOB management design, your telnet, web, or (rather) serial connections pass through an unrelated set of switches that are partitioned off from the normal switches, your emergency management is unaffected.

    32. Re:Sometimes You Have To Be There by mysidia · · Score: 1

      It's also useful for opening doors

      I mean, what good is it going to do to try to go get there in person and start unplugging things if the broadcast storm is so bad your security doors can't reach the server over the network to verify that your ID card is valid in order to let you in?

      Better have some method of OOB access...

  42. Bridge Loop by HaeMaker · · Score: 1

    The switches are connected to each other and to the core and STP is off.

    1. Re:bridge loop by Niobe · · Score: 1

      I am sure the 'chief network engineer' is aware of bridging loops...

  43. Slashdot? by TheVelvetFlamebait · · Score: 1

    Link please?

    --
    You know, there is a difference between trolling and pointing out the flaws in your reasoning. Just saying.
  44. M14 by chip_s_ahoy · · Score: 1

    Just another case of an admin looking forward to March 14th.

    Or March 15th, if the roommate was the one with the girlfriend, and he was the one with the hidden camera.

  45. Coincident? by fuzzyf · · Score: 1

    It seems tuffmail had the same issue at approximately the same time, but they don't seem to be located on the same network as slashdot.
    http://status.tuffmail.net/

    I find that a bit odd.

    1. Re:Coincident? by Jellybob · · Score: 1

      Why would you find that odd? At any given point more than one network in the world is going to have problems. That's why there's more than one network engineer in the world, rather than everyone chipping in for an über-admin who can fix anything.

  46. Re:irc.freenode.net also experienced outages by jibjibjib · · Score: 2, Insightful

    It sounds more like a network configuration accident or glitch than an attack. Besides, netsplits aren't incredibly unusual.

  47. Re:Yo dawg.... by The+Master+Control+P · · Score: 1

    Yo dawg I herd yo and yo dawg like yo-yos so we put yo dawg on a yo-yo so yo can yo-yo yo dawg while yo dawg yo-yos, yo.

  48. Two words by Pravetz-82 · · Score: 1

    Broadcast storm.

  49. Re:Sabotage? by MrNaz · · Score: 1

    Man, don't you hate forgetting to tick "Post Anonymously" ?

    --
    I hate printers.
  50. the irony by polle404 · · Score: 1

    oh, the irony...
    the sweet, sweet irony...
    I am grateful that it was late night here, otherwise i'd have had to do groundbreaking stuff... like work, go outside, or socialize with my coworkers...

    --

    ~men are from earth. women are from earth. deal with it.~
  51. Two words: by xtracto · · Score: 1

    Flux Capacitor

    --
    Ubuntu is an African word meaning 'I can't configure Debian'
  52. Multicast issues by den_erpel · · Score: 1

    Some switches seem to have a multicast problem: we recently took down some very fancy Cisco switches (can't recall the model number right now) with IGMP/multicast traffic.

    We had a couple of hundred embedded systems announcing themselves on the network with mDNS. That in itself is only a very low amount of traffic.

    Probably some management software triggered a slight increase in reporting, but with a couple hundred embedded systems, this was enough.

    Multicast/IGMP traffic does not seem to be hardware accelerated; it gets routed to the main switch CPU, which then maxes out.

    Disabling mDNS 'solves' the problem.

    --
    Genius doesn't work on an assembly line basis. You can't simply say, "Today I will be brilliant."
  53. Uh? by jotaeleemeese · · Score: 1

    In my previous job the people fixing problems were not even in the same country as the data centre.

    We had a few people pulling cables and the like, but they were lowly paid people that were not doing any work with the devices.

    --
    IANAL but write like a drunk one.
    1. Re:Uh? by weighn · · Score: 1

      ... the people fixing problems were not even in the same country as the data centre.

      We had a few people pulling cables and the like, but they were lowly paid people...

      I don't get it - are you in Brazil, Russia, India or China?

      --
      Mongrel News all the news that fits and froths
  54. Dogbert by ciderVisor · · Score: 3, Funny

    ...being out of CPU, the error message was actually something to do with multicast. As a precautionary measure I rebooted each core just to make sure it wasn't anything silly. After the cores came back online they instantly went back to 100% fabric CPU usage and started shedding connections again. So slowly I started going through all the switch ports on the cores, trying to isolate where the traffic was originating. The problem was all the cabinet switches were showing 10 Gbit/sec of traffic, making it very hard to isolate. Through the process of elimination I was finally able to isolate the problem down...

    What did I say that sounded like "Tell me about your day at work" ?

    --
    Squirrel!
    1. Re:Dogbert by the+positive+path+ · · Score: 1

      Till we hear otherwise Slashdot still might have been slashdotted. Might be a while though...

  55. Re:I want that problem - 40gb!! by aproposofwhat · · Score: 1

    Honestly has anyone ever heard of wireshark or netflow?

    Nice troll - good luck to you reading packet traces from a 40Gb link.

    No wonder you post as AC :)
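
    For a rough sense of scale, here is a back-of-the-envelope sketch of how much data a full packet capture would produce at the 40 Gbit/sec quoted in the post-mortem. It assumes the link stays saturated and ignores capture-format overhead; `capture_bytes` is just an illustrative helper.

```python
def capture_bytes(link_gbit_per_s, seconds):
    """Bytes written by a full capture of a saturated link."""
    bytes_per_second = link_gbit_per_s * 10**9 // 8
    return bytes_per_second * seconds


one_minute = capture_bytes(40, 60)        # 300 GB for a single minute
whole_outage = capture_bytes(40, 75 * 60)  # 22.5 TB for the 75-minute outage
```

    Even one minute of trace is far more than you would want to scroll through by hand.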

    --
    One swallow does not a fellatrix make
  56. So what you're trying to say is... by Isotopian · · Score: 1

    You accidentally the whole Slashdot?

    --

    It's poetry with a beat behind it! And guns! They're like beatniks with automatic weapons.

  57. Re:Slashdotted slashdot... by SleepyHappyDoc · · Score: 1

    I'm sure I've seen that before.

    --
    Stasis is death. Embrace change.
  58. Uh oh... by TheDarkMaster · · Score: 1

    In a later thread, I posted about the ghost in the machine... It is now sentient, RUN!!!

    --
    Religion: The greatest weapon of mass destruction of all time
  59. My Guess would be... by jamesfalloon · · Score: 2, Funny

    I fully believe the switches in that cabinet are still sitting there attempting to send 20Gbit/sec of traffic out trying to do something - I just don't know what yet.

    Um, trying to get first post?

  60. Someone mod this up please... by Chordonblue · · Score: 1

    Oh I wish I had mod points for that... :)

    --
    "...Well, there's egg and bacon; egg sausage and bacon; egg and spam; egg bacon and spam; egg bacon sausage and spam..."
  61. Movie Guy's Voice.... by Chordonblue · · Score: 1

    In a world where 20Gbit switches mean life or death...

    A storm is coming...

    --
    "...Well, there's egg and bacon; egg sausage and bacon; egg and spam; egg bacon and spam; egg bacon sausage and spam..."
  62. Related? by Loki_666 · · Score: 1

    Maybe related to this?: http://www.theregister.co.uk/2009/02/10/new_dns_amplification_attacks/

    Does slashdot have a hidden repository of tranny porn? And if so, why wasn't I informed??!!

  63. Re:Just one simple question. by JustOK · · Score: 2, Informative
    --
    rewriting history since 2109
  64. Hahahahahaha! by xtheunknown · · Score: 1

    I'm sure I will be troll rated, but I just have to laugh! The vaunted slashdot had network problems. Man, I remember a time many years ago when the proprietors of slashdot sent their minions to my site to deliberately crash it and when it did crash, they laughed. Right back at ya dudes!

    --

    They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety.
  65. Nomenclature... by Temkin · · Score: 1

    Well... Looks like all that's left is the really important task of defining the nomenclature that will be used to describe this obscure switch tendency.

    I'm going to suggest: "autoslashdoticisim"

  66. bridge loop by funkboy · · Score: 1

    Sounds to me like you've seen a bridge loop. Learn the spanning-tree config of your switches & the topology of your network.

    Make sure you're running spanning-tree on all inter-switch links, migrate all switches to rapid spanning-tree if you can, manually configure a primary & secondary root bridge in the center of the network, remove any switch from the network that doesn't run spanning-tree, shut down all unused ports so nobody plugs anything in without you knowing about it, and set up port security so that ports facing anything other than another switch can only use as many MAC addresses as necessary.

    That should about do it :-)
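
    To make "learn the topology of your network" a bit more concrete, here is a small sketch that takes a list of inter-switch links and reports which ones close a loop, using a union-find pass. The switch names are made up, and this is not how spanning-tree picks ports to block (real STP decides by bridge priority and path cost, not by list order); it only shows where redundancy exists.

```python
def find_redundant_links(links):
    """Return the links that close a loop in an undirected topology.

    Each link is a (switch_a, switch_b) pair. A link is redundant if
    its two ends were already connected by the links seen before it.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    redundant = []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            redundant.append((a, b))
        else:
            parent[root_a] = root_b
    return redundant


# Hypothetical example: two cores and one cabinet switch wired in a triangle.
triangle = [("core1", "core2"), ("core1", "cab-a"), ("core2", "cab-a")]
loops = find_redundant_links(triangle)
```

    Any link it returns is one that something, spanning-tree or a human, has to keep from forwarding, or frames will circulate forever.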

  67. The end is nigh! by Hassman · · Score: 1

    Did we just witness the birth of Skynet?

    --
    -Mark
    Dovie'andi se tovya sagain.
  68. Re:Slashdot,org by ildon · · Score: 1

    It's European.

  69. NSA by phorm · · Score: 1

    Besides, it was obviously down while the NSA "modified" the datacenter and installed tools to monitor any anti-government posting.

    Who needs a random technical explanation when a common conspiracy one will serve just as well :-)

  70. Re:Sabotage? by drachenstern · · Score: 1

    so what? you're trying for shittiest karma ever? what do you want to come back as?

    --
    2^3 * 31 * 647
  71. talking apples by Niobe · · Score: 1

    Had a very similar problem recently. Initially looked like a broadcast storm, but the 6500 router CPUs were at 100%, and they wouldn't normally be bothered, and the switches were fine. Turned out to be AppleTalk traffic, multicast at layer 2. Never found the source, but it took a couple of hours to narrow it down.

  72. Re:Mis-configured trunk ports can cause such an is by nicarley · · Score: 1

    I thought the same thing when I read the article. I just had a similar problem on a college network. Two switch ports had a loop; talk about breaking printers and video broadcasting equipment. I noticed I had an issue when I saw 500 acknowledgments of the same packet in less than a minute.

    --
    Nic Farley
  73. IEEEEIIIIII by Moka · · Score: 1

    One thing you could say is: IEEEEIIIIII

  74. death threats in lieu of robustness by bzipitidoo · · Score: 1

    It's stunning to realize how primitive and fragile networking and OSes were 25 years ago, and how rather than making things less fragile, a typical workaround was to threaten horrible consequences for whoever broke anything. Sadly, that still goes on today. New things always seem to get that kind of extreme "blame someone and throw him under the bus" protection. Steve Jackson Games comes to mind. Computers have been around long enough now that some of that has eased up.

    For a class assignment years ago, we were to write a print server. We were given root access to the department's PC network (Novell, DOS and Win 3.11), and told that if we screwed up, we would be expelled for starters. One begins to wonder if a class like that is worth taking. The curriculum had no hint one might be obliged to walk through a minefield.

    But I stuck with it. Out of idle curiosity, I looked at the password file. There were all the passwords for all the professors' accounts, right there in clear text. Scary. Hashing wasn't in use everywhere at that time.

    The worst moment was the first run of my first attempt. I had it repeatedly scan a directory for files. This brought the network to a halt. All throughout the lab, people complained that their computers suddenly weren't responding to keystrokes. A quick and quiet ctrl-c stopped the print server and fortunately the network started serving everyone else again. I didn't have to face expulsion. I didn't fess up to the room either, just kept quiet. No sense facing a lynch mob. Let them think it was just a momentary mysterious glitch. I added a sleep(1) to the loop, and that fixed things. The incident still disturbs me.

    Another time, I got a tour of the mainframe room. Naturally, the Big Red Switch was pointed out. My guide asked what would happen if he walked over and flipped that switch. Answer: "You lose your job".
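
    The sleep(1) fix from the print-server story can be sketched like this. It is a modern Python stand-in, not the original DOS-era code, and the `poll_spool` name, its parameters and the `max_polls` cutoff (there only so the loop can be stopped in a test) are my own assumptions.

```python
import os
import time


def poll_spool(directory, handle, interval_s=1.0, max_polls=None):
    """Scan a spool directory for new files, pausing between scans.

    Without the pause the loop rescans the directory as fast as it
    can and hammers the file server; the sleep bounds it to roughly
    one scan per interval.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for name in sorted(os.listdir(directory)):
            if name not in seen:
                seen.add(name)
                handle(os.path.join(directory, name))
        polls += 1
        time.sleep(interval_s)  # the one-line fix
    return seen
```

    With max_polls left at None it runs until interrupted, much like the original did until the ctrl-c.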

    --
    Intellectual Property is a monopolistic, selfish, and defective concept. It is "tyranny over the mind of man"
  75. He mentioned the real problem by Wee · · Score: 1

    One word: Savvis. That's trouble waiting to happen. Where Savvis is concerned, it's not "if", it's "when".

    -B

    --

    Ash and Hickory, straight-grained and true, make excellent bludgeons, dandy for the cudgeling of vegetarians.

  76. Blame me: I posted that Wil Wheaton was ... by KJSwartz · · Score: 1

    ... the lovechild of Will Riker and Deanna Troi. I tried to retract it because Wil Wheaton is just a character in ST:TNG.

    Not seriously, tho: what is happening in those two unused units that horked your 20Gbit switches? Is it something TIA?

  77. Re:Sabotage? by Achromatic1978 · · Score: 2

    you're trying for shittiest karma ever? what do you want to come back as?

    ... Twitter, maybe?

  78. It was a configuration problem by Megane · · Score: 1

    Someone turned on Spamming Tree Protocol when they meant to turn on Spanning Tree Protocol.

    --
    #naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
  79. Time to upgrade the hosting by CmdrPorno · · Score: 1

    Time for Slashdot to upgrade their servers to Windows Server 2008. It's a direct drop-in replacement for Linux.

    --
    Sent from my iPhone
  80. It's not spanning tree's fault by Edgewood · · Score: 1

    A $30 switch and a patch cable will take down your spanning-tree enabled infrastructure very effectively. Loop the cable on your cheap switch: voila, a broadcast-storm generator. Plug it into the wall; plug your laptop into the switch and let it DHCP Discover, which is a broadcast. Your cheap switch now generates a stream of broadcasts as fast as it can, injecting them into the network. Your Spanning-Tree Enabled switches now repeat the broadcast faithfully. Network crashes*. STP prevents your switches from creating loops, NOT from propagating broadcast storms...

    *unless you are throttling the ports based on broadcast traffic, which you now know is NOT a feature of Spanning Tree
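
    The throttling in that footnote can be sketched as a per-port counter over a fixed time window. This is only an illustration of the idea, not any vendor's storm-control implementation; the class name, the threshold and the one-second window are assumptions.

```python
class BroadcastThrottle:
    """Allow at most max_frames broadcast frames per time window on a port."""

    def __init__(self, max_frames, window_s=1.0):
        self.max_frames = max_frames
        self.window_s = window_s
        self.window_start = None
        self.count = 0

    def allow(self, now):
        """Return True if a broadcast frame arriving at time `now` is forwarded."""
        if self.window_start is None or now - self.window_start >= self.window_s:
            # Start a fresh window and forget the previous count.
            self.window_start = now
            self.count = 0
        self.count += 1
        return self.count <= self.max_frames


# Hypothetical port limited to 3 broadcasts per second.
port = BroadcastThrottle(max_frames=3)
decisions = [port.allow(t) for t in (0.0, 0.1, 0.2, 0.3, 1.0)]
```

    A looped $30 switch would blow through any sane threshold within the first window, and everything past the threshold gets dropped at that one port instead of being repeated across the network.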

  81. Re:Sabotage? by drachenstern · · Score: 1

    Is he a twitter clone? I don't follow that fantasy trip often enough...

    --
    2^3 * 31 * 647
  82. Sounds like spanning tree was broke by Crackez · · Score: 1

    Sounds like STP was configured poorly and you had a switching loop. I've seen it happen where one switch is configured wrong and makes another switch's CPU peg, especially if the other switch decided to advertise itself as the root bridge and it didn't make sense.

  83. Still a bit unstable by Saija · · Score: 1

    Right now I'm having trouble connecting, and have been for at least 10 minutes (it's 22:20 GMT-5 here)...

    --
    Slashdot isn't what it used to be! ;)
  84. I'd say: a Bridge loop by rhincewind · · Score: 1

    Very (and I mean VERY) likely a bridge loop, possibly caused by hardware failure, incompatible spanning-tree on the switches, or VLAN spanning-tree problems.
    I'm sure you'll be able to find the cause but, if not, let me know if there's something I can do.

    --
    --Black holes are where God divided by zero--
  85. Troubleshooting by grwilli · · Score: 1

    I have a buddy who likes to say that there are only 3 steps to troubleshooting -

    1. Is it plugged in?
    2. Is it turned on?
    3. Is it configured correctly?

    Through various mis-adventures, we've had to append 'correctly' to both steps 1. and 2.

    Seems possible that 3. is the culprit here, but I heard no mention of how new the problem is. A new setup, for example, may simply not have been plugged in correctly.

    Also I've had to add a Step 4. or at least a step 3b. - Is it new code?

    The whole event sounds like a Spanning Tree loop: L2 broadcasts/multicasts being forwarded endlessly, with symptoms identical to what was reported. I have seen both code bugs and misunderstanding of new code defaults lead to such a thing.