Plug In an Ethernet Cable, Take Your Datacenter Offline
New submitter jddj writes: The Next Web reports on a hilarious design failure built into Cisco's 3650 and 3850 Series switches, which TNW terms "A Network Engineer's Worst Nightmare". By plugging in a hooded Ethernet cable, you...well, you'll just have to see the picture and laugh. They write: "The cables, which are sometimes accidentally used in datacenters, feature a protective boot that sticks out over the top to ensure the release tab isn’t accidentally pressed or broken off, rendering the cable useless. That boot would hit the reset button which happened to be positioned directly above port one of the Cisco switch, which causes the device to quietly reset to factory settings."
"There’s an easy way to prevent it happening at all, by disabling the button" Another easy way to prevent this from happening would be DON'T BUY THIS SWITCH
Regardless of the design of the connector, having the reset button directly above the port is a bad design. It's simply too easy to hit it with your thumb just plugging in or removing a cable. I suppose holding it down for several seconds resets to factory, which is what happens when using cables with the boot. Still, regardless of that more severe problem, it was a bad design in the first place.
Better known as 318230.
Are 'config t' and 'write erase' too difficult to remember? Bothered by all those inconvenient keystrokes? Try the new EasyBoot(TM) from Cisco, the most convenient way to reset your router!
Building Better Software
From the article:
The cables, which are sometimes accidentally used in datacenters, feature a protective boot that sticks out over the top to ensure the release
and then
Such a situation could cause a problem in any size datacenter, where these switches and cables are commonly used
So are they commonly used on accident? Accidentally used commonly? I was reading the article to figure out what type of cable was often used, but apparently it's these cables but only by accident all the time.
If a single device brings down your entire data center, you've got design problems and your architect should be fired or retrained. These days everything is redundant in triplicate at minimum and new devices spin up automatically based on automatic provisioning and chef/puppet type setups. Even if your core router (why would you have just one!?!?!?!) shits the bed and resets to factory defaults with VLAN 1 and basic STP with no routing interfaces configured, if your NOC folks did a good job, a proper MSTP / VRF / TRILL / SDN ( OpenFlow, etc) / etc like setup should route around that shit and QA will have already tested the "core clos spine device reboots to factory defaults" test case at which point you have just another device for a low paid lackey to swap out based on your network monitor going yellow.
If you work in a Fortune 500 datacenter and you can't handle this sort of outage, get the fuck out. You're the reason shit's going downhill. Also if a Cisco 3650 or 3850 brings down your datacenter, see previous negative asshole sentiment or get a new job if your manager is responsible for the confines of such a clusterfuck. No participation trophy for such asshattery.
'We are trying to prove ourselves wrong as quickly as possible, because only in that way can we find progress.' RPF
Little slow to blog about this then, huh? Bad enough this is considered news; worse still, as usual Slashdot bought your cluck-bait.
I've seen a few of them, but they're pretty rare. I avoid them because usually the boot does more harm than good - getting stuck under the tab, sliding to the side and making it hard to push the tab, getting stuck next to the jack/port, especially if it's slightly recessed like you might find in an IP phone. And, apparently, breaking Cisco switches. Something like this would probably do it.
Incidentally, I'm not really a Cisco guy, but I have helped recover a couple secondhand switches for friends and I'm pretty sure there are several more steps required than just holding the mode button. If you were to get it stuck pushed and the switch ever power cycled it'd likely end up stuck at a boot prompt until the cable was unplugged and it was rebooted again, but it shouldn't be the disaster implied.
You're plugging it in wrong.
The good "tab protector" cables actually use a hood, not just a fragile second reversed tab above the connector tab. I've had some problems with even those where the recess for the connector was too deep and too tightly encased, making it impossible to get a hooded cable in place. Those are especially handy because they cost considerably more, and can require a small screwdriver to lever under the hood and release the connector tab.
I voted this article down for the use of the term "Hilarious" but it got in anyways.
On our 3850s the button is placed above and in between Ethernet ports 1 & 3, not directly above Ethernet port 1 as shown in the article.
While I like the auto-LART feature, I wonder what the switch is doing there at all: If the switch is working properly, it doesn't need a reset button.
If the switch is not working properly, it needs to be burdensome to power-cycle it, to encourage people to complain loudly to the responsible vendor(s) until the product actually works.
In these modern times, I think an accessible reset switch is like: "Yo dawg, I heard you like to 'fix' things by pushing buttons, so we put buttons on your Enterprise switches so you can reset one-handed while you [...]"
ObTopic: I once helped take down an enterprise LAN with an Ethernet cable. It was 10-ish years ago, and we had just installed a new-fangled VoIP phone system. Each VoIP deskset had a built-in unmanaged 10/100 switch. This was a very handy thing before our modern enlightened structured cabling roll-outs, because it could be trivially daisy-chained with a desktop computer and standardized PoE was not yet a thing.
Anyhow, we started late on a Wednesday, and finished just before start of business Thursday: Record time for replacing an old Nortel with a few hundred extensions, I tell you. And I went home and died on my couch, having been awake and actually working (prep, etc) for about 40 hours.
At 7:23AM, my phone rang. It was my manager. Their entire network had crashed, hard. They blamed us. They were livid. I read my manager the NSFW riot act, hung up, and went back to sleep.
Turns out that after we left, some unknown person had plugged both external switched ports of a deskset into both ports on a wallplate connected to a then high-end HP Procurve switch, which itself connected to a factory and office tower full of other HP Procurve switches carefully set up in a redundant "mesh fabric" mode. This carefully-constructed, redundant network then died in a broadcast packet storm.
Once they found the error and unplugged that one extraneous heads-will-roll wayward wire, things more-or-less instantly returned to normal.
(STP would've instantly made this a complete non-issue, but at that time STP and HP's mesh conflicted with each other and could not cohabitate. I understand that this was subsequently resolved, though I don't deal with HP switches often enough to verify.)
Kid-proof tablet..
Yep very old news. I laughed when I heard about it nearly 2 years ago.
Relevant Field Notice from October 2013. http://www.cisco.com/c/en/us/support/docs/field-notices/636/fn63697.html
Article only shows drawings/illustrations - where's an actual picture?
That's exactly what I said in Sex Ed!
Those are especially handy because they cost considerably more
Does not compute.
and can require a small screwdriver to lever under the hood and release the connector tab.
Still not seeing what is "especially handy" about that.
On Catalyst 3850s this has been fixed since the release of 3.3.5SE code (released November 2014), so this is old news. Even on older code, the problem can be fixed by using the command "no setup express". I have to say, running into this the first time and trying to figure out why the switch had a blank config was a head scratcher...
http://almostsmart.com
I usually take a knife or scissors to that hood in those cases, and give it a circumcision.
If you think I voted for Trump because of this post, you're wrong. I voted for Dr. Jill Stein of the Green Party. Again.
You home-schooled kids are so funny.
For starters, assuming you fall prey to this, all you lose is the configuration of a single switch. If losing a single fixed configuration 1U switch causes your entire datacenter to go down, your datacenter is badly designed.
Second, this requires a particular style of booted cable, not just any booted cable. Most datacenters I've worked in don't use booted cables in their switch ports. Their cables are cut to length and crimped by hand. Booted cables can be a bitch to get out of the port, especially on 1U 48 port switches. Fiddling with a boot in a cramped cage or rack is a great way to take collateral links down.
Third - no good network engineer leaves the mode button enabled on a production switch, whether it's one of the express setup ones, or just the regular old boot to rommon ones.
Fourth - yeah, this is a shitty design choice by Cisco, normally the mode button is off to the side.
Sure this is funny, but the workaround in TFA is pretty straightforward.
Disable Express Setup with this command while in config mode:
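The command itself seems to have gone missing from this comment. Going by the other post in this thread that names it, it is presumably "no setup express". A rough sketch of the session, assuming a standard IOS-style CLI (the layout and the "!" comment are mine, not from TFA):

```
! Assumed session on an affected switch; check against the Cisco field notice before relying on it
configure terminal
 no setup express
end
```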
Someone explain to me why you'd run Express Setup after deploying this switch?
I've got several thousand of those kinds of cables in my closets. They're not so bad if you don't have a reset button located adjacent to the tab protector. They actually slip out of the bundle fairly easily compared to most others.
Do not look into laser with remaining eye.
The cables, which are sometimes accidentally used in datacenters
In my opinion there's no specific rule saying they shouldn't be used in datacenters. They do have the advantage of protecting the tab on the RJ-45 connector pretty well, and would actually be preferred over unprotected connectors.
Overall the button placement is pretty stupid, and is probably the result of optimizing the size of the unit. So if you run a data center, then you will learn to deal with the button location.
Realize that this problem is just annoying, there are bigger design flaws in the area of computing.
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
Ahh, is that the switch and cable combo Ubisoft is using for Uplay? So it's all really Cisco's fault then!
This. I've never seen anything where it wasn't recessed like that.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
There is a reason that vital reset commands need more than one action to complete. It would be a minor inconvenience if the router were to reboot when you press this button (by accident), but to have the complete configuration be wiped by this, and have it situated so that an involuntary application of said button is easy, is just epically stupid.
But ok. It's Cisco. You'd expect that from them.
If this is a problem, you have more serious issues to worry about, such as looking for a mental institution to house you.
I once did something similar. I had a screen in a web app which had a form. On the next screen, the Delete button was in the same place as the Submit button on the form.
The nice lady user had a habit of DOUBLE clicking for some reason. Which means she submitted the form and then deleted the record directly in the next step because the second click went to the delete button.
Took us a bit to figure out why the docs were deleted.
The dangers of excessive individualism are nothing compared to the oppressiveness of excessive collectivism
The phrase "especially handy" was meant to be ironic.
I agree that this is a crappy design. I was never a fan of the way Cisco designed equipment anyway, but back in the day I cut off boots on any Ethernet cable I used in either the data center or wiring closet simply because SOME equipment had ports slightly recessed and the boots would prevent the cable from locking in reliably. Caused a number of hard to find intermittent problems before we figured out what was going on.
The 3650 and 3850 are access layer switches. These are used in closets to connect client devices (desktops, phones, wireless APs, etc). These are not top-of-rack server switches or core switches for datacenter usage.
"A plan fiendishly clever in its intricacies"- Homer Simpson
I'm not a network engineer but why are those types of cables not supposed to be used? The article seems to imply that using these hooded cables is wrong. I can see why they wouldn't be cost effective or not necessary but why wrong?
Just another second banana
I think the first thing we all need to understand is that the button mentioned is NOT a reset button. It's the display button for the lights and is clearly labeled "mode". It cycles between the different information modes such as speed, duplex, stack ID, POE usage, etc. See this article from the Cisco Support forums detailing how to determine which stack ID the different switches are as one example: https://supportforums.cisco.co...
Okay, you gotta admit- that's some funny shit. Poor design allows you to bork your entire network by plugging in a cable. Hilarity ensues.
And what's this crap: "The cables, which are sometimes accidentally used in datacenters..."
Cables are "accidentally" used? WTF?
Just cruising through this digital world at 33 1/3 rpm...
... and this is 'current news' because?
The last time I wrote code, it was Morse
Just saw off the reset button and leave a hole. For resetting, you can always poke a pin into that hole :).