Stupid Data Center Tricks
jcatcw writes "A university network is brought down when two network cables are plugged into the wrong hub. An employee is injured after an ill-timed entry into a data center. Overheated systems are shut down by a thermostat setting changed from Fahrenheit to Celsius. And, of course, Big Red Buttons. These are just a few of the data center disasters caused by human folly."
The summary reads like a digg post, and has two different links that, in actuality, link to the exact same thing.
This needs some fixin'.
For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
Amusingly, anyone who has ever worked as tech crew at a LAN party knows that this is the first thing you look for... :p
Why the fuck was the button unlabeled? That's the REAL MISTAKE.
Thank you... you've single-handedly made spending my time on recycled, old digg news completely and totally worth it.
I have to agree with this guy. As soon as IP addresses started being assigned incorrectly, the first thing I would do is check the DHCP server. Running ipconfig /all on a Windows box (so maybe 3 seconds of typing) would give the answer.
More to the point, though - why was another DHCP server allowed on the network? Can your switches not block or refuse to forward DHCP traffic from the wrong host? Otherwise every single student who brings in their own Wi-Fi box is going to shut down the network.
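For what it's worth, spotting the rogue server is mostly bookkeeping once you have a list of who answered. A rough Python sketch, assuming you have already captured the DHCP offers somehow and reduced them to (server IP, offered address) pairs. That tuple format, the function name, and the sample addresses are all made up for illustration:

```python
from collections import Counter

def find_rogue_dhcp_servers(offers, authorized):
    """Count DHCP offers per server that is NOT on the authorized list.

    offers: iterable of (server_ip, offered_client_ip) tuples
            (hypothetical capture format, not any tool's real output)
    authorized: set of server IPs that are allowed to hand out leases
    """
    rogue = Counter(server for server, _ in offers if server not in authorized)
    return dict(rogue)

# Made-up sample data: one legitimate server and one consumer Wi-Fi box.
offers = [
    ("10.0.0.1", "10.0.5.20"),
    ("192.168.1.1", "192.168.1.100"),
    ("192.168.1.1", "192.168.1.101"),
]
print(find_rogue_dhcp_servers(offers, {"10.0.0.1"}))
```

This only tells you that a rogue exists and how noisy it is. Actually stopping it is still a job for the switches.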
Covering those power strip buttons with a hardened glob fixing them in the "on" position is what an electric glue gun is for.
Why would you want a hub in the first place? The only hub on newegg was some $650 24-port 10/100 deal. But unless you were trying to keep some crazy legacy network alive, why not spring for a modern 10/100/1000 switch?
Or unplug it.
The slow part is figuring out that that's the problem. The first time it happens to you.
Which is why it's good to have oldbies around, to whom lots of weird shit has happened.
I had fun with a company a while back. They have about 300 employees and do ~90mil/year, so this is a small corporation.
Anyway, the company was trying to get a VPN tunnel established to their China office, and they were having a hell of a time at it. The employees on the China side had no IT experience so everything was done remotely.
It just so happens that one of the Chinese employees was recruited to make a change to the PIX firewall on the China side in order to get everything working. To our astonishment, it worked, and we had a secure VPN tunnel established.
The problem was that accounts in the US started to get locked out, alphabetically, every 30 minutes. Our Active Directory was getting tons of password crack attempts from inside our internal network. I was using LDAP to develop an application at the time, so naturally I was the suspect for causing all these lockouts.
Fast-forward a week. We look at the configuration of the Chinese firewall and it allowed all access from any IP address on the Chinese side. In other words, crackers were trying to get into our systems through our VPN tunnel in China. In effect, our corporate LAN had been directly connected to the Internet. Once we figured that out, I was free to go back to work and the network lived to see another day, but that incident caused major trouble for all our employees.
Moral of the story: Don't trust a Chinese firewall.
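A check for this kind of rule is not hard to automate. Here is a minimal Python sketch, assuming the firewall rules have been exported into plain dicts with "action" and "source" keys. That format, the function name, and the size threshold are assumptions for illustration, not real PIX configuration syntax:

```python
import ipaddress

def overly_permissive(rules, max_hosts=65536):
    """Return the permit rules whose source network is larger than max_hosts.

    rules: list of dicts like {"action": "permit", "source": "0.0.0.0/0"}
           (hypothetical export format)
    max_hosts: arbitrary cutoff; anything broader gets flagged for review
    """
    flagged = []
    for rule in rules:
        if rule["action"] != "permit":
            continue
        net = ipaddress.ip_network(rule["source"])
        if net.num_addresses > max_hosts:
            flagged.append(rule)
    return flagged

# Made-up rules: one scoped to a single office subnet, one open to everything.
rules = [
    {"action": "permit", "source": "10.1.0.0/16"},
    {"action": "permit", "source": "0.0.0.0/0"},
    {"action": "deny", "source": "0.0.0.0/0"},
]
for rule in overly_permissive(rules):
    print("review:", rule)
```

Run something like this after every remote change and the "any IP address" rule gets caught in minutes instead of a week.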
Well.
The Foundry switch I was screwing around with today... wasn't letting the IP Engineer send all the VLANs to the mirror port. I could only watch management traffic (STP, etc.) and nothing of any actual use.
It was great! Finally I got pissed off and shoved a homemade passive tap on the uplink and was -then- able to see the issue.
A hub would have made this a 5 minute job.
I don't care where you work: if you're on site doing training, you're probably also getting sucked back into the work cycle. I see it all the time at work. I have always preferred offsite training, with the cell phones turned off. It also helps if you have to use your laptop on the lab network, because 99% of the time it means you cannot VPN into work, so email is not a concern either.
I think my fellow data center operators would agree we're all understaffed, and I work on a network with hundreds of millions of customers using it on a 24/7 cycle. The other danger nobody speaks of is that some companies are too passive when it comes to testing redundancy. Half the time, while there's redundancy in the system to keep a DMZ up and running, there's no spare DMZ capacity to handle a true outage, such as a fiber ring failure that isolates the data center, or some other disaster. Companies need to design their redundancy so you can unplug the entire data center and your customers never know it. If you do not, you will rue the day a true outage impacts the entire datacenter, and you will hear about it on the news later. Not a good thing.
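The "unplug the entire data center" test can at least be checked on paper before anyone pulls a cable. A back-of-the-envelope Python sketch, assuming you can put one capacity number on each site and one number on peak load. The site names, the units, and the function name are hypothetical:

```python
def survives_loss_of_any_site(capacity, peak_load):
    """For each site, report whether the remaining sites can carry peak load.

    capacity: dict of site name -> capacity, in whatever unit you measure load
    peak_load: peak demand in the same unit
    """
    total = sum(capacity.values())
    return {site: total - cap >= peak_load for site, cap in capacity.items()}

# Made-up numbers: two sites, peak load of 80 units.
capacity = {"dc1": 100, "dc2": 60}
print(survives_loss_of_any_site(capacity, peak_load=80))
```

A False for any site means the redundancy is only there on paper, which is exactly the situation described above.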