Slashdot Mirror


Testing Network Changes When No Test Labs Exist?

vvaduva writes "The ugly truth is that many network guys secretly work on production equipment all the time, or test things on production networks when they face impossible deadlines. Management often expects us to get a job done but refuses to provide funds for expensive lab equipment and test circuits, or reasonable time to get testing done before moving equipment or configs into production. How do most of you handle such situations, and what recommendations do you have for creating a network test lab on the cheap, especially when core network devices are vendor-centric, like Cisco?"

20 of 164 comments

  1. The tag says it all by Lord+Byron+II · · Score: 4, Insightful

    There are zero replies and the story is already tagged with "youreboned". That's the truth. If your higher ups won't front the money for proper test equipment and expect you to roll out production-ready equipment on the first go, then you really are boned. Of course, you can mitigate this by simple pen-and-paper analysis. What should each piece of equipment do? Are the products we've selected appropriate for the roles we're going to put them in? These sorts of questions can find a lot of bugs without any sort of testing. If you think, "what would I do if it was the 1980's?" then you'll be fine.

    1. Re:The tag says it all by DigiShaman · · Score: 5, Insightful

      Not all changes are a one-way trip. Having a rollback plan is also important. Should something very unexpected happen, be prepared to roll back any and all changes to undo what has just been done.

      --
      Life is not for the lazy.
    2. Re:The tag says it all by BiggerIsBetter · · Score: 4, Insightful

      Not all changes are a one-way trip. Having a rollback plan is also important. Should something very unexpected happen, be prepared to roll back any and all changes to undo what has just been done.

      Couldn't agree more, except to say, don't assume you'll be rolling back from a known state. I've seen roll-back plans that assume they're undoing the changes just put in, not reverting to the state before the changes. Yes, there's a difference between the two! E.g., if your install fails, maybe you can't un-install. Yes, this might mean additional resources and the overhead of FS and DB snapshots, and complete copies of config files, but better that than the alternative.
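
A minimal sketch of the "revert to the state before the changes" idea, assuming the configuration lives in an ordinary file. The file name `router.conf`, the `.pre-change` suffix, and the helper names `snapshot` and `restore` are all hypothetical, chosen only for illustration. The point is that the rollback overwrites with a complete earlier copy instead of trying to undo individual edits.

```python
import shutil
import tempfile
from pathlib import Path


def snapshot(config_path: Path, snapshot_dir: Path) -> Path:
    """Copy the complete config file aside before any change is made."""
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    saved = snapshot_dir / (config_path.name + ".pre-change")
    shutil.copy2(config_path, saved)
    return saved


def restore(saved: Path, config_path: Path) -> None:
    """Roll back by overwriting with the snapshot, not by undoing edits."""
    shutil.copy2(saved, config_path)


# Demo in a throwaway directory (hypothetical file contents).
workdir = Path(tempfile.mkdtemp())
config = workdir / "router.conf"
config.write_text("hostname edge1\ninterface Gi0/1\n description uplink\n")
original = config.read_text()

saved = snapshot(config, workdir / "snapshots")

# Simulate a change that failed halfway and left the file in an unknown state.
config.write_text(original + "interface Gi0/2\n shutd")

restore(saved, config)
restored = config.read_text()
```

Because `restore` never inspects the broken file, it works even when the failed change left things in a state nobody planned for.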

      --
      Forget thrust, drag, lift and weight. Airplanes fly because of money.
    3. Re:The tag says it all by afidel · · Score: 4, Insightful

      This is networking equipment. Other than transitory information like peer maps and MAC tables, which can be re-learned, you should always be able to revert to the previous state as far as the software and configuration go.

      My comments are that out-of-band management is the networking guy's best friend, and POTS is the best OOB available. Also learn how to change the running config without affecting the saved config; that way the worst case is that you have to power cycle (which can be done with the correct OOB config, or you can pre-schedule a reboot that you cancel if everything goes well). Oh, and downtime windows might seem like a luxury, but unless you are Google or Amazon the business needs to be made aware that they are necessary and critical to the smooth functioning of their IT infrastructure, so you should be making these changes during the downtime window where everyone is aware that things might break.
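
The pre-scheduled reboot trick can be sketched as an ordered command plan. This assumes Cisco IOS, where `reload in <minutes>` schedules a reboot and `reload cancel` aborts it; the helper name `guarded_change_plan` is hypothetical, and the sketch only builds the list of commands. It does not connect to a device or answer the confirmation prompts that IOS shows.

```python
from typing import List


def guarded_change_plan(change_lines: List[str], reload_minutes: int = 10) -> List[str]:
    """Build the command order for a change protected by a scheduled reload.

    The change touches only the running config. If the device becomes
    unreachable, the scheduled reload boots it back into the saved config.
    """
    if reload_minutes < 1:
        raise ValueError("reload_minutes must be at least 1")

    plan = [f"reload in {reload_minutes}"]   # safety net goes in first
    plan.append("configure terminal")
    plan.extend(change_lines)                # running config only
    plan.append("end")
    plan.append("! verify connectivity by hand before continuing")
    plan.append("reload cancel")             # only once things look good
    plan.append("copy running-config startup-config")
    return plan


# Hypothetical change, for illustration only.
plan = guarded_change_plan(
    ["interface GigabitEthernet0/1", " description uplink to core"],
    reload_minutes=5,
)
```

The ordering is the whole design: the reload is scheduled before anything changes, and the saved config is updated only after the reload has been cancelled.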

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    4. Re:The tag says it all by eggoeater · · Score: 3, Interesting

      I'm a call-center telephony engineer. Kinda the same thing as network engineer in that you're routing calls instead of packets.
      Back around '01, I was working for First Union (which later became Wachovia). They had this massive corporate push for anyone and everyone in IT to roll out a standardized Software Configuration Management, and of course we were included. The big problem was the lab. The corporate standard was to test changes in a lab environment and then move to production (duh).
      For a telephony environment, we had a pretty good lab that could duplicate most of our production scenarios, but not all. Another problem was there were a LOT of people with their fingers in the lab since so many groups were involved: e.g., the IVR team is in there because you have to have IVRs in the system. Same with call routing, call recording, desktop software, Q&A, etc.
      So the lab was in a constant state of flux with multiple products, multiple teams, and different software cycles and endless testing always occurring. We made it work by testing the stuff we weren't sure about in the lab, only doing changes in prod after hours, and having really good testing and back-out plans.
      So when the corporate overlords started telling us we couldn't make any changes to production without running everything through the lab first, we basically laughed and told them we'd need around 500 million for the lab and dedicated resources to run it. I ended up telling them that to duplicate the production environment, we'd need another bank as our "test bank", and we could test changes on the test bank and then put them in the production bank.

      As with so many things in that IT department, it went from being a priority to fading away when something else became a priority.

  2. Could be worse by 7213 · · Score: 4, Insightful

    The best bet is to be ready to blame the vendor when things go south ;-)

    Seriously, I'm right there with you. If management does not want to provide for a test lab & reasonable time to test, then it's clear they've made a 'business decision' that the network is not of sufficient value / risk is not great enough for such investments.

    This may change quickly once something goes south (assuming they understand why it did) but you're gonna be talking to a brick wall until then.

    It could be worse: you could have management that are afraid of their own shadows & who freak out at the idea of replacing redundant components after a HW failure. (Ever had to get VP approval to replace a failed GBIC? Oh, I have & yes, I hate my life).

  3. Virtualization? by bsDaemon · · Score: 4, Interesting

    It's perhaps not the best solution, as a lot of problems I've faced since I started getting more into networking stuff than software configuration and web server administration have been related to bad cables rather than bad IOS settings, but virtualization can help you create test situations on the cheap. Specifically, GNS3 allows you to create test networks in a virtual environment, then import software images for your Cisco routers, switches, PIX firewalls, Juniper hardware, etc, all run on hypervisor technology.

    You can also use QEMU to create virtual network nodes. If you have enough RAM, then this can help at least get the logical issues worked out and the software configurations square. Then you just need to do the real work :) I'm still pretty new to networking myself, and I use it to make little test labs for myself when I need to do more than I can with the two 3600 and the 2600-series routers I got to take home for experimenting with. I actually copied the IOS images off of them via TFTP and then can replicate them as many times as I need to, but I can claim I have whatever interfaces I need, plus it will (thankfully) simulate the ATM switch for me as well.

    1. Re:Virtualization? by value_added · · Score: 4, Informative

      Specifically, GNS3 allows you to create test networks in a virtual environment, then import software images for your Cisco routers, switches, PIX firewalls, Juniper hardware, etc, all run on hypervisor technology.

      For anyone unfamiliar with GNS3, a link to the website. There are versions available for Windows, Linux, and OS X. FreeBSD already has it in ports.

      As a side note, I'd add that maintaining a home lab (to the extent practicable and useful) is one way to side-step limitations of what your employer provides. Consider it a combination of "Ongoing Professional Education" and "Proactive Job Security Measures" (i.e., "I better test this shit to save my ass tomorrow").

  4. Document and test at night by jdigriz · · Score: 5, Informative

    Step 1) Make a formal request for the test lab. Make it as detailed as possible. Explain the impact to business if various components fail. Make a plain-language executive summary calling out risks.
    Step 2) Once the request is denied, make sure you have a paper trail of the rejection.
    Step 3) If possible, test network changes on the production equipment at 2am so that impact on users will be less.
    Step 4) Once the inevitable failure occurs, haul out the paper trail and get the bean counter fired. Repeat until test lab is approved.
    Note, step 4 may get you fired instead. Business decisions are somewhat nondeterministic.

    1. Re:Document and test at night by Keruo · · Score: 3, Informative

      step 3) If possible test network changes on the production equipment at 2am so that impact on users will be less

      Been there, done that. Sadly the only way to see how your setup works is to try it in production.
      Sure it helps if you can test it beforehand, but sometimes your lab might not reflect what happens in real network when you roll something out.
      Just make sure you can clock those am hours as overtime/nighttime work.
      And remember to back up the running config twice so you can restore the production network if something goes fubar.
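
A backup only helps if you can confirm the restore actually matched it. Here is a minimal sketch, assuming both the backup and the current running config are available as plain text; the helper name `config_drift` and the sample configs are hypothetical. It uses Python's standard `difflib.ndiff` and keeps only the lines that differ.

```python
import difflib
from typing import List


def config_drift(backup: str, current: str) -> List[str]:
    """Return the lines that differ between a backup and the current config.

    Lines prefixed "- " exist only in the backup; lines prefixed "+ "
    exist only in the current config. An empty list means they match.
    """
    diff = difflib.ndiff(backup.splitlines(), current.splitlines())
    return [line for line in diff if line[:2] in ("- ", "+ ")]


# Hypothetical configs, for illustration only.
backup_cfg = "hostname edge1\ninterface Gi0/1\n no shutdown\n"
changed_cfg = "hostname edge1\ninterface Gi0/1\n shutdown\n"

drift_after_change = config_drift(backup_cfg, changed_cfg)
drift_after_restore = config_drift(backup_cfg, backup_cfg)
```

Running the same check against both backup copies is also a cheap way to make sure the two backups agree with each other before the change starts.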

      --
      There are no atheists when recovering from tape backup.
    2. Re:Document and test at night by SethJohnson · · Score: 4, Funny

      If it goes smoothly anyway, you might look like a whiner that didn't need the expensive toys to keep on the shelf.

      Hence, you have the plug to the main router beneath your own desk. When the sailing looks smooth, you kick out the cord. While everyone freaks out, you open up a terminal window and begin typing nonsensical commands. Say, "Ahaaah!" as you re-plug the router.

      Job security.

      Seth

    3. Re:Document and test at night by Anonymous Coward · · Score: 3, Interesting

      Note, step 4 may get you fired instead. Business decisions are somewhat nondeterministic.

      And that's what happened to me.

      I was forced into making changes in the production environment, and caused an outage that affected 2 people. Once I realized what happened, I quickly fixed it; however due to internal politics I was terminated the next day.

      Initially I was in shock. 10 years, 2 months employed in a single company. Gone. I have a stay-at-home wife and 3 kids; which made things look even bleaker.

      In hindsight, it may be one of the better things to happen to me. I had spoken with a recruiter a few days beforehand to start looking for work. When this happened, I was able to dedicate myself full time to job-searching. I was also off for hunting season, and able to do many things with my family that I normally wouldn't be able to do. The environment where I worked was just awful. Several former co-workers have left since my special day. The CTO is a psychopath. He has 2 sayings he likes to use - the first is 'to do the job right the 1st time'. The second is a Mario Andretti quote of 'If you don't feel like you are out of control, then you aren't going fast enough'. These sayings are mutually exclusive, but logic doesn't apply.

      I start a new position on Jan 5th (but it is only a 6 month contract position). It is a bit more money, and I have about 1/2 the commute. It is also a much better work environment.

      Things I learned:

      - Stockholm syndrome is apparently real. I didn't want to leave because 'it's not that bad'. It was bad. Worse.
      - I hate job hunting.
      - Employment law in Ontario, Canada is not what I thought it was. Pretty much everything I thought I knew was wrong.
      - The economy here in Ontario is poor, but improving (but vastly better than the US).
      - Legal advice in Ontario is tax deductible (at least in reference to employment issues).
      - A certain CTO is a complete and total prick.

      (ha - my captcha word is 'inaction')

  5. My last resort by tchdab1 · · Score: 5, Funny

    I call my buddies at RIM and test my mods on their system.

  6. Packet Life by z4ns4stu · · Score: 3, Informative

    Stretch, over at Packet Life, has a great lab set up that anyone who needs to test Cisco configurations can sign up for and use.

    --
    The whole moon and the entire sky are reflected in one dewdrop on the grass. - Dogen
  7. Tools by Tancred · · Score: 5, Informative

    Here are a few tools:

    GNS3 - http://www.gns3.net/ - free network simulator, based on Dynamips Cisco emulator
    Opnet - http://www.opnet.com/ - detailed planning of networks, from scratch
    Traffic Explorer - http://packetdesign.com/ - plan changes to an existing network

  8. Re:Pretty simple, really by symbolset · · Score: 5, Funny

    Oh, no. We do this all the time. Around the holidays we rewire the production server racks so their ethernet cables droop over the aisles, so we can hang up Christmas cards. Jimmy has a script that blinks the blue UID lights for a festive holiday display.

    --
    Help stamp out iliturcy.
  9. Go virtual! by leegaard · · Score: 3, Informative

    If you are unable to recycle old equipment into your testlab you should go virtual.

    For Cisco routers, GNS3/Dynamips (www.gns3.net) is your friend. Any recent PC or laptop will allow you to build a large and complex topology that will satisfy most experiments and even support you when doing certification preparation. It will only work for routers, so switch-based platforms are out (like the 3570, 6500 and 7600). The good news is that the features are more or less the same and they more or less behave the same way. If "more or less" is not close enough, you need a replica of your production network, or at least a few devices of each type, to test what can be labelled as critical.

    For Juniper routers, Google "Juniper Olive". It will run a Juniper router the same way Dynamips runs a Cisco router.

    In both cases a proactive partnership deal with the vendor will be a good idea. Both Cisco and Juniper (and I am sure all other major network vendors) have programs where they will more or less advise, test and prepare the configurations for you. If you run a critical network this is money well spent.

    In the end it comes down to the level of risk your management is willing to take. Ask them whether they will accept less network uptime, since you are unable to properly test your changes before implementation.

  10. Borrow a lab! by jimpop · · Score: 3, Interesting

    Cisco have many (large) labs located around the world. Sign up for some time in one of them.

  11. Paper Trail by tengu1sd · · Score: 3, Interesting
    >>>refuse to provide funds for expensive lab equipment, test circuits and for reasonable time to get testing done before moving equipment or configs into production.

    Make sure that every change request implementation documents that this change is being placed into the production environment for testing. Document impact ranging from total network failure to moderate inconvenience, and include rollout timetables. The rollout plan needs to include travel times, such as a drive to site B or a cross-country flight.

    Of course the downside of this is that management may go out and hire someone who knows, or at least pretends to know, how to drop changes into place without whining about ignorance and making customers uncomfortable.

  12. Re:Pretty simple, really by lukas84 · · Score: 3, Insightful

    Everyone has a test environment. But not everyone has a production environment.