Trying to Help a Troubled Network with Linux?
vmehta asks: "I was recently put in a situation where I am trying to help a troubled network with many students accessing it. There are issues with broadcast packets and random outages which seem to be plaguing the network. What tools and methods are the best practice when trying to use Linux and Open Source to analyze and fix a network?"
The first step isn't to blunder in and migrate; the first step is to work out what's causing the outages. Use Ethereal or some other packet sniffer to establish where the broadcast floods are coming from, use nmap to find insecure hosts, and investigate what kind of routers are being used and what rules they apply.
Basically, OSS/Linux are great, but don't rush in without establishing the issues first.
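To make the "where are the broadcast floods coming from" step concrete, here is a minimal sketch that counts broadcast frames per source MAC in a saved capture. It assumes a classic libpcap file (microsecond timestamps, Ethernet link type), which is one of the formats a sniffer can save to; the function name `broadcast_talkers` is made up for illustration, and pcapng files or VLAN-tagged trunks would need more handling.

```python
import struct
from collections import Counter

BROADCAST = b"\xff" * 6  # Ethernet broadcast destination address


def broadcast_talkers(pcap_bytes):
    """Count broadcast frames per source MAC in a classic libpcap capture."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written on a big-endian machine
    else:
        raise ValueError("not a classic microsecond pcap file")

    # The link type is the last 4 bytes of the 24-byte global header.
    (linktype,) = struct.unpack(endian + "I", pcap_bytes[20:24])
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled in this sketch")

    counts = Counter()
    offset = 24
    while offset + 16 <= len(pcap_bytes):
        # Per-packet header: ts_sec, ts_usec, captured length, original length.
        _sec, _usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16]
        )
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        # Ethernet header: 6 bytes destination, then 6 bytes source.
        if len(frame) >= 12 and frame[:6] == BROADCAST:
            src = ":".join(f"{b:02x}" for b in frame[6:12])
            counts[src] += 1
    return counts
```

Feeding it the bytes of a capture file and calling `.most_common(5)` on the result gives the five noisiest broadcasters, which you can then match against the switch's MAC table to find the port.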
Almost any time I see this, it's some random box flooding the network. Just go to your switches... the port light that is on solid continuously will point you in the right direction.
No use fixing symptoms; go after the root cause.
What's next, "How do I produce PDF files, using Linux and Open Source?" "How can I leverage Open Source to surf the web?"
Christ, this is like the late '90s, when everything suddenly had "e" in front of it. Dude, get Ethereal, slap it on any Windows box, and be done. No need to get nerdy with Linux. If you know enough that it's broadcast traffic, you're halfway there.
I want to delete my account but Slashdot doesn't allow it.
The first step in troubleshooting is knowing the network topology. How are network segments separated? How are they connected? Where are the routers, hubs, switches, etc.? Which switches are managed, and how are the VLANs set up on them? Where are the DHCP servers, and what do they serve? Where are all your network drops?
Do your network segments have multiple subnets attached to them?
Is everything subnetted properly?
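The last two questions can be checked mechanically once you have a list of subnets and of host settings. Below is a rough sketch using Python's standard `ipaddress` module; the helper names `find_overlaps` and `misconfigured_hosts` and all the addresses are invented for illustration, and collecting the real per-host settings is still up to you.

```python
import ipaddress


def find_overlaps(cidrs):
    """Return pairs of subnets from the list that overlap each other."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [
        (str(a), str(b))
        for i, a in enumerate(nets)
        for b in nets[i + 1:]
        if a.overlaps(b)
    ]


def misconfigured_hosts(hosts, segment):
    """Return host settings ('address/prefix') whose network differs from
    the subnet that is supposed to live on this segment."""
    expected = ipaddress.ip_network(segment)
    bad = []
    for setting in hosts:
        iface = ipaddress.ip_interface(setting)
        # A wrong mask or a wrong address range both change iface.network.
        if iface.network != expected:
            bad.append(setting)
    return bad
```

For example, `misconfigured_hosts(["10.1.2.5/24", "10.1.2.7/16"], "10.1.2.0/24")` flags the second host, whose /16 mask makes it treat a far larger range as local than the segment really holds.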
The first set of questions are ones YOU should be able to answer. After all, it's YOUR network, and YOU should know how it's set up. The last two are harder to deal with, because these settings may be on computers not in your control.
Answer the first questions first, then when you are looking at packet traces, TCP/IP dumps, logs, etc. and you see a problem, you'll have a better idea where the problem is physically located, saving much time and energy.
And then there's the "dumb questions" I shouldn't have to ask: Do you have a loop? Are your cables wired to T568A or T568B standards? Are all your cables in good repair?
Give me my freedom, and I'll take care of my own security, thank you.
Step 1) Map the network both logically (which networks, what is the routing, etc.) and physically... the "tug test". Label everything, and put it all in a spreadsheet. Tools are nmap, pen and paper, and a label printer. Access to the routers, or being friendly with the router admin, is a must.
Step 2) Isolate the problem protocols and hosts. Be on the lookout for AppleTalk, IPX, or old NetBIOS. All very chatty protocols. Look for old hubs and replace them with switches. Look for compromised boxen. Try to VLAN things logically (by department or by usage, whichever is best for the environment). Tools are snort, ethereal, ntop, and syslog (any managed switches should be sending to a syslog server; I've used syslog-ng).
Step 3) Trend as much as you can. Even before the network is cleaned up, start to collect statistics from the switches and/or hosts on your network. Any gateways should be monitored as well. This will let you see if there are problems correlated to a particular time of day, if you're going over your bandwidth, etc. Tools are MRTG, or for more in depth try Cacti http://www.cacti.net/
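The arithmetic behind the trending in Step 3 is simple enough to sketch: sample an interface's octet counter twice and turn the difference into a rate. This is only an illustration of the idea, not how MRTG or Cacti are implemented; the function name `utilisation` is made up, and it assumes a 32-bit counter that wraps at most once between samples.

```python
def utilisation(prev_octets, curr_octets, seconds, link_bps, counter_bits=32):
    """Return (bits per second, percent of link) between two counter samples.

    Assumes the counter wrapped at most once during the interval.
    """
    if seconds <= 0:
        raise ValueError("sampling interval must be positive")
    # Modular subtraction absorbs a single wrap of the counter.
    delta = (curr_octets - prev_octets) % (1 << counter_bits)
    bps = delta * 8 / seconds
    return bps, 100.0 * bps / link_bps
```

Sampling every five minutes and logging the result with a timestamp is enough to spot the time-of-day patterns mentioned above; on a fast link, a 32-bit counter can wrap more than once in that interval, so shorten the interval or use 64-bit counters where the device offers them.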
There is much more after you get to this point, but people will be much happier the faster you get here.
Good luck