Network Troubleshooting - A Complex Process Made Simple
From: "Gideon T. Rasmussen, CISSP, CISM, CFSO, SCSA" <lists () infostruct net>
Date: Mon, 22 Mar 2004 18:41:48 -0500
A Complex Process Made Simple
Gideon T. Rasmussen - CISSP, CISM, CFSO, SCSA
The most efficient way to troubleshoot a network issue is to approach
it systematically. Start by gathering background information; then
troubleshoot following the Open Systems Interconnection (OSI) networking
model.
GATHER BACKGROUND INFORMATION
It is critical to obtain a complete picture of the issue. Carefully
consider how the problem manifests itself. For example, does it apply to
inbound traffic, outbound traffic or both?
Try to determine when the issue started and consider how often the issue
occurs. Is this a constant or intermittent problem? Is this issue
reproducible? If so, how?
The cause may be an unforeseen side effect of maintenance. Has anyone
made any changes to the firewall or the networking equipment that it
connects to?
Perhaps this is a symptom of a larger issue. Has anything else strange
happened?
If this is a new initiative involving a series of complex
configurations, there may be a better solution. In that case, consider
what the ultimate goal is and work from there.
TROUBLESHOOT UP THROUGH THE OSI MODEL
Now that you have a firm understanding of the issue, track down its
source. Conduct network troubleshooting following the OSI model. Start
with the physical layer and work up to the application layer. Network
problems are usually associated with the first three layers. This
section of the article provides troubleshooting tips for firewalls,
networking gear and the systems that connect to them. Unless otherwise
noted, commands apply to both Windows and UNIX/Linux.
Physical Layer
The physical layer is one of the easiest to troubleshoot. It is also
frequently overlooked. If there is a network connectivity problem,
consider the following:
1. Ensure the equipment at the distant end is powered on. Don’t laugh.
2. Examine the cabling. Check for defects or damage. If a cable has been
cut or stretched, it may not pass traffic. Check the connectors too. The
cable may not be properly inserted into the connector. If the connector
is not crimped properly, the wiring may not be making contact with it.
3. Keep in mind that the maximum length of a twisted-pair Ethernet
segment is 100 meters. If a cable is too long, there may be intermittent
connectivity problems, or it may not work at all.
4. Ensure that each cable connector clicks as it is inserted into the
port.
5. Check the network port indicator lights on each system. If a link
light is out, there is an issue with either the network card or cabling.
6. Ensure that the proper type of cabling is in use:
a. Cabling between computer systems and network devices uses a
“straight-through” cable. To examine the wiring, look closely at the
clear connectors at each end of the cable. A straight-through cable has
an identical wiring layout on both sides.
b. Direct cabling between computer systems requires a crossover cable,
for example when you connect a laptop directly to a server. A crossover
cable has two wire pairs flipped, so the wiring on each side has a
slightly different layout.
7. Try swapping out the network cable with one that is known to be good,
or test it with different equipment.
8. If you suspect hardware issues with the network card, use a hardware
diagnostic command to test it.
# getmib -l
dec3 DOWN 10 HD
dec2 DOWN 10 HD
dec1 DOWN 10 HD
dec0 UP 100 FD
In this example, we can see that the dec0 interface is up from a
hardware perspective. The remaining interfaces are down. NOTE: The
command used in this example is specific to CyberGuard firewalls.
9. Finally, check the operating system logs (e.g. syslog, osmlog, event
log).
Data Link Layer
At the data link layer, local communications occur by network port
hardware addresses, also referred to as Media Access Control (MAC)
addresses. Failures at this layer are usually caused by an improperly
configured network port or a physical problem.
1.a. If there are network connectivity issues, check the Address
Resolution Protocol (ARP) table.
# arp -a
hostname (192.168.1.100) at 0:31:f8:3:b7:de
1.b. The IP address of at least one system should be listed. If there
are no systems listed, there is a problem at the physical layer (above).
1.c. From the arp command results above, determine if the MAC address
matches the distant network port hosting that IP address. If the MAC
address is incorrect, delete the offending ARP entry.
# arp -d <IP address>
The ARP entry will be added automatically when network traffic arrives
for that IP address. In most cases this occurs almost immediately. If
the incorrect ARP entry appears again, there is a duplicate IP address
on the network.
1.d. In some instances, two systems are linked in a high availability
(HA) configuration. To ensure consistent service, if one system fails
the other takes over automatically. This is accomplished by the standby
system sending a gratuitous ARP broadcast across the local network
(i.e., my MAC address answers for this IP address). If HA failovers are
not taking place, the local router may have ARP caching enabled. To
restore HA functionality, disable ARP caching on the router.
1.e. MAC addresses can also be used to determine the vendor of systems
attached to the network. This can be useful in tracking down an
offending system. To determine the vendor of a network port, visit the
IEEE site at http://standards.ieee.org/regauth/oui/index.shtml. Search
on the first three octets of the MAC address, separated by dashes (e.g.
08-00-20).
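The MAC comparison in 1.c and the vendor lookup in 1.e both depend on reading MAC addresses consistently. Some arp implementations drop leading zeros from each octet (as in the sample output above), while the vendor search uses zero-padded octets separated by dashes. The following is a minimal Python sketch, not part of any vendor tool. The function names are invented for illustration, and the regular expression assumes the "hostname (IP) at MAC" layout shown in the sample.

```python
import re

# Assumed layout, taken from the sample arp output: "hostname (IP) at MAC"
ARP_LINE = re.compile(r"\((?P<ip>[\d.]+)\) at (?P<mac>[0-9A-Fa-f:]+)")


def normalize_mac(mac):
    """Zero-pad and upper-case each octet, using ':' as the separator."""
    octets = mac.replace("-", ":").split(":")
    if len(octets) != 6:
        raise ValueError("expected six octets, got %r" % mac)
    return ":".join("%02X" % int(octet, 16) for octet in octets)


def oui_prefix(mac):
    """Return the first three octets in dash-separated form (the vendor prefix)."""
    return normalize_mac(mac).replace(":", "-")[:8]


def parse_arp_line(line):
    """Return (ip, normalized_mac) for one arp output line, or None if no match."""
    match = ARP_LINE.search(line)
    if match is None:
        return None
    return match.group("ip"), normalize_mac(match.group("mac"))


if __name__ == "__main__":
    ip, mac = parse_arp_line("hostname (192.168.1.100) at 0:31:f8:3:b7:de")
    print(ip, mac, oui_prefix(mac))
```

Normalizing both the observed and the expected MAC address before comparing them avoids false mismatches caused only by formatting differences.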
2. Systems must be configured to auto negotiate or use the same speed
and duplex settings. Otherwise there may be network performance issues
or intermittent loss of connectivity. If these are the symptoms of your
issue, confirm that network ports at each end of the wire are configured
in the same manner (e.g. auto negotiate or 100 Mbps full duplex).
3. If there are intermittent or constant connectivity problems, use the
netstat command to check the status of the network interfaces:
# netstat -in
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Collis
lo0 2048 127 127.0.0.1 333929 0 333929 0 0
eeE0 1500 10.0.4 10.0.4.1 0 0 448800 0 0
dec3 1500 10.0.2 10.0.2.1 1798 0 3 0 0
dec2 1500 192.168.11 192.168.11.11 0 0 0 2 0
dec1 1500 10.0.5 10.0.5.1 1108 0 224 0 0
dec0 1500 64.94.50 184.108.40.206 738768 0 101501 0 10519
Nonzero values in the Ierrs and Oerrs columns are usually caused by
defective network hardware. Nonzero values in the Collis column indicate
that the network is very busy or that there is an issue with the network
hardware.
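When there are many interfaces, the same check can be scripted. This is a rough Python sketch that assumes the nine-column layout of the sample above; column names and order differ between operating systems, so treat the header handling as an assumption. The function names are hypothetical, and the rows are copied from the sample output.

```python
# Rows copied from the sample "netstat -in" output above.
SAMPLE = """\
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Collis
dec2 1500 192.168.11 192.168.11.11 0 0 0 2 0
dec1 1500 10.0.5 10.0.5.1 1108 0 224 0 0
dec0 1500 64.94.50 184.108.40.206 738768 0 101501 0 10519
"""

# Counters that should normally stay at zero.
COUNTERS = ("Ierrs", "Oerrs", "Collis")


def parse_netstat_in(text):
    """Turn the table into a list of dicts keyed by the header names."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    # Skip rows that do not have one value per header column.
    return [dict(zip(header, row)) for row in rows if len(row) == len(header)]


def interfaces_with_problems(text):
    """Map interface name -> nonzero error/collision counters."""
    problems = {}
    for row in parse_netstat_in(text):
        nonzero = {name: int(row[name]) for name in COUNTERS if int(row[name]) > 0}
        if nonzero:
            problems[row["Name"]] = nonzero
    return problems


if __name__ == "__main__":
    for name, counters in interfaces_with_problems(SAMPLE).items():
        print(name, counters)
```

For the sample rows, this flags dec2 (output errors) and dec0 (collisions) and leaves dec1 alone.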
4. It is possible that the hardware is fine and the interface is down
within the operating system. Use the ifconfig command to determine the
status of the interface:
# ifconfig -a
dec0: flags=4023<UP,BROADCAST,NOTRAILERS,EXTERNAL> mtu 1500
inet 220.127.116.11 netmask ffffff00 broadcast 18.104.22.168
dec1: flags=2023<BROADCAST,NOTRAILERS,INTERNAL> mtu 1500
inet 10.0.5.1 netmask ffffff00 broadcast 10.0.5.255
In this example, the dec0 interface is “UP” and operational. The dec1
interface is down because ifconfig does not list it as “UP.”
If “UP” is missing from the ifconfig status, use ifconfig to bring it
up:
# ifconfig dec1 up
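The "is UP in the flags?" check can also be automated. The sketch below is an illustration in Python, assuming the "name: flags=NNNN<FLAG,FLAG,...>" line format shown in the sample; other ifconfig versions format this line differently. The function names are hypothetical.

```python
import re

# Assumed format, from the sample output: "dec0: flags=4023<UP,BROADCAST,...> mtu 1500"
FLAGS_LINE = re.compile(r"^(?P<name>\S+): flags=\w+<(?P<flags>[^>]*)>")

SAMPLE = """\
dec0: flags=4023<UP,BROADCAST,NOTRAILERS,EXTERNAL> mtu 1500
dec1: flags=2023<BROADCAST,NOTRAILERS,INTERNAL> mtu 1500
inet 10.0.5.1 netmask ffffff00 broadcast 10.0.5.255
"""


def interface_flags(ifconfig_output):
    """Map interface name -> set of flag names found on its flags line."""
    result = {}
    for line in ifconfig_output.splitlines():
        match = FLAGS_LINE.match(line.strip())
        if match:
            result[match.group("name")] = set(match.group("flags").split(","))
    return result


def down_interfaces(ifconfig_output):
    """Return the names of interfaces whose flags do not include UP."""
    flags_by_name = interface_flags(ifconfig_output)
    return sorted(name for name, flags in flags_by_name.items() if "UP" not in flags)


if __name__ == "__main__":
    print(down_interfaces(SAMPLE))
```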
5. The default Maximum Transmission Unit (MTU) setting is 1500 (see
“ifconfig -a” output above).
a. If the MTU is set to something other than 1500, the network may run
slowly. To set the MTU to a default of 1500:
# ifconfig dec0 mtu 1500
b. In the event that there are issues with VPN connectivity over a Cable
or DSL connection, try setting the VPN client workstation to an MTU of
1400. The DrTCP utility can be used for this purpose.
6. During the boot process, Windows workstations typically use the
Dynamic Host Configuration Protocol (DHCP) to obtain basic network
configurations. DHCP servers usually serve up an IP address, DNS server
settings, netmask and default gateway. To view active configurations,
use the ipconfig command.
c:\> ipconfig /all
If this process fails, the workstation will not have a proper network
configuration. This issue typically occurs in home environments when the
workstation boots before the system providing DHCP services (usually a
router or modem). The fix is to release and renew the network
configuration using DHCP.
c:\> ipconfig /release
c:\> ipconfig /renew
Network & Transport Layers
In order to communicate across a network, each system needs an IP
address, a default gateway and a network mask.
1. Confirm that each node on the network has a unique IP address. If a
system boots and advertises an IP address that is already in use, the
system previously using that address will respond, and the new system
will shut down its own networking. Use the ipconfig and ifconfig
commands to determine the IP address assigned to each interface (Windows
and UNIX/Linux, respectively).
2. Each system sends network traffic to its default gateway. If the
default gateway is incorrect or missing, network traffic will not flow.
The only exceptions are manually configured static route entries.
Determine if the default gateway is correct in the output of “netstat -rn.”
3. The network mask tells the system which devices are on its local
network. All other traffic will have to go through a router. The most
common network mask is 255.255.255.0. Current mask configurations can be
displayed with the ipconfig and ifconfig commands. The topic of
subnetting is too complex to discuss here. If you are uncertain about a
network mask, contact your network administrator.
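To see what the mask does in practice, the short Python sketch below uses the standard ipaddress module to decide whether another address is on the local network or must be reached through a router. It also converts the hexadecimal mask form that ifconfig prints (ffffff00) into dotted notation. The function names and the test addresses are illustrative only.

```python
import ipaddress


def hex_mask_to_dotted(hex_mask):
    """Convert a hexadecimal netmask such as 'ffffff00' to dotted notation."""
    return str(ipaddress.IPv4Address(int(hex_mask, 16)))


def is_local(own_ip, netmask, other_ip):
    """True if other_ip falls inside the network defined by own_ip and netmask."""
    # strict=False lets us pass a host address instead of the network address.
    network = ipaddress.ip_network("%s/%s" % (own_ip, netmask), strict=False)
    return ipaddress.ip_address(other_ip) in network


if __name__ == "__main__":
    mask = hex_mask_to_dotted("ffffff00")
    print(mask)
    print(is_local("10.0.5.1", mask, "10.0.5.200"))  # same local network
    print(is_local("10.0.5.1", mask, "10.0.4.1"))    # needs a router
```

With a mask of 255.255.255.0, only addresses that share the first three octets are treated as local.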
4. Try using the ping command between devices (when interacting with a
CyberGuard firewall, this requires echo/ICMP rules with enable replies
selected):
# ping 192.168.1.100
• From a client, can you ping the internal interface of the firewall?
• From the firewall, can you ping the client?
• From the firewall, can you ping the firewall’s default gateway?
5. If there are still issues with external connectivity, contact your
ISP and ask them to test the line.
6. Denial of Service (DoS) attacks can degrade the performance of a
system until it stops accepting network traffic. To determine what
systems are connected, use the netstat command:
# netstat -an (output abbreviated)
tcp 0 0 22.214.171.124 126.96.36.199.1112 ESTABLISHED
tcp 0 0 *.21 *.* LISTEN
tcp 0 0 188.8.131.52 184.108.40.206.80 SYN_RECEIVED
This syntax is also a good way to detect a SYN flood attack, which shows
up as a large number of connections in the SYN_RECEIVED state.
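Counting connection states by eye is tedious on a busy system. The Python sketch below tallies the last column of netstat output. It assumes the state is the final field on each tcp line, as in the sample above. The addresses in SAMPLE are made-up documentation addresses, the function names are hypothetical, and the threshold is an arbitrary placeholder to tune for your environment. Linux prints the state as SYN_RECV, so both spellings are counted.

```python
from collections import Counter

# Hypothetical sample in the same layout as the netstat output above.
SAMPLE = """\
tcp 0 0 192.0.2.10.80 198.51.100.7.1112 ESTABLISHED
tcp 0 0 *.21 *.* LISTEN
tcp 0 0 192.0.2.10.80 203.0.113.5.4001 SYN_RECEIVED
tcp 0 0 192.0.2.10.80 203.0.113.6.4002 SYN_RECEIVED
"""


def count_tcp_states(netstat_output):
    """Count how many tcp lines are in each connection state."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[-1]] += 1
    return states


def looks_like_syn_flood(netstat_output, threshold=100):
    """True if the number of half-open connections reaches the threshold."""
    states = count_tcp_states(netstat_output)
    half_open = states["SYN_RECEIVED"] + states["SYN_RECV"]
    return half_open >= threshold


if __name__ == "__main__":
    print(count_tcp_states(SAMPLE))
    print(looks_like_syn_flood(SAMPLE))
```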
7. If you suspect ICMP based virus traffic:
a. On CyberGuard firewalls, use the netguard command:
# netguard -nS all
Under Sessions, take a look at the ICMP column.
Press Ctrl-C to exit the netguard session
b. On Windows and Linux/UNIX systems, use the netstat command to view
the ICMP protocol statistics:
# netstat -s -p icmp
8. If there are issues with routing, outbound traffic will not flow.
a. Check the routing table for erroneous entries (on UNIX/Linux, use
“netstat -rn” instead of the Windows command shown here):
# route print
b. Use the lookup feature of the route command to determine how it will
route traffic based on an IP address.
# route -n lookup <IP address>
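When reading a routing table by hand, it helps to remember that IP routing generally prefers the most specific matching entry (longest prefix), falling back to the default route only when nothing narrower matches. The Python sketch below illustrates that rule with a hypothetical three-entry table. It does not claim to reproduce what any particular route command does internally, and all networks, gateways and names are invented for the example.

```python
import ipaddress

# Hypothetical routing table: (destination network, gateway).
ROUTES = [
    ("0.0.0.0/0", "192.168.1.1"),      # default route
    ("10.0.0.0/8", "192.168.1.254"),
    ("10.0.5.0/24", "192.168.1.253"),
]


def lookup_route(destination, routes=ROUTES):
    """Return (network, gateway) of the most specific route, or None."""
    address = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(network), gateway)
        for network, gateway in routes
        if address in ipaddress.ip_network(network)
    ]
    if not candidates:
        return None
    # The longest prefix is the most specific match.
    network, gateway = max(candidates, key=lambda item: item[0].prefixlen)
    return str(network), gateway


if __name__ == "__main__":
    for target in ("10.0.5.9", "10.9.9.9", "8.8.8.8"):
        print(target, lookup_route(target))
```

An erroneous entry that is more specific than the intended route will win the lookup, which is why a single stray static route can divert traffic for one subnet while everything else works.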
9. If there are external connectivity issues, determine if Dynamic
Network Address Translation (DNAT) is enabled on the external interface
of the firewall. If DNAT is not enabled, traffic cannot find a route
back to your system unless a static NAT is in place.
On CyberGuard firewalls, select enable replies in UDP and ICMP
packet-filter rules. These protocols support ping, the Domain Name
System (DNS) and syslog.
Session & Presentation
Issues rarely occur at the session and presentation layers.
Application Layer
The application layer is where client-server issues occur. This
includes SMTP, POP3, HTTP, FTP, etc.
1. DNS supports many commonly used programs and services, including web
pages and e-mail. DNS translates host names into IP addresses (e.g.
www.cyberguard.com to 220.127.116.11). To confirm DNS functionality, use
the nslookup command:
# nslookup www.cyberguard.com
It should respond with an IP address. If it does not, check your DNS
server settings. Additional DNS troubleshooting is beyond the scope of
this article.
2. If the traffic flows through a CyberGuard firewall, use the netguard
and grep (search) commands:
# netguard -An | grep 192.168.1.100
P = Traffic was permitted
X = Traffic was proxied
D = Traffic was denied
If the traffic you are expecting is not listed in the netguard output,
then it never reached the firewall.
3. If all else fails, use the tcpdump command to troubleshoot:
# tcpdump -vvpni dec1 -s1514 -w /archive2/dec1.dmp host 10.0.1.13
Tcpdump functions from the application layer down to the data link
layer. If you are troubleshooting a proxy, you will need to run it on
both sides. Tcpdump output is not for the faint of heart. I recommend
viewing it with Ethereal (http://www.ethereal.com). The Windows version
of tcpdump is WinDump (http://windump.polito.it).
As you can see, network troubleshooting can be quite involved. In
practice, the symptoms of the issue will contribute to how you approach
it. For example, if you can ping the remote system, you will
troubleshoot up the OSI model from there. Above all keep your cool and