System administrators routinely encounter network issues when deploying and maintaining software applications on Linux servers. Resolving these problems can be challenging and time-consuming because the applications themselves are complex and many processes and services run concurrently on the same server.
We’ll discuss one of the most powerful tools for dealing with network issues: tcpdump. But before we set out to debug, we must first understand the fundamentals of the network. We’ll start with the TCP/IP model.
The TCP/IP model (short for Transmission Control Protocol/Internet Protocol) organizes the protocols used in networks into four layers. From bottom to top, these are the network access (link) layer, the internet layer, the transport layer, and the application layer.
When it comes to complex network issues, breaking the debugging process into steps makes it easier to follow.
Start by understanding the problem. Debugging is made more difficult by the inherent complexity of networks and the applications using the network systems. You need to locate the exact layer where the problem occurs and you need to understand its mechanism—when and in what context it occurs.
For example, one of your testers has noticed that downloading a file from the application now takes roughly 20% more time: Download time has increased from the usual 30 seconds to 36 seconds. This is a strong indicator that there is indeed a problem somewhere in the network.
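The 20% figure is simple arithmetic on the two timings. As a quick sanity check, here is a minimal shell sketch; the function name pct_increase is our own, chosen for illustration, and it uses integer arithmetic, so results are truncated to whole percent.

```shell
# pct_increase OLD NEW: percentage increase from OLD to NEW (hypothetical helper).
# Uses POSIX integer arithmetic, so fractional percentages are truncated.
pct_increase() {
    echo $(( ($2 - $1) * 100 / $1 ))
}

pct_increase 30 36   # prints 20
```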
In this scenario, to isolate and narrow down the problem, you should first double-check the network’s internet connection speed. If network speed is fine, you should proceed by ruling out other issues, like network congestion, software bugs, or security issues.
Next, you should come up with a working theory of what’s behind the network problem. Continuing with the above example, there are many potential reasons for the increase in download time.
In a hypothetical scenario, the system administration team set up a virtual private network (VPN) for the test environment of the web application. When you download a file directly from a computer connected to that VPN, you don't encounter any performance issues. However, the slowdown does occur when automated tests download files from inside the AWS cluster. This points to a problem with the AWS cluster configuration.
After pair checking and using tcpdump for debugging the network, you find data packet loss when downloading a file from the AWS cluster. The culprit is distance: The current AWS cluster is in the us-east-2 region, whereas the local VPN is set up in Singapore.
In addition, there are unnecessary round trip requests. Each download request goes from the AWS cluster through the public internet, followed by local VPN; it is then processed by Kong (API gateway) before finally getting redirected to the AWS cluster where the downloading service is set up.
In your automated test, change the domain to the internal domain set up in the same AWS cluster. This allows the download request to reach the running service in the cluster directly (instead of going through the VPN and then back to the cluster). Download time from the AWS cluster should now take only 20 seconds.
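To confirm a fix like this, it helps to time the same download over both routes. The sketch below assumes curl is installed; time_download is a name we made up, and the two URLs in the usage note are placeholders rather than real endpoints.

```shell
# time_download URL: print the total transfer time in seconds for one download.
# Hypothetical helper; the response body is discarded, only the timing is kept.
time_download() {
    curl -s -o /dev/null -w '%{time_total}\n' "$1"
}
```

Running time_download https://public.example.com/file.bin and then time_download https://internal.example.com/file.bin a few times each gives a rough before-and-after comparison.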
tcpdump?
Sent and received network packets provide a lot of information about the network system that can help us with troubleshooting network problems. tcpdump is a powerful tool that collects and analyzes these network packets.
If you get stuck while debugging, despite having tried a number of different tools, tcpdump might do the job. It comes with various helpful capabilities, from showing available network interfaces to analyzing captured packets related to a specific file.
For example, we can get a list of the network interfaces available in the system by running tcpdump -D
We can then capture the data packets that are going through the eth0 network interface and save them into a file.
sudo tcpdump -w test.pcap -i eth0
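Left alone, the capture above keeps running until we stop it with Ctrl+C. A minimal sketch of a bounded capture follows; the helper name capture_packets and its argument order are our own choices, while -i, -c, and -w are standard tcpdump options.

```shell
# capture_packets IFACE COUNT FILE (hypothetical helper)
#   -i  interface to listen on
#   -c  stop after COUNT packets instead of running until interrupted
#   -w  write the raw packets to FILE for later analysis
capture_packets() {
    sudo tcpdump -i "$1" -c "$2" -w "$3"
}
```

For example, capture_packets eth0 1000 test.pcap stops on its own after 1,000 packets.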
Next, to view the data captured in detail from the test.pcap saved file, we’ll run the following:
tcpdump -r test.pcap
[Figure: test.pcap file using tcpdump]
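TCP flags show the current state of a TCP connection and are placed in the TCP header. For example, to check whether the request has finished sending data to the server, we can filter for the FIN flag in the TCP header. The commonly used flags are SYN (opens a connection), ACK (acknowledges received data), FIN (signals that the sender has finished sending data), RST (resets the connection), PSH (asks the receiver to deliver data to the application without buffering), and URG (marks urgent data).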
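Each flag occupies one bit in a single byte of the TCP header, so a set of flags can be read as a bitmask. The sketch below decodes that byte by hand to show what a flag filter has to do under the hood; decode_flags is a hypothetical helper, and only the six classic flags are covered.

```shell
# decode_flags VALUE: print the names of the TCP flags set in VALUE.
# VALUE is the flags byte, given in decimal or hex (for example 0x12).
# Bit values: FIN=1, SYN=2, RST=4, PSH=8, ACK=16, URG=32.
decode_flags() {
    df_value=$(( $1 ))
    df_out=""
    for df_pair in FIN:1 SYN:2 RST:4 PSH:8 ACK:16 URG:32; do
        df_name=${df_pair%%:*}   # text before the colon
        df_mask=${df_pair##*:}   # number after the colon
        if [ $(( df_value & df_mask )) -ne 0 ]; then
            df_out="${df_out:+$df_out }$df_name"
        fi
    done
    printf '%s\n' "$df_out"
}

decode_flags 0x12   # prints: SYN ACK
decode_flags 0x11   # prints: FIN ACK
```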
TCP flags are useful for troubleshooting network issues. However, implementing a tool to capture TCP flags is complicated and involves a lot of work. Thankfully, tcpdump comes with a powerful filtering feature that makes it easy to find the packets that have a specific TCP flag or a combination of TCP flags. Moreover, we can even filter packets by IP address, network protocol, or source.
For example, we can filter the TCP packets by the IP of the host:
tcpdump -r test.pcap host 169.254.169.123
[Figure: tcpdump filtering network packets by their host IP]
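Filter primitives can be combined with and, or, and not. As a rough sketch, the helper below (traffic_filter, a name invented here) assembles an expression that narrows traffic by direction, host, protocol, and port; the port number in the usage note is only an example.

```shell
# traffic_filter DIRECTION HOST PROTO PORT (hypothetical helper)
#   DIRECTION  src or dst, relative to HOST
#   PROTO      tcp or udp
# Prints a filter expression that can be passed to tcpdump as one argument.
traffic_filter() {
    printf '%s host %s and %s port %s\n' "$1" "$2" "$3" "$4"
}

traffic_filter src 169.254.169.123 udp 123
# prints: src host 169.254.169.123 and udp port 123
```

Passing the result in quotes, as in tcpdump -r test.pcap "$(traffic_filter src 169.254.169.123 udp 123)", shows only UDP packets sent by that host that involve port 123.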
Alternatively, we can filter for finished packets only, meaning packets that have the FIN flag set:
tcpdump -r test.pcap "tcp[tcpflags] & (tcp-fin) != 0"
[Figure: tcpdump filter option for TCP flag displaying finished packets only]
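Flag names can also be OR-ed together inside the mask to match several flags at once. The helper below is a sketch of that idea; any_flag_filter is our own name, and the short names it accepts (syn, fin, rst, ack) are expanded to tcpdump's tcp-syn, tcp-fin, tcp-rst, and tcp-ack keywords.

```shell
# any_flag_filter FLAG...: build a filter matching packets that have
# at least one of the given TCP flags set (hypothetical helper).
any_flag_filter() {
    aff_flags=""
    for aff_name in "$@"; do
        # Join the keywords with "|" so the mask covers every listed flag.
        aff_flags="${aff_flags:+$aff_flags|}tcp-$aff_name"
    done
    printf 'tcp[tcpflags] & (%s) != 0\n' "$aff_flags"
}

any_flag_filter syn fin
# prints: tcp[tcpflags] & (tcp-syn|tcp-fin) != 0
```

The expression must be quoted when passed on, for example tcpdump -r test.pcap "$(any_flag_filter syn fin)", because the shell would otherwise interpret the & and the parentheses itself. That command lists the packets that open or close a connection.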
Troubleshooting network issues can be a daunting and time-consuming task. In this article, we’ve discussed common problems system administrators encounter in network systems and demonstrated how using tcpdump with advanced techniques like filtering TCP flags can speed up the debugging process.
Real-world troubleshooting can get quite complicated because a number of different services and processes run in the network system. The key to addressing these network problems effectively is to narrow down the potential causes and debug each one until the root cause is uncovered.