Performance Tools: Monitoring CPU, Memory, and IO with top, htop, iostat and More

Picture this: your server is running slower than molasses, users are complaining, and you need to figure out what’s going on – fast. Is the CPU maxed out? Is memory running low? Are the disks struggling to keep up? These are the moments when knowing your performance monitoring tools can save the day.

In the Linux world, we have an arsenal of powerful tools that can help you diagnose performance issues in real-time. Whether you’re a system administrator trying to keep servers running smoothly or a developer optimizing applications, understanding these tools is absolutely essential. Today, we’re going to dive deep into the most important performance monitoring tools, starting with the classics like top and htop, then expanding to cover CPU, memory, and IO monitoring in detail.

Don’t worry if you’re new to performance monitoring – I’ll walk you through each tool step by step, explain what all those numbers mean, and show you how to use them to actually solve real problems.

The Big Picture: What Are We Monitoring?

Before we jump into specific tools, let’s understand what we’re actually looking for when we monitor system performance. Every computer system has four main resources that can become bottlenecks:

  1. CPU (Processor): How much processing power is being used
  2. Memory (RAM): How much memory is being used and available
  3. Disk I/O: How busy your storage devices are
  4. Network I/O: How much network traffic is flowing

When your system slows down, it’s usually because one or more of these resources is overwhelmed. Our job is to figure out which one and why.

Starting with the Basics: top Command

The top command is probably the most famous system monitoring tool in Linux. It’s been around forever and comes pre-installed on virtually every Linux system. Think of it as your system’s dashboard – it gives you a real-time view of what’s happening.

Running top:

top

When you run this command, you’ll see something like this:

top - 14:23:45 up 5 days,  2:14,  3 users,  load average: 0.52, 0.58, 0.59
Tasks: 178 total,   1 running, 177 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3.2 us,  1.1 sy,  0.0 ni, 95.4 id,  0.3 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   7982.1 total,   2847.3 free,   2938.5 used,   2196.3 buff/cache
MiB Swap:   2048.0 total,   2048.0 free,      0.0 used.   4721.8 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 1234 user      20   0  157532  45236  32164 S   5.3   0.6   0:23.45 firefox
 5678 root      20   0   12345   4567   3456 R   2.1   0.1   1:45.67 python3

Let me break down what all this information means:

The Header Section:

  • Current time and uptime: 14:23:45 up 5 days, 2:14 - Shows current time and how long the system has been running
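  • Load average: 0.52, 0.58, 0.59 - These are the 1-minute, 5-minute, and 15-minute load averages. Compare them with your number of CPU cores: values below the core count (below 1.0 on a single-core machine) generally mean your system isn’t overloaded. On Linux, the figure also counts processes in uninterruptible sleep, which usually means waiting on disk I/O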
  • Tasks: Shows total processes and their states (running, sleeping, stopped, zombie)
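  • CPU usage: Percentages of CPU time split into us (user processes), sy (kernel), ni (niced user processes), id (idle), wa (waiting for I/O), hi and si (hardware and software interrupts), and st (time stolen by a hypervisor)
  • Memory usage: Total, free, used, and buffer/cache memory, plus swap and an “avail Mem” estimate of what new processes can claim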
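
To make the load-average rule of thumb concrete, here is a minimal sketch that divides a load value by a core count. The function name load_per_core is hypothetical, and the sketch assumes awk is available; on a live system the inputs could come from /proc/loadavg and nproc, as the trailing comment suggests.

```bash
# Hypothetical helper: print a load average divided by the core count.
# Usage: load_per_core LOAD CORES
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN {
        if (cores + 0 <= 0) exit 1      # avoid dividing by zero
        printf "%.2f\n", load / cores
    }'
}

# Example on a live Linux system (1-minute load from /proc/loadavg):
#   load_per_core "$(cut -d " " -f 1 /proc/loadavg)" "$(nproc)"
```

A result that stays above 1.00 suggests there is more runnable work than there are cores to run it.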

The Process List:

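Each line shows one process with:

  • PID: Process ID number
  • USER: Who owns the process
  • PR, NI: Scheduling priority and nice value
  • VIRT, RES, SHR: Virtual, resident (actually in RAM), and shared memory, in KiB by default
  • S: Process state (R = running, S = sleeping, D = uninterruptible sleep, T = stopped, Z = zombie)
  • %CPU: Percentage of one CPU core this process is using; a multi-threaded process can exceed 100%
  • %MEM: Percentage of physical memory this process is using
  • TIME+: Total CPU time the process has consumed
  • COMMAND: The actual program name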

Useful top commands while it’s running:

# Press these keys while top is running:
q         # Quit top
k         # Kill a process (you'll be asked for the PID)
1         # Show individual CPU cores
M         # Sort by memory usage
P         # Sort by CPU usage (default)
h         # Show help

Running top with useful options:

# Update every 2 seconds instead of default 3
top -d 2

# Show only processes owned by specific user
top -u username

# Run in batch mode (useful for scripts)
top -b -n 1
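
Batch mode makes top’s output easy to feed into other commands. As a rough sketch, the hypothetical function below reads batch output on standard input and prints overall CPU busy time as 100 minus the idle (id) value. It assumes the procps-ng header format shown in the sample output earlier; other top versions may format the line differently.

```bash
# Hypothetical helper: read `top -b` output on stdin and print the overall
# CPU busy percentage (100 minus "id") from the last %Cpu(s) line seen.
cpu_busy_from_top() {
    awk '/^%Cpu/ {
        gsub(/,/, " ")                  # "95.4 id," -> "95.4 id "
        for (i = 2; i <= NF; i++)
            if ($i == "id") { busy = 100 - $(i - 1); found = 1 }
    }
    END { if (found) printf "%.1f\n", busy }'
}

# Example (LC_ALL=C keeps "." as the decimal separator; -n 2 takes two
# samples and the function reports the last one):
#   LC_ALL=C top -b -n 2 -d 1 | cpu_busy_from_top
```

Taking the last of two samples is a common precaution, since the first batch iteration can be unrepresentative of current activity.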

Upgrading to htop: A Better Experience

While top is universal, htop is like top with superpowers. It’s more colorful, more interactive, and generally easier to use. However, it’s not always installed by default.

Installing htop:

# On Ubuntu/Debian
sudo apt install htop

# On CentOS/RHEL/Fedora (CentOS/RHEL may need the EPEL repository enabled first)
sudo dnf install htop

Running htop:

htop

The htop interface is much more user-friendly. You’ll see:

  • Colorful CPU and memory bars at the top showing usage visually
  • Function keys at the bottom showing what each key does
  • Mouse support - you can actually click on things!
  • Tree view of processes showing parent-child relationships

Useful htop features:

# While htop is running:
F1        # Help
F2        # Setup (customize the display)
F3        # Search for a process
F4        # Filter processes
F5        # Tree view (shows process relationships)
F6        # Sort by different columns
F9        # Kill a process
F10       # Quit

You can also use your mouse to:

  • Click on column headers to sort
  • Click on processes to select them
  • Use the scroll wheel to navigate

Customizing htop:

Press F2 to enter setup mode where you can:

  • Add or remove columns
  • Change colors
  • Modify how information is displayed
  • Save your preferences

Monitoring CPU Performance in Detail

Sometimes you need more detailed CPU information than what top or htop provides. Here are some specialized tools:

Using vmstat for CPU statistics:

vmstat 1 5

This prints a report every second, five times. The first line shows averages since boot; the lines after it reflect each one-second interval:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 3045632 189564 2156789    0    0     8    12   45   89  5  2 93  0  0

Understanding vmstat output:

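  • r: Runnable processes (running or waiting for CPU time)
  • b: Processes blocked in uninterruptible sleep, usually waiting for I/O
  • us: User CPU time percentage
  • sy: System (kernel) CPU time percentage
  • id: Idle CPU time percentage
  • wa: Percentage of CPU time spent idle while waiting for I/O
  • st: Percentage of CPU time stolen by a hypervisor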
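
One way to use the r column is to average it over a sampling run and compare the result with the core count. The sketch below does the averaging; avg_run_queue is a hypothetical name, and the function assumes r is the first column, as in the sample output above.

```bash
# Hypothetical helper: read vmstat output on stdin and print the average
# of the "r" (runnable) column across all sample rows.
avg_run_queue() {
    awk '$1 ~ /^[0-9]+$/ { sum += $1; n++ }     # header rows are skipped
         END { if (n) printf "%.2f\n", sum / n }'
}

# Example:
#   vmstat 1 5 | avg_run_queue
```

An average well above the number of cores suggests processes are queuing for CPU time.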

Using sar for historical CPU data:

# Install sysstat package first
sudo apt install sysstat    # Ubuntu/Debian
sudo dnf install sysstat    # CentOS/RHEL/Fedora

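# View CPU usage for today (needs sysstat data collection enabled; on Debian/Ubuntu set ENABLED="true" in /etc/default/sysstat)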
sar -u

# View CPU usage every 2 seconds, 10 times
sar -u 2 10

# View CPU usage for a specific date (XX is the day of month)
sar -u -f /var/log/sysstat/saXX  # Ubuntu/Debian; on CentOS/RHEL/Fedora the path is /var/log/sa/saXX

Monitoring individual CPU cores:

# Show per-CPU statistics
mpstat -P ALL 1 5

This is extremely useful on multi-core systems to see if load is balanced across cores.

Memory Monitoring Deep Dive

Memory issues can be tricky because Linux uses memory in complex ways. Let’s explore tools that help you understand memory usage.

The free command:

free -h

This shows memory usage in human-readable format:

              total        used        free      shared  buff/cache   available
Mem:          7.8Gi       2.9Gi       2.8Gi       156Mi       2.2Gi       4.6Gi
Swap:         2.0Gi          0B       2.0Gi

Understanding the output:

  • total: Total physical memory
  • used: Memory used by processes
  • free: Completely unused memory
  • buff/cache: Memory used for disk buffers and cache (can be freed if needed)
  • available: Memory available for new processes (includes reclaimable cache)

The key insight: Don’t panic if “free” is low – Linux uses free memory for caching to improve performance. Look at “available” instead.
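
The “available” figure comes from the MemAvailable field in /proc/meminfo. As a minimal sketch, the hypothetical function below expresses it as a percentage of total memory, which is handy for scripts and alerts. It assumes a kernel recent enough to report MemAvailable (3.14 or later).

```bash
# Hypothetical helper: read /proc/meminfo-style text on stdin and print
# MemAvailable as a percentage of MemTotal.
mem_available_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%.1f\n", 100 * avail / total }'
}

# Example on a live Linux system:
#   mem_available_pct < /proc/meminfo
```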

Continuous memory monitoring:

# Update every 2 seconds
free -h -s 2

# Show memory usage in different units
free -m    # Mebibytes (MiB)
free -g    # Gibibytes (GiB)

Finding memory-hungry processes:

# Sort processes by memory usage
ps aux --sort=-%mem | head -10

# Or use top/htop and sort by memory (press M in top)
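
Applications that run as many processes (browsers, for example) spread their memory across several lines of ps output. The sketch below totals resident memory per command name; rss_by_command is a hypothetical helper, and because RSS counts shared pages once per process, the totals overstate real usage somewhat.

```bash
# Hypothetical helper: read "RSS_KiB COMMAND" pairs on stdin and print
# the total RSS per command name, largest first.
rss_by_command() {
    awk '{
        rss = $1
        $1 = ""                         # drop the number, keep the name
        sub(/^ +/, "")
        total[$0] += rss
    }
    END { for (name in total) printf "%d\t%s\n", total[name], name }' |
        sort -rn
}

# Example (the trailing "=" signs suppress the ps header row):
#   ps -eo rss=,comm= | rss_by_command | head -5
```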

Detailed memory analysis with smem:

# Install smem (not always available by default)
sudo apt install smem    # Ubuntu/Debian

# Show memory usage by process
smem -r

# Show memory usage by user
smem -u

# Show memory usage graphically (requires matplotlib)
smem --pie name -s pss

Disk I/O Monitoring with iostat and Friends

Disk performance issues can make your entire system feel sluggish. Here’s how to monitor and diagnose disk I/O problems.

Using iostat (part of sysstat):

# Basic I/O statistics (a single report shows averages since boot)
iostat

# Update every 2 seconds, show 5 reports
iostat 2 5

# Show extended statistics
iostat -x

# Monitor specific devices
iostat -x sda sdb

Understanding iostat output:

Device            r/s     w/s     rkB/s     wkB/s   rrqm/s   wrqm/s  %util
sda              2.45    8.32    156.78    445.23     0.12     2.34   15.67

Key metrics:

  • r/s, w/s: Read and write operations per second
  • rkB/s, wkB/s: Kilobytes read/written per second
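  • %util: Percentage of time the device was busy. On SSDs and RAID arrays, which serve requests in parallel, a high value does not necessarily mean the device is saturated
  • await: Average time in milliseconds for I/O requests to be served, including queue time (important for performance). It appears in the full iostat -x output, split into r_await and w_await in recent sysstat versions, and is omitted from the abbreviated sample above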
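
Because the columns printed by iostat -x differ between sysstat versions, scripts are safer looking up %util by its header name than by position. The following is a minimal sketch of that idea; busy_devices is a hypothetical helper, and the threshold passed to it is your own choice.

```bash
# Hypothetical helper: read `iostat -x` output on stdin and print devices
# whose %util is at or above LIMIT. The %util column is located by name.
# Usage: busy_devices LIMIT
busy_devices() {
    awk -v limit="$1" '
        /^avg-cpu/ { col = 0; next }    # CPU section: not device rows
        /^Device/  {
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        col && NF >= col && $col + 0 >= limit + 0 { print $1, $col }
    '
}

# Example:
#   LC_ALL=C iostat -x 1 3 | busy_devices 80
```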

Using iotop to see which processes are using I/O:

# Install iotop
sudo apt install iotop    # Ubuntu/Debian
sudo dnf install iotop    # CentOS/RHEL/Fedora

# Run iotop (requires root privileges)
sudo iotop

iotop shows you exactly which processes are reading from and writing to disk, which is invaluable for finding I/O bottlenecks.

Alternative: using pidstat for I/O monitoring:

# Show I/O statistics for all processes
pidstat -d 1

# Show I/O statistics for a specific process
pidstat -d -p 1234 1

Network Performance Monitoring

Network issues can also cause performance problems. Here are some tools to monitor network activity:

Using netstat to see network connections:

# Show listening TCP and UDP sockets (numeric addresses and ports)
netstat -tuln

# Show per-interface statistics
netstat -i

# Show routing table
netstat -r

Using ss (modern replacement for netstat):

# Show established TCP connections (add -a to include listening sockets)
ss -t

# Show connected UDP sockets (add -a to include unconnected ones)
ss -u

# Show listening sockets
ss -l

# Show summary statistics
ss -s
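
A quick way to summarize a long socket list is to count sockets per TCP state. The sketch below assumes the output layout of ss -tan, where the first column is the state and the first row is a header; count_tcp_states is a hypothetical name.

```bash
# Hypothetical helper: read `ss -tan` output on stdin and count sockets
# per TCP state.
count_tcp_states() {
    awk 'NR > 1 { count[$1]++ }         # NR > 1 skips the header row
         END { for (s in count) print s, count[s] }' |
        sort
}

# Example:
#   ss -tan | count_tcp_states
```

An unusually large number of sockets in one state, such as TIME-WAIT, can be a hint worth following up on.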

Monitoring network traffic with iftop:

# Install iftop
sudo apt install iftop    # Ubuntu/Debian
sudo dnf install iftop    # CentOS/RHEL/Fedora

# Run iftop (requires root)
sudo iftop

iftop shows you real-time network traffic by connection, helping you identify which connections are using the most bandwidth.

Using nload for interface monitoring:

# Install nload
sudo apt install nload    # Ubuntu/Debian
sudo dnf install nload    # CentOS/RHEL/Fedora

# Monitor default interface
nload

# Monitor specific interface
nload eth0

Advanced Monitoring: Putting It All Together

Using atop for comprehensive monitoring:

# Install atop
sudo apt install atop    # Ubuntu/Debian
sudo dnf install atop    # CentOS/RHEL/Fedora

# Run atop
atop

atop combines CPU, memory, disk, and network monitoring in one tool. It’s particularly useful because it can log data for historical analysis.

Using glances for a dashboard view:

# Install glances
sudo apt install glances    # Ubuntu/Debian
sudo dnf install glances    # CentOS/RHEL/Fedora

# Run glances
glances

glances provides a comprehensive system overview in a single, colorful interface. It even has a web interface option:

# Run glances with web interface
glances -w
# Then visit http://localhost:61208

Practical Troubleshooting Scenarios

Let me walk you through some common performance issues and how to diagnose them:

Scenario 1: System is running slowly

# Step 1: Check overall system load
uptime

# Step 2: See what's using CPU
htop
# Look for processes with high %CPU

# Step 3: Check memory usage
free -h
# Is available memory very low?

# Step 4: Check disk I/O
iostat -x 1 5
# Is %util consistently high?

Scenario 2: Specific application is slow

# Find the process ID
pgrep application_name

# Monitor the specific process
pidstat -p PID 1
pidstat -d -p PID 1  # For I/O
pidstat -r -p PID 1  # For memory

Scenario 3: Intermittent performance issues

# Use sar to collect data over time
sar -u 60 1440 > cpu_usage.log  # Every minute for 24 hours

# Use atop for comprehensive logging
# When the atop service is enabled, it logs a snapshot every 10 minutes by default
# Replay a day's log with: atop -r /var/log/atop/atop_YYYYMMDD

Creating Your Own Monitoring Scripts

Sometimes you want to automate monitoring or create custom alerts. Here’s a simple script to check system health:

#!/bin/bash
# save as system_check.sh

echo "=== System Health Check ==="
echo "Date: $(date)"
echo

echo "Load Average:"
uptime

echo -e "\nMemory Usage:"
free -h

echo -e "\nDisk Usage:"
df -h | grep -E '^/dev'

echo -e "\nTop 5 CPU Processes:"
ps aux --sort=-%cpu | head -6

echo -e "\nTop 5 Memory Processes:"
ps aux --sort=-%mem | head -6

echo -e "\nDisk I/O:"
# -d limits the report to devices; tail skips the banner and blank line
iostat -dx | tail -n +3

Make it executable and run it:

chmod +x system_check.sh
./system_check.sh

Performance Monitoring Best Practices

Regular monitoring:

  • Check system performance regularly, not just when problems occur
  • Establish baselines for normal performance
  • Set up automated alerts for critical thresholds

Understanding normal vs. abnormal:

  • High CPU usage isn’t always bad – it might mean your system is doing useful work
  • Low free memory is normal on Linux – the system uses free memory for caching
  • Occasional I/O spikes are normal, but sustained high I/O might indicate problems

Know your system:

  • Different systems have different normal performance characteristics
  • A web server will have different patterns than a database server
  • Document what’s normal for your specific systems

Use multiple tools:

  • Don’t rely on just one tool
  • Cross-reference information from different sources
  • Some tools are better for real-time monitoring, others for historical analysis

When to Be Concerned

Here are some red flags that indicate performance problems:

CPU Issues:

  • Load average consistently above the number of CPU cores
  • High CPU usage with low actual work being done
  • Processes spending too much time waiting for CPU (high run queue)

Memory Issues:

  • Very low available memory (not just free memory)
  • Heavy swap usage (swapping to disk is very slow)
  • Out of memory errors in logs

Disk I/O Issues:

  • Very high disk utilization (>80% consistently)
  • High I/O wait times
  • Disk errors in system logs

Network Issues:

  • High packet loss
  • Consistently high bandwidth usage
  • Many connection timeouts
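
The CPU, memory, and disk red flags above could be turned into a rough automated check along the following lines. This is only a sketch: red_flags is a hypothetical function, the 10% memory threshold is an assumption chosen for illustration, and the load and 80% disk thresholds follow the rules of thumb listed above.

```bash
# Hypothetical helper: print a warning for each red flag that applies.
# Usage: red_flags LOAD CORES AVAIL_MEM_PCT DISK_UTIL_PCT
# The 10% memory threshold is an illustrative assumption.
red_flags() {
    awk -v load="$1" -v cores="$2" -v avail="$3" -v util="$4" 'BEGIN {
        if (load + 0 > cores + 0) print "CPU: load average exceeds core count"
        if (avail + 0 < 10)       print "Memory: available memory below 10%"
        if (util + 0 > 80)        print "Disk: utilization above 80%"
    }'
}

# Example with made-up values (load 6.0 on 4 cores, 5% available, 95% util):
#   red_flags 6.0 4 5 95
```

A single sample proves little; the red flags matter when they persist, so a check like this is most useful when run repeatedly and compared against your baseline.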

Wrapping Up

Performance monitoring in Linux might seem overwhelming at first, but once you understand the basic tools and what they’re telling you, it becomes much more manageable. Start with the basics like top and htop to get familiar with what normal performance looks like on your systems.

Remember that performance monitoring is both an art and a science. The numbers are important, but understanding what they mean in the context of your specific systems and workloads is even more important. Don’t just collect data – learn to interpret it and act on it.

The tools we’ve covered today – from the simple top command to advanced utilities like atop and glances – give you everything you need to monitor CPU, memory, disk, and network performance effectively. Practice using them regularly, not just during emergencies, and you’ll develop an intuitive understanding of system performance.

Most importantly, remember that performance monitoring is about solving problems and improving user experience. These tools are your diagnostic instruments, helping you keep systems running smoothly and users happy. Master them, and you’ll be well-equipped to handle whatever performance challenges come your way.