There are few things more stressful than a server running out of disk space in production. Services refuse to start, databases cannot write, logs stop recording, and deployments fail silently. The worst part? It usually happens at 3 AM on a Saturday. In this guide, you will learn how to find what is consuming your disk, clean up safely, and set up proactive monitoring so you never get caught off guard again. We will cover df, du, ncdu, log management, and LVM basics for expanding storage on the fly.
Understanding Your Disk: df
The df (disk free) command shows filesystem-level disk usage. It is the first command you should run when you suspect a disk space problem.
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 100G 87G 8.2G 92% /
/dev/sdb1 500G 312G 163G 66% /home
tmpfs 3.9G 1.2M 3.9G 1% /run
/dev/sdc1 50G 2.1G 45G 5% /backup
The critical column is Use%. When a filesystem hits 100%, things break. But problems often start at 90% because some services (like MySQL) need breathing room for temporary files.
Warning thresholds: Set up alerts at 80% (warning) and 90% (critical). At 95%, many services begin to fail. At 100%, your server can become unresponsive, requiring emergency intervention.
Useful df Variations
$ df -h / # Only show root filesystem
$ df -h --type=ext4 # Only ext4 filesystems
$ df -i # Show inode usage (not bytes)
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda1 6553600 342891 6210709 6% /
Inode exhaustion: A disk can show "free space" with df -h but still refuse to create files if inodes are exhausted (df -i shows 100%). This happens when millions of tiny files consume all available inodes. Common culprits: mail queue directories, session files, and cache directories with one file per entry.
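To find where the inodes went, count entries instead of bytes. A minimal sketch for /var (point it at whichever filesystem df -i flags; the path is just an example):
# Count entries (files and directories) under each /var subdirectory
$ for d in /var/*/; do printf '%8d %s\n' "$(find "$d" -xdev 2>/dev/null | wc -l)" "$d"; done | sort -rn | head -5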
Finding Large Directories: du
Once you know which filesystem is full, du (disk usage) helps you find where the space is going.
$ du -sh /var/*
2.4G /var/log
1.8G /var/lib
512M /var/cache
256M /var/www
128M /var/tmp
64M /var/mail
$ du -sh /var/log/* | sort -rh | head -10
1.2G /var/log/journal
450M /var/log/nginx
312M /var/log/syslog
128M /var/log/mysql
64M /var/log/auth.log
Drilling Down Efficiently
The key technique is to start broad and narrow down. Here is a systematic approach:
1. Start at root: du -sh /* 2>/dev/null | sort -rh | head to find the biggest top-level directory.
2. Drill into the biggest: du -sh /var/* | sort -rh | head
3. Keep going: du -sh /var/log/* | sort -rh | head
4. Find specific large files: find /var/log -size +100M -type f
$ du -sh /* 2>/dev/null | sort -rh | head -5
45G /home
12G /var
8.2G /opt
4.1G /usr
2.3G /snap
Interactive Exploration: ncdu
If du feels tedious, ncdu (NCurses Disk Usage) is a game-changer. It provides an interactive, browsable view of disk usage with keyboard navigation.
$ sudo apt install ncdu
$ ncdu / # Scan entire root filesystem
$ ncdu /var/log # Scan specific directory
$ ncdu -x / # Stay on same filesystem (skip mounts)
ncdu Keyboard Shortcuts
| Key | Action |
| Arrow keys | Navigate |
| Enter | Enter directory |
| d | Delete selected file/directory |
| n | Sort by name |
| s | Sort by size |
| g | Show percentage/graph |
| q | Quit |
Why ncdu is Better Than du
- Interactive — navigate without rerunning commands
- Visual bars show relative sizes at a glance
- Can delete files directly from the interface
- Scans once, then exploration is instant
- The -x flag prevents scanning mounted volumes
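A lesser-known trick: ncdu can export a scan to a file with -o and load it later with -f, so you can scan during a quiet window and browse any time (the file path here is just an example):
$ sudo ncdu -x -o /tmp/scan.ncdu /   # Scan once and export the result
$ ncdu -f /tmp/scan.ncdu             # Browse it later without rescanning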
Finding Large Files
Sometimes you need to find the biggest individual files, not directories. The find command excels at this.
# Find files larger than 100MB anywhere on the system
$ find / -type f -size +100M -exec ls -lh {} \; 2>/dev/null | sort -k5 -rh | head -20
-rw-r--r-- 1 root root 2.1G /var/log/journal/abc123/system.journal
-rw-r----- 1 mysql mysql 1.2G /var/lib/mysql/ibdata1
-rw-r--r-- 1 root root 856M /var/log/syslog.1
-rw-r--r-- 1 root root 512M /home/user1/backup-old.tar.gz
-rw-r--r-- 1 root root 256M /var/log/nginx/access.log
# Find files modified more than 90 days ago and larger than 50MB
$ find /var/log -type f -size +50M -mtime +90
/var/log/syslog.12.gz
/var/log/nginx/access.log.52.gz
# Find and list recently modified large files
$ find / -type f -size +500M -mtime -7 2>/dev/null
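If the -exec ls pipeline above feels slow (it launches one ls process per matching file), GNU find can print sizes itself. A variant that also stays on one filesystem with -xdev:
# Size in bytes followed by path, sorted numerically, one find process total
$ find / -xdev -type f -size +100M -printf '%s %p\n' 2>/dev/null | sort -rn | head -20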
Safe Cleanup Strategies
Before deleting anything, understand what you are removing. Here are the most common cleanup targets, ordered from safest to most aggressive.
1. Journal Logs
Systemd journal logs can grow to several gigabytes. This is the safest cleanup target because vacuuming only deletes archived journal files; active logging continues unaffected.
$ journalctl --disk-usage
Archived and active journals take up 2.4G
$ sudo journalctl --vacuum-size=500M
Vacuuming done, freed 1.9G of archived journals
$ sudo journalctl --vacuum-time=2weeks # Alternative: by age
To set a permanent limit, edit /etc/systemd/journald.conf:
[Journal]
SystemMaxUse=500M
MaxRetentionSec=2week
$ sudo systemctl restart systemd-journald
2. APT Package Cache
$ du -sh /var/cache/apt/
1.2G /var/cache/apt/
$ sudo apt clean # Remove ALL cached packages
$ sudo apt autoclean # Remove only obsolete packages
$ sudo apt autoremove # Remove unused dependencies
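Before running autoremove for the first time on a machine you care about, preview what it would purge with apt-get's simulate flag:
# Dry run: lists the packages autoremove would delete, changes nothing
$ apt-get -s autoremove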
3. Old Kernels
Ubuntu keeps old kernel versions for rollback. After confirming your current kernel boots reliably, remove old ones.
$ uname -r # Current kernel
6.8.0-101-generic
$ dpkg -l 'linux-image-*' | grep '^ii' # Installed kernels
$ sudo apt autoremove --purge # Remove old kernels
4. Snap Cache
$ du -sh /var/lib/snapd/cache/
856M /var/lib/snapd/cache/
$ sudo sh -c 'rm -rf /var/lib/snapd/cache/*'
5. /tmp Cleanup
$ du -sh /tmp
2.3G /tmp
# Remove files older than 7 days
$ sudo find /tmp -type f -atime +7 -delete
Be careful with /tmp: Some applications store runtime files in /tmp that they expect to persist during their lifecycle. Never blindly rm -rf /tmp/* on a running system. Use -atime +7 (not accessed in 7 days) to only remove stale files.
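A quick safety check before cleaning: see which processes currently hold files open under /tmp, using lsof's +D flag to scan the directory recursively:
$ sudo lsof +D /tmp | head -20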
Log Rotation
The best long-term solution for log file growth is proper rotation. logrotate compresses old logs and removes them on a schedule.
$ cat /etc/logrotate.d/nginx
/var/log/nginx/*.log {
    daily
    missingok
    rotate 14
    compress
    delaycompress
    notifempty
    create 0640 www-data adm
    sharedscripts
    postrotate
        [ ! -f /var/run/nginx.pid ] || kill -USR1 $(cat /var/run/nginx.pid)
    endscript
}
| Directive | Meaning |
| daily | Rotate every day |
| rotate 14 | Keep 14 rotated files |
| compress | Gzip old logs |
| delaycompress | Wait one cycle before compressing |
| notifempty | Do not rotate if the log is empty |
| maxsize 100M | Rotate if file exceeds 100 MB |
| postrotate | Command to run after rotation (reload the service) |
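Note that maxsize only takes effect when logrotate actually runs, so pairing it with daily means: rotate once a day, or sooner if the scheduler fires again and the file has crossed the limit (schedule logrotate hourly if you rely on this). As a sketch, adding it to the example above is one directive:
/var/log/nginx/*.log {
    daily
    maxsize 100M
    # ... remaining directives as above ...
}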
Test your logrotate configuration without actually rotating:
$ sudo logrotate --debug /etc/logrotate.d/nginx
$ sudo logrotate --force /etc/logrotate.d/nginx # Force rotation now
Docker Cleanup
Docker is one of the most common disk space hogs. Images, containers, volumes, and build cache accumulate quickly.
$ docker system df
TYPE TOTAL ACTIVE SIZE RECLAIMABLE
Images 24 8 12.5GB 8.2GB (65%)
Containers 12 5 1.2GB 800MB (66%)
Volumes 18 7 4.5GB 2.1GB (46%)
Build Cache 45 0 3.8GB 3.8GB (100%)
# Remove unused images, containers, networks
$ docker system prune
# More aggressive: also remove unused volumes
$ docker system prune --volumes
# Remove ALL images not used by running containers
$ docker image prune -a
# Remove dangling (untagged) images only
$ docker image prune
Automate Docker cleanup: Add a cron job or systemd timer to run docker system prune -f --filter "until=168h" weekly. This removes resources older than 7 days without confirmation prompts.
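One way to wire that up on a host that uses cron, as a minimal sketch (the script name is arbitrary; -f skips the confirmation prompt):
$ sudo tee /etc/cron.weekly/docker-prune <<'EOF'
#!/bin/sh
# Prune Docker resources older than 7 days without prompting
docker system prune -f --filter "until=168h"
EOF
$ sudo chmod +x /etc/cron.weekly/docker-prune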
LVM Basics: Expanding Storage
LVM (Logical Volume Manager) adds a layer of abstraction between physical disks and filesystems. If your server uses LVM (common on cloud providers), you can expand storage without downtime.
LVM Architecture
Physical Volumes (PV: /dev/sda1, /dev/sdb) → Volume Group (VG: pool of storage) → Logical Volumes (LV: like partitions) → Filesystem (ext4, xfs)
Check Current LVM Layout
$ sudo pvs # Physical Volumes
PV VG Fmt Attr PSize PFree
/dev/sda3 vg-main lvm2 a-- 198.00g 48.00g
$ sudo vgs # Volume Groups
VG #PV #LV #SN Attr VSize VFree
vg-main 1 2 0 wz--n- 198.00g 48.00g
$ sudo lvs # Logical Volumes
LV VG Attr LSize Pool
root vg-main -wi-ao---- 100.00g
home vg-main -wi-ao---- 50.00g
Expanding a Logical Volume
If your Volume Group has free space (see PFree or VFree above), you can expand a Logical Volume online.
$ sudo lvextend -L +20G /dev/vg-main/root # Add 20 GB
Size of logical volume vg-main/root changed from 100.00 GiB to 120.00 GiB
# Or use all remaining free space:
$ sudo lvextend -l +100%FREE /dev/vg-main/root
# For ext4:
$ sudo resize2fs /dev/vg-main/root
# For XFS (xfs_growfs takes the mount point, not the device):
$ sudo xfs_growfs /
# Verify:
$ df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg--main-root 120G 87G 28G 76% /
Zero downtime: Both resize2fs (ext4) and xfs_growfs (XFS) support online resizing. You do not need to unmount the filesystem or reboot the server. This is one of LVM's greatest advantages.
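A convenient shortcut: lvextend's -r (--resizefs) flag grows the filesystem in the same step, calling the appropriate resize tool for you:
# Extend the LV and resize its filesystem in one command
$ sudo lvextend -r -L +20G /dev/vg-main/root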
Adding a New Disk to LVM
If the Volume Group is full, you can add a new physical disk and extend the group.
# Initialize the new disk as a Physical Volume
$ sudo pvcreate /dev/sdb
# Add it to the existing Volume Group
$ sudo vgextend vg-main /dev/sdb
# Now extend the LV with the new space
$ sudo lvextend -l +100%FREE /dev/vg-main/root
$ sudo resize2fs /dev/vg-main/root
Monitoring Disk Usage Over Time
Reactive cleanup is good, but proactive monitoring is better. Here are approaches from simple to comprehensive.
Simple Cron Alert
# Add to root's crontab: check every hour, alert if over 90%
# (the command avoids literal % signs, which are special inside crontab entries)
$ sudo crontab -e
0 * * * * usage=$(df / | awk 'NR==2 {print $5+0}'); [ "$usage" -gt 90 ] && echo "DISK WARNING: root filesystem at ${usage} percent" | mail -s "Disk Alert" [email protected]
Quick Disk Report Script
#!/bin/bash
echo "=== Filesystem Usage ==="
df -h | grep -v tmpfs
echo ""
echo "=== Top 10 Directories ==="
du -sh /* 2>/dev/null | sort -rh | head -10
echo ""
echo "=== Files > 500MB ==="
find / -type f -size +500M 2>/dev/null | head -10
echo ""
echo "=== Journal Size ==="
journalctl --disk-usage
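To make the report a habit, install the script somewhere on the PATH and schedule it; a sketch assuming you saved it as disk-report.sh (the schedule and recipient are examples):
$ sudo install -m 755 disk-report.sh /usr/local/bin/disk-report
$ echo '0 8 * * 1 root /usr/local/bin/disk-report | mail -s "Weekly disk report" [email protected]' | sudo tee /etc/cron.d/disk-report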
Cleanup Checklist
When your disk is full, work through this checklist in order:
| Action | Typical Space Saved | Risk |
| Journal vacuum (journalctl --vacuum-size=200M) | 500 MB - 5 GB | None |
| APT cache (apt clean) | 500 MB - 2 GB | None |
| Old kernels (apt autoremove --purge) | 500 MB - 2 GB | Low |
| Rotated logs (find /var/log -name "*.gz" -mtime +30 -delete) | 200 MB - 2 GB | Low |
| Docker cleanup (docker system prune --volumes) | 2 - 20 GB | Medium |
| Snap cache cleanup | 500 MB - 2 GB | Low |
| /tmp old files | 100 MB - 5 GB | Medium |
| User backup files (find /home -name "*.tar.gz" -mtime +60) | 1 - 50 GB | High |
Deleted Files Still Using Space
Gotcha: If you delete a file that is still open by a process, the disk space is NOT freed until the process closes the file handle. This is a common reason why rm does not free space. Find these "deleted but open" files with:
$ sudo lsof +L1 | head -20
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME
nginx 1102 root 5w REG 8,1 524288000 0 1234 /var/log/nginx/access.log (deleted)
# Fix: restart the service to release the file handle
$ sudo systemctl restart nginx
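If restarting the service is not an option, you can reclaim the space immediately by truncating the deleted file through the process's file descriptor in /proc. The PID and FD number come from the lsof output (1102 and 5w in the example above):
# Truncate the deleted-but-open file without restarting the process
$ sudo sh -c ': > /proc/1102/fd/5'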
Summary
- Use df -h for filesystem overview and du -sh for directory-level analysis
- Install ncdu for interactive, visual disk exploration
- Start cleanup with safe targets: journal logs, APT cache, old kernels
- Configure logrotate to prevent log files from growing unbounded
- Use LVM to expand storage without downtime when cleanup is not enough
- Check for deleted-but-open files with lsof +L1 when space is not freed after deletion
- Set up monitoring and alerts at 80% and 90% thresholds
Disk management at scale: Panelica provides per-user disk quota management and a monitoring dashboard that shows storage usage across all accounts. Administrators can set disk limits per user and receive alerts before quotas are reached, preventing any single user from filling the server's disk.