RAM Is Finite — But Your Server Doesn't Have to Crash
At some point, every server runs out of memory. It happens on a Tuesday afternoon when traffic spikes, or at 3 AM when a memory-leaking process finally eats the last gigabyte. What happens next — whether the server recovers gracefully or grinds to a halt — depends entirely on how well you've configured memory management.
Linux has a sophisticated memory subsystem. It caches aggressively, uses swap as an overflow valve, and has the OOM Killer as a last resort. Most admins know these exist, but few understand them well enough to tune them properly. This guide changes that.
We'll cover how Linux actually uses memory, what every metric in free -h means, how to create and tune swap, how the OOM Killer decides what to kill, and how cgroups v2 lets you set hard limits per user or service. At the end, you'll have a troubleshooting playbook for the four most common memory crisis scenarios.
1. How Linux Uses Memory
Physical RAM vs Virtual Memory
Every process in Linux operates in a virtual address space. The kernel maps these virtual addresses to physical RAM through page tables. A page is typically 4KB. This abstraction lets the kernel do clever things: share memory between processes, move pages to swap, and use free RAM for caching — all transparently.
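To make the page concept concrete, here is a minimal sketch that asks the system for its page size and works out how many pages a given allocation would need. The helper name `pages_for_bytes` is made up for illustration, and the 4096-byte fallback is an assumption for systems where `getconf` is unavailable.

```shell
# Page size in bytes; falls back to 4096 (an assumption) if getconf is missing.
page_size=$(getconf PAGESIZE 2>/dev/null || echo 4096)

# Number of pages needed to back a byte count, rounded up.
pages_for_bytes() {
    bytes=$1
    echo $(( (bytes + page_size - 1) / page_size ))
}

echo "page size: ${page_size} bytes"
echo "pages for 1 MiB: $(pages_for_bytes 1048576)"
```

Even a one-byte allocation occupies a whole page once touched, which is why the helper rounds up.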
Memory Zones
Physical RAM is divided into zones based on hardware constraints:
- DMA zone — Low memory (first 16MB) reserved for legacy DMA devices
- Normal zone — The main memory pool for most allocations
- HighMem zone — On 32-bit systems only; irrelevant on modern 64-bit servers
Buffer Cache vs Page Cache
Linux is famous for using "all" available RAM. Two main caches are responsible:
- Page cache: Recently read file data. When Nginx serves a file, its contents stay in cache, so the next request is served from RAM, not disk.
- Buffer cache: Block device metadata (filesystem structures, inodes). Merged with the page cache in modern kernels, but still reported separately in /proc/meminfo.
This is intentional and beneficial. Free RAM is wasted RAM. Linux fills it with cache to speed up future reads.
Reading free -h Correctly
              total        used        free      shared  buff/cache   available
Mem:          7.8Gi       2.1Gi       512Mi       128Mi       5.2Gi       5.4Gi
Swap:         4.0Gi          0B       4.0Gi
Here's what each column actually means:
- total: Total physical RAM installed
- used: RAM in use by processes and the kernel. In this example it is total - free - buff/cache; newer procps-ng releases report total - available instead
- free: Completely unused RAM (no cache, no process)
- shared: Memory used by tmpfs and shared memory segments
- buff/cache: Memory used for buffers and page cache, mostly reclaimable if needed
- available: Estimate of how much RAM is available for new processes without swapping
The column that matters is available, not free. In the example above, even though only 512MB shows as free, the system has 5.4GB available because the kernel can reclaim cache on demand.
A server showing "90% used" in some monitoring dashboards is usually fine — they're reporting used / total without accounting for reclaimable cache. Always check available.
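If your dashboard only reports used / total, a small helper can compute the number that matters. The sketch below reads MemTotal and MemAvailable from a meminfo-format file; the function name `mem_available_pct` is an illustrative choice, not a standard tool.

```shell
# Print MemAvailable as a percentage of MemTotal.
# Optional argument: a meminfo-format file (defaults to /proc/meminfo).
mem_available_pct() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total > 0) printf "%.1f\n", avail * 100 / total
            else exit 1
        }
    ' "${1:-/proc/meminfo}"
}

# Usage: mem_available_pct
```

Accepting a file argument makes it easy to run the same check against a saved snapshot from an incident.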
2. Understanding Memory Metrics in Depth
/proc/meminfo — The Full Picture
While free gives you a summary, /proc/meminfo gives you every metric the kernel tracks:
cat /proc/meminfo
Key fields to understand:
- MemTotal — Total usable RAM (slightly less than installed due to kernel reservations)
- MemFree — Completely unallocated RAM
- MemAvailable — Estimated available for applications (the important one)
- Buffers — Raw disk block buffers
- Cached — Page cache (file data)
- SwapTotal / SwapFree — Swap space total and remaining
- Active — Recently used memory, less likely to be reclaimed
- Inactive — Memory not recently used, candidate for reclaim
- Dirty — Memory waiting to be written to disk (unsynced writes)
- Writeback — Memory currently being written to disk
- Slab — Kernel data structures (dentries, inodes)
- SReclaimable — Slab entries that can be reclaimed under pressure
- SUnreclaim — Slab entries that cannot be reclaimed
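Scanning the full file by eye gets tedious, so a short filter helps. This sketch prints a handful of the fields above converted to MiB; which fields to include is a judgment call, and `meminfo_summary` is a hypothetical helper name.

```shell
# Print selected /proc/meminfo fields in MiB.
# Optional argument: a meminfo-format file (defaults to /proc/meminfo).
meminfo_summary() {
    awk '
        $1 ~ /^(MemAvailable|Dirty|Writeback|SReclaimable|SUnreclaim):$/ {
            name = $1
            sub(/:$/, "", name)                 # drop the trailing colon
            printf "%-14s %8.1f MiB\n", name, $2 / 1024
        }
    ' "${1:-/proc/meminfo}"
}

# Usage: meminfo_summary
```

A steadily growing SUnreclaim or a Dirty value that never drains are the kinds of trends this makes easy to spot.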
vmstat — Memory, CPU, and I/O in Real-Time
vmstat 1 5
This prints stats every 1 second, 5 times. The memory columns:
- swpd: Amount of swap space in use
- free: Idle memory
- buff: Buffer cache
- cache: Page cache
- si: Memory swapped in from disk per second (KiB/s by default); bad if persistently non-zero
- so: Memory swapped out to disk per second (KiB/s by default); bad if persistently non-zero
Persistent non-zero si and so is a clear sign your server needs more RAM or better-tuned memory limits.
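One way to turn "persistent" into a number is to count how many vmstat samples show any swap traffic. The sketch below reads vmstat output on stdin and finds the si and so columns from the header row; `swap_activity` is a made-up name, and skipping the first data row assumes the usual vmstat behavior of reporting since-boot averages there.

```shell
# Read vmstat output on stdin and count samples that show swap traffic.
swap_activity() {
    awk '
        # Header row: remember which columns hold si and so.
        $1 == "r" && $2 == "b" {
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            next
        }
        # Data rows start with a number; the banner line is ignored.
        si && $1 ~ /^[0-9]+$/ {
            rows++
            if (rows == 1) next          # first row: averages since boot
            samples++
            if ($si + $so > 0) busy++
        }
        END { printf "%d of %d samples show swap traffic\n", busy, samples }
    '
}

# Usage: vmstat 1 6 | swap_activity
```

If most samples are busy over a minute or more, the server is actively swapping rather than absorbing a brief spike.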
RSS vs VSZ vs PSS vs USS
When you look at per-process memory, the numbers can be confusing. Here's the breakdown:
| Metric | What It Measures | Includes Shared? |
|---|---|---|
| VSZ (Virtual Size) | Total virtual address space claimed | Yes (including not yet allocated) |
| RSS (Resident Set Size) | Physical RAM currently in use | Yes (counts shared pages fully) |
| PSS (Proportional Set Size) | RSS with shared pages split proportionally | Partial (fair share of shared) |
| USS (Unique Set Size) | Memory truly private to this process | No |
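For a single process's private footprint, USS is the most honest metric; PSS is the better choice when summing across processes, because shared pages are counted only once. Use smem -tk to see per-process USS/PSS/RSS: -t adds a totals row and -k prints human-readable unit suffixes.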
# Install smem
apt install smem
# Sort by USS, largest first (-r reverses smem's default ascending order)
smem -tk -s uss -r
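When smem is not installed, the same figures can be approximated from the kernel's per-process rollup file. This sketch assumes a kernel that provides /proc/PID/smaps_rollup and treats USS as Private_Clean plus Private_Dirty; `proc_mem` is an illustrative name.

```shell
# Summarize RSS, PSS and USS from an smaps_rollup-format file.
# USS is taken as Private_Clean + Private_Dirty (an approximation).
proc_mem() {
    awk '
        $1 == "Rss:"           { rss = $2 }
        $1 == "Pss:"           { pss = $2 }
        $1 == "Private_Clean:" { uss += $2 }
        $1 == "Private_Dirty:" { uss += $2 }
        END { printf "RSS=%d kB PSS=%d kB USS=%d kB\n", rss, pss, uss }
    ' "$1"
}

# Usage (1234 is a placeholder PID): proc_mem /proc/1234/smaps_rollup
```

Reading another user's rollup file usually requires root.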
3. What Is SWAP?
Swap is disk space (a file or partition) that Linux uses as overflow when physical RAM fills up. When the kernel needs to allocate memory and RAM is full, it takes the least recently used pages and writes them to swap — freeing that physical RAM for the new allocation.
How Swapping Works
The kernel tracks which memory pages are "hot" (recently accessed) and which are "cold" (not accessed in a while). When under memory pressure, it moves cold anonymous pages (process heap/stack — not file-backed) to swap. If that process later needs those pages, they're read back from swap — causing a page fault, which is slow.
This is the key tradeoff: swap lets your server survive a memory spike, but with degraded performance. An application that's actively swapping will feel sluggish because disk I/O (even NVMe) is orders of magnitude slower than RAM.
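To see which processes have had pages moved to swap, you can read the VmSwap field the kernel exposes in each /proc/PID/status file. A minimal sketch follows; the function name `swap_by_process` and the optional root-directory argument are assumptions made so it can be pointed at test data.

```shell
# List processes with pages in swap, largest first.
# Optional argument: proc root directory (defaults to /proc).
swap_by_process() {
    root=${1:-/proc}
    for status in "$root"/[0-9]*/status; do
        [ -r "$status" ] || continue
        awk '
            $1 == "Name:"   { name = $2 }
            $1 == "VmSwap:" { swap = $2 }
            END { if (swap > 0) printf "%d kB\t%s\n", swap, name }
        ' "$status" 2>/dev/null        # processes may exit mid-scan
    done | sort -rn
}

# Usage: swap_by_process | head
```

Kernel threads have no VmSwap line and are skipped automatically.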
Swap Partition vs Swap File
| Feature | Swap Partition | Swap File |
|---|---|---|
| Performance | Marginally faster | Nearly identical on modern kernels |
| Flexibility | Fixed size, hard to resize | Can be resized, moved, added anytime |
| Setup complexity | Requires partition planning | A handful of commands (fallocate, chmod, mkswap, swapon) |
| Recommended for | Bare metal with planned layout | VPS, cloud, containers |
| Hibernate support | Full | Possible with extra config |
For most VPS and cloud servers, a swap file is the right choice. It's just as fast, and you can add or resize it without touching partitions.
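The usual swap file sequence can be sketched as below. By default the function only prints the commands; the `APPLY=1` switch and the name `make_swapfile` are inventions for this example. Some filesystems reject swap files created with fallocate, in which case writing the file with dd is a common fallback.

```shell
# Print (default) or run the steps that create and enable a swap file.
# Set APPLY=1 and run as root to execute instead of printing.
# Assumes the path contains no spaces.
make_swapfile() {
    path=${1:-/swapfile}
    size=${2:-4G}
    for cmd in \
        "fallocate -l $size $path" \
        "chmod 600 $path" \
        "mkswap $path" \
        "swapon $path"
    do
        if [ "${APPLY:-0}" = "1" ]; then
            $cmd || return 1
        else
            echo "$cmd"
        fi
    done
}

make_swapfile /swapfile 4G
```

The chmod step matters: a world-readable swap file can leak memory contents, and swapon warns about insecure permissions.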
SSD vs HDD Swap
On an SSD, swap is tolerable for occasional spikes. On an HDD, heavy swapping will make your server feel frozen. If you're on an HDD VPS, swap is still better than an OOM crash, but it's not a performance safety net.
5. Partitioning with fdisk and gdisk
Critical warning: Never partition a disk that is mounted and in active use. For existing server disks, use LVM (covered next) to avoid downtime.
List All Partitions
# List all disks and partitions
fdisk -l
# Or with lsblk (tree view, very readable)
lsblk
# Sample lsblk output:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 50G 0 disk
|--sda1 8:1 0 49G 0 part /
|--sda2 8:2 0 1G 0 part [SWAP]
sdb 8:16 0 500G 0 disk
fdisk
# Open fdisk on a new disk
fdisk /dev/sdb
# Interactive commands:
# n -- New partition
# d -- Delete partition
# p -- Print current partition table
# w -- Write changes and exit
# q -- Quit without saving
Example: Creating a single partition on /dev/sdb:
Command: n
Partition type: p (primary)
Partition number: 1
First sector: [Enter for default]
Last sector: [Enter for full disk]
Command: w
gdisk — GPT Partitioning
gdisk works much like fdisk, with a similar interactive command set, but always uses GPT. Use gdisk for disks larger than 2TB (the MBR limit) or any new installation.
gdisk /dev/sdb
parted — Modern Alternative
# Create GPT partition table
parted /dev/sdb mklabel gpt
# Create a single partition using 100% of disk
parted /dev/sdb mkpart primary ext4 0% 100%
# List partitions
parted /dev/sdb print
6. Creating and Managing Filesystems
Creating Filesystems
# Create ext4 filesystem
mkfs.ext4 /dev/sdb1
# Create ext4 with a label
mkfs.ext4 -L data_disk /dev/sdb1
# Create XFS filesystem
mkfs.xfs /dev/sdb1
# Create btrfs filesystem
mkfs.btrfs /dev/sdb1
Mounting Filesystems
# Mount a partition
mount /dev/sdb1 /mnt/data
# Mount with options
mount -o noatime,data=writeback /dev/sdb1 /mnt/data
# Unmount
umount /mnt/data
# Lazy unmount (only when nothing else works)
umount -l /mnt/data
Finding UUIDs for fstab
# Get UUID of all block devices
blkid
# Sample output:
/dev/sda1: UUID=a1b2c3d4-e5f6-7890-abcd-ef1234567890 TYPE=ext4
/dev/sdb1: UUID=f1e2d3c4-b5a6-9870-fedc-ba9876543210 TYPE=ext4
# Get UUID of specific device
blkid /dev/sdb1
Persistent Mount via /etc/fstab
# Add to /etc/fstab:
UUID=f1e2d3c4-b5a6-9870-fedc-ba9876543210 /mnt/data ext4 defaults,noatime 0 2
# Test without rebooting
mount -a
# Verify
df -h /mnt/data
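A typo in /etc/fstab can leave a server unbootable, so a quick sanity check before `mount -a` is cheap insurance. The sketch below is a deliberately simple linter, not a full parser: it flags entries with an implausible field count and entries that use raw /dev/sdX names. `check_fstab` is a hypothetical helper.

```shell
# Flag suspicious fstab entries; exit status is non-zero if any are found.
# Optional argument: fstab-format file (defaults to /etc/fstab).
check_fstab() {
    awk '
        BEGIN { bad = 0 }
        /^[ \t]*(#|$)/ { next }          # skip comments and blank lines
        NF < 4 || NF > 6 {
            printf "line %d: expected 4 to 6 fields, got %d\n", NR, NF
            bad = 1
            next
        }
        $1 ~ /^\/dev\/sd/ {
            printf "line %d: %s may be renamed across boots; prefer UUID=\n", NR, $1
            bad = 1
        }
        END { exit bad }
    ' "${1:-/etc/fstab}"
}

# Usage: check_fstab && mount -a
```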
Filesystem Information and Health
# ext4 filesystem info
tune2fs -l /dev/sda1
# XFS filesystem info (must be mounted)
xfs_info /mnt/data
# Check and repair ext4 (unmounted only!)
fsck.ext4 -f /dev/sdb1
# Check XFS (unmounted)
xfs_repair /dev/sdb1
9. Memory Optimization Tips
Right-Size Your Services
The biggest wins come from tuning application memory usage, not just adding RAM:
- PHP-FPM pm.max_children: Each worker uses ~30–100MB. With 2GB allocated to PHP, set max_children to 20, not 100.
- MySQL innodb_buffer_pool_size: Should be 50–70% of dedicated RAM. On a 4GB server, set it to 2GB.
- PostgreSQL shared_buffers: 25% of RAM as a starting point. effective_cache_size should be 75% of RAM.
- Redis maxmemory: Always set this. Without it, Redis will grow until the OOM Killer hits it.
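The sizing rules above are simple arithmetic, sketched here with two hypothetical helpers. The per-worker footprint is something you would measure on your own server; the values passed below are only the figures quoted in the list.

```shell
# max_children = memory budget / per-worker footprint (rounds down).
fpm_max_children() {
    budget_mb=$1
    worker_mb=$2
    echo $(( budget_mb / worker_mb ))
}

# A percentage of total RAM in MB (rounds down).
pct_of_mb() {
    total_mb=$1
    pct=$2
    echo $(( total_mb * pct / 100 ))
}

fpm_max_children 2048 100   # 2 GB budget, 100 MB workers -> 20
pct_of_mb 4096 50           # InnoDB buffer pool at 50% of 4 GB -> 2048
pct_of_mb 4096 70           # upper end of the range -> 2867
```

Sizing against the worst-case worker footprint is the conservative choice: it leaves headroom instead of betting that every worker stays small.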
Use cgroups for Hard Limits
Don't rely on application-level limits alone. cgroups v2 memory.max is enforced by the kernel — no application can override it.
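To check how close a service is to its limit, you can read memory.current and memory.max from its cgroup directory. This is a sketch under the assumption of a cgroups v2 hierarchy; `cgroup_mem` is an illustrative name and the path in the usage line is only an example.

```shell
# Print usage vs. limit for one cgroup v2 directory.
# memory.max holds a byte count or the literal string "max" (no limit).
cgroup_mem() {
    dir=$1
    current=$(cat "$dir/memory.current") || return 1
    limit=$(cat "$dir/memory.max") || return 1
    if [ "$limit" = "max" ]; then
        echo "$(( current / 1048576 )) MiB used, no limit"
    else
        echo "$(( current / 1048576 )) MiB used of $(( limit / 1048576 )) MiB ($(( current * 100 / limit ))%)"
    fi
}

# Usage (example path): cgroup_mem /sys/fs/cgroup/system.slice/nginx.service
```

On systemd hosts the limit itself is typically set with the unit's MemoryMax= setting rather than by writing to memory.max directly.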
Enable Swap Even on High-RAM Servers
Even on a 64GB server, a small 2–4GB swap serves as a safety valve. Without it, the OOM Killer activates the moment any spike exceeds available RAM. With it, the kernel can move cold anonymous pages to swap first, giving you time to react.
jemalloc and tcmalloc
MySQL, Redis, and some other applications can use alternative memory allocators that reduce fragmentation and improve performance under load:
# Install jemalloc
sudo apt install libjemalloc-dev
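# MySQL: malloc-lib is read by mysqld_safe, so it belongs under [mysqld_safe]
# in /etc/mysql/mysql.conf.d/mysqld.cnf. On systemd-managed installs that skip
# mysqld_safe, preload the library via LD_PRELOAD in the service unit instead.
# malloc-lib=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2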
Transparent Hugepages and Redis
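Transparent Huge Pages (THP) are enabled by default on many Linux distributions (check /sys/kernel/mm/transparent_hugepage/enabled). They can help workloads with large, contiguous memory access patterns, but they hurt Redis: after a fork for persistence, copy-on-write operates on 2MB pages instead of 4KB pages, which causes latency spikes on writes.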
# Disable THP for Redis (temporary)
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
# Make permanent — add to /etc/rc.local or a systemd unit
PostgreSQL and Hugepages
For PostgreSQL on dedicated database servers, enabling hugepages reduces page table overhead:
# Calculate hugepages needed:
# shared_buffers / 2MB (default hugepage size)
# e.g., 8GB shared_buffers = 4096 hugepages
# Set in /etc/sysctl.conf:
vm.nr_hugepages = 4096
# Enable in postgresql.conf:
# huge_pages = on
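The calculation in the comments above can be scripted as follows. `hugepages_needed` is a hypothetical helper, and the 2048 kB default is an assumption that matches the Hugepagesize most x86-64 systems report in /proc/meminfo. PostgreSQL's shared memory segment is somewhat larger than shared_buffers alone, so treat the result as a lower bound.

```shell
# Hugepages needed = shared_buffers / hugepage size, rounded up.
hugepages_needed() {
    shared_buffers_mb=$1
    hugepage_kb=${2:-2048}     # assumed default; check Hugepagesize in /proc/meminfo
    echo $(( (shared_buffers_mb * 1024 + hugepage_kb - 1) / hugepage_kb ))
}

hugepages_needed 8192    # 8 GB shared_buffers -> 4096
```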
10. Monitoring Memory
The right monitoring setup catches memory issues before they become outages.
Quick Command Reference
# Snapshot
free -h
# 5-second trend
vmstat 1 5
# Per-process with accurate metrics
smem -tk -s uss
# Full kernel metrics
cat /proc/meminfo
# Swap devices and current usage
swapon --show
Alert Thresholds
| Metric | Warning | Critical | Action |
|---|---|---|---|
| MemAvailable | < 20% of total | < 10% of total | Identify and limit high-memory processes |
| Swap used | > 20% | > 60% | Memory pressure — check vmstat si/so |
| vmstat si/so | Occasional spikes | Persistent > 10 MB/s | Active swapping — needs immediate attention |
| OOM kill events | Any occurrence | Repeated events | Investigate immediately |
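The MemAvailable row of the table translates directly into a check you could run from cron or a monitoring agent. A minimal sketch, with `mem_alert_level` as a made-up name and the thresholds hard-coded from the table:

```shell
# Classify MemAvailable against the 20% / 10% thresholds.
# Optional argument: a meminfo-format file (defaults to /proc/meminfo).
mem_alert_level() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total <= 0) { print "unknown"; exit 2 }
            pct = avail * 100 / total
            if      (pct < 10) print "critical"
            else if (pct < 20) print "warning"
            else               print "ok"
        }
    ' "${1:-/proc/meminfo}"
}

# Usage: [ "$(mem_alert_level)" = "ok" ] || echo "memory pressure on $(hostname)"
```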
Prometheus + Grafana
For production systems, point-in-time metrics aren't enough. You need historical trends. The standard stack:
- node_exporter: Exposes Linux memory metrics over HTTP for Prometheus to scrape
- Prometheus: Scrapes and stores the metrics
- Grafana: Visualizes trends with alerting
Key Prometheus metrics for memory:
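- node_memory_MemAvailable_bytes: The one to alert on
- node_memory_SwapFree_bytes / node_memory_SwapTotal_bytes: Fraction of swap still free; swap usage is 1 minus this value
- node_vmstat_pswpin / node_vmstat_pswpout: Cumulative pages swapped in and out; wrap in rate() to get swap I/O per second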
Quick Reference Cheat Sheet
Essential Commands
| Task | Command |
|---|---|
| Memory overview | free -h |
| Full memory details | cat /proc/meminfo |
| Real-time I/O + swap activity | vmstat 1 |
| Per-process accurate usage | smem -tk -s uss |
| Top by memory | top then Shift+M |
| Check OOM kills | dmesg \| grep -i oom |
| Check OOM score | cat /proc/PID/oom_score |
| Protect process from OOM | echo -1000 > /proc/PID/oom_score_adj |
| Check swappiness | cat /proc/sys/vm/swappiness |
| Set swappiness (temp) | sysctl vm.swappiness=10 |
| Create 4GB swap file | fallocate -l 4G /swapfile && chmod 600 /swapfile && mkswap /swapfile |
| Enable swap | swapon /swapfile |
| List active swap | swapon --show |
| Disable swap | swapoff /swapfile |
Kernel Tuning Cheat Sheet
| Parameter | Default | Web Server | Database |
|---|---|---|---|
| vm.swappiness | 60 | 10 | 1–5 |
| vm.vfs_cache_pressure | 100 | 50 | 50 |
| vm.dirty_ratio | 20 | 15 | 10 |
| vm.dirty_background_ratio | 10 | 5 | 3 |
| vm.min_free_kbytes | Computed from RAM size at boot | 65536 | 65536 |
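To make these settings survive a reboot, they can be written to a sysctl.d drop-in. The sketch below prints a fragment for either column of the table; the function name `sysctl_profile`, the file name in the usage comment, and picking swappiness 1 from the 1–5 database range are all choices made for this example.

```shell
# Emit a sysctl.d fragment for the "web" or "db" column of the table.
sysctl_profile() {
    case $1 in
        web) swappiness=10; dirty=15; dirty_bg=5 ;;
        db)  swappiness=1;  dirty=10; dirty_bg=3 ;;
        *)   echo "usage: sysctl_profile web|db" >&2; return 1 ;;
    esac
    cat <<EOF
vm.swappiness = $swappiness
vm.vfs_cache_pressure = 50
vm.dirty_ratio = $dirty
vm.dirty_background_ratio = $dirty_bg
vm.min_free_kbytes = 65536
EOF
}

sysctl_profile web
# To apply as root (file name is an example):
#   sysctl_profile web > /etc/sysctl.d/99-memory.conf && sysctl --system
```

Printing first and redirecting second lets you review the values before they touch the running kernel.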
Memory Management in Panelica
Everything described in this guide can be configured and monitored through Panelica's interface without touching the command line:
- Per-user cgroup limits: Set memory.max per user from the user panel. Each user's processes run in an isolated cgroups v2 slice. No user can consume more RAM than their quota.
- Real-time monitoring: Dashboard shows per-user memory usage, overall server memory trends, and swap usage with historical graphs via Prometheus + Grafana integration.
- PHP-FPM per-user pools: Each user gets their own PHP-FPM pool with configurable pm.max_children. Memory leaks in one user's PHP process can't exhaust RAM for everyone else.
- Automated alerts: Set thresholds for memory usage and receive webhook, Telegram, Slack, or email notifications when a server or user crosses them.
The 5-layer isolation architecture means that memory management isn't just a system-level concern — it's enforced at the user boundary. This is the difference between shared hosting that crashes under load and shared hosting that keeps running because individual users have hard limits.
Conclusion
Memory management is not glamorous, but it's one of the foundations of a stable server. Understanding what free -h is actually telling you, why Linux uses "all" your RAM for cache, and how to configure swap and OOM behavior correctly will save you from outages that catch most admins off guard.
The key takeaways:
- Watch available, not free: high memory usage in free -h is usually cache, not a problem
- Enable swap on every server: even 2GB on a high-RAM machine is a useful safety net
- Lower swappiness on servers (10–20 for web, 1–5 for databases)
- Protect critical processes from the OOM Killer via oom_score_adj
- Use cgroups v2 limits to isolate users and services from each other
- Monitor trends, not just snapshots — a gradual memory leak looks fine until it doesn't
A server that handles memory pressure gracefully — using swap as a buffer, throttling at soft limits, protecting critical processes — is a server you can trust at 3 AM.