SWAP and Memory Management in Linux: When Your Server Runs Out of RAM

RAM Is Finite — But Your Server Doesn't Have to Crash

At some point, every server runs out of memory. It happens on a Tuesday afternoon when traffic spikes, or at 3 AM when a memory-leaking process finally eats the last gigabyte. What happens next — whether the server recovers gracefully or grinds to a halt — depends entirely on how well you've configured memory management.

Linux has a sophisticated memory subsystem. It caches aggressively, uses swap as an overflow valve, and has the OOM Killer as a last resort. Most admins know these exist, but few understand them well enough to tune them properly. This guide changes that.

We'll cover how Linux actually uses memory, what every metric in free -h means, how to create and tune swap, how the OOM Killer decides what to kill, and how cgroups v2 lets you set hard limits per user or service. At the end, you'll have a troubleshooting playbook for the four most common memory crisis scenarios.

1. How Linux Uses Memory

Physical RAM vs Virtual Memory

Every process in Linux operates in a virtual address space. The kernel maps these virtual addresses to physical RAM through page tables. A page is typically 4KB. This abstraction lets the kernel do clever things: share memory between processes, move pages to swap, and use free RAM for caching — all transparently.

Memory Zones

Physical RAM is divided into zones based on hardware constraints:

DMA zone — Low memory (first 16MB) reserved for legacy DMA devices
DMA32 zone — On 64-bit systems, memory below 4GB for devices limited to 32-bit addressing
Normal zone — The main memory pool for most allocations
HighMem zone — On 32-bit systems only; irrelevant on modern 64-bit servers

Buffer Cache vs Page Cache

Linux is famous for using "all" available RAM. Two main caches are responsible:

Page cache — Recently read file data. When Nginx serves a file, its contents stay in cache. Next request: served from RAM, not disk.
Buffer cache — Block device metadata (filesystem structures, inodes). Merged with page cache in modern kernels, but reported separately in /proc/meminfo.

This is intentional and beneficial. Free RAM is wasted RAM. Linux fills it with cache to speed up future reads.

Reading `free -h` Correctly

              total        used        free      shared  buff/cache   available
Mem:          7.8Gi       2.1Gi       512Mi       128Mi       5.2Gi       5.4Gi
Swap:         4.0Gi          0B       4.0Gi

Here's what each column actually means:

total — Total physical RAM installed
used — RAM in use by processes and the kernel, excluding cache (total - free - buff/cache; newer procps releases compute it as total - available)
free — Completely unused RAM (no cache, no process)
shared — Memory used by tmpfs (shared memory segments)
buff/cache — Memory used for buffers and page cache — reclaimable if needed
available — Estimate of how much RAM is available for new processes without swapping

The column that matters is available, not free. In the example above, even though only 512MB shows as free, the system has 5.4GB available because the kernel can reclaim cache on demand.

A server showing "90% used" in some monitoring dashboards is usually fine — they're reporting used / total without accounting for reclaimable cache. Always check available.
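As a rough sketch of the "check available" advice, the function below reads MemTotal and MemAvailable from /proc/meminfo and prints available memory as a percentage of total. The name mem_available_pct and the optional file argument are illustrative choices for this example, not part of any standard tool.

```shell
# Print MemAvailable as a percentage of MemTotal.
# Reads /proc/meminfo by default; pass another file to check saved data.
mem_available_pct() {
    awk '
        /^MemTotal:/     { total = $2 }
        /^MemAvailable:/ { avail = $2 }
        END {
            if (total > 0) printf "%.1f\n", avail * 100 / total
            else exit 1
        }
    ' "${1:-/proc/meminfo}"
}

# Only run against the live system where /proc/meminfo exists.
if [ -r /proc/meminfo ]; then
    mem_available_pct
fi
```

A dashboard that reports used / total would show a very different number on a cache-heavy server, which is the discrepancy described above.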

2. Understanding Memory Metrics in Depth

/proc/meminfo — The Full Picture

While free gives you a summary, /proc/meminfo gives you every metric the kernel tracks:

cat /proc/meminfo

Key fields to understand:

MemTotal — Total usable RAM (slightly less than installed due to kernel reservations)
MemFree — Completely unallocated RAM
MemAvailable — Estimated available for applications (the important one)
Buffers — Raw disk block buffers
Cached — Page cache (file data)
SwapTotal / SwapFree — Swap space total and remaining
Active — Recently used memory, less likely to be reclaimed
Inactive — Memory not recently used, candidate for reclaim
Dirty — Memory waiting to be written to disk (unsynced writes)
Writeback — Memory currently being written to disk
Slab — Kernel data structures (dentries, inodes)
SReclaimable — Slab entries that can be reclaimed under pressure
SUnreclaim — Slab entries that cannot be reclaimed

vmstat — Memory, CPU, and I/O in Real-Time

vmstat 1 5

This prints stats every 1 second, 5 times. The memory columns:

swpd — Amount of virtual memory used (swap)
free — Idle memory
buff — Buffer cache
cache — Page cache
si — Swap in: memory read back from swap per second (reported in KiB/s, not pages); bad if persistently non-zero
so — Swap out: memory written to swap per second (reported in KiB/s, not pages); bad if persistently non-zero

Persistent non-zero si and so is a clear sign your server needs more RAM or better-tuned memory limits.
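One way to turn "persistent non-zero si and so" into a number is to count how many vmstat samples show swap traffic. This is a minimal sketch that assumes the default vmstat column layout (si and so in columns 7 and 8); count_swapping_samples is a made-up helper name.

```shell
# Count vmstat samples with non-zero swap-in (si) or swap-out (so).
# Reads vmstat output on stdin. Header lines are ignored, and the first
# data row is skipped because it reports averages since boot.
count_swapping_samples() {
    awk '
        $1 ~ /^[0-9]+$/ {
            if (seen++ && ($7 > 0 || $8 > 0)) n++
        }
        END { print n + 0 }
    '
}

# Usage (takes about 10 seconds):
#   vmstat 1 11 | count_swapping_samples
# A result close to 10 suggests sustained swapping rather than a brief spike.
```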

RSS vs VSZ vs PSS vs USS

When you look at per-process memory, the numbers can be confusing. Here's the breakdown:

Metric	What It Measures	Includes Shared?
VSZ (Virtual Size)	Total virtual address space claimed	Yes (including not yet allocated)
RSS (Resident Set Size)	Physical RAM currently in use	Yes (counts shared pages fully)
PSS (Proportional Set Size)	RSS with shared pages split proportionally	Partial (fair share of shared)
USS (Unique Set Size)	Memory truly private to this process	No

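For the memory a process would free if it exited, USS is the most honest metric; PSS is the fairer figure when you add processes together, because each shared page is counted only once across them. smem reports USS, PSS, and RSS per process: -t adds a totals row, -k prints human-readable units, and -s selects the sort column.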

# Install smem (Debian/Ubuntu)
sudo apt install smem

# Sort by USS; add -r to reverse the sort order
smem -tk -s uss
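Where smem is not installed, the same three figures can be approximated for a single process from /proc/PID/smaps_rollup (available on reasonably recent kernels). The sketch below treats USS as Private_Clean + Private_Dirty; proc_mem_summary is a hypothetical helper name.

```shell
# Summarise RSS, PSS and USS (in kB) from an smaps_rollup-style file.
# USS is computed here as Private_Clean + Private_Dirty.
proc_mem_summary() {
    awk '
        /^Rss:/           { rss = $2 }
        /^Pss:/           { pss = $2 }
        /^Private_Clean:/ { uss += $2 }
        /^Private_Dirty:/ { uss += $2 }
        END { printf "rss_kb=%d pss_kb=%d uss_kb=%d\n", rss, pss, uss }
    ' "$1"
}

# Example: inspect the current shell, if the kernel exposes smaps_rollup.
if [ -r "/proc/$$/smaps_rollup" ]; then
    proc_mem_summary "/proc/$$/smaps_rollup"
fi
```

Reading another user's process this way generally requires root.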

3. What Is SWAP?

Swap is disk space (a file or partition) that Linux uses as overflow when physical RAM fills up. When the kernel needs to allocate memory and RAM is full, it takes the least recently used pages and writes them to swap — freeing that physical RAM for the new allocation.

How Swapping Works

The kernel tracks which memory pages are "hot" (recently accessed) and which are "cold" (not accessed in a while). When under memory pressure, it moves cold anonymous pages (process heap/stack — not file-backed) to swap. If that process later needs those pages, they're read back from swap — causing a page fault, which is slow.

This is the key tradeoff: swap lets your server survive a memory spike, but with degraded performance. An application that's actively swapping will feel sluggish because disk I/O (even NVMe) is orders of magnitude slower than RAM.
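To see which processes actually have pages sitting in swap, one option is to read the VmSwap field from /proc/PID/status. The following is a sketch under that assumption; top_swap_users and its optional proc-root argument are names invented for this example.

```shell
# List up to 10 processes by swap usage in kB, largest first.
# The optional argument is the proc root (defaults to /proc).
top_swap_users() {
    root="${1:-/proc}"
    for status in "$root"/[0-9]*/status; do
        [ -r "$status" ] || continue
        # Processes may exit while we iterate, so ignore read errors.
        awk '
            /^Name:/   { name = $2 }
            /^VmSwap:/ { swap = $2 }
            END { if (swap > 0) printf "%d %s\n", swap, name }
        ' "$status" 2>/dev/null
    done | sort -rn | head -n 10
}

# Usage:
#   top_swap_users
```

A sluggish service that appears near the top of this list is a likely candidate for the swap-induced slowdown described above.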

Swap Partition vs Swap File

Feature	Swap Partition	Swap File
Performance	Marginally faster	Nearly identical on modern kernels
Flexibility	Fixed size, hard to resize	Can be resized, moved, added anytime
Setup complexity	Requires partition planning	A few commands
Recommended for	Bare metal with planned layout	VPS, cloud, containers
Hibernate support	Full	Possible with extra config

For most VPS and cloud servers, a swap file is the right choice. It's just as fast, and you can add or resize it without touching partitions.
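The usual swap file setup can be sketched as a function that only prints the commands, so they can be reviewed before anything touches the disk. The helper name swapfile_commands is hypothetical, and fallocate is an assumption that does not hold on every filesystem (some need the file written with dd instead).

```shell
# Print the commands that create and enable a swap file.
# Nothing is executed here; review the output, then pipe it to `sudo sh`.
swapfile_commands() {
    size="${1:-4G}"
    path="${2:-/swapfile}"
    cat <<EOF
fallocate -l $size $path
chmod 600 $path
mkswap $path
swapon $path
EOF
}

swapfile_commands 4G /swapfile
```

The chmod step matters: a swap file readable by other users can leak memory contents. To keep the swap file across reboots, an /etc/fstab entry of the form `/swapfile none swap sw 0 0` is also needed.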

SSD vs HDD Swap

On an SSD, swap is tolerable for occasional spikes. On an HDD, heavy swapping will make your server feel frozen. If you're on an HDD VPS, swap is still better than an OOM crash, but it's not a performance safety net.

5. Partitioning with fdisk and gdisk

Critical warning: Never partition a disk that is mounted and in active use. For existing server disks, use LVM (covered next) to avoid downtime.
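A quick guard against that mistake is to check whether a disk or any of its partitions has a mount point before opening it in a partitioning tool. This is a minimal sketch that parses `lsblk -rn -o NAME,MOUNTPOINT` output; has_mounts is a made-up helper name.

```shell
# Exit 0 if any row of `lsblk -rn -o NAME,MOUNTPOINT` output on stdin
# has a mount point (including [SWAP]), exit 1 otherwise.
has_mounts() {
    awk 'NF >= 2 { found = 1 } END { exit !found }'
}

# Usage:
#   if lsblk -rn -o NAME,MOUNTPOINT /dev/sdb | has_mounts; then
#       echo "/dev/sdb is in use, do not repartition it" >&2
#   fi
```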

List All Partitions

# List all disks and partitions
fdisk -l

# Or with lsblk (tree view, very readable)
lsblk

# Sample lsblk output:
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sda           8:0    0   50G  0 disk
|--sda1        8:1    0   49G  0 part /
|--sda2        8:2    0    1G  0 part [SWAP]
sdb           8:16   0  500G  0 disk

fdisk

# Open fdisk on a new disk
fdisk /dev/sdb

# Interactive commands:
# n -- New partition
# d -- Delete partition
# p -- Print current partition table
# w -- Write changes and exit
# q -- Quit without saving

Example: Creating a single partition on /dev/sdb:

Command: n
Partition type: p (primary)
Partition number: 1
First sector: [Enter for default]
Last sector: [Enter for full disk]
Command: w

gdisk — GPT Partitioning

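gdisk offers the same style of interactive prompt as fdisk but only writes GPT partition tables. GPT is required for disks larger than 2TB, which MBR cannot address with 512-byte sectors, and is a sensible default for any new installation.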

gdisk /dev/sdb

parted — Modern Alternative

# Create GPT partition table
parted /dev/sdb mklabel gpt

# Create a single partition using 100% of disk
parted /dev/sdb mkpart primary ext4 0% 100%

# List partitions
parted /dev/sdb print

6. Creating and Managing Filesystems

Creating Filesystems

# Create ext4 filesystem
mkfs.ext4 /dev/sdb1

# Create ext4 with a label
mkfs.ext4 -L data_disk /dev/sdb1

# Create XFS filesystem
mkfs.xfs /dev/sdb1

# Create btrfs filesystem
mkfs.btrfs /dev/sdb1

Mounting Filesystems

# Mount a partition
mount /dev/sdb1 /mnt/data

# Mount with options
mount -o noatime,data=writeback /dev/sdb1 /mnt/data

# Unmount
umount /mnt/data

# Lazy unmount (only when nothing else works)
umount -l /mnt/data

Finding UUIDs for fstab

# Get UUID of all block devices
blkid

# Sample output:
/dev/sda1: UUID=a1b2c3d4-e5f6-7890-abcd-ef1234567890 TYPE=ext4
/dev/sdb1: UUID=f1e2d3c4-b5a6-9870-fedc-ba9876543210 TYPE=ext4

# Get UUID of specific device
blkid /dev/sdb1

Persistent Mount via /etc/fstab

# Add to /etc/fstab:
UUID=f1e2d3c4-b5a6-9870-fedc-ba9876543210  /mnt/data  ext4  defaults,noatime  0  2

# Test without rebooting
mount -a

# Verify
df -h /mnt/data
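Typing a UUID into /etc/fstab by hand is error-prone, so it can help to generate the line. The sketch below builds an entry in the same layout as the example above; fstab_line is a hypothetical helper, and the default options are simply the ones used in this guide.

```shell
# Build an /etc/fstab line.
# $1 = UUID, $2 = mount point, $3 = filesystem type, $4 = options (optional)
fstab_line() {
    printf 'UUID=%s  %s  %s  %s  0  2\n' "$1" "$2" "$3" "${4:-defaults,noatime}"
}

# Usage (prints the line; append it to /etc/fstab after checking it):
#   fstab_line "$(blkid -s UUID -o value /dev/sdb1)" /mnt/data ext4
fstab_line f1e2d3c4-b5a6-9870-fedc-ba9876543210 /mnt/data ext4
```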

Filesystem Information and Health

# ext4 filesystem info
tune2fs -l /dev/sda1

# XFS filesystem info (must be mounted)
xfs_info /mnt/data

# Check and repair ext4 (unmounted only!)
fsck.ext4 -f /dev/sdb1

# Check XFS (unmounted)
xfs_repair /dev/sdb1

9. Memory Optimization Tips

Right-Size Your Services

The biggest wins come from tuning application memory usage, not just adding RAM:

PHP-FPM pm.max_children — Each worker uses ~30–100MB. With 2GB allocated to PHP and workers sized for the 100MB worst case, that is 2048 / 100 ≈ 20 children, not 100.
MySQL innodb_buffer_pool_size — Should be 50–70% of dedicated RAM. On a 4GB server, set it to 2GB.
PostgreSQL shared_buffers — 25% of RAM as a starting point. effective_cache_size should be 75% of RAM.
Redis maxmemory — Always set this. Without it, Redis will grow until the OOM Killer hits it.
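The PHP-FPM sizing rule is plain division of a memory budget by a per-worker footprint. As a sketch (php_fpm_max_children is an invented helper, and both inputs are in MB):

```shell
# Estimate pm.max_children from a memory budget and a per-worker size (MB).
php_fpm_max_children() {
    budget_mb="$1"
    worker_mb="$2"
    if [ "$worker_mb" -le 0 ]; then
        echo "worker size must be positive" >&2
        return 1
    fi
    echo $(( budget_mb / worker_mb ))
}

php_fpm_max_children 2048 100   # worst-case worker size: prints 20
php_fpm_max_children 2048 30    # best-case worker size: prints 68
```

Sizing for the worst case is the conservative choice, since a pool sized for 30MB workers can overshoot its budget as soon as workers grow.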

Use cgroups for Hard Limits

Don't rely on application-level limits alone. cgroups v2 memory.max is enforced by the kernel — no application can override it.
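To check how close a service is to its kernel-enforced limit, the cgroup v2 files memory.current and memory.max can be read directly. The sketch below assumes a cgroup v2 directory is passed in; cgroup_mem_usage is a hypothetical helper, and the nginx path in the usage comment is only an example.

```shell
# Report memory usage of a cgroup v2 directory against its memory.max.
cgroup_mem_usage() {
    dir="$1"
    current=$(cat "$dir/memory.current")
    max=$(cat "$dir/memory.max")
    if [ "$max" = "max" ]; then
        # The literal string "max" means no limit is set.
        echo "$current bytes used, no limit"
    else
        echo "$current / $max bytes ($(( current * 100 / max ))%)"
    fi
}

# Usage (path depends on the unit name and systemd layout):
#   cgroup_mem_usage /sys/fs/cgroup/system.slice/nginx.service
# On systemd hosts a limit can be set with, for example:
#   sudo systemctl set-property nginx.service MemoryMax=1G
```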

Enable Swap Even on High-RAM Servers

Even on a 64GB server, a small 2–4GB swap serves as a safety valve. Without it, anonymous memory has nowhere to go, so the OOM Killer activates as soon as a spike exceeds what the kernel can reclaim from cache. With it, the kernel can move cold anonymous pages to swap first, giving you time to react.

jemalloc and tcmalloc

MySQL, Redis, and some other applications can use alternative memory allocators that reduce fragmentation and improve performance under load:

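# Install jemalloc (the -dev package pulls in the runtime library)
sudo apt install libjemalloc-dev

# MySQL: malloc-lib is read by mysqld_safe, so add it under [mysqld_safe]
# in /etc/mysql/mysql.conf.d/mysqld.cnf:
# malloc-lib=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2
# When mysqld is started directly by systemd, set LD_PRELOAD in a unit override instead.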

Transparent Hugepages and Redis

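Transparent Huge Pages (THP) are enabled by default on many Linux distributions. They can help workloads with large, contiguous memory access patterns, but they hurt Redis: after a fork for RDB or AOF persistence, copy-on-write operates on 2MB pages instead of 4KB ones, which causes latency spikes and extra memory use during writes.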

# Disable THP for Redis (temporary)
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled

# Make permanent — add to /etc/rc.local or a systemd unit
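Before and after changing the setting, it is worth confirming what the kernel is using. The sysfs file lists all modes with the active one in square brackets; the sketch below extracts it. thp_mode is a hypothetical helper name.

```shell
# Print the active THP mode (the bracketed word in the sysfs file).
# Reads the live setting by default; pass another file to inspect saved data.
thp_mode() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' \
        "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

# Only query the live system where the sysfs file exists.
if [ -r /sys/kernel/mm/transparent_hugepage/enabled ]; then
    thp_mode
fi
```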

PostgreSQL and Hugepages

For PostgreSQL on dedicated database servers, enabling hugepages reduces page table overhead:

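# Calculate hugepages needed:
# shared_buffers / 2MB (default hugepage size on x86_64)
# e.g., 8GB shared_buffers = 4096 hugepages
# PostgreSQL's shared memory segment is somewhat larger than shared_buffers,
# so treat this as a minimum and allow some headroom.

# Set in /etc/sysctl.conf:
vm.nr_hugepages = 4096

# Enable in postgresql.conf (with "on", PostgreSQL refuses to start
# if too few hugepages are available; raise vm.nr_hugepages if that happens):
# huge_pages = on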
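The hugepage count is a ceiling division of the shared memory size by the hugepage size. A minimal sketch, with hugepages_needed as an invented helper and 2048 kB assumed as the default hugepage size:

```shell
# Number of hugepages needed to cover a memory size.
# $1 = size in MB, $2 = hugepage size in kB (defaults to 2048, i.e. 2MB)
hugepages_needed() {
    size_mb="$1"
    hugepage_kb="${2:-2048}"
    # Round up so a partial page still gets a full hugepage.
    echo $(( (size_mb * 1024 + hugepage_kb - 1) / hugepage_kb ))
}

hugepages_needed 8192    # 8GB of shared_buffers: prints 4096

# The actual hugepage size on a host can be read with:
#   awk '/^Hugepagesize:/ { print $2 }' /proc/meminfo
```

Passing a size slightly above shared_buffers gives a count that also covers the rest of PostgreSQL's shared memory.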

10. Monitoring Memory

The right monitoring setup catches memory issues before they become outages.

Quick Command Reference

# Snapshot
free -h

# 5-second trend
vmstat 1 5

# Per-process with accurate metrics
smem -tk -s uss

# Full kernel metrics
cat /proc/meminfo

# Swap devices and current usage
swapon --show

Alert Thresholds

Metric	Warning	Critical	Action
MemAvailable	< 20% of total	< 10% of total	Identify and limit high-memory processes
Swap used	> 20%	> 60%	Memory pressure — check vmstat si/so
vmstat si/so	Occasional spikes	Persistent > 10 MB/s	Active swapping — needs immediate attention
OOM kill events	Any occurrence	Repeated events	Investigate immediately
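As an illustration of how one row of the table could be checked from a cron job or a monitoring hook, the sketch below applies the swap thresholds (warning above 20% used, critical above 60%) to /proc/meminfo. swap_alert_level and the NOSWAP label are made up for this example.

```shell
# Classify swap usage using the thresholds from the table above.
# Reads /proc/meminfo by default; pass another file to check saved data.
swap_alert_level() {
    awk '
        /^SwapTotal:/ { total = $2 }
        /^SwapFree:/  { free = $2 }
        END {
            if (total == 0) { print "NOSWAP"; exit }
            used_pct = (total - free) * 100 / total
            if (used_pct > 60)      print "CRITICAL"
            else if (used_pct > 20) print "WARNING"
            else                    print "OK"
        }
    ' "${1:-/proc/meminfo}"
}

# Only run against the live system where /proc/meminfo exists.
if [ -r /proc/meminfo ]; then
    swap_alert_level
fi
```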

Prometheus + Grafana

For production systems, point-in-time metrics aren't enough. You need historical trends. The standard stack:

node_exporter — Exposes Linux memory metrics over HTTP for Prometheus to scrape
Prometheus — Scrapes and stores the metrics
Grafana — Visualizes trends with alerting

Key Prometheus metrics for memory:

node_memory_MemAvailable_bytes — The one to alert on
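node_memory_SwapFree_bytes / node_memory_SwapTotal_bytes — Fraction of swap still free; subtract from 1 for the usage ratio
node_vmstat_pswpin / node_vmstat_pswpout — Cumulative counters of pages swapped in and out; wrap them in rate() to get swap I/O per second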

Quick Reference Cheat Sheet

Essential Commands

Task	Command
Memory overview	`free -h`
Full memory details	`cat /proc/meminfo`
Real-time I/O + swap activity	`vmstat 1`
Per-process accurate usage	`smem -tk -s uss`
Top by memory	`top` then Shift+M
Check OOM kills	`dmesg \| grep -i oom`
Check OOM score	`cat /proc/PID/oom_score`
Protect process from OOM	`echo -1000 > /proc/PID/oom_score_adj`
Check swappiness	`cat /proc/sys/vm/swappiness`
Set swappiness (temp)	`sysctl vm.swappiness=10`
Create 4GB swap file	`fallocate -l 4G /swapfile && chmod 600 /swapfile && mkswap /swapfile`
Enable swap	`swapon /swapfile`
List active swap	`swapon --show`
Disable swap	`swapoff /swapfile`

Kernel Tuning Cheat Sheet

Parameter	Default	Web Server	Database
vm.swappiness	60	10	1–5
vm.vfs_cache_pressure	100	50	50
vm.dirty_ratio	20	15	10
vm.dirty_background_ratio	10	5	3
vm.min_free_kbytes	Scales with RAM	65536	65536
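The table can be turned into a sysctl drop-in with a small generator. This sketch prints the settings for one profile; sysctl_profile and the 99-memory.conf file name are hypothetical, and the database profile picks 1 from the 1–5 swappiness range as an assumption.

```shell
# Print sysctl settings for the "web" or "db" column of the table above.
sysctl_profile() {
    case "$1" in
        web) swappiness=10 dirty=15 dirty_bg=5 ;;
        db)  swappiness=1  dirty=10 dirty_bg=3 ;;
        *)   echo "usage: sysctl_profile web|db" >&2; return 1 ;;
    esac
    cat <<EOF
vm.swappiness = $swappiness
vm.vfs_cache_pressure = 50
vm.dirty_ratio = $dirty
vm.dirty_background_ratio = $dirty_bg
vm.min_free_kbytes = 65536
EOF
}

sysctl_profile web

# To apply persistently (file name is an example):
#   sysctl_profile web | sudo tee /etc/sysctl.d/99-memory.conf
#   sudo sysctl --system
```

These values are starting points from the table, not measured optima, so compare swap and writeback behaviour before and after applying them.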

Memory Management in Panelica

Everything described in this guide can be configured and monitored through Panelica's interface without touching the command line:

Per-user cgroup limits — Set memory.max per user from the user panel. Each user's processes run in an isolated cgroups v2 slice. No user can consume more RAM than their quota.
Real-time monitoring — Dashboard shows per-user memory usage, overall server memory trends, and swap usage with historical graphs via Prometheus + Grafana integration.
PHP-FPM per-user pools — Each user gets their own PHP-FPM pool with configurable pm.max_children. Memory leaks in one user's PHP process can't exhaust RAM for everyone else.
Automated alerts — Set thresholds for memory usage and receive webhook, Telegram, Slack, or email notifications when a server or user crosses them.

The 5-layer isolation architecture means that memory management isn't just a system-level concern — it's enforced at the user boundary. This is the difference between shared hosting that crashes under load and shared hosting that keeps running because individual users have hard limits.

Conclusion

Memory management is not glamorous, but it's one of the foundations of a stable server. Understanding what free -h is actually telling you, why Linux uses "all" your RAM for cache, and how to configure swap and OOM behavior correctly will save you from outages that catch most admins off guard.

The key takeaways:

Watch available, not free — high memory usage in free -h is usually cache, not a problem
Enable swap on every server — even 2GB on a high-RAM machine is a useful safety net
Lower swappiness on servers (10–20 for web, 1–5 for databases)
Protect critical processes from the OOM Killer via oom_score_adj
Use cgroups v2 limits to isolate users and services from each other
Monitor trends, not just snapshots — a gradual memory leak looks fine until it doesn't

A server that handles memory pressure gracefully — using swap as a buffer, throttling at soft limits, protecting critical processes — is a server you can trust at 3 AM.