Troubleshooting 502 Bad Gateway: Every Possible Cause and How to Fix It

March 31, 2026

You refresh the page. White screen. 502 Bad Gateway. Your heart rate goes up. A client is calling. The site was fine an hour ago.

502 is one of the most common server errors web administrators encounter, and also one of the most misunderstood. Unlike 404 (not found) or 403 (forbidden), a 502 doesn't point you directly at a file or permission. It means the proxy couldn't get a valid response from upstream: the break is somewhere in the chain between Nginx and your backend.

This guide covers every possible cause — from the obvious (PHP-FPM is down) to the obscure (DNS caching at Nginx startup, SELinux blocking socket access, malformed response headers). We'll go through each one systematically, with the exact commands to diagnose and fix it.

What Does 502 Actually Mean?

The HTTP 502 status code means: "I am a proxy server, and the server I was trying to reach gave me an invalid response (or no response at all)."

In the most common web setup, Nginx acts as a reverse proxy. It receives a request from the user, forwards it to a backend (PHP-FPM, Node.js, Python/Gunicorn, etc.), and returns the response. A 502 means step two failed — Nginx couldn't get a usable response from the backend.

Key insight: 502 is always a server-side problem. The client did nothing wrong. You need to fix something on the server.

502 vs 503 vs 504 — What's the Difference?

Code | Meaning | Common Cause | Who's at Fault
502 Bad Gateway | Proxy got invalid/no response from upstream | Backend crashed, wrong socket, misconfiguration | Backend / Config
503 Service Unavailable | Server temporarily overloaded or in maintenance | All workers busy, maintenance mode, rate limiting | Capacity / Load
504 Gateway Timeout | Proxy waited too long for upstream response | Slow database query, heavy computation, network latency | Performance / Timeout

In practice, 502 and 504 are closely related — a very slow backend can trigger a 504, and a crashed backend causes 502. But the distinction helps narrow down the problem.

The Architecture: Understanding the Chain

Before debugging, you need to understand what's actually happening when a request comes in:

User → [CDN/Cloudflare] → Nginx (reverse proxy) → Backend (PHP-FPM / Node.js / Python)
                                                           ↓
                                                    Database (MySQL / PostgreSQL)
                                                           ↓
                                                    Filesystem (files, sessions, uploads)

A 502 means the connection broke at the Nginx → Backend arrow. Everything to the left of that arrow was fine. Everything to the right of it is where your problem lives.

The backend itself might have crashed, be too slow, be misconfigured, or be blocked by a firewall or security module. Your job is to walk that chain and find the break.

Cause #1: PHP-FPM Is Not Running

This is the most common cause for PHP-based sites. PHP-FPM crashed or was never started, and Nginx can't reach it.

Symptoms in Nginx error log:

connect() to unix:/var/run/php/php8.4-fpm.sock failed (2: No such file or directory)

Diagnose:

# Check FPM status
systemctl status php8.4-fpm

# Or if using a custom init system
/opt/panelica/bin/pn-service status php84

Fix:

systemctl restart php8.4-fpm

But don't just restart and walk away. Check why it stopped:

journalctl -u php8.4-fpm -n 50 --no-pager

Common reasons FPM stops:

  • OOM killer terminated it (check dmesg | grep -i oom)
  • Config file error after a change (check php-fpm8.4 -t)
  • Corrupt pool configuration
  • All workers crashed due to a fatal PHP error in a shared script

Cause #2: PHP-FPM Socket or Port Mismatch

PHP-FPM is running, but Nginx is pointing at the wrong socket path or port. This is a configuration mismatch — one of the most frustrating 502 causes because both services are healthy, they just can't find each other.

Symptoms in Nginx error log:

connect() to unix:/run/php/php8.4-fpm.sock failed (2: No such file or directory)
# or
connect() failed (111: Connection refused) while connecting to upstream

Check your Nginx config:

# -R follows the symlinks that sites-enabled/ normally contains (-r would skip them)
grep -R "fastcgi_pass\|proxy_pass" /etc/nginx/sites-enabled/

Check your PHP-FPM pool config:

grep "^listen" /etc/php/8.4/fpm/pool.d/www.conf

These two values must match exactly. Common mismatches:

Nginx Config | PHP-FPM Listen | Status
unix:/var/run/php/php8.4-fpm.sock | /var/run/php/php8.4-fpm.sock | Correct
unix:/run/php/php8.4-fpm.sock | /var/run/php/php8.4-fpm.sock | Same socket where /var/run is a symlink to /run (most modern distros); mismatch otherwise
127.0.0.1:9000 | 9001 | Port mismatch
unix:/run/php/php8.4-fpm.sock | 9000 (TCP mode) | Socket vs TCP mismatch

Check socket file permissions:

ls -la /var/run/php/php8.4-fpm.sock

Nginx must be able to read/write the socket. PHP-FPM pool settings control this:

; In your pool.d/www.conf
listen.owner = www-data
listen.group = www-data
listen.mode = 0660

The Nginx worker process user must match listen.owner or be in listen.group.
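
If you compare these values often, the check can be scripted. The following is a minimal sketch rather than a finished tool: the function names normalize_endpoint and endpoints_match are made up for illustration, and it assumes /var/run is a symlink to /run, as on most current distributions.

```shell
#!/bin/sh
# Normalize an Nginx fastcgi_pass value or a PHP-FPM listen value so the
# two can be compared as plain strings.
# Assumption: /var/run is a symlink to /run (true on most current distros).
normalize_endpoint() {
    printf '%s\n' "$1" | sed -e 's|^unix:||' -e 's|^/var/run/|/run/|' -e 's|;$||'
}

# Succeeds (exit 0) when both values point at the same socket or host:port.
endpoints_match() {
    [ "$(normalize_endpoint "$1")" = "$(normalize_endpoint "$2")" ]
}

# Example with literal values; in practice paste them from the two grep outputs.
if endpoints_match "unix:/var/run/php/php8.4-fpm.sock" "/run/php/php8.4-fpm.sock"; then
    echo "match"
else
    echo "mismatch"
fi
```

The sketch does not expand a bare port such as listen = 9000 into 127.0.0.1:9000, so TCP pools written that way still need a manual look.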

Cause #3: Upstream Timeout

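The backend is running but taking too long to respond. When one of Nginx's own timeouts fires, the client gets a 504. When the backend's timeout fires first (for example, PHP-FPM's request_terminate_timeout kills the worker), Nginx sees the connection close and returns 502.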

Symptoms in Nginx error log:

upstream timed out (110: Connection timed out) while reading response header from upstream

Diagnose — is the backend actually slow?

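# Test the backend directly (bypass Nginx). Adjust the port to your app.
# This works for HTTP backends (Node.js, Gunicorn, etc.) only:
# PHP-FPM speaks FastCGI, not HTTP, so curl cannot query it.
curl -s -o /dev/null -w "%{time_total}\n" http://127.0.0.1:9000/slow-endpoint

# Check slow query log (MySQL)
tail -f /var/log/mysql/slow-query.log

# List PHP-FPM processes with elapsed time, CPU and memory share
ps -eo pid,etime,pcpu,pmem,args | grep '[p]hp-fpm' | head -20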

Temporary fix — increase timeouts:

server {
    # For PHP-FPM backends
    fastcgi_read_timeout 300;
    fastcgi_connect_timeout 60;
    fastcgi_send_timeout 300;

    # For proxy_pass backends (Node.js, Python, etc.)
    proxy_read_timeout 300;
    proxy_connect_timeout 60;
    proxy_send_timeout 300;
}

Increasing timeouts buys you time, but the real fix is optimizing the slow backend code or database query causing the bottleneck.

Cause #4: PHP Fatal Error or Script Crash

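A PHP script crashes hard enough to take the PHP-FPM worker process down mid-request, and Nginx sees a broken connection. An ordinary fatal error that PHP handles itself usually surfaces as a 500; it becomes a 502 when the worker dies (segfault, killed by request_terminate_timeout or the OOM killer) or when the response that reaches Nginx is malformed.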

Check the PHP-FPM error log:

tail -f /var/log/php8.4-fpm.log

# Or the specific pool log if configured
tail -f /var/log/php8.4-fpm-www.log

Errors to look for when chasing a 502:

  • "Allowed memory size of X bytes exhausted": PHP hit the memory_limit
  • "Maximum execution time of X seconds exceeded": PHP's own max_execution_time limit, separate from Nginx's timeouts
  • "Fatal error: Call to undefined function": missing extension or autoload failure
  • "Segmentation fault": PHP extension bug; rare, but it kills the worker outright
  • "Class not found": Composer autoload cache stale or corrupt

Fix memory issues in php.ini or pool config:

; php.ini
memory_limit = 256M
max_execution_time = 60

; Or per-pool in www.conf
php_admin_value[memory_limit] = 512M

After changing, reload FPM:

systemctl reload php8.4-fpm

Cause #5: Out of Memory — OOM Killer

The Linux kernel's OOM (Out of Memory) killer terminated PHP-FPM or your Node.js process because the server ran out of RAM. The process is gone, Nginx gets 502.

Diagnose:

# Check kernel messages for OOM events
dmesg | grep -i "oom\|killed process" | tail -20

# Check system memory right now
free -h

# Check if swap exists and is being used
swapon --show
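
To see which processes the OOM killer keeps picking, you can tally the victims by name. This is a rough sketch: oom_victims is a hypothetical helper name, and it assumes the common kernel message shape "Killed process PID (name)", which can vary between kernel versions.

```shell
#!/bin/sh
# Count OOM-killer victims per process name from kernel log text on stdin.
# Assumes the common kernel message shape:
#   "Out of memory: Killed process 1234 (php-fpm8.4) total-vm:..."
oom_victims() {
    sed -n 's/.*Killed process [0-9][0-9]* (\([^)]*\)).*/\1/p' \
        | sort | uniq -c | sort -rn
}

# Usage on a live system (usually needs root):
#   dmesg | oom_victims
#   journalctl -k --no-pager | oom_victims
```

If one name dominates the list, that service is the first candidate for the structural fixes, not just a restart.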

Immediate fix:

# Add 2GB swap file (if no swap exists)
fallocate -l 2G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile

# Make persistent
echo '/swapfile none swap sw 0 0' >> /etc/fstab

Structural fixes:

  • Reduce pm.max_children in PHP-FPM pools — fewer workers = less RAM
  • Set pm = ondemand or pm = dynamic instead of static
  • Add memory monitoring alerts
  • Upgrade RAM if the server is legitimately undersized for the load

Calculate correct pm.max_children:

# Available RAM for PHP (in MB)
AVAILABLE_RAM=512

# Average PHP process memory usage
AVG_PROCESS=30

# Max children = available RAM / avg process memory
echo "$((AVAILABLE_RAM / AVG_PROCESS))"
# Result: 17 — don't exceed this number
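
Rather than guessing the average process size, you can measure it. The sketch below is one way to do that under stated assumptions: suggest_max_children is a hypothetical helper, the 512 MB budget is just the figure from the example above, and the process name php-fpm8.4 must be adjusted to your PHP version.

```shell
#!/bin/sh
# Print a suggested pm.max_children from a RAM budget and per-worker RSS samples.
#   $1    = RAM budget for PHP in MB (what is left after the database, Nginx, OS)
#   stdin = one resident-set size in KB per line, as printed by `ps -o rss=`
suggest_max_children() {
    awk -v budget_mb="$1" '
        $1 > 0 { sum += $1; n++ }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            avg_mb = sum / n / 1024
            printf "%d\n", budget_mb / avg_mb   # truncates toward zero
        }'
}

# Usage on a live system (process name is an assumption; adjust to your version):
#   ps -C php-fpm8.4 -o rss= | suggest_max_children 512
```

RSS counts shared memory once per process, so the result tends to err on the conservative side. Measure under real traffic; idle workers are smaller than busy ones.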

Cause #6: PHP-FPM Worker Pool Exhausted

PHP-FPM is running and configured correctly, but all available workers are busy handling existing requests. New requests queue up. Once the listen backlog is full, Nginx can no longer connect and returns 502; requests that wait in the queue past Nginx's timeout end as 504.

Symptoms in Nginx error log:

no live upstreams while connecting to upstream
# or
connect() to unix:/run/php/php8.4-fpm.sock failed (11: Resource temporarily unavailable)

Diagnose using PHP-FPM status page:

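# Enable status in your pool config:
pm.status_path = /status

# Query it through Nginx (requires a location that passes /status to the pool).
# Quote the URL so the shell does not treat ? as a glob character.
curl -s "http://127.0.0.1/status?full"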

Key metrics to watch:

  • active processes close to max children — worker pool near capacity
  • listen queue above 0 — requests waiting for a free worker
  • max active processes ever reached max children — it has happened before
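
These checks can be automated against the plain-text status output. A minimal sketch, assuming the status page is reachable and uses the usual "name: value" lines; fpm_pool_check is a hypothetical helper name.

```shell
#!/bin/sh
# Read the plain-text PHP-FPM status page on stdin and report saturation signs.
# Looks at two fields of the status output:
#   "listen queue"          requests currently waiting for a worker
#   "max children reached"  how often the pool hit pm.max_children
fpm_pool_check() {
    awk -F': *' '
        $1 == "listen queue"         { queue = $2 + 0 }
        $1 == "max children reached" { reached = $2 + 0 }
        END {
            if (queue > 0)   print "WARN: " queue " request(s) waiting for a free worker"
            if (reached > 0) print "WARN: pm.max_children was hit " reached " time(s)"
            if (queue == 0 && reached == 0) print "OK: no sign of pool exhaustion"
        }'
}

# Usage, assuming /status is routed to the pool by an Nginx location:
#   curl -s "http://127.0.0.1/status" | fpm_pool_check
```

Run it from cron or your monitoring agent and you get a warning while requests are still queueing, before the backlog overflows.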

Fix — tune the pool:

; www.conf
pm = dynamic
pm.max_children = 25        ; Increase if RAM allows
pm.start_servers = 5
pm.min_spare_servers = 3
pm.max_spare_servers = 10
pm.max_requests = 500       ; Restart workers after N requests (prevents memory leaks)

Cause #7: Node.js or Python Backend Crash

Your application (Node.js, Python/Flask/Django/FastAPI, Ruby on Rails, Go app) crashed due to an unhandled exception. The port it was listening on is now empty, so Nginx gets connection refused.

Diagnose:

# Check if the port is actually listening
ss -tulnp | grep :3000

# Check application logs
journalctl -u myapp -n 100 --no-pager

# Check PM2 if used
pm2 logs --lines 50

Fix — ensure auto-restart:

Using PM2 (Node.js):

pm2 start app.js --name myapp
pm2 startup  # Enable auto-start on boot
pm2 save

Using systemd:

[Service]
ExecStart=/usr/bin/node /var/www/myapp/app.js
Restart=on-failure
RestartSec=5s
StandardOutput=journal
StandardError=journal

Using Docker:

docker run --restart=unless-stopped myapp

Cause #8: Nginx Proxy Pointing to Wrong Port

Your application is listening on port 3000, but your Nginx config says proxy_pass http://127.0.0.1:8080. Simple misconfiguration, easy to miss after deployments.

Diagnose:

# What's actually listening on what port?
ss -tulnp | grep LISTEN

# What does Nginx think?
# -R follows the symlinks in sites-enabled/ (-r would skip them)
grep -R "proxy_pass" /etc/nginx/sites-enabled/

Fix: Align the port numbers. Either change the application's listening port or update the Nginx proxy_pass directive — then reload Nginx:

nginx -t && systemctl reload nginx
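
Comparing the two lists by eye gets tedious with many vhosts. Here is a sketch that cross-checks them, with both helper names (proxy_targets, port_is_listening) invented for this example. It only understands the literal proxy_pass http://host:port form; upstream names and variables are skipped.

```shell
#!/bin/sh
# Print every host:port that proxy_pass directives on stdin point at.
# Handles the plain "proxy_pass http://host:port" form only; upstream
# names and variables are skipped because they carry no literal port.
proxy_targets() {
    grep -oE 'proxy_pass[[:space:]]+https?://[^/;[:space:]]+:[0-9]+' \
        | sed -E 's|.*://||' | sort -u
}

# Succeeds when something listens on the given port.
# Reads `ss -tln` output on stdin so it can be tried without a live server.
port_is_listening() {
    awk -v port="$1" '$1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
                      END { exit !found }'
}

# Usage on a live system:
#   for t in $(nginx -T 2>/dev/null | proxy_targets); do
#       ss -tln | port_is_listening "${t##*:}" || echo "nothing listening for $t"
#   done
```

The port check ignores the address, so a service bound to a different interface than the one Nginx targets still passes. Treat a clean result as a hint, not proof.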

Cause #9: DNS Resolution Failure Inside Nginx

This is one of the more obscure causes. If your Nginx config uses a hostname in proxy_pass (e.g., proxy_pass http://backend.internal;), Nginx resolves that name once, when the configuration is loaded. If the name can't be resolved at that moment, Nginx refuses to start with "host not found in upstream". If the backend's IP changes later, Nginx keeps sending traffic to the stale address until it is reloaded.

Symptoms:

  • 502 immediately after Nginx starts or restarts
  • 502 after a backend service moves to a new IP
  • Works with IP directly, fails with hostname

Fix — force dynamic DNS resolution:

server {
    # Add a resolver directive
    resolver 127.0.0.53 valid=30s ipv6=off;

    location / {
        # Use variable to force runtime DNS lookup
        set $backend "http://backend.internal:3000";
        proxy_pass $backend;
    }
}

When you use a variable in proxy_pass, Nginx resolves DNS at request time (using the configured resolver), not at startup.

Cause #10: Firewall Blocking Internal Traffic

A firewall rule is blocking the connection between Nginx and the backend. This sounds unlikely for traffic on the same server (localhost to localhost), but it happens more often than you'd expect — especially with Docker networks, containers with separate network namespaces, or overly aggressive iptables rules.

Diagnose:

# Check iptables rules
iptables -L -n -v | grep -E "DROP|REJECT"

# Check nftables
nft list ruleset | grep -E "drop|reject"

# Test connection directly
curl -v http://127.0.0.1:3000/health

# Test from inside a Docker network
docker exec nginx-container curl http://app-container:3000/health

Fix — allow the connection:

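# Allow localhost traffic (iptables). -I inserts at the top of the chain;
# -A appends to the end, where an earlier DROP/REJECT rule would still win.
iptables -I INPUT 1 -i lo -j ACCEPT
iptables -I OUTPUT 1 -o lo -j ACCEPT

# For Docker networks, allow the bridge
iptables -I FORWARD 1 -i docker0 -o docker0 -j ACCEPT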

Cause #11: SELinux or AppArmor Blocking Socket Access

Linux security modules (SELinux on RHEL/CentOS/Fedora, AppArmor on Ubuntu/Debian) may prevent Nginx from connecting to PHP-FPM sockets or network ports, even if the file permissions are correct.

Diagnose SELinux:

# Check for recent AVC denials
ausearch -m AVC -ts recent | grep nginx

# Quick check
getenforce    # Prints Enforcing, Permissive, or Disabled
setenforce 0  # Temporarily switch to permissive mode to test if SELinux is the cause
# Switch back afterwards with: setenforce 1

If the 502 goes away after setenforce 0, SELinux is the culprit.

Fix SELinux properly:

# Allow Nginx to connect to network (for proxy_pass)
setsebool -P httpd_can_network_connect 1

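# Allow Nginx to act as a relay (reverse proxy) to other servers
setsebool -P httpd_can_network_relay 1

# Unix sockets are governed by file labels rather than these booleans.
# Inspect the socket's label with ls -Z, then use the custom policy below.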

# Generate a custom policy from the AVC denials
ausearch -m AVC -ts recent | audit2allow -M nginx-custom
semodule -i nginx-custom.pp

Diagnose AppArmor:

# Check denied operations
grep "DENIED" /var/log/syslog | grep nginx | tail -20

# Check AppArmor status
aa-status

Fix AppArmor — add socket permission:

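# Edit the Nginx AppArmor profile:
#   /etc/apparmor.d/usr.sbin.nginx
# Add this rule inside the profile to allow the socket:
#   /run/php/php8.4-fpm.sock rw,
# Then reload the profile
apparmor_parser -r /etc/apparmor.d/usr.sbin.nginx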

Cause #12: Disk Full

A full disk causes 502s in ways that are not immediately obvious. PHP can't write session files. Nginx can't write to its proxy temp directory. PHP-FPM can't write error logs and exits. Databases can't write WAL files and crash.

Diagnose:

# Check disk usage
df -h

# Find what's eating the disk
du -sh /* 2>/dev/null | sort -rh | head -20

# Check inode exhaustion (can be full even if bytes are free)
df -i
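
If you want the same check in script form, the helper below prints only the mount points above a threshold. It is a sketch: over_threshold is a hypothetical name, and it relies on the POSIX column layout of df -P, so mount points containing spaces are not handled.

```shell
#!/bin/sh
# Print mount points whose usage exceeds a threshold (percent).
# Reads `df -P` output on stdin; column 5 is the usage, column 6 the mount point.
# "$5 + 0" forces a numeric comparison: comparing the raw "9%" string
# against 80 would sort as text and give wrong answers.
over_threshold() {
    awk -v limit="$1" 'NR > 1 && $5 + 0 > limit { print $6 " " $5 }'
}

# Usage:
#   df -P  | over_threshold 80    # disk space
#   df -Pi | over_threshold 80    # inodes
```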

Common culprits:

  • Log files that weren't rotated (Nginx, PHP-FPM, MySQL, application logs)
  • PHP session files accumulating in /var/lib/php/sessions/
  • Nginx proxy cache or temp files in /var/cache/nginx/
  • Old package cache in /var/cache/apt/
  • Docker images and stopped containers
  • Core dump files

Quick cleanup:

# Clean apt cache
apt-get clean

# Remove old log files (be careful — check before deleting)
find /var/log -name "*.gz" -mtime +30 -delete

# Clean PHP sessions older than 24 hours
find /var/lib/php/sessions -type f -mmin +1440 -delete

# Clean Docker (stopped containers, dangling images)
docker system prune -f

Cause #13: Upstream Sent Too Large a Header

The backend returns response headers that exceed Nginx's buffer size. This often happens with applications that set many cookies, large JWT tokens, or verbose session data in response headers.

Symptoms in Nginx error log:

upstream sent too big header while reading response header from upstream

Fix — increase buffer sizes:

server {
    # For PHP-FPM
    fastcgi_buffer_size 32k;
    fastcgi_buffers 8 16k;
    fastcgi_busy_buffers_size 64k;

    # For proxy_pass backends
    proxy_buffer_size 32k;
    proxy_buffers 4 64k;
    proxy_busy_buffers_size 64k;
}

Reload Nginx after making changes:

nginx -t && systemctl reload nginx
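
Before picking a buffer size, it helps to measure how large the headers really are. A rough sketch for HTTP backends, with header_fits as a made-up helper name and port 3000 as a placeholder. For PHP-FPM the bytes on the wire differ slightly because of FastCGI framing, so treat the number as an estimate.

```shell
#!/bin/sh
# Report the size in bytes of a response header block read from stdin and
# compare it with a buffer size given in KB. Returns non-zero when too big.
header_fits() {
    bytes=$(wc -c | tr -d ' ')
    limit=$(( $1 * 1024 ))
    if [ "$bytes" -le "$limit" ]; then
        echo "ok: ${bytes} bytes fits in ${1}k"
    else
        echo "too big: ${bytes} bytes exceeds ${1}k"
        return 1
    fi
}

# Usage against an HTTP backend (port 3000 is an assumption):
#   curl -s -o /dev/null -D - http://127.0.0.1:3000/ | header_fits 4
```

Test a logged-in request with all its cookies, since that is usually where the headers balloon.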

Cause #14: TLS/SSL Between Nginx and Backend

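Your Nginx config uses proxy_pass https://backend:8443, which means HTTPS to the backend. The handshake can fail because of a protocol or cipher mismatch or missing SNI. With proxy_ssl_verify on (it is off by default), Nginx also rejects a backend certificate that is self-signed, expired, or doesn't match the hostname.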

Symptoms in Nginx error log:

SSL_do_handshake() failed (SSL: error:...) while SSL handshaking to upstream
upstream SSL certificate verify error

Fix option 1 — use HTTP internally (recommended for same-server backends):

# Change proxy_pass to http for internal traffic
proxy_pass http://127.0.0.1:8080;
# TLS termination at the Nginx level is sufficient

Fix option 2 — disable verification for trusted internal backends:

proxy_ssl_verify off;
proxy_ssl_server_name on;

Fix option 3 — configure proper certificate verification:

proxy_ssl_trusted_certificate /path/to/ca.crt;
proxy_ssl_verify on;
proxy_ssl_verify_depth 2;

Cause #15: CDN or Cloudflare Issues

Sometimes the 502 isn't from your Nginx at all — it's from Cloudflare (or another CDN) failing to reach your server, or your Nginx failing in a way that Cloudflare reports as 502.

Distinguish Cloudflare 502 from your server's 502:

  • Cloudflare's 502 — shows Cloudflare branding, ray ID in footer, error page is orange-ish
  • Your server's 502 — your custom error page, or Nginx default white/grey page

Test by bypassing the CDN:

# Access directly via server IP (bypasses Cloudflare)
curl -sk -H "Host: example.com" https://YOUR_SERVER_IP/

# Check if your server is even responding
curl -sk -o /dev/null -w "%{http_code}" https://YOUR_SERVER_IP/

If direct IP access works but Cloudflare shows 502, the issue is Cloudflare→server connectivity (IP blocked, port 443 not open, Cloudflare IPs not whitelisted, SSL mode mismatch).

Common Cloudflare-related 502 causes:

  • SSL mode mismatch — Cloudflare set to "Full (strict)" but your server has a self-signed cert
  • Origin IP blocked by firewall — Cloudflare IP ranges blocked in iptables/nftables
  • Port 443 not open — Server firewall blocks 443
  • Cloudflare status — Check cloudflarestatus.com for incidents
  • Cache rules interfering — Cloudflare caching a 502 response aggressively

Systematic Debugging: The 502 Flowchart

When you see 502, follow this decision tree. Don't skip steps — false assumptions waste time.

502 Error Received
│
├─ Step 1: What does Nginx error log say?
│  tail -f /var/log/nginx/error.log
│  │
│  ├─ "connect() failed (2: No such file)" → socket path wrong or FPM not running
│  ├─ "connect() failed (111: Connection refused)" → backend not listening on that port
│  ├─ "upstream timed out" → backend slow, increase timeout or optimize
│  ├─ "no live upstreams" → all workers busy (pool exhausted)
│  ├─ "recv() failed (104: Connection reset)" → backend crashed mid-request
│  └─ "SSL handshake error" → TLS issue between Nginx and backend
│
├─ Step 2: Is the backend running?
│  systemctl status php8.4-fpm   # or node, gunicorn, etc.
│  ss -tulnp | grep :9000         # Is it actually listening?
│  │
│  ├─ Not running → restart it, check why it stopped
│  └─ Running → continue to Step 3
│
├─ Step 3: Can you reach the backend directly?
│  curl http://127.0.0.1:9000/   # HTTP backends only; PHP-FPM speaks FastCGI, not HTTP
│  │
│  ├─ Works → Nginx config issue (wrong socket/port path)
│  └─ Fails → backend issue (continue)
│
├─ Step 4: Check backend application log
│  tail -f /var/log/php8.4-fpm.log
│  journalctl -u myapp -n 50
│  │
│  ├─ Memory error → increase memory_limit
│  ├─ max_children reached → increase pool size or fix slow requests
│  ├─ Fatal error / exception → fix application bug
│  └─ Nothing → continue
│
├─ Step 5: Check system resources
│  free -h           # RAM available?
│  df -h             # Disk space?
│  dmesg | grep oom  # OOM kills?
│  │
│  ├─ OOM → add swap, reduce workers, upgrade RAM
│  ├─ Disk full → cleanup, rotate logs
│  └─ Resources OK → continue
│
├─ Step 6: Check firewall and security modules
│  iptables -L -n | grep -E "DROP|REJECT"
│  ausearch -m AVC -ts recent   # SELinux
│  grep DENIED /var/log/syslog  # AppArmor
│  │
│  └─ Blocking → update rules or policies
│
└─ Step 7: Check Nginx configuration
   nginx -t                    # Syntax OK?
   nginx -T | grep -A5 "your-domain"   # Full effective config
   └─ Config issue → fix and reload
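
Step 1 of the tree lends itself to a small script. The sketch below maps error-log lines to the likely causes listed above; classify_502 is a hypothetical name, and the "(13: Permission denied)" pattern is an addition not shown in the flowchart, covering the socket-permission and security-module cases from Causes #2 and #11.

```shell
#!/bin/sh
# Map each Nginx error-log line on stdin to the likely cause from the flowchart.
classify_502() {
    while IFS= read -r line; do
        case "$line" in
            *"(2: No such file or directory)"*)
                echo "socket path wrong or FPM not running" ;;
            *"(13: Permission denied)"*)
                echo "socket permissions or SELinux/AppArmor" ;;
            *"(111: Connection refused)"*)
                echo "backend not listening on that port" ;;
            *"(11: Resource temporarily unavailable)"*)
                echo "listen backlog full, pool exhausted" ;;
            *"upstream timed out"*)
                echo "backend too slow" ;;
            *"no live upstreams"*)
                echo "no upstream available (pool exhausted or servers marked down)" ;;
            *"(104: Connection reset by peer)"*)
                echo "backend crashed mid-request" ;;
            *"upstream sent too big header"*)
                echo "response header exceeds buffer size" ;;
            *"SSL_do_handshake() failed"*)
                echo "TLS problem between Nginx and backend" ;;
            *)
                echo "unclassified" ;;
        esac
    done
}

# Usage:
#   tail -n 50 /var/log/nginx/error.log | classify_502 | sort | uniq -c | sort -rn
```

The tally shows at a glance whether you are dealing with one cause or several at once.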

Quick Diagnosis Command Reference

What to Check                                 Command
Nginx error log (live)                        tail -f /var/log/nginx/error.log
Nginx access log                              tail -f /var/log/nginx/access.log
PHP-FPM status                                systemctl status php8.4-fpm
PHP-FPM log (live)                            tail -f /var/log/php8.4-fpm.log
PHP-FPM config test                           php-fpm8.4 -t
Open ports                                    ss -tulnp | grep LISTEN
Nginx config test                             nginx -t
OOM events                                    dmesg | grep -i oom
Disk usage                                    df -h && df -i
RAM usage                                     free -h
SELinux denials                               ausearch -m AVC -ts recent
AppArmor denials                              grep DENIED /var/log/syslog | grep nginx
Firewall rules                                iptables -L -n -v
PHP-FPM process count (master + workers)      pgrep -c php-fpm
Test backend directly (HTTP backends only)    curl -v http://127.0.0.1:9000/
Check socket existence                        ls -la /var/run/php/php8.4-fpm.sock

Prevention: Stop 502 Errors Before They Happen

The best 502 is the one you never see. Here's a checklist for making your setup resilient:

Configure Auto-Restart

# In your systemd service file
[Service]
Restart=on-failure
RestartSec=5s

Set Proper Resource Limits

; php-fpm pool.conf
pm = dynamic
pm.max_children = 20        ; Based on available RAM
pm.max_requests = 500       ; Prevent memory leaks
php_admin_value[memory_limit] = 256M

Enable PHP-FPM Status Monitoring

; pool.conf
pm.status_path = /fpm-status

# Nginx location block
location /fpm-status {
    allow 127.0.0.1;
    deny all;
    fastcgi_pass unix:/run/php/php8.4-fpm.sock;
    include fastcgi_params;
}

Set Up Log Rotation

# /etc/logrotate.d/nginx
/var/log/nginx/*.log {
    daily
    missingok
    rotate 14
    compress
    delaycompress
    notifempty
    sharedscripts
    postrotate
        nginx -s reopen
    endscript
}

Monitor Disk Space Proactively

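# Simple cron alert when disk exceeds 80% (/etc/cron.d format; replace the address with your own)
# "$5+0" turns "85%" into the number 85 so the comparison is numeric, and NR > 1 skips the header
*/30 * * * * root df -P | awk 'NR > 1 && $5+0 > 80' | mail -s "Disk Alert" admin@example.com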

Add Swap

Even on servers with sufficient RAM, swap acts as a safety net against sudden OOM kills. A 2GB swap file costs almost nothing and can prevent a 3 AM wake-up call.

How Panelica Handles This Automatically

Everything described in this guide is something you have to set up, monitor, and respond to manually on a traditional server. With Panelica, most of it is handled automatically:

  • Per-user cgroups (v2) — Each user's PHP-FPM workers have memory limits. One user's out-of-control script can't OOM-kill another user's PHP-FPM pool.
  • Isolated PHP-FPM pools — Every user gets their own per-version PHP-FPM pool. A crash in one pool doesn't affect other users.
  • Real-time monitoring — Panelica tracks PHP-FPM worker counts, memory usage, and process status. You see the numbers before they become 502s.
  • Automatic service restart — Managed services are configured with systemd Restart=on-failure. If a service crashes, it comes back automatically.
  • Security Advisor — 50+ server checks, including disk usage warnings, before they cause problems.
  • Log access in the panel — Nginx error logs, PHP-FPM logs, and application logs are accessible from the web interface. No SSH required to start debugging.

A 502 at 3 AM is still possible. But the distance between "something went wrong" and "I know exactly what it is" shrinks significantly when your infrastructure is properly instrumented.

Conclusion

502 Bad Gateway errors are always fixable. There are only so many things that can go wrong between Nginx and your backend, and once you know where to look, the diagnosis is usually straightforward:

  1. Read the Nginx error log first — it usually tells you exactly what failed
  2. Check if the backend is running and listening on the right socket/port
  3. Check system resources (RAM, disk) for OOM kills or full disks
  4. Check application logs for crashes or fatal errors
  5. Check security modules and firewalls if everything else looks fine

The most important habit is reading logs before guessing. The error message in /var/log/nginx/error.log almost always points directly at the problem. Start there, follow the chain, and you'll have the site back up before the coffee gets cold.
