Why Every Sysadmin Needs Regex
Regular expressions are the Swiss Army knife of system administration. You use them every day whether you realize it or not — filtering log files, searching configuration, validating input, transforming text. The difference between a sysadmin who knows regex and one who does not is the difference between solving a problem in 10 seconds and spending 30 minutes writing a script.
This guide presents 20 battle-tested regex patterns that cover the most common sysadmin tasks. Each pattern includes a practical example using grep, sed, or awk, so you can use them immediately. We start with a quick refresher on regex fundamentals, then dive straight into the patterns you will use daily.
Regex Fundamentals Quick Reference
| Symbol | Meaning | Example | Matches |
|---|---|---|---|
| ^ | Start of line | ^Error | Lines starting with "Error" |
| $ | End of line | \.conf$ | Lines ending with ".conf" |
| . | Any single character | h.t | hat, hit, hot, h2t |
| * | Zero or more of previous | go*d | gd, god, good, goood |
| + | One or more of previous | go+d | god, good, goood (not gd) |
| ? | Zero or one of previous | colou?r | color, colour |
| [abc] | Character class | [aeiou] | Any lowercase vowel |
| [^abc] | Negated class | [^0-9] | Any non-digit |
| \d | Digit (0-9); PCRE only (grep -P), use [0-9] with grep -E | \d{3} | Three digits |
| \w | Word character (letter, digit, underscore) | \w+ | One or more word chars |
| \s | Whitespace | \s+ | One or more whitespace characters |
| {n,m} | n to m repetitions | [0-9]{2,4} | 2 to 4 digits |
| () | Capture group | (error\|warn) | error or warn |
| \| | Alternation (OR) | cat\|dog | cat or dog |
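Note: plain grep uses basic regular expressions (BRE), where several operators need a backslash, for example \( and \) for groups. Use grep -E (extended regex) to avoid backslash overload; egrep is an older, now-deprecated spelling of the same thing. Unless noted otherwise, the grep examples in this guide use grep -E. The shorthands \w, \s, and \b are GNU extensions, and \d is not available in grep -E at all.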
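The difference between basic and extended syntax is easiest to see side by side. A small sketch, using made-up sample text rather than a real log:

```shell
# Sample input (made up for illustration).
sample='abcbc
abc
aXbc'

# BRE (plain grep): groups and intervals need backslashes.
printf '%s\n' "$sample" | grep 'a\(bc\)\{2\}'

# ERE (grep -E): the same pattern without the backslashes.
printf '%s\n' "$sample" | grep -E 'a(bc){2}'

# \d is not supported by grep -E, so spell the digit class out instead.
printf 'port 8080\n' | grep -Eo '[0-9]+'
```

Both grep commands select only the line abcbc; the last command prints 8080.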
The 20 Essential Patterns
Pattern 1: Match IPv4 Addresses
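This is one of the patterns sysadmins reach for most often. It extracts anything shaped like an IPv4 address from logs, configs, and command output. It does not check that each octet is in the range 0-255, so 999.1.1.1 also matches: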
# Extract all IPs from a log file
grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' /var/log/auth.log
# Count unique IPs
grep -Eo '[0-9]{1,3}(\.[0-9]{1,3}){3}' /var/log/auth.log | sort -u | wc -l
# Find a specific IP (escaped dots)
grep '192\.168\.1\.100' /var/log/syslog
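When an address has to be validated rather than just extracted, each octet needs to be limited to 0-255. One possible sketch follows; the is_ipv4 helper name is made up for this example, and leading zeros such as 01 are rejected by design:

```shell
# Hypothetical helper: succeeds only if $1 is a dotted quad with octets 0-255.
is_ipv4() {
    # 250-255 | 200-249 | 100-199 | 0-99 (no leading zeros)
    octet='(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])'
    # -x forces the pattern to match the whole line, -q suppresses output.
    printf '%s\n' "$1" | grep -Eqx "${octet}(\.${octet}){3}"
}

is_ipv4 192.168.1.100 && echo "192.168.1.100: valid"
is_ipv4 999.1.1.1 || echo "999.1.1.1: rejected"
```

The looser extraction pattern is usually fine for log triage; the strict form is more useful when checking user input or generated config.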
Pattern 2: Email Address Extraction
# Extract emails from a file
grep -Eo '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' contacts.txt
# Find lines containing emails in mail logs
grep -E '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' /var/log/mail.log
Pattern 3: URL Extraction
# Extract all URLs from a web page or log
grep -Eo 'https?://[a-zA-Z0-9./?=_&%-]+' access.log
# Find HTTPS URLs only
grep -Eo 'https://[a-zA-Z0-9./?=_&%-]+' access.log | sort -u
Pattern 4: Log Timestamp Parsing
# Syslog format: Mar 15 14:23:45
grep -E '^[A-Z][a-z]{2}\s+[0-9]{1,2}\s[0-9]{2}:[0-9]{2}:[0-9]{2}' /var/log/syslog
# ISO 8601 format: 2026-03-15T14:23:45
grep -Eo '[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}' app.log
# Apache/Nginx combined format: [15/Mar/2026:14:23:45 +0000]
grep -Eo '\[[0-9]{2}/[A-Z][a-z]{2}/[0-9]{4}:[0-9]{2}:[0-9]{2}:[0-9]{2}' access.log
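Once a timestamp can be matched, it can also be used to select a time window. ISO 8601 timestamps sort as plain strings, so a string comparison is enough. A rough sketch, assuming every line starts with the timestamp and that all lines share one timezone; the between helper is hypothetical:

```shell
# Hypothetical helper: print stdin lines whose leading ISO 8601 timestamp
# lies within [start, end]. Usage: between START END
between() {
    awk -v start="$1" -v end="$2" '
        match($0, /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T[0-9][0-9]:[0-9][0-9]:[0-9][0-9]/) {
            ts = substr($0, RSTART, RLENGTH)
            # String comparison works because the fields are fixed-width.
            if (ts >= start && ts <= end) print
        }'
}

# Made-up sample lines.
printf '%s\n' \
    '2026-03-15T13:59:59 boot' \
    '2026-03-15T14:23:45 disk warning' \
    '2026-03-15T15:00:01 shutdown' |
    between 2026-03-15T14:00:00 2026-03-15T15:00:00
```

The digits are written out instead of using {4} and {2} so the sketch also runs on awk implementations without interval support.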
Pattern 5: Error Log Filtering
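# Find lines mentioning any common severity keyword (case-insensitive)
grep -Ei '(error|warning|critical|fatal|panic)' /var/log/syslog
# Exclude known harmless errors
grep -Ei '(error|critical)' app.log | grep -Ev '(deprecated|notice)'
# Count errors by type, folding case so "ERROR" and "error" are tallied together
grep -Eoi '(error|warning|critical)' app.log | tr 'A-Z' 'a-z' | sort | uniq -c | sort -rn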
Pattern 6: Nginx/Apache Access Log Parsing
# Count requests per client IP and status code (top 20)
awk '{print $1, $9}' access.log | sort | uniq -c | sort -rn | head -20
# Find all 500 errors with URLs
awk '$9 == 500 {print $1, $7}' access.log
# Find requests taking over 5 seconds (if response time is last field)
awk '$NF > 5.0 {print $7, $NF"s"}' access.log
# Top 10 requested URLs
awk '{print $7}' access.log | sort | uniq -c | sort -rn | head -10
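These awk one-liners assume the default combined log format, where whitespace splitting puts the client IP in $1, the request path in $7, and the status code in $9. A minimal sketch with two made-up log lines that tallies responses by status class; the status_classes helper name is hypothetical:

```shell
# Two made-up lines in the combined log format.
sample_log='203.0.113.5 - - [15/Mar/2026:14:23:45 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "curl/8.0"
203.0.113.9 - - [15/Mar/2026:14:23:46 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"'

# Hypothetical helper: count responses per status class (2xx, 4xx, ...).
status_classes() {
    awk '{ cls = substr($9, 1, 1) "xx"; n[cls]++ }
         END { for (c in n) print c, n[c] }' | sort
}

printf '%s\n' "$sample_log" | status_classes
```

If the server uses a custom log_format, the field numbers shift and need to be adjusted to match.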
Pattern 7: Finding Config Values
# List active key=value settings (lines where = appears before any # or ;)
grep -E '^[^#;]*=' /etc/mysql/my.cnf
# Extract value for a specific key
grep -E '^\s*max_connections\s*=' /etc/mysql/my.cnf | sed 's/.*=\s*//'
# Find all listen directives in nginx
grep -rE '^\s*listen\s+' /etc/nginx/
# Find all server_name directives
grep -rE '^\s*server_name\s+' /etc/nginx/sites-enabled/
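The key lookup can be wrapped in a small function for reuse in scripts. This is only a sketch: conf_get is a hypothetical name, it reads key = value lines from stdin, and it assumes the key contains no regex metacharacters:

```shell
# Hypothetical helper: print the value of KEY from "key = value" lines on stdin.
# Commented-out lines do not match because the key must follow leading whitespace.
conf_get() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" |
        sed 's/[[:space:]]*$//'    # strip trailing whitespace from the value
}

# Made-up config fragment.
printf '%s\n' '[mysqld]' '# max_connections = 50' 'max_connections = 200  ' |
    conf_get max_connections
```

Using the POSIX class [[:space:]] instead of \s keeps the sketch independent of GNU sed extensions.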
Pattern 8: MAC Address Matching
# Extract MAC addresses from system output
ip link show | grep -Eo '([0-9a-f]{2}:){5}[0-9a-f]{2}'
# Find MAC addresses in DHCP logs
grep -Eo '([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}' /var/log/syslog
Pattern 9: Domain Name Validation
# Extract domain names from a text file (every label may contain digits and hyphens)
grep -Eo '([a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}' hosts.txt
# Find all virtual host domains in nginx configs
grep -rEo 'server_name\s+[a-zA-Z0-9.-]+' /etc/nginx/ | awk '{print $2}'
Pattern 10: Port Number Extraction
# Extract ports from netstat/ss output
ss -tlnp | grep -Eo ':([0-9]+)' | sort -t: -k2 -n -u
# Find all ports in nginx listen directives (-h drops filenames so digits in them are not matched)
grep -rhEo 'listen\s+([0-9]+)' /etc/nginx/ | grep -Eo '[0-9]+'
Pattern 11: SSH Failed Login Extraction
# Top 20 source IPs by failed password attempts
grep 'Failed password' /var/log/auth.log | \
grep -Eo 'from [0-9.]+' | sort | uniq -c | sort -rn | head -20
# Find brute force attempts (invalid users)
grep -E 'Invalid user \S+ from' /var/log/auth.log | \
grep -Eo 'from [0-9.]+' | sort | uniq -c | sort -rn
# Extract usernames being tried
grep 'Invalid user' /var/log/auth.log | \
grep -Eo 'Invalid user \S+' | awk '{print $3}' | sort | uniq -c | sort -rn
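For alerting, it often helps to report only the sources above some threshold. A possible sketch, fed with made-up sshd lines; the failed_ips helper and its default threshold of 5 are assumptions for illustration:

```shell
# Hypothetical helper: list source IPs with at least MIN failed password attempts.
# Reads sshd log lines on stdin. Usage: failed_ips [MIN]
failed_ips() {
    min="${1:-5}"
    grep 'Failed password' |
        grep -Eo 'from [0-9.]+' |
        awk -v min="$min" '{ n[$2]++ }
            END { for (ip in n) if (n[ip] >= min) print n[ip], ip }' |
        sort -rn
}

# Made-up sample lines.
printf '%s\n' \
    'Mar 15 14:23:45 host sshd[1234]: Failed password for root from 203.0.113.5 port 4242 ssh2' \
    'Mar 15 14:23:46 host sshd[1234]: Failed password for root from 203.0.113.5 port 4243 ssh2' \
    'Mar 15 14:23:47 host sshd[1235]: Failed password for admin from 198.51.100.7 port 5050 ssh2' |
    failed_ips 2
```

On a real system the input would come from the auth log, for example failed_ips 10 < /var/log/auth.log.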
Pattern 12: Disk Usage Parsing
# Show filesystems at 80% usage or higher
df -h | awk '{print $5, $6}' | grep -E '^[89][0-9]%|^100%'
# Extract percentage as number for comparison
df -h | awk 'NR>1 {gsub(/%/,"",$5); if($5+0 > 80) print $5"%", $6}'
# Find large directories (1 GB or more; du -h prints sizes such as 1.5G)
du -sh /var/*/ 2>/dev/null | grep -E '^[0-9.]+[GT]' | sort -rh
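The percentage check is easy to turn into a reusable filter. The sketch below reads df -P output, since the POSIX format keeps each filesystem on one line; the over_threshold name and the default of 80 are assumptions, and the sample input is made up:

```shell
# Hypothetical helper: read `df -P` output on stdin and print mount points
# at or above a usage threshold in percent. Usage: over_threshold [LIMIT]
over_threshold() {
    awk -v limit="${1:-80}" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                       # "95%" -> "95"
        if (use + 0 >= limit + 0) print use "%", $6
    }'
}

# Canned df -P output for demonstration.
printf '%s\n' \
    'Filesystem 1024-blocks Used Available Capacity Mounted on' \
    '/dev/sda1 41152736 37037464 2001408 95% /' \
    '/dev/sdb1 103081248 20616252 77205484 22% /data' |
    over_threshold 90
```

On a live system the equivalent call would be df -P | over_threshold 90.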
Pattern 13: Version Number Matching
# Extract major.minor.patch version numbers
grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' package.json
# Version with optional pre-release
grep -Eo '[0-9]+\.[0-9]+\.[0-9]+(-[a-zA-Z0-9.]+)?' CHANGELOG.md
# Extract the PHP version from php -v output
php -v | head -1 | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+'
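Extracted versions usually end up in a comparison, and a plain string or numeric compare gets 8.2.10 versus 8.2.9 wrong. One way to sketch it is with version sort, assuming GNU sort -V is available; version_ge is a hypothetical helper name:

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
# Relies on sort -V (version sort); assumes plain major.minor.patch strings.
version_ge() {
    # The smaller of the two versions sorts first; $1 >= $2 when that is $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

version_ge 8.2.10 8.2.9 && echo "8.2.10 >= 8.2.9"
version_ge 8.1.0 8.2.0 || echo "8.1.0 < 8.2.0"
```

Pre-release suffixes such as -beta1 are not handled specially here and may sort differently from what semantic versioning expects.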
Pattern 14: CSV Field Parsing
# Print the first and third columns (simple CSV without quoted commas)
awk -F',' '{print $1, $3}' data.csv
# Handle quoted CSV fields with commas inside
grep -Eo '"[^"]*"|[^,]+' data.csv
# Find rows where a field matches a pattern
awk -F',' '$3 ~ /error/ {print $0}' data.csv
Pattern 15: Finding Empty Lines
# Remove completely empty lines
grep -v '^$' config.conf
sed '/^$/d' config.conf
# Remove blank lines (including whitespace-only)
grep -v '^\s*$' config.conf
sed '/^\s*$/d' config.conf
# Remove empty lines and comments
grep -Ev '^\s*(#|;|$)' config.conf
Pattern 16: Matching Quoted Strings
# Double-quoted strings
grep -Eo '"[^"]*"' config.php
# Single-quoted strings
grep -Eo "'[^']*'" config.php
# Both single and double quoted (pattern wrapped in double quotes so only \" needs escaping)
grep -Eo "\"[^\"]*\"|'[^']*'" config.php
Pattern 17: Hex Color Codes
# Extract hex color codes (3 to 6 hex digits after #)
grep -Eo '#[0-9A-Fa-f]{3,6}\b' styles.css
# Find all unique colors in a CSS file
grep -Eoi '#[0-9a-f]{3,6}\b' styles.css | tr 'A-F' 'a-f' | sort -u
# Replace a color throughout a file
sed -i 's/#ff0000/#e74c3c/gi' styles.css
Pattern 18: File Path Matching
# Extract absolute file paths
grep -Eo '/[a-zA-Z0-9_/.+-]+' error.log
# Find PHP file references in logs
grep -Eo '/[a-zA-Z0-9_/.-]+\.php' error.log | sort -u
# Find all file paths referenced in nginx config
grep -Eo '(/[a-zA-Z0-9_/.+-]+)+' /etc/nginx/nginx.conf | sort -u
Pattern 19: Cron Expression Validation
# Show active crontab entries (skip comments and blank lines)
grep -Ev '^\s*(#|$)' /etc/crontab
# Find cron jobs running every minute (possible mistake)
grep -E '^\*\s+\*\s+\*\s+\*\s+\*' /var/spool/cron/crontabs/*
# List all cron schedules with their commands
grep -Eh '^[0-9*,/-]+\s+[0-9*,/-]+\s+[0-9*,/-]+\s+[0-9*,/-]+\s+[0-9*,/-]+' \
/etc/cron.d/* /var/spool/cron/crontabs/* 2>/dev/null
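The same field pattern can serve as a basic shape check for a schedule string. This sketch only confirms that there are five fields built from digits, *, commas, slashes, and hyphens. It does not verify value ranges (such as minute 0-59) or names like mon, and is_cron_schedule is a hypothetical helper:

```shell
# Hypothetical helper: succeed if $1 looks like a five-field cron schedule.
# Shape check only; "99 99 * * *" would still pass.
is_cron_schedule() {
    field='[0-9*,/-]+'
    # -x anchors the match to the whole string.
    printf '%s\n' "$1" | grep -Eqx "${field}([[:space:]]+${field}){4}"
}

is_cron_schedule '*/5 * * * *' && echo "looks like a schedule"
is_cron_schedule '* * * *' || echo "wrong number of fields"
```

Shortcuts such as @daily are deliberately rejected here, since they are not five-field expressions.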
Pattern 20: JSON Key-Value Extraction
# Extract a single key-value pair
grep -Eo '"version"\s*:\s*"[^"]*"' package.json
# Extract all keys from JSON
grep -Eo '"[^"]+"\s*:' config.json | sed 's/"\s*://' | tr -d '"'
# For complex JSON, use jq instead
jq '.version' package.json
jq -r '.dependencies | keys[]' package.json
Tip: for anything beyond a simple key lookup, prefer jq. It is purpose-built for JSON parsing and handles edge cases that regex cannot, such as nested objects and escaped quotes.
Power Combinations
The real power of regex comes from combining patterns with Unix pipes. Here are some common multi-tool workflows:
Security Audit Pipeline
# Top 20 source IPs by failed SSH password attempts
grep 'Failed password' /var/log/auth.log | \
grep -Eo 'from [0-9.]+' | \
awk '{print $2}' | \
sort | uniq -c | sort -rn | head -20
# Find all 404s with the requesting IP and URL
awk '$9 == 404 {print $1, $7}' access.log | \
sort | uniq -c | sort -rn | head -20
Bulk Configuration Changes with sed
# Change the listen port from 80 to 8080 in every site config
sed -i 's/listen\s\+80;/listen 8080;/g' /etc/nginx/sites-available/*
# Update all database hostnames
sed -i 's/db-old\.example\.com/db-new.example.com/g' /var/www/*/config.php
# Comment out a line matching a pattern
sed -i '/^max_connections/s/^/#/' /etc/mysql/my.cnf
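Because sed -i rewrites files immediately, it is worth previewing a substitution first and keeping a backup when applying it. A minimal sketch on a throwaway file; the sed_preview helper and the nginx snippet are made up for this example:

```shell
# Hypothetical helper: print the diff that a sed expression would cause,
# leaving the file itself untouched. Usage: sed_preview EXPRESSION FILE
sed_preview() {
    sed "$1" "$2" | diff "$2" - || true   # diff exits 1 when it finds changes
}

# Demo on a temporary file.
conf=$(mktemp)
printf '%s\n' 'server {' '    listen   80;' '}' > "$conf"

edit='s/listen[[:space:]]\{1,\}80;/listen 8080;/'
sed_preview "$edit" "$conf"     # 1. inspect the change
sed -i.bak "$edit" "$conf"      # 2. apply it, keeping FILE.bak as a backup
grep 'listen' "$conf"           # 3. confirm the result

rm -f "$conf" "$conf.bak"
```

The -i.bak form leaves the original next to the edited file, which makes a bulk change across many configs much easier to roll back.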
Regex Cheat Sheet
| Task | Pattern | Tool |
|---|---|---|
| Match IP address | [0-9]{1,3}(\.[0-9]{1,3}){3} | grep -Eo |
| Match email | [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,} | grep -Eo |
| Match URL | https?://[a-zA-Z0-9./?=_&%-]+ | grep -Eo |
| Strip comments | ^\s*(#\|;\|$) | grep -Ev |
| Match version | [0-9]+\.[0-9]+\.[0-9]+ | grep -Eo |
| Find errors | (error\|critical\|fatal) | grep -Ei |
| Match MAC | ([0-9a-f]{2}:){5}[0-9a-f]{2} | grep -Eo |
| Match hex color | #[0-9A-Fa-f]{3,6}\b | grep -Eo |
| Match file path | /[a-zA-Z0-9_/.+-]+ | grep -Eo |
| Replace text | s/old/new/g | sed -i |
Summary
These 20 patterns cover the vast majority of text-matching tasks you will encounter as a system administrator. The key takeaway is not to memorize every pattern but to understand the building blocks: character classes, quantifiers, anchors, and groups. Once you internalize those fundamentals, constructing new patterns becomes intuitive.
Start with the simple patterns — IP matching, error filtering, empty line removal — and gradually incorporate more complex ones. Keep this guide bookmarked for reference, and within a few weeks, you will find yourself reaching for regex instinctively whenever you need to find, filter, or transform text. And remember: when regex gets too complex (especially for JSON or HTML parsing), reach for purpose-built tools like jq, xmllint, or proper parsers. Regex is powerful, but knowing its limits is just as important as knowing its strengths.