Monitoring Your VPS — What to Watch and What to Ignore

A practical monitoring routine for a self-managed WordPress VPS. The five metrics that actually matter, how to check them in under five minutes, and what numbers should make you act.

Terminal showing htop, df -h, and free -m output — the three-command server health check

You don’t need a full monitoring stack for a personal WordPress VPS. You need five commands and a weekly habit.


The Five-Minute Weekly Check

Run these in order. Each takes seconds.

1. RAM Usage

free -m
              total  used  free  shared  buff/cache  available
Mem:           1987   892   312      45         782        897
Swap:          1023    12  1011

What to watch:

  • available column — this is what PHP-FPM can actually use. Under 200MB means you’re running tight.
  • Swap used — any consistent swap usage means RAM pressure. Occasional is fine; sustained means action needed.

Action threshold: Available RAM consistently under 200MB → reduce pm.max_children in PHP-FPM config or upgrade the VPS plan.
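
The 200MB threshold can be checked by a script rather than by eye. A minimal sketch, assuming the free -m column layout shown above (available is the seventh field on the Mem: line); the function names available_mb and ram_ok are illustrative, not part of any tool.

```shell
#!/bin/bash
# Print the "available" figure (MB) from `free -m` output read on stdin.
available_mb() {
    awk '/^Mem:/ { print $7 }'
}

# Return 0 when available RAM is at or above the threshold (default 200 MB).
ram_ok() {
    local available="$1" threshold="${2:-200}"
    [ "$available" -ge "$threshold" ]
}

# Check the live system if `free` is installed.
if command -v free >/dev/null 2>&1; then
    avail="$(free -m | available_mb)"
    if [ -z "$avail" ]; then
        echo "Could not read available RAM"
    elif ram_ok "$avail" 200; then
        echo "RAM OK: ${avail} MB available"
    else
        echo "RAM tight: ${avail} MB available"
    fi
fi
```

One low reading is not a reason to act. The threshold in this article is about readings that stay low from week to week.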


2. Disk Usage

df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        50G   18G   30G  38% /

What to watch: The Use% column for your root filesystem (/). Over 80% needs attention. Over 90% puts you close to a full disk, and a full disk causes service failures (MariaDB, for one, cannot write to it).

Where disk fills up silently:

# Check what's using the most space
du -sh /var/log/nginx/
du -sh /var/www/*/wp-content/cache/
du -sh /var/cache/nginx/

The usual suspects: Nginx access logs (rotate them), the WP Rocket cache (disable its page caching if Nginx FastCGI caching already does that job), and the FastCGI cache itself.

Action threshold: Over 80% used → find what’s filling up, clean it.
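
The two disk thresholds map onto a three-state check. This is a sketch under assumptions: it uses df -P (the POSIX output format, which keeps each filesystem on one line), and the names root_use_pct and disk_status are made up for illustration.

```shell
#!/bin/bash
# Extract Use% (digits only) for the root filesystem from `df` output on stdin.
root_use_pct() {
    awk '$NF == "/" { gsub("%", "", $(NF-1)); print $(NF-1) }'
}

# Map a usage percentage to the thresholds used in this article.
disk_status() {
    local pct="$1"
    if [ "$pct" -gt 90 ]; then
        echo "CRITICAL"
    elif [ "$pct" -gt 80 ]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}

# Check the live root filesystem.
pct="$(df -P / | root_use_pct)"
if [ -n "$pct" ]; then
    echo "Root filesystem at ${pct}%: $(disk_status "$pct")"
fi
```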


3. Service Status

systemctl is-active nginx php-fpm mariadb

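All three should return active. On Debian and Ubuntu the PHP-FPM unit name includes the PHP version (for example php8.3-fpm), so substitute the name your system uses. If any returns inactive or failed: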

sudo systemctl status nginx
sudo tail -20 /var/log/nginx/error.log

Action threshold: Any service not active → restart it and read the error log.
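
systemctl is-active prints states without service names, so with several units it helps to pair each name with its state. A minimal sketch: the service list is the one used in this article (adjust php-fpm if your unit name is versioned), and inactive_services is a hypothetical helper name.

```shell
#!/bin/bash
# Print the names of services whose state is anything other than "active".
# Reads "name state" pairs, one per line, on stdin.
inactive_services() {
    awk '$2 != "active" { print $1 }'
}

services="nginx php-fpm mariadb"

# Pair each service name with its state, then keep only the unhealthy ones.
if command -v systemctl >/dev/null 2>&1; then
    for svc in $services; do
        printf '%s %s\n' "$svc" "$(systemctl is-active "$svc" 2>/dev/null)"
    done | inactive_services
fi
```

No output means every listed service is active. Any name printed is one to look at with systemctl status.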


4. Recent Nginx Errors

sudo tail -20 /var/log/nginx/error.log

Look for patterns — the same error repeating every few seconds indicates an active problem. Occasional errors are normal.

Common ones to investigate:

  • connect() failed (111: Connection refused) — PHP-FPM not running
  • Permission denied — file ownership issue
  • no live upstreams — all PHP-FPM workers busy or dead
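
Counting the three errors above over a longer stretch of the log shows a repeating pattern more clearly than the last 20 lines do. A sketch, assuming the messages appear in the log with the wording listed here; count_known_errors is an illustrative name.

```shell
#!/bin/bash
# Count occurrences of each known error pattern in log text read on stdin.
count_known_errors() {
    awk '
        /connect\(\) failed \(111/ { refused++ }
        /Permission denied/        { denied++ }
        /no live upstreams/        { upstream++ }
        END {
            printf "connection_refused=%d\n", refused
            printf "permission_denied=%d\n", denied
            printf "no_live_upstreams=%d\n", upstream
        }'
}

# Summarize the last 200 lines of the error log (run as root to read it).
if [ -r /var/log/nginx/error.log ]; then
    tail -n 200 /var/log/nginx/error.log | count_known_errors
fi
```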

5. SSL Certificate Expiry

sudo certbot certificates
Found the following certs:
  Certificate Name: yourdomain.com
    Expiry Date: 2026-09-08 (VALID: 89 days)

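Action threshold: Under 30 days → run sudo certbot renew --dry-run to verify auto-renewal works. Certbot normally renews a 90-day certificate once it has about 30 days left, so a lower number suggests renewal has not been running.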
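
The days-remaining figure can be pulled out of the certbot output and compared against the 30-day threshold. A sketch, assuming the "(VALID: N days)" format shown above; cert_days_left and cert_expiring are hypothetical helper names, and the live check needs root.

```shell
#!/bin/bash
# Print the days remaining for each certificate in
# `certbot certificates` output read on stdin.
cert_days_left() {
    sed -n 's/.*(VALID: \([0-9][0-9]*\) day.*/\1/p'
}

# Return 0 when fewer days remain than the threshold (default 30).
cert_expiring() {
    local days="$1" threshold="${2:-30}"
    [ "$days" -lt "$threshold" ]
}

# Warn about any certificate under the threshold (run the script with sudo).
if command -v certbot >/dev/null 2>&1; then
    certbot certificates 2>/dev/null | cert_days_left | while read -r days; do
        if cert_expiring "$days" 30; then
            echo "Certificate has ${days} days left: check auto-renewal"
        fi
    done
fi
```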


One-Command Health Check Script

Save this as /usr/local/bin/vps-check.sh:

#!/bin/bash
# Weekly VPS health check: RAM, disk, services, SSL, Fail2ban, latest Nginx errors.
# Commands that need root call sudo themselves.
echo "=== RAM ===" && free -m | grep -E "Mem|Swap"
echo "=== DISK ===" && df -h / | tail -1
# Prints one state per line, in the order the services are listed.
echo "=== SERVICES ===" && systemctl is-active nginx php-fpm mariadb redis
echo "=== SSL ===" && sudo certbot certificates 2>/dev/null | grep -E "Name|Expiry"
echo "=== FAIL2BAN ===" && sudo fail2ban-client status sshd 2>/dev/null | grep -E "banned|failed"
echo "=== LAST NGINX ERROR ===" && sudo tail -3 /var/log/nginx/error.log

Then make it executable:

sudo chmod +x /usr/local/bin/vps-check.sh

Run it: sudo vps-check.sh — full picture in under 10 seconds.


Automated Uptime Monitoring

The commands above tell you the server’s current state. They don’t tell you when the site goes down at 3am.

For that, use a free external monitor:

UptimeRobot (free tier) — monitors your domain every 5 minutes and sends an email or Telegram notification when it’s unreachable. Takes 2 minutes to set up at uptimerobot.com.

This is the minimum external monitoring worth having. You’ll know about downtime before your visitors complain.
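
In essence, an external monitor requests a URL on a schedule and alerts on a bad response. The sketch below shows that idea only; it is not how UptimeRobot is implemented and does not replace it. The URL is a placeholder, status_is_up and http_status are illustrative names, and the check only makes sense when run from a machine other than the VPS.

```shell
#!/bin/bash
# Return 0 when an HTTP status code counts as "up" (2xx or 3xx).
status_is_up() {
    case "$1" in
        2??|3??) return 0 ;;
        *)       return 1 ;;
    esac
}

# Fetch the status code for a URL, giving up after 10 seconds.
# curl reports 000 when no HTTP response was received.
http_status() {
    curl -s -o /dev/null -m 10 -w '%{http_code}' "$1"
}

# Usage (placeholder URL):
#   code="$(http_status https://yourdomain.com/)"
#   status_is_up "$code" || echo "DOWN: status $code"
```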


What’s Safe to Ignore

High CPU spikes — brief spikes during cron jobs, WordPress cron, or cache warming are normal. Sustained high CPU (over 80% for more than a few minutes) needs investigation.
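
One rough way to tell a brief spike from sustained load is the 5-minute load average divided by the number of cores. This is a proxy, not a CPU percentage (on Linux the load average also counts processes waiting on disk), and the 0.8 limit simply mirrors the 80% figure above. load_is_high is an illustrative name.

```shell
#!/bin/bash
# Return 0 when load per core exceeds the limit (default 0.8).
load_is_high() {
    local load="$1" cores="$2" limit="${3:-0.8}"
    awk -v l="$load" -v c="$cores" -v m="$limit" 'BEGIN { exit !((l / c) > m) }'
}

# Compare the live 5-minute load average against the core count.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r load1 load5 rest < /proc/loadavg
    if load_is_high "$load5" "$(nproc)"; then
        echo "Sustained load: 5-minute average ${load5} on $(nproc) core(s)"
    else
        echo "Load normal: 5-minute average ${load5}"
    fi
fi
```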

Fail2ban banning IPs — this is Fail2ban doing its job. High ban counts mean active scanning activity, which is background internet noise. Not a problem unless the server is slow.

Log file entries for 404s — bots constantly probe for common paths. wp-config.php.bak, /.env, /admin — all normal background noise. Only investigate if the same IP probes repeatedly with unusual patterns.
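
To see whether one IP is probing repeatedly, count 404 responses per address. A sketch, assuming Nginx's default combined log format (client IP in field 1, status code in field 9) and the default access log path; malformed requests can shift fields, so treat the counts as approximate. top_404_ips and the threshold of 20 are illustrative choices.

```shell
#!/bin/bash
# Print "<count> <ip>" for every IP with at least min_hits 404 responses,
# highest count first. Reads access log lines on stdin.
top_404_ips() {
    local min_hits="${1:-20}"
    awk -v min="$min_hits" '
        $9 == 404 { hits[$1]++ }
        END { for (ip in hits) if (hits[ip] >= min) print hits[ip], ip }
    ' | sort -rn
}

# Scan the current access log (path is an assumption; run as root to read it).
if [ -r /var/log/nginx/access.log ]; then
    top_404_ips 20 < /var/log/nginx/access.log
fi
```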

Redis memory usage growing — Redis grows as it caches more data, then stabilizes. As long as it stays well under your server’s available RAM, it’s working correctly.
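
To put a number on Redis memory, read the used_memory field (in bytes) from redis-cli info memory and compare it with the available column from free -m. A minimal sketch, assuming Redis listens on its default local address; redis_used_mb is an illustrative name.

```shell
#!/bin/bash
# Print Redis used memory in whole MB from `redis-cli info memory`
# output read on stdin. INFO lines end in CRLF, so strip the trailing CR.
redis_used_mb() {
    awk -F: '$1 == "used_memory" { sub(/\r$/, "", $2); printf "%d\n", $2 / 1048576 }'
}

# Report live usage when redis-cli is installed and the server answers.
if command -v redis-cli >/dev/null 2>&1; then
    used="$(redis-cli info memory 2>/dev/null | redis_used_mb)"
    if [ -n "$used" ]; then
        echo "Redis is using ${used} MB"
    fi
fi
```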