How to Troubleshoot a Website That Suddenly Becomes Unavailable: A Complete Diagnostic Process from DNS to the Server

A sudden website outage is one of the most critical problems an online service can face. Users see a blank or error page and you see monitoring alerts, yet the root cause may sit at a completely different layer: DNS changes that have not taken effect, a CDN failure, a crashed server, a crashed Nginx process, a database failure, or simply a full disk.

This article provides a systematic diagnostic process to help you pinpoint the root cause and restore service in the shortest possible time.

I. Rapid Troubleshooting Classification
First, determine the scope of the issue:

# Test website accessibility across multiple regions
# Tool: https://www.isitdownrightnow.com/
# Or use the command line:
curl -I --connect-timeout 10 https://your-domain.com
Symptom | Possible Cause | Check First (in order)
Unavailable globally | Server downtime, DNS failure, expired domain | Server Status → DNS
Unavailable domestically, accessible overseas | CDN origin fetch issues, network outage | CDN Status → Network
Some pages display errors, homepage works | Failure of services that specific features rely on (database, cache) | Database → Redis
Website responds extremely slowly but is not completely down | CPU/memory exhaustion, slow database queries | Server Resource Usage
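The exit code of the curl command above already hints at which layer to look at first. The following is a minimal sketch of that mapping, not a complete classifier; the function name `classify_curl_exit` is made up for illustration, and the exit codes are the ones listed in curl's man page.

```shell
#!/bin/sh
# Map a curl exit code to the layer most likely at fault.
# classify_curl_exit is a hypothetical helper name.
classify_curl_exit() {
  case "$1" in
    0)     echo "reachable: got an HTTP response, inspect the status code" ;;
    6)     echo "dns: could not resolve host" ;;
    7)     echo "server: connection refused or host unreachable" ;;
    28)    echo "network: timed out (firewall, routing, or overloaded server)" ;;
    35|60) echo "tls: handshake or certificate problem" ;;
    *)     echo "other: see 'man curl' for exit code $1" ;;
  esac
}

# Usage (performs a real request, so it is left commented out):
# curl -sS -o /dev/null -I --connect-timeout 10 https://your-domain.com
# classify_curl_exit "$?"
```

An exit code of 0 only means an HTTP response arrived; a 5xx status in that response still points at the server or the CDN.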
II. Step 1: Check DNS Resolution
# Check if the domain is correctly resolved to the server IP
nslookup your-domain.com
dig your-domain.com +short

# Check if DNS changes have propagated globally
# Tool: https://www.whatsmydns.net/
If the result is not the IP you expect (your server's IP, or your CDN's edge IPs if the site sits behind a CDN), possible causes include:

The domain's DNS records were accidentally deleted or modified
The domain has expired and been suspended by the registrar
The DNS provider is having an outage
The CDN's origin or DNS configuration was changed
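Comparing the resolved address against the expected one by eye is error-prone during an incident. Here is a small sketch that does an exact comparison; `resolves_to` is a hypothetical helper, and `203.0.113.10` is a documentation address standing in for your real server IP.

```shell
#!/bin/sh
# Succeeds if the expected IP appears among the addresses given on stdin
# (one per line, as printed by `dig +short`).
# -F: literal string, -x: whole-line match, so 203.0.113.10 does not
# match 203.0.113.100.
resolves_to() {
  expected_ip="$1"
  grep -Fxq -- "$expected_ip"
}

# Usage (queries real DNS, so it is left commented out):
# EXPECTED_IP="203.0.113.10"   # assumption: replace with your server IP
# for ns in 1.1.1.1 8.8.8.8; do
#   if dig +short your-domain.com A @"$ns" | resolves_to "$EXPECTED_IP"; then
#     echo "$ns: OK"
#   else
#     echo "$ns: MISMATCH"
#   fi
# done
```

Asking more than one public resolver, as in the commented loop, gives a rough picture of whether a recent change has propagated.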
Check for domain expiration
# Check the domain's WHOIS record to confirm the expiration date
whois your-domain.com | grep -i "expir"
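Once the expiration date is known, turning it into a number of remaining days makes it easier to judge urgency. This is a sketch under two assumptions: GNU `date` is available (for `-d`), and the WHOIS record contains an ISO-style date, which varies by registry. `days_until` is a hypothetical helper name.

```shell
#!/bin/sh
# Print whole days from NOW until EXPIRY. Both arguments are date strings
# understood by GNU `date -d`; NOW defaults to the current time.
# A zero or negative result means the date has passed or is less than a day away.
days_until() {
  expiry_epoch=$(date -u -d "$1" +%s) || return 1
  now_epoch=$(date -u -d "${2:-now}" +%s) || return 1
  echo $(( (expiry_epoch - now_epoch) / 86400 ))
}

# Usage (queries WHOIS, so it is left commented out; the date pattern is an
# assumption and will not fit every registry's output format):
# expiry=$(whois your-domain.com | grep -i 'expir' \
#   | grep -Eo '[0-9]{4}-[0-9]{2}-[0-9]{2}[^ ]*' | head -n 1)
# days_until "$expiry"
```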
III. Step 2: Verify Server Reachability
# Ping the server IP (Note: Some servers disable ICMP; a failed ping does not necessarily indicate an outage)
ping your_server_ip

# Test whether HTTP/HTTPS ports are open
curl -I --connect-timeout 5 http://your_server_ip
telnet your_server_ip 80
telnet your_server_ip 443
If the server IP shows no response at all, log in to the VPS control panel to check the server’s running status and force a reboot if necessary.
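On machines where telnet is not installed, the same port test can be done with bash alone. The sketch below assumes `bash` (built with `/dev/tcp` support, as on most distributions) and the coreutils `timeout` command are present; `port_open` is a hypothetical helper name.

```shell
#!/bin/sh
# Succeeds if a TCP connection to HOST:PORT can be opened within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Usage (connects to a real host, so it is left commented out):
# for port in 80 443; do
#   if port_open your_server_ip "$port"; then
#     echo "port $port: open"
#   else
#     echo "port $port: closed or filtered"
#   fi
# done
```

A refused connection fails immediately, while a silently dropped one fails after the 3-second timeout; the difference often separates "service not running" from "firewall in the way".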

IV. Step 3: SSH In to Check Services
Once you’ve successfully SSHed in, check the following in this order:

Check if system resources are exhausted
# View CPU, memory, and load
top -bn1 | head -20

# Check if the disk is full (this is one of the most common causes)
df -h

# Check memory usage
free -h
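Scanning `df` output by eye works, but a threshold filter makes a full filesystem hard to miss. The following sketch reads POSIX-format `df -P` output; `full_filesystems` and the default threshold of 90 percent are assumptions for illustration, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# Read `df -P` output on stdin and print every filesystem whose usage is at
# or above THRESHOLD percent (default 90), as "mountpoint usage%".
full_filesystems() {
  threshold="${1:-90}"
  awk -v limit="$threshold" 'NR > 1 {
    use = $5; sub(/%/, "", use)          # strip the percent sign
    if (use + 0 >= limit + 0) print $6, $5
  }'
}

# Usage:
# df -P  | full_filesystems 90    # space usage
# df -Pi | full_filesystems 90    # inode usage, another way a disk can be "full"
```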
Check Nginx status
# Check if Nginx is running
sudo systemctl status nginx

# View the Nginx error log (last 50 lines)
sudo tail -n 50 /var/log/nginx/error.log

# Test the Nginx configuration and reload
sudo nginx -t && sudo systemctl reload nginx
Check PHP-FPM status (for PHP sites such as WordPress)
# The unit and log names include the PHP version; replace 8.1 with the version you run
sudo systemctl status php8.1-fpm
sudo tail -n 30 /var/log/php8.1-fpm.log
Check MySQL Status
sudo systemctl status mysql
sudo tail -n 30 /var/log/mysql/error.log

# Attempt to connect to the database
mysql -u root -p -e "SELECT 1;"
Check port listening status
# Verify that ports 80 and 443 are listening
sudo ss -tlnp | grep -E ':(80|443) '
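A pattern match on the port number can be fooled by neighbours such as 8080, so an exact comparison on the port field is more reliable. This is a sketch that assumes the usual `ss -tln` column layout (state first, local address:port fourth); `check_listening` is a hypothetical helper name.

```shell
#!/bin/sh
# Read `ss -tln` output on stdin and report, for each port given as an
# argument, whether a socket is listening on exactly that port.
# Returns non-zero if any of the ports is missing.
check_listening() {
  ss_output=$(cat)
  missing=0
  for port in "$@"; do
    if printf '%s\n' "$ss_output" | awk -v p="$port" '
         $1 == "LISTEN" { n = split($4, a, ":"); if (a[n] == p) found = 1 }
         END { exit !found }'; then
      echo "port $port: listening"
    else
      echo "port $port: NOT listening"
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
# ss -tln | check_listening 80 443
```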
V. Step 4: CDN Troubleshooting
If using Cloudflare or another CDN:

# Directly access the origin server IP (bypassing the CDN)
curl -H "Host: your-domain.com" http://your_server_ip

# If direct access via IP works but access via the domain does not,
# the issue is at the CDN layer; log in to the Cloudflare console to check
Common Cloudflare Issues:

Incorrect SSL/TLS encryption mode settings (recommended: “Full (Strict)”)
Firewall rules mistakenly blocking traffic
Cloudflare's requests blocked by the origin server's firewall (ensure Cloudflare's IP ranges are allowed)
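When Cloudflare cannot get a usable answer from the origin, it returns a status code in the 520 to 526 range, and each code narrows down which of the issues above applies. The sketch below maps those codes to their documented meaning plus a first thing to check; `explain_cf_status` is a hypothetical helper, and the suggested checks are starting points rather than a full diagnosis.

```shell
#!/bin/sh
# Translate a Cloudflare 52x status code into its meaning and a first check.
# Any other code is reported as not Cloudflare-specific.
explain_cf_status() {
  case "$1" in
    520) echo "520 unknown error from origin: check the origin error logs" ;;
    521) echo "521 web server is down: check that the web server runs and the firewall allows Cloudflare IPs" ;;
    522) echo "522 connection timed out: check the origin firewall and server load" ;;
    523) echo "523 origin is unreachable: check the origin IP in the DNS records" ;;
    524) echo "524 a timeout occurred: the origin accepted the connection but answered too slowly" ;;
    525) echo "525 SSL handshake failed: check the SSL/TLS mode and the origin TLS setup" ;;
    526) echo "526 invalid SSL certificate: the origin certificate fails validation under Full (Strict)" ;;
    *)   echo "$1 is not a Cloudflare 52x code: compare with a direct request to the origin" ;;
  esac
}

# Usage (performs a real request, so it is left commented out):
# status=$(curl -s -o /dev/null -w '%{http_code}' https://your-domain.com)
# explain_cf_status "$status"
```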
VI. Quick Recovery Commands for Common Issues
# Nginx crashed → Restart
sudo systemctl restart nginx

# MySQL crashed → Restart
sudo systemctl restart mysql

# PHP-FPM crash → Restart
sudo systemctl restart php8.1-fpm

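# Disk full → Quick cleanup
sudo journalctl --vacuum-time=7d   # Drop journal entries older than 7 days
sudo apt clean                     # Clear the APT package cache (Debian/Ubuntu)
docker system prune -f             # If Docker is installed: removes stopped containers, unused networks, dangling images, and build cache

# Out of memory → Identify the top memory-consuming processes, then restart the offending service
ps aux --sort=-%mem | head -10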
VII. Set Up Monitoring to Avoid Reactive Detection Next Time
The best way to handle website downtime is to receive an alert before users notice it. We recommend the following free monitoring solutions:

UptimeRobot: Free monitoring for 50 sites, checks every 5 minutes, sends immediate email/Telegram notifications upon downtime
Better Stack: Free plan offers 3-minute check intervals and supports multi-channel alerts
Self-hosted monitoring: Use crontab to run a curl check every minute and send notifications upon failure (see Section 10: Advanced Operations)
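The self-hosted option can be as small as one script plus one crontab line. What follows is a minimal sketch, not a production monitor: the URL, the log path, the script location, and the `notify` hook (which here only appends to a log file) are all placeholders to replace with your own values and notification channel.

```shell
#!/bin/sh
# Minimal uptime check meant to be run from cron once a minute.
# URL, LOG_FILE and notify() are placeholders to adapt.
URL="${URL:-https://your-domain.com}"
LOG_FILE="${LOG_FILE:-/tmp/site-check.log}"

# Treat any 2xx or 3xx status as healthy.
is_healthy() {
  case "$1" in
    2??|3??) return 0 ;;
    *)       return 1 ;;
  esac
}

# Hypothetical notification hook: appends a timestamped line to LOG_FILE.
# Swap in mail, a chat webhook, or similar.
notify() {
  printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" >> "$LOG_FILE"
}

check_site() {
  # curl prints 000 as the status when no HTTP response was received.
  status=$(curl -s -o /dev/null -w '%{http_code}' \
    --connect-timeout 10 --max-time 20 "$URL")
  is_healthy "$status" || notify "DOWN $URL (status $status)"
}

# check_site   # uncomment when installing the script

# crontab entry (the script path is an assumption):
# * * * * * /usr/local/bin/check_site.sh
```

A check that runs on the same server as the website cannot report that the whole server is down, so it is best run from a different machine.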

Summary
Troubleshooting steps for website downtime: Check if DNS is resolving properly → Check if the server is reachable → Check if system resources are exhausted (disk/memory/CPU) → Check if the web service is running → Check if the database is running → Check if the CDN configuration is correct. By following this sequence, 90% of website outages can be identified and resolved within 15 minutes.
