Advanced performance monitoring system for OpenLiteSpeed servers with intelligent resource spike detection.
- Real-time monitoring of CPU, memory, I/O, network
- Adaptive thresholds based on dynamic baselines calculated from historical data
- Intelligent spike detection with pattern analysis
- Detailed diagnostics during spikes: processes, connections, MySQL queries
- OpenLiteSpeed-specific metrics: lsphp processes, HTTP/HTTPS connections
- SQLite database for historical data and temporal analysis
- Structured logging in JSON and human-readable formats
- Automatic reports with cause analysis and temporal patterns
- Python 3.8+
- Root/sudo permissions for full system metrics access
- OpenLiteSpeed (optional, for specific metrics)
- MySQL/MariaDB client (optional, for query monitoring)
- Ubuntu 24.04 (tested), compatible with most Linux distributions
Based on benchmarks, SpikeWatch has a very light footprint:
- CPU: < 1% average usage on typical systems
- RAM: ~40-60 MB average usage
- Disk: ~1-2 GB for logs and database (with 7-day retention)
- Network: Minimal (local-only operations)
Recommended minimum VPS specs:
- 1 vCPU core
- 512 MB RAM (1 GB recommended for small servers)
- 10 GB disk space
Note: Run the included benchmark tool on your target system to get accurate measurements:
python3 test/benchmark_resources.py
See test/README.md for detailed benchmarking instructions.
- Clone the repository:
git clone https://github.com/netwaretcs/spikewatch.git
cd spikewatch
- Install dependencies:
pip3 install -r requirements.txt
Or, for a system-wide installation:
apt install python3-psutil
- Create the log directory (requires sudo):
sudo mkdir -p /var/log/spikewatch
sudo chown $USER:$USER /var/log/spikewatch
# Run in foreground (press Ctrl+C to stop)
python3 spikewatch.py
# Run in background (Unix/Linux/macOS only)
python3 spikewatch.py --daemon
# Or use shell background operator
python3 spikewatch.py &
# Run a single check and exit
python3 spikewatch.py --once
# Last 24 hours report
python3 spikewatch.py --report 24
# Last week report
python3 spikewatch.py --report 168
# Clear all historical data from database (with interactive confirmation)
python3 spikewatch.py --clear-db
# Clear database and start monitoring immediately
python3 spikewatch.py --clear-db --daemon
# Clear database and run single check
python3 spikewatch.py --clear-db --once
Warning: The --clear-db option permanently deletes all metrics and spike events from the database. Use with caution! When used alone (without --daemon or --once), it will prompt for confirmation before proceeding.
SpikeWatch looks for config.json in the script directory by default. You can also specify a custom path:
# Use default config.json in script directory
python3 spikewatch.py
# Or specify a custom path
python3 spikewatch.py --config /path/to/custom_config.json
# Default (uses config.json in script directory, English language)
python3 analyze_spikes.py
# With custom configuration file
python3 analyze_spikes.py --config /path/to/custom_config.json
# Override language (en/it)
python3 analyze_spikes.py --language it
python3 spikewatch.py [OPTIONS]
Options:
--once Run a single check instead of continuous monitoring
--daemon Run as background daemon (Unix/Linux/macOS only)
--report HOURS Generate analysis report for last N hours
--config FILE Use custom configuration file (default: config.json)
--clear-db Clear all database contents before starting
--force Force start even if another instance is running
-h, --help Show help message and exit
Thresholds and parameters can be configured in the CONFIG dictionary in spikewatch.py or via an external JSON file.
Copy the example configuration to the project directory:
cp config.json.example config.json
Edit the configuration file as needed:
{
"log_dir": "/var/log/spikewatch",
"db_file": "/var/log/spikewatch/metrics.db",
"thresholds": {
"cpu": 60,
"load": 6.0,
"memory": 85,
"iowait": 30,
"disk_usage": 90
},
"adaptive_threshold": true,
"baseline_window": 3600,
"spike_duration": 60,
"collect_interval": 10,
"detailed_on_spike": true,
"language": "en"
}
- thresholds: Fixed thresholds for spike detection (percentage or load values)
- adaptive_threshold: Enable dynamic threshold calculation based on baseline
- baseline_window: Time window (seconds) for baseline calculation (default: 1 hour)
- spike_duration: Minimum duration (seconds) to consider a spike significant
- collect_interval: Metrics collection interval (seconds)
- detailed_on_spike: Collect comprehensive diagnostics during spikes
- language: Output language for analyze_spikes.py (choices: "en" or "it", default: "en")
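The interplay between the fixed thresholds and adaptive_threshold can be illustrated with a small sketch. The formula below (baseline mean plus a margin of standard deviations, floored at the fixed threshold) is an assumption for illustration, not SpikeWatch's actual algorithm:

```python
import statistics

def adaptive_cpu_threshold(samples, fixed_threshold, margin=2.0):
    """Illustrative only: derive a dynamic threshold from baseline samples.

    Hypothetical formula (mean + margin * stddev), floored at the fixed
    threshold so quiet periods don't lower the bar and cause false positives.
    """
    if len(samples) < 2:
        return fixed_threshold
    baseline = statistics.mean(samples)
    spread = statistics.stdev(samples)
    return max(fixed_threshold, baseline + margin * spread)

# A noisy-but-normal CPU baseline raises the effective threshold above 60
history = [40, 45, 50, 55, 60, 65]
print(adaptive_cpu_threshold(history, fixed_threshold=60))
```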
All files are saved in /var/log/spikewatch/:
- metrics.db: SQLite database with complete history
- spikewatch.log: Monitor text log
- metrics.json: JSON metrics stream (one per line)
- spike_YYYYMMDD_HHMMSS.json: Detailed diagnostics for each spike
- spike_analysis.log: Human-readable spike summaries
- report_YYYYMMDD.json: Daily analysis reports
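Because metrics.json stores one JSON object per line (JSON Lines), it can be processed line by line without loading the whole file. A minimal sketch — the field names here are illustrative and may not match the actual stream:

```python
import json

# Each line of metrics.json is a standalone JSON object (JSON Lines).
# Field names below are illustrative; check your own file's keys.
sample_lines = [
    '{"timestamp": "2024-01-01T00:00:00", "cpu_percent": 12.5}',
    '{"timestamp": "2024-01-01T00:00:10", "cpu_percent": 95.0}',
]

high_cpu = []
for line in sample_lines:  # in practice: for line in open("/var/log/spikewatch/metrics.json")
    record = json.loads(line)
    if record["cpu_percent"] > 90:
        high_cpu.append(record["timestamp"])

print(high_cpu)  # → ['2024-01-01T00:00:10']
```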
Create the file /etc/systemd/system/spikewatch.service:
[Unit]
Description=SpikeWatch Performance Monitor
After=network.target
[Service]
Type=simple
User=root
WorkingDirectory=/path/to/spikewatch
ExecStart=/usr/bin/python3 /path/to/spikewatch/spikewatch.py
Restart=always
RestartSec=10
# Optional: Reduce system impact with nice/ionice (recommended for production)
Nice=10
IOSchedulingClass=2
IOSchedulingPriority=5
[Install]
WantedBy=multi-user.target
Enable and start the service:
sudo systemctl enable spikewatch
sudo systemctl start spikewatch
sudo systemctl status spikewatch
For production environments or systems under heavy load, it's recommended to run SpikeWatch with reduced priority to minimize any potential impact on critical services.
Using nice and ionice (recommended):
# Run with reduced CPU priority (nice) and I/O priority (ionice)
nice -n 10 ionice -c 2 -n 5 python3 spikewatch.py --daemon
Priority levels explained:
- nice: CPU scheduling priority (-20 to 19, higher = lower priority)
  - 0-5: Slightly reduced priority (recommended for most cases)
  - 10-15: Moderately reduced priority (good for busy servers)
  - 15-19: Heavily reduced priority (use only if you can tolerate delays)
- ionice: I/O scheduling priority
  - Class 2 (best-effort) with priority 3-5: Balanced I/O access (recommended)
  - Class 3 (idle): Only uses I/O when system is idle (use with caution)
Recommendations by use case:
# Standard production server (recommended)
nice -n 5 ionice -c 2 -n 3 python3 spikewatch.py --daemon
# High-traffic web server (minimal impact)
nice -n 10 ionice -c 2 -n 5 python3 spikewatch.py --daemon
# Development/testing environment (no priority adjustment needed)
python3 spikewatch.py --daemon
Note: Given SpikeWatch's lightweight footprint (< 0.2% CPU, ~18 MB RAM), nice/ionice is optional but recommended as a best practice for production environments. The systemd service example above includes sensible defaults (Nice=10, IOSchedulingClass=2, IOSchedulingPriority=5).
Collects system metrics with minimal overhead:
- System metrics (psutil): CPU, memory, swap, I/O, network
- Top processes by CPU/memory
- OpenLiteSpeed metrics (lsphp processes, connections)
- MySQL metrics (connections, slow queries)
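SpikeWatch collects these metrics via psutil; the stdlib-only sketch below just shows the general shape of a collected sample (field names are illustrative, and it covers only load average and disk usage):

```python
import os
import shutil
import time

def collect_basic_metrics():
    """Stdlib-only approximation of one metrics sample.

    SpikeWatch itself uses psutil for richer data; this sketch uses only
    the standard library (Unix load average and disk usage).
    """
    load_1min, load_5min, load_15min = os.getloadavg()
    disk = shutil.disk_usage("/")
    return {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "load_1min": load_1min,
        "disk_usage_percent": round(100 * disk.used / disk.total, 1),
    }

print(collect_basic_metrics())
```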
Intelligent detection with:
- Adaptive thresholds based on dynamic baselines
- Sudden spike detection (2x recent average)
- Spike event tracking with duration and cause
- Multi-metric analysis (CPU, load, memory, I/O)
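The 2x-recent-average rule can be sketched as follows; the factor and the idle-floor guard are illustrative parameters, not SpikeWatch's actual defaults:

```python
from collections import deque

def is_sudden_spike(recent, current, factor=2.0, floor=10.0):
    """Flag a value that jumps to `factor`x the recent average.

    `floor` avoids flagging trivial jumps on near-idle systems
    (e.g. 1% -> 3% CPU). Both parameters are illustrative.
    """
    if not recent:
        return False
    avg = sum(recent) / len(recent)
    return current >= max(factor * avg, floor)

window = deque([20, 22, 25, 21], maxlen=30)  # recent CPU% samples
print(is_sudden_spike(window, 60))  # 60 >= 2 * 22 -> spike
print(is_sudden_spike(window, 30))  # within normal variation
```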
Main orchestration:
- Continuous monitoring loop
- Structured logging (JSON + text)
- On-demand detailed diagnostics
- Periodic report generation
- Automatic cleanup of old data (7 days)
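The 7-day cleanup amounts to deleting rows older than a cutoff timestamp. A simplified sketch against an in-memory, single-column copy of the metrics table (ISO-8601 timestamps compare correctly as plain strings, so no date parsing is needed); SpikeWatch's actual cleanup logic may differ:

```python
import sqlite3

# In-memory stand-in for /var/log/spikewatch/metrics.db
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (timestamp TEXT PRIMARY KEY, cpu_percent REAL)")
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?)",
    [("2024-01-01T00:00:00", 10.0), ("2024-01-20T00:00:00", 50.0)],
)

# ISO-8601 timestamps sort lexicographically, so a string comparison
# against a cutoff works directly in SQL.
cutoff = "2024-01-13T00:00:00"  # "now" minus 7 days in a real run
conn.execute("DELETE FROM metrics WHERE timestamp < ?", (cutoff,))
remaining = conn.execute("SELECT COUNT(*) FROM metrics").fetchone()[0]
print(remaining)  # → 1
```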
CREATE TABLE metrics (
timestamp TEXT PRIMARY KEY,
cpu_percent REAL,
load_1min REAL,
memory_percent REAL,
iowait REAL,
spike_detected BOOLEAN,
spike_reason TEXT,
details TEXT
);
CREATE TABLE spikes (
id INTEGER PRIMARY KEY AUTOINCREMENT,
start_time TEXT,
end_time TEXT,
duration_seconds INTEGER,
max_cpu REAL,
max_load REAL,
cause TEXT,
details TEXT
);
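With this schema, spike history can be inspected directly via Python's sqlite3 module. A small example against an in-memory copy of the spikes table (the sample rows are made up):

```python
import sqlite3

# In-memory copy of the spikes schema shown above
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE spikes (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    start_time TEXT, end_time TEXT, duration_seconds INTEGER,
    max_cpu REAL, max_load REAL, cause TEXT, details TEXT)""")
conn.executemany(
    "INSERT INTO spikes (start_time, duration_seconds, max_cpu, cause) "
    "VALUES (?, ?, ?, ?)",
    [("2024-01-01T10:00:00", 120, 95.0, "cpu"),
     ("2024-01-01T12:00:00", 30, 70.0, "iowait")],
)

# Longest spikes first, as a report tool might list them
rows = conn.execute(
    "SELECT start_time, duration_seconds, max_cpu, cause "
    "FROM spikes ORDER BY duration_seconds DESC"
).fetchall()
for row in rows:
    print(row)
```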
Contributions are welcome! Please:
- Fork the project
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is distributed under the MIT License. See the LICENSE file for more details.
Gabriele Venturini
Want to know exactly how much resources SpikeWatch will use on your system? Run the included benchmark:
cd test
python3 benchmark_resources.py --duration 60
This will generate a detailed report including:
- System information (CPU, RAM, disk)
- Resource usage statistics (CPU%, memory MB, I/O)
- Recommendations for minimum VPS requirements
See test/README.md for more details.
Test Environment (production VPS):
- Platform: Linux
- CPU: 6 vCPU cores (AMD EPYC Processor with IBPB) @ 3195 MHz
- RAM: 11.68 GB
- Disk: 144 GB
- OS: Ubuntu 24.04 LTS
Benchmark Results (60-second test):
CPU Usage:
Average: 0.15%
Min: 0.0%
Max: 1.0%
Memory Usage:
Average: 18.12 MB (0.15%)
Min: 18.12 MB
Max: 18.12 MB
Threads:
Average: 1
Max: 1
File Descriptors:
Average: 4
Max: 4
Disk I/O (cumulative):
Read: 0.54 MB
Write: 1.01 MB
Conclusion: ✓ Extremely lightweight - Uses less than 0.2% CPU and only ~18 MB of RAM. Suitable for VPS with as little as 512 MB RAM and 1 vCPU core. Perfect for shared hosting and minimal VPS configurations.
Note: These are real results from a production VPS running Ubuntu 24.04. Your results may vary based on system configuration, server load, and collection interval settings. Run the benchmark on your own system for accurate measurements.
For bugs, feature requests or questions, open an issue on GitHub.