Advanced performance monitoring system for OpenLiteSpeed servers with intelligent resource spike detection.
- Real-time monitoring of CPU, memory, I/O, network
- Adaptive thresholds based on dynamic baselines calculated from historical data
- Intelligent spike detection with pattern analysis
- Detailed diagnostics during spikes: processes, connections, MySQL queries
- OpenLiteSpeed-specific metrics: lsphp processes, HTTP/HTTPS connections
- SQLite database for historical data and temporal analysis
- Structured logging in JSON and human-readable formats
- Automatic reports with cause analysis and temporal patterns
- Python 3.8+
- Root/sudo permissions for full system metrics access
- OpenLiteSpeed (optional, for specific metrics)
- MySQL/MariaDB client (optional, for query monitoring)
- Ubuntu 24.04 (tested), compatible with most Linux distributions
Based on benchmarks, SpikeWatch has a very light footprint:
- CPU: < 1% average usage on typical systems
- RAM: ~40-60 MB average usage
- Disk: ~1-2 GB for logs and database (with 7-day retention)
- Network: Minimal (local-only operations)
Recommended minimum VPS specs:
- 1 vCPU core
- 512 MB RAM (1 GB recommended for small servers)
- 10 GB disk space
Note: Run the included benchmark tool on your target system to get accurate measurements:
python3 test/benchmark_resources.py
See test/README.md for detailed benchmarking instructions.
- Clone the repository:
git clone https://github.com/netwaretcs/spikewatch.git
cd spikewatch
- Install dependencies:
pip3 install -r requirements.txt
Or, for a system-wide installation:
apt install python3-psutil
- Create the log directory (requires sudo):
sudo mkdir -p /var/log/spikewatch
sudo chown $USER:$USER /var/log/spikewatch
# Run in foreground (press Ctrl+C to stop)
python3 spikewatch.py
# Run in background (Unix/Linux/macOS only)
python3 spikewatch.py --daemon
# Or use shell background operator
python3 spikewatch.py &
# Run a single check and exit
python3 spikewatch.py --once
# Last 24 hours report
python3 spikewatch.py --report 24
# Last week report
python3 spikewatch.py --report 168
# Clear all historical data from database (with interactive confirmation)
python3 spikewatch.py --clear-db
# Clear database and start monitoring immediately
python3 spikewatch.py --clear-db --daemon
# Clear database and run single check
python3 spikewatch.py --clear-db --once
Warning: The --clear-db option permanently deletes all metrics and spike events from the database. Use with caution! When used alone (without --daemon or --once), it will prompt for confirmation before proceeding.
SpikeWatch looks for config.json in the script directory by default. You can also specify a custom path:
# Use default config.json in script directory
python3 spikewatch.py
# Or specify a custom path
python3 spikewatch.py --config /path/to/custom_config.json
# Default (uses config.json in script directory, English language)
python3 analyze_spikes.py
# With custom configuration file
python3 analyze_spikes.py --config /path/to/custom_config.json
# Override language (en/it)
python3 analyze_spikes.py --language it
python3 spikewatch.py [OPTIONS]
Options:
--once Run a single check instead of continuous monitoring
--daemon Run as background daemon (Unix/Linux/macOS only)
--report HOURS Generate analysis report for last N hours
--config FILE Use custom configuration file (default: config.json)
--clear-db Clear all database contents before starting
--force Force start even if another instance is running
-h, --help Show help message and exit
Thresholds and parameters can be configured in the CONFIG dictionary in spikewatch.py or via an external JSON file.
Copy the example configuration to the project directory:
cp config.json.example config.json
Edit the configuration file as needed:
{
"log_dir": "/var/log/spikewatch",
"db_file": "/var/log/spikewatch/metrics.db",
"thresholds": {
"cpu": 60,
"load": 6.0,
"memory": 85,
"iowait": 30,
"disk_usage": 90
},
"adaptive_threshold": true,
"baseline_window": 3600,
"spike_duration": 60,
"collect_interval": 10,
"detailed_on_spike": true,
"language": "en"
}
- thresholds: Fixed thresholds for spike detection (percentage or load values)
- adaptive_threshold: Enable dynamic threshold calculation based on baseline
- baseline_window: Time window (seconds) for baseline calculation (default: 1 hour)
- spike_duration: Minimum duration (seconds) to consider a spike significant
- collect_interval: Metrics collection interval (seconds)
- detailed_on_spike: Collect comprehensive diagnostics during spikes
- language: Output language for analyze_spikes.py (choices: "en" or "it", default: "en")
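The interplay between the fixed thresholds and adaptive_threshold can be illustrated with a small sketch. The formula below (baseline mean plus a margin of standard deviations, floored at the fixed threshold) is an assumption for illustration, not SpikeWatch's actual algorithm:

```python
import statistics

def adaptive_cpu_threshold(samples, fixed_threshold, margin=2.0):
    """Illustrative only: derive a dynamic threshold from baseline samples.

    Hypothetical formula (mean + margin * stddev), floored at the fixed
    threshold so quiet periods don't lower the bar and cause false positives.
    """
    if len(samples) < 2:
        return fixed_threshold
    baseline = statistics.mean(samples)
    spread = statistics.stdev(samples)
    return max(fixed_threshold, baseline + margin * spread)

# A noisy-but-normal CPU baseline raises the effective threshold above 60
history = [40, 45, 50, 55, 60, 65]
print(adaptive_cpu_threshold(history, fixed_threshold=60))
```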
All files are saved in /var/log/spikewatch/:
- metrics.db: SQLite database with complete history
- spikewatch.log: Monitor text log
- metrics.json: JSON metrics stream (one per line)
- spike_YYYYMMDD_HHMMSS.json: Detailed diagnostics for each spike
- spike_analysis.log: Human-readable spike summaries
- report_YYYYMMDD.json: Daily analysis reports
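Because metrics.json stores one JSON object per line (JSON Lines), it can be processed line by line without loading the whole file. A minimal sketch — the field names here are illustrative and may not match the actual stream:

```python
import json

# Each line of metrics.json is a standalone JSON object (JSON Lines).
# Field names below are illustrative; check your own file's keys.
sample_lines = [
    '{"timestamp": "2024-01-01T00:00:00", "cpu_percent": 12.5}',
    '{"timestamp": "2024-01-01T00:00:10", "cpu_percent": 95.0}',
]

high_cpu = []
for line in sample_lines:  # in practice: for line in open("/var/log/spikewatch/metrics.json")
    record = json.loads(line)
    if record["cpu_percent"] > 90:
        high_cpu.append(record["timestamp"])

print(high_cpu)  # → ['2024-01-01T00:00:10']
```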
Create the file /etc/systemd/system/spikewatch.service:
[Unit]
Description=SpikeWatch Performance Monitor
After=network.target
[Service]
Type=simple
User=root
WorkingDirectory=/path/to/spikewatch
ExecStart=/usr/bin/python3 /path/to/spikewatch/spikewatch.py
Restart=always
RestartSec=10
# Optional: Reduce system impact with nice/ionice (recommended for production)
Nice=10
IOSchedulingClass=2
IOSchedulingPriority=5
[Install]
WantedBy=multi-user.target
Enable and start the service:
sudo systemctl enable spikewatch
sudo systemctl start spikewatch
sudo systemctl status spikewatch
For production environments or systems under heavy load, it's recommended to run SpikeWatch with reduced priority to minimize any potential impact on critical services.
Using nice and ionice (recommended):
# Run with reduced CPU priority (nice) and I/O priority (ionice)
nice -n 10 ionice -c 2 -n 5 python3 spikewatch.py --daemon
Priority levels explained:
- nice: CPU scheduling priority (-20 to 19, higher = lower priority)
  - 0-5: Slightly reduced priority (recommended for most cases)
  - 10-15: Moderately reduced priority (good for busy servers)
  - 15-19: Heavily reduced priority (use only if you can tolerate delays)
- ionice: I/O scheduling priority
  - Class 2 (best-effort) with priority 3-5: Balanced I/O access (recommended)
  - Class 3 (idle): Only uses I/O when system is idle (use with caution)
Recommendations by use case:
# Standard production server (recommended)
nice -n 5 ionice -c 2 -n 3 python3 spikewatch.py --daemon
# High-traffic web server (minimal impact)
nice -n 10 ionice -c 2 -n 5 python3 spikewatch.py --daemon
# Development/testing environment (no priority adjustment needed)
python3 spikewatch.py --daemon
Note: Given SpikeWatch's lightweight footprint (< 0.2% CPU, ~18 MB RAM), nice/ionice is optional but recommended as a best practice for production environments. The systemd service example above includes sensible defaults (Nice=10, IOSchedulingClass=2, IOSchedulingPriority=5).
Collects system metrics with minimal overhead:
- System metrics (psutil): CPU, memory, swap, I/O, network
- Top processes by CPU/memory
- OpenLiteSpeed metrics (lsphp processes, connections)
- MySQL metrics (connections, slow queries)
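SpikeWatch collects these metrics via psutil; the stdlib-only sketch below just shows the general shape of a collected sample (field names are illustrative, and it covers only load average and disk usage):

```python
import os
import shutil
import time

def collect_basic_metrics():
    """Stdlib-only approximation of one metrics sample.

    SpikeWatch itself uses psutil for richer data; this sketch uses only
    the standard library (Unix load average and disk usage).
    """
    load_1min, load_5min, load_15min = os.getloadavg()
    disk = shutil.disk_usage("/")
    return {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "load_1min": load_1min,
        "disk_usage_percent": round(100 * disk.used / disk.total, 1),
    }

print(collect_basic_metrics())
```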
Intelligent detection with:
- Adaptive thresholds based on dynamic baselines
- Sudden spike detection (2x recent average)
- Spike event tracking with duration and cause
- Multi-metric analysis (CPU, load, memory, I/O)
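The 2x-recent-average rule can be sketched as follows; the factor and the idle-floor guard are illustrative parameters, not SpikeWatch's actual defaults:

```python
from collections import deque

def is_sudden_spike(recent, current, factor=2.0, floor=10.0):
    """Flag a value that jumps to `factor`x the recent average.

    `floor` avoids flagging trivial jumps on near-idle systems
    (e.g. 1% -> 3% CPU). Both parameters are illustrative.
    """
    if not recent:
        return False
    avg = sum(recent) / len(recent)
    return current >= max(factor * avg, floor)

window = deque([20, 22, 25, 21], maxlen=30)  # recent CPU% samples
print(is_sudden_spike(window, 60))  # 60 >= 2 * 22 -> spike
print(is_sudden_spike(window, 30))  # within normal variation
```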
Main orchestration:
- Continuous monitoring loop
- Structured logging (JSON + text)
- On-demand detailed diagnostics
- Periodic report generation
- Automatic cleanup of old data (7 days)
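The 7-day cleanup amounts to deleting rows older than a cutoff timestamp. A simplified sketch against an in-memory, single-column copy of the metrics table (ISO-8601 timestamps compare correctly as plain strings, so no date parsing is needed); SpikeWatch's actual cleanup logic may differ:

```python
import sqlite3

# In-memory stand-in for /var/log/spikewatch/metrics.db
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (timestamp TEXT PRIMARY KEY, cpu_percent REAL)")
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?)",
    [("2024-01-01T00:00:00", 10.0), ("2024-01-20T00:00:00", 50.0)],
)

# ISO-8601 timestamps sort lexicographically, so a string comparison
# against a cutoff works directly in SQL.
cutoff = "2024-01-13T00:00:00"  # "now" minus 7 days in a real run
conn.execute("DELETE FROM metrics WHERE timestamp < ?", (cutoff,))
remaining = conn.execute("SELECT COUNT(*) FROM metrics").fetchone()[0]
print(remaining)  # → 1
```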
CREATE TABLE metrics (
timestamp TEXT PRIMARY KEY,
cpu_percent REAL,
load_1min REAL,
memory_percent REAL,
iowait REAL,
spike_detected BOOLEAN,
spike_reason TEXT,
details TEXT
);
CREATE TABLE spikes (
id INTEGER PRIMARY KEY AUTOINCREMENT,
start_time TEXT,
end_time TEXT,
duration_seconds INTEGER,
max_cpu REAL,
max_load REAL,
cause TEXT,
details TEXT
);
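With this schema, spike history can be inspected directly via Python's sqlite3 module. A small example against an in-memory copy of the spikes table (the sample rows are made up):

```python
import sqlite3

# In-memory copy of the spikes schema shown above
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE spikes (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    start_time TEXT, end_time TEXT, duration_seconds INTEGER,
    max_cpu REAL, max_load REAL, cause TEXT, details TEXT)""")
conn.executemany(
    "INSERT INTO spikes (start_time, duration_seconds, max_cpu, cause) "
    "VALUES (?, ?, ?, ?)",
    [("2024-01-01T10:00:00", 120, 95.0, "cpu"),
     ("2024-01-01T12:00:00", 30, 70.0, "iowait")],
)

# Longest spikes first, as a report tool might list them
rows = conn.execute(
    "SELECT start_time, duration_seconds, max_cpu, cause "
    "FROM spikes ORDER BY duration_seconds DESC"
).fetchall()
for row in rows:
    print(row)
```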
Contributions are welcome! Please:
- Fork the project
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is distributed under the MIT License. See the LICENSE file for more details.
Gabriele Venturini
Want to know exactly how much resources SpikeWatch will use on your system? Run the included benchmark:
cd test
python3 benchmark_resources.py --duration 60
This will generate a detailed report including:
- System information (CPU, RAM, disk)
- Resource usage statistics (CPU%, memory MB, I/O)
- Recommendations for minimum VPS requirements
See test/README.md for more details.
Test Environment (production VPS):
- Platform: Linux
- CPU: 6 vCPU cores (AMD EPYC Processor with IBPB) @ 3195 MHz
- RAM: 11.68 GB
- Disk: 144 GB
- OS: Ubuntu 24.04 LTS
Benchmark Results (60-second test):
CPU Usage:
Average: 0.15%
Min: 0.0%
Max: 1.0%
Memory Usage:
Average: 18.12 MB (0.15%)
Min: 18.12 MB
Max: 18.12 MB
Threads:
Average: 1
Max: 1
File Descriptors:
Average: 4
Max: 4
Disk I/O (cumulative):
Read: 0.54 MB
Write: 1.01 MB
Conclusion: ✓ Extremely lightweight - Uses less than 0.2% CPU and only ~18 MB of RAM. Suitable for VPS with as little as 512 MB RAM and 1 vCPU core. Perfect for shared hosting and minimal VPS configurations.
Note: These are real results from a production VPS running Ubuntu 24.04. Your results may vary based on system configuration, server load, and collection interval settings. Run the benchmark on your own system for accurate measurements.
For bugs, feature requests or questions, open an issue on GitHub.