
Database Migration

pancakes-proxy edited this page Jul 14, 2025 · 1 revision


This guide covers the migration process from JSON file storage to PostgreSQL database, including data preservation, validation, and rollback procedures.

📋 Migration Overview

Migration Scope

AIMod's migration from JSON to PostgreSQL involves converting the following data:

Data Files to Migrate:

  • guild_config.json → guild_config table
  • user_infractions.json → user_infractions table
  • appeals.json → appeals table
  • global_bans.json → global_bans table
  • logging_data.json → event_logs table
  • botdetect_config.json → botdetect_config table
  • user_data.json → user_data table
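For reference, the file-to-table mapping above can be expressed as a plain Python dict; this is a sketch mirroring the list, and the actual migration script may organize it differently:

```python
# Mapping of JSON source files to their destination PostgreSQL tables,
# taken directly from the list above.
JSON_TO_TABLE = {
    "guild_config.json": "guild_config",
    "user_infractions.json": "user_infractions",
    "appeals.json": "appeals",
    "global_bans.json": "global_bans",
    "logging_data.json": "event_logs",
    "botdetect_config.json": "botdetect_config",
    "user_data.json": "user_data",
}

for json_file, table in JSON_TO_TABLE.items():
    print(f"{json_file} -> {table}")
```

A mapping like this lets one generic loop drive per-file migration instead of hard-coding each filename.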

Benefits of Migration

Performance Improvements:

  • Significantly faster queries than per-file JSON reads, especially at scale
  • Concurrent access support
  • ACID transaction guarantees
  • Efficient indexing and searching

Scalability Benefits:

  • Support for millions of records
  • Connection pooling
  • Query optimization
  • Horizontal scaling potential

Data Integrity:

  • Foreign key constraints
  • Data type validation
  • Atomic operations
  • Backup and recovery

🔧 Pre-Migration Setup

Prerequisites

System Requirements:

  • PostgreSQL 13+ installed and running
  • Python 3.11+ with asyncpg
  • Sufficient disk space (2x current JSON data size)
  • Database user with appropriate permissions

Backup Existing Data:

# Create backup directory
mkdir -p backups/pre-migration

# Backup JSON data
cp -r wdiscordbot-json-data/ backups/pre-migration/
tar -czf backups/pre-migration/json-backup-$(date +%Y%m%d).tar.gz wdiscordbot-json-data/

# Backup current bot configuration
cp .env backups/pre-migration/
cp bot.py backups/pre-migration/

Database Preparation

Install PostgreSQL

# Ubuntu/Debian
sudo apt update
sudo apt install postgresql postgresql-contrib

# Start and enable PostgreSQL
sudo systemctl start postgresql
sudo systemctl enable postgresql

Create Database and User

# Switch to postgres user
sudo -u postgres psql

# Create database and user
CREATE DATABASE aimod_bot;
CREATE USER aimod_user WITH PASSWORD 'secure_password';
GRANT ALL PRIVILEGES ON DATABASE aimod_bot TO aimod_user;
ALTER USER aimod_user CREATEDB;
\q

Configure Database Connection

# Update .env file
echo "DATABASE_URL=postgresql://aimod_user:secure_password@localhost:5432/aimod_bot" >> .env
echo "REDIS_URL=redis://localhost:6379" >> .env

🗄️ Database Schema Creation

Automated Schema Setup

The migration script automatically creates all required tables:

# Run schema creation
python -c "
from database.connection import initialize_database
import asyncio
asyncio.run(initialize_database())
print('Database schema created successfully')
"

Manual Schema Creation

If you need to create the schema manually:

-- Connect to database
psql -h localhost -U aimod_user -d aimod_bot

-- Guild configuration table
CREATE TABLE IF NOT EXISTS guild_config (
    guild_id BIGINT NOT NULL,
    key VARCHAR(255) NOT NULL,
    value JSONB,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (guild_id, key)
);

-- User infractions table
CREATE TABLE IF NOT EXISTS user_infractions (
    id SERIAL PRIMARY KEY,
    guild_id BIGINT NOT NULL,
    user_id BIGINT NOT NULL,
    timestamp TIMESTAMP WITH TIME ZONE NOT NULL,
    rule_violated VARCHAR(50),
    action_taken VARCHAR(100),
    reasoning TEXT,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- Appeals table
CREATE TABLE IF NOT EXISTS appeals (
    id SERIAL PRIMARY KEY,
    user_id BIGINT NOT NULL,
    guild_id BIGINT NOT NULL,
    infraction_id INTEGER REFERENCES user_infractions(id),
    reason TEXT NOT NULL,
    status VARCHAR(20) DEFAULT 'PENDING',
    admin_response TEXT,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- Global bans table
CREATE TABLE IF NOT EXISTS global_bans (
    user_id BIGINT PRIMARY KEY,
    reason TEXT NOT NULL,
    banned_by BIGINT NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- Event logs table
CREATE TABLE IF NOT EXISTS event_logs (
    id SERIAL PRIMARY KEY,
    guild_id BIGINT NOT NULL,
    event_type VARCHAR(50) NOT NULL,
    user_id BIGINT,
    channel_id BIGINT,
    data JSONB,
    timestamp TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- Bot detection config table
CREATE TABLE IF NOT EXISTS botdetect_config (
    guild_id BIGINT PRIMARY KEY,
    enabled BOOLEAN DEFAULT false,
    keywords JSONB DEFAULT '[]',
    action VARCHAR(50) DEFAULT 'timeout',
    timeout_duration INTEGER DEFAULT 3600,
    log_channel BIGINT,
    whitelist_roles JSONB DEFAULT '[]',
    whitelist_users JSONB DEFAULT '[]',
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- User data table
CREATE TABLE IF NOT EXISTS user_data (
    user_id BIGINT PRIMARY KEY,
    username VARCHAR(255),
    discriminator VARCHAR(10),
    avatar_url TEXT,
    first_seen TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    last_seen TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    message_count INTEGER DEFAULT 0,
    infraction_count INTEGER DEFAULT 0
);

-- Create indexes for performance
CREATE INDEX IF NOT EXISTS idx_guild_config_guild_id ON guild_config(guild_id);
CREATE INDEX IF NOT EXISTS idx_user_infractions_guild_user ON user_infractions(guild_id, user_id);
CREATE INDEX IF NOT EXISTS idx_user_infractions_timestamp ON user_infractions(timestamp);
CREATE INDEX IF NOT EXISTS idx_appeals_user_guild ON appeals(user_id, guild_id);
CREATE INDEX IF NOT EXISTS idx_event_logs_guild_type ON event_logs(guild_id, event_type);
CREATE INDEX IF NOT EXISTS idx_event_logs_timestamp ON event_logs(timestamp);

🔄 Migration Process

Running the Migration

The migration script handles all data conversion automatically:

# Run the complete migration
python migrate_json_to_postgresql.py

Migration Script Overview

The migration script performs the following steps:

  1. Validation: Checks JSON file integrity
  2. Conversion: Transforms JSON data to PostgreSQL format
  3. Import: Inserts data into database tables
  4. Verification: Validates migration success
  5. Cleanup: Optional JSON file archival
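The five phases above could be orchestrated roughly like this. This is a sketch with stand-in coroutines; in the real script the phases are the migrate_* functions shown below, and the names here are illustrative:

```python
import asyncio

completed_steps = []

# Stand-in coroutines for each phase; the real script's migrate_*
# functions would take their place.
async def validate_json_files():
    completed_steps.append("validate")

async def convert_and_import():
    completed_steps.append("import")

async def verify_migration():
    completed_steps.append("verify")

async def archive_json_files():
    completed_steps.append("archive")

async def run_migration(archive: bool = False):
    """Run the migration phases in order; JSON archival is opt-in."""
    await validate_json_files()
    await convert_and_import()
    await verify_migration()
    if archive:
        await archive_json_files()

asyncio.run(run_migration(archive=True))
```

Keeping archival behind a flag matches step 5 being optional.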

Detailed Migration Steps

1. Guild Configuration Migration

import json
import os

from database.connection import get_connection

async def migrate_guild_config():
    """Migrate guild_config.json to PostgreSQL."""
    
    json_file = "wdiscordbot-json-data/guild_config.json"
    if not os.path.exists(json_file):
        print("No guild_config.json found, skipping...")
        return
    
    print("Migrating guild configuration...")
    
    with open(json_file, "r") as f:
        data = json.load(f)
    
    migrated_count = 0
    async with get_connection() as conn:
        for guild_id_str, config in data.items():
            guild_id = int(guild_id_str)
            
            for key, value in config.items():
                await conn.execute("""
                    INSERT INTO guild_config (guild_id, key, value)
                    VALUES ($1, $2, $3)
                    ON CONFLICT (guild_id, key) DO NOTHING
                """, guild_id, key, json.dumps(value))
                migrated_count += 1
    
    print(f"✅ Migrated {migrated_count} guild configuration entries")

2. User Infractions Migration

async def migrate_user_infractions():
    """Migrate user_infractions.json to PostgreSQL."""
    
    json_file = "wdiscordbot-json-data/user_infractions.json"
    if not os.path.exists(json_file):
        print("No user_infractions.json found, skipping...")
        return
    
    print("Migrating user infractions...")
    
    with open(json_file, "r") as f:
        data = json.load(f)
    
    migrated_count = 0
    async with get_connection() as conn:
        for key, infractions in data.items():
            # Parse guild_id and user_id from key format: "guild_id_user_id"
            parts = key.split("_")
            if len(parts) >= 2:
                guild_id = int(parts[0])
                user_id = int(parts[1])
                
                for infraction in infractions:
                    # Convert timestamp format
                    timestamp_str = infraction.get("timestamp", "")
                    try:
                        timestamp = datetime.fromisoformat(timestamp_str.replace("Z", "+00:00"))
                    except ValueError:
                        timestamp = datetime.utcnow()
                    
                    await conn.execute("""
                        INSERT INTO user_infractions 
                        (guild_id, user_id, timestamp, rule_violated, action_taken, reasoning)
                        VALUES ($1, $2, $3, $4, $5, $6)
                    """, 
                        guild_id,
                        user_id,
                        timestamp,
                        infraction.get("rule_violated", ""),
                        infraction.get("action_taken", ""),
                        infraction.get("reasoning", "")
                    )
                    migrated_count += 1
    
    print(f"✅ Migrated {migrated_count} user infractions")

3. Appeals Migration

async def migrate_appeals():
    """Migrate appeals.json to PostgreSQL."""
    
    json_file = "wdiscordbot-json-data/appeals.json"
    if not os.path.exists(json_file):
        print("No appeals.json found, skipping...")
        return
    
    print("Migrating appeals...")
    
    with open(json_file, "r") as f:
        data = json.load(f)
    
    migrated_count = 0
    async with get_connection() as conn:
        for appeal_data in data:
            # Convert timestamp, falling back to now if missing or malformed
            try:
                created_at = datetime.fromisoformat(
                    appeal_data.get("created_at", "").replace("Z", "+00:00")
                )
            except ValueError:
                created_at = datetime.utcnow()
            
            await conn.execute("""
                INSERT INTO appeals 
                (user_id, guild_id, reason, status, admin_response, created_at)
                VALUES ($1, $2, $3, $4, $5, $6)
            """,
                appeal_data.get("user_id"),
                appeal_data.get("guild_id"),
                appeal_data.get("reason", ""),
                appeal_data.get("status", "PENDING"),
                appeal_data.get("admin_response"),
                created_at
            )
            migrated_count += 1
    
    print(f"✅ Migrated {migrated_count} appeals")
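global_bans.json is not walked through above. Assuming it maps user-ID strings to objects with reason and banned_by fields (an assumption about the file layout, not confirmed by the script), converting it into rows for the global_bans table might look like:

```python
def global_bans_to_rows(data: dict) -> list[tuple[int, str, int]]:
    """Convert the assumed global_bans.json layout into
    (user_id, reason, banned_by) tuples ready for INSERT into global_bans.

    NOTE: the {"<user_id>": {"reason": ..., "banned_by": ...}} shape is
    a hypothetical example, not taken from the migration script."""
    rows = []
    for user_id_str, entry in data.items():
        rows.append((
            int(user_id_str),
            entry.get("reason", ""),
            int(entry.get("banned_by", 0)),
        ))
    return rows

sample = {"1234": {"reason": "spam", "banned_by": "42"}}
print(global_bans_to_rows(sample))
```

Separating the JSON-to-row conversion from the INSERT makes the transformation testable without a database connection.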

Migration Validation

Data Integrity Checks

async def validate_migration():
    """Validate that migration was successful."""
    
    print("\n=== Migration Validation ===")
    
    # Check record counts
    validation_results = {}
    
    # Guild config validation
    json_guilds = len(load_json_data("guild_config.json", {}))
    async with get_connection() as conn:
        db_guilds = await conn.fetchval(
            "SELECT COUNT(DISTINCT guild_id) FROM guild_config"
        )
    validation_results["guild_config"] = (json_guilds, db_guilds)
    
    # User infractions validation
    json_infractions = sum(
        len(infractions) 
        for infractions in load_json_data("user_infractions.json", {}).values()
    )
    async with get_connection() as conn:
        db_infractions = await conn.fetchval(
            "SELECT COUNT(*) FROM user_infractions"
        )
    validation_results["user_infractions"] = (json_infractions, db_infractions)
    
    # Print validation results
    all_valid = True
    for table, (json_count, db_count) in validation_results.items():
        status = "✅" if json_count == db_count else "❌"
        print(f"{status} {table}: JSON={json_count}, DB={db_count}")
        if json_count != db_count:
            all_valid = False
    
    if all_valid:
        print("\n🎉 Migration validation successful!")
    else:
        print("\n⚠️  Migration validation failed - please review the data")
    
    return all_valid
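The JSON-side infraction count used in the validation can be computed independently with a small helper, assuming the same {"guild_id_user_id": [infractions...]} layout the migration parses:

```python
def count_json_infractions(data: dict) -> int:
    """Total infractions across all guild_user keys in user_infractions.json."""
    return sum(len(infractions) for infractions in data.values())

# Hypothetical sample in the same shape as user_infractions.json
sample = {
    "100_200": [{"rule_violated": "1"}, {"rule_violated": "2"}],
    "100_300": [{"rule_violated": "5A"}],
}
print(count_json_infractions(sample))  # 3
```

Comparing this number against SELECT COUNT(*) FROM user_infractions catches silently dropped rows.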

🧪 Testing Migration

Pre-Migration Testing

# Test database connection
python -c "
from database.connection import get_pool
import asyncio

async def test():
    pool = await get_pool()
    print('Database connection successful' if pool else 'Connection failed')

asyncio.run(test())
"

# Test JSON file integrity
python -c "
import json
import os

files = ['guild_config.json', 'user_infractions.json', 'appeals.json', 'global_bans.json']
for file in files:
    path = f'wdiscordbot-json-data/{file}'
    if os.path.exists(path):
        try:
            with open(path) as f:
                json.load(f)
            print(f'✅ {file} is valid JSON')
        except Exception as e:
            print(f'❌ {file} is invalid: {e}')
    else:
        print(f'⚠️  {file} not found')
"

Post-Migration Testing

# Run comprehensive migration test
python test_postgresql_migration.py

# Test specific functionality
python -c "
from database.operations import get_guild_config, set_guild_config
import asyncio

async def test():
    # Test configuration operations
    await set_guild_config(123456789, 'TEST_KEY', 'test_value')
    value = await get_guild_config(123456789, 'TEST_KEY')
    print(f'Config test: {value}')
    
    # Test infraction operations
    from database.operations import get_user_infractions
    infractions = await get_user_infractions(123456789, 987654321)
    print(f'Infractions test: {len(infractions)} records')

asyncio.run(test())
"

🔄 Rollback Procedures

Emergency Rollback

If migration fails or issues are discovered:

# Stop the bot
sudo systemctl stop aimod-bot.service

# Restore JSON files
cp -r backups/pre-migration/wdiscordbot-json-data/ ./

# Revert to JSON-based code
git checkout json-storage-branch  # If you have a backup branch

# Restart with JSON storage
python bot.py

Partial Rollback

To rollback specific tables:

-- Connect to database
psql -h localhost -U aimod_user -d aimod_bot

-- Clear specific table
TRUNCATE TABLE guild_config CASCADE;

-- Re-run specific migration
-- Then run: python migrate_guild_config_only.py

Data Recovery

# Recover from backup
tar -xzf backups/pre-migration/json-backup-YYYYMMDD.tar.gz

# Verify data integrity
python -c "
import json
import os

for file in os.listdir('wdiscordbot-json-data'):
    if file.endswith('.json'):
        with open(f'wdiscordbot-json-data/{file}') as f:
            data = json.load(f)
            print(f'{file}: {len(data)} top-level entries')
"

📊 Migration Monitoring

Progress Tracking

The migration script provides detailed progress information:

=== AIMod JSON to PostgreSQL Migration ===

Initializing database connection...
✅ Database connection established

Migrating guild configuration...
Progress: [████████████████████████████████] 100% (1250/1250)
✅ Migrated 1250 guild configuration entries

Migrating user infractions...
Progress: [████████████████████████████████] 100% (5430/5430)
✅ Migrated 5430 user infractions

Migrating appeals...
Progress: [████████████████████████████████] 100% (23/23)
✅ Migrated 23 appeals

=== Migration Complete ===
Total time: 2m 34s
All JSON data has been successfully migrated to PostgreSQL

Performance Metrics

Monitor migration performance:

import time
import psutil

def monitor_migration(start_time: float):
    """Print system resource usage during the migration.

    Pass the time.time() captured when the migration started, so the
    elapsed figure is meaningful."""

    process = psutil.Process()

    # interval=0.1 samples CPU over a short window; a bare cpu_percent()
    # call returns 0.0 on first use
    print(f"CPU usage: {process.cpu_percent(interval=0.1)}%")
    print(f"Memory usage: {process.memory_info().rss / 1024 / 1024:.2f} MB")
    print(f"Elapsed time: {time.time() - start_time:.2f}s")

🔧 Troubleshooting

Common Issues

Permission Errors:

# Fix database permissions
sudo -u postgres psql
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO aimod_user;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO aimod_user;

Memory Issues:

# Process large files in chunks
# Increase PostgreSQL memory settings
sudo nano /etc/postgresql/13/main/postgresql.conf
# shared_buffers = 256MB
# work_mem = 4MB

Encoding Issues:

# Handle encoding problems
with open(json_file, "r", encoding="utf-8") as f:
    data = json.load(f)

Connection Timeouts:

# Increase connection timeout
DATABASE_URL = "postgresql://user:pass@localhost:5432/db?connect_timeout=60"
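If you prefer to set the timeout programmatically rather than editing the URL by hand, a standard-library helper can append the parameter (connect_timeout is a standard PostgreSQL/libpq URI parameter; the helper name here is illustrative):

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def with_connect_timeout(dsn: str, seconds: int) -> str:
    """Return the DSN with a connect_timeout query parameter set,
    preserving any parameters already present."""
    parts = urlsplit(dsn)
    query = dict(parse_qsl(parts.query))
    query["connect_timeout"] = str(seconds)
    return urlunsplit(parts._replace(query=urlencode(query)))

print(with_connect_timeout("postgresql://user:pass@localhost:5432/db", 60))
```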

Migration Logs

Check migration logs for issues:

# View migration output
tail -f migration.log

# Check for errors
grep -i error migration.log

# Check database logs
sudo tail -f /var/log/postgresql/postgresql-13-main.log

Next: Developer Guide - Development setup and contribution guidelines
