GitHub - mdhishaamakhtar/wikiweb-backend

Iris Wikipedia Pathfinder

A high-performance service for discovering shortest paths between Wikipedia pages using optimized graph algorithms

Overview

Iris Wikipedia Pathfinder is a web service that finds the shortest path between any two Wikipedia pages. It runs a breadth-first search (BFS) over Wikipedia's link graph and keeps the search state in Redis rather than in process memory, so memory use per worker stays low and searches can be spread across multiple workers.

The project demonstrates expertise in:

Domain-Driven Design: Clean separation between API, business logic, and infrastructure layers
Distributed Systems: Redis-based queuing and caching for horizontal scalability
Asynchronous Processing: Celery task queues for non-blocking pathfinding operations
Algorithm Optimization: Memory-efficient BFS implementation using external storage
Production-Ready Architecture: Comprehensive error handling, monitoring, and deployment automation

Core Features

✅ Pathfinding Algorithms

Redis-Based BFS: Memory-efficient pathfinding using external Redis queues
Configurable Depth Limits: Prevents infinite searches with customizable depth constraints
Batch Processing: Optimized Wikipedia API usage through intelligent batching
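
The batching idea can be sketched as follows. This is a minimal illustration, not the project's code: the helper names `batch_titles` and `titles_param` are hypothetical, and the default batch size of 50 is an assumption based on the usual per-request cap on `titles` values in the MediaWiki Action API for ordinary clients.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batch_titles(titles: Iterable[str], size: int = 50) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` titles (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(titles)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def titles_param(batch: List[str]) -> str:
    """Join a batch into the pipe-separated form used by the `titles` parameter."""
    return "|".join(batch)
```

Grouping titles this way turns one request per page into one request per batch, which is where the reduction in API calls comes from.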

✅ Scalable Architecture

Asynchronous Task Processing: Non-blocking operations using Celery workers
Distributed Caching: Redis-based caching for Wikipedia API responses
Session Isolation: Concurrent searches with isolated Redis namespaces
Auto-cleanup: Automatic resource cleanup to prevent memory accumulation

✅ Production Features

Health Monitoring: Comprehensive system health checks and metrics
Error Handling: Structured exception hierarchy with detailed error responses
API Validation: Request/response validation using Marshmallow schemas
Rate Limiting: Configurable API rate limiting for resource protection
CORS Support: Cross-origin resource sharing for frontend integration
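
As an illustration of the rate-limiting feature, here is a small sliding-window limiter using only the standard library. It is a sketch under stated assumptions: the class name `SlidingWindowLimiter` and its parameters are hypothetical, and the project's actual limiter may work differently (for example, by storing counters in Redis).

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window` seconds for each client key."""

    def __init__(self, limit: int, window: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so tests can control time
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A request handler would typically call `allow()` with the client's address and answer with HTTP 429 when it returns `False`.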

✅ Interactive Visualization

Web-Based UI: Interactive interface for pathfinding with real-time progress
Graph Visualization: D3.js-powered interactive graph with physics simulation
Mobile Support: Touch-optimized interface that works on mobile devices
State Persistence: Saves progress and resumes interrupted searches
Dynamic Features: Drag-and-drop nodes, responsive layout, smart text truncation

✅ Development Tools

Comprehensive Testing: Unit and integration tests with 100% pass rate
Environment Management: Separate configurations for development, testing, and production
CI/CD Ready: GitHub Actions integration for automated testing and deployment

Quick Start

Development Setup (One Command)

# Clone and setup
git clone <repository-url>
cd iris-web-backend

# Create virtual environment  
python3 -m venv env
source env/bin/activate  # On Windows: env\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Start everything (Redis + Flask + Celery)
./dev.sh

The application will be available at:

Interactive UI: http://localhost:9020 (default landing page)
API Documentation: http://localhost:9020/api

Production Deployment

# Set environment variables
export FLASK_ENV=production
export SECRET_KEY=your-secure-secret-key
export REDIS_URL=redis://localhost:6379/0

# Deploy with startup script
./start.sh

API Documentation

Complete API documentation with examples, request/response schemas, and integration guides is available in API_DOCUMENTATION.md.

Key Endpoints

GET / - Interactive UI (default landing page)
GET /<any-path> - All non-API paths redirect to main UI
POST /getPath - Start pathfinding task (returns task ID for polling)
GET /tasks/status/<task_id> - Poll task status with progress updates
POST /explore - Discover page connections for graph visualization
GET /health - System health monitoring endpoint
GET /api - API documentation and information
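
The start-then-poll flow of `POST /getPath` and `GET /tasks/status/<task_id>` might be driven from a client roughly like this. The endpoint paths and port come from this README; everything else is an assumption made for illustration: the JSON field names (`start`, `end`, `task_id`, `status`) and the terminal status values are guesses, so check API_DOCUMENTATION.md for the real schema.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, Optional

BASE_URL = "http://localhost:9020"  # development address from Quick Start


def http_json(method: str, path: str,
              payload: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Send a JSON request and decode the JSON response."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(
        BASE_URL + path, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_task(task_id: str,
                  fetch: Callable[..., Dict[str, Any]] = http_json,
                  interval: float = 1.0,
                  max_polls: int = 120,
                  sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll the status endpoint until the task reaches a terminal state."""
    for _ in range(max_polls):
        body = fetch("GET", f"/tasks/status/{task_id}")
        # ASSUMPTION: terminal states use Celery's names, SUCCESS and FAILURE.
        if body.get("status") in ("SUCCESS", "FAILURE"):
            return body
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")


# Example usage against a running server (field names are assumed):
# task = http_json("POST", "/getPath", {"start": "Python", "end": "Philosophy"})
# result = wait_for_task(task["task_id"])
```

Passing `fetch` and `sleep` as parameters keeps the polling loop testable without a running server.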

Architecture Highlights

Redis-Based BFS Algorithm

The core pathfinding algorithm keeps its BFS state in Redis rather than in the worker process, which gives it the following properties:

Memory Efficiency: Uses Redis queues instead of in-memory data structures
Horizontal Scalability: Multiple workers can process different search sessions
Session Isolation: Unique Redis namespaces prevent search interference
Automatic Cleanup: Resource cleanup prevents Redis memory accumulation
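
The four properties above can be sketched together in one function. This is a minimal illustration, not the project's implementation: the key layout (`bfs:<session>:queue`, `bfs:<session>:parent`), the function names, and the default depth limit are assumptions. `InMemoryRedis` is a hypothetical stand-in that mirrors the redis-py method names `rpush`, `lpop`, `hsetnx`, `hget`, and `delete`, so the sketch runs without a server; a real client created with `decode_responses=True` should be usable in its place.

```python
import json
import uuid
from collections import defaultdict, deque
from typing import Callable, Iterable, List, Optional


class InMemoryRedis:
    """Tiny stand-in exposing the few redis-py method names used below."""

    def __init__(self) -> None:
        self.lists = defaultdict(deque)
        self.hashes = defaultdict(dict)

    def rpush(self, name, *values):
        self.lists[name].extend(values)
        return len(self.lists[name])

    def lpop(self, name):
        return self.lists[name].popleft() if self.lists[name] else None

    def hsetnx(self, name, key, value):
        if key in self.hashes[name]:
            return 0
        self.hashes[name][key] = value
        return 1

    def hget(self, name, key):
        return self.hashes[name].get(key)

    def delete(self, *names):
        for n in names:
            self.lists.pop(n, None)
            self.hashes.pop(n, None)


def _rebuild(store, parent_key: str, end: str) -> List[str]:
    """Walk parent pointers from `end` back to the root ("" marks the root)."""
    path = [end]
    while True:
        parent = store.hget(parent_key, path[-1])
        if not parent:
            break
        path.append(parent)
    path.reverse()
    return path


def find_path(store, get_links: Callable[[str], Iterable[str]],
              start: str, end: str, max_depth: int = 6) -> Optional[List[str]]:
    session = uuid.uuid4().hex                # session isolation
    queue_key = f"bfs:{session}:queue"        # assumed key layout
    parent_key = f"bfs:{session}:parent"
    try:
        store.hsetnx(parent_key, start, "")
        store.rpush(queue_key, json.dumps([start, 0]))
        while True:
            raw = store.lpop(queue_key)
            if raw is None:
                return None                   # frontier exhausted, no path
            page, depth = json.loads(raw)
            if page == end:
                return _rebuild(store, parent_key, end)
            if depth >= max_depth:
                continue                      # depth limit: do not expand
            for link in get_links(page):
                # hsetnx doubles as the visited check and the parent pointer.
                if store.hsetnx(parent_key, link, page):
                    store.rpush(queue_key, json.dumps([link, depth + 1]))
    finally:
        store.delete(queue_key, parent_key)   # automatic cleanup
```

Because every key carries the session id, two searches running on different workers never read each other's queue, and the `finally` block removes both keys whether the search succeeds, fails, or raises.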

Service Layer Architecture

Dependency Injection: Service factory pattern with proper abstractions
Interface Segregation: Clear contracts between components
Error Propagation: Structured exception handling throughout the stack
Configuration Management: Environment-specific settings with validation
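
The dependency-injection and interface points can be illustrated with a small sketch. The README's test notes mention a `ServiceFactory`, but its real interface is not shown here, so every method and class below (`LinkSource`, `StaticLinkSource`, `PathService`, `path_service`) is hypothetical.

```python
from typing import Dict, List, Optional, Protocol


class LinkSource(Protocol):
    """Contract the service depends on; any object with `links` satisfies it."""

    def links(self, title: str) -> List[str]: ...


class StaticLinkSource:
    """Test double backed by a dictionary instead of the Wikipedia API."""

    def __init__(self, graph: Dict[str, List[str]]) -> None:
        self.graph = graph

    def links(self, title: str) -> List[str]:
        return self.graph.get(title, [])


class PathService:
    """Business logic that only knows about the LinkSource contract."""

    def __init__(self, source: LinkSource) -> None:
        self.source = source

    def neighbours(self, title: str) -> List[str]:
        return sorted(set(self.source.links(title)))


class ServiceFactory:
    """Builds services once and hands out the same instance afterwards."""

    def __init__(self, source: LinkSource) -> None:
        self._source = source
        self._path_service: Optional[PathService] = None

    def path_service(self) -> PathService:
        if self._path_service is None:
            self._path_service = PathService(self._source)
        return self._path_service
```

Swapping `StaticLinkSource` for a real API client changes nothing in `PathService`, which is the practical benefit of coding against the contract rather than the concrete class.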

Testing & Quality Assurance

# Run comprehensive test suite
pytest -v

# Run with coverage reporting (console + HTML)
pytest --cov=app --cov-report=term-missing --cov-report=html

# Test specific components
pytest tests/unit/ -v      # Unit tests
pytest tests/integration/ -v  # Integration tests

Current test coverage: 107 tests passing with approximately 80% line coverage across the app/ package (see htmlcov/index.html after running coverage for a browsable report).

Key areas covered by new tests:

Cache and queue infrastructure with Redis client mocking
ServiceFactory lifecycle and Celery task configuration helpers
Wikipedia client parsing, batching, and request handling with a fake session
API middleware decorators (error handling, CORS, rate limiting, size checks)
Logging configuration, including file handler setup for non-testing environments
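
A test in the "fake session" style might look like the sketch below. It is not taken from the project's test suite: `fetch_links`, `FakeSession`, and `FakeResponse` are hypothetical names, and the client shown handles only the first page of results (no continuation). The request parameters and the `query.pages.*.links` response shape follow the MediaWiki Action API's default JSON format.

```python
from typing import Any, List

API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_links(session: Any, title: str) -> List[str]:
    """Return titles linked from `title` (first result page only)."""
    params = {
        "action": "query",
        "prop": "links",
        "titles": title,
        "plnamespace": 0,
        "pllimit": "max",
        "format": "json",
    }
    response = session.get(API_URL, params=params, timeout=10)
    response.raise_for_status()
    pages = response.json().get("query", {}).get("pages", {})
    return [link["title"] for page in pages.values()
            for link in page.get("links", [])]


class FakeResponse:
    """Mimics the two response methods the client calls."""

    def __init__(self, payload: dict) -> None:
        self._payload = payload

    def raise_for_status(self) -> None:
        return None

    def json(self) -> dict:
        return self._payload


class FakeSession:
    """Records calls and returns a canned payload instead of using the network."""

    def __init__(self, payload: dict) -> None:
        self.payload = payload
        self.calls = []

    def get(self, url, params=None, timeout=None):
        self.calls.append((url, params))
        return FakeResponse(self.payload)


def test_fetch_links_parses_titles():
    payload = {"query": {"pages": {"1": {"title": "A", "links": [
        {"ns": 0, "title": "B"}, {"ns": 0, "title": "C"}]}}}}
    session = FakeSession(payload)
    assert fetch_links(session, "A") == ["B", "C"]
    assert session.calls[0][1]["titles"] == "A"
```

Injecting the session keeps the test fast and deterministic, since no request ever leaves the process.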

Contributors

This project was developed by:

Md Hishaam Akhtar - GitHub | LinkedIn
Sharanya Mukherjee - GitHub | LinkedIn

Made with ❤️ by DSC VIT
