A modular ML framework for creating machine learning dashboards with API serving, experiment tracking, and LLM integration (planned). Built for rapid prototyping and easy adaptation across domains. Intended as a platform for learning new techniques and technologies.
```shell
# Clone and setup
git clone https://github.com/TomTonroe/modular-ml-platform
cd modular-ml-platform
conda create -n your_project python=3.12.8 && conda activate your_project

# Install and configure
make install
cp .env.example .env
# Edit .env to set PROJECT_NAME, PROJECT_DATA_PATH, LLM_PROVIDER

# Initialize and run
make db-upgrade
python -m src.train --task your_task --register-model

# Start services
make api &          # FastAPI server (localhost:8000)
make ui &           # Streamlit dashboard (localhost:8501)
make mlflow-server  # MLflow UI (localhost:5000)
```

Windows users: see the Makefile for equivalent commands.
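The three variables called out above might look like this in `.env` (values are illustrative; see `.env.example` for the full set of options):

```shell
PROJECT_NAME=myproj              # drives the {project}_*.py naming convention
PROJECT_DATA_PATH=./data/myproj  # where loaders read data and caches are written
LLM_PROVIDER=openai              # LLM integration is planned; value is a placeholder
```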
To adapt for your domain, modify only these directories:
```
src/
├── data/{project}_loader.py        # Your data loading logic
├── features/{project}_features.py  # Your feature engineering (optional)
└── models/{project}_models.py      # Your ML tasks and models
```
Everything else stays the same: the core framework, API, and dashboard work with your domain-specific code automatically. Extensions such as new dashboard panels or API endpoints can be added by following the existing patterns.
- Automatic Model Discovery: MLflow integration with auto-generated API endpoints
- Modular Dashboard: Add panels by creating files in `frontend/streamlit/panels/`
- Multi-source Data: CSV, API, and database loading with Parquet caching
- Experiment Tracking: MLflow model registry and versioning
- Generic Core Framework: Easily adaptable to any ML domain
For a working example, see the sparcs-example branch which implements a platform around a dataset of hospital admissions made available by the New York State Department of Health.
```shell
git checkout sparcs-example
```

Follow the naming convention: `{project}_*.py`, where `{project}` matches your `PROJECT_NAME` in `.env`.
Required Files:
- `src/data/{project}_loader.py` - Must have `load_{project}_data(data_source, nrows=None) -> pd.DataFrame`
- `src/models/{project}_models.py` - Must have an `AVAILABLE_TASKS` dict and a `train_task()` function

Optional:
- `src/features/{project}_features.py` - Domain-specific transformations

See the example files (`*_ex.py`, `example_models.py`) for exact interfaces and implementations.
```
src/
├── core/                       # Framework (don't modify)
├── data/                       # YOUR PROJECT: Data loading
│   ├── *_loader_ex.py          # Examples provided
│   └── {project}_loader.py     # Your loader
├── models/                     # YOUR PROJECT: ML models
│   ├── example_models.py       # Examples provided
│   └── {project}_models.py     # Your models
├── features/                   # YOUR PROJECT: Features (optional)
└── train.py                    # Training script

frontend/streamlit/panels/      # Dashboard components
├── data_explorer/              # Built-in panels
├── model_inference/
├── database_analytics/
└── {project}_dashboard/        # Your custom panels
```
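One way a custom panel might look, assuming panels are Python modules with a render entry point; the file name, `render` signature, and widget choices here are guesses, so check the built-in panels (e.g. `data_explorer/`) for the framework's real interface:

```python
# frontend/streamlit/panels/myproj_dashboard/panel.py  (illustrative layout)
import pandas as pd

def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Pure helper: count/mean table the panel will display."""
    return df.describe().loc[["count", "mean"]]

def render(df: pd.DataFrame) -> None:
    """Hypothetical panel entry point; Streamlit is imported lazily so the
    data-shaping helper above stays testable without a running dashboard."""
    import streamlit as st
    st.header("My Project Overview")        # hypothetical panel title
    st.dataframe(summarize(df))             # summary table
    st.bar_chart(df.select_dtypes("number"))  # quick numeric overview
```

Keeping data shaping in plain functions and confining Streamlit calls to `render` mirrors the panel-per-directory layout above and keeps panel logic unit-testable.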
- Try the SPARCS Example: Switch to the `sparcs-example` branch to see a complete implementation
- Start with Data: Create `src/data/your_domain_loader.py` following the example patterns
- Add Your Models: Create training tasks in `src/models/your_domain_models.py`
- Build Domain Panels: Add custom analytics in `frontend/streamlit/panels/your_dashboard/`
- Scale to Production: Switch to PostgreSQL, remote MLflow, and production LLM APIs
```shell
# Training
cd src && python -m train --list                             # List available tasks
cd src && python -m train --task your_task                   # Train model
cd src && python -m train --task your_task --register-model  # Train + register

# Services
make api            # Start API server
make ui             # Start dashboard
make mlflow-server  # Start MLflow UI

# Database
make db-upgrade     # Apply migrations
make db-migrate     # Create new migration

# Code Quality
make lint           # Check code quality
make fmt            # Format code
make clean          # Clean artifacts
```

Windows: review the Makefile for equivalent commands.