BayesPilot is a production-style machine learning system for churn risk estimation and retention decision support.
It is designed to show how an ML model moves from training to operational use, where predictions are translated into actions, logged, and evaluated against business context.
Customer churn prediction is useful only when it informs retention action.
In practice, teams must decide who to contact, when to intervene, and how to prioritize limited retention capacity.
BayesPilot frames this as a decision-support problem:
- estimate churn probability for each customer
- map probability to a retention action tier
- keep the pipeline reproducible and measurable
- monitor serving behavior in a production-style API flow
Stage 1 establishes a reliable baseline:
- a single scikit-learn pipeline combines preprocessing and logistic regression
- training is config-driven and tracked in MLflow
- one serialized artifact (churn_pipeline.pkl) is produced for consistent inference
Run:
python -m training.train
Primary output:
models/artifacts/churn_pipeline.pkl
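The Stage 1 idea of one pipeline object that is serialized and reloaded for inference can be sketched as follows. This is a minimal illustration, not the contents of training/train.py: the feature names, the synthetic data, and the build_pipeline helper are assumptions, and MLflow tracking and config loading are left out.

```python
import pickle

import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical feature names; the real schema comes from the project's config.
NUMERIC = ["tenure_months", "monthly_charges"]
CATEGORICAL = ["contract_type"]


def build_pipeline() -> Pipeline:
    """Bundle preprocessing and the classifier so inference reuses the same transforms."""
    preprocess = ColumnTransformer(
        [
            ("num", StandardScaler(), NUMERIC),
            ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
        ]
    )
    return Pipeline(
        [("preprocess", preprocess), ("model", LogisticRegression(max_iter=1000))]
    )


# Tiny synthetic frame, for illustration only.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame(
    {
        "tenure_months": rng.integers(1, 72, n),
        "monthly_charges": rng.uniform(20, 120, n),
        "contract_type": rng.choice(["monthly", "annual"], n),
    }
)
y = (df["tenure_months"] < 12).astype(int)  # toy label: short tenure churns

pipeline = build_pipeline().fit(df, y)

# Serialize the whole pipeline as one artifact, then reload it for inference.
blob = pickle.dumps(pipeline)
restored = pickle.loads(blob)
churn_proba = restored.predict_proba(df.head(3))[:, 1]
```

Because the preprocessing steps travel inside the artifact, the serving side only needs to load one file and call predict_proba on raw feature rows.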
Stage 2 upgrades the baseline into a structured model selection workflow:
- trains multiple candidate models under shared preprocessing
- evaluates and compares models with common metrics
- calibrates probabilities
- runs threshold evaluation
- selects and stores a deployment artifact
Candidate models:
- logistic regression
- random forest
- gradient boosting
Run:
python -m training.train_stage2
Primary outputs:
- per-model artifacts in models/artifacts/
- selected deployment artifact: models/artifacts/deployed_pipeline.pkl
- comparison/evaluation outputs in reports/stage2/
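The Stage 2 steps (shared preprocessing, common metrics, calibration, threshold evaluation, selection) can be sketched roughly as below. The data is synthetic, and the choice of sigmoid calibration, ROC AUC as the selection rule, and the three thresholds are all assumptions made for illustration; the actual training/train_stage2 module may differ on each point.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    brier_score_loss,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced data standing in for the churn dataset.
X, y = make_classification(
    n_samples=600, n_features=8, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

results = {}
fitted = {}
for name, estimator in candidates.items():
    # Every candidate sits behind the same preprocessing step.
    pipe = Pipeline([("scale", StandardScaler()), ("model", estimator)])
    # Cross-validated sigmoid calibration (an assumed choice).
    calibrated = CalibratedClassifierCV(pipe, method="sigmoid", cv=3)
    calibrated.fit(X_train, y_train)
    proba = calibrated.predict_proba(X_test)[:, 1]
    results[name] = {
        "roc_auc": roc_auc_score(y_test, proba),
        "brier": brier_score_loss(y_test, proba),
    }
    fitted[name] = calibrated

# Assumed selection rule: highest ROC AUC on the held-out split.
selected = max(results, key=lambda model_name: results[model_name]["roc_auc"])

# Threshold evaluation for the selected model.
selected_proba = fitted[selected].predict_proba(X_test)[:, 1]
threshold_report = []
for threshold in (0.3, 0.5, 0.7):
    pred = (selected_proba >= threshold).astype(int)
    threshold_report.append(
        {
            "threshold": threshold,
            "precision": precision_score(y_test, pred, zero_division=0),
            "recall": recall_score(y_test, pred, zero_division=0),
        }
    )
```

Reporting the Brier score next to ROC AUC keeps calibration quality visible during comparison, which matters when probabilities feed a thresholded decision.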
BayesPilot follows a modular training-to-serving architecture:
- Data generation and configuration
- Training pipeline (training/)
- Experiment tracking (MLflow)
- Model artifacts (models/artifacts/)
- Deployment API (app/api/main.py)
- Decision logic (app/services/decision.py)
- Monitoring and logs (app/monitoring/, logs/predictions.jsonl)
Inference flow:
Request payload
-> deployed pipeline (preprocessing + model)
-> churn probability
-> decision tier
-> logged prediction and latency
-> API response
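The flow above can be traced in a few lines of plain Python. This is a sketch under stated assumptions: StubPipeline stands in for the deployed artifact, and handle_request, its parameters, and the response field names are hypothetical rather than taken from app/api/main.py.

```python
import time
from typing import Any, Callable


class StubPipeline:
    """Stand-in for the deployed pipeline; always returns the same probabilities."""

    def predict_proba(self, rows: list[dict[str, Any]]) -> list[list[float]]:
        # One [P(no churn), P(churn)] pair per input row.
        return [[0.3, 0.7] for _ in rows]


def handle_request(
    payload: dict[str, Any],
    pipeline: Any,
    decide: Callable[[float], str],
    log: list[dict[str, Any]],
) -> dict[str, Any]:
    """Request payload -> probability -> decision tier -> log entry -> response."""
    start = time.perf_counter()
    # A real scikit-learn pipeline would typically receive a DataFrame here.
    probability = float(pipeline.predict_proba([payload])[0][1])
    tier = decide(probability)
    latency_ms = (time.perf_counter() - start) * 1000.0
    log.append(
        {
            "churn_probability": probability,
            "decision_tier": tier,
            "latency_ms": latency_ms,
        }
    )
    return {"churn_probability": probability, "decision_tier": tier}


prediction_log: list[dict[str, Any]] = []
response = handle_request(
    {"tenure_months": 3},
    StubPipeline(),
    decide=lambda p: "review" if p >= 0.5 else "monitor",  # placeholder policy
    log=prediction_log,
)
```

Passing the decision function in as an argument keeps the policy separate from the model call, so either can be swapped or tested on its own.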
A probability alone is not an operational decision.
BayesPilot includes a decision layer that converts model output into action categories (for example, monitor vs review vs escalate).
This matters because it:
- makes model output actionable for retention teams
- creates a transparent policy surface that can be tested and revised
- supports alignment between ML performance and operational constraints
The current implementation uses fixed thresholds defined in app/services/decision.py.
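A fixed-threshold policy of this kind might look like the sketch below. The DecisionPolicy class and the cut-off values 0.4 and 0.7 are assumptions for illustration; the real thresholds live in app/services/decision.py and are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionPolicy:
    """Maps a churn probability to an action tier using two fixed cut-offs."""

    review_threshold: float = 0.4    # hypothetical value
    escalate_threshold: float = 0.7  # hypothetical value

    def __post_init__(self) -> None:
        if not 0.0 <= self.review_threshold <= self.escalate_threshold <= 1.0:
            raise ValueError("thresholds must satisfy 0 <= review <= escalate <= 1")

    def decide(self, probability: float) -> str:
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be in [0, 1]")
        if probability >= self.escalate_threshold:
            return "escalate"
        if probability >= self.review_threshold:
            return "review"
        return "monitor"


policy = DecisionPolicy()
tiers = [policy.decide(p) for p in (0.1, 0.4, 0.7)]
```

Keeping the cut-offs as named fields makes the policy easy to unit test and to revise without touching the model.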
The system includes lightweight production-oriented monitoring:
- prediction logging to logs/predictions.jsonl
- latency tracking in the serving layer
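Append-only JSONL logging with a latency field can be sketched as follows. The record fields and the log_prediction and read_predictions helpers are hypothetical; the project's own schema in app/monitoring/ may differ. The example writes to a temporary directory rather than the real logs/ folder.

```python
import json
import tempfile
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Any


def log_prediction(path: Path, probability: float, tier: str, latency_ms: float) -> None:
    """Append one prediction as a single JSON line."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "churn_probability": probability,
        "decision_tier": tier,
        "latency_ms": round(latency_ms, 3),
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def read_predictions(path: Path) -> list[dict[str, Any]]:
    """Load every logged prediction back into memory."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "logs" / "predictions.jsonl"
    for probability, tier in ((0.55, "review"), (0.91, "escalate")):
        start = time.perf_counter()
        # ... model inference would run here ...
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        log_prediction(log_path, probability, tier, elapsed_ms)
    records = read_predictions(log_path)
    mean_latency_ms = sum(r["latency_ms"] for r in records) / len(records)
```

One JSON object per line keeps the log appendable and easy to load later for latency summaries or tier counts.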
Evaluation and reporting include:
- per-model metrics and comparison files in reports/stage2/
- selection record in reports/stage2/selected_model.json
BayesPilot goes beyond a notebook-style model demo by emphasizing:
- reproducible pipelines and artifact discipline
- controlled model comparison and deployment selection
- calibration and threshold analysis, not just raw accuracy
- explicit decision layer between prediction and action
- API serving with monitoring and test coverage
- clean modular boundaries across training, serving, and operations
Current limitations:
- decision policy is threshold-based rather than driven by explicit expected-value optimization
- monitoring is foundational and does not yet include drift or outcome feedback loops
- interpretability outputs are limited to current reporting artifacts
Practical next steps:
- introduce expected-value decision logic with business cost/benefit parameters
- add business-aware evaluation metrics for retention impact
- expand interpretability reporting for deployment decisions
- strengthen monitoring with risk signals, drift checks, and closed-loop outcomes
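To make the expected-value next step concrete, one possible formulation treats a retention contact as worthwhile when the churn probability times the chance of saving the customer times the customer's value exceeds the contact cost. This is not implemented in BayesPilot; the function names and every parameter (customer_value, save_rate, contact_cost, capacity) are hypothetical business inputs.

```python
def expected_value_of_contact(
    churn_probability: float,
    customer_value: float,
    save_rate: float,
    contact_cost: float,
) -> float:
    """Expected net gain from contacting one customer (all inputs are assumptions)."""
    return churn_probability * save_rate * customer_value - contact_cost


def prioritize(customers: list[dict], capacity: int) -> list[str]:
    """Return up to `capacity` customer ids with positive expected value, best first."""
    scored = [
        (
            expected_value_of_contact(
                c["churn_probability"],
                c["customer_value"],
                c["save_rate"],
                c["contact_cost"],
            ),
            c["customer_id"],
        )
        for c in customers
    ]
    worthwhile = sorted((s for s in scored if s[0] > 0), reverse=True)
    return [customer_id for _, customer_id in worthwhile[:capacity]]


example_ev = expected_value_of_contact(0.5, 200.0, 0.25, 10.0)

queue = prioritize(
    [
        {"customer_id": "a", "churn_probability": 0.5, "customer_value": 200.0,
         "save_rate": 0.25, "contact_cost": 10.0},
        {"customer_id": "b", "churn_probability": 0.1, "customer_value": 200.0,
         "save_rate": 0.25, "contact_cost": 10.0},
        {"customer_id": "c", "churn_probability": 0.9, "customer_value": 400.0,
         "save_rate": 0.25, "contact_cost": 10.0},
    ],
    capacity=2,
)
```

Unlike a fixed threshold, this ranks customers by expected net gain, so a limited contact budget goes to the cases where intervention is expected to pay off most.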
Setup:
python -m venv venv_bayespilot
source venv_bayespilot/bin/activate
pip install -r requirements.txt
Generate synthetic dataset:
python scripts/generate_churn_data.py
Run API:
uvicorn app.api.main:app --reload
Run tests:
pytest