LeukemiaVision

Early leukemia detection through explainable computer vision

An AI-powered system that detects leukemia cells from microscopic blood images and provides interpretable explanations for medical professionals. This project combines state-of-the-art deep learning with clinical explainability to support early diagnosis.

🎯 Project Goal

Early detection of leukemia is critical for successful treatment. This project aims to:

Detect leukemia cells from microscopic images with high accuracy
Explain which visual features the AI model uses for classification (Grad-CAM heatmaps)
Communicate findings in natural language through LLM-generated medical explanations

The system is designed to be a decision support tool for hematologists, not a replacement for medical expertise.

🏆 Results & Achievements

Timeline

January 29, 2026 - Achieved 97.66% accuracy with ResNet18 on C-NMC dataset (SOTA-level performance)
January 22, 2026 - Completed data pipeline and EDA-driven augmentation strategy

ResNet18 Classification Performance

The ResNet18 model reaches SOTA-level performance on the C-NMC dataset (Classification of Normal vs. Malignant Cells):

Metric	Score	Clinical Relevance
Accuracy	97.66%	Overall diagnostic accuracy
Recall (Sensitivity)	98.32%	Only 1.68% of leukemia cases missed ⚠️
Precision	98.18%	High confidence in positive diagnoses
F1-Score	98.25%	Balanced performance

🎯 Clinical Impact: With 98.32% sensitivity, the model flags roughly 98 of every 100 leukemic cell images, which matters most in screening applications where minimizing false negatives is critical.
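
As a quick consistency check on the table above, the F1-score is the harmonic mean of precision and recall, and the miss rate is one minus recall. The sketch below recomputes both from the reported percentages. The function names are illustrative and not part of the project's code, and `metrics_from_counts` is only included to show how the same numbers would be derived from confusion-matrix counts, which this README does not report.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary metrics from confusion-matrix counts (hypothetical helper)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Values reported in the table above, converted to fractions.
reported_precision = 0.9818
reported_recall = 0.9832

f1_percent = round(100 * f1_score(reported_precision, reported_recall), 2)
miss_rate_percent = round(100 * (1 - reported_recall), 2)
print(f1_percent, miss_rate_percent)
```

Running this reproduces the 98.25% F1-score and the 1.68% miss rate quoted in the table, so the reported metrics are internally consistent.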

✨ Key Features

🧠 Deep Learning Classification: ResNet/EfficientNet/ViT architectures for cell classification
🔍 Visual Explainability: Grad-CAM heatmaps highlighting relevant cellular features
💬 Natural Language Explanations: LLM-powered descriptions of diagnostic reasoning
📊 Medical-Grade Metrics: Precision, Recall, F1-Score, AUC-ROC optimized for clinical use
🎨 Interactive Demo: Streamlit/Gradio interface for easy testing

📁 Project Structure

leukocare-ai/
├── data/                      # Dataset storage
├── notebooks/                 # Jupyter notebooks for exploration
│   ├── 01_eda.ipynb          # Data exploration
│   ├── 02_preprocessing.ipynb
│   ├── 03_modeling.ipynb
│   └── 04_explainability.ipynb
├── src/                       # Source code (production-ready)
│   ├── data/                 # Data loading and augmentation
│   ├── models/               # Model architectures
│   ├── training/             # Training loops and metrics
│   ├── explainability/       # Grad-CAM, SHAP, visualization
│   └── llm/                  # LLM explanation generation
├── scripts/                   # Executable scripts
│   ├── train.py              # Training script
│   ├── evaluate.py           # Model evaluation
│   └── inference.py          # Single image inference
├── configs/                   # Configuration files
├── tests/                     # Unit tests
└── outputs/                   # Model checkpoints and results

Vision Transformer (ViT): State-of-the-art attention mechanism

Training Strategy

Transfer Learning: Pre-trained on ImageNet
Fine-tuning: All layers unfrozen after initial training
Loss Function: Focal Loss (handles class imbalance)
Optimizer: Adam with a cosine annealing learning-rate schedule
Augmentation: Medical-specific (rotations, color jitter, no vertical flips)
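
To make two of the items above concrete, here is a minimal, framework-free sketch of the focal loss for a single binary prediction and of a cosine annealing learning-rate schedule. It is not the project's training code: the function names are hypothetical, and the defaults `gamma=2.0` and `alpha=0.25` are the values suggested in the focal loss paper, not necessarily the ones used here.

```python
import math


def binary_focal_loss(p_positive: float, label: int,
                      gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Focal loss for one sample: -alpha_t * (1 - p_t)**gamma * log(p_t).

    p_positive is the predicted probability of the positive class,
    label is 1 for positive and 0 for negative.
    """
    eps = 1e-12  # guards against log(0)
    p_t = p_positive if label == 1 else 1.0 - p_positive
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def cosine_annealing_lr(step: int, total_steps: int,
                        lr_max: float, lr_min: float = 0.0) -> float:
    """Learning rate that decays from lr_max to lr_min along half a cosine."""
    cosine = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


# A confidently correct sample contributes far less loss than a hard one.
easy = binary_focal_loss(0.95, label=1)
hard = binary_focal_loss(0.30, label=1)
print(easy < hard)

# The schedule starts at lr_max and ends at lr_min.
schedule = [cosine_annealing_lr(s, 10, lr_max=1e-3, lr_min=1e-5) for s in range(11)]
print(schedule[0], schedule[-1])
```

The `(1 - p_t)**gamma` factor is what down-weights easy samples, which is why focal loss is a common choice when one class dominates the training set.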

🔍 Explainability

Visual Explanations

Grad-CAM (Gradient-weighted Class Activation Mapping)

Generates heatmaps showing which regions of the cell image influenced the model's decision.

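from src.explainability.gradcam import GradCAM

# `model` is the trained classifier and `image` a preprocessed input image.
# 'layer4' is the last residual stage in torchvision-style ResNets.
cam = GradCAM(model, target_layer='layer4')
# Generate the heatmap for the class index of interest
heatmap = cam.generate_cam(image, target_class=1)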
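
For readers unfamiliar with what happens inside such a call, the core Grad-CAM computation can be sketched in plain Python. This is an illustration of the published method, not the contents of `src/explainability/gradcam.py`: the function name `grad_cam` and the nested-list inputs are assumptions made so the example runs without a deep learning framework. In practice the activations and gradients would be captured from the target layer with forward and backward hooks.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM map from one layer's feature maps.

    activations, gradients: nested lists shaped [K][H][W], where
    gradients holds d(class score)/d(activation) for the same layer.
    """
    num_maps = len(activations)
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weights: global-average-pool the gradients of each feature map.
    weights = [
        sum(sum(row) for row in gradients[k]) / (height * width)
        for k in range(num_maps)
    ]

    # Weighted sum of feature maps, then ReLU to keep positive evidence only.
    cam = [
        [
            max(0.0, sum(weights[k] * activations[k][i][j] for k in range(num_maps)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale to [0, 1] for display; an all-zero map is returned unchanged.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Toy input: two 2x2 feature maps, one with positive and one with negative gradients.
toy_activations = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
toy_gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
toy_cam = grad_cam(toy_activations, toy_gradients)
print(toy_cam)
```

In the toy input, only the top-left location survives: the second feature map has a negative weight, so its contribution is removed by the ReLU. The resulting low-resolution map is typically upsampled to the image size before being overlaid as a heatmap.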

Key Features Analyzed:

Nucleus morphology (size, shape, chromatin pattern)
Cytoplasm characteristics
Nuclear-cytoplasmic ratio
Presence of granulations

Natural Language Explanations

LLM-powered explanations translate visual features into clinical language:

"The model classifies this cell as LEUKEMIC (confidence: 94.2%) 
based on the following observations:

1. Enlarged nucleus with irregular chromatin pattern (highlighted 
   in red on the heatmap)
2. High nuclear-cytoplasmic ratio characteristic of blast cells
3. Absence of normal granulations in the cytoplasm

These features are consistent with acute lymphoblastic leukemia (ALL) 
morphology. Clinical correlation and additional testing recommended."
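
One way such an explanation could be requested is by assembling the prediction and the features highlighted by the heatmap into a text prompt. The sketch below shows only that assembly step. The function name `build_explanation_prompt`, its parameters, and the prompt wording are all hypothetical, and no particular LLM API is assumed; the returned string would be passed to whichever client the project uses.

```python
def build_explanation_prompt(label: str, confidence: float,
                             highlighted_features: list) -> str:
    """Assemble a prompt asking an LLM to explain a prediction (hypothetical helper).

    label: predicted class name, e.g. "LEUKEMIC"
    confidence: predicted probability as a fraction in [0, 1]
    highlighted_features: short descriptions of regions emphasized by the heatmap
    """
    feature_lines = "\n".join(
        f"{index}. {feature}"
        for index, feature in enumerate(highlighted_features, start=1)
    )
    return (
        "You are assisting a hematologist reviewing a microscopic blood cell image.\n"
        f"The classifier predicts: {label} (confidence: {confidence:.1%}).\n"
        "Regions emphasized by the Grad-CAM heatmap:\n"
        f"{feature_lines}\n"
        "Explain in clinical language how these observations relate to the "
        "prediction, and recommend clinical correlation and additional testing."
    )


prompt = build_explanation_prompt(
    label="LEUKEMIC",
    confidence=0.942,
    highlighted_features=[
        "Enlarged nucleus with irregular chromatin pattern",
        "High nuclear-cytoplasmic ratio",
        "Absence of normal granulations in the cytoplasm",
    ],
)
print(prompt)
```

Keeping the prompt construction in a pure function like this makes it easy to unit-test the wording separately from the model and the LLM call.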

🗺️ Development Roadmap

Phase 1: Data & Exploration 📊

Setup project structure
Download datasets (ALL-IDB, C-NMC)
Notebook 01: EDA - visualize images, analyze class distribution
Notebook 02: Preprocessing - normalization, augmentation, train/val/test split

Phase 2: Baseline Model 🧠

Build data pipeline (src/data/)
Notebook 03: Train ResNet50 baseline
Implement training loop with metrics
First evaluation on validation set

Phase 3: Model Optimization 🚀

Notebook 03: Test EfficientNet & ViT
Hyperparameter tuning (LR, optimizer, augmentation)
Select best model and evaluate on test set

Phase 4: Explainability 🔍

Notebook 04: Implement Grad-CAM
Generate heatmaps for validation set
Validate model attention on medical features
(Optional) Test SHAP

Phase 5: LLM Integration 💬

Notebook 05: Setup and test LLM API
Design explanation prompts
Build pipeline: Image → Prediction → Heatmap → LLM → Explanation
Refine explanation quality

Phase 6: Demo & Polish 🎨

Build Streamlit/Gradio demo
Write production scripts (train.py, inference.py)
Complete documentation
Create presentation

🤝 Contributing

Contributions are welcome! Please read our Contributing Guidelines.

Development Setup

# Install dev dependencies
pip install -r requirements-dev.txt

# Run tests
pytest tests/

# Code formatting
black src/
flake8 src/

⚠️ Medical Disclaimer

THIS SOFTWARE IS FOR RESEARCH AND EDUCATIONAL PURPOSES ONLY.

❌ NOT approved as a medical device
❌ NOT validated for clinical diagnosis
❌ NOT a replacement for professional medical judgment
❌ NOT suitable for patient care without proper validation

Always consult qualified healthcare professionals for medical decisions.

📚 References

Datasets

Labati et al. (2011). "ALL-IDB: Acute Lymphoblastic Leukemia Image Database"
Gupta et al. (2019). "C-NMC Challenge Dataset"

Methods

Selvaraju et al. (2017). "Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization"
Lin et al. (2017). "Focal Loss for Dense Object Detection"
Dosovitskiy et al. (2020). "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale"

Medical Context

Terwilliger & Abdul-Hay (2017). "Acute lymphoblastic leukemia: a comprehensive review"
WHO Classification of Tumours of Haematopoietic and Lymphoid Tissues (2017)

👨‍💻 Author

Charlie - ML Engineer & Computer Vision Specialist

GitHub: @Ekliipce
LinkedIn: Charles-André Arsenec
Project: WearIT Paris - AI-powered virtual try-on

📧 Contact

For questions, suggestions, or collaboration opportunities:

Email: [email protected]
Open an issue on GitHub

**⭐ Star this repo if you find it useful!**
