An AI-powered system that detects leukemia cells from microscopic blood images and provides interpretable explanations for medical professionals. This project combines state-of-the-art deep learning with clinical explainability to support early diagnosis.

Ekliipce/LeukemiaVision

LeukemiaVision

Python 3.8+ PyTorch

Early leukemia detection through explainable computer vision

An AI-powered system that detects leukemia cells from microscopic blood images and provides interpretable explanations for medical professionals. This project combines state-of-the-art deep learning with clinical explainability to support early diagnosis.

🎯 Project Goal

Early detection of leukemia is critical for successful treatment. This project aims to:

  1. Detect leukemia cells from microscopic images with high accuracy
  2. Explain which visual features the AI model uses for classification (Grad-CAM heatmaps)
  3. Communicate findings in natural language through LLM-generated medical explanations

The system is designed to be a decision support tool for hematologists, not a replacement for medical expertise.

🏆 Results & Achievements

Timeline

  • January 29, 2026 - Achieved 97.66% accuracy with ResNet18 on C-NMC dataset (SOTA-level performance)
  • January 22, 2026 - Completed data pipeline and EDA-driven augmentation strategy

ResNet18 Classification Performance

The ResNet18 model achieves SOTA-level performance on the C-NMC dataset (Classification of Normal vs. Malignant Cells):

| Metric | Score | Clinical Relevance |
|---|---|---|
| Accuracy | 97.66% | Overall diagnostic accuracy |
| Recall (Sensitivity) | 98.32% | Only 1.68% of leukemia cases missed ⚠️ |
| Precision | 98.18% | High confidence in positive diagnoses |
| F1-Score | 98.25% | Balanced performance |

🎯 Clinical Impact: With 98.32% sensitivity, the model successfully detects 98 out of 100 leukemia cases, making it highly suitable for screening applications where minimizing false negatives is critical.
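As a minimal sketch of how the metrics in the table relate to each other, the snippet below derives them from confusion-matrix counts. The function names and the counts are hypothetical and chosen for illustration only; they are not the project's evaluation code or its actual results.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive accuracy, recall, precision and F1 from confusion-matrix counts.

    The positive class is assumed to be "leukemic".
    """
    total = tp + fp + tn + fn
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # sensitivity
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "recall": recall,
        "precision": precision,
        "f1": f1_score(precision, recall),
    }


# Hypothetical counts, for illustration only (not the C-NMC results).
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The same formula ties the reported numbers together: `f1_score(0.9818, 0.9832)` rounds to 0.9825, which matches the F1-Score row.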

✨ Key Features

  • 🧠 Deep Learning Classification: ResNet/EfficientNet/ViT architectures for cell classification
  • 🔍 Visual Explainability: Grad-CAM heatmaps highlighting relevant cellular features
  • 💬 Natural Language Explanations: LLM-powered descriptions of diagnostic reasoning
  • 📊 Medical-Grade Metrics: Precision, Recall, F1-Score, AUC-ROC optimized for clinical use
  • 🎨 Interactive Demo: Streamlit/Gradio interface for easy testing

📁 Project Structure

leukocare-ai/
├── data/                      # Dataset storage
├── notebooks/                 # Jupyter notebooks for exploration
│   ├── 01_eda.ipynb           # Data exploration
│   ├── 02_preprocessing.ipynb
│   ├── 03_modeling.ipynb
│   └── 04_explainability.ipynb
├── src/                       # Source code (production-ready)
│   ├── data/                  # Data loading and augmentation
│   ├── models/                # Model architectures
│   ├── training/              # Training loops and metrics
│   ├── explainability/        # Grad-CAM, SHAP, visualization
│   └── llm/                   # LLM explanation generation
├── scripts/                   # Executable scripts
│   ├── train.py               # Training script
│   ├── evaluate.py            # Model evaluation
│   └── inference.py           # Single image inference
├── configs/                   # Configuration files
├── tests/                     # Unit tests
└── outputs/                   # Model checkpoints and results

  • Vision Transformer (ViT): Transformer architecture based on self-attention over image patches

Training Strategy

  • Transfer Learning: Backbone weights pre-trained on ImageNet
  • Fine-tuning: All layers unfrozen after initial training
  • Loss Function: Focal Loss (down-weights easy examples to handle class imbalance)
  • Optimizer: Adam with a cosine-annealing learning-rate schedule
  • Augmentation: Medical-specific (rotations, color jitter, no vertical flips)
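To make two items of the strategy above concrete, here is a minimal pure-Python sketch of the binary focal loss (Lin et al., 2017) and a cosine-annealing learning-rate schedule. The function names are hypothetical, and the defaults `gamma=2.0` and `alpha=0.25` are taken from the focal loss paper; the values used in this project, and its actual PyTorch implementation, are not stated here.

```python
import math


def binary_focal_loss(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Focal loss for a single sample.

    p: predicted probability of the positive class (label 1).
    y: ground-truth label, 0 or 1.
    gamma, alpha: assumed defaults, not this project's settings.
    """
    eps = 1e-12                                   # avoids log(0)
    p_t = p if y == 1 else 1.0 - p                # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha    # class-balancing weight
    # (1 - p_t)^gamma shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))


def cosine_annealing_lr(step: int, total_steps: int, lr_max: float, lr_min: float = 0.0) -> float:
    """Learning rate that decays from lr_max to lr_min along a half cosine."""
    cosine = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


# A confident correct prediction contributes far less than a confident mistake.
easy = binary_focal_loss(p=0.9, y=1)
hard = binary_focal_loss(p=0.1, y=1)
print(f"easy sample: {easy:.6f}, hard sample: {hard:.6f}")

# Learning rate at the start, middle and end of a hypothetical 100-step run.
schedule = [cosine_annealing_lr(s, 100, lr_max=1e-3) for s in (0, 50, 100)]
print(schedule)
```

With `gamma=0` the modulating factor disappears and the loss reduces to class-weighted cross-entropy, which is a convenient sanity check.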

🔍 Explainability

Visual Explanations

Grad-CAM (Gradient-weighted Class Activation Mapping)

Generates heatmaps showing which regions of the cell image influenced the model's decision.

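from src.explainability.gradcam import GradCAM

# `model` is the trained classifier; `image` is a preprocessed input image.
# 'layer4' is the last convolutional stage in torchvision-style ResNets.
cam = GradCAM(model, target_layer='layer4')
# Generate the heatmap for class index 1
heatmap = cam.generate_cam(image, target_class=1)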
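For readers unfamiliar with what happens inside such a call, the following is a minimal pure-Python sketch of the core Grad-CAM computation from Selvaraju et al. (2017). It is not the project's `GradCAM` class: the function name `grad_cam` and the toy inputs are hypothetical, and a real implementation would capture the activations and gradients of the target layer with PyTorch hooks rather than receive them as nested lists.

```python
def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM map.

    activations: feature maps of the target layer, nested lists shaped [K][H][W].
    gradients:   gradients of the target class score with respect to those
                 feature maps, same shape.
    Returns an [H][W] heatmap scaled to the range [0, 1].
    """
    n_channels = len(activations)
    height, width = len(activations[0]), len(activations[0][0])

    # 1. Channel weights: global average of the gradients per feature map.
    weights = [
        sum(sum(row) for row in grad_k) / (height * width)
        for grad_k in gradients
    ]

    # 2. Weighted sum of the feature maps, followed by ReLU.
    cam = [
        [
            max(0.0, sum(weights[k] * activations[k][i][j] for k in range(n_channels)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # 3. Scale to [0, 1] so the map can be overlaid on the image.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Toy example: two 2x2 feature maps with made-up values.
activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
gradients = [
    [[1.0, 1.0], [1.0, 1.0]],      # this channel supports the target class
    [[-1.0, -1.0], [-1.0, -1.0]],  # this channel argues against it
]
toy_heatmap = grad_cam(activations, gradients)
print(toy_heatmap)
```

In practice the resulting map is upsampled to the input resolution before being drawn over the cell image.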

Key Features Analyzed:

  • Nucleus morphology (size, shape, chromatin pattern)
  • Cytoplasm characteristics
  • Nuclear-cytoplasmic ratio
  • Presence of granulations

Natural Language Explanations

LLM-powered explanations translate visual features into clinical language:

"The model classifies this cell as LEUKEMIC (confidence: 94.2%) 
based on the following observations:

1. Enlarged nucleus with irregular chromatin pattern (highlighted 
   in red on the heatmap)
2. High nuclear-cytoplasmic ratio characteristic of blast cells
3. Absence of normal granulations in the cytoplasm

These features are consistent with acute lymphoblastic leukemia (ALL) 
morphology. Clinical correlation and additional testing recommended."
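One way such an explanation could be requested is to turn the prediction and the heatmap findings into a structured prompt. The sketch below only builds the prompt text; `build_explanation_prompt`, its parameters, and the wording are hypothetical, and the LLM call itself is left out because the provider and API used by this project are not stated here.

```python
from typing import List


def build_explanation_prompt(label: str, confidence: float, highlighted_features: List[str]) -> str:
    """Assemble an LLM prompt from a prediction and Grad-CAM findings.

    label:                predicted class name, e.g. "LEUKEMIC".
    confidence:           model confidence in [0, 1].
    highlighted_features: short descriptions of the regions the heatmap emphasizes.
    """
    feature_lines = "\n".join(
        f"{index}. {feature}" for index, feature in enumerate(highlighted_features, start=1)
    )
    return (
        "You are assisting a hematologist in reviewing a blood cell image.\n"
        f"Model prediction: {label} (confidence: {confidence:.1%})\n"
        "Regions highlighted by the Grad-CAM heatmap:\n"
        f"{feature_lines}\n"
        "Explain in clinical language how these observations relate to the "
        "prediction. Do not add findings that are not listed above, and "
        "recommend clinical correlation and additional testing."
    )


prompt = build_explanation_prompt(
    label="LEUKEMIC",
    confidence=0.942,
    highlighted_features=[
        "Enlarged nucleus with irregular chromatin pattern",
        "High nuclear-cytoplasmic ratio",
    ],
)
print(prompt)
```

Keeping the prompt restricted to the listed observations is a design choice meant to reduce the chance that the generated text describes features the model never highlighted.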

🗺️ Development Roadmap

Phase 1: Data & Exploration 📊

  • Set up project structure
  • Download datasets (ALL-IDB, C-NMC)
  • Notebook 01: EDA - visualize images, analyze class distribution
  • Notebook 02: Preprocessing - normalization, augmentation, train/val/test split

Phase 2: Baseline Model 🧠

  • Build data pipeline (src/data/)
  • Notebook 03: Train ResNet50 baseline
  • Implement training loop with metrics
  • First evaluation on validation set

Phase 3: Model Optimization 🚀

  • Notebook 03: Test EfficientNet & ViT
  • Hyperparameter tuning (LR, optimizer, augmentation)
  • Select best model and evaluate on test set

Phase 4: Explainability 🔍

  • Notebook 04: Implement Grad-CAM
  • Generate heatmaps for validation set
  • Validate model attention on medical features
  • (Optional) Test SHAP

Phase 5: LLM Integration 💬

  • Notebook 05: Set up and test LLM API
  • Design explanation prompts
  • Build pipeline: Image → Prediction → Heatmap → LLM → Explanation
  • Refine explanation quality

Phase 6: Demo & Polish 🎨

  • Build Streamlit/Gradio demo
  • Write production scripts (train.py, inference.py)
  • Complete documentation
  • Create presentation

🤝 Contributing

Contributions are welcome! Please read our Contributing Guidelines.

Development Setup

# Install dev dependencies
pip install -r requirements-dev.txt

# Run tests
pytest tests/

# Code formatting
black src/
flake8 src/

⚠️ Medical Disclaimer

THIS SOFTWARE IS FOR RESEARCH AND EDUCATIONAL PURPOSES ONLY.

  • ❌ NOT approved as a medical device
  • ❌ NOT validated for clinical diagnosis
  • ❌ NOT a replacement for professional medical judgment
  • ❌ NOT suitable for patient care without proper validation

Always consult qualified healthcare professionals for medical decisions.


📚 References

Datasets

  • Labati et al. (2011). "ALL-IDB: Acute Lymphoblastic Leukemia Image Database"
  • Gupta et al. (2019). "C-NMC Challenge Dataset"

Methods

  • Selvaraju et al. (2017). "Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization"
  • Lin et al. (2017). "Focal Loss for Dense Object Detection"
  • Dosovitskiy et al. (2020). "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale"

Medical Context

  • Terwilliger & Abdul-Hay (2017). "Acute lymphoblastic leukemia: a comprehensive review"
  • WHO Classification of Tumours of Haematopoietic and Lymphoid Tissues (2017)

👨‍💻 Author

Charlie - ML Engineer & Computer Vision Specialist


📧 Contact

For questions, suggestions, or collaboration opportunities:


**⭐ Star this repo if you find it useful!**
