Survival analysis framework based on NeuralFineGray models, Survival Stacking, and foundation model embeddings (TabICL, TabPFN, TARTE).
Two environments are available, depending on the embedding method: tab_env for TabPFN and TabICL, and tarte_env for TARTE.
# Create environment
conda create -n tab_env python=3.11
conda activate tab_env
# setup environment
cd NeuralFineGray
python -m setup_tabpfn_tabicl --install-deps

# Create environment
conda create -n tarte_env python=3.11
conda activate tarte_env
# setup environment
cd NeuralFineGray
python -m setup_tarte --install-deps

export HF_TOKEN="your_token_here"
# Or create .env file with: HF_TOKEN=your_token_here

Make sure to run

cd NeuralFineGray
python -m setup_tarte

before running TARTE experiments, and

cd NeuralFineGray
python -m setup_tabpfn_tabicl

otherwise.
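
Before launching experiments that need the token, it can help to confirm that `HF_TOKEN` is actually visible to Python. The sketch below checks the process environment first and then falls back to a `.env` file in the current directory. The helper name `find_hf_token` and the simple `KEY=value` parsing are assumptions made for illustration; they are not part of this repository.

```python
import os
from pathlib import Path


def find_hf_token(env_file=".env"):
    """Return HF_TOKEN from the environment or a .env file, else None."""
    token = os.environ.get("HF_TOKEN")
    if token:
        return token
    path = Path(env_file)
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "HF_TOKEN":
            # Drop optional surrounding quotes; an empty value counts as unset.
            return value.strip().strip('"').strip("'") or None
    return None


if __name__ == "__main__":
    print("HF_TOKEN found" if find_hf_token() else "HF_TOKEN is not set")
```

The check only reports whether a token is present; it does not validate the token against any service.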
This framework provides four main experimental pipelines:
- Baseline Experiments - Individual survival models (CoxPH, DeepSurv, RSF, XGBoost, NFG) with hyperparameter tuning
- Tabular Foundation Model Embeddings - Enhance survival models with TabICL, TabPFN, or TARTE embeddings
- Survival Stacking - Ensemble methods combining multiple base learners with optional embeddings
- Competing Risks Analysis - Multi-event survival models using discrete-time approaches
| Dataset | Type | Description |
|---|---|---|
| METABRIC | Binary survival | Breast cancer, ~2000 samples |
| SUPPORT | Binary survival | ICU mortality, ~9000 samples |
| PBC | Binary survival | Primary biliary cirrhosis, ~418 samples |
| SYNTHETIC_COMPETING | Competing risks | Synthetic data with 2 event types |
| SEER_competing_risk | Competing risks | Cancer registry (requires local file) |
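
The dataset names in the table are the values passed on the command line (the baseline command in this README uses `--dataset METABRIC`). As a rough sketch, the snippet below builds one baseline command per binary-survival dataset and prints it as a dry run. That `SUPPORT` and `PBC` are accepted verbatim by `--dataset` is an assumption based on the table, and the helper `baseline_command` is hypothetical.

```python
import shlex

# Binary-survival datasets from the table. Passing SUPPORT and PBC verbatim
# to --dataset is an assumption; only METABRIC appears in a documented command.
BINARY_DATASETS = ["METABRIC", "SUPPORT", "PBC"]


def baseline_command(dataset, model="coxph", mode="raw"):
    """Build the argument list for one baseline run."""
    return [
        "python", "-m", "experiments.run_experiment",
        "--dataset", dataset,
        "--model", model,
        "--mode", mode,
    ]


if __name__ == "__main__":
    for name in BINARY_DATASETS:
        # Print the command instead of executing it (dry run).
        print(shlex.join(baseline_command(name)))
```

To execute a command rather than print it, the returned list can be passed to `subprocess.run(cmd, check=True)` from the directory in which the README's commands are normally run.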
Each pipeline has detailed step-by-step instructions in its own README:
📖 See experiments/README.md for detailed instructions on:
- Running individual models (CoxPH, DeepSurv, RSF, XGBoost, NFG)
- Hyperparameter search configuration
- Using raw features or TabPFN embeddings
- SLURM batch job submission
📖 See tfm/README.md for detailed instructions on:
- Generating TabICL, TabPFN, or TARTE embeddings
- Running cross-validation experiments
- Comparing raw vs deep vs deep+raw feature modes
- Environment-specific requirements
📖 See survivalStacking/README.md for detailed instructions on:
- Running ensemble stacking benchmarks
- Combining base learners with embeddings
- Statistical significance testing
- Visualization of results
📖 See CompetingRisks/README.md for detailed instructions on:
- Discrete-time multiclass approaches
- Hybrid NFG models
- Benchmarking on synthetic and real datasets
# Baseline experiment
python -m experiments.run_experiment --dataset METABRIC --model coxph --mode raw
# Survival stacking
python -m survivalStacking.run_full_benchmark --dataset METABRIC --cv 5
# Competing risks
python -m CompetingRisks.run_benchmark --datasets SYNTHETIC_COMPETING

All experiments save results to results/ with organized subdirectories:
- results/experiments/ - Baseline model results
- results/tabicl/, results/tabpfn/, results/tarte/ - Embedding experiments
- results/survival_stacking/ - Stacking ensemble results
- results/competing_risks/ - Competing risks benchmarks

Plots are saved in plots/ subdirectories within each results folder.
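
To see at a glance which pipelines have already produced output, a small script can count the files under each documented results subdirectory. This is a minimal sketch: the function name `summarize_results` is hypothetical, and nothing is assumed about the file names or formats the pipelines write.

```python
from pathlib import Path

# Subdirectories named in this README; their contents are not assumed.
RESULT_SUBDIRS = [
    "experiments", "tabicl", "tabpfn", "tarte",
    "survival_stacking", "competing_risks",
]


def summarize_results(root="results"):
    """Map each documented subdirectory to its file count (None if absent)."""
    summary = {}
    for name in RESULT_SUBDIRS:
        folder = Path(root) / name
        if folder.is_dir():
            # Count files recursively, so files under plots/ are included.
            summary[name] = sum(1 for p in folder.rglob("*") if p.is_file())
        else:
            summary[name] = None
    return summary


if __name__ == "__main__":
    for name, count in summarize_results().items():
        status = "missing" if count is None else f"{count} file(s)"
        print(f"results/{name}: {status}")
```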
https://github.com/SajbenDani
This project is based on NeuralFineGray (Copyright (c) 2021 Vincent Jeanselme). It was developed at TUM (Lab for AI in Medicine) by Dániel Sajben, Amelie Trautwein, and Mohamed Amine Frouja, and supervised by Dmitrii Seletkov.