This repository performs Simulation-Based Inference (SBI) on abundance distribution data. One application, developed for the LostMa project, consists in modelling the transmission and survival of textual witnesses through time, enabling researchers to infer model parameters from observed data.
To develop and/or contribute to the project, see more detailed instructions here.
- Have Python installed on your computer or in your virtual environment manager, e.g. `pyenv`. For this project, you'll need version 3.12 of Python.
- Create a new virtual Python environment (version 3.12) and activate it.
- Install this package with `pip`. Because it depends on several "heavy" Python libraries (e.g. `torch`), the installation may take several minutes. ☕

  a. Option 1: Install directly from the project's GitHub repository URL.

        pip install git+https://github.com/LostMa-ERC/simMAtree.git

  Note: Requires that you have `git` installed on your computer.

  b. Option 2: Download ("clone") the repository using `git` (must be installed), then install the downloaded files in your virtual Python environment.

        git clone https://github.com/LostMa-ERC/simMAtree.git
        cd simMAtree
        pip install .

- Test the installation.

        $ simmatree-test
        Looks good!

  Note: It's normal for the command to take a while. Some of the Python dependencies are very "heavy" and importing them at startup can be slow.
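
If the installation or the test command fails, a quick check of the environment can help narrow down the cause. The following is a minimal sketch using only the standard library; the helper names `python_version_ok` and `command_available` are hypothetical and not part of simmatree.

```python
import shutil
import sys

REQUIRED = (3, 12)  # Python version stated in the installation steps


def python_version_ok(version_info=sys.version_info, required=REQUIRED):
    """Return True when the interpreter's major.minor matches the required version."""
    return tuple(version_info[:2]) == required


def command_available(name="simmatree-test"):
    """Return True when an executable with this name is found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    print("Python 3.12 interpreter:", python_version_ok())
    print("simmatree-test on PATH:", command_available())
```

Run it inside the activated virtual environment: if the second check prints `False`, the package was most likely installed into a different environment.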
The script supports three tasks: `infer`, `generate` and `score`.
No matter the task in your experiment, prepare a configuration YAML file. Follow the model here.
When running any of the simmatree tasks, you'll need to provide your experiment's configuration file.
Create `experiment.yml`:

    generator:
      name: YuleAbundance  # or BirthDeathAbundance
      config:
        n_init: 1
        Nact: 1000
        Ninact: 1000
        max_pop: 50000

    stats:
      name: Abundance
      config:
        additional_stats: true

    prior:
      name: ConstrainedUniform4D  # or ConstrainedUniform2D for Birth-Death
      config:
        low: [0.0, 0.0, 0.0, 0.0]
        high: [1.0, 0.015, 0.01, 0.01]

    params:
      LDA: 0.3      # Rate of new independent trees (Yule only)
      lda: 0.009    # Probability of copying/reproduction
      gamma: 0.001  # Probability of speciation (Yule only)
      mu: 0.0033    # Probability of death

    inference:
      name: SBI
      config:
        method: NPE
        num_simulations: 500
        num_rounds: 2
        random_seed: 42
        num_samples: 500
        num_workers: 10
        device: cpu

This example performs all three simmatree tasks (generate, score and infer).
Certain blocks of information need not be provided if only one of the three tasks is to be performed (e.g. params if you only wish to perform inference and have no ground truth).
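
When `params` is provided, one sanity check worth running before a long experiment is that the ground-truth values fall inside the prior's `low`/`high` bounds. The sketch below copies the values from the example configuration into plain Python data, so it needs no YAML parser. It assumes the four prior dimensions are ordered `LDA`, `lda`, `gamma`, `mu`, which the configuration does not state explicitly. The helper name `params_outside_prior` is hypothetical and not part of simmatree, and any extra constraints applied by the `ConstrainedUniform4D` prior are not checked here.

```python
# Values copied from the example experiment.yml.
PRIOR_LOW = [0.0, 0.0, 0.0, 0.0]
PRIOR_HIGH = [1.0, 0.015, 0.01, 0.01]
PARAMS = {"LDA": 0.3, "lda": 0.009, "gamma": 0.001, "mu": 0.0033}

# ASSUMPTION: the prior dimensions follow this order (not stated in the config).
PARAM_ORDER = ["LDA", "lda", "gamma", "mu"]


def params_outside_prior(params, low, high, order=PARAM_ORDER):
    """Return the names of parameters whose value lies outside [low, high]."""
    out_of_bounds = []
    for name, lo, hi in zip(order, low, high):
        if not lo <= params[name] <= hi:
            out_of_bounds.append(name)
    return out_of_bounds


if __name__ == "__main__":
    bad = params_outside_prior(PARAMS, PRIOR_LOW, PRIOR_HIGH)
    print("out-of-bounds parameters:", bad or "none")
```

Under the assumed ordering, every value in the example configuration lies within its bounds.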

    simmatree -c experiment.yml generate -o synthetic_data.csv -s 42
    simmatree -c experiment.yml infer -i synthetic_data.csv -o results/
    simmatree -c experiment.yml score -d results/

- Generators (`src/generator/`): Implement stochastic evolutionary models
  - `YuleAbundance`: Full 4-parameter Yule process
  - `BirthDeathAbundance`: Simplified 2-parameter Birth-Death process
  - `GeneralizedAbundanceGenerator`: Base class with shared simulation logic
- Statistics (`src/stats/`): Extract summary statistics from simulated data
  - `AbundanceStats`: Witness count distributions and derived metrics
- Priors (`src/priors/`): Constrained uniform distributions
  - `ConstrainedUniform4D`: For Yule model with biological constraints
  - `ConstrainedUniform2D`: For Birth-Death model
- Inference (`src/inference/`): SBI backends
  - `SbiBackend`: Neural Posterior Estimation and related methods
- CLI (`src/cli/`): Command-line interface and configuration management

## Outputs
- `posterior_samples.npy`: Raw posterior samples
- `posterior_summary.csv`: Summary statistics (mean, quantiles, HPDI)
- `posterior_predictive.npy`: Posterior predictive samples
- `pp_summaries.png`: Posterior predictive check visualizations
- `posterior.png`: Marginal posterior distributions
- `pairplot.png`: Parameter correlation plots

- `summary_metrics.csv`: RMSE, coverage probability, relative errors
- `relative_error.png`: Parameter-wise relative error analysis
- Additional diagnostic plots
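
To make the quantities named in `summary_metrics.csv` concrete, here is a minimal sketch of how RMSE, relative error and coverage are commonly defined when a posterior is compared with a known ground truth. These are textbook definitions, and the exact formulas simmatree uses may differ. The function names are hypothetical and the numbers are toy values, not simmatree output.

```python
import math


def rmse(samples, truth):
    """Root mean squared error of posterior samples around the true value."""
    return math.sqrt(sum((s - truth) ** 2 for s in samples) / len(samples))


def relative_error(estimate, truth):
    """Absolute error of a point estimate, relative to the true value."""
    return abs(estimate - truth) / abs(truth)


def coverage_probability(intervals, truths):
    """Fraction of (lower, upper) intervals that contain their true value."""
    hits = sum(lo <= t <= hi for (lo, hi), t in zip(intervals, truths))
    return hits / len(truths)


if __name__ == "__main__":
    # Toy posterior samples for a single parameter with true value 0.009.
    samples = [0.008, 0.010, 0.009, 0.011]
    truth = 0.009
    posterior_mean = sum(samples) / len(samples)
    print("RMSE:", rmse(samples, truth))
    print("relative error of the mean:", relative_error(posterior_mean, truth))
```

Coverage is only meaningful across repeated experiments or several parameters: it counts how often a credible interval (for example an HPDI) contains the value that generated the data.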
The project includes comprehensive tests:

    # Run all tests
    python tests/run_all_tests.py

    # Run specific test categories
    python tests/run_all_tests.py --category unit
    python tests/run_all_tests.py --category integration
    python tests/run_all_tests.py --category e2e

See CONTRIBUTING.md for detailed development instructions, including:
- Setting up the development environment
- Code formatting with `ruff` and `isort`
- Pre-commit hooks
- Testing guidelines
Funded by the European Union (ERC, LostMA, 101117408). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them.