PyQemistree uses PySirius (the Python API of SIRIUS) to obtain a fingerprint vector for each mass spectrum, then calculates the pairwise distances between all spectra. The resulting distance matrix can be used to visualize the chemical space of a sample.
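The pairwise-distance step can be illustrated with a minimal sketch. It is not PyQemistree's own code: the function name `pairwise_euclidean` and the toy fingerprint values are made up for illustration, and only the Euclidean metric is shown.

```python
import math


def pairwise_euclidean(fingerprints):
    """Return a symmetric matrix of Euclidean distances between vectors."""
    n = len(fingerprints)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # math.dist computes the Euclidean distance (Python 3.8+).
            d = math.dist(fingerprints[i], fingerprints[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical fingerprint vectors for three spectra (illustrative values only).
fingerprints = [
    [1.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 1.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
]
distance_matrix = pairwise_euclidean(fingerprints)
```

For larger inputs, `scipy.spatial.distance.pdist` combined with `squareform` computes the same kind of matrix more efficiently.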
PyQemistree requires SIRIUS and its Python API.
- Install the SIRIUS CLI by following https://boecker-lab.github.io/docs.sirius.github.io/install/ . PyQemistree is currently built against SIRIUS 6.0.7.
- A SIRIUS account is required to run SIRIUS. See https://boecker-lab.github.io/docs.sirius.github.io/account-and-license/ to create one.
- Install the dependencies using the provided .yaml file:
conda env create -n pyqemistree --file PyQemistree_1.yaml
conda activate pyqemistree
PyQemistree takes up to three input files:
- Mass spectra file (*.mgf)
- SIRIUS configuration file (*.ini)
- (Optional) Feature quantification table (*.csv)
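Before starting a run, it can help to confirm that each input exists and has the expected extension. The sketch below is an optional pre-flight check, not part of PyQemistree; `check_inputs` and `EXPECTED_SUFFIXES` are hypothetical names introduced here.

```python
from pathlib import Path

# Expected extension for each input, taken from the list above.
EXPECTED_SUFFIXES = {
    "spectra": ".mgf",
    "config": ".ini",
    "quantification": ".csv",
}


def check_inputs(spectra, config, quantification=None):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    inputs = {
        "spectra": spectra,
        "config": config,
        "quantification": quantification,
    }
    for role, value in inputs.items():
        if value is None:
            # Only the feature quantification table is optional.
            if role != "quantification":
                problems.append(f"{role}: no path given")
            continue
        path = Path(value)
        if path.suffix.lower() != EXPECTED_SUFFIXES[role]:
            problems.append(f"{role}: expected a {EXPECTED_SUFFIXES[role]} file, got {path.name}")
        elif not path.is_file():
            problems.append(f"{role}: file not found: {path}")
    return problems
```

Calling `check_inputs("spectra.mgf", "config_nega.ini")` returns an empty list when both files are present in the working directory.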
PyQemistree provides two scripts: get_fingerprint_distance.py and qemistree_MC.py. Both calculate the fingerprint-based pairwise distances between all spectra in the mass spectra file. qemistree_MC.py additionally visualizes the feature quantification table as a heat map, alongside a dendrogram of the features calculated from the fingerprint distances. Two default configuration files, one for each ion mode, are provided and can be modified. They contain the settings used to run SIRIUS.
The following command runs get_fingerprint_distance.py to produce the distance matrix.
python3 ./get_fingerprint_distance.py \
--sirius-path '/PATH/TO/SIRIUS/bin/sirius' \
--spectra-path "/PATH/TO/SPECTRA/spectra.mgf" \
--username 'USERID_OF_SIRIUS' \
--password 'PASSWORD_OF_SIRIUS' \
--sirius-config-path 'PATH/TO/CONFIG/config_nega.ini' \
--distance-metric 'euclidean' \
--distance-matrix-path 'distance_matrix.csv'
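The resulting file can then be inspected in Python. The layout of distance_matrix.csv is not documented here, so this sketch assumes a square matrix whose first row and first column hold feature labels. Adjust the parsing if the actual file differs. `read_distance_matrix` and `closest_pair` are hypothetical helper names.

```python
import csv


def read_distance_matrix(path):
    """Read a labelled square matrix (ASSUMED layout: header row + label column)."""
    with open(path, newline="") as handle:
        rows = [row for row in csv.reader(handle) if row]
    labels = rows[0][1:]
    matrix = [[float(value) for value in row[1:]] for row in rows[1:]]
    if len(matrix) != len(labels) or any(len(r) != len(labels) for r in matrix):
        raise ValueError("distance matrix is not square")
    return labels, matrix


def closest_pair(labels, matrix):
    """Return (label_a, label_b, distance) for the two most similar features."""
    best = None
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if best is None or matrix[i][j] < best[2]:
                best = (labels[i], labels[j], matrix[i][j])
    return best
```

For example, `closest_pair(*read_distance_matrix("distance_matrix.csv"))` reports the pair of features with the smallest fingerprint distance.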
The following command runs qemistree_MC.py to produce the distance matrix and a heat map (the input file format is still under development).
python3 ./qemistree_MC.py \
--sirius-path '/PATH/TO/SIRIUS/bin/sirius' \
--spectra-path "/PATH/TO/SPECTRA/spectra.mgf" \
--username 'USERID_OF_SIRIUS' \
--password 'PASSWORD_OF_SIRIUS' \
--sirius-config-path 'PATH/TO/CONFIG/config_nega.ini' \
--distance-metric 'euclidean' \
--distance-matrix-path 'distance_matrix.csv' \
--fig-path 'distance_matrix.png'
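Typing the SIRIUS password directly into the command leaves it in the shell history. One possible workaround, sketched below, is to launch the script from Python and read the credentials from environment variables. The variable names `SIRIUS_USERNAME` and `SIRIUS_PASSWORD` and the helpers `build_command` and `run_from_environment` are assumptions made for this example; the command-line flags are the ones shown above.

```python
import os
import subprocess
import sys


def build_command(sirius_path, spectra_path, config_path, username, password,
                  metric="euclidean", output_path="distance_matrix.csv"):
    """Assemble the argument list for get_fingerprint_distance.py."""
    return [
        sys.executable, "./get_fingerprint_distance.py",
        "--sirius-path", sirius_path,
        "--spectra-path", spectra_path,
        "--username", username,
        "--password", password,
        "--sirius-config-path", config_path,
        "--distance-metric", metric,
        "--distance-matrix-path", output_path,
    ]


def run_from_environment(sirius_path, spectra_path, config_path):
    """Run the script with credentials taken from (hypothetical) env variables."""
    command = build_command(
        sirius_path, spectra_path, config_path,
        username=os.environ["SIRIUS_USERNAME"],
        password=os.environ["SIRIUS_PASSWORD"],
    )
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems. The password is still visible in the process argument list while the script runs, so this only keeps it out of shell history and saved scripts.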