Snakemake pipeline that corrects for ambient RNA expression in single-cell RNA sequencing (scRNA-seq) data using DecontX.
This project contains the code to run a two-phase islet decontamination protocol, as well as configurations and wrapper scripts to run the pipeline on sample data.
- Clone the GitHub repo and cd into the repo directory.
# clone the repo
git clone https://github.com/CollinsLabBioComp/islet_decontamination.git
# set code base path
SNK_REPO="$(pwd)/islet_decontamination"
cd ${SNK_REPO}
- Launch the Docker app and download the Docker image:
docker pull letaylor/sc_decontx:latest
- Run the sample data:
chmod +x ./run_docker.sh
./run_docker.sh
The expected input data is a folder containing standard 10x outputs. Each folder should contain the following standard folders:
- [sample]/outs/filtered_feature_bc_matrix
- [sample]/outs/raw_feature_bc_matrix
with each [raw/filtered]_feature_bc_matrix folder containing barcodes.tsv.gz, features.tsv.gz, and matrix.mtx.gz. For reference, see the provided sample data.
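The layout described above can be checked before launching the pipeline. The following is a minimal sketch, not part of the repository: check_10x_sample is a hypothetical helper name, and it assumes only the folder and file names listed above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): verify that one sample
# folder contains the filtered and raw 10x matrices with all three files.
check_10x_sample() {
    local sample_dir="$1"
    local missing=0
    local kind file path
    for kind in filtered raw; do
        for file in barcodes.tsv.gz features.tsv.gz matrix.mtx.gz; do
            path="${sample_dir}/outs/${kind}_feature_bc_matrix/${file}"
            if [ ! -f "${path}" ]; then
                # Report every missing file rather than stopping at the first.
                echo "missing: ${path}" >&2
                missing=1
            fi
        done
    done
    # Exit status 0 if the layout is complete, 1 otherwise.
    return "${missing}"
}

# Example (commented out): check every sample folder under ./data.
# for d in ./data/*/; do check_10x_sample "${d%/}" || echo "incomplete: ${d}"; done
```

The function only tests that the files exist; it does not validate their contents.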
To configure the pipeline to use your own data:
- Update workflow/src/threeprime.yaml to change the run ID (name) and specify the sample IDs (samples). Note: if your 10x output directory format differs, you may need to update input_dir_basename and input_path_format to match.
- Place the 10x output folder (containing a minimum of outs/, see here for reference) in the ./data/ folder. Alternatively, you can modify input_dir_base and input_path_format so the base (i.e. data/) points to your parent directory containing all samples.
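Placing a sample in the data folder can be scripted. The sketch below is illustrative only: stage_sample is a hypothetical helper, not something the repository provides. It copies the folder rather than symlinking it, on the assumption that a link pointing outside the repo might not resolve inside the Docker container.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): copy a 10x output folder
# into the data directory, refusing folders that lack outs/.
stage_sample() {
    # Strip a trailing slash so cp copies the folder itself, not its contents.
    local src="${1%/}"
    local data_dir="${2:-./data}"
    if [ ! -d "${src}/outs" ]; then
        echo "error: ${src} has no outs/ folder" >&2
        return 1
    fi
    mkdir -p "${data_dir}"
    # Result: ${data_dir}/<sample folder name>/outs/...
    cp -R "${src}" "${data_dir}/"
}

# Example (commented out; the source path is a placeholder):
# stage_sample /path/to/my_sample ./data
```

After staging, the sample ID listed under samples in the YAML file would presumably need to match the copied folder name, unless input_path_format is changed.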