cteng585/gwas-scrnaseq-integration

Integrated scRNA-seq/GWAS Approach to Identifying Disease-Relevant Cell Types

This repository contains the code for the capstone project "Identification of Disease-Relevant Cell-Types in Rheumatoid Arthritis Using an Integrated scRNA-seq/GWAS Approach" by Chris Teng, Rucha Deo, and Rachel Zeng, completed for the Summer 2023 cohort of the University of Chicago's Masters of Biomedical Informatics program.

The Python modules in this repository wrap MAGMA and PLINK to munge and process the input data used by scDRS, a tool that associates individual cells in scRNA-seq data with disease GWAS data. The associated Jupyter notebooks call these modules.

The R module in this repository and its associated Quarto documents use Seurat to perform standard pre-processing on scRNA-seq data, then map the disease scores and p-values generated by scDRS onto individual cells for visualization.

The notebooks are expected to be run in this order: make_reference.ipynb -> gwas_processing.ipynb -> celseq_processing.qmd -> score_cells.ipynb -> cell_score_viz.qmd

Dependencies

Python dependencies for this repository are handled with poetry. Alternatively, a requirements.txt file has been provided.

R dependencies for this repository are handled with renv.

In addition to the Python and R dependencies, local installations of both MAGMA and PLINK are required. For the notebooks to run smoothly, both executables must be accessible through $PATH.
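
The $PATH requirement above can be checked before opening any notebook. The following is a minimal sketch, not part of the repository: it assumes the executables are named magma and plink, which may differ on a given system (for example, a PLINK 2 binary is often called plink2), and the helper name find_missing_tools is hypothetical.

```python
import shutil


def find_missing_tools(tools=("magma", "plink")):
    """Return the executable names that cannot be found on PATH.

    The default names are assumptions; pass the names used locally.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


missing = find_missing_tools()
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("MAGMA and PLINK are both reachable through PATH.")
```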

Execution

In each notebook, the run parameters are defined outside the function calls so that they can be changed easily to suit the local environment. File paths should be given relative to the location of the notebook.
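
As an illustration of this layout, a notebook cell might look roughly like the sketch below. The parameter names, the example path, and the helper resolve_from_notebook are all hypothetical and are not taken from the repository's notebooks; the sketch only shows paths being written relative to the notebook and resolved against its directory.

```python
from pathlib import Path

# Hypothetical run parameters, kept outside any function call so they
# are easy to edit. Paths are written relative to the notebook.
GWAS_SUMSTATS = "../data/example_sumstats.tsv"  # placeholder path
OUTPUT_DIR = "output"


def resolve_from_notebook(relative_path, notebook_dir=None):
    """Resolve a path relative to the notebook's directory.

    Falls back to the current working directory, which is normally
    the notebook's own directory when run from Jupyter.
    """
    base = Path(notebook_dir) if notebook_dir is not None else Path.cwd()
    return (base / relative_path).resolve()


print(resolve_from_notebook(GWAS_SUMSTATS))
print(resolve_from_notebook(OUTPUT_DIR))
```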

Expected outputs

Each notebook creates (or reuses) both a tmp directory and an output directory. The output directory holds the primary, final outputs of the notebook, whereas the tmp directory contains work directories for the intermediate files generated during the workflow. Only the final outputs are listed below.
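
The directory convention can be sketched as follows. This is an assumption about how such a layout might be set up, not the repository's own code: the helper name prepare_run_dirs is hypothetical, and only the directory names tmp and output come from the description above.

```python
from pathlib import Path


def prepare_run_dirs(notebook_dir="."):
    """Create the tmp and output directories if they do not exist yet.

    Returns the two paths so later steps can write intermediate files
    to tmp and final results to output.
    """
    base = Path(notebook_dir)
    tmp_dir = base / "tmp"
    output_dir = base / "output"
    for directory in (tmp_dir, output_dir):
        # exist_ok=True makes re-running a notebook safe
        directory.mkdir(parents=True, exist_ok=True)
    return tmp_dir, output_dir
```

Calling prepare_run_dirs() with no arguments would create both directories next to the notebook; calling it a second time leaves existing contents untouched.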

make_reference

  • a merged PLINK bed file
  • a merged PLINK bim file
  • a merged PLINK fam file
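
Because PLINK treats the bed, bim, and fam files as one fileset sharing a common prefix, it can be useful to confirm that all three exist before moving on to gwas_processing. A minimal sketch, in which the helper name missing_plink_files and the example prefix are hypothetical:

```python
from pathlib import Path

PLINK_EXTENSIONS = (".bed", ".bim", ".fam")


def missing_plink_files(prefix):
    """Return the extensions of the PLINK fileset that are absent.

    `prefix` is the shared path without extension, e.g. a hypothetical
    "output/merged_reference" for output/merged_reference.bed/.bim/.fam.
    """
    return [ext for ext in PLINK_EXTENSIONS
            if not Path(f"{prefix}{ext}").is_file()]
```

An empty list means the merged fileset is complete.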

gwas_processing

  • a .genes.annot file mapping GWAS SNPs to their associated genes
  • a .genes.out file containing the gene analysis results (i.e. which genes are significantly associated with the trait of interest)
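
To give a sense of how the gene analysis results might be inspected, the sketch below reads a .genes.out-style table and lists genes that pass a Bonferroni-corrected threshold. It assumes a whitespace-delimited file with a header row containing GENE and P columns; the Bonferroni cutoff is an illustrative choice rather than the project's stated criterion, the function name significant_genes is hypothetical, and the sample rows are toy values, not real results.

```python
def significant_genes(lines, alpha=0.05):
    """Return gene IDs whose p-value passes a Bonferroni threshold.

    `lines` is an iterable of text lines from a whitespace-delimited
    table with a header containing GENE and P (an assumption about
    the file layout). Lines starting with '#' are skipped.
    """
    rows = [line.split() for line in lines
            if line.strip() and not line.startswith("#")]
    if len(rows) < 2:
        return []
    header, records = rows[0], rows[1:]
    gene_col, p_col = header.index("GENE"), header.index("P")
    threshold = alpha / len(records)  # Bonferroni: alpha / number of genes tested
    return [rec[gene_col] for rec in records if float(rec[p_col]) < threshold]


# Toy input for illustration only (not real analysis output)
example = [
    "GENE CHR START STOP NSNPS NPARAM N ZSTAT P",
    "1234 1 1000 2000 10 3 50000 5.6 1e-08",
    "5678 2 3000 4000 12 4 50000 0.0 0.5",
]
print(significant_genes(example))
```

With a real file, the lines could be supplied by opening the .genes.out file and passing the file object directly.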

celseq_processing

  • cell_type_distribution.png - a histogram of the cell type distribution
  • pca_10dims.png - a PCA of the first 10 dimensions
  • umap_cell_clusters_v1.png and umap_cell_clusters_v2.png - UMAPs of the single-cell data that attempt to recapitulate the cell clusters of the source article
  • canonical_markers.png - a UMAP of the single-cell data annotated with canonical cell markers in Fibroblasts, Monocytes, B-cells, and T-cells

score_cells

  • CSVs of individual cells in the scRNA-seq data scored on their disease relevance

cell_score_viz

  • plots of the disease relevance scores (as calculated by scDRS) mapped onto individual cells from scRNA-seq
  • plots of the disease relevance p-values (as calculated by scDRS) mapped onto individual cells from scRNA-seq
  • plot of markers that differentiate rheumatoid arthritis (RA)-relevant vs RA-irrelevant cell subpopulations within an analyzed cell type
  • table of differentially expressed markers between RA-relevant and RA-irrelevant cell subpopulations within a defined cell type
  • plot of the distribution of RA-associated cells (case) vs osteoarthritis (OA)-associated cells (control) on the previously generated UMAP clusters
  • a CSV of RA markers
  • a CSV of markers differentiating RA and OA cells within a defined cell type
  • a CSV of markers differentiating RA and OA cells within the disease-relevant cells of a defined cell type
  • a CSV of markers differentiating RA and OA cells within the disease-irrelevant cells of a defined cell type
