Commit
update docs
stkmrc committed Sep 5, 2024
1 parent df22b52 commit 3718d46
Showing 9 changed files with 422 additions and 18 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -1,6 +1,6 @@
# A dynamic benchmark for gene regulatory network (GRN) inference

[![Documentation Status](https://readthedocs.org/projects/grn-inference-benchmarking/badge/?version=latest)](https://grn-inference-benchmarking.readthedocs.io/en/latest/?badge=latest)
The full documentation is hosted on [ReadTheDocs](https://openproblems-grn-task.readthedocs.io/en/latest/index.html). [![Documentation Status](https://readthedocs.org/projects/grn-inference-benchmarking/badge/?version=latest)](https://grn-inference-benchmarking.readthedocs.io/en/latest/?badge=latest)

<!--
This file is automatically generated from the task's api/*.yaml files.
4 changes: 0 additions & 4 deletions docs/source/add_stuff.rst

This file was deleted.

28 changes: 21 additions & 7 deletions docs/source/evaluation.rst
@@ -1,16 +1,30 @@
Evaluation
=====
==========

The evaluation relies on perturbation data and uses two different approaches:
1. Regression from GRN regulations to target expression
2. Regression from TF expression of predicted regulators to target expression

Regression from GRN regulations to target expression
----------------
#. Regression from GRN regulations to target expression
#. Regression from TF expression of predicted regulators to target expression

|
Regression from TF expression of predicted regulators to target expression
----------------
.. image:: images/regressions.png
   :width: 100 %
   :alt: overview of the two regression evaluation approaches
   :align: center

|
|

Evaluation 1: Regression from GRN regulations to target expression
------------------------------------------------------------------
The first approach is similar to GRaNPA and to the multivariate decision tree in Decoupler: regulatory weights from the GRN form the feature space used to predict perturbation data. We train one model per sample. The feature matrix has dimensions genes by transcription factors (TFs); each entry is the regulatory weight from the GRN, or 0 if the link is absent. The target matrix holds the perturbation data for each sample. We measure predictive performance with 5-fold cross-validation, using the coefficient of determination (R²) as the metric. LightGBM is used as the regression model for computational efficiency.
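
The following is a minimal, dependency-free sketch of this idea for a single sample, not the pipeline's actual implementation. It builds the genes-by-TFs feature matrix from a GRN edge list (0 where a link is absent) and scores it with 5-fold cross-validated R². To stay self-contained it swaps LightGBM for a small ridge regression, and it pools the out-of-fold predictions before computing R²; both choices, as well as all function names and the toy data, are assumptions made for illustration.

```python
import random


def build_feature_matrix(edges, genes, tfs):
    """Genes-by-TFs matrix of regulatory weights; 0.0 where the GRN has no link."""
    weights = {(tf, gene): w for tf, gene, w in edges}
    return [[weights.get((tf, gene), 0.0) for tf in tfs] for gene in genes]


def fit_ridge(X, y, alpha=1e-6):
    """Stand-in model (assumption): ridge regression with an intercept column."""
    rows = [list(r) + [1.0] for r in X]
    p = len(rows[0])
    # Augmented normal equations: (X^T X + alpha * I | X^T y)
    M = [[sum(r[i] * r[j] for r in rows) + (alpha if i == j else 0.0)
          for j in range(p)] + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(p)]
    for col in range(p):  # Gaussian elimination with partial pivoting
        pivot = max(range(col, p), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for k in range(col + 1, p):
            factor = M[k][col] / M[col][col]
            M[k] = [a - factor * b for a, b in zip(M[k], M[col])]
    coef = [0.0] * p
    for i in reversed(range(p)):  # back substitution
        rest = sum(M[i][j] * coef[j] for j in range(i + 1, p))
        coef[i] = (M[i][p] - rest) / M[i][i]
    return coef


def predict(coef, X):
    # zip stops at the feature count, so coef[-1] is the intercept.
    return [sum(c * v for c, v in zip(coef, row)) + coef[-1] for row in X]


def r2_score(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def cross_validated_r2(X, y, n_splits=5, seed=0):
    """R² over pooled out-of-fold predictions; genes are the rows being split."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    preds = [0.0] * len(y)
    for k in range(n_splits):
        fold = idx[k::n_splits]
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        coef = fit_ridge([X[i] for i in train], [y[i] for i in train])
        for i, p in zip(fold, predict(coef, [X[i] for i in fold])):
            preds[i] = p
    return r2_score(y, preds)


# Toy GRN: edges are (tf, target_gene, weight); zero-weight links are omitted.
genes = [f"gene{i}" for i in range(20)]
tfs = ["TF_A", "TF_B"]
edges = []
for i, gene in enumerate(genes):
    if i % 5:
        edges.append(("TF_A", gene, (i % 5) / 4))
    if i % 3:
        edges.append(("TF_B", gene, (i % 3) / 2))

X = build_feature_matrix(edges, genes, tfs)
# Hypothetical perturbation response of each gene in one sample.
y = [2.0 * a - 1.0 * b + 0.5 for a, b in X]
score = cross_validated_r2(X, y)
print(f"cross-validated R2: {score:.3f}")
```

Because the toy response is an exact linear function of the regulatory weights, the score is close to 1. A GRN whose weights carry no information about the perturbation response would score near or below 0.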


Evaluation 2: Regression from TF expression of predicted regulators to target expression
----------------------------------------------------------------------------------------
In the second approach, instead of using regulatory weights, we use the expression of putative regulators (TFs) from the perturbation data to construct the feature space. We fit one model per gene, selecting its regulators based on the regulatory weights suggested by the GRN. This setup is similar to many modern GRN inference techniques.
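
As a rough sketch of this second approach (again not the pipeline's implementation), the code below fits one model for a single target gene. The regulators are picked from the GRN by absolute weight, and their expression across samples forms the feature matrix. The function names, the ``max_regulators`` cutoff, the least-squares stand-in model, the in-sample R², and the toy data are all assumptions made for illustration.

```python
import random


def select_regulators(edges, target, max_regulators=2):
    """TFs the GRN links to `target`, strongest absolute weight first."""
    linked = [(abs(w), tf) for tf, gene, w in edges if gene == target]
    return [tf for _, tf in sorted(linked, reverse=True)[:max_regulators]]


def solve_least_squares(X, y, alpha=1e-6):
    """Stand-in model (assumption): lightly regularised least squares."""
    rows = [list(r) + [1.0] for r in X]  # append intercept column
    p = len(rows[0])
    M = [[sum(r[i] * r[j] for r in rows) + (alpha if i == j else 0.0)
          for j in range(p)] + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(p)]
    for col in range(p):  # Gaussian elimination with partial pivoting
        pivot = max(range(col, p), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for k in range(col + 1, p):
            factor = M[k][col] / M[col][col]
            M[k] = [a - factor * b for a, b in zip(M[k], M[col])]
    coef = [0.0] * p
    for i in reversed(range(p)):  # back substitution
        rest = sum(M[i][j] * coef[j] for j in range(i + 1, p))
        coef[i] = (M[i][p] - rest) / M[i][i]
    return coef


def gene_r2(expression, edges, target):
    """In-sample R² of predicting `target` from its GRN-predicted regulators."""
    tfs = select_regulators(edges, target)
    y = expression[target]
    # Feature matrix: samples by selected TFs, filled with TF expression.
    X = [[expression[tf][s] for tf in tfs] for s in range(len(y))]
    coef = solve_least_squares(X, y)
    pred = [sum(c * v for c, v in zip(coef, row)) + coef[-1] for row in X]
    mean = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


# Toy perturbation data: expression per gene across 12 samples.
rng = random.Random(0)
n_samples = 12
expression = {tf: [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
              for tf in ("TF1", "TF2", "TF3")}
# Hypothetical ground truth: G1 is driven by TF1 and TF2 only.
expression["G1"] = [1.5 * a - 0.5 * b
                    for a, b in zip(expression["TF1"], expression["TF2"])]

good_grn = [("TF1", "G1", 0.9), ("TF2", "G1", -0.4)]  # names the true regulators
bad_grn = [("TF3", "G1", 0.8)]                        # names an unrelated TF

good_score = gene_r2(expression, good_grn, "G1")
bad_score = gene_r2(expression, bad_grn, "G1")
print(f"good GRN R2: {good_score:.3f}, bad GRN R2: {bad_score:.3f}")
```

In this sketch the GRN contributes only the choice of regulators; the regression coefficients are learned from expression. A GRN that names the true regulators therefore yields a higher R² than one that names unrelated TFs.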



21 changes: 21 additions & 0 deletions docs/source/extending.rst
@@ -0,0 +1,21 @@
Extending the pipeline
======================

The pipeline currently uses the perturbation dataset from the Open Problems: Single Cell Perturbation 2023 data science competition.
It provides single-cell perturbation gene expression for peripheral blood mononuclear cells (PBMCs), along with integrated multi-omics data of scRNA-seq and scATAC-seq for the baseline compound from the same experiment.
It includes 146 perturbations, making it the largest drug perturbation study on primary human tissue with donor replicates.

Currently, the following six enhancer-aware GRN inference methods (eGRN methods) are implemented in the pipeline:

#. Scenic+ (`Paper <https://doi.org/10.1038/s41592-023-01938-4>`_)
#. CellOracle (`Paper <https://doi.org/10.1038/s41586-022-05688-9>`_)
#. FigR (`Paper <https://doi.org/10.1016/j.xgen.2022.100166>`_)
#. scGLUE (`Paper <https://doi.org/10.1038/s41587-022-01284-4>`_)
#. GRaNIE (`Paper <https://doi.org/10.15252/msb.202311627>`_)
#. ANANSE (`Paper <https://doi.org/10.1093/nar/gkab598>`_)

To add a method to the repository, follow the instructions in the ``scripts/add_a_method.sh`` script.




Binary file added docs/source/images/overview.png
Binary file added docs/source/images/regressions.png
31 changes: 25 additions & 6 deletions docs/source/index.rst
@@ -1,14 +1,30 @@
Welcome to the documentation for the GRN inference task on OpenProblems Benchmarking platform!
===================================
Welcome to the documentation for the GRN inference task on the OpenProblems Benchmarking platform!
==================================================================================================

This task is one of many hosted on the `OpenProblems benchmarking platform <https://openproblems.bio/>`_.
The interactive benchmarking results are hosted `here <https://openproblems.bio/results/>`_.

The *GRN inference* task focuses on the inference of gene regulatory networks (GRN) from RNA-Seq expression or chromatin accessibility data (ATAC-Seq) or both.
The **GRN inference** task focuses on the inference of gene regulatory networks (GRN) from RNA-Seq expression or chromatin accessibility data (ATAC-Seq) or both.
The pipeline evaluates inferred GRNs against perturbation data by training two types of regression models. This type of evaluation comes closer to assessing the biological knowledge that a GRN should represent than purely statistical checks for the presence of edges, as is commonly done with metrics such as AUPRC or AUROC.

Jump to the :doc:`evaluation` section to get a deeper explanation on how the benchmarking task is setup.
If you want to add your own datasets or algorithms to the benchmark, check our the :doc:`add_stuff` section.
Jump to the :doc:`overview` section to get a first summary of the pipeline.
If you want to add your own datasets or algorithms to the benchmark, check out the :doc:`extending` section.


.. list-table:: Authors & contributors
   :widths: 25 25
   :header-rows: 1

   * - name
     - roles
   * - Jalil Nourisa
     - author
   * - Antoine Passemiers
     - author
   * - Robrecht Cannoodt
     - author
   * - Marco Stock
     - author

.. note::

@@ -19,5 +35,8 @@ Contents

.. toctree::

   overview
   objects
   evaluation
   add_stuff
   extending
