Cytoscape #66

Merged
merged 49 commits into from
Sep 22, 2023
Changes from 5 commits
Commits
8600f45
add cytoscape post processing
ajshedivy Apr 25, 2022
da5a00f
add py4cytoscape container and dev stuff
ajshedivy Jun 27, 2022
635c6ed
integrate cytoscape into snakemake
ajshedivy Jun 28, 2022
ca5ab35
remove py4cy package
ajshedivy Jun 29, 2022
5b6567c
add cytoscape to config.yaml
ajshedivy Jun 29, 2022
e464451
code review changes
ajshedivy Jul 5, 2022
3739a9a
add other changes
ajshedivy Jul 5, 2022
60816a5
Merge branch 'master' into cytoscape
agitter Jul 6, 2022
d6521b5
Check whether tests pass with Cytoscape step disabled
agitter Jul 9, 2022
3d03cec
Whitespace formatting
agitter Jul 9, 2022
4816c0b
Remove prints
agitter Jul 9, 2022
97e3c55
Re-enable Cytoscape step
agitter Jul 9, 2022
1652f45
Merge branch 'master' into cytoscape
agitter Aug 12, 2022
bdce7f8
Cleanup snakefile and init file
agitter Aug 25, 2022
c26e19f
Move Cytoscape code
agitter Aug 25, 2022
9635ad6
Clean up Cytoscape util and Dockerfile
agitter Aug 25, 2022
86702ac
Revert util changes
agitter Aug 25, 2022
713466a
Switch Cytoscape wrapper to use generic util
agitter Aug 25, 2022
3566390
Move Docker image to reedcompbio
agitter Aug 25, 2022
79d5825
Allow file paths and pathway labels as arguments
agitter Aug 26, 2022
4d1f3e0
Add Cytoscape Docker image build to workflow
agitter Aug 26, 2022
075013e
Implement Singularity support for Cytoscape
agitter Aug 26, 2022
d66ca3c
Fix merge conflicts
agitter May 3, 2023
0439541
Add initial readme
agitter May 27, 2023
ad71a8d
Update Cytoscape for tab delimited pathways
agitter May 27, 2023
d9553e4
Simplify Cytoscape connection by removing x11 and novnc
agitter May 27, 2023
92d53fa
Remove dependence on root inside image
agitter May 28, 2023
f139e4b
Remove unused auth file
agitter May 28, 2023
fdab0d7
Create vmoptions and add to Docker image
agitter Jun 10, 2023
47ce1f1
Fix line endings and change memory usage
agitter Jun 10, 2023
c2150a6
Update TODOs in readme
agitter Jun 22, 2023
b4a4def
Merge with master
agitter Jun 22, 2023
72580c9
Run auto formatter
agitter Jun 22, 2023
ec81f28
Merge branch 'master' into cytoscape
agitter Aug 25, 2023
694b0f9
Set HOME in Dockerfile
agitter Sep 1, 2023
1d32311
Add Cytoscape zip error and alternative HOME workarounds
agitter Sep 2, 2023
d894c46
Support Singularity HOME environment variable
agitter Sep 2, 2023
bd6c7d7
Set up Cytoscape config file volume mapping
agitter Sep 8, 2023
fa8a8e5
Write one Cytoscape session per dataset
agitter Sep 8, 2023
c3fa558
Container now works in Singularity
agitter Sep 14, 2023
cfc85ca
Rename and document Cytoscape function
agitter Sep 15, 2023
62edad1
Add Cytoscape test
agitter Sep 15, 2023
344e1e7
Misc cleanup
agitter Sep 15, 2023
3c5e238
Add max connection retries
agitter Sep 16, 2023
4508c43
Match Apptainer version to version on test server
agitter Sep 21, 2023
fedc826
Mark test expected to fail
agitter Sep 21, 2023
ddc5b94
Add v1 tag to Cytoscape image
agitter Sep 22, 2023
9622918
Remove file with bad line endings
agitter Sep 22, 2023
80e70df
Add back vmoptions file
agitter Sep 22, 2023
7 changes: 6 additions & 1 deletion PRRunner.py
@@ -1,11 +1,13 @@
import Dataset

import os
# supported algorithm imports
from src.meo import MEO as meo
from src.omicsintegrator1 import OmicsIntegrator1 as omicsintegrator1
from src.omicsintegrator2 import OmicsIntegrator2 as omicsintegrator2
from src.pathlinker import PathLinker as pathlinker

ROOT = os.path.dirname(os.path.abspath(__file__))

def run(algorithm, params):
    """
    A generic interface to the algorithm-specific run functions
@@ -67,4 +69,7 @@ def parse_output(algorithm, raw_pathway_file, standardized_pathway_file):
        algorithm_runner = globals()[algorithm.lower()]
    except KeyError:
        raise NotImplementedError(f'{algorithm} is not currently supported')

    with open(os.path.join(ROOT, 'output', 'pathways.txt'), 'a') as f:
        f.write(f"{standardized_pathway_file}\n")
    return algorithm_runner.parse_output(raw_pathway_file, standardized_pathway_file)
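The lines added to `parse_output` append each standardized pathway path to `output/pathways.txt`. Opening a file in append mode does not create missing parent directories, so the write fails if `output` does not exist yet. A minimal sketch of a more defensive variant, assuming a hypothetical helper named `record_pathway` that is not part of this PR:

```python
import os


def record_pathway(root, standardized_pathway_file):
    """Append one pathway path to <root>/output/pathways.txt.

    Hypothetical helper for illustration only; the PR performs the
    write inline inside parse_output.
    """
    log_dir = os.path.join(root, 'output')
    # Append mode ('a') creates the file but not its parent directory,
    # so make sure the directory exists first.
    os.makedirs(log_dir, exist_ok=True)
    log_file = os.path.join(log_dir, 'pathways.txt')
    with open(log_file, 'a') as f:
        f.write(f"{standardized_pathway_file}\n")
    return log_file
```

Calling it twice with different paths leaves two lines in the log file, one per pathway, in call order.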
16 changes: 16 additions & 0 deletions Snakefile
@@ -2,13 +2,17 @@ import os
import PRRunner
import shutil
import yaml
import sys
from src.util import process_config
from src.analysis.summary import summary
from src.analysis.viz import graphspace

from src.analysis.cytoscape import cytoscape

# Snakemake updated the behavior in the 6.5.0 release https://github.com/snakemake/snakemake/pull/1037
# and using the wrong separator prevents Snakemake from matching filenames to the rules that can produce them
SEP = '/'
ROOT = os.path.abspath(__file__)

wildcard_constraints:
    params="params-\w+"
@@ -66,6 +70,9 @@ def make_final_input(wildcards):
        # add graph and style JSON files.
        final_input.extend(expand('{out_dir}{sep}{dataset}-{algorithm_params}{sep}gs.json',out_dir=out_dir,sep=SEP,dataset=dataset_labels,algorithm_params=algorithms_with_params))
        final_input.extend(expand('{out_dir}{sep}{dataset}-{algorithm_params}{sep}gsstyle.json',out_dir=out_dir,sep=SEP,dataset=dataset_labels,algorithm_params=algorithms_with_params))

    if config["analysis"]["cytoscape"]["include"]:
        final_input.extend(expand('{out_dir}{sep}cytoscape-session.cys',out_dir=out_dir,sep=SEP))

    if len(final_input) == 0:
        # No analysis added yet, so add reconstruction output files if they exist.
@@ -76,6 +83,7 @@ def make_final_input(wildcards):
    final_input.extend(expand('{out_dir}{sep}logs{sep}parameters-{algorithm_params}.yaml', out_dir=out_dir, sep=SEP, algorithm_params=algorithms_with_params))
    final_input.extend(expand('{out_dir}{sep}logs{sep}datasets-{dataset}.yaml', out_dir=out_dir, sep=SEP, dataset=dataset_labels))


    return final_input

# A rule to define all the expected outputs from all pathway reconstruction
@@ -219,6 +227,14 @@ rule viz_graphspace:
    run:
        graphspace.write_json(input.standardized_file,output.graph_json,output.style_json,directed=algorithm_directed[wildcards.algorithm])

rule viz_cytoscape:
    input: pathways = expand('{out_dir}{sep}{dataset}-{algorithm_params}{sep}pathway.txt', out_dir=out_dir, sep=SEP, dataset=dataset_labels, algorithm_params=algorithms_with_params)

    output:
        session = SEP.join([out_dir, 'cytoscape-session.cys'])
    run:
        cytoscape.run_cytoscape_container(input.pathways, out_dir)

# Remove the output directory
rule clean:
    shell: f'rm -rf {out_dir}'
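The Snakefile change requests the Cytoscape session file only when `analysis.cytoscape.include` is true, which in turn triggers the `viz_cytoscape` rule. The gating logic can be sketched in plain Python without Snakemake. The function name `cytoscape_targets` is hypothetical, and treating a missing `cytoscape` section as disabled is an assumption; the Snakefile in this diff indexes the keys directly and would raise `KeyError` instead.

```python
def cytoscape_targets(config, out_dir, sep='/'):
    """Return the Cytoscape session target if the analysis step is enabled.

    Hypothetical helper for illustration; not part of this PR.
    """
    analysis = config.get('analysis', {})
    # Assumption: a missing 'cytoscape' section means the step is disabled.
    if analysis.get('cytoscape', {}).get('include', False):
        return [sep.join([out_dir, 'cytoscape-session.cys'])]
    return []


enabled = cytoscape_targets({'analysis': {'cytoscape': {'include': True}}}, 'output')
disabled = cytoscape_targets({'analysis': {'summary': {'include': True}}}, 'output')
```

Here `enabled` holds the single session path and `disabled` is an empty list, so older configs without the new section would keep working under this variant.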
10 changes: 7 additions & 3 deletions config/config.yaml
@@ -28,8 +28,8 @@
 algorithms:
   - name: "pathlinker"
     params:
-      include: false
-      directed: true
+      include: true
+      directed: True
       run1:
         k: range(100,201,100)

@@ -56,7 +56,7 @@
         g: [3]
   - name: "meo"
     params:
-      include: false
+      include: true
       directed: true
       run1:
         max_path_length: [3]
@@ -107,3 +107,7 @@
   graphspace:

     include: true
+
+  cytoscape:
+
+    include: true
113 changes: 113 additions & 0 deletions config/config_cyto.yaml
@@ -0,0 +1,113 @@
# Global workflow control

# The length of the hash used to identify a parameter combination
hash_length: 7
# If true, use Singularity instead of Docker
# Singularity support is only available on Unix
singularity: false

# This list of algorithms should be generated by a script which checks the filesystem for installs.
# It shouldn't be changed by mere mortals. (alternatively, we could add a path to executable for each algorithm
# in the list to reduce the number of assumptions of the program at the cost of making the config a little more involved)
# Each algorithm has an 'include' parameter. By toggling 'include' to true/false the user can change
# which algorithms are run in a given experiment.
#
# algorithm-specific parameters are embedded in lists so that users can specify multiple. If multiple
# parameters are specified then the algorithm will be run as many times as needed to cover all parameter
# combinations. For instance if we have the following:
#   - name: "myAlg"
#     params:
#       include: true
#       directed: true
#       a: [1,2]
#       b: [0.5,0.75]
#
# then myAlg will be run on (a=1,b=0.5),(a=1,b=0.75),(a=2,b=0.5), and (a=2,b=0.75). Pretty neat, but be
# careful: too many parameters might make your runs take a long time.

algorithms:
  - name: "pathlinker"
    params:
      include: true
      directed: true
      run1:
        k: range(100,201,100)

  - name: "omicsintegrator1"
    params:
      include: true
      directed: false
      run1:
        r: [5]
        b: [5, 10]
        w: [5]
        g: [3]
        d: [10]

  - name: "omicsintegrator2"
    params:
      include: false
      directed: false
      run1:
        b: [4]
        g: [0]
      run2:
        b: [2]
        g: [3]
  - name: "meo"
    params:
      include: true
      directed: true
      run1:
        max_path_length: [3]
        local_search: ["Yes"]
        rand_restarts: [10]

# Here we specify which pathways to run and other file location information.
# DataLoader.py can currently only load a single dataset
# Assume that if a dataset label does not change, the lists of associated input files do not change
datasets:
  -
    label: data0
    node_files: ["node-prizes.txt", "sources.txt", "targets.txt"]
    # DataLoader.py can currently only load a single edge file, which is the primary network
    edge_files: ["network.txt"]
    # Placeholder
    other_files: []
    # Relative path from the spras directory
    data_dir: "input"
  -
    label: data1
    # Reuse some of the same sources file as 'data0' but different network and targets
    node_files: ["sources.txt", "alternative-targets.txt"]
    edge_files: ["alternative-network.txt"]
    other_files: []
    # Relative path from the spras directory
    data_dir: "input"

# If we want to reconstruct then we should set run to true.
# TODO: if include is true above but run is false here, algs are not run.
# is this the behavior we want?
reconstruction_settings:

  #set where everything is saved
  locations:

    #place the save path here
    # TODO move to global
    reconstruction_dir: "output"

  run: true

analysis:
  summary:

    include: true

  graphspace:

    include: true

  cytoscape:

    include: true
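The header comment in `config_cyto.yaml` says that listing several values per parameter runs the algorithm once per combination. That expansion is a Cartesian product over the value lists. A minimal sketch using `itertools.product`, assuming a hypothetical function `parameter_combinations`; it illustrates the idea and is not the parsing code used by the workflow:

```python
from itertools import product


def parameter_combinations(params):
    """Yield one dict per combination of the listed parameter values.

    Hypothetical helper for illustration; `params` maps each parameter
    name to a list of candidate values, as in the 'myAlg' comment.
    """
    names = sorted(params)  # fixed order so results are reproducible
    for values in product(*(params[name] for name in names)):
        yield dict(zip(names, values))


combos = list(parameter_combinations({'a': [1, 2], 'b': [0.5, 0.75]}))
```

For the `myAlg` example this yields four dictionaries, matching the four runs listed in the comment. The count grows multiplicatively with each added list, which is why the comment warns about long run times.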
84 changes: 84 additions & 0 deletions config/egfr.yaml
@@ -0,0 +1,84 @@
# The length of the hash used to identify a parameter combination
hash_length: 7

# If true, use Singularity instead of Docker
# Singularity support is only available on Unix
singularity: false

algorithms:
  -
    name: pathlinker
    params:
      directed: true
      include: true
      run1:
        k:
          - 20
          - 200
  -
    name: omicsintegrator1
    params:
      directed: false
      include: true
      run1:
        b:
          - 0.55
          - 2
          - 10
        d:
          - 10
        g:
          - 1e-3
        r:
          - 0.01
        w:
          - 0.5
        mu:
          - 0.008
  -
    name: omicsintegrator2
    params:
      directed: false
      include: false
      run1:
        b:
          - 4
        g:
          - 0
      run2:
        b:
          - 2
        g:
          - 3
  -
    name: meo
    params:
      directed: true
      include: false
      run1:
        local_search:
          - "Yes"
        max_path_length:
          - 3
        rand_restarts:
          - 10
datasets:
  -
    data_dir: input
    edge_files:
      - phosphosite-irefindex13.0-uniprot.txt
    label: tps-egfr
    node_files:
      - tps-egfr-prizes.txt
    other_files: []
reconstruction_settings:
  locations:
    reconstruction_dir: output/egfr
  run: true
analysis:
  graphspace:
    include: false
  summary:
    include: true
  cytoscape:
    include: true