# Generated from Wang-Bioinformatics-Lab/Nextflow_Workflow_Template
workflowname: NP3_MS_Workflow_nextflow
workflowdescription: This is the NP3 MS Workflow v1.3.0 for GNPS2 using Nextflow.
workflowlongdescription: The NP3 MS Workflow is a pipeline for processing and analyzing LC-MS/MS metabolomics data, focused on untargeted data. The NP3 *run* command is implemented here, which executes Steps 2 to 10. The Pre Process command (Step 2) is executed separately with the provided parameters. ** Check out the NP³ repository for details about these commands and the pipeline steps - https://github.com/danielatrivella/NP3_MS_Workflow **
workflowversion: "2025.11.26"
workflowfile: nf_workflow.nf
workflowautohide: false
adminonly: false
#This maps the parameters from an input form to those that will appear in nextflow
parameterlist:
- displayname: Mandatory Parameters - Files Selection
  paramtype: section
- displayname: Input Metadata Table
  paramtype: fileselector
  nf_paramname: metadata
  formplaceholder: Enter the path to the metadata table CSV file.
  formvalue: ""
  targettaskfolder: metadata
  optional: false
  selectsinglefile: true
  tooltip: "The path to the metadata table, following the format expected by NP3."
- displayname: Input Raw Data Folder
  paramtype: fileselector
  nf_paramname: raw_data_path
  formplaceholder: Enter the path to the raw data folder.
  formvalue: ""
  targettaskfolder: raw_data_path
  optional: false
  selectsinglefile: false
  folderunroll: true
  tooltip: "The path to the folder containing the input LC-MS/MS raw spectra data files (mzXML format is recommended). The pre_process (Step 2) output will be stored here. To reuse a previous pre-processed result, this path should point to the directory that stores the pre_processed result folder (the 'np3_results/pre_processed_results/' folder of a previous workflow result)."
- displayname: Critical Parameters - LC-MS/MS
  paramtype: section
# critical parms for *run*
- displayname: Precursor m/z tolerance
  paramtype: text
  nf_paramname: mz_tolerance
  formplaceholder: Enter the m/z tolerance for MS2 precursor
  formvalue: "0.025"
  tooltip: "The tolerance in Daltons for the precursor m/z that determines whether two spectra will be compared and possibly joined. Used in the clustering jobs (Step 3), in the cleaning (Step 5), in the library identifications (Step 6) and in the annotation of ionization variants (Step 7, where it is also used as the fragment tolerance of the annotations) (defaults to 0.025)."
- displayname: Fragment tolerance for MS2 peaks
  paramtype: text
  nf_paramname: fragment_tolerance
  formplaceholder: Enter the tolerance in Daltons for MS2 peaks
  formvalue: "0.05"
  tooltip: "The tolerance in Daltons for fragment peaks. Peaks in the original MS/MS spectra that are closer than this are merged in the clustering jobs (Step 3). Also used in the pre-process (Step 2), in the spectra similarity comparisons and in the cleaning (Step 5) (defaults to 0.05)."
- displayname: Ion Mode
  paramtype: text
  nf_paramname: ion_mode
  formplaceholder: Enter '1' for positive or '2' for negative ion mode
  formvalue: "1"
  tooltip: "The precursor ion mode. One of the following numeric values, each corresponding to an ion adduct type: '1' for positive [M+H]+, or '2' for negative [M-H]- ion mode (defaults to '1')."
# critical parms for *pre_process*
- displayname: Critical Parameters - Pre_process
  paramtype: section
- displayname: Expected peak width of MS1 - minimum and maximum
  paramtype: text
  nf_paramname: peak_width
  formplaceholder: Enter the expected minimum,maximum peak width
  formvalue: "2,10"
  tooltip: "Two numeric values separated by a comma, without spaces and using a dot as the decimal separator, giving the expected approximate peak width in chromatographic space as a range (min,max) in seconds. The mean value is used to simulate the width of the fake MS1 peaks (see the documentation; defaults to '2,10')."
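# Worked example (illustrative only): with the default peak_width "2,10", and
# assuming the mean is the plain arithmetic mean of the two values, the fake
# MS1 peaks would be simulated with a width of (2 + 10) / 2 = 6 seconds.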
- displayname: MS1 and MS2 retention time deviation
  paramtype: text
  nf_paramname: rt_tolerance_deviation
  formplaceholder: Enter the retention time deviation between MS1 and MS2
  formvalue: "3.0"
  tooltip: "The retention time tolerance in seconds used to enlarge the MS1 peak boundaries and accept as a match all MS2 ions that have a retention time value within the enlarged MS1 peak range. This tolerance is applied to both sides of the MS1 peaks (RTmin - tolerance and RTmax + tolerance). It helps compensate for poor MS1 peak integration (defaults to 3 s)."
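# Worked example (illustrative only, hypothetical peak): with the default of
# 3.0 s, an MS1 peak spanning RTmin = 120 s to RTmax = 130 s would be enlarged
# to the range 117 s to 133 s, and MS2 ions with a retention time inside that
# range would be accepted as matches.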
- displayname: MS1 and MS2 m/z deviation
  paramtype: text
  nf_paramname: mz_tolerance_deviation
  formplaceholder: Enter the m/z deviation between MS1 and MS2
  formvalue: "0.05"
  tooltip: "The tolerance in Daltons for matching an MS1 peak m/z with an MS2 spectrum precursor m/z (defaults to 0.05)."
- displayname: PPM tolerance for MS1
  paramtype: text
  nf_paramname: ppm_tolerance
  formplaceholder: Enter the PPM tolerance for MS1
  formvalue: "15"
  tooltip: "The maximum tolerated m/z deviation between consecutive MS1 scans, in parts per million (ppm), for the initial ROI definition of the R::xcms::centWave algorithm (Step 2). Typically set to a generous multiple of the mass accuracy of the mass spectrometer (defaults to 15)."
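# Worked example (illustrative only): a ppm tolerance corresponds to an
# absolute deviation of mz * ppm / 1e6, so the default of 15 ppm at m/z 500
# allows 500 * 15 / 1e6 = 0.0075 Da.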
# other parms for *run*
- displayname: Clustering Parameters
  paramtype: section
- displayname: Retention time tolerances
  paramtype: text
  nf_paramname: rt_tolerance
  formplaceholder: Enter x,y retention time tolerances in seconds
  formvalue: "1,2"
  tooltip: "Two retention time tolerances (x,y), in seconds, applied to the retention time width of the precursor to determine whether two spectra will be compared and possibly joined. They are applied directly to the retention time minimum (subtracted) and maximum (added) of the spectra, enlarging the peak boundaries to deal with misaligned samples or ionization variant spectra. The first tolerance [x] is used in Step 3 (first clustering) and Step 7 (ionization variants annotation); the second tolerance [y] is used in Step 3 (final clustering) and Step 5 (Clean) (defaults to '1,2')."
# removed for now
# - displayname: Similarity Function for Clean and MN
# paramtype: select
# nf_paramname: similarity_function
# formvalue: np3_shifted_cosine
# tooltip: "the similarity function to be used in the spectra comparison to create the pairwise similarity tables for clean (Step 5) and molecular networking (Step 10). One of 'np3_shifted_cosine' or 'spec2vec'. If 'spec2vec' is selected, the model trained on UniqueInchikey subset (12,797 spectra) is used by spec2vec in the spectra comparison; otherwise, the NP3 shifted cosine function is used (default to 'np3_shifted_cosine')."
# options:
# - value: np3_shifted_cosine
# display: np3_shifted_cosine
# - value: spec2vec
# display: spec2vec
- displayname: Trim Precursor m/z
  paramtype: select
  nf_paramname: trim_mz
  formvalue: TRUE
  tooltip: "A logical 'True' or 'False' indicating whether the fragment peaks within +-20 Da of the precursor m/z should be deleted before the pairwise comparisons. If 'True', this removes the residual precursor ion, which is frequently observed in MS/MS spectra acquired on qTOFs (defaults to 'True')."
  options:
    - value: TRUE
      display: True
    - value: FALSE
      display: False
- displayname: Noise cutoff
  paramtype: text
  nf_paramname: noise_cutoff
  formplaceholder: Enter the noise cutoff value
  formvalue: "FALSE"
  tooltip: "A positive numeric value used to scale the interquartile range (IQR) of the basePeakInt distribution of the blank spectra from the clustering (Step 3) result. After the clean (Step 5), spectra with a basePeakInt value below the distribution median plus IQR*noise_cutoff are removed. Set to FALSE to disable it. When no blank sample is present in the metadata, the full distribution is used. This cutoff affects spectra with a low basePeakInt value, which are probably noise features."
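# Worked example (illustrative only, hypothetical numbers): if the blank
# basePeakInt distribution has a median of 1000 and an IQR of 400, then a
# noise_cutoff of "1.5" gives a threshold of 1000 + 400 * 1.5 = 1600, and
# spectra with a basePeakInt below 1600 would be removed after Step 5.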
- displayname: Molecular Networking Parameters
  paramtype: section
- displayname: Minimum Similarity
  paramtype: text
  nf_paramname: similarity_mn
  formplaceholder: Enter the minimum similarity score for molecular networking
  formvalue: "0.6"
  tooltip: "The minimum similarity score that must occur between a pair of consensus spectra to connect them with a link in the molecular network of similarity. Lower values increase the component sizes by connecting less related spectra; higher values restrict the component sizes."
- displayname: Network Top K
  paramtype: text
  nf_paramname: net_top_k
  formplaceholder: Enter the maximum top K for molecular networking
  formvalue: "15"
  tooltip: "The maximum number of connections for a single node in the molecular network of similarity. A link between two nodes is kept only if both nodes are within each other's top K most similar nodes. Keeping this value low makes very large networks (many nodes) much easier to visualize (defaults to 15)."
- displayname: Maximum Component Size
  paramtype: text
  nf_paramname: max_component_size
  formplaceholder: Enter the maximum number of nodes for the network components
  formvalue: "200"
  tooltip: "The maximum number of nodes that each component of the molecular network of similarity may have (Step 10). The links of this network are removed using an increasing cosine threshold until each component has at most this many nodes. Keeping this value low makes very large networks (many nodes and links) much easier to visualize (defaults to 200)."
- displayname: Minimum Number of Matched Peaks
  paramtype: text
  nf_paramname: min_matched_peaks
  formplaceholder: Enter the minimum number of matched peaks to connect spectra
  formvalue: "6"
  tooltip: "The minimum number of common peaks that two spectra must share to be connected by an edge in the filtered SSMN. Connections between spectra with fewer common peaks than this cutoff are removed when filtering the SSMN. The exception is when one of the spectra has fewer fragment peaks than the given min_matched_peaks value; in this case the spectra must share at least 2 peaks. The fragment peaks are counted after the spectra are normalized and cleaned (defaults to 6)."
- displayname: Optional Parameters - admins
  paramtype: section
# install_dependencies
- displayname: Install Dependencies
  paramtype: select
  nf_paramname: install_dependencies
  formvalue: "No"
  options:
    - value: "Yes"
      display: "Yes"
    - value: "No"
      display: "No"