adding some extra documentation (#3018)

antgonza · ElDeveloper · web-flow · commit a3fb699d56db · 2020-07-08T12:01:15.000-07:00
* adding some extra documentation

* Apply suggestions from code review

Co-authored-by: Yoshiki Vázquez Baeza <yoshiki@ucsd.edu>
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,23 @@
 # Qiita changelog
 
+Version 072020
+--------------
+
+* Added per-preparation LIBRARY_STRATEGY and removed the study-wide STUDY_TYPE values for EBI-ENA submissions to comply with newer metadata standards
+* Changed `Ion Torrent` to `Ion_Torrent` as described by EBI-ENA
+* Added a VALIDATOR job_type to be able to specify job validator resources
+* Added a job.shape method that returns the number of columns, samples and input size of each job based on its input artifacts
+* Added the possibility of requesting memory resources for a job based on the input size, number of samples and/or columns
+* Warnings from commands will only use the message part of the warning/errors (#2898)
+* Fixed error when deleting multiple artifacts with summaries and support_files
+* The submit button is now disabled when submitting a workflow via the GUI to avoid double clicks from users
+* Jobs will now display their "external job id" to users, in practice their barnacle job id
+* Fixed a bug that prevented the deletion of full analyses when the processing tree had multiple paths
+* Added initial script for nightly auto-processing of workflows
+* Removed legacy future dependencies from Python2.7
+* Users can see the available system plugins, their commands and resource allocations: https://qiita.ucsd.edu/software/
+* Added qiime2.2020.06 to the system, which updated these plugins: qp-qiime2, qtp-biom, qtp-diversity, qtp-visualization
+
 Version 052020
 --------------
 
diff --git a/qiita_pet/support_files/doc/source/dev/index.rst b/qiita_pet/support_files/doc/source/dev/index.rst
@@ -7,6 +7,7 @@ The following is a full list of the available developer tutorials
 
     plugins
     rest
+    resource_allocation
 
 To request documentation on any developer use-cases not addressed here,
 please add an issue `here <https://github.com/biocore/qiita/issues>`__.
diff --git a/qiita_pet/support_files/doc/source/dev/resource_allocation.rst b/qiita_pet/support_files/doc/source/dev/resource_allocation.rst
@@ -0,0 +1,68 @@
+.. _resource_allocation:
+
+.. index :: _resource_allocation
+
+Qiita's per-job resource allocation
+===================================
+
+Qiita requests specific resource allocations based on the name of the command,
+the job type and its definition in the database. These definitions are stored in the
+qiita.processing_job_resource_allocation table, which has a name
+(the name of the job), a description, a job_type (more below), and the allocation for
+that job.
+
+Job types
+---------
+
+The Qiita job types allow us to group jobs based on what they do and to
+avoid possible name conflicts, while at the same time keeping this separation
+simple.
+
+#. RESOURCE_PARAMS_COMMAND: This is the most common entry, as it defines the allocation
+   for a specific command name, like "Shogun v1.0.7" or "Beta diversity (phylogenetic)".
+   For the complete list of commands visit: `Qiita Software <https://qiita.ucsd.edu/software/>`__
+#. COMPLETE_JOBS_RESOURCE_PARAM: When a RESOURCE_PARAMS_COMMAND completes, it reports whether the job
+   finished successfully and the set of artifact(s) that need to be validated and then added to Qiita,
+   that is, moved to their final locations and registered in the database. For these jobs the name is the
+   artifact type that is being generated, for example: "per_sample_FASTQ" or "q2_visualization"
+#. RELEASE_VALIDATORS_RESOURCE_PARAM: The completed job will create a new job to release and coordinate
+   all the artifact validators for a given command
+#. VALIDATOR: Each new artifact needs a validator, which depends on the Qiita plugin that defined
+   that artifact type. Similar to COMPLETE_JOBS_RESOURCE_PARAM, here the name of the job is the
+   artifact type being validated
+#. REGISTER: Used to install or register a new plugin and its commands in the Qiita system
+
+Note that each of these job types has a default entry (the name of the entry is default), so if there is no definition
+for a given command or artifact type, the job will use the default resources.
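
The fallback to the default entry can be sketched as a small lookup. This is only an illustration: the dictionary below is a hypothetical in-memory stand-in for the qiita.processing_job_resource_allocation table, the allocation strings are made-up values, and `get_allocation` is not a function from the Qiita code base.

```python
# Hypothetical stand-in for qiita.processing_job_resource_allocation.
# Keys are (job_type, name); the allocation strings are illustrative only.
ALLOCATIONS = {
    ("RESOURCE_PARAMS_COMMAND", "default"):
        "-q qiita -l nodes=1:ppn=1 -l mem=8gb -l walltime=300:00:00",
    ("RESOURCE_PARAMS_COMMAND", "Shogun v1.0.7"):
        "-q qiita -l nodes=1:ppn=5 -l mem=16gb -l walltime=130:00:00",
    ("VALIDATOR", "default"):
        "-q qiita -l nodes=1:ppn=1 -l mem=1gb -l walltime=4:00:00",
}


def get_allocation(job_type, name):
    """Return the allocation for (job_type, name), else that job type's default."""
    try:
        return ALLOCATIONS[(job_type, name)]
    except KeyError:
        # No specific definition: fall back to the entry named "default".
        return ALLOCATIONS[(job_type, "default")]


# A command with its own entry gets it; an artifact type without one
# (here "q2_visualization") falls back to the VALIDATOR default.
print(get_allocation("RESOURCE_PARAMS_COMMAND", "Shogun v1.0.7"))
print(get_allocation("VALIDATOR", "q2_visualization"))
```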
+
+Resource allocation
+-------------------
+
+The allocation of each job is the string a user would normally use to define resources when
+submitting a job to a queueing system, for example: `-q qiita -l nodes=1:ppn=1 -l mem=8gb -l walltime=300:00:00`
+
+We have defined some "internal" rules:
+
+#. Always submit to the qiita queue
+#. Request memory using mem (memory for the full job); we suggest 1G as the
+   minimum request (there is no benefit in selecting 700M over 1G)
+#. The nodes and cores allocations should be in the form of: nodes=<num>:ppn=<num>
+#. Always request walltime!
+#. The queueing system uses mem, not vmem, for vacating jobs, so focus on mem utilization (ignore
+   vmem, at least for now)
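
As a rough aid, the rules above could be checked mechanically before an allocation string is stored. The helper below is a hypothetical sketch, not part of Qiita; it assumes the `-q` / `-l` flag layout shown in the example allocation and only looks for the patterns the rules mention.

```python
import re


def check_allocation(allocation):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    # Rule: always submit to the qiita queue.
    if not re.search(r"(^|\s)-q qiita(\s|$)", allocation):
        problems.append("not submitted to the qiita queue")
    # Rule: nodes and cores in the form nodes=<num>:ppn=<num>.
    if not re.search(r"-l nodes=\d+:ppn=\d+(\s|$)", allocation):
        problems.append("nodes/cores not in the form nodes=<num>:ppn=<num>")
    # Rule: request memory with mem ("-l vmem=" does not satisfy this).
    if not re.search(r"-l mem=\S+", allocation):
        problems.append("no mem request")
    # Rule: always request walltime (assumed here to look like HH:MM:SS).
    if not re.search(r"-l walltime=\d+:\d{2}:\d{2}(\s|$)", allocation):
        problems.append("no walltime request")
    return problems


print(check_allocation(
    "-q qiita -l nodes=1:ppn=1 -l mem=8gb -l walltime=300:00:00"))
print(check_allocation("-q other -l nodes=1:ppn=1 -l vmem=8gb"))
```

The second call reports three problems: the wrong queue, a missing mem request, and a missing walltime.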
+
+Resource allocation by formula
+------------------------------
+
+It is possible to define a memory allocation by a formula using the values: "{samples}", the
+number of samples in the information file; "{columns}", the number of columns in the information file;
+and "{input_size}", the total size of the input artifacts (in bytes).
+
+Some examples:
+
+#. Request 1K per sample: `{samples}*1000` -> `-q qiita -l nodes=1:ppn=5 -l mem={samples}*1000 -l walltime=130:00:00`
+#. Request at least 4M, plus 1M for each sample and each column:
+   `(({samples}+{columns})*1000000)+4000000` -> `-q qiita -l nodes=1:ppn=5 -l mem=(({samples}+{columns})*1000000)+4000000
+   -l walltime=130:00:00`
+   -l walltime=130:00:00`
+#. Request at least 2G and grow based on input size: `{input_size}+(2*1e+9)` -> `-q qiita -l nodes=1:ppn=5 -l
+   mem={input_size}+(2*1e+9) -l walltime=130:00:00`
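
To make the examples concrete, here is a minimal sketch of how such a formula might be resolved into a number of bytes. It is an assumption about the mechanics, not Qiita's actual implementation: `resolve_mem` and `_eval` are hypothetical names, the placeholders are filled with `str.format`, and the arithmetic is evaluated with a small whitelist over Python's `ast` module rather than `eval`.

```python
import ast
import operator
import re

# Only plain arithmetic is allowed in a formula.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def _eval(node):
    """Evaluate an AST limited to numbers, + - * / and unary minus."""
    if isinstance(node, ast.Expression):
        return _eval(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    raise ValueError("unsupported expression")


def resolve_mem(allocation, samples, columns, input_size):
    """Replace a mem formula in an allocation string with its value in bytes."""
    def _sub(match):
        formula = match.group(1)
        if "{" not in formula:
            # Fixed request such as mem=8gb: leave it untouched.
            return match.group(0)
        filled = formula.format(
            samples=samples, columns=columns, input_size=input_size)
        value = _eval(ast.parse(filled, mode="eval"))
        return "mem=%d" % int(value)
    return re.sub(r"mem=(\S+)", _sub, allocation)


# Second example above with 10 samples and 5 columns:
# ((10+5)*1000000)+4000000 = 19000000 bytes.
print(resolve_mem(
    "-q qiita -l nodes=1:ppn=5 "
    "-l mem=(({samples}+{columns})*1000000)+4000000 -l walltime=130:00:00",
    samples=10, columns=5, input_size=0))
```

Restricting evaluation to a whitelist of arithmetic nodes is a design choice for this sketch; it keeps a formula stored in the database from running arbitrary code.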