| 1 | +.. _resource_allocation: |
| 2 | + |
| 3 | +.. index :: _resource_allocation |
| 4 | + |
| 5 | +Qiita's per-job resource allocation |
| 6 | +=================================== |
| 7 | + |
| 8 | +Qiita requests specific resource allocations based on the name of the command, |
| 9 | +the job type, and its definition in the database. These definitions are stored in the |
| 10 | +qiita.processing_job_resource_allocation table, which has a name |
| 11 | +(the name of the job), a description, a job_type (more below), and the allocation for |
| 12 | +that job. |
| 13 | + |
| 14 | +Job types |
| 15 | +--------- |
| 16 | + |
| 17 | +The Qiita job types allow us to group jobs based on what they do and |
| 18 | +to avoid possible name conflicts, while at the same time keeping this separation |
| 19 | +simple. |
| 20 | + |
| 21 | +#. RESOURCE_PARAMS_COMMAND: This is the most common entry, as it defines the allocation |
| 22 | + for a specific command name, like "Shogun v1.0.7" or "Beta diversity (phylogenetic)". |
| 23 | + For the complete list of commands visit: `Qiita Software <https://qiita.ucsd.edu/software/>`__ |
| 24 | +#. COMPLETE_JOBS_RESOURCE_PARAM: When a RESOURCE_PARAMS_COMMAND job completes, it reports whether the job |
| 25 | + finished successfully and the set of artifact(s) that need to be validated and then added to Qiita, that is, |
| 26 | + moved to their final locations and registered in the database. For these jobs the name is the actual |
| 27 | + artifact type that is being generated, for example: "per_sample_FASTQ" or "q2_visualization" |
| 28 | +#. RELEASE_VALIDATORS_RESOURCE_PARAM: The completed job will create a new job to release and coordinate |
| 29 | + all the artifact validators for a given command |
| 30 | +#. VALIDATOR: Each new artifact needs a validator, which depends on the Qiita plugin that defined |
| 31 | + that artifact type. Similar to COMPLETE_JOBS_RESOURCE_PARAM, here the name of the job is the |
| 32 | + artifact type being validated |
| 33 | +#. REGISTER: Used to install or register a new plugin and its commands in the Qiita system |
| 34 | + |
| 35 | +Note that each of these job types has a default entry (its name is default), so if there is no definition |
| 36 | +for a given command or artifact, Qiita will use the default resources. |
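
The fallback to the default entry can be pictured with a small sketch. It is only an illustration, not Qiita's actual code: the `ALLOCATIONS` dictionary is a hypothetical in-memory stand-in for the database table, the `find_allocation` helper is invented for this example, and every allocation value other than the first one is made up.

```python
# Hypothetical stand-in for rows of qiita.processing_job_resource_allocation,
# keyed by (job_type, name). All values except the first are made up.
ALLOCATIONS = {
    ("RESOURCE_PARAMS_COMMAND", "default"):
        "-q qiita -l nodes=1:ppn=1 -l mem=8gb -l walltime=300:00:00",
    ("RESOURCE_PARAMS_COMMAND", "Shogun v1.0.7"):
        "-q qiita -l nodes=1:ppn=5 -l mem=16gb -l walltime=130:00:00",
    ("VALIDATOR", "default"):
        "-q qiita -l nodes=1:ppn=1 -l mem=1gb -l walltime=4:00:00",
}


def find_allocation(job_type, name, table=ALLOCATIONS):
    """Return the entry for (job_type, name), else that job type's default."""
    try:
        return table[(job_type, name)]
    except KeyError:
        # No specific definition: fall back to the entry named "default".
        return table[(job_type, "default")]
```

With this sketch, a VALIDATOR job for "q2_visualization" has no entry of its own, so it would receive the VALIDATOR default allocation.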
| 37 | + |
| 38 | +Resource allocation |
| 39 | +------------------- |
| 40 | + |
| 41 | +The allocation of each job is what a user would normally use to define resources when |
| 42 | +submitting a job to a queueing system, for example: `-q qiita -l nodes=1:ppn=1 -l mem=8gb -l walltime=300:00:00` |
| 43 | + |
| 44 | +We have defined some "internal" rules: |
| 45 | + |
| 46 | +#. Always submit to the qiita queue |
| 47 | +#. Request memory using mem (memory for the full job); we suggest 1G as the |
| 48 | + minimum request, as there is no benefit in requesting 700M instead of 1G |
| 49 | +#. The nodes and cores allocations should be in the form of: nodes=<num>:ppn=<num> |
| 50 | +#. Always request walltime! |
| 51 | +#. The queueing system uses mem for vacating jobs, not vmem, so focus on mem utilization (ignore |
| 52 | + vmem - at least for now) |
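
As an illustration of these rules, the following sketch checks an allocation string against them. The `check_allocation` helper and its messages are hypothetical and not part of Qiita; the regular expressions assume the `-q`/`-l` option style of the example above.

```python
import re


def check_allocation(allocation):
    """Return a list of rule violations for a resource allocation string."""
    problems = []
    # Rule: always submit to the qiita queue.
    if not re.search(r"(^|\s)-q qiita(\s|$)", allocation):
        problems.append("not submitted to the qiita queue")
    # Rule: request memory with mem ("-l vmem=" does not satisfy this).
    if not re.search(r"-l mem=\S+", allocation):
        problems.append("missing mem request")
    # Rule: nodes and cores in the form nodes=<num>:ppn=<num>.
    if not re.search(r"-l nodes=\d+:ppn=\d+(\s|$)", allocation):
        problems.append("nodes/cores not in the form nodes=<num>:ppn=<num>")
    # Rule: always request walltime.
    if not re.search(r"-l walltime=\S+", allocation):
        problems.append("missing walltime request")
    return problems
```

An empty list means the string follows all the rules checked here; the minimum memory suggestion is not enforced by this sketch.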
| 53 | + |
| 54 | +Resource allocation by formula |
| 55 | +------------------------------ |
| 56 | + |
| 57 | +It is possible to define a memory allocation by a formula using the values: "{samples}" - the |
| 58 | +number of samples in the information file, "{columns}" - the number of columns in the information file, |
| 59 | +and "{input_size}" - the total size of the artifact type (in bytes). |
| 60 | + |
| 61 | +Some examples: |
| 62 | + |
| 63 | +#. Request 1K per sample: `{samples}*1000` -> `-q qiita -l nodes=1:ppn=5 -l mem={samples}*1000 -l walltime=130:00:00` |
| 64 | +#. Request a base of 4M plus 1M for each sample and each column: |
| 65 | + `(({samples}+{columns})*1000000)+4000000` -> `-q qiita -l nodes=1:ppn=5 -l mem=(({samples}+{columns})*1000000)+4000000 |
| 66 | + -l walltime=130:00:00` |
| 67 | +#. Request at least 2G and grow based on input size: `{input_size}+(2*1e+9)` -> `-q qiita -l nodes=1:ppn=5 -l |
| 68 | + mem={input_size}+(2*1e+9) -l walltime=130:00:00` |
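
To make the arithmetic concrete, here is a minimal sketch of how such a formula could be resolved into a plain number. It is an assumption about one possible approach, not Qiita's implementation: the `resolve_allocation` and `_evaluate` helpers are hypothetical, and the sketch only supports numbers, parentheses and the `+ - * /` operators.

```python
import ast
import operator
import re

# Arithmetic operators allowed in a memory formula (an assumption of this sketch).
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def _evaluate(node):
    """Evaluate a parsed expression made only of numbers and + - * /."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    raise ValueError("unsupported expression in memory formula")


def resolve_allocation(allocation, samples, columns, input_size):
    """Fill the placeholders and replace a mem formula with its numeric value."""
    filled = allocation.format(
        samples=samples, columns=columns, input_size=input_size)

    def to_number(match):
        try:
            value = _evaluate(ast.parse(match.group(1), mode="eval"))
        except (SyntaxError, ValueError):
            # Not a formula (for example "8gb"): leave the request untouched.
            return match.group(0)
        return "mem=%d" % value

    return re.sub(r"mem=(\S+)", to_number, filled)


# First example above, for a study with 20 samples.
print(resolve_allocation(
    "-q qiita -l nodes=1:ppn=5 -l mem={samples}*1000 -l walltime=130:00:00",
    samples=20, columns=5, input_size=0))
# prints: -q qiita -l nodes=1:ppn=5 -l mem=20000 -l walltime=130:00:00
```

Parsing the expression with `ast` instead of calling `eval` keeps the sketch limited to plain arithmetic, so an unexpected value in the table cannot run arbitrary code.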