aturner-epcc edited this page Dec 3, 2012 · 5 revisions

Produce job submission scripts using a common interface.

Summary

This tool produces job submission scripts for a variety of compute resources and batch systems. It attempts to partition the work in a pseudo-optimal way and to select sensible options. It also includes any options that are compulsory on particular resources.

Limitations

  • No support for accelerator devices in compute nodes is currently available.

Usage

bolt [options] executable_file arg1 arg2 ...

The executable_file is the program or command you wish to run in your job submission script.

Note: If you are specifying a parallel job, the program or command you specify is appended to the parallel launch command (e.g. ‘mpirun’), so you cannot currently create jobs where the parallel launch command is called from within a sub-script.

Options

-A,--account <account>
Specify the account to charge the job to. If not specified then it is not included in the output.
-b,--batch <batch>
Specify the batch system to create a job submission script for. The default is specified by the resource configuration. Use the ‘-l’ option to list valid values.
-c,--code <code>
Specify a simulation code to generate a batch script for. Use the ‘-l’ option to list valid values and details on the arguments that should be provided.
-d,--threads <n>
The number of shared-memory OpenMP threads per parallel task. Default is 1.
-h,--help
Show help information.
-i,--info
Print the program licence and warranty.
-l,--list
List the resources, batch systems and codes available.
-n,--tasks <n>
Number of parallel tasks. Defaults to 1. If the number of parallel tasks is 1 then the tool will try to produce a serial job submission script (unless the ‘-p’ option is specified).
-N,--tasks-per-node <n>
Number of parallel tasks per node. Defaults to the smaller of the number of tasks and the number of cores per node for the specified resource.
-o,--output <filename>
The output filename to use. The default is “a.bolt”.
-p,--force-parallel
Force the tool to create a parallel job even if the number of tasks is 1.
-q,--queue <queue>
Specify the queue to submit the job to. This will usually be set correctly by default.
-r,--resource <resource>
Specify the resource to create a job submission script for. Default is set by the install system. Use the ‘-l’ option to list valid values.
-s,--submit
Submit the created job submission script to the batch system. The default is not to submit the job.
-t,--job-time <hh:mm:ss>
Specify the wallclock limit for the job.
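
The default for ‘-N’ described above can be sketched in a few lines of shell. This is only an illustration of the stated rule (the smaller of the task count and the cores per node), not bolt's actual implementation; the function name and the 32-core node size are assumptions made for the example.

```shell
# Hypothetical helper illustrating the documented default for -N:
# the smaller of the number of tasks and the cores per node.
default_tasks_per_node() {
    tasks=$1
    cores_per_node=$2
    if [ "$tasks" -lt "$cores_per_node" ]; then
        echo "$tasks"
    else
        echo "$cores_per_node"
    fi
}

# Assuming a resource with 32 cores per node:
default_tasks_per_node 8 32      # prints 8
default_tasks_per_node 2048 32   # prints 32
```

So a small job fits on a single node, while a large job fills every core on each node unless ‘-N’ is given explicitly.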

Examples

Serial jobs

To create a serial job submission script to run the program ‘postprocess.x’ with the arguments ‘input.file’ and ‘output.file’ for 20 minutes you would use:

bolt -t 0:20:0 postprocess.x input.file output.file

(The command ‘postprocess.x’ must be in your execution search path for the job to work correctly.)

The resulting job submission script would be in the file ‘a.bolt’. You would then need to submit the job with the job submission command on your compute resource (e.g. qsub). To have bolt submit the job for you, you can add the ‘-s’ option:

bolt -s -t 0:20:0 postprocess.x input.file output.file

If you wish to force your serial job to run in the parallel queues (for example, to use compute nodes that may be of a different architecture from the nodes that run serial jobs), add the ‘-p’ option:

bolt -p -t 0:20:0 postprocess.x input.file output.file

To specify the name of the job submission script to create (instead of the default ‘a.bolt’) you can use the ‘-o’ option:

bolt -p -t 0:20:0 -o post.bolt postprocess.x input.file output.file

(If you do not specify an output file name then the default ‘a.bolt’ will be used.)

Parallel jobs

To create a parallel job submission script to run the program ‘castep’ with the argument ‘alx3’ over 2048 cores for 6 hours you would use:

bolt -t 6:0:0 -n 2048 castep alx3

As for serial jobs, the command ‘castep’ must be in your execution search path for this script to work. Remember that this command only creates the job submission script, in a file called ‘a.bolt’; you must submit the job yourself using the job submission command (e.g. qsub) on your system. If you want to create and submit the job script automatically, add the ‘-s’ option:

bolt -s -t 6:0:0 -n 2048 castep alx3

To specify the number of parallel tasks per node (and so the number of cores used on each node), use the ‘-N’ option. For example, on a system with 32 cores per node you may want to use only 16 of them to allow extra memory for each parallel task:

bolt -t 6:0:0 -n 2048 -N 16 castep alx3
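
Halving the tasks per node doubles the number of nodes the job occupies, which is worth checking before you submit. The arithmetic can be sketched as follows; the helper name is hypothetical, and the node count bolt actually requests may be adjusted by resource-specific rules.

```shell
# Hypothetical helper: nodes needed for a given task count and
# tasks per node, rounding up when the division is not exact.
nodes_needed() {
    tasks=$1
    tasks_per_node=$2
    echo $(( (tasks + tasks_per_node - 1) / tasks_per_node ))
}

nodes_needed 2048 32   # fully populated nodes: prints 64
nodes_needed 2048 16   # half-populated nodes:  prints 128
```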

To specify the name of the job submission script to create you can use the ‘-o’ option:

bolt -t 6:0:0 -n 2048 -N 16 -o castep_job.bolt castep alx3

(If you do not specify an output file name then the default ‘a.bolt’ will be used.)
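
None of the examples above uses the ‘-d’ option for OpenMP threads. The sketch below shows how a hybrid layout might be checked and the corresponding command assembled. It assumes that ‘-d’ combines with ‘-n’ and ‘-N’ as the option descriptions suggest, and that the resource has 32 cores per node; the helper name and the task counts are illustrative only. The script prints the bolt command rather than running it.

```shell
# Assumed node size; check the real value for your resource.
cores_per_node=32

# Hypothetical helper: cores occupied on each node by a hybrid job
# (parallel tasks per node multiplied by threads per task).
cores_used_per_node() {
    tasks_per_node=$1
    threads=$2
    echo $(( tasks_per_node * threads ))
}

used=$(cores_used_per_node 8 4)
if [ "$used" -le "$cores_per_node" ]; then
    # Only options documented above are used in this command.
    echo "bolt -t 6:0:0 -n 512 -N 8 -d 4 castep alx3"
else
    echo "layout needs $used cores per node, only $cores_per_node available" >&2
fi
```

With 8 tasks per node and 4 threads per task, each node would be fully occupied (8 x 4 = 32 cores) under these assumptions.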

Bugs

If you find any bugs please report them to epcc-support@epcc.ed.ac.uk.
