The ClusterManagers.jl package implements support for the different job queue systems commonly used on compute clusters.
Warning
This package is not currently being actively maintained or tested.
We are in the process of splitting this package up into multiple smaller packages, with a separate package for each job queue system.
We are seeking maintainers for these new packages. If you are an active user of any of the job queue systems listed below and are interested in being a maintainer, please open a GitHub issue: say that you are interested in being a maintainer, and specify which job queue system you use.
The following managers are implemented in this package (the ClusterManagers.jl package):

| Job queue system | Command to add processors |
| --- | --- |
| Local manager with CPU affinity setting | addprocs(LocalAffinityManager(;np=CPU_CORES, mode::AffinityMode=BALANCED, affinities=[]); kwargs...) |
The following managers are implemented in external packages:

| Job queue system | External package | Command to add processors |
| --- | --- | --- |
| Slurm | SlurmClusterManager.jl | addprocs(SlurmManager(); kwargs...) |
| Load Sharing Facility (LSF) | LSFClusterManager.jl | addprocs_lsf(np::Integer; bsub_flags=``, ssh_cmd=``) or addprocs(LSFManager(np, bsub_flags, ssh_cmd, retry_delays, throttle)) |
| ElasticManager | ElasticClusterManager.jl | addprocs(ElasticManager(...); kwargs...) |
| Kubernetes (K8s) | K8sClusterManagers.jl | addprocs(K8sClusterManager(np; kwargs...)) |
| Azure scale-sets | AzManagers.jl | addprocs(vmtemplate, n; kwargs...) |
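For example, launching Slurm workers with the external SlurmClusterManager.jl package might look like the following sketch. It assumes SlurmClusterManager.jl is installed and that the script is run inside an existing Slurm allocation (e.g. via sbatch or salloc); the printed message is illustrative.

```julia
# Sketch: add one Julia worker per Slurm task in the current allocation.
# Assumes SlurmClusterManager.jl is installed and this script runs inside
# a Slurm allocation (e.g. started with `sbatch` or `salloc`).
using Distributed
using SlurmClusterManager

addprocs(SlurmManager())

# Each worker reports which host it landed on.
@everywhere println("worker $(myid()) on $(gethostname())")
```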
Warning
The following managers are not currently being actively maintained or tested, and we are seeking maintainers for them. If you are an active user of any of the job queue systems listed below and are interested in being a maintainer, please open a GitHub issue: say that you are interested in being a maintainer, and specify which job queue system you use.
| Job queue system | Command to add processors |
| --- | --- |
| Sun Grid Engine (SGE) via qsub | addprocs_sge(np::Integer; qsub_flags=``) or addprocs(SGEManager(np, qsub_flags)) |
| Sun Grid Engine (SGE) via qrsh | addprocs_qrsh(np::Integer; qsub_flags=``) or addprocs(QRSHManager(np, qsub_flags)) |
| PBS (Portable Batch System) | addprocs_pbs(np::Integer; qsub_flags=``) or addprocs(PBSManager(np, qsub_flags)) |
| Scyld | addprocs_scyld(np::Integer) or addprocs(ScyldManager(np)) |
| HTCondor | addprocs_htc(np::Integer) or addprocs(HTCManager(np)) |
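As a sketch, adding SGE workers through qsub might look like the following. The queue name passed in qsub_flags is a placeholder for a site-specific queue, and the snippet assumes a working SGE installation reachable from the master process.

```julia
# Sketch: request 4 workers from Sun Grid Engine via qsub.
# Assumes ClusterManagers.jl is installed and `qsub` is available;
# the queue name below is an illustrative placeholder.
using Distributed
using ClusterManagers

# `qsub_flags` is a backtick Cmd of extra flags passed through to qsub.
addprocs_sge(4; qsub_flags=`-q all.q`)
```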
You can also write your own custom cluster manager; see the instructions in the Julia manual.
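As a rough sketch of that interface (based on the Distributed standard library's ClusterManager API; the manager name and the launch-on-localhost strategy here are illustrative, not part of this package):

```julia
using Distributed

# Illustrative custom manager that launches `np` workers on the local host.
struct ToyLocalManager <: ClusterManager
    np::Int
end

function Distributed.launch(m::ToyLocalManager, params::Dict, launched::Array, c::Condition)
    for _ in 1:m.np
        # Start a worker process; `--worker` makes it connect back to the master.
        io = open(`$(params[:exename]) $(params[:exeflags]) --worker`, "r")
        wconfig = WorkerConfig()
        wconfig.io = io
        push!(launched, wconfig)
        notify(c)  # tell addprocs a worker is ready to be connected
    end
end

# React to worker lifecycle events (:register, :deregister, :interrupt, :finalize).
Distributed.manage(::ToyLocalManager, id::Integer, config::WorkerConfig, op::Symbol) = nothing
```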
Slurm: please see SlurmClusterManager.jl
For Slurm, please see the SlurmClusterManager.jl package.
Using LocalAffinityManager (for pinning local workers to specific cores):

- Linux only feature.
- Requires the Linux `taskset` command to be installed.
- Usage: `addprocs(LocalAffinityManager(;np=CPU_CORES, mode::AffinityMode=BALANCED, affinities=[]); kwargs...)`.

where:

- `np` is the number of workers to be started.
- `affinities`, if specified, is a list of CPU IDs. As many workers as there are entries in `affinities` are launched, and each worker is pinned to the specified CPU ID.
- `mode` (used only when `affinities` is not specified) can be either `COMPACT` or `BALANCED`. `COMPACT` pins the requested number of workers to cores in increasing order, for example worker1 => CPU0, worker2 => CPU1, and so on. `BALANCED` tries to spread the workers across CPU sockets, which is useful when there are multiple CPU sockets, each with multiple cores. The default is `BALANCED`.
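A minimal usage sketch, assuming a Linux machine with `taskset` installed; the CPU IDs and worker count below are illustrative:

```julia
# Sketch: pin local workers to specific cores with LocalAffinityManager.
# Linux only; requires the `taskset` command.
using Distributed
using ClusterManagers

# Pin one worker each to CPU IDs 0 and 2.
addprocs(LocalAffinityManager(; affinities=[0, 2]))

# Or: start 8 workers spread across CPU sockets (BALANCED is the default mode).
# addprocs(LocalAffinityManager(; np=8, mode=BALANCED))
```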
For `ElasticManager`, please see the ElasticClusterManager.jl package.
See docs/sge.md