working with slurm cluster #8

@ntoia

Hello,

We have installed DeepETPicker on our SLURM cluster. While testing it, I ran into a problem when starting training: the program does not seem to detect the GPUs on the cluster. Is DeepETPicker not yet compatible with cluster use, or am I missing a step in the procedure?

Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/DeepETPicker/train.py", line 352, in train_func
    runner = Trainer(min_epochs=min(50, args.max_epoch),
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/connectors/env_vars_connector.py", line 41, in overwrite_by_env_vars
    return fn(self, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 345, in __init__
    self.accelerator_connector.on_trainer_init(
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/accelerators/accelerator_connector.py", line 101, in on_trainer_init
    self.trainer.data_parallel_device_ids = device_parser.parse_gpu_ids(self.trainer.gpus)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/utilities/device_parser.py", line 78, in parse_gpu_ids
    gpus = _sanitize_gpu_ids(gpus)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/utilities/device_parser.py", line 139, in _sanitize_gpu_ids
    raise MisconfigurationException(f"""
pytorch_lightning.utilities.exceptions.MisconfigurationException:
            You requested GPUs: [1]
            But your machine only has: []
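
The error above says GPU index 1 was requested while no CUDA devices were visible to the process. A small check like the following, run inside the same SLURM allocation as the training job, might help narrow down whether the GPUs are hidden from the job or only the requested index is off. This is a rough sketch, not part of DeepETPicker or pytorch_lightning: the helper names (`visible_gpu_count`, `missing_gpu_ids`, `gpu_report`) are made up for illustration, and it assumes the cluster exports `CUDA_VISIBLE_DEVICES` and `SLURM_JOB_GPUS` for GPU jobs, which depends on the site configuration.

```python
import os


def visible_gpu_count():
    """Return how many CUDA devices PyTorch sees, or None if torch cannot be imported."""
    try:
        import torch
    except ImportError:
        return None
    return torch.cuda.device_count()


def missing_gpu_ids(requested, device_count):
    """Return the requested GPU indices that fall outside range(device_count)."""
    return [i for i in requested if i < 0 or i >= device_count]


def gpu_report(env=None):
    """Collect the GPU-related environment variables and the device count seen by torch."""
    env = os.environ if env is None else env
    return {
        # Assumed to be set by SLURM for GPU jobs; may be absent on some clusters.
        "CUDA_VISIBLE_DEVICES": env.get("CUDA_VISIBLE_DEVICES"),
        "SLURM_JOB_GPUS": env.get("SLURM_JOB_GPUS"),
        "torch_device_count": visible_gpu_count(),
    }


if __name__ == "__main__":
    report = gpu_report()
    for key, value in report.items():
        print(f"{key} = {value}")
    count = report["torch_device_count"]
    if count is not None:
        # [1] mirrors the "You requested GPUs: [1]" line in the traceback.
        print("requested ids not available:", missing_gpu_ids([1], count))
```

If `torch_device_count` comes back as 0, the job probably was not granted a GPU (or the CUDA driver is not reachable from the node or container). If it comes back as 1, index 1 is still out of range, because visible devices are numbered from 0, so requesting GPU 0 may be what is needed.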

Thanks in advance!
