I have a dataset with labels 1 to 121 and an unknown label 0.
I am a little confused as to what value I have to set --label_nc to, and whether I should use the flag --contain_dontcare_label. I tried --label_nc 121 --contain_dontcare_label, and this gives me a CUDA error like this:
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [688,0,0], thread: [55,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [688,0,0], thread: [56,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [688,0,0], thread: [57,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [50,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [51,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [52,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [53,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [54,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [55,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [21,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [22,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [23,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [688,0,0], thread: [29,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
Traceback (most recent call last):
  File "train.py", line 43, in <module>
    trainer.run_generator_one_step(data_i)
  File "/home/user/work/anom-seg-eval/SynthCP/spade-caos/trainers/pix2pix_trainer.py", line 36, in run_generator_one_step
    g_losses, generated, pred_seg = self.pix2pix_model(data, mode='generator')
  File "/home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 159, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/user/work/anom-seg-eval/SynthCP/spade-caos/models/pix2pix_model.py", line 59, in forward
    generator_input, real_image)
  File "/home/user/work/anom-seg-eval/SynthCP/spade-caos/models/pix2pix_model.py", line 183, in compute_generator_loss
    input_semantics, real_image, compute_kld_loss=self.opt.use_vae)
  File "/home/user/work/anom-seg-eval/SynthCP/spade-caos/models/pix2pix_model.py", line 247, in generate_fake
    fake_image = self.netG(input_semantics, z=z)
  File "/home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/user/work/anom-seg-eval/SynthCP/spade-caos/models/networks/generator.py", line 92, in forward
    x = F.interpolate(seg, size=(self.sh, self.sw))
  File "/home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/nn/functional.py", line 3132, in interpolate
    return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
RuntimeError: CUDA error: device-side assert triggered
terminate called after throwing an instance of 'c10::Error'
  what(): CUDA error: device-side assert triggered
Exception raised from create_event_internal at /pytorch/c10/cuda/CUDACachingAllocator.cpp:687 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f21021568b2 in /home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0xad2 (0x7f21023a8952 in /home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #2: c10::TensorImpl::release_resources() + 0x4d (0x7f2102141b7d in /home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #3: <unknown function> + 0x60246a (0x7f2150b3e46a in /home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x602516 (0x7f2150b3e516 in /home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #20: __libc_start_main + 0xe7 (0x7f2153662b97 in /lib/x86_64-linux-gnu/libc.so.6)
exp1_train_spade.sh: line 17: 10074 Aborted (core dumped) python3 train.py --name caos --dataset_mode caos --dataroot ../anomaly/exp1_train_spade --label_nc 121 --contain_dontcare_label --no_instance --niter 1 --batchSize 2 --nThread 15 --gpu_ids 0 --no_html --tf_log --dataroot_source ../anomaly/exp1_train_spade --dataroot_target ../anomaly/exp1_train_spade --image_dir ../anomaly/exp1_train_spade/train/imgs --image_dir_source ../anomaly/exp1_train_spade --image_dir_target ../anomaly/exp1_train_spade --label_dir ../anomaly/exp1_train_spade/train/segs_gs --label_dir_source ../anomaly/exp1_train_spade --label_dir_target ../anomaly/exp1_train_spade --checkpoints_dir ./checkpoints_spade
I tried other combinations of --label_nc and --contain_dontcare_label, but I haven't been able to get the training to run successfully. If you could provide some clarity on how these parameters should be set for a custom dataset, it would help me a lot!
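For reference, my understanding is that the assert comes from the scatter_-based one-hot encoding of the label map, which requires every label value to be smaller than the number of one-hot channels. Here is a minimal sketch of what I believe is happening (illustrative only, not the actual repo code):

import torch

# Sketch of a SPADE-style preprocess_input step (names are illustrative).
label_nc = 121
contain_dontcare_label = True
# Number of one-hot channels: one extra channel when a don't-care label is used.
nc = label_nc + 1 if contain_dontcare_label else label_nc

# Toy label map of shape (N, 1, H, W); values must lie in [0, nc).
label_map = torch.randint(0, nc, (1, 1, 4, 4))

# If any value were >= nc, the scatter_ below would trigger the same
# "index out of bounds" device-side assert on CUDA.
assert 0 <= int(label_map.min()) and int(label_map.max()) < nc, \
    f"label values must be in [0, {nc})"

one_hot = torch.zeros(1, nc, 4, 4)
one_hot.scatter_(1, label_map, 1.0)  # one-hot semantic map fed to the generator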
I have faced a similar issue. Please organize your label IDs from 0 to N, where N is the total number of your classes, and set all don't-care labels to 255.
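A minimal sketch of that remapping (an illustrative helper, not part of the repo), assuming your original encoding is 0 = unknown and 1-121 = classes:

import numpy as np
from PIL import Image

def remap_label(in_path, out_path):
    """Remap a label PNG: classes 1..121 -> 0..120, unknown 0 -> 255 (don't-care)."""
    label = np.array(Image.open(in_path), dtype=np.uint8)
    out = np.full_like(label, 255)   # start as don't-care everywhere
    known = label > 0                # pixels belonging to the 121 known classes
    out[known] = label[known] - 1    # shift class IDs down to 0..120
    Image.fromarray(out).save(out_path)

# Usage (paths are illustrative):
# remap_label("train/segs_gs/0001.png", "train/segs_remapped/0001.png")

Assuming this fork keeps SPADE's convention of mapping the 255 value to label_nc in the dataloader, --label_nc 121 together with --contain_dontcare_label should then correspond to the 121 known classes plus the don't-care label.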