I have a dataset with labels 1 to 121 and an unknown label 0.
I am a little confused as to what value I have to set --label_nc to, and whether I should use the flag --contain_dontcare_label. I tried --label_nc 121 --contain_dontcare_label, and this gives me a CUDA error like this:
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [688,0,0], thread: [55,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [688,0,0], thread: [56,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [688,0,0], thread: [57,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [50,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [51,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [52,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [53,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [54,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [55,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [21,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [22,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [686,0,0], thread: [23,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [688,0,0], thread: [29,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
Traceback (most recent call last):
  File "train.py", line 43, in <module>
    trainer.run_generator_one_step(data_i)
  File "/home/user/work/anom-seg-eval/SynthCP/spade-caos/trainers/pix2pix_trainer.py", line 36, in run_generator_one_step
    g_losses, generated, pred_seg = self.pix2pix_model(data, mode='generator')
  File "/home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 159, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/user/work/anom-seg-eval/SynthCP/spade-caos/models/pix2pix_model.py", line 59, in forward
    generator_input, real_image)
  File "/home/user/work/anom-seg-eval/SynthCP/spade-caos/models/pix2pix_model.py", line 183, in compute_generator_loss
    input_semantics, real_image, compute_kld_loss=self.opt.use_vae)
  File "/home/user/work/anom-seg-eval/SynthCP/spade-caos/models/pix2pix_model.py", line 247, in generate_fake
    fake_image = self.netG(input_semantics, z=z)
  File "/home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/user/work/anom-seg-eval/SynthCP/spade-caos/models/networks/generator.py", line 92, in forward
    x = F.interpolate(seg, size=(self.sh, self.sw))
  File "/home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/nn/functional.py", line 3132, in interpolate
    return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
RuntimeError: CUDA error: device-side assert triggered
terminate called after throwing an instance of 'c10::Error'
  what(): CUDA error: device-side assert triggered
Exception raised from create_event_internal at /pytorch/c10/cuda/CUDACachingAllocator.cpp:687 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f21021568b2 in /home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0xad2 (0x7f21023a8952 in /home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #2: c10::TensorImpl::release_resources() + 0x4d (0x7f2102141b7d in /home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #3: <unknown function> + 0x60246a (0x7f2150b3e46a in /home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x602516 (0x7f2150b3e516 in /home/user/anaconda/envs/synthcp/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #20: __libc_start_main + 0xe7 (0x7f2153662b97 in /lib/x86_64-linux-gnu/libc.so.6)
exp1_train_spade.sh: line 17: 10074 Aborted (core dumped) python3 train.py --name caos --dataset_mode caos --dataroot ../anomaly/exp1_train_spade --label_nc 121 --contain_dontcare_label --no_instance --niter 1 --batchSize 2 --nThread 15 --gpu_ids 0 --no_html --tf_log --dataroot_source ../anomaly/exp1_train_spade --dataroot_target ../anomaly/exp1_train_spade --image_dir ../anomaly/exp1_train_spade/train/imgs --image_dir_source ../anomaly/exp1_train_spade --image_dir_target ../anomaly/exp1_train_spade --label_dir ../anomaly/exp1_train_spade/train/segs_gs --label_dir_source ../anomaly/exp1_train_spade --label_dir_target ../anomaly/exp1_train_spade --checkpoints_dir ./checkpoints_spade
I tried other combinations of --label_nc and --contain_dontcare_label, but I haven't been able to get the training to run successfully. If you could provide some clarity on how these parameters should be set for a custom dataset, it would help me a lot!
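For reference, my understanding is that the assert comes from the scatter_-based one-hot encoding of the label map, which requires every label value to be smaller than the number of one-hot channels. Here is a minimal sketch of what I believe is happening (illustrative only, not the actual repo code):

import torch

# Sketch of a SPADE-style preprocess_input step (names are illustrative).
label_nc = 121
contain_dontcare_label = True
# Number of one-hot channels: one extra channel when a don't-care label is used.
nc = label_nc + 1 if contain_dontcare_label else label_nc

# Toy label map of shape (N, 1, H, W); values must lie in [0, nc).
label_map = torch.randint(0, nc, (1, 1, 4, 4))

# If any value were >= nc, the scatter_ below would trigger the same
# "index out of bounds" device-side assert on CUDA.
assert 0 <= int(label_map.min()) and int(label_map.max()) < nc, \
    f"label values must be in [0, {nc})"

one_hot = torch.zeros(1, nc, 4, 4)
one_hot.scatter_(1, label_map, 1.0)  # one-hot semantic map fed to the generator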
I have faced a similar issue. Please organize your label IDs from 0 to N, where N is the total number of your classes, and set all don't-care labels to 255.
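A minimal sketch of that remapping (an illustrative helper, not part of the repo), assuming your original encoding is 0 = unknown and 1-121 = classes:

import numpy as np
from PIL import Image

def remap_label(in_path, out_path):
    """Remap a label PNG: classes 1..121 -> 0..120, unknown 0 -> 255 (don't-care)."""
    label = np.array(Image.open(in_path), dtype=np.uint8)
    out = np.full_like(label, 255)   # start as don't-care everywhere
    known = label > 0                # pixels belonging to the 121 known classes
    out[known] = label[known] - 1    # shift class IDs down to 0..120
    Image.fromarray(out).save(out_path)

# Usage (paths are illustrative):
# remap_label("train/segs_gs/0001.png", "train/segs_remapped/0001.png")

Assuming this fork keeps SPADE's convention of mapping the 255 value to label_nc in the dataloader, --label_nc 121 together with --contain_dontcare_label should then correspond to the 121 known classes plus the don't-care label.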