Dear developer,
Thank you for your work.
While forecasting multivariate time series on the ETTm1 dataset on a Kaggle P100 GPU, I ran into an out-of-memory error. Which parameters would you recommend reducing, and to what values?
Kaggle log (all lines logged at 18283.9s):
Traceback (most recent call last):
  File "/kaggle/working/ModernTCN/run.py", line 164, in <module>
    exp.train(setting)
  File "/kaggle/working/ModernTCN/exp/exp_ModernTCN.py", line 169, in train
    outputs = self.model(batch_x, batch_x_mark)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/kaggle/working/ModernTCN/models/ModernTCN.py", line 411, in forward
    x = self.model(x, te)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/kaggle/working/ModernTCN/models/ModernTCN.py", line 332, in forward
    x = self.forward_feature(x,te)
  File "/kaggle/working/ModernTCN/models/ModernTCN.py", line 322, in forward_feature
    x = self.stages[i](x)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/kaggle/working/ModernTCN/models/ModernTCN.py", line 199, in forward
    x = blk(x)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/kaggle/working/ModernTCN/models/ModernTCN.py", line 166, in forward
    x = self.ffn1drop1(self.ffn1pw1(x))
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/dropout.py", line 58, in forward
    return F.dropout(input, self.p, self.training, self.inplace)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/functional.py", line 1266, in dropout
    return _VF.dropout_(input, p, training) if inplace else _VF.dropout(input, p, training)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.15 GiB. GPU 0 has a total capacty of 15.89 GiB of which 978.12 MiB is free. Process 148877 has 14.94 GiB memory in use. Of the allocated memory 14.61 GiB is allocated by PyTorch, and 38.23 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
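To make the numbers in the error message easier to reason about, here is a minimal sketch that reads the sizes out of the message and compares them. The helper name `summarize_oom` and the "fragmentation could help" heuristic are my own assumptions for illustration; they are not part of PyTorch or ModernTCN.

```python
import re

# Size units that appear in PyTorch's OOM message, expressed in MiB.
_UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def _to_mib(value: str, unit: str) -> float:
    """Convert a (number, unit) pair such as ("1.15", "GiB") to MiB."""
    return float(value) * _UNIT_TO_MIB[unit]


def summarize_oom(message: str) -> dict:
    """Hypothetical helper: extract key sizes (in MiB) from a CUDA OOM message."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "free": r"of which ([\d.]+) (MiB|GiB) is free",
        "allocated": r"([\d.]+) (MiB|GiB) is allocated by PyTorch",
        "reserved_unallocated": r"([\d.]+) (MiB|GiB) is reserved by PyTorch but unallocated",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find '{name}' in message")
        sizes[name] = _to_mib(*match.groups())

    # How much more memory the failed allocation needed than was free.
    sizes["shortfall"] = sizes["requested"] - sizes["free"]
    # Rough heuristic (an assumption): allocator tuning such as max_split_size_mb
    # can only matter if the reserved-but-unallocated pool covers the shortfall.
    sizes["fragmentation_could_help"] = (
        sizes["reserved_unallocated"] >= sizes["shortfall"]
    )
    return sizes


# The message text copied from the traceback above.
MESSAGE = (
    "CUDA out of memory. Tried to allocate 1.15 GiB. GPU 0 has a total capacty "
    "of 15.89 GiB of which 978.12 MiB is free. Process 148877 has 14.94 GiB "
    "memory in use. Of the allocated memory 14.61 GiB is allocated by PyTorch, "
    "and 38.23 MiB is reserved by PyTorch but unallocated."
)

if __name__ == "__main__":
    for key, value in summarize_oom(MESSAGE).items():
        print(f"{key}: {value}")
```

Under this heuristic the shortfall is about 200 MiB while only 38.23 MiB is reserved but unallocated, so tuning `max_split_size_mb` looks unlikely to be enough here. The usual PyTorch-level remedy is to shrink the memory footprint itself, typically starting with the batch size, since activation memory grows roughly linearly with it. Which ModernTCN options control that is not shown in this log.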