question changing parameters for more resource-intensive operation #5

@m6129

Dear developer,
Thanks for your work.

While forecasting multivariate time series on the ETTm1 dataset on a Kaggle P100 GPU, I ran into a CUDA out-of-memory error. Which parameters would you recommend reducing, and to what values?

notebook Kaggle

Kaggle log at 18283.9s:

Traceback (most recent call last):
  File "/kaggle/working/ModernTCN/run.py", line 164, in <module>
    exp.train(setting)
  File "/kaggle/working/ModernTCN/exp/exp_ModernTCN.py", line 169, in train
    outputs = self.model(batch_x, batch_x_mark)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/kaggle/working/ModernTCN/models/ModernTCN.py", line 411, in forward
    x = self.model(x, te)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/kaggle/working/ModernTCN/models/ModernTCN.py", line 332, in forward
    x = self.forward_feature(x,te)
  File "/kaggle/working/ModernTCN/models/ModernTCN.py", line 322, in forward_feature
    x = self.stages[i](x)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/kaggle/working/ModernTCN/models/ModernTCN.py", line 199, in forward
    x = blk(x)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/kaggle/working/ModernTCN/models/ModernTCN.py", line 166, in forward
    x = self.ffn1drop1(self.ffn1pw1(x))
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/dropout.py", line 58, in forward
    return F.dropout(input, self.p, self.training, self.inplace)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/functional.py", line 1266, in dropout
    return _VF.dropout_(input, p, training) if inplace else _VF.dropout(input, p, training)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.15 GiB. GPU 0 has a total capacty of 15.89 GiB of which 978.12 MiB is free. Process 148877 has 14.94 GiB memory in use. Of the allocated memory 14.61 GiB is allocated by PyTorch, and 38.23 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
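To make the numbers in the message easier to read, here is a small sketch that pulls the sizes out of the error text and compares them. It uses only the Python standard library; the helper names (`to_mib`, `summarize_oom`) are made up for this illustration and are not part of PyTorch or ModernTCN.

```python
import re

# Sizes copied from the OOM message in the log above.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 1.15 GiB. "
    "GPU 0 has a total capacty of 15.89 GiB of which 978.12 MiB is free. "
    "Of the allocated memory 14.61 GiB is allocated by PyTorch, "
    "and 38.23 MiB is reserved by PyTorch but unallocated."
)

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def to_mib(value, unit):
    """Convert a (number, unit) pair from the message into MiB."""
    return float(value) * UNIT_TO_MIB[unit]


def summarize_oom(message):
    """Extract the memory figures (in MiB) from a PyTorch OOM message.

    Hypothetical helper: the patterns only cover the wording shown above.
    """
    size = r"([\d.]+) (MiB|GiB)"
    patterns = {
        "requested": rf"Tried to allocate {size}",
        "total": rf"total capac\w* of {size}",  # tolerates the 'capacty' typo
        "free": rf"of which {size} is free",
        "allocated": rf"allocated memory {size} is allocated",
        "reserved_unallocated": rf"and {size} is reserved",
    }
    summary = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find '{key}' in the message")
        summary[key] = to_mib(*match.groups())
    # How much more memory the failed allocation needed than was free.
    summary["shortfall"] = summary["requested"] - summary["free"]
    return summary


if __name__ == "__main__":
    for name, mib in summarize_oom(OOM_MESSAGE).items():
        print(f"{name:>22}: {mib:10.2f} MiB")
```

Read this way, the failed allocation needed roughly 199 MiB more than was free, while only about 38 MiB was reserved but unallocated. That suggests fragmentation is probably not the main cause here, so tuning `max_split_size_mb` alone seems unlikely to help, and the memory used by the model's activations would have to shrink instead.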
