Stable versions of torchrl/tensordict still getting internal dynamo error #14

I am currently hitting the same issue as in #10.

I currently have torch 2.5.1, torchrl 0.6.0, and tensordict 0.6.0 installed, and I am running a slightly modified version of the original code. The script runs with either `--cudagraphs` or `--compile`, but not with both enabled. With cudagraphs alone, things are working great!
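Until the combination works, one way to avoid the crash is to enable only one of the two modes at a time. Below is a minimal sketch of that idea, not code from LeanRL: `resolve_modes` and `accelerate` are hypothetical helper names, and preferring CUDA graphs when both flags are set is an assumption based only on the observation above that cudagraphs alone works well. `torch.compile` and `tensordict.nn.CudaGraphModule` are imported lazily so the selection logic can be checked without a GPU.

```python
import warnings
from typing import Callable, Tuple


def resolve_modes(use_compile: bool, use_cudagraphs: bool) -> Tuple[bool, bool]:
    """Return the (compile, cudagraphs) pair to actually apply.

    Assumption: when both are requested, keep only CUDA graphs, since
    that mode is reported to work on its own.
    """
    if use_compile and use_cudagraphs:
        warnings.warn(
            "compile and cudagraphs were both requested; "
            "falling back to cudagraphs only"
        )
        return False, True
    return use_compile, use_cudagraphs


def accelerate(fn: Callable, use_compile: bool, use_cudagraphs: bool) -> Callable:
    """Wrap fn with at most one of torch.compile / CudaGraphModule."""
    use_compile, use_cudagraphs = resolve_modes(use_compile, use_cudagraphs)
    if use_compile:
        import torch  # lazy import: only needed when compiling

        return torch.compile(fn)
    if use_cudagraphs:
        from tensordict.nn import CudaGraphModule  # lazy import: needs CUDA at call time

        return CudaGraphModule(fn)
    # Neither mode requested: return the function unchanged (eager).
    return fn
```

In a training script this would be applied to each hot function, for example `gae = accelerate(gae, args.compile, args.cudagraphs)`, where the `args` attribute names are assumed to mirror the `--compile` and `--cudagraphs` flags. This only sidesteps the error; it does not fix the underlying interaction between Dynamo and CUDA graph capture.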

Trace:
```
python leanrl/ppo_continuous_action_torchcompile.py --num-envs 1 --num-steps 64 --total-timesteps 256 --compile --cudagraphs
/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/tyro/_fields.py:181: UserWarning: The field target_kl is annotated with type <class 'float'>, but the default value None has type <class 'NoneType'>. We'll try to handle this gracefully, but it may cause unexpected behavior.
  warnings.warn(
wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
wandb: Currently logged in as: stonet2000. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.18.5
wandb: Run data is saved locally in /home/stao/work/external/leanrl/wandb/run-20241030_192316-jgwpim8y
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run ppo_continuous_action_torchcompile-HalfCheetah-v4__ppo_continuous_action_torchcompile__1__True__True
wandb: ⭐️ View project at https://wandb.ai/stonet2000/ppo_continuous_action
wandb: 🚀 View run at https://wandb.ai/stonet2000/ppo_continuous_action/runs/jgwpim8y
/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/tensordict/nn/cudagraphs.py:194: UserWarning: Tensordict is registered in PyTree. This is incompatible with CudaGraphModule. Removing TDs from PyTree. To silence this warning, call tensordict.nn.functional_module._exclude_td_from_pytree().set() or set the environment variable `EXCLUDE_TD_FROM_PYTREE=1`. This operation is irreversible.
  warnings.warn(
  0%|                                                                                                                                 | 0/4 [00:00<?, ?it/s]/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/torch/_inductor/compile_fx.py:167: UserWarning: TensorFloat32 tensor cores for float32 matrix multiplication available but not enabled. Consider setting `torch.set_float32_matmul_precision('high')` for better performance.
  warnings.warn(
W1030 19:23:26.159230 2648226 site-packages/torch/_logging/_internal.py:1081] [11/0] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
 25%|██████████████████████████████▎                                                                                          | 1/4 [00:10<00:31, 10.50s/it]/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/torch/cuda/graphs.py:84: UserWarning: The CUDA Graph is empty. This usually means that the graph was attempted to be captured on wrong device or stream. (Triggered internally at ../aten/src/ATen/cuda/CUDAGraph.cpp:208.)
  super().capture_end()
 25%|██████████████████████████████▎                                                                                          | 1/4 [00:10<00:31, 10.65s/it]
Traceback (most recent call last):
  File "/home/stao/work/external/leanrl/leanrl/ppo_continuous_action_torchcompile.py", line 358, in <module>
    container = gae(next_obs, next_done, container)
  File "/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/tensordict/nn/cudagraphs.py", line 439, in __call__
    return self._call_func(*args, **kwargs)
  File "/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/tensordict/nn/cudagraphs.py", line 345, in _call
    out = self.module(*self._args, **self._kwargs)
  File "/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 465, in _fn
    return fn(*args, **kwargs)
  File "/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1269, in __call__
    return self._torchdynamo_orig_callable(
  File "/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 526, in __call__
    return _compile(
  File "/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 952, in _compile
    raise InternalTorchDynamoError(
  File "/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 924, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
  File "/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 666, in compile_inner
    return _compile_inner(code, one_graph, hooks, transform)
  File "/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/torch/_utils_internal.py", line 87, in wrapper_function
    return function(*args, **kwargs)
  File "/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 699, in _compile_inner
    out_code = transform_code_object(code, transform)
  File "/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/torch/_dynamo/bytecode_transformation.py", line 1322, in transform_code_object
    transformations(instructions, code_options)
  File "/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 208, in _fn
    cuda_rng_state = torch.cuda.get_rng_state()
  File "/home/stao/miniforge3/envs/leanrl/lib/python3.10/site-packages/torch/cuda/random.py", line 42, in get_rng_state
    return default_generator.get_state()
torch._dynamo.exc.InternalTorchDynamoError: RuntimeError: Cannot call CUDAGeneratorImpl::current_seed during CUDA graph capture. If you need this call to be captured, please file an issue. Current cudaStreamCaptureStatus: cudaStreamCaptureStatusActive


You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True
```


I am currently trying to add LeanRL tricks to ManiSkill, and update times have massively improved!
![image](https://github.com/user-attachments/assets/5f6636db-613b-40df-bc93-e786f9bd5bca)

