[Bug Report] Recurrent Policy Memory Not Reset During Evaluation #3837

@bikcrum

Description

Describe the Bug

When evaluating a recurrent policy in the RSL-RL framework, its hidden state is not reset when an episode terminates.
Residual hidden state from the previous episode therefore leaks into the next one, causing the policy to act inconsistently at the start of each new episode.


Steps to Reproduce

  1. Train a recurrent policy using the RSL-RL framework.
  2. Evaluate the trained policy using play.py.
  3. Let an episode terminate and a new one start.
  4. Observe that the recurrent memory retains information from the old episode, affecting initial and subsequent actions of the next episode.

Expected Behavior:
Each new episode should start with a fully reset recurrent state.

Actual Behavior:
Recurrent state persists between episodes during evaluation.
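The expected reset amounts to zeroing the recurrent state of exactly those environments whose episode just ended, while leaving the others untouched. A minimal sketch of that masking step (plain PyTorch, independent of any RSL-RL internals):

```python
import torch

def reset_hidden_on_done(hidden: torch.Tensor, dones: torch.Tensor) -> torch.Tensor:
    """Zero the recurrent state of environments whose episode just terminated.

    hidden: (num_layers, num_envs, hidden_size) GRU/LSTM state tensor
    dones:  (num_envs,) boolean tensor, True where the episode ended
    """
    # Broadcast a per-env keep-mask over the layer and feature dimensions.
    mask = (~dones).float().view(1, -1, 1)
    return hidden * mask
```

For an LSTM the same mask would be applied to both the hidden and the cell state.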


System Info

Commit: 1103a0f
Isaac Sim Version: 4.5.0
OS: Ubuntu 22.04
GPU: RTX 3060
CUDA: 12.9
GPU Driver: 575.64.03

Checklist

  • I have checked that there is no similar issue in the repo (required)
  • I have confirmed that the issue is not related to Isaac Sim itself but to this repository

Acceptance Criteria

  • Hidden state of recurrent policy is reset when an episode terminates during evaluation.
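One way the acceptance criterion could be satisfied is to call the policy's memory-reset hook with the dones flags inside the evaluation loop. The sketch below uses a toy stand-in policy; the reset(dones) method is a hypothetical API modeled on what a recurrent actor-critic would expose, not a confirmed signature from this repository:

```python
import torch

class ToyRecurrentPolicy:
    """Stand-in for a recurrent policy; reset(dones) is the assumed hook."""

    def __init__(self, num_envs: int, hidden_size: int):
        self.hidden = torch.zeros(1, num_envs, hidden_size)

    def act(self, obs: torch.Tensor) -> torch.Tensor:
        # Toy dynamics: accumulate observations into the hidden state.
        self.hidden = self.hidden + obs.unsqueeze(0)
        return self.hidden.sum(dim=-1).squeeze(0)

    def reset(self, dones: torch.Tensor) -> None:
        # Zero memory only for environments that just terminated.
        self.hidden = self.hidden * (~dones).float().view(1, -1, 1)

# Evaluation loop: apply the reset right after stepping, using the dones flags.
policy = ToyRecurrentPolicy(num_envs=2, hidden_size=3)
for step in range(3):
    obs = torch.ones(2, 3)
    actions = policy.act(obs)
    dones = torch.tensor([step == 1, False])  # env 0 terminates at step 1
    policy.reset(dones)
```

After the loop, env 0's memory reflects only the single step since its reset, while env 1's memory spans all three steps — the behavior the acceptance criterion asks for.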
