Commit 6f013fb

Updates SB3 ppo cfg so it trains under reasonable amount of time (#3726)
# Description

This PR fixes the `sb3_ppo_cfg` for the task Isaac-Ant-v0. The previous parameters were 4096 `num_envs`, a horizon (`n_steps`) of 512, a batch size of 128, and 20 `n_epochs`. With those values, one training cycle has to loop (20 * 512 * 4096) / 128 = 327680 times, which makes training appear to hang forever.

The new config matches that of rl_games more closely. I verified that it trains in under 5 minutes.

[Screencast from 2025-10-15 13-56-21.webm](https://github.com/user-attachments/assets/2bc7bcd8-0063-46b9-adb0-67a6aa686732)

## Type of change

- Bug fix (non-breaking change which fixes an issue)

## Checklist

- [x] I have read and understood the [contribution guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there
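The loop count quoted in the description can be reproduced with a short sketch. The helper name `updates_per_rollout` is hypothetical. The sketch assumes that PPO collects `num_envs * n_steps` samples per rollout and then walks over them in minibatches of `batch_size` for `n_epochs` passes, and that `num_envs` remains 4096 under the new configuration (the description does not state this).

```python
import math


def updates_per_rollout(num_envs: int, n_steps: int, batch_size: int, n_epochs: int) -> int:
    """Estimate the minibatch gradient updates performed after one rollout.

    Hypothetical helper for illustration; not part of Isaac Lab or SB3.
    """
    rollout_size = num_envs * n_steps  # samples collected per rollout
    minibatches = math.ceil(rollout_size / batch_size)  # minibatches per epoch
    return n_epochs * minibatches


# Old config from the description: 4096 envs, horizon 512, batch 128, 20 epochs.
old_updates = updates_per_rollout(num_envs=4096, n_steps=512, batch_size=128, n_epochs=20)

# New config from the diff; num_envs=4096 is an assumption carried over from the old setup.
new_updates = updates_per_rollout(num_envs=4096, n_steps=16, batch_size=32768, n_epochs=4)

print(f"old: {old_updates} updates per rollout, new: {new_updates} updates per rollout")
```

Under these assumptions, the old settings give the 327680 updates mentioned above, while the new settings give only a handful of updates per rollout.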
1 parent 47780cf commit 6f013fb

3 files changed: +15 -6 lines changed

source/isaaclab_rl/config/extension.toml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 [package]

 # Note: Semantic Versioning is used: https://semver.org/
-version = "0.4.1"
+version = "0.4.2"

 # Description
 title = "Isaac Lab RL"

source/isaaclab_rl/docs/CHANGELOG.rst

Lines changed: 9 additions & 0 deletions
@@ -1,6 +1,15 @@
 Changelog
 ---------

+0.4.2 (2025-10-15)
+~~~~~~~~~~~~~~~~~~
+
+Fixed
+^^^^^
+
+* Isaac-Ant-v0's sb3_ppo_cfg default value, so it trains under reasonable amount of time.
+
+
 0.4.1 (2025-09-09)
 ~~~~~~~~~~~~~~~~~~


source/isaaclab_tasks/isaaclab_tasks/manager_based/classic/ant/agents/sb3_ppo_cfg.yaml

Lines changed: 5 additions & 5 deletions
@@ -6,19 +6,19 @@
 # Reference: https://github.com/DLR-RM/rl-baselines3-zoo/blob/master/hyperparams/ppo.yml#L161
 seed: 42

-n_timesteps: !!float 1e7
+n_timesteps: !!float 1e8
 policy: 'MlpPolicy'
-batch_size: 128
-n_steps: 512
+batch_size: 32768
+n_steps: 16
 gamma: 0.99
 gae_lambda: 0.9
-n_epochs: 20
+n_epochs: 4
 ent_coef: 0.0
 sde_sample_freq: 4
 max_grad_norm: 0.5
 vf_coef: 0.5
 learning_rate: !!float 3e-5
-use_sde: True
+use_sde: False
 clip_range: 0.4
 device: "cuda:0"
 policy_kwargs:
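To see how the changed values in this file interact over a whole run, the following sketch estimates the number of rollout cycles and total minibatch updates implied by `n_timesteps`. The `PpoBudget` class and its fields are hypothetical names introduced for illustration. The sketch assumes 4096 environments for both configurations (the YAML does not set `num_envs`) and assumes training continues in whole rollouts until `n_timesteps` is reached.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PpoBudget:
    """Hypothetical container for the hyperparameters touched by this diff."""

    n_timesteps: float
    n_steps: int
    batch_size: int
    n_epochs: int
    num_envs: int = 4096  # assumption: taken from the PR description, not from the YAML

    @property
    def rollout_size(self) -> int:
        # Samples gathered across all environments in one rollout.
        return self.num_envs * self.n_steps

    @property
    def rollout_cycles(self) -> int:
        # Whole rollouts needed to reach at least n_timesteps samples.
        return math.ceil(self.n_timesteps / self.rollout_size)

    @property
    def total_updates(self) -> int:
        # Minibatch updates summed over every rollout cycle.
        per_cycle = self.n_epochs * math.ceil(self.rollout_size / self.batch_size)
        return self.rollout_cycles * per_cycle


old_cfg = PpoBudget(n_timesteps=1e7, n_steps=512, batch_size=128, n_epochs=20)
new_cfg = PpoBudget(n_timesteps=1e8, n_steps=16, batch_size=32768, n_epochs=4)

for name, cfg in (("old", old_cfg), ("new", new_cfg)):
    print(f"{name}: {cfg.rollout_cycles} rollout cycles, {cfg.total_updates} total updates")
```

Under these assumptions the new configuration runs many more, much shorter rollout cycles, yet performs far fewer minibatch updates in total even though `n_timesteps` is ten times larger.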
