fixed new gym API related to step() and reset() #87
pseudo-rnd-thoughts
left a comment
Hey, I'm one of the developers of Gym, so thanks for starting this PR.
I have added a couple of changes for updating the API.
This page includes a migration guide: https://gymnasium.farama.org/content/migration-guide/
I couldn't see a function like this, but I would add a testing function like this one in Gym to test that the environment follows the API.
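A check along those lines can be sketched as follows. This is only a minimal illustration that assumes the return shapes described in the migration guide; `check_step_reset_api`, `DummyEnv`, and `_DummySpace` are hypothetical names, not functions from Gym or from this repository.

```python
import random


def check_step_reset_api(env, n_steps=10):
    """Assert the 2-tuple reset() / 5-tuple step() convention (hypothetical helper)."""
    result = env.reset(seed=0)
    assert isinstance(result, tuple) and len(result) == 2, "reset() must return (obs, info)"
    assert isinstance(result[1], dict), "info returned by reset() must be a dict"
    for _ in range(n_steps):
        result = env.step(env.action_space.sample())
        assert isinstance(result, tuple) and len(result) == 5, "step() must return 5 values"
        _, _, terminated, truncated, info = result
        assert isinstance(info, dict), "info returned by step() must be a dict"
        if terminated or truncated:
            env.reset()  # start a new episode before stepping again
    return True


class _DummySpace:
    """Stand-in for an action space with two discrete actions."""

    def sample(self):
        return random.randrange(2)


class DummyEnv:
    """Stand-in environment that truncates every five steps."""

    action_space = _DummySpace()

    def __init__(self):
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0
        return 0, {}

    def step(self, action):
        self.t += 1
        return self.t, 1.0, False, self.t >= 5, {}
```

Calling `check_step_reset_api(DummyEnv())` passes, while an environment that still returns four values from `step()` fails the length assertion.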
| _, _ = env.reset() | ||
| action = env.action_space.sample() | ||
| _, reward, done, info = env.step(action) | ||
| _, reward, done, _, info = env.step(action) |
Replace with
_, reward, terminated, truncated, info = env.step(action)
done = terminated or truncated
| return [seed] | ||
|
|
||
| def reset(self, seed=None, options=None, return_info=None): | ||
| def reset(self, seed=None, options=None, return_info=None) -> Tuple[ObsType, dict]: |
Remove return_info parameter
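For illustration, a `reset()` without that parameter might look roughly like this. `ResetSketchEnv` is a hypothetical stand-in for the environment class in this PR, and the placeholder observation is an assumption made only to keep the sketch runnable.

```python
from typing import Any, Dict, Optional, Tuple


class ResetSketchEnv:
    """Hypothetical stand-in for the environment class touched by this PR."""

    def reset(
        self,
        seed: Optional[int] = None,
        options: Optional[Dict[str, Any]] = None,
    ) -> Tuple[Any, Dict[str, Any]]:
        # No return_info flag: (observation, info) is always returned.
        observation = 0  # placeholder; the real env would return a screen frame
        info: Dict[str, Any] = {}
        return observation, info
```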
| self.viewer.close() | ||
|
|
||
| def render(self, mode='human'): | ||
| def render(self, mode='human') -> Optional[Union[RenderFrame, List[RenderFrame]]]: |
Remove mode parameter and add render_mode to __init__ for specifying the type of rendering
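A rough sketch of that pattern, assuming the `render_modes` metadata key used by recent Gym releases. `RenderSketchEnv` and its placeholder frame are hypothetical and do not reflect how this repository actually draws frames.

```python
from typing import Optional


class RenderSketchEnv:
    """Hypothetical stand-in; the real class would render emulator frames."""

    metadata = {"render_modes": ["human", "rgb_array"]}

    def __init__(self, render_mode: Optional[str] = None):
        # The rendering type is fixed once, at construction time.
        if render_mode is not None and render_mode not in self.metadata["render_modes"]:
            raise ValueError(f"unsupported render_mode: {render_mode!r}")
        self.render_mode = render_mode

    def render(self):
        # render() takes no mode argument; it consults self.render_mode instead.
        if self.render_mode == "rgb_array":
            return [[(0, 0, 0)]]  # placeholder 1x1 RGB frame
        # "human" would draw to a window here; both it and None return nothing.
        return None
```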
| _, _ = env.reset() | ||
| action = env.action_space.sample() | ||
| _, _, done, _ = env.step(action) | ||
| _, _, done, _, _ = env.step(action) |
Replace done with terminated and truncated as in the comment before
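In a full episode loop, the same replacement might look like the sketch below. `run_episode` and `CountdownEnv` are hypothetical names used only for illustration; the loop assumes the five-value `step()` return discussed in this review.

```python
def run_episode(env, policy, max_steps=1000):
    """Run one episode and return the summed reward (hypothetical helper)."""
    obs, info = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += reward
        # Stop on either signal; keep both flags separate instead of a single done.
        if terminated or truncated:
            break
    return total_reward


class CountdownEnv:
    """Stand-in environment that terminates after three steps."""

    def reset(self, seed=None, options=None):
        self.remaining = 3
        return self.remaining, {}

    def step(self, action):
        self.remaining -= 1
        return self.remaining, 1.0, self.remaining == 0, False, {}
```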
| _, _ = envs[idx].reset() | ||
| action = envs[idx].action_space.sample() | ||
| _, _, dones[idx], _ = envs[idx].step(action) | ||
| _, _, dones[idx], _, _ = envs[idx].step(action) |
| self.assertEqual(5, len(output)) | ||
| # check each output | ||
| state, reward, done, info = output | ||
| state, reward, done, truncated, info = output |
Use terminated, truncated here, as terminated != done
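The distinction matters mostly to learning code: `terminated` means the task itself reached an end state, while `truncated` means the episode was cut off from outside (for example by a time limit). A minimal sketch of a one-step TD target, where `td_target` is a hypothetical helper and not part of Gym or this repository:

```python
def td_target(reward, next_value, terminated, gamma=0.99):
    """One-step TD target that bootstraps unless the episode truly terminated."""
    # A truncated episode still has a meaningful next-state value, so only
    # `terminated` zeroes out the bootstrap term; `truncated` is not consulted.
    if terminated:
        return reward
    return reward + gamma * next_value
```

Feeding the old combined `done` flag into `terminated` here would wrongly drop the bootstrap term for episodes that were merely cut short.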
| state, _ = env.reset() | ||
| done = False | ||
| state, _, done, _ = env.step(0) | ||
| state, _, done, _, _ = env.step(0) |
Same comment on terminated and truncated
| state = env.reset() | ||
| done = False | ||
| state, _, done, _ = env.step(0) | ||
| state, _, done, _, _ = env.step(0) |
Same comment on terminated and truncated
| done = False | ||
| else: | ||
| state, reward, done, info = env.step(env.action_space.sample()) | ||
| state, reward, done, truncated, info = env.step(env.action_space.sample()) |
Same comment on terminated and truncated
| setup( | ||
| name='nes_py', | ||
| version='8.2.1', | ||
| version='8.2.2', |
Minor point, but I would make this a minor or major release due to the significant code changes
Any reason why this shouldn't be merged with the suggested edits?
Description
There are several issues related to changes in the Gym API. I readjusted all the code so that step() returns truncated, and readjusted reset() to match what Gym expects. In parallel, I fixed the Mario env. Note that old code won't work with this, since Gym's step() now returns a tuple of five elements instead of four. The same goes for reset(). I also readjusted all the unit tests and examples so that they match and pass all tests.
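For callers that must cope with both tuple shapes during a transition, one possible shim is sketched below. `normalize_step` is a hypothetical helper that is not part of this PR, and mapping the old `done` flag onto `terminated` is an assumption, since the four-value form cannot tell termination and truncation apart.

```python
def normalize_step(result):
    """Return (obs, reward, terminated, truncated, info) for 4- or 5-value results."""
    if len(result) == 5:
        return tuple(result)
    if len(result) == 4:
        obs, reward, done, info = result
        # Assumption: treat the old done flag as termination, never truncation.
        return obs, reward, bool(done), False, info
    raise ValueError(f"unexpected step() result of length {len(result)}")
```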
How Has This Been Tested?
Test Configuration
Target: x86_64-apple-darwin21.6.0
Thread model: posix