feat: Add episode replay buffer for RL agents #19
Open
joshgreaves wants to merge 10 commits into `main` from
Conversation
User description
Summary
Adds a production-ready episode replay buffer implementation for reinforcement learning agents with full asyncio support.
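As a rough illustration of what an asyncio-compatible episode buffer with capacity eviction and uniform sampling might look like, here is a minimal sketch. The class and method names below are assumptions for illustration, not the PR's actual code:

```python
import asyncio
import random
from collections import deque


class EpisodeReplayBuffer:
    """Minimal sketch: stores whole episodes, evicting the oldest past capacity."""

    def __init__(self, capacity: int):
        # deque(maxlen=...) drops the oldest episode once capacity is reached
        self._episodes = deque(maxlen=capacity)
        self._lock = asyncio.Lock()  # serializes concurrent add/sample calls

    async def add(self, episode):
        async with self._lock:
            self._episodes.append(episode)

    async def sample(self, batch_size: int):
        async with self._lock:
            # Uniform sampling with replacement across stored episodes
            return [random.choice(self._episodes) for _ in range(batch_size)]


async def demo():
    buf = EpisodeReplayBuffer(capacity=2)
    for ep in ("ep0", "ep1", "ep2"):
        await buf.add(ep)
    # capacity=2, so "ep0" has been evicted by this point
    return await buf.sample(4)
```

The `asyncio.Lock` is what makes concurrent producers (agents appending episodes) and consumers (learners sampling batches) safe to interleave on one event loop.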
Key Features
Implementation Details
Episodes store `observations[t]`, `actions[t]`, `rewards[t]` per episode.

Test Coverage
Comprehensive test suite (35 tests) covering:
All tests pass, ruff checks pass (lint + format).
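For context, the per-episode layout described above (`observations[t]`, `actions[t]`, `rewards[t]`) could be sketched as a dataclass along these lines; the field and method names here are assumptions, not necessarily the PR's actual API:

```python
from dataclasses import dataclass, field


@dataclass
class Episode:
    # Parallel lists indexed by timestep t: observations[t], actions[t], rewards[t]
    observations: list = field(default_factory=list)
    actions: list = field(default_factory=list)
    rewards: list = field(default_factory=list)

    def append(self, obs, action, reward) -> None:
        """Record one timestep, keeping the three lists aligned."""
        self.observations.append(obs)
        self.actions.append(action)
        self.rewards.append(float(reward))

    def __len__(self) -> int:
        return len(self.rewards)
```

Keeping the three lists aligned by timestep is what makes n-step slices (`rewards[t:t+n]`, etc.) cheap to construct at sampling time.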
Files Changed
- `src/ares/contrib/rl/replay_buffer.py`: Core implementation (585 lines)
- `tests/contrib/rl/test_replay_buffer.py`: Test suite (662 lines)
- `pyproject.toml`: Add pytest-asyncio dependency and config

🤖 Generated with Claude Code
Generated description
Below is a concise technical summary of the changes proposed in this PR:
```mermaid
graph LR
  EpisodeReplayBuffer_("EpisodeReplayBuffer"):::added
  Episode_("Episode"):::added
  ReplaySample_("ReplaySample"):::added
  compute_discounted_return_("compute_discounted_return"):::added
  EpisodeReplayBuffer_ -- "Added buffer storing Episode instances with capacity eviction" --> Episode_
  EpisodeReplayBuffer_ -- "Adds n-step ReplaySample construction for uniform sampling" --> ReplaySample_
  ReplaySample_ -- "ReplaySample.reward computes discounted return via compute_discounted_return" --> compute_discounted_return_
  classDef added stroke:#15AA7A
  classDef removed stroke:#CD5270
  classDef modified stroke:#EDAC4C
  linkStyle default stroke:#CBD5E1,font-size:13px
```

Introduces a production-ready `EpisodeReplayBuffer` within the `ares.contrib.rl` module, providing an asyncio-compatible mechanism for episodic experience storage, uniform n-step sampling, and automatic capacity management for reinforcement learning agents. Establishes the necessary package structure and includes a comprehensive test suite for `EpisodeReplayBuffer` covering lifecycle, sampling, concurrency, and eviction policies, and updates `pyproject.toml` to include `pytest-asyncio` and adjust test path configurations.
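The summary graph notes that `ReplaySample.reward` computes a discounted return via `compute_discounted_return`. A minimal sketch of such a helper follows; the exact signature is an assumption, not taken from the PR:

```python
def compute_discounted_return(rewards, gamma=0.99):
    """Discounted return: r_0 + gamma*r_1 + gamma^2*r_2 + ...

    Iterating in reverse folds the sum as ret = r_t + gamma * ret,
    avoiding explicit powers of gamma.
    """
    ret = 0.0
    for r in reversed(rewards):
        ret = r + gamma * ret
    return ret
```

For example, `compute_discounted_return([1.0, 1.0, 1.0], gamma=0.9)` gives 1 + 0.9 + 0.81 = 2.71, which is the n-step reward a `ReplaySample` spanning those three timesteps would carry.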