feat: Add episode replay buffer for RL agents #19

Open
joshgreaves wants to merge 10 commits into main from feat/replay-buffer

joshgreaves (Contributor) commented Jan 14, 2026

User description

Summary

Adds a production-ready episode replay buffer implementation for reinforcement learning agents with full asyncio support.

Key Features

  • Episode-based storage: Store complete episodes with explicit lifecycle control (start, append transitions, end)
  • N-step sampling: Sample n-step experiences uniformly across all time steps with automatic episode boundary handling
  • Asyncio-safe: All operations use internal locking for safe concurrent access by multiple tasks
  • Capacity management: Automatic eviction of oldest episodes when capacity limits (max_episodes or max_steps) are exceeded
  • Memory-efficient: Avoids state duplication by storing observations sequentially and deriving next_obs during sampling
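As a rough illustration of the lifecycle and locking described above, the sketch below uses hypothetical names (`SketchEpisode`, `SketchBuffer`, `start_episode`, `append`, `end_episode`); the actual API in `replay_buffer.py` may differ. It assumes each episode keeps one more observation than it has actions, so the next observation for step t can be read from `observations[t + 1]` instead of being stored twice.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchEpisode:
    """Hypothetical episode record; not the PR's actual Episode class."""

    observations: list[Any] = field(default_factory=list)  # T + 1 entries (assumption)
    actions: list[Any] = field(default_factory=list)  # T entries
    rewards: list[float] = field(default_factory=list)  # T entries
    finished: bool = False


class SketchBuffer:
    """Illustrative only: explicit episode lifecycle guarded by one asyncio.Lock."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._episodes: dict[int, SketchEpisode] = {}
        self._next_id = 0

    async def start_episode(self, first_obs: Any) -> int:
        async with self._lock:
            episode_id = self._next_id
            self._next_id += 1
            self._episodes[episode_id] = SketchEpisode(observations=[first_obs])
            return episode_id

    async def append(self, episode_id: int, action: Any, reward: float, next_obs: Any) -> None:
        async with self._lock:
            episode = self._episodes[episode_id]
            if episode.finished:
                raise ValueError("cannot append to a finished episode")
            episode.actions.append(action)
            episode.rewards.append(reward)
            # Each observation is stored once; next_obs for step t is observations[t + 1].
            episode.observations.append(next_obs)

    async def end_episode(self, episode_id: int) -> None:
        async with self._lock:
            self._episodes[episode_id].finished = True

    async def num_steps(self) -> int:
        async with self._lock:
            return sum(len(ep.actions) for ep in self._episodes.values())


async def demo() -> int:
    """Run two collector tasks against one buffer and return the total step count."""
    buffer = SketchBuffer()

    async def run_episode(length: int) -> None:
        episode_id = await buffer.start_episode(first_obs=0)
        for t in range(length):
            await buffer.append(episode_id, action=t, reward=1.0, next_obs=t + 1)
            await asyncio.sleep(0)  # yield so the two tasks interleave
        await buffer.end_episode(episode_id)

    await asyncio.gather(run_episode(3), run_episode(5))
    return await buffer.num_steps()
```

Calling `asyncio.run(demo())` interleaves two episodes of 3 and 5 steps on one event loop. The lock serialises each buffer operation, so the per-episode lists stay consistent.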

Implementation Details

  • Storage format: observations[t], actions[t], rewards[t] per episode
  • Uniform sampling over all valid time steps (not episodes)
  • N-step windows never cross episode boundaries
  • Eviction policy: oldest finished episodes first, then oldest in-progress
  • Concurrency safety through asyncio.Lock: safe for multiple asyncio tasks on one event loop, but not thread-safe (not designed for use from threading.Thread)
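The sampling rules above can be sketched as follows. This is a minimal illustration under assumptions, not the PR's code: episodes are plain dicts with `observations` (one more entry than `actions`), `actions`, `rewards`, and `finished`, and `NStepSample`, `valid_starts`, `n_step_window`, and `sample` are hypothetical names.

```python
import random
from typing import Any, NamedTuple


class NStepSample(NamedTuple):
    """Hypothetical sample layout; the PR's ReplaySample may differ."""

    obs: Any
    action: Any
    rewards: list[float]  # between 1 and n rewards
    next_obs: Any
    terminal: bool


def valid_starts(episodes: list[dict]) -> list[tuple[int, int]]:
    """List every (episode index, t) pair that has an action, across all episodes."""
    return [(i, t) for i, ep in enumerate(episodes) for t in range(len(ep["actions"]))]


def n_step_window(episode: dict, t: int, n: int) -> NStepSample:
    """Build a window starting at step t, truncated at the end of the episode."""
    num_steps = len(episode["actions"])
    end = min(t + n, num_steps)  # never read past this episode
    return NStepSample(
        obs=episode["observations"][t],
        action=episode["actions"][t],
        rewards=episode["rewards"][t:end],
        next_obs=episode["observations"][end],  # derived, not stored separately
        terminal=episode["finished"] and end == num_steps,
    )


def sample(episodes: list[dict], n: int, rng: random.Random) -> NStepSample:
    """Pick a start uniformly over time steps, so longer episodes are drawn more often."""
    i, t = rng.choice(valid_starts(episodes))
    return n_step_window(episodes[i], t, n)
```

Choosing from the flat list of (episode, t) pairs is what makes the draw uniform over time steps instead of over episodes. Rebuilding that list on every call is linear in buffer size; a real implementation would likely keep an index up to date incrementally.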

Test Coverage

Comprehensive test suite (35 tests) covering:

  • Episode lifecycle operations
  • N-step sampling with boundary conditions
  • Concurrent access patterns
  • Capacity and eviction policies
  • Edge cases and error conditions
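To give a flavour of the capacity and eviction cases in the list above, here is a hedged sketch of one such check. The `evict` helper and the dict-based episode records are hypothetical stand-ins written for this example; they mirror the stated policy (oldest finished episodes first, then oldest in-progress) and are not the PR's implementation or its tests.

```python
def evict(episodes: dict[int, dict], max_episodes: int, max_steps: int) -> list[int]:
    """Evict until both limits hold and return the evicted ids in order.

    `episodes` maps id -> {"steps": int, "finished": bool} and is assumed to be
    ordered oldest first (dicts preserve insertion order).
    """
    evicted: list[int] = []

    def over_capacity() -> bool:
        total_steps = sum(ep["steps"] for ep in episodes.values())
        return len(episodes) > max_episodes or total_steps > max_steps

    while episodes and over_capacity():
        finished = [eid for eid, ep in episodes.items() if ep["finished"]]
        # Prefer the oldest finished episode; fall back to the oldest in-progress one.
        victim = finished[0] if finished else next(iter(episodes))
        del episodes[victim]
        evicted.append(victim)
    return evicted


def check_finished_evicted_before_in_progress() -> None:
    episodes = {
        0: {"steps": 4, "finished": False},
        1: {"steps": 3, "finished": True},
        2: {"steps": 2, "finished": True},
        3: {"steps": 5, "finished": False},
    }
    evicted = evict(episodes, max_episodes=10, max_steps=6)
    # Both finished episodes go first, then the oldest in-progress episode.
    assert evicted == [1, 2, 0]
    assert list(episodes) == [3]
```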

All tests pass, ruff checks pass (lint + format).

Files Changed

  • src/ares/contrib/rl/replay_buffer.py: Core implementation (585 lines)
  • tests/contrib/rl/test_replay_buffer.py: Test suite (662 lines)
  • pyproject.toml: Add pytest-asyncio dependency and config

🤖 Generated with Claude Code


Generated description

Below is a concise technical summary of the changes proposed in this PR:

graph LR
EpisodeReplayBuffer_("EpisodeReplayBuffer"):::added
Episode_("Episode"):::added
ReplaySample_("ReplaySample"):::added
compute_discounted_return_("compute_discounted_return"):::added
EpisodeReplayBuffer_ -- "Added buffer storing Episode instances with capacity eviction" --> Episode_
EpisodeReplayBuffer_ -- "Adds n-step ReplaySample construction for uniform sampling" --> ReplaySample_
ReplaySample_ -- "ReplaySample.reward computes discounted return via compute_discounted_return" --> compute_discounted_return_
classDef added stroke:#15AA7A
classDef removed stroke:#CD5270
classDef modified stroke:#EDAC4C
linkStyle default stroke:#CBD5E1,font-size:13px

Introduces a production-ready EpisodeReplayBuffer within the ares.contrib.rl module, providing an asyncio-compatible mechanism for episodic experience storage, uniform n-step sampling, and automatic capacity management for reinforcement learning agents. Establishes the necessary package structure and includes a comprehensive test suite to validate the buffer's lifecycle, sampling, and eviction policies.
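The summary mentions that ReplaySample.reward is computed via compute_discounted_return. A common way to compute an n-step discounted return, G = r_0 + gamma * r_1 + gamma^2 * r_2 + ..., is shown below as a sketch. The function name, signature, and the choice to return 0.0 for an empty window are assumptions for illustration; the PR's compute_discounted_return may differ.

```python
def discounted_return(rewards: list[float], gamma: float) -> float:
    """Return sum(gamma**k * rewards[k]) using a right-to-left accumulation."""
    total = 0.0
    for reward in reversed(rewards):
        # Folding from the last reward avoids computing gamma**k explicitly.
        total = reward + gamma * total
    return total
```

For example, `discounted_return([1.0, 2.0, 3.0], 0.5)` evaluates 1 + 0.5 * (2 + 0.5 * 3).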

Topic: Testing & Project Setup
Details: Adds a comprehensive test suite for the EpisodeReplayBuffer covering lifecycle, sampling, concurrency, and eviction, and updates pyproject.toml to include pytest-asyncio and adjust test path configurations.
Modified files (4)
  • pyproject.toml
  • src/ares/contrib/rl/replay_buffer_test.py
  • tests/contrib/__init__.py
  • tests/contrib/rl/__init__.py
Latest Contributors (2)
  • joshua.greaves@gmail.com, commit fix-avoid-os.getlogin-..., January 13, 2026
  • ryanscais3@gmail.com, commit Add-DM-Env-Interface-3, December 18, 2025

Topic: Episodic Replay Buffer
Details: Implements the EpisodeReplayBuffer class and its associated data structures (Episode, ReplaySample) within the ares.contrib.rl package, enabling efficient, asyncio-compatible storage and n-step sampling of agent experiences with capacity management.
Modified files (3)
  • src/ares/contrib/__init__.py
  • src/ares/contrib/rl/__init__.py
  • src/ares/contrib/rl/replay_buffer.py
Latest Contributors (0)
This pull request is reviewed by Baz.


3 participants