rl

OpenPipe / ART

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

agent reinforcement-learning rl lora llms qwen agentic-ai grpo qwen3

Updated Oct 24, 2025
Python

junxiaosong / AlphaZero_Gomoku

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

board-game reinforcement-learning tensorflow pytorch mcts gomoku rl monte-carlo-tree-search self-learning gobang alphago alphago-zero alphazero

Updated Apr 24, 2024
Python

pytorch / ELF

ELF: a platform for game research with AlphaGoZero/AlphaZero reimplementation

go reinforcement-learning rl alphago-zero alpha-zero rl-environment

Updated Jun 21, 2019
C++

pytorch / rl

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

machine-learning control reinforcement-learning ai robotics decision-making distributed-computing torch pytorch rl model-based-reinforcement-learning multi-agent-reinforcement-learning marl

Updated Nov 3, 2025
Python

inclusionAI / AReaL

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

agent reinforcement-learning rl machine-learning-systems mlsys llm llm-agent llm-reasoning

Updated Nov 4, 2025
Python

werner-duvaud / muzero-general

MuZero

machine-learning reinforcement-learning deep-learning neural-network deep-reinforcement-learning python3 pytorch gym mcts rl tensorboard residual-network monte-carlo-tree-search self-learning alphago model-based-rl alphazero muzero muzero-general

Updated Sep 3, 2024
Python

DLR-RM / rl-baselines3-zoo

A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.

reinforcement-learning robotics optimization lab deep-reinforcement-learning pytorch openai gym hyperparameter-optimization rl sde hyperparameter-tuning hyperparameter-search pybullet stable-baselines pybullet-environments tuning-hyperparameters

Updated Oct 15, 2025
Python

IntelLabs / coach

Reinforcement Learning Coach by Intel AI Lab enables easy experimentation with state-of-the-art Reinforcement Learning algorithms

reinforcement-learning deep-learning mxnet tensorflow openai-gym rl starcraft imitation-learning hierarchical-reinforcement-learning coach mujoco starcraft2 onnx roboschool carla starcraft2-ai distributed-reinforcement-learning

Updated Dec 11, 2022
Python

TsinghuaC3I / Awesome-RL-for-LRMs

A Survey of Reinforcement Learning for Large Reasoning Models

open-source rl awesome-list reasoning lrm llm deepseek-r1

Updated Oct 29, 2025

PRIME-RL / PRIME

Scalable RL solution for advanced reasoning of language models

rl reasoning llm

Updated Mar 18, 2025
Python

MaximeVandegar / Papers-in-100-Lines-of-Code

Implementation of papers in 100 lines of code.

Updated Nov 3, 2025
Python

pathak22 / noreward-rl

[ICML 2017] TensorFlow code for Curiosity-driven Exploration for Deep Reinforcement Learning

mario deep-neural-networks deep-learning tensorflow deep-reinforcement-learning openai-gym doom exploration rl curiosity self-supervised

Updated Dec 7, 2022
Python

zzli2022 / Awesome-System2-Reasoning-LLM

Latest Advances on System-2 Reasoning

benchmark mcts rl reasoning r1 prm o3 o1 slow-fast system-2 self-improve macro-action

Updated Jun 8, 2025
Python

FareedKhan-dev / all-rl-algorithms

Implementation of all RL algorithms in a simpler way

python agent reinforcement-learning openai rl llm

Updated Aug 29, 2025
Jupyter Notebook

araffin / rl-baselines-zoo

A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.

reinforcement-learning optimization openai-gym hyperparameters openai gym hyperparameter-optimization rl zoo hyperparameter-tuning hyperparameter-search pybullet stable-baselines

Updated Oct 17, 2022
Python

sail-sg / understand-r1-zero

Understanding R1-Zero-Like Training: A Critical Perspective

rl reasoning llm r1-zero

Updated Aug 27, 2025
Python

JudgmentLabs / judgeval

The open-source post-building layer for agents. Our environment data and evals power agent post-training (RL, SFT) and monitoring.

agent open-source reinforcement-learning openai rl agents llm prompt-engineering langchain llama-index llm-evaluation langgraph llm-observability agentic-ai grpo

Updated Nov 4, 2025
Python

Here are 1,133 public repositories matching this topic...

LlamaFamily / Llama-Chinese

google / dopamine

thu-ml / tianshou

OpenPipe / ART

junxiaosong / AlphaZero_Gomoku

pytorch / ELF

pytorch / rl

inclusionAI / AReaL

werner-duvaud / muzero-general

DLR-RM / rl-baselines3-zoo

IntelLabs / coach

TsinghuaC3I / Awesome-RL-for-LRMs

PRIME-RL / PRIME

MaximeVandegar / Papers-in-100-Lines-of-Code

pathak22 / noreward-rl

zzli2022 / Awesome-System2-Reasoning-LLM

FareedKhan-dev / all-rl-algorithms

araffin / rl-baselines-zoo

sail-sg / understand-r1-zero

JudgmentLabs / judgeval
