[Feature, Example] A3C Atari Implementation for TorchRL #3001

simeetnayan81 · 2025-06-15T16:24:47Z

Description

Describe your changes in detail.
This PR adds an implementation of the Asynchronous Advantage Actor-Critic (A3C) algorithm for Atari environments in the torchrl/sota-implementations directory. The main files added are:

a3c_atari.py: Contains the A3C worker class, shared optimizer, and main training loop using multiprocessing.
utils_atari.py: Provides utility functions for environment creation, model construction, and evaluation, adapted for Atari tasks.
config_atari.yaml: Configuration file for hyperparameters, environment settings, and logging.

The implementation leverages TorchRL's collectors, objectives, and logging utilities, and is designed to be modular and extensible for research and benchmarking. Some of the utils functions are also borrowed from a2c_atari.

Motivation and Context

This change is required to provide a strong, reproducible baseline for A3C on Atari environments using TorchRL. It enables researchers and practitioners to benchmark and compare reinforcement learning algorithms within the TorchRL ecosystem. The implementation follows best practices for distributed RL and is compatible with TorchRL's API.

This PR solves the issue: #1755

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds core functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation (update in the documentation)
Example (update in the folder of examples)

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

I have read the CONTRIBUTION guide (required)
My change requires a change to the documentation.
I have updated the tests accordingly (required for a bug fix or a new feature).
I have updated the documentation accordingly.

pytorch-bot · 2025-06-15T16:24:49Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3001

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 9 Unrelated Failures

As of commit a6eb18d with merge base 3ebe93d ():

NEW FAILURE - The following job has failed:

Habitat Tests on Linux / tests (3.9, 12.8) / linux-job (gh)
RuntimeError: Command docker exec -t 47bc1c1157fc48f7280c4cc48a39022caeae3c257c293a185a9dbd036dc33259 /exec failed with exit code 1

FLAKY - The following job failed but was likely due to flakiness present on trunk:

Continuous Benchmark (PR) / GPU Pytest benchmark (gh) (similar failure)
RuntimeError: PassManager::run failed

BROKEN TRUNK - The following jobs failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

Build M1 Wheels / pytorch/rl / build-wheel-py3_9-cpu (gh) (trunk failure)
Process completed with exit code 1.
Build M1 Wheels / pytorch/rl / upload / upload-wheel-py3_9-cpu (gh) (trunk failure)
Continuous Benchmark / CPU Pytest benchmark (gh) (trunk failure)
Continuous Benchmark / GPU Pytest benchmark (gh) (trunk failure)
RuntimeError: PassManager::run failed
Libs Tests on Linux / unittests-gym (3.9, 12.8) / linux-job (gh) (trunk failure)
test/test_libs.py::TestGym::test_gym_fake_td[True-False-3-HalfCheetah-v2]
LLM Tests on Linux / unittests (3.9, 12.8) / linux-job (gh) (trunk failure)
##[error]The operation was canceled.
Unit-tests on Linux / tests-olddeps (3.9, 11.6) / linux-job (gh) (trunk failure)
test/test_transforms.py::TestActionDiscretizer::test_transform_env[pendulum-SamplingStrategy.RANDOM-False-True]
Unit-tests on Windows / unittests-cpu (3.10, windows.4xlarge, cpu) / windows-job (gh) (trunk failure)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens

This all looks pretty good!
Could you share a (couple of) learning curve?
Another thing to do before landing is to add it to the sota-implementations CI run:
https://github.com/pytorch/rl/blob/main/.github/unittest/linux_sota/scripts/test_sota.py
Make sure the config passed there is as much barebone as we can - we just want to run the script for a couple of collection / optim iters and make sure it runs without error (not that it properly trains).
We also need to add it to the sota-check runs

simeetnayan81 · 2025-06-18T04:02:09Z

Thanks @vmoens . I'll add the required changes as well as some training curves.

simeetnayan81 · 2025-07-06T20:02:35Z

@vmoens, I have added the required scripts as well.

Not getting enough resources and time for hyperparam tuning to generate a proper training curve.

vmoens

LGTM, just a minor comment on the logger!

vmoens · 2025-07-10T03:56:34Z