Add arithmetic_chain community environment by nevasini1 · Pull Request #420 · NousResearch/atropos

nevasini1 · 2026-03-21T17:50:12Z

PR Type

RL Environment PR — Environment Snapshot & Zero-Training sections below
Non-Environment PR

Description

Adds `arithmetic_chain`, an environment of procedurally generated multi-step integer word problems. Answers are scored by checking the final `\boxed{}` value with `math_verify`, the same pattern as the GSM8K-style envs. Uses ManagedServer for training logprobs. No Hugging Face dataset is needed.
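For reviewers who want a feel for what "procedural add/sub/mul chains" means, here is a minimal sketch of such a generator. It is an illustration only: the names `Problem`, `OPS`, and `make_chain`, the operand range, and the prompt wording are all assumptions and are not taken from `arithmetic_chain_server.py`.

```python
import random
from dataclasses import dataclass


# Hypothetical container; the real environment may represent items differently.
@dataclass
class Problem:
    text: str
    answer: int


# Hypothetical operation table: phrase template plus the integer update it applies.
OPS = {
    "add": ("add {n}", lambda acc, n: acc + n),
    "sub": ("subtract {n}", lambda acc, n: acc - n),
    "mul": ("multiply by {n}", lambda acc, n: acc * n),
}


def make_chain(rng: random.Random, steps: int = 3, lo: int = 2, hi: int = 9) -> Problem:
    """Build one multi-step integer problem and track its gold answer."""
    acc = rng.randint(lo, hi)
    parts = [f"Start with {acc}."]
    for _ in range(steps):
        name = rng.choice(sorted(OPS))  # sorted for a stable choice order
        n = rng.randint(lo, hi)
        phrase, fn = OPS[name]
        acc = fn(acc, n)
        parts.append(f"Then {phrase.format(n=n)}.")
    parts.append("Give the final integer in \\boxed{}.")
    return Problem(text=" ".join(parts), answer=acc)


if __name__ == "__main__":
    problem = make_chain(random.Random(0))
    print(problem.text)
    print(problem.answer)
```

Because the gold answer is computed while the text is built, no stored dataset is required, and seeding the `random.Random` instance makes a given problem reproducible.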

Environment Snapshot

| Field | Your Entry |
|---|---|
| Environment Name | `arithmetic_chain` |
| Short Description | Procedural add/sub/mul chains; model must output final integer in `\boxed{}`. |
| Category | Verifiable-Reasoning |
| Dataset Needed? | No |
| External Deps | Same as core Atropos stack (`math_verify`, `latex2sympy2_extended`, etc.); already in `atroposlib`. |
| Environmental Variables | None required; use `OPENAI_API_KEY` only if you point `--openai.base_url` at a provider that needs it. |
| Compute Footprint Estimate | Lightweight verification (integer equality via parser); rollout cost is dominated by inference API. |

Zero-Training Test Results

Full process run: Not executed in the contributor CI sandbox (no inference key / local vLLM here). Please run locally when reviewing:

python environments/community/arithmetic_chain/arithmetic_chain_server.py process \
  --slurm false \
  --env.total_steps 5 \
  --env.data_path_to_save_groups arithmetic_chain_zero_train.jsonl

(Override --openai.* to match your server; see process --help.)
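After a run, a quick way to see what landed in the output file is to count the top-level keys per record. The sketch below assumes only that the file holds one JSON object per line, as the `.jsonl` extension suggests; it makes no assumption about which fields Atropos writes. The helper name `summarize_jsonl` is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_jsonl(path: str) -> Counter:
    """Count how often each top-level key appears across records in a JSONL file."""
    keys: Counter = Counter()
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            if isinstance(record, dict):
                keys.update(record.keys())
    return keys


if __name__ == "__main__":
    # File name taken from the command above; skipped if the run has not happened yet.
    path = "arithmetic_chain_zero_train.jsonl"
    if Path(path).exists():
        for key, count in summarize_jsonl(path).most_common():
            print(f"{key}: {count}")
```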

Offline scoring check (same parse / verify path as score() in arithmetic_chain_server.py):

| Example | Assistant tail | vs gold `\boxed{14}` | `verify` |
|---|---|---|---|
| Good | `Step: 3+4=7, 7*2=14. \boxed{14}` | correct | True |
| Bad | `\boxed{13}` | wrong | False |
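The two rows above can be reproduced without any inference server. The sketch below is a stdlib-only stand-in for the check, not the `math_verify` path that `score()` uses: it extracts the last `\boxed{...}` with a regular expression and compares it as an integer. The helper names `last_boxed` and `check_boxed_int` are hypothetical.

```python
import re
from typing import Optional


def last_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1] if matches else None


def check_boxed_int(completion: str, gold: int) -> bool:
    """True only if the final boxed value parses as an integer equal to gold."""
    boxed = last_boxed(completion)
    if boxed is None:
        return False
    try:
        return int(boxed.strip()) == gold
    except ValueError:
        return False  # boxed content was not an integer literal


print(check_boxed_int("Step: 3+4=7, 7*2=14. \\boxed{14}", 14))  # prints True
print(check_boxed_int("\\boxed{13}", 14))  # prints False
```

This stand-in is stricter than a symbolic verifier: it rejects anything that is not a plain integer literal, so forms such as `14.0` or `\frac{28}{2}` would fail here even where a symbolic comparison might accept them.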

W&B link: Not used for this check; enable use_wandb in config/CLI if you want a run URL for the template.

CI / checks (clarification)

pre-commit.ci - pr: Should show passing on the latest commit on this branch, including any “[pre-commit.ci] auto fixes from pre-commit.com hooks” commit. That is the authoritative style/lint check for this repo; it is not tied to the wording of any intermediate commit title.
Import order: An intermediate commit titled “Apply isort import order (ruff)…” may not match the final file: pre-commit.ci can push a small follow-up commit to match this repository’s isort/ruff profile exactly. If you see multiple commits touching imports, trust the latest tree + green pre-commit.ci.
Run Tests (GitHub Actions pytest): On PRs from forks, runs sometimes stay in action_required until a maintainer approves workflow execution for the contributor. That state is workflow approval, not “the isort commit failed.” Once approved, any real pytest failures would appear in the job logs.

Contributor local run (optional reference): pytest was run in a disposable venv: 137 passed, 50 skipped, 1 failed in an unrelated test (test_tokenize_for_trainer_mask_len_last_turn_only / HF model load). Your CI matrix may differ.

Test plan (for reviewers)

python environments/community/arithmetic_chain/arithmetic_chain_server.py process --help
Short process with your inference endpoint (JSONL + HTML output)
Optional: serve + run-api + trainer smoke test

Happy to adjust naming, defaults, or reward shaping to match maintainer preferences.


nevasini1 force-pushed the feature/arithmetic-chain-env branch from a5e6f65 to d4cff17 on March 21, 2026 21:42

Add arithmetic_chain community environment

e6bc008

Procedural multi-step integer chains with boxed answers; uses ManagedServer and math_verify for scoring. No external dataset required. Made-with: Cursor

nevasini1 force-pushed the feature/arithmetic-chain-env branch from d4cff17 to e6bc008 on March 21, 2026 21:42

nevasini1 and others added 2 commits March 21, 2026 17:59

Apply isort import order (ruff) to arithmetic_chain_server

ce58c3a

Made-with: Cursor

[pre-commit.ci] auto fixes from pre-commit.com hooks

4b77dc4

for more information, see https://pre-commit.ci

nevasini1 wants to merge 3 commits into NousResearch:main from nevasini1:feature/arithmetic-chain-env
