docs(tutorials): rewrite index as table with Colab badge column #640
Replaces the minimal bullet list with a table describing what each tutorial covers and whether a GPU is required. Includes all eight tutorials from the current and in-progress PR set.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Greptile Summary
This PR rewrites
Confidence Score: 4/5
Safe to merge after fixing the broken Colab link; the rest of the change is straightforward documentation.
The Wordle GRPO 'Open In Colab' badge navigates to a raw GitHub blob URL instead of Colab, so every user who clicks it on the published docs will land in GitHub's file viewer instead of opening the notebook. The fix is a one-word prefix change to the URL. Everything else in the PR (table layout, GPU flags, descriptions, toctree ordering) looks correct.
docs/source/tutorials/index.md, specifically line 6, the Wordle GRPO Colab badge URL.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Tutorials Index Page] --> B[Table Row: OpenEnv Tutorial]
A --> C[Table Row: Wordle GRPO]
A --> D[Table Row: RL Training 2048]
A --> E[Table Row: End-to-end Walkthrough]
A --> F[Table Row: SFT Warm-up for GRPO]
A --> G[Table Row: Rubrics]
A --> H[Table Row: MCP Tools]
A --> I[Table Row: Evaluating with Inspect AI]
B --> B1["Colab: colab.research.google.com - OK"]
C --> C1["Colab: github.com/huggingface/trl - WRONG HOST"]
D --> D1["No notebook"]
E --> E1["Colab: colab.research.google.com - OK"]
F --> F1["Colab: colab.research.google.com - OK"]
G --> G1["Colab: colab.research.google.com - OK"]
H --> H1["Colab: colab.research.google.com - OK"]
I --> I1["Colab: colab.research.google.com - OK"]
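The per-row host check that the flowchart summarizes could be automated along these lines. This is a minimal sketch, not part of the PR: the function name `non_colab_notebook_links` and the regular expression are hypothetical, and it assumes every notebook link in the index is a Markdown link whose target ends in `.ipynb`.

```python
from __future__ import annotations

import re
from urllib.parse import urlparse

COLAB_HOST = "colab.research.google.com"

# Matches the target of any Markdown link that points at a notebook file.
# Assumption: notebook URLs contain no whitespace or closing parentheses.
NOTEBOOK_LINK = re.compile(r"\]\((https?://[^)\s]+\.ipynb)\)")


def non_colab_notebook_links(markdown: str) -> list[tuple[int, str]]:
    """Return (line_number, url) for notebook links not hosted on Colab."""
    problems = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        for url in NOTEBOOK_LINK.findall(line):
            if urlparse(url).hostname != COLAB_HOST:
                problems.append((lineno, url))
    return problems
```

Run against the contents of `docs/source/tutorials/index.md`, a check like this would be expected to flag only the Wordle GRPO row described in the review.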
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
docs/source/tutorials/index.md:6
The "Open In Colab" badge for the Wordle GRPO row links to a plain GitHub blob URL (`github.com/huggingface/trl/blob/...`) instead of the `colab.research.google.com/github/...` form. Clicking the badge takes users to GitHub's file viewer rather than opening the notebook in Colab, which defeats the purpose of the badge entirely.
```suggestion
| [Wordle GRPO](wordle-grpo.md) | Train an agent to play Wordle using GRPO via TRL's `environment_factory`. Shows the multi-turn tool-calling loop: the model guesses a word each turn and receives letter-position feedback until it wins or the episode ends. | Yes | [](https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/openenv_wordle_grpo.ipynb) |
```
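The prefix change in the suggestion above can be expressed as a small helper. This is an illustrative sketch only: `github_blob_to_colab` is a hypothetical name, and the example assumes the broken badge pointed at the same repository path that appears in the suggested URL.

```python
from urllib.parse import urlparse


def github_blob_to_colab(url: str) -> str:
    """Turn a github.com blob URL into the matching 'open in Colab' URL."""
    parsed = urlparse(url)
    if parsed.hostname != "github.com" or "/blob/" not in parsed.path:
        raise ValueError(f"not a GitHub blob URL: {url}")
    # Colab opens GitHub-hosted notebooks at /github/<org>/<repo>/blob/<ref>/<path>.
    return f"https://colab.research.google.com/github{parsed.path}"


fixed = github_blob_to_colab(
    "https://github.com/huggingface/trl/blob/main/"
    "examples/notebooks/openenv_wordle_grpo.ipynb"
)
print(fixed)
```

The printed URL matches the link target in the suggested table row.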
Reviews (1): Last reviewed commit: "docs(tutorials): add Colab badge for MCP..."
burtenshaw
left a comment
Approved per maintainer merge request after required checks passed.
burtenshaw
left a comment
Approved per maintainer merge request after conflict resolution and required checks passed.
Summary
Rewrites `docs/source/tutorials/index.md` from a minimal bullet list into a table that describes what each tutorial covers and whether a GPU is required. The toctree is updated to include all eight tutorials in the current + in-progress set.

Must be merged after all prerequisite PRs below. Until those land, the Sphinx build will warn about missing pages (`end-to-end-walkthrough`, `sft-warmup`, `rubrics`, `mcp-environment`, `evaluation-inspect`).

Prerequisites (merge in any order before this PR)
- `wordle-grpo` migrated to `environment_factory`
- `rubrics`: composable reward computation
- `mcp-environment`: MCP tools in training and eval
- `end-to-end-walkthrough`: full GRPO pipeline
- `evaluation-inspect`: evaluating with Inspect AI
- `feature/harness-interface`: required by #636
- `sft-warmup`: SFT warm-up for GRPO

Type of Change
Alignment Checklist
Before submitting, verify:
- `.claude/docs/PRINCIPLES.md` and this PR aligns with our principles
- `.claude/docs/INVARIANTS.md` and no invariants are violated
- `/pre-submit-pr` (or `bash .claude/hooks/lint.sh` and tests) and addressed all issues

RFC Status
Test Plan
- `main`
- `cd docs && make html`: build should complete with zero warnings about missing tutorial pages
- `_build/html/tutorials/index.html`: confirm table renders correctly with all 8 rows linked

Claude Code Review
Automated Checks
Tier 1: Fixes Required
None.
Tier 2: Alignment Discussion
None. Pure docs rewrite with no API surface changes.