-
Notifications
You must be signed in to change notification settings - Fork 16
Typed renderer configs #60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
18 commits
Select commit
Hold shift + click to select a range
b37d3f2
Add renderer chat template kwargs passthrough
eligotts 0bd7e6d
Reject constructor kwargs in chat template kwargs
eligotts d80d4ac
Simplify chat template kwargs validation
eligotts 7fbf390
Format chat template kwargs changes
eligotts b543277
Address chat template kwargs review comments
eligotts deb3bdf
Verify chat_template_kwargs parity vs apply_chat_template
hallerite c0384a2
Expose every chat-template kwarg the upstream Jinja accepts
hallerite f593a29
Refuse bridge when add_vision_id loses prior count
hallerite f545378
Apply ruff format to chat-template-kwargs changes
hallerite 255db59
Replace chat_template_kwargs with typed renderer configs
hallerite eb03934
Clean up stale references to the deleted chat_template_kwargs API
hallerite d2bcf7e
Strip doc-rot framing — describe current state, not migration history
hallerite 8c514e0
Rewrite renderer-config doc in the prime-rl / verifiers docs style
hallerite 0769548
Rename config_for_name → config_from_name
hallerite 3dab877
Inherit BaseRendererConfig from pydantic_config.BaseConfig
hallerite 2d74a6b
Trim stale pyproject comments on pydantic and prime-pydantic-config
hallerite 3e07d7a
Drop direct pydantic dep — get it transitively via prime-pydantic-config
hallerite 4c9099d
Bump prime-pydantic-config floor to the latest dev release (0.3.0.dev83)
hallerite File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -31,3 +31,6 @@ coverage.xml | |
| .idea/ | ||
| .vscode/ | ||
| *.swp | ||
|
|
||
| # agent harness state | ||
| .claude/ | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,163 @@ | ||
| # Renderer config | ||
|
|
||
| `renderers.RendererConfig` is the typed input to `create_renderer` and | ||
| `create_renderer_pool`. It pins the renderer choice and its template-control | ||
| kwargs at construction. | ||
|
|
||
| ```python | ||
| from renderers import create_renderer, Qwen35RendererConfig | ||
|
|
||
| r = create_renderer(tokenizer, Qwen35RendererConfig(enable_thinking=False)) | ||
| ``` | ||
|
|
||
| `RendererConfig` is a pydantic discriminated union (one variant per renderer, | ||
| dispatched on the `name` field). Selecting a variant exposes exactly the | ||
| fields that renderer's chat template honours; anything else raises a | ||
| `pydantic.ValidationError` at construction. | ||
|
|
||
| ## Per-renderer configs | ||
|
|
||
| Each hand-coded renderer has a typed config class with the template kwargs | ||
| its Jinja chat template reads. For example: | ||
|
|
||
| | Renderer | Config class | Template fields | | ||
| |----------------|--------------------------|----------------------------------------------------------------| | ||
| | Qwen3 | `Qwen3RendererConfig` | `enable_thinking` | | ||
| | Qwen3.5 / 3.6 | `Qwen35RendererConfig` | `enable_thinking`, `add_vision_id` | | ||
| | Qwen3-VL | `Qwen3VLRendererConfig` | `add_vision_id` | | ||
| | GLM-5 / 5.1 | `GLM5RendererConfig` | `enable_thinking`, `clear_thinking` | | ||
| | GLM-4.5 | `GLM45RendererConfig` | `enable_thinking` | | ||
| | Nemotron-3 | `Nemotron3RendererConfig`| `enable_thinking`, `truncate_history_thinking` | | ||
| | Kimi K2.5 | `KimiK25RendererConfig` | `thinking` | | ||
| | MiniMax-M2 | `MiniMaxM2RendererConfig`| `model_identity` | | ||
| | Laguna-XS.2 | `LagunaXS2RendererConfig`| `enable_thinking`, `render_assistant_messages_raw` | | ||
| | gpt-oss | `GptOssRendererConfig` | `reasoning_effort`, `conversation_start_date` | | ||
|
|
||
| Field names mirror the upstream Jinja variable names. Passing | ||
| `Qwen3RendererConfig(add_vision_id=True)` raises — Qwen3 is text-only, so | ||
| the field doesn't exist on its config. Use | ||
| `type(config).template_field_names()` to introspect the fields that mirror | ||
| chat-template kwargs (parity is verified against `apply_chat_template` in | ||
| `tests/test_renderer_config_parity.py`). | ||
|
|
||
| Configs are frozen. To override a field, construct a new instance or call | ||
| `config.model_copy(update={...})`. | ||
|
|
||
| ## Auto-resolution | ||
|
|
||
| `create_renderer(tokenizer)` (no config) resolves the renderer from | ||
| `tokenizer.name_or_path` via `MODEL_RENDERER_MAP`: | ||
|
|
||
| ```python | ||
| r = create_renderer(tokenizer) # AutoRendererConfig() is the default | ||
| r = create_renderer(tokenizer, AutoRendererConfig(preserve_all_thinking=True)) | ||
| ``` | ||
|
|
||
| `AutoRendererConfig` carries only the shared `preserve_*` flags. Template | ||
| kwargs depend on the renderer, so overriding them requires naming the | ||
| renderer explicitly: | ||
|
|
||
| ```python | ||
| r = create_renderer(tokenizer, GLM5RendererConfig(clear_thinking=False)) | ||
| ``` | ||
|
|
||
| Auto-resolution fails loudly for VLMs that miss the exact-match lookup — | ||
| `DefaultRenderer` only knows `apply_chat_template` + text tokens, so silently | ||
| falling back for a VLM would produce token streams the trainer can't | ||
| reconstruct. Text-only fine-tunes without a registered renderer fall back to | ||
| `DefaultRenderer` and log the choice at INFO. | ||
|
|
||
| ## `preserve_*` flags | ||
|
|
||
| Every variant carries two renderer-agnostic flags on `_BaseRendererConfig`: | ||
|
|
||
| - `preserve_all_thinking: bool = False` — re-emit `reasoning_content` on | ||
| every past assistant turn, even when the chat template would drop it. | ||
| - `preserve_thinking_between_tool_calls: bool = False` — re-emit | ||
| `reasoning_content` only inside the in-flight tool cycle (the contiguous | ||
| A-T-…-A block after the most recent `user` message, when it contains at | ||
| least one `tool` response). A new user turn closes the block and drops | ||
| its thinking. | ||
|
|
||
| These OR-compose with template-level toggles. GLM-5's `clear_thinking` and | ||
| Nemotron-3's `truncate_history_thinking` already gate past thinking; the | ||
| `preserve_*` flags add to that: | ||
|
|
||
| | `clear_thinking` | `preserve_all_thinking` | past thinking? | | ||
| |------------------|-------------------------|----------------| | ||
| | `True` (default — drop) | `False` (default) | dropped | | ||
| | `True` | `True` | kept | | ||
| | `False` (keep) | `False` | kept | | ||
| | `False` | `True` | kept | | ||
|
|
||
| `preserve_*` can only extend retention, never force a drop. The canonical | ||
| use case is **compaction**: injecting a `user` turn like *"summarize the work | ||
| so far"* puts every prior assistant in a past cycle, and | ||
| `preserve_all_thinking=True` keeps reasoning visible end-to-end. | ||
|
|
||
| ## `DefaultRendererConfig` accepts arbitrary Jinja kwargs | ||
|
|
||
| `DefaultRenderer` wraps `tokenizer.apply_chat_template` for any model that | ||
| doesn't have a hand-coded renderer. Its config sets `extra="allow"`: | ||
|
|
||
| ```python | ||
| from renderers import create_renderer, DefaultRendererConfig | ||
|
|
||
| r = create_renderer( | ||
| tokenizer, | ||
| DefaultRendererConfig( | ||
| tool_parser="qwen3", # registered in renderers.parsers | ||
| reasoning_parser="think", | ||
| enable_thinking=False, # forwarded to apply_chat_template | ||
| custom_jinja_kwarg=True, # ditto | ||
| ), | ||
| ) | ||
| ``` | ||
|
|
||
| `tool_parser` and `reasoning_parser` are typed because they configure | ||
| `DefaultRenderer`'s own parsing pipeline. Every other field lands in | ||
| `model_extra` and `DefaultRenderer._apply` forwards `model_extra` verbatim | ||
| to `apply_chat_template`. | ||
|
|
||
| ## Downstream integration | ||
|
|
||
| Downstream pydantic configs (`prime-rl` orchestrator, `verifiers` | ||
| `ClientConfig`) hold a single field typed as `RendererConfig`: | ||
|
|
||
| ```python | ||
| from pydantic import BaseModel, Field | ||
| from renderers import AutoRendererConfig, RendererConfig | ||
|
|
||
| class ClientConfig(BaseModel): | ||
| renderer: RendererConfig = Field(default_factory=AutoRendererConfig) | ||
| ``` | ||
|
|
||
| In TOML / YAML, the discriminator routes deserialization: | ||
|
|
||
| ```toml | ||
| [client.renderer] | ||
| name = "qwen3.5" | ||
| enable_thinking = false | ||
| add_vision_id = true | ||
| preserve_all_thinking = true | ||
| ``` | ||
|
|
||
| Pydantic dispatches on `name = "qwen3.5"` to `Qwen35RendererConfig`. Bogus | ||
| combinations (e.g. `add_vision_id` under `name = "qwen3"`) raise at | ||
| config-load with a clear message naming the offending field and the variant | ||
| that rejected it. | ||
|
|
||
| To construct a config from a renderer name string (e.g. from a CLI flag): | ||
|
|
||
| ```python | ||
| from renderers import config_from_name | ||
|
|
||
| cfg = config_from_name("glm-5") # → GLM5RendererConfig() with defaults | ||
| cfg = config_from_name("auto") # → None, the implicit "auto" form | ||
| ``` | ||
|
|
||
| ## Renaming a renderer is a breaking change | ||
|
|
||
| The discriminator key is the renderer name string. Renaming `"qwen3.5"` to | ||
| something else would break any downstream config that references it by | ||
| name. Add new renderers; don't rename existing ones. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.