Skip to content

Pipe chat_template_kwargs into typed RendererConfig#1468

Merged
eligotts merged 4 commits into
mainfrom
feat/typed-chat-template-kwargs
May 26, 2026
Merged

Pipe chat_template_kwargs into typed RendererConfig#1468
eligotts merged 4 commits into
mainfrom
feat/typed-chat-template-kwargs

Conversation

@eligotts
Copy link
Copy Markdown
Contributor

@eligotts eligotts commented May 26, 2026

Summary

Pipe sampling_args.extra_body.chat_template_kwargs through to the typed RendererConfig so users can set enable_thinking etc. in sampling and have it land on the right renderer — no [renderer] block needed.

  • Pop chat_template_kwargs from sampling in RendererClient.get_native_response.
  • Resolve None / AutoRendererConfig → concrete config via MODEL_RENDERER_MAP before merging, so kwargs validate against the actual renderer's schema (Qwen3RendererConfig.enable_thinking) instead of AutoRendererConfig's intentionally-minimal one. Mirrors renderers._resolve_auto's preserve_* carry.
  • Validation comes for free: typed configs (extra="forbid") reject unknown keys with a field-path error; DefaultRendererConfig (extra="allow") keeps the escape hatch for arbitrary jinja kwargs.
  • Kwargs override fields with the same name on the (resolved) base (last-write-wins, matches MITO's per-request override semantics).
  • Pool cache already keys on effective renderer_config_json, so identical kwargs across requests reuse a single renderer.

Replaces #1447 (pre-typed-config era).

Effective truth table

ClientConfig.renderer_config chat_template_kwargs Result
Qwen3RendererConfig(...) (explicit) {enable_thinking: False} merges → pool built with enable_thinking=False
Qwen3RendererConfig(...) (explicit) {enable_thinkng: False} (typo) ValueError(... Qwen3RendererConfig ... extra_forbidden)
AutoRendererConfig() (prime-rl default) {enable_thinking: False} auto-resolves Qwen3 from MODEL_RENDERER_MAP[model], then merges
None (verifiers eval CLI default) {enable_thinking: False} same as above
any none unchanged from main

Companion PRs

  • prime-rl: nothing required — already passes renderer + sampling_args through unchanged.
  • renderers: nothing required — MODEL_RENDERER_MAP + config_from_name + extra="forbid" / extra="allow" cover everything we need.

Test plan

  • PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run pytest tests/test_renderer_client.py -k "honors_configured_renderer_config or renderer_model_name_override or merges_chat_template_kwargs or rejects_invalid_chat_template or auto_resolves_for_chat_template" — 5 passed
  • uvx ruff check verifiers/clients/renderer_client.py tests/test_renderer_client.py — clean

🤖 Generated with Claude Code


Note

Medium Risk
Changes per-request renderer pool selection and sampling param forwarding; mis-merged kwargs could affect tokenization across rollouts, though behavior is covered by new unit tests.

Overview
RendererClient now treats sampling_args.extra_body.chat_template_kwargs as renderer configuration instead of engine sampling: they are removed from params sent to /generate and merged into the effective RendererConfig used when building or selecting a shared renderer pool.

A new _resolve_renderer_config path resolves None / AutoRendererConfig to the concrete config for the tokenizer model (via MODEL_RENDERER_MAP, honoring ClientConfig.renderer_model_name when set), carries preserve_* fields from auto config, then applies kwargs with pydantic validation (unknown keys fail on strict configs). _get_renderer_or_pool accepts a per-request resolved config so pool cache keys reflect merged settings.

Tests cover explicit/auto/None bases, renderer_model_name override during auto-resolution, stripping kwargs from forwarded sampling params, and ValidationError on typos.

Reviewed by Cursor Bugbot for commit 14ab2a7. Bugbot is set up for automated code reviews on this repo. Configure here.

Note

Pipe chat_template_kwargs from extra_body into a typed RendererConfig before generation

  • Extracts chat_template_kwargs from sampling_params.extra_body in RendererClient.get_native_response and removes them before forwarding to /generate.
  • Introduces _resolve_renderer_config in renderer_client.py to merge kwargs into a concrete RendererConfig, resolving Auto/None bases via MODEL_RENDERER_MAP and config_from_name.
  • When ClientConfig.renderer_model_name is set, auto-resolution uses that override instead of the request model name.
  • _get_renderer_or_pool now accepts an optional renderer_config parameter so the resolved config drives pool selection.
  • Risk: unknown chat_template_kwargs keys now raise a pydantic ValidationError before generation, which is a breaking change for callers passing unrecognized keys.

Macroscope summarized 14ab2a7.

eligotts and others added 2 commits May 26, 2026 03:13
Pop sampling_args.extra_body.chat_template_kwargs in
RendererClient.get_native_response and fold them into a typed
RendererConfig via pydantic model_validate before the pool lookup. The
typed configs' extra="forbid" rejects unknown keys with a field-path
error; DefaultRendererConfig's extra="allow" keeps the escape hatch for
arbitrary jinja kwargs. Kwargs override fields with the same name on
the base config (last-write-wins).

The pool cache key already keys on the effective config_json, so
identical kwargs across requests reuse a single renderer; varied kwargs
just produce additional pool entries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When ``ClientConfig.renderer_config`` is ``None`` (verifiers eval CLI
default) or ``AutoRendererConfig`` (prime-rl default), pull the
``MODEL_RENDERER_MAP`` lookup forward so that
``chat_template_kwargs.enable_thinking`` etc. validate against the
concrete renderer's schema (Qwen3RendererConfig) instead of the
intentionally-minimal Auto schema.

Mirrors ``renderers._resolve_auto``'s preserve_* carry. Pool cache key
still keys on the effective concrete config so this changes nothing for
explicit Qwen3RendererConfig users.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@eligotts eligotts marked this pull request as ready for review May 26, 2026 04:08
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit dd02a30. Configure here.

Comment thread verifiers/clients/renderer_client.py
@macroscopeapp
Copy link
Copy Markdown

macroscopeapp Bot commented May 26, 2026

Approvability

Verdict: Needs human review

This PR introduces new feature capability by piping chat_template_kwargs into typed RendererConfig, enabling per-request configuration overrides with auto-resolution logic. Changes runtime behavior by creating new configuration propagation paths and validating against concrete renderer schemas.

You can customize Macroscope's approvability policy. Learn more.

eligotts and others added 2 commits May 26, 2026 04:11
- Drop ValueError wrap; let pydantic ValidationError propagate with its
  structured field path intact
- Drop dead assert (MODEL_RENDERER_MAP.get(..., "default") never returns
  "auto")
- Use model_copy(update=...) for preserve_* carry instead of
  model_validate round-trip
- Collapse explicit-base happy-path test into the auto-resolve test
  (covers Qwen3 / Auto / None bases in one loop)
- Update error-path test to assert pydantic ValidationError directly

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bugbot caught a divergence: ``_resolve_renderer_config`` was looking up
``MODEL_RENDERER_MAP[model]`` while ``_get_renderer_or_pool`` builds the
pool against ``ClientConfig.renderer_model_name`` when set. With per-
request kwargs + None/Auto base, this could pick a renderer schema that
doesn't match the actual tokenizer model. Now resolve against the same
``renderer_model_name`` override the pool uses.

Also restore the ``assert concrete is not None`` (with comment) and
add an explicit ``cast`` to silence ty's union-narrowing — the runtime
path is unreachable but the type checker can't see through
``MODEL_RENDERER_MAP.get(..., "default")``.

Adds one test that pins the override-vs-request-model behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@eligotts eligotts requested a review from mikasenghaas May 26, 2026 17:55
Copy link
Copy Markdown
Member

@mikasenghaas mikasenghaas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as discussed, approving this to unblock quickly. short-term cleanup would be to expose resolve_renderer(model_name: str) -> RendererConfig from renderers so we can resolve 1. the renderer 2. the chat template kwargs in the prime-rl configs

@eligotts eligotts merged commit f9c68eb into main May 26, 2026
19 of 20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants