fix(examples): update multi-turn examples to current renderer API by hallerite · Pull Request #68 · PrimeIntellect-ai/renderers

hallerite · 2026-05-27T15:23:17Z

Why

The five multi-turn examples were written against renderers <= 0.1.x and no longer run against current main — they fail before reaching any inference engine.

Three API drifts (fixed in all five scripts)

Was (old API)	Now (current API)
`Qwen35Renderer(tokenizer, enable_thinking=...)`	`Qwen35Renderer(tokenizer, Qwen35RendererConfig(enable_thinking=...))` — the kwarg was removed by the typed-config refactor (#60)
`tool_call.get("function")` / `.get("arguments")`	`parse_response().tool_calls` are `ParsedToolCall` dataclasses → attribute access (`tc.name` / `tc.arguments` / `tc.id`)
`bridged_ids = bridge_to_next_turn(...)` then sliced as a list	`bridge_to_next_turn()` returns `RenderedTokens` → read `.token_ids`

Also replaced json.dumps(parsed.tool_calls) in print_parsed (throws on dataclasses) with a readable per-call line.

Validation (actually run on GPU, current renderer code)

Example	Qwen3.5-4B (think on/off)	gpt-oss-20b
transformers	✅	✅
vllm	✅	✅
sglang (offline)	✅	⚠️ host-blocked (see below)

Each ✅ is the full loop: render → generate → multiply tool-call parsed [ok] → bridge_to_next_turn → tool result → final answer "391".

Notes:

vllm on this Blackwell box needed env flags TORCH_CUDA_ARCH_LIST=12.0 and VLLM_USE_FLASHINFER_SAMPLER=0 (environment config, not example changes).
sglang requires the harmony floor relaxation in fix(deps): lower openai-harmony floor to >=0.0.4 for SGLang compatibility #69 (merged) to install alongside renderers — every sglang through 0.5.12.post1 hard-pins openai-harmony==0.0.4. The Qwen path is validated end-to-end. The gpt-oss-via-sglang target fails only because this host's CUDA toolkit is 11.5 (no nvcc new enough to JIT-compile the sm_120/c++20 CUDA-graph kernels); gpt-oss itself is validated via transformers + vllm.
tinker not run (needs the hosted Tinker API); its renderer-side code is identical to the others.

🤖 Generated with Claude Code

The examples were written against renderers <=0.1.x and no longer run against current main. Three API drifts, fixed across all five scripts: - Constructor: `Qwen35Renderer(tokenizer, enable_thinking=...)` — the `enable_thinking` kwarg was removed by the typed-config refactor (#60). Pass `Qwen35RendererConfig(enable_thinking=...)` instead. - Tool calls: `parse_response().tool_calls` are `ParsedToolCall` dataclasses, not dicts. Use attribute access (`tc.name` / `tc.arguments` / `tc.id`) instead of `tool_call.get(...)`, and build OpenAI-format tool_calls explicitly when echoing the assistant turn. - Bridge: `bridge_to_next_turn()` returns `RenderedTokens` (not `list[int]`); read the extended id stream from `.token_ids`. Also replaced the `json.dumps(parsed.tool_calls)` print (which now fails on dataclasses) with a readable per-call line. Validated the transformers example end-to-end on GPU (Qwen3.5-4B, both thinking modes): render -> generate -> tool-call parse -> bridge -> tool result -> final answer "17 x 23 = 391". The other four scripts share identical renderer-side logic (only the engine transport differs) and pass compile + ruff. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

macroscopeapp · 2026-05-27T15:29:54Z

Approvability

Verdict: Approved

All changes are confined to example files in the examples/ directory, mechanically updating them to match current API signatures. No production library code is modified.

^{You can customize Macroscope's approvability policy. Learn more.}

…e from Tokenizer Brings in #68 (examples), #69 (harmony floor), #71 (qwen3.5 hard-coded enable_thinking). The only qwen35.py conflict is resolved by keeping #71's hard-coded `_ENABLE_THINKING_DEFAULTS` table (no `apply_chat_template` probe) on top of #31's `Tokenizer`/`Processor` type hints. Now that #71 removed the last hand-coded-renderer call to `apply_chat_template`, drop it from the `Tokenizer` protocol so a plain `tokenizers.Tokenizer` wrapper satisfies it. `apply_chat_template` moves to a new `ChatTemplateTokenizer(Tokenizer, Protocol)` subtype, required only by `DefaultRenderer` (the generic chat-template fallback).

macroscopeapp Bot approved these changes May 27, 2026

View reviewed changes

hallerite merged commit 74425da into main May 27, 2026
11 checks passed

hallerite deleted the worktree-examples-current-api branch May 27, 2026 17:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(examples): update multi-turn examples to current renderer API#68

fix(examples): update multi-turn examples to current renderer API#68
hallerite merged 1 commit into
mainfrom
worktree-examples-current-api

hallerite commented May 27, 2026 •

edited

Loading

Uh oh!

macroscopeapp Bot commented May 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

hallerite commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why

Three API drifts (fixed in all five scripts)

Validation (actually run on GPU, current renderer code)

Uh oh!

macroscopeapp Bot commented May 27, 2026

Approvability

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

hallerite commented May 27, 2026 •

edited

Loading