ToolEnv silently fails with default use_renderer=true

Still exploring this, but it looks like the default `use_renderer = true` in the orchestrator causes silent failures for any environment using `vf.ToolEnv` (e.g. `wiki_search`, `wordle`).

The TITO rollout path returns `ParsedToolCall` dataclass instances, but `ToolEnv` expects dict-shaped tool calls and tries to subscript them (`tool_call["function"]["name"]`). This raises `TypeError` on every tool-using rollout. The orchestrator catches the error and retries, so training continues, but only on trajectories where the model didn't use tools.
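To make the mismatch concrete, here is a minimal standalone sketch. The `ParsedToolCall` and `FunctionCall` classes below are hypothetical stand-ins (I haven't copied the real field layout), and `as_dict` is just one possible normalization, not something either library provides:

```python
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any


@dataclass
class FunctionCall:
    # Hypothetical stand-in; the real field names may differ.
    name: str
    arguments: str


@dataclass
class ParsedToolCall:
    # Hypothetical stand-in for the object the renderer path returns.
    id: str
    function: FunctionCall


def tool_name_dict_style(tool_call: Any) -> str:
    # Mirrors the subscript access described above.
    return tool_call["function"]["name"]


def as_dict(tool_call: Any) -> dict:
    """Return a dict-shaped tool call for either a dict or a dataclass instance."""
    if is_dataclass(tool_call) and not isinstance(tool_call, type):
        return asdict(tool_call)  # also converts nested dataclasses
    return dict(tool_call)


call = ParsedToolCall(id="call_0", function=FunctionCall("search", '{"q": "x"}'))

try:
    tool_name_dict_style(call)
    error = None
except TypeError as exc:
    # Dataclass instances do not support subscripting.
    error = str(exc)

# After normalization the dict-style access works again.
name = tool_name_dict_style(as_dict(call))
```

If something like this is the right direction, the conversion would probably belong at the boundary where the rollout result is handed to the environment, so that `ToolEnv` itself can stay dict-only.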

`vf-eval` doesn't surface the problem because eval rollouts always go through `openai_chat_completions` regardless of the flag.

Setting `use_renderer = false` in `rl.toml` works around it.

Possibly related: PrimeIntellect-ai/prime-rl#1196 hit a similar object-vs-dict mismatch in tool calls, though that one crashed hard rather than failing silently.

ToolEnv silently fails with default use_renderer=true #2524