Conversation

yf-yang commented Sep 6, 2025

Closes #2778

yf-yang (Author) commented Sep 6, 2025

@DouweM

DouweM self-assigned this Sep 8, 2025

DouweM (Collaborator) left a comment

Thank you!

yf-yang (Author) commented Sep 9, 2025

Sure, I'll get back to you in one or two days.

yf-yang (Author) commented Sep 11, 2025

Hmmm, is this test failure relevant?
Also, I'm not sure whether the coverage is enough, as I don't have a DeepSeek/OpenRouter key.

yf-yang requested a review from DouweM on September 11, 2025, 17:03
DouweM (Collaborator) commented Sep 11, 2025

@yf-yang The test failure indicates that 'response_prefix': None should be added to the expected output that's compared in that test.

> Hmmm, is this test failure relevant? Also, I'm not sure whether the coverage is enough, as I don't have a DeepSeek/OpenRouter key.

I have both keys so I can look at that today or tomorrow. Thanks for all your work here!

DouweM (Collaborator) commented Sep 15, 2025

@yf-yang I'll have a look here tomorrow, sorry for the delay!

(Merge commit; conflicts in pydantic_ai_slim/pydantic_ai/_agent_graph.py and tests/test_agent.py)
DouweM (Collaborator) commented Sep 16, 2025

@yf-yang I played around with this for a bit and made some tweaks (and introduced a new issue 🙈), and while it works well for Anthropic, I have a few doubts about the design.

We could make this work for DeepSeek, OpenRouter, and Mistral as well, but out of the major providers only Anthropic supports it (and they may drop it at some point, as it makes it much easier to jailbreak models, which is presumably why OpenAI and Google don't offer it), so I don't love having response_prefix as a top-level field on run() etc. for such a niche feature.

Ideally, the field could live on AnthropicModelSettings as anthropic_assistant_prefill, but those settings are passed to every model request in an agent run, while we really only want it on the initial request. But there's currently no way for the model class methods to determine which step of the agent run they're on. It could make sense to add request_index to ModelRequestParameters (which could also help with #1820), which I'd like better than adding the Anthropic-only response_prefix there.

Would you mind refactoring in that direction? We can then drop the openai.py and profile stuff, and make it clear that this is an Anthropic-only feature. If other major models support it at some point, we can then consider introducing a top-level field on ModelSettings.
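
For concreteness, a rough, self-contained sketch of that proposed shape; `anthropic_assistant_prefill` and `request_index` are only suggested names from this thread, not existing pydantic-ai APIs, and the real classes would differ:

```python
# Rough sketch only: `anthropic_assistant_prefill` and `request_index` are
# proposed names from this discussion, not part of pydantic-ai today.
from dataclasses import dataclass
from typing import TypedDict


class AnthropicModelSettings(TypedDict, total=False):
    # Proposed Anthropic-only setting: text the next assistant turn must start with.
    anthropic_assistant_prefill: str


@dataclass
class ModelRequestParameters:
    # Proposed field: which model request of the agent run this is (0-based).
    request_index: int = 0


def build_anthropic_messages(
    history: list[dict],
    settings: AnthropicModelSettings,
    params: ModelRequestParameters,
) -> list[dict]:
    """Append the prefill as a trailing assistant message, but only on the run's first request."""
    prefill = settings.get('anthropic_assistant_prefill')
    if prefill and params.request_index == 0:
        # Anthropic continues generation from a trailing assistant message.
        return [*history, {'role': 'assistant', 'content': prefill}]
    return history


# First request gets the prefill; later requests (e.g. after tool calls) don't.
user_turn = [{'role': 'user', 'content': 'Return the result as JSON.'}]
print(build_anthropic_messages(user_turn, {'anthropic_assistant_prefill': '{'}, ModelRequestParameters(0)))
print(build_anthropic_messages(user_turn, {'anthropic_assistant_prefill': '{'}, ModelRequestParameters(1)))
```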

yf-yang (Author) commented Sep 19, 2025

@DouweM What do you mean by "we really only want it on the initial request"? Is the initial request the first call of a multi-step tool iteration, or the first call when no history exists?

Also, I'd say sometimes I'd like to dynamically control the prefill during an agent run. For example, it is a useful strategy for supporting multiple tool calls, similar to PromptOutput, by:

assistant: LLM generates tool call name
user: generates a tool schema
assistant (better with prefill): <json>args</json>

So I don't think it's a good idea to make it a model-level concept. The lifetime of the prefill should be similar to the user prompt's.
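
As a concrete illustration of that flow, hand-written Anthropic-style messages (the exact wire format pydantic-ai would produce differs):

```python
# Hand-written illustration of the flow described above, using Anthropic-style
# role/content messages; names and the tool schema text are made up.
messages = [
    {'role': 'user', 'content': 'Extract the user info as JSON.'},
    # 1. The model names the tool it wants to call.
    {'role': 'assistant', 'content': 'I will call the `extract_user` tool.'},
    # 2. We reply with the tool's schema so the model knows the argument shape.
    {'role': 'user', 'content': 'Tool schema: {"name": "extract_user", '
                                '"parameters": {"name": "string", "age": "integer"}}'},
    # 3. Prefill the next assistant turn so the model continues straight into the args.
    {'role': 'assistant', 'content': '<json>'},
]
print(messages)
```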

DouweM (Collaborator) commented Sep 19, 2025

> What do you mean by "we really only want it on the initial request"? Is the initial request the first call of a multi-step tool iteration, or the first call when no history exists?

@yf-yang I was referring to the first request made to the model in an agent run (which could include message history from a previous run), assuming that agent.run(..., response_prefix='...') means the next thing the model generates has to start with that prefix. But if the model then calls some tools and we send the results back in a follow-up request, the response to that request shouldn't necessarily start with the same prefix.

> Also, I'd say sometimes I'd like to dynamically control the prefill during an agent run. For example, it is a useful strategy for supporting multiple tool calls, similar to PromptOutput, by:
>
> assistant: LLM generates tool call name
> user: generates a tool schema
> assistant (better with prefill): `<json>args</json>`

Yeah, that's reasonable, and similar to #1820, where we're discussing letting model settings be changed per request in an agent run via a new prepare_request hook. That would work well with this being a model setting, which you could then set as desired on just the first request or on a later one. That solves my problem, but it means this feature doesn't make sense without that prepare_request hook.

> So I don't think it's a good idea to make it a model-level concept. The lifetime of the prefill should be similar to the user prompt's.

But the user prompt is only used on the first request, generated in UserPromptNode, right? So I think that's not how you intend prefill to work.
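
For illustration, a tiny sketch of what per-request control could look like with such a hook; `prepare_request` and `anthropic_assistant_prefill` are hypothetical names here, not existing pydantic-ai APIs:

```python
# Hypothetical sketch: a per-request hook like the `prepare_request` idea in #1820
# does not exist in pydantic-ai yet; the name, signature, and setting are illustrative only.
def prepare_request(request_index: int, settings: dict) -> dict:
    """Pick model settings per request within one agent run."""
    if request_index == 0:
        # First request: no prefill, let the model decide which tool to call.
        return settings
    # Later request (after the tool schema was sent back): force the args format.
    return {**settings, 'anthropic_assistant_prefill': '<json>'}


print(prepare_request(0, {}))   # {}
print(prepare_request(1, {}))   # {'anthropic_assistant_prefill': '<json>'}
```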

Successfully merging this pull request may close these issues:

Support prefill by ending history with ModelResponse