
Add ModelRetryMiddleware for fine-grained control for LLM retries in agents created via create_agent #33983

@anupam-stripe

Description


Checked other resources

  • This is a feature request, not a bug report or usage question.
  • I added a clear and descriptive title that summarizes the feature request.
  • I used the GitHub search to find a similar feature request and didn't find it.
  • I checked the LangChain documentation and API reference to see if this feature already exists.
  • This is not related to the langchain-community package.

Package (Required)

  • langchain
  • langchain-openai
  • langchain-anthropic
  • langchain-classic
  • langchain-core
  • langchain-cli
  • langchain-model-profiles
  • langchain-tests
  • langchain-text-splitters
  • langchain-chroma
  • langchain-deepseek
  • langchain-exa
  • langchain-fireworks
  • langchain-groq
  • langchain-huggingface
  • langchain-mistralai
  • langchain-nomic
  • langchain-ollama
  • langchain-perplexity
  • langchain-prompty
  • langchain-qdrant
  • langchain-xai
  • Other / not sure / general

Feature Description

The ability to control the LLM retry policy at a granular level when creating an agent via create_agent, in an ergonomic way: which exceptions to retry on, the retry count, the backoff strategy, etc.

Use Case

Currently, adding robust, fine-grained retry logic to the LLM used within create_agent is difficult due to a few factors:

  1. The create_agent function expects an LLM of type BaseChatModel.
  2. Applying the standard LCEL .with_retry() method to the LLM (e.g., llm.with_retry(...)) changes its type to RunnableRetry, which is not compatible with create_agent and causes a type error.
  3. The alternative, setting max_retries directly on the chat model instance (e.g., ChatOpenAI(max_retries=3)), is too limited. It doesn't allow for specifying which exceptions should trigger a retry (e.g., only RateLimitErrors) or implementing custom backoff strategies (like exponential backoff).

This makes it challenging to build resilient agents that can gracefully handle specific, transient LLM errors without implementing complex custom logic.
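To make the requested behavior concrete, here is a minimal, framework-free sketch of the kind of control being asked for: retry only on specific exceptions, cap the attempt count, and apply exponential backoff. `RateLimitError` here is a hypothetical stand-in for a provider error, and `call_with_retry` is an illustrative helper, not an existing LangChain API:

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider rate-limit error."""


def call_with_retry(fn, *, retry_on=(RateLimitError,), max_retries=3,
                    initial_delay=1.0, backoff_factor=2.0, sleep=time.sleep):
    """Retry `fn` only on the listed exceptions, with exponential backoff."""
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_retries:
                raise  # out of attempts; surface the original error
            sleep(delay)
            delay *= backoff_factor


# Example: a flaky "model call" that rate-limits twice, then succeeds.
calls = {"n": 0}

def flaky_model_call():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError("429 Too Many Requests")
    return "ok"

result = call_with_retry(flaky_model_call, sleep=lambda _: None)
```

Note that any other exception type (say, an authentication error) would propagate immediately instead of being retried, which is exactly the selectivity that `max_retries` on the chat model does not offer.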

Proposed Solution

I propose adding a new ModelRetryMiddleware component, similar in concept to the existing ToolRetryMiddleware.

This middleware would wrap the agent's LLM (or the entire agent runnable) and provide a clear, dedicated mechanism for handling LLM-specific exceptions. It should allow the user to:

  • Specify a list of exceptions to retry on.
  • Define a backoff policy (e.g., exponential, fixed).
  • Set a maximum number of retries.

This would provide the same level of granular control we have for tool retries but apply it to the core model calls, solving the type-incompatibility issue with .with_retry and the lack of control from max_retries.
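To illustrate the shape such a component could take, here is a hypothetical sketch (the class name, constructor parameters, and `wrap` method are all assumptions modeled loosely on ToolRetryMiddleware, not an existing API). In a real implementation, the middleware would plug into create_agent's middleware list and intercept model calls rather than wrapping a plain callable:

```python
import time


class ModelRetryMiddleware:
    """Hypothetical sketch: retries the wrapped model call on the
    configured exceptions with exponential backoff."""

    def __init__(self, *, retry_on=(Exception,), max_retries=2,
                 initial_delay=0.5, backoff_factor=2.0, sleep=time.sleep):
        self.retry_on = tuple(retry_on)
        self.max_retries = max_retries
        self.initial_delay = initial_delay
        self.backoff_factor = backoff_factor
        self.sleep = sleep

    def wrap(self, model_call):
        """Return a callable that retries `model_call` per the policy."""
        def wrapped(*args, **kwargs):
            delay = self.initial_delay
            for attempt in range(self.max_retries + 1):
                try:
                    return model_call(*args, **kwargs)
                except self.retry_on:
                    if attempt == self.max_retries:
                        raise  # exhausted retries; re-raise the last error
                    self.sleep(delay)
                    delay *= self.backoff_factor
        return wrapped
```

Usage would then read naturally at agent-construction time, e.g. `ModelRetryMiddleware(retry_on=(RateLimitError,), max_retries=3)` passed alongside other middleware, keeping the model itself a plain BaseChatModel and avoiding the RunnableRetry type mismatch.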

Alternatives Considered

Using .with_retry(): As mentioned, this breaks the type compatibility with create_agent.

Setting max_retries on the model: This lacks the necessary control over which exceptions to retry and how to back off.

Implementing a custom middleware: This is overly complex for what feels like a common and necessary pattern for production-ready agents. The gap is especially noticeable given that ToolRetryMiddleware already ships as part of the langchain package.

Additional Context

No response

Labels

1.1 candidate · feature request (request for an enhancement / additional functionality) · langchain (related to the package `langchain`)
