Description
Problem
Requests with large token inputs or outputs often result in a timeout error from the Anthropic API.
Solution
Streaming responses by setting stream=True in requests allows larger requests to succeed: response tokens are returned as soon as they are generated, which keeps the API connection open instead of waiting for the full completion.
Streaming is supported by the Anthropic API: https://docs.anthropic.com/en/api/messages-streaming
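As a rough sketch of what the streamed path involves (the request shape in the comment follows the Anthropic Messages API docs; the runnable part below uses a stubbed event iterator in place of the real client, so class names there are illustrative stand-ins, not SDK types):

```python
# With stream=True, the Messages API returns incremental server-sent events
# instead of a single payload, e.g. (placeholder model name):
#
#   stream = client.messages.create(
#       model="claude-3-5-sonnet-latest",
#       max_tokens=1024,
#       messages=[{"role": "user", "content": prompt}],
#       stream=True,
#   )
#
# The connector then assembles the completion from "content_block_delta"
# events. A stubbed stream stands in for the API here so the assembly
# logic runs without credentials.

from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class TextDelta:
    text: str


@dataclass
class ContentBlockDeltaEvent:
    type: str
    delta: TextDelta


def fake_event_stream() -> Iterator[ContentBlockDeltaEvent]:
    """Simulate the incremental text-delta events of a streamed response."""
    for piece in ("Hello", ", ", "world", "!"):
        yield ContentBlockDeltaEvent(type="content_block_delta", delta=TextDelta(piece))


def collect_text(events: Iterable) -> str:
    """Assemble the full completion from incremental text deltas."""
    return "".join(
        event.delta.text
        for event in events
        if getattr(event, "type", None) == "content_block_delta"
    )


result = collect_text(fake_event_stream())
```

Because the first deltas arrive almost immediately, the connection stays active even when the full completion takes a long time to generate.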
Additional context
See this merged PR which adds streaming support for OpenAI: #95
Adding streaming support for Anthropic should be simpler because we've already updated the base classes in src.artkit.model.base._model and src.artkit.model.llm._cached to handle streamed responses. For Anthropic, we should only need updates in two places:
- Update the connector class in src.artkit.model.llm.anthropic._anthropic with changes similar to those in src.artkit.model.llm.openai._openai. Be sure to add timeout error handling and helpful logging directing users who hit timeout errors to try stream=True.
- Add non-streaming and streaming unit tests in test.artkit_test.model.llm.test_anthropic.py
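The timeout handling and logging asked for above might look roughly like this (a minimal sketch: `get_response` and `flaky_request` are hypothetical names, and a stand-in exception class replaces the SDK's timeout error so the sketch runs standalone):

```python
import logging

logger = logging.getLogger(__name__)


class APITimeoutError(Exception):
    """Stand-in for the Anthropic SDK's timeout error, for this sketch only."""


def get_response(make_request, stream: bool = False) -> str:
    """Call the API; on timeout, log a hint pointing users at stream=True."""
    try:
        return make_request(stream=stream)
    except APITimeoutError:
        logger.warning(
            "Anthropic API request timed out. For long prompts or completions, "
            "retry with stream=True to keep the connection open."
        )
        raise


def flaky_request(stream: bool = False) -> str:
    # Simulated failure mode: long non-streamed requests time out.
    if not stream:
        raise APITimeoutError("request timed out")
    return "ok"
```

The key point is that the error is re-raised after logging, so existing retry and error-handling behavior upstream is unchanged; the log line only adds the actionable hint.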
Script for testing changes
Suggest creating a test script similar to the one in the OpenAI streaming PR: https://github.com/BCG-X-Official/artkit/pull/95/files#diff-9c8eebcf0111bf1fec6b50445ed9464c167f2a188a80527809dc246cdf273d73
Ensure the script replicates the timeout error for long responses and verifies that the solution solves the problem.
The script may be committed while the PR is a draft, enabling reviewers to use the script for testing. However, it should be removed before merging the PR.
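A minimal shape for such a script could be the following (hypothetical skeleton: a stub client simulates the timeout-on-long-requests failure mode so the structure is runnable without credentials; swap in a real Anthropic client when actually testing):

```python
# Skeleton of a manual test script: first reproduce the timeout on a long,
# non-streamed completion, then show the same request succeeding with
# stream=True. StubClient and its length threshold are placeholders.


class APITimeoutError(Exception):
    """Stand-in for the SDK's timeout error."""


class StubClient:
    """Mimics the failure mode: long non-streamed requests time out."""

    def complete(self, prompt: str, stream: bool = False) -> str:
        if len(prompt) > 100 and not stream:
            raise APITimeoutError("simulated timeout on long request")
        return "completed"


def run_check(client) -> dict:
    long_prompt = "x" * 500  # long enough to trigger the simulated timeout
    results = {}
    try:
        client.complete(long_prompt, stream=False)
        results["non_streaming"] = "succeeded"
    except APITimeoutError:
        results["non_streaming"] = "timed out"
    results["streaming"] = (
        "succeeded" if client.complete(long_prompt, stream=True) else "failed"
    )
    return results


outcome = run_check(StubClient())
```

Against the real API, the script should show the non-streamed run timing out and the streamed run completing, which is exactly the evidence reviewers need.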