Releases: BerriAI/litellm
v1.76.0.dev2
Full Changelog: 1.76.0.dev1...v1.76.0.dev2
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.76.0.dev2
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
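Once the container above is up, the proxy exposes an OpenAI-compatible API on port 4000. A minimal sketch of a test request using only the Python standard library; the master key `sk-1234` and the model name `gpt-4o` are placeholders — substitute your own configured key and model:

```python
import json
import urllib.request

# Build a /chat/completions request against the locally running proxy.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello from LiteLLM proxy"}],
}
req = urllib.request.Request(
    "http://localhost:4000/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer sk-1234",
        "Content-Type": "application/json",
    },
)
# Uncomment once the proxy is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(req.full_url)
```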
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 110.0 | 143.15108061148368 | 6.459849558694928 | 6.459849558694928 | 1933 | 1933 | 87.39768300000605 | 2923.3753640000086 |
Aggregated | Failed ❌ | 110.0 | 143.15108061148368 | 6.459849558694928 | 6.459849558694928 | 1933 | 1933 | 87.39768300000605 | 2923.3753640000086 |
1.76.0.rc.1
What's Changed
- Litellm dev 08 16 2025 p3 by @krrishdholakia in #13694
- GPT-5-chat does not support function by @superpoussin22 in #13612
- fix(vertexai-batch): fix vertexai batch file format by @thiagosalvatore in #13576
- [Feat] Datadog LLM Observability - Add support for Failure Logging by @ishaan-jaff in #13726
- [Feat] DD LLM Observability - Add time to first token, litellm overhead, guardrail overhead latency metrics by @ishaan-jaff in #13734
- [Bug Fix] litellm incompatible with newest release of openAI v1.100.0 by @ishaan-jaff in #13728
- [Bug Fix] image_edit() function returns APIConnectionError with `litellm_proxy` - Support for both image edits and image generations by @ishaan-jaff in #13735
- [Fix] Cooldowns - don't return raw Azure Exceptions to client by @krrishdholakia in #13529
- Responses API - add default api version for openai responses api calls + Openrouter - fix claude-sonnet-4 on openrouter + Azure - Handle `openai/v1/responses` by @krrishdholakia in #13526
- Use namespace as prefix for s3 cache by @michal-otmianowski in #13704
- Add Search Functionality for Public Model Names in Model Dashboard by @NANDINI-star in #13687
- Add Azure Deployment Name Support in UI by @NANDINI-star in #13685
- Fix - gemini prompt caching cost calculation by @krrishdholakia in #13742
- Refactor - forward model group headers - reuse same logic as global header forwarding by @krrishdholakia in #13741
- Fix Groq streaming ASCII encoding issue by @colesmcintosh in #13675
- Add possibility to configure resources for migrations-job in Helm chart by @moandersson in #13617
- [Feat] Datadog LLM Observability - Add support for tracing guardrail input/output by @ishaan-jaff in #13767
- Models page row UI restructure by @NANDINI-star in #13771
- [Bug Fix] Bedrock KB - Using LiteLLM Managed Credentials for Query by @ishaan-jaff in #13787
- [Bug Fix] Fixes for using Auto Router with LiteLLM Docker Image by @ishaan-jaff in #13788
- [Feat] - UI Allow using Key/Team Based Logging for Langfuse OTEL by @ishaan-jaff in #13791
- Add long context support for claude-4-sonnet by @kankute-sameer in #13759
- Migrate to aim new firewall api by @hxdror in #13748
- [LLM Translation] Adjust max_input_tokens for azure/gpt-5-chat models in JSON configuration by @jugaldb in #13660
- Added Qwen3, Deepseek R1 0528 Throughput, GLM 4.5 and GPT-OSS models for Together AI by @Tasmay-Tibrewal in #13637
- Fix query passthrough deletion by @NANDINI-star in #13622
- [Feat] add fireworks_ai/accounts/fireworks/models/deepseek-v3-0324 by @ishaan-jaff in #13821
- New notifications toast UI everywhere by @NANDINI-star in #13813
- Fix key edit settings after regenerating key by @NANDINI-star in #13815
- [Feat] Add VertexAI qwen API Service by @ishaan-jaff in #13828
- Add OTEL tracing for actual LLM API call by @krrishdholakia in #13836
- [Performance] Improve LiteLLM Python SDK RPS by +200 RPS by @ishaan-jaff in #13839
- Fix(bedrock): fix the api key support for bedrock guardrail in proxy by @0x-fang in #13835
- Add rerank endpoint support for deepinfra by @kankute-sameer in #13820
- fix : Synchronize cache behavior between acompletion and completion by @UlookEE in #13803
- Include predicted output in MLflow tracing by @TomeHirata in #13795
- Fix - Ensure Helm chart auto generated master keys follow sk-xxxx format by @ishaan-jaff in #13871
- [Fix] Ensure Service Account Keys require team_id field on API Endpoints by @ishaan-jaff in #13873
- Fix e2e_ui_test by @NANDINI-star in #13861
- Fix Filter Dropdown UX Issue - Load Initial Options by @NANDINI-star in #13858
- [Helm charts] Enhance database configuration: add support for optional endpointKey by @jugaldb in #13763
- [Feat] Add new VertexAI image models `vertex_ai/imagen-4.0-generate-001`, `vertex_ai/imagen-4.0-ultra-generate-001`, `vertex_ai/imagen-4.0-fast-generate-001` by @ishaan-jaff in #13874
- [Feat] Add new Google AI Studio image models gemini/imagen-4.0-generate-001, gemini/imagen-4.0-ultra-generate-001, gemini/imagen-4.0-fast-generate-001 by @ishaan-jaff in #13876
- Update Baseten LiteLLM integration by @philipkiely-baseten in #13783
- Fix(Bedrock): fix application inference profile for pass-through endpoints for bedrock by @0x-fang in #13796
- Fix e2e_ui_test by @NANDINI-star in #13881
- [Performance] Use O(1) Set lookups for model routing by @ishaan-jaff in #13879
- Update model metadata for Deepinfra provider by @Toy-97 in #13883
- fix: fixing descriptor/response size mismatch on parallel_request_limiter_v3 by @luizrennocosta in #13863
- [Feat] Add support for voyage-context-3 embedding model by @kankute-sameer in #13868
- 🐛 Bug Fix: Updated URL handling for DataRobot provider URL by @carsongee in #13880
- Async s3 implementation by @michal-otmianowski in #13852
- fix: role chaining and session name with webauthentication for aws bedrock by @RichardoC in #13753
- [Bug Fix] JS exception in User Agent Activity: Cannot read properties of undefined by @ishaan-jaff in #13892
- [ui/dashboard] add support for host_vllm by @NiuBlibing in #13885
- [Documentation] Litellm rerank deepinfra endpoint by @kankute-sameer in #13845
- [MCP Gateway] fix StreamableHTTPSessionManager .run() error by @jugaldb in #13666
- [Performance] Reduce Significant CPU overhead from litellm_logging.py by @ishaan-jaff in #13895
- Fix Ollama transformations crash when tools are used with non-tool trained models by @bcdonadio in #13902
- Add openrouter deepseek/deepseek-chat-v3.1 support by @kankute-sameer in #13897
- docs: clarify prerequisites and env var for team rate limits by @TeddyAmkie in #13899
- [Enhancement] Add support for Mistral model file handling and update documentation by @jinskjoy in #13866
- fix permission access on prisma migrate in non-root image by @Ithanil in #13848
- feat(utils.py): accept 'api_version' as param for validate_environment by @mainred in #13808
- Responses API - support `allowed_openai_params` + Mistral - handle empty assistant content + support new mistral 'thinking' response block by @krrishdholakia in #13671
- fix(openai/image_edits): Support 'mask' parameter for openai image edits by @krrishdholakia in #13673
- SSO - Free SSO usage for up to 5 users + remove deprecated dbrx models (dbrx-instruct, llama 3.1) by @krrishdholakia in #13843
- Fix calling key with access to model alias by @krrishdholakia in #13830
- [Feat] New LLM API - AI/ML API for Image Gen by @ishaan-jaff in #13893
- [Perf] Improvements for Async Success Handler (Logging Callbacks) - Approx +130 RPS by @ishaan-jaff in #13905
- Added FAQ under deployment docs by @mubashir1osmani in #13912
- updated claude-code docs by @mubashir1osmani in #13784
- [Feat] UI QA Fixes by @ishaan-jaff in #13915
New Contributors
- @moandersson made their first contribution in #13617
- @Tasmay-Tibrewal made their first contribution in #13637
- @UlookEE made their first contribution in #13803
- @philipkiely-baseten made their first contribution in #13783
- @luizrennocosta...
1.76.0.dev1
Full Changelog: 1.76.0.rc.1...1.76.0.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-1.76.0.dev1
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 62 | 100.40888889168315 | 6.512486450300847 | 6.512486450300847 | 1948 | 1948 | 43.315152999980455 | 2949.629679999987 |
Aggregated | Failed ❌ | 62 | 100.40888889168315 | 6.512486450300847 | 6.512486450300847 | 1948 | 1948 | 43.315152999980455 | 2949.629679999987 |
v1.76.0-nightly
What's Changed
- updated claude-code docs by @mubashir1osmani in #13784
- [Feat] UI QA Fixes by @ishaan-jaff in #13915
Full Changelog: v1.76.0-stable-draft...v1.76.0-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.76.0-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 110.0 | 146.96742692248557 | 6.510920428813231 | 6.510920428813231 | 1948 | 1948 | 90.41871399995216 | 2697.183144999997 |
Aggregated | Failed ❌ | 110.0 | 146.96742692248557 | 6.510920428813231 | 6.510920428813231 | 1948 | 1948 | 90.41871399995216 | 2697.183144999997 |
v1.76.0-stable-draft
What's Changed
- Litellm stable release fixes by @krrishdholakia in #13682
- [UI QA] Aug 16th Fixes by @ishaan-jaff in #13684
- Litellm dev 08 16 2025 p3 by @krrishdholakia in #13694
- GPT-5-chat does not support function by @superpoussin22 in #13612
- fix(vertexai-batch): fix vertexai batch file format by @thiagosalvatore in #13576
- [Feat] Datadog LLM Observability - Add support for Failure Logging by @ishaan-jaff in #13726
- [Feat] DD LLM Observability - Add time to first token, litellm overhead, guardrail overhead latency metrics by @ishaan-jaff in #13734
- [Bug Fix] litellm incompatible with newest release of openAI v1.100.0 by @ishaan-jaff in #13728
- [Bug Fix] image_edit() function returns APIConnectionError with `litellm_proxy` - Support for both image edits and image generations by @ishaan-jaff in #13735
- [Fix] Cooldowns - don't return raw Azure Exceptions to client by @krrishdholakia in #13529
- Responses API - add default api version for openai responses api calls + Openrouter - fix claude-sonnet-4 on openrouter + Azure - Handle `openai/v1/responses` by @krrishdholakia in #13526
- Use namespace as prefix for s3 cache by @michal-otmianowski in #13704
- Add Search Functionality for Public Model Names in Model Dashboard by @NANDINI-star in #13687
- Add Azure Deployment Name Support in UI by @NANDINI-star in #13685
- Fix - gemini prompt caching cost calculation by @krrishdholakia in #13742
- Refactor - forward model group headers - reuse same logic as global header forwarding by @krrishdholakia in #13741
- Fix Groq streaming ASCII encoding issue by @colesmcintosh in #13675
- Add possibility to configure resources for migrations-job in Helm chart by @moandersson in #13617
- [Feat] Datadog LLM Observability - Add support for tracing guardrail input/output by @ishaan-jaff in #13767
- Models page row UI restructure by @NANDINI-star in #13771
- [Bug Fix] Bedrock KB - Using LiteLLM Managed Credentials for Query by @ishaan-jaff in #13787
- [Bug Fix] Fixes for using Auto Router with LiteLLM Docker Image by @ishaan-jaff in #13788
- [Feat] - UI Allow using Key/Team Based Logging for Langfuse OTEL by @ishaan-jaff in #13791
- Add long context support for claude-4-sonnet by @kankute-sameer in #13759
- Migrate to aim new firewall api by @hxdror in #13748
- [LLM Translation] Adjust max_input_tokens for azure/gpt-5-chat models in JSON configuration by @jugaldb in #13660
- Added Qwen3, Deepseek R1 0528 Throughput, GLM 4.5 and GPT-OSS models for Together AI by @Tasmay-Tibrewal in #13637
- Fix query passthrough deletion by @NANDINI-star in #13622
- [Feat] add fireworks_ai/accounts/fireworks/models/deepseek-v3-0324 by @ishaan-jaff in #13821
- New notifications toast UI everywhere by @NANDINI-star in #13813
- Fix key edit settings after regenerating key by @NANDINI-star in #13815
- [Feat] Add VertexAI qwen API Service by @ishaan-jaff in #13828
- Add OTEL tracing for actual LLM API call by @krrishdholakia in #13836
- [Performance] Improve LiteLLM Python SDK RPS by +200 RPS by @ishaan-jaff in #13839
- Fix(bedrock): fix the api key support for bedrock guardrail in proxy by @0x-fang in #13835
- Add rerank endpoint support for deepinfra by @kankute-sameer in #13820
- fix : Synchronize cache behavior between acompletion and completion by @UlookEE in #13803
- Include predicted output in MLflow tracing by @TomeHirata in #13795
- Fix - Ensure Helm chart auto generated master keys follow sk-xxxx format by @ishaan-jaff in #13871
- [Fix] Ensure Service Account Keys require team_id field on API Endpoints by @ishaan-jaff in #13873
- Fix e2e_ui_test by @NANDINI-star in #13861
- Fix Filter Dropdown UX Issue - Load Initial Options by @NANDINI-star in #13858
- [Helm charts] Enhance database configuration: add support for optional endpointKey by @jugaldb in #13763
- [Feat] Add new VertexAI image models `vertex_ai/imagen-4.0-generate-001`, `vertex_ai/imagen-4.0-ultra-generate-001`, `vertex_ai/imagen-4.0-fast-generate-001` by @ishaan-jaff in #13874
- [Feat] Add new Google AI Studio image models gemini/imagen-4.0-generate-001, gemini/imagen-4.0-ultra-generate-001, gemini/imagen-4.0-fast-generate-001 by @ishaan-jaff in #13876
- Update Baseten LiteLLM integration by @philipkiely-baseten in #13783
- Fix(Bedrock): fix application inference profile for pass-through endpoints for bedrock by @0x-fang in #13796
- Fix e2e_ui_test by @NANDINI-star in #13881
- [Performance] Use O(1) Set lookups for model routing by @ishaan-jaff in #13879
- Update model metadata for Deepinfra provider by @Toy-97 in #13883
- fix: fixing descriptor/response size mismatch on parallel_request_limiter_v3 by @luizrennocosta in #13863
- [Feat] Add support for voyage-context-3 embedding model by @kankute-sameer in #13868
- 🐛 Bug Fix: Updated URL handling for DataRobot provider URL by @carsongee in #13880
- Async s3 implementation by @michal-otmianowski in #13852
- fix: role chaining and session name with webauthentication for aws bedrock by @RichardoC in #13753
- [Bug Fix] JS exception in User Agent Activity: Cannot read properties of undefined by @ishaan-jaff in #13892
- [ui/dashboard] add support for host_vllm by @NiuBlibing in #13885
- [Documentation] Litellm rerank deepinfra endpoint by @kankute-sameer in #13845
- [MCP Gateway] fix StreamableHTTPSessionManager .run() error by @jugaldb in #13666
- [Performance] Reduce Significant CPU overhead from litellm_logging.py by @ishaan-jaff in #13895
- Fix Ollama transformations crash when tools are used with non-tool trained models by @bcdonadio in #13902
- Add openrouter deepseek/deepseek-chat-v3.1 support by @kankute-sameer in #13897
- docs: clarify prerequisites and env var for team rate limits by @TeddyAmkie in #13899
- [Enhancement] Add support for Mistral model file handling and update documentation by @jinskjoy in #13866
- fix permission access on prisma migrate in non-root image by @Ithanil in #13848
- feat(utils.py): accept 'api_version' as param for validate_environment by @mainred in #13808
- Responses API - support `allowed_openai_params` + Mistral - handle empty assistant content + support new mistral 'thinking' response block by @krrishdholakia in #13671
- fix(openai/image_edits): Support 'mask' parameter for openai image edits by @krrishdholakia in #13673
- SSO - Free SSO usage for up to 5 users + remove deprecated dbrx models (dbrx-instruct, llama 3.1) by @krrishdholakia in #13843
- Fix calling key with access to model alias by @krrishdholakia in #13830
- [Feat] New LLM API - AI/ML API for Image Gen by @ishaan-jaff in #13893
- [Perf] Improvements for Async Success Handler (Logging Callbacks) - Approx +130 RPS by @ishaan-jaff in #13905
- Added FAQ under deployment docs by @mubashir1osmani in #13912
New Contributors
- @michal-otmianowski made their first contribution in #13704
- @moandersson made their first contribution in #13617
- @Tasmay-Tibrewal made their first contribution in #13637
- @UlookEE made their f...
v1.75.9-nightly
What's Changed
- Litellm stable release fixes by @krrishdholakia in #13682
- [UI QA] Aug 16th Fixes by @ishaan-jaff in #13684
- Litellm dev 08 16 2025 p3 by @krrishdholakia in #13694
- GPT-5-chat does not support function by @superpoussin22 in #13612
- fix(vertexai-batch): fix vertexai batch file format by @thiagosalvatore in #13576
- [Feat] Datadog LLM Observability - Add support for Failure Logging by @ishaan-jaff in #13726
- [Feat] DD LLM Observability - Add time to first token, litellm overhead, guardrail overhead latency metrics by @ishaan-jaff in #13734
- [Bug Fix] litellm incompatible with newest release of openAI v1.100.0 by @ishaan-jaff in #13728
- [Bug Fix] image_edit() function returns APIConnectionError with `litellm_proxy` - Support for both image edits and image generations by @ishaan-jaff in #13735
- [Fix] Cooldowns - don't return raw Azure Exceptions to client by @krrishdholakia in #13529
- Responses API - add default api version for openai responses api calls + Openrouter - fix claude-sonnet-4 on openrouter + Azure - Handle `openai/v1/responses` by @krrishdholakia in #13526
- Use namespace as prefix for s3 cache by @michal-otmianowski in #13704
- Add Search Functionality for Public Model Names in Model Dashboard by @NANDINI-star in #13687
- Add Azure Deployment Name Support in UI by @NANDINI-star in #13685
- Fix - gemini prompt caching cost calculation by @krrishdholakia in #13742
- Refactor - forward model group headers - reuse same logic as global header forwarding by @krrishdholakia in #13741
- Fix Groq streaming ASCII encoding issue by @colesmcintosh in #13675
- Add possibility to configure resources for migrations-job in Helm chart by @moandersson in #13617
- [Feat] Datadog LLM Observability - Add support for tracing guardrail input/output by @ishaan-jaff in #13767
- Models page row UI restructure by @NANDINI-star in #13771
- [Bug Fix] Bedrock KB - Using LiteLLM Managed Credentials for Query by @ishaan-jaff in #13787
- [Bug Fix] Fixes for using Auto Router with LiteLLM Docker Image by @ishaan-jaff in #13788
- [Feat] - UI Allow using Key/Team Based Logging for Langfuse OTEL by @ishaan-jaff in #13791
- Add long context support for claude-4-sonnet by @kankute-sameer in #13759
- Migrate to aim new firewall api by @hxdror in #13748
- [LLM Translation] Adjust max_input_tokens for azure/gpt-5-chat models in JSON configuration by @jugaldb in #13660
- Added Qwen3, Deepseek R1 0528 Throughput, GLM 4.5 and GPT-OSS models for Together AI by @Tasmay-Tibrewal in #13637
- Fix query passthrough deletion by @NANDINI-star in #13622
New Contributors
- @michal-otmianowski made their first contribution in #13704
- @moandersson made their first contribution in #13617
- @Tasmay-Tibrewal made their first contribution in #13637
Full Changelog: v1.75.8-nightly...v1.75.9-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.75.9-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 98 | 143.26229013447838 | 6.435162941463474 | 0.0 | 1926 | 0 | 70.0977230000035 | 1988.133740999956 |
Aggregated | Passed ✅ | 98 | 143.26229013447838 | 6.435162941463474 | 0.0 | 1926 | 0 | 70.0977230000035 | 1988.133740999956 |
v1.75.9.dev3
What's Changed
- Litellm stable release fixes by @krrishdholakia in #13682
- [UI QA] Aug 16th Fixes by @ishaan-jaff in #13684
- Litellm dev 08 16 2025 p3 by @krrishdholakia in #13694
- GPT-5-chat does not support function by @superpoussin22 in #13612
- fix(vertexai-batch): fix vertexai batch file format by @thiagosalvatore in #13576
- [Feat] Datadog LLM Observability - Add support for Failure Logging by @ishaan-jaff in #13726
- [Feat] DD LLM Observability - Add time to first token, litellm overhead, guardrail overhead latency metrics by @ishaan-jaff in #13734
- [Bug Fix] litellm incompatible with newest release of openAI v1.100.0 by @ishaan-jaff in #13728
- [Bug Fix] image_edit() function returns APIConnectionError with `litellm_proxy` - Support for both image edits and image generations by @ishaan-jaff in #13735
- [Fix] Cooldowns - don't return raw Azure Exceptions to client by @krrishdholakia in #13529
- Responses API - add default api version for openai responses api calls + Openrouter - fix claude-sonnet-4 on openrouter + Azure - Handle `openai/v1/responses` by @krrishdholakia in #13526
- Use namespace as prefix for s3 cache by @michal-otmianowski in #13704
- Add Search Functionality for Public Model Names in Model Dashboard by @NANDINI-star in #13687
- Add Azure Deployment Name Support in UI by @NANDINI-star in #13685
- Fix - gemini prompt caching cost calculation by @krrishdholakia in #13742
- Refactor - forward model group headers - reuse same logic as global header forwarding by @krrishdholakia in #13741
- Fix Groq streaming ASCII encoding issue by @colesmcintosh in #13675
- Add possibility to configure resources for migrations-job in Helm chart by @moandersson in #13617
- [Feat] Datadog LLM Observability - Add support for tracing guardrail input/output by @ishaan-jaff in #13767
- Models page row UI restructure by @NANDINI-star in #13771
New Contributors
- @michal-otmianowski made their first contribution in #13704
- @moandersson made their first contribution in #13617
Full Changelog: v1.75.8-nightly...v1.75.9.dev3
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.75.9.dev3
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 167.2434307868771 | 6.365346480657209 | 0.0 | 1905 | 0 | 103.2638190000057 | 1092.3964670000146 |
Aggregated | Passed ✅ | 130.0 | 167.2434307868771 | 6.365346480657209 | 0.0 | 1905 | 0 | 103.2638190000057 | 1092.3964670000146 |
v1.75.8-stable
What's Changed
- Litellm stable release fixes by @krrishdholakia in #13682
- [UI QA] Aug 16th Fixes by @ishaan-jaff in #13684
Full Changelog: v1.75.8-nightly...v1.75.8-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.75.8-stable
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 120.0 | 151.45218493123352 | 6.364773520109997 | 6.364773520109997 | 1905 | 1905 | 90.27845100001741 | 2454.094601999998 |
Aggregated | Failed ❌ | 120.0 | 151.45218493123352 | 6.364773520109997 | 6.364773520109997 | 1905 | 1905 | 90.27845100001741 | 2454.094601999998 |
v1.75.5-stable
Full Changelog: v1.75.5.rc.1...v1.75.5-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.75.5-stable
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 160.0 | 192.4000595502936 | 6.278691415340353 | 0.0 | 1879 | 0 | 117.62542199994641 | 1498.513451000008 |
Aggregated | Passed ✅ | 160.0 | 192.4000595502936 | 6.278691415340353 | 0.0 | 1879 | 0 | 117.62542199994641 | 1498.513451000008 |
v1.75.8-nightly
What's Changed
- [Feat] UI - Add Confirmation Modal Before Deleting Keys by @ishaan-jaff in #13655
- [Bug Fix] Using `stream=True` + `background=True` with Responses API by @ishaan-jaff in #13654
- Chore: update boto3 version to 1.37.38 for Bedrock Integration by @0x-fang in #13656
- Fix LangfuseOtelSpanAttributes constants to match expected values at langfuse by @willfinnigan in #13659
- Fixed incorrect key info endpoint by @dcbark01 in #13633
- trivy/secrets false positives by @javacruft in #13631
- [Feat] Team Member Rate Limits - show team member tpm/rpm limits by @ishaan-jaff in #13662
- [Feat] UI QA for Team Member Rate Limits by @ishaan-jaff in #13664
- Chore: update boto3 to 1.36.0 and aioboto3 to 13.4.0 for Bedrock Integration by @0x-fang in #13665
- [Feat] UI - Allow editing team member rpm/tpm limits by @ishaan-jaff in #13669
- [Bug Fix] Add cachePoint support for assistant and tool messages in Bedrock by @yytdfc in #13640
- [Docs] v1.75.8-stable by @ishaan-jaff in #13676
- Edit Budget Duration + Other Improvements in User Settings by @NANDINI-star in #13629
New Contributors
- @willfinnigan made their first contribution in #13659
- @dcbark01 made their first contribution in #13633
- @javacruft made their first contribution in #13631
Full Changelog: v1.75.7-nightly...v1.75.8-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.75.8-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 140.0 | 192.18510385185118 | 6.314985319551383 | 0.0 | 1890 | 0 | 114.82424099995114 | 1309.5362409999325 |
Aggregated | Passed ✅ | 140.0 | 192.18510385185118 | 6.314985319551383 | 0.0 | 1890 | 0 | 114.82424099995114 | 1309.5362409999325 |