Open
Description
Problem
Users are experiencing frequent read timeouts (after 60 seconds) and rate-limit (429 Too Many Requests) errors when using GLM5 inference through cloud-api.near.ai.
Error Details
HTTPSConnectionPool(host='cloud-api.near.ai', port=443): Read timed out. (read timeout=60)
Rate limiting (429): 429 Client Error: Too Many Requests for url: https://cloud-api.near.ai/v1/chat/completions
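Until the server-side limits change, callers can soften both errors by retrying with backoff. The following is a minimal client-side sketch, not the API's documented behavior: `RetryableError`, `backoff_delay`, and `call_with_retries` are hypothetical names, and it is an assumption that the 429 response carries a `Retry-After` value at all. If it does not, the sketch falls back to exponential backoff with full jitter.

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical marker for a 429 response or a read timeout."""

    def __init__(self, message, retry_after=None):
        super().__init__(message)
        # Seconds suggested by the server, if any (assumed, not confirmed).
        self.retry_after = retry_after


def backoff_delay(attempt, base=1.0, cap=30.0, retry_after=None):
    """Return the wait in seconds before retrying after failed attempt `attempt` (0-based)."""
    if retry_after is not None:
        # Prefer the server's hint, but never wait longer than `cap`.
        return min(float(retry_after), cap)
    # Full jitter: uniform in [0, min(cap, base * 2**attempt)].
    return random.uniform(0, min(cap, base * 2 ** attempt))


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it succeeds or `max_attempts` is reached."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RetryableError as exc:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error to the caller.
            sleep(backoff_delay(attempt, retry_after=exc.retry_after))
```

With the `requests` library, `send` would wrap the POST to `/v1/chat/completions` and raise `RetryableError` on `requests.exceptions.ReadTimeout` or on a response whose `status_code` is 429. Retrying does not add capacity, so this only smooths short bursts; it does not replace the resolutions proposed in this issue.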
Source
- Slack: https://jasnahworkspace.slack.com/archives/C09GV3MTGFK/p1771535121703859
- Datadog logs: https://us3.datadoghq.com/logs?query=env%3Aprod%20service%3Acloud-api%20status%3A%28warn%20OR%20error%29%20%40fields.message%3A%22Organization%20concurrent%20request%20limit%20exceeded%20for%20model%22&agg_m=count&agg_m_source=base&agg_t=count&clustering_pattern_field_path=message&cols=host%2Cservice&messageDisplay=inline&refresh_mode=sliding&storage=hot&stream_sort=desc&viz=stream&from_ts=1771363463621&to_ts=1771536263621&live=true
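The Datadog query above filters on the message "Organization concurrent request limit exceeded for model", which suggests the 429s are tied to the number of in-flight requests per organization. As a hedged sketch of how a caller could stay under such a limit, the hypothetical `ConcurrencyLimiter` below caps simultaneous calls with a semaphore. The actual limit value is not stated in this issue, so `max_in_flight` is a placeholder the caller must choose.

```python
import threading


class ConcurrencyLimiter:
    """Hypothetical helper: allow at most `max_in_flight` calls at once."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)
        self._lock = threading.Lock()
        self.in_flight = 0  # Calls currently running.
        self.peak = 0       # Highest concurrency observed so far.

    def run(self, fn, *args, **kwargs):
        """Run `fn`, blocking first if the limit is already reached."""
        with self._slots:
            with self._lock:
                self.in_flight += 1
                self.peak = max(self.peak, self.in_flight)
            try:
                return fn(*args, **kwargs)
            finally:
                with self._lock:
                    self.in_flight -= 1
```

Every thread that issues inference requests would share one limiter and call `limiter.run(send_request, payload)`, where `send_request` is whatever function performs the HTTP call. This only bounds concurrency within a single process; several processes in the same organization would still need coordination or a higher server-side limit.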
Proposed Resolution
- Relax rate limits for GLM5 endpoints
- Add more GLM5 instances to handle current demand