GLM5: Timeouts and rate limit errors #450

@henrypark133

Problem

Users are experiencing frequent timeouts and rate limit (429) errors when using GLM5 inference.

Error Details

HTTPSConnectionPool(host='cloud-api.near.ai', port=443): Read timed out. (read timeout=60)
Rate limiting (429): 429 Client Error: Too Many Requests for url: https://cloud-api.near.ai/v1/chat/completions

Source

Proposed Resolution

  1. Relax rate limits for GLM5 endpoints
  2. Add more GLM5 instances to handle current demand
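Neither item above changes client behavior. As a stopgap until capacity improves, callers could retry both failure modes with exponential backoff. The following is a minimal sketch, not the project's client code: `RetryableError`, `backoff_delay`, and `call_with_retries` are hypothetical names, and it assumes that any `Retry-After` value the server sends is a number of seconds (whether the endpoint sends that header at all is not confirmed here).

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical wrapper for a 429 response or a read timeout."""

    def __init__(self, message, retry_after=None):
        super().__init__(message)
        # Seconds suggested by the server, if any (assumed numeric, not an HTTP date).
        self.retry_after = retry_after


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, jitter=random.random):
    """Return the seconds to wait before retrying after failed attempt `attempt` (0-based)."""
    if retry_after is not None:
        # Prefer the server's hint, but never wait longer than the cap.
        return min(float(retry_after), cap)
    # Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt)).
    return jitter() * min(cap, base * (2 ** attempt))


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns, re-raising once max_attempts is exhausted."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RetryableError as err:
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt, err.retry_after))
```

With `requests`, `send` would wrap the POST to `https://cloud-api.near.ai/v1/chat/completions` and raise `RetryableError` on a 429 status or on `requests.exceptions.ReadTimeout`. The jitter spreads retries out so that many clients hitting the limit at once do not all retry at the same moment.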
