Description
Your current environment
vllm commit link: vllm-ascend: v0.11.0rc4-EPD-post1
🐛 Describe the bug
[ENCODE_0] : INFO 12-19 11:04:37 [loggers.py:127] Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Waiting: 2 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 0.0%
[ENCODE_0] : INFO 12-19 11:04:47 [loggers.py:127] Engine 000: Avg prompt throughput: 103.8 tokens/s, Avg generation throughput: 0.1 tokens/s, Running: 0 reqs, Waiting: 2 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 0.0%
[ENCODE_0] : INFO 12-19 11:04:57 [loggers.py:127] Engine 000: Avg prompt throughput: 92.6 tokens/s, Avg generation throughput: 0.1 tokens/s, Running: 0 reqs, Waiting: 2 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 0.0%
[ENCODE_0] : INFO 12-19 11:05:07 [loggers.py:127] Engine 000: Avg prompt throughput: 270.8 tokens/s, Avg generation throughput: 0.3 tokens/s, Running: 0 reqs, Waiting: 2 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 0.0%
[ENCODE_0] : INFO 12-19 11:05:17 [loggers.py:127] Engine 000: Avg prompt throughput: 92.7 tokens/s, Avg generation throughput: 0.1 tokens/s, Running: 0 reqs, Waiting: 2 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 0.0%
[ENCODE_0] : INFO 12-19 11:05:27 [loggers.py:127] Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Waiting: 2 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 0.0%
[PROXY] : ERROR 12-19 11:11:55 [proxy.py:439] Runtime error during generate: Request c3bb9d1a-3c75-423b-85a3-1f888da19a83 timed out after 600s without worker response.
[PROXY] : Error processing chat completion request: %s 500: No response from proxy
[PROXY] : INFO: 127.0.0.1:45092 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
[PROXY] : ERROR 12-19 11:11:56 [proxy.py:439] Runtime error during generate: Request b61f14bf-7406-4317-8760-5edea7dc5276 timed out after 600s without worker response.
[PROXY] : Error processing chat completion request: %s 500: No response from proxy
[PROXY] : INFO: 127.0.0.1:45098 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
[PROXY] : ERROR 12-19 11:11:56 [proxy.py:439] Runtime error during generate: Request 12979e06-0de5-415b-99d0-345547eec19a timed out after 600s without worker response.
[PROXY] : Error processing chat completion request: %s 500: No response from proxy
[PROXY] : INFO: 127.0.0.1:45432 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
[PROXY] : ERROR 12-19 11:11:57 [proxy.py:439] Runtime error during generate: Request 0e1f2120-5ee3-4ff7-abfe-860ef2d2a6d3 timed out after 600s without worker response.
[PROXY] : Error processing chat completion request: %s 500: No response from proxy
[PROXY] : INFO: 127.0.0.1:45022 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
[PROXY] : ERROR 12-19 11:11:57 [proxy.py:439] Runtime error during generate: Request 9b781388-4e3d-4d98-8eac-aab7f9b817c0 timed out after 600s without worker response.
[PROXY] : Error processing chat completion request: %s 500: No response from proxy
[PROXY] : INFO: 127.0.0.1:45114 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
[PROXY] : ERROR 12-19 11:11:58 [proxy.py:439] Runtime error during generate: Request 6d470295-d7a8-4d53-9880-54787942d259 timed out after 600s without worker response.
[PROXY] : Error processing chat completion request: %s 500: No response from proxy
[PROXY] : INFO: 127.0.0.1:45226 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
[PROXY] : ERROR 12-19 11:11:58 [proxy.py:439] Runtime error during generate: Request 0d6324c0-5d43-43ae-b5b1-a510981915e7 timed out after 600s without worker response.
[PROXY] : Error processing chat completion request: %s 500: No response from proxy
[PROXY] : INFO: 127.0.0.1:52332 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
[PROXY] : ERROR 12-19 11:11:58 [proxy.py:439] Runtime error during generate: Request 017dc567-b9ed-4f9f-96ce-b9bb63b8453a timed out after 600s without worker response.
[PROXY] : Error processing chat completion request: %s 500: No response from proxy
[PROXY] : INFO: 127.0.0.1:45358 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
[PROXY] : ERROR 12-19 11:11:59 [proxy.py:439] Runtime error during generate: Request d78658fa-5446-4e37-8b7e-5bfc7743ee5d timed out after 600s without worker response.
[PROXY] : Error processing chat completion request: %s 500: No response from proxy
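For triage, the `[PROXY]` timeout lines above can be summarized with a short script. This is only a minimal sketch: the regular expression is inferred from the log lines in this report rather than from any documented vllm log schema, and the helper name `summarize_timeouts` is hypothetical (it is not part of vllm or vllm-ascend).

```python
import re
from collections import Counter

# Pattern inferred from the [PROXY] lines in this report (an assumption,
# not an official log format): a 36-character request UUID followed by
# the timeout in seconds.
TIMEOUT_RE = re.compile(
    r"Request (?P<request_id>[0-9a-f-]{36}) timed out after (?P<seconds>\d+)s"
)


def summarize_timeouts(log_text: str) -> dict:
    """Hypothetical helper: collect timed-out request IDs from proxy logs
    and count how many requests hit each timeout value."""
    request_ids = []
    per_timeout = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            request_ids.append(match.group("request_id"))
            per_timeout[int(match.group("seconds"))] += 1
    return {"request_ids": request_ids, "per_timeout": dict(per_timeout)}


if __name__ == "__main__":
    import sys

    # Usage: python summarize_timeouts.py < proxy.log
    summary = summarize_timeouts(sys.stdin.read())
    print(f"{len(summary['request_ids'])} timed-out requests")
    for seconds, count in summary["per_timeout"].items():
        print(f"  {count} request(s) timed out after {seconds}s")
```

Listing the request IDs this way makes it easier to check whether the same IDs ever appear in the `[ENCODE_0]` worker log, which would help narrow down where the requests stall.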