
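[Bug]: Without the Redis discovery service, using the Qwen2.5-VL-7B-Instruct model over IPv6, launch a 1proxy1e1p1d deployment (proxy and workers on a single machine) with prefix caching enabled. The instances start successfully and requests are sent continuously. Scaling the P instance out and back in works (requests reach the added instance). Scaling the D instance out also works (requests reach the added instance), but after scaling the D instance in, requests fail and the P instance becomes abnormal. #175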

@wangwei1254

Description


Your current environment

The output of vllm python collect_env.py vllm commit link: 615fb1b

Collecting environment information...

    System Info

==============================
OS : openEuler 24.03 (LTS-SP2) (aarch64)
GCC version : (GCC) 10.3.1
Clang version : Could not collect
CMake version : version 4.1.2
Libc version : glibc-2.38

==============================
PyTorch Info

PyTorch version : 2.7.1+cpu
Is debug build : False
CUDA used to build PyTorch : None
ROCM used to build PyTorch : N/A

==============================
Python Environment

Python version : 3.11.13 (main, Nov 2 2025, 08:49:25) [GCC 12.3.1 (openEuler 12.3.1-98.oe2403sp2)] (64-bit runtime)
Python platform : Linux-4.19.90-vhulk2211.3.0.h1912.eulerosv2r10.aarch64-aarch64-with-glibc2.38

==============================
CUDA / GPU Info

Is CUDA available : False
CUDA runtime version : No CUDA
CUDA_MODULE_LOADING set to : N/A
GPU models and configuration : No CUDA
Nvidia driver version : No CUDA
cuDNN version : No CUDA
HIP runtime version : N/A
MIOpen runtime version : N/A
Is XNNPACK available : True

==============================
CPU Info

Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 192
On-line CPU(s) list: 0-191
Vendor ID: HiSilicon
BIOS Vendor ID: HiSilicon
Model name: Kunpeng-920
BIOS Model name: HUAWEI Kunpeng 920 5250 To be filled by O.E.M. CPU @ 2.6GHz
BIOS CPU family: 280
Model: 0
Thread(s) per core: 1
Core(s) per socket: 48
Socket(s): 4
Stepping: 0x1
BogoMIPS: 200.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm ssbs
L1d cache: 12 MiB (192 instances)
L1i cache: 12 MiB (192 instances)
L2 cache: 96 MiB (192 instances)
L3 cache: 192 MiB (8 instances)
NUMA node(s): 8
NUMA node0 CPU(s): 0-23
NUMA node1 CPU(s): 24-47
NUMA node2 CPU(s): 48-71
NUMA node3 CPU(s): 72-95
NUMA node4 CPU(s): 96-119
NUMA node5 CPU(s): 120-143
NUMA node6 CPU(s): 144-167
NUMA node7 CPU(s): 168-191
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected

==============================
Versions of relevant libraries

[pip3] numpy==1.26.4
[pip3] pyzmq==27.1.0
[pip3] torch==2.7.1
[pip3] torch_npu==2.7.1
[pip3] torchaudio==2.8.0
[pip3] torchvision==0.22.1
[pip3] transformers==4.57.1
[conda] Could not collect

==============================
vLLM Info

ROCM Version : Could not collect
vLLM Version : 0.11.0
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled
GPU Topology:
Could not collect

==============================
Environment Variables

LD_LIBRARY_PATH=/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64:/usr/local/openssl-3.2.6/lib:/usr/local/lib64:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling:/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver/:
TORCH_DEVICE_BACKEND_AUTOLOAD=1
PYTORCH_NVML_BASED_CUDA_CHECK=1
TORCHINDUCTOR_COMPILE_THREADS=1

The output of vllm-ascend python collect_env.py vllm-ascend commit link: 22e9188fa562a2d12f0c8514278a7a90b035c764

Collecting environment information...
PyTorch version: 2.7.1+cpu
Is debug build: False

OS: openEuler 24.03 (LTS-SP2) (aarch64)
GCC version: (GCC) 10.3.1
Clang version: Could not collect
CMake version: version 4.1.2
Libc version: glibc-2.38

Python version: 3.11.13 (main, Nov 2 2025, 08:49:25) [GCC 12.3.1 (openEuler 12.3.1-98.oe2403sp2)] (64-bit runtime)
Python platform: Linux-4.19.90-vhulk2211.3.0.h1912.eulerosv2r10.aarch64-aarch64-with-glibc2.38

CPU:
Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 192
On-line CPU(s) list: 0-191
Vendor ID: HiSilicon
BIOS Vendor ID: HiSilicon
Model name: Kunpeng-920
BIOS Model name: HUAWEI Kunpeng 920 5250 To be filled by O.E.M. CPU @ 2.6GHz
BIOS CPU family: 280
Model: 0
Thread(s) per core: 1
Core(s) per socket: 48
Socket(s): 4
Stepping: 0x1
BogoMIPS: 200.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm ssbs
L1d cache: 12 MiB (192 instances)
L1i cache: 12 MiB (192 instances)
L2 cache: 96 MiB (192 instances)
L3 cache: 192 MiB (8 instances)
NUMA node(s): 8
NUMA node0 CPU(s): 0-23
NUMA node1 CPU(s): 24-47
NUMA node2 CPU(s): 48-71
NUMA node3 CPU(s): 72-95
NUMA node4 CPU(s): 96-119
NUMA node5 CPU(s): 120-143
NUMA node6 CPU(s): 144-167
NUMA node7 CPU(s): 168-191
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected

Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] pyzmq==27.1.0
[pip3] torch==2.7.1
[pip3] torch_npu==2.7.1
[pip3] torchaudio==2.8.0
[pip3] torchvision==0.22.1
[pip3] transformers==4.57.1
[conda] Could not collect
vLLM Version: 0.11.0
vLLM Ascend Version: 0.11.0

ENV Variables:
ATB_OPSRUNNER_KERNEL_CACHE_LOCAL_COUNT=1
ATB_STREAM_SYNC_EVERY_RUNNER_ENABLE=0
ATB_OPSRUNNER_SETUP_CACHE_ENABLE=1
ATB_WORKSPACE_MEM_ALLOC_GLOBAL=1
ATB_DEVICE_TILING_BUFFER_BLOCK_NUM=32
ATB_STREAM_SYNC_EVERY_KERNEL_ENABLE=0
ATB_OPSRUNNER_KERNEL_CACHE_GLOABL_COUNT=5
ATB_HOME_PATH=/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1
ASCEND_TOOLKIT_HOME=/usr/local/Ascend/ascend-toolkit/latest
ATB_COMPARE_TILING_EVERY_KERNEL=0
ASCEND_OPP_PATH=/usr/local/Ascend/ascend-toolkit/latest/opp
LD_LIBRARY_PATH=/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64:/usr/local/openssl-3.2.6/lib:/usr/local/lib64:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling:/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver/:
ASCEND_AICPU_PATH=/usr/local/Ascend/ascend-toolkit/latest
ATB_STREAM_SYNC_EVERY_OPERATION_ENABLE=0
ASCEND_HOME_PATH=/usr/local/Ascend/ascend-toolkit/latest
ATB_MATMUL_SHUFFLE_K_ENABLE=1
ATB_WORKSPACE_MEM_ALLOC_ALG_TYPE=1
ATB_HOST_TILING_BUFFER_BLOCK_NUM=128
ATB_SHARE_MEMORY_NAME_SUFFIX=
TORCH_DEVICE_BACKEND_AUTOLOAD=1
PYTORCH_NVML_BASED_CUDA_CHECK=1
TORCHINDUCTOR_COMPILE_THREADS=1

NPU:
+------------------------------------------------------------------------------------------------+
| npu-smi 24.1.rc2 Version: 24.1.rc2 |
+---------------------------+---------------+----------------------------------------------------+

CANN:
package_name=Ascend-cann-toolkit
version=8.3.RC1
innerversion=V100R001C23SPC001B235
compatible_version=[V100R001C15],[V100R001C18],[V100R001C19],[V100R001C20],[V100R001C21],[V100R001C23]
arch=aarch64
os=linux
path=/usr/local/Ascend/ascend-toolkit/8.3.RC1/aarch64-linux

The output of llm-service commit link: 7ec9efbb2ab214672710d7b1ec075ca271a6d2cd

🐛 Describe the bug

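Steps to reproduce the error: without the Redis discovery service, using the Qwen2.5-VL-7B-Instruct model over IPv6, launch a 1proxy1e1p1d deployment with the proxy and workers on a single machine and prefix caching enabled. The instances start successfully. While sending requests continuously, first scale out the P instance (scale-out succeeds and requests are routed to the added instance), then scale in the P instance (scale-in succeeds). Next, scale out the D instance (scale-out succeeds and requests are routed to the added instance), then scale in the D instance. After this scale-in, requests fail and the P instance becomes abnormal. The error reported is AssertionError: Encoder cache miss for xxx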
Error output:
[ENCODE_0] : I20251205 22:00:53.362879 281408496660896 client.cpp:1172] Successfully revoked failed put for key ad8c7f9be928df406a871420df753a2fa826135d6b0c4c92b7c9255052a92c9d
[ENCODE_0] : E20251205 22:00:53.362895 281408496660896 client.cpp:1210] Operation for key ad8c7f9be928df406a871420df753a2fa826135d6b0c4c92b7c9255052a92c9d failed: TRANSFER_FAIL (TRANSFER_FAIL: Transfer 0 failed; )
[ENCODE_0] : E20251205 22:00:53.363321 281408102461856 tcp_transport.cpp:487] TcpTransport::startTransfer encountered an ASIO exception. Slice details - source_addr: 0x12c1c7390000, length: 28672, opcode: 1, target_id: 3. Exception: connect: Connection refused
[ENCODE_0] : E20251205 22:00:53.363348 281408102461856 transfer_task.cpp:247] Transfer failed for batch 281459541691744 task 0 with status 6
[ENCODE_0] : E20251205 22:00:53.363357 281408102461856 client.cpp:1071] Transfer failed for key 4ec1a571bd57bb7f286fb718d55278fa369e7a4c2a3ae07b7d65b2d510988b91: TRANSFER_FAIL (Transfer 0 failed)
[ENCODE_0] : I20251205 22:00:53.363423 281408102461856 client.cpp:1172] Successfully revoked failed put for key 4ec1a571bd57bb7f286fb718d55278fa369e7a4c2a3ae07b7d65b2d510988b91
[ENCODE_0] : E20251205 22:00:53.363435 281408102461856 client.cpp:1210] Operation for key 4ec1a571bd57bb7f286fb718d55278fa369e7a4c2a3ae07b7d65b2d510988b91 failed: TRANSFER_FAIL (TRANSFER_FAIL: Transfer 0 failed; )
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] Generation failed for request da364d0d-2673-47ec-9893-f4cefb677418
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] Traceback (most recent call last):
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] File "/vllm-workspace/llm-service/lm_service/workers/vllm/disagg_worker.py", line 445, in _generate
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] async for request_output in generator:
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] File "/vllm-workspace/vllm/vllm/v1/engine/async_llm.py", line 370, in generate
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] q = await self.add_request(
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] ^^^^^^^^^^^^^^^^^^^^^^^
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] File "/vllm-workspace/vllm/vllm/v1/engine/async_llm.py", line 276, in add_request
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] raise EngineDeadError()
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.
[PROXY] : ERROR: Exception in ASGI application
[PROXY] : + Exception Group Traceback (most recent call last):
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_utils.py", line 79, in collapse_excgroups
[PROXY] : | yield
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 270, in __call__
[PROXY] : | async with anyio.create_task_group() as task_group:
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 781, in __aexit__
[PROXY] : | raise BaseExceptionGroup(
[PROXY] : | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
[PROXY] : +-+---------------- 1 ----------------
[PROXY] : | Traceback (most recent call last):
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
[PROXY] : | result = await app( # type: ignore[func-returns-value]
[PROXY] : | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
[PROXY] : | return await self.app(scope, receive, send)
[PROXY] : | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/applications.py", line 1134, in __call__
[PROXY] : | await super().__call__(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/applications.py", line 113, in __call__
[PROXY] : | await self.middleware_stack(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
[PROXY] : | raise exc
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
[PROXY] : | await self.app(scope, receive, _send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
[PROXY] : | await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
[PROXY] : | raise exc
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
[PROXY] : | await app(scope, receive, sender)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
[PROXY] : | await self.app(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 716, in __call__
[PROXY] : | await self.middleware_stack(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 736, in app
[PROXY] : | await route.handle(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 290, in handle
[PROXY] : | await self.app(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/routing.py", line 125, in app
[PROXY] : | await wrap_app_handling_exceptions(app, request)(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
[PROXY] : | raise exc
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
[PROXY] : | await app(scope, receive, sender)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/routing.py", line 112, in app
[PROXY] : | await response(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 269, in __call__
[PROXY] : | with collapse_excgroups():
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/contextlib.py", line 158, in __exit__
[PROXY] : | self.gen.throw(typ, value, traceback)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
[PROXY] : | raise exc
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 273, in wrap
[PROXY] : | await func()
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 253, in stream_response
[PROXY] : | async for chunk in self.body_iterator:
[PROXY] : | File "/workspace/.../test-epd/tests/e2e/epd/../../../tools/api_server.py", line 110, in stream_generator
[PROXY] : | async for output in app.state.proxy.generate(
[PROXY] : | File "/vllm-workspace/llm-service/lm_service/apis/vllm/proxy.py", line 632, in generate
[PROXY] : | await self._run_prefill(request, q)
[PROXY] : | File "/vllm-workspace/llm-service/lm_service/apis/vllm/proxy.py", line 476, in _run_prefill
[PROXY] : | raise response
[PROXY] : | RuntimeError: Request error: EngineCore encountered an issue. See stack trace (above) for the root cause.
[PROXY] : +------------------------------------
[PROXY] :
[PROXY] : During handling of the above exception, another exception occurred:
[PROXY] :
[PROXY] : Traceback (most recent call last):
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
[PROXY] : result = await app( # type: ignore[func-returns-value]
[PROXY] : ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
[PROXY] : return await self.app(scope, receive, send)
[PROXY] : ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/applications.py", line 1134, in __call__
[PROXY] : await super().__call__(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/applications.py", line 113, in __call__
[PROXY] : await self.middleware_stack(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
[PROXY] : raise exc
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
[PROXY] : await self.app(scope, receive, _send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
[PROXY] : await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
[PROXY] : raise exc
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
[PROXY] : await app(scope, receive, sender)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
[PROXY] : await self.app(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 716, in __call__
[PROXY] : await self.middleware_stack(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 736, in app
[PROXY] : await route.handle(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 290, in handle
[PROXY] : await self.app(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/routing.py", line 125, in app
[PROXY] : await wrap_app_handling_exceptions(app, request)(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
[PROXY] : raise exc
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
[PROXY] : await app(scope, receive, sender)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/routing.py", line 112, in app
[PROXY] : await response(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 269, in __call__
[PROXY] : with collapse_excgroups():
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/contextlib.py", line 158, in __exit__
[PROXY] : self.gen.throw(typ, value, traceback)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
[PROXY] : raise exc
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 273, in wrap
[PROXY] : await func()
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 253, in stream_response
[PROXY] : async for chunk in self.body_iterator:
[PROXY] : File "/workspace/.../test-epd/tests/e2e/epd/../../../tools/api_server.py", line 110, in stream_generator
[PROXY] : async for output in app.state.proxy.generate(
[PROXY] : File "/vllm-workspace/llm-service/lm_service/apis/vllm/proxy.py", line 632, in generate
[PROXY] : await self._run_prefill(request, q)
[PROXY] : File "/vllm-workspace/llm-service/lm_service/apis/vllm/proxy.py", line 476, in _run_prefill
[PROXY] : raise response
[PROXY] : RuntimeError: Request error: EngineCore encountered an issue. See stack trace (above) for the root cause.
[MOONCAKE] : I20251205 22:00:53.377387 281473156277408 master_service.cpp:946] client_id=7732672241904003391-3380990598042666136, action=client_expired
[MOONCAKE] : I20251205 22:00:53.377443 281473156277408 master_service.cpp:946] client_id=16303821524489573273-14217180225676256912, action=client_expired
[MOONCAKE] : I20251205 22:00:53.377451 281473156277408 master_service.cpp:946] client_id=1315131055065746170-10921226819496639379, action=client_expired
[MOONCAKE] : I20251205 22:00:53.383519 281473156277408 master_service.cpp:1008] client_id=1315131055065746170-10921226819496639379, segment_name=::1:12994, action=unmount_expired_segment
[PROXY] : INFO: 127.0.0.1:45464 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] Generation failed for request 61bc98d4-415b-4ca9-9b86-f05d6d72b66c
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] Traceback (most recent call last):
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] File "/vllm-workspace/llm-service/lm_service/workers/vllm/disagg_worker.py", line 445, in _generate
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] async for request_output in generator:
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] File "/vllm-workspace/vllm/vllm/v1/engine/async_llm.py", line 370, in generate
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] q = await self.add_request(
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] ^^^^^^^^^^^^^^^^^^^^^^^
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] File "/vllm-workspace/vllm/vllm/v1/engine/async_llm.py", line 276, in add_request
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] raise EngineDeadError()
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.
[PROXY] : ERROR: Exception in ASGI application
[PROXY] : + Exception Group Traceback (most recent call last):
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_utils.py", line 79, in collapse_excgroups
[PROXY] : | yield
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 270, in __call__
[PROXY] : | async with anyio.create_task_group() as task_group:
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 781, in __aexit__
[PROXY] : | raise BaseExceptionGroup(
[PROXY] : | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
[PROXY] : +-+---------------- 1 ----------------
[PROXY] : | Traceback (most recent call last):
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
[PROXY] : | result = await app( # type: ignore[func-returns-value]
[PROXY] : | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
[PROXY] : | return await self.app(scope, receive, send)
[PROXY] : | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/applications.py", line 1134, in __call__
[PROXY] : | await super().__call__(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/applications.py", line 113, in __call__
[PROXY] : | await self.middleware_stack(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
[PROXY] : | raise exc
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
[PROXY] : | await self.app(scope, receive, _send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
[PROXY] : | await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
[PROXY] : | raise exc
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
[PROXY] : | await app(scope, receive, sender)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
[PROXY] : | await self.app(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 716, in __call__
[PROXY] : | await self.middleware_stack(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 736, in app
[PROXY] : | await route.handle(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 290, in handle
[PROXY] : | await self.app(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/routing.py", line 125, in app
[PROXY] : | await wrap_app_handling_exceptions(app, request)(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
[PROXY] : | raise exc
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
[PROXY] : | await app(scope, receive, sender)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/routing.py", line 112, in app
[PROXY] : | await response(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 269, in __call__
[PROXY] : | with collapse_excgroups():
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/contextlib.py", line 158, in __exit__
[PROXY] : | self.gen.throw(typ, value, traceback)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
[PROXY] : | raise exc
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 273, in wrap
[PROXY] : | await func()
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 253, in stream_response
[PROXY] : | async for chunk in self.body_iterator:
[PROXY] : | File "/workspace/.../test-epd/tests/e2e/epd/../../../tools/api_server.py", line 110, in stream_generator
[PROXY] : | async for output in app.state.proxy.generate(
[PROXY] : | File "/vllm-workspace/llm-service/lm_service/apis/vllm/proxy.py", line 632, in generate
[PROXY] : | await self._run_prefill(request, q)
[PROXY] : | File "/vllm-workspace/llm-service/lm_service/apis/vllm/proxy.py", line 476, in _run_prefill
[PROXY] : | raise response
[PROXY] : | RuntimeError: Request error: EngineCore encountered an issue. See stack trace (above) for the root cause.

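For triage, the error output above interleaves records from several components (ENCODE_0, P_0, PROXY, MOONCAKE), so it can help to pull out the first error each component logged. The following is a minimal sketch, not part of llm-service or vLLM: the helper names `split_records` and `first_errors` are hypothetical, and the prefix patterns (`[TAG] : `, glog-style `E<date>` records, `ERROR` records) are assumptions read off this excerpt only.

```python
import re

# Component tag as it appears in the excerpt, e.g. "[ENCODE_0] : " or "[PROXY] : ".
# Assumption: tags are upper-case letters with an optional "_<index>" suffix.
TAG = re.compile(r"\[([A-Z]+(?:_\d+)?)\] : ")

# Assumption: an error record starts with a glog-style "E<8 digits> " stamp
# or with the word "ERROR" (Python logging and uvicorn in this excerpt).
ERROR_START = re.compile(r"^(?:E\d{8} |ERROR\b)")


def split_records(log_text):
    """Return (component, message) pairs, with or without line breaks in the log."""
    parts = TAG.split(log_text)
    # With one capture group, re.split yields [prefix, tag1, msg1, tag2, msg2, ...].
    return [(parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)]


def first_errors(log_text):
    """Map each component to the first error-looking record it logged."""
    firsts = {}
    for component, message in split_records(log_text):
        if component not in firsts and ERROR_START.match(message):
            firsts[component] = message
    return firsts


# Four records copied from the error output in this report.
sample = "\n".join([
    "[ENCODE_0] : E20251205 22:00:53.363348 281408102461856 transfer_task.cpp:247] "
    "Transfer failed for batch 281459541691744 task 0 with status 6",
    "[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] "
    "Generation failed for request da364d0d-2673-47ec-9893-f4cefb677418",
    "[PROXY] : ERROR: Exception in ASGI application",
    "[MOONCAKE] : I20251205 22:00:53.377451 281473156277408 master_service.cpp:946] "
    "client_id=1315131055065746170-10921226819496639379, action=client_expired",
])

if __name__ == "__main__":
    for component, message in first_errors(sample).items():
        print(f"{component}: {message}")
```

Because the tag pattern is matched anywhere in the text rather than at line starts, the same helpers should also work on a log whose line breaks were lost when it was pasted. Informational records, such as the MOONCAKE `client_expired` lines, are skipped by design, so they still need to be read separately when correlating timestamps.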
Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
