Your current environment
The output of vllm python collect_env.py
vllm commit link: 615fb1b
Collecting environment information...
==============================
OS : openEuler 24.03 (LTS-SP2) (aarch64)
GCC version : (GCC) 10.3.1
Clang version : Could not collect
CMake version : version 4.1.2
Libc version : glibc-2.38
==============================
PyTorch Info
PyTorch version : 2.7.1+cpu
Is debug build : False
CUDA used to build PyTorch : None
ROCM used to build PyTorch : N/A
==============================
Python Environment
Python version : 3.11.13 (main, Nov 2 2025, 08:49:25) [GCC 12.3.1 (openEuler 12.3.1-98.oe2403sp2)] (64-bit runtime)
Python platform : Linux-4.19.90-vhulk2211.3.0.h1912.eulerosv2r10.aarch64-aarch64-with-glibc2.38
==============================
CUDA / GPU Info
Is CUDA available : False
CUDA runtime version : No CUDA
CUDA_MODULE_LOADING set to : N/A
GPU models and configuration : No CUDA
Nvidia driver version : No CUDA
cuDNN version : No CUDA
HIP runtime version : N/A
MIOpen runtime version : N/A
Is XNNPACK available : True
==============================
CPU Info
Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 192
On-line CPU(s) list: 0-191
Vendor ID: HiSilicon
BIOS Vendor ID: HiSilicon
Model name: Kunpeng-920
BIOS Model name: HUAWEI Kunpeng 920 5250 To be filled by O.E.M. CPU @ 2.6GHz
BIOS CPU family: 280
Model: 0
Thread(s) per core: 1
Core(s) per socket: 48
Socket(s): 4
Stepping: 0x1
BogoMIPS: 200.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm ssbs
L1d cache: 12 MiB (192 instances)
L1i cache: 12 MiB (192 instances)
L2 cache: 96 MiB (192 instances)
L3 cache: 192 MiB (8 instances)
NUMA node(s): 8
NUMA node0 CPU(s): 0-23
NUMA node1 CPU(s): 24-47
NUMA node2 CPU(s): 48-71
NUMA node3 CPU(s): 72-95
NUMA node4 CPU(s): 96-119
NUMA node5 CPU(s): 120-143
NUMA node6 CPU(s): 144-167
NUMA node7 CPU(s): 168-191
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
==============================
Versions of relevant libraries
[pip3] numpy==1.26.4
[pip3] pyzmq==27.1.0
[pip3] torch==2.7.1
[pip3] torch_npu==2.7.1
[pip3] torchaudio==2.8.0
[pip3] torchvision==0.22.1
[pip3] transformers==4.57.1
[conda] Could not collect
==============================
vLLM Info
ROCM Version : Could not collect
vLLM Version : 0.11.0
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled
GPU Topology:
Could not collect
==============================
Environment Variables
LD_LIBRARY_PATH=/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64:/usr/local/openssl-3.2.6/lib:/usr/local/lib64:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling:/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver/:
TORCH_DEVICE_BACKEND_AUTOLOAD=1
PYTORCH_NVML_BASED_CUDA_CHECK=1
TORCHINDUCTOR_COMPILE_THREADS=1
The output of vllm-ascend python collect_env.py
vllm-ascend commit link: 22e9188fa562a2d12f0c8514278a7a90b035c764
Collecting environment information...
PyTorch version: 2.7.1+cpu
Is debug build: False
OS: openEuler 24.03 (LTS-SP2) (aarch64)
GCC version: (GCC) 10.3.1
Clang version: Could not collect
CMake version: version 4.1.2
Libc version: glibc-2.38
Python version: 3.11.13 (main, Nov 2 2025, 08:49:25) [GCC 12.3.1 (openEuler 12.3.1-98.oe2403sp2)] (64-bit runtime)
Python platform: Linux-4.19.90-vhulk2211.3.0.h1912.eulerosv2r10.aarch64-aarch64-with-glibc2.38
CPU:
Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 192
On-line CPU(s) list: 0-191
Vendor ID: HiSilicon
BIOS Vendor ID: HiSilicon
Model name: Kunpeng-920
BIOS Model name: HUAWEI Kunpeng 920 5250 To be filled by O.E.M. CPU @ 2.6GHz
BIOS CPU family: 280
Model: 0
Thread(s) per core: 1
Core(s) per socket: 48
Socket(s): 4
Stepping: 0x1
BogoMIPS: 200.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm ssbs
L1d cache: 12 MiB (192 instances)
L1i cache: 12 MiB (192 instances)
L2 cache: 96 MiB (192 instances)
L3 cache: 192 MiB (8 instances)
NUMA node(s): 8
NUMA node0 CPU(s): 0-23
NUMA node1 CPU(s): 24-47
NUMA node2 CPU(s): 48-71
NUMA node3 CPU(s): 72-95
NUMA node4 CPU(s): 96-119
NUMA node5 CPU(s): 120-143
NUMA node6 CPU(s): 144-167
NUMA node7 CPU(s): 168-191
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] pyzmq==27.1.0
[pip3] torch==2.7.1
[pip3] torch_npu==2.7.1
[pip3] torchaudio==2.8.0
[pip3] torchvision==0.22.1
[pip3] transformers==4.57.1
[conda] Could not collect
vLLM Version: 0.11.0
vLLM Ascend Version: 0.11.0
ENV Variables:
ATB_OPSRUNNER_KERNEL_CACHE_LOCAL_COUNT=1
ATB_STREAM_SYNC_EVERY_RUNNER_ENABLE=0
ATB_OPSRUNNER_SETUP_CACHE_ENABLE=1
ATB_WORKSPACE_MEM_ALLOC_GLOBAL=1
ATB_DEVICE_TILING_BUFFER_BLOCK_NUM=32
ATB_STREAM_SYNC_EVERY_KERNEL_ENABLE=0
ATB_OPSRUNNER_KERNEL_CACHE_GLOABL_COUNT=5
ATB_HOME_PATH=/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1
ASCEND_TOOLKIT_HOME=/usr/local/Ascend/ascend-toolkit/latest
ATB_COMPARE_TILING_EVERY_KERNEL=0
ASCEND_OPP_PATH=/usr/local/Ascend/ascend-toolkit/latest/opp
LD_LIBRARY_PATH=/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64:/usr/local/openssl-3.2.6/lib:/usr/local/lib64:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling:/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver/:
ASCEND_AICPU_PATH=/usr/local/Ascend/ascend-toolkit/latest
ATB_STREAM_SYNC_EVERY_OPERATION_ENABLE=0
ASCEND_HOME_PATH=/usr/local/Ascend/ascend-toolkit/latest
ATB_MATMUL_SHUFFLE_K_ENABLE=1
ATB_WORKSPACE_MEM_ALLOC_ALG_TYPE=1
ATB_HOST_TILING_BUFFER_BLOCK_NUM=128
ATB_SHARE_MEMORY_NAME_SUFFIX=
TORCH_DEVICE_BACKEND_AUTOLOAD=1
PYTORCH_NVML_BASED_CUDA_CHECK=1
TORCHINDUCTOR_COMPILE_THREADS=1
NPU:
+------------------------------------------------------------------------------------------------+
| npu-smi 24.1.rc2 Version: 24.1.rc2 |
+---------------------------+---------------+----------------------------------------------------+
CANN:
package_name=Ascend-cann-toolkit
version=8.3.RC1
innerversion=V100R001C23SPC001B235
compatible_version=[V100R001C15],[V100R001C18],[V100R001C19],[V100R001C20],[V100R001C21],[V100R001C23]
arch=aarch64
os=linux
path=/usr/local/Ascend/ascend-toolkit/8.3.RC1/aarch64-linux
The output of llm-service
commit link: 7ec9efbb2ab214672710d7b1ec075ca271a6d2cd
🐛 Describe the bug
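Steps to reproduce the error:
Without the Redis discovery service, using the Qwen2.5-VL-7B-Instruct model over IPv6, launch a 1proxy1e1p1d deployment (1 proxy, 1 E, 1 P, 1 D instance) with the proxy and workers on a single machine and prefix caching enabled. The instances start successfully. While sending requests continuously, first scale out the P instance: scale-out succeeds and requests are routed to the new instance. Then scale in the P instance: scale-in succeeds. Next, scale out the D instance: scale-out succeeds and requests are routed to the new instance. Finally, scale in the D instance: after this scale-in, requests fail and the P instance becomes abnormal. The reported error is AssertionError: Encoder cache miss for xxx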
Error output:
[ENCODE_0] : I20251205 22:00:53.362879 281408496660896 client.cpp:1172] Successfully revoked failed put for key ad8c7f9be928df406a871420df753a2fa826135d6b0c4c92b7c9255052a92c9d
[ENCODE_0] : E20251205 22:00:53.362895 281408496660896 client.cpp:1210] Operation for key ad8c7f9be928df406a871420df753a2fa826135d6b0c4c92b7c9255052a92c9d failed: TRANSFER_FAIL (TRANSFER_FAIL: Transfer 0 failed; )
[ENCODE_0] : E20251205 22:00:53.363321 281408102461856 tcp_transport.cpp:487] TcpTransport::startTransfer encountered an ASIO exception. Slice details - source_addr: 0x12c1c7390000, length: 28672, opcode: 1, target_id: 3. Exception: connect: Connection refused
[ENCODE_0] : E20251205 22:00:53.363348 281408102461856 transfer_task.cpp:247] Transfer failed for batch 281459541691744 task 0 with status 6
[ENCODE_0] : E20251205 22:00:53.363357 281408102461856 client.cpp:1071] Transfer failed for key 4ec1a571bd57bb7f286fb718d55278fa369e7a4c2a3ae07b7d65b2d510988b91: TRANSFER_FAIL (Transfer 0 failed)
[ENCODE_0] : I20251205 22:00:53.363423 281408102461856 client.cpp:1172] Successfully revoked failed put for key 4ec1a571bd57bb7f286fb718d55278fa369e7a4c2a3ae07b7d65b2d510988b91
[ENCODE_0] : E20251205 22:00:53.363435 281408102461856 client.cpp:1210] Operation for key 4ec1a571bd57bb7f286fb718d55278fa369e7a4c2a3ae07b7d65b2d510988b91 failed: TRANSFER_FAIL (TRANSFER_FAIL: Transfer 0 failed; )
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] Generation failed for request da364d0d-2673-47ec-9893-f4cefb677418
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] Traceback (most recent call last):
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] File "/vllm-workspace/llm-service/lm_service/workers/vllm/disagg_worker.py", line 445, in _generate
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] async for request_output in generator:
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] File "/vllm-workspace/vllm/vllm/v1/engine/async_llm.py", line 370, in generate
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] q = await self.add_request(
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] ^^^^^^^^^^^^^^^^^^^^^^^
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] File "/vllm-workspace/vllm/vllm/v1/engine/async_llm.py", line 276, in add_request
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] raise EngineDeadError()
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.
[PROXY] : ERROR: Exception in ASGI application
[PROXY] : + Exception Group Traceback (most recent call last):
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_utils.py", line 79, in collapse_excgroups
[PROXY] : | yield
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 270, in __call__
[PROXY] : | async with anyio.create_task_group() as task_group:
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 781, in __aexit__
[PROXY] : | raise BaseExceptionGroup(
[PROXY] : | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
[PROXY] : +-+---------------- 1 ----------------
[PROXY] : | Traceback (most recent call last):
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
[PROXY] : | result = await app( # type: ignore[func-returns-value]
[PROXY] : | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
[PROXY] : | return await self.app(scope, receive, send)
[PROXY] : | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/applications.py", line 1134, in __call__
[PROXY] : | await super().__call__(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/applications.py", line 113, in __call__
[PROXY] : | await self.middleware_stack(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
[PROXY] : | raise exc
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
[PROXY] : | await self.app(scope, receive, _send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
[PROXY] : | await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
[PROXY] : | raise exc
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
[PROXY] : | await app(scope, receive, sender)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
[PROXY] : | await self.app(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 716, in __call__
[PROXY] : | await self.middleware_stack(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 736, in app
[PROXY] : | await route.handle(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 290, in handle
[PROXY] : | await self.app(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/routing.py", line 125, in app
[PROXY] : | await wrap_app_handling_exceptions(app, request)(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
[PROXY] : | raise exc
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
[PROXY] : | await app(scope, receive, sender)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/routing.py", line 112, in app
[PROXY] : | await response(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 269, in __call__
[PROXY] : | with collapse_excgroups():
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/contextlib.py", line 158, in __exit__
[PROXY] : | self.gen.throw(typ, value, traceback)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
[PROXY] : | raise exc
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 273, in wrap
[PROXY] : | await func()
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 253, in stream_response
[PROXY] : | async for chunk in self.body_iterator:
[PROXY] : | File "/workspace/.../test-epd/tests/e2e/epd/../../../tools/api_server.py", line 110, in stream_generator
[PROXY] : | async for output in app.state.proxy.generate(
[PROXY] : | File "/vllm-workspace/llm-service/lm_service/apis/vllm/proxy.py", line 632, in generate
[PROXY] : | await self._run_prefill(request, q)
[PROXY] : | File "/vllm-workspace/llm-service/lm_service/apis/vllm/proxy.py", line 476, in _run_prefill
[PROXY] : | raise response
[PROXY] : | RuntimeError: Request error: EngineCore encountered an issue. See stack trace (above) for the root cause.
[PROXY] : +------------------------------------
[PROXY] :
[PROXY] : During handling of the above exception, another exception occurred:
[PROXY] :
[PROXY] : Traceback (most recent call last):
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
[PROXY] : result = await app( # type: ignore[func-returns-value]
[PROXY] : ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
[PROXY] : return await self.app(scope, receive, send)
[PROXY] : ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/applications.py", line 1134, in __call__
[PROXY] : await super().__call__(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/applications.py", line 113, in __call__
[PROXY] : await self.middleware_stack(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
[PROXY] : raise exc
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
[PROXY] : await self.app(scope, receive, _send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
[PROXY] : await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
[PROXY] : raise exc
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
[PROXY] : await app(scope, receive, sender)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
[PROXY] : await self.app(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 716, in __call__
[PROXY] : await self.middleware_stack(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 736, in app
[PROXY] : await route.handle(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 290, in handle
[PROXY] : await self.app(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/routing.py", line 125, in app
[PROXY] : await wrap_app_handling_exceptions(app, request)(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
[PROXY] : raise exc
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
[PROXY] : await app(scope, receive, sender)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/routing.py", line 112, in app
[PROXY] : await response(scope, receive, send)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 269, in __call__
[PROXY] : with collapse_excgroups():
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/contextlib.py", line 158, in __exit__
[PROXY] : self.gen.throw(typ, value, traceback)
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
[PROXY] : raise exc
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 273, in wrap
[PROXY] : await func()
[PROXY] : File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 253, in stream_response
[PROXY] : async for chunk in self.body_iterator:
[PROXY] : File "/workspace/.../test-epd/tests/e2e/epd/../../../tools/api_server.py", line 110, in stream_generator
[PROXY] : async for output in app.state.proxy.generate(
[PROXY] : File "/vllm-workspace/llm-service/lm_service/apis/vllm/proxy.py", line 632, in generate
[PROXY] : await self._run_prefill(request, q)
[PROXY] : File "/vllm-workspace/llm-service/lm_service/apis/vllm/proxy.py", line 476, in _run_prefill
[PROXY] : raise response
[PROXY] : RuntimeError: Request error: EngineCore encountered an issue. See stack trace (above) for the root cause.
[MOONCAKE] : I20251205 22:00:53.377387 281473156277408 master_service.cpp:946] client_id=7732672241904003391-3380990598042666136, action=client_expired
[MOONCAKE] : I20251205 22:00:53.377443 281473156277408 master_service.cpp:946] client_id=16303821524489573273-14217180225676256912, action=client_expired
[MOONCAKE] : I20251205 22:00:53.377451 281473156277408 master_service.cpp:946] client_id=1315131055065746170-10921226819496639379, action=client_expired
[MOONCAKE] : I20251205 22:00:53.383519 281473156277408 master_service.cpp:1008] client_id=1315131055065746170-10921226819496639379, segment_name=::1:12994, action=unmount_expired_segment
[PROXY] : INFO: 127.0.0.1:45464 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] Generation failed for request 61bc98d4-415b-4ca9-9b86-f05d6d72b66c
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] Traceback (most recent call last):
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] File "/vllm-workspace/llm-service/lm_service/workers/vllm/disagg_worker.py", line 445, in _generate
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] async for request_output in generator:
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] File "/vllm-workspace/vllm/vllm/v1/engine/async_llm.py", line 370, in generate
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] q = await self.add_request(
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] ^^^^^^^^^^^^^^^^^^^^^^^
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] File "/vllm-workspace/vllm/vllm/v1/engine/async_llm.py", line 276, in add_request
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] raise EngineDeadError()
[P_0] : ERROR 12-05 22:00:53 [disagg_worker.py:456] vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.
[PROXY] : ERROR: Exception in ASGI application
[PROXY] : + Exception Group Traceback (most recent call last):
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_utils.py", line 79, in collapse_excgroups
[PROXY] : | yield
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 270, in __call__
[PROXY] : | async with anyio.create_task_group() as task_group:
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 781, in __aexit__
[PROXY] : | raise BaseExceptionGroup(
[PROXY] : | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
[PROXY] : +-+---------------- 1 ----------------
[PROXY] : | Traceback (most recent call last):
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
[PROXY] : | result = await app( # type: ignore[func-returns-value]
[PROXY] : | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
[PROXY] : | return await self.app(scope, receive, send)
[PROXY] : | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/applications.py", line 1134, in __call__
[PROXY] : | await super().__call__(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/applications.py", line 113, in __call__
[PROXY] : | await self.middleware_stack(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
[PROXY] : | raise exc
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
[PROXY] : | await self.app(scope, receive, _send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
[PROXY] : | await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
[PROXY] : | raise exc
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
[PROXY] : | await app(scope, receive, sender)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
[PROXY] : | await self.app(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 716, in __call__
[PROXY] : | await self.middleware_stack(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 736, in app
[PROXY] : | await route.handle(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/routing.py", line 290, in handle
[PROXY] : | await self.app(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/routing.py", line 125, in app
[PROXY] : | await wrap_app_handling_exceptions(app, request)(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
[PROXY] : | raise exc
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
[PROXY] : | await app(scope, receive, sender)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/fastapi/routing.py", line 112, in app
[PROXY] : | await response(scope, receive, send)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 269, in __call__
[PROXY] : | with collapse_excgroups():
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/contextlib.py", line 158, in __exit__
[PROXY] : | self.gen.throw(typ, value, traceback)
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/_utils.py", line 85, in collapse_excgroups
[PROXY] : | raise exc
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 273, in wrap
[PROXY] : | await func()
[PROXY] : | File "/usr/local/python3.11.13/lib/python3.11/site-packages/starlette/responses.py", line 253, in stream_response
[PROXY] : | async for chunk in self.body_iterator:
[PROXY] : | File "/workspace/.../test-epd/tests/e2e/epd/../../../tools/api_server.py", line 110, in stream_generator
[PROXY] : | async for output in app.state.proxy.generate(
[PROXY] : | File "/vllm-workspace/llm-service/lm_service/apis/vllm/proxy.py", line 632, in generate
[PROXY] : | await self._run_prefill(request, q)
[PROXY] : | File "/vllm-workspace/llm-service/lm_service/apis/vllm/proxy.py", line 476, in _run_prefill
[PROXY] : | raise response
[PROXY] : | RuntimeError: Request error: EngineCore encountered an issue. See stack trace (above) for the root cause.
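Because the output above interleaves four processes (ENCODE_0, P_0, PROXY, MOONCAKE), a small triage script may help when comparing runs. The following is only a sketch: the function name `summarize` and the two regular expressions are assumptions derived from the line shapes visible in this log, not part of vllm, llm-service, or Mooncake.

```python
import re
from collections import Counter

# Component tag at the start of each line, e.g. "[ENCODE_0] :" (shape assumed from this log).
COMPONENT = re.compile(r"^\[([A-Z_0-9]+)\]\s*:")
# Client lines of the form "Operation for key <hex> failed: TRANSFER_FAIL ...".
KEY_FAIL = re.compile(r"Operation for key ([0-9a-f]+) failed: (\w+)")
# glog-style error lines start with "E<yyyymmdd> "; Python logging lines start with "ERROR".
ERROR_LINE = re.compile(r"\s*(E\d{8} |ERROR\b)")


def summarize(lines):
    """Return (failed_keys, error_counts), where failed_keys maps key -> status
    and error_counts counts error-level lines per component tag."""
    failed_keys = {}
    error_counts = Counter()
    for line in lines:
        tag = COMPONENT.match(line)
        if not tag:
            continue  # skip lines without a component prefix
        rest = line[tag.end():]
        hit = KEY_FAIL.search(rest)
        if hit:
            failed_keys[hit.group(1)] = hit.group(2)
        if ERROR_LINE.match(rest):
            error_counts[tag.group(1)] += 1
    return failed_keys, error_counts


if __name__ == "__main__":
    import sys

    # Usage: python triage.py < combined.log
    keys, counts = summarize(sys.stdin)
    for component, n in sorted(counts.items()):
        print(f"{component}: {n} error lines")
    for key, status in keys.items():
        print(f"{status}: {key}")
```

Run on the excerpt above, this would list the encoder keys whose put failed with TRANSFER_FAIL next to the per-component error counts, which makes it easier to check whether the encoder transfer failures precede the EngineDeadError on P_0.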
Before submitting a new issue...