Description
Your current environment
The output of vllm python collect_env.py
vllm commit link: 615fb1b
Collecting environment information...
System Info
==============================
OS : openEuler 24.03 (LTS-SP2) (aarch64)
GCC version : (GCC) 10.3.1
Clang version : Could not collect
CMake version : version 4.1.2
Libc version : glibc-2.38
==============================
PyTorch Info
PyTorch version : 2.7.1+cpu
Is debug build : False
CUDA used to build PyTorch : None
ROCM used to build PyTorch : N/A
==============================
Python Environment
Python version : 3.11.13 (main, Nov 2 2025, 08:49:25) [GCC 12.3.1 (openEuler 12.3.1-98.oe2403sp2)] (64-bit runtime)
Python platform : Linux-4.19.90-vhulk2211.3.0.h1912.eulerosv2r10.aarch64-aarch64-with-glibc2.38
==============================
CUDA / GPU Info
Is CUDA available : False
CUDA runtime version : No CUDA
CUDA_MODULE_LOADING set to : N/A
GPU models and configuration : No CUDA
Nvidia driver version : No CUDA
cuDNN version : No CUDA
HIP runtime version : N/A
MIOpen runtime version : N/A
Is XNNPACK available : True
==============================
CPU Info
Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 192
On-line CPU(s) list: 0-191
Vendor ID: HiSilicon
BIOS Vendor ID: HiSilicon
Model name: Kunpeng-920
BIOS Model name: HUAWEI Kunpeng 920 5250 To be filled by O.E.M. CPU @ 2.6GHz
BIOS CPU family: 280
Model: 0
Thread(s) per core: 1
Core(s) per socket: 48
Socket(s): 4
Stepping: 0x1
BogoMIPS: 200.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm ssbs
L1d cache: 12 MiB (192 instances)
L1i cache: 12 MiB (192 instances)
L2 cache: 96 MiB (192 instances)
L3 cache: 192 MiB (8 instances)
NUMA node(s): 8
NUMA node0 CPU(s): 0-23
NUMA node1 CPU(s): 24-47
NUMA node2 CPU(s): 48-71
NUMA node3 CPU(s): 72-95
NUMA node4 CPU(s): 96-119
NUMA node5 CPU(s): 120-143
NUMA node6 CPU(s): 144-167
NUMA node7 CPU(s): 168-191
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
==============================
Versions of relevant libraries
[pip3] numpy==1.26.4
[pip3] pyzmq==27.1.0
[pip3] torch==2.7.1
[pip3] torch_npu==2.7.1
[pip3] torchaudio==2.8.0
[pip3] torchvision==0.22.1
[pip3] transformers==4.57.1
[conda] Could not collect
==============================
vLLM Info
ROCM Version : Could not collect
vLLM Version : 0.11.0
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled
GPU Topology:
Could not collect
==============================
Environment Variables
LD_LIBRARY_PATH=/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64:/usr/local/openssl-3.2.6/lib:/usr/local/lib64:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling:/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver/:
TORCH_DEVICE_BACKEND_AUTOLOAD=1
PYTORCH_NVML_BASED_CUDA_CHECK=1
TORCHINDUCTOR_COMPILE_THREADS=1
The output of vllm-ascend python collect_env.py
vllm-ascend commit link: 22e9188fa562a2d12f0c8514278a7a90b035c764
Collecting environment information...
PyTorch version: 2.7.1+cpu
Is debug build: False
OS: openEuler 24.03 (LTS-SP2) (aarch64)
GCC version: (GCC) 10.3.1
Clang version: Could not collect
CMake version: version 4.1.2
Libc version: glibc-2.38
Python version: 3.11.13 (main, Nov 2 2025, 08:49:25) [GCC 12.3.1 (openEuler 12.3.1-98.oe2403sp2)] (64-bit runtime)
Python platform: Linux-4.19.90-vhulk2211.3.0.h1912.eulerosv2r10.aarch64-aarch64-with-glibc2.38
CPU:
Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 192
On-line CPU(s) list: 0-191
Vendor ID: HiSilicon
BIOS Vendor ID: HiSilicon
Model name: Kunpeng-920
BIOS Model name: HUAWEI Kunpeng 920 5250 To be filled by O.E.M. CPU @ 2.6GHz
BIOS CPU family: 280
Model: 0
Thread(s) per core: 1
Core(s) per socket: 48
Socket(s): 4
Stepping: 0x1
BogoMIPS: 200.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm ssbs
L1d cache: 12 MiB (192 instances)
L1i cache: 12 MiB (192 instances)
L2 cache: 96 MiB (192 instances)
L3 cache: 192 MiB (8 instances)
NUMA node(s): 8
NUMA node0 CPU(s): 0-23
NUMA node1 CPU(s): 24-47
NUMA node2 CPU(s): 48-71
NUMA node3 CPU(s): 72-95
NUMA node4 CPU(s): 96-119
NUMA node5 CPU(s): 120-143
NUMA node6 CPU(s): 144-167
NUMA node7 CPU(s): 168-191
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] pyzmq==27.1.0
[pip3] torch==2.7.1
[pip3] torch_npu==2.7.1
[pip3] torchaudio==2.8.0
[pip3] torchvision==0.22.1
[pip3] transformers==4.57.1
[conda] Could not collect
vLLM Version: 0.11.0
vLLM Ascend Version: 0.11.0
ENV Variables:
ATB_OPSRUNNER_KERNEL_CACHE_LOCAL_COUNT=1
ATB_STREAM_SYNC_EVERY_RUNNER_ENABLE=0
ATB_OPSRUNNER_SETUP_CACHE_ENABLE=1
ATB_WORKSPACE_MEM_ALLOC_GLOBAL=1
ATB_DEVICE_TILING_BUFFER_BLOCK_NUM=32
ATB_STREAM_SYNC_EVERY_KERNEL_ENABLE=0
ATB_OPSRUNNER_KERNEL_CACHE_GLOABL_COUNT=5
ATB_HOME_PATH=/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1
ASCEND_TOOLKIT_HOME=/usr/local/Ascend/ascend-toolkit/latest
ATB_COMPARE_TILING_EVERY_KERNEL=0
ASCEND_OPP_PATH=/usr/local/Ascend/ascend-toolkit/latest/opp
LD_LIBRARY_PATH=/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64:/usr/local/openssl-3.2.6/lib:/usr/local/lib64:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_1/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/lib:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/examples:/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0/tests/atbopstest:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling:/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver/:
ASCEND_AICPU_PATH=/usr/local/Ascend/ascend-toolkit/latest
ATB_STREAM_SYNC_EVERY_OPERATION_ENABLE=0
ASCEND_HOME_PATH=/usr/local/Ascend/ascend-toolkit/latest
ATB_MATMUL_SHUFFLE_K_ENABLE=1
ATB_WORKSPACE_MEM_ALLOC_ALG_TYPE=1
ATB_HOST_TILING_BUFFER_BLOCK_NUM=128
ATB_SHARE_MEMORY_NAME_SUFFIX=
TORCH_DEVICE_BACKEND_AUTOLOAD=1
PYTORCH_NVML_BASED_CUDA_CHECK=1
TORCHINDUCTOR_COMPILE_THREADS=1
NPU:
+------------------------------------------------------------------------------------------------+
| npu-smi 24.1.rc2 Version: 24.1.rc2 |
+---------------------------+---------------+----------------------------------------------------+
CANN:
package_name=Ascend-cann-toolkit
version=8.3.RC1
innerversion=V100R001C23SPC001B235
compatible_version=[V100R001C15],[V100R001C18],[V100R001C19],[V100R001C20],[V100R001C21],[V100R001C23]
arch=aarch64
os=linux
path=/usr/local/Ascend/ascend-toolkit/8.3.RC1/aarch64-linux
The output of llm-service
commit link: 7ec9efbb2ab214672710d7b1ec075ca271a6d2cd
🐛 Describe the bug
Steps to reproduce the error:
[Bug]: Without Redis-based service discovery, using the Qwen2.5-VL-7B-Instruct model over IPv6, launch a 1-proxy / 1-E / 1-P / 1-D deployment, with the proxy and workers on a single machine and prefix caching enabled. The instances start successfully. While continuously sending requests, first scale the P instance out and back in, then scale the D instance out and back in. After the D scale-in completes, the error AssertionError: Encoder cache miss for xxx is raised, and the original P instance dies abnormally.
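For reference, the sustained traffic during the scale-out/scale-in steps was sent to the proxy's OpenAI-compatible endpoint. The snippet below is only a minimal sketch of such a load loop, not the exact client used in the failing run; the IPv6 endpoint http://[::1]:8000/v1/chat/completions, the image URL, and the served model name are assumptions for illustration.

```python
# Minimal sketch of a steady multimodal request stream sent while scaling
# P/D instances. Endpoint, image URL, and model name are assumptions,
# not the exact values from the failing deployment.
import time
import requests

PROXY_URL = "http://[::1]:8000/v1/chat/completions"  # assumed IPv6 proxy endpoint

payload = {
    "model": "Qwen2.5-VL-7B-Instruct",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": "https://example.com/sample.jpg"}},  # placeholder image
            {"type": "text", "text": "Describe this image."},
        ],
    }],
    "max_tokens": 128,
}

while True:
    try:
        resp = requests.post(PROXY_URL, json=payload, timeout=60)
        print(resp.status_code)
    except requests.RequestException as exc:
        # Requests start failing once the P instance dies with the encoder cache miss.
        print(f"request failed: {exc}")
    time.sleep(0.5)
```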
Error output:
[P_0] : E20251209 21:51:15.872431 281473081813536 tcp_transport.cpp:528] TcpTransport::startTransfer encountered an ASIO exception. Slice details - source_addr: 0x12c200038000, length: 28672, opcode: 0, target_id: 6. Exception: connect: Connection refused
[P_0] : E20251209 21:51:15.872454 281473081813536 transfer_task.cpp:247] Transfer failed for batch 1802411712 task 0 with status 6
[P_0] : E20251209 21:51:15.872462 281473081813536 client.cpp:727] Transfer failed for key: bbaa754374649997a7448c6c9d5ffd1a07b6b28943f5ceb3d3d124ea7fe8dcc0 with error: -800
[P_0] : E20251209 21:51:15.872472 281473081813536 pybind_client.cpp:1143] BatchGet failed for key 'bbaa754374649997a7448c6c9d5ffd1a07b6b28943f5ceb3d3d124ea7fe8dcc0': TRANSFER_FAIL
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [mooncake_storage_connector.py:92] Load failed for 27cf26f42741411952861572f3bf9587034b364c335b239d504bcdc69a08c245
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [mooncake_storage_connector.py:92] Load failed for bbaa754374649997a7448c6c9d5ffd1a07b6b28943f5ceb3d3d124ea7fe8dcc0
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [mooncake_storage_connector.py:92] Load failed for c143c3ee8bf51b7cf6a5227c795f91fc959f1b532c9146b2141b10b748e8de43
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710] EngineCore encountered a fatal error.
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710] Traceback (most recent call last):
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 701, in run_engine_core
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]     engine_core.run_busy_loop()
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 728, in run_busy_loop
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]     self._process_engine_step()
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 754, in _process_engine_step
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]     outputs, model_executed = self.step_fn()
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 284, in step
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]     model_output = self.execute_model_with_error_logging(
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 270, in execute_model_with_error_logging
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]     raise err
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 261, in execute_model_with_error_logging
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]     return model_fn(scheduler_output)
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/executor/abstract.py", line 103, in execute_model
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]     output = self.collective_rpc("execute_model",
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]   File "/vllm-workspace/vllm/vllm/executor/uniproc_executor.py", line 83, in collective_rpc
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]     return [run_method(self.driver_worker, method, args, kwargs)]
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]   File "/vllm-workspace/vllm/vllm/utils/__init__.py", line 3122, in run_method
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]     return func(*args, **kwargs)
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]   File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker_v1.py", line 257, in execute_model
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]     output = self.model_runner.execute_model(scheduler_output,
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]   File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]     return func(*args, **kwargs)
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]   File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1960, in execute_model
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]     max_query_len) = (self._prepare_inputs(scheduler_output,
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]   File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1426, in _prepare_inputs
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]     mm_embeds = self._gather_mm_embeddings(scheduler_output)
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]   File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1126, in _gather_mm_embeddings
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710]     assert encoder_output is not None, \
[P_0] : (EngineCore_DP0 pid=524879) ERROR 12-09 21:51:15 [core.py:710] AssertionError: Encoder cache miss for 27cf26f42741411952861572f3bf9587034b364c335b239d504bcdc69a08c245.
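Based only on the traceback above, the process dies because _gather_mm_embeddings in vllm_ascend/worker/model_runner_v1.py asserts that every scheduled multimodal input already has its encoder output available in the cache. The sketch below is a paraphrase of that check for discussion purposes, not the actual vllm-ascend source; the function signature and variable names are simplified assumptions.

```python
# Paraphrased sketch of the failing check in _gather_mm_embeddings
# (vllm_ascend/worker/model_runner_v1.py); names simplified, not the real code.
def gather_mm_embeddings(scheduled_mm_hashes, encoder_cache):
    mm_embeds = []
    for mm_hash in scheduled_mm_hashes:
        encoder_output = encoder_cache.get(mm_hash)
        # The scheduler assumes the encoder output for this input was already
        # produced and stored; after the D scale-in it is missing, so this fires.
        assert encoder_output is not None, f"Encoder cache miss for {mm_hash}."
        mm_embeds.append(encoder_output)
    return mm_embeds
```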
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.