Status: Closed
Labels: Major Bug, bug (Something isn't working), high priority, new feature bug
Description
Your current environment
Environment collected with python collect_env.py, against the following commits:
- vllm: c1378b8
- vllm-ascend: f78db0894660f3e64afb29b204aeb204806ffe08
- llm-service: 5c37e8dbc71bfefd0c0fc2e00cca219221000e21
🐛 Describe the bug
Run the following command to reproduce the error:
ASCEND_RT_VISIBLE_DEVICES=0,1,2,3 python3 -m vllm.entrypoints.openai.api_server --model Qwen3-VL-30B-A3B-Instruct/ --gpu-memory-utilization 0.9 --port 13808 --enforce-eager --enable-request-id-headers --no-enable-prefix-caching --max-num-batched-tokens 18000 --max-num-seqs 128 --tensor-parallel-size 4 --max-model-len 18000
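The scheduler dump below shows the hang occurs while serving concurrent multimodal chat completions (temperature=0.0, seed=77, max_tokens=2048). As a minimal client sketch, assuming the OpenAI-compatible endpoint started above and a placeholder image URL, one such request looks like:

```bash
# Minimal sketch of one of the failing multimodal requests; the image URL is
# a placeholder, and the sampling values mirror the SamplingParams in the dump.
curl http://localhost:13808/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3-VL-30B-A3B-Instruct/",
    "temperature": 0.0,
    "seed": 77,
    "max_tokens": 2048,
    "messages": [{
      "role": "user",
      "content": [
        {"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}},
        {"type": "text", "text": "Describe this image."}
      ]
    }]
  }'
```

Note that the dump records 31 running requests, so the hang reproduces under sustained concurrent load rather than from a single call.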
Error output:
(EngineCore_DP0 pid=1532787) INFO 12-12 12:28:11 [shm_broadcast.py:466] No available shared memory broadcast block found in 60 seconds. This typically happens when some processes are hanging or doing some time-consuming work (e.g. compilation).
(EngineCore_DP0 pid=1532787) INFO 12-12 12:29:11 [shm_broadcast.py:466] No available shared memory broadcast block found in 60 seconds. This typically happens when some processes are hanging or doing some time-consuming work (e.g. compilation).
(EngineCore_DP0 pid=1532787) INFO 12-12 12:30:11 [shm_broadcast.py:466] No available shared memory broadcast block found in 60 seconds. This typically happens when some processes are hanging or doing some time-consuming work (e.g. compilation).
(EngineCore_DP0 pid=1532787) INFO 12-12 12:31:11 [shm_broadcast.py:466] No available shared memory broadcast block found in 60 seconds. This typically happens when some processes are hanging or doing some time-consuming work (e.g. compilation).
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:69] Dumping input data for V1 LLM engine (v0.11.0) with config: model='/data/models/Qwen3-VL-30B-A3B-Instruct/', speculative_config=None, tokenizer='/data/models/Qwen3-VL-30B-A3B-Instruct/', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=10000, download_dir=None, load_format=auto, tensor_parallel_size=4, pipeline_parallel_size=1, data_parallel_size=1, disable_custom_all_reduce=True, quantization=None, enforce_eager=True, kv_cache_dtype=auto, device_config=npu, structured_outputs_config=StructuredOutputsConfig(backend='auto', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_parser=''), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None), seed=0, served_model_name=/data/models/Qwen3-VL-30B-A3B-Instruct/, enable_prefix_caching=False, chunked_prefill_enabled=True, pooler_config=None, compilation_config={"level":0,"debug_dump_path":"","cache_dir":"","backend":"","custom_ops":["all"],"splitting_ops":null,"use_inductor":true,"compile_sizes":[],"inductor_compile_config":{"enable_auto_functionalized_v2":false},"inductor_passes":{},"cudagraph_mode":0,"use_cudagraph":true,"cudagraph_num_of_warmups":1,"cudagraph_capture_sizes":[],"cudagraph_copy_inputs":false,"full_cuda_graph":false,"use_inductor_graph_partition":false,"pass_config":{},"max_capture_size":0,"local_cache_dir":null},
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] Dumping scheduler output for model execution: SchedulerOutput(scheduled_new_reqs=[NewRequestData(req_id=chatcmpl-47adc6f12a004694a5457d72b73910fd,prompt_token_ids_len=826,mm_features=[MultiModalFeatureSpec(data={'image_grid_thw': MultiModalFieldElem(modality='image', key='image_grid_thw', data=tensor([ 1, 50, 64]), field=MultiModalBatchedField()), 'pixel_values': MultiModalFieldElem(modality='image', key='pixel_values', data=tensor([[0.9922, 0.9922, 0.9844, ..., 1.0000, 1.0000, 1.0000],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [1.0000, 1.0000, 1.0000, ..., 1.0000, 1.0000, 1.0000],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [1.0000, 1.0000, 1.0000, ..., 1.0000, 1.0000, 1.0000],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] ...,
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [1.0000, 1.0000, 1.0000, ..., 1.0000, 1.0000, 1.0000],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [1.0000, 1.0000, 1.0000, ..., 1.0000, 1.0000, 1.0000],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [1.0000, 1.0000, 1.0000, ..., 1.0000, 1.0000, 1.0000]],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] dtype=torch.bfloat16), field=MultiModalFlatField(slices=[[slice(0, 3200, None)]], dim=0))}, modality='image', identifier='9aa5cfebd71d6edf0bd31a6e929c7e67f1720101d2b4ac0c98bd154948865243', mm_position=PlaceholderRange(offset=4, length=800, is_embed=None))],sampling_params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.0, top_p=1.0, top_k=0, min_p=0.0, seed=77, stop=[], stop_token_ids=[151643], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=2048, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, structured_outputs=None, extra_args=None),block_ids=([698, 997, 998, 999, 1000, 893, 894],),num_computed_tokens=0,lora_request=None,prompt_embeds_shape=None), NewRequestData(req_id=chatcmpl-ce8258fc5ef34634a97b50bcefcf369f,prompt_token_ids_len=794,mm_features=[MultiModalFeatureSpec(data={'image_grid_thw': MultiModalFieldElem(modality='image', key='image_grid_thw', data=tensor([ 1, 48, 64]), field=MultiModalBatchedField()), 'pixel_values': MultiModalFieldElem(modality='image', key='pixel_values', data=tensor([[-0.0275, -0.0275, -0.0510, ..., -0.6172, -0.5938, -0.5625],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [-0.0669, -0.0354, -0.0275, ..., -0.6641, -0.6641, -0.6250],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [-0.0825, -0.0981, -0.0825, ..., -0.5938, -0.5859, -0.5859],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] ...,
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [-0.7500, -0.7344, -0.7344, ..., -0.9375, -0.9375, -0.9375],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [-0.7266, -0.7578, -0.7812, ..., -0.9375, -0.9141, -0.8828],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [-0.8047, -0.7891, -0.7266, ..., -0.9141, -1.0000, -1.0000]],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] dtype=torch.bfloat16), field=MultiModalFlatField(slices=[[slice(0, 3072, None)]], dim=0))}, modality='image', identifier='b7a2987df2467b785f90a460fe2bfd5510ec2d313a4aa69de2cc487d5d5c0b2f', mm_position=PlaceholderRange(offset=4, length=768, is_embed=None))],sampling_params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.0, top_p=1.0, top_k=0, min_p=0.0, seed=77, stop=[], stop_token_ids=[151643], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=2048, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, structured_outputs=None, extra_args=None),block_ids=([895, 896, 897, 898, 899, 1213, 1214],),num_computed_tokens=0,lora_request=None,prompt_embeds_shape=None), NewRequestData(req_id=chatcmpl-f7e2b6dc767c4340b7597b8b81e8b97c,prompt_token_ids_len=702,mm_features=[MultiModalFeatureSpec(data={'image_grid_thw': MultiModalFieldElem(modality='image', key='image_grid_thw', data=tensor([ 1, 42, 64]), field=MultiModalBatchedField()), 'pixel_values': MultiModalFieldElem(modality='image', key='pixel_values', data=tensor([[ 0.1611, 0.1611, 0.1611, ..., -0.1533, -0.1533, -0.1533],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [ 0.1846, 0.1846, 0.1846, ..., -0.1299, -0.1216, -0.1216],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [ 0.1846, 0.1924, 0.1924, ..., -0.1377, -0.1299, -0.1216],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] ...,
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [ 0.0039, -0.0039, 0.0039, ..., -0.2871, -0.2637, -0.2871],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [ 0.0275, 0.0275, 0.0275, ..., -0.2793, -0.2871, -0.2871],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [-0.0039, 0.0118, 0.0197, ..., -0.2871, -0.2949, -0.3105]],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] dtype=torch.bfloat16), field=MultiModalFlatField(slices=[[slice(0, 2688, None)]], dim=0))}, modality='image', identifier='1d41b91d62557f18556549c9fc67c39935645f1cec7c277dcfa194bd13eff451', mm_position=PlaceholderRange(offset=4, length=672, is_embed=None))],sampling_params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.0, top_p=1.0, top_k=0, min_p=0.0, seed=77, stop=[], stop_token_ids=[151643], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=2048, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, structured_outputs=None, extra_args=None),block_ids=([1215, 1216, 1198, 1199, 1228, 1229],),num_computed_tokens=0,lora_request=None,prompt_embeds_shape=None), NewRequestData(req_id=chatcmpl-7cbc224c6b424eae9e884e3e93ffdb67,prompt_token_ids_len=793,mm_features=[MultiModalFeatureSpec(data={'image_grid_thw': MultiModalFieldElem(modality='image', key='image_grid_thw', data=tensor([ 1, 48, 64]), field=MultiModalBatchedField()), 'pixel_values': MultiModalFieldElem(modality='image', key='pixel_values', data=tensor([[-0.0275, -0.0275, -0.0510, ..., -0.6172, -0.5938, -0.5625],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [-0.0669, -0.0354, -0.0275, ..., -0.6641, -0.6641, -0.6250],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [-0.0825, -0.0981, -0.0825, ..., -0.5938, -0.5859, -0.5859],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] ...,
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [-0.7500, -0.7344, -0.7344, ..., -0.9375, -0.9375, -0.9375],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [-0.7266, -0.7578, -0.7812, ..., -0.9375, -0.9141, -0.8828],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [-0.8047, -0.7891, -0.7266, ..., -0.9141, -1.0000, -1.0000]],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] dtype=torch.bfloat16), field=MultiModalFlatField(slices=[[slice(0, 3072, None)]], dim=0))}, modality='image', identifier='b7a2987df2467b785f90a460fe2bfd5510ec2d313a4aa69de2cc487d5d5c0b2f', mm_position=PlaceholderRange(offset=4, length=768, is_embed=None))],sampling_params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.0, top_p=1.0, top_k=0, min_p=0.0, seed=77, stop=[], stop_token_ids=[151643], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=2048, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, structured_outputs=None, extra_args=None),block_ids=([1230, 1231, 1211, 1212, 1256, 1257, 1258],),num_computed_tokens=0,lora_request=None,prompt_embeds_shape=None), NewRequestData(req_id=chatcmpl-c00326c2f3504405bde8fb442cfebfdd,prompt_token_ids_len=796,mm_features=[MultiModalFeatureSpec(data={'image_grid_thw': MultiModalFieldElem(modality='image', key='image_grid_thw', data=tensor([ 1, 48, 64]), field=MultiModalBatchedField()), 'pixel_values': MultiModalFieldElem(modality='image', key='pixel_values', data=tensor([[-0.0275, 0.0039, -0.0118, ..., 0.2002, 0.1689, 0.1768],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [ 0.3105, 0.3105, 0.2949, ..., 0.1060, 0.0825, 0.0669],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [-0.0118, 0.0197, 0.0275, ..., 0.1924, 0.2002, 0.2002],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] ...,
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [ 0.3965, 0.4043, 0.4355, ..., -0.1377, -0.1455, -0.1455],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [ 0.3965, 0.3887, 0.3887, ..., -0.1689, -0.1846, -0.1689],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [ 0.3887, 0.4121, 0.3965, ..., -0.2637, -0.3652, -0.4199]],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] dtype=torch.bfloat16), field=MultiModalFlatField(slices=[[slice(0, 3072, None)]], dim=0))}, modality='image', identifier='90d4b40e1aaccc35741d4adbc632bfe125d33bb8b65acdfdbd94b30b77683768', mm_position=PlaceholderRange(offset=4, length=768, is_embed=None))],sampling_params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.0, top_p=1.0, top_k=0, min_p=0.0, seed=77, stop=[], stop_token_ids=[151643], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=2048, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, structured_outputs=None, extra_args=None),block_ids=([1259, 1260, 1261, 1318, 1319, 1293, 1294],),num_computed_tokens=0,lora_request=None,prompt_embeds_shape=None), NewRequestData(req_id=chatcmpl-47920a3a434246af9e1cb483cdb94f8b,prompt_token_ids_len=798,mm_features=[MultiModalFeatureSpec(data={'image_grid_thw': MultiModalFieldElem(modality='image', key='image_grid_thw', data=tensor([ 1, 48, 64]), field=MultiModalBatchedField()), 'pixel_values': MultiModalFieldElem(modality='image', key='pixel_values', data=tensor([[-0.0275, 0.0039, -0.0118, ..., 0.2002, 0.1689, 0.1768],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [ 0.3105, 0.3105, 0.2949, ..., 0.1060, 0.0825, 0.0669],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [-0.0118, 0.0197, 0.0275, ..., 0.1924, 0.2002, 0.2002],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] ...,
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [ 0.3965, 0.4043, 0.4355, ..., -0.1377, -0.1455, -0.1455],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [ 0.3965, 0.3887, 0.3887, ..., -0.1689, -0.1846, -0.1689],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] [ 0.3887, 0.4121, 0.3965, ..., -0.2637, -0.3652, -0.4199]],
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:76] dtype=torch.bfloat16), field=MultiModalFlatField(slices=[[slice(0, 3072, None)]], dim=0))}, modality='image', identifier='90d4b40e1aaccc35741d4adbc632bfe125d33bb8b65acdfdbd94b30b77683768', mm_position=PlaceholderRange(offset=4, length=768, is_embed=None))],sampling_params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.0, top_p=1.0, top_k=0, min_p=0.0, seed=77, stop=[], stop_token_ids=[151643], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=2048, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, structured_outputs=None, extra_args=None),block_ids=([1295, 1296, 1322, 1323, 1324, 1314, 1315],),num_computed_tokens=0,lora_request=None,prompt_embeds_shape=None)], scheduled_cached_reqs=CachedRequestData(req_ids=['chatcmpl-5ce7be5891984269afcb9e69951c03a2', 'chatcmpl-38c0f3b0324e41fa80445e94ea376ad8', 'chatcmpl-bb7abdd8cd9646c8b19061db190d478e', 'chatcmpl-5e48126b1bff4cbf9b32f95235878c00', 'chatcmpl-63e6f87bdb4d46cdb195647acaaf31af', 'chatcmpl-8a2a96bbe96f4a03be158c97a591ae78', 'chatcmpl-98ddce561aff4485a62193d238d4bdb5', 'chatcmpl-4504192b0b6c4e9c83d8e1f3457f0a54', 'chatcmpl-a9ffdec7f4a445b780d4fd36ff417dd3', 'chatcmpl-cb3b31082f624b1a970b7523566157a6', 'chatcmpl-91da0eeda2e045aba9a9caee61c07f82', 'chatcmpl-75bd4088b3264e0aa13976728b6e0886', 'chatcmpl-c19032e7951e4ecea3aed5667da5c39e', 'chatcmpl-9d0bdd59b5764c7eb87ca01e2c536900', 'chatcmpl-aa5b82a1b0ae4799b9c2e496f6f5244f', 'chatcmpl-5a5ff59d47f447e2a2262a825088a41d', 'chatcmpl-99b235a23fb440cb84ad6776bf95c36a', 'chatcmpl-5d82771fba9442c9955f47e6069c980c', 'chatcmpl-6066ac058d7c42459299ac31f8eb1ba9', 'chatcmpl-25f63f598dbb42bdb1eff1306be9ea1b', 'chatcmpl-09aed0366a37455385c03583047a6cf7', 'chatcmpl-df72cdd54a8046e8ba98634046335f26', 'chatcmpl-1bbc501aa189461da831e3321f73bac3', 'chatcmpl-17217b7df00d49f4939e9b8ee3acc0eb', 'chatcmpl-ee661349c5ba4328ab83b6cc5a8882dc'], resumed_from_preemption=[false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false], new_token_ids=[], new_block_ids=[null, null, null, null, null, null, null, null, null, null, null, [[697]], null, null, null, null, null, null, null, null, null, null, null, null, null], num_computed_tokens=[706, 706, 670, 799, 797, 702, 705, 797, 796, 704, 700, 768, 765, 604, 798, 797, 700, 697, 797, 800, 799, 797, 700, 697, 832]), num_scheduled_tokens={chatcmpl-5d82771fba9442c9955f47e6069c980c: 1, chatcmpl-bb7abdd8cd9646c8b19061db190d478e: 1, chatcmpl-6066ac058d7c42459299ac31f8eb1ba9: 1, chatcmpl-38c0f3b0324e41fa80445e94ea376ad8: 1, chatcmpl-99b235a23fb440cb84ad6776bf95c36a: 1, chatcmpl-c19032e7951e4ecea3aed5667da5c39e: 1, chatcmpl-17217b7df00d49f4939e9b8ee3acc0eb: 1, chatcmpl-4504192b0b6c4e9c83d8e1f3457f0a54: 1, chatcmpl-7cbc224c6b424eae9e884e3e93ffdb67: 793, chatcmpl-ce8258fc5ef34634a97b50bcefcf369f: 794, chatcmpl-5e48126b1bff4cbf9b32f95235878c00: 1, chatcmpl-a9ffdec7f4a445b780d4fd36ff417dd3: 1, chatcmpl-8a2a96bbe96f4a03be158c97a591ae78: 1, chatcmpl-63e6f87bdb4d46cdb195647acaaf31af: 1, chatcmpl-df72cdd54a8046e8ba98634046335f26: 1, chatcmpl-47adc6f12a004694a5457d72b73910fd: 826, chatcmpl-09aed0366a37455385c03583047a6cf7: 1, chatcmpl-47920a3a434246af9e1cb483cdb94f8b: 798, chatcmpl-98ddce561aff4485a62193d238d4bdb5: 1, chatcmpl-cb3b31082f624b1a970b7523566157a6: 1, chatcmpl-5ce7be5891984269afcb9e69951c03a2: 1, chatcmpl-ee661349c5ba4328ab83b6cc5a8882dc: 1, chatcmpl-25f63f598dbb42bdb1eff1306be9ea1b: 1, chatcmpl-c00326c2f3504405bde8fb442cfebfdd: 796, chatcmpl-aa5b82a1b0ae4799b9c2e496f6f5244f: 1, chatcmpl-f7e2b6dc767c4340b7597b8b81e8b97c: 702, chatcmpl-9d0bdd59b5764c7eb87ca01e2c536900: 1, chatcmpl-91da0eeda2e045aba9a9caee61c07f82: 1, chatcmpl-1bbc501aa189461da831e3321f73bac3: 1, chatcmpl-75bd4088b3264e0aa13976728b6e0886: 1, chatcmpl-5a5ff59d47f447e2a2262a825088a41d: 1}, total_num_scheduled_tokens=4734, scheduled_spec_decode_tokens={}, scheduled_encoder_inputs={chatcmpl-f7e2b6dc767c4340b7597b8b81e8b97c: [0], chatcmpl-ce8258fc5ef34634a97b50bcefcf369f: [0], chatcmpl-c00326c2f3504405bde8fb442cfebfdd: [0]}, num_common_prefix_blocks=[0], finished_req_ids=['chatcmpl-0b6eb56742694a359f96602a12e54a18'], free_encoder_mm_hashes=['4305c2c3017eea60f2497b664cbf3c6ee327aa09cdf75fcbb496db1f15a45d20', 'b2c42b958566d9d27241c6580c7b60225559d5d7049bb2623e088e517d27d0da', '2608f628d1f799b1e16abe874a9e70af7f065fb9654a7c602e4d7bc03fba8e78'], structured_output_request_ids={}, grammar_bitmask=null, kv_connector_metadata=null, ec_connector_metadata=null)
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [dump_input.py:79] Dumping scheduler stats: SchedulerStats(num_running_reqs=31, num_waiting_reqs=0, step_counter=0, current_wave=0, kv_cache_usage=0.0613608748481167, prefix_cache_stats=PrefixCacheStats(reset=False, requests=0, queries=0, hits=0), spec_decoding_stats=None, kv_connector_stats=None, num_corrupted_reqs=0)
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710] EngineCore encountered a fatal error.
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710] Traceback (most recent call last):
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 264, in collective_rpc
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]     result = get_response(w, dequeue_timeout,
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 244, in get_response
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]     status, result = w.worker_response_mq.dequeue(
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]   File "/vllm-workspace/vllm/vllm/distributed/device_communicators/shm_broadcast.py", line 511, in dequeue
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]     with self.acquire_read(timeout, cancel, indefinite) as buf:
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]   File "/usr/local/python3.11.13/lib/python3.11/contextlib.py", line 137, in __enter__
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]     return next(self.gen)
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]            ^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]   File "/vllm-workspace/vllm/vllm/distributed/device_communicators/shm_broadcast.py", line 460, in acquire_read
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]     raise TimeoutError
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710] TimeoutError
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710] The above exception was the direct cause of the following exception:
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710] Traceback (most recent call last):
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 701, in run_engine_core
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]     engine_core.run_busy_loop()
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 728, in run_busy_loop
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]     self._process_engine_step()
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 754, in _process_engine_step
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]     outputs, model_executed = self.step_fn()
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]                               ^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 284, in step
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]     model_output = self.execute_model_with_error_logging(
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 270, in execute_model_with_error_logging
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]     raise err
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 261, in execute_model_with_error_logging
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]     return model_fn(scheduler_output)
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]            ^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 181, in execute_model
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]     (output, ) = self.collective_rpc(
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]                  ^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]   File "/vllm-workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 273, in collective_rpc
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710]     raise TimeoutError(f"RPC call to {method} timed out.") from e
(EngineCore_DP0 pid=1532787) ERROR 12-12 12:32:11 [core.py:710] TimeoutError: RPC call to execute_model timed out.
(Worker_TP1 pid=1532925) INFO 12-12 12:32:11 [multiproc_executor.py:558] Parent process exited, terminating worker
(Worker_TP0 pid=1532924) INFO 12-12 12:32:11 [multiproc_executor.py:558] Parent process exited, terminating worker
(Worker_TP2 pid=1532926) INFO 12-12 12:32:11 [multiproc_executor.py:558] Parent process exited, terminating worker
(APIServer pid=1532514) ERROR 12-12 12:32:11 [async_llm.py:480] AsyncLLM output_handler failed.
(APIServer pid=1532514) ERROR 12-12 12:32:11 [async_llm.py:480] Traceback (most recent call last):
(APIServer pid=1532514) ERROR 12-12 12:32:11 [async_llm.py:480]   File "/vllm-workspace/vllm/vllm/v1/engine/async_llm.py", line 439, in output_handler
(APIServer pid=1532514) ERROR 12-12 12:32:11 [async_llm.py:480]     outputs = await engine_core.get_output_async()
(APIServer pid=1532514) ERROR 12-12 12:32:11 [async_llm.py:480]               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1532514) ERROR 12-12 12:32:11 [async_llm.py:480]   File "/vllm-workspace/vllm/vllm/v1/engine/core_client.py", line 846, in get_output_async
(APIServer pid=1532514) ERROR 12-12 12:32:11 [async_llm.py:480]     raise self._format_exception(outputs) from None
(APIServer pid=1532514) ERROR 12-12 12:32:11 [async_llm.py:480] vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.
(Worker_TP3 pid=1532927) INFO 12-12 12:32:11 [multiproc_executor.py:558] Parent process exited, terminating worker
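As a debugging sketch (not part of the original report): the traceback shows the engine's collective_rpc timing out in shm_broadcast.acquire_read, meaning at least one TP worker stopped responding while executing the model. Capturing stack dumps from the worker processes before the parent tears them down can localize where they hang, for example with py-spy:

```bash
# Hypothetical diagnostic, assuming py-spy is installed (pip install py-spy).
# The PIDs are the Worker_TP0..TP3 processes from the log above; --native
# also shows non-Python frames, useful when a worker is stuck in a collective.
for pid in 1532924 1532925 1532926 1532927; do
  py-spy dump --pid "$pid" --native
done
```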
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.