
Conversation

@henryxuxu0716 (Collaborator) commented Nov 27, 2025

What this PR does / why we need it?

Disable the NZ format for the float-weight case. This is only a quick fix for the dev branch.

For the main branch, we'll consider more cases to make the handling more general.

Does this PR introduce any user-facing change?

How was this patch tested?

qwen2.5 32B

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing, smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by fulfilling the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.

@gemini-code-assist bot (Contributor) left a comment

Code Review

The goal of this PR is to disable the NZ format in the float (bf16/fp16) case, achieved by modifying the `is_enable_nz` function and updating all of its call sites. This direction is correct.

However, the change to `is_enable_nz` in `vllm_ascend/utils.py` introduces a serious problem. The function now mutates the global variable `_ENABLE_NZ`, which makes its behavior depend on call history and produces nondeterministic results. For example, once the function is called with a `float16` or `bfloat16` dtype, the global `_ENABLE_NZ` flag is permanently set to `False`, affecting subsequent calls with other dtypes (such as `int8`).

I have provided a suggested fix for this incorrect mutation of global state. Please see the specific review comment.

Comment on lines 83 to 84

```python
    if dtype in [torch.float16, torch.bfloat16]:
        _ENABLE_NZ = False

    return _ENABLE_NZ
```

critical

The way this function mutates the global variable `_ENABLE_NZ` is seriously problematic. When the function is called with a `torch.float16` or `torch.bfloat16` dtype, it sets the global `_ENABLE_NZ` to `False`. Since `_ENABLE_NZ` is a global that is initialized only once (when it is `None`), this change is persistent.

As a result, all subsequent calls to `is_enable_nz` (even with a different dtype, such as `torch.int8`) will return `False`, which is likely not the intended behavior and introduces a hidden bug that depends on call order.

To fix this, we should avoid mutating the global `_ENABLE_NZ` inside the function. The correct approach is to decide the return value based on `dtype` without changing global state.

The suggested fix:

```python
def is_enable_nz(dtype: Optional[torch.dtype] = torch.int8,
                 vllm_config: Optional[VllmConfig] = None) -> bool:
    global _ENABLE_NZ
    if _ENABLE_NZ is None:
        if not vllm_config:
            raise ValueError(
                "vllm_config must be provided when _ENABLE_NZ is None")
        _ENABLE_NZ = envs_ascend.VLLM_ASCEND_ENABLE_NZ and vllm_config.model_config.hf_config.model_type != "qwen3_next"

    if dtype in [torch.float16, torch.bfloat16]:
        return False

    return _ENABLE_NZ
```
Suggested change

```diff
-    if dtype in [torch.float16, torch.bfloat16]:
-        _ENABLE_NZ = False
-    return _ENABLE_NZ
+    if dtype in [torch.float16, torch.bfloat16]:
+        return False
+    return _ENABLE_NZ
```
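To illustrate the call-order hazard described above, here is a minimal, self-contained sketch. It is an assumption-laden stand-in, not the real vllm_ascend code: strings replace the torch dtypes so it runs without torch installed, and the env-flag/model-type check is stubbed to `True`.

```python
# Strings stand in for torch dtypes (torch.float16 / torch.bfloat16 in the real code).
FLOAT_DTYPES = {"float16", "bfloat16"}
_ENABLE_NZ = None  # module-level cache, as in vllm_ascend/utils.py

def buggy_is_enable_nz(dtype: str = "int8") -> bool:
    global _ENABLE_NZ
    if _ENABLE_NZ is None:
        _ENABLE_NZ = True  # assumption: env flag / model check enabled NZ
    if dtype in FLOAT_DTYPES:
        _ENABLE_NZ = False  # BUG: permanently flips the shared cache
    return _ENABLE_NZ

def fixed_is_enable_nz(dtype: str = "int8") -> bool:
    global _ENABLE_NZ
    if _ENABLE_NZ is None:
        _ENABLE_NZ = True  # same one-time initialization
    if dtype in FLOAT_DTYPES:
        return False  # per-call decision; the cache is untouched
    return _ENABLE_NZ

buggy = [buggy_is_enable_nz("int8"),
         buggy_is_enable_nz("bfloat16"),
         buggy_is_enable_nz("int8")]   # last call is poisoned by the bfloat16 call

_ENABLE_NZ = None  # reset the cache before demonstrating the fix
fixed = [fixed_is_enable_nz("int8"),
         fixed_is_enable_nz("bfloat16"),
         fixed_is_enable_nz("int8")]   # int8 result is unaffected
```

Running the buggy version yields `[True, False, False]`: once a float dtype is seen, `int8` callers get `False` forever. The fixed version yields `[True, False, True]`.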

刘哲续 added 3 commits November 27, 2025 19:02
Signed-off-by: 刘哲续 <[email protected]>
Signed-off-by: 刘哲续 <[email protected]>
Signed-off-by: 刘哲续 <[email protected]>
@weijinqian0 added the `ready` (read for review) and `ready-for-test` (start test by label for PR) labels Nov 27, 2025
@weijinqian0 weijinqian0 merged commit 71acc8d into vllm-project:v0.11.0-dev Nov 28, 2025
43 of 44 checks passed

Labels

module:core, module:ops, ready (read for review), ready-for-test (start test by label for PR)
