
Conversation

@MrZ20 (Contributor) commented Nov 27, 2025

What this PR does / why we need it?

Fix errors in the tutorial document (qwen2-audio-7b).

  • Fixes Deprecated Import: Updates the import of FlexibleArgumentParser from vllm.utils to vllm.utils.argparse_utils to address a deprecation warning and prevent a future ImportError (see the sketch after the related error below).

Related errors:

```
ImportError: cannot import name 'FlexibleArgumentParser' from 'vllm.utils' (/vllm-workspace/vllm/vllm/utils/__init__.py)
```
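For reference, a minimal sketch of the corrected import. The try/except fallback for older vLLM releases is illustrative only and is not part of the tutorial change itself:

```python
# Corrected import per this PR; the try/except fallback for older vLLM
# releases is illustrative only and not part of the tutorial change.
try:
    from vllm.utils.argparse_utils import FlexibleArgumentParser  # new location
except ImportError:
    from vllm.utils import FlexibleArgumentParser  # deprecated pre-move location
```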
  • Corrects Max Sequence Length: Changes --max_model_len from 16384 to the model-supported 8192, resolving the ValueError caused by exceeding the model's physical context window (max_position_embeddings); see the sketch after the related error below.

Related errors:

```
Value error, User-specified max_model_len (16384) is greater than the derived max_model_len
(max_position_embeddings=8192 or model_max_length=None in model's config.json). To allow overriding this
maximum, set the env var VLLM_ALLOW_LONG_MAX_MODEL_LEN=1. VLLM_ALLOW_LONG_MAX_MODEL_LEN must be used
with extreme caution. If the model uses relative position encoding (RoPE), positions exceeding
derived_max_model_len lead to nan. If the model uses absolute position encoding, positions exceeding
derived_max_model_len will cause a CUDA array out-of-bounds error.
[type=value_error, input_value=ArgsKwargs((), {'model': ...rocessor_plugin': None}), input_type=ArgsKwargs]
```
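A minimal sketch of how the corrected limit is applied. The offline LLM constructor shown here is illustrative; the tutorial itself passes the same value to vllm serve on the command line:

```python
# Sketch: applying the model-supported context length offline (illustrative;
# the tutorial passes the same value to `vllm serve` on the command line).
from vllm import LLM

llm = LLM(
    model="Qwen/Qwen2-Audio-7B-Instruct",
    max_model_len=8192,  # must not exceed max_position_embeddings=8192
)
# Exceeding 8192 requires VLLM_ALLOW_LONG_MAX_MODEL_LEN=1, which the error
# message above warns can produce NaNs (RoPE) or CUDA out-of-bounds errors.
```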

Does this PR introduce any user-facing change?

How was this patch tested?

@gemini-code-assist (bot) left a comment


Code Review

This pull request aims to correct errors in the Qwen2-Audio-7B tutorial document. The changes include fixing an import path, updating the max_model_len parameter value, and correcting the model identifier in a curl example. These changes improve the accuracy of the tutorial. I've identified one remaining issue in a command-line argument and provided a suggestion for a fix. Overall, this is a helpful documentation update.

```diff
     -it $IMAGE \
     vllm serve Qwen/Qwen2-Audio-7B-Instruct \
-      --max_model_len 16384 \
+      --max_model_len 8192 \
```
@gemini-code-assist (bot) commented on the diff:

Severity: high

The command-line argument for setting the maximum model length should use a hyphen instead of an underscore. The correct argument is --max-model-len, not --max_model_len. Using an underscore will likely cause the argument to be unrecognized and the command to fail.

Suggested change

```diff
-      --max_model_len 8192 \
+      --max-model-len 8192 \
```
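As a follow-up usage check, a minimal client-side sketch against the served endpoint. The port 8000 and the api_key placeholder are assumptions (vLLM defaults), and the model name must match the one given to vllm serve:

```python
# Hypothetical sanity check against the running server (assumes vLLM's
# default port 8000; the model name must match the `vllm serve` argument).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="Qwen/Qwen2-Audio-7B-Instruct",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)
```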

@MrZ20 force-pushed the qwen2_audio_7b_docs branch from 2b8634d to b605462 on November 27, 2025 07:45
@github-actions (bot) commented:

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling out the PR description to help reviewers and future developers understand.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

@github-actions bot added the documentation (Improvements or additions to documentation) label on Nov 27, 2025
@MrZ20 (Contributor, Author) commented Nov 27, 2025

@wangxiyuan @Yikun This PR addresses several corrections to the tutorial documentation for Qwen2-Audio-7B. Once confirmed, it is ready to be merged.
