
[Feature]: support for nvidia/Qwen2.5-VL-7B-Instruct-FP4 #8404


Description

@pythonjavaerlang

🚀 The feature, motivation and pitch

Hardware: NVIDIA L4 (24 GB)
Model: https://huggingface.co/nvidia/Qwen2.5-VL-7B-Instruct-FP4

nvidia-smi:
| NVIDIA-SMI 550.163.01 Driver Version: 535.261.03 CUDA Version: 13.0 |

python3 ./TensorRT-LLM/examples/models/core/qwen/convert_checkpoint.py --model_dir ./Qwen2.5-VL-7B-Instruct-FP4 --output_dir ./tllm_checkpoint_fp4 --dtype float16 --smoothquant 0.5

[TensorRT-LLM] TensorRT LLM version: 1.2.0rc1
1.2.0rc1
Traceback (most recent call last):
File "/code/tensorrt_llm/./TensorRT-LLM/examples/models/core/qwen/convert_checkpoint.py", line 345, in
main()
File "/code/tensorrt_llm/./TensorRT-LLM/examples/models/core/qwen/convert_checkpoint.py", line 337, in main
convert_and_save_hf(args)
File "/code/tensorrt_llm/./TensorRT-LLM/examples/models/core/qwen/convert_checkpoint.py", line 263, in convert_and_save_hf
QWenForCausalLM.quantize(
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/models/qwen/model.py", line 530, in quantize
config = QWenConfig.from_hugging_face(hf_model_dir,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/models/qwen/config.py", line 115, in from_hugging_face
assert qwen_type in valid_types, f"Unsupported Qwen type: {qwen_type}, only {valid_types} are acceptable."
^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Unsupported Qwen type: qwen2_5_vl, only ('qwen', 'qwen2', 'qwen2_moe', 'qwen2_llava_onevision', 'qwen2_vl', 'qwen2_audio', 'qwen3', 'qwen3_moe') are acceptable.
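For context on the first failure: convert_checkpoint.py derives qwen_type from the model_type field of the checkpoint's config.json, which for this model is qwen2_5_vl and is simply not in the accepted tuple. A minimal sketch of the check (illustrative only; the real logic lives in tensorrt_llm/models/qwen/config.py, and the local path is from my setup):

```python
import json

# model_type in the HF config is what convert_checkpoint.py treats as qwen_type
with open("./Qwen2.5-VL-7B-Instruct-FP4/config.json") as f:
    qwen_type = json.load(f)["model_type"]  # "qwen2_5_vl" for this checkpoint

valid_types = ('qwen', 'qwen2', 'qwen2_moe', 'qwen2_llava_onevision',
               'qwen2_vl', 'qwen2_audio', 'qwen3', 'qwen3_moe')
# Reproduces the AssertionError above: 'qwen2_5_vl' is missing from the tuple
assert qwen_type in valid_types, f"Unsupported Qwen type: {qwen_type}"
```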

uname -a
Linux f909e01d9262 6.12.38+deb13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.12.38-1 (2025-07-16) x86_64 x86_64 x86_64 GNU/Linux

Alternatives

When I edit the code and add the model type manually, I get this error:

[TensorRT-LLM] TensorRT LLM version: 1.2.0rc1
1.2.0rc1
torch_dtype is deprecated! Use dtype instead!
Traceback (most recent call last):
File "/code/tensorrt_llm/./TensorRT-LLM/examples/models/core/qwen/convert_checkpoint.py", line 345, in
main()
File "/code/tensorrt_llm/./TensorRT-LLM/examples/models/core/qwen/convert_checkpoint.py", line 337, in main
convert_and_save_hf(args)
File "/code/tensorrt_llm/./TensorRT-LLM/examples/models/core/qwen/convert_checkpoint.py", line 263, in convert_and_save_hf
QWenForCausalLM.quantize(
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/models/qwen/model.py", line 535, in quantize
convert.quantize(hf_model_dir,
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/models/qwen/convert.py", line 996, in quantize
hf_model = model_cls.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/transformers/models/auto/auto_factory.py", line 607, in from_pretrained
raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.qwen2_5_vl.configuration_qwen2_5_vl.Qwen2_5_VLConfig'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of ApertusConfig, ArceeConfig, AriaTextConfig, BambaConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BitNetConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CohereConfig, Cohere2Config, CpmAntConfig, CTRLConfig, Data2VecTextConfig, DbrxConfig, DeepseekV2Config, DeepseekV3Config, DiffLlamaConfig, DogeConfig, Dots1Config, ElectraConfig, Emu3Config, ErnieConfig, Ernie4_5Config, Ernie4_5_MoeConfig, Exaone4Config, FalconConfig, FalconH1Config, FalconMambaConfig, FuyuConfig, GemmaConfig, Gemma2Config, Gemma3Config, Gemma3TextConfig, Gemma3nConfig, Gemma3nTextConfig, GitConfig, GlmConfig, Glm4Config, Glm4MoeConfig, GotOcr2Config, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GptOssConfig, GPTJConfig, GraniteConfig, GraniteMoeConfig, GraniteMoeHybridConfig, GraniteMoeSharedConfig, HeliumConfig, HunYuanDenseV1Config, HunYuanMoEV1Config, JambaConfig, JetMoeConfig, Lfm2Config, LlamaConfig, Llama4Config, Llama4TextConfig, MambaConfig, Mamba2Config, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MiniMaxConfig, MistralConfig, MixtralConfig, MllamaConfig, ModernBertDecoderConfig, MoshiConfig, MptConfig, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NemotronConfig, OlmoConfig, Olmo2Config, OlmoeConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PersimmonConfig, PhiConfig, Phi3Config, Phi4MultimodalConfig, PhimoeConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, Qwen2MoeConfig, Qwen3Config, Qwen3MoeConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, SeedOssConfig, SmolLM3Config, Speech2Text2Config, StableLmConfig, Starcoder2Config, TransfoXLConfig, TrOCRConfig, WhisperConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, xLSTMConfig, XmodConfig, ZambaConfig, Zamba2Config.
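My reading of this second failure: convert.py loads the HF weights through AutoModelForCausalLM, but transformers maps Qwen2_5_VLConfig to Qwen2_5_VLForConditionalGeneration, which that auto class does not cover, so patching the model name alone cannot work. A hedged sketch (local path is from my setup; whether the FP4 weights actually load outside ModelOpt is unverified):

```python
from transformers import Qwen2_5_VLForConditionalGeneration

# What convert.py effectively attempts, and what raises the ValueError above:
# AutoModelForCausalLM.from_pretrained("./Qwen2.5-VL-7B-Instruct-FP4")

# The checkpoint's config resolves to the vision-language class instead:
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "./Qwen2.5-VL-7B-Instruct-FP4", dtype="auto")
```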

Additional context

The following support matrix says Qwen2.5-VL is supported (in the PyTorch-backend table):
https://nvidia.github.io/TensorRT-LLM/reference/support-matrix.html#models-pytorch-backend
"""
Qwen2_5_VLForConditionalGeneration | Qwen2.5-VL | Qwen/Qwen2.5-VL-7B-Instruct
"""

Docker image:
nvcr.io/nvidia/tensorrt-llm/release:1.2.0rc0.post1

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.

Metadata

Labels

Multimodal (Label for issues & PRs regarding Multimodal related objects), question (Further information is requested), waiting for feedback
