
Conversation

@openvino-dev-samples
Collaborator

No description provided.


@review-notebook-app

review-notebook-app bot commented Oct 24, 2025

aleksandr-mokrov commented on 2025-10-24T15:50:16Z
----------------------------------------------------------------

Line #4.    demo = make_demo(model, processor)

Please check this: make_demo accepts only model, genai is not installed, and the '_OVQwen3VLForCausalLM' object has no attribute 'start_chat'.


@openvino-dev-samples
Copy link
Collaborator Author

Sorry, that was my mistake: it should reuse the gradio_helper.py of the qwen2.5vl notebook instead of qwen2vl. It should be working now.
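For reference, the signature mismatch described above (one helper takes only model, the other takes model and processor) can be guarded against with a small shim. This is a minimal sketch only; call_make_demo, one_arg, and two_arg are stand-ins, not the notebook's actual objects:

```python
import inspect

def call_make_demo(make_demo, model, processor):
    """Call make_demo with (model, processor) if it accepts two
    arguments, or with just model if it accepts one (as the older
    qwen2vl-style helper did)."""
    n_params = len(inspect.signature(make_demo).parameters)
    return make_demo(model, processor) if n_params >= 2 else make_demo(model)

# Toy usage with stand-in helpers:
one_arg = lambda model: f"demo({model})"
two_arg = lambda model, processor: f"demo({model}, {processor})"
print(call_make_demo(one_arg, "m", "p"))   # helper that takes only the model
print(call_make_demo(two_arg, "m", "p"))   # helper that takes both
```

Reusing the matching gradio_helper.py, as done in the fix, is of course the cleaner solution; the shim only illustrates why the original call failed.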

@helena-intel
Copy link
Contributor

@openvino-dev-samples have you tried this on a computer with 32GB RAM? I have an LNL laptop with 32GB RAM and I can only convert the 2B model. The 4B model crashes. The same happened when I tried to reproduce https://medium.com/openvino-toolkit/early-look-exploring-qwen3-vl-and-qwen3-next-day0-model-integration-for-enhanced-ai-pc-experiences-134498f6b290 . There is no error; conversion just stops after "loading checkpoint shards" (outside the notebook; inside the notebook I get "subprocess failed").

Converting does work on SPR with 1TB RAM. I used the optimum-cli command from the notebook with these requirements (also from the notebook) in a clean env: https://gist.githubusercontent.com/helena-intel/53d044d690b7769b49b5bbccd5c267bf/raw/34b56b29b79b17f7d59b875218e650101ea19ae1/requirements_notebook.txt

It's surprising not to be able to convert a 4B model on an LNL laptop. If this is expected for now, can this be made very clear in the notebook, with bold letters or a warning note?
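One lightweight way to surface such a warning inside the notebook is a pre-flight RAM check. This is a sketch only: it is POSIX-specific, and the 32 GB threshold is an assumption drawn from the reports in this thread, not a documented requirement of the model:

```python
import os

MIN_RAM_GB = 32  # assumed threshold, based on the reports in this thread

def total_ram_gb():
    """Total physical RAM in GiB (POSIX only; uses sysconf)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3

ram = total_ram_gb()
if ram < MIN_RAM_GB:
    print(f"WARNING: only {ram:.0f} GiB RAM detected; "
          f"converting the 4B model may exhaust memory.")
```

A bold markdown warning in the notebook text would still be needed for readers who skip code cells.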

@brmarkus
Copy link

I can confirm that I can run the notebook with the Qwen/Qwen3-VL-4B-Instruct model on a MeteorLake/Ultra-7-155H laptop under MS-Win11-Pro with 64GB RAM.

After downloading the model, during the call optimum-cli export openvino --model Qwen/Qwen3-VL-4B-Instruct Qwen3-VL-4B-Instruct\INT4 --weight-format int4 the system memory consumption was at ~22GB (browser open, Visual Studio open, various PDFs, etc.), and during conversion & optimization it went up to ~57GB (an increase of ~35GB)!
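The memory spike during conversion can also be measured without watching Task Manager, by running the export as a child process and reading its peak RSS afterwards. This is a POSIX-only sketch using only the standard library; the toy child below stands in for the real optimum-cli command:

```python
import resource
import subprocess
import sys

def run_and_report_peak(cmd):
    """Run cmd to completion, then return the peak resident set size
    (in KiB on Linux) of finished child processes."""
    subprocess.run(cmd, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss

# Toy child that allocates ~100 MiB; for a real measurement you would
# pass the optimum-cli export command quoted above instead.
peak_kib = run_and_report_peak(
    [sys.executable, "-c", "x = b'a' * (100 * 1024 * 1024)"]
)
print(f"child peak RSS: {peak_kib / 1024:.0f} MiB")
```

Note that resource.getrusage is not available on Windows; there, a tool like Process Explorer gives the equivalent peak-working-set figure.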

The inference succeeded on both the CPU and GPU.

@openvino-dev-samples
Copy link
Collaborator Author

Hi @helena-intel @brmarkus

I have converted both of them on LNL with 32GB RAM.

[screenshot]

@brmarkus
Copy link

It looks like as many applications and services as possible need to be shut down to free memory. In your setup, the screenshot shows an idle memory usage of 6.6GB out of 32GB total. Have you watched the memory consumption during the download and then during conversion & optimization?
For me, memory usage rose during the download and didn't fall when conversion & optimization started immediately afterwards. It might help when there is a gap, or a notebook restart, between the download and conversion & optimization.
