2 changes: 2 additions & 0 deletions .ci/spellcheck/.pyspelling.wordlist.txt
@@ -189,6 +189,7 @@ deduplicated
DeepFloyd
DeepLabV
DeepSeek
DeepStack
denoise
denoised
denoises
@@ -412,6 +413,7 @@ intervaling
im
imageLink
img
io
ip
IPs
ir
4 changes: 2 additions & 2 deletions notebooks/llm-rag-langchain/llm-rag-langchain-genai.ipynb
@@ -880,7 +880,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": null,
"id": "d0bab20b",
"metadata": {},
"outputs": [],
@@ -892,7 +892,7 @@
"display(Markdown(f\"`{export_command}`\"))\n",
"\n",
"if not Path(rerank_model_id.value).exists():\n",
" optimum_cli(rerank_model_configuration[\"model_id\"], str(rerank_model_id.value), show_command=False, additional_args={\"task\": \"text-classificaton\"})"
" optimum_cli(rerank_model_configuration[\"model_id\"], str(rerank_model_id.value), show_command=False, additional_args={\"task\": \"text-classification\"})"
]
},
{
4 changes: 3 additions & 1 deletion notebooks/llm-rag-llamaindex/llm-rag-llamaindex.ipynb
@@ -1224,7 +1224,7 @@
},
{
"cell_type": "code",
"execution_count": 26,
"execution_count": null,
"id": "f7f708db-8de1-4efd-94b2-fcabc48d52f4",
"metadata": {},
"outputs": [
@@ -1288,7 +1288,9 @@
"import openvino.properties as props\n",
"import openvino.properties.hint as hints\n",
"import openvino.properties.streams as streams\n",
"import openvino\n",
"\n",
"core = openvino.Core()\n",
"\n",
"if model_to_run.value == \"INT4\":\n",
" model_dir = int4_model_dir\n",
69 changes: 69 additions & 0 deletions notebooks/qwen3-vl/README.md
@@ -0,0 +1,69 @@
# Visual-language assistant with Qwen3-VL and OpenVINO

Qwen3-VL is the latest addition to the QwenVL series of multimodal large language models.

This generation delivers comprehensive upgrades across the board: superior text understanding & generation, deeper visual perception & reasoning, extended context length, enhanced spatial and video dynamics comprehension, and stronger agent interaction capabilities.

The model is available in Dense and MoE architectures that scale from edge to cloud, with Instruct and reasoning‑enhanced Thinking editions for flexible, on‑demand deployment.


#### Key Enhancements:

* **Visual Agent**: Operates PC/mobile GUIs—recognizes elements, understands functions, invokes tools, completes tasks.

* **Visual Coding Boost**: Generates Draw.io/HTML/CSS/JS from images/videos.

* **Advanced Spatial Perception**: Judges object positions, viewpoints, and occlusions; provides stronger 2D grounding and enables 3D grounding for spatial reasoning and embodied AI.

* **Long Context & Video Understanding**: Native 256K context, expandable to 1M; handles books and hours-long video with full recall and second-level indexing.

* **Enhanced Multimodal Reasoning**: Excels in STEM/Math—causal analysis and logical, evidence-based answers.

* **Upgraded Visual Recognition**: Broader, higher-quality pretraining enables the model to “recognize everything”: celebrities, anime, products, landmarks, flora/fauna, etc.

* **Expanded OCR**: Supports 32 languages (up from 19); robust in low light, blur, and tilt; better with rare/ancient characters and jargon; improved long-document structure parsing.

* **Text Understanding on par with pure LLMs**: Seamless text–vision fusion for lossless, unified comprehension.


#### Model Architecture Updates:

<p align="center">
<img src="https://qianwen-res.oss-accelerate.aliyuncs.com/Qwen3-VL/qwen3vl_arc.jpg" width="80%"/>
</p>


1. **Interleaved-MRoPE**: Full‑frequency allocation over time, width, and height via robust positional embeddings, enhancing long‑horizon video reasoning.

2. **DeepStack**: Fuses multi‑level ViT features to capture fine‑grained details and sharpen image–text alignment.

3. **Text–Timestamp Alignment**: Moves beyond T‑RoPE to precise, timestamp‑grounded event localization for stronger video temporal modeling.
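
As a toy illustration of the "full-frequency allocation" idea behind Interleaved-MRoPE, the sketch below compares assigning rotary frequency indices to the time, height, and width axes in contiguous blocks with assigning them in round-robin order. The function names and the 12-frequency, 4-per-axis layout are assumptions made for this example only; the model's actual section sizes and ordering may differ.

```python
def chunked_allocation(sections):
    """Assign contiguous blocks of frequency indices to each axis."""
    allocation = []
    for axis, count in sections:
        allocation.extend([axis] * count)
    return allocation


def interleaved_allocation(num_freqs, axes=("t", "h", "w")):
    """Assign frequency indices to the axes in round-robin order."""
    return [axes[i % len(axes)] for i in range(num_freqs)]


def index_range(allocation, axis):
    """Return the lowest and highest frequency index owned by an axis."""
    indices = [i for i, name in enumerate(allocation) if name == axis]
    return min(indices), max(indices)


# Hypothetical layout: 12 rotary frequencies shared by three axes.
chunked = chunked_allocation([("t", 4), ("h", 4), ("w", 4)])
interleaved = interleaved_allocation(12)

for axis in ("t", "h", "w"):
    print(axis, index_range(chunked, axis), index_range(interleaved, axis))
```

In the chunked layout each axis only sees one narrow band of frequency indices, while in the interleaved layout every axis spans nearly the whole range.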


More details about the model can be found in the [model card](https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct), the [blog](https://qwen.ai/blog?id=99f0335c4ad9ff6153e517418d48535ab6d8afef&from=research.latest-advancements-list), and the original [repo](https://github.com/QwenLM/Qwen3-VL).

In this tutorial, we show how to convert and optimize the Qwen3-VL model to create a multimodal chatbot using [Optimum Intel](https://github.com/huggingface/optimum-intel). Additionally, we demonstrate how to apply model optimization techniques such as weight compression using [NNCF](https://github.com/openvinotoolkit/nncf).
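
As a rough sketch of what the conversion step may look like, the snippet below assembles an `optimum-cli export openvino` command with 4-bit weight compression. The helper name `build_export_command` and the output directory are hypothetical, and the notebook itself may use different arguments or a helper function instead; the sketch also assumes the installed Optimum Intel version supports this model.

```python
import shlex


def build_export_command(model_id, output_dir, weight_format="int4", task=None):
    """Build (but do not run) an `optimum-cli export openvino` command."""
    command = [
        "optimum-cli", "export", "openvino",
        "--model", model_id,
        "--weight-format", weight_format,
    ]
    if task is not None:
        # Optional: set the export task explicitly instead of relying on auto-detection.
        command += ["--task", task]
    command.append(output_dir)
    return command


# The model ID comes from the model card linked above; the output directory is an assumption.
command = build_export_command("Qwen/Qwen3-VL-4B-Instruct", "Qwen3-VL-4B-Instruct-int4-ov")
print(shlex.join(command))
```

The printed command can be run in a shell, or passed to `subprocess.run(command, check=True)` from Python.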

## Notebook contents
The tutorial consists of the following steps:

- Install requirements
- Convert and Optimize model
- Run OpenVINO model inference
- Launch Interactive demo

In this demonstration, you'll create an interactive chatbot that can answer questions about the content of a provided image.

The image below illustrates an example of an input prompt and the model's answer.
![example.png](https://github.com/user-attachments/assets/7e12ac6c-12f8-43d8-9c0a-b63d6ecaf20b)

## Installation instructions
This is a self-contained example that relies solely on its own code.<br>
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to [Installation Guide](../../README.md).

<img referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=5b5a4db0-7875-4bfb-bdbd-01698b5b1a77&file=notebooks/qwen3-vl/README.md" />