openvinotoolkit · openvino-dev-samples · May 21, 2025 · Oct 15, 2025 · Oct 16, 2025 · Oct 16, 2025
diff --git a/.ci/spellcheck/.pyspelling.wordlist.txt b/.ci/spellcheck/.pyspelling.wordlist.txt
@@ -189,6 +189,7 @@ deduplicated
 DeepFloyd
 DeepLabV
 DeepSeek
+DeepStack
 denoise
 denoised
 denoises
@@ -412,6 +413,7 @@ intervaling
 im
 imageLink
 img
+io
 ip
 IPs
 ir

diff --git a/notebooks/llm-rag-langchain/llm-rag-langchain-genai.ipynb b/notebooks/llm-rag-langchain/llm-rag-langchain-genai.ipynb
@@ -880,7 +880,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 16,
+   "execution_count": null,
    "id": "d0bab20b",
    "metadata": {},
    "outputs": [],
@@ -892,7 +892,7 @@
     "display(Markdown(f\"`{export_command}`\"))\n",
     "\n",
     "if not Path(rerank_model_id.value).exists():\n",
-    "    optimum_cli(rerank_model_configuration[\"model_id\"], str(rerank_model_id.value), show_command=False, additional_args={\"task\": \"text-classificaton\"})"
+    "    optimum_cli(rerank_model_configuration[\"model_id\"], str(rerank_model_id.value), show_command=False, additional_args={\"task\": \"text-classification\"})"
    ]
   },
   {

diff --git a/notebooks/llm-rag-llamaindex/llm-rag-llamaindex.ipynb b/notebooks/llm-rag-llamaindex/llm-rag-llamaindex.ipynb
@@ -1224,7 +1224,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 26,
+   "execution_count": null,
    "id": "f7f708db-8de1-4efd-94b2-fcabc48d52f4",
    "metadata": {},
    "outputs": [
@@ -1288,7 +1288,9 @@
     "import openvino.properties as props\n",
     "import openvino.properties.hint as hints\n",
     "import openvino.properties.streams as streams\n",
+    "import openvino\n",
     "\n",
+    "core = openvino.Core()\n",
     "\n",
     "if model_to_run.value == \"INT4\":\n",
     "    model_dir = int4_model_dir\n",

diff --git a/notebooks/qwen3-vl/README.md b/notebooks/qwen3-vl/README.md
@@ -0,0 +1,69 @@
+# Visual-language assistant with Qwen3-VL and OpenVINO
+
+Qwen3-VL is the latest addition to the QwenVL series of multimodal large language models.
+
+This generation delivers comprehensive upgrades across the board: superior text understanding & generation, deeper visual perception & reasoning, extended context length, enhanced spatial and video dynamics comprehension, and stronger agent interaction capabilities.
+
+It is available in Dense and MoE architectures that scale from edge to cloud, with Instruct and reasoning‑enhanced Thinking editions for flexible, on‑demand deployment.
+
+
+#### Key Enhancements:
+
+* **Visual Agent**: Operates PC/mobile GUIs—recognizes elements, understands functions, invokes tools, completes tasks.
+
+* **Visual Coding Boost**: Generates Draw.io/HTML/CSS/JS from images/videos.
+
+* **Advanced Spatial Perception**: Judges object positions, viewpoints, and occlusions; provides stronger 2D grounding and enables 3D grounding for spatial reasoning and embodied AI.
+
+* **Long Context & Video Understanding**: Native 256K context, expandable to 1M; handles books and hours-long video with full recall and second-level indexing.
+
+* **Enhanced Multimodal Reasoning**: Excels in STEM/Math—causal analysis and logical, evidence-based answers.
+
+* **Upgraded Visual Recognition**: Broader, higher-quality pretraining enables the model to “recognize everything”: celebrities, anime, products, landmarks, flora/fauna, etc.
+
+* **Expanded OCR**: Supports 32 languages (up from 19); robust in low light, blur, and tilt; better with rare/ancient characters and jargon; improved long-document structure parsing.
+
+* **Text Understanding on par with pure LLMs**: Seamless text–vision fusion for lossless, unified comprehension.
+
+
+#### Model Architecture Updates:
+
+<p align="center">
+    <img src="https://qianwen-res.oss-accelerate.aliyuncs.com/Qwen3-VL/qwen3vl_arc.jpg" width="80%"/>
+</p>
+
+
+1. **Interleaved-MRoPE**: Full‑frequency allocation over time, width, and height via robust positional embeddings, enhancing long‑horizon video reasoning.
+
+2. **DeepStack**: Fuses multi‑level ViT features to capture fine‑grained details and sharpen image–text alignment.
+
+3. **Text–Timestamp Alignment**: Moves beyond T‑RoPE to precise, timestamp‑grounded event localization for stronger video temporal modeling.
+
+
+More details about the model can be found in the [model card](https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct), the [blog](https://qwen.ai/blog?id=99f0335c4ad9ff6153e517418d48535ab6d8afef&from=research.latest-advancements-list), and the original [repo](https://github.com/QwenLM/Qwen3-VL).
+
+In this tutorial, we show how to convert and optimize the Qwen3-VL model to create a multimodal chatbot using [Optimum Intel](https://github.com/huggingface/optimum-intel). Additionally, we demonstrate how to apply model optimization techniques such as weight compression using [NNCF](https://github.com/openvinotoolkit/nncf).
+
+## Notebook contents
+The tutorial consists of the following steps:
+
+- Install requirements
+- Convert and optimize the model
+- Run OpenVINO model inference
+- Launch the interactive demo
+
+In this demonstration, you'll create an interactive chatbot that can answer questions about the content of a provided image.
+
+The image below illustrates an example of an input prompt and the model's answer.
+![example.png](https://github.com/user-attachments/assets/7e12ac6c-12f8-43d8-9c0a-b63d6ecaf20b)
+
+## Installation instructions
+This is a self-contained example that relies solely on its own code.<br>
+We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
+For details, please refer to [Installation Guide](../../README.md).
+
+<img referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=5b5a4db0-7875-4bfb-bdbd-01698b5b1a77&file=notebooks/qwen3-vl/README.md" />
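
The "Convert and Optimize model" step listed in the new README could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper `build_export_command` and the output directory name are hypothetical, and the set of accepted weight formats is a deliberately small subset. It only assembles an `optimum-cli export openvino` command line (the `--model` and `--weight-format` options come from Optimum Intel's CLI) and does not run it. Whether the export succeeds for Qwen3-VL depends on the installed Optimum Intel version.

```python
import shlex

# Subset of weight formats used for illustration (assumption: the CLI may accept more).
SUPPORTED_WEIGHT_FORMATS = {"fp32", "fp16", "int8", "int4"}


def build_export_command(model_id: str, output_dir: str, weight_format: str = "int4") -> list[str]:
    """Assemble an `optimum-cli export openvino` argument list (hypothetical helper)."""
    if weight_format not in SUPPORTED_WEIGHT_FORMATS:
        raise ValueError(f"unsupported weight format: {weight_format!r}")
    return [
        "optimum-cli",
        "export",
        "openvino",
        "--model",
        model_id,
        "--weight-format",
        weight_format,  # low-bit formats trigger NNCF weight compression during export
        output_dir,
    ]


# Model id taken from the model card linked in the README; the directory name is made up.
command = build_export_command("Qwen/Qwen3-VL-4B-Instruct", "Qwen3-VL-4B-Instruct-int4-ov")
print(shlex.join(command))
```

Returning an argument list rather than a single string lets a caller pass it straight to `subprocess.run(command, check=True)` without shell quoting concerns.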