Updates to visually tag "layout mode" and explain PyMuPDF Layout more.

jamie-lemon · jamie-lemon · commit 6ef406aabc2a · 2025-11-28T21:09:46.000Z
diff --git a/docs/images/layout-ocr-flow.png b/docs/images/layout-ocr-flow.png
diff --git a/docs/installation.rst b/docs/installation.rst
@@ -330,4 +330,9 @@ So for a working OCR functionality, make sure to complete this checklist:
      * Windows: `setx TESSDATA_PREFIX "C:/Program Files/Tesseract-OCR/tessdata"`
      * Unix systems: `declare -x TESSDATA_PREFIX=/usr/share/tesseract-ocr/4.00/tessdata`
 
+
+.. note::
+
+  Find out more in the `official Tesseract installation documentation <https://tesseract-ocr.github.io/tessdoc/Installation.html>`_.
+
 .. include:: footer.rst
diff --git a/docs/pymupdf-layout/index.rst b/docs/pymupdf-layout/index.rst
@@ -1,7 +1,7 @@
 
 .. include:: ../header.rst
 
-.. _pymupdf-layout
+.. _pymupdf-layout:
 
 
 PyMuPDF Layout
@@ -22,6 +22,8 @@ Install from |PyPI| with::
     pip install pymupdf-layout
 
 
+.. _pymupdf_layout_using:
+
 Using
 ----------------------------------
 
@@ -118,16 +120,31 @@ Now we can happily load Office files and convert them as follows::
     md = pymupdf4llm.to_markdown("sample.docx")
 
 
+.. _pymupdf_layout_ocr_support:
+
 OCR support
 ~~~~~~~~~~~~~~~~~
 
-The new layout-sensitive PyMuPDF4LLM version also evaluates whether a page would benefit from applying OCR to it. If its heuristics come to this conclusion, the built-in Tesseract-OCR module is automatically invoked. Its results are then handled like normal page content.
+The new layout-sensitive |PyMuPDF4LLM| version also evaluates whether a page would benefit from applying OCR to it. If its heuristics come to this conclusion, the built-in Tesseract-OCR module is automatically invoked. Its results are then handled like normal page content.
  
-If a page contains (roughly) no text at all, but is covered with images or many character-sized vectors, a check is made using `OpenCV <https://pypi.org/project/opencv-python/>`_ whether text is *probably* detectable on the page at all. This is done to tell apart image-based text from ordinary pictures (like photographies).
+If a page contains (roughly) no text at all, but is covered with images or many character-sized vectors, a check is made using `OpenCV <https://pypi.org/project/opencv-python/>`_ whether text is *probably* detectable on the page at all. This is done to tell apart image-based text from ordinary pictures (like photographs).
 
 If the page does contain text but too many characters are unreadable (like "�����"), OCR is also executed, but **for the affected text areas only** -- not the full page. This way, we avoid losing already existing text and other content like images and vectors.
 
-For these heuristics to work we need both, an existing Tesseract installation and the availability of OpenCV in the Python environment. If either is missing, no OCR is attempted at all.
+For these heuristics to work, we need both an existing :ref:`Tesseract installation <installation_ocr>` and `OpenCV <https://pypi.org/project/opencv-python/>`_ available in the Python environment. If either is missing, no OCR is attempted at all.
+
+Whether OCR is actually used depends on the following conditions:
+
+1. :ref:`PyMuPDF Layout is imported <pymupdf_layout_using>`
+
+2. `use_ocr` is enabled in the :ref:`PyMuPDF4LLM API <pymupdf4llm-api>` (it is `True` by default)
+
+3. :ref:`Tesseract is correctly installed <installation_ocr>`
+
+4. `OpenCV <https://pypi.org/project/opencv-python/>`_ is available in your Python environment
+
+
+.. image:: ../images/layout-ocr-flow.png
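
The environment-related conditions above can be checked up front with a small script. The following is a minimal, unofficial sketch, not part of the PyMuPDF API: the helper names ``module_available`` and ``ocr_prerequisites`` are hypothetical, the module name ``pymupdf.layout`` is assumed from the usual import of PyMuPDF Layout, and the Tesseract checks (a ``TESSDATA_PREFIX`` directory or a ``tesseract`` executable on the PATH) are only rough indicators of a working installation. The ``use_ocr`` setting is an API argument and cannot be detected this way.

```python
import importlib.util
import os
import shutil


def module_available(name):
    """Return True if module `name` can be found (parent packages get imported)."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when the parent package of a dotted name cannot be imported.
        return False


def ocr_prerequisites(env=None):
    """Rough, unofficial check of the environment-related OCR conditions."""
    env = os.environ if env is None else env
    tessdata = env.get("TESSDATA_PREFIX", "")
    return {
        # Assumed module name for PyMuPDF Layout.
        "layout_module": module_available("pymupdf.layout"),
        # True only if the variable is set and points to an existing folder.
        "tessdata_prefix": bool(tessdata) and os.path.isdir(tessdata),
        # Heuristic only: a Tesseract executable found on the PATH.
        "tesseract_on_path": shutil.which("tesseract") is not None,
        # OpenCV is importable as "cv2".
        "opencv": module_available("cv2"),
    }


if __name__ == "__main__":
    for name, ok in ocr_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A result of ``missing`` for OpenCV or for both Tesseract indicators suggests that no OCR will be attempted, in line with the conditions listed above.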
 
 ----
 
diff --git a/docs/pymupdf-pro/index.rst b/docs/pymupdf-pro/index.rst
@@ -2,7 +2,7 @@
 .. include:: ../header.rst
 
 
-.. _pymupdf-pro
+.. _pymupdf-pro:
 
 PyMuPDF Pro
 =============
diff --git a/docs/pymupdf4llm/api.rst b/docs/pymupdf4llm/api.rst
diff --git a/docs/pymupdf4llm/index.rst b/docs/pymupdf4llm/index.rst
@@ -1,15 +1,15 @@
 
 .. include:: ../header.rst
 
-.. _pymupdf4llm
+.. _pymupdf4llm:
 
 
 PyMuPDF4LLM
 ===========================================================================
 
 |PyMuPDF4LLM| is aimed to make it easier to extract |PDF| content in the format you need for **LLM** & **RAG** environments. It supports :ref:`Markdown extraction <extracting_as_md>` as well as :ref:`LlamaIndex document output <extracting_as_llamaindex>`.
 
-When using |PyMuPDF4LLM| with PyMuPDF-Layout, page layout detection will be greatly improved. This is true for table detection, but also for the detection of page headers and footers, footnotes, list items and text paragraphs. In addition two new methods become available, `to_json()` and `to_text()`.
+When using |PyMuPDF4LLM| with PyMuPDF Layout, page layout detection will be greatly improved. This is true for table detection, but also for the detection of page headers and footers, footnotes, list items and text paragraphs. In addition, two new methods become available, `to_json()` and `to_text()`.
 
 .. important::
 
@@ -22,8 +22,8 @@ Features
     - Support for image and vector graphics extraction (and inclusion of references in the MD text)
     - Support for page chunking output.
     - Direct support for output as :ref:`LlamaIndex Documents <extracting_as_llamaindex>`.
-    - In "layout mode": Support for plain text output similar to Markdown
-    - In "layout mode": Support for JSON output
+    - When used with :ref:`PyMuPDF Layout <pymupdf-layout>`: Support for plain text output similar to Markdown
+    - When used with :ref:`PyMuPDF Layout <pymupdf-layout>`: Support for JSON output
 
 
 Functionality
diff --git a/docs/recipes.rst b/docs/recipes.rst
@@ -18,6 +18,11 @@
 
 ----
 
+.. toctree::
+
+   recipes-ocr.rst
+
+----
 
 .. toctree::
 
@@ -61,11 +66,6 @@
 
 ----
 
-.. toctree::
-
-   recipes-ocr.rst
-
-----
 
 .. toctree::