Draft

Commits (26)
0ef1eae
Add workflow_dispatch integration test for library mode on Windows an…
charlesbluca Apr 8, 2026
6a21df3
Split torch cu130 deps into explicit group
charlesbluca Apr 8, 2026
bbe51d3
ci: increase RAY_raylet_start_wait_time_s for macOS integration tests
charlesbluca Apr 8, 2026
027c154
Use inprocess run mode for now
charlesbluca Apr 8, 2026
79e8f0b
Pass API key via --api-key flag using NGC_NV_DEVELOPER_NVCF secret
charlesbluca Apr 8, 2026
1f31946
Initial plan & refactors for slimmer instal
charlesbluca Apr 8, 2026
8c45923
Drop nv-ingest as dep
charlesbluca Apr 8, 2026
531015e
Make heavy optional deps lazy for slim Intel Mac install
charlesbluca Apr 9, 2026
6190f04
Fix Intel Mac slim-install blockers in PDF/image/embed pipeline
charlesbluca Apr 9, 2026
264ab24
Merge remote-tracking branch 'upstream/main' into slim-install
charlesbluca Apr 9, 2026
fd53df6
Use CUDA torch index for Windows as well as Linux
charlesbluca Apr 9, 2026
5bef6f7
Merge branch 'slim-install'
charlesbluca Apr 9, 2026
26475c8
Add macOS x64 to workflow
charlesbluca Apr 9, 2026
4b120f0
torch cuda index rename
charlesbluca Apr 9, 2026
9f2035b
Merge branch 'slim-install'
charlesbluca Apr 9, 2026
45f2732
Try switching to macos-26-intel
charlesbluca Apr 9, 2026
9e01344
Modify unit test install
charlesbluca Apr 9, 2026
326de9a
Linting
charlesbluca Apr 9, 2026
0722e6f
Guard optional imports and restore graceful embedding failure handling
charlesbluca Apr 9, 2026
4591f62
Fix test failures from lazy import change and network-dependent token…
charlesbluca Apr 9, 2026
4273b5a
Fix misplaced docstrings and remove invalid uv conflicts block
charlesbluca Apr 9, 2026
25622df
Simplify dependency groups; move remote and lancedb to core
charlesbluca Apr 9, 2026
48a8953
Drop agent doc
charlesbluca Apr 9, 2026
03dc39f
Fix README install instructions to reflect simplified dependency groups
charlesbluca Apr 9, 2026
ec99f30
ci: add nightly schedule trigger and fix secret name in library mode …
charlesbluca Apr 9, 2026
92f777a
Compat code for ray[data] 2.49
charlesbluca Apr 9, 2026
80 changes: 80 additions & 0 deletions .github/workflows/integration-test-library-mode.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
# SPDX-FileCopyrightText: Copyright (c) 2024-25, NVIDIA CORPORATION & AFFILIATES.
# All rights reserved.
# SPDX-License-Identifier: Apache-2.0

name: Library Mode Integration Tests (Windows & macOS)

on:
schedule:
# Runs every day at 11:30PM (UTC)
- cron: "30 23 * * *"
workflow_dispatch:
inputs:
source-ref:
description: 'Git ref to test (branch, tag, or SHA). Defaults to the dispatched branch.'
required: false
type: string
default: ''

jobs:
integration-test:
name: Integration Tests (${{ matrix.os-label }})
runs-on: ${{ matrix.runner }}
timeout-minutes: 90

strategy:
fail-fast: false
matrix:
include:
- runner: windows-latest
os-label: windows-x64
- runner: macos-26
os-label: macos-arm64
- runner: macos-26-intel
os-label: macos-x64

env:
# NIM endpoint URLs — edit these directly to point at different deployments
PAGE_ELEMENTS_INVOKE_URL: "https://ai.api.nvidia.com/v1/cv/nvidia/nemotron-page-elements-v3"
OCR_INVOKE_URL: "https://ai.api.nvidia.com/v1/cv/nvidia/nemoretriever-ocr-v1"
GRAPHIC_ELEMENTS_INVOKE_URL: "https://ai.api.nvidia.com/v1/cv/nvidia/nemotron-graphic-elements-v1"
TABLE_STRUCTURE_INVOKE_URL: "https://ai.api.nvidia.com/v1/cv/nvidia/nemotron-table-structure-v1"
EMBED_INVOKE_URL: "https://integrate.api.nvidia.com/v1"
EMBED_MODEL_NAME: "nvidia/llama-nemotron-embed-1b-v2"

steps:
- name: Check out repository code
uses: actions/checkout@v4
with:
ref: ${{ inputs.source-ref != '' && inputs.source-ref || github.ref }}

- name: Set up Python 3.12
uses: actions/setup-python@v5
with:
python-version: '3.12'

- name: Install uv
run: pip install uv
Comment on lines +50 to +57
Contributor

**GitHub Actions not pinned to commit SHA** (P2)

Both `actions/checkout@v4` and `actions/setup-python@v5` use mutable version tags. Per the repository's `github-actions-security` rule, third-party actions must be pinned to a full commit SHA to prevent supply-chain attacks.

```suggestion
      - name: Check out repository code
        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683  # v4.2.2
        with:
          ref: ${{ inputs.source-ref != '' && inputs.source-ref || github.ref }}

      - name: Set up Python 3.12
        uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065  # v5.6.0
        with:
          python-version: '3.12'
```
- name: Install nemo-retriever and dependencies
shell: bash
run: |
uv pip install --system -e "nemo_retriever"

- name: Run graph pipeline on PDFs
shell: bash
env:
PYTHONPATH: nemo_retriever/src
run: |
python -m nemo_retriever.examples.graph_pipeline ./data \
--run-mode inprocess \
--input-type pdf \
--api-key "${{ secrets.NVCF_API_KEY }}" \
--page-elements-invoke-url "$PAGE_ELEMENTS_INVOKE_URL" \
--ocr-invoke-url "$OCR_INVOKE_URL" \
--use-graphic-elements \
--graphic-elements-invoke-url "$GRAPHIC_ELEMENTS_INVOKE_URL" \
--use-table-structure \
--table-structure-invoke-url "$TABLE_STRUCTURE_INVOKE_URL" \
--embed-invoke-url "$EMBED_INVOKE_URL" \
--embed-model-name "$EMBED_MODEL_NAME"
3 changes: 1 addition & 2 deletions .github/workflows/retriever-unit-tests.yml
Expand Up @@ -26,8 +26,7 @@ jobs:

- name: Install unit test dependencies
run: |
uv pip install --system -e src/ -e api/ -e client/
uv pip install --system -e nemo_retriever
uv pip install --system -e nemo_retriever[all,dev]

- name: Run retriever unit tests
env:
Expand Down
Expand Up @@ -25,11 +25,19 @@

import pandas as pd
import pypdfium2 as pdfium
from unstructured_client import UnstructuredClient
from unstructured_client.models import operations
from unstructured_client.models import shared
from unstructured_client.utils import BackoffStrategy
from unstructured_client.utils import RetryConfig

try:
from unstructured_client import UnstructuredClient
from unstructured_client.models import operations
from unstructured_client.models import shared
from unstructured_client.utils import BackoffStrategy
from unstructured_client.utils import RetryConfig
except ImportError:
UnstructuredClient = None
operations = None
shared = None
BackoffStrategy = None
RetryConfig = None

from nv_ingest_api.internal.enums.common import AccessLevelEnum, DocumentTypeEnum
from nv_ingest_api.internal.enums.common import ContentTypeEnum
Expand Down
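The hunk above is the core slim-install move, repeated across this PR: import the optional dependency inside `try/except ImportError` and bind `None` sentinels on failure, so the module imports cleanly without the extra installed. A minimal standalone sketch of the pattern — the module name `heavy_extra` and function `partition_document` are hypothetical, not the repository's:

```python
# Sketch of the lazy-optional-dependency pattern used throughout this PR.
# "heavy_extra" is a hypothetical module name, not a real project dependency.
try:
    import heavy_extra
except ImportError:
    heavy_extra = None


def partition_document(payload: bytes):
    """Fail at call time, with an actionable message, rather than at import time."""
    if heavy_extra is None:
        raise ImportError(
            "heavy_extra is not installed; install the matching optional extra "
            "to enable this code path"
        )
    return heavy_extra.partition(payload)
```

The payoff for a slim Intel Mac install is that importing the module no longer drags in the heavy dependency; only the code paths that actually need it fail, and they fail with a message pointing at the right extra.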
Expand Up @@ -6,15 +6,19 @@
from typing import Optional

import backoff
import cv2
import numpy as np
import requests

from nv_ingest_api.internal.primitives.nim.model_interface.decorators import multiprocessing_cache
from nv_ingest_api.util.image_processing.transforms import pad_image, normalize_image
from nv_ingest_api.util.string_processing import generate_url, remove_url_endpoints

cv2.setNumThreads(1)
try:
import cv2

cv2.setNumThreads(1)
except ImportError:
cv2 = None
logger = logging.getLogger(__name__)


Expand Down
6 changes: 4 additions & 2 deletions api/src/nv_ingest_api/util/detectors/language.py
Expand Up @@ -3,8 +3,6 @@
# SPDX-License-Identifier: Apache-2.0


import langdetect

from nv_ingest_api.internal.enums.common import LanguageEnum
from nv_ingest_api.util.exception_handlers.detectors import langdetect_exception_handler

Expand All @@ -24,6 +22,10 @@ def detect_language(text):
LanguageEnum
A value from `LanguageEnum` detected language code.
"""
try:
import langdetect
except ImportError:
return LanguageEnum.UNKNOWN

try:
language = langdetect.detect(text)
Expand Down
14 changes: 9 additions & 5 deletions api/src/nv_ingest_api/util/exception_handlers/detectors.py
Expand Up @@ -8,7 +8,10 @@
from typing import Callable
from typing import Dict

from langdetect.lang_detect_exception import LangDetectException
try:
from langdetect.lang_detect_exception import LangDetectException as _LangDetectException
except ImportError:
_LangDetectException = None

from nv_ingest_api.internal.enums.common import LanguageEnum

Expand Down Expand Up @@ -66,9 +69,10 @@ def langdetect_exception_handler(func: Callable, **kwargs: Dict[str, Any]) -> Ca
def inner_function(*args, **kwargs):
try:
return func(*args, **kwargs)
except LangDetectException as e:
log_error_message = f"LangDetectException: {e}"
logger.warning(log_error_message)
return LanguageEnum.UNKNOWN
except Exception as e:
if _LangDetectException is not None and isinstance(e, _LangDetectException):
logger.warning(f"LangDetectException: {e}")
return LanguageEnum.UNKNOWN
raise

return inner_function
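Because `_LangDetectException` may be `None` on a slim install, the handler above has to catch `Exception` broadly and filter with `isinstance`, re-raising anything unrelated. A standalone sketch of that filtering decorator (constant and function names here are illustrative, not the repository's):

```python
import logging

logger = logging.getLogger(__name__)

try:
    # On a slim install this import fails and the sentinel stays None.
    from langdetect.lang_detect_exception import LangDetectException as _OptionalExc
except ImportError:
    _OptionalExc = None

UNKNOWN = "unknown"


def optional_exception_handler(func):
    """Swallow only the optional library's exception; re-raise everything else."""
    def inner(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as e:
            if _OptionalExc is not None and isinstance(e, _OptionalExc):
                logger.warning("detection failed: %s", e)
                return UNKNOWN
            raise
    return inner
```

The `isinstance` check against a possibly-`None` class is what keeps behavior identical when the dependency is installed while avoiding a `NameError` when it is not.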
15 changes: 10 additions & 5 deletions api/src/nv_ingest_api/util/image_processing/table_and_chart.py
Expand Up @@ -8,7 +8,6 @@

import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN


logger = logging.getLogger(__name__)
Expand Down Expand Up @@ -173,10 +172,14 @@ def convert_ocr_response_to_psuedo_markdown(bboxes, texts):
)
preds_df = preds_df.sort_values("y0")

dbscan = DBSCAN(eps=10, min_samples=1)
dbscan.fit(preds_df["y0"].values[:, None])
try:
from sklearn.cluster import DBSCAN

preds_df["cluster"] = dbscan.labels_
dbscan = DBSCAN(eps=10, min_samples=1)
dbscan.fit(preds_df["y0"].values[:, None])
preds_df["cluster"] = dbscan.labels_
except ImportError:
preds_df["cluster"] = (preds_df["y0"] / 10).round().astype(int)
preds_df = preds_df.sort_values(["cluster", "x0"])

results = ""
Expand Down Expand Up @@ -483,12 +486,14 @@ def reorder_boxes(boxes, texts, confs, mode="top_left", dbscan_eps=10):
if dbscan_eps:
do_naive_sorting = False
try:
from sklearn.cluster import DBSCAN

dbscan = DBSCAN(eps=dbscan_eps, min_samples=1)
dbscan.fit(df["y"].values[:, None])
df["cluster"] = dbscan.labels_
df["cluster_centers"] = df.groupby("cluster")["y"].transform("mean").astype(int)
df = df.sort_values(["cluster_centers", "x"], ascending=[True, True], ignore_index=True)
except ValueError:
except (ImportError, ValueError):
do_naive_sorting = True
else:
do_naive_sorting = True
Expand Down
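When scikit-learn is absent, the fallback above buckets text rows by rounding `y0 / 10` instead of running `DBSCAN(eps=10, min_samples=1)`. A dependency-free sketch of that bucketing — note one behavioral difference: two boxes that straddle a bucket boundary can split into different rows, where DBSCAN would have merged them:

```python
def cluster_rows(y_coords, eps=10):
    """Bucket y-coordinates into row clusters by rounding to the nearest eps multiple."""
    return [round(y / eps) for y in y_coords]
```

Rows sharing a bucket are then ordered left-to-right, matching the `sort_values(["cluster", "x0"])` step that follows in both branches.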
78 changes: 48 additions & 30 deletions api/src/nv_ingest_api/util/image_processing/transforms.py
Expand Up @@ -8,15 +8,19 @@
from typing import Optional
from typing import Tuple

import cv2
import numpy as np
from io import BytesIO
from PIL import Image

from nv_ingest_api.util.converters import bytetools

# Configure OpenCV to use a single thread for image processing
cv2.setNumThreads(1)
try:
import cv2

# Configure OpenCV to use a single thread for image processing
cv2.setNumThreads(1)
except ImportError:
cv2 = None
DEFAULT_MAX_WIDTH = 1024
DEFAULT_MAX_HEIGHT = 1280

Expand All @@ -26,9 +30,7 @@
logger = logging.getLogger(__name__)


def _resize_image_opencv(
array: np.ndarray, target_size: Tuple[int, int], interpolation=cv2.INTER_LANCZOS4
) -> np.ndarray:
def _resize_image_opencv(array: np.ndarray, target_size: Tuple[int, int], interpolation=None) -> np.ndarray:
"""
Resizes a NumPy array representing an image using OpenCV.

Expand All @@ -46,7 +48,12 @@ def _resize_image_opencv(
np.ndarray
The resized image as a NumPy array.
"""
return cv2.resize(array, target_size, interpolation=interpolation)
if interpolation is None:
interpolation = cv2.INTER_LANCZOS4 if cv2 is not None else 4 # 4 == INTER_LANCZOS4 constant
if cv2 is not None:
return cv2.resize(array, target_size, interpolation=interpolation)
pil_img = Image.fromarray(array)
return np.array(pil_img.resize(target_size, resample=Image.LANCZOS))


def rgba_to_rgb_white_bg(rgba_image):
Expand Down Expand Up @@ -539,10 +546,15 @@ def numpy_to_base64_png(array: np.ndarray) -> str:
If there is an issue during the image conversion or base64 encoding process.
"""
try:
# Encode to PNG bytes using OpenCV
png_bytes = _encode_opencv_png(array)
if cv2 is not None:
png_bytes = _encode_opencv_png(array)
else:
from io import BytesIO

# Convert to base64
pil_img = Image.fromarray(array.astype(np.uint8))
buf = BytesIO()
pil_img.save(buf, format="PNG", compress_level=3)
png_bytes = buf.getvalue()
base64_img = bytetools.base64frombytes(png_bytes)
except Exception as e:
raise RuntimeError(f"Failed to encode image to base64 PNG: {e}")
Expand Down Expand Up @@ -572,10 +584,15 @@ def numpy_to_base64_jpeg(array: np.ndarray, quality: int = 100) -> str:
If there is an issue during the image conversion or base64 encoding process.
"""
try:
# Encode to JPEG bytes using OpenCV
jpeg_bytes = _encode_opencv_jpeg(array, quality=quality)
if cv2 is not None:
jpeg_bytes = _encode_opencv_jpeg(array, quality=quality)
else:
from io import BytesIO

# Convert to base64
pil_img = Image.fromarray(array.astype(np.uint8)).convert("RGB")
buf = BytesIO()
pil_img.save(buf, format="JPEG", quality=quality)
jpeg_bytes = buf.getvalue()
base64_img = bytetools.base64frombytes(jpeg_bytes)
except Exception as e:
raise RuntimeError(f"Failed to encode image to base64 JPEG: {e}")
Expand Down Expand Up @@ -626,14 +643,15 @@ def numpy_to_base64(array: np.ndarray, format: str = "PNG", **kwargs) -> str:
>>> isinstance(encoded_str_jpeg, str)
True
"""
# Centralized preprocessing of the numpy array
processed_array = _preprocess_numpy_array(array)

# Quick format normalization
format = format.upper().strip()
if format == "JPG":
format = "JPEG"

# _preprocess_numpy_array converts RGB→BGR for OpenCV; skip it when cv2 is unavailable
# since numpy_to_base64_png/jpeg already handle the PIL fallback path with RGB input.
processed_array = _preprocess_numpy_array(array) if cv2 is not None else array

if format == "PNG":
return numpy_to_base64_png(processed_array)
elif format == "JPEG":
Expand Down Expand Up @@ -676,21 +694,21 @@ def base64_to_numpy(base64_string: str) -> np.ndarray:
except Exception as e:
raise ValueError("Invalid base64 string") from e

# Create numpy buffer from bytes and decode using OpenCV
buf = np.frombuffer(image_bytes, dtype=np.uint8)
try:
img = cv2.imdecode(buf, cv2.IMREAD_UNCHANGED)
if img is None:
raise ValueError("OpenCV failed to decode image")

# Convert 4 channel to 3 channel if necessary
if img.shape[2] == 4:
img = rgba_to_rgb_white_bg(img)

# Convert BGR to RGB for consistent processing (OpenCV loads as BGR)
# Only convert if it's a 3-channel color image
if img.ndim == 3 and img.shape[2] == 3:
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
if cv2 is not None:
buf = np.frombuffer(image_bytes, dtype=np.uint8)
img = cv2.imdecode(buf, cv2.IMREAD_UNCHANGED)
if img is None:
raise ValueError("OpenCV failed to decode image")
if img.ndim == 3 and img.shape[2] == 4:
img = rgba_to_rgb_white_bg(img)
if img.ndim == 3 and img.shape[2] == 3:
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
else:
from io import BytesIO

pil_img = Image.open(BytesIO(image_bytes)).convert("RGB")
img = np.array(pil_img)
except ImportError:
raise
except Exception as e:
Expand Down
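The in-code comment in `numpy_to_base64` above is the subtle part of this hunk: the OpenCV encoders consume BGR, so `_preprocess_numpy_array` flips channel order, while PIL consumes RGB directly and must receive the untouched array. A tiny NumPy illustration of what skipping the flip avoids — red would otherwise be serialized as blue:

```python
import numpy as np

rgb = np.zeros((1, 1, 3), dtype=np.uint8)
rgb[0, 0, 0] = 255  # pure red in RGB channel order (what PIL expects)

# The OpenCV-oriented preprocessing would flip channel order to BGR;
# handing this array to a PIL encoder would write the pixel out as blue.
bgr = rgb[..., ::-1]
```

Bypassing `_preprocess_numpy_array` when `cv2` is `None` keeps the PIL fallback path channel-correct.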
16 changes: 13 additions & 3 deletions api/src/nv_ingest_api/util/pdf/pdfium.py
Expand Up @@ -7,8 +7,12 @@
from typing import Optional
from typing import Tuple

import cv2
import numpy as np

try:
import cv2
except ImportError:
cv2 = None
import pypdfium2 as pdfium
import pypdfium2.raw as pdfium_c
from numpy import dtype
Expand Down Expand Up @@ -97,9 +101,15 @@ def convert_bitmap_to_corrected_numpy(bitmap: pdfium.PdfBitmap) -> np.ndarray:
# CFX_AggDeviceDriver::GetDIBits() that SIGTRAPs under concurrent rendering.
mode = bitmap.mode
if mode in {"BGRA", "BGRX"}:
cv2.cvtColor(img_arr, cv2.COLOR_BGRA2RGBA, dst=img_arr)
if cv2 is not None:
cv2.cvtColor(img_arr, cv2.COLOR_BGRA2RGBA, dst=img_arr)
else:
img_arr[:, :, :3] = img_arr[:, :, 2::-1] # BGR→RGB in-place, preserve alpha
elif mode == "BGR":
cv2.cvtColor(img_arr, cv2.COLOR_BGR2RGB, dst=img_arr)
if cv2 is not None:
cv2.cvtColor(img_arr, cv2.COLOR_BGR2RGB, dst=img_arr)
else:
img_arr[:, :, [0, 2]] = img_arr[:, :, [2, 0]] # swap R and B channels in-place

return img_arr

Expand Down
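The two cv2-free branches above rewrite channel order in place using NumPy view assignment (NumPy detects the overlapping source and destination and buffers the copy). Both forms produce the same BGR→RGB flip; a sketch mirroring the fallback branches:

```python
import numpy as np

img = np.arange(24, dtype=np.uint8).reshape(2, 4, 3)
expected = img[..., ::-1].copy()  # BGR -> RGB reference

a = img.copy()
a[:, :, [0, 2]] = a[:, :, [2, 0]]  # fancy-index swap of channels 0 and 2

b = img.copy()
b[:, :, :3] = b[:, :, 2::-1]  # reversed-slice assignment over the first 3 channels
```

The reversed-slice form is what makes the 4-channel BGRA/BGRX case work: it touches only channels 0–2, leaving the alpha channel untouched, just like `cv2.COLOR_BGRA2RGBA` with `dst=img_arr`.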