112 changes: 56 additions & 56 deletions docs/nvidia-rag-mcp.md

Large diffs are not rendered by default.

29 changes: 29 additions & 0 deletions examples/README.md
@@ -0,0 +1,29 @@
# NVIDIA RAG Examples

This directory contains example integrations and extensions for NVIDIA RAG.

## Examples

| Example | Description | Documentation |
|---------|-------------|---------------|
| [rag_react_agent](./rag_react_agent/) | Integration with [NeMo Agent Toolkit (NAT)](https://github.com/NVIDIA/NeMo-Agent-Toolkit) providing RAG query and search capabilities for agent workflows | [README](./rag_react_agent/README.md) |
| [nvidia_rag_mcp](./nvidia_rag_mcp/) | MCP (Model Context Protocol) server and client for exposing NVIDIA RAG capabilities to MCP-compatible applications | [Documentation](../docs/nvidia-rag-mcp.md) |

## rag_react_agent

This plugin integrates NVIDIA RAG with [NeMo Agent Toolkit](https://github.com/NVIDIA/NeMo-Agent-Toolkit), enabling intelligent agents to use RAG tools for document retrieval and question answering. It demonstrates:

- Creating custom NAT tools that wrap NVIDIA RAG functionality
- Using the ReAct Agent workflow for intelligent tool selection

See the [rag_react_agent README](./rag_react_agent/README.md) for setup and usage instructions.

## nvidia_rag_mcp

This example provides an MCP server and client that expose NVIDIA RAG and Ingestor capabilities as MCP tools. It supports multiple transport modes (SSE, streamable HTTP, stdio) and enables MCP-compatible applications to:

- Generate answers using the RAG pipeline
- Search the vector database for relevant documents
- Manage collections and documents in the vector database

See the [MCP documentation](../docs/nvidia-rag-mcp.md) for detailed setup and usage instructions.
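
As a rough illustration of how an MCP-compatible script might drive the bundled client, the sketch below assembles the command line for a `generate` call over streamable HTTP. The script path, flags, URL, and payload mirror the examples in the client's own docstring; the helper name `build_call_command` is hypothetical and not part of this repository.

```python
import json
import shlex


def build_call_command(url: str, tool: str, arguments: dict) -> list[str]:
    """Build argv for a streamable_http tool call (flags taken from the client docstring)."""
    return [
        "python",
        "examples/nvidia_rag_mcp/mcp_client.py",
        "call",
        "--transport=streamable_http",
        f"--url={url}",
        f"--tool={tool}",
        # Serialize the tool arguments as one JSON string, as --json-args expects.
        f"--json-args={json.dumps(arguments)}",
    ]


cmd = build_call_command(
    "http://127.0.0.1:8000/mcp",
    "generate",
    {"messages": [{"role": "user", "content": "Hi"}]},
)
# shlex.join quotes the JSON payload so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd)` avoids shell quoting entirely, which is usually the safer choice when the JSON payload contains quotes.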
File renamed without changes.
@@ -300,16 +300,16 @@ def main() -> None:
 Main entry point for the MCP client CLI.
 Examples:
 List tools (SSE):
-python nvidia_rag_mcp/mcp_client.py list --transport=sse --url=http://127.0.0.1:8000/sse
+python examples/nvidia_rag_mcp/mcp_client.py list --transport=sse --url=http://127.0.0.1:8000/sse
 List tools (stdio):
-python nvidia_rag_mcp/mcp_client.py list --transport=stdio --command=python \
---args="-m nvidia_rag_mcp.mcp_server --transport stdio"
+python examples/nvidia_rag_mcp/mcp_client.py list --transport=stdio --command=python \
+--args="examples/nvidia_rag_mcp/mcp_server.py --transport stdio"
 Call generate (streamable_http):
-python nvidia_rag_mcp/mcp_client.py call --transport=streamable_http --url=http://127.0.0.1:8000/mcp \
+python examples/nvidia_rag_mcp/mcp_client.py call --transport=streamable_http --url=http://127.0.0.1:8000/mcp \
 --tool=generate --json-args='{"messages":[{"role":"user","content":"Hi"}]}'
 Call upload_documents (stdio):
-python nvidia_rag_mcp/mcp_client.py call --transport=stdio --command=python \
---args="-m nvidia_rag_mcp.mcp_server --transport stdio" \
+python examples/nvidia_rag_mcp/mcp_client.py call --transport=stdio --command=python \
+--args="examples/nvidia_rag_mcp/mcp_server.py --transport stdio" \
 --tool=upload_documents \
 --json-args='{"collection_name":"my_collection","file_paths":["/abs/path/file.pdf"]}'
 """
@@ -815,9 +815,9 @@ def main() -> None:
 Main entry point for the MCP server.
 Examples:
 SSE:
-python nvidia_rag_mcp/mcp_server.py --transport sse
+python examples/nvidia_rag_mcp/mcp_server.py --transport sse
 streamable_http:
-python nvidia_rag_mcp/mcp_server.py --transport streamable_http
+python examples/nvidia_rag_mcp/mcp_server.py --transport streamable_http
 """
 parser = argparse.ArgumentParser(description="NVIDIA RAG MCP server")
 parser.add_argument("--transport", choices=["sse", "streamable_http", "stdio"], help="Transport mode")
252 changes: 252 additions & 0 deletions examples/rag_react_agent/README.md
@@ -0,0 +1,252 @@
# Building Agentic RAG with NeMo Agent Toolkit

This example demonstrates how to build intelligent agents that leverage **NVIDIA RAG** capabilities using [NeMo Agent Toolkit (NAT)](https://github.com/NVIDIA/NeMo-Agent-Toolkit). The agent can autonomously decide when and how to query your document knowledge base.

## Overview

This example shows how to:

1. **Expose RAG as agent tools** - Wrap NVIDIA RAG query and search capabilities as tools that agents can use
2. **Build a ReAct agent** - Use NAT's ReAct workflow to create an agent that reasons about when to use RAG

The ReAct (Reason + Act) agent pattern enables the LLM to iteratively reason about which tools to use based on the user's query, making it ideal for building conversational AI applications with document retrieval capabilities.
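
The reason-then-act loop can be sketched in a few lines of plain Python. This is only an illustration of the pattern, not NAT's implementation: `TOOLS`, `fake_llm`, and `react_loop` are hypothetical names, and the "LLM" is a scripted stub so the example runs without any endpoint.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> callable taking a query string.
TOOLS: dict[str, Callable[[str], str]] = {
    "rag_query": lambda query: "Driving a car at the beach",
}


def fake_llm(question: str, observations: list[str]) -> dict:
    """Scripted stand-in for an LLM: act once, then answer from the observation."""
    if not observations:
        return {"action": "rag_query", "action_input": question}
    return {"final_answer": f"Giraffe is {observations[-1].lower()}."}


def react_loop(question: str, max_steps: int = 5) -> str:
    """Alternate between a reasoning step (LLM) and an acting step (tool call)."""
    observations: list[str] = []
    for _ in range(max_steps):
        step = fake_llm(question, observations)
        if "final_answer" in step:
            return step["final_answer"]
        tool = TOOLS[step["action"]]
        observations.append(tool(step["action_input"]))
    raise RuntimeError("agent did not reach a final answer")


print(react_loop("what is giraffe doing?"))  # prints: Giraffe is driving a car at the beach.
```

The `max_steps` bound is a design choice worth keeping in any real loop: it stops an agent that keeps calling tools without converging on an answer.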

## Prerequisites

- Python 3.11+
- Access to NVIDIA AI endpoints (API key required)
- **Data ingested into Milvus** - Complete the [rag_library_usage.ipynb](../../notebooks/rag_library_usage.ipynb) notebook to set up Milvus and ingest documents before running this example

## Quick Start

All commands should be run from the `examples/rag_react_agent/` directory.

### 1. Set Environment Variables

```bash
# Required: NVIDIA API key for embeddings, reranking, and LLM
export NVIDIA_API_KEY="your-nvidia-api-key"

# Optional: If using custom endpoints
# export NVIDIA_BASE_URL="https://integrate.api.nvidia.com/v1"
```

### 2. Configure Vector Database Endpoint

By default, the example connects to Milvus at `http://localhost:19530`. You can configure this in two ways:

**Option A: Environment Variable (takes precedence)**

```bash
# For standard Milvus server
# export APP_VECTORSTORE_URL="http://localhost:19530"

# Or for a remote Milvus instance
# export APP_VECTORSTORE_URL="http://milvus-host:19530"
```

**Option B: Update config.yml**

Edit `src/rag_react_agent/configs/config.yml` and update the `vdb_endpoint` field:

```yaml
functions:
  rag_query:
    vdb_endpoint: "http://localhost:19530"  # Your Milvus endpoint
  rag_search:
    vdb_endpoint: "http://localhost:19530"  # Your Milvus endpoint
```

> **Note**: If you followed the `rag_library_lite_usage.ipynb` notebook and are using Milvus Lite, set the endpoint to the absolute path of the `.db` file instead of a URL (e.g., `/home/user/data/milvus.db`).
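
The precedence rule described in this step can be pictured as follows. This is a minimal sketch, assuming the environment variable simply overrides the configured value; the function names are hypothetical and the actual resolution logic inside the library may differ.

```python
import os


def resolve_vdb_endpoint(config_value: str = "http://localhost:19530") -> str:
    """Return the endpoint, preferring APP_VECTORSTORE_URL over the config value."""
    # An unset or empty variable falls through to the value from config.yml.
    return os.environ.get("APP_VECTORSTORE_URL") or config_value


def is_milvus_lite(endpoint: str) -> bool:
    """Treat an absolute filesystem path ending in .db as a Milvus Lite file."""
    return os.path.isabs(endpoint) and endpoint.endswith(".db")


print(resolve_vdb_endpoint())
print(is_milvus_lite("/home/user/data/milvus.db"))
```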

### 3. Install Dependencies

```bash
# From examples/rag_react_agent/ directory
# Install all dependencies including nvidia-rag and NeMo Agent Toolkit
uv sync

# Activate the virtual environment
source .venv/bin/activate
```

## Usage

### Running the RAG Agent

The example uses NAT's **ReAct Agent** workflow, which enables the LLM to reason about which RAG tools to use based on the user's query.

```bash
# From examples/rag_react_agent/ directory with .venv activated
nat run --config_file src/rag_react_agent/configs/config.yml --input "what is giraffe doing?"
```

### Example Queries

Try different queries to see how the agent decides which tool to use:

```bash
# Query that triggers rag_query (generates a response using retrieved documents)
nat run --config_file src/rag_react_agent/configs/config.yml --input "Summarize the main themes of the documents"

# Query that triggers rag_search (returns relevant document chunks)
nat run --config_file src/rag_react_agent/configs/config.yml --input "Find all animals mentioned in documents"
```

### Expected Output

On a successful run, the output shows the agent's reasoning process:

```
Configuration Summary:
--------------------
Workflow Type: react_agent
Number of Functions: 3
Number of LLMs: 1

------------------------------
[AGENT]
Agent input: what is giraffe doing?
Agent's thoughts:
Thought: I don't have any information about what giraffe is doing.

Action: rag_query
Action Input: {'query': 'giraffe current activity'}
------------------------------

------------------------------
[AGENT]
Calling tools: rag_query
Tool's input: {'query': 'giraffe current activity'}
Tool's response:
Driving a car at the beach
------------------------------

------------------------------
[AGENT]
Agent input: what is giraffe doing?
Agent's thoughts:
Thought: I now know the final answer
Final Answer: Giraffe is driving a car at the beach.
------------------------------

--------------------------------------------------
Workflow Result:
['Giraffe is driving a car at the beach.']
--------------------------------------------------
```

## Configuration

The configuration file at `src/rag_react_agent/configs/config.yml` defines the RAG tools and agent workflow:

```yaml
functions:
  # RAG Query Tool - Queries documents and returns an LLM- or VLM-generated response
  rag_query:
    _type: nvidia_rag_query
    # Ensure collection_names matches the collection name used in the rag_library_usage notebook.
    collection_names: ["test_library"]  # Milvus collection names
    vdb_endpoint: "http://localhost:19530"  # Milvus endpoint URL

  # RAG Search Tool - Searches for relevant document chunks
  rag_search:
    _type: nvidia_rag_search
    collection_names: ["test_library"]
    vdb_endpoint: "http://localhost:19530"
    reranker_top_k: 3  # Number of results after reranking
    vdb_top_k: 20  # Number of results from vector search

  # Utility tool for date/time queries
  current_datetime:
    _type: current_datetime

llms:
  nim_llm:
    _type: nim
    model_name: meta/llama-3.1-70b-instruct
    temperature: 0.0

# ReAct Agent workflow - enables the LLM to reason about tool usage
workflow:
  _type: react_agent
  tool_names:
    - rag_query
    - rag_search
    - current_datetime
  llm_name: nim_llm
  verbose: true  # Shows agent reasoning process
```
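
The three top-level sections refer to each other by name: every entry in `workflow.tool_names` should be a key under `functions`, and `workflow.llm_name` should be a key under `llms`. The sketch below checks those references on a plain dictionary that mirrors the YAML (parsing is left out to keep it standard-library only). `find_dangling_references` is a hypothetical helper; NAT performs its own validation when it loads the file.

```python
# Plain-dict mirror of the configuration above, trimmed to the fields being checked.
config = {
    "functions": {
        "rag_query": {"_type": "nvidia_rag_query"},
        "rag_search": {"_type": "nvidia_rag_search"},
        "current_datetime": {"_type": "current_datetime"},
    },
    "llms": {"nim_llm": {"_type": "nim"}},
    "workflow": {
        "_type": "react_agent",
        "tool_names": ["rag_query", "rag_search", "current_datetime"],
        "llm_name": "nim_llm",
    },
}


def find_dangling_references(cfg: dict) -> list[str]:
    """Return names used by the workflow that have no matching definition."""
    workflow = cfg["workflow"]
    missing = [name for name in workflow["tool_names"] if name not in cfg["functions"]]
    if workflow["llm_name"] not in cfg["llms"]:
        missing.append(workflow["llm_name"])
    return missing


print(find_dangling_references(config))  # prints: []
```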

### RAG Tools

| Tool | Type | Description |
|------|------|-------------|
| `rag_query` | `nvidia_rag_query` | Queries documents and returns an AI-generated response based on retrieved context |
| `rag_search` | `nvidia_rag_search` | Searches for relevant document chunks without generating a response |

### Tool Configuration Options

#### `nvidia_rag_query`

| Parameter | Description | Default |
|-----------|-------------|---------|
| `collection_names` | List of Milvus collection names to query | `[]` |
| `vdb_endpoint` | Vector database endpoint URL or absolute path to Milvus Lite `.db` file | `"http://localhost:19530"` |

#### `nvidia_rag_search`

| Parameter | Description | Default |
|-----------|-------------|---------|
| `collection_names` | List of Milvus collection names to search | `[]` |
| `vdb_endpoint` | Vector database endpoint URL or absolute path to Milvus Lite `.db` file | `"http://localhost:19530"` |
| `reranker_top_k` | Number of results to return after reranking | `10` |
| `vdb_top_k` | Number of results to retrieve before reranking | `100` |
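
The relationship between `vdb_top_k` and `reranker_top_k` is a two-stage funnel: a broad, cheap first pass selects candidates, and a finer second pass keeps the best few. The sketch below illustrates that shape only. Both scoring functions are toy stand-ins (term overlap instead of vector similarity or a reranker model), and `retrieve_then_rerank` is a hypothetical name.

```python
def retrieve_then_rerank(
    query: str,
    chunks: list[str],
    vdb_top_k: int = 100,
    reranker_top_k: int = 10,
) -> list[str]:
    """Keep vdb_top_k candidates from a coarse score, then reranker_top_k from a finer one."""
    terms = set(query.lower().split())

    def coarse(chunk: str) -> int:
        # Stand-in for vector similarity: number of query terms present in the chunk.
        return len(terms & set(chunk.lower().split()))

    def fine(chunk: str) -> float:
        # Stand-in for a reranker: term overlap normalized by chunk length.
        words = chunk.lower().split()
        return coarse(chunk) / len(words) if words else 0.0

    candidates = sorted(chunks, key=coarse, reverse=True)[:vdb_top_k]
    return sorted(candidates, key=fine, reverse=True)[:reranker_top_k]


chunks = [
    "the giraffe is driving a car at the beach",
    "giraffe",
    "a car parked near the office",
    "weather report for tuesday",
]
print(retrieve_then_rerank("giraffe car", chunks, vdb_top_k=3, reranker_top_k=1))
# prints: ['giraffe']
```

Note that the second stage can reorder candidates, so the final results are not simply the first `reranker_top_k` items of the first stage. Lowering either value reduces the amount of text handed to the LLM.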

## Troubleshooting

### Error: Function type `nvidia_rag_query` not found

The tools are not registered. Ensure you've installed the package:

```bash
# From examples/rag_react_agent/ directory
uv sync
source .venv/bin/activate
```

### Error: Token limit exceeded

If you encounter token limit errors, reduce the number of results:

```yaml
rag_search:
  _type: nvidia_rag_search
  reranker_top_k: 1  # Reduce from 3
  vdb_top_k: 10  # Reduce from 20
```

This commonly occurs when documents contain large base64-encoded images.

### Error: NVIDIA API key not set

```bash
export NVIDIA_API_KEY="your-api-key"
```

### Error: Connection to Milvus failed

Ensure Milvus is running and accessible at the configured endpoint. If you followed the [rag_library_usage.ipynb](../../notebooks/rag_library_usage.ipynb) notebook, Milvus should be running at `http://localhost:19530`.

```bash
# Check if Milvus is running
docker ps | grep milvus
```
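
If Docker is not available on the machine running the agent, a plain TCP check against the configured endpoint is a reasonable first test. This is a minimal sketch with a hypothetical helper name. It only shows that the port accepts connections, not that Milvus is healthy, and it does not apply to Milvus Lite file paths.

```python
import socket
from urllib.parse import urlparse


def endpoint_reachable(endpoint: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint's host and port succeeds."""
    parsed = urlparse(endpoint)
    host, port = parsed.hostname, parsed.port
    if host is None or port is None:
        raise ValueError(f"expected a URL with host and port, got {endpoint!r}")
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


print(endpoint_reachable("http://localhost:19530"))
```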

## Learn More

- [NeMo Agent Toolkit Documentation](https://docs.nvidia.com/nemo/agent-toolkit/latest/)

## License

Apache-2.0
58 changes: 58 additions & 0 deletions examples/rag_react_agent/pyproject.toml
@@ -0,0 +1,58 @@
[build-system]
build-backend = "setuptools.build_meta"
requires = ["setuptools >= 64", "setuptools-scm>=8"]


[tool.setuptools.packages.find]
where = ["src"]
include = ["rag_react_agent*"]


[tool.setuptools_scm]
git_describe_command = "git describe --long --first-parent"
root = "../.."


[project]
name = "rag-react-agent"
dynamic = ["version"]
dependencies = [
  # Keep package version constraints as open as possible to avoid conflicts with other packages. Always define a minimum
  # version when adding a new package. If unsure, default to using `~=` instead of `==`. Does not apply to nvidia-nat packages.
  # Keep sorted!!!
  "langgraph>=0.2", # Required for react_agent workflow
  "langchain_classic",
  "nvidia-nat>=1.5.0a0,<2.0", # Allow pre-release versions
  "nvidia-nat-langchain>=1.5.0a0,<2.0", # Allow pre-release versions
  "nvidia-rag[rag]~=2.4",
]
requires-python = ">=3.11,<3.14"
description = "RAG React Agent example using NVIDIA RAG with NeMo Agent Toolkit"
keywords = ["ai", "rag", "agents"]
license = { text = "Apache-2.0" }
authors = [{ name = "NVIDIA Corporation" }]
maintainers = [{ name = "NVIDIA Corporation" }]
classifiers = [
  "Programming Language :: Python",
  "Programming Language :: Python :: 3.11",
  "Programming Language :: Python :: 3.12",
  "Programming Language :: Python :: 3.13",
]

[project.urls]
documentation = "https://docs.nvidia.com/nemo/agent-toolkit/latest/"
source = "https://github.com/NVIDIA/NeMo-Agent-Toolkit"


[tool.uv]
managed = true
config-settings = { editable_mode = "compat" }
prerelease = "allow" # nvidia-nat packages are currently pre-release only


[tool.uv.sources]
nvidia-rag = { path = "../..", editable = true }


[project.entry-points.'nat.components']
nat_rag = "rag_react_agent.register"
16 changes: 16 additions & 0 deletions examples/rag_react_agent/src/rag_react_agent/__init__.py
@@ -0,0 +1,16 @@
# SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""NVIDIA RAG integration for NeMo Agent Toolkit."""