NAT rag integration as example #229
Merged
11 commits
4d163d0 NAT rag integration (nv-pranjald)
67c8875 Add license in ingest file and remove unnecessary doc (nv-pranjald)
12f94bc Add sample query and response (nv-pranjald)
a89b3fc Move mcp server to examples directory (nv-pranjald)
fa46eec Rename nvidia nat agent to rag react agent (nv-pranjald)
db45a66 Update uv lock file (nv-pranjald)
c8bdf00 Move agent to rag_react_agent (nv-pranjald)
596e7e7 docs: reframe rag_react_agent README to focus on using NAT with RAG a… (nv-pranjald)
6f04a5f Fix unittest for rag mcp server (nv-pranjald)
764deb5 Remove ingest script in rag react agent (nv-pranjald)
59e2639 Refine README based on review comments (nv-pranjald)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| # NVIDIA RAG Examples | ||
|
|
||
| This directory contains example integrations and extensions for NVIDIA RAG. | ||
|
|
||
| ## Examples | ||
|
|
||
| | Example | Description | Documentation | | ||
| |---------|-------------|---------------| | ||
| | [rag_react_agent](./rag_react_agent/) | Integration with [NeMo Agent Toolkit (NAT)](https://github.com/NVIDIA/NeMo-Agent-Toolkit) providing RAG query and search capabilities for agent workflows | [README](./rag_react_agent/README.md) | | ||
| | [nvidia_rag_mcp](./nvidia_rag_mcp/) | MCP (Model Context Protocol) server and client for exposing NVIDIA RAG capabilities to MCP-compatible applications | [Documentation](../docs/nvidia-rag-mcp.md) | | ||
|
|
||
| ## rag_react_agent | ||
|
|
||
| This plugin integrates NVIDIA RAG with [NeMo Agent Toolkit](https://github.com/NVIDIA/NeMo-Agent-Toolkit), enabling intelligent agents to use RAG tools for document retrieval and question answering. It demonstrates: | ||
|
|
||
| - Creating custom NAT tools that wrap NVIDIA RAG functionality | ||
| - Using the ReAct Agent workflow for intelligent tool selection | ||
|
|
||
| See the [rag_react_agent README](./rag_react_agent/README.md) for setup and usage instructions. | ||
|
|
||
| ## nvidia_rag_mcp | ||
|
|
||
| This example provides an MCP server and client that expose NVIDIA RAG and Ingestor capabilities as MCP tools. It supports multiple transport modes (SSE, streamable HTTP, stdio) and enables MCP-compatible applications to: | ||
|
|
||
| - Generate answers using the RAG pipeline | ||
| - Search the vector database for relevant documents | ||
| - Manage collections and documents in the vector database | ||
|
|
||
| See the [MCP documentation](../docs/nvidia-rag-mcp.md) for detailed setup and usage instructions. |
File renamed without changes.
File renamed without changes.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,252 @@ | ||
| # Building Agentic RAG with NeMo Agent Toolkit | ||
|
|
||
| This example demonstrates how to build intelligent agents that leverage **NVIDIA RAG** capabilities using [NeMo Agent Toolkit (NAT)](https://github.com/NVIDIA/NeMo-Agent-Toolkit). The agent can autonomously decide when and how to query your document knowledge base. | ||
|
|
||
| ## Overview | ||
|
|
||
| This example shows how to: | ||
|
|
||
| 1. **Expose RAG as agent tools** - Wrap NVIDIA RAG query and search capabilities as tools that agents can use | ||
| 2. **Build a ReAct agent** - Use NAT's ReAct workflow to create an agent that reasons about when to use RAG | ||
|
|
||
| The ReAct (Reason + Act) agent pattern enables the LLM to iteratively reason about which tools to use based on the user's query, making it ideal for building conversational AI applications with document retrieval capabilities. | ||
|
|
||
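The Reason + Act loop described above can be sketched in a few lines of plain Python. This is a toy illustration only, not NAT's implementation: `scripted_llm`, `react_loop`, and the stubbed `rag_query` / `rag_search` functions are hypothetical stand-ins that return canned strings, and the `Thought` / `Action` / `Action Input` / `Final Answer` text format is assumed from the sample output shown later in this README.

```python
import ast
import re


# Hypothetical stand-ins for the real RAG tools; they return canned strings.
def rag_query(query: str) -> str:
    return "Driving a car at the beach"


def rag_search(query: str) -> str:
    return "[chunk 1] ... [chunk 2] ..."


TOOLS = {"rag_query": rag_query, "rag_search": rag_search}

ACTION_RE = re.compile(r"Action:\s*(\w+)\s*\nAction Input:\s*(\{.*\})", re.DOTALL)
FINAL_RE = re.compile(r"Final Answer:\s*(.+)", re.DOTALL)


def scripted_llm(scratchpad: str) -> str:
    """Fake LLM: requests a tool first, then answers once an observation exists."""
    if "Observation:" not in scratchpad:
        return (
            "Thought: I need to look this up.\n"
            "Action: rag_query\n"
            "Action Input: {'query': 'giraffe current activity'}"
        )
    observation = scratchpad.rsplit("Observation:", 1)[1].strip()
    return (
        "Thought: I now know the final answer\n"
        f"Final Answer: Giraffe is {observation.lower()}."
    )


def react_loop(question: str, llm=scripted_llm, max_steps: int = 5) -> str:
    """Alternate between model replies and tool calls until a final answer appears."""
    scratchpad = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(scratchpad)
        final = FINAL_RE.search(reply)
        if final:
            return final.group(1).strip()
        action = ACTION_RE.search(reply)
        if not action:
            raise ValueError(f"Unparseable LLM reply: {reply!r}")
        tool_name, raw_args = action.group(1), action.group(2)
        args = ast.literal_eval(raw_args)  # parse the dict literal without eval()
        observation = TOOLS[tool_name](**args)
        # Feed the tool result back so the next reply can reason over it.
        scratchpad += f"{reply}\nObservation: {observation}\n"
    raise RuntimeError("No final answer within max_steps")


if __name__ == "__main__":
    print(react_loop("what is giraffe doing?"))
```

The `max_steps` cap is a design choice in this sketch: it keeps a model that never emits `Final Answer:` from looping forever.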
| ## Prerequisites | ||
|
|
||
| - Python 3.11+ | ||
| - Access to NVIDIA AI endpoints (API key required) | ||
| - **Data ingested into Milvus** - Complete the [rag_library_usage.ipynb](../../notebooks/rag_library_usage.ipynb) notebook to set up Milvus and ingest documents before running this example | ||
|
|
||
| ## Quick Start | ||
|
|
||
| All commands should be run from the `examples/rag_react_agent/` directory. | ||
|
|
||
| ### 1. Set Environment Variables | ||
|
|
||
| ```bash | ||
| # Required: NVIDIA API key for embeddings, reranking, and LLM | ||
| export NVIDIA_API_KEY="your-nvidia-api-key" | ||
|
|
||
| # Optional: If using custom endpoints | ||
| # export NVIDIA_BASE_URL="https://integrate.api.nvidia.com/v1" | ||
| ``` | ||
|
|
||
| ### 2. Configure Vector Database Endpoint | ||
|
|
||
| By default, the example connects to Milvus at `http://localhost:19530`. You can configure this in two ways: | ||
|
|
||
| **Option A: Environment Variable (takes precedence)** | ||
|
|
||
| ```bash | ||
| # For standard Milvus server | ||
| # export APP_VECTORSTORE_URL="http://localhost:19530" | ||
|
|
||
| # Or for a remote Milvus instance | ||
| # export APP_VECTORSTORE_URL="http://milvus-host:19530" | ||
| ``` | ||
|
|
||
| **Option B: Update config.yml** | ||
|
|
||
| Edit `src/rag_react_agent/configs/config.yml` and update the `vdb_endpoint` field: | ||
|
|
||
| ```yaml | ||
| functions: | ||
|   rag_query: | ||
|     vdb_endpoint: "http://localhost:19530" # Your Milvus endpoint | ||
|   rag_search: | ||
|     vdb_endpoint: "http://localhost:19530" # Your Milvus endpoint | ||
| ``` | ||
|
|
||
| > **Note**: If you followed the rag_library_lite_usage.ipynb notebook and your setup uses Milvus Lite, set the endpoint to the absolute path of the `.db` file (e.g., `/home/user/data/milvus.db`). | ||
|
|
||
| ### 3. Install Dependencies | ||
|
|
||
| ```bash | ||
| # From examples/rag_react_agent/ directory | ||
| # Install all dependencies including nvidia-rag and NeMo Agent Toolkit | ||
| uv sync | ||
|
|
||
| # Activate the virtual environment | ||
| source .venv/bin/activate | ||
| ``` | ||
|
|
||
| ## Usage | ||
|
|
||
| ### Running the RAG Agent | ||
|
|
||
| The example uses NAT's **ReAct Agent** workflow, which enables the LLM to reason about which RAG tools to use based on the user's query. | ||
|
|
||
| ```bash | ||
| # From examples/rag_react_agent/ directory with .venv activated | ||
| nat run --config_file src/rag_react_agent/configs/config.yml --input "what is giraffe doing?" | ||
| ``` | ||
|
|
||
| ### Example Queries | ||
|
|
||
| Try different queries to see how the agent decides which tool to use: | ||
|
|
||
| ```bash | ||
| # Query that triggers rag_query (generates a response using retrieved documents) | ||
| nat run --config_file src/rag_react_agent/configs/config.yml --input "Summarize the main themes of the documents" | ||
|
|
||
| # Query that triggers rag_search (returns relevant document chunks) | ||
| nat run --config_file src/rag_react_agent/configs/config.yml --input "Find all animals mentioned in documents" | ||
| ``` | ||
|
|
||
| ### Expected Output | ||
|
|
||
| When running successfully, you'll see the agent's reasoning process: | ||
|
|
||
| ``` | ||
| Configuration Summary: | ||
| -------------------- | ||
| Workflow Type: react_agent | ||
| Number of Functions: 3 | ||
| Number of LLMs: 1 | ||
|
|
||
| ------------------------------ | ||
| [AGENT] | ||
| Agent input: what is giraffe doing? | ||
| Agent's thoughts: | ||
| Thought: I don't have any information about what giraffe is doing. | ||
|
|
||
| Action: rag_query | ||
| Action Input: {'query': 'giraffe current activity'} | ||
| ------------------------------ | ||
|
|
||
| ------------------------------ | ||
| [AGENT] | ||
| Calling tools: rag_query | ||
| Tool's input: {'query': 'giraffe current activity'} | ||
| Tool's response: | ||
| Driving a car at the beach | ||
| ------------------------------ | ||
|
|
||
| ------------------------------ | ||
| [AGENT] | ||
| Agent input: what is giraffe doing? | ||
| Agent's thoughts: | ||
| Thought: I now know the final answer | ||
| Final Answer: Giraffe is driving a car at the beach. | ||
| ------------------------------ | ||
|
|
||
| -------------------------------------------------- | ||
| Workflow Result: | ||
| ['Giraffe is driving a car at the beach.'] | ||
| -------------------------------------------------- | ||
| ``` | ||
|
|
||
| ## Configuration | ||
|
|
||
| The configuration file at `src/rag_react_agent/configs/config.yml` defines the RAG tools and agent workflow: | ||
|
|
||
| ```yaml | ||
| functions: | ||
|   # RAG Query Tool - Queries documents and returns an LLM- or VLM-generated response | ||
|   rag_query: | ||
|     _type: nvidia_rag_query | ||
|     # Ensure collection_names matches the collection name used in the RAG library notebook. | ||
|     collection_names: ["test_library"] # Milvus collection names | ||
|     vdb_endpoint: "http://localhost:19530" # Milvus endpoint URL | ||
|
|
||
|   # RAG Search Tool - Searches for relevant document chunks | ||
|   rag_search: | ||
|     _type: nvidia_rag_search | ||
|     collection_names: ["test_library"] | ||
|     vdb_endpoint: "http://localhost:19530" | ||
|     reranker_top_k: 3 # Number of results after reranking | ||
|     vdb_top_k: 20 # Number of results from vector search | ||
|
|
||
|   # Utility tool for date/time queries | ||
|   current_datetime: | ||
|     _type: current_datetime | ||
|
|
||
| llms: | ||
|   nim_llm: | ||
|     _type: nim | ||
|     model_name: meta/llama-3.1-70b-instruct | ||
|     temperature: 0.0 | ||
|
|
||
| # ReAct Agent workflow - enables the LLM to reason about tool usage | ||
| workflow: | ||
|   _type: react_agent | ||
|   tool_names: | ||
|     - rag_query | ||
|     - rag_search | ||
|     - current_datetime | ||
|   llm_name: nim_llm | ||
|   verbose: true # Shows agent reasoning process | ||
| ``` | ||
|
|
||
| ### RAG Tools | ||
|
|
||
| | Tool | Type | Description | | ||
| |------|------|-------------| | ||
| | `rag_query` | `nvidia_rag_query` | Queries documents and returns an AI-generated response based on retrieved context | | ||
| | `rag_search` | `nvidia_rag_search` | Searches for relevant document chunks without generating a response | | ||
|
|
||
| ### Tool Configuration Options | ||
|
|
||
| #### `nvidia_rag_query` | ||
|
|
||
| | Parameter | Description | Default | | ||
| |-----------|-------------|---------| | ||
| | `collection_names` | List of Milvus collection names to query | `[]` | | ||
| | `vdb_endpoint` | Vector database endpoint URL or absolute path to Milvus Lite `.db` file | `"http://localhost:19530"` | | ||
|
|
||
| #### `nvidia_rag_search` | ||
|
|
||
| | Parameter | Description | Default | | ||
| |-----------|-------------|---------| | ||
| | `collection_names` | List of Milvus collection names to search | `[]` | | ||
| | `vdb_endpoint` | Vector database endpoint URL or absolute path to Milvus Lite `.db` file | `"http://localhost:19530"` | | ||
| | `reranker_top_k` | Number of results to return after reranking | `10` | | ||
| | `vdb_top_k` | Number of results to retrieve before reranking | `100` | | ||
|
|
||
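As a reading aid for the two parameter tables, the `nvidia_rag_search` options and their defaults can be modelled as a small dataclass. `RagSearchSettings` is a hypothetical name for this sketch and is not the config class registered by the plugin. The rule that `reranker_top_k` may not exceed `vdb_top_k` is an assumption drawn from the descriptions ("before reranking" and "after reranking"), not a documented constraint.

```python
from dataclasses import dataclass, field


@dataclass
class RagSearchSettings:
    """Hypothetical mirror of the nvidia_rag_search options and their defaults."""

    collection_names: list[str] = field(default_factory=list)
    vdb_endpoint: str = "http://localhost:19530"
    reranker_top_k: int = 10   # results kept after reranking
    vdb_top_k: int = 100       # results fetched from the vector search

    def __post_init__(self) -> None:
        if self.reranker_top_k < 1 or self.vdb_top_k < 1:
            raise ValueError("top_k values must be positive")
        # Assumed sanity check: the reranker cannot keep more than was retrieved.
        if self.reranker_top_k > self.vdb_top_k:
            raise ValueError("reranker_top_k cannot exceed vdb_top_k")


if __name__ == "__main__":
    # The values used in this example's config.yml.
    print(RagSearchSettings(["test_library"], reranker_top_k=3, vdb_top_k=20))
```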
| ## Troubleshooting | ||
|
|
||
| ### Error: Function type `nvidia_rag_query` not found | ||
|
|
||
| The tools are not registered. Ensure you've installed the package: | ||
|
|
||
| ```bash | ||
| # From examples/rag_react_agent/ directory | ||
| uv sync | ||
| source .venv/bin/activate | ||
| ``` | ||
|
|
||
| ### Error: Token limit exceeded | ||
|
|
||
| If you encounter token limit errors, reduce the number of results: | ||
|
|
||
| ```yaml | ||
| rag_search: | ||
|   _type: nvidia_rag_search | ||
|   reranker_top_k: 1 # Reduce from 3 | ||
|   vdb_top_k: 10 # Reduce from 20 | ||
| ``` | ||
|
|
||
| This commonly occurs when documents contain large base64-encoded images. | ||
|
|
||
| ### Error: NVIDIA API key not set | ||
|
|
||
| ```bash | ||
| export NVIDIA_API_KEY="your-api-key" | ||
| ``` | ||
|
|
||
| ### Error: Connection to Milvus failed | ||
|
|
||
| Ensure Milvus is running and accessible at the configured endpoint. If you followed the [rag_library_usage.ipynb](../../notebooks/rag_library_usage.ipynb) notebook, Milvus should be running at `http://localhost:19530`. | ||
|
|
||
| ```bash | ||
| # Check if Milvus is running | ||
| docker ps | grep milvus | ||
| ``` | ||
|
|
||
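If Docker is not available on the machine running the agent, a plain TCP probe of the configured endpoint is a rough alternative to the `docker ps` check. This is a minimal sketch using only the standard library. `parse_endpoint` and `is_reachable` are hypothetical helper names, and an open port only shows that something is listening there, not that Milvus is healthy.

```python
import socket
from urllib.parse import urlparse


def parse_endpoint(endpoint: str, default_port: int = 19530) -> tuple[str, int]:
    """Split an endpoint URL such as http://localhost:19530 into (host, port)."""
    parsed = urlparse(endpoint)
    if not parsed.hostname:
        # For example, a Milvus Lite .db path has no host to probe.
        raise ValueError(f"Not a host:port URL: {endpoint!r}")
    return parsed.hostname, parsed.port or default_port


def is_reachable(endpoint: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint can be opened."""
    host, port = parse_endpoint(endpoint)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    url = "http://localhost:19530"
    print(f"{url} reachable: {is_reachable(url)}")
```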
| ## Learn More | ||
|
|
||
| - [NeMo Agent Toolkit Documentation](https://docs.nvidia.com/nemo/agent-toolkit/latest/) | ||
|
|
||
|
||
| ## License | ||
|
|
||
| Apache-2.0 | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,58 @@ | ||
| [build-system] | ||
| build-backend = "setuptools.build_meta" | ||
| requires = ["setuptools >= 64", "setuptools-scm>=8"] | ||
|
|
||
|
|
||
| [tool.setuptools.packages.find] | ||
| where = ["src"] | ||
| include = ["rag_react_agent*"] | ||
|
|
||
|
|
||
| [tool.setuptools_scm] | ||
| git_describe_command = "git describe --long --first-parent" | ||
| root = "../.." | ||
|
|
||
|
|
||
| [project] | ||
| name = "rag-react-agent" | ||
| dynamic = ["version"] | ||
| dependencies = [ | ||
| # Keep package version constraints as open as possible to avoid conflicts with other packages. Always define a minimum | ||
| # version when adding a new package. If unsure, default to using `~=` instead of `==`. Does not apply to nvidia-nat packages. | ||
| # Keep sorted!!! | ||
| "langgraph>=0.2", # Required for react_agent workflow | ||
| "langchain_classic", | ||
| "nvidia-nat>=1.5.0a0,<2.0", # Allow pre-release versions | ||
| "nvidia-nat-langchain>=1.5.0a0,<2.0", # Allow pre-release versions | ||
| "nvidia-rag[rag]~=2.4", | ||
| ] | ||
| requires-python = ">=3.11,<3.14" | ||
| description = "RAG React Agent example using NVIDIA RAG with NeMo Agent Toolkit" | ||
| keywords = ["ai", "rag", "agents"] | ||
| license = { text = "Apache-2.0" } | ||
| authors = [{ name = "NVIDIA Corporation" }] | ||
| maintainers = [{ name = "NVIDIA Corporation" }] | ||
| classifiers = [ | ||
| "Programming Language :: Python", | ||
| "Programming Language :: Python :: 3.11", | ||
| "Programming Language :: Python :: 3.12", | ||
| "Programming Language :: Python :: 3.13", | ||
| ] | ||
|
|
||
| [project.urls] | ||
| documentation = "https://docs.nvidia.com/nemo/agent-toolkit/latest/" | ||
| source = "https://github.com/NVIDIA/NeMo-Agent-Toolkit" | ||
|
|
||
|
|
||
| [tool.uv] | ||
| managed = true | ||
| config-settings = { editable_mode = "compat" } | ||
| prerelease = "allow" # nvidia-nat packages are currently pre-release only | ||
|
|
||
|
|
||
| [tool.uv.sources] | ||
| nvidia-rag = { path = "../..", editable = true } | ||
|
|
||
|
|
||
| [project.entry-points.'nat.components'] | ||
| nat_rag = "rag_react_agent.register" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| # SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| """NVIDIA RAG integration for NeMo Agent Toolkit.""" |