Run your first research query in 5 minutes. This guide assumes you have already completed Installation.
If you have not already, create the environment file and add your keys:
cp deploy/.env.example deploy/.envEdit deploy/.env and set at minimum:
NVIDIA_API_KEY=nvapi-...
TAVILY_API_KEY=tvly-...The AI-Q blueprint supports two primary modes for interactive use: a terminal-based CLI and a browser-based web UI.
The CLI provides an interactive research assistant in your terminal.
source .venv/bin/activate
# Using the convenience script
./scripts/start_cli.sh
# Or run directly with the custom CLI entry point
dotenv -f deploy/.env run .venv/bin/aiq-research --config_file configs/config_cli_default.ymlNote:
start_cli.shruns.venv/bin/aiq-research(a custom CLI entry point registered by this project), notnat run. The custom entry point adds interactive features like conversation history that are not part of the standard NeMo Agent Toolkit CLI.
You should observe the agent start up and present an input prompt where you can type questions.
For a browser-based experience with a chat interface:
source .venv/bin/activate
./scripts/start_e2e.shThis starts:
- Backend API at
http://localhost:8000 - Frontend UI at
http://localhost:3000
Open http://localhost:3000 in your browser.
The web UI requires Node.js 22+ and npm. If these were available during `./scripts/setup.sh`, UI dependencies are already installed. Otherwise, run `cd frontends/ui && npm ci` first.
**Running on a remote VM?** If you access the VM via SSH, you need to forward ports 3000 and 8000 to your local machine: `ssh -L 3000:localhost:3000 -L 8000:localhost:8000 user@your-vm-host`. See [Troubleshooting — VM / Remote Development](../resources/troubleshooting.md#vm--remote-development) for details.
Try one of these example queries to observe the system in action:
Shallow research (fast, concise answers):
What is CUDA and how does it relate to GPU programming?
What are the key differences between TCP and UDP?
Deep research (detailed, report-style output):
Compare the transformer architectures used in GPT-4 and Gemini, including their training approaches, parameter counts, and benchmark performance.
What are the current approaches to solving the protein folding problem, and how do AlphaFold and RoseTTAFold compare?
- Shallow queries produce a concise answer with inline citations and source links within a few seconds.
- Deep queries trigger a multi-phase research process (planning, research, synthesis) that produces a structured report with a table of contents, inline citations, and a references section. This takes longer (typically 1--3 minutes depending on complexity).
The system automatically routes queries to the appropriate depth based on complexity. You do not need to specify shallow vs. deep -- the orchestration node decides for you.
$ ./scripts/start_cli.sh
============================================
AI-Q Blueprint - CLI Mode
============================================
Config: config_cli_default.yml
Verbose: OFF (use -v to enable)
Type 'exit' or 'quit' to exit
--------------------------------------------
NVIDIA AI-Q Blueprint
Research Assistant powered by NVIDIA NeMo Agent Toolkit
AI-Q initialized!
Type 'exit', 'quit', or 'q' to quit.
You: What is the NVIDIA NeMo Agent Toolkit?
Answer
The NVIDIA NeMo Agent Toolkit (NAT) is a framework for building and deploying
AI agents and multi-agent systems...
Sources:
[1] https://docs.nvidia.com/nemo/agent-toolkit/latest/ - NeMo Agent Toolkit Documentation
[2] https://developer.nvidia.com/nemo - NVIDIA NeMo Overview
To view detailed agent execution logs (tool calls, routing decisions, LLM interactions):
./scripts/start_cli.sh --verboseNow that you have the system running, explore these topics:
- Architecture Overview -- Understand how the orchestrator, shallow researcher, and deep researcher work together
- Customization -- Swap models, configure tools, adjust prompts, and tune agent behavior
- Deployment -- Run with Docker Compose
- Evaluation -- Measure research quality with built-in benchmarks