Skip to content

Latest commit

 

History

History
107 lines (79 loc) · 3.99 KB

File metadata and controls

107 lines (79 loc) · 3.99 KB

Data Analysis Agent

An interactive, agentic data analysis application that leverages advanced LLM reasoning to help users explore, visualize, and understand their data using NVIDIA Llama-3.1-Nemotron-Ultra-253B-v1 and NVIDIA Llama-3.3-Nemotron-Super-49B-v1.5.

Overview

This repository contains a Streamlit application that demonstrates a complete workflow for data analysis:

  1. Data Upload: Upload CSV files for analysis
  2. Natural Language Queries: Ask questions about your data in plain English
  3. Automated Visualization: Generate relevant plots and charts
  4. Transparent Reasoning: Get detailed explanations of the analysis process

The implementation leverages the powerful Llama-3.1-Nemotron-Ultra-253B-v1 and Llama-3.3-Nemotron-Super-49B-v1.5 models through NVIDIA's API, enabling sophisticated data analysis and reasoning.

Learn more about the models here.

Features

  • Agentic Architecture: Modular agents for data insight, code generation, execution, and reasoning
  • Natural Language Queries: Ask questions about your data—no coding required
  • Automated Visualization: Instantly generate and display relevant plots
  • Transparent Reasoning: Get clear, LLM-generated explanations for every result
  • Powered by NVIDIA Llama-3.1-Nemotron-Ultra-253B-v1 and NVIDIA Llama-3.3-Nemotron-Super-49B-v1.5: State-of-the-art reasoning and interpretability

Workflow

Requirements

  • Python 3.10+
  • Streamlit
  • NVIDIA API Key (see Installation section for setup instructions)
  • Required Python packages:
    • pandas
    • matplotlib
    • streamlit
    • requests

Installation

  1. Clone this repository:

    git clone https://github.com/NVIDIA/GenerativeAIExamples.git
    cd GenerativeAIExamples/community/data-analysis-agent
  2. Install dependencies:

    pip install -r requirements.txt
  3. Set up your NVIDIA API key:

    • Sign up or log in at NVIDIA Build
    • Generate an API key
    • Set the API key in your environment:
      export NVIDIA_API_KEY=your_nvidia_api_key_here
    • Or add it to your .env file if you use one

Usage

  1. Run the Streamlit app:

    streamlit run data_analysis_agent.py
  2. Download example dataset (optional):

    wget https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv
  3. Use the application:

    • Select a model from the dropdown menu
    • Upload a CSV file (e.g., the Titanic dataset)
    • Ask questions in natural language
    • View results, visualizations, and detailed reasoning

Example

App Demo

Models Details

The Llama-3.1-Nemotron-Ultra-253B-v1 model used in this project has the following specifications:

  • Parameters: 253B
  • Features: Advanced reasoning capabilities
  • Use Cases: Complex data analysis, multi-agent systems
  • Enterprise Ready: Optimized for production deployment

The Llama-3.3-Nemotron-Super-49B-v1.5 model used in this project has the following specifications:

  • Parameters: 49B
  • Features: Efficient reasoning and and chat model
  • Use Cases: AI Agent systems, chatbots, RAG systems, and other AI-powered applications. Also suitable for typical instruction-following tasks
  • Enterprise Ready: Optimized for production deployment

Acknowledgments

Contributing

Contributions are welcome! Please open an issue or submit a pull request.