Add pandas example #36

Merged: ofermend merged 6 commits into main from add_pandas_example on Mar 11, 2026
@ofermend
Contributor

No description provided.

added reranker instructions example (notebook 8)
updated messaging

Copilot AI left a comment

Pull request overview

This PR expands the Vectara API tutorial notebook series by adding new end-to-end examples (data analysis Lambda tools with NumPy/Pandas, and reranker instructions with Qwen3), while also refreshing the “About Vectara” boilerplate and updating tutorial documentation.

Changes:

  • Update notebooks/api-examples/README.md to reflect additional notebooks and endpoints.
  • Add Notebook 7 (NumPy/Pandas Lambda tools for data analysis) and Notebook 8 (Qwen3 reranker instructions).
  • Refresh “About Vectara” text formatting/content across Notebooks 1–4 and add CLAUDE.md repo guidance.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `notebooks/api-examples/README.md` | Updates tutorial narrative, adds Notebook 6/7 sections, tutorial flow, and endpoint mapping. |
| `notebooks/api-examples/7-lambda-tools-data-analysis.ipynb` | New notebook demonstrating creation/use of Python Lambda tools leveraging NumPy/Pandas. |
| `notebooks/api-examples/8-reranker-instructions.ipynb` | New notebook demonstrating reranker instructions with qwen3-reranker. |
| `notebooks/api-examples/1-corpus-creation.ipynb` | Updates “About Vectara” section text/formatting. |
| `notebooks/api-examples/2-data-ingestion.ipynb` | Updates “About Vectara” section text/formatting. |
| `notebooks/api-examples/3-query-api.ipynb` | Updates “About Vectara” section text/formatting. |
| `notebooks/api-examples/4-agent-api.ipynb` | Updates “About Vectara” section text/formatting. |
| `CLAUDE.md` | Adds repository usage guidance for Claude Code. |

"id": "cell-15",
"metadata": {},
"source": [
"## Step 5: Create a Session and Test the Agent\n",
Copilot AI Mar 11, 2026

Step numbering skips from Step 3 to "Step 5" here, which is confusing when following the tutorial. Renumber this to Step 4 (or add the missing Step 4 section if something was omitted).

Suggested change
- "## Step 5: Create a Session and Test the Agent\n",
+ "## Step 4: Create a Session and Test the Agent\n",

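
A gap like this can be caught mechanically before review. The following is a minimal sketch, assuming step headings follow the `## Step N: ...` pattern quoted above; the helper name `find_missing_steps` and the placeholder title for Step 3 are hypothetical and not part of the notebook.

```python
import re

# Matches markdown headings such as "## Step 5: Create a Session ...".
STEP_HEADING = re.compile(r"^#{1,6}\s+Step\s+(\d+)\b")


def find_missing_steps(markdown_lines):
    """Return step numbers skipped between the lowest and highest 'Step N' heading."""
    seen = []
    for line in markdown_lines:
        match = STEP_HEADING.match(line.strip())
        if match:
            seen.append(int(match.group(1)))
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


# Hypothetical heading for Step 3; only the Step 5 heading comes from the diff.
headings = [
    "## Step 3: (placeholder title)\n",
    "## Step 5: Create a Session and Test the Agent\n",
]
print(find_missing_steps(headings))
```

Running the same function over the markdown cells of each notebook would flag skipped numbers without relying on a reviewer to spot them.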
" print(\"\\n=== Top Search Results ===\")\n",
" for i, sr in enumerate(result.get('search_results', [])[:5], 1):\n",
" meta = sr.get('document_metadata', {})\n",
" print(f\"\\n--- Result {i} (score: {sr.get('score', 'N/A'):.4f}) ---\")\n",
Copilot AI Mar 11, 2026

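This format expression will raise an exception if score is missing or non-numeric: when the key is missing, the default is the string 'N/A', which fails with a ValueError under the .4f format spec, and a score of None fails with a TypeError. Use a numeric default (e.g., 0.0) or branch formatting based on whether score is present.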

Suggested change
- " print(f\"\\n--- Result {i} (score: {sr.get('score', 'N/A'):.4f}) ---\")\n",
+ " score = sr.get('score')\n",
+ " if isinstance(score, (int, float)):\n",
+ " score_str = f\"{score:.4f}\"\n",
+ " else:\n",
+ " score_str = \"N/A\"\n",
+ " print(f\"\\n--- Result {i} (score: {score_str}) ---\")\n",

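
The same branching can be factored into a small helper so every print site shares it. This is a sketch only: `format_score` and the sample `results` list are hypothetical names introduced for illustration, and excluding `bool` is an extra assumption beyond the suggested change.

```python
def format_score(score, digits=4):
    """Format a numeric score with fixed decimals; return 'N/A' for anything else."""
    # bool is a subclass of int, so it is excluded explicitly (an assumption,
    # not part of the suggested change above).
    if isinstance(score, (int, float)) and not isinstance(score, bool):
        return f"{score:.{digits}f}"
    return "N/A"


# Hypothetical search results: one numeric score, one None, one missing key.
results = [{"score": 0.5}, {"score": None}, {}]
for i, sr in enumerate(results, 1):
    print(f"--- Result {i} (score: {format_score(sr.get('score'))}) ---")
```

Keeping the check in one place avoids repeating the `isinstance` branch wherever a score is printed.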
# Vectara API Tutorial Series

- This tutorial series provides a comprehensive, hands-on introduction to building RAG (Retrieval-Augmented Generation) applications using Vectara's REST API. Through five progressive notebooks, you'll learn to create corpora, ingest data, query information, build intelligent AI agents, and orchestrate multi-agent workflows.
+ This tutorial series provides a comprehensive, hands-on introduction to building RAG (Retrieval-Augmented Generation) applications using Vectara's REST API. Through seven progressive notebooks, you'll learn to create corpora, ingest data, query information, build intelligent AI agents, orchestrate multi-agent workflows, work with file artifacts, and create data analysis tools with NumPy and Pandas.
Copilot AI Mar 11, 2026

The README says there are "seven progressive notebooks", but this directory now contains 8 notebooks (including 8-reranker-instructions.ipynb). Update the count and ensure the tutorial overview/flow includes Notebook 8 (or explicitly label it as an optional/bonus notebook).

Suggested change
- This tutorial series provides a comprehensive, hands-on introduction to building RAG (Retrieval-Augmented Generation) applications using Vectara's REST API. Through seven progressive notebooks, you'll learn to create corpora, ingest data, query information, build intelligent AI agents, orchestrate multi-agent workflows, work with file artifacts, and create data analysis tools with NumPy and Pandas.
+ This tutorial series provides a comprehensive, hands-on introduction to building RAG (Retrieval-Augmented Generation) applications using Vectara's REST API. Through eight progressive notebooks, you'll learn to create corpora, ingest data, query information, build intelligent AI agents, orchestrate multi-agent workflows, work with file artifacts, and create data analysis tools with NumPy and Pandas.

Comment on lines +289 to +296
- Build tools for statistical analysis, trend detection, and data transformation
- Combine multiple data analysis tools in agent workflows

**What you'll build:**
Three **Data Analysis Lambda Tools**:
1. **Statistical Analyzer**: Descriptive statistics, correlations, percentiles using Pandas
2. **Trend Analyzer**: Moving averages, growth rates, linear regression using NumPy
3. **Data Transformer**: Normalization, missing value handling, outlier removal, aggregation
Copilot AI Mar 11, 2026

This section claims Notebook 7 builds three Lambda tools (including a "Data Transformer"), but 7-lambda-tools-data-analysis.ipynb only defines statistical_analyzer and trend_analyzer (no transformer tool found). Either add the missing tool to the notebook or adjust this README to match what the notebook actually builds.

Suggested change
- - Build tools for statistical analysis, trend detection, and data transformation
- - Combine multiple data analysis tools in agent workflows
- **What you'll build:**
- Three **Data Analysis Lambda Tools**:
- 1. **Statistical Analyzer**: Descriptive statistics, correlations, percentiles using Pandas
- 2. **Trend Analyzer**: Moving averages, growth rates, linear regression using NumPy
- 3. **Data Transformer**: Normalization, missing value handling, outlier removal, aggregation
+ - Build tools for statistical analysis and trend detection
+ - Combine multiple data analysis tools in agent workflows
+ **What you'll build:**
+ Two **Data Analysis Lambda Tools**:
+ 1. **Statistical Analyzer**: Descriptive statistics, correlations, percentiles using Pandas
+ 2. **Trend Analyzer**: Moving averages, growth rates, linear regression using NumPy
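
For context on what a trend tool of this kind computes, here is a minimal NumPy sketch of the three operations named in the README (moving averages, growth rates, linear regression). It is not the notebook's `trend_analyzer`; the function name `analyze_trend`, its parameters, and its return keys are assumptions made for illustration.

```python
import numpy as np


def analyze_trend(values, window=3):
    """Illustrative trend summary: moving average, growth rates, and a linear fit."""
    y = np.asarray(values, dtype=float)
    if y.size < 2 or window < 1 or window > y.size:
        return {"success": False,
                "error": "Need at least two values and 1 <= window <= len(values)."}
    # Simple moving average: convolve with a uniform kernel of length `window`.
    moving_avg = np.convolve(y, np.ones(window) / window, mode="valid")
    # Period-over-period growth rate; assumes no zero values in y[:-1].
    growth = np.diff(y) / y[:-1]
    # Least-squares line y = slope * x + intercept over x = 0..n-1.
    slope, intercept = np.polyfit(np.arange(y.size), y, 1)
    return {
        "success": True,
        "moving_average": moving_avg.tolist(),
        "growth_rates": growth.tolist(),
        "slope": float(slope),
        "intercept": float(intercept),
    }


print(analyze_trend([1, 2, 3, 4, 5], window=2))
```

Returning plain lists and floats keeps the result JSON-serializable, which matters if the output is handed back to an agent as a tool response.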

Comment on lines +306 to +312
"code": """
import pandas as pd
import numpy as np

def process(data: str, columns: str = "", operations: str = "describe") -> dict:
    df = pd.DataFrame(json.loads(data))
    # ... compute statistics
Copilot AI Mar 11, 2026

The code snippet uses json.loads(data) but doesn't import json, so it won't run as written. Add an import json in the snippet (or adjust the example to avoid json.loads).
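
One way the snippet could look with the import added is sketched below. The signature is taken from the README excerpt; the body after `json.loads` is a placeholder for the elided statistics step, handling only `"describe"` and leaving `columns` unused, so it should not be read as the notebook's actual implementation.

```python
import json

import pandas as pd


def process(data: str, columns: str = "", operations: str = "describe") -> dict:
    """Parse a JSON array of records and return summary statistics."""
    # Placeholder logic: only "describe" is handled; `columns` is accepted
    # to match the README signature but is not used in this sketch.
    if operations != "describe":
        return {"success": False, "error": f"Unsupported operation: {operations}"}
    try:
        df = pd.DataFrame(json.loads(data))
        stats = df.describe().to_dict()
    except ValueError as exc:  # covers json.JSONDecodeError and pandas errors
        return {"success": False, "error": f"Invalid input: {exc}"}
    return {"success": True, "statistics": stats}


print(process('[{"x": 1, "y": 10}, {"x": 3, "y": 20}]'))
```

Catching `ValueError` covers both malformed JSON and input that pandas cannot describe, so the tool returns a structured error instead of raising.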

Comment on lines +404 to +408
| `POST /v2/agents` | Create agent | 4, 5, 6, 7 |
| `POST /v2/agents/{key}/sessions` | Create session | 4, 5, 6, 7 |
| `POST /v2/agents/{key}/sessions/{key}/events` | Send messages / Upload artifacts | 4, 5, 6, 7 |
| `GET /v2/agents/{key}/sessions/{key}/events` | Get conversation history | 4 |
| `GET /v2/agents/{key}/sessions/{key}/artifacts` | List session artifacts | 6 |
Copilot AI Mar 11, 2026

The endpoint mapping table hasn’t been updated for the new Notebook 8 (which uses POST /v2/query). Update the table so the "Notebook" column references 8 where applicable (at minimum for /v2/query).
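
Drift of this kind can be checked with a few lines of Python. The sketch below copies only the five table rows quoted in this thread (the full README table has more rows, including `/v2/query`), and the helper name `notebooks_missing_from_table` is hypothetical.

```python
# Rows copied from the table excerpt quoted above; illustrative, not the full table.
ENDPOINT_NOTEBOOKS = {
    "POST /v2/agents": [4, 5, 6, 7],
    "POST /v2/agents/{key}/sessions": [4, 5, 6, 7],
    "POST /v2/agents/{key}/sessions/{key}/events": [4, 5, 6, 7],
    "GET /v2/agents/{key}/sessions/{key}/events": [4],
    "GET /v2/agents/{key}/sessions/{key}/artifacts": [6],
}


def notebooks_missing_from_table(mapping, notebook_numbers):
    """Return notebook numbers that no endpoint row references."""
    referenced = {n for notebooks in mapping.values() for n in notebooks}
    return sorted(set(notebook_numbers) - referenced)


# Checking the notebooks relevant to these rows plus the new Notebook 8.
print(notebooks_missing_from_table(ENDPOINT_NOTEBOOKS, [4, 5, 6, 7, 8]))
```

With the complete table loaded, the same check would confirm whether Notebook 8 appears against `POST /v2/query` after the README is updated.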

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.


" return {\"success\": False, \"error\": f\"None of the specified columns found. Available: {list(df.columns)}\"}\n",
" df_analysis = df[valid_cols]\n",
" else:\n",
" df_analysis = df.select_dtypes(include=[np.number])\n",
Copilot AI Mar 11, 2026

When columns is empty, df_analysis = df.select_dtypes(include=[np.number]) can produce an empty DataFrame (e.g., if the input data has no numeric columns). Downstream operations like describe() will then raise (pandas cannot describe a DataFrame without columns). Add an explicit guard after selecting numeric columns to return a clear error (or fall back to describe(include='all')) when there are no columns to analyze.

Suggested change
- " df_analysis = df.select_dtypes(include=[np.number])\n",
+ " df_analysis = df.select_dtypes(include=[np.number])\n",
+ " if df_analysis.shape[1] == 0:\n",
+ " return {\"success\": False, \"error\": \"No numeric columns found to analyze.\"}\n",

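
The effect of the guard can be seen in isolation with a short sketch. The function name `describe_numeric` and the sample frames are hypothetical; only the `select_dtypes` call, the `shape[1] == 0` check, and the error message come from the suggested change.

```python
import numpy as np
import pandas as pd


def describe_numeric(df: pd.DataFrame) -> dict:
    """Describe numeric columns, returning a structured error when there are none."""
    df_analysis = df.select_dtypes(include=[np.number])
    # Without this guard, describe() raises on a frame that has no columns.
    if df_analysis.shape[1] == 0:
        return {"success": False, "error": "No numeric columns found to analyze."}
    return {"success": True, "statistics": df_analysis.describe().to_dict()}


# Hypothetical inputs: one frame with only text, one with a numeric column.
print(describe_numeric(pd.DataFrame({"name": ["a", "b"]})))
print(describe_numeric(pd.DataFrame({"name": ["a", "b", "c"], "x": [1, 2, 3]})))
```

Returning the same `{"success": False, "error": ...}` shape as the other failure paths lets the calling agent handle this case without special handling.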
@ofermend merged commit a593d59 into main on Mar 11, 2026
4 checks passed