CSV Question Answering

This module shows how we benchmark question answering over CSV data. It has several components, described below.

Setup

To set up, install all required packages:

pip install -r requirements.txt

You then need to set some environment variables. This module relies heavily on LangSmith, so set its environment variables:

export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_ENDPOINT="https://api.langchain.plus"
export LANGCHAIN_API_KEY=...

This also uses OpenAI, so you need to set that environment variable:

export OPENAI_API_KEY=...
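Before running any of the scripts below, it can help to check that all of these variables are actually set. A minimal sketch (the variable names are taken from the export commands above; the helper itself is not part of this module):

```python
import os

# Environment variables the scripts in this module rely on
# (taken from the export commands above).
REQUIRED_VARS = [
    "LANGCHAIN_TRACING_V2",
    "LANGCHAIN_ENDPOINT",
    "LANGCHAIN_API_KEY",
    "OPENAI_API_KEY",
]

def missing_vars(env=os.environ):
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_vars() before kicking off a run surfaces configuration problems early instead of partway through an evaluation.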

How we collected data

To do this, we set up a simple Streamlit app that logged questions, answers, and feedback to LangSmith. We then annotated examples in LangSmith and added them to a dataset we were creating. For more details on how to do this generally, see this cookbook.

When doing this, you probably want to specify a project for all runs to be logged to:

export LANGCHAIN_PROJECT="Titanic CSV"

The streamlit_app.py file contains the exact code used to run the application. You can run it with:

streamlit run streamlit_app.py

What the data is

See data.csv for the data points we labeled.
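A sketch of loading those labeled pairs for inspection; note the column names "question" and "answer" are an assumption here, so check data.csv itself for the actual schema:

```python
import csv

def load_labeled_examples(path="data.csv"):
    """Load labeled (question, answer) pairs from the dataset file.

    The column names "question" and "answer" are an assumption --
    check data.csv for the actual schema.
    """
    with open(path, newline="") as f:
        return [(row["question"], row["answer"]) for row in csv.DictReader(f)]
```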

How we evaluate

To evaluate, we first upload our data to LangSmith under the dataset name Titanic CSV. This is done in upload_data.py. You can run this with:

python upload_data.py
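upload_data.py contains the exact code; the general shape looks roughly like the sketch below, written with the client passed in so the flow is visible. The create_dataset and create_example method names come from the langsmith SDK, but verify the details against the script itself:

```python
def upload_examples(client, examples, dataset_name="Titanic CSV"):
    """Create a LangSmith dataset and add one example per labeled pair.

    `client` is expected to behave like langsmith.Client; the method
    names used below are assumptions -- see upload_data.py for the
    exact code actually used.
    """
    dataset = client.create_dataset(dataset_name=dataset_name)
    for question, answer in examples:
        client.create_example(
            inputs={"question": question},
            outputs={"answer": answer},
            dataset_id=dataset.id,
        )
    return dataset
```

In the real script you would pass client = langsmith.Client(), which picks up LANGCHAIN_API_KEY from the environment.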

This allows us to track different evaluation runs against this dataset. We then use a standard qa evaluator to evaluate whether the generated answers are correct or not.
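The qa evaluator itself uses an LLM to judge answers, so the sketch below is not the actual implementation; it is only a rough illustration of the kind of judgment being made, using a normalized exact-string match in its place:

```python
import re

def naive_grade(predicted: str, reference: str) -> str:
    """Rough stand-in for the qa evaluator used in this module.

    The real evaluator asks an LLM whether the prediction matches the
    reference answer, so it accepts paraphrases that this naive
    string comparison would reject.
    """
    def norm(s):
        # Lowercase and collapse whitespace before comparing.
        return re.sub(r"\s+", " ", s.strip().lower())
    return "CORRECT" if norm(predicted) == norm(reference) else "INCORRECT"
```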

We include scripts for evaluating a few different methods:

Pandas agent with GPT-3.5

Run with python pandas_agent_gpt_35.py

Results:

results_35.png

Pandas agent with GPT-4

Run with python pandas_agent_gpt_4.py

Results:

results_4.png

PandasAI

This method needs additional packages to be installed:

pip install beautifulsoup4 pandasai

You can then run it with python pandas_ai.py

Results (note: token tracking is off because this method does not use LangChain):

results_pandasai.png

Custom agent

A custom agent equipped with a custom prompt and some custom tools (a Python REPL and a vectorstore).

Run with python custom_agent.py

Results:

results_custom.png