A privacy-conscious RAG (Retrieval-Augmented Generation) application that lets you chat with any PDF document. Parsing, chunking, embedding, and vector search all run locally; only your question and the retrieved chunks are sent to Groq's LLM API. Upload a PDF, ask questions in plain English, and get grounded answers with page citations.
Built with LangChain, Groq, HuggingFace, ChromaDB, and Streamlit.
Upload any PDF — a research paper, contract, report, or manual — and ask questions about it in plain English. The app finds the most relevant sections and generates a grounded answer with page citations.
- ✅ Grounded answers — the LLM answers only from your document's content, minimizing hallucinations
- ✅ Page citations — every answer links back to source page numbers
- ✅ Transparent retrieval — inspect the exact chunks used to generate each answer
- ✅ Free to run — Groq's free tier is fast and generous
                 PDF Upload
                  │
                  ▼
┌─────────────────────────────────────────────┐
│                  loader.py                  │
│  PyMuPDF extracts text page by page         │
│  RecursiveCharacterTextSplitter chunks it   │
└─────────────────┬───────────────────────────┘
                  │  list[Document] chunks
                  ▼
┌─────────────────────────────────────────────┐
│                 embedder.py                 │
│  all-MiniLM-L6-v2 converts chunks → vectors │
│  ChromaDB stores vectors in-memory          │
└─────────────────┬───────────────────────────┘
                  │
          User asks a question
                  │
                  ▼
┌─────────────────────────────────────────────┐
│                retriever.py                 │
│  Question → vector → similarity search      │
│  Top-5 chunks injected into prompt          │
│  Llama 3.3 70B (via Groq) generates answer  │
└─────────────────┬───────────────────────────┘
                  │  answer + page citations
                  ▼
         Streamlit UI (app.py)
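
For orientation, here is a minimal sketch of the ingestion path above (loader.py → embedder.py), assuming the LangChain integrations named in the tech stack below; the function name `build_vectorstore` and the exact parameters are illustrative, not the project's actual code.

```python
# Illustrative sketch; the real loader.py and embedder.py may differ in detail.
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_chroma import Chroma

def build_vectorstore(pdf_path: str) -> Chroma:
    # PyMuPDF yields one Document per page, with the page number in metadata.
    pages = PyMuPDFLoader(pdf_path).load()

    # Split pages into overlapping chunks so sentences at boundaries aren't lost.
    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
    chunks = splitter.split_documents(pages)

    # all-MiniLM-L6-v2 runs on CPU; with no persist_directory, Chroma stays in memory.
    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    return Chroma.from_documents(chunks, embedding=embeddings)
```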
| Component | Tool | Why |
|---|---|---|
| UI | Streamlit | Fast to build, easy to demo |
| LLM | Llama 3.3 70B via Groq | Free API, faster than local inference |
| Embeddings | all-MiniLM-L6-v2 (HuggingFace) | Runs on CPU, no API key needed |
| Vector DB | ChromaDB (in-memory) | Lightweight, no disk writes, works anywhere |
| PDF Parsing | PyMuPDF | Fast, accurate text extraction |
| RAG Framework | LangChain | Industry standard |
doc-qa-assistant/
│
├── app.py # Streamlit UI — chat interface
├── rag/
│ ├── __init__.py
│ ├── loader.py # PDF loading & chunking
│ ├── embedder.py # Embeddings + ChromaDB vector store
│ └── retriever.py # RAG chain — retrieval + answer generation
│
├── .python-version # Pins Python 3.11 (via uv)
├── .streamlit/
│ └── secrets.toml # Local secrets — not committed
├── requirements.txt
└── .gitignore
- Python 3.11
- uv — fast Python package manager
- Groq API key — free, takes 1 minute to get
git clone https://github.com/SameerGadge/doc-qa-assistant.git
cd doc-qa-assistant
uv venv --python 3.11
source .venv/bin/activate
uv pip install -r requirements.txt

This app uses Streamlit secrets for API key management (both locally and in the cloud).
Create .streamlit/secrets.toml in the project root:
GROQ_API_KEY = "gsk_your_key_here"
⚠️ Make sure `.streamlit/secrets.toml` is in your `.gitignore` — never commit API keys.
Get a free key at console.groq.com → API Keys.
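
Inside the app, the key is read through Streamlit's secrets API. The snippet below is a rough sketch of that wiring; the actual code in app.py / retriever.py may organize it differently.

```python
# Sketch: read the Groq key from .streamlit/secrets.toml (or Streamlit Cloud secrets).
import streamlit as st
from langchain_groq import ChatGroq

llm = ChatGroq(
    model="llama-3.3-70b-versatile",
    temperature=0,
    api_key=st.secrets["GROQ_API_KEY"],
)
```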
uv run streamlit run app.py

Open http://localhost:8501 in your browser.
- Push the repo to GitHub
- Go to share.streamlit.io and connect your repo
- Set Main file path to `app.py`
- Under Advanced Settings → Secrets, add:
GROQ_API_KEY = "gsk_your_key_here"
- Click Deploy
- Upload a PDF using the sidebar file uploader
- Wait for the document to be processed (chunked + embedded)
- Type a question in the chat input
- View the answer with page citations
- Expand "View retrieved context" to see exactly which chunks were used
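
The retrieved-context view in the last step is just a Streamlit expander over the chunks returned by the retriever. A sketch, not necessarily the exact code in app.py:

```python
import streamlit as st

def show_retrieved_context(retrieved_docs):
    """Render the chunks used for the answer, with their source page numbers."""
    with st.expander("View retrieved context"):
        for doc in retrieved_docs:
            st.markdown(f"**Page {doc.metadata.get('page', '?')}**")
            st.write(doc.page_content)
```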
Key parameters can be tuned in each module:
Chunking (rag/loader.py)
chunk_size = 500 # characters per chunk — increase for denser docs
chunk_overlap = 100 # overlap between chunks — increase to avoid boundary loss

Retrieval (rag/retriever.py)
TOP_K = 5 # chunks retrieved per query
temperature = 0 # 0 = factual/deterministic, 1 = creative
LLM_MODEL = "llama-3.3-70b-versatile" # swap to llama-3.1-8b-instant for faster responses

RAG (Retrieval-Augmented Generation) — Instead of fine-tuning a model on your documents, RAG retrieves relevant snippets at query time and injects them into the prompt. The LLM answers using only that context, reducing hallucinations and keeping answers grounded in your data.
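
A minimal sketch of that retrieve-then-generate step, using the configuration above; the prompt wording and the `answer_question` helper are illustrative, not the exact contents of rag/retriever.py.

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_groq import ChatGroq

PROMPT = ChatPromptTemplate.from_template(
    "Answer only from the context below and cite page numbers.\n\n"
    "Context:\n{context}\n\nQuestion: {question}"
)

def answer_question(vectordb, question: str) -> str:
    # Embed the question and pull the 5 most similar chunks from ChromaDB.
    docs = vectordb.similarity_search(question, k=5)
    context = "\n\n".join(
        f"[page {d.metadata.get('page', '?')}] {d.page_content}" for d in docs
    )
    # ChatGroq reads GROQ_API_KEY from the environment (or pass api_key= explicitly).
    llm = ChatGroq(model="llama-3.3-70b-versatile", temperature=0)
    chain = PROMPT | llm | StrOutputParser()
    return chain.invoke({"context": context, "question": question})
```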
Chunking — LLMs have limited context windows, so documents are split into small overlapping pieces. Overlap ensures sentences that span chunk boundaries aren't lost.
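
A toy snippet, independent of the app code, that makes the overlap visible:

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
text = " ".join(f"Sentence {i} of the document." for i in range(100))
chunks = splitter.split_text(text)

# The tail of one chunk reappears at the head of the next, so a sentence that
# straddles a boundary is still present in full somewhere.
print(chunks[0][-60:])
print(chunks[1][:60])
```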
Embeddings — Each chunk is converted into a vector that captures its semantic meaning. Similar meanings produce similar vectors, enabling meaning-based search rather than keyword matching.
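
A quick illustration of that idea with the same model (not part of embedder.py):

```python
import numpy as np
from langchain_huggingface import HuggingFaceEmbeddings

emb = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
a, b, c = emb.embed_documents([
    "The contract can be terminated with 30 days notice.",
    "Either party may end the agreement after one month's warning.",
    "The annual company picnic is held in July.",
])

def cosine(u, v):
    u, v = np.asarray(u), np.asarray(v)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(a, b))  # high: same meaning, different wording
print(cosine(a, c))  # low: unrelated topics
```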
Vector Database — ChromaDB stores chunk vectors in-memory per session and performs fast similarity search to find the most relevant chunks for any query.
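
Illustrative only: a throwaway in-memory Chroma store answering a top-k query.

```python
from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings

emb = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
store = Chroma.from_texts(
    [
        "Revenue grew 12% in Q3.",
        "The warranty lasts two years.",
        "Returns are accepted within 30 days.",
    ],
    embedding=emb,
)
for doc in store.similarity_search("How long is the warranty?", k=1):
    print(doc.page_content)  # "The warranty lasts two years."
```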
- Support multiple PDFs simultaneously with document selection
- Add hybrid search (semantic + keyword BM25) for better retrieval
- Implement conversation memory for multi-turn follow-up questions
- Evaluate retrieval quality with the RAGAS framework
- Add a reranker to improve chunk ranking
- Export Q&A sessions as a PDF report
MIT License — free to use, modify, and distribute.