MegaMind-RAG

A fully online Retrieval-Augmented Generation (RAG) system with a Streamlit frontend and a FastAPI backend.
Upload PDFs, Markdown, or text files, and the system chunks, embeds, and indexes them so you can query them semantically, all in the cloud.

Live Demo

Frontend: https://megamind-rag.streamlit.app
Backend: https://megamind-rag.onrender.com/docs

System Overview

Streamlit UI → FastAPI Backend → AWS S3 → Embeddings (OpenAI or MiniLM) → FAISS Index → LLM Response

Flow

Upload documents via Streamlit.
Files are stored in your S3 bucket under `{APP_ENV}/users/{user_id}/docs/`.
Click Ingest to trigger the backend, which:
- Downloads the documents from S3
- Chunks the text into overlapping segments
- Generates embeddings (OpenAI or MiniLM)
- Builds a FAISS index and uploads it back to S3
Ask a question: the backend retrieves the top-matching chunks, sends them to the LLM, and returns a contextual answer with citations.
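
The chunking step above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `chunk_text`, the character-based windows, and the default sizes are all assumptions made for the example.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into character windows that overlap by `overlap` characters.

    Hypothetical helper: the real backend may chunk by tokens or sentences.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    # Each chunk shares its last two characters with the start of the next one.
    print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
```

Because consecutive windows share `overlap` characters, a sentence cut at one chunk boundary still appears intact in the neighbouring chunk.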

Tech Stack

| Component | Purpose |
| --- | --- |
| FastAPI | REST backend for `/ask` and `/ingest` |
| Streamlit | Frontend UI for uploads, ingestion, and Q&A |
| OpenAI / MiniLM | Text embeddings |
| FAISS | Vector similarity search |
| AWS S3 (boto3) | Cloud document + index storage |
| Render | Backend hosting (Dockerized) |
| Streamlit Cloud | Frontend hosting |

S3 Structure

| Path | Description |
| --- | --- |
| `{APP_ENV}/users/{user_id}/docs/` | Uploaded PDFs, Markdown, and TXT files |
| `{APP_ENV}/users/{user_id}/indexes/` | Generated FAISS index and metadata |
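
As an illustration of the layout above, the keys could be built with small helpers like these. The function names and the example values (`prod`, `user-123`, `report.pdf`) are hypothetical; only the two prefix patterns come from this README.

```python
from pathlib import PurePosixPath


def docs_prefix(app_env: str, user_id: str) -> str:
    """Prefix under which a user's uploaded documents live."""
    return f"{app_env}/users/{user_id}/docs/"


def indexes_prefix(app_env: str, user_id: str) -> str:
    """Prefix under which a user's FAISS index and metadata live."""
    return f"{app_env}/users/{user_id}/indexes/"


def doc_key(app_env: str, user_id: str, filename: str) -> str:
    """Full object key for one uploaded file (hypothetical helper)."""
    # Keep only the final path component of the uploaded filename.
    name = PurePosixPath(filename).name
    if name in ("", ".."):
        raise ValueError(f"not a usable filename: {filename!r}")
    return docs_prefix(app_env, user_id) + name


if __name__ == "__main__":
    print(doc_key("prod", "user-123", "report.pdf"))
    print(indexes_prefix("prod", "user-123"))
```

Keeping the environment and user ID in the prefix means one bucket can hold several deployments and users without their objects colliding.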

Core Ideas

Chunking: Breaks long documents into overlapping sections so context is preserved across chunk boundaries.
Embedding: Maps text into a semantic vector space using pretrained models.
FAISS: Quickly finds the chunks whose embeddings are most similar to the query.
RAG: Retrieval + Generation = grounded, explainable AI responses.
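
To make the retrieval and generation ideas concrete, here is a small pure-Python sketch. It uses brute-force cosine similarity as a stand-in for FAISS and hand-written toy vectors in place of real embeddings; the function names and the prompt wording are assumptions for illustration, not the project's implementation.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float],
          chunk_vecs: Sequence[Sequence[float]],
          k: int = 3) -> List[Tuple[int, float]]:
    """Return (chunk index, score) for the k most similar chunks, best first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_prompt(question: str, chunks: Sequence[str],
                 hits: Sequence[Tuple[int, float]]) -> str:
    """Assemble an LLM prompt whose context lines are numbered for citation."""
    context = "\n".join(
        f"[{rank}] {chunks[i]}" for rank, (i, _) in enumerate(hits, start=1)
    )
    return (
        "Answer using only the context below and cite sources as [n].\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    chunks = ["FAISS stores vectors.", "S3 stores files.", "Streamlit renders the UI."]
    vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # toy embeddings
    hits = top_k([0.9, 0.1, 0.0], vectors, k=2)
    print(build_prompt("Where are vectors stored?", chunks, hits))
```

A real index avoids scoring every chunk on each query, which is where FAISS earns its place once the corpus grows.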

Possible future improvements

Semantic (topic-aware) chunking
Re-ranking retrieved chunks before LLM query
Domain-tuned embeddings for specialized corpora
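
As a rough idea of what re-ranking could look like, the sketch below reorders already-retrieved chunks by word overlap with the question (Jaccard similarity). This is a deliberately simple stand-in chosen for the example; a production re-ranker would more likely use a learned model, and none of these names exist in the project.

```python
import re
from typing import List, Set


def _tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric word set of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rerank(question: str, chunks: List[str]) -> List[str]:
    """Order chunks by Jaccard word overlap with the question, best first."""
    q = _tokens(question)

    def score(chunk: str) -> float:
        c = _tokens(chunk)
        union = q | c
        return len(q & c) / len(union) if union else 0.0

    # sorted() is stable, so equally scored chunks keep their retrieval order.
    return sorted(chunks, key=score, reverse=True)


if __name__ == "__main__":
    retrieved = [
        "S3 stores uploaded files.",
        "FAISS runs similarity search over vectors.",
    ]
    print(rerank("How does FAISS search vectors?", retrieved))
```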

Alvaro Balbin: for programming everything.
Gen AI: for polishing the wording of this README (the ideas in it are the author's).
