Brijesh-Thakkar/NeuroLearn-Project

NeuroLearn — AI-Powered Course Recommendation

NeuroLearn is a full-stack course recommendation system built on a hybrid RAG pipeline rather than plain cosine similarity. When a user describes what they want to learn, the system runs the query through ChromaDB dense vector search and BM25 sparse keyword search simultaneously, fuses the two result sets using Reciprocal Rank Fusion (RRF), reranks the top candidates with a cross-encoder model, and passes the best match to Qwen 2.5-14B to generate a personalised explanation. The resulting recommender handles both semantic queries ("I want to understand how attention mechanisms work") and keyword queries ("React hooks tutorial"), which neither dense-only nor sparse-only retrieval does well alone.


Architecture

User Query
    │
    ▼
React Frontend (port 3000)
    │  POST /output/ {text: query}
    ▼
FastAPI Backend (port 8000)
    │
    ▼
┌─────────────────────────────────────────────┐
│             Hybrid RAG Retriever            │
│                                             │
│  ┌──────────────────┐  ┌─────────────────┐  │
│  │ ChromaDB         │  │ BM25Okapi       │  │
│  │ Dense search     │  │ Sparse search   │  │
│  │ (BGE-small,top20)│  │(rank_bm25,top20)│  │
│  └────────┬─────────┘  └────────┬────────┘  │
│           │                     │           │
│           └──────────┬──────────┘           │
│                      ▼                      │
│          RRF Fusion  score = Σ 1/(rank+60)  │
│                      │ top-10               │
│                      ▼                      │
│          Cross-Encoder Reranker             │
│          (ms-marco-MiniLM-L-6-v2)           │
│                      │ top-5                │
└──────────────────────┼──────────────────────┘
                       │
                       ▼
          Qwen 2.5-14B LLM (generation)
                       │
                       ▼
    {recommended_course, difficulty_level, llm_reasoning}
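
The request/response contract in the diagram can be exercised from a small client. The sketch below uses only the Python standard library. It assumes the endpoint accepts a JSON body with a `text` field and returns JSON with the three keys shown above; the JSON encoding and the `Content-Type` header are assumptions, and the helper names `build_request` and `recommend` are illustrative, not part of the repository.

```python
import json
import urllib.request

# Port and path come from the architecture diagram; adjust if your setup differs.
BACKEND_URL = "http://localhost:8000/output/"


def build_request(query: str, url: str = BACKEND_URL) -> urllib.request.Request:
    """Build the POST request the frontend is described as sending."""
    # Assumption: the backend expects a JSON body of the form {"text": "..."}.
    body = json.dumps({"text": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def recommend(query: str, timeout: float = 120.0) -> dict:
    """Send the query to a running backend and return the decoded JSON reply."""
    with urllib.request.urlopen(build_request(query), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the backend running, `recommend("I want to learn machine learning from scratch")` should return a dict containing `recommended_course`, `difficulty_level`, and `llm_reasoning`. The generous timeout is a guess to allow for LLM generation time.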

Why Hybrid RAG Beats Basic Cosine Similarity

Dense retrieval (cosine similarity) captures semantic meaning but can miss exact keyword matches: a query for "AWS EC2" may not retrieve courses titled "Amazon Elastic Compute Cloud" if the embeddings diverge. BM25 catches exact term matches but has no semantic understanding, so "neural network tutorial" will not find a course titled "Deep Learning Fundamentals" when the two share no terms. RRF gives each document credit from both ranked lists, and the cross-encoder reranker then reads the full (query, course description) pair to make the final relevance decision, which neither single-vector distance nor keyword overlap can do.
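
As a rough illustration of the fusion step, here is a minimal RRF sketch using the constant 60 from the architecture diagram. It assumes ranks are 1-based and that each retriever returns an ordered list of course ids; the function name `rrf_fuse` and the sample ids are hypothetical and not taken from `hybrid_retriever.py`.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60, top_n=10):
    """Fuse several ranked id lists with Reciprocal Rank Fusion.

    Each document receives 1 / (rank + k) from every list it appears in,
    where rank is its 1-based position (an assumption of this sketch).
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (rank + k)
    # Highest fused score first; ties are broken by id for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:top_n]


# Hypothetical retriever outputs, best match first.
dense = ["c1", "c2", "c3"]
sparse = ["c3", "c1", "c4"]
fused = rrf_fuse([dense, sparse])
```

A course that appears in both lists ("c1", "c3") outranks one that appears in only one list, even when its individual ranks are not the best. That is the behaviour the fusion step relies on.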


Evaluation (RAGAS)

Evaluated on a 20-question test set of realistic user learning goals (e.g. "I want to learn machine learning from scratch", "I need a course on cybersecurity") against the 999-course Coursera corpus.

| Metric | Hybrid RAG | Keyword Baseline | Improvement |
|---|---|---|---|
| context_precision | 0.9400 | 0.4100 | +129% |
| context_recall | 0.7375 | 0.4000 | +84% |
| answer_relevancy | 0.1606 | 0.0740 | +117% |
| faithfulness | 1.0000 | 1.0000 | 0% |

Note: answer_relevancy and faithfulness are computed locally using statistical overlap methods, so they are rough proxies. The LLM-judged variants (which use GPT-4 as judge) require OPENAI_API_KEY. The retrieval metrics (context_precision, context_recall) are the more dependable numbers and show the key result on this 20-question set: context_precision is 0.94 for hybrid retrieval vs 0.41 for keyword matching.
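
The Improvement column is the relative gain over the keyword baseline. A short sketch of that arithmetic, using the values reported above (the function name is illustrative, not from the repository):

```python
# (hybrid, baseline) scores as reported in the evaluation table.
results = {
    "context_precision": (0.9400, 0.4100),
    "context_recall": (0.7375, 0.4000),
    "answer_relevancy": (0.1606, 0.0740),
    "faithfulness": (1.0000, 1.0000),
}


def relative_improvement(hybrid: float, baseline: float) -> float:
    """Percentage gain of the hybrid score over the baseline score."""
    return (hybrid - baseline) / baseline * 100.0


improvements = {
    name: round(relative_improvement(hybrid, baseline))
    for name, (hybrid, baseline) in results.items()
}
```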

See backend/evaluation/ragas_results.json for per-question breakdown.


Performance (ONNX vs PyTorch)

Embedding model: BAAI/bge-small-en-v1.5 (384-dim), benchmarked on 100 course titles, 5-run average, CPU-only.

| Runtime | Avg Latency | Std Dev | Speedup |
|---|---|---|---|
| PyTorch (sentence-transformers) | 563.8ms | ±24.9ms | baseline |
| ONNX Runtime | 495.2ms | ±20.4ms | 1.14× |

Cosine deviation: 0.0857, caused by a pooling strategy difference (the sentence-transformers model uses CLS-token pooling; the ONNX wrapper uses mean pooling). Both produce valid semantic embeddings for retrieval. Matching the pooling strategies should bring the deviation below 0.001.
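
To make the pooling difference concrete, here is a toy sketch in plain Python. The token vectors are invented for illustration, and defining deviation as 1 minus cosine similarity is an assumption about how the figure above was computed. Real embeddings have 384 dimensions and would normally be handled with NumPy or the model library.

```python
import math


def cls_pool(token_embeddings):
    """CLS pooling: use the first token's vector as the sentence embedding."""
    return list(token_embeddings[0])


def mean_pool(token_embeddings, attention_mask):
    """Mean pooling: average the vectors of all non-padding tokens."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    return [value / count for value in total]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 2-dim token embeddings; the last row is padding (mask = 0).
tokens = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

cls_vec = cls_pool(tokens)
mean_vec = mean_pool(tokens, mask)
deviation = 1.0 - cosine_similarity(cls_vec, mean_vec)
```

The two strategies yield different vectors from the same token outputs, so a nonzero deviation between the two runtimes does not by itself indicate a broken export.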

The ONNX model removes the PyTorch dependency at inference time, enabling deployment on lightweight inference servers.
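
The benchmark figures above (5-run average, standard deviation, speedup) can be reproduced with a harness along these lines. This is a sketch, not the notebook's code: `fake_embed` is a stand-in for the real embedders, and using the sample standard deviation is an assumption.

```python
import statistics
import time


def benchmark(embed_fn, texts, runs=5):
    """Return (mean, sample std dev) of wall-clock latency in milliseconds."""
    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        embed_fn(texts)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(latencies_ms), statistics.stdev(latencies_ms)


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


def fake_embed(texts):
    # Placeholder for a real embedder call; returns one dummy vector per text.
    return [[float(len(text))] for text in texts]


titles = [f"Course {i}" for i in range(100)]
mean_ms, std_ms = benchmark(fake_embed, titles)

# Speedup recomputed from the latencies reported in the table.
reported_speedup = round(speedup(563.8, 495.2), 2)
```

Swapping `fake_embed` for the PyTorch and ONNX embedders, with a warm-up call before timing, would give comparable numbers for both runtimes.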


Quick Start (Docker)

git clone https://github.com/Brijesh-Thakkar/NeuroLearn-Project.git
cd NeuroLearn-Project
cp .env.example .env
docker-compose up --build

| Service | URL |
|---|---|
| React Frontend | http://localhost:3000 |
| FastAPI Backend | http://localhost:8000 |
| MLflow UI | http://localhost:5000 |

First run: the backend embeds all 999 courses into ChromaDB on startup (~2 minutes). Subsequent starts reuse the persisted chroma_data volume.


Manual Setup (without Docker)

Backend

cd backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Download Qwen2.5-14B model (see README note below)
uvicorn app:app --reload --host 0.0.0.0 --port 8000

Model downloads: On first startup, BAAI/bge-small-en-v1.5 (133MB) and cross-encoder/ms-marco-MiniLM-L-6-v2 are auto-downloaded from HuggingFace. Qwen2.5-14B requires manual download: https://huggingface.co/mohitdeharkar/warp_Qwen2.5-14B-Instruct

Frontend

cd frontend
npm install
npm start
# Opens http://localhost:3000

Notebooks

| Notebook | Location | What it shows |
|---|---|---|
| RAGAS Evaluation | notebooks/ragas_evaluation.ipynb | 20-question retrieval quality benchmark with bar chart comparison vs keyword baseline |
| MLflow Tracking | notebooks/mlflow_tracking.ipynb | Logs experiment params + metrics to SQLite; compares BGE-small vs BGE-base runs |
| ONNX Benchmark | notebooks/onnx_benchmark.ipynb | Timing + memory benchmark for PyTorch vs ONNX Runtime embedding inference |
| Course EDA | notebooks/eda_courses.ipynb | Full EDA: missing values, rating distributions, UMAP embedding visualization, retrieval quality comparison (precision@5 for BM25/dense/hybrid) |
| SQL Analytics | analytics/eda_queries.ipynb | 12 SQL query results visualized with matplotlib; mock data for offline rendering |

Analytics

The analytics/ folder contains:

  • queries.sql — 12 production-ready SQL queries covering: RANK() OVER, cohort retention with date_trunc + LAG, funnel drop-off with LEFT JOIN + IS NULL, rolling 7-day volume, PERCENTILE_CONT for P50/P90/P99 latency, month-over-month growth, and keyword frequency analysis on user queries.
  • schema_overview.md — Full ERD (ASCII) documenting all 3 tables with column types, foreign keys, and recommended extensions for analytics columns (created_at, response_ms, clicked).
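
For readers checking the latency percentiles outside the database, the sketch below mirrors what PERCENTILE_CONT computes: linear interpolation between the two nearest ordered values. The sample latencies are made up, and the idea that they come from a `response_ms` column follows the recommended extension mentioned above, not an existing table.

```python
def percentile_cont(values, fraction):
    """Continuous percentile with linear interpolation between neighbours."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    ordered = sorted(values)
    # Fractional position in the sorted list (0-based index).
    position = fraction * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])


# Hypothetical response times in milliseconds.
latencies_ms = [40, 10, 30, 20]
p50 = percentile_cont(latencies_ms, 0.50)
p90 = percentile_cont(latencies_ms, 0.90)
p99 = percentile_cont(latencies_ms, 0.99)
```

Because the result is interpolated, a percentile can fall between observed values, unlike PERCENTILE_DISC, which always returns one of the inputs.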

Project Structure

NeuroLearn-Project/
│
├── backend/
│   ├── app.py                  Main FastAPI application (lifespan startup, /output/ endpoint)
│   ├── requirements.txt        All Python dependencies
│   ├── Dockerfile              Python 3.11-slim container
│   ├── mlflow_config.py        MLflow tracking URI and experiment name constants
│   │
│   ├── rag/                    Hybrid RAG pipeline
│   │   ├── embedder.py         BAAI/bge-small-en-v1.5 via sentence-transformers
│   │   ├── onnx_embedder.py    Same model exported to ONNX Runtime
│   │   ├── vector_store.py     ChromaDB persistent collection + dense_search()
│   │   ├── bm25_retriever.py   BM25Okapi sparse index + sparse_search()
│   │   ├── reranker.py         cross-encoder/ms-marco-MiniLM-L-6-v2 reranker
│   │   └── hybrid_retriever.py RRF fusion + full pipeline orchestration
│   │
│   ├── evaluation/
│   │   ├── ragas_runner.py     Standalone evaluation script (20 queries, saves JSON)
│   │   └── ragas_results.json  Actual evaluation results (context_precision=0.94)
│   │
│   ├── tests/
│   │   └── test_rag.py         pytest: result count, metadata keys, score ordering
│   │
│   └── app/                    Original app (DB models, auth — unchanged)
│       ├── database/           PostgreSQL connection setup
│       └── models/             SQLAlchemy models (sqlusers, sqlcourses, userhistory)
│
├── frontend/                   React frontend (unchanged from original)
│   ├── Dockerfile              node:20-alpine container
│   └── src/                    Components, pages, API calls
│
├── notebooks/                  Jupyter notebooks
│   ├── ragas_evaluation.ipynb  RAGAS benchmark results
│   ├── mlflow_tracking.ipynb   MLflow experiment comparison
│   ├── onnx_benchmark.ipynb    Inference timing benchmark
│   └── eda_courses.ipynb       Full course dataset EDA
│
├── analytics/                  SQL analytics layer
│   ├── queries.sql             12 complex PostgreSQL queries
│   ├── schema_overview.md      ERD + column documentation
│   └── eda_queries.ipynb       Visualizations of query results
│
├── mlflow/
│   └── README.md               MLflow UI setup instructions
│
├── docker-compose.yml          4-service stack (backend, frontend, postgres, mlflow)
├── .env.example                Required environment variables
└── README.md                   This file

Tech Stack

| Component | Technology | Purpose |
|---|---|---|
| Frontend | React 18, React Router | User interface, search input, result display |
| Backend | FastAPI, Python 3.11, Uvicorn | REST API, lifespan startup, CORS |
| Vector DB | ChromaDB (persistent) | Dense embedding storage and ANN search |
| Dense Retrieval | BAAI/bge-small-en-v1.5 | 384-dim semantic embeddings |
| Sparse Retrieval | BM25Okapi (rank-bm25) | Exact term frequency matching |
| Reranking | cross-encoder/ms-marco-MiniLM-L-6-v2 | Pairwise query-document scoring |
| LLM Generation | Qwen 2.5-14B-Instruct (8-bit) | Natural language recommendation explanation |
| Experiment Tracking | MLflow + SQLite | Parameter/metric logging per retrieval experiment |
| Inference Optimization | ONNX Runtime (optimum) | 1.14× embedding speedup, no PyTorch dependency |
| Database | PostgreSQL 15, SQLAlchemy | User auth, course catalogue, query history |
| Containerization | Docker, docker-compose | One-command full-stack deployment |
