test: add unit tests for rag_service (Fixes #444) #514

Open
Qiu-Difeng wants to merge 19 commits into ritesh-1918:gssoc from Qiu-Difeng:fix/test-rag-service
@Qiu-Difeng Qiu-Difeng commented May 28, 2026

Summary

Adds comprehensive unit tests for the RAG (Retrieval-Augmented Generation) service.

Closes #444

Changes

  • Created backend/tests/test_rag_service.py with 20 test cases

Test Coverage

Initialization (3 tests)

  • Default unloaded state
  • is_available() returns False when not loaded
  • Supabase client is None without env vars
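
The three initialization checks could look roughly like the sketch below. `StubRagService` is a hypothetical stand-in written for this example, because the real `RagService` source is not shown in this thread. Only `_loaded`, `_load_failed`, and `is_available()` come from the PR description and walkthrough; the `supabase` attribute name and the environment variable names `SUPABASE_URL` and `SUPABASE_KEY` are assumptions.

```python
import os
from unittest.mock import patch


class StubRagService:
    """Hypothetical stand-in for RagService; not the project's implementation."""

    def __init__(self):
        self._loaded = False
        self._load_failed = False
        url = os.environ.get("SUPABASE_URL")  # assumed variable name
        key = os.environ.get("SUPABASE_KEY")  # assumed variable name
        # The real service would build a Supabase client here; a tuple stands in.
        self.supabase = (url, key) if url and key else None

    def is_available(self):
        return self._loaded and not self._load_failed


def test_default_state_is_unloaded():
    service = StubRagService()
    assert service._loaded is False
    assert service._load_failed is False


def test_is_available_false_when_not_loaded():
    assert StubRagService().is_available() is False


def test_supabase_client_is_none_without_env_vars():
    # clear=True empties os.environ for the duration of the block.
    with patch.dict(os.environ, {}, clear=True):
        assert StubRagService().supabase is None
```

Patching `os.environ` with `patch.dict` keeps the test independent of whatever Supabase variables happen to be set on the machine running it.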

Model Loading (7 tests)

  • Successful load sets _loaded flag
  • Local model path loading
  • Default HuggingFace download
  • Load failure handling
  • Degraded mode (ALLOW_DEGRADED_STARTUP=1)
  • Skip reload if already loaded
  • Skip retry if previously failed
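
A minimal sketch of how the loading cases might be exercised with a mocked loader follows. `LoadableStub`, its `load_model` method, and the default model name are hypothetical; the injected `loader` stands in for the SentenceTransformer constructor that the real tests mock. The behavior modeled here (re-raise on failure, suppression when `ALLOW_DEGRADED_STARTUP=1`, no reload, no retry) follows the bullet list above, not the actual service code.

```python
import os
from unittest.mock import MagicMock, patch


class LoadableStub:
    """Hypothetical stand-in; the real RagService may be structured differently."""

    def __init__(self, loader):
        self._loader = loader  # stands in for the SentenceTransformer constructor
        self.model = None
        self._loaded = False
        self._load_failed = False

    def load_model(self, name="assumed-default-model"):
        if self._loaded or self._load_failed:
            return  # no reload after success, no retry after failure
        try:
            self.model = self._loader(name)
            self._loaded = True
        except Exception:
            self._load_failed = True
            if os.environ.get("ALLOW_DEGRADED_STARTUP") != "1":
                raise  # fail fast unless degraded startup is allowed


def test_successful_load_sets_loaded_flag():
    loader = MagicMock(return_value="model")
    service = LoadableStub(loader)
    service.load_model()
    service.load_model()  # second call must be a no-op
    assert service._loaded is True
    loader.assert_called_once()


def test_load_failure_is_reraised():
    service = LoadableStub(MagicMock(side_effect=RuntimeError("boom")))
    with patch.dict(os.environ, {}, clear=True):
        try:
            service.load_model()
            raised = False
        except RuntimeError:
            raised = True
    assert raised and service._load_failed


def test_degraded_startup_suppresses_failure():
    loader = MagicMock(side_effect=RuntimeError("boom"))
    service = LoadableStub(loader)
    with patch.dict(os.environ, {"ALLOW_DEGRADED_STARTUP": "1"}):
        service.load_model()  # does not raise
        service.load_model()  # previously failed, so the loader is not retried
    assert service._load_failed is True
    loader.assert_called_once()
```

Under pytest, the try/except in the failure test would normally be written with `pytest.raises(RuntimeError)`; plain Python is used here only to keep the sketch free of third-party imports.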

search_knowledge_base (8 tests)

  • Returns None when not loaded
  • Returns None without Supabase
  • Degraded mode returns None
  • Returns matched article when similarity exceeds threshold
  • Returns None when no match found
  • Handles empty query
  • Handles RPC exceptions gracefully
  • Custom threshold and match_count parameters
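
The search cases above can be sketched with a `MagicMock` in place of the Supabase client. Everything named here is an assumption made for illustration: the free function `search_knowledge_base`, the RPC name `match_articles`, the parameter keys, the default values, and the `similarity` field. The `client.rpc(...).execute()` call chain and the `.data` attribute mirror how supabase-py returns RPC results.

```python
from unittest.mock import MagicMock


def search_knowledge_base(client, embed, query, threshold=0.5, match_count=1):
    """Hypothetical guarded search; names and defaults are assumptions."""
    if client is None or not query or not query.strip():
        return None
    try:
        response = client.rpc(
            "match_articles",  # assumed RPC name
            {
                "query_embedding": embed(query),
                "match_threshold": threshold,
                "match_count": match_count,
            },
        ).execute()
    except Exception:
        return None  # RPC errors are swallowed rather than propagated
    rows = response.data or []
    return max(rows, key=lambda r: r["similarity"]) if rows else None


def make_client(rows):
    """Build a mock client whose rpc(...).execute().data returns rows."""
    client = MagicMock()
    client.rpc.return_value.execute.return_value.data = rows
    return client


def fake_embed(text):
    return [0.0, 1.0]  # fixed vector; the embedding model is out of scope here


def test_returns_best_match():
    rows = [{"title": "A", "similarity": 0.7}, {"title": "B", "similarity": 0.9}]
    result = search_knowledge_base(make_client(rows), fake_embed, "reset password")
    assert result["title"] == "B"


def test_forwards_threshold_and_match_count():
    client = make_client([])
    assert search_knowledge_base(client, fake_embed, "q", threshold=0.8, match_count=3) is None
    params = client.rpc.call_args.args[1]
    assert params["match_threshold"] == 0.8 and params["match_count"] == 3


def test_rpc_exception_returns_none():
    client = MagicMock()
    client.rpc.side_effect = RuntimeError("network down")
    assert search_knowledge_base(client, fake_embed, "q") is None


def test_empty_query_returns_none():
    client = make_client([{"title": "A", "similarity": 0.9}])
    assert search_knowledge_base(client, fake_embed, "") is None
    client.rpc.assert_not_called()
```

Inspecting `client.rpc.call_args` is what lets a test confirm that custom `threshold` and `match_count` values reach the RPC without contacting a real database.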

is_available (2 tests)

  • Available after successful load
  • Not available after failed load

How to Run

cd backend
python -m pytest tests/test_rag_service.py -v

Bounty attempt under GSSoC '26.

Summary by CodeRabbit

  • Tests
    • Added comprehensive test coverage for RAG service initialization, model loading, and knowledge-base search functionality to ensure reliability and proper error handling.


namann5 and others added 19 commits May 22, 2026 11:30
- Replace raw user_id with SHA256 hash (8-char prefix) in all log statements
- Maintains audit trail capability while protecting user identifiers (PII)
- Complies with GDPR/CCPA privacy requirements
- Hash is deterministic for correlation without exposing PII

Resolves CodeRabbit PII logging concern
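
A minimal sketch of the hashing described in this commit message, assuming the identifier is a string and is hashed without a salt (the message does not say either way). The helper names `hash_user_id` and `log_ticket_saved` and the logger name are hypothetical.

```python
import hashlib
import logging

logger = logging.getLogger("audit")  # assumed logger name


def hash_user_id(user_id: str) -> str:
    """Return the first 8 hex characters of the SHA-256 digest of user_id."""
    return hashlib.sha256(user_id.encode("utf-8")).hexdigest()[:8]


def log_ticket_saved(user_id: str) -> None:
    # Log the deterministic prefix, never the raw identifier.
    logger.info("ticket saved user=%s", hash_user_id(user_id))
```

Because the hash is deterministic, the same user always maps to the same prefix, so log lines can still be correlated; for example, `hash_user_id("abc")` returns `"ba7816bf"`.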
…backfill

Fix tenant ticket orphaning by persisting company_id on save
…ashboard

feat: Real-time Support Dashboard Updates Using Supabase Realtime Channels
Closes ritesh-1918#444

- Test initialization, model loading, and availability
- Test context retrieval and relevance scoring
- Test empty query handling
- Test degraded mode behavior
- Test RPC exception handling
Copilot AI review requested due to automatic review settings May 28, 2026 19:59

vercel Bot commented May 28, 2026

@Qiu-Difeng is attempting to deploy a commit to the ritesh Team on Vercel.

A member of the Team first needs to authorize it.


coderabbitai Bot commented May 28, 2026

Walkthrough

Adds a comprehensive pytest test module for RagService covering initialization state, model loading behavior with failure and degraded startup handling, knowledge-base search guardrails and matching logic, and service availability semantics, using mocked Supabase environment variables and SentenceTransformer loader.

Changes

RagService Unit Test Coverage

| Layer / File(s) | Summary |
| --- | --- |
| Test infrastructure and fixtures (backend/tests/test_rag_service.py) | Module imports pytest and mocking utilities; fixtures construct RagService instances with Supabase environment variables patched and mocked create_client for RPC verification. |
| Initialization and model loading behavior (backend/tests/test_rag_service.py) | Default service state assertions (_loaded=False, _load_failed=False, is_available()=False); successful load toggles _loaded and clears _load_failed; local model path and default HuggingFace model name loading; failure handling with re-raise and degraded startup suppression; no-op when already loaded or previously failed. |
| Search behavior and availability state (backend/tests/test_rag_service.py) | search_knowledge_base returns None when model unloaded, Supabase unconfigured, or load previously failed; returns matched records when similarity exceeds threshold; returns None on empty results, empty queries, and RPC exceptions; forwards threshold and match_count parameters to RPC; selects only the best match when multiple rows returned. is_available becomes True after successful load and False after failed load. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 A test suite hops forth with care,
Guard rails checked, edge cases snared,
From models loaded to searches deep,
Each RAG behavior safe to keep!
The service sleeps soundly now—fully covered, rare!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding unit tests for the RAG service, and includes reference to the fixed issue. |
| Linked Issues check | ✅ Passed | The pull request comprehensively addresses all coding requirements from issue #444: creates test_rag_service.py, covers context retrieval, relevance scoring, empty query handling, and tests across all specified service behaviors. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to the test file for RAG service; no unrelated modifications to production code or other files are present. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/tests/test_rag_service.py (1)

15-16: ⚡ Quick win

Consider using conftest.py for path setup instead of sys.path manipulation.

The sys.path.insert pattern works but is not the recommended pytest approach. A cleaner solution is to create backend/tests/conftest.py with shared path configuration, or to make the backend package installable (with pyproject.toml or setup.py) and install it in editable mode with `pip install -e .` before running the tests.

♻️ Proposed conftest.py approach

Create backend/tests/conftest.py:

import sys
import os

# Add backend directory to path once for all test modules
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))

Then remove lines 15-16 from this test file.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/tests/test_rag_service.py` around lines 15 - 16, Remove the ad-hoc
sys.path manipulation from backend/tests/test_rag_service.py (the
sys.path.insert call) and instead add a backend/tests/conftest.py that performs
the shared path setup once; specifically, create conftest.py to insert the
backend parent directory into sys.path (mirroring the current
os.path.join(os.path.dirname(__file__), "..") logic) and then delete the
sys.path.insert lines from test_rag_service.py so tests use the centralized
conftest configuration.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4d8e5e46-1ed0-4687-95ae-773e6a93b860

📥 Commits

Reviewing files that changed from the base of the PR and between fb6a950 and bdaae5a.

📒 Files selected for processing (1)
  • backend/tests/test_rag_service.py


Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@Qiu-Difeng Qiu-Difeng changed the base branch from main to gssoc May 29, 2026 13:31

gitguardian Bot commented May 29, 2026

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard.
Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.

🔎 Detected hardcoded secret in your pull request
| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
| --- | --- | --- | --- | --- |
| 29368972 | Triggered | Supabase Service Role JWT | b460068 | scratch/test_companies.js |
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace the secret and store it safely, following best practices for secret storage.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
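
As one illustration of step 2, a secret can be read from the environment at runtime instead of being written into a source file. The flagged file is JavaScript; this sketch uses Python to match the backend under test. The helper names and the variable names `SUPABASE_URL` and `SUPABASE_SERVICE_ROLE_KEY` are assumptions, not names confirmed by this repository.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def require_secret(name: str) -> str:
    """Read a secret from the environment instead of hardcoding it in source."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise MissingSecretError(f"environment variable {name} is not set")
    return value


def load_supabase_credentials() -> dict:
    # Both variable names are assumptions for this sketch.
    return {
        "url": require_secret("SUPABASE_URL"),
        "service_role_key": require_secret("SUPABASE_SERVICE_ROLE_KEY"),
    }
```

Failing loudly when a variable is missing avoids silently running with an empty key. Moving the value out of source does not undo the leak: the exposed key still has to be revoked and rotated as step 3 says.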

To avoid such incidents in the future consider


@Qiu-Difeng
Author

Hi @ritesh-1918! 👋 I've updated this PR to target the gssoc branch as per the contribution guidelines. Please let me know if any further changes are needed. Thanks!

Development

Successfully merging this pull request may close these issues.

test : add unit tests for rag_service

5 participants