epiverse-connect/epiverse-search-backend

Epiverse search backend

This project implements a semantic search system using embeddings. It consists of two main parts: (1) data acquisition and embedding generation, and (2) a FastAPI endpoint for querying.

1. Data Acquisition and Embedding Generation

This part of the project handles scraping data, generating embeddings, and storing them for later use.

Data Scraping

The data is collected by a dedicated scraper, whose code lives in a separate repository: [epiverse-scraper]. The scraper gathers the raw data used to build the search index.

Embedding Generation

After scraping, the data is processed to generate embeddings. This involves the following steps:

  1. Embedding Creation: We use the multi-qa-MiniLM-L6-cos-v1 sentence-transformers model to generate an embedding for each data entry.
  2. Storage: The generated embeddings, along with the original data, are stored in a single .pth file (corpus_embeddings.pth) so they can be loaded at query time without being recomputed.
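
The two steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the entry fields ("title", "description"), the helper names, and the layout of the saved dictionary are all assumptions. Only the model name and the output file name come from this README.

```python
from pathlib import Path

MODEL_NAME = "multi-qa-MiniLM-L6-cos-v1"      # model named in this README
OUTPUT_PATH = Path("corpus_embeddings.pth")   # file name given in this README


def entry_to_text(entry: dict) -> str:
    """Join the fields of one scraped entry into the text that gets embedded.

    The field names "title" and "description" are hypothetical; the real
    scraper output may use a different schema.
    """
    parts = [entry.get("title", ""), entry.get("description", "")]
    return ". ".join(p.strip() for p in parts if p and p.strip())


def build_corpus_embeddings(entries: list[dict], output_path: Path = OUTPUT_PATH) -> None:
    """Embed every entry and save embeddings plus original data to one file."""
    # Imported lazily so entry_to_text can be used without the heavy dependencies.
    import torch
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_NAME)
    texts = [entry_to_text(e) for e in entries]
    embeddings = model.encode(texts, convert_to_tensor=True)

    # Assumed layout: a dict holding the original entries next to their embeddings.
    torch.save({"entries": entries, "embeddings": embeddings}, output_path)
```

Calling build_corpus_embeddings(entries) with the scraper's output would then write corpus_embeddings.pth to the working directory. Keeping the entries and the embeddings in the same file means row i of the embedding tensor always corresponds to entries[i].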

2. FastAPI Endpoint for Search

This part of the project provides a FastAPI endpoint that handles search queries and returns the most relevant results.
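
As a rough idea of what such an endpoint might look like, here is a minimal sketch. The route name (/search), the query parameters, the response shape, and the assumed layout of corpus_embeddings.pth (a dict with "entries" and "embeddings") are illustrative assumptions, not the project's confirmed API.

```python
def format_hits(hits: list[dict], entries: list[dict]) -> list[dict]:
    """Attach the original entry to each hit returned by semantic_search."""
    return [
        {"score": round(float(h["score"]), 4), "entry": entries[h["corpus_id"]]}
        for h in hits
    ]


def create_app(embeddings_path: str = "corpus_embeddings.pth"):
    """Build a FastAPI app that serves semantic search over the stored corpus."""
    # Lazy imports: these packages are only needed when the app is actually built.
    import torch
    from fastapi import FastAPI
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("multi-qa-MiniLM-L6-cos-v1")
    # Assumed file layout: {"entries": [...], "embeddings": tensor}
    corpus = torch.load(embeddings_path, map_location="cpu")
    app = FastAPI()

    @app.get("/search")
    def search(query: str, top_k: int = 5):
        # Embed the query with the same model that embedded the corpus.
        query_embedding = model.encode(query, convert_to_tensor=True)
        # semantic_search ranks corpus rows by cosine similarity to the query.
        hits = util.semantic_search(query_embedding, corpus["embeddings"], top_k=top_k)[0]
        return format_hits(hits, corpus["entries"])

    return app
```

Under these assumptions, the app could be served with an ASGI server such as uvicorn, and a request like GET /search?query=outbreak+simulation&top_k=3 would return the three closest entries with their similarity scores. The query must be embedded with the same model as the corpus, otherwise the similarity scores are not meaningful.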

Flowchart
