Epiverse search backend

This project implements embedding-based semantic search. It has two main parts: (1) data acquisition and embedding generation, and (2) a FastAPI endpoint for querying.

1. Data Acquisition and Embedding Generation

This part of the project handles scraping data, generating embeddings, and storing them for later use.

Data Scraping

A dedicated scraper, maintained in the [epiverse-scraper] repository, collects the raw data used to build the search index.

Embedding Generation

After scraping, the data is processed to generate embeddings. This involves the following steps:

Embedding Creation: The multi-qa-MiniLM-L6-cos-v1 sentence-transformers model generates an embedding for each data entry.
Storage: The embeddings are saved together with the original data in a .pth file (corpus_embeddings.pth), so they can be loaded at query time without being recomputed.
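
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the `title` and `description` fields, the helper names, and the dictionary layout written to the file are all assumptions. Only the model name and the output file name come from this README.

```python
from typing import Dict, List

MODEL_NAME = "multi-qa-MiniLM-L6-cos-v1"  # model named in this README
OUTPUT_PATH = "corpus_embeddings.pth"     # file name from this README


def entry_to_text(entry: Dict[str, str]) -> str:
    """Join the fields of one scraped entry into the string that gets embedded.

    The "title" and "description" keys are hypothetical; substitute the
    fields the scraper actually emits.
    """
    parts = [entry.get("title", ""), entry.get("description", "")]
    return ". ".join(p.strip() for p in parts if p and p.strip())


def build_corpus_embeddings(entries: List[Dict[str, str]],
                            output_path: str = OUTPUT_PATH) -> None:
    """Embed every entry and save the embeddings next to the original data."""
    # Heavy dependencies are imported here so that entry_to_text stays
    # usable (and testable) without them.
    import torch
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_NAME)
    texts = [entry_to_text(e) for e in entries]
    embeddings = model.encode(texts, convert_to_tensor=True).cpu()
    # Assumed layout: one dict holding the raw entries and the tensor.
    torch.save({"entries": entries, "embeddings": embeddings}, output_path)
```

Calling `build_corpus_embeddings(entries)` with the scraper's output would then write `corpus_embeddings.pth` to the working directory. Moving the tensor to the CPU before saving keeps the file loadable on machines without a GPU.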

2. FastAPI Endpoint for Search

This part of the project provides a FastAPI endpoint that handles search queries and returns the most relevant results.
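
As a rough sketch of what such an endpoint could look like, the example below embeds the query with the same model and ranks stored entries by cosine similarity. The `/search` route, the `query` and `top_k` parameters, the helper names, and the `{"entries", "embeddings"}` file layout are assumptions made for illustration; the actual API in `app` may differ.

```python
import math
from typing import List, Sequence, Tuple


def rank_by_cosine(query: Sequence[float],
                   corpus: Sequence[Sequence[float]],
                   top_k: int = 5) -> List[Tuple[int, float]]:
    """Return (corpus index, cosine similarity) pairs, best match first."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0  # zero vectors score 0

    scored = [(i, cosine(query, vec)) for i, vec in enumerate(corpus)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def create_app(index_path: str = "corpus_embeddings.pth"):
    """Build the FastAPI app; heavy imports happen only when this is called."""
    import torch
    from fastapi import FastAPI
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("multi-qa-MiniLM-L6-cos-v1")
    # Assumed layout: {"entries": [...], "embeddings": tensor}.
    index = torch.load(index_path, map_location="cpu")
    corpus = index["embeddings"].tolist()
    app = FastAPI()

    @app.get("/search")  # hypothetical route name
    def search(query: str, top_k: int = 5):
        query_vec = model.encode(query).tolist()
        hits = rank_by_cosine(query_vec, corpus, top_k)
        return [{"score": score, "entry": index["entries"][i]}
                for i, score in hits]

    return app
```

The pure-Python ranking keeps the example easy to follow. For a larger corpus, a tensor-based routine such as `sentence_transformers.util.semantic_search` would likely be a better fit. To serve the sketch, one option is to assign `app = create_app()` in a module and run that module with uvicorn.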
