Prerequisites

Python (3.8+)
Standard ML/data science libraries (numpy, pandas, scikit-learn, faiss, joblib)
sentence-transformers
flask (for the web service)

Run Instructions

Clone the new-flask-server branch of this repository.
Create a virtual environment: python3 -m venv venv
Activate it: venv\Scripts\activate on Windows, or source venv/bin/activate on macOS/Linux.
With the environment active, install the dependencies from the repository directory: pip install -r requirements.txt
Ensure ngrok is installed from here.
Start the server inside the virtual environment: flask --app app.py run
With the server still running, open a tunnel to it from a separate terminal: ngrok http 5000
Access the frontend for the site here.
Paste the ngrok forwarding address into the top input box and click Save; the application will not run until this is done.
You can now listen to the recommended music.

Adaptive Tempo-Lyric Recommender (ATLR)

The Adaptive Tempo-Lyric Recommender (ATLR) is a dynamic music recommendation system designed to provide personalized, real-time track suggestions by fusing musical structure (Tempo/BPM) with semantic meaning (Lyrical Content).

ATLR is built on a hybrid architecture that combines efficient Approximate Nearest Neighbor (ANN) Retrieval with a Multi-Armed Bandit (MAB) for adaptive, session-based re-ranking.

Key Features & Differentiators

Unlike traditional systems that rely heavily on collaborative filtering or generalized audio features, ATLR focuses on transparent user control and immediate session adaptation.

User-Tunable Control: Provides a mechanism for users to set the relative base weights between Tempo/BPM and Lyrical content, making the recommendation bias transparent.
Fast Session Adaptation (Softmax UCB): Utilizes a SoftmaxUCBWeightBandit to dynamically adjust the scoring weights (θ) after every 3-5 plays based on implicit feedback (e.g., track completion and skip latency), ensuring the recommendations adapt to the user's most up-to-date mood.
Musical BPM Anchoring: Employs a custom bpm_distance function to filter and score tracks, correctly accounting for half-time and double-time equivalence (e.g., treating 60 BPM and 120 BPM as similar), which is crucial for musical coherence.
Diversity (MMR Re-ranking): Applies Maximal Marginal Relevance (MMR) to balance the retrieved candidates between high relevance (ANN score) and maximal diversity (based on BPM similarity), reducing repetitive recommendations.
Adaptive Fusion Scoring: Features a robust score_track function that automatically renormalizes the active feature weights if a track lacks certain features (e.g., lyrics or audio embeddings), preventing scoring bias.
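
The half-time/double-time equivalence described above can be illustrated with a small distance function. This is a minimal sketch, not the project's actual bpm_distance: it assumes the distance is measured in tempo octaves (log2 of the BPM ratio) and folded so that any power-of-two ratio counts as identical.

```python
import math


def bpm_distance(bpm_a: float, bpm_b: float) -> float:
    """Octave-folded tempo distance in the range [0, 0.5].

    Assumed formulation for illustration only; the repository's own
    bpm_distance may use a different metric or tolerance.
    """
    if bpm_a <= 0 or bpm_b <= 0:
        raise ValueError("BPM values must be positive")
    # Distance in log2 space: 1.0 corresponds to a factor of two in tempo.
    octaves = abs(math.log2(bpm_a / bpm_b))
    # Keep only the fractional part, then fold it so that ratios of
    # 2x, 0.5x, 4x, ... all map to a distance of 0.
    frac = octaves % 1.0
    return min(frac, 1.0 - frac)


print(bpm_distance(60, 120))   # 0.0 (double time is treated as the same tempo)
print(bpm_distance(100, 150))  # about 0.415
```

One consequence of this folding is that 4x ratios (e.g., 60 and 240 BPM) are also treated as equivalent; a stricter version could cap the equivalence at a single octave.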

Architecture and Pipeline

The system operates in a five-stage loop, driven by user interaction and implicit feedback.

Bandit Arm Selection: The SoftmaxUCBWeightBandit selects a weight vector $\theta = [w_{\text{bpm}}, w_{\text{lyrics}}, w_{\text{audio}}]$ for the session, constrained to remain near the user's explicit slider preference.
Candidate Generation:

A query is run against the FAISS index (built on 15 numeric features, with per-feature weights such as tempo=1.5 and energy=1.2) for high-recall retrieval.
Candidates are pre-filtered using the tempo-octave-aware bpm_prefilter.
The remaining candidates are re-ranked using MMR to maximize diversity.
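
The MMR step can be sketched as a greedy loop that trades relevance against tempo redundancy. This is an illustrative sketch only: the function names, the (track_id, relevance, bpm) tuple layout, the lam trade-off parameter, and the octave-folded BPM similarity are assumptions, not the project's candidate_gen code.

```python
import math


def bpm_similarity(bpm_a: float, bpm_b: float) -> float:
    """Similarity in [0, 1]; 1.0 for equal or half/double-time tempos (assumed form)."""
    frac = abs(math.log2(bpm_a / bpm_b)) % 1.0
    return 1.0 - 2.0 * min(frac, 1.0 - frac)


def mmr_rerank(candidates, k, lam=0.7):
    """Greedy Maximal Marginal Relevance over (track_id, relevance, bpm) tuples.

    lam close to 1 favours relevance; lam close to 0 favours diversity.
    """
    remaining = list(candidates)
    selected = []
    while remaining and len(selected) < k:
        def mmr_score(cand):
            # Redundancy is the highest BPM similarity to anything already picked.
            redundancy = max(
                (bpm_similarity(cand[2], s[2]) for s in selected), default=0.0
            )
            return lam * cand[1] - (1.0 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical candidates: track "b" is nearly as relevant as "a" but sits
# exactly one tempo octave away, so it is penalised as redundant.
candidates = [("a", 0.95, 120.0), ("b", 0.94, 60.0), ("c", 0.80, 90.0)]
print([t[0] for t in mmr_rerank(candidates, k=2, lam=0.5)])  # ['a', 'c']
```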

Dynamic Scoring: The score_track function computes the final rank based on the bandit's weights θ and three similarity components:

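$$\text{Score} = w_{\text{bpm}} \cdot S_{\text{bpm}} + w_{\text{lyrics}} \cdot S_{\text{lyrics}} + w_{\text{audio}} \cdot S_{\text{audio}}$$

If a track is missing one of these components (e.g., lyrics or audio embeddings), the weights of the remaining components are renormalized.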
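
A minimal sketch of the weighted sum with renormalization over the available components follows. It assumes that a missing similarity is represented as None and that the active weights are rescaled to sum to 1; the real score_track in scorer.py may handle missing data differently.

```python
from typing import Dict, Optional


def score_track(
    weights: Dict[str, float], sims: Dict[str, Optional[float]]
) -> float:
    """Weighted similarity score, renormalized over the components present.

    weights: e.g. {"bpm": 0.5, "lyrics": 0.3, "audio": 0.2}
    sims:    per-component similarity, or None when the track lacks that feature
             (the None convention is an assumption made for this sketch).
    """
    # Keep only the weights whose similarity component is available.
    active = {name: w for name, w in weights.items() if sims.get(name) is not None}
    total = sum(active.values())
    if total <= 0:
        return 0.0  # nothing to score on
    # Rescale the active weights so they sum to 1 before combining.
    return sum((w / total) * sims[name] for name, w in active.items())


theta = {"bpm": 0.5, "lyrics": 0.3, "audio": 0.2}
# Track without lyrics: only bpm and audio contribute, with weights 0.5/0.7 and 0.2/0.7.
print(score_track(theta, {"bpm": 0.8, "lyrics": None, "audio": 0.6}))  # about 0.743
```

Without the renormalization step, a track with no lyrics would be capped at 70% of the achievable score under these weights, which is the scoring bias the text refers to.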

Implicit Feedback & Reward: After a track is played, an implicit reward is calculated based on session metrics (e.g., play time, skip latency). The reward policy penalizes early skips (e.g., skipping before 30 seconds).
Bandit Update: The reward is fed back to the SoftmaxUCBWeightBandit via bandit.update(), adjusting the future selection probability of that weight arm to reinforce successful recommendations.
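
The feedback half of the loop (reward, then update) can be sketched as follows. Everything here is an assumption made for illustration: the reward shape, the 30-second default, the class name SoftmaxUCBSketch, and its temperature and explore parameters are not taken from reward_policy.py or bandit_adapter.py.

```python
import math
import random


def implicit_reward(play_seconds, track_seconds, early_skip_seconds=30.0):
    """Reward in [-1, 1]: early skips are penalised, otherwise completion ratio."""
    if track_seconds <= 0:
        raise ValueError("track_seconds must be positive")
    if play_seconds < early_skip_seconds and play_seconds < track_seconds:
        return -1.0
    return min(play_seconds / track_seconds, 1.0)


class SoftmaxUCBSketch:
    """Samples an arm from a softmax over per-arm UCB scores (illustrative only)."""

    def __init__(self, arms, temperature=0.2, explore=1.0, seed=None):
        self.arms = list(arms)            # each arm is a candidate weight vector
        self.temperature = temperature
        self.explore = explore
        self.counts = [0] * len(self.arms)
        self.means = [0.0] * len(self.arms)
        self.rng = random.Random(seed)

    def probabilities(self):
        # Untried arms are explored first, uniformly at random.
        untried = [i for i, n in enumerate(self.counts) if n == 0]
        if untried:
            return [1.0 / len(untried) if n == 0 else 0.0 for n in self.counts]
        total = sum(self.counts)
        ucb = [
            m + self.explore * math.sqrt(math.log(total + 1) / n)
            for m, n in zip(self.means, self.counts)
        ]
        # Numerically stable softmax over the UCB scores.
        top = max(ucb)
        exps = [math.exp((u - top) / self.temperature) for u in ucb]
        z = sum(exps)
        return [e / z for e in exps]

    def select(self):
        probs = self.probabilities()
        return self.rng.choices(range(len(self.arms)), weights=probs, k=1)[0]

    def update(self, arm_index, reward):
        # Incremental running mean of the rewards observed for this arm.
        self.counts[arm_index] += 1
        n = self.counts[arm_index]
        self.means[arm_index] += (reward - self.means[arm_index]) / n


# Hypothetical arms: (w_bpm, w_lyrics, w_audio) vectors near a slider setting.
bandit = SoftmaxUCBSketch([(0.6, 0.3, 0.1), (0.5, 0.4, 0.1), (0.4, 0.5, 0.1)], seed=0)
arm = bandit.select()
bandit.update(arm, implicit_reward(play_seconds=12, track_seconds=210))
```

Sampling from a softmax rather than always taking the highest UCB score keeps some randomness in arm choice even after the estimates settle, at the cost of one extra tuning knob (the temperature).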

Data and Feature Engineering

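FAISS Index: The core retrieval index is a faiss.IndexFlatIP built over 15 scaled audio features. IndexFlatIP ranks by inner product, which equals cosine similarity when the vectors are L2-normalized; note that it performs an exhaustive (exact) search rather than an approximate one.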
Weighted Features: Features like tempo (1.5) and energy (1.2) are explicitly weighted up during the initial vectorization for the ANN search to reflect their importance.
Lyric Embeddings: Lyrical content is processed using a pre-trained multilingual Sentence Transformer (paraphrase-multilingual-MiniLM-L12-v2) to generate semantic embeddings for similarity calculation.
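
To show why the inner product acts as cosine similarity here, the sketch below weights the feature columns, L2-normalizes each row, and ranks tracks by dot product. It uses plain NumPy as a stand-in for the search step (the matrix product plays the role of the FAISS index lookup), and the three-column layout with a hypothetical "valence" feature is an assumption; only the tempo=1.5 and energy=1.2 weights come from the text.

```python
import numpy as np


def build_vectors(features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Apply per-column weights, then L2-normalize each row."""
    scaled = features * weights  # broadcasts the weights across rows
    norms = np.linalg.norm(scaled, axis=1, keepdims=True)
    norms[norms == 0] = 1.0      # avoid dividing all-zero rows by zero
    return (scaled / norms).astype(np.float32)


def top_k_inner_product(index_vectors: np.ndarray, query: np.ndarray, k: int):
    """Exhaustive inner-product search; returns (row indices, scores)."""
    scores = index_vectors @ query
    order = np.argsort(-scores)[:k]
    return order, scores[order]


# Columns: tempo, energy, valence (valence is a made-up third feature).
features = np.array(
    [[0.5, 0.8, 0.3],
     [0.9, 0.2, 0.6],
     [0.4, 0.7, 0.4]]
)
weights = np.array([1.5, 1.2, 1.0])

vectors = build_vectors(features, weights)
ids, scores = top_k_inner_product(vectors, vectors[0], k=2)
print(ids)     # the query track itself ranks first, followed by track 2
print(scores)  # the first score is approximately 1.0 (cosine of a vector with itself)
```

Because the weights are applied before normalization, a higher weight on tempo increases how much tempo differences change the angle between two track vectors, which is how the weighting influences the ranking.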
