catebros commented on Mar 10, 2026
- Collect all books per user: ratings >= 4, shelved books, listed books
- Cluster the embeddings
- Recommend per cluster: compute each cluster's center and query with it
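The three steps above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the PR's implementation: `cluster_embeddings` is a hypothetical helper, the k-means loop with farthest-point initialisation is one possible clustering choice, and any weighting by rating or interaction score is left out. Each returned centroid would then be used as the query vector for a similarity search.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def _sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cluster_embeddings(
    embeddings: Dict[int, Vector], n_clusters: int, n_iters: int = 20
) -> List[dict]:
    """Group book embeddings with a small k-means loop (hypothetical helper).

    Returns one dict per non-empty cluster with its book ids and centroid.
    """
    ids = sorted(embeddings)
    k = min(n_clusters, len(ids))
    if k == 0:
        return []

    # Deterministic farthest-point initialisation: start from the first id,
    # then repeatedly add the point farthest from the centers chosen so far.
    centers = [list(embeddings[ids[0]])]
    while len(centers) < k:
        farthest = max(
            ids, key=lambda i: min(_sq_dist(embeddings[i], c) for c in centers)
        )
        centers.append(list(embeddings[farthest]))

    members: List[List[int]] = []
    for _ in range(n_iters):
        # Assignment step: attach every book to its nearest center.
        members = [[] for _ in range(k)]
        for i in ids:
            nearest = min(range(k), key=lambda c: _sq_dist(embeddings[i], centers[c]))
            members[nearest].append(i)
        # Update step: move each center to the mean of its members.
        new_centers = [
            _mean([embeddings[i] for i in group]) if group else centers[c]
            for c, group in enumerate(members)
        ]
        if new_centers == centers:
            break
        centers = new_centers

    return [
        {"book_ids": group, "centroid": centers[c]}
        for c, group in enumerate(members)
        if group
    ]
```

Capping the cluster count at the number of available embeddings keeps the sketch usable for users with very short histories, which is also the case where a fallback to a single profile vector is most likely to be needed.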
Deploying with
| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! View logs | bookdb-landing | 230e6aa | Commit Preview URL, Branch Preview URL | Mar 10 2026, 07:46 PM |
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the book recommendation system by introducing a clustering-based approach. Instead of generating recommendations from a single user profile vector, the system now clusters a user's interaction history into distinct interest groups. Recommendations are then generated for each cluster's centroid, aiming to provide more diverse and relevant suggestions by capturing multiple facets of a user's reading preferences. A fallback mechanism ensures that the previous recommendation method is used if clustering is not applicable or yields no results.
Code Review
This pull request introduces a new recommendation strategy based on clustering user interaction embeddings. The user's book interactions are clustered, and recommendations are generated from the center of each cluster. The changes include new functions for fetching vectors from Qdrant, a clustering utility, and the main recommendation logic that integrates this new strategy with a fallback to the previous method.
My review focuses on improving robustness and maintainability. The main points are:
- Replacing broad `except Exception` blocks with more specific exception handling and logging to avoid silencing errors.
- Refactoring duplicated code to improve readability.
- Highlighting a potentially brittle approach to handling named vectors in Qdrant.
try:
    vector_map = get_vectors_by_ids(qdrant_client, list(seed_scores.keys()))
except Exception:
    return []

try:
    clusters = cluster_seeds_by_embedding(valid_seeds, seed_scores, n_clusters)
except Exception:
    return []

try:
    hits = most_similar_by_vector(
        qdrant_client,
        query_vector=centroid.tolist(),
        top_k=per_cluster_limits[cluster_idx],
        exclude_ids=cluster_excluded,
    )
except Exception:
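One way to act on the point about broad `except Exception` blocks is sketched below. It is only an illustration under stated assumptions: `VectorStoreError` and `fetch_vectors_or_empty` are hypothetical names, and the exception types a real vector-store client raises would need to be taken from that client's documentation. The idea is to catch only the failures that are expected in operation, log them, and let everything else propagate.

```python
import logging
from typing import Callable, Dict, List, Sequence

logger = logging.getLogger(__name__)


class VectorStoreError(Exception):
    """Hypothetical error type standing in for a client's own exceptions."""


def fetch_vectors_or_empty(
    fetch: Callable[[Sequence[int]], Dict[int, List[float]]],
    ids: Sequence[int],
) -> Dict[int, List[float]]:
    """Call `fetch`, returning {} only for expected, recoverable failures."""
    try:
        return fetch(ids)
    except (VectorStoreError, ConnectionError, TimeoutError) as exc:
        # Expected operational failure: record it so it is visible in logs,
        # then let the caller fall back to another strategy.
        logger.warning("Vector lookup failed for %d ids: %s", len(ids), exc)
        return {}
    # Anything else (TypeError, KeyError, ...) is most likely a bug and is
    # deliberately not caught here, so it surfaces instead of being hidden.
```

With this shape, an outage still degrades gracefully to an empty result, while a programming error shows up in tests and error tracking rather than as a silently empty recommendation list.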
if isinstance(vector, dict):
    vector = next(iter(vector.values()), None)
This logic to handle named vectors by picking the first one from the dictionary can be brittle. If multiple named vectors exist, the one chosen is arbitrary and depends on dictionary insertion order. It would be more robust to either expect a specific vector name or handle the case of multiple vectors more explicitly. If only one vector is ever expected, adding a comment to clarify this assumption would be helpful.
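A more explicit way to handle this could look like the sketch below. `extract_vector` and its `name` parameter are hypothetical and not part of the PR; the function only assumes that a stored vector arrives either as a plain sequence of floats or as a mapping from vector name to such a sequence. It accepts a single unnamed or lone named vector, honours an explicit name when one is given, and refuses to guess when several named vectors are present.

```python
from typing import List, Mapping, Optional, Sequence, Union

VectorPayload = Union[Sequence[float], Mapping[str, Sequence[float]], None]


def extract_vector(
    payload: VectorPayload, name: Optional[str] = None
) -> Optional[List[float]]:
    """Return one embedding from a plain or named-vector payload.

    `name` is the expected vector name (an assumption of this sketch).
    """
    if payload is None:
        return None
    if isinstance(payload, Mapping):
        if not payload:
            return None
        if name is not None:
            # An explicit name was requested: use it or report it as missing.
            vec = payload.get(name)
            return list(vec) if vec is not None else None
        if len(payload) == 1:
            # Exactly one named vector: the choice is unambiguous.
            return list(next(iter(payload.values())))
        # Several named vectors and no name given: do not pick arbitrarily.
        raise ValueError(
            f"Multiple named vectors {sorted(payload)}; pass `name` explicitly"
        )
    return list(payload)
```

Raising on the ambiguous case is a design choice; logging a warning and skipping the point would be an alternative if a missing recommendation seed is preferable to a failed request.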
interaction_goodreads_ids = _cluster_vector_recommendations(
    db,
    current_user.id,
    qdrant_client=qdrant,
    limit=max(limit * 4, 80),
    exclude_ids=set(bpr_goodreads_ids),
)
if not interaction_goodreads_ids:
    # Fall back
    interaction_goodreads_ids = _interaction_vector_recommendations(
        db,
        current_user.id,
        qdrant_client=qdrant,
        limit=max(limit * 4, 80),
        exclude_ids=set(bpr_goodreads_ids),
    )
The arguments passed to _cluster_vector_recommendations and the fallback _interaction_vector_recommendations are identical. To improve readability and avoid repetition, you can define the arguments once in a dictionary and unpack it for both function calls.
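# Shared arguments for both strategies. Passing them by keyword assumes the
# first two parameters of both functions are named `db` and `user_id`.
reco_args = {
    "db": db,
    "user_id": current_user.id,
    "qdrant_client": qdrant,
    "limit": max(limit * 4, 80),
    "exclude_ids": set(bpr_goodreads_ids),
}
interaction_goodreads_ids = _cluster_vector_recommendations(**reco_args)
if not interaction_goodreads_ids:
    # Fall back to the previous single-profile strategy
    interaction_goodreads_ids = _interaction_vector_recommendations(**reco_args)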