Book Diversification by Author #146

Merged: leaabj merged 2 commits into dev from diversify/books on Mar 12, 2026

@leaabj leaabj commented Mar 12, 2026

Changes:

  • Added _diversify_books_by_author() helper function in books router

    • Limits to max 2 books per author (configurable via _MAX_BOOKS_PER_AUTHOR)
    • Handles multi-author books correctly (counts toward each author's limit)
    • Preserves most similar books first (maintains relevance)
    • Returns diversified results, maintaining original order
  • Modified get_related_books() endpoint

    • Fetches 3x candidates from Qdrant (up to 100 books)
    • Applies diversification to results
    • Falls back to popular books (also diversified) when Qdrant unavailable
    • Caches larger candidate set for flexibility
  • Added configuration constants

    • _MAX_BOOKS_PER_AUTHOR = 2 (maximum books per author)
    • _CANDIDATE_MULTIPLIER = 3 (fetch more candidates to filter)
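
For readers without the diff open, the helper described above might look roughly like the sketch below. It is not the merged code. It assumes each book is a dict with an `authors` list, and it reads "counts toward each author's limit" as: skip a book if any of its authors is already at the cap. Both of those are assumptions, as are the unprefixed names.

```python
from collections import defaultdict

MAX_BOOKS_PER_AUTHOR = 2  # mirrors _MAX_BOOKS_PER_AUTHOR from the description


def diversify_books_by_author(books, max_per_author=MAX_BOOKS_PER_AUTHOR):
    """Return books in their original order, capped per author.

    Assumption: each book is a dict with an "authors" list of names.
    A book is skipped if any of its authors already has
    max_per_author kept books; otherwise it counts toward every
    one of its authors.
    """
    kept = []
    counts = defaultdict(int)
    for book in books:  # input is assumed to be sorted most-similar first
        authors = book.get("authors") or []
        if any(counts[author] >= max_per_author for author in authors):
            continue
        for author in authors:
            counts[author] += 1
        kept.append(book)
    return kept
```

Because the input order is preserved and earlier entries win, the most similar books for each author are the ones that survive. Books with no listed author are never filtered in this sketch.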

Closes #142

… of books per author in related book recommendations
@leaabj leaabj requested a review from yamirghofran March 12, 2026 09:42
cloudflare-workers-and-pages bot commented Mar 12, 2026

Deploying with Cloudflare Workers

Status: ✅ Deployment successful! (View logs)
Name: bookdb-landing
Latest Commit: 0f5d6a4
Preview URL: Commit Preview URL, Branch Preview URL
Updated (UTC): Mar 12 2026, 10:22 AM

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the book recommendation system by introducing author diversification. The primary goal is to prevent search results or related book suggestions from being dominated by a single author, thereby offering users a broader range of choices. This is achieved by fetching more candidates than initially required, applying a new diversification algorithm that limits books per author, and ensuring this logic is consistently applied across both primary and fallback recommendation paths.

Highlights

  • New Helper Function: Added the _diversify_books_by_author() helper function to the books router. This function limits the number of books from a single author to a configurable maximum (defaulting to 2), correctly handles multi-author books, and preserves the relevance order of the books.
  • Modified get_related_books() Endpoint: The get_related_books() endpoint was updated to fetch a larger set of candidate books (3x the requested limit) from Qdrant. It then applies the new author diversification logic to these candidates. The endpoint also caches the larger candidate set for flexibility and falls back to diversified popular books if Qdrant is unavailable.
  • New Configuration Constants: Introduced _MAX_BOOKS_PER_AUTHOR (set to 2) and _CANDIDATE_MULTIPLIER (set to 3) to control the diversification behavior and the number of candidates fetched.
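
As a rough illustration of the over-fetch step, the candidate count could be computed as below. The cap of 100 comes from the PR description; the name `MAX_CANDIDATES` and the function itself are hypothetical and not taken from the diff.

```python
CANDIDATE_MULTIPLIER = 3  # mirrors _CANDIDATE_MULTIPLIER
MAX_CANDIDATES = 100      # hypothetical name for the "up to 100 books" cap


def candidate_count(limit, multiplier=CANDIDATE_MULTIPLIER, cap=MAX_CANDIDATES):
    """Number of candidates to request so that filtering still leaves enough."""
    return min(limit * multiplier, cap)
```

With these values, a request for 10 related books asks Qdrant for 30 candidates, and any request for 34 or more hits the cap of 100.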
Changelog
  • apps/api/routers/books.py
    • Imported defaultdict from the collections module.
    • Refactored several import statements for better readability.
    • Added _MAX_BOOKS_PER_AUTHOR and _CANDIDATE_MULTIPLIER constants for configuring diversification.
    • Introduced the _diversify_books_by_author function to limit books per author in a given list.
    • Adjusted the get_related_books endpoint to fetch more candidates from Qdrant using _CANDIDATE_MULTIPLIER.
    • Implemented author diversification in get_related_books by calling _diversify_books_by_author on Qdrant results.
    • Updated the caching mechanism in get_related_books to store the larger candidate set before diversification.
    • Added logic to get_related_books to fill remaining slots with non-diversified books if diversification yields fewer than the requested limit.
    • Applied author diversification to the popular books fallback in get_related_books.
    • Performed minor code formatting adjustments in various functions (_search_response_from_ranked_books, _run_chatbot_search_pipeline, _load_book, _check_book_exists, search_books, get_book_reviews) for improved readability.
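
The backfill step mentioned in the changelog could be sketched as follows. This assumes books are dicts with an `id` key and that `candidates` is the full, similarity-ordered list before diversification; the function name and signature are hypothetical.

```python
def fill_to_limit(diversified, candidates, limit, key="id"):
    """Top up a diversified list with skipped candidates until it has limit items."""
    result = list(diversified[:limit])
    seen = {book[key] for book in result}
    for book in candidates:
        if len(result) >= limit:
            break
        if book[key] not in seen:
            # This book was dropped by the per-author cap; re-admit it.
            result.append(book)
            seen.add(book[key])
    return result
```

In this sketch the backfilled books are appended after the diversified ones, so the tail of the list is no longer strictly ordered by similarity. Whether the merged code does the same is not visible from this page.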

@leaabj leaabj marked this pull request as ready for review March 12, 2026 09:42

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces book diversification by author to the related books feature. A new helper function _diversify_books_by_author is added, and the get_related_books endpoint is updated to use it, both for Qdrant results and the popular books fallback. The changes look good overall, but I've found a performance issue with the caching logic in get_related_books where the cache is being ignored. I've also included a couple of suggestions to improve code clarity and conciseness.

Comment on lines 631 to +635

  cached_ids = _qdrant_cache.get(goodreads_id)
  recent_failure = _qdrant_failure_cache.get(goodreads_id, False)

- if cached_ids is not None:
-     related = load_books_by_goodreads_ids(db, cached_ids)
-     return serialize_books_with_engagement(db, related)
-
+ # Skip cached results - they don't have diversification applied
+ # Let them naturally expire (30 min TTL)

Severity: high

The caching logic for Qdrant results appears to be ineffective. cached_ids are fetched from the cache on line 631 but are then ignored. The code proceeds to call most_similar on every request, bypassing the cache. This negates the benefit of caching, increasing latency and load on the Qdrant service.

To fix this, you should use the cached_ids if they exist, and only query Qdrant if there's a cache miss. The misleading comment on lines 634-635 should also be removed or updated.
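
One possible shape for the fix is sketched below, assuming the cache holds the undiversified candidate IDs so that diversification can be applied on every read. The function and its parameters are hypothetical stand-ins, not the router's actual API; `cache` is any dict-like object, and `fetch_candidates` stands in for the Qdrant query.

```python
def related_ids(book_id, limit, cache, fetch_candidates, diversify):
    """Serve from cache when possible; query the vector store only on a miss."""
    candidates = cache.get(book_id)
    if candidates is None:
        # Cache miss: fetch the larger candidate set and store it undiversified.
        candidates = fetch_candidates(book_id)
        cache[book_id] = candidates
    # Diversify at read time so cached and fresh results are treated the same.
    return diversify(candidates)[:limit]
```

Storing the raw candidates rather than the final list means older cache entries never need to be skipped, since the per-author cap is applied after the lookup either way.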

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@leaabj leaabj merged commit cf929d0 into dev Mar 12, 2026
4 checks passed
@leaabj leaabj deleted the diversify/books branch March 12, 2026 11:05