Skip to content

Conversation

@ajaynz
Copy link
Collaborator

@ajaynz ajaynz commented Nov 17, 2025

Description

Fixes duplicate items appearing when using the 'Fastest Delivery' sort option.

Root Cause

When joining ProductDeliveryLink and DeliveryOption tables without using DISTINCT, products with multiple delivery options appeared multiple times in the result set (once for each delivery option).

Solution

Added .distinct(Product.id) to all delivery_fastest sort queries to ensure each product appears only once in results.

Changes

  • Modified backend/app/main.py: Added .distinct() calls to three delivery_fastest query paths
  • Added tests: Two comprehensive tests in backend/tests/api/test_delivery_options.py to verify no duplicates appear

Testing

✅ All 53 backend tests pass
✅ All 34 E2E tests pass
✅ Linting passes (ruff, eslint)

Verification

  • test_fastest_delivery_sort_no_duplicates: Ensures products appear exactly once
  • test_fastest_delivery_sort_with_category_filter_no_duplicates: Ensures no duplicates with category filter applied

- Add .distinct(Product.id) to delivery_fastest sort queries
- Prevents duplicate rows when products have multiple delivery options
- Added comprehensive tests for both general and category-filtered cases
- All 53 backend tests pass
- All 34 E2E tests pass
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

This PR attempts to fix issue #35 by preventing duplicate products when they have multiple delivery options. It adds .distinct(Product.id) to three query branches that join with delivery options, and includes comprehensive tests to verify the fix.

Critical Issue

The implementation has a fundamental bug with PostgreSQL's DISTINCT ON clause. When using .distinct(expression), the ORDER BY clause must start with the same expression. All three additions violate this rule by using distinct(Product.id) followed by ordering starting with DeliveryOption.estimated_days_min, which will cause database errors.

The fix requires restructuring the ORDER BY clauses to place Product.id first in all three locations.

View this review on Amp

stmt = (
stmt.join(ProductDeliveryLink)
.join(DeliveryOption)
.distinct(cast(Any, Product.id))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical Bug: Same DISTINCT ON issue as lines 228 and 240. The distinct(Product.id) must be followed by order_by(Product.id, ...) as the first ordering column.

stmt = (
stmt.join(ProductDeliveryLink)
.join(DeliveryOption)
.distinct(cast(Any, Product.id))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical Bug: When using DISTINCT ON with PostgreSQL, the expression in distinct() must match the first column(s) in order_by(). Here, distinct(Product.id) is used but order_by() starts with DeliveryOption.estimated_days_min. This will cause a database error.

The correct pattern is:

.distinct(Product.id)
.order_by(Product.id, DeliveryOption.estimated_days_min.asc(), ...)

This same issue appears in all three locations where .distinct() was added (lines 228, 240, 254).

stmt = (
stmt.join(ProductDeliveryLink)
.join(DeliveryOption)
.distinct(cast(Any, Product.id))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical Bug: Same DISTINCT ON issue as line 228. The distinct(Product.id) must be followed by order_by(Product.id, ...) as the first ordering column.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants