Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix #1806: N+1 query issue on dashboard index page #1813

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

ontowhee
Copy link
Contributor

@ontowhee ontowhee commented Dec 9, 2024

fix #1806

The existing query was looping on metrics to grab the latest Datum for each metric, which led to 18 similar queries reported by django-debug-toolbar. There were 45 sql queries overall for the page.

Screenshot 2024-12-09 at 7 33 50 AM

This is WIP. The change here reduces the number of queries down to 1 large statement of unions of each metric's latest datum. There are 9 sql queries overall for the page.

Changes include:

  • Query on the content type for each metric, which is used to build the union.
  • Query on Datum by building a union of all the metric querysets.
  • Adding a select_related('category') for the metric query that is used for sorting based on display_position.

Screenshot 2024-12-10 at 10 28 08 AM

Comment on lines 22 to 42
metrics.extend(
MC.objects.filter(show_on_dashboard=True).select_related("category")
)

content_types = ContentType.objects.get_for_models(*metrics)
datum_queryset = Datum.objects.none()
for metric, content_type in content_types.items():
datum_queryset = datum_queryset.union(
Datum.objects.filter(
content_type_id=content_type.id, object_id=metric.id
).order_by("-timestamp")[0:1]
)

latest_datums = {
(datum.object_id, datum.content_type_id): datum for datum in datum_queryset
}

data = []
for metric in metrics:
data.append({"metric": metric, "latest": metric.data.latest()})
for metric, content_type in content_types.items():
if latest := latest_datums.get((metric.id, content_type.id)):
data.append({"metric": metric, "latest": latest})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a better way to write this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been trying to find an alternate approach (got pretty close with a Datum.objects.values('content_type', 'object_id').annotate(Max('timestamp')) but unfortunately that can only get the latest timestamp for each metric, not the associated measurement).

Your original approach using UNION seems quite good (and it's quite clever).

I've made a suggestion for a slight rewrite that is a bit shorter and also closer to the original view implementation (it does so by pushing some logic onto the Datum manager). If you like it I can push a commit onto this branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! It's very interesting how you've moved most of the logic to DatumQuerySet!

The only concern is that latest = latest_by_metric.get((metric.content_type.pk, metric.pk)) could return None. Will the frontend be able to handle rendering of None?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the current behavior for a metric with no data is for the view to crash, so it can't be worse than that 😁
(good call though, I'll test it out)

@@ -33,10 +33,12 @@ def setUp(self):
def test_index(self):
for MC in Metric.__subclasses__():
for metric in MC.objects.filter(show_on_dashboard=True):
metric.data.create(measurement=44)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding extra data to ensure the query will retrieve the latest datum for each metric.

@ontowhee ontowhee marked this pull request as ready for review December 10, 2024 18:58
@bmispelon bmispelon force-pushed the fix-1806-dashboard-n-plus-1-query branch from a529fc0 to 2288df6 Compare December 11, 2024 21:56
@bmispelon
Copy link
Member

@ontowhee It took a few tries, but I think I finally landed on something that I think I like. I moved a lot of the logic onto manager methods so I could simplify the view (and it also simplified the N+1 tests sinceI could test directly against the manager method rather thant involve the view).

I managed to switch to a subquery instead of a union which I think simplifies the code, but I don't know if the performance is better/worse/equivalent. Do you know anything about that?

I also added an explicit last_updated into the template to deal with metrics that have no data. That's not really supposed to happen in production, but I did run into a crash a few times when I was testing things locally.

Let me know what you think, if you're OK with it I'll merge all commits on this branch into one, with both of us as co-authors.

@ontowhee
Copy link
Contributor Author

For commit 8fe1e01, the overall performance was 110ms.

Screenshot 2024-12-12 at 10 41 36 PM

For commit 2288df6, using the subquery seems to be worse for TracTicketMetric. It took 186ms to query that table, and overall it took 224ms. Will this performance be noticeable by the end user or the server?

Screenshot 2024-12-12 at 10 42 17 PM

Django debug toolbar also identifies "4 similar" queries on ContentType due to calling .for_dashboard() in the loop. Will Sentry flag these 4 similar queries as an N+1 issue?

@bmispelon
Copy link
Member

Thanks for doing the measurements! The difference between 110 and 186ms doesn't seem so bad to me, but I would like to know your opinion too.

Django debug toolbar also identifies "4 similar" queries on ContentType due to calling .for_dashboard() in the loop. Will Sentry flag these 4 similar queries as an N+1 issue?
Does that warning persist after a page refresh? Django should cache the content type globally (across requests), so I would expect those queries in the loop to happen only on the first load, but not once the cache has been populated.

@ontowhee
Copy link
Contributor Author

The difference between 110 and 186ms doesn't seem so bad to me, but I would like to know your opinion too.

It doesn't seem too bad to me either. I wasn't sure how similar the local database was to production. I noticed that the results are returned in much less time in production too.

Django should cache the content type globally (across requests), so I would expect those queries in the loop to happen only on the first load, but not once the cache has been populated.

You're right, on first load it shows the warning, but on refresh, the content type queries are cached along with all the other queries. (0 queries!)

I think this is good to merge! Thanks so much for refactoring this. I learned so much from your code!!

@ontowhee
Copy link
Contributor Author

I was playing around to understand the queries better. Here's another approach that queries on the Datum model. It uses .distinct() and .order_by() to get the latest measurement for each metric, and it uses prefetch_related() to fetch the metric. (I didn't know prefetch_related could be called on generic foreign keys.) This avoids making subqueries, and it also avoids building the unions or Q expressions that were done in the previous approaches.

Not asking for any changes here. I think your approach of querying on the Metric is easier to understand, and it’s already an improvement from the current dashboard!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[dashbord] Investigate and fix N+1 query issue
2 participants