
fix(grouping): Schedule seer deletion tasks with less hashes #95156


Merged
armenzg merged 6 commits into master from fix/split_seer_deletions/armenzg on Jul 10, 2025

Conversation

@armenzg armenzg (Member) commented Jul 9, 2025

The original code always passed all hashes to every task it spawned, so we could end up with massive task payloads that caused trouble for taskbroker.

We ran into exactly that situation in the last few days, when deleting a project led to hundreds of thousands of hashes being passed to tasks (179k+ hashes -> 6MB+ task payloads).

The changes here take the hashes handed to a task, chunk them, and spawn new tasks that each carry only a small number of hashes.

This moves us from sequential scheduling of tasks to parallelized scheduling.
This could have an impact on the Seer service if a massive number of hashes are requested for deletion.

Ref inc-1236
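
As a rough sanity check on those numbers, a back-of-the-envelope sketch (assuming 32-character hex grouping hashes plus a few bytes of JSON overhead each; the per-hash size is an assumption, not a measurement from the incident):

num_hashes = 179_000      # "179k+ hashes" from the description above
bytes_per_hash = 32 + 4   # assumed: 32 hex chars plus quotes/comma overhead in a JSON payload

payload_mb = num_hashes * bytes_per_hash / 1_000_000
print(f"~{payload_mb:.1f} MB per task payload")  # roughly the 6MB+ observed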

@github-actions github-actions bot added the Scope: Backend label (Automatically applied to PRs that change backend components) on Jul 9, 2025
for i in range(last_deleted_index, len_hashes, BATCH_SIZE):
    # Slice operations are safe and will not raise IndexError
    chunked_hashes = hashes[i : i + BATCH_SIZE]
    delete_seer_grouping_records_by_hash.apply_async(args=[project_id, chunked_hashes, 0])
armenzg (Member Author)

Newer tasks will be scheduled with last_deleted_index=0 since we're scheduling a chunked task.
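
For context, a minimal sketch of how the chunking and the last_deleted_index argument could fit together, assembled from the fragments shown in this diff. The celery shared_task decorator, the BATCH_SIZE value, and the short-circuit for small lists are illustrative assumptions, not the exact implementation:

from celery import shared_task

BATCH_SIZE = 100  # assumed value; the real batch size lives in the task module


def call_seer_to_delete_these_hashes(project_id, hashes):
    # Placeholder for the real Seer deletion call referenced elsewhere in this PR.
    ...


@shared_task
def delete_seer_grouping_records_by_hash(project_id, hashes, last_deleted_index=0):
    len_hashes = len(hashes)
    if len_hashes <= BATCH_SIZE:
        # Small payload: delete directly rather than fanning out again.
        call_seer_to_delete_these_hashes(project_id, hashes)
        return

    # Iterate through hashes in chunks and schedule a task for each chunk.
    # Tasks already in flight may pass a non-zero last_deleted_index, so start
    # from that index; newly scheduled chunks always pass 0.
    for i in range(last_deleted_index, len_hashes, BATCH_SIZE):
        # Slice operations are safe and will not raise IndexError
        chunked_hashes = hashes[i : i + BATCH_SIZE]
        delete_seer_grouping_records_by_hash.apply_async(args=[project_id, chunked_hashes, 0])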

@armenzg armenzg marked this pull request as ready for review July 9, 2025 18:55
@armenzg armenzg requested a review from a team as a code owner July 9, 2025 18:55
# Iterate through hashes in chunks and schedule a task for each chunk
# There are tasks passing last_deleted_index, thus, we need to start from that index
# Eventually all tasks will pass 0
for i in range(last_deleted_index, len_hashes, BATCH_SIZE):
Member
do we want to add similar tests to make sure the right number of tasks get called?

armenzg (Member Author)
I think this test I added yesterday covers it.

def test_call_delete_seer_grouping_records_by_hash_chunked(self) -> None:
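
For reference, a hedged sketch of what such an assertion could look like; the module path comes from the Codecov report below, but the hash format, batch size, and calling convention are assumptions rather than a copy of the real test:

from unittest import mock

from sentry.tasks.delete_seer_grouping_records import delete_seer_grouping_records_by_hash


def test_schedules_one_task_per_chunk() -> None:
    project_id = 1
    hashes = [f"{i:032x}" for i in range(250)]  # 250 fake 32-char hex hashes

    with mock.patch.object(delete_seer_grouping_records_by_hash, "apply_async") as mock_apply:
        # Invoke the task body synchronously so only the fan-out is exercised.
        delete_seer_grouping_records_by_hash(project_id, hashes)

    # With a batch size of 100, 250 hashes should fan out into 3 chunked tasks,
    # each scheduled with last_deleted_index=0.
    assert mock_apply.call_count == 3
    for scheduled in mock_apply.call_args_list:
        _project_id, chunk, last_deleted_index = scheduled.kwargs["args"]
        assert len(chunk) <= 100
        assert last_deleted_index == 0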

codecov bot commented Jul 9, 2025

Codecov Report

Attention: Patch coverage is 75.00000% with 2 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines | Patch % | Lines
src/sentry/tasks/delete_seer_grouping_records.py | 75.00% | 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #95156      +/-   ##
==========================================
+ Coverage   87.84%   87.90%   +0.05%     
==========================================
  Files       10469    10459      -10     
  Lines      605374   604694     -680     
  Branches    23674    23571     -103     
==========================================
- Hits       531819   531575     -244     
+ Misses      73195    72758     -437     
- Partials      360      361       +1     

@armenzg armenzg requested a review from markstory July 9, 2025 19:15
@markstory markstory (Member) left a comment

Makes sense to me.

end_index = min(last_deleted_index + BATCH_SIZE, len_hashes)
call_seer_to_delete_these_hashes(project_id, hashes[last_deleted_index:end_index])
if end_index < len_hashes:
    delete_seer_grouping_records_by_hash.apply_async(args=[project_id, hashes, end_index])
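
Note the re-enqueue on the last line of this pre-fix version: the follow-up task receives the entire hashes list again with only end_index advanced, so every task in the chain carries the full payload. A back-of-the-envelope comparison of the total bytes enqueued under the old and new schemes (hash size and batch size are illustrative assumptions):

NUM_HASHES = 179_000   # from the incident described above
BYTES_PER_HASH = 36    # assumed: 32 hex chars plus JSON overhead
BATCH_SIZE = 100       # assumed batch size

num_tasks = -(-NUM_HASHES // BATCH_SIZE)  # ceiling division: ~1,790 tasks either way

# Old scheme: every re-enqueued task carries the full hash list.
old_total_bytes = num_tasks * NUM_HASHES * BYTES_PER_HASH
# New scheme: each chunked task carries only its own slice.
new_total_bytes = NUM_HASHES * BYTES_PER_HASH

print(f"old: ~{old_total_bytes / 1e9:.1f} GB enqueued in total")  # ~11.5 GB
print(f"new: ~{new_total_bytes / 1e6:.1f} MB enqueued in total")  # ~6.4 MB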
markstory (Member)

Is this where all the continued stream of big tasks was coming from?

armenzg (Member Author)
@markstory yes, this is where it came from

Base automatically changed from ref/seer_deletion/armenzg to master July 10, 2025 12:41
@armenzg armenzg requested a review from a team as a code owner July 10, 2025 12:41
@armenzg armenzg (Member Author) commented Jul 10, 2025

bugbot run

@cursor cursor bot (Contributor) left a comment

✅ BugBot reviewed your changes and found no bugs!



@armenzg armenzg merged commit 130345e into master Jul 10, 2025
66 checks passed
@armenzg armenzg deleted the fix/split_seer_deletions/armenzg branch July 10, 2025 14:18
armenzg added a commit that referenced this pull request Jul 11, 2025
This simplifies the tests for deletion of hashes from Seer and also adds a test for #95156.
andrewshie-sentry pushed a commit that referenced this pull request Jul 14, 2025
andrewshie-sentry pushed a commit that referenced this pull request Jul 14, 2025
sentry-io bot commented Jul 19, 2025

Suspect Issues

This pull request was deployed and Sentry observed the following issues:


Labels
Scope: Backend Automatically applied to PRs that change backend components