cst: Use hashes in scrubber for segment-exists checks #22810
Conversation
Force-pushed from 28fbfa6 to 661d102.
Hi @abhijat. Should we review the properties for docs now, or do you prefer that we review when it's not in draft? Thanks
@Deflaimun this PR is actually based off #21576; the properties have been reviewed by the docs team already. Once the base PR is merged and I rebase, this PR won't contain any configuration-related changes.
Force-pushed from 661d102 to cd0dfba.
/cdt
/ci-repeat
Force-pushed from 3afde3c to 5cae3b3.
/ci-repeat
new failures in https://buildkite.com/redpanda/redpanda/builds/53889#0191b29a-3ee8-4ab4-9594-a8bcb095a2df:
new failures in https://buildkite.com/redpanda/redpanda/builds/53889#0191b29a-3eea-4c5b-bcc5-d6276e182346:
new failures in https://buildkite.com/redpanda/redpanda/builds/53889#0191b2c4-a837-449d-8b60-94b6f4fc90fa:
new failures in https://buildkite.com/redpanda/redpanda/builds/53889#0191b2c4-a832-4831-bf55-559e60ac226e:
new failures in https://buildkite.com/redpanda/redpanda/builds/54087#0191c767-9877-4a2a-bc1b-18f9a5851e30:
new failures in https://buildkite.com/redpanda/redpanda/builds/54087#0191c780-f5d1-4485-844a-14fd899b4862:
new failures in https://buildkite.com/redpanda/redpanda/builds/54819#01920f30-2f29-4cf4-aaf9-0aa79c47eaab:
/ci-repeat
Force-pushed from 8edf665 to c129b25.
/ci-repeat
Force-pushed from c129b25 to aafbbbd.
/ci-repeat
Force-pushed from aafbbbd to 3323f9a.
/ci-repeat
Force-pushed from f11dbe3 to 3e563b0.
Nothing too drastic, just some nits and test suggestions
@@ -343,6 +343,8 @@ struct anomalies
    uint32_t num_discarded_missing_segments{0};
    uint32_t num_discarded_metadata_anomalies{0};

    bool segment_existence_checked{false};
nit: can you add a comment explaining how this is expected to be used? Once this is set to true, does that mean further scrubs aren't needed?
This is informational for the person reviewing scrub results. So if this is true, we know that the segment existence checks were performed as part of that scrub. I added this mostly because as a result of this PR, the scrub may run but not check for segments (because inv. data was missing). This flag lets a user determine if that happened. I will add a comment.
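A possible shape for that comment, based on the explanation above (the wording is a sketch, not the final text in the PR):

struct anomalies {
    // ...
    // Informational flag for whoever reviews scrub results: true when this
    // scrub actually performed segment-existence checks (via inventory hashes
    // and/or HTTP HEAD requests). False means the scrub ran but the checks
    // were skipped, e.g. because inventory data was missing on disk. It does
    // not mean that further scrubs are unnecessary.
    bool segment_existence_checked{false};
};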
bool existence_query_context::should_check_cloud_storage() const {
    if (is_http_force_override_enabled) {
nit: maybe rename to force_check_enabled or something? It isn't obvious what HTTP this is referring to in this context
I meant to imply that HTTP API calls are forced to always happen, but that was hard to encode in the name. I will try to think of a better one.
if (
  query_ctx.lookup_inventory_data(segment_path)
    != cloud_storage::inventory::lookup_result::exists
  && query_ctx.should_check_cloud_storage()) {
nit: I found it a bit difficult to map the query_ctx members to these methods. I think it would be easier to follow if we merge these into the same method, like:
bool should_check_cloud_storage(remote_segment_path p) {
    if (is_inv_data_available) {
        return hashes->exists(p) == missing;
    }
    if (force_without_inv_data ||
        // The regular, non-inventory based scrub checks every segment.
        !is_inv_scrub_enabled) {
        return true;
    }
    return false;
}
Yeah, this looks simpler. I tried to combine the two methods in the latest changes.
  {xxhash_64(test_path.data(), test_path.size())},
  0)
  .get();
q.load_from_disk().get();
nit: do we ever expect to not load_from_disk()? Wondering if it makes sense for existence_query_context to have a static futurized constructor that loads from disk, something like:
static ss::future<existence_query_context> existence_query_context::load(bool, ntp) {
    existence_query_context q{};
    co_await q.load_from_disk();
    co_return q;
}
WDYT?
Changed. The main issue here is with the design of ntp_hashes. I originally added a gate in there, but now realize it is just a set of hashmaps loaded from disk; it doesn't have a coroutine-based API other than the loading, so ideally it doesn't need a gate. It should have been one free function to load the data, called with a gate held by the caller; then ntp_hashes would just be a set of hashmaps. I will try to clean it up in a future PR.
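A rough sketch of that cleanup, assuming ntp_hashes becomes plain data and loading moves to a free function; the names load_ntp_hashes and hash_sets are illustrative, not the current API:

// Plain data: the per-NTP hash sets loaded from disk, with no gate or rtc node.
struct ntp_hashes {
    std::vector<absl::flat_hash_set<uint64_t>> hash_sets;
};

// Coroutine-based loading as a free function. The caller holds its own gate:
//   auto holder = _gate.hold();
//   auto hashes = co_await load_ntp_hashes(root, ntp);
ss::future<ntp_hashes>
load_ntp_hashes(std::filesystem::path root, model::ntp ntp);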
// Writes hashes for a single NTP to a data file. The data file is
// located in path: namespace/topic/partition_id/{seq}. The parent
// directories are created if missing. A new data file is created for each
// flush operation.
ss::future<> flush_ntp_hashes(
  std::filesystem::path root,
  model::ntp ntp,
  fragmented_vector<uint64_t> hashes,
  uint64_t file_name);

ss::future<>
write_hashes_to_file(ss::file& f, fragmented_vector<uint64_t> hashes);

ss::future<> write_hashes_to_file(
  ss::output_stream<char>& stream, fragmented_vector<uint64_t> hashes);

} // namespace cloud_storage::inventory
nit: these all seem to be in service of the single task of writing hashes. Would it make sense to wrap them in some hash_writer class that wraps a ss::file?
Or are these separate because they're used individually in tests?
Or maybe the write_hashes_to_file() can just be in an anonymous namespace in the cc file?
IIRC these were member functions of the inventory consumer that were kept separate just for readability. Now that they have moved to an independent compilation unit, only the flush function needs to be public, so I'll make the other two private.
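One possible layout for that, combining the suggestion above with this reply (the file names are assumptions; only the declarations from the diff are taken as given):

// inv_consumer_utils.h -- only the entry point remains public.
namespace cloud_storage::inventory {
ss::future<> flush_ntp_hashes(
  std::filesystem::path root,
  model::ntp ntp,
  fragmented_vector<uint64_t> hashes,
  uint64_t file_name);
} // namespace cloud_storage::inventory

// inv_consumer_utils.cc -- the write helpers stay visible only to this
// translation unit.
namespace {
ss::future<> write_hashes_to_file(
  ss::output_stream<char>& stream, fragmented_vector<uint64_t> hashes) {
    for (auto h : hashes) {
        co_await stream.write(reinterpret_cast<const char*>(&h), sizeof(h));
    }
    co_await stream.flush();
}

ss::future<>
write_hashes_to_file(ss::file& f, fragmented_vector<uint64_t> hashes) {
    auto stream = co_await ss::make_file_output_stream(f);
    co_await write_hashes_to_file(stream, std::move(hashes));
    co_await stream.close();
}
} // namespace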
# skip adding the segment, because we still want the data set to be present on disk, so that the
# scrubber loads the data and performs segment checks. The presence of the changed segment name
# causes the NTP hash dir to be created on disk.
# TODO add more assertions once metrics for scrubber run are added
+1
Can we have a version of this test that doesn't change the name?
It'd also be nice to have a test that removes segments, generates the report, and asserts that there are anomalies based on the inventory report
Yeah, with metrics for the scrub process added it will be easier to make some assertions. Right now what happens in the scrubber is a bit of a black box for ducktape, especially when asserting why the scrubber took certain actions.
@@ -637,6 +678,10 @@ def test_scrubber(self, cloud_storage_type):
        self._produce()
        self._assert_no_anomalies()

        # Add a randomly generated inv. report to the mix. This will have some keys missing, but scrubbing
nit: it doesn't seem to be randomly generated?
Yeah, the randomness didn't work with the test. I will update the comment.
Force-pushed from 1cf6165 to 66874ca.
Looks like the refactor broke some tests.
The consumer of the object should be able to query if the hashes have been loaded from disk.
The ntp_hashes class mainly contains movable fields, except for the retry chain node. Since the node is used only to set up the logger in the object, a move-ctor is added which moves everything except the RTC node. Crucially, the gate of the hashes object is also moved, so we do not stop the moved-from hashes object.
A new boolean field denotes if a segment existence check was performed.
When checking for segments within a manifest, the anomaly detector first tries to load NTP hashes which could have been placed there by the inventory service. Once loaded, this data is used as follows:
* for each segment path, first check the inventory data
* if the path exists, move on; no request is consumed
* if the path is missing or collides, make an HTTP call
* if the path is still missing, mark the segment as missing in anomalies
The hash data set is not checked for manifests, as these have to be downloaded for the scrub. If the hash data set was not loaded, the segment-exists checks are skipped entirely; this makes sure we do not issue a lot of HEAD requests when the data set is missing. There are two caveats. First, if scrub is enabled but inventory-based scrub is disabled, we always make HTTP calls, because that config is deemed explicit intent that such calls should be made. Second, a flag to force HTTP calls is provided in the anomaly detector's run method; if set, HTTP calls are always made even if the data set is missing. This is for the benefit of the topic recovery validator, where we may want to always make HTTP calls even if there is no inventory data.
The ntp hashes query object wraps several booleans which decide whether api calls should be made, as well as the hashes object itself. The tests added here exercise the combinations of these booleans derived from the cluster config as well as input arguments.
The hash writer logic is extracted to a utils compilation unit so that the scrubber/anomaly detector unit tests can use the same logic to prepare fixtures for testing.
The anomaly detector now expects hashes to be loaded from disk. Helper functions are added to write a set of segment-meta to disk, while being able to skip some of these to introduce anomalies.
Two unit tests are added which assert that inventory data is used from disk, and that when segments are missing from the data, HTTP calls are made to check for them.
A similar change was previously made to the S3 client but was not reflected in the ABS client. The caller can now directly pass a binary payload and skip wrapping it in bytes() during put-object.
An inventory report is added to the bucket during the scrubber test. This report fudges the segment names before adding them to the report, for two reasons:
* At some point during the test we remove a segment and expect an anomaly. Since we do not know which segment will be removed when generating the report, we have to change the name before adding the key; otherwise the key is found in the report and the bucket is never checked, which breaks the test.
* We cannot skip adding the segment to the report altogether, because then no hash file is generated, and as a safety measure the scrubber does not check for segments at all, to avoid making many HEAD requests.
In its current state the addition to the test is simply a sanity check. As more metrics are added, this test can be expanded to include more useful assertions.
Force-pushed from 66874ca to 532515f.
ducktape failure was #16202
LGTM thanks for the cosmetic changes!
When checking for segments within a manifest, the anomaly detector first tries to load NTP hashes which could have been placed there by the inventory service.
Once loaded, this data is used as follows:
* for each segment path, first check the inventory data
* if the path exists, move on; no request is consumed
* if the path is missing or collides, make an HTTP call
* if the path is still missing, mark the segment as missing in anomalies
The hash data set is not checked for manifests, as these have to be downloaded for the scrub eventually.
The segment-existence check is performed under these circumstances:
* inventory data is available on disk
* inventory-based scrub is disabled (the regular scrub checks every segment)
* HTTP calls are explicitly forced via the flag passed to the anomaly detector's run method
The check is not performed if inventory-based scrub is enabled but the data is not available on disk.
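A sketch of the per-segment flow described above, run inside the detector's coroutine (segment_paths, segment_exists_in_bucket, and missing are illustrative names; the query_ctx methods and lookup_result come from the diff shown earlier):

for (const auto& segment_path : segment_paths) {
    if (
      query_ctx.lookup_inventory_data(segment_path)
      == cloud_storage::inventory::lookup_result::exists) {
        continue; // present in inventory data, no HTTP request consumed
    }
    if (!query_ctx.should_check_cloud_storage()) {
        continue; // check skipped, e.g. inventory data missing and no force flag
    }
    // Hash miss or collision: fall back to an HTTP HEAD request.
    if (!co_await segment_exists_in_bucket(segment_path)) { // illustrative helper
        missing.push_back(segment_path); // recorded as an anomaly
    }
}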
Note: the scrub result is still marked as full even if segment-existence checks are not performed. The semantics of the scrub result (whether it is full or partial) are mainly used to drive the combination of scrub results.
Partial results are merged together, while full results replace previous values. In this context, marking the result as partial because segment existence was not checked (possibly due to missing inventory data) does not make much sense: the scrub is still performed over the full offset range in the bucket; we just turned off one check because the data was not available. Such a result should still replace the previous one.
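A minimal sketch of that combination rule, assuming an anomalies value plus a full/partial status flag (the names scrub_status and merge_into are illustrative, not the exact implementation):

anomalies combine(anomalies previous, anomalies latest, scrub_status status) {
    if (status == scrub_status::full) {
        // A full scrub replaces whatever was recorded previously.
        return latest;
    }
    // Partial scrubs only add to the accumulated result.
    merge_into(previous, std::move(latest)); // illustrative merge helper
    return previous;
}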
Backports Required
Release Notes