
cst: Use hashes in scrubber for segment-exists checks #22810

Merged

Conversation

abhijat
Contributor

@abhijat abhijat commented Aug 9, 2024

When checking for segments within a manifest, the anomaly detector first tries to load the NTP hashes that may have been placed on disk by the inventory service.

Once loaded, this data is used as follows:

  • for each segment path, first check the inventory data set
  • if the path exists, move on; no operation budget is consumed
  • if the path is missing or a hash collision is possible, make an HTTP call to check whether the segment is actually missing
  • if the path is still missing, mark the segment as missing in the anomalies

The hash data set is not checked for manifests, as these have to be downloaded for the scrub eventually.
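The per-segment flow above can be sketched roughly as follows. This is an illustrative C++ sketch: the type names, the stubbed HEAD request, and the paths are hypothetical stand-ins, not the actual Redpanda API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_set>

// Hypothetical stand-ins for the PR's types: a hash set loaded from
// inventory data and a stubbed HEAD request. Names are illustrative.
enum class lookup_result { exists, missing };

struct ntp_hash_set {
    std::unordered_set<uint64_t> hashes;
    lookup_result lookup(uint64_t path_hash) const {
        return hashes.count(path_hash) > 0 ? lookup_result::exists
                                           : lookup_result::missing;
    }
};

// Stubbed cloud-storage HEAD request: pretend exactly one object exists.
bool head_object_exists(const std::string& path) {
    return path == "kafka/topic/0_1/0-1-v1.log";
}

// Per-segment check: consult the inventory hashes first; fall back to an
// HTTP call (which consumes operation budget) only on a miss or collision.
bool segment_is_missing(
  const ntp_hash_set& inv,
  uint64_t path_hash,
  const std::string& path,
  int& ops_budget) {
    if (inv.lookup(path_hash) == lookup_result::exists) {
        return false; // found in inventory, no budget consumed
    }
    --ops_budget; // the HEAD request consumes operation budget
    return !head_object_exists(path);
}
```

The key property shown is that a hit in the inventory data set short-circuits before any budget is spent, while a miss escalates to the (budgeted) HTTP check.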

The segment-existence check is performed under these circumstances:

  • inventory-based scrub is enabled AND inventory data is available
  • scrub is enabled AND inventory-based scrub is disabled
  • a special flag is passed to the anomaly detector; this is for the benefit of the topic recovery validator, which also uses the anomaly detector but may run in environments where inventory-based scrub is not available

The check is not performed if inventory-based scrub is enabled but the data is not available on disk.
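Taken together, the rules above condense into a single predicate. This is a sketch under assumed flag names; the real member names in the PR differ.

```cpp
#include <cassert>

// Sketch of when segment-existence checks run, mirroring the conditions
// listed above. All parameter names are hypothetical.
bool should_check_segment_existence(
  bool inv_scrub_enabled,
  bool inv_data_available,
  bool scrub_enabled,
  bool force_check) {
    if (force_check) {
        return true; // e.g. the topic recovery validator
    }
    if (inv_scrub_enabled) {
        // Only check when the inventory data actually made it to disk;
        // otherwise skip to avoid a flood of HEAD requests.
        return inv_data_available;
    }
    // A regular, non-inventory-based scrub checks every segment.
    return scrub_enabled;
}
```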

Note: the scrub result is still marked as full even if segment-existence checks are not performed. The semantics of the scrub result (whether it is full or partial) mainly drive how successive scrub results are combined.

Partial results are merged together, while full results replace previous values. In this context, marking the result as partial because segment existence was not checked (possibly due to missing inventory data) does not make much sense: the scrub is still performed over the full offset range in the bucket; we only skipped one check because the data was not available. Such a result should still replace the previous result.
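The combination rule can be sketched like this, using a simplified stand-in for the real anomalies type:

```cpp
#include <cassert>
#include <set>
#include <string>

// Simplified stand-in for the scrub-result/anomalies type described above.
struct scrub_result {
    std::set<std::string> missing_segments;
    bool full; // true when the scrub covered the full offset range
};

// Full results replace the previous value; partial results are merged in.
void combine(scrub_result& prev, const scrub_result& next) {
    if (next.full) {
        prev = next; // a full scan supersedes everything seen before
    } else {
        prev.missing_segments.insert(
          next.missing_segments.begin(), next.missing_segments.end());
    }
}
```

Under this rule, a scrub that skipped existence checks but still scanned the full offset range correctly replaces the previous anomalies rather than accumulating stale entries.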

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.2.x
  • v24.1.x
  • v23.3.x

Release Notes

  • none

@abhijat abhijat requested a review from a team as a code owner August 9, 2024 05:56
@abhijat abhijat force-pushed the feat/inv-scrub/use-hashes-in-scrubber branch from 28fbfa6 to 661d102 Compare August 9, 2024 06:01
@abhijat abhijat marked this pull request as draft August 9, 2024 06:01
@Deflaimun
Contributor

Deflaimun commented Aug 9, 2024

Hi @abhijat. Should we review the properties for docs now or do you prefer that we review when it's not in draft? Thanks

@abhijat
Contributor Author

abhijat commented Aug 9, 2024

Hi @abhijat. Should we review the properties for docs now or do you prefer that we review when it's not in draft? Thanks

@Deflaimun this PR is actually based off #21576 - the properties have been reviewed by the docs team already. Once the base PR is merged and I rebase this PR won't contain any configuration related changes.

@abhijat abhijat force-pushed the feat/inv-scrub/use-hashes-in-scrubber branch from 661d102 to cd0dfba Compare August 16, 2024 06:04
@abhijat
Contributor Author

abhijat commented Sep 2, 2024

/cdt

@abhijat
Contributor Author

abhijat commented Sep 2, 2024

/ci-repeat

@abhijat abhijat force-pushed the feat/inv-scrub/use-hashes-in-scrubber branch from 3afde3c to 5cae3b3 Compare September 2, 2024 11:01
@abhijat
Contributor Author

abhijat commented Sep 2, 2024

/ci-repeat

@vbotbuildovich
Collaborator

vbotbuildovich commented Sep 2, 2024

new failures in https://buildkite.com/redpanda/redpanda/builds/53889#0191b29a-3ee8-4ab4-9594-a8bcb095a2df:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_prevent_recovery.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/53889#0191b29a-3eea-4c5b-bcc5-d6276e182346:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_prevent_recovery.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/53889#0191b2c4-a837-449d-8b60-94b6f4fc90fa:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_prevent_recovery.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/53889#0191b2c4-a832-4831-bf55-559e60ac226e:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_prevent_recovery.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/54087#0191c767-9877-4a2a-bc1b-18f9a5851e30:

"rptest.tests.cloud_storage_scrubber_test.CloudStorageScrubberTest.test_scrubber.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/54087#0191c780-f5d1-4485-844a-14fd899b4862:

"rptest.tests.cloud_storage_scrubber_test.CloudStorageScrubberTest.test_scrubber.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/54819#01920f30-2f29-4cf4-aaf9-0aa79c47eaab:

"rptest.tests.e2e_shadow_indexing_test.EndToEndShadowIndexingTestWithDisruptions.test_write_with_node_failures.cloud_storage_type=CloudStorageType.S3"

@vbotbuildovich
Collaborator

vbotbuildovich commented Sep 2, 2024

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/53889#0191b29a-3ee8-4ab4-9594-a8bcb095a2df

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/53889#0191b29a-3ee7-4c9b-a91a-db4106b6a99d

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/53889#0191b29a-3ee5-41d1-a7fe-89074572a760

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/53889#0191b2c4-a832-4831-bf55-559e60ac226e

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/53889#0191b2c4-a834-4415-9c62-8888284aef86

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54087#0191c767-987a-44ec-8ac4-efcba5aa3101

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54087#0191c767-987c-465d-bfa5-a449ce9a3cca

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54087#0191c780-f5d8-4847-8842-813d832bf4af

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54087#0191c780-f5d4-4aa2-9257-90db112d973d

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54138#0191cbb2-a314-4d67-ae05-5d33b4b09d29

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54138#0191cbb2-7ffc-4fce-b698-566eabed2dbc

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54143#0191cc73-b10e-4ce5-b58d-78230d952e60

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54143#0191cc8d-03c7-4b5d-a2f3-0c3d4f413641

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54167#0191d0ac-ce1b-45b3-bfe7-99d0f3295eef

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54167#0191d0ac-ce19-44a6-9bf5-3004d0a91b89

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54168#0191d215-e505-4306-91b1-04974d55c54f

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54181#0191d646-7a1f-44f5-8e86-8c1e2f9a7bd4

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54252#0191dc4b-de7f-468c-9a81-b88fadc95ea4

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54252#0191dc4e-59c2-4a76-b0fd-06ef2a7d2dfd

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54307#0191dff0-34e2-49b6-83e9-91237998c5fd

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54887#01921477-9065-4b1e-94da-c103927abb2f

@abhijat
Contributor Author

abhijat commented Sep 6, 2024

/ci-repeat

@abhijat abhijat force-pushed the feat/inv-scrub/use-hashes-in-scrubber branch from 8edf665 to c129b25 Compare September 6, 2024 11:43
@abhijat
Contributor Author

abhijat commented Sep 6, 2024

/ci-repeat

@abhijat abhijat force-pushed the feat/inv-scrub/use-hashes-in-scrubber branch from c129b25 to aafbbbd Compare September 7, 2024 07:10
@abhijat
Contributor Author

abhijat commented Sep 7, 2024

/ci-repeat

@abhijat abhijat force-pushed the feat/inv-scrub/use-hashes-in-scrubber branch from aafbbbd to 3323f9a Compare September 7, 2024 11:09
@abhijat
Contributor Author

abhijat commented Sep 7, 2024

/ci-repeat

@abhijat abhijat force-pushed the feat/inv-scrub/use-hashes-in-scrubber branch 9 times, most recently from f11dbe3 to 3e563b0 Compare September 8, 2024 06:31
@abhijat abhijat requested review from andrwng and removed request for a team September 11, 2024 05:38
Contributor

@andrwng andrwng left a comment

Nothing too drastic, just some nits and test suggestions

@@ -343,6 +343,8 @@ struct anomalies
uint32_t num_discarded_missing_segments{0};
uint32_t num_discarded_metadata_anomalies{0};

bool segment_existence_checked{false};
Contributor

nit: can you add a comment explaining how this is expected to be used? Once this is set to true, does that mean further scrubs aren't needed?

Contributor Author

This is informational for the person reviewing scrub results. So if this is true, we know that the segment existence checks were performed as part of that scrub. I added this mostly because as a result of this PR, the scrub may run but not check for segments (because inv. data was missing). This flag lets a user determine if that happened. I will add a comment.

}

bool existence_query_context::should_check_cloud_storage() const {
if (is_http_force_override_enabled) {
Contributor

nit: maybe rename to force_check_enabled or something? It isn't obvious what HTTP this is referring to in this context

Contributor Author

I meant to imply here that HTTP API calls are forced to always happen but it was hard to encode in the name, will try to think of a better name.

Comment on lines 263 to 266
if (
query_ctx.lookup_inventory_data(segment_path)
!= cloud_storage::inventory::lookup_result::exists
&& query_ctx.should_check_cloud_storage()) {
Contributor

nit: I found it a bit difficult to map the query_ctx members to these methods. I think it would be easier to follow if we merge these into the same method, like:

bool should_check_cloud_storage(remote_segment_path p) {
    if (is_inv_data_available) {
        return hashes->exists(p) == missing;
    }
    if (force_without_inv_data ||
        // The regular, non-inventory based scrub checks every segment.
        !is_inv_scrub_enabled) {
        return true;
    }
    return false;
}

Contributor Author

yeah, this looks simpler. I tried to combine the two methods in latest changes

{xxhash_64(test_path.data(), test_path.size())},
0)
.get();
q.load_from_disk().get();
Contributor

nit: do we ever expect to not load_from_disk()? Wondering if it makes sense for existence_query_context to have a static futurized constructor that loads from disk, something like:

static ss::future<existence_query_context> existence_query_context::load(bool, ntp) {
    existence_query_context q{};
    co_await q.load_from_disk();
    co_return q;
}

WDYT?

Contributor Author

Changed. The main issue here is the design of ntp_hashes: I originally added a gate in there, but now realize it is just a set of hashmaps loaded from disk. It doesn't have a coroutine-based API other than the loading, so ideally it doesn't need a gate; it should have been one free function to load the data, called with a gate held by the caller.

Then ntp_hashes would just be a set of hashmaps. I will try to clean it up in a future PR.

Comment on lines 22 to 32
// Writes hashes for a single NTP to a data file. The data file is
// located in path: namespace/topic/partition_id/{seq}. The parent
// directories are created if missing. A new data file is created for each
// flush operation.
ss::future<> flush_ntp_hashes(
std::filesystem::path root,
model::ntp ntp,
fragmented_vector<uint64_t> hashes,
uint64_t file_name);

ss::future<>
write_hashes_to_file(ss::file& f, fragmented_vector<uint64_t> hashes);

ss::future<> write_hashes_to_file(
ss::output_stream<char>& stream, fragmented_vector<uint64_t> hashes);

} // namespace cloud_storage::inventory
Contributor

nit: these all seem to be in service of the single task of writing hashes. Would it make sense to wrap them in some hash_writer class that wraps a ss::file?

Or are these separate because they're used individually in tests?

Contributor

Or maybe the write_hashes_to_file() can just be in an anonymous namespace in the cc file?

Contributor Author

IIRC these were member functions of the inventory consumer which were separate just for readability. When I moved them to independent compilation unit, they are no longer required to be public other than the flush function. I'll make the other two private.

# skip adding the segment, because we still want the data set to be present on disk, so that the
# scrubber loads the data and performs segment checks. The presence of the changed segment name
# causes the NTP hash dir to be created on disk.
# TODO add more assertions once metrics for scrubber run are added
Contributor

+1

Can we have a version of this test that doesn't change the name?

It'd also be nice to have a test that removes segments, generates the report, and asserts that there are anomalies based on the inventory report

Contributor Author

Yeah, with metrics of the scrub process added it will be easier to make some assertions, right now what happens in the scrubber is a bit of a black box for ducktape, especially asserting why actions were taken by scrubber.

@@ -637,6 +678,10 @@ def test_scrubber(self, cloud_storage_type):
self._produce()
self._assert_no_anomalies()

# Add a randomly generated inv. report to the mix. This will have some keys missing, but scrubbing
Contributor

nit: it doesn't seem to be randomly generated?

Contributor Author

Yeah, the randomness didn't work with the test. I will update the comment.

Lazin
Lazin previously approved these changes Sep 13, 2024
src/v/cloud_storage/inventory/ntp_hashes.cc
@abhijat abhijat force-pushed the feat/inv-scrub/use-hashes-in-scrubber branch 5 times, most recently from 1cf6165 to 66874ca Compare September 20, 2024 09:46
@abhijat
Contributor Author

abhijat commented Sep 20, 2024

looks like the refactor broke some tests

The consumer of the object should be able to query if the hashes have
been loaded from disk.
The ntp hashes class mainly contains movable fields except for retry
chain node. Since the node is used only to set up the logger in the
object, a move-ctor is added which moves everything except for the rtc
node. Crucially, the gate of the hashes object is also moved, so we do
not stop the moved-from hashes object.
A new boolean field denotes if a segment existence check was performed.
When checking for segments within a manifest, the anomaly detector first
tries to load up ntp hashes which could have been placed there by
inventory service.

Once loaded, this data is used as follows:

* for each segment path, first check the inv. data
* if path exists move on. no operation is consumed
* if path missing or collision, make http call
* if path still missing, mark the segment as missing in anomalies

The hash data set is not checked for manifests as these have to be
downloaded for the scrub.

If the hash data set was not loaded, the segment-exists checks are
skipped entirely. This makes sure that we do not make a lot of HEAD
requests when the data set is missing.

There are two caveats here. First, if scrub is enabled but inv-based
scrub is disabled, we always make HTTP calls, because this config is
deemed to be explicit intent that such calls should be made. Second, a
flag to force HTTP calls is provided in the anomaly-detector::run
method. If set, HTTP calls are always made even if the data set is
missing. This is for the benefit of the topic recovery validator, where
we may want to always make HTTP calls even if there is no inv. data.
The ntp hashes query object wraps several booleans which decide whether
api calls should be made, as well as the hashes object itself.

The tests added here exercise the combinations of these booleans derived
from the cluster config as well as input arguments.
The hash writer logic is extracted to utils compilation unit. This way
scrubber/anomaly detector unit tests can use the same logic to prepare
fixtures for testing.
The anomaly detector now expects hashes to be loaded from disk. Helper
functions are added to write a set of segment-meta to disk, while being
able to skip some of these to introduce anomalies.
Two unit tests are added which assert that inventory data is used from
disk, and when segments are missing in the data HTTP calls are made to
check for the segments.
A similar change was previously made to the S3 client but was not
reflected in the ABS client. The caller can now pass a binary payload
directly and skip wrapping it in bytes() during put-object.
Inventory report is added to bucket during scrubber test. This report
fudges the segment names before adding to the report for two purposes:

* at some point during the test we remove a segment and expect an
  anomaly. since we do not know the segment to be removed when
  generating the report, we have to change the name before adding the
  key, otherwise the key is found in the report and the bucket is never
  checked, thus breaking the test.
* we cannot skip adding the segment to report altogether, because if we
  do this, no hash file is generated, and as a safety measure the
  scrubber does not check for segments to avoid making many HEAD
  requests.

In the current state the addition to the test is simply a sanity check.
As more metrics are added this test can be expanded to include more
useful assertions.
@abhijat
Contributor Author

abhijat commented Sep 21, 2024

ducktape failure was #16202

@abhijat abhijat merged commit b3e481b into redpanda-data:dev Sep 23, 2024
18 checks passed
Contributor

@andrwng andrwng left a comment

LGTM thanks for the cosmetic changes!
