
CI Failure (500 Server Error) in IdempotencySnapshotDelivery.test_recovery_after_snapshot_is_delivered #16202

Closed
vbotbuildovich opened this issue Jan 20, 2024 · 7 comments
Labels
area/storage · auto-triaged · ci-failure · sev/low

Comments


vbotbuildovich commented Jan 20, 2024

https://buildkite.com/redpanda/redpanda/builds/43869

Module: rptest.tests.idempotency_test
Class: IdempotencySnapshotDelivery
Method: test_recovery_after_snapshot_is_delivered
test_id:    IdempotencySnapshotDelivery.test_recovery_after_snapshot_is_delivered
status:     FAIL
run time:   105.788 seconds

HTTPError('500 Server Error: Internal Server Error for url: http://docker-rp-12:9644/v1/cloud_storage/reset_scrubbing_metadata/kafka/topic-mmltvskecb/0')
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 184, in _do_run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 269, in run_test
    return self.test_context.function(self.test)
  File "/root/tests/rptest/services/cluster.py", line 159, in wrapped
    self.redpanda.maybe_do_internal_scrub()
  File "/root/tests/rptest/services/redpanda.py", line 3955, in maybe_do_internal_scrub
    results = self.wait_for_internal_scrub(cloud_partitions)
  File "/root/tests/rptest/services/redpanda.py", line 4062, in wait_for_internal_scrub
    self._admin.reset_scrubbing_metadata(
  File "/root/tests/rptest/services/admin.py", line 1170, in reset_scrubbing_metadata
    return self._request(
  File "/root/tests/rptest/services/admin.py", line 363, in _request
    r.raise_for_status()
  File "/usr/local/lib/python3.10/dist-packages/requests/models.py", line 941, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: http://docker-rp-12:9644/v1/cloud_storage/reset_scrubbing_metadata/kafka/topic-mmltvskecb/0

JIRA Link: CORE-1729

@vbotbuildovich added the auto-triaged and ci-failure labels Jan 20, 2024
@dotnwat dotnwat changed the title CI Failure (key symptom) in IdempotencySnapshotDelivery.test_recovery_after_snapshot_is_delivered CI Failure (500 Server Error) in IdempotencySnapshotDelivery.test_recovery_after_snapshot_is_delivered Jan 20, 2024

andrwng commented Jan 22, 2024

TRACE 2024-01-18 06:41:20,649 [shard 0:admi] admin_api_server - server.cc:615 - Attempting to audit authn for /v1/cloud_storage/reset_scrubbing_metadata/kafka/topic-mmltvskecb/0
TRACE 2024-01-18 06:41:20,649 [shard 0:admi] admin_api_server - server.cc:571 - Attempting to audit authz for /v1/cloud_storage/reset_scrubbing_metadata/kafka/topic-mmltvskecb/0
TRACE 2024-01-18 06:41:20,649 [shard 1:au  ] http - [/5a8073d5/kafka/topic-mmltvskecb/0_24/41009-46188-1098169-1-v1.log.2] - client.cc:89 - client.make_request HEAD /5a8073d5/kafka/topic-mmltvskecb/0_24/41009-46188-1098169-1-v1.log.2 HTTP/1.1
User-Agent: redpanda.vectorized.io
Host: panda-bucket-5bcff224-b5cc-11ee-bbd3-0242ac10101c.minio-s3
Content-Length: 0
x-amz-date: 20240118T064120Z
x-amz-content-sha256: [secret]
Authorization: [secret]


DEBUG 2024-01-18 06:41:20,650 [shard 0:admi] admin_api_server - server.cc:647 - [admin] POST http://docker-rp-12:9644/v1/cloud_storage/reset_scrubbing_metadata/kafka/topic-mmltvskecb/0
DEBUG 2024-01-18 06:41:20,650 [shard 1:au  ] http - [/5a8073d5/kafka/topic-mmltvskecb/0_24/41009-46188-1098169-1-v1.log.2] - client.cc:101 - reusing connection, age 1389585
TRACE 2024-01-18 06:41:20,650 [shard 1:au  ] http - /5a8073d5/kafka/topic-mmltvskecb/0_24/41009-46188-1098169-1-v1.log.2 - client.cc:434 - request_stream.send_some 0
DEBUG 2024-01-18 06:41:20,650 [shard 1:admi] cluster - ntp: {kafka/topic-mmltvskecb/0} - archival_metadata_stm.cc:411 - command_batch_builder::replicate called
WARN  2024-01-18 06:41:20,651 [shard 1:admi] archival - [fiber4 kafka/topic-mmltvskecb/0] - ntp_archiver_service.cc:603 - Failed to replicate reset scrubbing metadata command: Current node is not a leader for partition
TRACE 2024-01-18 06:41:20,652 [shard 1:au  ] http - /5a8073d5/kafka/topic-mmltvskecb/0_24/41009-46188-1098169-1-v1.log.2 - client.cc:296 - chunk received, chunk length 485
ERROR 2024-01-18 06:41:20,652 [shard 0:admi] admin_api_server - server.cc:680 - [admin] exception intercepted - url: [http://docker-rp-12:9644/v1/cloud_storage/reset_scrubbing_metadata/kafka/topic-mmltvskecb/0] http_return_status[500] reason - seastar::httpd::server_error_exception (Failed to replicate or apply scrubber metadata reset command)

Looks like there was a leadership change just as the reset-scrubbing-metadata command came in, so the replicate step failed with "Current node is not a leader for partition" and the admin server surfaced it as a 500.
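A straightforward mitigation on the test-harness side would be to treat this 500 as transient and retry the reset call until leadership settles. Below is a minimal sketch of that idea; the helper name, exception type, and retry policy are hypothetical illustrations, not the actual rptest code:

```python
import time


class TransientAdminError(Exception):
    """Hypothetical stand-in for the HTTP 500 the admin API returns
    while partition leadership is in flux."""


def retry_on_leadership_change(request_fn, attempts=3, backoff_s=0.0):
    """Call request_fn, retrying a few times on transient admin errors.

    A reset_scrubbing_metadata request that races a leadership change
    fails with "Current node is not a leader for partition"; retrying
    after the new leader is elected should succeed.
    """
    last_exc = None
    for _ in range(attempts):
        try:
            return request_fn()
        except TransientAdminError as exc:
            last_exc = exc
            time.sleep(backoff_s)
    raise last_exc


# Simulated admin call: fails twice while leadership is in flux,
# then succeeds once a leader is available.
calls = {"n": 0}


def fake_reset_scrubbing_metadata():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientAdminError("500: not a leader for partition")
    return "ok"
```

An equivalent server-side fix would be to map this specific failure to a 503 with a retry hint rather than a bare 500, so callers can distinguish a transient leadership race from a genuine internal error.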

@andrwng added the sev/low label Jan 22, 2024
@dotnwat added the area/storage and sev/low labels and removed the sev/low label Apr 4, 2024
@piyushredpanda
Contributor

Not seen in at least two months; closing.

@abhijat
Contributor

abhijat commented Sep 21, 2024

DEBUG 2024-09-20 12:07:38,835 [shard 0:admi] cluster - ntp: {kafka/__consumer_offsets/12} - archival_metadata_stm.cc:478 - command_batch_builder::replicate called
WARN  2024-09-20 12:07:38,835 [shard 0:admi] archival - [fiber27 kafka/__consumer_offsets/12] - ntp_archiver_service.cc:622 - Failed to replicate reset scrubbing metadata command: Current node is not a leader for partition
ERROR 2024-09-20 12:07:38,835 [shard 0:admi] admin_api_server - server.cc:657 - [admin] exception intercepted - url: [http://docker-rp-13:9644/v1/cloud_storage/reset_scrubbing_metadata/kafka/__consumer_offsets/12] http_return_status[500] reason - seastar::httpd::server_error_exception (Failed to replicate or apply scrubber metadata reset command)

This occurrence was on the __consumer_offsets topic, with the same leadership-race symptom.

@piyushredpanda
Contributor

Closing older-bot-filed CI issues as we transition to a more reliable system.

5 participants