redpanda-data · Deflaimun · Dec 9, 2024 · Dec 9, 2024 · Dec 9, 2024 · Dec 11, 2024
@@ -969,182 +969,304 @@ The number of transform processors in a specific state (running, inactive, error
 
 == Cloud storage metrics
 
-include::reference:partial$public_metrics_tip.adoc[]
-
 ifndef::env-cloud[]
-NOTE: Cloud storage metrics are only available if you have:
 
-- xref:manage:tiered-storage.adoc[] enabled
-- The cluster property xref:reference:properties/object-storage-properties.adoc#cloud_storage_enabled[cloud_storage_enabled] set to `true`
+[NOTE]
+====
+Cloud storage metrics are only available if you have:
+
+* xref:manage:tiered-storage.adoc[Tiered Storage] enabled
+* The cluster property xref:reference:properties/object-storage-properties.adoc#cloud_storage_enabled[cloud_storage_enabled] set to `true`
+====
 endif::[]
 
-=== redpanda_cloud_storage_cache_space_size_bytes
+=== redpanda_cloud_storage_active_segments
 
-Sum of size of cached objects.
+Number of remote log segments currently hydrated for read.
 
-=== redpanda_cloud_storage_housekeeping_drains
+*Type*: gauge
 
-Number of times upload housekeeping queue was drained.
+=== redpanda_cloud_storage_anomalies
 
-=== redpanda_cloud_storage_spillover_manifests_materialized_bytes
+Number of missing partition manifest anomalies for the topic.
 
-Bytes of memory used for spilled manifests currently cached in memory.
+*Type*: gauge
+
+=== redpanda_cloud_storage_cache_op_hit
+
+Number of get requests for objects that are already in cache.
+
+*Type*: counter
+
+=== redpanda_cloud_storage_cache_op_in_progress_files
+
+Number of files that are being put into cache.
+
+*Type*: gauge
+
+=== redpanda_cloud_storage_cache_op_miss
+
+Number of get requests that failed because the object was missing from the cache.
+
+*Type*: counter
 
 === redpanda_cloud_storage_cache_op_put
 
 Number of objects written into cache.
 
-=== redpanda_cloud_storage_segments
+*Type*: counter
 
-Total number of accounted topic segments in the cloud.
+=== redpanda_cloud_storage_cache_space_files
 
-=== redpanda_cloud_storage_jobs_local_segment_reuploads
+Number of objects in cache.
 
-Number of segment reuploads from local data directory.
+*Type*: gauge
 
-=== redpanda_cloud_storage_cache_trim_failed_trims
+=== redpanda_cloud_storage_cache_space_hwm_files
+
+High watermark for number of objects in cache.
+
+*Type*: gauge
+
+=== redpanda_cloud_storage_cache_space_hwm_size_bytes
+
+High watermark for sum of size of cached objects.
+
+*Type*: gauge
+
+=== redpanda_cloud_storage_cache_space_size_bytes
+
+Sum of size of cached objects.
+
+*Type*: gauge
+
+=== redpanda_cloud_storage_cache_space_tracker_size
+
+Number of entries in cache access tracker.
+
+*Type*: gauge
 
-Number of times Redpanda could not free the expected amount of space, indicating possible bug or configuration issue.
+=== redpanda_cloud_storage_cache_space_tracker_syncs
+
+Number of times the access tracker has been updated with cache disk data.
+
+*Type*: counter
+
+=== redpanda_cloud_storage_cache_trim_carryover_trims
+
+Number of times carryover trim has been invoked.
+
+*Type*: counter
 
 === redpanda_cloud_storage_cache_trim_exhaustive_trims
 
-Number of times a fast trim could not free enough space and had to fall back to a slower exhaustive trim.
+Number of times a fast trim could not free enough space and the cache had to fall back to a slower exhaustive trim.
+
+*Type*: counter
+
+=== redpanda_cloud_storage_cache_trim_failed_trims
+
+Number of times the expected amount of space could not be freed up, indicating a possible bug or configuration issue.
+
+*Type*: counter
+
+=== redpanda_cloud_storage_cache_trim_fast_trims
+
+Number of times the cache has been trimmed using the normal (fast) mode.
+
+*Type*: counter
+
+=== redpanda_cloud_storage_cache_trim_in_mem_trims
+
+Number of times the cache has been trimmed using the in-memory access tracker.
+
+*Type*: counter
+
+=== redpanda_cloud_storage_cloud_log_size
+
+Total size in bytes of the user-visible log for the topic.
+
+*Type*: gauge
 
 === redpanda_cloud_storage_deleted_segments
 
-Count of deleted remote segments.
+Number of segments that have been deleted from S3 for the topic. This value may grow because of retention, or because non-compacted segments are replaced with their compacted equivalents.
 
-=== redpanda_cloud_storage_segment_uploads_total
+*Type*: counter
 
-Successful data segment uploads.
+=== redpanda_cloud_storage_errors_total
 
-=== redpanda_cloud_storage_active_segments
+Number of transmit errors.
 
-Number of remote log segments currently hydrated for read.
+*Type*: counter
 
-=== redpanda_cloud_storage_cache_trim_fast_trims
+=== redpanda_cloud_storage_housekeeping_drains
+
+Number of times the upload housekeeping queue was drained.
+
+*Type*: gauge
+
+=== redpanda_cloud_storage_housekeeping_jobs_completed
 
-Number of times Redpanda trimmed the cache using the normal (fast) mode.
+Number of executed housekeeping jobs.
+
+*Type*: counter
 
 === redpanda_cloud_storage_housekeeping_jobs_failed
 
 Number of failed housekeeping jobs.
 
-=== redpanda_cloud_storage_partition_readers_delayed
+*Type*: counter
 
-How many partition readers were delayed due to hitting reader limit. This indicates cluster is saturated with Tiered Storage reads.
+=== redpanda_cloud_storage_housekeeping_jobs_skipped
 
-=== redpanda_cloud_storage_segments_pending_deletion
+Number of skipped housekeeping jobs.
 
-Total number of topic segments pending deletion from the cloud.
+*Type*: counter
 
-=== redpanda_cloud_storage_housekeeping_rounds
+=== redpanda_cloud_storage_housekeeping_pauses
 
-Number of upload housekeeping rounds.
+Number of times upload housekeeping was paused.
 
-=== redpanda_cloud_storage_segment_readers_delayed
+*Type*: gauge
 
-Number of segment readers delayed due to hitting reader limit. This indicates cluster is saturated with Tiered Storage reads.
+=== redpanda_cloud_storage_housekeeping_requests_throttled_average_rate
 
-=== redpanda_cloud_storage_cache_space_hwm_size_bytes
+Average rate, per shard, of requests from the read and write paths that were throttled by Tiered Storage.
 
-High watermark of sum of size of cached objects.
+*Type*: gauge
 
-=== redpanda_cloud_storage_cache_space_hwm_files
+=== redpanda_cloud_storage_housekeeping_resumes
+
+Number of times upload housekeeping was resumed.
 
-High watermark of number of objects in cache.
+*Type*: gauge
 
-=== redpanda_cloud_storage_cache_op_in_progress_files
+=== redpanda_cloud_storage_housekeeping_rounds
 
-Number of files that are being added to cache.
+Number of upload housekeeping rounds.
+
+*Type*: counter
 
 === redpanda_cloud_storage_jobs_cloud_segment_reuploads
 
 Number of segment reuploads from cloud storage sources (cloud storage cache or direct download from cloud storage).
 
+*Type*: gauge
+
+=== redpanda_cloud_storage_jobs_local_segment_reuploads
+
+Number of segment reuploads from the local data directory.
+
+*Type*: gauge
+
 === redpanda_cloud_storage_jobs_manifest_reuploads
 
 Number of manifest reuploads performed by all housekeeping jobs.
 
-=== redpanda_cloud_storage_housekeeping_pauses
-
-Number of times upload housekeeping was paused.
+*Type*: gauge
 
-=== redpanda_cloud_storage_segment_index_uploads_total
+=== redpanda_cloud_storage_jobs_metadata_syncs
 
-Successful segment index uploads.
+Number of archival configuration updates performed by all housekeeping jobs.
 
-=== redpanda_cloud_storage_cache_op_miss
+*Type*: gauge
 
-Number of failed get requests because of missing object in the cache.
+=== redpanda_cloud_storage_jobs_segment_deletions
 
-=== redpanda_cloud_storage_errors_total
+Number of segments deleted by all housekeeping jobs.
 
-Number of transmit errors.
+*Type*: gauge
 
-=== redpanda_cloud_storage_spillover_manifest_uploads_total
+=== redpanda_cloud_storage_limits_downloads_throttled_sum
 
-Successful spillover manifest uploads.
+Total amount of time, in milliseconds, that downloads were throttled.
 
-=== redpanda_cloud_storage_housekeeping_requests_throttled_average_rate
+*Type*: counter
 
-Average rate per shard of requests from the read and write path that were throttled by Tiered Storage.
+=== redpanda_cloud_storage_partition_manifest_uploads_total
 
-=== redpanda_cloud_storage_jobs_segment_deletions
+Number of successful partition manifest uploads.
 
-Number of segments deleted by all housekeeping jobs.
+*Type*: counter
 
-=== redpanda_cloud_storage_segment_materializations_delayed
+=== redpanda_cloud_storage_partition_readers
 
-Number of segment materializations delayed due to hitting reader limit. This indicates cluster is saturated with Tiered Storage reads.
+Number of partition reader instances (number of current fetch/timequery requests reading from Tiered Storage).
 
-=== redpanda_cloud_storage_jobs_metadata_syncs
+*Type*: gauge
 
-Number of archival configuration updates performed by all housekeeping jobs.
+=== redpanda_cloud_storage_partition_readers_delayed
 
-=== redpanda_cloud_storage_housekeeping_jobs_completed
+Number of partition reads that were delayed because the reader limit was reached. This indicates that the cluster is saturated with Tiered Storage reads.
 
-Number of successful housekeeping jobs.
+*Type*: counter
 
 === redpanda_cloud_storage_readers
 
 Number of segment read cursors for hydrated remote log segments.
 
-=== redpanda_cloud_storage_partition_manifest_uploads_total
+*Type*: gauge
 
-Successful partition manifest uploads.
+=== redpanda_cloud_storage_segment_index_uploads_total
 
-=== redpanda_cloud_storage_limits_downloads_throttled_sum
+Number of successful segment index uploads.
 
-Total amount of throttling applied to cloud storage downloads.
+*Type*: counter
 
-=== redpanda_cloud_storage_housekeeping_resumes
+=== redpanda_cloud_storage_segment_materializations_delayed
 
-Number of times upload housekeeping was resumed.
+Number of segment materializations that were delayed because the reader limit was reached. This indicates that the cluster is saturated with Tiered Storage reads.
 
-=== redpanda_cloud_storage_cache_op_hit
+*Type*: counter
 
-Number of get requests for objects that are already in cache.
+=== redpanda_cloud_storage_segment_readers_delayed
 
-=== redpanda_cloud_storage_spillover_manifests_materialized_count
+Number of segment readers that were delayed because the reader limit was reached. This indicates that the cluster is saturated with Tiered Storage reads.
 
-Number of spilled manifests currently cached in memory.
+*Type*: counter
 
-=== redpanda_cloud_storage_uploaded_bytes
+=== redpanda_cloud_storage_segment_uploads_total
 
-Total number of uploaded bytes for the topic.
+Number of successful data segment uploads.
 
-=== redpanda_cloud_storage_cache_space_files
+*Type*: counter
 
-Number of objects in cache.
+=== redpanda_cloud_storage_segments
 
-=== redpanda_cloud_storage_housekeeping_jobs_skipped
+Total number of accounted segments in the cloud for the topic.
 
-Number of skipped housekeeping jobs.
+*Type*: gauge
 
-=== redpanda_cloud_storage_partition_readers
+=== redpanda_cloud_storage_segments_pending_deletion
+
+Total number of segments pending deletion from the cloud for the topic.
+
+*Type*: gauge
+
+=== redpanda_cloud_storage_spillover_manifest_uploads_total
 
-Number of partition reader instances, based on the number of current fetch/timequery requests reading from Tiered Storage.
+Number of successful spillover manifest uploads.
+
+*Type*: counter
+
+=== redpanda_cloud_storage_spillover_manifests_materialized_bytes
+
+Bytes of memory used for spilled manifests currently cached in memory.
+
+*Type*: gauge
+
+=== redpanda_cloud_storage_spillover_manifests_materialized_count
+
+Number of spilled manifests that are currently cached in memory.
+
+*Type*: gauge
+
+=== redpanda_cloud_storage_uploaded_bytes
+
+Total number of uploaded bytes for the topic.
+
+*Type*: counter
 
 == Related topics