(feat) Add metrics persistence with dual storage strategy for Iceberg table operations (fixes #3337) #3385
Conversation
polaris-core/src/main/java/org/apache/polaris/core/storage/aws/AwsSessionTagsBuilder.java
```sql
-- These are assumed to already exist from v3 migration

-- Scan Metrics Report Entity Table
CREATE TABLE IF NOT EXISTS scan_metrics_report (
```
Dual Storage Strategy: We use dedicated scan_metrics_report and commit_metrics_report tables rather than storing metrics as JSON blobs in the existing events table. This enables efficient SQL queries for analytics (e.g., "find slow table scans") and proper indexing on metrics fields, at the cost of a more rigid schema. The metadata JSONB column provides extensibility for future metric fields without requiring schema migrations.
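As a sketch of the analytics this design enables, a query like the following could surface slow scans. The column names here are illustrative assumptions based on the schema described in this PR, not verified against the actual migration:

```sql
-- Hypothetical query: ten slowest scan plans in the last 24 hours.
-- Column names (catalog_name, namespace_name, table_name,
-- total_planning_duration_ms, timestamp_ms) are assumed, not confirmed.
SELECT catalog_name, namespace_name, table_name, total_planning_duration_ms
FROM scan_metrics_report
WHERE timestamp_ms > (EXTRACT(EPOCH FROM now()) * 1000)::bigint - 86400000
ORDER BY total_planning_duration_ms DESC
LIMIT 10;
```

A JSON-blob design in the events table would require extracting these fields at query time, which is why the dedicated tables pay off for this kind of workload.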
```sql
COMMENT ON COLUMN scan_metrics_report.otel_trace_id IS 'OpenTelemetry trace ID from HTTP headers';
COMMENT ON COLUMN scan_metrics_report.report_trace_id IS 'Trace ID from report metadata';

-- Indexes for scan_metrics_report
```
Index Strategy: Indexes are designed around common query patterns: time-range queries (timestamp_ms DESC), table-specific queries (composite on catalog/namespace/table), and trace correlation (otel_trace_id with partial index to skip NULLs). The realm_id index supports multi-tenant queries without requiring it in every composite index.
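A minimal DDL sketch of the strategy described above (index and column names are illustrative, not the actual migration):

```sql
-- Time-range queries, newest first
CREATE INDEX idx_smr_timestamp ON scan_metrics_report (timestamp_ms DESC);
-- Table-specific queries via a composite index
CREATE INDEX idx_smr_table ON scan_metrics_report (catalog_name, namespace_name, table_name);
-- Trace correlation; the partial index skips rows with no trace
CREATE INDEX idx_smr_trace ON scan_metrics_report (otel_trace_id)
  WHERE otel_trace_id IS NOT NULL;
-- Multi-tenant filtering without bloating every composite index
CREATE INDEX idx_smr_realm ON scan_metrics_report (realm_id);
```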
```java
@Produces
@ApplicationScoped
public PolarisMetricsReporter metricsReporter(
```
CDI-based Reporter Selection: The @Identifier annotation pattern allows new reporter implementations to be added without modifying this factory. The composite reporter is assembled dynamically from the targets config, with safeguards against infinite recursion (ignoring "composite" as a target) and graceful fallback to default if no valid targets are resolved.
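The target-resolution logic described above can be sketched in plain Java. Names here (`ReporterTargets`, `resolve`, the known-target list) are illustrative stand-ins, not the actual Polaris factory code:

```java
import java.util.ArrayList;
import java.util.List;

public class ReporterTargets {
  // Hypothetical set of reporter identifiers; the real set comes from CDI beans
  // annotated with @Identifier.
  static final List<String> KNOWN = List.of("default", "events", "persistence");

  // Resolve configured targets: skip "composite" to avoid infinite recursion,
  // drop unknown names, and fall back to "default" if nothing valid remains.
  static List<String> resolve(List<String> configured) {
    List<String> resolved = new ArrayList<>();
    for (String target : configured) {
      if ("composite".equals(target)) continue; // guard against self-reference
      if (KNOWN.contains(target)) resolved.add(target);
    }
    if (resolved.isEmpty()) resolved.add("default"); // graceful fallback
    return resolved;
  }

  public static void main(String[] args) {
    System.out.println(resolve(List.of("events", "composite", "bogus")));
    System.out.println(resolve(List.of()));
  }
}
```

The first call resolves to only `events` (the recursive and unknown targets are dropped); the second falls back to `default`.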
```java
BasePersistence session = metaStoreManagerFactory.getOrCreateSession(realmContext);

// Check if the session is a JdbcBasePersistenceImpl (supports metrics persistence)
if (!(session instanceof JdbcBasePersistenceImpl jdbcPersistence)) {
```
Backend Compatibility Guard: This explicit check for JdbcBasePersistenceImpl ensures the persistence reporter degrades gracefully when running with a non-JDBC backend (e.g., in-memory for testing). The warning log makes misconfiguration obvious in production logs.
```java
.otelSpanId(otelSpanId)
.snapshotId(commitReport.snapshotId())
.sequenceNumber(commitReport.sequenceNumber())
.operation(commitReport.operation() != null ? commitReport.operation() : "UNKNOWN");
```
Iceberg's CommitReport.operation() can be null in some edge cases (e.g., failed commits before operation is determined), but our DB schema defines operation TEXT NOT NULL. We default to "UNKNOWN" to satisfy the constraint while making these cases visible in analytics queries.
```java
String principalName = extractPrincipalName();
String requestId = null;

// Extract OpenTelemetry trace context from the current span
```
Capturing the OpenTelemetry trace/span IDs enables correlating metrics reports with the HTTP request that triggered them. This is essential for debugging slow scans or failed commits by joining metrics data with distributed traces. Span.current() returns the active span from the thread-local context propagated through the Quarkus request handling.
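The capture pattern can be sketched as follows. The real code uses `io.opentelemetry.api.trace.Span.current().getSpanContext()`; the `SpanCtx` record here is a minimal stand-in so the logic runs without the OpenTelemetry dependency. Persisting NULL for an invalid (no-op) span rather than its all-zero IDs is an assumption, consistent with the partial index on `otel_trace_id` skipping NULLs:

```java
public class TraceCapture {
  // Stand-in for io.opentelemetry.api.trace.SpanContext: when no span is
  // active, OpenTelemetry returns an "invalid" context with all-zero IDs.
  record SpanCtx(String traceId, String spanId, boolean valid) {}

  // Return the trace ID for a valid span, or null so the DB column stays NULL.
  static String otelTraceIdOrNull(SpanCtx ctx) {
    return ctx.valid() ? ctx.traceId() : null;
  }

  public static void main(String[] args) {
    SpanCtx active =
        new SpanCtx("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", true);
    SpanCtx noop =
        new SpanCtx("00000000000000000000000000000000", "0000000000000000", false);
    System.out.println(otelTraceIdOrNull(active)); // prints the trace ID
    System.out.println(otelTraceIdOrNull(noop));   // prints "null"
  }
}
```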
```java
 * Scheduled cleanup job that runs at the configured interval. The actual interval is configured
 * via the retention.cleanup-interval property.
 */
@Scheduled(every = "${polaris.iceberg-metrics.reporting.retention.cleanup-interval:6h}")
```
The cleanup interval is injected via config expression ${polaris.iceberg-metrics.reporting.retention.cleanup-interval:6h}. This runs in a Quarkus scheduler thread, not the request thread, so the realm context must be explicitly created (line 143) rather than injected.
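For example, an operator could override the 6h default in `application.properties` (a hypothetical deployment snippet; only the property key comes from this PR):

```properties
# Run metrics retention cleanup hourly instead of the 6h default
polaris.iceberg-metrics.reporting.retention.cleanup-interval=1h
```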
Retest this please. |
…apache#3337)

- Add table metrics persistence layer (JDBC models, schema migrations)
- Add metrics report converters for ScanReport and CommitReport
- Add EventsMetricsReporter for event-based metrics reporting
- Add PersistingMetricsReporter for database persistence
- Add CompositeMetricsReporter for combining multiple reporters
- Add MetricsReportCleanupService for automatic cleanup of old reports
- Add PolarisPersistenceEventListener for metrics event handling
- Add comprehensive test coverage
- Add telemetry documentation
461cfde to 2fc2a20
Checklist

- CHANGELOG.md (if needed)
- site/content/in-dev/unreleased (if needed)

Overview
Implements comprehensive metrics persistence for Iceberg table operations with support for both unified audit trail and analytics-optimized storage strategies. Enables tracking and analysis of scan and commit metrics from compute engines (Spark, Trino, Flink) with full OpenTelemetry trace context integration.
There is a document that explains the overall plan for supporting end to end auditing here: https://docs.google.com/document/d/1Ehzvi5RNPs4hChkBFI6VD23myEqm-7sWW3d2kjmuYj8/edit?tab=t.0
Fixes #3337
Key Features
Metrics Persistence
Implementation Details
New Components
- PersistingMetricsReporter: Persists to dedicated scan/commit metrics tables
- EventsMetricsReporter: Persists to events table as JSON
- CompositeMetricsReporter: Multi-destination reporting
- MetricsReportCleanupService: Automated retention management
- MetricsReportingConfiguration: Flexible configuration interface

Database Schema
- scan_metrics_report: 30+ columns for scan metrics with indexes on timestamp, table, trace_id
- commit_metrics_report: 30+ columns for commit metrics with indexes on timestamp, table, operation

JDBC Persistence Layer
- writeScanMetricsReport(), writeCommitMetricsReport()
- deleteAllMetricsReportsOlderThan()

Event System Integration
- AfterReportMetricsEvent: Emitted after metrics reports are processed
- PolarisPersistenceEventListener: Extracts and persists metrics data
- InMemoryBufferEventListener: Batch writes for performance

Configuration
Testing
Breaking Changes
None. All changes are additive with backward compatibility maintained.
Documentation
site/content/in-dev/unreleased/telemetry.md