Skip to content

Add io.trino.spi.connector.ConnectorSplitSource#getMetrics #25770

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
May 15, 2025

Conversation

raunaqmorarka
Copy link
Member

@raunaqmorarka raunaqmorarka commented May 12, 2025

Description

Add new SPI io.trino.spi.connector.ConnectorSplitSource#getMetrics
This allows connectors to expose detailed metrics from splits generation.
EXPLAIN ANALYZE VERBOSE is modified to print these metrics.
Added an implementation in iceberg to collect metrics from iceberg metadata scan.
Example output

ScanFilterProject[table = iceberg:default.lineitem$data@5213842060806047048, filterPredicate = (l_orderkey = bigint '3423110')]
    connector metrics:
      'ParquetReaderCompressionFormat_ZSTD' = LongCount{total=9808205}
    splits generation metrics:
      'dataFileSizeBytes' = LongCount{total=394528676}
      'dataFiles' = LongCount{total=8}
      'dataManifests' = LongCount{total=1}
      'deleteFileSizeBytes' = LongCount{total=2370}
      'deleteManifests' = LongCount{total=1}
      'equalityDeleteFiles' = LongCount{total=0}
      'positionalDeleteFiles' = LongCount{total=2}
      'scanPlanningDuration' = {duration=7.00ms}

Remove SplitOperatorInfo and ConnectorSplit#getSplitInfo, this
avoids the need to send splits info from workers to coordinator.

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Iceberg
* Show detailed metrics from splits generation in output of EXPLAIN ANALYZE VERBOSE. ({issue}`25770`)

@cla-bot cla-bot bot added the cla-signed label May 12, 2025
@github-actions github-actions bot added hudi Hudi connector iceberg Iceberg connector delta-lake Delta Lake connector hive Hive connector bigquery BigQuery connector mongodb MongoDB connector cassandra Cassandra connector blackhole Blackhole connector elasticsearch Elasticsearch connector google-sheets Google Sheets connector kafka Kafka connector memory Memory connector opensearch OpenSearch connector pinot Pinot connector prometheus Prometheus connector redis Redis connector redshift Redshift connector labels May 12, 2025
@raunaqmorarka raunaqmorarka requested review from losipiuk, wendigo, Copilot, ebyhr and lukasz-stec and removed request for losipiuk May 12, 2025 09:51
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a new SPI method, ConnectorSplitSource#getMetrics, to capture detailed metrics during splits generation and integrates these metrics into various scheduling and execution components. Key changes include:

  • Propagation of an optional source identifier (sourceId) in OperatorStats and OperatorContext.
  • Replacement of timing callbacks with a new SplitSourceMetricsRecorder across multiple scheduling/execution stages.
  • Removal of SplitOperatorInfo and a shift from custom formatting of split info to using default toString() methods.

Reviewed Changes

Copilot reviewed 91 out of 91 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
core/trino-main/src/main/java/io/trino/operator/ScanFilterAndProjectOperator.java Updated operator context instantiation to include sourceId.
core/trino-main/src/main/java/io/trino/operator/OperatorStats.java Introduced sourceId, added getter, and implemented addConnectorSplitSourceMetrics for merging metrics.
core/trino-main/src/main/java/io/trino/operator/OperatorInfo.java Removed obsolete SplitOperatorInfo mapping.
core/trino-main/src/main/java/io/trino/operator/OperatorContext.java Updated constructor and field to include sourceId.
core/trino-main/src/main/java/io/trino/operator/DriverContext.java Added overloaded addOperatorContext accepting sourceId.
core/trino-main/src/main/java/io/trino/metadata/Split.java Removed getInfo method.
core/trino-main/src/main/java/io/trino/execution/scheduler/faulttolerant/* Changed split time recorder to use metricsRecorder.
core/trino-main/src/main/java/io/trino/execution/* Updated methods to record and propagate split source metrics.
core/trino-main/src/main/java/io/trino/connector/* Removed or updated split info formatting methods.
Comments suppressed due to low confidence (1)

core/trino-main/src/main/java/io/trino/execution/SqlTaskExecution.java:897

  • [nitpick] Consider verifying that the default toString() implementation of the split provides sufficient detail for debugging purposes, or implement a custom formatting method if more clarity is required.
return (partitionedSplit == null) ? "" : partitionedSplit.getSplit().toString();

@raunaqmorarka raunaqmorarka requested a review from a team May 12, 2025 09:55
@raunaqmorarka raunaqmorarka force-pushed the split-source-metrics branch 3 times, most recently from ffca7c7 to 1902d67 Compare May 12, 2025 11:06
@wendigo
Copy link
Contributor

wendigo commented May 14, 2025

Overall LGTM

Allows connectors to expose detailed metrics from splits generation.
These metrics are exposed in StageStats which is part of queryinfo.
EXPLAIN ANALYZE VERBOSE is modified to print these metrics.
Example output
ScanFilterProject[table = iceberg:default.lineitem$data@5213842060806047048, filterPredicate = (l_orderkey = bigint '3423110')]
    connector metrics:
      'ParquetReaderCompressionFormat_ZSTD' = LongCount{total=9808205}
    splits generation metrics:
      'dataFileSizeBytes' = LongCount{total=394528676}
      'dataFiles' = LongCount{total=8}
      'dataManifests' = LongCount{total=1}
      'deleteFileSizeBytes' = LongCount{total=2370}
      'deleteManifests' = LongCount{total=1}
      'equalityDeleteFiles' = LongCount{total=0}
      'positionalDeleteFiles' = LongCount{total=2}
      'scanPlanningDuration' = {duration=7.00ms}
This is not needed anymore after adding split source metrics
Removing this also avoids unnecessary fetching of splits info
from worker to coordinator
This is no longer needed as toString suffices to print debug info on workers
@raunaqmorarka raunaqmorarka force-pushed the split-source-metrics branch from 1902d67 to 732cc7b Compare May 15, 2025 05:00
@raunaqmorarka raunaqmorarka requested a review from Copilot May 15, 2025 05:01
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a new SPI method, ConnectorSplitSource#getMetrics, to allow connectors to expose detailed metrics from splits generation and update various components to record and display these metrics. Key changes include:

  • Injecting an Optional sourceId into operator contexts and stats to support metric recording.
  • Replacing the existing split time recording with a metrics‐based approach throughout scheduling and execution.
  • Removing SplitOperatorInfo and updating split info formatting.

Reviewed Changes

Copilot reviewed 91 out of 91 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
ScanFilterAndProjectOperator.java Updated operator context creation to pass Optional sourceId.
OperatorStats.java Added sourceId field and withConnectorSplitSourceMetrics method to merge connector split metrics.
OperatorInfo.java Removed deprecated SplitOperatorInfo registration.
OperatorContext.java Injected Optional sourceId into constructors.
DriverContext.java Added overload to support Optional sourceId when adding an operator context.
Split.java Removed getInfo() method to deprecate split info representation.
SplitSourceMetricsRecorder.java Introduced new interface for recording metrics.
EventDrivenTaskSourceFactory.java, EventDrivenTaskSource.java Replaced BiConsumer time recorder with a metrics recorder, updating corresponding references.
StageExecution.java, SourcePartitionedScheduler.java, PipelinedStageExecution.java, StageStats.java, StageStateMachine.java, SqlTaskExecution.java, SqlStage.java, QueryStateMachine.java Updated methods to record and propagate split source metrics.
SystemSplit.java, InformationSchemaSplit.java Changed the split info formatting to rely on toString().
Comments suppressed due to low confidence (1)

core/trino-main/src/main/java/io/trino/execution/scheduler/StageStateMachine.java:753

  • The current implementation overwrites split source metrics for a given node; if multiple updates are expected, consider merging the metrics instead of replacing them.
splitSourceMetrics.put(nodeId, metrics);

@raunaqmorarka raunaqmorarka merged commit d45b32e into master May 15, 2025
105 of 107 checks passed
@raunaqmorarka raunaqmorarka deleted the split-source-metrics branch May 15, 2025 06:03
@github-actions github-actions bot added this to the 476 milestone May 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bigquery BigQuery connector blackhole Blackhole connector cassandra Cassandra connector cla-signed delta-lake Delta Lake connector elasticsearch Elasticsearch connector google-sheets Google Sheets connector hive Hive connector hudi Hudi connector iceberg Iceberg connector kafka Kafka connector memory Memory connector mongodb MongoDB connector opensearch OpenSearch connector pinot Pinot connector prometheus Prometheus connector redis Redis connector redshift Redshift connector
Development

Successfully merging this pull request may close these issues.

2 participants