Description
Improve Watermark Semantics in Gluten Flink
Background
Gluten Flink currently supports the basic propagation of watermark timestamps through the native execution path, but it does not yet fully match Flink's runtime watermark semantics. The current implementation can emit native StatefulWatermark events and translate them into Flink Watermark events, but several parts of the event-time runtime contract are either incomplete or missing.
This issue tracks the work required to gradually improve watermark support in Gluten Flink while minimizing risk. The proposed approach is to start with low-risk Java-side runtime fixes, then move to native operator semantics, and finally address source-level coordination and watermark alignment.
Goals
- Preserve existing data record routing semantics unless a task explicitly changes them.
- Improve watermark observability and runtime behavior step by step.
- Align Gluten Flink watermark behavior with Flink's event-time semantics.
- Add unit and end-to-end tests for each milestone.
Tasks
Complete the basic watermark event path
- [FLINK] Update Watermark Gauge in Gluten Output Collector #12341
- GlutenOutputCollector.emitWatermark(...) to update WatermarkGauge.
- GlutenOutputCollector.collect(...) to update the gauge before forwarding watermark elements.
Implement WatermarkStatus propagation
- emitWatermarkStatus(WatermarkStatus) and broadcast it to all downstream outputs.
- processWatermarkStatus support in GlutenOneInputOperator and GlutenTwoInputOperator.
- notifyWatermarkStatus(id, status[, inputIndex]) to velox4j JNI.
- processWatermarkStatus to native StatefulTask and StatefulOperator.
- CombinedWatermarkStatus to support active/idle state updates.
- IDLE and ACTIVE statuses can pass through the Gluten operator chain.
Implement idle source handling
- idleTimeout_ in native WatermarkAssigner.
- WatermarkStatus.IDLE when the source becomes idle.
- WatermarkStatus.ACTIVE when new data arrives after an idle period.
Implement periodic watermark emission
- AUTO_WATERMARK_INTERVAL semantics.
- onPeriodicEmit.
- maxTimestampSeen and lastEmittedWatermark in WatermarkAssigner.
- WatermarkGenerator.onPeriodicEmit behavior.
Fix null semantics for watermark expressions
- timestampVector.
- WATERMARK FOR ts AS CASE WHEN ... THEN ts - INTERVAL ... ELSE NULL END.
Complete end-of-input and final watermark handling
- finishInput(inputIndex) and finishInput() APIs.
- Watermark.MAX_WATERMARK.
- WindowAggregator::close().
- LocalWindowAggregator, WindowAggregator, and WindowJoin.
Complete source-level watermark support
- GlutenSourceFunction.
- SourceCoordinator semantics.
Implement watermark alignment
- SourceCoordinator and only offload reader-side data reading to native.
Complete control event and multi-output semantics
- LatencyMarker forwarding.
- RecordAttributes forwarding.
- throw new RuntimeException("Not implemented for gluten") for unsupported stream events when the plan can be rejected earlier.
Add tests and acceptance coverage
- WatermarkAssigner periodic emission.
- CombinedWatermarkStatus active/idle behavior.
Gluten version
None
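Illustrative sketch
For discussion, the periodic emission and idle source tasks above could look roughly like the following. This is a minimal, self-contained Java sketch and not the Gluten or velox4j implementation. The class `PeriodicWatermarkSketch`, its `Assigner` type, and the `delayMs` and `idleTimeoutMs` parameters are hypothetical names; only `maxTimestampSeen`, `lastEmittedWatermark`, and `onPeriodicEmit` are taken from the task list. It assumes a fixed-delay watermark expression of the form `ts - INTERVAL ...` and non-negative millisecond timestamps.

```java
import java.util.OptionalLong;

/** Hypothetical sketch of periodic watermark emission with idle detection. */
public class PeriodicWatermarkSketch {

    public static final class Assigner {
        private final long delayMs;        // assumed fixed delay, as in "ts - INTERVAL ..."
        private final long idleTimeoutMs;  // values <= 0 disable idle detection
        private long maxTimestampSeen = Long.MIN_VALUE;
        private long lastEmittedWatermark = Long.MIN_VALUE;
        private long lastEventTimeMs;      // processing time of the most recent record
        private boolean idle = false;

        public Assigner(long delayMs, long idleTimeoutMs, long startTimeMs) {
            this.delayMs = delayMs;
            this.idleTimeoutMs = idleTimeoutMs;
            this.lastEventTimeMs = startTimeMs;
        }

        /** Per record: update state only. Returns true if the input just became ACTIVE again. */
        public boolean onEvent(long eventTimestampMs, long nowMs) {
            maxTimestampSeen = Math.max(maxTimestampSeen, eventTimestampMs);
            lastEventTimeMs = nowMs;
            boolean becameActive = idle;
            idle = false;
            return becameActive;
        }

        /** Per watermark interval: emit only if the watermark would advance. */
        public OptionalLong onPeriodicEmit() {
            if (maxTimestampSeen == Long.MIN_VALUE) {
                return OptionalLong.empty(); // no record seen yet
            }
            long candidate = maxTimestampSeen - delayMs;
            if (candidate > lastEmittedWatermark) {
                lastEmittedWatermark = candidate;
                return OptionalLong.of(candidate);
            }
            return OptionalLong.empty();
        }

        /** Per timer tick: returns true if the input just became IDLE. */
        public boolean checkIdle(long nowMs) {
            if (idleTimeoutMs > 0 && !idle && nowMs - lastEventTimeMs >= idleTimeoutMs) {
                idle = true;
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Assigner assigner = new Assigner(5, 100, 0);
        assigner.onEvent(1000, 10);
        assigner.onEvent(990, 20); // out-of-order record does not lower the maximum
        System.out.println(assigner.onPeriodicEmit()); // watermark 995
        System.out.println(assigner.onPeriodicEmit()); // empty, nothing advanced
        System.out.println(assigner.checkIdle(120));   // true, 100 ms without records
        System.out.println(assigner.onEvent(1010, 140)); // true, back to ACTIVE
    }
}
```

The point of the sketch is that records only update `maxTimestampSeen`, while the watermark is emitted from the periodic callback and only when it exceeds `lastEmittedWatermark`, so watermarks stay monotonic. Idle and active transitions are reported as return values so the caller can translate them into WatermarkStatus events.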