Skip to content

Support null_treatment, distinct, and filter for window functions in proto #17417

@geoffreyclaude

Description

@geoffreyclaude

Is your feature request related to a problem or challenge?

  • Problem: When serializing logical plans to proto, Expr::WindowFunction drops null_treatment, distinct, and filter. This prevents round-tripping plans that use SQL features like IGNORE NULLS, DISTINCT in window aggregates, or FILTER (WHERE ...), breaking remote execution and plan persistence.
  • Evidence:
       Expr::WindowFunction(window_fun) => {
           let expr::WindowFunction {
               ref fun,
               params:
                   expr::WindowFunctionParams {
                       ref args,
                       ref partition_by,
                       ref order_by,
                       ref window_frame,
                       // TODO: support null treatment in proto
                       null_treatment: _,
                       distinct: _,
                       filter: _,
                   },
           } = window_fun.as_ref();
  • Impact: Plans using queries such as:
    • last_value(x) IGNORE NULLS OVER (...)
    • count(DISTINCT x) OVER (PARTITION BY ...)
    • sum(x) FILTER (WHERE y > 0) OVER (ORDER BY ...)
      cannot be faithfully serialized/deserialized.

Describe the solution you'd like

  • Proto schema: Extend the window function proto message to include:
    • null_treatment: enum with values like UNSPECIFIED (default), RESPECT_NULLS, IGNORE_NULLS.
    • distinct: boolean flag (default false).
    • filter: optional expression node representing the filter predicate.
  • Conversions:
    • Update to_proto.rs to populate these fields from expr::WindowFunctionParams.
    • Update from_proto.rs to reconstruct WindowFunctionParams from proto.
  • Defaults & compatibility:
    • Omit fields should preserve current behavior: null_treatment=UNSPECIFIED (treat as “respect nulls”), distinct=false, filter=None.
    • Use new field numbers to avoid wire compatibility issues; maintain backward compatibility with older binaries that ignore unknown fields.

Describe alternatives you've considered

No response

Additional context

  • Primary locations:
    • datafusion/proto/src/logical_plan/to_proto.rs and corresponding from_proto.rs.
    • Proto definitions for window expressions (the message that carries window function params).
  • Acceptance criteria:
    • Round-trip tests for:
      • IGNORE NULLS and RESPECT NULLS window functions (e.g., last_value, first_value).
      • DISTINCT window aggregates (e.g., count(distinct x)).
      • FILTER (WHERE ...) on window aggregates.
    • Existing plans without these features continue to round-trip unchanged.
    • Wire-compat maintained: older consumers ignore the new fields; newer consumers default correctly when fields are absent.
    • Docs updated to state proto supports these window-function features.

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions