bubulalabu
Which issue does this PR close?

Closes #17379.

Rationale for this change

PostgreSQL supports named arguments for function calls using the syntax function_name(param => value), which improves code readability and allows arguments to be specified in any order. DataFusion should support this syntax to enhance the user experience, especially for functions with many optional parameters.

What changes are included in this PR?

This PR implements PostgreSQL-style named arguments for scalar functions.

Features:

  • Parse named arguments from SQL (param => value syntax)
  • Resolve named arguments to positional order before execution
  • Support mixed positional and named arguments
  • Store parameter names in function signatures
  • Show parameter names in error messages

Limitations:

  • Named arguments only work for functions with known arity (fixed number of parameters)
  • Variadic functions (like concat) cannot use named arguments as they accept variable numbers of arguments
  • Supported signature types: Exact, Uniform, Any, Coercible, Comparable, Numeric, String, Nullary, ArraySignature, and OneOf (combinations of these)
  • Not supported: Variadic, VariadicAny, UserDefined

Implementation:

  • Added argument resolution logic with validation
  • Extended Signature with parameter_names field
  • Updated SQL parser to handle named argument syntax
  • Integrated into physical planning phase
  • Added comprehensive tests and documentation

Example usage:

-- All named arguments
SELECT substr(str => 'hello world', start_pos => 7, length => 5);

-- Mixed positional and named arguments
SELECT substr('hello world', start_pos => 7, length => 5);

-- Named arguments in any order
SELECT substr(length => 5, str => 'hello world', start_pos => 7);
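The resolution step behind these calls can be sketched as a standalone function. This is a simplified model, not the code added in this PR: the `Arg` type, the `resolve_args` function, and the error strings are hypothetical names chosen for illustration, and values are plain strings rather than expressions. It assumes that positional arguments must come first, that each name maps to exactly one declared parameter, and that only trailing parameters may be omitted.

```rust
/// A call-site argument: positional (`value`) or named (`name => value`).
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum Arg {
    Positional(String),
    Named(String, String),
}

impl Arg {
    pub fn positional(value: &str) -> Self {
        Arg::Positional(value.to_string())
    }
    pub fn named(name: &str, value: &str) -> Self {
        Arg::Named(name.to_string(), value.to_string())
    }
}

/// Reorders `args` into the order given by `param_names`.
/// Returns the values in positional order, or an error message.
pub fn resolve_args(param_names: &[&str], args: &[Arg]) -> Result<Vec<String>, String> {
    let mut slots: Vec<Option<String>> = vec![None; param_names.len()];
    let mut seen_named = false;

    for (i, arg) in args.iter().enumerate() {
        match arg {
            Arg::Positional(value) => {
                // Positional arguments are only valid before the first named one.
                if seen_named {
                    return Err("positional argument follows named argument".to_string());
                }
                if i >= slots.len() {
                    return Err(format!(
                        "expected at most {} arguments but received {}",
                        slots.len(),
                        args.len()
                    ));
                }
                slots[i] = Some(value.clone());
            }
            Arg::Named(name, value) => {
                seen_named = true;
                let idx = param_names
                    .iter()
                    .position(|p| *p == name.as_str())
                    .ok_or_else(|| format!("unknown parameter '{}'", name))?;
                // A slot may already be filled by a positional or an earlier named argument.
                if slots[idx].is_some() {
                    return Err(format!("parameter '{}' specified more than once", name));
                }
                slots[idx] = Some(value.clone());
            }
        }
    }

    // Trailing parameters may be left out, but a gap in the middle is an error.
    let filled = slots.iter().take_while(|s| s.is_some()).count();
    if slots[filled..].iter().any(|s| s.is_some()) {
        return Err(format!("missing value for parameter '{}'", param_names[filled]));
    }
    Ok(slots.into_iter().flatten().collect())
}
```

Under these assumptions, all three `substr` calls above resolve to the same positional list, and the resolved list is what signature matching then sees.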

Improved error messages:

Before this PR, error messages showed generic types:

Candidate functions:
    substr(Any, Any)
    substr(Any, Any, Any)

After this PR, error messages show parameter names:

Candidate functions:
    substr(str, start_pos)
    substr(str, start_pos, length)
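The difference between the two listings can be illustrated with a small formatting helper. This is a sketch under assumptions, not the formatting code in `utils.rs`: `format_candidate` is a hypothetical function, and the fallback rule (use the type placeholders when no names are stored or when there are fewer names than arguments) is an assumption made for the example.

```rust
/// Formats one "Candidate functions" line.
/// When parameter names are available, the first `arg_types.len()` names are
/// shown; otherwise the generic type placeholders are shown instead.
/// Hypothetical helper for illustration only.
pub fn format_candidate(func: &str, arg_types: &[&str], param_names: Option<&[&str]>) -> String {
    let parts: Vec<&str> = match param_names {
        // A shorter candidate (for example the two-argument form of `substr`)
        // uses a prefix of the full name list.
        Some(names) if names.len() >= arg_types.len() => names[..arg_types.len()].to_vec(),
        _ => arg_types.to_vec(),
    };
    format!("{}({})", func, parts.join(", "))
}
```

Calling the helper with `None` reproduces the "before" form such as `substr(Any, Any)`, while passing the names `str`, `start_pos`, `length` yields the "after" form for both the two-argument and three-argument candidates.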

Example error output:

datafusion % target/debug/datafusion-cli
DataFusion CLI v50.1.0
> SELECT substr(str => 'hello world');
Error during planning: Internal error: Function 'substr' failed to match any signature, errors: Error during planning: The function 'substr' expected 2 arguments but received 1,Error during planning: The function 'substr' expected 3 arguments but received 1.
This issue was likely caused by a bug in DataFusion's code. Please help us to resolve this by filing a bug report in our issue tracker: https://github.com/apache/datafusion/issues No function matches the given name and argument types 'substr(Utf8)'. You might need to add explicit type casts.
        Candidate functions:
        substr(str, start_pos)
        substr(str, start_pos, length)

Are these changes tested?

Yes, comprehensive tests are included:

  1. Unit tests (20 tests total):

    • Argument validation and reordering logic (8 tests in udf.rs)
    • Error message formatting with parameter names (2 tests in utils.rs)
    • TypeSignature parameter name support for all fixed-arity variants including ArraySignature (10 tests in signature.rs)
  2. Integration tests (named_arguments.slt):

    • Positional arguments (baseline)
    • Named arguments in order
    • Named arguments out of order
    • Mixed positional and named arguments
    • Optional parameters
    • Function aliases
    • Error cases (positional after named, unknown parameter, duplicate parameter)
    • Error message format verification

All tests pass successfully.

Are there any user-facing changes?

Yes, this PR adds new user-facing functionality:

  1. New SQL syntax: Users can now call functions with named arguments using param => value syntax (only for functions with fixed arity)
  2. Improved error messages: Signature mismatch errors now display parameter names instead of generic types
  3. UDF API: Function authors can add parameter names to their functions using:
    signature: Signature::uniform(2, vec![DataType::Float64], Volatility::Immutable)
        .with_parameter_names(vec!["base".to_string(), "exponent".to_string()])
        .expect("valid parameter names")
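Because `with_parameter_names` is followed by `.expect(...)` in the snippet above, it evidently returns a fallible value. The sketch below models why such a builder might fail. It is a standalone approximation, not DataFusion's `Signature`: `MiniSignature`, `Arity`, and the specific checks (name count equals the fixed arity, names are unique, variadic signatures are rejected) are assumptions for illustration, and the real validation for variants such as `OneOf` is likely more involved.

```rust
use std::collections::HashSet;

/// Whether the number of parameters is known up front.
/// Hypothetical stand-in for the fixed-arity vs. variadic distinction.
#[derive(Debug, Clone, PartialEq)]
pub enum Arity {
    Fixed(usize),
    Variadic,
}

/// Simplified model of a signature carrying optional parameter names.
#[derive(Debug, Clone, PartialEq)]
pub struct MiniSignature {
    pub arity: Arity,
    pub parameter_names: Option<Vec<String>>,
}

impl MiniSignature {
    /// Builder-style constructor: names default to `None`.
    pub fn new(arity: Arity) -> Self {
        Self { arity, parameter_names: None }
    }

    /// Attaches parameter names after validating them (assumed rules).
    pub fn with_parameter_names(mut self, names: Vec<String>) -> Result<Self, String> {
        let expected = match self.arity {
            Arity::Fixed(n) => n,
            Arity::Variadic => {
                return Err("variadic signatures cannot have parameter names".to_string())
            }
        };
        if names.len() != expected {
            return Err(format!(
                "expected {} parameter names but received {}",
                expected,
                names.len()
            ));
        }
        {
            // Reject duplicates so each name maps to exactly one position.
            let mut seen = HashSet::new();
            for name in &names {
                if !seen.insert(name.as_str()) {
                    return Err(format!("duplicate parameter name '{}'", name));
                }
            }
        }
        self.parameter_names = Some(names);
        Ok(self)
    }
}
```

In this model, a two-parameter signature accepts `["base", "exponent"]`, while a mismatched count, a repeated name, or a variadic signature produces an error instead of a silently unusable signature.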

Potential breaking change (unlikely to affect users): this PR adds a new public field, parameter_names: Option<Vec<String>>, to the Signature struct. Code that constructs Signature with struct literal syntax will no longer compile until it sets the new field. This is unlikely to matter in practice because:

  • Signature is almost always constructed using builder methods (Signature::exact(), Signature::uniform(), etc.)
  • The new field defaults to None, maintaining existing behavior
  • Existing code using builder methods continues to work without modification

No other breaking changes: the feature is purely additive, so existing SQL queries and UDF implementations work without modification.

Implement support for calling functions with named parameters using
PostgreSQL-style syntax (param => value).

Features:
- Parse named arguments from SQL (param => value syntax)
- Resolve named arguments to positional order before execution
- Support mixed positional and named arguments
- Store parameter names in function signatures
- Show parameter names in error messages

Implementation:
- Added argument resolution logic with validation
- Extended Signature with parameter_names field
- Updated SQL parser to handle named argument syntax
- Integrated into physical planning phase
- Added comprehensive tests and documentation

Example usage:
  SELECT substr(str => 'hello', start_pos => 2, length => 3);
  SELECT substr('hello', start_pos => 2, length => 3);

Error messages now show:
  Candidate functions:
    substr(str, start_pos)
    substr(str, start_pos, length)

Instead of generic types like substr(Any, Any).

Related issue: apache#17379
@github-actions github-actions bot added documentation Improvements or additions to documentation sql SQL Planner logical-expr Logical plan and expressions sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation labels Oct 11, 2025

Development

Successfully merging this pull request may close these issues.

Add named notation for user defined function arguments
