fix: Preserve integer values in round() for large Int64 and UInt64 inputs #22697

Open
pchintar wants to merge 1 commit into apache:main from pchintar:round-large-integer-handling


pchintar commented Jun 1, 2026

Which issue does this PR close?

Rationale for this change

round() should not change integer values when the scale is non-negative, since no fractional digits need to be rounded.

Currently, core round() coerces large Int64 values through Float64, causing precision loss:

SELECT round(arrow_cast(9007199254740993, 'Int64'));

Current (buggy) output:

9007199254740992.0

Expected:

9007199254740993
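The precision loss can be reproduced outside DataFusion. The following is a minimal sketch, not the actual round() implementation; the helper names `round_via_f64` and `round_integer_noop` are hypothetical and exist only for illustration. It relies on the fact that Float64 has a 53-bit significand, so 2^53 + 1 has no exact Float64 representation.

```rust
/// Hypothetical helper: rounds an integer by first converting it to f64,
/// which mirrors the lossy path described above.
fn round_via_f64(v: i64) -> f64 {
    // The cast itself rounds to the nearest representable f64,
    // so the damage is done before `round()` is even called.
    (v as f64).round()
}

/// Hypothetical helper: with a non-negative scale an integer has no
/// fractional digits to round, so it can be returned unchanged.
fn round_integer_noop(v: i64) -> i64 {
    v
}
```

With these definitions, `round_via_f64(9007199254740993)` yields `9007199254740992.0`, while `round_integer_noop(9007199254740993)` keeps the original value.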

The Spark-compatible round() also fails for UInt64 values above i64::MAX even when the scale is non-negative:

SELECT round(arrow_cast(18446744073709551615, 'UInt64'));

Current (buggy) output:

round: UInt64 value 18446744073709551615 exceeds i64::MAX and cannot be rounded
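The failure comes from narrowing the value to i64 before checking whether any rounding is needed. A rough sketch of the two orderings, assuming nothing about the real Spark-compatible code beyond the error message above (both function names are hypothetical):

```rust
/// Hypothetical helper: the narrowing step that fails for values
/// above i64::MAX, regardless of the requested scale.
fn to_i64_checked(v: u64) -> Result<i64, String> {
    i64::try_from(v).map_err(|_| {
        format!(
            "round: UInt64 value {} exceeds i64::MAX and cannot be rounded",
            v
        )
    })
}

/// Hypothetical helper: checks the scale first. For a non-negative scale
/// the value is returned as-is and no narrowing takes place.
/// Negative scales are deliberately left out of this sketch (`None`).
fn round_u64_non_negative_scale(v: u64, scale: i32) -> Option<u64> {
    if scale >= 0 {
        Some(v)
    } else {
        None
    }
}
```

Here `to_i64_checked(u64::MAX)` returns an error, whereas `round_u64_non_negative_scale(u64::MAX, 0)` returns `Some(18446744073709551615)`.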

What changes are included in this PR?

  • Preserved integer inputs in core round() for non-negative scales instead of routing them through Float64.

  • Preserved UInt64 values in Spark-compatible round() when the scale is non-negative, avoiding unnecessary UInt64 -> i64 conversion.

  • Added SQLLogicTest coverage for:

    • Int64 values above 2^53 in core round().
    • UInt64::MAX in Spark-compatible round().
    • Both one-argument and two-argument forms.

Are these changes tested?

Yes.

cargo fmt --all
git diff --check
cargo test -p datafusion-sqllogictest --test sqllogictests -- spark/math/round.slt
cargo test -p datafusion-functions round
cargo test -p datafusion-spark round

I also verified the core regression queries manually:

SELECT arrow_typeof(round(arrow_cast(9007199254740993, 'Int64'))),
       round(arrow_cast(9007199254740993, 'Int64'));

Result:

Int64 9007199254740993

SELECT arrow_typeof(round(arrow_cast(9007199254740993, 'Int64'), 2)),
       round(arrow_cast(9007199254740993, 'Int64'), 2);

Result:

Int64 9007199254740993

Are there any user-facing changes?

No.

@github-actions github-actions Bot added sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation spark labels Jun 1, 2026
@pchintar pchintar force-pushed the round-large-integer-handling branch from e565975 to 7832c0c Compare June 1, 2026 13:58
// extra precision to accommodate potential carry-over.
let return_type =
    match input_type {
        input_type if is_integer_data_type(input_type) => input_type.clone(),
Contributor

This returns the integer type for all scales, but only non-negative scales are handled downstream. round(arrow_cast(125,'Int64'), -1) now fails; it returned 130.0 before this PR.

Comment thread datafusion/functions/src/math/round.rs Outdated

match (value_scalar, args.return_type()) {
    (value_scalar, return_type)
        if is_integer_data_type(return_type) && dp >= 0 =>
Contributor

This only covers dp >= 0. An integer value with dp < 0 needs a handler for the negative scale.

}

let arr: ArrayRef = match (value_array.data_type(), return_type) {
    (input_type, return_type)
Contributor

The no-op fast path requires decimal_places to be a non-negative scalar literal. When the scale is a column (or a negative literal), this guard is false and there's no other integer arm, so it hits exec_err!

    CREATE TABLE t(v BIGINT, dp INT) AS VALUES (125,1),(125,-1);
    SELECT round(v, dp) FROM t;   -- errored; worked (Float64) before this PR
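One way to cover both the negative-scale and the column-scale cases is to round each row with pure integer arithmetic. This is only a sketch under stated assumptions: `round_i64` and `round_rows` are hypothetical names, plain slices stand in for Arrow arrays, and round-half-away-from-zero is assumed because round(125, -1) returned 130.0 before this PR.

```rust
/// Hypothetical helper: rounds `v` to `scale` decimal places without
/// going through f64. Assumes round-half-away-from-zero.
/// Returns `None` if 10^|scale| or the rounded result does not fit in i64;
/// this sketch does not try to handle those cases further.
fn round_i64(v: i64, scale: i32) -> Option<i64> {
    if scale >= 0 {
        // No fractional digits exist, so the value is unchanged.
        return Some(v);
    }
    // 10^|scale|; `checked_pow` yields None on overflow.
    let factor = 10_i64.checked_pow(scale.unsigned_abs())?;
    let quotient = v / factor; // truncates toward zero
    let remainder = v % factor; // carries the sign of `v`
    // `factor` is a power of ten >= 10, so it is even and `half` is exact.
    let half = factor / 2;
    let adjust = if remainder.abs() >= half { v.signum() } else { 0 };
    (quotient + adjust).checked_mul(factor)
}

/// Hypothetical helper: applies a per-row scale, as in `round(v, dp)`
/// where `dp` is a column rather than a literal.
fn round_rows(values: &[i64], scales: &[i32]) -> Vec<Option<i64>> {
    values
        .iter()
        .zip(scales.iter())
        .map(|(&v, &dp)| round_i64(v, dp))
        .collect()
}
```

For the table above, `round_rows(&[125, 125], &[1, -1])` gives `[Some(125), Some(130)]`, and large inputs such as `round_i64(9007199254740993, -1)` stay exact at `Some(9007199254740990)`.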

Contributor

The tests only cover a non-negative scalar scale. There is no coverage for a negative scale, a column scale, or a negative scale in a column.

Comment thread datafusion/functions/src/math/round.rs Outdated
Comment on lines +332 to +349
(value_scalar, Float64)
    if is_integer_data_type(&value_scalar.data_type()) =>
{
    let value = ColumnarValue::Scalar(value_scalar.clone())
        .cast_to(&Float64, None)?;
    match value {
        ColumnarValue::Scalar(ScalarValue::Float64(Some(v))) => {
            let rounded = round_float(v, dp)?;
            Ok(ColumnarValue::Scalar(ScalarValue::Float64(Some(rounded))))
        }
        ColumnarValue::Scalar(ScalarValue::Float64(None)) => {
            Ok(ColumnarValue::Scalar(ScalarValue::Float64(None)))
        }
        _ => internal_err!(
            "Unexpected datatype after casting integer argument to Float64"
        ),
    }
}
Contributor

This (value_scalar, Float64) if is_integer_data_type(value) arm, the round_integer_array_to_float64! macro, and the eight (IntN, Float64) array arms are unreachable:

@kumarUjjawal
Contributor

cc @neilconway

@pchintar pchintar force-pushed the round-large-integer-handling branch from 7832c0c to 97a7294 Compare June 2, 2026 09:13
Author

pchintar commented Jun 2, 2026

@kumarUjjawal Thanks for the review!

I've updated the implementation so that integer inputs now stay as integer types instead of being converted through Float64. This preserves large Int64 values that cannot be represented exactly as Float64.

I also added tests for the cases you mentioned:

  • negative decimal_places values (for example, round(125, -1))
  • column-valued decimal_places
  • large Int64 values above the exact Float64 range

Additionally, I updated the affected sqllogictest expectations since the plan output changed after removing the unnecessary cast to Float64.

I've also rerun the relevant tests and manually verified all these cases.

Development

Successfully merging this pull request may close these issues.

round() mishandles large Int64 and UInt64 values