
Conversation

@sshekhar563
Contributor

Summary #32839

This patch fixes numerical instability in the ONNX frontend’s ReduceLogSumExp operator.
The previous implementation used the naïve log(sum(exp(x))) formulation, in which exp(x) overflows to Inf once the input reaches log(FLT_MAX) (≈ 88.72 for float32). This produced incorrect Inf results in the OpenVINO EP of ONNX Runtime,
especially for float16 and float32 models.
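
For illustration, here is a minimal standalone float32 sketch (not code from this PR) showing why the naïve formulation overflows while the max-shifted form stays finite:

#include <algorithm>
#include <cmath>
#include <cstdio>

int main() {
    // Two identical values just above log(FLT_MAX) ~= 88.72, where float32 exp() overflows.
    const float x[] = {90.0f, 90.0f};

    // Naive log(sum(exp(x))): exp(90) exceeds FLT_MAX, the sum becomes +Inf, and so does the result.
    float naive_sum = 0.0f;
    for (float v : x) naive_sum += std::exp(v);
    std::printf("naive:  %f\n", std::log(naive_sum));        // inf

    // Stable k + log(sum(exp(x - k))) with k = max(x): exponents are <= 0, so nothing overflows.
    float k = x[0];
    for (float v : x) k = std::max(k, v);
    float stable_sum = 0.0f;
    for (float v : x) stable_sum += std::exp(v - k);
    std::printf("stable: %f\n", k + std::log(stable_sum));   // ~90.693 = 90 + log(2)
    return 0;
}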

Fix

Implemented a numerically stable LogSumExp computation:

k = ReduceMax(x)
lse = k + log( ReduceSum( exp(x - k) ) )

This matches the behavior of:

  • ONNX Runtime CPU EP
  • PyTorch and NumPy stable LogSumExp
  • OpenVINO PyTorch frontend implementation (log.cpp)
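
As a rough sketch of the subgraph this amounts to in OpenVINO terms (illustrative only; the variable names, helper usage, and keepdims handling in the actual PR may differ), assuming data and axes are the ov::Output<ov::Node> values taken from the ONNX node:

#include <openvino/op/add.hpp>
#include <openvino/op/exp.hpp>
#include <openvino/op/log.hpp>
#include <openvino/op/reduce_max.hpp>
#include <openvino/op/reduce_sum.hpp>
#include <openvino/op/subtract.hpp>

// Keep reduced dims on the intermediate max so it broadcasts against the original data.
const bool keep_dims = true;

auto k       = std::make_shared<ov::op::v1::ReduceMax>(data, axes, keep_dims);   // k = ReduceMax(x)
auto shifted = std::make_shared<ov::op::v1::Subtract>(data, k);                  // x - k
auto e       = std::make_shared<ov::op::v0::Exp>(shifted);                       // exp(x - k)
auto s       = std::make_shared<ov::op::v1::ReduceSum>(e, axes, keep_dims);      // ReduceSum(exp(x - k))
auto lse     = std::make_shared<ov::op::v1::Add>(k, std::make_shared<ov::op::v0::Log>(s));  // k + log(...)

The ONNX keepdims attribute then only decides whether the final output keeps or squeezes the reduced dimensions; the intermediate ReduceMax has to keep them so the subtraction broadcasts correctly.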

The new implementation is applied to:

  • Opset 1–12 (opset_1::reduce_log_sum_exp)
  • Opset 13–17 (opset_13::reduce_log_sum_exp)
  • Opset 18+ (opset_18::reduce_log_sum_exp)

Motivation

Fixes incorrect Inf outputs for values ≥ 88.7 when using OpenVINOExecutionProvider.
The issue was originally reported here: [link to GitHub issue].

Validation

Tested using the reproduction script provided in the issue:

  • Matches ONNX Runtime CPU EP for all tested values
  • No overflow observed for large positive inputs
  • Behavior consistent across float16/float32 models

Notes

This fix aligns ONNX frontend behavior with the already-correct PyTorch frontend implementation.

@sshekhar563 sshekhar563 requested a review from a team as a code owner November 18, 2025 15:38
@github-actions github-actions bot added the category: ONNX FE OpenVINO ONNX FrontEnd label Nov 18, 2025
@sys-openvino-ci sys-openvino-ci added the ExternalPR External contributor label Nov 18, 2025
@cknd cknd left a comment

Hey, thanks!

We also looked into this in the meantime (our working version is here: master...micropsi-industries:openvino:fix_logsumexp).
We were wondering about one or two points; see comments below.

    const ov::Output<ov::Node> sum_node =
        make_ov_reduction_op<v1::ReduceSum>(node, node.get_ov_inputs().at(0), supported_types_v2);
    return {std::make_shared<v0::Log>(sum_node)};
}

is reduce_log_sum deleted on purpose?

Contributor Author

Thanks for the review!

The removal of reduce_log_sum was not intentional; the change in this PR is meant only to introduce a stable implementation for ReduceLogSumExp.
I'll restore the original reduce_log_sum translator and its registration in the next commit.

Thanks for pointing it out!

    return {make_ov_reduction_op<v1::ReduceMin>(node, node.get_ov_inputs().at(0), supported_types_v3, false)};
}

ov::OutputVector reduce_log_sum(const ov::frontend::onnx::Node& node) {

@cknd cknd Nov 19, 2025

(again, deletion of reduce_log_sum)

Contributor Author

Thanks for pointing this out again!
The deletion of reduce_log_sum was not intentional; I accidentally removed it while restructuring the file during the ReduceLogSumExp update.
I'll restore the original reduce_log_sum implementation and its registration in the next commit.
Thanks for catching this!

    auto keepdims = static_cast<bool>(node.get_attribute_value<std::int64_t>("keepdims", 1));

    auto reduction_axes =
        (node.get_ov_inputs().size() > 1) ? get_reduction_axes_from_input(node) : get_reduction_axes_from_attr(node);

Q: as far as we understand, reduction axes are always given by attr in opset <= 12 and always by input starting from opset 13 -- since these are separate implementations per opset, isn't reduction_axes = get_reduction_axes_from_attr(node) enough here in the opset 1 section?

Contributor Author

Good question, thank you!

You're right that for ONNX opset ≤ 12, axes are normally provided via the axes attribute, and input-based axes become standard only from opset 13 onward.
However, some models (and converters) still emit opset-1 with axes passed as a second input rather than an attribute. OpenVINO’s legacy opset-1 translator handled both cases, so I kept the same logic to avoid breaking existing models.
That said, I can simplify it to attribute-only if we want strict adherence to the ONNX specification.
Please let me know if you prefer simplifying it; I can update the PR accordingly.


@cknd cknd Nov 20, 2025

Ah, interesting point. That's beyond my expertise, so I don't have an opinion. Let's wait for the maintainer review.
