
Conversation

@gcunhase (Contributor) commented Oct 17, 2025

What does this PR do?

Type of change: Bug fix

Overview: QDQ nodes were being placed on both inputs of Add layers in models with the Conv-BN-Sigmoid-Mul-Add pattern, instead of only on the residual branch.
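
For illustration, the intended placement (the layout mirrors the new test model; the residual wiring and names are assumed):

input ─ Relu ─┬─ Conv ─ BN ─┬─ Sigmoid ─┐
              │             └───────────┴─ Mul ─┐   <- fusible backbone, no QDQ
              └─────────── Q/DQ ────────────────┴─ Add ─ Relu ─ output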

Usage

$ python -m modelopt.onnx.quantization --onnx_path=$MODEL_NAME.onnx
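
The same quantization can be invoked from Python; a minimal sketch (the import alias is an assumption, but the quantize call and its arguments match the new unit test):

import modelopt.onnx.quantization as moq

moq.quantize("model.onnx", quantize_mode="int8", high_precision_dtype="fp16")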

Testing

Added a unit test.

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: Yes
  • Did you add or update any necessary documentation?: No
  • Did you update Changelog?: No

Summary by CodeRabbit

  • New Features

    • Expanded quantization capabilities to support additional layer fusion patterns, enabling better model optimization for more neural network architectures.
  • Tests

    • Added new test cases validating quantization behavior for the expanded layer patterns.

@gcunhase gcunhase requested a review from a team as a code owner October 17, 2025 19:55
@gcunhase gcunhase requested a review from i-riyad October 17, 2025 19:56
coderabbitai bot commented Oct 17, 2025

Warning

Rate limit exceeded

@gcunhase has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 21 minutes and 43 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between a480adc and 5aadeac.

📒 Files selected for processing (3)
  • modelopt/onnx/quantization/graph_utils.py (1 hunks)
  • tests/_test_utils/onnx_quantization/lib_test_models.py (1 hunks)
  • tests/unit/onnx/test_quantize_int8.py (2 hunks)

Walkthrough

The PR extends ONNX graph fusion patterns to support additional Mul-Sigmoid-BatchNormalization sequences preceding convolution operations, adds a test model generator for this new pattern, and introduces a corresponding quantization verification test.

Changes

  • Graph fusion patterns (modelopt/onnx/quantization/graph_utils.py): Added two new fusible path variants to fusible_linear_path_types, ["Mul", "Sigmoid", "BatchNormalization", conv_type] and ["Mul", "Sigmoid", "BatchNormalization", "BiasAdd", conv_type], enabling backward fusion detection of Mul-Sigmoid sequences preceding BatchNorm (with optional BiasAdd) before convolution; see the sketch after this list.
  • Test model generator (tests/_test_utils/onnx_quantization/lib_test_models.py): Added new function build_conv_batchnorm_sig_mul_model() that constructs an ONNX graph with the sequence Relu → Conv → BatchNormalization → Sigmoid → Mul → Add → Relu, including weights, initializers, and shape inference.
  • Quantization test (tests/unit/onnx/test_quantize_int8.py): Added import of build_conv_batchnorm_sig_mul_model and a new test function test_conv_batchnorm_sig_mul_int8() that verifies int8 quantization of the new pattern, checking Conv/ConvTransposed quantization and Add node quantization behavior.
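
A minimal sketch of how the two new entries might sit in the list (the surrounding code and the conv_type placeholder are assumptions for illustration; only the two new variants come from this PR):

# Hypothetical context; only the last two entries are added by this PR.
conv_type = "Conv"  # assumed placeholder for the conv op type used in graph_utils.py

fusible_linear_path_types = [
    # ... existing fusible backbone patterns ...
    # Swish/SiLU (x * sigmoid(x)) applied to the BatchNorm output,
    # with and without an intermediate BiasAdd:
    ["Mul", "Sigmoid", "BatchNormalization", conv_type],
    ["Mul", "Sigmoid", "BatchNormalization", "BiasAdd", conv_type],
]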

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A pattern new! Mul and Sigmoid dance,
Before the BatchNorm takes their chance,
Through fusion paths we now can blend,
With Conv that waits at journey's end! 🌟

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 20.00%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (2 passed)
  • Description Check ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: The PR title "[5593873] [ONNX] Fix ResAdd logic to support 'Conv-BN-Sigmoid-Mul-Add' as fusible patterns" accurately reflects the main change: new fusible path variants are added to support the Conv-BN-Sigmoid-Mul-Add pattern, matching the stated objective of fixing the ResAdd logic. The title is specific and clear, naming both the pattern and the component being fixed, so a teammate scanning history can immediately understand the primary change.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

Comment @coderabbitai help to get the list of available commands and usage tips.

@gcunhase gcunhase requested a review from ajrasane October 17, 2025 19:58
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
modelopt/onnx/quantization/graph_utils.py (1)

204-205: LGTM: Correctly extends fusible patterns for Swish/SiLU activation.

The new patterns properly support the Conv-BN-Sigmoid-Mul sequence, which implements the Swish/SiLU activation (x * sigmoid(x)) where x is the BatchNormalization output. This enables correct ResAdd quantization by identifying this sequence as a fusible backbone, ensuring QDQ nodes are placed only on the residual branch of Add nodes.

Consider adding a brief inline comment documenting what these patterns represent:

+        # Swish/SiLU activation patterns: x * sigmoid(x) after BatchNorm
         ["Mul", "Sigmoid", "BatchNormalization", conv_type],
         ["Mul", "Sigmoid", "BatchNormalization", "BiasAdd", conv_type],
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8c6b915 and a480adc.

📒 Files selected for processing (3)
  • modelopt/onnx/quantization/graph_utils.py (1 hunks)
  • tests/_test_utils/onnx_quantization/lib_test_models.py (1 hunks)
  • tests/unit/onnx/test_quantize_int8.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/_test_utils/onnx_quantization/lib_test_models.py (1)
modelopt/onnx/utils.py (1)
  • check_model (557-569)
tests/unit/onnx/test_quantize_int8.py (1)
tests/_test_utils/onnx_quantization/lib_test_models.py (1)
  • build_conv_batchnorm_sig_mul_model (560-675)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: linux
  • GitHub Check: wait-checks / wait
  • GitHub Check: build-docs
  • GitHub Check: code-quality
🔇 Additional comments (2)
tests/_test_utils/onnx_quantization/lib_test_models.py (1)

560-675: LGTM: Test model correctly implements the Conv-BN-Sigmoid-Mul-Add pattern.

The function properly constructs an ONNX graph with the Swish/SiLU activation pattern (Sigmoid and BatchNormalization outputs feeding into Mul) followed by a residual Add. The model structure, initializers, and validation steps are all correctly implemented and consistent with other test models in this file.
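
A rough, self-contained sketch of how such a graph can be assembled with onnx.helper (shapes, names, random weights, and the residual wiring are assumptions; this is not the PR's actual generator):

import numpy as np
import onnx
from onnx import TensorProto, helper, numpy_helper

def build_conv_bn_sig_mul_sketch() -> onnx.ModelProto:
    # Graph: Relu -> Conv -> BatchNormalization -> Sigmoid -> Mul -> Add -> Relu,
    # with the first Relu's output feeding both the Conv and the residual Add.
    x = helper.make_tensor_value_info("input", TensorProto.FLOAT, [1, 3, 8, 8])
    y = helper.make_tensor_value_info("output", TensorProto.FLOAT, [1, 3, 8, 8])

    inits = [
        numpy_helper.from_array(np.random.randn(3, 3, 3, 3).astype(np.float32), "conv_w"),
        numpy_helper.from_array(np.ones(3, np.float32), "bn_scale"),
        numpy_helper.from_array(np.zeros(3, np.float32), "bn_bias"),
        numpy_helper.from_array(np.zeros(3, np.float32), "bn_mean"),
        numpy_helper.from_array(np.ones(3, np.float32), "bn_var"),
    ]
    nodes = [
        helper.make_node("Relu", ["input"], ["relu0_out"]),
        helper.make_node("Conv", ["relu0_out", "conv_w"], ["conv_out"], pads=[1, 1, 1, 1]),
        helper.make_node(
            "BatchNormalization",
            ["conv_out", "bn_scale", "bn_bias", "bn_mean", "bn_var"],
            ["bn_out"],
        ),
        helper.make_node("Sigmoid", ["bn_out"], ["sig_out"]),
        # Swish/SiLU: the BatchNorm output multiplied by its own sigmoid
        helper.make_node("Mul", ["bn_out", "sig_out"], ["mul_out"]),
        # Residual Add: mul_out is the fusible backbone, relu0_out the residual branch
        helper.make_node("Add", ["mul_out", "relu0_out"], ["add_out"]),
        helper.make_node("Relu", ["add_out"], ["output"]),
    ]
    model = helper.make_model(
        helper.make_graph(nodes, "conv_bn_sig_mul", [x], [y], initializer=inits)
    )
    onnx.checker.check_model(model)
    return model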

tests/unit/onnx/test_quantize_int8.py (1)

24-24: LGTM: Import correctly added.

Comment on lines +95 to +121
def test_conv_batchnorm_sig_mul_int8(tmp_path="./"):
    onnx_model = build_conv_batchnorm_sig_mul_model()
    onnx_path = os.path.join(tmp_path, "conv_batchnorm_sig_mul_model.onnx")
    save_onnx(onnx_model, onnx_path)

    moq.quantize(onnx_path, quantize_mode="int8", high_precision_dtype="fp16")

    # Output model should be produced in the same tmp_path
    output_onnx_path = onnx_path.replace(".onnx", ".quant.onnx")

    # Check that quantized explicit model is generated
    assert os.path.isfile(output_onnx_path)

    # Load the output model and check QDQ node placements
    graph = gs.import_onnx(onnx.load(output_onnx_path))

    # Check that Conv and ConvTransposed are quantized
    conv_nodes = [n for n in graph.nodes if "Conv" in n.op]
    assert _assert_nodes_are_quantized(conv_nodes)

    # Check that only 1 input of Add is quantized
    add_nodes = [n for n in graph.nodes if n.op == "Add"]
    for node in add_nodes:
        quantized_inputs = [inp for inp in node.inputs if inp.inputs[0].op == "DequantizeLinear"]
        assert len(quantized_inputs) == 1, (
            f"More than one input of {node.name} is being quantized, but only one should be quantized!"
        )

⚠️ Potential issue | 🟡 Minor

Remove default value from pytest fixture parameter.

The tmp_path parameter should not have a default value since it's a pytest fixture. This is inconsistent with other test functions in the file (e.g., line 66).

Apply this diff:

-def test_conv_batchnorm_sig_mul_int8(tmp_path="./"):
+def test_conv_batchnorm_sig_mul_int8(tmp_path):

Otherwise, the test logic correctly verifies that the new fusion pattern is recognized and that QDQ nodes are properly placed (Conv nodes quantized, only one Add input quantized).

🤖 Prompt for AI Agents
In tests/unit/onnx/test_quantize_int8.py around lines 95 to 121, the test
function signature uses a default value for the pytest fixture parameter
tmp_path (tmp_path="./") which is invalid; remove the default so the fixture is
injected by pytest, i.e., change the function definition to accept tmp_path
without any default (tmp_path) and keep the rest of the test body unchanged.
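
For context, a minimal sketch of how the tmp_path fixture behaves (illustrative test, not from this PR):

def test_writes_into_temp_dir(tmp_path):
    # pytest injects tmp_path as a pathlib.Path pointing at a unique per-test
    # directory; a default value would make it an ordinary argument and bypass
    # fixture injection.
    model_file = tmp_path / "model.onnx"
    model_file.write_bytes(b"")
    assert model_file.is_file()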

Signed-off-by: gcunhase <[email protected]>
Signed-off-by: gcunhase <[email protected]>
@gcunhase gcunhase force-pushed the dev/gcunhasergio/fix_res_add_ConvBNSigMul branch from a480adc to 5aadeac on October 17, 2025 20:01

codecov bot commented Oct 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.43%. Comparing base (8c6b915) to head (5aadeac).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #450      +/-   ##
==========================================
+ Coverage   73.37%   73.43%   +0.05%     
==========================================
  Files         180      180              
  Lines       17937    17937              
==========================================
+ Hits        13162    13172      +10     
+ Misses       4775     4765      -10     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

]

# Create the ONNX graph with the nodes
nodes = [
Contributor


Should we create a flag for BiasAdd that will create two models, one with BiasAdd and one without?
This will help us test both patterns.
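
One possible shape for that suggestion (the bias_add flag and the parametrized test are hypothetical, not part of this PR):

import pytest

@pytest.mark.parametrize("bias_add", [False, True])
def test_conv_batchnorm_sig_mul_int8_variants(tmp_path, bias_add):
    # Hypothetical: the builder grows a bias_add flag so that both fusible
    # path variants (with and without BiasAdd) are exercised by one test.
    onnx_model = build_conv_batchnorm_sig_mul_model(bias_add=bias_add)
    ...  # rest of the checks as in test_conv_batchnorm_sig_mul_int8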
