
Conversation

@gcunhase (Contributor) commented Oct 17, 2025

What does this PR do?

Type of change: Bug fix

Overview: QDQ nodes were being placed on both inputs of Add layers in models with the Conv-BN-Sigmoid-Mul-Add pattern, instead of only on the residual branch.
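
For illustration, the intended placement (the layout mirrors the new test model; the residual wiring and names are assumed):

input ─ Relu ─┬─ Conv ─ BN ─┬─ Sigmoid ─┐
              │             └───────────┴─ Mul ─┐   <- fusible backbone, no QDQ
              └─────────── Q/DQ ────────────────┴─ Add ─ Relu ─ output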

Usage

$ python -m modelopt.onnx.quantization --onnx_path=$MODEL_NAME.onnx
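
The same quantization can be invoked from Python; a minimal sketch (the import alias is an assumption, but the quantize call and its arguments match the new unit test):

import modelopt.onnx.quantization as moq

moq.quantize("model.onnx", quantize_mode="int8", high_precision_dtype="fp16")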

Testing

Added a unit test.

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: Yes
  • Did you add or update any necessary documentation?: No
  • Did you update Changelog?: No

Summary by CodeRabbit

  • New Features

    • Expanded quantization capabilities to support additional layer fusion patterns, enabling better model optimization for more neural network architectures.
  • Tests

    • Added new test cases validating quantization behavior for the expanded layer patterns.

@gcunhase gcunhase requested a review from a team as a code owner October 17, 2025 19:55
@gcunhase gcunhase requested a review from i-riyad October 17, 2025 19:56
coderabbitai bot commented Oct 17, 2025

Warning

Rate limit exceeded

@gcunhase has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 21 minutes and 43 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between a480adc and 5aadeac.

📒 Files selected for processing (3)
  • modelopt/onnx/quantization/graph_utils.py (1 hunks)
  • tests/_test_utils/onnx_quantization/lib_test_models.py (1 hunks)
  • tests/unit/onnx/test_quantize_int8.py (2 hunks)

Walkthrough

The PR extends ONNX graph fusion patterns to support additional Mul-Sigmoid-BatchNormalization sequences preceding convolution operations, adds a test model generator for this new pattern, and introduces a corresponding quantization verification test.

Changes

  • Graph fusion patterns (modelopt/onnx/quantization/graph_utils.py): Added two new fusible path variants to fusible_linear_path_types, ["Mul", "Sigmoid", "BatchNormalization", conv_type] and ["Mul", "Sigmoid", "BatchNormalization", "BiasAdd", conv_type], enabling backward fusion detection of Mul-Sigmoid sequences preceding BatchNorm (with optional BiasAdd) before convolution; see the sketch after this list.
  • Test model generator (tests/_test_utils/onnx_quantization/lib_test_models.py): Added new function build_conv_batchnorm_sig_mul_model() that constructs an ONNX graph with the sequence Relu → Conv → BatchNormalization → Sigmoid → Mul → Add → Relu, including weights, initializers, and shape inference.
  • Quantization test (tests/unit/onnx/test_quantize_int8.py): Added import of build_conv_batchnorm_sig_mul_model and a new test function test_conv_batchnorm_sig_mul_int8() that verifies int8 quantization of the new pattern, checking Conv/ConvTransposed quantization and Add node quantization behavior.
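
A minimal sketch of how the two new entries might sit in the list (the surrounding code and the conv_type placeholder are assumptions for illustration; only the two new variants come from this PR):

# Hypothetical context; only the last two entries are added by this PR.
conv_type = "Conv"  # assumed placeholder for the conv op type used in graph_utils.py

fusible_linear_path_types = [
    # ... existing fusible backbone patterns ...
    # Swish/SiLU (x * sigmoid(x)) applied to the BatchNorm output,
    # with and without an intermediate BiasAdd:
    ["Mul", "Sigmoid", "BatchNormalization", conv_type],
    ["Mul", "Sigmoid", "BatchNormalization", "BiasAdd", conv_type],
]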

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A pattern new! Mul and Sigmoid dance,
Before the BatchNorm takes their chance,
Through fusion paths we now can blend,
With Conv that waits at journey's end! 🌟

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 20.00%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (2 passed)
  • Description Check ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: The PR title "[5593873] [ONNX] Fix ResAdd logic to support 'Conv-BN-Sigmoid-Mul-Add' as fusible patterns" accurately reflects the main change: new fusible path variants are added to support the Conv-BN-Sigmoid-Mul-Add pattern, matching the stated objective of fixing the ResAdd logic. The title is specific and clear, naming both the pattern and the component being fixed, so a teammate scanning history can immediately understand the primary change.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

Comment @coderabbitai help to get the list of available commands and usage tips.

@gcunhase gcunhase requested a review from ajrasane October 17, 2025 19:58
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
modelopt/onnx/quantization/graph_utils.py (1)

204-205: LGTM: Correctly extends fusible patterns for Swish/SiLU activation.

The new patterns properly support the Conv-BN-Sigmoid-Mul sequence, which implements the Swish/SiLU activation (x * sigmoid(x)) where x is the BatchNormalization output. This enables correct ResAdd quantization by identifying this sequence as a fusible backbone, ensuring QDQ nodes are placed only on the residual branch of Add nodes.

Consider adding a brief inline comment documenting what these patterns represent:

+        # Swish/SiLU activation patterns: x * sigmoid(x) after BatchNorm
         ["Mul", "Sigmoid", "BatchNormalization", conv_type],
         ["Mul", "Sigmoid", "BatchNormalization", "BiasAdd", conv_type],
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8c6b915 and a480adc.

📒 Files selected for processing (3)
  • modelopt/onnx/quantization/graph_utils.py (1 hunks)
  • tests/_test_utils/onnx_quantization/lib_test_models.py (1 hunks)
  • tests/unit/onnx/test_quantize_int8.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/_test_utils/onnx_quantization/lib_test_models.py (1)
modelopt/onnx/utils.py (1)
  • check_model (557-569)
tests/unit/onnx/test_quantize_int8.py (1)
tests/_test_utils/onnx_quantization/lib_test_models.py (1)
  • build_conv_batchnorm_sig_mul_model (560-675)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: linux
  • GitHub Check: wait-checks / wait
  • GitHub Check: build-docs
  • GitHub Check: code-quality
🔇 Additional comments (2)
tests/_test_utils/onnx_quantization/lib_test_models.py (1)

560-675: LGTM: Test model correctly implements the Conv-BN-Sigmoid-Mul-Add pattern.

The function properly constructs an ONNX graph with the Swish/SiLU activation pattern (Sigmoid and BatchNormalization outputs feeding into Mul) followed by a residual Add. The model structure, initializers, and validation steps are all correctly implemented and consistent with other test models in this file.
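
A rough, self-contained sketch of how such a graph can be assembled with onnx.helper (shapes, names, random weights, and the residual wiring are assumptions; this is not the PR's actual generator):

import numpy as np
import onnx
from onnx import TensorProto, helper, numpy_helper

def build_conv_bn_sig_mul_sketch() -> onnx.ModelProto:
    # Graph: Relu -> Conv -> BatchNormalization -> Sigmoid -> Mul -> Add -> Relu,
    # with the first Relu's output feeding both the Conv and the residual Add.
    x = helper.make_tensor_value_info("input", TensorProto.FLOAT, [1, 3, 8, 8])
    y = helper.make_tensor_value_info("output", TensorProto.FLOAT, [1, 3, 8, 8])

    inits = [
        numpy_helper.from_array(np.random.randn(3, 3, 3, 3).astype(np.float32), "conv_w"),
        numpy_helper.from_array(np.ones(3, np.float32), "bn_scale"),
        numpy_helper.from_array(np.zeros(3, np.float32), "bn_bias"),
        numpy_helper.from_array(np.zeros(3, np.float32), "bn_mean"),
        numpy_helper.from_array(np.ones(3, np.float32), "bn_var"),
    ]
    nodes = [
        helper.make_node("Relu", ["input"], ["relu0_out"]),
        helper.make_node("Conv", ["relu0_out", "conv_w"], ["conv_out"], pads=[1, 1, 1, 1]),
        helper.make_node(
            "BatchNormalization",
            ["conv_out", "bn_scale", "bn_bias", "bn_mean", "bn_var"],
            ["bn_out"],
        ),
        helper.make_node("Sigmoid", ["bn_out"], ["sig_out"]),
        # Swish/SiLU: the BatchNorm output multiplied by its own sigmoid
        helper.make_node("Mul", ["bn_out", "sig_out"], ["mul_out"]),
        # Residual Add: mul_out is the fusible backbone, relu0_out the residual branch
        helper.make_node("Add", ["mul_out", "relu0_out"], ["add_out"]),
        helper.make_node("Relu", ["add_out"], ["output"]),
    ]
    model = helper.make_model(
        helper.make_graph(nodes, "conv_bn_sig_mul", [x], [y], initializer=inits)
    )
    onnx.checker.check_model(model)
    return model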

tests/unit/onnx/test_quantize_int8.py (1)

24-24: LGTM: Import correctly added.

Comment on lines +95 to +121
def test_conv_batchnorm_sig_mul_int8(tmp_path="./"):
    onnx_model = build_conv_batchnorm_sig_mul_model()
    onnx_path = os.path.join(tmp_path, "conv_batchnorm_sig_mul_model.onnx")
    save_onnx(onnx_model, onnx_path)

    moq.quantize(onnx_path, quantize_mode="int8", high_precision_dtype="fp16")

    # Output model should be produced in the same tmp_path
    output_onnx_path = onnx_path.replace(".onnx", ".quant.onnx")

    # Check that quantized explicit model is generated
    assert os.path.isfile(output_onnx_path)

    # Load the output model and check QDQ node placements
    graph = gs.import_onnx(onnx.load(output_onnx_path))

    # Check that Conv and ConvTransposed are quantized
    conv_nodes = [n for n in graph.nodes if "Conv" in n.op]
    assert _assert_nodes_are_quantized(conv_nodes)

    # Check that only 1 input of Add is quantized
    add_nodes = [n for n in graph.nodes if n.op == "Add"]
    for node in add_nodes:
        quantized_inputs = [inp for inp in node.inputs if inp.inputs[0].op == "DequantizeLinear"]
        assert len(quantized_inputs) == 1, (
            f"More than one input of {node.name} is being quantized, but only one should be quantized!"
        )

⚠️ Potential issue | 🟡 Minor

Remove default value from pytest fixture parameter.

The tmp_path parameter should not have a default value since it's a pytest fixture. This is inconsistent with other test functions in the file (e.g., line 66).

Apply this diff:

-def test_conv_batchnorm_sig_mul_int8(tmp_path="./"):
+def test_conv_batchnorm_sig_mul_int8(tmp_path):

Otherwise, the test logic correctly verifies that the new fusion pattern is recognized and that QDQ nodes are properly placed (Conv nodes quantized, only one Add input quantized).

🤖 Prompt for AI Agents
In tests/unit/onnx/test_quantize_int8.py around lines 95 to 121, the test
function signature uses a default value for the pytest fixture parameter
tmp_path (tmp_path="./") which is invalid; remove the default so the fixture is
injected by pytest, i.e., change the function definition to accept tmp_path
without any default (tmp_path) and keep the rest of the test body unchanged.
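
For context, a minimal sketch of how the tmp_path fixture behaves (illustrative test, not from this PR):

def test_writes_into_temp_dir(tmp_path):
    # pytest injects tmp_path as a pathlib.Path pointing at a unique per-test
    # directory; a default value would make it an ordinary argument and bypass
    # fixture injection.
    model_file = tmp_path / "model.onnx"
    model_file.write_bytes(b"")
    assert model_file.is_file()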

Signed-off-by: gcunhase <[email protected]>
Signed-off-by: gcunhase <[email protected]>
@gcunhase gcunhase force-pushed the dev/gcunhasergio/fix_res_add_ConvBNSigMul branch from a480adc to 5aadeac on October 17, 2025 20:01

codecov bot commented Oct 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.43%. Comparing base (8c6b915) to head (5aadeac).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #450      +/-   ##
==========================================
+ Coverage   73.37%   73.43%   +0.05%     
==========================================
  Files         180      180              
  Lines       17937    17937              
==========================================
+ Hits        13162    13172      +10     
+ Misses       4775     4765      -10     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

]

# Create the ONNX graph with the nodes
nodes = [
Contributor


Should we create a flag for BiasAdd that will create two models, one with BiasAdd and one without?
This will help us test both patterns.
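
One possible shape for that suggestion (the bias_add flag and the parametrized test are hypothetical, not part of this PR):

import pytest

@pytest.mark.parametrize("bias_add", [False, True])
def test_conv_batchnorm_sig_mul_int8_variants(tmp_path, bias_add):
    # Hypothetical: the builder grows a bias_add flag so that both fusible
    # path variants (with and without BiasAdd) are exercised by one test.
    onnx_model = build_conv_batchnorm_sig_mul_model(bias_add=bias_add)
    ...  # rest of the checks as in test_conv_batchnorm_sig_mul_int8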
