[5593873] [ONNX] Fix ResAdd logic to support 'Conv-BN-Sigmoid-Mul-Add' as fusible patterns #450
Conversation
Walkthrough

The PR extends ONNX graph fusion patterns to support additional Mul-Sigmoid-BatchNormalization sequences preceding convolution operations, adds a test model generator for this new pattern, and introduces a corresponding quantization verification test.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 1
🧹 Nitpick comments (1)
modelopt/onnx/quantization/graph_utils.py (1)

Lines 204-205: LGTM: Correctly extends fusible patterns for Swish/SiLU activation.

The new patterns properly support the Conv-BN-Sigmoid-Mul sequence, which implements the Swish/SiLU activation (x * sigmoid(x)), where x is the BatchNormalization output. This enables correct ResAdd quantization by identifying this sequence as a fusible backbone, ensuring QDQ nodes are placed only on the residual branch of Add nodes.
Consider adding a brief inline comment documenting what these patterns represent:
```diff
+ # Swish/SiLU activation patterns: x * sigmoid(x) after BatchNorm
  ["Mul", "Sigmoid", "BatchNormalization", conv_type],
  ["Mul", "Sigmoid", "BatchNormalization", "BiasAdd", conv_type],
```
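For readers unfamiliar with how such pattern lists are consumed, here is a minimal sketch (not the actual graph_utils.py logic) of matching a backbone pattern by walking upstream from one input of an Add node with onnx-graphsurgeon; the function name and traversal details are illustrative assumptions:

```python
import onnx_graphsurgeon as gs

def matches_backbone(add_input: gs.Variable, pattern: list[str]) -> bool:
    """Walk producer nodes upstream and check they match `pattern` in order.

    Simplification: always follows the first input of each producer, so a
    Mul whose Sigmoid branch is not input 0 would need extra handling.
    """
    tensor = add_input
    for expected_op in pattern:
        if not tensor.inputs:  # reached a graph input or initializer
            return False
        producer = tensor.inputs[0]  # the node that produces this tensor
        if producer.op != expected_op:
            return False
        tensor = producer.inputs[0]  # continue upstream
    return True

# An Add input whose producer chain is Mul <- Sigmoid <- BatchNormalization <- Conv
# would match ["Mul", "Sigmoid", "BatchNormalization", "Conv"] and be treated as
# the fusible backbone, leaving only the other (residual) input to receive QDQ.
```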
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- modelopt/onnx/quantization/graph_utils.py (1 hunks)
- tests/_test_utils/onnx_quantization/lib_test_models.py (1 hunks)
- tests/unit/onnx/test_quantize_int8.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/_test_utils/onnx_quantization/lib_test_models.py (1)
- modelopt/onnx/utils.py (1): check_model (557-569)

tests/unit/onnx/test_quantize_int8.py (1)
- tests/_test_utils/onnx_quantization/lib_test_models.py (1): build_conv_batchnorm_sig_mul_model (560-675)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
🔇 Additional comments (2)
tests/_test_utils/onnx_quantization/lib_test_models.py (1)

Lines 560-675: LGTM: Test model correctly implements the Conv-BN-Sigmoid-Mul-Add pattern.

The function properly constructs an ONNX graph with the Swish/SiLU activation pattern (Sigmoid and BatchNormalization outputs feeding into Mul) followed by a residual Add. The model structure, initializers, and validation steps are all correctly implemented and consistent with other test models in this file.
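As a rough illustration of the topology that helper builds — tensor names, shapes, and the builder name here are hypothetical stand-ins, not the actual lib_test_models.py code:

```python
import numpy as np
import onnx
import onnx.helper as oh
from onnx import numpy_helper

def make_conv_bn_sig_mul_add_model() -> onnx.ModelProto:
    x = oh.make_tensor_value_info("input", onnx.TensorProto.FLOAT, [1, 3, 8, 8])
    y = oh.make_tensor_value_info("output", onnx.TensorProto.FLOAT, [1, 3, 8, 8])
    w = numpy_helper.from_array(np.random.randn(3, 3, 3, 3).astype(np.float32), "w")
    bn_init = [
        numpy_helper.from_array(arr.astype(np.float32), name)
        for name, arr in [
            ("scale", np.ones(3)), ("bias", np.zeros(3)),
            ("mean", np.zeros(3)), ("var", np.ones(3)),
        ]
    ]
    nodes = [
        oh.make_node("Conv", ["input", "w"], ["conv_out"], pads=[1, 1, 1, 1]),
        oh.make_node("BatchNormalization",
                     ["conv_out", "scale", "bias", "mean", "var"], ["bn_out"]),
        oh.make_node("Sigmoid", ["bn_out"], ["sig_out"]),
        # Swish/SiLU: the BN output multiplied by its own sigmoid.
        oh.make_node("Mul", ["bn_out", "sig_out"], ["mul_out"]),
        # Residual Add: only the second ("input") branch should receive QDQ nodes.
        oh.make_node("Add", ["mul_out", "input"], ["output"]),
    ]
    graph = oh.make_graph(nodes, "conv_bn_sig_mul_add", [x], [y],
                          initializer=[w, *bn_init])
    model = oh.make_model(graph)
    onnx.checker.check_model(model)
    return model
```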
tests/unit/onnx/test_quantize_int8.py (1)

Line 24: LGTM: Import correctly added.
```python
def test_conv_batchnorm_sig_mul_int8(tmp_path="./"):
    onnx_model = build_conv_batchnorm_sig_mul_model()
    onnx_path = os.path.join(tmp_path, "conv_batchnorm_sig_mul_model.onnx")
    save_onnx(onnx_model, onnx_path)

    moq.quantize(onnx_path, quantize_mode="int8", high_precision_dtype="fp16")

    # Output model should be produced in the same tmp_path
    output_onnx_path = onnx_path.replace(".onnx", ".quant.onnx")

    # Check that quantized explicit model is generated
    assert os.path.isfile(output_onnx_path)

    # Load the output model and check QDQ node placements
    graph = gs.import_onnx(onnx.load(output_onnx_path))

    # Check that Conv and ConvTranspose are quantized
    conv_nodes = [n for n in graph.nodes if "Conv" in n.op]
    assert _assert_nodes_are_quantized(conv_nodes)

    # Check that only 1 input of Add is quantized
    add_nodes = [n for n in graph.nodes if n.op == "Add"]
    for node in add_nodes:
        quantized_inputs = [inp for inp in node.inputs if inp.inputs[0].op == "DequantizeLinear"]
        assert len(quantized_inputs) == 1, (
            f"More than one input of {node.name} is being quantized, but only one should be quantized!"
        )
```
Remove default value from pytest fixture parameter.

The tmp_path parameter should not have a default value since it's a pytest fixture. This is inconsistent with other test functions in the file (e.g., line 66).
Apply this diff:

```diff
-def test_conv_batchnorm_sig_mul_int8(tmp_path="./"):
+def test_conv_batchnorm_sig_mul_int8(tmp_path):
```
Otherwise, the test logic correctly verifies that the new fusion pattern is recognized and that QDQ nodes are properly placed (Conv nodes quantized, only one Add input quantized).
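For context, tmp_path is a built-in pytest fixture that supplies a fresh pathlib.Path per test; if the parameter carries a default value, pytest uses the default instead of injecting the fixture, so the test writes into the working directory. A minimal sketch of the intended behavior (test name is hypothetical):

```python
def test_example(tmp_path):
    # pytest injects a unique per-test temporary directory as a pathlib.Path
    p = tmp_path / "model.onnx"
    p.write_bytes(b"")        # written under the temp dir, not the repo root
    assert p.is_file()
```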
🤖 Prompt for AI Agents
In tests/unit/onnx/test_quantize_int8.py around lines 95 to 121, the test
function signature uses a default value for the pytest fixture parameter
tmp_path (tmp_path="./") which is invalid; remove the default so the fixture is
injected by pytest, i.e., change the function definition to accept tmp_path
without any default (tmp_path) and keep the rest of the test body unchanged.
Signed-off-by: gcunhase <[email protected]>
Force-pushed from a480adc to 5aadeac
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #450      +/-   ##
==========================================
+ Coverage   73.37%   73.43%   +0.05%
==========================================
  Files         180      180
  Lines       17937    17937
==========================================
+ Hits        13162    13172      +10
+ Misses       4775     4765      -10
```

☔ View full report in Codecov by Sentry.
```python
    ]

    # Create the ONNX graph with the nodes
    nodes = [
```
Should we create a flag for BiasAdd that creates two models, one with BiasAdd and one without? That would help us test both patterns; see the sketch below.
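If such a flag were added, a single parametrized test could cover both fusible patterns. A sketch assuming a hypothetical `bias` argument on the builder (the import path follows the repo layout shown earlier):

```python
import pytest
from _test_utils.onnx_quantization.lib_test_models import build_conv_batchnorm_sig_mul_model

@pytest.mark.parametrize("with_bias", [False, True])
def test_conv_batchnorm_sig_mul_int8(tmp_path, with_bias):
    # `bias` is the hypothetical flag suggested above: when True, the builder
    # would insert a BiasAdd between the Conv and the BatchNormalization.
    onnx_model = build_conv_batchnorm_sig_mul_model(bias=with_bias)
    ...  # rest of the existing test body unchanged
```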
What does this PR do?
Type of change: Bug fix
Overview: QDQ nodes were being placed on both inputs of Add layers with the Conv-BN-Sigmoid-Mul-Add pattern instead of just the residual branch.

Usage
$ python -m modelopt.onnx.quantization --onnx_path=$MODEL_NAME.onnx
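The same quantization can also be invoked from Python, mirroring the added unit test (assuming the test's `moq` alias stands for modelopt.onnx.quantization):

```python
import modelopt.onnx.quantization as moq

# Same call the new unit test makes; writes model.quant.onnx next to the input.
moq.quantize("model.onnx", quantize_mode="int8", high_precision_dtype="fp16")
```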
Testing
Added unit test.
Before your PR is "Ready for review"
Summary by CodeRabbit

New Features
- Extended ONNX graph fusion patterns to recognize Conv-BN-Sigmoid-Mul (Swish/SiLU) sequences as fusible backbones for ResAdd quantization.

Tests
- Added a test model generator for the new pattern and a quantization test verifying QDQ node placement.