Commit f265f8d

[5693592][ONNX-customOp][Autocast] Fix QuantizeLinear node output type (#671)
## What does this PR do?

**Type of change:** Bug fix

**Overview:** The output type of Q (QuantizeLinear) nodes was being set incorrectly. This PR fixes that by deriving the output type from the node's zero_point input.

## Usage

```shell
python -m modelopt.onnx.autocast --onnx_path=$MODEL_NAME.onnx
```

## Testing

See bug 5693592 for more details:

```shell
python -m modelopt.onnx.autocast --onnx_path=$MODEL_NAME.onnx --low_precision_type=fp16 --data_max=inf --init_max=inf --keep_io_types
```

## Before your PR is "*Ready for review*"

- **Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed.
- **Is this change backward compatible?**: Yes
- **Did you write any new necessary tests?**: No
- **Did you add or update any necessary documentation?**: No
- **Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**: No

Signed-off-by: gcunhase <[email protected]>
1 parent ee8a1f4 commit f265f8d

File tree: 1 file changed (+2, −0 lines)


modelopt/onnx/autocast/precisionconverter.py

Lines changed: 2 additions & 0 deletions
```diff
@@ -296,6 +296,8 @@ def _get_np_type(node, inp, opset=onnx.defs.onnx_opset_version()):
             return helper.tensor_dtype_to_np_dtype(node.attrs["to"])
         elif node.op == "DequantizeLinear":
             return node.inputs[1].dtype  # scale type
+        elif node.op == "QuantizeLinear":
+            return node.inputs[2].dtype  # zero_point type
         elif not inp.dtype or inp.dtype == onnx.TensorProto.UNDEFINED:
             return None
         elif node.op not in self.custom_ops:
```
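To illustrate why the fix reads the dtype from input index 2: per the ONNX operator spec, QuantizeLinear takes (x, y_scale, y_zero_point) and its output type matches y_zero_point, while DequantizeLinear's output type matches its x_scale input. A minimal sketch of this dispatch logic, using hypothetical `Tensor`/`Node` stand-ins rather than ModelOpt's actual classes:

```python
# Sketch of the output-dtype lookup fixed in this PR, with simplified
# stand-in classes (NOT the real precisionconverter.py types).
from dataclasses import dataclass, field


@dataclass
class Tensor:
    dtype: str  # e.g. "float32", "float16", "int8"


@dataclass
class Node:
    op: str
    inputs: list = field(default_factory=list)


def get_output_dtype(node: Node):
    """Return the output dtype implied by the node's Q/DQ inputs."""
    if node.op == "DequantizeLinear":
        # DQ output is a float type: follows the scale input (index 1).
        return node.inputs[1].dtype
    elif node.op == "QuantizeLinear":
        # Q output is a quantized integer type: follows the
        # zero_point input (index 2), per the ONNX spec.
        return node.inputs[2].dtype
    return None


q = Node("QuantizeLinear", [Tensor("float32"), Tensor("float32"), Tensor("int8")])
dq = Node("DequantizeLinear", [Tensor("int8"), Tensor("float16"), Tensor("int8")])
print(get_output_dtype(q))   # int8
print(get_output_dtype(dq))  # float16
```

Before this fix, a Q node's output could be assigned a float type during autocast, which is invalid because QuantizeLinear always emits the quantized integer type of its zero_point.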
