
Conversation

@galagam
Contributor

@galagam galagam commented Sep 28, 2025

What does this PR do?

Type of change: New feature

Overview:
Add nodes_to_include and op_types_to_include options that force-include nodes in the conversion, overriding the NodeClassifier exclusion logic.

Usage

# force-include all Conv nodes for conversion, even if NodeClassifier logic states they should be excluded (kept in high precision)
python3 -m modelopt.onnx.autocast --onnx_path model.onnx --op_types_to_include Conv

# exclude all Conv nodes, include only node named Conv_96
python3 -m modelopt.onnx.autocast --onnx_path model.onnx --op_types_to_exclude Conv --nodes_to_include Conv_96
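For programmatic use, the Python API accepts the same options. A minimal sketch, assuming the convert_to_mixed_precision signature described in the CodeRabbit walkthrough below (the exact parameter names and defaults are assumptions, not verified against the source):

# Sketch: Python equivalent of the CLI examples above.
from modelopt.onnx.autocast.convert import convert_to_mixed_precision

# Force-include all Conv nodes, even if NodeClassifier would exclude them.
model = convert_to_mixed_precision(
    "model.onnx",
    op_types_to_include=["Conv"],
)

# Exclude all Conv nodes, but force the node named Conv_96 back in.
model = convert_to_mixed_precision(
    "model.onnx",
    op_types_to_exclude=["Conv"],
    nodes_to_include=["Conv_96"],
)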

Testing

pytest tests/unit/onnx/autocast/test_nodeclassifier.py::test_node_classifier_force_include
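The new test exercises the interaction between include and exclude rules. A rough sketch of its shape, assuming the NodeClassifier construction and run() flow shown in the sequence diagram below (the exact constructor and return signature are assumptions):

import onnx
from modelopt.onnx.autocast.nodeclassifier import NodeClassifier

model = onnx.load("model.onnx")
classifier = NodeClassifier(
    model,
    op_types_to_exclude=["Conv"],  # exclude every Conv node...
    nodes_to_include=["Conv_96"],  # ...but force this one back to low precision
)
low_precision_nodes, high_precision_nodes = classifier.run()
assert "Conv_96" in low_precision_nodes  # include overrides exclude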

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: Yes
  • Did you add or update any necessary documentation?: Yes
  • Did you update Changelog?: Yes

Additional Information

Summary by CodeRabbit

  • New Features

    • Add options to force-include specific nodes or operator types for reduced-precision conversion; include rules can override exclusion rules.
  • Documentation

    • Guide and examples updated to show new include options and to clarify precision-decision rules.
  • Tests

    • Added unit tests validating force-include behavior and its interaction with exclusion rules.
  • Chores

    • CHANGELOG updated.

@galagam galagam requested a review from a team as a code owner September 28, 2025 15:46
@galagam galagam requested a review from ajrasane September 28, 2025 15:46
@copy-pr-bot

copy-pr-bot bot commented Sep 28, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai bot commented Sep 28, 2025

Walkthrough

Adds force-include controls to AutoCast: new include parameters in CLI and Python API propagate to NodeClassifier, which now evaluates include rules that can override exclude rules when classifying node precision. Documentation, changelog, and unit tests updated.

Changes

  • Changelog & Guide — CHANGELOG.rst, docs/source/guides/8_autocast.rst: documents the new nodes_to_include and op_types_to_include options; updates API examples, the "How It Works" rules, and wording to reference reduced precision (FP16/BF16) and classification notes (data_max/init_max).
  • CLI — modelopt/onnx/autocast/__main__.py: adds --nodes_to_include/-ni and --op_types_to_include/-opi multi-value options; updates IO-type help text; forwards the include lists to convert_to_mixed_precision.
  • Conversion API — modelopt/onnx/autocast/convert.py: extends the convert_to_mixed_precision(...) signature with nodes_to_include and op_types_to_include (defaulting to empty lists); documents them and passes them into NodeClassifier.
  • Classifier Core — modelopt/onnx/autocast/nodeclassifier.py: adds include-rule classes (IncludeNodeNameRegexRule, IncludeOpTypes); introduces nodes_to_include / op_types_to_include fields; splits rule generation into _gen_exclude_node_rules and _gen_include_node_rules; updates run logic so include matches override exclude matches when deciding precision (see the sketch after this list).
  • Tests — tests/unit/onnx/autocast/test_nodeclassifier.py: adds test_node_classifier_force_include to verify that nodes_to_include/op_types_to_include force nodes into low precision and that include rules can override exclude rules.
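The include rules mirror the existing exclude rules. A minimal sketch of the pattern: only the class names IncludeNodeNameRegexRule and IncludeOpTypes come from the PR; the method names and structure here are assumptions:

import re
from onnx import NodeProto

class IncludeNodeNameRegexRule:
    """Match nodes whose name matches any of the given regex patterns."""

    def __init__(self, node_name_patterns: list[str]):
        self.patterns = [re.compile(p) for p in node_name_patterns]

    def check(self, node: NodeProto) -> bool:
        return any(p.fullmatch(node.name) for p in self.patterns)

class IncludeOpTypes:
    """Match nodes whose op_type is in the given set of operator types."""

    def __init__(self, op_types: list[str]):
        self.op_types = set(op_types)

    def check(self, node: NodeProto) -> bool:
        return node.op_type in self.op_types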

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor U as User/Script
  participant CLI as AutoCast CLI
  participant Convert as convert_to_mixed_precision
  participant NC as NodeClassifier
  participant ONNX as ONNX Model

  U->>CLI: autocast --nodes_to_include ... --op_types_to_include ...
  CLI->>Convert: convert_to_mixed_precision(..., nodes_to_include, op_types_to_include)
  Convert->>NC: NodeClassifier(..., nodes_to_include, op_types_to_include)
  Convert->>NC: run(model)
  loop For each node
    NC->>NC: Evaluate exclude rules
    NC->>NC: Evaluate include rules
    alt Matches exclude AND not matches include
      NC-->>Convert: classify node as High precision (keep FP32)
    else Otherwise (include matched or not excluded)
      NC-->>Convert: classify node as Low precision (cast to FP16/BF16)
    end
  end
  Convert->>ONNX: Apply casts per classification
  Convert-->>CLI: Return mixed-precision model
  CLI-->>U: Output model/report

  note over NC: Include rules take precedence and can force low precision
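In code, the precedence in the diagram reduces to a single predicate per node. A minimal sketch (the rule objects and function name are illustrative, not the PR's internals):

def is_high_precision(node, exclude_rules, include_rules) -> bool:
    # A node stays in FP32 only if some exclude rule matches it
    # AND no include rule claims it back.
    excluded = any(rule.check(node) for rule in exclude_rules)
    included = any(rule.check(node) for rule in include_rules)
    return excluded and not included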

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I nudge the graph with velvet paws,
Include, exclude—such tidy laws.
A hop to force the bits to small,
I coax some nodes to shrink their all.
Carrots count in FP16, hooray! 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Description Check: ✅ Passed — check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed — the title succinctly and accurately describes the primary change: adding options to AutoCast for forcing the inclusion of specific nodes or op types in FP16 precision, matching the pull request's content and objectives.
  • Docstring Coverage: ✅ Passed — no functions found in the changes; docstring coverage check skipped.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d71399d and 725e6ca.

📒 Files selected for processing (1)
  • CHANGELOG.rst (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • CHANGELOG.rst
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: linux
  • GitHub Check: wait-checks / wait
  • GitHub Check: build-docs
  • GitHub Check: code-quality


@codecov

codecov bot commented Sep 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.40%. Comparing base (14fa1e5) to head (a739537).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #386      +/-   ##
==========================================
+ Coverage   73.38%   73.40%   +0.02%     
==========================================
  Files         180      180              
  Lines       18111    18127      +16     
==========================================
+ Hits        13290    13306      +16     
  Misses       4821     4821              

☔ View full report in Codecov by Sentry.

@galagam galagam force-pushed the dev-gagam-force-input-nodes branch from be7614f to 1ac7868 on September 28, 2025 16:07
@galagam galagam requested review from gcunhase and i-riyad October 9, 2025 07:08
@gcunhase
Contributor

gcunhase commented Oct 9, 2025

@galagam is it possible to trickle this behavior down in the quantization workflow as well? Thanks!

@galagam
Contributor Author

galagam commented Oct 12, 2025

@galagam is it possible to trickle this behavior down in the quantization workflow as well? Thanks!

@gcunhase If you recall, the current quantization workflow uses convert_to_f16, which converts all nodes to FP16 unless they are explicitly excluded (it bypasses NodeClassifier). If/when we integrate convert_to_mixed_precision into the quantization flow, we'll also need to expose these flags there. With the current behavior they would have no effect.
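For context, a sketch of the contrast being described. The function names convert_to_f16 and convert_to_mixed_precision come from the comment above; the exclusion parameter name nodes_to_exclude is a hypothetical placeholder:

# Blanket path used by the quantization workflow today: everything goes to
# FP16 unless explicitly excluded; NodeClassifier never runs, so the new
# include flags would have nothing to override.
# (nodes_to_exclude is a hypothetical name for the explicit exclusion knob.)
f16_model = convert_to_f16(model, nodes_to_exclude=["Conv_96"])

# Mixed-precision path: NodeClassifier decides per node, so
# nodes_to_include / op_types_to_include can override its exclusions.
mixed_model = convert_to_mixed_precision("model.onnx", nodes_to_include=["Conv_96"])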

@galagam
Contributor Author

galagam commented Oct 15, 2025

@gcunhase @i-riyad please help review


@gcunhase gcunhase left a comment


LGTM!

@galagam galagam force-pushed the dev-gagam-force-input-nodes branch from 725e6ca to 2eefdf0 on October 28, 2025 18:24
Add options nodes_to_include, op_types_to_include that force-include nodes in the conversion, overriding NodeClassifier exclusion logic

Signed-off-by: Gal Hubara Agam <[email protected]>
Signed-off-by: Gal Hubara Agam <[email protected]>
@galagam galagam force-pushed the dev-gagam-force-input-nodes branch from 2eefdf0 to a739537 on October 28, 2025 18:26
@galagam galagam enabled auto-merge (squash) October 28, 2025 18:33
@galagam galagam merged commit 7b6a15a into NVIDIA:main Oct 28, 2025
26 checks passed
