@kahmed10
Collaborator

Motivation

The current quantization passes do not split single dynamic dimensions before running on dynamic shapes. Without the split, the matchers error out when they encounter a dynamic shape.

Technical Details

  • This change assumes that the presence of select_module indicates that the split_single_dyn_dim pass has already run on the main module.
  • The truncate_float and quantize_8bits passes also need this change so that the converts are inserted in the submodules. Otherwise, other passes will try to optimize or compile the converts in the main module and fail when they hit dynamic shapes.
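
As a rough illustration of the two points above, the following sketch shows a pass that leaves any module containing select_module untouched and inserts converts only in the remaining modules. The types and functions (`toy_instruction`, `toy_module`, `has_select_module`, `toy_quantize`) are hypothetical stand-ins invented for this example, not the MIGraphX classes or the code in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the MIGraphX classes.
struct toy_instruction
{
    std::string name;
};

struct toy_module
{
    std::string name;
    std::vector<toy_instruction> instructions;
    bool converts_inserted = false; // set when the toy pass "quantizes" this module
};

// True when the module holds a select_module instruction.
bool has_select_module(const toy_module& m)
{
    return std::any_of(m.instructions.begin(),
                       m.instructions.end(),
                       [](const toy_instruction& ins) { return ins.name == "select_module"; });
}

// Toy pass: skip any module that dispatches through select_module and insert
// converts only in the remaining modules. Returns the number of modules changed.
std::size_t toy_quantize(std::vector<toy_module>& modules)
{
    assert(!modules.empty() && "expected at least a main module");
    std::size_t changed = 0;
    for(auto& m : modules)
    {
        if(has_select_module(m))
            continue; // main module after the split: leave it alone
        m.converts_inserted = true;
        ++changed;
    }
    return changed;
}
```

With a main module that holds select_module and one submodule that does not, `toy_quantize` changes only the submodule, which mirrors the intent that converts end up in the submodules.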

Changelog Category

    • Added: New functionality.
    • Changed: Changes to existing functionality.
    • Removed: Functionality or support that has been removed. (Compared to a previous release)
    • Optimized: Component performance that has been optimized or improved.
    • Resolved Issues: Known issues from a previous version that have been resolved.
    • Not Applicable: This PR is not to be included in the changelog.

@kahmed10 requested a review from causten as a code owner November 25, 2025 19:33
@kahmed10 changed the title from "Update quantization to support dynamic shapes" to "Update quantization to support dynamic shapes with single dynamic dimension" Nov 25, 2025
@kahmed10 changed the title from "Update quantization to support dynamic shapes with single dynamic dimension" to "AIMIGRAPHX-351 Update quantization to support dynamic shapes with single dynamic dimension" Nov 25, 2025
@kahmed10 requested a review from CharlieL7 November 25, 2025 20:07
@pfultz2
Collaborator

pfultz2 commented Nov 25, 2025

This change assumes that select_module is an indicator that split_single_dyn_dim pass has already ran on the main module.

I don't think that's a good assumption to make.

{split_single_dyn_dim{},
dead_code_elimination{},
simplify_dyn_ops{},
dead_code_elimination{},
Collaborator

Why do these passes need to be run?

Collaborator Author

split_single_dyn_dim will create the submodules but won't simplify other ops further, so simplify_dyn_ops is used to get rid of things like 2-input multibroadcasts and slices. A while back I found from testing models that this was needed.

Collaborator

It seems we should be able to quantize without needing to convert dynamic shapes, which is a decision that the backend target should make.

Collaborator Author

I agree that should be the case, but currently optimize_module will fail as soon as a matcher tries to interact with a dynamic shape.

@causten requested a review from Copilot November 25, 2025 20:21
Copilot finished reviewing on behalf of causten November 25, 2025 20:25
Copilot AI left a comment

Pull request overview

This PR enhances quantization passes to properly support dynamic shapes with single dynamic dimensions. The changes ensure that the split_single_dyn_dim pass runs before quantization, and that quantization operations are performed on submodules rather than the main module when dynamic shapes are involved.

Key changes:

  • Adds split_single_dyn_dim preprocessing to all quantization entry points
  • Implements select_module detection to skip processing already-transformed main modules
  • Ensures convert operations are inserted in submodules for proper dynamic shape handling

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
|---|---|
| src/truncate_float.cpp | Adds select_module check to skip main module and copyright year update to 2025 |
| src/split_single_dyn_dim.cpp | Adds early return when select_module exists, indicating pass already ran |
| src/quantize_8bits.cpp | Adds select_module checks in both quantize_8bits_pass and capture_arguments_pass to skip main module |
| src/quantization.cpp | Adds split_single_dyn_dim preprocessing sequence to all quantization functions (fp16, bf16, int8, int4) |

Comment on lines 38 to +42
static void
quantize_module(module& m, const std::vector<std::string>& ins_names, shape::type_t float_type)
{
// skip main module that contains select_module
if(any_of(m.begin(), m.end(), [](auto ins) { return ins.name() == "select_module"; }))
Copilot AI Nov 25, 2025

This check for select_module is duplicated across multiple files (truncate_float.cpp, quantize_8bits.cpp twice, and split_single_dyn_dim.cpp). Consider extracting this into a helper function to improve maintainability and ensure consistent behavior. For example:

// In a common utility header
inline bool module_has_select_module(const module& m) {
    return any_of(m.begin(), m.end(), [](auto ins) { return ins.name() == "select_module"; });
}

This would reduce code duplication and make future changes easier.

Suggested change
- static void
- quantize_module(module& m, const std::vector<std::string>& ins_names, shape::type_t float_type)
- {
-     // skip main module that contains select_module
-     if(any_of(m.begin(), m.end(), [](auto ins) { return ins.name() == "select_module"; }))
+ // Helper function to check for select_module
+ static inline bool module_has_select_module(const module& m)
+ {
+     return any_of(m.begin(), m.end(), [](auto ins) { return ins.name() == "select_module"; });
+ }
+ static void
+ quantize_module(module& m, const std::vector<std::string>& ins_names, shape::type_t float_type)
+ {
+     // skip main module that contains select_module
+     if(module_has_select_module(m))

Comment on lines +73 to +77
{split_single_dyn_dim{},
dead_code_elimination{},
simplify_dyn_ops{},
dead_code_elimination{},
normalize_ops{},
Copilot AI Nov 25, 2025

The pass sequence {split_single_dyn_dim{}, dead_code_elimination{}, simplify_dyn_ops{}, dead_code_elimination{}} is repeated four times in this file (lines 73-76, 87-90, 107-110, 211-214). Consider extracting this into a named constant or helper function to improve maintainability:

static std::vector<pass> get_dyn_shape_preprocessing_passes() {
    return {split_single_dyn_dim{},
            dead_code_elimination{},
            simplify_dyn_ops{},
            dead_code_elimination{}};
}

This would make the code DRYer and ensure consistent preprocessing across all quantization functions.

Comment on lines +98 to +102
// skip main module that contains select_module (meaning this pass already ran)
if(any_of(mm->begin(), mm->end(), [](auto ins) { return ins.name() == "select_module"; }))
{
return true;
}
Copilot AI Nov 25, 2025

The logic change here makes any_sm_next return true whenever the module contains select_module, effectively treating that case the same as a parameter that outputs to select_module. This seems inconsistent with the function's documented purpose, which is to check if "any of the parameters outputs to a select_module operator".

Consider either:

  1. Renaming this function to better reflect its new behavior (e.g., should_skip_module), or
  2. Moving this check out of the function and into the caller, to keep the function focused on its original purpose.
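
A minimal sketch of the second option, assuming hypothetical stand-in types and names (`toy_instruction`, `toy_module`, `any_param_feeds_select_module`, `contains_select_module`, `should_skip_split`) that are not part of MIGraphX: the narrow predicate keeps its documented meaning, the module-level check becomes its own function, and the caller combines them.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the MIGraphX types.
struct toy_instruction
{
    std::string name;
    std::vector<std::string> outputs; // names of the instructions that consume this one
};

using toy_module = std::vector<toy_instruction>;

// Narrow question (the documented purpose): does any parameter output to select_module?
bool any_param_feeds_select_module(const toy_module& m)
{
    return std::any_of(m.begin(), m.end(), [](const toy_instruction& ins) {
        return ins.name == "@param" &&
               std::find(ins.outputs.begin(), ins.outputs.end(), "select_module") !=
                   ins.outputs.end();
    });
}

// Module-level question: is there a select_module anywhere in the module?
bool contains_select_module(const toy_module& m)
{
    return std::any_of(m.begin(), m.end(), [](const toy_instruction& ins) {
        return ins.name == "select_module";
    });
}

// The caller combines both checks, so each predicate keeps a single meaning.
bool should_skip_split(const toy_module& m)
{
    return contains_select_module(m) || any_param_feeds_select_module(m);
}

// Self-check: select_module is present but fed by a non-parameter instruction.
void skip_check_example()
{
    toy_module m;
    m.push_back(toy_instruction{"@param", {"add"}});
    m.push_back(toy_instruction{"add", {"select_module"}});
    m.push_back(toy_instruction{"select_module", {}});
    assert(!any_param_feeds_select_module(m)); // narrow predicate stays false
    assert(should_skip_split(m));              // the caller still skips the module
}
```

In this toy model the module-level check already covers every case the narrow one catches, so keeping both in the caller is purely about naming: each function answers exactly the question its name asks.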

Comment on lines +73 to 80
{split_single_dyn_dim{},
dead_code_elimination{},
simplify_dyn_ops{},
dead_code_elimination{},
normalize_ops{},
optimize_module{{"quantizelinear", "dequantizelinear"}},
truncate_float_pass{ins_names, shape::half_type},
optimize_module{{"quantizelinear", "dequantizelinear"}}},
Copilot AI Nov 25, 2025

The changes add support for dynamic shapes with single dynamic dimensions in quantization passes, but there are no tests verifying this new functionality. Consider adding tests that combine dynamic shapes with quantization to ensure the new behavior works correctly, such as:

  • Test quantize_fp16 with a dynamic shape input
  • Test quantize_bf16 with a dynamic shape input
  • Test quantize_int8 with a dynamic shape input
  • Test quantize_int4_weights with a dynamic shape input

This is especially important given that the PR description states that, without this change, "The matchers will error out".

@codecov
codecov bot commented Nov 25, 2025

Codecov Report

❌ Patch coverage is 74.19355% with 8 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/quantization.cpp | 70.37% | 8 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4467      +/-   ##
===========================================
- Coverage    92.21%   92.20%   -0.01%     
===========================================
  Files          561      561              
  Lines        27228    27254      +26     
===========================================
+ Hits         25107    25127      +20     
- Misses        2121     2127       +6     
| Files with missing lines | Coverage Δ |
|---|---|
| src/quantize_8bits.cpp | 100.00% <100.00%> (ø) |
| src/split_single_dyn_dim.cpp | 98.44% <100.00%> (+0.02%) ⬆️ |
| src/truncate_float.cpp | 100.00% <100.00%> (ø) |
| src/quantization.cpp | 81.63% <70.37%> (-3.70%) ⬇️ |

... and 1 file with indirect coverage changes

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@kahmed10
Collaborator Author

This change assumes that select_module is an indicator that split_single_dyn_dim pass has already ran on the main module.

I don't think that's a good assumption to make.

I'm open to other suggestions. In the past, I tried adding an attribute to the program to indicate that it had already completed the split_single_dyn_dim pass, but that seemed messy.

Right now we don't have any other way to insert a select_module besides the split_single_dyn_dim pass (unless the user explicitly inserts it into the IR before compiling, but then the pass shouldn't run anyway). If the functionality changes, we can revisit this.
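
The trade-off between the two approaches can be sketched as follows. This is only a toy model under stated assumptions: `toy_program`, its `split_done` flag, `contains_select_module`, and `toy_split` are hypothetical names made up for the example, and the flag stands in for the "additional attribute" idea rather than any existing MIGraphX feature.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a program; this is not the MIGraphX class.
struct toy_program
{
    std::vector<std::string> main_ops; // instruction names in the main module
    bool split_done = false;           // explicit marker (the "attribute" alternative)
};

// Detection-based check: infer that the split ran from the IR itself.
bool contains_select_module(const toy_program& p)
{
    return std::find(p.main_ops.begin(), p.main_ops.end(), "select_module") !=
           p.main_ops.end();
}

// Toy split pass: appends select_module once and records that it ran.
void toy_split(toy_program& p)
{
    if(contains_select_module(p))
        return; // early return based on detection, no extra program state needed
    p.main_ops.push_back("select_module");
    p.split_done = true;
    assert(contains_select_module(p)); // after a real run, both signals agree
}
```

The two signals differ only when select_module was inserted by hand: detection reports it while the flag stays false. In that case the toy pass still returns early, which matches the behavior described above for user-inserted select_module.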
