
Conversation

@kinjalpatel27
Contributor

Co-authored-by: Asma Kuriparambil Thekkumpate <[email protected]>

What does this PR do?

Type of change: New Feature

Overview:
This PR adds support for quantizing TE ops in Megatron, specifically TERowParallelLinear, TEColumnParallelLinear, and TELayerNormColumnParallelLinear.

Usage

It can be used by enabling the TE layer spec in Megatron.
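As an illustration of what quantizing these linear ops means, here is a conceptual pure-Python sketch of per-tensor symmetric INT8 fake quantization (this is not the ModelOpt or TE API; the function and variable names are hypothetical):

```python
# Conceptual sketch only: per-tensor symmetric INT8 fake quantization,
# the kind of transform a quantized linear op applies to its weights
# during calibration/QAT. Not the actual ModelOpt/TE implementation.

def fake_quantize_int8(values):
    """Quantize a list of floats to INT8 and dequantize back."""
    amax = max(abs(v) for v in values)       # per-tensor absolute max
    scale = amax / 127.0 if amax else 1.0    # map [-amax, amax] -> [-127, 127]
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return [qi * scale for qi in q]          # dequantized ("fake quant") values

weights = [0.5, -1.27, 0.03, 1.0]
fq = fake_quantize_int8(weights)

# Each fake-quantized value stays within one quantization step of the original.
step = max(abs(w) for w in weights) / 127.0
assert all(abs(a - b) <= step for a, b in zip(weights, fq))
```

The quantized module runs this transform on the fly so the surrounding parallelism logic (row/column sharding) is unchanged.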

Testing

Added unit tests for testing functionality
test_homogeneous_sharded_state_dict_te_spec
test_convert_mcore_te_gpt_model
test_quantize_forward_backward

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: Yes
  • Did you add or update any necessary documentation?: Yes
  • Did you update Changelog?: Yes

Additional Information

@kinjalpatel27 kinjalpatel27 requested review from a team as code owners December 2, 2025 20:35
@codecov

codecov bot commented Dec 2, 2025

Codecov Report

❌ Patch coverage is 83.78378% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.50%. Comparing base (53a2dde) to head (ddecbc3).
⚠️ Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
modelopt/torch/utils/logging.py 80.00% 6 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #632   +/-   ##
=======================================
  Coverage   74.50%   74.50%           
=======================================
  Files         183      183           
  Lines       18400    18444   +44     
=======================================
+ Hits        13709    13742   +33     
- Misses       4691     4702   +11     


Contributor

@mxinO mxinO left a comment


Looks great to me! Do we still need a special ModelOpt layer spec after this PR?

Co-authored-by: Asma Kuriparambil Thekkumpate <[email protected]>
Co-authored-by: Kinjal Patel <[email protected]>
Signed-off-by: Kinjal Patel <[email protected]>
@kinjalpatel27 kinjalpatel27 requested a review from mxinO December 8, 2025 17:09
@kinjalpatel27 kinjalpatel27 force-pushed the kinjal/te_megatron_support branch from d7c4802 to 021718d Compare December 8, 2025 17:28
Contributor

@meenchen meenchen left a comment


Do you know how much improvement we can get by enabling TE?

Signed-off-by: Kinjal Patel <[email protected]>
Contributor

@meenchen meenchen left a comment


LGTM. I am wondering if you have some data points on enabling TE in terms of perf and accuracy. Also, do we need to update the ModelOpt example in Megatron to enable this by default?

@kinjalpatel27
Contributor Author

kinjalpatel27 commented Dec 8, 2025

Thanks @meenchen

I am wondering if you have some data points of enabling TE in terms of perf and accuracy.

@realAsma, do you have data points in terms of perf and accuracy for TE?

Also, do we need to update the modelopt example in megatron to enable this by default?

We will enable TE by default in Megatron and remove the local spec to keep things consistent between megatron-core and the Megatron + ModelOpt example.

Signed-off-by: Kinjal Patel <[email protected]>
@realAsma
Contributor

I am wondering if you have some data points of enabling TE in terms of perf and accuracy.
@meenchen

  1. TELinear support will help us get rid of the ModelOpt model provider and simplify and clean up MCore flows, so we would be able to quantize any MCore model without having to explicitly convert to the local spec.
  2. TELayerNormColumnParallelLinear reduces memory consumption by orders of magnitude because of sequence parallelism. With the fused TELayerNorm we don't checkpoint the intermediate activations between the layer norm and the linear layer.
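To make point 2 concrete, here is a back-of-envelope sketch (with hypothetical model shapes, not measured numbers) of the layer-norm activations that the fused op avoids checkpointing:

```python
# Back-of-envelope sketch, hypothetical shapes: the activation a fused
# LayerNorm+Linear avoids storing is the layer-norm output, one
# [seq_len, batch, hidden] tensor per transformer layer.

def layernorm_activation_bytes(seq_len, batch, hidden, layers, bytes_per_el=2):
    """Memory for the layer-norm outputs that fusion avoids storing (bf16/fp16)."""
    return seq_len * batch * hidden * layers * bytes_per_el

# Example: 4096-token sequences, batch 8, hidden 4096, 32 layers, bf16.
saved = layernorm_activation_bytes(4096, 8, 4096, 32)
print(f"~{saved / 2**30:.1f} GiB of layer-norm activations not checkpointed")
# prints: ~8.0 GiB of layer-norm activations not checkpointed
```

Sequence parallelism further divides this per-GPU footprint by the tensor-parallel size, which is where the large savings come from.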

Signed-off-by: Kinjal Patel <[email protected]>
@kinjalpatel27 kinjalpatel27 merged commit f731379 into main Dec 11, 2025
48 of 50 checks passed
@kinjalpatel27 kinjalpatel27 deleted the kinjal/te_megatron_support branch December 11, 2025 20:01
b7r6 pushed a commit to weyl-ai/Model-Optimizer that referenced this pull request Dec 18, 2025