[Inductor][float8] Support qlinear for float8 in inductor #2565

shiyang-weng · 2025-07-17T02:59:10Z

For float8_e4m3fn, this PR adds Inductor support for:

register_qlinear_weight_prepack
_register_qlinear_unary_fusion
_register_qlinear_binary_fusion
quant_lift_up

FP8 raises the following issues:

q/dq switch to quantize_affine_float8/dequantize_affine_float8.
The q/dq API changes: the fp8 q/dq ops require scale to be a tensor.
pt2e does not support float8.
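
For readers unfamiliar with fp8 q/dq, the arithmetic behind a per-tensor quantize/dequantize round trip can be sketched in plain Python. This is only an emulation under stated assumptions: the helper names (round_to_e4m3fn, quantize_fp8, dequantize_fp8, per_tensor_scale) are hypothetical and are not the torchao ops, and the scale is a plain float here, whereas the real fp8 q/dq ops require a tensor scale as noted above.

```python
import math

E4M3FN_MAX = 448.0  # largest finite float8_e4m3fn value


def round_to_e4m3fn(v: float) -> float:
    """Emulate round-to-nearest-even onto the float8_e4m3fn grid, saturating at +/-448."""
    if v == 0.0:
        return 0.0
    sign = -1.0 if v < 0 else 1.0
    a = min(abs(v), E4M3FN_MAX)
    # frexp gives a = m * 2**e with m in [0.5, 1), so floor(log2(a)) == e - 1.
    exp = max(math.frexp(a)[1] - 1, -6)  # -6 is the smallest normal exponent
    step = 2.0 ** (exp - 3)              # 3 mantissa bits -> 8 steps per binade
    return sign * min(round(a / step) * step, E4M3FN_MAX)


def per_tensor_scale(xs):
    """Pick a scale that maps the largest magnitude onto the fp8 maximum."""
    amax = max(abs(x) for x in xs)
    return amax / E4M3FN_MAX if amax > 0 else 1.0


def quantize_fp8(xs, scale):
    # Hypothetical stand-in for the quantize step: divide by scale, round to fp8.
    return [round_to_e4m3fn(x / scale) for x in xs]


def dequantize_fp8(qs, scale):
    # Hypothetical stand-in for the dequantize step: multiply back by scale.
    return [q * scale for q in qs]


xs = [0.1, -2.5, 7.0, 896.0]
scale = per_tensor_scale(xs)
qs = quantize_fp8(xs, scale)
ys = dequantize_fp8(qs, scale)
```

Values that land exactly on the fp8 grid after scaling survive the round trip unchanged, while others pick up a small rounding error.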

Based on these issues:

The fp8 q/dq pattern needs to be handled separately.
The scale needs to be handled separately.
We implement a function, fp8_convert_, which adds q/dq before the linear layers in the model. The function is added to test/quantization/pt2e/test_x86inductor_fusion.py.
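
The idea of adding q/dq before each linear can be illustrated with a toy module tree. This is a minimal sketch, not the fp8_convert_ from this PR: Linear, Container, QDQThenLinear, insert_qdq_before_linear and fake_qdq are all hypothetical stand-ins written in plain Python, and fake_qdq only saturates to the float8_e4m3fn range instead of doing a real fp8 round trip.

```python
class Linear:
    """Stand-in for a linear layer (hypothetical, not torch.nn.Linear)."""

    def __init__(self, weight):
        self.weight = weight  # list of rows

    def __call__(self, x):
        return [sum(w * v for w, v in zip(row, x)) for row in self.weight]


class Container:
    """Stand-in for a module holding named children that run in order."""

    def __init__(self, **children):
        self.children = dict(children)

    def __call__(self, x):
        for child in self.children.values():
            x = child(x)
        return x


class QDQThenLinear:
    """Runs a quantize/dequantize pair on the input, then the wrapped linear."""

    def __init__(self, linear, qdq):
        self.linear = linear
        self.qdq = qdq

    def __call__(self, x):
        return self.linear(self.qdq(x))


def insert_qdq_before_linear(module, qdq):
    """Recursively wrap every Linear child in place; returns the number wrapped."""
    count = 0
    for name, child in list(module.children.items()):
        if isinstance(child, Linear):
            module.children[name] = QDQThenLinear(child, qdq)
            count += 1
        elif isinstance(child, Container):
            count += insert_qdq_before_linear(child, qdq)
    return count


def fake_qdq(x):
    # Placeholder for a real fp8 quantize -> dequantize round trip:
    # it only saturates to the float8_e4m3fn range of +/-448.
    return [max(-448.0, min(448.0, v)) for v in x]


model = Container(
    fc1=Linear([[1.0, 0.0], [0.0, 1.0]]),
    block=Container(fc2=Linear([[1.0, 1.0]])),
)
wrapped = insert_qdq_before_linear(model, fake_qdq)
out = model([1000.0, 2.0])
```

After the rewrite, every linear in the tree sees an input that has passed through q/dq, which is the shape of graph the fp8 qlinear patterns are meant to match.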

pytorch-bot · 2025-07-17T02:59:14Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2565

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 38de0e9 with merge base 11ce634:

NEW FAILURES - The following jobs have failed:

Run Regression Tests / test-nightly (CPU Nightly, linux.4xlarge, --pre torch --index-url https://download.pytorch.org/wh... / linux-job (gh)
test/quantization/pt2e/test_x86inductor_fusion.py::TestPatternMatcher::test_fp8_qlinear_relu_mixed_bf16_input_dim_exceeds_2
Run Regression Tests / test-nightly (CUDA Nightly, linux.g5.12xlarge.nvidia.gpu, --pre torch --index-url https://downloa... / linux-job (gh)
test/quantization/pt2e/test_x86inductor_fusion.py::TestPatternMatcher::test_fp8_qlinear_add_cpu_use_relu_True_mixed_bf16_True

This comment was automatically generated by Dr. CI and updates every 15 minutes.

test/quantization/pt2e/test_x86inductor_fusion.py

torchao/quantization/pt2e/inductor_passes/x86.py

Xia-Weiwen

Thanks for the PR!

test/quantization/pt2e/test_x86inductor_fusion.py

torchao/quantization/pt2e/inductor_passes/x86.py

Xia-Weiwen

LGTM

pytorch-bot · 2025-07-23T07:39:47Z

Didn't find following labels among repository labels: topic:,not,user,facing

shiyang-weng · 2025-07-23T07:40:04Z

@pytorchbot label "topic: not user facing"

shiyang-weng added 20 commits June 18, 2025 15:22

quantize_affine_float8/dequantize_affine_float8 not decomposed on inductor

a840ef5

remove redundant unittest.skipIf

02d045b

fix rebase issue

9860c56

change dispatch key to a flag decomposed

ca662f3

support scaled_mm on inductor

f51a5be

fix rebase issue

719793c

support dequant promtion for fp8

48a3d99

add ut

1921b2f

remove redundant codes

0335415

Merge pull request #2 from shiyang-weng/wengshiy/dequant_promotion

955fa6e

Add fp8 dequant promotion

Merge remote-tracking branch 'origin/main' into wengshiy/scaled_mm

a70e094

fix lint

a5bb4d0

Merge branch 'main' into wengshiy/scaled_mm

1c1f890

resolve conflict

0c7f8ea

change to use qlinear

0175b17

add ut

564d4b7

fix lint

9948674

Merge remote-tracking branch 'origin/main' into wengshiy/qlinear

413a883

support fp8 quant_lift_up

558d216

add reshape into _VIEW_METHOD_OPS

8cd1433

shiyang-weng marked this pull request as draft July 17, 2025 02:59

meta-cla bot added the CLA Signed label Jul 17, 2025

shiyang-weng commented Jul 17, 2025

test/quantization/pt2e/test_x86inductor_fusion.py

shiyang-weng added 6 commits July 17, 2025 09:53

add quant_input_check

ae4f582

Merge remote-tracking branch 'origin/main' into wengshiy/qlinear

469ac50

fix lint

8026306

refine ut

f735949

remove fp8 dynamic quant ut

5803511

fix output_scale issue

3e37dea

Merge remote-tracking branch 'origin/main' into wengshiy/qlinear

497de92

shiyang-weng commented Jul 21, 2025

torchao/quantization/pt2e/inductor_passes/x86.py

Xia-Weiwen reviewed Jul 21, 2025

shiyang-weng added 4 commits July 21, 2025 22:29

add float8_e4m3fn to dtype_list

d9ac092

refine code

f88db2d

refine code

4f4eb8b

fix bugs

7c3f9f9

shiyang-weng requested a review from Xia-Weiwen July 23, 2025 02:55

add comment

38de0e9

Xia-Weiwen approved these changes Jul 23, 2025

pytorch-bot bot added the topic: not user facing label Jul 23, 2025