Reference representation of dqlinear int4 for xnnpack #2520

Open: wants to merge 23 commits into base: main

kimishpatel
Contributor

@kimishpatel kimishpatel commented Jul 10, 2025

Stack from ghstack (oldest at bottom):

Summary:
This diff adds an integer-arithmetic reference representation for the
dynamically quantized linear op. It is quite close to how the arithmetic
is done in xnnpack.

Basic tests against the q/dq path are added to make sure things are sane.
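
To make the idea concrete, here is a minimal pure-Python sketch of an integer-arithmetic linear next to a q/dq reference. It is not the code from this PR: the function names are hypothetical, and it assumes asymmetric int8 activations quantized per row, symmetric int4 weights with one scale per group, and an input width that is a multiple of the group size.

```python
def quantize_activation_row(row):
    """Asymmetric int8 dynamic quantization of one activation row (assumed scheme)."""
    lo, hi = min(min(row), 0.0), max(max(row), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for an all-zero row
    zero_point = max(-128, min(127, round(-128 - lo / scale)))
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in row]
    return q, scale, zero_point


def quantize_weight_row(row, group_size):
    """Symmetric int4 quantization with one scale per group of inputs (assumed scheme)."""
    q, scales = [], []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        scale = max(abs(v) for v in group) / 7.0 or 1.0
        scales.append(scale)
        q.extend(max(-8, min(7, round(v / scale))) for v in group)
    return q, scales


def linear_integer(qx, x_scale, x_zp, qw, w_scales, group_size):
    """One output element: integer accumulation per group, then a float rescale."""
    out = 0.0
    for g, w_scale in enumerate(w_scales):
        ks = range(g * group_size, (g + 1) * group_size)
        acc = sum(qx[k] * qw[k] for k in ks)   # integer dot product
        acc -= x_zp * sum(qw[k] for k in ks)   # activation zero-point correction
        out += acc * x_scale * w_scale         # apply both scales once per group
    return out


def linear_qdq(qx, x_scale, x_zp, qw, w_scales, group_size):
    """Same element via dequantize, then float multiply-accumulate (q/dq reference)."""
    x = [(v - x_zp) * x_scale for v in qx]
    w = [v * w_scales[k // group_size] for k, v in enumerate(qw)]
    return sum(a * b for a, b in zip(x, w))


x = [0.5, -1.25, 2.0, 0.75]   # one activation row
w = [0.1, -0.4, 0.3, 0.2]     # one output channel of the weight
qx, xs, xzp = quantize_activation_row(x)
qw, ws = quantize_weight_row(w, group_size=2)
y_int = linear_integer(qx, xs, xzp, qw, ws, 2)
y_ref = linear_qdq(qx, xs, xzp, qw, ws, 2)
```

Both paths start from the same quantized values, so `y_int` and `y_ref` should differ only by floating-point rounding; that is the kind of sanity check a test against q/dq can make.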

Followups:

  • See if such a graph is traceable.
  • Optimize implementation if needed

Test Plan:
added

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D78198154

tuples

Summary:
This is needed because lists are mutable and therefore not hashable; as
a result, we cannot use literals_to_ph in the pattern rewrites inside
reference_representation_rewrite.py.
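
A small sketch of the hashability issue, in plain Python. The helper `freeze_literal` and the dictionary `literal_to_placeholder` are hypothetical names used only for illustration; they are not taken from the PR.

```python
def freeze_literal(value):
    """Recursively convert lists to tuples so the value becomes hashable."""
    if isinstance(value, (list, tuple)):
        return tuple(freeze_literal(v) for v in value)
    return value


# Hypothetical literal-to-placeholder map, keyed by the literal's value.
literal_to_placeholder = {}

list_error = None
try:
    literal_to_placeholder[[1, 32]] = "ph_0"  # lists are mutable, so unhashable
except TypeError as err:
    list_error = str(err)

# The same literal as a tuple works as a dictionary key.
literal_to_placeholder[freeze_literal([1, 32])] = "ph_0"
```

After this runs, `list_error` holds the `TypeError` message from the list key, and the map contains a single entry keyed by `(1, 32)`.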

Test Plan:
CI + next diff relies on this feature

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
Summary:
This is necessary because the patterns found sometimes include literals
that are tuples of ints. These values shouldn't be used for pattern
matching, since they are often constants derived from example inputs.

This is not exactly a safe thing to do in general, so it is turned off
by default.
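
As a rough illustration of the behavior described above, the toy matcher below compares literal arguments and can optionally skip tuple-of-int literals. The function names and the `ignore_int_tuples` flag are assumptions made for this sketch, not the actual option added in this stack.

```python
def is_int_tuple(value):
    """True for a non-empty tuple whose elements are all ints."""
    return (
        isinstance(value, tuple)
        and len(value) > 0
        and all(isinstance(v, int) for v in value)
    )


def literals_match(pattern_args, graph_args, ignore_int_tuples=False):
    """Compare literal arguments position by position.

    With ignore_int_tuples=True, a tuple-of-ints literal in the pattern
    matches any tuple-of-ints literal in the graph. The default is False,
    which mirrors the note that this is unsafe in general.
    """
    if len(pattern_args) != len(graph_args):
        return False
    for p, g in zip(pattern_args, graph_args):
        if ignore_int_tuples and is_int_tuple(p) and is_int_tuple(g):
            continue  # shape-like constants may come from example inputs
        if p != g:
            return False
    return True


# The pattern was traced with a (2, 64) shape; the real graph has (8, 64).
pattern = ("view", (2, 64))
graph = ("view", (8, 64))
strict = literals_match(pattern, graph)
relaxed = literals_match(pattern, graph, ignore_int_tuples=True)
```

Here `strict` is `False` and `relaxed` is `True`: the only difference between the two argument lists is a shape-like tuple that depends on the example input.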

Test Plan:
Subsequent diff adds a pattern that relies on this

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]

pytorch-bot bot commented Jul 10, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2520

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit ae63634 with merge base fe0ddf1 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

kimishpatel added a commit that referenced this pull request Jul 10, 2025
ghstack-source-id: 756a9e9
Pull Request resolved: #2520
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 10, 2025
Comment on lines 41 to 42
"_qdq_dynamic_quantized_linear_4bit_groupwise",
"_reference_dynamic_quantized_linear_4bit_groupwise",
Contributor

Why do these need to be exposed?

Contributor Author

Oh, good point. It doesn't. AI-assisted coding, I guess. lol

@kimishpatel kimishpatel added the topic: new feature Use this tag if this PR adds a new feature label Jul 11, 2025
kimishpatel added a commit that referenced this pull request Jul 11, 2025
ghstack-source-id: 0f79f1c
Pull Request resolved: #2520
kimishpatel added a commit that referenced this pull request Jul 11, 2025
ghstack-source-id: 080923e
Pull Request resolved: #2520
kimishpatel added a commit that referenced this pull request Jul 11, 2025
ghstack-source-id: deb3efa
Pull Request resolved: #2520
Contributor

@jerryzh168 jerryzh168 left a comment

I'll stamp to unblock, but let me know if any review is needed

@kimishpatel
Contributor Author

@kimishpatel has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

kimishpatel added a commit that referenced this pull request Jul 12, 2025
ghstack-source-id: 5108e2c
Pull Request resolved: #2520
kimishpatel added a commit that referenced this pull request Aug 6, 2025
ghstack-source-id: a1e2796
Pull Request resolved: #2520
kimishpatel added a commit that referenced this pull request Aug 7, 2025
ghstack-source-id: f9db619
Pull Request resolved: #2520
kimishpatel added a commit that referenced this pull request Aug 7, 2025
ghstack-source-id: d1a4a2c
Pull Request resolved: #2520
kimishpatel added a commit that referenced this pull request Aug 8, 2025
ghstack-source-id: 6630cc7
Pull Request resolved: #2520
kimishpatel added a commit that referenced this pull request Aug 11, 2025
ghstack-source-id: 0f5643a
Pull Request resolved: #2520
kimishpatel added a commit that referenced this pull request Aug 12, 2025
ghstack-source-id: ddb2acc
Pull Request resolved: #2520
@kimishpatel kimishpatel changed the base branch from gh/kimishpatel/7/base to main August 12, 2025 02:36
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. topic: new feature Use this tag if this PR adds a new feature
3 participants