Support Dynamically Quantized Convolutions #9021


Closed
mcr229 opened this issue Mar 6, 2025 · 3 comments · Fixed by #10347
Assignees
Labels
good first issue Good for newcomers module: xnnpack Issues related to xnnpack delegation and the code under backends/xnnpack/ triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Milestone

Comments

@mcr229
Contributor

mcr229 commented Mar 6, 2025

We wish to have some initial support for Dynamically Quantized Convolutions.

Let's first write a test to drive development of the feature, in backends.xnnpack.test.ops.test_conv2d (the module we'll run below). Let's just do 2d convolutions for now. Take a look at how we test dqlinears in general, and let's try to add a similar test here. Since we're adding quantizer support, we should make sure that after

.quantize()
.export()

we should check that a choose_qparams node is in the graph. Now, after we've added this test, when we run it with
python -m unittest backends.xnnpack.test.ops.test_conv2d.... it should fail because it can't find the choose_qparams node. Let's start by enabling the quantizer to properly annotate convolutions.
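The intended test flow can be sketched as follows. This is not the real Tester API from the XNNPACK test harness; the class and helper below are hypothetical stand-ins that only illustrate the assertion pattern of "after quantize + export, a choose_qparams node must appear in the graph":

```python
import unittest

# Hypothetical stand-in for an exported graph. In the real test, the Tester
# pipeline's .quantize().export() stages produce a torch.export graph whose
# node targets can be checked; here a plain list of target names models that.
class FakeExportStage:
    def __init__(self, node_targets):
        self.node_targets = node_targets

    def check(self, expected_targets):
        # Mirrors the "check" idea: every expected target must occur
        # somewhere in the graph's node targets.
        for target in expected_targets:
            assert any(
                target in node for node in self.node_targets
            ), f"missing node: {target}"
        return self


class TestDQConv2d(unittest.TestCase):
    def test_dq_conv2d_has_choose_qparams(self):
        # Once dynamic-quantization annotation works, export should insert a
        # choose_qparams node that computes scale/zero-point at runtime.
        exported = FakeExportStage(
            [
                "quantized_decomposed.choose_qparams.tensor",
                "aten.convolution.default",
            ]
        )
        exported.check(["choose_qparams"])
```

Before the quantizer change lands, the equivalent real check fails because no choose_qparams node is produced for convolutions.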

Since we're starting with conv2d, we should only annotate dynamically quantized convs if they are 2d. We can add a check that len(outputpadding) == 2 somewhere here:
https://github.com/pytorch/executorch/blob/main/backends/xnnpack/quantizer/xnnpack_quantizer_utils.py#L295
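A minimal sketch of that dimensionality guard, assuming the annotator can read a per-spatial-dimension list (such as padding or output_padding) off the conv node's args; the function name is hypothetical:

```python
# Hypothetical guard for the annotator: only treat a conv as dynamically
# quantizable when it is 2-D. Conv2d carries length-2 per-dimension lists
# (padding, stride, output_padding for transposed convs), while conv1d
# carries length-1 lists.
def should_annotate_dq_conv(per_dim_attr):
    # per_dim_attr would come from the conv node's args in the real annotator.
    return len(per_dim_attr) == 2
```

With this in place, dq annotation is applied to conv2d and silently skipped for conv1d until 1d support is added.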

Now that we have it annotated, it should pass through the test that's checking for the choose_qparams node. Now we just need to update our partitioner to allow dynamically quantized convolutions:

ConfigPrecisionType.STATIC_QUANT,
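The shape of that change can be sketched with stand-ins. The real ConfigPrecisionType enum and the conv config live in the XNNPACK partitioner config; the exact class and values below are assumptions used only to show where DYNAMIC_QUANT gets added alongside STATIC_QUANT:

```python
from enum import Enum, auto

# Stand-in for the partitioner's precision-type enum (values hypothetical).
class ConfigPrecisionType(Enum):
    FP32 = auto()
    STATIC_QUANT = auto()
    DYNAMIC_QUANT = auto()

# Stand-in for the convolution partitioner config: the fix is to include
# DYNAMIC_QUANT in the precisions the config reports as supported, so the
# partitioner will claim dynamically quantized convolutions.
class ConvolutionConfig:
    def supported_precision_types(self):
        return [
            ConfigPrecisionType.FP32,
            ConfigPrecisionType.STATIC_QUANT,
            ConfigPrecisionType.DYNAMIC_QUANT,  # newly allowed
        ]
```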

It would also be nice to check in our constraints that if we detect a dynamically quantized convolution and it is 1d, we don't partition it. After this, the test should pass. There may be some lingering issues with the wiring; if that's the case, feel free to reach out in the Discord group:

https://discord.com/channels/1334270993966825602/1336777807509979188
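The constraint above can be sketched as follows; the function name and argument shapes are hypothetical, since the real check would live with the partitioner config's other conv constraints:

```python
# Hypothetical partitioner constraint: refuse to partition a dynamically
# quantized convolution unless it is 2-D, since only dq conv2d is being
# enabled for now. FP32 and statically quantized convs are unaffected.
def conv_partition_constraint(is_dynamic_quant, num_spatial_dims):
    if is_dynamic_quant and num_spatial_dims != 2:
        return False  # don't partition dq conv1d (or conv3d)
    return True
```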

cc @digantdesai @cbilgin

@mcr229 mcr229 moved this to Backlog in ExecuTorch - CPU Mar 6, 2025
@mcr229 mcr229 added this to the 0.6.0 milestone Mar 6, 2025
@mcr229 mcr229 added the module: xnnpack Issues related to xnnpack delegation and the code under backends/xnnpack/ label Mar 6, 2025
@iseeyuan iseeyuan added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Mar 7, 2025
@mcr229 mcr229 added the good first issue Good for newcomers label Apr 4, 2025
@keyprocedure
Contributor

I’d be happy to work on this and learn as I go!

@mcr229
Contributor Author

mcr229 commented Apr 4, 2025

Thank you @keyprocedure, I assigned it to you. Feel free to ask lots of questions in the XNNPACK Discord channel. This one is definitely more involved, so I'm happy to discuss and answer any questions you may have.

@iliasslasri

@keyprocedure I would love to help! We can chat on Discord @ilqqq.
