Support pre-quantized models in aimet pass #2111

Merged
jambayk merged 1 commit into microsoft:main from CodeLinaro:dev/mtuttle/support_pre_quantized_models on Aug 20, 2025
@michaelgtuttle

Contributor

Describe your changes

Support accepting pre-quantized ONNX models in the aimet quantization pass.

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
  • Is this PR including examples changes? If yes, please remember to update example documentation in a follow-up PR.

(Optional) Issue link

Signed-off-by: Michael Tuttle <mtuttle@qti.qualcomm.com>
@devang-ml

devang-ml commented Aug 19, 2025

Collaborator

Are there any expectations regarding how the model should be pre-quantized in this case?

@michaelgtuttle

Contributor Author

> Are there any expectations regarding how the model should be pre-quantized in this case?

Good question. There are some limitations on what is accepted. For example, precisions that aimet does not support, such as float8, would cause aimet to throw an error.

For activation quantization, it requires Q -> DQ pairs on the quantized tensors. For weights, it accepts either fp_initializer -> Q -> DQ -> Op or int_initializer -> DQ -> Op.
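
To make the accepted patterns concrete, here is a rough sketch of how one might classify the producer chain feeding an op input. It is not aimet's or Olive's actual validation code: `Node`, `classify_input`, and the dtype sets are hypothetical stand-ins for an ONNX graph, chosen so the example runs with only the standard library.

```python
from dataclasses import dataclass
from typing import Dict, List

# Illustrative dtype sets only; they are assumptions, not aimet's supported list.
FLOAT_DTYPES = {"float16", "float32"}
INT_DTYPES = {"int8", "uint8", "int16", "uint16"}


@dataclass
class Node:
    """Hypothetical stand-in for an ONNX node."""
    op_type: str
    inputs: List[str]
    outputs: List[str]


def classify_input(tensor: str, nodes: List[Node], initializers: Dict[str, str]) -> str:
    """Describe how `tensor` (an input to some op) is produced.

    `initializers` maps initializer name -> dtype string.
    """
    producers = {out: node for node in nodes for out in node.outputs}
    dq = producers.get(tensor)
    if dq is None or dq.op_type != "DequantizeLinear":
        return "not quantized"

    source = dq.inputs[0]
    if source in initializers:
        # Integer weights stored directly in the model, then dequantized.
        if initializers[source] in INT_DTYPES:
            return "int_initializer -> DQ -> Op"
        return "unsupported"

    q = producers.get(source)
    if q is not None and q.op_type == "QuantizeLinear":
        q_source = q.inputs[0]
        if q_source in initializers:
            if initializers[q_source] in FLOAT_DTYPES:
                return "fp_initializer -> Q -> DQ -> Op"
            return "unsupported"
        # Q fed by a runtime tensor: an activation Q -> DQ pair.
        return "activation Q -> DQ"
    return "unsupported"


nodes = [
    Node("QuantizeLinear", ["x", "x_scale", "x_zp"], ["x_q"]),
    Node("DequantizeLinear", ["x_q", "x_scale", "x_zp"], ["x_dq"]),
    Node("QuantizeLinear", ["w_fp", "w_scale", "w_zp"], ["w_q"]),
    Node("DequantizeLinear", ["w_q", "w_scale", "w_zp"], ["w_dq"]),
    Node("Conv", ["x_dq", "w_dq"], ["y"]),
    Node("DequantizeLinear", ["w2_int", "w2_scale", "w2_zp"], ["w2_dq"]),
    Node("MatMul", ["y", "w2_dq"], ["z"]),
]
initializers = {"w_fp": "float32", "w2_int": "int8"}

for name in ["x_dq", "w_dq", "w2_dq", "y"]:
    print(name, "=>", classify_input(name, nodes, initializers))
```

In this toy graph the Conv weight follows the first weight pattern, the MatMul weight follows the second, and `y` is left unquantized because it has no Q -> DQ pair on it.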

@jambayk jambayk merged commit 7fefa0b into microsoft:main Aug 20, 2025
23 checks passed
