
feat: add target modules to quantizers#452

Merged
gsprochette merged 24 commits into main from feat/add-target-modules-to-quantizers
Dec 15, 2025

Conversation

@gsprochette
Collaborator

@gsprochette gsprochette commented Dec 4, 2025

Description

This PR adds target modules capabilities to torchao, awq, hqq and hqq-diffusers.
All apply methods were updated to use the map_targeted_nn_roots structure with a custom quantization function, similar to what was done in llm-int8, diffusers-int8 and quanto.

A utility function target_backbone was added to provide a global default for target modules: it automatically targets the transformer, unet, or language model.
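A helper like this could be sketched as follows. This is a minimal illustration only, assuming the attribute names mentioned above (transformer, unet, language model); the actual signature and return type in pruna may differ.

```python
# Hedged sketch of a target_backbone-style default. The attribute names
# checked here follow the PR description; pruna's real implementation
# may inspect the model differently.
def target_backbone(model):
    """Return a default list of target-module prefixes for a model or pipeline."""
    for attr in ("transformer", "unet", "language_model"):
        if hasattr(model, attr):
            return [attr]
    return []  # no recognized backbone: fall back to targeting nothing special

class FakePipeline:  # hypothetical stand-in for a diffusers-style pipeline
    def __init__(self):
        self.unet = object()

print(target_backbone(FakePipeline()))  # ['unet']
```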

The save and load functions for hqq were updated to maintain compatibility with this new feature and to improve pipeline saving and loading.

(This requires a documentation update because it adds a new hyperparameter to the quantizers.)

Related Issue

Fixes #386

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Ran each algorithm in a notebook to make sure the model can run after quantization, that the targeted Linear layers are quantized, and that excluded Linear layers are not.
Some tests don't pass, but they already failed before this PR.
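A check like the one described could be sketched as follows. The class names and module paths are illustrative stand-ins, not pruna's actual classes; a real check would walk `model.named_modules()`.

```python
# Hedged sketch: verify targeted Linears were replaced by a quantized class
# while excluded ones were untouched. Linear/QuantLinear are stand-ins.
class Linear: ...
class QuantLinear(Linear): ...  # stand-in for a quantized Linear wrapper

# stand-in for dict(model.named_modules()); names are hypothetical
modules = {
    "transformer.attn.q_proj": QuantLinear(),  # targeted -> quantized
    "transformer.attn.k_proj": QuantLinear(),
    "lm_head": Linear(),                       # excluded -> untouched
}

quantized = [n for n, m in modules.items() if type(m) is QuantLinear]
untouched = [n for n, m in modules.items() if type(m) is Linear]
print(len(quantized), untouched)  # 2 ['lm_head']
```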

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Additional Notes

@gsprochette gsprochette requested review from begumcig and llcnt December 4, 2025 09:08

@cursor cursor bot left a comment


Comment thread src/pruna/algorithms/hqq.py Outdated
Comment thread src/pruna/algorithms/hqq_diffusers.py Outdated
Comment thread src/pruna/config/target_modules.py
Comment thread src/pruna/algorithms/hqq_diffusers.py Outdated
Comment thread src/pruna/engine/load.py
Comment thread src/pruna/engine/utils.py Outdated
Member

@begumcig begumcig left a comment


Already looks 99% ready to go to me! Added some nitpicky comments 🙈🙈. Amazing job Gaspar 🧡🧡🧡🧡

Comment thread src/pruna/algorithms/hqq.py
pruna_logger.warning(
"You are using torchao with torch.compile. "
"Please set `smash_config['torch_compile_mode']='max-autotune-no-cudagraphs'` for best results; "
"otherwise you may encounter undesirable outcomes."
Member


What are the undesirable outcomes here?

Collaborator Author


I don't know; the content of this function was already in the file, I just isolated the warnings from the apply logic. Maybe @johannaSommer knows more about this undesirable outcome?

Comment thread src/pruna/config/target_modules.py
Comment thread src/pruna/engine/save.py Outdated
Comment thread src/pruna/engine/save.py Outdated
# save the rest of the model, if it is a janus like model,
# and add a config file to the quantized model path.
if hasattr(model, "model") and hasattr(model.model, "language_model"):
model.model.save_pretrained = original_save_pretrained
Member


Shall we wrap this in a try/finally block so that even if saving fails we revert save_pretrained back to original_save_pretrained? It might be confusing for users if they try to save the model again.

Collaborator Author


That's a super good point. I think a context manager is a clearer and cleaner way to factor this code out, for clarity in this already complicated saving function.
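Such a monkey-patching context could look roughly like this. The name `patched_attribute` is hypothetical, not pruna's actual API; the point is that `finally` restores the original method even when saving raises.

```python
from contextlib import contextmanager

# Hedged sketch of the monkey-patching context discussed above:
# temporarily replace an attribute and guarantee it is restored
# even if the body raises, so a later save works normally.
@contextmanager
def patched_attribute(obj, name, replacement):
    original = getattr(obj, name)
    setattr(obj, name, replacement)
    try:
        yield obj
    finally:
        setattr(obj, name, original)  # restored on success and on error

class Model:  # minimal stand-in for a model with save_pretrained
    def save_pretrained(self):
        return "original"

m = Model()
try:
    with patched_attribute(m, "save_pretrained", lambda: 1 / 0):
        m.save_pretrained()  # patched version raises mid-save
except ZeroDivisionError:
    pass
print(m.save_pretrained())  # original
```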

Comment thread src/pruna/config/target_modules.py
Collaborator

@llcnt llcnt left a comment


Thanks for the PR and big congrats for the amazing job on hqq saving/loading ;)
hqq with target modules on an LLM pipeline is not working currently; I will let you fix this before approving the PR!

Comment thread src/pruna/algorithms/hqq.py
Comment thread src/pruna/algorithms/hqq_diffusers.py
Comment thread src/pruna/engine/load.py
Comment thread src/pruna/engine/utils.py
@gsprochette gsprochette requested review from begumcig and llcnt December 5, 2025 17:36
Collaborator

@llcnt llcnt left a comment


Thanks for the fixes :)

Member

@begumcig begumcig left a comment


Amazing, Gaspar! I just left one comment about the device attribute; maybe we should talk about it, wdyt? Feel free to ignore it if I am making no sense! Everything looks great overall. Thank you!!

Comment thread src/pruna/engine/utils.py
if safe_is_instance(model, Pipeline):
move_to_device(model.model, device, raise_error, device_map)
if device != "accelerate":
model.device = torch.device(device)
Member


Could this possibly cause problems if the model doesn't have a device attr, or if it's a property that cannot be set directly?

Collaborator Author


In this if block we can assume model is a transformers Pipeline, which relies on getting and setting its device attribute (I checked transformers 4.42, 4.57 and 5.0.0rc1, and they all set self.device = ... in their init). If you want I can add a try/except, but it may be better to explicitly fail if there is unknown behavior so we can handle it. WDYT?
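The distinction being discussed can be illustrated with two toy classes: assigning to `device` is fine when the class sets it as a plain attribute in `__init__` (as transformers Pipeline reportedly does), but raises when it is a read-only property. Both classes here are hypothetical stand-ins, not transformers code.

```python
# Hedged illustration of the device-attribute concern above.
class PlainPipeline:          # stand-in: sets device as a plain attribute
    def __init__(self):
        self.device = "cpu"

class PropertyModel:          # stand-in: device is a read-only property
    @property
    def device(self):
        return "cpu"

p = PlainPipeline()
p.device = "cuda"             # fine: plain instance attribute
print(p.device)               # cuda

try:
    PropertyModel().device = "cuda"
except AttributeError as e:   # property with no setter rejects assignment
    print("read-only:", type(e).__name__)  # read-only: AttributeError
```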

Member


Sounds great to me! Thank you so much for checking <3

@gsprochette gsprochette merged commit 24bc357 into main Dec 15, 2025
6 checks passed
@gsprochette gsprochette deleted the feat/add-target-modules-to-quantizers branch December 15, 2025 13:57
Marius-Graml pushed a commit that referenced this pull request Jan 19, 2026
* feat: add target modules to torchao

* feat: add target modules to awq

* feat: add target modules to hqq

* feat: add and use target backbone for default target modules

* feat: extend hqq-diffusers save and load to handle target module

* feat: add monkey patching context


Development

Successfully merging this pull request may close these issues.

[FEATURE] Add target modules to torchao quantizer
