feat: add target modules to quantizers#452
Conversation
begumcig
left a comment
Already looks 99% ready to go to me! Added some nitpicky comments 🙈🙈. Amazing job Gaspar 🧡🧡🧡🧡
pruna_logger.warning(
    "You are using torchao with torch.compile. "
    "Please set `smash_config['torch_compile_mode']='max-autotune-no-cudagraphs'` for best results; "
    "otherwise you may encounter undesirable outcomes."
what are the undesirable outcomes here?
I don't know; the content of this function was already in the file, I just isolated the warnings from the apply logic. Maybe @johannaSommer knows more about this undesirable outcome?
# save the rest of the model, if it is a janus like model,
# and add a config file to the quantized model path.
if hasattr(model, "model") and hasattr(model.model, "language_model"):
    model.model.save_pretrained = original_save_pretrained
shall we wrap this in a try/finally block so that even if saving fails we restore the original save_pretrained? It might be confusing for users if they try to save the model again.
That's a super good point. I think a context manager is a clearer and cleaner way to factor this code out, for clarity in this already complicated saving function.
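For illustration, the try/finally restoration discussed above can be packaged as a small context manager. This is a hypothetical sketch of the pattern, not the actual Pruna implementation; `patched_attr` and the `Dummy` class are made-up names for the example.

```python
from contextlib import contextmanager

@contextmanager
def patched_attr(obj, name, replacement):
    # Temporarily replace obj.<name>; the finally clause guarantees the
    # original attribute is restored even if the body raises mid-save.
    original = getattr(obj, name)
    setattr(obj, name, replacement)
    try:
        yield obj
    finally:
        setattr(obj, name, original)

class Dummy:
    def save_pretrained(self):
        return "original"

model = Dummy()
try:
    with patched_attr(model, "save_pretrained", lambda: "patched"):
        raise RuntimeError("saving failed")
except RuntimeError:
    pass
# Even after the failure, model.save_pretrained is the original method again.
```

With this shape, a second save attempt after a failure sees the unpatched method, avoiding the user confusion raised in the review.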
llcnt
left a comment
Thanks for the PR and big congrats for the amazing job on hqq saving/loading ;)
hqq with target modules on an LLM pipeline is not working currently; I will let you fix this before approving the PR!
begumcig
left a comment
Amazing Gaspar! I just left one comment about the device attribute; maybe we should talk about it, wdyt? Feel free to ignore it if I am making no sense! Everything looks great overall! Thank you!!
if safe_is_instance(model, Pipeline):
    move_to_device(model.model, device, raise_error, device_map)
    if device != "accelerate":
        model.device = torch.device(device)
Could this possibly cause problems if the model doesn't have a device attr, or if it's a property that cannot be set directly?
In this if block we can assume the model is a transformers Pipeline, which relies on getting and setting its device attribute (I checked transformers 4.42, 4.57 and 5.0.0rc1, and they all set self.device = ... in their init). If you want I can add a try/except, but it may be better to explicitly fail if there is unknown behavior so we can handle it. WDYT?
Sounds great to me! Thank you so much for checking <3
* feat: add target modules to torchao
* feat: add target modules to awq
* feat: add target modules to hqq
* feat: add and use target backbone for default target modules
* feat: extend hqq-diffusers save and load to handle target module
* feat: add monkey patching context
Description
This PR adds target-module capabilities to torchao, awq, hqq, and hqq-diffusers.
All apply methods were updated to use the map_targeted_nn_roots structure with a custom quantization function, similar to what was done in llm-int8, diffusers-int8, and quanto.
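To illustrate the idea, a helper in the spirit of map_targeted_nn_roots applies a quantization function only to the requested submodules. This is a hypothetical sketch under assumed names (`apply_to_targets` is invented for the example); the actual Pruna signature and traversal logic may differ.

```python
from types import SimpleNamespace

def apply_to_targets(model, target_modules, quantize_fn):
    # Walk the named target roots and replace each with its quantized
    # counterpart, leaving every other submodule untouched.
    for name in target_modules:
        submodule = getattr(model, name, None)
        if submodule is not None:
            setattr(model, name, quantize_fn(submodule))
    return model

# Toy stand-in for a pipeline with two submodules.
model = SimpleNamespace(transformer="t", text_encoder="e")
apply_to_targets(model, ["transformer"], lambda m: f"quantized({m})")
# model.transformer is now quantized; model.text_encoder is untouched.
```

Each quantizer then only needs to supply its own `quantize_fn`, which is what makes the shared structure reusable across torchao, awq, and hqq.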
A utility function target_backbone was added to provide a global default for target modules: it automatically targets the transformer, unet, or language model. The save and load functions for hqq were updated to maintain compatibility with this new feature and to improve pipeline saving and loading.
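A minimal sketch of what such a default-target utility could look like, assuming the common attribute names used by diffusers and transformers pipelines; the real implementation may inspect the model differently.

```python
def target_backbone(model):
    # Return the first backbone found among the usual pipeline layouts:
    # a diffusion transformer, a UNet, or a language model. Fall back to
    # the model itself if none of these attributes is present.
    for attr in ("transformer", "unet", "language_model"):
        sub = getattr(model, attr, None)
        if sub is not None:
            return sub
    return model
```

This gives every quantizer a sensible default so users only set target modules explicitly when they want finer control.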
(Documentation was updated because this adds a new hyperparameter to the quantizers.)
Related Issue
Fixes #386
Type of Change
How Has This Been Tested?
Ran each algorithm in a notebook to make sure the model can run after quantization, that the targeted Linear layers are quantized, and that the excluded Linear layers are not.
Some tests don't pass, but they were already failing before this PR.
Checklist
Additional Notes