feat: add unconstrained hyperparameter by gsprochette · Pull Request #263 · PrunaAI/pruna

gsprochette · 2025-07-18T16:13:22Z

Description

Separate smash_space and custom hyperparameters, and add a new hyperparameter class:

the user can give any value to this parameter in smash_config
ConfigSpace can still handle the dependencies between hyperparameters

Related Issue

Fixes #(issue number)

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Checklist

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Additional Notes

sharpenb

Thanks for this PR! This is a really important one :)

src/pruna/config/hyperparameters.py

sharpenb

Thanks for addressing my comments! Left some last points but this is ready to be merged ;) A question is which PR will include the small tutorial? The users will like it!

sharpenb · 2025-08-15T12:09:43Z

src/pruna/algorithms/quantization/quanto.py

I was wondering if we could get rid of this 0 index by removing () in the return of get_default_hyperparameters :)

Sure, I loosened the return type of the method in PrunaAlgorithmBase so that I don't have to return a tuple. In the future if we need to return multiple defaults we can still make it a tuple for other algorithms.

src/pruna/config/hyperparameters.py

johannaSommer

Sensational work!

Outside of the comments above, I feel one thing that we are missing is informing the user about which Target Modules they are selecting (because who can really write correct regex/fnmatch patterns lol). I think what I would be expecting is that either in a seperate function before smash, or better, in the to_list_of_module_paths function, we add some significant logging.
Some ideas but you know best:

raise an error if a pattern (exclude/include) doesnt find any hits? or at least log some error?
raise a controlled/informative error if you are requesting some module that doesnt exist?
at either "INFO" or possibly "DEBUG" level log the modules we are actually selecting. I want to avoid that the user misspecifies the pattern and e.g. the algo isnt applied at all.

johannaSommer · 2025-08-20T15:20:13Z

src/pruna/algorithms/pruna_base.py

I think this naming can be a bit confusing because i would expect this to return e.g. also the default of the other hyperparameters like cache_interval=2 or so. Something like get_user_specified_hyperparameter_default (ok thats way too long but something along those lines?

Makes sense. All the hyperparameters are user-specified so I changed it to get_unconstrained_hyperparameter_defaults to reference the fact that is only gets the defaults for hyperparameters which are instances of UnconstrainedHyperparameter. It's long but making it shorter would make it less explicit...

src/pruna/algorithms/quantization/quanto.py

johannaSommer · 2025-08-20T15:24:32Z

src/pruna/algorithms/quantization/quanto.py

We should use the functionality as defined above here. I know you can't directly use the target module result here but we should use it in some capacity so that the logic for quanto is centralized if we adjust this in the future (probs we wont adjust it but you get me, for duplication reasons)

In fact working model has no place here, it can simply be the model itself when it's a LLM. For diffusers, the if statement should ignore that anyway (although you can technically add a tokenizer in the smash config after the dataloader and get an error here because it doesn't run).

In all cases, we can't use parts of the model here because it's called directly on the data, meaning if we get the diffuser's transformer the prompt won't be tokenized.

As you say, we should fix that but this is for an other PR. I'll leave it in the "calibrate LLM only" state as it was before, if you agree.

johannaSommer · 2025-08-20T15:26:12Z

src/pruna/algorithms/quantization/quanto.py

I would say we should only have one loop through the modules_with_subpaths and then to the quanto logic. From what I understand, it makes only sense to freeze it if we actually quantized it

hm nvm we raise an error... Still, maybe it would be clearer to loop once and then apply the algorithm logic inside?

I agree it would be cleaner but we still need to calibrate inbetween. Calibration needs the whole quantization to already be done since we need to run the full model, meaning we can't really freeze inside the first loop.

Here we only freeze the attributes (e.g. unet or transformer) that contain targeted modules, and the freeze function checks each nn.Module to see if it's been quantized anyway. The only reason for iterating over modules_with_subpath is to guarentee that module is a nn.Module, because for diffusers model is not a nn.Module. I changed it to get_nn_modules which is a simpler function and achieves the same thing. Is it ok for you?

src/pruna/config/hyperparameters.py

review-notebook-app · 2025-08-26T09:56:11Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

johannaSommer

Let's goooooo

src/pruna/config/target_modules.py

simlang · 2025-09-04T12:59:08Z

src/pruna/algorithms/pruna_base.py

This function is both too broad and too narrow. At the moment we only have target_modules as the only unconstrained hyperparameter, so for now we could change this to get_target_module_defaults.
I am however for making it applicable to any future unconstrained hyperparameter. This is not how this function is implemented, or at least not clear by the signature (as the return type is Any, technically everything is supported)
In the case where there is a future second unconstrained hyperparam, we don't know which default belongs to what.
-> change the return type to Dict[str, Any] - so for any hyperparameter we can get a default, but indexed by the param it concerns

Is there an argument to extend it to something like get_model_dependent_hyperparameter_defaults - why keep it constrained to the unconstrained hyperparameters? This way any parameter which might need information from the model/SmashConfig can get a default

I'm torn, because on one hand I agree with you and I do think a dictionary would be the better option, both general and very readable.
On the other hand, this will probably most ofter be used only for target modules, in which case it's much clearer to just return default_target_modules instead of {"target_modules": default_target_modules}.

Note that this is why the return type is Any currently, you can handle both: returning just the default hyperparam if there is a single one, and return a dict if there are multiple. The method in child classes can always make the return type more specific. I didn't want to force a heavy TypedDict class every time we redefine this method.

I agree with the get_model_dependent_hyperparameter_defaults, it's a better name.

As a conclusion: I'm in favor of

renaming the method

keep the signature as is to leave more flexibility

Refactoring this return type wouldn't be too much of a problem anyway.

simlang · 2025-09-04T13:08:19Z

src/pruna/algorithms/quantization/quanto.py

This is unnecessary duplicate code in every _apply.
I would be for moving this into the apply wrapper. Connected to the proposition to change the return type of get_unconstrained_hyperparameter_defaults.
In pruna_base.apply before calling _apply add something like:

defaults = self.get_unconstrained_hyperparameter_defaults(model, smash_config) for key in defaults.keys() if smash_config[key] is None: smash_config[key] = defaults[key] self._apply(model, smash_config)

The reason I don't want to do that is that it modifies the smash_config in-place.
In the case where a user would use the same smash_config to smash two different models, the second would have the defaults of the first one, instead of looking for its own default.
We can discuss doing a copy of the smash config at the start of smash, but that would be a refactor PR and not this one.
For now, I personnally prefer the duplicated "check for default None and replace the value" which is pretty standard in python I would say. WDYT?

the in-place problem is a very good point - have not thought of this!
For now it won't hurt anyone, i still think it's cleaner - maybe part of a future algorithm application refactoring

src/pruna/algorithms/quantization/quanto.py

simlang

Very cool! Apart from some naming stuff, I added a proposition to handle the defaults, let me know what you think about that

src/pruna/config/target_modules.py

simlang

LGTM!

cursor · 2025-09-05T07:48:36Z

src/pruna/config/target_modules.py

+            value["include"] = ["*"]
+        elif "exclude" not in value:
+            value["exclude"] = []  # for consistency
+


Bug: Validation Method Mutates Input Dictionary

The TargetModules.legal_value method modifies its input dictionary in-place by adding default 'include' or 'exclude' keys. This unexpected mutation for a validation method can lead to subtle bugs if the dictionary is reused elsewhere in the code.

cursor · 2025-09-05T09:35:31Z

src/pruna/config/target_modules.py

+            value["include"] = ["*"]
+        elif "exclude" not in value:
+            value["exclude"] = []  # for consistency
+


Bug: Inconsistent Defaulting and Parameter Mutation

The legal_value method modifies the input value dictionary in-place by adding default "include" and "exclude" keys. This violates the principle of not mutating input parameters, which can lead to unexpected side effects if the dictionary is reused. Additionally, the defaulting logic is inconsistent, as the "exclude" key might not be added when "include" is added.

gsprochette requested review from johannaSommer and sharpenb July 18, 2025 16:19

sharpenb reviewed Jul 20, 2025

View reviewed changes

src/pruna/config/hyperparameters.py Outdated Show resolved Hide resolved

src/pruna/config/hyperparameters.py Outdated Show resolved Hide resolved

johannaSommer approved these changes Jul 29, 2025

View reviewed changes

PrunaAI deleted a comment from sharpenb Aug 7, 2025

gsprochette marked this pull request as draft August 7, 2025 16:11

gsprochette force-pushed the feat/free-value-parameter branch 2 times, most recently from 49603c2 to 7f16056 Compare August 12, 2025 16:20

gsprochette changed the title ~~feat: add user-editable parameter~~ feat: add unconstrained hyperparameter Aug 12, 2025

gsprochette requested review from johannaSommer and sharpenb August 12, 2025 16:29

gsprochette force-pushed the feat/free-value-parameter branch 3 times, most recently from 29be5b1 to 4a0cff4 Compare August 14, 2025 14:21

sharpenb approved these changes Aug 15, 2025

View reviewed changes

johannaSommer requested changes Aug 20, 2025

View reviewed changes

gsprochette force-pushed the feat/free-value-parameter branch from 2ecc2b6 to a27bdd1 Compare August 25, 2025 13:14

gsprochette requested a review from johannaSommer August 25, 2025 13:17

gsprochette force-pushed the feat/free-value-parameter branch 2 times, most recently from 49b427b to fca277e Compare August 26, 2025 09:56

gsprochette force-pushed the feat/free-value-parameter branch 6 times, most recently from 2430c10 to b77d954 Compare August 27, 2025 16:25

johannaSommer approved these changes Aug 29, 2025

View reviewed changes

gsprochette force-pushed the feat/free-value-parameter branch 2 times, most recently from f495878 to 3fc590d Compare September 3, 2025 10:05

gsprochette force-pushed the feat/free-value-parameter branch 2 times, most recently from cdea28d to 79eff17 Compare September 3, 2025 10:23

gsprochette added 2 commits September 3, 2025 12:05

feat: add target modules hyperparameter and use it in quanto

73c95a2

refactor: change target modules usage to map function

731b716

gsprochette force-pushed the feat/free-value-parameter branch from 79eff17 to 731b716 Compare September 3, 2025 12:07

gsprochette added 2 commits September 3, 2025 16:18

fix: add attribute name in target modules map

3f7dfdd

docs: remove docs to merge separately

deafe8f

gsprochette requested review from johannaSommer and simlang September 4, 2025 07:58