Rework Model Context by simlang · Pull Request #323 · PrunaAI/pruna

simlang · 2025-08-28T12:23:36Z

Description

This PR refactors the ModelContext abstraction.
Previously, the ModelContext used the incoming pipeline as storage for the smashed model. Now the context itself is returned, and resources are obtained from it.
Using the pipeline as a storage device led to problems with some algorithm combinations, e.g. hqq + torch.compile.

The changes include:

Instead of providing pipeline, working_model and denoiser_type, the context now only provides model_context and working_model (denoiser_type was used by a single algorithm and can now be accessed via model_context.denoiser_type if needed).
At the end of smashing, set_smashed_working_model(smashed_model) has to be called to tell the context that the working model has changed.
On context exit, the internal state of the pipeline/model given to the context is updated, but only if set_smashed_working_model has been called before; otherwise nothing happens. This allows the context to be used even when the working model is not modified.
Since the model given to the context might be immutable (if it is the working model itself and not a pipeline), it cannot be changed directly. To get the updated pipeline/model after smashing, call model_context.get_smashed() and return its result.
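The flow described in the list above can be sketched with a toy context. This is a minimal illustration under stated assumptions, not the pruna implementation: ToyPipeline, ToyModelContext and toy_smash are hypothetical names, and looking up the working model through a denoiser attribute is a simplification made up for the example. Only set_smashed_working_model and get_smashed mirror names used in this PR.

```python
from typing import Any, Optional


class ToyPipeline:
    """Hypothetical stand-in for a pipeline that wraps a denoiser."""

    def __init__(self, denoiser: Any) -> None:
        self.denoiser = denoiser


class ToyModelContext:
    """Illustrative context only; the real ModelContext may differ."""

    def __init__(self, model: Any) -> None:
        self.pipeline = model
        self.smashed_working_model: Optional[Any] = None
        self.smashed_pipeline: Optional[Any] = None

    def __enter__(self):
        # Hand out the context itself plus the part that will be smashed.
        working_model = getattr(self.pipeline, "denoiser", self.pipeline)
        return self, working_model

    def set_smashed_working_model(self, working_model: Any) -> None:
        self.smashed_working_model = working_model

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        if self.smashed_working_model is None:
            return  # nothing was smashed, so nothing is updated
        if hasattr(self.pipeline, "denoiser"):
            # Pipeline case: put the smashed part back into the pipeline.
            self.pipeline.denoiser = self.smashed_working_model
            self.smashed_pipeline = self.pipeline
        else:
            # Bare-model case: the smashed model replaces the input model.
            self.smashed_pipeline = self.smashed_working_model

    def get_smashed(self) -> Any:
        return self.smashed_pipeline


def toy_smash(model: Any) -> Any:
    """Hypothetical algorithm body following the pattern from the PR text."""
    with ToyModelContext(model) as (model_context, working_model):
        smashed = f"compiled({working_model})"  # placeholder for real work
        model_context.set_smashed_working_model(smashed)
    return model_context.get_smashed()
```

In this sketch the caller never writes to the object it passed in; it always returns whatever get_smashed() hands back, which covers both the pipeline case and the bare-model case.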

Related Issue

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Locally ran all tests for algorithms which use ModelContext

Checklist

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Additional Notes

It might make sense to move the ModelContext into the apply wrapper to avoid duplicate code.

gsprochette · 2025-08-28T13:17:42Z

+    def set_smashed_working_model(self, working_model: Any) -> None:
+        """
+        Set the smashed working model.
+
+        Parameters
+        ----------
+        working_model : Any
+            The smashed working model.
+        """
+        self.smashed_working_model = working_model
+
+    def get_smashed(self) -> "ModelMixin":
+        """
+        Get the smashed model.
+
+        Returns
+        -------
+        ModelMixin
+            The smashed model.
+        """
+        return self.smashed_pipeline


Is there a reason for using this instead of mc.smashed_working_model = working_model and mc.smashed_pipeline from outside?

This self.smashed_working_model = working_model was super cryptic to me when I first read it. So I added functions to name what is happening; it is purely for readability.

I think in all cases the user needs to know what they are doing with the ModelContext, and these setter and getter are adding complexity... Using these variables could be explained within an error raised in the __exit__ if smashed_working_model wasn't set.

I would rename the functions, as with the current names there is not really a difference from just assigning and reading.
I like having this abstraction; however, without ever seeing the inside of this context, it is hard to understand from the outside what is happening.

gsprochette

Good job fixing this! We're using a context in a weird way, so we should spend a minute making it extra clear and extra clean. All my comments go in that direction :)

llcnt

Thanks again for the fix, it is much cleaner :)

llcnt · 2025-08-28T13:30:00Z



-class ModelContext:
+class ModelContext(AbstractContextManager):


What does this AbstractContextManager provide us? Any reason why we want our mc to depend on it? :)

tbh nothing, it just makes it readable that this is a ContextManager. Should I remove it?
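For reference on the question above: in the Python standard library, contextlib.AbstractContextManager supplies a default __enter__ that returns self and declares __exit__ as abstract, and its subclass hook treats any class defining both methods as a match. The sketch below shows this with two hypothetical classes, WithBase and WithoutBase. Whether the actual ModelContext makes use of the inherited __enter__ is not visible in this thread.

```python
from contextlib import AbstractContextManager


class WithBase(AbstractContextManager):
    """Defines only __exit__; __enter__ is inherited and returns self."""

    def __exit__(self, exc_type, exc_value, traceback):
        return None


class WithoutBase:
    """Plain class that implements the context manager protocol by hand."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return None


with WithBase() as ctx:
    # The inherited __enter__ handed back the instance itself.
    inherited_enter_returns_self = isinstance(ctx, WithBase)

# The ABC recognises WithoutBase structurally, without inheritance.
structural_match = isinstance(WithoutBase(), AbstractContextManager)
```

So for a class that already defines both methods, inheriting from the ABC mostly documents intent.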

gsprochette

That's almost ready to go. I left a couple of comments on the read_only check because it can still be improved. Once this is done, you can merge :)

gsprochette · 2025-08-28T15:06:27Z

        """
        if self.smashed_working_model is None:
-            return
+            if self.read_only:


We could also check for the case where self.read_only is True and self.smashed_working_model is not None, because that combination is bound to produce a cryptic bug.

also good point!
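The two checks discussed here could look roughly like the following. This is a hedged sketch, not the merged code: ReadOnlyAwareContext is a hypothetical class, the error messages are invented, and the assumption that a non-read-only context should raise when no smashed model was set is inferred from the review comments above.

```python
from typing import Any, Optional


class ReadOnlyAwareContext:
    """Hypothetical sketch of the exit-time checks around read_only."""

    def __init__(self, model: Any, read_only: bool = False) -> None:
        self.model = model
        self.read_only = read_only
        self.smashed_working_model: Optional[Any] = None

    def __enter__(self) -> "ReadOnlyAwareContext":
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        if exc_type is not None:
            return  # do not mask an exception raised inside the block
        if self.smashed_working_model is None:
            if self.read_only:
                return  # read-only use: nothing to write back
            raise RuntimeError(
                "smashed_working_model was never set; set it before leaving "
                "the context or create the context with read_only=True"
            )
        if self.read_only:
            # The extra check suggested in the review.
            raise RuntimeError(
                "smashed_working_model was set on a read_only context"
            )
        self.model = self.smashed_working_model
```

Failing loudly in both mismatched cases turns a silent no-op into an error that names the misuse.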

simlang added 2 commits August 28, 2025 11:59

refactor: clean up ModelContext

4efff6a

refactor: adapt all algorithms to the new ModelContext

da1be94

simlang added the bug Something isn't working label Aug 28, 2025

simlang requested review from gsprochette and llcnt August 28, 2025 12:24