fix: `device_map` specification for `accelerate`-compatible quantizers by johannaSommer · Pull Request #226 · PrunaAI/pruna

johannaSommer · 2025-07-01T15:56:39Z

Description

This PR fixes a small bug in pruna's device map specification: the device map should be specified as {"":"cuda:0"} instead of {"":"cuda"}. Otherwise, the resulting model works at inference time but is no longer compatible with other accelerate functionality.
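
To illustrate the difference, here is a minimal sketch of how a bare device string could be turned into an indexed device map. The helper name `to_device_map` and its `default_index` parameter are hypothetical and not taken from pruna; the empty-string key follows the accelerate convention of mapping the whole model to a single device.

```python
def to_device_map(device: str, default_index: int = 0) -> dict[str, str]:
    """Map the whole model (key "") to one explicitly indexed device.

    Hypothetical helper for illustration only, not pruna's implementation.
    """
    # A bare "cuda" carries no index, so pin it to an explicit one.
    if device == "cuda":
        device = f"cuda:{default_index}"
    # Already-indexed strings ("cuda:1") and "cpu" pass through unchanged.
    return {"": device}


print(to_device_map("cuda"))
print(to_device_map("cuda:1"))
```

Defaulting to index 0 mirrors the assumption stated in the Additional Notes of this PR.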

Related Issue

None.

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Reran all algorithm tests in pruna (because of the change to the load function for transformers models), including the accelerate tests for both quantizers.

Checklist

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Additional Notes

At the moment we assume throughout the pruna repository that the device index is 0, hence the changes in this PR. We will update this soon to be compatible with specifying the device index when using only "cuda".

Co-authored-by: simlang <simon.langrieger@pruna.ai>

Co-authored-by: simglang <simon.langrieger@pruna.ai>

gsprochette

Looks great! Thanks for the improvement and for future-proofing pruna :)
While we're on this subject, I think the get_device function in pruna.engine.utils could be improved:

most important: return_device_map is only handled in the hf_device_map check at the end, but model_device.type does not seem to be a valid device_map, as the argument name and docstring suggest it should be. This will probably cause issues in the future.
readability: the first if statement would be slightly cleaner with the else case first
readability: the initial model_device is overwritten at the end; an if..else might make the different cases more explicit

We can make this into a separate PR, but the first point seems related.

simlang

Changes so far look good to me, thanks for fixing it so fast! 👍
Will approve, depending on how we handle @gsprochette's comment 🙂

gsprochette · 2025-07-07T09:24:59Z

+        else:
+            return model.hf_device_map[subset_key]
+    else:
+        device = "cuda:0" if model_device == "cuda" else "cpu"


There should probably be an mps case here
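
A sketch of what the branch might look like with an mps case, written as a standalone function. The name `resolve_device` is hypothetical and this is not the code that was merged; it assumes `model_device` is a plain device-type string and keeps the PR's assumption that the CUDA index is 0.

```python
def resolve_device(model_device: str) -> str:
    """Pick a concrete device string for a device type.

    Hypothetical illustration of the suggested mps case, not pruna's code.
    """
    if model_device == "cuda":
        # Assumption carried over from the PR: CUDA always uses index 0.
        return "cuda:0"
    if model_device == "mps":
        # Apple Silicon backend, passed through without an index.
        return "mps"
    # Anything else falls back to CPU, as in the original one-liner.
    return "cpu"


print(resolve_device("mps"))
```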

simlang

Looks good! Very clean now 🫧🧼

gsprochette

Looks absolutely perfect, thanks for all the "just this last update" iterations, it's so clean now :)

johannaSommer and others added 2 commits July 1, 2025 15:55

fix: accelerate-compatible quantizers

b59cbfb

Co-authored-by: simlang <simon.langrieger@pruna.ai>

fix: device_map specification for loading transformers models

570ec9e

Co-authored-by: simglang <simon.langrieger@pruna.ai>

johannaSommer requested review from gsprochette and simlang July 1, 2025 15:56

fix: brackets for diffusers_int8 device

216c71a

gsprochette reviewed Jul 2, 2025

simlang reviewed Jul 2, 2025

refactor: introduce get_device_map function

9e43a9f

johannaSommer requested review from gsprochette and simlang July 7, 2025 07:55

gsprochette reviewed Jul 7, 2025

simlang reviewed Jul 7, 2025

Comment thread src/pruna/engine/load.py

simlang approved these changes Jul 7, 2025

johannaSommer added 2 commits July 7, 2025 14:43

fix: device map construction for CPU/MPS

c3efd99

fix: update get_device return type annotation

1385f48

gsprochette self-requested a review July 7, 2025 14:49

gsprochette approved these changes Jul 7, 2025

johannaSommer merged commit 41a1ad1 into main Jul 7, 2025
6 checks passed

johannaSommer deleted the fix/quantizer-device-casting branch July 7, 2025 15:20

johannaSommer mentioned this pull request Jul 8, 2025

fix: device placement with indexed devices #205

Merged

10 tasks

johannaSommer merged 6 commits into main from fix/quantizer-device-casting
