fix: make qkv compatible with torch.compile in next diffusers release by llcnt · Pull Request #302 · PrunaAI/pruna

llcnt · 2025-08-13T16:42:50Z

Description

The main branch of diffusers is currently undergoing a large refactor of attention computation.
The attention processors still exist, but Flux and Wan now each have a local version of their own processor. QKV fusing has also changed: it now takes place in the AttentionMixin class rather than in the transformer class itself.
A sister PR exists in pruna_pro.
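
For readers unfamiliar with QKV fusing, here is a minimal sketch of the general idea: the three projection weights are stacked so that one matrix product yields q, k and v together. It uses plain Python lists instead of tensors so that it runs without torch, and it does not reproduce the diffusers AttentionMixin code; the helper names `linear` and `random_matrix` are made up for illustration.

```python
import random


def linear(weight, x):
    """Apply y = W @ x, with W given as a list of rows (no bias)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def random_matrix(rows, cols, rng):
    """Hypothetical helper: a rows x cols matrix of random floats."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim = 4
w_q, w_k, w_v = (random_matrix(dim, dim, rng) for _ in range(3))
x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]

# Unfused: three separate projections of the same input.
q, k, v = linear(w_q, x), linear(w_k, x), linear(w_v, x)

# Fused: stack the rows of the three weights and project once.
w_qkv = w_q + w_k + w_v  # shape (3 * dim, dim)
fused = linear(w_qkv, x)

# Split the fused output back into its q, k and v parts.
q_f, k_f, v_f = fused[:dim], fused[dim:2 * dim], fused[2 * dim:]

# Each output row is computed by the same arithmetic in both paths.
matches = (q, k, v) == (q_f, k_f, v_f)
```

In a real model the benefit comes from launching one larger matmul instead of three smaller ones; the outputs themselves are unchanged.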

Related Issue

The combination qkv_diffusers + torch_compile takes an excessively long time to run on the latest diffusers codebase.

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Tested on Flux-dev with diffusers==0.34.0 and diffusers==0.35.0dev0.
On Flux-dev, generation time drops from 10.44s (original) to 5.12s (with qkv_diffusers+fp8+torch_compile).
The warm-up (first inference) takes 56.16s.
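
As a rough reading of these numbers, the sketch below computes the steady-state speedup and the number of generations after which the compiled pipeline has repaid its warm-up. It assumes the 56.16s covers the whole first compiled call and that the original pipeline has no warm-up cost of its own; neither assumption is stated in this PR, and `break_even_runs` is a hypothetical helper.

```python
BASELINE_S = 10.44  # original generation time reported above
COMPILED_S = 5.12   # with qkv_diffusers+fp8+torch_compile
WARMUP_S = 56.16    # first inference, assumed to include compilation


def break_even_runs(baseline_s, compiled_s, warmup_s, max_runs=10_000):
    """Smallest n for which n compiled generations (the first one costing
    warmup_s) take no longer in total than n baseline generations."""
    for n in range(1, max_runs + 1):
        compiled_total = warmup_s + (n - 1) * compiled_s
        baseline_total = n * baseline_s
        if compiled_total <= baseline_total:
            return n
    return None  # never pays off within max_runs


speedup = BASELINE_S / COMPILED_S
runs_needed = break_even_runs(BASELINE_S, COMPILED_S, WARMUP_S)

print(f"steady-state speedup: {speedup:.2f}x")        # 2.04x
print(f"break-even after {runs_needed} generations")  # 10
```

Under these assumptions the compiled setup is the faster choice for any session of roughly ten or more generations.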

Checklist

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Additional Notes

…lcontext

johnrachwan123

LGTM thanks a lot!

nifleisch

Looks good to me!

llcnt added 2 commits August 13, 2025 16:35

feat: adapt check fn and add single attn processor to all layers

ad7cb33

feat: add pipeline attribute in model check to make exit work in mode…

96fef34

…lcontext

llcnt requested review from johnrachwan123 and nifleisch August 14, 2025 07:58

llcnt marked this pull request as ready for review August 14, 2025 08:03

johnrachwan123 approved these changes Aug 14, 2025


nifleisch approved these changes Aug 20, 2025


llcnt merged commit 4ca2f1c into main Aug 20, 2025
6 of 7 checks passed

llcnt mentioned this pull request Dec 5, 2025

feat: add target modules to quantizers #452

Merged

10 tasks

llcnt merged 2 commits into main from fix/qkv_not_compilable_new_diffusers
