add fa4 and refactor spas attn #962

Open
STwangyingrui wants to merge 3 commits into main from yr/add_fa4_and_refactor_spas_attn

Conversation

@STwangyingrui (Contributor)

No description provided.


@gemini-code-assist bot left a comment

Code Review

This pull request introduces several new sparse attention mechanisms and operators, including FlashAttention v4 and SageAttention v2/v3, along with corresponding configuration files and utility functions. Key changes include the addition of SparseFlashAttn4Weight, SparseSageAttn2Weight, and SparseSageAttn3Weight, as well as a new SpargeMaskGenerator and a comprehensive sparge_util.py containing Triton kernels for block map generation. Feedback focuses on improving error handling by raising exceptions for unsupported sparse modes instead of just logging them, and refining assertion messages for better clarity and professionalism.
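The mean-similarity block map at the heart of these operators can be illustrated with a small self-contained sketch. This is a toy NumPy stand-in for the PR's Triton-based `get_block_map_meansim`, not the actual kernel: the block pooling, dot-product scoring, and top-k selection shown here are illustrative assumptions about how such a map is built.

```python
import numpy as np

def toy_block_map_meansim(q, k, topk, BLKQ=64, BLKK=64):
    # q, k: (seq_len, head_dim). Pool each BLKQ/BLKK-sized block down to
    # its mean vector, score Q-blocks against K-blocks by dot product,
    # and keep the top-k K-blocks per Q-block as a boolean sparse map.
    nq = q.shape[0] // BLKQ
    nk = k.shape[0] // BLKK
    q_means = q[: nq * BLKQ].reshape(nq, BLKQ, -1).mean(axis=1)
    k_means = k[: nk * BLKK].reshape(nk, BLKK, -1).mean(axis=1)
    scores = q_means @ k_means.T                    # (nq, nk)
    keep = np.argsort(-scores, axis=1)[:, :topk]    # top-k K-block indices
    sparse_map = np.zeros((nq, nk), dtype=bool)
    np.put_along_axis(sparse_map, keep, True, axis=1)
    return sparse_map

rng = np.random.default_rng(0)
q = rng.standard_normal((256, 64)).astype(np.float32)
k = rng.standard_normal((256, 64)).astype(np.float32)
# Mirror the PR's K-smoothing: smooth_k = k - k.mean(dim=-2, keepdim=True)
smooth_k = k - k.mean(axis=0, keepdims=True)
m = toy_block_map_meansim(q, smooth_k, topk=2, BLKQ=64, BLKK=64)
print(m.shape)  # (4, 4): each Q-block attends to exactly topk=2 K-blocks
```

Each attention operator in the PR then consumes such a `sparse_map` to skip whole query-key block pairs.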

```python
    smooth_k = kt - kt.mean(dim=-2, keepdim=True)
    sparse_map = get_block_map_meansim(qt, smooth_k, cdfthreshd=None, topk=self.topk, return_lut=False, BLKQ=self.BLKQ, BLKK=self.BLKK)
else:
    logger.info(f"spas_flash_attn4 sparse_mode only support sla_mode and sparge_mode now.")
```

gemini-code-assist bot (severity: high)

Using `logger.info` for an unsupported `sparse_mode` might lead to silent failures or misconfigurations being overlooked. It would be more robust to either raise a `ValueError` or use `logger.error` to clearly indicate an invalid state.

Suggested change:

```diff
-    logger.info(f"spas_flash_attn4 sparse_mode only support sla_mode and sparge_mode now.")
+    raise ValueError(f"Unsupported sparse_mode: {self.sparse_mode}. spas_flash_attn4 sparse_mode only supports 'sla_mode' and 'sparge_mode'.")
```
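The fail-fast pattern the bot suggests can be factored into one shared guard so all three operators validate the mode identically. This is a hedged sketch: `SUPPORTED_SPARSE_MODES` and `check_sparse_mode` are hypothetical names, not identifiers from the PR.

```python
SUPPORTED_SPARSE_MODES = ("sla_mode", "sparge_mode")  # hypothetical constant

def check_sparse_mode(sparse_mode: str) -> None:
    # Raise instead of logging at INFO level, so an unsupported mode
    # cannot silently fall through to an undefined attention path.
    if sparse_mode not in SUPPORTED_SPARSE_MODES:
        raise ValueError(
            f"Unsupported sparse_mode: {sparse_mode!r}. "
            f"Supported modes are: {', '.join(SUPPORTED_SPARSE_MODES)}."
        )

check_sparse_mode("sla_mode")  # passes silently

try:
    check_sparse_mode("dense_mode")
except ValueError as exc:
    print(exc)
```

Calling the guard at the top of each operator's forward path also removes the dangling `else` branch entirely.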

```python
    smooth_k = k - k.mean(dim=-2, keepdim=True)
    sparse_map = get_block_map_meansim(q, smooth_k, cdfthreshd=None, topk=self.topk, return_lut=False, BLKQ=self.BLKQ, BLKK=self.BLKK)
else:
    logger.info(f"spas_sage_attn2 sparse_mode only support sla_mode and sparge_mode now.")
```

gemini-code-assist bot (severity: high)

Using `logger.info` for an unsupported `sparse_mode` might lead to silent failures or misconfigurations being overlooked. It would be more robust to either raise a `ValueError` or use `logger.error` to clearly indicate an invalid state.

Suggested change:

```diff
-    logger.info(f"spas_sage_attn2 sparse_mode only support sla_mode and sparge_mode now.")
+    raise ValueError(f"Unsupported sparse_mode: {self.sparse_mode}. spas_sage_attn2 sparse_mode only supports 'sla_mode' and 'sparge_mode'.")
```

```python
    smooth_k = k - k.mean(dim=-2, keepdim=True)
    sparse_map = get_block_map_meansim(q, smooth_k, cdfthreshd=None, topk=self.topk, return_lut=False, BLKQ=self.BLKQ, BLKK=self.BLKK)
else:
    logger.info(f"spas_sage_attn3 sparse_mode only support sla_mode and sparge_mode now.")
```

gemini-code-assist bot (severity: high)

Using `logger.info` for an unsupported `sparse_mode` might lead to silent failures or misconfigurations being overlooked. It would be more robust to either raise a `ValueError` or use `logger.error` to clearly indicate an invalid state.

Suggested change:

```diff
-    logger.info(f"spas_sage_attn3 sparse_mode only support sla_mode and sparge_mode now.")
+    raise ValueError(f"Unsupported sparse_mode: {self.sparse_mode}. spas_sage_attn3 sparse_mode only supports 'sla_mode' and 'sparge_mode'.")
```

```python
    q, k, v = q.unsqueeze(0), k.unsqueeze(0), v.unsqueeze(0)
elif len(q.shape) == 4:
    bs = q.shape[0]
    assert bs == 1, "flash_attn4 doesn't support flash_attn_varlen_func now. Just use it for batchsize = 1 for sure."
```

gemini-code-assist bot (severity: medium)

The assertion message here is a bit informal. Consider making it more professional to clearly communicate the limitation to users or developers.

Suggested change:

```diff
-    assert bs == 1, "flash_attn4 doesn't support flash_attn_varlen_func now. Just use it for batchsize = 1 for sure."
+    assert bs == 1, "FlashAttention v4 currently only supports batch size of 1 for this function."
```
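Beyond rewording the message, the shape handling around this assertion can be made explicit in one helper that either adds the batch axis or rejects batched input with a real exception (asserts are stripped under `python -O`). This sketch uses NumPy arrays for illustration; `normalize_qkv` is a hypothetical name and no FlashAttention kernel is invoked.

```python
import numpy as np

def normalize_qkv(q, k, v):
    # Accept (seq, heads, dim) and add a leading batch axis, or accept
    # (batch, seq, heads, dim) with batch == 1; reject anything else
    # with an explicit ValueError rather than a bare assert.
    if q.ndim == 3:
        return q[None], k[None], v[None]
    if q.ndim == 4:
        if q.shape[0] != 1:
            raise ValueError(
                "flash_attn4 does not support varlen/batched input yet; "
                f"expected batch size 1, got {q.shape[0]}."
            )
        return q, k, v
    raise ValueError(f"Expected 3D or 4D q/k/v, got {q.ndim}D.")

q = np.zeros((128, 8, 64), dtype=np.float32)
q4, k4, v4 = normalize_qkv(q, q.copy(), q.copy())
print(q4.shape)  # (1, 128, 8, 64)
```

The same helper would serve both call sites the bot flagged, so the batch-size limitation is stated (and enforced) in exactly one place.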

```python
    q, k, v = q.unsqueeze(0), k.unsqueeze(0), v.unsqueeze(0)
elif len(q.shape) == 4:
    bs = q.shape[0]
    assert bs == 1, "flash_attn4 doesn't support flash_attn_varlen_func now. Just use it for batchsize = 1 for sure."
```

gemini-code-assist bot (severity: medium)

The assertion message here is a bit informal. Consider making it more professional to clearly communicate the limitation to users or developers.

Suggested change:

```diff
-    assert bs == 1, "flash_attn4 doesn't support flash_attn_varlen_func now. Just use it for batchsize = 1 for sure."
+    assert bs == 1, "FlashAttention v4 currently only supports batch size of 1 for this function."
```
