Support strided Load/Store in SVE2 #8888

stevesuzuki-arm · 2025-11-27T17:13:02Z

Strided loads rely on shuffle_vectors for scalable vectors,
which is lowered to llvm.vector.deinterleave.

The LDN/STN test cases are enabled only if LLVM >= 22.
The test configurations for LDN in simd_op_check_sve2 are modified based on #8819.
This also fixes issue #8584.
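
For readers less familiar with the operation, here is a minimal scalar sketch of what a factor-N deinterleave computes, which is the data movement a stride-N load needs. The function name `deinterleave` and the use of `std::vector<int>` are illustrative assumptions; this is not Halide or LLVM API, only a model of the lane mapping.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar model of a factor-N deinterleave: out[k][i] == in[i * N + k].
// Each output vector k holds the elements a stride-N load would read
// when starting at offset k.
std::vector<std::vector<int>> deinterleave(const std::vector<int> &in, std::size_t factor) {
    // The input must split evenly into `factor` output vectors.
    assert(factor > 0 && in.size() % factor == 0);
    std::vector<std::vector<int>> out(factor);
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i % factor].push_back(in[i]);
    }
    return out;
}
```

With a factor of 3, an interleaved input such as RGBRGB is split into one vector per channel, which is the shape an LD3-style structured load produces.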

stevesuzuki-arm · 2025-11-27T17:15:10Z

Ready for review, but we need to rebase once #8887 is merged.

alexreinking · 2025-12-06T18:24:10Z

> Ready for review, but we need to rebase once #8887 is merged.

I'll review once the diff is clean.

alexreinking · 2025-12-11T16:44:17Z

I just merged #8887. Once you rebase this, I'll review 🙂


stevesuzuki-arm · 2025-12-11T18:30:33Z

Ready for review now

alexreinking · 2025-12-11T19:36:57Z

Failures should be fixed by #8897

With old LLVM (v20, v21), some of the load/store tests in simd_op_check_sve2 fail because LLVM crashes with the messages:
- Invalid size request on a scalable vector
- Cannot select: t11: nxv16i8,nxv16i8,nxv16i8 = vector_deinterleave t43, t45, t47

alexreinking

I'm concerned about degrading codegen quality on LLVM<22. Please unskip the check_arm_load_store test and adjust the test and/or the patch.

alexreinking · 2025-12-15T19:47:21Z

test/correctness/simd_op_check_sve2.cpp

+            // ST3       -       Store three-element structures
+            for (int width = base_vec_bits * 3; width <= base_vec_bits * 3 * 2; width *= 2) {


These loops are confusing. I wonder if explicitly listing test cases would be better?

Suggested change

             // ST3       -       Store three-element structures
-            for (int width = base_vec_bits * 3; width <= base_vec_bits * 3 * 2; width *= 2) {
+            for (int width : {base_vec_bits * 3, base_vec_bits * 3 * 2}) {

You could also consider iterating over factors.

Suggested change

             // ST3       -       Store three-element structures
-            for (int width = base_vec_bits * 3; width <= base_vec_bits * 3 * 2; width *= 2) {
+            for (int factor : {1, 2}) {
+                int width = base_vec_bits * 3 * factor;
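
As a sanity check on the suggestion, the sketch below collects the widths visited by the original doubling loop and by the factor-based loop so they can be compared. The helper names `widths_doubling` and `widths_by_factor` are hypothetical and exist only for this comparison; `base_vec_bits` is passed in as a parameter rather than taken from the test fixture.

```cpp
#include <cassert>
#include <vector>

// Widths visited by the original doubling loop.
std::vector<int> widths_doubling(int base_vec_bits) {
    assert(base_vec_bits > 0);
    std::vector<int> widths;
    for (int width = base_vec_bits * 3; width <= base_vec_bits * 3 * 2; width *= 2) {
        widths.push_back(width);
    }
    return widths;
}

// Widths visited when iterating over explicit factors instead.
std::vector<int> widths_by_factor(int base_vec_bits) {
    assert(base_vec_bits > 0);
    std::vector<int> widths;
    for (int factor : {1, 2}) {
        widths.push_back(base_vec_bits * 3 * factor);
    }
    return widths;
}
```

Both forms visit three times the base width and then six times the base width, so the rewrite changes readability only, not coverage.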

alexreinking · 2025-12-15T19:48:27Z

test/correctness/simd_op_check_sve2.cpp

-        check_arm_load_store();
+        if (Halide::Internal::get_llvm_version() >= 220) {
+            check_arm_load_store();
+        }


Skipping this test is a red flag. We can't degrade codegen quality on supported LLVM versions, especially not across all released LLVM versions. That test (and perhaps this patch) should be adjusted to confirm superior codegen on LLVM >=22 (or more broadly), but it must also confirm the existing behavior on LLVM<22.

Thank you for the feedback. I've updated the test so that the scope it had before this PR is kept with old LLVM. The test cases newly enabled by this PR run only with LLVM >= 22.
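
The gating described here can be sketched as follows: the previously covered cases stay enabled on every LLVM version, and only the new cases are conditional. The function `load_store_cases`, the case names, and the use of a plain major version number are all hypothetical; the real test registers its checks differently and compares `get_llvm_version()` against 220.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the test selection, keyed on the LLVM major version.
std::vector<std::string> load_store_cases(int llvm_major) {
    assert(llvm_major > 0);
    // Cases that ran before the PR stay enabled on every LLVM version,
    // so existing codegen is still verified on LLVM < 22.
    std::vector<std::string> cases = {"baseline_load_store"};
    // Cases that depend on the new lowering run only where LLVM supports it.
    if (llvm_major >= 22) {
        cases.push_back("structured_ldn_stn");
    }
    return cases;
}
```

The point of this shape is that nothing is skipped wholesale: older LLVM versions lose no coverage, and newer ones gain the additional checks.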

alexreinking · 2025-12-15T19:49:51Z

test/correctness/simd_op_check_sve2.cpp


        auto ext = Internal::get_output_info(target);
        std::map<OutputFileType, std::string> outputs = {
+            {OutputFileType::stmt, file_name + ext.at(OutputFileType::stmt).extension},


Where is this used?

Sorry, removed

alexreinking · 2025-12-15T20:37:50Z

test/correctness/simd_op_check_sve2.cpp

+                for (int factor = 1; factor <= 4; factor *= 2) {
+                    const int vector_lanes = base_vec_bits * factor / bits;
+
+                    // In StageStridedLoads.cp (stride < r->lanes) is the condition for staging to happen


typo:

Suggested change

-                    // In StageStridedLoads.cp (stride < r->lanes) is the condition for staging to happen
+                    // In StageStridedLoads.cpp (stride < r->lanes) is the condition for staging to happen
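
To make the quoted condition concrete, here is a small sketch that computes the lane count the same way the test loop does and evaluates `stride < lanes`. It models only the condition stated in the comment; the helper names `vector_lanes` and `staging_applies` are hypothetical, and the actual pass may apply further checks.

```cpp
#include <cassert>

// Lane count as computed in the test loop: base_vec_bits * factor / bits.
int vector_lanes(int base_vec_bits, int factor, int bits) {
    assert(bits > 0);
    return base_vec_bits * factor / bits;
}

// The condition quoted from the comment: staging happens only if stride < lanes.
bool staging_applies(int stride, int lanes) {
    return stride < lanes;
}
```

For example, with 128-bit base vectors, a factor of 1, and 32-bit elements there are 4 lanes, so a stride of 3 satisfies the condition while a stride of 4 does not.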


alexreinking requested a review from halidebuildbots December 6, 2025 18:23

stevesuzuki-arm force-pushed the pr-strided_ls branch from e54a3d8 to fd78420 Compare December 8, 2025 15:08

Support strided Load/Store in SVE2

1b02720

Structured load relies on shuffle_vectors for scalable vector, which is lowered to llvm.vector.deinterleave

stevesuzuki-arm force-pushed the pr-strided_ls branch from fd78420 to 1b02720 Compare December 11, 2025 17:40

stevesuzuki-arm marked this pull request as ready for review December 11, 2025 17:42

Modify test cases of load/store in simd_op_check_sve2

7e58a59

Correct vector bits in load/store test cases for target with 256 bits vector

stevesuzuki-arm mentioned this pull request Dec 11, 2025

Shuffle scalable vector in CodeGen_ARM #8898

Open

stevesuzuki-arm added 2 commits December 15, 2025 11:05

Merge branch 'main' into pr-strided_ls

68e0cd2

alexreinking requested changes Dec 15, 2025


alexreinking reviewed Dec 15, 2025


Fix load/store tests in simd_op_check_sve2

f499310

- Keep the test scope before this PR in case of old LLVM
- Fix issue halide#8584
- Incorporate the review feedback
