[Arm64] AdvSIMD LoadPairVector64 and LoadPairVector128 #45020
echesakov wants to merge 18 commits into dotnet:master
…formNotSupported.cs
…dvSimd.cs AdvSimd.PlatformNotSupported.cs
…lr/src/jit/hwintrinsic.cpp
…gentree.cpp src/coreclr/src/jit/gentree.h src/coreclr/src/jit/lsra.cpp
…coreclr/src/jit/hwintrinsicarm64.cpp
…r/src/jit/hwintrinsiccodegenarm64.cpp
…alues in multiple registers in src/coreclr/src/jit/lsra.h src/coreclr/src/jit/lsraarm64.cpp src/coreclr/src/jit/lsraxarch.cpp
… src/tests/JIT/HardwareIntrinsics/Arm/Shared/Helpers.tt
Note: This serves as a reminder that when your PR modifies a ref *.cs file and adds or modifies public APIs, the API implementation in the src *.cs file must be documented with triple slash comments, so the PR reviewers can sign off on that change.
#endif
    }
}
else if (src->OperIsHWIntrinsic())
@CarolEidt I think I need to implement similar logic on Arm64 to avoid a store-reload of returned SIMD values for ldp/ldnp, as in the following example:
2C404410 ldnp s16, s17, [x0]
FD000FB0 str d16, [fp,#24]
FD0013B1 str d17, [fp,#32]
FD400FA8 ldr d8, [fp,#24]
FD4013A9 ldr d9, [fp,#32]
Is my understanding correct?
Yes, I believe that's correct. However, this path is only for the case where you wind up with a STORE_BLK, as opposed to a STORE_LCL_VAR. If the lhs is a multi-reg lclVar, you should have the latter.
// Auto-generated message
69e114c, which was merged 12/7, removed the intermediate src/coreclr/src/ folder. This PR needs to be updated because it touches files in that directory, which causes conflicts. To update your commits you can use this bash script: https://gist.github.com/ViktorHofer/6d24f62abdcddb518b4966ead5ef3783. Feel free to use the comment section of the gist to improve the script for others.
Draft Pull Request was automatically closed for inactivity. It can be manually reopened in the next 30 days if the work resumes.
This is combined work:
LoadPairVector64 and LoadPairVector128 ([Arm64] LoadPairVector64 and LoadPairVector128 #39243).
Background: Based on our discussion with Carol, we decided to include in this PR her changes from #37928 that enable support for hardware intrinsics returning a value in multiple registers, minus the Bmi2.MultiplyNoFlags2-related changes (those would require approval of #44926 in an API review meeting).
Fixes: #39243
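For readers unfamiliar with pair loads, here is a minimal portable C++ sketch of what a LoadPairVector64-style operation does semantically: it reads two adjacent 64-bit vectors from one address and returns both. The names Vec64 and LoadPairVec64 are hypothetical and exist only for illustration; this is neither the JIT implementation from this PR nor the .NET API surface. On Arm64 the same effect is achieved with a single ldp/ldnp that writes two registers.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <utility>

// Hypothetical stand-in for a 64-bit SIMD vector of bytes.
using Vec64 = std::array<std::uint8_t, 8>;

// Hypothetical helper: models a pair load by copying two adjacent 8-byte
// vectors starting at `address`. The caller must provide at least 16
// readable bytes. A real ldp/ldnp does this in one instruction and leaves
// the two results in two separate registers.
inline std::pair<Vec64, Vec64> LoadPairVec64(const std::uint8_t* address)
{
    assert(address != nullptr);

    Vec64 first{};
    Vec64 second{};
    std::memcpy(first.data(), address, first.size());
    std::memcpy(second.data(), address + first.size(), second.size());
    return {first, second};
}
```

A caller could unpack the result with structured bindings, for example `auto [lo, hi] = LoadPairVec64(buffer);`. Returning both halves as one value mirrors the multi-register return that the JIT changes in this PR aim to keep in registers rather than spilling to the stack.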