RyuJIT: Remove redundant memory barrier for XAdd and XChg on arm#45970
Merged
sandreenko merged 3 commits into dotnet:master on Dec 28, 2020
EgorBo commented on Dec 12, 2020
GetEmitter()->emitIns_R_R_R(INS_ldaddal, dataSize, dataReg, targetReg, addrReg);
}
GetEmitter()->emitIns_R_R_R(INS_ldaddal, dataSize, dataReg, (targetReg == REG_NA) ? REG_ZR : targetReg,
                            addrReg);
Member
Author
NOTE: if targetReg is REG_NA (meaning we don't care about the return value of Interlocked.Add), WZR is used as the target register (it ignores writes and always reads as zero).
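The register choice described in this note can be sketched in isolation. This is a minimal illustration, not the JIT's code: the `regNumber` enum below is a hypothetical stand-in with made-up values, and `SelectLdaddalTarget` is a name invented for the example. Only the ternary expression comes from the diff.

```cpp
#include <cassert>

// Hypothetical stand-in for the JIT's register numbering (values are illustrative only).
enum regNumber
{
    REG_R0,
    REG_R1,
    REG_ZR, // zero register: always reads as zero, writes are discarded
    REG_NA  // "no register": the result of the node is unused
};

// Sketch of the selection from the diff: when the result of the add is unused,
// route it to the zero register so that ldaddal can still be emitted.
inline regNumber SelectLdaddalTarget(regNumber targetReg)
{
    return (targetReg == REG_NA) ? REG_ZR : targetReg;
}
```

With these assumed definitions, `SelectLdaddalTarget(REG_NA)` yields `REG_ZR`, and any real register is passed through unchanged.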
Member
The trailing With
Member
LGTM
As far as I understand, the memory barrier is not needed when we already emit LDADDAL and SWPAL, which have "acquire and release" semantics.

UPD: the same applies to CASAL (emitted for Interlocked.CompareExchange).

For this reason I also replaced staddl with ldaddal for the case when we don't need the return value of Interlocked.Add, just like LLVM does: https://godbolt.org/z/a9GcT8
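For readers who want to reproduce the LLVM comparison, here is a rough C++ analogue of the three cases discussed. It is a sketch under assumptions, not the contents of the linked Godbolt snippet: the function names are invented for illustration, and the instruction selection mentioned in the comments depends on the compiler and on targeting ARMv8.1 atomics (for example with `-march=armv8.1-a`).

```cpp
#include <atomic>

// Analogue of Interlocked.Add with the result used: returns the new value.
// With ARMv8.1 atomics enabled, compilers typically lower the fetch_add to a
// single ldaddal, followed by an ordinary add to compute the new value.
inline int AddAndReturnNew(std::atomic<int>& location, int value)
{
    return location.fetch_add(value, std::memory_order_seq_cst) + value;
}

// Analogue of Interlocked.Add with the result ignored. Per the PR description,
// LLVM still uses ldaddal here rather than staddl.
inline void AddIgnoreResult(std::atomic<int>& location, int value)
{
    location.fetch_add(value, std::memory_order_seq_cst);
}

// Analogue of Interlocked.Exchange: returns the previous value.
// Typically lowered to swpal on ARMv8.1.
inline int Exchange(std::atomic<int>& location, int value)
{
    return location.exchange(value, std::memory_order_seq_cst);
}
```

Compiling these functions for arm64 and checking whether a `dmb ish` follows the atomic instruction is one way to compare the output against the JIT diffs.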
Here are the diff examples:
G_M16300_IG01:
        A9BF7BFD          stp     fp, lr, [sp,#-16]!
        910003FD          mov     fp, sp
G_M16300_IG02:
        B8E10000          ldaddal w1, w0, [x0]
-       D5033BBF          dmb     ish
        0B010000          add     w0, w0, w1
G_M16300_IG03:
        A8C17BFD          ldp     fp, lr, [sp],#16
        D65F03C0          ret     lr

G_M24897_IG01:
        A9BF7BFD          stp     fp, lr, [sp,#-16]!
        910003FD          mov     fp, sp
G_M24897_IG02:
        B8E18000          swpal   w1, w0, [x0]
-       D5033BBF          dmb     ish
G_M24897_IG03:
        A8C17BFD          ldp     fp, lr, [sp],#16
        D65F03C0          ret     lr

/cc @dotnet/jit-contrib