JIT ARM64: Add IF_SVE_DT_3A, IF_SVE_DU_3A, IF_SVE_DX_3A, IF_SVE_DY_3A#96201
JIT ARM64: Add IF_SVE_DT_3A, IF_SVE_DU_3A, IF_SVE_DX_3A, IF_SVE_DY_3A#96201amanasifkhalid merged 12 commits intodotnet:mainfrom
Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsPart of #94549. cc @dotnet/arm64-contrib, @a74nh. JitDisasm output: cstool output: Notice that for the last few instructions with vector length specifiers, cstool prints the predicate number slightly differently from how we currently do it in the JIT (e.g.
|
I don't think cstool output is correct. They are printing the register name in hex. @TIHan something to fix. |
I can update my changes to use the same aliases as cstool. I'm guessing we'd have to split some of the |
Generally we should be using the preferred alias, as it's intended as easier to read. What we've been doing for other instructions with aliases is just to change For runtime/src/coreclr/jit/emitarm64.cpp Line 8842 in 8bd23f0 For runtime/src/coreclr/jit/emitarm64.cpp Line 9987 in 8bd23f0 |
|
However, having said that about aliases, I don't think that's the issue here. The difference in the output is, for example, Looking at Which suggests something is going wrong in the coreclr encoding. Looking at
/*static*/ emitter::code_t emitter::insEncodeReg_P_2_to_0(regNumber reg)
{
assert(isPredicateRegister(reg));
emitter::code_t ureg = (emitter::code_t)reg - (emitter::code_t)REG_P0;
assert((ureg >= 8) && (ureg <= 15));
return (ureg - 8 ) << 0;
}Looking at the autogenerated code, |
|
@a74nh thank you for the fix! Changing I'm trying to determine the best way to differentiate these instruction formats in Another option is to not split up the instructions, but to (mis-)use existing |
I hit the same issue with a previous PR, but managed to reduce the insopts value for that one. I was a little concerned we'd hit it again.
Ideally I'd rather avoid this. But we might have to eventually.
Reducing One way would be to turn it from an enum into flags: But that's a lot of refactoring code and we might still run out. So, probably not. There should be no overlap the non-SVE and SVE values. So we should simply be able to restart the enum at 1: So, |
Good point, thank you for the suggestion! I had to refactor and the (sanitized) cstool output: The only diffs I see locally are the different predicate register formats at the end (e.g. |
a74nh
left a comment
There was a problem hiding this comment.
LGTM now. Thanks!
(note: I'm away until new year now).
|
Thank you for the review, and enjoy your holiday! |
kunalspathak
left a comment
There was a problem hiding this comment.
Added few questions and suggestions.
src/coreclr/jit/instr.h
Outdated
| INS_OPTS_SCALABLE_B, | ||
| // There should be no overlap between non-SVE and SVE values, | ||
| // so reset value to 1 here | ||
| INS_OPTS_SCALABLE_B = 1, |
There was a problem hiding this comment.
We should move the INS_OPTS_SCALABLE* below INS_OPTS_ALIGN, so there is no gap between INS_OPTS_2D and INS_OPTS_MSL.
src/coreclr/jit/codegenarm64test.cpp
Outdated
| INS_OPTS_SCALABLE_H); // FRECPX <Zd>.<T>, <Pg>/M, <Zn>.<T> | ||
| theEmitter->emitIns_R_R_R(INS_sve_fsqrt, EA_SCALABLE, REG_V6, REG_P6, REG_V6, | ||
| INS_OPTS_SCALABLE_S); // FSQRT <Zd>.<T>, <Pg>/M, <Zn>.<T> | ||
| INS_OPTS_SCALABLE_S); /* FSQRT <Zd>.<T>, <Pg>/M, <Zn>.<T> */ |
There was a problem hiding this comment.
| INS_OPTS_SCALABLE_S); /* FSQRT <Zd>.<T>, <Pg>/M, <Zn>.<T> */ | |
| INS_OPTS_SCALABLE_S); // FSQRT <Zd>.<T>, <Pg>/M, <Zn>.<T> |
src/coreclr/jit/emitarm64.h
Outdated
|
|
||
| inline static bool isHighPredicateRegister(regNumber reg) | ||
| { | ||
| return (reg > REG_PREDICATE_LOW_LAST) && (reg <= REG_PREDICATE_LAST); |
There was a problem hiding this comment.
probably good to introduce REG_PREDICATE_HIGH_FIRST and REG_PREDICATE_HIGH_LAST and use it here. It is more explicit that way.
There was a problem hiding this comment.
and a static_assert_no_msg(REG_PREDICATE_HIGH_LAST == REG_PREDICATE_LAST).
| str = "8b"; | ||
| break; | ||
| case INS_OPTS_16B: | ||
| str = "16b"; |
There was a problem hiding this comment.
any reason why these were added in emitDispArrangement?
There was a problem hiding this comment.
Those cases were already in emitDispArrangement, I think the diff just makes it look like they were added in. I moved all the SVE-specific cases to emitDispSveArrangement, and left the remaining cases in emitDispArrangement as-is.
| case INS_OPTS_SCALABLE_B_WITH_PREDICATE_MERGE: | ||
| str = "b"; | ||
| break; | ||
| case INS_OPTS_4H: |
There was a problem hiding this comment.
does removing these have any implications on existing instructions that we added so far? Can you try to compare the output of all instructions jitdisasm vs. cstool with this change?
There was a problem hiding this comment.
Can you try to compare the output of all instructions jitdisasm vs. cstool with this change?
Sure thing. I'll post a comment with diffable outputs from both.
|
I created ugly but diffable JitDisasm/cstool outputs for the SVE instructions we have so far. I don't see any diffs locally, except for the different predicate register encodings we noted above. |
Thanks for doing this. |
|
Looks like there are several test failures because of moving around the enum ordering. |
Thank you for pointing that out. Those asserts I added -- e.g. |
|
I must say that although the case INS_OPTS_SCALABLE_B:
case INS_OPTS_SCALABLE_WIDE_B:
case INS_OPTS_SCALABLE_B_WITH_SIMD_SCALAR:
case INS_OPTS_SCALABLE_B_WITH_SIMD_VECTOR:
case INS_OPTS_SCALABLE_B_WITH_SCALAR:
case INS_OPTS_SCALABLE_B_WITH_PREDICATE_MERGE:
return EA_1BYTE;Of course, the alternative to increase the size of |
I like this idea of separating SVE-specific logic out into separate functions. All the new SVE-specific
Sure thing. I initially tried this strategy, though I ran into some weird build issues from the static |
This is the crucial part - in some cases, once the group is decided, this information in the I'll write a quick PR that introduces a |
See #96692 |
I will wait for this to merge before reviewing this PR. |
This is now merged. All your |
src/coreclr/jit/instr.h
Outdated
|
|
||
| // There should be no overlap between non-SVE and SVE values, | ||
| // so reset value to 1 here | ||
| INS_OPTS_SCALABLE_B = 1, |
There was a problem hiding this comment.
we don't need resetting anymore. also, you can pretty much revert changes done in enum insOpts
| return sopt == INS_SCALABLE_OPTS_NONE; | ||
| } | ||
|
|
||
| inline static bool insScalableOptsWithPredicatePair(insScalableOpts sopt) |
There was a problem hiding this comment.
looks like we missed the summary header for newer SVE methods we added recently. do you mind adding those?
| { | ||
| code = insEncodeSveElemsize_dtype(ins, optGetSveElemsize(id->idInsOpt()), code); | ||
| } | ||
|
|
a74nh
left a comment
There was a problem hiding this comment.
LGTM (once other review comments are fixed up)
|
Thank you for the reviews! @kunalspathak I've addressed your feedback. |

Part of #94549. cc @dotnet/arm64-contrib, @a74nh.
JitDisasm output:
cstool output:
Notice that for the last few instructions with vector length specifiers, cstool prints the predicate number slightly differently from how we currently do it in the JIT (e.g.
pn0xA.hvsp10.h). Should we match cstool's formatting exactly, or is this unimportant in this case?