Handle more than 64 registers - Part 1#101950
Conversation
|
@dotnet/jit-contrib |
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
|
for windows/arm64 crossgen2 collection, here is the distribution. Will take a look |
|
The culprit was using |
src/coreclr/jit/target.h
Outdated
| FORCEINLINE explicit operator unsigned int() const | ||
| { | ||
| return (unsigned int)low; | ||
| } |
There was a problem hiding this comment.
This one we ought to remove as well.
There was a problem hiding this comment.
It is currently used only at one place. Ideally, all the *refRegs should be just gpr register set of size 4 bytes, but do not want to mix that change in this PR.
if (ig->igFlags & IGF_BYREF_REGS)
{
// Record the byref regs in front the of the instructions
*castto(id, unsigned*)++ = (unsigned)emitInitByrefRegs;
}
src/coreclr/jit/compiler.hpp
Outdated
| #ifdef TARGET_ARM64 | ||
| regNumber regNum = (regNumber)BitScanForward(mask); | ||
| #else | ||
| regNumber regNum = (regNumber)BitOperations::BitScanForward(mask); | ||
| #endif |
There was a problem hiding this comment.
It would be good to avoid these ifdefs by introducing some forwarding BitScanForward in the typedef unsigned __int64 regMaskTP case.
Yeah, seems to just be various MSVC regressions due to now having a struct instead of primitive type. Clang seems to do a little bit better. Not much we can do about that I think. |
|
We might consider switching Alternatively we could move extra operations to live on some |
I don't mind doing it and was advocate of similar idea back in #98258 because with that, in future, when we add APX support for Intel, enabling the "handling of 64 registers". Edit: With that said, given that #98258 is already out for couple of months now and it is a critical work that is needed to make other progress for SVE, I would like to concentrate on enabling it with as minimal work as needed (just for arm64) and have it enable for other platforms as a follow up PRs once we complete the "predicate register" work. |
src/coreclr/jit/target.h
Outdated
| FORCEINLINE void setLow(uint64_t _low) | ||
| { | ||
| low = _low; | ||
| } |
There was a problem hiding this comment.
I don't think we should have this one (can just assign the full regMaskTP instead)
There was a problem hiding this comment.
Feel free to include it in the follow up
There was a problem hiding this comment.
(FWIW, it seems like bad encapsulation to have low accessible outside this class at all)
There was a problem hiding this comment.
I totally agree with you and was not a fan of having this in the last commit. May be I should just try assigning the whole regMaskTP.
|
/azp run runtime-coreclr superpmi-diffs |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run runtime-coreclr superpmi-replay |
|
Azure Pipelines successfully started running 1 pipeline(s). |
* Convert regMaskTP for ARM64 to struct with single field * Fix genFirstRegNumFromMaskAndToggle() and genFirstRegNumFromMask() * minor fix * review feedback * fix the TP regression from 1.5% -> 0.5% * Pass by value * jit format * review feedback * Remove FORCEINLINE * Remove setLow()


The general feedback for #98258 was to come up with smaller PRs concentrated around LSRA. This is part 1 of that.
For Arm64, this PR changes the
typedef unsigned __int64 regMaskTPto `typedefA version of
PopCountandBitOperationshas been added next toregMaskTPstruct definition.Most of the method implementation is pulled from #96196.