JIT: stopping preference for ML CSE heuristic #98063
Some initial attempts to improve the modelling of "stopping early" when doing CSEs. This adds a simplistic register pressure estimate modelled on the one we have now, where we find the weight of the Nth tracked local and use that as a reference weight for deciding when a CSE might start becoming costly. Contributes to dotnet#92915.
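The reference-weight idea described above can be sketched as follows. This is only an illustration under stated assumptions, not the code from this PR: `TrackedLocal`, `ReferenceWeight`, `MightBeCostly`, and the `registerBudget` parameter are hypothetical names, and the comparison rule (a candidate lighter than the Nth heaviest tracked local is flagged) is one plausible reading of "might start becoming costly".

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a tracked local; only its weight matters here.
struct TrackedLocal
{
    double weight; // weighted reference count (assumed non-negative)
};

// Returns the weight of the Nth heaviest tracked local (1-based), where N is
// an assumed register budget. Returns 0 when the budget is 0 or when fewer
// than N locals are tracked, i.e. when this simple model sees no contention.
double ReferenceWeight(const std::vector<TrackedLocal>& locals, std::size_t registerBudget)
{
    if ((registerBudget == 0) || (locals.size() < registerBudget))
    {
        return 0.0;
    }

    std::vector<double> weights;
    weights.reserve(locals.size());
    for (const TrackedLocal& local : locals)
    {
        weights.push_back(local.weight);
    }

    // Partially sort in descending order so that the element at index
    // (registerBudget - 1) is the Nth heaviest weight.
    std::nth_element(weights.begin(), weights.begin() + (registerBudget - 1), weights.end(),
                     std::greater<double>());
    return weights[registerBudget - 1];
}

// Assumed rule: a CSE candidate lighter than the reference weight would
// compete for a register with heavier locals, so it may be costly.
bool MightBeCostly(double candidateWeight, double referenceWeight)
{
    return candidateWeight < referenceWeight;
}
```

With four tracked locals of weight 10, 400, 50, and 200 and a budget of 2, the reference weight in this sketch is 200, so a candidate of weight 100 would be flagged and one of weight 300 would not.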
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
@EgorBo PTAL. This is not as effective as I'd hoped. I think I may actually need multiple "stopping" choices to represent different strategies for stopping early, and to let ML tune them (perhaps analogous to the current "aggressive, moderate, conservative" thresholds). That will take more work, as the current code is strongly biased toward there being just one way to stop.
Diff results for #98063
Throughput diffs
Throughput diffs for linux/arm64 ran on windows/x64
  Overall (+0.01% to +0.04%)
  FullOpts (+0.02% to +0.04%)
Throughput diffs for linux/x64 ran on windows/x64
  Overall (+0.01% to +0.03%)
  FullOpts (+0.02% to +0.03%)
Throughput diffs for osx/arm64 ran on windows/x64
  Overall (+0.01% to +0.03%)
  FullOpts (+0.02% to +0.03%)
Throughput diffs for windows/arm64 ran on windows/x64
  Overall (+0.01% to +0.03%)
  MinOpts (-0.00% to +0.01%)
  FullOpts (+0.02% to +0.03%)
Throughput diffs for windows/x64 ran on windows/x64
  Overall (+0.01% to +0.03%)
  FullOpts (+0.02% to +0.03%)
Throughput diffs for windows/x86 ran on linux/x86
  Overall (+0.02% to +0.03%)
  FullOpts (+0.02% to +0.03%)
Diff results for #98063
Throughput diffs
Throughput diffs for linux/arm64 ran on linux/x64
  Overall (+0.02% to +0.04%)
  FullOpts (+0.02% to +0.04%)
Throughput diffs for linux/x64 ran on linux/x64
  Overall (+0.01% to +0.03%)
  FullOpts (+0.02% to +0.03%)
Throughput diffs for linux/arm ran on windows/x86
  Overall (+0.02% to +0.04%)
  FullOpts (+0.02% to +0.04%)
    // Stopping "parameter"
    //
    m_registerPressure = CNT_CALLEE_TRASH + CNT_CALLEE_SAVED;
Does it need to take floating-point registers into account here (e.g. CNT_CALLEE_SAVED_FLOAT)?
Historically we haven't, and there are very few FP CSEs (maybe 1% of all candidates).
Diff results for #98063
Throughput diffs
Throughput diffs for linux/arm64 ran on linux/x64
  Overall (+0.02% to +0.04%)
  FullOpts (+0.02% to +0.04%)
Throughput diffs for linux/x64 ran on linux/x64
  Overall (+0.01% to +0.03%)
  FullOpts (+0.02% to +0.03%)
Throughput diffs for osx/arm64 ran on linux/x64
  Overall (+0.01% to +0.03%)
  FullOpts (+0.02% to +0.03%)
Throughput diffs for windows/arm64 ran on linux/x64
  Overall (+0.01% to +0.04%)
  MinOpts (-0.01% to +0.00%)
  FullOpts (+0.02% to +0.04%)
Diff results for #98063
Throughput diffs
Throughput diffs for linux/arm64 ran on windows/x64
  Overall (+0.01% to +0.04%)
  FullOpts (+0.02% to +0.04%)
Throughput diffs for windows/x64 ran on windows/x64
  Overall (+0.01% to +0.03%)
  FullOpts (+0.02% to +0.03%)
Throughput diffs for linux/arm ran on windows/x86
  Overall (+0.02% to +0.04%)
  FullOpts (+0.02% to +0.04%)
Throughput diffs for windows/x86 ran on windows/x86
  Overall (+0.02% to +0.03%)
  FullOpts (+0.02% to +0.03%)