JIT: Set naive likelihoods when initializing preds#98192
JIT: Set naive likelihoods when initializing preds#98192amanasifkhalid merged 3 commits intodotnet:mainfrom
Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsPart of #93020. Per discussion on #98054, set "naive" edge likelihoods (in other words, assume every successor edge is equally likely to be taken) in cc @dotnet/jit-contrib
|
src/coreclr/jit/fgflow.cpp
Outdated
|
|
||
| const unsigned numSucc = blockPred->NumSucc(); | ||
| assert(numSucc > 0); | ||
| flow->setLikelihood(1.0 / numSucc); |
There was a problem hiding this comment.
@AndyAyersMS you mentioned the likelihood calculation should consider the edge's duplicate count. I ran into issues with BBJ_COND blocks with true/false branches to the same target, as NumSucc handles this case by returning 1 for such blocks. In such cases, if we calculate the likelihood as (1.0 / numSucc) / dupCount, the edge likelihood overflows to (1.0 / 1) * 2 = 2.0. So leaving the dup count out of the calculation seems to work for that case, but I think this will give us wonky likelihoods for BBJ_SWITCH blocks with duplicate successors.
Is it possible to encounter BBJ_SWITCH blocks with duplicate successors this early on in the JIT? If so, we can use BasicBlock::NumSucc(Compiler *) to get the number of unique switch successors to get the edge likelihoods right, though it'll probably cost a bit more TP.
There was a problem hiding this comment.
For switches having dups is pretty common.
Maybe you just need to special-case BBJ_COND and check for the odd cases where both targets are the same block.
Diff results for #98192Throughput diffsThroughput diffs for windows/arm64 ran on linux/x64Overall (+0.03% to +0.05%)
MinOpts (+0.03% to +0.11%)
FullOpts (+0.03% to +0.05%)
Details here |
Diff results for #98192Assembly diffsAssembly diffs for linux/arm64 ran on windows/x64Diffs are based on 1,610,272 contexts (368,644 MinOpts, 1,241,628 FullOpts). MISSED contexts: 3,428 (0.21%) Overall (-16 bytes)
FullOpts (-16 bytes)
Assembly diffs for linux/x64 ran on windows/x64Diffs are based on 1,620,764 contexts (360,162 MinOpts, 1,260,602 FullOpts). MISSED contexts: 3,086 (0.19%) Overall (-15 bytes)
FullOpts (-15 bytes)
Assembly diffs for osx/arm64 ran on windows/x64Diffs are based on 1,733,061 contexts (561,303 MinOpts, 1,171,758 FullOpts). MISSED contexts: 3,460 (0.20%) Overall (-16 bytes)
FullOpts (-16 bytes)
Assembly diffs for windows/arm64 ran on windows/x64Diffs are based on 1,477,297 contexts (263,527 MinOpts, 1,213,770 FullOpts). MISSED contexts: 3,464 (0.23%) Overall (-16 bytes)
FullOpts (-16 bytes)
Assembly diffs for windows/x64 ran on windows/x64Diffs are based on 1,999,231 contexts (587,594 MinOpts, 1,411,637 FullOpts). MISSED contexts: 3,657 (0.18%) Overall (-2 bytes)
FullOpts (-2 bytes)
Details here Assembly diffs for linux/arm ran on windows/x86Diffs are based on 1,449,330 contexts (345,734 MinOpts, 1,103,596 FullOpts). MISSED contexts: 55,656 (3.70%) Overall (+0 bytes)
FullOpts (+0 bytes)
Assembly diffs for windows/x86 ran on windows/x86Diffs are based on 1,618,717 contexts (327,626 MinOpts, 1,291,091 FullOpts). MISSED contexts: base: 11,019 (0.68%), diff: 11,022 (0.68%) Overall (+38 bytes)
FullOpts (+38 bytes)
Details here Throughput diffsThroughput diffs for linux/arm64 ran on windows/x64Overall (+0.03% to +0.05%)
MinOpts (+0.03% to +0.11%)
FullOpts (+0.03% to +0.05%)
Throughput diffs for linux/x64 ran on windows/x64Overall (+0.03% to +0.05%)
MinOpts (+0.03% to +0.15%)
FullOpts (+0.03% to +0.06%)
Throughput diffs for osx/arm64 ran on windows/x64Overall (+0.03% to +0.05%)
MinOpts (+0.03% to +0.11%)
FullOpts (+0.03% to +0.05%)
Throughput diffs for windows/x64 ran on windows/x64Overall (+0.03% to +0.06%)
MinOpts (+0.03% to +0.15%)
FullOpts (+0.03% to +0.06%)
Details here Throughput diffs for linux/arm ran on windows/x86Overall (+0.03% to +0.04%)
MinOpts (+0.03% to +0.10%)
FullOpts (+0.03% to +0.04%)
Throughput diffs for windows/x86 ran on windows/x86Overall (+0.03% to +0.05%)
MinOpts (+0.02% to +0.16%)
FullOpts (+0.03% to +0.05%)
Details here Throughput diffs for linux/arm64 ran on linux/x64Overall (+0.03% to +0.05%)
MinOpts (+0.03% to +0.12%)
FullOpts (+0.03% to +0.06%)
Throughput diffs for linux/x64 ran on linux/x64Overall (+0.04% to +0.06%)
MinOpts (+0.03% to +0.16%)
FullOpts (+0.04% to +0.06%)
Details here |
Diff results for #98192Throughput diffsThroughput diffs for linux/arm64 ran on linux/x64Overall (+0.02% to +0.04%)
MinOpts (+0.02% to +0.09%)
FullOpts (+0.02% to +0.04%)
Throughput diffs for linux/x64 ran on linux/x64Overall (+0.03% to +0.04%)
MinOpts (+0.03% to +0.14%)
FullOpts (+0.03% to +0.04%)
Throughput diffs for osx/arm64 ran on linux/x64Overall (+0.02% to +0.04%)
MinOpts (+0.02% to +0.11%)
FullOpts (+0.02% to +0.04%)
Throughput diffs for windows/arm64 ran on linux/x64Overall (+0.02% to +0.04%)
MinOpts (+0.02% to +0.10%)
FullOpts (+0.02% to +0.04%)
Details here |
Diff results for #98192Assembly diffsAssembly diffs for linux/arm64 ran on windows/x64Diffs are based on 1,610,272 contexts (368,644 MinOpts, 1,241,628 FullOpts). MISSED contexts: 3,428 (0.21%) Overall (-16 bytes)
FullOpts (-16 bytes)
Assembly diffs for linux/x64 ran on windows/x64Diffs are based on 1,620,764 contexts (360,162 MinOpts, 1,260,602 FullOpts). MISSED contexts: 3,086 (0.19%) Overall (-15 bytes)
FullOpts (-15 bytes)
Assembly diffs for osx/arm64 ran on windows/x64Diffs are based on 1,733,061 contexts (561,303 MinOpts, 1,171,758 FullOpts). MISSED contexts: 3,460 (0.20%) Overall (-16 bytes)
FullOpts (-16 bytes)
Assembly diffs for windows/arm64 ran on windows/x64Diffs are based on 1,477,297 contexts (263,527 MinOpts, 1,213,770 FullOpts). MISSED contexts: 3,464 (0.23%) Overall (-16 bytes)
FullOpts (-16 bytes)
Assembly diffs for windows/x64 ran on windows/x64Diffs are based on 1,999,231 contexts (587,594 MinOpts, 1,411,637 FullOpts). MISSED contexts: 3,657 (0.18%) Overall (-2 bytes)
FullOpts (-2 bytes)
Details here Assembly diffs for linux/arm ran on windows/x86Diffs are based on 1,449,330 contexts (345,734 MinOpts, 1,103,596 FullOpts). MISSED contexts: 55,656 (3.70%) Overall (+0 bytes)
FullOpts (+0 bytes)
Assembly diffs for windows/x86 ran on windows/x86Diffs are based on 1,618,717 contexts (327,626 MinOpts, 1,291,091 FullOpts). MISSED contexts: base: 11,019 (0.68%), diff: 11,022 (0.68%) Overall (+38 bytes)
FullOpts (+38 bytes)
Details here Throughput diffsThroughput diffs for linux/arm64 ran on windows/x64Overall (+0.02% to +0.04%)
MinOpts (+0.02% to +0.10%)
FullOpts (+0.02% to +0.04%)
Throughput diffs for linux/x64 ran on windows/x64Overall (+0.03% to +0.04%)
MinOpts (+0.03% to +0.14%)
FullOpts (+0.03% to +0.04%)
Throughput diffs for osx/arm64 ran on windows/x64Overall (+0.02% to +0.04%)
MinOpts (+0.02% to +0.11%)
FullOpts (+0.02% to +0.04%)
Throughput diffs for windows/arm64 ran on windows/x64Overall (+0.02% to +0.04%)
MinOpts (+0.02% to +0.10%)
FullOpts (+0.02% to +0.04%)
Throughput diffs for windows/x64 ran on windows/x64Overall (+0.03% to +0.05%)
MinOpts (+0.03% to +0.13%)
FullOpts (+0.03% to +0.05%)
Details here Throughput diffs for linux/arm64 ran on linux/x64Overall (+0.02% to +0.04%)
MinOpts (+0.02% to +0.09%)
FullOpts (+0.02% to +0.04%)
Throughput diffs for linux/x64 ran on linux/x64Overall (+0.02% to +0.04%)
MinOpts (+0.02% to +0.12%)
FullOpts (+0.02% to +0.04%)
Details here Throughput diffs for linux/arm ran on windows/x86Overall (+0.02% to +0.03%)
MinOpts (+0.02% to +0.09%)
FullOpts (+0.02% to +0.03%)
Throughput diffs for windows/x86 ran on windows/x86Overall (+0.02% to +0.03%)
MinOpts (+0.01% to +0.14%)
FullOpts (+0.02% to +0.03%)
Details here |
|
@AndyAyersMS PTAL. TP is slightly more reasonable now. I took a look at |
Part of #93020. Per discussion on #98054, set "naive" edge likelihoods (in other words, assume every successor edge is equally likely to be taken) in
fgAddRefPredwhen initializing preds.cc @dotnet/jit-contrib