Quality of native perf profiling on x64 #105690

@EgorBo

I have built a toy bot that runs arbitrary benchmarks on Azure Linux VMs and reports the results back to the PR or issue it was invoked from. It also runs the native Linux perf tool to collect traces/flamegraphs on demand. What I noticed from the beginning is that the quality of flamegraphs on x64 is often poor and varies somewhat randomly between runs, compared to exactly the same config on arm64.
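For readers who want to reproduce this kind of collection, here is a minimal sketch of how a perf invocation for a .NET process could be assembled. The bot's actual command line is not shown in this issue, so the helper name `build_perf_record`, the sampling frequency and the output path are all assumptions. `DOTNET_PerfMapEnabled=1` is the runtime setting that makes the runtime write a perf map so perf can symbolize JIT-compiled frames.

```python
import os


def build_perf_record(benchmark_cmd, output="perf.data", freq_hz=99):
    """Return (argv, env) for sampling `benchmark_cmd` with Linux perf.

    Hypothetical helper: the flags below are standard `perf record`
    options, but the real bot may use different ones.
    """
    env = dict(os.environ)
    # Ask the .NET runtime to emit /tmp/perf-<pid>.map for JIT'd code.
    env["DOTNET_PerfMapEnabled"] = "1"
    argv = [
        "perf", "record",
        "-F", str(freq_hz),  # sampling frequency in Hz (assumed value)
        "-g",                # record call graphs (stack traces)
        "-o", output,        # where to write the trace
        "--", *benchmark_cmd,
    ]
    return argv, env
```

The pair can be passed to `subprocess.run(argv, env=env, check=True)` on a machine where perf is installed. With plain `-g`, perf typically unwinds through frame pointers, which is why omitted frame pointers matter for the question raised in this issue.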

Example: #105593 (comment)

So here is what the flamegraphs look like between sequential runs on arm64 (an Ampere Altra):

If you open these graphs and switch between the tabs in your browser, you will notice almost no difference between "run 0" and "run 1", as expected.

On x64 (AMD Milan) the picture is a bit different:

(presumably, with less aggressive/disabled inlining it's a lot worse)

The graphs are a lot more "random" between runs. We wonder if this could be caused by an x64-specific optimization that omits frame pointers for simple methods (on arm64 we currently always emit them, although we have issues open to eventually change that: #88823 and #35274). Should we never use that optimization when PerfMap is enabled? Or even introduce a new knob, since this is not specific to perf?
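To put a number on how much two runs differ instead of eyeballing browser tabs, one option is to compare the folded stacks that flamegraphs are usually generated from. The sketch below assumes the common "collapsed" text format (one line per unique stack, frames separated by `;`, followed by a space and a sample count), as produced by tools such as `stackcollapse-perf.pl`. The function names are hypothetical and this is not something the bot currently does.

```python
from collections import Counter


def parse_folded(text):
    """Parse collapsed stacks ("frameA;frameB 42") into {stack: samples}."""
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The sample count is the last space-separated token; frame
        # names themselves may contain spaces.
        stack, _, samples = line.rpartition(" ")
        counts[stack] += int(samples)
    return counts


def overlap(run_a, run_b):
    """Share of the normalized sample weight that lands on identical stacks.

    1.0 means both runs have the same stack distribution; values near
    0.0 mean the runs attribute their samples to different stacks.
    """
    total_a = sum(run_a.values())
    total_b = sum(run_b.values())
    if total_a == 0 or total_b == 0:
        return 0.0
    common = run_a.keys() & run_b.keys()
    return sum(min(run_a[s] / total_a, run_b[s] / total_b) for s in common)
```

If missing frame pointers are breaking the unwinder, one would expect the same leaf methods to show up under different (or truncated) parent chains from run to run, which lowers this score on x64 relative to arm64.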

cc @jkotas @dotnet/dotnet-diag
