Use AVX512 to zero locals by EgorBo · Pull Request #91166 · dotnet/runtime

EgorBo · 2023-08-27T00:56:35Z

Extends #32538 to use AVX-512 (and AVX1) to zero locals on the non-loop path. I am going to slightly refactor it to use AVX in the loop path too, but later; this seems to be low-hanging fruit with nice diffs.

Diff example:

@@ -17,17 +17,13 @@ G_M59697_IG01:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
        push     rbx
        sub      rsp, 160
        vxorps   xmm4, xmm4, xmm4
-       vmovdqa  xmmword ptr [rsp+0x20], xmm4
-       vmovdqa  xmmword ptr [rsp+0x30], xmm4
-       mov      rax, -96
-       vmovdqa  xmmword ptr [rsp+rax+0xA0], xmm4
-       vmovdqa  xmmword ptr [rsp+rax+0xB0], xmm4
-       vmovdqa  xmmword ptr [rsp+rax+0xC0], xmm4
-       add      rax, 48
-       jne      SHORT  -5 instr
+       vmovdqu  ymmword ptr [rsp+0x20], ymm4
+       vmovdqu  ymmword ptr [rsp+0x40], ymm4
+       vmovdqu  ymmword ptr [rsp+0x60], ymm4
+       vmovdqu  ymmword ptr [rsp+0x80], ymm4
        mov      rbx, rcx
        ; gcrRegs +[rbx]
-						;; size=70 bbWeight=1 PerfScore 13.33
+						;; size=42 bbWeight=1 PerfScore 9.83
 G_M59697_IG02:        ; bbWeight=1, gcrefRegs=0008 {rbx}, byrefRegs=0000 {}, byref
        lea      rcx, [rsp+0x20]
        call     [<unknown method>]
@@ -46,7 +42,7 @@ G_M59697_IG03:        ; bbWeight=1, epilog, nogc, extend
        ret      
 						;; size=9 bbWeight=1 PerfScore 1.75

(Apparently this collection has no AVX-512, so the diff only shows 32-byte ymm stores, but it still looks better.)
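To make the byte accounting in the diff explicit, here is a small sketch that replays both store sequences against a model of the 160-byte frame and checks that they zero the same range. It is only an illustration of the arithmetic: the helper names (`store`, `oldSequence`, `newSequence`, `zeroedBytes`) are hypothetical and are not part of the JIT.

```cpp
#include <cstddef>
#include <vector>

// Frame size taken from `sub rsp, 160` in the diff above.
constexpr std::size_t kFrameSize = 160;

// Marks bytes [offset, offset + width) of the modeled frame as zeroed.
void store(std::vector<bool>& zeroed, std::size_t offset, std::size_t width)
{
    for (std::size_t i = 0; i < width; ++i)
    {
        zeroed.at(offset + i) = true;
    }
}

// Old code: two 16-byte xmm stores, then a loop of three xmm stores per
// iteration with rax running from -96 up to 0 in steps of 48.
std::vector<bool> oldSequence()
{
    std::vector<bool> zeroed(kFrameSize, false);
    store(zeroed, 0x20, 16);
    store(zeroed, 0x30, 16);
    for (long rax = -96; rax != 0; rax += 48)
    {
        store(zeroed, static_cast<std::size_t>(rax + 0xA0), 16);
        store(zeroed, static_cast<std::size_t>(rax + 0xB0), 16);
        store(zeroed, static_cast<std::size_t>(rax + 0xC0), 16);
    }
    return zeroed;
}

// New code: four straight-line 32-byte ymm stores.
std::vector<bool> newSequence()
{
    std::vector<bool> zeroed(kFrameSize, false);
    store(zeroed, 0x20, 32);
    store(zeroed, 0x40, 32);
    store(zeroed, 0x60, 32);
    store(zeroed, 0x80, 32);
    return zeroed;
}

// Counts how many bytes of the modeled frame were zeroed.
std::size_t zeroedBytes(const std::vector<bool>& zeroed)
{
    std::size_t count = 0;
    for (bool b : zeroed)
    {
        count += b ? 1 : 0;
    }
    return count;
}
```

Both sequences cover `[rsp+0x20, rsp+0xA0)`, i.e. 128 bytes: the old one with 8 xmm stores (6 of them inside a two-iteration loop), the new one with 4 ymm stores and no loop.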

ghost · 2023-08-27T00:56:47Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Author:	EgorBo
Assignees:	-
Labels:	`area-CodeGen-coreclr`
Milestone:	-

EgorBo · 2023-08-27T15:30:43Z

@dotnet/jit-contrib PTAL, simple change with nice diffs (-122kb for benchmarks.pgo collection, -0.13% TP for the same collection).

The logic has plenty of opportunities to optimize further, e.g. use AVX in the loop. I didn't change it here because for that we need to align data to 32/64 bytes, and the remainder can be handled with overlapping stores, but I am leaving it for future follow-ups. I was mostly interested in removing loops by allowing up to 6*64=384 bytes to be zeroed directly with AVX-512, where previously we switched to the loop for >96 bytes.
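As a rough illustration of the thresholds mentioned above (6*16=96 bytes with xmm, 6*64=384 bytes with zmm) and of the "overlapping" idea for a remainder, here is a minimal sketch. It is not the code in codegenxarch.cpp: `Store`, `kMaxInlineStores`, `canZeroInline`, and `planInlineStores` are hypothetical names, and the limit of 6 stores is an assumption read off the numbers in this comment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record of one vector store: byte offset from the start of
// the block, and store width in bytes (16 = xmm, 32 = ymm, 64 = zmm).
struct Store
{
    std::size_t offset;
    std::size_t width;
};

// Assumption taken from the comment above: up to 6 full-width stores are
// emitted inline before the code falls back to a loop.
constexpr std::size_t kMaxInlineStores = 6;

constexpr bool canZeroInline(std::size_t bytes, std::size_t simdWidth)
{
    return bytes <= kMaxInlineStores * simdWidth;
}

// Plans the inline stores for a block of `bytes` bytes. A trailing remainder
// shorter than simdWidth is covered by one extra store that overlaps the
// previous one, so no narrower store is needed.
std::vector<Store> planInlineStores(std::size_t bytes, std::size_t simdWidth)
{
    assert(bytes >= simdWidth && canZeroInline(bytes, simdWidth));

    std::vector<Store> plan;
    std::size_t offset = 0;
    for (; offset + simdWidth <= bytes; offset += simdWidth)
    {
        plan.push_back({offset, simdWidth});
    }
    if (offset < bytes)
    {
        // Back up so the last store ends exactly at the end of the block.
        plan.push_back({bytes - simdWidth, simdWidth});
    }
    return plan;
}
```

For example, `planInlineStores(128, 32)` yields the four ymm stores seen in the diff, while `planInlineStores(80, 64)` yields two zmm stores at offsets 0 and 16 that overlap in the middle. Overlap is harmless here because every store writes zeros.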

src/coreclr/jit/codegenxarch.cpp

EgorBo · 2023-09-05T16:19:38Z

Improvements on x64:

[Perf] Linux/x64: 22 Improvements on 9/2/2023 9:37:00 AM perf-autofiling-issues#21207
[Perf] Linux/x64: 18 Improvements on 8/28/2023 7:02:26 PM perf-autofiling-issues#21204

ghost added the area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) label Aug 27, 2023

ghost assigned EgorBo Aug 27, 2023

build-analysis bot mentioned this pull request Aug 27, 2023

Microsoft.NET.HostModel.Tests failing with "No space left on device" #91039

Closed

EgorBo marked this pull request as ready for review August 27, 2023 09:36

Use AVX512 to zero locals

61a05ee

EgorBo force-pushed the zero-locals-avx512 branch from 8ac3214 to 61a05ee Compare August 27, 2023 10:57

EgorBo added the avx512 (Related to the AVX-512 architecture) label Aug 27, 2023

build-analysis bot mentioned this pull request Aug 27, 2023

System.Net.Quic.Tests.QuicStreamTests.WriteCanceled_NextWriteThrows test failure #76831

Closed

Fix loop

8675483

tannergooding reviewed Aug 27, 2023

src/coreclr/jit/codegenxarch.cpp

tannergooding approved these changes Aug 27, 2023

EgorBo merged commit 3a1570f into dotnet:main Aug 28, 2023

EgorBo deleted the zero-locals-avx512 branch August 28, 2023 16:29

EgorBo mentioned this pull request Sep 3, 2023

Implement AVX-512 support #77034

Closed

56 tasks

ghost locked as resolved and limited conversation to collaborators Oct 5, 2023

EgorBo merged 2 commits into dotnet:main from EgorBo:zero-locals-avx512
