Use multi-reg load/store for EncodeToUtf8#95513
Conversation
|
Hi @kunalspathak , this is an initial version of encode to UTF8 using multi-register load/stores. This currently fails some asserts in LSRA phase while doing the crossgen for SPC. It fails while emitting |
src/libraries/System.Private.CoreLib/src/System/Buffers/Text/Base64Encoder.cs
Outdated
Show resolved
Hide resolved
|
Does this need an entry in their part notices file? |
Change-Id: Ie56b1786cdf8ac8d2067c0ba1fdfd3924dd9ca13
Sorry @danmoseley , I didn't understand which part notices file you're referring to. |
|
Initial benchmarking on N1 system show some good performance results. |
| } | ||
|
|
||
| end = srcMax - 16; | ||
| if ((Ssse3.IsSupported || AdvSimd.Arm64.IsSupported) && BitConverter.IsLittleEndian && (end >= src)) |
There was a problem hiding this comment.
I'm not sure if we should remove the AdvSimd check here.
With this PR, it will use the vector128 version if the buffer length is <48 && >16.
That's probably the best option for speed, but results in a bigger library.
|
@SwapnilGaikwad - do you mind sharing the disassembly? |
A separately compiled Full assembly |
|
Thanks @SwapnilGaikwad for sharing the disassembly. I wanted to see the code quality when consecutive registers are involved. |
| Vector128<byte> res4; | ||
| Vector128<byte> tbl_enc1 = Vector128.Create("ABCDEFGHIJKLMNOP"u8).AsByte(); | ||
| Vector128<byte> tbl_enc2 = Vector128.Create("QRSTUVWXYZabcdef"u8).AsByte(); | ||
| Vector128<byte> tbl_enc3 = Vector128.Create("ghijklmnopqrstuv"u8).AsByte(); |
There was a problem hiding this comment.
We can load this encoding table from EncodingMap from Line #774. This could help to reduce the code size but loading from memory/cache would be slightly slower than the regs. Benchmarks didn't show signficant difference. from-mem-artifacts load data from EncodingMap.
| Method | Toolchain | NumberOfBytes | Mean | Error | StdDev | Median | Min | Max | Ratio | MannWhitney(2%) |
|-------------------------------- |-------------------------------------------------------------------------------------- |-------------- |---------:|--------:|--------:|---------:|---------:|---------:|------:|---------------- |
| Base64Encode | /main/artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 363.8 ns | 0.14 ns | 0.12 ns | 363.8 ns | 363.5 ns | 363.9 ns | 1.00 | Base |
| Base64Encode | /runtime/artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 196.8 ns | 0.02 ns | 0.02 ns | 196.8 ns | 196.8 ns | 196.9 ns | 0.54 | Faster |
| Base64Encode | /runtime/from-mem-artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 195.3 ns | 0.08 ns | 0.07 ns | 195.3 ns | 195.2 ns | 195.5 ns | 0.54 | Faster |
| | | | | | | | | | | |
| Base64EncodeDestinationTooSmall | /main/artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 368.2 ns | 0.30 ns | 0.28 ns | 368.2 ns | 367.8 ns | 368.7 ns | 1.00 | Base |
| Base64EncodeDestinationTooSmall | /runtime/artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 191.5 ns | 0.04 ns | 0.04 ns | 191.5 ns | 191.5 ns | 191.6 ns | 0.52 | Faster |
| Base64EncodeDestinationTooSmall | /runtime/from-mem-artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 194.0 ns | 0.22 ns | 0.20 ns | 193.9 ns | 193.7 ns | 194.4 ns | 0.53 | Faster |
Would you suggest to read encoding table from EncodingMap?
|
/azp run runtime-coreclr libraries-jitstress |
|
/azp run runtime-coreclr libraries-jitstress2-jitstressregs |
|
Azure Pipelines successfully started running 1 pipeline(s). |
1 similar comment
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run runtime-coreclr libraries-jitstressregs |
|
Azure Pipelines successfully started running 1 pipeline(s). |
I guess he is talking about the reference made in https://github.com/dotnet/runtime/pull/95513/files#diff-b3b9edcf4c0d62e78954d826c44005cffb306b6ccf155f1a9228669229b7e765R496, but not sure where exactly to add this. @danmoseley - can you please confirm? |
|
Tagging subscribers to this area: @dotnet/area-system-buffers Issue DetailsThis implements the encode to UTF8 algorithm here.
|
|
ping @danmoseley |
|
Oops, yes, that was what caught my eye. Generally if we use significant ideas/code from elsewhere we add a credit in THIRD-PARTY-NOTICES.TXT at the root. Up to you. |
Looks like there is already an entry in there, but it's not immediately obvious. runtime/THIRD-PARTY-NOTICES.TXT Line 345 in 57bcccd Contains an extra copy of the license from https://github.com/aklomp/base64/blob/master/LICENSE |
| [CompExactlyDependsOn(typeof(AdvSimd.Arm64))] | ||
| private static unsafe void AdvSimdEncode(ref byte* srcBytes, ref byte* destBytes, byte* srcEnd, int sourceLength, int destLength, byte* srcStart, byte* destStart) | ||
| { | ||
| // C# implementatino of https://github.com/aklomp/base64/blob/3a5add8652076612a8407627a42c768736a4263f/lib/arch/neon64/enc_loop.c |
There was a problem hiding this comment.
nit
| // C# implementatino of https://github.com/aklomp/base64/blob/3a5add8652076612a8407627a42c768736a4263f/lib/arch/neon64/enc_loop.c | |
| // C# implementation of https://github.com/aklomp/base64/blob/3a5add8652076612a8407627a42c768736a4263f/lib/arch/neon64/enc_loop.c |
There was a problem hiding this comment.
you can do this in a follow-up PR
This reverts commit fdb03ca.
* Use multi-reg load/store for EncodeToUtf8 * Use the fixed version of multi-reg store * Update variable naming
…#96944) * Revert "[libs] Skip AdvSimdEncode on Mono (dotnet#96829)" This reverts commit 1a76e37. * Revert "Use multi-reg load/store for EncodeToUtf8 (dotnet#95513)" This reverts commit fdb03ca. * Wrap load/store vector APIs in '#if false' * Disable load/store vector tests * remove the trailing space
|
I assume this is the API being discussed. If so, it would be good to put in the initial comment so that it is easy for folks to fine. https://learn.microsoft.com/dotnet/api/system.buffers.text.base64.encodetoutf8 |
This implements the encode to UTF8 algorithm here.