Skip to content

PoC: avoid allocating GCHandle for ReceiveFromAsync on Windows#1

Closed
wfurt wants to merge 1 commit intoudp2from
gchandle2
Closed

PoC: avoid allocating GCHandle for ReceiveFromAsync on Windows#1
wfurt wants to merge 1 commit intoudp2from
gchandle2

Conversation

@wfurt
Copy link
Owner

@wfurt wfurt commented Jul 28, 2023

Because we allocate new SocketAddress on each call, following check is never true

      // Ensures appropriate SocketAddress buffer is pinned.
        private void PinSocketAddressBuffer()
        {
            // Check if already pinned.
            if (_pinnedSocketAddress == _socketAddress)
            {
                return;
            }

https://github.com/dotnet/runtime/blob/849ed0a0b895edb6dc70c4521f2284af640ec9d6/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncEventArgs.Windows.cs#L863-L864

The fix is to use NativeMemory that would persist across all calls (for given AddressFamily)
The downside is that we need to copy the address but that seems small cost comparing to GCHandle allocation and management.

closes dotnet#86513
contributes to dotnet#30797

@wfurt wfurt closed this Aug 9, 2023
wfurt pushed a commit that referenced this pull request Nov 29, 2023
Fixes dotnet#95367.

Relevant part of the JitDump:

```
Using `if true` assertions from pred BB02
Assertions in: #1
fgMorphTree BB04, STMT00021 (before)
               [000070] DA---------                         *  STORE_LCL_VAR ubyte  V10 tmp9
               [000057] -----------                         \--*  CAST      int <- ubyte <- int
               [000006] -----------                            \--*  EQ        int
               [000004] -----------                               +--*  LCL_VAR   ref    V02 tmp1          (last use)
               [000055] H----------                               \--*  CNS_INT(h) ref     'Frozen EmptyPartition`1<Int32> object'

Assertion prop for index #1 in BB04:
               [000006] -----------                         *  EQ        int
GenTreeNode creates assertion:
               [000070] DA---+-----                         *  STORE_LCL_VAR ubyte  V10 tmp9
In BB04 New Local Constant Assertion: V10 == [0000000000000001], index = #2

fgMorphTree BB04, STMT00021 (after)
               [000070] DA---+-----                         *  STORE_LCL_VAR ubyte  V10 tmp9
               [000055] H----+-----                         \--*  CNS_INT(h) int
```

The JitDump is unfinished because the compiler crashes when trying to dump the last line. Clearly, the `CNS_INT` is no longer a handle at that point because we just bashed it to a constant 1.
wfurt pushed a commit that referenced this pull request Mar 17, 2024
CodeQL flagged various places where we're dereferencing pointers that could be NULL, this PR systematically cleans some of them up via g_assert.
* g_assert result of g_build_path calls
* Allocation failure handling
* mono_class_inflate_generic_class_checked can return NULL
wfurt pushed a commit that referenced this pull request May 15, 2024
…#102133)

This generalizes the indir reordering optimization (that currently only
triggers for loads) to kick in for GT_STOREIND nodes.

The main complication with doing this is the fact that the data node of
the second indirection needs its own reordering with the previous
indirection. The existing logic works by reordering all nodes between
the first and second indirection that are unrelated to the second
indirection's computation to happen after it. Once that is done we know
that there are no uses of the first indirection's result between it and
the second indirection, so after doing the necessary interference checks
we can safely move the previous indirection to happen after the data
node of the second indirection.

Example:
```csharp
class Body { public double x, y, z, vx, vy, vz, mass; }

static void Advance(double dt, Body[] bodies)
{
    foreach (Body b in bodies)
    {
        b.x += dt * b.vx;
        b.y += dt * b.vy;
        b.z += dt * b.vz;
    }
}
```

Diff:
```diff
@@ -1,18 +1,17 @@
-G_M55007_IG04:  ;; offset=0x001C
+G_M55007_IG04:  ;; offset=0x0020
             ldr     x3, [x0, w1, UXTW #3]
             ldp     d16, d17, [x3, #0x08]
             ldp     d18, d19, [x3, #0x20]
             fmul    d18, d0, d18
             fadd    d16, d16, d18
-            str     d16, [x3, #0x08]
-            fmul    d16, d0, d19
-            fadd    d16, d17, d16
-            str     d16, [x3, #0x10]
+            fmul    d18, d0, d19
+            fadd    d17, d17, d18
+            stp     d16, d17, [x3, #0x08]
             ldr     d16, [x3, #0x18]
             ldr     d17, [x3, #0x30]
             fmul    d17, d0, d17
             fadd    d16, d16, d17
             str     d16, [x3, #0x18]
             add     w1, w1, #1
             cmp     w2, w1
             bgt     G_M55007_IG04
```
wfurt pushed a commit that referenced this pull request Sep 10, 2024
* bug #1: don't allow for values out of the SerializationRecordType enum range

* bug #2: throw SerializationException rather than KeyNotFoundException when the referenced record is missing or it points to a record of different type

* bug #3: throw SerializationException rather than FormatException when it's being thrown by BinaryReader (or sth else that we use)

* bug #4: document the fact that IOException can be thrown

* bug #5: throw SerializationException rather than OverflowException when parsing the decimal fails

* bug #6: 0 and 17 are illegal values for PrimitiveType enum

* bug #7: throw SerializationException when a surrogate character is read (so far an ArgumentException was thrown)
wfurt pushed a commit that referenced this pull request Apr 27, 2025
* JIT: Introduce `LclVarDsc::lvIsMultiRegDest`

With recent work to expand returned promoted locals into `FIELD_LIST`
the only "whole references" of promoted locals we should see is when
stored from a multi-reg node. This is the only knowledge the backend
should need for correctness purposes, so introduce a bit to track this
property, and switch the backend to check this instead.

The existing `lvIsMultiRegRet` is essentially this + whether the local
is returned. We should be able to remove this, but it is currently used
for some heuristics in old promotion, so keep it around for now.

* JIT: Add some more constant folding in lowering

Add folding for shifts and certain binops that are now getting produced
late due to returned `FIELD_LIST` nodes.

win-arm64 example:
```csharp
[MethodImpl(MethodImplOptions.NoInlining)]
static ValueTask<byte> Foo()
{
    return new ValueTask<byte>(123);
}
```

```diff
 G_M17084_IG02:  ;; offset=0x0008
             mov     x0, xzr
-            mov     w1, #1
-            mov     w2, wzr
-            mov     w3, dotnet#123
-            orr     w2, w2, w3,  LSL #16
-            orr     w1, w2, w1,  LSL #24
-						;; size=24 bbWeight=1 PerfScore 4.00
+            mov     w1, #0x17B0000
+						;; size=8 bbWeight=1 PerfScore 1.00
```

* Feedback
wfurt pushed a commit that referenced this pull request Apr 27, 2025
…otnet#114227)

Presence of `.cctor` in `Thread` can cause circular dependency if Lock needs to block while Thread .cctor has not run yet.

1. Lock needs to wait on a WaitHandle
2. WaitHandle needs Thread.CurrentThread
3. if Thread's .cctor has not run yet, it needs to run.     
(it is unusual for this to be the first use of Thread, but the activation pattern in dotnet#113949 made it possible)
4. .cctor needs to take a Lock, so we go to `#1`

Fixes: dotnet#113949
liveans pushed a commit that referenced this pull request Jan 5, 2026
liveans pushed a commit that referenced this pull request Jan 5, 2026
…ds from dotnet#27912 (Flow System.Text.Rune through more APIs)) (dotnet#120145)

* Fix tests from dotnet#117168

* Add `SyncTextWriter` overloads as well

* Add missing overloads to BroadcastingTextWriter

* Reapply "Add methods from dotnet#27912 (Flow System.Text.Rune through more APIs) (#1…" (dotnet#120138)

This reverts commit be80737.

* Override the TextWrite Rune overloads in IndentedTextWriter

---------

Co-authored-by: Tarek Mahmoud Sayed <tarekms@microsoft.com>
@wfurt wfurt deleted the gchandle2 branch January 7, 2026 20:01
wfurt pushed a commit that referenced this pull request Feb 19, 2026
…er (dotnet#123735)

From discussion, opting into enabling the crash chaining is more
correct.

<s>The previously registered signal action/handler aren't guaranteed to
return, so we lose out on notifying shutdown and creating a dump in
those cases. Specifically, PROCCreateCrashDumpIfEnabled would be the
last chance to provide the managed context for the thread that crashed.

e.g. On Android CoreCLR, it seems that, by default, signal handlers are
already registered by Android's runtime
(/apex/com.android.runtime/bin/linker64 +
/system/lib64/libandroid_runtime.so). Whenever an unhandled synchronous
fault occurs, the previously registered handler will not return back to
invoke_previous_action and aborts the thread itself, so
PROCCreateCrashDumpIfEnabled will not be hit.</s>

## Sigsegv behavior Android CoreCLR vs other platforms

### Android CoreCLR
When intentionally writing to NULL (sigsegv) on Android CoreCLR, the
previously registered signal handler goes down this path
https://github.com/dotnet/runtime/blob/40e8c73b8f3b5f478a9bf03cf55c71d0608a8855/src/coreclr/pal/src/exception/signal.cpp#L454,
and the thread aborts before hitting PROCNotifyProcessShutdown and
PROCCreateCrashDumpIfEnabled.

### MacOS/Linux/NativeAOT(linux)
On MacOS, Linux, NativeAOT (Only checked linux at time of writing), the
same intentional SIGSEGV will hit
https://github.com/dotnet/runtime/blob/40e8c73b8f3b5f478a9bf03cf55c71d0608a8855/src/coreclr/pal/src/exception/signal.cpp#L431-L448
instead because there is no previously registered signal handler. In
those cases, PROCCreateCrashDumpIfEnabled is hit and managed callstacks
are captured in the dump.

## History investigation

From a github history dive, I didn't spot anything in particular
requiring the previous signal handler to be invoked before
PROCNotifyProcessShutdown + PROCCreateCrashDumpIfEnabled.

PROCNotifyProcessShutdown was first introduced in
dotnet@1433c3f.
It doesn't seem to state a particular reason for invoking it after the
previous signal handler.

PROCCreateCrashDumpIfEnabled was added to signal.cpp in
dotnet@7f9bd2c
because the PROCNotifyProcessShutdown didn't create a crash dump. It
doesn't state any particular reason for being invoked after the
previously registered signal handler, and was probably just placed next
to PROCNotifyProcessShutdown.

`invoke_previous_action` was introduced in
dotnet@a740f65
and was refactoring while maintaining the order.

## Android CoreCLR behavior after swapping order

Locally, I have POC changes to emit managed callstacks in Android's
PROCCreateCrashDumpIfEnabled.
```
01-28 17:26:40.951  2416  2440 F DOTNET  : Native crash detected; attempting managed stack trace.
01-28 17:26:40.951  2416  2440 F DOTNET  : {"stack":[
01-28 17:26:40.951  2416  2440 F DOTNET  : {"ip":"0x0","module":"0x0","offset":"0x0","name":"Program.MemSet(Void*, Int32, UIntPtr)"},
01-28 17:26:40.951  2416  2440 F DOTNET  : {"ip":"0x78d981145973","module":"0x0","offset":"0x0","name":"Program.MemSet(Void*, Int32, UIntPtr)"},
01-28 17:26:40.951  2416  2440 F DOTNET  : {"ip":"0x78d981145973","module":"0x0","offset":"0x73","name":"Program.ForceNativeSegv()"},
01-28 17:26:40.951  2416  2440 F DOTNET  : {"ip":"0x78d981141b60","module":"0x0","offset":"0x70","name":"Program.Main(System.String[])"}
01-28 17:26:40.951  2416  2440 F DOTNET  : ]}
01-28 17:26:40.952  2416  2440 F DOTNET  : Crash dump hook completed.
--------- beginning of crash
01-28 17:26:40.952  2416  2440 F libc    : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0 in tid 2440 (.dot.MonoRunner), pid 2416 (ulator.JIT.Test)
.....
01-28 17:26:46.882  2921  2921 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01-28 17:26:46.882  2921  2921 F DEBUG   : Build fingerprint: 'google/sdk_gphone64_x86_64/emu64xa:16/BE2A.250530.026.D1/13818094:user/release-keys'
01-28 17:26:46.882  2921  2921 F DEBUG   : Revision: '0'
01-28 17:26:46.882  2921  2921 F DEBUG   : ABI: 'x86_64'
01-28 17:26:46.882  2921  2921 F DEBUG   : Timestamp: 2026-01-28 17:26:41.492831700-0500
01-28 17:26:46.882  2921  2921 F DEBUG   : Process uptime: 20s
01-28 17:26:46.883  2921  2921 F DEBUG   : Cmdline: net.dot.Android.Device_Emulator.JIT.Test
01-28 17:26:46.883  2921  2921 F DEBUG   : pid: 2416, tid: 2440, name: .dot.MonoRunner  >>> net.dot.Android.Device_Emulator.JIT.Test <<<
01-28 17:26:46.883  2921  2921 F DEBUG   : uid: 10219
01-28 17:26:46.883  2921  2921 F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0000000000000000
01-28 17:26:46.883  2921  2921 F DEBUG   : Cause: null pointer dereference
01-28 17:26:46.883  2921  2921 F DEBUG   : Abort message: 'CoreCLR: previous handler for '
01-28 17:26:46.883  2921  2921 F DEBUG   :     rax 0000000000000000  rbx 000078da87ffade0  rcx 0000000000000000  rdx 0000000000000001
01-28 17:26:46.884  1237  1297 I s.nexuslauncher: AssetManager2(0x78dd08cd9178) locale list changing from [] to [en-US]
01-28 17:26:46.903  2447  2594 I BugleNotifications: Creating notification input ids [CONTEXT im_entry_input="" im_notification_input="" im_settings_store_input="" im_final_input="" ]
01-28 17:26:46.905  2921  2921 F DEBUG   :     r8  00007ffcde5a8080  r9  34d9bb0e67871eb0  r10 000078ddb4111870  r11 0000000000000293
01-28 17:26:46.906  2921  2921 F DEBUG   :     r12 0000000000000001  r13 000078da87ffafa0  r14 0000000000000000  r15 000078da87ffaf18
01-28 17:26:46.906  2921  2921 F DEBUG   :     rdi 0000000000000000  rsi 0000000000000000
01-28 17:26:46.906  2921  2921 F DEBUG   :     rbp 000078da87ffac40  rsp 000078da87ffabc8  rip 000078ddb41118a2
01-28 17:26:46.906  2921  2921 F DEBUG   : 2 total frames
01-28 17:26:46.906  2921  2921 F DEBUG   : backtrace:
01-28 17:26:46.906  2921  2921 F DEBUG   :       #00 pc 000000000008f8a2  /apex/com.android.runtime/lib64/bionic/libc.so (memset_avx2+50) (BuildId: fcb82240218d1473de1e3d2137c0be35)
01-28 17:26:46.906  2921  2921 F DEBUG   :       #1 pc 0000000000049972  /memfd:doublemapper (deleted) (offset 0x111000)
```
Now theres a window to log managed callstacks before the original signal
handler aborts and triggers a tombstone.

## Android Mono behavior

Mono provides two embeddings APIs to configure signal and crash chaining
https://github.com/dotnet/runtime/blob/61d3943de41e948bb0ecf871b92eb456d2dd74d8/src/mono/mono/mini/driver.c#L2864-L2894
that determine whether synchronous faults would chain
https://github.com/dotnet/runtime/blob/61d3943de41e948bb0ecf871b92eb456d2dd74d8/src/mono/mono/mini/mini-runtime.c#L3892-L3903
They would only chain to the previous signal handler
https://github.com/dotnet/runtime/blob/61d3943de41e948bb0ecf871b92eb456d2dd74d8/src/mono/mono/mini/mini-posix.c#L193-L210
only after attempting to walk native and managed stacks
https://github.com/dotnet/runtime/blob/61d3943de41e948bb0ecf871b92eb456d2dd74d8/src/mono/mono/mini/mini-exceptions.c#L2992-L3012

## Alternatives

If there is any particular reason to preserve the order of
sa_sigaction/sa_handler with respect to PROCNotifyProcessShutdown and
PROCCreateCrashDumpIfEnabled for CoreCLR, a config knob can be added to
allow Android CoreCLR to opt into the swapped ordering behavior. This
may be in the form of config property key/values
https://github.com/dotnet/runtime/blob/54ca569eb62800cdb725d776e3dd2e564028594d/src/coreclr/dlls/mscoree/exports.cpp#L237-L238
or `clrconfigvalues`. That way AndroidSDK/AndroidAppBuilder may opt-in
at build-time.

Given that the history of the ordering didn't reveal any problems with
swapping the order, we can fallback to this behavior if the order swap
causes problems down the line.

The other way around is more restrictive. Should we first introduce all
the overhead to enable an opt-in/opt-out config knob, and later discover
that no platforms need to invoke their previous handlers before
PROCNotifyProcessShutdown/PROCCreateCrashDumpIfEnabled, it seems harder
to justify removing the knob.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant