Skip to content

Implement EBR#123492

Draft
noahfalk wants to merge 1 commit intodotnet:mainfrom
noahfalk:ebr
Draft

Implement EBR#123492
noahfalk wants to merge 1 commit intodotnet:mainfrom
noahfalk:ebr

Conversation

@noahfalk
Copy link
Member

We've got places in the runtime where ideally we would write an entire algorithm in pre-emptive code but we switch to COOP to create synchronization barriers that allow deleting data structures during GC pauses. Although piggybacking on the GC can work it comes with some downsides:

  1. GC operates as a reader-writer lock on the heap and like any lock it can create deadlocks if lock ordering constraints aren't maintained.
  2. Even without deadlocks, GC pauses are often measured in at least milliseconds. EBR can offer latencies many orders of magnitudes lower for our typical read-heavy workloads.

This change doesn't yet use EBR for anything but my initial thought is shifting the HashMap to use this for its defered table cleanup instead of the GC coupled approach it uses now. HashMap appears to be violating lock ordering constraints with its COOP mode switches and we have seen deadlocks caused by this in practice.

We've got places in the runtime where ideally we would write an entire algorithm in pre-emptive code
but we switch to COOP to create synchronization barriers that allow deleting data structures during GC
pauses. Although piggybacking on the GC can work it comes with some downsides:
1. GC operates as a reader-writer lock on the heap and like any lock it can create deadlocks if lock ordering
constraints aren't maintained.
2. Even without deadlocks, GC pauses are often measured in at least milliseconds. EBR can offer latencies
many orders of magnitudes lower for our typical read-heavy workloads.

This change doesn't yet use EBR for anything but my initial thought is shifting the HashMap to use this
for its defered table cleanup instead of the GC coupled approach it uses now. HashMap appears to be
violating lock ordering constraints with its COOP mode switches and we have seen deadlocks caused by
this in practice.
@github-actions github-actions bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jan 22, 2026
extern "C" {
#endif

typedef void (*minipal_tls_destructor_t)(void *);
Copy link
Member

@jkotas jkotas Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think we want to be adding support for TLS keys in minipal. The TLS keys create more problems than they solve when combined with regular thread local variables.

@jkotas
Copy link
Member

jkotas commented Jan 22, 2026

shifting the HashMap to use this for its defered table

Should we switch HashMap to this directly without the extra layer?

src/native/containers was invented to facilitate sharing code between CoreCLR and Mono. We do not really care about that going forward.

@AaronRobinsonMSFT AaronRobinsonMSFT added area-VM-coreclr and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jan 22, 2026
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

@noahfalk
Copy link
Member Author

I do not think we want to be adding support for TLS keys in minipal.

Should we switch HashMap to this directly without the extra layer?

We could certainly shift this around so that its CoreCLR only or HashMap only as desired. I tried to write it in a general purpose manner but if there is no interest using it elsewhere specializing it and adjusting the dependencies should be straightforward.

@noahfalk
Copy link
Member Author

I take the comments about where the code lives also implies folks are happy with high-level approach to solving the GC lock leveling issue?

@jkotas
Copy link
Member

jkotas commented Jan 22, 2026

Yes, I think it makes sense to decouple the native runtime hash tables from the GC suspension.

@noahfalk
Copy link
Member Author

@AaronRobinsonMSFT and I chatted - he is going move this forward when he frees up with his current work. Aaron if anything comes up and you'd like me to resume just give me a shout : )

@jkotas
Copy link
Member

jkotas commented Jan 23, 2026

@AaronRobinsonMSFT and I chatted - he is going move this forward when he frees up with his current work.

I am wondering whether there is an existing debugged implementation that we can borrow.

Also, I assume that we want this hashtable to be DACizable that puts additional constrains on how the implementation is structured.

@noahfalk
Copy link
Member Author

There is https://github.com/rmind/libqsbr that I'm aware of though I didn't attempt to analyze its implementation quality. The last commit was 7 years ago so that could mean it is working well or it has been abandoned.

Also, I assume that we want this hashtable to be DACizable that puts additional constrains on how the implementation is structured.

My implementation imposes no constraints on the shape of the data structures or the allocator used to alloc/free the memory (the rmind impl also has no constraint). I'm not expecting using EBR to create any additional constraints on HashMap beyond what it already does now to be DAC-readable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants