Fix handling of late-discovered atomic lazy loops in compiler / source generator #117629
Merged
stephentoub merged 2 commits into dotnet:main on Jul 24, 2025
Lazy loops can be made automatically atomic in some situations by the optimizer, in which case their handling simplifies significantly, because a lazy atomic loop just becomes a repeater for the min iteration count. Most viable lazy loops are caught by the optimizer, but some aren't, yet are still determined to be treatable as atomic at emit time. EmitLazy was handling such cases incorrectly, resulting in a missing branch target and compilation failing. This fixes that in two ways:
1. The optimizer is improved, so the discovered test cases that were triggering this case no longer do.
2. The distinction is eliminated from EmitLazy, as the case is rare and it would take a lot more code to optimize for it.
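As a rough illustration of why an atomic lazy loop reduces to a repeater for its minimum count, the sketch below uses only the public `Regex` API with an explicit atomic group `(?>...)`. It is not code from this PR: the class name `AtomicLazyDemo` and the patterns are made up for illustration, and the optimizer only makes a loop atomic automatically when it can prove that doing so is safe, which this sketch does not attempt to reproduce.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; not part of System.Text.RegularExpressions.
public static class AtomicLazyDemo
{
    // Returns the matched text, or an empty string when there is no match.
    public static string MatchText(string pattern, string input) =>
        Regex.Match(input, pattern).Value;

    public static void Main()
    {
        // Ordinary lazy loop: starts at the minimum (2 iterations), then
        // backtracks into the loop to take a third 'a' so that 'b' can match.
        Console.WriteLine(MatchText(@"a{2,5}?b", "aaab"));      // aaab

        // Atomic lazy loop: nothing can backtrack into it, so it never grows
        // past the minimum. The attempt at index 0 fails and the match is
        // found starting at index 1 instead.
        Console.WriteLine(MatchText(@"(?>a{2,5}?)b", "aaab"));  // aab

        // A plain repeater for the minimum count behaves the same way.
        Console.WriteLine(MatchText(@"a{2}b", "aaab"));         // aab
    }
}
```

The second and third patterns agree because an atomic lazy loop has no way to be asked for additional iterations, which is the simplification the description relies on.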
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
Pull Request Overview
Fixes handling of late-discovered atomic lazy loops by improving the optimizer and removing special‐case logic in lazy emission.
- Adds new functional tests for various lookbehind and lazy-loop scenarios.
- Refactors RegexNode optimizations to consistently use rootNode.Options and apply RTL checks per case.
- Simplifies EmitLazy in both compiler and generated emitter by removing the isAtomic branch.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| Regex.Match.Tests.cs | Added tests covering edge cases in lookbehinds and lazy loops. |
| RegexNode.cs | Updated FinalOptimize and EliminateEndingBacktracking guards and RTL logic. |
| RegexCompiler.cs | Removed isAtomic checks and assertion in EmitLazy. |
| RegexGenerator.Emitter.cs | Mirrored compiler changes in generated emitter, removed isAtomic branch. |
Comments suppressed due to low confidence (2)
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexNode.cs:414
- By only checking for NonBacktracking here, RTL loops will still enter backtracking elimination and may receive Oneloop/Setloop atomic optimizations that haven’t been vetted for RTL. Consider restoring the RTL guard or adding per-case RTL checks to prevent incorrect behavior in right-to-left mode.
(Options & RegexOptions.NonBacktracking) != 0)
danmoseley reviewed on Jul 15, 2025
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexNode.cs
danmoseley approved these changes on Jul 15, 2025
This was referenced Jul 15, 2025
/ba-g browser wasm timeouts
Best reviewed without whitespace (indentation changed on a few large sections).
Fixes #117601