Skip rewriting internal pcollection coders in a stage#34655
Merged
liferoad merged 3 commits intoapache:masterfrom Apr 17, 2025
Merged
Skip rewriting internal pcollection coders in a stage#34655liferoad merged 3 commits intoapache:masterfrom
liferoad merged 3 commits intoapache:masterfrom
Conversation
Contributor
|
Assigning reviewers. If you would like to opt out of this review, comment R: @jrmccluskey for label go. Available commands:
The PR bot will only process comments in the main thread (not review comments). |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #34655 +/- ##
============================================
- Coverage 54.61% 54.59% -0.03%
Complexity 1480 1480
============================================
Files 1008 1008
Lines 159758 159657 -101
Branches 1079 1079
============================================
- Hits 87259 87162 -97
+ Misses 70399 70397 -2
+ Partials 2100 2098 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
lostluck
approved these changes
Apr 17, 2025
Contributor
lostluck
left a comment
There was a problem hiding this comment.
Elegant! I thought for sure this would have required a fix Java side as well.
But i suppose if it doesn't happen in practice then what's to worry about.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Prior to the PR, unknown coders were potentially replaced (e.g., length-prefixed) for every transform within a processing stage. This approach, however, caused issues in Java pipelines (like #32931), where worker nodes required the original coder (e.g.,
SchemaCoder) to correctly relay data between internal transforms.An observation driving this change is that only the input and output coders of a stage are critical for the runner, as actual serialization/deserialization occurs at these stage boundaries. Coders for internal transforms should be irrelevant to the runner's processing and will now be kept intact. This minimizes potential side effects from modifying the pipeline graph and ensures internal data flow remains correct.
#fixes #32931