add: Add exception and optional skipping for recursive protobuf messages#107
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #107 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 6 6
Lines 781 797 +16
Branches 141 145 +4
=========================================
+ Hits 781 797 +16 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
0x26res
left a comment
There was a problem hiding this comment.
Thanks for sending this PR. It looks good, I've put a lot of comments but they mainly relate to details.
We just need to be careful about naming.
PS: I think the Security Analysis is failing, I'll update the dependency in master and it should work once you rebase.
84d04ff to
2d35f27
Compare
|
Hello, Thanks for the prompt and detailed review. I've learnt a few things! I've addressed all the comments from the first review. I've also expanded the changes to the functions I will comment on the open "conversations" if there is anything else to add - otherwise I'll leave it up to you to resolve the conversations once you've gone over the changes. Thanks! |
0x26res
left a comment
There was a problem hiding this comment.
Looking good. Only a few minor details and we can merge. I'll then create a new version. Thanks again.
|
I think I have covered everything now. I noticed that the test Hopefully not connected to this PR, I had the test |
I think there is a flaky test somewhere due to floats arithmetic or something like that. |
|
Now available in pypi https://pypi.org/project/protarrow/0.14.0/ Thanks again @david-gonzalez-oxb |
This PR concerns the parsing of protobuf messages with recursive cycles in its schema definitions, which unfortunately do exist in the wild. The current behaviour of the package with such messages is to go on an infinite recursion loop until the eventual
RecursionError: maximum recursion depth exceededcrash.With the proposed changes, we leverage the current structure of the code to detect cycles by maintaining a trace of the nested messages in the protobuf schema. Upon encountering a cycle (i.e. a message seen twice on the trace) we raise a self-explanatory
TypeError, showing the trace that caused it.Additionally, we introduce a configuration flag
skip_recursive_messages(set toFalseby default) that enables us to avoid theTypeErrorby simply skipping the second occurrence of the cyclical message.The pruning is enabled only for plain messages, and not for repeated messages or maps. The handling of those cases could be introduced on a subsequent PR.