ARROW-6861: [C++] Fix length/null_count/capacity accounting through Reset and AppendIndices in DictionaryBuilder#5643
ARROW-6861: [C++] Fix length/null_count/capacity accounting through Reset and AppendIndices in DictionaryBuilder#5643wesm wants to merge 2 commits intoapache:masterfrom
Conversation
cpp/src/arrow/array/builder_base.h
Outdated
There was a problem hiding this comment.
It would be nice to provide some documentation for how/when this method should be used.
There was a problem hiding this comment.
Agreed. Ideally we should be able to avoid having this new method?
cpp/src/arrow/array_dict_test.cc
Outdated
There was a problem hiding this comment.
Why not disable this with
| // TEST(TestStringDictionaryBuilder, BigDeltaDictionary) { | |
| // TEST(DISABLED_TestStringDictionaryBuilder, BigDeltaDictionary) { |
|
I think if this is disabling the current delta support then this shouldn't go into 0.15.1. Is there a way to solve the Parquet issue without touching delta support (which can be done in another PR for a later version)? For example by adding e.g. a |
|
@pitrou I just pushed a much simpler fix that does not touch the dictionary delta logic. I'll rebase a patch I have done for ARROW-6869 after this is merged into master |
pitrou
left a comment
There was a problem hiding this comment.
LGTM on the principle, just one suggestion in the tests.
cpp/src/arrow/array_dict_test.cc
Outdated
There was a problem hiding this comment.
Should you try an actual re-use of the builder below? In case some internal state (e.g. the memo) is not properly reinitialized.
|
+1. Merging this |
…eset and AppendIndices in DictionaryBuilder Some more work is needed in ARROW-6869 to move the memo table resetting logic out of `DictionaryBuilder::Reset` but we can leave that for 1.0.0 work Closes apache#5643 from wesm/ARROW-6861 and squashes the following commits: 9a19773 <Wes McKinney> Test using dictionary builder again after finish 890a6dc <Wes McKinney> Simpler fix for ARROW-6861 Authored-by: Wes McKinney <wesm+git@apache.org> Signed-off-by: Wes McKinney <wesm+git@apache.org>
…eset and AppendIndices in DictionaryBuilder Some more work is needed in ARROW-6869 to move the memo table resetting logic out of `DictionaryBuilder::Reset` but we can leave that for 1.0.0 work Closes apache#5643 from wesm/ARROW-6861 and squashes the following commits: 9a19773 <Wes McKinney> Test using dictionary builder again after finish 890a6dc <Wes McKinney> Simpler fix for ARROW-6861 Authored-by: Wes McKinney <wesm+git@apache.org> Signed-off-by: Wes McKinney <wesm+git@apache.org>
Some more work is needed in ARROW-6869 to move the memo table resetting logic out of
DictionaryBuilder::Resetbut we can leave that for 1.0.0 work