GH-45819: [C++] Add OptionalBitmapAnd utility #49848
|
Reverted the testing submodule, added the out_offset default, implemented the zero-copy slicing for byte-aligned offsets, and updated the tests to use BitmapFromVector for non-trivial inputs. Let me know how it looks! @pitrou Quick question on the API: I noticed the other functions in bitmap_ops.h (like BitmapAnd, BitmapOr, etc.) don't currently have out_offset = 0 as a default. Would it be worth unifying those for consistency in a separate, follow-up PR, or is it safer to leave the existing signatures as-is? |
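The byte-aligned zero-copy idea mentioned above can be sketched outside Arrow as follows. This is a minimal illustration, not the code in this PR: `BitmapView`, `GetBit`, and `SliceOrCopy` are hypothetical names, and a shared `std::vector<uint8_t>` stands in for an Arrow buffer. It assumes Arrow's least-significant-bit-first bit numbering.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for a sliced buffer: shared storage plus a byte window.
struct BitmapView {
  std::shared_ptr<const std::vector<uint8_t>> storage;
  std::size_t byte_offset;
  std::size_t num_bytes;
  const uint8_t* data() const { return storage->data() + byte_offset; }
};

// Read bit i, least-significant bit first within each byte.
inline bool GetBit(const uint8_t* bits, std::size_t i) {
  return ((bits[i / 8] >> (i % 8)) & 1) != 0;
}

// Return `length` bits of `src` starting at `bit_offset`.
// Byte-aligned offsets share the original storage; other offsets need a copy.
BitmapView SliceOrCopy(const std::shared_ptr<const std::vector<uint8_t>>& src,
                       std::size_t bit_offset, std::size_t length) {
  assert(src && src->size() * 8 >= bit_offset + length);
  const std::size_t num_bytes = (length + 7) / 8;
  if (bit_offset % 8 == 0) {
    // Zero-copy path: the view points into the caller's storage.
    return {src, bit_offset / 8, num_bytes};
  }
  // Unaligned path: re-pack the bits so the result starts at bit 0.
  auto copy = std::make_shared<std::vector<uint8_t>>(num_bytes, 0);
  for (std::size_t i = 0; i < length; ++i) {
    if (GetBit(src->data(), bit_offset + i)) {
      (*copy)[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
    }
  }
  return {copy, 0, num_bytes};
}
```

In this sketch a caller can tell which path was taken by comparing `view.storage` with the source pointer. The aligned path allocates nothing.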
This commit introduces the `OptionalBitmapAnd` utility, which provides an optimized bitwise AND operation for Arrow bitmaps.
Key changes:
- Added `OptionalBitmapAnd` function in `bitmap_ops.h` and `bitmap_ops.cc`.
- Implemented optimizations to avoid allocations and use slicing when bitmaps are byte-aligned and either the left or right bitmap is missing.
- Added comprehensive unit tests in `bitmap_test.cc` covering all permutations of offsets, lengths, and missing bitmaps.
|
Sorry, I ran into conflicts with my local branch while trying to get the lint CI to pass. I just squashed all the changes in the PR into a single commit. |
Ah, that's a good point. Well, perhaps we can keep them as they are for now. |
|
Can we perhaps use this function in arrow/cpp/src/arrow/compute/kernels/hash_aggregate.cc (lines 1342 to 1348 in 6198adc)? |
|
I just updated it to use the new function:
    ARROW_ASSIGN_OR_RAISE(
        null_bitmap,
        arrow::internal::OptionalBitmapAnd(pool_, null_bitmap, /*left_offset=*/0,
                                           no_nulls, /*right_offset=*/0,
                                           num_groups_)); |
|
@github-actions crossbow submit -g cpp |
|
Revision: 99022c3 Submitted crossbow builds: ursacomputing/crossbow @ actions-4c878a50ea |
|
Thanks so much for the review and the approval, @pitrou! I'm already looking forward to my next contribution. I am really focused on building a strong foundation in High-Performance Computing. Are there specific issue tags or components you would recommend I search for if I want to tackle very low-level C++ tasks (like SIMD, memory/buffer management) or dive into CUDA/GPU-related issues? Thanks again for all the guidance! |
|
Congratulations @Shockp! 🥳 |
|
Thanks, @raulcd! I'm not sure whether you can answer the comment above; I'd like to know which labels I should search for to find those kinds of issues. |
|
After merging your PR, Conbench analyzed the 0 benchmarking runs that have been run so far on merge-commit 9abd6d3. None of the specified runs were found on the Conbench server. The full Conbench report has more details. |
Rationale for this change
In Arrow, null bitmaps are optional and represented by `nullptr` when a column contains no nulls. Previously, conjoining two optional bitmaps required downstream code to handle the `nullptr` checks and memory allocations manually. This change centralizes that logic into a single, optimized utility function.

What changes are included in this PR?
- Added `OptionalBitmapAnd` to `cpp/src/arrow/util/bitmap_ops.h` and `bitmap_ops.cc`.
- Both bitmaps `nullptr`: Returns `nullptr`.
- Left bitmap `nullptr`: Returns a copy of the right buffer.
- Right bitmap `nullptr`: Returns a copy of the left buffer.
- Neither bitmap `nullptr`: Uses the `BitmapAnd` function.

Are these changes tested?
Yes. I added a new test case (`OptionalBitmapAnd`) to `cpp/src/arrow/util/bitmap_test.cc` that explicitly verifies the correct buffer allocation and bitwise output for all four memory states.

Are there any user-facing changes?
No. This simply exposes a new C++ utility for internal development.
Closes #45819
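For readers unfamiliar with the four memory states described in this PR, the dispatch can be sketched in standalone C++ roughly as below. This is an illustration under simplifying assumptions, not the merged implementation: `Bytes` and `OptionalAndSketch` are hypothetical names, both bitmaps are assumed to start at bit offset 0, and there is no memory pool or error handling.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for an Arrow buffer: a shared byte vector.
// A null pointer means "no nulls", i.e. every bit is implicitly set.
using Bytes = std::shared_ptr<std::vector<uint8_t>>;

// AND two optional bitmaps of `length` bits, both starting at bit offset 0.
Bytes OptionalAndSketch(const Bytes& left, const Bytes& right, std::size_t length) {
  const std::size_t num_bytes = (length + 7) / 8;
  if (!left && !right) {
    return nullptr;  // all-valid AND all-valid stays all-valid
  }
  if (!left || !right) {
    // Only one side is present: the result equals that side, so copy it.
    const Bytes& present = left ? left : right;
    assert(present->size() >= num_bytes);
    return std::make_shared<std::vector<uint8_t>>(
        present->begin(), present->begin() + static_cast<std::ptrdiff_t>(num_bytes));
  }
  // Both sides present: plain byte-wise AND.
  assert(left->size() >= num_bytes && right->size() >= num_bytes);
  auto out = std::make_shared<std::vector<uint8_t>>(num_bytes);
  for (std::size_t i = 0; i < num_bytes; ++i) {
    (*out)[i] = static_cast<uint8_t>((*left)[i] & (*right)[i]);
  }
  return out;
}
```

Treating a missing bitmap as all ones is what makes the one-sided cases a plain copy: ANDing with all ones leaves the other operand unchanged.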