Skip to content

GH-45819: [C++] Add OptionalBitmapAnd utility#49848

Merged
pitrou merged 2 commits intoapache:mainfrom
Shockp:feat-optional-bitmap-and
Apr 30, 2026
Merged

GH-45819: [C++] Add OptionalBitmapAnd utility#49848
pitrou merged 2 commits intoapache:mainfrom
Shockp:feat-optional-bitmap-and

Conversation

@Shockp
Copy link
Copy Markdown
Contributor

@Shockp Shockp commented Apr 23, 2026

Rationale for this change

In Arrow, null bitmaps are optional and represented by nullptr when a column contains no nulls. Previously, conjoining two optional bitmaps required downstream code to manually handle the nullptr checks and memory allocations. This change centralizes that logic into a single, highly optimized utility function.

What changes are included in this PR?

  • Added OptionalBitmapAnd to cpp/src/arrow/util/bitmap_ops.h and bitmap_ops.cc.
  • The implementation safely handles the four possible memory permutations with minimal allocations:
    1. Both buffers nullptr: Returns nullptr.
    2. Left is nullptr: Returns a copy of the right buffer.
    3. Right is nullptr: Returns a copy of the left buffer.
    4. Both valid: Defers to the low-level BitmapAnd function.

Are these changes tested?

Yes. I added a new test case (OptionalBitmapAnd) to cpp/src/arrow/util/bitmap_test.cc that explicitly verifies the correct buffer allocation and bitwise output for all four memory states.

Are there any user-facing changes?

No. This simply exposes a new C++ utility for internal development.

Closes #45819

@Shockp Shockp force-pushed the feat-optional-bitmap-and branch 3 times, most recently from 7fd57ca to 28d69c7 Compare April 23, 2026 11:14
Copy link
Copy Markdown
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for submitting this @Shockp ! Here are some comments.

Also can you find places in the codebase where this can be put to use?

Comment thread cpp/src/arrow/util/bitmap_ops.cc
Comment thread cpp/src/arrow/util/bitmap_ops.h Outdated
Comment thread cpp/src/arrow/util/bitmap_test.cc Outdated
Comment thread testing
@github-actions github-actions Bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels Apr 27, 2026
@Shockp Shockp force-pushed the feat-optional-bitmap-and branch from 4973fdd to 2741418 Compare April 27, 2026 14:52
@Shockp
Copy link
Copy Markdown
Contributor Author

Shockp commented Apr 27, 2026

Reverted the testing submodule, added the out_offset default, implemented the zero-copy slicing for byte-aligned offsets, and updated the tests to use BitmapFromVector for non-trivial inputs. Let me know how it looks! @pitrou

Quick question on the API: I noticed the other functions in bitmap_ops.h (like BitmapAnd, BitmapOr, etc.) don't currently have out_offset = 0 as a default. Would it be worth unifying those for consistency in a separate, follow-up PR, or is it safer to leave the existing signatures as-is?

@Shockp Shockp force-pushed the feat-optional-bitmap-and branch 2 times, most recently from 20af324 to 9e05f24 Compare April 27, 2026 16:38
This commit introduces the `OptionalBitmapAnd` utility, which provides
an optimized bitwise AND operation for Arrow bitmaps.

Key changes:
- Added `OptionalBitmapAnd` function in `bitmap_ops.h` and `bitmap_ops.cc`.
- Implemented optimizations to avoid allocations and use slicing when bitmaps
  are byte-aligned and either the left or right bitmap is missing.
- Added comprehensive unit tests in `bitmap_test.cc` covering all permutations
  of offsets, lengths, and missing bitmaps.
@Shockp Shockp force-pushed the feat-optional-bitmap-and branch from 960feb9 to 8972ab0 Compare April 27, 2026 16:49
@Shockp
Copy link
Copy Markdown
Contributor Author

Shockp commented Apr 27, 2026

Sorry, I had some conflicts with my local branch while trying to pass the lint CI. I just squashed all the changes of the PR in a single commit.

@pitrou
Copy link
Copy Markdown
Member

pitrou commented Apr 28, 2026

Quick question on the API: I noticed the other functions in bitmap_ops.h (like BitmapAnd, BitmapOr, etc.) don't currently have out_offset = 0 as a default. Would it be worth unifying those for consistency in a separate, follow-up PR, or is it safer to leave the existing signatures as-is?

Ah, that's a good point. Well, perhaps we can keep them as they are for now.

@pitrou
Copy link
Copy Markdown
Member

pitrou commented Apr 28, 2026

Can we perhaps use this function in

if (null_bitmap) {
arrow::internal::BitmapAnd(null_bitmap->data(), /*left_offset=*/0,
no_nulls->data(), /*right_offset=*/0, num_groups_,
/*out_offset=*/0, null_bitmap->mutable_data());
} else {
null_bitmap = std::move(no_nulls);
}
?

@Shockp
Copy link
Copy Markdown
Contributor Author

Shockp commented Apr 28, 2026

I just updated it to use the new OptionalBitmapAnd utility @pitrou. It replaces that manual if/else block with a single, clean macro:

      ARROW_ASSIGN_OR_RAISE(
          null_bitmap,
          arrow::internal::OptionalBitmapAnd(pool_, null_bitmap, /*left_offset=*/0,
                                             no_nulls, /*right_offset=*/0,
                                             num_groups_));

@pitrou
Copy link
Copy Markdown
Member

pitrou commented Apr 30, 2026

@github-actions crossbow submit -g cpp

Copy link
Copy Markdown
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, let's wait for CI

@github-actions
Copy link
Copy Markdown

Revision: 99022c3

Submitted crossbow builds: ursacomputing/crossbow @ actions-4c878a50ea

Task Status
example-cpp-minimal-build-static GitHub Actions
example-cpp-minimal-build-static-system-dependency GitHub Actions
example-cpp-tutorial GitHub Actions
test-build-cpp-fuzz GitHub Actions
test-conda-cpp GitHub Actions
test-conda-cpp-valgrind GitHub Actions
test-debian-13-cpp-amd64 GitHub Actions
test-debian-13-cpp-i386 GitHub Actions
test-debian-experimental-cpp-gcc-15 GitHub Actions
test-fedora-42-cpp GitHub Actions
test-ubuntu-22.04-cpp GitHub Actions
test-ubuntu-22.04-cpp-20 GitHub Actions
test-ubuntu-22.04-cpp-bundled GitHub Actions
test-ubuntu-22.04-cpp-emscripten GitHub Actions
test-ubuntu-22.04-cpp-no-threading GitHub Actions
test-ubuntu-24.04-cpp GitHub Actions
test-ubuntu-24.04-cpp-bundled-offline GitHub Actions
test-ubuntu-24.04-cpp-gcc-13-bundled GitHub Actions
test-ubuntu-24.04-cpp-gcc-14 GitHub Actions
test-ubuntu-24.04-cpp-minimal-with-formats GitHub Actions
test-ubuntu-24.04-cpp-thread-sanitizer GitHub Actions

@Shockp
Copy link
Copy Markdown
Contributor Author

Shockp commented Apr 30, 2026

Thanks so much for the review and the approval, @pitrou!

I'm already looking forward to my next contribution. I am really focused on building a strong foundation in High-Performance Computing. Are there specific issue tags or components you would recommend I search for if I want to tackle very low-level C++ tasks (like SIMD, memory/buffer management) or dive into CUDA/GPU-related issues?

Thanks again for all the guidance!

@pitrou pitrou merged commit 9abd6d3 into apache:main Apr 30, 2026
59 of 61 checks passed
@pitrou pitrou removed the awaiting committer review Awaiting committer review label Apr 30, 2026
@raulcd
Copy link
Copy Markdown
Member

raulcd commented Apr 30, 2026

Felicidades @Shockp ! 🥳

@Shockp
Copy link
Copy Markdown
Contributor Author

Shockp commented Apr 30, 2026

Gracias! @raulcd . No estoy seguro de si puedes responder al comentario de arriba, por saber las etiquetas que tengo que buscar para esos problemas.

@conbench-apache-arrow
Copy link
Copy Markdown

After merging your PR, Conbench analyzed the 0 benchmarking runs that have been run so far on merge-commit 9abd6d3.

None of the specified runs were found on the Conbench server.

The full Conbench report has more details.

@Shockp Shockp deleted the feat-optional-bitmap-and branch April 30, 2026 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[C++] Add OptionalBitmapAnd utility

3 participants