[C++][Parquet] Support reading dictionary encoded boolean pages #49914

Open
ArnavBalyan wants to merge 1 commit into apache:main from ArnavBalyan:arnavb/dict-bool
Member

@ArnavBalyan ArnavBalyan commented May 4, 2026

Rationale for this change

  • DictDecoderImpl for boolean currently throws a not-yet-implemented (NYI) error, so dictionary-encoded boolean pages fail on read.
  • Since dictionary-encoded boolean is spec compliant, some writers can produce dictionary-encoded boolean pages, which the Parquet reader currently cannot read.

What changes are included in this PR?

  • Implement a boolean dictionary decoder that decodes the values and writes the output bytes into the caller's buffer.
  • Add a unit test for boolean dictionary decoding.
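
For illustration only, the lookup step of dictionary decoding can be sketched as below. This is not the code in this PR and not the Arrow API: `DecodeDictBooleans` is a hypothetical helper, and it assumes the dictionary indices have already been decoded from the page's RLE/bit-packed data into a plain vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of Arrow): maps already-decoded dictionary
// indices to boolean values and writes one byte per value into the caller's
// buffer. Returns the number of values written.
std::size_t DecodeDictBooleans(const std::vector<bool>& dictionary,
                               const std::vector<int32_t>& indices,
                               bool* out, std::size_t out_capacity) {
  // Never write more values than the caller's buffer can hold.
  const std::size_t n =
      indices.size() < out_capacity ? indices.size() : out_capacity;
  for (std::size_t i = 0; i < n; ++i) {
    const int32_t idx = indices[i];
    // Reject indices that do not refer to a dictionary entry.
    if (idx < 0 || static_cast<std::size_t>(idx) >= dictionary.size()) {
      throw std::out_of_range("dictionary index out of range");
    }
    out[i] = dictionary[static_cast<std::size_t>(idx)];
  }
  return n;
}
```

A boolean dictionary holds at most two distinct values, so the indices in practice are only 0 or 1. The bounds check still matters for corrupt or malicious pages.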

Are these changes tested?

  • Yes

Are there any user-facing changes?

  • Yes

@ArnavBalyan ArnavBalyan requested a review from wgtmac as a code owner May 4, 2026 11:10
github-actions Bot commented May 4, 2026

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

See also:

Member Author

cc @wgtmac thanks :)
