[Python] Support joining tables with null columns #35785

Open
lccnl opened this issue May 26, 2023 · 1 comment · May be fixed by #38383

lccnl commented May 26, 2023

Describe the enhancement requested

Hello,
Currently, pyarrow cannot join two tables if either one contains a null-typed column, even when that column is not a join key. For instance, the following code fails with pyarrow.lib.ArrowInvalid: Data type null is not supported in join non-key field:

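import pyarrow as pa

# tab1 has an all-None column, which pyarrow infers as the "null" data type.
tab1 = pa.Table.from_arrays(
    [pa.array([1, 2]), pa.array([None, None])], names=['pk', 'null']
)
tab2 = pa.Table.from_arrays(
    [pa.array([1, 1]), pa.array(['a', 'b'])], names=['fk', 'some']
)

# Raises ArrowInvalid even though 'null' is not a join key.
tab1.join(right_table=tab2, keys='pk', right_keys='fk')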

It would be nice to enable this!

Component(s)

C, Python

@raulcd raulcd changed the title Support joining tables with null columns [Python] Support joining tables with null columns May 27, 2023

Fokko (Contributor) commented Feb 24, 2025

We're also running into this at PyIceberg: apache/iceberg-python#1711
