Supply a hint arrow schema for casting Parquet field types during scans #814

Closed
gruuya wants to merge 1 commit into apache:main from gruuya:provide-parquet-schema-hint


@gruuya gruuya commented Dec 17, 2024

Supplying the hint schema avoids a potential schema mismatch caused by upcasting Arrow 8-bit and 16-bit integers to the Iceberg 32-bit integer type.

This is one way to resolve #813. Note that it depends on apache/arrow-rs#6892 being merged (and picked up) first.

I still need to think of a proper test case for this too.

Closes #813.
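
To make the mismatch concrete, here is a minimal, self-contained sketch. The enums are hypothetical stand-ins, not the real `arrow_schema::DataType` or `iceberg::spec::PrimitiveType`, and they model only the integer widths mentioned above. The assumption is that a Parquet file written with narrow Arrow integers is read back with those narrow types unless a hint schema is supplied.

```rust
/// Hypothetical stand-in for the Arrow integer types relevant here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArrowInt {
    Int8,
    Int16,
    Int32,
}

/// Hypothetical stand-in for the Iceberg primitive type; Iceberg's `int` is 32-bit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum IcebergPrimitive {
    Int,
}

/// Arrow -> Iceberg: 8-bit and 16-bit integers are upcast to the 32-bit `int`.
fn arrow_to_iceberg(t: ArrowInt) -> IcebergPrimitive {
    match t {
        ArrowInt::Int8 | ArrowInt::Int16 | ArrowInt::Int32 => IcebergPrimitive::Int,
    }
}

/// Iceberg -> Arrow: the schema reported for the table uses Int32.
fn iceberg_to_arrow(t: IcebergPrimitive) -> ArrowInt {
    match t {
        IcebergPrimitive::Int => ArrowInt::Int32,
    }
}

/// True when the type stored in the file differs from the type the table reports
/// after the Arrow -> Iceberg -> Arrow round trip.
fn schema_mismatch(file_type: ArrowInt) -> bool {
    iceberg_to_arrow(arrow_to_iceberg(file_type)) != file_type
}
```

In this model, `schema_mismatch(ArrowInt::Int8)` and `schema_mismatch(ArrowInt::Int16)` are true while `schema_mismatch(ArrowInt::Int32)` is false, which is the case a hint schema with Int32 fields is meant to paper over.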

Comment on lines +199 to +204
if task.schema.as_struct().fields().iter().any(|field| {
    matches!(
        field.field_type.as_ref(),
        Type::Primitive(PrimitiveType::Int)
    )
}) {

@gruuya gruuya Dec 17, 2024

Maybe this should only be done if a field of this type is also among the projected fields.
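
A rough sketch of what that projection-aware check could look like. The types below are simplified, hypothetical stand-ins for the Iceberg schema types used in the snippet above, and `projected_field_ids` is an assumed parameter holding the ids of the projected fields; none of this is taken from the PR's actual diff.

```rust
/// Hypothetical, simplified stand-ins for the Iceberg schema types.
#[derive(Debug, Clone, PartialEq)]
enum PrimitiveType {
    Int,
    String,
}

#[derive(Debug, Clone, PartialEq)]
enum Type {
    Primitive(PrimitiveType),
}

struct Field {
    id: i32,
    field_type: Box<Type>,
}

/// Returns true only if at least one *projected* field has the Iceberg `int` type,
/// i.e. only then would a hint schema be worth supplying to the Parquet reader.
fn needs_schema_hint(fields: &[Field], projected_field_ids: &[i32]) -> bool {
    fields.iter().any(|field| {
        projected_field_ids.contains(&field.id)
            && matches!(
                field.field_type.as_ref(),
                Type::Primitive(PrimitiveType::Int)
            )
    })
}
```

With an `int` field of id 1 and a `string` field of id 2, projecting `[1, 2]` would trigger the hint, while projecting only `[2]` would skip it.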

gruuya commented Dec 20, 2024

Marking this as a draft, since the upstream dependency is also a draft at the moment.

@gruuya gruuya marked this pull request as draft December 20, 2024 08:30
@github-actions

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

@github-actions github-actions Bot added the stale label Feb 26, 2026

github-actions Bot commented Mar 5, 2026

This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions Bot closed this Mar 5, 2026

Development

Successfully merging this pull request may close these issues.

Reported and actual arrow schema of the table can be different
