Skip to content

Conversation

mbutrovich
Copy link
Contributor

@mbutrovich mbutrovich commented Aug 19, 2025

Which issue does this PR close?

Closes #2195, #1348.

Rationale for this change

We previously fell back in a scenario where the native reader produced different results. We've now picked up appropriate DF and arrow-rs dependencies to have apache/arrow#7055.

What changes are included in this PR?

Remove the fallback.

How are these changes tested?

Existing tests.

@mbutrovich
Copy link
Contributor Author

mbutrovich commented Aug 19, 2025

Running Spark SQL tests locally with iceberg_compat. If that passes, I'll likely want to undo #1376 and try again.

@codecov-commenter
Copy link

codecov-commenter commented Aug 19, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 43.58%. Comparing base (f09f8af) to head (bbaa3c9).
⚠️ Report is 514 commits behind head on main.

Additional details and impacted files
@@              Coverage Diff              @@
##               main    #2197       +/-   ##
=============================================
- Coverage     56.12%   43.58%   -12.55%     
- Complexity      976     1085      +109     
=============================================
  Files           119      147       +28     
  Lines         11743    13464     +1721     
  Branches       2251     2349       +98     
=============================================
- Hits           6591     5868      -723     
- Misses         4012     6595     +2583     
+ Partials       1140     1001      -139     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@parthchandra
Copy link
Contributor

You may want to undo parts of #1376 to really test the change. The changes to CometTestBase.getPrimitiveTypesParquetSchema were so that the schema would not contain unsigned types if the reader type was one of the datafusion based readers. Undoing that change will be a primary test.

@mbutrovich
Copy link
Contributor Author

So instead of getting null on the incorrectly written values we now get different values. So garbage in is still garbage out, but it's new garbage!

@parthchandra
Copy link
Contributor

I seem to recall that I tested a badly written file with apache/arrow#7055 and did not get garbage. I'll take a look at it again.
Thanks for kicking off this effort @mbutrovich!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

follow up - remove incompatibility limitation on unsigned int types in native_iceberg_compat and native_datafusion readers
3 participants