Skip to content

Conversation

@pitrou
Copy link
Member

@pitrou pitrou commented Jan 15, 2026

Rationale for this change

Currently, when fuzzing a Parquet file, proper errors are ignored since the file is most of the time simply invalid: it's ok to get an error.

However, if reading a Parquet row group was successful, the Parquet reader should have ensured that the resulting Table is structurally sound: calling Table::Validate should succeed, and if it fails then it should report a bug.

What changes are included in this PR?

Call Table::Validate on a successful read from the Parquet fuzz target, and abort if validation fails.

Table::ValidateFull, however, is allowed to fail, as for example a Parquet UTF8 column could contain invalid UTF8 bytes, and we don't detect that when decoding.

Are these changes tested?

By existing tests and fuzz regression files.

Actual running this on OSS-Fuzz will tell us if this is really a good idea.

Are there any user-facing changes?

No.

@pitrou pitrou force-pushed the gh48560-fuzz-table-validation branch from 7854761 to 1f1d34a Compare January 15, 2026 10:53
@pitrou
Copy link
Member Author

pitrou commented Jan 15, 2026

@github-actions crossbow submit fuzz

@pitrou pitrou marked this pull request as ready for review January 15, 2026 10:53
@pitrou pitrou requested a review from wgtmac as a code owner January 15, 2026 10:54
@github-actions
Copy link

Revision: 1f1d34a

Submitted crossbow builds: ursacomputing/crossbow @ actions-ccbec654ec

Task Status
test-build-cpp-fuzz GitHub Actions

@pitrou pitrou requested a review from adamreeve January 15, 2026 13:56
@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels Jan 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants