Open
Description
Question
Hello,
I have Iceberg tables and I’m loading them using load_table through the Glue catalog.
I want to compare them with Parquet files that I read with DuckDB: pqt_data = duckdb.sql(f"SELECT * FROM read_parquet({pqt}, union_by_name=true)").arrow(). After that, I want to check whether any columns differ so that I can perform schema evolution. I tried pqt_schema = pyarrow_to_schema(pqt_data.schema), but it fails because the schema has neither field IDs nor a name mapping:
Parquet file does not have field-ids and the Iceberg table does not have 'schema.name-mapping.default' defined
What is the reason for this error, and is there a solution? I just want to compare the column names and types in a simple way.
Example code:
import duckdb
from pyiceberg.io.pyarrow import pyarrow_to_schema
pqt_data = duckdb.sql(f"SELECT * FROM read_parquet({pqt}, union_by_name=true)").arrow()
pqt_schema = pyarrow_to_schema(pqt_data.schema)
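For the simple name-and-type comparison described above, one possible approach is to compare both sides as Arrow schemas instead of converting the Parquet schema to an Iceberg schema, which avoids the need for field IDs. The following is a minimal sketch, not a PyIceberg API: diff_schemas and the Field stand-in are hypothetical names, and the column names and types are made up for illustration. It assumes the Iceberg side can be obtained as an Arrow schema (for example with table.schema().as_arrow()) and that the Parquet side is pqt_data.schema; iterating a pyarrow.Schema yields fields with .name and .type, which is all the helper relies on.

```python
from typing import NamedTuple


class Field(NamedTuple):
    """Hypothetical stand-in for pyarrow.Field so the sketch runs without pyarrow."""
    name: str
    type: str


def diff_schemas(table_fields, file_fields):
    """Compare two field collections by column name and type.

    Each argument is any iterable of objects exposing `.name` and `.type`;
    an iterated pyarrow.Schema satisfies this.
    """
    table_types = {f.name: str(f.type) for f in table_fields}
    file_types = {f.name: str(f.type) for f in file_fields}
    shared = table_types.keys() & file_types.keys()
    return {
        # Columns present in the Parquet data but missing from the table.
        "only_in_file": sorted(file_types.keys() - table_types.keys()),
        # Columns present in the table but missing from the Parquet data.
        "only_in_table": sorted(table_types.keys() - file_types.keys()),
        # Shared columns whose type strings differ: (table type, file type).
        "type_changed": {
            name: (table_types[name], file_types[name])
            for name in sorted(shared)
            if table_types[name] != file_types[name]
        },
    }


# Made-up schemas for illustration. With real data these would be the
# Arrow schema of the Iceberg table and pqt_data.schema.
iceberg_fields = [Field("id", "int64"), Field("name", "string"), Field("price", "float")]
parquet_fields = [Field("id", "int64"), Field("price", "double"), Field("country", "string")]

diff = diff_schemas(iceberg_fields, parquet_fields)
print(diff)
```

The comparison is by type string, so Arrow types that differ only in representation (for example string versus large_string) would show up under type_changed and may need normalizing before deciding whether schema evolution is required.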