docs: add documentation for fully-native Iceberg scans #2868
```shell
$SPARK_HOME/bin/spark-shell \
  --packages org.apache.datafusion:comet-spark-spark3.5_2.12:0.12.0,org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.8.1,org.apache.iceberg:iceberg-core:1.8.1 \
```
(nit) We don't need org.apache.iceberg:iceberg-core:1.8.1 if org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.8.1 is available.
I did not find that to be the case in my testing because Comet relies on classes that are in core and not just runtime, but it's been a few weeks. Have you tested it?
I can test tomorrow.
If I understand it correctly, you're talking about the core or api classes you used in IcebergReflection.
jar tf iceberg-spark-runtime-3.5_2.12-1.10.0.jar | grep -E "org.apache.iceberg.ContentScanTask.class|org.apache.iceberg.FileScanTask.class|org.apache.iceberg.ContentFile.class|org.apache.iceberg.StructLike.class|org.apache.iceberg.PartitionScanTask.class|org.apache.iceberg.DeleteFile.class|org.apache.iceberg.expressions.Literal.class|org.apache.iceberg.SchemaParser.class|org.apache.iceberg.Schema.class|org.apache.iceberg.PartitionSpecParser.class|org.apache.iceberg.PartitionSpec.class|org.apache.iceberg.PartitionField.class|org/apache/iceberg/expressions/UnboundPredicate.class"
org/apache/iceberg/PartitionSpecParser.class
org/apache/iceberg/SchemaParser.class
org/apache/iceberg/ContentFile.class
org/apache/iceberg/ContentScanTask.class
org/apache/iceberg/DeleteFile.class
org/apache/iceberg/FileScanTask.class
org/apache/iceberg/PartitionField.class
org/apache/iceberg/PartitionScanTask.class
org/apache/iceberg/PartitionSpec.class
org/apache/iceberg/Schema.class
org/apache/iceberg/StructLike.class
org/apache/iceberg/expressions/Literal.class
org/apache/iceberg/expressions/UnboundPredicate.class
The jar can be downloaded from https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-spark-runtime-3.5_2.12/1.10.0
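For anyone who wants to repeat the `jar tf` check against a different Iceberg version, here is a minimal sketch in Python. It assumes the runtime jar has been downloaded locally. The function name `missing_classes` and the constant `REQUIRED_CLASSES` are hypothetical names chosen for illustration, and the class list is copied from the output above rather than from Comet's source.

```python
import zipfile

# Class entries copied from the `jar tf` output above (not from Comet's source).
REQUIRED_CLASSES = [
    "org/apache/iceberg/ContentScanTask.class",
    "org/apache/iceberg/FileScanTask.class",
    "org/apache/iceberg/ContentFile.class",
    "org/apache/iceberg/StructLike.class",
    "org/apache/iceberg/PartitionScanTask.class",
    "org/apache/iceberg/DeleteFile.class",
    "org/apache/iceberg/expressions/Literal.class",
    "org/apache/iceberg/SchemaParser.class",
    "org/apache/iceberg/Schema.class",
    "org/apache/iceberg/PartitionSpecParser.class",
    "org/apache/iceberg/PartitionSpec.class",
    "org/apache/iceberg/PartitionField.class",
    "org/apache/iceberg/expressions/UnboundPredicate.class",
]


def missing_classes(jar_path, required=REQUIRED_CLASSES):
    """Return the required class entries that are absent from the jar.

    A jar is a zip archive, so the standard zipfile module can list its entries.
    """
    with zipfile.ZipFile(jar_path) as jar:
        entries = set(jar.namelist())
    return [name for name in required if name not in entries]


# Example usage (the path is an assumption about where the jar was saved):
# print(missing_classes("iceberg-spark-runtime-3.5_2.12-1.10.0.jar"))
```

An empty result would mean every listed class is bundled in the runtime jar. Unlike the `grep -E` pattern, this compares exact entry names, so `.` is not treated as a wildcard.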
hsiang-c
left a comment
LGTM
comphead
left a comment
Thanks @mbutrovich
Which issue does this PR close?
N/A.
Rationale for this change
#2528's Iceberg integration needs docs.
What changes are included in this PR?
Add docs for #2528.
How are these changes tested?
N/A.