-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Add Lance connector #26987
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add Lance connector #26987
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry @luohao, your pull request is larger than the review limit of 150000 diff characters
874c64e
to
2b2a035
Compare
lib/trino-lance-file/src/main/java/io/trino/lance/file/v2/metadata/AllNullLayout.java
Outdated
Show resolved
Hide resolved
lib/trino-lance-file/src/main/java/io/trino/lance/file/v2/metadata/ColumnMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-lance/src/test/java/io/trino/plugin/lance/BaseLanceConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-lance/src/test/java/io/trino/plugin/lance/BaseLanceConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-lance/src/main/java/io/trino/plugin/lance/LanceMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-lance/src/main/java/io/trino/plugin/lance/LanceColumnHandle.java
Outdated
Show resolved
Hide resolved
import static com.google.common.io.RecursiveDeleteOption.ALLOW_INSECURE; | ||
import static java.nio.file.Files.createTempDirectory; | ||
|
||
public class TempFile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any specific reason for this TempFile.class ? Files.createTempFile
we can use them to create the files on demand right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am following the pattern in Orc and Parquet(e.g., https://github.com/trinodb/trino/blob/master/lib/trino-orc/src/test/java/io/trino/orc/TempFile.java, https://github.com/trinodb/trino/blob/master/plugin/trino-hive/src/test/java/io/trino/plugin/hive/parquet/ParquetTester.java#L556).
I think using Files.createTempFile
works too.
...in/trino-lance/src/main/java/io/trino/plugin/lance/catalog/namespace/DirectoryNamespace.java
Outdated
Show resolved
Hide resolved
...in/trino-lance/src/main/java/io/trino/plugin/lance/catalog/namespace/DirectoryNamespace.java
Outdated
Show resolved
Hide resolved
0b965b4
to
e058e68
Compare
plugin/trino-lance/src/main/java/io/trino/plugin/lance/LanceColumnHandle.java
Outdated
Show resolved
Hide resolved
plugin/trino-lance/src/main/java/io/trino/plugin/lance/LanceColumnHandle.java
Outdated
Show resolved
Hide resolved
plugin/trino-lance/src/main/java/io/trino/plugin/lance/LanceTableHandle.java
Show resolved
Hide resolved
49d22af
to
06913f5
Compare
06913f5
to
29a0e87
Compare
// A simple product test to ensure no class loader issue. | ||
QueryResult result = onTrino().executeQuery("SHOW SCHEMAS FROM lance"); | ||
assertThat(result).containsOnly(row("default"), row("information_schema")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should test a SELECT statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The connector is read only right now, adding SELECT would need additional stuff to set up the data.
Do you have an example that prepares data for a read only connector?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We add static resource files or use the datasource's library in such a case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to add files to minio object store via copying files to minio container's /data/test-bucket/
directory but doesn't seem to work.
Do you know any example of copying directories to object store?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Answered in Slack DM)
Description
Initial Lance connector implementation. It adds 2 modules:
Additional context and related issues
Benchmark results show that this implementation is up to 5x faster than other implementation using the JNI lib. The following results are collected from a Macbook Pro with Apple M2 Max CPU. Some notes about the result:
readVarchar
is worse than JNI version because the my current FSST decoder doesn't have any optimizations(e.g., vectorization, loop unrolling, and reduced memory copy). I have aTODO
for this.More details about the design and implementation of Lance file/table format can be found at
This is my first PR to kick off the Lance connector work. My immediate TODO list(aka
Roadmap
) already includes:And more to be added...
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text: