Feature: Add Support for INTERVAL Data Type - Already Supported in Parquet & Arrow #16677
Comments
New DataTypes could be supported after #16610 , we are still in a big refactoring stage.
#16610 has been merged, and this feature is now ready to be added to the work queue.
Better do it after #16814
@TCeason is already working on it. We will support interval units via
@TCeason Sorry to bother you on this one. We are trying to plan out migrations to Databend but we have a couple of projects that depend on the Interval data type being available. Would it be realistic to see that new data type this week or is that too soon? We would have to push the migrations to next year if we can't get testing done by next week. Thanks...
This week I'm prioritizing this task. It is expected to be completed by the end of the week.
Summary
Interval is a value type that Databend already understands, since it is used in date addition. However, there is currently no way to store an Interval value in a column, as can be done in Postgres.
While Snowflake and MySQL also do not support the Interval type, Postgres does, and it makes life much easier because storing duration information is quite common. Both Parquet and Arrow support an Interval/Duration data type.
The Parquet standard does support an Interval data type as defined here: https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#interval
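For reference, the linked Parquet spec describes INTERVAL as a 12-byte fixed-length value holding three little-endian unsigned 32-bit integers: months, days, and milliseconds. Below is a minimal Python sketch of that layout; the helper names `encode_interval` and `decode_interval` are hypothetical and are not part of Databend, Parquet, or Arrow.

```python
import struct
from typing import Tuple

# Parquet INTERVAL layout per LogicalTypes.md: 12 bytes, three
# little-endian unsigned 32-bit integers (months, days, milliseconds).
_INTERVAL = struct.Struct("<III")


def encode_interval(months: int, days: int, millis: int) -> bytes:
    """Pack the three components into the 12-byte layout (hypothetical helper)."""
    return _INTERVAL.pack(months, days, millis)


def decode_interval(raw: bytes) -> Tuple[int, int, int]:
    """Unpack a 12-byte value back into (months, days, milliseconds)."""
    if len(raw) != _INTERVAL.size:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    return _INTERVAL.unpack(raw)


# Round trip: 1 month, 15 days, 1 hour expressed in milliseconds.
raw = encode_interval(1, 15, 3_600_000)
months, days, millis = decode_interval(raw)
```

Because the three components are stored separately, a month is never converted to a fixed number of days, which is what makes this a calendar interval rather than a plain elapsed-time duration.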
Arrow also supports a Duration type with several levels of resolution. It would likely be safe to pick a reasonable default resolution for Arrow usage: https://arrow.apache.org/docs/python/generated/pyarrow.duration.html#pyarrow.duration. This conversion function in the Rust Parquet writer seems to suggest that the default might be milliseconds: https://arrow.apache.org/rust/parquet/arrow/arrow_writer/fn.get_interval_dt_array_slice.html
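To illustrate what choosing a default resolution would involve: an Arrow Duration is an integer count of a single unit, and `pyarrow.duration` accepts `s`, `ms`, `us`, or `ns`. The sketch below converts a count between those units in plain Python; `convert_duration` is a hypothetical helper, not an Arrow or Databend function, and it rounds down when moving to a coarser unit.

```python
# Ticks per second for each unit accepted by pyarrow.duration.
UNITS_PER_SECOND = {"s": 1, "ms": 1_000, "us": 1_000_000, "ns": 1_000_000_000}


def convert_duration(value: int, from_unit: str, to_unit: str) -> int:
    """Convert an integer duration count between units (hypothetical helper).

    Uses floor division, so converting to a coarser unit discards
    the sub-unit remainder.
    """
    return value * UNITS_PER_SECOND[to_unit] // UNITS_PER_SECOND[from_unit]


as_micros = convert_duration(2, "s", "us")      # finer unit: exact
as_seconds = convert_duration(1_500, "ms", "s")  # coarser unit: lossy
```

A millisecond default would line up with the milliseconds component of the Parquet INTERVAL layout, at the cost of dropping any microsecond or nanosecond precision.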
The ideal approach would be one where an Interval value could be marshalled and unmarshalled from Parquet using native Parquet and Arrow types.