-
Notifications
You must be signed in to change notification settings - Fork 13.7k
[FLINK-38532][table][FLIP-551] Make FRESHNESS Optional for Materialized Tables
#27132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
FRESHNESS Optional for Materialized Tables
| ## Data Freshness | ||
|
|
||
| Data freshness defines the maximum amount of time that the materialized table’s content should lag behind updates to the base tables. Freshness is not a guarantee. Instead, it is a target that Flink attempts to meet. Data in materialized table is refreshed as closely as possible within the freshness. | ||
| Data freshness defines the maximum amount of time that the materialized table's content should lag behind updates to the base tables. Freshness is not a guarantee. Instead, it is a target that Flink attempts to meet. Data in materialized table is refreshed as closely as possible within the freshness. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also Chinese doc should be updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nits in the words (it is up to you if you want to address)
- Data in materialized table -> The data in materialized table
- within the freshness.-> within the freshness target.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it would be helpful to relate freshness to checkpointing. The code picks up a freshness value from CheckpointingOptions.CHECKPOINTING_INTERVAL, but this is not referenced in the docs - I suggest the user should be made aware of how this config value effects freshness.
.../java/org/apache/flink/table/gateway/service/materializedtable/MaterializedTableManager.java
Outdated
Show resolved
Hide resolved
.../apache/flink/table/planner/operations/SqlMaterializedTableNodeToOperationConverterTest.java
Outdated
Show resolved
Hide resolved
...table/flink-table-common/src/main/java/org/apache/flink/table/catalog/IntervalFreshness.java
Outdated
Show resolved
Hide resolved
...e/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/ShowCreateUtil.java
Outdated
Show resolved
Hide resolved
...table/flink-table-common/src/main/java/org/apache/flink/table/catalog/IntervalFreshness.java
Outdated
Show resolved
Hide resolved
...table/flink-table-common/src/main/java/org/apache/flink/table/catalog/IntervalFreshness.java
Outdated
Show resolved
Hide resolved
.../apache/flink/table/planner/operations/SqlMaterializedTableNodeToOperationConverterTest.java
Outdated
Show resolved
Hide resolved
.../apache/flink/table/planner/operations/SqlMaterializedTableNodeToOperationConverterTest.java
Outdated
Show resolved
Hide resolved
.../src/test/java/org/apache/flink/table/planner/operations/SqlDdlToOperationConverterTest.java
Outdated
Show resolved
Hide resolved
| * non-null values for both properties. | ||
| */ | ||
| @Experimental | ||
| public class EnrichmentResult { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure it is the right name or package...
it is inside org.apache.flink.table.catalog, however it is related to only some specific use case of Materialized Tables
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will rename this to MaterializedTableEnrichmentResult (will update the FLIP accordingly). The reason I chose this package was that in the same package we declare the FreshnessInterval as well.
| validateCronConstraints(intervalFreshness, SECOND_CRON_UPPER_BOUND); | ||
| break; | ||
| case MINUTE: | ||
| validateCronConstraints(intervalFreshness, MINUTE_CRON_UPPER_BOUND); | ||
| break; | ||
| case HOUR: | ||
| validateCronConstraints(intervalFreshness, HOUR_CRON_UPPER_BOUND); | ||
| break; | ||
| case DAY: | ||
| validateDayConstraints(intervalFreshness); | ||
| break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder: what is the reason of having all these validations for interval freshness spreaded between this and IntervalFreshness?
Can we have then in one place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the reason this validation is here, its because it is used in the internal API. I can move this to the IntervalFreshness class and get rid of the IntervalFreshnessUtils entirely. WDYT?
ab10ec2 to
e3be129
Compare
.../src/test/java/org/apache/flink/table/planner/operations/SqlDdlToOperationConverterTest.java
Outdated
Show resolved
Hide resolved
.../src/test/java/org/apache/flink/table/planner/operations/SqlDdlToOperationConverterTest.java
Outdated
Show resolved
Hide resolved
…ed Tables [FLINK-38532] generate config docs
…ntervalFreshnessUtils
What is the purpose of the change
This pull request makes the
FRESHNESSclause optional when creating materialized tables. When freshness is not specified, the system uses configurable defaults based on the refresh mode:materialized-table.default-freshness.continuous(default: 3 minutes) for CONTINUOUS mode, ormaterialized-table.default-freshness.full(default: 1 hour) for FULL mode. This provides users with more flexibility and allows catalogs to implement custom freshness and refresh mode determination logic.Brief change log
MaterializedTableEnricherinterface for pluggable freshness and refresh mode resolution logicDefaultMaterializedTableEnricherwith threshold-based refresh mode determinationmaterialized-table.default-freshness.continuous(default: 3 minutes)materialized-table.default-freshness.full(default: 1 hour)FRESHNESSclause optional inCREATE MATERIALIZED TABLEsyntaxSqlCreateMaterializedTableConverterto use the enricher for determining freshness and refresh modeVerifying this change
This change is already covered by existing tests, such as:
SqlMaterializedTableNodeToOperationConverterTest.testFullRefreshMode()- validates refresh mode determinationSqlMaterializedTableNodeToOperationConverterTest.testContinuousRefreshMode()- validates continuous mode behaviorAdditionally, this change maintains backward compatibility with existing materialized table implementations through default interface methods in
CatalogMaterializedTable.Does this pull request potentially affect one of the following parts:
@Public(Evolving): yes (CatalogMaterializedTable interface)Documentation