[FLINK-38104][table] add table api support for model ml_predict #27108

lihaosky · 2025-10-13T22:04:16Z

What is the purpose of the change

Add Table API support for Model and the ML_PREDICT function, as proposed in FLIP-526: https://cwiki.apache.org/confluence/display/FLINK/FLIP-526%3A+Model+ML_PREDICT%2C+ML_EVALUATE+Table+API

Brief change log

Add new Model interface and ModelImpl implementation for model and ml_predict
Add fromModelPath and from to construct Model from TableEnvironment
Add ModelReferenceExpression and handle it in QueryOperationConverter
Add anonymous model to ContextResolvedModel

Verifying this change

Unit and integration tests

Does this pull request potentially affect one of the following parts:

Dependencies (does it add or upgrade a dependency): (no)
The public API, i.e., is any changed class annotated with @Public(Evolving): (yes)
The serializers: (no)
The runtime per-record code paths (performance sensitive): (no)
Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
The S3 file system connector: (no)

Documentation

Does this pull request introduce a new feature? (yes)
If yes, how is the feature documented? (JavaDocs)

flinkbot · 2025-10-13T22:12:18Z

CI report:

208c58d Azure: SUCCESS
a21c25b Azure: PENDING

Bot commands

The @flinkbot bot supports the following commands:

@flinkbot run azure: re-run the last Azure build

lihaosky · 2025-10-21T16:21:36Z

flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/Model.java

+     *
+     * @return the context resolved model metadata.
+     */
+    ContextResolvedModel getModel();


@twalthr, this forces ContextResolvedModel to be PublicEvolving. I guess one way to avoid that is to remove this method from Model and add it only to ModelImpl.

fsk119

Thanks for your contribution. I left some comments.

fsk119 · 2025-10-31T01:30:21Z

flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/Model.java

+     * runtime configuration options such as max-concurrent-operations, timeout, and execution mode
+     * settings.
+     *
+     * <p>Common runtime options include:


It's not easy for the community to maintain Javadoc that lists these options. How about we just link to MLPredictRuntimeConfigOptions?

fsk119 · 2025-10-31T01:38:33Z

...-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/ModelImpl.java

+        // lit() is not serializable to sql.
+        if (options.isEmpty()) {
+            return tableEnvironment.fromCall(
+                    "ML_PREDICT",


Use org.apache.flink.table.functions.BuiltInFunctionDefinitions#ML_PREDICT.getName() instead?

fsk119 · 2025-10-31T01:39:41Z

...-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/ModelImpl.java

+
+    @Override
+    public Table predict(Table table, ColumnList inputColumns, Map<String, String> options) {
+        // Use Expressions.map() instead of Expression.lit() to create a MAP literal since


Typo: Expressions.lit()?

fsk119 · 2025-10-31T02:15:46Z

...able-api-java/src/main/java/org/apache/flink/table/expressions/ModelReferenceExpression.java

+
+    private final String name;
+    private final ContextResolvedModel model;
+    private final TableEnvironment env;


Why do we need TableEnv here?

It's used in https://github.com/apache/flink/pull/27108/files#diff-7c9c46d35bb24b9f87d4c993274ef8c0054894c86c72c8afcc0c4f6398562fdeR1570. TableReferenceExpression follows the same pattern.

But I see that FieldReferenceExpression doesn't have a table environment, and I don't see any need to include the table env here. How about relaxing the limit and adding it back when we really need it?

I think it's mainly for validation. See:
https://github.com/apache/flink/blob/master/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableEnvironmentImpl.java#L1527-L1531

https://github.com/apache/flink/blob/master/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/expressions/TableReferenceExpression.java#L46-L48

FieldReferenceExpression only contains a field name and type, so I guess there is no need to validate the environment there. How about I add more comments here, similar to the ones on TableReferenceExpression?
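To make the validation argument concrete, here is a minimal, self-contained sketch. ToyEnv and ToyEnvModelRef are hypothetical stand-ins, not Flink classes, and the check shown is only an assumption about what "validation" means in this thread: a reference remembers the environment that created it, so a later call can reject a reference that came from a different environment.

```java
import java.util.Objects;

/** Hypothetical stand-in for a table environment; not a Flink class. */
final class ToyEnv {
    private final String name;

    ToyEnv(String name) {
        this.name = Objects.requireNonNull(name);
    }

    /** Creates a reference that remembers which environment produced it. */
    ToyEnvModelRef model(String modelName) {
        return new ToyEnvModelRef(modelName, this);
    }

    /** Rejects references created by another environment (identity check). */
    String use(ToyEnvModelRef ref) {
        if (ref.env() != this) {
            throw new IllegalArgumentException(
                    "Model '" + ref.name() + "' belongs to a different environment.");
        }
        return name + ":" + ref.name();
    }
}

/** Hypothetical stand-in for a model reference expression. */
final class ToyEnvModelRef {
    private final String name;
    private final ToyEnv env;

    ToyEnvModelRef(String name, ToyEnv env) {
        this.name = name;
        this.env = env;
    }

    String name() {
        return name;
    }

    ToyEnv env() {
        return env;
    }
}

final class EnvValidationDemo {
    public static void main(String[] args) {
        ToyEnv first = new ToyEnv("first");
        ToyEnv second = new ToyEnv("second");
        ToyEnvModelRef ref = first.model("m");
        System.out.println(first.use(ref)); // accepted: same environment
        try {
            second.use(ref); // rejected: created by another environment
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A field reference that carries only a name and a type has nothing comparable to check, which is the distinction drawn in the reply above.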

fsk119 · 2025-10-31T02:29:33Z

...nk-table-api-java/src/main/java/org/apache/flink/table/expressions/ApiExpressionVisitor.java


    public abstract R visit(TableReferenceExpression tableReference);

+    public abstract R visit(ModelReferenceExpression modelReferenceExpression);


Add an else-if branch at line 30?

fsk119 · 2025-10-31T02:42:10Z

...table-planner/src/main/java/org/apache/flink/table/planner/plan/QueryOperationConverter.java

                                                            new int[0]);
                                            inputStack.add(relBuilder.build());
                                            return tableArgCall;
+                                        } else if (resolvedArg


I don't think it's a good idea to use instanceof here. It would be better to reuse ExpressionConverter to convert these expressions. How about letting ExpressionConverter extend ResolvedExpressionVisitor<RexNode>?
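The general shape of this suggestion, visitor dispatch in place of an instanceof chain, can be sketched in isolation. Everything below is hypothetical (Arg, ArgVisitor, TableArg, ModelArg, SqlLikeConverter); it does not reproduce Flink's ResolvedExpressionVisitor or ExpressionConverter and only illustrates the pattern under discussion.

```java
/** Hypothetical visitor over two kinds of arguments; not a Flink interface. */
interface ArgVisitor<R> {
    R visit(TableArg table);

    R visit(ModelArg model);
}

/** Hypothetical argument type that dispatches to the matching visit method. */
interface Arg {
    <R> R accept(ArgVisitor<R> visitor);
}

final class TableArg implements Arg {
    final String name;

    TableArg(String name) {
        this.name = name;
    }

    @Override
    public <R> R accept(ArgVisitor<R> visitor) {
        return visitor.visit(this);
    }
}

final class ModelArg implements Arg {
    final String name;

    ModelArg(String name) {
        this.name = name;
    }

    @Override
    public <R> R accept(ArgVisitor<R> visitor) {
        return visitor.visit(this);
    }
}

/** One converter handles every argument kind; no instanceof checks are needed. */
final class SqlLikeConverter implements ArgVisitor<String> {
    @Override
    public String visit(TableArg table) {
        return "TABLE " + table.name;
    }

    @Override
    public String visit(ModelArg model) {
        return "MODEL " + model.name;
    }
}

final class VisitorDispatchDemo {
    public static void main(String[] args) {
        Arg[] arguments = {new TableArg("t"), new ModelArg("m")};
        SqlLikeConverter converter = new SqlLikeConverter();
        for (Arg argument : arguments) {
            System.out.println(argument.accept(converter));
        }
    }
}
```

With this shape, adding a new argument kind means adding one visit method, and the compiler flags every visitor that has not handled it yet.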

fsk119

I left some comments

fsk119 · 2025-11-04T15:54:18Z

...able-api-java/src/main/java/org/apache/flink/table/expressions/ModelReferenceExpression.java

+            throw new ValidationException("Anonymous models cannot be serialized.");
+        }
+
+        return "MODEL " + model.getIdentifier().asSerializableString();


It seems the name is lost after serialization. I am not sure whether we need to restore the object from the string.

The name is not used for TableReferenceExpression or for literal argument names either. These are not serialized as a named-argument call. I don't see this being restored from the serialized string; it looks like the string is mainly used to render the expression as a SQL query.
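The behaviour discussed here can be shown with a small stand-alone sketch. ToySerializableModelRef is hypothetical, IllegalStateException stands in for the ValidationException in the diff, and the identifier is a plain placeholder string rather than a real object identifier. The point is only that the argument name takes no part in the output and that anonymous models are rejected.

```java
/** Hypothetical stand-in for a model reference; not the class under review. */
final class ToySerializableModelRef {
    private final String argumentName; // kept on the object, never serialized
    private final String identifier;   // null models an anonymous model

    ToySerializableModelRef(String argumentName, String identifier) {
        this.argumentName = argumentName;
        this.identifier = identifier;
    }

    String argumentName() {
        return argumentName;
    }

    /** Renders a SQL-like string; the argument name is intentionally dropped. */
    String asSerializableString() {
        if (identifier == null) {
            throw new IllegalStateException("Anonymous models cannot be serialized.");
        }
        return "MODEL " + identifier;
    }
}

final class ModelSerializationDemo {
    public static void main(String[] args) {
        ToySerializableModelRef named = new ToySerializableModelRef("model", "cat.db.m");
        System.out.println(named.asSerializableString());

        ToySerializableModelRef anonymous = new ToySerializableModelRef("model", null);
        try {
            anonymous.asSerializableString();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the output is only rendered and never parsed back, dropping the name loses nothing in this sketch, which matches the reply above.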

fsk119 · 2025-11-04T15:59:25Z

...able-api-java/src/main/java/org/apache/flink/table/expressions/ModelReferenceExpression.java

+
+    private final String name;
+    private final ContextResolvedModel model;
+    private final TableEnvironment env;


But I see that FieldReferenceExpression doesn't have a table environment, and I don't see any need to include the table env here. How about relaxing the limit and adding it back when we really need it?

fsk119 · 2025-11-04T16:02:52Z

.../src/main/java/org/apache/flink/table/planner/expressions/converter/ExpressionConverter.java

        this.typeFactory = (FlinkTypeFactory) relBuilder.getRexBuilder().getTypeFactory();
        this.dataTypeFactory =
                unwrapContext(relBuilder.getCluster()).getCatalogManager().getDataTypeFactory();
+        this.inputStack = new java.util.ArrayList<>();


this.inputStack = new ArrayList<>();

fsk119 · 2025-11-04T16:21:38Z

.../src/main/java/org/apache/flink/table/planner/expressions/converter/ExpressionConverter.java

+        }
+        final RexTableArgCall tableArgCall =
+                new RexTableArgCall(rowType, inputStack.size(), partitionKeys, new int[0]);
+        inputStack.add(relBuilder.build());


Can we maintain the stack and do the relBuilder.build() in the QueryOperationConverter?

It's hard to do since inputStack depends on this call. Maybe that's why TableReferenceExpression was handled in QueryOperationConverter in the first place
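The coupling mentioned in this reply comes from the two lines in the diff: the call captures inputStack.size() as its input index, and only then is the built input pushed. A toy version, with hypothetical ToyTableArgCall and ToyArgConverter classes and plain strings in place of built relational nodes, shows why the two steps are hard to separate.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a table-argument call that records its input index. */
final class ToyTableArgCall {
    final int inputIndex;

    ToyTableArgCall(int inputIndex) {
        this.inputIndex = inputIndex;
    }
}

/** Hypothetical converter; strings stand in for built relational inputs. */
final class ToyArgConverter {
    private final List<String> inputStack = new ArrayList<>();

    ToyTableArgCall convertTableArg(String builtInput) {
        // The index must be read before the push, so the call points at the
        // slot that the input is about to occupy.
        ToyTableArgCall call = new ToyTableArgCall(inputStack.size());
        inputStack.add(builtInput);
        return call;
    }

    List<String> inputs() {
        return Collections.unmodifiableList(inputStack);
    }
}

final class InputStackDemo {
    public static void main(String[] args) {
        ToyArgConverter converter = new ToyArgConverter();
        ToyTableArgCall first = converter.convertTableArg("scan(orders)");
        ToyTableArgCall second = converter.convertTableArg("scan(users)");
        System.out.println(first.inputIndex + " -> " + converter.inputs().get(first.inputIndex));
        System.out.println(second.inputIndex + " -> " + converter.inputs().get(second.inputIndex));
    }
}
```

If the stack lived in one class and the call were created in another, the index would have to be passed across that boundary for every table argument.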

fsk119 · 2025-11-04T16:29:05Z

flink-python/pyflink/table/tests/test_table_environment_completeness.py

            "from",
            "registerFunction",
            "fromCall",
+            "fromModelPath",


Create a ticket about this.

Done: https://issues.apache.org/jira/browse/FLINK-38623

airlock-confluentinc bot force-pushed the model-table-api branch from 074a935 to 982f128 Compare October 21, 2025 21:54

github-actions bot added the community-reviewed label (PR has been reviewed by the community.) Oct 22, 2025

lihaosky added 6 commits November 3, 2025 14:18

[FLINK-38104][table] add table api support for model ml_predict

2973b01

fix

ca0700b

fix

bfab15b

fix

b7d9a51

fix

abf8b08

comments

208c58d

airlock-confluentinc bot force-pushed the model-table-api branch from 824f5f1 to 208c58d Compare November 4, 2025 02:51

comments

a21c25b

