[FSTORE-1630] Model Dependent Transformation Functions creates feature names that are longer than 64 character causing logging feature group ingestion to fail (#429)

manu-sj · manu-sj · commit 72c8395a82af · 2025-01-28T13:26:56.000+01:00
diff --git a/docs/user_guides/fs/feature_group/on_demand_transformations.md b/docs/user_guides/fs/feature_group/on_demand_transformations.md
@@ -5,7 +5,9 @@
 ## On Demand Transformation Function Creation
 
 
-An on-demand transformation function can be created by attaching a [transformation function](../transformation_functions.md) to a feature group. Each on-demand transformation function creates one on-demand feature having the same name as the transformation function. For instance, in the example below, the on-demand transformation function `transaction_age` will generate one on-demand feature called `transaction_age`. Hence, only one-to-one or many-to-one transformation functions can be used to create an on-demand transformation functions.
+An on-demand transformation function may be created by associating a [transformation function](../transformation_functions.md) with a feature group. Each on-demand transformation function generates a single on-demand feature, which, by default, is assigned the same name as the associated transformation function. For instance, in the example below, the on-demand transformation function `transaction_age` produces an on-demand feature named transaction_age. Alternatively, the name of the resulting on-demand feature can be explicitly defined using the [`alias`](../transformation_functions.md#specifying-output-features–names-for-transformation-functions) function.
+
+It is important to note that only one-to-one or many-to-one transformation functions are compatible with the creation of on-demand transformation functions.
 
 !!! warning "On-demand transformation"
     All on-demand transformation functions attached to a feature group must have unique names and, in contrast to model-dependent transformations, they do not have access to training dataset statistics.
diff --git a/docs/user_guides/fs/feature_view/model-dependent-transformations.md b/docs/user_guides/fs/feature_view/model-dependent-transformations.md
@@ -11,7 +11,7 @@ Hopsworks allows you to create a model-dependent transformation function by atta
 
 Each model-dependent transformation function can map specific features to its arguments by explicitly providing their names as arguments to the transformation function. If no feature names are provided, the transformation function will default to using features from the feature view that match the name of the transformation function's argument.
 
-The output columns generated by a model-dependent transformation function follows a naming convention structured as `functionName_features_outputColumnNumber` if the transformation function outputs multiple columns and `functionName_features` if the transformation function outputs one column. For instance, for the function named `add_one_multiple` that outputs multiple columns in the example given below, produces output columns that would be labeled as  `add_one_multiple_feature1_feature2_feature3_0`,  `add_one_multiple_feature1_feature2_feature3_1`  and   `add_one_multiple_feature1_feature2_feature3_2`. The function named `add_two` that outputs a single column in the example given below, produces a single output column names as `add_two_feature`.
+Hopsworks by default generates default names of transformed features output by a model-dependent transformation function. The generated names follows a naming convention structured as `functionName_features_outputColumnNumber` if the transformation function outputs multiple columns and `functionName_features` if the transformation function outputs one column. For instance, for the function named `add_one_multiple` that outputs multiple columns in the example given below, produces output columns that would be labeled as  `add_one_multiple_feature1_feature2_feature3_0`,  `add_one_multiple_feature1_feature2_feature3_1`  and   `add_one_multiple_feature1_feature2_feature3_2`. The function named `add_two` that outputs a single column in the example given below, produces a single output column names as `add_two_feature`. Additionally, Hopsworks also allows users to specify custom names for transformed feature using the [`alias`](../transformation_functions.md#specifying-output-features–names-for-transformation-functions) function.
 
 
 === "Python"
diff --git a/docs/user_guides/fs/transformation_functions.md b/docs/user_guides/fs/transformation_functions.md
@@ -185,6 +185,25 @@ The `drop` parameter of the `@udf` decorator is used to drop specific column
             return feature1 + 1, feature2 + 1, feature3 + 1
         ```
 
+### Specifying output features names for transformation functions
+
+The [`alias`](http://docs.hopsworks.ai/hopsworks-api/{{{hopsworks_version}}}/generated/api/transformation_functions_api/#alias) function of a transformation function allows the specification of names of transformed features generated by the transformation function. Each name must be uniques and should be at-most 63 characters long. If no name is provided via the `alias` function, Hopsworks generates default output feature names when [on-demand](./feature_group/on_demand_transformations.md) or [model-dependent](./feature_view/model-dependent-transformations.md) transformation functions are created.
+
+
+=== "Python"    
+    !!! example "Specifying output column names for transformation functions."
+        ```python
+        from hopsworks import udf
+        import pandas as pd
+
+        @udf(return_type=[int, int, int], drop=["feature1", "feature3"])
+        def add_one_multiple(feature1, feature2, feature3):
+            return feature1 + 1, feature2 + 1, feature3 + 1
+
+        # Specifying output feature names of the transformation function.
+        add_one_multiple.alias("transformed_feature1", "transformed_feature2", "transformed_feature3")
+        ```
+
 ### Training dataset statistics
 
 A keyword argument `statistics` can be defined in the transformation function if it requires training dataset statistics for any of its arguments. The `statistics` argument must be assigned an instance of the class [`TransformationStatistics`](http://docs.hopsworks.ai/hopsworks-api/{{{hopsworks_version}}}/generated/api/transformation_statistics/) as the default value. The `TransformationStatistics` instance must be initialized using the names of the arguments requiring statistics.