Commit 72c8395

[FSTORE-1630] Model Dependent Transformation Functions creates feature names that are longer than 64 character causing logging feature group ingestion to fail (#429)
1 parent 35ee41d commit 72c8395

3 files changed, +23 -2 lines changed

docs/user_guides/fs/feature_group/on_demand_transformations.md

+3-1
@@ -5,7 +5,9 @@
 ## On Demand Transformation Function Creation


-An on-demand transformation function can be created by attaching a [transformation function](../transformation_functions.md) to a feature group. Each on-demand transformation function creates one on-demand feature having the same name as the transformation function. For instance, in the example below, the on-demand transformation function `transaction_age` will generate one on-demand feature called `transaction_age`. Hence, only one-to-one or many-to-one transformation functions can be used to create an on-demand transformation functions.
+An on-demand transformation function may be created by associating a [transformation function](../transformation_functions.md) with a feature group. Each on-demand transformation function generates a single on-demand feature, which, by default, is assigned the same name as the associated transformation function. For instance, in the example below, the on-demand transformation function `transaction_age` produces an on-demand feature named `transaction_age`. Alternatively, the name of the resulting on-demand feature can be explicitly defined using the [`alias`](../transformation_functions.md#specifying-output-features–names-for-transformation-functions) function.
+
+It is important to note that only one-to-one or many-to-one transformation functions are compatible with the creation of on-demand transformation functions.

 !!! warning "On-demand transformation"
     All on-demand transformation functions attached to a feature group must have unique names and, in contrast to model-dependent transformations, they do not have access to training dataset statistics.

docs/user_guides/fs/feature_view/model-dependent-transformations.md

+1-1
@@ -11,7 +11,7 @@ Hopsworks allows you to create a model-dependent transformation function by atta

 Each model-dependent transformation function can map specific features to its arguments by explicitly providing their names as arguments to the transformation function. If no feature names are provided, the transformation function will default to using features from the feature view that match the name of the transformation function's argument.

-The output columns generated by a model-dependent transformation function follows a naming convention structured as `functionName_features_outputColumnNumber` if the transformation function outputs multiple columns and `functionName_features` if the transformation function outputs one column. For instance, for the function named `add_one_multiple` that outputs multiple columns in the example given below, produces output columns that would be labeled as  `add_one_multiple_feature1_feature2_feature3_0``add_one_multiple_feature1_feature2_feature3_1` and  `add_one_multiple_feature1_feature2_feature3_2`. The function named `add_two` that outputs a single column in the example given below, produces a single output column names as `add_two_feature`.
+By default, Hopsworks generates names for the transformed features output by a model-dependent transformation function. The generated names follow the convention `functionName_features_outputColumnNumber` if the transformation function outputs multiple columns and `functionName_features` if it outputs one column. For instance, the function named `add_one_multiple` in the example given below outputs multiple columns, which are labeled `add_one_multiple_feature1_feature2_feature3_0`, `add_one_multiple_feature1_feature2_feature3_1` and `add_one_multiple_feature1_feature2_feature3_2`. The function named `add_two` in the example given below outputs a single column named `add_two_feature`. Hopsworks also allows users to specify custom names for transformed features using the [`alias`](../transformation_functions.md#specifying-output-features–names-for-transformation-functions) function.
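
The default naming convention described in this paragraph can be sketched in plain Python. The helper `default_output_names` is hypothetical and is not part of the Hopsworks API; it only mirrors the `functionName_features_outputColumnNumber` pattern stated in the text, to illustrate how generated names grow with the function name and every input feature name.

```python
def default_output_names(function_name, features, num_outputs):
    """Build names following the documented default convention (illustrative only)."""
    # Hypothetical helper: joins the function name and its input feature names.
    base = "_".join([function_name, *features])
    if num_outputs == 1:
        # Single output column: functionName_features
        return [base]
    # Multiple output columns: functionName_features_outputColumnNumber
    return [f"{base}_{i}" for i in range(num_outputs)]


multi = default_output_names(
    "add_one_multiple", ["feature1", "feature2", "feature3"], 3
)
single = default_output_names("add_two", ["feature"], 1)

# Map each generated name to its length to see how close it gets to a limit.
lengths = {name: len(name) for name in multi + single}
```

Because every input feature name is appended, a function with many or long-named inputs can produce names that exceed a fixed length limit, which is the situation the commit title describes.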


 === "Python"

docs/user_guides/fs/transformation_functions.md

+19
@@ -185,6 +185,25 @@ The `drop` parameter of the `@udf` decorator is used to drop specific column
             return feature1 + 1, feature2 + 1, feature3 + 1
         ```

+### Specifying output features names for transformation functions
+
+The [`alias`](http://docs.hopsworks.ai/hopsworks-api/{{{hopsworks_version}}}/generated/api/transformation_functions_api/#alias) function of a transformation function allows the specification of names of transformed features generated by the transformation function. Each name must be unique and should be at most 63 characters long. If no name is provided via the `alias` function, Hopsworks generates default output feature names when [on-demand](./feature_group/on_demand_transformations.md) or [model-dependent](./feature_view/model-dependent-transformations.md) transformation functions are created.
+
+
+=== "Python"
+    !!! example "Specifying output column names for transformation functions."
+        ```python
+        from hopsworks import udf
+        import pandas as pd
+
+        @udf(return_type=[int, int, int], drop=["feature1", "feature3"])
+        def add_one_multiple(feature1, feature2, feature3):
+            return feature1 + 1, feature2 + 1, feature3 + 1
+
+        # Specifying output feature names of the transformation function.
+        add_one_multiple.alias("transformed_feature1", "transformed_feature2", "transformed_feature3")
+        ```
+
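
The constraints stated for `alias` in this hunk (unique names, each at most 63 characters long) can be checked before the names are passed on. The function `validate_output_names` and the constant `MAX_NAME_LENGTH` are hypothetical helpers, not part of the Hopsworks API, and the sketch assumes the limit is counted in characters, as the text states.

```python
MAX_NAME_LENGTH = 63  # limit taken from the documentation text in this commit


def validate_output_names(*names):
    """Raise ValueError for duplicated or over-long names (illustrative check only)."""
    # Collect every name that appears more than once.
    duplicates = sorted({n for n in names if names.count(n) > 1})
    if duplicates:
        raise ValueError(f"Output feature names must be unique: {duplicates}")
    # Collect every name longer than the assumed limit.
    too_long = [n for n in names if len(n) > MAX_NAME_LENGTH]
    if too_long:
        raise ValueError(
            f"Output feature names exceed {MAX_NAME_LENGTH} characters: {too_long}"
        )
    return list(names)


checked = validate_output_names(
    "transformed_feature1", "transformed_feature2", "transformed_feature3"
)
```

The validated names could then be passed to the call shown in the example, for instance `add_one_multiple.alias(*checked)`.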
 ### Training dataset statistics

 A keyword argument `statistics` can be defined in the transformation function if it requires training dataset statistics for any of its arguments. The `statistics` argument must be assigned an instance of the class [`TransformationStatistics`](http://docs.hopsworks.ai/hopsworks-api/{{{hopsworks_version}}}/generated/api/transformation_statistics/) as the default value. The `TransformationStatistics` instance must be initialized using the names of the arguments requiring statistics.
