From b599be46244f57230b250bad38d7bcdb5766f66d Mon Sep 17 00:00:00 2001 From: Maksym Zhytnikov <63515947+Maxxx-zh@users.noreply.github.com> Date: Tue, 21 Jan 2025 12:15:52 +0200 Subject: [PATCH 1/2] Update tutorial links --- docs/tutorials/index.md | 26 +++++++------------ .../fs/feature_group/feature_monitoring.md | 2 +- .../feature_monitoring_advanced.md | 2 +- .../fs/feature_view/feature_monitoring.md | 2 +- docs/user_guides/fs/feature_view/overview.md | 2 +- .../fs/feature_view/training-data.md | 4 +-- .../fs/vector_similarity_search.md | 2 +- 7 files changed, 16 insertions(+), 24 deletions(-) diff --git a/docs/tutorials/index.md b/docs/tutorials/index.md index e3e0cecff..d50f17563 100644 --- a/docs/tutorials/index.md +++ b/docs/tutorials/index.md @@ -24,9 +24,9 @@ This is a batch use case variant of the fraud tutorial, it will give you a high | Notebooks | | | ----------- | ------------------------------------ | -| 1. How to load, engineer and create feature groups | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/fraud_batch/1_fraud_batch_feature_pipeline.ipynb){:target="_blank"} | -| 2. How to create training datasets | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/fraud_batch/2_fraud_batch_training_pipeline.ipynb){:target="_blank"} | -| 3. How to train a model from the feature store | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/fraud_batch/3_fraud_batch_inference.ipynb){:target="_blank"} | +| 1. How to load, engineer and create feature groups | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/1_fraud_batch_feature_pipeline.ipynb){:target="_blank"} | +| 2. How to create training datasets | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/2_fraud_batch_training_pipeline.ipynb){:target="_blank"} | +| 3. How to train a model from the feature store | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/3_fraud_batch_inference.ipynb){:target="_blank"} | ### Online This is a online use case variant of the fraud tutorial, it is similar to the batch use case, however, in this tutorial you will get introduced to the usage of Feature Groups which are kept in online storage, and how to access single feature vectors from the online storage @@ -34,9 +34,9 @@ at low latency. Additionally, the model will be deployed as a model serving inst | Notebooks | | | ----------- | ------------------------------------ | -| 1. How to load, engineer and create feature groups | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/fraud_online/1_fraud_online_feature_pipeline.ipynb){:target="_blank"} | -| 2. How to create training datasets | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/fraud_online/2_fraud_online_training_pipeline.ipynb){:target="_blank"} | -| 3. How to train a model from the feature store and deploying it as a serving instance together with the online feature store | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/fraud_online/3_fraud_online_inference_pipeline.ipynb){:target="_blank"} | +| 1. How to load, engineer and create feature groups | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/real-time-ai-systems/fraud_online/1_fraud_online_feature_pipeline.ipynb){:target="_blank"} | +| 2. How to create training datasets | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/real-time-ai-systems/fraud_online/2_fraud_online_training_pipeline.ipynb){:target="_blank"} | +| 3. How to train a model from the feature store and deploying it as a serving instance together with the online feature store | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/real-time-ai-systems/fraud_online/3_fraud_online_inference_pipeline.ipynb){:target="_blank"} | ## Churn Tutorial @@ -45,17 +45,9 @@ at low latency. Additionally, the model will be deployed as a model serving inst | Notebooks | | | ----------- | ------------------------------------ | -| 1. How to load, engineer and create feature groups | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/churn/1_churn_feature_pipeline.ipynb){:target="_blank"} | -| 2. How to create training datasets | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/churn/2_churn_training_pipeline.ipynb){:target="_blank"} | -| 3. How to train a model from the feature store and deploying it as a serving instance together with the online feature store | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/churn/3_churn_batch_inference.ipynb){:target="_blank"} | - -## Iris Tutorial - -In this tutorial you will learn how to create an online prediction service for the Iris flower prediction problem. - -| Notebooks | | -| ----------- | ------------------------------------ | -| 1. All-in-one notebook, showing how to create the needed feature groups, train the model and deploy it as a serving instance | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/iris/iris_tutorial.ipynb){:target="_blank"} | +| 1. How to load, engineer and create feature groups | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/churn/1_churn_feature_pipeline.ipynb){:target="_blank"} | +| 2. How to create training datasets | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/churn/2_churn_training_pipeline.ipynb){:target="_blank"} | +| 3. How to train a model from the feature store and deploying it as a serving instance together with the online feature store | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/churn/3_churn_batch_inference.ipynb){:target="_blank"} | ## Integration Tutorials diff --git a/docs/user_guides/fs/feature_group/feature_monitoring.md b/docs/user_guides/fs/feature_group/feature_monitoring.md index 4720e2e7d..1412be9d9 100644 --- a/docs/user_guides/fs/feature_group/feature_monitoring.md +++ b/docs/user_guides/fs/feature_group/feature_monitoring.md @@ -9,7 +9,7 @@ Before continuing with this guide, see the [Feature monitoring guide](../feature ## Code -In this section, we show you how to setup feature monitoring in a Feature Group using the ==Hopsworks Python library==. Alternatively, you can get started quickly by running our [tutorial for feature monitoring](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/integrations/feature-monitoring/feature-monitoring.ipynb). +In this section, we show you how to setup feature monitoring in a Feature Group using the ==Hopsworks Python library==. Alternatively, you can get started quickly by running our [tutorial for feature monitoring](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/api_examples/feature_monitoring.ipynb). First, checkout the pre-requisite and Hopsworks setup to follow the guide below. Create a project, install the [Hopsworks Python library](https://pypi.org/project/hopsworks) in your environment, connect via the generated API key. The second step is to start a new configuration for feature monitoring. diff --git a/docs/user_guides/fs/feature_monitoring/feature_monitoring_advanced.md b/docs/user_guides/fs/feature_monitoring/feature_monitoring_advanced.md index b35f56df5..693044489 100644 --- a/docs/user_guides/fs/feature_monitoring/feature_monitoring_advanced.md +++ b/docs/user_guides/fs/feature_monitoring/feature_monitoring_advanced.md @@ -1,6 +1,6 @@ # Advanced guide -An introduction to Feature Monitoring can be found in the guides for [Feature Groups](../feature_group/feature_monitoring.md) and [Feature Views](../feature_view/feature_monitoring.md). In addition, you can get started quickly by running our [tutorial for feature monitoring](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/integrations/feature-monitoring/feature-monitoring.ipynb). +An introduction to Feature Monitoring can be found in the guides for [Feature Groups](../feature_group/feature_monitoring.md) and [Feature Views](../feature_view/feature_monitoring.md). In addition, you can get started quickly by running our [tutorial for feature monitoring](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/api_examples/feature_monitoring.ipynb). ## Retrieve feature monitoring configurations diff --git a/docs/user_guides/fs/feature_view/feature_monitoring.md b/docs/user_guides/fs/feature_view/feature_monitoring.md index 91f25c4b0..c0794cf1b 100644 --- a/docs/user_guides/fs/feature_view/feature_monitoring.md +++ b/docs/user_guides/fs/feature_view/feature_monitoring.md @@ -9,7 +9,7 @@ Before continuing with this guide, see the [Feature monitoring guide](../feature ## Code -In this section, we show you how to setup feature monitoring in a Feature View using the ==Hopsworks Python library==. Alternatively, you can get started quickly by running our [tutorial for feature monitoring](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/integrations/feature-monitoring/feature-monitoring.ipynb). +In this section, we show you how to setup feature monitoring in a Feature View using the ==Hopsworks Python library==. Alternatively, you can get started quickly by running our [tutorial for feature monitoring](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/api_examples/feature_monitoring.ipynb). First, checkout the pre-requisite and Hopsworks setup to follow the guide below. Create a project, install the [Hopsworks Python library](https://pypi.org/project/hopsworks) in your environment and connect via the generated API key. The second step is to start a new configuration for feature monitoring. diff --git a/docs/user_guides/fs/feature_view/overview.md b/docs/user_guides/fs/feature_view/overview.md index e5ab9e3ce..1a4da7632 100644 --- a/docs/user_guides/fs/feature_view/overview.md +++ b/docs/user_guides/fs/feature_view/overview.md @@ -44,7 +44,7 @@ If you want to understand more about the concept of feature view, you can refer .build(); ``` -You can refer to [query](./query.md) and [transformation function](./model-dependent-transformations.md) for creating `query` and `transformation_function`. To see a full example of how to create a feature view, you can read [this notebook](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/fraud_batch/2_feature_view_creation.ipynb). +You can refer to [query](./query.md) and [transformation function](./model-dependent-transformations.md) for creating `query` and `transformation_function`. To see a full example of how to create a feature view, you can read [this notebook](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/2_fraud_batch_training_pipeline.ipynb). ## Retrieval Once you have created a feature view, you can retrieve it by its name and version. diff --git a/docs/user_guides/fs/feature_view/training-data.md b/docs/user_guides/fs/feature_view/training-data.md index b17fed6a2..acb648eb4 100644 --- a/docs/user_guides/fs/feature_view/training-data.md +++ b/docs/user_guides/fs/feature_view/training-data.md @@ -2,7 +2,7 @@ Training data can be created from the feature view and used by different ML libraries for training different models. -You can read [training data concepts](../../../concepts/fs/feature_view/offline_api.md) for more details. To see a full example of how to create training data, you can read [this notebook](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/fraud_batch/2_feature_view_creation.ipynb). +You can read [training data concepts](../../../concepts/fs/feature_view/offline_api.md) for more details. To see a full example of how to create training data, you can read [this notebook](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/2_fraud_batch_training_pipeline.ipynb). For Python-clients, handling small or moderately-sized data, we recommend enabling the [ArrowFlight Server with DuckDB](../../../setup_installation/common/arrow_flight_duckdb.md) service, which will provide significant speedups over Spark/Hive for reading and creating in-memory training datasets. @@ -29,7 +29,7 @@ print(job.id) # get the job's id and view the job status in the UI ### Extra filters Sometimes data scientists need to train different models using subsets of a dataset. For example, there can be different models for different countries, seasons, and different groups. One way is to create different feature views for training different models. Another way is to add extra filters on top of the feature view when creating training data. -In the [transaction fraud example](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/fraud_batch/1_feature_groups.ipynb), there are different transaction categories, for example: "Health/Beauty", "Restaurant/Cafeteria", "Holliday/Travel" etc. Examples below show how to create training data for different transaction categories. +In the [transaction fraud example](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/1_fraud_batch_feature_pipeline.ipynb), there are different transaction categories, for example: "Health/Beauty", "Restaurant/Cafeteria", "Holliday/Travel" etc. Examples below show how to create training data for different transaction categories. ```python # Create a training dataset for Health/Beauty df_health = feature_view.training_data( diff --git a/docs/user_guides/fs/vector_similarity_search.md b/docs/user_guides/fs/vector_similarity_search.md index 780d76b1a..135885c32 100644 --- a/docs/user_guides/fs/vector_similarity_search.md +++ b/docs/user_guides/fs/vector_similarity_search.md @@ -108,4 +108,4 @@ There are 2 types of online feature stores in Hopsworks: online store (RonDB) an Create a new index per feature group to optimize retrieval performance. # Next step -Explore the [news search example](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/api_examples/hsfs/knn_search/news-search-knn.ipynb), demonstrating how to use Hopsworks for implementing a news search application using natural language in the application. Additionally, you can see the application of querying similar embeddings with additional features in this [news rank example](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/api_examples/hsfs/knn_search/news-search-rank-view.ipynb). +Explore the [news search example](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/api_examples/vector_similarity_search/1_feature_group_embeddings_api.ipynb), demonstrating how to use Hopsworks for implementing a news search application using natural language in the application. Additionally, you can see the application of querying similar embeddings with additional features in this [news rank example](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/api_examples/vector_similarity_search/2_feature_view_embeddings_api.ipynb). From 1349c3843835f8e395edc18a5bdc283ae4da9075 Mon Sep 17 00:00:00 2001 From: Maksym Zhytnikov <63515947+Maxxx-zh@users.noreply.github.com> Date: Tue, 21 Jan 2025 12:48:28 +0200 Subject: [PATCH 2/2] Remove colab links --- docs/tutorials/index.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/tutorials/index.md b/docs/tutorials/index.md index d50f17563..b1d4d077d 100644 --- a/docs/tutorials/index.md +++ b/docs/tutorials/index.md @@ -24,9 +24,9 @@ This is a batch use case variant of the fraud tutorial, it will give you a high | Notebooks | | | ----------- | ------------------------------------ | -| 1. How to load, engineer and create feature groups | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/1_fraud_batch_feature_pipeline.ipynb){:target="_blank"} | -| 2. How to create training datasets | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/2_fraud_batch_training_pipeline.ipynb){:target="_blank"} | -| 3. How to train a model from the feature store | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/3_fraud_batch_inference.ipynb){:target="_blank"} | +| 1. [How to load, engineer and create feature groups](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/1_fraud_batch_feature_pipeline.ipynb){:target="_blank"} | +| 2. [How to create training datasets](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/2_fraud_batch_training_pipeline.ipynb){:target="_blank"} | +| 3. [How to train a model from the feature store](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/3_fraud_batch_inference.ipynb){:target="_blank"} | ### Online This is a online use case variant of the fraud tutorial, it is similar to the batch use case, however, in this tutorial you will get introduced to the usage of Feature Groups which are kept in online storage, and how to access single feature vectors from the online storage @@ -34,9 +34,9 @@ at low latency. Additionally, the model will be deployed as a model serving inst | Notebooks | | | ----------- | ------------------------------------ | -| 1. How to load, engineer and create feature groups | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/real-time-ai-systems/fraud_online/1_fraud_online_feature_pipeline.ipynb){:target="_blank"} | -| 2. How to create training datasets | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/real-time-ai-systems/fraud_online/2_fraud_online_training_pipeline.ipynb){:target="_blank"} | -| 3. How to train a model from the feature store and deploying it as a serving instance together with the online feature store | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/real-time-ai-systems/fraud_online/3_fraud_online_inference_pipeline.ipynb){:target="_blank"} | +| 1. [How to load, engineer and create feature groups](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/real-time-ai-systems/fraud_online/1_fraud_online_feature_pipeline.ipynb){:target="_blank"} | +| 2. [How to create training datasets](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/real-time-ai-systems/fraud_online/2_fraud_online_training_pipeline.ipynb){:target="_blank"} | +| 3. [How to train a model from the feature store and deploying it as a serving instance together with the online feature store](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/real-time-ai-systems/fraud_online/3_fraud_online_inference_pipeline.ipynb){:target="_blank"} | ## Churn Tutorial