diff --git a/docs/assets/images/guides/integrations/parameter-store-policy.png b/docs/assets/images/guides/integrations/parameter-store-policy.png
deleted file mode 100644
index 28e1617b6..000000000
Binary files a/docs/assets/images/guides/integrations/parameter-store-policy.png and /dev/null differ
diff --git a/docs/assets/images/guides/integrations/parameter-store.png b/docs/assets/images/guides/integrations/parameter-store.png
deleted file mode 100644
index e03fe8368..000000000
Binary files a/docs/assets/images/guides/integrations/parameter-store.png and /dev/null differ
diff --git a/docs/assets/images/guides/integrations/sagemaker-role.png b/docs/assets/images/guides/integrations/sagemaker-role.png
deleted file mode 100644
index ebd5db7c5..000000000
Binary files a/docs/assets/images/guides/integrations/sagemaker-role.png and /dev/null differ
diff --git a/docs/assets/images/guides/integrations/secrets-manager-1.png b/docs/assets/images/guides/integrations/secrets-manager-1.png
deleted file mode 100644
index d47384e1b..000000000
Binary files a/docs/assets/images/guides/integrations/secrets-manager-1.png and /dev/null differ
diff --git a/docs/assets/images/guides/integrations/secrets-manager-2.png b/docs/assets/images/guides/integrations/secrets-manager-2.png
deleted file mode 100644
index 2cbc215d0..000000000
Binary files a/docs/assets/images/guides/integrations/secrets-manager-2.png and /dev/null differ
diff --git a/docs/assets/images/guides/integrations/secrets-manager-policy.png b/docs/assets/images/guides/integrations/secrets-manager-policy.png
deleted file mode 100644
index 0115c3e8e..000000000
Binary files a/docs/assets/images/guides/integrations/secrets-manager-policy.png and /dev/null differ
diff --git a/docs/user_guides/client_installation/index.md b/docs/user_guides/client_installation/index.md
index 9a2d296dc..c67afd4f6 100644
--- a/docs/user_guides/client_installation/index.md
+++ b/docs/user_guides/client_installation/index.md
@@ -1,56 +1,115 @@
---
-description: Documentation on how to install the Hopsworks and HSFS Python libraries, including the specific requirements for Mac OSX and Windows.
+description: Documentation on how to install the Hopsworks Python and Java libraries.
---
# Client Installation Guide
-## Hopsworks (including Feature Store and MLOps)
-The Hopsworks client library is required to connect to the Hopsworks Feature Store and MLOps services from your local machine or any other Python environment such as Google Colab or AWS Sagemaker. Execute the following command to install the full Hopsworks client library in your Python environment:
+## Hopsworks Python library
+
+The Hopsworks Python client library is required to connect to Hopsworks from your local machine or any other Python environment such as Google Colab or AWS SageMaker. Execute the following command to install the Hopsworks client library in your Python environment:
!!! note "Virtual environment"
It is recommended to use a virtual python environment instead of the system environment used by your operating system, in order to avoid any side effects regarding interfering dependencies.
-```bash
-pip install hopsworks
-```
-Supported versions of Python: 3.8, 3.9, 3.10, 3.11, 3.12 ([PyPI ↗](https://pypi.org/project/hopsworks/))
-
-!!! attention "OSX Installation"
- Hopsworks latest version should work on OSX systems without any additional requirements. However if installing an older version of the Hopsworks SDK you might need to install `librdkafka` manually. Checkout the documentation for the specific version you are installing.
-
!!! attention "Windows/Conda Installation"
On Windows systems you might need to install twofish manually before installing hopsworks, if you don't have the Microsoft Visual C++ Build Tools installed. In that case, it is recommended to use a conda environment and run the following commands:
```bash
conda install twofish
- pip install hopsworks
+ pip install hopsworks[python]
```
-## Feature Store only
-To only install the Hopsworks Feature Store client library, execute the following command:
+```bash
+pip install hopsworks[python]
+# or if using zsh
+pip install 'hopsworks[python]'
+```
+Supported versions of Python: 3.8, 3.9, 3.10, 3.11, 3.12 ([PyPI ↗](https://pypi.org/project/hopsworks/))
+
+### Profiles
+
+The Hopsworks library provides several installation profiles that pull in additional dependencies and enable additional functionality:
+
+| Profile Name | Description |
+| ------------------ | ------------- |
+| No Profile | The base installation. It supports interacting with the feature store metadata, the model registry and deployments, as well as reading from and writing to the feature store from PySpark environments. |
+| `python` | This profile enables reading from and writing to the feature store from a pure Python environment. |
+| `great-expectations` | This profile installs the [Great Expectations](https://greatexpectations.io/) Python library and enables data validation on feature pipelines. |
+| `polars` | This profile installs the [Polars](https://pola.rs/) library and enables reading and writing Polars DataFrames. |
+
+You can install all the above profiles with the following command:
```bash
-pip install hsfs[python]
-# or if using zsh
-pip install 'hsfs[python]'
+pip install hopsworks[python,great-expectations,polars]
```
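+
+As a quick check after installing, you can log in and grab the feature store handle; with the `polars` profile you can also read feature groups back as Polars DataFrames. Below is a minimal sketch; the host, project, API key and feature group name are placeholders:
+
+```python
+import hopsworks
+
+project = hopsworks.login(
+    host='my_instance',        # DNS of your Hopsworks instance
+    project='my_project',      # Name of your Hopsworks project
+    api_key_value='apikey',    # The API key to authenticate with Hopsworks
+)
+fs = project.get_feature_store()
+
+# With the `polars` profile installed, an existing feature group can be
+# read back as a Polars DataFrame (assuming your deployment supports it)
+fg = fs.get_feature_group('my_fg', version=1)
+df = fg.read(dataframe_type="polars")
+```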
-Supported versions of Python: 3.8, 3.9, 3.10, 3.11, 3.12 ([PyPI ↗](https://pypi.org/project/hsfs/))
-!!! attention "OSX Installation"
- Hopsworks latest version should work on OSX systems without any additional requirements. However if installing an older version of the Hopsworks SDK you might need to install `librdkafka` manually. Checkout the documentation for the specific version you are installing.
+## HSFS Java Library
-!!! attention "Windows/Conda Installation"
+If you want to interact with the Hopsworks Feature Store from environments such as Spark, Flink or Beam, you can use the Hopsworks Feature Store (HSFS) Java library.
- On Windows systems you might need to install twofish manually before installing hsfs, if you don't have the Microsoft Visual C++ Build Tools installed. In that case, it is recommended to use a conda environment and run the following commands:
-
- ```bash
- conda install twofish
- pip install hsfs[python]
- ```
+!!! note "Feature Store Only"
+
+    The Java library only allows interaction with the Feature Store component of the Hopsworks platform. Additionally, each environment might restrict the supported API operations. You can see which API operations are supported by each environment [here](../fs/compute_engines).
+
+The HSFS library is available in the Hopsworks Maven repository. If you are using Maven as your build tool, you can add the following to your `pom.xml` file:
+
+```xml
+<repositories>
+    <repository>
+        <id>Hops</id>
+        <name>Hops Repository</name>
+        <url>https://archiva.hops.works/repository/Hops/</url>
+        <releases>
+            <enabled>true</enabled>
+        </releases>
+        <snapshots>
+            <enabled>true</enabled>
+        </snapshots>
+    </repository>
+</repositories>
+```
+
+The library has different builds targeting different environments:
+
+### Spark
+
+The `artifactId` for the Spark build is `hsfs-spark-spark{spark.version}`. If you are using Maven as your build tool, you can add the following dependency:
+
+```xml
+<dependency>
+    <groupId>com.logicalclocks</groupId>
+    <artifactId>hsfs-spark-spark3.1</artifactId>
+    <version>${hsfs.version}</version>
+</dependency>
+```
+
+Hopsworks provides builds for Spark 3.1, 3.3 and 3.5. The builds are also provided as JAR files, which can be downloaded from the [Hopsworks repository](https://repo.hops.works/master/hsfs).
+
+### Flink
+
+The `artifactId` for the Flink build is `hsfs-flink`. If you are using Maven as your build tool, you can add the following dependency:
+
+```xml
+<dependency>
+    <groupId>com.logicalclocks</groupId>
+    <artifactId>hsfs-flink</artifactId>
+    <version>${hsfs.version}</version>
+</dependency>
+```
+
+### Beam
+
+The `artifactId` for the Beam build is `hsfs-beam`. If you are using Maven as your build tool, you can add the following dependency:
+
+```xml
+<dependency>
+    <groupId>com.logicalclocks</groupId>
+    <artifactId>hsfs-beam</artifactId>
+    <version>${hsfs.version}</version>
+</dependency>
+```
## Next Steps
-If you are using a local python environment and want to connect to the Hopsworks Feature Store, you can follow the [Python Guide](../integrations/python.md#generate-an-api-key) section to create an API Key and to get started.
+If you are using a local Python environment and want to connect to Hopsworks, you can follow the [Python Guide](../integrations/python.md#generate-an-api-key) to create an API key and get started.
## Other environments
diff --git a/docs/user_guides/fs/sharing/sharing.md b/docs/user_guides/fs/sharing/sharing.md
index 9fcf04a9b..206845f4c 100644
--- a/docs/user_guides/fs/sharing/sharing.md
+++ b/docs/user_guides/fs/sharing/sharing.md
@@ -64,12 +64,12 @@ To access features from a shared feature store you need to first retrieve the ha
To retrieve the handle use the get_feature_store() method and provide the name of the shared feature store
```python
-import hsfs
+import hopsworks
-connection = hsfs.connection()
+project = hopsworks.login()
-project_feature_store = connection.get_feature_store()
-shared_feature_store = connection.get_feature_store(name="name_of_shared_feature_store")
+project_feature_store = project.get_feature_store()
+shared_feature_store = project.get_feature_store(name="name_of_shared_feature_store")
```
### Step 2: Fetch feature groups
diff --git a/docs/user_guides/fs/storage_connector/usage.md b/docs/user_guides/fs/storage_connector/usage.md
index 191b58f36..0f224126e 100644
--- a/docs/user_guides/fs/storage_connector/usage.md
+++ b/docs/user_guides/fs/storage_connector/usage.md
@@ -14,11 +14,10 @@ We retrieve a storage connector simply by its unique name.
=== "PySpark"
```python
- import hsfs
+ import hopsworks
# Connect to the Hopsworks feature store
- hsfs_connection = hsfs.connection()
- # Retrieve the metadata handle
- feature_store = hsfs_connection.get_feature_store()
+ project = hopsworks.login()
+ feature_store = project.get_feature_store()
# Retrieve storage connector
connector = feature_store.get_storage_connector('connector_name')
```
diff --git a/docs/user_guides/integrations/databricks/api_key.md b/docs/user_guides/integrations/databricks/api_key.md
index 68feaee28..dc7b4fa84 100644
--- a/docs/user_guides/integrations/databricks/api_key.md
+++ b/docs/user_guides/integrations/databricks/api_key.md
@@ -1,6 +1,6 @@
# Hopsworks API key
-In order for the Databricks cluster to be able to communicate with the Hopsworks Feature Store, the clients running on Databricks need to be able to access a Hopsworks API key.
+In order for the Databricks cluster to be able to communicate with Hopsworks, clients running on Databricks need to be able to access a Hopsworks API key.
## Generate an API key
@@ -15,127 +15,19 @@ For instructions on how to generate an API key follow this [user guide](../../pr
!!! hint "API key as Argument"
-    To get started quickly, without saving the Hopsworks API in a secret storage, you can simply supply it as an argument when instantiating a connection:
+    To get started quickly, without saving the Hopsworks API key in a secret storage, you can simply supply it as an argument when logging in:
- ```python hl_lines="6"
- import hsfs
- conn = hsfs.connection(
- host='my_instance', # DNS of your Feature Store instance
- port=443, # Port to reach your Hopsworks instance, defaults to 443
- project='my_project', # Name of your Hopsworks Feature Store project
- api_key_value='apikey', # The API key to authenticate with Hopsworks
- hostname_verification=True # Disable for self-signed certificates
- )
- fs = conn.get_feature_store() # Get the project's default feature store
- ```
-## Store the API key
-### AWS
-
-#### Step 1: Create an instance profile to attach to your Databricks clusters
-
-Go to the *AWS IAM* choose *Roles* and click on *Create Role*. Select *AWS Service* as the type of trusted entity and *EC2* as the use case as shown below:
-
-
-
-
- Create an instance profile
-
-
-
-Click on *Next: Permissions*, *Next:Tags*, and then *Next: Review*. Name the instance profile role and then click *Create role*.
-
-#### Step 2: Storing the API Key
-
-**Option 1: Using the AWS Systems Manager Parameter Store**
-
-In the AWS Management Console, ensure that your active region is the region you use for Databricks.
-Go to the *AWS Systems Manager* choose *Parameter Store* and select *Create Parameter*.
-As name enter `/hopsworks/role/[MY_DATABRICKS_ROLE]/type/api-key` replacing `[MY_DATABRICKS_ROLE]` with the name of the AWS role you have created in [Step 1](#step-1-create-an-instance-profile-to-attach-to-your-databricks-clusters). Select *Secure String* as type and create the parameter.
-
-
-
-
- Storing the Feature Store API key in the Parameter Store
-
-
-
-
-Once the API Key is stored, you need to grant access to it from the AWS role that you have created in [Step 1](#step-1-create-an-instance-profile-to-attach-to-your-databricks-clusters).
-In the AWS Management Console, go to *IAM*, select *Roles* and then search for the role that you have created in [Step 1](#step-1-create-an-instance-profile-to-attach-to-your-databricks-clusters).
-Select *Add inline policy*. Choose *Systems Manager* as service, expand the *Read* access level and check *GetParameter*.
-Expand Resources and select *Add ARN*.
-Enter the region of the *Systems Manager* as well as the name of the parameter **WITHOUT the leading slash** e.g. *hopsworks/role/[MY_DATABRICKS_ROLE]/type/api-key* and click *Add*.
-Click on *Review*, give the policy a name and click on *Create policy*.
-
-
-
-
- Configuring the access policy for the Parameter Store
-
-
-
-
-**Option 2: Using the AWS Secrets Manager**
-
-In the AWS management console ensure that your active region is the region you use for Databricks.
-Go to the *AWS Secrets Manager* and select *Store new secret*. Select *Other type of secrets* and add *api-key*
-as the key and paste the API key created in the previous step as the value. Click next.
-
-
-
-
- Storing a Feature Store API key in the Secrets Manager Step 1
-
-
-
-As secret name, enter *hopsworks/role/[MY_DATABRICKS_ROLE]* replacing [MY_DATABRICKS_ROLE] with the AWS role you have created in [Step 1](#step-1-create-an-instance-profile-to-attach-to-your-databricks-clusters). Select next twice and finally store the secret.
-Then click on the secret in the secrets list and take note of the *Secret ARN*.
-
-
-
-
- Storing a Feature Store API key in the Secrets Manager Step 2
-
-
-
-Once the API Key is stored, you need to grant access to it from the AWS role that you have created in [Step 1](#step-1-create-an-instance-profile-to-attach-to-your-databricks-clusters).
-In the AWS Management Console, go to *IAM*, select *Roles* and then the role that that you have created in [Step 1](#step-1-create-an-instance-profile-to-attach-to-your-databricks-clusters).
-Select *Add inline policy*. Choose *Secrets Manager* as service, expand the *Read* access level and check *GetSecretValue*.
-Expand Resources and select *Add ARN*. Paste the ARN of the secret created in the previous step.
-Click on *Review*, give the policy a name and click on *Create policy*.
-
-
-
-
- Configuring the access policy for the Secrets Manager
-
-
-
-#### Step 3: Allow Databricks to use the AWS role created in Step 1
-
-First you need to get the AWS role used by Databricks for deployments as described in [this step](https://docs.databricks.com/administration-guide/cloud-configurations/aws/instance-profiles.html#step-3-note-the-iam-role-used-to-create-the-databricks-deployment). Once you get the role name, go to *AWS IAM*, search for the role, and click on it. Then, select the *Permissions* tab, click on *Add inline policy*, select the *JSON* tab, and paste the following snippet. Replace *[ACCOUNT_ID]* with your AWS account id, and *[MY_DATABRICKS_ROLE]* with the AWS role name created in [Step 1](#step-1-create-an-instance-profile-to-attach-to-your-databricks-clusters).
-
-```json
-{
- "Version": "2012-10-17",
- "Statement": [
- {
- "Sid": "PassRole",
- "Effect": "Allow",
- "Action": "iam:PassRole",
- "Resource": "arn:aws:iam::[ACCOUNT_ID]:role/[MY_DATABRICKS_ROLE]"
- }
- ]
-}
+    ```python hl_lines="6"
+    import hopsworks
+    project = hopsworks.login(
+        host='my_instance',        # DNS of your Hopsworks instance
+        port=443,                  # Port to reach your Hopsworks instance, defaults to 443
+        project='my_project',      # Name of your Hopsworks project
+        api_key_value='apikey',    # The API key to authenticate with Hopsworks
+    )
+    fs = project.get_feature_store()  # Get the project's default feature store
```
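+
+If you prefer not to keep the API key in the notebook source, one option is to store it in a Databricks secret scope and read it at login time. The sketch below assumes a scope named `hopsworks` holding the key under `api-key`; both names are placeholders that you create yourself:
+
+```python
+import hopsworks
+
+# Hypothetical scope and key names; create the secret scope beforehand,
+# e.g. via the Databricks CLI or workspace UI, and store your API key in it.
+# `dbutils` is available by default in Databricks notebooks.
+api_key = dbutils.secrets.get(scope="hopsworks", key="api-key")
+
+project = hopsworks.login(
+    host='my_instance',        # DNS of your Hopsworks instance
+    port=443,
+    project='my_project',
+    api_key_value=api_key,
+)
+```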
-Click *Review Policy*, name the policy, and click *Create Policy*. Then, go to your Databricks workspace and follow [this step](https://docs.databricks.com/administration-guide/cloud-configurations/aws/instance-profiles.html#step-5-add-the-instance-profile-to-databricks) to add the instance profile to your workspace. Finally, when launching Databricks clusters, select *Advanced* settings and choose the instance profile you have just added.
-
-
-### Azure
-
-On Azure we currently do not support storing the API key in a secret storage. Instead just store the API key in a file in your Databricks workspace so you can access it when connecting to the Feature Store.
-
## Next Steps
Continue with the [configuration guide](configuration.md) to finalize the configuration of the Databricks Cluster to communicate with the Hopsworks Feature Store.
diff --git a/docs/user_guides/integrations/databricks/configuration.md b/docs/user_guides/integrations/databricks/configuration.md
index 09cd2852d..96d446871 100644
--- a/docs/user_guides/integrations/databricks/configuration.md
+++ b/docs/user_guides/integrations/databricks/configuration.md
@@ -90,38 +90,16 @@ When a cluster is configured for a specific project user, all the operations wit
At the end of the configuration, Hopsworks will start the cluster.
Once the cluster is running users can establish a connection to the Hopsworks Feature Store from Databricks:
-!!! note "API key on Azure"
- Please note, for Azure it is necessary to store the Hopsworks API key locally on the cluster as a file. As we currently do not support storing the API key on an Azure Secret Management Service as we do for AWS. Consult the [API key guide for Azure](api_key.md#azure), for more information.
-
-=== "AWS"
-
- ```python
- import hsfs
- conn = hsfs.connection(
- 'my_instance', # DNS of your Feature Store instance
- 443, # Port to reach your Hopsworks instance, defaults to 443
- 'my_project', # Name of your Hopsworks Feature Store project
- secrets_store='secretsmanager', # Either parameterstore or secretsmanager
- hostname_verification=True # Disable for self-signed certificates
- )
- fs = conn.get_feature_store() # Get the project's default feature store
- ```
-
-=== "Azure"
-
- ```python
- import hsfs
- conn = hsfs.connection(
- 'my_instance', # DNS of your Feature Store instance
- 443, # Port to reach your Hopsworks instance, defaults to 443
- 'my_project', # Name of your Hopsworks Feature Store project
- secrets_store='local',
- api_key_file="featurestore.key", # For Azure, store the API key locally
- secrets_store = "local",
- hostname_verification=True # Disable for self-signed certificates
- )
- fs = conn.get_feature_store() # Get the project's default feature store
- ```
+```python
+import hopsworks
+project = hopsworks.login(
+ host='my_instance', # DNS of your Hopsworks instance
+ port=443, # Port to reach your Hopsworks instance, defaults to 443
+ project='my_project', # Name of your Hopsworks project
+ api_key_value='apikey', # The API key to authenticate with Hopsworks
+)
+fs = project.get_feature_store() # Get the project's default feature store
+```
## Next Steps
diff --git a/docs/user_guides/integrations/emr/emr_configuration.md b/docs/user_guides/integrations/emr/emr_configuration.md
index e9a178162..18cb67b4c 100644
--- a/docs/user_guides/integrations/emr/emr_configuration.md
+++ b/docs/user_guides/integrations/emr/emr_configuration.md
@@ -166,17 +166,19 @@ echo -n $(curl -H "Authorization: ApiKey ${API_KEY}" https://$HOST/hopsworks-api
chmod -R o-rwx /usr/lib/hopsworks
-sudo pip3 install --upgrade hsfs~=X.X.0
+sudo pip3 install --upgrade hopsworks~=X.X.0
```
-!!! note
- Don't forget to replace X.X.0 with the major and minor version of your Hopsworks deployment.
+!!! attention "Matching Hopsworks version"
-
-
-
- To find your Hopsworks version, enter any of your projects and go to the settings tab inside your project.
-
-
+    We recommend that the major and minor version of the Python library match the major and minor version of the Hopsworks deployment, so replace `X.X.0` above accordingly.
+
+    You can find the Hopsworks version in the settings tab of any of your projects.
Add the bootstrap actions when configuring your EMR cluster. Provide 3 arguments to the bootstrap action: The name of the API key secret e.g., `hopsworks/featurestore`,
the public DNS name of your Hopsworks cluster, such as `ad005770-33b5-11eb-b5a7-bfabd757769f.cloud.hopsworks.ai`, and the name of your Hopsworks project, e.g. `demo_fs_meb10179`.
diff --git a/docs/user_guides/integrations/hdinsight.md b/docs/user_guides/integrations/hdinsight.md
index cb635b00c..b13e48ed2 100644
--- a/docs/user_guides/integrations/hdinsight.md
+++ b/docs/user_guides/integrations/hdinsight.md
@@ -27,11 +27,12 @@ HDInsight requires Hopsworks connectors to be able to communicate with the Hopsw
The script action needs to be applied head and worker nodes and can be applied during cluster creation or to an existing cluster. Ensure to persist the script action so that it is run on newly created nodes. For more information about how to use script actions, see [Customize Azure HDInsight clusters by using script actions](https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-customize-cluster-linux).
!!! attention "Matching Hopsworks version"
- The **major version of `HSFS`** needs to match the **major version of Hopsworks**. Check [PyPI](https://pypi.org/project/hsfs/#history) for available releases.
+
+ We recommend that the major and minor version of the Python library match the major and minor version of the Hopsworks deployment.
-
+    You can find the Hopsworks version in the settings tab of any of your projects.
@@ -42,14 +43,14 @@ set -e
HOST="MY_INSTANCE.cloud.hopsworks.ai" # DNS of your Feature Store instance
PROJECT="MY_PROJECT" # Port to reach your Hopsworks instance, defaults to 443
-HSFS_VERSION="MY_VERSION" # The major version of HSFS needs to match the major version of Hopsworks
+HOPSWORKS_VERSION="MY_VERSION" # The major and minor version of the Hopsworks library should match your Hopsworks deployment
API_KEY="MY_API_KEY" # The API key to authenticate with Hopsworks
CONDA_ENV="MY_CONDA_ENV" # py35 is the default for HDI 3.6
apt-get --assume-yes install python3-dev
apt-get --assume-yes install jq
-/usr/bin/anaconda/envs/$CONDA_ENV/bin/pip install hsfs==$HSFS_VERSION
+/usr/bin/anaconda/envs/$CONDA_ENV/bin/pip install hopsworks==$HOPSWORKS_VERSION
PROJECT_ID=$(curl -H "Authorization: ApiKey ${API_KEY}" https://$HOST/hopsworks-api/api/project/getProjectInfo/$PROJECT | jq -r .projectId)
@@ -120,25 +121,25 @@ hive.metastore.uris=thrift://MY_HOPSWORKS_INSTANCE_PRIVATE_IP:9083
You are now ready to connect to the Hopsworks Feature Store, for instance using a Jupyter notebook in HDInsight with a PySpark3 kernel:
```python
-import hsfs
+import hopsworks
# Put the API key into Key Vault for any production setup:
# See, https://azure.microsoft.com/en-us/services/key-vault/
secret_value = 'MY_API_KEY'
# Create a connection
-conn = hsfs.connection(
+project = hopsworks.login(
host='MY_INSTANCE.cloud.hopsworks.ai', # DNS of your Feature Store instance
port=443, # Port to reach your Hopsworks instance, defaults to 443
- project='MY_PROJECT', # Name of your Hopsworks Feature Store project
+ project='MY_PROJECT', # Name of your Hopsworks project
api_key_value=secret_value, # The API key to authenticate with Hopsworks
hostname_verification=True # Disable for self-signed certificates
)
# Get the feature store handle for the project's feature store
-fs = conn.get_feature_store()
+fs = project.get_feature_store()
```
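+
+To sanity-check the connection, you can fetch one of your feature groups and read it (a minimal sketch; the feature group name `my_fg` is a placeholder):
+
+```python
+# Retrieve an existing feature group and read its contents
+fg = fs.get_feature_group('my_fg', version=1)
+df = fg.read()
+```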
## Next Steps
-For more information about how to use the Feature Store, see the [Quickstart Guide](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/quickstart.ipynb){:target="_blank"}.
+For more information on how to use the Hopsworks API, check out the other guides or the [API Reference](https://docs.hopsworks.ai/hopsworks-api/{{{ hopsworks_version }}}/generated/api/connection_api/).
\ No newline at end of file
diff --git a/docs/user_guides/integrations/index.md b/docs/user_guides/integrations/index.md
index db8d38b1d..a68842daf 100644
--- a/docs/user_guides/integrations/index.md
+++ b/docs/user_guides/integrations/index.md
@@ -2,10 +2,11 @@
Hopsworks is an open platform aiming to be accessible from a variety of tools. Learn in this section how to connect to Hopsworks from
-- [Python](python)
+- [Python, AWS SageMaker, Google Colab, Kubeflow](python)
- [Databricks](databricks/networking)
-- [AWS SageMaker](sagemaker)
-- [AWS EMR](emr/networking)
+- [AWS EMR](emr/emr_configuration)
- [Azure HDInsight](hdinsight)
- [Azure Machine Learning](mlstudio_designer)
- [Apache Spark](spark)
+- [Apache Flink](flink)
+- [Apache Beam](beam)
diff --git a/docs/user_guides/integrations/mlstudio_designer.md b/docs/user_guides/integrations/mlstudio_designer.md
index 4ac44998a..03aca4ca5 100644
--- a/docs/user_guides/integrations/mlstudio_designer.md
+++ b/docs/user_guides/integrations/mlstudio_designer.md
@@ -1,6 +1,6 @@
# Azure Machine Learning Designer Integration
-Connecting to the Feature Store from the Azure Machine Learning Designer requires setting up a Feature Store API key for the Designer and installing the **HSFS** on the Designer. This guide explains step by step how to connect to the Feature Store from Azure Machine Learning Designer.
+Connecting to Hopsworks from the Azure Machine Learning Designer requires setting up a Hopsworks API key for the Designer and installing the **Hopsworks** Python library on the Designer. This guide explains step by step how to connect to Hopsworks from the Azure Machine Learning Designer.
!!! info "Network Connectivity"
@@ -15,9 +15,9 @@ For instructions on how to generate an API key follow this [user guide](../proje
3. job
4. kafka
-## Connect to the Feature Store
+## Connect to Hopsworks
-To connect to the Feature Store from the Azure Machine Learning Designer, create a new pipeline or open an existing one:
+To connect to Hopsworks from the Azure Machine Learning Designer, create a new pipeline or open an existing one:
@@ -30,18 +30,18 @@ In the pipeline, add a new `Execute Python Script` step and replace the Python s
-
- Add the code to access the Feature Store
+
+    Add the code to access Hopsworks
!!! info "Updating the script"
- Replace MY_VERSION, MY_API_KEY, MY_INSTANCE, MY_PROJECT and MY_FEATURE_GROUP with the respective values. The major version set for MY_VERSION needs to match the major version of Hopsworks. Check [PyPI](https://pypi.org/project/hsfs/#history) for available releases.
+    Replace MY_VERSION, MY_API_KEY, MY_INSTANCE, MY_PROJECT and MY_FEATURE_GROUP with the respective values. We recommend that the major and minor version set for MY_VERSION match the major and minor version of your Hopsworks deployment. Check [PyPI](https://pypi.org/project/hopsworks/#history) for available releases.
-
+    You can find the Hopsworks version in the settings tab of any of your projects.
@@ -51,7 +51,7 @@ import os
import importlib.util
-package_name = 'hsfs'
+package_name = 'hopsworks'
version = 'MY_VERSION'
spec = importlib.util.find_spec(package_name)
if spec is None:
@@ -67,16 +67,16 @@ secret_value = 'MY_API_KEY'
def azureml_main(dataframe1 = None, dataframe2 = None):
- import hsfs
- conn = hsfs.connection(
- host='MY_INSTANCE.cloud.hopsworks.ai', # DNS of your Feature Store instance
+ import hopsworks
+ project = hopsworks.login(
+ host='MY_INSTANCE.cloud.hopsworks.ai', # DNS of your Hopsworks instance
port=443, # Port to reach your Hopsworks instance, defaults to 443
- project='MY_PROJECT', # Name of your Hopsworks Feature Store project
+ project='MY_PROJECT', # Name of your Hopsworks project
api_key_value=secret_value, # The API key to authenticate with Hopsworks
hostname_verification=True, # Disable for self-signed certificates
engine='python' # Choose python as engine
)
- fs = conn.get_feature_store() # Get the project's default feature store
+ fs = project.get_feature_store() # Get the project's default feature store
return fs.get_feature_group('MY_FEATURE_GROUP', version=1).read(),
```
@@ -121,7 +121,7 @@ Finally, submit the pipeline and wait for it to finish:
!!! info "Performance on the first execution"
- The `Execute Python Script` step can be slow when being executed for the first time as the HSFS library needs to be installed on the compute target. Subsequent executions on the same compute target should use the already installed library.
+ The `Execute Python Script` step can be slow when being executed for the first time as the Hopsworks library needs to be installed on the compute target. Subsequent executions on the same compute target should use the already installed library.
@@ -132,4 +132,4 @@ Finally, submit the pipeline and wait for it to finish:
## Next Steps
-For more information about how to use the Feature Store, see the [Quickstart Guide](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/quickstart.ipynb){:target="_blank"}.
+For more information on how to use the Hopsworks API, check out the other guides or the [API Reference](https://docs.hopsworks.ai/hopsworks-api/{{{ hopsworks_version }}}/generated/api/connection_api/).
\ No newline at end of file
diff --git a/docs/user_guides/integrations/mlstudio_notebooks.md b/docs/user_guides/integrations/mlstudio_notebooks.md
index e97fa8351..a879856e7 100644
--- a/docs/user_guides/integrations/mlstudio_notebooks.md
+++ b/docs/user_guides/integrations/mlstudio_notebooks.md
@@ -1,11 +1,34 @@
# Azure Machine Learning Notebooks Integration
-Connecting to the Feature Store from Azure Machine Learning Notebooks requires setting up a Feature Store API key for Azure Machine Learning Notebooks and installing the **HSFS** on the notebook. This guide explains step by step how to connect to the Feature Store from Azure Machine Learning Notebooks.
+Connecting to Hopsworks from Azure Machine Learning Notebooks requires setting up a Hopsworks API key for Azure Machine Learning Notebooks and installing the **Hopsworks** Python library in the notebook. This guide explains step by step how to connect to Hopsworks from Azure Machine Learning Notebooks.
!!! info "Network Connectivity"
To be able to connect to the Feature Store, please ensure that the Network Security Group of your Hopsworks instance on Azure is configured to allow incoming traffic from your compute target on ports 443, 9083 and 9085 (443,9083,9085). See [Network security groups](https://docs.microsoft.com/en-us/azure/virtual-network/network-security-groups-overview) for more information. If your compute target is not in the same VNet as your Hopsworks instance and the Hopsworks instance is not accessible from the internet then you will need to configure [Virtual Network Peering](https://docs.microsoft.com/en-us/azure/virtual-network/virtual-network-manage-peering).
+## Install Hopsworks Python Library
+
+To be able to interact with Hopsworks from a Python environment, you need to install the `Hopsworks` Python library. The library is available on [PyPI](https://pypi.org/project/hopsworks/) and can be installed using `pip`:
+
+```
+pip install hopsworks[python]~=[HOPSWORKS_VERSION]
+```
+
+!!! attention "Python Profile"
+
+    By default, `pip install hopsworks` does not install all the necessary dependencies required to use the Hopsworks library from a local Python environment. To ensure that all the dependencies are installed, you should install the library with the Python profile: `pip install hopsworks[python]`.
+
+!!! attention "Matching Hopsworks version"
+
+ We recommend that the major and minor version of the Python library match the major and minor version of the Hopsworks deployment.
+
+
+    You can find the Hopsworks version in the settings tab of any of your projects.
+
## Generate an API key
For instructions on how to generate an API key follow this [user guide](../projects/api_key/create_api_key.md). For the Azure ML Notebooks integration to work correctly make sure you add the following scopes to your API key:
@@ -17,7 +40,7 @@ For instructions on how to generate an API key follow this [user guide](../proje
## Connect from an Azure Machine Learning Notebook
-To access the Feature Store from Azure Machine Learning, open a Python notebook and proceed with the following steps to install HSFS and connect to the Feature Store:
+To access Hopsworks from Azure Machine Learning, open a Python notebook and proceed with the following steps to install Hopsworks and connect to the Feature Store:
@@ -26,34 +49,12 @@ To access the Feature Store from Azure Machine Learning, open a Python notebook
-### Install **HSFS**
-
-To be able to access the Hopsworks Feature Store, the `HSFS` Python library needs to be installed. One way of achieving this is by opening a Python notebook in Azure Machine Learning and installing the `HSFS` with a magic command and pip:
-
-```
-!pip install hsfs[python]~=[HOPSWORKS_VERSION]
-```
-
-!!! attention "Hive Dependencies"
-
- By default, `HSFS` assumes Spark is used as execution engine and therefore Hive dependencies are not installed. Hence, if you are using a regular Python Kernel **without Spark**, make sure to install the **"python"** extra dependencies (`hsfs[python]`).
-
-!!! attention "Matching Hopsworks version"
- The **major version of `HSFS`** needs to match the **major version of Hopsworks**. Check [PyPI](https://pypi.org/project/hsfs/#history) for available releases.
-
-
-
-
- You find the Hopsworks version inside any of your Project's settings tab on Hopsworks
-
-
-
-### Connect to the Feature Store
+### Connect to Hopsworks
-You are now ready to connect to the Hopsworks Feature Store from the notebook:
+You are now ready to connect to the Hopsworks Feature Store from the notebook:
```python
-import hsfs
+import hopsworks
# Put the API key into Key Vault for any production setup:
# See, https://docs.microsoft.com/en-us/azure/machine-learning/how-to-use-secrets-in-runs
@@ -63,19 +64,19 @@ import hsfs
secret_value = 'MY_API_KEY'
# Create a connection
-conn = hsfs.connection(
- host='MY_INSTANCE.cloud.hopsworks.ai', # DNS of your Feature Store instance
+project = hopsworks.login(
+ host='MY_INSTANCE.cloud.hopsworks.ai', # DNS of your Hopsworks instance
port=443, # Port to reach your Hopsworks instance, defaults to 443
- project='MY_PROJECT', # Name of your Hopsworks Feature Store project
+ project='MY_PROJECT', # Name of your Hopsworks project
api_key_value=secret_value, # The API key to authenticate with Hopsworks
hostname_verification=True, # Disable for self-signed certificates
engine='python' # Choose Python as engine
)
# Get the feature store handle for the project's feature store
-fs = conn.get_feature_store()
+fs = project.get_feature_store()
```
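+
+Rather than pasting the API key into the notebook, you can pull it from the workspace Key Vault, as the comment in the snippet above suggests. Below is a sketch using the Azure ML SDK; the secret name `hopsworks-api-key` is a placeholder you choose when storing the secret:
+
+```python
+from azureml.core import Workspace
+
+ws = Workspace.from_config()              # Load the notebook's workspace configuration
+keyvault = ws.get_default_keyvault()      # Key Vault associated with the workspace
+secret_value = keyvault.get_secret(name="hopsworks-api-key")  # Placeholder secret name
+```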
## Next Steps
-For more information about how to use the Feature Store, see the [Quickstart Guide](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/quickstart.ipynb){:target="_blank"}.
+For more information on how to use the Hopsworks API, check out the other guides or the [API Reference](https://docs.hopsworks.ai/hopsworks-api/{{{ hopsworks_version }}}/generated/api/connection_api/).
\ No newline at end of file
diff --git a/docs/user_guides/integrations/python.md b/docs/user_guides/integrations/python.md
index 63cdac730..32a079a86 100644
--- a/docs/user_guides/integrations/python.md
+++ b/docs/user_guides/integrations/python.md
@@ -1,29 +1,33 @@
-# Python Environments (Local or Kubeflow)
+---
+description: Documentation on how to connect to Hopsworks from a Python environment (e.g. AWS SageMaker, Google Colab, Kubeflow or a local environment)
+---
-Connecting to the Feature Store from any Python environment requires setting up a Feature Store API key and installing the library. This guide explains step by step how to connect to the Feature Store from any Python environment such as your local environment or KubeFlow.
+# Python Environments (Local, AWS SageMaker, Google Colab or Kubeflow)
-## Install **HSFS**
+This guide explains step by step how to connect to Hopsworks from any Python environment such as your local environment, AWS SageMaker, Google Colab or Kubeflow.
-To be able to access the Hopsworks Feature Store, the `HSFS` Python library needs to be installed in the environment from which you want to connect to the Feature Store. You can install the library through pip. We recommend using a Python environment manager such as *virtualenv* or *conda*.
+## Install Python Library
+
+To be able to interact with Hopsworks from a Python environment, you need to install the `Hopsworks` Python library. The library is available on [PyPI](https://pypi.org/project/hopsworks/) and can be installed using `pip`:
```
-pip install hsfs[python]~=[HOPSWORKS_VERSION]
+pip install hopsworks[python]~=[HOPSWORKS_VERSION]
```
-!!! attention "Hive Dependencies"
+!!! attention "Python Profile"
- By default, `HSFS` assumes Spark/EMR is used as execution engine and therefore Hive dependencies are not installed. Hence, on a local Python evnironment, if you are planning to use a regular Python Kernel **without Spark/EMR**, make sure to install the **"python"** extra dependencies (`hsfs[python]`).
+    By default, `pip install hopsworks` does not install all the necessary dependencies required to use the Hopsworks library from a pure Python environment. To ensure that all the dependencies are installed, you should install the library with the Python profile: `pip install hopsworks[python]`.
!!! attention "Matching Hopsworks version"
-The **major version of `HSFS`** needs to match the **major version of Hopsworks**.
+ We recommend that the major and minor version of the Python library match the major and minor version of the Hopsworks deployment.
-
-
-
- You find the Hopsworks version inside any of your Project's settings tab on Hopsworks
-
-
+
+    You can find the Hopsworks version in the settings tab of any of your projects.
## Generate an API key
@@ -36,30 +40,26 @@ For instructions on how to generate an API key follow this [user guide](../proje
## Connect to the Feature Store
-You are now ready to connect to the Hopsworks Feature Store from your Python environment:
+You are now ready to connect to Hopsworks from your Python environment:
```python
-import hsfs
-conn = hsfs.connection(
- host='my_instance', # DNS of your Feature Store instance
+import hopsworks
+project = hopsworks.login(
+ host='my_instance', # DNS of your Hopsworks instance
port=443, # Port to reach your Hopsworks instance, defaults to 443
- project='my_project', # Name of your Hopsworks Feature Store project
+ project='my_project', # Name of your Hopsworks project
api_key_value='apikey', # The API key to authenticate with Hopsworks
- hostname_verification=True, # Disable for self-signed certificates
- engine="python" # Set to "spark" if you are using Spark/EMR
+ engine='python', # Use the Python engine
)
-fs = conn.get_feature_store() # Get the project's default feature store
+fs = project.get_feature_store() # Get the project's default feature store
```
!!! note "Engine"
- `HSFS` uses either Apache Spark or Pandas/Polars on Python as an execution engine to perform queries against the feature store. The `engine` option of the connection let's you overwrite the default behaviour by setting it to `"python"` or `"spark"`. By default, `HSFS` will try to use Spark as engine if PySpark is available. So if you have PySpark installed in your local Python environment, but you have not configured Spark, you will have to set `engine='python'`. Please refer to the [Spark integration guide](spark.md) to configure your local Spark cluster to be able to connect to the Hopsworks Feature Store.
-
-!!! info "Ports"
-
- If you have trouble to connect, please ensure that your Feature Store can receive incoming traffic from your Python environment on ports 443, 9083 and 9085 (443,9083,9085).
+    `Hopsworks` leverages several engines depending on whether you are running on Apache Spark or on Pandas/Polars in a plain Python environment. If you do not pass an `engine` option to the `login` method, the library defaults to the `spark` engine whenever the `PySpark` library is available in the environment.
+ Please refer to the [Spark integration guide](spark.md) to configure your PySpark cluster to interact with Hopsworks.
## Next Steps
-For more information about how to connect, see the [Connection](https://docs.hopsworks.ai/hopsworks-api/{{{ hopsworks_version }}}/generated/api/connection_api/) API reference. Or continue with the Data Source guide to import your own data to the Feature Store.
+For more information on how to use the Hopsworks API, check out the other guides or the [API Reference](https://docs.hopsworks.ai/hopsworks-api/{{{ hopsworks_version }}}/generated/api/connection_api/).
diff --git a/docs/user_guides/integrations/sagemaker.md b/docs/user_guides/integrations/sagemaker.md
deleted file mode 100644
index 2801cfeb8..000000000
--- a/docs/user_guides/integrations/sagemaker.md
+++ /dev/null
@@ -1,180 +0,0 @@
-# AWS SageMaker Integration
-
-Connecting to the Feature Store from SageMaker requires setting up a Feature Store API key for SageMaker and installing the **HSFS** on SageMaker. This guide explains step by step how to connect to the Feature Store from SageMaker.
-
-## Generate an API key
-
-For instructions on how to generate an API key follow this [user guide](../projects/api_key/create_api_key.md). For the SageMaker integration to work make sure you add the following scopes to your API key:
-
- 1. featurestore
- 2. project
- 3. job
- 4. kafka
-
-## Quickstart API key Argument
-
-!!! hint "API key as Argument"
- To get started quickly, without saving the Hopsworks API in a secret storage, you can simply supply it as an argument when instantiating a connection:
- ```python hl_lines="6"
- import hsfs
- conn = hsfs.connection(
- host='my_instance', # DNS of your Feature Store instance
- port=443, # Port to reach your Hopsworks instance, defaults to 443
- project='my_project', # Name of your Hopsworks Feature Store project
- api_key_value='apikey', # The API key to authenticate with Hopsworks
- hostname_verification=True # Disable for self-signed certificates
- )
- fs = conn.get_feature_store() # Get the project's default feature store
- ```
-
-
-## Store the API key on AWS
-
-The API key now needs to be stored on AWS, so it can be retrieved from within SageMaker notebooks.
-
-### Identify your SageMaker role
-
-You need to know the IAM role used by your SageMaker instance to set up the API key for it. You can find it in the overview of your SageMaker notebook instance of the AWS Management Console.
-
-In this example, the name of the role is **AmazonSageMaker-ExecutionRole-20190511T072435**.
-
-
-
-
- The role is attached to your SageMaker notebook instance
-
-
-
-### Store the API key
-
-You have two options to make your API key accessible from SageMaker:
-
-#### Option 1: Using the AWS Systems Manager Parameter Store
-
-##### Store the API key in the AWS Systems Manager Parameter Store
-
-1. In the AWS Management Console, ensure that your active region is the region you use for SageMaker.
-2. Go to the AWS Systems Manager choose *Parameter Store* in the left navigation bar and select *Create Parameter*.
-3. As name, enter `/hopsworks/role/[MY_SAGEMAKER_ROLE]/type/api-key` replacing `[MY_SAGEMAKER_ROLE]` with the AWS role used by the SageMaker instance that should access the Feature Store.
-4. Select *Secure String* as type and *create the parameter*.
-
-
-
-
- Store the API key in the AWS Systems Manager Parameter Store
-
-
-
-##### Grant access to the Parameter Store from the SageMaker notebook role
-
-1. In the AWS Management Console, go to *IAM*, select *Roles* and then the role that is used when creating SageMaker notebook instances.
-2. Select *Add inline policy*.
-3. Choose *Systems Manager* as service, expand the *Read access level* and check *GetParameter*.
-4. Expand *Resources* and select *Add ARN*.
-6. Enter the region of the Systems Manager as well as the name of the parameter **WITHOUT the leading slash** e.g. `hopsworks/role/[MY_SAGEMAKER_ROLE]/type/api-key` and click *Add*.
-7. Click on *Review*, give the policy a name and click on *Create policy*.
-
-
-
-
- Grant access to the Parameter Store from the SageMaker notebook role
-
-
-
-#### Option 2: Using the AWS Secrets Manager
-
-##### Store the API key in the AWS Secrets Manager
-
-1. In the AWS Management Console, ensure that your active region is the region you use for SageMaker.
-2. Go to the *AWS Secrets Manager* and select *Store new secret*.
-3. Select *Other type of secrets* and add api-key as the key and paste the API key created in the previous step as the value.
-4. Click next.
-
-
-
-
- Store the API key in the AWS Secrets Manager
-
-
-
-5. As secret name, enter `hopsworks/role/[MY_SAGEMAKER_ROLE]` replacing `[MY_SAGEMAKER_ROLE]` with the AWS role used by the SageMaker instance that should access the Feature Store.
-6. Select *next* twice and finally store the secret.
-7. Then click on the secret in the secrets list and take note of the *Secret ARN*.
-
-
-
-
- Store the API key in the AWS Secrets Manager
-
-
-
-##### Grant access to the SecretsManager to the SageMaker notebook role
-
-1. In the AWS Management Console, go to *IAM*, select *Roles* and then the role that is used when creating SageMaker notebook instances.
-2. Select *Add inline policy*.
-3. Choose *Secrets Manager* as service, expand the *Read access* level and check *GetSecretValue*.
-4. Expand *Resources* and select *Add ARN*.
-5. Paste the *ARN* of the secret created in the previous step.
-6. Click on *Review*, give the policy a name and click on *Create policy*.
-
-
-
-
- Grant access to the SecretsManager to the SageMaker notebook role
-
-
-
-## Install **HSFS**
-
-To be able to access the Hopsworks Feature Store, the `HSFS` Python library needs to be installed. One way of achieving this is by opening a Python notebook in SageMaker and installing the `HSFS` with a magic command and pip:
-
-```
-!pip install hsfs[python]~=[HOPSWORKS_VERSION]
-```
-
-!!! attention "Hive Dependencies"
-
- By default, `HSFS` assumes Spark/EMR is used as execution engine and therefore Hive dependencies are not installed. Hence, on AWS SageMaker, if you are planning to use a regular Python Kernel **without Spark/EMR**, make sure to install the **"python"** extra dependencies (`hsfs[python]`).
-
-!!! attention "Matching Hopsworks version"
- The **major version of `HSFS`** needs to match the **major version of Hopsworks**.
-
-
-
-
-
- You find the Hopsworks version inside any of your Project's settings tab on Hopsworks
-
-
-
-Note that the library will not be persistent. For information around how to permanently install a library to SageMaker, see [Install External Libraries and Kernels](https://docs.aws.amazon.com/sagemaker/latest/dg/nbi-add-external.html) in Notebook Instances.
-
-## Connect to the Feature Store
-
-You are now ready to connect to the Hopsworks Feature Store from SageMaker:
-
-```python
-import hsfs
-conn = hsfs.connection(
- 'my_instance', # DNS of your Feature Store instance
- 443, # Port to reach your Hopsworks instance, defaults to 443
- 'my_project', # Name of your Hopsworks Feature Store project
- secrets_store='secretsmanager', # Either parameterstore or secretsmanager
- hostname_verification=True, # Disable for self-signed certificates
- engine='python' # Choose Python as engine if you haven't set up AWS EMR
-)
-fs = conn.get_feature_store() # Get the project's default feature store
-```
-
-!!! note "Engine"
-
- `HSFS` uses either Apache Spark or Pandas/Polars on Python as an execution engine to perform queries against the feature store. Most AWS SageMaker Kernels have PySpark installed but are not connected to AWS EMR by default, hence, the `engine` option of the connection let's you overwrite the default behaviour. By default, `HSFS` will try to use Spark as engine if PySpark is available, however, if Spark/EMR is not configured, you will have to set the engine manually to `"python"`. Please refer to the [EMR integration guide](emr/emr_configuration.md) to setup EMR with the Hopsworks Feature Store.
-
-
-!!! info "Ports"
-
- If you have trouble connecting, please ensure that the Security Group of your Hopsworks instance on AWS is configured to allow incoming traffic from your SageMaker instance on ports 443, 9083 and 9085 (443,9083,9085). See [VPC Security Groups](https://docs.aws.amazon.com/vpc/latest/userguide/VPC_SecurityGroups.html) for more information. If your SageMaker instances are not in the same VPC as your Hopsworks instance and the Hopsworks instance is not accessible from the internet then you will need to configure [VPC Peering on AWS](https://docs.aws.amazon.com/vpc/latest/peering/what-is-vpc-peering.html).
-
-## Next Steps
-
-For more information about how to use the Feature Store, see the [Quickstart Guide](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/quickstart.ipynb){:target="_blank"}.
diff --git a/docs/user_guides/integrations/spark.md b/docs/user_guides/integrations/spark.md
index 4005ce66d..d624beb0a 100644
--- a/docs/user_guides/integrations/spark.md
+++ b/docs/user_guides/integrations/spark.md
@@ -72,20 +72,20 @@ For instructions on how to generate an API key follow this [user guide](../proje
You are now ready to connect to the Hopsworks Feature Store from Spark:
```python
-import hsfs
-conn = hsfs.connection(
+import hopsworks
+project = hopsworks.login(
host='my_instance', # DNS of your Feature Store instance
port=443, # Port to reach your Hopsworks instance, defaults to 443
project='my_project', # Name of your Hopsworks Feature Store project
api_key_value='api_key', # The API key to authenticate with the feature store
hostname_verification=True # Disable for self-signed certificates
)
-fs = conn.get_feature_store() # Get the project's default feature store
+fs = project.get_feature_store() # Get the project's default feature store
```
!!! note "Engine"
- `HSFS` uses either Apache Spark or Pandas/Polars on Python as an execution engine to perform queries against the feature store. The `engine` option of the connection let's you overwrite the default behaviour by setting it to `"python"` or `"spark"`. By default, `HSFS` will try to use Spark as engine if PySpark is available, hence, no further action should be required if you setup Spark correctly as described above.
+    `Hopsworks` leverages several engines depending on whether you are running on Apache Spark or on Pandas/Polars in a plain Python environment. If you do not pass an `engine` option to the `login` method, the library defaults to the `spark` engine whenever the `PySpark` library is available in the environment.
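+
+    For example, to select the engine explicitly when logging in (a minimal sketch mirroring the snippet above):
+
+    ```python
+    import hopsworks
+
+    project = hopsworks.login(
+        host='my_instance',
+        project='my_project',
+        api_key_value='api_key',
+        engine='spark',   # Explicitly request the Spark engine
+    )
+    ```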
## Next Steps
diff --git a/mkdocs.yml b/mkdocs.yml
index 0c33aad89..d8cdaf8a2 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -107,8 +107,10 @@ nav:
- Compute Engines: user_guides/fs/compute_engines.md
- Client Integrations:
- user_guides/integrations/index.md
- - Python: user_guides/integrations/python.md
- - AWS Sagemaker: user_guides/integrations/sagemaker.md
+ - Python / SageMaker / Kubeflow: user_guides/integrations/python.md
+ - AWS EMR:
+ - Networking: user_guides/integrations/emr/networking.md
+ - Configure EMR for Hopsworks: user_guides/integrations/emr/emr_configuration.md
- Azure HDInsight: user_guides/integrations/hdinsight.md
- Azure Machine Learning:
- Designer: user_guides/integrations/mlstudio_designer.md