From a8ab2f24704ed5eacb62aedb6c5629c86225cfd3 Mon Sep 17 00:00:00 2001
From: Zen Punk <5505878+cosmiccamel@users.noreply.github.com>
Date: Mon, 7 Nov 2022 14:29:14 +0200
Subject: [PATCH 1/3] Update Readme #1850

https://github.com/Azure/MachineLearningNotebooks/issues/1850
---
 how-to-use-azureml/automated-machine-learning/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/how-to-use-azureml/automated-machine-learning/README.md b/how-to-use-azureml/automated-machine-learning/README.md
index 63a11d198..36d20ddf3 100644
--- a/how-to-use-azureml/automated-machine-learning/README.md
+++ b/how-to-use-azureml/automated-machine-learning/README.md
@@ -109,7 +109,7 @@ jupyter notebook
 ## Classification
 - **Classify Credit Card Fraud**
     - Dataset: [Kaggle's credit card fraud detection dataset](https://www.kaggle.com/mlg-ulb/creditcardfraud)
-    - **[Jupyter Notebook (remote run)](classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.ipynb)**
+    - **[Jupyter Notebook](classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.ipynb)**
         - run the experiment remotely on AML Compute cluster
         - test the performance of the best model in the local environment
     - **[Jupyter Notebook (local run)](local-run-classification-credit-card-fraud/auto-ml-classification-credit-card-fraud-local.ipynb)**

From 0be65cd598d44dc1da33f37dd337813fea34629e Mon Sep 17 00:00:00 2001
From: Zen Punk <5505878+cosmiccamel@users.noreply.github.com>
Date: Mon, 7 Nov 2022 14:34:12 +0200
Subject: [PATCH 2/3] Update README.md

---
 how-to-use-azureml/automated-machine-learning/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/how-to-use-azureml/automated-machine-learning/README.md b/how-to-use-azureml/automated-machine-learning/README.md
index 36d20ddf3..cb3f8dab8 100644
--- a/how-to-use-azureml/automated-machine-learning/README.md
+++ b/how-to-use-azureml/automated-machine-learning/README.md
@@ -112,7 +112,7 @@ jupyter notebook
     -
**[Jupyter Notebook](classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.ipynb)**
         - run the experiment remotely on AML Compute cluster
         - test the performance of the best model in the local environment
-    - **[Jupyter Notebook (local run)](local-run-classification-credit-card-fraud/auto-ml-classification-credit-card-fraud-local.ipynb)**
+    - **[Jupyter Notebook](local-run-classification-credit-card-fraud/auto-ml-classification-credit-card-fraud-local.ipynb)**
         - run experiment in the local environment
         - use Mimic Explainer for computing feature importance
         - deploy the best model along with the explainer to an Azure Kubernetes (AKS) cluster, which will compute the raw and engineered feature importances at inference time

From c363bdc9dc03d7d5782e37217eb2b8ed794dcd77 Mon Sep 17 00:00:00 2001
From: Zen Punk <5505878+cosmiccamel@users.noreply.github.com>
Date: Mon, 7 Nov 2022 15:00:44 +0200
Subject: [PATCH 3/3] Update links

Broken re direction URL's in README.md #1850
---
 .../automated-machine-learning/README.md | 32 ++++++++-----------
 1 file changed, 13 insertions(+), 19 deletions(-)

diff --git a/how-to-use-azureml/automated-machine-learning/README.md b/how-to-use-azureml/automated-machine-learning/README.md
index cb3f8dab8..f28f5a451 100644
--- a/how-to-use-azureml/automated-machine-learning/README.md
+++ b/how-to-use-azureml/automated-machine-learning/README.md
@@ -109,16 +109,16 @@ jupyter notebook
 ## Classification
 - **Classify Credit Card Fraud**
     - Dataset: [Kaggle's credit card fraud detection dataset](https://www.kaggle.com/mlg-ulb/creditcardfraud)
-    - **[Jupyter Notebook](classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.ipynb)**
+    - **[Jupyter Notebook (remote run)](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.ipynb)**
         - run the experiment remotely on AML Compute cluster
         - test the
performance of the best model in the local environment
-    - **[Jupyter Notebook](local-run-classification-credit-card-fraud/auto-ml-classification-credit-card-fraud-local.ipynb)**
+    - **[Jupyter Notebook (local run)](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/local-run-classification-credit-card-fraud/auto-ml-classification-credit-card-fraud-local.ipynb)**
         - run experiment in the local environment
         - use Mimic Explainer for computing feature importance
         - deploy the best model along with the explainer to an Azure Kubernetes (AKS) cluster, which will compute the raw and engineered feature importances at inference time
 - **Predict Term Deposit Subscriptions in a Bank**
     - Dataset: [UCI's bank marketing dataset](https://www.kaggle.com/janiobachmann/bank-marketing-dataset)
-    - **[Jupyter Notebook](classification-bank-marketing-all-features/auto-ml-classification-bank-marketing-all-features.ipynb)**
+    - **[Jupyter Notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/classification-bank-marketing-all-features/auto-ml-classification-bank-marketing-all-features.ipynb)**
         - run experiment remotely on AML Compute cluster to generate ONNX compatible models
         - view the featurization steps that were applied during training
         - view feature importance for the best model
@@ -126,7 +126,7 @@ jupyter notebook
         - deploy the best model in PKL format to Azure Container Instance (ACI)
 - **Predict Newsgroup based on Text from News Article**
     - Dataset: [20 newsgroups text dataset](https://scikit-learn.org/0.19/datasets/twenty_newsgroups.html)
-    - **[Jupyter Notebook](classification-text-dnn/auto-ml-classification-text-dnn.ipynb)**
+    - **[Jupyter Notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/classification-text-dnn/auto-ml-classification-text-dnn.ipynb)**
         - AutoML highlights here include using deep
neural networks (DNNs) to create embedded features from text data
         - AutoML will use Bidirectional Encoder Representations from Transformers (BERT) when a GPU compute is used
         - Bidirectional Long-Short Term neural network (BiLSTM) will be utilized when a CPU compute is used, thereby optimizing the choice of DNN
@@ -134,11 +134,11 @@ jupyter notebook
 ## Regression
 - **Predict Performance of Hardware Parts**
     - Dataset: Hardware Performance Dataset
-    - **[Jupyter Notebook](regression/auto-ml-regression.ipynb)**
+    - **[Jupyter Notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/regression/auto-ml-regression.ipynb)**
         - run the experiment remotely on AML Compute cluster
         - get best trained model for a different metric than the one the experiment was optimized for
         - test the performance of the best model in the local environment
-    - **[Jupyter Notebook (advanced)](regression/auto-ml-regression.ipynb)**
+    - **[Jupyter Notebook (advanced)](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/regression/auto-ml-regression.ipynb)**
         - run the experiment remotely on AML Compute cluster
         - customize featurization: override column purpose within the dataset, configure transformer parameters
         - get best trained model for a different metric than the one the experiment was optimized for
@@ -148,41 +148,35 @@ jupyter notebook
 ## Time Series Forecasting
 - **Forecast Energy Demand**
     - Dataset: [NYC energy demand data](http://mis.nyiso.com/public/P-58Blist.htm)
-    - **[Jupyter Notebook](forecasting-energy-demand/auto-ml-forecasting-energy-demand.ipynb)**
+    - **[Jupyter Notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand/auto-ml-forecasting-energy-demand.ipynb)**
         - run experiment remotely on AML Compute cluster
         - use lags and rolling window features
         - view the featurization steps
that were applied during training
         - get the best model, use it to forecast on test data and compare the accuracy of predictions against real data
 - **Forecast Orange Juice Sales (Multi-Series)**
-    - Dataset: [Dominick's grocery sales of orange juice](forecasting-orange-juice-sales/dominicks_OJ.csv)
-    - **[Jupyter Notebook](forecasting-orange-juice-sales/dominicks_OJ.csv)**
+    - Dataset: [Dominick's grocery sales of orange juice](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/forecasting-bike-share/bike-no.csv)
+    - **[Jupyter Notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.ipynb)**
         - run experiment remotely on AML Compute cluster
         - customize time-series featurization, change column purpose and override transformer hyper parameters
         - evaluate locally the performance of the generated best model
         - deploy the best model as a webservice on Azure Container Instance (ACI)
         - get online predictions from the deployed model
 - **Forecast Demand of a Bike-Sharing Service**
-    - Dataset: [Bike demand data](forecasting-bike-share/bike-no.csv)
-    - **[Jupyter Notebook](forecasting-bike-share/auto-ml-forecasting-bike-share.ipynb)**
+    - Dataset: [Bike demand data](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/forecasting-bike-share/bike-no.csv)
+    - **[Jupyter Notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/forecasting-bike-share/auto-ml-forecasting-bike-share.ipynb)**
         - run experiment remotely on AML Compute cluster
         - integrate holiday features
         - run rolling forecast for test set that is longer than the forecast horizon
         - compute metrics on the predictions from the remote forecast
 - **The Forecast Function Interface**
     - Dataset: Generated for sample purposes
-
    - **[Jupyter Notebook](forecasting-forecast-function/auto-ml-forecasting-function.ipynb)**
+    - **[Jupyter Notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/forecasting-forecast-function/auto-ml-forecasting-function.ipynb)**
         - train a forecaster using a remote AML Compute cluster
         - capabilities of forecast function (e.g. forecast farther into the horizon)
         - generate confidence intervals
-- **Forecast Beverage Production**
-    - Dataset: [Monthly beer production data](forecasting-beer-remote/Beer_no_valid_split_train.csv)
-    - **[Jupyter Notebook](forecasting-beer-remote/auto-ml-forecasting-beer-remote.ipynb)**
-        - train using a remote AML Compute cluster
-        - enable the DNN learning model
-        - forecast on a remote compute cluster and compare different model performance
 - **Continuous Retraining with NOAA Weather Data**
     - Dataset: [NOAA weather data from Azure Open Datasets](https://azure.microsoft.com/en-us/services/open-datasets/)
-    - **[Jupyter Notebook](continuous-retraining/auto-ml-continuous-retraining.ipynb)**
+    - **[Jupyter Notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/continuous-retraining/auto-ml-continuous-retraining.ipynb)**
         - continuously retrain a model using Pipelines and AutoML
         - create a Pipeline to upload a time series dataset to an Azure blob
         - create a Pipeline to run an AutoML experiment and register the best resulting model in the Workspace