Make inferences against the model

Time to complete: 15-20 minutes.

What are we building?

Architecture diagram

Now that our model is trained, we need a way to make inferences against it. In this section we'll build an HTTP REST endpoint (API Gateway) that accepts POSTed JSON data and runs it against our model stored on S3. A Lambda function will load the model, make an inference directly against it, and return the result in the HTTP response.

We will be doing model inferences outside of Amazon SageMaker.

Why are we building it?

Now that we can get real-time information on whether a ride is going to "cost" the unicorn more based on mileage plus weather (instead of just mileage), our pricing workflow can be updated to include this HTTP endpoint. This enables our company to offer better, more realistic pricing based on actual usage.

Why Lambda? Our unicorn fleet isn't a single breed. We offer the largest selection of rare unicorn breeds for customers of all needs. We expect further research to show that each breed responds differently to various weather conditions. By hosting our models on S3 and using Lambda to make inferences, we can have a dynamic HTTP interface that makes predictions against an ML model specific to a unicorn breed without having to pay for separate Amazon SageMaker endpoints (one per unicorn breed, and we have thousands).

Short cut: Deploy everything for me

We don't recommend this route unless you ran into a snag and are worried about completing the workshop on time.

🙈 BREAK GLASS! (use in case of emergency)

  1. Navigate to your Cloud9 environment
  2. Make sure you're in the correct directory first
    cd ~/environment/aws-serverless-workshops/MachineLearning/3_Inference
    
  3. Upload the inference code to S3
    aws s3 cp lambda-functions/inferencefunction.zip s3://$bucket/code/inferencefunction.zip
    
  4. Create your resources
    aws cloudformation create-stack \
      --stack-name wildrydes-ml-mod3 \
      --parameters ParameterKey=DataBucket,ParameterValue=$bucket \
                   ParameterKey=DataProcessingExecutionRoleName,ParameterValue=$(aws cloudformation describe-stack-resources --stack-name wildrydes-ml-mod1 --logical-resource-id DataProcessingExecutionRole --query "StackResources[0].PhysicalResourceId" --output text) \
                   ParameterKey=TrainedModelPath,ParameterValue=$(aws s3 ls s3://$bucket/linear-learner --recursive | grep 'model' | cut -c 32-) \
      --capabilities CAPABILITY_NAMED_IAM \
      --template-body file://cloudformation/99_complete.yml
    
  5. Scroll down to the section on testing your API

Step 1: Get CloudFormation parameters

Grab the name of your IAM DataProcessingExecutionRole and add it to scratchpad.txt for use later. (Expand for detailed instructions)

  1. Navigate to your Cloud9 environment
  2. Set the data processing execution role as an environment variable
    execution_role=$(aws cloudformation describe-stack-resources --stack-name wildrydes-ml-mod1 --logical-resource-id DataProcessingExecutionRole --query "StackResources[0].PhysicalResourceId" --output text)
    
  3. Verify the variable is set
    echo $execution_role
    
  4. Add the data processing execution role to your scratchpad for future use
    echo "Data processing execution role:" $execution_role >> ~/environment/scratchpad.txt
    

Step 2: Upload Inference Function Zip

Upload lambda-functions/inferencefunction.zip to YOUR_BUCKET_NAME/code. (Expand for detailed instructions)

  1. Navigate to your Cloud9 environment
  2. Run the following command to upload the Lambda function for inference
    # Run this command from /home/ec2-user/environment/aws-serverless-workshops/MachineLearning/3_Inference in your Cloud9 environment
    cd ~/environment/aws-serverless-workshops/MachineLearning/3_Inference
    
    # Run this command to upload the inference function zip
    aws s3 cp lambda-functions/inferencefunction.zip s3://$bucket/code/inferencefunction.zip
    
    # Run this command to verify the file was uploaded (you should see the file name listed)
    aws s3 ls s3://$bucket/code/
    

Step 3: Create Lambda function and API Gateway skeletons

At this point, we have a trained model on S3. Now, we're ready to load the model into Lambda at runtime and make inferences against the model. The Lambda function that will make inferences is hosted behind an API Gateway that will accept POST HTTP requests.

Create a Lambda function for model inference named ModelInferenceFunction and an HTTP API by launching the cloudformation/3_lambda_function.yml stack and naming it wildrydes-ml-mod3. (Expand for detailed instructions)

  1. Navigate to your Cloud9 environment
  2. Run the following command to create your resources:
    # Run this command from /home/ec2-user/environment/aws-serverless-workshops/MachineLearning/3_Inference in your Cloud9 environment
    cd ~/environment/aws-serverless-workshops/MachineLearning/3_Inference
    
    aws cloudformation create-stack \
      --stack-name wildrydes-ml-mod3 \
      --parameters ParameterKey=DataBucket,ParameterValue=$bucket \
                   ParameterKey=DataProcessingExecutionRoleName,ParameterValue=$execution_role \
      --capabilities CAPABILITY_NAMED_IAM \
      --template-body file://cloudformation/3_lambda_function.yml
    
  3. Monitor the status of your stack creation. EITHER:
    1. Go to CloudFormation in the AWS Console OR
    2. Run the following command in Cloud9 until you get CREATE_COMPLETE in the output:
      # Run this command to verify the stack was successfully created. You should expect to see "CREATE_COMPLETE".
      # If you see "CREATE_IN_PROGRESS", your stack is still being created. Wait and re-run the command.
      # If you see "ROLLBACK_COMPLETE", pause and see what went wrong.
      aws cloudformation describe-stacks \
          --stack-name wildrydes-ml-mod3 \
          --query "Stacks[0].StackStatus"
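The re-run-until-done check above can also be scripted. The following is a minimal Python sketch, not part of the workshop code. It assumes the AWS CLI is installed and configured in your Cloud9 environment; the function names `get_stack_status` and `wait_for_stack`, the 15-second delay, and the attempt limit are illustrative choices.

```python
import subprocess
import time


def get_stack_status(stack_name):
    """Return the current CloudFormation stack status via the AWS CLI."""
    result = subprocess.run(
        [
            "aws", "cloudformation", "describe-stacks",
            "--stack-name", stack_name,
            "--query", "Stacks[0].StackStatus",
            "--output", "text",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def wait_for_stack(stack_name, get_status=get_stack_status,
                   delay_seconds=15, max_attempts=40):
    """Poll until the stack reports CREATE_COMPLETE.

    Raises RuntimeError on a rollback or failure status and
    TimeoutError if the stack is still in progress after max_attempts.
    """
    for _ in range(max_attempts):
        status = get_status(stack_name)
        if status == "CREATE_COMPLETE":
            return status
        if "ROLLBACK" in status or status.endswith("FAILED"):
            raise RuntimeError(f"Stack {stack_name} ended in {status}")
        # Still CREATE_IN_PROGRESS (or similar): wait and ask again.
        time.sleep(delay_seconds)
    raise TimeoutError(f"Stack {stack_name} did not finish in time")

# Example call (requires AWS credentials):
#   wait_for_stack("wildrydes-ml-mod3")
```

If you would rather not script the loop, the AWS CLI also provides `aws cloudformation wait stack-create-complete --stack-name wildrydes-ml-mod3`, which blocks until the stack finishes creating.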
      


❗ DO NOT move past this point until you see CREATE_COMPLETE as the status for your CloudFormation stack

Step 4: Update Lambda Function

The previous step gave us a Lambda function that will load the ML model from S3, make inferences against it in Lambda, and return the results from behind API Gateway. For this to work, we need to connect some critical pieces.

1. Update the MODEL_PATH environment variable in ModelInferenceFunction. Set the value to the location of your trained model in your bucket. (Expand for detailed instructions)

  1. Run this command in your Cloud9 console:
    aws s3 ls s3://$bucket/linear-learner --recursive | grep 'model' | cut -c 32-
    
  2. Copy the returned value. You'll need it below.
  3. Open the Lambda console
  4. Open the function containing ModelInferenceFunction in the name
  5. Scroll down and populate the MODEL_PATH key with the location of your model (what you just copied)
  • Replace the entire existing value with the string you copied.
  • Make sure the full string looks like this: linear-learner-yyyy-mm-dd-00-40-46-627/output/model.tar.gz
  6. Click Save
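The `cut -c 32-` in the command above relies on the fixed-width layout of `aws s3 ls` output (date, time, size, then key). As a hedged alternative, here is a small Python sketch that splits each line on whitespace instead. The function name `model_keys_from_listing` and the sample listing line are made up for illustration; only the key format comes from this guide.

```python
def model_keys_from_listing(listing):
    """Extract object keys containing 'model' from `aws s3 ls --recursive` output.

    Each line is expected to look like: <date> <time> <size> <key>
    """
    keys = []
    for line in listing.splitlines():
        # Split into at most four parts so that spaces inside the key survive.
        parts = line.split(None, 3)
        if len(parts) == 4 and "model" in parts[3]:
            keys.append(parts[3])
    return keys


# Hypothetical listing line; your date, time, and size will differ.
sample_listing = (
    "2019-06-01 00:45:10       1234 "
    "linear-learner-2019-06-01-00-40-46-627/output/model.tar.gz\n"
)

for key in model_keys_from_listing(sample_listing):
    print(key)
```

If you trained more than once, the listing may contain several model.tar.gz entries, so check that you copy the one you intend to serve.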

2. Take a moment to review the code in lambda-functions/lambda_function.py. (Expand for detailed instructions)

Note: If you're not interested in learning how to host your own ML model on Lambda, you can stop reading now and close this step and continue in the README. There are no steps here to complete, only additional information on steps required to recreate this yourself.

Amazon SageMaker can be used to build, train, and deploy machine learning models. We're leveraging it to build and train our model. Because our business may have thousands of models, one per unicorn breed, it's actually better for us to host this model ourselves on Lambda. Below are the high-level steps that we've completed on your behalf for this workshop, but you're free to explore them if you need to recreate this.

  1. Build MXNet from source for 1) the currently supported Lambda runtime and 2) the current MXNet version that Amazon SageMaker uses. Instructions here.
  2. The code in lambda-functions/lambda_function.py will load the model from S3, load MXNet, and make inferences against our model. You'd need to install these dependencies locally in an environment similar to the Lambda runtime and package those dependencies following these instructions. If you unzip lambda-functions/inferencefunction.zip, you'll see the result of those steps as a reference.
  3. download_model function: Once we've got MXNet built for our environment, and the Lambda package built, we can proceed to review the code. The Lambda function loads the model from S3 on the fly at the time of the request and unzips it locally.
  4. create_data_iter function: The HTTP request data is formatted as a NumPy array, as required by the MXNet linear learner model interface to make inferences.
  5. make_prediction function: An inference is made and then packaged for an HTTP response to the caller.
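To make the request and response flow concrete, here is a simplified Python sketch of a handler behind a Lambda proxy integration. It is not the workshop's lambda_function.py: the model download and MXNet calls are replaced by a placeholder `stub_predict`, a plain list stands in for the NumPy array, and the feature order in `FEATURES`, the helper name `create_feature_row`, and the `{"prediction": ...}` response shape are all assumptions for illustration.

```python
import json

# Assumed feature order; the real order must match the training data columns.
FEATURES = ["distance", "healthpoints", "magicpoints", "TMAX", "TMIN", "PRCP"]


def create_feature_row(payload):
    """Turn the request JSON into an ordered list of floats."""
    return [float(payload[name]) for name in FEATURES]


def stub_predict(row):
    """Placeholder for the real model call; always returns 0.0."""
    return 0.0


def lambda_handler(event, context, predict=stub_predict):
    """Handle an API Gateway proxy event and return a proxy-style response."""
    try:
        # With proxy integration, the POST body arrives as a JSON string.
        payload = json.loads(event.get("body") or "{}")
        row = create_feature_row(payload)
    except (KeyError, TypeError, ValueError) as err:
        return {
            "statusCode": 400,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"error": f"bad request: {err}"}),
        }
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"prediction": predict(row)}),
    }
```

Passing the predictor in as an argument keeps the request parsing testable without MXNet installed; in the real function, the prediction comes from the model loaded from S3.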

Step 5: Wire up API Gateway

The last thing we need to do is connect the HTTP API Gateway to your ModelInferenceFunction.

1. Update the ModelInferenceApi API Gateway root resource to proxy requests to your ModelInferenceFunction. (Expand for detailed instructions)

  1. Open the API Gateway console
  2. Click ModelInferenceApi
  3. Select the root / resource
  4. Click Actions > Create Method
  5. Select ANY in the dropdown
  6. Click the checkbox next to it
  7. Verify Lambda Function is selected as the Integration type
  8. Check the box next to Use Lambda Proxy integration so we get all request details
  9. Select your ModelInferenceFunction in the Lambda Function dropdown. If it is not a dropdown, start typing 'inference' to find and select your function.
  10. Click Save
  11. Click OK in the permissions dialog box

2. Deploy your API Gateway. (Expand for detailed instructions)

  1. Navigate to the ModelInferenceApi. If not already there:
    1. Open the API Gateway console
    2. Click ModelInferenceApi
    3. Select the root / resource
  2. Click Actions > Deploy API
  3. Select [New Stage] for Deployment Stage
  4. Type prod for Stage name
  5. Click Deploy


Take note of your Invoke URL

Testing your API

  1. Navigate to your Cloud9 environment
  2. Run the following command to get a premade cURL command you can use to call your model:
    # Run this command from /home/ec2-user/environment/aws-serverless-workshops/MachineLearning/3_Inference in your Cloud9 environment
    cd ~/environment/aws-serverless-workshops/MachineLearning/3_Inference
    
    aws cloudformation describe-stacks --stack-name wildrydes-ml-mod3 \
      --query "Stacks[0].Outputs[?OutputKey=='InferenceFunctionTestCommand'].OutputValue" --output text
    
  3. Copy the output and execute the command that looks like: curl -d { ... }
  4. Optional: You can also test the Lambda function by using the test UI in the API Gateway console.

What did your curl command return? What does it mean?

Let's look at the curl command first:

curl -d '{ "distance": 30, "healthpoints": 30, "magicpoints": 1500, "TMAX": 333, "TMIN": 300, "PRCP": 100 }' -H "Content-Type: application/json" -X POST STAGE_URL

This asks our deployed model how likely it is that a unicorn traveling a distance of 30 and burning 1500 magic points in the weather conditions "TMAX": 333, "TMIN": 300, "PRCP": 100 will require service (PRCP = precipitation in tenths of mm, TMAX = maximum temperature in tenths of degrees C, and TMIN = minimum temperature in tenths of degrees C).

The decimal returned from our API represents the likelihood that a unicorn experiencing the conditions in the curl command is going to require service.
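If you want to call the endpoint from code rather than curl, the same POST can be sketched with Python's standard library. This is an illustrative sketch: `build_inference_request` and `call_model` are hypothetical helper names, and STAGE_URL stands for your own Invoke URL, exactly as in the curl command above.

```python
import json
import urllib.request


def build_inference_request(stage_url, ride):
    """Build a POST request carrying the ride data as a JSON body."""
    body = json.dumps(ride).encode("utf-8")
    return urllib.request.Request(
        stage_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_model(stage_url, ride, timeout=10):
    """Send the request and return the raw response body as text."""
    request = build_inference_request(stage_url, ride)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Same payload as the curl command above.
ride = {
    "distance": 30,
    "healthpoints": 30,
    "magicpoints": 1500,
    "TMAX": 333,
    "TMIN": 300,
    "PRCP": 100,
}

# Example call (replace STAGE_URL with your Invoke URL):
#   print(call_model("STAGE_URL", ride))
```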

Now What?

Let's recap. You've put together a pipeline in which:

  • On the front end of the data pipeline, we collect and ingest ride telemetry data from our unicorns
  • We've enhanced that data with the nearest, active weather station ID
  • We've trained a machine learning model to predict heavier than usual magic point usage based on different weather characteristics for that day
  • We've hosted this model behind an HTTP interface that loads the model dynamically

How can Wild Rydes use this to improve the business?

We're now able to predict, in real time, when each unicorn is going to need to be serviced. Leveraging this new capability, we can perform preventative repairs on the unicorns before more costly repairs are required and the unicorn has to be removed from service.

Next step:

Once you're done testing the API call to your model, you can clean up the resources so you're not charged.