This repo contains my work for the Udacity Machine Learning Engineer Nanodegree. I'm glad I chose this Nanodegree and finished it within 4 weeks. I gained hands-on experience with the ML pipeline in SageMaker and was exposed to a few interesting problems.
- Software Engineering Fundamentals: Publish a simple PyPI package to practice software engineering fundamentals, e.g., modular code, speed and memory optimization, docstrings, version control, unit tests, logging, and code review.
- Machine Learning in Production: Use SageMaker to develop, train, validate, and deploy a sentiment analysis model for movie reviews using an RNN in PyTorch. Connect a simple web app to the deployed endpoint using AWS Lambda and API Gateway.
- Plagiarism Detection: Build a plagiarism detector that examines a text file and performs binary classification, labeling the file as either plagiarized or not depending on how similar it is to a provided source text.
- Capstone Project: Stock Prediction: The purpose of the capstone project is to apply everything learned throughout the program to build my own machine learning engineering project. I built a simple stock predictor using PyTorch's LSTM in SageMaker. The goal is not to predict the stock market accurately but to gain hands-on experience with the ML pipeline in SageMaker, including data acquisition, data preprocessing and exploration, modeling, hyperparameter tuning, and model evaluation. The project is summarized in the report.
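
To give a feel for the data preprocessing step in the capstone, here is a minimal sketch of one common way to turn a price series into training examples for a sequence model such as an LSTM: sliding windows with the next value as the target, followed by a chronological train/test split. This is an illustration under assumptions, not necessarily what the project does; the function names `make_windows` and `chronological_split`, the window length, and the sample prices are all hypothetical.

```python
from typing import List, Sequence, Tuple


def make_windows(
    prices: Sequence[float], window: int
) -> Tuple[List[List[float]], List[float]]:
    """Build (input window, next-value target) pairs from a price series.

    Hypothetical helper for illustration; not taken from the project code.
    """
    if window < 1 or window >= len(prices):
        raise ValueError("window must be >= 1 and shorter than the series")
    inputs: List[List[float]] = []
    targets: List[float] = []
    for start in range(len(prices) - window):
        # The model sees `window` consecutive prices...
        inputs.append(list(prices[start:start + window]))
        # ...and is asked to predict the price that follows them.
        targets.append(prices[start + window])
    return inputs, targets


def chronological_split(items: list, train_fraction: float = 0.8) -> Tuple[list, list]:
    """Split without shuffling so the test set is strictly later in time."""
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]


# Made-up closing prices, purely for demonstration.
closes = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
X, y = make_windows(closes, window=3)
X_train, X_test = chronological_split(X)
y_train, y_test = chronological_split(y)
```

Splitting by time rather than at random keeps later prices out of the training set, which matters for time series because shuffled splits let the model see data from after the period it is evaluated on.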