reiffd7/movie_recommender

The Best Movie Recommender EVER!!!

Daniel Reiff, Steven Rouk, Scott Peabody, Sarah Forward

  1. Overview
  2. Dataset Description
  3. Starting Baseline
  4. New Proposed Model
  5. Precision
  6. Churn
  7. The App
  8. Proposed Plan of Implementation

Overview

In the scenario for this hackathon, we work for a company, Movies-Legit, that has used a production recommender for many years. The recommender provides a significant revenue stream, so our managers are hesitant to touch it. However, the system has been around for a long time, and our head of data science has asked our team to explore new solutions.

Description of Dataset

100,004 ratings, 9,066 movies, and 671 users. The rating system is 0-5.

Starting Baseline

The solution that has been around for so long is called the Mean of Means. Some users like to rate things highly; others simply do not. Some items are just better or worse. These general trends can be captured through per-user and per-item rating means. The global mean is also incorporated to smooth things out a bit. So if we see a missing value in a given cell, we average the global mean with the mean of the column and the mean of the row and use that value to fill it in. Running the Mean of Means algorithm yields an RMSE of 0.9525. What does this mean? On average, we are around 1 point off for every rating.
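The fill-in rule described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the production code: the function name `mean_of_means`, the toy matrix, and the choice to mark missing ratings with NaN are all assumptions made for the example.

```python
import numpy as np

def mean_of_means(ratings):
    """Fill missing (NaN) cells with the average of the global mean,
    the row (user) mean, and the column (movie) mean."""
    ratings = np.asarray(ratings, dtype=float)
    global_mean = np.nanmean(ratings)                        # mean of all known ratings
    user_means = np.nanmean(ratings, axis=1, keepdims=True)  # one mean per row
    item_means = np.nanmean(ratings, axis=0, keepdims=True)  # one mean per column
    # Broadcasting produces an estimate for every (user, movie) cell.
    estimate = (global_mean + user_means + item_means) / 3.0
    # Keep known ratings; use the estimate only where a rating is missing.
    return np.where(np.isnan(ratings), estimate, ratings)

# Hypothetical toy data: rows are users, columns are movies.
toy = np.array([[5.0, 3.0, np.nan],
                [4.0, np.nan, 1.0],
                [np.nan, 2.0, 2.0]])
filled = mean_of_means(toy)
print(filled)
```

A user or movie with no ratings at all would need special handling (for example, falling back to the global mean), which this sketch leaves out.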

New Proposed Model

In our new model, we use a method called Alternating Least Squares (ALS).

  • We start with a grid of ratings by users and movie.
  • We use the data of the users to predict the ratings on each movie.
  • Then, we see how far off we are from the known ratings and correct by using the movie data to estimate new rating predictions.
  • We repeat back-and-forth between users and movies until our prediction error does not improve with each step.
  • We tried a number of models. The best performing was ALS.
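The back-and-forth procedure in the list above can be sketched as a small regularized matrix factorization in NumPy. This is an illustrative sketch only: the README does not say which ALS implementation, rank, regularization strength, or stopping rule was used, so `rank`, `reg`, `n_iters`, and the toy matrix are assumptions. For simplicity it runs a fixed number of sweeps and records the training RMSE after each one, rather than stopping when the error stops improving.

```python
import numpy as np

def als(ratings, rank=2, reg=0.1, n_iters=20, seed=0):
    """Alternating least squares on a ratings matrix with NaN for missing cells.
    Returns user factors U, movie factors V, and the training RMSE per sweep."""
    R = np.asarray(ratings, dtype=float)
    known = ~np.isnan(R)
    n_users, n_items = R.shape
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(n_users, rank))
    V = rng.normal(scale=0.1, size=(n_items, rank))
    ridge = reg * np.eye(rank)
    history = []
    for _ in range(n_iters):
        # Hold movie factors fixed and solve a small ridge regression per user.
        for u in range(n_users):
            idx = known[u]
            if idx.any():
                Vu = V[idx]
                U[u] = np.linalg.solve(Vu.T @ Vu + ridge, Vu.T @ R[u, idx])
        # Hold user factors fixed and solve a small ridge regression per movie.
        for i in range(n_items):
            idx = known[:, i]
            if idx.any():
                Ui = U[idx]
                V[i] = np.linalg.solve(Ui.T @ Ui + ridge, Ui.T @ R[idx, i])
        # Track the error on the known ratings only.
        err = (U @ V.T - R)[known]
        history.append(float(np.sqrt(np.mean(err ** 2))))
    return U, V, history

# Hypothetical toy data: rows are users, columns are movies.
toy = np.array([[5.0, 3.0, np.nan],
                [4.0, np.nan, 1.0],
                [np.nan, 2.0, 2.0]])
U, V, history = als(toy)
predictions = U @ V.T   # includes estimates for the missing cells
print(history[-1])
```

In practice the error would be measured on held-out ratings rather than on the training data, since a model this flexible can fit a tiny matrix almost exactly.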

    Results

    With the ALS model, we see an improvement of 6%.

    Precision

    How do we measure whether or not our model is making a difference?

    We want to avoid cases where we recommend a movie that the user does not like. So we define a false positive as an instance where we predicted the user would rate the movie higher than 3.5 stars but the user actually rated it lower than 3.5 stars. This is the scenario we most want to avoid, because it could leave the user dissatisfied and ready to give up on our service.

    Precision is the number of correct recommendations divided by the number of all recommendations. It penalizes a model for recommending a movie that the user does not actually like. With the ALS model, we improve precision by 6%.
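As a rough illustration of the definition above, the following sketch counts recommendations and false positives at the 3.5-star threshold. The function name and the sample ratings are hypothetical. Treating a movie rated exactly 3.5 as a correct recommendation is an assumption, since the text only defines a false positive as an actual rating below 3.5.

```python
def precision_at_threshold(predicted, actual, threshold=3.5):
    """Return (precision, false_positive_count) for paired rating lists.
    A movie is 'recommended' when its predicted rating exceeds the threshold."""
    recommended = [(p, a) for p, a in zip(predicted, actual) if p > threshold]
    if not recommended:
        return 0.0, 0
    # False positive: we recommended it, but the user rated it below the threshold.
    false_positives = sum(1 for _, a in recommended if a < threshold)
    true_positives = len(recommended) - false_positives
    return true_positives / len(recommended), false_positives

# Hypothetical predictions and the ratings users actually gave.
predicted = [4.2, 3.9, 2.1, 4.8, 3.6]
actual = [5.0, 2.0, 4.0, 4.5, 3.0]
precision, false_positives = precision_at_threshold(predicted, actual)
print(precision, false_positives)
```

The third movie (predicted 2.1, actual 4.0) is a missed recommendation; it does not affect precision because it was never recommended.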

    Churn

    We estimate that each false positive results in a 1% chance that a customer will churn. Each customer has a lifetime value of approximately $100. Therefore, a false positive costs us $1 in expected lost revenue (1% of $100). For the 671 users in this study, the ALS model will save us $6936 - $4688 = $2248. Scaling up to 2 million movie ratings, about 20 times the size of this dataset, we would save $44,960.
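The arithmetic behind these figures can be reproduced with a short script. It assumes that the $6936 and $4688 amounts correspond to 6,936 and 4,688 false positives (at $1 each) for the baseline and ALS models respectively, and that the 2 million projection uses a round scale factor of 20 over the roughly 100,000 ratings in the dataset. Both readings are inferred from the numbers, not stated outright.

```python
CHURN_PROB_PER_FALSE_POSITIVE = 0.01   # 1% chance of churn per false positive
CUSTOMER_LIFETIME_VALUE = 100.0        # dollars

def churn_cost(n_false_positives):
    """Expected lost revenue, in dollars, for a given number of false positives."""
    return n_false_positives * CHURN_PROB_PER_FALSE_POSITIVE * CUSTOMER_LIFETIME_VALUE

baseline_cost = churn_cost(6936)   # assumed false-positive count, Mean of Means
als_cost = churn_cost(4688)        # assumed false-positive count, ALS
savings = baseline_cost - als_cost

# Assumed scale-up: 2,000,000 ratings over 100,004 in the dataset, rounded to 20.
scale = round(2_000_000 / 100_004)
projected_savings = savings * scale
print(round(savings), round(projected_savings))
```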

    The App

    http://3.95.7.113:8080/

    Proposed Plan of Implementation

    We propose to roll out this new recommender to 10% of our users for a three-month trial period. At the end of that period, we will evaluate the recommender based on:

    • User feedback on whether they noticed improved performance from the recommender.
    • A comparison of customer churn between the new recommender group and the old recommender group.

    If we receive positive survey results and see reduced customer churn, we would then roll out the new recommender to the rest of the customer base.
