A Spring Boot movie recommendation web app that uses the MovieLens 27M dataset and a PostgreSQL database hosted on AWS, with the server deployed on Heroku.

yunxiaoli2017/movie-recommender

Movie Recommender

A Spring Boot web app that generates movie recommendations from a user's submitted ratings, using a user-based collaborative filtering recommendation engine.

To see the app in action, go to https://movie-recommender-demo.herokuapp.com/

Dataset

The original and processed MovieLens datasets, as well as the processing steps, can be found in a separate repository, 27M-movieLens-dataset-processing. The main processing steps are:

  • Scrape poster URLs from IMDb
  • Normalize ratings with decoupling normalization
  • Extract a popular-movies dataset and a compact ratings dataset
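
The normalization step can be illustrated with a small sketch. This README does not state the exact formula used in the processing repository, so the code below is an assumption: it uses the "halfway cumulative distribution" form commonly associated with decoupling normalization, where a raw rating r from one user is mapped to P(rating <= r) - P(rating = r) / 2 over that user's own ratings. The class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not taken from the repository.
class DecouplingNormalizationSketch {

    /**
     * Maps each distinct raw rating given by ONE user to
     * P(rating <= r) - P(rating = r) / 2, estimated from that user's ratings.
     * This formula is an assumed reading of "decoupling normalization".
     */
    static Map<Double, Double> normalize(double[] userRatings) {
        // Count how often the user gave each rating value, in ascending order.
        TreeMap<Double, Integer> counts = new TreeMap<>();
        for (double r : userRatings) {
            counts.merge(r, 1, Integer::sum);
        }

        Map<Double, Double> normalized = new TreeMap<>();
        int seen = 0; // number of ratings <= the current value
        for (Map.Entry<Double, Integer> e : counts.entrySet()) {
            seen += e.getValue();
            double halfway = seen - e.getValue() / 2.0;
            normalized.put(e.getKey(), halfway / userRatings.length);
        }
        return normalized;
    }
}
```

Under this assumption, a user who rates {5, 5, 5, 3} has their 5 mapped to 0.625, while a user who rates {5, 1, 1, 1} has their 5 mapped to 0.875: the same raw score counts for more when the user gives it rarely.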

Framework & Database

Algorithm

  • User-based collaborative filtering
    • Draw a set of movies for the user to rate. (Movies are drawn only from those with > 5,000 ratings, so the user is more likely to have seen some of them.)

    • From the submitted ratings, find the raters in the database who have rated the same movies, and compute each rater's similarity to the user. (To reduce time cost, the compact ratings dataset can be used at this step.)

    • Retrieve the ratings of the raters found and compute predicted scores for the movies they rated. (To reduce time cost, only the 100 most similar raters are used.)

    • Sort the movies by predicted score and recommend the highest-scoring ones to the user. (In addition to the top 10 popular movies with > 5,000 ratings, the top 10 less popular movies with < 5,000 ratings are also recommended. This addresses the 'long tail' problem by surfacing harder-to-find, unexpected discoveries.)
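
The steps above can be sketched in plain Java as follows. This is a minimal in-memory illustration, not the app's actual code: the class and method names are hypothetical, ratings are held in maps instead of PostgreSQL tables, and cosine similarity with a similarity-weighted average is an assumed choice, since the README does not name the similarity measure or the prediction formula.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch; ratings are maps of movieId -> normalized rating.
class UserBasedCfSketch {

    /** Cosine similarity over co-rated movies (assumed measure); 0 if no overlap. */
    static double similarity(Map<Integer, Double> a, Map<Integer, Double> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other == null) continue; // not co-rated
            dot += e.getValue() * other;
            normA += e.getValue() * e.getValue();
            normB += other * other;
        }
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    /** Predicts scores for movies the user has not rated, using the most similar raters. */
    static Map<Integer, Double> predictScores(Map<Integer, Double> user,
                                              Map<Integer, Map<Integer, Double>> raters,
                                              int maxNeighbors) {
        // Compute a similarity for every rater who overlaps with the user.
        Map<Integer, Double> sims = new HashMap<>();
        for (Map.Entry<Integer, Map<Integer, Double>> e : raters.entrySet()) {
            double s = similarity(user, e.getValue());
            if (s > 0) sims.put(e.getKey(), s);
        }
        // Keep only the most similar raters (the README uses 100).
        List<Integer> neighbors = sims.entrySet().stream()
                .sorted(Map.Entry.<Integer, Double>comparingByValue().reversed())
                .limit(maxNeighbors)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
        // Similarity-weighted average of the neighbors' ratings per movie.
        Map<Integer, Double> weighted = new HashMap<>();
        Map<Integer, Double> weightSum = new HashMap<>();
        for (Integer n : neighbors) {
            double s = sims.get(n);
            for (Map.Entry<Integer, Double> r : raters.get(n).entrySet()) {
                if (user.containsKey(r.getKey())) continue; // skip movies already rated
                weighted.merge(r.getKey(), s * r.getValue(), Double::sum);
                weightSum.merge(r.getKey(), s, Double::sum);
            }
        }
        Map<Integer, Double> scores = new HashMap<>();
        for (Map.Entry<Integer, Double> e : weighted.entrySet()) {
            scores.put(e.getKey(), e.getValue() / weightSum.get(e.getKey()));
        }
        return scores;
    }

    /** Returns the n highest-scoring movie ids among those accepted by the filter. */
    static List<Integer> topN(Map<Integer, Double> scores, Predicate<Integer> keep, int n) {
        return scores.entrySet().stream()
                .filter(e -> keep.test(e.getKey()))
                .sorted(Map.Entry.<Integer, Double>comparingByValue().reversed())
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

With a hypothetical `ratingCount` lookup, the two recommendation lists would be `topN(scores, m -> ratingCount.get(m) > 5000, 10)` for popular movies and `topN(scores, m -> ratingCount.get(m) < 5000, 10)` for the long tail, both taken from one call to `predictScores(user, raters, 100)`.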

Other Tools

License
