The project aims to predict the IMDB rating score of a certain film based on certain features and aspects. Looking at the models and results, you guys did a fairly good job.
A few things I liked:
- Everything was explained very thoroughly. All the steps were clearly outlined. I believe I can reproduce your results by just following the report.
- The model explanation and the graphs complemented the results very well.
- You guys tried a variety of models, including different kinds of regressions and decision trees.
A few things that made me a little bit worried:
- I'm not sure this will allow the companies (Amazon, Netflix, other film company) to make better decisions. Most of the important features, like popularity, are only known after the film was shown in the movie theater.
- The Mean Squared Error might not be appropriate as a loss function. Maybe an absolute error is more suited and intuitive. (It's unclear what 0.8 MSE means. Does it mean the majority of difference is small, but there are a few big ones? Note that a 0.3 difference in rating adds 0.09 SE, while a 2 point difference in rating adds 4 SE)
- I would carve out a testing dataset as a final out of sample test.
In general, I think you guys have done a good job.
The project aims to predict the IMDB rating score of a certain film based on certain features and aspects. Looking at the models and results, you guys did a fairly good job.
A few things I liked:
A few things that made me a little bit worried:
In general, I think you guys have done a good job.