Final Report Peer Review #12

@aleahck

The introduction does a good job of framing the topic from a financial standpoint, so it is clear how rating corresponds to money. In terms of general interest, I also think it is an interesting topic, since it seeks to quantify enjoyment based on a set of features, which is a big question across machine learning today: can the human experience always be quantified?

  • For some of your removed features, it would have been nice to see your reasoning as to why they would be unimportant to your model. Allowing your biases to influence the data is dangerous, but with good reasoning, it's usually both necessary and beneficial. Even if the reasoning seems obvious, I'd like to see yours.
  • The choice of encoding for actors/actresses also seemed a little contrived. Perhaps the 'popularity' of an actor is not how many films they appear in but is instead based on some other, ordinal-type value that isn't in this dataset. Maybe consider reevaluating what your encoding is actually representing.
  • On the feature transformation for studios: do you think that the actual studio producing the film, among the top 6, would not also be important?
  • I like how you adapted your model to the results you saw, like choosing to see how important features were together instead of just one at a time.
  • I really liked your summary of what dimensional analysis was. I thought it was pretty intuitive, and it showed me how you would be using it.
  • The visualization for the decision tree was helpful for understanding how the decision tree determines a split on a single feature. Since you have many features, it could have been even more helpful to have a decision tree visualization of the same kind, but with alternating features (probably to a lower depth for space reasons).
  • I also liked that you ran your models and then explained explicitly the shortcomings you could find in them, and specifically how you planned to identify and correct them (like the use of random forests).
  • The portion on validation was a little lacking. Splitting it into sections by the model being analyzed made that connection very clear, but it also made it harder for the reader to synthesize the results and decide which model was best.
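
To make the last point concrete, here is a minimal sketch of what a single side-by-side validation summary could look like. Everything in it is an assumption for illustration: the data is synthetic, the two models (`fit_mean`, `fit_line`) are toy stand-ins rather than the decision tree and random forest from the report, and the helper names are hypothetical. The idea is only that every model is scored on the same folds with the same metric and reported in one sorted table.

```python
import random
import statistics

def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_mse(fit, X, y, k=5, seed=0):
    """Mean held-out squared error of one model over k folds.

    `fit(X_train, y_train)` must return a function mapping one row to a prediction.
    """
    fold_errors = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        errors = [(predict(X[i]) - y[i]) ** 2 for i in held_out]
        fold_errors.append(statistics.mean(errors))
    return statistics.mean(fold_errors)

# Toy stand-in models (hypothetical, NOT the models from the report).
def fit_mean(X_train, y_train):
    mean_y = statistics.mean(y_train)
    return lambda row: mean_y  # always predicts the training mean

def fit_line(X_train, y_train):
    # One-feature least-squares line on the first column.
    xs = [row[0] for row in X_train]
    mx, my = statistics.mean(xs), statistics.mean(y_train)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (t - my) for x, t in zip(xs, y_train)) / sxx
    return lambda row: my + slope * (row[0] - mx)

def compare_models(models, X, y, k=5):
    """Score every model on the same folds; return (name, mse) best first."""
    scores = [(name, cross_val_mse(fit, X, y, k)) for name, fit in models.items()]
    return sorted(scores, key=lambda pair: pair[1])

# Synthetic data (assumption): a rating roughly linear in one feature, plus noise.
rng = random.Random(1)
X = [[rng.uniform(0, 10)] for _ in range(100)]
y = [5 + 0.3 * row[0] + rng.gauss(0, 0.5) for row in X]

results = compare_models({"mean baseline": fit_mean, "linear fit": fit_line}, X, y)
for name, mse in results:
    print(f"{name:15s} {mse:.3f}")
```

Because both models share the same fold split (same `seed`), the numbers in the table are directly comparable, which is what lets a reader pick a best model at a glance. A trivial baseline row is included on the assumption that it helps show whether the more complex models are actually adding anything.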
