In this notebook, we investigate the sales and ratings of video games released from 2000 to 2016. The analysis follows the CRISP-DM (Cross-Industry Standard Process for Data Mining) process. After analyzing the data, we also try to predict the ratings of some made-up new games with a self-built machine learning model.
As a passionate gamer, I stumbled across a video game dataset that includes Metacritic ratings. After downloading it, a few questions came to mind:
- What data can we analyze from a video game dataset found "in the wild" in the first place?
- How are certain publishers doing?
- Can we predict ratings of new games?
- Which platform receives the highest ratings?
- How did sales change through the years?
- What data can we analyze from a video game dataset found "in the wild" in the first place?
  - Ratings
    - Critic ratings went down
    - User ratings went up
    - Crossover in 2011
- How are certain publishers doing?
  - Nintendo consistently receives high ratings
  - Konami keeps getting better
- Can we predict ratings of new games?
  - With the small dataset we have, no, not really
- Which platform receives the highest ratings?
  - The PC received the highest critic ratings, unmatched by any other platform
- How did sales change through the years?
  - Sales peaked in 2008 and declined steadily from then on
Please look at the notebook for more insights.
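As a rough illustration of the kind of aggregation behind the platform and sales findings, here is a minimal pandas sketch. It is not the notebook's actual code. The column names (`Platform`, `Year_of_Release`, `Global_Sales`, `Critic_Score`) are assumed to match the dataset, and the rows are toy values made up for the example, not real data.

```python
import pandas as pd

# Toy rows for illustration only; these are NOT values from the real dataset.
games = pd.DataFrame({
    "Platform": ["PC", "PC", "Wii", "Wii", "PS3"],
    "Year_of_Release": [2007, 2008, 2008, 2009, 2009],
    "Global_Sales": [1.0, 2.0, 5.0, 3.0, 2.5],
    "Critic_Score": [90.0, 80.0, 70.0, None, 75.0],
})

# Mean critic score per platform; missing scores are skipped by default.
critic_by_platform = (
    games.groupby("Platform")["Critic_Score"].mean().sort_values(ascending=False)
)

# Total global sales per release year, plus the year with the highest total.
sales_by_year = games.groupby("Year_of_Release")["Global_Sales"].sum()
peak_year = sales_by_year.idxmax()

print(critic_by_platform)
print(sales_by_year)
print("Peak year:", peak_year)
```

Run against the real data, the same two `groupby` calls would give a per-platform ranking and a per-year sales series that can be plotted with matplotlib.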
Video_Games_Sales_as_at_22_Dec_2016.csv prepared by Rush Kirubi (https://www.kaggle.com/rush4ratio/video-game-sales-with-ratings).
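Loading the file and restricting it to the 2000 to 2016 window could look roughly like the sketch below. The helper name `load_games` and its parameters are hypothetical, and the column names `Year_of_Release` and `User_Score` are assumptions about the CSV layout. The sketch also assumes that user scores may contain non-numeric placeholders, which are turned into missing values.

```python
import pandas as pd


def load_games(path, first_year=2000, last_year=2016):
    """Load the CSV and keep rows released between first_year and last_year."""
    games = pd.read_csv(path)
    # Assumption: User_Score may hold non-numeric placeholders (e.g. "tbd");
    # coerce those to NaN so the column can be averaged later.
    games["User_Score"] = pd.to_numeric(games["User_Score"], errors="coerce")
    # Rows with a missing release year fail the range check and are dropped.
    in_range = games["Year_of_Release"].between(first_year, last_year)
    return games[in_range].reset_index(drop=True)
```

A call such as `games = load_games("Video_Games_Sales_as_at_22_Dec_2016.csv")` would then return a frame ready for grouping and plotting.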
- python 3.7.4
- conda 4.7.12
- notebook 6.0.1
- numpy 1.16.5
- pandas 0.25.1
- scikit-learn 0.21.2
- matplotlib 3.1.1
Myself
- Udacity
- Kaggle
- Many tutorials, websites and, of course, Stack Overflow
The code is published under the GPLv3: https://www.gnu.org/licenses/gpl-3.0.en.html