Project Link: https://rpubs.com/suryanshchauhan/1182191
This project analyzes the 2023-24 Premier League season using data from excel4soccer.com, current through April 23, 2024. It aims to explain team standings, evaluate player performance, and provide predictive insights for future matches through statistical and machine learning models.
The data came from several Excel sheets covering player stats, team stats, league standings, lineups, plays, and fixtures. Preprocessing involved cleaning and structuring the data for exploratory analysis and model building.
The EDA phase visualized current league standings, home vs away goals, team performance by goal difference per match, fouls and cards, top scorers, and seasonal performance of top teams. These insights helped understand team strategies and player contributions.
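As a rough illustration of the "goal difference per match" metric mentioned above, the following Python sketch ranks teams by that value. The original analysis was done in R; the field names (`played`, `goals_for`, `goals_against`) and the three rows are placeholders invented for this example, not values from the project's data.

```python
# Illustrative rows only: team names and numbers are placeholders,
# not figures from the project's dataset.
standings = [
    {"team": "Team A", "played": 10, "goals_for": 20, "goals_against": 10},
    {"team": "Team B", "played": 10, "goals_for": 12, "goals_against": 15},
    {"team": "Team C", "played": 8, "goals_for": 8, "goals_against": 8},
]


def goal_diff_per_match(row):
    """Return (goals_for - goals_against) / played, or 0.0 if no matches were played."""
    if row["played"] == 0:
        return 0.0
    return (row["goals_for"] - row["goals_against"]) / row["played"]


# Sort from best to worst goal difference per match.
ranked = sorted(standings, key=goal_diff_per_match, reverse=True)

for row in ranked:
    print(f'{row["team"]}: {goal_diff_per_match(row):+.2f}')
```

Dividing by matches played keeps teams comparable when they have played different numbers of fixtures, which is typically the case mid-season.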
Statistical summaries highlighted distributions and tendencies across metrics like points, wins, losses, goals, clean sheets, shots, passes, and fouls. Inferential statistics helped identify key predictors of team success and player performance.
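A minimal sketch of this kind of summary, written in Python with only the standard library (the project itself used R). The `points` list is a placeholder, and the choice of statistics reported here is an assumption rather than the project's exact output.

```python
import statistics

# Placeholder values standing in for one metric (e.g. points per team);
# these are not figures from the project's dataset.
points = [10, 20, 30, 40, 50]


def summarize(values):
    """Return basic distribution statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


summary = summarize(points)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The same function could be applied to each metric in turn (wins, goals, clean sheets, and so on) to build a comparable table of summaries.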
Detailed assessments of player and team performance were conducted using R, focusing on goal scoring, defensive actions, and overall efficiency. This helped identify high performers and areas needing improvement.
Linear regression models related performance metrics to points earned and were used to forecast team points and standings. The predictions were visualized to assess model accuracy and potential season outcomes.
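To make the regression step concrete, here is a minimal Python sketch of ordinary least squares with a single predictor. It is not the project's model: the original was built in R, the source does not say which metrics were used, and the choice of goal difference as the predictor and all numbers below are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict y for a new x using the fitted line."""
    return slope * x + intercept


# Placeholder data: goal difference (assumed predictor) and points earned.
# These values are invented for the example, not taken from the project.
goal_diff = [-10, 0, 10, 20]
points = [20, 30, 40, 50]

slope, intercept = fit_line(goal_diff, points)
print(f"points = {slope:.2f} * goal_diff + {intercept:.2f}")
print(f"predicted points at goal_diff=15: {predict(15, slope, intercept):.1f}")
```

A model with several predictors would need a matrix-based solver or a statistics library; the single-predictor form is shown only because it keeps the arithmetic easy to follow.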
The analysis predicted the potential league winner and final standings, highlighting Manchester City's strong performance. Recommendations for teams and players were based on the derived insights and emphasized sportsmanship and responsible data handling.
The project adhered to ethical standards by ensuring anonymity of player data, fairness in analysis, and transparency in communication. It aimed to enhance understanding and enjoyment of the Premier League while respecting stakeholder rights and privacy.