FBRef_DB

Usage

  1. Download the .zip files for the leagues you want in your database from this Google Drive link. (Last updated 9/Nov/2024)

  2. Unzip the downloaded files into the web_pages/ folder. This creates a folder for each chosen league containing the HTML of the fbref web page for every league match since the start of the 2017-2018 season. The folder structure must be ./web_pages/<league>/<season>/file

  3. If you have downloaded more than just the Premier_League.zip file, change the competitions parameter in the main() function of main.py to, e.g., main(competitions=["La_Liga", "Ligue_1", "Premier_League"])

  4. Run main.py. This checks fbref for any newly played matches in your specified leagues and, if any are found, saves their pages to the web_pages/ folder. It then parses these pages and adds their data to the master.db database file.
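
Before running main.py, it can help to confirm that the unzipped files follow the ./web_pages/<league>/<season>/file layout from step 2. The sketch below is not part of this repo: `count_match_pages` is a hypothetical helper that only counts the files found under each league and season folder, and it assumes nothing about the file names themselves.

```python
from pathlib import Path


def count_match_pages(root="web_pages"):
    """Return {league: {season: file_count}} for root/<league>/<season>/file.

    Hypothetical helper for illustration; it is not defined in main.py.
    """
    counts = {}
    root_path = Path(root)
    if not root_path.is_dir():
        # Nothing unzipped yet (or wrong working directory).
        return counts
    for league_dir in sorted(p for p in root_path.iterdir() if p.is_dir()):
        seasons = {}
        for season_dir in sorted(p for p in league_dir.iterdir() if p.is_dir()):
            # Count only regular files directly inside the season folder.
            seasons[season_dir.name] = sum(
                1 for f in season_dir.iterdir() if f.is_file()
            )
        counts[league_dir.name] = seasons
    return counts


if __name__ == "__main__":
    for league, seasons in count_match_pages().items():
        for season, n_files in seasons.items():
            print(f"{league}/{season}: {n_files} files")
```

A league that is missing from the output, or a season with zero files, suggests the archive was unzipped to the wrong level.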

You can then use a program like DB Browser to explore this data using SQL queries. An overview of the database structure can be found here.
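
The same kind of exploration can be done from Python. The sketch below assumes master.db is a SQLite file (which is what DB Browser for SQLite opens) and uses only the standard-library `sqlite3` module. `list_tables` is a hypothetical helper, not part of this repo, and it makes no assumption about the table names in the database.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path="master.db"):
    """Return the names of all tables in the SQLite file at db_path.

    Hypothetical helper for illustration; assumes the file is SQLite.
    """
    if not Path(db_path).is_file():
        # sqlite3.connect would silently create an empty file otherwise.
        raise FileNotFoundError(db_path)
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling `print(list_tables("master.db"))` from the repo folder shows which tables are available before writing any queries against them.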

Note that the master.db file in this repo contains data for all of the top 6 leagues. The latest version of premier_league.db (containing only Premier League data) is also available at the Google Drive link.