eddwebster / football_analytics

Licence: other
⚽📊 A collection of football analytics projects, data, and analysis by Edd Webster (@eddwebster), including a curated list of publicly available resources published by the football analytics community.

Football Analytics

A space for football analytics projects by Edd Webster, including a curated list of publicly available resources published by the football analytics community.


-----------------------------------------------------

👋 About This Repository and Author

Edd Webster

Please note, all the code and analysis produced in this repository is mine and/or credited to the publicly produced code, data, and/or libraries used, and is in no way related to the work and analysis I produce for my employers.

I recently rewrote this README to include links not only to my own work, but also to include a concise list of learning resources, data sources, libraries, papers, blogs, podcasts, etc., created by all those that have made contributions to the football analytics community. This will be a constant work in progress so if you can think of any resources that I've missed, or you yourself have created something that you believe should be added and is currently not available, please feel free to create a pull request or send me a message.

Credits to the Soccer Analytics Handbook by Devin Pleuler, Awesome Soccer Analytics by Matias Mascioto, and Jan Van Haaren's Soccer Analytics 2021 Review, Soccer Analytics 2020 Review and soccer-analytics-resources GitHub repo, which were all used to plug gaps in the list once it was published. Credit also to Matias Singers for his awesome-readme repository, used to restyle this README.

If you like the repo, please feel free to give it a ⭐ (top right). Cheers!

For more information about this repository and the author, I am available through all the following channels:

Personal Website, Email, Twitter, LinkedIn, About.me, GitHub, HackerRank, Coder Rank, Tableau

-----------------------------------------------------

📖 Table of Contents

  1. About This Repository and Author
  2. Table of Contents
  3. Prerequisites
  4. Repository Structure
  5. Notebooks
  6. Data Visualisation and Tableau
  7. Resources
  8. Contributing
  9. Acknowledgements

-----------------------------------------------------

Prerequisites

The only prerequisites for using this GitHub repo are a computer, an internet connection and the desire to learn more about football analytics.

The open-source Python libraries that feature in the notebooks in this repository are some of the most commonly used in Data Science. Most of these libraries can be obtained by downloading and installing Anaconda. Step-by-step guides to do this can be found for Windows here and Mac here, as well as in the Anaconda documentation itself here.
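As a quick way to see which libraries are still missing after installing Anaconda, the following is a minimal sketch using only the Python standard library. The names in `ASSUMED_LIBRARIES` are an assumption based on common Data Science usage, not the definitive list for this repository, and `missing_libraries` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Assumption: an illustrative shortlist of import names, not the
# repository's actual list of required libraries.
ASSUMED_LIBRARIES = ["numpy", "pandas", "matplotlib", "seaborn", "sklearn"]


def missing_libraries(names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_libraries(ASSUMED_LIBRARIES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All assumed libraries are available.")
```

Anything reported as missing can usually be added with `conda install` or `pip install`, depending on how your environment is managed.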

Back to Contents

-----------------------------------------------------

🌵 Repository Structure

The contents of this GitHub repository are organised as follows:

football analytics github repository
.
│
├── dashboards
│
├── data
│   │
│   ├── capology
│   │   │
│   │   ├── raw
│   │   │   ├── bundeliga
│   │   │   ├── la-liga
│   │   │   ├── ligue-1
│   │   │   ├── mls
│   │   │   ├── premier-league
│   │   │   ├── serie-a
│   │   │   └── capology_all_latest.csv
│   │   │
│   │   └── engineered
│   │       ├── capology_all_latest.csv
│   │       ├── capology_big5_latest.csv
│   │       └── capology_big5_mls_latest.csv
│   │
│   ├── elo
│   │   │
│   │   ├── raw
│   │   │   ├── ratings_by_date
│   │   │   └── ratings_by_team
│   │   │
│   │   └── engineered
│   │       │
│   │       ├── ratings_by_date
│   │       │   └── elo_team_rating_per_year_big5_latest.csv
│   │       │
│   │       └── ratings_by_team
│   │           ├── elo_team_rating_per_year_all_latest.csv
│   │           └── elo_team_rating_per_year_big5_latest.csv
│   │
│   ├── export
│   │
│   ├── fbref
│   │
│   ├── fifa
│   │
│   ├── guardian
│   │
│   ├── metrica-sports
│   │
│   ├── opta
│   │   │
│   │   └── raw
│   │       │
│   │       ├── premier-league
│   │       └── serie-a
│   │
│   ├── reference
│   │
│   ├── sb
│   │
│   ├── shots
│   │
│   ├── stats-perform
│   │
│   ├── stratabet
│   │
│   ├── tm
│   │
│   ├── touchline-analytics
│   │
│   ├── twenty-first-group
│   │
│   ├── understat
│   │   │
│   │   ├── raw
│   │   │   │
│   │   │   ├── metadata
│   │   │   │   ├── bundeliga
│   │   │   │   ├── la-liga
│   │   │   │   ├── ligue-1
│   │   │   │   ├── premier-league
│   │   │   │   ├── rfpl
│   │   │   │   ├── serie-a
│   │   │   │   └── understat_metadata_all_latest.csv
│   │   │   │
│   │   │   └── shots
│   │   │       ├── bundeliga
│   │   │       ├── la-liga
│   │   │       ├── ligue-1
│   │   │       ├── premier-league
│   │   │       ├── rfpl
│   │   │       ├── serie-a
│   │   │       └── understat_shots_all_latest.csv
│   │   │
│   │   └── engineered
│   │       │
│   │       ├── metadata
│   │       │   ├── bundeliga
│   │       │   ├── la-liga
│   │       │   ├── ligue-1
│   │       │   ├── premier-league
│   │       │   ├── rfpl
│   │       │   ├── serie-a
│   │       │   └── understat_metadata_all_latest.csv
│   │       │
│   │       └── shots
│   │           ├── bundeliga
│   │           ├── la-liga
│   │           ├── ligue-1
│   │           ├── premier-league
│   │           ├── rfpl
│   │           ├── serie-a
│   │           └── understat_shots_all_latest.csv
│   │
│   └── wyscout
│
├── docs
│   ├── centre-circle
│   ├── metrica-sports
│   ├── opta
│   ├── sb
│   ├── shots
│   ├── stratabet
│   └── wyscout
│
├── gif
│   └── fig
│
├── img
│   │
│   ├── club_badges
│   │
│   ├── eddwebster
│   │
│   ├── fig
│   │
│   ├── logos
│   │
│   ├── pitches
│   │
│   └── vizpiration
│       ├── age_profile_charts
│       ├── average_position_formation_charts
│       ├── bar_charts
│       ├── bumpy_chaarts
│       ├── carry_maps
│       ├── connected_dot_charts
│       ├── crossing_maps
│       ├── dashboards
│       ├── diamond_charts
│       ├── distribution_plots
│       ├── expected_goals_diagrams
│       ├── games_played_charts
│       ├── goalkeeper_shots_Faced_maps
│       ├── heat_maps
│       ├── injury_list_charts
│       ├── league_tables
│       ├── line_charts
│       ├── minutes_share_charts
│       ├── miscellaneous
│       ├── pass_maps
│       ├── passing_networks
│       ├── pizza_charts
│       ├── player_possessions_charts
│       ├── player_profiles
│       ├── player_tables
│       ├── possession_share_maps
│       ├── prediction_tables
│       ├── race_charts
│       ├── radars
│       ├── scattergrams
│       ├── shot_maps
│       ├── squad_churn_charts
│       ├── squad_depth_charts
│       ├── summary_stats
│       ├── touch_maps
│       ├── tree_diagrams
│       ├── ven_diagrams
│       ├── voronoi_diagrams
│       └── waffle_charts
│
├── notebooks
│   │
│   ├── 1_data_scraping
│   │   ├── Capology Player Salary Web Scraping.ipynb
│   │   ├── FBref Player Stats Web Scraping.ipynb
│   │   └── TransferMarkt Player Bio and Status Web Scraping.ipynb
│   │
│   ├── 2_data_parsing
│   │   ├── ELO Team Ratings Data Parsing.ipynb
│   │   ├── StatsBomb Data Parsing.ipynb
│   │   └── Wyscout Data Parsing.ipynb
│   │
│   ├── 3_data_engineering
│   │   ├── Capology Player Salary Data Engineering.ipynb
│   │   ├── Centre Circle Opta CPL Data Engineering.ipynb
│   │   ├── FBref Player Stats Data Engineering.ipynb
│   │   ├── Opta #mcfcanalytics PL 2011-2012.ipynb
│   │   ├── StatsBomb Data Engineering.ipynb
│   │   ├── StrataBet Data Engineering.ipynb
│   │   ├── The Guardian Player Recorded Transfer Fees Data Engineering.ipynb
│   │   ├── TransferMarkt Historical Market Value Data Engineering.ipynb
│   │   ├── TransferMarkt Player Bio and Status Data Engineering.ipynb
│   │   ├── TransferMarkt Player Recorded Transfer Fees Data Engineering.ipynb
│   │   ├── Understat Data Engineering.ipynb
│   │   └── Wyscout Data Engineering.ipynb
│   │
│   ├── 4_data_unification
│   │   └── Unification of Aggregated Seasonal Football Datasets.ipynb
│   │
│   ├── 5_data_analysis_and_projects
│   │   │
│   │   ├── player_similarity_and_clustering
│   │   │   └── PCA and K-Means Clustering of 'Piqué-like' Defenders.ipynb
│   │   │
│   │   ├── tracking_data
│   │   │   │
│   │   │   ├── metrica_sports
│   │   │   │   └── Metrica Tracking Data EDA.ipynb
│   │   │   │
│   │   │   └── signality
│   │   │       ├── Signality Tracking Data Engineering.ipynb
│   │   │       └── Signality Tracking Data EDA.ipynb
│   │   │
│   │   └── xg_modeling
│   │       │
│   │       ├── shots_dataset
│   │       │   │
│   │       │   ├── chance_quality_modelling
│   │       │   │   ├── 1) Logistic Regression Expected Goals Model.ipynb
│   │       │   │   ├── 2) XGBoost Expected Goals Model.ipynb
│   │       │   │   └── 3) CatBoost Expected Goals Model.ipynb
│   │       │   │
│   │       │   └── metrica-sports
│   │       │       └── Metrica Sports.ipynb
│   │       │
│   │       ├── statsbomb_dataset
│   │       │   └── Introduction to Building Expected Goals Models Using StatsBomb 360 Data.ipynb
│   │       │
│   │       └── opta_dataset
│   │           └── Training of an Expected Goals Model Using Opta Event Data.ipynb
│   │
│   └── 6_data_visualisation
│
├── research
│   ├── papers
│   └── slides
│
├── scripts
│
├── spreadsheets
│
└── video
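To make the layout above easier to use from a notebook or script, here is a small sketch that builds paths into the `engineered` folders with `pathlib`. It assumes the code runs from the repository root; `REPO_ROOT` and `engineered_path` are hypothetical names introduced only for illustration, and the function does not check that the file exists locally.

```python
from pathlib import Path

# Assumption: the working directory is the root of the repository.
REPO_ROOT = Path(".")


def engineered_path(source, *parts):
    """Build a path under data/<source>/engineered/ without touching the disk."""
    return REPO_ROOT / "data" / source / "engineered" / Path(*parts)


# File name taken from the tree above.
shots_csv = engineered_path("understat", "shots", "understat_shots_all_latest.csv")
print(shots_csv.as_posix())
```

The resulting path can then be passed to whichever reader you prefer (for example a CSV reader), once the file has been generated by the notebooks or downloaded.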

Back to Contents

-----------------------------------------------------

📔 Notebooks

Nearly all code in this repository is in Jupyter notebooks, organised in the following workflow:

  1. Webscraping;
  2. Data Parsing;
  3. Data Engineering;
  4. Data Unification; and
  5. Data Analysis - projects include working with Tracking data, constructing VAEP models (as introduced by SciSports), building xG models using Logistic Regression, Random Forests and Gradient Boosted Decision Tree algorithms such as XGBoost and CatBoost, and analysing player similarity using PCA and K-Means clustering.
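To give a flavour of the simplest of the xG approaches listed above, the following is a toy sketch of a logistic regression model fitted by gradient descent in plain Python. It is not the code from the notebooks: the shots are synthetic, shot distance is the only feature, and every function name and parameter value here is an assumption chosen for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_synthetic_shots(n, seed=0):
    """Generate toy shots as (distance in metres, goal flag). Not real data."""
    rng = random.Random(seed)
    shots = []
    for _ in range(n):
        distance = rng.uniform(2.0, 35.0)
        # Assumed ground truth: scoring becomes less likely with distance.
        p_goal = sigmoid(1.0 - 0.15 * distance)
        shots.append((distance, 1 if rng.random() < p_goal else 0))
    return shots


def fit_logistic(shots, lr=0.1, epochs=1500):
    """Fit P(goal) = sigmoid(b0 + b1 * distance / 10) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(shots)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for distance, goal in shots:
            x = distance / 10.0  # crude scaling to keep the updates stable
            err = sigmoid(b0 + b1 * x) - goal
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def xg(distance, b0, b1):
    """Estimated goal probability for a shot from the given distance."""
    return sigmoid(b0 + b1 * distance / 10.0)


shots = make_synthetic_shots(400)
b0, b1 = fit_logistic(shots)
print(round(xg(6.0, b0, b1), 2), round(xg(25.0, b0, b1), 2))
```

A real model would use actual shot data and more features (angle, body part, game state and so on), and would normally be fitted with an established library rather than a hand-written loop.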

Back to Contents

-----------------------------------------------------

📊 Data Visualisation and Tableau Dashboards

For Tableau dashboards produced using the data engineered in the notebooks in this repository, please see my Tableau Public profile: public.tableau.com/profile/edd.webster.

Example Tableau dashboards:

Back to Contents

-----------------------------------------------------

📑 Resources

📑 Getting Started with Football Analytics

Good resources for those new to the use of data in football:

Back to Contents

-----------------------------------------------------

💾 Data

ℹ️ Data Sources

All publicly available data sources and datasets relating to football, from Tracking data, Event data, aggregated player performance data, detailed match statistics, injury records and transfer values, and more.

Data sources that have been used in the code and analysis in this repository can be found in the data subfolder of this repository or in Google Drive (due to GitHub's 100 MB file limit) [link]. However, all code in this repository should enable you to scrape, parse, and engineer the datasets as per the output used for the analysis and visualisations featured.

To learn more about the different types of data available, such as Event and Tracking data, please see the "Where can I get data?" section of Devin Pleuler's soccer_analytics_handbook [link].

Event data
Tracking data
Aggregated Player/Team Performance data
Team Rating data
Physical data
Results and Matchsheet data
Financial, Valuation, and Transfer data
Odds, Betting, and Predictions data
Plotting Tools

Also see Mark Wilkin's Twitter thread [link]:

Reference data
Miscellaneous Data

📄 Documentation

All documentation saved locally in the documentation subfolder, including:

Data Types and Companies

Data Providers
Tracking
Videos / Performances Analysis

Back to Contents

-----------------------------------------------------

🧑‍🎓 Tutorials

Python

R

Tableau

Check out the Tableau for Sports Discord server organised by Ninad Barbadikar to interact with a community of Tableau developers.

For a YouTube playlist of Tableau-football videos and tutorials that I have collated from various sources including the Tableau Football User Group, Rob Carroll, Tom Goodall, and Ninad Barbadikar, see the following [link].

PowerBI

For a YouTube playlist of Power BI-football videos and tutorials that I have collated from various sources including Futbol AnalysR and PowerBI for Sports, see the following [link].

SQL

Excel

PowerPoint

Back to Contents

-----------------------------------------------------

🏛️ Libraries

Python

  • codeball - data driven tactical and video analysis of soccer games;
  • Football Packing - a Python package to calculate packing rate for a given pass in football by Samira Kumar. This is a variation of the metric created by Impect;
  • kloppy - a Python package providing (de)serializers for soccer tracking- and event data, standardized data models, filters, and transformers designed to make working with different tracking- and event data like a breeze. See the YouTube tutorial [link];
  • matplotsoccer - a Python library for visualising soccer event data by Tom Decroos;
  • mplsoccer - a Python library for drawing soccer/football pitches in Matplotlib and loading StatsBomb open-data by Andrew Rowlinson;
  • nayra - API that allows you to track soccer players from camera inputs, and evaluate them with an Expected Discounted Goal (EDG) Agent. See the Evaluating Soccer Player paper by Paul Garnier and Théophane Gregoir;
  • northpitch - a Python football plotting library that sits on top of Matplotlib by Devin Pleuler;
  • PCA_Player_Finder by Parth Athale;
  • PySport including PySport Soccer - collection of open-source sport packages including many of those mentioned in this section, by Koen Vossen;
  • PyWaffle - an open source, MIT-licensed Python package for plotting waffle charts by Peter McKeever;
  • ScraperFC - a Python package by Owen Seymour to scrape FiveThirtyEight data, aggregated StatsBomb data from FBref, Understat shooting and player meta data including values for xG, xA, xGChain, xGBuildup, player salary data from Capology, and WhoScored? Opta Event provided by StatsPerform;
  • Scrape-FBref-data - Python library to scrape aggregated StatsBomb data via FBref by Parth Athale, which in turn was updated from Christopher Martin's repository;
  • statsbombapi - a Python API wrapper and dataclasses for StatsBomb data;
  • statsbombpy - a Python library written by Francisco Goitia to access StatsBomb data;
  • statsbomb-parser - Python library to convert StatsBomb's JSON data into easy-to-use CSV format;
  • socceraction - a Python library for valuing the individual actions performed by soccer players. Includes an Expected Threat (xT) implementation by Tom Decroos et al.;
  • soccermix - a soft clustering technique based on mixture models that decomposes event stream data into a number of prototypical actions of a specific type, location, and direction by Tom Decroos and ML-KULeuven;
  • soccer_xg - a Python package for training and analyzing expected goals (xG) models in football;
  • soccerplots - a Python package that can be used for making visualizations for football analytics by Anmol Durgapal;
  • sync.soccer - a Python package to synchronise football datasets, so that an event in one dataset is matched to the corresponding event or snapshot in the other by Marek Kwiatkowski. This repository contains an implementation that aligns Opta's (now Stats Perform) F24 feeds to ChyronHego's Tracab files. More formats may be added in the future. See the following blog post for methodology [link];
  • tmscrape - a Python TransferMarkt webscraper by danzn1;
  • Tyrone Mings - a Python TransferMarkt webscraper by FCrSTATS;
  • understat - a Python webscraper by Amos Bastian to scrape Understat shooting and player meta data.
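The idea behind synchronising event and tracking data, as described for sync.soccer above, can be illustrated with a small dynamic programming sketch. This is not sync.soccer's implementation: it is a simplified version that assumes timestamps are the only signal, that each event is matched to exactly one tracking frame, and that matches must preserve order. The function name and the example timestamps are hypothetical.

```python
def align_events_to_frames(event_times, frame_times):
    """Match each event to one frame, keeping order, minimising the total time gap."""
    n, m = len(event_times), len(frame_times)
    if n == 0:
        return []
    if m == 0:
        raise ValueError("need at least one frame")
    inf = float("inf")
    # cost[i][j]: best total gap for events 0..i with event i matched to frame j
    cost = [[inf] * m for _ in range(n)]
    back = [[0] * m for _ in range(n)]
    for j in range(m):
        cost[0][j] = abs(event_times[0] - frame_times[j])
    for i in range(1, n):
        best_k = 0  # cheapest frame for event i-1 among frames 0..j
        for j in range(m):
            if cost[i - 1][j] < cost[i - 1][best_k]:
                best_k = j
            cost[i][j] = abs(event_times[i] - frame_times[j]) + cost[i - 1][best_k]
            back[i][j] = best_k
    # Backtrack from the cheapest frame for the final event.
    j = min(range(m), key=lambda col: cost[n - 1][col])
    matches = [0] * n
    for i in range(n - 1, -1, -1):
        matches[i] = j
        j = back[i][j]
    return matches


frames = [0.0, 1.0, 2.0, 3.0, 4.0]   # frame timestamps in seconds (made up)
events = [0.9, 2.2, 2.4, 3.8]        # event timestamps in seconds (made up)
print(align_events_to_frames(events, frames))  # [1, 2, 2, 4]
```

In practice a scoring function would also compare things like ball position and the player involved, since event timestamps alone are often too noisy to rely on.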

R

Back to Contents

-----------------------------------------------------

GitHub Repositories

Python

R

Back to Contents

-----------------------------------------------------

Apps

Back to Contents

-----------------------------------------------------

📊 Data Visualisation Resources and Tools

Resources to aid data visualisation:

Back to Contents

-----------------------------------------------------

✒️ Written Pieces

Blogs

Many of these blog posts are recommended in Sam Gregory's Best Football Analytics Pieces piece and Tom Worville's “What’s the best Football Analytics piece you’ve ever read?”, both articles now a few years old. This section is very subjective so if I've missed anything obvious, apologies.

Blogs and Data Analytics Websites

📃 Papers

Many of the papers included in this list have been included after reading Jan Van Haaren's Soccer Analytics 2021 Review and Soccer Analytics 2020 Review. Props to him for reading a paper a week and making his thoughts publicly available!

The following Shiny App from Lars Maurath is a great tool for looking up publications [link].

2021
2020
2019

2018

2017
2016
2015
2014
2011
2002
1997
1971

Newsletters

News Articles

📚 Books

See the Sports Analytics Reading List by Measureables (Brendan Kent), as part of his Sports Analytics 101 series.

The following use Amazon UK links where available.

Magazines

Back to Contents

-----------------------------------------------------

📼 Video

YouTube Playlists

Custom Playlists Curated by Myself

The following is a series of playlists that I collated originally for my own personal viewing, but they may be useful to you:

Public Playlists

Playlists created by others

YouTube Channels

Video Analysis

Webinars and Lectures

Ted Talks

Documentaries

Match Highlights

Other

Back to Contents

-----------------------------------------------------

🔊 Podcasts

Below I've tried to include both dedicated Sports/Football Analytics podcasts and notable episodes of other podcasts that have analytical content/interviews. Spotify and YouTube links are used where available. All episodes mentioned below that are available on Spotify can be found in the following playlist (updated periodically): [link].

Football Analytics Podcasts

Notable Episodes (including non-football-data-specific podcasts)

Back to Contents

-----------------------------------------------------

👨‍💻 Notable Figures and Twitter Accounts

Back to Contents

-----------------------------------------------------

🗓️ Events and Conferences

Back to Contents

-----------------------------------------------------

Competitions

The following includes non-football competitions.

Back to Contents

-----------------------------------------------------

Courses

Back to Contents

-----------------------------------------------------

💼 Jobs

For live job postings tracked by the community, check the Jobs channel of the Football in Numbers Discord server.

Back to Contents

-----------------------------------------------------

Discord/Slack groups

Back to Contents

-----------------------------------------------------

🔑 Key Concepts

A focus on some of the key topics in football analytics. Most of the following resources feature above but are reorganised here by topic. This section is still very much a work in progress and may be missing resources mentioned above.

History of Football Analytics

Expected Goals (xG) Modeling

Videos

For a playlist of Expected Goals related videos available on YouTube, see the following playlist I have created [link].

Webinars and Lectures
Tutorials
Notable Models
Written Pieces

For a list of Expected Goals literature collated by Keith Lyons, see the following [link].

Libraries
GitHub Repositories
Podcasts
Tweets

Web Scraping Football Data

Written Pieces
Videos
Libraries

Tracking Data

Pitch Control Modeling

Tutorials

Pitch Control modelling and Valuing Actions tutorials by Laurie Shaw as part of his Metrica Sports Tracking data series for Friends of Tracking. See the following for code [link];

GitHub Repositories
Written Pieces
Video
Podcasts

Possession Value (PV) Frameworks

General
Expected Threat (xT)
Valuing Actions by Estimating Probabilities (VAEP)
Goals Added (g+)
On-Ball Value (OBV)

Dixon Coles Modeling

Player Similarity and Style Analysis

Written Pieces
Videos
Tutorials
GitHub Repositories

Reinforcement Learning for Football Simulation

Team Playing Style Analysis

Written Pieces
Papers
Blogs
Videos
GitHub Repositories

Set Pieces

Section created after seeing the following tweets and threads by Ashwin Raman ([link]) and Stuart Reid ([link]).

Radars

Recruitment Analysis

Quantifying Relative Club and League Strength

Models
Financial
Historical Match Results
Historical Statistical Player Performance
Articles
Papers
Videos
Miscellaneous
  • Tweets by AI Abacus [link] and [link]. They use a simple Dixon-Coles method focusing on historic results going back 15 years to build an order of hierarchy amongst teams in leagues that might have never played each other.
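For readers unfamiliar with the Dixon-Coles method mentioned above, the sketch below shows its core scoreline probability: two Poisson goal counts multiplied by a correction factor for low-scoring results. It does not reproduce the approach in the tweets; the goal rates and the `rho` value are made-up inputs, and estimating them from historic results (the hard part) is left out entirely.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals for a Poisson-distributed goal count."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def dixon_coles_tau(home_goals, away_goals, home_rate, away_rate, rho):
    """Low-score dependence correction from the Dixon-Coles model."""
    if home_goals == 0 and away_goals == 0:
        return 1.0 - home_rate * away_rate * rho
    if home_goals == 0 and away_goals == 1:
        return 1.0 + home_rate * rho
    if home_goals == 1 and away_goals == 0:
        return 1.0 + away_rate * rho
    if home_goals == 1 and away_goals == 1:
        return 1.0 - rho
    return 1.0


def scoreline_probability(home_goals, away_goals, home_rate, away_rate, rho):
    """P(home_goals, away_goals) under the Dixon-Coles adjustment."""
    return (
        dixon_coles_tau(home_goals, away_goals, home_rate, away_rate, rho)
        * poisson_pmf(home_goals, home_rate)
        * poisson_pmf(away_goals, away_rate)
    )


def outcome_probabilities(home_rate, away_rate, rho, max_goals=15):
    """Return (home win, draw, away win) by summing over a grid of scorelines."""
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = scoreline_probability(h, a, home_rate, away_rate, rho)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    return home, draw, away


# Illustrative inputs only: expected goals of 1.5 (home) and 1.1 (away).
print(outcome_probabilities(1.5, 1.1, rho=-0.1))
```

In the full model each team's goal rate is built from attack and defence strengths plus a home advantage term, which is what allows teams to be ranked against one another from historic results.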

Tactics

Counter Attacking
Articles
Papers
Videos
Podcasts
Pressing
Articles
Videos
Counter Pressing
Articles
Papers
Videos

Player Valuation Modeling

Models
Data

Game Win Probability Modeling

Goalkeeper Analysis

Back to Contents

-----------------------------------------------------

❔ Miscellaneous

Back to Contents

-----------------------------------------------------

Contributing

This GitHub repository and resources list will be a constant work in progress so if you can think of any resources that I've missed, feel free to create a pull request or send me a message @ [email protected] or @eddwebster.

Back to Contents

-----------------------------------------------------

Acknowledgements

Back to the Top
