josedv82 / public_sport_science_datasets

Licence: other
An ongoing compilation of publicly available datasets for sport science projects.

Public Sport Science Datasets


Motivation

The importance of data skills for sport scientists is not new. Regardless of experience level, being able to showcase skills in this area helps in various ways: in future job interviews, when networking, or by creating opportunities to collaborate with others in the field.

Although more sport analytics courses and learning materials are available nowadays, a comment I often hear about learning data skills is that the datasets used during the early learning stages are not motivating and not sport specific.

Unfortunately, sport scientists may not always have access to the type of data that is usually available to professional teams and sport organizations. However, there are more and more publicly available datasets that can be used to develop and show your data skills, analytical process and creativity in sport science analysis.

This resource aims to list some of those publicly available datasets so that they can be used to create sport science data projects. The goal is to keep adding more over time.


Click here to see the list of all available datasets.

Datasets

  1. Tennis Player Tracking ATP Tour Australian Open Final: Tracking data from the 2019 Australian Open Final between Nadal and Djokovic. Includes information about events as well as 2D player positions. | Download | Source | Type: CSV |

  2. NBA Player Shooting Motions: 3D ball tracking data of basketball shots for a selected group of NBA players. | Download | Source | Type: Feather |

  3. NBA SportVU Athlete Tracking: Positional tracking data for the 2015 NBA season captured via SportVU. Includes raw x/y data, play by play logs and space coordinates for shots. | Download | Source | Type: 7-zip, CSV |

  4. NBA Schedule Metrics Since 1947: NBA schedule and travel-related metrics since 1947 (distance traveled, rest between games, location, time zone shifts, etc.) for both teams in a game. | Download | Source | Type: CSV |

  5. NBA Draft & Combine: NBA Draft selections since 1947, along with two files containing anthropometric and physical performance data from combines since the 2000-01 season. | Download | Source | Type: CSV |

  6. NFL Combine & Pro Day Data: Data from NFL combines and pro days since 1987. This dataset contains more than 13K observations with anthropometric and physical profile metrics. | Download | Source | Type: CSV |

  7. NFL Game Tracking: Athlete tracking data from each game in the 2017 NFL season. Includes files with information about players, events and play by play. | Download | Source | Type: Feather |

  8. MLB Sprint Running Metrics: Split times (0 to 90 ft) and max running speed (ft/s) for every MLB position player from 2015 to May 2021. | Download | Source | Type: XLSX |

  9. Annotated Sport Videos Dataset: This dataset contains links to 1,133,158 YouTube videos annotated with 487 sports labels. Suitable for machine learning and computer vision related work. | Download | Source | Type: Video |

  10. Video Database of Golf Swing Sequencing: GolfDB is a high-quality video dataset created for general recognition applications in the sport of golf, and specifically for the task of golf swing sequencing. | Download | Source | Type: Video |

  11. Oura Ring Data: This dataset contains a year's worth of wellness data collected with the Oura ring. | Download | Source | Type: CSV |

  12. Sleep Dataset: Acceleration (in units of g) and heart rate (bpm, measured from photoplethysmography) recorded from the Apple Watch, as well as labeled sleep scored from gold-standard polysomnography, from 31 subjects. | Download | Source | Type: TXT |

  13. NHL Tracking and Play by Play: The data represents all the official metrics measured for each game in the NHL between 2015-21. Information includes tracking, events, play-by-play, etc. | Download | Source | Type: FST |

  14. IPL Cricket Dataset: The folder contains ball-by-ball data for IPL matches in CSV format, covering 845 matches. An extra file called 'all_matches.csv' contains the combined information of all matches in a single file. | Download | Source | Type: CSV |

  15. Soccer StatsBomb Data: Includes event, lineup, and match data in JSON format for hundreds of matches from various leagues. | Download | Source | Type: JSON | Documentation | Terms |

  16. Soccer Bio-banding Data: This data was downloaded from Ally Hamilton's dissertation on the effect of bio-banding on the technical, tactical and physical demands of soccer-specific small-sided games. It contains athlete maturation and bio-banding categories as well as a number of tracking, technical and tactical variables. | Download | Source | Type: XLSX |

  17. Mid-Long Distance Running Injuries Dataset: Two files containing training logs (weekly and daily) along with injury records. Information includes 7 years' worth of data with more than 70 variables, including distances, intensities, perceived efforts, training quality, etc. | Download | Source | Type: CSV |

  18. eSports Dataset: Psycho-physiological data collected on 10 pro and amateur eSport athletes in 22 League of Legends matches. The dataset includes in-game logs and match info as well as monitoring data such as environmental data, IMU movements, EMG, GSR, HR, EEG, mouse/keyboard activity, face skin temperature, eye tracking, post-game surveys, etc., collected simultaneously for 5 players. | Download | Source | Research | Type: CSV/JSON |

  19. 24h Monitoring HRV, Sleep, Saliva: The Multilevel Monitoring of Activity and Sleep in Healthy people (MMASH) dataset provides 24 hours of continuous beat-to-beat heart data, triaxial accelerometer data, sleep quality, physical activity and psychological characteristics (i.e., anxiety status, stress events and emotions) for 22 healthy participants. Saliva bio-markers (i.e., cortisol and melatonin) and an activity log are also provided. | Download | Source | Research | Type: CSV |
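
Several of the datasets above are distributed as CSV or JSON. As a rough starting point, the sketch below loads either format into a list of dictionaries using only the Python standard library. The function name `load_records` is an assumption made for illustration and is not part of this repository. The other formats in the list (Feather, FST, XLSX, 7-zip) need third-party tools and are deliberately not handled here.

```python
import csv
import json
from pathlib import Path


def load_records(path):
    """Load a CSV or JSON file into a list of dictionaries.

    Hypothetical helper: dispatches on the file extension only.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        # Each row becomes a dict keyed by the header row; values stay strings.
        with path.open(newline="", encoding="utf-8") as handle:
            return list(csv.DictReader(handle))
    if suffix == ".json":
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Wrap a single JSON object so callers always receive a list.
        return data if isinstance(data, list) else [data]
    raise ValueError(f"Unsupported file type: {suffix or 'none'}")
```

For example, after downloading the IPL dataset, `load_records("all_matches.csv")` would return one dictionary per row, assuming the file sits in the working directory. CSV values are returned as strings, so numeric columns still need converting before analysis.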


How to contribute

Contributions to help grow this resource are more than welcome so others can benefit. Every contributor will be visible on this page.

Contributing guidelines:

  • Update the README file with a bullet point referring to your dataset, following the same format: title, brief explanation, download link, source link, file type.

  • If you have access to the raw dataset, upload it to the repo inside a folder. The folder name should briefly describe the data inside. Consider adding a document that briefly explains the metrics alongside the files if needed.

  • Use the source link on the README paragraph to credit the person who made the data available, or the original location where the dataset can be found. If this is your own dataset then credit yourself! The source link is important, so users know where to go to learn more about each specific dataset.
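
The entry format described in the guidelines above can be sketched as a small helper that assembles one list line from its five parts. This is only an illustration: `format_entry` is a hypothetical name, the Markdown link syntax is an assumption about how the README is written, and the URLs in the usage note are placeholders.

```python
def format_entry(number, title, description, download_url, source_url, file_type):
    """Build one README list entry: title, explanation, links and file type."""
    fields = {
        "title": title,
        "description": description,
        "download_url": download_url,
        "source_url": source_url,
        "file_type": file_type,
    }
    # Refuse incomplete entries so every dataset keeps its source credit.
    missing = [name for name, value in fields.items() if not str(value).strip()]
    if missing:
        raise ValueError("Missing fields: " + ", ".join(missing))
    return (
        f"{number}. {title.strip()}: {description.strip()} "
        f"| [Download]({download_url.strip()}) "
        f"| [Source]({source_url.strip()}) "
        f"| Type: {file_type.strip()} |"
    )
```

Calling `format_entry(20, "Example Dataset", "Short description.", "https://example.com/data.csv", "https://example.com", "CSV")` returns `20. Example Dataset: Short description. | [Download](https://example.com/data.csv) | [Source](https://example.com) | Type: CSV |`. Rejecting empty fields is a design choice that mirrors the guideline that the source link matters.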

Topics of interest include:

  • optical/sensor athlete tracking
  • athlete monitoring data
  • physical profiling
  • sport physiology data
  • injuries
  • schedule & travel metrics
  • sport biomechanics
  • video materials
  • etc.

Companies that provide data through technology are also welcome to upload sample datasets as a way to help sport scientists become more familiar with the data.

If you are not sure how to make a contribution, here is a tutorial that explains how to contribute to a GitHub project: Link

Thanks for your contribution!


Acknowledgment

Special thanks to the companies and individuals that made these datasets public. Please check the source link on each dataset to visit the original resource.


Disclaimer

The aim of this repository is to feature and provide direct access to datasets that are currently publicly available or that someone wishes to make available for others to use. We do not modify the datasets.
