All Projects → roclark → clarktech-ncaab-predictor

roclark / clarktech-ncaab-predictor

Licence: other
A machine learning project to predict NCAA Men's Basketball outcomes

Programming Languages

python
139335 projects - #7 most used programming language
Dockerfile
14818 projects

Projects that are alternatives of or similar to clarktech-ncaab-predictor

forestError
A Unified Framework for Random Forest Prediction Error Estimation
Stars: ✭ 23 (-4.17%)
Mutual labels:  prediction, randomforest
PlasFlow
Software for prediction of plasmid sequences in metagenomic assemblies
Stars: ✭ 74 (+208.33%)
Mutual labels:  prediction
Github-Stars-Predictor
It's a github repo star predictor that tries to predict the stars of any github repository having greater than 100 stars.
Stars: ✭ 34 (+41.67%)
Mutual labels:  prediction
R Unet
Video prediction using lstm and unet
Stars: ✭ 25 (+4.17%)
Mutual labels:  prediction
behaiv-java
User Behavior Prediction for everyone
Stars: ✭ 12 (-50%)
Mutual labels:  prediction
btcpredict
📈 Bitcoin bull run peak prediction project (price and date)
Stars: ✭ 36 (+50%)
Mutual labels:  prediction
PyDREAM
Python Implementation of Decay Replay Mining (DREAM)
Stars: ✭ 22 (-8.33%)
Mutual labels:  prediction
engine
Benefit from new browsers' technologies to speed up your site
Stars: ✭ 39 (+62.5%)
Mutual labels:  prediction
vlainic.github.io
My GitHub blog: things you might be interested, and probably not...
Stars: ✭ 26 (+8.33%)
Mutual labels:  prediction
infer
🔮 Use TensorFlow models in Go to evaluate Images (and more soon!)
Stars: ✭ 65 (+170.83%)
Mutual labels:  prediction
MSDS696-Masters-Final-Project
Earthquake Prediction Challenge with LightGBM and XGBoost
Stars: ✭ 58 (+141.67%)
Mutual labels:  prediction
Wharton Stat 422 722
The official class webpage for Statistics 422/722 taught at Wharton in the Spring of 2017
Stars: ✭ 14 (-41.67%)
Mutual labels:  prediction
Diebold-Mariano-Test
This Python function dm_test implements the Diebold-Mariano Test (1995) to statistically test forecast accuracy equivalence for 2 sets of predictions with modification suggested by Harvey et. al (1997).
Stars: ✭ 70 (+191.67%)
Mutual labels:  prediction
verif
Software for verifying weather forecasts
Stars: ✭ 70 (+191.67%)
Mutual labels:  prediction
Deep-learning-model-deploy-with-django
Serving a keras model (neural networks) in a website with the python Django-REST framework.
Stars: ✭ 76 (+216.67%)
Mutual labels:  prediction
sports.py
A simple Python package to gather live sports scores
Stars: ✭ 51 (+112.5%)
Mutual labels:  basketball
reinforcement learning course materials
Lecture notes, tutorial tasks including solutions as well as online videos for the reinforcement learning course hosted by Paderborn University
Stars: ✭ 765 (+3087.5%)
Mutual labels:  prediction
stock-forecast
Simple stock & cryptocurrency price forecasting console application, using PHP Machine Learning library (https://github.com/php-ai/php-ml)
Stars: ✭ 76 (+216.67%)
Mutual labels:  prediction
TensorFlow CNN
Example CNN on CIFAR-10 classification
Stars: ✭ 14 (-41.67%)
Mutual labels:  prediction
prediction-builder
A library for machine learning that builds predictions using a linear regression.
Stars: ✭ 107 (+345.83%)
Mutual labels:  prediction

NCAAB Basketball Predictor

Docker Pulls

This tool uses machine learning to predict the outcomes of NCAAB Men's Division-I Basketball games. Included are several algorithms which can forecast different events, such as a daily matchup simulator, conference tournament predictor, and a preview of the NCAA tournament field.

Setup

It is highly recommended to pull the latest Docker container from Docker Hub as this image contains a pre-populated dataset containing multiple years worth of data as well as optimizations to the data which cannot be reproduced retro-actively. A new image is pushed to the registry daily, so it is recommended to setup a workflow which scans for newer images prior to running one of the provided algorithms. To pull the latest image, first ensure Docker is installed on your system by following the documentation. Next, pull the latest image with:

docker pull roclark/clarktech-ncaab-predictor

This will download and extract the most recent image to your local machine which can be viewed with:

$ docker images
REPOSITORY                          TAG      IMAGE ID       CREATED       SIZE
roclark/clarktech-ncaab-predictor   latest   0cfaab9aa82a   4 hours ago   525MB

MongoDB

In addition to the pulling the predictor image from Docker Hub, it is recommended to use MongoDB as a database to save and retrieve results for future usage. While this isn't a strict requirement, many of the algorithms provide better handling and verbosity when saving results into a Mongo database. Luckily, if a Mongo database isn't already installed and configured on your system, it is straightforward to do so with a Docker container. Simply pull the latest image from Docker Hub, then run a container in detached mode so it will run persistently on the host:

docker pull mongo
docker run -it -d mongo

You now have a MongoDB instance running inside a container which can be accessed anywhere on the host using the default mongodb url.

If you choose to skip MongoDB, you will need to add --skip-save-to-mongodb to all commands while running the application (more on usage below).

Usage

Once setup is complete, the tool is now ready to be used to predict NCAAB outcomes. The general usage of the application with Docker is as follows:

docker run --rm -it roclark/clarktech-ncaab-predictor [options] algorithm [algorithm-specific options]

More information on the usage can be retrieved with the following:

docker run --rm -it roclark/clarktech-ncaab-predictor --help

Daily Simulator

The daily simulator is designed to simulate the outcome of all games scheduled for the current day. It is suggested to run this algorithm in the morning to retrieve a list of the scheduled games and determine which team is expected to win. Sample text output is as follows:

$ docker run --rm -it roclark/clarktech-ncaab-predictor daily-simulation
Army at (4) Duke  =>  (4) Duke
George Washington at (5) Virginia  =>  (5) Virginia
Florida Gulf Coast at (10) Michigan State  =>  (10) Michigan State

Additional information such as the predicted spread and further details on each team is included in the database.

Conference Simulator

The conference simulator will forecast the remaining schedule for a conference and, based on the existing conference standings, determine the final projected standings as well as the likelihood a particular team will earn their projected position and their overall probability that they will finish first. The algorithm also displays the projected number of games the team will win in the conference by the end of the season. This can be triggered as follows:

docker run --rm -it roclark/clarktech-ncaab-predictor monte-carlo-simulation

The output generated from this command is saved to a database which is required as a baseline for several algorithms listed below.

Conference Tournament Simulator

This simulator runs through each conference's post-season tournament and predicts the overall winner and the potential route each team takes to the finals. In order to generate the initial seeds, a forecast of the final conference standings needs to be run prior to this algorithm using the Conference Simulator above. Each conference has its own unique tournament format and is handled differently, as specified in the brackets library. Run this simulation with the following:

docker run --rm -it roclark/clarktech-ncaab-predictor conference-tourney-simulator

Prior to running the algorithm, ensure a simulation.json file has been generated using the Conference Simulator above.

Matchup

A matchup between two specific teams can be simulated with the matchup algorithm. This will run several games between the requested teams and determine the overall winner and the expected difference in score. Due to the difference between playing at home and on the road, the results could vary depending on which team is specified as the home team. For example, the following will test a matchup between Purdue and Indiana with Purdue designated as the home team:

docker run --rm -it roclark/clarktech-ncaab-predictor matchup purdue indiana

Power Rankings

Power rankings can be generated for all NCAA Men's Division-I basketball teams to determine the comparative performance relative to one another. This algorithm runs a home-and-home matchup between each team in the division and tallies the collective spread for each team. After all simulations are complete, the team with the highest positive spread will be the number one team overall with the team with the second highest spread being number two, and so on. This system works under the philosophy that the team which can beat the highest number of teams by the highest margin is the strongest team in the league. This does not look specifically at what a team has accomplished so far in the season, but instead how strong they are at this point in time. The rankings can be generated with the following:

docker run --rm -it roclark/clarktech-ncaab-predictor power-rankings

NCAA Field Filler

The NCAA Field Filler will populate the 68-team NCAA Tournament field based on both automatic and at-large bids. The automatic bids are identified by simulating every conference tournament and determining the winner. These winners will receive automatic bids to the tournament. The remaining spots will be awarded on a priority basis based on the power rankings. The rankings need to be generated prior to running this algorithm. Attached to each team is their expected seed. After generating power rankings using the command above, run this algorithm with the following:

docker run --rm -it roclark/clarktech-ncaab-predictor fill-ncaa-field

NCAA Tournament Simulator

Lastly, the NCAA Tournament Simulator runs a simulation of the NCAA tournament. This requires a CSV file of the expected teams and seeds in the tournament to be used as a baseline for the bracket. An example of this CSV file is provided in the repository. To simulate the tournament, run the following:

docker run --rm -it roclark/clarktech-ncaab-predictor tournament-simulator 2019-ncaa.csv

Other options

In addition to the algorithms listed above, some additional options are available.

Num Sims

Given the unpredictability of sports, especially with men's college basketball, some randomness is injected into the algorithms. The randomness is generated by applying a random variance within the league's standard deviant for every category for each team tested on a per-simulation basis. For example, in a single simulation, one team could have a +0.7 * STDEV improvement to their shooting percentage, and a -0.3 * STDEV punishment to their rebounds. As this is done on a per-simulation basis, it is recommended to increase the number of simulations run to improve the variance of data tested and get a more accurate view of the overall trend for each team instead of relying solely on a limited number of varied results. Please note that while increasing the number of simulations is recommended, every additional pass will increase the time to completion.

Skip Saving to MongoDB

By default, all results will be saved to a Mongo database at a specified URL. The results in the database provide additional context and can easily be archived for future use as needed. If desired, this can be avoided by requesting to skip saving to MongoDB, and results will be saved in the local directory as applicable.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].