
sfirke / Predicting March Madness

License: MIT
Machine learning tutorial to create an entry for the Kaggle March Mania contest

Programming Languages

R
7636 projects

Projects that are alternatives to or similar to Predicting March Madness

junior.guru
Learn to code and get your first job in tech 🐣
Stars: ✭ 27 (+3.85%)
Mutual labels:  introduction
edX-6.00.2x-Introduction-to-Computational-Thinking-and-Data-Science
MIT edX 6.00.2x Introduction to Computational Thinking and Data Science problem sets code
Stars: ✭ 62 (+138.46%)
Mutual labels:  introduction
React Native App Intro
react-native-app-intro is a React Native component implementing a parallax-effect welcome page based on react-native-swiper, similar to the intro screens in Google apps like Sheets, Drive, and Docs.
Stars: ✭ 3,169 (+12088.46%)
Mutual labels:  introduction
kedro-introduction-tutorial
It's the Complete Beginner's Guide to Kedro! See the video here: https://youtu.be/x97ChYDd12U
Stars: ✭ 19 (-26.92%)
Mutual labels:  introduction
directx12-seed
✖🌱 A DirectX 12 starter repo that you could use to get the ball rolling.
Stars: ✭ 58 (+123.08%)
Mutual labels:  introduction
Introduction screen
Easily add an introduction screen to your app to provide information to new users
Stars: ✭ 259 (+896.15%)
Mutual labels:  introduction
Android-Onboarder
Android Onboarder is a simple and lightweight library that helps you to create cool and beautiful introduction screens for your apps without writing dozens of lines of code.
Stars: ✭ 85 (+226.92%)
Mutual labels:  introduction
K8s Mastery
Repository for the article "Learn Kubernetes in Under 3 Hours"
Stars: ✭ 750 (+2784.62%)
Mutual labels:  introduction
docs
WayScript Documentation
Stars: ✭ 14 (-46.15%)
Mutual labels:  introduction
Docker Basiclearning
🐬 Understand Docker step by step. A tutorial repo for beginners 🔥
Stars: ✭ 296 (+1038.46%)
Mutual labels:  introduction
tlborm-chinese
Rust宏小册 (The Little Book of Rust Macros), the Chinese translation of tlborm.
Stars: ✭ 88 (+238.46%)
Mutual labels:  introduction
l2kurz
A short introduction to LaTeX, in German
Stars: ✭ 19 (-26.92%)
Mutual labels:  introduction
Androidonboarder
A simple way to make a beautiful onboarding experience (app intro or welcome screen) for your users.
Stars: ✭ 269 (+934.62%)
Mutual labels:  introduction
Spotlight
Introductory walkthrough framework for iOS Apps
Stars: ✭ 45 (+73.08%)
Mutual labels:  introduction
Pensepython2e
Portuguese translation of the book Think Python (2nd ed.), by Allen B. Downey
Stars: ✭ 557 (+2042.31%)
Mutual labels:  introduction
Introduction-to-LaTeX
Introductory notes and templates for LaTeX
Stars: ✭ 23 (-11.54%)
Mutual labels:  introduction
webgl-seed
🌐🌱 A starter repo for building WebGL applications.
Stars: ✭ 41 (+57.69%)
Mutual labels:  introduction
Course julia day
Notes for getting to know the Julia programming language in one day.
Stars: ✭ 23 (-11.54%)
Mutual labels:  introduction
React Native Onboarding Swiper
🛳 Delightful onboarding for your React-Native app
Stars: ✭ 596 (+2192.31%)
Mutual labels:  introduction
Showcaseview
🔦The ShowcaseView library is designed to highlight and showcase specific parts of apps to the user with an attractive and flat overlay.
Stars: ✭ 281 (+980.77%)
Mutual labels:  introduction

Predicting March Madness

Kaggle’s March Madness prediction competition is an accessible introduction to machine learning. If you happen to like college basketball, you’ll appreciate that in this competition you can’t bust your bracket, since you make a prediction for every possible game. Plus this year there’s a big prize pool, and luck plays a big enough role that you can be a legit contender fairly easily.

In 2016, my simple process using tidyverse functions in R placed in the top 10%. I refined it a bit for 2017 and finished in the top 25%.

I’m sharing my code and process here for others to use as a starting point. My approach is similar to that of the 2014 winners, Gregory Matthews and Michael Lopez. They published a paper about the role that luck plays in this competition, which puts their model in perspective. A takeaway: take my model, tweak it a bit to put some distance between you and the field, and you’re a genuine contender to win!

What’s here

In the Kaggle competition, you estimate how likely it is that Team A beats Team B, for each of the 2,278 possible matchups in the tournament. My guide documents a set of scripts for each of the following steps (illustrative code sketches for several of them appear after the list):

  • Deciding on possible input parameters
  • Scraping the input data with the rvest package
  • Cleaning and joining data sources to get tidy, prediction-ready data
  • Training and evaluating machine learning models on the data
  • Making and submitting predictions
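
For the scraping step, here is a minimal sketch of pulling an HTML stats table with rvest. The URL and the “first table on the page” selector are placeholders, not the repo’s actual sources; the scraping scripts document where the data really comes from.

    library(rvest)
    library(dplyr)

    # Placeholder URL -- the real scripts list the sites they pull from
    url <- "https://www.example.com/college-basketball/team-stats/2017"

    team_stats <- read_html(url) %>%
      html_node("table") %>%   # grab the first table on the page
      html_table() %>%
      as_tibble() %>%
      rename_with(tolower)     # lower-case column names to ease later joins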
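
For the cleaning and joining step, a sketch of attaching each team’s season statistics to every historical game, so that one row holds both teams’ features plus the outcome. All column names here (season, team1, team2, team_id, the score columns) are illustrative, not the repo’s actual schema.

    library(dplyr)

    # One row per historical game, with both teams' season stats joined on
    train <- games %>%
      left_join(team_stats, by = c("season", "team1" = "team_id")) %>%
      left_join(team_stats, by = c("season", "team2" = "team_id"),
                suffix = c("_team1", "_team2")) %>%
      mutate(team1_won = as.integer(team1_score > team2_score))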
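
For the modeling step, a baseline sketch: fit a logistic regression and score it on held-out games with log loss, the competition’s evaluation metric. The feature names and the holdout data frame are assumptions; the guide compares several model types.

    # Logistic regression on illustrative features
    model <- glm(team1_won ~ efficiency_team1 + efficiency_team2,
                 data = train, family = binomial)

    # Predicted win probabilities on held-out games, scored with log loss
    probs  <- predict(model, newdata = holdout, type = "response")
    actual <- holdout$team1_won
    log_loss <- -mean(actual * log(probs) + (1 - actual) * log(1 - probs))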
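
For the prediction and submission step, a sketch of filling in Kaggle’s sample submission, which lists every possible matchup as an ID of the form Season_Team1ID_Team2ID next to a predicted win probability. The column names and helper data frames are illustrative.

    library(dplyr)
    library(tidyr)
    library(readr)

    # Rebuild season and team IDs from the matchup ID, then join on features
    matchups <- sample_submission %>%
      separate(ID, into = c("season", "team1", "team2"),
               sep = "_", remove = FALSE, convert = TRUE) %>%
      left_join(team_stats, by = c("season", "team1" = "team_id")) %>%
      left_join(team_stats, by = c("season", "team2" = "team_id"),
                suffix = c("_team1", "_team2"))

    matchups$Pred <- predict(model, newdata = matchups, type = "response")

    write_csv(select(matchups, ID, Pred), "submission.csv")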

Licensing/usage

This code is public; please reuse it. It’s under an MIT license. Please acknowledge its role in any write-up or discussion of work that relies on it. And if you win a cash prize from Kaggle using this, congratulations! I wouldn’t turn down a thank-you gift ;)

Thanks

Thanks to contributors @MHenderson and @BillPetti.

Contact me

Let me know what you think, either on Twitter (@samfirke) or by friendly e-mail to samuel.firke AT gmail
