
Santander Value Prediction Challenge: Open Solution

Join the chat at https://gitter.im/minerva-ml/open-solution-value-prediction

This is an open solution to the Santander Value Prediction Challenge 😃

More competitions 🎇

Check our collection of public projects 🎁, where you can find multiple Kaggle competitions with code, experiments and outputs.

Our goals

We are building an entirely open solution to this competition. Specifically:

  1. Learning from the process - updates about new ideas, code and experiments are the best way to learn data science. Our activity is especially useful for people who want to enter the competition but lack the appropriate experience.
  2. Encourage more Kagglers to start working on this competition.
  3. Deliver an open source solution with no strings attached. Code is available on our GitHub repository 💻. This solution should establish a solid benchmark, as well as provide a good base for your custom ideas and experiments. We care about clean code 😃
  4. We are opening our experiments as well: everybody can have a live preview of our experiments, parameters, code, etc. Check: Santander-Value-Prediction-Challenge 📈 and the screens below.
(Screenshots: LightGBM train and validation performance on folds 📊, and LightGBM experiment logged values 📊.)

Disclaimer

In this open source solution you will find references to neptune.ml. It is a free platform for community users, which we use daily to keep track of our experiments. Please note that using neptune.ml is not necessary to proceed with this solution. You may run it as a plain Python script 😉.

How to start?

Learn more about our solutions

  1. Check the Kaggle discussion for the most recent updates and comments.
  2. Read the Wiki pages, where we describe the solutions in more detail. Click on the tropical fish to get started 🐠 or pick a solution from the table below.
| link to code | name | CV | LB | link to the description |
| --- | --- | --- | --- | --- |
| solution 1 | honey bee 🐝 | 1.39 | 1.43 | LightGBM and 5-fold CV |
| solution 2 | beetle 🐞 | 1.60 | 1.77 | LightGBM on binarized dataset |
| solution 3 | dromedary camel 🐪 | 1.35 | 1.41 | LightGBM with row aggregations |
| solution 4 | whale 🐳 | 1.3416 | 1.41 | LightGBM on dimension-reduced dataset |
| solution 5 | water buffalo 🐃 | 1.336 | 1.39 | Exploring various dimension reduction techniques |
| solution 6 | blowfish 🐡 | 1.333 | 1.38 | Bucketing row aggregations |
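Several of the solutions above (3 and 6 in particular) rely on row-wise aggregation features over the mostly sparse feature matrix, scored with RMSLE, the competition metric. A minimal sketch of such features in plain pandas/numpy, with hypothetical column names - this is not the repository's actual feature code:

```python
import numpy as np
import pandas as pd

def row_aggregations(df: pd.DataFrame) -> pd.DataFrame:
    """Simple row-wise aggregates over the (mostly sparse) feature columns."""
    nonzero = df.where(df != 0)  # zeros -> NaN, so stats run over non-zero entries only
    return pd.DataFrame({
        "row_sum": df.sum(axis=1),
        "row_mean_nonzero": nonzero.mean(axis=1),
        "row_nonzero_count": (df != 0).sum(axis=1),
        "row_max": df.max(axis=1),
    })

def rmsle(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root mean squared logarithmic error - the competition metric."""
    return float(np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2)))

# Toy example with two rows of sparse features
X = pd.DataFrame({"f0": [0.0, 3.0], "f1": [2.0, 0.0], "f2": [4.0, 0.0]})
features = row_aggregations(X)
```

Features like these are then appended to (or substituted for) the raw columns before training LightGBM.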

Start experimenting with ready-to-use code

You can jump-start your participation in the competition by using our starter pack. The installation instructions below will guide you through the setup.

Installation (fast track)

  1. Clone the repository and install requirements (check requirements.txt)
  2. Register at neptune.ml (if you wish to use it)
  3. Run the experiment:

🔱

neptune run --config neptune_random_search.yaml main.py train_evaluate_predict --pipeline_name SOME_NAME

🐍

python main.py -- train_evaluate_predict --pipeline_name SOME_NAME

Installation (step by step)

  1. Clone this repository
git clone https://github.com/minerva-ml/open-solution-value-prediction.git
  2. Install requirements in your Python3 environment
pip3 install -r requirements.txt
  3. Register at neptune.ml (if you wish to use it)
  4. Update data directories in the neptune.yaml configuration file
  5. Run the experiment:

🔱

neptune login
neptune run --config neptune_random_search.yaml main.py train_evaluate_predict --pipeline_name SOME_NAME

🐍

python main.py -- train_evaluate_predict --pipeline_name SOME_NAME
  6. Collect the submission from the experiment_directory specified in neptune.yaml
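In spirit, the train_evaluate_predict step performs out-of-fold training: fit one model per fold, score each held-out fold, and average the per-fold test predictions (the 5-fold CV of solution 1). The sketch below is a self-contained stand-in, not the repository's actual pipeline (which is built with steppy and LightGBM); it uses scikit-learn's GradientBoostingRegressor instead:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold

def train_evaluate_predict(X, y, X_test, n_splits=5, seed=1234):
    """Out-of-fold training: one model per fold, RMSLE on held-out folds,
    averaged test predictions."""
    oof = np.zeros(len(y))
    test_pred = np.zeros(len(X_test))
    for train_idx, valid_idx in KFold(n_splits=n_splits, shuffle=True,
                                      random_state=seed).split(X):
        model = GradientBoostingRegressor(random_state=seed)
        model.fit(X[train_idx], y[train_idx])
        oof[valid_idx] = model.predict(X[valid_idx])
        test_pred += model.predict(X_test) / n_splits
    # RMSLE on the out-of-fold predictions (clip guards against negatives)
    cv_score = np.sqrt(np.mean((np.log1p(np.clip(oof, 0, None)) - np.log1p(y)) ** 2))
    return cv_score, test_pred

# Toy data standing in for the real train/test tables
rng = np.random.default_rng(0)
X = rng.random((100, 5))
y = X.sum(axis=1) * 10
X_test = rng.random((10, 5))
score, preds = train_evaluate_predict(X, y, X_test)
```

The averaged test_pred is what ends up in the submission file collected from the experiment directory.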

Get involved

You are welcome to contribute your code and ideas to this open solution. To get started:

  1. Check the competition project on GitHub to see what we are working on right now.
  2. Express your interest in a particular task by writing a comment in that task, or by creating a new one with your fresh idea.
  3. We will get back to you quickly so we can start working together.
  4. Check CONTRIBUTING for more information.

User support

There are several ways to seek help:

  1. The Kaggle discussion is our primary way of communication.
  2. Read the project's Wiki, where we publish descriptions of the code, pipelines and supporting tools such as neptune.ml.
  3. Submit an issue directly in this repo.