
minerva-ml / Open Solution Home Credit

License: MIT
Open solution to the Home Credit Default Risk challenge 🏑


Projects that are alternatives of or similar to Open Solution Home Credit

Open Solution Value Prediction
Open solution to the Santander Value Prediction Challenge 🐠
Stars: ✭ 34 (-91.44%)
Mutual labels:  competition, open-source, xgboost, lightgbm, reproducibility
Mlbox
MLBox is a powerful Automated Machine Learning python library.
Stars: ✭ 1,199 (+202.02%)
Mutual labels:  kaggle, pipeline, xgboost, lightgbm
Open Solution Mapping Challenge
Open solution to the Mapping Challenge 🌎
Stars: ✭ 291 (-26.7%)
Mutual labels:  competition, kaggle, pipeline, lightgbm
Lightautoml
LAMA - automatic model creation framework
Stars: ✭ 196 (-50.63%)
Mutual labels:  kaggle, pipeline, feature-engineering
Kaggle-Competition-Sberbank
Top 1% rankings (22/3270) code sharing for Kaggle competition Sberbank Russian Housing Market: https://www.kaggle.com/c/sberbank-russian-housing-market
Stars: ✭ 31 (-92.19%)
Mutual labels:  kaggle, xgboost, lightgbm
Benchmarks
Comparison tools
Stars: ✭ 139 (-64.99%)
Mutual labels:  kaggle, xgboost, lightgbm
Open Solution Toxic Comments
Open solution to the Toxic Comment Classification Challenge
Stars: ✭ 154 (-61.21%)
Mutual labels:  competition, kaggle, pipeline
HumanOrRobot
a solution for competition of kaggle `Human or Robot`
Stars: ✭ 16 (-95.97%)
Mutual labels:  kaggle, xgboost, lightgbm
docker-kaggle-ko
A Docker image for machine learning/deep learning (PyTorch, TensorFlow). Adds Korean fonts, Korean NLP packages (konlpy), morphological analyzers, timezone settings, and more.
Stars: ✭ 46 (-88.41%)
Mutual labels:  kaggle, xgboost, lightgbm
MSDS696-Masters-Final-Project
Earthquake Prediction Challenge with LightGBM and XGBoost
Stars: ✭ 58 (-85.39%)
Mutual labels:  kaggle, xgboost, lightgbm
kaggle-berlin
Material of the Kaggle Berlin meetup group!
Stars: ✭ 36 (-90.93%)
Mutual labels:  kaggle, xgboost, feature-engineering
AutoTabular
Automatic machine learning for tabular data. ⚑πŸ”₯⚑
Stars: ✭ 51 (-87.15%)
Mutual labels:  xgboost, lightgbm, feature-engineering
Home Credit Default Risk
Default risk prediction for Home Credit competition - Fast, scalable and maintainable SQL-based feature engineering pipeline
Stars: ✭ 68 (-82.87%)
Mutual labels:  kaggle, xgboost, feature-engineering
Steppy
Lightweight, Python library for fast and reproducible experimentation πŸ”¬
Stars: ✭ 119 (-70.03%)
Mutual labels:  pipeline, open-source, reproducibility
Steppy Toolkit
Curated set of transformers that make your work with steppy faster and more effective πŸ”­
Stars: ✭ 21 (-94.71%)
Mutual labels:  pipeline, open-source, reproducibility
Auto ml
[UNMAINTAINED] Automated machine learning for analytics & production
Stars: ✭ 1,559 (+292.7%)
Mutual labels:  xgboost, feature-engineering, lightgbm
Hyperparameter hunter
Easy hyperparameter optimization and automatic result saving across machine learning algorithms and libraries
Stars: ✭ 648 (+63.22%)
Mutual labels:  xgboost, feature-engineering, lightgbm
Mljar Supervised
Automated Machine Learning Pipeline with Feature Engineering and Hyper-Parameters Tuning πŸš€
Stars: ✭ 961 (+142.07%)
Mutual labels:  xgboost, feature-engineering, lightgbm
fast retraining
Show how to perform fast retraining with LightGBM in different business cases
Stars: ✭ 56 (-85.89%)
Mutual labels:  kaggle, xgboost, lightgbm
Apartment-Interest-Prediction
Predict people interest in renting specific NYC apartments. The challenge combines structured data, geolocalization, time data, free text and images.
Stars: ✭ 17 (-95.72%)
Mutual labels:  kaggle, xgboost, lightgbm

Home Credit Default Risk: Open Solution

Join the chat at https://gitter.im/minerva-ml/open-solution-home-credit

This is an open solution to the Home Credit Default Risk challenge 🏑.

More competitions πŸŽ‡

Check our collection of public projects 🎁, where you can find multiple Kaggle competitions with code, experiments and outputs.

Our goals

We are building an entirely open solution to this competition. Specifically:

  1. Learning from the process - updates about new ideas, code and experiments are the best way to learn data science. Our activity is especially useful for people who want to enter the competition but lack the appropriate experience.
  2. Encouraging more Kagglers to start working on this competition.
  3. Delivering an open source solution with no strings attached. The code is available in our GitHub repository 💻. This solution should establish a solid benchmark, as well as provide a good base for your custom ideas and experiments. We care about clean code 😃
  4. Opening our experiments as well: everybody can have a live preview of our experiments, parameters, code, etc. Check: Home Credit Default Risk 📈 and the screens below.
Train and validation results on folds πŸ“Š LightGBM learning curves πŸ“Š

Disclaimer

In this open source solution you will find references to neptune.ml. It is a free platform for community users, which we use daily to keep track of our experiments. Please note that using neptune.ml is not necessary to proceed with this solution. You may run it as a plain Python script 🐍.

Note

As of 1.07.2019 we officially discontinued the neptune-cli client, making neptune-client the only supported way to communicate with Neptune. That means you should run experiments via the python ... command or update your loggers to neptune-client. For more information about the new client, go to the neptune-client read-the-docs page.

How to start?

Learn about our solutions

  1. Check Kaggle forum and participate in the discussions.
  2. Check our Wiki pages 🏑, where we document our work. See solutions below:
link to code   name                  CV      LB                        link to description
solution 1     chestnut 🌰           ?       0.742                     LightGBM and basic features
solution 2     seedling 🌱           ?       0.747                     Sklearn and XGBoost algorithms and groupby features
solution 3     blossom 🌼            0.7840  0.790                     LightGBM on selected features
solution 4     tulip 🌷              0.7905  0.801                     LightGBM with smarter features
solution 5     sunflower 🌻          0.7950  0.804                     LightGBM with clean dynamic features
solution 6     four leaf clover 🍀   0.7975  0.806 (priv. LB 0.79804)  Stacking by feature and model diversity
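Solution 6 relies on stacking by feature and model diversity. Below is a minimal sketch of the underlying idea - out-of-fold predictions from diverse base models become the features of a level-2 learner. The data and model choices (plain scikit-learn classifiers on synthetic data) are illustrative stand-ins, not the repository's actual pipeline:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import KFold

# Toy stand-in for the Home Credit application table.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

base_models = [RandomForestClassifier(n_estimators=50, random_state=0),
               GradientBoostingClassifier(n_estimators=50, random_state=0)]

# Out-of-fold predictions: each row is predicted only by models
# that never saw it during training, avoiding target leakage.
oof = np.zeros((len(y), len(base_models)))
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, valid_idx in kf.split(X):
    for j, model in enumerate(base_models):
        model.fit(X[train_idx], y[train_idx])
        oof[valid_idx, j] = model.predict_proba(X[valid_idx])[:, 1]

# Level-2 model stacks the out-of-fold predictions.
stacker = LogisticRegression().fit(oof, y)
stacked_auc = roc_auc_score(y, stacker.predict_proba(oof)[:, 1])
print(f"stacked AUC: {stacked_auc:.4f}")
```

The more diverse the base models (different algorithms, different feature sets), the more the level-2 model has to gain, which is what "feature diversity and model diversity" refers to.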

Start experimenting with ready-to-use code

You can jump-start your participation in the competition by using our starter pack. The installation instructions below will guide you through the setup.

Installation (fast track)

  1. Clone the repository and install requirements (use Python 3.5):
pip3 install -r requirements.txt
  2. Register at neptune.ml (if you wish to use it).
  3. Run the experiment based on LightGBM:

πŸ”±

neptune account login
neptune run --config configs/neptune.yaml main.py train_evaluate_predict_cv --pipeline_name lightGBM

🐍

python main.py -- train_evaluate_predict_cv --pipeline_name lightGBM

Installation (step by step)

Step by step installation πŸ–₯

Hyperparameter Tuning

Several options for hyperparameter tuning are available:

  1. Random Search

    configs/neptune.yaml

      hyperparameter_search__method: random
      hyperparameter_search__runs: 100
    

    src/pipeline_config.py

        'tuner': {'light_gbm': {'max_depth': ([2, 4, 6], "list"),
                                'num_leaves': ([2, 100], "choice"),
                                'min_child_samples': ([5, 10, 15, 25, 50], "list"),
                                'subsample': ([0.95, 1.0], "uniform"),
                                'colsample_bytree': ([0.3, 1.0], "uniform"),
                                'min_gain_to_split': ([0.0, 1.0], "uniform"),
                                'reg_lambda': ([1e-8, 1000.0], "log-uniform"),
                                },
                  }
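Each entry in the tuner spec pairs candidate values with a sampling method. A sketch of how such a spec might be sampled for hyperparameter_search__runs random draws - the exact semantics assumed here ("list" picks one element, "choice" draws an integer from a range, "uniform" and "log-uniform" draw floats) are an interpretation, not taken verbatim from the repository code:

```python
import math
import random

# Subset of the tuner spec from src/pipeline_config.py: (values, method) pairs.
TUNER_SPACE = {'max_depth': ([2, 4, 6], 'list'),
               'num_leaves': ([2, 100], 'choice'),
               'min_child_samples': ([5, 10, 15, 25, 50], 'list'),
               'subsample': ([0.95, 1.0], 'uniform'),
               'reg_lambda': ([1e-8, 1000.0], 'log-uniform')}

def sample_params(space, rng):
    """Draw one random hyperparameter configuration from the spec."""
    params = {}
    for name, (values, method) in space.items():
        if method == 'list':          # pick one of the listed values
            params[name] = rng.choice(values)
        elif method == 'choice':      # integer drawn from [low, high]
            params[name] = rng.randint(values[0], values[1])
        elif method == 'uniform':     # float drawn uniformly from [low, high]
            params[name] = rng.uniform(values[0], values[1])
        elif method == 'log-uniform': # uniform in log space, e.g. for reg_lambda
            params[name] = math.exp(rng.uniform(math.log(values[0]),
                                                math.log(values[1])))
    return params

rng = random.Random(0)
# hyperparameter_search__runs: 100 -> draw 100 candidate configurations.
candidates = [sample_params(TUNER_SPACE, rng) for _ in range(100)]
```

Log-uniform sampling matters for parameters like reg_lambda whose useful range spans many orders of magnitude; plain uniform sampling would almost never try small values.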
    

Get involved

You are welcome to contribute your code and ideas to this open solution. To get started:

  1. Check competition project on GitHub to see what we are working on right now.
  2. Express your interest in a particular task by writing a comment in that task, or by creating a new one with your fresh idea.
  3. We will get back to you quickly in order to start working together.
  4. Check CONTRIBUTING for some more information.

User support

There are several ways to seek help:

  1. Kaggle discussion is our primary way of communication.
  2. Read the project's Wiki, where we publish descriptions of the code, pipelines and supporting tools such as neptune.ml.
  3. Submit an issue directly in this repo.