kaggle-berlin

Material of the Kaggle Berlin meetup group!

Collection of Sources

If you want a comprehensive introduction to the field, you can find decent advice [here]. Note that this is a guide for AI safety, yet the areas it outlines, with books and sources, are fairly decent.

Here is a small but growing collection of sources that we have been discussing at our hack sessions.

Star ratings are subject to discussion in the Kaggle group.

Tutorials

[0] Nicolas P. Rougier, Python & Numpy [link] (Outstanding Numpy introduction for scientists and optimizers)
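A tiny taste of the vectorization style the tutorial teaches (toy data, my own example, not from the tutorial): computing all pairwise Euclidean distances with broadcasting instead of Python loops.

```python
import numpy as np

# Vectorized pairwise Euclidean distances via broadcasting, no loops.
points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# (n, 1, 2) - (1, n, 2) broadcasts to an (n, n, 2) array of differences.
diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

print(dist[0, 1])  # distance between (0, 0) and (3, 4)
```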

[1] Sebastian Ruder, gradient descent methods [link] (If you are wondering what stochastic gradient descent, Nesterov momentum, Adam, ... are all about.)
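The methods in Ruder's overview differ mainly in how the raw gradient is turned into a parameter update. A minimal sketch of one of them, classical momentum, on the toy problem f(x) = (x - 3)^2 (my own example, not from the article):

```python
# Toy example: minimize f(x) = (x - 3)^2 with momentum gradient descent.
# The gradient is grad(x) = 2 * (x - 3).

def grad(x):
    return 2.0 * (x - 3.0)

x, velocity = 0.0, 0.0
lr, momentum = 0.1, 0.9

for _ in range(200):
    # Classical momentum: accumulate a decaying average of past gradients.
    velocity = momentum * velocity - lr * grad(x)
    x = x + velocity

print(x)  # converges close to the minimum at 3.0
```

Swapping this update rule for Adagrad, RMSprop, or Adam is exactly what the article walks through.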

[2] Scikit-learn documentation [link] (Absolutely great read to start learning about specific topics. Tons of superb example code. When I am bored I spend time here!)

[3] Donne Martin, "Data Science iPython notebooks." [Github repository] (Some useful examples to learn from.)

Toying and Fun (but still learning)

[0] Andrej Karpathy: CNNs in the browser [link] (Great for gaining some intuition.)

[1] Loss Function tumblr [link] (If you do not already suffer from PTSD from neural network training ;))

[2] Tensorflow in the browser [link] (Start with this when you learn about NNs!)

[3] Narayanan, Arvind; Shmatikov, Vitaly: Robust De-anonymization of Large Sparse Datasets [paper] (Ridiculous example of de-anonymization - this should make you very afraid! Anonymous identities in the Netflix challenge data set are discovered via publicly available data on IMDb.)

[4] IBM personality insights [link] (Maps text to big five personality traits with Twitter or free text input. Supports English, Spanish, Arabic, and Japanese.)

[5] Visualizing the DBSCAN algorithm [link] (Underrated clustering algorithm; K-means and DBSCAN are the two bread-and-butter clustering algorithms worth knowing.)
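The core idea the visualization demonstrates fits in a few lines: points with enough neighbours within `eps` are core points, core points grow clusters, and everything unreachable is noise. A brute-force toy sketch on 1-D data (not a production implementation):

```python
# Minimal DBSCAN sketch with brute-force neighbour search on 1-D points.
# Output labels: cluster ids 0, 1, ...; -1 marks noise.

def dbscan(points, eps, min_pts):
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j, q in enumerate(points) if abs(points[i] - q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1            # provisionally noise
            continue
        cluster += 1                  # i is a core point: start a cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster   # noise turns out to be a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = neighbours(j)
            if len(more) >= min_pts:  # j is also a core point: keep expanding
                queue.extend(more)
    return labels

data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 9.9]
print(dbscan(data, eps=0.3, min_pts=2))  # two clusters plus one noise point
```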

Practical Tips

[0] Aarshay Jain: Complete Guide to Parameter Tuning in XGBoost (with codes in Python) [link] (XGBoost has won many Kaggle competitions and belongs to the family of gradient-boosted tree models.)
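Jain's guide covers XGBoost's specific knobs, but the two every tuning guide starts from, learning rate and number of trees, can be illustrated without the library: gradient boosting for squared error fits each new weak learner (here a decision stump) to the residuals of the ensemble so far, scaled by the learning rate. A toy sketch, not XGBoost's API:

```python
# Toy gradient boosting for regression with decision stumps.
# learning_rate and n_estimators trade off against each other:
# smaller steps usually need more trees.

def fit_stump(xs, residuals):
    # Best threshold split minimizing squared error of two constant fits.
    best = None
    for t in xs:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x, t=t, lm=lmean, rm=rmean: lm if x <= t else rm

def boost(xs, ys, n_estimators=200, learning_rate=0.3):
    stumps = []
    def predict(x):
        return sum(learning_rate * s(x) for s in stumps)
    for _ in range(n_estimators):
        # Each new stump is fit to what the ensemble still gets wrong.
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict

xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 1, 1, 3, 3]
model = boost(xs, ys)
print([round(model(x), 2) for x in xs])  # close to ys
```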

[1] HJ van Veen: Feature Engineering [slideshare] (Read this to understand basics of preprocessing and feature engineering!)
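Two of the bread-and-butter steps such slides cover, sketched in plain Python on made-up data: one-hot encoding a categorical column and standardizing a numeric one.

```python
# Toy feature engineering: one-hot encoding and standardization.
rows = [
    {"city": "Berlin", "rooms": 1},
    {"city": "Hamburg", "rooms": 3},
    {"city": "Berlin", "rooms": 2},
]

# One-hot encoding: one binary column per observed category.
cities = sorted({r["city"] for r in rows})
onehot = [[1 if r["city"] == c else 0 for c in cities] for r in rows]

# Standardization: subtract the mean, divide by the (population) std dev.
vals = [r["rooms"] for r in rows]
mean = sum(vals) / len(vals)
std = (sum((v - mean) ** 2 for v in vals) / len(vals)) ** 0.5
scaled = [(v - mean) / std for v in vals]

print(onehot)
print([round(s, 3) for s in scaled])
```

In practice you would use pandas or scikit-learn for this, but the arithmetic is exactly the above.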

[2] hat y: Kaggle Ensembling Guide [link] (You must learn how to combine several submission files and stack several models together if you want to score highly in contests.)
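The simplest technique from the ensembling guide is just averaging the predicted probabilities from several submission files; even this plain mean often beats each individual model. Toy numbers, my own example:

```python
# Blend three models' submissions by averaging their probabilities.
model_a = [0.90, 0.20, 0.60]   # one predicted probability per test row
model_b = [0.80, 0.30, 0.40]
model_c = [0.85, 0.10, 0.50]

blend = [sum(p) / 3 for p in zip(model_a, model_b, model_c)]
print([round(p, 3) for p in blend])
```

Rank averaging and stacking, which the guide also covers, build on the same idea with the raw probabilities replaced by ranks or fed into a second-level model.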

[3] Megan Risdal: Communicating Data Science [kaggle blog] (Communicating your results is one of the major skills you have to learn - and you can practice it in our group! A good summary of communication, presentation, and visualization.)

[4] Tim Dettmers: Which GPU(s) to Get for Deep Learning [article] (Excellent guide on how to build your GPU machine, what to look for, and why the cloud is too expensive.)

Books

[0] Bengio, Yoshua, Ian J. Goodfellow, and Aaron Courville. "Deep Learning." An MIT Press book. (2015). [pdf] (Good, modern theory book to get started; then move on to papers.)

[1] Murphy, Kevin. "Machine Learning." An MIT Press book. (2012). [link] (Not a good starter book; comprehensive and mathematics-heavy. I use it as a reference manual.)

[2] Bishop, Christopher. "Pattern Recognition and Machine Learning." Springer. (2008). [link] (Written like a typical CS textbook; a bit dated, but a solid introduction.)

[3] Abu-Mostafa, Yaser. "Learning From Data." AMLBook. (2012). [class site] (If you only have two months to learn ML; also has an accompanying class at Caltech.)
