yzkang / My Data Competition Experience

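A summary of what I learned from multiple Top 5 finishes in machine learning and big data competitions: practical, hands-on material, free for the taking.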

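Lessons from Data Science Competitions

How do you analyze the data?
How do you clean the data?
How do you do feature engineering? (A systematic analysis method for feature engineering on relational data)
How do you do feature selection?
How do you choose a suitable machine learning model?
How do you tune hyperparameters?
How do you ensemble models?
How do you climb the leaderboard?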

The text version is now freely available. Please read it on Zhihu: https://zhuanlan.zhihu.com/p/149769029

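A text-only PDF version is finished and has been uploaded, together with the slide (PPT) version, to my Zhishi Xingqiu (知识星球) group. If you want to ask about competition experience, improve your score quickly, or compete for prize money, you are welcome to talk with me at 大卫的小屋 (David's Cabin): https://t.zsxq.com/IMfe2vB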

Contact

E-mail: [email protected]

Zhishi Xingqiu (知识星球): https://t.zsxq.com/IMfe2vB

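Appendix

Summary of my competition results

| Year | Platform | Organizer | Competition | Result | Rank |
| --- | --- | --- | --- | --- | --- |
| 2017 | Kesci (科赛网) | Ping An (中国平安) | Qianhai Credit "Haoxin Cup" Transfer Learning Algorithm Competition | 6th place | 6/600 |
| 2017 | Tianchi (天池大数据众智平台) | Alibaba Cloud | 2nd Cloud Security Algorithm Challenge | 16th place | 16/959 |
| 2018 | Agricultural Bank of China | Agricultural Bank of China Software Development Center | 1st "Athena Cup" Analytics and Mining Competition | 2nd place | 2/581 |
| 2018 | Mashang Finance (马上金融) AI competition platform | Mashang Finance | AI Global Challenger Competition: Default Risk Identification | 4th place | 4/107 |
| 2018 | Tianchi | Alibaba Cloud | "Qianlima" Big Data Competition: Risk Identification Algorithm | 5th place | 5/245 |
| 2018 | Ant Financial fintech platform | Ant Financial | Ant Developer Competition: Payment Risk Identification | 9th place | 9/2986 |
| 2018 | Tianchi | Alibaba | IJCAI2018: Alimama International Advertising Algorithm Competition | Top 2% | |
| 2018 | DataFountain | Ping An | Property Insurance Data Modeling Competition: Predicting Driving Risk from Driving Behavior | Top 2% | |
| 2018 | Kaggle | Two Sigma | Two Sigma Investment Financial Modeling Challenge | Top 3% | |
| 2019 | Tianchi | Tianjin Jinnan District Government | Jinnan Digital Manufacturing Algorithm Challenge | 2nd place | 2/2682 |
| 2019 | Agricultural Bank of China | Agricultural Bank of China Software Development Center | 2nd "Athena Cup" Analytics and Mining Competition | 4th place | 4/361 |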

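Competitions where I did not place because of scheduling conflicts

| Year | Platform | Organizer | Competition | Result |
| --- | --- | --- | --- | --- |
| 2016 | Didi AI competition platform | Didi Chuxing | 1st Global DI-Tech Algorithm Competition | |
| 2016 | Rong360 in-house platform | Rong360 (融360) | "Tianji" Financial Risk Control Big Data Competition | |
| 2017 | DataCastle | Rong360 | "Smart China Cup": User Loan Risk Prediction | Top 10% |
| 2017 | Kaggle | Sberbank | Sberbank Russian Housing Market | |
| 2017 | Tianchi (天池大数据众智平台) | AutoNavi (高德) | KDD CUP Highway Tollgates Traffic Flow Prediction | |
| 2018 | JD Zhihui platform (京东智汇平台) | JD (京东) | JData Global Operations Research Optimization Competition | |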