DeepWisdom / Autodl

Licence: apache-2.0
Automated Deep Learning without ANY human intervention. 1st-place solution for the AutoDL challenge.



Champion solution of the AutoDL challenge. For competition details, see AutoDL Competition.

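1. What is AutoDL?

AutoDL focuses on a general-purpose algorithm that automatically performs multi-label classification on any modality (image, video, speech, text, tabular data). A single standard pipeline can solve complex real-world classification problems and takes care of data, feature, model, and hyperparameter tuning, producing a well-performing classifier in as little as 10 seconds. This project achieved excellent results on 24 offline datasets and 15 online datasets from different domains. AutoDL has the following features:

Fully automatic: a fully automated deep learning / machine learning framework that needs no human intervention at any stage. The details of data, features, and models have already been tuned, with a unified treatment of resource constraints, skewed data, small datasets, feature engineering, model selection, network architecture optimization, and hyperparameter search. Just prepare your data, start AutoDL, and have a cup of coffee.

🌌 General: supports any modality, including image, video, audio, text, and structured tabular data, and any classification problem, including binary, multi-class, and multi-label classification. It has achieved excellent results in many domains, such as pedestrian recognition, pedestrian action recognition, face recognition, voiceprint recognition, music classification, accent classification, language classification, sentiment classification, email classification, news classification, advertising optimization, recommender systems, search engines, and precision marketing.

👍 Strong results: the champion solution of the AutoDL competition, won by a decisive margin, with support for both traditional machine learning models and recent deep learning models. The model library includes champion models selected from LR/SVM/LGB/CGB/XGB through ResNet*/MC3/DNN/ThinResnet*/TextCNN/RCNN/GRU/BERT.

Fast / real-time: competitive model performance in as little as ten seconds. Results refresh in real time (within seconds), so feedback on model performance is available without waiting.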
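
Binary, multi-class, and multi-label problems can all be written as the same kind of multi-hot label matrix, which is one way a single pipeline can treat them uniformly. The following is a minimal sketch of that encoding. The `to_multi_hot` helper and the example labels are hypothetical illustrations and are not taken from the AutoDL code base.

```python
from typing import Iterable, List


def to_multi_hot(label_sets: Iterable[Iterable[int]], num_classes: int) -> List[List[int]]:
    """Encode each sample's set of class indices as a 0/1 vector of length num_classes."""
    rows = []
    for labels in label_sets:
        row = [0] * num_classes
        for idx in labels:
            if not 0 <= idx < num_classes:
                raise ValueError(f"class index {idx} out of range")
            row[idx] = 1
        rows.append(row)
    return rows


# Binary: two classes, exactly one active per sample.
binary = to_multi_hot([[0], [1]], num_classes=2)
# Multi-class: one active class out of several.
multiclass = to_multi_hot([[2], [0]], num_classes=3)
# Multi-label: any number of active classes per sample (including none).
multilabel = to_multi_hot([[0, 2], []], num_classes=3)
```

With this representation the three problem types differ only in how many entries per row are set, not in the shape of the data.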

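2. Table of contents

3. Results

  • Preliminary leaderboard (DeepWisdom ranked first overall, with an average rank of 1.2 and first place on 4 of 5 datasets)

  • Final leaderboard (DeepWisdom ranked first overall, with an average rank of 1.8 and first place on 7 of 10 datasets)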

4. Usage instructions for the AutoDL competition

  1. Basic environment

    python>=3.5
    CUDA 10
    cuDNN 7.5
    
  2. Clone the repository

    cd <path_to_your_directory>
    git clone https://github.com/DeepWisdom/AutoDL.git
    
  3. Prepare the pretrained model: download the model speech_model.h5 and place it in the AutoDL_sample_code_submission/at_speech/pretrained_models/ directory.

  4. Optional: use the same Docker environment as the competition

    • CPU
    cd path/to/autodl/
    docker run -it -v "$(pwd):/app/codalab" -p 8888:8888 evariste/autodl:cpu-latest
    
    • GPU
    nvidia-docker run -it -v "$(pwd):/app/codalab" -p 8888:8888 evariste/autodl:gpu-latest
    
  5. Prepare datasets: use the sample datasets in AutoDL_sample_data, or batch-download the public competition datasets.

  6. Run a local test

    python run_local_test.py
    

Full usage of the local test:

    python run_local_test.py -dataset_dir='AutoDL_sample_data/miniciao' -code_dir='AutoDL_sample_code_submission'

You can view an HTML page with live learning-curve feedback in the AutoDL_scoring_output/ directory.

For details, see the AutoDL Challenge official starting_kit.
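
When the local test has to be repeated over several datasets, it can help to script the call. The sketch below only assembles the command shown above, using the `-dataset_dir` and `-code_dir` flags from this README. The helper names `build_local_test_command` and `run_local_test` are hypothetical, and the sketch assumes it is run from a checkout that contains run_local_test.py.

```python
import subprocess
import sys
from typing import List


def build_local_test_command(dataset_dir: str, code_dir: str) -> List[str]:
    """Return the argument list for run_local_test.py with the two documented flags."""
    return [
        sys.executable,
        "run_local_test.py",
        f"-dataset_dir={dataset_dir}",
        f"-code_dir={code_dir}",
    ]


def run_local_test(dataset_dir: str, code_dir: str, repo_dir: str = ".") -> int:
    """Run the local test from repo_dir and return the process exit code."""
    cmd = build_local_test_command(dataset_dir, code_dir)
    return subprocess.run(cmd, cwd=repo_dir).returncode


cmd = build_local_test_command(
    "AutoDL_sample_data/miniciao", "AutoDL_sample_code_submission"
)
print(" ".join(cmd[1:]))
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.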

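4.1. Example results (x-axis: time on a logarithmic scale; y-axis: AUC)

Across five datasets of different modalities, the AutoDL pipeline achieves excellent any-time performance and reaches high accuracy in a very short time.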
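
Because the curves are plotted against logarithmic time, early gains take up far more of the horizontal axis than late ones. One way to summarize such a curve as a single number is the area under the learning curve on a normalized log-time axis, sketched below. The time transform, the `time_budget` of 1200 seconds, and the `t0` of 60 seconds are assumptions chosen for illustration and are not values stated in this README.

```python
import math
from typing import Sequence


def scaled_time(t: float, time_budget: float = 1200.0, t0: float = 60.0) -> float:
    """Map t in [0, time_budget] to [0, 1] on a logarithmic scale."""
    return math.log(1 + t / t0) / math.log(1 + time_budget / t0)


def area_under_learning_curve(
    timestamps: Sequence[float],
    scores: Sequence[float],
    time_budget: float = 1200.0,
    t0: float = 60.0,
) -> float:
    """Integrate a step-shaped learning curve over scaled time.

    Each score is held from its timestamp until the next prediction
    (or the end of the budget). Before the first prediction the score is 0.
    """
    points = sorted(
        (t, s) for t, s in zip(timestamps, scores) if 0 <= t <= time_budget
    )
    area = 0.0
    for i, (t, score) in enumerate(points):
        start = scaled_time(t, time_budget, t0)
        if i + 1 < len(points):
            end = scaled_time(points[i + 1][0], time_budget, t0)
        else:
            end = 1.0
        area += score * (end - start)
    return area


# The same scores earn more area when they are reached earlier.
early = area_under_learning_curve([10, 100], [0.8, 0.9])
late = area_under_learning_curve([600, 900], [0.8, 0.9])
print(early > late)
```

Under this kind of metric, a model that reaches a good score within seconds is rewarded more than one that reaches a slightly better score only near the end of the budget.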

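5. Installation

This repository is tested on Python 3.6+, PyTorch 1.3.1, and TensorFlow 1.15.

You should install autodl in a virtual environment. If you are unfamiliar with virtual environments, see the user guide.

Create a virtual environment with a suitable Python version and activate it.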
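
Before installing, it can be useful to confirm that the active interpreter is recent enough and that a virtual environment is actually active. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and the (3, 6) minimum simply mirrors the tested Python version mentioned above.

```python
import sys


def python_is_supported(min_version=(3, 6), version_info=None) -> bool:
    """True if the interpreter's (major, minor) version is at least min_version."""
    current = tuple((version_info or sys.version_info)[:2])
    return current >= tuple(min_version)


def in_virtual_environment() -> bool:
    """True inside a venv or virtualenv, where sys.prefix differs from the base prefix."""
    base = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base or hasattr(sys, "real_prefix")


if __name__ == "__main__":
    print("Python version OK:", python_is_supported())
    print("Virtual environment active:", in_virtual_environment())
```

Note that this check recognizes venv and virtualenv environments. A conda environment typically reports identical prefixes, so it needs a different check.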

5.1 Installation on Windows 10

5.1.1 Install CUDA 10.0 and cuDNN v7.6.2.24

5.1.2 Install Miniconda3-4.5.4-Windows-x86_64.exe

5.1.3 Install visualcppbuildtools_full.exe

5.1.4 Create the start_env.bat file

  • Move it to the directory that contains the Miniconda3 installation folder
cmd.exe "/K" .\Miniconda3\Scripts\activate.bat .\Miniconda3

5.1.5 Double-click start_env.bat and install autodl-gpu

conda install pytorch==1.3.1
conda install torchvision -c pytorch
pip install autodl-gpu

5.2 Installation on Linux

pip install autodl-gpu
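
After installation, the installed package versions can be compared against the versions this repository was tested with. The sketch below is one way to do that with `importlib.metadata` (Python 3.8+). The distribution names `torch` and `tensorflow` are assumptions about how the dependencies are published (a GPU build may be installed under a different name), and the helper names are hypothetical.

```python
try:
    from importlib import metadata  # available from Python 3.8
except ImportError:  # older interpreters: no version lookup in this sketch
    metadata = None

# Versions named in this README as the tested configuration.
TESTED_VERSIONS = {"torch": "1.3.1", "tensorflow": "1.15"}


def installed_version(distribution: str):
    """Return the installed version string, or None if it cannot be determined."""
    if metadata is None:
        return None
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def check_environment(tested=TESTED_VERSIONS):
    """Map each distribution to (installed version, matches tested version prefix)."""
    report = {}
    for name, wanted in tested.items():
        found = installed_version(name)
        report[name] = (found, found is not None and found.startswith(wanted))
    return report


if __name__ == "__main__":
    print("autodl-gpu:", installed_version("autodl-gpu"))
    for name, (found, ok) in check_environment().items():
        print(f"{name}: installed={found} matches_tested={ok}")
```

A mismatch here does not necessarily mean the package will fail. It only flags a configuration that differs from the tested one.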

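6. Quick start

6.1. Quick start: local AutoDL performance test

For a guide, see "Quick start: local AutoDL performance test"; for sample code, see examples/run_local_test.py.

6.2. Quick start: image classification

For a guide, see "Quick start: image classification"; for sample code, see examples/run_image_classification_example.py.

6.3. Quick start: video classification

For a guide, see "Quick start: video classification"; for sample code, see examples/run_video_classification_example.py.

6.4. Quick start: audio classification

For a guide, see "Quick start: audio classification"; for sample code, see examples/run_speech_classification_example.py.

6.5. Quick start: text classification

For a guide, see "Quick start: text classification"; for sample code, see examples/run_text_classification_example.py.

6.6. Quick start: tabular classification

For a guide, see "Quick start: tabular classification"; for sample code, see examples/run_tabular_classification_example.py.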

7. Available datasets

7.1. (Optional) Download the datasets

python download_public_datasets.py

7.2. Public dataset information

#   Name      Type     Domain    Size    Source       Data (w/o test labels)  Test labels
1   Munster   Image    HWR       18 MB   MNIST        munster.data            munster.solution
2   City      Image    Objects   128 MB  Cifar-10     city.data               city.solution
3   Chucky    Image    Objects   128 MB  Cifar-100    chucky.data             chucky.solution
4   Pedro     Image    People    377 MB  PA-100K      pedro.data              pedro.solution
5   Decal     Image    Aerial    73 MB   NWPU VHR-10  decal.data              decal.solution
6   Hammer    Image    Medical   111 MB  Ham10000     hammer.data             hammer.solution
7   Kreatur   Video    Action    469 MB  KTH          kreatur.data            kreatur.solution
8   Kreatur3  Video    Action    588 MB  KTH          kreatur3.data           kreatur3.solution
9   Kraut     Video    Action    1.9 GB  KTH          kraut.data              kraut.solution
10  Katze     Video    Action    1.9 GB  KTH          katze.data              katze.solution
11  data01    Speech   Speaker   1.8 GB  --           data01.data             data01.solution
12  data02    Speech   Emotion   53 MB   --           data02.data             data02.solution
13  data03    Speech   Accent    1.8 GB  --           data03.data             data03.solution
14  data04    Speech   Genre     469 MB  --           data04.data             data04.solution
15  data05    Speech   Language  208 MB  --           data05.data             data05.solution
16  O1        Text     Comments  828 KB  --           O1.data                 O1.solution
17  O2        Text     Emotion   25 MB   --           O2.data                 O2.solution
18  O3        Text     News      88 MB   --           O3.data                 O3.solution
19  O4        Text     Spam      87 MB   --           O4.data                 O4.solution
20  O5        Text     News      14 MB   --           O5.data                 O5.solution
21  Adult     Tabular  Census    2 MB    Adult        adult.data              adult.solution
22  Dilbert   Tabular  --        162 MB  --           dilbert.data            dilbert.solution
23  Digits    Tabular  HWR       137 MB  MNIST        digits.data             digits.solution
24  Madeline  Tabular  --        2.6 MB  --           madeline.data           madeline.solution
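
To pick datasets for a particular modality, the table above can be turned into a small lookup. The sketch below copies only the Name and Type columns from the table. The `PUBLIC_DATASETS` constant and the `datasets_by_type` helper are hypothetical names introduced for illustration.

```python
# (name, type) pairs copied from the Name and Type columns of the table above.
PUBLIC_DATASETS = [
    ("Munster", "Image"), ("City", "Image"), ("Chucky", "Image"),
    ("Pedro", "Image"), ("Decal", "Image"), ("Hammer", "Image"),
    ("Kreatur", "Video"), ("Kreatur3", "Video"), ("Kraut", "Video"),
    ("Katze", "Video"),
    ("data01", "Speech"), ("data02", "Speech"), ("data03", "Speech"),
    ("data04", "Speech"), ("data05", "Speech"),
    ("O1", "Text"), ("O2", "Text"), ("O3", "Text"), ("O4", "Text"),
    ("O5", "Text"),
    ("Adult", "Tabular"), ("Dilbert", "Tabular"), ("Digits", "Tabular"),
    ("Madeline", "Tabular"),
]


def datasets_by_type(datasets=PUBLIC_DATASETS):
    """Group dataset names by their modality, preserving table order."""
    grouped = {}
    for name, kind in datasets:
        grouped.setdefault(kind, []).append(name)
    return grouped


grouped = datasets_by_type()
counts = {kind: len(names) for kind, names in grouped.items()}
print(counts)
```

The per-type counts add up to the 24 public datasets listed in the table.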

8. Contributing

❤️ Contributions are welcome: open an issue or submit a PR.

9. Join the community

AutoDL community

10. License

Apache License 2.0
