OAID / Autokernel

Licence: apache-2.0

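AutoKernel is a simple, easy-to-use, low-barrier tool for automatic operator optimization that makes deploying deep learning algorithms more efficient.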

Labels

deep-learning pytorch tensorflow reinforcement-learning optimization tensor auto

AutoKernel

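Introduction

As artificial intelligence spreads and new deep learning networks keep appearing, every kind of hardware (CPU, GPU, NPU, ...) needs software libraries that provide high-performance deep learning tensor operations in order to support deep learning applications. Today these high-performance compute libraries are developed mainly by senior HPC (high-performance computing) optimization engineers. To speed up development and shorten the time it takes to bring deep learning applications into production, automated operator optimization is becoming the trend.

AutoKernel is a high-performance automatic operator optimization tool proposed by OPEN AI LAB. It automatically optimizes scheduling strategies and generates optimized low-level code, which greatly reduces the cost of developing operators for each hardware chip, improves operator optimization efficiency, and lets engineers deploy deep learning algorithms on each hardware chip with high performance more quickly.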

AutoKernel Features

Low barrier to entry
Simple and easy to use
High efficiency

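AutoKernel Architecture

AutoKernel consists of three modules:

Operator Generator:

This module uses the open-source project Halide. Halide is an automatic code generation project widely used in industry, and it was the first to propose separating computation from scheduling. The module takes a hardware-independent description of the operator's computation as input and outputs optimized assembly code or object files for the target backend.

AutoSearch module:

This module can use optimization algorithms, search algorithms, machine learning, or reinforcement learning to find the best operator schedule parameters for the target backend (this module is still under development).

Operator deployment plugin (AutoKernel Plugin):

Tengine is the deep learning inference framework open-sourced by OPEN AI LAB; it enables fast, efficient deployment of AI algorithms on different hardware. This module integrates the automatically generated optimized operator code into Tengine as a plugin in a single step, so that automatically optimized operators can be deployed in one step.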
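
To make the idea of separating computation from scheduling more concrete, here is a minimal sketch in plain Python. It does not use Halide and is not AutoKernel's search module: the function names (`matmul_reference`, `matmul_tiled`, `search_tile`), the choice of matrix multiplication, and the tile-size candidates are all assumptions made for illustration. The algorithm (what is computed) stays fixed, the tile size stands in for a schedule parameter (how the loops are ordered), and a brute-force timing loop stands in for the automatic search.

```python
import time

def matmul_reference(a, b):
    """Algorithm only: C[i][j] = sum over k of A[i][k] * B[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def matmul_tiled(a, b, tile):
    """Same algorithm under a different schedule: loops run over tile x tile blocks."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, p, tile):
            for k0 in range(0, m, tile):
                for i in range(i0, min(i0 + tile, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(k0, min(k0 + tile, m)):
                        a_ik, row_b = row_a[k], b[k]
                        for j in range(j0, min(j0 + tile, p)):
                            row_c[j] += a_ik * row_b[j]
    return c

def search_tile(a, b, candidates):
    """Brute-force search: time every candidate tile size and keep the fastest."""
    timings = {}
    for tile in candidates:
        start = time.perf_counter()
        matmul_tiled(a, b, tile)
        timings[tile] = time.perf_counter() - start
    best = min(timings, key=timings.get)
    return best, timings

# Small integer matrices so that results can be compared exactly.
n = 24
a = [[(i + j) % 7 for j in range(n)] for i in range(n)]
b = [[(i * j) % 5 for j in range(n)] for i in range(n)]
candidates = [2, 4, 8, 24]  # hypothetical schedule parameters

# Changing the schedule must never change the result of the algorithm.
reference = matmul_reference(a, b)
all_match = all(matmul_tiled(a, b, t) == reference for t in candidates)

best_tile, timings = search_tile(a, b, candidates)
print("schedules agree:", all_match)
print("fastest tile size on this run:", best_tile)
```

The fastest tile size depends on the machine and on the run, which is why the search is driven by measurement rather than by a fixed rule. A real search module would explore a much larger schedule space than a single tile size.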

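Quick Start

We provide a Docker image for AutoKernel so that developers can set up a development environment quickly.

# Pull the image (this may take a while, please be patient)
docker pull openailab/autokernel
# Start a container and enter the development environment
docker run -it openailab/autokernel /bin/bash

The Docker image ships with Halide and Tengine already installed: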

/workspace/Halide	# Halide
/workspace/Tengine  # Tengine

Clone the AutoKernel project:

git clone https://github.com/OAID/AutoKernel.git

Let's first look at the directory layout of autokernel_plugin/src/:

autokernel_plugin/src/
|-- CMakeLists.txt
|-- direct_conv
|   |-- build.sh
|   |-- direct_conv.cpp
|   |-- direct_conv.h
|   `-- direct_conv_gen.cc
|-- im2col_conv
|   |-- build.sh
|   |-- im2col_conv.cpp
|   |-- im2col_conv.h
|   `-- im2col_conv_gen.cc
`-- plugin_init.cpp

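As you can see, the src directory contains two folders, and each folder contains:

xxx_gen.cc: the operator description (algorithm) and scheduling strategy (schedule), written in Halide
build.sh: builds xxx_gen
xxx.h and xxx.cpp: the operator implementation wrapped in the Tengine operator interface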

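Generate the operator assembly code in one step:

cd AutoKernel/autokernel_plugin
chmod +x -R .
./scripts/generate.sh  # automatically generate the operator assembly files

After this step, each operator directory contains two new automatically generated files: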

|-- im2col_conv
|   |-- halide_im2col_conv.h
|   `-- halide_im2col_conv.s
|-- direct_conv
|   |-- halide_direct_conv.h
|   `-- halide_direct_conv.s
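
Before moving on to the build, it can be useful to confirm that the generation step produced the expected files. The sketch below is only an illustration: the helper name `missing_generated_files` is hypothetical, and the `halide_<operator>.h` / `halide_<operator>.s` naming is inferred from the listing above. The demonstration runs against a throwaway temporary directory rather than a real checkout.

```python
import tempfile
from pathlib import Path

def missing_generated_files(src_dir, operators):
    """Return the expected halide_<op>.h / halide_<op>.s files that are absent."""
    src = Path(src_dir)
    missing = []
    for op in operators:
        for suffix in (".h", ".s"):
            path = src / op / f"halide_{op}{suffix}"
            if not path.is_file():
                missing.append(path.relative_to(src).as_posix())
    return missing

# Demonstration on a temporary directory that mimics autokernel_plugin/src/,
# with one generated file deliberately left out.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp)
    for op in ("direct_conv", "im2col_conv"):
        (src / op).mkdir()
    (src / "direct_conv" / "halide_direct_conv.h").touch()
    (src / "direct_conv" / "halide_direct_conv.s").touch()
    (src / "im2col_conv" / "halide_im2col_conv.h").touch()
    demo_missing = missing_generated_files(src, ["direct_conv", "im2col_conv"])

print(demo_missing)
```

In a real checkout the first argument would be the autokernel_plugin/src/ directory, and an empty list would mean that every expected file is present.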

Next, use the generated files to register AutoKernel with Tengine and build libautokernel.so in one step:

mkdir build
cd build
cmake ..
make -j4

The generated library is at /workspace/AutoKernel/autokernel_plugin/build/src/libautokernel.so

Run the test; the test code calls load_tengine_plugin():

cd /workspace/AutoKernel/autokernel_plugin
./build/tests/tm_classification -n squeezenet

The classification network produces the following output:

AutoKernel plugin inited
function:autokernel_plugin_init executed

...

Repeat 1 times, avg time per run is 55.932 ms
max time is 55.932 ms, min time is 55.932 ms
--------------------------------------
0.2732 - "n02123045 tabby, tabby cat"
0.2676 - "n02123159 tiger cat"
0.1810 - "n02119789 kit fox, Vulpes macrotis"
0.0818 - "n02124075 Egyptian cat"
0.0724 - "n02085620 Chihuahua"
--------------------------------------
ALL TEST DONE

As the output shows, functions in the AutoKernel plugin were called.
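
If the test is run from a script, the same check can be automated. The following is a rough sketch, assuming the output keeps the format of the sample above; the function name `summarize_run` and the regular expressions are illustrative and are not part of AutoKernel or Tengine.

```python
import re

# Sample text copied from the output shown above.
SAMPLE_LOG = '''AutoKernel plugin inited
function:autokernel_plugin_init executed
Repeat 1 times, avg time per run is 55.932 ms
max time is 55.932 ms, min time is 55.932 ms
--------------------------------------
0.2732 - "n02123045 tabby, tabby cat"
0.2676 - "n02123159 tiger cat"
0.1810 - "n02119789 kit fox, Vulpes macrotis"
0.0818 - "n02124075 Egyptian cat"
0.0724 - "n02085620 Chihuahua"
--------------------------------------
ALL TEST DONE
'''

def summarize_run(log_text):
    """Extract the plugin-init marker, average time, and ranked labels from a log."""
    avg = re.search(r"avg time per run is ([0-9.]+) ms", log_text)
    ranked = re.findall(r'^([0-9.]+) - "(.+)"$', log_text, flags=re.MULTILINE)
    return {
        # True only if the plugin's init message appears in the log.
        "plugin_loaded": "AutoKernel plugin inited" in log_text,
        "avg_ms": float(avg.group(1)) if avg else None,
        "top": [(float(score), label) for score, label in ranked],
    }

summary = summarize_run(SAMPLE_LOG)
print(summary["plugin_loaded"], summary["avg_ms"], summary["top"][0][1])
```

A missing init message would suggest that the plugin was not loaded and that the run used the default operator implementations instead.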

Docker

We provide the following three Docker images, each with Halide and Tengine installed, so that developers can use them directly:

cpu: openailab/autokernel
cuda: openailab/autokernel:cuda
opencl: openailab/autokernel:opencl

See the Dockerfiles directory for the Dockerfiles themselves.

[NOTE]: The cuda image must be run with nvidia-docker; for installation instructions, see the nvidia-docker install-guide.

nvidia-docker pull openailab/autokernel:cuda
nvidia-docker run -it openailab/autokernel:cuda /bin/bash
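
For scripted setups, the choice of image and launcher can be captured in a small helper. This is a minimal sketch: the image names are the three listed in this section, while the helper name `docker_commands` and the use of plain docker for the cpu and opencl images are assumptions.

```python
# Image names as listed in this section.
IMAGES = {
    "cpu": "openailab/autokernel",
    "cuda": "openailab/autokernel:cuda",
    "opencl": "openailab/autokernel:opencl",
}

def docker_commands(backend):
    """Return the pull and run command lines for the given backend."""
    if backend not in IMAGES:
        raise ValueError(f"unknown backend {backend!r}; expected one of {sorted(IMAGES)}")
    image = IMAGES[backend]
    # The cuda image needs nvidia-docker; plain docker is assumed for the others.
    launcher = "nvidia-docker" if backend == "cuda" else "docker"
    return [
        f"{launcher} pull {image}",
        f"{launcher} run -it {image} /bin/bash",
    ]

for line in docker_commands("cuda"):
    print(line)
```

The helper only builds command strings and does not run them, so it can be tried on a machine without Docker installed.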

Developer Guide

How to quickly develop a new automatically optimized operator: doc/how_to_add_op.md
AutoKernel tutorials: doc/tutorials

Roadmap

Road map

License

Apache 2.0

Technical Discussion

Github issues
QQ group: 829565581
Email: [email protected]
WeChat official account: Tengine开发者社区 (Tengine Developer Community)
