csarron / Awesome Emdl

License: MIT
Embedded and mobile deep learning research resources

Projects that are alternatives to or similar to Awesome Emdl

Distiller
Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
Stars: ✭ 3,760 (+578.7%)
Mutual labels:  deep-neural-networks, quantization, pruning
Aimet
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Stars: ✭ 453 (-18.23%)
Mutual labels:  deep-neural-networks, quantization, pruning
sparsezoo
Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes
Stars: ✭ 264 (-52.35%)
Mutual labels:  pruning, quantization
torch-model-compression
An automated toolset for analyzing and modifying the structure of PyTorch models, including a library of model-compression algorithms that automatically analyze model structure
Stars: ✭ 126 (-77.26%)
Mutual labels:  pruning, quantization
bert-squeeze
🛠️ Tools for Transformers compression using PyTorch Lightning ⚡
Stars: ✭ 56 (-89.89%)
Mutual labels:  pruning, quantization
Models
Model Zoo for Intel® Architecture: contains Intel optimizations for running deep learning workloads on Intel® Xeon® Scalable processors
Stars: ✭ 248 (-55.23%)
Mutual labels:  deep-neural-networks, inference
fastT5
⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x.
Stars: ✭ 421 (-24.01%)
Mutual labels:  inference, quantization
ATMC
[NeurIPS'2019] Shupeng Gui, Haotao Wang, Haichuan Yang, Chen Yu, Zhangyang Wang, Ji Liu, “Model Compression with Adversarial Robustness: A Unified Optimization Framework”
Stars: ✭ 41 (-92.6%)
Mutual labels:  pruning, quantization
Hey Jetson
Deep Learning based Automatic Speech Recognition with attention for the Nvidia Jetson.
Stars: ✭ 161 (-70.94%)
Mutual labels:  deep-neural-networks, inference
sparsify
Easy-to-use UI for automatically sparsifying neural networks and creating sparsification recipes for better inference performance and a smaller footprint
Stars: ✭ 138 (-75.09%)
Mutual labels:  pruning, quantization
SSD-Pruning-and-quantization
Pruning and quantization for SSD. Model compression.
Stars: ✭ 19 (-96.57%)
Mutual labels:  pruning, quantization
Chaidnn
HLS-based Deep Neural Network Accelerator Library for Xilinx UltraScale+ MPSoCs
Stars: ✭ 258 (-53.43%)
Mutual labels:  deep-neural-networks, inference
Adversarial Robustness Toolbox
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
Stars: ✭ 2,638 (+376.17%)
Mutual labels:  deep-neural-networks, inference
Bmw Yolov4 Inference Api Cpu
A repository for a no-code object detection inference API using YOLOv4 and YOLOv3 with OpenCV.
Stars: ✭ 180 (-67.51%)
Mutual labels:  deep-neural-networks, inference
Terngrad
Ternary Gradients to Reduce Communication in Distributed Deep Learning (TensorFlow)
Stars: ✭ 168 (-69.68%)
Mutual labels:  deep-neural-networks, quantization
neural-compressor
Intel® Neural Compressor (formerly Intel® Low Precision Optimization Tool) provides unified APIs for network compression techniques, such as low-precision quantization, sparsity, pruning, and knowledge distillation, across different deep learning frameworks in pursuit of optimal inference performance.
Stars: ✭ 666 (+20.22%)
Mutual labels:  pruning, quantization
Graffitist
Graph Transforms to Quantize and Retrain Deep Neural Nets in TensorFlow
Stars: ✭ 135 (-75.63%)
Mutual labels:  deep-neural-networks, quantization
Ctranslate2
Fast inference engine for OpenNMT models
Stars: ✭ 140 (-74.73%)
Mutual labels:  deep-neural-networks, quantization
optimum
🏎️ Accelerate training and inference of 🤗 Transformers with easy-to-use hardware optimization tools
Stars: ✭ 567 (+2.35%)
Mutual labels:  inference, quantization
Bmw Tensorflow Inference Api Gpu
A repository for an object detection inference API using the TensorFlow framework.
Stars: ✭ 277 (-50%)
Mutual labels:  deep-neural-networks, inference

EMDL

Embedded and mobile deep learning research notes

Papers

Survey

  1. EfficientDNNs
  2. A Survey of Model Compression and Acceleration for Deep Neural Networks [arXiv '17]

Model

  1. Searching for MobileNetV3 [arXiv '19, Google]

  2. MobileNetV2: Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation [arXiv '18, Google]

  3. NasNet: Learning Transferable Architectures for Scalable Image Recognition [arXiv '17, Google]

  4. DeepRebirth: Accelerating Deep Neural Network Execution on Mobile Devices [AAAI'18, Samsung]

  5. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices [arXiv '17, Megvii]

  6. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications [arXiv '17, Google]

  7. CondenseNet: An Efficient DenseNet using Learned Group Convolutions [arXiv '17]
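
Most of the architectures above (MobileNets, MobileNetV2, ShuffleNet) get their efficiency from depthwise separable convolutions, which factor a standard convolution into a per-channel spatial filter followed by a 1x1 pointwise channel mix. Below is a minimal PyTorch sketch of that building block; the layer sizes are illustrative, not taken from any of the papers.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """One standard conv factored into depthwise + pointwise stages."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        # Pointwise: 1x1 conv that mixes channels, mapping in_ch -> out_ch.
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.pointwise(self.depthwise(x))))

y = DepthwiseSeparableConv(32, 64)(torch.randn(1, 32, 56, 56))
print(y.shape)  # torch.Size([1, 64, 56, 56])
```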

System

  1. DeepMon: Mobile GPU-based Deep Learning Framework for Continuous Vision Applications [MobiSys '17]

  2. DeepEye: Resource Efficient Local Execution of Multiple Deep Vision Models using Wearable Commodity Hardware [MobiSys '17]

  3. MobiRNN: Efficient Recurrent Neural Network Execution on Mobile GPU [EMDL '17]

  4. DeepSense: A GPU-based deep convolutional neural network framework on commodity mobile devices [WearSys '16]

  5. DeepX: A Software Accelerator for Low-Power Deep Learning Inference on Mobile Devices [IPSN '16]

  6. EIE: Efficient Inference Engine on Compressed Deep Neural Network [ISCA '16]

  7. MCDNN: An Approximation-Based Execution Framework for Deep Stream Processing Under Resource Constraints [MobiSys '16]

  8. DXTK: Enabling Resource-efficient Deep Learning on Mobile and Embedded Devices with the DeepX Toolkit [MobiCASE '16]

  9. Sparsification and Separation of Deep Learning Layers for Constrained Resource Inference on Wearables [SenSys '16]

  10. An Early Resource Characterization of Deep Learning on Wearables, Smartphones and Internet-of-Things Devices [IoT-App '15]

  11. CNNdroid: GPU-Accelerated Execution of Trained Deep Convolutional Neural Networks on Android [MM '16]

  12. fpgaConvNet: A Toolflow for Mapping Diverse Convolutional Neural Networks on Embedded FPGAs [NIPS '17]

Quantization

  1. Quantizing deep convolutional networks for efficient inference: A whitepaper

  2. LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks [ECCV'18]

  3. The ZipML Framework for Training Models with End-to-End Low Precision: The Cans, the Cannots, and a Little Bit of Deep Learning [ICML'17]

  4. Compressing Deep Convolutional Networks using Vector Quantization [arXiv'14]

  5. Quantized Convolutional Neural Networks for Mobile Devices [CVPR '16]

  6. Fixed-Point Performance Analysis of Recurrent Neural Networks [ICASSP'16]

  7. Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations [arXiv'16]

  8. Loss-aware Binarization of Deep Networks [ICLR'17]

  9. Towards the Limit of Network Quantization [ICLR'17]

  10. Deep Learning with Low Precision by Half-wave Gaussian Quantization [CVPR'17]

  11. ShiftCNN: Generalized Low-Precision Architecture for Inference of Convolutional Neural Networks [arXiv'17]

  12. Training and Inference with Integers in Deep Neural Networks [ICLR'18]
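
Most of the papers above study quantized or low-precision training; for the common post-training case, mobile frameworks expose quantization directly. Below is a minimal sketch using TensorFlow Lite's post-training dynamic-range quantization; the SavedModel path is a placeholder.

```python
import tensorflow as tf

# Convert a trained SavedModel to a TFLite flatbuffer with
# post-training dynamic-range quantization (weights stored as int8).
converter = tf.lite.TFLiteConverter.from_saved_model("path/to/saved_model")  # placeholder
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("model_quant.tflite", "wb") as f:
    f.write(tflite_model)
```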

Pruning

  1. Awesome-Pruning
  2. Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration [CVPR'19]
  3. Learning both Weights and Connections for Efficient Neural Networks [NIPS'15]
  4. Pruning Filters for Efficient ConvNets [ICLR'17]
  5. Pruning Convolutional Neural Networks for Resource Efficient Inference [ICLR'17]
  6. Soft Weight-Sharing for Neural Network Compression [ICLR'17]
  7. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding [ICLR'16]
  8. Dynamic Network Surgery for Efficient DNNs [NIPS'16]
  9. Designing Energy-Efficient Convolutional Neural Networks using Energy-Aware Pruning [CVPR'17]
  10. ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression [ICCV'17]
  11. To prune, or not to prune: exploring the efficacy of pruning for model compression [ICLR'18]
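
A recurring baseline in the papers above is magnitude pruning: zero out the weights with the smallest absolute values, then fine-tune. Below is a minimal sketch using PyTorch's built-in pruning utilities; the layer and sparsity level are illustrative.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(512, 512)  # illustrative layer

# Zero the 80% of weights with the smallest absolute value (L1 magnitude).
prune.l1_unstructured(layer, name="weight", amount=0.8)

# Pruning is applied via a mask and forward hook; bake it in permanently:
prune.remove(layer, "weight")

sparsity = (layer.weight == 0).float().mean().item()
print(f"weight sparsity: {sparsity:.1%}")  # ~80.0%
```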

Approximation

  1. Efficient and Accurate Approximations of Nonlinear Convolutional Networks [CVPR'15]
  2. Accelerating Very Deep Convolutional Networks for Classification and Detection (extended version of the above)
  3. Convolutional neural networks with low-rank regularization [arXiv'15]
  4. Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation [NIPS'14]
  5. Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications [ICLR'16]
  6. High performance ultra-low-precision convolutions on mobile devices [NIPS'17]
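
The low-rank papers above rest on one observation: a trained weight matrix W can often be replaced by a product of two thin matrices, turning one large matmul into two cheap ones. Below is a minimal NumPy sketch via truncated SVD; the matrix here is random, so its reconstruction error will be far higher than for a real trained layer.

```python
import numpy as np

W = np.random.randn(1024, 1024)  # stand-in for a trained FC weight matrix
rank = 64                        # illustrative target rank

# Truncated SVD: W ~= A @ B with A (1024 x rank) and B (rank x 1024),
# cutting multiply-adds from 1024*1024 to 2*1024*rank (~8x here).
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :rank] * s[:rank]
B = Vt[:rank, :]

x = np.random.randn(1024)
err = np.linalg.norm(W @ x - A @ (B @ x)) / np.linalg.norm(W @ x)
print(f"relative error at rank {rank}: {err:.3f}")
```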

Characterization

  1. A First Look at Deep Learning Apps on Smartphones [WWW'19]
  2. Machine Learning at Facebook: Understanding Inference at the Edge [HPCA'19]
  3. NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications [ECCV 2018]
  4. Latency and Throughput Characterization of Convolutional Neural Networks for Mobile Computer Vision [MMSys '18]
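
The measurement protocol behind these studies is worth making explicit: warm-up runs first (to absorb lazy initialization, JIT compilation, and cache effects), then many timed iterations summarized by a robust statistic. Below is a minimal, framework-agnostic sketch; `run_once` is any callable that performs a single inference.

```python
import time
import statistics

def benchmark(run_once, warmup=10, iters=100):
    """Median and spread of per-inference latency, in milliseconds."""
    for _ in range(warmup):  # discard cold-start effects
        run_once()
    times_ms = []
    for _ in range(iters):
        t0 = time.perf_counter()
        run_once()
        times_ms.append((time.perf_counter() - t0) * 1e3)
    return statistics.median(times_ms), statistics.stdev(times_ms)

# e.g. median_ms, stdev_ms = benchmark(lambda: interpreter.invoke())
```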

Libraries

Inference Framework

  1. alibaba/MNN

  2. TensorFlow Lite GPU

  3. TensorFlow Lite

  4. XiaoMi/mace: MACE is a deep learning inference framework optimized for mobile heterogeneous computing platforms.

  5. Tencent/ncnn: ncnn is a high-performance neural network inference framework optimized for the mobile platform

  6. baidu/paddle-mobile

  7. BERT and GPT-2 on iPhone

  8. Apple CoreML

  9. Snapdragon Neural Processing Engine

  10. ARM-software/ComputeLibrary: The ARM Computer Vision and Machine Learning library is a set of functions optimised for both ARM CPUs and GPUs using SIMD technologies (Intro)

  11. Microsoft Embedded Learning Library

  12. MXNet Amalgamation

  13. OAID/Tengine: Tengine is a lightweight, high-performance, modular inference engine for embedded devices

  14. xmartlabs/Bender: Easily craft fast Neural Networks on iOS! Use TensorFlow models. Metal under the hood.

  15. JDAI-CV/dabnn: dabnn is an accelerated binary neural network inference framework for mobile platforms
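
Despite API differences, most frameworks in this list share the same deployment pattern: load a converted model, bind input and output tensors, then invoke. Below is a minimal sketch with the TensorFlow Lite Python Interpreter; the model path is a placeholder.

```python
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")  # placeholder path
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy input matching the model's declared shape and dtype.
x = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], x)
interpreter.invoke()

y = interpreter.get_tensor(output_details[0]["index"])
print(y.shape)
```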

Optimization Tools

  1. Neural Network Distiller

  2. An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications

Research Demos

  1. RSTensorFlow: GPU Accelerated TensorFlow for Commodity Android Devices

Web

  1. mil-tokyo/webdnn: Fastest DNN Execution Framework on Web Browser

Tutorials

General

  1. Squeezing Deep Learning Into Mobile Phones

  2. Deep Learning – Tutorial and Recent Trends

  3. Tutorial on Hardware Architectures for Deep Neural Networks

  4. Efficient Convolutional Neural Network Inference on Mobile GPUs

NEON

  1. NEON™ Programmer’s Guide

OpenCL

  1. ARM® Mali™ GPU OpenCL Developer Guide (PDF)

  2. Optimal Compute on ARM Mali™ GPUs

  3. GPU Compute for Mobile Devices

  4. Compute for Mobile Devices: Performance Focused

  5. Hands On OpenCL

  6. Adreno OpenCL Programming Guide

  7. Better OpenCL Performance on Qualcomm Adreno GPU

Courses

  1. UW Deep learning systems

  2. Berkeley Machine Learning Systems

Demos

General

  1. TensorFlow Android Camera Demo

  2. TensorFlow iOS Example

  3. Caffe2 AICamera

Vulkan

  1. Vulkan API Examples and Demos

  2. Neural Machine Translation on Android

OpenCL

  1. DeepMon

RenderScript

  1. Mobile_ConvNet: RenderScript CNN for Android

Tools

GPU

  1. Bifrost GPU architecture and ARM Mali-G71 GPU

  2. Midgard GPU Architecture, ARM Mali-T880 GPU

  3. Mobile GPU market share

Driver

  1. [Adreno] csarron/qcom_vendor_binaries: Common Proprietary Qualcomm Binaries
  2. [Mali] Fevax/vendor_samsung_hero2ltexx: Blobs from s7 Edge G935F