Micronetmicronet, a model compression and deploy lib. compression: 1、quantization: quantization-aware-training(QAT), High-Bit(>2b)(DoReFa/Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference)、Low-Bit(≤2b)/Ternary and Binary(TWN/BNN/XNOR-Net); post-training-quantization(PTQ), 8-bit(tensorrt); 2、 pruning: normal、regular and group convolutional channel pruning; 3、 group convolution structure; 4、batch-normalization fuse for quantization. deploy: tensorrt, fp32/fp16/int8(ptq-calibration)、op-adapt(upsample)、dynamic_shape
Stars: ✭ 1,232 (+877.78%)
sparsifyEasy-to-use UI for automatically sparsifying neural networks and creating sparsification recipes for better inference performance and a smaller footprint
Stars: ✭ 138 (+9.52%)
InsightFace-RESTInsightFace REST API for easy deployment of face recognition services with TensorRT in Docker.
Stars: ✭ 308 (+144.44%)
Kd libA Pytorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.
Stars: ✭ 173 (+37.3%)
deepvacPyTorch Project Specification.
Stars: ✭ 507 (+302.38%)
PaddleslimPaddleSlim is an open-source library for deep model compression and architecture search.
Stars: ✭ 677 (+437.3%)
Awesome Ml Model CompressionAwesome machine learning model compression research papers, tools, and learning material.
Stars: ✭ 166 (+31.75%)
ATMC[NeurIPS'2019] Shupeng Gui, Haotao Wang, Haichuan Yang, Chen Yu, Zhangyang Wang, Ji Liu, “Model Compression with Adversarial Robustness: A Unified Optimization Framework”
Stars: ✭ 41 (-67.46%)
DistillerNeural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
Stars: ✭ 3,760 (+2884.13%)
neural-compressorIntel® Neural Compressor (formerly known as Intel® Low Precision Optimization Tool), targeting to provide unified APIs for network compression technologies, such as low precision quantization, sparsity, pruning, knowledge distillation, across different deep learning frameworks to pursue optimal inference performance.
Stars: ✭ 666 (+428.57%)
Model OptimizationA toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
Stars: ✭ 992 (+687.3%)
Filter Pruning Geometric MedianFilter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration (CVPR 2019 Oral)
Stars: ✭ 338 (+168.25%)
MQBench QuantizeQAT(quantize aware training) for classification with MQBench
Stars: ✭ 29 (-76.98%)
Deepstream ProjectThis is a highly separated deployment project based on Deepstream , including the full range of Yolo and continuously expanding deployment projects such as Ocr.
Stars: ✭ 120 (-4.76%)
DS-Net(CVPR 2021, Oral) Dynamic Slimmable Network
Stars: ✭ 204 (+61.9%)
mediapipe plusThe purpose of this project is to apply mediapipe to more AI chips.
Stars: ✭ 38 (-69.84%)
sparsezooNeural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes
Stars: ✭ 264 (+109.52%)
NncfPyTorch*-based Neural Network Compression Framework for enhanced OpenVINO™ inference
Stars: ✭ 218 (+73.02%)
AI-LABThis repository contains a docker image that I use to develop my artificial intelligence applications in an uncomplicated fashion. Python, TensorFlow, PyTorch, ONNX, Keras, OpenCV, TensorRT, Numpy, Jupyter notebook... 🐋🔥
Stars: ✭ 44 (-65.08%)
Torch PruningA pytorch pruning toolkit for structured neural network pruning and layer dependency maintaining.
Stars: ✭ 193 (+53.17%)
ppqPPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
Stars: ✭ 281 (+123.02%)
YOLOXYOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
Stars: ✭ 6,570 (+5114.29%)
Pytorch Yolov4PyTorch ,ONNX and TensorRT implementation of YOLOv4
Stars: ✭ 3,690 (+2828.57%)
optimum🏎️ Accelerate training and inference of 🤗 Transformers with easy to use hardware optimization tools
Stars: ✭ 567 (+350%)
BitPackBitPack is a practical tool to efficiently save ultra-low precision/mixed-precision quantized models.
Stars: ✭ 36 (-71.43%)
ZAQ-codeCVPR 2021 : Zero-shot Adversarial Quantization (ZAQ)
Stars: ✭ 59 (-53.17%)
Awesome Edge Machine LearningA curated list of awesome edge machine learning resources, including research papers, inference engines, challenges, books, meetups and others.
Stars: ✭ 139 (+10.32%)
fastT5⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x.
Stars: ✭ 421 (+234.13%)
mtomoMultiple types of NN model optimization environments. It is possible to directly access the host PC GUI and the camera to verify the operation. Intel iHD GPU (iGPU) support. NVIDIA GPU (dGPU) support.
Stars: ✭ 24 (-80.95%)
vs-mlrtEfficient ML Filter Runtimes for VapourSynth (with built-in support for waifu2x, DPIR, RealESRGANv2, and Real-CUGAN)
Stars: ✭ 34 (-73.02%)
bert-squeeze🛠️ Tools for Transformers compression using PyTorch Lightning ⚡
Stars: ✭ 56 (-55.56%)
Regularization-Pruning[ICLR'21] PyTorch code for our paper "Neural Pruning via Growing Regularization"
Stars: ✭ 44 (-65.08%)
onnx2tensorRttensorRt-inference darknet2onnx pytorch2onnx mxnet2onnx python version
Stars: ✭ 14 (-88.89%)
TengineTengine is a lite, high performance, modular inference engine for embedded device
Stars: ✭ 4,012 (+3084.13%)
Ntaggerreference pytorch code for named entity tagging
Stars: ✭ 58 (-53.97%)
Soft Filter PruningSoft Filter Pruning for Accelerating Deep Convolutional Neural Networks
Stars: ✭ 291 (+130.95%)
Awesome PruningA curated list of neural network pruning resources.
Stars: ✭ 1,017 (+707.14%)
Tf2An Open Source Deep Learning Inference Engine Based on FPGA
Stars: ✭ 113 (-10.32%)
HawqQuantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.
Stars: ✭ 108 (-14.29%)
Awesome Automl And Lightweight ModelsA list of high-quality (newest) AutoML works and lightweight models including 1.) Neural Architecture Search, 2.) Lightweight Structures, 3.) Model Compression, Quantization and Acceleration, 4.) Hyperparameter Optimization, 5.) Automated Feature Engineering.
Stars: ✭ 691 (+448.41%)
Pinto model zooA repository that shares tuning results of trained models generated by TensorFlow / Keras. Post-training quantization (Weight Quantization, Integer Quantization, Full Integer Quantization, Float16 Quantization), Quantization-aware training. TensorFlow Lite. OpenVINO. CoreML. TensorFlow.js. TF-TRT. MediaPipe. ONNX. [.tflite,.h5,.pb,saved_model,tfjs,tftrt,mlmodel,.xml/.bin, .onnx]
Stars: ✭ 634 (+403.17%)
Pretrained Language ModelPretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Stars: ✭ 2,033 (+1513.49%)
SViTE[NeurIPS'21] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang
Stars: ✭ 50 (-60.32%)
AimetAIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Stars: ✭ 453 (+259.52%)
Awesome EmdlEmbedded and mobile deep learning research resources
Stars: ✭ 554 (+339.68%)
Tengine-Convert-ToolsTengine Convert Tool supports converting multi framworks' models into tmfile that suitable for Tengine-Lite AI framework.
Stars: ✭ 89 (-29.37%)
Auto-CompressionAutomatic DNN compression tool with various model compression and neural architecture search techniques
Stars: ✭ 19 (-84.92%)
torchpruneA research library for pytorch-based neural network pruning, compression, and more.
Stars: ✭ 133 (+5.56%)
AgentOCR一个多语言支持、易使用的 OCR 项目。An easy-to-use OCR project with multilingual support.
Stars: ✭ 98 (-22.22%)
yolov5 tensorrtThis is the implementation that supports yolov5s, yolov5m, yolov5l, yolov5x.
Stars: ✭ 32 (-74.6%)
TF2DeepFloorplanTF2 Deep FloorPlan Recognition using a Multi-task Network with Room-boundary-Guided Attention. Enable tensorboard, quantization, flask, tflite, docker, github actions and google colab.
Stars: ✭ 98 (-22.22%)
Selecsls PytorchReference ImageNet implementation of SelecSLS CNN architecture proposed in the SIGGRAPH 2020 paper "XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera". The repository also includes code for pruning the model based on implicit sparsity emerging from adaptive gradient descent methods, as detailed in the CVPR 2019 paper "On implicit filter level sparsity in Convolutional Neural Networks".
Stars: ✭ 251 (+99.21%)
GAN-LTH[ICLR 2021] "GANs Can Play Lottery Too" by Xuxi Chen, Zhenyu Zhang, Yongduo Sui, Tianlong Chen
Stars: ✭ 24 (-80.95%)