Wulingtian / EfficientNetv2_TensorRT_int8

Licence: other

EfficientNetv2 TensorRT int8

Programming Languages

python

139335 projects - #7 most used programming language

C++

36643 projects - #6 most used programming language

CMake

9771 projects

Projects that are alternatives of or similar to EfficientNetv2 TensorRT int8

flexible-yolov5

More readable and flexible yolov5 with more backbones (resnet, shufflenet, mobilenet, efficientnet, hrnet, swin-transformer) and modules (cbam, dcn and so on), and tensorrt

Stars: ✭ 282 (+729.41%)

Mutual labels: tensorrt

Torch2trt

An easy to use PyTorch to TensorRT converter

Stars: ✭ 2,974 (+8647.06%)

Mutual labels: tensorrt

nanodet tensorrt int8

nanodet int8 quantization; measured inference takes 2 ms per frame!

Stars: ✭ 37 (+8.82%)

Mutual labels: tensorrt

Pytorch Yolov4

PyTorch, ONNX and TensorRT implementation of YOLOv4

Stars: ✭ 3,690 (+10752.94%)

Mutual labels: tensorrt

Berrynet

Deep learning gateway on Raspberry Pi and other edge devices

Stars: ✭ 1,529 (+4397.06%)

Mutual labels: tensorrt

Tnn

TNN: developed by Tencent Youtu Lab and Guangying Lab, a uniform deep learning inference framework for mobile, desktop and server. TNN is distinguished by several outstanding features, including its cross-platform capability, high performance, model compression and code pruning. Based on ncnn and Rapidnet, TNN further strengthens the support and …

Stars: ✭ 3,257 (+9479.41%)

Mutual labels: tensorrt

RepVGG TensorRT int8

RepVGG TensorRT int8 quantization; measured inference takes under 1 ms per frame!

Stars: ✭ 35 (+2.94%)

Mutual labels: tensorrt

deepstream tao apps

Sample apps to demonstrate how to deploy models trained with TAO on DeepStream

Stars: ✭ 274 (+705.88%)

Mutual labels: tensorrt

Tensorrtx

Implementation of popular deep learning networks with TensorRT network definition API

Stars: ✭ 3,456 (+10064.71%)

Mutual labels: tensorrt

Deepstream Project

This is a highly separated deployment project based on Deepstream, including the full range of Yolo and continuously expanding deployment projects such as Ocr.

Stars: ✭ 120 (+252.94%)

Mutual labels: tensorrt

Tensorrt

TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.

Stars: ✭ 4,644 (+13558.82%)

Mutual labels: tensorrt

Tensorflow Yolov4 Tflite

YOLOv4, YOLOv4-tiny, YOLOv3, YOLOv3-tiny Implemented in Tensorflow 2.0, Android. Convert YOLO v4 .weights tensorflow, tensorrt and tflite

Stars: ✭ 1,881 (+5432.35%)

Mutual labels: tensorrt

Tengine

Tengine is a lite, high performance, modular inference engine for embedded device

Stars: ✭ 4,012 (+11700%)

Mutual labels: tensorrt

Torch-TensorRT

PyTorch/TorchScript compiler for NVIDIA GPUs using TensorRT

Stars: ✭ 1,216 (+3476.47%)

Mutual labels: tensorrt

TensorRT CV

🚀🚀🚀 NVIDIA TensorRT accelerated inference tutorial!

Stars: ✭ 125 (+267.65%)

Mutual labels: tensorrt

onnx2tensorRt

tensorRt-inference darknet2onnx pytorch2onnx mxnet2onnx python version

Stars: ✭ 14 (-58.82%)

Mutual labels: tensorrt

Deepdetect

Deep Learning API and Server in C++14 support for Caffe, Caffe2, PyTorch, TensorRT, Dlib, NCNN, Tensorflow, XGBoost and TSNE

Stars: ✭ 2,306 (+6682.35%)

Mutual labels: tensorrt

isaac ros dnn inference

Hardware-accelerated DNN model inference ROS2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with CUDA-capable GPU

Stars: ✭ 67 (+97.06%)

Mutual labels: tensorrt

trt pose hand

Real-time hand pose estimation and gesture classification using TensorRT

Stars: ✭ 137 (+302.94%)

Mutual labels: tensorrt

EffcientNetV2

EfficientNetV2 implementation using PyTorch

Stars: ✭ 94 (+176.47%)

Mutual labels: efficientnetv2

EfficientNetv2_TensorRT_int8

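The EfficientNetv2 model implementation comes from https://github.com/d-li14/efficientnetv2.pytorch

Environment setup

ubuntu: 18.04

cuda: 11.0

cudnn: 8.0

tensorrt: 7.2.1.6

OpenCV: 3.4.2

The cuda, cudnn, tensorrt and OpenCV installation packages can be downloaded from the following link:

Link: https://pan.baidu.com/s/1XSzHJ1kPXO0PrAMAF6uNyA Password: b88e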

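cuda installation

If an NVIDIA driver is already installed on the system, remove it with the following command

sudo apt-get purge 'nvidia*'

Disable nouveau by editing the blacklist file

sudo vim /etc/modprobe.d/blacklist.conf

Append the following line at the end of the file

blacklist nouveau

Then run

sudo update-initramfs -u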
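To confirm that the blacklist took effect (typically after a reboot), one option is to check whether the nouveau kernel module is still loaded. The sketch below is not part of the original project; the function names are hypothetical and it assumes a Linux system that exposes the loaded modules in /proc/modules.

```python
def is_module_loaded(proc_modules_text, module_name="nouveau"):
    """Return True if module_name appears as a loaded module.

    proc_modules_text is the content of /proc/modules, where the first
    whitespace-separated field of every line is the module name.
    """
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == module_name:
            return True
    return False


def nouveau_loaded_on_this_machine():
    """Read /proc/modules on the local (Linux) machine and check for nouveau."""
    with open("/proc/modules") as handle:
        return is_module_loaded(handle.read())
```

Calling nouveau_loaded_on_this_machine() should return False before the CUDA installer is started.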

chmod +x cuda_11.0.2_450.51.05_linux.run

sudo ./cuda_11.0.2_450.51.05_linux.run

When asked whether to accept the license agreement, type: accept

Then select Install

Finally, press Enter

Open ~/.bashrc with vim and add the following lines:

export PATH=/usr/local/cuda-11.0/bin:$PATH

export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64:$LD_LIBRARY_PATH

Run source ~/.bashrc to apply the changes
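A quick way to check that the two export lines were picked up is to inspect the environment from Python. This is a minimal sketch, not part of the original project; cuda_env_problems is a hypothetical helper and the install prefix is the one used in the export lines above.

```python
import os
import shutil

CUDA_HOME = "/usr/local/cuda-11.0"  # install prefix used in the export lines


def cuda_env_problems(env, cuda_home=CUDA_HOME):
    """Return a list of messages describing missing PATH / LD_LIBRARY_PATH entries."""
    problems = []
    bin_dir = os.path.join(cuda_home, "bin")
    lib_dir = os.path.join(cuda_home, "lib64")
    if bin_dir not in env.get("PATH", "").split(os.pathsep):
        problems.append("PATH is missing " + bin_dir)
    if lib_dir not in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        problems.append("LD_LIBRARY_PATH is missing " + lib_dir)
    return problems


if __name__ == "__main__":
    for problem in cuda_env_problems(os.environ):
        print(problem)
    # shutil.which returns None when nvcc cannot be found on PATH
    print("nvcc found at:", shutil.which("nvcc"))
```

An empty problem list together with a non-None nvcc path suggests the shell environment is set up as intended.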

cudnn installation

tar -xzvf cudnn-11.0-linux-x64-v8.0.4.30.tgz

cd cuda/include

sudo cp *.h /usr/local/cuda-11.0/include

cd ../lib64

sudo cp libcudnn* /usr/local/cuda-11.0/lib64
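After copying the headers, the installed cudnn version can be read back from the version header. The sketch below assumes that this cudnn release defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL in cudnn_version.h and that the header was copied to /usr/local/cuda-11.0/include as in the steps above; parse_cudnn_version is a hypothetical helper, not part of the original project.

```python
import re


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn_version.h."""
    values = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+" + name + r"\s+(\d+)", header_text)
        if match is None:
            raise ValueError(name + " not found in header")
        values.append(int(match.group(1)))
    return tuple(values)


def installed_cudnn_version(header_path="/usr/local/cuda-11.0/include/cudnn_version.h"):
    """Read the header from disk and return the version tuple."""
    with open(header_path) as handle:
        return parse_cudnn_version(handle.read())
```

For the archive named above (v8.0.4.30), the first three components would be expected to come out as 8, 0 and 4.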

tensorrt and OpenCV installation

Go to your home directory

tar -xzvf TensorRT-7.2.1.6.Ubuntu-18.04.x86_64-gnu.cuda-11.0.cudnn8.0.tar.gz

cd TensorRT-7.2.1.6/python (this directory contains tensorrt wheels for 4 Python versions)

sudo pip3 install tensorrt-7.2.1.6-cp37-none-linux_x86_64.whl (install the wheel that matches your Python version)
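The cp37 part of the wheel name is the CPython version tag, so the file to install can be derived from the running interpreter. A minimal sketch, assuming the other wheels in the directory follow the same naming pattern as the cp37 one shown above (tensorrt_wheel_name is a hypothetical helper):

```python
import sys


def tensorrt_wheel_name(trt_version="7.2.1.6", version_info=None):
    """Build the expected wheel file name for a given Python version.

    version_info defaults to the interpreter running this script.
    """
    if version_info is None:
        version_info = sys.version_info
    tag = "cp{}{}".format(version_info[0], version_info[1])
    return "tensorrt-{}-{}-none-linux_x86_64.whl".format(trt_version, tag)


if __name__ == "__main__":
    print(tensorrt_wheel_name())
```

If the printed file does not exist in TensorRT-7.2.1.6/python, the archive does not ship a wheel for that Python version.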

pip install pycuda (installs PyCUDA, the Python bindings for CUDA)

Go to your home directory

unzip opencv-3.4.2.zip (used later by the inference code)

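efficientnetv2 model training and ONNX export

Go to your home directory

git clone https://github.com/Wulingtian/EfficientNetv2_TensorRT_int8.git

cd EfficientNetv2_TensorRT_int8

vim train.py and set the IMAGENET_TRAINSET_SIZE parameter to the number of training images

Set data (dataset path), epochs, lr, batch-size and the other parameters to match your training data and configuration

python train.py starts training; the model is saved in the current directory as model_best.pth.tar

vim export_onnx.py

Set weights_file (the trained model), output_file (the name of the output model), img_size (input image size) and batch_size (the inference batch size)

python export_onnx.py produces the ONNX model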
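The contents of export_onnx.py are not reproduced here. As a rough illustration of how img_size and batch_size typically enter an export, the sketch below traces a model with torch.onnx.export. It is an assumption-laden example, not the project's script: export_to_onnx and dummy_input_shape are hypothetical names, the tensor names "input" and "output" and the opset version are placeholders, and building the model and loading weights_file are left to the caller.

```python
def dummy_input_shape(batch_size, img_size):
    """NCHW shape of the tracing input: 3-channel square images."""
    return (batch_size, 3, img_size, img_size)


def export_to_onnx(model, output_file, img_size=224, batch_size=1):
    """Trace `model` with a random input and write it to `output_file`.

    `model` is an already constructed torch.nn.Module with weights loaded.
    """
    import torch  # imported lazily so the shape helper works without PyTorch

    model.eval()  # freeze dropout / batch-norm behaviour before tracing
    dummy = torch.randn(*dummy_input_shape(batch_size, img_size))
    torch.onnx.export(
        model,
        dummy,
        output_file,
        input_names=["input"],    # hypothetical tensor name
        output_names=["output"],  # hypothetical tensor name
        opset_version=11,         # assumed; choose one your TensorRT parser supports
    )
```

Because the dummy input has a fixed shape, the exported model expects exactly that batch size and image size at inference time.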

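Converting the ONNX model to an int8 tensorrt engine

cd EfficientNetv2_TensorRT_int8/effnetv2_tensorrt_int8_tools

vim convert_trt_quant.py and modify the following parameters

BATCH_SIZE: number of images fed in per quantization step

BATCH: number of quantization steps

height, width: height and width of the input image

CALIB_IMG_DIR: path to the quantization images (put training images in one folder and set this parameter to that folder; note that BATCH_SIZE*BATCH must be less than or equal to the number of training images)

onnx_model_path: path to the ONNX model (the one produced by export_onnx.py above)

python convert_trt_quant.py saves the quantized model in the models_save directory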
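The constraint on BATCH_SIZE*BATCH can be checked before starting a long quantization run. The sketch below lists the images in CALIB_IMG_DIR and splits them into BATCH batches of BATCH_SIZE images each. It only illustrates the bookkeeping and is not the code in convert_trt_quant.py; the function names and the list of accepted file extensions are assumptions.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")  # assumed image types


def list_calibration_images(calib_img_dir):
    """Return the sorted paths of all image files directly inside calib_img_dir."""
    return sorted(
        os.path.join(calib_img_dir, name)
        for name in os.listdir(calib_img_dir)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )


def split_into_batches(image_paths, batch_size, num_batches):
    """Split image_paths into num_batches lists of batch_size paths each."""
    needed = batch_size * num_batches
    if needed > len(image_paths):
        raise ValueError(
            "BATCH_SIZE*BATCH = {} but only {} images are available".format(
                needed, len(image_paths)
            )
        )
    return [
        image_paths[i * batch_size:(i + 1) * batch_size]
        for i in range(num_batches)
    ]
```

For example, with 10 images, BATCH_SIZE=4 and BATCH=2 use the first 8 images, while BATCH=3 would need 12 and raises an error.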

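TensorRT model inference

cd EfficientNetv2_TensorRT_int8/effnetv2_tensorrt_int8

vim CMakeLists.txt

Set the USER_DIR parameter to your own home directory

vim effnetv2_infer.cc and modify the following parameters

output_name: the name of the model output (the effnetv2 model has 1 output)

The output name can be looked up with netron

pip install netron installs netron

vim netron_effnetv2.py and paste the following

    import netron

    netron.start('path to the simplified ONNX model goes here', port=3344)

python netron_effnetv2.py then shows the model's output name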
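As an alternative to opening the graph in a browser, the output name can also be printed from a script. This sketch assumes the onnx Python package is installed (pip install onnx), which the original steps do not require; the function names are hypothetical.

```python
def graph_output_names(graph):
    """Return the names of all outputs of an ONNX graph object."""
    return [output.name for output in graph.output]


def print_onnx_output_names(onnx_model_path):
    """Load an ONNX file and print the name of each graph output."""
    import onnx  # imported lazily; requires the onnx package

    model = onnx.load(onnx_model_path)
    for name in graph_output_names(model.graph):
        print(name)
```

Since the model has a single output, one name should be printed, and that is the value to use for output_name.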

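trt_model_path: the quantized tensorrt inference engine (the file with the trt extension in the models_save directory)

test_img: path to the test image

INPUT_W, INPUT_H: width and height of the input image

NUM_CLASS: number of classes the model was trained on

Parameter configuration is now complete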
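NUM_CLASS determines how many scores the single model output holds per image. How effnetv2_infer.cc post-processes them is not shown here; the sketch below illustrates one common approach, assuming the output is a vector of NUM_CLASS raw logits. The function names and the class order in the usage note are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities in a numerically stable way."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_class(logits, class_names=None):
    """Return (index, label, probability) of the highest-scoring class."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    label = class_names[index] if class_names else str(index)
    return index, label, probs[index]
```

For a two-class model, top_class(logits, ["cat", "dog"]) would return the predicted label, provided the list matches the class order used during training.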

mkdir build

cd build

cmake ..

make

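./Effnetv2sEngine prints the average inference time; measured on a 1070 GPU, the average inference time is 3.8 ms per frame. Deployment is now complete!

Here is my training set (a cat/dog binary classification dataset) and quantization data, shared at the link below:

Link: https://pan.baidu.com/s/1Mh6GxTLoXRTCRQh-TPUc3Q Password: 3dt3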
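An average inference time is usually obtained by timing many runs after a few warm-up iterations. How the C++ program measures it is not shown here, so the following is only a sketch of the general technique; run_once stands for any callable that performs one complete inference (including GPU synchronization), and both function names are hypothetical.

```python
import time


def average_latency_ms(run_once, warmup=10, iterations=100):
    """Return the mean wall-clock time of run_once() in milliseconds."""
    for _ in range(warmup):
        run_once()  # warm-up runs are excluded from the measurement
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iterations


def frames_per_second(latency_ms):
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms
```

With the 3.8 ms figure quoted above, frames_per_second(3.8) works out to roughly 263 frames per second.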
