Wulingtian / nanodet_tensorrt_int8

Licence: other
nanodet int8 quantization; measured inference time is 2 ms per frame!

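Environment setup

ubuntu: 18.04

cuda: 11.0

cudnn: 8.0

tensorrt: 7.2.1.6

OpenCV: 3.4.2

The cuda, cudnn, tensorrt, and OpenCV installation packages (prebuilt; you can also download them from the official sites and build them yourself) are available at: https://pan.baidu.com/s/1dpMRyzLivnBAca2c_DIgGw password: 0rct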

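CUDA installation

If an NVIDIA driver is already installed on the system, uninstall it with the following command

sudo apt-get purge nvidia*

Disable nouveau by running

sudo vim /etc/modprobe.d/blacklist.conf

Append the following line at the end of the file

blacklist nouveau

Then run

sudo update-initramfs -u

chmod +x cuda_11.0.2_450.51.05_linux.run

sudo ./cuda_11.0.2_450.51.05_linux.run

When asked whether to accept the license agreement, enter: accept

Then select Install

Finally, press Enter

vim ~/.bashrc and add the following lines:

export PATH=/usr/local/cuda-11.0/bin:$PATH

export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64:$LD_LIBRARY_PATH

source ~/.bashrc to apply the environment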
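
After reloading the shell configuration, it can help to confirm that both exports took effect. The following is a minimal Python sketch and not part of the original project: the helper name check_cuda_env is hypothetical, and the paths assume the /usr/local/cuda-11.0 prefix used in the exports above.

```python
import os
import shutil

CUDA_HOME = "/usr/local/cuda-11.0"  # assumption: install prefix used in the exports above


def check_cuda_env(env=None, cuda_home=CUDA_HOME):
    """Report whether PATH and LD_LIBRARY_PATH contain the CUDA directories."""
    env = os.environ if env is None else env
    path_dirs = env.get("PATH", "").split(os.pathsep)
    lib_dirs = env.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    return {
        "bin_on_path": os.path.join(cuda_home, "bin") in path_dirs,
        "lib64_on_ld_library_path": os.path.join(cuda_home, "lib64") in lib_dirs,
    }


if __name__ == "__main__":
    print(check_cuda_env())
    # shutil.which returns None when nvcc cannot be found on PATH
    print("nvcc:", shutil.which("nvcc"))
```

If either value is False, the export lines were most likely not picked up by the current shell.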

cuDNN installation

tar -xzvf cudnn-11.0-linux-x64-v8.0.4.30.tgz

cd cuda/include

sudo cp *.h /usr/local/cuda-11.0/include

cd ../lib64

sudo cp libcudnn* /usr/local/cuda-11.0/lib64
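
To confirm that the copied headers match the expected cuDNN release, the version macros can be read back from the header. This is a minimal sketch under two assumptions: the macros CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL live in cudnn_version.h (as they do in cuDNN 8 releases), and the headers were copied to the include directory used above. The function name parse_cudnn_version is hypothetical.

```python
import re
from pathlib import Path

# assumption: headers were copied to the CUDA 11.0 include directory as above
CUDNN_VERSION_HEADER = Path("/usr/local/cuda-11.0/include/cudnn_version.h")


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn_version.h."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^\s*#define\s+{name}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            raise ValueError(f"{name} not found in header")
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    if CUDNN_VERSION_HEADER.exists():
        print(parse_cudnn_version(CUDNN_VERSION_HEADER.read_text()))
    else:
        print(f"{CUDNN_VERSION_HEADER} not found")
```

For the archive used here, the first two components should correspond to cudnn 8.0.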

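TensorRT and OpenCV installation

Go to your home directory

tar -xzvf TensorRT-7.2.1.6.Ubuntu-18.04.x86_64-gnu.cuda-11.0.cudnn8.0.tar.gz

cd TensorRT-7.2.1.6/python (this directory contains TensorRT wheels for 4 Python versions)

sudo pip3 install tensorrt-7.2.1.6-cp37-none-linux_x86_64.whl (pick the wheel that matches your Python version)

pip install pycuda to install PyCUDA, the Python bindings for CUDA

Go to your home directory

unzip opencv-3.4.2.zip so that OpenCV is available for inference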

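Converting the nanodet model to ONNX

pip install onnx

pip install onnx-simplifier

git clone https://github.com/Wulingtian/nanodet.git

cd nanodet

cd config and edit the model config file (note: change the activation function to ReLU! TensorRT supports quantizing ReLU), then train the model

Go to the nanodet directory, enter the tools directory, open export.py, and set the three parameters cfg_path, model_path, and out_path

From the nanodet directory, run python tools/export.py to get the converted ONNX model

python3 -m onnxsim <onnx-model-name> nanodet-simple.onnx to get the final simplified ONNX model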

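Converting the ONNX model to an int8 TensorRT engine

git clone https://github.com/Wulingtian/nanodet_tensorrt_int8_tools.git (stars appreciated)

cd nanodet_tensorrt_int8_tools

vim convert_trt_quant.py and modify the following parameters

BATCH_SIZE: number of images fed to the model per calibration batch

BATCH: number of calibration batches

height width: height and width of the input image

CALIB_IMG_DIR: path to the training images used for calibration

onnx_model_path: path to the ONNX model

python convert_trt_quant.py saves the quantized model to the models_save directory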
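
BATCH_SIZE and BATCH together determine how many calibration images are consumed: BATCH_SIZE * BATCH. How convert_trt_quant.py actually iterates over CALIB_IMG_DIR is not shown here, so the following is only a sketch of a typical batching scheme. The helper names list_images and plan_batches, the extension set, and the numeric values are all hypothetical.

```python
from pathlib import Path

BATCH_SIZE = 4   # images per calibration batch (hypothetical value)
BATCH = 100      # number of calibration batches (hypothetical value)
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumption: common image extensions


def list_images(calib_img_dir):
    """Return the image files in the calibration directory, sorted by name."""
    return sorted(
        p for p in Path(calib_img_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )


def plan_batches(image_paths, batch_size, num_batches):
    """Split image paths into at most num_batches full batches of batch_size."""
    needed = batch_size * num_batches
    usable = image_paths[:needed]
    batches = [usable[i:i + batch_size] for i in range(0, len(usable), batch_size)]
    # Drop a trailing partial batch so every batch has the same shape.
    return [b for b in batches if len(b) == batch_size]
```

With these values, plan_batches(list_images(CALIB_IMG_DIR), BATCH_SIZE, BATCH) would use up to 400 images; if the directory holds fewer, fewer batches are produced, so it is worth checking that CALIB_IMG_DIR contains at least BATCH_SIZE * BATCH images.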

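TensorRT model inference

git clone https://github.com/Wulingtian/nanodet_tensorrt_int8.git (stars appreciated)

cd nanodet_tensorrt_int8

vim CMakeLists.txt

Set the USER_DIR parameter to your own home directory

vim nanodet_infer.cc and modify the following parameters

output_name: name of the model output (the model has a single output)

The output name can be looked up with netron

pip install netron to install netron

vim netron_nanodet.py and paste in the following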

    import netron

    netron.start('path to the simplified ONNX model', port=3344)

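python netron_nanodet.py to view the model's output name

trt_model_path: the quantized TensorRT inference engine (the file with the trt suffix in the models_save directory)

test_img: path to the test image

INPUT_W INPUT_H: width and height of the input image

NUM_CLASS: number of classes the model was trained on

NMS_THRESH: NMS threshold

CONF_THRESH: confidence threshold

Parameter configuration is now complete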
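
To illustrate what CONF_THRESH and NMS_THRESH control, here is a minimal Python sketch of confidence filtering followed by greedy non-maximum suppression. It is not the code in nanodet_infer.cc: the function names, the (x1, y1, x2, y2) box layout, and the single-class handling are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(detections, conf_thresh, nms_thresh):
    """detections: list of (box, score). Returns kept detections, best first."""
    # Step 1: discard detections whose score is below the confidence threshold.
    candidates = sorted(
        (d for d in detections if d[1] >= conf_thresh),
        key=lambda d: d[1],
        reverse=True,
    )
    # Step 2: greedily keep a box only if it does not overlap a kept box too much.
    kept = []
    for box, score in candidates:
        if all(iou(box, k[0]) <= nms_thresh for k in kept):
            kept.append((box, score))
    return kept


if __name__ == "__main__":
    dets = [((0, 0, 10, 10), 0.9), ((1, 1, 11, 11), 0.8),
            ((20, 20, 30, 30), 0.7), ((50, 50, 60, 60), 0.2)]
    print(filter_and_nms(dets, conf_thresh=0.4, nms_thresh=0.5))
```

Raising CONF_THRESH removes more low-score boxes before suppression; lowering NMS_THRESH suppresses overlapping boxes more aggressively.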

mkdir build

cd build

cmake ..

make

./NanoDetEngine prints the average inference time and saves the predicted image to the current directory. Deployment is now complete!
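
How NanoDetEngine computes its average inference time is not shown here. As a point of comparison, a common way to measure an average per-call latency is sketched below in Python; the function name average_latency_ms and the warmup and iteration counts are hypothetical.

```python
import time


def average_latency_ms(run_once, warmup=10, iterations=100):
    """Average wall-clock time of run_once() in milliseconds."""
    # Warm-up calls are excluded so one-time setup cost does not skew the mean.
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iterations


if __name__ == "__main__":
    # Placeholder workload standing in for one inference call.
    print(f"{average_latency_ms(lambda: sum(range(1000))):.3f} ms")
```

When the timed call launches GPU work asynchronously, the measurement is only meaningful if the call waits for the GPU to finish before returning.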