
SpursLipu / Yolov3v4 Modelcompression Multidatasettraining Multibackbone

YOLO ModelCompression MultidatasetTraining


Projects that are alternatives of or similar to Yolov3v4 Modelcompression Multidatasettraining Multibackbone

Yolov3 Tf2
YoloV3 Implemented in Tensorflow 2.0
Stars: ✭ 2,327 (+710.8%)
Mutual labels:  object-detection, yolo
Yolodet Pytorch
reproduce the YOLO series of papers in pytorch, including YOLOv4, PP-YOLO, YOLOv5,YOLOv3, etc.
Stars: ✭ 206 (-28.22%)
Mutual labels:  object-detection, yolo
Viseron
Self-hosted NVR with object detection
Stars: ✭ 192 (-33.1%)
Mutual labels:  object-detection, yolo
Yoloncs
YOLO object detector for Movidius Neural Compute Stick (NCS)
Stars: ✭ 176 (-38.68%)
Mutual labels:  object-detection, yolo
Pytorch Yolo V3
A PyTorch implementation of the YOLO v3 object detection algorithm
Stars: ✭ 3,148 (+996.86%)
Mutual labels:  object-detection, yolo
Yolo v3 tutorial from scratch
Accompanying code for Paperspace tutorial series "How to Implement YOLO v3 Object Detector from Scratch"
Stars: ✭ 2,192 (+663.76%)
Mutual labels:  object-detection, yolo
Pine
🌲 Aimbot powered by real-time object detection with neural networks, GPU accelerated with Nvidia. Optimized for use with CS:GO.
Stars: ✭ 202 (-29.62%)
Mutual labels:  object-detection, yolo
Bmw Labeltool Lite
This repository provides you with a easy to use labeling tool for State-of-the-art Deep Learning training purposes.
Stars: ✭ 145 (-49.48%)
Mutual labels:  object-detection, yolo
Mxnet Yolo
YOLO: You only look once real-time object detector
Stars: ✭ 240 (-16.38%)
Mutual labels:  object-detection, yolo
Caffe2 Ios
Caffe2 on iOS Real-time Demo. Test with Your Own Model and Photos.
Stars: ✭ 221 (-23%)
Mutual labels:  object-detection, yolo
Deepstream Yolo
NVIDIA DeepStream SDK 5.1 configuration for YOLO models
Stars: ✭ 166 (-42.16%)
Mutual labels:  object-detection, yolo
Object Detection Opencv
YOLO Object detection with OpenCV and Python.
Stars: ✭ 267 (-6.97%)
Mutual labels:  object-detection, yolo
Map
mean Average Precision - This code evaluates the performance of your neural net for object recognition.
Stars: ✭ 2,324 (+709.76%)
Mutual labels:  object-detection, yolo
Object Detection Api
Yolov3 Object Detection implemented as APIs, using TensorFlow and Flask
Stars: ✭ 177 (-38.33%)
Mutual labels:  object-detection, yolo
Simrdwn
Rapid satellite imagery object detection
Stars: ✭ 159 (-44.6%)
Mutual labels:  object-detection, yolo
Yolo Tf
TensorFlow implementation of the YOLO (You Only Look Once)
Stars: ✭ 200 (-30.31%)
Mutual labels:  object-detection, yolo
Yolo label
GUI for marking bounded boxes of objects in images for training neural network Yolo v3 and v2 https://github.com/AlexeyAB/darknet, https://github.com/pjreddie/darknet
Stars: ✭ 128 (-55.4%)
Mutual labels:  object-detection, yolo
Tf2 Yolov4
A TensorFlow 2.0 implementation of YOLOv4: Optimal Speed and Accuracy of Object Detection
Stars: ✭ 133 (-53.66%)
Mutual labels:  object-detection, yolo
Nncf
PyTorch*-based Neural Network Compression Framework for enhanced OpenVINO™ inference
Stars: ✭ 218 (-24.04%)
Mutual labels:  object-detection, pruning
sparsezoo
Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes
Stars: ✭ 264 (-8.01%)
Mutual labels:  yolo, pruning

YOLOv3-ModelCompression-MultidatasetTraining

This project consists of three main parts:

1. Training methods for multiple mainstream object detection datasets (coco2017, coco2014, BDD100k, Visdrone, Hand).

2. Mainstream model compression algorithms, including pruning, quantization, and knowledge distillation.

3. Multiple backbones for YOLOv3, including Darknet-YOLOv3, Tiny-YOLOv3, and Mobilenetv3-YOLOv3.

The YOLOv3 source code is based on the PyTorch implementation ultralytics/yolov3. The BN-layer-based pruning method comes from coldlarry/YOLOv3-complete-pruning. Thanks to both.

If you cannot download the weights files and datasets from Baidu, please e-mail me ([email protected]) and I will reply as soon as I can.

Update

January 4, 2020. Provides download links and training methods to the Visdrone dataset.

January 19, 2020. Dior, Bdd100k and Visdrone training will be provided, as well as the converted weights file.

March 1, 2020. Provides Mobilenetv3 backbone.

April 7, 2020. Implemented two models based on Mobilenetv3, YOLOv3-Mobilenet and YOLOv3tiny-Mobilenet-small, provided pre-trained weights, and extended the normal pruning method to both Mobilenet-based models.

April 27, 2020. Updated the Mobilenetv3 pre-trained weights and added a layer pruning method taken from tanluren/yolov3-channel-and-layer-pruning/yolov3. Thanks for sharing.

May 22, 2020. Merged some new optimizations from ultralytics/yolov3 and updated the cfg file and weights of YOLOv4.

May 22, 2020. The 8-bit quantization method was updated and some bugs were fixed.

July 12, 2020. The problem of mAP returning to 0 after pruning in yolov3-mobilenet was fixed. See issue#41 for more details.

September 30, 2020. The BN_Fold training method was updated to reduce the precision loss caused by BN fusion, and the POW (2) quantization method targeted at FPGA was updated. See the quantization section for details.

Requirements

This project is based on ultralytics/yolov3; see ultralytics/yolov3 for details. In brief, the requirements are:

  • numpy
  • torch >= 1.1.0
  • opencv-python
  • tqdm

Current support

Function
Multi-Backbone training
Multi-Datasets
Pruning
Quantization
Knowledge Distillation

Training

Train a model (the -pt flag is required when using a COCO pre-trained model):

python3 train.py --data ... --cfg ...

Test a model:

python3 test.py --data ... --cfg ...

Run detection (the default source path is data/samples, results are saved in /output, and the source can be images or videos):

python3 detect.py --data ... --cfg ... --source ...

Multi-Datasets

This project provides datasets preprocessed for YOLOv3, configuration files (.cfg), dataset index files (.data), dataset category files (.names), and anchor box sizes (9 boxes for YOLOv3 and 6 boxes for Tiny-YOLOv3) that are reclustered using the K-means algorithm.
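As an illustration of how anchor boxes can be reclustered, here is a minimal sketch of K-means over (width, height) pairs. It assumes the common YOLO convention of using 1 - IoU as the distance; the README only states that K-means is used, so the distance metric, the function names, and the sample boxes are assumptions for illustration, not the project's actual script.

```python
import random

def iou_wh(box, anchor):
    """IoU of two boxes given as (width, height), both anchored at the same corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) pairs into k anchors, assigning each box to the anchor with highest IoU."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        new_anchors = []
        for i, members in enumerate(clusters):
            if not members:  # an empty cluster keeps its previous anchor
                new_anchors.append(anchors[i])
                continue
            w = sum(b[0] for b in members) / len(members)
            h = sum(b[1] for b in members) / len(members)
            new_anchors.append((w, h))
        if new_anchors == anchors:  # assignments no longer change
            break
        anchors = new_anchors
    # Sort by area, small to large, as anchors are usually listed in a cfg file.
    return sorted(anchors, key=lambda a: a[0] * a[1])

# Hypothetical box sizes in pixels, for illustration only.
sample_boxes = [(10, 10), (12, 12), (11, 11), (50, 60), (52, 58),
                (51, 59), (200, 180), (210, 190), (205, 185)]
sample_anchors = kmeans_anchors(sample_boxes, k=3)
```

For YOLOv3 one would call this with k=9, and with k=6 for Tiny-YOLOv3. K-means can settle in a local optimum, so the result depends on the random initialization.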

mAP

Dataset YOLOv3-640 YOLOv4-640 YOLOv3-mobilenet-640
Dior 0.749
bdd100k 0.543
visdrone 0.311 0.383 0.348

Download the datasets and unzip them into /data.

Training command (coco2017)

python3 train.py --data data/coco2017.data --batch-size ... --weights weights/yolov3-608.weights -pt --cfg cfg/yolov3/yolov3.cfg --img-size ... --epochs ...

Training command (Dior)

python3 train.py --data data/dior.data --batch-size ... --weights weights/yolov3-608.weights -pt --cfg cfg/yolov3/yolov3-onDIOR.cfg --img-size ... --epochs ...

Training command (bdd100k)

python3 train.py --data data/bdd100k.data --batch-size ... --weights weights/yolov3-608.weights -pt --cfg cfg/yolov3/yolov3-bdd100k.cfg --img-size ... --epochs ...

Extract code:fb6y

Training command (visdrone)

python train.py --data data/visdrone.data --batch-size ... --weights weights/yolov3-608.weights -pt --cfg cfg/yolov3/yolov3-visdrone.cfg  --img-size ... --epochs ...

Training command (oxfordhand)

python train.py --data data/oxfordhand.data --batch-size ... --weights weights/yolov3-608.weights -pt --cfg cfg/yolov3/yolov3-hand.cfg  --img-size ... --epochs ...

1. Dior

The DIOR dataset is one of the largest and most diverse publicly available object detection datasets in the Earth observation community. Ships and vehicles have especially high instance counts, and the dataset achieves a good balance between small and large instances. The images were collected from Google Earth.

Introduction

Test results


2. bdd100k

BDD100K is a large, diverse dataset of 100,000 driving videos. Each video is about 40 seconds long, and the researchers annotated bounding boxes for objects that commonly appear on the road in all 100,000 key frames. The dataset covers different weather conditions, including sunny, cloudy, and rainy days, as well as different times of day and night.

Website

Download

Paper

Test results


3. Visdrone

The VisDrone2019 dataset was collected by the AISKYEYE team at the Machine Learning and Data Mining Laboratory at Tianjin University, China. The benchmark dataset contains 288 video clips consisting of 261,908 frames, plus 10,209 static images, captured by various cameras mounted on unmanned aerial vehicles (UAVs). It covers a wide range of aspects, including location (14 different cities in China, thousands of kilometers apart), environment (urban and rural), objects (pedestrians, vehicles, bicycles, etc.), and density (sparse and crowded scenes). The dataset was collected using a variety of UAV platforms (i.e., different UAV models) in a variety of scenarios and under various weather and lighting conditions. The frames are manually annotated with more than 2.6 million bounding boxes of frequently targeted objects, such as pedestrians, cars, bicycles, and tricycles. Some important attributes, including scene visibility, object category, and occlusion, are also provided to improve data utilization.

Website

Test results of YOLOv3


Test results of YOLOv4


Multi-Backbone

Based on mobilenetv3, two network structures are designed.

Structure backbone Postprocessing Parameters GFLOPS mAP0.5 mAP0.5:0.95 speed(inference/NMS/total) FPS
YOLOv3 38.74M 20.39M 59.13M 117.3 0.580 0.340 12.3/1.7/14.0 ms 71.4fps
YOLOv3tiny 6.00M 2.45M 8.45M 9.9 0.347 0.168 3.5/1.8/5.3 ms 188.7fps
YOLOv3-mobilenetv3 2.84M 20.25M 23.09M 32.2 0.547 0.346 7.9/1.8/9.7 ms 103.1fps
YOLOv3tiny-mobilenetv3-small 0.92M 2.00M 2.92M 2.9 0.379 0.214 5.2/1.9/7.1 ms 140.8fps
YOLOv4 - - 61.35M 107.1 0.650 0.438 13.5/1.8/15.3 ms 65.4fps
YOLOv4-tiny - - 5.78M 12.3 0.435 0.225 4.1/1.7/5.8 ms 172.4fps
  1. YOLOv3, YOLOv3tiny, and YOLOv4 were trained and tested on coco2014; YOLOv3-Mobilenetv3 and YOLOv3tiny-Mobilenetv3-small were trained and tested on coco2017.

  2. Inference speed was tested on GTX2080ti*4 with an image size of 608.

  3. The training set must match the test set, because a mismatch produces incorrect mAP values. See the issue for details.
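The columns of the table above are related by simple arithmetic, which the following sketch checks for the four YOLOv3 variants: the Parameters column is the sum of the backbone and postprocessing columns, and the FPS column matches 1000 divided by the total time in milliseconds. Reading FPS as the reciprocal of per-image latency is an inference from the numbers, not something the table states.

```python
# (backbone M, postprocessing M, total parameters M, total time ms, FPS), copied from the table
rows = {
    "YOLOv3": (38.74, 20.39, 59.13, 14.0, 71.4),
    "YOLOv3tiny": (6.00, 2.45, 8.45, 5.3, 188.7),
    "YOLOv3-mobilenetv3": (2.84, 20.25, 23.09, 9.7, 103.1),
    "YOLOv3tiny-mobilenetv3-small": (0.92, 2.00, 2.92, 7.1, 140.8),
}

def fps_from_ms(total_ms):
    """Frames per second implied by a per-image latency in milliseconds."""
    return 1000.0 / total_ms

def check_row(backbone, post, total, total_ms, fps):
    """Return the gaps between the reported and the recomputed values."""
    param_gap = abs(backbone + post - total)
    fps_gap = abs(fps_from_ms(total_ms) - fps)
    return param_gap, fps_gap

gaps = {name: check_row(*values) for name, values in rows.items()}
for name, (param_gap, fps_gap) in gaps.items():
    print(f"{name}: parameter gap {param_gap:.3f}M, FPS gap {fps_gap:.3f}")
```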

Train command

1.YOLOv3

python3 train.py --data data/... --batch-size ... -pt --weights weights/yolov3-608.weights --cfg cfg/yolov3/yolov3.cfg --img_size ...

Weights Download

2.YOLOv3tiny

python3 train.py --data data/... --batch-size ... -pt --weights weights/yolov3tiny.weights --cfg cfg/yolov3tiny/yolov3-tiny.cfg --img_size ...

3.YOLOv3tiny-mobilenet-small

python3 train.py --data data/... --batch-size ... -pt --weights weights/yolov3tiny-mobilenet-small.weights --cfg cfg/yolov3tiny-mobilenet-small/yolov3tiny-mobilenet-small-coco.cfg --img_size ...

4.YOLOv3-mobilenet

python3 train.py --data data/... --batch-size ... -pt --weights weights/yolov3-mobilenet.weights --cfg cfg/yolov3-mobilenet/yolov3-mobilenet-coco.cfg --img_size ...

5.YOLOv4

python3 train.py --data data/... --batch-size ... -pt --weights weights/yolov4.weights --cfg cfg/yolov4/yolov4.cfg --img_size ...

Model Compression

1. Pruning

Features

method | advantage | disadvantage
Normal pruning | Does not prune shortcut layers. Offers a considerable and stable compression rate and requires no fine-tuning. | The compression rate is limited.
Shortcut pruning | Very high compression rate. | Fine-tuning is necessary.
Slimming | Uses a shortcut fusion method to improve pruning precision. | Best way for shortcut pruning
Regular pruning | Designed for hardware deployment: the number of filters after pruning is a multiple of 2, no fine-tuning is needed, and tiny-yolov3 and Mobilenet are supported. | Part of the compression ratio is sacrificed for regularization.
layer pruning | Uses ResBlock as the basic pruning unit, which is conducive to hardware deployment. | It can only prune the backbone.
layer-channel pruning | Applies channel pruning first and then layer pruning, giving a very high pruning rate. | Accuracy may be affected.
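To make the "multiple of 2" constraint of regular pruning concrete, here is a minimal sketch of one way to pick the channels to keep in a single layer. It assumes channels are ranked by the absolute value of their BN scale factor and that the kept count is rounded up; the function name, the rounding direction, and the sample values are assumptions, and the project's regular_prune.py may differ.

```python
import math

def regular_keep_mask(gammas, threshold, multiple=2):
    """Keep channels whose |gamma| exceeds threshold, padded up to a multiple."""
    # Rank channel indices from most to least important.
    order = sorted(range(len(gammas)), key=lambda i: abs(gammas[i]), reverse=True)
    n_keep = sum(1 for g in gammas if abs(g) > threshold)
    # Round the kept count up to the next multiple, keeping at least `multiple` channels.
    n_keep = max(multiple, math.ceil(n_keep / multiple) * multiple)
    n_keep = min(n_keep, len(gammas))
    keep = set(order[:n_keep])
    return [i in keep for i in range(len(gammas))]

# Hypothetical BN scale factors for one layer.
example_gammas = [0.9, 0.05, 0.4, 0.01, 0.3, 0.02]
example_mask = regular_keep_mask(example_gammas, threshold=0.1)
```

Three channels exceed the threshold here, so a fourth (the next largest) is kept as well. Rounding up keeps extra channels, which is consistent with the table's remark that part of the compression ratio is sacrificed.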

Step

1.Training

python3 train.py --data ... -pt --batch-size ... --weights ... --cfg ...

2.Sparse training

--s specifies the sparsity factor, and --prune specifies the sparsity type.

--prune 0 is the sparsity type for normal pruning and regular pruning.

--prune 1 is the sparsity type for shortcut pruning.

--prune 2 is the sparsity type for layer pruning.

command:

python3 train.py --data ... -pt --batch-size 32  --weights ... --cfg ... --s 0.001 --prune 0 
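For readers unfamiliar with sparse training, the sketch below shows the usual idea behind BN-based pruning: an L1 penalty on the BN scale factors (gamma) pushes unimportant channels toward zero. It assumes that the sparsity factor passed via --s plays the role of the L1 weight `s`; the helper names are hypothetical and the project's train.py may apply the penalty differently depending on --prune.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def add_sparsity_grad(bn_gammas, bn_grads, s):
    """Add the subgradient of s * |gamma| to each BN scale-factor gradient."""
    return [g + s * sign(w) for w, g in zip(bn_gammas, bn_grads)]

def sgd_step(params, grads, lr):
    """Plain gradient descent update."""
    return [p - lr * g for p, g in zip(params, grads)]

# Hypothetical values: three BN scale factors with zero task gradient,
# so only the sparsity term moves them.
gammas = [0.5, -0.3, 0.0]
task_grads = [0.0, 0.0, 0.0]
grads = add_sparsity_grad(gammas, task_grads, s=0.001)
updated = sgd_step(gammas, grads, lr=0.1)
```

Each nonzero gamma shrinks toward zero by lr * s per step, while a gamma already at zero stays there. A larger s produces sparser scale factors but usually costs more accuracy during sparse training.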

3.Pruning

  • normal pruning
python3 normal_prune.py --cfg ... --data ... --weights ... --percent ...
  • regular pruning
python3 regular_prune.py --cfg ... --data ... --weights ... --percent ...
  • shortcut pruning
python3 shortcut_prune.py --cfg ... --data ... --weights ... --percent ...
  • slimming
python3 slim_prune.py --cfg ... --data ... --weights ... --percent ...
  • layer pruning
python3 layer_prune.py --cfg ... --data ... --weights ... --shortcut ...
  • layer-channel pruning
python3 layer_channel_prune.py --cfg ... --data ... --weights ... --shortcut ... --percent ...

Note that the cfg and weights variables in OPT must point to the cfg and weights files generated by step 2.

In addition, you can get more compression by increasing the percent value in the code. (If the sparsity is not enough and the percent value is too high, the program will report an error.)
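The interaction between percent and sparsity can be sketched as follows. The sketch assumes percent is the fraction of all BN channels to prune, ranked globally by |gamma|, and that an error is raised when the resulting threshold would remove every channel of some layer. This is a plausible reading of the note above, not a copy of the project's code, and all names are hypothetical.

```python
def global_threshold(layer_gammas, percent):
    """Return the |gamma| value below which `percent` of all channels fall."""
    all_g = sorted(abs(g) for layer in layer_gammas for g in layer)
    index = min(int(len(all_g) * percent), len(all_g) - 1)
    return all_g[index]

def max_safe_threshold(layer_gammas):
    """Largest threshold that still leaves at least one channel in every layer."""
    return min(max(abs(g) for g in layer) for layer in layer_gammas)

def check_percent(layer_gammas, percent):
    """Return the pruning threshold, or raise if a whole layer would be pruned."""
    threshold = global_threshold(layer_gammas, percent)
    limit = max_safe_threshold(layer_gammas)
    if threshold > limit:
        raise ValueError(
            f"percent={percent} gives threshold {threshold}, above the safe limit {limit}"
        )
    return threshold

# Hypothetical BN scale factors for two layers after sparse training.
example_layers = [[0.9, 0.01, 0.02], [0.2, 0.03, 0.8]]
safe_threshold = check_percent(example_layers, percent=0.5)
```

Channels with |gamma| strictly below the returned threshold would be pruned. With these values, percent=0.9 raises an error because the threshold exceeds the largest gamma of the second layer; sparser training (more gammas near zero) allows a higher percent.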

Pruning experiment

1. Normal pruning: oxfordhand, img_size = 608, tested on GTX2080Ti*4

model parameter before pruning mAP before pruning inference time before pruning percent parameter after pruning mAP after pruning inference time after pruning
yolov3(without fine tuning) 58.67M 0.806 0.1139s 0.8 10.32M 0.802 0.0844s
yolov3-mobilenet(fine tuning) 22.75M 0.812 0.0345s 0.97 2.72M 0.795 0.0211s
yolov3tiny(fine tuning) 8.27M 0.708 0.0144s 0.5 1.13M 0.641 0.0116s

2. Regular pruning: oxfordhand, img_size = 608, tested on GTX2080Ti*4

model parameter before pruning mAP before pruning inference time before pruning percent parameter after pruning mAP after pruning inference time after pruning
yolov3(without fine tuning) 58.67M 0.806 0.1139s 0.8 12.15M 0.805 0.0874s
yolov3-mobilenet(fine tuning) 22.75M 0.812 0.0345s 0.97 2.75M 0.803 0.0208s
yolov3tiny(fine tuning) 8.27M 0.708 0.0144s 0.5 1.82M 0.703 0.0122s

3. Shortcut pruning: oxfordhand, img_size = 608, tested on GTX2080Ti*4

model parameter before pruning mAP before pruning inference time before pruning percent parameter after pruning mAP after pruning inference time after pruning
yolov3 58.67M 0.806 0.8 6.35M 0.816
yolov4 60.94M 0.896 0.6 13.97M 0.855
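To read the pruning tables, it can help to convert the before and after columns into a parameter reduction and a speedup. The short sketch below does this for the yolov3 row of the normal pruning table (58.67M to 10.32M parameters, 0.1139s to 0.0844s); the helper names are illustrative only.

```python
def param_reduction(before_m, after_m):
    """Fraction of parameters removed by pruning."""
    return 1.0 - after_m / before_m

def speedup(time_before_s, time_after_s):
    """Inference-time ratio, where values above 1 mean the pruned model is faster."""
    return time_before_s / time_after_s

# yolov3 row of the normal pruning table above.
yolov3_reduction = param_reduction(58.67, 10.32)
yolov3_speedup = speedup(0.1139, 0.0844)
print(f"parameters removed: {yolov3_reduction:.1%}, speedup: {yolov3_speedup:.2f}x")
```

In this row the inference time improves by a much smaller factor than the parameter count shrinks.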

2. Quantization

--quantized 2 Dorefa quantization method

python train.py --data ... --batch-size ... --weights ... --cfg ... --img-size ... --epochs ... --quantized 2
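For reference, the weight quantizer described in the DoReFa-Net paper can be sketched in a few lines. This follows the paper's formulation (tanh normalization followed by uniform k-bit quantization) and is only an illustration; how --quantized 2 is implemented in this project, including activation handling and the straight-through estimator used in training, is not shown here.

```python
import math

def quantize_k(r, k):
    """Uniformly quantize r in [0, 1] to k bits."""
    levels = 2 ** k - 1
    return round(r * levels) / levels

def dorefa_weights(weights, k):
    """Map real-valued weights to k-bit values in [-1, 1]."""
    t = [math.tanh(w) for w in weights]
    max_t = max(abs(v) for v in t)
    if max_t == 0.0:  # all-zero weights: nothing to normalize
        return [0.0 for _ in weights]
    # Normalize tanh(w) into [0, 1], quantize, then rescale to [-1, 1].
    return [2 * quantize_k(v / (2 * max_t) + 0.5, k) - 1 for v in t]

# Hypothetical weights, quantized to 2 bits.
dorefa_example = dorefa_weights([-1.0, -0.2, 0.0, 0.4, 1.0], k=2)
```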

--quantized 1 Google quantization method

python train.py --data ... --batch-size ... --weights ... --cfg ... --img-size ... --epochs ... --quantized 1
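The sketch below assumes that "Google quantization method" refers to the affine (scale and zero-point) scheme used in Google's integer-arithmetic-only quantization work; the README does not say so explicitly. It shows the quantize and dequantize round trip that quantization-aware training simulates, with hypothetical function names.

```python
def affine_params(x_min, x_max, bits=8):
    """Compute scale and zero-point for an unsigned affine quantizer."""
    qmin, qmax = 0, 2 ** bits - 1
    # The representable range must contain 0 so that zero maps to an exact integer.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:  # degenerate range, fall back to a unit scale
        scale = 1.0
    zero_point = int(round(qmin - x_min / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point

def fake_quantize(x, scale, zero_point, bits=8):
    """Quantize x to an integer grid and map it back to a real value."""
    qmax = 2 ** bits - 1
    q = max(0, min(qmax, round(x / scale) + zero_point))
    return (q - zero_point) * scale

# Hypothetical tensor range of [-1, 1] at 8 bits.
scale_8bit, zero_point_8bit = affine_params(-1.0, 1.0, bits=8)
approx = fake_quantize(0.3, scale_8bit, zero_point_8bit)
```

Values inside the range are reproduced to within half a quantization step, and values outside the range are clamped to its ends.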

--FPGA Pow(2) quantization for FPGA.
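One common reason to use power-of-two quantization on FPGAs is that rescaling by 2^n reduces to a bit shift. The sketch below assumes the scale factor is rounded to the nearest power of two and that values are stored as signed integers; the README only names the method, so these details and the function names are assumptions.

```python
import math

def pow2_scale(scale):
    """Round a positive scale to the nearest power of two (in log2 space)."""
    exponent = round(math.log2(scale))
    return 2.0 ** exponent, exponent

def quantize_pow2(x, exponent, bits=8):
    """Symmetric signed quantization with a scale of 2**exponent."""
    qmax = 2 ** (bits - 1) - 1
    q = round(x / 2.0 ** exponent)
    return max(-qmax - 1, min(qmax, q))

# Hypothetical floating-point scale of about 1/127.5.
snapped_scale, shift = pow2_scale(0.007843)
q_half = quantize_pow2(0.5, shift)
```

With this scale, dequantization is a multiplication by 2^-7, which hardware can implement as a right shift by 7 bits.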

experiment

oxfordhand, yolov3, image size 640

method mAP
Baseline 0.847
Google8bit 0.851
Google8bit + BN Fold 0.851
Google8bit + BN Fold + FPGA 0.852
Google4bit + BN Fold + FPGA 0.842
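BN fusion, which the BN_Fold training method targets, merges a batch normalization layer into the preceding convolution so that inference needs a single multiply-add per output. The sketch below shows the standard folding formulas on a toy layer where each output channel is a dot product; it illustrates the arithmetic only and is not the project's training code. All names and values are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def conv_then_bn(weights, bias, gamma, beta, mean, var, x, eps=1e-5):
    """Reference path: per-channel linear op followed by batch normalization."""
    out = []
    for w_c, b_c, g, bt, m, v in zip(weights, bias, gamma, beta, mean, var):
        y = dot(w_c, x) + b_c
        out.append(g * (y - m) / math.sqrt(v + eps) + bt)
    return out

def fold_bn(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold BN statistics into the weights and bias of the preceding layer."""
    folded_w, folded_b = [], []
    for w_c, b_c, g, bt, m, v in zip(weights, bias, gamma, beta, mean, var):
        factor = g / math.sqrt(v + eps)
        folded_w.append([w * factor for w in w_c])
        folded_b.append(bt + (b_c - m) * factor)
    return folded_w, folded_b

# Hypothetical two-channel layer and input.
W = [[1.0, 2.0], [0.5, -1.0]]
b = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.0, 0.3]
mean, var = [0.2, -0.1], [4.0, 0.25]
x = [3.0, -1.0]

reference = conv_then_bn(W, b, gamma, beta, mean, var, x)
W_f, b_f = fold_bn(W, b, gamma, beta, mean, var)
folded = [dot(w_c, x) + b_c for w_c, b_c in zip(W_f, b_f)]
```

The folded layer reproduces the reference output up to floating-point error. Quantizing the folded weights rather than the original ones is what makes the quantized model match its deployed form.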

3. Knowledge Distillation

The distillation method is based on the basic distillation method proposed by Hinton in 2015, and has been partially improved in combination with the detection network.

Distilling the Knowledge in a Neural Network paper
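As a reminder of what the basic method from that paper computes, here is a minimal sketch of the temperature-scaled KL divergence between teacher and student class distributions. The temperature value and function names are illustrative, and the project's detection-specific improvements are not represented.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax, computed in a numerically stable way."""
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kd_kl_loss(student_logits, teacher_logits, T=3.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * T * T

# Hypothetical class logits for one prediction.
loss_same = kd_kl_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
loss_diff = kd_kl_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

The T^2 factor keeps the gradient magnitude of the soft-target term comparable across temperatures, as suggested in the paper.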

command : --t_cfg --t_weights --KDstr

--t_cfg cfg file of teacher model

--t_weights weights file of teacher model

--KDstr KD strategy

`--KDstr 1` The KL loss is computed directly from the outputs of the teacher network and the student network and added to the overall loss.
`--KDstr 2` Box loss and class loss are handled separately, and the student does not learn directly from the teacher. The L2 distance to the GT is computed for both the student and the teacher. When the student's distance is greater than the teacher's, an additional loss between the student and the GT is added.
`--KDstr 3` Box loss and class loss are handled separately, and the student learns directly from the teacher.
`--KDstr 4` The KD loss is divided into three parts: box loss, class loss, and feature loss.
`--KDstr 5` Builds on KDstr 4 by adding the fine-grain-mask into the feature
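The box-loss rule described for --KDstr 2 can be sketched as a teacher-bounded penalty: the student is only penalized when it is further from the ground truth than the teacher. The sketch uses squared L2 distance on box coordinates and a hypothetical `weight` parameter; it illustrates the rule as described above rather than the project's exact loss.

```python
def l2(a, b):
    """Squared L2 distance between two coordinate lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def teacher_bounded_box_loss(student_box, teacher_box, gt_box, weight=1.0):
    """Extra student-vs-GT loss, applied only when the student is worse than the teacher."""
    s_err = l2(student_box, gt_box)
    t_err = l2(teacher_box, gt_box)
    return weight * s_err if s_err > t_err else 0.0

# Hypothetical boxes in (x1, y1, x2, y2) format.
gt = [0.0, 0.0, 8.0, 8.0]
teacher = [0.0, 0.0, 9.0, 9.0]
loss_worse = teacher_bounded_box_loss([0.0, 0.0, 10.0, 10.0], teacher, gt)
loss_better = teacher_bounded_box_loss([0.0, 0.0, 8.0, 8.5], teacher, gt)
```

Once the student matches or beats the teacher on a box, the extra term vanishes and only the regular detection loss remains.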

example:

python train.py --data ... --batch-size ... --weights ... --cfg ... --img-size ... --epochs ... --t_cfg ... --t_weights ...

Usually, the pre-compression model is used as the teacher model, and the post-compression model is used as the student model for distillation training to improve the mAP of student network.

experiment

oxfordhand, yolov3tiny as the teacher model, normal-pruned yolov3tiny as the student model

teacher model mAP of teacher model student model directly fine tuning KDstr 1 KDstr 2 KDstr 3 KDstr 4(L1) KDstr 5(L1)
yolov3tiny608 0.708 normal pruning yolov3tiny608 0.658 0.666 0.661 0.672 0.673 0.674