ptaxom / pnn

Licence: other
pnn is a Darknet-compatible neural network inference engine implemented in Rust.

pnn

pnn is a Darknet-compatible neural network inference engine implemented in Rust. Optimization gives it a significant performance gain, especially in FP16 mode. pnn provides CUDNN-based and TensorRT-based inference engines.

FPS Performance

Performance was measured on an RTX 3070Ti with TensorRT v8.2.1, CUDNN v8.3.0, NVCC/CUDA Runtime 11.5, SM=80. For a fair comparison, tkDNN was taken from its tensorrt8 branch.

  • YOLOv4 CSP 512x512
    Configuration   Darknet   tkDNN             pnn + CUDNN      pnn + TensorRT
    BS=1, FP32      87.8      98.2(112.9**)     98.1(107.0**)    108.7(119.6**)
    BS=1, FP16      99.9*     221.2(359.0**)    159(183.7**)     197.3(238.0**)
    BS=4, FP32      -         121.0(129.3**)    117.4(517**)     130.1(590.0**)
    BS=4, FP16      -         268.3(493.4**)    193.2(869.0**)   230.7(1150.5**)
  • YOLOv4 416x416[WIP]
    Configuration   Darknet   tkDNN             pnn(CUDNN)       pnn(TensorRT)
    BS=1, FP32      60.3      121.7(133.8*)     N/A              N/A
    BS=1, FP16      69.7      290.5(455.1*)     N/A              N/A
    BS=4, FP32      N/A       161.4(179.8*)     N/A              N/A
    BS=4, FP16      N/A       365.1(632.1*)     N/A              N/A

* - Darknet has no true FP16 mode; it operates in mixed precision.

** - The main value covers the full pipeline, including reading, preprocessing and postprocessing. The value in brackets covers inference only. During the benchmark, none of Darknet, tkDNN or pnn rendered video to the screen or to a file. If you benchmark with rendering enabled, expect a 3-5% drop for pnn and Darknet, which use a multithreaded loader/renderer, and about a 30% drop for tkDNN, which uses a single-threaded renderer.
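
To compare the two numbers in a table cell, it can help to convert FPS into per-frame latency. The sketch below does that and derives the time spent outside inference, plus a simple throughput ratio. The function names are illustrative and are not part of pnn; the example values come from the BS=1, FP32 row of the YOLOv4 CSP table.

```rust
/// Per-frame latency in milliseconds for a throughput given in frames per second.
fn latency_ms(fps: f64) -> f64 {
    1000.0 / fps
}

/// Time per frame spent outside inference (reading, pre- and postprocessing),
/// derived from the full-pipeline FPS and the inference-only FPS.
fn overhead_ms(full_fps: f64, inference_fps: f64) -> f64 {
    latency_ms(full_fps) - latency_ms(inference_fps)
}

/// Throughput of `fps` relative to `baseline_fps` (1.0 means equal).
fn speedup(fps: f64, baseline_fps: f64) -> f64 {
    fps / baseline_fps
}
```

For the pnn + CUDNN cell, `overhead_ms(98.1, 107.0)` comes to roughly 0.85 ms per frame, and `speedup(98.1, 87.8)` to roughly 1.12 over Darknet.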

Usage

  • To build TensorRT engine use
    $ ./pnn build --help    
    pnn-build 
    
    Build TensorRT engine file
    
    USAGE:
        pnn build [OPTIONS] --weights <WEIGHTS> --config <CONFIG>
    
    OPTIONS:
        -b, --batchsize <BATCHSIZE>    Batchsize. [default: 1]
        -c, --config <CONFIG>          Path to config
        -h, --help                     Print help information
            --half                     Build HALF precision engine
        -o, --output <OUTPUT>          Output engine
        -w, --weights <WEIGHTS>        Path to weights
    # For example
    $ ./pnn build -b 4 -c ../../cfgs/tests/yolov4-csp.cfg -w ../../../models/yolov4-csp.weights -o ../../yolo_fp16_bs4.engine --half
  • To run/benchmark/render use
    $ ./pnn benchmark --help
    
    Do performance benchmark
    
    USAGE:
        pnn benchmark [OPTIONS] --weights <WEIGHTS> --config <CONFIG> --input <INPUT>
    
    OPTIONS:
        -b, --batchsize <BATCHSIZE>          Batchsize [default: 1]
        -c, --config <CONFIG>                Path to config
            --classes-file <CLASSES_FILE>    Confidence threshold [default: ./cfgs/tests/coco.names]
        -h, --help                           Print help information
            --half                           Build HALF precision engine
        -i, --input <INPUT>                  Input file
            --iou-tresh <IOU_TRESH>          Confidence threshold [default: 0.45]
        -o, --output <OUTPUT>                Output render file
        -s, --show                           Render window during work
            --threshold <THRESHOLD>          Confidence threshold [default: 0.45]
            --trt                            Load as TensorRT engine
        -w, --weights <WEIGHTS>              Path to weights
    # For example
    # Run yolov4-p6 with the Darknet FP32, BS=1 engine and render the result to the screen
    $ ./pnn benchmark -w ~/Sources/models/yolov4-p6.weights -c ~/Sources/models/yolov4-p6.cfg -s -b 1 -i ~/Sources/models/yolo_test.mp4
    
    # Run yolo_fp16_bs4.engine with the settings fixed at build time and save the result to res.avi
    $ ./pnn benchmark --trt --weights yolo_fp16_bs4.engine -c cfgs/tests/yolov4-csp.cfg --input ../models/yolo_test.mp4 --output res.avi
    The output looks like this:
    Stats for      ../models/yolo_test.mp4
    Data type:     FP16
    Batchsize:     1
    Total frames:  1213
    FPS:           147.48 # END-TO-END FPS, including reading/preprocessing/rendering time
    INF+NMS FPS:   159.13 # Inference time + post processing FPS
    Inference FPS: 173.68 # Inference only. If bs != 1, this is bs * the per-batch inference rate
    
  • To show model architecture use
    $ ./pnn dot --help
    Build dot graph of model
    
    USAGE:
        pnn dot --config <CONFIG> --output <OUTPUT>
    
    OPTIONS:
        -c, --config <CONFIG>    Path to config
        -h, --help               Print help information
        -o, --output <OUTPUT>    Output dot file
    # To convert the dot file to an image, use
    $ dot -Tpng path.dot > path.png
    (Figure: YOLOv4 CSP architecture)
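
The source does not show what the file written by `pnn dot` looks like. As a rough illustration of the Graphviz DOT format that `dot -Tpng` consumes, the sketch below renders a linear chain of layers as a digraph. The function and the layer names are hypothetical, and it assumes the names contain no double quotes.

```rust
/// Render a linear chain of layers as a Graphviz DOT digraph.
/// Layer names are hypothetical; this is not pnn's actual output format.
fn chain_to_dot(layers: &[&str]) -> String {
    let mut out = String::from("digraph model {\n");
    // One node per layer, identified by its position in the chain.
    for (i, name) in layers.iter().enumerate() {
        out.push_str(&format!("    n{} [label=\"{}\"];\n", i, name));
    }
    // One edge between each pair of consecutive layers.
    for i in 1..layers.len() {
        out.push_str(&format!("    n{} -> n{};\n", i - 1, i));
    }
    out.push_str("}\n");
    out
}
```

Writing the returned string to `path.dot` gives a file that the `dot -Tpng path.dot > path.png` command above can render.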

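The benchmark command exposes `--threshold` and `--iou-tresh` (both default to 0.45) and reports an INF+NMS figure. The source does not show how pnn implements postprocessing, so the following is only a minimal, class-agnostic sketch of greedy non-maximum suppression. It assumes the first option filters detections by confidence and the second is the overlap cutoff. All type and function names are hypothetical, and whether pnn's comparisons are strict or inclusive is likewise an assumption.

```rust
/// Axis-aligned detection box with a confidence score (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Detection {
    x1: f32,
    y1: f32,
    x2: f32,
    y2: f32,
    score: f32,
}

/// Intersection over union of two boxes; 0.0 when they do not overlap.
fn iou(a: &Detection, b: &Detection) -> f32 {
    let w = (a.x2.min(b.x2) - a.x1.max(b.x1)).max(0.0);
    let h = (a.y2.min(b.y2) - a.y1.max(b.y1)).max(0.0);
    let inter = w * h;
    let area_a = (a.x2 - a.x1) * (a.y2 - a.y1);
    let area_b = (b.x2 - b.x1) * (b.y2 - b.y1);
    let union = area_a + area_b - inter;
    if union > 0.0 { inter / union } else { 0.0 }
}

/// Greedy NMS: drop boxes scoring below `conf_thresh`, then visit the rest in
/// descending score order and keep a box only if its IoU with every box kept
/// so far does not exceed `iou_thresh`.
fn nms(mut dets: Vec<Detection>, conf_thresh: f32, iou_thresh: f32) -> Vec<Detection> {
    dets.retain(|d| d.score >= conf_thresh);
    dets.sort_by(|a, b| {
        b.score
            .partial_cmp(&a.score)
            .unwrap_or(std::cmp::Ordering::Equal)
    });
    let mut kept: Vec<Detection> = Vec::new();
    for d in dets {
        if kept.iter().all(|k| iou(k, &d) <= iou_thresh) {
            kept.push(d);
        }
    }
    kept
}
```

Calling `nms(detections, 0.45, 0.45)` mirrors the default values listed in the help text. A per-class variant would run the same loop separately for each class label.
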
Requirements

  • Rust 2021 edition
  • Clang ≥ 13.0
  • GCC ≥ 9.0
  • NVCC ≥ 10
  • CUDNN ≥ 8
  • TensorRT ≥ 8
  • OpenCV ≥ 4.4

Roadmap

  • CUDNN FP32 Support
  • dot files render
  • TensorRT FP32 Support
  • FP16 mode
  • Python bindings
  • Releases & packages for Rust/C++/Python
  • INT8 support
  • Refitting engine