
NVIDIA-AI-IOT / Tf_trt_models

License: other
TensorFlow models accelerated with NVIDIA TensorRT

Programming Languages

Python (139,335 projects; #7 most used programming language)

Projects that are alternatives to, or similar to, tf_trt_models

Bmw Tensorflow Inference Api Gpu
This is a repository for an object detection inference API using the Tensorflow framework.
Stars: ✭ 277 (-55.39%)
Mutual labels:  object-detection, nvidia, inference
Cv Pretrained Model
A collection of computer vision pre-trained models.
Stars: ✭ 995 (+60.23%)
Mutual labels:  object-detection, models, image-classification
Jetson Inference
Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
Stars: ✭ 5,191 (+735.91%)
Mutual labels:  object-detection, nvidia, inference
Lightnet
🌓 Bringing pjreddie's DarkNet out of the shadows #yolo
Stars: ✭ 322 (-48.15%)
Mutual labels:  object-detection, image-classification
Autogluon
AutoGluon: AutoML for Text, Image, and Tabular Data
Stars: ✭ 3,920 (+531.24%)
Mutual labels:  object-detection, image-classification
Alturos.yolo
C# Yolo Darknet Wrapper (real-time object detection)
Stars: ✭ 308 (-50.4%)
Mutual labels:  object-detection, image-classification
Opennars
OpenNARS for Research 3.0+
Stars: ✭ 264 (-57.49%)
Mutual labels:  inference, realtime
Rexnet
Official Pytorch implementation of ReXNet (Rank eXpansion Network) with pretrained models
Stars: ✭ 319 (-48.63%)
Mutual labels:  object-detection, image-classification
Involution
[CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator
Stars: ✭ 252 (-59.42%)
Mutual labels:  object-detection, image-classification
Rectlabel Support
RectLabel - An image annotation tool to label images for bounding box object detection and segmentation.
Stars: ✭ 338 (-45.57%)
Mutual labels:  object-detection, image-classification
Trainyourownyolo
Train a state-of-the-art yolov3 object detector from scratch!
Stars: ✭ 399 (-35.75%)
Mutual labels:  object-detection, inference
Awesome Computer Vision Models
A list of popular deep learning models related to classification, segmentation and detection problems
Stars: ✭ 278 (-55.23%)
Mutual labels:  object-detection, image-classification
Gfocal
Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection, NeurIPS2020
Stars: ✭ 376 (-39.45%)
Mutual labels:  object-detection, inference
Object Detection
Object detection project for real-time (webcam) and offline (video processing) application.
Stars: ✭ 454 (-26.89%)
Mutual labels:  object-detection, realtime
Neural Pipeline
Neural networks training pipeline based on PyTorch
Stars: ✭ 315 (-49.28%)
Mutual labels:  object-detection, image-classification
Mmdetection To Tensorrt
Convert MMDetection models to TensorRT, with support for FP16, INT8, batched input, dynamic shapes, etc.
Stars: ✭ 262 (-57.81%)
Mutual labels:  object-detection, inference
Face recognition
🍎 My own face recognition with deep neural networks.
Stars: ✭ 328 (-47.18%)
Mutual labels:  object-detection, image-classification
barracuda-style-transfer
Companion code for the Unity Style Transfer blog post, showcasing realtime style transfer using Barracuda.
Stars: ✭ 126 (-79.71%)
Mutual labels:  realtime, inference
Deep-Learning
It contains the coursework and the practice I have done while learning Deep Learning.🚀 👨‍💻💥 🚩🌈
Stars: ✭ 21 (-96.62%)
Mutual labels:  nvidia, image-classification
Sianet
An easy to use C# deep learning library with CUDA/OpenCL support
Stars: ✭ 353 (-43.16%)
Mutual labels:  object-detection, image-classification

TensorFlow/TensorRT Models on Jetson

[image: landing graphic]

This repository contains scripts and documentation to use TensorFlow image classification and object detection models on NVIDIA Jetson. The models are sourced from the TensorFlow models repository and optimized using TensorRT.

Setup

  1. Flash your Jetson TX2 with JetPack 3.2 (including TensorRT).

  2. Install miscellaneous dependencies on Jetson

    sudo apt-get install python-pip python-matplotlib python-pil
    
  3. Install TensorFlow 1.7+ (with TensorRT support). Download the pre-built pip wheel and install using pip.

    pip install tensorflow-1.8.0-cp27-cp27mu-linux_aarch64.whl --user
    

    or, if you're using Python 3:

    pip3 install tensorflow-1.8.0-cp35-cp35m-linux_aarch64.whl --user
    
  4. Clone this repository

    git clone --recursive https://github.com/NVIDIA-AI-IOT/tf_trt_models.git
    cd tf_trt_models
    
  5. Run the installation script

    ./install.sh
    

    or, if you want to specify the Python interpreter:

    ./install.sh python3
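
As an optional sanity check, you can confirm from Python that the TensorFlow wheel (with TF-TRT support) and the tf_trt_models package are importable. The snippet below is just an illustration of such a check:

import tensorflow as tf
# The TF-TRT integration ships as a contrib module in TensorFlow 1.x (1.7+).
import tensorflow.contrib.tensorrt as trt
from tf_trt_models.classification import download_classification_checkpoint

print(tf.__version__)  # expect 1.7 or newer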
    

Image Classification

[image: classification]

Models

Model                   Input Size   TF-TRT TX2   TF TX2
inception_v1            224x224      7.36ms       22.9ms
inception_v2            224x224      9.08ms       31.8ms
inception_v3            299x299      20.7ms       74.3ms
inception_v4            299x299      38.5ms       129ms
inception_resnet_v2     299x299      158ms        -
resnet_v1_50            224x224      12.5ms       55.1ms
resnet_v1_101           224x224      20.6ms       91.0ms
resnet_v1_152           224x224      28.9ms       124ms
resnet_v2_50            299x299      26.5ms       73.4ms
resnet_v2_101           299x299      46.9ms       -
resnet_v2_152           299x299      69.0ms       -
mobilenet_v1_0p25_128   128x128      3.72ms       7.99ms
mobilenet_v1_0p5_160    160x160      4.47ms       8.69ms
mobilenet_v1_1p0_224    224x224      11.1ms       17.3ms

(- indicates that no timing is reported for that configuration)

TF - Original TensorFlow graph (FP32)

TF-TRT - TensorRT optimized graph (FP16)

The above benchmark timings were gathered after placing the Jetson TX2 in MAX-N mode. To do this, run the following commands in a terminal:

sudo nvpmodel -m 0
sudo ~/jetson_clocks.sh

Download pretrained model

As a convenience, we provide a script to download pretrained models sourced from the TensorFlow models repository.

from tf_trt_models.classification import download_classification_checkpoint

checkpoint_path = download_classification_checkpoint('inception_v2')

To manually download the pretrained models, follow the links here.

Build TensorRT / Jetson compatible graph

from tf_trt_models.classification import build_classification_graph

frozen_graph, input_names, output_names = build_classification_graph(
    model='inception_v2',
    checkpoint=checkpoint_path,
    num_classes=1001
)

Optimize with TensorRT

import tensorflow.contrib.tensorrt as trt

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_names,
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,
    precision_mode='FP16',
    minimum_segment_size=50
)
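
The optimized graph can also be serialized to disk so the conversion does not have to be repeated on every run. A minimal sketch, continuing from the trt_graph produced above (the file name is an arbitrary choice):

import tensorflow as tf

# Save the TensorRT-optimized GraphDef for later reuse.
with open('./inception_v2_trt.pb', 'wb') as f:
    f.write(trt_graph.SerializeToString())

# Reload it later without rebuilding or re-optimizing the model.
restored_graph_def = tf.GraphDef()
with open('./inception_v2_trt.pb', 'rb') as f:
    restored_graph_def.ParseFromString(f.read())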

Jupyter Notebook Sample

For a comprehensive example that performs the above steps and runs inference on a real image, see the Jupyter notebook sample.
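
If you prefer a plain script to the notebook, the sketch below shows roughly how the optimized classification graph can be run on a single image. It is an illustration only: it reuses trt_graph, input_names and output_names from the snippets above, the image path is a placeholder, and the preprocessing (resize to the model's input size and scale to [-1, 1]) is an assumption that matches the Inception/MobileNet family but should be verified for your model.

import numpy as np
import tensorflow as tf
from PIL import Image

# Import the TensorRT-optimized graph into a fresh TensorFlow graph.
graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(trt_graph, name='')

# Tensor names come from build_classification_graph; ':0' selects the tensor.
input_tensor = graph.get_tensor_by_name(input_names[0] + ':0')
output_tensor = graph.get_tensor_by_name(output_names[0] + ':0')

# Placeholder image; 224x224 matches inception_v2 from the table above.
image = Image.open('example.jpg').resize((224, 224))
image = np.asarray(image, dtype=np.float32) / 127.5 - 1.0  # assumed [-1, 1] scaling

with tf.Session(graph=graph) as sess:
    scores = sess.run(output_tensor, feed_dict={input_tensor: image[None, ...]})

print('predicted class index:', int(np.argmax(scores[0])))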

Train for custom task

Follow the documentation from the TensorFlow models repository. Once you have obtained a checkpoint, proceed with building the graph and optimizing with TensorRT as shown above.
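
For example, with a hypothetical checkpoint produced by your own training run (the model name, checkpoint path and class count below are placeholders, not values from this repository):

from tf_trt_models.classification import build_classification_graph

# Build a frozen graph from a custom fine-tuned checkpoint (placeholder values).
frozen_graph, input_names, output_names = build_classification_graph(
    model='mobilenet_v1_1p0_224',
    checkpoint='./my_training_dir/model.ckpt-50000',
    num_classes=5  # must match the number of classes used during training
)

# Then pass frozen_graph to trt.create_inference_graph exactly as shown above.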

Object Detection

[image: detection]

Models

Model                   Input Size   TF-TRT TX2   TF TX2
ssd_mobilenet_v1_coco   300x300      50.5ms       72.9ms
ssd_inception_v2_coco   300x300      54.4ms       132ms

TF - Original TensorFlow graph (FP32)

TF-TRT - TensorRT optimized graph (FP16)

The above benchmark timings were gathered after placing the Jetson TX2 in MAX-N mode. To do this, run the following commands in a terminal:

sudo nvpmodel -m 0
sudo ~/jetson_clocks.sh

Download pretrained model

As a convenience, we provide a script to download pretrained model weights and config files sourced from the TensorFlow models repository.

from tf_trt_models.detection import download_detection_model

config_path, checkpoint_path = download_detection_model('ssd_inception_v2_coco')

To manually download the pretrained models, follow the links here.

Important: Some of the object detection configuration files have a very low non-maximum suppression score threshold (i.e. 1e-8), which can cause an unnecessarily large CPU post-processing load. Depending on your application, it may be advisable to raise this value to something larger (such as 0.3) for improved performance; we do this for the above benchmark timings. This can be done by modifying the configuration file directly before calling build_detection_graph. The parameter can be found, for example, in this line.
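
A minimal sketch of such an edit is shown below. It simply rewrites the score_threshold value in the downloaded config file with a plain text substitution; the 0.3 value and the regular expression are illustrative, not part of the repository's tooling, and config_path is the path returned by download_detection_model above.

import re

# Raise the NMS score threshold in the pipeline config before building the graph.
with open(config_path, 'r') as f:
    config_text = f.read()

config_text = re.sub(r'score_threshold:\s*\S+', 'score_threshold: 0.3', config_text)

with open(config_path, 'w') as f:
    f.write(config_text)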

Build TensorRT / Jetson compatible graph

from tf_trt_models.detection import build_detection_graph

frozen_graph, input_names, output_names = build_detection_graph(
    config=config_path,
    checkpoint=checkpoint_path
)

Optimize with TensorRT

import tensorflow.contrib.tensorrt as trt

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_names,
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,
    precision_mode='FP16',
    minimum_segment_size=50
)

Jupyter Notebook Sample

For a comprehensive example that performs the above steps and runs inference on a real image, see the Jupyter notebook sample.
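
As with classification, the notebook can be approximated with a short script. The sketch below is illustrative only: it reuses trt_graph, input_names and output_names from the snippets above, the image path is a placeholder, 300x300 matches the SSD models in the table, the uint8 input follows the TensorFlow Object Detection API convention, and the meaning and order of the outputs (boxes, classes, scores, detection count) should be checked against output_names for your model.

import numpy as np
import tensorflow as tf
from PIL import Image

# Import the TensorRT-optimized detection graph into a fresh TensorFlow graph.
graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(trt_graph, name='')

input_tensor = graph.get_tensor_by_name(input_names[0] + ':0')
output_tensors = [graph.get_tensor_by_name(name + ':0') for name in output_names]

# Placeholder image; dtype and size assumptions are noted above.
image = Image.open('example.jpg').resize((300, 300))
image = np.asarray(image, dtype=np.uint8)

with tf.Session(graph=graph) as sess:
    outputs = sess.run(output_tensors, feed_dict={input_tensor: image[None, ...]})

# Inspect what came back before assuming which output is which.
for name, value in zip(output_names, outputs):
    print(name, value.shape)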

Train for custom task

Follow the documentation from the TensorFlow models repository. Once you have obtained a checkpoint, proceed with building the graph and optimizing with TensorRT as shown above. Please note that not all models have been tested, so during training you should use an object detection config file that resembles one of the ssd_mobilenet_v1_coco or ssd_inception_v2_coco models. Some config parameters, such as the number of classes, image size, and non-maximum suppression parameters, may be modified, but performance may vary.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].