taehoonlee / Tensornets

Licence: MIT
High level network definitions with pre-trained weights in TensorFlow

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Tensornets

Keras Idiomatic Programmer
Books, Presentations, Workshops, Notebook Labs, and Model Zoo for Software Engineers and Data Scientists wanting to learn the TF.Keras Machine Learning framework
Stars: ✭ 720 (-26.68%)
Mutual labels:  resnet, mobilenet, densenet, inception, vgg
Myvision
Computer vision based ML training data generation tool 🚀
Stars: ✭ 453 (-53.87%)
Mutual labels:  object-detection, yolo, model, vgg
Tf Faster Rcnn
Tensorflow Faster RCNN for Object Detection
Stars: ✭ 3,604 (+267.01%)
Mutual labels:  object-detection, resnet, mobilenet, faster-rcnn
Yolov5 ncnn
🍅 Deploy NCNN on mobile phones. Supports Android and iOS.
Stars: ✭ 535 (-45.52%)
Mutual labels:  object-detection, yolo, yolov3, mobilenet
Tianchi Medical Lungtumordetect
Tianchi Medical AI Competition [Season 1]: Intelligent diagnosis of lung nodules with UNet/VGG/Inception/ResNet/DenseNet
Stars: ✭ 314 (-68.02%)
Mutual labels:  resnet, densenet, inception, vgg
Tensorrtx
Implementation of popular deep learning networks with TensorRT network definition API
Stars: ✭ 3,456 (+251.93%)
Mutual labels:  resnet, yolov3, mobilenetv2, vgg
Mobilenet Yolo
MobileNetV2-YoloV3-Nano: 0.5BFlops 3MB HUAWEI P40: 6ms/img, YoloFace-500k:0.1Bflops 420KB🔥🔥🔥
Stars: ✭ 1,566 (+59.47%)
Mutual labels:  object-detection, yolo, yolov3, mobilenetv2
Classification models
Classification models trained on ImageNet. Keras.
Stars: ✭ 938 (-4.48%)
Mutual labels:  resnet, mobilenet, densenet, vgg
Sightseq
Computer vision tools for fairseq, containing PyTorch implementation of text recognition and object detection
Stars: ✭ 116 (-88.19%)
Mutual labels:  object-detection, mobilenet, faster-rcnn, densenet
Keras Yolov3 Mobilenet
I transfer the backend of yolov3 into Mobilenetv1,VGG16,ResNet101 and ResNeXt101
Stars: ✭ 552 (-43.79%)
Mutual labels:  object-detection, yolo, yolov3, mobilenet
Tensorflow object tracking video
Object Tracking in Tensorflow (Localization, Detection, Classification), developed to participate in the ImageNet VID competition
Stars: ✭ 491 (-50%)
Mutual labels:  object-detection, yolo, inception
Yolo3 4 Py
A Python wrapper on Darknet. Compatible with YOLO V3.
Stars: ✭ 504 (-48.68%)
Mutual labels:  object-detection, yolo, yolov3
Bmw Yolov4 Training Automation
This repository allows you to get started with training a state-of-the-art Deep Learning model with little to no configuration needed! You provide your labeled dataset or label your dataset using our BMW-LabelTool-Lite and you can start the training right away and monitor it in many different ways like TensorBoard or a custom REST API and GUI. NoCode training with YOLOv4 and YOLOV3 has never been so easy.
Stars: ✭ 533 (-45.72%)
Mutual labels:  object-detection, yolo, yolov3
Yolo Vehicle Counter
This project aims to count every vehicle (motorcycle, bus, car, cycle, truck, train) detected in the input video using YOLOv3 object-detection algorithm.
Stars: ✭ 28 (-97.15%)
Mutual labels:  object-detection, yolo, yolov3
Trainyourownyolo
Train a state-of-the-art yolov3 object detector from scratch!
Stars: ✭ 399 (-59.37%)
Mutual labels:  object-detection, yolo, yolov3
Stronger Yolo
🔥Improve yolo with latest paper
Stars: ✭ 539 (-45.11%)
Mutual labels:  yolo, yolov3, mobilenetv2
Yolo annotation tool
Annotation tool for YOLO in opencv
Stars: ✭ 17 (-98.27%)
Mutual labels:  object-detection, yolo, yolov3
Mobilenet Yolo
A caffe implementation of MobileNet-YOLO detection network
Stars: ✭ 825 (-15.99%)
Mutual labels:  yolo, yolov3, mobilenet
Yolov3 pytorch
Full implementation of YOLOv3 in PyTorch
Stars: ✭ 570 (-41.96%)
Mutual labels:  object-detection, yolo, yolov3
Tensorflow Yolo V3
Implementation of YOLO v3 object detector in Tensorflow (TF-Slim)
Stars: ✭ 862 (-12.22%)
Mutual labels:  object-detection, yolo, yolov3

TensorNets

High level network definitions with pre-trained weights in TensorFlow (tested with 2.1.0 >= TF >= 1.4.0).

Guiding principles

  • Applicability. Many people already have their own ML workflows and want to plug a new model into them. TensorNets plugs in easily because it is designed as a set of simple functional interfaces without custom classes.
  • Manageability. Models are written with tf.contrib.layers, which is lightweight like PyTorch and Keras and gives easy access to every weight and end-point. It is also easy to deploy and extend the collection of pre-processing functions and pre-trained weights.
  • Readability. Recent TensorFlow APIs allow more factoring and less indenting. For example, all the Inception variants are implemented in about 500 lines of code in TensorNets, versus 2000+ lines in the official TensorFlow models.
  • Reproducibility. You can always reproduce the original results with simple APIs, including feature extraction. Furthermore, you don't need to worry about the TensorFlow version because compatibility with various TensorFlow releases has been checked with Travis.

Installation

You can install TensorNets from PyPI (pip install tensornets) or directly from GitHub (pip install git+https://github.com/taehoonlee/tensornets.git).

A quick example

Each network (see full list) is not a custom class but a function that takes a tf.Tensor as its input and returns a tf.Tensor as its output. Here is an example of ResNet50:

import tensorflow as tf
# import tensorflow.compat.v1 as tf  # for TF 2
import tensornets as nets
# tf.disable_v2_behavior()  # for TF 2

inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])
model = nets.ResNet50(inputs)

assert isinstance(model, tf.Tensor)

You can load an example image with utils.load_img, which returns an np.ndarray in NHWC format:

img = nets.utils.load_img('cat.png', target_size=256, crop_size=224)
assert img.shape == (1, 224, 224, 3)
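
The relationship between target_size=256 and crop_size=224 can be sketched with plain arithmetic. This is a minimal illustration, assuming the image has already been resized to 256x256 and that the crop is taken from the center (the Performance section mentions a single center crop). center_crop_box is a hypothetical helper, not part of TensorNets:

```python
def center_crop_box(height, width, crop_size):
    """Return (top, left, bottom, right) of a centered square crop.

    Hypothetical helper for illustration only; not a TensorNets function.
    """
    if crop_size > min(height, width):
        raise ValueError("crop_size must not exceed the image size")
    top = (height - crop_size) // 2
    left = (width - crop_size) // 2
    return top, left, top + crop_size, left + crop_size


# A 256x256 image cropped to 224x224 loses a 16-pixel border on each side.
box = center_crop_box(256, 256, 224)
print(box)
```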

Once your network is created, you can run it with regular TensorFlow APIs 😊 because all the networks in TensorNets always return a tf.Tensor. Using pre-trained weights and pre-processing is as easy as calling pretrained() and preprocess() to reproduce the original results:

with tf.Session() as sess:
    img = model.preprocess(img)  # equivalent to img = nets.preprocess(model, img)
    sess.run(model.pretrained())  # equivalent to nets.pretrained(model)
    preds = sess.run(model, {inputs: img})

You can see the most probable classes:

print(nets.utils.decode_predictions(preds, top=2)[0])
[(u'n02124075', u'Egyptian_cat', 0.28067636), (u'n02127052', u'lynx', 0.16826575)]
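
Conceptually, picking the most probable classes amounts to sorting the class scores and keeping the best k. A minimal sketch with toy scores (the labels and values below are made up for illustration, and top_k is a hypothetical helper, not the TensorNets implementation of decode_predictions):

```python
def top_k(probs, labels, k=2):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy values, not real model output.
toy_labels = ["cat", "dog", "bird"]
toy_probs = [0.1, 0.7, 0.2]
print(top_k(toy_probs, toy_labels, k=2))
```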

You can also easily obtain values of intermediate layers with middles() and outputs():

with tf.Session() as sess:
    img = model.preprocess(img)
    sess.run(model.pretrained())
    middles = sess.run(model.middles(), {inputs: img})
    outputs = sess.run(model.outputs(), {inputs: img})

model.print_middles()
assert middles[0].shape == (1, 56, 56, 256)
assert middles[-1].shape == (1, 7, 7, 2048)

model.print_outputs()
assert sum(sum((outputs[-1] - preds) ** 2)) < 1e-8

With load() and save(), you can save and restore your weight values:

with tf.Session() as sess:
    model.init()
    # ... your training ...
    model.save('test.npz')

with tf.Session() as sess:
    model.load('test.npz')
    # ... your deployment ...

TensorNets enables us to deploy well-known architectures and benchmark those results faster ⚡️. For more information, you can check out the lists of utilities, examples, and architectures.

Object detection example

Each object detection model can be coupled with any network in TensorNets (see performance) and takes two arguments: a placeholder and a function acting as a stem layer. Here is an example of YOLOv2 for PASCAL VOC:

import tensorflow as tf
import tensornets as nets

inputs = tf.placeholder(tf.float32, [None, 416, 416, 3])
model = nets.YOLOv2(inputs, nets.Darknet19)

img = nets.utils.load_img('cat.png')

with tf.Session() as sess:
    sess.run(model.pretrained())
    preds = sess.run(model, {inputs: model.preprocess(img)})
    boxes = model.get_boxes(preds, img.shape[1:3])

Like other models, a detection model also returns tf.Tensor as its output. You can see the bounding box predictions (x1, y1, x2, y2, score) by using model.get_boxes(model_output, original_img_shape) and visualize the results:

from tensornets.datasets import voc
print("%s: %s" % (voc.classnames[7], boxes[7][0]))  # 7 is cat

import numpy as np
import matplotlib.pyplot as plt
box = boxes[7][0]
plt.imshow(img[0].astype(np.uint8))
plt.gca().add_patch(plt.Rectangle(
    (box[0], box[1]), box[2] - box[0], box[3] - box[1],
    fill=False, edgecolor='r', linewidth=2))
plt.show()
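
Since each box carries its score as the fifth value, low-confidence detections can be filtered out before plotting. A minimal sketch, assuming the boxes of one class are a sequence of (x1, y1, x2, y2, score) entries as described above. confident_boxes is a hypothetical helper and the sample boxes are made up:

```python
def confident_boxes(class_boxes, min_score=0.5):
    """Keep boxes whose score (fifth value) is at least min_score, best first."""
    kept = [box for box in class_boxes if box[4] >= min_score]
    return sorted(kept, key=lambda box: box[4], reverse=True)


# Toy boxes in (x1, y1, x2, y2, score) form, not real model output.
toy_boxes = [
    (5.0, 5.0, 60.0, 60.0, 0.6),
    (0.0, 0.0, 50.0, 50.0, 0.3),
    (10.0, 20.0, 110.0, 220.0, 0.9),
]
for x1, y1, x2, y2, score in confident_boxes(toy_boxes):
    # Width and height are what plt.Rectangle expects after the corner.
    print(x1, y1, x2 - x1, y2 - y1, score)
```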

More detection examples such as FasterRCNN on VOC2007 are here 😎. Note that:

  • APIs of detection models are slightly different:

    • YOLOv3: sess.run(model.preds, {inputs: img}),
    • YOLOv2: sess.run(model, {inputs: img}),
    • FasterRCNN: sess.run(model, {inputs: img, model.scales: scale}),
  • FasterRCNN requires roi_pooling:

    • git clone https://github.com/deepsense-io/roi-pooling && cd roi-pooling && vi roi_pooling/Makefile and edit according to here,
    • python setup.py install.

Utilities

Besides pretrained() and preprocess(), the output tf.Tensor provides the following useful methods:

  • logits: returns the tf.Tensor logits (the values before the softmax),
  • middles() (=get_middles()): returns a list of all the representative tf.Tensor end-points,
  • outputs() (=get_outputs()): returns a list of all the tf.Tensor end-points,
  • weights() (=get_weights()): returns a list of all the tf.Tensor weight matrices,
  • summary() (=print_summary()): prints the numbers of layers, weight matrices, and parameters,
  • print_middles(): prints all the representative end-points,
  • print_outputs(): prints all the end-points,
  • print_weights(): prints all the weight matrices.
Example outputs of print methods are:
>>> model.print_middles()
Scope: resnet50
conv2/block1/out:0 (?, 56, 56, 256)
conv2/block2/out:0 (?, 56, 56, 256)
conv2/block3/out:0 (?, 56, 56, 256)
conv3/block1/out:0 (?, 28, 28, 512)
conv3/block2/out:0 (?, 28, 28, 512)
conv3/block3/out:0 (?, 28, 28, 512)
conv3/block4/out:0 (?, 28, 28, 512)
conv4/block1/out:0 (?, 14, 14, 1024)
...

>>> model.print_outputs()
Scope: resnet50
conv1/pad:0 (?, 230, 230, 3)
conv1/conv/BiasAdd:0 (?, 112, 112, 64)
conv1/bn/batchnorm/add_1:0 (?, 112, 112, 64)
conv1/relu:0 (?, 112, 112, 64)
pool1/pad:0 (?, 114, 114, 64)
pool1/MaxPool:0 (?, 56, 56, 64)
conv2/block1/0/conv/BiasAdd:0 (?, 56, 56, 256)
conv2/block1/0/bn/batchnorm/add_1:0 (?, 56, 56, 256)
conv2/block1/1/conv/BiasAdd:0 (?, 56, 56, 64)
conv2/block1/1/bn/batchnorm/add_1:0 (?, 56, 56, 64)
conv2/block1/1/relu:0 (?, 56, 56, 64)
...

>>> model.print_weights()
Scope: resnet50
conv1/conv/weights:0 (7, 7, 3, 64)
conv1/conv/biases:0 (64,)
conv1/bn/beta:0 (64,)
conv1/bn/gamma:0 (64,)
conv1/bn/moving_mean:0 (64,)
conv1/bn/moving_variance:0 (64,)
conv2/block1/0/conv/weights:0 (1, 1, 64, 256)
conv2/block1/0/conv/biases:0 (256,)
conv2/block1/0/bn/beta:0 (256,)
conv2/block1/0/bn/gamma:0 (256,)
...

>>> model.summary()
Scope: resnet50
Total layers: 54
Total weights: 320
Total parameters: 25,636,712

Examples

  • Comparison of different networks:
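import tensorflow as tf
import tensornets as nets
from tensornets import utils  # provides load_img and decode_predictions

inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])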
models = [
    nets.MobileNet75(inputs),
    nets.MobileNet100(inputs),
    nets.SqueezeNet(inputs),
]

img = utils.load_img('cat.png', target_size=256, crop_size=224)
imgs = nets.preprocess(models, img)

with tf.Session() as sess:
    nets.pretrained(models)
    for (model, img) in zip(models, imgs):
        preds = sess.run(model, {inputs: img})
        print(utils.decode_predictions(preds, top=2)[0])
  • Transfer learning:
inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])
outputs = tf.placeholder(tf.float32, [None, 50])
model = nets.DenseNet169(inputs, is_training=True, classes=50)

loss = tf.losses.softmax_cross_entropy(outputs, model.logits)
train = tf.train.AdamOptimizer(learning_rate=1e-5).minimize(loss)

with tf.Session() as sess:
    nets.pretrained(model)
    for (x, y) in your_NumPy_data:  # the NHWC and one-hot format
        sess.run(train, {inputs: x, outputs: y})
  • Using multi-GPU:
inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])
models = []

with tf.device('gpu:0'):
    models.append(nets.ResNeXt50(inputs))

with tf.device('gpu:1'):
    models.append(nets.DenseNet201(inputs))

from tensornets import utils  # provides load_img and decode_predictions
from tensornets.preprocess import fb_preprocess
img = utils.load_img('cat.png', target_size=256, crop_size=224)
img = fb_preprocess(img)

with tf.Session() as sess:
    nets.pretrained(models)
    preds = sess.run(models, {inputs: img})
    for pred in preds:
        print(utils.decode_predictions(pred, top=2)[0])
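
The transfer learning example expects labels in one-hot format with 50 classes. A minimal sketch of that encoding in plain Python (one_hot is a hypothetical helper; in practice the rows would be stacked into a NumPy array before being fed to the outputs placeholder):

```python
def one_hot(label, num_classes):
    """Return a list of floats with 1.0 at index `label` and 0.0 elsewhere."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    row = [0.0] * num_classes
    row[label] = 1.0
    return row


# Two toy labels encoded for a 50-class problem.
batch = [one_hot(y, 50) for y in (3, 49)]
print(len(batch), len(batch[0]))
```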

Performance

Image classification

  • The top-k accuracies were obtained with TensorNets on ImageNet validation set and may slightly differ from the original ones.
    • Input: input size fed into models
    • Top-1: single center crop, top-1 accuracy
    • Top-5: single center crop, top-5 accuracy
    • MAC: rounded number of floating-point operations, measured with tf.profiler
    • Size: rounded number of parameters (with fully-connected layers)
    • Stem: rounded number of parameters (without fully-connected layers)
  • The computation times were measured on NVIDIA Tesla P100 (3584 cores, 16 GB global memory) with cuDNN 6.0 and CUDA 8.0.
    • Speed: milliseconds to run inference on 100 images
  • The summary plot is generated by this script.
Model Input Top-1 Top-5 MAC Size Stem Speed References
ResNet50 224 74.874 92.018 51.0M 25.6M 23.6M 195.4 [paper] [tf-slim] [torch-fb] [caffe] [keras]
ResNet101 224 76.420 92.786 88.9M 44.7M 42.7M 311.7 [paper] [tf-slim] [torch-fb] [caffe]
ResNet152 224 76.604 93.118 120.1M 60.4M 58.4M 439.1 [paper] [tf-slim] [torch-fb] [caffe]
ResNet50v2 299 75.960 93.034 51.0M 25.6M 23.6M 209.7 [paper] [tf-slim] [torch-fb]
ResNet101v2 299 77.234 93.816 88.9M 44.7M 42.6M 326.2 [paper] [tf-slim] [torch-fb]
ResNet152v2 299 78.032 94.162 120.1M 60.4M 58.3M 455.2 [paper] [tf-slim] [torch-fb]
ResNet200v2 224 78.286 94.152 129.0M 64.9M 62.9M 618.3 [paper] [tf-slim] [torch-fb]
ResNeXt50c32 224 77.740 93.810 49.9M 25.1M 23.0M 267.4 [paper] [torch-fb]
ResNeXt101c32 224 78.730 94.294 88.1M 44.3M 42.3M 427.9 [paper] [torch-fb]
ResNeXt101c64 224 79.494 94.592 0.0M 83.7M 81.6M 877.8 [paper] [torch-fb]
WideResNet50 224 78.018 93.934 137.6M 69.0M 66.9M 358.1 [paper] [torch]
Inception1 224 66.840 87.676 14.0M 7.0M 6.0M 165.1 [paper] [tf-slim] [caffe-zoo]
Inception2 224 74.680 92.156 22.3M 11.2M 10.2M 134.3 [paper] [tf-slim]
Inception3 299 77.946 93.758 47.6M 23.9M 21.8M 314.6 [paper] [tf-slim] [keras]
Inception4 299 80.120 94.978 85.2M 42.7M 41.2M 582.1 [paper] [tf-slim]
InceptionResNet2 299 80.256 95.252 111.5M 55.9M 54.3M 656.8 [paper] [tf-slim]
NASNetAlarge 331 82.498 96.004 186.2M 93.5M 89.5M 2081 [paper] [tf-slim]
NASNetAmobile 224 74.366 91.854 15.3M 7.7M 6.7M 165.8 [paper] [tf-slim]
PNASNetlarge 331 82.634 96.050 171.8M 86.2M 81.9M 1978 [paper] [tf-slim]
VGG16 224 71.268 90.050 276.7M 138.4M 14.7M 348.4 [paper] [keras]
VGG19 224 71.256 89.988 287.3M 143.7M 20.0M 399.8 [paper] [keras]
DenseNet121 224 74.972 92.258 15.8M 8.1M 7.0M 202.9 [paper] [torch]
DenseNet169 224 76.176 93.176 28.0M 14.3M 12.6M 219.1 [paper] [torch]
DenseNet201 224 77.320 93.620 39.6M 20.2M 18.3M 272.0 [paper] [torch]
MobileNet25 224 51.582 75.792 0.9M 0.5M 0.2M 34.46 [paper] [tf-slim]
MobileNet50 224 64.292 85.624 2.6M 1.3M 0.8M 52.46 [paper] [tf-slim]
MobileNet75 224 68.412 88.242 5.1M 2.6M 1.8M 70.11 [paper] [tf-slim]
MobileNet100 224 70.424 89.504 8.4M 4.3M 3.2M 83.41 [paper] [tf-slim]
MobileNet35v2 224 60.086 82.432 3.3M 1.7M 0.4M 57.04 [paper] [tf-slim]
MobileNet50v2 224 65.194 86.062 3.9M 2.0M 0.7M 64.35 [paper] [tf-slim]
MobileNet75v2 224 69.532 89.176 5.2M 2.7M 1.4M 88.68 [paper] [tf-slim]
MobileNet100v2 224 71.336 90.142 6.9M 3.5M 2.3M 93.82 [paper] [tf-slim]
MobileNet130v2 224 74.680 92.122 10.7M 5.4M 3.8M 130.4 [paper] [tf-slim]
MobileNet140v2 224 75.230 92.422 12.1M 6.2M 4.4M 132.9 [paper] [tf-slim]
75v3large 224 73.754 91.618 7.9M 4.0M 2.7M 79.73 [paper] [tf-slim]
100v3large 224 75.790 92.840 27.3M 5.5M 4.2M 94.71 [paper] [tf-slim]
100v3largemini 224 72.706 90.930 7.8M 3.9M 2.7M 70.57 [paper] [tf-slim]
75v3small 224 66.138 86.534 4.1M 2.1M 1.0M 37.78 [paper] [tf-slim]
100v3small 224 68.318 87.942 5.1M 2.6M 1.5M 42.00 [paper] [tf-slim]
100v3smallmini 224 63.440 84.646 4.1M 2.1M 1.0M 29.65 [paper] [tf-slim]
EfficientNetB0 224 77.012 93.338 26.2M 5.3M 4.0M 147.1 [paper] [tf-tpu]
EfficientNetB1 240 79.040 94.284 15.4M 7.9M 6.6M 217.3 [paper] [tf-tpu]
EfficientNetB2 260 80.064 94.862 18.1M 9.2M 7.8M 296.4 [paper] [tf-tpu]
EfficientNetB3 300 81.384 95.586 24.2M 12.3M 10.8M 482.7 [paper] [tf-tpu]
EfficientNetB4 380 82.588 96.094 38.4M 19.5M 17.7M 959.5 [paper] [tf-tpu]
EfficientNetB5 456 83.496 96.590 60.4M 30.6M 28.5M 1872 [paper] [tf-tpu]
EfficientNetB6 528 83.772 96.762 85.5M 43.3M 41.0M 3503 [paper] [tf-tpu]
EfficientNetB7 600 84.088 96.740 131.9M 66.7M 64.1M 6149 [paper] [tf-tpu]
SqueezeNet 224 54.434 78.040 2.5M 1.2M 0.7M 71.43 [paper] [caffe]
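
For reference, the Top-1 and Top-5 columns are percentages of validation images whose true class appears among the k highest-scoring predictions. A minimal sketch of that metric on toy scores (top_k_accuracy is a hypothetical helper and the data is made up; it is not the evaluation script used for the table):

```python
def top_k_accuracy(scores, targets, k):
    """Percentage of rows whose target index is among the k largest scores."""
    hits = 0
    for row, target in zip(scores, targets):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(targets)


# Toy scores for three samples over three classes.
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
toy_targets = [1, 1, 0]
print(top_k_accuracy(toy_scores, toy_targets, k=1))
print(top_k_accuracy(toy_scores, toy_targets, k=2))
```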

[Figure: summary plot]

Object detection

  • The object detection models can be coupled with any network but mAPs could be measured only for the models with pre-trained weights. Note that:
    • YOLOv3VOC was trained by taehoonlee with this recipe modified as max_batches=70000, steps=40000,60000,
    • YOLOv2VOC is equivalent to YOLOv2(inputs, Darknet19),
    • TinyYOLOv2VOC is equivalent to TinyYOLOv2(inputs, TinyDarknet19),
    • FasterRCNN_ZF_VOC is equivalent to FasterRCNN(inputs, ZF),
    • FasterRCNN_VGG16_VOC is equivalent to FasterRCNN(inputs, VGG16, stem_out='conv5/3').
  • The mAPs were obtained with TensorNets and may slightly differ from the original ones. The test input sizes were the numbers reported as the best in the papers:
    • YOLOv3, YOLOv2: 416x416
    • FasterRCNN: min_shorter_side=600, max_longer_side=1000
  • The computation times were measured on NVIDIA Tesla P100 (3584 cores, 16 GB global memory) with cuDNN 6.0 and CUDA 8.0.
    • Size: rounded number of parameters
    • Speed: milliseconds for network inference only, on a single 416x416 or 608x608 image
    • FPS: 1000 / speed
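
The FasterRCNN test sizes above (min_shorter_side=600, max_longer_side=1000) imply a per-image resize factor. The sketch below follows the convention commonly used in Faster R-CNN implementations: scale the shorter side to 600 unless that would push the longer side past 1000. That TensorNets computes the scale in exactly this way is an assumption, and resize_scale is a hypothetical helper:

```python
def resize_scale(height, width, min_shorter_side=600, max_longer_side=1000):
    """Scale factor that brings the shorter side to min_shorter_side,
    capped so the longer side does not exceed max_longer_side."""
    shorter, longer = min(height, width), max(height, width)
    scale = min_shorter_side / shorter
    if scale * longer > max_longer_side:
        scale = max_longer_side / longer
    return scale


print(resize_scale(375, 500))  # shorter side governs
print(resize_scale(300, 900))  # longer side cap governs
```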
PASCAL VOC2007 test mAP Size Speed FPS References
YOLOv3VOC (416) 0.7423 62M 24.09 41.51 [paper] [darknet] [darkflow]
YOLOv2VOC (416) 0.7320 51M 14.75 67.80 [paper] [darknet] [darkflow]
TinyYOLOv2VOC (416) 0.5303 16M 6.534 153.0 [paper] [darknet] [darkflow]
FasterRCNN_ZF_VOC 0.4466 59M 241.4 4.143 [paper] [caffe] [roi-pooling]
FasterRCNN_VGG16_VOC 0.6872 137M 300.7 3.325 [paper] [caffe] [roi-pooling]
MS COCO val2014 mAP Size Speed FPS References
YOLOv3COCO (608) 0.6016 62M 60.66 16.49 [paper] [darknet] [darkflow]
YOLOv3COCO (416) 0.6028 62M 40.23 24.85 [paper] [darknet] [darkflow]
YOLOv2COCO (608) 0.5189 51M 45.88 21.80 [paper] [darknet] [darkflow]
YOLOv2COCO (416) 0.4922 51M 21.66 46.17 [paper] [darknet] [darkflow]
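
The FPS column follows directly from the Speed column as 1000 / speed. A small check against three of the VOC rows (the dictionary simply copies values from the table; fps is a hypothetical helper):

```python
# Speed in milliseconds per image, copied from the PASCAL VOC2007 table.
speeds_ms = {
    "YOLOv3VOC (416)": 24.09,
    "YOLOv2VOC (416)": 14.75,
    "TinyYOLOv2VOC (416)": 6.534,
}


def fps(speed_ms):
    """Frames per second for a given per-image latency in milliseconds."""
    return 1000.0 / speed_ms


for name, ms in speeds_ms.items():
    print("%s: %.2f FPS" % (name, fps(ms)))
```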

News 📰

  • The six variants of MobileNetv3 are released, 12 Mar 2020.
  • The eight variants of EfficientNet are released, 28 Jan 2020.
  • TensorNets can now be used with TF 2, 23 Jan 2020.
  • MS COCO utils are released, 9 Jul 2018.
  • PNASNetlarge is released, 12 May 2018.
  • The six variants of MobileNetv2 are released, 5 May 2018.
  • YOLOv3 for COCO and VOC are released, 4 April 2018.
  • Generic object detection models for YOLOv2 and FasterRCNN are released, 26 March 2018.

Future work 🔥
