lars76 / Object Localization

Licence: MIT
Object localization in images using simple CNNs and Keras

object-localization

This project shows how to localize objects in images by using simple convolutional neural networks.

Dataset

Before getting started, we have to download a dataset and generate a CSV file containing the annotations (bounding boxes).

  1. Download The Oxford-IIIT Pet Dataset
  2. Download The Oxford-IIIT Pet Dataset Annotations
  3. tar xf images.tar.gz
  4. tar xf annotations.tar.gz
  5. mv annotations/xmls/* images/
  6. python3 generate_dataset.py
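
The annotation step can be sketched as follows. This is a minimal illustration, not the actual generate_dataset.py: it assumes the XML files follow the Pascal VOC layout (filename, size, object/bndbox), and the sample annotation, the column order and the helper names parse_annotation and rows_to_csv are all hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC style; the real files end up in images/*.xml.
SAMPLE_XML = """<annotation>
  <filename>Abyssinian_1.jpg</filename>
  <size><width>600</width><height>400</height></size>
  <object>
    <name>cat</name>
    <bndbox><xmin>333</xmin><ymin>72</ymin><xmax>425</xmax><ymax>158</ymax></bndbox>
  </object>
</annotation>"""


def parse_annotation(xml_text):
    """Return (filename, width, height, xmin, ymin, xmax, ymax) for the first object."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    box = root.find("object/bndbox")
    return (
        root.findtext("filename"),
        int(size.findtext("width")),
        int(size.findtext("height")),
        int(box.findtext("xmin")),
        int(box.findtext("ymin")),
        int(box.findtext("xmax")),
        int(box.findtext("ymax")),
    )


def rows_to_csv(rows):
    """Serialize annotation rows to CSV text (assumed column order, see above)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


row = parse_annotation(SAMPLE_XML)
csv_text = rows_to_csv([row])
print(csv_text, end="")
```

Storing the image width and height next to the box makes it possible to rescale the coordinates later when the images are resized for training.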

Single-object detection

Example 1: Finding dogs/cats

Architecture

First, let's look at YOLOv2's approach:

  1. Pretrain Darknet-19 on ImageNet (feature extractor)
  2. Remove the last convolutional layer
  3. Add three 3 x 3 convolutional layers with 1024 filters
  4. Add a 1 x 1 convolutional layer with the number of outputs needed for detection

We proceed in the same way to build the object detector:

  1. Choose a model from Keras Applications (i.e. the feature extractor)
  2. Remove the dense layer
  3. Freeze some/all/no layers
  4. Add one/multiple/no convolution blocks (or _inverted_res_block for MobileNetv2)
  5. Add a convolution layer for the coordinates

The code in this repository uses MobileNetv2 because it is faster than other models and its speed/accuracy trade-off can be tuned. For example, if alpha = 0.35 with 96x96 inputs is not accurate enough, one can simply increase both values (see here for a comparison). If you use another architecture, change preprocess_input accordingly.
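
To see why a single convolution layer is enough to produce the four box coordinates, it helps to follow the spatial sizes. The sketch below is plain arithmetic, not code from this repository. It assumes the feature extractor downsamples by a factor of 32 overall (as MobileNetv2 does) and that the final convolution uses "valid" padding with a kernel as large as the remaining feature map; the kernel size and the helper names are assumptions for illustration.

```python
def backbone_grid_size(image_size, total_stride=32):
    """Side length of the last feature map for a backbone with the given total stride."""
    if image_size % total_stride != 0:
        raise ValueError("this sketch only handles sizes divisible by the total stride")
    return image_size // total_stride


def valid_conv_output_size(input_size, kernel_size):
    """Output side length of a stride-1 convolution with 'valid' padding."""
    if kernel_size > input_size:
        raise ValueError("kernel is larger than the feature map")
    return input_size - kernel_size + 1


for image_size in (96, 128, 224):
    grid = backbone_grid_size(image_size)
    out = valid_conv_output_size(grid, kernel_size=grid)
    print(f"{image_size}x{image_size} input -> {grid}x{grid} feature map "
          f"-> {out}x{out}x4 coordinates")
```

With a 96x96 input the feature map is 3x3, so a 3x3 convolution with 4 filters collapses it to exactly one box. Increasing the image size enlarges the feature map, which is why the kernel size (or the number of added blocks) has to be adapted along with it.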

  1. python3 example_1/train.py
  2. Set WEIGHTS_FILE in example_1/test.py to the weights file produced by the training script
  3. python3 example_1/test.py

Result

In the following images red is the predicted box, green is the ground truth:

Image 1

Image 2
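
How well a predicted box matches the ground truth is usually measured by the intersection over union (IoU). The following is a small sketch of that computation; the (x0, y0, x1, y1) corner format is an assumption and may differ from the format used in the scripts.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    # Corners of the overlapping rectangle (may be empty).
    ix0 = max(box_a[0], box_b[0])
    iy0 = max(box_a[1], box_b[1])
    ix1 = min(box_a[2], box_b[2])
    iy1 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes yield no intersection.
    intersection = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


predicted = (0, 0, 2, 2)
ground_truth = (1, 1, 3, 3)
print(iou(predicted, ground_truth))
```

An IoU of 1 means the boxes coincide and 0 means they do not overlap at all.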

Example 2: Finding dogs/cats and distinguishing classes

This time we have to run the scripts example_2/train.py and example_2/test.py.

Changes

In order to distinguish between classes, we have to modify the loss function. Here I use w_1*log((y_hat - y)^2 + 1) + w_2*FL(p_hat, p), where w_1 = w_2 = 1 are two weights and FL(p_hat, p) = -(0.9*(1 - p_hat)^2*p*log(p_hat) + 0.1*p_hat^2*(1 - p)*log(1 - p_hat)) is the focal loss.
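
The formula can be written out in plain Python to check individual values. This is only a sketch of the formula as stated, not the Keras loss from example_2/train.py. Two details are assumptions: the box term is summed over the four coordinates, and p_hat is clipped by a small eps to avoid log(0).

```python
import math


def box_loss(y_hat, y):
    """log((y_hat - y)^2 + 1), summed over the coordinates (summation is an assumption)."""
    return sum(math.log((a - b) ** 2 + 1) for a, b in zip(y_hat, y))


def focal_loss(p_hat, p, eps=1e-7):
    """Focal loss with the constants 0.9 / 0.1 and exponent 2 from the formula."""
    p_hat = min(max(p_hat, eps), 1 - eps)  # clipping is an added safeguard
    positive = 0.9 * (1 - p_hat) ** 2 * p * math.log(p_hat)
    negative = 0.1 * p_hat ** 2 * (1 - p) * math.log(1 - p_hat)
    return -(positive + negative)


def total_loss(y_hat, y, p_hat, p, w_1=1.0, w_2=1.0):
    """w_1 * box term + w_2 * focal term."""
    return w_1 * box_loss(y_hat, y) + w_2 * focal_loss(p_hat, p)


# A perfect box with an unsure class prediction: only the focal term contributes.
print(total_loss([10, 20, 50, 60], [10, 20, 50, 60], p_hat=0.5, p=1.0))
# A confident, correct class prediction is penalized far less.
print(focal_loss(0.9, 1.0))
```

The factor (1 - p_hat)^2 shrinks the loss for examples that are already classified well, so training concentrates on the hard ones.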

Instead of using all 37 classes, the code only outputs class 0 (the original class 0) or class 1 (the original classes 1 to 36). However, it is easy to extend this to more classes (use categorical cross-entropy instead of focal loss and try out different weights).

Multi-object detection

Example 3: Segmentation-like detection

Architecture

In this example, we use a skip-net architecture similar to U-Net. For an in-depth explanation see my blog post.

Result

Dog

Example 4: YOLO-like detection

Architecture

This example is based on the three YOLO papers. For an in-depth explanation see this blog post.

Result

Multiple dogs

Guidelines

Improve accuracy (IoU)

  • enable augmentations: see example_4; the same code can be added to the other examples
  • better augmentations: try out different values (flips, rotation etc.)
  • for MobileNetv1/2: increase ALPHA and IMAGE_SIZE in train_model.py
  • other architectures: increase IMAGE_SIZE
  • add more layers
  • try out other loss functions (MAE, smooth L1 loss etc.)
  • other optimizer: SGD with momentum 0.9, adjust learning rate
  • use a feature pyramid
  • read https://github.com/keras-team/keras/pull/9965
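
One point to keep in mind with geometric augmentations such as flips: the bounding box has to be transformed together with the image. Below is a minimal sketch for a horizontal flip. It is not the augmentation code from example_4; the (x0, y0, x1, y1) box format and the convention that x runs from 0 to the image width are assumptions.

```python
def flip_image_horizontally(image):
    """Mirror an image given as a list of pixel rows."""
    return [row[::-1] for row in image]


def flip_box_horizontally(box, image_width):
    """Mirror a box (x0, y0, x1, y1) so it still covers the same object."""
    x0, y0, x1, y1 = box
    # The old right edge becomes the new left edge and vice versa.
    return (image_width - x1, y0, image_width - x0, y1)


image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
box = (2, 0, 4, 2)  # covers the two columns of 9s

flipped_image = flip_image_horizontally(image)
flipped_box = flip_box_horizontally(box, image_width=4)
print(flipped_image)
print(flipped_box)
```

Flipping twice returns the original box, which is a quick sanity check for any augmentation that moves coordinates.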

Increase training speed

  • increase BATCH_SIZE
  • use fewer layers and decrease IMAGE_SIZE and ALPHA

Overfitting

  • If the new dataset is small and similar to ImageNet, freeze all layers.
  • If the new dataset is small and not similar to ImageNet, freeze some layers.
  • If the new dataset is large, freeze no layers.
  • read http://cs231n.github.io/transfer-learning/
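
The freezing rules above can be sketched as a small helper. Keras layers expose a boolean trainable attribute, and the sketch sets it on stand-in objects so that it runs without Keras installed. The function names are hypothetical, and freezing exactly half of the layers for the "some layers" case is an arbitrary choice for illustration.

```python
from types import SimpleNamespace


def choose_num_frozen(num_layers, dataset_is_small, similar_to_imagenet):
    """Translate the rules of thumb into a number of layers to freeze."""
    if not dataset_is_small:
        return 0               # large dataset: train everything
    if similar_to_imagenet:
        return num_layers      # small and similar: freeze all layers
    return num_layers // 2     # small and different: freeze "some" (half is arbitrary)


def freeze_layers(layers, num_frozen):
    """Freeze the first num_frozen layers and leave the rest trainable."""
    for index, layer in enumerate(layers):
        layer.trainable = index >= num_frozen


# Stand-ins for the layers of a feature extractor (e.g. model.layers in Keras).
layers = [SimpleNamespace(name=f"layer_{i}", trainable=True) for i in range(6)]

num_frozen = choose_num_frozen(len(layers), dataset_is_small=True, similar_to_imagenet=False)
freeze_layers(layers, num_frozen)
print([(layer.name, layer.trainable) for layer in layers])
```

The early layers are frozen first because they tend to learn generic features such as edges, while the later layers are more specific to the original dataset.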