andrewliao11 / py-faster-rcnn-imagenet

Licence: other
Train Faster R-CNN on the ImageNet dataset. Related blog post: https://andrewliao11.github.io/object/detection/2016/07/23/detection/

Training Faster RCNN on Imagenet

If you want to learn the basic ideas behind Faster R-CNN, check out Video Object Detection using Faster R-CNN!

Feel free to contact me via email, I'll try to give you a hand if I can, lol.

preparing data

ILSVRC13
├─── ILSVRC2013_DET_val
│    │   *.JPEG (image files, e.g. ILSVRC2013_val_00000565.JPEG)
├─── ILSVRC2013_DET_bbox_val
│    │   *.xml (an example is in ./misc/ILSVRC2012_val_00018464.xml in this repo)
└─── data
     │   meta_det.mat
     └─── det_lists
              │  val1.txt, val2.txt
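
Before training, it can help to confirm that the devkit folder matches the layout above. The following is a minimal sketch, not part of this repo: the helper name `missing_entries` and the list `EXPECTED_ENTRIES` are made up for illustration, and the check only looks at the folders and files named in the tree.

```python
from pathlib import Path

# Entries the layout above expects, relative to the devkit root (e.g. ILSVRC13).
EXPECTED_ENTRIES = [
    "ILSVRC2013_DET_val",
    "ILSVRC2013_DET_bbox_val",
    "data/meta_det.mat",
    "data/det_lists/val1.txt",
    "data/det_lists/val2.txt",
]


def missing_entries(devkit_root):
    """Return the expected entries that do not exist under devkit_root."""
    root = Path(devkit_root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


# Usage: an empty list means the layout looks complete.
# print(missing_entries("/path/to/ILSVRC13"))
```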

meta_det.mat holds the category list.
Load the meta_det.mat file with:

classes = sio.loadmat(os.path.join(self._devkit_path, 'data', 'meta_det.mat'))

Construct IMDB file

There are several files you need to modify.

factory_imagenet.py

This file is in the directory $FRCNN_ROOT/lib/datasets ($FRCNN_ROOT is where your Faster R-CNN checkout is located) and is called by train_net_imagenet.py.
It is the interface for loading the imdb.

for split in ['train', 'val', 'val1', 'val2', 'test']:
    name = 'imagenet_{}'.format(split)
    devkit_path = '/media/VSlab2/imagenet/ILSVRC13'
    __sets[name] = (lambda split=split, devkit_path=devkit_path:datasets.imagenet.imagenet(split,devkit_path))
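
The `split=split, devkit_path=devkit_path` default arguments in the lambda matter: they freeze the loop values at the moment each entry is registered. The sketch below illustrates the difference with a plain closure. `make_dataset` is a hypothetical stand-in for `datasets.imagenet.imagenet`, and the path is a placeholder.

```python
def make_dataset(split, devkit_path):
    # Hypothetical stand-in for datasets.imagenet.imagenet(split, devkit_path).
    return (split, devkit_path)


SPLITS = ['train', 'val', 'val1', 'val2', 'test']
DEVKIT_PATH = '/path/to/ILSVRC13'  # placeholder path

late_bound = {}
early_bound = {}
for split in SPLITS:
    name = 'imagenet_{}'.format(split)
    # Plain closure: `split` is looked up when the lambda is called,
    # i.e. after the loop has finished, so every entry sees the last split.
    late_bound[name] = lambda: make_dataset(split, DEVKIT_PATH)
    # Default argument: evaluated now, so each entry keeps its own split.
    early_bound[name] = lambda split=split: make_dataset(split, DEVKIT_PATH)

print(late_bound['imagenet_val1']()[0])   # test
print(early_bound['imagenet_val1']()[0])  # val1
```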

imagenet.py

In function __init__(self, image_set, devkit_path)

We have to increase the number of categories from 20+1 to 200+1. Note that in the ImageNet dataset, an object category is identified by an ID like "n02691156" instead of a name like "airplane".

# 'val1' / 'val2' -> 'val', so images are read from ILSVRC2013_DET_val
self._data_path = os.path.join(self._devkit_path, 'ILSVRC2013_DET_' + self._image_set[:-1])
synsets = sio.loadmat(os.path.join(self._devkit_path, 'data', 'meta_det.mat'))
# index 0 is reserved for the background class
self._classes = ('__background__',)
self._wnid = (0,)
for i in xrange(200):
    # field 2 is the class name, field 1 is the category ID
    self._classes = self._classes + (synsets['synsets'][0][i][2][0],)
    self._wnid = self._wnid + (synsets['synsets'][0][i][1][0],)
self._wnid_to_ind = dict(zip(self._wnid, xrange(self.num_classes)))
self._class_to_ind = dict(zip(self.classes, xrange(self.num_classes)))

self._classes holds the class names (e.g. "airplane")
self._wnid holds the category IDs (e.g. "n02691156")
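
The same lookup tables can be built outside the class, which is handy for checking the mapping on its own. This is only a sketch: `build_class_tables` is a made-up helper, the toy records stand in for `synsets['synsets'][0]`, and their values and order are illustrative rather than taken from meta_det.mat. It assumes the same indexing as above (field 1 is the category ID, field 2 is the class name).

```python
def build_class_tables(synsets, num_classes=200):
    """Build class tuples and lookup tables from meta_det.mat-style records.

    Each record is assumed to be indexable as record[1][0] -> category ID
    and record[2][0] -> class name, matching the indexing used above.
    """
    classes = ['__background__']
    wnids = [0]  # background has no category ID; 0 is the placeholder used above
    for record in synsets[:num_classes]:
        wnids.append(record[1][0])
        classes.append(record[2][0])
    wnid_to_ind = {wnid: i for i, wnid in enumerate(wnids)}
    class_to_ind = {name: i for i, name in enumerate(classes)}
    return tuple(classes), tuple(wnids), wnid_to_ind, class_to_ind


# Toy records standing in for synsets['synsets'][0]; illustrative values only.
toy_synsets = [
    ([1], ['n02691156'], ['airplane']),
    ([2], ['n02084071'], ['dog']),
]
classes, wnids, wnid_to_ind, class_to_ind = build_class_tables(toy_synsets)
print(classes)                   # ('__background__', 'airplane', 'dog')
print(wnid_to_ind['n02691156'])  # 1
```

Collecting into lists and converting once avoids the repeated tuple concatenation, and the dict comprehensions run under both Python 2.7 and Python 3.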

In function _load_imagenet_annotation(self, index)

The original PASCAL VOC loader subtracts 1 from every coordinate, because PASCAL VOC coordinates start from 1 and have to be shifted to start from 0. This is not true for ImageNet, so we should not subtract 1.
So we need to modify these lines to:

for ix, obj in enumerate(objs):
    x1 = float(get_data_from_tag(obj, 'xmin'))
    y1 = float(get_data_from_tag(obj, 'ymin'))
    x2 = float(get_data_from_tag(obj, 'xmax'))
    y2 = float(get_data_from_tag(obj, 'ymax'))
    cls = self._wnid_to_ind[str(get_data_from_tag(obj, "name")).lower().strip()]
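
To see the effect on a single annotation, here is a standalone sketch that parses boxes without the "- 1" shift. It uses `xml.etree.ElementTree` instead of the repo's `get_data_from_tag` helper, and the sample XML is an assumed PASCAL VOC style layout; check ./misc/ILSVRC2012_val_00018464.xml for the real structure. `load_boxes` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Assumed annotation layout (PASCAL VOC style); values are illustrative.
SAMPLE_XML = """
<annotation>
  <object>
    <name>n02691156</name>
    <bndbox>
      <xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax>
    </bndbox>
  </object>
</annotation>
"""


def load_boxes(xml_text, wnid_to_ind):
    """Return a list of (x1, y1, x2, y2, class_index) tuples."""
    boxes = []
    root = ET.fromstring(xml_text.strip())
    for obj in root.iter('object'):
        bndbox = obj.find('bndbox')
        # Coordinates are used as-is: no "- 1" shift as in the PASCAL VOC loader.
        x1 = float(bndbox.findtext('xmin'))
        y1 = float(bndbox.findtext('ymin'))
        x2 = float(bndbox.findtext('xmax'))
        y2 = float(bndbox.findtext('ymax'))
        wnid = obj.findtext('name').lower().strip()
        boxes.append((x1, y1, x2, y2, wnid_to_ind[wnid]))
    return boxes


print(load_boxes(SAMPLE_XML, {'n02691156': 1}))
# [(10.0, 20.0, 110.0, 220.0, 1)]
```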

Note that in Faster R-CNN we don't need to run selective search, which is the main difference from Fast R-CNN.

Modify the prototxt

Under the directory $FRCNN_ROOT/

train.prototxt

Change the number of classes to 200+1:

param_str: "'num_classes': 201"

In layer "bbox_pred", change the number of outputs to (200+1)*4:

num_output: 804
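
The two numbers above follow from simple arithmetic: one extra class for the background, and four box coordinates per class. A small sketch (the function name `detection_head_sizes` is made up for illustration):

```python
def detection_head_sizes(num_object_classes):
    """Return the prototxt values implied by the number of object categories."""
    num_classes = num_object_classes + 1  # add the __background__ class
    return {
        'num_classes': num_classes,
        'bbox_pred_num_output': num_classes * 4,  # 4 box coordinates per class
    }


print(detection_head_sizes(20))   # {'num_classes': 21, 'bbox_pred_num_output': 84}
print(detection_head_sizes(200))  # {'num_classes': 201, 'bbox_pred_num_output': 804}
```

In the stock py-faster-rcnn prototxt the "cls_score" layer's num_output also equals the class count, so it probably needs to become 201 as well. That is not stated above, so check your own prototxt.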

You can modify the test.prototxt in the same way.

[Last step] Modify the shell script

Under the directory $FRCNN_ROOT/experiments/scripts

faster_rcnn_end2end_imagenet.sh

You can specify which datasets to train and test on, and which pre-trained model to initialize from:

ITERS=100000
DATASET_TRAIN=imagenet_val1
DATASET_TEST=imagenet_val2
NET_INIT=data/imagenet_models/${NET}.v2.caffemodel

Start to Train Faster RCNN On Imagenet!

Run $FRCNN_ROOT/experiments/scripts/faster_rcnn_end2end_imagenet.sh.
The .sh file is used in the same way as in the original Faster R-CNN.

Experiment

This is the mean/median AP at different iterations. The highest mean AP is reached at 90000 iterations.

The original Faster R-CNN paper reports 59.9% mAP on PASCAL VOC 2007, which contains only 20 categories. My network achieves 33.1% mAP, which is low compared to the original work. This is the trade-off for increasing the diversity of the object categories.

The low accuracy is due to:

  • Smaller dataset (ImageNet validation1)
  • More diverse object categories

So here I also present the result on the overlapping categories. On the object categories that also appear in PASCAL VOC 2007 (12 categories), my model achieves 48.7% mAP, which is much higher than the mAP over all 200 categories.
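
For reference, the mean and median AP over all categories, or over a subset such as the PASCAL VOC overlap, can be computed from per-category AP values as sketched below. `summarize_ap` is a hypothetical helper, and the category names and AP numbers are made-up placeholders, not results from this project.

```python
from statistics import mean, median


def summarize_ap(ap_by_class, subset=None):
    """Return mean/median AP over all classes, or over `subset` if given."""
    if subset is None:
        names = list(ap_by_class)
    else:
        names = [name for name in subset if name in ap_by_class]
    values = [ap_by_class[name] for name in names]
    return {
        'mean_ap': mean(values),
        'median_ap': median(values),
        'num_classes': len(values),
    }


# Placeholder values for illustration only; not measured results.
toy_ap = {'airplane': 0.5, 'dog': 0.75, 'hammer': 0.25, 'ladle': 0.0}
print(summarize_ap(toy_ap)['mean_ap'])                               # 0.375
print(summarize_ap(toy_ap, subset=['airplane', 'dog'])['mean_ap'])   # 0.625
```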

I also present the AP for each category in ImageNet.

Demo

Just run demo.py to visualize pictures!

faster rcnn with tracker on videos

Original video "https://www.jukinmedia.com/videos/view/5655"

Reference

How to train fast rcnn on imagenet
