txytju / Faster-RCNN-LocNet

Licence: other
A simplified implementation of the paper: Improved Localization Accuracy by LocNet for Faster R-CNN Based Text Detection

Improved Localization Accuracy by LocNet for Faster R-CNN

1. Introduction

This project is a simplified implementation of Faster R-CNN improved by LocNet (Loc-Faster-RCNN for short), based on the Faster R-CNN implementation by chenyuntc. It aims to:

  • Improve the localization accuracy of Faster R-CNN by using LocNet in the Fast R-CNN part.
  • Provide the first public implementation of the original paper. The authors of the paper did not release their version.
  • Match the performance reported in the original paper.

And it has the following features:

  • It runs as pure Python code, with no build step required. (The CUDA code has moved to CuPy, and Cython acceleration is optional.)

This implementation is slightly different from the original paper:

  • Skip pooling is not used here. Information from the conv5_3 layer (the feature map of the original Faster R-CNN) is enough for my task, so skip pooling is dropped in this repo. What's more, with the advent of newer methods like Feature Pyramid Networks, skip pooling seems to be obsolete :)
  • The RPN is exactly the same as in Faster R-CNN, which means only a 3x3 conv is applied, rather than the 3x3 and 5x5 convs used in the original paper.
  • Training strategy. The original paper trains the RPN and LocNet alternately, but in this repo the losses of the RPN and LocNet are backpropagated at the same time.

prob_thre:

  • Hyperparameters in Loc-Faster-RCNN are mostly the same as in Faster R-CNN, except for prob_thre.
  • prob_thre is the probability threshold used when predicting the bounding box. If px or py is greater than prob_thre, the corresponding row or column is considered to be part of some object.
  • Different detection tasks may need different values of prob_thre to achieve the best performance. If most objects in the detection task are dense blocks, a higher prob_thre may achieve better performance.
  • Choose prob_thre according to the characteristics of your task. Use the eval_prob_thre function in train.py to find the best prob_thre for your task. Remember to set the load_path variable in utils/config.py to your best model before calling this function.
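To make the role of prob_thre more concrete, here is a minimal sketch of how a box could be read off from per-column and per-row probabilities. It is an illustration only, not the code used in this repo: the function names decode_box and _extent are hypothetical, and it assumes that px holds one probability per column, py holds one per row, and the box simply spans the first to the last index above the threshold.

```python
from typing import Optional, Sequence, Tuple


def _extent(probs: Sequence[float], prob_thre: float) -> Optional[Tuple[int, int]]:
    """Return (first, last) index whose probability exceeds prob_thre, or None."""
    inside = [i for i, p in enumerate(probs) if p > prob_thre]
    if not inside:
        return None
    return inside[0], inside[-1]


def decode_box(
    px: Sequence[float], py: Sequence[float], prob_thre: float = 0.5
) -> Optional[Tuple[int, int, int, int]]:
    """Hypothetical decoder: (x_min, y_min, x_max, y_max) in grid indices.

    Assumption: px has one entry per column and py one entry per row.
    Returns None when no column or no row exceeds the threshold.
    """
    xs = _extent(px, prob_thre)
    ys = _extent(py, prob_thre)
    if xs is None or ys is None:
        return None
    return xs[0], ys[0], xs[1], ys[1]


# Toy probabilities for a 6x6 search region (made-up values).
px = [0.1, 0.6, 0.9, 0.8, 0.55, 0.2]
py = [0.05, 0.3, 0.7, 0.95, 0.4, 0.1]

box_default = decode_box(px, py, prob_thre=0.5)   # looser threshold, larger box
box_strict = decode_box(px, py, prob_thre=0.75)   # stricter threshold, tighter box
```

With these toy values, raising prob_thre from 0.5 to 0.75 shrinks the box, which is why the appropriate value depends on how compact the objects in your task are.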

2. Performance

2.1 Pascal VOC

The training and test sets of Pascal VOC 2007 are used in this repo.

2.1.1 mAP

The best prob_thre for Pascal VOC is 0.5. With prob_thre=0.5, the performance of Loc-Faster-RCNN is listed below. On a dataset like Pascal VOC, Loc-Faster-RCNN does not achieve a better result than Faster R-CNN. However, when applied to datasets with many small and dense objects, Loc-Faster-RCNN is likely to achieve better performance.

Implementation    | mAP
Loc-Faster-RCNN   | 0.6527
Faster R-CNN      | 0.7097

2.1.2 Differences between model predictions on Pascal VOC

  • The LocNet part improves the localization accuracy of Loc-Faster-RCNN by predicting probabilities rather than locations. This helps when the model is used to detect small objects or objects that are not very obvious. As shown in the first 2 rows below, Loc-Faster-RCNN detected a person (row 1) and a plant (row 2) even though the objects are small and not obvious.
  • However, the LocNet part also hinders the model from identifying small parts of objects that are more densely connected with the background than with the main part of the object, like the tail of a cat or the wings of a bird, as shown in the 3rd to 5th rows below.
  • What's more, if objects overlap or are densely connected with each other in the same image, Loc-Faster-RCNN also has difficulty drawing accurate bounding boxes around them, as shown in the last row below.

[Figure: example detections, one row per image, with columns Ground Truth, Loc-Faster-RCNN, and Faster R-CNN]

2.2 Text detection dataset

ICDAR-2011 and ICDAR-2013 are used for training and evaluation.

TBD.

3. Install dependencies

This repo is built largely on Faster R-CNN. Check that repo for the dependencies.

4. Train

Compared with Faster R-CNN, Loc-Faster-RCNN is a little harder to train. If the same initial learning rate of 1e-3 is applied, the model may fail to converge after several epochs because px or py becomes NaN. If you encounter the same problem when using Loc-Faster-RCNN on your own dataset, a smaller learning rate of 1e-4 or 1e-5 may work.
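The fallback described above can be sketched as a small helper that tries the learning rates in order and keeps the first one whose trial loss stays finite. This is only an illustration under stated assumptions: first_stable_lr and toy_run_epoch are hypothetical names, not part of this repo, and run_epoch stands in for whatever short training run you use to probe a learning rate.

```python
import math
from typing import Callable, Optional, Sequence


def first_stable_lr(
    run_epoch: Callable[[float], float],
    candidates: Sequence[float] = (1e-3, 1e-4, 1e-5),
) -> Optional[float]:
    """Return the first learning rate whose trial loss is finite, else None.

    run_epoch(lr) is assumed to train briefly with that learning rate
    and return the resulting loss as a float (NaN or inf on divergence).
    """
    for lr in candidates:
        loss = run_epoch(lr)
        if math.isfinite(loss):
            return lr
    return None


def toy_run_epoch(lr: float) -> float:
    """Stand-in for a real training run: pretends to diverge above 5e-4."""
    return float("nan") if lr > 5e-4 else 0.42


chosen_lr = first_stable_lr(toy_run_epoch)
```

In a real setup, run_epoch would wrap a few iterations of the training loop, and the NaN check would typically look at the loss (or at px and py directly) after each step.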

Troubleshooting

More

  • Model structure
  • Maybe: skip pooling
  • Maybe: 3x3 and 5x5 convs in the RPN
  • Highly likely: Feature Pyramid Network as backbone
  • Highly likely: RoI Align rather than RoI Pooling

Acknowledgement

This work builds on many excellent works, which include:


Licensed under MIT, see the LICENSE for more detail.

Contribution Welcome.

If you encounter any problem, feel free to open an issue.

Correct me if anything is wrong or unclear.