
chenyuntc / Simple Faster Rcnn Pytorch

Licence: other
A simplified implementation of Faster R-CNN that replicates the performance of the original paper

Programming Languages

Jupyter Notebook
Python

Projects that are alternatives of or similar to Simple Faster Rcnn Pytorch

Traffic Sign Detection
Traffic Sign Detection. Code for the paper entitled "Evaluation of deep neural networks for traffic sign detection systems".
Stars: ✭ 200 (-94.16%)
Mutual labels:  object-detection, jupyter-notebook, faster-rcnn
Tf Faster Rcnn
Tensorflow Faster RCNN for Object Detection
Stars: ✭ 3,604 (+5.32%)
Mutual labels:  object-detection, faster-rcnn, voc
Keras Faster Rcnn
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Stars: ✭ 28 (-99.18%)
Mutual labels:  object-detection, jupyter-notebook, faster-rcnn
Py R Fcn Multigpu
Code for training py-faster-rcnn and py-R-FCN on multiple GPUs in caffe
Stars: ✭ 192 (-94.39%)
Mutual labels:  object-detection, jupyter-notebook, faster-rcnn
Robust Physical Attack
Physical adversarial attack for fooling the Faster R-CNN object detector
Stars: ✭ 115 (-96.64%)
Mutual labels:  object-detection, jupyter-notebook, faster-rcnn
Spoonn
FPGA-based neural network inference project with an end-to-end approach (from training to implementation to deployment)
Stars: ✭ 186 (-94.56%)
Mutual labels:  object-detection, jupyter-notebook
Deep Learning With Python
Deep learning codes and projects using Python
Stars: ✭ 195 (-94.3%)
Mutual labels:  object-detection, jupyter-notebook
Nas fpn tensorflow
NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection.
Stars: ✭ 198 (-94.21%)
Mutual labels:  object-detection, jupyter-notebook
Luminoth
Deep Learning toolkit for Computer Vision.
Stars: ✭ 2,386 (-30.27%)
Mutual labels:  object-detection, faster-rcnn
Syndata Generation
Code used to generate synthetic scenes and bounding box annotations for object detection. This was used to generate data used in the Cut, Paste and Learn paper
Stars: ✭ 214 (-93.75%)
Mutual labels:  object-detection, faster-rcnn
Paddledetection
Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
Stars: ✭ 5,799 (+69.46%)
Mutual labels:  object-detection, faster-rcnn
Yolov3 Tf2
YoloV3 Implemented in Tensorflow 2.0
Stars: ✭ 2,327 (-32%)
Mutual labels:  object-detection, jupyter-notebook
Dockerface
Face detection using deep learning.
Stars: ✭ 173 (-94.94%)
Mutual labels:  object-detection, faster-rcnn
Shape Detection
🟣 Object detection of abstract shapes with neural networks
Stars: ✭ 170 (-95.03%)
Mutual labels:  object-detection, jupyter-notebook
Icevision
End-to-End Object Detection Framework - Pluggable to any Training Library: Fastai, Pytorch-Lightning with more to come
Stars: ✭ 218 (-93.63%)
Mutual labels:  object-detection, faster-rcnn
Face mask detection
Face mask detection system using Deep learning.
Stars: ✭ 168 (-95.09%)
Mutual labels:  object-detection, jupyter-notebook
Mmdetection
OpenMMLab Detection Toolbox and Benchmark
Stars: ✭ 17,646 (+415.66%)
Mutual labels:  object-detection, faster-rcnn
Taco
🌮 Trash Annotations in Context Dataset Toolkit
Stars: ✭ 243 (-92.9%)
Mutual labels:  object-detection, jupyter-notebook
Siamese Mask Rcnn
Siamese Mask R-CNN model for one-shot instance segmentation
Stars: ✭ 257 (-92.49%)
Mutual labels:  object-detection, jupyter-notebook
Ios Coreml Yolo
Almost Real-time Object Detection using Apple's CoreML and YOLO v1
Stars: ✭ 153 (-95.53%)
Mutual labels:  object-detection, jupyter-notebook

A Simple and Fast Implementation of Faster R-CNN

1. Introduction

[Update:] I've further simplified the code to work with PyTorch 1.5 and torchvision 0.6, and replaced the customized ops (RoI pooling and NMS) with the ones from torchvision. If you want the old version of the code, please check out branch v1.0.

This project is a simplified Faster R-CNN implementation based on chainercv and other projects. I hope it can serve as starter code for those who want to learn the details of Faster R-CNN. It aims to:

  • Simplify the code (Simple is better than complex)
  • Make the code more straightforward (Flat is better than nested)
  • Match the performance reported in the original paper (Speed counts and mAP matters)

And it has the following features:

  • It can be run as pure Python code; no build step is required.
  • It's a minimal implementation of around 2000 lines of code, with plenty of comments and instructions (thanks to chainercv's excellent documentation).
  • It achieves higher mAP than the original implementation (0.712 vs. 0.699).
  • It achieves speed comparable with other implementations (6 fps for training and 14 fps for inference on a TITAN Xp).
  • It's memory-efficient (about 3 GB for VGG16).


2. Performance

2.1 mAP

VGG16, trained on the trainval split and tested on the test split.

Note: training shows considerable randomness; you may need a bit of luck and more epochs of training to reach the highest mAP. However, it should be easy to surpass the lower bound. (A seed-fixing sketch follows the table below.)

Implementation                                      mAP
original paper                                      0.699
train with caffe pretrained model                   0.700-0.712
train with torchvision pretrained model             0.685-0.701
model converted from chainercv (reported 0.706)     0.7053
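
Since results vary from run to run (see the note above), fixing the random seeds makes comparisons between runs somewhat more meaningful. This is a generic PyTorch sketch, not something the training script does by itself:

# Generic reproducibility sketch; not part of the original training script.
import random
import numpy as np
import torch

def seed_everything(seed=0):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)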

2.2 Speed

Implementation         GPU         Inference    Training
original paper         K40         5 fps        NA
This [1]               TITAN Xp    14-15 fps    6 fps
pytorch-faster-rcnn    TITAN Xp    15-17 fps    6 fps

[1]: Make sure you install cupy correctly and that only one program is running on the GPU. The training speed is sensitive to the GPU's status; see troubleshooting for more info. Moreover, it's slow at the start of the program: it needs time to warm up.

It could be made faster by removing visualization, logging, loss averaging, etc.

3. Install dependencies

Here is an example of creating the environment from scratch with Anaconda:

# create conda env
conda create --name simp python=3.7
conda activate simp
# install pytorch
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch

# install other dependencies
pip install visdom scikit-image tqdm fire ipdb pprint matplotlib torchnet

# start visdom
nohup python -m visdom.server &

If you don't use anaconda, then:

  • install PyTorch with GPU support (the code is GPU-only); refer to the official website

  • install other dependencies: pip install visdom scikit-image tqdm fire ipdb pprint matplotlib torchnet

  • start visdom for visualization

nohup python -m visdom.server &
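
Either way, a quick sanity check that the GPU environment is usable can save time later. A minimal sketch (the cupy check only matters for the old v1.0 custom-ops code path):

# check_env.py -- a minimal sketch to verify the GPU environment.
import torch

# The code in this repo is GPU-only, so CUDA must be available.
assert torch.cuda.is_available(), "CUDA not available; install a GPU build of PyTorch."
print("PyTorch:", torch.__version__, "| device:", torch.cuda.get_device_name(0))

# cupy is only needed for the old custom ops (branch v1.0); the current code
# uses the roi_pool and nms ops from torchvision instead.
try:
    import cupy
    print("cupy:", cupy.__version__)
except ImportError:
    print("cupy not installed (fine for the torchvision-ops code path).")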

4. Demo

Download pretrained model from Google Drive or Baidu Netdisk( passwd: scxn)

See demo.ipynb for more detail.
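
For a rough idea of what the notebook does, inference looks something like the sketch below. The module paths and class names (FasterRCNNVGG16, FasterRCNNTrainer, read_image) are taken from this repo's layout; the checkpoint path is whatever you downloaded above:

# A rough inference sketch; demo.ipynb is the authoritative version.
import torch
from model import FasterRCNNVGG16      # this repo's VGG16-based Faster R-CNN
from trainer import FasterRCNNTrainer  # wraps the model with load/save helpers
from data.util import read_image       # loads an image as a CHW float array

faster_rcnn = FasterRCNNVGG16()
trainer = FasterRCNNTrainer(faster_rcnn).cuda()
trainer.load('/path/to/pretrained_model.pth')  # the checkpoint downloaded above

img = read_image('misc/demo.jpg')      # any test image; this path is just an example
img = torch.from_numpy(img)[None]      # add a batch dimension
# predict() returns per-image lists of bounding boxes, class labels, and scores.
bboxes, labels, scores = trainer.faster_rcnn.predict(img, visualize=True)
print(bboxes[0], labels[0], scores[0])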

5. Train

5.1 Prepare data

Pascal VOC2007

  1. Download the training, validation, test data and VOCdevkit

    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
  2. Extract all of these tars into one directory named VOCdevkit

    tar xvf VOCtrainval_06-Nov-2007.tar
    tar xvf VOCtest_06-Nov-2007.tar
    tar xvf VOCdevkit_08-Jun-2007.tar
  3. It should have this basic structure

    $VOCdevkit/                           # development kit
    $VOCdevkit/VOCcode/                   # VOC utility code
    $VOCdevkit/VOC2007                    # image sets, annotations, etc.
    # ... and several other directories ...
  4. Modify the voc_data_dir cfg item in utils/config.py, or pass it to the program with an argument like --voc-data-dir=/path/to/VOCdevkit/VOC2007/. (A quick layout check is sketched below.)
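
Before training, it may be worth verifying that the extracted dataset has the layout the dataloader expects. A minimal sketch, using the standard VOC2007 directory names:

# Verify the VOC2007 layout; a minimal sketch.
import os

voc_data_dir = '/path/to/VOCdevkit/VOC2007'  # same value as --voc-data-dir
for sub in ('Annotations', 'ImageSets/Main', 'JPEGImages'):
    path = os.path.join(voc_data_dir, sub)
    assert os.path.isdir(path), 'missing ' + path

# trainval.txt and test.txt list the image ids used for the two splits.
with open(os.path.join(voc_data_dir, 'ImageSets/Main/trainval.txt')) as f:
    print('trainval images:', len(f.read().split()))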

5.2 [Optional] Prepare the caffe-pretrained VGG16

If you want to use the caffe-pretrained model as initial weights, you can run the script below to get VGG16 weights converted from caffe, which are the same weights the original paper used.

python misc/convert_caffe_pretrain.py

This script downloads the pretrained model and converts it to a format compatible with torchvision. If you are in China and cannot download the pretrained model, you may refer to this issue.

Then you can specify where the caffe-pretrained model vgg16_caffe.pth is stored in utils/config.py by setting caffe_pretrain_path. The default path is fine.
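
To make the converted format concrete: assuming vgg16_caffe.pth stores a plain state dict for torchvision's VGG16 (which is what "compatible with torchvision" suggests), loading it would look like this illustrative sketch:

# Illustrative only: load the converted caffe weights into torchvision's VGG16.
import torch
from torchvision.models import vgg16

model = vgg16(pretrained=False)             # architecture only, no weights
state_dict = torch.load('vgg16_caffe.pth')  # produced by convert_caffe_pretrain.py
model.load_state_dict(state_dict)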

If you want to use pretrained model from torchvision, you may skip this step.

NOTE: the caffe-pretrained model has shown slightly better performance.

NOTE: the caffe model requires images in BGR with values in 0-255, while the torchvision model requires images in RGB with values in 0-1. See data/dataset.py for more detail.
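
The two conventions correspond roughly to the preprocessing below. This is a sketch of the idea rather than the exact code in data/dataset.py; the mean/std numbers shown are the standard caffe and torchvision ImageNet statistics, and the values actually used in this repo may differ slightly:

import numpy as np

def caffe_normalize(img):
    # img: float32 RGB array in [0, 1], shape (3, H, W)
    img = img[::-1] * 255.0  # RGB -> BGR, scale to 0-255
    mean = np.array([103.939, 116.779, 123.68]).reshape(3, 1, 1)  # standard caffe BGR means
    return (img - mean).astype(np.float32)

def torchvision_normalize(img):
    # img: float32 RGB array in [0, 1], shape (3, H, W)
    mean = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
    std = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)
    return ((img - mean) / std).astype(np.float32)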

5.3 Begin training

python train.py train --env='fasterrcnn' --plot-every=100

You may refer to utils/config.py for more arguments.

Some key arguments (a sketch of how they are parsed follows this list):

  • --caffe-pretrain=False: use the pretrained model from caffe or torchvision (default: torchvision)
  • --plot-every=n: visualize predictions, loss, etc. every n batches.
  • --env: visdom env for visualization
  • --voc_data_dir: where the VOC data is stored
  • --use-drop: use dropout in the RoI head, default False
  • --use-Adam: use Adam instead of SGD, default SGD. (You need to set a very low lr for Adam.)
  • --load-path: pretrained model path, default None; if specified, it will be loaded.
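
These flags are handled by the fire library, which maps command-line arguments onto Python keyword arguments. An illustrative sketch of the mechanism (not the repo's actual train.py, which updates a global config object instead):

# fire_sketch.py -- illustrative only: how fire turns kwargs into CLI flags.
import fire

def train(env='fasterrcnn', plot_every=100, voc_data_dir=None,
          use_drop=False, load_path=None):
    # The real script would update the config with these values and start training.
    print(env, plot_every, voc_data_dir, use_drop, load_path)

if __name__ == '__main__':
    fire.Fire()  # usage: python fire_sketch.py train --env='fasterrcnn' --plot-every=100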

You may open a browser, visit http://<ip>:8097, and see the visualization of the training procedure, as below:

[visdom visualization screenshot]

Troubleshooting

  • dataloader: received 0 items of ancdata

    see discussion; it's already fixed in train.py (the fix is sketched after this list), so you should be free from this problem.

  • Windows support

    I don't have a Windows machine with a GPU to debug and test on. Pull requests that add and test Windows support are welcome.
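
For reference, the usual workaround for the ancdata error above is to raise the process's open-file limit before the dataloaders start. A sketch of the commonly cited fix (I believe this is essentially what train.py applies, but treat the exact numbers as an assumption):

# Raise the open-file soft limit to avoid "received 0 items of ancdata",
# which occurs when dataloader workers pass many file descriptors around.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (20480, hard))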

Acknowledgement

This work builds on many excellent projects, including chainercv and the other Faster R-CNN implementations mentioned above.

^_^

Licensed under MIT; see the LICENSE file for more detail.

Contributions welcome.

If you encounter any problem, feel free to open an issue, though I have been too busy lately to respond promptly.

Correct me if anything is wrong or unclear.

[model structure diagram]
