Cascade R-CNN: Delving into High Quality Object Detection

by Zhaowei Cai and Nuno Vasconcelos

This repository is written by Zhaowei Cai at UC San Diego.

Introduction

This repository implements multiple popular object detection algorithms, including Faster R-CNN, R-FCN, FPN, and our recently proposed Cascade R-CNN, on the MS-COCO and PASCAL VOC datasets. Multiple choices are available for the backbone network, including AlexNet, VGG-Net and ResNet. It is written in C++ and powered by the Caffe deep learning toolbox.

Cascade R-CNN is a multi-stage extension of the popular two-stage R-CNN object detection framework. The goal is high quality object detection that can effectively reject close false positives. It consists of a sequence of detectors trained end-to-end with increasing IoU thresholds, so that each stage is more selective against close false positives than the last. The output of each stage detector is forwarded to the next, and the detection results improve stage by stage. This idea can be applied to any detector based on the two-stage R-CNN framework, including Faster R-CNN, R-FCN, FPN, Mask R-CNN, etc., and reliable gains are available independently of baseline strength. A vanilla Cascade R-CNN on an FPN detector with a ResNet-101 backbone, without any training or inference bells and whistles, achieved state-of-the-art results on the challenging MS-COCO dataset.
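
The stage-wise idea above can be sketched in plain Python. This is only an illustrative sketch, not the repository's Caffe implementation: `iou` and `assign_labels` are hypothetical stand-ins for the per-stage proposal labeling, and the 0.5/0.6/0.7 schedule follows the paper's three-stage setup.

```python
# Illustrative sketch of stage-wise label assignment in Cascade R-CNN.
# Not the actual Caffe implementation; helper names are hypothetical.

def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def assign_labels(proposals, gt_boxes, threshold):
    """Mark a proposal positive only if it overlaps some ground-truth
    box above the stage's IoU threshold; later stages are stricter."""
    return [1 if max((iou(p, g) for g in gt_boxes), default=0.0) >= threshold
            else 0
            for p in proposals]

# Three stages with increasing IoU thresholds, as in the paper.
STAGE_THRESHOLDS = [0.5, 0.6, 0.7]
```

Each stage regresses its proposals before passing them on, so the increasingly strict thresholds see progressively better-aligned boxes rather than starving later stages of positives.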

Update

The re-implementation of Cascade R-CNN in Detectron has been released. See Detectron-Cascade-RCNN. Very consistent improvements are available for all tested models, independent of baseline strength.

It is also recommended to use the third-party implementations: mmdetection, based on PyTorch, and tensorpack, based on TensorFlow.

Citation

If you use our code/model/data, please cite our paper:

@inproceedings{cai18cascadercnn,
  author = {Zhaowei Cai and Nuno Vasconcelos},
  title = {Cascade R-CNN: Delving into High Quality Object Detection},
  booktitle = {CVPR},
  year = {2018}
}

or its extension:

@article{cai2019cascadercnn,
  author = {Zhaowei Cai and Nuno Vasconcelos},
  title = {Cascade R-CNN: High Quality Object Detection and Instance Segmentation},
  journal = {arXiv preprint arXiv:1906.09756},
  year = {2019}
}

Benchmarking

We benchmark multiple detector models on the MS-COCO and PASCAL VOC datasets in the tables below.

  1. MS-COCO (Train/Test: train2017/val2017, shorter size: 800 for FPN and 600 for the others)

     model                 #GPUs  bs  lr    iter  train time  test time  AP    AP50  AP75
     VGG-RPN-baseline      2      4   3e-3  100k  12.5 hr     0.075s     23.6  43.9  23.0
     VGG-RPN-Cascade       2      4   3e-3  100k  15.5 hr     0.115s     27.0  44.2  27.7
     Res50-RFCN-baseline   4      1   3e-3  280k  19 hr       0.07s      27.0  48.7  26.9
     Res50-RFCN-Cascade    4      1   3e-3  280k  22.5 hr     0.075s     31.1  49.8  32.8
     Res101-RFCN-baseline  4      1   3e-3  280k  29 hr       0.075s     30.3  52.2  30.8
     Res101-RFCN-Cascade   4      1   3e-3  280k  30.5 hr     0.085s     33.3  52.0  35.2
     Res50-FPN-baseline    8      1   5e-3  280k  32 hr       0.095s     36.5  58.6  39.2
     Res50-FPN-Cascade     8      1   5e-3  280k  36 hr       0.115s     40.3  59.4  43.7
     Res101-FPN-baseline   8      1   5e-3  280k  37 hr       0.115s     38.5  60.6  41.7
     Res101-FPN-Cascade    8      1   5e-3  280k  46 hr       0.14s      42.7  61.6  46.6
  2. PASCAL VOC 2007 (Train/Test: 2007+2012trainval/2007test, shorter size: 600)

     model                 #GPUs  bs  lr    iter  train time  AP    AP50  AP75
     Alex-RPN-baseline     2      4   1e-3  45k   2.5 hr      29.4  63.2  23.7
     Alex-RPN-Cascade      2      4   1e-3  45k   3 hr        38.9  66.5  40.5
     VGG-RPN-baseline      2      4   1e-3  45k   6 hr        42.9  76.4  44.1
     VGG-RPN-Cascade       2      4   1e-3  45k   7.5 hr      51.2  79.1  56.3
     Res50-RFCN-baseline   2      2   2e-3  90k   8 hr        44.8  77.5  46.8
     Res50-RFCN-Cascade    2      2   2e-3  90k   9 hr        51.8  78.5  57.1
     Res101-RFCN-baseline  2      2   2e-3  90k   10.5 hr     49.4  79.8  53.2
     Res101-RFCN-Cascade   2      2   2e-3  90k   12 hr       54.2  79.6  59.2

NOTE. In the above tables, all models have been run at least twice with close results, so the training is relatively stable. RPN means Faster R-CNN. The PASCAL VOC annotations are transformed to COCO format, and the COCO API is used for evaluation, so the results differ from the official VOC evaluation. If you want to report VOC results in a publication, please use the official VOC code for evaluation.
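
The VOC-to-COCO transformation mentioned above mostly amounts to converting VOC's 1-based [xmin, ymin, xmax, ymax] corners into COCO's [x, y, width, height] boxes. The repository does this in its MATLAB window-file scripts; the stdlib-only function below is just an illustrative sketch of the coordinate change, with a hypothetical function name.

```python
import xml.etree.ElementTree as ET

def voc_bbox_to_coco(obj):
    """Convert one VOC <object> element to a COCO-style [x, y, w, h] box.
    VOC corners are 1-based and inclusive; COCO uses 0-based x, y plus size."""
    b = obj.find("bndbox")
    xmin = float(b.find("xmin").text)
    ymin = float(b.find("ymin").text)
    xmax = float(b.find("xmax").text)
    ymax = float(b.find("ymax").text)
    return [xmin - 1, ymin - 1, xmax - xmin + 1, ymax - ymin + 1]

# A VOC-style annotation fragment for illustration.
example = ET.fromstring(
    "<object><name>dog</name><bndbox>"
    "<xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax>"
    "</bndbox></object>")
```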

Requirements

  1. An NVIDIA GPU and cuDNN are required for fast training and inference. For now, CUDA 8.0 with cuDNN 6.0.20 has been tested; other versions should also work.

  2. The Caffe MATLAB wrapper is required to run the detection/evaluation demo.

Installation

  1. Clone the Cascade-RCNN repository. We'll call the directory that you cloned Cascade-RCNN into CASCADE_ROOT.

    git clone https://github.com/zhaoweicai/cascade-rcnn.git
    
  2. Build Cascade-RCNN

    cd $CASCADE_ROOT/
    # Follow the Caffe installation instructions here:
    #   http://caffe.berkeleyvision.org/installation.html
    
    # If you're experienced with Caffe and have all of the requirements installed
    # and your Makefile.config in place, then simply do:
    make all -j 16
    
    # If you want to run Cascade-RCNN detection/evaluation demo, build MATLAB wrapper as well
    make matcaffe
    

Datasets

If you already have a COCO/VOC copy that is not organized as below, you can simply create symlinks to obtain the same directory structure.
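
For instance, an existing copy can be linked into the expected location with the standard library; the paths below are placeholders for your own.

```python
import os

def link_dataset(existing_copy, expected_path):
    """Symlink an existing dataset copy into the location the training
    scripts expect (both paths are placeholders)."""
    parent = os.path.dirname(expected_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    # Only create the link if nothing already occupies the target path.
    if not os.path.islink(expected_path) and not os.path.exists(expected_path):
        os.symlink(existing_copy, expected_path)

# e.g. link_dataset("/your/path/to/coco", "data/coco")
```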

MS-COCO

In all MS-COCO experiments, we use train2017 for training and val2017 (a.k.a. minival) for validation. Follow the MS-COCO website to download images/annotations, and set up the COCO API.

Assuming that your local COCO dataset copy is at /your/path/to/coco, make sure it has the following directory structure:

coco
|_ images
  |_ train2017
  |  |_ <im-1-name>.jpg
  |  |_ ...
  |  |_ <im-N-name>.jpg
  |_ val2017
  |_ ...
|_ annotations
   |_ instances_train2017.json
   |_ instances_val2017.json
   |_ ...
|_ MatlabAPI
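
A quick way to confirm a local copy matches this layout is to check for the required paths. A minimal stdlib sketch (the path list is abbreviated to the entries the experiments need):

```python
import os

# Paths the COCO experiments expect, relative to the dataset root.
REQUIRED_COCO_PATHS = [
    "images/train2017",
    "images/val2017",
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
]

def missing_coco_paths(root):
    """Return the expected COCO paths that are missing under `root`."""
    return [p for p in REQUIRED_COCO_PATHS
            if not os.path.exists(os.path.join(root, p))]
```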

PASCAL VOC

In all PASCAL VOC experiments, we use VOC2007+VOC2012 trainval for training and the VOC2007 test set for validation. Follow the PASCAL VOC website to download images/annotations, and set up the VOCdevkit.

Assuming that your local VOCdevkit copy is at /your/path/to/VOCdevkit, make sure it has the following directory structure:

VOCdevkit
|_ VOC2007
  |_ JPEGImages
  |  |_ <000001>.jpg
  |  |_ ...
  |  |_ <009963>.jpg
  |_ Annotations
  |  |_ <000001>.xml
  |  |_ ...
  |  |_ <009963>.xml
  |_ ...
|_ VOC2012
  |_ JPEGImages
  |  |_ <2007_000027>.jpg
  |  |_ ...
  |  |_ <2012_004331>.jpg
  |_ Annotations
  |  |_ <2007_000027>.xml
  |  |_ ...
  |  |_ <2012_004331>.xml
  |_ ...
|_ VOCcode

Training Cascade-RCNN

  1. Get the training data

    cd $CASCADE_ROOT/data/
    sh get_coco_data.sh
    

    This will download the window files required for the experiments. You can also use the provided MATLAB script coco_window_file.m under $CASCADE_ROOT/data/coco/ to generate your own window files.

  2. Download the models pretrained on ImageNet. For AlexNet and VGG-Net, the FC layers are pruned and 2048 units per FC layer are retained. In addition, the two FC layers are copied three times for Cascade R-CNN training. For ResNet, the BatchNorm layers are merged into Scale layers and frozen during training, as is common practice.
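
The copy-three-times step can be pictured with a plain dict of parameter blobs. This is only a sketch: the stage-suffix naming and the `expand_fc_for_cascade` helper are hypothetical, not the repository's actual layer names.

```python
import copy

def expand_fc_for_cascade(params, fc_names=("fc6", "fc7"), num_stages=3):
    """Duplicate each pretrained FC blob into one independently trained
    copy per cascade stage (hypothetical stage-suffix naming)."""
    suffixes = ("1st", "2nd", "3rd")[:num_stages]
    # Keep all non-FC parameters as-is; the shared FC blobs are replaced
    # by one copy per cascade stage so each head can specialize.
    out = {k: copy.deepcopy(v) for k, v in params.items() if k not in fc_names}
    for name in fc_names:
        for s in suffixes:
            out[f"{name}_{s}"] = copy.deepcopy(params[name])
    return out
```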

    cd $CASCADE_ROOT/models/
    sh fetch_vggnet.sh
    
  3. Multiple shell scripts are provided to train Cascade-RCNN on the different baseline detectors described in our paper. Under each model folder, you need to change the root_folder of the data layer in train.prototxt and test.prototxt to your COCO path. After that, you can start to train your own Cascade-RCNN models. Take vgg-12s-600-rpn-cascade as an example.

    cd $CASCADE_ROOT/examples/coco/vgg-12s-600-rpn-cascade/
    sh train_detection.sh
    

    A log file will be generated during training. The total training time depends on the complexity of the model and dataset. If you want to quickly check that training works, try the light AlexNet model on the VOC dataset.

NOTE. Occasionally, training of Res101-FPN-Cascade will run out of memory. Just resume training from the latest solverstate.

Pretrained Models

We only provide the Res50-FPN-baseline, Res50-FPN-Cascade and Res101-FPN-Cascade models for the COCO dataset, and Res101-RFCN-Cascade for the VOC dataset.

Download pre-trained models

cd $CASCADE_ROOT/examples/coco/
sh fetch_cascadercnn_models.sh

The pretrained models produce exactly the same results as described in our paper.

Testing/Evaluation Demo

Once the pretrained models, or models trained by yourself, are available, you can use the MATLAB script run_cascadercnn_coco.m to obtain the detection and evaluation results. Set the correct dataset path and choose the model of interest in the demo script; the default setting is for the pretrained model. The final detection results will be saved under $CASCADE_ROOT/examples/coco/detections/ and the evaluation results will be saved under the model folder.

You can also run the shell script test_coco_detection.sh under each model folder for evaluation, but it is not identical to the official evaluation. For publication, use the MATLAB script.

Disclaimer

  1. When we were re-implementing the FPN framework and roi_align layer, we only referred to their published papers. Thus, our implementation details could be different from the official Detectron.

If you encounter any issue when using our code or model, please let me know.
