DRFNet: Dense Receptive Field for Object Detector, in PyTorch

A PyTorch implementation of Dense Receptive Field for Object Detection (accepted by ICPR 2018). This repository is now deprecated; please use https://github.com/yqyao/SSD_Pytorch instead.

Installation
Datasets
Train
Evaluate
Performance
Reference

Installation

Install PyTorch-0.3.1 by selecting your environment on the website and running the appropriate command.
Clone this repository.
- Note: We currently only support Python 3+.
Then download the dataset by following the instructions below.
Compile the nms and coco tools:

cd DRFNet
./make.sh

Note: Check that your GPU architecture is supported in utils/build.py, line 131. The default is:

'nvcc': ['-arch=sm_52',
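
The value after sm_ is the GPU's compute capability with the dot removed, so sm_52 corresponds to compute capability 5.2. As a minimal sketch, the helper below formats that flag and, where PyTorch and a CUDA device are present, queries the capability. The names arch_flag and detect_arch_flag are hypothetical and not part of this repository, and the sketch assumes your PyTorch version exposes torch.cuda.get_device_capability.

```python
def arch_flag(major, minor):
    """Format a compute capability such as (5, 2) as an nvcc -arch flag."""
    return "-arch=sm_{}{}".format(major, minor)


def detect_arch_flag(device=0):
    """Return the flag for a local GPU, or None if it cannot be queried.

    Hypothetical helper: it only reports a value; editing utils/build.py
    is still a manual step.
    """
    try:
        import torch  # imported lazily so the formatter works without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability(device)
    return arch_flag(major, minor)
```

For example, a GTX 1080 Ti has compute capability 6.1, so arch_flag(6, 1) gives -arch=sm_61.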

Datasets

To make things easy, we provide a simple VOC dataset loader that inherits from torch.utils.data.Dataset, making it fully compatible with the torchvision.datasets API.
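
To illustrate what such a loader has to do for each image, here is a minimal sketch that turns one VOC annotation file into a list of boxes. It is not the repository's loader: parse_voc_annotation, VOC_CLASSES, and SAMPLE_XML are names made up for this example, and the output layout [xmin, ymin, xmax, ymax, class_index] in raw pixel coordinates is an assumption.

```python
import xml.etree.ElementTree as ET

# The 20 VOC class names, in the order used by the evaluation report.
VOC_CLASSES = (
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
)

# A trimmed-down annotation, made up for illustration.
SAMPLE_XML = """
<annotation>
  <filename>000001.jpg</filename>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_annotation(xml_text, classes=VOC_CLASSES):
    """Return one [xmin, ymin, xmax, ymax, class_index] row per object."""
    class_to_index = {name: i for i, name in enumerate(classes)}
    root = ET.fromstring(xml_text)
    targets = []
    for obj in root.iter("object"):
        name = obj.find("name").text.strip()
        box = obj.find("bndbox")
        coords = [int(float(box.find(tag).text))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        targets.append(coords + [class_to_index[name]])
    return targets
```

A Dataset subclass would typically call a function like this from __getitem__ and return the image together with these targets.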

VOC Dataset

Download VOC2007 trainval & test

# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2007.sh # <directory>

Download VOC2012 trainval

# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2012.sh # <directory>

Merge VOC2007 and VOC2012

Move all images in VOC2007 and VOC2012 into VOCROOT/VOC0712/JPEGImages.
Move all annotations in VOC2007 and VOC2012 into VOCROOT/VOC0712/JPEGImages/Annotations.
Rename and merge the VOC2007 and VOC2012 ImageSets/Main/*.txt files into VOCROOT/VOC0712/JPEGImages/ImageSets/Main/*.txt.
The merged txt lists are as follows:
2012_test.txt, 2007_test.txt, 0712_trainval_test.txt, 2012_trainval.txt, 0712_trainval.txt
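
The list-merging step can be scripted. The sketch below concatenates several image-set files into one, dropping blank lines and duplicate IDs. The function name merge_image_sets is hypothetical, and which source lists feed each merged file (for example, that 0712_trainval.txt is the 2007 and 2012 trainval lists combined) is an assumption based on the file names.

```python
import os


def merge_image_sets(sources, destination):
    """Concatenate image-ID lists into one file, keeping first-seen order."""
    seen = set()
    merged = []
    for path in sources:
        with open(path) as f:
            for line in f:
                image_id = line.strip()
                if image_id and image_id not in seen:
                    seen.add(image_id)
                    merged.append(image_id)
    dest_dir = os.path.dirname(destination)
    if dest_dir:
        os.makedirs(dest_dir, exist_ok=True)
    with open(destination, "w") as f:
        f.write("\n".join(merged) + "\n")
    return merged


# Example call (paths are placeholders; adjust VOCROOT to your setup):
# merge_image_sets(
#     ["VOCROOT/VOC2007/ImageSets/Main/trainval.txt",
#      "VOCROOT/VOC2012/ImageSets/Main/trainval.txt"],
#     "VOCROOT/VOC0712/JPEGImages/ImageSets/Main/0712_trainval.txt",
# )
```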

COCO Dataset

Install the MS COCO dataset at /path/to/coco from the official website; the default is ~/data/COCO. Follow the instructions to prepare the minival2014 and valminusminival2014 annotations. All label files (.json) should be under the COCO/annotations/ folder. The directory should have this basic structure:

$COCO/
$COCO/cache/
$COCO/annotations/
$COCO/images/
$COCO/images/test2015/
$COCO/images/train2014/
$COCO/images/val2014/
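
Before training, it can help to confirm that the layout above is in place. The following sketch reports which of the listed directories are missing under a given root. The name missing_coco_dirs is hypothetical and the check is not part of this repository.

```python
import os

# Sub-directories from the structure listed above, relative to $COCO.
EXPECTED_COCO_DIRS = [
    "cache",
    "annotations",
    "images",
    "images/test2015",
    "images/train2014",
    "images/val2014",
]


def missing_coco_dirs(coco_root="~/data/COCO"):
    """Return the expected sub-directories that do not exist under coco_root."""
    root = os.path.expanduser(coco_root)
    return [d for d in EXPECTED_COCO_DIRS
            if not os.path.isdir(os.path.join(root, *d.split("/")))]
```

An empty list means every expected directory was found.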

UPDATE: The current COCO dataset has released new train2017 and val2017 sets which are just new splits of the same image sets.

Training

First download the fc-reduced VGG-16 PyTorch base network weights from https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth
Pre-trained ResNet base network weights are available for ResNet50, ResNet101, and ResNet152.
By default, we assume you have downloaded the files into the DRFNet/weights directory:

mkdir weights
cd weights
wget https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth
wget https://download.pytorch.org/models/resnet50-19c8e357.pth
wget https://download.pytorch.org/models/resnet101-5d3b4d8f.pth
wget https://download.pytorch.org/models/resnet152-b121ed2d.pth
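
Where wget is not available, the same downloads can be sketched with the Python standard library. The URLs are the ones listed above; the helper names local_path and download_missing are hypothetical, and skipping files that already exist is a design choice of this sketch rather than something the repository does.

```python
import os
import urllib.request

WEIGHT_URLS = [
    "https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth",
    "https://download.pytorch.org/models/resnet50-19c8e357.pth",
    "https://download.pytorch.org/models/resnet101-5d3b4d8f.pth",
    "https://download.pytorch.org/models/resnet152-b121ed2d.pth",
]


def local_path(url, weights_dir="weights"):
    """Map a weight URL to its file name inside weights_dir."""
    return os.path.join(weights_dir, url.rsplit("/", 1)[-1])


def download_missing(urls=WEIGHT_URLS, weights_dir="weights"):
    """Download each file that is not already present; return the new paths."""
    os.makedirs(weights_dir, exist_ok=True)
    fetched = []
    for url in urls:
        target = local_path(url, weights_dir)
        if not os.path.exists(target):
            urllib.request.urlretrieve(url, target)
            fetched.append(target)
    return fetched
```

Calling download_missing() from the DRFNet directory would place the files in DRFNet/weights.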

To train DRFNet with the training script, specify the parameters listed in train.py as flags, or change them manually in the file.

python train.py -v drf_ssd_vgg

Note:
- -d: choose the dataset: VOC, COCO, VOC2012 (VOC12 trainval), or VOC0712++ (0712 trainval + 07 test)
- -v: choose the backbone version: ssd_vgg, ssd_res, drf_ssd_vgg, or drf_ssd_res
- -s: image size, 300 or 512
- You can pick up training from a checkpoint by specifying its path as one of the training parameters (again, see train.py for options).
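
To show how flags like these are usually wired up, here is an illustrative argparse sketch. It is not the parser from train.py: the long option names (--dataset, --version, --size) and the defaults are assumptions made for this example, and only the short flags and the listed values come from the notes above.

```python
import argparse


def build_parser():
    """Build an illustrative parser for the -d, -v, and -s flags."""
    parser = argparse.ArgumentParser(
        description="Illustrative DRFNet-style training flags")
    # Dataset name; left as free text because the exact accepted strings
    # are defined in train.py.
    parser.add_argument("-d", "--dataset", default="VOC",
                        help="dataset to train on, e.g. VOC or COCO")
    parser.add_argument("-v", "--version", default="drf_ssd_vgg",
                        choices=["ssd_vgg", "ssd_res",
                                 "drf_ssd_vgg", "drf_ssd_res"],
                        help="backbone version")
    parser.add_argument("-s", "--size", type=int, default=300,
                        choices=[300, 512],
                        help="input image size")
    return parser


# Example: args = build_parser().parse_args(["-v", "drf_ssd_res", "-s", "512"])
```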

Evaluation

To evaluate a trained network:

python eval.py -v drf_ssd_vgg

You can specify the parameters listed in the eval.py file by flagging them or manually changing them.

Performance

VOC2007 Test

mAP

We retrained some models, so the results differ from those in the original paper. Input size = 300.

ssd	drf_32	drf_48	drf_64	drf_96	drf_128
77.2%	79.87%	79.93%	79.73%	79.38%	79.65%

Evaluation report for the current best version (drf_48, 79.93% mAP):

VOC07 metric? Yes

AP for aeroplane = 0.8579
AP for bicycle = 0.8615
AP for bird = 0.7786
AP for boat = 0.7202
AP for bottle = 0.5850
AP for bus = 0.8788
AP for car = 0.8712
AP for cat = 0.8849
AP for chair = 0.6612
AP for cow = 0.8702
AP for diningtable = 0.7796
AP for dog = 0.8577
AP for horse = 0.8750
AP for motorbike = 0.8778
AP for person = 0.8046
AP for pottedplant = 0.5582
AP for sheep = 0.7952
AP for sofa = 0.8041
AP for train = 0.8800
AP for tvmonitor = 0.7847
Mean AP = 0.7993
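
For readers unfamiliar with the "VOC07 metric", it is the 11-point interpolated average precision: precision is sampled at recall thresholds 0.0, 0.1, ..., 1.0 and averaged. The sketch below implements that definition in plain Python and checks that the per-class values reported above average to the stated Mean AP. The function names are chosen for this example and are not taken from eval.py.

```python
def voc07_ap(recall, precision):
    """11-point interpolated AP from paired recall/precision values."""
    ap = 0.0
    for i in range(11):
        threshold = i / 10.0
        # Best precision among points whose recall reaches the threshold.
        candidates = [p for r, p in zip(recall, precision) if r >= threshold]
        ap += (max(candidates) if candidates else 0.0) / 11.0
    return ap


def mean_ap(per_class_ap):
    """Unweighted mean of per-class AP values."""
    values = list(per_class_ap)
    return sum(values) / len(values)


# Per-class AP values copied from the report above.
PER_CLASS_AP = {
    "aeroplane": 0.8579, "bicycle": 0.8615, "bird": 0.7786, "boat": 0.7202,
    "bottle": 0.5850, "bus": 0.8788, "car": 0.8712, "cat": 0.8849,
    "chair": 0.6612, "cow": 0.8702, "diningtable": 0.7796, "dog": 0.8577,
    "horse": 0.8750, "motorbike": 0.8778, "person": 0.8046,
    "pottedplant": 0.5582, "sheep": 0.7952, "sofa": 0.8041,
    "train": 0.8800, "tvmonitor": 0.7847,
}

print("Mean AP = {:.4f}".format(mean_ap(PER_CLASS_AP.values())))
```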

FPS

GTX 1080 Ti: ~70 FPS
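
How the FPS figure was measured is not stated here. As a rough sketch of one common approach, the helper below times repeated calls to a function after a few warm-up iterations. The name measure_fps and its parameters are hypothetical. When timing a GPU model, the timed function should wait for the device to finish (for example by calling torch.cuda.synchronize()), since CUDA kernels are launched asynchronously.

```python
import time


def measure_fps(step, n_iters=100, warmup=10):
    """Return calls per second for step(), excluding warm-up iterations."""
    for _ in range(warmup):
        step()
    start = time.perf_counter()
    for _ in range(n_iters):
        step()
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        return float("inf")  # timer resolution too coarse for this workload
    return n_iters / elapsed


# Example with a stand-in workload; replace the lambda with one forward
# pass of the detector on a single image.
# fps = measure_fps(lambda: sum(range(10000)))
```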

References

Wei Liu, et al. "SSD: Single Shot MultiBox Detector." ECCV2016.
Original Implementation (CAFFE)
A list of other great SSD ports that were sources of inspiration (especially the Chainer repo):
- ssd.pytorch, RFBNet, Chainer, torchcv
