yhenon / Pytorch Retinanet

Licence: apache-2.0
Pytorch implementation of RetinaNet object detection.

pytorch-retinanet

PyTorch implementation of RetinaNet object detection as described in "Focal Loss for Dense Object Detection" by Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He and Piotr Dollár.

This implementation is primarily designed to be easy to read and simple to modify.

Results

Currently, this repo achieves 33.5% mAP at 600px resolution with a ResNet-50 backbone. The published result is 34.0% mAP. The difference is likely due to the use of the Adam optimizer instead of SGD with weight decay.

Installation

  1. Clone this repo.

  2. Install the required packages:

apt-get install tk-dev python-tk

  3. Install the Python packages:

pip install pandas
pip install pycocotools
pip install opencv-python
pip install requests

Training

The network can be trained using the train.py script. Currently, two dataloaders are available: COCO and CSV. For training on COCO, use

python train.py --dataset coco --coco_path ../coco --depth 50

For training using a custom dataset, with annotations in CSV format (see below), use

python train.py --dataset csv --csv_train <path/to/train_annots.csv>  --csv_classes <path/to/train/class_list.csv>  --csv_val <path/to/val_annots.csv>

The --csv_val argument is optional. If it is omitted, no validation is performed.

Pre-trained model

A pre-trained model is available at:

The state dict model can be loaded using:

retinanet = model.resnet50(num_classes=dataset_train.num_classes())
retinanet.load_state_dict(torch.load(PATH_TO_WEIGHTS))

Validation

Use coco_validation.py to evaluate a model on the COCO dataset. With the pre-trained model above, run:

python coco_validation.py --coco_path ~/path/to/coco --model_path /path/to/model/coco_resnet_50_map_0_335_state_dict.pt

This produces the following results:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.335
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.499
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.357
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.167
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.369
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.466
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.282
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.429
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.458
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.255
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.508
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.597

For CSV Datasets (more info on those below), run the following script to validate:

python csv_validation.py --csv_annotations_path path/to/annotations.csv --model_path path/to/model.pt --images_path path/to/images_dir --class_list_path path/to/class_list.csv

An IoU threshold can optionally be supplied through the iou_threshold argument; its value must lie strictly between 0 and 1.

This produces the following results:

label_1 : (label_1_mAP)
Precision :  ...
Recall:  ...

label_2 : (label_2_mAP)
Precision :  ...
Recall:  ...

You can also configure the csv_eval.py script to save the precision-recall curve to disk.
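
To illustrate what the IoU threshold controls, here is a minimal sketch of intersection over union for two boxes in the x1, y1, x2, y2 layout used by the CSV annotations. The function names, the default threshold of 0.5, and the use of >= for the comparison are assumptions made for illustration; they are not taken from csv_eval.py.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped to zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(detection, ground_truth, iou_threshold=0.5):
    """Hypothetical helper: a detection matches if its IoU reaches the threshold."""
    return iou(detection, ground_truth) >= iou_threshold
```

A higher threshold demands tighter localization before a detection counts as correct, so precision and recall for each label generally drop as the threshold rises.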

Visualization

To visualize the network's detections, use visualize.py:

python visualize.py --dataset coco --coco_path ../coco --model <path/to/model.pt>

This will visualize bounding boxes on the validation set. To visualize with a CSV dataset, use:

python visualize.py --dataset csv --csv_classes <path/to/train/class_list.csv>  --csv_val <path/to/val_annots.csv> --model <path/to/model.pt>

Model

The RetinaNet model uses a ResNet backbone. You can set the depth of the ResNet model using the --depth argument. Depth must be one of 18, 34, 50, 101 or 152. Deeper models are more accurate, but they are slower and use more memory.
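
As a sketch of how such a restriction can be enforced on the command line, the snippet below uses argparse with a fixed set of choices. This is not the parser from train.py: the function name parse_depth and the default of 50 are assumptions chosen for illustration.

```python
import argparse

# Depths listed in the text above.
SUPPORTED_DEPTHS = (18, 34, 50, 101, 152)


def parse_depth(argv):
    """Hypothetical parser that accepts only the supported ResNet depths."""
    parser = argparse.ArgumentParser(description="Depth-only example parser")
    parser.add_argument(
        "--depth",
        type=int,
        choices=SUPPORTED_DEPTHS,
        default=50,  # assumed default, not confirmed by the README
        help="ResNet depth",
    )
    return parser.parse_args(argv).depth
```

With choices set, argparse rejects any other value with a usage error and exits before training code runs.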

CSV datasets

The CSVGenerator provides an easy way to define your own datasets. It uses two CSV files: one file containing annotations and one file containing a class name to ID mapping.

Annotations format

The CSV file with annotations should contain one annotation per line. Images with multiple bounding boxes should use one row per bounding box. Note that indexing for pixel values starts at 0. The expected format of each line is:

path/to/image.jpg,x1,y1,x2,y2,class_name

Some images may not contain any labeled objects. To add these images to the dataset as negative examples, add an annotation where x1, y1, x2, y2 and class_name are all empty:

path/to/image.jpg,,,,,

A full example:

/data/imgs/img_001.jpg,837,346,981,456,cow
/data/imgs/img_002.jpg,215,312,279,391,cat
/data/imgs/img_002.jpg,22,5,89,84,bird
/data/imgs/img_003.jpg,,,,,

This defines a dataset with 3 images. img_001.jpg contains a cow. img_002.jpg contains a cat and a bird. img_003.jpg contains no interesting objects/animals.
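
The format above can be read with Python's standard csv module. The following is a minimal sketch, not the repository's own loader: read_annotations is a hypothetical helper, and the check that x2 > x1 and y2 > y1 is an added sanity check that the README does not state.

```python
import csv


def read_annotations(lines):
    """Map each image path to a list of (x1, y1, x2, y2, class_name) boxes."""
    annotations = {}
    for row_number, row in enumerate(csv.reader(lines), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 6:
            raise ValueError(f"line {row_number}: expected 6 fields, got {len(row)}")
        path, x1, y1, x2, y2, class_name = row
        boxes = annotations.setdefault(path, [])
        if (x1, y1, x2, y2, class_name) == ("", "", "", "", ""):
            continue  # negative example: image registered with no boxes
        x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
        if x2 <= x1 or y2 <= y1:  # assumed sanity check
            raise ValueError(f"line {row_number}: degenerate box")
        boxes.append((x1, y1, x2, y2, class_name))
    return annotations


SAMPLE = """\
/data/imgs/img_001.jpg,837,346,981,456,cow
/data/imgs/img_002.jpg,215,312,279,391,cat
/data/imgs/img_002.jpg,22,5,89,84,bird
/data/imgs/img_003.jpg,,,,,
"""

result = read_annotations(SAMPLE.splitlines())
```

Here result has three keys, with two boxes for img_002.jpg and an empty list for img_003.jpg, matching the description of the example dataset.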

Class mapping format

The class name to ID mapping file should contain one mapping per line. Each line should use the following format:

class_name,id

Indexing for classes starts at 0. Do not include a background class as it is implicit.

For example:

cow,0
cat,1
bird,2
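
A class mapping file in this format can be loaded and checked with a few lines of standard-library Python. This is a sketch under stated assumptions: read_class_mapping is a hypothetical helper, and the requirement that names and IDs be unique is an assumption beyond what the README specifies.

```python
import csv


def read_class_mapping(lines):
    """Return a {class_name: id} dict from lines of the form class_name,id."""
    classes = {}
    for row_number, row in enumerate(csv.reader(lines), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"line {row_number}: expected 'class_name,id'")
        name, class_id = row[0], int(row[1])
        if name in classes:  # assumed constraint: names are unique
            raise ValueError(f"line {row_number}: duplicate class name '{name}'")
        classes[name] = class_id
    ids = list(classes.values())
    if len(set(ids)) != len(ids):  # assumed constraint: IDs are unique
        raise ValueError("class IDs must be unique")
    if ids and min(ids) != 0:
        raise ValueError("class indexing must start at 0")
    return classes


mapping = read_class_mapping(["cow,0", "cat,1", "bird,2"])
```

The background class is deliberately absent from the mapping, in line with the note above that it is implicit.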

Acknowledgements

