MCAR.pytorch
This repository is a PyTorch implementation of "Learning to Discover Multi-Class Attentional Regions for Multi-Label Image Recognition", accepted at IEEE Transactions on Image Processing (TIP 2021). This repo is created by Bin-Bin Gao.
MCAR Framework
Requirements
Please install the following packages:
- numpy
- torch-0.4.1
- torchnet
- torchvision-0.2.0
- tqdm
Options
- `topN`: number of local regions
- `threshold`: localization threshold
- `ps`: global pooling style, e.g., `avg`, `max`, `gwp`
- `lr`: learning rate
- `lrp`: factor for the learning rate of the pretrained layers; their learning rate is `lr * lrp`
- `batch-size`: number of images per batch
- `image-size`: input image size
- `epochs`: number of training epochs
- `evaluate`: evaluate the model on the validation set
- `resume`: path to a checkpoint
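To illustrate what the `ps` option controls, the sketch below shows how a `(C, H, W)` feature map is collapsed into one score per class channel under the `avg` and `max` pooling styles. This is a NumPy toy, not the repository's actual code, and the weighted `gwp` style is omitted:

```python
import numpy as np

def global_pool(feat, style="avg"):
    """Collapse a (C, H, W) feature map to a (C,) vector.

    Illustration only: 'avg' is global average pooling, 'max' is
    global max pooling ('gwp', a weighted variant, is not shown).
    """
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)  # one row of spatial activations per channel
    if style == "avg":
        return flat.mean(axis=1)
    if style == "max":
        return flat.max(axis=1)
    raise ValueError(f"unknown pooling style: {style}")

# Toy (C=2, H=2, W=2) feature map with values 0..7
feat = np.arange(8, dtype=float).reshape(2, 2, 2)
print(global_pool(feat, "avg"))  # → [1.5 5.5]
print(global_pool(feat, "max"))  # → [3. 7.]
```

Average pooling lets every spatial location contribute to a class score, while max pooling keeps only the most activated location per channel.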
MCAR Training and Evaluation
```shell
bash run.sh
```
Model | Input-Size | VOC-2007 | VOC-2012 | COCO-2014 |
---|---|---|---|---|
MobileNet-v2 | 256 x 256 | 88.1 | - | 69.8 |
ResNet-50 | 256 x 256 | 92.3 | - | 78.0 |
ResNet-101 | 256 x 256 | 93.0 | - | 79.4 |
MobileNet-v2 | 448 x 448 | 91.3 | 91.0 | 75.0 |
ResNet-50 | 448 x 448 | 94.1 | 93.5 | 82.1 |
ResNet-101 | 448 x 448 | 94.8 | 94.3 | 83.8 |
MCAR Demo
```shell
bash run_demo.sh
```
Citing this repository
If you find this code useful in your research, please consider citing us:
@article{MCAR_TIP_2021,
  author  = {Bin-Bin Gao and Hong-Yu Zhou},
  title   = {{Learning to Discover Multi-Class Attentional Regions for Multi-Label Image Recognition}},
  journal = {IEEE Transactions on Image Processing},
  year    = {2021},
  volume  = {30},
  pages   = {5920-5932},
}
Reference
This project is based on the following implementations:
Tips
If you have any questions about our work, please do not hesitate to contact us by email.