LandskapeAI / Triplet Attention

License: MIT
Official PyTorch implementation of "Rotate to Attend: Convolutional Triplet Attention Module" [WACV 2021]

Projects that are alternatives to, or similar to, Triplet Attention

All About The Gan
All About the GANs (Generative Adversarial Networks) - summarized lists for GANs
Stars: ✭ 630 (+183.78%)
Mutual labels:  paper, arxiv, detection
Action Recognition Visual Attention
Action recognition using soft attention based deep recurrent neural networks
Stars: ✭ 350 (+57.66%)
Mutual labels:  jupyter-notebook, paper, attention-mechanism
Computer Vision
Programming Assignments and Lectures for Stanford's CS 231: Convolutional Neural Networks for Visual Recognition
Stars: ✭ 408 (+83.78%)
Mutual labels:  jupyter-notebook, convolutional-neural-networks, imagenet
Caffenet Benchmark
Evaluation of the CNN design choices performance on ImageNet-2012.
Stars: ✭ 700 (+215.32%)
Mutual labels:  jupyter-notebook, convolutional-neural-networks, imagenet
Pneumonia Detection From Chest X Ray Images With Deep Learning
Detecting Pneumonia in Chest X-ray Images using Convolutional Neural Network and Pretrained Models
Stars: ✭ 64 (-71.17%)
Mutual labels:  jupyter-notebook, convolutional-neural-networks, imagenet
Pytorch Question Answering
Important paper implementations for Question Answering using PyTorch
Stars: ✭ 154 (-30.63%)
Mutual labels:  jupyter-notebook, convolutional-neural-networks, attention-mechanism
Simpsonrecognition
Detect and recognize The Simpsons characters using Keras and Faster R-CNN
Stars: ✭ 131 (-40.99%)
Mutual labels:  jupyter-notebook, convolutional-neural-networks, detection
Research Paper Notes
Notes and Summaries on ML-related Research Papers (with optional implementations)
Stars: ✭ 218 (-1.8%)
Mutual labels:  jupyter-notebook, paper, arxiv
Pytorch Vae
A CNN Variational Autoencoder (CNN-VAE) implemented in PyTorch
Stars: ✭ 181 (-18.47%)
Mutual labels:  jupyter-notebook, convolutional-neural-networks
Graph attention pool
Attention over nodes in Graph Neural Networks using PyTorch (NeurIPS 2019)
Stars: ✭ 186 (-16.22%)
Mutual labels:  jupyter-notebook, attention-mechanism
Dragan
A stable algorithm for GAN training
Stars: ✭ 189 (-14.86%)
Mutual labels:  jupyter-notebook, paper
2048 Deep Reinforcement Learning
Trained a convolutional neural network to play 2048 using deep reinforcement learning
Stars: ✭ 169 (-23.87%)
Mutual labels:  jupyter-notebook, convolutional-neural-networks
Capsnet Traffic Sign Classifier
A TensorFlow implementation of CapsNet (Capsule Networks) applied to the German traffic sign dataset
Stars: ✭ 166 (-25.23%)
Mutual labels:  jupyter-notebook, convolutional-neural-networks
Lidc nodule detection
LIDC nodule detection with CNN and LSTM networks
Stars: ✭ 187 (-15.77%)
Mutual labels:  jupyter-notebook, detection
A Journey Into Convolutional Neural Network Visualization
A journey into Convolutional Neural Network visualization
Stars: ✭ 165 (-25.68%)
Mutual labels:  jupyter-notebook, convolutional-neural-networks
Deep Learning With Python
Deep learning code and projects using Python
Stars: ✭ 195 (-12.16%)
Mutual labels:  jupyter-notebook, convolutional-neural-networks
Traffic Sign Detection
Traffic Sign Detection. Code for the paper entitled "Evaluation of deep neural networks for traffic sign detection systems".
Stars: ✭ 200 (-9.91%)
Mutual labels:  jupyter-notebook, convolutional-neural-networks
Iresnet
Improved Residual Networks (https://arxiv.org/pdf/2004.04989.pdf)
Stars: ✭ 163 (-26.58%)
Mutual labels:  convolutional-neural-networks, imagenet
Coursera Deep Learning Specialization
Notes, programming assignments and quizzes from all courses within the Coursera Deep Learning specialization offered by deeplearning.ai: (i) Neural Networks and Deep Learning; (ii) Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization; (iii) Structuring Machine Learning Projects; (iv) Convolutional Neural Networks; (v) Sequence Models
Stars: ✭ 188 (-15.32%)
Mutual labels:  jupyter-notebook, convolutional-neural-networks
Image To 3d Bbox
Build a CNN to predict the 3D bounding box of a car from a 2D image.
Stars: ✭ 200 (-9.91%)
Mutual labels:  jupyter-notebook, convolutional-neural-networks

Abstract - Benefiting from the capability of building inter-dependencies among channels or spatial locations, attention mechanisms have recently been extensively studied and broadly used in a variety of computer vision tasks. In this paper, we investigate lightweight but effective attention mechanisms and present triplet attention, a novel method for computing attention weights by capturing cross-dimension interaction using a three-branch structure. For an input tensor, triplet attention builds inter-dimensional dependencies via a rotation operation followed by residual transformations, and encodes inter-channel and spatial information with negligible computational overhead. Our method is simple as well as efficient and can be easily plugged into classic backbone networks as an add-on module. We demonstrate the effectiveness of our method on various challenging tasks, including image classification on ImageNet-1k and object detection on the MS-COCO and PASCAL VOC datasets. Furthermore, we provide extensive insight into the performance of triplet attention by visually inspecting the GradCAM and GradCAM++ results. The empirical evaluation of our method supports our intuition on the importance of capturing dependencies across dimensions when computing attention weights.
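
The three-branch computation described in the abstract can be sketched compactly in PyTorch. The following is a minimal, unofficial sketch of that structure (Z-pool and the k x k convolution follow the paper's description; see triplet_attention.py in this repo for the reference implementation):

```python
import torch
import torch.nn as nn


class ZPool(nn.Module):
    """Concatenate max- and mean-pooled maps along the (rotated) channel axis."""
    def forward(self, x):
        return torch.cat(
            [x.max(dim=1, keepdim=True)[0], x.mean(dim=1, keepdim=True)], dim=1
        )


class AttentionGate(nn.Module):
    """Z-pool -> k x k conv -> batch norm -> sigmoid, yielding attention weights."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.pool = ZPool()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.bn = nn.BatchNorm2d(1)

    def forward(self, x):
        return x * torch.sigmoid(self.bn(self.conv(self.pool(x))))


class TripletAttention(nn.Module):
    """Three branches capture (C, W), (C, H), and (H, W) interactions by
    rotating the tensor, attending, rotating back, and averaging."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.cw = AttentionGate(kernel_size)
        self.ch = AttentionGate(kernel_size)
        self.hw = AttentionGate(kernel_size)

    def forward(self, x):  # x: (N, C, H, W)
        x_cw = self.cw(x.permute(0, 2, 1, 3)).permute(0, 2, 1, 3)  # swap C and H
        x_ch = self.ch(x.permute(0, 3, 2, 1)).permute(0, 3, 2, 1)  # swap C and W
        x_hw = self.hw(x)                                          # plain spatial branch
        return (x_cw + x_ch + x_hw) / 3.0
```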

Figure 1. Structural Design of Triplet Attention Module.

Figure 2. (a) Squeeze-and-Excitation (SE) block. (b) Convolutional Block Attention Module (CBAM); GMP denotes global max pooling. (c) Global Context (GC) block. (d) Triplet attention (ours).

Figure 3. GradCAM and GradCAM++ comparisons for ResNet-50 on sample images from the ImageNet dataset.

To generate GradCAM and GradCAM++ results, please follow the code in this repository.
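
For readers who want the gist without leaving this page, here is a minimal, generic Grad-CAM sketch using PyTorch hooks. It is not the linked repository's code; it assumes a classifier whose chosen layer outputs (N, C, H, W) feature maps (e.g. model.layer4 of a ResNet):

```python
import torch
import torch.nn.functional as F


def grad_cam(model, image, target_class, layer):
    """Return a [0, 1] heatmap of `layer`'s evidence for `target_class`."""
    feats, grads = [], []
    h1 = layer.register_forward_hook(lambda m, i, o: feats.append(o))
    h2 = layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    model.zero_grad()
    model(image)[0, target_class].backward()            # image: (1, 3, H, W)
    h1.remove(); h2.remove()
    weights = grads[0].mean(dim=(2, 3), keepdim=True)   # GAP over gradients
    cam = F.relu((weights * feats[0]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear",
                        align_corners=False)
    return (cam / cam.max().clamp(min=1e-8)).squeeze()  # normalize to [0, 1]
```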

Changelog / Updates:
  • [05/11/20] v2 of our paper is out on arXiv.
  • [02/11/20] Our paper is accepted to WACV 2021.
  • [06/10/20] Preprint of our paper is out on arXiv.

Pretrained Models:

ImageNet:

| Model | Parameters | GFLOPs | Top-1 Error | Top-5 Error | Weights |
|-------|------------|--------|-------------|-------------|---------|
| ResNet-18 + Triplet Attention (k = 3) | 11.69 M | 1.823 | 29.67% | 10.42% | Google Drive |
| ResNet-18 + Triplet Attention (k = 7) | 11.69 M | 1.825 | 28.91% | 10.01% | Google Drive |
| ResNet-50 + Triplet Attention (k = 7) | 25.56 M | 4.169 | 22.52% | 6.326% | Google Drive |
| ResNet-50 + Triplet Attention (k = 3) | 25.56 M | 4.131 | 23.88% | 6.938% | Google Drive |
| MobileNet v2 + Triplet Attention (k = 3) | 3.506 M | 0.322 | 27.38% | 9.23% | Google Drive |
| MobileNet v2 + Triplet Attention (k = 7) | 3.51 M | 0.327 | 28.01% | 9.516% | Google Drive |
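
To use one of the checkpoints above, download it and restore the weights into a constructed model. The sketch below is a hedged example: the checkpoint layout (a 'state_dict' key, 'module.' prefixes from nn.DataParallel) is an assumption, and the filename is illustrative.

```python
import torch


def load_checkpoint(model, path="resnet50_triplet_attention_k7.pth"):
    """Restore weights into an already-constructed triplet-attention model."""
    checkpoint = torch.load(path, map_location="cpu")
    # Some checkpoints wrap the weights in a 'state_dict' entry.
    state_dict = checkpoint.get("state_dict", checkpoint)
    # Strip the 'module.' prefix added when saving from nn.DataParallel.
    state_dict = {k.replace("module.", "", 1): v for k, v in state_dict.items()}
    model.load_state_dict(state_dict)
    return model.eval()
```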

MS-COCO:

All models are trained with the standard 1x schedule.

Detectron2:

Object Detection:

| Backbone | Detector | AP | AP50 | AP75 | APS | APM | APL | Weights |
|----------|----------|----|------|------|-----|-----|-----|---------|
| ResNet-50 + Triplet Attention (k = 7) | Faster R-CNN | 39.2 | 60.8 | 42.3 | 23.3 | 42.5 | 50.3 | Google Drive |
| ResNet-50 + Triplet Attention (k = 7) | RetinaNet | 38.2 | 58.5 | 40.4 | 23.4 | 42.1 | 48.7 | Google Drive |
| ResNet-50 + Triplet Attention (k = 7) | Mask R-CNN | 39.8 | 61.6 | 42.8 | 24.3 | 42.9 | 51.3 | Google Drive |

Instance Segmentation:

| Backbone | Detector | AP | AP50 | AP75 | APS | APM | APL | Weights |
|----------|----------|----|------|------|-----|-----|-----|---------|
| ResNet-50 + Triplet Attention (k = 7) | Mask R-CNN | 35.8 | 57.8 | 38.1 | 18.0 | 38.1 | 50.7 | Google Drive |

Person Keypoint Detection:

| Backbone | Detector | AP | AP50 | AP75 | APM | APL | Weights |
|----------|----------|----|------|------|-----|-----|---------|
| ResNet-50 + Triplet Attention (k = 7) | Keypoint R-CNN | 64.7 | 85.9 | 70.4 | 60.3 | 73.1 | Google Drive |

BBox AP results using Keypoint R-CNN:

| Backbone | Detector | AP | AP50 | AP75 | APS | APM | APL | Weights |
|----------|----------|----|------|------|-----|-----|-----|---------|
| ResNet-50 + Triplet Attention (k = 7) | Keypoint R-CNN | 54.8 | 83.1 | 59.9 | 37.4 | 61.9 | 72.1 | Google Drive |

MMDetection:

Object Detection:

| Backbone | Detector | AP | AP50 | AP75 | APS | APM | APL | Weights |
|----------|----------|----|------|------|-----|-----|-----|---------|
| ResNet-50 + Triplet Attention (k = 7) | Faster R-CNN | 39.3 | 60.8 | 42.7 | 23.4 | 42.8 | 50.3 | Google Drive |
| ResNet-50 + Triplet Attention (k = 7) | RetinaNet | 37.6 | 57.3 | 40.0 | 21.7 | 41.1 | 49.7 | Google Drive |

Training From Scratch

The Triplet Attention layer is implemented in triplet_attention.py. Since triplet attention is a dimensionality-preserving module, it can be inserted between convolutional layers in most stages of most networks. We recommend using the model definitions provided here with our ImageNet training repo for the fastest and most up-to-date training scripts.
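
As a quick illustration of that "plug anywhere" property, one can append the module after each residual stage of a torchvision ResNet. This reuses the TripletAttention sketch shown earlier; it is not how the official models wire the module in (check the provided model definitions for the reference placement):

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

model = resnet50()
for name in ("layer1", "layer2", "layer3", "layer4"):
    stage = getattr(model, name)
    # Shape is preserved, so the attention module can simply follow the stage.
    setattr(model, name, nn.Sequential(stage, TripletAttention(kernel_size=7)))

# All feature-map shapes are unchanged, so the classifier head still works:
out = model(torch.randn(1, 3, 224, 224))  # -> (1, 1000)
```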

However, this repository includes all the code required to reproduce the experiments reported in the paper, and this section provides the instructions for running them. The ImageNet training code is based on the official PyTorch example.

To train a model on ImageNet, run train_imagenet.py with the desired model architecture and the path to the ImageNet dataset:

Simple Training

python train_imagenet.py -a resnet18 [imagenet-folder with train and val folders]
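
Since the script follows the official PyTorch ImageNet example, the dataset folder is read with torchvision.datasets.ImageFolder and must contain one subfolder per class under train/ and val/. A typical layout (the WordNet IDs below are illustrative):

```
imagenet/
├── train/
│   ├── n01440764/
│   │   ├── n01440764_10026.JPEG
│   │   └── ...
│   └── ...
└── val/
    ├── n01440764/
    │   └── ...
    └── ...
```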

The default learning rate schedule starts at 0.1 and decays by a factor of 10 every 30 epochs. This is appropriate for ResNet and models with batch normalization, but too high for AlexNet and VGG. Use 0.01 as the initial learning rate for AlexNet or VGG:

python train_imagenet.py -a alexnet --lr 0.01 [imagenet-folder with train and val folders]
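
The step decay described above amounts to dividing the learning rate by 10 every 30 epochs; a minimal sketch, mirroring the helper in the official PyTorch ImageNet example:

```python
def adjust_learning_rate(optimizer, epoch, initial_lr=0.1):
    """Decay the learning rate by a factor of 10 every 30 epochs."""
    lr = initial_lr * (0.1 ** (epoch // 30))
    for param_group in optimizer.param_groups:
        param_group["lr"] = lr
```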

Note, however, that we do not provide model definitions for AlexNet, VGG, etc.; only the ResNet family and MobileNetV2 are officially supported.

Multi-processing Distributed Data Parallel Training

You should always use the NCCL backend for multi-processing distributed training since it currently provides the best distributed training performance.

Single node, multiple GPUs:

python train_imagenet.py -a resnet50 --dist-url 'tcp://127.0.0.1:FREEPORT' --dist-backend 'nccl' --multiprocessing-distributed --world-size 1 --rank 0 [imagenet-folder with train and val folders]

Multiple nodes:

Node 0:

python train_imagenet.py -a resnet50 --dist-url 'tcp://IP_OF_NODE0:FREEPORT' --dist-backend 'nccl' --multiprocessing-distributed --world-size 2 --rank 0 [imagenet-folder with train and val folders]

Node 1:

python train_imagenet.py -a resnet50 --dist-url 'tcp://IP_OF_NODE0:FREEPORT' --dist-backend 'nccl' --multiprocessing-distributed --world-size 2 --rank 1 [imagenet-folder with train and val folders]
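
Under the hood, --multiprocessing-distributed spawns one worker per GPU and each worker joins a shared NCCL process group. The sketch below shows only the rank arithmetic behind these flags; it is simplified, not this repository's exact code, and the port is a placeholder:

```python
import torch
import torch.distributed as dist
import torch.multiprocessing as mp


def worker(local_rank, ngpus_per_node, node_rank, nnodes, dist_url):
    world_size = nnodes * ngpus_per_node                  # total process count
    global_rank = node_rank * ngpus_per_node + local_rank
    dist.init_process_group("nccl", init_method=dist_url,
                            world_size=world_size, rank=global_rank)
    torch.cuda.set_device(local_rank)
    # ... build the model, wrap it in DistributedDataParallel, train ...


if __name__ == "__main__":
    ngpus = torch.cuda.device_count()
    # Single-node case (node_rank=0, nnodes=1); multi-node runs launch this
    # once per node with the appropriate node_rank and a shared dist_url.
    mp.spawn(worker, nprocs=ngpus, args=(ngpus, 0, 1, "tcp://127.0.0.1:23456"))
```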

Usage

usage: train_imagenet.py  [-h] [--arch ARCH] [-j N] [--epochs N] [--start-epoch N] [-b N]
                          [--lr LR] [--momentum M] [--weight-decay W] [--print-freq N]
                          [--resume PATH] [-e] [--pretrained] [--world-size WORLD_SIZE]
                          [--rank RANK] [--dist-url DIST_URL]
                          [--dist-backend DIST_BACKEND] [--seed SEED] [--gpu GPU]
                          [--multiprocessing-distributed]
                          DIR

PyTorch ImageNet Training

positional arguments:
  DIR                   path to dataset

optional arguments:
  -h, --help            show this help message and exit
  --arch ARCH, -a ARCH  model architecture: alexnet | densenet121 |
                        densenet161 | densenet169 | densenet201 |
                        resnet101 | resnet152 | resnet18 | resnet34 |
                        resnet50 | squeezenet1_0 | squeezenet1_1 | vgg11 |
                        vgg11_bn | vgg13 | vgg13_bn | vgg16 | vgg16_bn | vgg19
                        | vgg19_bn (default: resnet18)
  -j N, --workers N     number of data loading workers (default: 4)
  --epochs N            number of total epochs to run
  --start-epoch N       manual epoch number (useful on restarts)
  -b N, --batch-size N  mini-batch size (default: 256), this is the total
                        batch size of all GPUs on the current node when using
                        Data Parallel or Distributed Data Parallel
  --lr LR, --learning-rate LR
                        initial learning rate
  --momentum M          momentum
  --weight-decay W, --wd W
                        weight decay (default: 1e-4)
  --print-freq N, -p N  print frequency (default: 10)
  --resume PATH         path to latest checkpoint (default: none)
  -e, --evaluate        evaluate model on validation set
  --pretrained          use pre-trained model
  --world-size WORLD_SIZE
                        number of nodes for distributed training
  --rank RANK           node rank for distributed training
  --dist-url DIST_URL   url used to set up distributed training
  --dist-backend DIST_BACKEND
                        distributed backend
  --seed SEED           seed for initializing training.
  --gpu GPU             GPU id to use.
  --multiprocessing-distributed
                        Use multi-processing distributed training to launch N
                        processes per node, which has N GPUs. This is the
                        fastest way to use PyTorch for either single node or
                        multi node data parallel training

Cite our work:

@InProceedings{Misra_2021_WACV,
    author    = {Misra, Diganta and Nalamada, Trikay and Arasanipalai, Ajay Uppili and Hou, Qibin},
    title     = {Rotate to Attend: Convolutional Triplet Attention Module},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2021},
    pages     = {3139-3148}
}