
jiasenlu / Adaptiveattention

License: other
Implementation of "Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning"

Projects that are alternatives to or similar to Adaptiveattention

Show Attend And Tell
TensorFlow Implementation of "Show, Attend and Tell"
Stars: ✭ 869 (+186.8%)
Mutual labels:  jupyter-notebook, attention-mechanism, image-captioning
Pytorch Question Answering
Important paper implementations for Question Answering using PyTorch
Stars: ✭ 154 (-49.17%)
Mutual labels:  jupyter-notebook, attention-mechanism
Image Caption Generator
[DEPRECATED] A Neural Network based generative model for captioning images using Tensorflow
Stars: ✭ 141 (-53.47%)
Mutual labels:  jupyter-notebook, image-captioning
Up Down Captioner
Automatic image captioning model based on Caffe, using features from bottom-up attention.
Stars: ✭ 195 (-35.64%)
Mutual labels:  jupyter-notebook, image-captioning
Linear Attention Recurrent Neural Network
A recurrent attention module consisting of an LSTM cell which can query its own past cell states by the means of windowed multi-head attention. The formulas are derived from the BN-LSTM and the Transformer Network. The LARNN cell with attention can be easily used inside a loop on the cell state, just like any other RNN. (LARNN)
Stars: ✭ 119 (-60.73%)
Mutual labels:  jupyter-notebook, attention-mechanism
Yolov3 Point
A from-scratch YOLOv3 tutorial with annotated code, plus attention modules (SE, SPP, RFB, etc.)
Stars: ✭ 119 (-60.73%)
Mutual labels:  jupyter-notebook, attention-mechanism
Graph attention pool
Attention over nodes in Graph Neural Networks using PyTorch (NeurIPS 2019)
Stars: ✭ 186 (-38.61%)
Mutual labels:  jupyter-notebook, attention-mechanism
Deep Dream In Pytorch
Pytorch implementation of the DeepDream computer vision algorithm
Stars: ✭ 90 (-70.3%)
Mutual labels:  jupyter-notebook, torch
Orn
Oriented Response Networks, in CVPR 2017
Stars: ✭ 207 (-31.68%)
Mutual labels:  jupyter-notebook, torch
Triplet Attention
Official PyTorch Implementation for "Rotate to Attend: Convolutional Triplet Attention Module." [WACV 2021]
Stars: ✭ 222 (-26.73%)
Mutual labels:  jupyter-notebook, attention-mechanism
Image-Caption
Using LSTM or Transformer to solve Image Captioning in Pytorch
Stars: ✭ 36 (-88.12%)
Mutual labels:  image-captioning, attention-mechanism
Pytorch Learners Tutorial
PyTorch tutorial for learners
Stars: ✭ 97 (-67.99%)
Mutual labels:  jupyter-notebook, torch
Transformer image caption
Image Captioning based on Bottom-Up and Top-Down Attention model
Stars: ✭ 94 (-68.98%)
Mutual labels:  jupyter-notebook, image-captioning
Abstractive Summarization
Implementation of abstractive summarization using LSTM in the encoder-decoder architecture with local attention.
Stars: ✭ 128 (-57.76%)
Mutual labels:  jupyter-notebook, attention-mechanism
Beauty.torch
Understanding facial beauty with deep learning.
Stars: ✭ 90 (-70.3%)
Mutual labels:  jupyter-notebook, torch
Poetry Seq2seq
Chinese Poetry Generation
Stars: ✭ 159 (-47.52%)
Mutual labels:  jupyter-notebook, attention-mechanism
Image Captioning
Image Captioning using InceptionV3 and beam search
Stars: ✭ 290 (-4.29%)
Mutual labels:  jupyter-notebook, image-captioning
Group Level Emotion Recognition
Model submitted for the ICMI 2018 EmotiW Group-Level Emotion Recognition Challenge
Stars: ✭ 70 (-76.9%)
Mutual labels:  jupyter-notebook, attention-mechanism
Automatic Image Captioning
Generating Captions for images using Deep Learning
Stars: ✭ 84 (-72.28%)
Mutual labels:  jupyter-notebook, image-captioning
Csa Inpainting
Coherent Semantic Attention for image inpainting(ICCV 2019)
Stars: ✭ 202 (-33.33%)
Mutual labels:  jupyter-notebook, attention-mechanism

AdaptiveAttention

Implementation of "Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning"

[Figure: teaser results]
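The key idea is a "visual sentinel": at each decoding step the model computes a gate that decides how much to attend to the image versus how much to fall back on the decoder's own language-model state. Below is a minimal Lua/Torch sketch of that mixing step; the dimensions, weights, and variable names are illustrative assumptions, not the repository's actual code.

require 'torch'

local d = 512                         -- hidden size (assumed for illustration)
local Wx = torch.randn(d, d)          -- gate weights for the LSTM input
local Wh = torch.randn(d, d)          -- gate weights for the previous hidden state

-- quantities available at one decoding step (random stand-ins here)
local x_t    = torch.randn(d)         -- current LSTM input
local h_prev = torch.randn(d)         -- previous hidden state
local m_t    = torch.randn(d)         -- current memory cell
local c_t    = torch.randn(d)         -- spatial attention context over image features
local beta_t = 0.3                    -- sentinel gate weight produced by the attention module

-- sentinel gate and visual sentinel: s_t = sigmoid(Wx*x_t + Wh*h_prev) .* tanh(m_t)
local g_t = torch.sigmoid(Wx * x_t + Wh * h_prev)
local s_t = torch.cmul(g_t, torch.tanh(m_t))

-- adaptive context: mix the sentinel with the visual context
local c_hat = s_t * beta_t + c_t * (1 - beta_t)

When beta_t is close to 1, the decoder relies on the sentinel rather than the image, which is what lets the model "know when to look".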

Requirements

Training the model requires a GPU with 12GB of memory; if you do not have a GPU, you can use the pretrained model directly for inference.

This code is written in Lua and requires Torch. The preprocessing code is in Python, and you need to install NLTK if you want to use it to tokenize the captions.

You also need to install a number of supporting Torch packages in order to run the code successfully (a rough sketch follows).
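As a rough guide, Torch code built on NeuralTalk2 typically depends on packages along the following lines; this is an assumed list for illustration, so check the repository for the exact packages and links.

luarocks install nn
luarocks install nngraph
luarocks install image
luarocks install cutorch   # GPU backend
luarocks install cunn      # GPU modules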

Pretrained Model

The pre-trained model for COCO can be downloaded here. The pre-trained model for Flickr30K can be downloaded here.
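With a pretrained model you can caption images without any training. As a hypothetical invocation in the style of NeuralTalk2's eval script, on which this code is based (the script name and flags are assumptions; check the repository for the actual entry point):

th eval.lua -model model_id1_20.t7 -image_folder /path/to/images -num_images 10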

Vocabulary File

Download the corresponding vocabulary files for COCO and Flickr30k.

Download Dataset

The first thing you need to do is download the data and run some preprocessing. Head over to the data/ folder and run the corresponding IPython script. It will download, preprocess, and generate coco_raw.json.

Download the COCO and Flickr30k image datasets, extract the images, and put them in a directory of your choice.
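For example, a possible workflow for COCO looks like this; the URLs are the official MS COCO download links, while the paths and the notebook step are illustrative assumptions:

cd data
jupyter notebook                      # run the COCO preprocessing notebook; it writes coco_raw.json
mkdir -p images && cd images
wget http://images.cocodataset.org/zips/train2014.zip && unzip train2014.zip
wget http://images.cocodataset.org/zips/val2014.zip && unzip val2014.zip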

Training a New Model on MS COCO

First, train the language model without finetuning the CNN:

th train.lua -batch_size 20 

To finetune the CNN, load the saved model and train for another 15-20 epochs:

th train.lua -batch_size 16 -startEpoch 21 -start_from 'model_id1_20.t7'

More Results on Spatial Attention and the Visual Sentinel

[Figures: additional spatial attention and visual sentinel results]

For more visualization results, you can visit here (note that it will load more than 1000 images and their results...).

Reference

If you use this code as part of any published research, please acknowledge the following paper:

@inproceedings{Lu2017Adaptive,
  author    = {Lu, Jiasen and Xiong, Caiming and Parikh, Devi and Socher, Richard},
  title     = {Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning},
  booktitle = {CVPR},
  year      = {2017}
}

Acknowledgement

This code was developed based on NeuralTalk2.

Thanks to the Torch team and to Facebook's ResNet implementation.

License

BSD 3-Clause License
