
yunjey / Show Attend And Tell

License: MIT
TensorFlow Implementation of "Show, Attend and Tell"

Projects that are alternatives to or similar to Show Attend And Tell

Adaptiveattention
Implementation of "Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning"
Stars: ✭ 303 (-65.13%)
Mutual labels:  jupyter-notebook, attention-mechanism, image-captioning
Triplet Attention
Official PyTorch Implementation for "Rotate to Attend: Convolutional Triplet Attention Module." [WACV 2021]
Stars: ✭ 222 (-74.45%)
Mutual labels:  jupyter-notebook, attention-mechanism
Csa Inpainting
Coherent Semantic Attention for image inpainting(ICCV 2019)
Stars: ✭ 202 (-76.75%)
Mutual labels:  jupyter-notebook, attention-mechanism
Image Captioning
Image Captioning using InceptionV3 and beam search
Stars: ✭ 290 (-66.63%)
Mutual labels:  jupyter-notebook, image-captioning
Poetry Seq2seq
Chinese Poetry Generation
Stars: ✭ 159 (-81.7%)
Mutual labels:  jupyter-notebook, attention-mechanism
Graph attention pool
Attention over nodes in Graph Neural Networks using PyTorch (NeurIPS 2019)
Stars: ✭ 186 (-78.6%)
Mutual labels:  jupyter-notebook, attention-mechanism
Da Rnn
📃 **Unofficial** PyTorch Implementation of DA-RNN (arXiv:1704.02971)
Stars: ✭ 256 (-70.54%)
Mutual labels:  jupyter-notebook, attention-mechanism
Yolov3 Point
A from-scratch YOLOv3 tutorial with annotated code and attention modules (SE, SPP, RFB, etc.)
Stars: ✭ 119 (-86.31%)
Mutual labels:  jupyter-notebook, attention-mechanism
Cs231
Complete Assignments for CS231n: Convolutional Neural Networks for Visual Recognition
Stars: ✭ 317 (-63.52%)
Mutual labels:  jupyter-notebook, image-captioning
Action Recognition Visual Attention
Action recognition using soft attention based deep recurrent neural networks
Stars: ✭ 350 (-59.72%)
Mutual labels:  jupyter-notebook, attention-mechanism
Pytorch Question Answering
Important paper implementations for Question Answering using PyTorch
Stars: ✭ 154 (-82.28%)
Mutual labels:  jupyter-notebook, attention-mechanism
Deeplearning.ai Natural Language Processing Specialization
This repository contains my full work and notes on Coursera's NLP Specialization (Natural Language Processing) taught by the instructor Younes Bensouda Mourri and Łukasz Kaiser offered by deeplearning.ai
Stars: ✭ 473 (-45.57%)
Mutual labels:  jupyter-notebook, attention-mechanism
Image Caption Generator
[DEPRECATED] A Neural Network based generative model for captioning images using Tensorflow
Stars: ✭ 141 (-83.77%)
Mutual labels:  jupyter-notebook, image-captioning
Up Down Captioner
Automatic image captioning model based on Caffe, using features from bottom-up attention.
Stars: ✭ 195 (-77.56%)
Mutual labels:  jupyter-notebook, image-captioning
Abstractive Summarization
Implementation of abstractive summarization using LSTM in the encoder-decoder architecture with local attention.
Stars: ✭ 128 (-85.27%)
Mutual labels:  jupyter-notebook, attention-mechanism
Image-Caption
Using LSTM or Transformer to solve Image Captioning in Pytorch
Stars: ✭ 36 (-95.86%)
Mutual labels:  image-captioning, attention-mechanism
Transformer image caption
Image Captioning based on Bottom-Up and Top-Down Attention model
Stars: ✭ 94 (-89.18%)
Mutual labels:  jupyter-notebook, image-captioning
Linear Attention Recurrent Neural Network
A recurrent attention module consisting of an LSTM cell which can query its own past cell states by the means of windowed multi-head attention. The formulas are derived from the BN-LSTM and the Transformer Network. The LARNN cell with attention can be easily used inside a loop on the cell state, just like any other RNN. (LARNN)
Stars: ✭ 119 (-86.31%)
Mutual labels:  jupyter-notebook, attention-mechanism
Attention is all you need
Transformer of "Attention Is All You Need" (Vaswani et al. 2017) by Chainer.
Stars: ✭ 303 (-65.13%)
Mutual labels:  jupyter-notebook, attention-mechanism
Pytorch Original Transformer
My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing otherwise seemingly hard concepts. Currently included IWSLT pretrained models.
Stars: ✭ 411 (-52.7%)
Mutual labels:  jupyter-notebook, attention-mechanism

Show, Attend and Tell

Update (December 2, 2016): This is a TensorFlow implementation of Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, which introduces an attention-based image caption generator. The model shifts its attention to the relevant part of the image as it generates each word.
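The idea behind this soft attention, in brief: at every decoding step the LSTM hidden state scores each spatial feature vector produced by the CNN, the scores are softmax-normalized into attention weights, and the weighted average of the features becomes the context vector used to predict the next word. Below is a minimal NumPy sketch of that single step; it is not the repository's TensorFlow code, and the names and shapes are illustrative only.

import numpy as np

def soft_attention(features, hidden, W_f, W_h, w_a):
    # features: (L, D) spatial feature vectors from the CNN (e.g. 196 regions from VGG conv5_3)
    # hidden:   (H,)   current LSTM hidden state
    # W_f: (D, K), W_h: (H, K), w_a: (K,) are learned projection weights (illustrative)
    scores = np.dot(np.tanh(np.dot(features, W_f) + np.dot(hidden, W_h)), w_a)  # (L,) relevance per region
    alpha = np.exp(scores - scores.max())
    alpha = alpha / alpha.sum()            # attention weights over regions, sum to 1
    context = np.dot(alpha, features)      # (D,) expected feature vector under alpha
    return context, alpha

The context vector is fed into the LSTM together with the previous word embedding, and alpha is what the attention visualizations below display.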


[attention visualization figure]


References

The authors' original Theano code: https://github.com/kelvinxu/arctic-captions

Another TensorFlow implementation: https://github.com/jazzsaxmafia/show_attend_and_tell.tensorflow


Getting Started

Prerequisites

First, clone this repo and the coco-caption repo (which provides pycocoevalcap) into the same directory.

$ git clone https://github.com/yunjey/show-attend-and-tell-tensorflow.git
$ git clone https://github.com/tylin/coco-caption.git

This code is written in Python 2.7 and requires TensorFlow 1.2. In addition, you need to install a few more packages to process the MSCOCO data set. I have provided a script to download the MSCOCO image dataset and the VGGNet19 model; downloading the data may take several hours depending on your network speed. Run the commands below, and the images will be downloaded into the image/ directory and the VGGNet19 model into the data/ directory.

$ cd show-attend-and-tell-tensorflow
$ pip install -r requirements.txt
$ chmod +x ./download.sh
$ ./download.sh

To feed the images to VGGNet, you should resize the MSCOCO images to a fixed size of 224x224. Run the command below, and the resized images will be stored in the image/train2014_resized/ and image/val2014_resized/ directories.

$ python resize.py
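For reference, the resizing step is conceptually simple. The sketch below shows the same idea with Pillow under the directory layout created by download.sh; the function name is illustrative and this is not the actual contents of resize.py.

import os
from PIL import Image

def resize_folder(src_dir, dst_dir, size=(224, 224)):
    # write a 224x224 copy of every image in src_dir into dst_dir
    if not os.path.exists(dst_dir):
        os.makedirs(dst_dir)
    for name in os.listdir(src_dir):
        img = Image.open(os.path.join(src_dir, name)).convert('RGB')
        img.resize(size).save(os.path.join(dst_dir, name))

resize_folder('./image/train2014/', './image/train2014_resized/')
resize_folder('./image/val2014/', './image/val2014_resized/')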

Before training the model, you have to preprocess the MSCOCO caption dataset. To generate the caption dataset and image feature vectors, run the command below.

$ python prepro.py
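prepro.py builds the caption-side data (a word-to-id vocabulary and fixed-length id sequences) and saves VGGNet19 feature vectors for each resized image. The snippet below sketches only the caption side; the special tokens, frequency threshold, and maximum length are illustrative assumptions, not the script's real settings.

from collections import Counter

def build_vocab(captions, min_count=1):
    # count words across all captions and keep those above the frequency threshold
    counter = Counter(w for c in captions for w in c.lower().split())
    vocab = {'<NULL>': 0, '<START>': 1, '<END>': 2}
    for word, count in counter.items():
        if count >= min_count:
            vocab[word] = len(vocab)
    return vocab

def encode_caption(caption, vocab, max_len=15):
    # map a caption to a fixed-length id sequence, padded with <NULL>
    # (out-of-vocabulary words also fall back to <NULL> here, purely for simplicity)
    words = caption.lower().split()[:max_len]
    ids = [vocab['<START>']] + [vocab.get(w, vocab['<NULL>']) for w in words] + [vocab['<END>']]
    return ids + [vocab['<NULL>']] * (max_len + 2 - len(ids))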

Train the model

To train the image captioning model, run the command below.

$ python train.py
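train.py handles the full training loop. For intuition about what it optimizes: the paper's objective is the per-word cross entropy of the ground-truth caption (optionally plus a doubly stochastic attention penalty). The following is a hedged NumPy sketch of the masked cross-entropy part only, written independently of the repository's code.

import numpy as np

def masked_caption_loss(logits, targets, mask):
    # logits:  (batch, T, vocab) unnormalized word scores from the decoder
    # targets: (batch, T)        ground-truth word ids
    # mask:    (batch, T)        1.0 for real words, 0.0 for padding
    exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs = exp / exp.sum(axis=-1, keepdims=True)          # softmax over the vocabulary
    b, t = targets.shape
    picked = probs[np.arange(b)[:, None], np.arange(t)[None, :], targets]
    nll = -np.log(picked + 1e-8) * mask                     # negative log-likelihood, padding ignored
    return nll.sum() / mask.sum()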

(optional) Tensorboard visualization

I have provided a TensorBoard visualization for real-time debugging. Open a new terminal, run the command below, and open http://localhost:6005/ in your web browser.

$ tensorboard --logdir='./log' --port=6005 

Evaluate the model

To generate captions, visualize attention weights and evaluate the model, please see evaluate_model.ipynb.
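The quantitative part of the evaluation uses the coco-caption (pycocoevalcap) repo cloned earlier; the notebook is the authoritative path. As a rough, hedged example of its BLEU scorer on toy data (the captions below are made up for illustration):

from pycocoevalcap.bleu.bleu import Bleu

# references and a generated caption, keyed by image id
gts = {0: ['a plane flying over a body of water', 'an airplane flies above the ocean']}
res = {0: ['a plane flying over a body of water']}

bleu, _ = Bleu(4).compute_score(gts, res)
print('BLEU-1 to BLEU-4: %s' % bleu)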


Results


Training data

(1) Generated caption: A plane flying in the sky with a landing gear down.

[attention visualization]

(2) Generated caption: A giraffe and two zebra standing in the field.

[attention visualization]

Validation data

(1) Generated caption: A large elephant standing in a dry grass field.

[attention visualization]

(2) Generated caption: A baby elephant standing on top of a dirt field.

[attention visualization]

Test data

(1) Generated caption: A plane flying over a body of water.

[attention visualization]

(2) Generated caption: A zebra standing in the grass near a tree.

[attention visualization]
