All Projects → JDAI-CV → Image Captioning

JDAI-CV / Image Captioning

Implementation of 'X-Linear Attention Networks for Image Captioning' [CVPR 2020]

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Image Captioning

Im2p
Tensorflow implementation of paper: A Hierarchical Approach for Generating Descriptive Image Paragraphs
Stars: ✭ 15 (-91.23%)
Mutual labels:  image-captioning
Automatic Image Captioning
Generating Captions for images using Deep Learning
Stars: ✭ 84 (-50.88%)
Mutual labels:  image-captioning
Sightseq
Computer vision tools for fairseq, containing PyTorch implementation of text recognition and object detection
Stars: ✭ 116 (-32.16%)
Mutual labels:  image-captioning
Bottom Up Attention
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
Stars: ✭ 989 (+478.36%)
Mutual labels:  image-captioning
Cameramanager
Simple Swift class to provide all the configurations you need to create custom camera view in your app
Stars: ✭ 1,130 (+560.82%)
Mutual labels:  image-captioning
Arnet
CVPR 2018 - Regularizing RNNs for Caption Generation by Reconstructing The Past with The Present
Stars: ✭ 94 (-45.03%)
Mutual labels:  image-captioning
Show Attend And Tell
TensorFlow Implementation of "Show, Attend and Tell"
Stars: ✭ 869 (+408.19%)
Mutual labels:  image-captioning
Image Caption Generator
[DEPRECATED] A Neural Network based generative model for captioning images using Tensorflow
Stars: ✭ 141 (-17.54%)
Mutual labels:  image-captioning
Image Text Papers
Image Caption and Text to Image papers.
Stars: ✭ 71 (-58.48%)
Mutual labels:  image-captioning
Gis
gis (go image server) go 实现的图片服务,实现基本的上传,下载,存储,按比例裁剪等功能
Stars: ✭ 108 (-36.84%)
Mutual labels:  image-captioning
Image captioning
generate captions for images using a CNN-RNN model that is trained on the Microsoft Common Objects in COntext (MS COCO) dataset
Stars: ✭ 51 (-70.18%)
Mutual labels:  image-captioning
Coco Cn
Enriching MS-COCO with Chinese sentences and tags for cross-lingual multimedia tasks
Stars: ✭ 57 (-66.67%)
Mutual labels:  image-captioning
Medical Report Generation
A pytorch implementation of On the Automatic Generation of Medical Imaging Reports.
Stars: ✭ 100 (-41.52%)
Mutual labels:  image-captioning
Punny captions
An implementation of the NAACL 2018 paper "Punny Captions: Witty Wordplay in Image Descriptions".
Stars: ✭ 31 (-81.87%)
Mutual labels:  image-captioning
A Pytorch Tutorial To Image Captioning
Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
Stars: ✭ 1,867 (+991.81%)
Mutual labels:  image-captioning
Neural Image Captioning
Implementation of Neural Image Captioning model using Keras with Theano backend
Stars: ✭ 12 (-92.98%)
Mutual labels:  image-captioning
Transformer image caption
Image Captioning based on Bottom-Up and Top-Down Attention model
Stars: ✭ 94 (-45.03%)
Mutual labels:  image-captioning
Show Adapt And Tell
Code for "Show, Adapt and Tell: Adversarial Training of Cross-domain Image Captioner" in ICCV 2017
Stars: ✭ 146 (-14.62%)
Mutual labels:  image-captioning
Image Caption Generator
A neural network to generate captions for an image using CNN and RNN with BEAM Search.
Stars: ✭ 126 (-26.32%)
Mutual labels:  image-captioning
Video2description
Video to Text: Generates description in natural language for given video (Video Captioning)
Stars: ✭ 107 (-37.43%)
Mutual labels:  image-captioning

Introduction

This repository is for X-Linear Attention Networks for Image Captioning (CVPR 2020). The original paper can be found here.

Please cite with the following BibTeX:

@inproceedings{xlinear2020cvpr,
  title={X-Linear Attention Networks for Image Captioning},
  author={Pan, Yingwei and Yao, Ting and Li, Yehao and Mei, Tao},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2020}
}

Requirements

Data preparation

  1. Download the bottom up features and convert them to npz files
python2 tools/create_feats.py --infeats bottom_up_tsv --outfolder ./mscoco/feature/up_down_10_100
  1. Download the annotations into the mscoco folder. More details about data preparation can be referred to self-critical.pytorch

  2. Download coco-caption and setup the path of __C.INFERENCE.COCO_PATH in lib/config.py

  3. The pretrained models and results can be downloaded here.

  4. The pretrained SENet-154 model can be downloaded here.

Training

Train X-LAN model

bash experiments/xlan/train.sh

Train X-LAN model using self critical

Copy the pretrained model into experiments/xlan_rl/snapshot and run the script

bash experiments/xlan_rl/train.sh

Train X-LAN transformer model

bash experiments/xtransformer/train.sh

Train X-LAN transformer model using self critical

Copy the pretrained model into experiments/xtransformer_rl/snapshot and run the script

bash experiments/xtransformer_rl/train.sh

Evaluation

CUDA_VISIBLE_DEVICES=0 python3 main_test.py --folder experiments/model_folder --resume model_epoch

Acknowledgements

Thanks the contribution of self-critical.pytorch and awesome PyTorch team.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].