JDAI-CV / Image Captioning

Implementation of 'X-Linear Attention Networks for Image Captioning' [CVPR 2020]

Labels

image-captioning

Projects that are alternatives of or similar to Image Captioning

Tensorflow implementation of paper: A Hierarchical Approach for Generating Descriptive Image Paragraphs

Stars: ✭ 15 (-91.23%)

Mutual labels: image-captioning

Automatic Image Captioning

Generating Captions for images using Deep Learning

Stars: ✭ 84 (-50.88%)

Mutual labels: image-captioning

Computer vision tools for fairseq, containing PyTorch implementation of text recognition and object detection

Stars: ✭ 116 (-32.16%)

Mutual labels: image-captioning

Bottom Up Attention

Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome

Stars: ✭ 989 (+478.36%)

Mutual labels: image-captioning

Simple Swift class to provide all the configurations you need to create custom camera view in your app

Stars: ✭ 1,130 (+560.82%)

Mutual labels: image-captioning

CVPR 2018 - Regularizing RNNs for Caption Generation by Reconstructing The Past with The Present

Stars: ✭ 94 (-45.03%)

Mutual labels: image-captioning

Show Attend And Tell

TensorFlow Implementation of "Show, Attend and Tell"

Stars: ✭ 869 (+408.19%)

Mutual labels: image-captioning

Image Caption Generator

[DEPRECATED] A Neural Network based generative model for captioning images using Tensorflow

Stars: ✭ 141 (-17.54%)

Mutual labels: image-captioning

Image Text Papers

Image Caption and Text to Image papers.

Stars: ✭ 71 (-58.48%)

Mutual labels: image-captioning

gis (go image server): an image service implemented in Go, providing basic upload, download, storage, proportional cropping, and similar features

Stars: ✭ 108 (-36.84%)

Mutual labels: image-captioning

Image captioning

generate captions for images using a CNN-RNN model that is trained on the Microsoft Common Objects in COntext (MS COCO) dataset

Stars: ✭ 51 (-70.18%)

Mutual labels: image-captioning

Enriching MS-COCO with Chinese sentences and tags for cross-lingual multimedia tasks

Stars: ✭ 57 (-66.67%)

Mutual labels: image-captioning

Medical Report Generation

A pytorch implementation of On the Automatic Generation of Medical Imaging Reports.

Stars: ✭ 100 (-41.52%)

Mutual labels: image-captioning

An implementation of the NAACL 2018 paper "Punny Captions: Witty Wordplay in Image Descriptions".

Stars: ✭ 31 (-81.87%)

Mutual labels: image-captioning

A Pytorch Tutorial To Image Captioning

Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning

Stars: ✭ 1,867 (+991.81%)

Mutual labels: image-captioning

Neural Image Captioning

Implementation of Neural Image Captioning model using Keras with Theano backend

Stars: ✭ 12 (-92.98%)

Mutual labels: image-captioning

Transformer image caption

Image Captioning based on Bottom-Up and Top-Down Attention model

Stars: ✭ 94 (-45.03%)

Mutual labels: image-captioning

Show Adapt And Tell

Code for "Show, Adapt and Tell: Adversarial Training of Cross-domain Image Captioner" in ICCV 2017

Stars: ✭ 146 (-14.62%)

Mutual labels: image-captioning

Image Caption Generator

A neural network to generate captions for an image using CNN and RNN with BEAM Search.

Stars: ✭ 126 (-26.32%)

Mutual labels: image-captioning

Video2description

Video to Text: Generates description in natural language for given video (Video Captioning)

Stars: ✭ 107 (-37.43%)

Mutual labels: image-captioning

Introduction

This repository is for X-Linear Attention Networks for Image Captioning (CVPR 2020). The original paper can be found here.

Please cite with the following BibTeX:

@inproceedings{xlinear2020cvpr,
  title={X-Linear Attention Networks for Image Captioning},
  author={Pan, Yingwei and Yao, Ting and Li, Yehao and Mei, Tao},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2020}
}

Requirements

Python 3
CUDA 10
numpy
tqdm
easydict
PyTorch (>1.0)
torchvision
coco-caption
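
A quick way to see which of the Python packages above are missing from the current environment is sketched below. It assumes the usual import names (PyTorch imports as torch); coco-caption is a repository rather than an installable package, so it is not checked here. The helper name missing_modules is hypothetical.

```python
import importlib.util

# Import names assumed for the packages listed above (PyTorch imports as "torch").
REQUIRED = ("numpy", "tqdm", "easydict", "torch", "torchvision")


def missing_modules(names=REQUIRED):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```

This only checks that the modules can be found; it does not verify the PyTorch or CUDA versions.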

Data preparation

Download the bottom-up features and convert them to npz files:

python2 tools/create_feats.py --infeats bottom_up_tsv --outfolder ./mscoco/feature/up_down_10_100
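
To make the conversion step concrete, here is a minimal sketch of what such a conversion involves. It is not the repository's tools/create_feats.py. It assumes the column layout of the public bottom-up-attention TSV release (base64-encoded float32 arrays), and the output key name feat is an assumption as well.

```python
import base64
import csv
import os
from array import array

# Column order assumed from the public bottom-up-attention TSV release.
FIELDNAMES = ["image_id", "image_w", "image_h", "num_boxes", "boxes", "features"]


def iter_tsv_features(tsv_path):
    """Yield (image_id, num_boxes, flat float32 values) for each TSV row."""
    csv.field_size_limit(2**31 - 1)  # the feature column is one very long string
    with open(tsv_path, "r", newline="") as f:
        reader = csv.DictReader(f, delimiter="\t", fieldnames=FIELDNAMES)
        for row in reader:
            values = array("f")  # 4-byte floats in native byte order
            values.frombytes(base64.b64decode(row["features"]))
            yield int(row["image_id"]), int(row["num_boxes"]), values


def convert(tsv_path, out_folder):
    """Write one <image_id>.npz per image; the key name 'feat' is an assumption."""
    import numpy as np  # imported here so the decoder above needs only the stdlib

    os.makedirs(out_folder, exist_ok=True)
    for image_id, num_boxes, values in iter_tsv_features(tsv_path):
        feat = np.asarray(values, dtype=np.float32).reshape(num_boxes, -1)
        np.savez_compressed(os.path.join(out_folder, str(image_id)), feat=feat)
```

Each output file then holds a (num_boxes, feature_dim) array for one image, which matches the one-file-per-image layout suggested by the output folder in the command above.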

Download the annotations into the mscoco folder. For more details on data preparation, refer to self-critical.pytorch.
Download coco-caption and set the path of __C.INFERENCE.COCO_PATH in lib/config.py.
The pretrained models and results can be downloaded here.
The pretrained SENet-154 model can be downloaded here.
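
Before editing lib/config.py, it can help to confirm that the chosen path really points at a coco-caption checkout. The sketch below assumes the checkout contains a pycocoevalcap directory, as in the public coco-caption repository; the helper name and the example path are hypothetical.

```python
import os


def looks_like_coco_caption(path):
    """True if path contains a pycocoevalcap directory (assumed checkout layout)."""
    return os.path.isdir(os.path.join(path, "pycocoevalcap"))


# Usage (hypothetical path):
#   looks_like_coco_caption("/path/to/coco-caption")
```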

Training

Train X-LAN model

bash experiments/xlan/train.sh

Train X-LAN model using self-critical training

Copy the pretrained model into experiments/xlan_rl/snapshot and run the script

bash experiments/xlan_rl/train.sh
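
The copy step can be scripted. The following is a small sketch, assuming only that the training script looks for the checkpoint inside the snapshot subfolder of the experiment directory, as the instruction above states; the function name and any checkpoint path passed to it are hypothetical.

```python
import os
import shutil


def stage_pretrained(checkpoint_path, experiment_dir):
    """Copy a pretrained checkpoint into <experiment_dir>/snapshot; return the new path."""
    snapshot_dir = os.path.join(experiment_dir, "snapshot")
    os.makedirs(snapshot_dir, exist_ok=True)
    return shutil.copy2(checkpoint_path, snapshot_dir)


# Usage (hypothetical checkpoint path):
#   stage_pretrained("path/to/pretrained.pth", "experiments/xlan_rl")
```

The same helper applies to experiments/xtransformer_rl.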

Train X-LAN transformer model

bash experiments/xtransformer/train.sh

Train X-LAN transformer model using self-critical training

Copy the pretrained model into experiments/xtransformer_rl/snapshot and run the script

bash experiments/xtransformer_rl/train.sh
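
The two _rl runs above presumably refer to self-critical sequence training, the approach behind self-critical.pytorch. As a rough illustration of the idea (not the repository's implementation), a sampled caption is rewarded relative to the greedy-decoded caption for the same image, and that difference weights the sampled caption's log-probability. The function name and the scalar rewards (for example a CIDEr score per caption) are illustrative assumptions.

```python
def scst_loss(sample_logprobs, sample_rewards, greedy_rewards):
    """Batch mean of -(r_sample - r_greedy) * log p(sampled caption).

    sample_logprobs: summed log-probability of each sampled caption
    sample_rewards:  reward of each sampled caption
    greedy_rewards:  reward of the greedy caption for the same image (baseline)
    """
    losses = []
    for logp, r_sample, r_greedy in zip(sample_logprobs, sample_rewards, greedy_rewards):
        advantage = r_sample - r_greedy  # positive when sampling beat greedy decoding
        losses.append(-advantage * logp)
    return sum(losses) / len(losses)
```

Minimizing this value raises the probability of samples that beat the greedy baseline and lowers it for samples that fall short.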

Evaluation

CUDA_VISIBLE_DEVICES=0 python3 main_test.py --folder experiments/model_folder --resume model_epoch
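
In the command above, model_folder and model_epoch are placeholders. A small sketch for building the same invocation from Python follows; treating model_epoch as the epoch number of a saved snapshot is an assumption, and the function name is hypothetical.

```python
import os


def build_eval_command(folder, epoch, gpu="0"):
    """Return (argv, env) mirroring the evaluation command line above."""
    argv = ["python3", "main_test.py", "--folder", folder, "--resume", str(epoch)]
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
    return argv, env


# Usage, run from the repository root (the epoch value is hypothetical):
#   import subprocess
#   argv, env = build_eval_command("experiments/xlan", 10)
#   subprocess.run(argv, env=env, check=True)
```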

Acknowledgements

Thanks to the contributions of self-critical.pytorch and the awesome PyTorch team.

Stars: ✭ 171