richardaecn / cvpr18-caption-eval

License: MIT
Learning to Evaluate Image Captioning. CVPR 2018

Programming Languages

  • Python: 139,335 projects (#7 most used programming language)
  • Shell: 77,523 projects

Projects that are alternatives of or similar to cvpr18-caption-eval

Image-Captioning
Image Captioning with Keras
Stars: ✭ 60 (-24.05%)
Mutual labels:  caption, image-captioning
stylenet
A PyTorch implementation of "StyleNet: Generating Attractive Visual Captions with Styles"
Stars: ✭ 58 (-26.58%)
Mutual labels:  caption, image-captioning
localized-narratives
Localized Narratives
Stars: ✭ 60 (-24.05%)
Mutual labels:  image-captioning
hexo-image-caption
add caption for images within posts
Stars: ✭ 21 (-73.42%)
Mutual labels:  caption
G2LTex
Code for CVPR 2018 paper --- Texture Mapping for 3D Reconstruction with RGB-D Sensor
Stars: ✭ 104 (+31.65%)
Mutual labels:  cvpr2018
VoxelMorph-PyTorch
An unofficial PyTorch implementation of VoxelMorph- An unsupervised 3D deformable image registration method
Stars: ✭ 68 (-13.92%)
Mutual labels:  cvpr2018
image-captioning-DLCT
Official pytorch implementation of paper "Dual-Level Collaborative Transformer for Image Captioning" (AAAI 2021).
Stars: ✭ 134 (+69.62%)
Mutual labels:  image-captioning
f1-communities
A novel approach to evaluate community detection algorithms on ground truth
Stars: ✭ 20 (-74.68%)
Mutual labels:  evaluation-metrics
im2p
Tensorflow implement of paper: A Hierarchical Approach for Generating Descriptive Image Paragraphs
Stars: ✭ 43 (-45.57%)
Mutual labels:  image-captioning
nekocap
Browser extension for creating & uploading community captions for YouTube, niconico and other video sharing sites.
Stars: ✭ 27 (-65.82%)
Mutual labels:  caption
NLP-tools
Useful python NLP tools (evaluation, GUI interface, tokenization)
Stars: ✭ 39 (-50.63%)
Mutual labels:  evaluation-metrics
Machine-Learning
The projects I do in Machine Learning with PyTorch, keras, Tensorflow, scikit learn and Python.
Stars: ✭ 54 (-31.65%)
Mutual labels:  image-captioning
gcnet
GCNet (GIF Caption Network) | Neural Network Generated GIF Captions
Stars: ✭ 14 (-82.28%)
Mutual labels:  caption
CS231n
My solutions for Assignments of CS231n: Convolutional Neural Networks for Visual Recognition
Stars: ✭ 30 (-62.03%)
Mutual labels:  image-captioning
text-detection-fots.pytorch
FOTS text detection branch reimplementation, hmean: 83.3%
Stars: ✭ 80 (+1.27%)
Mutual labels:  cvpr2018
RSTNet
RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words (CVPR 2021)
Stars: ✭ 71 (-10.13%)
Mutual labels:  image-captioning
DisguiseNet
Code for DisguiseNet : A Contrastive Approach for Disguised Face Verification in the Wild
Stars: ✭ 20 (-74.68%)
Mutual labels:  cvpr2018
DVQA dataset
DVQA Dataset: A Bar chart question answering dataset presented at CVPR 2018
Stars: ✭ 20 (-74.68%)
Mutual labels:  cvpr2018
ASNet
Salient Object Detection Driven by Fixation Prediction (CVPR2018)
Stars: ✭ 41 (-48.1%)
Mutual labels:  cvpr2018
captioning chainer
A fast implementation of Neural Image Caption by Chainer
Stars: ✭ 17 (-78.48%)
Mutual labels:  image-captioning

Learning to Evaluate Image Captioning

TensorFlow implementation for the paper:

Learning to Evaluate Image Captioning
Yin Cui, Guandao Yang, Andreas Veit, Xun Huang, Serge Belongie
CVPR 2018

This repository contains a discriminator that can be trained to evaluate image captioning systems. The discriminator is trained to distinguish between machine-generated captions and human-written ones. At test time, the trained discriminator takes the candidate caption, the reference caption, and optionally the image to be captioned as input. Its output, the probability that the candidate caption is human-written, can be used to score the candidate caption. Please refer to our paper [link] for more detail.
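
Conceptually, the metric concatenates representations of the candidate caption, the reference caption, and (optionally) the image, and feeds them to a binary classifier. The snippet below is a minimal, hypothetical NumPy sketch of that idea for illustration only; it is not the repository's actual TensorFlow model, and all names and dimensions are assumptions.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def score_caption(candidate_emb, reference_emb, image_feat, params):
    # Hypothetical discriminator: concatenate the candidate caption,
    # reference caption, and image representations, pass them through a
    # small MLP, and return the probability that the candidate is
    # human-written (higher = more human-like).
    W1, b1, W2, b2 = params
    x = np.concatenate([candidate_emb, reference_emb, image_feat])
    hidden = np.maximum(0.0, W1.dot(x) + b1)  # ReLU hidden layer
    return float(sigmoid(W2.dot(hidden) + b2))

# Toy usage with random weights and 8-dimensional embeddings.
rng = np.random.RandomState(0)
dim, hid = 8, 16
params = (rng.randn(hid, 3 * dim), np.zeros(hid), rng.randn(hid), 0.0)
print(score_caption(rng.randn(dim), rng.randn(dim), rng.randn(dim), params))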

Dependencies

  • Python (2.7)
  • Tensorflow (>1.4)
  • PyTorch (for extracting ResNet image features)
  • ProgressBar
  • NLTK

Preparation

  1. Clone the repository with the --recursive option (to include the bilinear pooling submodule):
git clone --recursive https://github.com/richardaecn/cvpr18-caption-eval.git
  2. Install dependencies. Please refer to the official websites of TensorFlow, PyTorch, and NLTK for installation guides. For the other dependencies, use:
pip install -r requirements.txt
  3. Download data. This script downloads the needed data; a detailed description of the data can be found in "./download.sh":
./download.sh
  4. Generate the vocabulary:
python scripts/preparation/prep_vocab.py
  5. Extract image features. The following script downloads the COCO dataset and a ResNet checkpoint, then extracts image features from the COCO dataset using ResNet. This might take a few hours:
./download_coco_dataset.sh
cd scripts/features/
./download.sh
python feature_extraction_coco.py --data-dir ../../data/ --coco-img-dir ../../data

Alternatively, we provide a [link] to download features extracted from ResNet152. Please put all *.npy files under "./data/resnet152/".
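
As a quick sanity check after downloading (a hypothetical snippet, assuming the features are stored as standard NumPy arrays), you can load one of the files and inspect its shape:

import glob
import numpy as np

files = sorted(glob.glob("data/resnet152/*.npy"))
print("found %d feature files" % len(files))
if files:
    feat = np.load(files[0])
    print(files[0], feat.shape, feat.dtype)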

Evaluation

To evaluate the results of an image captioning method, first put the model's output captions on the COCO dataset into the following JSON format:

{
    "<file-name-1>" : "<caption-1>",
    "<file-name-2>" : "<caption-2>",
    ...
    "<file-name-n>" : "<caption-n>",
}

Note that <caption-i> is the caption represented as text, and <file-name-i> is the file name of the corresponding image. Captions should be all lower-cased and have no \n at the end. Examples of such files, produced by running the open-sourced NeuralTalk, Show and Tell, and Show, Attend and Tell models, can be found in the examples folder: examples/neuraltalk_all_captions.json, examples/showandtell_all_captions.json, examples/showattendandtell_all_captions.json, and examples/human_all_captions.json.
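
For instance, a submission file in this format could be written from a Python dictionary mapping file names to captions as follows. This is a hypothetical helper (the file names and captions are placeholders); only the lower-casing and newline stripping are requirements from above.

import json

# Hypothetical model outputs: image file name -> generated caption.
captions = {
    "image1.jpg": "A man riding a wave on a surfboard.",
    "image2.jpg": "Two dogs playing in the grass\n",
}

# Lower-case each caption and strip trailing whitespace/newlines, as required above.
submission = {name: cap.strip().lower() for name, cap in captions.items()}

with open("my_model_all_captions.json", "w") as f:
    json.dump(submission, f, indent=4)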

Make sure you have the NLTK Punkt sentence tokenizer installed in Python:

import nltk
nltk.download('punkt')
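
Punkt is what NLTK's tokenizers rely on. As a quick, illustrative check that captions tokenize as expected (the actual preprocessing is done by the preparation scripts, not this snippet):

import nltk
print(nltk.word_tokenize("a man riding a wave on a surfboard."))
# e.g. ['a', 'man', 'riding', 'a', 'wave', 'on', 'a', 'surfboard', '.']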

The following command prepares the data so that it can be used for training:

python scripts/preparation/prep_submission.py --submission examples/neuraltalk_all_captions.json  --name neuraltalk

Note that we assume you've followed the steps in the Preparation section before running this command. This script creates a folder data/neuraltalk and three .npy files that contain the data needed for training the metric. Use the following command to train the metric:

python score.py --name neuraltalk

The results will be logged in the model/neuraltalk_scoring directory. If you use the default model architecture, the results will be in model/neuraltalk_scoring/mlp_1_img_1_512_0.txt.

The following are the scores for the three submissions (calculated as the average score over the last 10 epochs). Note that scores might differ slightly due to randomization in training.

Architecture        Epochs   NeuralTalk   Show and Tell   Show, Attend and Tell
mlp_1_img_1_512_0   30       0.038        0.056           0.077
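
For reference, each reported number is just the mean of the per-epoch scores over the last 10 epochs, e.g. (a hypothetical snippet, assuming you have collected one score per epoch from the log):

# Hypothetical list of per-epoch scores parsed from the log file.
epoch_scores = [0.035, 0.036, 0.040, 0.037, 0.039, 0.038,
                0.041, 0.037, 0.038, 0.039, 0.038, 0.040]
last10 = epoch_scores[-10:]
print(sum(last10) / float(len(last10)))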

Citation

If you find our work helpful in your research, please cite it as:

@inproceedings{Cui2018CaptionEval,
  title = {Learning to Evaluate Image Captioning},
  author = {Cui, Yin and Yang, Guandao and Veit, Andreas and Huang, Xun and Belongie, Serge},
  booktitle = {CVPR},
  year = {2018}
}