
soskek / captioning_chainer

Licence: MIT License
A fast implementation of Neural Image Caption by Chainer

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects

Projects that are alternatives to or similar to captioning_chainer

Image-Captioning-with-Beam-Search
Generating image captions using Xception Network and Beam Search in Keras
Stars: ✭ 18 (+5.88%)
Mutual labels:  rnn, image-captioning, beam-search
Image Captioning
Image Captioning using InceptionV3 and beam search
Stars: ✭ 290 (+1605.88%)
Mutual labels:  image-captioning, beam-search
Poetry Seq2seq
Chinese Poetry Generation
Stars: ✭ 159 (+835.29%)
Mutual labels:  rnn, beam-search
Arnet
CVPR 2018 - Regularizing RNNs for Caption Generation by Reconstructing The Past with The Present
Stars: ✭ 94 (+452.94%)
Mutual labels:  rnn, image-captioning
Image-Caption
Using LSTM or Transformer to solve Image Captioning in Pytorch
Stars: ✭ 36 (+111.76%)
Mutual labels:  image-captioning, beam-search
Caption generator
A modular library built on top of Keras and TensorFlow to generate a caption in natural language for any input image.
Stars: ✭ 243 (+1329.41%)
Mutual labels:  rnn, image-captioning
Image Caption Generator
A neural network to generate captions for an image using CNN and RNN with BEAM Search.
Stars: ✭ 126 (+641.18%)
Mutual labels:  image-captioning, beam-search
udacity-cvnd-projects
My solutions to the projects assigned for the Udacity Computer Vision Nanodegree
Stars: ✭ 36 (+111.76%)
Mutual labels:  rnn, image-captioning
chainer-notebooks
Jupyter notebooks for Chainer hands-on
Stars: ✭ 23 (+35.29%)
Mutual labels:  chainer, rnn
CS231n
My solutions for Assignments of CS231n: Convolutional Neural Networks for Visual Recognition
Stars: ✭ 30 (+76.47%)
Mutual labels:  rnn, image-captioning
chainer-wasserstein-gan
Chainer implementation of the Wasserstein GAN
Stars: ✭ 20 (+17.65%)
Mutual labels:  chainer
Signal-Classification-Comparison
Classify signal using Deep Learning on Tensorflow and various machine learning models.
Stars: ✭ 19 (+11.76%)
Mutual labels:  rnn
hcn
Hybrid Code Networks https://arxiv.org/abs/1702.03274
Stars: ✭ 81 (+376.47%)
Mutual labels:  rnn
voxelnet chainer
VoxelNet implementation in Chainer
Stars: ✭ 26 (+52.94%)
Mutual labels:  chainer
time-series-forecasting-tensorflowjs
Pull stock prices from online API and perform predictions using Long Short Term Memory (LSTM) with TensorFlow.js framework
Stars: ✭ 96 (+464.71%)
Mutual labels:  rnn
stylenet
A PyTorch implementation of "StyleNet: Generating Attractive Visual Captions with Styles"
Stars: ✭ 58 (+241.18%)
Mutual labels:  image-captioning
Course-Project---Speech-Driven-Facial-Animation
ECE 535 - Course Project, Deep Learning Framework
Stars: ✭ 63 (+270.59%)
Mutual labels:  rnn
altair
Assessing Source Code Semantic Similarity with Unsupervised Learning
Stars: ✭ 42 (+147.06%)
Mutual labels:  rnn
python-machine-learning-book-2nd-edition
Code repository for the book <Machine Learning Textbook with Python, scikit-learn, TensorFlow>
Stars: ✭ 60 (+252.94%)
Mutual labels:  rnn
sgrnn
Tensorflow implementation of Synthetic Gradient for RNN (LSTM)
Stars: ✭ 40 (+135.29%)
Mutual labels:  rnn

Image Captioning by Chainer

A Chainer implementation of Neural Image Caption, which generates captions given images.

This implementation is fast because it uses the cuDNN-based LSTM (NStepLSTM) and a beam search that processes inputs in batches.
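The batching idea behind the beam search can be illustrated with a small sketch. This is not the repository's code: it is plain Python with no Chainer dependency, and `batched_beam_search`, `step_fn`, and `toy_step` are hypothetical names introduced only for this example. The point it shows is that every live hypothesis of every image is scored in a single model call per time step, instead of one call per hypothesis.

```python
import math


def batched_beam_search(step_fn, batch_size, beam_size, max_len, bos, eos):
    """Toy beam search: one step_fn call per time step for the whole batch."""
    # beams[i] holds (log_prob, tokens) pairs for batch item i.
    beams = [[(0.0, [bos])] for _ in range(batch_size)]
    for _ in range(max_len):
        # Flatten all unfinished hypotheses of all batch items into one list.
        flat = [(i, lp, toks)
                for i, hyps in enumerate(beams)
                for lp, toks in hyps if toks[-1] != eos]
        if not flat:
            break
        # Single batched call: one log-prob vector per hypothesis.
        log_probs = step_fn([(i, toks) for i, _, toks in flat])
        # Finished hypotheses carry over unchanged as candidates.
        candidates = [[(lp, toks) for lp, toks in hyps if toks[-1] == eos]
                      for hyps in beams]
        for (i, lp, toks), dist in zip(flat, log_probs):
            for token, token_lp in enumerate(dist):
                candidates[i].append((lp + token_lp, toks + [token]))
        # Keep the beam_size best candidates per batch item.
        beams = [sorted(c, key=lambda h: h[0], reverse=True)[:beam_size]
                 for c in candidates]
    return [hyps[0][1] for hyps in beams]


def toy_step(hypotheses):
    # Hypothetical stand-in for a decoder step.
    # Vocabulary: 0 = <bos>, 1 = <eos>, 2 and 3 = ordinary words.
    outputs = []
    for item, tokens in hypotheses:
        favored = (2 + item) if len(tokens) < 3 else eos_id
        outputs.append([math.log(0.7 if t == favored else 0.1)
                        for t in range(4)])
    return outputs


eos_id = 1
captions = batched_beam_search(toy_step, batch_size=2, beam_size=2,
                               max_len=5, bos=0, eos=eos_id)
print(captions)
```

In a real decoder, `step_fn` would be where a batched LSTM step runs on the GPU, so the cost per time step grows with one large matrix operation rather than with the number of Python-level calls.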

This code uses coco-caption as a submodule, so clone this repository recursively:

git clone --recursive https://github.com/soskek/captioning_chainer.git

Note that coco-caption works only on Python 2.7, so this repository targets Python 2.7 as well.

Train an Image Caption Generator

sh prepare_scripts/prepare_dataset.sh
# flickr8k, flickr30k, mscoco
python -u train.py -g 0 --vocab data/flickr8k/vocab.txt --dataset flickr8k -b 64
python -u train.py -g 0 --vocab data/flickr30k/vocab.txt --dataset flickr30k -b 64
python -u train.py -g 0 --vocab data/coco/vocab.txt --dataset mscoco -b 64

On the mscoco dataset, with a beam size of 20, a trained model reached a BLEU score of 25.9. The paper uses an ensemble and unreported hyperparameters, which may explain the gap between this score and the value reported in the paper.
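For readers unfamiliar with the metric, here is a minimal sketch of how a BLEU score is formed from clipped n-gram precisions and a brevity penalty. It is a simplified sentence-level version written for illustration; the reported score comes from the coco-caption tools, which aggregate statistics over the whole corpus, so numbers from this sketch are not directly comparable. The function names `ngrams` and `bleu` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n])
                   for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Simplified sentence-level BLEU in [0, 1], without smoothing."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        # Clip each candidate n-gram count by its maximum count
        # in any single reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram])
                      for gram, count in cand.items())
        total = sum(cand.values())
        if clipped == 0 or total == 0:
            return 0.0  # no smoothing: any zero precision gives zero
        log_precisions.append(math.log(float(clipped) / total))
    # Brevity penalty uses the reference length closest to the candidate.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    bp = 1.0 if c > r else math.exp(1.0 - float(r) / c)
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "a dog runs on the grass".split()
print(bleu("a dog runs on the grass".split(), [reference]))
print(bleu("a dog runs on the beach".split(), [reference]))
```

Scores such as 25.9 are conventionally this quantity multiplied by 100.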

Use the model

python interactive.py --resume result/best_model.npz --vocab data/flickr8k/vocab.txt

After the script launches, enter the path to an image file.

See Best Result and Plot Curve

python get_best.py --log result/log
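As a rough illustration of what selecting the best result from a training log involves, the sketch below picks the best entry from a Chainer-style log, which `LogReport` writes as a JSON list of per-epoch dictionaries. It does not reproduce `get_best.py`: the helper `best_entry` is hypothetical, the sample values are made up for the demonstration, and the key `validation/main/loss` is an assumption about what the log contains.

```python
import json


def best_entry(entries, key, maximize=True):
    """Return the log entry with the best value for key, or None."""
    # Skip entries that did not record the metric.
    scored = [entry for entry in entries if key in entry]
    if not scored:
        return None
    pick = max if maximize else min
    return pick(scored, key=lambda entry: entry[key])


# Made-up sample in the shape of a Chainer LogReport file.
log_text = json.dumps([
    {"epoch": 1, "main/loss": 3.2, "validation/main/loss": 3.5},
    {"epoch": 2, "main/loss": 2.7, "validation/main/loss": 3.1},
    {"epoch": 3, "main/loss": 2.3, "validation/main/loss": 3.3},
])

entries = json.loads(log_text)
best = best_entry(entries, "validation/main/loss", maximize=False)
print(best["epoch"], best["validation/main/loss"])
```

To apply the same idea to a real run, load the file with `json.load(open("result/log"))` and substitute whichever metric key the log actually records.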

Citation

@article{Vinyals2015ShowAT,
  title={Show and tell: A neural image caption generator},
  author={Oriol Vinyals and Alexander Toshev and Samy Bengio and Dumitru Erhan},
  journal={2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2015},
  pages={3156-3164}
}