nikhilmaram / Show_and_Tell

Licence: other
Show and Tell: A Neural Image Caption Generator

Introduction

This neural image captioning system is loosely based on the paper "Show and Tell: A Neural Image Caption Generator" by Vinyals et al. (CVPR 2015). The input is an image, and the output is a sentence describing the content of the image. A convolutional neural network (CNN) extracts visual features from the image, and an LSTM recurrent neural network (RNN) decodes these features into a sentence. The project is implemented with the TensorFlow library and supports end-to-end training of both the CNN and RNN parts.
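The decoding step described above can be illustrated with a small sketch. It is not this project's code: it uses greedy decoding (taking the highest-scoring word at each step) as a simplification, and `greedy_decode`, `StepFn`, and `toy_step` are hypothetical names introduced only for this example. The toy step function stands in for the LSTM, and the initial state stands in for the CNN features.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: given the previous word id and the recurrent state,
# return one score per vocabulary word plus the updated state.
StepFn = Callable[[int, object], Tuple[Sequence[float], object]]


def greedy_decode(initial_state: object, step_fn: StepFn,
                  start_id: int, end_id: int, max_len: int = 20) -> List[int]:
    """Pick the highest-scoring word at each step until end_id or max_len."""
    word, state, caption = start_id, initial_state, []
    for _ in range(max_len):
        scores, state = step_fn(word, state)
        # Index of the largest score is the chosen word id.
        word = max(range(len(scores)), key=scores.__getitem__)
        if word == end_id:
            break
        caption.append(word)
    return caption


def toy_step(prev_word: int, state: int):
    """Toy stand-in for the LSTM: the state is just a step counter."""
    script = [2, 3, 4, 1]  # word ids to emit, ending with the end token
    target = script[min(state, len(script) - 1)]
    scores = [1.0 if i == target else 0.0 for i in range(5)]
    return scores, state + 1


vocab = ["<start>", "<end>", "a", "dog", "runs"]
ids = greedy_decode(initial_state=0, step_fn=toy_step, start_id=0, end_id=1)
print(" ".join(vocab[i] for i in ids))  # prints "a dog runs"
```

In a real model the initial state would be derived from the image features and `step_fn` would run one LSTM step followed by a projection onto the vocabulary.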

Prerequisites

Usage

  • Preparation: Download the COCO train2014 and val2014 data here. Put the COCO train2014 images in the folder train/images, and put the file captions_train2014.json in the folder train. Similarly, put the COCO val2014 images in the folder val/images, and put the file captions_val2014.json in the folder val. Furthermore, download the pretrained VGG16 net here if you want to use it to initialize the CNN part.
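Before training, it may help to confirm that the files are where the preparation step expects them. The following is a minimal sketch, not part of the project; `missing_paths` and `EXPECTED` are hypothetical names, and the paths are taken from the step above, relative to the project root.

```python
from pathlib import Path
from typing import List

# Paths named in the preparation step, relative to the project root.
EXPECTED = [
    "train/images",
    "train/captions_train2014.json",
    "val/images",
    "val/captions_val2014.json",
]


def missing_paths(root: str = ".") -> List[str]:
    """Return the expected paths that do not exist under root."""
    base = Path(root)
    return [p for p in EXPECTED if not (base / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

The optional VGG16 weights file is left out of the check because it is only needed when initializing the CNN from pretrained weights.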

  • Training: To train a model using the COCO train2014 data, first set up the parameters in the file config.py and then run a command like this:

python3 main.py --phase=train \
    --load_cnn \
    --cnn_model_file='./vgg16_weights.npz' \
    [--train_cnn]

Turn on --train_cnn if you want to jointly train the CNN and RNN parts. Otherwise, only the RNN part is trained. The checkpoints will be saved in the folder models. If you want to resume the training from a checkpoint, run a command like this:

python3 main.py --phase=train \
    --load \
    --model_file='./models/xxxxxx.npy' \
    [--train_cnn]

To monitor the progress of training, run the following command:

tensorboard --logdir='./summary/'

  • Evaluation: To evaluate a trained model using the COCO val2014 data, run a command like this:

python3 main.py --phase=eval \
    --model_file='./models/xxxxxx.npy'

The evaluation scores are printed to stdout, and the generated captions are saved in the file val/results.json.

  • Inference: You can use the trained model to generate captions for any JPEG images. Put the images in the folder test/images, and run a command like this:

python3 main.py --phase=test \
    --model_file='./models/xxxxxx.npy'

The generated captions will be saved in the folder test/results.

Results

A pretrained model with the default configuration can be downloaded here. This model was trained solely on the COCO train2014 data. It achieves the following BLEU scores on the COCO val2014 data:

  • BLEU-1 = 62.9%
  • BLEU-2 = 43.6%
  • BLEU-3 = 29.0%
  • BLEU-4 = 19.3%
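For readers unfamiliar with the metric, BLEU-n combines clipped n-gram precisions for orders 1 through n with a brevity penalty. The sketch below is a simplified sentence-level version written for illustration; it is not the evaluation code used to produce the scores above, which are computed over the whole validation set. The function names and the example sentences are assumptions made for this example.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate: List[str], references: List[List[str]], n: int) -> float:
    """Fraction of candidate n-grams found in a reference, with counts clipped."""
    cand = ngrams(candidate, n)
    if not cand:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return matched / sum(cand.values())


def bleu(candidate: List[str], references: List[List[str]], max_n: int) -> float:
    """Geometric mean of precisions for orders 1..max_n times a brevity penalty."""
    precisions = [clipped_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Reference length closest to the candidate length (shorter one on ties).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c >= r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


candidate = "a dog runs on the grass".split()
references = ["a dog is running on the grass".split()]
for n in range(1, 5):
    print(f"BLEU-{n} = {bleu(candidate, references, n):.3f}")
```

Because each higher order requires longer exact matches, the score can only stay level or fall as n grows, which is the pattern visible in the reported BLEU-1 through BLEU-4 values.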

Here are some captions generated by this model: examples

References
