
neural-nuts / Image Caption Generator

Licence: bsd-3-clause
[DEPRECATED] A Neural Network based generative model for captioning images using Tensorflow

Projects that are alternatives of or similar to Image Caption Generator

Pytorch Learners Tutorial
PyTorch tutorial for learners
Stars: ✭ 97 (-31.21%)
Mutual labels:  jupyter-notebook, convolutional-neural-networks, lstm, recurrent-neural-networks, lstm-neural-networks
Image Captioning
Image Captioning: Implementing the Neural Image Caption Generator with python
Stars: ✭ 52 (-63.12%)
Mutual labels:  convolutional-neural-networks, lstm, recurrent-neural-networks, lstm-neural-networks, image-captioning
Bitcoin Price Prediction Using Lstm
Bitcoin price Prediction ( Time Series ) using LSTM Recurrent neural network
Stars: ✭ 67 (-52.48%)
Mutual labels:  jupyter-notebook, lstm, recurrent-neural-networks, lstm-neural-networks
Image Caption Generator
A neural network to generate captions for an image using CNN and RNN with BEAM Search.
Stars: ✭ 126 (-10.64%)
Mutual labels:  convolutional-neural-networks, lstm, recurrent-neural-networks, image-captioning
Deep Learning With Python
Deep learning codes and projects using Python
Stars: ✭ 195 (+38.3%)
Mutual labels:  artificial-intelligence, jupyter-notebook, convolutional-neural-networks, recurrent-neural-networks
Automatic Image Captioning
Generating Captions for images using Deep Learning
Stars: ✭ 84 (-40.43%)
Mutual labels:  jupyter-notebook, convolutional-neural-networks, lstm-neural-networks, image-captioning
Lstm anomaly thesis
Anomaly detection for temporal data using LSTMs
Stars: ✭ 178 (+26.24%)
Mutual labels:  jupyter-notebook, lstm, recurrent-neural-networks, lstm-neural-networks
Deep Learning Time Series
List of papers, code and experiments using deep learning for time series forecasting
Stars: ✭ 796 (+464.54%)
Mutual labels:  jupyter-notebook, lstm, recurrent-neural-networks, lstm-neural-networks
Deep Learning With Pytorch Tutorials
Hands-on introductory video course on deep learning with PyTorch: accompanying source code and slides (PPT)
Stars: ✭ 1,986 (+1308.51%)
Mutual labels:  artificial-intelligence, convolutional-neural-networks, recurrent-neural-networks
Gdax Orderbook Ml
Application of machine learning to the Coinbase (GDAX) orderbook
Stars: ✭ 60 (-57.45%)
Mutual labels:  jupyter-notebook, lstm, recurrent-neural-networks
Ai Reading Materials
Some of the ML and DL related reading materials, research papers that I've read
Stars: ✭ 79 (-43.97%)
Mutual labels:  artificial-intelligence, lstm, recurrent-neural-networks
Sentiment Analysis Nltk Ml Lstm
Sentiment Analysis on the First Republic Party debate in 2016 based on Python,NLTK and ML.
Stars: ✭ 61 (-56.74%)
Mutual labels:  jupyter-notebook, lstm, recurrent-neural-networks
Computervision Recipes
Best Practices, code samples, and documentation for Computer Vision.
Stars: ✭ 8,214 (+5725.53%)
Mutual labels:  artificial-intelligence, jupyter-notebook, convolutional-neural-networks
Language Translation
Neural machine translator for English2German translation.
Stars: ✭ 82 (-41.84%)
Mutual labels:  jupyter-notebook, lstm, recurrent-neural-networks
Malware Classification
Towards Building an Intelligent Anti-Malware System: A Deep Learning Approach using Support Vector Machine for Malware Classification
Stars: ✭ 88 (-37.59%)
Mutual labels:  artificial-intelligence, convolutional-neural-networks, recurrent-neural-networks
Deepseqslam
The Official Deep Learning Framework for Route-based Place Recognition
Stars: ✭ 49 (-65.25%)
Mutual labels:  convolutional-neural-networks, lstm, recurrent-neural-networks
Image classifier
CNN image classifier implemented in Keras Notebook 🖼️.
Stars: ✭ 139 (-1.42%)
Mutual labels:  artificial-intelligence, jupyter-notebook, convolutional-neural-networks
Text predictor
Char-level RNN LSTM text generator📄.
Stars: ✭ 99 (-29.79%)
Mutual labels:  artificial-intelligence, lstm, lstm-neural-networks
Ml Ai Experiments
All my experiments with AI and ML
Stars: ✭ 107 (-24.11%)
Mutual labels:  artificial-intelligence, jupyter-notebook, lstm
Lstmvis
Visualization Toolbox for Long Short Term Memory networks (LSTMs)
Stars: ✭ 959 (+580.14%)
Mutual labels:  jupyter-notebook, lstm, recurrent-neural-networks

[Deprecated] Image Caption Generator

Notice: This project uses an older version of TensorFlow and is no longer supported. Please consider using a more recent alternative.

A Neural Network based generative model for captioning images.

Check out the Android app built with this image-captioning model, Cam2Caption, and the associated paper.

Work in Progress

Updates (Jan 14, 2018):
  1. Some code refactoring.
  2. Added MSCOCO dataset support.
Updates (Mar 12, 2017):
  1. Added a Dropout layer for the LSTM and the Xavier Glorot initializer for weights.
  2. Significantly optimized caption generation (the decode routine): computation time reduced from 3 seconds to 0.2 seconds.
  3. Added functionality to freeze graphs and merge them.
  4. Added direct serving routines (dual graph and single graph) in /utils/.
  5. Explored and chose the fastest and most efficient image preprocessing method.
  6. Ported the code to TensorFlow r1.0.
Updates (Feb 27, 2017):
  1. Added the BLEU evaluation metric and batch processing of images to produce batches of captions.
Updates (Feb 25, 2017):
  1. Added optimizations and one-time pre-processing of Flickr30K data.
  2. Changed to a faster image preprocessing method using OpenCV.
To-Do (Open for Contribution):
  1. FIFO queues in training
  2. Attention model
  3. Trained Models for Distribution.
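
The BLEU metric mentioned in the Feb 27, 2017 update can be illustrated with a minimal, self-contained sketch. This is not the project's own evaluation code, which is not shown here. The function names `ngrams` and `bleu` are hypothetical, and the sketch assumes plain, unsmoothed sentence-level BLEU with uniform n-gram weights over pre-tokenized captions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights (sketch)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing: any zero precision gives a score of 0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty uses the reference length closest to the candidate length.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity = 1.0 if c > r else math.exp(1 - r / c)
    return brevity * math.exp(sum(log_precisions) / max_n)


# Hypothetical captions, tokenized by whitespace.
reference = "a dog runs across the green field".split()
print(bleu(reference, [reference]))
```

Because the sketch applies no smoothing, a short caption with no matching 4-gram scores 0; lowering `max_n` or using a smoothed BLEU implementation from a library are the usual ways around that.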

Pre-Requisites:

  1. TensorFlow r1.0
  2. NLTK
  3. pandas
  4. Download Flickr30K OR MSCOCO images and captions.
  5. Download the pre-trained InceptionV4 TensorFlow graph from DeepDetect.
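
To give a rough idea of what the downloaded Flickr30K caption file contains, the sketch below groups captions by image. It assumes the commonly distributed layout of one `IMAGE.jpg#INDEX<TAB>caption` entry per line; that layout is an assumption about the dataset file, not something stated in this README. The function `parse_flickr_tokens` and the sample lines are hypothetical and are not part of the project.

```python
from collections import defaultdict


def parse_flickr_tokens(lines):
    """Group captions by image file name.

    Assumes each line looks like 'IMAGE.jpg#INDEX<TAB>caption text'.
    """
    captions = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        key, _, caption = line.partition("\t")
        image_name = key.split("#", 1)[0]  # drop the '#INDEX' suffix
        captions[image_name].append(caption.strip())
    return dict(captions)


# Made-up sample lines in the assumed format (not real dataset entries).
sample = [
    "example_1.jpg#0\tA dog runs across a field .\n",
    "example_1.jpg#1\tA brown dog is running outside .\n",
    "example_2.jpg#0\tTwo people sit on a bench .\n",
]
by_image = parse_flickr_tokens(sample)
for name, caps in sorted(by_image.items()):
    print(name, len(caps))
```

With a real file, the same function could be called as `parse_flickr_tokens(open(path, encoding="utf-8"))`, where `path` points at the downloaded caption file.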

Procedure to Train and Generate Captions:

  1. Clone the Repository to preserve Directory Structure
  2. For Flickr30K, put results_20130124.token and the Flickr30K images in the flickr30k-images folder; for MSCOCO, put captions_val2014.json and the MSCOCO images in the COCO-images folder.
  3. Put inception_v4.pb in ConvNets folder
  4. Generate the features (features.npy) for the images in the dataset folder by running:
    • For Flickr30K: python convfeatures.py --data_path Dataset/flickr30k-images --inception_path ConvNets/inception_v4.pb
    • For MSCOCO: python convfeatures.py --data_path Dataset/COCO-images --inception_path ConvNets/inception_v4.pb
  5. To train the model, run:
    • For Flickr30K: python main.py --mode train --caption_path ./Dataset/results_20130124.token --feature_path ./Dataset/features.npy --resume
    • For MSCOCO: python main.py --mode train --caption_path ./Dataset/captions_val2014.json --feature_path ./Dataset/features.npy --data_is_coco --resume
  6. To generate captions for an image, run:
    • python main.py --mode test --image_path VALID_PATH
  7. For usage as a Python library, see Demo.ipynb

(see python main.py -h for more)
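
The feature-extraction, training, and test steps above can be chained from a small driver script. The sketch below only assembles the commands listed in the procedure and prints them by default. The helper names `build_commands` and `run_all` are hypothetical, and the script assumes it is started from the repository root with `python` on the PATH.

```python
import subprocess


def build_commands(dataset="flickr30k", image_path="VALID_PATH"):
    """Return the feature, train, and test commands as argv lists."""
    if dataset == "flickr30k":
        data_path = "Dataset/flickr30k-images"
        caption_path = "./Dataset/results_20130124.token"
        extra = []
    elif dataset == "coco":
        data_path = "Dataset/COCO-images"
        caption_path = "./Dataset/captions_val2014.json"
        extra = ["--data_is_coco"]
    else:
        raise ValueError("dataset must be 'flickr30k' or 'coco'")

    features = ["python", "convfeatures.py", "--data_path", data_path,
                "--inception_path", "ConvNets/inception_v4.pb"]
    train = ["python", "main.py", "--mode", "train",
             "--caption_path", caption_path,
             "--feature_path", "./Dataset/features.npy"] + extra + ["--resume"]
    test = ["python", "main.py", "--mode", "test", "--image_path", image_path]
    return [features, train, test]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


# Dry run: shows the MSCOCO commands without executing anything.
run_all(build_commands("coco"))
```

Passing `dry_run=False` would execute the steps in order. Keeping the dry run as the default makes it easy to inspect the exact command lines first.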

Miscellaneous Notes:

Freezing the encoder and decoder Graphs

  1. Both the encoder and decoder graphs must be saved while running test. This one-time run is required before freezing the encoder/decoder.
    • python main.py --mode test --image_path ANY_TEST_IMAGE.jpg/png --saveencoder --savedecoder
  2. In the project root directory, freeze each graph:
    • Encoder: python utils/save_graph.py --mode encoder --model_folder model/Encoder/ (add --read_file if you want the frozen encoder to generate a caption directly from an image file path).
    • Decoder: python utils/save_graph.py --mode decoder --model_folder model/Decoder/ (the --read_file argument is not needed for the decoder).
  3. To use the frozen encoder and decoder models as a dual black box, see Serve-DualProtoBuf.ipynb. Note: you must freeze the encoder graph with --read_file to run this notebook.

(see python utils/save_graph.py -h for more)

Merging the encoder and decoder graphs for serving the model as a blackbox:

  1. The encoder and decoder must first be frozen as described above.
  2. In the project root directory, run:
    • python utils/merge_graphs.py --encpb ./model/Trained_Graphs/encoder_frozen_model.pb --decpb ./model/Trained_Graphs/decoder_frozen_model.pb
    • Additionally, add --read_file if you want to generate a caption directly from an image file path.
  3. To use the merged encoder and decoder models as a single frozen black box, see Serve-SingleProtoBuf.ipynb. Note: you must freeze and merge the encoder graph with --read_file to run this notebook.

(see python utils/merge_graphs.py -h for more)
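
The freezing and merging steps described above run in a fixed order: save both graphs, freeze the encoder, freeze the decoder, then merge. The sketch below lists that sequence as commands. It only assembles the command lines quoted in this README; the function name `freeze_and_merge_commands` and its parameters are hypothetical, and nothing is executed.

```python
def freeze_and_merge_commands(test_image="ANY_TEST_IMAGE.jpg", read_file=False):
    """Return the save, freeze, and merge commands in the order they must run."""
    # --read_file applies to the encoder freeze and the merge, not the decoder.
    read_flag = ["--read_file"] if read_file else []
    return [
        # One-time test run that saves both encoder and decoder graphs.
        ["python", "main.py", "--mode", "test", "--image_path", test_image,
         "--saveencoder", "--savedecoder"],
        # Freeze the encoder graph.
        ["python", "utils/save_graph.py", "--mode", "encoder",
         "--model_folder", "model/Encoder/"] + read_flag,
        # Freeze the decoder graph.
        ["python", "utils/save_graph.py", "--mode", "decoder",
         "--model_folder", "model/Decoder/"],
        # Merge the two frozen graphs into a single one.
        ["python", "utils/merge_graphs.py",
         "--encpb", "./model/Trained_Graphs/encoder_frozen_model.pb",
         "--decpb", "./model/Trained_Graphs/decoder_frozen_model.pb"] + read_flag,
    ]


for step in freeze_and_merge_commands(read_file=True):
    print(" ".join(step))
```

Setting `read_file=True` matches the requirement of both serving notebooks that the encoder be frozen (and merged) with `--read_file`.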

Training Steps vs Loss Graph in Tensorboard:

  1. tensorboard --logdir model/log_dir
  2. Navigate to localhost:6006

Citation:

If you use our model or code in your research, please cite the paper:

@article{Mathur2017,
  title={Camera2Caption: A Real-time Image Caption Generator},
  author={Pranay Mathur and Aman Gill and Aayush Yadav and Anurag Mishra and Nand Kumar Bansode},
  journal={IEEE Conference Publication},
  year={2017}
}

Reference:

Show and Tell: A Neural Image Caption Generator

-Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan

License:

Released under the BSD 3-Clause License.

