
Pendulibrium / ai-visual-storytelling-seq2seq

Licence: other
Implementation of a seq2seq model for the Visual Storytelling Challenge (VIST): http://visionandlanguage.net/VIST/index.html

Programming Languages

python
perl
HTML

Projects that are alternatives of or similar to ai-visual-storytelling-seq2seq

Deep News Summarization
News summarization using sequence to sequence model with attention in TensorFlow.
Stars: ✭ 167 (+234%)
Mutual labels:  recurrent-neural-networks, seq2seq, encoder-decoder
Komputation
Komputation is a neural network framework for the Java Virtual Machine written in Kotlin and CUDA C.
Stars: ✭ 295 (+490%)
Mutual labels:  recurrent-neural-networks, seq2seq
Speech recognition with tensorflow
Implementation of a seq2seq model for Speech Recognition using the latest version of TensorFlow. Architecture similar to Listen, Attend and Spell.
Stars: ✭ 253 (+406%)
Mutual labels:  seq2seq, encoder-decoder
dts
A Keras library for multi-step time-series forecasting.
Stars: ✭ 130 (+160%)
Mutual labels:  recurrent-neural-networks, seq2seq
Screenshot To Code
A neural network that transforms a design mock-up into a static website.
Stars: ✭ 13,561 (+27022%)
Mutual labels:  seq2seq, encoder-decoder
Text summarization with tensorflow
Implementation of a seq2seq model for summarization of textual data. Demonstrated on amazon reviews, github issues and news articles.
Stars: ✭ 226 (+352%)
Mutual labels:  seq2seq, encoder-decoder
Text Classification Models Pytorch
Implementation of State-of-the-art Text Classification Models in Pytorch
Stars: ✭ 379 (+658%)
Mutual labels:  recurrent-neural-networks, seq2seq
Text summurization abstractive methods
Multiple implementations for abstractive text summarization, using Google Colab
Stars: ✭ 359 (+618%)
Mutual labels:  seq2seq, encoder-decoder
Seq2Seq-chatbot
TensorFlow Implementation of Twitter Chatbot
Stars: ✭ 18 (-64%)
Mutual labels:  recurrent-neural-networks, seq2seq
Transformer Temporal Tagger
Code and data from the paper BERT Got a Date: Introducing Transformers to Temporal Tagging
Stars: ✭ 55 (+10%)
Mutual labels:  seq2seq, encoder-decoder
NeuralCodeTranslator
Neural Code Translator provides instructions, datasets, and a deep learning infrastructure (based on seq2seq) that aims at learning code transformations
Stars: ✭ 32 (-36%)
Mutual labels:  seq2seq, encoder-decoder
Sockeye
Sequence-to-sequence framework with a focus on Neural Machine Translation based on Apache MXNet
Stars: ✭ 990 (+1880%)
Mutual labels:  seq2seq, encoder-decoder
Text Summarization Tensorflow
Tensorflow seq2seq Implementation of Text Summarization.
Stars: ✭ 527 (+954%)
Mutual labels:  seq2seq, encoder-decoder
Pytorch Seq2seq
Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.
Stars: ✭ 3,418 (+6736%)
Mutual labels:  seq2seq, encoder-decoder
Tf Seq2seq
Sequence to sequence learning using TensorFlow.
Stars: ✭ 387 (+674%)
Mutual labels:  seq2seq, encoder-decoder
Conversational-AI-Chatbot-using-Practical-Seq2Seq
A simple open-domain generative chatbot based on recurrent neural networks
Stars: ✭ 17 (-66%)
Mutual labels:  recurrent-neural-networks, seq2seq
Encoder decoder
Four styles of encoder-decoder models in Python, Theano, Keras and Seq2Seq
Stars: ✭ 269 (+438%)
Mutual labels:  seq2seq, encoder-decoder
Mead Baseline
Deep-Learning Model Exploration and Development for NLP
Stars: ✭ 238 (+376%)
Mutual labels:  recurrent-neural-networks, seq2seq
Deep-Learning-Tensorflow
Gathers Tensorflow deep learning models.
Stars: ✭ 50 (+0%)
Mutual labels:  recurrent-neural-networks, seq2seq
Embedding
Embedding model code and a summary of study notes
Stars: ✭ 25 (-50%)
Mutual labels:  seq2seq, encoder-decoder

ai-visual-storytelling-seq2seq

Implementation of our original solution described in the paper Stories for Images-in-Sequence by using Visual and Narrative Components. Our project is inspired by the solution in Visual Storytelling. The model generates stories sentence by sentence, with respect to the sequence of images and the previously generated sentence. The architecture of our solution consists of an image-sequence encoder that models the sequential behaviour of the images, a previous-sentence encoder and a current-sentence decoder. The previous-sentence encoder encodes the sentence that was associated with the previous image, and the current-sentence decoder is responsible for generating a sentence for the current image of the sequence. We also introduce a novel way of grouping the images of the sequence during the training process, in order to capture the effect of the previous images in the sequence. Our goal with this approach was to create a model that generates stories containing more narrative and evaluative language, in which every generated sentence is affected not only by the sequence of images but also by what has been previously generated in the story.
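As an illustration of the image-grouping idea, the snippet below shows one plausible reading (an assumption on our part, not necessarily the exact scheme from the paper or the repository): when the model is trained to generate sentence i, the image encoder is fed the images of the sequence up to and including image i.

# Illustrative sketch only; see the paper for the exact grouping scheme.
story_images = ['img_0.jpg', 'img_1.jpg', 'img_2.jpg', 'img_3.jpg', 'img_4.jpg']
image_groups = [story_images[:i + 1] for i in range(len(story_images))]
# image_groups[2] -> ['img_0.jpg', 'img_1.jpg', 'img_2.jpg']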

Installing

The project is built using Python 2.7.14, TensorFlow 1.6.0 and Keras 2.1.6. Install these dependencies to get a development environment running:

sudo easy_install --upgrade pip
sudo easy_install --upgrade six
sudo pip install tensorflow
sudo pip install keras
pip install opencv-python
pip install h5py
pip install unidecode
python -mpip install matplotlib
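Because the commands above install the latest releases, you may instead want to pin the versions the project was built with (assuming these exact pins are still installable on your platform):

sudo pip install tensorflow==1.6.0 keras==2.1.6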

Data

Download the Visual Storytelling Dataset (VIST) from http://visionandlanguage.net/VIST/dataset.html and save it in the dataset/vist_dataset directory. Also download the pre-trained weights for AlexNet and put them in the dataset/models/alexnet directory.
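If these directories do not exist yet, they can be created with:

mkdir -p dataset/vist_dataset dataset/models/alexnet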

Data pre-processing

First we need to extract the image features from all the images and save them to a file. This is done with

python dataset/models/alexnet/myalexnet_forward_newtf.py

The script creates the file dataset/models/alexnet/alexnet_image_train_features.hdf5, which contains all the image features. Next we need to associate every image feature vector with its corresponding vectorized sentence. We vectorize the sentences using the functions in sis_datareader. The function sentences_to_index aligns every image feature with its corresponding sentence. If all the file paths are set properly, all of the above can be done by running the command

python data_reader/sis_datareader.py
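To sanity-check the extraction step, the generated HDF5 file can be inspected with h5py; the dataset key names inside the file are not documented here, so this sketch just lists whatever it finds:

import h5py

# Open the features file produced by myalexnet_forward_newtf.py
with h5py.File('dataset/models/alexnet/alexnet_image_train_features.hdf5', 'r') as f:
    keys = list(f.keys())
    print(keys)                 # names of the stored datasets
    print(f[keys[0]].shape)     # e.g. (num_images, feature_dim)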

Options and differences from the paper

In addition to our proposed solution, the project can be used to train a plain encoder-decoder and an encoder-decoder with the Luong attention mechanism.
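For reference, in Luong-style (general) attention the decoder state is scored against every encoder output with a learned matrix W, and the scores are softmax-normalised into a context vector. A minimal NumPy sketch (illustrative only, not the project's Keras implementation):

import numpy as np

def luong_general_attention(decoder_state, encoder_outputs, W):
    # decoder_state: (hidden,), encoder_outputs: (timesteps, hidden), W: (hidden, hidden)
    scores = encoder_outputs.dot(W).dot(decoder_state)   # one score per encoder time step
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                             # softmax over time steps
    context = weights.dot(encoder_outputs)               # weighted sum of encoder outputs
    return context, weights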

Our proposed solution

The architecture of the proposed model. The images highlighted in red are the ones that are encoded; together with the previous sentence, they influence the sentence generated at the current time step.

Training the model

Training the model and adjusting the parameters is done in training_model.py. If the attention mechanism is used, make sure that image_encoder_latent_dim = sentence_encoder_latent_dim.

python training_model.py
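One way to see why the two latent dimensions must match is the following minimal Keras sketch; the dimensions, layer choices and state combination here are assumptions of ours, not the repository's actual training_model.py:

from keras.layers import Input, LSTM, Embedding, Dense, add
from keras.models import Model

image_feature_dim = 4096              # assumed AlexNet feature size
vocab_size = 10000                    # assumed vocabulary size
embedding_dim = 300                   # assumed word-embedding size
image_encoder_latent_dim = 512
sentence_encoder_latent_dim = 512     # must equal image_encoder_latent_dim below

# Image-sequence encoder
image_inputs = Input(shape=(None, image_feature_dim))
_, img_h, img_c = LSTM(image_encoder_latent_dim, return_state=True)(image_inputs)

# Previous-sentence encoder
prev_inputs = Input(shape=(None,))
prev_emb = Embedding(vocab_size, embedding_dim)(prev_inputs)
_, sent_h, sent_c = LSTM(sentence_encoder_latent_dim, return_state=True)(prev_emb)

# Combining the encoder states element-wise only works if the dimensions match.
init_h = add([img_h, sent_h])
init_c = add([img_c, sent_c])

# Current-sentence decoder, initialised from the combined encoder states
decoder_inputs = Input(shape=(None,))
dec_emb = Embedding(vocab_size, embedding_dim)(decoder_inputs)
dec_out = LSTM(sentence_encoder_latent_dim, return_sequences=True)(dec_emb, initial_state=[init_h, init_c])
outputs = Dense(vocab_size, activation='softmax')(dec_out)

model = Model([image_inputs, prev_inputs, decoder_inputs], outputs)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')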

Generating stories

To generate stories, set model_name in inference_model.py to the model you want to generate from and run

python inference_model.py
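For example (the checkpoint name below is hypothetical; point it at a model produced by training_model.py):

model_name = 'seq2seq_story_model.h5'  # hypothetical file name, adjust to your trained model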

Some Results

In the following images we can see the generated stories from the aforementioned models. The first row represents the original story, the second row is the generated story from the model with loss=0.82, the third row is the story from the model with loss=1.01 and the fourth row is the story from the model with loss=1.72.
