google / E3d_lstm

Licence: apache-2.0
e3d-lstm; Eidetic 3D LSTM: A Model for Video Prediction and Beyond

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to E3d lstm

OCR
Optical character recognition Using Deep Learning
Stars: ✭ 25 (-80.62%)
Mutual labels:  lstm, deeplearning
battery-rul-estimation
Remaining Useful Life (RUL) estimation of Lithium-ion batteries using deep LSTMs
Stars: ✭ 25 (-80.62%)
Mutual labels:  lstm, deeplearning
SentimentAnalysis
Sentiment Analysis: Deep Bi-LSTM+attention model
Stars: ✭ 32 (-75.19%)
Mutual labels:  lstm, deeplearning
Datastories Semeval2017 Task4
Deep-learning model presented in "DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis".
Stars: ✭ 184 (+42.64%)
Mutual labels:  deeplearning, lstm
Spago
Self-contained Machine Learning and Natural Language Processing library in Go
Stars: ✭ 854 (+562.02%)
Mutual labels:  deeplearning, lstm
Forecasting-Solar-Energy
Forecasting Solar Power: Analysis of using an LSTM Neural Network
Stars: ✭ 23 (-82.17%)
Mutual labels:  lstm, deeplearning
deep-improvisation
Easy-to-use deep LSTM neural network for generating songs that sound improvised.
Stars: ✭ 53 (-58.91%)
Mutual labels:  lstm, deeplearning
air writing
Online Hand Writing Recognition using BLSTM
Stars: ✭ 26 (-79.84%)
Mutual labels:  lstm, deeplearning
Ner Lstm Crf
An easy-to-use named entity recognition (NER) toolkit that implements the Bi-LSTM+CRF model in TensorFlow.
Stars: ✭ 337 (+161.24%)
Mutual labels:  deeplearning, lstm
Deeplearning.ai Assignments
Stars: ✭ 268 (+107.75%)
Mutual labels:  deeplearning, lstm
Speech Emotion Recognition
Speaker independent emotion recognition
Stars: ✭ 169 (+31.01%)
Mutual labels:  deeplearning, lstm
Twitter Sentiment Analysis
Sentiment analysis on tweets using Naive Bayes, SVM, CNN, LSTM, etc.
Stars: ✭ 978 (+658.14%)
Mutual labels:  deeplearning, lstm
Learning-Lab-C-Library
This library provides a set of basic functions for different types of deep learning (and other) algorithms in C. This deep learning library will be constantly updated.
Stars: ✭ 20 (-84.5%)
Mutual labels:  lstm, deeplearning
Similarity-Adaptive-Deep-Hashing
Unsupervised Deep Hashing with Similarity-Adaptive and Discrete Optimization (TPAMI2018)
Stars: ✭ 18 (-86.05%)
Mutual labels:  deeplearning, unsupervised-learning
Ailearning
AiLearning: Machine Learning (ML), Deep Learning (DL), Natural Language Processing (NLP)
Stars: ✭ 32,316 (+24951.16%)
Mutual labels:  deeplearning, lstm
Keras basic
Basic deep learning study using Keras
Stars: ✭ 39 (-69.77%)
Mutual labels:  deeplearning, lstm
Deeplearning Notes
Notes for Deep Learning Specialization Courses led by Andrew Ng.
Stars: ✭ 126 (-2.33%)
Mutual labels:  deeplearning
Deepco3
[CVPR19] DeepCO3: Deep Instance Co-segmentation by Co-peak Search and Co-saliency (Oral paper)
Stars: ✭ 127 (-1.55%)
Mutual labels:  unsupervised-learning
Echo
Python package containing all custom layers used in Neural Networks (Compatible with PyTorch, TensorFlow and MegEngine)
Stars: ✭ 126 (-2.33%)
Mutual labels:  deeplearning
3dpose gan
The authors' implementation of Unsupervised Adversarial Learning of 3D Human Pose from 2D Joint Locations
Stars: ✭ 124 (-3.88%)
Mutual labels:  unsupervised-learning

E3D-LSTM

This repository contains a TensorFlow implementation of the following paper: Eidetic 3D LSTM: A Model for Video Prediction and Beyond.

Please note that this is not an officially supported Google product. This codebase was reproduced after the first author left Google in accordance with company policy.

If you find this code useful in your research, please cite:

@inproceedings{wang2019eidetic,
  title={Eidetic 3D LSTM: A Model for Video Prediction and Beyond},
  author={Wang, Yunbo and Jiang, Lu and Yang, Ming-Hsuan and Li, Li-Jia and Long, Mingsheng and Fei-Fei, Li},
  booktitle={ICLR},
  year={2019}
}

We present a new model, Eidetic 3D LSTM (E3D-LSTM), that integrates 3D convolutions into RNNs. The encapsulated 3D-Conv makes local perceptrons of RNNs motion-aware and enables the memory cell to store better short-term features. We evaluate the E3D-LSTM network on (a) future video prediction (for unsupervised video representation learning) and (b) early activity recognition, which infers what is happening or what will happen after observing only a limited number of video frames.
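To make "motion-aware" concrete, here is a minimal NumPy sketch of a 3D convolution over (time, height, width). It is not the repository's TensorFlow implementation; the helper name `conv3d_valid`, the toy clip, and the temporal-difference kernel are all assumptions chosen for illustration. The point is only that a kernel spanning more than one frame responds to change between frames, which a purely spatial 2D kernel cannot do.

```python
import numpy as np

def conv3d_valid(volume, kernel):
    """Naive 'valid' 3D cross-correlation over (time, height, width)."""
    t, h, w = volume.shape
    kt, kh, kw = kernel.shape
    out = np.zeros((t - kt + 1, h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                window = volume[i:i + kt, j:j + kh, k:k + kw]
                out[i, j, k] = np.sum(window * kernel)
    return out

# Toy clip (hypothetical): one bright pixel moving one step right per frame.
clip = np.zeros((3, 4, 4))
for frame in range(3):
    clip[frame, 1, frame] = 1.0

# Temporal-difference kernel: next frame minus current frame at one location.
kernel = np.zeros((2, 1, 1))
kernel[0, 0, 0] = -1.0
kernel[1, 0, 0] = 1.0

response = conv3d_valid(clip, kernel)                        # non-zero where the pixel moved
static_response = conv3d_valid(np.ones((3, 4, 4)), kernel)   # all zeros: nothing moves
print(response[0])
```

The moving pixel yields a negative response where it left and a positive response where it arrived, while a static clip yields zeros everywhere.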

Method

Setup

All code was developed and tested on an Nvidia V100 GPU in the following environment.

  • Python 2.7
  • opencv3
  • scikit-image
  • numpy
  • tensorflow>=1.0
  • cuda>=8.0
  • cudnn>=5.0

Please download the data via the following external links.

  • Moving MNIST is a dataset with two moving digits bouncing in a 64 by 64 area.
  • KTH Actions is a human action dataset. This dataset contains frames from the original videos; the reasonable, predictable frames were selected and resized.
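The layout of the downloaded files is not described here, so a reasonable first step is to list what an archive contains before training. The sketch below assumes the data arrives as a NumPy .npz archive (as the Moving MNIST paths in the Quick Start suggest). The helper name `describe_npz`, the key name "frames", and the array shape are made up for illustration; a stand-in file is written so the snippet runs without the real download.

```python
import os
import tempfile

import numpy as np

def describe_npz(path):
    """Return {array_name: (shape, dtype)} for every array stored in an .npz file."""
    with np.load(path) as archive:
        return {
            name: (archive[name].shape, str(archive[name].dtype))
            for name in archive.files
        }

# Stand-in archive; the real dataset's keys and shapes may differ.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.npz")
np.savez(demo_path, frames=np.zeros((20, 1, 64, 64), dtype=np.float32))

summary = describe_npz(demo_path)
print(summary)
```

Pointing `describe_npz` at the real training file should show the actual array names and shapes to expect.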

Quick Start

To train our model on the Moving MNIST dataset, run:

python -u run.py \
    --is_training True \
    --dataset_name mnist \
    --train_data_paths ~/data/moving-mnist-example/moving-mnist-train.npz \
    --valid_data_paths ~/data/moving-mnist-example/moving-mnist-valid.npz \
    --pretrained_model pretrain_model/moving_mnist_e3d_lstm/model.ckpt-80000 \
    --save_dir checkpoints/_mnist_e3d_lstm \
    --gen_frm_dir results/_mnist_e3d_lstm \
    --model_name e3d_lstm \
    --allow_gpu_growth True \
    --img_channel 1 \
    --img_width 64 \
    --input_length 10 \
    --total_length 20 \
    --filter_size 5 \
    --num_hidden 64,64,64,64 \
    --patch_size 4 \
    --layer_norm True \
    --sampling_stop_iter 50000 \
    --sampling_start_value 1.0 \
    --sampling_delta_per_iter 0.00002 \
    --lr 0.001 \
    --batch_size 4 \
    --max_iterations 1 \
    --display_interval 1 \
    --test_interval 1 \
    --snapshot_interval 10000

A full list of commands can be found in the script folder. The training script has a number of command-line flags that you can use to configure the model architecture, hyperparameters, and input/output settings. The model-specific parameters are:

  • --model_name: The model name. Default value is e3d_lstm.
  • --pretrained_model: Location of our pretrained models. See below for the download instructions.
  • --num_hidden: Comma-separated number of hidden units for each E3D-LSTM layer.
  • --filter_size: Filter size of a single E3D-LSTM layer.
  • --layer_norm: Whether to apply tensor layer norm.
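As an illustration of the comma-separated format, the sketch below turns a --num_hidden value into one integer per layer. The helper name `parse_num_hidden` is hypothetical and is not taken from the repository.

```python
def parse_num_hidden(flag_value):
    """Turn a comma-separated flag such as '64,64,64,64' into a list of ints."""
    return [int(units) for units in flag_value.split(",") if units.strip()]

layer_sizes = parse_num_hidden("64,64,64,64")
num_layers = len(layer_sizes)
print(layer_sizes, num_layers)
```

With the value from the training command above, this gives four layers of 64 hidden units each.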

scheduled_sampling, sampling_stop_iter, sampling_start_value and sampling_changing_rate are hyperparameters used for scheduled sampling in training. The standard parameters for training and testing are:

  • --is_training: Whether to run training (True) or testing (False).
  • --train_data_paths, --valid_data_paths: Paths to the training and validation datasets.
  • --gen_frm_dir: Directory to store the prediction results.
  • --allow_gpu_growth: Whether to allow GPU memory growth.
  • --input_length 10: Input sequence length.
  • --total_length 20: Total sequence length (input plus output).
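The relationship between the two length flags can be sketched as a simple slice: the first input_length frames are observed and the remaining total_length - input_length frames are the ones to predict. The array layout (time, height, width, channels) and the variable names are assumptions for illustration, not the repository's data pipeline.

```python
import numpy as np

input_length = 10   # frames the model observes
total_length = 20   # observed frames plus frames to predict

# Dummy clip; the (time, height, width, channels) layout is assumed.
clip = np.zeros((total_length, 64, 64, 1), dtype=np.float32)

observed = clip[:input_length]             # frames 0..9
target = clip[input_length:total_length]   # frames 10..19

print(observed.shape, target.shape)
```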

To test a model, set --is_training False.
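For intuition about the scheduled-sampling values in the training command (start value 1.0, step 0.00002, stop iteration 50000), the sketch below assumes a linear decay of the probability of feeding a ground-truth frame instead of the model's own prediction. The function names, the linear schedule, and the mask shape are assumptions and are not copied from the repository. Under that assumption the probability reaches zero exactly at the stop iteration, since 1.0 - 50000 × 0.00002 = 0.

```python
import numpy as np

def sampling_probability(itr, start_value=1.0, delta_per_iter=0.00002,
                         stop_iter=50000):
    """Probability of using a ground-truth frame at iteration `itr` (assumed linear decay)."""
    if itr >= stop_iter:
        return 0.0
    return max(start_value - delta_per_iter * itr, 0.0)

def sample_mask(prob, batch_size, num_steps, rng):
    """Boolean mask: True means feed the ground-truth frame at that step."""
    return rng.random_sample((batch_size, num_steps)) < prob

rng = np.random.RandomState(0)
mask_start = sample_mask(sampling_probability(0), 4, 10, rng)      # all True
mask_end = sample_mask(sampling_probability(50000), 4, 10, rng)    # all False

for itr in (0, 25000, 50000):
    print(itr, sampling_probability(itr))
```

Early in training the model is guided almost entirely by real frames; once the stop iteration is reached it must rely on its own predictions.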

Pretrained Models

First download our pretrained models. You can then test them on the dataset:

We noticed a bug in the current code related to "global_memory", which may be the cause of the mismatch with the pretrained models on the KTH dataset. Because this code repository was reproduced after the first author left Google, this issue did not exist in our original experiments, and the results reported in the paper remain valid. We are working on fixing this issue and refreshing our pretrained KTH models. We apologize for the inconvenience and thank you for your patience.