
seba-1511 / Lstms.pth

License: apache-2.0
PyTorch implementations of LSTM Variants (Dropout + Layer Norm)

Programming Languages

python

Projects that are alternatives to or similar to Lstms.pth

CS231n
My solutions for Assignments of CS231n: Convolutional Neural Networks for Visual Recognition
Stars: ✭ 30 (-72.97%)
Mutual labels:  lstm, dropout, rnn
Machine Learning
My Attempt(s) In The World Of ML/DL....
Stars: ✭ 78 (-29.73%)
Mutual labels:  lstm, rnn
Hred Attention Tensorflow
An extension of the Hierarchical Recurrent Encoder-Decoder for Generative Context-Aware Query Suggestion; our implementation is in TensorFlow and uses an attention mechanism.
Stars: ✭ 68 (-38.74%)
Mutual labels:  lstm, rnn
See Rnn
RNN and general weights, gradients, & activations visualization in Keras & TensorFlow
Stars: ✭ 102 (-8.11%)
Mutual labels:  lstm, rnn
Time Attention
Implementation of RNN for Time Series prediction from the paper https://arxiv.org/abs/1704.02971
Stars: ✭ 52 (-53.15%)
Mutual labels:  lstm, rnn
Char rnn lm zh
A Chinese language model, implemented following the official PyTorch documentation.
Stars: ✭ 57 (-48.65%)
Mutual labels:  lstm, rnn
Copper price forecast
Copper price (time series) prediction using BPNN and LSTM.
Stars: ✭ 81 (-27.03%)
Mutual labels:  lstm, rnn
Rnn Theano
Some RNN code implemented in Theano, including the most basic RNN, LSTM, and some attention models such as the MLSTM paper.
Stars: ✭ 31 (-72.07%)
Mutual labels:  lstm, rnn
Cnn lstm for text classify
Chinese text classification with CNN, LSTM, NBOW, and fastText.
Stars: ✭ 90 (-18.92%)
Mutual labels:  lstm, rnn
Word Rnn Tensorflow
Multi-layer Recurrent Neural Networks (LSTM, RNN) for word-level language models in Python using TensorFlow.
Stars: ✭ 1,297 (+1068.47%)
Mutual labels:  lstm, rnn
Ml Ai Experiments
All my experiments with AI and ML
Stars: ✭ 107 (-3.6%)
Mutual labels:  lstm, rnn
Deepseqslam
The Official Deep Learning Framework for Route-based Place Recognition
Stars: ✭ 49 (-55.86%)
Mutual labels:  lstm, rnn
Rnn Notebooks
RNN (SimpleRNN, LSTM, GRU) TensorFlow 2.0 & Keras notebooks (workshop materials)
Stars: ✭ 48 (-56.76%)
Mutual labels:  lstm, rnn
Bitcoin Price Prediction Using Lstm
Bitcoin price prediction (time series) using an LSTM recurrent neural network.
Stars: ✭ 67 (-39.64%)
Mutual labels:  lstm, rnn
Neural Networks
All about Neural Networks!
Stars: ✭ 34 (-69.37%)
Mutual labels:  lstm, rnn
Pytorch Sentiment Analysis Classification
A PyTorch Tutorials of Sentiment Analysis Classification (RNN, LSTM, Bi-LSTM, LSTM+Attention, CNN)
Stars: ✭ 80 (-27.93%)
Mutual labels:  lstm, rnn
Pytorch Learners Tutorial
PyTorch tutorial for learners
Stars: ✭ 97 (-12.61%)
Mutual labels:  lstm, rnn
Ailearning
AiLearning: Machine Learning (ML), Deep Learning (DL), and Natural Language Processing (NLP).
Stars: ✭ 32,316 (+29013.51%)
Mutual labels:  lstm, rnn
Lstm peptides
Long short-term memory recurrent neural networks for learning peptide and protein sequences to later design new, similar examples.
Stars: ✭ 30 (-72.97%)
Mutual labels:  lstm, rnn
Lstm chem
Implementation of the paper - Generative Recurrent Networks for De Novo Drug Design.
Stars: ✭ 87 (-21.62%)
Mutual labels:  lstm, rnn

lstms.pth

Implementation of LSTM variants, in PyTorch.

For now, they only support a sequence size of 1 and are meant for RL use cases. Beyond that, they are a stripped-down version of PyTorch's RNN layers: no bidirectional mode, no num_layers, no batch_first.

Base Modules:

  • SlowLSTM: a (mostly useless) pedagogical example.
  • LayerNorm: Layer Normalization as in Ba et al.: Layer Normalization.

Dropout Modules:

  • LSTM: the original.
  • GalLSTM: using dropout as in Gal & Ghahramani: A Theoretically Grounded Application of Dropout in RNNs.
  • MoonLSTM: using dropout as in Moon et al.: RNNDrop: A Novel Dropout for RNNs in ASR.
  • SemeniutaLSTM: using dropout as in Semeniuta et al.: Recurrent Dropout without Memory Loss.

Normalization + Dropout Modules:

  • LayerNormLSTM: Dropout + Layer Normalization.
  • LayerNormGalLSTM: Gal Dropout + Layer Normalization.
  • LayerNormMoonLSTM: Moon Dropout + Layer Normalization.
  • LayerNormSemeniutaLSTM: Semeniuta Dropout + Layer Normalization.

Container Modules:

  • MultiLayerLSTM: a helper class to build multi-layer LSTMs.

Convention: if applicable, the activations are computed first, and then the nodes are dropped (dropout on the output, not the input, just like PyTorch).
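
To make this convention concrete, here is a minimal sketch using a plain torch.nn.LSTMCell from modern PyTorch (not one of this package's classes); how each dropout method feeds the dropped output back into the recurrence differs between variants and is omitted here:

import torch
import torch.nn.functional as F

def step_with_output_dropout(cell, x_t, state, p=0.5, training=True):
    # Compute the cell's activations first...
    h_new, c_new = cell(x_t, state)
    # ...then drop nodes on the output, not the input (as in PyTorch).
    out = F.dropout(h_new, p=p, training=training)
    return out, (h_new, c_new)

cell = torch.nn.LSTMCell(input_size=256, hidden_size=128)
x = torch.rand(4, 256)                              # (batch, input_size)
state = (torch.zeros(4, 128), torch.zeros(4, 128))  # (h_0, c_0)
out, state = step_with_output_dropout(cell, x, state)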

Install

pip install -e .

Usage

You can find a good example of how to use the layers in test/test_speed.py.

All Dropout models share the same signature:

LSTM(self, input_size, hidden_size, bias=True, dropout=0.0, dropout_method='pytorch')
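
For example, a hedged sketch of constructing the four dropout variants (the lstms import path is an assumption based on the package name):

from lstms import LSTM, GalLSTM, MoonLSTM, SemeniutaLSTM  # import path assumed

vanilla = LSTM(input_size=256, hidden_size=128, dropout=0.5)    # PyTorch-style dropout
gal = GalLSTM(input_size=256, hidden_size=128, dropout=0.5)     # Gal & Ghahramani dropout
moon = MoonLSTM(input_size=256, hidden_size=128, dropout=0.5)   # Moon et al. dropout
semeniuta = SemeniutaLSTM(input_size=256, hidden_size=128, dropout=0.5)  # Semeniuta et al. dropout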

All Normalization + Dropout models share the same signature:

LayerNormLSTM(self, input_size, hidden_size, bias=True, dropout=0.0,
              dropout_method='pytorch', ln_preact=True, learnable=True)

And all models use the same out, hidden = model.forward(x, hidden) signature as the official PyTorch LSTM layers. They also all provide a model.sample_mask() method, which needs to be called in order to sample a new dropout mask (e.g., when processing a new sequence).
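
A minimal sketch of this loop (the hidden-state shapes of (1, batch, hidden_size) and the lstms import path are assumptions):

import torch as th
from torch.autograd import Variable
from lstms import GalLSTM  # import path assumed

lstm = GalLSTM(input_size=256, hidden_size=128, dropout=0.5)
hidden = (Variable(th.zeros(1, 1, 128)), Variable(th.zeros(1, 1, 128)))  # (h_0, c_0), shapes assumed
lstm.sample_mask()  # new sequence, so sample a new dropout mask
for t in range(10):
    x = Variable(th.rand(1, 1, 256))
    out, hidden = lstm(x, hidden)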

Note: LayerNorm is not an LSTM layer, and thus uses out = model.forward(x).
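
For instance, a hedged sketch (the constructor argument is an assumption; only the hidden-state-free forward call is documented above):

import torch as th
from torch.autograd import Variable
from lstms import LayerNorm  # import path assumed

ln = LayerNorm(128)                  # feature size; constructor signature assumed
out = ln(Variable(th.rand(1, 128)))  # no hidden state, just out = model.forward(x)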

Containers

This package provides a helper class, MultiLayerLSTM, which can be used to stack multiple LSTMs together.

import torch as th
from torch.autograd import Variable
from lstms import MultiLayerLSTM, LayerNormSemeniutaLSTM  # import path assumed

batch_size = 1
lstm = MultiLayerLSTM(input_size=256, layer_type=LayerNormSemeniutaLSTM,
                      layer_sizes=(64, 64, 16), dropout=0.7, ln_preact=False)
hiddens = lstm.create_hiddens(bsz=batch_size)
x = Variable(th.rand(1, 1, 256))  # (seq_len=1, batch, input_size)
for _ in range(10):
    out, hiddens = lstm(x, hiddens)

Note that hiddens doesn't match the PyTorch specification: it is the list of (h_i, c_i) tuples, one per LSTM layer. PyTorch's LSTM layers instead return a single tuple (h_n, c_n), where h_n and c_n have sizes (num_layers * num_directions, batch, hidden_size).
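
If you do need PyTorch-shaped states, the sketch below shows one hedged way to stack them; note it only works when every layer has the same hidden size, which is not the case for the (64, 64, 16) example above:

import torch as th

# hiddens: list of (h_i, c_i) pairs from MultiLayerLSTM, one per layer,
# each assumed to have size (1, batch, hidden_size_i).
h_n = th.cat([h for h, _ in hiddens], dim=0)  # (num_layers, batch, hidden_size)
c_n = th.cat([c for _, c in hiddens], dim=0)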

Capacity Benchmarks

Warning: This is an artificial memory benchmark, not necessarily representative of each method's capacity.

Note: nn.LSTM and SlowLSTM do not have dropout in these experiments.

Info: dropout = 0.9, SEQ_LEN = 10, dataset size = 100, layer size = 256

model          error
nn.LSTM        3.515
SlowLSTM       4.171
LSTM           4.160
GalLSTM        4.456
MoonLSTM       4.442
SemeniutaLSTM  3.762
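
The numbers above come from the repository's own script; the sketch below is only a guess at this kind of memorization setup (model choice, loss, and training loop are all assumptions), included to make the Info line concrete:

import torch as th
from torch.autograd import Variable
from lstms import LSTM  # import path assumed

DROPOUT, SEQ_LEN, N_SEQS, SIZE = 0.9, 10, 100, 256
model = LSTM(input_size=SIZE, hidden_size=SIZE, dropout=DROPOUT)
opt = th.optim.Adam(model.parameters())
loss_fn = th.nn.MSELoss()

for _ in range(N_SEQS):
    seq = Variable(th.rand(SEQ_LEN, 1, SIZE))  # a random sequence to memorize
    hidden = (Variable(th.zeros(1, 1, SIZE)), Variable(th.zeros(1, 1, SIZE)))
    model.sample_mask()
    loss = 0.0
    for t in range(SEQ_LEN - 1):
        out, hidden = model(seq[t:t + 1], hidden)
        loss = loss + loss_fn(out, seq[t + 1:t + 2])  # predict the next step
    opt.zero_grad()
    loss.backward()
    opt.step()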

Speed Benchmarks

Available by running make speed.

Warning: Inference timings only, and on a single sequence of length 1000 with dropout = 0.5.
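
Roughly what make speed measures, as a hedged sketch (the timing method and shapes are assumptions):

import time
import torch as th
from torch.autograd import Variable
from lstms import GalLSTM  # import path assumed

size = 256
model = GalLSTM(input_size=size, hidden_size=size, dropout=0.5)
x = Variable(th.rand(1, 1, size))
hidden = (Variable(th.zeros(1, 1, size)), Variable(th.zeros(1, 1, size)))
model.sample_mask()

start = time.time()
for _ in range(1000):  # one sequence of length 1000, one step at a time
    out, hidden = model(x, hidden)
print('inference: %.3f s' % (time.time() - start))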

SlowLSTM Benchmark

size nn.LSTM SlowLSTM Speedup
128 0.628 0.666 0.943
256 0.676 0.759 0.890
512 0.709 1.026 0.690
1024 2.364 2.867 0.824
2048 6.161 8.261 0.746

LSTM Benchmark

size nn.LSTM LSTM Speedup
128 0.568 0.387 1.466
256 0.668 0.419 1.594
512 0.803 0.769 1.045
1024 2.966 2.002 1.482
2048 6.291 6.393 0.984

GalLSTM Benchmark

size nn.LSTM GalLSTM Speedup
128 0.557 0.488 1.142
256 0.683 0.446 1.530
512 0.966 0.548 1.763
1024 2.524 2.587 0.975
2048 6.618 6.099 1.085

MoonLSTM Benchmark

size nn.LSTM MoonLSTM Speedup
128 0.667 0.445 1.499
256 0.818 0.535 1.530
512 0.908 0.695 1.306
1024 2.517 2.553 0.986
2048 6.475 6.779 0.955

SemeniutaLSTM Benchmark

size nn.LSTM SemeniutaLSTM Speedup
128 0.692 0.513 1.348
256 0.685 0.697 0.983
512 0.717 0.701 1.022
1024 2.639 2.751 0.959
2048 7.294 6.122 1.191

LayerNormLSTM Benchmark

size nn.LSTM LayerNormLSTM Speedup
128 0.646 1.656 0.390
256 0.583 1.800 0.324
512 0.770 1.989 0.387
1024 2.623 3.844 0.682
2048 6.573 9.592 0.685

LayerNormGalLSTM Benchmark

size nn.LSTM LayerNormGalLSTM Speedup
128 0.566 0.486 1.163
256 0.592 0.350 1.693
512 0.920 0.606 1.517
1024 2.508 2.427 1.034
2048 7.356 10.268 0.716

LayerNormMoonLSTM Benchmark

size nn.LSTM LayerNormMoonLSTM Speedup
128 0.507 0.389 1.305
256 0.685 0.511 1.342
512 0.762 0.685 1.111
1024 2.661 2.261 1.177
2048 8.904 9.710 0.917

LayerNormSemeniutaLSTM Benchmark

size nn.LSTM LayerNormSemeniutaLSTM Speedup
128 0.492 0.388 1.267
256 0.583 0.360 1.616
512 0.760 0.578 1.316
1024 2.586 2.328 1.111
2048 6.970 10.725 0.650