
fluency03 / sequence-rnn-py

Licence: other
Sequence analysis using Recurrent Neural Networks (RNN) based on Keras

Programming Languages

python

Projects that are alternatives of or similar to sequence-rnn-py

Rnn ctc
Recurrent Neural Network and Long Short Term Memory (LSTM) with Connectionist Temporal Classification implemented in Theano. Includes a Toy training example.
Stars: ✭ 220 (+685.71%)
Mutual labels:  theano, recurrent-neural-networks, lstm, rnn
Pytorch Pos Tagging
A tutorial on how to implement models for part-of-speech tagging using PyTorch and TorchText.
Stars: ✭ 96 (+242.86%)
Mutual labels:  recurrent-neural-networks, lstm, rnn
Bitcoin Price Prediction Using Lstm
Bitcoin price Prediction ( Time Series ) using LSTM Recurrent neural network
Stars: ✭ 67 (+139.29%)
Mutual labels:  recurrent-neural-networks, lstm, rnn
Pytorch Kaldi
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
Stars: ✭ 2,097 (+7389.29%)
Mutual labels:  recurrent-neural-networks, lstm, rnn
Lstm Human Activity Recognition
Human Activity Recognition example using TensorFlow on smartphone sensors dataset and an LSTM RNN. Classifying the type of movement amongst six activity categories - Guillaume Chevalier
Stars: ✭ 2,943 (+10410.71%)
Mutual labels:  recurrent-neural-networks, lstm, rnn
Rnnsharp
RNNSharp is a toolkit of deep recurrent neural networks widely used for many different kinds of tasks, such as sequence labeling and sequence-to-sequence learning. It is written in C# and based on .NET Framework 4.6 or above. RNNSharp supports many different types of networks, such as forward and bi-directional networks and sequence-to-sequence networks, and different types of layers, such as LSTM, Softmax, sampled Softmax, and others.
Stars: ✭ 277 (+889.29%)
Mutual labels:  recurrent-neural-networks, lstm, rnn
Linear Attention Recurrent Neural Network
A recurrent attention module consisting of an LSTM cell which can query its own past cell states by the means of windowed multi-head attention. The formulas are derived from the BN-LSTM and the Transformer Network. The LARNN cell with attention can be easily used inside a loop on the cell state, just like any other RNN. (LARNN)
Stars: ✭ 119 (+325%)
Mutual labels:  recurrent-neural-networks, lstm, rnn
Pytorch Learners Tutorial
PyTorch tutorial for learners
Stars: ✭ 97 (+246.43%)
Mutual labels:  recurrent-neural-networks, lstm, rnn
theano-recurrence
Recurrent Neural Networks (RNN, GRU, LSTM) and their Bidirectional versions (BiRNN, BiGRU, BiLSTM) for word & character level language modelling in Theano
Stars: ✭ 40 (+42.86%)
Mutual labels:  theano, lstm, rnn
Theano Kaldi Rnn
THEANO-KALDI-RNNs is a project implementing various Recurrent Neural Networks (RNNs) for RNN-HMM speech recognition. The Theano Code is coupled with the Kaldi decoder.
Stars: ✭ 31 (+10.71%)
Mutual labels:  theano, recurrent-neural-networks, rnn
Rnn Theano
Some RNN code implemented with Theano, including the most basic RNN, LSTM, and some attention models such as MLSTM from the paper.
Stars: ✭ 31 (+10.71%)
Mutual labels:  theano, lstm, rnn
sgrnn
Tensorflow implementation of Synthetic Gradient for RNN (LSTM)
Stars: ✭ 40 (+42.86%)
Mutual labels:  recurrent-neural-networks, lstm, rnn
tiny-rnn
Lightweight C++11 library for building deep recurrent neural networks
Stars: ✭ 41 (+46.43%)
Mutual labels:  recurrent-neural-networks, lstm, rnn
Deepseqslam
The Official Deep Learning Framework for Route-based Place Recognition
Stars: ✭ 49 (+75%)
Mutual labels:  recurrent-neural-networks, lstm, rnn
automatic-personality-prediction
[AAAI 2020] Modeling Personality with Attentive Networks and Contextual Embeddings
Stars: ✭ 43 (+53.57%)
Mutual labels:  recurrent-neural-networks, lstm, rnn
Pytorch Sentiment Analysis
Tutorials on getting started with PyTorch and TorchText for sentiment analysis.
Stars: ✭ 3,209 (+11360.71%)
Mutual labels:  recurrent-neural-networks, lstm, rnn
SpeakerDiarization RNN CNN LSTM
Speaker diarization is the problem of separating speakers in an audio recording. There can be any number of speakers, and the final result should state when each speaker starts and ends. In this project, we analyze a given audio file with 2 channels and 2 speakers (on separate channels).
Stars: ✭ 56 (+100%)
Mutual labels:  recurrent-neural-networks, lstm, rnn
Deepjazz
Deep learning driven jazz generation using Keras & Theano!
Stars: ✭ 2,766 (+9778.57%)
Mutual labels:  theano, lstm, rnn
Human-Activity-Recognition
Human activity recognition using TensorFlow on smartphone sensors dataset and an LSTM RNN. Classifying the type of movement amongst six categories (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING).
Stars: ✭ 16 (-42.86%)
Mutual labels:  recurrent-neural-networks, rnn
cudnn rnn theano benchmarks
No description or website provided.
Stars: ✭ 22 (-21.43%)
Mutual labels:  theano, rnn

sequence-rnn-py

Build Status

This program analyzes sequences using (uni-directional and bi-directional) Recurrent Neural Networks (RNN) with Long Short-Term Memory (LSTM), based on the Python library Keras (documentation and GitHub). It is based on the lstm_text_generation.py and imdb_bidirectional_lstm.py examples of Keras.

This is part of my master's thesis project and is still in development.

Requirements

  • Python 2.7

  • NumPy: The fundamental package needed for scientific computing with Python.

  • SciPy: Python-based ecosystem of open-source software for mathematics, science, and engineering.

  • Theano: A Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently.

  • Tensorflow: An open source software library for numerical computation using data flow graphs.

  • Keras>=1.0: A minimalist, highly modular neural networks library, written in Python and capable of running on top of either TensorFlow or Theano. Update Keras with:

    pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps

  • GPU support (optional but highly recommended). Instructions for enabling GPU support are available here: for Theano and for TensorFlow.

  • pydot and graphviz (optional, if you want to plot the model)

  • HDF5 and h5py (optional, if you use model saving/loading functions)

Materials

A series of Recurrent Neural Networks tutorials:

  1. Part 1 - Introduction to RNNs
  2. Part 2 - Implementing a RNN with Python, Numpy and Theano
  3. Part 3 - Backpropagation Through Time and Vanishing Gradients
  4. Part 4 - Implementing a GRU/LSTM RNN with Python and Theano

Two great resources about LSTM: Understanding LSTM Networks by Christopher Olah and Understanding LSTM and its diagrams by Shi Yan.

The best post on Andrej Karpathy's blog regarding sequence prediction using RNNs: The Unreasonable Effectiveness of Recurrent Neural Networks.

A deeper treatment of RNNs: Chapter 10 - Sequence Modeling: Recurrent and Recursive Nets of the MIT Deep Learning book.

Model

  • Uni-directional RNN model with two LSTM layers:

 RNN LSTM
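
A minimal sketch of such a two-layer uni-directional model in Keras 1.x-style code (the sequence length, class count, and layer size below are hypothetical placeholders, not the project's actual settings):

    from keras.models import Sequential
    from keras.layers.core import Dense, Dropout, Activation
    from keras.layers.recurrent import LSTM

    # Hypothetical shapes: windows of `sentence_length` one-hot vectors over `nb_classes` symbols.
    sentence_length, nb_classes, hidden = 40, 60, 512

    model = Sequential()
    model.add(LSTM(hidden, return_sequences=True,
                   input_shape=(sentence_length, nb_classes)))  # first LSTM layer
    model.add(Dropout(0.2))
    model.add(LSTM(hidden, return_sequences=False))             # second LSTM layer
    model.add(Dropout(0.2))
    model.add(Dense(nb_classes))
    model.add(Activation('softmax'))                            # next-symbol distribution
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop')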

  • Bi-directional RNN model with one LSTM layer:

 BRNN LSTM
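
A minimal sketch of the one-layer bi-directional variant, assuming the Keras 1.x functional API with a merge of a forward and a backward LSTM (the pattern of the imdb_bidirectional_lstm.py example; shapes are hypothetical):

    from keras.models import Model
    from keras.layers import Input, merge
    from keras.layers.core import Dense, Dropout
    from keras.layers.recurrent import LSTM

    sentence_length, nb_classes, hidden = 40, 60, 512   # hypothetical shapes

    inputs = Input(shape=(sentence_length, nb_classes))
    forward = LSTM(hidden)(inputs)                       # left-to-right pass
    backward = LSTM(hidden, go_backwards=True)(inputs)   # right-to-left pass
    merged = merge([forward, backward], mode='concat')   # concatenate both directions
    merged = Dropout(0.2)(merged)
    outputs = Dense(nb_classes, activation='softmax')(merged)

    model = Model(input=inputs, output=outputs)
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop')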

  • Naive Bayes model:

naive_bayes.py is a simple Naive Bayes model used for comparison.

Data

  • Training Set

  • Validation Set

  • Test Set

Training

The hyperas library may help. It is a very simple convenience wrapper around hyperopt for fast prototyping with Keras models. It is used for hyper-parameter optimization. An example can be found here.
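
A condensed sketch of the hyperas usage pattern, adapted from the hyperas README (the toy data and the single tuned dropout value are hypothetical, not this project's actual search space):

    from hyperopt import Trials, STATUS_OK, tpe
    from hyperas import optim
    from hyperas.distributions import uniform
    from keras.models import Sequential
    from keras.layers.core import Dense, Dropout
    from keras.layers.recurrent import LSTM
    import numpy as np

    def data():
        # Hypothetical toy data: 100 windows of 40 one-hot vectors over 60 classes.
        X = np.random.random((100, 40, 60))
        y = np.eye(60)[np.random.randint(0, 60, 100)]
        return X, y, X, y

    def model(X_train, y_train, X_test, y_test):
        m = Sequential()
        m.add(LSTM(128, input_shape=(40, 60)))
        m.add(Dropout({{uniform(0, 1)}}))      # hyperas substitutes a sampled value here
        m.add(Dense(60, activation='softmax'))
        m.compile(loss='categorical_crossentropy', optimizer='rmsprop')
        m.fit(X_train, y_train, nb_epoch=1, verbose=0)
        loss = m.evaluate(X_test, y_test, verbose=0)
        return {'loss': loss, 'status': STATUS_OK, 'model': m}

    best_run, best_model = optim.minimize(model=model, data=data,
                                          algo=tpe.suggest, max_evals=5,
                                          trials=Trials())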

Two good materials:

Considerations:

  • Batch Size: how many streams of data are processed in parallel at one time.

  • Samples per epoch and Batches per epoch: how many samples or batches are considered per epoch. Based on some of my experiments: (i) the more samples there are, the higher the accuracy and the lower the loss at the stable stage; (ii) the more batches there are (the integer ratio #samples/batch_size), the higher the accuracy and the lower the loss at the stable stage, and the fewer iterations it takes to reach the same loss/accuracy value. A minimal fit sketch follows.
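
For context, this is roughly where these quantities enter a Keras 1.x training call (the model is one of those from the Model section above; the arrays and values are hypothetical):

    # batch_size: streams processed in parallel; nb_epoch: passes over the sampled data.
    # With model.fit, the number of batches per epoch is simply len(X_train) // batch_size.
    model.fit(X_train, y_train,
              batch_size=128,
              nb_epoch=50,
              validation_data=(X_val, y_val))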

  • Sentence Length: according to char-rnn:

The length of each data stream, which is also the limit at which the gradients can propagate backwards in time. For example, if seq_length is 20, then the gradient signal will never backpropagate more than 20 time steps, and the model might not find dependencies longer than this length in number of characters.

This is effectively the limit of the model's long-term memory.

Thus, if you have a very difficult dataset where there are a lot of long-term dependencies, you will want to increase this setting.

  • Offset during sampling: the offset is the start index when sampling X_train and y_train from the original sequence. It can be a fixed value or a random value in the range 0 ~ step-1; a minimal sketch follows.
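
A minimal sketch of such offset-based sampling (function and variable names are hypothetical, not the project's actual code):

    import numpy as np

    def sample_sequences(sequence, sentence_length, step, offset=0):
        """Cut `sequence` into training windows starting at `offset`, advancing by `step`."""
        X_train, y_train = [], []
        for i in range(offset, len(sequence) - sentence_length, step):
            X_train.append(sequence[i: i + sentence_length])   # input window
            y_train.append(sequence[i + sentence_length])       # next element to predict
        return np.array(X_train), np.array(y_train)

    # A random offset in the range 0 ~ step-1:
    # offset = np.random.randint(0, step)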

  • Data size vs. #parameters in total:

  • #layers: the number of layers; the guidance here suggests always using a num_layers of either 2 or 3.

  • layer size: the number of units per layer.

According to char-rnn, the two important quantities to keep track of here are:

  • The total number of parameters in your model.
  • The size of your dataset. These two should be about the same order of magnitude.

How to calculate the number of parameters in an RNN? For example, consider one LSTM layer:

  • if it has a layer size of H=512;

  • if the vocabulary size is C=3000 (the number of unique classes);

  • the layer will have three parameter matrices: U with dimension (H, C)=(512, 3000), V with dimension (C, H)=(3000, 512), and W with dimension (H, H)=(512, 512);

  • the total number of parameters for this one layer will be 2HC + H^2, which is 3,334,144 in this case.

  • That is over 3 million parameters for only one layer!
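
The same count, spelled out (note this is the simplified U/V/W factorization used above; a full LSTM layer additionally has gate-specific weight matrices):

    # Parameter count for one recurrent layer under the U/V/W factorization above.
    H, C = 512, 3000          # layer size, vocabulary size
    U = H * C                 # input-to-hidden,  (H, C)
    W = H * H                 # hidden-to-hidden, (H, H)
    V = C * H                 # hidden-to-output, (C, H)
    total = U + W + V         # = 2*H*C + H**2
    print(total)              # 3334144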

  • Learning Rate: this ratio influences the speed (the step size of gradient descent) and the quality of learning. The greater the rate, the faster the network trains; the lower the rate, the more accurate the training can be. According to LSTM: A Search Space Odyssey [1]:

The learning rate is by far the most important hyperparameter. Based on their suggestion, while searching for a good learning rate for the LSTM, it is sufficient to do a coarse search by starting with a high value (e.g. 1.0) and dividing it by ten until performance stops increasing.
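
A sketch of that coarse search (build_model and the data arrays are hypothetical placeholders for the project's own model and data):

    from keras.optimizers import RMSprop

    best_lr, best_acc = None, 0.0
    lr = 1.0                                   # start high, as suggested in [1]
    while lr > 1e-6:
        model = build_model()                  # hypothetical: rebuild the network
        model.compile(loss='categorical_crossentropy',
                      optimizer=RMSprop(lr=lr), metrics=['accuracy'])
        model.fit(X_train, y_train, nb_epoch=3, verbose=0)
        _, acc = model.evaluate(X_val, y_val, verbose=0)
        if acc <= best_acc:
            break                              # performance stopped increasing
        best_lr, best_acc = lr, acc
        lr /= 10.0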

  • Dropout: a float between 0 and 1, indicating what fraction of the hidden layer outputs is dropped when feeding the next layer. It is a powerful regularization method, mainly used to avoid overfitting. If your model is overfitting, it is better to increase the dropout value.

  • Sampling temperature: the temperature parameter divides the predicted log probabilities before the Softmax, so a lower temperature makes the model produce more likely but also more boring and conservative predictions, while higher temperatures make the model take more chances, increasing the diversity of results at the cost of more mistakes.
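
This is the sampling-temperature trick used in the Keras lstm_text_generation.py example the project builds on; a sketch of the helper:

    import numpy as np

    def sample(preds, temperature=1.0):
        """Sample an index from a probability array rescaled by `temperature`."""
        preds = np.asarray(preds).astype('float64')
        preds = np.log(preds) / temperature        # divide the log probabilities
        exp_preds = np.exp(preds)
        preds = exp_preds / np.sum(exp_preds)      # re-normalize (softmax)
        probas = np.random.multinomial(1, preds, 1)
        return np.argmax(probas)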

  • Loss function: categorical_crossentropy

  • Optimizer: RMSprop; you can try other options such as plain SGD, Adagrad, and Adam.
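
Swapping the optimizer only touches the compile call, e.g. (model as built in the Model section above):

    from keras.optimizers import RMSprop, SGD, Adagrad, Adam

    model.compile(loss='categorical_crossentropy',
                  optimizer=RMSprop(lr=0.001),   # or SGD(lr=0.01), Adagrad(), Adam()
                  metrics=['accuracy'])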

Reference

[1] Greff, Klaus, Rupesh Kumar Srivastava, Jan Koutník, Bas R. Steunebrink, and Jürgen Schmidhuber. "LSTM: A search space odyssey." arXiv preprint arXiv:1503.04069 (2015).