Sunnydreamrain / IndRNN_pytorch

Licence: other
Independently Recurrent Neural Networks (IndRNN) implemented in PyTorch.

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects

Projects that are alternatives of or similar to IndRNN pytorch

theano-recurrence
Recurrent Neural Networks (RNN, GRU, LSTM) and their Bidirectional versions (BiRNN, BiGRU, BiLSTM) for word & character level language modelling in Theano
Stars: ✭ 40 (-64.29%)
Mutual labels:  language-modeling, rnn
Rep-Counter
AI Exercise Rep Counter based on Google's Human Pose Estimation Library (Posenet)
Stars: ✭ 47 (-58.04%)
Mutual labels:  rnn
cudnn rnn theano benchmarks
No description or website provided.
Stars: ✭ 22 (-80.36%)
Mutual labels:  rnn
Recurrent-Neural-Network-for-BitCoin-price-prediction
Recurrent Neural Network (LSTM) by using TensorFlow and Keras in Python for BitCoin price prediction
Stars: ✭ 53 (-52.68%)
Mutual labels:  rnn
Human-Activity-Recognition
Human activity recognition using TensorFlow on smartphone sensors dataset and an LSTM RNN. Classifying the type of movement amongst six categories (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING).
Stars: ✭ 16 (-85.71%)
Mutual labels:  rnn
char-rnnlm-tensorflow
Char RNN Language Model based on Tensorflow
Stars: ✭ 14 (-87.5%)
Mutual labels:  rnn
LSTM-TensorSpark
Implementation of an LSTM with TensorFlow, distributed on Apache Spark
Stars: ✭ 40 (-64.29%)
Mutual labels:  rnn
dart-package-publisher
Action to publish a Dart / Flutter package to https://pub.dev. When you need to publish a package, just bump the version in pubspec.yaml.
Stars: ✭ 45 (-59.82%)
Mutual labels:  action
upload-to-discord
A GitHub Action that uploads a file to a Discord channel.
Stars: ✭ 44 (-60.71%)
Mutual labels:  action
chrome-extension-upload
upload & publish extensions to the Chrome Web Store.
Stars: ✭ 35 (-68.75%)
Mutual labels:  action
actions
Set of actions for implementing CI/CD with werf and GitHub Actions
Stars: ✭ 67 (-40.18%)
Mutual labels:  action
apollo11
elementary app skeleton (hello houston)
Stars: ✭ 27 (-75.89%)
Mutual labels:  skeleton
react-skeleton-loader
A react helper for skeleton loaders
Stars: ✭ 61 (-45.54%)
Mutual labels:  skeleton
etc-skel
/etc/skel with super cool confs for tmux, psql, inputrc, git, bash, dircolors, and more.
Stars: ✭ 22 (-80.36%)
Mutual labels:  skeleton
MachineLearning Exercises Python TensorFlow
A collection of practice code for Python and the machine-learning library TensorFlow, with a particular focus on neural networks.
Stars: ✭ 36 (-67.86%)
Mutual labels:  rnn
setup-bats
GitHub Action to setup BATS testing framework
Stars: ✭ 25 (-77.68%)
Mutual labels:  action
dltf
Hands-on in-person workshop for Deep Learning with TensorFlow
Stars: ✭ 14 (-87.5%)
Mutual labels:  rnn
Vapecord-ACNL-Plugin
Animal Crossing NL Vapecord Public Plugin WIP
Stars: ✭ 72 (-35.71%)
Mutual labels:  action
tf-ran-cell
Recurrent Additive Networks for Tensorflow
Stars: ✭ 16 (-85.71%)
Mutual labels:  rnn
DrowsyDriverDetection
This is a project implementing Computer Vision and Deep Learning concepts to detect drowsiness of a driver and sound an alarm if drowsy.
Stars: ✭ 82 (-26.79%)
Mutual labels:  rnn

Independently Recurrent Neural Networks

This code implements the IndRNN and the deep IndRNN. It is based on PyTorch.

cuda_IndRNN_onlyrecurrent is the CUDA version. It is much faster than the plain PyTorch implementation; for the sequential MNIST example (sequence length 784), it runs over 31 times faster.

Please cite the following papers if you find them useful.
Shuai Li, Wanqing Li, Chris Cook, Ce Zhu, and Yanbo Gao. "Independently Recurrent Neural Network (IndRNN): Building a Longer and Deeper RNN." CVPR 2018.
Shuai Li, Wanqing Li, Chris Cook, Yanbo Gao, and Ce Zhu. "Deep Independently Recurrent Neural Network (IndRNN)." arXiv preprint arXiv:1910.06251, 2019.

@inproceedings{li2018independently,
  title={Independently recurrent neural network (indrnn): Building a longer and deeper rnn},
  author={Li, Shuai and Li, Wanqing and Cook, Chris and Zhu, Ce and Gao, Yanbo},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={5457--5466},
  year={2018}
}

@article{li2019deep,
  title={Deep Independently Recurrent Neural Network (IndRNN)},
  author={Li, Shuai and Li, Wanqing and Cook, Chris and Gao, Yanbo and Zhu, Ce},
  journal={arXiv preprint arXiv:1910.06251},
  year={2019}
}

Summary of advantages

  • Able to process longer sequences (over 5000 steps): the gradient vanishing and exploding problems are addressed.
  • Able to construct deeper networks (over 20 layers, and much deeper if GPU memory allows)
  • Able to be robustly trained with ReLU
  • Able to interpret the behaviour of each IndRNN neuron independently of the other neurons
  • Reduced complexity (over 10x faster than the cuDNN LSTM when the sequence is long)

Usage

IndRNN_onlyrecurrent.py provides only the recurrent+activation part of the IndRNN function. Therefore, the input still needs to be processed by a dense (fully connected) or convolution layer outside of it. This is useful for adding batch normalization (BN) between the input processing and the activation function. Just consider it as a ReLU function with recurrent connections. I believe this is more flexible since you can apply all kinds of processing to the inputs.
cuda_IndRNN_onlyrecurrent is the CUDA version of this module; as noted above, it is much faster than the plain PyTorch implementation.
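
As a concrete illustration, below is a minimal, self-contained sketch (not the repository's class) of how a recurrent-only IndRNN layer combines with separate input processing and optional BN; the layer itself only computes h_t = relu(x_t + u * h_{t-1}) element-wise, where x_t is the already-processed input.

    import torch
    import torch.nn as nn

    class RecurrentOnlyIndRNN(nn.Module):
        """Recurrent + ReLU part only; input processing is done outside."""
        def __init__(self, hidden_size):
            super().__init__()
            # One recurrent weight per neuron (element-wise, hence "independently").
            self.u = nn.Parameter(torch.empty(hidden_size).uniform_(0, 1))

        def forward(self, x):  # x: (seq_len, batch, hidden_size), already W x_t
            h = torch.zeros_like(x[0])
            outputs = []
            for x_t in x:
                h = torch.relu(x_t + self.u * h)
                outputs.append(h)
            return torch.stack(outputs)

    # Input processing (here a dense layer) plus optional BN before the recurrence.
    seq_len, batch, in_size, hidden = 20, 8, 32, 64
    fc = nn.Linear(in_size, hidden)
    bn = nn.BatchNorm1d(hidden)
    rnn = RecurrentOnlyIndRNN(hidden)

    x = torch.randn(seq_len, batch, in_size)
    pre = fc(x)                                                    # process the input
    pre = bn(pre.reshape(-1, hidden)).reshape(seq_len, batch, hidden)  # optional BN
    out = rnn(pre)                                                 # recurrence + ReLU only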

Requirements

  • PyTorch

For the CUDA version

  • CuPy
  • pynvrtc

Running

Please refer to the code for the individual tasks (e.g., the adding problem, sequential MNIST classification, and action recognition).

Considerations in implementation

1, Initialization of the recurrent weights

For ReLU, Uniform(0,1) is used so that different neurons keep different kinds of memory. However, for problems that only use the output of the last time step, such as the adding problem, MNIST classification, and action recognition, the recurrent weights of the last IndRNN layer (caution: only the last one, not all layers) can be initialized to all 1 or to a proper range (1-epsilon, 1+epsilon), where epsilon is a small number, since only long-term memory is needed for the output of this layer. Examples are shown in Indrnn_action_network.py.
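
A minimal sketch of this initialization scheme, assuming each IndRNN layer exposes its per-neuron recurrent weight as a 1-D parameter u (a hypothetical attribute name, not necessarily the repository's):

    import torch

    def init_recurrent_weights(layers, last_layer_long_memory=True, epsilon=0.01):
        for i, layer in enumerate(layers):
            if last_layer_long_memory and i == len(layers) - 1:
                # Last layer: only long-term memory is needed, so start near 1.
                torch.nn.init.uniform_(layer.u, 1.0 - epsilon, 1.0 + epsilon)
            else:
                # Other layers: Uniform(0,1) so neurons keep different kinds of memory.
                torch.nn.init.uniform_(layer.u, 0.0, 1.0)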

2, Constraint of the recurrent weights

For ReLU, the recurrent weights can generally be constrained to [-U_bound, U_bound], where U_bound=pow(args.MAG, 1.0 / seq_len) and MAG can be 2, 10, or another value. If the sequence is very long, the range can simply be [-1, 1], since U_bound is then very close to 1 and GPU precision is limited. If the sequence is short, such as 20, no constraint is needed. An example of the constraint is shown in Indrnn_action_train.py. By the way, this constraint can also be implemented as a weight decay of ||max(0, |U| - U_bound)||.
For simplicity, the constraint can always be set to [-1, 1], as this already keeps long-term memory and the difference in performance is small.
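
A minimal sketch of the constraint, where recurrent_params is assumed to be the collection of per-neuron recurrent weight tensors (hypothetical, not a repository API); the clipping is applied after each optimizer step:

    import torch

    MAG, seq_len = 2.0, 784
    U_bound = pow(MAG, 1.0 / seq_len)

    def clip_recurrent_weights(recurrent_params, bound=U_bound):
        # Keep each recurrent weight within [-bound, bound].
        with torch.no_grad():
            for u in recurrent_params:
                u.clamp_(-bound, bound)

    # Typical training loop usage:
    # optimizer.step()
    # clip_recurrent_weights(recurrent_params)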

3, Usage of batch normalization (BN)

Generally, with more than 3 layers, BN can help accelerate training. BN can be used before or after the activation function. In our experiments, we found that it converges faster with BN after the activation function. However, for tasks such as PTB_c, where the output of one batch is used to initialize the next batch, it is better to put BN before the activation, as mentioned in the example above.

4, Learning rate

In our experiments, Adam with a learning rate of 2e-4 works well.

5, Weight decay

If weight decay is used, there is no need to apply it to the recurrent weights.
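
A minimal sketch combining this and the previous point: Adam with a learning rate of 2e-4, and weight decay applied only to the non-recurrent parameters. The is_recurrent helper is hypothetical and should be adapted to how the recurrent weights are named in the actual model.

    import torch

    def make_optimizer(model, lr=2e-4, weight_decay=1e-4,
                       is_recurrent=lambda name: "recurrent" in name):
        recurrent, others = [], []
        for name, param in model.named_parameters():
            (recurrent if is_recurrent(name) else others).append(param)
        return torch.optim.Adam(
            [{"params": others, "weight_decay": weight_decay},
             {"params": recurrent, "weight_decay": 0.0}],  # no decay on recurrent weights
            lr=lr,
        )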

6, Usage of dropout

Dropout (if used) is applied with the same mask over all time steps, i.e., the mask is sampled once per sequence and reused at every step.
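
A minimal sketch of such time-invariant ("locked") dropout: one Bernoulli mask is sampled per sequence and broadcast over all time steps.

    import torch

    def locked_dropout(x, p=0.5, training=True):
        # x: (seq_len, batch, features)
        if not training or p == 0.0:
            return x
        # Sample the mask once (inverted-dropout scaling by 1/(1-p)) ...
        mask = x.new_empty(1, x.size(1), x.size(2)).bernoulli_(1 - p) / (1 - p)
        # ... and broadcast the same mask over the time dimension.
        return x * mask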

Note

The above considerations are just suggestions. I did not explore many training techniques, such as alternative training methods or initialization schemes, so better results may be achieved with other options.

Other implementations

Theano and Lasagne:
https://github.com/Sunnydreamrain/IndRNN_Theano_Lasagne
TensorFlow:
https://github.com/batzner/indrnn
Keras:
https://github.com/titu1994/Keras-IndRNN
PyTorch:
https://github.com/StefOe/indrnn-pytorch
https://github.com/theSage21/IndRNN
https://github.com/zhangxu0307/Ind-RNN
Chainer:
https://github.com/0shimax/chainer-IndRNN
