MaximumEntropy / Seq2seq Pytorch

Licence: WTFPL
Sequence to Sequence Models with PyTorch

Programming Languages

Python

Projects that are alternatives to or similar to Seq2seq Pytorch

Kaggle Web Traffic
1st place solution
Stars: ✭ 1,641 (+142.04%)
Mutual labels:  rnn, seq2seq
Tensorflow Tutorials
Provides source code for practicing TensorFlow step by step, from the basics through applications
Stars: ✭ 2,096 (+209.14%)
Mutual labels:  rnn, seq2seq
Chinese Chatbot
A Chinese chatbot trained on 100,000 dialogue pairs with an attention mechanism; it generates a meaningful reply to most everyday questions. The trained model is included and can be run directly.
Stars: ✭ 124 (-81.71%)
Mutual labels:  rnn, seq2seq
Mxnet Seq2seq
Sequence to sequence learning with MXNET
Stars: ✭ 51 (-92.48%)
Mutual labels:  rnn, seq2seq
DeepLearning-Lab
Code lab for deep learning, covering RNNs, seq2seq, word2vec, cross entropy, bidirectional RNNs, convolution, pooling, InceptionV3, and transfer learning.
Stars: ✭ 83 (-87.76%)
Mutual labels:  rnn, seq2seq
Awesome Speech Recognition Speech Synthesis Papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
Stars: ✭ 2,085 (+207.52%)
Mutual labels:  rnn, seq2seq
Poetry Seq2seq
Chinese Poetry Generation
Stars: ✭ 159 (-76.55%)
Mutual labels:  rnn, seq2seq
Natural Language Processing With Tensorflow
Natural Language Processing with TensorFlow, published by Packt
Stars: ✭ 222 (-67.26%)
Mutual labels:  rnn, seq2seq
tensorflow-ml-nlp-tf2
Hands-on materials for "Natural Language Processing with TensorFlow 2 and Machine Learning (from logistic regression to BERT and GPT-3)"
Stars: ✭ 245 (-63.86%)
Mutual labels:  rnn, seq2seq
Pytorch Seq2seq
Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.
Stars: ✭ 3,418 (+404.13%)
Mutual labels:  rnn, seq2seq
GAN-RNN Timeseries-imputation
Recurrent GAN for imputation of time series data. Implemented in TensorFlow 2 on the Wikipedia Web Traffic Forecast dataset from Kaggle.
Stars: ✭ 107 (-84.22%)
Mutual labels:  rnn, seq2seq
Base-On-Relation-Method-Extract-News-DA-RNN-Model-For-Stock-Prediction--Pytorch
A dual-stage attention mechanism model based on a relational news extraction method, used for stock prediction.
Stars: ✭ 33 (-95.13%)
Mutual labels:  rnn, seq2seq
Text summurization abstractive methods
Multiple implementations of abstractive text summarization, using Google Colab
Stars: ✭ 359 (-47.05%)
Mutual labels:  rnn, seq2seq
Mozi
This project aims to build a minimal, streamlined, maintainable react-native project that supports iOS and Android 🌹
Stars: ✭ 501 (-26.11%)
Mutual labels:  rnn
Seq2seq
Minimal Seq2Seq model with Attention for Neural Machine Translation in PyTorch
Stars: ✭ 552 (-18.58%)
Mutual labels:  seq2seq
Rgan
Recurrent (conditional) generative adversarial networks for generating real-valued time series data.
Stars: ✭ 480 (-29.2%)
Mutual labels:  rnn
Seq2seqchatbots
A wrapper around tensor2tensor to flexibly train, interact, and generate data for neural chatbots.
Stars: ✭ 466 (-31.27%)
Mutual labels:  seq2seq
Telemanom
A framework for using LSTMs to detect anomalies in multivariate time series data. Includes spacecraft anomaly data and experiments from the Mars Science Laboratory and SMAP missions.
Stars: ✭ 589 (-13.13%)
Mutual labels:  rnn
How To Learn Deep Learning
A top-down, practical guide to learning AI, deep learning, and machine learning.
Stars: ✭ 544 (-19.76%)
Mutual labels:  rnn
Srnn
sliced-rnn
Stars: ✭ 462 (-31.86%)
Mutual labels:  rnn

Sequence to Sequence models with PyTorch

This repository contains implementations of Sequence to Sequence (Seq2Seq) models in PyTorch.

At present, it has implementations for:

* Vanilla Sequence to Sequence models

* Attention based Sequence to Sequence models from https://arxiv.org/abs/1409.0473 and https://arxiv.org/abs/1508.04025

* Faster attention mechanisms using dot products between the **final** encoder and decoder hidden states

* Sequence to Sequence autoencoders (experimental)

Sequence to Sequence models

A vanilla sequence to sequence model, as presented in https://arxiv.org/abs/1409.3215 and https://arxiv.org/abs/1406.1078, uses a recurrent neural network such as an LSTM (http://dl.acm.org/citation.cfm?id=1246450) or GRU (https://arxiv.org/abs/1412.3555) to encode a sequence of words or characters in a source language into a fixed-length vector representation, and then decodes from that representation using another RNN in the target language.

Sequence to Sequence
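
A minimal PyTorch sketch of this encoder-decoder structure (illustrative only, not the code in this repository; all class and variable names are made up for the example):

```python
import torch
import torch.nn as nn

class VanillaSeq2Seq(nn.Module):
    """Minimal sketch: encode the source into a fixed-length state,
    then decode the target conditioned on that state (teacher forcing)."""

    def __init__(self, src_vocab, trg_vocab, emb_dim=512, hidden_dim=1024):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.trg_emb = nn.Embedding(trg_vocab, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, trg_vocab)

    def forward(self, src, trg):
        # Encode the source sequence; keep only the final (h, c) state.
        _, state = self.encoder(self.src_emb(src))
        # Decode the target sequence starting from the encoder's final state.
        dec_out, _ = self.decoder(self.trg_emb(trg), state)
        # Project decoder states to unnormalized scores over the target vocabulary.
        return self.out(dec_out)
```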

An extension of sequence to sequence models that incorporates an attention mechanism was presented in https://arxiv.org/abs/1409.0473; it uses information from the RNN hidden states in the source language at each time step of the decoder RNN. This attention mechanism significantly improves performance on tasks like machine translation. A few variants of the attention model for the task of machine translation were presented in https://arxiv.org/abs/1508.04025.

Sequence to Sequence with attention
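
A rough sketch of a single attention step (dot-product scoring in the spirit of Luong et al., chosen here for brevity; the function and tensor names are illustrative, not this repository's API):

```python
import torch

def attend(dec_hidden, enc_outputs):
    # dec_hidden:  (batch, hidden)          current decoder hidden state
    # enc_outputs: (batch, src_len, hidden) encoder hidden states for every source position
    scores = torch.bmm(enc_outputs, dec_hidden.unsqueeze(2)).squeeze(2)  # (batch, src_len)
    weights = torch.softmax(scores, dim=1)                               # attention weights over the source
    context = torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)    # (batch, hidden) context vector
    return context, weights
```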

The repository also contains a simpler and faster variant of the attention mechanism that does not attend over the hidden states of the encoder at each decoder time step. Instead, it computes a single batched dot product between all the hidden states of the decoder and encoder once, after the decoder has processed all inputs in the target. This comes at a minor cost in model performance. One advantage of this variant is that the cuDNN LSTM can be used in the attention-based decoder as well, since the attention is computed only after running through all the inputs of the decoder.
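
A sketch of that single batched dot product (shapes and names are assumptions for illustration): once the decoder LSTM has produced hidden states for every target position, attention over the encoder states can be computed for all time steps at once:

```python
import torch

def fast_attention(dec_outputs, enc_outputs):
    # dec_outputs: (batch, trg_len, hidden) decoder hidden states for all target positions
    # enc_outputs: (batch, src_len, hidden) encoder hidden states for all source positions
    scores = torch.bmm(dec_outputs, enc_outputs.transpose(1, 2))  # (batch, trg_len, src_len)
    weights = torch.softmax(scores, dim=2)                        # attend over source positions
    contexts = torch.bmm(weights, enc_outputs)                    # (batch, trg_len, hidden)
    return contexts, weights
```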

Results on English-French WMT14

The following presents the model architecture and results obtained when training on the WMT14 English-French dataset. The training data is the English-French bitext from Europarl-v7. The validation dataset is newstest2011.

The model was trained with the following configuration:

* Source and target word embedding dimensions - 512

* Source and target LSTM hidden dimensions - 1024

* Encoder - 2 Layer Bidirectional LSTM

* Decoder - 1 Layer LSTM

* Optimization - Adam with a learning rate of 0.0001 and a batch size of 80

* Decoding - Greedy decoding (argmax); a minimal sketch follows this list
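
A minimal sketch of greedy (argmax) decoding, reusing the hypothetical VanillaSeq2Seq interface from the earlier example (the BOS/EOS ids, max_len, and model signature are all illustrative):

```python
import torch

def greedy_decode(model, src, bos_id, eos_id, max_len=100):
    # src: (1, src_len) tensor of source token ids.
    # Repeatedly feed the tokens generated so far and take the argmax of the
    # next-token distribution, stopping at EOS or after max_len steps.
    tokens = [bos_id]
    for _ in range(max_len):
        trg = torch.tensor([tokens])              # (1, generated length so far)
        logits = model(src, trg)                  # (1, generated length, trg_vocab)
        next_id = logits[0, -1].argmax().item()   # most probable next token
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens
```
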
| Model | BLEU | Train Time Per Epoch |
| --- | --- | --- |
| Seq2Seq | 11.82 | 2h 50min |
| Seq2Seq FastAttention | 18.89 | 3h 45min |
| Seq2Seq Attention | 22.60 | 4h 47min |

Times reported were measured on a pre-2016 NVIDIA GeForce Titan X.

Running

To run, edit the config file and execute `python nmt.py --config <your_config_file>`.

NOTE: This only runs on a GPU for now.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].