
eladhoffer / Seq2seq.pytorch

License: MIT
Sequence-to-Sequence learning using PyTorch

Programming Languages

python

Projects that are alternatives of or similar to Seq2seq.pytorch

Nmtpytorch
Sequence-to-Sequence Framework in PyTorch
Stars: ✭ 392 (-23.74%)
Mutual labels:  seq2seq, neural-machine-translation
Tensorflow Shakespeare
Neural machine translation between the writings of Shakespeare and modern English using TensorFlow
Stars: ✭ 244 (-52.53%)
Mutual labels:  seq2seq, neural-machine-translation
Xmunmt
An implementation of RNNsearch using TensorFlow
Stars: ✭ 69 (-86.58%)
Mutual labels:  seq2seq, neural-machine-translation
Openseq2seq
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
Stars: ✭ 1,378 (+168.09%)
Mutual labels:  seq2seq, neural-machine-translation
minimal-nmt
A minimal NMT example to serve as a seq2seq+attention reference.
Stars: ✭ 36 (-93%)
Mutual labels:  seq2seq, neural-machine-translation
Joeynmt
Minimalist NMT for educational purposes
Stars: ✭ 420 (-18.29%)
Mutual labels:  seq2seq, neural-machine-translation
Nspm
🤖 Neural SPARQL Machines for Knowledge Graph Question Answering.
Stars: ✭ 156 (-69.65%)
Mutual labels:  seq2seq, neural-machine-translation
Sockeye
Sequence-to-sequence framework with a focus on Neural Machine Translation based on Apache MXNet
Stars: ✭ 990 (+92.61%)
Mutual labels:  seq2seq, neural-machine-translation
dynmt-py
Neural machine translation implementation using dynet's python bindings
Stars: ✭ 17 (-96.69%)
Mutual labels:  seq2seq, neural-machine-translation
Word-Level-Eng-Mar-NMT
Translating English sentences to Marathi using Neural Machine Translation
Stars: ✭ 37 (-92.8%)
Mutual labels:  seq2seq, neural-machine-translation
Pytorch Seq2seq
Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.
Stars: ✭ 3,418 (+564.98%)
Mutual labels:  seq2seq, neural-machine-translation
transformer
Neutron: A pytorch based implementation of Transformer and its variants.
Stars: ✭ 60 (-88.33%)
Mutual labels:  seq2seq, neural-machine-translation
RNNSearch
An implementation of attention-based neural machine translation using Pytorch
Stars: ✭ 43 (-91.63%)
Mutual labels:  seq2seq, neural-machine-translation
Tf Seq2seq
Sequence to sequence learning using TensorFlow.
Stars: ✭ 387 (-24.71%)
Mutual labels:  seq2seq, neural-machine-translation
Time Series Prediction
A collection of time series prediction methods: rnn, seq2seq, cnn, wavenet, transformer, unet, n-beats, gan, kalman-filter
Stars: ✭ 351 (-31.71%)
Mutual labels:  seq2seq
Nlp Tutorials
Simple implementations of NLP models. Tutorials are written in Chinese on my website https://mofanpy.com
Stars: ✭ 394 (-23.35%)
Mutual labels:  seq2seq
Im2latex
Image to LaTeX (Seq2seq + Attention with Beam Search) - Tensorflow
Stars: ✭ 342 (-33.46%)
Mutual labels:  seq2seq
Pytorch Chatbot
Pytorch seq2seq chatbot
Stars: ✭ 336 (-34.63%)
Mutual labels:  seq2seq
Neural Combinatorial Rl Pytorch
PyTorch implementation of Neural Combinatorial Optimization with Reinforcement Learning https://arxiv.org/abs/1611.09940
Stars: ✭ 329 (-35.99%)
Mutual labels:  seq2seq
Nl2bash
Generating bash command from natural language https://arxiv.org/abs/1802.08979
Stars: ✭ 325 (-36.77%)
Mutual labels:  seq2seq

Seq2Seq in PyTorch

This is a complete suite for training sequence-to-sequence models in PyTorch. It consists of several models, together with the code needed to train them and run inference.

Using this code you can train:

  • Neural-machine-translation (NMT) models
  • Language models
  • Image to caption generation
  • Skip-thought sentence representations
  • And more...

Installation

git clone --recursive https://github.com/eladhoffer/seq2seq.pytorch
cd seq2seq.pytorch; python setup.py develop

Models

Models currently available include an attentional recurrent encoder-decoder (RecurrentAttentionSeq2Seq) and a Transformer (Transformer); both are used in the training examples below.

Datasets

Datasets currently available include WMT16 German-English (WMT16_de_en), which is used in the training examples below.

All datasets can be tokenized using one of three available segmentation methods:

  • Character based segmentation
  • Word based segmentation
  • Byte-pair encoding (BPE), as suggested by Sennrich et al., with a selectable number of tokens.

After choosing a tokenization method, a vocabulary will be generated and saved for future inference.
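To make the three segmentation choices and the saved vocabulary concrete, here is a minimal sketch in plain Python. It is illustrative only: the file name, helper function, and vocabulary format are assumptions and do not reflect this repository's actual tokenizer API (the real BPE path learns merge operations first, e.g. with subword-nmt).

import json
from collections import Counter

corpus = ["the quick brown fox", "the lazy dog"]

def segment(line, mode="word"):
    # hypothetical helper: pick one of the segmentation modes described above
    if mode == "char":
        return list(line)        # character-based segmentation
    if mode == "word":
        return line.split()      # word-based segmentation
    raise ValueError("BPE requires learning merge operations first (e.g. with subword-nmt)")

counts = Counter(tok for line in corpus for tok in segment(line, "word"))
vocab = {tok: i for i, (tok, _) in enumerate(counts.most_common())}

with open("vocab.json", "w") as f:
    json.dump(vocab, f)          # persisted so inference can reuse the same mapping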

Training methods

The models can be trained using several methods:

  • Basic Seq2Seq - given an encoded sequence, generate (decode) the output sequence. Training is done with teacher forcing (see the sketch after this list).
  • Multi Seq2Seq - where several tasks (such as multiple languages) are trained simultaneously by using the data sequences as both input to the encoder and output for the decoder.
  • Image2Seq - used to train image to caption generators.
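As referenced above, teacher forcing means the decoder is fed the gold target prefix at each step instead of its own predictions. Below is a minimal, self-contained PyTorch sketch of teacher-forced training for a toy encoder-decoder; all class and variable names are made up for illustration and are not this repository's model or trainer classes.

import torch
import torch.nn as nn

class TinySeq2Seq(nn.Module):
    def __init__(self, vocab_size=1000, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, src, tgt_in):
        _, state = self.encoder(self.embed(src))               # encode the source sequence
        dec_out, _ = self.decoder(self.embed(tgt_in), state)    # teacher forcing: feed the gold prefix
        return self.out(dec_out)

model = TinySeq2Seq()
criterion = nn.CrossEntropyLoss()
src = torch.randint(0, 1000, (8, 12))        # dummy source batch
tgt = torch.randint(0, 1000, (8, 10))        # dummy target batch
logits = model(src, tgt[:, :-1])             # decoder inputs are gold tokens shifted right
loss = criterion(logits.reshape(-1, 1000), tgt[:, 1:].reshape(-1))
loss.backward()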

Usage

Example training scripts are available in the scripts folder. Inference examples are available in the examples folder.

  • Example for training a Transformer on WMT16 following the original paper's training regime:
DATASET=${1:-"WMT16_de_en"}
DATASET_DIR=${2:-"./data/wmt16_de_en"}
OUTPUT_DIR=${3:-"./results"}

WARMUP="4000"
LR0="512**(-0.5)"

python main.py \
  --save transformer \
  --dataset ${DATASET} \
  --dataset-dir ${DATASET_DIR} \
  --results-dir ${OUTPUT_DIR} \
  --model Transformer \
  --model-config "{'num_layers': 6, 'hidden_size': 512, 'num_heads': 8, 'inner_linear': 2048}" \
  --data-config "{'moses_pretok': True, 'tokenization':'bpe', 'num_symbols':32000, 'shared_vocab':True}" \
  --b 128 \
  --max-length 100 \
  --device-ids 0 \
  --label-smoothing 0.1 \
  --trainer Seq2SeqTrainer \
  --optimization-config "[{'step_lambda':
                          \"lambda t: { \
                              'optimizer': 'Adam', \
                              'lr': ${LR0} * min(t ** -0.5, t * ${WARMUP} ** -1.5), \
                              'betas': (0.9, 0.98), 'eps':1e-9}\"
                          }]"
  • Example for training an attentional LSTM-based model with 3 layers in both the encoder and decoder:
python main.py \
  --save de_en_wmt17 \
  --dataset ${DATASET} \
  --dataset-dir ${DATASET_DIR} \
  --results-dir ${OUTPUT_DIR} \
  --model RecurrentAttentionSeq2Seq \
  --model-config "{'hidden_size': 512, 'dropout': 0.2, \
                   'tie_embedding': True, 'transfer_hidden': False, \
                   'encoder': {'num_layers': 3, 'bidirectional': True, 'num_bidirectional': 1, 'context_transform': 512}, \
                   'decoder': {'num_layers': 3, 'concat_attention': True,\
                               'attention': {'mode': 'dot_prod', 'dropout': 0, 'output_transform': True, 'output_nonlinearity': 'relu'}}}" \
  --data-config "{'moses_pretok': True, 'tokenization':'bpe', 'num_symbols':32000, 'shared_vocab':True}" \
  --b 128 \
  --max-length 80 \
  --device-ids 0 \
  --trainer Seq2SeqTrainer \
  --optimization-config "[{'epoch': 0, 'optimizer': 'Adam', 'lr': 1e-3},
                          {'epoch': 6, 'lr': 5e-4},
                          {'epoch': 8, 'lr':1e-4},
                          {'epoch': 10, 'lr': 5e-5},
                          {'epoch': 12, 'lr': 1e-5}]" \