
xwgeng / RNNSearch

Licence: other
An implementation of attention-based neural machine translation using PyTorch

Programming Languages

python: 139,335 projects (#7 most used programming language)
perl: 6,916 projects
shell: 77,523 projects

Projects that are alternatives to or similar to RNNSearch

Pytorch Seq2seq
Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.
Stars: ✭ 3,418 (+7848.84%)
Mutual labels:  seq2seq, attention, neural-machine-translation, sequence-to-sequence
Tf Seq2seq
Sequence to sequence learning using TensorFlow.
Stars: ✭ 387 (+800%)
Mutual labels:  seq2seq, neural-machine-translation, sequence-to-sequence, nmt
Xmunmt
An implementation of RNNsearch using TensorFlow
Stars: ✭ 69 (+60.47%)
Mutual labels:  seq2seq, neural-machine-translation, sequence-to-sequence, nmt
Nmt Keras
Neural Machine Translation with Keras
Stars: ✭ 501 (+1065.12%)
Mutual labels:  neural-machine-translation, sequence-to-sequence, nmt
Nematus
Open-Source Neural Machine Translation in TensorFlow
Stars: ✭ 730 (+1597.67%)
Mutual labels:  neural-machine-translation, sequence-to-sequence, nmt
Neuralmonkey
An open-source tool for sequence learning in NLP built on TensorFlow.
Stars: ✭ 400 (+830.23%)
Mutual labels:  neural-machine-translation, sequence-to-sequence, nmt
Njunmt Tf
An open-source neural machine translation system developed by the Natural Language Processing Group, Nanjing University.
Stars: ✭ 97 (+125.58%)
Mutual labels:  attention, neural-machine-translation, nmt
Word-Level-Eng-Mar-NMT
Translating English sentences to Marathi using Neural Machine Translation
Stars: ✭ 37 (-13.95%)
Mutual labels:  seq2seq, neural-machine-translation, sequence-to-sequence
Neural sp
End-to-end ASR/LM implementation with PyTorch
Stars: ✭ 408 (+848.84%)
Mutual labels:  seq2seq, attention, sequence-to-sequence
dynmt-py
Neural machine translation implementation using DyNet's Python bindings
Stars: ✭ 17 (-60.47%)
Mutual labels:  seq2seq, neural-machine-translation, sequence-to-sequence
Joeynmt
Minimalist NMT for educational purposes
Stars: ✭ 420 (+876.74%)
Mutual labels:  seq2seq, neural-machine-translation, nmt
Nmt List
A list of Neural MT implementations
Stars: ✭ 359 (+734.88%)
Mutual labels:  neural-machine-translation, sequence-to-sequence, nmt
Nmtpytorch
Sequence-to-Sequence Framework in PyTorch
Stars: ✭ 392 (+811.63%)
Mutual labels:  seq2seq, neural-machine-translation, nmt
Sockeye
Sequence-to-sequence framework with a focus on Neural Machine Translation based on Apache MXNet
Stars: ✭ 990 (+2202.33%)
Mutual labels:  seq2seq, neural-machine-translation, sequence-to-sequence
Openseq2seq
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
Stars: ✭ 1,378 (+3104.65%)
Mutual labels:  seq2seq, neural-machine-translation, sequence-to-sequence
Speech recognition with tensorflow
Implementation of a seq2seq model for Speech Recognition using the latest version of TensorFlow. Architecture similar to Listen, Attend and Spell.
Stars: ✭ 253 (+488.37%)
Mutual labels:  seq2seq, sequence-to-sequence
vat nmt
Implementation of "Effective Adversarial Regularization for Neural Machine Translation", ACL 2019
Stars: ✭ 22 (-48.84%)
Mutual labels:  neural-machine-translation, nmt
chinese ancient poetry
seq2seq attention tensorflow textrank context
Stars: ✭ 30 (-30.23%)
Mutual labels:  seq2seq, attention
Tensorflow Shakespeare
Neural machine translation between the writings of Shakespeare and modern English using TensorFlow
Stars: ✭ 244 (+467.44%)
Mutual labels:  seq2seq, neural-machine-translation
seq2seq-pytorch
Sequence to Sequence Models in PyTorch
Stars: ✭ 41 (-4.65%)
Mutual labels:  attention, sequence-to-sequence

Attention-based Neural Machine Translation

Installation

The following packages are needed: Python with PyTorch installed, plus Perl for the multi-bleu.perl scoring script.

Preparation

To obtain the vocabularies for training, run:

python scripts/buildvocab.py --corpus /path/to/train.cn --output /path/to/cn.voc3.pkl \
--limit 30000 --groundhog
python scripts/buildvocab.py --corpus /path/to/train.en --output /path/to/en.voc3.pkl \
--limit 30000 --groundhog
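Each command keeps the 30,000 most frequent tokens (--limit 30000) and writes them to a pickle file. As a quick sanity check, the file can be inspected from Python; this is a minimal sketch, and the assumed structure of the pickle (e.g. a token-to-index mapping) is an assumption rather than something documented here:

import pickle

# Load a generated vocabulary file and report its size. The path and the
# assumed dict-like structure are illustrative, not taken from this repo.
with open('/path/to/cn.voc3.pkl', 'rb') as f:
    vocab = pickle.load(f)
print(type(vocab).__name__, len(vocab))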

Training

Train RNNSearch on a Chinese-English translation dataset as follows:

python train.py \
--src_vocab /path/to/cn.voc3.pkl --trg_vocab /path/to/en.voc3.pkl \
--train_src corpus/train.cn-en.cn --train_trg corpus/train.cn-en.en \
--valid_src corpus/nist02/nist02.cn \
--valid_trg corpus/nist02/nist02.en0 corpus/nist02/nist02.en1 corpus/nist02/nist02.en2 corpus/nist02/nist02.en3 \
--eval_script scripts/validate.sh \
--model RNNSearch \
--optim RMSprop \
--batch_size 80 \
--half_epoch \
--cuda \
--info RMSprop-half_epoch 
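The model selected by --model RNNSearch follows Bahdanau-style additive attention, in which the decoder scores every encoder state against its current hidden state and attends with the resulting weights. The sketch below is illustrative only; the class, parameter names, and dimensions are mine, not this repository's:

import torch
import torch.nn as nn
import torch.nn.functional as F

class AdditiveAttention(nn.Module):
    # score(s, h_j) = v^T tanh(W_s s + W_h h_j), as in Bahdanau et al. (2015)
    def __init__(self, dec_dim, enc_dim, attn_dim):
        super().__init__()
        self.w_dec = nn.Linear(dec_dim, attn_dim, bias=False)
        self.w_enc = nn.Linear(enc_dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, dec_state, enc_outputs):
        # dec_state: (batch, dec_dim); enc_outputs: (batch, src_len, enc_dim)
        scores = self.v(torch.tanh(
            self.w_dec(dec_state).unsqueeze(1) + self.w_enc(enc_outputs)
        )).squeeze(-1)                        # (batch, src_len)
        weights = F.softmax(scores, dim=-1)   # attention over source positions
        context = torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)
        return context, weights               # context: (batch, enc_dim)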

Evaluation

python translate.py \
--src_vocab /path/to/cn.voc3.pkl --trg_vocab /path/to/en.voc3.pkl \
--test_src corpus/nist03/nist03.cn \
--test_trg corpus/nist03/nist03.en0 corpus/nist03/nist03.en1 corpus/nist03/nist03.en2 corpus/nist03/nist03.en3 \
--eval_script scripts/validate.sh \
--model RNNSearch \
--name RNNSearch.best.pt \
--cuda 
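Here --name points translate.py at the saved checkpoint. If you want to look inside it before decoding, a minimal sketch follows; whether the file holds a raw state dict or a wrapper dict is an assumption, so it only prints keys:

import torch

# Load the checkpoint on CPU and peek at its top-level contents.
ckpt = torch.load('RNNSearch.best.pt', map_location='cpu')
if isinstance(ckpt, dict):
    for key in list(ckpt)[:10]:
        print(key)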

The evaluation metric we use for Chinese-English is case-insensitive BLEU, computed with the multi-bleu.perl script from Moses:

perl scripts/multi-bleu.perl -lc corpus/nist03/nist03.en < nist03.translated
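As a cross-check, the same case-insensitive corpus BLEU can be computed in Python with the sacrebleu package. This alternative is my addition, not part of this repository; the four-reference file layout follows the nist03.en0..en3 naming used above:

import sacrebleu  # pip install sacrebleu

# Hypotheses: one translated sentence per line.
hyps = [line.strip() for line in open('nist03.translated')]
# Four parallel reference streams, one per reference translation.
refs = [[line.strip() for line in open('corpus/nist03/nist03.en%d' % i)]
        for i in range(4)]
print(sacrebleu.corpus_bleu(hyps, refs, lowercase=True).score)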

Results on Chinese-English translation

The training dataset consists of 1.25M bilingual sentence pairs extracted from LDC corpora. We use NIST 2002 (MT02) as the tuning set for hyper-parameter optimization and model selection, and NIST 2003 (MT03), 2004 (MT04), 2005 (MT05), 2006 (MT06), and 2008 (MT08) as test sets. The beam size is set to 10; a sketch of the beam-expansion step follows the table below.

Test set  MT02   MT03   MT04   MT05   MT06   MT08   Ave.
BLEU      40.16  37.26  40.50  36.67  37.10  28.54  36.01
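Decoding keeps the 10 best partial hypotheses at each step. The following is a minimal, generic sketch of one beam-expansion step, not this repository's decoder; the function and variable names are mine:

import torch

def beam_search_step(log_probs, beam_scores, beam_size=10):
    # log_probs:   (beam, vocab) log-probabilities of the next token
    # beam_scores: (beam,) cumulative log-probability of each hypothesis
    total = beam_scores.unsqueeze(1) + log_probs   # score of every continuation
    top_scores, top_idx = total.view(-1).topk(beam_size)
    parent = top_idx // log_probs.size(1)          # which hypothesis to extend
    token = top_idx % log_probs.size(1)            # which token extends it
    return top_scores, parent, token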

Acknowledgements

My implementation utilizes code from the following:
