
hfxunlp / transformer

License: GPL-3.0
Neutron: A PyTorch-based implementation of the Transformer and its variants.

Programming Languages

Python, C++, Shell

Projects that are alternatives of or similar to transformer

Sockeye
Sequence-to-sequence framework with a focus on Neural Machine Translation based on Apache MXNet
Stars: ✭ 990 (+1550%)
Mutual labels:  transformer, seq2seq, neural-machine-translation, attention-is-all-you-need
Tf Seq2seq
Sequence to sequence learning using TensorFlow.
Stars: ✭ 387 (+545%)
Mutual labels:  seq2seq, beam-search, neural-machine-translation
kospeech
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.
Stars: ✭ 456 (+660%)
Mutual labels:  transformer, seq2seq, attention-is-all-you-need
Joeynmt
Minimalist NMT for educational purposes
Stars: ✭ 420 (+600%)
Mutual labels:  transformer, seq2seq, neural-machine-translation
Nmt Keras
Neural Machine Translation with Keras
Stars: ✭ 501 (+735%)
Mutual labels:  transformer, neural-machine-translation, attention-is-all-you-need
Kospeech
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition.
Stars: ✭ 190 (+216.67%)
Mutual labels:  transformer, seq2seq, attention-is-all-you-need
zero
Zero -- A neural machine translation system
Stars: ✭ 121 (+101.67%)
Mutual labels:  transformer, neural-machine-translation, average-attention-network
minimal-nmt
A minimal NMT example to serve as a seq2seq + attention reference.
Stars: ✭ 36 (-40%)
Mutual labels:  seq2seq, beam-search, neural-machine-translation
transformer
A PyTorch Implementation of "Attention Is All You Need"
Stars: ✭ 28 (-53.33%)
Mutual labels:  transformer, seq2seq, attention-is-all-you-need
Machine Translation
Stars: ✭ 51 (-15%)
Mutual labels:  transformer, seq2seq, attention-is-all-you-need
Pytorch Seq2seq
Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.
Stars: ✭ 3,418 (+5596.67%)
Mutual labels:  transformer, seq2seq, neural-machine-translation
TS3000 TheChatBOT
A social networking chatbot trained on a Reddit dataset. It supports open-ended queries and is built on the concept of Neural Machine Translation. Beware of it being sarcastic, just like its creator 😝 BTW, it uses the PyTorch framework and Python 3.
Stars: ✭ 20 (-66.67%)
Mutual labels:  beam-search, neural-machine-translation
transformer
A simple TensorFlow implementation of the Transformer
Stars: ✭ 25 (-58.33%)
Mutual labels:  transformer, attention-is-all-you-need
Transformer Temporal Tagger
Code and data from the paper BERT Got a Date: Introducing Transformers to Temporal Tagging
Stars: ✭ 55 (-8.33%)
Mutual labels:  transformer, seq2seq
Word-Level-Eng-Mar-NMT
Translating English sentences to Marathi using Neural Machine Translation
Stars: ✭ 37 (-38.33%)
Mutual labels:  seq2seq, neural-machine-translation
Neural-Machine-Translation
Several basic neural machine translation models implemented by PyTorch & TensorFlow
Stars: ✭ 29 (-51.67%)
Mutual labels:  transformer, neural-machine-translation
tensorflow-ml-nlp-tf2
Hands-on materials for the Korean book "Natural Language Processing with TensorFlow 2 and Machine Learning (from Logistic Regression to BERT and GPT-3)"
Stars: ✭ 245 (+308.33%)
Mutual labels:  transformer, seq2seq
dynmt-py
Neural machine translation implementation using DyNet's Python bindings
Stars: ✭ 17 (-71.67%)
Mutual labels:  seq2seq, neural-machine-translation
beam search
Beam search for neural network sequence to sequence (encoder-decoder) models.
Stars: ✭ 31 (-48.33%)
Mutual labels:  seq2seq, beam-search
NLP-paper
🎨 NLP (natural language processing) tutorials 🎨 https://dataxujing.github.io/NLP-paper/
Stars: ✭ 23 (-61.67%)
Mutual labels:  transformer, seq2seq

Neutron

Neutron: A PyTorch-based implementation of the Transformer and its variants.

This project is developed with Python 3.10.

Setup dependencies

Run pip install -r requirements.txt after you clone the repository.

If you want to use BPE, enable conversion to C libraries, try the simple MT server, or use the Chinese word segmentation provided by pynlpir in this implementation, also install the dependencies in requirements.opt.txt with pip install -r requirements.opt.txt.
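
For example, a typical setup after cloning:

pip install -r requirements.txt
pip install -r requirements.opt.txt    # optional: BPE, C library conversion, the MT server, pynlpir segmentation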

Data preprocessing

BPE

We provide scripts to apply Byte-Pair Encoding (BPE) under scripts/bpe/.
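
For example, the BPE learning and application steps can typically be driven through the wrapper script referenced in the next step (edit its variables first):

bash scripts/bpe/mk.sh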

Convert plain text to tensors for training

Generate training data for train.py with bash scripts/mktrain.sh, and configure the variables in scripts/mktrain.sh for your setup (the shared variables should comply with those in scripts/bpe/mk.sh).

Configuration for training and testing

Most configurations are managed in cnfg/base.py. Configure advanced details with cnfg/hyp.py.
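
As a purely illustrative sketch (the option names and values below are assumptions, not the actual names; check cnfg/base.py and cnfg/hyp.py for the real options):

# cnfg/base.py (hypothetical option names, for illustration only)
run_id = "wmt14ende_base"        # name of the experiment / checkpoint directory
train_data = "cache/train.h5"    # HDF5 training data produced by scripts/mktrain.sh
dev_data = "cache/dev.h5"        # HDF5 validation data
isize = 512                      # model (embedding) dimension
nlayer = 6                       # number of encoder/decoder layers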

Training

Just execute the following command to launch the training:

python train.py
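
For long runs you may want to keep the process alive and log its output, e.g.:

nohup python train.py > train.log 2>&1 &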

Generation

Run bash scripts/mktest.sh, and configure the variables in scripts/mktest.sh for your setup (while keeping the other settings consistent with those in scripts/bpe/mk.sh and scripts/mktrain.sh).

Exporting python files to C libraries

You can convert the Python classes into C libraries with python mkcy.py build_ext --inplace; the code is checked before compiling, which also serves as a simple way to find typos and bugs. This function is supported by Cython. The generated files can be removed with tools/clean/cython.py . and rm -fr build/. Loading modules from the compiled C libraries may also be slightly faster, but not significantly.
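
Putting the commands of this section together (invoking the cleanup script with the python prefix is an assumption):

python mkcy.py build_ext --inplace    # compile the Python modules into C extensions via Cython
python tools/clean/cython.py .        # remove the generated C sources and compiled extensions
rm -fr build/                         # remove the Cython build directory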

Ranking

You can rank your corpus with a pre-trained model; per-token perplexity will be reported for each sequence pair. Use it with:

python rank.py rsf h5f models

where rsf is the result file, h5f is the HDF5-formatted input file of your corpus (generated like the training set with tools/mkiodata.py, as in scripts/mktrain.sh), and models is a (list of) model file(s) used for the perplexity evaluation.
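
For example, with illustrative file names (the .h5 extensions follow the HDF5 format mentioned above):

python rank.py ranked.txt corpus.h5 expm/checkpoint.h5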

Description of the other files

modules/

Fundamental modules needed for the construction of the Transformer.

loss/

Implementation of the label smoothing loss function required for training the Transformer.

lrsch.py

The learning rate schedule described in the paper.

utils/

Functions for basic features, for example freezing/unfreezing model parameters and padding a list of tensors to the same size along a given dimension.

translator.py

Provides an encapsulation of the whole translation procedure, so you can use the trained model in your application more easily.
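
A minimal usage sketch, assuming translator.py exposes a callable translator object (the class name, constructor arguments and file names below are assumptions rather than the verified interface):

# Hypothetical sketch; check translator.py for the actual class and arguments.
from translator import TranslatorCore          # assumed class name
trans = TranslatorCore(modelf="expm/model.h5", srcvcbf="cache/src.vcb", tgtvcbf="cache/tgt.vcb")  # assumed arguments
print(trans("a tokenized , BPE-segmented source sentence"))    # assumed to return the translated string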

server.py

An example that depends on Flask to provide a simple web service and REST API showing how to use the translator; configure its variables before you use it.
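
Once the server is running, a request could look roughly like this (host, port, route and parameter name are assumptions; check server.py for the actual REST API):

curl -X POST http://127.0.0.1:5000/translate -d "text=a tokenized source sentence"    # hypothetical route and field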

transformer/

Implementations of seq2seq models.

parallel/

Multi-GPU parallelization implementation.

datautils/

Supporting functions for data segmentation.

tools/

Scripts supporting data processing (e.g. text to tensor conversion), analysis, model file handling, etc.

Performance

Settings: WMT 2014, English -> German, 32k joint BPE with a vocabulary threshold of 8 for BPE. Two NVIDIA GTX 1080 Ti GPUs for training, one for decoding.

Tokenized case-sensitive BLEU is measured with multi-bleu.perl. Training speed and decoding speed are measured as the number of target tokens (<eos> counted, <pad> excluded) per second and the number of sentences per second, respectively:

                            BLEU     Training Speed (tokens/s)    Decoding Speed (sentences/s)
Attention is all you need   27.3     -                            -
Neutron                     28.07    23213.65                     150.15

Acknowledgments

Hongfei Xu is supported by a doctoral grant from the China Scholarship Council ([2018]3101, 201807040056) while maintaining this project.

Details of this project can be found here, and please cite it if you enjoy the implementation :)

@article{xu2019neutron,
  author = {Xu, Hongfei and Liu, Qiuhui},
  title = "{Neutron: An Implementation of the Transformer Translation Model and its Variants}",
  journal = {arXiv preprint arXiv:1903.07402},
  archivePrefix = "arXiv",
  eprinttype = {arxiv},
  eprint = {1903.07402},
  primaryClass = "cs.CL",
  keywords = {Computer Science - Computation and Language},
  year = 2019,
  month = "March",
  url = {https://arxiv.org/abs/1903.07402},
  pdf = {https://arxiv.org/pdf/1903.07402}
}