phohenecker / pytorch-transformer

Licence: other
A PyTorch implementation of the Transformer model from "Attention Is All You Need".

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects

Projects that are alternatives to or similar to pytorch-transformer

Transformer
A Pytorch Implementation of "Attention is All You Need" and "Weighted Transformer Network for Machine Translation"
Stars: ✭ 271 (+453.06%)
Mutual labels:  attention-is-all-you-need
Awesome Fast Attention
list of efficient attention modules
Stars: ✭ 627 (+1179.59%)
Mutual labels:  attention-is-all-you-need
Transformers without tears
Transformers without Tears: Improving the Normalization of Self-Attention
Stars: ✭ 80 (+63.27%)
Mutual labels:  attention-is-all-you-need
Transformer
A TensorFlow Implementation of the Transformer: Attention Is All You Need
Stars: ✭ 3,646 (+7340.82%)
Mutual labels:  attention-is-all-you-need
Speech Transformer
A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.
Stars: ✭ 565 (+1053.06%)
Mutual labels:  attention-is-all-you-need
Bert language understanding
Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN
Stars: ✭ 933 (+1804.08%)
Mutual labels:  attention-is-all-you-need
attention-is-all-you-need-paper
Implementation of Vaswani, Ashish, et al. "Attention is all you need." Advances in neural information processing systems. 2017.
Stars: ✭ 97 (+97.96%)
Mutual labels:  attention-is-all-you-need
Kospeech
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition.
Stars: ✭ 190 (+287.76%)
Mutual labels:  attention-is-all-you-need
Attention Is All You Need Pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
Stars: ✭ 6,070 (+12287.76%)
Mutual labels:  attention-is-all-you-need
Machine Translation
Stars: ✭ 51 (+4.08%)
Mutual labels:  attention-is-all-you-need
Pytorch Original Transformer
My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing otherwise seemingly hard concepts. Currently included IWSLT pretrained models.
Stars: ✭ 411 (+738.78%)
Mutual labels:  attention-is-all-you-need
Textclassificationbenchmark
A Benchmark of Text Classification in PyTorch
Stars: ✭ 534 (+989.8%)
Mutual labels:  attention-is-all-you-need
Witwicky
Witwicky: An implementation of Transformer in PyTorch.
Stars: ✭ 21 (-57.14%)
Mutual labels:  attention-is-all-you-need
Dab
Data Augmentation by Backtranslation (DAB) ヽ( •_-)ᕗ
Stars: ✭ 294 (+500%)
Mutual labels:  attention-is-all-you-need
Njunmt Pytorch
Stars: ✭ 79 (+61.22%)
Mutual labels:  attention-is-all-you-need
BangalASR
Transformer based Bangla Speech Recognition
Stars: ✭ 20 (-59.18%)
Mutual labels:  attention-is-all-you-need
Attention Is All You Need Keras
A Keras+TensorFlow Implementation of the Transformer: Attention Is All You Need
Stars: ✭ 628 (+1181.63%)
Mutual labels:  attention-is-all-you-need
Pytorch Transformer
pytorch implementation of Attention is all you need
Stars: ✭ 199 (+306.12%)
Mutual labels:  attention-is-all-you-need
Linear Attention Recurrent Neural Network
A recurrent attention module consisting of an LSTM cell which can query its own past cell states by the means of windowed multi-head attention. The formulas are derived from the BN-LSTM and the Transformer Network. The LARNN cell with attention can be easily used inside a loop on the cell state, just like any other RNN. (LARNN)
Stars: ✭ 119 (+142.86%)
Mutual labels:  attention-is-all-you-need
Sockeye
Sequence-to-sequence framework with a focus on Neural Machine Translation based on Apache MXNet
Stars: ✭ 990 (+1920.41%)
Mutual labels:  attention-is-all-you-need

pytorch-transformer

This repository provides a PyTorch implementation of the Transformer model introduced in the paper Attention Is All You Need (Vaswani et al., 2017).

Installation

The easiest way to install this package is via pip:

pip install git+https://github.com/phohenecker/pytorch-transformer

Usage

import transformer
model = transformer.Transformer(...)

1. Computing Predictions given a Target Sequence

This is the default behaviour of a Transformer, and is implemented in its forward method:

predictions = model(input_seq, target_seq)
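
As a concrete illustration, the following sketch feeds a batch of token-index tensors to the model. The batch-first (batch size, sequence length) layout, the padding to a common length, and all concrete sizes are assumptions of this sketch rather than details taken from the package itself:

import torch

# `model` is the transformer.Transformer instance created as shown above.
# Assumption: sequences are LongTensors of token indices, padded per batch;
# the shapes below are illustrative only.
input_seq = torch.randint(1, 10000, (16, 30))   # 16 samples, 30 source tokens each
target_seq = torch.randint(1, 10000, (16, 25))  # 16 samples, 25 target tokens each

predictions = model(input_seq, target_seq)      # per-position predictions over the vocabulary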

2. Evaluating the Probability of a Target Sequence

The probability of an output sequence given an input sequence under an already trained model can be evaluated by means of the function eval_probability:

probabilities = transformer.eval_probability(model, input_seq, target_seq, pad_index=...)
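
For instance, the probability of a known target sequence for every sample in a batch could be computed as follows; the pad-index value and the torch.no_grad() wrapper are assumptions of this sketch:

import torch

PAD_INDEX = 0  # assumption: index of the padding token in your vocabulary

# `model`, `input_seq`, and `target_seq` are taken from the previous sketch.
with torch.no_grad():  # gradients are not needed for evaluation
    probabilities = transformer.eval_probability(model, input_seq, target_seq, pad_index=PAD_INDEX)
# `probabilities` is expected to hold one probability per sequence in the batch.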

3. Sampling an Output Sequence

A random output sequence for a given input sequence can be sampled from the distribution computed by the model by means of the function sample_output:

output_seq = transformer.sample_output(model, input_seq, eos_index, pad_index, max_len)
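
A matching sketch, in which the special-token indices and the length limit are placeholder assumptions:

EOS_INDEX = 1  # assumption: index of the end-of-sequence token in your vocabulary
PAD_INDEX = 0  # assumption: index of the padding token
MAX_LEN = 50   # assumption: upper bound on the length of sampled outputs

# `model` and `input_seq` are taken from the earlier sketches.
output_seq = transformer.sample_output(model, input_seq, EOS_INDEX, PAD_INDEX, MAX_LEN)
# `output_seq` holds sampled token indices, one output sequence per input sequence.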

Pretraining Encoders with BERT

For pretraining the encoder part of the Transformer (i.e., transformer.Encoder) with BERT (Devlin et al., 2018), the class MLMLoss provides an implementation of the masked language-model loss function. A full example of how to implement pretraining with BERT can be found in examples/bert_pretraining.py.
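
For orientation, the core idea of the masked language-model objective is sketched below in plain PyTorch. This illustrates the technique only and is not the API of MLMLoss; MASK_INDEX, MASK_PROB, and mask_tokens are names invented for this sketch, and the authoritative usage is the one in examples/bert_pretraining.py:

import torch

MASK_INDEX = 4    # assumption: index of the [MASK] token in your vocabulary
MASK_PROB = 0.15  # fraction of tokens to mask, following Devlin et al. (2018)

def mask_tokens(token_ids: torch.Tensor):
    """Replaces a random subset of tokens with [MASK] and returns (masked input, labels)."""
    labels = token_ids.clone()
    mask = torch.rand(token_ids.shape) < MASK_PROB
    labels[~mask] = -100                  # unmasked positions are ignored by the loss
    masked_input = token_ids.clone()
    masked_input[mask] = MASK_INDEX
    return masked_input, labels

# The encoder (plus a projection onto the vocabulary) is then trained to recover the
# original tokens at the masked positions, e.g. with
# torch.nn.functional.cross_entropy(scores.flatten(0, 1), labels.flatten(), ignore_index=-100),
# which is, conceptually, what a masked language-model loss computes.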

References

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017).
Attention Is All You Need.
Preprint at http://arxiv.org/abs/1706.03762.

Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018).
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.
Preprint at http://arxiv.org/abs/1810.04805.
