
bangoc123 / transformer

Licence: other
Build English-Vietnamese machine translation with ProtonX Transformer. :D

Programming Languages

python

Projects that are alternatives of or similar to transformer

sb-nmt
Code for Synchronous Bidirectional Neural Machine Translation (SB-NMT)
Stars: ✭ 66 (+60.98%)
Mutual labels:  machine-translation, transformer
Joeynmt
Minimalist NMT for educational purposes
Stars: ✭ 420 (+924.39%)
Mutual labels:  machine-translation, transformer
transformer-tensorflow2.0
transformer in tensorflow 2.0
Stars: ✭ 53 (+29.27%)
Mutual labels:  transformer, tensorflow2
DolboNet
A Russian-language chatbot for Discord built on the Transformer architecture
Stars: ✭ 53 (+29.27%)
Mutual labels:  transformer, tensorflow2
Sockeye
Sequence-to-sequence framework with a focus on Neural Machine Translation based on Apache MXNet
Stars: ✭ 990 (+2314.63%)
Mutual labels:  machine-translation, transformer
Nmt Keras
Neural Machine Translation with Keras
Stars: ✭ 501 (+1121.95%)
Mutual labels:  machine-translation, transformer
NiuTrans.NMT
A Fast Neural Machine Translation System. It is developed in C++ and resorts to NiuTensor for fast tensor APIs.
Stars: ✭ 112 (+173.17%)
Mutual labels:  machine-translation, transformer
Transformers without tears
Transformers without Tears: Improving the Normalization of Self-Attention
Stars: ✭ 80 (+95.12%)
Mutual labels:  machine-translation, transformer
Witwicky
Witwicky: An implementation of Transformer in PyTorch.
Stars: ✭ 21 (-48.78%)
Mutual labels:  machine-translation, transformer
Turbotransformers
a fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc) on CPU and GPU.
Stars: ✭ 826 (+1914.63%)
Mutual labels:  machine-translation, transformer
Machine Translation
Stars: ✭ 51 (+24.39%)
Mutual labels:  machine-translation, transformer
Hardware Aware Transformers
[ACL 2020] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Stars: ✭ 206 (+402.44%)
Mutual labels:  machine-translation, transformer
Attention Mechanisms
Implementations for a family of attention mechanisms, suitable for all kinds of natural language processing tasks and compatible with TensorFlow 2.0 and Keras.
Stars: ✭ 203 (+395.12%)
Mutual labels:  machine-translation
TFLite-ModelMaker-EfficientDet-Colab-Hands-On
Hands-on materials for object detection with TensorFlow Lite Model Maker
Stars: ✭ 15 (-63.41%)
Mutual labels:  tensorflow2
Bleualign
Machine-Translation-based sentence alignment tool for parallel text
Stars: ✭ 199 (+385.37%)
Mutual labels:  machine-translation
Lingvo
Lingvo
Stars: ✭ 2,361 (+5658.54%)
Mutual labels:  machine-translation
SSE-PT
Code and datasets for the RecSys'20 paper "SSE-PT: Sequential Recommendation Via Personalized Transformer" and the NeurIPS'19 paper "Stochastic Shared Embeddings: Data-driven Regularization of Embedding Layers"
Stars: ✭ 103 (+151.22%)
Mutual labels:  transformer
Grokking-Machine-Learning
This repo aims to contain different machine learning use cases along with the descriptions to the model architectures
Stars: ✭ 54 (+31.71%)
Mutual labels:  tensorflow2
Texar
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
Stars: ✭ 2,236 (+5353.66%)
Mutual labels:  machine-translation
Npmt
Towards Neural Phrase-based Machine Translation
Stars: ✭ 175 (+326.83%)
Mutual labels:  machine-translation

ProtonX Transformer

A machine translation engine for English-Vietnamese built on the Transformer architecture from the paper Attention Is All You Need. Give us a star if you like this repo.

Model Explanation:

  • Slide:
    • Transformer Encoder: Check out here
    • Transformer Decoder (Updating)

Author:

This library belongs to our project Papers-Videos-Code, where we implement state-of-the-art AI papers and publish all of the source code. Videos explaining these models will also be uploaded to the ProtonX YouTube channel.

Architecture:

[Transformer architecture diagram]
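At the heart of this architecture is scaled dot-product attention, described in Section 3.2.1 of the paper. The snippet below is a minimal illustration of that computation in TensorFlow, not the implementation shipped in model.py:

import tensorflow as tf

# Illustrative scaled dot-product attention (Attention Is All You Need, Section 3.2.1).
# Not the repo's implementation; tensors are assumed to have shape (..., seq_len, depth).
def scaled_dot_product_attention(q, k, v, mask=None):
    scores = tf.matmul(q, k, transpose_b=True)                         # (..., seq_q, seq_k)
    scores = scores / tf.math.sqrt(tf.cast(tf.shape(k)[-1], tf.float32))
    if mask is not None:
        scores += mask * -1e9                                          # drive masked positions toward zero weight
    weights = tf.nn.softmax(scores, axis=-1)                           # weights sum to 1 over each query's keys
    return tf.matmul(weights, v), weights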

[Note] You can use your own data to train this model.

I. Set up environment

  1. Make sure you have Miniconda installed. If not, see the setup document here.

  2. cd into transformer and run conda env create -f environment.yml to set up the environment

  3. Activate the environment with conda activate transformer

II. Set up your dataset.

Prepare your training dataset as two files:

  • train.en
  • train.vi

For example:

train.en          train.vi
I love you        Tôi yêu bạn
...               ...

You can find mock data in the ./data/mock folder.
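Each line of train.en must align with the line at the same position in train.vi. The snippet below is a minimal sketch of reading such a parallel corpus, using the mock files above; it is not the repo's actual data loader:

# Pair source and target sentences line by line (illustration only, not the repo's loader).
with open("./data/mock/train.en", encoding="utf-8") as f_en, \
     open("./data/mock/train.vi", encoding="utf-8") as f_vi:
    pairs = [(en.strip(), vi.strip()) for en, vi in zip(f_en, f_vi)]

print(pairs[0])  # e.g. ('I love you', 'Tôi yêu bạn')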

III. Train your model by running this command line

Training script:

python train.py --epochs ${epochs} --input-lang en --target-lang vi --input-path ${path_to_en_text_file} --target-path ${path_to_vi_text_file}

Example: to train an English-Vietnamese machine translation model for 10 epochs

python train.py --epochs 10 --input-lang en --target-lang vi --input-path ./data/mock/train.en --target-path ./data/mock/train.vi

There are some important arguments for the script you should consider when running it:

  • input-lang: The name of the input language (e.g. en)
  • target-lang: The name of the target language (e.g. vi)
  • input-path: The path to the input text file (e.g. ./data/mock/train.en)
  • target-path: The path to the target text file (e.g. ./data/mock/train.vi)
  • model-folder: The path where the trained model is saved
  • vocab-folder: The path where the tokenizer and vocabulary are saved
  • batch-size: The batch size of the dataset
  • max-length: The maximum sentence length to keep during preprocessing
  • num-examples: The number of lines used for training. Set it to a small value if you want to experiment with this library quickly.
  • d-model: The dimension of the linear projections used throughout the model (see Section 3.2.2, page 5 of the paper). Transformer-Base sets it to 512.
  • n: The number of Encoder/Decoder layers. Transformer-Base sets it to 6.
  • h: The number of heads in Multi-Head Attention. Transformer-Base sets it to 8.
  • d-ff: The hidden size of the Position-wise Feed-Forward Networks (see Section 3.3). Transformer-Base sets it to 2048.
  • activation: The activation function of the Position-wise Feed-Forward Networks. Use this to experiment with GELU instead of ReLU; GELU has been widely used recently.
  • dropout-rate: The dropout rate applied to each layer. Transformer-Base sets it to 0.1.
  • eps: The Layer Normalization epsilon. Default value: 0.1

After training finishes successfully, your model will be saved to the model-folder defined above.
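For reference, the Transformer-Base configuration from the paper maps onto these arguments roughly as sketched below; the values come from the paper and may differ from the script's own defaults:

# Transformer-Base hyperparameters (Attention Is All You Need, Table 3),
# expressed against the argument names above. Illustrative only.
transformer_base = {
    "n": 6,             # encoder/decoder layers
    "d-model": 512,     # model dimension
    "h": 8,             # attention heads
    "d-ff": 2048,       # feed-forward hidden size
    "dropout-rate": 0.1,
}
# d-model must be divisible by h: each head works in d_k = d_model / h dimensions.
assert transformer_base["d-model"] % transformer_base["h"] == 0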

IV. TODO

  • Bug fixes:

    In this project, we compile the whole pipeline into a tf.keras.Model subclass in the model.py file and use the fit function to train the model (see the sketch after this list). Unfortunately, there are a few critical bugs we need to fix before a new release.

    • Fix exporting the model using the save_weights API. (Currently, the system is unable to reload the checkpoint for unknown reasons.)
  • New Features:

    • File-reading pipeline (Release Time: 06/07/2021)
    • BPE / subword tokenizer support (Release Time: Updating...)
    • Beam search for better word generation (Release Time: Updating...)
    • Weight-tying mode (Release Time: Updating...)
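As a rough picture of that pipeline-in-a-Model approach, the snippet below shows a toy tf.keras.Model subclass compiled for fit and exported with save_weights; it is a simplified illustration, not the actual model.py:

import tensorflow as tf

# Toy stand-in for wrapping the training pipeline in a tf.keras.Model (illustration only).
class ToySeq2Seq(tf.keras.Model):
    def __init__(self, vocab_size=1000, d_model=128):
        super().__init__()
        self.embed = tf.keras.layers.Embedding(vocab_size, d_model)
        self.proj = tf.keras.layers.Dense(vocab_size)

    def call(self, inputs, training=False):
        return self.proj(self.embed(inputs))

model = ToySeq2Seq()
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
# model.fit(dataset, epochs=...) drives training; afterwards weights can be exported
# with model.save_weights("checkpoints/toy") and restored with
# model.load_weights("checkpoints/toy") -- the step the bug above concerns.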

V. Running Test

When you modify the model, run the tests to make sure your changes do not break the rest of the system.

In the ./transformer folder please run:

pytest
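For example, a shape check like the hypothetical test below (not a file shipped with the repo) is the kind of test pytest will collect:

# Hypothetical test, e.g. tests/test_attention_shapes.py (not part of the repo).
import tensorflow as tf

def test_multi_head_attention_preserves_shape():
    mha = tf.keras.layers.MultiHeadAttention(num_heads=8, key_dim=64)
    x = tf.random.uniform((2, 10, 512))          # (batch, seq_len, d_model)
    out = mha(query=x, value=x, key=x)
    assert out.shape == (2, 10, 512)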

VI. Feedback

If you have any issues when using this library, please let us know via the issues submission tab.
