Helsinki-NLP / OPUS-MT-train

License: MIT
Training open neural machine translation models

Programming languages: Makefile, Perl, shell, Python

Train Opus-MT models

This package includes scripts for training NMT models for OPUS-MT using MarianNMT and OPUS data. More details are given in the Makefile, but the documentation still needs to be improved. Note that the targets require a specific environment and currently only work well on the CSC HPC cluster in Finland.

Pre-trained models

The subdirectory models contains information about pre-trained models that can be downloaded from this project. They are distributed under a CC-BY 4.0 license. More pre-trained models trained with the OPUS-MT training pipeline are available from the Tatoeba translation challenge, also under a CC-BY 4.0 license.

Quickstart

Setting up:

git clone https://github.com/Helsinki-NLP/OPUS-MT-train.git
cd OPUS-MT-train
git submodule update --init --recursive --remote
make install

Look into lib/env.mk and adjust any settings that you need in your environment. For CSC users: adjust lib/env/puhti.mk and lib/env/mahti.mk to match your setup (especially the locations where Marian-NMT and other tools are installed and the CSC project that you are using).
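
Before editing the tool locations, it can help to see which tools are already on your PATH. The following is a minimal sketch, not part of this package: check_tools is a hypothetical helper, and the binary names in the example call are assumptions that may differ from how the tools are installed on your system.

```shell
# check_tools TOOL...: print every command that is not found on PATH.
# Returns 0 if all tools were found and 1 if at least one is missing.
# Hypothetical helper for illustration; it is not shipped with OPUS-MT-train.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (binary names are assumptions; adjust them to your installation):
# check_tools git make marian pigz jq terashuf fast_align sacrebleu
```

Any tool reported as missing either needs to be installed or its location needs to be set in the environment files mentioned above.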

Training a multilingual NMT model (Finnish and Estonian to Danish, Swedish and English):

make SRCLANGS="fi et" TRGLANGS="da sv en" train
make SRCLANGS="fi et" TRGLANGS="da sv en" eval
make SRCLANGS="fi et" TRGLANGS="da sv en" release
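
The three commands above can be chained so that evaluation and release only run if the previous step succeeded. This is a sketch under stated assumptions: run_pipeline and the MAKE_CMD override are hypothetical names introduced here for illustration, and only the make targets and variables shown above are taken from the package.

```shell
# run_pipeline SRCLANGS TRGLANGS: run train, eval and release in order,
# stopping at the first target that fails.
# MAKE_CMD is a hypothetical override (default: make), useful for dry runs.
run_pipeline() {
    src="$1"
    trg="$2"
    for target in train eval release; do
        ${MAKE_CMD:-make} SRCLANGS="$src" TRGLANGS="$trg" "$target" || return 1
    done
}

# Example: print the three invocations without running them
# MAKE_CMD=echo run_pipeline "fi et" "da sv en"
```

Passing the language lists as quoted arguments keeps each space-separated list together as a single make variable value.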

More information is available in the documentation linked below.

Documentation

Tutorials

References

Please cite the following paper if you use OPUS-MT software and models:

@InProceedings{TiedemannThottingal:EAMT2020,
  author = {J{\"o}rg Tiedemann and Santhosh Thottingal},
  title = {{OPUS-MT} — {B}uilding open translation services for the {W}orld},
  booktitle = {Proceedings of the 22nd Annual Conference of the European Association for Machine Translation (EAMT)},
  year = {2020},
  address = {Lisbon, Portugal}
 }

Acknowledgements

None of this would be possible without all the great open source software including

... and many other tools like terashuf, pigz, jq, Moses SMT, fast_align, sacrebleu ...

We would also like to acknowledge the support of the University of Helsinki and CSC (IT Center for Science), the funding through projects in the EU Horizon 2020 framework (FoTran, MeMAD, ELG), and the contributors to the open collection of parallel corpora OPUS.
