
naver-ai / MetricMT

License: MIT
The official code repository for MetricMT, a reward optimization method for NMT with learned metrics.

Projects that are alternatives of or similar to MetricMT

Texar
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
Stars: ✭ 2,236 (+9621.74%)
Mutual labels:  machine-translation
transformer
Build English-Vietnamese machine translation with ProtonX Transformer. :D
Stars: ✭ 41 (+78.26%)
Mutual labels:  machine-translation
OPUS-MT-train
Training open neural machine translation models
Stars: ✭ 166 (+621.74%)
Mutual labels:  machine-translation
Bleualign
Machine-Translation-based sentence alignment tool for parallel text
Stars: ✭ 199 (+765.22%)
Mutual labels:  machine-translation
Modernmt
Neural Adaptive Machine Translation that adapts to context and learns from corrections.
Stars: ✭ 231 (+904.35%)
Mutual labels:  machine-translation
sb-nmt
Code for Synchronous Bidirectional Neural Machine Translation (SB-NMT)
Stars: ✭ 66 (+186.96%)
Mutual labels:  machine-translation
Spark Nlp
State of the Art Natural Language Processing
Stars: ✭ 2,518 (+10847.83%)
Mutual labels:  machine-translation
tai5-uan5 gian5-gi2 kang1-ku7
Taiwanese language tools (臺灣言語工具)
Stars: ✭ 79 (+243.48%)
Mutual labels:  machine-translation
ibleu
A visual and interactive scoring environment for machine translation systems.
Stars: ✭ 27 (+17.39%)
Mutual labels:  machine-translation
tvsub
TVsub: DCU-Tencent Chinese-English Dialogue Corpus
Stars: ✭ 40 (+73.91%)
Mutual labels:  machine-translation
Attention Mechanisms
Implementations for a family of attention mechanisms, suitable for all kinds of natural language processing tasks and compatible with TensorFlow 2.0 and Keras.
Stars: ✭ 203 (+782.61%)
Mutual labels:  machine-translation
Opennmt
Open Source Neural Machine Translation in Torch (deprecated)
Stars: ✭ 2,339 (+10069.57%)
Mutual labels:  machine-translation
osdg-tool
OSDG is an open-source tool that maps and connects activities to the UN Sustainable Development Goals (SDGs) by identifying SDG-relevant content in any text. The tool is available online at www.osdg.ai. API access available for research purposes.
Stars: ✭ 22 (-4.35%)
Mutual labels:  machine-translation
Lingvo
A framework for building neural networks in TensorFlow, particularly sequence models.
Stars: ✭ 2,361 (+10165.22%)
Mutual labels:  machine-translation
extreme-adaptation-for-personalized-translation
Code for the paper "Extreme Adaptation for Personalized Neural Machine Translation"
Stars: ✭ 42 (+82.61%)
Mutual labels:  machine-translation
Npmt
Towards Neural Phrase-based Machine Translation
Stars: ✭ 175 (+660.87%)
Mutual labels:  machine-translation
apertium-apy
📦 Apertium HTTP Server in Python
Stars: ✭ 29 (+26.09%)
Mutual labels:  machine-translation
skt
Sanskrit compound segmentation using seq2seq model
Stars: ✭ 21 (-8.7%)
Mutual labels:  machine-translation
Distill-BERT-Textgen
Research code for ACL 2020 paper: "Distilling Knowledge Learned in BERT for Text Generation".
Stars: ✭ 121 (+426.09%)
Mutual labels:  machine-translation
bergamot-translator
Cross-platform C++ library focused on optimized machine translation on consumer-grade devices.
Stars: ✭ 181 (+686.96%)
Mutual labels:  machine-translation

MetricMT - Reward Optimization for Neural Machine Translation with Learned Metrics

This is our official code repository. The paper is available on arXiv: https://arxiv.org/abs/2104.07541.

Authors: Raphael Shu, Kang Min Yoo and Jung-Woo Ha (NAVER AI Lab)

What is it about?

In short, we optimize NMT models with the state-of-the-art learned metric BLEURT and find that the resulting translations have higher adequacy and coverage than both the baseline and models trained with BLEU.

In machine translation, BLEU has been the dominant evaluation metric for years. However, criticism of BLEU dates back to at least 2006 (Callison-Burch et al., 2006). The best overall paper of ACL 2020 (Mathur et al., 2020) again shows that BLEU's correlation with human judgments drops to zero or even negative when comparing only a few top-tier systems, and its authors call for an end to using BLEU.

Recently, several model-based metrics have been proposed (ESIM, YiSi-1, BERTScore, BLEURT), all of which use or build on BERT. These metrics typically achieve much higher correlation with human judgments because they are fine-tuned on human evaluation data.

In our paper, we attempt to directly optimize NMT models with the state-of-the-art learned metric, BLEURT. The benefit is clear: because BLEURT is tuned with human scores, it can potentially reflect human preferences on translation quality. We want to know whether such training merely shifts the NMT parameters to game the metric, or whether it yields a meaningful improvement in quality.

For reward optimization, we found that a stable ranking-based sequence-level loss performs well and scales to large NMT and metric models.

How it works

We propose to use the following contrastive-margin loss, a pairwise ranking loss that separates the two candidates with the best and worst reward in the candidate space. Given a candidate set C produced by beam search, the loss has the following form:

    L(\theta) = \max\big(0,\; \Delta - \log p_\theta(\hat{y}^{+} \mid x) + \log p_\theta(\hat{y}^{-} \mid x)\big)

Here, r(\cdot) is the reward function. After we obtain the candidate set C with beam search, \hat{y}^{+} = \arg\max_{y \in C} r(y) denotes the candidate with the best reward, \hat{y}^{-} = \arg\min_{y \in C} r(y) the candidate with the worst reward, and \Delta is the margin.
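Below is a minimal PyTorch sketch of this loss, assuming the candidate log-probabilities and rewards are already collected into tensors. The function name, the default margin, and the toy numbers are illustrative assumptions, not our released implementation.

import torch

def contrastive_margin_loss(log_probs, rewards, delta=1.0):
    # log_probs: (num_candidates,) sequence log-probabilities under the NMT model
    # rewards:   (num_candidates,) metric scores, e.g. BLEURT, one per candidate
    best = rewards.argmax()    # candidate with the highest reward
    worst = rewards.argmin()   # candidate with the lowest reward
    # Hinge loss: push log p(y_best | x) above log p(y_worst | x) by the margin.
    return torch.clamp(delta - log_probs[best] + log_probs[worst], min=0.0)

# Toy example with five beam candidates.
log_probs = torch.tensor([-12.3, -10.1, -11.7, -13.0, -10.9], requires_grad=True)
rewards = torch.tensor([0.42, 0.55, 0.31, 0.60, 0.28])
loss = contrastive_margin_loss(log_probs, rewards)
loss.backward()  # gradients touch only the best- and worst-reward candidates

Only two candidates enter the backward pass, in line with the lower memory footprint discussed below.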

This reward-optimizing loss has a lower memory footprint than the risk minimization loss, and it is more stable than REINFORCE and the max-margin loss. In the paper, we show that it can effectively optimize both smoothed BLEU and BLEURT as rewards.
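As a concrete, hedged example of the smoothed-BLEU reward, each beam candidate can be scored against the reference with exponentially smoothed sentence-level BLEU via the sacrebleu library; the exact smoothing variant used in the paper may differ.

import sacrebleu

def smoothed_bleu_rewards(candidates, reference):
    # One reward per candidate: sentence BLEU with exponential smoothing.
    return [
        sacrebleu.sentence_bleu(cand, [reference], smooth_method="exp").score
        for cand in candidates
    ]

rewards = smoothed_bleu_rewards(
    ["the cat sat on the mat", "a cat is on the mat"],
    "the cat is on the mat",
)  # e.g. feed these rewards into the contrastive-margin loss above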

Results

We perform automatic and human evaluations to compare the optimized models with the baselines. The experiments are conducted on German-English, Romanian-English, Russian-English, and Japanese-English datasets. All are to-English datasets, as the pretrained BLEURT model covers only English.

The results are interesting. In three out of four language pairs, we found that the BLEURT score increases significantly after optimization; however, the optimization hurts BLEU. Here are the automatic scores:

Automatic Evaluation

Then we performed a pairwise human evaluation on three criteria: adequacy, fluency, and coverage. Here are the results:

Human Evaluation

We can see that the BLEURT-optimized model tends to have better adequacy and coverage, and it outperforms models trained with smoothed BLEU. For fluency, annotators found little difference overall, which may indicate that the NLL loss is already good at ensuring fluency. Please see our paper for more details.

Getting Started

Our method can be applied to any MT metric (including non-differentiable ones) to improve human-perceived quality. We invite others to try our method with various metrics!
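For instance, a BLEURT reward hook could look like the following sketch, using the public bleurt package from google-research. The checkpoint path is a placeholder, and this is not our released training code.

from bleurt import score as bleurt_score

# Path to a downloaded BLEURT checkpoint (placeholder; any checkpoint works).
scorer = bleurt_score.BleurtScorer("path/to/BLEURT-20")

def bleurt_rewards(candidates, reference):
    # One BLEURT score per candidate. No gradients flow through the metric,
    # so any black-box scorer can be swapped in behind the same interface.
    return scorer.score(
        references=[reference] * len(candidates),
        candidates=candidates,
    )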

We will release the source code to reproduce our method very soon. Stay tuned!

Citing our Work

@article{shu2021reward,
    title={Reward Optimization for Neural Machine Translation with Learned Metrics},
    author={Shu, Raphael and Yoo, Kang Min and Ha, Jung-Woo},
    year={2021},
    journal={arXiv preprint arXiv:2104.07541},
}