
TharinduDR / TransQuest

License: Apache-2.0
Transformer-based translation quality estimation

Programming language: Python

Projects that are alternatives to, or similar to, TransQuest

Transmogrifai
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
Stars: ✭ 2,084 (+2351.76%)
Mutual labels:  transformers
COCO-LM
[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
Stars: ✭ 109 (+28.24%)
Mutual labels:  transformers
thermostat
Collection of NLP model explanations and accompanying analysis tools
Stars: ✭ 126 (+48.24%)
Mutual labels:  transformers
Simpletransformers
Transformers for Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
Stars: ✭ 2,881 (+3289.41%)
Mutual labels:  transformers
Nlp Architect
A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks
Stars: ✭ 2,768 (+3156.47%)
Mutual labels:  transformers
Fengshenbang-LM
Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center at IDEA, serving as foundational infrastructure for Chinese AIGC and cognitive intelligence.
Stars: ✭ 1,813 (+2032.94%)
Mutual labels:  transformers
Fast Bert
Super easy library for BERT based NLP models
Stars: ✭ 1,678 (+1874.12%)
Mutual labels:  transformers
naru
Neural Relation Understanding: neural cardinality estimators for tabular data
Stars: ✭ 76 (-10.59%)
Mutual labels:  transformers
nlp-papers
Must-read papers on Natural Language Processing (NLP)
Stars: ✭ 87 (+2.35%)
Mutual labels:  transformers
SnowflakeNet
(TPAMI 2022) Snowflake Point Deconvolution for Point Cloud Completion and Generation with Skip-Transformer
Stars: ✭ 74 (-12.94%)
Mutual labels:  transformers
Dalle Pytorch
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch
Stars: ✭ 3,661 (+4207.06%)
Mutual labels:  transformers
Pytorch Sentiment Analysis
Tutorials on getting started with PyTorch and TorchText for sentiment analysis.
Stars: ✭ 3,209 (+3675.29%)
Mutual labels:  transformers
Transformer-Implementations
Library - Vanilla, ViT, DeiT, BERT, GPT
Stars: ✭ 34 (-60%)
Mutual labels:  transformers
Spark Nlp
State of the Art Natural Language Processing
Stars: ✭ 2,518 (+2862.35%)
Mutual labels:  transformers
LIT
[AAAI 2022] This is the official PyTorch implementation of "Less is More: Pay Less Attention in Vision Transformers"
Stars: ✭ 79 (-7.06%)
Mutual labels:  transformers
Clue
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus, and leaderboard
Stars: ✭ 2,425 (+2752.94%)
Mutual labels:  transformers
KoBERT-Transformers
KoBERT on 🤗 Huggingface Transformers 🤗 (with Bug Fixed)
Stars: ✭ 162 (+90.59%)
Mutual labels:  transformers
STAM-pytorch
Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification
Stars: ✭ 109 (+28.24%)
Mutual labels:  transformers
KB-ALBERT
A Korean ALBERT model specialized for the economics/finance domain, provided by KB Kookmin Bank
Stars: ✭ 215 (+152.94%)
Mutual labels:  transformers
gpl
Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
Stars: ✭ 216 (+154.12%)
Mutual labels:  transformers


TransQuest: Translation Quality Estimation with Cross-lingual Transformers

The goal of quality estimation (QE) is to evaluate the quality of a translation without access to a reference translation. High-accuracy QE that can be easily deployed for a number of language pairs is the missing piece in many commercial translation workflows, as QE systems have numerous potential uses. They can be employed to select the best translation when several translation engines are available, or to inform the end user about the reliability of automatically translated content. In addition, QE systems can be used to decide whether a translation can be published as-is in a given context, or whether it requires human post-editing before publishing, or even translation from scratch by a human. Quality estimation can be performed at different levels: document level, sentence level, and word level.

With TransQuest, we have open-sourced our research in translation quality estimation, which also won the sentence-level direct assessment quality estimation shared task at WMT 2020. TransQuest outperforms current open-source quality estimation frameworks such as OpenKiwi and DeepQuest.

Features

  • Sentence-level translation quality estimation on two aspects: predicting post-editing effort and direct assessment.
  • Word-level translation quality estimation capable of predicting the quality of source words, target words, and target gaps.
  • Performs significantly better than current state-of-the-art quality estimation methods such as DeepQuest and OpenKiwi in all the languages on which it has been evaluated.
  • Pre-trained quality estimation models for fifteen language pairs are available on HuggingFace (a minimal usage sketch follows this list).
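
As a quick orientation, below is a minimal sentence-level sketch based on the usage shown in the TransQuest documentation and the project's HuggingFace model cards (it assumes the package is installed via pip install transquest). The model identifier TransQuest/monotransquest-da-multilingual is one of the released pre-trained models; treat exact names as assumptions and check the Pre-trained Models section for the current list.

import torch
from transquest.algo.sentence_level.monotransquest.run_model import MonoTransQuestModel

# Load a released sentence-level (direct assessment) model; num_labels=1
# because direct assessment is predicted as a single regression score.
model = MonoTransQuestModel(
    "xlmroberta",
    "TransQuest/monotransquest-da-multilingual",  # assumed model name; swap for your language pair
    num_labels=1,
    use_cuda=torch.cuda.is_available(),
)

# predict() takes a list of [source, translation] pairs and returns one
# quality score per pair (higher = better translation).
predictions, raw_outputs = model.predict(
    [["Reducerea acestor conflicte este importantă pentru conservare.",
      "Reducing these conflicts is not important for preservation."]]
)
print(predictions)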

Table of Contents

  1. Installation - Install TransQuest locally using pip.
  2. Architectures - Check out the architectures implemented in TransQuest.
    1. Sentence-level Architectures - We have released two architectures, MonoTransQuest and SiameseTransQuest, to perform sentence-level quality estimation.
    2. Word-level Architecture - We have released MicroTransQuest to perform word-level quality estimation (a word-level usage sketch follows this list).
  3. Examples - We have provided several examples of how to use TransQuest in recent WMT quality estimation shared tasks.
    1. Sentence-level Examples
    2. Word-level Examples
  4. Pre-trained Models - We have provided pre-trained quality estimation models for fifteen language pairs, covering both sentence level and word level.
    1. Sentence-level Models
    2. Word-level Models
  5. Contact - Contact us about any issues with TransQuest.
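
Analogously, here is a minimal word-level sketch with MicroTransQuest, again following the usage shown on the project's HuggingFace model cards. The model name TransQuest/microtransquest-en_de-wiki is an assumption taken from the released English-German model and should be swapped for your language pair.

import torch
from transquest.algo.word_level.microtransquest.run_model import MicroTransQuestModel

# Load a released word-level model; it tags source words, target words,
# and target gaps with OK/BAD labels.
model = MicroTransQuestModel(
    "xlmroberta",
    "TransQuest/microtransquest-en_de-wiki",  # assumed model name; swap for your language pair
    labels=["OK", "BAD"],
    use_cuda=torch.cuda.is_available(),
)

# predict() takes [source, translation] pairs and returns per-token tags.
source_tags, target_tags = model.predict(
    [["if not , you may need to restart your computer .",
      "falls nicht , müssen Sie möglicherweise Ihren Computer neu starten ."]]
)
print(source_tags)   # one OK/BAD tag per source token
print(target_tags)   # OK/BAD tags for target tokens and the gaps between them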


Citations

If you are using the word-level architecture, please consider citing this paper, which was accepted at ACL 2021.

@InProceedings{ranasinghe2021,
  author    = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
  title     = {An Exploratory Analysis of Multilingual Word Level Quality Estimation with Cross-Lingual Transformers},
  booktitle = {Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics},
  year      = {2021}
}

If you are using the sentence-level architectures, please consider citing these papers, which were presented at COLING 2020 and at WMT 2020 (held at EMNLP 2020).

@InProceedings{transquest:2020a,
  author    = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
  title     = {TransQuest: Translation Quality Estimation with Cross-lingual Transformers},
  booktitle = {Proceedings of the 28th International Conference on Computational Linguistics},
  year      = {2020}
}

@InProceedings{transquest:2020b,
  author    = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
  title     = {TransQuest at WMT2020: Sentence-Level Direct Assessment},
  booktitle = {Proceedings of the Fifth Conference on Machine Translation},
  year      = {2020}
}