ilmulti
Tooling to play around with multilingual machine translation for Indian languages.
Stars: ✭ 19 (-93.52%)
Thot
Thot toolkit for statistical machine translation
Stars: ✭ 53 (-81.91%)
Tokenizer
Fast and customizable text tokenization library with BPE and SentencePiece support
Stars: ✭ 132 (-54.95%)
masakhane-web
Masakhane Web is a translation web application solely for African languages.
Stars: ✭ 27 (-90.78%)
dynmt-py
Neural machine translation implementation using DyNet's Python bindings
Stars: ✭ 17 (-94.2%)
NLP Toolkit
Library of state-of-the-art models (PyTorch) for NLP tasks
Stars: ✭ 92 (-68.6%)
cang-jie
Chinese tokenizer for tantivy, based on jieba-rs
Stars: ✭ 48 (-83.62%)
Machine-Translation-Hindi-to-english-
Machine translation is the task of converting text from one language to another. Unlike a traditional phrase-based translation system, which consists of many small sub-components that are tuned separately, neural machine translation attempts to build and train a single, large neural network that reads a sentence and outputs a correct translation.
Stars: ✭ 19 (-93.52%)
vscode-blockman
VSCode extension to highlight nested code blocks
Stars: ✭ 233 (-20.48%)
tokenizer
A simple tokenizer in Ruby for NLP tasks.
Stars: ✭ 44 (-84.98%)
SequenceToSequence
A seq2seq-with-attention dialogue/MT model implemented in TensorFlow.
Stars: ✭ 11 (-96.25%)
simplemma
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
Stars: ✭ 32 (-89.08%)
farasapy
A Python implementation of the Farasa toolkit
Stars: ✭ 69 (-76.45%)
Hebrew-Tokenizer
A very simple Python tokenizer for Hebrew text.
Stars: ✭ 16 (-94.54%)
lex
Lex is an implementation of the lex tool in Ruby.
Stars: ✭ 49 (-83.28%)
parallel-corpora-tools
Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks.
Stars: ✭ 35 (-88.05%)
BSD
The Business Scene Dialogue corpus
Stars: ✭ 51 (-82.59%)
ArabicProcessingCog
A Python package that performs stemming, tokenization, sentence breaking, segmentation, normalization, and POS tagging for the Arabic language.
Stars: ✭ 19 (-93.52%)
lindera
A morphological analysis library.
Stars: ✭ 226 (-22.87%)
omegat-tencent-plugin
A plugin that allows OmegaT to source machine translations from Tencent Cloud.
Stars: ✭ 31 (-89.42%)
text2text
Text2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (-35.84%)
MetricMT
The official code repository for MetricMT, a reward optimization method for NMT with learned metrics
Stars: ✭ 23 (-92.15%)
python-mecab
Python 3.5+ bindings for mecab that use neither SWIG nor pybind. (No longer maintained.)
Stars: ✭ 27 (-90.78%)
jargon
Tokenizers and lemmatizers for Go
Stars: ✭ 98 (-66.55%)
bredon
A modern CSS value compiler in JavaScript
Stars: ✭ 39 (-86.69%)
neural tokenizer
Tokenize English sentences using neural networks.
Stars: ✭ 64 (-78.16%)
mystem-scala
Wrapper for the Russian-language morphological analyzer `mystem` for JVM languages
Stars: ✭ 21 (-92.83%)
rustfst
Rust re-implementation of OpenFST, a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.
Stars: ✭ 104 (-64.51%)
SpeechTransProgress
Tracking the progress in end-to-end speech translation
Stars: ✭ 139 (-52.56%)
psr2r-sniffer
A PSR-2-R code sniffer and code-style auto-correction tool, including many useful additions
Stars: ✭ 32 (-89.08%)
tokenizer
Tokenize CSS according to the CSS Syntax
Stars: ✭ 52 (-82.25%)
transformer-pytorch
A PyTorch implementation of Transformer in "Attention is All You Need"
Stars: ✭ 77 (-73.72%)
wink-tokenizer
Multilingual tokenizer that automatically tags each token with its type
Stars: ✭ 51 (-82.59%)
snapdragon-lexer
Converts a string into an array of tokens, with useful methods for looking ahead and behind, capturing, matching, and so on.
Stars: ✭ 19 (-93.52%)
hunspell
High-Performance Stemmer, Tokenizer, and Spell Checker for R
Stars: ✭ 101 (-65.53%)
urbans
A tool for translating text from a source grammar to a target (context-free) grammar with a corresponding dictionary.
Stars: ✭ 19 (-93.52%)
Jumanpp
Juman++ (a Morphological Analyzer Toolkit)
Stars: ✭ 254 (-13.31%)
inmt
Interactive Neural Machine Translation tool
Stars: ✭ 44 (-84.98%)
mtdata
A tool that locates, downloads, and extracts machine translation corpora
Stars: ✭ 95 (-67.58%)
SwiLex
A universal lexer library in Swift.
Stars: ✭ 29 (-90.1%)
PaddleTokenizer
A Chinese word segmentation engine based on deep neural networks, implemented with PaddlePaddle
Stars: ✭ 14 (-95.22%)
rtg
Reader Translator Generator, an NMT toolkit based on PyTorch
Stars: ✭ 26 (-91.13%)
liblex
C library for Lexical Analysis
Stars: ✭ 25 (-91.47%)
gd-tokenizer
A small Godot project with a tokenizer written in GDScript.
Stars: ✭ 34 (-88.4%)
banglanmt
Code and data for the paper "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation", published in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), November 16 - November 20, 2020.
Stars: ✭ 91 (-68.94%)
skt
Sanskrit compound segmentation using a seq2seq model
Stars: ✭ 21 (-92.83%)
NiuTrans.NMT
A fast neural machine translation system, developed in C++, that relies on NiuTensor for fast tensor APIs.
Stars: ✭ 112 (-61.77%)
xontrib-output-search
Get identifiers, paths, URLs, and words from the previous command's output and use them for the next command in the xonsh shell.
Stars: ✭ 26 (-91.13%)
Natural-Language-Processing
Contains various architectures and implementations of recent papers for Natural Language Processing tasks such as Sequence Modelling and Neural Machine Translation.
Stars: ✭ 48 (-83.62%)
Distill-BERT-Textgen
Research code for the ACL 2020 paper "Distilling Knowledge Learned in BERT for Text Generation".
Stars: ✭ 121 (-58.7%)
berserker
Berserker - BERt chineSE woRd toKenizER
Stars: ✭ 17 (-94.2%)
Transformer
A PyTorch implementation of "Attention is All You Need" and "Weighted Transformer Network for Machine Translation"
Stars: ✭ 271 (-7.51%)
pascal-interpreter
A simple interpreter for a large subset of the Pascal language, written for educational purposes
Stars: ✭ 21 (-92.83%)
nepali-translator
Neural Machine Translation on the Nepali-English language pair
Stars: ✭ 29 (-90.1%)