The aim of this repository is to show a baseline model for text classification by implementing an LSTM-based model in PyTorch. To provide a better understanding of the model, it uses a Tweets dataset provided by Kaggle.

Stars: ✭ 45 (+136.84%)

Mutual labels: tokenizer

lexertk

C++ Lexer Toolkit Library (LexerTk) https://www.partow.net/programming/lexertk/index.html

Stars: ✭ 26 (+36.84%)

Mutual labels: tokenizer

jargon

Tokenizers and lemmatizers for Go

Stars: ✭ 98 (+415.79%)

Mutual labels: tokenizer

hunspell

High-Performance Stemmer, Tokenizer, and Spell Checker for R

Stars: ✭ 101 (+431.58%)

Mutual labels: tokenizer

grasp

Essential NLP & ML, short & fast pure Python code

Stars: ✭ 58 (+205.26%)

Mutual labels: tokenizer

lindera

A morphological analysis library.

Stars: ✭ 226 (+1089.47%)

Mutual labels: tokenizer

elasticsearch-plugins

Some native scoring script plugins for elasticsearch

Stars: ✭ 30 (+57.89%)

Mutual labels: tokenizer

SequenceToSequence

A seq2seq-with-attention dialogue/MT model implemented in TensorFlow.

Stars: ✭ 11 (-42.11%)

Mutual labels: machine-translation

Deep-NLP-Resources

Curated list of all NLP Resources

Stars: ✭ 65 (+242.11%)

Mutual labels: machine-translation

MetricMT

The official code repository for MetricMT - a reward optimization method for NMT with learned metrics

Stars: ✭ 23 (+21.05%)

Mutual labels: machine-translation

neural tokenizer

Tokenize English sentences using neural networks.

Stars: ✭ 64 (+236.84%)

Mutual labels: tokenizer

python-mecab

A repository to bind mecab for Python 3.5+. Uses neither swig nor pybind. (No longer maintained)

Stars: ✭ 27 (+42.11%)

Mutual labels: tokenizer

mtdata

A tool that locates, downloads, and extracts machine translation corpora

Stars: ✭ 95 (+400%)

Mutual labels: machine-translation

instamojo-java

Java wrapper for Instamojo API

Stars: ✭ 15 (-21.05%)

Mutual labels: wrappers

rustfst

Rust re-implementation of OpenFST - library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.

Stars: ✭ 104 (+447.37%)

Mutual labels: tokenizer

xontrib-output-search

Get identifiers, paths, URLs and words from the previous command output and use them for the next command in xonsh shell.

Stars: ✭ 26 (+36.84%)

Mutual labels: tokenizer

wink-tokenizer

Multilingual tokenizer that automatically tags each token with its type

Stars: ✭ 51 (+168.42%)

Mutual labels: tokenizer

Distill-BERT-Textgen

Research code for ACL 2020 paper: "Distilling Knowledge Learned in BERT for Text Generation".

Stars: ✭ 121 (+536.84%)

Mutual labels: machine-translation

masakhane-web

Masakhane Web is a translation web application solely for African languages.

Stars: ✭ 27 (+42.11%)

Mutual labels: machine-translation

OPUS-MT-train

Training open neural machine translation models

Stars: ✭ 166 (+773.68%)

Mutual labels: machine-translation

vscode-blockman

VSCode extension to highlight nested code blocks

Stars: ✭ 233 (+1126.32%)

Mutual labels: tokenizer

tvsub

TVsub: DCU-Tencent Chinese-English Dialogue Corpus

Stars: ✭ 40 (+110.53%)

Mutual labels: machine-translation

lex

Lex is an implementation of the lex tool in Ruby.

Stars: ✭ 49 (+157.89%)

Mutual labels: tokenizer

bergamot-translator

Cross-platform C++ library focused on optimized machine translation on consumer-grade devices.

Stars: ✭ 181 (+852.63%)

Mutual labels: machine-translation

deepl-rb

A simple ruby gem for the DeepL API

Stars: ✭ 38 (+100%)

Mutual labels: machine-translation

Tokenizer

A tokenizer for Icelandic text

Stars: ✭ 27 (+42.11%)

Mutual labels: tokenizer

Machine-Translation-Hindi-to-english-

Machine translation is the task of converting text from one language to another. Unlike traditional phrase-based translation systems, which consist of many small sub-components that are tuned separately, neural machine translation attempts to build and train a single, large neural network that reads a sentence and outputs a correct translation.

Stars: ✭ 19 (+0%)

Mutual labels: machine-translation

osdg-tool

OSDG is an open-source tool that maps and connects activities to the UN Sustainable Development Goals (SDGs) by identifying SDG-relevant content in any text. The tool is available online at www.osdg.ai. API access is available for research purposes.

Stars: ✭ 22 (+15.79%)

Mutual labels: machine-translation

ReductionWrappers

R wrappers to connect Python dimensional reduction tools and single cell data objects (Seurat, SingleCellExperiment, etc.)

Stars: ✭ 31 (+63.16%)

Mutual labels: wrappers

BSD

The Business Scene Dialogue corpus

Stars: ✭ 51 (+168.42%)

Mutual labels: machine-translation

OpenISS

OpenISS -- a unified multimodal motion data delivery framework.