libmorphlibmorph rus/ukr - fast & accurate morphological analyzer for Russian and Ukrainian
Stars: ✭ 16 (-76.47%)
HebPipeAn NLP pipeline for Hebrew
Stars: ✭ 15 (-77.94%)
udarUDAR Does Accented Russian: A finite-state morphological analyzer of Russian that handles stressed wordforms.
Stars: ✭ 15 (-77.94%)
simplemmaSimple multilingual lemmatizer for Python, especially useful for speed and efficiency
Stars: ✭ 32 (-52.94%)
lemmaA Morphological Parser (Analyser) / Lemmatizer written in Elixir.
Stars: ✭ 45 (-33.82%)
SudachipyPython version of Sudachi, a Japanese tokenizer.
Stars: ✭ 207 (+204.41%)
SudachiA Japanese Tokenizer for Business
Stars: ✭ 496 (+629.41%)
NMeCabJapanese morphological analyzer on .NET
Stars: ✭ 65 (-4.41%)
NLP-toolsUseful python NLP tools (evaluation, GUI interface, tokenization)
Stars: ✭ 39 (-42.65%)
Spark NlpState of the Art Natural Language Processing
Stars: ✭ 2,518 (+3602.94%)
zeyrekPython morphological analyzer for the Turkish language. Partial port of ZemberekNLP.
Stars: ✭ 36 (-47.06%)
mystem-scalaMorphological analyzer `mystem` (Russian language) wrapper for JVM languages
Stars: ✭ 21 (-69.12%)
KagomeSelf-contained Japanese Morphological Analyzer written in pure Go
Stars: ✭ 554 (+714.71%)
KuromojiKuromoji is a self-contained and very easy to use Japanese morphological analyzer designed for search
Stars: ✭ 745 (+995.59%)
Awesome Persian Nlp IrCurated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
Stars: ✭ 460 (+576.47%)
FastnlpfastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Stars: ✭ 2,441 (+3489.71%)
NcrfppNCRF++, a Neural Sequence Labeling Toolkit. Easy to use for any sequence labeling task (e.g. NER, POS, Segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.
Stars: ✭ 1,767 (+2498.53%)
alixA Lucene Indexer for XML, with lexical analysis (lemmatization for French)
Stars: ✭ 15 (-77.94%)
Lingopackage lingo provides the data structures and algorithms required for natural language processing
Stars: ✭ 113 (+66.18%)
QutufQutuf (قُطُوْف): An Arabic Morphological analyzer and Part-Of-Speech tagger as an Expert System.
Stars: ✭ 84 (+23.53%)
JumanppJuman++ (a Morphological Analyzer Toolkit)
Stars: ✭ 254 (+273.53%)
Camel toolsA suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
Stars: ✭ 124 (+82.35%)
yapYet Another (natural language) Parser
Stars: ✭ 40 (-41.18%)
graspEssential NLP & ML, short & fast pure Python code
Stars: ✭ 58 (-14.71%)
lemmy🤘Lemmy is a lemmatizer for Danish 🇩🇰 and Swedish 🇸🇪
Stars: ✭ 68 (+0%)
golemA lemmatizer implemented in Go
Stars: ✭ 54 (-20.59%)
md-svg-vueMaterial design icons by Google for Vue.js & Nuxt.js (server side support & inline svg with path)
Stars: ✭ 14 (-79.41%)
TweebankNLP[LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweebank-NER dataset
Stars: ✭ 84 (+23.53%)
empythyAutomated NLP sentiment predictions: batteries included, or use your own data
Stars: ✭ 17 (-75%)
limaThe Libre Multilingual Analyzer, a Natural Language Processing (NLP) C++ toolkit.
Stars: ✭ 75 (+10.29%)
schrutepyThe Entire Transcript from the Office in Tidy Format
Stars: ✭ 22 (-67.65%)
PyKOMORAN(Beta) PyKOMORAN is a Python wrapper for KOMORAN using Py4J.
Stars: ✭ 38 (-44.12%)
mlconjug3A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.
Stars: ✭ 47 (-30.88%)
RussianNounsJSDeclension of nouns by case. Usually only the nominative form, animacy, and gender are required.
Stars: ✭ 29 (-57.35%)
py-lingualyticsA text analytics library with support for codemixed data
Stars: ✭ 36 (-47.06%)
esa-httpclientAn asynchronous event-driven HTTP client based on netty.
Stars: ✭ 82 (+20.59%)
wefeWEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes the bias measurement and mitigation in Word Embeddings models. Please feel welcome to open an issue in case you have any questions or a pull request if you want to contribute to the project!
Stars: ✭ 164 (+141.18%)
sinlingA collection of NLP tools for Sinhalese (සිංහල).
Stars: ✭ 38 (-44.12%)
lara-hungarian-nlpNLP class for rapid ChatBot development in the Hungarian language
Stars: ✭ 27 (-60.29%)
aotRussian morphology for Java
Stars: ✭ 41 (-39.71%)
spaczzFuzzy matching and more functionality for spaCy.
Stars: ✭ 215 (+216.18%)
deduplicationFast multi-threaded content-dependent chunking deduplication for Buffers in C++ with a reference implementation in Javascript. Ships with extensive tests, a fuzz test and a benchmark.
Stars: ✭ 59 (-13.24%)
sticker2Further developed as SyntaxDot: https://github.com/tensordot/syntaxdot
Stars: ✭ 14 (-79.41%)
jmemBreak up huge JSON arrays into manageable sizes.
Stars: ✭ 14 (-79.41%)
Morse.jlPaper: Morphological Analysis Using a Sequence Decoder
Stars: ✭ 14 (-79.41%)
bllip-parserBLLIP reranking parser (also known as Charniak-Johnson parser, Charniak parser, Brown reranking parser) See http://pypi.python.org/pypi/bllipparser/ for Python module.
Stars: ✭ 217 (+219.12%)
suikaSuika 🍉 is a Japanese morphological analyzer written in pure Ruby
Stars: ✭ 31 (-54.41%)
jargonTokenizers and lemmatizers for Go
Stars: ✭ 98 (+44.12%)
nlp-cheat-sheet-pythonNLP Cheat Sheet, Python, spacy, LexNLP, NLTK, tokenization, stemming, sentence detection, named entity recognition
Stars: ✭ 69 (+1.47%)
Emotion-recognition-from-tweetsA comprehensive approach to recognizing emotion (sentiment) in a given tweet. Supervised machine learning.
Stars: ✭ 17 (-75%)
Quantitative-Big-Imaging-2018(Latest semester at https://github.com/kmader/Quantitative-Big-Imaging-2019) The material for the Quantitative Big Imaging course at ETHZ for the Spring Semester 2018
Stars: ✭ 50 (-26.47%)
CorenlpStanford CoreNLP: A Java suite of core NLP tools.
Stars: ✭ 8,248 (+12029.41%)
rsmorphyMorphological analyzer / inflection engine for Russian and Ukrainian languages rewritten in Rust
Stars: ✭ 27 (-60.29%)