libmorphlibmorph rus/ukr - fast & accurate morphological analyzer for Russian and Ukrainian
Stars: ✭ 16 (-50%)
GrammarEngineGrammatical Dictionary of the Russian Language (+ English, Japanese, etc.)
Stars: ✭ 68 (+112.5%)
udarUDAR Does Accented Russian: A finite-state morphological analyzer of Russian that handles stressed wordforms.
Stars: ✭ 15 (-53.12%)
alixA Lucene Indexer for XML, with lexical analysis (lemmatization for French)
Stars: ✭ 15 (-53.12%)
wink-tokenizerMultilingual tokenizer that automatically tags each token with its type
Stars: ✭ 51 (+59.38%)
mystem-scalaMorphological analyzer `mystem` (Russian language) wrapper for JVM languages
Stars: ✭ 21 (-34.37%)
lingNatural Language Processing Toolkit in Golang
Stars: ✭ 57 (+78.13%)
nlp-cheat-sheet-pythonNLP Cheat Sheet, Python, spaCy, LexNPL, NLTK, tokenization, stemming, sentence detection, named entity recognition
Stars: ✭ 69 (+115.63%)
xontrib-output-searchGet identifiers, paths, URLs and words from the previous command output and use them for the next command in xonsh shell.
Stars: ✭ 26 (-18.75%)
HebPipeAn NLP pipeline for Hebrew
Stars: ✭ 15 (-53.12%)
lemmaA Morphological Parser (Analyser) / Lemmatizer written in Elixir.
Stars: ✭ 45 (+40.63%)
JumanppJuman++ (a Morphological Analyzer Toolkit)
Stars: ✭ 254 (+693.75%)
jargonTokenizers and lemmatizers for Go
Stars: ✭ 98 (+206.25%)
TweebankNLP[LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweebank-NER dataset
Stars: ✭ 84 (+162.5%)
zeyrekPython morphological analyzer for Turkish language. Partial port of ZemberekNLP.
Stars: ✭ 36 (+12.5%)
KagomeSelf-contained Japanese Morphological Analyzer written in pure Go
Stars: ✭ 554 (+1631.25%)
suikaSuika 🍉 is a Japanese morphological analyzer written in pure Ruby
Stars: ✭ 31 (-3.12%)
ComPPCompany Passwords Profiler (aka ComPP) helps make a brute-force wordlist for a targeted company.
Stars: ✭ 44 (+37.5%)
psr2r-snifferA PSR-2-R code sniffer and code-style auto-correction-tool - including many useful additions
Stars: ✭ 32 (+0%)
tokenizerTokenize CSS according to the CSS Syntax
Stars: ✭ 52 (+62.5%)
lexLex is an implementation of lex tool in Ruby.
Stars: ✭ 49 (+53.13%)
parallel-corpora-toolsTools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks.
Stars: ✭ 35 (+9.38%)
rustfstRust re-implementation of OpenFST - library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.
Stars: ✭ 104 (+225%)
liblexC library for Lexical Analysis
Stars: ✭ 25 (-21.87%)
hunspellHigh-Performance Stemmer, Tokenizer, and Spell Checker for R
Stars: ✭ 101 (+215.63%)
berserkerBerserker - BERt chineSE woRd toKenizER
Stars: ✭ 17 (-46.87%)
tokenizerA simple tokenizer in Ruby for NLP tasks.
Stars: ✭ 44 (+37.5%)
RockYou2021.txtRockYou2021.txt is a MASSIVE WORDLIST compiled of various other wordlists. RockYou2021.txt DOES NOT CONTAIN USER:PASS logins!
Stars: ✭ 288 (+800%)
lara-hungarian-nlpNLP class for rapid ChatBot development in Hungarian language
Stars: ✭ 27 (-15.62%)
Text tone analyzerA system that analyzes the sentiment of texts and utterances.
Stars: ✭ 15 (-53.12%)
WordFrequencyPythonPython code to find out most frequent words from different word lists
Stars: ✭ 31 (-3.12%)
SwiLexA universal lexer library in Swift.
Stars: ✭ 29 (-9.37%)
crackena fast password wordlist generator, Smartlist creation and password hybrid-mask analysis tool written in pure safe Rust
Stars: ✭ 192 (+500%)
ronin-supportA support library for Ronin. Like activesupport, but for hacking!
Stars: ✭ 23 (-28.12%)
voikko-rsRust bindings for the Voikko library
Stars: ✭ 16 (-50%)
ilmultiTooling to play around with multilingual machine translation for Indian Languages.
Stars: ✭ 19 (-40.62%)
linderaA morphological analysis library.
Stars: ✭ 226 (+606.25%)
gd-tokenizerA small godot project with a tokenizer written in GDScript.
Stars: ✭ 34 (+6.25%)
python-mecabA repository to bind mecab for Python 3.5+. Not using swig nor pybind. (Not Maintained Now)
Stars: ✭ 27 (-15.62%)
vscode-blockmanVSCode extension to highlight nested code blocks
Stars: ✭ 233 (+628.13%)
FATFactom Asset Tokens - Open tokenization standards on Factom
Stars: ✭ 17 (-46.87%)
kontextAn advanced, extensible web front-end for the Manatee-open corpus search engine
Stars: ✭ 50 (+56.25%)
Emotion-recognition-from-tweetsA comprehensive approach to recognizing emotion (sentiment) in a given tweet. Supervised machine learning.
Stars: ✭ 17 (-46.87%)
teanapsAn open-source Python library for natural language processing and text analysis.
Stars: ✭ 91 (+184.38%)
brutasWordlists and passwords handcrafted with ♥
Stars: ✭ 32 (+0%)
snapdragon-lexerConverts a string into an array of tokens, with useful methods for looking ahead and behind, capturing, matching, et cetera.
Stars: ✭ 19 (-40.62%)
spacy-server🦜 Containerized HTTP API for industrial-strength NLP via spaCy and sense2vec
Stars: ✭ 58 (+81.25%)
longtongueCustomized password/passphrase list generator that takes target info as input
Stars: ✭ 61 (+90.63%)
WiCrackFiPython Script to help/automate the WiFi hacking exercises.
Stars: ✭ 61 (+90.63%)
golemA lemmatizer implemented in Go
Stars: ✭ 54 (+68.75%)
Brutal-wordlist-GeneratorBrutal Wordlist Generator is a Java-based application for generating wordlists, with a UX-focused interface
Stars: ✭ 24 (-25%)
neural tokenizerTokenize English sentences using neural networks.
Stars: ✭ 64 (+100%)
tmpleakLeak other players' temporary workspaces for ctf and wargames.
Stars: ✭ 76 (+137.5%)