207 open source projects that are alternatives to or similar to simplemma

libmorph
libmorph rus/ukr - fast & accurate morphological analyzer/analyses for Russian and Ukrainian
Stars: ✭ 16 (-50%)
GrammarEngine
Grammar Dictionary of the Russian Language (+ English, Japanese, etc.)
Stars: ✭ 68 (+112.5%)
udar
UDAR Does Accented Russian: A finite-state morphological analyzer of Russian that handles stressed wordforms.
Stars: ✭ 15 (-53.12%)
alix
A Lucene Indexer for XML, with lexical analysis (lemmatization for French)
Stars: ✭ 15 (-53.12%)
Mutual labels:  lemmatizer, lemmatization
wink-tokenizer
Multilingual tokenizer that automatically tags each token with its type
Stars: ✭ 51 (+59.38%)
Mutual labels:  tokenizer, tokenization
mystem-scala
Morphological analyzer `mystem` (Russian language) wrapper for JVM languages
Stars: ✭ 21 (-34.37%)
Mutual labels:  tokenizer, lemmatizer
wink-lemmatizer
English lemmatizer
Stars: ✭ 53 (+65.63%)
Mutual labels:  lemmatizer, lemmatization
ling
Natural Language Processing Toolkit in Golang
Stars: ✭ 57 (+78.13%)
Mutual labels:  tokenization, lemmatization
nlp-cheat-sheet-python
NLP Cheat Sheet, Python, spaCy, LexNLP, NLTK, tokenization, stemming, sentence detection, named entity recognition
Stars: ✭ 69 (+115.63%)
Mutual labels:  tokenization, lemmatization
Turkish-Lemmatizer
Lemmatization for Turkish Language
Stars: ✭ 72 (+125%)
Mutual labels:  lemmatizer, lemmatization
xontrib-output-search
Get identifiers, paths, URLs and words from the previous command output and use them for the next command in xonsh shell.
Stars: ✭ 26 (-18.75%)
Mutual labels:  tokenizer, tokenization
HebPipe
An NLP pipeline for Hebrew
Stars: ✭ 15 (-53.12%)
lemma
A Morphological Parser (Analyser) / Lemmatizer written in Elixir.
Stars: ✭ 45 (+40.63%)
Mutual labels:  lemmatizer, lemmatization
Jumanpp
Juman++ (a Morphological Analyzer Toolkit)
Stars: ✭ 254 (+693.75%)
jargon
Tokenizers and lemmatizers for Go
Stars: ✭ 98 (+206.25%)
Mutual labels:  tokenizer, lemmatizer
TweebankNLP
[LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweebank-NER dataset
Stars: ✭ 84 (+162.5%)
Mutual labels:  tokenization, lemmatization
zeyrek
Python morphological analyzer for Turkish language. Partial port of ZemberekNLP.
Stars: ✭ 36 (+12.5%)
Kagome
Self-contained Japanese Morphological Analyzer written in pure Go
Stars: ✭ 554 (+1631.25%)
suika
Suika 🍉 is a Japanese morphological analyzer written in pure Ruby
Stars: ✭ 31 (-3.12%)
ComPP
Company Passwords Profiler (aka ComPP) helps make a brute-force wordlist for a targeted company.
Stars: ✭ 44 (+37.5%)
Mutual labels:  wordlist
spacy russian tokenizer
Custom Russian tokenizer for spaCy
Stars: ✭ 35 (+9.38%)
Mutual labels:  tokenization
treestoolbox
TREES toolbox
Stars: ✭ 20 (-37.5%)
Mutual labels:  morphological-analysis
psr2r-sniffer
A PSR-2-R code sniffer and code-style auto-correction-tool - including many useful additions
Stars: ✭ 32 (+0%)
Mutual labels:  tokenizer
tokenizer
Tokenize CSS according to the CSS Syntax
Stars: ✭ 52 (+62.5%)
Mutual labels:  tokenizer
lex
Lex is an implementation of lex tool in Ruby.
Stars: ✭ 49 (+53.13%)
Mutual labels:  tokenizer
parallel-corpora-tools
Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks.
Stars: ✭ 35 (+9.38%)
Mutual labels:  corpus-tools
rustfst
Rust re-implementation of OpenFST - library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.
Stars: ✭ 104 (+225%)
Mutual labels:  tokenizer
liblex
C library for Lexical Analysis
Stars: ✭ 25 (-21.87%)
Mutual labels:  tokenizer
hunspell
High-Performance Stemmer, Tokenizer, and Spell Checker for R
Stars: ✭ 101 (+215.63%)
Mutual labels:  tokenizer
berserker
Berserker - BERt chineSE woRd toKenizER
Stars: ✭ 17 (-46.87%)
Mutual labels:  tokenizer
tokenizer
A simple tokenizer in Ruby for NLP tasks.
Stars: ✭ 44 (+37.5%)
Mutual labels:  tokenizer
python-wordlist-generator
Create awesome wordlists with Python, demo: https://asciinema.org/a/101677
Stars: ✭ 87 (+171.88%)
Mutual labels:  wordlist
RockYou2021.txt
RockYou2021.txt is a MASSIVE WORDLIST compiled of various other wordlists. RockYou2021.txt DOES NOT CONTAIN USER:PASS logins!
Stars: ✭ 288 (+800%)
Mutual labels:  wordlist
lara-hungarian-nlp
NLP class for rapid ChatBot development in Hungarian language
Stars: ✭ 27 (-15.62%)
Mutual labels:  lemmatizer
Text tone analyzer
A system that analyzes the sentiment of texts and utterances.
Stars: ✭ 15 (-53.12%)
Mutual labels:  lemmatization
WordFrequencyPython
Python code to find out most frequent words from different word lists
Stars: ✭ 31 (-3.12%)
Mutual labels:  wordlist
SwiLex
A universal lexer library in Swift.
Stars: ✭ 29 (-9.37%)
Mutual labels:  tokenizer
cracken
A fast password wordlist generator, Smartlist creation, and password hybrid-mask analysis tool written in pure safe Rust
Stars: ✭ 192 (+500%)
Mutual labels:  wordlist
ronin-support
A support library for Ronin. Like activesupport, but for hacking!
Stars: ✭ 23 (-28.12%)
Mutual labels:  wordlist
voikko-rs
Rust bindings for the Voikko library
Stars: ✭ 16 (-50%)
Mutual labels:  morphological-analysis
ilmulti
Tooling to play around with multilingual machine translation for Indian Languages.
Stars: ✭ 19 (-40.62%)
Mutual labels:  tokenizer
lindera
A morphological analysis library.
Stars: ✭ 226 (+606.25%)
Mutual labels:  tokenizer
gd-tokenizer
A small godot project with a tokenizer written in GDScript.
Stars: ✭ 34 (+6.25%)
Mutual labels:  tokenizer
python-mecab
A repository to bind mecab for Python 3.5+. Not using swig nor pybind. (Not Maintained Now)
Stars: ✭ 27 (-15.62%)
Mutual labels:  tokenizer
vscode-blockman
VSCode extension to highlight nested code blocks
Stars: ✭ 233 (+628.13%)
Mutual labels:  tokenizer
FAT
Factom Asset Tokens - Open tokenization standards on Factom
Stars: ✭ 17 (-46.87%)
Mutual labels:  tokenization
kontext
An advanced, extensible web front-end for the Manatee-open corpus search engine
Stars: ✭ 50 (+56.25%)
Mutual labels:  corpus-tools
Emotion-recognition-from-tweets
A comprehensive approach to recognizing emotion (sentiment) from a given tweet. Supervised machine learning.
Stars: ✭ 17 (-46.87%)
Mutual labels:  lemmatization
elasticsearch-plugins
Some native scoring script plugins for elasticsearch
Stars: ✭ 30 (-6.25%)
Mutual labels:  tokenizer
teanaps
An open-source Python library for natural language processing and text analysis.
Stars: ✭ 91 (+184.38%)
Mutual labels:  morphological-analysis
brutas
Wordlists and passwords handcrafted with ♥
Stars: ✭ 32 (+0%)
Mutual labels:  wordlist
snapdragon-lexer
Converts a string into an array of tokens, with useful methods for looking ahead and behind, capturing, matching, et cetera.
Stars: ✭ 19 (-40.62%)
Mutual labels:  tokenizer
spacy-server
🦜 Containerized HTTP API for industrial-strength NLP via spaCy and sense2vec
Stars: ✭ 58 (+81.25%)
Mutual labels:  tokenization
longtongue
Customized password/passphrase list built from input target info
Stars: ✭ 61 (+90.63%)
Mutual labels:  wordlist
WiCrackFi
Python Script to help/automate the WiFi hacking exercises.
Stars: ✭ 61 (+90.63%)
Mutual labels:  wordlist
golem
A lemmatizer implemented in Go
Stars: ✭ 54 (+68.75%)
Mutual labels:  lemmatizer
Brutal-wordlist-Generator
Brutal Wordlist Generator is a Java-based application for generating wordlists, with a user-friendly interface
Stars: ✭ 24 (-25%)
Mutual labels:  wordlist
neural tokenizer
Tokenize English sentences using neural networks.
Stars: ✭ 64 (+100%)
Mutual labels:  tokenizer
tmpleak
Leak other players' temporary workspaces for ctf and wargames.
Stars: ✭ 76 (+137.5%)
Mutual labels:  wordlist
chinese-tokenizer
Tokenizes Chinese texts into words.
Stars: ✭ 72 (+125%)
Mutual labels:  tokenizer