spaCy: 💫 Industrial-strength Natural Language Processing (NLP) in Python
uax29: A tokenizer based on Unicode text segmentation (UAX 29), for Go
auto-data-tokenize: Identify and tokenize sensitive data automatically using Cloud DLP and Dataflow
simplemma: Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
polycash: The ultimate open source betting protocol. PolyCash is a P2P blockchain platform for wallets, asset issuance, bonds & gaming.
wink-tokenizer: Multilingual tokenizer that automatically tags each token with its type
spacy-server: 🦜 Containerized HTTP API for industrial-strength NLP via spaCy and sense2vec
ling: Natural Language Processing Toolkit in Golang
nlp-cheat-sheet-python: NLP Cheat Sheet, Python, spaCy, LexNLP, NLTK, tokenization, stemming, sentence detection, named entity recognition
FAT: Factom Asset Tokens - Open tokenization standards on Factom
xontrib-output-search: Get identifiers, paths, URLs and words from the previous command output and use them for the next command in xonsh shell.
TweebankNLP: [LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweebank-NER dataset
lunasec: LunaSec - Dependency Security Scanner that automatically notifies you about vulnerabilities like Log4Shell or node-ipc in your Pull Requests and Builds. Protect yourself in 30 seconds with the LunaTrace GitHub App: https://github.com/marketplace/lunatrace-by-lunasec/
lima: The Libre Multilingual Analyzer, a Natural Language Processing (NLP) C++ toolkit.
Vaaku2Vec: Language Modeling and Text Classification in Malayalam Language using ULMFiT
tkseem: Arabic Tokenization Library. It provides many tokenization algorithms.