Regex Automata
A low-level regular expression library that uses deterministic finite automata.
Stars: ✭ 203 (+576.67%)
Bsed
Simple SQL-like syntax on top of Perl text processing.
Stars: ✭ 414 (+1280%)
Dan Jurafsky Chris Manning Nlp
My solution to the Natural Language Processing course made by Dan Jurafsky and Chris Manning in Winter 2012.
Stars: ✭ 124 (+313.33%)
Textpipe
Textpipe: clean and extract metadata from text
Stars: ✭ 284 (+846.67%)
neo4j-rake tasks
Rake tasks for managing Neo4j. Tasks allow for starting, stopping, and configuring
Stars: ✭ 13 (-56.67%)
daachorse
🐎 A fast implementation of the Aho-Corasick algorithm using the compact double-array data structure.
Stars: ✭ 75 (+150%)
Textcluster
Short text clustering preprocessing module
Stars: ✭ 115 (+283.33%)
NLP-tools
Useful Python NLP tools (evaluation, GUI interface, tokenization)
Stars: ✭ 39 (+30%)
typ3r.js
🍟 [Library] dA aNn0Y1Ng t3Xt g3NeRa7or
Stars: ✭ 22 (-26.67%)
rsfbclient
Rust Firebird Client
Stars: ✭ 64 (+113.33%)
hck
A sharp cut(1) clone.
Stars: ✭ 542 (+1706.67%)
TextDatasetCleaner
🔬 Cleaning datasets of garbage (normalization, preprocessing)
Stars: ✭ 27 (-10%)
Fastnlp
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Stars: ✭ 2,441 (+8036.67%)
pwsh-prelude
PowerShell “standard” library for supercharging your productivity. Provides a powerful cross-platform scripting environment enabling efficient analysis and sustainable science in myriad contexts.
Stars: ✭ 26 (-13.33%)
Nostril
Nostril: Nonsense String Evaluator
Stars: ✭ 86 (+186.67%)
Compare-UserJS
PowerShell script for comparing user.js (or prefs.js) files.
Stars: ✭ 79 (+163.33%)
bake
An alternative to rake, with all the great stuff and a sprinkling of magic dust.
Stars: ✭ 72 (+140%)
lingua-go
👄 The most accurate natural language detection library for Go, suitable for long and short text alike
Stars: ✭ 684 (+2180%)
Node Rake
A NodeJS implementation of the Rapid Automatic Keyword Extraction algorithm.
Stars: ✭ 85 (+183.33%)
textstat
Ruby gem to calculate statistics from text to determine readability, complexity and grade level of a particular corpus.
Stars: ✭ 25 (-16.67%)
Textvec
Text vectorization tool to outperform TFIDF for classification tasks
Stars: ✭ 167 (+456.67%)
hama-py
🦛 Python Hangul processing library. Python Korean Morphological Analyzer
Stars: ✭ 16 (-46.67%)
nlcli
Natural language interface for the command line.
Stars: ✭ 21 (-30%)
cactusref
🌵 Cycle-Aware Reference Counting in Rust
Stars: ✭ 129 (+330%)
finglish
A Finglish to Persian converter.
Stars: ✭ 60 (+100%)
Ter
Text Expression Runner – Readable and easy to use text expressions
Stars: ✭ 67 (+123.33%)
sova-tts-tps
NLP-preprocessor for the SOVA-TTS project
Stars: ✭ 44 (+46.67%)
Jaconv
Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku and Zenkaku
Stars: ✭ 157 (+423.33%)
s3-utils
Utilities and tools based around Amazon S3 to provide convenience APIs in a CLI
Stars: ✭ 45 (+50%)
mug
💎 mug Jekyll theme
Stars: ✭ 45 (+50%)
python-mecab
A repository to bind mecab for Python 3.5+. Uses neither swig nor pybind. (Not Maintained Now)
Stars: ✭ 27 (-10%)
Pipeit
PipeIt is a text transformation, conversion, cleansing and extraction tool.
Stars: ✭ 57 (+90%)
Emotion-recognition-from-tweets
A comprehensive approach to recognizing emotion (sentiment) in a given tweet. Supervised machine learning.
Stars: ✭ 17 (-43.33%)
Xioc
Extract indicators of compromise from text, including "escaped" ones.
Stars: ✭ 148 (+393.33%)
text2video
Text to Video Generation Problem
Stars: ✭ 28 (-6.67%)
Pyparsing
Python library for creating PEG parsers
Stars: ✭ 1,052 (+3406.67%)
estratto
Parsing fixed-width file content made easy
Stars: ✭ 12 (-60%)
broom
A disk cleaning utility for developers.
Stars: ✭ 38 (+26.67%)
Fxt
A large-scale feature extraction tool for text-based machine learning
Stars: ✭ 25 (-16.67%)
syn
syn - the thesaurus
Stars: ✭ 45 (+50%)
Stanza Old
Stanford NLP group's shared Python tools.
Stars: ✭ 142 (+373.33%)
Text-Classification-LSTMs-PyTorch
The aim of this repository is to show a baseline model for text classification by implementing an LSTM-based model in PyTorch. To give a better understanding of the model, it is applied to a Tweets dataset provided by Kaggle.
Stars: ✭ 45 (+50%)
perke
A keyphrase extractor for Persian
Stars: ✭ 60 (+100%)
cake
Cake is a powerful and flexible Make-like utility tool. Make Tasks Great Again!
Stars: ✭ 64 (+113.33%)
Gohn
Hatena Notation (はてな記法) Parser written in Go
Stars: ✭ 17 (-43.33%)
termscp
🖥 A feature-rich terminal UI file transfer and explorer with support for SCP/SFTP/FTP/S3
Stars: ✭ 707 (+2256.67%)
punylinux
Build automation (powered by Ruby & Rake) for a very minimal Linux system.
Stars: ✭ 31 (+3.33%)
Stringi
THE String Processing Package for R (with ICU)
Stars: ✭ 204 (+580%)
Konoha
🌿 An easy-to-use Japanese Text Processing tool, which makes it possible to switch tokenizers with small changes of code.
Stars: ✭ 130 (+333.33%)
Diff Match Patch
Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.
Stars: ✭ 4,910 (+16266.67%)