cang-jie
Chinese tokenizer for tantivy, based on jieba-rs
Stars: ✭ 48 (-84.66%)
berserker
Berserker - BERt chineSE woRd toKenizER
Stars: ✭ 17 (-94.57%)
NLPIR-ICTCLAS
The Java Package of NLPIR-ICTCLAS.
Stars: ✭ 16 (-94.89%)
tokenizer
Tokenize CSS according to the CSS Syntax
Stars: ✭ 52 (-83.39%)
SwiLex
A universal lexer library in Swift.
Stars: ✭ 29 (-90.73%)
farasapy
A Python implementation of the Farasa toolkit
Stars: ✭ 69 (-77.96%)
buke
Full-text search for manpages
Stars: ✭ 27 (-91.37%)
fts
🔍 Postgres full-text search (fts)
Stars: ✭ 28 (-91.05%)
Hebrew-Tokenizer
A very simple Python tokenizer for Hebrew text.
Stars: ✭ 16 (-94.89%)
xontrib-output-search
Get identifiers, paths, URLs and words from the previous command output and use them for the next command in xonsh shell.
Stars: ✭ 26 (-91.69%)
understand-full-text-search
📖 Support examples for learning full-text search using PostgreSQL. Ready to run.
Stars: ✭ 98 (-68.69%)
simplemma
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
Stars: ✭ 32 (-89.78%)
rustfst
Rust re-implementation of OpenFST - library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.
Stars: ✭ 104 (-66.77%)
ex elasticlunr
Elasticlunr is a small, full-text search library for use in the Elixir environment. It indexes JSON documents and provides a friendly search interface to retrieve documents.
Stars: ✭ 125 (-60.06%)
CodeIndex
A code index searching tool based on Lucene.Net
Stars: ✭ 28 (-91.05%)
poyonga
Python Groonga Client
Stars: ✭ 19 (-93.93%)
tokenizer
A simple tokenizer in Ruby for NLP tasks.
Stars: ✭ 44 (-85.94%)
pascal-interpreter
A simple interpreter for a large subset of the Pascal language, written for educational purposes
Stars: ✭ 21 (-93.29%)
gd-tokenizer
A small Godot project with a tokenizer written in GDScript.
Stars: ✭ 34 (-89.14%)
vscode-blockman
VSCode extension to highlight nested code blocks
Stars: ✭ 233 (-25.56%)
djangoqueries
The code of "Making queries" in docs.djangoproject.com that I used in my article "Full-Text Search in Django with PostgreSQL".
Stars: ✭ 39 (-87.54%)
PaddleTokenizer
A deep-neural-network Chinese word segmentation engine (tokenizer) implemented with PaddlePaddle
Stars: ✭ 14 (-95.53%)
Text-Classification-LSTMs-PyTorch
The aim of this repository is to show a baseline model for text classification by implementing an LSTM-based model in PyTorch. To provide a better understanding of the model, a Tweets dataset provided by Kaggle is used.
Stars: ✭ 45 (-85.62%)
wink-tokenizer
Multilingual tokenizer that automatically tags each token with its type
Stars: ✭ 51 (-83.71%)
bulksearch
Lightweight and read-write optimized full text search library.
Stars: ✭ 108 (-65.5%)
Cross-Domain-CWS
Code for IJCAI 2018 paper "Neural Networks Incorporating Unlabeled and Partially-labeled Data for Cross-domain Chinese Word Segmentation"
Stars: ✭ 14 (-95.53%)
bredon
A modern CSS value compiler in JavaScript
Stars: ✭ 39 (-87.54%)
neural tokenizer
Tokenize English sentences using neural networks.
Stars: ✭ 64 (-79.55%)
rgpipe
lesspipe for ripgrep for common new filetypes using few dependencies
Stars: ✭ 21 (-93.29%)
mystem-scala
Morphological analyzer `mystem` (Russian language) wrapper for JVM languages
Stars: ✭ 21 (-93.29%)
mxusearch
🔍 A Laravel full-text search service built on Xunsearch.
Stars: ✭ 40 (-87.22%)
Jumanpp
Juman++ (a Morphological Analyzer Toolkit)
Stars: ✭ 254 (-18.85%)
psr2r-sniffer
A PSR-2-R code sniffer and code-style auto-correction tool, including many useful additions
Stars: ✭ 32 (-89.78%)
gatsby-plugin-lunr
Gatsby plugin for full-text search based on a lunr client-side index. Supports multilanguage search.
Stars: ✭ 69 (-77.96%)
lex
Lex is an implementation of the lex tool in Ruby.
Stars: ✭ 49 (-84.35%)
search-for-kirby
Kirby 3 plugin for adding a search index (sqlite or Algolia).
Stars: ✭ 42 (-86.58%)
hunspell
High-Performance Stemmer, Tokenizer, and Spell Checker for R
Stars: ✭ 101 (-67.73%)
paperless-ng
A supercharged version of paperless: scan, index and archive all your physical documents
Stars: ✭ 4,840 (+1446.33%)
lindera
A morphological analysis library.
Stars: ✭ 226 (-27.8%)
Sacremoses
Python port of the Moses tokenizer, truecaser and normalizer
Stars: ✭ 293 (-6.39%)
lunr-module
Full-text search with pre-built indexes for Nuxt.js using lunr.js
Stars: ✭ 45 (-85.62%)
ilmulti
Tooling to play around with multilingual machine translation for Indian Languages.
Stars: ✭ 19 (-93.93%)
python-mecab
A repository to bind mecab for Python 3.5+. Uses neither swig nor pybind. (Not Maintained Now)
Stars: ✭ 27 (-91.37%)
snapdragon-lexer
Converts a string into an array of tokens, with useful methods for looking ahead and behind, capturing, matching, et cetera.
Stars: ✭ 19 (-93.93%)
nlpir-analysis-cn-ictclas
Lucene/Solr Analyzer Plugin. Supports macOS, Linux x86/64, and Windows x86/64. It is a Maven project, which allows you to change the Lucene/Solr version for compatibility with the corresponding release.
Stars: ✭ 71 (-77.32%)
ArabicProcessingCog
A Python package that does stemming, tokenization, sentence breaking, segmentation, normalization, and POS tagging for the Arabic language.
Stars: ✭ 19 (-93.93%)
suika
Suika 🍉 is a Japanese morphological analyzer written in pure Ruby
Stars: ✭ 31 (-90.1%)
liblex
C library for Lexical Analysis
Stars: ✭ 25 (-92.01%)
Tokenizer
A tokenizer for Icelandic text
Stars: ✭ 27 (-91.37%)
text2text
Text2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (-39.94%)
lexertk
C++ Lexer Toolkit Library (LexerTk) https://www.partow.net/programming/lexertk/index.html
Stars: ✭ 26 (-91.69%)
Sentences
A multilingual command line sentence tokenizer in Golang
Stars: ✭ 293 (-6.39%)
Memex
Browser extension for full-text search of your browsing history & bookmarks.
Stars: ✭ 3,344 (+968.37%)
larasearch
A driver-based solution for searching your Eloquent models; supports Laravel 5.2 and the Elasticsearch engine.
Stars: ✭ 13 (-95.85%)
lucilla
Fast, efficient, in-memory Full Text Search for Kotlin
Stars: ✭ 102 (-67.41%)
jargon
Tokenizers and lemmatizers for Go
Stars: ✭ 98 (-68.69%)