libmorph: libmorph rus/ukr - fast & accurate morphological analyzer/analyses for Russian and Ukrainian
Stars: ✭ 16 (-23.81%)
jargon: Tokenizers and lemmatizers for Go
Stars: ✭ 98 (+366.67%)
mystem: CGo bindings to Yandex.Mystem
Stars: ✭ 28 (+33.33%)
ArabicProcessingCog: A Python package that does stemming, tokenization, sentence breaking, segmentation, normalization, and POS tagging for the Arabic language.
Stars: ✭ 19 (-9.52%)
GrammarEngine: Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)
Stars: ✭ 68 (+223.81%)
simplemma: Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
Stars: ✭ 32 (+52.38%)
RussianNounsJS: Declension of nouns by case. Usually only the nominative form, animacy, and gender are required.
Stars: ✭ 29 (+38.1%)
translate: A module grouping multiple translation APIs
Stars: ✭ 321 (+1428.57%)
hunspell: High-Performance Stemmer, Tokenizer, and Spell Checker for R
Stars: ✭ 101 (+380.95%)
frog: Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
Stars: ✭ 70 (+233.33%)
kaldi helpers: 🙊 A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.
Stars: ✭ 13 (-38.1%)
ucto: Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic preprocessing steps, such as changing case, all of which you can use to make your text suited for further processing such as indexing, part-of-speech tagging, or machine translation. Ucto comes with tokenisation rules …
Stars: ✭ 58 (+176.19%)
lemma: A Morphological Parser (Analyser) / Lemmatizer written in Elixir.
Stars: ✭ 45 (+114.29%)
YaSeeker: Yandex OSINT tool
Stars: ✭ 104 (+395.24%)
tokenizer: Tokenize CSS according to the CSS Syntax
Stars: ✭ 52 (+147.62%)
robots-txt-parser: PHP class for parsing all directives from robots.txt files according to the specifications
Stars: ✭ 38 (+80.95%)
citation-function: Measuring the Evolution of a Scientific Field through Citation Frames
Stars: ✭ 40 (+90.48%)
lara-hungarian-nlp: NLP class for rapid ChatBot development in the Hungarian language
Stars: ✭ 27 (+28.57%)
lxa5: Linguistica 5: Unsupervised Learning of Linguistic Structure
Stars: ✭ 27 (+28.57%)
nytwit: New York Times Word Innovation Types dataset
Stars: ✭ 21 (+0%)
gd-tokenizer: A small Godot project with a tokenizer written in GDScript.
Stars: ✭ 34 (+61.9%)
python-mecab: Python 3.5+ bindings for mecab. Uses neither swig nor pybind. (No longer maintained.)
Stars: ✭ 27 (+28.57%)
farasapy: A Python implementation of the Farasa toolkit
Stars: ✭ 69 (+228.57%)
datastories-semeval2017-task6: Deep-learning model presented in "DataStories at SemEval-2017 Task 6: Siamese LSTM with Attention for Humorous Text Comparison".
Stars: ✭ 20 (-4.76%)
rustfst: Rust re-implementation of OpenFST - library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.
Stars: ✭ 104 (+395.24%)
yandex-direct-api: PHP library for Yandex.Direct API v5 (abandoned)
Stars: ✭ 12 (-42.86%)
xontrib-output-search: Get identifiers, paths, URLs and words from the previous command output and use them for the next command in xonsh shell.
Stars: ✭ 26 (+23.81%)
appmetrica-logsapi-loader: A tool for automatic data loading from AppMetrica LogsAPI into (local) ClickHouse
Stars: ✭ 18 (-14.29%)
psr2r-sniffer: A PSR-2-R code sniffer and code-style auto-correction tool, including many useful additions
Stars: ✭ 32 (+52.38%)
artefactory-connectors-kit: ACK is an E(T)L tool specialized in API data ingestion. It is accessible through a Command-Line Interface. The application allows you to easily extract, stream and load data (with minimum transformations), from the API source to the destination of your choice.
Stars: ✭ 34 (+61.9%)
xy2xy: A list of technologies similar to inner Yandex technologies
Stars: ✭ 112 (+433.33%)
wink-tokenizer: Multilingual tokenizer that automatically tags each token with its type
Stars: ✭ 51 (+142.86%)
lex: Lex is an implementation of the lex tool in Ruby.
Stars: ✭ 49 (+133.33%)
vscode-blockman: VSCode extension to highlight nested code blocks
Stars: ✭ 233 (+1009.52%)
tokenizer: A simple tokenizer in Ruby for NLP tasks.
Stars: ✭ 44 (+109.52%)
berserker: Berserker - BERt chineSE woRd toKenizER
Stars: ✭ 17 (-19.05%)
lindera: A morphological analysis library.
Stars: ✭ 226 (+976.19%)
SwiLex: A universal lexer library in Swift.
Stars: ✭ 29 (+38.1%)
FA: Repository of practical assignments for the Applied Informatics program of the ИТиАБД faculty at the Financial University under the Government of the Russian Federation
Stars: ✭ 26 (+23.81%)
drupal 8 unset html head link: 🤖 Module for removing unwanted HTML links (like rel="delete-form", rel="edit-form", etc.) from the head on Drupal 8.x websites. This is a reliable way to improve your position in the search results of Google, Yandex, etc.
Stars: ✭ 19 (-9.52%)
datalinguist: Stanford CoreNLP in idiomatic Clojure.
Stars: ✭ 93 (+342.86%)
word2vec-tsne: Google News and Leo Tolstoy: Visualizing Word2Vec Word Embeddings using t-SNE.
Stars: ✭ 59 (+180.95%)
snapdragon-lexer: Converts a string into an array of tokens, with useful methods for looking ahead and behind, capturing, matching, et cetera.
Stars: ✭ 19 (-9.52%)
yandex-disk-api: This library is built to use the Yandex Disk API with PHP
Stars: ✭ 19 (-9.52%)
liblex: C library for Lexical Analysis
Stars: ✭ 25 (+19.05%)
golem: A lemmatizer implemented in Go
Stars: ✭ 54 (+157.14%)
yametrikapy: Python library for Yandex Metrika API
Stars: ✭ 20 (-4.76%)
alice-renderer: Node.js library for building responses in Yandex Alice skills.
Stars: ✭ 27 (+28.57%)
swfk: Russian translation of the book “Snake Wrangling for Kids”.
Stars: ✭ 24 (+14.29%)