udar
UDAR Does Accented Russian: A finite-state morphological analyzer of Russian that handles stressed wordforms.
Stars: ✭ 15 (-62.5%)
HebPipe
An NLP pipeline for Hebrew
Stars: ✭ 15 (-62.5%)
frog
Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
Stars: ✭ 70 (+75%)
GrammarEngine
Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)
Stars: ✭ 68 (+70%)
datalinguist
Stanford CoreNLP in idiomatic Clojure.
Stars: ✭ 93 (+132.5%)
ipymarkup
NER, syntax markup visualizations
Stars: ✭ 108 (+170%)
word2vec-tsne
Google News and Leo Tolstoy: Visualizing Word2Vec Word Embeddings using t-SNE.
Stars: ✭ 59 (+47.5%)
TweebankNLP
[LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweebank-NER dataset
Stars: ✭ 84 (+110%)
Corenlp
Stanford CoreNLP: A Java suite of core NLP tools.
Stars: ✭ 8,248 (+20520%)
foliapy
An extensive Python library for dealing with FoLiA (Format for Linguistic Annotation) documents, a rich XML-based format for linguistic annotation finding application in Natural Language Processing (NLP). This library was formerly part of PyNLPl.
Stars: ✭ 13 (-67.5%)
CYK-Parser
A CYK parser written in Python 3.
Stars: ✭ 24 (-40%)
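Hanlp
Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing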
Stars: ✭ 24,626 (+61465%)
mystem-scala
Morphological analyzer `mystem` (Russian language) wrapper for JVM languages
Stars: ✭ 21 (-47.5%)
FAParser
A Fast(er) and Accurate Syntactic Parsing by Exacter Searching.
Stars: ✭ 17 (-57.5%)
ArabicProcessingCog
A Python package that does stemming, tokenization, sentence breaking, segmentation, normalization, and POS tagging for the Arabic language.
Stars: ✭ 19 (-52.5%)
arcs-py
Arc-Eager and Arc-Hybrid Greedy Dependency Parser with Dynamic Oracle in Python (with no Dependencies!)
Stars: ✭ 17 (-57.5%)
kaldi helpers
🙊 A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.
Stars: ✭ 13 (-67.5%)
billboard
🎤 Lyrics/associated NLP data for Billboard's Top 100, 1950-2015.
Stars: ✭ 53 (+32.5%)
biblio-glutton
A high-performance bibliographic information service
Stars: ✭ 54 (+35%)
CISTEM
Stemmer for German
Stars: ✭ 33 (-17.5%)
ucto
Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic preprocessing steps, such as changing case, that you can use to make your text suited for further processing such as indexing, part-of-speech tagging, or machine translation. Ucto comes with tokenisation rules …
Stars: ✭ 58 (+45%)
KosherCocoa
My Objective-C port of KosherJava. KosherCocoa enables you to perform sunrise-based and sunset-based calculations for Jewish prayer and calendar.
Stars: ✭ 49 (+22.5%)
sembei
🍘 Word embeddings that do not go through word segmentation 🍘
Stars: ✭ 14 (-65%)
nytwit
New York Times Word Innovation Types dataset
Stars: ✭ 21 (-47.5%)
DartBible-Flutter
Cross-platform mobile Bible app [Android & iOS / iPhone / iPad], written in the Dart programming language
Stars: ✭ 26 (-35%)
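Pyhanlp
Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing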
Stars: ✭ 2,564 (+6310%)
dependency parsing tf
Tensorflow implementation of "A Fast and Accurate Dependency Parser using Neural Networks"
Stars: ✭ 77 (+92.5%)
dpar
Neural network transition-based dependency parser (in Rust)
Stars: ✭ 41 (+2.5%)
esapp
An unsupervised Chinese word segmentation tool.
Stars: ✭ 13 (-67.5%)
syntaxnet
Syntaxnet Parsey McParseface wrapper for POS tagging and dependency parsing
Stars: ✭ 77 (+92.5%)
datastories-semeval2017-task6
Deep-learning model presented in "DataStories at SemEval-2017 Task 6: Siamese LSTM with Attention for Humorous Text Comparison".
Stars: ✭ 20 (-50%)
Fastnlp
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Stars: ✭ 2,441 (+6002.5%)
python-arpa
🐍 Python library for n-gram models in ARPA format
Stars: ✭ 35 (-12.5%)
NLP-tools
Useful Python NLP tools (evaluation, GUI, tokenization)
Stars: ✭ 39 (-2.5%)
citation-function
Measuring the Evolution of a Scientific Field through Citation Frames
Stars: ✭ 40 (+0%)
pylangacq
Language Acquisition Research Tools
Stars: ✭ 33 (-17.5%)
VirtualBLU
A Virtual Assistant for Windows PC with wicked Qt Graphics.
Stars: ✭ 41 (+2.5%)
Pymystem3
A Python wrapper of the Yandex Mystem 3.1 morphological analyzer (http://api.yandex.ru/mystem). The original tool is shipped as a binary, and this library makes it easy to integrate it into Python projects. Let us know in the issues if you would like to be involved in the development or maintenance of this project. If you have any fix or suggestion, please make a pull request. We are very open to accepting contributions.
Stars: ✭ 224 (+460%)
Hebrew-Tokenizer
A very simple Python tokenizer for Hebrew text.
Stars: ✭ 16 (-60%)
lxa5
Linguistica 5: Unsupervised Learning of Linguistic Structure
Stars: ✭ 27 (-32.5%)
UniqueBible
A cross-platform Bible application, integrated with high-quality resources and amazing features, running offline on Windows, macOS and Linux
Stars: ✭ 61 (+52.5%)
OpenHebrewBible
Open Hebrew Bible Project; aligning BHS with WLC; bridging ETCBC, OpenScriptures & Berean data on Hebrew Bible
Stars: ✭ 43 (+7.5%)
Stanza
Official Stanford NLP Python Library for Many Human Languages
Stars: ✭ 5,887 (+14617.5%)
wikipron
Massively multilingual pronunciation mining
Stars: ✭ 167 (+317.5%)
bllip-parser
BLLIP reranking parser (also known as Charniak-Johnson parser, Charniak parser, Brown reranking parser). See http://pypi.python.org/pypi/bllipparser/ for the Python module.
Stars: ✭ 217 (+442.5%)
Momepy
Urban Morphology Measuring Toolkit
Stars: ✭ 210 (+425%)
NEMO
Neural Modeling for Named Entities and Morphology (Hebrew NER)
Stars: ✭ 25 (-37.5%)
folia
FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for proces…
Stars: ✭ 56 (+40%)
perke
A keyphrase extractor for Persian
Stars: ✭ 60 (+50%)