mystem-scala
Morphological analyzer `mystem` (Russian language) wrapper for JVM languages
Stars: ✭ 21 (-25%)
ngramr
R package to query the Google Ngram Viewer
Stars: ✭ 46 (+64.29%)
Strata
Keyboard layout for those who love Markdown and write in Russian
Stars: ✭ 70 (+150%)
Tossi
Chooses correct Korean particle morphs for arbitrary words.
Stars: ✭ 160 (+471.43%)
russiannames
Russian names parsers, gender identification and processing tools
Stars: ✭ 102 (+264.29%)
expletives
Expletives vomiting library...
Stars: ✭ 12 (-57.14%)
Opencorpora
A web-based engine for creating and annotating textual corpora
Stars: ✭ 204 (+628.57%)
lameta
The Metadata Editor for Transparent Archiving of language document materials
Stars: ✭ 18 (-35.71%)
RussianNounsJS
Declension of nouns by case. Usually only the nominative form, animacy, and gender are required.
Stars: ✭ 29 (+3.57%)
Corpuscrawler
Crawler for linguistic corpora
Stars: ✭ 127 (+353.57%)
Elpis
🙊 WIP software for creating speech recognition models.
Stars: ✭ 101 (+260.71%)
proiel-treebank
Official releases of the PROIEL treebank of ancient Indo-European languages
Stars: ✭ 30 (+7.14%)
mlconjug3
A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.
Stars: ✭ 47 (+67.86%)
pylangacq
Language Acquisition Research Tools
Stars: ✭ 33 (+17.86%)
WonderfulPolishLanguage
A repository listing resources for learning and exploring the wonderful Polish language.
Stars: ✭ 31 (+10.71%)
langua
A suite of language tools
Stars: ✭ 29 (+3.57%)
Rime Cantonese
Rime Cantonese input schema
Stars: ✭ 173 (+517.86%)
ru punkt
Russian language support for NLTK's PunktSentenceTokenizer
Stars: ✭ 49 (+75%)
Pycantonese
Cantonese Linguistics and NLP in Python
Stars: ✭ 147 (+425%)
Colibri Core
Colibri Core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e., patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.
Stars: ✭ 112 (+300%)
FA
Repository of practicals for the Applied Informatics program of the Faculty of Information Technology and Big Data Analysis at the Financial University under the Government of the Russian Federation
Stars: ✭ 26 (-7.14%)
lambda-notebook
Lambda Notebook: Formal Semantics in Jupyter
Stars: ✭ 16 (-42.86%)
Flat
FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.github.io/folia), a rich XML-based format for linguistic annotation. Flat allows users to view annotated FoLiA documents and enrich these documents with new annotations; a wide variety of linguistic annotation types is supported through the FoLiA paradigm.
Stars: ✭ 93 (+232.14%)
Beta
An open source reimplementation of Benny Brodda's BETA in Python
Stars: ✭ 65 (+132.14%)
vim-plugin-ruscmd
Vim plugin: support command mode in Russian keyboard layout
Stars: ✭ 60 (+114.29%)
LangPad
A word processor/dictionary/generally useful tool for linguistics.
Stars: ✭ 20 (-28.57%)
nyt-first-said
Tweets when words are published for the first time in the NYT
Stars: ✭ 222 (+692.86%)
swfk
“Snake Wrangling for Kids”: the Russian translation.
Stars: ✭ 24 (-14.29%)
libpalaso
Palaso Library: A set of .Net libraries useful for developers of Language Software.
Stars: ✭ 36 (+28.57%)
poesy
Poetic processing, for Python.
Stars: ✭ 28 (+0%)
pfootprint
Political Discourse Analysis Using Pre-Trained Word Vectors.
Stars: ✭ 20 (-28.57%)
verbecc
Complete Conjugation of any Verb using Machine Learning for French, Spanish, Portuguese, Italian and Romanian
Stars: ✭ 45 (+60.71%)
Awesome Linguistics
A curated list of anything remotely related to linguistics
Stars: ✭ 207 (+639.29%)
linguisticsdown
Easy Linguistics Document Writing with R Markdown
Stars: ✭ 24 (-14.29%)
Hangulize
Korean Alphabet Transcription
Stars: ✭ 184 (+557.14%)
KoParadigm
KoParadigm: Korean Inflectional Paradigm Generator
Stars: ✭ 48 (+71.43%)
Prosodic
Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.
Stars: ✭ 162 (+478.57%)
folia
FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for proces…
Stars: ✭ 56 (+100%)
Hangulize
Hangulize transcribes non-Korean words into Hangul
Stars: ✭ 152 (+442.86%)
dev
PHOIBLE data and development.
Stars: ✭ 90 (+221.43%)
Ipa Dict
Monolingual wordlists with pronunciation information in IPA
Stars: ✭ 139 (+396.43%)
lingvo--Ner-ru
Named entity recognition (NER) in Russian texts
Stars: ✭ 38 (+35.71%)
Ichiran
Linguistic tools for texts in Japanese language
Stars: ✭ 120 (+328.57%)
Pyconll
A minimal, pure Python library to interface with CoNLL-U format files.
Stars: ✭ 104 (+271.43%)
TextDatasetCleaner
🔬 Cleaning datasets of junk (normalization, preprocessing)
Stars: ✭ 27 (-3.57%)
Wikipron
Massively multilingual pronunciation mining
Stars: ✭ 99 (+253.57%)
Onset
A language evolution simulator, using realistic phonetic changes.
Stars: ✭ 30 (+7.14%)
Textannotationgraphs
A modular annotation system that supports complex, interactive annotation graphs embedded on top of sequences of text.
Stars: ✭ 73 (+160.71%)
NatLang
NatLang is an English parser with an extensible grammar
Stars: ✭ 20 (-28.57%)
duree
Durée: the longest book ever written.
Stars: ✭ 67 (+139.29%)
ego-demo
Envoy filters in Go
Stars: ✭ 34 (+21.43%)
event-embedding-multitask
*SEM 2018: Learning Distributed Event Representations with a Multi-Task Approach
Stars: ✭ 22 (-21.43%)
Yesterday I Learned
Brainfarts are caused by the rupturing of the cerebral sphincter.
Stars: ✭ 50 (+78.57%)
eliza-rs
A Rust implementation of ELIZA - a natural language processing program developed by Joseph Weizenbaum in 1966.
Stars: ✭ 48 (+71.43%)
lingtypology
R package for linguistic cartography and typological database search
Stars: ✭ 47 (+67.86%)