Rime Cantonese
Rime Cantonese input schema | 粵語拼音輸入方案
Stars: ✭ 173 (+179.03%)
alexa-ruby
Ruby toolkit for Amazon Alexa service
Stars: ✭ 17 (-72.58%)
event-embedding-multitask
*SEM 2018: Learning Distributed Event Representations with a Multi-Task Approach
Stars: ✭ 22 (-64.52%)
Colibri Core
Colibri Core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e., patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.
Stars: ✭ 112 (+80.65%)
eliza-rs
A Rust implementation of ELIZA, a natural language processing program developed by Joseph Weizenbaum in 1966.
Stars: ✭ 48 (-22.58%)
WonderfulPolishLanguage
A repository listing resources for learning and exploring the wonderful Polish language.
Stars: ✭ 31 (-50%)
Pycantonese
Cantonese Linguistics and NLP in Python
Stars: ✭ 147 (+137.1%)
langua
A suite of language tools
Stars: ✭ 29 (-53.23%)
Onset
A language evolution simulator, using realistic phonetic changes.
Stars: ✭ 30 (-51.61%)
Flat
FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.github.io/folia), a rich XML-based format for linguistic annotation. Flat allows users to view annotated FoLiA documents and enrich these documents with new annotations; a wide variety of linguistic annotation types is supported through the FoLiA paradigm.
Stars: ✭ 93 (+50%)
lingvo--Ner-ru
Named entity recognition (NER) in Russian texts / Определение именованных сущностей (NER) в тексте на русском языке
Stars: ✭ 38 (-38.71%)
proiel-treebank
Official releases of the PROIEL treebank of ancient Indo-European languages
Stars: ✭ 30 (-51.61%)
duree
Durée: the longest book ever written.
Stars: ✭ 67 (+8.06%)
poesy
Poetic processing, for Python.
Stars: ✭ 28 (-54.84%)
mlconjug3
A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.
Stars: ✭ 47 (-24.19%)
Opencorpora
A web-based engine for creating and annotating textual corpora
Stars: ✭ 204 (+229.03%)
TextGridTools
Read, write, and manipulate Praat TextGrid files with Python
Stars: ✭ 84 (+35.48%)
Tossi
Chooses correct Korean particle morphs for arbitrary words.
Stars: ✭ 160 (+158.06%)
expletives
Expletives vomiting library...
Stars: ✭ 12 (-80.65%)
Corpuscrawler
Crawler for linguistic corpora
Stars: ✭ 127 (+104.84%)
Elpis
🙊 WIP software for creating speech recognition models.
Stars: ✭ 101 (+62.9%)
ngramr
R package to query the Google Ngram Viewer
Stars: ✭ 46 (-25.81%)
Textannotationgraphs
A modular annotation system that supports complex, interactive annotation graphs embedded on top of sequences of text.
Stars: ✭ 73 (+17.74%)
lameta
The Metadata Editor for Transparent Archiving of language document materials
Stars: ✭ 18 (-70.97%)
lambda-notebook
Lambda Notebook: Formal Semantics in Jupyter
Stars: ✭ 16 (-74.19%)
mystem
CGo bindings to Yandex.Mystem
Stars: ✭ 28 (-54.84%)
lingtypology
R package for linguistic cartography and typological databases search
Stars: ✭ 47 (-24.19%)
NatLang
NatLang is an English parser with an extensible grammar
Stars: ✭ 20 (-67.74%)
nyt-first-said
Tweets when words are published for the first time in the NYT
Stars: ✭ 222 (+258.06%)
concepticon-data
The curation repository for the data behind Concepticon.
Stars: ✭ 25 (-59.68%)
pylangacq
Language Acquisition Research Tools
Stars: ✭ 33 (-46.77%)
LangPad
A word processor/dictionary/generally useful tool for linguistics.
Stars: ✭ 20 (-67.74%)
pfootprint
Political Discourse Analysis Using Pre-Trained Word Vectors.
Stars: ✭ 20 (-67.74%)
folia
FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for proces…
Stars: ✭ 56 (-9.68%)
Awesome Linguistics
A curated list of anything remotely related to linguistics
Stars: ✭ 207 (+233.87%)
tokenizer
A simple tokenizer in Ruby for NLP tasks.
Stars: ✭ 44 (-29.03%)
Hangulize
Korean Alphabet Transcription
Stars: ✭ 184 (+196.77%)
spanish-corpora
Unannotated Spanish 3 Billion Words Corpora
Stars: ✭ 61 (-1.61%)
Prosodic
Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.
Stars: ✭ 162 (+161.29%)
libpalaso
Palaso Library: A set of .NET libraries useful for developers of language software.
Stars: ✭ 36 (-41.94%)
Hangulize
Hangulize transcribes non-Korean words into Hangul
Stars: ✭ 152 (+145.16%)
TextDatasetCleaner
🔬 Cleaning junk out of datasets (normalization, preprocessing)
Stars: ✭ 27 (-56.45%)
Ipa Dict
Monolingual wordlists with pronunciation information in IPA
Stars: ✭ 139 (+124.19%)
verbecc
Complete Conjugation of any Verb using Machine Learning for French, Spanish, Portuguese, Italian and Romanian
Stars: ✭ 45 (-27.42%)
Ichiran
Linguistic tools for Japanese-language texts
Stars: ✭ 120 (+93.55%)
wikipron
Massively multilingual pronunciation mining
Stars: ✭ 167 (+169.35%)
Pyconll
A minimal, pure Python library to interface with CoNLL-U format files.
Stars: ✭ 104 (+67.74%)
KoParadigm
KoParadigm: Korean Inflectional Paradigm Generator
Stars: ✭ 48 (-22.58%)
Wikipron
Massively multilingual pronunciation mining
Stars: ✭ 99 (+59.68%)
dev
PHOIBLE data and development.
Stars: ✭ 90 (+45.16%)
nlp-pure
Natural language processing algorithms implemented in pure Ruby with minimal dependencies
Stars: ✭ 19 (-69.35%)
treebender
A HDPSG-inspired symbolic natural language parser written in Rust
Stars: ✭ 24 (-61.29%)
OpenGNT
Open Greek New Testament Project; NA28 / NA27 Equivalent Text & Resources
Stars: ✭ 55 (-11.29%)
linguisticsdown
Easy Linguistics Document Writing with R Markdown
Stars: ✭ 24 (-61.29%)