OpencorporaA web-based engine for creating and annotating textual corpora
Stars: ✭ 204 (+750%)
CorpuscrawlerCrawler for linguistic corpora
Stars: ✭ 127 (+429.17%)
ngramrR package to query the Google Ngram Viewer
Stars: ✭ 46 (+91.67%)
poesyPoetic processing, for Python.
Stars: ✭ 28 (+16.67%)
BetaAn open source reimplementation of Benny Brodda's BETA in Python
Stars: ✭ 65 (+170.83%)
expletivesExpletives vomiting library...
Stars: ✭ 12 (-50%)
TossiChooses correct Korean particle morphs for arbitrary words.
Stars: ✭ 160 (+566.67%)
linguisticsdownEasy Linguistics Document Writing with R Markdown
Stars: ✭ 24 (+0%)
Elpis🙊 WIP software for creating speech recognition models.
Stars: ✭ 101 (+320.83%)
OnsetA language evolution simulator, using realistic phonetic changes.
Stars: ✭ 30 (+25%)
PhonemesJason Riggle's chart of phonological features in JSON format + extras
Stars: ✭ 33 (+37.5%)
mlconjug3A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.
Stars: ✭ 47 (+95.83%)
WonderfulPolishLanguageA list of resources for learning and exploring the wonderful Polish language.
Stars: ✭ 31 (+29.17%)
TextDatasetCleaner🔬 Cleaning junk out of datasets (normalization, preprocessing)
Stars: ✭ 27 (+12.5%)
Rime CantoneseRime Cantonese input schema
Stars: ✭ 173 (+620.83%)
languaA suite of language tools
Stars: ✭ 29 (+20.83%)
PycantoneseCantonese Linguistics and NLP in Python
Stars: ✭ 147 (+512.5%)
mystemCGo bindings to Yandex.Mystem
Stars: ✭ 28 (+16.67%)
Colibri CoreColibri Core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.
Stars: ✭ 112 (+366.67%)
FlatFoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.github.io/folia), a rich XML-based format for linguistic annotation. Flat allows users to view annotated FoLiA documents and enrich these documents with new annotations. A wide variety of linguistic annotation types is supported through the FoLiA paradigm.
Stars: ✭ 93 (+287.5%)
lingvo--Ner-ruNamed entity recognition (NER) in Russian texts
Stars: ✭ 38 (+58.33%)
event-embedding-multitask*SEM 2018: Learning Distributed Event Representations with a Multi-Task Approach
Stars: ✭ 22 (-8.33%)
nyt-first-saidTweets when words are published for the first time in the NYT
Stars: ✭ 222 (+825%)
LangPadA word processor/dictionary/generally useful tool for linguistics.
Stars: ✭ 20 (-16.67%)
pylangacqLanguage Acquisition Research Tools
Stars: ✭ 33 (+37.5%)
pfootprintPolitical Discourse Analysis Using Pre-Trained Word Vectors.
Stars: ✭ 20 (-16.67%)
libpalasoPalaso Library: A set of .NET libraries useful for developers of language software.
Stars: ✭ 36 (+50%)
Awesome LinguisticsA curated list of anything remotely related to linguistics
Stars: ✭ 207 (+762.5%)
OpenGNTOpen Greek New Testament Project; NA28 / NA27 Equivalent Text & Resources
Stars: ✭ 55 (+129.17%)
HangulizeKorean Alphabet Transcription
Stars: ✭ 184 (+666.67%)
verbeccComplete Conjugation of any Verb using Machine Learning for French, Spanish, Portuguese, Italian and Romanian
Stars: ✭ 45 (+87.5%)
ProsodicProsodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.
Stars: ✭ 162 (+575%)
HangulizeHangulize transcribes non-Korean words into Hangul
Stars: ✭ 152 (+533.33%)
KoParadigmKoParadigm: Korean Inflectional Paradigm Generator
Stars: ✭ 48 (+100%)
Ipa DictMonolingual wordlists with pronunciation information in IPA
Stars: ✭ 139 (+479.17%)
TextGridToolsRead, write, and manipulate Praat TextGrid files with Python
Stars: ✭ 84 (+250%)
IchiranLinguistic tools for texts in Japanese language
Stars: ✭ 120 (+400%)
devPHOIBLE data and development.
Stars: ✭ 90 (+275%)
PyconllA minimal, pure Python library to interface with CoNLL-U format files.
Stars: ✭ 104 (+333.33%)
lametaThe Metadata Editor for Transparent Archiving of language document materials
Stars: ✭ 18 (-25%)
WikipronMassively multilingual pronunciation mining
Stars: ✭ 99 (+312.5%)
TextannotationgraphsA modular annotation system that supports complex, interactive annotation graphs embedded on top of sequences of text.
Stars: ✭ 73 (+204.17%)
dureeDurée: the longest book ever written.
Stars: ✭ 67 (+179.17%)
Yesterday I LearnedBrainfarts are caused by the rupturing of the cerebral sphincter.
Stars: ✭ 50 (+108.33%)
lambda-notebookLambda Notebook: Formal Semantics in Jupyter
Stars: ✭ 16 (-33.33%)
PsychopyFor running psychology and neuroscience experiments
Stars: ✭ 1,020 (+4150%)
NatLangNatLang is an English parser with an extensible grammar
Stars: ✭ 20 (-16.67%)
lingtypologyR package for linguistic cartography and typological database search
Stars: ✭ 47 (+95.83%)
concepticon-dataThe curation repository for the data behind Concepticon.
Stars: ✭ 25 (+4.17%)
wikipronMassively multilingual pronunciation mining
Stars: ✭ 167 (+595.83%)
foliaFoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for proces…
Stars: ✭ 56 (+133.33%)
eliza-rsA Rust implementation of ELIZA, a natural language processing program developed by Joseph Weizenbaum in 1966.
Stars: ✭ 48 (+100%)
proiel-treebankOfficial releases of the PROIEL treebank of ancient Indo-European languages
Stars: ✭ 30 (+25%)