pylangacq
Language Acquisition Research Tools
Stars: ✭ 33 (+43.48%)
folia
FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for proces…
Stars: ✭ 56 (+143.48%)
wikipron
Massively multilingual pronunciation mining
Stars: ✭ 167 (+626.09%)
rnn darts fastai
Implement Differentiable Architecture Search (DARTS) for RNN with fastai
Stars: ✭ 21 (-8.7%)
referit3d
Code accompanying our ECCV-2020 paper on 3D Neural Listeners.
Stars: ✭ 59 (+156.52%)
Awesome Linguistics
A curated list of anything remotely related to linguistics
Stars: ✭ 207 (+800%)
CommonCoreOntologies
The Common Core Ontology Repository holds the current released version of the Common Core Ontology suite.
Stars: ✭ 109 (+373.91%)
auto-gfqg
Automatic Gap-Fill Question Generation
Stars: ✭ 17 (-26.09%)
Prosodic
Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.
Stars: ✭ 162 (+604.35%)
mozolm
MozoLM: A language model (LM) serving library
Stars: ✭ 32 (+39.13%)
codeprep
A toolkit for pre-processing large source code corpora
Stars: ✭ 39 (+69.57%)
bllip-parser
BLLIP reranking parser (also known as Charniak-Johnson parser, Charniak parser, Brown reranking parser). See http://pypi.python.org/pypi/bllipparser/ for the Python module.
Stars: ✭ 217 (+843.48%)
IndRNN pytorch
Independently Recurrent Neural Networks (IndRNN) implemented in PyTorch.
Stars: ✭ 112 (+386.96%)
Hangulize
Korean Alphabet Transcription
Stars: ✭ 184 (+700%)
Onset
A language evolution simulator, using realistic phonetic changes.
Stars: ✭ 30 (+30.43%)
WonderfulPolishLanguage
A list of resources for learning and exploring the wonderful Polish language.
Stars: ✭ 31 (+34.78%)
Hangulize
Hangulize transcribes non-Korean words into Hangul
Stars: ✭ 152 (+560.87%)
Ipa Dict
Monolingual wordlists with pronunciation information in IPA
Stars: ✭ 139 (+504.35%)
Ichiran
Linguistic tools for Japanese-language texts
Stars: ✭ 120 (+421.74%)
perke
A keyphrase extractor for Persian
Stars: ✭ 60 (+160.87%)
COCO-LM
[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
Stars: ✭ 109 (+373.91%)
Pyconll
A minimal, pure Python library to interface with CoNLL-U format files.
Stars: ✭ 104 (+352.17%)
LNEx
📍 🏢 🏦 🏣 🏪 🏬 LNEx: Location Name Extractor
Stars: ✭ 21 (-8.7%)
nyt-first-said
Tweets when words are published for the first time in the NYT
Stars: ✭ 222 (+865.22%)
theano-recurrence
Recurrent Neural Networks (RNN, GRU, LSTM) and their Bidirectional versions (BiRNN, BiGRU, BiLSTM) for word & character level language modelling in Theano
Stars: ✭ 40 (+73.91%)
event-embedding-multitask
*SEM 2018: Learning Distributed Event Representations with a Multi-Task Approach
Stars: ✭ 22 (-4.35%)
lingua-go
👄 The most accurate natural language detection library for Go, suitable for long and short text alike
Stars: ✭ 684 (+2873.91%)
tributech-catalog-api
Tributech Catalog - Create and manage your DTDL models using our graphical interface and store them using our APIs.
Stars: ✭ 15 (-34.78%)
deepblast
Neural Networks for Protein Sequence Alignment
Stars: ✭ 29 (+26.09%)
pytorch-translm
An implementation of a transformer-based language model for sentence rewriting tasks such as summarization, simplification, and grammatical error correction.
Stars: ✭ 22 (-4.35%)
Wikipron
Massively multilingual pronunciation mining
Stars: ✭ 99 (+330.43%)
group-transformer
Official code for Group-Transformer (Scale down Transformer by Grouping Features for a Lightweight Character-level Language Model, COLING-2020).
Stars: ✭ 21 (-8.7%)
lingtypology
R package for linguistic cartography and typological database search
Stars: ✭ 47 (+104.35%)
Opencorpora
A web-based engine for creating and annotating textual corpora
Stars: ✭ 204 (+786.96%)
poesy
Poetic processing, for Python.
Stars: ✭ 28 (+21.74%)
Rime Cantonese
Rime Cantonese input schema | 粵語拼音輸入方案
Stars: ✭ 173 (+652.17%)
esapp
An unsupervised Chinese word segmentation tool.
Stars: ✭ 13 (-43.48%)
Textannotationgraphs
A modular annotation system that supports complex, interactive annotation graphs embedded on top of sequences of text.
Stars: ✭ 73 (+217.39%)
Tossi
Chooses correct Korean particle morphs for arbitrary words.
Stars: ✭ 160 (+595.65%)
pfootprint
Political Discourse Analysis Using Pre-Trained Word Vectors.
Stars: ✭ 20 (-13.04%)
Pycantonese
Cantonese Linguistics and NLP in Python
Stars: ✭ 147 (+539.13%)
proiel-treebank
Official releases of the PROIEL treebank of ancient Indo-European languages
Stars: ✭ 30 (+30.43%)
Corpuscrawler
Crawler for linguistic corpora
Stars: ✭ 127 (+452.17%)
FUTURE
A private, free, open-source search engine built on a P2P network
Stars: ✭ 19 (-17.39%)
Colibri Core
Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.
Stars: ✭ 112 (+386.96%)
Luci
Logical Unity for Communicational Interactivity
Stars: ✭ 25 (+8.7%)
Elpis
🙊 WIP software for creating speech recognition models.
Stars: ✭ 101 (+339.13%)
Flat
FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.github.io/folia), a rich XML-based format for linguistic annotation. Flat allows users to view annotated FoLiA documents and enrich these documents with new annotations. A wide variety of linguistic annotation types is supported through the FoLiA paradigm.
Stars: ✭ 93 (+304.35%)
Fill-the-GAP
[ACL-WS] 4th place solution to the gendered pronoun resolution challenge on Kaggle
Stars: ✭ 13 (-43.48%)
Beta
An open source reimplementation of Benny Brodda's BETA in Python
Stars: ✭ 65 (+182.61%)
Darts
Differentiable architecture search for convolutional and recurrent networks
Stars: ✭ 3,463 (+14956.52%)
tape-neurips2019
Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology. (DEPRECATED)
Stars: ✭ 117 (+408.7%)
viky-ai
Natural Language Processing platform for extracting information from unstructured text.
Stars: ✭ 38 (+65.22%)
lambda-notebook
Lambda Notebook: Formal Semantics in Jupyter
Stars: ✭ 16 (-30.43%)