66 open source projects that are alternatives to, or similar to, Opencorpora

linguisticsdown

Easy Linguistics Document Writing with R Markdown

Stars: ✭ 24 (-88.24%)

Mutual labels: linguistics

expletives

Expletives vomiting library...

Stars: ✭ 12 (-94.12%)

Mutual labels: linguistics

Weixin public corpus

WeChat official accounts corpus

Stars: ✭ 465 (+127.94%)

Mutual labels: linguistics

folia

FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for proces…

Stars: ✭ 56 (-72.55%)

Mutual labels: linguistics

Onset

A language evolution simulator, using realistic phonetic changes.

Stars: ✭ 30 (-85.29%)

Mutual labels: linguistics

Psychopy

For running psychology and neuroscience experiments

Stars: ✭ 1,020 (+400%)

Mutual labels: linguistics

eliza-rs

A Rust implementation of ELIZA - a natural language processing program developed by Joseph Weizenbaum in 1966.

Stars: ✭ 48 (-76.47%)

Mutual labels: linguistics

Pyconll

A minimal, pure Python library to interface with CoNLL-U format files.

Stars: ✭ 104 (-49.02%)

Mutual labels: linguistics

ngramr

R package to query the Google Ngram Viewer

Stars: ✭ 46 (-77.45%)

Mutual labels: linguistics

spanish-corpora

Unannotated Spanish 3 Billion Words Corpora

Stars: ✭ 61 (-70.1%)

Mutual labels: linguistics

mystem

CGo bindings to Yandex.Mystem

Stars: ✭ 28 (-86.27%)

Mutual labels: linguistics

proiel-treebank

Official releases of the PROIEL treebank of ancient Indo-European languages

Stars: ✭ 30 (-85.29%)

Mutual labels: linguistics

Yesterday I Learned

Brainfarts are caused by the rupturing of the cerebral sphincter.

Stars: ✭ 50 (-75.49%)

Mutual labels: linguistics

TextDatasetCleaner

🔬 Cleaning junk out of datasets (normalization, preprocessing)

Stars: ✭ 27 (-86.76%)

Mutual labels: linguistics

Ichiran

Linguistic tools for texts in Japanese language

Stars: ✭ 120 (-41.18%)

Mutual labels: linguistics

lingvo--Ner-ru

Named entity recognition (NER) in Russian texts

Stars: ✭ 38 (-81.37%)

Mutual labels: linguistics

Awesome Sentiment Analysis

😀😄😂😭 A curated list of Sentiment Analysis methods, implementations and misc. 😥😟😱😤

Stars: ✭ 816 (+300%)

Mutual labels: linguistics

mlconjug3

A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.

Stars: ✭ 47 (-76.96%)

Mutual labels: linguistics

Hangulize

Hangulize transcribes non-Korean words into Hangul

Stars: ✭ 152 (-25.49%)

Mutual labels: linguistics

langua

A suite of language tools

Stars: ✭ 29 (-85.78%)

Mutual labels: linguistics

Pynlpl

PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build a simple language model. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).

Stars: ✭ 426 (+108.82%)

Mutual labels: linguistics

linguistics problems

Natural language processing in examples and games

Stars: ✭ 23 (-88.73%)

Mutual labels: linguistics

Wikipron

Massively multilingual pronunciation mining

Stars: ✭ 99 (-51.47%)

Mutual labels: linguistics

event-embedding-multitask

*SEM 2018: Learning Distributed Event Representations with a Multi-Task Approach

Stars: ✭ 22 (-89.22%)

Mutual labels: linguistics

concepticon-data

The curation repository for the data behind Concepticon.

Stars: ✭ 25 (-87.75%)

Mutual labels: linguistics

OpenGNT

Open Greek New Testament Project; NA28 / NA27 Equivalent Text & Resources

Stars: ✭ 55 (-73.04%)

Mutual labels: linguistics

nyt-first-said

Tweets when words are published for the first time in the NYT

Stars: ✭ 222 (+8.82%)

Mutual labels: linguistics

Beta

An open source reimplementation of Benny Brodda's BETA in Python

Stars: ✭ 65 (-68.14%)

Mutual labels: linguistics

duree

Durée: the longest book ever written.

Stars: ✭ 67 (-67.16%)

Mutual labels: linguistics

Corpuscrawler

Crawler for linguistic corpora

Stars: ✭ 127 (-37.75%)

Mutual labels: linguistics

clinical nlp elastic

Clinical NLP Analysis with Elasticsearch and Kibana

Stars: ✭ 32 (-84.31%)

Mutual labels: linguistics

Python Datamuse

Python 3 wrapper for the Datamuse API

Stars: ✭ 47 (-76.96%)

Mutual labels: linguistics

neural-net-linguistics

Papers about NN and linguistics

Stars: ✭ 14 (-93.14%)

Mutual labels: linguistics

Tossi

Chooses correct Korean particle morphs for arbitrary words.

Stars: ✭ 160 (-21.57%)

Mutual labels: linguistics

lameta

The Metadata Editor for Transparent Archiving of language document materials

Stars: ✭ 18 (-91.18%)

Mutual labels: linguistics

Phonemes

Jason Riggle's chart of phonological features in JSON format + extras

Stars: ✭ 33 (-83.82%)

Mutual labels: linguistics

NatLang

NatLang is an English parser with an extensible grammar

Stars: ✭ 20 (-90.2%)

Mutual labels: linguistics

Colibri Core

Colibri Core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.

Stars: ✭ 112 (-45.1%)

Mutual labels: linguistics

LangPad

A word processor/dictionary/generally useful tool for linguistics.

Stars: ✭ 20 (-90.2%)

Mutual labels: linguistics

Nltk data

NLTK Data

Stars: ✭ 675 (+230.88%)

Mutual labels: linguistics

libpalaso

Palaso Library: A set of .Net libraries useful for developers of Language Software.

Stars: ✭ 36 (-82.35%)

Mutual labels: linguistics

Rime Cantonese

Rime Cantonese input schema (Cantonese romanization input scheme)

Stars: ✭ 173 (-15.2%)

Mutual labels: linguistics

verbecc

Complete Conjugation of any Verb using Machine Learning for French, Spanish, Portuguese, Italian and Romanian

Stars: ✭ 45 (-77.94%)

Mutual labels: linguistics

Lexpredict Lexnlp

LexNLP by LexPredict

Stars: ✭ 439 (+115.2%)

Mutual labels: linguistics

KoParadigm

KoParadigm: Korean Inflectional Paradigm Generator

Stars: ✭ 48 (-76.47%)

Mutual labels: linguistics

Elpis

🙊 WIP software for creating speech recognition models.

Stars: ✭ 101 (-50.49%)

Mutual labels: linguistics

dev

PHOIBLE data and development.

Stars: ✭ 90 (-55.88%)

Mutual labels: linguistics

rsyntaxtree

Syntax tree generator made with Ruby and RMagick

Stars: ✭ 62 (-69.61%)

Mutual labels: linguistics

corpusexplorer2.0

Corpus linguistics has never been so easy...

Stars: ✭ 16 (-92.16%)

Mutual labels: linguistics

Pycantonese

Cantonese Linguistics and NLP in Python

Stars: ✭ 147 (-27.94%)

Mutual labels: linguistics

lambda-notebook

Lambda Notebook: Formal Semantics in Jupyter

Stars: ✭ 16 (-92.16%)

Mutual labels: linguistics

treebender

An HDPSG-inspired symbolic natural language parser written in Rust

Stars: ✭ 24 (-88.24%)

Mutual labels: linguistics

lingtypology

R package for linguistic cartography and typological databases search

Stars: ✭ 47 (-76.96%)

Mutual labels: linguistics

Flat

FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.github.io/folia), a rich XML-based format for linguistic annotation. Flat allows users to view annotated FoLiA documents and enrich these documents with new annotations; a wide variety of linguistic annotation types is supported through the FoLiA paradigm.

Stars: ✭ 93 (-54.41%)

Mutual labels: linguistics

TextGridTools

Read, write, and manipulate Praat TextGrid files with Python

Stars: ✭ 84 (-58.82%)

Mutual labels: linguistics

Hangulize

Korean Alphabet Transcription

Stars: ✭ 184 (-9.8%)

Mutual labels: linguistics

Prosodic

Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.