Repository for the experiments described in the paper named "DeepSentiPers: Novel Deep Learning Models Trained Over Proposed Augmented Persian Sentiment Corpus"

Stars: ✭ 17 (-96.32%)

Mutual labels: corpus

berserker

Berserker - BERt chineSE woRd toKenizER

Stars: ✭ 17 (-96.32%)

Mutual labels: chinese-nlp

Zhparser

zhparser is a PostgreSQL extension for full-text search of Chinese-language text

Stars: ✭ 418 (-9.52%)

Mutual labels: chinese-nlp

SpiCE-Corpus

An open-access corpus of conversational bilingual speech in Cantonese and English

Stars: ✭ 33 (-92.86%)

Mutual labels: corpus

LanguageCodes

A list of languages with their codes, families, regions, etc., plus a list of multilingual corpora (with URLs).

Stars: ✭ 70 (-84.85%)

Mutual labels: corpus

kanji-frequency

Kanji usage frequency data collected from various sources

Stars: ✭ 92 (-80.09%)

Mutual labels: corpus

thaigov-corpus

A project that collects news from the Thai government's websites

Stars: ✭ 19 (-95.89%)

Mutual labels: corpus

Fakenewscorpus

A dataset of millions of news articles scraped from a curated list of data sources.

Stars: ✭ 255 (-44.81%)

Mutual labels: corpus

chinese-nlp-ner

A BLSTM-CRF solution for Chinese named entity recognition

Stars: ✭ 14 (-96.97%)

Mutual labels: chinese-nlp

BSD

The Business Scene Dialogue corpus

Stars: ✭ 51 (-88.96%)

Mutual labels: corpus

named-entity-recognition-template

Build a deep learning model for predicting the named entities from text.

Stars: ✭ 51 (-88.96%)

Mutual labels: corpus

EdgarAllanPoetry

Computer-generated poetry

Stars: ✭ 22 (-95.24%)

Mutual labels: corpus

ChineseBert

A Chinese BERT model built specifically for question answering

Stars: ✭ 24 (-94.81%)

Mutual labels: chinese-nlp

Ltp

Language Technology Platform

Stars: ✭ 3,648 (+689.61%)

Mutual labels: chinese-nlp

KWDLC

Kyoto University Web Document Leads Corpus

Stars: ✭ 64 (-86.15%)

Mutual labels: corpus

fastmorph

Fast corpus search engine originally made for the Corpus of Written Tatar language

Stars: ✭ 14 (-96.97%)

Mutual labels: corpus

CLUEmotionAnalysis2020

CLUE Emotion Analysis Dataset (a fine-grained sentiment analysis dataset)

Stars: ✭ 3 (-99.35%)

Mutual labels: corpus

Corpora

A collection of small corpora of interesting data for building bots and similar projects.

Stars: ✭ 4,293 (+829.22%)

Mutual labels: corpus

pdf-corpus

Python script to quickly create hand-crafted PDF files

Stars: ✭ 17 (-96.32%)

Mutual labels: corpus

Species-Names-Corpus

A corpus of species names: plant names and animal names.

Stars: ✭ 23 (-95.02%)

Mutual labels: corpus

egret-wenda-corpus

A Public Corpus for Machine Learning

Stars: ✭ 41 (-91.13%)

Mutual labels: corpus

Thulac Java

An Efficient Lexical Analyzer for Chinese

Stars: ✭ 285 (-38.31%)

Mutual labels: chinese-nlp

THUCKE

THU Chinese Keyphrase Extraction Toolkit

Stars: ✭ 116 (-74.89%)

Mutual labels: chinese-nlp

dialogue-datasets

A collection of open dialogue corpora and some useful data-processing utilities.

Stars: ✭ 24 (-94.81%)

Mutual labels: corpus

ltp4j

ltp4j: Language Technology Platform For Java

Stars: ✭ 165 (-64.29%)

Mutual labels: chinese-nlp

Bookcorpus

Crawl BookCorpus

Stars: ✭ 443 (-4.11%)

Mutual labels: corpus

TV4Dialog

No description or website provided.

Stars: ✭ 33 (-92.86%)

Mutual labels: corpus

fuzzing-corpus

My fuzzing corpus

Stars: ✭ 120 (-74.03%)

Mutual labels: corpus

text-classification-cn

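Chinese text classification in practice, based on the Sogou news corpus, using traditional machine learning methods as well as pretrained models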

Stars: ✭ 81 (-82.47%)

Mutual labels: corpus

Korpora

Korean corpus repository

Stars: ✭ 270 (-41.56%)

Mutual labels: corpus

Electra with tensorflow

An implementation of ELECTRA following the paper "ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators"

Stars: ✭ 13 (-97.19%)

Mutual labels: chinese-nlp

OpenDialog

An open-source package for Chinese open-domain conversational chatbots (a Chinese chit-chat dialogue system with one-click deployment of a WeChat chit-chat bot)

Stars: ✭ 94 (-79.65%)

Mutual labels: corpus

mev-corpus

MEV Data Corpus

Stars: ✭ 77 (-83.33%)

Mutual labels: corpus

Wordless

An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation

Stars: ✭ 378 (-18.18%)

Mutual labels: corpus

OneStopEnglishCorpus

No description or website provided.

Stars: ✭ 38 (-91.77%)

Mutual labels: corpus

When-in-Rome

A meta-corpus of functional harmonic analysis.

Stars: ✭ 35 (-92.42%)

Mutual labels: corpus

textbox

Text collections made available by the CLiGS group.

Stars: ✭ 19 (-95.89%)

Mutual labels: corpus

malay-dataset

Text corpus for Bahasa Malaysia, https://malaya.readthedocs.io/en/latest/Dataset.html

Stars: ✭ 189 (-59.09%)

Mutual labels: corpus

Medical-Names-Corpus

A medical corpus: medical institution names and drug standard codes.

Stars: ✭ 26 (-94.37%)

Mutual labels: corpus

PubMed-PICO-Detection

PubMed PICO Element Detection Dataset

Stars: ✭ 37 (-91.99%)

Mutual labels: corpus

open2ch-dialogue-corpus

A dialogue corpus built by crawling Open2ch (おーぷん2ちゃんねる)

Stars: ✭ 65 (-85.93%)

Mutual labels: corpus

Awesome Persian Nlp Ir

Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources

Stars: ✭ 460 (-0.43%)

Mutual labels: corpus

folia

FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for proces…

Stars: ✭ 56 (-87.88%)

Mutual labels: corpus

gum

Repository for the Georgetown University Multilayer Corpus (GUM)

Stars: ✭ 71 (-84.63%)

Mutual labels: corpus

