Typing Assistant autocompletes words and suggests predictions for the next word, making typing faster, smarter, and less effortful.

Stars: ✭ 32 (-66.67%)

Mutual labels: corpus

OpenDialog

An Open-Source Package for Chinese Open-domain Conversational Chatbot (a Chinese chit-chat dialogue system; one-click deployment of a WeChat chit-chat bot)

Stars: ✭ 94 (-2.08%)

Mutual labels: corpus

Bookcorpus

Crawl BookCorpus

Stars: ✭ 443 (+361.46%)

Mutual labels: corpus

thai-language

Computer tools for the Thai language

Stars: ✭ 20 (-79.17%)

Mutual labels: corpus

Blacklab

A corpus retrieval engine based on Apache Lucene

Stars: ✭ 69 (-28.12%)

Mutual labels: corpus

bible-corpus

A multilingual parallel corpus created from translations of the Bible.

Stars: ✭ 115 (+19.79%)

Mutual labels: corpus

Fuzzdata

Fuzzing resources for feeding various fuzzers with input. 🔧

Stars: ✭ 376 (+291.67%)

Mutual labels: corpus

CBLUE

CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark (a benchmark for Chinese medical information processing)

Stars: ✭ 379 (+294.79%)

Mutual labels: corpus

Insuranceqa Corpus Zh

🚁 Insurance industry corpus, chatbot

Stars: ✭ 821 (+755.21%)

Mutual labels: corpus

LanguageCodes

We present a list of languages with their codes, families, regions, etc. We also present a list of multilingual corpora (with URLs).

Stars: ✭ 70 (-27.08%)

Mutual labels: corpus

Medical-Names-Corpus

Medical corpus. Corpus of medical institution names. Drug standard codes.

Stars: ✭ 26 (-72.92%)

Mutual labels: corpus

wordfish-python

Extract relationships from standardized terms in a corpus of interest with deep learning 🐟

Stars: ✭ 19 (-80.21%)

Mutual labels: corpus

thaigov-corpus

A project to collect news from the Thai government website

Stars: ✭ 19 (-80.21%)

Mutual labels: corpus

Quanteda

An R package for the Quantitative Analysis of Textual Data

Stars: ✭ 647 (+573.96%)

Mutual labels: corpus

open-discourse

Open Discourse is the first fully comprehensive corpus of the plenary proceedings of the federal German Parliament (Bundestag).

Stars: ✭ 47 (-51.04%)

Mutual labels: corpus

Chatterbot Corpus

A multilingual dialog corpus

Stars: ✭ 964 (+904.17%)

Mutual labels: corpus

DeepSentiPers

Repository for the experiments described in the paper named "DeepSentiPers: Novel Deep Learning Models Trained Over Proposed Augmented Persian Sentiment Corpus"

Stars: ✭ 17 (-82.29%)

Mutual labels: corpus

Weixin public corpus

Corpus of WeChat official accounts

Stars: ✭ 465 (+384.38%)

Mutual labels: corpus

Filipino-Text-Benchmarks

Open-source benchmark datasets and pretrained transformer models in the Filipino language.

Stars: ✭ 22 (-77.08%)

Mutual labels: corpus

Russian news corpus

Russian mass media stemmed texts corpus (a corpus of lemmatized, i.e. morphologically normalized, texts from Russian mass media)

Stars: ✭ 76 (-20.83%)

Mutual labels: corpus

SpiCE-Corpus

An open-access corpus of conversational bilingual speech in Cantonese and English

Stars: ✭ 33 (-65.62%)

Mutual labels: corpus

Awesome Persian Nlp Ir

Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources

Stars: ✭ 460 (+379.17%)

Mutual labels: corpus

OneStopEnglishCorpus

No description or website provided.

Stars: ✭ 38 (-60.42%)

Mutual labels: corpus

Lyrics Corpora

An unofficial Python API that allows users to create a corpus of lyrical text from their favorite artists and Billboard charts

Stars: ✭ 13 (-86.46%)

Mutual labels: corpus

folia

FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for proces…

Stars: ✭ 56 (-41.67%)

Mutual labels: corpus

Chinese Nlp Corpus

Collections of Chinese NLP corpora

Stars: ✭ 438 (+356.25%)

Mutual labels: corpus

named-entity-recognition-template

Build a deep learning model for predicting the named entities from text.

Stars: ✭ 51 (-46.87%)

Mutual labels: corpus

Dataset List

Lists of text corpora and more (mainly Japanese)

Stars: ✭ 84 (-12.5%)

Mutual labels: corpus

KWDLC

Kyoto University Web Document Leads Corpus

Stars: ✭ 64 (-33.33%)

Mutual labels: corpus

Wordless

An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation

Stars: ✭ 378 (+293.75%)

Mutual labels: corpus

CLUEmotionAnalysis2020

CLUE Emotion Analysis Dataset (a fine-grained sentiment analysis dataset)

Stars: ✭ 3 (-96.87%)

Mutual labels: corpus

Naive Bayes Classifier

Naive Bayes is a classification algorithm. This project uses the Bernoulli and Multinomial Naive Bayes models to classify text documents as ham or spam.

Stars: ✭ 6 (-93.75%)

Mutual labels: corpus

pdf-corpus

Python script to quickly create hand-crafted PDF files

Stars: ✭ 17 (-82.29%)

Mutual labels: corpus

Cluecorpus2020

Large-scale Pre-training Corpus for Chinese (100G Chinese pre-training corpus)

Stars: ✭ 278 (+189.58%)

Mutual labels: corpus

egret-wenda-corpus

A Public Corpus for Machine Learning