110 open source projects that are alternatives to or similar to SpiCE-Corpus

Blade-and-Soul-2-Localization

Localization For Blade and Soul 2

Stars: ✭ 17 (-48.48%)

Mutual labels: english-language

Chinese Names Corpus

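Chinese personal names corpus and name generator. Covers Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Useful for Chinese word segmentation and person-name entity recognition.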

Stars: ✭ 3,053 (+9151.52%)

Mutual labels: corpus

BotSmartScheduler

Enhance your planning capabilities with this smart bot!

Stars: ✭ 44 (+33.33%)

Mutual labels: english-language

Chatbot-Training-Corpus

A collection of text corpora that can be used for hands-on chatbot training, covering both Chinese and English.

Stars: ✭ 117 (+254.55%)

Mutual labels: corpus

Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard

Stars: ✭ 2,425 (+7248.48%)

Mutual labels: corpus

text-classification-cn

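Chinese text classification in practice, based on the Sogou news corpus, using traditional machine learning methods as well as pre-trained models.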

Stars: ✭ 81 (+145.45%)

Mutual labels: corpus

Rich Context leaderboard competition, including the corpus and current SOTA for required tasks.

Stars: ✭ 20 (-39.39%)

Mutual labels: corpus

Poetic corpus of the Russian language

Stars: ✭ 40 (+21.21%)

Mutual labels: corpus

Efaqa Corpus Zh

❤️ Emotional First Aid Dataset: a psychological counseling Q&A and chatbot corpus

Stars: ✭ 170 (+415.15%)

Mutual labels: corpus

Text corpus for Bahasa Malaysia, https://malaya.readthedocs.io/en/latest/Dataset.html

Stars: ✭ 189 (+472.73%)

Mutual labels: corpus

opensource-voice-tools

A repo listing known open source voice tools, ordered by where they sit in the voice stack

Stars: ✭ 21 (-36.36%)

Mutual labels: corpus

Awesome Chatbot

Awesome Chatbot Projects, Corpus, Papers, Tutorials. Chinese Chatbot =>:

Stars: ✭ 1,785 (+5309.09%)

Mutual labels: corpus

No description or website provided.

Stars: ✭ 33 (+0%)

Mutual labels: corpus

Speech-Corpus-Collection

A Collection of Speech Corpus for ASR and TTS

Stars: ✭ 113 (+242.42%)

Mutual labels: corpus

A multilingual parallel corpus created from translations of the Bible.

Stars: ✭ 115 (+248.48%)

Mutual labels: corpus

A free, open-source, offline Cantonese Dictionary for Windows, Mac, and Linux. Qt, SQLite. C++ and Python.

Stars: ✭ 67 (+103.03%)

Mutual labels: cantonese-language

A project that collects news from Thai government websites

Stars: ✭ 19 (-42.42%)

Mutual labels: corpus

A merged version of multiple open-source German speech datasets.

Stars: ✭ 21 (-36.36%)

Mutual labels: corpus

Computer tools for the Thai language

Stars: ✭ 20 (-39.39%)

Mutual labels: corpus

Weibo terminater

Final Weibo Crawler: scrape anything from Weibo, including comments, Weibo contents, followers, anything. The Terminator

Stars: ✭ 2,295 (+6854.55%)

Mutual labels: corpus

A meta-corpus of functional harmonic analysis.

Stars: ✭ 35 (+6.06%)

Mutual labels: corpus

Indonesian Nlp Resources

Data resources for Indonesian-language NLP

Stars: ✭ 143 (+333.33%)

Mutual labels: corpus

CBLUE, a benchmark for Chinese medical information processing: A Chinese Biomedical Language Understanding Evaluation Benchmark

Stars: ✭ 379 (+1048.48%)

Mutual labels: corpus

Gossiping Chinese Corpus

Chinese question-and-answer corpus from the PTT Gossiping board

Stars: ✭ 137 (+315.15%)

Mutual labels: corpus

Repository for the Georgetown University Multilayer Corpus (GUM)

Stars: ✭ 71 (+115.15%)

Mutual labels: corpus

Text conversion tool (from e.g. Word, HTML, txt) to the corpus formats TEI or FoLiA

Stars: ✭ 20 (-39.39%)

Mutual labels: corpus

KH Coder: for Quantitative Content Analysis or Text Mining

Stars: ✭ 126 (+281.82%)

Mutual labels: corpus

mw-thesaurus.el

Merriam-Webster Thesaurus in Emacs

Stars: ✭ 84 (+154.55%)

Mutual labels: english-language

Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments

Stars: ✭ 711 (+2054.55%)

Mutual labels: corpus

Kyoto University Web Document Leads Corpus

Stars: ✭ 64 (+93.94%)

Mutual labels: corpus

TVsub: DCU-Tencent Chinese-English Dialogue Corpus

Stars: ✭ 40 (+21.21%)

Mutual labels: corpus

We present a list of languages with their codes, families, regions, etc. We also present a list of multilingual corpora (with URLs).

Stars: ✭ 70 (+112.12%)

Mutual labels: corpus

proiel-treebank

Official releases of the PROIEL treebank of ancient Indo-European languages

Stars: ✭ 30 (-9.09%)

Mutual labels: corpus

FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for proces…

Stars: ✭ 56 (+69.7%)

Mutual labels: corpus

DANeS is an open-source e-newspaper dataset created through a collaboration between DATASET JSC (dataset.vn) and AIV Group (aivgroup.vn)

Stars: ✭ 64 (+93.94%)

Mutual labels: corpus

kanji-frequency

Kanji usage frequency data collected from various sources

Stars: ✭ 92 (+178.79%)

Mutual labels: corpus

Probabilistic-RNN-DA-Classifier

Probabilistic Dialogue Act Classification for the Switchboard Corpus using an LSTM model

Stars: ✭ 22 (-33.33%)

Mutual labels: corpus

CLUEmotionAnalysis2020

CLUE Emotion Analysis Dataset: a fine-grained sentiment analysis dataset

Stars: ✭ 3 (-90.91%)

Mutual labels: corpus

A list of ~100,000 German nouns and their grammatical properties, compiled from WiktionaryDE as a CSV file. Plus a module to look up the data and parse compound words.

Stars: ✭ 101 (+206.06%)

Mutual labels: corpus

MEV Data Corpus

Stars: ✭ 77 (+133.33%)

Mutual labels: corpus

Dialogue-Corpus

No description or website provided.

Stars: ✭ 27 (-18.18%)

Mutual labels: corpus

OneStopEnglishCorpus

No description or website provided.

Stars: ✭ 38 (+15.15%)

Mutual labels: corpus

Awesome Deeplearning Resources

Deep learning and deep reinforcement learning research papers and some code

Stars: ✭ 2,483 (+7424.24%)

Mutual labels: corpus

The Business Scene Dialogue corpus

Stars: ✭ 51 (+54.55%)

Mutual labels: corpus

Cornell NLVR and NLVR2 are natural language grounding datasets. Each example shows a visual input and a sentence describing it, and is annotated with the truth-value of the sentence.

Stars: ✭ 192 (+481.82%)

Mutual labels: corpus

Python script to quickly create hand-crafted PDF files

Stars: ✭ 17 (-48.48%)

Mutual labels: corpus

Nlp bahasa resources

A Curated List of Datasets and Usable Library Resources for NLP in Bahasa Indonesia

Stars: ✭ 158 (+378.79%)

Mutual labels: corpus

Text collections made available by the CLiGS group.

Stars: ✭ 19 (-42.42%)

Mutual labels: corpus

WP2TXT extracts plain text data from Wikipedia dump files (encoded in XML and compressed with Bzip2), stripping all MediaWiki markup and other metadata.

Stars: ✭ 145 (+339.39%)

Mutual labels: corpus

named-entity-recognition-template

Build a deep learning model for predicting the named entities from text.

Stars: ✭ 51 (+54.55%)

Mutual labels: corpus

Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text

Stars: ✭ 139 (+321.21%)

Mutual labels: corpus

open2ch-dialogue-corpus

A dialogue corpus created by crawling Open2ch (おーぷん2ちゃんねる)

Stars: ✭ 65 (+96.97%)

Mutual labels: corpus

Code Docstring Corpus

Preprocessed Python functions and docstrings for automated code documentation (code2doc) and automated code generation (doc2code) tasks.

Stars: ✭ 137 (+315.15%)

Mutual labels: corpus

egret-wenda-corpus

A Public Corpus for Machine Learning

Stars: ✭ 41 (+24.24%)

Mutual labels: corpus

New York Times Word Innovation Types dataset

Stars: ✭ 21 (-36.36%)

Mutual labels: corpus

An Open-Source Package for Chinese Open-domain Conversational Chatbot (a Chinese chit-chat dialogue system with one-click deployment of a WeChat chit-chat bot)

Stars: ✭ 94 (+184.85%)

Mutual labels: corpus

PubMed-PICO-Detection

PubMed PICO Element Detection Dataset

Stars: ✭ 37 (+12.12%)

Mutual labels: corpus

A greppable archive of ClojureScript code

Stars: ✭ 37 (+12.12%)

Mutual labels: corpus

Japanese Realistic Textual Entailment Corpus (NLP 2020, LREC 2020)

Stars: ✭ 66 (+100%)

Mutual labels: corpus

Convert a PDF via OCR to a TXT file in UTF-8 encoding

Stars: ✭ 90 (+172.73%)

Mutual labels: corpus

1-60 of 110 similar projects