khiajohnson / SpiCE-Corpus

Licence: other
An open-access corpus of conversational bilingual speech in Cantonese and English

Programming Languages

JavaScript
HTML
CSS

Projects that are alternatives of or similar to SpiCE-Corpus

text-classification-cn
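Chinese text classification in practice, based on the Sogou news corpus, using traditional machine learning methods as well as pre-trained models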
Stars: ✭ 81 (+145.45%)
Mutual labels:  corpus
PoetryCorpus
Poetic corpus of the Russian language
Stars: ✭ 40 (+21.21%)
Mutual labels:  corpus
thai-language
Computer tools for the Thai language
Stars: ✭ 20 (-39.39%)
Mutual labels:  corpus
TV4Dialog
No description or website provided.
Stars: ✭ 33 (+0%)
Mutual labels:  corpus
CBLUE
CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark (a benchmark for Chinese medical information processing)
Stars: ✭ 379 (+1048.48%)
Mutual labels:  corpus
bible-corpus
A multilingual parallel corpus created from translations of the Bible.
Stars: ✭ 115 (+248.48%)
Mutual labels:  corpus
thaigov-corpus
A project to collect news from the Thai government website
Stars: ✭ 19 (-42.42%)
Mutual labels:  corpus
OneStopEnglishCorpus
No description or website provided.
Stars: ✭ 38 (+15.15%)
Mutual labels:  corpus
pdf-corpus
Python script to quickly create hand-crafted PDF files
Stars: ✭ 17 (-48.48%)
Mutual labels:  corpus
named-entity-recognition-template
Build a deep learning model for predicting the named entities from text.
Stars: ✭ 51 (+54.55%)
Mutual labels:  corpus
mw-thesaurus.el
Merriam-Webster Thesaurus in Emacs
Stars: ✭ 84 (+154.55%)
Mutual labels:  english-language
egret-wenda-corpus
A Public Corpus for Machine Learning
Stars: ✭ 41 (+24.24%)
Mutual labels:  corpus
KWDLC
Kyoto University Web Document Leads Corpus
Stars: ✭ 64 (+93.94%)
Mutual labels:  corpus
LanguageCodes
We present a list of languages with their codes, families, regions, etc. We also present a list of multilingual corpora (with URLs).
Stars: ✭ 70 (+112.12%)
Mutual labels:  corpus
folia
FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for proces…
Stars: ✭ 56 (+69.7%)
Mutual labels:  corpus
kanji-frequency
Kanji usage frequency data collected from various sources
Stars: ✭ 92 (+178.79%)
Mutual labels:  corpus
CLUEmotionAnalysis2020
CLUE Emotion Analysis Dataset, a fine-grained sentiment analysis dataset
Stars: ✭ 3 (-90.91%)
Mutual labels:  corpus
OpenDialog
An Open-Source Package for Chinese Open-domain Conversational Chatbot (a Chinese chit-chat dialogue system with one-click deployment of a WeChat chatbot)
Stars: ✭ 94 (+184.85%)
Mutual labels:  corpus
PubMed-PICO-Detection
PubMed PICO Element Detection Dataset
Stars: ✭ 37 (+12.12%)
Mutual labels:  corpus
cljs-corpus
A greppable archive of ClojureScript code
Stars: ✭ 37 (+12.12%)
Mutual labels:  corpus

SpiCE Corpus

SpiCE is an open-access corpus of conversational bilingual Speech in Cantonese and English. It was initially released in May 2021. See the documentation at https://spice-corpus.readthedocs.io/ for details on access, design, and research with the corpus.

Contact [email protected] with any questions.
