ikegami-yukino / Dataset List
Licence: wtfpl
Lists of text corpora and more (mainly Japanese)
Stars: ✭ 84
Projects that are alternatives to or similar to Dataset List
Clue
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Stars: ✭ 2,425 (+2786.9%)
Mutual labels: dataset, corpus
Indonesian Nlp Resources
Data resources for Indonesian-language NLP
Stars: ✭ 143 (+70.24%)
Mutual labels: dataset, corpus
Dialog corpus
Corpora for training Chinese and English dialogue (chatbot) systems
Stars: ✭ 1,662 (+1878.57%)
Mutual labels: dataset, corpus
Nlp chinese corpus
Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+7823.81%)
Mutual labels: dataset, corpus
Chinese Names Corpus
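Chinese personal-name corpus and name generator. Covers Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Useful for Chinese word segmentation and person-name entity recognition.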
Stars: ✭ 3,053 (+3534.52%)
Mutual labels: dataset, corpus
Nlp bahasa resources
A Curated List of Dataset and Usable Library Resources for NLP in Bahasa Indonesia
Stars: ✭ 158 (+88.1%)
Mutual labels: dataset, corpus
Cluepretrainedmodels
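A collection of high-quality Chinese pre-trained models: state-of-the-art large models, fastest small models, and models specialized for similarity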
Stars: ✭ 493 (+486.9%)
Mutual labels: dataset, corpus
Fakenewscorpus
A dataset of millions of news articles scraped from a curated list of data sources.
Stars: ✭ 255 (+203.57%)
Mutual labels: dataset, corpus
Awesome Hungarian Nlp
A curated list of NLP resources for Hungarian
Stars: ✭ 121 (+44.05%)
Mutual labels: dataset, corpus
Company Names Corpus
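Company-name and organization-name corpus. Covers company short names, abbreviations, brand terms, and enterprise names. Useful for Chinese word segmentation and organization-name entity recognition.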
Stars: ✭ 868 (+933.33%)
Mutual labels: dataset, corpus
Ua Gec
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
Stars: ✭ 108 (+28.57%)
Mutual labels: dataset, corpus
Prosody
Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text
Stars: ✭ 139 (+65.48%)
Mutual labels: dataset, corpus
Google Covid19 Mobility Reports
Data extraction of Google's COVID-19 Mobility Reports
Stars: ✭ 82 (-2.38%)
Mutual labels: dataset
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-5.95%)
Mutual labels: dataset