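Company Names Corpus
A corpus of company names and organization names: company short names, abbreviations, brand terms, and enterprise names. Usable for Chinese word segmentation and organization named-entity recognition.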
Stars: ✭ 868 (+3673.91%)
Mutual labels: corpus, dataset, dict
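Chinese Names Corpus
A corpus of Chinese personal names and a name generator: Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Usable for Chinese word segmentation and person named-entity recognition.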
Stars: ✭ 3,053 (+13173.91%)
Mutual labels: corpus, dataset, dict
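Cluepretrainedmodels
A collection of high-quality Chinese pre-trained models: state-of-the-art large models, the fastest small models, and models specialized for similarity.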
Stars: ✭ 493 (+2043.48%)
Mutual labels: corpus, dataset
Nlp chinese corpus
Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+28839.13%)
Mutual labels: corpus, dataset
Ua Gec
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
Stars: ✭ 108 (+369.57%)
Mutual labels: corpus, dataset
Coarij
Corpus of Annual Reports in Japan
Stars: ✭ 55 (+139.13%)
Mutual labels: corpus, dataset
Awesome Hungarian Nlp
A curated list of NLP resources for Hungarian
Stars: ✭ 121 (+426.09%)
Mutual labels: corpus, dataset
Prosody
Helsinki Prosody Corpus and a System for Predicting Prosodic Prominence from Text
Stars: ✭ 139 (+504.35%)
Mutual labels: corpus, dataset
Clue
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Stars: ✭ 2,425 (+10443.48%)
Mutual labels: corpus, dataset
Fakenewscorpus
A dataset of millions of news articles scraped from a curated list of data sources.
Stars: ✭ 255 (+1008.7%)
Mutual labels: corpus, dataset
Dataset List
Lists of text corpora and more (mainly Japanese)
Stars: ✭ 84 (+265.22%)
Mutual labels: corpus, dataset
Dialog corpus
Datasets for training Chinese and English chatbot systems
Stars: ✭ 1,662 (+7126.09%)
Mutual labels: corpus, dataset
Nlp bahasa resources
A Curated List of Dataset and Usable Library Resources for NLP in Bahasa Indonesia
Stars: ✭ 158 (+586.96%)
Mutual labels: corpus, dataset
PoetryCorpus
Poetic corpus of the Russian language
Stars: ✭ 40 (+73.91%)
Mutual labels: corpus