
CLUEbenchmark / CLUEmotionAnalysis2020

Licence: other
CLUE Emotion Analysis Dataset (fine-grained emotion analysis dataset)

CLUEEmotion2020

CLUE Emotion Analysis Dataset

Data Description

The dataset in the data directory is an emotion analysis corpus in which each sample is annotated with exactly one emotion label. The label set is: like, happiness, sadness, anger, disgust, fear, and surprise.
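As a minimal sketch of how the seven labels above might be mapped to integer class ids for a classifier, consider the following. The names LABELS, LABEL_TO_ID, ID_TO_LABEL, and encode are hypothetical, and the id order simply follows the order in which the labels are listed here; the repository's own encoding may differ.

```python
# Label names as listed in the data description. The integer ids are an
# assumption made for illustration, not necessarily the repository's encoding.
LABELS = ["like", "happiness", "sadness", "anger", "disgust", "fear", "surprise"]

# Forward and reverse lookups between label names and class ids.
LABEL_TO_ID = {label: idx for idx, label in enumerate(LABELS)}
ID_TO_LABEL = {idx: label for label, idx in LABEL_TO_ID.items()}


def encode(label: str) -> int:
    """Map a label name to its integer id, rejecting unknown labels."""
    try:
        return LABEL_TO_ID[label]
    except KeyError:
        raise ValueError(f"unknown emotion label: {label!r}") from None
```

Rejecting unknown labels early makes it easier to spot a misspelled or unexpected label in the data files before training starts.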

This dataset is from the following paper:

Minglei Li, Yunfei Long, Qin Lu, and Wenjie Li. "Emotion Corpus Construction Based on Selection from Hashtags." In Proceedings of the International Conference on Language Resources and Evaluation (LREC). Portorož, Slovenia, 2016.

The corpus statistics and label distribution are as follows:

[Figure: label_distribution]

The data is split into training, validation, and test sets at a ratio of 8:1:1, and all files are encoded in UTF-8.
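The data directory already contains the split files, so there is no need to redo the split. For illustration only, an 8:1:1 split can be sketched as below. The function name split_8_1_1, the shuffling step, and the seed are assumptions and do not describe how the authors produced their split.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_8_1_1(samples: Sequence[T], seed: int = 42) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle the samples and split them into train/valid/test at 8:1:1."""
    items = list(samples)
    # A seeded shuffle keeps the split reproducible (assumed, not from the source).
    random.Random(seed).shuffle(items)
    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = len(items) * 8 // 10
    n_valid = len(items) // 10
    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    # The test set takes whatever remains, so no sample is dropped.
    test = items[n_train + n_valid:]
    return train, valid, test
```

When the sample count is not divisible by 10, the remainder goes to the test set in this sketch.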

Baseline results

Test results of different classification models on this dataset.

Model       Accuracy   Parameters
BERT-base   60.7%      Epoch 3, batch 32, max_seq_len 128
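Accuracy here is the fraction of test samples whose predicted label matches the gold label. A minimal sketch of that computation follows. The function name accuracy and the list-of-strings input format are assumptions; the repository's evaluation code may compute the metric differently.

```python
from typing import Sequence


def accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Return the fraction of positions where the prediction equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    # Each matching pair counts as 1, each mismatch as 0.
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)
```

For example, three correct predictions out of four give an accuracy of 0.75.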

Reproduce the results

The code is adapted from the original CLUE source code, which in turn builds on Google's original BERT code. The pre-trained language model is the Chinese version of BERT-Base.

Env

tensorflow 1.12

Run command

cd models/bert
./run_classifier_emotion.sh