
Peter-Chou / ai_challenger_2018_sentiment_analysis

Licence: other
Fine-grained Sentiment Analysis of User Reviews --- AI CHALLENGER 2018

Programming Languages

python

Projects that are alternatives to or similar to ai_challenger_2018_sentiment_analysis

german-sentiment
A data set and model for german sentiment classification.
Stars: ✭ 37 (+131.25%)
Mutual labels:  sentiment-analysis, transformer
NLP-paper
🎨 🎨NLP 自然语言处理教程 🎨🎨 https://dataxujing.github.io/NLP-paper/
Stars: ✭ 23 (+43.75%)
Mutual labels:  transformer, textcnn
h-transformer-1d
Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning
Stars: ✭ 121 (+656.25%)
Mutual labels:  transformer, attention
Sentimentanalysis
Sentiment analysis neural network trained by fine-tuning BERT, ALBERT, or DistilBERT on the Stanford Sentiment Treebank.
Stars: ✭ 186 (+1062.5%)
Mutual labels:  sentiment-analysis, transformer
visualization
a collection of visualization function
Stars: ✭ 189 (+1081.25%)
Mutual labels:  transformer, attention
seq2seq-pytorch
Sequence to Sequence Models in PyTorch
Stars: ✭ 41 (+156.25%)
Mutual labels:  transformer, attention
CrabNet
Predict materials properties using only the composition information!
Stars: ✭ 57 (+256.25%)
Mutual labels:  transformer, attention
Onnxt5
Summarization, translation, sentiment-analysis, text-generation and more at blazing speed using a T5 version implemented in ONNX.
Stars: ✭ 143 (+793.75%)
Mutual labels:  sentiment-analysis, transformer
Relation-Extraction-Transformer
NLP: Relation extraction with position-aware self-attention transformer
Stars: ✭ 63 (+293.75%)
Mutual labels:  transformer, attention
transformer
A PyTorch Implementation of "Attention Is All You Need"
Stars: ✭ 28 (+75%)
Mutual labels:  transformer, attention
Datastories Semeval2017 Task4
Deep-learning model presented in "DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis".
Stars: ✭ 184 (+1050%)
Mutual labels:  sentiment-analysis, attention
NTUA-slp-nlp
💻Speech and Natural Language Processing (SLP & NLP) Lab Assignments for ECE NTUA
Stars: ✭ 19 (+18.75%)
Mutual labels:  sentiment-analysis, attention
Multimodal Sentiment Analysis
Attention-based multimodal fusion for sentiment analysis
Stars: ✭ 172 (+975%)
Mutual labels:  sentiment-analysis, attention
TRAR-VQA
[ICCV 2021] TRAR: Routing the Attention Spans in Transformers for Visual Question Answering -- Official Implementation
Stars: ✭ 49 (+206.25%)
Mutual labels:  transformer, attention
Hey Jetson
Deep Learning based Automatic Speech Recognition with attention for the Nvidia Jetson.
Stars: ✭ 161 (+906.25%)
Mutual labels:  sentiment-analysis, attention
learningspoons
nlp lecture-notes and source code
Stars: ✭ 29 (+81.25%)
Mutual labels:  transformer, attention
Absa Pytorch
Aspect Based Sentiment Analysis, PyTorch Implementations. 基于方面的情感分析,使用PyTorch实现。
Stars: ✭ 1,181 (+7281.25%)
Mutual labels:  sentiment-analysis, attention
Absa keras
Keras Implementation of Aspect based Sentiment Analysis
Stars: ✭ 126 (+687.5%)
Mutual labels:  sentiment-analysis, attention
COVID-19-Tweet-Classification-using-Roberta-and-Bert-Simple-Transformers
Rank 1 / 216
Stars: ✭ 24 (+50%)
Mutual labels:  sentiment-analysis, transformer
ntua-slp-semeval2018
Deep-learning models of NTUA-SLP team submitted in SemEval 2018 tasks 1, 2 and 3.
Stars: ✭ 79 (+393.75%)
Mutual labels:  sentiment-analysis, attention

Fine-Grained Sentiment Analysis of User Reviews (AI Challenger 2018)

Project Overview

Given a customer review, perform sentiment analysis on 20 aspects (-2: not mentioned, -1: negative, 0: neutral, 1: positive).
Approach: treat these 20 multi-class classification tasks as one multi-task learning problem when building the model (a loss sketch follows below).
Solution model: an implementation and application of the SemEval-2018 paper "Attention-based Convolutional Neural Networks for Multi-label Emotion Classification".
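A minimal sketch of the multi-task framing, written here in PyTorch (the repository's own framework, class names, and details may differ): all 20 four-way classifiers read one shared representation, and the training loss is the sum of the per-aspect cross-entropies.

import torch
import torch.nn as nn

NUM_ASPECTS = 20   # 20 fine-grained aspects per review
NUM_CLASSES = 4    # -2 not mentioned / -1 negative / 0 neutral / 1 positive

class MultiTaskLoss(nn.Module):
    """Sum of the 20 per-aspect cross-entropies over shared logits."""
    def __init__(self):
        super().__init__()
        self.ce = nn.CrossEntropyLoss()

    def forward(self, logits, labels):
        # logits: (batch, NUM_ASPECTS, NUM_CLASSES)
        # labels: (batch, NUM_ASPECTS) with values in {-2, -1, 0, 1}
        targets = labels + 2  # shift {-2, -1, 0, 1} to class indices {0..3}
        return sum(self.ce(logits[:, a, :], targets[:, a])
                   for a in range(NUM_ASPECTS))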

Preprocessing

Traditional-to-Simplified Conversion

Use opencc to convert the traditional Chinese in each file to simplified Chinese:

opencc -i data/train/sentiment_analysis_trainingset.csv -o data/train/train_sc.csv -c t2s.json
opencc -i data/val/sentiment_analysis_validationset.csv -o data/val/val_sc.csv -c t2s.json
opencc -i data/test/a/sentiment_analysis_testa.csv -o data/test/a/a_sc.csv -c t2s.json
opencc -i data/test/b/sentiment_analysis_testb.csv -o data/test/b/b_sc.csv -c t2s.json

Chinese Word Embeddings

Word embeddings are the simplified-Chinese vectors from Chinese Word Vectors: Word2vec / Skip-Gram with Negative Sampling, trained on Weibo (Word + Character + Ngram).
Chinese stopwords come from this Weibo Chinese stopword list (with the digits 0-9 removed).
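These vectors ship in the plain-text word2vec format, so they can be loaded and inspected with gensim; a sketch, assuming the usual filename of the Weibo "Word + Character + Ngram" release (adjust the path to your download):

from gensim.models import KeyedVectors

# Assumed filename for the Weibo SGNS release of Chinese Word Vectors.
wv = KeyedVectors.load_word2vec_format("sgns.weibo.bigram-char", binary=False)

print(wv.vector_size)                    # embedding dimensionality
print(wv.most_similar("好吃", topn=3))   # nearest neighbours of "delicious"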

Word Segmentation

Segmentation uses the jieba package. Text is first split into words; if a word does not appear in the vocabulary (stopwords already removed), it is further split into characters.
Since the goal of the project is sentiment discrimination rather than translation, weakening linguistic structure is acceptable, so out-of-vocabulary new words are not kept.
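A sketch of the described word-first, character-fallback segmentation (the function and argument names are illustrative, not the repository's):

import jieba

def segment(text, vocab, stopwords):
    """Word-first segmentation with a character fallback.

    Words found in the embedding vocabulary are kept as-is; out-of-vocabulary
    words are split into single characters, and anything still missing from
    the vocabulary is dropped (new words are not preserved).
    `vocab` and `stopwords` are assumed to be sets of strings.
    """
    tokens = []
    for word in jieba.cut(text):
        if word in stopwords:
            continue
        if word in vocab:
            tokens.append(word)
        else:
            tokens.extend(ch for ch in word if ch in vocab)
    return tokens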

python preprocess_data.py --data_dir data/train
python preprocess_data.py --data_dir data/val
python preprocess_data.py -t --data_dir data/test/a
python preprocess_data.py -t --data_dir data/test/b

Model

Model Architecture

The model consists of a parameter-shared sentence-understanding layer and parameter-independent sentiment-discrimination layers:

  • Shared feature layer: 1 word-embedding layer + 1 position-embedding layer (providing positional information) + 3 Transformer encoder self-attention blocks
  • Sentiment discrimination layer: 1 convolutional layer + 1 max-pooling layer + 1 fully connected layer

(Figure: attn_conv model architecture)

The idea behind the model is to mimic how a person would approach the task: first understand the sentence (the self-attention blocks), then discriminate the sentiment (convolution + max pooling); a skeleton of the two stages is sketched below.
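A hypothetical PyTorch skeleton of the two stages (layer sizes such as nhead=6 are illustrative; the repository may differ in framework and hyperparameters):

import torch
import torch.nn as nn

class AttnConv(nn.Module):
    """Shared understanding stage feeding 20 parameter-independent heads."""
    def __init__(self, embeddings, max_len, d_model=300, num_aspects=20):
        super().__init__()
        self.word_emb = nn.Embedding.from_pretrained(embeddings)  # pretrained word vectors
        self.pos_emb = nn.Embedding(max_len, d_model)             # position vectors
        layer = nn.TransformerEncoderLayer(d_model, nhead=6, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=3)  # 3 self-attention blocks
        self.heads = nn.ModuleList([                # conv + max pool + FC, one head per aspect
            nn.Sequential(
                nn.Conv1d(d_model, 64, kernel_size=10),  # 64 kernels spanning 10 positions
                nn.ReLU(),
                nn.AdaptiveMaxPool1d(1),                 # max pool over the whole feature map
                nn.Flatten(),
                nn.Linear(64, 4))                        # 4 sentiment classes
            for _ in range(num_aspects)])

    def forward(self, token_ids):
        pos = torch.arange(token_ids.size(1), device=token_ids.device)
        h = self.encoder(self.word_emb(token_ids) + self.pos_emb(pos))  # shared stage
        h = h.transpose(1, 2)                                           # (batch, d_model, seq)
        return torch.stack([head(h) for head in self.heads], dim=1)     # (batch, 20, 4)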

Transformer Encoder: Self-Attention Block

The Transformer was first proposed by the Google team in Attention Is All You Need; only the self-attention encoder part is used here.
The self-attention Transformer encoder applies linear transformations to the input to obtain, at each position, a query and a (key, value) pair.
The dot product between a query and the keys finds the keys most relevant to that query, and a softmax over the results gives each key-value pair's weight.
The answer to the query is then: sum(value × corresponding weight).
Finally, the answer is dimensionally rescaled with a position-wise feed-forward layer (a 1-D convolution with stride = 1 and ReLU activation).
With N positions, this yields N queries and their corresponding answers.
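The textbook scaled dot-product form of this attention computation, as a from-scratch sketch (not the repository's code):

import torch

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over one sequence x of shape (seq, d).

    w_q, w_k, w_v are the linear maps that turn each position into its
    query and (key, value) pair, as in "Attention Is All You Need".
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / k.size(-1) ** 0.5      # relevance of every key to every query
    weights = torch.softmax(scores, dim=-1)   # weight of each (key, value) pair
    return weights @ v                        # answer = sum(value * weight) per query

x = torch.randn(7, 16)                        # 7 positions, 16-dim features
w = [torch.randn(16, 16) for _ in range(3)]
print(self_attention(x, *w).shape)            # one answer per position: (7, 16)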

(Figure: transformer_encoder block)

CNN Sentiment Discrimination Block

This borrows the architecture Yoon Kim proposed in Convolutional Neural Networks for Sentence Classification, where:
the kernel width equals the dimension of the attention vectors extracted by the Transformer; the kernel height is 10 (i.e. the convolution spans 10 neighboring attention vectors); and 64 kernels are used.
Max pooling operates over the entire feature map, so the feature map produced by each kernel is distilled into a single value (see the shape walk-through below).
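A shape walk-through with illustrative sizes (a sequence of 50 attention vectors of width 300; these numbers are for demonstration only):

import torch
import torch.nn as nn

seq_len, d_attn = 50, 300             # 50 attention vectors of width 300
x = torch.randn(1, d_attn, seq_len)   # (batch, channels = attention dim, positions)

conv = nn.Conv1d(d_attn, 64, kernel_size=10)  # 64 kernels, each covering 10 positions
feature_maps = torch.relu(conv(x))            # -> (1, 64, 41): one map per kernel
pooled = feature_maps.max(dim=2).values       # global max pool -> (1, 64): one value per kernel
print(feature_maps.shape, pooled.shape)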

(Figure: textcnn block)

Training / Inference

Training

python main.py --model_dir output

Inference

python main.py -t --test_dir path/to/test/folder --model_dir output

Results

Average F1: 0.61
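A sketch of how such a score is commonly computed for this task: macro-F1 per aspect, averaged over the 20 aspects (this mirrors the competition metric as usually described; consult the official rules for the exact definition):

from sklearn.metrics import f1_score

def average_f1(y_true, y_pred):
    """y_true, y_pred: arrays of shape (num_reviews, 20), labels in {-2..1}."""
    scores = [f1_score(y_true[:, a], y_pred[:, a], average="macro")
              for a in range(y_true.shape[1])]
    return sum(scores) / len(scores)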
