
ROGERDJQ / RoBERTaABSA

Licence: other
Implementation of paper Does syntax matter? A strong baseline for Aspect-based Sentiment Analysis with RoBERTa.

Programming Languages

python

Projects that are alternatives of or similar to RoBERTaABSA

openroberta-lab
The programming environment »Open Roberta Lab« by Fraunhofer IAIS enables children and adolescents to program robots. A variety of different programming blocks are provided to program motors and sensors of the robot. Open Roberta Lab uses an approach of graphical programming so that beginners can seamlessly start coding. As a cloud-based applica…
Stars: ✭ 98 (-12.5%)
Mutual labels:  roberta
erc
Emotion recognition in conversation
Stars: ✭ 34 (-69.64%)
Mutual labels:  roberta
Bertviz
Tool for visualizing attention in the Transformer model (BERT, GPT-2, Albert, XLNet, RoBERTa, CTRL, etc.)
Stars: ✭ 3,443 (+2974.11%)
Mutual labels:  roberta
Text-Summarization
Abstractive and Extractive Text summarization using Transformers.
Stars: ✭ 38 (-66.07%)
Mutual labels:  roberta
koclip
KoCLIP: Korean port of OpenAI CLIP, in Flax
Stars: ✭ 80 (-28.57%)
Mutual labels:  roberta
Albert zh
A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS; large-scale Chinese pretrained ALBERT models.
Stars: ✭ 3,500 (+3025%)
Mutual labels:  roberta
japanese-pretrained-models
Code for producing Japanese pretrained models provided by rinna Co., Ltd.
Stars: ✭ 484 (+332.14%)
Mutual labels:  roberta
ABSADatasets
Public & Community-shared datasets for Aspect-based sentiment analysis and Text Classification
Stars: ✭ 49 (-56.25%)
Mutual labels:  absa
KLUE
📖 Korean NLU Benchmark
Stars: ✭ 420 (+275%)
Mutual labels:  roberta
Clue
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus, and leaderboard.
Stars: ✭ 2,425 (+2065.18%)
Mutual labels:  roberta
PoLitBert
Polish RoBERTa model trained on Polish literature, Wikipedia, and Oscar, on the assumption that high-quality text yields a good model.
Stars: ✭ 25 (-77.68%)
Mutual labels:  roberta
Tianchi2020ChineseMedicineQuestionGeneration
2020 Alibaba Cloud Tianchi Big Data Competition: Traditional Chinese Medicine Literature Question Generation Challenge.
Stars: ✭ 20 (-82.14%)
Mutual labels:  roberta
Chinese Bert Wwm
Pre-Training with Whole Word Masking for Chinese BERT (the Chinese BERT-wwm model series).
Stars: ✭ 6,357 (+5575.89%)
Mutual labels:  roberta
roberta-wwm-base-distill
A RoBERTa-wwm-base model distilled from RoBERTa-wwm-large.
Stars: ✭ 61 (-45.54%)
Mutual labels:  roberta
Transferable-E2E-ABSA
Transferable End-to-End Aspect-based Sentiment Analysis with Selective Adversarial Learning (EMNLP'19)
Stars: ✭ 62 (-44.64%)
Mutual labels:  absa
RECCON
This repository contains the dataset and the PyTorch implementations of the models from the paper Recognizing Emotion Cause in Conversations.
Stars: ✭ 126 (+12.5%)
Mutual labels:  roberta
CLUE pytorch
PyTorch version of the CLUE baselines.
Stars: ✭ 72 (-35.71%)
Mutual labels:  roberta
AspectBasedSentimentAnalysis
Aspect-Based Sentiment Analysis is a special type of sentiment analysis: an opinion is expressed toward an explicit target (the opinion target), and extracting such aspect-polarity pairs is known as ABSA.
Stars: ✭ 61 (-45.54%)
Mutual labels:  absa
MemNet ABSA
No description or website provided.
Stars: ✭ 20 (-82.14%)
Mutual labels:  absa
Roberta zh
Chinese pretrained RoBERTa models: RoBERTa for Chinese.
Stars: ✭ 1,953 (+1643.75%)
Mutual labels:  roberta

RoBERTaABSA

Implementation for the paper Does syntax matter? A strong baseline for Aspect-based Sentiment Analysis with RoBERTa [NAACL 2021], a work focusing on Aspect-Level Sentiment Classification (ALSC). It presents a detailed study of the performance gain that dependency trees bring to ALSC, and provides a strong baseline using RoBERTa.


For any questions about the code or the paper, feel free to open an issue or email me at [email protected].

If you are interested in the whole ABSA task, please have a look at our ACL 2021 paper, A Unified Generative Framework for Aspect-Based Sentiment Analysis.

Dependencies

We recommend creating a virtual environment:

conda create -n absa
conda activate absa

packages:

All code has been tested on Linux only.

Data

The English datasets are released in the Dataset folder for reproduction. If you want to process your own data, refer to the scripts in the Dataset folder.
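
Purely as an illustration (the actual field names and file layout are defined by the scripts in the Dataset folder, not by us), an ALSC instance pairs a sentence with an aspect term and a polarity label:

    # Illustrative only; the real format is defined by the scripts in Dataset/.
    {"sentence": "The food was great but the service was slow.",
     "aspect": "service",
     "polarity": "negative"}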

Usage

To get ALSC results:

Run finetune.py in the Train folder (results are also listed on Papers with Code). Before running, check the Notes below, and make sure --data_dir and --dataset are set to the correct dataset filepath and dataset name.
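
For example, a run on Restaurant14 might look like the following (the path is a placeholder, and the dataset name must be one the script recognizes):

    python finetune.py --data_dir Dataset/Restaurant14 --dataset Restaurant14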

We also provide detailed arguments and logs here:

datasets args logs
Restaurant14 args logs
Laptop14 args logs
Twitter args logs

Note that the results above come from a single run rather than the averaged runs reported in the paper. Also remember to check the Notes below.

To reproduce the whole experiment in the paper:

Reproducing the full experiment involves four steps:

  1. Fine-tune models on the ALSC datasets using the code in the Train folder; the fine-tuned models are saved after training.

    python finetune.py --data_dir {/your/dataset_filepath/} --dataset {dataset_name}
  2. Generate induced trees using the code in the Perturbed-Masking folder, which outputs datasets that serve as input to the different models (a minimal sketch of the underlying idea follows this list).

    python generate_matrix.py --model_path bert --data_dir /user/project/dataset/ --dataset Restaurant
  • --model_path can be one of bert/roberta/xlmroberta/xlmbert, or the path where a model fine-tuned in step 1 is saved.
  3. Generate data in the input format corresponding to each specific model.
  • ASGCN input data:
python generate_asgcn.py --layers 11
  • PWCN input data:
python generate_pwcn.py --layers 11
  • RGAT input data:
python generate_rgat.py --layers 11
  4. Run the code in the ASGCN, PWCN, and RGAT folders.
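
As promised above, here is a minimal sketch of the perturbed-masking idea behind step 2. It is an illustration under our own assumptions (the roberta-base checkpoint and the Hugging Face transformers library), not the repository's actual code: to measure how much token j influences token i, mask token i and record its contextual representation, then additionally mask token j and measure how far the representation of i moves. Decoding the resulting impact matrix with a tree algorithm yields the induced dependency tree.

    # Minimal perturbed-masking sketch (illustration only, not the repo's code).
    import torch
    from transformers import RobertaModel, RobertaTokenizer

    tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
    model = RobertaModel.from_pretrained("roberta-base")
    model.eval()

    ids = tokenizer("The food was great but the service was slow.",
                    return_tensors="pt")["input_ids"][0]
    n, mask_id = ids.size(0), tokenizer.mask_token_id
    layer = 11  # layer index; step 3's --layers flag selects a layer similarly

    def repr_at(token_ids, pos):
        """Hidden state of the token at `pos`, taken from the chosen layer."""
        with torch.no_grad():
            out = model(token_ids.unsqueeze(0), output_hidden_states=True)
        return out.hidden_states[layer][0, pos]

    impact = torch.zeros(n, n)
    for i in range(n):
        masked_i = ids.clone()
        masked_i[i] = mask_id                      # mask token i
        h_i = repr_at(masked_i, i)
        for j in range(n):
            if i == j:
                continue
            masked_ij = masked_i.clone()
            masked_ij[j] = mask_id                 # additionally mask token j
            impact[i, j] = torch.dist(h_i, repr_at(masked_ij, i))
    # A spanning-tree decoder (e.g., Chu-Liu-Edmonds) over `impact` then
    # produces the induced dependency tree.

In the repository, this role is played by generate_matrix.py, whose output the generate_*.py scripts in step 3 consume.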

Disclaimer

  • We made necessary changes to the original code of ASGCN, PWCN, RGAT, and Perturbed-Masking. All of our changes are open-sourced, and we believe they are permitted under the MIT License.
  • Errors may be raised if the above code is run following the original repositories' steps. We recommend running ASGCN, PWCN, RGAT, and Perturbed-Masking according to the README in the corresponding folders.

Notes

  • The learning rate in the paper was written incorrectly; it should be 2e-5 for RoBERTa.
  • Remember to split a validation set from your own data, and set the "dev" argument in the trainer of finetune.py to the corresponding validation filepath. We did not provide a validation split here, an issue we previously overlooked; in our experiments, we used the validation set to evaluate the performance of the different induced trees. A sketch of one way to create such a split follows.
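
As a hedged sketch (the file names and JSON format below are placeholders; match whatever the scripts in the Dataset folder actually produce), a validation split could be carved out like this:

    # Placeholder paths and format: adapt to the files your Dataset scripts produce.
    import json
    import random

    with open("Train.json") as f:
        data = json.load(f)

    random.seed(42)
    random.shuffle(data)
    k = int(0.1 * len(data))            # hold out 10% for validation
    dev, train = data[:k], data[k:]

    with open("Dev.json", "w") as f:
        json.dump(dev, f)
    with open("Train_split.json", "w") as f:
        json.dump(train, f)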

Reference

If you find this work useful, feel free to cite:

@inproceedings{DBLP:conf/naacl/DaiYSLQ21,
  author    = {Junqi Dai and
               Hang Yan and
               Tianxiang Sun and
               Pengfei Liu and
               Xipeng Qiu},
  editor    = {Kristina Toutanova and
               Anna Rumshisky and
               Luke Zettlemoyer and
               Dilek Hakkani{-}T{\"{u}}r and
               Iz Beltagy and
               Steven Bethard and
               Ryan Cotterell and
               Tanmoy Chakraborty and
               Yichao Zhou},
  title     = {Does syntax matter? {A} strong baseline for Aspect-based Sentiment
               Analysis with RoBERTa},
  booktitle = {Proceedings of the 2021 Conference of the North American Chapter of
               the Association for Computational Linguistics: Human Language Technologies,
               {NAACL-HLT} 2021, Online, June 6-11, 2021},
  pages     = {1816--1829},
  publisher = {Association for Computational Linguistics},
  year      = {2021},
  url       = {https://doi.org/10.18653/v1/2021.naacl-main.146},
  doi       = {10.18653/v1/2021.naacl-main.146},
}