QuantiDCE

This repository contains the source code for the following paper:

Towards Quantifiable Dialogue Coherence Evaluation
Zheng Ye, Liucun Lu, Lishan Huang, Liang Lin and Xiaodan Liang; ACL 2021

Framework Overview

[Figure: overview of the QuantiDCE framework]

Prerequisites

Create a virtual environment (recommended):

conda create -n QuantiDCE python=3.6
source activate QuantiDCE

Install the required packages:

pip install -r requirements.txt

Install Texar locally:

cd texar-pytorch
pip install .

Note: Make sure that CUDA 10.1 is installed in your environment.
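To quickly confirm that PyTorch sees the expected CUDA build, you can run the following check inside the QuantiDCE environment (a small sanity check, assuming a GPU machine):

import torch

print(torch.version.cuda)         # expected: 10.1
print(torch.cuda.is_available())  # expected: True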

Data Preparation

QuantiDCE contains two training stages: Multi-Level Ranking (MLR) pre-training and Knowledge Distillation (KD) fine-tuning.
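For intuition, MLR pre-training trains the metric to assign monotonically higher scores to responses at higher coherence levels. Below is a toy PyTorch sketch of one margin-based ranking term in that spirit; the function name and margin value are illustrative, and the actual MLR objective combines several terms and is implemented in the training code.

import torch.nn.functional as F

def toy_multi_level_ranking_loss(scores_by_level, margin=0.1):
    # Illustrative only: encourage scores at coherence level i+1
    # to exceed scores at level i by at least `margin`.
    # scores_by_level: list of 1-D tensors, ordered from the least
    # coherent level to the most coherent one.
    loss = scores_by_level[0].new_zeros(())
    for lower, higher in zip(scores_by_level, scores_by_level[1:]):
        # Hinge over every cross-level (lower, higher) score pair.
        diff = lower.unsqueeze(0) - higher.unsqueeze(1) + margin
        loss = loss + F.relu(diff).mean()
    return loss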

To prepare the data for MLR pre-training, please run:

cd ./script
bash prepare_pretrain_data.sh

To prepare the data for KD fine-tuning, please run:

cd ./script
bash prepare_finetune_data.sh

Note: To reproduce the results in our paper, please use the provided data in ./data/human_judgement_for_fine_tuning/dailydialog_EVAL for KD fine-tuning directly, since we forgot to set a random seed when generating the fine-tuning data.

Training

To train a metric with QuantiDCE, first run the script for MLR pre-training:

cd ./script
bash pretrain.sh

Then run the script for KD fine-tuning:

cd ./script
bash finetune.sh

Note: The checkpoint of our final QuantiDCE model is provided. You can download it and unzip it into the root directory of this project.
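For intuition about what this stage optimizes: the metric is fit to human judgement scores while a knowledge-distillation term keeps it close to the MLR pre-trained model. A minimal sketch under those assumptions is given below; the exact distillation terms and their weighting are implemented in the training scripts, and all names here are illustrative.

import torch.nn.functional as F

def toy_kd_finetune_loss(student_scores, teacher_scores, human_scores, kd_weight=1.0):
    # Illustrative only: regression toward human judgement scores plus
    # a distillation term toward the frozen pre-trained (teacher) metric.
    regression = F.mse_loss(student_scores, human_scores)
    distillation = F.mse_loss(student_scores, teacher_scores.detach())
    return regression + kd_weight * distillation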

Evaluation

To see the performance of QuantiDCE during training, you can refer to the evaluation result file ./output/$SEED/$CHECKPOINT_NAME/human-correlation_results_train.txt.
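The reported numbers are correlations between the metric's scores and human judgements. For reference, such correlations can be computed with scipy as below (a generic sketch with made-up values, not the repository's evaluation code):

from scipy.stats import pearsonr, spearmanr

# Parallel lists of scores, one entry per (context, response) pair.
metric_scores = [0.81, 0.45, 0.67, 0.92]
human_scores = [4.5, 2.0, 3.5, 5.0]

print(pearsonr(metric_scores, human_scores))   # linear correlation
print(spearmanr(metric_scores, human_scores))  # rank correlation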

To evaluate a certain checkpoint after training, please modify the arguments in ./script/eval.sh and run:

cd ./script
bash eval.sh

Inference

The get_score method in ./model/bert_metric.py outputs the coherence score of a response w.r.t. a context.
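A hypothetical call pattern is sketched below; the class name, constructor arguments, and checkpoint loading are assumptions here, and the real interface is defined in ./model/bert_metric.py and demonstrated in infer.py.

# Hypothetical sketch: the class name and constructor are assumptions;
# see ./model/bert_metric.py and infer.py for the real interface.
from model.bert_metric import BERTMetric

metric = BERTMetric()  # load the fine-tuned QuantiDCE checkpoint here
context = "Hi! How are you doing today?"
response = "Pretty good. I just got back from a hiking trip."
score = metric.get_score(context, response)  # higher means more coherent
print(score)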

For a complete example of how to use QuantiDCE, see infer.py; just run the following command to see the result:

cd ./script
bash infer.sh