All Projects → awslabs → pptod

awslabs / pptod

Licence: Apache-2.0 license
Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System (ACL 2022)

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects

Projects that are alternatives of or similar to pptod

motor-defect-detector-python
Predict performance issues with manufacturing equipment motors. Perform local or cloud analytics of the issues found, and then display the data on a user interface to determine when failures might arise.
Stars: ✭ 24 (-68.83%)
Mutual labels:  pretrained-models
WARP
Code for ACL'2021 paper WARP 🌀 Word-level Adversarial ReProgramming. Outperforming `GPT-3` on SuperGLUE Few-Shot text classification. https://aclanthology.org/2021.acl-long.381/
Stars: ✭ 66 (-14.29%)
Mutual labels:  pretrained-models
MinTL
MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems
Stars: ✭ 61 (-20.78%)
Mutual labels:  task-oriented-dialogue
gan-vae-pretrained-pytorch
Pretrained GANs + VAEs + classifiers for MNIST/CIFAR in pytorch.
Stars: ✭ 134 (+74.03%)
Mutual labels:  pretrained-models
super-gradients
Easily train or fine-tune SOTA computer vision models with one open source training library
Stars: ✭ 429 (+457.14%)
Mutual labels:  pretrained-models
Tacotron2-PyTorch
Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed.
Stars: ✭ 118 (+53.25%)
Mutual labels:  pretrained-models
mixed-language-training
Attention-Informed Mixed-Language Training for Zero-shot Cross-lingual Task-oriented Dialogue Systems (AAAI-2020)
Stars: ✭ 29 (-62.34%)
Mutual labels:  task-oriented-dialogue
two-stream-fusion-for-action-recognition-in-videos
No description or website provided.
Stars: ✭ 80 (+3.9%)
Mutual labels:  pretrained-models
Awesome-Dialogue-State-Tracking
Dialogue State Tracking (DST) Papers, Datasets, Resources 🤩
Stars: ✭ 94 (+22.08%)
Mutual labels:  task-oriented-dialogue
pigallery
PiGallery: AI-powered Self-hosted Secure Multi-user Image Gallery and Detailed Image analysis using Machine Learning, EXIF Parsing and Geo Tagging
Stars: ✭ 35 (-54.55%)
Mutual labels:  pretrained-models
open clip
An open source implementation of CLIP.
Stars: ✭ 1,534 (+1892.21%)
Mutual labels:  pretrained-models
chatbot
kbqa task-oriented qa seq2seq ir neo4j jena seq2seq tf chatbot chat
Stars: ✭ 32 (-58.44%)
Mutual labels:  task-oriented-dialogue
Pytorch-NLU
Pytorch-NLU,一个中文文本分类、序列标注工具包,支持中文长文本、短文本的多类、多标签分类任务,支持中文命名实体识别、词性标注、分词等序列标注任务。 Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech ta…
Stars: ✭ 151 (+96.1%)
Mutual labels:  pretrained-models
BIRADS classifier
High-resolution breast cancer screening with multi-view deep convolutional neural networks
Stars: ✭ 122 (+58.44%)
Mutual labels:  pretrained-models
transformer-models
Deep Learning Transformer models in MATLAB
Stars: ✭ 90 (+16.88%)
Mutual labels:  pretrained-models
roberta-wwm-base-distill
this is roberta wwm base distilled model which was distilled from roberta wwm by roberta wwm large
Stars: ✭ 61 (-20.78%)
Mutual labels:  pretrained-models
concurrent-video-analytic-pipeline-optimization-sample-l
Create a concurrent video analysis pipeline featuring multistream face and human pose detection, vehicle attribute detection, and the ability to encode multiple videos to local storage in a single stream.
Stars: ✭ 39 (-49.35%)
Mutual labels:  pretrained-models
CPPE-Dataset
Code for our paper CPPE - 5 (Medical Personal Protective Equipment), a new challenging object detection dataset
Stars: ✭ 42 (-45.45%)
Mutual labels:  pretrained-models
Entity
EntitySeg Toolbox: Towards Open-World and High-Quality Image Segmentation
Stars: ✭ 313 (+306.49%)
Mutual labels:  pretrained-models
text classifier
Tensorflow2.3的文本分类项目,支持各种分类模型,支持相关tricks。
Stars: ✭ 135 (+75.32%)
Mutual labels:  pretrained-models

Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

Authors: Yixuan Su, Lei Shu, Elman Mansimov, Arshit Gupta, Deng Cai, Yi-An Lai, and Yi Zhang

Code of our PPTOD paper: Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

News:

  • [2022/02/24] PPTOD is accepted to the main conference of ACL 2022!
  • [2021/09/29] PPTOD is publicly released!

Introduction:

Pre-trained language models have been recently shown to benefit task-oriented dialogue (TOD) systems. Despite their success, existing methods often formulate this task as a cascaded generation problem which can lead to error accumulation across different sub-tasks and greater data annotation overhead. In this study, we present PPTOD, a unified model that seamlessly supports both task-oriented dialogue understanding and response generation in a plug-and-play fashion. In addition, we introduce a new dialogue multi-task pre-training strategy that allows the model to learn the primary TOD task completion skills from heterogeneous dialog corpora. We extensively test our model on three benchmark TOD tasks, including end-to-end dialogue modelling, dialogue state tracking, and intent classification. Results show that PPTOD creates new state-of-the-art on all evaluated tasks in both full training and low-resource scenarios. Furthermore, comparisons against previous SOTA methods show that the responses generated by PPTOD are more factually correct and semantically coherent as judged by human annotators.

Alt text

Main Results:

The following table shows our models performances on end-to-end dialogue modelling (Inform, Success, BLEU, and Combined Score) on MultiWOZ 2.0. It also shows the dialogue state tracking (DST) results on MultiWOZ 2.0 and intent classification accuracy on Banking77.

Inform Success BLEU Combined Score DST Joint Accuracy Intent Classification Accuracy
PPTOD-small 87.80 75.30 19.89 101.44 51.50 93.27
PPTOD-base 89.20 79.40 18.62 102.92 53.37 93.86
PPTOD-large 82.60 74.10 19.21 97.56 53.89 94.08

Citation:

If you find our paper and resources useful, please kindly cite our paper:

@article{su2021multitask,
   author = {Yixuan Su and
             Lei Shu and
             Elman Mansimov and
             Arshit Gupta and
             Deng Cai and
             Yi{-}An Lai and
             Yi Zhang},
   title     = {Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System},
   booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL)",
   publisher = "Association for Computational Linguistics",
   year      = {2022},
   url       = {https://arxiv.org/abs/2109.14739}
}

Example Usage:

In the following, we provide an example of how to use PPTOD to address different TOD tasks without fine-tuning on any downstream task! We assume you have downloaded the pptod-small checkpoint and have it in the "./checkpoints/small/" directory (you can find instructions below).

# load the pre-trained PPTOD-small
import torch
from transformers import T5Tokenizer
model_path = r'./checkpoints/small/'
tokenizer = T5Tokenizer.from_pretrained(model_path)
from E2E_TOD.modelling.T5Model import T5Gen_Model
from E2E_TOD.ontology import sos_eos_tokens
special_tokens = sos_eos_tokens
model = T5Gen_Model(model_path, tokenizer, special_tokens, dropout=0.0, 
        add_special_decoder_token=True, is_training=False)
model.eval()
# prepare some pre-defined tokens and task-specific prompts
sos_context_token_id = tokenizer.convert_tokens_to_ids(['<sos_context>'])[0]
eos_context_token_id = tokenizer.convert_tokens_to_ids(['<eos_context>'])[0]
pad_token_id, sos_b_token_id, eos_b_token_id, sos_a_token_id, eos_a_token_id, \
sos_r_token_id, eos_r_token_id, sos_ic_token_id, eos_ic_token_id = \
tokenizer.convert_tokens_to_ids(['<_PAD_>', '<sos_b>', 
'<eos_b>', '<sos_a>', '<eos_a>', '<sos_r>','<eos_r>', '<sos_d>', '<eos_d>'])
bs_prefix_text = 'translate dialogue to belief state:'
bs_prefix_id = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(bs_prefix_text))
da_prefix_text = 'translate dialogue to dialogue action:'
da_prefix_id = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(da_prefix_text))
nlg_prefix_text = 'translate dialogue to system response:'
nlg_prefix_id = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(nlg_prefix_text))
ic_prefix_text = 'translate dialogue to user intent:'
ic_prefix_id = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(ic_prefix_text))
# an example dialogue context
dialogue_context = "<sos_u> can i reserve a five star place for thursday night at 3:30 for 2 people <eos_u> <sos_r> i'm happy to assist you! what city are you dining in? <eos_r> <sos_u> seattle please. <eos_u>"
context_id = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(dialogue_context))
# predict belief state 
input_id = bs_prefix_id + [sos_context_token_id] + context_id + [eos_context_token_id]
input_id = torch.LongTensor(input_id).view(1, -1)
x = model.model.generate(input_ids = input_id, decoder_start_token_id = sos_b_token_id,
            pad_token_id = pad_token_id, eos_token_id = eos_b_token_id, max_length = 128)
print (model.tokenized_decode(x[0]))
# the predicted result is
# <sos_b> [restaurant] rating five star date thursday night start time 3:30 number of people 2 city seattle <eos_b>
# predict dialogue act
input_id = da_prefix_id + [sos_context_token_id] + context_id + [eos_context_token_id]
input_id = torch.LongTensor(input_id).view(1, -1)
x = model.model.generate(input_ids = input_id, decoder_start_token_id = sos_a_token_id,
            pad_token_id = pad_token_id, eos_token_id = eos_a_token_id, max_length = 128)
print (model.tokenized_decode(x[0]))
# the predicted result is
# <sos_a> [restaurant] [inform] restaurant name rating [multiple_choice] restaurant name <eos_a>
# predict system response
input_id = nlg_prefix_id + [sos_context_token_id] + context_id + [eos_context_token_id]
input_id = torch.LongTensor(input_id).view(1, -1)
x = model.model.generate(input_ids = input_id, decoder_start_token_id = sos_r_token_id,
            pad_token_id = pad_token_id, eos_token_id = eos_r_token_id, max_length = 128)
print (model.tokenized_decode(x[0]))
# the predicted result is 
# <sos_r> ok, let me find some options for you. <eos_r>
# predict user intent
input_id = ic_prefix_id + [sos_context_token_id] + context_id + [eos_context_token_id]
input_id = torch.LongTensor(input_id).view(1, -1)
x = model.model.generate(input_ids = input_id, decoder_start_token_id = sos_ic_token_id,
            pad_token_id = pad_token_id, eos_token_id = eos_ic_token_id, max_length = 128)
print (model.tokenized_decode(x[0]))
# the predicted result is 
# <sos_d> [book_restaurant] <eos_d>

1. Environment Setup:

pip3 install -r requirements.txt
python -m spacy download en_core_web_sm

2. PPTOD Checkpoints:

You can download checkpoints of PPTOD with different configurations here.

PPTOD-small PPTOD-base PPTOD-large
here here here

To use PPTOD, you should download the checkpoint you want and unzip it in the ./checkpoints directory.

Alternatively, you can run the following commands to download the PPTOD checkpoints.

(1) Downloading Pre-trained PPTOD-small Checkpoint:

cd checkpoints
chmod +x ./download_pptod_small.sh
./download_pptod_small.sh

(2) Downloading Pre-trained PPTOD-base Checkpoint:

cd checkpoints
chmod +x ./download_pptod_base.sh
./download_pptod_base.sh

(3) Downloading Pre-trained PPTOD-large Checkpoint:

cd checkpoints
chmod +x ./download_pptod_large.sh
./download_pptod_large.sh

3. Data Preparation:

The detailed instruction for preparing the pre-training corpora and the data of downstream TOD tasks are provided in the ./data folder.

4. Dialogue Multi-Task Pre-training:

To pre-train a PPTOD model from scratch, please refer to details provided in ./Pretraining directory.

5. Benchmark TOD Tasks:

(1) End-to-End Dialogue Modelling:

To perform End-to-End Dialogue Modelling using PPTOD, please refer to details provided in ./E2E_TOD directory.

(2) Dialogue State Tracking:

To perform Dialogue State Tracking using PPTOD, please refer to details provided in ./DST directory.

(3) Intent Classification:

To perform Intent Classification using PPTOD, please refer to details provided in ./IC directory.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].