Variational-Transformer

License: MIT

🔆 This is the PyTorch implementation of the paper:

Variational Transformers for Diverse Response Generation. Zhaojiang Lin, Genta Indra Winata, Peng Xu, Zihan Liu, Pascale Fung [PDF]

This code has been written using PyTorch >= 0.4.1. If you use any source code or datasets included in this toolkit in your work, please cite the following paper. The BibTeX entry is listed below:

@article{lin2020variational,
  title={Variational Transformers for Diverse Response Generation},
  author={Lin, Zhaojiang and Winata, Genta Indra and Xu, Peng and Liu, Zihan and Fung, Pascale},
  journal={arXiv preprint arXiv:2003.12738},
  year={2020}
}

Global Variational Transformer (GVT):

GVT extends the CVAE of Zhao et al. (2017) and models discourse-level diversity with a global latent variable.
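As a rough illustration of what a global latent variable involves, the sketch below shows two ingredients that CVAE-style models typically rely on: sampling z with the reparameterization trick, and the KL divergence between a diagonal-Gaussian posterior and prior. It is plain Python rather than the repository's PyTorch code, the function names (`sample_latent`, `gaussian_kl`) are hypothetical, and the numbers are made up.

```python
import math
import random


def sample_latent(mu, logvar, rng=random):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) for diagonal Gaussians, summed over latent dimensions."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


# Toy 3-dimensional example (made-up numbers, not model outputs).
prior_mu, prior_logvar = [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]
post_mu, post_logvar = [0.5, -0.2, 0.1], [-1.0, -0.5, 0.0]

z = sample_latent(post_mu, post_logvar)  # one latent vector for the whole response
kl = gaussian_kl(post_mu, post_logvar, prior_mu, prior_logvar)
print(len(z), round(kl, 4))
```

In a trained model the means and log-variances would come from the prior and recognition networks; here they are fixed lists so the arithmetic is easy to follow.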

Sequential Variational Transformer (SVT):

SVT, inspired by variational autoregressive models (Goyal et al., 2017; Du et al., 2018), incorporates a sequence of latent variables into the decoding process through a novel variational decoder layer. Unlike previous approaches (Zhao et al., 2017; Goyal et al., 2017; Du et al., 2018), SVT uses non-causal multi-head attention, which attends to future tokens to compute the posterior latent variables, instead of using an additional encoder.
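The difference between causal and non-causal attention can be made concrete with attention masks. The sketch below is an illustration in plain Python, not the repository's code; `causal_mask`, `non_causal_mask`, and `masked_softmax` are hypothetical helpers. With a causal mask, a query position gives zero weight to later positions. With a non-causal mask, the kind of attention described above for the posterior latent variables, future tokens receive weight too.

```python
import math


def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def non_causal_mask(n):
    """Every position may attend to every position, including future ones."""
    return [[True] * n for _ in range(n)]


def masked_softmax(scores, mask_row):
    """Softmax over one row of attention scores; masked positions get weight 0."""
    visible = [s for s, keep in zip(scores, mask_row) if keep]
    top = max(visible)  # subtract the max for numerical stability
    exps = [math.exp(s - top) if keep else 0.0 for s, keep in zip(scores, mask_row)]
    total = sum(exps)
    return [e / total for e in exps]


scores = [0.1, 0.4, 0.2, 0.3]  # toy scores for the query at position 1
causal_weights = masked_softmax(scores, causal_mask(4)[1])
non_causal_weights = masked_softmax(scores, non_causal_mask(4)[1])
print(causal_weights)      # positions 2 and 3 (the future) get zero weight
print(non_causal_weights)  # every position gets some weight
```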

Dependency

Check the required packages, or simply run the command:

❱❱❱ pip install -r requirements.txt

Pre-trained GloVe embeddings: put glove.6B.300d.txt inside the /vectors/ folder.
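To sanity-check the embedding file before training, a small reader like the one below can help. GloVe text files store one token per line followed by its space-separated vector components. The sketch is independent of the repository's own loading code; `parse_glove_lines` and `load_glove` are hypothetical names, and the default path assumes the vectors folder sits in the working directory.

```python
def parse_glove_lines(lines, dim=300):
    """Parse GloVe text lines ('word v1 v2 ... v_dim') into a dict of vectors."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def load_glove(path="vectors/glove.6B.300d.txt", dim=300):
    """Read a GloVe file from disk; the default path is an assumption."""
    with open(path, encoding="utf-8") as handle:
        return parse_glove_lines(handle, dim)


# Tiny in-memory example with 3-dimensional vectors instead of 300.
sample = ["the 0.1 0.2 0.3", "cat -0.5 0.0 0.25", "broken 1.0"]
parsed = parse_glove_lines(sample, dim=3)
print(sorted(parsed))
```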

Experiment

Dataset

Three datasets (Mojitalk, PersonaChat, EmpatheticDialogue) are used in this work. Mojitalk is a single-turn dialogue dataset; PersonaChat and EmpatheticDialogue are multiturn dialogue datasets. EmpatheticDialogue is preprocessed and stored in npy format: sys_dialog_texts.train.npy, sys_target_texts.train.npy, and sys_emotion_texts.train.npy, which hold parallel lists of contexts (source), responses (target), and emotion labels (additional label).
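For inspecting the preprocessed EmpatheticDialogue files, a loader along these lines may be useful. It is a sketch under assumptions: the directory holding the files is passed in because the README does not name it, `allow_pickle=True` is used on the guess that the arrays hold Python objects, and `load_split` and `check_parallel` are hypothetical helpers. Only the train file names are given above; other split names would follow the same pattern only if the repository uses it.

```python
import os


def load_split(data_dir, split="train"):
    """Load the three parallel arrays for one split.

    Assumptions: the files live in `data_dir`, and they hold Python objects,
    so NumPy needs allow_pickle=True.
    """
    import numpy as np  # imported here so check_parallel works without NumPy

    names = ("dialog", "target", "emotion")
    arrays = [
        np.load(os.path.join(data_dir, f"sys_{name}_texts.{split}.npy"),
                allow_pickle=True)
        for name in names
    ]
    return tuple(arrays)


def check_parallel(contexts, targets, emotions):
    """Return aligned (context, target, emotion) triples, or raise if lengths differ."""
    if not (len(contexts) == len(targets) == len(emotions)):
        raise ValueError("context, target, and emotion lists must have equal length")
    return list(zip(contexts, targets, emotions))


# Toy stand-in data; real entries would come from the .npy files.
triples = check_parallel([["hi", "how are you?"]], ["i am fine"], ["joyful"])
print(triples[0])
```

With the real files, the three arrays returned by `load_split` can be passed straight to `check_parallel`.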

Single turn dialogue

Transformer (train&test)

❱❱❱ python3 main.py --model trs --emb_dim 300 --hidden_dim 300 --hop 4 --heads 4 --cuda --batch_size 128 --lr 0.001 --pretrain_emb --kl_ceiling 0.48 --aux_ceiling 1 --full_kl_step 20000 --save_path save/trs_new_bow_batch/ > save/trs_new_bow_batch/out.txt

Use the trained Transformer to initialize GVT: in the GVT command below, replace model_8999_82.7771_0.0000_0.0000_0.0000_0.0000 with your checkpoint.
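Checkpoint names in this README follow the pattern `model_<integer>_<float>_..._<float>`. The README does not document what the fields mean; the sketch below assumes only that the leading integer is a training step and leaves the remaining values unnamed. `parse_checkpoint` and `latest_checkpoint` are hypothetical helpers, not part of the repository.

```python
import re

# 'model_' + an integer + one or more '_<float>' fields.
CHECKPOINT_RE = re.compile(r"^model_(\d+)((?:_\d+\.\d+)+)$")


def parse_checkpoint(name):
    """Split a checkpoint name into (leading integer, list of floats), or None."""
    match = CHECKPOINT_RE.match(name)
    if match is None:
        return None
    step = int(match.group(1))
    values = [float(v) for v in match.group(2).strip("_").split("_")]
    return step, values


def latest_checkpoint(names):
    """Return the name with the largest leading integer; None if nothing matches."""
    best_name, best_step = None, -1
    for name in names:
        parsed = parse_checkpoint(name)
        if parsed is not None and parsed[0] > best_step:
            best_name, best_step = name, parsed[0]
    return best_name


# In practice the names could come from os.listdir() on the --save_path folder.
names = [
    "out.txt",
    "model_5999_4.0928_59.9090_0.0000_1.8200_0.0000",
    "model_8999_4.0222_55.8249_0.0000_0.0000_0.0000",
]
print(latest_checkpoint(names))
```

The latest checkpoint is not necessarily the one to use: the multiturn commands in this README initialize GVT from a model_5999 checkpoint while interacting with a model_8999 one, so the choice is left to the user.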

GVT (train&test)

❱❱❱ python3 main.py --model cvaetrs --emb_dim 300 --hidden_dim 300 --hop 4 --heads 4 --cuda --batch_size 128 --lr 0.001 --pretrain_emb --kl_ceiling 0.08 --aux_ceiling 1 --full_kl_step 15000 --save_path_pretrained save/trs_new_bow_batch/model_8999_82.7771_0.0000_0.0000_0.0000_0.0000 --save_path save/cvae_new_bow_batch0.08/ > save/cvae_new_bow_batch0.08/out.txt
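The README does not explain how `--kl_ceiling` and `--full_kl_step` interact. One common pattern in VAE training is KL annealing, where the weight on the KL term grows from zero to a maximum over a fixed number of steps. The sketch below assumes that reading and a linear ramp; both are assumptions, the repository may use a different schedule, and `kl_weight` is a hypothetical function.

```python
def kl_weight(step, full_kl_step, kl_ceiling):
    """Linear ramp from 0 to kl_ceiling over full_kl_step steps, then constant."""
    if full_kl_step <= 0:
        return kl_ceiling
    return kl_ceiling * min(1.0, step / full_kl_step)


# Values taken from the GVT command above: --kl_ceiling 0.08 --full_kl_step 15000.
for step in (0, 7500, 15000, 30000):
    print(step, kl_weight(step, full_kl_step=15000, kl_ceiling=0.08))
```

Under this assumed schedule the KL term is ignored at the start of training, which is a common way to reduce the risk of the latent variable being ignored, and reaches its full weight at step 15000.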

Similarly, we first pre-train SVT with MLE:

❱❱❱ python3 main.py --model trs --v2 --emb_dim 300 --hidden_dim 300 --hop 4 --heads 4 --cuda --batch_size 128 --lr 0.001 --pretrain_emb --kl_ceiling 0.08 --aux_ceiling 1 --full_kl_step 20000 --num_var_layers 1 --save_path save/trs_v2/ > save/trs_v2/out.txt

Use the trained Transformer to initialize SVT: in the SVT command below, replace model_8999_4.4207_83.1528_0.0000_0.6200_0.0000 with your checkpoint.

SVT (train&test)

❱❱❱ python3 main.py --model cvaetrs --v2 --emb_dim 300 --hidden_dim 300 --hop 4 --heads 4 --cuda --batch_size 16 --lr 0.0002 --pretrain_emb --kl_ceiling 0.3 --aux_ceiling 1 --full_kl_step 30000 --num_var_layers 1 --save_path_pretrained save/trs_v2/model_8999_4.4207_83.1528_0.0000_0.6200_0.0000 --save_path save/cvae_trs_v2_0.3/ > save/cvae_trs_v2_0.3/out.txt

Multiturn dialogue

Transformer (train&test)

❱❱❱ python3 main.py --model trs --emb_dim 300 --hidden_dim 300 --hop 4 --heads 4 --cuda --batch_size 32 --persona --lr 0.0002 --pretrain_emb --kl_ceiling 0.48 --aux_ceiling 1 --full_kl_step 20000 --dataset empathetic --save_path save/trs_ed_persona/ > save/trs_ed_persona/out.txt

Interact with Transformer

❱❱❱ python3 interact.py --model trs --cuda --persona --dataset empathetic --save_path_pretrained save/trs_ed_persona/model_8999_4.0222_55.8249_0.0000_0.0000_0.0000

Use the trained Transformer to initialize GVT: in the GVT command below, replace model_5999_4.0928_59.9090_0.0000_1.8200_0.0000 with your checkpoint.

GVT (train&test)

❱❱❱ python3 main.py --model cvaetrs --emb_dim 300 --hidden_dim 300 --hop 4 --heads 4 --cuda --batch_size 32 --persona --lr 0.0002 --pretrain_emb --kl_ceiling 0.05 --aux_ceiling 1 --full_kl_step 12000 --dataset empathetic --save_path_pretrained save/trs_ed_persona/model_5999_4.0928_59.9090_0.0000_1.8200_0.0000 --save_path save/cvae_trs_ed_persona_0.05/ > save/cvae_trs_ed_persona_0.05/out.txt

Interact with GVT

❱❱❱ python3 interact.py --model cvaetrs --cuda --persona --dataset empathetic --save_path_pretrained save/cvae_trs_ed_persona_0.05/model_12999_22.3743_22.9358_0.0000_0.0000_19.2416

Similarly, we first pre-train SVT with MLE:

❱❱❱ python3 main.py --model trs --v2 --emb_dim 300 --hidden_dim 300 --hop 4 --heads 4 --cuda --batch_size 32 --persona --lr 0.0002 --pretrain_emb --num_var_layers 1 --kl_ceiling 0.05 --aux_ceiling 1 --full_kl_step 12000 --dataset empathetic --save_path save/trs_ed_persona_v2/ > save/trs_ed_persona_v2/out.txt

Use the trained Transformer to initialize SVT: in the SVT command below, replace model_7999_4.0249_55.9739_0.0000_2.0900_0.0000 with your checkpoint.

SVT (train&test)

❱❱❱ python3 main.py --model cvaetrs --v2 --emb_dim 300 --hidden_dim 300 --hop 4 --heads 4 --cuda --batch_size 2 --persona --gradient_accumulation_steps 16 --lr 0.0002 --pretrain_emb --num_var_layers 1 --kl_ceiling 0.6 --aux_ceiling 1 --full_kl_step 12000 --dataset empathetic --save_path_pretrained save/trs_ed_persona_v2/model_7999_4.0249_55.9739_0.0000_2.0900_0.0000 --save_path save/v2_cvae_trs_ed_persona_0.6/ > save/v2_cvae_trs_ed_persona_0.6/out.txt
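The command above combines `--batch_size 2` with `--gradient_accumulation_steps 16`. If the flag has the usual meaning of gradient accumulation (an assumption; the README does not define it), gradients from 16 mini-batches are combined before each optimizer update, giving an effective batch of 2 × 16 = 32 examples, the same size as the `--batch_size 32` used by the other multiturn runs. The toy sketch below checks that equivalence on a one-parameter least-squares model in plain Python; all names are hypothetical.

```python
def grad_mse(w, batch):
    """Gradient of the mean squared error of y ≈ w * x over one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, batch_size, accumulation_steps):
    """Average the gradients of `accumulation_steps` consecutive mini-batches."""
    total = 0.0
    for k in range(accumulation_steps):
        chunk = data[k * batch_size:(k + 1) * batch_size]
        total += grad_mse(w, chunk) / accumulation_steps  # scale each mini-batch
    return total


# 32 toy (x, y) pairs following y = 3x.
data = [(float(i), 3.0 * i) for i in range(1, 33)]
w = 0.5

full = grad_mse(w, data)  # one batch of 32
accum = accumulated_grad(w, data, batch_size=2, accumulation_steps=16)
print(full, accum)
```

The two gradients agree up to floating-point rounding, while the accumulated version only ever holds two examples at a time.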

Interact with SVT

❱❱❱ python3 interact.py --model cvaetrs --v2 --cuda --persona --dataset empathetic --save_path_pretrained save/v2_cvae_trs_ed_persona_0.6/model_15999_4.5419_18.7720_0.0000_0.0000_1.6095 --num_var_layers 1