zlinao / MinTL

License: MIT
MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems

Programming Languages

python — 139,335 projects (#7 most used programming language)
shell — 77,523 projects

Projects that are alternatives of or similar to MinTL

Nlp Paper
NLP Paper
Stars: ✭ 484 (+693.44%)
Mutual labels:  transformer, transfer-learning, language-model
Awesome Bert Nlp
A curated list of NLP resources focused on BERT, attention mechanism, Transformer networks, and transfer learning.
Stars: ✭ 567 (+829.51%)
Mutual labels:  transformer, transfer-learning, language-model
Gpt2
PyTorch Implementation of OpenAI GPT-2
Stars: ✭ 64 (+4.92%)
Mutual labels:  transformer, language-model
Indonesian Language Models
Indonesian language models and their usage
Stars: ✭ 64 (+4.92%)
Mutual labels:  transformer, language-model
Tupe
Transformer with Untied Positional Encoding (TUPE). Code for the paper "Rethinking Positional Encoding in Language Pre-training". Improves existing models like BERT.
Stars: ✭ 143 (+134.43%)
Mutual labels:  transformer, language-model
Bert Keras
Keras implementation of BERT with pre-trained weights
Stars: ✭ 820 (+1244.26%)
Mutual labels:  transformer, transfer-learning
Gpt2 French
GPT-2 French demo
Stars: ✭ 47 (-22.95%)
Mutual labels:  transformer, language-model
Transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Stars: ✭ 55,742 (+91280.33%)
Mutual labels:  transformer, language-model
Bert Pytorch
Google AI 2018 BERT pytorch implementation
Stars: ✭ 4,642 (+7509.84%)
Mutual labels:  transformer, language-model
Highway-Transformer
[ACL '20] Highway Transformer: A Gated Transformer.
Stars: ✭ 26 (-57.38%)
Mutual labels:  transformer, language-model
Relational Rnn Pytorch
An implementation of DeepMind's Relational Recurrent Neural Networks in PyTorch.
Stars: ✭ 236 (+286.89%)
Mutual labels:  transformer, language-model
backprop
Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
Stars: ✭ 229 (+275.41%)
Mutual labels:  transfer-learning, language-model
Getting Things Done With Pytorch
Jupyter Notebook tutorials on solving real-world problems with Machine Learning & Deep Learning using PyTorch. Topics: Face detection with Detectron 2, Time Series anomaly detection with LSTM Autoencoders, Object Detection with YOLO v5, Build your first Neural Network, Time Series forecasting for Coronavirus daily cases, Sentiment Analysis with BERT.
Stars: ✭ 738 (+1109.84%)
Mutual labels:  transformer, transfer-learning
Vietnamese Electra
Electra pre-trained model using Vietnamese corpus
Stars: ✭ 55 (-9.84%)
Mutual labels:  transformer, language-model
Context-Transformer
Context-Transformer: Tackling Object Confusion for Few-Shot Detection, AAAI 2020
Stars: ✭ 89 (+45.9%)
Mutual labels:  transformer, transfer-learning
Pytorch Openai Transformer Lm
🐥A PyTorch implementation of OpenAI's finetuned transformer language model with a script to import the weights pre-trained by OpenAI
Stars: ✭ 1,268 (+1978.69%)
Mutual labels:  transformer, language-model
Flow Forecast
Deep learning PyTorch library for time series forecasting, classification, and anomaly detection (originally for flood forecasting).
Stars: ✭ 368 (+503.28%)
Mutual labels:  transformer, transfer-learning
Neural sp
End-to-end ASR/LM implementation with PyTorch
Stars: ✭ 408 (+568.85%)
Mutual labels:  transformer, language-model
Gpt Scrolls
A collaborative collection of open-source safe GPT-3 prompts that work well
Stars: ✭ 195 (+219.67%)
Mutual labels:  transformer, language-model
wechsel
Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
Stars: ✭ 39 (-36.07%)
Mutual labels:  transfer-learning, language-model

MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems

License: MIT

This is the implementation of the EMNLP 2020 paper:

MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems. Zhaojiang Lin, Andrea Madotto, Genta Indra Winata, Pascale Fung [PDF]

Citation:

If you use any source code or datasets included in this toolkit in your work, please cite the following paper. The BibTeX entry is listed below:

@article{lin2020mintl,
    title={MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems},
    author={Zhaojiang Lin and Andrea Madotto and Genta Indra Winata and Pascale Fung},
    journal={arXiv preprint arXiv:2009.12005},
    year={2020}
}

Abstract:

In this paper, we propose Minimalist Transfer Learning (MinTL) to simplify the system design process of task-oriented dialogue systems and alleviate the over-dependency on annotated data. MinTL is a simple yet effective transfer learning framework that allows us to plug-and-play pre-trained seq2seq models and jointly learn dialogue state tracking and dialogue response generation. Unlike previous approaches, which use a copy mechanism to "carry over" the old dialogue state to the new one, we introduce Levenshtein belief spans (Lev), which allow efficient dialogue state tracking with a minimal generation length. We instantiate our learning framework with two pre-trained backbones, T5 (Raffel et al., 2019) and BART (Lewis et al., 2019), and evaluate them on MultiWOZ. Extensive experiments demonstrate that: 1) our systems establish new state-of-the-art results on end-to-end response generation, 2) MinTL-based systems are more robust than baseline methods in the low-resource setting and achieve competitive results with only 20% of the training data, and 3) Lev greatly improves inference efficiency.
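
The Lev mechanism can be pictured as a dictionary merge: instead of regenerating the full belief state at every turn, the seq2seq model generates only the slots that changed, and those edits are applied to the previous belief state. Below is a minimal Python sketch of that idea, assuming a dict-based belief state and a hypothetical [DEL] deletion marker; it illustrates the concept only and is not the repository's exact data format.

def apply_lev(prev_belief, lev_edits):
    """Merge generated edits (Lev) into the previous belief state.

    prev_belief: dict mapping domain -> {slot: value}
    lev_edits:   dict of the same shape, containing only the changed slots
    """
    new_belief = {domain: slots.copy() for domain, slots in prev_belief.items()}
    for domain, slots in lev_edits.items():
        new_belief.setdefault(domain, {})
        for slot, value in slots.items():
            if value == "[DEL]":      # hypothetical marker for a removed slot
                new_belief[domain].pop(slot, None)
            else:                     # new or updated slot value
                new_belief[domain][slot] = value
    return new_belief

prev = {"hotel": {"area": "centre", "stars": "4"}}
lev = {"hotel": {"people": "2"}, "restaurant": {"food": "italian"}}
print(apply_lev(prev, lev))
# {'hotel': {'area': 'centre', 'stars': '4', 'people': '2'}, 'restaurant': {'food': 'italian'}}

Because only the edits need to be decoded, the generation length per turn stays short even as the dialogue and its belief state grow, which is where the reported gain in inference efficiency comes from.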

Dependencies

Check the required packages in requirements.txt, or simply run the command below:

❱❱❱ pip install -r requirements.txt

Experimental Setup

We use the preprocessing scripts from DAMD. Please check setup.sh for data preprocessing.

Experiments

T5 End2End

❱❱❱ python train.py --mode train --context_window 2 --pretrained_checkpoint t5-small --cfg seed=557 batch_size=32

T5 DST

❱❱❱ python DST.py --mode train --context_window 3 --cfg seed=557 batch_size=32

BART End2End

❱❱❱ python train.py --mode train --context_window 2 --pretrained_checkpoint bart-large-cnn --gradient_accumulation_steps 8 --lr 3e-5 --back_bone bart --cfg seed=557 batch_size=8

BART DST

❱❱❱ python DST.py --mode train --context_window 3 --gradient_accumulation_steps 10 --pretrained_checkpoint bart-large-cnn --back_bone bart --lr 1e-5 --cfg seed=557 batch_size=4
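
Note that with gradient accumulation, the effective batch size is batch_size × gradient_accumulation_steps (assuming the usual accumulation semantics): 8 × 8 = 64 for BART End2End and 4 × 10 = 40 for BART DST, i.e. comparable to the batch_size of 32 used in the T5 runs.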

Check run.py for more information.
