zlinao / MinTL

License: MIT
MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems

Programming Languages

python — 139,335 projects (#7 most used programming language)
shell — 77,523 projects

Projects that are alternatives of or similar to MinTL

Nlp Paper
NLP Paper
Stars: ✭ 484 (+693.44%)
Mutual labels:  transformer, transfer-learning, language-model
Awesome Bert Nlp
A curated list of NLP resources focused on BERT, attention mechanism, Transformer networks, and transfer learning.
Stars: ✭ 567 (+829.51%)
Mutual labels:  transformer, transfer-learning, language-model
Gpt2
PyTorch Implementation of OpenAI GPT-2
Stars: ✭ 64 (+4.92%)
Mutual labels:  transformer, language-model
Indonesian Language Models
Indonesian language models and their usage
Stars: ✭ 64 (+4.92%)
Mutual labels:  transformer, language-model
Tupe
Transformer with Untied Positional Encoding (TUPE). Code for the paper "Rethinking Positional Encoding in Language Pre-training". Improves existing models like BERT.
Stars: ✭ 143 (+134.43%)
Mutual labels:  transformer, language-model
Bert Keras
Keras implementation of BERT with pre-trained weights
Stars: ✭ 820 (+1244.26%)
Mutual labels:  transformer, transfer-learning
Gpt2 French
GPT-2 French demo
Stars: ✭ 47 (-22.95%)
Mutual labels:  transformer, language-model
Transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Stars: ✭ 55,742 (+91280.33%)
Mutual labels:  transformer, language-model
Bert Pytorch
Google AI 2018 BERT pytorch implementation
Stars: ✭ 4,642 (+7509.84%)
Mutual labels:  transformer, language-model
Highway-Transformer
[ACL '20] Highway Transformer: A Gated Transformer.
Stars: ✭ 26 (-57.38%)
Mutual labels:  transformer, language-model
Relational Rnn Pytorch
An implementation of DeepMind's Relational Recurrent Neural Networks in PyTorch.
Stars: ✭ 236 (+286.89%)
Mutual labels:  transformer, language-model
backprop
Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
Stars: ✭ 229 (+275.41%)
Mutual labels:  transfer-learning, language-model
Getting Things Done With Pytorch
Jupyter Notebook tutorials on solving real-world problems with Machine Learning & Deep Learning using PyTorch. Topics: Face detection with Detectron 2, Time Series anomaly detection with LSTM Autoencoders, Object Detection with YOLO v5, Build your first Neural Network, Time Series forecasting for Coronavirus daily cases, Sentiment Analysis with BERT.
Stars: ✭ 738 (+1109.84%)
Mutual labels:  transformer, transfer-learning
Vietnamese Electra
Electra pre-trained model using Vietnamese corpus
Stars: ✭ 55 (-9.84%)
Mutual labels:  transformer, language-model
Context-Transformer
Context-Transformer: Tackling Object Confusion for Few-Shot Detection, AAAI 2020
Stars: ✭ 89 (+45.9%)
Mutual labels:  transformer, transfer-learning
Pytorch Openai Transformer Lm
🐥A PyTorch implementation of OpenAI's finetuned transformer language model with a script to import the weights pre-trained by OpenAI
Stars: ✭ 1,268 (+1978.69%)
Mutual labels:  transformer, language-model
Flow Forecast
Deep learning PyTorch library for time series forecasting, classification, and anomaly detection (originally for flood forecasting).
Stars: ✭ 368 (+503.28%)
Mutual labels:  transformer, transfer-learning
Neural sp
End-to-end ASR/LM implementation with PyTorch
Stars: ✭ 408 (+568.85%)
Mutual labels:  transformer, language-model
Gpt Scrolls
A collaborative collection of open-source safe GPT-3 prompts that work well
Stars: ✭ 195 (+219.67%)
Mutual labels:  transformer, language-model
wechsel
Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
Stars: ✭ 39 (-36.07%)
Mutual labels:  transfer-learning, language-model

MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems

License: MIT

This is the implementation of the EMNLP 2020 paper:

MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems. Zhaojiang Lin, Andrea Madotto, Genta Indra Winata, Pascale Fung [PDF]

Citation:

If you use any source code or datasets included in this toolkit in your work, please cite the following paper. The BibTeX entry is listed below:

@article{lin2020mintl,
    title={MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems},
    author={Zhaojiang Lin and Andrea Madotto and Genta Indra Winata and Pascale Fung},
    journal={arXiv preprint arXiv:2009.12005},
    year={2020}
}

Abstract:

In this paper, we propose Minimalist Transfer Learning (MinTL) to simplify the system design process of task-oriented dialogue systems and alleviate the over-dependency on annotated data. MinTL is a simple yet effective transfer learning framework that allows us to plug-and-play pre-trained seq2seq models and jointly learn dialogue state tracking and dialogue response generation. Unlike previous approaches, which use a copy mechanism to "carry over" the old dialogue state to the new one, we introduce Levenshtein belief spans (Lev), which allow efficient dialogue state tracking with a minimal generation length. We instantiate our learning framework with two pre-trained backbones, T5 (Raffel et al., 2019) and BART (Lewis et al., 2019), and evaluate them on MultiWOZ. Extensive experiments demonstrate that: 1) our systems establish new state-of-the-art results on end-to-end response generation, 2) MinTL-based systems are more robust than baseline methods in the low-resource setting and achieve competitive results with only 20% of the training data, and 3) Lev greatly improves inference efficiency.
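
The Lev mechanism can be pictured as a dictionary merge: instead of regenerating the full belief state at every turn, the seq2seq model generates only the slots that changed, and those edits are applied to the previous belief state. Below is a minimal Python sketch of that idea, assuming a dict-based belief state and a hypothetical [DEL] deletion marker; it illustrates the concept only and is not the repository's exact data format.

def apply_lev(prev_belief, lev_edits):
    """Merge generated edits (Lev) into the previous belief state.

    prev_belief: dict mapping domain -> {slot: value}
    lev_edits:   dict of the same shape, containing only the changed slots
    """
    new_belief = {domain: slots.copy() for domain, slots in prev_belief.items()}
    for domain, slots in lev_edits.items():
        new_belief.setdefault(domain, {})
        for slot, value in slots.items():
            if value == "[DEL]":      # hypothetical marker for a removed slot
                new_belief[domain].pop(slot, None)
            else:                     # new or updated slot value
                new_belief[domain][slot] = value
    return new_belief

prev = {"hotel": {"area": "centre", "stars": "4"}}
lev = {"hotel": {"people": "2"}, "restaurant": {"food": "italian"}}
print(apply_lev(prev, lev))
# {'hotel': {'area': 'centre', 'stars': '4', 'people': '2'}, 'restaurant': {'food': 'italian'}}

Because only the edits need to be decoded, the generation length per turn stays short even as the dialogue and its belief state grow, which is where the reported gain in inference efficiency comes from.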

Dependencies

Check the required packages in requirements.txt, or simply run the command below:

❱❱❱ pip install -r requirements.txt

Experimental Setup

We use the preprocessing scripts from DAMD. Please check setup.sh for data preprocessing.

Experiments

T5 End2End

❱❱❱ python train.py --mode train --context_window 2 --pretrained_checkpoint t5-small --cfg seed=557 batch_size=32

T5 DST

❱❱❱ python DST.py --mode train --context_window 3 --cfg seed=557 batch_size=32

BART End2End

❱❱❱ python train.py --mode train --context_window 2 --pretrained_checkpoint bart-large-cnn --gradient_accumulation_steps 8 --lr 3e-5 --back_bone bart --cfg seed=557 batch_size=8

BART DST

❱❱❱ python DST.py --mode train --context_window 3 --gradient_accumulation_steps 10 --pretrained_checkpoint bart-large-cnn --back_bone bart --lr 1e-5 --cfg seed=557 batch_size=4
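
Note that with gradient accumulation, the effective batch size is batch_size × gradient_accumulation_steps (assuming the usual accumulation semantics): 8 × 8 = 64 for BART End2End and 4 × 10 = 40 for BART DST, i.e. comparable to the batch_size of 32 used in the T5 runs.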

Check run.py for more information.
