tuvuumass / task-transferability

License: Apache-2.0
Data and code for our paper "Exploring and Predicting Transferability across NLP Tasks", to appear at EMNLP 2020.


Projects that are alternatives of or similar to task-transferability

SIGIR2021 Conure
One Person, One Model, One World: Learning Continual User Representation without Forgetting
Stars: ✭ 23 (-34.29%)
Mutual labels:  transfer-learning, bert
ParsBigBird
Persian Bert For Long-Range Sequences
Stars: ✭ 58 (+65.71%)
Mutual labels:  transfer-learning, bert
Filipino-Text-Benchmarks
Open-source benchmark datasets and pretrained transformer models in the Filipino language.
Stars: ✭ 22 (-37.14%)
Mutual labels:  transfer-learning, bert
Haystack
🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Stars: ✭ 3,409 (+9640%)
Mutual labels:  transfer-learning, bert
Kashgari
Kashgari is a production-level NLP Transfer learning framework built on top of tf.keras for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.
Stars: ✭ 2,235 (+6285.71%)
Mutual labels:  transfer-learning, bert
backprop
Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
Stars: ✭ 229 (+554.29%)
Mutual labels:  transfer-learning, bert
oreilly-bert-nlp
This repository contains code for the O'Reilly Live Online Training for BERT
Stars: ✭ 19 (-45.71%)
Mutual labels:  transfer-learning, bert
wechsel
Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
Stars: ✭ 39 (+11.43%)
Mutual labels:  transfer-learning, bert
label-studio-transformers
Label data using HuggingFace's transformers and automatically get a prediction service
Stars: ✭ 117 (+234.29%)
Mutual labels:  bert
aml-keras-image-recognition
A sample Azure Machine Learning project for Transfer Learning-based custom image recognition by utilizing Keras.
Stars: ✭ 14 (-60%)
Mutual labels:  transfer-learning
WSDM2022-PTUPCDR
This is the official implementation of our paper Personalized Transfer of User Preferences for Cross-domain Recommendation (PTUPCDR), which has been accepted by WSDM2022.
Stars: ✭ 65 (+85.71%)
Mutual labels:  transfer-learning
deep-learning
Projects include the application of transfer learning to build a convolutional neural network (CNN) that identifies the artist of a painting, the building of predictive models for Bitcoin price data using Long Short-Term Memory recurrent neural networks (LSTMs) and a tutorial explaining how to build two types of neural network using as input the…
Stars: ✭ 43 (+22.86%)
Mutual labels:  transfer-learning
MoeFlow
Repository for anime characters recognition website, powered by TensorFlow
Stars: ✭ 113 (+222.86%)
Mutual labels:  transfer-learning
AB distillation
Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons (AAAI 2019)
Stars: ✭ 105 (+200%)
Mutual labels:  transfer-learning
paper annotations
A place to keep track of all the annotated papers.
Stars: ✭ 96 (+174.29%)
Mutual labels:  transfer-learning
speech-recognition-transfer-learning
Speech command recognition DenseNet transfer learning from UrbanSound8k in keras tensorflow
Stars: ✭ 18 (-48.57%)
Mutual labels:  transfer-learning
bert nli
A Natural Language Inference (NLI) model based on Transformers (BERT and ALBERT)
Stars: ✭ 97 (+177.14%)
Mutual labels:  bert
rasa milktea chatbot
Chatbot with bert chinese model, base on rasa framework(中文聊天机器人,结合bert意图分析,基于rasa框架)
Stars: ✭ 97 (+177.14%)
Mutual labels:  bert
bert-tensorflow-pytorch-spacy-conversion
Instructions for how to convert a BERT Tensorflow model to work with HuggingFace's pytorch-transformers, and spaCy. This walk-through uses DeepPavlov's RuBERT as example.
Stars: ✭ 26 (-25.71%)
Mutual labels:  bert
Keras-MultiClass-Image-Classification
Multiclass image classification using Convolutional Neural Network
Stars: ✭ 48 (+37.14%)
Mutual labels:  transfer-learning

Task transferability

Note: This repository is a work-in-progress.

Data and code for our paper Exploring and Predicting Transferability across NLP Tasks, published at EMNLP 2020.

Task embedding pipeline.

Figure 1: A demonstration of our task embedding pipeline. Given a target task, we first compute its task embedding and then identify the most similar source task embedding (in this example, WikiHop) from a precomputed library via cosine similarity. Finally, we perform intermediate-task transfer from the selected source task to the target task.


Installation

This repository was developed with Python 3.7.5, PyTorch 1.4.0, and Hugging Face Transformers 2.1.1.

You should install all necessary Python packages in a virtual environment. If you are unfamiliar with Python virtual environments, please check out the user guide here.

Below, we create a virtual environment and activate it:

conda create -n task-transferability python=3.7
conda activate task-transferability

Now, install all necessary packages:

cd task-transferability
pip install -r requirements.txt
pip install [--editable] .

Data

You can download the classification/regression (CR) and question answering (QA) datasets we study at this Google Drive link. Unfortunately, most (if not all) of the sequence labeling (SL) datasets used in our paper are not publicly available; more details can be found here.

Fine-tuning BERT on downstream NLP tasks

We have separate code for fine-tuning BERT on each class of problems: text classification/regression, question answering, and sequence labeling. If you are interested in writing a common API for fine-tuning BERT on different classes of problems, check out a homework assignment I designed for our Advanced NLP class at this Google Colab link (see the function do_target_task_finetuning())!

Fine-tuning BERT on text classification/regression (CR) tasks

The following example code fine-tunes BERT on the MRPC task:

export SEED_ID=42
export DATA_DIR=/path/to/MRPC/data/dir
export MODEL_TYPE=bert
export MODEL_NAME_OR_PATH=bert-base-uncased
export TASK_NAME=MRPC
export CACHE_DIR=/path/to/cache/dir
export FINETUNE_FEATURE_EXTRACTOR=True # set to False to freeze BERT and train the output layer only
export SAVE=all # model checkpoints to save; can be a specific checkpoint, e.g., 0, 1, ..., num_train_epochs
export OUTPUT_DIR=/path/to/output/dir

python ./run_finetuning_CR.py \
    --model_type ${MODEL_TYPE} \
    --model_name_or_path ${MODEL_NAME_OR_PATH} \
    --task_name ${TASK_NAME} \
    --do_train \
    --do_eval \
    --do_lower_case \
    --finetune_feature_extractor ${FINETUNE_FEATURE_EXTRACTOR} \
    --data_dir ${DATA_DIR} \
    --max_seq_length 128 \
    --per_gpu_eval_batch_size=8  \
    --per_gpu_train_batch_size=32  \
    --learning_rate 2e-5 \
    --num_train_epochs 3 \
    --output_dir ${OUTPUT_DIR} \
    --cache_dir ${CACHE_DIR} \
    --save ${SAVE} \
    --seed ${SEED_ID}
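
For intuition, here is a minimal sketch of what FINETUNE_FEATURE_EXTRACTOR=False amounts to: freeze the BERT encoder and train only the task-specific output layer. This is not the repository's training loop; it assumes recent Hugging Face transformers and PyTorch APIs, and the model and optimizer settings are illustrative.

import torch
from transformers import BertForSequenceClassification

# Conceptual sketch only: with FINETUNE_FEATURE_EXTRACTOR=False, the BERT feature
# extractor stays frozen while the output layer is trained from scratch.
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

for param in model.bert.parameters():   # the shared BERT encoder ("feature extractor")
    param.requires_grad = False         # frozen: excluded from gradient updates

# Only the randomly initialized classifier head receives updates.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=2e-5
)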

Fine-tuning BERT on question answering (QA) tasks

Here is an example that fine-tunes BERT on the SQuAD-2 task:

export SEED_ID=42
export DATA_DIR=/path/to/SQuAD-2/data/dir
export MODEL_TYPE=bert
export MODEL_NAME_OR_PATH=bert-base-uncased
export VERSION_2_WITH_NEGATIVE=True
export CACHE_DIR=/path/to/cache/dir
export FINETUNE_FEATURE_EXTRACTOR=True # set to False to freeze BERT and train the output layer only
export SAVE=all # model checkpoints to save; can be a specific checkpoint, e.g., 0, 1, ..., num_train_epochs
export OUTPUT_DIR=/path/to/output/dir

python ./run_finetuning_QA.py \
    --model_type ${MODEL_TYPE} \
    --model_name_or_path ${MODEL_NAME_OR_PATH} \
    --do_train \
    --do_eval \
    --do_lower_case \
    --version_2_with_negative ${VERSION_2_WITH_NEGATIVE} \
    --finetune_feature_extractor ${FINETUNE_FEATURE_EXTRACTOR} \
    --train_file ${DATA_DIR}/train.json \
    --predict_file ${DATA_DIR}/dev.json \
    --max_seq_length 384 \
    --per_gpu_eval_batch_size=8  \
    --per_gpu_train_batch_size=12  \
    --learning_rate 3e-5 \
    --num_train_epochs 3 \
    --doc_stride 128 \
    --output_dir ${OUTPUT_DIR} \
    --cache_dir ${CACHE_DIR} \
    --save ${SAVE} \
    --seed ${SEED_ID}

Fine-tuning BERT on sequence labeling (SL) tasks

This example code fine-tunes BERT on the NER task:

export SEED_ID=42
export DATA_DIR=/path/to/NER/data/dir
export MODEL_TYPE=bert
export MODEL_NAME_OR_PATH=bert-base-uncased
export LABEL_FILE=${DATA_DIR}/labels.txt
export CACHE_DIR=/path/to/cache/dir
export FINETUNE_FEATURE_EXTRACTOR=True # set to False to freeze BERT and train the output layer only
export SAVE=all # model checkpoints to save; can be a specific checkpoint, e.g., 0, 1, ..., num_train_epochs
export OUTPUT_DIR=/path/to/output/dir

python ./run_finetuning_SL.py \
    --model_type ${MODEL_TYPE} \
    --model_name_or_path ${MODEL_NAME_OR_PATH} \
    --do_train \
    --do_eval \
    --do_lower_case \
    --labels ${LABEL_FILE} \
    --finetune_feature_extractor ${FINETUNE_FEATURE_EXTRACTOR} \
    --data_dir ${DATA_DIR} \
    --max_seq_length 128 \
    --per_gpu_eval_batch_size=8  \
    --per_gpu_train_batch_size=32  \
    --learning_rate 5e-5 \
    --num_train_epochs 6 \
    --output_dir ${OUTPUT_DIR} \
    --cache_dir ${CACHE_DIR} \
    --save ${SAVE} \
    --seed ${SEED_ID}

Note: You might want to use a Cased model if case information is important for your task (e.g., named entity recognition or part-of-speech tagging).

Intermediate-task transfer with BERT

Since the target task is typically different from the source task, we discard the output layer of the model fine-tuned on the source task and pair BERT with a new output layer, learned from scratch, to form a task-specific model for the target task. You can do this by running our fine-tuning code with the argument model_load_mode = "base_model_only" (a conceptual sketch is shown below, after Figure 2).

Transfer results.

Figure 2: Our experiments show that even low-data source tasks that on the surface appear very different from the target task can yield transfer gains. Out-of-class transfer (e.g., from a question answering task to a classification task) succeeds in many cases, some of which are unintuitive. In this violin plot, each point within a violin represents the performance on the associated target task after intermediate-task transfer from a particular source task; the black line corresponds to directly fine-tuning BERT on the target task without any intermediate-task transfer.
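
The snippet below sketches what loading with model_load_mode = "base_model_only" amounts to: keep the BERT encoder fine-tuned on the source task, drop the source task's output layer, and attach a freshly initialized output layer for the target task. This is a conceptual sketch, not the repository's loading code; it assumes the Hugging Face transformers API, and the checkpoint path is an illustrative placeholder.

from transformers import BertModel, BertForSequenceClassification

# Illustrative path to a model fine-tuned on the source task (here, a QA task).
source_checkpoint = "/path/to/BERT/HotpotQA/model/checkpoint-3"

# Loading the checkpoint into BertModel keeps only the shared encoder weights;
# the source task's output layer is simply not loaded.
source_encoder = BertModel.from_pretrained(source_checkpoint)

# Target-task model: the same encoder architecture plus a randomly initialized output layer.
target_model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
target_model.bert.load_state_dict(source_encoder.state_dict())  # transfer the encoder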

Intermediate-task transfer to text classification/regression (CR) tasks

This example code performs intermediate-task transfer from the HotpotQA task (question answering) to the MRPC task (text classification/regression):

export SEED_ID=42
export TARGET_DATA_DIR=/path/to/MRPC/data/dir
export MODEL_TYPE=bert
export TARGET_TASK=MRPC
export CACHE_DIR=/path/to/cache/dir
export FINETUNE_FEATURE_EXTRACTOR=True
export MODEL_LOAD_MODE=base_model_only
export SAVE=all
export SOURCE_CHECKPOINT=3
export MODEL_NAME_OR_PATH=/path/to/BERT/HotpotQA/model/checkpoint-${SOURCE_CHECKPOINT}
export OUTPUT_DIR=/path/to/output/dir

python ./run_finetuning_CR.py \
    --model_type ${MODEL_TYPE} \
    --model_name_or_path ${MODEL_NAME_OR_PATH} \
    --task_name ${TARGET_TASK} \
    --do_train \
    --do_eval \
    --do_lower_case \
    --finetune_feature_extractor ${FINETUNE_FEATURE_EXTRACTOR} \
    --model_load_mode ${MODEL_LOAD_MODE} \
    --data_dir ${TARGET_DATA_DIR} \
    --max_seq_length 128 \
    --per_gpu_eval_batch_size=8  \
    --per_gpu_train_batch_size=32  \
    --learning_rate 2e-5 \
    --num_train_epochs 3 \
    --output_dir ${OUTPUT_DIR} \
    --cache_dir ${CACHE_DIR} \
    --save ${SAVE} \
    --seed ${SEED_ID}

Intermediate-task transfer to question answering (QA) tasks

The following script does intermediate-task transfer from the GParent task (sequence labeling) to the CQ task (question answering):

export SEED_ID=42
export TARGET_DATA_DIR=/path/to/CQ/data/dir
export MODEL_TYPE=bert
export VERSION_2_WITH_NEGATIVE=False
export CACHE_DIR=/path/to/cache/dir
export FINETUNE_FEATURE_EXTRACTOR=True
export MODEL_LOAD_MODE=base_model_only
export SAVE=all
export SOURCE_CHECKPOINT=6
export MODEL_NAME_OR_PATH=/path/to/BERT/GParent/model/checkpoint-${SOURCE_CHECKPOINT}
export OUTPUT_DIR=/path/to/output/dir

python ./run_finetuning_QA.py \
    --model_type ${MODEL_TYPE} \
    --model_name_or_path ${MODEL_NAME_OR_PATH} \
    --do_train \
    --do_eval \
    --do_lower_case \
    --version_2_with_negative ${VERSION_2_WITH_NEGATIVE} \
    --finetune_feature_extractor ${FINETUNE_FEATURE_EXTRACTOR} \
    --model_load_mode ${MODEL_LOAD_MODE} \
    --train_file ${TARGET_DATA_DIR}/train.json \
    --predict_file ${TARGET_DATA_DIR}/dev.json \
    --max_seq_length 384 \
    --per_gpu_eval_batch_size=8  \
    --per_gpu_train_batch_size=12  \
    --learning_rate 3e-5 \
    --num_train_epochs 3 \
    --doc_stride 128 \
    --output_dir ${OUTPUT_DIR} \
    --cache_dir ${CACHE_DIR} \
    --save ${SAVE} \
    --seed ${SEED_ID}

Intermediate-task transfer to sequence labeling (SL) tasks

Below, we perform intermediate-task transfer from the MNLI task (text classification/regression) to the Conj task (sequence labeling):

export SEED_ID=42
export TARGET_DATA_DIR=/path/to/Conj/data/dir
export MODEL_TYPE=bert
export LABEL_FILE=${TARGET_DATA_DIR}/labels.txt
export CACHE_DIR=/path/to/cache/dir
export FINETUNE_FEATURE_EXTRACTOR=True
export MODEL_LOAD_MODE=base_model_only
export SAVE=all
export SOURCE_CHECKPOINT=3
export MODEL_NAME_OR_PATH=/path/to/BERT/MNLI/model/checkpoint-${SOURCE_CHECKPOINT}
export OUTPUT_DIR=/path/to/output/dir

python ./run_finetuning_SL.py \
    --model_type ${MODEL_TYPE} \
    --model_name_or_path ${MODEL_NAME_OR_PATH} \
    --do_train \
    --do_eval \
    --do_lower_case \
    --labels ${LABEL_FILE} \
    --finetune_feature_extractor ${FINETUNE_FEATURE_EXTRACTOR} \
    --model_load_mode ${MODEL_LOAD_MODE} \
    --data_dir ${TARGET_DATA_DIR} \
    --max_seq_length 128 \
    --per_gpu_eval_batch_size=8  \
    --per_gpu_train_batch_size=32  \
    --learning_rate 5e-5 \
    --num_train_epochs 6 \
    --output_dir ${OUTPUT_DIR} \
    --cache_dir ${CACHE_DIR} \
    --save ${SAVE} \
    --seed ${SEED_ID}

TextEmb

TextEmb is computed by pooling BERT’s final-layer token-level representations across an entire dataset, and as such captures properties of the text and domain. This method does not depend on the training labels and thus can be used in zero-shot transfer.

TextEmb space.

Figure 3a: A visualization of the task space of TextEmb. It captures domain similarity (e.g., the Penn Treebank sequence labeling tasks are highly interconnected).
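
To make the idea concrete, here is a minimal sketch of the pooling step behind TextEmb: average BERT's final-layer token representations over a (toy) dataset. It is not the repository's run_textemb_*.py code and details such as the handling of special tokens may differ; the example sentences are placeholders, and a recent Hugging Face transformers API is assumed.

import torch
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased").eval()

# Toy stand-in for a task's training texts; in practice, iterate over the whole dataset.
texts = ["He said the food was okay.", "The quick brown fox jumps over the lazy dog."]

total = torch.zeros(model.config.hidden_size)
n_tokens = 0
with torch.no_grad():
    for text in texts:
        enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, 768) final-layer states
        total += hidden.sum(dim=0)                  # accumulate token-level vectors
        n_tokens += hidden.size(0)

text_emb = (total / n_tokens).numpy()               # 768-dimensional TextEmb-style vector
print(text_emb.shape)                               # (768,)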

Computing TextEmb for text classification/regression (CR) tasks

You can run something like the following script to compute TextEmb for the MRPC task:

export SEED_ID=42
export DATA_DIR=/path/to/MRPC/data/dir
export MODEL_TYPE=bert
export MODEL_NAME_OR_PATH=bert-base-uncased
export TASK_NAME=MRPC
export CACHE_DIR=/path/to/cache/dir
export OUTPUT_DIR=/path/to/output/dir

python run_textemb_CR.py \
    --model_type ${MODEL_TYPE} \
    --model_name_or_path ${MODEL_NAME_OR_PATH} \
    --task_name ${TASK_NAME} \
    --do_lower_case \
    --data_dir ${DATA_DIR} \
    --max_seq_length 128 \
    --per_gpu_train_batch_size=32  \
    --num_train_epochs 1 \
    --output_dir ${OUTPUT_DIR} \
    --cache_dir ${CACHE_DIR} \
    --seed ${SEED_ID}

Let's look at the TextEmb task embedding:

import numpy as np

task_emb = np.load("/path/to/output/dir/avg_sequence_output.npy").reshape(-1)
print(task_emb.shape) # (768, )
print(task_emb)

Computing TextEmb for question answering (QA) tasks

This example code computes TextEmb for the SQuAD-2 task:

export SEED_ID=42
export DATA_DIR=/path/to/SQuAD-2/data/dir
export MODEL_TYPE=bert
export MODEL_NAME_OR_PATH=bert-base-uncased
export VERSION_2_WITH_NEGATIVE=True
export CACHE_DIR=/path/to/cache/dir
export OUTPUT_DIR=/path/to/output/dir

python ./run_textemb_QA.py \
    --model_type ${MODEL_TYPE} \
    --model_name_or_path ${MODEL_NAME_OR_PATH} \
    --do_lower_case \
    --version_2_with_negative ${VERSION_2_WITH_NEGATIVE} \
    --train_file ${DATA_DIR}/train.json \
    --max_seq_length 384 \
    --per_gpu_train_batch_size=12  \
    --num_train_epochs 1 \
    --doc_stride 128 \
    --output_dir ${OUTPUT_DIR} \
    --cache_dir ${CACHE_DIR} \
    --seed ${SEED_ID}

Computing TextEmb for sequence labeling (SL) tasks

Here is an example that computes TextEmb for the NER task:

export SEED_ID=42
export DATA_DIR=/path/to/NER/data/dir
export MODEL_TYPE=bert
export MODEL_NAME_OR_PATH=bert-base-uncased
export LABEL_FILE=${DATA_DIR}/labels.txt
export CACHE_DIR=/path/to/cache/dir
export OUTPUT_DIR=/path/to/output/dir

python ./run_textemb_SL.py \
    --model_type ${MODEL_TYPE} \
    --model_name_or_path ${MODEL_NAME_OR_PATH} \
    --do_lower_case \
    --labels ${LABEL_FILE} \
    --data_dir ${DATA_DIR} \
    --max_seq_length 128 \
    --per_gpu_train_batch_size=32  \
    --num_train_epochs 1 \
    --output_dir ${OUTPUT_DIR} \
    --cache_dir ${CACHE_DIR} \
    --seed ${SEED_ID}

TaskEmb

TaskEmb relies on the interaction between the fine-tuning loss function and the parameters of BERT, and encodes more information about the type of knowledge and reasoning required to solve the task. It is computationally more expensive than TextEmb since it requires fine-tuning BERT. If you have enough training data for your task, you can try freezing BERT during fine-tuning and training only the output layer. We find empirically that TaskEmb computed from a fine-tuned, task-specific BERT yields better correlations with task transferability in data-constrained scenarios.

TaskEmb space.

Figure 3b: A visualization of the task space of TaskEmb. It focuses more on task similarity (e.g., the two part-of-speech tagging tasks are interconnected despite their domain dissimilarity).
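
As a rough illustration, the sketch below computes a TaskEmb-style embedding by averaging squared gradients of the task loss with respect to BERT's parameters (the diagonal of the empirical Fisher information) over a toy labeled set. It is a conceptual sketch, not the run_taskemb_*.py code (which, for example, saves embeddings per BERT component such as cls_output.npy); the example data are placeholders, and a recent Hugging Face transformers API is assumed.

import torch
from transformers import BertForSequenceClassification, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
model.eval()  # disable dropout; we only need gradients, not parameter updates

# Toy labeled examples standing in for the task's training set.
examples = [("The movie was great.", 1), ("The plot made no sense.", 0)]

fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters() if p.requires_grad}
for text, label in examples:
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    loss = model(**enc, labels=torch.tensor([label])).loss
    model.zero_grad()
    loss.backward()
    for n, p in model.named_parameters():
        if p.grad is not None:
            fisher[n] += p.grad.detach() ** 2  # accumulate squared gradients

# Average over examples and flatten into a single task-embedding vector.
task_emb = torch.cat([(v / len(examples)).reshape(-1) for v in fisher.values()])
print(task_emb.shape)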

Computing TaskEmb for text classification/regression (CR) tasks

In this example, we will compute TaskEmb for the MRPC task. We assume that you have fine-tuned BERT on MRPC already. Use the following script to compute TaskEmb:

export SEED_ID=42
export DATA_DIR=/path/to/MRPC/data/dir
export MODEL_TYPE=bert
export TASK_NAME=MRPC
export USE_LABELS=True # set to False to sample from the model's predictive distribution
export CACHE_DIR=/path/to/cache/dir
export CHECKPOINT=3
export MODEL_NAME_OR_PATH=/path/to/BERT/MRPC/model/checkpoint-${CHECKPOINT}
# we start from a fine-tuned task-specific BERT so no need for further fine-tuning
export FURTHER_FINETUNE_CLASSIFIER=False
export FURTHER_FINETUNE_FEATURE_EXTRACTOR=False
export OUTPUT_DIR=/path/to/output/dir

python ./run_taskemb_CR.py \
    --model_type ${MODEL_TYPE} \
    --model_name_or_path ${MODEL_NAME_OR_PATH} \
    --task_name ${TASK_NAME} \
    --use_labels ${USE_LABELS} \
    --do_lower_case \
    --finetune_classifier ${FURTHER_FINETUNE_CLASSIFIER} \
    --finetune_feature_extractor ${FURTHER_FINETUNE_FEATURE_EXTRACTOR} \
    --data_dir ${DATA_DIR} \
    --max_seq_length 128 \
    --batch_size=32  \
    --learning_rate 2e-5 \
    --num_epochs 1 \
    --output_dir ${OUTPUT_DIR} \
    --cache_dir ${CACHE_DIR} \
    --seed ${SEED_ID}

Let's look at the TaskEmb task embedding computed from a specific component of BERT, i.e., the CLS output vector:

import numpy as np

task_emb = np.load("/path/to/output/dir/cls_output.npy").reshape(-1)
print(task_emb.shape) # (768, )
print(task_emb)

Computing TaskEmb for question answering (QA) tasks

Let's compute TaskEmb for the SQuAD-2 task. We assume that you have fine-tuned BERT on SQuAD-2 already. Run the following script to compute TaskEmb:

export SEED_ID=42
export DATA_DIR=/path/to/SQuAD-2/data/dir
export MODEL_TYPE=bert
export VERSION_2_WITH_NEGATIVE=True
export USE_LABELS=True # set to False to sample from the model's predictive distribution
export CACHE_DIR=/path/to/cache/dir
export CHECKPOINT=3
export MODEL_NAME_OR_PATH=/path/to/BERT/SQuAD-2/model/checkpoint-${CHECKPOINT}
# we start from a fine-tuned task-specific BERT so no need for further fine-tuning
export FURTHER_FINETUNE_CLASSIFIER=False
export FURTHER_FINETUNE_FEATURE_EXTRACTOR=False
export OUTPUT_DIR=/path/to/output/dir

python ./run_taskemb_QA.py \
    --model_type ${MODEL_TYPE} \
    --model_name_or_path ${MODEL_NAME_OR_PATH} \
    --use_labels ${USE_LABELS} \
    --do_lower_case \
    --finetune_classifier ${FURTHER_FINETUNE_CLASSIFIER} \
    --finetune_feature_extractor ${FURTHER_FINETUNE_FEATURE_EXTRACTOR} \
    --train_file ${DATA_DIR}/train.json \
    --version_2_with_negative ${VERSION_2_WITH_NEGATIVE} \
    --max_seq_length 384 \
    --batch_size=12  \
    --learning_rate 3e-5 \
    --num_epochs 1 \
    --doc_stride 128 \
    --output_dir ${OUTPUT_DIR} \
    --cache_dir ${CACHE_DIR} \
    --seed ${SEED_ID}

Computing TaskEmb for sequence labeling (SL) tasks

This example code computes TaskEmb for the NER task, which assumes that you have fine-tuned BERT on NER already:

export SEED_ID=42
export DATA_DIR=/path/to/NER/data/dir
export MODEL_TYPE=bert
export LABEL_FILE=${DATA_DIR}/labels.txt
export USE_LABELS=True # set to False to sample from the model's predictive distribution
export CACHE_DIR=/path/to/cache/dir
export CHECKPOINT=6
export MODEL_NAME_OR_PATH=/path/to/BERT/NER/model/checkpoint-${CHECKPOINT}
# we start from a fine-tuned task-specific BERT so no need for further fine-tuning
export FURTHER_FINETUNE_CLASSIFIER=False
export FURTHER_FINETUNE_FEATURE_EXTRACTOR=False
export OUTPUT_DIR=/path/to/output/dir

python ./run_taskemb_SL.py \
    --model_type ${MODEL_TYPE} \
    --model_name_or_path ${MODEL_NAME_OR_PATH} \
    --labels ${LABEL_FILE} \
    --use_labels ${USE_LABELS} \
    --do_lower_case \
    --finetune_classifier ${FURTHER_FINETUNE_CLASSIFIER} \
    --finetune_feature_extractor ${FURTHER_FINETUNE_FEATURE_EXTRACTOR} \
    --data_dir ${DATA_DIR} \
    --max_seq_length 128 \
    --batch_size=32  \
    --learning_rate 5e-5 \
    --num_epochs 1 \
    --output_dir ${OUTPUT_DIR} \
    --cache_dir ${CACHE_DIR} \
    --seed ${SEED_ID}

Pretrained models and precomputed task embeddings

Using task embeddings for source task selection

This notebook demonstrates using the TextEmb task embedding for source task selection.
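
If the notebook is unavailable, the following sketch shows the core idea (cf. Figure 1): rank precomputed source-task embeddings by cosine similarity to the target task's embedding and pick the top-ranked task as the transfer source. The file paths and task names in the library are illustrative placeholders, not files shipped with this repository.

import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Target-task embedding produced by one of the run_textemb_*.py scripts (illustrative path).
target_emb = np.load("/path/to/target/avg_sequence_output.npy").reshape(-1)

# Precomputed library of source-task embeddings (illustrative names and paths).
library = {
    "MNLI": np.load("/path/to/library/MNLI.npy").reshape(-1),
    "SQuAD-2": np.load("/path/to/library/SQuAD-2.npy").reshape(-1),
    "WikiHop": np.load("/path/to/library/WikiHop.npy").reshape(-1),
}

# Rank candidate source tasks by similarity; transfer from the top-ranked one.
ranked = sorted(library.items(), key=lambda kv: cosine(target_emb, kv[1]), reverse=True)
for name, emb in ranked:
    print(f"{name}: {cosine(target_emb, emb):.4f}")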

Citation

If you find this repository useful, please cite our paper:

@inproceedings{vu-etal-2020-task-transferability,
    title = "Exploring and Predicting Transferability across {NLP} Tasks",
    author = "Vu, Tu  and
      Wang, Tong  and
      Munkhdalai, Tsendsuren  and
      Sordoni, Alessandro  and
      Trischler, Adam  and
      Mattarella-Micke, Andrew  and
      Maji, Subhransu  and
      Iyyer, Mohit",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.emnlp-main.635",
    pages = "7882--7926",
}