All Projects → joeljang → continual-knowledge-learning

joeljang / continual-knowledge-learning

Licence: other
[ICLR 2022] Towards Continual Knowledge Learning of Language Models

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to continual-knowledge-learning

class-norm
Class Normalization for Continual Zero-Shot Learning
Stars: ✭ 34 (-55.84%)
Mutual labels:  continual-learning
BLIP
Official Implementation of CVPR2021 paper: Continual Learning via Bit-Level Information Preserving
Stars: ✭ 33 (-57.14%)
Mutual labels:  continual-learning
MetaLifelongLanguage
Repository containing code for the paper "Meta-Learning with Sparse Experience Replay for Lifelong Language Learning".
Stars: ✭ 21 (-72.73%)
Mutual labels:  continual-learning
FUSION
PyTorch code for NeurIPSW 2020 paper (4th Workshop on Meta-Learning) "Few-Shot Unsupervised Continual Learning through Meta-Examples"
Stars: ✭ 18 (-76.62%)
Mutual labels:  continual-learning
reproducible-continual-learning
Continual learning baselines and strategies from popular papers, using Avalanche. We include EWC, SI, GEM, AGEM, LwF, iCarl, GDumb, and other strategies.
Stars: ✭ 118 (+53.25%)
Mutual labels:  continual-learning
GPM
Official Code Repository for "Gradient Projection Memory for Continual Learning"
Stars: ✭ 50 (-35.06%)
Mutual labels:  continual-learning
php-best-practices
What I consider the best practices for web and software development.
Stars: ✭ 60 (-22.08%)
Mutual labels:  continual-learning
SIGIR2021 Conure
One Person, One Model, One World: Learning Continual User Representation without Forgetting
Stars: ✭ 23 (-70.13%)
Mutual labels:  continual-learning
Remembering-for-the-Right-Reasons
Official Implementation of Remembering for the Right Reasons (ICLR 2021)
Stars: ✭ 27 (-64.94%)
Mutual labels:  continual-learning
cvpr clvision challenge
CVPR 2020 Continual Learning Challenge - Submit your CL algorithm today!
Stars: ✭ 57 (-25.97%)
Mutual labels:  continual-learning
Adam-NSCL
PyTorch implementation of our Adam-NSCL algorithm from our CVPR2021 (oral) paper "Training Networks in Null Space for Continual Learning"
Stars: ✭ 34 (-55.84%)
Mutual labels:  continual-learning
OCDVAEContinualLearning
Open-source code for our paper: Unified Probabilistic Deep Continual Learning through Generative Replay and Open Set Recognition
Stars: ✭ 56 (-27.27%)
Mutual labels:  continual-learning
CVPR21 PASS
PyTorch implementation of our CVPR2021 (oral) paper "Prototype Augmentation and Self-Supervision for Incremental Learning"
Stars: ✭ 55 (-28.57%)
Mutual labels:  continual-learning
Generative Continual Learning
No description or website provided.
Stars: ✭ 51 (-33.77%)
Mutual labels:  continual-learning
CPG
Steven C. Y. Hung, Cheng-Hao Tu, Cheng-En Wu, Chien-Hung Chen, Yi-Ming Chan, and Chu-Song Chen, "Compacting, Picking and Growing for Unforgetting Continual Learning," Thirty-third Conference on Neural Information Processing Systems, NeurIPS 2019
Stars: ✭ 91 (+18.18%)
Mutual labels:  continual-learning
FACIL
Framework for Analysis of Class-Incremental Learning with 12 state-of-the-art methods and 3 baselines.
Stars: ✭ 411 (+433.77%)
Mutual labels:  continual-learning
CVPR2021 PLOP
Official code of CVPR 2021's PLOP: Learning without Forgetting for Continual Semantic Segmentation
Stars: ✭ 102 (+32.47%)
Mutual labels:  continual-learning
class-incremental-learning
PyTorch implementation of a VAE-based generative classifier, as well as other class-incremental learning methods that do not store data (DGR, BI-R, EWC, SI, CWR, CWR+, AR1, the "labels trick", SLDA).
Stars: ✭ 30 (-61.04%)
Mutual labels:  continual-learning
Continual Learning Data Former
A pytorch compatible data loader to create sequence of tasks for Continual Learning
Stars: ✭ 32 (-58.44%)
Mutual labels:  continual-learning
course-content-dl
NMA deep learning course
Stars: ✭ 537 (+597.4%)
Mutual labels:  continual-learning

Towards Continual Knowledge Learning of Language Models

This is the official github repository for Towards Continual Knowledge Learning of Language Models, accepted at ICLR 2022.

Use the following to cite our paper:

@inproceedings{jang2022towards,
  title={Towards Continual Knowledge Learning of Language Models},
  author={Jang, Joel and Ye, Seonghyeon and Yang, Sohee and Shin, Joongbo and Han, Janghoon and Kim, Gyeonghun and Choi, Stanley Jungkyu and Seo, Minjoon},
  booktitle={ICLR},
  year={2022}
}

In order to reproduce our results, take the following steps:

1. Create conda environment and install requirements

conda create -n ckl python=3.8 && conda activate ckl
pip install -r requirements.txt

Also, make sure to install the correct version of pytorch corresponding to the CUDA version and environment: Refer to https://pytorch.org/

#For CUDA 10.x
pip3 install torch torchvision torchaudio
#For CUDA 11.x
pip3 install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html

2. Download the data used for the experiments.

To download only the CKL benchmark dataset:

wget https://continual.blob.core.windows.net/ckl/ckl_data.zip

To download ALL of the data used for the experiments (required to reproduce results):

wget https://continual.blob.core.windows.net/ckl/data.zip

To download the (continually pretrained) model checkpoints of the main experiment (required to reproduce results):

wget https://continual.blob.core.windows.net/ckl/modelcheckpoints_main.zip

For the other experimental settings such as multiple CKL phases, GPT-2, we do not separately provide the continually pretrained model checkpoints.

3. Reproducing Experimental Results

We provide all the configs in order to reproduce the zero-shot results of our paper. We only provide the model checkpoints for the main experimental setting (full_setting) which can be downloaded with the command above.

configs
├── full_setting
│   ├── evaluation
│   |   ├── invariantLAMA
│   |   |   ├── t5_baseline.json
│   |   |   ├── t5_kadapters.json
│   |   |   ├── ...
│   |   ├── newLAMA
│   |   ├── newLAMA_easy
│   |   ├── updatedLAMA
│   ├── training
│   |   ├── t5_baseline.json
│   |   ├── t5_kadapters.json
│   |   ├── ...
├── GPT2
│   ├── ...
├── kilt
│   ├── ...
├── small_setting
│   ├── ...
├── split
│   ├── ...                    

Components in each configurations file

  • input_length (int) : the input sequence length
  • output_length (int) : the output sequence length
  • num_train_epochs (int) : number of training epochs
  • output_dir (string) : the directory to save the model checkpoints
  • dataset (string) : the dataset to perform zero-shot evaluation or continual pretraining
  • dataset_version (string) : the version of the dataset ['full', 'small', 'debug']
  • train_batch_size (int) : batch size used for training
  • learning rate (float) : learning rate used for training
  • model (string) : model name in huggingface models (https://huggingface.co/models)
  • method (string) : method being used ['baseline', 'kadapter', 'lora', 'mixreview', 'modular_small', 'recadam']
  • freeze_level (int) : how much of the model to freeze during traininig (0 for none, 1 for freezing only encoder, 2 for freezing all of the parameters)
  • gradient_accumulation_steps (int) : gradient accumulation used to match the global training batch of each method
  • ngpu (int) : number of gpus used for the run
  • num_workers (int) : number of workers for the Dataloader
  • resume_from_checkpoint (string) : null by default. directory to model checkpoint if resuming from checkpoint
  • accelerator (string) : 'ddp' by default. the pytorch lightning accelerator to be used.
  • use_deepspeed (bool) : false by default. Currently not extensively tested.
  • CUDA_VISIBLE_DEVICES (string) : gpu devices that are made available for this run (e.g. "0,1,2,3", "0")
  • wandb_log (bool) : whether to log experiment through wandb
  • wandb_project (string) : project name of wandb
  • wandb_run_name (string) : the name of this training run
  • mode (string) : 'pretrain' for all configs
  • use_lr_scheduling (bool) : true if using learning rate scheduling
  • check_validation (bool) : true for evaluation (no training)
  • checkpoint_path (string) : path to the model checkpoint that is used for evaluation
  • output_log (string) : directory to log evaluation results to
  • split_num (int) : default is 1. more than 1 if there are multile CKL phases
  • split (int) : which CKL phase it is

This is an example of getting the invariantLAMA zero-shot evaluation of continually pretrained t5_kadapters

python run.py --config configs/full_setting/evaluation/invariantLAMA/t5_kadapters.json

This is an example of performing continual pretraining on CC-RecentNews (main experiment) with t5_kadapters

python run.py --config configs/full_setting/training/t5_kadapters.json

Reference

@article{jang2021towards,
  title={Towards Continual Knowledge Learning of Language Models},
  author={Jang, Joel and Ye, Seonghyeon and Yang, Sohee and Shin, Joongbo and Han, Janghoon and Kim, Gyeonghun and Choi, Stanley Jungkyu and Seo, Minjoon},
  journal={arXiv preprint arXiv:2110.03215},
  year={2021}
}
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].