Yujun-Shi / BLIP

License: MIT
Official Implementation of CVPR2021 paper: Continual Learning via Bit-Level Information Preserving

Programming Languages

  • python
  • shell

Projects that are alternatives to or similar to BLIP

CVPR2021 PLOP
Official code of CVPR 2021's PLOP: Learning without Forgetting for Continual Semantic Segmentation
Stars: ✭ 102 (+209.09%)
Mutual labels:  continual-learning, cvpr2021
HESIC
Official code of "Deep Homography for Efficient Stereo Image Compression" [CVPR 2021 Oral]
Stars: ✭ 42 (+27.27%)
Mutual labels:  cvpr2021
LabelRelaxation-CVPR21
Official PyTorch Implementation of Embedding Transfer with Label Relaxation for Improved Metric Learning, CVPR 2021
Stars: ✭ 37 (+12.12%)
Mutual labels:  cvpr2021
RainNet
[CVPR 2021] Region-aware Adaptive Instance Normalization for Image Harmonization
Stars: ✭ 125 (+278.79%)
Mutual labels:  cvpr2021
ADER
(RecSys 2020) Adaptively Distilled Exemplar Replay towards Continual Learning for Session-based Recommendation [Best Short Paper]
Stars: ✭ 28 (-15.15%)
Mutual labels:  continual-learning
Im2Vec
[CVPR 2021 Oral] Im2Vec: Synthesizing Vector Graphics without Vector Supervision
Stars: ✭ 229 (+593.94%)
Mutual labels:  cvpr2021
Generative Continual Learning
No description or website provided.
Stars: ✭ 51 (+54.55%)
Mutual labels:  continual-learning
DCNet
Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection, CVPR 2021
Stars: ✭ 113 (+242.42%)
Mutual labels:  cvpr2021
cvpr-buzz
🐝 Explore Trending Papers at CVPR
Stars: ✭ 37 (+12.12%)
Mutual labels:  cvpr2021
RSCD
[CVPR2021] Towards Rolling Shutter Correction and Deblurring in Dynamic Scenes
Stars: ✭ 83 (+151.52%)
Mutual labels:  cvpr2021
Remembering-for-the-Right-Reasons
Official Implementation of Remembering for the Right Reasons (ICLR 2021)
Stars: ✭ 27 (-18.18%)
Mutual labels:  continual-learning
BCNet
Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers [CVPR 2021]
Stars: ✭ 434 (+1215.15%)
Mutual labels:  cvpr2021
single-positive-multi-label
Multi-Label Learning from Single Positive Labels - CVPR 2021
Stars: ✭ 63 (+90.91%)
Mutual labels:  cvpr2021
Adam-NSCL
PyTorch implementation of our Adam-NSCL algorithm from our CVPR2021 (oral) paper "Training Networks in Null Space for Continual Learning"
Stars: ✭ 34 (+3.03%)
Mutual labels:  continual-learning
SkeletonMerger
Code repository for paper `Skeleton Merger: an Unsupervised Aligned Keypoint Detector`.
Stars: ✭ 49 (+48.48%)
Mutual labels:  cvpr2021
FUSION
PyTorch code for NeurIPSW 2020 paper (4th Workshop on Meta-Learning) "Few-Shot Unsupervised Continual Learning through Meta-Examples"
Stars: ✭ 18 (-45.45%)
Mutual labels:  continual-learning
reproducible-continual-learning
Continual learning baselines and strategies from popular papers, using Avalanche. We include EWC, SI, GEM, AGEM, LwF, iCarl, GDumb, and other strategies.
Stars: ✭ 118 (+257.58%)
Mutual labels:  continual-learning
CondenseNetV2
[CVPR 2021] CondenseNet V2: Sparse Feature Reactivation for Deep Networks
Stars: ✭ 73 (+121.21%)
Mutual labels:  cvpr2021
MetaBIN
[CVPR2021] Meta Batch-Instance Normalization for Generalizable Person Re-Identification
Stars: ✭ 58 (+75.76%)
Mutual labels:  cvpr2021
CVPR2021-Papers-with-Code-Demo
A collection of the latest CVPR results, including papers, code, and demo videos. Recommendations are welcome!
Stars: ✭ 752 (+2178.79%)
Mutual labels:  cvpr2021

(CVPR 2021) Continual Learning via Bit-Level Information Preserving [arXiv]

This repo contains the official implementation of the CVPR 2021 paper Continual Learning via Bit-Level Information Preserving.

Abstract

Continual learning tackles the setting of learning different tasks sequentially. Despite the many previous solutions, most of them still suffer from significant forgetting or expensive memory costs. In this work, targeting these problems, we first study the continual learning process through the lens of information theory, and observe that forgetting in a model stems from the loss of information gain on its parameters from previous tasks when learning a new task. From this viewpoint, we then propose a novel continual learning approach called Bit-Level Information Preserving (BLIP), which preserves the information gain on model parameters by updating the parameters at the bit level; this can be conveniently implemented with parameter quantization. More specifically, BLIP first trains a neural network with weight quantization on the new incoming task and then estimates the information gain on each parameter provided by the task data to determine which bits to freeze to prevent forgetting. We conduct extensive experiments ranging from classification tasks to reinforcement learning tasks, and the results show that our method produces results better than or on par with previous state-of-the-art methods. Indeed, BLIP achieves close to zero forgetting while requiring only a constant memory overhead throughout continual learning.
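
For intuition about the quantization ingredient, the sketch below shows a generic uniform quantizer that maps a weight in [-1, 1] to a b-bit integer code and back. This is an illustrative assumption for exposition, not necessarily the exact quantization scheme used in the paper.

def quantize(w: float, bits: int = 10) -> int:
    """Uniformly quantize a weight in [-1, 1] to an unsigned
    `bits`-bit integer code (generic scheme, for illustration)."""
    levels = (1 << bits) - 1
    w = max(-1.0, min(1.0, w))
    return round((w + 1.0) / 2.0 * levels)

def dequantize(q: int, bits: int = 10) -> float:
    """Map an integer code back to a float in [-1, 1]."""
    levels = (1 << bits) - 1
    return q / levels * 2.0 - 1.0

print(quantize(0.37))              # 701 with 10 bits
print(dequantize(quantize(0.37)))  # ~0.3705, i.e. 0.37 up to 10-bit precision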

Authors

Yujun Shi (LV Lab), Li Yuan (LV Lab), Yunpeng Chen (YITU Technology), Jiashi Feng (LV Lab)

Graphical Illustration

[Figure: graphical illustration of the BLIP bit-freezing process]

We consider a simple scenario with a single parameter quantized to 10 bits to illustrate our method. $\theta_{t}$ denotes the parameter after learning tasks $1$ through $t$, and $\theta_{0}$ is the randomly initialized value before training on any task. $IG_{t}$ denotes the information gain on $\theta$ after learning task $t$. The bit representation of $\theta$ after learning each task is shown below, ordered from the most significant bit to the least significant bit. Frozen bits are filled with color; the remaining bits are free bits. After learning each task, the information gain is calculated, and then $\lceil IG_{t} \rceil$ more bits are frozen in the bit representation. By repeating this process, the information on previous tasks is preserved, enabling continual learning for neural networks.
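
To make the freezing step concrete, here is a minimal, hypothetical sketch (not the repo's actual code) of how the leading bits of a 10-bit parameter code might be locked after each task; the helper names, integer codes, and the example information-gain value are all illustrative assumptions.

import math

BITS = 10  # quantization precision from the illustration above

def freeze_more_bits(frozen: int, info_gain: float) -> int:
    # After each task, ceil(IG_t) additional high bits are locked.
    return min(BITS, frozen + math.ceil(info_gain))

def apply_update(old_code: int, new_code: int, frozen: int) -> int:
    # Overwrite only the free (least significant) bits; the frozen
    # most significant bits keep their values from previous tasks.
    free_mask = (1 << (BITS - frozen)) - 1
    return (old_code & ~free_mask) | (new_code & free_mask)

theta = 0b1011001110                # code after task 1
frozen = freeze_more_bits(0, 2.3)   # IG_1 = 2.3 -> freeze 3 bits
theta = apply_update(theta, 0b0100110001, frozen)
print(f"{theta:010b}")              # 1010110001: high bits '101' preserved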

Experiment Results

For numerical results and ablation studies, please check our paper.

Here, we render and compare agents trained with EWC and BLIP in different environments.

Below is a visualization of sequentially learning the first 3 Atari games in our setup (i.e., Kung-Fu Master -- Boxing -- James Bond).

The GIF in the i-th row and j-th column illustrates how well the agent performs on the j-th task after learning the first i tasks.

As can be seen, with EWC the agent's performance on previous tasks degrades drastically after learning new tasks, while the agent trained with BLIP still performs quite well. (This phenomenon is most pronounced for task 1.)

EWC

[GIF grid: per-task gameplay of the EWC agent after learning each successive task]
BLIP

[GIF grid: per-task gameplay of the BLIP agent after learning each successive task]
Citation

If you find our repo/paper helpful, please consider citing our work :)

@article{shi2021continual,
  title={Continual Learning via Bit-Level Information Preserving},
  author={Shi, Yujun and Yuan, Li and Chen, Yunpeng and Feng, Jiashi},
  journal={arXiv preprint arXiv:2105.04444},
  year={2021}
}

Prerequisites

  • pytorch >= 1.3.1
  • gym (required for the RL experiments; not needed if you only run image classification)
  • baselines (required for the RL experiments; not needed if you only run image classification)
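
A quick sanity check for these requirements (a convenience snippet, not part of the repo; it assumes the packaging package is installed):

from packaging import version
import torch

# Verify the PyTorch requirement above; gym and baselines only
# matter for the RL experiments.
assert version.parse(torch.__version__) >= version.parse("1.3.1"), \
    f"pytorch >= 1.3.1 required, found {torch.__version__}"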




Image Classification (besides mini-ImageNet)

Under the folder ImageClassification/src:

To run BLIP with MNIST-5:

python run_blip.py --approach blip --experiment mnist5 --lr 0.01 --sbatch 64 --F-prior 1e-15 --nepochs 200

To run BLIP with PMNIST:

python run_blip.py --approach blip --experiment pmnist --lr 0.01 --sbatch 64 --F-prior 1e-15 --nepochs 200

To run BLIP with alternating CIFAR-10/100:

python run_blip.py --experiment cifar --lr 0.05 --sbatch 32 --F-prior 5e-16 --mul 2

To run BLIP with the sequence of 5 datasets:

python run_blip.py --experiment mixture5 --lr 0.05 --sbatch 32 --F-prior 5e-17 --mul 0.8 --seed 0

All datasets will be automatically downloaded and processed under ImageClassification/data.




Image Classification (mini-ImageNet)

Under the folder miniImageNetClassification/src:

The following two steps are needed to run the experiment:

  • Step 1: Prepare Data

First, download the zipped file and extract it under the folder miniImageNetClassification/src/data.

Then, under miniImageNetClassification/src/data, execute the following to obtain data split:

python generate_train_test_split.py

After executing the script, two files named train.pkl and test.pkl will be generated. These are the data files that will be loaded for training and testing.
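
If you want to verify the generated files before training, the snippet below loads one and prints its type. The internal layout of the pickles is specific to this repo's split script, so inspect it rather than assuming a particular format.

import pickle

# Load the generated split and inspect its structure before use.
with open("train.pkl", "rb") as f:
    train_data = pickle.load(f)

print(type(train_data))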

  • Step 2: Run the training command

Under the folder miniImageNetClassification/src:

To run BLIP with AlexNet, use:

python run_blip.py --F-prior 5e-16 --lr 0.01 --momentum 0.0 --mul 1 --sbatch 32 --seed 0 --ntasks 20 --arch alexnet

To run BLIP with ResNet-18, use:

python run_blip.py --F-prior 1e-16 --lr 0.01 --momentum 0.0 --mul 1.5 --sbatch 32 --seed 0 --ntasks 20 --arch resnet

To run Baseline methods with AlexNet, use:

python run_baselines.py --lr 0.01 --approach <baseline-method> --momentum 0.0 --mul 1 --sbatch 32 --seed 0 --ntasks 20 --arch alexnet

where <baseline-method> should be replaced by the name of a baseline method (e.g., sgd, sgd-frozen, lwf, imm-mode, ewc).
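
To run every baseline in one go, a small convenience loop (not part of the repo) built directly on the command above:

import subprocess

# Hypothetical sweep over the documented baseline methods.
for method in ["sgd", "sgd-frozen", "lwf", "imm-mode", "ewc"]:
    subprocess.run([
        "python", "run_baselines.py",
        "--lr", "0.01", "--approach", method,
        "--momentum", "0.0", "--mul", "1", "--sbatch", "32",
        "--seed", "0", "--ntasks", "20", "--arch", "alexnet",
    ], check=True)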




RL (sequence of 6 Atari games)

Under the folder RL/src:

To run our method BLIP, use:

./run_blip.sh

To run online EWC, use:

./run_ewc.sh

To run plain fine-tuning, use:

./run_ft.sh




Contact

Yujun Shi ([email protected])

Acknowledgements

Our code is inspired by the following repos: HAT, ACL, UCL, and pytorch-a2c-ppo-acktr-gail.
