AdCo
AdCo is a contrastive-learning-based self-supervised learning method, published at CVPR 2021.
Copyright (C) 2020 Qianjiang Hu*, Xiao Wang*, Wei Hu, Guo-Jun Qi
License: MIT for academic use.
Contact: Guo-Jun Qi ([email protected])
Introduction
Contrastive learning relies on constructing a collection of negative examples that are sufficiently hard to discriminate against positive queries when their representations are self-trained. Existing contrastive learning methods either maintain a queue of negative samples over minibatches, of which only a small portion is updated in each iteration, or use only the other examples from the current minibatch as negatives. The former cannot closely track the change of the learned representation over iterations because the queue is never updated as a whole; the latter discards useful information from past minibatches. Instead, we propose to directly learn a set of negative adversaries playing against the self-trained representation. The two players, the representation network and the negative adversaries, are alternately updated to obtain the most challenging negative examples, against which the representation of positive queries is trained to discriminate. We further show that, by maximizing the adversarial contrastive loss, the negative adversaries are updated towards a weighted combination of positive queries, which allows them to closely track the change of representations over time. Experimental results demonstrate that the proposed Adversarial Contrastive (AdCo) model not only achieves superior performance (a top-1 accuracy of 73.2% over 200 epochs and 75.7% over 800 epochs with linear evaluation on ImageNet), but can also be pre-trained more efficiently, with much shorter GPU time and fewer epochs.
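The adversary's update described above can be illustrated with a small, dependency-free sketch. This is not the code in main_adco.py: the function names are hypothetical, vectors are plain Python lists, a single temperature `tau` is used (the training commands in this README expose two temperature flags), and no re-normalization of the negatives is performed. It only shows that the gradient of an InfoNCE-style loss with respect to each negative is a weighted combination of the queries, and that one ascent step along it raises the loss.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def contrastive_loss(queries, keys, negatives, tau):
    """Mean InfoNCE-style loss: each query vs. its positive key and all negatives."""
    total = 0.0
    for q, k in zip(queries, keys):
        pos = math.exp(dot(q, k) / tau)
        neg = sum(math.exp(dot(q, n) / tau) for n in negatives)
        total -= math.log(pos / (pos + neg))
    return total / len(queries)


def negative_gradients(queries, keys, negatives, tau):
    """Gradient of the loss with respect to each negative.

    For negative n_j this equals (1 / (N * tau)) * sum_i p(n_j | q_i) * q_i,
    i.e. a weighted combination of the queries.
    """
    dim = len(negatives[0])
    grads = [[0.0] * dim for _ in negatives]
    for q, k in zip(queries, keys):
        pos = math.exp(dot(q, k) / tau)
        exps = [math.exp(dot(q, n) / tau) for n in negatives]
        denom = pos + sum(exps)
        for j, e in enumerate(exps):
            weight = e / denom / (tau * len(queries))
            for d in range(dim):
                grads[j][d] += weight * q[d]
    return grads


def ascend_negatives(queries, keys, negatives, tau, step):
    """One gradient-ascent step on the negatives (the adversary's move)."""
    grads = negative_gradients(queries, keys, negatives, tau)
    return [[n_d + step * g_d for n_d, g_d in zip(n, g)]
            for n, g in zip(negatives, grads)]


# Toy data (made up): two queries, their positive keys, two shared negatives.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[0.6, 0.8], [-1.0, 0.0]]

before = contrastive_loss(queries, keys, negatives, tau=0.5)
harder = ascend_negatives(queries, keys, negatives, tau=0.5, step=0.1)
after = contrastive_loss(queries, keys, harder, tau=0.5)
print(f"loss before: {before:.4f}, after adversary step: {after:.4f}")
```

In the alternating scheme, the representation network would then take its own step to minimize the same loss against the updated negatives; that half is ordinary gradient descent and is omitted here.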
Installation
CUDA version should be 10.1 or higher.
1. Install git.
2. Clone the repository on your computer:
git clone git@github.com:maple-research-lab/AdCo.git && cd AdCo
3. Build dependencies.
There are two options for installing the dependencies on your computer:
3.1 Install with pip and python (Ver 3.6.9).
3.1.1 Install pip.
3.1.2 Install dependency in command line.
pip install -r requirements.txt --user
If you encounter any errors, you can install each library one by one:
pip install torch==1.7.1
pip install torchvision==0.8.2
pip install numpy==1.19.5
pip install Pillow==5.1.0
pip install tensorboard==1.14.0
pip install tensorboardX==1.7
3.2 Install with anaconda
3.2.1 Install conda.
3.2.2 Install dependency in command line
conda create -n AdCo python=3.6.9
conda activate AdCo
pip install -r requirements.txt
Each time you want to run the code, activate the environment with
conda activate AdCo
conda deactivate  # if you want to exit
4 Prepare the ImageNet dataset
4.1 Download the ImageNet2012 Dataset under "./datasets/imagenet2012".
4.2 Go to path "./datasets/imagenet2012/val"
4.3 Move validation images to labeled subfolders, using the following shell script
Usage
Unsupervised Training
This implementation only supports multi-gpu, DistributedDataParallel training, which is faster and simpler; single-gpu or DataParallel training is not supported.
Single Crop
1 Without symmetrical loss:
python3 main_adco.py --sym=0 --lr=0.03 --memory_lr=3 --moco_t=0.12 --mem_t=0.02 --data=./datasets/imagenet2012 --dist_url=tcp://localhost:10001
2 With symmetrical loss:
python3 main_adco.py --sym=1 --lr=0.03 --memory_lr=3 --moco_t=0.12 --mem_t=0.02 --data=./datasets/imagenet2012 --dist_url=tcp://localhost:10001
3 Setting different numbers of negative samples:
# e.g., training with 8192 negative samples and symmetrical loss
python3 main_adco.py --sym=1 --lr=0.04 --memory_lr=3 --moco_t=0.14 --mem_t=0.03 --cluster 8192 --data=./datasets/imagenet2012 --dist_url=tcp://localhost:10001
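When sweeping over several of the single-crop settings above, it can help to assemble the command line programmatically. The helper below is a hypothetical convenience, not part of this repository; it uses only the flags that appear in the commands above and simply reproduces them as an argument list. The values passed in the example are the ones quoted for 8192 negative samples.

```python
def build_adco_command(data_dir, sym, lr, memory_lr, moco_t, mem_t,
                       cluster=None, dist_url="tcp://localhost:10001"):
    """Return the main_adco.py invocation as a list of arguments.

    Hypothetical helper: it only mirrors the flags shown in this README.
    `cluster` is added only when given, as in the 8192-negatives example.
    """
    args = [
        "python3", "main_adco.py",
        f"--sym={sym}",
        f"--lr={lr}",
        f"--memory_lr={memory_lr}",
        f"--moco_t={moco_t}",
        f"--mem_t={mem_t}",
    ]
    if cluster is not None:
        args += ["--cluster", str(cluster)]
    args += [f"--data={data_dir}", f"--dist_url={dist_url}"]
    return args


# Settings quoted above for 8192 negative samples with symmetrical loss.
command = build_adco_command(
    "./datasets/imagenet2012",
    sym=1, lr=0.04, memory_lr=3, moco_t=0.14, mem_t=0.03, cluster=8192,
)
print(" ".join(command))
```

The returned list can be handed to `subprocess.run(command, check=True)` from the repository root to launch training.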
Multi Crop
python3 main_adco.py --multi_crop=1 --lr=0.03 --memory_lr=3 --moco_t=0.12 --mem_t=0.02 --data=./datasets/imagenet2012 --dist_url=tcp://localhost:10001
Multi crop with symmetrical loss is not yet supported; this feature will be added in the future.
Linear Classification
With a pre-trained model, we can easily evaluate its performance on ImageNet with:
python3 lincls.py --data=./datasets/imagenet2012 --dist-url=tcp://localhost:10001 --pretrained=input.pth.tar
Performance:
pre-train network | pre-train epochs | Crop | Symmetrical Loss | AdCo top-1 acc. | Model Link |
---|---|---|---|---|---|
ResNet-50 | 200 | Single | No | 68.6 | model |
ResNet-50 | 200 | Multi | No | 73.2 | model |
ResNet-50 | 800 | Single | No | 72.8 | None |
ResNet-50 | 800 | Multi | No | 75.7 | None |
ResNet-50 | 200 | Single | Yes | 70.6 | model |
We are sorry that we cannot provide the 800-epoch models: they were trained on company machines, and company regulations do not allow us to release them. For downstream tasks, we found that the multi-crop 200-epoch model achieves similar performance, so we suggest using that model for downstream purposes.
Performance with different negative samples:
pre-train network | pre-train epochs | negative samples | Symmetrical Loss | AdCo top-1 acc. | Model Link |
---|---|---|---|---|---|
ResNet-50 | 200 | 65536 | No | 68.6 | model |
ResNet-50 | 200 | 65536 | Yes | 70.6 | model |
ResNet-50 | 200 | 16384 | No | 68.6 | model |
ResNet-50 | 200 | 16384 | Yes | 70.2 | model |
ResNet-50 | 200 | 8192 | No | 68.4 | model |
ResNet-50 | 200 | 8192 | Yes | 70.2 | model |
These results were obtained on a single machine with 8 V100 GPUs.
Transfer to VOC07 Classification
1 Download the Dataset under "./datasets/voc".
2 Linear Evaluation:
cd VOC_CLF
python3 main.py --data=../datasets/voc --pretrained=../input.pth.tar
Here the VOC directory should be the directory that contains the "vockit" directory.
Transfer to Places205 Classification
1 Download the Dataset under "./datasets/places205".
2 Linear Evaluation:
python3 lincls.py --dataset=Place205 --sgdr=1 --data=./datasets/places205 --lr=5 --dist-url=tcp://localhost:10001 --pretrained=input.pth.tar
Transfer to Object Detection
1. Install detectron2.
2. Convert a pre-trained AdCo model to detectron2's format:
# in detection folder
python3 convert-pretrain-to-detectron2.py input.pth.tar output.pkl
3. Download the VOC Dataset and COCO Dataset under the "./detection/datasets" directory, following the directory structure required by detectron2.
4. Run training:
4.1 Pascal detection
The number of GPUs determines the overall batch size, so all experiments should be run with 8 GPUs. With fewer GPUs, please tune SOLVER.BASE_LR to match your setup.
cd detection
python train_net.py --config-file configs/pascal_voc_R_50_C4_24k_adco.yaml --num-gpus 8 MODEL.WEIGHTS ./output.pkl
4.2 COCO detection
The number of GPUs determines the overall batch size, so all experiments should be run with 8 GPUs. With fewer GPUs, please tune SOLVER.BASE_LR to match your setup.
cd detection
python train_net.py --config-file configs/coco_R_50_C4_2x_adco.yaml --num-gpus 8 MODEL.WEIGHTS ./output.pkl
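For both detection settings, the note about SOLVER.BASE_LR leaves the choice of value open. One common starting point is the linear scaling rule: scale the learning rate in proportion to the overall batch size, which here means in proportion to the number of GPUs if the per-GPU batch stays fixed. That rule is an assumption on our side, not something this repository prescribes, and the helper name and the learning rate used in the example are placeholders; read the actual base learning rate from the config file you train with.

```python
def scale_base_lr(reference_lr, reference_gpus, available_gpus):
    """Linearly rescale a learning rate for a different GPU count.

    Assumes the per-GPU batch size is unchanged, so the overall batch
    size (and hence the learning rate) scales with the number of GPUs.
    """
    if reference_lr <= 0:
        raise ValueError("reference_lr must be positive")
    if reference_gpus <= 0 or available_gpus <= 0:
        raise ValueError("GPU counts must be positive")
    return reference_lr * available_gpus / reference_gpus


# Placeholder value: 0.02 is NOT taken from the AdCo configs.
placeholder_lr = 0.02
for gpus in (8, 4, 2):
    print(gpus, "GPUs ->", scale_base_lr(placeholder_lr, 8, gpus))
```

detectron2's training scripts accept trailing `KEY VALUE` overrides, as the `MODEL.WEIGHTS ./output.pkl` pair in the commands above shows, so the rescaled value can presumably be passed the same way as `SOLVER.BASE_LR <value>`. Treat the result as a starting point and tune from there.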
Citation:
@inproceedings{hu2021adco,
title={Adco: Adversarial contrast for efficient learning of unsupervised representations from self-trained negative adversaries},
author={Hu, Qianjiang and Wang, Xiao and Hu, Wei and Qi, Guo-Jun},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={1074--1083},
year={2021}
}