
DefangChen / SemCKD

Licence: other
This is the official implementation for the AAAI-2021 paper (Cross-Layer Distillation with Semantic Calibration).

Programming Languages

Jupyter Notebook
11667 projects
Python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to SemCKD

LabelRelaxation-CVPR21
Official PyTorch Implementation of Embedding Transfer with Label Relaxation for Improved Metric Learning, CVPR 2021
Stars: ✭ 37 (-11.9%)
Mutual labels:  knowledge-distillation
EC-GAN
EC-GAN: Low-Sample Classification using Semi-Supervised Algorithms and GANs (AAAI 2021)
Stars: ✭ 29 (-30.95%)
Mutual labels:  aaai2021
mmrazor
OpenMMLab Model Compression Toolbox and Benchmark.
Stars: ✭ 644 (+1433.33%)
Mutual labels:  knowledge-distillation
MLIC-KD-WSD
Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection (ACM MM 2018)
Stars: ✭ 58 (+38.1%)
Mutual labels:  knowledge-distillation
MutualGuide
Localize to Classify and Classify to Localize: Mutual Guidance in Object Detection
Stars: ✭ 97 (+130.95%)
Mutual labels:  knowledge-distillation
MoTIS
Mobile (iOS) text-to-image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP). Accepted at NAACL 2022.
Stars: ✭ 60 (+42.86%)
Mutual labels:  knowledge-distillation
proxy-synthesis
Official PyTorch implementation of "Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning" (AAAI 2021)
Stars: ✭ 30 (-28.57%)
Mutual labels:  aaai2021
SelfSupervisedLearning-DSM
Code for the AAAI 2021 paper "Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion"
Stars: ✭ 26 (-38.1%)
Mutual labels:  aaai2021
SGRAF
The code of “Similarity Reasoning and Filtration for Image-Text Matching” [AAAI2021]
Stars: ✭ 136 (+223.81%)
Mutual labels:  aaai2021
lffont
Official PyTorch implementation of LF-Font (Few-shot Font Generation with Localized Style Representations and Factorization) AAAI 2021
Stars: ✭ 110 (+161.9%)
Mutual labels:  aaai2021
LD
Localization Distillation for Dense Object Detection (CVPR 2022)
Stars: ✭ 271 (+545.24%)
Mutual labels:  knowledge-distillation
neural-compressor
Intel® Neural Compressor (formerly known as Intel® Low Precision Optimization Tool), aiming to provide unified APIs for network compression technologies, such as low-precision quantization, sparsity, pruning, and knowledge distillation, across different deep learning frameworks to pursue optimal inference performance.
Stars: ✭ 666 (+1485.71%)
Mutual labels:  knowledge-distillation
AttaNet
AttaNet for real-time semantic segmentation.
Stars: ✭ 37 (-11.9%)
Mutual labels:  aaai2021
Distill-BERT-Textgen
Research code for ACL 2020 paper: "Distilling Knowledge Learned in BERT for Text Generation".
Stars: ✭ 121 (+188.1%)
Mutual labels:  knowledge-distillation
bert-AAD
Adversarial Adaptation with Distillation for BERT Unsupervised Domain Adaptation
Stars: ✭ 27 (-35.71%)
Mutual labels:  knowledge-distillation
SnapMix
SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data (AAAI 2021)
Stars: ✭ 127 (+202.38%)
Mutual labels:  aaai2021
AB distillation
Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons (AAAI 2019)
Stars: ✭ 105 (+150%)
Mutual labels:  knowledge-distillation
cool-papers-in-pytorch
Reimplementing cool papers in PyTorch...
Stars: ✭ 21 (-50%)
Mutual labels:  knowledge-distillation
FKD
A Fast Knowledge Distillation Framework for Visual Recognition
Stars: ✭ 49 (+16.67%)
Mutual labels:  knowledge-distillation
Zero-shot Knowledge Distillation Pytorch
ZSKD with PyTorch
Stars: ✭ 26 (-38.1%)
Mutual labels:  knowledge-distillation

SemCKD

Cross-Layer Distillation with Semantic Calibration (AAAI-2021) https://arxiv.org/abs/2012.03236v1

The journal version was published in IEEE TKDE: https://ieeexplore.ieee.org/document/9767633

A more compact and cleaner implementation is provided at https://github.com/DefangChen/SimKD

Overview

Existing feature distillation works can be divided into two categories according to the position at which knowledge distillation is performed. As shown in the figure below, one is feature-map distillation and the other is feature-embedding distillation.

[Figure: feature-map distillation vs. feature-embedding distillation]

SemCKD belongs to feature-map distillation and is compatible with state-of-the-art feature-embedding distillation methods (e.g., CRD), which can further boost the performance of student networks.

This repo contains the implementation of SemCKD together with the compared approaches, such as classic KD, feature-map distillation variants (FitNet, AT, SP, VID, HKD) and feature-embedding distillation variants (PKT, RKD, IRG, CC, CRD).
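
To make the cross-layer idea concrete, below is a purely illustrative sketch of softly assigning each student layer to all candidate teacher layers instead of using a fixed one-to-one pairing. It compares shape-agnostic batch similarity matrices and is not the actual SemCKD loss (which uses learned projections and attention, as described in the paper and implemented in this repo); the function and variable names are our own.

# Toy sketch of soft cross-layer assignment (illustration only, not SemCKD itself)
import torch
import torch.nn.functional as F

def batch_similarity(feat):
    # feat: [B, C, H, W] -> B x B cosine-similarity matrix, independent of C, H, W
    f = F.normalize(feat.flatten(1), dim=1)
    return f @ f.t()

def soft_cross_layer_loss(student_feats, teacher_feats, temperature=1.0):
    s_sims = [batch_similarity(f) for f in student_feats]
    t_sims = [batch_similarity(f) for f in teacher_feats]
    total = 0.0
    for s in s_sims:
        # distance between this student layer and every candidate teacher layer
        dists = torch.stack([F.mse_loss(s, t) for t in t_sims])
        # closer teacher layers receive larger attention weights
        weights = F.softmax(-dists / temperature, dim=0)
        total = total + (weights * dists).sum()
    return total / len(s_sims)

# usage with random feature maps of different shapes
student = [torch.randn(8, 16, 32, 32), torch.randn(8, 32, 16, 16)]
teacher = [torch.randn(8, 64, 32, 32), torch.randn(8, 128, 16, 16), torch.randn(8, 256, 8, 8)]
loss = soft_cross_layer_loss(student, teacher)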

CIFAR-100 Results

[Table: CIFAR-100 results]

Here ARI stands for Average Relative Improvement. This metric measures how much SemCKD improves over each existing approach, relative to how much that approach improves over the baseline student model.
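
A minimal sketch of how such an ARI number could be computed from a results table, based only on the description above (the exact formula is given in the paper); the function name and arguments below are our own:

def average_relative_improvement(acc_student, acc_semckd, acc_methods):
    """acc_student / acc_semckd: top-1 accuracy (%) of the vanilla student and of SemCKD;
    acc_methods: dict mapping each compared approach to its top-1 accuracy (%)."""
    # gain of SemCKD over approach X, relative to the gain of X over the baseline student
    ratios = [(acc_semckd - acc) / (acc - acc_student) for acc in acc_methods.values()]
    return 100.0 * sum(ratios) / len(ratios)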

To get the pretrained teacher models for CIFAR-100:

sh scripts/fetch_pretrained_teachers.sh

For ImageNet, pretrained models from torchvision are used, e.g. ResNet34. Save the model to ./save/models/$MODEL_vanilla/ and use scripts/model_transform.py to make it readable by our code.
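
For example, the torchvision checkpoint can be fetched and saved as follows before running the transform script (a minimal sketch; the file name resnet34.pth is our assumption, so adjust it to whatever scripts/model_transform.py expects):

import os
import torch
from torchvision import models

save_dir = './save/models/ResNet34_vanilla'
os.makedirs(save_dir, exist_ok=True)

# download the ImageNet-pretrained ResNet-34 weights from torchvision
model = models.resnet34(pretrained=True)
torch.save(model.state_dict(), os.path.join(save_dir, 'resnet34.pth'))
# then run scripts/model_transform.py to convert this checkpoint into the
# format expected by train_student.py (see the ImageNet command below)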

Running SemCKD:

# CIFAR-100
python train_student.py --path-t ./save/models/resnet32x4_vanilla/ckpt_epoch_240.pth --distill semckd --model_s resnet8x4 -r 1 -a 1 -b 400 --trial 0
# ImageNet
python train_student.py --path-t ./save/models/ResNet34_vanilla/resnet34_transformed.pth \
--batch_size 256 --epochs 90 --dataset imagenet --gpu_id 0,1,2,3,4,5,6,7 --dist-url tcp://127.0.0.1:23333 \
--print-freq 100 --num_workers 32 --distill semckd --model_s ResNet18 -r 1 -a 1 -b 50 --trial 0 \
--multiprocessing-distributed --learning_rate 0.1 --lr_decay_epochs 30,60 --weight_decay 1e-4 --dali gpu

Note:

  • The implementations of the compared methods are based on the author-provided code and an open-source benchmark: https://github.com/HobbitLong/RepDistiller. The main difference is that we set the weights of both the classification loss and the logit-level distillation loss to 1 throughout the experiments (-r 1 -a 1), which is a more common practice for knowledge distillation.
  • The wide ResNet models in "RepDistiller/models/wrn.py" are almost the same as those in resnet.py. For example, wrn_40_2 in wrn.py is nearly equivalent to resnet38x2 in resnet.py. The only difference is that resnet38x2 has three additional BN layers, which add 2*(16+32+64)*k parameters (k=2 in this comparison); see the quick check after this list.
  • The three FC layers of VGG-ImageNet are replaced with a single one, so the total layer count on CIFAR-100 is reduced by two. For example, the actual number of layers in VGG-8 is 6.
  • Computing Infrastructure:
    • For CIFAR-100, we run experiments on a single machine with one NVIDIA GeForce TITAN X (Pascal) GPU and 32 Intel(R) Xeon(R) E5-2620 v4 CPUs @ 2.10GHz. The CUDA version is 10.2 and the PyTorch version is 1.0.
    • For ImageNet, we run experiments on a single machine with eight NVIDIA GeForce RTX 2080 Ti GPUs and 64 Intel(R) Xeon(R) Silver 4216 CPUs @ 2.10GHz. The CUDA version is 10.2 and the PyTorch version is 1.6.
  • The code in this repository was merged from different sources and has not been tested thoroughly. If you have any questions, please do not hesitate to contact us.
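
Quick check of the extra-parameter count mentioned in the wrn.py/resnet.py note above (each BN layer contributes a per-channel scale and shift, i.e., 2*C learnable parameters, and the three extra BN layers sit on stages with 16k, 32k and 64k channels):

k = 2  # width multiplier in this comparison
extra_bn_params = 2 * (16 + 32 + 64) * k
print(extra_bn_params)  # 448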

Citation

If you find this repository useful, please consider citing the following papers:

@inproceedings{chen2021cross,
  author    = {Defang Chen and Jian{-}Ping Mei and Yuan Zhang and Can Wang and Zhe Wang and Yan Feng and Chun Chen},
  title     = {Cross-Layer Distillation with Semantic Calibration},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  pages     = {7028--7036},
  year      = {2021},
}

@article{chen2022cross,  
  author    = {Wang, Can and Chen, Defang and Mei, Jian-Ping and Zhang, Yuan and Feng, Yan and Chen, Chun},  
  title     = {SemCKD: Semantic Calibration for Cross-Layer Knowledge Distillation},   
  journal   = {IEEE Transactions on Knowledge and Data Engineering},   
  year      = {2022},  
  doi       = {10.1109/TKDE.2022.3171571}
}