
DushyantaDhyani / kdtf

License: MIT
Knowledge Distillation using TensorFlow

Programming Language: Python

Projects that are alternatives to or similar to kdtf

SAN
[ECCV 2020] Scale Adaptive Network: Learning to Learn Parameterized Classification Networks for Scalable Input Images
Stars: ✭ 41 (-70.5%)
Mutual labels:  knowledge-distillation
FKD
A Fast Knowledge Distillation Framework for Visual Recognition
Stars: ✭ 49 (-64.75%)
Mutual labels:  knowledge-distillation
ACCV TinyGAN
BigGAN; Knowledge Distillation; Black-Box; Fast Training; 16x compression
Stars: ✭ 62 (-55.4%)
Mutual labels:  knowledge-distillation
MutualGuide
Localize to Classify and Classify to Localize: Mutual Guidance in Object Detection
Stars: ✭ 97 (-30.22%)
Mutual labels:  knowledge-distillation
mmrazor
OpenMMLab Model Compression Toolbox and Benchmark.
Stars: ✭ 644 (+363.31%)
Mutual labels:  knowledge-distillation
SemCKD
This is the official implementation for the AAAI-2021 paper (Cross-Layer Distillation with Semantic Calibration).
Stars: ✭ 42 (-69.78%)
Mutual labels:  knowledge-distillation
MLIC-KD-WSD
Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection (ACM MM 2018)
Stars: ✭ 58 (-58.27%)
Mutual labels:  knowledge-distillation
Pretrained Language Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Stars: ✭ 2,033 (+1362.59%)
Mutual labels:  knowledge-distillation
bert-AAD
Adversarial Adaptation with Distillation for BERT Unsupervised Domain Adaptation
Stars: ✭ 27 (-80.58%)
Mutual labels:  knowledge-distillation
FGD
Focal and Global Knowledge Distillation for Detectors (CVPR 2022)
Stars: ✭ 124 (-10.79%)
Mutual labels:  knowledge-distillation
AB distillation
Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons (AAAI 2019)
Stars: ✭ 105 (-24.46%)
Mutual labels:  knowledge-distillation
Zero-shot Knowledge Distillation Pytorch
ZSKD with PyTorch
Stars: ✭ 26 (-81.29%)
Mutual labels:  knowledge-distillation
ProSelfLC-2021
noisy labels; missing labels; semi-supervised learning; entropy; uncertainty; robustness and generalisation.
Stars: ✭ 45 (-67.63%)
Mutual labels:  knowledge-distillation
neural-compressor
Intel® Neural Compressor (formerly known as Intel® Low Precision Optimization Tool), targeting to provide unified APIs for network compression technologies, such as low precision quantization, sparsity, pruning, knowledge distillation, across different deep learning frameworks to pursue optimal inference performance.
Stars: ✭ 666 (+379.14%)
Mutual labels:  knowledge-distillation
model optimizer
Model optimizer used in Adlik.
Stars: ✭ 22 (-84.17%)
Mutual labels:  knowledge-distillation
LD
Localization Distillation for Dense Object Detection (CVPR 2022)
Stars: ✭ 271 (+94.96%)
Mutual labels:  knowledge-distillation
cool-papers-in-pytorch
Reimplementing cool papers in PyTorch...
Stars: ✭ 21 (-84.89%)
Mutual labels:  knowledge-distillation
Awesome Knowledge Distillation
Awesome Knowledge Distillation
Stars: ✭ 2,634 (+1794.96%)
Mutual labels:  knowledge-distillation
Knowledge distillation via TF2.0
The codes for recent knowledge distillation algorithms and benchmark results via TF2.0 low-level API
Stars: ✭ 87 (-37.41%)
Mutual labels:  knowledge-distillation
Efficient-Computing
Efficient-Computing
Stars: ✭ 474 (+241.01%)
Mutual labels:  knowledge-distillation

Knowledge Distillation - TensorFlow

This is an implementation of the basic idea behind Hinton's knowledge distillation paper. We do not reproduce the exact results, but rather show that the idea works.

While a few other implementations are available, their code flow is not very intuitive. Here, the soft targets are generated from the teacher in an online manner while the student network is being trained.

The big and small models (with some modifications; we currently use a simple softmax regression, as in TF's tutorial) have been taken from here.

While this may or may not be the best way to implement the distillation architecture, it leads to a clear improvement in the (small) student model. If you find any bugs or have suggestions, feel free to create an issue or send in a pull request.
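For concreteness, below is a minimal sketch of this kind of distillation objective in TF 1.x. The placeholders, variable names, and the alpha weighting are illustrative assumptions rather than the exact code in main.py; only the idea of matching the student to the teacher's temperature-softened softmax follows the paper.

 import tensorflow as tf

 # Minimal sketch of the distillation objective (TF 1.x). All names below are
 # illustrative and do not necessarily match the variables used in main.py.
 T = 5.0          # temperature (corresponds to the --temperature flag)
 alpha = 0.5      # illustrative weight between the soft and hard losses
 num_classes = 10

 labels = tf.placeholder(tf.float32, [None, num_classes])          # one-hot ground truth
 teacher_logits = tf.placeholder(tf.float32, [None, num_classes])  # produced online by the teacher
 student_logits = tf.placeholder(tf.float32, [None, num_classes])  # produced by the student

 # Soft targets: the teacher's softmax softened by the temperature T.
 soft_targets = tf.nn.softmax(teacher_logits / T)

 # Cross-entropy between the softened student predictions and the soft targets.
 # The T^2 factor keeps gradient magnitudes comparable across temperatures.
 soft_loss = T ** 2 * tf.reduce_mean(
     tf.nn.softmax_cross_entropy_with_logits(labels=soft_targets,
                                             logits=student_logits / T))

 # Standard hard-label cross-entropy at temperature 1.
 hard_loss = tf.reduce_mean(
     tf.nn.softmax_cross_entropy_with_logits(labels=labels,
                                             logits=student_logits))

 # The student is trained on a weighted combination of the two terms.
 total_loss = alpha * soft_loss + (1.0 - alpha) * hard_loss

Training the student then amounts to minimizing total_loss, feeding the teacher's logits for each batch as it is trained.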

Requirements

TensorFlow 1.3 or above

Running the code

Train the Teacher Model

 python main.py --model_type teacher --checkpoint_dir teachercpt --num_steps 5000 --temperature 5

Train the Student Model (in a standalone manner for comparison)

 python main.py --model_type student --checkpoint_dir studentcpt --num_steps 5000

Train the Student Model (Using Soft Targets from the teacher model)

 python main.py --model_type student --checkpoint_dir studentcpt --load_teacher_from_checkpoint true --load_teacher_checkpoint_dir teachercpt --num_steps 5000 --temperature 5

Results (For different temperature values)

Model          Accuracy (T=2)   Accuracy (T=5)
Teacher Only   97.9             98.12
Distillation   89.14            90.77
Student Only   88.84            88.84

The small model, when trained without the soft labels, always uses temperature=1.
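For intuition, the following NumPy snippet (illustrative only, with made-up logits) shows how raising the temperature softens the distribution that the student is trained to match.

 import numpy as np

 def softened_softmax(logits, T=1.0):
     # Softmax with temperature T: higher T spreads probability mass
     # over the non-argmax classes ("dark knowledge").
     z = np.asarray(logits, dtype=np.float64) / T
     e = np.exp(z - z.max())
     return e / e.sum()

 logits = [6.0, 2.0, 1.0]                 # made-up logits for a 3-class example
 print(softened_softmax(logits, T=1))     # ~[0.976, 0.018, 0.007]  (nearly one-hot)
 print(softened_softmax(logits, T=2))     # ~[0.821, 0.111, 0.067]
 print(softened_softmax(logits, T=5))     # ~[0.550, 0.247, 0.202]  (much softer targets)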

References

Distilling the Knowledge in a Neural Network. Hinton, Vinyals, and Dean (2015), arXiv:1503.02531.
