lhyfst / Knowledge Distillation Papers
knowledge distillation papers
Stars: ✭ 422
Knowledge Distillation Papers
Author: Li Heyuan (李贺元)
Email: [email protected]
Inspired by dkozlov/awesome-knowledge-distillation
All rights reserved
If you have any suggestions or would like to recommend new papers, please feel free to let me know.
I have read every paper listed here and am happy to discuss any of them with you.
I will keep updating this project frequently.
Early Papers
- Model Compression, Cristian Bucilă, Rich Caruana, Alexandru Niculescu-Mizil, 2006
- Distilling the Knowledge in a Neural Network, Geoffrey Hinton, Oriol Vinyals, Jeff Dean, 2015
- Knowledge Acquisition from Examples Via Multiple Models, Pedro Domingos, 1997
- Combining labeled and unlabeled data with co-training, A. Blum, T. Mitchell, 1998
- Using A Neural Network to Approximate An Ensemble of Classifiers, Xinchuan Zeng and Tony R. Martinez, 2000
- Do Deep Nets Really Need to be Deep?, Lei Jimmy Ba, Rich Caruana, 2014
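The core idea in Hinton et al.'s 2015 paper above is to train the student on the teacher's temperature-softened output distribution rather than on hard labels alone. A minimal sketch of that soft-target loss in plain Python (function and variable names are illustrative, not taken from any paper's released code):

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature: a higher T yields a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Cross-entropy between the teacher's and student's softened outputs.

    Hinton et al. (2015) combine this with the usual hard-label loss and
    scale by T^2 so gradient magnitudes stay comparable as T varies.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    ce = -sum(p * math.log(q) for p, q in zip(p_teacher, p_student))
    return temperature ** 2 * ce

# The loss is minimized when the student reproduces the teacher's logits.
teacher = [4.0, 1.0, 0.2]
assert distillation_loss(teacher, teacher) < distillation_loss([0.2, 1.0, 4.0], teacher)
```

In practice this term is weighted against the standard cross-entropy on ground-truth labels; the temperature exposes the "dark knowledge" in the teacher's small logit values.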
Recommended Papers
- FitNets: Hints for Thin Deep Nets, Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, Yoshua Bengio, 2015
- Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, Sergey Zagoruyko, Nikos Komodakis, 2016
- A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning, Junho Yim, Donggyu Joo, Jihoon Bae, Junmo Kim, 2017
- Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks, Zheng Xu, Yen-Chang Hsu, Jiawei Huang
- Born Again Neural Networks, Tommaso Furlanello, Zachary C. Lipton, Michael Tschannen, Laurent Itti, Anima Anandkumar, 2018
- Net2Net: Accelerating Learning Via Knowledge Transfer, Tianqi Chen, Ian Goodfellow, Jonathon Shlens, 2016
- Unifying distillation and privileged information, David Lopez-Paz, Léon Bottou, Bernhard Schölkopf, Vladimir Vapnik, 2015
- Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks, Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, Ananthram Swami, 2016
- Large scale distributed neural network training through online distillation, Rohan Anil, Gabriel Pereyra, Alexandre Passos, Robert Ormandi, George E. Dahl, Geoffrey E. Hinton, 2018
- Deep Mutual Learning, Ying Zhang, Tao Xiang, Timothy M. Hospedales, Huchuan Lu, 2017
- Learning Loss for Knowledge Distillation with Conditional Adversarial Networks, Zheng Xu, Yen-Chang Hsu, Jiawei Huang, 2017
- Quantization Mimic: Towards Very Tiny CNN for Object Detection, Yi Wei, Xinyu Pan, Hongwei Qin, Wanli Ouyang, Junjie Yan, 2018
- Knowledge Projection for Deep Neural Networks, Zhi Zhang, Guanghan Ning, Zhihai He, 2017
- Moonshine: Distilling with Cheap Convolutions, Elliot J. Crowley, Gavin Gray, Amos Storkey, 2017
- Training a Binary Weight Object Detector by Knowledge Transfer for Autonomous Driving, Jiaolong Xu, Peng Wang, Heng Yang and Antonio M. López, 2018
- Rocket Launching: A Universal and Efficient Framework for Training Well-performing Light Net, Zihao Liu, Qi Liu, Tao Liu, Yanzhi Wang, Wujie Wen, 2017
- Improved Knowledge Distillation via Teacher Assistant: Bridging the Gap Between Student and Teacher, Seyed-Iman Mirzadeh, Mehrdad Farajtabar, Ang Li, Hassan Ghasemzadeh, 2019
- ResKD: Residual-Guided Knowledge Distillation, Xuewei Li, Songyuan Li, Bourahla Omar, and Xi Li, 2020
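Several papers in this section transfer intermediate representations rather than output logits; attention transfer (Zagoruyko & Komodakis, above) matches normalized spatial attention maps derived from feature activations. A simplified sketch using plain Python nested lists in place of tensors (names are illustrative only):

```python
import math

def attention_map(feature):
    """Collapse C x H x W activations (nested lists) into a flattened
    spatial map by summing squared values over the channel dimension."""
    C, H, W = len(feature), len(feature[0]), len(feature[0][0])
    return [sum(feature[c][i][j] ** 2 for c in range(C))
            for i in range(H) for j in range(W)]

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def attention_transfer_loss(student_feat, teacher_feat):
    """L2 distance between the two L2-normalized attention maps."""
    qs = l2_normalize(attention_map(student_feat))
    qt = l2_normalize(attention_map(teacher_feat))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(qs, qt)))
```

Because the maps are normalized, the loss depends only on where activation energy is concentrated, not on its scale, so student and teacher layers need not have matching channel counts.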
Recent Papers (since 2018)
- Learning Global Additive Explanations for Neural Nets Using Model Distillation, Sarah Tan, Rich Caruana, Giles Hooker, Paul Koch, Albert Gordo, 2018
- YASENN: Explaining Neural Networks via Partitioning Activation Sequences, Yaroslav Zharov, Denis Korzhenkov, Pavel Shvechikov, Alexander Tuzhilin, 2018
- Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results, Antti Tarvainen, Harri Valpola, 2018
- Local Affine Approximators for Improving Knowledge Transfer, Suraj Srinivas & François Fleuret, 2018
- Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit?, Shilin Zhu, Xin Dong, Hao Su, 2018
- Learning Efficient Detector with Semi-supervised Adaptive Distillation, Shitao Tang, Litong Feng, Zhanghui Kuang, Wenqi Shao, Quanquan Li, Wei Zhang, Yimin Chen, 2019
- Dataset Distillation, Tongzhou Wang, Jun-Yan Zhu, Antonio Torralba, Alexei A. Efros, 2019
- Relational Knowledge Distillation, Wonpyo Park, Dongju Kim, Yan Lu, Minsu Cho, 2019
- Knowledge Adaptation for Efficient Semantic Segmentation, Tong He, Chunhua Shen, Zhi Tian, Dong Gong, Changming Sun, Youliang Yan, 2019
- A Comprehensive Overhaul of Feature Distillation, Byeongho Heo, Jeesoo Kim, Sangdoo Yun, Hyojin Park, Nojun Kwak, Jin Young Choi, 2019
- Towards Understanding Knowledge Distillation, Mary Phuong, Christoph Lampert, ICML, 2019
Relevant Papers
- Learning Efficient Object Detection Models with Knowledge Distillation, Guobin Chen, Wongun Choi, Xiang Yu, Tony Han, Manmohan Chandraker, NIPS 2017
- Data Distillation: Towards Omni-Supervised Learning, Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, Kaiming He, CVPR 2018
- Cross Modal Distillation for Supervision Transfer, Saurabh Gupta, Judy Hoffman, Jitendra Malik, CVPR 2016
- Like What You Like: Knowledge Distill via Neuron Selectivity Transfer, Zehao Huang, Naiyan Wang, 2017
- Deep Model Compression: Distilling Knowledge from Noisy Teachers, Bharat Bhusan Sau, Vineeth N. Balasubramanian, 2016
- Knowledge Distillation for Small-footprint Highway Networks, Liang Lu, Michelle Guo, Steve Renals, 2016
- Sequence-Level Knowledge Distillation, Yoon Kim, Alexander M. Rush, 2016
- Recurrent Neural Network Training with Dark Knowledge Transfer, Zhiyuan Tang, Dong Wang, Zhiyong Zhang, 2016
- Data-Free Knowledge Distillation For Deep Neural Networks, Raphael Gontijo Lopes, Stefano Fenu, 2017
- DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer, Yuntao Chen, Naiyan Wang, Zhaoxiang Zhang, 2017
- Face Model Compression by Distilling Knowledge from Neurons, Ping Luo, Zhenyao Zhu, Ziwei Liu, Xiaogang Wang, and Xiaoou Tang, 2016
- Adapting Models to Signal Degradation using Distillation, Jong-Chyi Su, Subhransu Maji, BMVC 2017