
qiangsiwei / Bert_distill

BERT distillation (BERT-based distillation experiments)

Programming Languages

python

Projects that are alternatives to or similar to Bert_distill

Sparse Evolutionary Artificial Neural Networks
Always sparse. Never dense. But never say never. A repository for the Adaptive Sparse Connectivity concept and its algorithmic instantiation, i.e. Sparse Evolutionary Training, to boost Deep Learning scalability on various aspects (e.g. memory and computational time efficiency, representation and generalization power).
Stars: ✭ 182 (-10.34%)
Mutual labels:  classification
Mem absa
Aspect Based Sentiment Analysis using End-to-End Memory Networks
Stars: ✭ 189 (-6.9%)
Mutual labels:  classification
Cnn 3d Images Tensorflow
3D image classification using CNN (Convolutional Neural Network)
Stars: ✭ 199 (-1.97%)
Mutual labels:  classification
3d Pointcloud
Papers and Datasets about Point Cloud.
Stars: ✭ 179 (-11.82%)
Mutual labels:  classification
Classifai
Enhance your WordPress content with Artificial Intelligence and Machine Learning services.
Stars: ✭ 188 (-7.39%)
Mutual labels:  classification
Sentinel2 Cloud Detector
Sentinel Hub Cloud Detector for Sentinel-2 images in Python
Stars: ✭ 194 (-4.43%)
Mutual labels:  classification
Deep Learning For Image Processing
deep learning for image processing including classification and object-detection etc.
Stars: ✭ 5,808 (+2761.08%)
Mutual labels:  classification
Lightautoml
LAMA - automatic model creation framework
Stars: ✭ 196 (-3.45%)
Mutual labels:  classification
Paddle2onnx
PaddlePaddle to ONNX model converter
Stars: ✭ 185 (-8.87%)
Mutual labels:  classification
Asl
Official PyTorch implementation of the paper "Asymmetric Loss For Multi-Label Classification" (2020)
Stars: ✭ 195 (-3.94%)
Mutual labels:  classification
Geostats
A tiny, standalone JavaScript library for classification and basic statistics
Stars: ✭ 183 (-9.85%)
Mutual labels:  classification
Hms Ml Demo
HMS ML Demo provides an example of integrating Huawei ML Kit service into applications. This example demonstrates how to integrate services provided by ML Kit, such as face detection, text recognition, image segmentation, ASR, and TTS.
Stars: ✭ 187 (-7.88%)
Mutual labels:  classification
Fake news detection
Fake News Detection in Python
Stars: ✭ 194 (-4.43%)
Mutual labels:  classification
Recurrent Convolutional Neural Network Text Classifier
My (slightly modified) Keras implementation of the Recurrent Convolutional Neural Network (RCNN) described here: http://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9745.
Stars: ✭ 182 (-10.34%)
Mutual labels:  classification
Nlp classification
Implementing nlp papers relevant to classification with PyTorch, gluonnlp
Stars: ✭ 202 (-0.49%)
Mutual labels:  classification
Imgclsmob
Sandbox for training deep learning networks
Stars: ✭ 2,405 (+1084.73%)
Mutual labels:  classification
Uci Ml Api
Simple API for UCI Machine Learning Dataset Repository (search, download, analyze)
Stars: ✭ 190 (-6.4%)
Mutual labels:  classification
Bonnetal
Bonnet and then some! Deep Learning Framework for various Image Recognition Tasks. Photogrammetry and Robotics Lab, University of Bonn
Stars: ✭ 202 (-0.49%)
Mutual labels:  classification
Laravel Categories
Rinvex Categorizable is a polymorphic Laravel package for category management. You can categorize any Eloquent model with ease and utilize the power of Nested Sets, along with Sluggable and Translatable models, out of the box.
Stars: ✭ 199 (-1.97%)
Mutual labels:  classification
Dynaml
Scala Library/REPL for Machine Learning Research
Stars: ✭ 195 (-3.94%)
Mutual labels:  classification

BERT-based distillation experiments

Reference paper: "Distilling Task-Specific Knowledge from BERT into Simple Neural Networks"

Experiments were run in both Keras and PyTorch, using TextCNN and BiLSTM (GRU) as student models.

The experimental data are split 1 (labeled training) : 8 (unlabeled training) : 1 (test).
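
For concreteness, a minimal sketch of producing such a 1:8:1 split; the function name and seed are illustrative, and the repo may split its data differently:

    # Illustrative 1:8:1 split into labeled-train / unlabeled-train / test sets
    # (an assumption for exposition; not necessarily how the repo splits data).
    import random

    def split_1_8_1(examples, seed=42):
        random.seed(seed)
        examples = examples[:]          # avoid mutating the caller's list
        random.shuffle(examples)
        n = len(examples)
        a = n // 10                     # first 10%: labeled training data
        b = n - n // 10                 # next 80%: unlabeled; last 10%: test
        return examples[:a], examples[a:b], examples[b:]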

Preliminary results on the binary sentiment classification clothing dataset:

  • Small models (TextCNN & BiLSTM): accuracy 0.80–0.81

  • BERT model: accuracy 0.90–0.91

  • Distilled model: accuracy 0.87–0.88

These results are broadly consistent with the paper's findings and match expectations.

More effective distillation schemes will be explored in future work.

Usage

First, fine-tune BERT:

python ptbert.py
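
For orientation, a minimal sketch of what task-specific BERT fine-tuning looks like with the Hugging Face transformers library; the checkpoint name, toy data, and hyperparameters are assumptions, and the repo's ptbert.py may differ:

    # Minimal BERT fine-tuning sketch (assumes Hugging Face transformers;
    # checkpoint, data, and hyperparameters are illustrative only).
    import torch
    from transformers import BertTokenizer, BertForSequenceClassification

    tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
    model = BertForSequenceClassification.from_pretrained(
        "bert-base-chinese", num_labels=2)
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

    texts, labels = ["质量很好", "太差了"], [1, 0]   # toy labeled examples
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    model.train()
    for _ in range(3):                               # a few passes over the batch
        loss = model(**batch, labels=torch.tensor(labels)).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()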

Next, distill BERT's knowledge into the small model.

This requires unpacking data/cache/word2vec.gz first.

Then run:

python distill.py
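
The objective in the referenced paper trains the student to match the fine-tuned teacher's logits with a mean-squared-error penalty, optionally mixed with cross-entropy on gold labels. A minimal sketch; the function name and alpha weight are illustrative, not necessarily what distill.py uses:

    # Sketch of the logit-matching distillation loss from the referenced paper
    # (alpha and the labeled/unlabeled handling are illustrative assumptions).
    import torch
    import torch.nn.functional as F

    def distill_loss(student_logits, teacher_logits, labels=None, alpha=0.5):
        mse = F.mse_loss(student_logits, teacher_logits)
        if labels is None:                    # unlabeled transfer set:
            return mse                        # match the teacher only
        ce = F.cross_entropy(student_logits, labels)
        return alpha * ce + (1.0 - alpha) * mse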

Adjust use_aug and the parameters below it in the script to enable two of the data augmentation strategies mentioned in the paper (masking and n-gram sampling); a sketch of the two strategies follows.
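
For illustration, a rough sketch of those two strategies; the masking probability, n-gram bounds, and [MASK] placeholder are assumptions rather than the repo's exact implementation:

    # Sketch of the paper's masking and n-gram sampling augmentations
    # (probabilities and token choices are illustrative assumptions).
    import random

    def mask_tokens(tokens, p_mask=0.1, mask_token="[MASK]"):
        # Replace each token with a mask symbol with probability p_mask.
        return [mask_token if random.random() < p_mask else t for t in tokens]

    def ngram_sample(tokens, min_n=1, max_n=5):
        # Keep only a random contiguous n-gram of the sentence.
        n = random.randint(min_n, min(max_n, len(tokens)))
        start = random.randint(0, len(tokens) - n)
        return tokens[start:start + n]

    tokens = "这 件 衣服 质量 很 好".split()
    print(mask_tokens(tokens))    # e.g. ['这', '[MASK]', '衣服', '质量', '很', '好']
    print(ngram_sample(tokens))   # e.g. ['衣服', '质量', '很']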
