
lancopku / Mesimp

Codes for "Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method"

Projects that are alternatives of or similar to Mesimp

Keras Attention
Visualizing RNNs using the attention mechanism
Stars: ✭ 697 (+4256.25%)
Mutual labels:  natural-language-processing
Coursera
Quiz & Assignment of Coursera
Stars: ✭ 774 (+4737.5%)
Mutual labels:  natural-language-processing
Nlg Eval
Evaluation code for various unsupervised automated metrics for Natural Language Generation.
Stars: ✭ 822 (+5037.5%)
Mutual labels:  natural-language-processing
Machine Learning
A repository created to help those getting started with machine learning or preparing a study group.
Stars: ✭ 705 (+4306.25%)
Mutual labels:  natural-language-processing
Youtokentome
Unsupervised text tokenizer focused on computational efficiency
Stars: ✭ 728 (+4450%)
Mutual labels:  natural-language-processing
Torchmoji
😇 A PyTorch implementation of the DeepMoji model: a state-of-the-art deep learning model for analyzing sentiment, emotion, sarcasm, etc.
Stars: ✭ 795 (+4868.75%)
Mutual labels:  natural-language-processing
Madewithml
Learn how to responsibly deliver value with ML.
Stars: ✭ 29,253 (+182731.25%)
Mutual labels:  natural-language-processing
Lightning Bolts
Toolbox of models, callbacks, and datasets for AI/ML researchers.
Stars: ✭ 829 (+5081.25%)
Mutual labels:  natural-language-processing
Jcseg
Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, along with keyword extraction, key sentence extraction, and summary extraction based on the TEXTRANK algorithm. Jcseg also has a built-in HTTP server and search modules for recent versions of Lucene, Solr, and Elasticsearch.
Stars: ✭ 754 (+4612.5%)
Mutual labels:  natural-language-processing
Insuranceqa Corpus Zh
🚁 An insurance-industry corpus and chatbot (Chinese)
Stars: ✭ 821 (+5031.25%)
Mutual labels:  natural-language-processing
Machine learning examples
A collection of machine learning examples and tutorials.
Stars: ✭ 6,466 (+40312.5%)
Mutual labels:  natural-language-processing
Ecco
Visualize and explore NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2).
Stars: ✭ 723 (+4418.75%)
Mutual labels:  natural-language-processing
Spacy Models
💫 Models for the spaCy Natural Language Processing (NLP) library
Stars: ✭ 796 (+4875%)
Mutual labels:  natural-language-processing
Ai Series
📚 [.md & .ipynb] Series on Artificial Intelligence & Deep Learning, including Mathematics Fundamentals, Python Practices, NLP Applications, etc. 💫 Hands-on AI and deep learning: mathematical statistics | machine learning | deep learning | natural language processing | tool practice with Scikit, TensorFlow & PyTorch | industry applications & course notes
Stars: ✭ 702 (+4287.5%)
Mutual labels:  natural-language-processing
Underthesea
Underthesea - Vietnamese NLP Toolkit
Stars: ✭ 823 (+5043.75%)
Mutual labels:  natural-language-processing
Bert
TensorFlow code and pre-trained models for BERT
Stars: ✭ 29,971 (+187218.75%)
Mutual labels:  natural-language-processing
Nlp In Practice
Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre-trained embeddings and more.
Stars: ✭ 790 (+4837.5%)
Mutual labels:  natural-language-processing
Awesome Ai Ml Dl
Awesome Artificial Intelligence, Machine Learning and Deep Learning as we learn it. Study notes and a curated list of awesome resources of such topics.
Stars: ✭ 831 (+5093.75%)
Mutual labels:  natural-language-processing
Ciphey
⚡ Automatically decrypt encryptions without knowing the key or cipher, decode encodings, and crack hashes ⚡
Stars: ✭ 9,116 (+56875%)
Mutual labels:  natural-language-processing
Pororo
PORORO: Platform Of neuRal mOdels for natuRal language prOcessing
Stars: ✭ 812 (+4975%)
Mutual labels:  natural-language-processing

meSimp

This code was used for the experiments on MNIST in Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method [pdf] by Xu Sun, Xuancheng Ren, Shuming Ma, Bingzhen Wei, Wei Li, and Houfeng Wang. The code is written in C#.

Introduction

We propose a simple yet effective technique to simplify both the training of neural networks and the resulting models. The technique is based on top-k selection of the gradients in back propagation.
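
As an informal illustration of what the top-k selection means in practice, the sketch below (written in C# to match the rest of the code, but not taken from this repository; all names are made up for the example) keeps only the k largest-magnitude output gradients in the backward pass of a linear layer and skips the weight rows whose gradients were dropped:

using System;
using System.Linq;

static class TopKBackprop
{
    // Keep only the k largest-magnitude entries of the output gradient,
    // zeroing the rest (the core of the top-k selection).
    public static double[] Sparsify(double[] gradOutput, int k)
    {
        var sparse = new double[gradOutput.Length];
        var topIndices = Enumerable.Range(0, gradOutput.Length)
            .OrderByDescending(i => Math.Abs(gradOutput[i]))
            .Take(k);
        foreach (int i in topIndices)
            sparse[i] = gradOutput[i];
        return sparse;
    }

    // Backward pass of a linear layer y = W * x using the sparsified gradient:
    // rows of W whose output gradient was dropped receive no update at all.
    public static void UpdateWeights(double[,] weight, double[] input,
                                     double[] gradOutput, int k, double learningRate)
    {
        double[] sparseGrad = Sparsify(gradOutput, k);
        for (int row = 0; row < weight.GetLength(0); row++)
        {
            if (sparseGrad[row] == 0.0) continue; // dropped gradient: no update cost
            for (int col = 0; col < weight.GetLength(1); col++)
                weight[row, col] -= learningRate * sparseGrad[row] * input[col];
        }
    }
}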

Based on the sparsified gradients from meProp, we further simplify the model by eliminating the rows or columns that are seldom updated, which reduces the computational cost of both training and decoding, and can accelerate decoding in real-world applications. We name this method meSimp (minimal effort simplification).
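
A similarly informal sketch of the simplification step (again illustrative C#, not the repository's implementation; the class and method names are hypothetical): count how often each hidden unit survives the top-k selection during training, and periodically prune the units that are rarely updated, which removes the corresponding row of the incoming weight matrix and column of the outgoing one.

using System.Collections.Generic;
using System.Linq;

class HiddenUnitPruner
{
    private readonly int[] updateCounts;

    public HiddenUnitPruner(int hiddenSize)
    {
        updateCounts = new int[hiddenSize];
    }

    // Call once per training step with the indices kept by the top-k selection.
    public void Record(IEnumerable<int> keptIndices)
    {
        foreach (int i in keptIndices)
            updateCounts[i]++;
    }

    // Units updated at least `threshold` times survive; the rest can be pruned,
    // shrinking the layer to the surviving dimension.
    public List<int> SurvivingUnits(int threshold)
    {
        return Enumerable.Range(0, updateCounts.Length)
                         .Where(i => updateCounts[i] >= threshold)
                         .ToList();
    }
}

Keeping only the weight rows and columns indexed by the surviving units is what yields the much smaller models (around 9x reduction) reported below.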

The model simplification results show that the method adaptively simplifies the model, often reducing its size by around 9x, without any loss of accuracy, and in some cases even with improved accuracy.

The following figure is an illustration of the idea of meSimp.

[Figure: An illustration of the idea of meSimp.]

TL;DR: Training with meSimp can substantially reduce the size of the neural networks without loss of accuracy, and sometimes even with improved accuracy. The method works with different neural models (MLP and LSTM). The resulting reduced networks perform better than normally trained networks of the same (reduced) dimensions.

Results on test set (please refer to the paper for detailed results and experimental settings):

Method (Adam, CPU)       | Dimension (Avg.) | Test (%)
Parsing (MLP 500d)       | 500              | 89.80
Parsing (meProp top-20)  | 51 (10.2%)       | 90.11 (+0.31)
POS-Tag (LSTM 500d)      | 500              | 97.22
POS-Tag (meProp top-20)  | 60 (12.0%)       | 97.25 (+0.03)
MNIST (MLP 500d)         | 500              | 98.20
MNIST (meProp top-160)   | 154 (30.8%)      | 98.31 (+0.11)

See [pdf] for more details, experimental results, and analysis.

Usage

Requirements

  • Targeting Microsoft .NET Framework 4.6.1+
  • Compatible versions of Mono should also work (tested with Mono 5.0.1)
  • Developed with Microsoft Visual Studio 2017

Dataset

MNIST: Download from the link. Extract the files and place them in the same location as the executable.

Run

Compile the code first, or use the executable provided in releases.

Then

nnmnist.exe <config.json>

or

mono nnmnist.exe <config.json>

where <config.json> is a configuration file. An example configuration file that runs meSimp is included in the source code. The output is written to a file in the same location as the executable.

Citation

If you use this code for your research, please cite the paper it is based on, Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method:

@article{sun17mesimp,
  title     = {Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method},
  author    = {Xu Sun and Xuancheng Ren and Shuming Ma and Bingzhen Wei and Wei Li and Houfeng Wang},
  journal   = {CoRR},
  volume    = {abs/1711.06528},
  year      = {2017}
}