
hvy / chainer-param-monitor

Licence: other
Monitor parameter and gradient statistics during neural network training with Chainer

Programming Languages

Python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to chainer-param-monitor

Nnpulearning
Non-negative Positive-Unlabeled (nnPU) and unbiased Positive-Unlabeled (uPU) learning reproduction code on MNIST and CIFAR10
Stars: ✭ 181 (+1292.31%)
Mutual labels:  chainer
chainer-grad-cam
Chainer implementation of Grad-CAM
Stars: ✭ 20 (+53.85%)
Mutual labels:  chainer
BMI219-2017-ProteinFolding
UCSF BMI219 Deep Learning (2017), Coding example (Prediction of protein folding with RNN and CNN)
Stars: ✭ 14 (+7.69%)
Mutual labels:  chainer
Alpha Zero General
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
Stars: ✭ 2,617 (+20030.77%)
Mutual labels:  chainer
kawaii creator
Photo to illustration converter
Stars: ✭ 79 (+507.69%)
Mutual labels:  chainer
ChainerPruner
ChainerPruner: Channel Pruning framework for Chainer
Stars: ✭ 21 (+61.54%)
Mutual labels:  chainer
Pai
Resource scheduling and cluster management for AI
Stars: ✭ 2,223 (+17000%)
Mutual labels:  chainer
convolutional seq2seq
fairseq: Convolutional Sequence to Sequence Learning (Gehring et al., 2017) in Chainer
Stars: ✭ 63 (+384.62%)
Mutual labels:  chainer
pyner
🌈 Implementation of Neural Network based Named Entity Recognizer (Lample+, 2016) using Chainer.
Stars: ✭ 45 (+246.15%)
Mutual labels:  chainer
chainer-pix2pix
Chainer implementation of Image-to-Image Translation Using Conditional Adversarial Networks
Stars: ✭ 40 (+207.69%)
Mutual labels:  chainer
Bert Chainer
Chainer implementation of "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
Stars: ✭ 205 (+1476.92%)
Mutual labels:  chainer
Einops
Deep learning operations reinvented (for pytorch, tensorflow, jax and others)
Stars: ✭ 4,022 (+30838.46%)
Mutual labels:  chainer
Visual-Attention-Model
Chainer implementation of DeepMind's Visual Attention Model paper
Stars: ✭ 27 (+107.69%)
Mutual labels:  chainer
Dynamic routing between capsules
Implementation of Dynamic Routing Between Capsules, Sara Sabour, Nicholas Frosst, Geoffrey E Hinton, NIPS 2017
Stars: ✭ 202 (+1453.85%)
Mutual labels:  chainer
deep-learning-tutorial-with-chainer
Deep learning tutorial with Chainer
Stars: ✭ 25 (+92.31%)
Mutual labels:  chainer
Imgclsmob
Sandbox for training deep learning networks
Stars: ✭ 2,405 (+18400%)
Mutual labels:  chainer
3dgan-chainer
📦 A Chainer implementation of 3D Generative Adversarial Network.
Stars: ✭ 25 (+92.31%)
Mutual labels:  chainer
chainer-fcis
[This project has moved to ChainerCV] Chainer Implementation of Fully Convolutional Instance-aware Semantic Segmentation
Stars: ✭ 45 (+246.15%)
Mutual labels:  chainer
build-user-vars-plugin
Set of environment variables that describe the user who started the build
Stars: ✭ 40 (+207.69%)
Mutual labels:  parameter
chainer2pytorch
Converts Chainer modules to PyTorch, parameters included.
Stars: ✭ 36 (+176.92%)
Mutual labels:  chainer

Neural Network Monitoring for Chainer Models

This is a Chainer plugin for computing statistics over weights, biases and gradients during training.

You can collect the above-mentioned data from any chainer.Chain at each iteration or epoch and save it to a log, e.g. using chainer.report(), so that the statistical changes over the course of training can be plotted later on.

Note: It is not yet optimized for speed; computing percentiles, for instance, is slow.

Statistics

An example plot of weights, biases and gradients from different convolutional and fully connected layers.

Data

  • Mean
  • Standard deviation
  • Min
  • Max
  • Percentiles
  • Sparsity (simply the number of zero-valued elements)

Targets

  • Weights
  • Biases
  • Gradients

For a specific layer or the aggregated data over the entire model.
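
To make the lists above concrete, below is a minimal sketch of how such statistics can be computed with NumPy by walking the parameters of a chainer.Chain. The function name collect_statistics and the key layout are illustrative assumptions, not this plugin's actual API; see the Usage section below for the real entry points.

import numpy as np

# Hypothetical sketch, not the plugin's actual API: compute the listed
# statistics over every parameter in a chainer.Chain using NumPy.
def collect_statistics(model, percentiles=(1, 25, 50, 75, 99)):
    stats = {}
    for name, param in model.namedparams():
        data = param.data  # parameter values; param.grad holds the gradients
        stats[name + '/mean'] = float(np.mean(data))
        stats[name + '/std'] = float(np.std(data))
        stats[name + '/min'] = float(np.min(data))
        stats[name + '/max'] = float(np.max(data))
        for p, v in zip(percentiles, np.percentile(data, percentiles)):
            stats['%s/percentile/%d' % (name, p)] = float(v)
        # Sparsity in the sense used above: the number of zero elements.
        stats[name + '/zeros'] = int((data == 0).sum())
    return stats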

Dependencies

Chainer 1.18.0 (with NumPy 1.11.2)

Example

Usage

# This is simplified code; see the 'example' directory for a working example.
import chainer

import monitor

# Prepare the model and the optimizer.
model = MLP()
optimizer = chainer.optimizers.SGD()
optimizer.setup(model)

# Forward computation, backpropagation and a parameter update.
# The gradients are still stored inside each parameter after these steps.
model.cleargrads()
loss = model(x, t)
loss.backward()
optimizer.update()

# Use the plugin to collect data and ask Chainer to include it in the log.
weight_report = monitor.weight_statistics(model)
chainer.report(weight_report)  # Mean, std, min, max, percentiles

bias_report = monitor.bias_statistics(model)
chainer.report(bias_report)

fst_layer_grads = monitor.weight_gradient_statistics(model, layer_name='fc1')
chainer.report(fst_layer_grads)

zeros = monitor.sparsity(model, include_bias=False)
chainer.report(zeros)
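
If training is driven by a chainer.training.Trainer rather than a manual loop, the same calls can plausibly be wrapped in a small extension so the statistics are reported once per epoch and picked up by LogReport. A sketch under that assumption (everything except the monitor functions is standard Chainer):

import chainer
from chainer import training

import monitor

# Sketch: report parameter statistics once per epoch from inside a Trainer.
# PRIORITY_WRITER makes the reported values visible to a later LogReport.
@training.make_extension(trigger=(1, 'epoch'), priority=training.PRIORITY_WRITER)
def report_parameter_statistics(trainer):
    model = trainer.updater.get_optimizer('main').target
    chainer.report(monitor.weight_statistics(model))
    chainer.report(monitor.bias_statistics(model))

trainer.extend(report_parameter_statistics)  # 'trainer' set up elsewhere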

Plotting the Statistics

Weights and biases from training a small convolutional neural network for classification for 100 epochs, aggregated over all layers (including the final fully connected layers). The shaded bands with different alpha (opacity) levels show different percentiles.

Weights

Biases
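
One way to reproduce this kind of plot from the JSON log that Chainer's LogReport writes (by default to result/log) is sketched below with matplotlib. The key names 'weight/mean' and 'weight/percentile/...' are assumptions; they depend on what was passed to chainer.report() during training.

import json

import matplotlib.pyplot as plt

# Sketch: plot the mean weight with a shaded percentile band over epochs,
# reading the JSON log produced by chainer.training.extensions.LogReport.
with open('result/log') as f:
    log = json.load(f)

epochs = [entry['epoch'] for entry in log]
mean = [entry['weight/mean'] for entry in log]            # assumed key names
lower = [entry['weight/percentile/25'] for entry in log]
upper = [entry['weight/percentile/75'] for entry in log]

plt.plot(epochs, mean, label='mean')
plt.fill_between(epochs, lower, upper, alpha=0.2, label='25th-75th percentile')
plt.xlabel('epoch')
plt.ylabel('weight value')
plt.legend()
plt.savefig('weights.png')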

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].