
majumderb / Rezero

License: MIT
Official PyTorch Repo for "ReZero is All You Need: Fast Convergence at Large Depth"

Programming Languages

Python

Projects that are alternatives to or similar to Rezero

Real Time Gesrec
Real-time Hand Gesture Recognition with PyTorch on EgoGesture, NvGesture, Jester, Kinetics and UCF101
Stars: ✭ 339 (+6.94%)
Mutual labels:  deep-neural-networks, resnet
Deep Ranking
Learning Fine-grained Image Similarity with Deep Ranking is a novel application of neural networks, in which the authors combine a new multi-scale architecture with a triplet loss to create a network that can perform image search. This repository is a simplified implementation of the same.
Stars: ✭ 64 (-79.81%)
Mutual labels:  deep-neural-networks, resnet
Flow Forecast
Deep learning PyTorch library for time series forecasting, classification, and anomaly detection (originally for flood forecasting).
Stars: ✭ 368 (+16.09%)
Mutual labels:  deep-neural-networks, transformer
Bmw Tensorflow Training Gui
This repository allows you to get started with GUI-based training of a state-of-the-art deep learning model with little to no configuration needed! No-code training with TensorFlow has never been so easy.
Stars: ✭ 736 (+132.18%)
Mutual labels:  deep-neural-networks, resnet
Paddlex
PaddlePaddle End-to-End Development Toolkit (a full-workflow development tool for PaddlePaddle deep learning)
Stars: ✭ 3,399 (+972.24%)
Mutual labels:  deep-neural-networks, resnet
Eeg Dl
A Deep Learning library for EEG Tasks (Signals) Classification, based on TensorFlow.
Stars: ✭ 165 (-47.95%)
Mutual labels:  resnet, transformer
Sockeye
Sequence-to-sequence framework with a focus on Neural Machine Translation, based on Apache MXNet
Stars: ✭ 990 (+212.3%)
Mutual labels:  deep-neural-networks, transformer
Deep Ctr Prediction
CTR prediction models based on deep learning (deep-learning-based CTR prediction models for ad recommendation)
Stars: ✭ 628 (+98.11%)
Mutual labels:  resnet, transformer
Voice activity detection
Voice Activity Detection based on Deep Learning & TensorFlow
Stars: ✭ 132 (-58.36%)
Mutual labels:  deep-neural-networks, resnet
Tensorflow2.0 Examples
🙄 Difficult algorithm, Simple code.
Stars: ✭ 1,397 (+340.69%)
Mutual labels:  deep-neural-networks, resnet
Gluon2pytorch
Gluon to PyTorch deep neural network model converter
Stars: ✭ 70 (-77.92%)
Mutual labels:  deep-neural-networks, resnet
Octconv.pytorch
PyTorch implementation of Octave Convolution with pre-trained Oct-ResNet and Oct-MobileNet models
Stars: ✭ 229 (-27.76%)
Mutual labels:  deep-neural-networks, resnet
Iresnet
Improved Residual Networks (https://arxiv.org/pdf/2004.04989.pdf)
Stars: ✭ 163 (-48.58%)
Mutual labels:  deep-neural-networks, resnet
Dab
Data Augmentation by Backtranslation (DAB) ヽ( •_-)ᕗ
Stars: ✭ 294 (-7.26%)
Mutual labels:  deep-neural-networks, transformer
Deep Learning Uncertainty
Literature survey, paper reviews, experimental setups and a collection of implementations of baseline methods for predictive uncertainty estimation in deep learning models.
Stars: ✭ 296 (-6.62%)
Mutual labels:  deep-neural-networks
Tensorflow Image Detection
A generic image detection program that uses Google's machine learning library TensorFlow and a pre-trained deep learning convolutional neural network model called Inception.
Stars: ✭ 306 (-3.47%)
Mutual labels:  deep-neural-networks
Cascaded Fcn
Source code for the MICCAI 2016 Paper "Automatic Liver and Lesion Segmentation in CT Using Cascaded Fully Convolutional Neural Networks and 3D Conditional Random Fields"
Stars: ✭ 296 (-6.62%)
Mutual labels:  deep-neural-networks
Model Compression Papers
Papers for deep neural network compression and acceleration
Stars: ✭ 296 (-6.62%)
Mutual labels:  deep-neural-networks
Pytorch Vdsr
VDSR (CVPR 2016) PyTorch implementation
Stars: ✭ 313 (-1.26%)
Mutual labels:  deep-neural-networks
Deepxi
Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR.
Stars: ✭ 304 (-4.1%)
Mutual labels:  resnet

ReZero for Deep Neural Networks

ReZero is All You Need: Fast Convergence at Large Depth; arXiv, March 2020.

Thomas Bachlechner*, Bodhisattwa Prasad Majumder*, Huanru Henry Mao*, Garrison W. Cottrell, Julian McAuley (* denotes equal contributions)

This repository contains the ReZero-Transformer implementation from the paper. It matches the interface of PyTorch's Transformer layers and can be used as a drop-in replacement.

Quick Links: Abstract | Installation | Usage | Tutorials | Citation

Abstract

Deep networks have enabled significant performance gains across domains, but they often suffer from vanishing/exploding gradients. This is especially true for Transformer architectures where depth beyond 12 layers is difficult to train without large datasets and computational budgets. In general, we find that inefficient signal propagation impedes learning in deep networks. In Transformers, multi-head self-attention is the main cause of this poor signal propagation. To facilitate deep signal propagation, we propose ReZero, a simple change to the architecture that initializes an arbitrary layer as the identity map, using a single additional learned parameter per layer. We apply this technique to language modeling and find that we can easily train ReZero-Transformer networks over a hundred layers. When applied to 12 layer Transformers, ReZero converges 56% faster on enwiki8. ReZero applies beyond Transformers to other residual networks, enabling 1,500% faster convergence for deep fully connected networks and 32% faster convergence for a ResNet-56 trained on CIFAR 10.
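
Concretely, ReZero replaces the usual residual update x_{i+1} = x_i + F(x_i) (and its LayerNorm variants) with x_{i+1} = x_i + alpha_i * F(x_i), where the scalar alpha_i is learned and initialized to zero, so every layer starts as the identity. As a rough illustration only (the module name and layer sizes below are made up for this sketch and are not part of the rezero package), a generic ReZero residual block might look like:

import torch
import torch.nn as nn

class ReZeroBlock(nn.Module):
    # Residual block with a ReZero connection: x + alpha * F(x).
    # alpha is the single extra learned parameter per layer; initializing
    # it to zero makes the whole block the identity map at the start of training.
    def __init__(self, d_model=512, d_hidden=2048):
        super().__init__()
        self.f = nn.Sequential(
            nn.Linear(d_model, d_hidden),
            nn.ReLU(),
            nn.Linear(d_hidden, d_model),
        )
        self.alpha = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        return x + self.alpha * self.f(x)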

Installation

Simply install from pip:

pip install rezero

PyTorch 1.4 or greater is required.

Usage

We provide custom ReZero Transformer layers (RZTX).

For example, this will create a Transformer encoder:

import torch
import torch.nn as nn
from rezero.transformer import RZTXEncoderLayer

# RZTXEncoderLayer is a drop-in replacement for nn.TransformerEncoderLayer
encoder_layer = RZTXEncoderLayer(d_model=512, nhead=8)
transformer_encoder = nn.TransformerEncoder(encoder_layer, num_layers=6)
src = torch.rand(10, 32, 512)  # (seq_len, batch, d_model)
out = transformer_encoder(src)

This will create a Transformer decoder:

import torch
import torch.nn as nn
from rezero.transformer import RZTXDecoderLayer

# RZTXDecoderLayer is a drop-in replacement for nn.TransformerDecoderLayer
decoder_layer = RZTXDecoderLayer(d_model=512, nhead=8)
transformer_decoder = nn.TransformerDecoder(decoder_layer, num_layers=6)
memory = torch.rand(10, 32, 512)  # encoder output: (src_len, batch, d_model)
tgt = torch.rand(20, 32, 512)     # target sequence: (tgt_len, batch, d_model)
out = transformer_decoder(tgt, memory)

Make sure the norm argument is left as None so that no LayerNorm is applied inside the Transformer.
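
For instance, PyTorch's nn.TransformerEncoder takes an optional norm module that already defaults to None; with ReZero layers, simply leave it at the default:

# norm defaults to None; do not pass a LayerNorm here when using ReZero layers
transformer_encoder = nn.TransformerEncoder(encoder_layer, num_layers=6, norm=None)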

See https://pytorch.org/docs/master/nn.html#torch.nn.Transformer for details on how to integrate custom Transformer layers into PyTorch.

Tutorials

  1. Training a 128-layer ReZero Transformer on WikiText-2 language modeling
  2. Training a 10,000-layer ReZero neural network on CIFAR-10 data

Watch for more tutorials in this space.

Citation

If you find ReZero useful for your research, please cite our paper:

@inproceedings{BacMajMaoCotMcA20,
    title = "ReZero is All You Need: Fast Convergence at Large Depth",
    author = "Bachlechner, Thomas  and
      Majumder, Bodhisattwa Prasad and
      Mao, Huanru Henry and
      Cottrell, Garrison W. and
      McAuley, Julian",
    booktitle = "arXiv",
    year = "2020",
    url = "https://arxiv.org/abs/2003.04887"
}