
fabio-deep / ReZero-ResNet

License: MIT
Unofficial PyTorch implementation of ReZero in ResNet


Projects that are alternatives to or similar to ReZero-ResNet

resnet-cifar10
ResNet for Cifar10
Stars: ✭ 21 (-8.7%)
Mutual labels:  resnet, residual-networks, cifar10
Keras-CIFAR10
practice on CIFAR10 with Keras
Stars: ✭ 25 (+8.7%)
Mutual labels:  resnet, cifar10
wideresnet-tensorlayer
Wide Residual Networks implemented in TensorLayer and TensorFlow.
Stars: ✭ 44 (+91.3%)
Mutual labels:  resnet, residual-networks
Resnet
Tensorflow ResNet implementation on cifar10
Stars: ✭ 10 (-56.52%)
Mutual labels:  resnet, cifar10
Pytorch Speech Commands
Speech commands recognition with PyTorch
Stars: ✭ 128 (+456.52%)
Mutual labels:  resnet, cifar10
Bsconv
Reference implementation for Blueprint Separable Convolutions (CVPR 2020)
Stars: ✭ 84 (+265.22%)
Mutual labels:  resnet, cifar10
Retinal-Disease-Diagnosis-With-Residual-Attention-Networks
Using Residual Attention Networks to diagnose retinal diseases in medical images
Stars: ✭ 14 (-39.13%)
Mutual labels:  resnet, residual-networks
caffe-wrn-generator
Caffe Wide-Residual-Network (WRN) Generator
Stars: ✭ 19 (-17.39%)
Mutual labels:  resnet, residual-networks
Resnet On Cifar10
Reimplementation of ResNet on CIFAR-10 with Caffe
Stars: ✭ 123 (+434.78%)
Mutual labels:  resnet, cifar10
Pytorch Classification
Classification with PyTorch.
Stars: ✭ 1,268 (+5413.04%)
Mutual labels:  resnet, cifar10
Chainer Cifar10
Various CNN models for CIFAR10 with Chainer
Stars: ✭ 134 (+482.61%)
Mutual labels:  resnet, cifar10
Resnet Cifar10 Caffe
ResNet-20/32/44/56/110 on CIFAR-10 with Caffe
Stars: ✭ 161 (+600%)
Mutual labels:  resnet, cifar10
miopen-benchmark
benchmarking miopen
Stars: ✭ 17 (-26.09%)
Mutual labels:  resnet
vehicle recognition
A method for vehicle model recognition using ResNet
Stars: ✭ 32 (+39.13%)
Mutual labels:  resnet
segmentation-enhanced-resunet
Urban building extraction in Daejeon region using Modified Residual U-Net (Modified ResUnet) and applying post-processing.
Stars: ✭ 34 (+47.83%)
Mutual labels:  residual-networks
SE-Net-CIFAR
SE-Net combined with ResNet and WideResNet on the CIFAR-10/100 datasets
Stars: ✭ 48 (+108.7%)
Mutual labels:  resnet
pb-gcn
Code for the BMVC paper (http://bmvc2018.org/contents/papers/1003.pdf)
Stars: ✭ 32 (+39.13%)
Mutual labels:  resnet
gluon2pytorch
Gluon to PyTorch deep neural network model converter
Stars: ✭ 72 (+213.04%)
Mutual labels:  resnet
DMPfold
De novo protein structure prediction using iteratively predicted structural constraints
Stars: ✭ 52 (+126.09%)
Mutual labels:  resnet
BottleneckTransformers
Bottleneck Transformers for Visual Recognition
Stars: ✭ 231 (+904.35%)
Mutual labels:  cifar10

ReZero ResNet: Unofficial PyTorch Implementation.

A few networks were trained for (fun) comparisons, using identical hyperparameters and early stopping once validation accuracy plateaued.
All experiments can be reproduced with the code in this repo using the default hyperparameters defined in src/main.py.
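
For illustration, the plateau-style early stopping described above might look like the sketch below (hypothetical helper names, not the repo's actual API; the real schedule lives in src/main.py):

```python
import torch

# Hypothetical sketch: stop when validation accuracy stops improving.
# train_one_epoch() and evaluate() are assumed helpers, not the repo's API.
best_acc, patience, bad_epochs = 0.0, 20, 0
for epoch in range(500):
    train_one_epoch(model, train_loader)
    val_acc = evaluate(model, valid_loader)
    if val_acc > best_acc:                      # improvement: reset the counter
        best_acc, bad_epochs = val_acc, 0
        torch.save(model.state_dict(), "best.pt")
    else:
        bad_epochs += 1                         # no improvement this epoch
    if bad_epochs >= patience:                  # accuracy has plateaued
        break
```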

Check out the ReZero paper by the authors: https://arxiv.org/pdf/2003.04887.pdf
It is a neat idea that seems to improve ResNet convergence speed, especially at the beginning of training (see figures).
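
The core of the trick is a single learnable scalar per residual block, initialised to zero, so that every block starts out as the identity map: x_{i+1} = x_i + alpha_i * F(x_i). A minimal sketch of such a block (illustrative only; the repo's actual classes, and whether BatchNorm is kept alongside the gate, may differ):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ReZeroBlock(nn.Module):
    """Residual block gated by a zero-initialised scalar: out = x + alpha * F(x)."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.alpha = nn.Parameter(torch.zeros(1))  # the ReZero gate, starts at 0

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(x + self.alpha * out)  # identity at init, since alpha == 0
```

Because alpha starts at zero, each block is exactly the identity at initialisation and gradients flow unimpeded through the skip connections, which is what produces the early-training speed-up reported below.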

ReZero ResNet vs. ResNet on CIFAR-10:

| Model | # params | Runtime | Epochs | Valid. error (%) | Test error (%) |
|---|---|---|---|---|---|
| ResNet-20 | 272,474 | 70m3s | 398 | 7.63 | 7.98 |
| ResNet-56 | 855,770 | 127m41s | 281 | 6.04 | 6.44 |
| ResNet-110 | 1,730,768 | 240m53s | 313 | 6.00 | 6.39 |
| ReZero ResNet-20 | 272,483 | 63m9s | 327 | 7.44 | 7.94 |
| ReZero ResNet-56 | 855,797 | 134m44s | 303 | 6.31 | 6.55 |
| ReZero ResNet-110 | 1,730,714 | 301m19s | 410 | 5.84 | 5.88 |

Loss & error curves (see the figures in the repo) for ResNet-20, ResNet-56 and ResNet-110.

This repo vs. original ResNet paper:

| Model | Test error (%) (paper) | Test error (%) (this repo) |
|---|---|---|
| ResNet-20 | 8.75 | 7.98 |
| ResNet-56 | 6.97 | 6.44 |
| ResNet-110 | 6.43 | 6.39 |

Run

You can launch distributed training from src/ using:

```
python -m torch.distributed.launch --nnodes=1 --node_rank=0 --nproc_per_node=2 --use_env main.py
```

This trains on a single machine (--nnodes=1), assigning one process per GPU, where --nproc_per_node=2 means training on 2 GPUs. To train on N GPUs, simply launch N processes by setting --nproc_per_node=N.
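
Under the hood, each launched process is expected to join a process group and wrap its model in DistributedDataParallel. A generic sketch of such an entry point (not the repo's actual main.py; build_model, train_dataset and num_epochs are assumed placeholders) could be:

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler


def main():
    # With --use_env, the launcher passes ranks via environment variables.
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend="nccl")  # NCCL backend for GPU training

    model = build_model().cuda(local_rank)       # build_model: placeholder
    model = DDP(model, device_ids=[local_rank])

    # DistributedSampler shards the data so each process sees a distinct subset.
    sampler = DistributedSampler(train_dataset)  # train_dataset: placeholder
    loader = DataLoader(train_dataset, batch_size=128, sampler=sampler)

    for epoch in range(num_epochs):              # num_epochs: placeholder
        sampler.set_epoch(epoch)  # reshuffle the shards each epoch
        for x, y in loader:
            ...  # forward/backward as usual; DDP syncs gradients automatically


if __name__ == "__main__":
    main()
```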

The number of CPU threads used per process is hard-coded to torch.set_num_threads(1) for safety; for better performance you can change it to (number of CPU threads) / nproc_per_node.
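
For example, something along these lines (illustrative only; on a single node, WORLD_SIZE equals nproc_per_node) would split the machine's threads evenly:

```python
import os
import torch

# Illustrative: divide the available CPU threads among the launched processes.
nprocs = int(os.environ.get("WORLD_SIZE", 1))  # = nproc_per_node on one node
torch.set_num_threads(max(1, (os.cpu_count() or 1) // nprocs))
```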

For more info on multi-node and multi-GPU distributed training, refer to https://github.com/hgrover/pytorchdistr/blob/master/README.md

To train normally with nn.DataParallel, or on the CPU:

```
python main.py --no_distributed
```