
benjs / Nfnets_pytorch

Licence: apache-2.0
Pre-trained NFNets with 99% of the accuracy of the official paper "High-Performance Large-Scale Image Recognition Without Normalization".

Programming Languages

python

Projects that are alternatives to or similar to Nfnets_pytorch

Bert Ner
Pytorch-Named-Entity-Recognition-with-BERT
Stars: ✭ 829 (+875.29%)
Mutual labels:  pretrained-models
Ml In Tf
Get started with Machine Learning in TensorFlow with a selection of good reads and implemented examples!
Stars: ✭ 45 (-47.06%)
Mutual labels:  deepmind
Recurrent Environment Simulators
Deepmind Recurrent Environment Simulators paper implementation in tensorflow
Stars: ✭ 73 (-14.12%)
Mutual labels:  deepmind
Musical Onset Efficient
Supplementary information and code for the paper: An efficient deep learning model for musical onset detection
Stars: ✭ 26 (-69.41%)
Mutual labels:  pretrained-models
Pyannote Audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Stars: ✭ 978 (+1050.59%)
Mutual labels:  pretrained-models
Gpt2 Ml
GPT2 for Multiple Languages, including pretrained models. GPT2 多语言支持, 15亿参数中文预训练模型
Stars: ✭ 1,066 (+1154.12%)
Mutual labels:  pretrained-models
Srgan Tensorflow
Tensorflow implementation of the SRGAN algorithm for single image super-resolution
Stars: ✭ 754 (+787.06%)
Mutual labels:  pretrained-models
Sc2aibot
Implementing reinforcement learning algorithms for the pysc2 environment
Stars: ✭ 83 (-2.35%)
Mutual labels:  deepmind
Cv Pretrained Model
A collection of computer vision pre-trained models.
Stars: ✭ 995 (+1070.59%)
Mutual labels:  pretrained-models
Farm
🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
Stars: ✭ 1,140 (+1241.18%)
Mutual labels:  pretrained-models
Classification models
Classification models trained on ImageNet. Keras.
Stars: ✭ 938 (+1003.53%)
Mutual labels:  pretrained-models
Asteroid
The PyTorch-based audio source separation toolkit for researchers
Stars: ✭ 862 (+914.12%)
Mutual labels:  pretrained-models
Pytorch Image Models
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more
Stars: ✭ 15,232 (+17820%)
Mutual labels:  pretrained-models
Prosr
Repository containing an independent implementation of the paper: "A Fully Progressive Approach to Single-Image Super-Resolution"
Stars: ✭ 923 (+985.88%)
Mutual labels:  pretrained-models
Dmc2gym
OpenAI Gym wrapper for the DeepMind Control Suite
Stars: ✭ 75 (-11.76%)
Mutual labels:  deepmind
Bert Keras
Keras implementation of BERT with pre-trained weights
Stars: ✭ 820 (+864.71%)
Mutual labels:  pretrained-models
Mujocounity
Reproducing MuJoCo benchmarks in a modern, commercial game/physics engine (Unity + PhysX).
Stars: ✭ 47 (-44.71%)
Mutual labels:  deepmind
Gen Efficientnet Pytorch
Pretrained EfficientNet, EfficientNet-Lite, MixNet, MobileNetV3 / V2, MNASNet A1 and B1, FBNet, Single-Path NAS
Stars: ✭ 1,275 (+1400%)
Mutual labels:  pretrained-models
Dialogue Understanding
This repository contains PyTorch implementation for the baseline models from the paper Utterance-level Dialogue Understanding: An Empirical Study
Stars: ✭ 77 (-9.41%)
Mutual labels:  pretrained-models
People Counter Python
Create a smart video application using the Intel Distribution of OpenVINO toolkit. The toolkit uses models and inference to run single-class object detection.
Stars: ✭ 62 (-27.06%)
Mutual labels:  pretrained-models

NFNet PyTorch Implementation


This repo contains pretrained NFNet models F0-F6 with high ImageNet accuracy from the paper High-Performance Large-Scale Image Recognition Without Normalization. The small models are as accurate as an EfficientNet-B7, but train 8.7 times faster. The large models set a new SOTA top-1 accuracy on ImageNet.

NFNet                                 F0     F1     F2     F3     F4     F5     F6+SAM
Top-1 accuracy (Brock et al.)         83.6   84.7   85.1   85.7   85.9   86.0   86.5
Top-1 accuracy (this implementation)  82.82  84.63  84.90  85.46  85.66  85.62  TBD

All credits go to the authors of the original paper. This repo is heavily inspired by their JAX implementation in the official repository. Visit their repository for citation information.

Get started

git clone https://github.com/benjs/nfnets_pytorch.git
cd nfnets_pytorch
pip3 install -r requirements.txt

or, if you don't need the evaluation and training scripts:

pip install git+https://github.com/benjs/nfnets_pytorch

Download the pretrained weights from the official repository and call:

from nfnets import pretrained_nfnet
model_F0 = pretrained_nfnet('pretrained/F0_haiku.npz')
model_F1 = pretrained_nfnet('pretrained/F1_haiku.npz')
# ...

The model variant is automatically derived from the parameter count in the pretrained weights file.
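
Once loaded, the model behaves like any other PyTorch module. A minimal inference sketch (the input resolution and the random stand-in tensor are illustrative assumptions, not values from this repo):

import torch

model_F0.eval()  # switch to evaluation mode

# Stand-in for a batch of one preprocessed, normalized image
# (replace with a real input; the 192x192 resolution is an assumption).
x = torch.rand(1, 3, 192, 192)

with torch.no_grad():
    logits = model_F0(x)         # class logits
    top1 = logits.argmax(dim=1)  # predicted ImageNet class index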

Validate yourself

python3 eval.py --pretrained pretrained/F0_haiku.npz --dataset path/to/imagenet/valset/

You can download the ImageNet validation set from the ILSVRC2012 challenge site (after requesting access with, for instance, your .edu mail address) or from Academic Torrents.
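
For illustration, a rough manual equivalent of what eval.py automates, computing top-1 accuracy over the validation folder. The preprocessing pipeline here is an assumption; check eval.py for the exact resizing and normalization the repo uses:

import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Assumed preprocessing -- the real pipeline (including normalization)
# may differ per NFNet variant; see eval.py.
tfm = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

valset = datasets.ImageFolder('path/to/imagenet/valset/', transform=tfm)
loader = DataLoader(valset, batch_size=64, num_workers=4)

correct = total = 0
model_F0.eval()
with torch.no_grad():
    for images, labels in loader:
        preds = model_F0(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()

print(f'Top-1 accuracy: {correct / total:.2%}')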

Scaled weight standardization convolutions in your own model

Simply replace all your nn.Conv2d layers with WSConv2D and all your nn.ReLU activations with VPReLU or VPGELU (variance-preserving ReLU/GELU).

import torch.nn as nn
from nfnets import WSConv2D, VPReLU, VPGELU

# Simply replace your nn.Conv2d layers
class MyNet(nn.Module):
    def __init__(self):
        super(MyNet, self).__init__()
 
        self.activation = VPReLU(inplace=True) # or VPGELU
        self.conv0 = WSConv2D(in_channels=128, out_channels=256, kernel_size=1, ...)
        # ...

    def forward(self, x):
        out = self.activation(self.conv0(x))
        # ...

SGD with adaptive gradient clipping in your own model

Simply replace your SGD optimizer with SGD_AGC.

from nfnets import SGD_AGC

optimizer = SGD_AGC(
        named_params=model.named_parameters(), # Pass named parameters
        lr=1e-3,
        momentum=0.9,
        clipping=0.1, # New clipping parameter
        weight_decay=2e-5, 
        nesterov=True)
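
For reference, the AGC rule from the paper rescales a parameter's gradient unit-wise whenever the ratio of gradient norm to parameter norm exceeds the clipping threshold. A simplified sketch of that rule (illustrative; the optimizer's actual internals may differ):

import torch

def agc(weight, grad, clipping=0.1, eps=1e-3):
    # Assumes a weight tensor with at least 2 dims (conv/linear weights).
    # Unit-wise norms: one norm per output unit (first dimension).
    dims = tuple(range(1, weight.ndim))
    w_norm = weight.norm(dim=dims, keepdim=True).clamp(min=eps)
    g_norm = grad.norm(dim=dims, keepdim=True)
    # Rescale gradients where ||G|| / ||W|| exceeds the threshold.
    max_norm = w_norm * clipping
    scaled = grad * (max_norm / g_norm.clamp(min=1e-6))
    return torch.where(g_norm > max_norm, scaled, grad)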

It is important to exclude certain layers from clipping or weight decay. The authors recommend excluding the last fully connected layer from clipping and the bias/gain parameters from weight decay:

import re

for group in optimizer.param_groups:
    name = group['name'] 
    
    # Exclude from weight decay
    if len(re.findall('stem.*(bias|gain)|conv.*(bias|gain)|skip_gain', name)) > 0:
        group['weight_decay'] = 0

    # Exclude from clipping
    if name.startswith('linear'):
        group['clipping'] = None

Train your own NFNet

Adjust your desired parameters in default_config.yaml and start training.

python3 train.py --dataset /path/to/imagenet/

Some parts are still missing for complete training from scratch:

  • Multi-GPU training
  • Data augmentations
  • FP16 activations and gradients

Contribute

The implementation is still at an early stage in terms of usability and testing. If you have an idea to improve this repo, open an issue, start a discussion, or submit a pull request.

The current development status can be seen in this project board.
