
gupta-abhay / pytorch-vit

Licence: MIT license
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Programming Languages

python

Projects that are alternatives of or similar to pytorch-vit

BottleneckTransformers
Bottleneck Transformers for Visual Recognition
Stars: ✭ 231 (-7.6%)
Mutual labels:  transformers, image-classification, image-recognition
GFNet
[NeurIPS 2021] Global Filter Networks for Image Classification
Stars: ✭ 199 (-20.4%)
Mutual labels:  image-classification, image-recognition, vision-transformer
HugsVision
HugsVision is an easy-to-use Hugging Face wrapper for state-of-the-art computer vision
Stars: ✭ 154 (-38.4%)
Mutual labels:  transformers, image-classification, vit
towhee
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
Stars: ✭ 821 (+228.4%)
Mutual labels:  vit, vision-transformer
mobilevit-pytorch
A PyTorch implementation of "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer".
Stars: ✭ 349 (+39.6%)
Mutual labels:  vit, vision-transformer
Paper-Notes
Paper notes in deep learning/machine learning and computer vision
Stars: ✭ 37 (-85.2%)
Mutual labels:  image-classification, image-recognition
pytorch-cifar-model-zoo
Implementation of Conv-based and Vit-based networks designed for CIFAR.
Stars: ✭ 62 (-75.2%)
Mutual labels:  image-classification, vision-transformer
backprop
Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
Stars: ✭ 229 (-8.4%)
Mutual labels:  transformers, image-classification
rps-cv
A Rock-Paper-Scissors game using computer vision and machine learning on Raspberry Pi
Stars: ✭ 102 (-59.2%)
Mutual labels:  image-classification, image-recognition
image-classification
A collection of SOTA Image Classification Models in PyTorch
Stars: ✭ 70 (-72%)
Mutual labels:  image-classification, vision-transformer
Evo-ViT
Official implementation of Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer
Stars: ✭ 50 (-80%)
Mutual labels:  image-classification, vision-transformer
jpetstore-kubernetes
Modernize and Extend: JPetStore on IBM Cloud Kubernetes Service
Stars: ✭ 21 (-91.6%)
Mutual labels:  image-classification, image-recognition
TensorFlow-Multiclass-Image-Classification-using-CNN-s
Balanced Multiclass Image Classification with TensorFlow on Python.
Stars: ✭ 57 (-77.2%)
Mutual labels:  image-classification, image-recognition
UnityProminentColor
Tool to gather main colors of an image using Unity.
Stars: ✭ 40 (-84%)
Mutual labels:  image-classification, image-recognition
Image-Classification
Pre-trained VGG-Net Model for image classification using tensorflow
Stars: ✭ 29 (-88.4%)
Mutual labels:  image-classification, image-recognition
zalo-landmark
Zalo AI Challenge - Landmark Identification
Stars: ✭ 39 (-84.4%)
Mutual labels:  image-classification, image-recognition
ICCV2021-Paper-Code-Interpretation
ICCV 2021/2019/2017 collection of papers, code, interpretations, and livestreams, compiled by the 极市 (Jishi) team
Stars: ✭ 2,022 (+708.8%)
Mutual labels:  image-classification, image-recognition
MNIST
Handwritten digit recognizer using a feed-forward neural network and the MNIST dataset of 70,000 human-labeled handwritten digits.
Stars: ✭ 28 (-88.8%)
Mutual labels:  image-classification, image-recognition
tensorflow-image-recognition-chrome-extension
Chrome browser extension for using TensorFlow image recognition on web pages
Stars: ✭ 88 (-64.8%)
Mutual labels:  image-classification, image-recognition
LIT
[AAAI 2022] This is the official PyTorch implementation of "Less is More: Pay Less Attention in Vision Transformers"
Stars: ✭ 79 (-68.4%)
Mutual labels:  transformers, image-recognition

Vision Transformers

An implementation of the Vision Transformer (ViT) in PyTorch, a model that achieves state-of-the-art results in image classification using transformer-style encoders. Associated blog article.

Credits to Phil Wang for the ViT GIF.
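
To make the "16x16 words" idea concrete, the short sketch below computes how many patch tokens a ViT sees per image and how large each flattened patch is. The 224-pixel input size and the helper names `num_patches` and `patch_dim` are illustrative assumptions, not part of this repository.

```python
def num_patches(image_size: int, patch_size: int = 16) -> int:
    """Number of non-overlapping square patches (tokens) in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    return (image_size // patch_size) ** 2


def patch_dim(patch_size: int = 16, channels: int = 3) -> int:
    """Length of one flattened patch before the linear projection."""
    return patch_size * patch_size * channels


# Assumed example: a 224x224 RGB image cut into 16x16 patches.
print(num_patches(224))  # 196
print(patch_dim())       # 768
```

Each image therefore becomes a sequence of 196 tokens, which is what the transformer encoder operates on.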

Features

  • ViT
  • ViT with convolutional patches
  • ViT with convolutional stems
    • Early Convolutional Stem
    • Scaled ReLU Stem
  • Global average pooling (GAP)
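
Two of the features above, convolutional patches and GAP pooling, can be sketched as follows. This is a minimal illustration under stated assumptions, not this repository's code: the class names `ConvPatchEmbed` and `TinyViT`, all default sizes, and the use of PyTorch's built-in `nn.TransformerEncoder` as a stand-in encoder are hypothetical choices made for the example. The convolutional stems are not covered here.

```python
import torch
from torch import nn


class ConvPatchEmbed(nn.Module):
    """Embed non-overlapping patches with a single strided convolution."""

    def __init__(self, in_channels=3, patch_size=16, dim=192):
        super().__init__()
        # kernel_size == stride, so each output position covers exactly one patch.
        self.proj = nn.Conv2d(in_channels, dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        x = self.proj(x)                     # (B, dim, H / patch, W / patch)
        return x.flatten(2).transpose(1, 2)  # (B, num_patches, dim)


class TinyViT(nn.Module):
    """Hypothetical small ViT with a choice of class-token or GAP pooling."""

    def __init__(self, image_size=224, patch_size=16, dim=192, depth=2,
                 heads=3, num_classes=10, pool="gap"):
        super().__init__()
        assert image_size % patch_size == 0, "image must divide evenly into patches"
        assert pool in ("gap", "cls")
        self.pool = pool
        num_patches = (image_size // patch_size) ** 2
        self.patch_embed = ConvPatchEmbed(3, patch_size, dim)
        # The class token is only needed when pooling from it.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim)) if pool == "cls" else None
        extra = 1 if pool == "cls" else 0
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + extra, dim))
        # Assumption: PyTorch's stock encoder layer stands in for the ViT block.
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=dim * 4,
            activation="gelu", batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        tokens = self.patch_embed(x)
        if self.pool == "cls":
            cls = self.cls_token.expand(tokens.shape[0], -1, -1)
            tokens = torch.cat([cls, tokens], dim=1)
        tokens = self.encoder(tokens + self.pos_embed)
        # GAP averages all tokens; "cls" reads out the first (class) token.
        pooled = tokens.mean(dim=1) if self.pool == "gap" else tokens[:, 0]
        return self.head(self.norm(pooled))


model = TinyViT(pool="gap").eval()
with torch.no_grad():
    logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 10])
```

With `pool="gap"` the model needs no class token, so the sequence length equals the number of patches. Switching to `pool="cls"` prepends one learnable token and classifies from it instead.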

Citations

@article{dosovitskiy2020image,
  title={An image is worth 16x16 words: Transformers for image recognition at scale},
  author={Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and Gelly, Sylvain and others},
  journal={arXiv preprint arXiv:2010.11929},
  year={2020}
}
@article{xiao2021early,
  title={Early convolutions help transformers see better},
  author={Xiao, Tete and Singh, Mannat and Mintun, Eric and Darrell, Trevor and Doll{\'a}r, Piotr and Girshick, Ross},
  journal={arXiv preprint arXiv:2106.14881},
  year={2021}
}
@article{wang2021scaled,
  title={Scaled ReLU Matters for Training Vision Transformers},
  author={Wang, Pichao and Wang, Xue and Luo, Hao and Zhou, Jingkai and Zhou, Zhipeng and Wang, Fan and Li, Hao and Jin, Rong},
  journal={arXiv preprint arXiv:2109.03810},
  year={2021}
}
@article{zhai2021scaling,
  title={Scaling vision transformers},
  author={Zhai, Xiaohua and Kolesnikov, Alexander and Houlsby, Neil and Beyer, Lucas},
  journal={arXiv preprint arXiv:2106.04560},
  year={2021}
}