lucidrains / mlp-mixer-pytorch

Licence: MIT license
An All-MLP solution for Vision, from Google AI

MLP Mixer - Pytorch

An all-MLP solution for vision, from Google AI, in PyTorch.

No convolutions or attention needed!

Yannic Kilcher video

Install

$ pip install mlp-mixer-pytorch

Usage

import torch
from mlp_mixer_pytorch import MLPMixer

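model = MLPMixer(
    image_size = 256,    # side length of the square input image, in pixels
    channels = 3,        # number of input channels (RGB)
    patch_size = 16,     # side length of each square patch, in pixels
    dim = 512,           # embedding width of each patch
    depth = 12,          # number of Mixer layers
    num_classes = 1000   # number of output classes
)

img = torch.randn(1, 3, 256, 256)  # (batch, channels, height, width)
pred = model(img) # (1, 1000)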
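
For readers curious what "all-MLP" means in practice, the following is a minimal sketch of a single Mixer layer as described in the MLP-Mixer paper cited below: one MLP mixes information across patches, a second mixes across channels, each with a LayerNorm and a residual connection. The class name `MixerBlockSketch` and its arguments are hypothetical and chosen for illustration; this is not necessarily how `mlp_mixer_pytorch` implements its layers internally.

```python
import torch
from torch import nn


class MixerBlockSketch(nn.Module):
    """Illustrative Mixer layer (hypothetical name, not the package's class)."""

    def __init__(self, num_patches, dim, token_hidden, channel_hidden):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        # Token-mixing MLP: acts along the patch axis, shared across channels.
        self.token_mlp = nn.Sequential(
            nn.Linear(num_patches, token_hidden),
            nn.GELU(),
            nn.Linear(token_hidden, num_patches),
        )
        self.norm2 = nn.LayerNorm(dim)
        # Channel-mixing MLP: acts along the feature axis, shared across patches.
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, channel_hidden),
            nn.GELU(),
            nn.Linear(channel_hidden, dim),
        )

    def forward(self, x):
        # x has shape (batch, num_patches, dim).
        y = self.norm1(x).transpose(1, 2)          # (batch, dim, num_patches)
        x = x + self.token_mlp(y).transpose(1, 2)  # residual over token mixing
        x = x + self.channel_mlp(self.norm2(x))    # residual over channel mixing
        return x


block = MixerBlockSketch(num_patches=16, dim=32, token_hidden=8, channel_hidden=64)
tokens = torch.randn(2, 16, 32)
print(block(tokens).shape)  # same shape as the input
```

Because each layer preserves the `(batch, num_patches, dim)` shape, layers of this kind can be stacked `depth` times before a final pooling and classification head.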

Rectangular image

import torch
from mlp_mixer_pytorch import MLPMixer

model = MLPMixer(
    image_size = (256, 128),  # (height, width) for a non-square input
    channels = 3,
    patch_size = 16,
    dim = 512,
    depth = 12,
    num_classes = 1000
)

img = torch.randn(1, 3, 256, 128)  # (batch, channels, height, width)
pred = model(img) # (1, 1000)
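
As a quick sanity check on the two configurations above, the number of patches the model mixes over follows from simple arithmetic on `image_size` and `patch_size`. The helper below is a hypothetical illustration, not part of `mlp_mixer_pytorch`, and it assumes the image height and width must each be divisible by the patch size.

```python
def num_patches(image_size, patch_size):
    """Count non-overlapping square patches for an int or (height, width) size.

    Hypothetical helper for illustration; assumes both dimensions must be
    divisible by patch_size.
    """
    if isinstance(image_size, int):
        height, width = image_size, image_size
    else:
        height, width = image_size
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    return (height // patch_size) * (width // patch_size)


print(num_patches(256, 16))         # 16 * 16 = 256 patches
print(num_patches((256, 128), 16))  # 16 * 8 = 128 patches
```

The rectangular example therefore works with half as many patches as the square one, while `dim`, `depth`, and `num_classes` are unchanged.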

Citations

@misc{tolstikhin2021mlpmixer,
    title   = {MLP-Mixer: An all-MLP Architecture for Vision},
    author  = {Ilya Tolstikhin and Neil Houlsby and Alexander Kolesnikov and Lucas Beyer and Xiaohua Zhai and Thomas Unterthiner and Jessica Yung and Daniel Keysers and Jakob Uszkoreit and Mario Lucic and Alexey Dosovitskiy},
    year    = {2021},
    eprint  = {2105.01601},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}
@misc{hou2021vision,
    title   = {Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition},
    author  = {Qibin Hou and Zihang Jiang and Li Yuan and Ming-Ming Cheng and Shuicheng Yan and Jiashi Feng},
    year    = {2021},
    eprint  = {2106.12368},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}