41 open source projects that are alternatives to or similar to Multi-Modal-Transformer

image-classification

A collection of SOTA Image Classification Models in PyTorch

Stars: ✭ 70 (+14.75%)

Mutual labels: vision-transformer, mlp-mixer

deep-text-recognition-benchmark

PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)

Stars: ✭ 123 (+101.64%)

Mutual labels: vision-transformer

Splice

Official Pytorch Implementation for "Splicing ViT Features for Semantic Appearance Transfer" presenting "Splice" (CVPR 2022)

Stars: ✭ 126 (+106.56%)

Mutual labels: vision-transformer

GFNet

[NeurIPS 2021] Global Filter Networks for Image Classification

Stars: ✭ 199 (+226.23%)

Mutual labels: vision-transformer

Combining-EfficientNet-and-Vision-Transformers-for-Video-Deepfake-Detection

Code for the video deepfake detection model from "Combining EfficientNet and Vision Transformers for Video Deepfake Detection", available on arXiv and submitted to ICIAP 2021.

Stars: ✭ 39 (-36.07%)

Mutual labels: vision-transformer

MPViT

MPViT: Multi-Path Vision Transformer for Dense Prediction (CVPR 2022)

Stars: ✭ 193 (+216.39%)

Mutual labels: vision-transformer

PASSL

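PASSL includes self-supervised image learning algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, BEiT, and MAE, as well as basic vision algorithms such as Vision Transformer, DeiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2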

Stars: ✭ 134 (+119.67%)

Mutual labels: vision-transformer

koclip

KoCLIP: Korean port of OpenAI CLIP, in Flax

Stars: ✭ 80 (+31.15%)

Mutual labels: vision-transformer

LaTeX-OCR

pix2tex: Using a ViT to convert images of equations into LaTeX code.

Stars: ✭ 1,566 (+2467.21%)

Mutual labels: vision-transformer

visualization

A collection of visualization functions

Stars: ✭ 189 (+209.84%)

Mutual labels: vision-transformer

iPerceive

Stars: ✭ 52 (-14.75%)

Mutual labels: multi-modal

OASIS

Official implementation of the paper "You Only Need Adversarial Supervision for Semantic Image Synthesis" (ICLR 2021)

Stars: ✭ 232 (+280.33%)

Mutual labels: multi-modal

InterpretDL

InterpretDL: Interpretation of Deep Learning Models, a model interpretability algorithm library based on PaddlePaddle.

Stars: ✭ 121 (+98.36%)

Mutual labels: vision-transformer

pytorch-vit

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Stars: ✭ 250 (+309.84%)

Mutual labels: vision-transformer

nemar

[CVPR2020] Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation

Stars: ✭ 120 (+96.72%)

Mutual labels: multi-modal

towhee

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

Stars: ✭ 821 (+1245.9%)

Mutual labels: vision-transformer

YOLOS

You Only Look at One Sequence (NeurIPS 2021)

Stars: ✭ 612 (+903.28%)

Mutual labels: vision-transformer

TransMorph Transformer for Medical Image Registration

TransMorph: Transformer for Unsupervised Medical Image Registration (PyTorch)

Stars: ✭ 130 (+113.11%)

Mutual labels: vision-transformer

ViT-V-Net for 3D Image Registration Pytorch

Vision Transformer for 3D medical image registration (Pytorch).

Stars: ✭ 169 (+177.05%)

Mutual labels: vision-transformer

VidSitu

[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)

Stars: ✭ 41 (-32.79%)

Mutual labels: video-language

EGSC-IT

Tensorflow implementation of ICLR2019 paper "Exemplar Guided Unsupervised Image-to-Image Translation with Semantic Consistency"

Stars: ✭ 29 (-52.46%)

Mutual labels: multi-modal

Transformers-Tutorials

This repository contains demos I made with the Transformers library by HuggingFace.

Stars: ✭ 2,828 (+4536.07%)

Mutual labels: vision-transformer

ICON

(TPAMI2022) Salient Object Detection via Integrity Learning.

Stars: ✭ 125 (+104.92%)

Mutual labels: mlp-mixer

Evo-ViT

Official implementation of Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer

Stars: ✭ 50 (-18.03%)

Mutual labels: vision-transformer

semantic-segmentation

SOTA Semantic Segmentation Models in PyTorch

Stars: ✭ 464 (+660.66%)

Mutual labels: vision-transformer

SReT

Official PyTorch implementation of our ECCV 2022 paper "Sliced Recursive Transformer"

Stars: ✭ 51 (-16.39%)

Mutual labels: vision-transformer

TRAR-VQA

[ICCV 2021] TRAR: Routing the Attention Spans in Transformers for Visual Question Answering -- Official Implementation

Stars: ✭ 49 (-19.67%)

Mutual labels: multi-modal

MMTOD

Multi-modal Thermal Object Detector

Stars: ✭ 38 (-37.7%)

Mutual labels: multi-modal

ImageNet21K

Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021)

Stars: ✭ 565 (+826.23%)

Mutual labels: vision-transformer

ReferFormer

[CVPR2022] Official Implementation of ReferFormer

Stars: ✭ 230 (+277.05%)

Mutual labels: video-language

mobilevit-pytorch

A PyTorch implementation of "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer".

Stars: ✭ 349 (+472.13%)

Mutual labels: vision-transformer

transformer-ls

Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021).

Stars: ✭ 201 (+229.51%)

Mutual labels: vision-transformer

skill-sample-nodejs-berry-bash

Demonstrates the use of interactive render template directives through multi modal screen design.

Stars: ✭ 22 (-63.93%)

Mutual labels: multi-modal

libai

LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training

Stars: ✭ 284 (+365.57%)

Mutual labels: vision-transformer

keras-vision-transformer

The Tensorflow, Keras implementation of Swin-Transformer and Swin-UNET

Stars: ✭ 91 (+49.18%)

Mutual labels: vision-transformer

pytorch-cifar-model-zoo

Implementation of Conv-based and ViT-based networks designed for CIFAR.

Stars: ✭ 62 (+1.64%)

Mutual labels: vision-transformer

VT-UNet

[MICCAI2022] This is an official PyTorch implementation for A Robust Volumetric Transformer for Accurate 3D Tumor Segmentation

Stars: ✭ 151 (+147.54%)

Mutual labels: vision-transformer

Dalle Pytorch

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

Stars: ✭ 3,661 (+5901.64%)

Mutual labels: multi-modal

Valhalla

Open Source Routing Engine for OpenStreetMap

Stars: ✭ 1,794 (+2840.98%)

Mutual labels: multi-modal

Ghostnet

CV backbones including GhostNet, TinyNet and TNT, developed by Huawei Noah's Ark Lab.

Stars: ✭ 1,744 (+2759.02%)

Mutual labels: vision-transformer

SwinIR

SwinIR: Image Restoration Using Swin Transformer (official repository)

Stars: ✭ 1,260 (+1965.57%)

Mutual labels: vision-transformer
