GPU Accelerated TensorFlow Lite applications on Android NDK. Higher accuracy face detection, Age and gender estimation, Human pose estimation, Artistic style transfer

Stars: ✭ 105 (-70.59%)

Mutual labels: segmentation, pose-estimation

volkscv

A Python toolbox for computer vision research and projects

Stars: ✭ 58 (-83.75%)

Mutual labels: classification, segmentation

GaitGraph

Official repository for "GaitGraph: Graph Convolutional Network for Skeleton-Based Gait Recognition" (ICIP'21)

Stars: ✭ 68 (-80.95%)

Mutual labels: pose-estimation, hrnet

verseagility

Ramp up your custom natural language processing (NLP) task, allowing you to bring your own data, use your preferred frameworks and bring models into production.

Stars: ✭ 23 (-93.56%)

Mutual labels: transformer, classification

Medical Transformer

Pytorch Code for "Medical Transformer: Gated Axial-Attention for Medical Image Segmentation"

Stars: ✭ 153 (-57.14%)

Mutual labels: transformer, segmentation

Point2Sequence

Point2Sequence: Learning the Shape Representation of 3D Point Clouds with an Attention-based Sequence to Sequence Network

Stars: ✭ 34 (-90.48%)

Mutual labels: classification, segmentation

TransPose

PyTorch Implementation for "TransPose: Keypoint localization via Transformer", ICCV 2021.

Stars: ✭ 250 (-29.97%)

Mutual labels: transformer, pose-estimation

dd-ml-segmentation-benchmark

DroneDeploy Machine Learning Segmentation Benchmark

Stars: ✭ 179 (-49.86%)

Mutual labels: vision, segmentation

FNet-pytorch

Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms

Stars: ✭ 204 (-42.86%)

Mutual labels: transformer, vision

TokenLabeling

Pytorch implementation of "All Tokens Matter: Token Labeling for Training Better Vision Transformers"

Stars: ✭ 385 (+7.84%)

Mutual labels: transformer, vision

Awesome-Tensorflow2

Excellent extension packages and projects built on TensorFlow 2

Stars: ✭ 45 (-87.39%)

Mutual labels: classification, segmentation

nested-transformer

Nested Hierarchical Transformer https://arxiv.org/pdf/2105.12723.pdf

Stars: ✭ 174 (-51.26%)

Mutual labels: transformer, vision

Conformer

Official code for Conformer: Local Features Coupling Global Representations for Visual Recognition

Stars: ✭ 345 (-3.36%)

Mutual labels: transformer, classification

Setr Pytorch

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Stars: ✭ 96 (-73.11%)

Mutual labels: transformer, segmentation

Nlp research

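NLP research: a TensorFlow-based NLP deep learning project that supports four major tasks: text classification, sentence matching, sequence labeling, and text generation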

Stars: ✭ 141 (-60.5%)

Mutual labels: transformer, classification

COVID-19-Tweet-Classification-using-Roberta-and-Bert-Simple-Transformers

Rank 1 / 216

Stars: ✭ 24 (-93.28%)

Mutual labels: transformer, classification

mmrazor

OpenMMLab Model Compression Toolbox and Benchmark.

Stars: ✭ 644 (+80.39%)

Mutual labels: classification, segmentation


HRFormer: High-Resolution Transformer for Dense Prediction, NeurIPS 2021

Introduction

This is the official implementation of the High-Resolution Transformer (HRFormer). HRFormer learns high-resolution representations for dense prediction tasks, in contrast to the original Vision Transformer, which produces low-resolution representations and has high memory and computational cost. We adopt the multi-resolution parallel design introduced in high-resolution convolutional networks (HRNet), together with local-window self-attention, which performs self-attention over small non-overlapping image windows, to improve memory and computation efficiency. In addition, we introduce a convolution into the FFN to exchange information across the otherwise disconnected image windows. We demonstrate the effectiveness of HRFormer on human pose estimation and semantic segmentation tasks.

The HRFormer architecture:

The HRFormer Unit (trans. unit):
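To make the efficiency argument from the introduction concrete, the following is a minimal sketch (not the repository's code) that splits a feature map into non-overlapping windows and counts how many query-key pairs one attention layer has to score. The function names and the 56x56 map size are hypothetical choices for illustration; the 7x7 window matches the window size used in the segmentation tables.

```python
def window_partition(height, width, window):
    """Group the (row, col) positions of a height x width map into
    non-overlapping window x window blocks."""
    if height % window or width % window:
        # Assumption of this sketch: the map divides evenly into windows.
        raise ValueError("map size must be a multiple of the window size")
    windows = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            windows.append([(r, c)
                            for r in range(top, top + window)
                            for c in range(left, left + window)])
    return windows


def attention_pairs(height, width, window=None):
    """Count the query-key pairs scored by one self-attention layer.

    window=None means global attention over all tokens; otherwise
    attention is restricted to tokens inside the same window.
    """
    tokens = height * width
    if window is None:
        return tokens * tokens
    per_window = window * window
    num_windows = len(window_partition(height, width, window))
    return num_windows * per_window * per_window


# Hypothetical 56x56 feature map with 7x7 windows.
global_pairs = attention_pairs(56, 56)
local_pairs = attention_pairs(56, 56, window=7)
print(global_pairs, local_pairs, global_pairs // local_pairs)
```

Because every position falls into exactly one window, tokens in different windows never attend to each other in this scheme, which is the gap the convolution in the FFN is described as bridging.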

Pose estimation

2d Human Pose Estimation

Results on COCO `val2017`, using a person detector with a human AP of 56.4 on the COCO `val2017` dataset

Backbone	Input Size	AP	AP⁵⁰	AP⁷⁵	AR^M	AR^L	AR	ckpt	log	script
HRFormer-S	256x192	74.0%	90.2%	81.2%	70.4%	80.7%	79.4%	ckpt	log	script
HRFormer-S	384x288	75.6%	90.3%	82.2%	71.6%	82.5%	80.7%	ckpt	log	script
HRFormer-B	256x192	75.6%	90.8%	82.8%	71.7%	82.6%	80.8%	ckpt	log	script
HRFormer-B	384x288	77.2%	91.0%	83.6%	73.2%	84.2%	82.0%	ckpt	log	script

Results on COCO `test-dev`, using a person detector with a human AP of 56.4 on the COCO `val2017` dataset

Backbone	Input Size	AP	AP⁵⁰	AP⁷⁵	AR^M	AR^L	AR	ckpt	log	script
HRFormer-S	384x288	74.5%	92.3%	82.1%	70.7%	80.6%	79.8%	ckpt	log	script
HRFormer-B	384x288	76.2%	92.7%	83.8%	72.5%	82.3%	81.2%	ckpt	log	script

The models are first pre-trained on the ImageNet-1K dataset and then fine-tuned on the COCO `train2017` dataset.

Semantic segmentation

Cityscapes

Performance on the Cityscapes dataset. The models are trained with an input size of 512x1024 and tested with an input size of 1024x2048.

Methods	Backbone	Window Size	Train Set	Test Set	Iterations	Batch Size	OHEM	mIoU	mIoU (Multi-Scale)	Log	ckpt	script
OCRNet	HRFormer-S	7x7	Train	Val	80000	8	Yes	80.0	81.0	log	ckpt	script
OCRNet	HRFormer-B	7x7	Train	Val	80000	8	Yes	81.4	82.0	log	ckpt	script
OCRNet	HRFormer-B	15x15	Train	Val	80000	8	Yes	81.9	82.6	log	ckpt	script
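The mIoU columns in the segmentation tables report mean intersection-over-union. As a rough illustration of the metric, and not the evaluation code used by this repository, the sketch below computes mIoU from a class confusion matrix. The function name and the toy two-class matrix are hypothetical; real benchmarks also handle ignore labels, which this sketch omits.

```python
def mean_iou(confusion):
    """Mean IoU in percent from a square confusion matrix.

    confusion[i][j] is the number of pixels whose ground-truth class
    is i and whose predicted class is j.
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                                  # missed pixels of class c
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp  # wrongly labeled as c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / denom)
    if not ious:
        return 0.0
    return 100.0 * sum(ious) / len(ious)


# Toy two-class example: per-class IoUs are 8/11 and 9/12.
toy_confusion = [[8, 2],
                 [1, 9]]
print(round(mean_iou(toy_confusion), 2))
```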

PASCAL-Context

The models are trained with the input size of 520x520, and tested with original size.

Methods	Backbone	Window Size	Train Set	Test Set	Iterations	Batch Size	OHEM	mIoU	mIoU (Multi-Scale)	Log	ckpt	script
OCRNet	HRFormer-S	7x7	Train	Val	60000	16	Yes	53.8	54.6	log	ckpt	script
OCRNet	HRFormer-B	7x7	Train	Val	60000	16	Yes	56.3	57.1	log	ckpt	script
OCRNet	HRFormer-B	15x15	Train	Val	60000	16	Yes	57.6	58.5	log	ckpt	script

COCO-Stuff

The models are trained with input size of 520x520, and tested with original size.

Methods	Backbone	Window Size	Train Set	Test Set	Iterations	Batch Size	OHEM	mIoU	mIoU (Multi-Scale)	Log	ckpt	script
OCRNet	HRFormer-S	7x7	Train	Val	60000	16	Yes	37.9	38.9	log	ckpt	script
OCRNet	HRFormer-B	7x7	Train	Val	60000	16	Yes	41.6	42.5	log	ckpt	script
OCRNet	HRFormer-B	15x15	Train	Val	60000	16	Yes	42.4	43.3	log	ckpt	script

ADE20K

The models are trained with an input size of 520x520 and tested at the original size. The results with window size 15x15 will be updated later.

Methods	Backbone	Window Size	Train Set	Test Set	Iterations	Batch Size	OHEM	mIoU	mIoU (Multi-Scale)	Log	ckpt	script
OCRNet	HRFormer-S	7x7	Train	Val	150000	8	Yes	44.0	45.1	log	ckpt	script
OCRNet	HRFormer-B	7x7	Train	Val	150000	8	Yes	46.3	47.6	log	ckpt	script
OCRNet	HRFormer-B	13x13	Train	Val	150000	8	Yes	48.7	50.0	log	ckpt	script
OCRNet	HRFormer-B	15x15	Train	Val	150000	8	Yes	-	-	-	-	-

Classification

Results on ImageNet-1K

Backbone	acc@1	acc@5	#params	FLOPs	ckpt	log	script
HRFormer-T	78.6%	94.2%	8.0M	1.83G	ckpt	log	script
HRFormer-S	81.2%	95.6%	13.5M	3.56G	ckpt	log	script
HRFormer-B	82.8%	96.3%	50.3M	13.71G	ckpt	log	script
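The acc@1 and acc@5 columns are top-1 and top-5 accuracy. A minimal sketch of how such a top-k accuracy can be computed from per-class scores follows; the function name and the toy scores are hypothetical and are not taken from the repository.

```python
def topk_accuracy(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest scores.

    scores: one list of per-class scores per sample.
    labels: the true class index of each sample.
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


# Toy example with three samples and three classes.
toy_scores = [[0.1, 0.7, 0.2],
              [0.5, 0.3, 0.2],
              [0.2, 0.3, 0.5]]
toy_labels = [1, 1, 0]
print(topk_accuracy(toy_scores, toy_labels, k=1))
print(topk_accuracy(toy_scores, toy_labels, k=2))
```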

Citation

If you find this project useful in your research, please consider citing:

@inproceedings{YuanFHLZCW21,
  title={HRFormer: High-Resolution Transformer for Dense Prediction},
  author={Yuhui Yuan and Rao Fu and Lang Huang and Weihong Lin and Chao Zhang and Xilin Chen and Jingdong Wang},
  booktitle={NeurIPS},
  year={2021}
}

Acknowledgment

This project is developed based on the Swin-Transformer, openseg.pytorch, and mmpose.

git diff-index HEAD
git subtree add -P pose <url to sub-repo> <sub-repo branch>


