
whai362 / Pvt

License: Apache-2.0

Programming Languages

python

Projects that are alternatives to or similar to Pvt

Awesome-Vision-Transformer-Collection
Variants of Vision Transformer and its downstream tasks
Stars: ✭ 124 (-67.28%)
Mutual labels:  detection, backbone, segmentation
unsupervised llamas
Code for https://unsupervised-llamas.com
Stars: ✭ 70 (-81.53%)
Mutual labels:  detection, segmentation
HRFormer
This is an official implementation of our NeurIPS 2021 paper "HRFormer: High-Resolution Transformer for Dense Prediction".
Stars: ✭ 357 (-5.8%)
Mutual labels:  transformer, segmentation
Visual-Transformer-Paper-Summary
Summary of Transformer applications for computer vision tasks.
Stars: ✭ 51 (-86.54%)
Mutual labels:  transformer, segmentation
rgbd person tracking
RGB-D Person Tracking is a ROS framework for detecting and tracking people from a mobile robot.
Stars: ✭ 46 (-87.86%)
Mutual labels:  detection, segmentation
mmrazor
OpenMMLab Model Compression Toolbox and Benchmark.
Stars: ✭ 644 (+69.92%)
Mutual labels:  detection, segmentation
mri-deep-learning-tools
Resources for MRI image processing and deep learning in 3D
Stars: ✭ 56 (-85.22%)
Mutual labels:  detection, segmentation
Shadowless
A Fast and Open Source Autonomous Perception System.
Stars: ✭ 29 (-92.35%)
Mutual labels:  detection, segmentation
Detectron.pytorch
A pytorch implementation of Detectron. Both training from scratch and inferring directly from pretrained Detectron weights are available.
Stars: ✭ 2,805 (+640.11%)
Mutual labels:  segmentation, detection
Holy Edge
Holistically-Nested Edge Detection
Stars: ✭ 277 (-26.91%)
Mutual labels:  segmentation, detection
Fastmaskrcnn
Mask RCNN in TensorFlow
Stars: ✭ 3,069 (+709.76%)
Mutual labels:  segmentation, detection
crowd density segmentation
The code for preparing the training data for crowd counting / segmentation algorithm.
Stars: ✭ 21 (-94.46%)
Mutual labels:  detection, segmentation
Awesome Iccv
Latest ICCV 2019 paper acceptance results
Stars: ✭ 305 (-19.53%)
Mutual labels:  segmentation, detection
volkscv
A Python toolbox for computer vision research and project
Stars: ✭ 58 (-84.7%)
Mutual labels:  detection, segmentation
BCNet
Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers [CVPR 2021]
Stars: ✭ 434 (+14.51%)
Mutual labels:  detection, segmentation
segmenter
[ICCV2021] Official PyTorch implementation of Segmenter: Transformer for Semantic Segmentation
Stars: ✭ 463 (+22.16%)
Mutual labels:  transformer, segmentation
Awesome Carla
👉 CARLA resources such as tutorials, blogs, and code: https://github.com/carla-simulator/carla
Stars: ✭ 246 (-35.09%)
Mutual labels:  segmentation, detection
Sinet
Camouflaged Object Detection, CVPR 2020 (Oral & Reported by the New Scientist Magazine)
Stars: ✭ 246 (-35.09%)
Mutual labels:  segmentation, detection
Sipmask
SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation (ECCV2020)
Stars: ✭ 255 (-32.72%)
Mutual labels:  segmentation, detection
Cvpods
All-in-one Toolbox for Computer Vision Research.
Stars: ✭ 277 (-26.91%)
Mutual labels:  segmentation, detection

Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions

This repository contains PyTorch evaluation code, training code and pretrained models for PVT (Pyramid Vision Transformer).

PVT is a pure transformer backbone that, like ResNet, can be easily plugged into most downstream task models.

With a comparable number of parameters, PVT-Small+RetinaNet achieves 40.4 AP on the COCO dataset, surpassing ResNet50+RetinaNet (36.3 AP) by 4.1 AP.

Figure 1: Performance of RetinaNet 1x with different backbones.

This repository is developed on top of pytorch-image-models and deit.

For details see Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions.

If you use this code for a paper, please cite:

@misc{wang2021pyramid,
      title={Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions}, 
      author={Wenhai Wang and Enze Xie and Xiang Li and Deng-Ping Fan and Kaitao Song and Ding Liang and Tong Lu and Ping Luo and Ling Shao},
      year={2021},
      eprint={2102.12122},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Done

  • PVT-Tiny/-Small
  • PVT + Object Detection

Todo List

  • ImageNet model weights
  • PVT + Semantic FPN configs & models
  • PVT + DETR/Sparse R-CNN config & models
  • PVT + Trans2Seg config & models

Usage

First, clone the repository locally:

git clone https://github.com/whai362/PVT.git

Then, install PyTorch 1.6.0+, torchvision 0.7.0+, and pytorch-image-models (timm) 0.3.2:

conda install -c pytorch pytorch torchvision
pip install timm==0.3.2
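
To sanity-check the installation, a classification model can be built through timm's registry. The sketch below is an assumption-laden example, not the repo's documented API: it assumes the repository registers model names such as pvt_small with timm (as DeiT-style codebases typically do) and that its model definition file is importable as pvt.

# Minimal sketch: verify the environment by building PVT-Small for classification.
# Assumes the repo registers its models (e.g. "pvt_small") with timm's model registry
# and that the model definition module is importable as `pvt` (hypothetical name).
import torch
import timm
import pvt  # importing the repo's model file triggers timm model registration

model = timm.create_model('pvt_small', pretrained=False, num_classes=1000)
model.eval()
with torch.no_grad():
    out = model(torch.randn(1, 3, 224, 224))
print(out.shape)  # expected: torch.Size([1, 1000])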

Data preparation

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout expected by torchvision's datasets.ImageFolder, with the training and validation data in the train/ and val/ folders, respectively:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
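
Because this layout is exactly what torchvision's datasets.ImageFolder expects, a quick way to confirm the folders are arranged correctly is the small sketch below, which uses only standard torchvision calls (the ImageNet path is a placeholder):

# Sketch: check that the ImageNet folders above load with torchvision's ImageFolder.
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
val_set = datasets.ImageFolder('/path/to/imagenet/val', transform=transform)
print(len(val_set), 'validation images,', len(val_set.classes), 'classes')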

Model Zoo

Object Detection

For detection configs & models, see here.

Method | Lr schd | box AP | mask AP | Config | Download
PVT-Tiny + RetinaNet (800x) | 1x | 36.7 | - | config | Todo.
PVT-Small + RetinaNet (640x) | 1x | 38.7 | - | config | model
PVT-Small + RetinaNet (800x) | 1x | 40.4 | - | config | model
R50 + DETR | 50ep | 32.3 | - | config | Todo.
PVT-Small + DETR | 50ep | 34.7 | - | config | Todo.

Image Classification

We provide baseline PVT models pretrained on ImageNet 2012.

name | acc@1 | #params (M) | url
PVT-Tiny | 75.1 | 13.2 | 51 M, PyTorch<=1.5
PVT-Small | 79.8 | 24.5 | 93 M, PyTorch<=1.5
PVT-Medium | 81.2 | 44.2 | Todo.
PVT-Large | 81.7 | 61.4 | Todo.
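
Once a checkpoint has been downloaded, it can be loaded with plain PyTorch. The sketch below is illustrative only: the checkpoint path is a placeholder, and the 'model' wrapper key (plus the pvt import used to register the models with timm) is an assumption, so inspect the file if load_state_dict reports missing or unexpected keys.

# Sketch: load a downloaded PVT-Small checkpoint into the classification model.
import torch
import timm
import pvt  # hypothetical module name; importing it registers the PVT models with timm

model = timm.create_model('pvt_small', pretrained=False, num_classes=1000)
checkpoint = torch.load('/path/to/pvt_small.pth', map_location='cpu')
# Some checkpoints wrap the weights under a 'model' key; fall back to the raw dict.
state_dict = checkpoint['model'] if isinstance(checkpoint, dict) and 'model' in checkpoint else checkpoint
model.load_state_dict(state_dict)
model.eval()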

Evaluation

To evaluate a pre-trained PVT-Small on ImageNet val with a single GPU, run:

sh dist_train.sh pvt_small 1 /path/to/checkpoint_root --data-path /path/to/imagenet --resume /path/to/checkpoint_file --eval

This should give

* Acc@1 79.764 Acc@5 94.950 loss 0.885
Accuracy of the network on the 50000 test images: 79.8%
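
For reference, the Acc@1 / Acc@5 figures above are standard top-k accuracies. The self-contained sketch below shows how such numbers are computed from logits and labels; it is not a call into the repo's evaluation loop.

# Sketch: top-1 / top-5 accuracy (in %) for a batch of logits, as in the output above.
import torch

def topk_accuracy(logits, targets, ks=(1, 5)):
    maxk = max(ks)
    _, pred = logits.topk(maxk, dim=1)       # (N, maxk) highest-scoring class ids
    hits = pred.eq(targets.view(-1, 1))      # (N, maxk) boolean matches against the label
    return [hits[:, :k].any(dim=1).float().mean().item() * 100 for k in ks]

top1, top5 = topk_accuracy(torch.randn(8, 1000), torch.randint(0, 1000, (8,)))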

Training

To train PVT-Small on ImageNet on a single node with 8 GPUs for 300 epochs, run:

sh dist_train.sh pvt_small 8 /path/to/checkpoint_root --data-path /path/to/imagenet

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.
