kazuto1011 / Deeplab Pytorch

License: MIT
PyTorch implementation of DeepLab v2 on COCO-Stuff / PASCAL VOC

Programming Languages

python
139,335 projects - #7 most used programming language

Projects that are alternatives of or similar to Deeplab Pytorch

Semanticsegmentation
A framework for training segmentation models in pytorch on labelme annotations with pretrained examples of skin, cat, and pizza topping segmentation
Stars: ✭ 52 (-93.39%)
Mutual labels:  semantic-segmentation, coco
Torchdistill
PyTorch-based modular, configuration-driven framework for knowledge distillation. πŸ†18 methods including SOTA are implemented so far. 🎁 Trained models, training logs and configurations are available for ensuring the reproducibility.
Stars: ✭ 177 (-77.51%)
Mutual labels:  semantic-segmentation, coco
Cocostuff10k
The official homepage of the (outdated) COCO-Stuff 10K dataset.
Stars: ✭ 248 (-68.49%)
Mutual labels:  semantic-segmentation, coco
Vqa.pytorch
Visual Question Answering in Pytorch
Stars: ✭ 602 (-23.51%)
Mutual labels:  coco
Light Weight Refinenet
Light-Weight RefineNet for Real-Time Semantic Segmentation
Stars: ✭ 619 (-21.35%)
Mutual labels:  semantic-segmentation
Pytorch Segmentation Toolbox
PyTorch Implementations for DeeplabV3 and PSPNet
Stars: ✭ 691 (-12.2%)
Mutual labels:  semantic-segmentation
Labelme
Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).
Stars: ✭ 7,742 (+883.74%)
Mutual labels:  semantic-segmentation
Pytorch Deeplab Resnet
DeepLab resnet v2 model in pytorch
Stars: ✭ 584 (-25.79%)
Mutual labels:  semantic-segmentation
Imglab
To speed up and simplify the image labeling/annotation process, with multiple supported formats.
Stars: ✭ 723 (-8.13%)
Mutual labels:  coco
Pytorch segmentation
Semantic segmentation models, datasets and losses implemented in PyTorch.
Stars: ✭ 674 (-14.36%)
Mutual labels:  semantic-segmentation
Adaptsegnet
Learning to Adapt Structured Output Space for Semantic Segmentation, CVPR 2018 (spotlight)
Stars: ✭ 654 (-16.9%)
Mutual labels:  semantic-segmentation
Semseg
An overview of commonly used semantic segmentation architectures, with code reproductions.
Stars: ✭ 624 (-20.71%)
Mutual labels:  semantic-segmentation
Lightnet
LightNet: Light-weight Networks for Semantic Image Segmentation (Cityscapes and Mapillary Vistas Dataset)
Stars: ✭ 698 (-11.31%)
Mutual labels:  semantic-segmentation
Label Studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
Stars: ✭ 7,264 (+823%)
Mutual labels:  semantic-segmentation
Pytorch 3dunet
3D U-Net model for volumetric semantic segmentation written in pytorch
Stars: ✭ 765 (-2.8%)
Mutual labels:  semantic-segmentation
Cvat
Powerful and efficient Computer Vision Annotation Tool (CVAT)
Stars: ✭ 6,557 (+733.16%)
Mutual labels:  semantic-segmentation
Gscnn
Gated-Shape CNN for Semantic Segmentation (ICCV 2019)
Stars: ✭ 720 (-8.51%)
Mutual labels:  semantic-segmentation
Prepare detection dataset
Convert datasets to COCO/VOC format.
Stars: ✭ 654 (-16.9%)
Mutual labels:  coco
Randla Net
πŸ”₯RandLA-Net in Tensorflow (CVPR 2020, Oral)
Stars: ✭ 637 (-19.06%)
Mutual labels:  semantic-segmentation
Mmpose
OpenMMLab Pose Estimation Toolbox and Benchmark.
Stars: ✭ 674 (-14.36%)
Mutual labels:  coco

DeepLab with PyTorch

This is an unofficial PyTorch implementation of DeepLab v2 [1] with a ResNet-101 backbone.

  • COCO-Stuff dataset [2] and PASCAL VOC dataset [3] are supported.
  • The official Caffe weights provided by the authors can be used without building the Caffe APIs.
  • DeepLab v3/v3+ models with the same backbone are also included (not tested).
  • torch.hub is supported.

Performance

COCO-Stuff

Train set    Eval set    Code          Weight      CRF?  Pixel Acc.  Mean Acc.  Mean IoU  FreqW IoU
10k train †  10k val †   Official [2]  -           -     65.1        45.5       34.4      50.4
10k train †  10k val †   This repo     Download    -     65.8        45.7       34.8      51.2
10k train †  10k val †   This repo     Download    βœ“     67.1        46.4       35.6      52.5
164k train   164k val    This repo     Download ‑  -     66.8        51.2       39.1      51.5
164k train   164k val    This repo     Download ‑  βœ“     67.6        51.5       39.7      52.3

† Images and labels are pre-warped to a square shape of 513x513.
‑ Note for SPADE followers: The provided COCO-Stuff 164k weight has been kept intact since 2019/02/23.

PASCAL VOC 2012

Train set  Eval set  Code          Weight    CRF?  Pixel Acc.  Mean Acc.  Mean IoU  FreqW IoU
trainaug   val       Official [3]  -         -     -           -          76.35     -
trainaug   val       Official [3]  -         βœ“     -           -          77.69     -
trainaug   val       This repo     Download  -     94.64       86.50      76.65     90.41
trainaug   val       This repo     Download  βœ“     95.04       86.64      77.93     91.06

Setup

Requirements

Required Python packages are listed in the Anaconda configuration file configs/conda_env.yaml. Adjust the listed cudatoolkit=10.2 and python=3.6 as needed, then run the following commands.

# Set up with Anaconda
conda env create -f configs/conda_env.yaml
conda activate deeplab-pytorch

Download datasets

Download pre-trained caffemodels

Caffemodels pre-trained on COCO and PASCAL VOC datasets are released by the DeepLab authors. In accordance with the papers [1,2], this repository uses the COCO-trained parameters as initial weights.

  1. Run the following script to download the pre-trained caffemodels (1 GB+).

bash scripts/setup_caffemodels.sh

  2. Convert the caffemodels to PyTorch-compatible weights. There is no need to build the Caffe API!

# Generate "deeplabv1_resnet101-coco.pth" from "init.caffemodel"
python convert.py --dataset coco
# Generate "deeplabv2_resnet101_msc-vocaug.pth" from "train2_iter_20000.caffemodel"
python convert.py --dataset voc12

Training & Evaluation

To train DeepLab v2 on PASCAL VOC 2012:

python main.py train \
    --config-path configs/voc12.yaml

To evaluate the performance on a validation set:

python main.py test \
    --config-path configs/voc12.yaml \
    --model-path data/models/voc12/deeplabv2_resnet101_msc/train_aug/checkpoint_final.pth

Note: This command saves the predicted logit maps (.npy) and the scores (.json).
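
As a rough illustration of how these saved outputs can be consumed afterwards (the file names below are hypothetical placeholders; the actual paths depend on the configuration):

import json
import numpy as np

# Hypothetical file names; the actual paths depend on the config.
logits = np.load("logit_map.npy")    # per-class logit map, e.g. shape (C, H, W)
labelmap = logits.argmax(axis=0)     # hard per-pixel prediction

with open("scores.json") as f:
    scores = json.load(f)            # metrics computed by the test run
print(labelmap.shape, scores)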

To re-evaluate with a CRF post-processing:

python main.py crf \
    --config-path configs/voc12.yaml
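
CRF refinement of this kind is commonly implemented with pydensecrf. The following is a hedged sketch of such a post-processing step, not the repository's exact code; the kernel parameters are illustrative placeholders:

import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(image, probs, n_iters=10):
    """Refine softmax probabilities with a dense CRF.

    image: (H, W, 3) uint8 RGB array; probs: (C, H, W) softmax scores.
    """
    c, h, w = probs.shape
    d = dcrf.DenseCRF2D(w, h, c)
    d.setUnaryEnergy(unary_from_softmax(probs.astype(np.float32)))
    d.addPairwiseGaussian(sxy=3, compat=3)       # smoothness kernel
    d.addPairwiseBilateral(sxy=67, srgb=3,       # appearance kernel
                           rgbim=np.ascontiguousarray(image), compat=4)
    q = d.inference(n_iters)
    return np.asarray(q).reshape(c, h, w).argmax(axis=0)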

Running the above commands in sequence is equivalent to bash scripts/train_eval.sh.

To monitor the training loss, run the following command in a separate terminal.

tensorboard --logdir data/logs

Please specify the appropriate configuration file for the other datasets.

Dataset          Config file                 #Iterations  Classes
PASCAL VOC 2012  configs/voc12.yaml          20,000       20 foreground + 1 background
COCO-Stuff 10k   configs/cocostuff10k.yaml   20,000       182 thing/stuff
COCO-Stuff 164k  configs/cocostuff164k.yaml  100,000      182 thing/stuff

Note: Although the label indices range from 0 to 181 in COCO-Stuff 10k/164k, only 171 classes are supervised.

Common settings:

  • Model: DeepLab v2 with a ResNet-101 backbone. The dilated rates of ASPP are (6, 12, 18, 24), and the output stride is 8 (see the ASPP sketch after this list).
  • GPU: All the GPUs visible to the process are used. Please limit the scope with CUDA_VISIBLE_DEVICES=.
  • Multi-scale loss: The loss is the sum of the cross-entropy terms computed on the outputs for multi-scale inputs (1x, 0.75x, 0.5x) plus their element-wise max across the scales; the unlabeled class is ignored in the loss computation (see the training sketch after this list).
  • Gradient accumulation: A mini-batch of 10 samples is not processed at once due to high GPU memory usage. Instead, gradients from sub-batches of 5 samples are accumulated over 2 iterations, and the weights are updated at the end (batch_size * iter_size = 10). GPU memory usage is approx. 11.2 GB with the default setting (tested on a single Titan X); you can reduce it with a smaller batch_size.
  • Learning rate: Stochastic gradient descent (SGD) is used with a momentum of 0.9 and an initial learning rate of 2.5e-4. Polynomial learning rate decay is employed: the learning rate is multiplied by (1-iter/iter_max)**power every 10 iterations.
  • Monitoring: The moving average loss (average_loss in Caffe) can be monitored in TensorBoard.
  • Preprocessing: Input images are randomly re-scaled by factors ranging from 0.5 to 1.5, padded if needed, and randomly cropped to 321x321.
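
For reference, here is a minimal sketch of a DeepLab v2-style ASPP head with rates (6, 12, 18, 24): parallel dilated 3x3 convolutions whose outputs are summed. It is an illustration under stated assumptions, not the repository's exact module:

import torch
import torch.nn as nn

class ASPP(nn.Module):
    """DeepLab v2-style ASPP: parallel dilated 3x3 convs, summed."""
    def __init__(self, in_ch, n_classes, rates=(6, 12, 18, 24)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, n_classes, 3, padding=r, dilation=r)
            for r in rates
        )

    def forward(self, x):
        return sum(branch(x) for branch in self.branches)

aspp = ASPP(in_ch=2048, n_classes=21)        # ResNet-101 yields 2048 channels
logits = aspp(torch.randn(1, 2048, 41, 41))  # ~321/8 spatial size at stride 8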
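
The multi-scale loss, gradient accumulation, and polynomial decay can be pieced together as below. This is a self-contained sketch with dummy data and a placeholder one-layer "network"; the authoritative implementation is the repository's main.py:

import torch
import torch.nn as nn
import torch.nn.functional as F

N_CLASSES, IGNORE_LABEL = 21, 255
LR_INIT, POWER, ITER_MAX, ITER_SIZE = 2.5e-4, 0.9, 20000, 2

model = nn.Conv2d(3, N_CLASSES, 1)  # stand-in for DeepLab v2 + ResNet-101

def next_batch(batch_size=5, size=321):
    # Dummy data; the real loader applies rescale/pad/crop augmentation.
    images = torch.randn(batch_size, 3, size, size)
    labels = torch.randint(0, N_CLASSES, (batch_size, size, size))
    return images, labels

def multi_scale_logits(net, images):
    # Run the net on 1x / 0.75x / 0.5x inputs; resize logits back to 1x.
    h, w = images.shape[2:]
    outs = []
    for s in (1.0, 0.75, 0.5):
        x = images if s == 1.0 else F.interpolate(
            images, scale_factor=s, mode="bilinear", align_corners=False)
        outs.append(F.interpolate(net(x), size=(h, w),
                                  mode="bilinear", align_corners=False))
    return outs

def multi_scale_loss(logits, labels):
    # Cross-entropy summed over each scale plus their element-wise max;
    # pixels marked IGNORE_LABEL do not contribute to the loss.
    outputs = logits + [torch.stack(logits).max(dim=0).values]
    return sum(F.cross_entropy(o, labels, ignore_index=IGNORE_LABEL)
               for o in outputs)

optimizer = torch.optim.SGD(model.parameters(), lr=LR_INIT, momentum=0.9)

for iteration in range(1, ITER_MAX + 1):
    optimizer.zero_grad()
    # Accumulate gradients of 2 sub-batches of 5 -> effective batch of 10.
    for _ in range(ITER_SIZE):
        images, labels = next_batch()
        loss = multi_scale_loss(multi_scale_logits(model, images), labels)
        (loss / ITER_SIZE).backward()
    optimizer.step()
    if iteration % 10 == 0:  # polynomial learning-rate decay
        for g in optimizer.param_groups:
            g["lr"] = LR_INIT * (1 - iteration / ITER_MAX) ** POWER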

[Figure: processed images and labels in COCO-Stuff 164k]

Inference Demo

You can use the pre-trained models, the converted models, or your own models.

To process a single image:

python demo.py single \
    --config-path configs/voc12.yaml \
    --model-path deeplabv2_resnet101_msc-vocaug-20000.pth \
    --image-path image.jpg

To run on a webcam:

python demo.py live \
    --config-path configs/voc12.yaml \
    --model-path deeplabv2_resnet101_msc-vocaug-20000.pth

To run a CRF post-processing, add --crf. To run on a CPU, add --cpu.

Misc

torch.hub

Model setup with two lines

import torch.hub
model = torch.hub.load("kazuto1011/deeplab-pytorch", "deeplabv2_resnet101", pretrained="cocostuff164k", n_classes=182)
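
A minimal end-to-end usage sketch follows. The preprocessing here is a placeholder (a random tensor instead of a normalized image); demo.py defines the actual input pipeline:

import torch
import torch.nn.functional as F

model = torch.hub.load("kazuto1011/deeplab-pytorch", "deeplabv2_resnet101",
                       pretrained="cocostuff164k", n_classes=182)
model.eval()

image = torch.randn(1, 3, 513, 513)  # stand-in for a preprocessed image
with torch.no_grad():
    logits = model(image)            # low-resolution class logits
    logits = F.interpolate(logits, size=image.shape[2:],
                           mode="bilinear", align_corners=False)
    labelmap = logits.argmax(dim=1)  # (1, 513, 513) class indices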

Differences from the Caffe version

  • While the official code downsamples labels with 1/16 bilinear interpolation (the Interp layer) for the 0.5x input only, this codebase downsamples labels for both the 0.5x and 0.75x inputs with nearest interpolation (PIL.Image.resize; related issue). See the sketch after this list.
  • Bilinear interpolation on images and logits is performed with align_corners=False.
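
For illustration, downsampling a label map with nearest-neighbor interpolation, so that class indices are never blended into invalid in-between values, might look like this (a sketch, not the repository's exact code):

import numpy as np
from PIL import Image

label = np.random.randint(0, 21, (321, 321), dtype=np.uint8)  # dummy label map
for scale in (0.75, 0.5):
    size = (int(321 * scale), int(321 * scale))  # PIL expects (width, height)
    # NEAREST keeps every pixel a valid class index; bilinear would mix them.
    small = np.asarray(Image.fromarray(label).resize(size, Image.NEAREST))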

Training batch normalization

This codebase only supports the DeepLab v2 training protocol, which freezes the batch normalization layers, whereas the v3/v3+ protocols require training them. If you also need to train the batch normalization parameters across multiple GPUs in your own projects, please install the extra library below.

pip install torch-encoding

Batch normalization layers in a model are automatically switched in libs/models/resnet.py.

import torch.nn as nn

try:
    # Use synchronized batch normalization if torch-encoding is installed.
    from encoding.nn import SyncBatchNorm
    _BATCH_NORM = SyncBatchNorm
except ImportError:
    # Otherwise fall back to the standard per-GPU batch normalization.
    _BATCH_NORM = nn.BatchNorm2d

References

  1. L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. L. Yuille. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE TPAMI, 2018.
    Project / Code / arXiv paper

  2. H. Caesar, J. Uijlings, V. Ferrari. COCO-Stuff: Thing and Stuff Classes in Context. In CVPR, 2018.
    Project / arXiv paper

  3. M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, A. Zisserman. The PASCAL Visual Object Classes (VOC) Challenge. IJCV, 2010.
    Project / Paper

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].