
sithu31296 / semantic-segmentation

License: MIT
SOTA Semantic Segmentation Models in PyTorch

Programming Languages

  • Python
  • Jupyter Notebook

Projects that are alternatives of or similar to semantic-segmentation

SegFormer
Official PyTorch implementation of SegFormer
Stars: ✭ 1,264 (+172.41%)
Mutual labels:  transformer, semantic-segmentation, cityscapes, ade20k
Hrnet Semantic Segmentation
The OCR approach is rephrased as Segmentation Transformer: https://arxiv.org/abs/1909.11065. This is an official implementation of semantic segmentation for HRNet. https://arxiv.org/abs/1908.07919
Stars: ✭ 2,369 (+410.56%)
Mutual labels:  transformer, semantic-segmentation, cityscapes, pascal-context
semantic-segmentation-tensorflow
Semantic segmentation for the ADE20K & Cityscapes datasets, based on several models.
Stars: ✭ 84 (-81.9%)
Mutual labels:  semantic-segmentation, cityscapes, ade20k
VT-UNet
[MICCAI2022] This is an official PyTorch implementation for A Robust Volumetric Transformer for Accurate 3D Tumor Segmentation
Stars: ✭ 151 (-67.46%)
Mutual labels:  transformer, semantic-segmentation, vision-transformer
visualization
a collection of visualization function
Stars: ✭ 189 (-59.27%)
Mutual labels:  transformer, vision-transformer
image-classification
A collection of SOTA Image Classification Models in PyTorch
Stars: ✭ 70 (-84.91%)
Mutual labels:  transformer, vision-transformer
SwinIR
SwinIR: Image Restoration Using Swin Transformer (official repository)
Stars: ✭ 1,260 (+171.55%)
Mutual labels:  transformer, vision-transformer
Mmsegmentation
OpenMMLab Semantic Segmentation Toolbox and Benchmark.
Stars: ✭ 2,875 (+519.61%)
Mutual labels:  transformer, semantic-segmentation
Ghostnet
CV backbones including GhostNet, TinyNet and TNT, developed by Huawei Noah's Ark Lab.
Stars: ✭ 1,744 (+275.86%)
Mutual labels:  transformer, vision-transformer
Setr Pytorch
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
Stars: ✭ 96 (-79.31%)
Mutual labels:  transformer, semantic-segmentation
keras-vision-transformer
The Tensorflow, Keras implementation of Swin-Transformer and Swin-UNET
Stars: ✭ 91 (-80.39%)
Mutual labels:  transformer, vision-transformer
towhee
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
Stars: ✭ 821 (+76.94%)
Mutual labels:  transformer, vision-transformer
YOLOS
You Only Look at One Sequence (NeurIPS 2021)
Stars: ✭ 612 (+31.9%)
Mutual labels:  transformer, vision-transformer
LaTeX-OCR
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Stars: ✭ 1,566 (+237.5%)
Mutual labels:  transformer, vision-transformer
TransMorph Transformer for Medical Image Registration
TransMorph: Transformer for Unsupervised Medical Image Registration (PyTorch)
Stars: ✭ 130 (-71.98%)
Mutual labels:  transformer, vision-transformer
multiclass-semantic-segmentation
Experiments with UNET/FPN models and cityscapes/kitti datasets [Pytorch]
Stars: ✭ 96 (-79.31%)
Mutual labels:  semantic-segmentation, cityscapes
libai
LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
Stars: ✭ 284 (-38.79%)
Mutual labels:  transformer, vision-transformer
Lightnetplusplus
LightNet++: Boosted Light-weighted Networks for Real-time Semantic Segmentation
Stars: ✭ 218 (-53.02%)
Mutual labels:  semantic-segmentation, cityscapes
Decouplesegnets
Implementation of Our ECCV2020-work: Improving Semantic Segmentation via Decoupled Body and Edge Supervision
Stars: ✭ 232 (-50%)
Mutual labels:  semantic-segmentation, cityscapes
IAST-ECCV2020
IAST: Instance Adaptive Self-training for Unsupervised Domain Adaptation (ECCV 2020) https://teacher.bupt.edu.cn/zhuchuang/en/index.htm
Stars: ✭ 84 (-81.9%)
Mutual labels:  semantic-segmentation, cityscapes

Semantic Segmentation

Easy-to-use and customizable SOTA semantic segmentation models with abundant datasets in PyTorch

Open In Colab

banner

Features

  • Applicable to the following tasks:
    • Scene Parsing
    • Human Parsing
    • Face Parsing
    • Medical Image Segmentation (Coming Soon)
  • 20+ Datasets
  • 15+ SOTA Backbones
  • 10+ SOTA Semantic Segmentation Models
  • PyTorch, ONNX, TFLite, OpenVINO Export & Inference

Model Zoo

Supported Backbones:

Supported Heads/Methods:

Supported Standalone Models:

Supported Modules:

  • PPM (CVPR 2017)
  • PSA (ArXiv 2021)

Refer to MODELS for benchmarks and available pre-trained models.

And check BACKBONES for supported backbones.

Note: Most of the methods do not have pre-trained models. Combining different models with pre-trained weights in one repository is difficult, and there are limited resources to re-train them myself.

Supported Datasets

Scene Parsing:

Human Parsing:

Face Parsing:

Others:

Refer to DATASETS for more details and dataset preparation.

Available Augmentations

Check the notebook here to test the augmentation effects.

Pixel-level Transforms:

  • ColorJitter (Brightness, Contrast, Saturation, Hue)
  • Gamma, Sharpness, AutoContrast, Equalize, Posterize
  • GaussianBlur, Grayscale

Spatial-level Transforms:

  • Affine, RandomRotation
  • HorizontalFlip, VerticalFlip
  • CenterCrop, RandomCrop
  • Pad, ResizePad, Resize
  • RandomResizedCrop
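
For a rough sense of what such a pipeline looks like in code, here is a minimal sketch built on torchvision transforms using a few of the operations listed above. It is illustrative only: this repository implements its own paired image/mask transforms, so the class names and parameter values below are assumptions.

import torchvision.transforms as T
from PIL import Image

# Illustrative pipeline; the repo applies spatial transforms to the image
# and mask jointly, which plain torchvision transforms do not do.
pixel_level = T.Compose([
    T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1),
    T.GaussianBlur(kernel_size=3),
])
spatial_level = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomRotation(degrees=10),
    T.RandomResizedCrop(size=(512, 512), scale=(0.5, 1.0)),
])

img = Image.open("example.jpg").convert("RGB")
augmented = spatial_level(pixel_level(img))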

Usage

Installation
  • python >= 3.6
  • torch >= 1.8.1
  • torchvision >= 0.9.1

Then, clone the repo and install the project with:

$ git clone https://github.com/sithu31296/semantic-segmentation
$ cd semantic-segmentation
$ pip install -e .
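
A quick sanity check (an optional snippet, not part of the repo) confirms the required versions and GPU visibility:

import torch
import torchvision

# Verify the minimum versions listed above and check for a GPU.
print("torch:", torch.__version__)              # expect >= 1.8.1
print("torchvision:", torchvision.__version__)  # expect >= 0.9.1
print("CUDA available:", torch.cuda.is_available())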

Configuration

Create a configuration file in configs. A sample configuration for the ADE20K dataset can be found here. Then edit the fields as needed. This configuration file is used by all of the training, evaluation, and prediction scripts.
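
As a rough illustration of what the scripts read, the snippet below loads such a YAML file with PyYAML and accesses only the field names mentioned in this README; the actual sample config may nest or name keys differently.

import yaml  # PyYAML

# Minimal sketch of consuming a config such as configs/ade20k.yaml.
with open("configs/ade20k.yaml") as f:
    cfg = yaml.safe_load(f)

print(cfg["MODEL"]["NAME"], cfg["MODEL"]["BACKBONE"])  # model and backbone choice
print(cfg["DATASET"]["NAME"])                          # dataset the weights match
print(cfg["TEST"]["MODEL_PATH"], cfg["SAVE_DIR"])      # weights to load, output directory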


Training

To train with a single GPU:

$ python tools/train.py --cfg configs/CONFIG_FILE.yaml

To train with multiple GPUs, set the DDP field in the config file to true and run as follows:

$ python -m torch.distributed.launch --nproc_per_node=2 --use_env tools/train.py --cfg configs/<CONFIG_FILE_NAME>.yaml

Evaluation

Make sure to set the MODEL_PATH field of the configuration file to your trained model directory.

$ python tools/val.py --cfg configs/<CONFIG_FILE_NAME>.yaml

To evaluate with multi-scale and flip, set the ENABLE field under MSF to true and run the same command as above.
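
Conceptually, multi-scale and flip (MSF) evaluation runs the model on several rescaled copies of each image plus their horizontal flips and averages the logits before taking the argmax. The following is a generic sketch of that idea, not the repo's actual implementation; the scale list and flip flag would normally come from the MSF section of the config.

import torch
import torch.nn.functional as F

@torch.no_grad()
def msf_predict(model, image, scales=(0.5, 0.75, 1.0, 1.25, 1.5), flip=True):
    # image: float tensor of shape (1, 3, H, W)
    _, _, H, W = image.shape
    logits_sum = None
    for s in scales:
        scaled = F.interpolate(image, scale_factor=s, mode="bilinear", align_corners=False)
        out = F.interpolate(model(scaled), size=(H, W), mode="bilinear", align_corners=False)
        if flip:
            flipped = model(torch.flip(scaled, dims=[3]))
            out = out + F.interpolate(torch.flip(flipped, dims=[3]), size=(H, W),
                                      mode="bilinear", align_corners=False)
        logits_sum = out if logits_sum is None else logits_sum + out
    return logits_sum.argmax(dim=1)  # (1, H, W) per-pixel class indices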


Inference

To run inference, edit the following parameters in the config file:

  • Change MODEL >> NAME and BACKBONE to your desired pretrained model.
  • Change DATASET >> NAME to the dataset name depending on the pretrained model.
  • Set TEST >> MODEL_PATH to pretrained weights of the testing model.
  • Change TEST >> FILE to the file or image folder path you want to test.
  • Testing results will be saved in SAVE_DIR.
## example using ade20k pretrained models
$ python tools/infer.py --cfg configs/ade20k.yaml

Example test results (SegFormer-B2):

test_result
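
The results above are colorized predictions. Turning a per-pixel class-index mask into such an image is a simple palette lookup; the sketch below uses an arbitrary random palette, whereas the repo uses dataset-specific palettes.

import numpy as np
from PIL import Image

# Arbitrary palette for illustration (e.g. 150 ADE20K classes).
palette = np.random.randint(0, 255, size=(150, 3), dtype=np.uint8)

mask = np.random.randint(0, 150, size=(512, 512))  # stand-in for a predicted mask
color_mask = palette[mask]                         # (H, W, 3) uint8 image
Image.fromarray(color_mask).save("colorized_mask.png")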


Convert to other Frameworks (ONNX, CoreML, OpenVINO, TFLite)

To convert to ONNX and CoreML, run:

$ python tools/export.py --cfg configs/<CONFIG_FILE_NAME>.yaml

To convert to OpenVINO and TFLite, see torch_optimize.
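
Under the hood, ONNX export of a PyTorch model comes down to torch.onnx.export with a dummy input. The sketch below is generic and hedged: the placeholder model, input size, opset version, and tensor names are assumptions, not the exact logic of tools/export.py.

import torch
import torch.nn as nn

# Placeholder standing in for a trained segmentation network.
model = nn.Sequential(nn.Conv2d(3, 19, kernel_size=3, padding=1)).eval()
dummy = torch.randn(1, 3, 512, 512)

torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["logits"],
    opset_version=13,
    dynamic_axes={"input": {0: "batch"}, "logits": {0: "batch"}},
)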


Inference (ONNX, OpenVINO, TFLite)
## ONNX Inference
$ python scripts/onnx_infer.py --model <ONNX_MODEL_PATH> --img-path <TEST_IMAGE_PATH>

## OpenVINO Inference
$ python scripts/openvino_infer.py --model <OpenVINO_MODEL_PATH> --img-path <TEST_IMAGE_PATH>

## TFLite Inference
$ python scripts/tflite_infer.py --model <TFLite_MODEL_PATH> --img-path <TEST_IMAGE_PATH>
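
For reference, bare-bones ONNX inference outside these scripts can be done with ONNX Runtime. The preprocessing below (input size, scaling to [0, 1], no mean/std normalization) is an assumption and must match whatever the exported model actually expects.

import numpy as np
import onnxruntime as ort
from PIL import Image

# Load the exported model and run a single image through it on CPU.
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
inp_name = sess.get_inputs()[0].name

img = Image.open("test.jpg").convert("RGB").resize((512, 512))
x = np.asarray(img, dtype=np.float32).transpose(2, 0, 1)[None] / 255.0  # (1, 3, H, W)

logits = sess.run(None, {inp_name: x})[0]
mask = logits.argmax(axis=1)[0]  # per-pixel class indices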

References

Citations
@article{xie2021segformer,
  title={SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers},
  author={Xie, Enze and Wang, Wenhai and Yu, Zhiding and Anandkumar, Anima and Alvarez, Jose M and Luo, Ping},
  journal={arXiv preprint arXiv:2105.15203},
  year={2021}
}

@misc{xiao2018unified,
  title={Unified Perceptual Parsing for Scene Understanding}, 
  author={Tete Xiao and Yingcheng Liu and Bolei Zhou and Yuning Jiang and Jian Sun},
  year={2018},
  eprint={1807.10221},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@article{hong2021deep,
  title={Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes},
  author={Hong, Yuanduo and Pan, Huihui and Sun, Weichao and Jia, Yisong},
  journal={arXiv preprint arXiv:2101.06085},
  year={2021}
}

@misc{zhang2021rest,
  title={ResT: An Efficient Transformer for Visual Recognition}, 
  author={Qinglong Zhang and Yubin Yang},
  year={2021},
  eprint={2105.13677},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@misc{huang2021fapn,
  title={FaPN: Feature-aligned Pyramid Network for Dense Image Prediction}, 
  author={Shihua Huang and Zhichao Lu and Ran Cheng and Cheng He},
  year={2021},
  eprint={2108.07058},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@misc{wang2021pvtv2,
  title={PVTv2: Improved Baselines with Pyramid Vision Transformer}, 
  author={Wenhai Wang and Enze Xie and Xiang Li and Deng-Ping Fan and Kaitao Song and Ding Liang and Tong Lu and Ping Luo and Ling Shao},
  year={2021},
  eprint={2106.13797},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@article{Liu2021PSA,
  title={Polarized Self-Attention: Towards High-quality Pixel-wise Regression},
  author={Huajun Liu and Fuqiang Liu and Xinyi Fan and Dong Huang},
  journal={arXiv preprint arXiv:2107.00782},
  year={2021}
}

@misc{chao2019hardnet,
  title={HarDNet: A Low Memory Traffic Network}, 
  author={Ping Chao and Chao-Yang Kao and Yu-Shan Ruan and Chien-Hsiang Huang and Youn-Long Lin},
  year={2019},
  eprint={1909.00948},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@inproceedings{sfnet,
  title={Semantic Flow for Fast and Accurate Scene Parsing},
  author={Li, Xiangtai and You, Ansheng and Zhu, Zhen and Zhao, Houlong and Yang, Maoke and Yang, Kuiyuan and Tong, Yunhai},
  booktitle={ECCV},
  year={2020}
}

@article{Li2020SRNet,
  title={Towards Efficient Scene Understanding via Squeeze Reasoning},
  author={Xiangtai Li and Xia Li and Ansheng You and Li Zhang and Guang-Liang Cheng and Kuiyuan Yang and Y. Tong and Zhouchen Lin},
  journal={ArXiv},
  year={2020},
  volume={abs/2011.03308}
}

@ARTICLE{Yucondnet21,
  author={Yu, Changqian and Shao, Yuanjie and Gao, Changxin and Sang, Nong},
  journal={IEEE Signal Processing Letters}, 
  title={CondNet: Conditional Classifier for Scene Segmentation}, 
  year={2021},
  volume={28},
  number={},
  pages={758-762},
  doi={10.1109/LSP.2021.3070472}
}

@misc{yan2022lawin,
  title={Lawin Transformer: Improving Semantic Segmentation Transformer with Multi-Scale Representations via Large Window Attention}, 
  author={Haotian Yan and Chuang Zhang and Ming Wu},
  year={2022},
  eprint={2201.01615},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@misc{yu2021metaformer,
  title={MetaFormer is Actually What You Need for Vision}, 
  author={Weihao Yu and Mi Luo and Pan Zhou and Chenyang Si and Yichen Zhou and Xinchao Wang and Jiashi Feng and Shuicheng Yan},
  year={2021},
  eprint={2111.11418},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@misc{wightman2021resnet,
  title={ResNet strikes back: An improved training procedure in timm}, 
  author={Ross Wightman and Hugo Touvron and Hervé Jégou},
  year={2021},
  eprint={2110.00476},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@misc{liu2022convnet,
  title={A ConvNet for the 2020s}, 
  author={Zhuang Liu and Hanzi Mao and Chao-Yuan Wu and Christoph Feichtenhofer and Trevor Darrell and Saining Xie},
  year={2022},
  eprint={2201.03545},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@misc{li2022uniformer,
  title={UniFormer: Unifying Convolution and Self-attention for Visual Recognition}, 
  author={Kunchang Li and Yali Wang and Junhao Zhang and Peng Gao and Guanglu Song and Yu Liu and Hongsheng Li and Yu Qiao},
  year={2022},
  eprint={2201.09450},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
