JialeCao001 / Sipmask

Licence: mit
SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation (ECCV2020)

SipMask

This is the official implementation of "SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation (ECCV2020)" built on the open-source mmdetection and maskrcnn-benchmark.

Introduction

Single-stage instance segmentation approaches have recently gained popularity due to their speed and simplicity, but they still lag behind two-stage methods in accuracy. We propose a fast single-stage instance segmentation method, called SipMask, that preserves instance-specific spatial information by separating the mask prediction of an instance into different sub-regions of a detected bounding box. Our main contribution is a novel lightweight spatial preservation (SP) module that generates a separate set of spatial coefficients for each sub-region within a bounding box, leading to improved mask predictions. It also enables accurate delineation of spatially adjacent instances. Further, we introduce a mask alignment weighting loss and a feature alignment scheme to better correlate mask prediction with object detection.
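
The idea of one coefficient set per sub-region can be illustrated with a toy sketch. This is not the code from this repository: the 2×2 split, the plain-list representation, and the names `assemble_mask`, `basis`, and `coeffs` are assumptions made purely for illustration. Each pixel inside the box is scored with the coefficients of the sub-region it falls into, and the per-region results together form the instance mask.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def assemble_mask(basis, coeffs, box, grid=2):
    """Toy illustration: combine basis masks with one coefficient set
    per sub-region of a box (hypothetical helper, not the repo's API).

    basis:  list of M basis masks, each an H x W list of floats
    coeffs: grid*grid coefficient sets (row-major), each a list of M floats
    box:    (x0, y0, x1, y1), half-open pixel coordinates of the detection
    """
    x0, y0, x1, y1 = box
    height, width = len(basis[0]), len(basis[0][0])
    mask = [[0.0] * width for _ in range(height)]
    for y in range(y0, y1):
        for x in range(x0, x1):
            # Find which sub-region of the box this pixel falls into.
            row = (y - y0) * grid // (y1 - y0)
            col = (x - x0) * grid // (x1 - x0)
            c = coeffs[row * grid + col]
            # Linear combination of the basis masks, using only the
            # coefficients that belong to this sub-region.
            logit = sum(ci * b[y][x] for ci, b in zip(c, basis))
            mask[y][x] = sigmoid(logit)
    return mask


if __name__ == "__main__":
    # One constant basis mask and four coefficient sets (assumed values).
    basis = [[[1.0] * 4 for _ in range(4)]]
    coeffs = [[5.0], [-5.0], [-5.0], [5.0]]
    for line in assemble_mask(basis, coeffs, (0, 0, 4, 4)):
        print(["%.2f" % v for v in line])
```

With a single coefficient set for the whole box, the example above could not produce a mask that is "on" in two opposite quadrants and "off" in the other two; separate per-region coefficients make that possible, which is the intuition behind preserving spatial information within the box.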

SipMask-benchmark (image instance segmentation)

  • This project is built on the official implementation of FCOS, which is based on maskrcnn-benchmark.
  • A high-quality version is provided.
  • Please use SipMask-benchmark and refer to INSTALL.md for installation.
  • Tested with PyTorch 1.1.0 and CUDA 9.0/10.0.
Train with multiple GPUs
python -m torch.distributed.launch --nproc_per_node=4 --master_port=$((RANDOM+10000)) tools/train_net.py --config-file ${CONFIG_FILE} DATALOADER.NUM_WORKERS 2 OUTPUT_DIR ${OUTPUT_PATH}
e.g.,
python -m torch.distributed.launch --nproc_per_node=4 --master_port=$((RANDOM+10000)) tools/train_net.py --config-file configs/sipmask/sipmask_R_50_FPN_1x.yaml DATALOADER.NUM_WORKERS 2 OUTPUT_DIR training_dir/sipmask_R_50_FPN_1x
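
For scripted experiments, the same training command can be assembled in Python. The following is a minimal sketch, not part of the repository: the helper name `build_train_command` and its defaults are assumptions. It mirrors the shell expression `$((RANDOM+10000))`, which picks a random master port so that several jobs launched on the same machine are unlikely to collide.

```python
import random
import shlex
import sys


def build_train_command(config_file, output_dir, num_gpus=4, num_workers=2):
    """Assemble the distributed training command as an argument list
    (hypothetical helper for illustration)."""
    # Mirrors $((RANDOM+10000)): bash RANDOM lies in [0, 32767].
    master_port = random.randint(0, 32767) + 10000
    return [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        f"--master_port={master_port}",
        "tools/train_net.py",
        "--config-file", config_file,
        "DATALOADER.NUM_WORKERS", str(num_workers),
        "OUTPUT_DIR", output_dir,
    ]


if __name__ == "__main__":
    cmd = build_train_command(
        "configs/sipmask/sipmask_R_50_FPN_1x.yaml",
        "training_dir/sipmask_R_50_FPN_1x",
    )
    # Print the command for inspection; pass `cmd` to subprocess.run
    # from the SipMask-benchmark directory to actually launch training.
    print(shlex.join(cmd))
```
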
Test with a single GPU
python tools/test_net.py --config-file ${CONFIG_FILE} MODEL.WEIGHT ${CHECKPOINT_FILE} TEST.IMS_PER_BATCH 4
e.g.,
python tools/test_net.py --config-file configs/sipmask/sipmask_R_50_FPN_1x.yaml MODEL.WEIGHT training_dir/SipMask_R50_1x.pth TEST.IMS_PER_BATCH 4
Results
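| name    | backbone | input size | epoch | ms-train | val. box AP | val. mask AP | download |
|---------|----------|------------|-------|----------|-------------|--------------|----------|
| SipMask | R50      | 800 × 1333 | 1x    | no       | 39.5        | 34.2         | model    |
| SipMask | R101     | 800 × 1333 | 3x    | yes      | 44.1        | 37.8         | model    |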

SipMask-mmdetection (image instance segmentation)

  • This project is built on mmdetection.
  • Both a high-quality version and a real-time version are provided.
  • Please use SipMask-mmdetection and refer to INSTALL.md for installation.
  • Tested with PyTorch 1.1.0, CUDA 9.0/10.0, and mmcv 0.4.3.
Train with multiple GPUs
./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
e.g.,
CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh configs/sipmask/sipmask_r50_caffe_fpn_gn_1x_4gpu.py 4 --validate
Test with a single GPU
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] [--show]
e.g., 
python tools/test.py ./configs/sipmask/sipmask_r50_caffe_fpn_gn_1x_4gpu.py ./work_dirs/sipmask_r50_caffe_1x.pth --out results.pkl --eval bbox segm
Inference with saved results

With our trained model, detection results of an image can be visualized using the following command.

python ./demo/sipmask_demo.py ${CONFIG_FILE} ${CHECKPOINT_FILE} ${IMAGE_FILE} [--out ${OUT_PATH}]
e.g.,
python ./demo/sipmask_demo.py ./configs/sipmask/sipmask_r50_caffe_fpn_gn_1x_4gpu.py ./sipmask_r50_caffe_1x.pth ./demo/demo.jpg --out ./demo/aa.jpg
Results
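| name      | backbone | input size | epoch | ms-train | GN  | val. box AP | val. mask AP | download |
|-----------|----------|------------|-------|----------|-----|-------------|--------------|----------|
| SipMask   | R50      | 800×1333   | 1x    | no       | yes | 38.2        | 33.5         | model    |
| SipMask   | R50      | 800×1333   | 2x    | yes      | yes | 40.8        | 35.6         | model    |
| SipMask   | R101     | 800×1333   | 4x    | yes      | yes | 43.6        | 37.8         | model    |
| SipMask   | R50      | 544×544    | 6x    | yes      | no  | 36.0        | 31.7         | model    |
| SipMask   | R50      | 544×544    | 10x   | yes      | yes | 37.1        | 32.4         | model    |
| SipMask   | R101     | 544×544    | 6x    | yes      | no  | 38.4        | 33.6         | model    |
| SipMask   | R101     | 544×544    | 10x   | yes      | yes | 40.3        | 34.8         | model    |
| SipMask++ | R101-D   | 544×544    | 6x    | yes      | no  | 40.1        | 35.2         | model    |
| SipMask++ | R101-D   | 544×544    | 10x   | yes      | yes | 41.3        | 36.1         | model    |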
  • GN indicates that group normalization is used in the prediction branch.
  • Models with an input size of 800×1333 focus on high accuracy and are trained in RetinaNet style.
  • Models with an input size of 544×544 focus on speed and are trained in SSD style.
  • ++ indicates adding deformable convolutions at an interval of 3 in the backbone, plus a mask re-scoring module.

SipMask-VIS (video instance segmentation)

  • This project is an implementation of video instance segmentation based on mmdetection.
  • Please use SipMask-VIS and refer to INSTALL.md for installation.
  • Tested with PyTorch 1.1.0, CUDA 9.0/10.0, and mmcv 0.2.12.

Please note that, as with MaskTrackRCNN, running on the YouTube-VIS dataset requires the YouTube-VIS version of cocoapi rather than the original COCO cocoapi. Install it as follows.

pip install git+https://github.com/youtubevos/cocoapi.git#"egg=pycocotools&subdirectory=PythonAPI"
or
cd SipMask-VIS/pycocotools/cocoapi/PythonAPI
python setup.py build_ext install
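
After installing, a quick check can confirm which cocoapi variant is active. The sketch below rests on an assumption that is not stated in this README: that the YouTube-VIS fork adds a `pycocotools.ytvos` submodule which the original COCO package lacks. The helper name `has_youtube_vis_cocoapi` is hypothetical.

```python
import importlib.util


def has_youtube_vis_cocoapi():
    """Return True if the installed pycocotools exposes a `ytvos`
    submodule (assumed to be specific to the YouTube-VIS fork)."""
    try:
        return importlib.util.find_spec("pycocotools.ytvos") is not None
    except ImportError:
        # pycocotools itself is missing or could not be imported.
        return False


if __name__ == "__main__":
    if has_youtube_vis_cocoapi():
        print("YouTube-VIS cocoapi detected.")
    else:
        print("pycocotools.ytvos not found; the YouTube-VIS cocoapi may not be installed.")
```
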
Train with multiple GPUs
./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM}
e.g.,
CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh ./configs/sipmask/sipmask_r50_caffe_fpn_gn_1x_4gpu.py 4
Test with a single GPU
python tools/test_video.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] --eval segm
e.g.,
python ./tools/test_video.py configs/sipmask/sipmask_r50_caffe_fpn_gn_1x_4gpu.py ./work_dirs/sipmask_r50_fpn_1x.pth --out results.pkl --eval segm

If you want to save the results of video instance segmentation, please use the following command:

python tools/test_video.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] --eval segm --show --save_path=${SAVE_PATH}
  • The CONFIG_FILE for SipMask-VIS is located under SipMask-VIS/configs/sipmask.
  • A model pretrained on the MS COCO dataset is used for weight initialization.
Results
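| name    | backbone | input size | epoch | ms-train | val. mask AP | download |
|---------|----------|------------|-------|----------|--------------|----------|
| SipMask | R50      | 360 × 640  | 1x    | no       | 32.5         | model    |
| SipMask | R50      | 360 × 640  | 1x    | yes      | 33.7         | model    |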
  • The generated results on YouTube-VIS should be uploaded to codalab for evaluation.

Citation

If the project helps your research, please cite this paper.

@article{Cao_SipMask_ECCV_2020,
  author =       {Jiale Cao and Rao Muhammad Anwer and Hisham Cholakkal and Fahad Shahbaz Khan and Yanwei Pang and Ling Shao},
  title =        {SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation},
  journal =      {Proc. European Conference on Computer Vision},
  year =         {2020}
}

Acknowledgement

Many thanks to the open-source projects this work builds on: FCOS, mmdetection, YOLACT, and MaskTrack RCNN.
