cj4L / SSNM-Coseg

Licence: other
[AAAI20] Deep Object Co-segmentation via Spatial-Semantic Network Modulation (Oral paper)


Authors: Kaihua Zhang, Jin Chen, Bo Liu, Qingshan Liu

Abstract

 Object co-segmentation is to segment the shared objects in multiple relevant images, which has numerous applications in computer vision. This paper presents a spatial and semantic modulated deep network framework for object co-segmentation. A backbone network is adopted to extract multi-resolution image features. With the multi-resolution features of the relevant images as input, we design a spatial modulator to learn a mask for each image. The spatial modulator captures the correlations of image feature descriptors via unsupervised learning. The learned mask can roughly localize the shared foreground object while suppressing the background. For the semantic modulator, we model it as a supervised image classification task. We propose a hierarchical second-order pooling module to transform the image features for classification use. The outputs of the two modulators manipulate the multi-resolution features by a shift-and-scale operation so that the features focus on segmenting co-object regions. The proposed model is trained end-to-end without any intricate post-processing. Extensive experiments on four image co-segmentation benchmark datasets demonstrate the superior accuracy of the proposed method compared to state-of-the-art methods.
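The shift-and-scale step described in the abstract can be illustrated with a small sketch. This is not the repository's code: it assumes the semantic modulator yields one scale value per channel and the spatial modulator yields one shift value per location, which the abstract does not spell out. The function name `modulate` and the nested-list layout are hypothetical, chosen only to keep the example dependency-free.

```python
def modulate(features, scale, shift):
    """Apply a shift-and-scale modulation to a C x H x W feature map.

    Assumptions (not confirmed by the README):
      - scale holds one value per channel (semantic modulator output)
      - shift holds one value per spatial location (spatial modulator mask)
    """
    return [
        [
            [scale[c] * value + shift[y][x] for x, value in enumerate(row)]
            for y, row in enumerate(channel)
        ]
        for c, channel in enumerate(features)
    ]


# Toy input: 2 channels, 2 x 2 spatial resolution.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
scale = [2.0, 0.5]                # per-channel scale
shift = [[0.0, 1.0], [0.0, 0.0]]  # per-location shift

modulated = modulate(features, scale, shift)
```

In a real model the same operation would be a broadcasted multiply-add on tensors, applied at each feature resolution.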

Examples

Overview of our method

Datasets

 To compare recent deep learning methods fairly, we conduct extensive evaluations on four widely used benchmark datasets: a subset of MSRC, Internet, a subset of iCoseg, and PASCAL-VOC. Among them:

  • The MSRC subset includes 7 classes (bird, car, cat, cow, dog, plane, sheep), each containing 10 images.
  • The Internet dataset has 3 categories: airplane, car, and horse. Each category has 100 images, some of them with noisy labels.
  • The iCoseg subset contains 8 categories, each with a different number of images.
  • PASCAL-VOC is the most challenging dataset, with 1037 images from 20 categories selected from the PASCAL-VOC 2010 dataset.

Results download

Environment

  • Ubuntu 16.04, Nvidia RTX 2080Ti
  • Python 3
  • PyTorch>=1.0, TorchVision>=0.2.2
  • Numpy==1.16.2, Pillow, pycocotools
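A quick way to check the environment above is to print the installed version of each listed package. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+); the package names are taken from the list above, and the helper name `installed_versions` is hypothetical.

```python
from importlib import metadata

# Distribution names taken from the Environment list above.
REQUIRED = ["torch", "torchvision", "numpy", "Pillow", "pycocotools"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'not installed'}")
```

The script only reports versions; comparing them against the pins (for example Numpy==1.16.2) is left to the reader.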

Test

  • Download the processed dataset from Google Drive.
  • Download the pretrained VGG16-backbone model from Google Drive.
  • Modify the path config in coseg_test.py and run it.

Train

  • Get the COCO2017 dataset for training the whole network.
  • Get the test dataset for the validation and test phases.
  • Download the VGG16 pretrained weights from Google Drive. These are the official PyTorch model weights, except that the last several layers have been deleted.
  • Download dict.npy from Google Drive.
  • Modify the path config in main.py and run it.
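For readers who want to see what "official weights with the last several layers deleted" might look like, here is a minimal sketch. It assumes the deleted layers are the fully connected classifier, whose parameters torchvision's VGG16 stores under keys starting with `classifier.`, while the convolutional layers use `features.`. The README does not say exactly which layers were removed, and the helper name `keep_feature_layers` is hypothetical. A plain dict stands in for a real state dict so the example runs without PyTorch.

```python
def keep_feature_layers(state_dict, prefix="features."):
    """Return a copy of state_dict with only the keys under the given prefix.

    Assumption: the layers to drop are everything outside `features.`
    (for torchvision's VGG16, the `classifier.*` parameters).
    """
    return {key: value for key, value in state_dict.items() if key.startswith(prefix)}


# Toy stand-in for a VGG16 state dict; real values would be tensors.
toy_state_dict = {
    "features.0.weight": "conv1_1 weight",
    "features.0.bias": "conv1_1 bias",
    "classifier.0.weight": "fc6 weight",
    "classifier.6.bias": "fc8 bias",
}

backbone_only = keep_feature_layers(toy_state_dict)
```

With PyTorch installed, the same function could be applied to the dict returned by `torch.load` on the official checkpoint.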

Notes

  • Following the suggestion of the AAAI20 reviewers, we will not release the HRNet-backbone trained model, to keep comparisons with other methods fair.
  • There are some slight differences in the 'Fusion' part of the model, but they have little impact.
  • There is an incorrect value in Table 2: our HRNet J-index for 'Car' on the Internet dataset is listed as 82.5 and should be 73.9.
  • The BaiduPan share link is not working; contact me if you need it.
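The J-index in the notes above is the Jaccard index: the intersection over union of the predicted and ground-truth foreground masks. The sketch below is an illustration, not the repository's evaluation code. The function name `jaccard_index` and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def jaccard_index(pred, gt):
    """Intersection over union of two binary masks given as nested lists.

    Assumption: two empty masks count as a perfect match (returns 1.0).
    """
    intersection = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            p, g = bool(p), bool(g)
            intersection += p and g
            union += p or g
    return intersection / union if union else 1.0


# Toy 2 x 2 masks: one shared foreground pixel, three foreground pixels overall.
pred_mask = [[1, 1], [0, 0]]
gt_mask = [[1, 0], [1, 0]]

score = jaccard_index(pred_mask, gt_mask)
```

Reported scores such as 73.9 would correspond to this ratio expressed as a percentage and averaged over the images of a category.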

Schedule

  • Create github repo (2019.11.18)
  • Release arXiv pdf (2019.12.2)
  • Release AAAI20 pdf (2020.7.3)
  • All results (2020.7.3)
  • Test and Train code (2021.6.4)