All Projects → megvii-research → SOLQ

megvii-research / SOLQ

Licence: other
"SOLQ: Segmenting Objects by Learning Queries", SOLQ is an end-to-end instance segmentation framework with Transformer.

Programming Languages

python
139335 projects - #7 most used programming language
Cuda
1817 projects
C++
36643 projects - #6 most used programming language
shell
77523 projects

Projects that are alternatives of or similar to SOLQ

Speech Transformer
A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.
Stars: ✭ 565 (+255.35%)
Mutual labels:  end-to-end, transformer
kosr
Korean speech recognition based on transformer (트랜스포머 기반 한국어 음성 인식)
Stars: ✭ 25 (-84.28%)
Mutual labels:  end-to-end, transformer
max-deeplab
Unofficial implementation of MaX-DeepLab for Instance Segmentation
Stars: ✭ 84 (-47.17%)
Mutual labels:  transformer, instance-segmentation
Transformer-Transducer
PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASSP 2020)
Stars: ✭ 61 (-61.64%)
Mutual labels:  end-to-end, transformer
Kospeech
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition.
Stars: ✭ 190 (+19.5%)
Mutual labels:  end-to-end, transformer
Speech Transformer Tf2.0
transformer for ASR-systerm (via tensorflow2.0)
Stars: ✭ 90 (-43.4%)
Mutual labels:  end-to-end, transformer
Cell Detr
Official and maintained implementation of the paper Attention-Based Transformers for Instance Segmentation of Cells in Microstructures [BIBM 2020].
Stars: ✭ 26 (-83.65%)
Mutual labels:  transformer, instance-segmentation
End2end Asr Pytorch
End-to-End Automatic Speech Recognition on PyTorch
Stars: ✭ 175 (+10.06%)
Mutual labels:  end-to-end, transformer
kospeech
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.
Stars: ✭ 456 (+186.79%)
Mutual labels:  end-to-end, transformer
speech-transformer
Transformer implementation speciaized in speech recognition tasks using Pytorch.
Stars: ✭ 40 (-74.84%)
Mutual labels:  end-to-end, transformer
M3DETR
Code base for M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers
Stars: ✭ 47 (-70.44%)
Mutual labels:  transformer
php-hal
HAL+JSON & HAL+XML API transformer outputting valid (PSR-7) API Responses.
Stars: ✭ 30 (-81.13%)
Mutual labels:  transformer
deep-molecular-optimization
Molecular optimization by capturing chemist’s intuition using the Seq2Seq with attention and the Transformer
Stars: ✭ 60 (-62.26%)
Mutual labels:  transformer
TadTR
End-to-end Temporal Action Detection with Transformer. [Under review for a journal publication]
Stars: ✭ 55 (-65.41%)
Mutual labels:  transformer
DiscordEncryption
🔐 Configurable end to end encryption for Discord
Stars: ✭ 30 (-81.13%)
Mutual labels:  end-to-end
amrlib
A python library that makes AMR parsing, generation and visualization simple.
Stars: ✭ 107 (-32.7%)
Mutual labels:  transformer
Transformer tf2.0
Transfromer tensorflow2.0版本实现
Stars: ✭ 23 (-85.53%)
Mutual labels:  transformer
Embedding
Embedding模型代码和学习笔记总结
Stars: ✭ 25 (-84.28%)
Mutual labels:  transformer
COCO-dataset-explorer
Streamlit tool to explore coco datasets
Stars: ✭ 66 (-58.49%)
Mutual labels:  instance-segmentation
yolact
Tensorflow 2.x implementation YOLACT
Stars: ✭ 23 (-85.53%)
Mutual labels:  instance-segmentation

SOLQ: Segmenting Objects by Learning Queries

PWC PWC PWC PWC

This repository is an official implementation of the NeurIPS 2021 paper SOLQ: Segmenting Objects by Learning Queries.

Introduction

TL; DR. SOLQ is an end-to-end instance segmentation framework with Transformer. It directly outputs the instance masks without any box dependency.

Abstract. In this paper, we propose an end-to-end framework for instance segmentation. Based on the recently introduced DETR, our method, termed SOLQ, segments objects by learning unified queries. In SOLQ, each query represents one object and has multiple representations: class, location and mask. The object queries learned perform classification, box regression and mask encoding simultaneously in an unified vector form. During training phase, the mask vectors encoded are supervised by the compression coding of raw spatial masks. In inference time, mask vectors produced can be directly transformed to spatial masks by the inverse process of compression coding. Experimental results show that SOLQ can achieve state-of-the-art performance, surpassing most of existing approaches. Moreover, the joint learning of unified query representation can greatly improve the detection performance of original DETR. We hope our SOLQ can serve as a strong baseline for the Transformer-based instance segmentation.

Updates

  • (12/10/2021) Release D-DETR+SQR log.txt in SQR.
  • (29/09/2021) Our SOLQ has been accepted by NeurIPS 2021.
  • (14/07/2021) Higher performance (Box AP=56.5, Mask AP=46.7) is reported by training with long side 1536 on Swin-L backbone, instead of long side 1333.

Main Results

Method Backbone Dataset Box AP Mask AP Model
SOLQ R50 test-dev 47.8 39.7 google
SOLQ R101 test-dev 48.7 40.9 google
SOLQ Swin-L test-dev 55.4 45.9 google
SOLQ Swin-L & 1536 test-dev 56.5 46.7 google

Installation

The codebase is built on top of Deformable DETR.

Requirements

  • Linux, CUDA>=9.2, GCC>=5.4

  • Python>=3.7

    We recommend you to use Anaconda to create a conda environment:

    conda create -n deformable_detr python=3.7 pip

    Then, activate the environment:

    conda activate deformable_detr
  • PyTorch>=1.5.1, torchvision>=0.6.1 (following instructions here)

    For example, if your CUDA version is 9.2, you could install pytorch and torchvision as following:

    conda install pytorch=1.5.1 torchvision=0.6.1 cudatoolkit=9.2 -c pytorch
  • Other requirements

    pip install -r requirements.txt
  • Build MultiScaleDeformableAttention

    cd ./models/ops
    sh ./make.sh

Usage

Dataset preparation

Please download COCO and organize them as following:

mkdir data && cd data
ln -s /path/to/coco coco

Training and Evaluation

Training on single node

Training SOLQ on 8 GPUs as following:

sh configs/r50_solq_train.sh

Evaluation

You can download the pretrained model of SOLQ (the link is in "Main Results" session), then run following command to evaluate it on COCO 2017 val dataset:

sh configs/r50_solq_eval.sh

Evaluation on COCO 2017 test-dev dataset

You can download the pretrained model of SOLQ (the link is in "Main Results" session), then run following command to evaluate it on COCO 2017 test-dev dataset (submit to server):

sh configs/r50_solq_submit.sh

Visualization on COCO 2017 val dataset

You can visualize on image as follows:

EXP_DIR=/path/to/checkpoint
python visual.py \
       --meta_arch solq \
       --backbone resnet50 \
       --with_vector \
       --with_box_refine \
       --masks \
       --batch_size 2 \
       --vector_hidden_dim 1024 \
       --vector_loss_coef 3 \
       --output_dir ${EXP_DIR} \
       --resume ${EXP_DIR}/solq_r50_final.pth \
       --eval    

Citing SOLQ

If you find SOLQ useful in your research, please consider citing:

@article{dong2021solq,
  title={SOLQ: Segmenting Objects by Learning Queries},
  author={Dong, Bin and Zeng, Fangao and Wang, Tiancai and Zhang, Xiangyu and Wei, Yichen},
  journal={NeurIPS},
  year={2021}
}
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].