saic-vul / imvoxelnet

Licence: MIT license
[WACV2022] ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

Programming Languages

Python, C++, CUDA

Projects that are alternatives of or similar to imvoxelnet

ViP
A New 3D Detector. Code Will be made public.
Stars: ✭ 29 (-83.8%)
Mutual labels:  kitti, 3d-object-detection, nuscenes
EgoNet
Official project website for the CVPR 2021 paper "Exploring intermediate representation for monocular vehicle pose estimation"
Stars: ✭ 111 (-37.99%)
Mutual labels:  kitti, 3d-object-detection
3D-Detection-Tracking-Viewer
3D detection and tracking viewer (visualization) for kitti & waymo dataset
Stars: ✭ 150 (-16.2%)
Mutual labels:  kitti, 3d-object-detection
Awesome-3D-Object-Detection-for-Autonomous-Driving
Papers on 3D Object Detection for Autonomous Driving
Stars: ✭ 52 (-70.95%)
Mutual labels:  3d-object-detection, nuscenes
FrameNet
FrameNet: Learning Local Canonical Frames of 3D Surfaces from a Single RGB Image
Stars: ✭ 115 (-35.75%)
Mutual labels:  scannet
Indoor-SfMLearner
[ECCV'20] Patch-match and Plane-regularization for Unsupervised Indoor Depth Estimation
Stars: ✭ 115 (-35.75%)
Mutual labels:  scannet
torch-points3d
Pytorch framework for doing deep learning on point clouds.
Stars: ✭ 1,823 (+918.44%)
Mutual labels:  scannet
M3DETR
Code base for M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers
Stars: ✭ 47 (-73.74%)
Mutual labels:  3d-object-detection
FCOSR
FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection
Stars: ✭ 59 (-67.04%)
Mutual labels:  mmdetection
K-Net
[NeurIPS2021] Code Release of K-Net: Towards Unified Image Segmentation
Stars: ✭ 434 (+142.46%)
Mutual labels:  mmdetection
SpinNet
[CVPR 2021] SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration
Stars: ✭ 181 (+1.12%)
Mutual labels:  kitti
continuous-fusion
(ROS) Sensor fusion algorithm for camera+lidar.
Stars: ✭ 26 (-85.47%)
Mutual labels:  kitti
learning-topology-synthetic-data
Tensorflow implementation of Learning Topology from Synthetic Data for Unsupervised Depth Completion (RAL 2021 & ICRA 2021)
Stars: ✭ 22 (-87.71%)
Mutual labels:  kitti
DSIN
Deep Image Compression using Decoder Side Information (ECCV 2020)
Stars: ✭ 39 (-78.21%)
Mutual labels:  kitti
kitti-A-LOAM
Easy description to run and evaluate A-LOAM with KITTI-data
Stars: ✭ 28 (-84.36%)
Mutual labels:  kitti
BtcDet
Behind the Curtain: Learning Occluded Shapes for 3D Object Detection
Stars: ✭ 104 (-41.9%)
Mutual labels:  3d-object-detection
StereoNet
A customized implementation of the paper "StereoNet: guided hierarchical refinement for real-time edge-aware depth prediction"
Stars: ✭ 107 (-40.22%)
Mutual labels:  kitti
nnDetection
nnDetection is a self-configuring framework for 3D (volumetric) medical object detection which can be applied to new data sets without manual intervention. It includes guides for 12 data sets that were used to develop and evaluate the performance of the proposed method.
Stars: ✭ 355 (+98.32%)
Mutual labels:  3d-object-detection
Tools RosBag2KITTI
Conversion from ROSBAG (.bag) to image (.png) and points cloud (.bin), including ROSBAG decoding, pcd2bin and file directory extraction.
Stars: ✭ 131 (-26.82%)
Mutual labels:  kitti
SASensorProcessing
ROS node to create pointcloud out of stereo images from the KITTI Vision Benchmark Suite
Stars: ✭ 26 (-85.47%)
Mutual labels:  kitti

ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

This repository contains an implementation of the monocular/multi-view 3D object detector ImVoxelNet, introduced in our paper:

ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection
Danila Rukhovich, Anna Vorontsova, Anton Konushin
Samsung AI Center Moscow
https://arxiv.org/abs/2106.01178
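
As the title says, the method projects image content into a voxel volume. One geometric ingredient commonly used for this kind of projection is mapping each voxel center to a pixel with a pinhole camera model, so that image features can be looked up at that pixel. The sketch below shows only that mapping. It is an illustration, not code from this repository: the function name `project_points`, the intrinsics, and the voxel centers are all made up for the example.

```python
def project_points(points, fx, fy, cx, cy):
    """Project 3D points (x, y, z) in camera coordinates to pixels (u, v).

    Points at or behind the camera plane (z <= 0) map to None.
    """
    pixels = []
    for x, y, z in points:
        if z <= 0:
            pixels.append(None)
        else:
            # Standard pinhole model: scale by focal length, shift by principal point.
            pixels.append((fx * x / z + cx, fy * y / z + cy))
    return pixels


# Hypothetical intrinsics and voxel centers, chosen only for illustration.
voxel_centers = [(0.0, 0.0, 2.0), (1.0, -0.5, 4.0), (0.0, 0.0, -1.0)]
pixels = project_points(voxel_centers, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(pixels)
```

A voxel whose center projects outside the image, or lies behind the camera, has no pixel to sample from and would need to be masked out.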

Installation

For convenience, we provide a Dockerfile. Alternatively, you can install all required packages manually.

This implementation is based on the mmdetection3d framework. Please refer to the original installation guide, install.md, replacing open-mmlab/mmdetection3d with saic-vul/imvoxelnet. In addition, rotated_iou must be installed with these 4 commands.

Most of the ImVoxelNet-related code is located in the following files: detectors/imvoxelnet.py, necks/imvoxelnet.py, dense_heads/imvoxel_head.py, pipelines/multi_view.py.

Datasets

We support three benchmarks based on the SUN RGB-D dataset.

  • For the VoteNet benchmark with 10 object categories, you should follow the instructions in sunrgbd.
  • For the PerspectiveNet benchmark with 30 object categories, the same instructions apply; you only need to set the dataset argument to sunrgbd_monocular when running create_data.py.
  • The Total3DUnderstanding benchmark involves detecting objects of 37 categories along with camera pose and room layout estimation. Download the preprocessed data as train.json and val.json and put them in ./data/sunrgbd. Then run:
    python tools/data_converter/sunrgbd_total.py
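
Before running the converter for the Total3DUnderstanding benchmark, it can help to confirm that both preprocessed files are where the instructions above put them. The following is a minimal sketch; `missing_total3d_files` is a hypothetical helper and is not part of this repository.

```python
from pathlib import Path

# File names and directory are the ones given in the instructions above.
REQUIRED_FILES = ("train.json", "val.json")


def missing_total3d_files(data_root="./data/sunrgbd"):
    """Return the required files that are absent from data_root.

    Hypothetical helper for illustration; not part of the repository.
    """
    root = Path(data_root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_total3d_files()
    if missing:
        print("Missing before conversion:", ", ".join(missing))
    else:
        print("Ready to run: python tools/data_converter/sunrgbd_total.py")
```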

For ScanNet, please follow the instructions in scannet. For KITTI and nuScenes, please follow the instructions in getting_started.md.

Getting Started

Please see getting_started.md for basic usage examples.

Training

To start training, run dist_train with ImVoxelNet configs:

bash tools/dist_train.sh configs/imvoxelnet/imvoxelnet_kitti.py 8

Testing

Test a pre-trained model using dist_test with ImVoxelNet configs:

bash tools/dist_test.sh configs/imvoxelnet/imvoxelnet_kitti.py \
    work_dirs/imvoxelnet_kitti/latest.pth 8 --eval mAP
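
In the training and testing commands above, the trailing 8 is the number of GPUs, and the checkpoint sits in a work directory named after the config file. The sketch below rebuilds both commands for another config or GPU count. The helper names are hypothetical, and the rule that the work directory equals the config file stem is an assumption taken from the example paths shown here.

```python
from pathlib import Path


def default_checkpoint(config_path, work_root="work_dirs"):
    """Guess the checkpoint path used in the commands above.

    Assumes the work directory is named after the config file stem,
    as in work_dirs/imvoxelnet_kitti/latest.pth.
    """
    return (Path(work_root) / Path(config_path).stem / "latest.pth").as_posix()


def build_train_command(config_path, num_gpus):
    """Build the dist_train.sh invocation as an argument list."""
    return ["bash", "tools/dist_train.sh", config_path, str(num_gpus)]


def build_eval_command(config_path, num_gpus, metric="mAP"):
    """Build the dist_test.sh invocation as an argument list."""
    return ["bash", "tools/dist_test.sh", config_path,
            default_checkpoint(config_path), str(num_gpus), "--eval", metric]


config = "configs/imvoxelnet/imvoxelnet_kitti.py"
print(" ".join(build_train_command(config, 8)))
print(" ".join(build_eval_command(config, 8)))
```

The argument lists can be passed to `subprocess.run` from the repository root if you prefer to launch the scripts from Python.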

Visualization

Visualizations can be created with the test script. For cleaner visualizations, you may set score_thr in the configs to 0.15 or higher:

python tools/test.py configs/imvoxelnet/imvoxelnet_kitti.py \
    work_dirs/imvoxelnet_kitti/latest.pth --show \
    --show-dir work_dirs/imvoxelnet_kitti
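
To see why a higher score_thr gives cleaner pictures, consider what a score threshold usually does: it drops predictions whose confidence falls below it. The sketch below shows the idea on made-up data. The `(label, score)` pairs and the `filter_by_score` helper are hypothetical, and whether the repository compares with `>=` or `>` is not stated here.

```python
def filter_by_score(detections, score_thr=0.15):
    """Keep detections whose confidence is at least score_thr.

    Each detection is a hypothetical (label, score) pair; real model
    outputs also carry 3D boxes.
    """
    return [det for det in detections if det[1] >= score_thr]


# Made-up predictions for illustration only.
detections = [("car", 0.92), ("car", 0.14), ("car", 0.31), ("car", 0.02)]
kept = filter_by_score(detections, score_thr=0.15)
print(kept)  # [('car', 0.92), ('car', 0.31)]
```

With a threshold of 0, all four boxes would be drawn, including the two low-confidence ones.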

Models

v2 adds center sampling for the indoor scenario. v3 simplifies the 3D neck for the indoor scenario. The differences are discussed in the v2 and v3 preprints.

Dataset    Object Classes                Version  Metric          Download
SUN RGB-D  37 from Total3dUnderstanding  v1       mAP@0.15: 41.5  model | log | config
SUN RGB-D  37 from Total3dUnderstanding  v2       mAP@0.15: 42.7  model | log | config
SUN RGB-D  37 from Total3dUnderstanding  v3       mAP@0.15: 43.7  model | log | config
SUN RGB-D  30 from PerspectiveNet        v1       mAP@0.15: 44.9  model | log | config
SUN RGB-D  30 from PerspectiveNet        v2       mAP@0.15: 47.2  model | log | config
SUN RGB-D  30 from PerspectiveNet        v3       mAP@0.15: 48.7  model | log | config
SUN RGB-D  10 from VoteNet               v1       mAP@0.25: 38.8  model | log | config
SUN RGB-D  10 from VoteNet               v2       mAP@0.25: 39.4  model | log | config
SUN RGB-D  10 from VoteNet               v3       mAP@0.25: 40.7  model | log | config
ScanNet    18 from VoteNet               v1       mAP@0.25: 40.6  model | log | config
ScanNet    18 from VoteNet               v2       mAP@0.25: 45.7  model | log | config
ScanNet    18 from VoteNet               v3       mAP@0.25: 48.1  model | log | config
KITTI      Car                           v1       AP@0.7: 17.8    model | log | config
nuScenes   Car                           v1       AP: 51.8        model | log | config

Example Detections

[Figure: example detections]

Citation

If you find this work useful for your research, please cite our paper:

@inproceedings{rukhovich2022imvoxelnet,
  title={Imvoxelnet: Image to voxels projection for monocular and multi-view general-purpose 3d object detection},
  author={Rukhovich, Danila and Vorontsova, Anna and Konushin, Anton},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={2397--2406},
  year={2022}
}