
amazon-research / progressive-coordinate-transforms

Licence: Apache-2.0 license
Progressive Coordinate Transforms for Monocular 3D Object Detection, NeurIPS 2021

Programming Languages

Python, C++, shell

Progressive Coordinate Transforms for Monocular 3D Object Detection

This repository is the official implementation of PCT.

Introduction

In this paper, we propose a novel and lightweight approach, dubbed Progressive Coordinate Transforms (PCT), to facilitate learning coordinate representations for monocular 3D object detection. Specifically, a localization boosting mechanism with a confidence-aware loss is introduced to progressively refine the localization prediction. In addition, a semantic image representation is exploited to compensate for the use of patch proposals. Despite being lightweight and simple, our strategy establishes a new state of the art among monocular 3D detectors on the competitive KITTI benchmark. At the same time, PCT generalizes well to most coordinate-based 3D detection frameworks.

[Figure: PCT architecture]

Requirements

Installation

Download this repository (tested with Python 3.7, PyTorch 1.3.1, and Ubuntu 16.04.7). It also depends on packages such as cv2, yaml, and tqdm; install them with:

cd #root
pip install -r requirements

Then, you need to compile the evaluation script:

cd #root/tools/kitti_eval
sh compile.sh

Prepare your data

First, you should download the KITTI dataset, and organize the data as follows (* indicates an empty directory to store the data generated in subsequent steps):


#ROOT
  |data
    |KITTI
      |2d_detections
      |ImageSets
      |pickle_files *
      |object
        |training
          |calib
          |image_2
          |label
          |depth *
          |pseudo_lidar (optional for Pseudo-LiDAR)*
          |velodyne (optional for FPointNet)
        |testing
          |calib
          |image_2
          |depth *
          |pseudo_lidar (optional for Pseudo-LiDAR)*
          |velodyne (optional for FPointNet)
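
Before running the later steps, it can help to check that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_dirs` and the `REQUIRED_DIRS` list are illustrative, the optional `pseudo_lidar` and `velodyne` folders are skipped, and the label folder is accepted under either `label` (as in the tree) or `label_2` (as in the evaluation commands).

```python
from pathlib import Path

# Required sub-directories of data/KITTI, copied from the tree above.
# Optional folders (pseudo_lidar, velodyne) are intentionally left out.
REQUIRED_DIRS = [
    "2d_detections",
    "ImageSets",
    "pickle_files",
    "object/training/calib",
    "object/training/image_2",
    "object/training/depth",
    "object/testing/calib",
    "object/testing/image_2",
    "object/testing/depth",
]

# Assumption: the label folder may be named either way, so accept both.
LABEL_ALTERNATIVES = ("label", "label_2")


def missing_dirs(kitti_root):
    """Return the expected sub-directories that do not exist under kitti_root."""
    root = Path(kitti_root)
    missing = [d for d in REQUIRED_DIRS if not (root / d).is_dir()]
    training = root / "object" / "training"
    if not any((training / name).is_dir() for name in LABEL_ALTERNATIVES):
        missing.append("object/training/label (or label_2)")
    return missing


if __name__ == "__main__":
    # Run from the repository root; prints nothing when the layout is complete.
    for entry in missing_dirs("data/KITTI"):
        print("missing:", entry)
```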

Second, prepare your depth maps and put them in data/KITTI/object/training/depth. For ease of use, we also provide estimated depth maps (generated with the pretrained models provided by DORN and Pseudo-LiDAR).

Monocular (DORN)                | Stereo (PSMNet)
trainval (~1.6G), test (~1.6G)  | trainval (~2.5G)
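
For orientation, coordinate-based detectors turn each depth value into a 3D point in the camera frame. The sketch below shows the standard pinhole back-projection for a single pixel; it is an illustration under simplifying assumptions, not the repository's code. The function name `pixel_to_camera` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical parameters, and the baseline offsets present in KITTI projection matrices are ignored.

```python
def pixel_to_camera(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with metric depth into camera coordinates.

    Uses the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy, z = depth.
    """
    z = float(depth)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)


def depth_map_to_points(depth_map, fx, fy, cx, cy):
    """Back-project a depth map given as a list of rows into a list of 3D points."""
    return [
        pixel_to_camera(u, v, z, fx, fy, cx, cy)
        for v, row in enumerate(depth_map)
        for u, z in enumerate(row)
    ]
```

A pixel at the principal point maps to a point straight ahead of the camera, with x and y equal to zero.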

Then, generate image 2D features for the 2D bounding boxes and put them in data/KITTI/pickle_files/org. Our 2D detector is trained following the 2D detector in RTM3D. You can also use your own 2D detector for training and inference.

Finally, generate the training data using the provided scripts:

cd #root/tools/data_prepare
python patch_data_prepare_val.py --gen_train --gen_val --gen_val_detection --car_only
mv *.pickle ../../data/KITTI/pickle_files
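
After the files have been moved, a quick sanity check is to confirm that each one loads. This is a minimal sketch: the helper name `summarize_pickles` is hypothetical, nothing is assumed about the file contents, and only the first object stored in each file is read. Only load pickle files that you generated yourself.

```python
import pickle
from pathlib import Path


def summarize_pickles(pickle_dir):
    """Map each *.pickle file name in pickle_dir to the type name of its first object."""
    summary = {}
    for path in sorted(Path(pickle_dir).glob("*.pickle")):
        with open(path, "rb") as handle:
            # A file may hold several consecutive dumps; only the first is inspected.
            first_object = pickle.load(handle)
        summary[path.name] = type(first_object).__name__
    return summary


if __name__ == "__main__":
    # Run from the repository root.
    for name, kind in summarize_pickles("data/KITTI/pickle_files").items():
        print(name, kind)
```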

Prepare Waymo dataset

We also provide usage instructions for monocular 3D detection on the Waymo dataset (see Waymo Usage).

Training

Move to the workspace and train the model (you also need to modify the path of the pickle files in the config file):

 cd #root
 cd experiments/pct
 python ../../tools/train_val.py --config config_val.yaml
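
To locate the entries that need editing, one option is to list every line of the config file that mentions a pickle file. This sketch makes no assumption about the key names in config_val.yaml; the helper `pickle_path_lines` is hypothetical and simply scans the text.

```python
def pickle_path_lines(config_path):
    """Return (line_number, text) pairs for config lines that mention a pickle file."""
    hits = []
    with open(config_path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if ".pickle" in line:
                hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    import os

    # Run from experiments/pct after the cd commands above.
    if os.path.exists("config_val.yaml"):
        for number, text in pickle_path_lines("config_val.yaml"):
            print(f"{number}: {text}")
```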

Evaluation

Generate the results using the trained model:

 python ../../tools/train_val.py --config config_val.yaml --e

and evaluate the generated results using:

../../tools/kitti_eval/evaluate_object_3d_offline_ap11 ../../data/KITTI/object/training/label_2 ./output

or

../../tools/kitti_eval/evaluate_object_3d_offline_ap40 ../../data/KITTI/object/training/label_2 ./output
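
The two binaries differ in how many recall positions the average precision is sampled at. As a rough illustration of the usual definitions (not the repository's C++ evaluation code), AP11 averages the interpolated precision at recall 0, 0.1, ..., 1, while AP40 uses recall 1/40, 2/40, ..., 1 and so skips recall 0. The function name `interpolated_ap` and its inputs are assumptions made for this sketch.

```python
def interpolated_ap(recalls, precisions, num_points):
    """Average the interpolated precision over evenly spaced recall positions.

    num_points=11 samples recall at 0, 0.1, ..., 1 (AP11).
    num_points=40 samples recall at 1/40, 2/40, ..., 1 (AP40).
    """
    if num_points == 11:
        positions = [i / 10 for i in range(11)]
    elif num_points == 40:
        positions = [i / 40 for i in range(1, 41)]
    else:
        raise ValueError("num_points must be 11 or 40")

    total = 0.0
    for position in positions:
        # Interpolated precision: the best precision reached at any recall
        # greater than or equal to this position (0 if that recall is never reached).
        reachable = [p for r, p in zip(recalls, precisions) if r >= position]
        total += max(reachable, default=0.0)
    return total / len(positions)


if __name__ == "__main__":
    # Toy precision-recall curve, for illustration only.
    recalls = [0.5, 1.0]
    precisions = [1.0, 0.5]
    print("AP11:", interpolated_ap(recalls, precisions, 11))
    print("AP40:", interpolated_ap(recalls, precisions, 40))
```

Because AP11 includes the recall 0 position, which almost always has high precision, it tends to report higher values than AP40 for the same detections.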

Because data preparation is tedious, we also provide the generated results for evaluation. Unzip output.zip and then execute the above evaluation commands. The result is:

Models         | AP3D11@mod.   | AP3D11@easy   | AP3D11@hard
PatchNet + PCT | 27.53 / 34.65 | 38.39 / 47.16 | 24.44 / 28.47

Acknowledgements

This code benefits from the excellent work PatchNet, and uses the off-the-shelf models provided by DORN and RTM3D.

Citation

@article{wang2021pct,
  title={Progressive Coordinate Transforms for Monocular 3D Object Detection},
  author={Li Wang and Li Zhang and Yi Zhu and Zhi Zhang and Tong He and Mu Li and Xiangyang Xue},
  journal={arXiv preprint arXiv:2108.05793},
  year={2021}
}

Contact

For questions regarding PCT-3D, feel free to post here or directly contact the authors ([email protected]).

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.
