
princeton-vl / Rel3D

License: BSD-3-Clause
Official code for the NeurIPS 2020 paper "Rel3D: A Minimally Contrastive Benchmark for Grounding Spatial Relations in 3D"

Programming Languages

Python
Shell

Projects that are alternatives of or similar to Rel3D

lowshot-shapebias
Learning low-shot object classification with explicit shape bias learned from point clouds
Stars: ✭ 37 (+54.17%)
Mutual labels:  3d-vision
MinkLocMultimodal
MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition
Stars: ✭ 65 (+170.83%)
Mutual labels:  3d-vision
continuous-time-flow-process
PyTorch code of "Modeling Continuous Stochastic Processes with Dynamic Normalizing Flows" (NeurIPS 2020)
Stars: ✭ 34 (+41.67%)
Mutual labels:  neurips-2020
label-fusion
Volumetric Fusion of Multiple Semantic Labels and Masks
Stars: ✭ 18 (-25%)
Mutual labels:  3d-vision
pgdl
Winning Solution of the NeurIPS 2020 Competition on Predicting Generalization in Deep Learning
Stars: ✭ 36 (+50%)
Mutual labels:  neurips-2020
void-dataset
Visual Odometry with Inertial and Depth (VOID) dataset
Stars: ✭ 74 (+208.33%)
Mutual labels:  3d-vision
EgoNet
Official project website for the CVPR 2021 paper "Exploring intermediate representation for monocular vehicle pose estimation"
Stars: ✭ 111 (+362.5%)
Mutual labels:  3d-vision
DeepI2P
DeepI2P: Image-to-Point Cloud Registration via Deep Classification. CVPR 2021
Stars: ✭ 130 (+441.67%)
Mutual labels:  3d-vision
generative pose
Code for our ICCV 19 paper : Monocular 3D Human Pose Estimation by Generation and Ordinal Ranking
Stars: ✭ 63 (+162.5%)
Mutual labels:  3d-vision
awesome-point-cloud-deep-learning
Paper list of deep learning on point clouds.
Stars: ✭ 39 (+62.5%)
Mutual labels:  3d-vision
learning-topology-synthetic-data
Tensorflow implementation of Learning Topology from Synthetic Data for Unsupervised Depth Completion (RAL 2021 & ICRA 2021)
Stars: ✭ 22 (-8.33%)
Mutual labels:  3d-vision
NeuralRecon
Code for "NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video", CVPR 2021 oral
Stars: ✭ 812 (+3283.33%)
Mutual labels:  3d-vision
3D-PV-Locator
Repo for "3D-PV-Locator: Large-scale detection of rooftop-mounted photovoltaic systems in 3D" based on Applied Energy publication.
Stars: ✭ 35 (+45.83%)
Mutual labels:  neurips-2020
Trending-in-3D-Vision
An on-going paper list on new trends in 3D vision with deep learning
Stars: ✭ 42 (+75%)
Mutual labels:  3d-vision
RefRESH
Create RefRESH data: dataset tools for Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation (ECCV 2018)
Stars: ✭ 51 (+112.5%)
Mutual labels:  3d-vision
SimpleView
Official Code for ICML 2021 paper "Revisiting Point Cloud Shape Classification with a Simple and Effective Baseline"
Stars: ✭ 95 (+295.83%)
Mutual labels:  3d-vision
RandLA-Net-pytorch
🍀 Pytorch Implementation of RandLA-Net (https://arxiv.org/abs/1911.11236)
Stars: ✭ 69 (+187.5%)
Mutual labels:  3d-vision
AWP
Codes for NeurIPS 2020 paper "Adversarial Weight Perturbation Helps Robust Generalization"
Stars: ✭ 114 (+375%)
Mutual labels:  neurips-2020
PaiConvMesh
Official repository for the paper "Learning Local Neighboring Structure for Robust 3D Shape Representation"
Stars: ✭ 19 (-20.83%)
Mutual labels:  3d-vision
SpatialSense
An Adversarially Crowdsourced Benchmark for Spatial Relation Recognition
Stars: ✭ 62 (+158.33%)
Mutual labels:  spatial-relation-recognition

Rel3D: A Minimally Contrastive Benchmark for Grounding Spatial Relations in 3D
Ankit Goyal, Kaiyu Yang, Dawei Yang, Jia Deng
Neural Information Processing Systems (NeurIPS), 2020 (Spotlight)

Getting Started

First clone the repository. We will refer to the directory containing the code as Rel3D.

git clone git@github.com:princeton-vl/Rel3D.git

Requirements

The code has been tested on Linux with Python 3.6.9 and CUDA 10.2.

Install Libraries

We recommend first installing Anaconda and then creating a virtual environment.

conda create --name rel3d python=3.6

Activate the virtual environment and install the libraries. Make sure you are in Rel3D.

conda activate rel3d
pip install -r requirements.txt
conda install sed

Download Datasets and Pre-trained Models

Make sure you are in Rel3D. The download.sh script downloads all the data and the pretrained models and places them at the correct locations. First, give the script execute permission with the following command.

chmod +x download.sh

To download the data sufficient for running all experiments in Table 1, execute the following command. It downloads only the primary split of the data (~2GB).

./download.sh data_min

To download the data for running all experiments (i.e. Table 1 and Fig. 5), execute the following command. It downloads all splits of the data (~8GB), which are required for the Contrastive vs. Non-Contrastive experiments with varying dataset sizes, as well as the primary split.

./download.sh data

To download the pretrained models, execute the following command.

./download.sh pretrained_model

To download the raw data, execute the following command. It places the data in the data/20200223 folder. For each sample there is a .pkl, a .png, and a .tiff file. The .png and .tiff files store RGB and depth, respectively, at 720x1280 resolution. Information about object masks, bounding boxes, and surface normals is stored in the .pkl file. Note that ./download.sh data downloads the RGB and depth images in a compressed format, which is sufficient to reproduce all the experiments. The raw data is much larger and might not be necessary for most use cases.

WARNING: You also need to execute ./download.sh data or ./download.sh data_min to download the <split>.json files (described later). All information such as spatial relation and object category should be parsed from the <split>.json files and not from the file names.

./download.sh data_raw

If you get an error while executing the above command, you can manually download the data using the link. After downloading the zip file, extract it and place the extracted 20200223 folder inside the data folder.
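
If you downloaded the raw data, a quick way to sanity-check one sample is to read its three files directly. The snippet below is only a minimal sketch: it assumes each sample's .pkl/.png/.tiff files share a common base name and that imageio is installed, and it does not assume anything about the keys inside the .pkl file.

# Minimal sketch for inspecting one raw sample in data/20200223.
# Assumptions: the .pkl/.png/.tiff files of a sample share a base name; imageio is installed.
import glob
import os
import pickle

import imageio.v2 as imageio

raw_dir = os.path.join("data", "20200223")
pkl_path = sorted(glob.glob(os.path.join(raw_dir, "*.pkl")))[0]  # pick an arbitrary sample
base = os.path.splitext(pkl_path)[0]

rgb = imageio.imread(base + ".png")     # RGB image, expected 720x1280
depth = imageio.imread(base + ".tiff")  # depth map, expected 720x1280

with open(pkl_path, "rb") as f:
    meta = pickle.load(f)  # masks, bounding boxes, surface normals (exact keys not documented here)

print(rgb.shape, depth.shape, type(meta))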

Data Organization

All data to run the models is in the Rel3D/data folder.

The raw images are stored in the Rel3D/data/20200223 folder (in case you downloaded them).

There are 7 splits of the complete dataset. If you used ./download.sh data_min, you will have only the primary split. If you used ./download.sh data, you will have all 7 splits.

Each split is named <c/nc>_<per_train>_<c/nc>_<per_valid>, where c stands for contrastive and nc for non-contrastive. For example, the nc_0.4_nc_0.1 split means that the training and validation samples are non-contrastive, and that 40% of the complete dataset is used for training while 10% is used for validation. All experiments in Table 1 use the c_0.9_c_0.1 split. The other 6 splits are used for the Contrastive vs. Non-Contrastive experiments shown in Figure 5 of the paper. The testing data is the same for all splits.
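
As a small illustration of this naming convention (the helper below is not part of the repository), a split name can be decoded as follows:

# Hypothetical helper that decodes a split name such as "c_0.9_c_0.1"
# following the <c/nc>_<per_train>_<c/nc>_<per_valid> convention.
def parse_split(name):
    train_mode, per_train, valid_mode, per_valid = name.split("_")
    return {
        "train_contrastive": train_mode == "c",
        "per_train": float(per_train),    # fraction of the dataset used for training
        "valid_contrastive": valid_mode == "c",
        "per_valid": float(per_valid),    # fraction used for validation
    }

print(parse_split("c_0.9_c_0.1"))
# {'train_contrastive': True, 'per_train': 0.9, 'valid_contrastive': True, 'per_valid': 0.1}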

For each split, there are 10 files. The <split>.json file stores information about the split in JSON format. Each sample is represented as a dictionary, with keys storing the RGB image path (rgb), depth image path (depth), information about the camera used for rendering the image (camera_info), image dimensions (width, height), subject (subject), object (object), spatial relation (predicate), whether the spatial relation holds (label), and the simple 3D features we extracted for the experiments in Section 5 (transform_vector).
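
A quick way to inspect these fields is to load the JSON directly. The snippet below is a sketch that assumes the primary split's annotation file is data/c_0.9_c_0.1.json and that it deserializes to a list of per-sample dictionaries; adjust the path and indexing to the actual file layout.

# Sketch: peek at the annotation fields of one sample.
# Assumptions: file path as below; the file is a list of sample dictionaries with the keys listed above.
import json

with open("data/c_0.9_c_0.1.json") as f:
    samples = json.load(f)

sample = samples[0]
print(sorted(sample.keys()))
print(sample["subject"], sample["predicate"], sample["object"], sample["label"])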

We also provide <split>_<train/test/valid/stats>_<crop_or_not>.h5 files for each split. They contain the pre-processed RGB and depth images in a compressed format, which allows us to load the entire dataset into memory and speeds up training. If the *.h5 files are not present in Rel3D/data, they are generated on the fly from the raw images, as described here.
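
To see what one of these pre-processed files actually contains, you can walk its contents with h5py. This is only a sketch: the exact file name (in particular the crop/no-crop suffix) and the dataset keys inside are assumptions to check against the files you downloaded.

# Sketch: list the datasets stored in one of the pre-processed .h5 files.
# Assumption: the file name below, including the "crop" suffix, is illustrative only.
import h5py

with h5py.File("data/c_0.9_c_0.1_train_crop.h5", "r") as f:
    def show(name, obj):
        shape = getattr(obj, "shape", None)
        print(name, shape if shape is not None else "(group)")
    f.visititems(show)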

You can visualize the samples with just the *.h5 files, even without downloading the raw data. To do so, use the following command:

python dataloader.py

This runs the __main__ function inside dataloader.py and saves samples in the Rel3D directory. You can edit the arguments inside the __main__ function depending on your needs. This part of the dataloader code generates the visualizations.

Code Organization

  • Rel3D/models: Code for the various models, implemented in PyTorch.
  • Rel3D/configs: Configuration files for various models.
  • Rel3D/main.py: Script for training and testing any model.
  • Rel3D/configs.py: Hyperparameters for different models and dataloader.
  • Rel3D/dataloader.py: Code for creating a PyTorch dataloader for our dataset.
  • Rel3D/utils.py: Code for various utility functions.

Running Experiments

Training and Testing

To train, validate, and test any model, we use the main.py script. The format for running this script is as follows.

python main.py --exp-config <path to the config>

exp-config contains all information about the experiment: the training hyper-parameters, the model hyper-parameters, and the dataloader hyper-parameters. The default value of each hyper-parameter is defined in configs.py and is overwritten by the value in the exp-config. We provide an exp-config for each model in Table 1; these configs can be found in the Rel3D/configs folder. As a concrete example, to run the experiment for the DRNet model, use the command python main.py --exp-config ./configs/drnet.yaml. To run a new experiment with different hyperparameters, create a new configuration file.
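
The override behavior can be pictured as a simple dictionary merge: defaults first, YAML values on top. The snippet below is only a generic illustration of that pattern, not the repository's actual configs.py implementation, and the field names are made up.

# Generic illustration of "defaults overwritten by the exp-config"; field names are hypothetical.
import yaml

defaults = {"exp_id": "debug", "lr": 1e-3, "batch_size": 32, "model": "drnet"}

with open("configs/drnet.yaml") as f:
    overrides = yaml.safe_load(f) or {}

cfg = {**defaults, **overrides}  # values from the YAML config win over the defaults
print(cfg)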

The python main.py --exp-config <path to the config> command stores all training logs in the Rel3D/runs/EXP_ID folder, where EXP_ID is specified in the exp-config. The model that performs best on the validation set is saved as Rel3D/runs/EXP_ID/model_best.pth, and its performance is used for reporting results.
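
If you want to inspect such a checkpoint outside of main.py, a minimal sketch is shown below. EXP_ID is a placeholder, and the checkpoint's internal layout (a bare state_dict vs. a wrapper dictionary) is an assumption to verify against your own run.

# Sketch: load the best checkpoint for offline inspection.
# Assumptions: EXP_ID is your experiment id; the checkpoint is either a bare
# state_dict or a dictionary wrapping one under the "state_dict" key.
import torch

ckpt = torch.load("runs/EXP_ID/model_best.pth", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
print(type(ckpt), len(state_dict))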

Evaluate a pretrained model

We provide pretrained models. They can be downloaded using the ./download.sh pretrained_model command and are stored in the Rel3D/pretrained_model folder. To test a pretrained model, use the following command. The <model_name> has to be one of 2d, drnet, mlp_aligned, mlp_raw, pprfcn, vipcnn, or vtranse. Note that since we retrained the models, there are small differences (±0.5%) in performance from the numbers reported in the paper.

python main.py --entry test --exp-config configs/<model_name>.yaml --model-path pretrained_models/<model_name>.pth

To render images from the 3D data, please use the Rel3D_Render repository. It also contains information about extracting the 3D features used in our MLP baseline (Table 1, Columns 8-9).

If you find our research useful, consider citing it:

@article{goyal2020rel3d,
  title={Rel3D: A Minimally Contrastive Benchmark for Grounding Spatial Relations in 3D},
  author={Goyal, Ankit and Yang, Kaiyu and Yang, Dawei and Deng, Jia},
  journal={Advances in Neural Information Processing Systems},
  volume={33},
  year={2020}
}