zju3dv / Gift

Licence: agpl-3.0
Code for "GIFT: Learning Transformation-Invariant Dense Visual Descriptors via Group CNNs" NeurIPS 2019

GIFT: Learning Transformation-Invariant Dense Visual Descriptors via Group CNNs
Yuan Liu, Zehong Shen, Zhixuan Lin, Sida Peng, Hujun Bao, Xiaowei Zhou
NeurIPS 2019 Project Page

Questions and discussions are welcome!

Requirements & Compilation

  1. Requirements

Required packages are listed in requirements.txt.

Note that an old version of OpenCV (3.4.2) is needed because the code uses OpenCV's SIFT module.

The code has been tested with Python 3.7.3 and PyTorch 1.3.0.

  2. Compile hard example mining functions
cd hard_mining
python setup.py build_ext --inplace
  3. Compile extend utilities
cd utils/extend_utils
python build_extend_utils_cffi.py

Depending on where CUDA is installed on your system, you may need to revise the variables cuda_version, cuda_include and cuda_library in build_extend_utils_cffi.py.
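As a rough illustration of what these three variables usually point at, the sketch below derives candidate values from a CUDA root directory. The helper name guess_cuda_paths, the CUDA_HOME lookup, the /usr/local/cuda default and the include/lib64 layout are assumptions about a common Linux installation. None of them is taken from build_extend_utils_cffi.py, and the exact format the script expects for cuda_version is not stated here, so check the values before copying them into the script.

```python
import os
import re


def guess_cuda_paths(cuda_home=None):
    """Guess values for cuda_version, cuda_include and cuda_library.

    Assumes the usual Linux layout <root>/include and <root>/lib64.
    """
    # Fall back to the CUDA_HOME environment variable, then to a common default.
    root = cuda_home or os.environ.get("CUDA_HOME", "/usr/local/cuda")
    root = root.rstrip("/")
    # Directories such as /usr/local/cuda-10.1 carry the version in their name.
    match = re.search(r"cuda-(\d+\.\d+)$", root)
    return {
        "cuda_version": match.group(1) if match else None,
        "cuda_include": os.path.join(root, "include"),
        "cuda_library": os.path.join(root, "lib64"),
    }


if __name__ == "__main__":
    for name, value in guess_cuda_paths().items():
        print(f"{name} = {value!r}")
```

If the version cannot be read from the directory name, the helper returns None for cuda_version and the value has to be filled in by hand.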

Testing

Download pretrained models

  1. The pretrained GIFT model can be found here.

  2. The pretrained SuperPoint model can be found here.

  3. Make a directory called data and arrange these files like the following.

data/
├── superpoint/
|   └── superpoint_v1.pth
└── model/
    └── GIFT-stage2-pretrain/
        └── 20001.pth
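Before running the demo or the evaluation, it can help to confirm that the files are where the tree above puts them. The following is a minimal sketch; the helper name missing_model_files is hypothetical and not part of the repository, and it assumes the script is run from the directory that contains data/.

```python
from pathlib import Path

# Relative paths taken from the directory tree above.
EXPECTED_FILES = [
    "superpoint/superpoint_v1.pth",
    "model/GIFT-stage2-pretrain/20001.pth",
]


def missing_model_files(data_dir="data"):
    """Return the expected model files that are absent under data_dir."""
    root = Path(data_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All pretrained model files are in place.")
```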

Demo

We provide some examples of relative pose estimation in demo.ipynb.

Test on HPatches dataset

  1. Download the Resized HPatches, ER-HPatches and ES-HPatches datasets here. Optionally, you can generate these datasets from the original HPatches sequences using correspondence_database.py.

  2. Extract these datasets like the following.

data/
├── hpatches_resize/
├── hpatches_erotate/
├── hpatches_erotate_illm/
├── hpatches_escale/
└── hpatches_escale_illm/
  3. Evaluation
# use keypoints detected by superpoint and descriptors computed by GIFT
python run.py --task=eval_original \
              --det_cfg=configs/eval/superpoint_det.yaml \
              --desc_cfg=configs/eval/gift_pretrain_desc.yaml \
              --match_cfg=configs/eval/match_v2.yaml

# use keypoints detected by superpoint and descriptors computed by superpoint
python run.py --task=eval_original \
              --det_cfg=configs/eval/superpoint_det.yaml \
              --desc_cfg=configs/eval/superpoint_desc.yaml \
              --match_cfg=configs/eval/match_v2.yaml

A typical output line is

es superpoint_det gift_pretrain_desc match_v2 pck-5 0.290 -2 0.132 -1 0.057 cost 267.826 s

Its fields are the dataset name, detector name, descriptor name, matching strategy, PCK-5, PCK-2 and PCK-1, followed by the running time in seconds. Under PCK-5, a correspondence counts as correct if the distance between the matched keypoint and its ground-truth location is less than 5 pixels; PCK-2 and PCK-1 apply the same test with thresholds of 2 pixels and 1 pixel.
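The PCK criterion described above can be sketched in a few lines of plain Python. This is an illustration of the definition only, not the repository's evaluation code: the function name pck and the choice to divide by the number of matches are assumptions, since how the repository normalizes the count is not stated here.

```python
import math


def pck(matched, ground_truth, threshold):
    """Fraction of matches within `threshold` pixels of the ground truth.

    `matched` and `ground_truth` are equal-length sequences of (x, y) points.
    """
    if len(matched) != len(ground_truth):
        raise ValueError("matched and ground_truth must have the same length")
    if not matched:
        return 0.0
    # A match is correct when its Euclidean error is strictly below the threshold.
    correct = sum(
        1
        for (mx, my), (gx, gy) in zip(matched, ground_truth)
        if math.hypot(mx - gx, my - gy) < threshold
    )
    return correct / len(matched)


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    matched = [(10.0, 10.0), (53.0, 24.0), (80.0, 40.0)]
    truth = [(10.5, 10.0), (50.0, 21.0), (80.0, 41.5)]
    for t in (5, 2, 1):
        print(f"PCK-{t}: {pck(matched, truth, t):.3f}")
```

Because each tighter threshold accepts a subset of the matches accepted by the looser one, PCK-1 can never exceed PCK-2, which can never exceed PCK-5, as in the sample output.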

Test on relative pose estimation dataset

  1. Download the st_peters_squares dataset from here. (This dataset is a part of st_peters.)

  2. Extract the dataset and arrange directories like the following.

data
└── st_peters_square_dataset/
    └── test/
  3. Evaluation
# use keypoints detected by superpoint and descriptors computed by GIFT
python run.py --task=rel_pose \
              --det_cfg=configs/eval/superpoint_det.yaml \
              --desc_cfg=configs/eval/gift_pretrain_desc.yaml \
              --match_cfg=configs/eval/match_v0.yaml

# use keypoints detected by superpoint and descriptors computed by superpoint
python run.py --task=rel_pose \
              --det_cfg=configs/eval/superpoint_det.yaml \
              --desc_cfg=configs/eval/superpoint_desc.yaml \
              --match_cfg=configs/eval/match_v0.yaml

A typical output line is

sps_100_200_first_100 superpoint_det gift_pretrain_desc match_v0 ang diff 24.100 inlier 62.240 correct-5 0.170 -10 0.350 -20 0.650

Its fields are datasetname, detector_name, descriptor_name, matching_strategy, average_angle_difference, average_inlier_number, correct_rate_5_degree, correct_rate_10_degree and correct_rate_20_degree. In relative pose estimation, we compute the angle difference between the estimated rotation and the ground-truth rotation for each image pair. average_angle_difference is the mean of this angle difference over all image pairs. average_inlier_number is the average number of inlier keypoints after RANSAC. correct_rate_5_degree is the fraction of image pairs whose angle difference is less than 5 degrees; correct_rate_10_degree and correct_rate_20_degree are defined in the same way.
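To make the rotation metrics concrete, the sketch below uses the standard geodesic formula for the angle between two rotation matrices, angle = arccos((trace(R_est^T R_gt) - 1) / 2). Whether the repository computes the angle difference in exactly this way is an assumption, and the function names rotation_angle_deg, correct_rate and rot_z are hypothetical helpers written for this example.

```python
import math


def rotation_angle_deg(r_est, r_gt):
    """Angle in degrees of the relative rotation r_est^T * r_gt (3x3 nested lists)."""
    # trace(A^T B) equals the sum of element-wise products of A and B.
    trace = sum(r_est[i][j] * r_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def correct_rate(angle_diffs, threshold_deg):
    """Fraction of image pairs whose angle difference is below the threshold."""
    if not angle_diffs:
        return 0.0
    return sum(1 for a in angle_diffs if a < threshold_deg) / len(angle_diffs)


def rot_z(deg):
    """Rotation about the z axis, used only to build example inputs."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


if __name__ == "__main__":
    # Made-up rotations compared against the identity, for illustration only.
    identity = rot_z(0)
    diffs = [rotation_angle_deg(rot_z(d), identity) for d in (3, 8, 15, 40)]
    print("average angle difference:", sum(diffs) / len(diffs))
    for t in (5, 10, 20):
        print(f"correct rate at {t} degrees:", correct_rate(diffs, t))
```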

Training

  1. Download the train2014 and val2014 sets of the COCO dataset and the SUN397 dataset.

  2. Organize the files like the following.

data
├── SUN2012Images/
|   └── JPEGImages/
└── coco/
    ├── train2014/ 
    └── val2014/
  3. Training
mkdir data/record
python run.py --task=train --cfg=configs/GIFT-stage1.yaml # train group extractor (Vanilla CNN)
python run.py --task=train --cfg=configs/GIFT-stage2.yaml # train group embedder (Group CNNs)

Acknowledgements

We have used code or datasets from the following projects:

Copyright

This work is affiliated with ZJU-SenseTime Joint Lab of 3D Vision, and its intellectual property belongs to SenseTime Group Ltd.

Copyright (c) ZJU-SenseTime Joint Lab of 3D Vision. All Rights Reserved.

Licensed under the GNU AFFERO GENERAL PUBLIC LICENSE;
you may not use this file except in compliance with the License.

Everyone is permitted to copy and distribute verbatim copies 
of this license document, but changing it is not allowed.