
SOLAR: Second-Order Loss and Attention for Image Retrieval

teaser_gif

This repository contains the PyTorch implementation of our paper:

"SOLAR: Second-Order Loss and Attention for Image Retrieval"
Tony Ng, Vassileios Balntas, Yurun Tian, Krystian Mikolajczyk. ECCV 2020.
[arXiv] [short video] [long video] [ECCV Daily feature article] [OpenCV blog]

teaser

Before going further, please check out Filip Radenovic's great repository on image retrieval. Our solar-global module is heavily built upon it. If you use this code in your research, please also cite their work! [link to license]

Features

  • Complete test scripts for large-scale image retrieval with solar-global
  • Inference code for extracting local descriptors with solar-local
  • Second-order attention map visualisation for large images
  • Image matching visualisation
  • Training code for image retrieval

Requirements

Download model weights and descriptors

Begin by downloading our best models (both global and local) described in the paper, as well as the pre-computed descriptors of the 1M-distractor set.

sh download.sh

The global model is saved at data/networks/resnet101-solar-best.pth and the local model at solar_local/weights/local-solar-345-liberty.pth. The descriptors of the 1M distractors are saved in the main directory (the file is quite big, ~8 GB, so it might take a while to download).
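
If you want to sanity-check the downloads before moving on, a minimal sketch along the lines below should do; it only assumes the paths above and that the checkpoints are ordinary PyTorch files loadable with torch.load.

import os
import torch

# Paths produced by download.sh, as described above
global_ckpt = 'data/networks/resnet101-solar-best.pth'
local_ckpt = 'solar_local/weights/local-solar-345-liberty.pth'

for path in (global_ckpt, local_ckpt):
    print(path, 'exists:', os.path.isfile(path),
          f'({os.path.getsize(path) / 1e6:.1f} MB)' if os.path.isfile(path) else '')

# Assumption: a regular PyTorch checkpoint; load on CPU just to inspect it
ckpt = torch.load(global_ckpt, map_location='cpu')
print(type(ckpt), list(ckpt.keys())[:5] if isinstance(ckpt, dict) else '')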

Testing our global descriptor

Here you can try out our pretrained model resnet101-solar-best.pth on the Revisiting Oxford and Paris datasets.

Testing on R-Oxford, R-Paris
Once you've successfully downloaded the global model weights, run
python3 -m solar_global.examples.test

This script automatically downloads roxford5k and rparis6k into data/test/ and evaluates SOLAR on them. After a while, you should see results like the following

>> roxford5k: mAP E: 85.88, M: 69.9, H: 47.91
>> roxford5k: mP@k[1, 5, 10] E: [94.12 92.45 88.8 ], M: [94.29 90.86 86.71], H: [88.57 74.29 63.  ]

>> rparis6k: mAP E: 92.95, M: 81.57, H: 64.45
>> rparis6k: mP@k[1, 5, 10] E: [100.   96.57 95.43], M: [100.   98.   97.14], H: [97.14 94.57 93.  ]
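
For reference, mP@k is the precision among the top-k ranked database images, averaged over all queries, and mAP is the usual mean average precision under the Easy/Medium/Hard protocols. A toy sketch of the precision@k computation, independent of the repository's evaluation code:

import numpy as np

def precision_at_k(ranked_ids, positive_ids, k):
    """Fraction of the top-k ranked database images that are relevant."""
    topk = ranked_ids[:k]
    return np.isin(topk, list(positive_ids)).mean()

# Toy example: one query, database images ranked by descending similarity
ranked = np.array([3, 7, 1, 9, 4])
positives = {3, 1, 4}
print([precision_at_k(ranked, positives, k) for k in (1, 5)])  # [1.0, 0.6]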

Retrieval rankings are visualised in specs/ using

tensorboard --logdir specs/ --samples_per_plugin images=1000

You can view them on your browser at localhost:6006 in the IMAGES tab. Here's an example

ranks

You can also switch to the PROJECTOR tab and play around with TensorBoard's embedding visualisation tool. Here's an example of the 6322 database images in R-Paris, visualised with t-SNE

embeddings
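
If you want to build a similar projector visualisation from your own descriptors, PyTorch's standard SummaryWriter.add_embedding is enough; the tensors below are random placeholders, not variables from this repository.

import torch
from torch.utils.tensorboard import SummaryWriter

# Placeholders: N L2-normalised global descriptors and optional thumbnails (N, 3, H, W)
descriptors = torch.nn.functional.normalize(torch.randn(100, 2048), dim=1)
thumbnails = torch.rand(100, 3, 32, 32)

writer = SummaryWriter(log_dir='specs/projector_demo')
writer.add_embedding(descriptors, label_img=thumbnails, tag='global_descriptors')
writer.close()
# Then: tensorboard --logdir specs/ and open the PROJECTOR tab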

Testing with the extra 1-million distractors

If you decide to extract the descriptors on your own, run

(Note: this step takes a lot of time and storage, and we only provide it for verification. You can skip to the next command if you've already downloaded the pre-computed descriptors from the previous step!)

python3 -m solar_global.examples.extract_1m

This script will download and extract the 1M-distractor set and save it into data/test/revisitop1m/. The dataset is quite large (400GB+), so depending on your network and GPU, downloading and extracting the descriptors can take from a couple of days to a week. In our setting (~100MBps, V100), the download and extraction take ~10 hours, and computing the descriptors takes ~30 hours.

Now, make sure that resnet101-solar-best.pth_vecs_revisitop1m.pt is in the main directory, whether from the extraction step above or from the earlier download. Then you can run

python3 -m solar_global.examples.test_1m

and get results as below

>> roxford5k: mAP E: 72.04, M: 53.49, H: 29.89
>> roxford5k: mP@k[1, 5, 10] E: [88.24 81.99 76.96], M: [88.57 82.29 76.71], H: [74.29 58.29 48.86]

>> rparis6k: mAP E: 83.35, M: 59.19, H: 33.41
>> rparis6k: mP@k[1, 5, 10] E: [98.57 95.14 93.57], M: [98.57 96.29 94.86], H: [92.86 89.14 81.57]
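
If you only need the pre-computed distractor descriptors for your own experiments, the .pt file can be loaded directly. The snippet below is a sketch; the exact tensor layout is an assumption, and ranking by inner product relies on the descriptors being L2-normalised.

import torch

# The ~8 GB file downloaded (or extracted) above; load on CPU just to inspect it
vecs_1m = torch.load('resnet101-solar-best.pth_vecs_revisitop1m.pt', map_location='cpu')
print(type(vecs_1m), getattr(vecs_1m, 'shape', None))

# Assumption: L2-normalised global descriptors, so similarity to a query descriptor
# of matching dimensionality reduces to an inner product, e.g.
# scores = query_vec @ vecs_1m   # if vecs_1m is laid out as (D, N)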

Visualising second-order attention maps

Using our interactive visualisation tool

We provide a small demo for you to click around an image and interactively visualise the second-order attention (SOA) maps at different locations you select. (c.f. Section 4.3 in the paper for an in-depth analysis)

First, run

python3 -m demo.interactive_soa

This gorgeous image of the Eiffel Tower should pop up in a new window

demo

Try drawing a (light-green) rectangle centred at the location for which you would like to visualise the SOA map

demo

A new window titled Second order attention should appear as below, showing the SOA from the closest location in the feature map overlaid on the image, with a white dot indicating the location you selected

demo

Now, try drawing a rectangle in the sky; you should see the SOA become more spread out, silhouetting the main landmarks, like this

demo

You can keep clicking around the image to visualise more SOAs. Remember, the white dot in the SOA map tells you where the currently displayed attention map is selected from!

You can also try out different images by passing an image path to the programme with

python3 -m demo.interactive_soa --image PATH/TO/YOUR/IMAGE
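
For intuition, each map shown in the demo is, conceptually, the attention of one selected query location over every other location of the CNN feature map. The sketch below computes such a non-local-style map with made-up 1x1 projections; it is illustrative only and is not the repository's SOA module.

import torch
import torch.nn.functional as F

def soa_map_for_location(feat, y, x, q_proj, k_proj):
    """Attention map of a single query location over all spatial locations.

    feat: (C, H, W) feature map; q_proj/k_proj: 1x1 conv projections (illustrative).
    """
    C, H, W = feat.shape
    q = q_proj(feat.unsqueeze(0))           # (1, C', H, W)
    k = k_proj(feat.unsqueeze(0))           # (1, C', H, W)
    q_vec = q[0, :, y, x]                   # (C',) query at the selected location
    logits = k[0].flatten(1).T @ q_vec      # (H*W,) similarity to every location
    return F.softmax(logits, dim=0).reshape(H, W)

# Toy usage with random weights; in the demo the map is overlaid on the image
feat = torch.randn(256, 64, 64)
q_proj = torch.nn.Conv2d(256, 64, kernel_size=1)
k_proj = torch.nn.Conv2d(256, 64, kernel_size=1)
attn = soa_map_for_location(feat, 10, 20, q_proj, k_proj)
print(attn.shape, float(attn.sum()))  # torch.Size([64, 64]) 1.0
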
Jupyter-Notebook

Coming Soon!

Testing our local descriptor

Simple inference

We provide bare-bones inference code for the local counterpart of SOLAR (Section 5.3 in the paper), so you can plug it into whatever applications you have for local descriptors.

To check that it works, run

python3 -m solar_local.example

If successful, it should display the following message

SOLAR_LOCAL - SOSNet w/ SOA layers:
SOA_3:
Num channels:    in   out   mid
                 64    64    16
SOA_4:
Num channels:    in   out   mid
                 64    64    16
SOA_5:
Num channels:    in   out   mid
                128   128    64
Descriptors shape torch.Size([512, 128])
Jupyter-Notebook

Follow our demo notebook to see a comparison between solar_local and the baseline SOSNet on an image-matching toy example.
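
If you would rather use the local descriptors outside the notebook, a common starting point is mutual nearest-neighbour matching between two images' descriptor sets. The sketch below is generic; it only assumes each image yields an (N, 128) tensor of L2-normalised descriptors, as in the output above.

import torch

def mutual_nn_matches(desc_a, desc_b):
    """Indices (i, j) where a_i and b_j are each other's nearest neighbours.

    desc_a: (Na, D), desc_b: (Nb, D); assumed L2-normalised local descriptors.
    """
    sim = desc_a @ desc_b.T                   # cosine similarity matrix (Na, Nb)
    nn_ab = sim.argmax(dim=1)                 # best b for each a
    nn_ba = sim.argmax(dim=0)                 # best a for each b
    ids_a = torch.arange(desc_a.shape[0])
    mutual = nn_ba[nn_ab] == ids_a            # keep only mutually consistent pairs
    return ids_a[mutual], nn_ab[mutual]

# Toy usage with random descriptors shaped like the output above (512 x 128)
a = torch.nn.functional.normalize(torch.randn(512, 128), dim=1)
b = torch.nn.functional.normalize(torch.randn(512, 128), dim=1)
ia, ib = mutual_nn_matches(a, b)
print(ia.shape, ib.shape)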

Training SOLAR

Pre-processing the training set

As the GL18 dataset consists of only URLs, many of which have already expired, this part of the code lets you download the images we had at the time of training our models. However, extra storage space is required for extracting tarballs, so please expect to need upwards of ~700GB of free space. Otherwise, you could still download the images using GL18's downloader and save them at data/train/gl18/jpg.

To download the images and pre-process them for training, simply run

sh gl18_preprocessing.sh

This will take some time, but you should then see around 1 million images in data/train/gl18/jpg as well as the pickle file data/train/gl18/db_gl18.pkl required for training.

If you downloaded the images from the URLs directly, please also make sure you download train.csv, boxes_split1.csv and boxes_split2.csv and save them into data/train/gl18. Then you can run

cd data/train/gl18 && python3 create_db_pickle.py

You should then see data/train/gl18/db_gl18.pkl successfully created.
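
If you want to peek inside the generated pickle before training, a quick check like the one below works; the file's internal structure is the repository's own format, so we only inspect top-level types and keys here.

import pickle

# Quick sanity check of the generated training database
with open('data/train/gl18/db_gl18.pkl', 'rb') as f:
    db = pickle.load(f)

print(type(db))
if isinstance(db, dict):
    print(list(db.keys()))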

Training

Once you've downloaded and pre-processed GL18, you can start the training with the settings described in the paper by running

python3 -m solar_global.examples.train specs/gl18 --training-dataset 'gl18' --test-datasets 'roxford5k,rparis6k' --arch 'resnet101' --pool 'gem' --p 3 --loss 'triplet' --pretrained-type 'gl18' --loss-margin 1.25 --optimizer 'adam' --lr 1e-6 -ld 1e-2 --neg-num 5 --query-size 2000 --pool-size 20000 --batch-size 8 --image-size 1024 --update-every 1 --whitening --soa --soa-layers '45' --sos --lambda 10 --no-val --print-freq 10 --flatten-desc
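
For orientation, the flags above configure a triplet loss with margin 1.25 plus a second-order similarity (SOS) term weighted by --lambda. The sketch below is only a schematic of that combination on L2-normalised descriptors, not the repository's implementation.

import torch
import torch.nn.functional as F

def triplet_sos_loss(anchor, positive, negative, margin=1.25, lam=10.0):
    """Schematic triplet loss with a second-order similarity (SOS) term.

    anchor/positive/negative: (B, D) L2-normalised global descriptors.
    """
    d_ap = F.pairwise_distance(anchor, positive)
    d_an = F.pairwise_distance(anchor, negative)
    triplet = F.relu(d_ap - d_an + margin).mean()

    # SOS idea: distances between pairs of anchors should match the distances
    # between the corresponding positives (second-order consistency)
    dist_a = torch.cdist(anchor, anchor)
    dist_p = torch.cdist(positive, positive)
    sos = (dist_a - dist_p).pow(2).sum(dim=1).sqrt().mean()

    return triplet + lam * sos

# Toy usage
a = F.normalize(torch.randn(8, 2048), dim=1)
p = F.normalize(torch.randn(8, 2048), dim=1)
n = F.normalize(torch.randn(8, 2048), dim=1)
print(triplet_sos_loss(a, p, n).item())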

You can monitor the training losses and image pairs with tensorboard

tensorboard --logdir specs/

Citation

If you use this repository in your work, please cite our paper:

@inproceedings{ng2020solar,
    author    = {Ng, Tony and Balntas, Vassileios and Tian, Yurun and Mikolajczyk, Krystian},
    title     = {{SOLAR}: Second-Order Loss and Attention for Image Retrieval},
    booktitle = {ECCV},
    year      = {2020}
}