Cuberick-Orion / CIRR

License: MIT
Official repository of the ICCV 2021 paper Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models.

Projects that are alternatives of or similar to CIRR

Person Reid Triplet Loss
Person re-ID baseline with triplet loss
Stars: ✭ 165 (+153.85%)
Mutual labels:  image-retrieval
Pytorch Image Retrieval
A PyTorch framework for an image retrieval task including implementation of N-pair Loss (NIPS 2016) and Angular Loss (ICCV 2017).
Stars: ✭ 203 (+212.31%)
Mutual labels:  image-retrieval
University1652 Baseline
ACM Multimedia 2020 University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization 🚁 Annotates 1,652 buildings at 72 universities around the world.
Stars: ✭ 232 (+256.92%)
Mutual labels:  image-retrieval
Cnn Cbir Benchmark
CNN CBIR benchmark (ongoing)
Stars: ✭ 171 (+163.08%)
Mutual labels:  image-retrieval
Semantic Embeddings
Hierarchy-based Image Embeddings for Semantic Image Retrieval
Stars: ✭ 196 (+201.54%)
Mutual labels:  image-retrieval
Retrieval 2017 Cam
Class-Weighted Convolutional Features for Image Retrieval (BMVC 2017)
Stars: ✭ 219 (+236.92%)
Mutual labels:  image-retrieval
Revisitop
Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking
Stars: ✭ 147 (+126.15%)
Mutual labels:  image-retrieval
Sola
Scene search on Liresolr for animation (and video).
Stars: ✭ 253 (+289.23%)
Mutual labels:  image-retrieval
Deep Fashion Retrieval
Simple image retrieval on the DeepFashion dataset with PyTorch - a course project
Stars: ✭ 197 (+203.08%)
Mutual labels:  image-retrieval
Person reid baseline pytorch
Pytorch ReID: A tiny, friendly, strong PyTorch implementation of an object re-identification baseline. Tutorial 👉https://github.com/layumi/Person_reID_baseline_pytorch/tree/master/tutorial
Stars: ✭ 2,963 (+4458.46%)
Mutual labels:  image-retrieval
Revisiting deep metric learning pytorch
(ICML 2020) This repo contains code for our paper "Revisiting Training Strategies and Generalization Performance in Deep Metric Learning" (https://arxiv.org/abs/2002.08473) to facilitate consistent research in the field of Deep Metric Learning.
Stars: ✭ 172 (+164.62%)
Mutual labels:  image-retrieval
Affnet
Code and weights for local feature affine shape estimation paper "Repeatability Is Not Enough: Learning Discriminative Affine Regions via Discriminability"
Stars: ✭ 191 (+193.85%)
Mutual labels:  image-retrieval
Image Text Embedding
TOMM2020 Dual-Path Convolutional Image-Text Embedding https://arxiv.org/abs/1711.05535
Stars: ✭ 223 (+243.08%)
Mutual labels:  image-retrieval
Cnnimageretrieval
CNN Image Retrieval in MatConvNet: Training and evaluating CNNs for Image Retrieval in MatConvNet
Stars: ✭ 168 (+158.46%)
Mutual labels:  image-retrieval
Delf Pytorch
PyTorch Implementation of "Large-Scale Image Retrieval with Attentive Deep Local Features"
Stars: ✭ 245 (+276.92%)
Mutual labels:  image-retrieval
Pytorch deephash
Pytorch implementation of Deep Learning of Binary Hash Codes for Fast Image Retrieval, CVPRW 2015
Stars: ✭ 148 (+127.69%)
Mutual labels:  image-retrieval
Caffe Deepbinarycode
Supervised Semantics-preserving Deep Hashing (TPAMI18)
Stars: ✭ 206 (+216.92%)
Mutual labels:  image-retrieval
SegSwap
(CVPRW 2022) Learning Co-segmentation by Segment Swapping for Retrieval and Discovery
Stars: ✭ 46 (-29.23%)
Mutual labels:  image-retrieval
Openunreid
PyTorch open-source toolbox for unsupervised or domain adaptive object re-ID.
Stars: ✭ 250 (+284.62%)
Mutual labels:  image-retrieval
Map Based Visual Localization
A general framework for map-based visual localization. It contains 1) map generation, supporting traditional or deep-learning features; 2) hierarchical localization in visual (point or line) maps; 3) a fusion framework with IMU, wheel odometry, and GPS sensors.
Stars: ✭ 229 (+252.31%)
Mutual labels:  image-retrieval

Composed Image Retrieval on Real-life Images

This repository contains the Composed Image Retrieval on Real-life images (CIRR) dataset.

For details, please see our ICCV 2021 paper, Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models.

If you find this repository useful, we would appreciate it if you could give us a star.

You are currently viewing the Dataset repository. For more information, see our Project homepage.

If you wish to develop on this task using our codebase, we recommend first checking out our Code repository, setting up the code locally, then downloading the dataset.

News and Upcoming Updates

Please note there is a typo in our paper (Table 2): the number of pairs in val is 4,181, not 4,184.

News
  • Oct. 2021: We have uploaded our ICCV video.
  • Aug. 2021: We have updated our test-split server to include the Recall_Subset evaluation.
  • Aug. 2021: We have opened our test-split evaluation server.
  • Aug. 2021: We are releasing our dataset and code for the project.

Planned updates
  • Upload the TIRG implementation to our codebase (hosted separately).

Download CIRR Dataset

Our dataset is structured similarly to Fashion-IQ, an existing dataset for this task.

Annotations

Obtain the annotations by:

# create a `data` folder at your desired location
mkdir data
cd data

# clone the cirr_dataset branch to the local data/cirr folder
git clone -b cirr_dataset [email protected]:Cuberick-Orion/CIRR.git cirr

The data/cirr folder contains all relevant annotations. The file structure is described below.
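
As a quick sanity check, the annotations can be loaded with standard Python. A minimal sketch, assuming you substitute the VER placeholder with the version tag of your downloaded files:

import json

# Load one annotation file and count its query-target pairs.
# VER is a placeholder -- substitute the version tag of your download.
with open("data/cirr/captions/cap.VER.train.json") as f:
    train_pairs = json.load(f)

print(f"{len(train_pairs)} training pairs loaded")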

Pre-extracted Image Features

Two types of pre-extracted image features are provided, matching the img_feat_<...> folders in the file structure below: Faster R-CNN region features (img_feat_frcnn) and ResNet-152 features (img_feat_res152).

Each zip file we provide contains a folder of per-image feature files in .pkl format.

Once downloaded, unzip them into data/cirr/, following the file structure below.
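
To verify a download, individual feature files can be inspected with pickle. A minimal sketch, assuming a hypothetical image ID (the exact contents of each .pkl depend on the feature type):

import pickle

# Inspect one feature file; <IMG0_ID> is the placeholder used in the file
# structure below. Note the feature folders are named train/dev/test1.
img_id = "<IMG0_ID>"  # e.g. "test1-147-1-img1"
with open(f"data/cirr/img_feat_res152/train/{img_id}.pkl", "rb") as f:
    feat = pickle.load(f)

print(type(feat))  # shape/keys depend on the feature type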

Raw Images

Training and testing on CIRR do not require raw images. However, should you want to access them, please refer to our image source NLVR2.

Note: We do not recommend downloading the images via their URLs, as too many of the links are broken. Instead, we suggest following the instructions here to access the images directly. To quote the authors:

To obtain access, please fill out the linked Google Form. This form asks for your basic information and asks you to agree to our Terms of Service. We will get back to you within a week. If you have any questions, please email [email protected].

Dataset File Structure

The downloaded dataset should have the following structure:
data
└─── cirr
    ├─── captions
    │        cap.VER.test1.json
    │        cap.VER.train.json
    │        cap.VER.val.json
    ├─── captions_ext
    │        cap.ext.VER.test1.json
    │        cap.ext.VER.train.json
    │        cap.ext.VER.val.json
    ├─── image_splits
    │        split.VER.test1.json
    │        split.VER.train.json
    │        split.VER.val.json
    ├─── img_feat_frcnn  
    │    ├── train      
    │    │      <IMG0_ID>.pkl
    │    │      <IMG1_ID>.pkl
    │    │           ...
    │    ├── dev         
    │    │      <IMG0_ID>.pkl
    │    │      <IMG1_ID>.pkl
    │    │           ...
    │    └── test1       
    │           <IMG0_ID>.pkl
    │           <IMG1_ID>.pkl
    │                ...
    ├─── img_feat_res152 
    │        <Same subfolders as above>
    └─── img_raw         
              <Same subfolders as above>

Dataset File Description

  • captions/cap.VER.SPLIT.json

    • A list of elements, where each element contains the core information on a query-target pair (see the parsing sketch after this list).

    • Details on each entry can be found in the supp. mat. Sec. G of our paper.

    • Example:
          {"pairid": 12063, 
          "reference":   "test1-147-1-img1", 
          "target_hard": "test1-83-0-img1", 
          "target_soft": {"test1-83-0-img1": 1.0}, 
          "caption": "remove all but one dog and add a woman hugging   it", 
          "img_set": {"id": 1, 
                      "members": ["test1-147-1-img1", 
                                  "test1-1001-2-img0",  
                                  "test1-83-1-img1",           
                                  "test1-359-0-img1",  
                                  "test1-906-0-img1", 
                                  "test1-83-0-img1"],
                      "reference_rank": 3, 
                      "target_rank": 4}
          }
  • captions_ext/cap.ext.VER.SPLIT.json

    • A list of elements, where each element contains auxiliary annotations on a query-target pair.

    • Details on the auxiliary annotations can be found in the supp. mat. Sec. C of our paper.

    • Example:
          {"pairid": 12063, 
          "reference":   "test1-147-1-img1", 
          "target_hard": "test1-83-0-img1", 
          "caption_extend": {"0": "being a photo of dogs", 
                            "1": "add a big dog", 
                            "2": "more focused on the hugging", 
                            "3": "background should contain grass"}
          }
  • image_splits/split.VER.SPLIT.json

    • A dictionary, where each key:value pair maps an image filename to the relative path of the image file, for example:
      "test1-147-1-img1": "./test1/test1-147-1-img1.png",
    • Image filenames are preserved from the NLVR2 dataset.
  • img_feat_<...>/

    • A folder containing one type of pre-extracted image features; each file stores the features of one image.
    • The filename is generated as:
      <IMG0_ID> = "test1-147-1-img1.png".replace('.png','.pkl')
      yielding test1-147-1-img1.pkl in this case, so each feature file can be indexed directly by its image name.
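
Tying these files together, below is a minimal parsing sketch; the paths and the VER placeholder are assumptions, so adjust them to your download:

import json

SPLIT = "train"
ROOT = "data/cirr"

# VER is a placeholder -- substitute the version tag of your download.
with open(f"{ROOT}/captions/cap.VER.{SPLIT}.json") as f:
    pairs = json.load(f)  # list of query-target pair dicts

with open(f"{ROOT}/image_splits/split.VER.{SPLIT}.json") as f:
    name_to_relpath = json.load(f)  # image filename -> relative image path

pair = pairs[0]
print(pair["pairid"], "-", pair["caption"])
print("reference:", name_to_relpath[pair["reference"]])
print("target:   ", name_to_relpath[pair["target_hard"]])

# Feature files reuse the image filename with a .pkl extension:
img_name = name_to_relpath[pair["reference"]].split("/")[-1]
print("feature:  ", f"{ROOT}/img_feat_res152/{SPLIT}/{img_name.replace('.png', '.pkl')}")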

Test-split Evaluation Server

We do not publish the ground truth for the test split of CIRR. Instead, an evaluation server is hosted here, should you wish to publish results on the test split. The functionality of the test-split server will be updated incrementally.

See test-split server instructions.

The server is hosted independently at CECS ANU, so please email us if the site is down.
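
Since the test-split ground truth is withheld, local development is typically validated on the val split. A minimal sketch of Recall@K and the Recall_Subset variant mentioned above, assuming hypothetical inputs computed from your model's rankings:

# ranked: pairid -> candidate image names sorted by decreasing model score
# target: pairid -> the target_hard image name
# subset: pairid -> the pair's img_set members, excluding the reference

def recall_at_k(ranked, target, k):
    hits = sum(target[p] in cands[:k] for p, cands in ranked.items())
    return hits / len(ranked)

def recall_subset_at_k(ranked, target, subset, k):
    hits = 0
    for p, cands in ranked.items():
        restricted = [c for c in cands if c in subset[p]]
        hits += target[p] in restricted[:k]
    return hits / len(ranked)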

License

  • We have licensed the annotations of CIRR under the MIT License. Please refer to the LICENSE file for details.

  • Following NLVR2's licensing, we do not license the images used in CIRR, as we do not hold the copyright to them.

  • The images used in CIRR are sourced from the NLVR2 dataset. Users shall be bound by its Terms of Service.

Citation

Please cite our paper if it helps your research:

@inproceedings{Liu:ICCV2021,
  author    = {Zheyuan Liu and
               Cristian Rodriguez and
               Damien Teney and
               Stephen Gould},
  title     = {Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models},
  booktitle = {ICCV},
  year      = {2021}
}

Contact

If you have any questions regarding our dataset, model, or publication, please create an issue in the project repository, or email [email protected].
