layumi / Image Text Embedding

License: MIT
TOMM2020 Dual-Path Convolutional Image-Text Embedding https://arxiv.org/abs/1711.05535

Programming language: MATLAB

Projects that are alternatives to, or similar to, Image Text Embedding

Batch Dropblock Network
Official source code of "Batch DropBlock Network for Person Re-identification and Beyond" (ICCV 2019)
Stars: ✭ 304 (+36.32%)
Mutual labels:  image-retrieval, person-reidentification
Person reid baseline pytorch
PyTorch ReID: a tiny, friendly, strong PyTorch implementation of an object re-identification baseline. Tutorial 👉 https://github.com/layumi/Person_reID_baseline_pytorch/tree/master/tutorial
Stars: ✭ 2,963 (+1228.7%)
Mutual labels:  image-retrieval, person-reidentification
Dg Net
Joint Discriminative and Generative Learning for Person Re-identification. CVPR'19 (Oral)
Stars: ✭ 1,042 (+367.26%)
Mutual labels:  image-retrieval, person-reidentification
Person Reid gan
ICCV2017 Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in vitro
Stars: ✭ 301 (+34.98%)
Mutual labels:  image-retrieval, person-reidentification
Fast Reid
SOTA Re-identification Methods and Toolbox
Stars: ✭ 2,287 (+925.56%)
Mutual labels:  image-retrieval, person-reidentification
Person Reid Gan Pytorch
A Pytorch Implementation of "Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in vitro"(ICCV17)
Stars: ✭ 147 (-34.08%)
Mutual labels:  person-reidentification
Proxy Anchor Cvpr2020
Official PyTorch Implementation of Proxy Anchor Loss for Deep Metric Learning, CVPR 2020
Stars: ✭ 188 (-15.7%)
Mutual labels:  image-retrieval
Reid Mgn
Reproduction of paper: Learning Discriminative Features with Multiple Granularities for Person Re-Identification
Stars: ✭ 145 (-34.98%)
Mutual labels:  person-reidentification
Liresolr
Putting LIRE into Solr - an ongoing project
Stars: ✭ 140 (-37.22%)
Mutual labels:  image-retrieval
Caffe Deepbinarycode
Supervised Semantics-preserving Deep Hashing (TPAMI18)
Stars: ✭ 206 (-7.62%)
Mutual labels:  image-retrieval
Deep Fashion Retrieval
Simple image retrieval on the DeepFashion dataset with PyTorch. A course project
Stars: ✭ 197 (-11.66%)
Mutual labels:  image-retrieval
Revisiting deep metric learning pytorch
(ICML 2020) This repo contains code for our paper "Revisiting Training Strategies and Generalization Performance in Deep Metric Learning" (https://arxiv.org/abs/2002.08473) to facilitate consistent research in the field of Deep Metric Learning.
Stars: ✭ 172 (-22.87%)
Mutual labels:  image-retrieval
Pytorch deephash
Pytorch implementation of Deep Learning of Binary Hash Codes for Fast Image Retrieval, CVPRW 2015
Stars: ✭ 148 (-33.63%)
Mutual labels:  image-retrieval
Affnet
Code and weights for local feature affine shape estimation paper "Repeatability Is Not Enough: Learning Discriminative Affine Regions via Discriminability"
Stars: ✭ 191 (-14.35%)
Mutual labels:  image-retrieval
Revisitop
Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking
Stars: ✭ 147 (-34.08%)
Mutual labels:  image-retrieval
Learning Via Translation
Image-Image Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for Person Re-identification (https://arxiv.org/pdf/1711.07027.pdf). CVPR2018
Stars: ✭ 202 (-9.42%)
Mutual labels:  person-reidentification
Attribute Aware Attention
[ACM MM 2018] Attribute-Aware Attention Model for Fine-grained Representation Learning
Stars: ✭ 143 (-35.87%)
Mutual labels:  person-reidentification
Cnn Cbir Benchmark
CNN CBIR benchmark (ongoing)
Stars: ✭ 171 (-23.32%)
Mutual labels:  image-retrieval
Semantic Embeddings
Hierarchy-based Image Embeddings for Semantic Image Retrieval
Stars: ✭ 196 (-12.11%)
Mutual labels:  image-retrieval
Self Similarity Grouping
Self-similarity Grouping: A Simple Unsupervised Cross Domain Adaptation Approach for Person Re-identification (ICCV 2019, Oral)
Stars: ✭ 171 (-23.32%)
Mutual labels:  person-reidentification

Dual-Path Convolutional Image-Text Embedding

[Paper] [Slide]

This repository contains the code for our paper Dual-Path Convolutional Image-Text Embedding. Thank you for your kind attention.

Some News

11 June 2020: People live in a 3D world. We have released new person re-ID code, Person Re-identification in the 3D Space, which conducts representation learning in 3D space. You are welcome to check it out.

30 April 2020: We won the AICity Challenge 2020 at CVPR 2020 with the 1st-place submission to the retrieval track 🚗. Check it out here.

01 March 2020: We have released a new image retrieval dataset, University-1652, for drone-view target localization and drone navigation 🚁. Its setting is similar to that of person re-ID. You are welcome to check it out.

What's New: We updated the paper to its second version, adding more explanation of the mechanism behind the proposed instance loss.

Install MatConvNet

I have included my copy of MatConvNet in this repo, so you do not need to download it again. You just need to uncomment and modify some lines in gpu_compile.m and run it in MATLAB. Try it~ (The code does not support cuDNN 6.0. You can either turn off Enablecudnn or use cuDNN 5.1.)

If compilation fails, refer to http://www.vlfeat.org/matconvnet/install/
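
If you prefer to start the build from a terminal, the compile step can be sketched as below. This is a minimal sketch and not part of the repository: it assumes `matlab` is on your PATH, that you run it from the directory containing gpu_compile.m, and that you have already edited gpu_compile.m as described above. The helper names `need` and `compile_matconvnet` are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the given command is on PATH.
need() {
  command -v "$1" >/dev/null 2>&1 || { echo "missing command: $1" >&2; return 1; }
}

# Run gpu_compile.m in a headless MATLAB session (assumes the current
# directory contains gpu_compile.m). The session exits non-zero on error.
compile_matconvnet() {
  need matlab || return 1
  # nvcc is only reported, not required: your setup may locate CUDA differently.
  need nvcc || echo "warning: nvcc not on PATH; GPU compilation may fail" >&2
  matlab -nodisplay -nosplash -r \
    "try, gpu_compile, catch err, disp(getReport(err)), exit(1), end, exit(0)"
}
```

After sourcing the snippet, call `compile_matconvnet`. Wrapping the script in `try ... catch` keeps a failed build from leaving MATLAB waiting at its prompt.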

Preprocess Datasets

  1. Extract the word2vec weights. Follow the instructions in ./word2vector_matlab;

  2. Preprocess the dataset. Follow the instructions in ./dataset. You can choose one dataset to run. The three datasets need different preprocessing. I have written instructions for Flickr30k, MSCOCO and CUHK-PEDES.

  3. Download the model pre-trained on ImageNet and put it into './data'.

(bash) wget http://www.vlfeat.org/matconvnet/models/imagenet-resnet-50-dag.mat

Alternatively, you may try VGG16 or VGG19.
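
The download step can be wrapped so that the model lands directly in './data' and is not fetched twice. This is a minimal sketch, assuming you run it from the repository root; the names `MODEL_URL`, `DATA_DIR`, `model_path` and `fetch_model` are hypothetical, and only the URL comes from the instructions above.

```bash
#!/usr/bin/env bash
# URL taken from the instructions above; DATA_DIR is the './data' folder.
MODEL_URL="http://www.vlfeat.org/matconvnet/models/imagenet-resnet-50-dag.mat"
DATA_DIR="./data"

# Print where the downloaded model will be stored.
model_path() {
  printf '%s/%s\n' "$DATA_DIR" "${MODEL_URL##*/}"
}

# Download the model into DATA_DIR unless the file is already there.
fetch_model() {
  mkdir -p "$DATA_DIR"
  if [ -f "$(model_path)" ]; then
    echo "already present: $(model_path)"
    return 0
  fi
  wget -P "$DATA_DIR" "$MODEL_URL"
}
```

Call `fetch_model` to download. To try VGG16 or VGG19 instead, point `MODEL_URL` at the corresponding model file.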

Your split may differ from mine. (Sorry, this is my fault: I used a random split.) As a backup, this is the dictionary archive used in the paper.

Trained Model

You can download the three trained models from Google Drive.

Train

  • For Flickr30k, run train_flickr_word2_1_pool.m for Stage I training, then run train_flickr_word_Rankloss_shift_hard for Stage II training.

  • For MSCOCO, run train_coco_word2_1_pool.m for Stage I training, then run train_coco_Rankloss_shift_hard.m for Stage II training.

  • For CUHK-PEDES, run train_cuhk_word2_1_pool.m for Stage I training, then run train_cuhk_word_Rankloss_shift for Stage II training.
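
To run both stages back to back without opening the MATLAB desktop, something like the following may help. It is a sketch under assumptions: `matlab` is on your PATH, you start it from the directory holding the training scripts, and the dataset keys (`flickr30k`, `mscoco`, `cuhk-pedes`) and the function names `stage_scripts` and `run_training` are hypothetical. The script names are the ones listed above.

```bash
#!/usr/bin/env bash
# Print the Stage I and Stage II script names for a dataset, in order.
stage_scripts() {
  case "$1" in
    flickr30k)  echo "train_flickr_word2_1_pool train_flickr_word_Rankloss_shift_hard" ;;
    mscoco)     echo "train_coco_word2_1_pool train_coco_Rankloss_shift_hard" ;;
    cuhk-pedes) echo "train_cuhk_word2_1_pool train_cuhk_word_Rankloss_shift" ;;
    *) echo "unknown dataset: $1" >&2; return 1 ;;
  esac
}

# Run Stage I, then Stage II, each in its own headless MATLAB session.
# Stops at the first stage that fails.
run_training() {
  scripts=$(stage_scripts "$1") || return 1
  for s in $scripts; do
    echo "running $s"
    matlab -nodisplay -nosplash -r \
      "try, $s, catch err, disp(getReport(err)), exit(1), end, exit(0)" || return 1
  done
}
```

For example, `run_training flickr30k` runs the two Flickr30k scripts in order. Any paths that Stage II needs from Stage I still have to be set inside the scripts themselves.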

Test

Select one model and have fun!

  • For Flickr30k, run test/extract_pic_feature_word2_plus_52.m to extract the features from the images and text. Note that you need to change the model path in the code.

  • For MSCOCO, run test_coco/extract_pic_feature_word2_plus.m to extract the features from the images and text. Note that you need to change the model path in the code.

  • For CUHK-PEDES, run test_cuhk/extract_pic_feature_word2_plus_52.m to extract the features from the images and text. Note that you need to change the model path in the code.
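
Since every extraction script needs its model path edited first, a small helper can point you at the lines to change. This is a sketch, assuming the model path appears in the script as a hard-coded string ending in `.mat`, which may not hold for every script. The dataset keys and the function names `extract_script` and `show_mat_paths` are hypothetical.

```bash
#!/usr/bin/env bash
# Map a dataset key to its feature-extraction script (paths listed above).
extract_script() {
  case "$1" in
    flickr30k)  echo "test/extract_pic_feature_word2_plus_52.m" ;;
    mscoco)     echo "test_coco/extract_pic_feature_word2_plus.m" ;;
    cuhk-pedes) echo "test_cuhk/extract_pic_feature_word2_plus_52.m" ;;
    *) echo "unknown dataset: $1" >&2; return 1 ;;
  esac
}

# Print every line of a script that mentions a .mat file, with line numbers.
# These lines are candidates for the model path you need to edit.
show_mat_paths() {
  grep -n '\.mat' "$1"
}
```

From the repository root, `show_mat_paths "$(extract_script flickr30k)"` lists the candidate lines. Edit the model path, then run the script in MATLAB as usual.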

CheckList

  • [x] Get word2vec weight

  • [x] Data Preparation (Flickr30k)

  • [x] Train on Flickr30k

  • [x] Test on Flickr30k

  • [x] Data Preparation (MSCOCO)

  • [x] Train on MSCOCO

  • [x] Test on MSCOCO

  • [x] Data Preparation (CUHK-PEDES)

  • [x] Train on CUHK-PEDES

  • [x] Test on CUHK-PEDES

  • [ ] Run the code on another machine

Citation

@article{zheng2017dual,
  title={Dual-Path Convolutional Image-Text Embeddings with Instance Loss},
  author={Zheng, Zhedong and Zheng, Liang and Garrett, Michael and Yang, Yi and Xu, Mingliang and Shen, Yi-Dong},
  journal={ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)},
  doi={10.1145/3383184},
  volume={16},
  number={2},
  pages={1--23},
  year={2020},
  publisher={ACM New York, NY, USA}
}