layumi / University1652 Baseline

Licence: mit
ACM Multimedia 2020. University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization 🚁 annotates 1652 buildings in 72 universities around the world.

University1652-Baseline

[Paper] [Slide] [Explore Drone-view Data] [Explore Satellite-view Data] [Explore Street-view Data] [Video Sample] [Introduction in Chinese]

This repository contains the dataset link and the code for our paper University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization. We collected images of 1652 buildings at 72 universities around the world. Thank you for your kind attention.

Task 1: Drone-view target localization. (Drone -> Satellite) Given one drone-view image or video, the task aims to find the most similar satellite-view image to localize the target building in the satellite view.

Task 2: Drone navigation. (Satellite -> Drone) Given one satellite-view image, the drone intends to find the most relevant place (drone-view images) that it has passed by. According to its flight history, the drone could be navigated back to the target place.
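Both tasks amount to ranking a gallery of images by similarity to a query. The following is a minimal sketch of that ranking step, assuming each image has already been mapped to a feature vector by some trained model. The function names and the toy feature values are hypothetical and are not taken from this repository.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices sorted from most to least similar to the query."""
    q = l2_normalize(query_feat)
    scores = []
    for idx, g_feat in enumerate(gallery_feats):
        g = l2_normalize(g_feat)
        # Cosine similarity is the dot product of the two unit-length vectors.
        scores.append((sum(a * b for a, b in zip(q, g)), idx))
    # Highest similarity first; ties keep the lower index first.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [idx for _, idx in scores]


# Toy example for Task 1: one drone-view query against three satellite-view images.
drone_query = [0.9, 0.1, 0.0]
satellite_gallery = [
    [0.0, 1.0, 0.0],  # building A (made-up feature)
    [1.0, 0.2, 0.0],  # building B (made-up feature)
    [0.0, 0.0, 1.0],  # building C (made-up feature)
]
ranking = rank_gallery(drone_query, satellite_gallery)
print(ranking)
```

For Task 2 the roles are swapped: the satellite-view feature is the query and the drone-view features form the gallery.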

About Dataset

The dataset split is as follows:

| Split | #imgs | #buildings | #universities |
| -------- | ----- | ---- | ---- |
| Training | 50,218 | 701 | 33 |
| Query_drone | 37,855 | 701 | 39 |
| Query_satellite | 701 | 701 | 39 |
| Query_ground | 2,579 | 701 | 39 |
| Gallery_drone | 51,355 | 951 | 39 |
| Gallery_satellite | 951 | 951 | 39 |
| Gallery_ground | 2,921 | 793 | 39 |

More detailed file structure:

├── University-1652/
│   ├── readme.txt
│   ├── train/
│       ├── drone/                   /* drone-view training images
│           ├── 0001
│           ├── 0002
│           ...
│       ├── street/                  /* street-view training images
│       ├── satellite/               /* satellite-view training images
│       ├── google/                  /* noisy street-view training images (collected from Google Image)
│   ├── test/
│       ├── query_drone/
│       ├── gallery_drone/
│       ├── query_street/
│       ├── gallery_street/
│       ├── query_satellite/
│       ├── gallery_satellite/
│       ├── 4K_drone/

We note that there is no overlap between the 33 universities of the training set and the 39 universities of the test set.
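To check a downloaded copy against the layout above, a short script can count the building folders and images under one view directory. This is a sketch under assumptions: the image extensions and the helper names are hypothetical, and the demo builds a tiny mock tree so that it runs without the real dataset.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image extensions


def count_images_per_building(view_dir):
    """Map each building sub-folder name (e.g. '0001') to its number of image files."""
    counts = {}
    for building_dir in sorted(Path(view_dir).iterdir()):
        if building_dir.is_dir():
            counts[building_dir.name] = sum(
                1
                for f in building_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def summarize_view(view_dir):
    """Return (number of buildings, total number of images) for one view folder."""
    counts = count_images_per_building(view_dir)
    return len(counts), sum(counts.values())


# Demo on a mock tree that mimics train/drone/<building id>/.
with tempfile.TemporaryDirectory() as root:
    for building, n_imgs in (("0001", 2), ("0002", 3)):
        folder = Path(root) / "train" / "drone" / building
        folder.mkdir(parents=True)
        for i in range(n_imgs):
            (folder / "image-{:02d}.jpeg".format(i)).touch()
    demo_summary = summarize_view(Path(root) / "train" / "drone")
print(demo_summary)  # (buildings, images)
```

On the real data, a call such as `summarize_view("University-1652/train/drone")` (path assumed from the tree above) should give totals that can be compared with the split table.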

News

3 March 2021 GeM Pooling is added. You may enable it with --pool gem.

21 January 2021 The GPU-Re-Ranking, a GNN-based real-time post-processing code, is at Here.

21 August 2020 The transfer learning code for Oxford and Paris is at Here.

27 July 2020 The metadata of the 1652 buildings, such as latitude and longitude, is now available at Google Drive. (You can open the kml file with Google Earth Pro, or use vim to check the values.)
We also provide the spiral flight tour file at Google Drive. (You can open the kml file with Google Earth Pro to enable the flight camera.)

26 July 2020 The paper is accepted by ACM Multimedia 2020.

12 July 2020 I made the baseline of triplet loss (with soft margin) on University-1652 publicly available at Here.

12 March 2020 I added the state-of-the-art page for geo-localization and a tutorial, which will be updated soon.
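For readers unfamiliar with the GeM pooling mentioned in the 3 March 2021 note, generalized-mean pooling reduces each channel of a feature map to (mean of x^p)^(1/p). The sketch below illustrates the formula in plain Python for a single channel. It is not the code behind --pool gem, and the fixed exponent is an assumption (p is often a learnable parameter in practice).

```python
def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized-mean pool one channel (a 2-D list of activations) to a scalar."""
    # Clamp to a small positive value so that fractional powers are well defined.
    values = [max(v, eps) for row in feature_map for v in row]
    mean_of_powers = sum(v ** p for v in values) / len(values)
    return mean_of_powers ** (1.0 / p)


channel = [[1.0, 2.0],
           [3.0, 4.0]]
avg_like = gem_pool(channel, p=1.0)   # p = 1 reduces to average pooling
gem_3 = gem_pool(channel, p=3.0)      # an intermediate exponent
max_like = gem_pool(channel, p=50.0)  # a large p approaches max pooling
print(avg_like, gem_3, max_like)
```

The exponent p interpolates between average pooling (p = 1) and max pooling (p growing large), which is why GeM is a common drop-in choice for retrieval features.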

Code Features

We currently support:

  • Float16 to save GPU memory based on apex
  • Multiple Query Evaluation
  • Re-Ranking
  • Random Erasing
  • ResNet/VGG-16
  • Visualize Training Curves
  • Visualize Ranking Result
  • Linear Warm-up
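As an illustration of the last item, linear warm-up ramps the learning rate from a small fraction of its target value up to the full value over the first few steps. The sketch below is one common formulation, not necessarily the schedule implemented in train.py. The function name, the start factor of 0.1, and the step granularity are assumptions.

```python
def warmup_lr(base_lr, step, warmup_steps, start_factor=0.1):
    """Linearly ramp the learning rate from start_factor * base_lr to base_lr."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return base_lr
    factor = start_factor + (1.0 - start_factor) * step / warmup_steps
    return base_lr * factor


# The values echo --warm 5 --lr 0.02; treating each step as one epoch is an assumption.
schedule = [round(warmup_lr(0.02, s, warmup_steps=5), 4) for s in range(7)]
print(schedule)
```

Starting from a reduced rate keeps early updates small, which can stabilize training while the newly initialized layers are still far from a good solution.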

Prerequisites

  • Python 3.6
  • GPU Memory >= 8G
  • Numpy > 1.12.1
  • PyTorch 0.3+
  • [Optional] apex (for float16)

Getting started

Installation

  • Install torchvision from the source
git clone https://github.com/pytorch/vision
cd vision
python setup.py install
  • [Optional] You may skip this step. Install apex from the source
git clone https://github.com/NVIDIA/apex.git
cd apex
python setup.py install --cuda_ext --cpp_ext
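After installation, a quick check can confirm that the interpreter and packages listed under Prerequisites are visible. The following sketch uses only the standard library and does not import the packages themselves. The helper name and the report layout are hypothetical.

```python
import importlib.util
import sys


def check_environment(required=("numpy", "torch", "torchvision"), optional=("apex",)):
    """Report which packages can be found, without importing them."""
    report = {"python_ok": sys.version_info >= (3, 6)}
    for name in tuple(required) + tuple(optional):
        report[name] = importlib.util.find_spec(name) is not None
    # Only required packages count as missing; apex is needed for float16 only.
    report["missing_required"] = [n for n in required if not report[n]]
    return report


env = check_environment()
print(env)
```

An empty `missing_required` list means the required packages were found. A `False` entry for apex only means that the --fp16 option is likely unavailable.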

Dataset & Preparation

Download [University-1652] upon request. You may use the request template.

Or download CVUSA / CVACT.

For CVUSA, I follow the training/test split in (https://github.com/Liumouliu/OriCNN).

Train & Evaluation

Train & Evaluation University-1652

python train.py --name three_view_long_share_d0.75_256_s1_google  --extra --views 3  --droprate 0.75  --share  --stride 1 --h 256  --w 256 --fp16; 
python test.py --name three_view_long_share_d0.75_256_s1_google

Default setting: Drone -> Satellite. To try another evaluation setting, change these lines: https://github.com/layumi/University1652-Baseline/blob/master/test.py#L217-L225
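Conceptually, an evaluation setting is a choice of which test folder supplies the queries and which supplies the gallery. The sketch below expresses that choice as a lookup table, using the folder names from the test/ layout in the file structure. It does not reproduce the referenced lines of test.py. The dictionary, the function, and the default root path are hypothetical.

```python
# Folder names follow the test/ directory listed in the file structure.
EVAL_SETTINGS = {
    "drone->satellite": ("query_drone", "gallery_satellite"),  # Task 1
    "satellite->drone": ("query_satellite", "gallery_drone"),  # Task 2
}


def eval_folders(setting, test_root="University-1652/test"):
    """Return the (query folder, gallery folder) paths for one evaluation setting."""
    query_name, gallery_name = EVAL_SETTINGS[setting]
    return "{}/{}".format(test_root, query_name), "{}/{}".format(test_root, gallery_name)


print(eval_folders("drone->satellite"))
print(eval_folders("satellite->drone"))
```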

Ablation Study only Satellite & Drone

python train_no_street.py --name two_view_long_no_street_share_d0.75_256_s1  --share --views 3  --droprate 0.75  --stride 1 --h 256  --w 256  --fp16; 
python test.py --name two_view_long_no_street_share_d0.75_256_s1

This run keeps three views but sets the weight of the loss on street images to zero.
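The effect of a zero weight can be seen in a small sketch of a weighted sum over per-view losses. The loss values, weights, and function name below are made up for illustration and are not taken from train_no_street.py.

```python
def combine_view_losses(losses, weights):
    """Weighted sum of per-view losses; a zero weight removes that view's contribution."""
    return sum(weights[view] * loss for view, loss in losses.items())


losses = {"satellite": 1.2, "street": 0.8, "drone": 1.5}  # made-up loss values
three_view = combine_view_losses(
    losses, {"satellite": 1.0, "street": 1.0, "drone": 1.0}
)
no_street = combine_view_losses(
    losses, {"satellite": 1.0, "street": 0.0, "drone": 1.0}
)
print(three_view, no_street)
```

With the street weight at zero, street images are still part of the three-view setup but contribute nothing to the total loss, so only the satellite and drone terms are optimized.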

Train & Evaluation CVUSA

python prepare_cvusa.py
python train_cvusa.py --name usa_vgg_noshare_warm5_lr2 --warm 5 --lr 0.02 --use_vgg16 --h 256 --w 256  --fp16 --batchsize 16;
python test_cvusa.py  --name usa_vgg_noshare_warm5_lr2 

Trained Model

You can download the trained model from GoogleDrive or OneDrive. After downloading, please put the model folders under ./model/.

Citation

The following paper uses and reports the result of the baseline model. You may cite it in your paper.

@article{zheng2020university,
  title={University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization},
  author={Zheng, Zhedong and Wei, Yunchao and Yang, Yi},
  journal={ACM Multimedia},
  year={2020}
}

Instance loss is defined in

@article{zheng2017dual,
  title={Dual-Path Convolutional Image-Text Embeddings with Instance Loss},
  author={Zheng, Zhedong and Zheng, Liang and Garrett, Michael and Yang, Yi and Xu, Mingliang and Shen, Yi-Dong},
  journal={ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)},
  doi={10.1145/3383184},
  volume={16},
  number={2},
  pages={1--23},
  year={2020},
  publisher={ACM New York, NY, USA}
}

Related Work

  • Instance Loss Code
  • Lending Orientation to Neural Networks for Cross-view Geo-localization Code
  • Predicting Ground-Level Scene Layout from Aerial Imagery Code