kregmi / cross-view-image-synthesis

Licence: other
[CVPR 2018] Cross-View Image Synthesis using Conditional GANs, [CVIU 2019] Cross-view image synthesis using geometry-guided conditional GANs

Programming Languages

lua
6591 projects
python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to cross-view-image-synthesis

Aerial
Aerial Apple TV screen saver for Windows
Stars: ✭ 1,853 (+4111.36%)
Mutual labels:  aerial
csound-extended
Extensions for Csound including algorithmic composition, Android app, and WebAssembly.
Stars: ✭ 38 (-13.64%)
Mutual labels:  synthesis
Main-Supercollider-Files
my supercollider codes, version history is at the branches
Stars: ✭ 21 (-52.27%)
Mutual labels:  synthesis
FScape-next
Audio rendering software, based on UGen graphs. Issue tracker: https://codeberg.org/sciss/FScape-next/issues
Stars: ✭ 13 (-70.45%)
Mutual labels:  synthesis
3D Ground Segmentation
A ground segmentation algorithm for 3D point clouds based on the work described in “Fast segmentation of 3D point clouds: a paradigm on LIDAR data for Autonomous Vehicle Applications”, D. Zermas, I. Izzat and N. Papanikolopoulos, 2017. Distinguish between road and non-road points. Road surface extraction. Plane fit ground filter
Stars: ✭ 55 (+25%)
Mutual labels:  ground
synthesis
🔥 Synthesis is Meteor + Polymer
Stars: ✭ 28 (-36.36%)
Mutual labels:  synthesis
Gwion
🎵 strongly-timed musical programming language
Stars: ✭ 235 (+434.09%)
Mutual labels:  synthesis
omega
Specify and synthesize systems using symbolic algorithms
Stars: ✭ 36 (-18.18%)
Mutual labels:  synthesis
lessampler
lessampler is a Singing Voice Synthesizer
Stars: ✭ 59 (+34.09%)
Mutual labels:  synthesis
Comet
Web Synthesis on steroids
Stars: ✭ 18 (-59.09%)
Mutual labels:  synthesis
herbie
Optimize floating-point expressions for accuracy
Stars: ✭ 614 (+1295.45%)
Mutual labels:  synthesis
xeda
Cross EDA Abstraction and Automation
Stars: ✭ 25 (-43.18%)
Mutual labels:  synthesis
async fifo
A dual clock asynchronous FIFO written in verilog, tested with Icarus Verilog
Stars: ✭ 117 (+165.91%)
Mutual labels:  synthesis
reef
Automatically labeling training data
Stars: ✭ 102 (+131.82%)
Mutual labels:  synthesis
rgbd person tracking
R-GBD Person Tracking is a ROS framework for detecting and tracking people from a mobile robot.
Stars: ✭ 46 (+4.55%)
Mutual labels:  ground
I dropped my phone the screen cracked
web audio, cracked.
Stars: ✭ 245 (+456.82%)
Mutual labels:  synthesis
GCPEditorPro
Amazingly fast and simple ground control points interface. ◎
Stars: ✭ 33 (-25%)
Mutual labels:  ground
WaveGrad2
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Stars: ✭ 55 (+25%)
Mutual labels:  synthesis
AerialForWindows
Aerial For Windows is a Windows screen saver based on the new Apple TV screen saver
Stars: ✭ 30 (-31.82%)
Mutual labels:  aerial
magphase
MagPhase Vocoder: Speech analysis/synthesis system for TTS and related applications.
Stars: ✭ 76 (+72.73%)
Mutual labels:  synthesis

cross-view-image-synthesis

[Project] [Paper, CVPR 2018] [Paper, CVIU 2019]

Abstract

Learning to generate natural scenes has always been a challenging task in computer vision. It is even more painstaking when the generation is conditioned on images with drastically different views. This is mainly because understanding, corresponding, and transforming appearance and semantic information across views is not trivial. In this paper, we attempt to solve the novel problem of cross-view image synthesis, aerial to street-view and vice versa, using conditional generative adversarial networks (cGANs). Two new architectures, Crossview Fork (X-Fork) and Crossview Sequential (X-Seq), are proposed to generate scenes at resolutions of 64x64 and 256x256 pixels. The X-Fork architecture has a single discriminator and a single generator; the generator hallucinates both the image and its semantic segmentation in the target view. The X-Seq architecture utilizes two cGANs: the first generates the target image, which is subsequently fed to the second cGAN to generate its corresponding semantic segmentation map. The feedback from the second cGAN helps the first cGAN generate sharper images. Both proposed architectures learn to generate natural images as well as their semantic segmentation maps. The proposed methods capture and maintain the true semantics of objects in the source and target views better than the traditional image-to-image translation method, which considers only the visual appearance of the scene. Extensive qualitative and quantitative evaluations support the effectiveness of our frameworks, compared to two state-of-the-art methods, for natural scene generation across drastically different views.
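
For intuition, here is a minimal nngraph sketch of the X-Fork idea, written against the nn/nngraph packages this repo depends on. The layer sizes and depth are illustrative assumptions, not the actual generator from the paper; see the model definitions in the repo for the real architecture.

require 'nn'
require 'nngraph'
-- Illustrative only: a shared encoder-decoder trunk that forks into two
-- sibling heads, one emitting the target-view image, the other its segmap.
-- Layer sizes/depth are assumptions, not the paper's actual generator.
local input = nn.Identity()()  -- source-view image, 3 x H x W
local e1 = nn.ReLU(true)(nn.SpatialConvolution(3, 64, 4, 4, 2, 2, 1, 1)(input))
local e2 = nn.ReLU(true)(nn.SpatialConvolution(64, 128, 4, 4, 2, 2, 1, 1)(e1))
local d1 = nn.ReLU(true)(nn.SpatialFullConvolution(128, 64, 4, 4, 2, 2, 1, 1)(e2))
-- the fork: everything above is shared by both outputs
local img = nn.Tanh()(nn.SpatialFullConvolution(64, 3, 4, 4, 2, 2, 1, 1)(d1))
local seg = nn.Tanh()(nn.SpatialFullConvolution(64, 3, 4, 4, 2, 2, 1, 1)(d1))
local netG = nn.gModule({input}, {img, seg})
local fake = netG:forward(torch.randn(3, 64, 64))  -- returns {image, segmap}

X-Seq, by contrast, chains two full cGANs: the image produced by the first generator becomes the input of the second, whose segmentation output provides the feedback that sharpens the first.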

Code

Our code is borrowed from pix2pix. The data loader is modified to handle images and semantic segmentation maps.
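
Concretely, each training file on disk is one wide strip of four equal-width images, {A,B,As,Bs} (see Generating Pairs below), so the loader must split it back apart. A minimal sketch of that split, with a hypothetical file path:

require 'image'
-- Hypothetical path; each file holds four equal-width images side by side.
local strip = image.load('./datasets/AB_AsBs/sample/0001.jpg', 3, 'float')
local w = strip:size(3) / 4                -- width of one sub-image
local A  = strip:narrow(3, 1,         w)   -- street-view image
local B  = strip:narrow(3, w + 1,     w)   -- aerial image
local As = strip:narrow(3, 2 * w + 1, w)   -- street-view segmap
local Bs = strip:narrow(3, 3 * w + 1, w)   -- aerial segmap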

Setup

Getting Started

  • Install Torch, then the required Lua packages nngraph and display:
luarocks install nngraph
luarocks install https://raw.githubusercontent.com/szym/display/master/display-scm-0.rockspec
  • Clone this repo:
git clone [email protected]:kregmi/cross-view-image-synthesis.git
cd cross-view-image-synthesis
  • Train the model:
DATA_ROOT=./datasets/AB_AsBs name=sample_images which_direction=a2g phase=sample th train_fork.lua
  • For CPU-only training:
DATA_ROOT=./datasets/AB_AsBs name=sample_images which_direction=a2g phase=sample gpu=0 cudnn=0 th train_fork.lua
  • Test the model:
DATA_ROOT=./datasets/AB_AsBs name=sample_images which_direction=a2g phase=sample which_epoch=35 th test_fork.lua 

The test results will be saved to: ./results/sample_images/35_net_G_sample/images/.

Training and Test data

Datasets

The original datasets are available here:

  1. GT-CrossView
  2. CVUSA

Ground-truth semantic segmentation maps are not available for the Dayton (GT-CrossView) dataset. We used RefineNet trained on Cityscapes to generate semantic segmentation maps and used them as ground-truth segmaps in our experiments. Please cite their papers if you use the dataset. We have shared the segmentation maps for the Dayton dataset here: Dayton SegMaps.

Segmentation maps for the CVUSA dataset are available with the dataset. Please follow the instructions at this link to convert them to the segmaps used in this project.

Train/test splits for the Dayton dataset can be downloaded here: Dayton.

For the CVUSA experiments, we used the train/test split provided with the dataset.

Generating Pairs

Refer to pix2pix for steps and code to generate pairs of images required for training/testing.

First concatenate the street-view and aerial images, then concatenate their segmentation maps, and finally concatenate all of them along the columns. Each concatenated image file in the dataset will contain {A,B,As,Bs}, where A = street-view image, B = aerial image, As = segmentation map of the street-view image, and Bs = segmentation map of the aerial image.
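
A minimal sketch of assembling one such file with Torch's image package (file names are placeholders; all four images must share the same height):

require 'image'
local A  = image.load('streetview.jpg',     3, 'byte')  -- A: street-view image
local B  = image.load('aerial.jpg',         3, 'byte')  -- B: aerial image
local As = image.load('streetview_seg.png', 3, 'byte')  -- As: street-view segmap
local Bs = image.load('aerial_seg.png',     3, 'byte')  -- Bs: aerial segmap
-- concatenate along the columns (dim 3 of a C x H x W tensor)
local sample = torch.cat({A, B, As, Bs}, 3)
image.save('./datasets/AB_AsBs/train/0001.jpg', sample)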

Train

DATA_ROOT=/path/to/data/ name=expt_name which_direction=a2g th train_fork.lua

Switch a2g to g2a to train in the opposite direction.

Models are saved to ./checkpoints/expt_name (can be changed by passing checkpoint_dir=your_dir in train_fork.lua).

See opt in train_fork.lua for additional training options.
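
For reference, the pix2pix code this repo builds on reads those DATA_ROOT=... name=... settings by overriding each entry of its default opt table with a matching environment variable, roughly like this simplified sketch:

-- Simplified sketch of pix2pix-style option handling: any key in the
-- defaults table can be overridden by an environment variable of the
-- same name, e.g. `which_direction=g2a th train_fork.lua`.
local opt = { DATA_ROOT = '', name = 'expt_name', which_direction = 'a2g', gpu = 1 }
for k, v in pairs(opt) do
  opt[k] = tonumber(os.getenv(k)) or os.getenv(k) or opt[k]
end

The same mechanism applies to test_fork.lua below.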

Test

DATA_ROOT=/path/to/data/ name=expt_name which_direction=a2g phase=val th test_fork.lua

This will run the model named expt_name in direction a2g on all images in /path/to/data/val.

Result images, and a webpage to view them, are saved to ./results/expt_name (can be changed by passing results_dir=your_dir in test_fork.lua).

See opt in test_fork.lua for additional testing options.

Models

Pretrained models can be downloaded here.

[X-Pix2pix] [X-Fork] [X-Seq]

Place the models in ./checkpoints/ after the download has finished.

Results

Some qualitative results on the GT-CrossView dataset:

result

CVPR Poster

poster

Citation

If you find our work useful for your research, please cite the following papers:

  • Cross-View Image Synthesis Using Conditional GANs, CVPR 2018 pdf, bibtex
  • Cross-view image synthesis using geometry-guided conditional GANs, CVIU 2019 pdf, bibtex

Questions

Please contact: '[email protected]'

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].