
microsoft / Maskflownet

License: MIT
[CVPR 2020, Oral] MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask

Programming Languages

python
139,335 projects - #7 most used programming language

Projects that are alternatives to or similar to Maskflownet

Densematchingbenchmark
Dense Matching Benchmark
Stars: ✭ 120 (-50.41%)
Mutual labels:  optical-flow
Spynet
Spatial Pyramid Network for Optical Flow
Stars: ✭ 158 (-34.71%)
Mutual labels:  optical-flow
Df Net
[ECCV 2018] DF-Net: Unsupervised Joint Learning of Depth and Flow using Cross-Task Consistency
Stars: ✭ 190 (-21.49%)
Mutual labels:  optical-flow
Arflow
The official PyTorch implementation of the paper "Learning by Analogy: Reliable Supervision from Transformations for Unsupervised Optical Flow Estimation".
Stars: ✭ 134 (-44.63%)
Mutual labels:  optical-flow
Tfvos
Semi-Supervised Video Object Segmentation (VOS) with Tensorflow. Includes implementation of *MaskRNN: Instance Level Video Object Segmentation (NIPS 2017)* as part of the NIPS Paper Implementation Challenge.
Stars: ✭ 151 (-37.6%)
Mutual labels:  optical-flow
Py Denseflow
Extract TVL1 optical flows in python (multi-process && multi-server)
Stars: ✭ 159 (-34.3%)
Mutual labels:  optical-flow
Pwc Net pytorch
pytorch implementation of "PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume"
Stars: ✭ 111 (-54.13%)
Mutual labels:  optical-flow
Df Vo
Depth and Flow for Visual Odometry
Stars: ✭ 233 (-3.72%)
Mutual labels:  optical-flow
Frvsr
Frame-Recurrent Video Super-Resolution (official repository)
Stars: ✭ 157 (-35.12%)
Mutual labels:  optical-flow
Opticalflow visualization
Python optical flow visualization following Baker et al. (ICCV 2007) as used by the MPI-Sintel challenge
Stars: ✭ 183 (-24.38%)
Mutual labels:  optical-flow
Video2tfrecord
Easily convert RGB video data (e.g., .avi) to the TensorFlow tfrecords file format for training, e.g., a NN in TensorFlow. This implementation allows you to limit the number of frames per video stored in the tfrecords.
Stars: ✭ 137 (-43.39%)
Mutual labels:  optical-flow
Deep Learning For Tracking And Detection
Collection of papers, datasets, code and other resources for object tracking and detection using deep learning
Stars: ✭ 1,920 (+693.39%)
Mutual labels:  optical-flow
Hidden Two Stream
Caffe implementation for "Hidden Two-Stream Convolutional Networks for Action Recognition"
Stars: ✭ 179 (-26.03%)
Mutual labels:  optical-flow
Netdef models
Repository for different network models related to flow/disparity (ECCV 18)
Stars: ✭ 130 (-46.28%)
Mutual labels:  optical-flow
Liteflownet2
A Lightweight Optical Flow CNN - Revisiting Data Fidelity and Regularization, TPAMI 2020
Stars: ✭ 195 (-19.42%)
Mutual labels:  optical-flow
Vcn
Volumetric Correspondence Networks for Optical Flow, NeurIPS 2019.
Stars: ✭ 118 (-51.24%)
Mutual labels:  optical-flow
Pysteps
Python framework for short-term ensemble prediction systems.
Stars: ✭ 159 (-34.3%)
Mutual labels:  optical-flow
Unflow
UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss
Stars: ✭ 239 (-1.24%)
Mutual labels:  optical-flow
Opticalflowtoolkit
Python-based optical flow toolkit for existing popular datasets
Stars: ✭ 219 (-9.5%)
Mutual labels:  optical-flow
Clover
ROS-based framework and RPi image to control PX4-powered drones 🍀
Stars: ✭ 177 (-26.86%)
Mutual labels:  optical-flow

MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask, CVPR 2020 (Oral)

By Shengyu Zhao, Yilun Sheng, Yue Dong, Eric I-Chao Chang, and Yan Xu.

[arXiv] [ResearchGate]

@inproceedings{zhao2020maskflownet,
  author = {Zhao, Shengyu and Sheng, Yilun and Dong, Yue and Chang, Eric I-Chao and Xu, Yan},
  title = {MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2020}
}

Introduction

[Figure: visualization of the learned occlusion masks]

Feature warping is a core technique in optical flow estimation; however, the ambiguity caused by occluded areas during warping is a major problem that remains unsolved. We propose an asymmetric occlusion-aware feature matching module, which can learn a rough occlusion mask that filters useless (occluded) areas immediately after feature warping without any explicit supervision. The proposed module can be easily integrated into end-to-end network architectures and enjoys performance gains while introducing negligible computational cost. The learned occlusion mask can be further fed into a subsequent network cascade with dual feature pyramids with which we achieve state-of-the-art performance. For more details, please refer to our paper.
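
As a rough illustration of the idea (a minimal sketch under our own naming, not the authors' implementation), the filtering step can be thought of as multiplying the warped feature map by a learned soft occlusion mask and then adding a learnable trade-off term:

# Minimal sketch of the occlusion-aware filtering idea in MXNet Gluon.
# Hypothetical layer/parameter names; see the paper and code for the real module.
import mxnet as mx
from mxnet.gluon import nn

class OcclusionAwareFilter(nn.HybridBlock):
    def __init__(self, channels, **kwargs):
        super(OcclusionAwareFilter, self).__init__(**kwargs)
        with self.name_scope():
            # predicts a rough soft occlusion mask from the warped feature
            self.mask_conv = nn.Conv2D(channels=1, kernel_size=1)
            # learnable trade-off term added after masking
            self.mu = self.params.get('mu', shape=(1, channels, 1, 1),
                                      init=mx.init.Zero())

    def hybrid_forward(self, F, warped, mu):
        mask = F.sigmoid(self.mask_conv(warped))  # soft mask in [0, 1]
        # occluded regions (mask -> 0) are filtered out of the warped feature
        return F.broadcast_add(F.broadcast_mul(warped, mask), mu)

Note that no occlusion labels appear anywhere in this sketch: the mask emerges purely from the end-to-end flow loss, which is what makes the module cheap to bolt onto existing architectures.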

This repository includes:

  • Training and inference scripts in Python and MXNet; and
  • Pretrained models of MaskFlownet-S and MaskFlownet.

Code has been tested with Python 3.6 and MXNet 1.5.

Datasets

We follow the common training schedule for optical flow, using the following datasets:

  • FlyingChairs
  • FlyingThings3D (subset)
  • MPI Sintel
  • KITTI 2012 & KITTI 2015
  • HD1K

Please modify the paths specified in main.py (for FlyingChairs), reader/things3d.py (for FlyingThings3D), reader/sintel.py (for Sintel), reader/kitti.py (for KITTI 2012 & KITTI 2015), and reader/hd1k.py (for HD1K) according to where you store the corresponding datasets. Please be aware that the FlyingThings3D dataset (subset) is still very large, so you might want to load only a relatively small proportion of it (see main.py).
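
Each of these edits is typically a one-line path change. A hypothetical illustration (the actual variable name in each reader differs; check the file itself):

# Hypothetical example; check reader/sintel.py for the real variable name.
sintel_root = '/data/datasets/Sintel'  # point this at your local Sintel copy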

Training

The following script is for training:

python main.py CONFIG [-dataset_cfg DATASET_CONFIG] [-g GPU_DEVICES] [-c CHECKPOINT, --clear_steps] [--debug]

where:

  • CONFIG specifies the network and training configuration.
  • DATASET_CONFIG specifies the dataset configuration (defaults to chairs.yaml).
  • GPU_DEVICES specifies the GPU IDs to use (defaults to CPU only), separated by commas for multi-GPU support. Make sure that the number of GPUs evenly divides BATCH_SIZE, which depends on DATASET_CONFIG (BATCH_SIZE is 8 or 4 in the given configurations, so 4, 2, or 1 GPU(s) will work).
  • CHECKPOINT specifies the previous checkpoint to start from.
  • --clear_steps clears the step history and starts from step 0.
  • --debug enters DEBUG mode, where only a small fragment of the data is read.

To test whether your environment has been set up properly, run: python main.py MaskFlownet.yaml -g 0 --debug.

Here we present the procedure to train a complete MaskFlownet model for validation on the Sintel dataset. About 20% of the sequences (ambush_2, ambush_6, bamboo_2, cave_4, market_6, temple_2) are split off as Sintel val, while the rest form Sintel train (see Sintel_train_val_maskflownet.txt). CHECKPOINT in each command line should be the name of the checkpoint generated in the previous step.

# | Network | Training | Validation | Command Line
--- | --- | --- | --- | ---
1 | MaskFlownet-S | Flying Chairs | Sintel train + val | python main.py MaskFlownet_S.yaml -g 0,1,2,3
2 | MaskFlownet-S | Flying Things3D | Sintel train + val | python main.py MaskFlownet_S_ft.yaml --dataset_cfg things3d.yaml -g 0,1,2,3 -c [CHECKPOINT] --clear_steps
3 | MaskFlownet-S | Sintel train + KITTI 2015 + HD1K | Sintel val | python main.py MaskFlownet_S_sintel.yaml --dataset_cfg sintel_kitti2015_hd1k.yaml -g 0,1,2,3 -c [CHECKPOINT] --clear_steps
4 | MaskFlownet | Flying Chairs | Sintel val | python main.py MaskFlownet.yaml -g 0,1,2,3 -c [CHECKPOINT] --clear_steps
5 | MaskFlownet | Flying Things3D | Sintel val | python main.py MaskFlownet_ft.yaml --dataset_cfg things3d.yaml -g 0,1,2,3 -c [CHECKPOINT] --clear_steps
6 | MaskFlownet | Sintel train + KITTI 2015 + HD1K | Sintel val | python main.py MaskFlownet_sintel.yaml --dataset_cfg sintel_kitti2015_hd1k.yaml -g 0,1,2,3 -c [CHECKPOINT] --clear_steps

Pretrained Models

Pretrained models for steps 2, 3, and 6 of the above procedure are provided (see ./weights/).

Inference

The following script is for inference:

python main.py CONFIG [-g GPU_DEVICES] [-c CHECKPOINT] [--valid or --predict] [--resize INFERENCE_RESIZE]

where:

  • CONFIG specifies the network configuration (MaskFlownet_S.yaml or MaskFlownet.yaml).
  • GPU_DEVICES specifies the GPU IDs to use, separated by commas for multi-GPU support.
  • CHECKPOINT specifies the checkpoint to run inference on.
  • --valid runs validation; --predict runs prediction.
  • INFERENCE_RESIZE specifies the resize applied during inference.

For example,

  • to do validation for MaskFlownet-S on checkpoint fffMar16, run python main.py MaskFlownet_S.yaml -g 0 -c fffMar16 --valid (the output will be under ./logs/val/).

  • to do prediction for MaskFlownet on checkpoint 000Mar17, run python main.py MaskFlownet.yaml -g 0 -c 000Mar17 --predict (the output will be under ./flows/).

Inference on New Data

If you do not wish to train the model and simply want flow images from a pretrained model on your own data, use predict_new_data.py. None of the optical flow datasets are required for this script, although you will have to additionally pip install flow_vis and moviepy. Its functions load a model and run inference on a given pair of images, or produce a series of flow images corresponding to the movement between consecutive frames of a given video. They can be called from another script (see the sketch after the examples below), or you can invoke the program from a terminal/Anaconda prompt like so:

  • to obtain a video composed of the flow images corresponding to input_video.mp4, run python predict_new_data.py C:/Users/my_username/flow_video_filepath.mp4 MaskFlownet.yaml --video_filepath C:/Users/my_username/input_video.mp4 -g 0 -c 8caNov12

  • to obtain a flow image from 2 input images image_1.png and image_2.png, run python predict_new_data.py C:/Users/my_username/flow_image_filepath.png MaskFlownet.yaml --image_1 C:/Users/my_username/image_1.png --image_2 C:/Users/my_username/image_2.png -g 0 -c 8caNov12
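
To drive predict_new_data.py from another script without importing its internals, one option is to invoke the documented command line via subprocess. This is a minimal sketch using only the flags shown above; the paths and checkpoint name are placeholders taken from the examples:

# Runs predict_new_data.py exactly as in the two-image example above.
import subprocess

subprocess.run([
    'python', 'predict_new_data.py',
    'flow_image_filepath.png', 'MaskFlownet.yaml',
    '--image_1', 'image_1.png',
    '--image_2', 'image_2.png',
    '-g', '0', '-c', '8caNov12',
], check=True)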

Acknowledgement

We thank Tingfung Lau for the initial implementation of the FlyingChairs pipeline.
