CS-GangXu / TMNet

License: Apache-2.0
The official PyTorch implementation of the CVPR 2021 paper "Temporal Modulation Network for Controllable Space-Time Video Super-Resolution".

Programming Languages

Python
139335 projects - #7 most used programming language
CUDA
1817 projects
C++
36643 projects - #6 most used programming language
C
50402 projects - #5 most used programming language
Shell
77523 projects

Projects that are alternatives of or similar to TMNet

Cvpr2021 Papers With Code
A collection of CVPR 2021 papers and their open-source projects
Stars: ✭ 7,138 (+9170.13%)
Mutual labels:  paper, cvpr
Papercrawler
A crawler for collecting papers
Stars: ✭ 20 (-74.03%)
Mutual labels:  paper, cvpr
Cvpr 2019 Paper Statistics
Statistics and visualization of the acceptance rates and main keywords of CVPR 2019 accepted papers
Stars: ✭ 527 (+584.42%)
Mutual labels:  paper, cvpr
Restoring-Extremely-Dark-Images-In-Real-Time
The project is the official implementation of our CVPR 2021 paper, "Restoring Extremely Dark Images in Real Time"
Stars: ✭ 79 (+2.6%)
Mutual labels:  paper, cvpr
Pwc
Papers with code. Sorted by stars. Updated weekly.
Stars: ✭ 15,288 (+19754.55%)
Mutual labels:  paper, cvpr
Awesome-Computer-Vision-Paper-List
This repository collects the papers accepted at top computer vision conferences, making it convenient to search for related papers.
Stars: ✭ 248 (+222.08%)
Mutual labels:  paper, cvpr
Cv paperdaily
CV paper notes
Stars: ✭ 555 (+620.78%)
Mutual labels:  paper, cvpr
Srflow
Official SRFlow training code: Super-Resolution using Normalizing Flow in PyTorch
Stars: ✭ 537 (+597.4%)
Mutual labels:  paper, super-resolution
3pu
Patch-based progressive 3D point set upsampling
Stars: ✭ 131 (+70.13%)
Mutual labels:  paper, super-resolution
Deeplpf
Code for CVPR 2020 paper "Deep Local Parametric Filters for Image Enhancement"
Stars: ✭ 91 (+18.18%)
Mutual labels:  paper, cvpr
Zooming Slow Mo Cvpr 2020
Fast and Accurate One-Stage Space-Time Video Super-Resolution (accepted in CVPR 2020)
Stars: ✭ 555 (+620.78%)
Mutual labels:  super-resolution, cvpr
CURL
Code for the ICPR 2020 paper: "CURL: Neural Curve Layers for Image Enhancement"
Stars: ✭ 177 (+129.87%)
Mutual labels:  paper, cvpr
Pytorch Vdsr
VDSR (CVPR 2016) PyTorch implementation
Stars: ✭ 313 (+306.49%)
Mutual labels:  super-resolution, cvpr
GuidedLabelling
Exploiting Saliency for Object Segmentation from Image Level Labels, CVPR'17
Stars: ✭ 35 (-54.55%)
Mutual labels:  paper, cvpr
Awesome Computer Vision
Awesome Resources for Advanced Computer Vision Topics
Stars: ✭ 92 (+19.48%)
Mutual labels:  paper, super-resolution
AIPaperCompleteDownload
Complete download for papers in various top conferences
Stars: ✭ 64 (-16.88%)
Mutual labels:  paper, cvpr
cool-papers-in-pytorch
Reimplementing cool papers in PyTorch...
Stars: ✭ 21 (-72.73%)
Mutual labels:  paper, cvpr
AMP-Regularizer
Code for our paper "Regularizing Neural Networks via Adversarial Model Perturbation", CVPR2021
Stars: ✭ 26 (-66.23%)
Mutual labels:  cvpr
Facial-Recognition-Attendance-System
An attendance system which uses facial recognition to detect which people are present in any image.
Stars: ✭ 48 (-37.66%)
Mutual labels:  paper
sensim
Sentence Similarity Estimator (SenSim)
Stars: ✭ 15 (-80.52%)
Mutual labels:  paper

This is the official PyTorch implementation of TMNet in the CVPR 2021 paper "Temporal Modulation Network for Controllable Space-Time Video Super-Resolution". Our TMNet can flexibly interpolate intermediate frames for space-time video super-resolution (STVSR).

Updates

  • 2021.06.17 Added dataset preparation for the Adobe240fps and Vid4 datasets.
  • 2021.05.08 Uploaded the training and testing code.
  • 2021.04.23 Initialized the repository.

Contents

  1. Introduction
  2. Installation
  3. Train
  4. Test
  5. Results
  6. Citation
  7. Acknowledgment
  8. Contact

Introduction

Space-time video super-resolution (STVSR) aims to increase the spatial and temporal resolutions of low-resolution, low-frame-rate videos. Recently, deformable-convolution-based methods have achieved promising STVSR performance, but they can only infer the intermediate frames pre-defined at the training stage. Moreover, these methods undervalue the short-term motion cues among adjacent frames. In this paper, we propose a Temporal Modulation Network (TMNet) to interpolate arbitrary intermediate frame(s) with accurate high-resolution reconstruction. Specifically, we propose a Temporal Modulation Block (TMB) to modulate deformable convolution kernels for controllable feature interpolation. To better exploit temporal information, we propose a Locally-temporal Feature Comparison (LFC) module, along with a Bi-directional Deformable ConvLSTM, to extract short-term and long-term motion cues in videos. Experiments on three benchmark datasets demonstrate that our TMNet outperforms previous STVSR methods.
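To make the temporal modulation idea concrete, the sketch below shows one way a TMB-style block can condition deformable-convolution offsets on a continuous time step t in [0, 1]. This is a minimal illustration of the mechanism described above, not the authors' implementation; all layer shapes and names are assumptions.

import torch
import torch.nn as nn

class TMBSketch(nn.Module):
    # Embeds the desired intermediate moment t and uses it to modulate the
    # features that predict deformable-convolution offsets, so a single
    # network can interpolate frames at arbitrary times.
    def __init__(self, channels=64, kernel_size=3):
        super().__init__()
        self.t_embed = nn.Sequential(
            nn.Linear(1, channels), nn.ReLU(inplace=True),
            nn.Linear(channels, channels))
        # 2 offsets (x, y) per sampling location of the deformable kernel.
        self.offset_conv = nn.Conv2d(channels, 2 * kernel_size ** 2,
                                     kernel_size, padding=kernel_size // 2)

    def forward(self, feat, t):
        # feat: (N, C, H, W) features fused from two adjacent frames
        # t:    (N, 1) desired moment in [0, 1] between the two frames
        scale = self.t_embed(t).unsqueeze(-1).unsqueeze(-1)  # (N, C, 1, 1)
        return self.offset_conv(feat * scale)  # time-conditioned offsets

block = TMBSketch()
offsets = block(torch.randn(2, 64, 32, 32), torch.tensor([[0.25], [0.5]]))
print(offsets.shape)  # torch.Size([2, 18, 32, 32])

Feeding different values of t to the same weights is what makes the interpolation moment controllable at test time.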

Installation

Install the required packages

DCNv2

1. Clone the TMNet repository.

git clone https://github.com/CS-GangXu/TMNet.git

2. Compile DCNv2 ($ROOT denotes the working directory of the TMNet code).

First, set the configuration in $ROOT/models/modules/DCNv2/make.sh:

#!/usr/bin/env bash

# You may need to modify the following paths before compiling.
CUDA_HOME=/usr/local/cuda-10.0 \
CUDNN_INCLUDE_DIR=/usr/local/cuda-10.0/include \
CUDNN_LIB_DIR=/usr/local/cuda-10.0/lib64 \
python setup.py build develop

Then, run the make.sh:

cd $ROOT/models/modules/DCNv2
bash make.sh
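
After compilation, you can quickly check that the extension built correctly against your GPU. The snippet below is a minimal smoke test assuming the EDVR-style DCNv2 layout (a dcn_v2 module exposing a DCN layer); run it from the DCNv2 directory and adjust the import if your checkout differs:

import torch
from dcn_v2 import DCN  # importable only if the extension compiled successfully

# Build a small deformable convolution layer and run a dummy batch through it.
dcn = DCN(64, 64, kernel_size=3, stride=1, padding=1, deformable_groups=2).cuda()
x = torch.randn(2, 64, 32, 32).cuda()
print(dcn(x).shape)  # expected: torch.Size([2, 64, 32, 32])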

Train

1. Dataset preparation

You need to prepare the datasets for the following training and testing steps. Detailed instructions can be found at Dataset Setup.

2. Get pretrained models

Our pretrained models (tmnet_single_frame.pth and tmnet_multiple_frames.pth) can be downloaded via Google Drive or Baidu Netdisk (access code: wiq7). After downloading the pretrained models, put them into the $ROOT/checkpoints folder.

3. Set up configuration

Our training settings from the paper can be found in $ROOT/configs/TMNet_single_frame.yaml and $ROOT/configs/TMNet_multiple_frames.yaml. We take these settings as an example to illustrate the training strategy used in our paper.

4. Train the TMNet without the TMB block

We first train the TMNet without the TMB block on the Vimeo-90K Septuplet dataset, following the configuration in $ROOT/configs/TMNet_single_frame.yaml.

If you want to train the TMNet without distributed learning:

python train.py -opt configs/TMNet_single_frame.yaml

If you want to train the TMNet with distributed learning ($GPU_NUMBER means the number of GPUs you used):

python -m torch.distributed.launch --nproc_per_node=$GPU_NUMBER train.py -opt configs/TMNet_single_frame.yaml --launcher pytorch
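
For example, on a machine with four GPUs (the device ids here are illustrative):

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 train.py -opt configs/TMNet_single_frame.yaml --launcher pytorch

The same pattern applies to the fine-tuning command in the next step.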

5. Fine-tune the TMB block

We then fine-tune the TMB block for temporal modulation on the Adobe240fps dataset, with all other parameters fixed, following the configuration in $ROOT/configs/TMNet_multiple_frames.yaml.
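
Conceptually, this stage updates only the TMB parameters. The snippet below is a generic PyTorch sketch of such selective freezing, not the repo's mechanism (the training code drives this through its YAML options, and the module names here are stand-ins):

import torch
import torch.nn as nn

# Stand-in for the full TMNet; only the module names matter for this sketch.
model = nn.Sequential()
model.add_module('backbone', nn.Conv2d(3, 64, 3, padding=1))
model.add_module('tmb', nn.Conv2d(64, 64, 3, padding=1))

# Freeze everything except parameters that belong to the TMB.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith('tmb')

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)  # lr illustrative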

If you want to train the TMNet without distributed learning:

python train.py -opt configs/TMNet_multiple_frames.yaml

If you want to train the TMNet with distributed learning ($GPU_NUMBER means the number of GPUs you used):

python -m torch.distributed.launch --nproc_per_node=$GPU_NUMBER train.py -opt configs/TMNet_multiple_frames.yaml --launcher pytorch

After training, the model, its training states, and the corresponding log file are placed in $ROOT/experiments.

Test

You can evaluate the performance of the trained TMNet for single-frame generation at the intermediate moment on the Vimeo-90K Septuplet dataset (for example, given a 30 fps input video, this code evaluates the generated 60 fps video):

python test_single_frame.py

You can evaluate the performance of the trained TMNet for multi-frame (×6) generation on the Adobe240fps dataset (for example, given a 30 fps input video, this code evaluates the generated 180 fps video):

python test_multiple_frames.py

All evaluation results are placed in $ROOT/evaluations.
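
For reference, PSNR (the main metric reported in the Results section) can be computed as below. This is a generic minimal implementation, not the repo's exact evaluation code, which may crop borders or evaluate on the Y channel:

import numpy as np

def psnr(img1, img2, max_val=255.0):
    # Peak signal-to-noise ratio between two images of identical shape.
    mse = np.mean((img1.astype(np.float64) - img2.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)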

Results

Quantitative Results

Comparison of PSNR, SSIM, speed (in fps), and parameters (in million) by different STVSR methods on Vid4, Vimeo-Fast, Vimeo-Medium, Vimeo-Slow:

Visual Results

Qualitative and quantitative results of different methods on STVSR:

Comparison of flexibility on STVSR by our TMNet (1st, 3rd, and 5th columns) and Zooming Slow-Mo (2nd, 4th, and 6th columns) on three video clips from the Vimeo-Fast dataset:

Temporal consistency of our TMNet on STVSR:

Citation

If you find the code helpful in your research or work, please cite our paper.

@InProceedings{xu2021temporal,
  author = {Gang Xu and Jun Xu and Zhen Li and Liang Wang and Xing Sun and Mingming Cheng},
  title = {Temporal Modulation Network for Controllable Space-Time Video Super-Resolution},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2021}
}

Acknowledgment

Our code is built on Zooming-Slow-Mo-CVPR-2020 and EDVR. We thank the authors for sharing their codes. Our project is sponsored by CAAI-Huawei MindSpore Open Fund.

Contact

If you have any questions, feel free to e-mail me at [email protected].

License

The code is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License for non-commercial use only. Any commercial use requires formal permission in advance.
