xingyul / cpnet

License: other
Learning Video Representations from Correspondence Proposals (CVPR 2019 Oral)

Programming Languages

Python
139335 projects - #7 most used programming language
C++
36643 projects - #6 most used programming language
Shell
77523 projects
Cuda
1817 projects
Makefile
30231 projects

Projects that are alternatives of or similar to cpnet

MiCT-Net-PyTorch
Video Recognition using Mixed Convolutional Tube (MiCT) on PyTorch with a ResNet backbone
Stars: ✭ 48 (-48.39%)
Mutual labels:  action-recognition, video-classification
GST-video
ICCV 19 Grouped Spatial-Temporal Aggregation for Efficient Action Recognition
Stars: ✭ 40 (-56.99%)
Mutual labels:  action-recognition, video-classification
TA3N
[ICCV 2019 Oral] TA3N: https://github.com/cmhungsteve/TA3N (Most updated repo)
Stars: ✭ 45 (-51.61%)
Mutual labels:  action-recognition, video-classification
C3D-tensorflow
Action recognition with C3D network implemented in tensorflow
Stars: ✭ 34 (-63.44%)
Mutual labels:  action-recognition, video-classification
conv3d-video-action-recognition
My experimentation around action recognition in videos. Contains a Keras implementation of the C3D network based on the original paper "Learning Spatiotemporal Features with 3D Convolutional Networks" (Tran et al.), plus video processing pipelines built with the mPyPl package. The model is benchmarked on the popular UCF101 dataset and achieves result…
Stars: ✭ 50 (-46.24%)
Mutual labels:  action-recognition, video-classification
TCE
This repository contains the code used in the paper Temporally Coherent Embeddings for Self-Supervised Video Representation Learning (TCE).
Stars: ✭ 51 (-45.16%)
Mutual labels:  representation-learning, action-recognition
MTL-AQA
What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment [CVPR 2019]
Stars: ✭ 38 (-59.14%)
Mutual labels:  representation-learning, action-recognition
Pose2vec
A repository for maintaining various human-skeleton preprocessing steps in NumPy and TensorFlow, along with a TensorFlow model for learning pose embeddings.
Stars: ✭ 25 (-73.12%)
Mutual labels:  representation-learning, action-recognition
san
The official PyTorch implementation of "Context Matters: Self-Attention for Sign Language Recognition"
Stars: ✭ 17 (-81.72%)
Mutual labels:  action-recognition
two-stream-fusion-for-action-recognition-in-videos
No description or website provided.
Stars: ✭ 80 (-13.98%)
Mutual labels:  action-recognition
poincare embedding
Poincaré Embedding
Stars: ✭ 36 (-61.29%)
Mutual labels:  representation-learning
video repres mas
code for CVPR-2019 paper: Self-supervised Spatio-temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics
Stars: ✭ 63 (-32.26%)
Mutual labels:  action-recognition
torch-points3d
PyTorch framework for deep learning on point clouds.
Stars: ✭ 1,823 (+1860.22%)
Mutual labels:  point-cloud
OverlapPredator
[CVPR 2021, Oral] PREDATOR: Registration of 3D Point Clouds with Low Overlap.
Stars: ✭ 293 (+215.05%)
Mutual labels:  point-cloud
RealSense
Extension of RealSense Unity Wrapper [Unofficial]
Stars: ✭ 31 (-66.67%)
Mutual labels:  point-cloud
theWorldInSafety
Surveillance System Against Violence
Stars: ✭ 31 (-66.67%)
Mutual labels:  action-recognition
game-feature-learning
Code for paper "Cross-Domain Self-supervised Multi-task Feature Learning using Synthetic Imagery", Ren et al., CVPR'18
Stars: ✭ 68 (-26.88%)
Mutual labels:  representation-learning
visual syntactic embedding video captioning
Source code of the paper titled *Improving Video Captioning with Temporal Composition of a Visual-Syntactic Embedding*
Stars: ✭ 23 (-75.27%)
Mutual labels:  representation-learning
TadTR
End-to-end Temporal Action Detection with Transformer. [Under review for a journal publication]
Stars: ✭ 55 (-40.86%)
Mutual labels:  action-recognition
costmap depth camera
A costmap plugin for the costmap_2d package; it supports multiple depth cameras and runs in real time.
Stars: ✭ 26 (-72.04%)
Mutual labels:  point-cloud

Learning Video Representations from Correspondence Proposals

Created by Xingyu Liu, Joon-Young Lee and Hailin Jin from Stanford University and Adobe Research (paper link).

Citation

If you find our work useful in your research, please cite:

    @article{liu:2019:cpnet,
      title={Learning Video Representations from Correspondence Proposals},
      author={Xingyu Liu and Joon-Young Lee and Hailin Jin},
      journal={CVPR},
      year={2019}
    }

Abstract

Correspondences between frames encode rich information about dynamic content in videos. However, it is challenging to effectively capture and learn those due to their irregular structure and complex dynamics. In this paper, we propose a novel neural network that learns video representations by aggregating information from potential correspondences. This network, named CPNet, can learn evolving 2D fields with temporal consistency. In particular, it can effectively learn representations for videos by mixing appearance and long-range motion with an RGB-only input. We provide extensive ablation experiments to validate our model. CPNet shows stronger performance than existing methods on Kinetics and achieves the state-of-the-art performance on Something-Something and Jester. We provide analysis towards the behavior of our model and show its robustness to errors in proposals.

Installation

Install TensorFlow. The code is tested with TF 1.9.0 (GPU version), g++ 5.4.0, CUDA 9.0, and Python 3.5 on Ubuntu 16.04. It also depends on a few Python libraries for data processing and visualization, such as cv2. Access to GPUs is highly recommended.
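
A minimal sketch of the environment setup, assuming a pip-based install (the README names only TF and cv2; the exact dependency list is not pinned):

pip install tensorflow-gpu==1.9.0   # GPU build matching the tested TF version
pip install opencv-python           # provides cv2 for data processing and visualization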

Compile Customized TF Operators

The custom TF operators are included under tf_ops; compile them first by running make in each ops subfolder (see the Makefile). If necessary, update arch in the Makefiles to match the CUDA compute capability of your GPU, as sketched below.
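
For example, a sketch of the compile step, assuming each op lives in its own subfolder of tf_ops as described above:

# Compile every customized op. If your GPU needs a different compute
# capability, edit 'arch' in each Makefile first (e.g. sm_70 for Volta).
for d in tf_ops/*/; do
  (cd "$d" && make)
done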

Data Preprocessing

The data preprocessing scripts are included in utils/data_preparation. Please follow the instructions in the README.md of each subdirectory.

Training and Evaluation

First download the ImageNet pretrained ResNet model from here and put it in pretrained_models/ImageNet-ResNet34.npz.
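
For example (the download URL is linked from the README; the placement below follows the path given above):

mkdir -p pretrained_models
# after downloading ImageNet-ResNet34.npz via the link above:
mv ImageNet-ResNet34.npz pretrained_models/ImageNet-ResNet34.npz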

To train the model on the Jester dataset, rename command_train.sh.jester.experiment to command_train.sh and execute the shell script. Batch size, learning rate, etc. are adjustable in the script. The evaluate and test scripts below follow the same rename-and-run pattern.
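
The rename step as a shell command (run from the repository root):

mv command_train.sh.jester.experiment command_train.sh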

sh command_train.sh

To evaluate the model, rename command_evaluate.sh.jester.experiment to command_evaluate.sh and execute the shell script.

sh command_evaluate.sh

To test the model, rename command_test.sh.jester.experiment to command_test.sh and execute the shell script.

sh command_test.sh

A pre-trained model with a ResNet-34 backbone, trained on the Jester dataset, is provided here for download.

For the Something-Something dataset, the corresponding train, evaluation, and test command files are command_train.sh.something.something.experiment, command_evaluate.sh.something.something.experiment, and command_test.sh.something.something.experiment.
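
The same rename-and-run pattern applies; for example, for training:

mv command_train.sh.something.something.experiment command_train.sh
sh command_train.sh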

A pre-trained model with a ResNet-34 backbone, trained on the Something-Something dataset, is provided here for download.

License

Our code is released under the CC BY-NC-SA 4.0 license (see the LICENSE file for details).
