
doc-doc / vRGV

Licence: other
Visual Relation Grounding in Videos (ECCV'20, Spotlight)

Programming Languages

python
c
Cuda
cython

Projects that are alternatives of or similar to vRGV

PSTCR
Q. Zhang, Q. Yuan, J. Li, Z. Li, H. Shen, and L. Zhang, "Thick Cloud and Cloud Shadow Removal in Multitemporal Images using Progressively Spatio-Temporal Patch Group Learning", ISPRS Journal, 2020.
Stars: ✭ 43 (-20.37%)
Mutual labels:  spatio-temporal
Hierarchical-Word-Sense-Disambiguation-using-WordNet-Senses
Word Sense Disambiguation using Word Specific models, All word models and Hierarchical models in Tensorflow
Stars: ✭ 33 (-38.89%)
Mutual labels:  hierarchical
pred-rnn
PredRNN: Recurrent Neural Networks for Predictive Learning using Spatiotemporal LSTMs
Stars: ✭ 115 (+112.96%)
Mutual labels:  spatio-temporal
Spatio-Temporal-papers
This project is a collection of recent research in areas such as new infrastructure and urban computing, including white papers, academic papers, AI lab and dataset etc.
Stars: ✭ 180 (+233.33%)
Mutual labels:  spatio-temporal
simple-page-ordering
Order your pages and other hierarchical post types with simple drag and drop right from the standard page list.
Stars: ✭ 88 (+62.96%)
Mutual labels:  hierarchical
st dbscan
ST-DBSCAN: Simple and effective tool for spatial-temporal clustering
Stars: ✭ 82 (+51.85%)
Mutual labels:  spatio-temporal
pconf
Hierarchical python configuration with files, environment variables and command-line arguments.
Stars: ✭ 17 (-68.52%)
Mutual labels:  hierarchical
wattnet-fx-trading
WATTNet: Learning to Trade FX with Hierarchical Spatio-Temporal Representations of Highly Multivariate Time Series
Stars: ✭ 70 (+29.63%)
Mutual labels:  spatio-temporal
R2Plus1D-C3D
A PyTorch implementation of R2Plus1D and C3D based on CVPR 2017 paper "A Closer Look at Spatiotemporal Convolutions for Action Recognition" and CVPR 2014 paper "Learning Spatiotemporal Features with 3D Convolutional Networks"
Stars: ✭ 54 (+0%)
Mutual labels:  spatio-temporal
metacoder
Parsing, Manipulation, and Visualization of Metabarcoding/Taxonomic data
Stars: ✭ 120 (+122.22%)
Mutual labels:  hierarchical
LBYLNet
[CVPR2021] Look before you leap: learning landmark features for one-stage visual grounding.
Stars: ✭ 46 (-14.81%)
Mutual labels:  visual-grounding
bbhtm
bare-bone Hierarchical Temporal Memory
Stars: ✭ 14 (-74.07%)
Mutual labels:  hierarchical
SemEval2019Task3
Code for ANA at SemEval-2019 Task 3
Stars: ✭ 41 (-24.07%)
Mutual labels:  hierarchical
UnityHFSM
A simple yet powerful class based hierarchical finite state machine for Unity3D
Stars: ✭ 243 (+350%)
Mutual labels:  hierarchical
Traffic-Prediction-Open-Code-Summary
Summary of open source code for deep learning models in the field of traffic prediction
Stars: ✭ 58 (+7.41%)
Mutual labels:  spatio-temporal
st-hadoop
ST-Hadoop is an open-source MapReduce extension of Hadoop designed specially to analyze your spatio-temporal data efficiently
Stars: ✭ 17 (-68.52%)
Mutual labels:  spatio-temporal
scanstatistics
An R package for space-time anomaly detection using scan statistics.
Stars: ✭ 41 (-24.07%)
Mutual labels:  spatio-temporal
pytorch-psetae
PyTorch implementation of the model presented in "Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention"
Stars: ✭ 117 (+116.67%)
Mutual labels:  spatio-temporal
ConvLSTM-PyTorch
ConvLSTM/ConvGRU (Encoder-Decoder) with PyTorch on Moving-MNIST
Stars: ✭ 202 (+274.07%)
Mutual labels:  spatio-temporal
CAST
Developer Version of the R package CAST: Caret Applications for Spatio-Temporal models
Stars: ✭ 65 (+20.37%)
Mutual labels:  spatio-temporal

Visual Relation Grounding in Videos

This is the PyTorch implementation of our work at ECCV 2020 (Spotlight). The repository mainly includes three parts: (1) RoI feature extraction; (2) training and inference; and (3) relation-aware trajectory generation.

Notes

Fixed an issue with unstable results [2021/10/07].

Environment

Anaconda 3, Python 3.6.5, PyTorch 0.4.1 (a higher version is fine once the features are ready), and CUDA >= 9.0. For other libs, please refer to requirements.txt.

Install

Please create an environment for this project using Anaconda 3 (install Anaconda first):

>conda create -n envname python=3.6.5 # Create
>conda activate envname # Enter
>pip install -r requirements.txt # Install the provided libs
>sh vRGV/lib/make.sh # Set up the detection environment; make sure nvcc is available
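
As an optional sanity check (plain PyTorch calls, nothing project-specific), you can confirm that the GPU build is usable before moving on:

# Optional sanity check (not part of the repo).
import torch

print('PyTorch version:', torch.__version__)          # expect 0.4.1 or higher
print('CUDA available:', torch.cuda.is_available())   # should be True for detection and training
if torch.cuda.is_available():
    print('GPU:', torch.cuda.get_device_name(0))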

Data Preparation

Please download the data here. The folder ground_data should be placed in the same directory as vRGV. Please merge the downloaded vRGV folder with this repo.

Please download the videos here and extract the frames into ground_data. The directory should look like: ground_data/vidvrd/JPEGImages/ILSVRC2015_train_xxx/000000.JPEG.
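
If you want to double-check the layout before feature extraction, a small hypothetical helper like the following (not part of the repo) can count the extracted frames:

# check_frames.py -- optional layout check; the path follows the structure described above.
import glob, os

root = 'ground_data/vidvrd/JPEGImages'
videos = sorted(glob.glob(os.path.join(root, 'ILSVRC2015_train_*')))
print('Found {} video folders under {}'.format(len(videos), root))
if videos:
    frames = sorted(glob.glob(os.path.join(videos[0], '*.JPEG')))
    print('{}: {} frames'.format(os.path.basename(videos[0]), len(frames)))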

Usage

Feature Extraction. (This needs about 100 GB of storage, because all detected bounding boxes are dumped along with their features. The footprint can be greatly reduced by changing detect_frame.py to return only the top-40 bounding boxes and save them as .npz files; see the sketch after the command below.)

./detection.sh 0 val #(or train)
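
As a rough sketch of the storage-saving change mentioned above (the variable names are hypothetical, not the actual code in detect_frame.py), keeping only the top-40 boxes per frame and saving them compressed could look like this:

# Illustrative only: keep the 40 highest-scoring boxes per frame and store them as a compressed .npz file.
import numpy as np

def save_topk_detections(out_path, boxes, scores, features, k=40):
    # boxes: (N, 4), scores: (N,), features: (N, D) -- hypothetical per-frame detector outputs
    keep = np.argsort(scores)[::-1][:k]   # indices of the k highest scores
    np.savez_compressed(out_path, boxes=boxes[keep], scores=scores[keep], features=features[keep])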

Sample video features:

cd tools
python sample_video_feature.py

Test. You can use our provided model to verify the features and the environment:

./ground.sh 0 val # Output the relation-aware spatio-temporal attention
python generate_track_link.py # Generate relation-aware trajectories with the Viterbi algorithm (see the sketch below)
python eval_ground.py # Evaluate the performance

You should get an accuracy of Acc_R: 24.58%.
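
For intuition, the trajectory-generation step links per-frame boxes with a Viterbi-style dynamic program. A minimal sketch of that idea, assuming per-frame boxes with attention scores and a simplified IoU smoothness term (not the repo's exact formulation):

# Minimal Viterbi-style box linking (illustrative): pick one box per frame so that the summed
# attention scores plus an IoU smoothness term between consecutive frames is maximized.
import numpy as np

def iou(a, b):
    # a, b: [x1, y1, x2, y2]
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-8)

def link_track(boxes_per_frame, scores_per_frame, lam=1.0):
    # boxes_per_frame: list of (N_t, 4) arrays; scores_per_frame: list of (N_t,) arrays
    dp = [np.asarray(scores_per_frame[0], dtype=float)]
    back = []
    for t in range(1, len(boxes_per_frame)):
        trans = np.array([[iou(p, c) for p in boxes_per_frame[t - 1]]
                          for c in boxes_per_frame[t]])   # (N_t, N_{t-1}) pairwise IoU
        m = dp[-1][None, :] + lam * trans                  # score of reaching each current box from each previous one
        back.append(m.argmax(axis=1))                      # best previous box for each current box
        dp.append(np.asarray(scores_per_frame[t], dtype=float) + m.max(axis=1))
    idx = [int(dp[-1].argmax())]                           # backtrack the best path
    for bp in reversed(back):
        idx.append(int(bp[idx[-1]]))
    return idx[::-1]                                       # one box index per frame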

Train. If you want to train the model from scratch, please apply a two-stage training scheme: 1) train a basic model without relation attention, and 2) load the reconstruction part of the pre-trained model and learn the whole model (with the same learning rate). For implementation, please turn [pretrain] off/on in line 52 of ground.py, and switch between lines 6 & 7 in ground_relation.py for 1st- and 2nd-stage training respectively. You also need to change the model files in lines 69 & 70 of ground_relation.py to the best model obtained in the first stage before 2nd-stage training. A schematic sketch of the stage-2 weight loading is given below.

./ground.sh 0 train # Train the model with GPU id 0

The results may differ slightly (±0.5%). For comparison, please follow the results reported in our paper.
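
As a schematic view of the two-stage scheme (the class and attribute names below are toy stand-ins, not the actual code in ground_relation.py): stage 1 trains only the reconstruction part, and stage 2 initializes the full model from that checkpoint before joint training with the same learning rate.

# Illustrative only: how stage-2 training can reuse stage-1 reconstruction weights.
import torch
import torch.nn as nn

class ToyGrounder(nn.Module):
    def __init__(self):
        super(ToyGrounder, self).__init__()
        self.reconstruction = nn.Linear(512, 512)   # stands in for the reconstruction branch
        self.relation_attn = nn.Linear(512, 512)    # stands in for the relation-attention branch

# Stage 1: train the basic model (reconstruction only), then save its weights.
stage1 = ToyGrounder()
torch.save(stage1.reconstruction.state_dict(), 'stage1_reconstruction.pth')

# Stage 2: build the full model, load only the reconstruction weights, then train jointly.
stage2 = ToyGrounder()
stage2.reconstruction.load_state_dict(torch.load('stage1_reconstruction.pth'))
optimizer = torch.optim.Adam(stage2.parameters(), lr=1e-4)  # the lr value here is illustrative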

Result Visualization

Query examples (result visualizations omitted): bicycle-jump_beneath-person, person-feed-elephant, person-stand_above-bicycle, dog-watch-turtle, person-ride-horse, person-ride-bicycle, person-drive-car, bicycle-move_toward-car.

Citation

@inproceedings{xiao2020visual,
  title={Visual Relation Grounding in Videos},
  author={Xiao, Junbin and Shang, Xindi and Yang, Xun and Tang, Sheng and Chua, Tat-Seng},
  booktitle={European Conference on Computer Vision},
  pages={447--464},
  year={2020},
  organization={Springer}
}

License

NUS © NExT++
