
doc-doc / vRGV

Licence: other
Visual Relation Grounding in Videos (ECCV'20, Spotlight)

Programming Languages

python
c
Cuda
cython

Projects that are alternatives of or similar to vRGV

PSTCR
Q. Zhang, Q. Yuan, J. Li, Z. Li, H. Shen, and L. Zhang, "Thick Cloud and Cloud Shadow Removal in Multitemporal Images using Progressively Spatio-Temporal Patch Group Learning", ISPRS Journal, 2020.
Stars: ✭ 43 (-20.37%)
Mutual labels:  spatio-temporal
Hierarchical-Word-Sense-Disambiguation-using-WordNet-Senses
Word Sense Disambiguation using Word Specific models, All word models and Hierarchical models in Tensorflow
Stars: ✭ 33 (-38.89%)
Mutual labels:  hierarchical
pred-rnn
PredRNN: Recurrent Neural Networks for Predictive Learning using Spatiotemporal LSTMs
Stars: ✭ 115 (+112.96%)
Mutual labels:  spatio-temporal
Spatio-Temporal-papers
This project is a collection of recent research in areas such as new infrastructure and urban computing, including white papers, academic papers, AI lab and dataset etc.
Stars: ✭ 180 (+233.33%)
Mutual labels:  spatio-temporal
simple-page-ordering
Order your pages and other hierarchical post types with simple drag and drop right from the standard page list.
Stars: ✭ 88 (+62.96%)
Mutual labels:  hierarchical
st dbscan
ST-DBSCAN: Simple and effective tool for spatial-temporal clustering
Stars: ✭ 82 (+51.85%)
Mutual labels:  spatio-temporal
pconf
Hierarchical python configuration with files, environment variables and command-line arguments.
Stars: ✭ 17 (-68.52%)
Mutual labels:  hierarchical
wattnet-fx-trading
WATTNet: Learning to Trade FX with Hierarchical Spatio-Temporal Representations of Highly Multivariate Time Series
Stars: ✭ 70 (+29.63%)
Mutual labels:  spatio-temporal
R2Plus1D-C3D
A PyTorch implementation of R2Plus1D and C3D based on CVPR 2017 paper "A Closer Look at Spatiotemporal Convolutions for Action Recognition" and CVPR 2014 paper "Learning Spatiotemporal Features with 3D Convolutional Networks"
Stars: ✭ 54 (+0%)
Mutual labels:  spatio-temporal
metacoder
Parsing, Manipulation, and Visualization of Metabarcoding/Taxonomic data
Stars: ✭ 120 (+122.22%)
Mutual labels:  hierarchical
LBYLNet
[CVPR2021] Look before you leap: learning landmark features for one-stage visual grounding.
Stars: ✭ 46 (-14.81%)
Mutual labels:  visual-grounding
bbhtm
bare-bone Hierarchical Temporal Memory
Stars: ✭ 14 (-74.07%)
Mutual labels:  hierarchical
SemEval2019Task3
Code for ANA at SemEval-2019 Task 3
Stars: ✭ 41 (-24.07%)
Mutual labels:  hierarchical
UnityHFSM
A simple yet powerful class based hierarchical finite state machine for Unity3D
Stars: ✭ 243 (+350%)
Mutual labels:  hierarchical
Traffic-Prediction-Open-Code-Summary
Summary of open source code for deep learning models in the field of traffic prediction
Stars: ✭ 58 (+7.41%)
Mutual labels:  spatio-temporal
st-hadoop
ST-Hadoop is an open-source MapReduce extension of Hadoop designed specially to analyze your spatio-temporal data efficiently
Stars: ✭ 17 (-68.52%)
Mutual labels:  spatio-temporal
scanstatistics
An R package for space-time anomaly detection using scan statistics.
Stars: ✭ 41 (-24.07%)
Mutual labels:  spatio-temporal
pytorch-psetae
PyTorch implementation of the model presented in "Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention"
Stars: ✭ 117 (+116.67%)
Mutual labels:  spatio-temporal
ConvLSTM-PyTorch
ConvLSTM/ConvGRU (Encoder-Decoder) with PyTorch on Moving-MNIST
Stars: ✭ 202 (+274.07%)
Mutual labels:  spatio-temporal
CAST
Developer Version of the R package CAST: Caret Applications for Spatio-Temporal models
Stars: ✭ 65 (+20.37%)
Mutual labels:  spatio-temporal

Visual Relation Grounding in Videos

This is the PyTorch implementation of our work at ECCV 2020 (Spotlight). The repository mainly includes three parts: (1) RoI feature extraction; (2) training and inference; and (3) relation-aware trajectory generation.

Notes

Fixed an issue with unstable results [2021/10/07].

Environment

Anaconda 3, Python 3.6.5, PyTorch 0.4.1 (a higher version is fine once the features are ready), and CUDA >= 9.0. For other libs, please refer to requirements.txt.

Install

Please create an environment for this project using Anaconda 3 (install Anaconda first):

>conda create -n envname python=3.6.5 # Create
>conda activate envname # Enter
>pip install -r requirements.txt # Install the provided libs
>sh vRGV/lib/make.sh # Set up the detection environment; make sure nvcc is available
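
As an optional sanity check (plain PyTorch calls, nothing project-specific), you can confirm that the GPU build is usable before moving on:

# Optional sanity check (not part of the repo).
import torch

print('PyTorch version:', torch.__version__)          # expect 0.4.1 or higher
print('CUDA available:', torch.cuda.is_available())   # should be True for detection and training
if torch.cuda.is_available():
    print('GPU:', torch.cuda.get_device_name(0))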

Data Preparation

Please download the data here. The folder ground_data should be placed in the same directory as vRGV. Please merge the downloaded vRGV folder with this repo.

Please download the videos here and extract the frames into ground_data. The directory should look like: ground_data/vidvrd/JPEGImages/ILSVRC2015_train_xxx/000000.JPEG.
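
If you want to double-check the layout before feature extraction, a small hypothetical helper like the following (not part of the repo) can count the extracted frames:

# check_frames.py -- optional layout check; the path follows the structure described above.
import glob, os

root = 'ground_data/vidvrd/JPEGImages'
videos = sorted(glob.glob(os.path.join(root, 'ILSVRC2015_train_*')))
print('Found {} video folders under {}'.format(len(videos), root))
if videos:
    frames = sorted(glob.glob(os.path.join(videos[0], '*.JPEG')))
    print('{}: {} frames'.format(os.path.basename(videos[0]), len(frames)))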

Usage

Feature Extraction. (This needs about 100 GB of storage, because all detected bounding boxes are dumped along with their features. The footprint can be greatly reduced by changing detect_frame.py to return only the top-40 bounding boxes and save them as .npz files; see the sketch after the command below.)

./detection.sh 0 val #(or train)
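
As a rough sketch of the storage-saving change mentioned above (the variable names are hypothetical, not the actual code in detect_frame.py), keeping only the top-40 boxes per frame and saving them compressed could look like this:

# Illustrative only: keep the 40 highest-scoring boxes per frame and store them as a compressed .npz file.
import numpy as np

def save_topk_detections(out_path, boxes, scores, features, k=40):
    # boxes: (N, 4), scores: (N,), features: (N, D) -- hypothetical per-frame detector outputs
    keep = np.argsort(scores)[::-1][:k]   # indices of the k highest scores
    np.savez_compressed(out_path, boxes=boxes[keep], scores=scores[keep], features=features[keep])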

Sample video features:

cd tools
python sample_video_feature.py

Test. You can use our provided model to verify the features and the environment:

./ground.sh 0 val # Output the relation-aware spatio-temporal attention
python generate_track_link.py # Generate relation-aware trajectories with the Viterbi algorithm (see the sketch below)
python eval_ground.py # Evaluate the performance

You should get an accuracy of Acc_R: 24.58%.
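
For intuition, the trajectory-generation step links per-frame boxes with a Viterbi-style dynamic program. A minimal sketch of that idea, assuming per-frame boxes with attention scores and a simplified IoU smoothness term (not the repo's exact formulation):

# Minimal Viterbi-style box linking (illustrative): pick one box per frame so that the summed
# attention scores plus an IoU smoothness term between consecutive frames is maximized.
import numpy as np

def iou(a, b):
    # a, b: [x1, y1, x2, y2]
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-8)

def link_track(boxes_per_frame, scores_per_frame, lam=1.0):
    # boxes_per_frame: list of (N_t, 4) arrays; scores_per_frame: list of (N_t,) arrays
    dp = [np.asarray(scores_per_frame[0], dtype=float)]
    back = []
    for t in range(1, len(boxes_per_frame)):
        trans = np.array([[iou(p, c) for p in boxes_per_frame[t - 1]]
                          for c in boxes_per_frame[t]])   # (N_t, N_{t-1}) pairwise IoU
        m = dp[-1][None, :] + lam * trans                  # score of reaching each current box from each previous one
        back.append(m.argmax(axis=1))                      # best previous box for each current box
        dp.append(np.asarray(scores_per_frame[t], dtype=float) + m.max(axis=1))
    idx = [int(dp[-1].argmax())]                           # backtrack the best path
    for bp in reversed(back):
        idx.append(int(bp[idx[-1]]))
    return idx[::-1]                                       # one box index per frame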

Train. If you want to train the model from scratch, please apply a two-stage training scheme: 1) train a basic model without relation attention, and 2) load the reconstruction part of the pre-trained model and learn the whole model (with the same learning rate). For implementation, please turn [pretrain] off/on in line 52 of ground.py, and switch between lines 6 & 7 in ground_relation.py for 1st- and 2nd-stage training respectively. You also need to change the model files in lines 69 & 70 of ground_relation.py to the best model obtained in the first stage before 2nd-stage training. A schematic sketch of the stage-2 weight loading is given below.

./ground.sh 0 train # Train the model with GPU id 0

The results may differ slightly (±0.5%). For comparison, please follow the results reported in our paper.
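
As a schematic view of the two-stage scheme (the class and attribute names below are toy stand-ins, not the actual code in ground_relation.py): stage 1 trains only the reconstruction part, and stage 2 initializes the full model from that checkpoint before joint training with the same learning rate.

# Illustrative only: how stage-2 training can reuse stage-1 reconstruction weights.
import torch
import torch.nn as nn

class ToyGrounder(nn.Module):
    def __init__(self):
        super(ToyGrounder, self).__init__()
        self.reconstruction = nn.Linear(512, 512)   # stands in for the reconstruction branch
        self.relation_attn = nn.Linear(512, 512)    # stands in for the relation-attention branch

# Stage 1: train the basic model (reconstruction only), then save its weights.
stage1 = ToyGrounder()
torch.save(stage1.reconstruction.state_dict(), 'stage1_reconstruction.pth')

# Stage 2: build the full model, load only the reconstruction weights, then train jointly.
stage2 = ToyGrounder()
stage2.reconstruction.load_state_dict(torch.load('stage1_reconstruction.pth'))
optimizer = torch.optim.Adam(stage2.parameters(), lr=1e-4)  # the lr value here is illustrative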

Result Visualization

Query examples (result visualizations omitted): bicycle-jump_beneath-person, person-feed-elephant, person-stand_above-bicycle, dog-watch-turtle, person-ride-horse, person-ride-bicycle, person-drive-car, bicycle-move_toward-car.

Citation

@inproceedings{xiao2020visual,
  title={Visual Relation Grounding in Videos},
  author={Xiao, Junbin and Shang, Xindi and Yang, Xun and Tang, Sheng and Chua, Tat-Seng},
  booktitle={European Conference on Computer Vision},
  pages={447--464},
  year={2020},
  organization={Springer}
}

License

NUS © NExT++
