
bryanyzhu / Two Stream Pytorch

License: MIT
PyTorch implementation of two-stream networks for video action recognition


Projects that are alternatives of or similar to Two Stream Pytorch

TadTR
End-to-end Temporal Action Detection with Transformer. [Under review for a journal publication]
Stars: ✭ 55 (-87.15%)
Mutual labels:  action-recognition
auditory-slow-fast
Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch
Stars: ✭ 46 (-89.25%)
Mutual labels:  action-recognition
Awesome Action Recognition
A curated list of action recognition and related area resources
Stars: ✭ 3,202 (+648.13%)
Mutual labels:  action-recognition
torch-lrcn
An implementation of the LRCN in Torch
Stars: ✭ 85 (-80.14%)
Mutual labels:  action-recognition
kinect-gesture
Kinect-based human action recognition
Stars: ✭ 129 (-69.86%)
Mutual labels:  action-recognition
DIN-Group-Activity-Recognition-Benchmark
A new codebase for Group Activity Recognition. It contains codes for ICCV 2021 paper: Spatio-Temporal Dynamic Inference Network for Group Activity Recognition and some other methods.
Stars: ✭ 26 (-93.93%)
Mutual labels:  action-recognition
two-stream-fusion-for-action-recognition-in-videos
No description or website provided.
Stars: ✭ 80 (-81.31%)
Mutual labels:  action-recognition
Video Understanding Dataset
A collection of recent video understanding datasets, under construction!
Stars: ✭ 387 (-9.58%)
Mutual labels:  action-recognition
vlog action recognition
Identifying Visible Actions in Lifestyle Vlogs
Stars: ✭ 13 (-96.96%)
Mutual labels:  action-recognition
Realtime Action Detection
This repository hosts the code for a real-time action detection paper
Stars: ✭ 271 (-36.68%)
Mutual labels:  action-recognition
ntu-x
NTU-X, an extended version of the popular NTU dataset
Stars: ✭ 55 (-87.15%)
Mutual labels:  action-recognition
GST-video
ICCV 19 Grouped Spatial-Temporal Aggregation for Efficient Action Recognition
Stars: ✭ 40 (-90.65%)
Mutual labels:  action-recognition
DEAR
[ICCV 2021 Oral] Deep Evidential Action Recognition
Stars: ✭ 36 (-91.59%)
Mutual labels:  action-recognition
cpnet
Learning Video Representations from Correspondence Proposals (CVPR 2019 Oral)
Stars: ✭ 93 (-78.27%)
Mutual labels:  action-recognition
Action Recognition Visual Attention
Action recognition using soft attention based deep recurrent neural networks
Stars: ✭ 350 (-18.22%)
Mutual labels:  action-recognition
Robust-Deep-Learning-Pipeline
Deep Convolutional Bidirectional LSTM for Complex Activity Recognition with Missing Data. Human Activity Recognition Challenge. Springer SIST (2020)
Stars: ✭ 20 (-95.33%)
Mutual labels:  action-recognition
tennis action recognition
Using deep learning to perform action recognition in the sport of tennis.
Stars: ✭ 17 (-96.03%)
Mutual labels:  action-recognition
Realtime Action Recognition
Apply ML to the skeletons from OpenPose; 9 actions; multiple people. (WARNING: this is only suitable for a course demo, not for real-world applications, which are much more difficult!)
Stars: ✭ 417 (-2.57%)
Mutual labels:  action-recognition
Awesome Skeleton Based Action Recognition
Skeleton-based Action Recognition
Stars: ✭ 360 (-15.89%)
Mutual labels:  action-recognition
3d Resnets Pytorch
3D ResNets for Action Recognition (CVPR 2018)
Stars: ✭ 3,169 (+640.42%)
Mutual labels:  action-recognition

PyTorch implementation of popular two-stream frameworks for video action recognition

The current release is a PyTorch implementation of "Towards Good Practices for Very Deep Two-Stream ConvNets". See the paper on arXiv for more details.

If you find this implementation useful in your work, please acknowledge it appropriately and cite the paper or code accordingly:

@article{zhu_arxiv2020_comprehensiveVideo,
  title={A Comprehensive Study of Deep Video Action Recognition},
  author={Yi Zhu and Xinyu Li and Chunhui Liu and Mohammadreza Zolfaghari and Yuanjun Xiong and Chongruo Wu and Zhi Zhang and Joseph Tighe and R. Manmatha and Mu Li},
  journal={arXiv preprint arXiv:2012.06567},
  year={2020}
}

@inproceedings{wang_eccv2016_tsn,
  title={Temporal Segment Networks: Towards Good Practices for Deep Action Recognition},
  author={Limin Wang and Yuanjun Xiong and Zhe Wang and Yu Qiao and Dahua Lin and Xiaoou Tang and Luc Van Gool},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2016}
}

@inproceedings{zhu_accv2018_hidden,
  title={Hidden Two-Stream Convolutional Networks for Action Recognition},
  author={Yi Zhu and Zhenzhong Lan and Shawn Newsam and Alexander G. Hauptmann},
  booktitle={Asian Conference on Computer Vision (ACCV)},
  url={https://arxiv.org/abs/1704.00389},
  year={2018}
}

If you are looking for a ready-to-use codebase with a large model zoo, please check out the video toolkit at GluonCV. We have SOTA model implementations (TSN, I3D, NLN, SlowFast, etc.) for popular datasets (Kinetics400, UCF101, Something-Something-v2, etc.) in both PyTorch and MXNet. We also have an accompanying survey paper and video tutorial.

Installation

Tested with PyTorch on:

OS: Ubuntu 16.04
Python: 3.5
CUDA: 8.0
OpenCV3
dense_flow

To install dense_flow (branch opencv-3.1), you will probably need to build OpenCV 3 with opencv_contrib. (With OpenCV 2.4.13, dense_flow installs more easily without opencv_contrib, but you should still run the code in this repository under OpenCV 3 to avoid errors.)

The code also works with Python 2.7.

Data Preparation

Download the UCF101 dataset and extract the videos with unrar x UCF101.rar.

Convert videos to frames and extract optical flow:

python build_of.py --src_dir ./UCF-101 --out_dir ./ucf101_frames --df_path <path to dense_flow>

Build file lists for training and validation:

python build_file_list.py --frame_path ./ucf101_frames --out_list_path ./settings
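The generated file lists typically contain one entry per clip. Assuming the common three-column format (frame directory, frame count, class label; the exact columns emitted by build_file_list.py may differ), each line can be parsed like this:

```python
def parse_list_line(line):
    # Parse one entry of the assumed form:
    #   <frame_dir> <num_frames> <label>
    # NOTE: the exact format written by build_file_list.py is an assumption here.
    frame_dir, num_frames, label = line.strip().split()
    return frame_dir, int(num_frames), int(label)

sample = "ucf101_frames/ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01 120 0"
record = parse_list_line(sample)
```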

Training

For spatial stream (single RGB frame), run:

python main_single_gpu.py DATA_PATH -m rgb -a rgb_resnet152 --new_length=1 \
--epochs 250 --lr 0.001 --lr_steps 100 200

For temporal stream (10 consecutive optical flow images), run:

python main_single_gpu.py DATA_PATH -m flow -a flow_resnet152 \
--new_length=10 --epochs 350 --lr 0.001 --lr_steps 200 300

DATA_PATH is where you store the RGB frames or optical flow images. Adjust the argparse parameters as needed.
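With --new_length=10, the temporal stream sees 10 stacked flow frames, i.e. 10 x (x, y) = 20 input channels, so the first convolution of an RGB-pretrained network has to be adapted. A common trick (a sketch of the idea, not the repository's exact code) is to average the pretrained 3-channel weights and replicate them across the 20 flow channels:

```python
import torch
import torch.nn as nn

new_length = 10
flow_channels = 2 * new_length  # x- and y-flow per frame -> 20 channels

# Stand-ins for a ResNet's first conv; a real model would load pretrained weights.
rgb_conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
flow_conv1 = nn.Conv2d(flow_channels, 64, kernel_size=7, stride=2, padding=3, bias=False)

with torch.no_grad():
    # Average over the 3 RGB input channels, then replicate to 20 channels.
    mean_w = rgb_conv1.weight.mean(dim=1, keepdim=True)      # (64, 1, 7, 7)
    flow_conv1.weight.copy_(mean_w.repeat(1, flow_channels, 1, 1))

out = flow_conv1(torch.randn(1, flow_channels, 224, 224))
```

This "cross-modality" initialization keeps the pretrained filters' spatial structure while accepting the wider flow input.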

Testing

Go into the "scripts/eval_ucf101_pytorch" folder, run python spatial_demo.py to obtain the spatial-stream result, and run python temporal_demo.py to obtain the temporal-stream result. Update the label file paths before running the scripts.
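The two streams are usually combined by late fusion: average (optionally weighted) the per-class softmax scores from the spatial and temporal networks and take the argmax. A minimal sketch with made-up scores (the demo scripts' actual fusion may differ):

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of raw class scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(spatial_scores, flow_scores, flow_weight=1.0):
    # Weighted average of the two streams' softmax probabilities.
    p_rgb = softmax(spatial_scores)
    p_flow = softmax(flow_scores)
    return [(a + flow_weight * b) / (1.0 + flow_weight)
            for a, b in zip(p_rgb, p_flow)]

# Toy 3-class example: the streams disagree; fusion follows the stronger evidence.
spatial = [2.0, 1.0, 0.1]
flow = [0.5, 3.0, 0.2]
fused = fuse(spatial, flow)
pred = max(range(len(fused)), key=fused.__getitem__)
```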

For ResNet152, I obtain 85.60% accuracy for the spatial stream and 85.71% for the temporal stream on split 1 of UCF101. The results look promising. Pre-trained models: RGB_ResNet152, Flow_ResNet152.

For VGG16, I obtain 78.5% accuracy for the spatial stream and 80.4% for the temporal stream on split 1 of UCF101. The spatial result is close to the number reported in the original paper, but the flow result is about 5% lower. There are several possible reasons: the pretrained VGG16 model in PyTorch may differ from the Caffe one, or there may be subtle bugs in my VGG16 flow model. Comments are welcome if you find the cause of the performance gap. Pre-trained models: RGB_VGG16, Flow_VGG16.

I am experimenting with memory-efficient DenseNet now and will release the code in a couple of days. Stay tuned.

Related Projects

TSN: Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

Hidden Two-Stream: Hidden Two-Stream Convolutional Networks for Action Recognition
