
leftthomas / R2Plus1D-C3D

Licence: other
A PyTorch implementation of R2Plus1D and C3D based on the CVPR 2018 paper "A Closer Look at Spatiotemporal Convolutions for Action Recognition" and the ICCV 2015 paper "Learning Spatiotemporal Features with 3D Convolutional Networks"

Programming Languages

python

Projects that are alternatives of or similar to R2Plus1D-C3D

Hake Action
As a part of the HAKE project, includes the reproduced SOTA models and the corresponding HAKE-enhanced versions (CVPR2020).
Stars: ✭ 72 (+33.33%)
Mutual labels:  activity-recognition
Timeception
Timeception for Complex Action Recognition, CVPR 2019 (Oral Presentation)
Stars: ✭ 153 (+183.33%)
Mutual labels:  activity-recognition
Video Caffe
Video-friendly caffe -- comes with the most recent version of Caffe (as of Jan 2019), a video reader, 3D(ND) pooling layer, and an example training script for C3D network and UCF-101 data
Stars: ✭ 172 (+218.52%)
Mutual labels:  activity-recognition
M Pact
A one stop shop for all of your activity recognition needs.
Stars: ✭ 85 (+57.41%)
Mutual labels:  activity-recognition
Hake
HAKE: Human Activity Knowledge Engine (CVPR'18/19/20, NeurIPS'20)
Stars: ✭ 132 (+144.44%)
Mutual labels:  activity-recognition
Motion Sense
MotionSense Dataset for Human Activity and Attribute Recognition ( time-series data generated by smartphone's sensors: accelerometer and gyroscope)
Stars: ✭ 159 (+194.44%)
Mutual labels:  activity-recognition
Wdk
The Wearables Development Toolkit - a development environment for activity recognition applications with sensor signals
Stars: ✭ 68 (+25.93%)
Mutual labels:  activity-recognition
Gait-Recognition-Using-Smartphones
Deep Learning-Based Gait Recognition Using Smartphones in the Wild
Stars: ✭ 77 (+42.59%)
Mutual labels:  activity-recognition
Awesome Activity Prediction
Paper list of activity prediction and related area
Stars: ✭ 147 (+172.22%)
Mutual labels:  activity-recognition
C3d Keras
C3D for Keras + TensorFlow
Stars: ✭ 171 (+216.67%)
Mutual labels:  activity-recognition
T3d
Temporal 3D ConvNet
Stars: ✭ 97 (+79.63%)
Mutual labels:  activity-recognition
Machinelearning
Learning materials and research introductions on machine learning
Stars: ✭ 1,707 (+3061.11%)
Mutual labels:  activity-recognition
Deep Learning Activity Recognition
A tutorial for using deep learning for activity recognition (Pytorch and Tensorflow)
Stars: ✭ 159 (+194.44%)
Mutual labels:  activity-recognition
Hake Action Torch
HAKE-Action in PyTorch
Stars: ✭ 74 (+37.04%)
Mutual labels:  activity-recognition
Charades Algorithms
Activity Recognition Algorithms for the Charades Dataset
Stars: ✭ 181 (+235.19%)
Mutual labels:  activity-recognition
React Native Activity Recognition
React Native wrapper for the Activity Recognition API.
Stars: ✭ 69 (+27.78%)
Mutual labels:  activity-recognition
Fall Detection
Human Fall Detection from CCTV camera feed
Stars: ✭ 154 (+185.19%)
Mutual labels:  activity-recognition
gaitutils
Extract and visualize gait data
Stars: ✭ 28 (-48.15%)
Mutual labels:  c3d
Step
STEP: Spatio-Temporal Progressive Learning for Video Action Detection. CVPR'19 (Oral)
Stars: ✭ 196 (+262.96%)
Mutual labels:  activity-recognition
Rnn For Human Activity Recognition Using 2d Pose Input
Activity Recognition from 2D pose using an LSTM RNN
Stars: ✭ 165 (+205.56%)
Mutual labels:  activity-recognition

R2Plus1D-C3D

A PyTorch implementation of R2Plus1D and C3D based on the CVPR 2018 paper A Closer Look at Spatiotemporal Convolutions for Action Recognition and the ICCV 2015 paper Learning Spatiotemporal Features with 3D Convolutional Networks.
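
The key idea of R2Plus1D is to factorize each t×d×d 3D convolution into a 1×d×d spatial convolution followed by a t×1×1 temporal convolution, with the hidden width chosen so the parameter count roughly matches the full 3D convolution. A minimal PyTorch sketch of such a block (the repo's actual module may differ in details):

import torch
import torch.nn as nn

class R2Plus1dConv(nn.Module):
    """A (2+1)D block: spatial 1xdxd conv, then temporal tx1x1 conv.
    mid_channels follows the parameter-matching formula from the paper."""
    def __init__(self, in_channels, out_channels, t=3, d=3):
        super().__init__()
        mid_channels = (t * d * d * in_channels * out_channels) // (
            d * d * in_channels + t * out_channels)
        self.spatial = nn.Conv3d(in_channels, mid_channels, kernel_size=(1, d, d),
                                 padding=(0, d // 2, d // 2))
        self.bn = nn.BatchNorm3d(mid_channels)
        self.relu = nn.ReLU(inplace=True)
        self.temporal = nn.Conv3d(mid_channels, out_channels, kernel_size=(t, 1, 1),
                                  padding=(t // 2, 0, 0))

    def forward(self, x):  # x: (batch, channels, frames, height, width)
        return self.temporal(self.relu(self.bn(self.spatial(x))))

# A clip of 8 RGB frames at 112x112 keeps its shape apart from channels:
# R2Plus1dConv(3, 64)(torch.randn(1, 3, 8, 112, 112)).shape -> (1, 64, 8, 112, 112)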

Requirements

  • pytorch
conda install pytorch torchvision -c pytorch
  • opencv
conda install opencv
  • rarfile
pip install rarfile
  • rar
sudo apt install rar
  • unrar
sudo apt install unrar
  • ffmpeg
sudo apt install build-essential openssl libssl-dev autoconf automake cmake git-core libass-dev libfreetype6-dev libsdl2-dev libtool libva-dev libvdpau-dev libvorbis-dev libxcb1-dev libxcb-shm0-dev libxcb-xfixes0-dev pkg-config texinfo wget zlib1g-dev nasm yasm libx264-dev libx265-dev libnuma-dev libvpx-dev libfdk-aac-dev libmp3lame-dev libopus-dev
wget https://ffmpeg.org/releases/ffmpeg-4.1.3.tar.bz2
tar -jxvf ffmpeg-4.1.3.tar.bz2
cd ffmpeg-4.1.3/
./configure --prefix="../build" --enable-static --enable-gpl --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-nonfree --enable-openssl
make -j4
make install
sudo cp ../build/bin/ffmpeg /usr/local/bin/ 
rm -rf ../ffmpeg-4.1.3/ ../ffmpeg-4.1.3.tar.bz2 ../build/
  • youtube-dl
pip install youtube-dl
  • joblib
pip install joblib
  • PyTorchNet
pip install git+https://github.com/pytorch/tnt.git@master
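
With everything installed, a quick sanity check confirms that the Python packages import and the command-line tools are on PATH (this check script is illustrative, not part of the repo):

import shutil

# Python-side requirements should import cleanly.
import torch, torchvision, cv2, rarfile, joblib  # noqa: F401
import torchnet  # provided by the tnt install above

print('pytorch', torch.__version__, '| cuda:', torch.cuda.is_available())
print('opencv', cv2.__version__)

# Command-line tools used for dataset preparation.
for tool in ('ffmpeg', 'rar', 'unrar', 'youtube-dl'):
    print(tool, '->', shutil.which(tool) or 'NOT FOUND')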

Datasets

The datasets come from UCF101, HMDB51 and KINETICS600. Download the UCF101 and HMDB51 datasets together with their train/val/test split files into the data directory. We use split1 to split the files. Run misc.py to preprocess these datasets.
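
misc.py does the real preprocessing; as an illustration, the official UCF101 split1 files are plain text and easy to read yourself (the data/ucf101 path below is an assumption about where you unpacked them):

from pathlib import Path

def read_ucf101_split(split_file):
    """Parse a UCF101 split file. trainlist01.txt lines look like
    'ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1' (path plus class id),
    while testlist01.txt lines contain only the path."""
    videos = []
    for line in Path(split_file).read_text().splitlines():
        if not line.strip():
            continue
        path = line.split()[0]
        label = path.split('/')[0]  # class name is the directory
        videos.append((path, label))
    return videos

train = read_ucf101_split('data/ucf101/trainlist01.txt')
print(len(train), 'training videos,', len({l for _, l in train}), 'classes')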

For the KINETICS600 dataset, first download the train/val/test split files into the data directory, then run download.py to download and preprocess this dataset.
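
download.py handles this in the repo; the sketch below only shows the general pattern of fetching clips with youtube-dl in parallel via joblib (the ids file name and output directory are assumptions, and the annotated start/end times would still have to be trimmed afterwards, e.g. with ffmpeg):

import youtube_dl
from joblib import Parallel, delayed

def fetch(youtube_id, out_dir='data/kinetics600/raw'):
    """Download one Kinetics clip by its YouTube id."""
    opts = {'format': 'mp4',
            'outtmpl': f'{out_dir}/{youtube_id}.mp4',
            'quiet': True,
            'ignoreerrors': True}  # many Kinetics videos are gone
    with youtube_dl.YoutubeDL(opts) as ydl:
        ydl.download([f'https://www.youtube.com/watch?v={youtube_id}'])

# ids would be parsed from the split files in the data directory
ids = open('data/kinetics600/train_ids.txt').read().split()
Parallel(n_jobs=8)(delayed(fetch)(vid) for vid in ids)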

Usage

Train Model

visdom -logging_level WARNING & python train.py --num_epochs 20 --pre_train kinetics600_r2plus1d.pth
optional arguments:
--data_type                   dataset type [default value is 'ucf101'](choices=['ucf101', 'hmdb51', 'kinetics600'])
--gpu_ids                     selected gpu [default value is '0,1']
--model_type                  model type [default value is 'r2plus1d'](choices=['r2plus1d', 'c3d'])
--batch_size                  training batch size [default value is 8]
--num_epochs                  training epochs number [default value is 100]
--pre_train                   used pre-trained model epoch name [default value is None]

Visdom can now be accessed by going to 127.0.0.1:8097 in your browser.
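
Training curves land in that visdom instance automatically; for reference, pushing a metric to the same server from your own code only takes a few lines (the window and title names here are arbitrary):

import numpy as np
from visdom import Visdom

viz = Visdom(port=8097)  # connect to the server started above

# Append one point per epoch to a persistent line plot.
for epoch, loss in enumerate([2.31, 1.68, 1.24], start=1):  # dummy losses
    viz.line(X=np.array([epoch]), Y=np.array([loss]), win='train_loss',
             update='append' if epoch > 1 else None,
             opts={'title': 'training loss'})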

Inference Video

python inference.py --video_name data/ucf101/ApplyLipstick/v_ApplyLipstick_g04_c02.avi
optional arguments:
--data_type                   dataset type [default value is 'ucf101'](choices=['ucf101', 'hmdb51', 'kinetics600'])
--model_type                  model type [default value is 'r2plus1d'](choices=['r2plus1d', 'c3d'])
--video_name                  test video name
--model_name                  model epoch name [default value is 'ucf101_r2plus1d.pth']

The inference result will be shown in a pop-up window.
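
Under the hood this kind of inference samples a fixed-length clip from the video and runs it through the network; a rough standalone sketch of the clip loading (the repo's inference.py may preprocess differently, and trained_model below is hypothetical):

import cv2
import numpy as np
import torch

def load_clip(video_name, num_frames=32, size=128, crop=112):
    """Read num_frames evenly spaced frames, resize each to size x size,
    then center-crop to crop x crop (matching the Benchmarks section)."""
    cap = cv2.VideoCapture(video_name)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for idx in np.linspace(0, max(total - 1, 0), num_frames).astype(int):
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            continue
        frame = cv2.resize(frame, (size, size))
        off = (size - crop) // 2
        frames.append(frame[off:off + crop, off:off + crop])
    cap.release()
    clip = np.stack(frames).astype(np.float32) / 255.0
    # (frames, H, W, C) -> (batch, C, frames, H, W)
    return torch.from_numpy(clip).permute(3, 0, 1, 2).unsqueeze(0)

# clip = load_clip('data/ucf101/ApplyLipstick/v_ApplyLipstick_g04_c02.avi')
# probs = trained_model(clip).softmax(dim=1)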

Benchmarks

The Adam optimizer (lr=0.0001) is used with learning rate scheduling.
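
The exact schedule is not spelled out here; a representative setup looks like the following (the choice of ReduceLROnPlateau is an assumption, shown on a stand-in model):

import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # stand-in for the R2Plus1D / C3D network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=5)

for epoch in range(3):        # dummy per-epoch loop
    val_loss = 1.0 / (epoch + 1)
    scheduler.step(val_loss)  # lowers lr after `patience` stalled epochs
    print(epoch, optimizer.param_groups[0]['lr'])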

For the UCF101 and HMDB51 datasets, the models are trained for 100 epochs with a batch size of 8 on one NVIDIA Tesla V100 (32G) GPU.

For the KINETICS600 dataset, the models are trained for 100 epochs with a batch size of 32 on two NVIDIA Tesla V100 (32G) GPUs. Because the training time is too long, this experiment has not been finished.
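
Splitting each batch over the GPUs selected by --gpu_ids is usually done with nn.DataParallel in PyTorch (a generic sketch, not necessarily the repo's exact wiring):

import torch
import torch.nn as nn

gpu_ids = [0, 1]                         # parsed from --gpu_ids '0,1'
model = nn.Conv3d(3, 64, kernel_size=3)  # stand-in for the real network
if torch.cuda.is_available() and len(gpu_ids) > 1:
    # Each forward pass scatters the batch across the listed devices
    # and gathers the outputs back on gpu_ids[0].
    model = nn.DataParallel(model.cuda(gpu_ids[0]), device_ids=gpu_ids)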

The videos are preprocessed into 32 frames of 128x128, then cropped to 112x112.
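
At training time the 112x112 crop is typically taken at a random location for augmentation (center crop at evaluation); a minimal sketch of that, assuming clips arrive as (frames, H, W, C) arrays:

import numpy as np

def random_crop(clip, crop=112):
    """clip: (frames, 128, 128, 3) array; returns a random 112x112
    spatial crop shared by all frames of the clip."""
    h, w = clip.shape[1:3]
    top = np.random.randint(0, h - crop + 1)
    left = np.random.randint(0, w - crop + 1)
    return clip[:, top:top + crop, left:left + crop, :]

clip = np.zeros((32, 128, 128, 3), dtype=np.uint8)  # dummy preprocessed clip
print(random_crop(clip).shape)                      # (32, 112, 112, 3)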

| Dataset                       | UCF101     | HMDB51     | Kinetics600 |
|-------------------------------|------------|------------|-------------|
| Num. of Train Videos          | 9,537      | 3,570      | 375,008     |
| Num. of Val Videos            | 756        | 1,666      | 28,638      |
| Num. of Test Videos           | 3,783      | 1,530      | 56,982      |
| Num. of Classes               | 101        | 51         | 600         |
| Accuracy (R2Plus1D)           | 63.60%     | 24.97%     | \           |
| Accuracy (C3D)                | 51.63%     | 25.10%     | \           |
| Num. of Parameters (R2Plus1D) | 33,220,990 | 33,195,340 | 33,476,977  |
| Num. of Parameters (C3D)      | 78,409,573 | 78,204,723 | 80,453,976  |
| Training Time (R2Plus1D)      | 19.3h      | 7.3h       | 350h        |
| Training Time (C3D)           | 10.9h      | 4.1h       | 190h        |
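
The parameter counts in the table can be reproduced for any model with a one-liner over model.parameters() (shown here on a stand-in module):

import torch.nn as nn

model = nn.Conv3d(3, 64, kernel_size=(3, 7, 7))  # stand-in module
num_params = sum(p.numel() for p in model.parameters())
print(f'{num_params:,} parameters')              # 28,288 for this stand-in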

Results

The train/val/test loss, accuracy and confusion matrix are shown on visdom. The pretrained models can be downloaded from BaiduYun (access code: ducr).
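
Once downloaded, a checkpoint such as ucf101_r2plus1d.pth can be inspected and loaded in the usual PyTorch way (whether the file stores a bare state_dict or a wrapper dict is an assumption):

import torch

state = torch.load('ucf101_r2plus1d.pth', map_location='cpu')
print(type(state).__name__, '-', len(state), 'entries')
# then, with the matching architecture constructed:
# model.load_state_dict(state)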

UCF101

(screenshots: R2Plus1D result, C3D result)

HMDB51

(screenshots: R2Plus1D result, C3D result)
