Temporal Segments LSTM and Temporal-Inception for Activity Recognition

Activity Recognition with RNN and Temporal-ConvNet

License: MIT

Chih-Yao Ma, Min-Hung Chen
(equal contribution)

Code for the paper:
TS-LSTM and Temporal-Inception: Exploiting Spatiotemporal Dynamics for Activity Recognition
(Accepted by the journal Signal Processing: Image Communication, 2019)

Project:
Activity Recognition with RNN and Temporal-ConvNet


Abstract

In this work, we demonstrate a strong baseline two-stream ConvNet using ResNet-101. We use this baseline to thoroughly examine the use of both RNNs and Temporal-ConvNets for extracting spatiotemporal information. Building upon our experimental results, we then propose and investigate two different networks to further integrate spatiotemporal information: 1) temporal segment RNN and 2) Inception-style Temporal-ConvNet.

Our analysis identifies specific limitations for each method that could form the basis of future work. Our experimental results on UCF101 and HMDB51 datasets achieve state-of-the-art performances, 94.1% and 69.0%, respectively, without requiring extensive temporal augmentation.


How do we tackle the activity recognition problem?


Demo

The GIFs show the top-3 predictions of our TS-LSTM and Temporal-Inception methods. The text at the top is the ground truth, the three lines of text are the predictions of each method, and the bar next to each prediction shows how confident the model is in that prediction.
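
As a rough illustration of how a top-3 display like this can be derived from a classifier's output, the sketch below applies a softmax to raw class scores and keeps the three most confident classes. It is written in plain Python for illustration only (the repository itself uses Lua/Torch); the function name, labels, and scores are hypothetical and are not taken from the repository.

```python
import math


def top_k_predictions(scores, labels, k=3):
    """Return the k most confident (label, probability) pairs."""
    # Softmax with max-subtraction for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort classes by probability, highest first, and keep the top k.
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical raw scores for four example classes.
labels = ["Archery", "Biking", "Diving", "Typing"]
scores = [2.0, 0.5, 3.0, -1.0]
for label, prob in top_k_predictions(scores, labels):
    print(f"{label}: {prob:.2f}")
```

The probability attached to each label plays the role of the confidence bar in the GIFs.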


Dataset

We currently use the UCF101 and HMDB51 datasets for this project. You can download the videos directly here:

         UCF101   HMDB51
RGB      link     link
TV-L1    link     link

Prerequisites


Usage

We propose two methods for training activity recognition models: TS-LSTM and Temporal-Inception.

Inputs

Our models take the feature vectors generated by the first-stage two-stream ConvNet as input for training. You can generate the features using our code under "/CNN-Pred-Feat/", or download the feature vectors we generated (please refer to the Dropbox links below). We followed the training/testing splits provided with UCF101 and HMDB51. If you would like to compare with our results, please use the same training and testing lists, since the choice of split strongly affects overall performance.

  • Features for training:
            UCF101         HMDB51
    RGB     sp1 sp2 sp3    sp1 sp2 sp3
    TV-L1   sp1 sp2 sp3    sp1 sp2 sp3
  • Features for testing:
            UCF101         HMDB51
    RGB     sp1 sp2 sp3    sp1 sp2 sp3
    TV-L1   sp1 sp2 sp3    sp1 sp2 sp3

Train with RNN

We use the RNN library provided by Element-Research. Install it with:

$ luarocks install rnn

After downloading the feature vectors, edit ./RNN/data-ucf101.lua so that it points to the directory where you put your feature vector files.

To start the training process, go to ./RNN and simply execute:

$ th main.lua -pastalogName 'model_RNN' -nGPU 1 -dataset 'ucf101' -split '1' -fcSize '{0}' -hiddenSize '{512}' -lstm -spatFeatDir '<path/to/feature/>' -tempFeatDir '<path/to/feature/>'

The training and testing losses are reported during training, and the results are saved to log files. The learning rate and the best testing accuracy are reported for each epoch in which they change.
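
The "report only when there is an update" behaviour described above can be sketched as follows. This is a minimal Python illustration under assumed names (`track_best` and the accuracy values are hypothetical); the actual logging is done by the Lua code under ./RNN and may differ in detail.

```python
def track_best(accuracies):
    """Yield (epoch, accuracy) each time a new best testing accuracy appears."""
    best = float("-inf")
    for epoch, acc in enumerate(accuracies, start=1):
        # Report only when the accuracy strictly improves on the best so far.
        if acc > best:
            best = acc
            yield epoch, acc


# Hypothetical per-epoch testing accuracies (%).
history = [71.2, 74.8, 74.1, 76.3, 76.3]
for epoch, acc in track_best(history):
    print(f"epoch {epoch}: new best testing accuracy {acc:.1f}%")
```

Epochs that do not improve on the best accuracy so far produce no line, which keeps the log short over long runs.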

Train with Temporal-ConvNet

To start the training process, go to ./Temporal-ConvNet and simply execute:

$ th run.lua -o <output_folder_name> --dataset <dataset-name>

For more details and hyper-parameter tuning, please refer to the readme file in the folder ./Temporal-ConvNet/.

You also need to edit ./Temporal-ConvNet/data-2Stream.lua so that it points to the directory where you put your feature vector files.

The training and testing performance will be plotted, and the results will be saved into log files. The best testing accuracy will be reported each epoch if there is any update.


Can I train with frame-level features?

To standardize the comparison, the above features are sampled at equal intervals across each video. If you would like to train with frame-level features extracted at 25 fps for all videos in UCF101, please refer to Temporal Augmentation using frame-level features with RNN.
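
To make the idea of equal sampling concrete, the sketch below picks a fixed number of frame indices at equal intervals across a video. It is a Python illustration, not the repository's sampling code; the function name, the choice of taking the centre of each interval, and the example numbers are all assumptions.

```python
def equally_spaced_indices(num_frames, num_samples):
    """Pick num_samples frame indices (0-based) spread evenly over a video."""
    if num_frames <= 0 or num_samples <= 0:
        raise ValueError("num_frames and num_samples must be positive")
    # Divide the video into num_samples equal intervals and take the frame
    # at the centre of each one (indices repeat for very short videos).
    step = num_frames / num_samples
    return [min(num_frames - 1, int(step * (i + 0.5))) for i in range(num_samples)]


# Example: a hypothetical 250-frame video sampled at 25 positions.
print(equally_spaced_indices(250, 25))
```

Because every video yields the same number of samples regardless of its length, the resulting feature sequences have a fixed length, which is what makes results comparable across videos.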


Citation

@article{ma2019ts,
  title={{TS-LSTM} and {Temporal-Inception}: Exploiting spatiotemporal dynamics for activity recognition},
  author={Ma, Chih-Yao and Chen, Min-Hung and Kira, Zsolt and AlRegib, Ghassan},
  journal={Signal Processing: Image Communication},
  volume={71},
  pages={76--87},
  year={2019},
  publisher={Elsevier}
}

Acknowledgment

This work began as a class project for the deep learning class at Georgia Tech in Spring 2016. We teamed up with Hao Yan and Casey Battaglino on that class project; they were a great help and provided valuable discussions along the way.

Please contact us if you have any questions.

Chih-Yao Ma at [email protected] or [LinkedIn]
Min-Hung Chen at [email protected] or [LinkedIn]
