woodfrog / Actionrecognition

License: MIT

Action Recognition

This project aims to accurately recognize a user's actions in a sequence of video frames by combining convolutional neural networks (CNNs) with long short-term memory (LSTM) networks.

Project Overview

  • This project explores prominent action recognition models on the UCF-101 dataset

  • The performance of the different models is compared, and an analysis of the experimental results is provided

File Structure of the Repo

rnn_practice: Practice exercises on RNN and LSTM models, based on online tutorials and other useful resources

data: Training and testing data. (NOTE: please don't add large data files to this repo; add them to .gitignore instead)

models: Definitions of the model architectures

utils: Utility scripts for dataset preparation, input pre-processing and other helper functions

train_CNN: Training CNN models. The program loads the corresponding model, sets the training parameters and starts network training

process_CNN: Processing videos with CNN models. The CNN component is pre-trained and kept fixed while the LSTM cells are trained, so we can use the CNN model to pre-process the frames of each video once and store the intermediate results for feeding into the LSTMs later. This procedure significantly improves the training efficiency of the LRCN model

train_RNN: Training the LRCN model

predict: Calculating the overall accuracy on the entire test set
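
As a rough illustration of the caching idea behind process_CNN, the sketch below runs a fixed feature extractor once per video and stores the result on disk, so that later LSTM training can reload it instead of recomputing it. The function name cache_video_features, the extract callable, and the pickle file layout are assumptions made for this example; they are not taken from the repository's code.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Callable, List, Sequence

Frame = Sequence[float]   # stand-in for a decoded video frame
Features = List[float]    # stand-in for a CNN feature vector


def cache_video_features(
    video_id: str,
    frames: Sequence[Frame],
    extract: Callable[[Frame], Features],
    cache_dir: Path,
) -> List[Features]:
    """Return per-frame features, computing them only on the first call.

    `extract` is a hypothetical callable that wraps the frozen CNN.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_path = cache_dir / f"{video_id}.pkl"

    # Reuse the stored features if this video was already processed.
    if cache_path.exists():
        with cache_path.open("rb") as fh:
            return pickle.load(fh)

    # Otherwise run the extractor once per frame and store the result.
    features = [extract(frame) for frame in frames]
    with cache_path.open("wb") as fh:
        pickle.dump(features, fh)
    return features
```

Because the CNN weights do not change while the LSTM is trained, the cached features stay valid across epochs. If the CNN is fine-tuned again, the cache has to be rebuilt.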

Models Description

  • Fine-tuned ResNet50, trained solely on single-frame image data. Each frame of a video is treated as a separate image for training and testing, which acts as a natural form of data augmentation. The ResNet50 comes from the Keras repo, with weights pre-trained on ImageNet. ./models/finetuned_resnet.py

  • LRCN: a CNN feature extractor (here, the fine-tuned ResNet50) followed by LSTMs. The input to the LRCN is a sequence of frames extracted uniformly from each video. The fine-tuned ResNet50 is reused directly from [1], the first model in this list, without extra training (cf. the Long-term Recurrent Convolutional Networks paper cited below).

    Produce intermediate data using ./process_CNN.py and then train and predict with ./models/RNN.py

  • Simple CNN model trained on stacked optical flow data: one stacked optical flow is generated from each video and used as the input of the network. ./models/temporal_CNN.py

  • Two-stream model: combines models [2] and [3] in this list with an extra fusion layer that outputs the final result. [3] and [4] are based on the two-stream paper by Simonyan and Zisserman, cited below. ./models/two_stream.py
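
Two ideas from the model list can be sketched in a few lines of plain Python: picking frames uniformly from a video for the LRCN input, and combining the class scores of two streams. This is a minimal sketch under stated assumptions. The function names are hypothetical, and the fixed weighted average is a simple stand-in for the fusion layer described above, which is a trained layer in the actual model.

```python
from typing import List, Sequence


def uniform_frame_indices(num_frames: int, seq_length: int) -> List[int]:
    """Pick seq_length frame indices spread evenly across a video."""
    if num_frames <= 0 or seq_length <= 0:
        raise ValueError("num_frames and seq_length must be positive")
    if seq_length >= num_frames:
        # Short video: use every frame.
        return list(range(num_frames))
    step = num_frames / seq_length
    # Take the middle frame of each of the seq_length equal-width segments.
    return [int(step * i + step / 2) for i in range(seq_length)]


def fuse_scores(
    appearance_scores: Sequence[float],
    motion_scores: Sequence[float],
    motion_weight: float = 0.5,
) -> List[float]:
    """Weighted average of per-class scores (assumed stand-in for fusion)."""
    if len(appearance_scores) != len(motion_scores):
        raise ValueError("both streams must score the same set of classes")
    w = motion_weight
    return [(1 - w) * a + w * m for a, m in zip(appearance_scores, motion_scores)]


def predict_class(scores: Sequence[float]) -> int:
    """Index of the highest-scoring class."""
    return max(range(len(scores)), key=lambda i: scores[i])
```

For example, uniform_frame_indices(100, 10) selects one frame from the middle of each block of ten frames, and predict_class(fuse_scores(a, m)) returns the class index favored by the averaged streams.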

Citations

If you use this code or ideas from the paper for your research, please cite the following papers:

@inproceedings{lrcn2014,
   Author = {Jeff Donahue and Lisa Anne Hendricks and Sergio Guadarrama
             and Marcus Rohrbach and Subhashini Venugopalan and Kate Saenko
             and Trevor Darrell},
   Title = {Long-term Recurrent Convolutional Networks
            for Visual Recognition and Description},
   Year  = {2015},
   Booktitle = {CVPR}
}
@article{DBLP:journals/corr/SimonyanZ14,
  author    = {Karen Simonyan and
               Andrew Zisserman},
  title     = {Two-Stream Convolutional Networks for Action Recognition in Videos},
  journal   = {CoRR},
  volume    = {abs/1406.2199},
  year      = {2014},
  url       = {http://arxiv.org/abs/1406.2199},
  archivePrefix = {arXiv},
  eprint    = {1406.2199},
  timestamp = {Mon, 13 Aug 2018 16:47:39 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/SimonyanZ14},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}