
garythung / torch-lrcn

License: MIT
An implementation of the LRCN in Torch

Programming Languages

lua

Projects that are alternatives to or similar to torch-lrcn

Pytorch Learners Tutorial
PyTorch tutorial for learners
Stars: ✭ 97 (+14.12%)
Mutual labels:  torch, lstm
Video Classification
Tutorial for video classification/ action recognition using 3D CNN/ CNN+RNN on UCF101
Stars: ✭ 543 (+538.82%)
Mutual labels:  lstm, action-recognition
Actionrecognition
Explore Action Recognition
Stars: ✭ 139 (+63.53%)
Mutual labels:  lstm, action-recognition
Pytorch gbw lm
PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset
Stars: ✭ 101 (+18.82%)
Mutual labels:  torch, lstm
Robust-Deep-Learning-Pipeline
Deep Convolutional Bidirectional LSTM for Complex Activity Recognition with Missing Data. Human Activity Recognition Challenge. Springer SIST (2020)
Stars: ✭ 20 (-76.47%)
Mutual labels:  lstm, action-recognition
MTL-AQA
What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment [CVPR 2019]
Stars: ✭ 38 (-55.29%)
Mutual labels:  lstm, action-recognition
Alphaction
Spatio-Temporal Action Localization System
Stars: ✭ 221 (+160%)
Mutual labels:  torch, action-recognition
pose2action
experiments on classifying actions using poses
Stars: ✭ 24 (-71.76%)
Mutual labels:  lstm, action-recognition
Intelligent-Bangla-typing-assistant
Artificially intelligent typing assistant that suggests the next word from the user's typing history and the full sentence using an LSTM. Works in any environment (MS Word, Notepad, coding IDEs, or anything else)
Stars: ✭ 13 (-84.71%)
Mutual labels:  lstm
CharLSTM
Bidirectional Character LSTM for Sentiment Analysis - Tensorflow Implementation
Stars: ✭ 49 (-42.35%)
Mutual labels:  lstm
myDL
Deep Learning
Stars: ✭ 18 (-78.82%)
Mutual labels:  lstm
FlowNetTorch
Torch implementation of the FlowNet training code of Fischer et al.
Stars: ✭ 27 (-68.24%)
Mutual labels:  torch
TorchGA
Train PyTorch Models using the Genetic Algorithm with PyGAD
Stars: ✭ 47 (-44.71%)
Mutual labels:  torch
object-tracking
Multiple Object Tracking System in Keras + (Detection Network - YOLO)
Stars: ✭ 89 (+4.71%)
Mutual labels:  lstm
SoH estimation of Lithium-ion battery
State of health (SOH) prediction for Lithium-ion batteries using regression and LSTM
Stars: ✭ 28 (-67.06%)
Mutual labels:  lstm
Rus-SpeechRecognition-LSTM-CTC-VoxForge
Russian-language speech recognition using Tensorflow, trained on the VoxForge dataset
Stars: ✭ 50 (-41.18%)
Mutual labels:  lstm
fsauor2018
Fine-grained sentiment analysis of Chinese reviews based on an LSTM network and a self-attention mechanism
Stars: ✭ 36 (-57.65%)
Mutual labels:  lstm
TextRankPlus
A Chinese NLP toolkit based on deep learning
Stars: ✭ 36 (-57.65%)
Mutual labels:  lstm
TadTR
End-to-end Temporal Action Detection with Transformer. [Under review for a journal publication]
Stars: ✭ 55 (-35.29%)
Mutual labels:  action-recognition
Baby-Action-Detection-for-Safety-System-Prototype
Prototype for Baby Action Detection and Classification
Stars: ✭ 23 (-72.94%)
Mutual labels:  lstm

torch-lrcn

torch-lrcn provides a framework in Torch7 for action recognition using Long-term Recurrent Convolutional Networks. The LRCN model was proposed by Jeff Donahue et al. in this paper. Find more information about their Caffe code and experiments here.

Note that this library does not currently support fine-grained action detection (i.e. a specific label for each frame). The detection accuracy it computes is simply the frame accuracy using a single label for the entire video.
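
For illustration, here is a minimal Lua sketch of that frame-accuracy computation; the function and variable names are hypothetical, not part of the library:

-- Hypothetical sketch: frame accuracy when every frame of a video
-- is scored against the video's single label
local function frameAccuracy(framePreds, videoLabel)
  local correct = 0
  for _, pred in ipairs(framePreds) do
    if pred == videoLabel then correct = correct + 1 end
  end
  return correct / #framePreds
end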

Installation

System setup

You need ffmpeg accessible via the command line. Find installation guides here.

Lua setup

All code is written in Lua using Torch; you can find installation instructions here. You'll need the following Lua packages:

  • torch
  • nn
  • optim
  • image
  • ffmpeg

After installing Torch, you can install / update these packages by running the following:

# Install using Luarocks
luarocks install torch
luarocks install nn
luarocks install optim
luarocks install image
luarocks install ffmpeg

We also need @jcjohnson's LSTM module, which is already included in this repository.
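
As a rough illustration (not the library's actual model definition), an LRCN-style network could wire per-frame CNN features into this LSTM module as follows; the dimensions here are placeholders:

require 'nn'
require 'LSTM'  -- @jcjohnson's LSTM module, bundled with this repository

-- Placeholder dimensions: sequence length, CNN feature size, hidden size, classes
local T, D, H, C = 16, 4096, 256, 101

-- Input: per-frame CNN features of shape (N, T, D)
local model = nn.Sequential()
model:add(nn.LSTM(D, H))    -- (N, T, D) -> (N, T, H)
model:add(nn.Select(2, T))  -- keep the last time step: (N, H)
model:add(nn.Linear(H, C))  -- per-video class scores
model:add(nn.LogSoftMax())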

CUDA support

Because training takes a while, you will want to use CUDA to get results in a reasonable amount of time. To enable GPU acceleration with CUDA, you'll first need to install CUDA 6.5 or higher. Find CUDA installations here.

Then you need to install the following Lua packages for CUDA:

  • cutorch
  • cunn

You can install / update the Lua packages by running:

luarocks install cutorch
luarocks install cunn

Usage

Training and testing a model requires some text files. The scripts assume that the text files detailed below exist and are valid, and that all videos have the same native resolution.

Step 1: Ready the data

The training step requires a text file for each of the training, validation, and testing splits. The structure of these text files is identical.

Example line: <path to video> <label>

Example file:

/path/to/video1.avi 1
/path/to/video2.avi 4
...
/path/to/video10.avi 3
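
For reference, here is a small Lua sketch (a hypothetical helper, not part of the library) that parses a split file of this form:

-- Hypothetical helper: read '<path to video> <label>' lines into a table
local function readSplit(path)
  local entries = {}
  for line in io.lines(path) do
    local video, label = line:match('^(%S+)%s+(%d+)%s*$')
    if video then
      table.insert(entries, {video = video, label = tonumber(label)})
    end
  end
  return entries
end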

Step 2: Train the model

With the text files ready, we can begin training using train.lua. This will take quite some time because it trains both a CNN and an LSTM.

You can run the training script, at minimum, like this:

th train.lua -trainList train.txt -valList val.txt -testList test.txt -numClasses 101 -videoHeight 240 -videoWidth 320

By default, for each video this will dump 8 random frames at 5 FPS in native resolution, one from each of 8 semi-equally sized chunks; train for 30 epochs; and save checkpoints of the trained model with names like checkpoints/checkpoint_3.t7. This also runs with CUDA by default. Run on CPU with -cuda 0. The default values are tuned to fit on an NVIDIA GPU with 4GB VRAM.
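
The chunked frame sampling described above could look roughly like this in Lua; this is an illustrative sketch, not the library's exact implementation:

-- Illustrative: pick one random frame index from each of seqLength
-- semi-equally sized chunks of a video with numFrames frames
local function sampleFrameIndices(numFrames, seqLength)
  local indices = {}
  local chunk = numFrames / seqLength
  for i = 1, seqLength do
    local lo = math.floor((i - 1) * chunk) + 1
    local hi = math.max(lo, math.floor(i * chunk))
    indices[i] = math.random(lo, hi)
  end
  return indices
end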

Some important training parameters to tune are:

  • -scaledHeight: optional downscaling
  • -scaledWidth: optional downscaling
  • -desiredFPS: frame rate (FPS) to convert videos to
  • -seqLength: number of frames for each video
  • -batchSize: number of videos per batch
  • -numEpochs: number of epochs to train for
  • -learningRate: learning rate
  • -lrDecayFactor: multiplier for the learning rate decay
  • -lrDecayEvery: decay the learning rate after every n epochs (see the sketch below)
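
The two decay flags combine into a simple step schedule. A minimal sketch, assuming illustrative flag values:

-- Illustrative values, not the library's defaults
local opt = {learningRate = 1e-3, numEpochs = 30, lrDecayEvery = 5, lrDecayFactor = 0.5}

local lr = opt.learningRate
for epoch = 1, opt.numEpochs do
  -- ... one epoch of training at rate lr ...
  if epoch % opt.lrDecayEvery == 0 then
    lr = lr * opt.lrDecayFactor
  end
end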

An example of a more specific run:

th train.lua -trainList train.txt -valList val.txt -testList test.txt -numClasses 101 -videoHeight 240 -videoWidth 320 -scaledHeight 224 -scaledWidth 224 -seqLength 16 -batchSize 4 -numEpochs 15

Step 3: Test the model

After training, you can compute the action recognition and detection accuracies of the model you trained. Do this by running test.lua:

th test.lua -checkpoint checkpoints/checkpoint_final.t7

By default, this will load the trained checkpoint checkpoints/checkpoint_final.t7 from the training step and then compute the action detection and recognition accuracies for the test split. This also runs with CUDA by default. Run on CPU with -cuda 0.

The list of parameters is:

  • -checkpoint: path to a checkpoint file (default: '')
  • -split: name of split to test on (default: 'test')
  • -cuda: run with CUDA (default: 1)

Acknowledgments

  • J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. Long-term recurrent convolutional networks for visual recognition and description. In CVPR, 2015.
  • Justin Johnson for his torch-rnn library, which this library was heavily modeled after.
  • Serena Yeung for the project idea, direction, and advice.
  • Stanford University CS 231N course staff for granting funds for AWS EC2 testing.

TODOs

  • Separate data preprocessing into its own step.
  • Parallelize data loading.
  • Write more documentation in a doc folder about training flags.
  • Implement fine-grained action detection.
  • Add unit tests.