ahmetgunduz / Real Time Gesrec

Licence: MIT
Real-time Hand Gesture Recognition with PyTorch on EgoGesture, NvGesture, Jester, Kinetics and UCF101


Real-time Hand Gesture Recognition with 3D CNNs

PyTorch implementation of the articles "Real-time Hand Gesture Detection and Classification Using Convolutional Neural Networks" and "Resource Efficient 3D Convolutional Neural Networks", including code and pretrained models.

Figure: A real-time simulation of the architecture, showing input video from the EgoGesture dataset (left) and the real-time (online) classification scores of each gesture (right). Each class is annotated with a different color.

This code includes training, fine-tuning and testing on the EgoGesture and nvGesture datasets.
Note that the code only includes ResNet-10, ResNetL-10, ResNeXt-101 and C3D v1; other versions of these models can be added easily.

Abstract

Real-time recognition of dynamic hand gestures from video streams is a challenging task since (i) there is no indication when a gesture starts and ends in the video, (ii) performed gestures should only be recognized once, and (iii) the entire architecture should be designed considering the memory and power budget. In this work, we address these challenges by proposing a hierarchical structure enabling offline-working convolutional neural network (CNN) architectures to operate online efficiently by using sliding window approach. The proposed architecture consists of two models: (1) A detector which is a lightweight CNN architecture to detect gestures and (2) a classifier which is a deep CNN to classify the detected gestures. In order to evaluate the single-time activations of the detected gestures, we propose to use the Levenshtein distance as an evaluation metric since it can measure misclassifications, multiple detections, and missing detections at the same time. We evaluate our architecture on two publicly available datasets - EgoGesture and NVIDIA Dynamic Hand Gesture Datasets - which require temporal detection and classification of the performed hand gestures. ResNeXt-101 model, which is used as a classifier, achieves the state-of-the-art offline classification accuracy of 94.04% and 83.82% for depth modality on EgoGesture and NVIDIA benchmarks, respectively. In real-time detection and classification, we obtain considerable early detections while achieving performances close to offline operation. The codes and pretrained models used in this work are publicly available.
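
The abstract proposes the Levenshtein distance as an evaluation metric because it counts misclassifications, multiple detections and missing detections together. The following is a minimal sketch of the plain edit distance between a predicted and a ground-truth sequence of gesture labels. The function name and the example label sequences are illustrative assumptions and are not taken from the repository, which may normalize or report the distance differently.

```python
def levenshtein(predicted, target):
    """Edit distance between two sequences of gesture labels."""
    # prev[j] holds the distance between the processed prefix of
    # `predicted` and the first j labels of `target`.
    prev = list(range(len(target) + 1))
    for i, p in enumerate(predicted, start=1):
        curr = [i]
        for j, t in enumerate(target, start=1):
            cost = 0 if p == t else 1
            curr.append(min(
                prev[j] + 1,         # drop an extra detection
                curr[j - 1] + 1,     # account for a missed gesture
                prev[j - 1] + cost,  # match, or misclassification
            ))
        prev = curr
    return prev[-1]


# Hypothetical ground truth: four gestures performed in this order.
truth = [1, 2, 3, 4]

exact = levenshtein([1, 2, 3, 4], truth)             # perfect recognition
misclassified = levenshtein([1, 5, 3, 4], truth)     # one wrong class
duplicated = levenshtein([1, 2, 2, 3, 4], truth)     # one gesture fired twice
missed = levenshtein([1, 3, 4], truth)               # one gesture not detected
```

In this sketch each of the three error types costs exactly one edit, which is the property that makes the metric suitable for judging single-time activations.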

Requirements

  • PyTorch
conda install pytorch torchvision cuda80 -c soumith
  • Python 3
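
Before running the scripts, it can help to confirm that the interpreter and the PyTorch installation match the requirements above. The snippet below is a small sketch of such a check; the helper name `describe_environment` is hypothetical and not part of the repository. It only uses `importlib.util.find_spec`, `torch.__version__` and `torch.cuda.is_available()`, and it still runs when PyTorch is not installed.

```python
import importlib.util
import sys


def describe_environment():
    """Collect basic facts about the interpreter and the PyTorch install."""
    info = {
        "python": "%d.%d" % sys.version_info[:2],
        "python3": sys.version_info[0] >= 3,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_available": False,
    }
    if info["torch_installed"]:
        # Imported lazily so the check also works without PyTorch.
        import torch
        info["torch_version"] = torch.__version__
        info["cuda_available"] = torch.cuda.is_available()
    return info


environment = describe_environment()
for key, value in environment.items():
    print(key, "=", value)
```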

Pretrained models

Pretrained_models_v1 (1.08GB): The best performing models in the paper

Pretrained_RGB_models_for_det_and_clf (371MB) (Google Drive)
Pretrained_RGB_models_for_det_and_clf (371MB) (Baidu Netdisk), code: p1va

Pretrained_models_v2 (15.2GB): All models in the paper, with efficient 3D-CNN models

Preparation

EgoGesture

  • Download the videos by following the instructions on the official site.

  • We use the extracted images, which are also provided by the dataset owners.

  • Generate n_frames files using utils/ego_prepare.py

The n_frames format is as follows: "path to the folder" "class index" "start frame" "end frame"

mkdir annotation_EgoGesture
python utils/ego_prepare.py training trainlistall.txt all
python utils/ego_prepare.py training trainlistall_but_None.txt all_but_None
python utils/ego_prepare.py training trainlistbinary.txt binary
python utils/ego_prepare.py validation vallistall.txt all
python utils/ego_prepare.py validation vallistall_but_None.txt all_but_None
python utils/ego_prepare.py validation vallistbinary.txt binary
python utils/ego_prepare.py testing testlistall.txt all
python utils/ego_prepare.py testing testlistall_but_None.txt all_but_None
python utils/ego_prepare.py testing testlistbinary.txt binary
  • Generate the annotation file in JSON format, similar to ActivityNet, using utils/egogesture_json.py
python utils/egogesture_json.py 'annotation_EgoGesture' all
python utils/egogesture_json.py 'annotation_EgoGesture' all_but_None
python utils/egogesture_json.py 'annotation_EgoGesture' binary

nvGesture

  • Download the videos by following the instructions on the official site.

  • Generate n_frames files using utils/nv_prepare.py

The n_frames format is as follows: "path to the folder" "class index" "start frame" "end frame"

mkdir annotation_nvGesture
python utils/nv_prepare.py training trainlistall.txt all
python utils/nv_prepare.py training trainlistall_but_None.txt all_but_None
python utils/nv_prepare.py training trainlistbinary.txt binary
python utils/nv_prepare.py validation vallistall.txt all
python utils/nv_prepare.py validation vallistall_but_None.txt all_but_None
python utils/nv_prepare.py validation vallistbinary.txt binary
  • Generate the annotation file in JSON format, similar to ActivityNet, using utils/nv_json.py
python utils/nv_json.py 'annotation_nvGesture' all
python utils/nv_json.py 'annotation_nvGesture' all_but_None
python utils/nv_json.py 'annotation_nvGesture' binary

Jester

  • Download the videos by following the instructions on the official site.

  • The n_frames and class index files are already provided in annotation_Jester/{'classInd.txt', 'trainlist01.txt', 'vallist01.txt'}

The n_frames format is as follows: "path to the folder" "class index" "start frame" "end frame"

  • Generate the annotation file in JSON format, similar to ActivityNet, using utils/jester_json.py
python utils/jester_json.py 'annotation_Jester'
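
The n_frames format used by all three datasets ("path to the folder" "class index" "start frame" "end frame") can be read with a few lines of Python. This is only a sketch: it assumes the four fields are separated by whitespace and that the last three are integers, which should be checked against the generated files. The `Clip` type, the function name and the sample line are hypothetical.

```python
from typing import NamedTuple


class Clip(NamedTuple):
    folder: str
    class_index: int
    start_frame: int
    end_frame: int


def parse_n_frames_line(line):
    """Parse one n_frames line into a Clip (assumed whitespace-separated)."""
    # Split from the right so that a folder path containing spaces
    # is still kept together as the first field.
    folder, class_index, start, end = line.strip().rsplit(None, 3)
    return Clip(folder, int(class_index), int(start), int(end))


# Hypothetical sample line; real paths depend on the local dataset layout.
sample = "some/dataset/folder 12 35 78\n"
clip = parse_n_frames_line(sample)
print(clip)
```

A line with fewer than four fields raises a `ValueError`, which makes malformed annotation files fail early instead of silently producing wrong labels.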

Running the code

  • Offline testing (offline_test.py) and training (main.py)
bash run_offline.sh
  • Online testing
bash run_online.sh
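
To give a feel for the online setting, the sketch below illustrates the two-model idea from the abstract: a lightweight detector watches a short sliding window, and the deeper classifier is only consulted while a gesture is detected, so that each gesture is reported once. It is a simplified illustration and not the logic of the repository's online test script. The function `run_online`, its window sizes, and the toy detector and classifier are all assumptions made for the example; in practice both models would be 3D CNNs operating on frame tensors.

```python
from collections import deque


def run_online(frames, detector, classifier, det_window=4, clf_window=8):
    """Emit one label per detected gesture from a stream of frames."""
    det_queue = deque(maxlen=det_window)  # short window for the detector
    clf_queue = deque(maxlen=clf_window)  # longer window for the classifier
    scores = {}      # class scores accumulated over the current gesture
    active = False   # True while the detector reports a gesture
    detections = []
    for frame in frames:
        det_queue.append(frame)
        clf_queue.append(frame)
        if len(det_queue) < det_window:
            continue  # wait until the detector window is full
        if detector(list(det_queue)):
            active = True
            for label, score in classifier(list(clf_queue)).items():
                scores[label] = scores.get(label, 0.0) + score
        elif active:
            # Gesture just ended: report the best class exactly once.
            if scores:
                detections.append(max(scores, key=scores.get))
            scores, active = {}, False
    if active and scores:
        detections.append(max(scores, key=scores.get))
    return detections


# Toy stand-ins: each frame is an integer label, 0 meaning "no gesture".
def toy_detector(window):
    return window[-1] != 0


def toy_classifier(window):
    counts = {}
    for frame in window:
        if frame != 0:
            counts[frame] = counts.get(frame, 0) + 1
    return counts


stream = [0] * 8 + [3] * 10 + [0] * 8 + [5] * 10 + [0] * 4
recognized = run_online(stream, toy_detector, toy_classifier)
print(recognized)  # → [3, 5]
```

Gating the classifier behind the detector is what keeps the expensive model idle while no gesture is being performed, and resetting the accumulated scores after each gesture is one simple way to obtain single-time activations.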

Citation

Please cite the following articles if you use this code or pre-trained models:

@article{kopuklu_real-time_2019,
  title={Real-time Hand Gesture Detection and Classification Using Convolutional Neural Networks},
  url={http://arxiv.org/abs/1901.10323},
  author={Köpüklü, Okan and Gunduz, Ahmet and Kose, Neslihan and Rigoll, Gerhard},
  year={2019}
}
@article{kopuklu2020online,
  title={Online Dynamic Hand Gesture Recognition Including Efficiency Analysis},
  author={K{\"o}p{\"u}kl{\"u}, Okan and Gunduz, Ahmet and Kose, Neslihan and Rigoll, Gerhard},
  journal={IEEE Transactions on Biometrics, Behavior, and Identity Science},
  volume={2},
  number={2},
  pages={85--97},
  year={2020},
  publisher={IEEE}
}

Acknowledgement

We thank Kensho Hara for releasing his codebase, on top of which we built our work.