
karolzak / conv3d-video-action-recognition

License: MIT license
My experimentation around action recognition in videos. Contains a Keras implementation of the C3D network based on the original paper "Learning Spatiotemporal Features with 3D Convolutional Networks" by Tran et al., and includes video processing pipelines coded using the mPyPl package. The model is benchmarked on the popular UCF101 dataset and achieves result…

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to conv3d-video-action-recognition

MiCT-Net-PyTorch
Video Recognition using Mixed Convolutional Tube (MiCT) on PyTorch with a ResNet backbone
Stars: ✭ 48 (-4%)
Mutual labels:  action-recognition, video-classification, video-recognition, ucf101
MTL-AQA
What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment [CVPR 2019]
Stars: ✭ 38 (-24%)
Mutual labels:  video-processing, action-recognition, c3d
temporal-ssl
Video Representation Learning by Recognizing Temporal Transformations. In ECCV, 2020.
Stars: ✭ 46 (-8%)
Mutual labels:  action-recognition, c3d, ucf101
Awesome Action Recognition
A curated list of action recognition and related area resources
Stars: ✭ 3,202 (+6304%)
Mutual labels:  video-processing, action-recognition, video-recognition
C3D-tensorflow
Action recognition with C3D network implemented in tensorflow
Stars: ✭ 34 (-32%)
Mutual labels:  action-recognition, c3d, video-classification
ViCC
[WACV'22] Code repository for the paper "Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting", https://arxiv.org/abs/2106.10137.
Stars: ✭ 33 (-34%)
Mutual labels:  action-recognition, video-recognition
cpnet
Learning Video Representations from Correspondence Proposals (CVPR 2019 Oral)
Stars: ✭ 93 (+86%)
Mutual labels:  action-recognition, video-classification
GST-video
ICCV 19 Grouped Spatial-Temporal Aggregation for Efficient Action Recognition
Stars: ✭ 40 (-20%)
Mutual labels:  action-recognition, video-classification
two-stream-fusion-for-action-recognition-in-videos
No description or website provided.
Stars: ✭ 80 (+60%)
Mutual labels:  action-recognition, ucf101
3d Resnets Pytorch
3D ResNets for Action Recognition (CVPR 2018)
Stars: ✭ 3,169 (+6238%)
Mutual labels:  action-recognition, video-recognition
vlog action recognition
Identifying Visible Actions in Lifestyle Vlogs
Stars: ✭ 13 (-74%)
Mutual labels:  video-processing, action-recognition
Actionvlad
ActionVLAD for video action classification (CVPR 2017)
Stars: ✭ 217 (+334%)
Mutual labels:  video-processing, action-recognition
Lintel
A Python module to decode video frames directly, using the FFmpeg C API.
Stars: ✭ 240 (+380%)
Mutual labels:  video-processing, action-recognition
tennis action recognition
Using deep learning to perform action recognition in the sport of tennis.
Stars: ✭ 17 (-66%)
Mutual labels:  video-processing, action-recognition
TA3N
[ICCV 2019 Oral] TA3N: https://github.com/cmhungsteve/TA3N (Most updated repo)
Stars: ✭ 45 (-10%)
Mutual labels:  action-recognition, video-classification
Video-Stabilization-and-image-mosaicing
Video stabilization: stabilize videos taken with a wavering camera. Image mosaicing: stitch multiple overlapping snapshot images of a video together to produce one large image.
Stars: ✭ 16 (-68%)
Mutual labels:  video-processing
CCL
PyTorch Implementation on Paper [CVPR2021]Distilling Audio-Visual Knowledge by Compositional Contrastive Learning
Stars: ✭ 76 (+52%)
Mutual labels:  video-recognition
CrowdFlow
Optical Flow Dataset and Benchmark for Visual Crowd Analysis
Stars: ✭ 87 (+74%)
Mutual labels:  video-processing
FunFilter
Freely paint an area and the software will automatically apply a filter to it.
Stars: ✭ 15 (-70%)
Mutual labels:  video-processing
UniFormer
[ICLR2022] official implementation of UniFormer
Stars: ✭ 574 (+1048%)
Mutual labels:  video-classification

About

This repo contains a Keras implementation of the C3D network based on the paper "Learning Spatiotemporal Features with 3D Convolutional Networks" by Tran et al., and it includes video processing pipelines coded using the amazing mPyPl package. The model is benchmarked on UCF101 - a popular action recognition dataset - and achieves results similar to those reported by the original authors.

Examples of action recognition tasks from UCF101:

ApplyEyeMakeUp Kayaking PlayingFlute

The whole beauty of C3D is that it uses Conv3D layers to learn spatiotemporal features from video frames, and when trained on a large enough dataset it can be used as a compact, uniform video feature extractor/descriptor. Features extracted from such a model can easily be used to build a simple Linear SVM classifier. This approach achieves close-to-the-best results on most action recognition benchmark datasets while remaining very fast and efficient, which is perfect for video processing tasks. The C3D pretrained weights used in this experiment come from training on the Sports-1M dataset.
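As a rough sketch of the "simple classifier on top of C3D features" idea, the snippet below fits scikit-learn's LinearSVC (an assumption here - the repo does not pin a specific SVM implementation) on synthetic stand-in vectors; in the real pipeline each video would contribute one averaged C3D descriptor instead of a random cluster sample.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Synthetic stand-in for per-video C3D descriptors: two "action classes"
# as well-separated Gaussian clusters in a 128-dim feature space.
rng = np.random.default_rng(0)
n_per_class, dim = 40, 128
X = np.vstack([rng.normal(0.0, 1.0, (n_per_class, dim)),
               rng.normal(2.0, 1.0, (n_per_class, dim))])
y = np.array([0] * n_per_class + [1] * n_per_class)

# A plain linear SVM is enough once the features are discriminative.
clf = LinearSVC(C=1.0, max_iter=10000)
clf.fit(X, y)
train_acc = clf.score(X, y)
```

With real C3D features the same two lines of `fit`/`score` apply; only the feature-loading step changes.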


Prerequisites

This project heavily relies on the mPyPl package. It is included in requirements.txt, so to install it along with all the other required dependencies you can run:

pip install -r requirements.txt

How to run it

End-to-end experiment

To run the experiment end-to-end (from the original UCF101 videos, through preprocessing, to feature generation), follow the steps below:

  1. Download the UCF101 dataset and put the videos, with all class subfolders, under data/ucf101/videos/ so you have a structure such as this:

    data
    ├── ucf101
        ├── videos
            ├── ApplyEyeMakeup    
            │   ├── v_ApplyEyeMakeup_g01_c01.avi       
            │   ├── v_ApplyEyeMakeup_g01_c02.avi           
            │   └── ...  
            ├── ApplyLipstick             
            │   ├── v_ApplyLipstick_g01_c01.avi       
            │   ├── v_ApplyLipstick_g01_c02.avi            
            │   └── ...                
            └── ...
    
  2. Download the Sports-1M pretrained C3D model and put it under models/

  3. Go to the notebook with end-to-end experiment for action recognition using pretrained C3D on UCF101 and execute it cell by cell


Final classification part of the experiment only

In case you don't want to go through all the data loading and preprocessing steps (they are very time- and storage-consuming), you can simply download precomputed feature vectors for each video and skip those steps of the end-to-end experiment:

  1. Download the feature vectors for each video from UCF101 and put them under data/ucf101/videos/ so you have a structure such as this:

    data
    ├── ucf101
        ├── videos
            ├── ApplyEyeMakeup    
            │   ├── v_ApplyEyeMakeup_g01_c01.proc.c3d-avg.npy     
            │   ├── v_ApplyEyeMakeup_g01_c02.proc.c3d-avg.npy            
            │   └── ...  
            ├── ApplyLipstick             
            │   ├── v_ApplyLipstick_g01_c01.proc.c3d-avg.npy        
            │   ├── v_ApplyLipstick_g01_c02.proc.c3d-avg.npy              
            │   └── ...                
            └── ...
    
  2. Go to the notebook with end-to-end experiment for action recognition using pretrained C3D on UCF101 and execute it cell by cell skipping STEP 1 and STEP 2

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].