PaddlePaddle / Paddlevideo

Licence: apache-2.0

Comprehensive, up-to-date, and deployable video deep learning algorithms, covering video recognition, action localization, and temporal action detection tasks. A high-performance, lightweight codebase that provides practical models for video understanding research and applications.

Programming Languages

python


Labels

action-recognition ava video-understanding

Projects that are alternatives of or similar to Paddlevideo

Mmaction2

OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark

Stars: ✭ 684 (+213.76%)

Mutual labels: action-recognition, video-understanding, ava

Step

STEP: Spatio-Temporal Progressive Learning for Video Action Detection. CVPR'19 (Oral)

Stars: ✭ 196 (-10.09%)

Mutual labels: action-recognition, video-understanding, ava

Video Understanding Dataset

A collection of recent video understanding datasets, under construction!

Stars: ✭ 387 (+77.52%)

Mutual labels: action-recognition, video-understanding

Action Detection

temporal action detection with SSN

Stars: ✭ 597 (+173.85%)

Mutual labels: action-recognition, video-understanding

Actionvlad

ActionVLAD for video action classification (CVPR 2017)

Stars: ✭ 217 (-0.46%)

Mutual labels: action-recognition, video-understanding

DIN-Group-Activity-Recognition-Benchmark

A new codebase for Group Activity Recognition. It contains codes for ICCV 2021 paper: Spatio-Temporal Dynamic Inference Network for Group Activity Recognition and some other methods.

Stars: ✭ 26 (-88.07%)

Mutual labels: action-recognition, video-understanding

DEAR

[ICCV 2021 Oral] Deep Evidential Action Recognition

Stars: ✭ 36 (-83.49%)

Mutual labels: action-recognition, video-understanding

Tsn Pytorch

Temporal Segment Networks (TSN) in PyTorch

Stars: ✭ 895 (+310.55%)

Mutual labels: action-recognition, video-understanding

Temporal Segment Networks

Code & Models for Temporal Segment Networks (TSN) in ECCV 2016

Stars: ✭ 1,287 (+490.37%)

Mutual labels: action-recognition, video-understanding

Movienet Tools

Tools for movie and video research

Stars: ✭ 113 (-48.17%)

Mutual labels: action-recognition, video-understanding

I3d finetune

TensorFlow code for finetuning I3D model on UCF101.

Stars: ✭ 128 (-41.28%)

Mutual labels: action-recognition, video-understanding

MTL-AQA

What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment [CVPR 2019]

Stars: ✭ 38 (-82.57%)

Mutual labels: action-recognition, video-understanding

Alphaction

Spatio-Temporal Action Localization System

Stars: ✭ 221 (+1.38%)

Mutual labels: action-recognition, ava

Awesome Action Recognition

A curated list of action recognition and related area resources

Stars: ✭ 3,202 (+1368.81%)

Mutual labels: action-recognition, video-understanding

Tdn

[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition

Stars: ✭ 72 (-66.97%)

Mutual labels: action-recognition, video-understanding

Mmaction

An open-source toolbox for action understanding based on PyTorch

Stars: ✭ 1,711 (+684.86%)

Mutual labels: action-recognition, video-understanding

Awesome Activity Prediction

Paper list of activity prediction and related area

Stars: ✭ 147 (-32.57%)

Mutual labels: action-recognition, video-understanding

Youtube 8m

The 2nd place Solution to the Youtube-8M Video Understanding Challenge by Team Monkeytyping (based on tensorflow)

Stars: ✭ 171 (-21.56%)

Mutual labels: video-understanding

Mmskeleton

A OpenMMLAB toolbox for human pose estimation, skeleton-based action recognition, and action synthesis.

Stars: ✭ 2,378 (+990.83%)

Mutual labels: action-recognition

Lad

👦 Lad is the best Node.js framework. Made by a former Express TC and Koa team member.

Stars: ✭ 2,112 (+868.81%)

Mutual labels: ava


PaddleVideo

Introduction

PaddleVideo is a toolset for video recognition, action localization, and spatio-temporal action detection tasks, built for both industry and academia. This repository provides examples and best-practice guidelines for exploring deep learning algorithms in the video domain. We are devoted to supporting experiments and utilities that can significantly reduce the "time to deploy". It also serves as a proficiency verification and implementation of the newest PaddlePaddle 2.0 in the video field.

Features

- Advanced model zoo design: PaddleVideo unifies video understanding tasks, including recognition, localization, and spatio-temporal action detection. With a clear configuration system based on IoC/DI (inversion of control / dependency injection), we designed a decoupled, modular, and extensible framework in which a customized network can be constructed easily by combining different modules.
- Various datasets and architectures: PaddleVideo supports multiple datasets and architectures, including the Kinetics400, UCF101, and YouTube8M datasets; video recognition models such as TSN, TSM, SlowFast, and AttentionLSTM; and action localization models such as BMN.
- Higher performance: PaddleVideo has built-in solutions to improve the accuracy of recognition models. PP-TSM, which is based on the standard TSM, already achieves the best performance among 2D recognition networks: it has the same number of parameters but improves Top-1 accuracy to 73.5%, and the same solutions can easily be applied to your own dataset.
- Faster training strategy: PaddleVideo supports a faster training strategy that accelerates training by 100% compared with the standard SlowFast version; training from scratch on the Kinetics400 dataset takes only 10 days.
- Deployable: PaddleVideo is powered by Paddle Inference. There is no need to convert the model to ONNX format for deployment; everything you need can be found in this repository.
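
The configuration-driven, IoC/DI-style design mentioned in the feature list can be illustrated with a minimal sketch. Everything here is hypothetical and for illustration only: the `Registry` class, the `ToyBackbone`, `ToyHead`, and `ToyRecognizer` classes, and the config keys are assumptions, not PaddleVideo's actual API. The idea is that modules register themselves by name, and a builder assembles a network from a plain config dictionary.

```python
# Hypothetical sketch of config-driven module construction.
# None of these names are taken from PaddleVideo itself.


class Registry:
    """Maps a string name to a class so configs can refer to modules by name."""

    def __init__(self, name):
        self.name = name
        self._classes = {}

    def register(self, cls):
        # Used as a decorator: stores the class under its own name.
        self._classes[cls.__name__] = cls
        return cls

    def build(self, cfg):
        # Copy so the caller's config dict is left untouched.
        cfg = dict(cfg)
        cls_name = cfg.pop("name")
        if cls_name not in self._classes:
            raise KeyError(f"{cls_name!r} is not registered in {self.name}")
        # Remaining keys become constructor arguments.
        return self._classes[cls_name](**cfg)


BACKBONES = Registry("backbone")
HEADS = Registry("head")


@BACKBONES.register
class ToyBackbone:
    def __init__(self, depth):
        self.depth = depth


@HEADS.register
class ToyHead:
    def __init__(self, num_classes):
        self.num_classes = num_classes


class ToyRecognizer:
    """Receives its parts from outside instead of constructing them itself."""

    def __init__(self, backbone, head):
        self.backbone = backbone
        self.head = head


def build_model(cfg):
    # The builder, not the model, decides which concrete parts are used.
    backbone = BACKBONES.build(cfg["backbone"])
    head = HEADS.build(cfg["head"])
    return ToyRecognizer(backbone, head)


config = {
    "backbone": {"name": "ToyBackbone", "depth": 50},
    "head": {"name": "ToyHead", "num_classes": 400},
}
model = build_model(config)
```

In a design like this, swapping the backbone or head only requires changing the `name` entry in the config; the recognizer code stays the same.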

Overview of the kit structures

Architectures
- Recognition: TSN, TSM, SlowFast, PP-TSM, VideoTag, AttentionLSTM
- Localization: BMN

Frameworks
- Recognizer1D
- Recognizer2D
- Recognizer3D
- Localizer

Components
- resnet, resnet_tsm, resnet_tweaks_tsm, bmn
- tsm_head, tsn_head, bmn_head
- Solver
  - Optimizer: Momentum, RMSProp
  - LearningRate: PiecewiseDecay
- Loss: CrossEntropy, BMNLoss
- Metrics: CenterCrop, MultiCrop

Data Augmentation
- Video: Mixup, Cutmix
- Image: Scale, Random Flip, Jitter Scale, Crop, MultiCrop, Center Crop, MultiScaleCrop, Random Crop, PackOutput
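
As an illustration of the PiecewiseDecay learning-rate entry listed in the overview above, here is a minimal plain-Python sketch of a piecewise-constant schedule. The function name `piecewise_decay` and the example boundaries and values are made up for illustration; this is not the implementation used in this repository.

```python
import bisect


def piecewise_decay(step, boundaries, values):
    """Return the learning rate for `step` under a piecewise-constant schedule.

    `values` must have one more entry than `boundaries`: values[i] applies
    while step < boundaries[i], and the last value applies afterwards.
    """
    if len(values) != len(boundaries) + 1:
        raise ValueError("values must have len(boundaries) + 1 entries")
    # bisect_right counts how many boundaries are <= step,
    # which is exactly the index of the active interval.
    return values[bisect.bisect_right(boundaries, step)]


# Illustrative schedule: drop the rate by 10x at steps 40 and 60.
boundaries = [40, 60]
values = [0.01, 0.001, 0.0001]
schedule = [piecewise_decay(s, boundaries, values) for s in (0, 39, 40, 59, 60, 100)]
```

With these example numbers, the rate stays at 0.01 up to step 39, switches to 0.001 at step 40, and to 0.0001 from step 60 onward.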

Overview of the performance

The chart below illustrates the performance of 2D and 3D video recognition models, including our implementations and the PyTorch versions. It shows the relationship between Top-1 accuracy and VPS on the Kinetics400 dataset. (Tested on an NVIDIA® Tesla® V100 GPU.)

Note:

PP-TSM improves Top-1 accuracy by almost 3.5% over the standard TSM.
All models shown in red are available in the Model Zoo; the others are PyTorch results.
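
For readers unfamiliar with the Top-1 metric used above, the following is a minimal sketch of how top-k accuracy is commonly computed. The function name `top_k_accuracy` and the toy scores and labels are illustrative assumptions; they are not taken from this repository's evaluation code or results.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example: 4 clips, 3 classes (made-up numbers).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.4, 0.35, 0.25],
]
labels = [1, 0, 1, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

Top-1 counts a prediction as correct only when the highest-scoring class matches the label, so it is always less than or equal to top-k accuracy for larger k.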

Community

Scan the QR code below with WeChat and reply "video" to join the official technical exchange group. We look forward to your participation.

Applications

VideoTag: 3k Large-Scale video classification model

FootballAction: Football action detection model

Tutorials and Docs

Quick Start
Project design
- Modular design
- Configuration design
Model zoo
- Recognition
  - Attention-LSTM
  - TSN
  - TSM
  - PP-TSM
  - SlowFast
- Localization
  - BMN
- Spatio temporal action detection
  - Coming Soon!
Practice
Others
- Benchmark
- Tools

License

PaddleVideo is released under the Apache 2.0 license.

Contributing

This project welcomes contributions and suggestions. Please see our contribution guidelines.

Many thanks to mohui37 for contributing the code for prediction.


Stars: ✭ 218
