yoosan / Video Understanding Dataset
A collection of recent video understanding datasets, under construction!
Stars: ✭ 387
Projects that are alternatives of or similar to Video Understanding Dataset
Tdn
[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition
Stars: ✭ 72 (-81.4%)
Mutual labels: action-recognition, video-understanding
Mmaction
An open-source toolbox for action understanding based on PyTorch
Stars: ✭ 1,711 (+342.12%)
Mutual labels: action-recognition, video-understanding
Temporal Segment Networks
Code & Models for Temporal Segment Networks (TSN) in ECCV 2016
Stars: ✭ 1,287 (+232.56%)
Mutual labels: action-recognition, video-understanding
Action Detection
temporal action detection with SSN
Stars: ✭ 597 (+54.26%)
Mutual labels: action-recognition, video-understanding
Paddlevideo
Comprehensive, up-to-date, and deployable video deep learning algorithms, covering video recognition, action localization, and temporal action detection. A high-performance, lightweight codebase that provides practical models for video understanding research and applications
Stars: ✭ 218 (-43.67%)
Mutual labels: action-recognition, video-understanding
Tsn Pytorch
Temporal Segment Networks (TSN) in PyTorch
Stars: ✭ 895 (+131.27%)
Mutual labels: action-recognition, video-understanding
I3d finetune
TensorFlow code for finetuning I3D model on UCF101.
Stars: ✭ 128 (-66.93%)
Mutual labels: action-recognition, video-understanding
Mmaction2
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
Stars: ✭ 684 (+76.74%)
Mutual labels: action-recognition, video-understanding
Actionvlad
ActionVLAD for video action classification (CVPR 2017)
Stars: ✭ 217 (-43.93%)
Mutual labels: action-recognition, video-understanding
Step
STEP: Spatio-Temporal Progressive Learning for Video Action Detection. CVPR'19 (Oral)
Stars: ✭ 196 (-49.35%)
Mutual labels: action-recognition, video-understanding
Awesome Action Recognition
A curated list of action recognition and related area resources
Stars: ✭ 3,202 (+727.39%)
Mutual labels: action-recognition, video-understanding
DIN-Group-Activity-Recognition-Benchmark
A new codebase for Group Activity Recognition. It contains codes for ICCV 2021 paper: Spatio-Temporal Dynamic Inference Network for Group Activity Recognition and some other methods.
Stars: ✭ 26 (-93.28%)
Mutual labels: action-recognition, video-understanding
Movienet Tools
Tools for movie and video research
Stars: ✭ 113 (-70.8%)
Mutual labels: action-recognition, video-understanding
Awesome Activity Prediction
Paper list of activity prediction and related areas
Stars: ✭ 147 (-62.02%)
Mutual labels: action-recognition, video-understanding
MTL-AQA
What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment [CVPR 2019]
Stars: ✭ 38 (-90.18%)
Mutual labels: action-recognition, video-understanding
DEAR
[ICCV 2021 Oral] Deep Evidential Action Recognition
Stars: ✭ 36 (-90.7%)
Mutual labels: action-recognition, video-understanding
Akshare
AKShare is an elegant and simple open-source financial data interface library for Python, built for human beings!
Stars: ✭ 4,334 (+1019.9%)
Mutual labels: datasets
Realtime Action Detection
This repository hosts the code for a real-time action detection paper
Stars: ✭ 271 (-29.97%)
Mutual labels: action-recognition
Dr.sure
🏫 Deep learning study notes, along with notes on using TensorFlow and PyTorch. Dr. Sure adds the latest techniques he comes across from time to time; comments and corrections are welcome.
Stars: ✭ 365 (-5.68%)
Mutual labels: datasets
Video-understanding-dataset
Please feel free to open a pull request.
Note: ActivityNet v1.3, Kinetics-600, Moments in Time, and AVA will be used in the ActivityNet Challenge 2018.
Video Classification
Dataset | Paper | Website | Category | #Examples | #Classes | Duration | Organizer | SOTA performance
---|---|---|---|---|---|---|---|---
UCF101 | - | Link | human action | 13,320 | 101 | <10s | UCF | 98% (DeepMind I3D)
HMDB51 | - | Link | human action | 6,766 | 51 | <10s | Brown | 80.7% (DeepMind I3D)
ActivityNet v1.3 | - | Link | human activities | ~20,000 | 200 | - | ActivityNet | 8.83% err (iBUG)
Charades | - | Link | daily human activities | 9,848 | 157 | - | AI2 | 0.3441 mAP (DeepMind I3D)
Kinetics | - | Link | human action | ~500,000 | 600 | 10s | DeepMind | -
Sports-1M | - | Link | sports | ~1 million | 478 | 5m36s | Google & Stanford | -
YouTube-8M | - | Link | visual contents | ~7 million | 4716 | 120-500s | Google Cloud | 85% GAP (WILLOW)
FCVID | - | Link | visual contents | 91,223 | 239 | 100s+ | Fudan-Columbia | -
Something-Something | - | Link | action with objects | 108,499 | 174 | ~4s | TwentyBN | -
Moments in Time | - | Link | action or activity | ~1 million | 339 | 3s | MIT-IBM Watson | -
SLAC | arXiv | Link | recognition and localization | 520K | 200 | ~30.6s | MIT and Facebook | -
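The SOTA column above mixes metrics: top-1 accuracy for UCF101/HMDB51, mAP for Charades, and GAP for YouTube-8M. GAP (Global Average Precision) pools each video's top-k predicted labels across the whole evaluation set and computes average precision over the pooled, confidence-sorted list. A minimal NumPy sketch under that definition (the function name and the `top_k=20` default follow the YouTube-8M convention; the benchmark's own evaluation code is authoritative):

```python
import numpy as np

def global_average_precision(scores, labels, top_k=20):
    """GAP as used in the YouTube-8M benchmark.

    scores: (num_videos, num_classes) predicted confidences
    labels: (num_videos, num_classes) binary ground-truth matrix
    """
    confidences, corrects = [], []
    for s, y in zip(scores, labels):
        top = np.argsort(s)[::-1][:top_k]        # top-k predictions per video
        confidences.extend(s[top])
        corrects.extend(y[top])
    order = np.argsort(confidences)[::-1]        # pool and sort globally
    corrects = np.asarray(corrects, dtype=float)[order]
    total_positives = labels.sum()               # all ground-truth labels
    precisions = np.cumsum(corrects) / (np.arange(len(corrects)) + 1)
    return float((precisions * corrects).sum() / total_positives)
```

A perfect ranking yields a GAP of 1.0; every misranked prediction pulls the pooled precision curve down.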
Temporal Action Detection
Dataset | Paper | Website | #Examples | Organizer | SOTA performance
---|---|---|---|---|---
THUMOS2014 | - | Link | 9,682 | UCF | -
ActivityNet (v1.3) | - | Link | ~20,000 | ActivityNet | 0.344 (SJTU & Columbia)
Broad Video Highlights | - | Link | 18,000 | Baidu | -
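Temporal action detection is typically scored with mAP at temporal IoU (tIoU) thresholds; the 0.344 ActivityNet figure, for instance, is an average mAP over a range of tIoU thresholds. A minimal sketch of tIoU between two (start, end) segments (function name is illustrative):

```python
def temporal_iou(seg_a, seg_b):
    """Temporal IoU between two (start, end) segments, e.g. in seconds."""
    inter = max(0.0, min(seg_a[1], seg_b[1]) - max(seg_a[0], seg_b[0]))
    union = (seg_a[1] - seg_a[0]) + (seg_b[1] - seg_b[0]) - inter
    return inter / union if union > 0 else 0.0
```

A predicted segment counts as a true positive at threshold t only if its tIoU with an unmatched ground-truth segment of the same class is at least t.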
Spatio-temporally Localized Atomic Visual Actions
Dataset | Paper | Website | #Examples | #Classes | Organizer | SOTA performance
---|---|---|---|---|---|---
AVA | arXiv | Link | 57.6k | 80 | Google & Berkeley | -
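AVA-style evaluation scores a detection as correct when its action label matches and its bounding box overlaps a ground-truth box on the keyframe with sufficient IoU (the standard protocol uses a 0.5 threshold). A minimal sketch of that box IoU (function name is illustrative):

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```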
Hand Gestures in Videos
Dataset | Paper | Website | #Examples | #Classes | Organizer | SOTA performance
---|---|---|---|---|---|---
Jester | - | Link | 148,092 | 27 | TwentyBN | 95.34% (Ke Yang, NUDT_PDL)
Video Captioning
Dataset | Paper | Website | Context | #Examples | Organizer | SOTA performance
---|---|---|---|---|---|---
MPII-MD | - | Link | movie | 68,337 clips with 68,375 sentences | MPII | -
MSR-VTT | - | Link | 20 categories | 10,000 clips with 200,000 sentences | MSR | -
Charades | - | Link | human activity | 9,848 clips with 27,847 sentences | AI2 | -
Densevid | - | Link | event | 20k clips and 100k sentences | Stanford, ActivityNet | -
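Captioning benchmarks are usually scored with n-gram metrics such as BLEU, METEOR, and CIDEr. As an illustration, a minimal sentence-level BLEU-1 (clipped unigram precision with a brevity penalty; this is a simplified sketch, and real evaluations use the official tooling and higher-order n-grams):

```python
import math
from collections import Counter

def bleu1(candidate, references):
    """Sentence-level BLEU-1: clipped unigram precision times a brevity
    penalty. Assumes a non-empty, whitespace-tokenized candidate."""
    cand = candidate.split()
    cand_counts = Counter(cand)
    max_ref = Counter()                      # per-token max count over references
    for ref in references:
        for tok, n in Counter(ref.split()).items():
            max_ref[tok] = max(max_ref[tok], n)
    clipped = sum(min(n, max_ref[tok]) for tok, n in cand_counts.items())
    precision = clipped / len(cand)
    # closest reference length (ties broken toward the shorter reference)
    ref_len = min((len(r.split()) for r in references),
                  key=lambda L: (abs(L - len(cand)), L))
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / len(cand))
    return bp * precision
```

Clipping prevents degenerate captions like "the the the" from scoring full precision against a reference containing a single "the".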
Video Question Answering
Dataset | Paper | Website | Task | #Examples | Organizer | SOTA performance
---|---|---|---|---|---|---
MovieQA | - | Link | question answering in movies | 408 movies & 14,944 QAs | UToronto | -
MarioQA | - | Link | reasoning about events in game videos | 187,757 examples with 92,874 QAs | POSTECH | -
Note that the project description data, including the texts, logos, images, and/or trademarks,
for each open source project belongs to its rightful owner.
If you wish to add or remove any projects, please contact us at [email protected].