WuJie1010 / Temporally Language Grounding

A PyTorch implementation of some state-of-the-art models for "Temporally Language Grounding in Untrimmed Videos"

Requirements

  • Python 2.7
  • PyTorch 0.4.1
  • matplotlib
  • The code targets the Charades-STA dataset.

Three Models for this task

Supervised Learning based methods

  • TALL: Temporal Activity Localization via Language Query
  • MAC: Mining Activity Concepts for Language-based Temporal Localization.

Reinforcement Learning based method

  • A2C: Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos.

Performance

Methods   R@1, IoU=0.7   R@1, IoU=0.5   R@5, IoU=0.7   R@5, IoU=0.5
TALL      8.63           24.09          29.33          59.60
MAC       12.31          29.68          37.31          64.14
A2C       14.25          32.66          None           None
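
The columns above appear to follow the usual convention for this task: "R@n, IoU=m" is the percentage of queries for which at least one of the top-n predicted segments overlaps the ground-truth segment with a temporal IoU of at least m. The sketch below illustrates that convention on made-up data. The function names, the (start, end) tuple format, and the use of ">=" at the threshold are assumptions for illustration and are not taken from this repository's evaluation code.

```python
def temporal_iou(pred, gt):
    """Intersection over union of two (start, end) segments, in seconds."""
    ps, pe = float(pred[0]), float(pred[1])
    gs, ge = float(gt[0]), float(gt[1])
    inter = max(0.0, min(pe, ge) - max(ps, gs))
    union = (pe - ps) + (ge - gs) - inter
    return inter / union if union > 0 else 0.0


def recall_at_n(ranked_preds, gts, n, iou_threshold):
    """Percentage of queries with a hit among their top-n predictions."""
    hits = 0
    for preds, gt in zip(ranked_preds, gts):
        if any(temporal_iou(p, gt) >= iou_threshold for p in preds[:n]):
            hits += 1
    return 100.0 * hits / len(gts)


# Toy data, invented for illustration (not Charades-STA results).
ranked_preds = [
    [(0.0, 10.0), (4.0, 12.0)],   # query 1, highest-scored segment first
    [(20.0, 30.0), (2.0, 6.0)],   # query 2
]
gts = [(5.0, 12.0), (2.0, 7.0)]

print(recall_at_n(ranked_preds, gts, n=1, iou_threshold=0.5))
print(recall_at_n(ranked_preds, gts, n=2, iou_threshold=0.7))
```

Under this reading, a method that outputs a single segment per query, as the reinforcement learning approach does, has no ranked list to evaluate beyond n=1, which would explain the "None" entries for A2C.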

Features Download

Training and Testing

To train and test TALL, run:

python main_charades_SL.py --model TALL

To train and test MAC, run:

python main_charades_SL.py --model MAC

To train and test A2C, run:

python main_charades_RL.py
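
To run all three models in sequence, the commands above can be wrapped in a small driver. The following is only a sketch: build_command is a hypothetical helper that is not part of this repository, and it assumes the scripts are launched from the repository root with a "python" executable on the PATH. The script names and the --model flag are taken from the commands above.

```python
# Hypothetical helper; only the script names and --model flag come from the README.
SUPERVISED_MODELS = ("TALL", "MAC")


def build_command(model):
    """Return the argument list that trains and tests the given model."""
    if model in SUPERVISED_MODELS:
        return ["python", "main_charades_SL.py", "--model", model]
    if model == "A2C":
        return ["python", "main_charades_RL.py"]
    raise ValueError("unknown model: %s" % model)


for name in ("TALL", "MAC", "A2C"):
    print(" ".join(build_command(name)))
```

Each returned list can be passed to subprocess.check_call to launch the corresponding run.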

Acknowledgements

Thanks to the authors of the original TALL and MAC implementations and to the awesome PyTorch team.
