WuJie1010 / Temporally Language Grounding
A PyTorch implementation of several state-of-the-art models for "Temporally Language Grounding in Untrimmed Videos"
Stars: ✭ 73
Programming Languages
python
Projects that are alternatives of or similar to Temporally Language Grounding
Awesome Grounding
awesome grounding: A curated list of research papers in visual grounding
Stars: ✭ 247 (+238.36%)
Mutual labels: video-understanding
CP-360-Weakly-Supervised-Saliency
CP-360-Weakly-Supervised-Saliency
Stars: ✭ 20 (-72.6%)
Mutual labels: video-understanding
Activity Recognition With Cnn And Rnn
Temporal Segments LSTM and Temporal-Inception for Activity Recognition
Stars: ✭ 415 (+468.49%)
Mutual labels: video-understanding
glimpse clouds
Pytorch implementation of the paper "Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points", F. Baradel, C. Wolf, J. Mille , G.W. Taylor, CVPR 2018
Stars: ✭ 30 (-58.9%)
Mutual labels: video-understanding
just-ask
[TPAMI Special Issue on ICCV 2021 Best Papers, Oral] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
Stars: ✭ 57 (-21.92%)
Mutual labels: video-understanding
PyAnomaly
Useful Toolbox for Anomaly Detection
Stars: ✭ 95 (+30.14%)
Mutual labels: video-understanding
Actionvlad
ActionVLAD for video action classification (CVPR 2017)
Stars: ✭ 217 (+197.26%)
Mutual labels: video-understanding
Tsn Pytorch
Temporal Segment Networks (TSN) in PyTorch
Stars: ✭ 895 (+1126.03%)
Mutual labels: video-understanding
Awesome-Temporally-Language-Grounding
A curated list of “Temporally Language Grounding” and related area
Stars: ✭ 97 (+32.88%)
Mutual labels: video-understanding
Video Understanding Dataset
A collection of recent video understanding datasets, under construction!
Stars: ✭ 387 (+430.14%)
Mutual labels: video-understanding
NExT-QA
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)
Stars: ✭ 50 (-31.51%)
Mutual labels: video-understanding
STCNet
STCNet: Spatio-Temporal Cross Network for Industrial Smoke Detection
Stars: ✭ 29 (-60.27%)
Mutual labels: video-understanding
DEAR
[ICCV 2021 Oral] Deep Evidential Action Recognition
Stars: ✭ 36 (-50.68%)
Mutual labels: video-understanding
SSTDA
[CVPR 2020] Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation (PyTorch)
Stars: ✭ 150 (+105.48%)
Mutual labels: video-understanding
Action Detection
temporal action detection with SSN
Stars: ✭ 597 (+717.81%)
Mutual labels: video-understanding
Paddlevideo
Comprehensive, latest, and deployable video deep learning algorithm, including video recognition, action localization, and temporal action detection tasks. It's a high-performance, light-weight codebase provides practical models for video understanding research and application
Stars: ✭ 218 (+198.63%)
Mutual labels: video-understanding
DIN-Group-Activity-Recognition-Benchmark
A new codebase for Group Activity Recognition. It contains codes for ICCV 2021 paper: Spatio-Temporal Dynamic Inference Network for Group Activity Recognition and some other methods.
Stars: ✭ 26 (-64.38%)
Mutual labels: video-understanding
Tdn
[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition
Stars: ✭ 72 (-1.37%)
Mutual labels: video-understanding
Mmaction2
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
Stars: ✭ 684 (+836.99%)
Mutual labels: video-understanding
Awesome Action Recognition
A curated list of action recognition and related area resources
Stars: ✭ 3,202 (+4286.3%)
Mutual labels: video-understanding
Temporally-language-grounding
A PyTorch implementation of several state-of-the-art models for "Temporally language grounding in untrimmed videos"
Requirements
- Python 2.7
- PyTorch 0.4.1
- matplotlib
- The code targets the Charades-STA dataset.
Three Models for this task
Supervised Learning based methods
- TALL: Temporal Activity Localization via Language Query
- MAC: Mining Activity Concepts for Language-based Temporal Localization
Reinforcement Learning based method
- A2C: Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos.
Performance
Methods | R@1, IoU0.7 | R@1, IoU0.5 | R@5, IoU0.7 | R@5, IoU0.5 |
---|---|---|---|---|
TALL | 8.63 | 24.09 | 29.33 | 59.60 |
MAC | 12.31 | 29.68 | 37.31 | 64.14 |
A2C | 14.25 | 32.66 | None | None |
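
For reference, recall at n with an IoU threshold m (usually written R@n, IoU=m) is commonly computed as the percentage of queries for which at least one of the top-n predicted segments overlaps the ground-truth segment with a temporal IoU of at least m. The sketch below illustrates that definition on toy data. It is not the evaluation code from this repository: the function names `temporal_iou` and `recall_at_n` are hypothetical, and counting an IoU exactly equal to the threshold as a hit is an assumption.

```python
from __future__ import division  # keeps true division if run under Python 2.7


def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) segments given in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_n(ranked_preds, gts, n, iou_threshold):
    """Percentage of queries whose top-n predictions contain a hit.

    ranked_preds: one list of (start, end) predictions per query, best first.
    gts: one ground-truth (start, end) segment per query.
    Assumption: a hit means IoU >= iou_threshold.
    """
    hits = 0
    for preds, gt in zip(ranked_preds, gts):
        if any(temporal_iou(p, gt) >= iou_threshold for p in preds[:n]):
            hits += 1
    return 100.0 * hits / len(gts)


# Toy data: two queries with two ranked predictions each (hypothetical values).
ranked_preds = [
    [(0.0, 10.0), (2.0, 8.0)],
    [(5.0, 6.0), (0.0, 4.0)],
]
gts = [(2.0, 8.0), (0.0, 5.0)]

print(recall_at_n(ranked_preds, gts, n=1, iou_threshold=0.5))
print(recall_at_n(ranked_preds, gts, n=5, iou_threshold=0.7))
```

A method that outputs a single segment per query has no ranked list to draw from, so only R@1 can be computed for it; this would be consistent with the "None" entries in the A2C row.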
Features Download
- visual features
- visual activity concepts (for MAC)
- ref_info
- RL_pickle (for A2C)
Training and Testing
To train and test TALL, run:
python main_charades_SL.py --model TALL
To train and test MAC, run:
python main_charades_SL.py --model MAC
To train and test A2C, run:
python main_charades_RL.py
Acknowledgements