JacobYuan7 / DIN-Group-Activity-Recognition-Benchmark

License: MIT License
A new codebase for Group Activity Recognition. It contains code for the ICCV 2021 paper Spatio-Temporal Dynamic Inference Network for Group Activity Recognition and some other methods.

Programming Languages

  • Python
  • Dockerfile

Projects that are alternatives of or similar to DIN-Group-Activity-Recognition-Benchmark

Mmaction2
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
Stars: ✭ 684 (+2530.77%)
Mutual labels:  action-recognition, video-understanding
Movienet Tools
Tools for movie and video research
Stars: ✭ 113 (+334.62%)
Mutual labels:  action-recognition, video-understanding
Tsn Pytorch
Temporal Segment Networks (TSN) in PyTorch
Stars: ✭ 895 (+3342.31%)
Mutual labels:  action-recognition, video-understanding
Awesome Action Recognition
A curated list of action recognition and related area resources
Stars: ✭ 3,202 (+12215.38%)
Mutual labels:  action-recognition, video-understanding
Step
STEP: Spatio-Temporal Progressive Learning for Video Action Detection. CVPR'19 (Oral)
Stars: ✭ 196 (+653.85%)
Mutual labels:  action-recognition, video-understanding
Action Detection
temporal action detection with SSN
Stars: ✭ 597 (+2196.15%)
Mutual labels:  action-recognition, video-understanding
Temporal Segment Networks
Code & Models for Temporal Segment Networks (TSN) in ECCV 2016
Stars: ✭ 1,287 (+4850%)
Mutual labels:  action-recognition, video-understanding
Video Understanding Dataset
A collection of recent video understanding datasets, under construction!
Stars: ✭ 387 (+1388.46%)
Mutual labels:  action-recognition, video-understanding
Awesome Activity Prediction
Paper list of activity prediction and related area
Stars: ✭ 147 (+465.38%)
Mutual labels:  action-recognition, video-understanding
Mmaction
An open-source toolbox for action understanding based on PyTorch
Stars: ✭ 1,711 (+6480.77%)
Mutual labels:  action-recognition, video-understanding
DEAR
[ICCV 2021 Oral] Deep Evidential Action Recognition
Stars: ✭ 36 (+38.46%)
Mutual labels:  action-recognition, video-understanding
Paddlevideo
Comprehensive, up-to-date, and deployable video deep learning algorithms, covering video recognition, action localization, and temporal action detection tasks. It's a high-performance, light-weight codebase that provides practical models for video understanding research and application
Stars: ✭ 218 (+738.46%)
Mutual labels:  action-recognition, video-understanding
Tdn
[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition
Stars: ✭ 72 (+176.92%)
Mutual labels:  action-recognition, video-understanding
I3d finetune
TensorFlow code for finetuning I3D model on UCF101.
Stars: ✭ 128 (+392.31%)
Mutual labels:  action-recognition, video-understanding
Actionvlad
ActionVLAD for video action classification (CVPR 2017)
Stars: ✭ 217 (+734.62%)
Mutual labels:  action-recognition, video-understanding
MTL-AQA
What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment [CVPR 2019]
Stars: ✭ 38 (+46.15%)
Mutual labels:  action-recognition, video-understanding
Spectral-Designed-Graph-Convolutions
Codes for "Bridging the Gap Between Spectral and Spatial Domains in Graph Neural Networks" paper
Stars: ✭ 39 (+50%)
Mutual labels:  graph-neural-networks
vlog action recognition
Identifying Visible Actions in Lifestyle Vlogs
Stars: ✭ 13 (-50%)
Mutual labels:  action-recognition
gemnet pytorch
GemNet model in PyTorch, as proposed in "GemNet: Universal Directional Graph Neural Networks for Molecules" (NeurIPS 2021)
Stars: ✭ 80 (+207.69%)
Mutual labels:  graph-neural-networks
SelfTask-GNN
Implementation of paper "Self-supervised Learning on Graphs:Deep Insights and New Directions"
Stars: ✭ 78 (+200%)
Mutual labels:  graph-neural-networks

Spatio-Temporal Dynamic Inference Network for Group Activity Recognition

The source code for the ICCV 2021 paper: Spatio-Temporal Dynamic Inference Network for Group Activity Recognition.
[paper] [supplemental material] [arXiv]

If you find our work or the codebase inspiring and useful for your research, please cite:

@inproceedings{yuan2021DIN,
  title={Spatio-Temporal Dynamic Inference Network for Group Activity Recognition},
  author={Yuan, Hangjie and Ni, Dong and Wang, Mang},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={7476--7485},
  year={2021}
}

Dependencies

  • Software Environment: Linux (CentOS 7)
  • Hardware Environment: NVIDIA TITAN RTX
  • Python 3.6
  • PyTorch 1.2.0, Torchvision 0.4.0
  • RoIAlign for PyTorch
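
To confirm that the installed versions match the list above, here is a quick check (a convenience sketch, not part of the repo):

    # Print the versions of the dependencies listed above.
    import sys

    import torch
    import torchvision

    print(sys.version.split()[0])      # expect 3.6.x
    print(torch.__version__)           # expect 1.2.0
    print(torchvision.__version__)     # expect 0.4.0
    print(torch.cuda.is_available())   # True if the GPU is visible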

Prepare Datasets

  1. Download the publicly available datasets from the following links: the Volleyball dataset and the Collective Activity dataset.
  2. Unzip the dataset files into data/volleyball or data/collective.
  3. Download the file tracks_normalized.pkl from cvlab-epfl/social-scene-understanding and put it into data/volleyball/videos.
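
Optionally, verify the resulting layout with a short Python check (a convenience sketch, not part of the repo; the paths are taken from the steps above):

    # Check that the dataset files landed where the training scripts expect them.
    from pathlib import Path

    for p in [
        Path("data/volleyball/videos"),
        Path("data/volleyball/videos/tracks_normalized.pkl"),
        Path("data/collective"),
    ]:
        print(f"{p}: {'found' if p.exists() else 'MISSING'}")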

Using Docker

  1. Check out the repository and cd PROJECT_PATH

  2. Build the Docker container

    docker build -t din_gar https://github.com/JacobYuan7/DIN_GAR.git#main

  3. Run the Docker container

    docker run --shm-size=2G -v data/volleyball:/opt/DIN_GAR/data/volleyball -v result:/opt/DIN_GAR/result --rm -it din_gar
  • --shm-size=2G: extends the container's shared memory size to prevent ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm). Alternatively, use --ipc=host.
  • -v data/volleyball:/opt/DIN_GAR/data/volleyball: Makes the host's folder data/volleyball available inside the container at /opt/DIN_GAR/data/volleyball
  • -v result:/opt/DIN_GAR/result: Makes the host's folder result available inside the container at /opt/DIN_GAR/result
  • -it & --rm: Starts the container with an interactive session (PROJECT_PATH is /opt/DIN_GAR) and removes the container after closing the session.
  • din_gar: the name/tag of the image.
  • Optional: --gpus='"device=7"' restricts the GPU devices the container can access.

Get Started

  1. Train the Base Model: Fine-tune the base model on the dataset.

    # Volleyball dataset
    cd PROJECT_PATH 
    python scripts/train_volleyball_stage1.py
    
    # Collective Activity dataset
    cd PROJECT_PATH 
    python scripts/train_collective_stage1.py
  2. Train with the reasoning module: Append the reasoning modules onto the base model to get a reasoning model.

    1. Volleyball dataset

      • DIN

        python scripts/train_volleyball_stage2_dynamic.py
        
      • lite DIN
        We can run the lite version of DIN by setting cfg.lite_dim = 128 in scripts/train_volleyball_stage2_dynamic.py (see the config sketch at the end of this section).

        python scripts/train_volleyball_stage2_dynamic.py
        
      • ST-factorized DIN
        We can run ST-factorized DIN by setting cfg.ST_kernel_size = [(1,3),(3,1)] and cfg.hierarchical_inference = True.

        Note that if you set cfg.hierarchical_inference = False, cfg.ST_kernel_size = [(1,3),(3,1)] and cfg.num_DIN = 2, then multiple interaction fields run in parallel.

        python scripts/train_volleyball_stage2_dynamic.py
        

      Other models re-implemented by us according to their papers or publicly available code:

      • AT
        python scripts/train_volleyball_stage2_at.py
        
      • PCTDM
        python scripts/train_volleyball_stage2_pctdm.py
        
      • SACRF
        python scripts/train_volleyball_stage2_sacrf_biute.py
        
      • ARG
        python scripts/train_volleyball_stage2_arg.py
        
      • HiGCIN
        python scripts/train_volleyball_stage2_higcin.py
        
    2. Collective Activity dataset

      • DIN
        python scripts/train_collective_stage2_dynamic.py
        
      • lite DIN
        We can run the lite version of DIN by setting cfg.lite_dim = 128 in scripts/train_collective_stage2_dynamic.py.
        python scripts/train_collective_stage2_dynamic.py
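
For reference, the variant switches above reduce to a few config assignments. Here is a minimal sketch using a stand-in for the repo's cfg object (the attribute names lite_dim, ST_kernel_size, hierarchical_inference, and num_DIN come from the steps above; in the repo they are set inside the stage-2 training scripts):

    # Stand-in config for illustration only; in the repo these fields live on
    # the cfg object in scripts/train_volleyball_stage2_dynamic.py or
    # scripts/train_collective_stage2_dynamic.py.
    from types import SimpleNamespace

    cfg = SimpleNamespace()

    # lite DIN: reduce the embedding dimension.
    cfg.lite_dim = 128

    # ST-factorized DIN: factorized (1,3)/(3,1) kernels with hierarchical inference.
    cfg.ST_kernel_size = [(1, 3), (3, 1)]
    cfg.hierarchical_inference = True

    # Multiple interaction fields running in parallel instead:
    # cfg.hierarchical_inference = False
    # cfg.num_DIN = 2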
        

Another work of ours, which solves GAR from the perspective of incorporating visual context, is also available; please consider citing:

@inproceedings{yuan2021visualcontext,
  title={Learning Visual Context for Group Activity Recognition},
  author={Yuan, Hangjie and Ni, Dong},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={35},
  number={4},
  pages={3261--3269},
  year={2021}
}