cmhungsteve / SSTDA

License: MIT
[CVPR 2020] Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation (PyTorch)

Programming Languages

  • python
  • shell

Projects that are alternatives of or similar to SSTDA

Transferlearning
Transfer learning / domain adaptation / domain generalization / multi-task learning, etc. Papers, code, datasets, applications, tutorials. (迁移学习: "transfer learning")
Stars: ✭ 8,481 (+5554%)
Mutual labels:  domain-adaptation, self-supervised-learning
SHOT-plus
code for our TPAMI 2021 paper "Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer"
Stars: ✭ 46 (-69.33%)
Mutual labels:  domain-adaptation, self-supervised-learning
Temporal Shift Module
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
Stars: ✭ 1,282 (+754.67%)
Mutual labels:  video-understanding
TA3N
[ICCV 2019 Oral] TA3N: https://github.com/cmhungsteve/TA3N (Most updated repo)
Stars: ✭ 45 (-70%)
Mutual labels:  domain-adaptation
Object level visual reasoning
Pytorch Implementation of "Object level Visual Reasoning in Videos", F. Baradel, N. Neverova, C. Wolf, J. Mille, G. Mori , ECCV 2018
Stars: ✭ 163 (+8.67%)
Mutual labels:  video-understanding
Movienet Tools
Tools for movie and video research
Stars: ✭ 113 (-24.67%)
Mutual labels:  video-understanding
Step
STEP: Spatio-Temporal Progressive Learning for Video Action Detection. CVPR'19 (Oral)
Stars: ✭ 196 (+30.67%)
Mutual labels:  video-understanding
Tdn
[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition
Stars: ✭ 72 (-52%)
Mutual labels:  video-understanding
form2fit
[ICRA 2020] Train generalizable policies for kit assembly with self-supervised dense correspondence learning.
Stars: ✭ 78 (-48%)
Mutual labels:  self-supervised-learning
Awesome Activity Prediction
Paper list of activity prediction and related area
Stars: ✭ 147 (-2%)
Mutual labels:  video-understanding
Awesome Grounding
awesome grounding: A curated list of research papers in visual grounding
Stars: ✭ 247 (+64.67%)
Mutual labels:  video-understanding
Video2tfrecord
Easily convert RGB video data (e.g. .avi) to the TensorFlow tfrecords file format for training e.g. a NN in TensorFlow. This implementation allows to limit the number of frames per video to be stored in the tfrecords.
Stars: ✭ 137 (-8.67%)
Mutual labels:  video-understanding
I3d finetune
TensorFlow code for finetuning I3D model on UCF101.
Stars: ✭ 128 (-14.67%)
Mutual labels:  video-understanding
Actionvlad
ActionVLAD for video action classification (CVPR 2017)
Stars: ✭ 217 (+44.67%)
Mutual labels:  video-understanding
Temporal Segment Networks
Code & Models for Temporal Segment Networks (TSN) in ECCV 2016
Stars: ✭ 1,287 (+758%)
Mutual labels:  video-understanding
MGAN
Exploiting Coarse-to-Fine Task Transfer for Aspect-level Sentiment Classification (AAAI'19)
Stars: ✭ 44 (-70.67%)
Mutual labels:  domain-adaptation
Temporally Language Grounding
A PyTorch implementation of some state-of-the-art models for "Temporally Language Grounding in Untrimmed Videos"
Stars: ✭ 73 (-51.33%)
Mutual labels:  video-understanding
Multiverse
Dataset, code and model for the CVPR'20 paper "The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction". And for the ECCV'20 SimAug paper.
Stars: ✭ 131 (-12.67%)
Mutual labels:  video-understanding
Youtube 8m
The 2nd place Solution to the Youtube-8M Video Understanding Challenge by Team Monkeytyping (based on tensorflow)
Stars: ✭ 171 (+14%)
Mutual labels:  video-understanding
SimMIM
This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".
Stars: ✭ 717 (+378%)
Mutual labels:  self-supervised-learning

Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation



This is the official PyTorch implementation of our paper:

Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation
Min-Hung Chen, Baopu Li, Yingze Bao, Ghassan AlRegib (Advisor), and Zsolt Kira
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020
[arXiv][1-min Video][5-min Video][Poster][Slides][Open Access]
[Project][Overview Talk][GT@CVPR'20]

Despite the recent progress of fully-supervised action segmentation techniques, performance is still not fully satisfactory. One main challenge is the problem of spatio-temporal variations (e.g., different people may perform the same activity in various ways). Therefore, we exploit unlabeled videos to address this problem by reformulating the action segmentation task as a cross-domain problem with domain discrepancy caused by spatio-temporal variations. To reduce the discrepancy, we propose Self-Supervised Temporal Domain Adaptation (SSTDA), which contains two self-supervised auxiliary tasks (binary and sequential domain prediction) to jointly align cross-domain feature spaces embedded with local and global temporal dynamics, achieving better performance than other Domain Adaptation (DA) approaches. On three challenging benchmark datasets (GTEA, 50Salads, and Breakfast), SSTDA outperforms the current state-of-the-art method by large margins (e.g., for the F1@25 score, from 59.6% to 69.1% on Breakfast, from 73.4% to 81.5% on 50Salads, and from 83.6% to 89.1% on GTEA), and requires only 65% of the labeled training data for comparable performance, demonstrating the usefulness of adapting to unlabeled target videos across variations.
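
The binary domain prediction task follows the standard adversarial domain adaptation recipe: a small domain classifier is trained behind a gradient reversal layer (GRL). Below is a minimal PyTorch sketch of that mechanism; it is an illustration of the general technique, not the repository's actual modules, and all class names, dimensions, and the beta scaling are assumptions.

import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; reverses (and scales) gradients."""
    @staticmethod
    def forward(ctx, x, beta):
        ctx.beta = beta
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed gradients push the backbone toward domain-invariant
        # features while the classifier learns to tell source from target.
        return grad_output.neg() * ctx.beta, None

class BinaryDomainClassifier(nn.Module):  # hypothetical name
    def __init__(self, dim_feat=64, beta=1.0):  # dims are illustrative
        super().__init__()
        self.beta = beta
        self.classifier = nn.Sequential(
            nn.Linear(dim_feat, dim_feat),
            nn.ReLU(),
            nn.Linear(dim_feat, 2),  # two domains: source / target
        )

    def forward(self, frame_feat):
        # frame_feat: (num_frames, dim_feat) frame-level features
        return self.classifier(GradReverse.apply(frame_feat, self.beta))

The sequential domain prediction task extends the same idea from individual frames to sequences of shuffled cross-domain segments, so the alignment also covers global temporal dynamics, as described in the abstract above.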


Requirements

Tested with:

  • Ubuntu 18.04.2 LTS
  • PyTorch 1.1.0
  • Torchvision 0.3.0
  • Python 3.7.3
  • GeForce GTX 1080Ti
  • CUDA 9.2.88
  • CuDNN 7.1.4

Or you can directly use our environment file:

conda env create -f environment.yml
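
After the environment is created, activate it before running any script; the name sstda below is a placeholder, so use the name: field from environment.yml:

conda activate sstda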

Data Preparation

  • Clone this repository:
git clone https://github.com/cmhungsteve/SSTDA.git
  • Download the Datasets folder, which contains the features and the ground-truth labels (~30 GB).
  • To avoid downloading one very large file, we also split it into multiple smaller files.
  • Extract it so that you have the Datasets folder.
  • The default dataset path is ../../Datasets/action-segmentation/ when the current location is ./action-segmentation-DA/ (see the layout sketch below). If you change the dataset path, you need to edit the scripts as well.
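
For reference, one directory layout consistent with the default paths; the dataset subfolder names are assumptions based on the three benchmark datasets and are not verified against the download:

<workspace>/
├── Datasets/
│   └── action-segmentation/      # features + ground-truth labels (~30 GB)
│       ├── gtea/
│       ├── 50salads/
│       └── breakfast/
└── SSTDA/                        # this repository
    └── action-segmentation-DA/   # run everything from here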

Usage

Quick Run

  • Since there are many arguments, we recommend running the provided scripts directly.
  • All the scripts are in the folder scripts/, named run_<dataset>_<method>.sh.
  • Simply copy any script to the main folder (the same location as all the .py files) and run it as below:
# one example
bash run_gtea_SSTDA.sh

The script runs training, prediction, and evaluation for all the splits of the dataset (<dataset>) using the method (<method>).
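
Putting the steps together, a full quick run for GTEA with SSTDA might look like this; the cp step mirrors the copy-to-main-folder instruction above:

cd SSTDA/action-segmentation-DA/
cp scripts/run_gtea_SSTDA.sh .
bash run_gtea_SSTDA.sh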

More Details

  • In each script, you may want to modify the following sections (a structural sketch follows this list):
    • # === Mode Switch On/Off === #
      • training, predict, and eval are the modes that can be switched on or off by setting them to true or false.
    • # === Paths === #
      • path_data must match the location of the input data.
      • path_model and path_result are the output paths for models and predictions. These folders will be created if they do not exist.
    • # === Main Program === #
      • You can run only specific splits by editing for split in 1 2 3 ... (line 53).
  • We DO NOT recommend editing other parts (e.g. # === Config & Setting === #); otherwise the implementation may differ from ours.
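
For orientation, the editable sections of a script are structured roughly as below. The section and variable names follow the descriptions above; the concrete values and the loop body are illustrative placeholders, not copied from the repository's scripts:

# === Mode Switch On/Off === #
training=true
predict=true
eval=true

# === Paths === #
path_data=../../Datasets/action-segmentation/   # must match the data location
path_model=./models/     # created if it does not exist
path_result=./results/   # created if it does not exist

# === Main Program === #
for split in 1 2 3       # edit this list to run specific splits only
do
    :  # training / prediction / evaluation commands for each split
done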

Citation

If you find this repository useful, please cite our papers:

@inproceedings{chen2020action,
  title={Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation},
  author={Chen, Min-Hung and Li, Baopu and Bao, Yingze and AlRegib, Ghassan and Kira, Zsolt},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020},
  url={https://arxiv.org/abs/2003.02824}
}

@inproceedings{chen2020mixed,
  title={Action Segmentation with Mixed Temporal Domain Adaptation},
  author={Chen, Min-Hung and Li, Baopu and Bao, Yingze and AlRegib, Ghassan},
  booktitle={IEEE Winter Conference on Applications of Computer Vision (WACV)},
  year={2020}
}

Acknowledgments

This work was done with support from OLIVES@GT.
Feel free to check our lab's Website and GitHub for other interesting work!

Some code is borrowed from ms-tcn, TA3N, swd_pytorch, and VCOP.


Contact

Min-Hung Chen
cmhungsteve AT gatech DOT edu
