csiro-robotics / TCE

Licence: other
This repository contains the code implementation used in the paper Temporally Coherent Embeddings for Self-Supervised Video Representation Learning (TCE).

Programming Languages

python

Projects that are alternatives of or similar to TCE

temporal-ssl
Video Representation Learning by Recognizing Temporal Transformations. In ECCV, 2020.
Stars: ✭ 46 (-9.8%)
Mutual labels:  action-recognition, hmdb51, self-supervised-learning
Revisiting-Contrastive-SSL
Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations. [NeurIPS 2021]
Stars: ✭ 81 (+58.82%)
Mutual labels:  representation-learning, self-supervised-learning, contrastive-learning
object-aware-contrastive
Object-aware Contrastive Learning for Debiased Scene Representation (NeurIPS 2021)
Stars: ✭ 44 (-13.73%)
Mutual labels:  representation-learning, self-supervised-learning, contrastive-learning
info-nce-pytorch
PyTorch implementation of the InfoNCE loss for self-supervised learning.
Stars: ✭ 160 (+213.73%)
Mutual labels:  contrastive-loss, self-supervised-learning, contrastive-learning
Simclr
SimCLRv2 - Big Self-Supervised Models are Strong Semi-Supervised Learners
Stars: ✭ 2,720 (+5233.33%)
Mutual labels:  representation-learning, self-supervised-learning, contrastive-learning
Pytorch Metric Learning
The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.
Stars: ✭ 3,936 (+7617.65%)
Mutual labels:  metric-learning, self-supervised-learning, contrastive-learning
S2-BNN
S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration (CVPR 2021)
Stars: ✭ 53 (+3.92%)
Mutual labels:  contrastive-loss, self-supervised-learning, contrastive-learning
simclr-pytorch
PyTorch implementation of SimCLR: supports multi-GPU training and closely reproduces results
Stars: ✭ 89 (+74.51%)
Mutual labels:  representation-learning, self-supervised-learning, contrastive-learning
ViCC
[WACV'22] Code repository for the paper "Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting", https://arxiv.org/abs/2106.10137.
Stars: ✭ 33 (-35.29%)
Mutual labels:  action-recognition, self-supervised-learning, contrastive-learning
GeDML
Generalized Deep Metric Learning.
Stars: ✭ 30 (-41.18%)
Mutual labels:  metric-learning, self-supervised-learning, contrastive-learning
DisCont
Code for the paper "DisCont: Self-Supervised Visual Attribute Disentanglement using Context Vectors".
Stars: ✭ 13 (-74.51%)
Mutual labels:  contrastive-loss, self-supervised-learning, contrastive-learning
COCO-LM
[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
Stars: ✭ 109 (+113.73%)
Mutual labels:  representation-learning, contrastive-learning
Supervised-Contrastive-Learning-in-TensorFlow-2
Implements the ideas presented in https://arxiv.org/pdf/2004.11362v1.pdf by Khosla et al.
Stars: ✭ 117 (+129.41%)
Mutual labels:  representation-learning, contrastive-learning
CLSA
Official implementation of "Contrastive Learning with Stronger Augmentations"
Stars: ✭ 48 (-5.88%)
Mutual labels:  self-supervised-learning, contrastive-learning
two-stream-action-recognition-keras
Two-stream CNNs for video action recognition implemented in Keras
Stars: ✭ 116 (+127.45%)
Mutual labels:  action-recognition, ucf-101
SCL
📄 Spatial Contrastive Learning for Few-Shot Classification (ECML/PKDD 2021).
Stars: ✭ 42 (-17.65%)
Mutual labels:  self-supervised-learning, contrastive-learning
Magnetloss Pytorch
PyTorch implementation of a deep metric learning technique called "Magnet Loss" from Facebook AI Research (FAIR) in ICLR 2016.
Stars: ✭ 217 (+325.49%)
Mutual labels:  embeddings, metric-learning
VQ-APC
Vector Quantized Autoregressive Predictive Coding (VQ-APC)
Stars: ✭ 34 (-33.33%)
Mutual labels:  representation-learning, self-supervised-learning
Squeeze-and-Recursion-Temporal-Gates
Code for : [Pattern Recognit. Lett. 2021] "Learn to cycle: Time-consistent feature discovery for action recognition" and [IJCNN 2021] "Multi-Temporal Convolutions for Human Action Recognition in Videos".
Stars: ✭ 62 (+21.57%)
Mutual labels:  action-recognition, kinetics-datasets
image embeddings
Using efficientnet to provide embeddings for retrieval
Stars: ✭ 107 (+109.8%)
Mutual labels:  embeddings, representation-learning

Temporally Coherent Embeddings for Self-Supervised Video Representation Learning

This repository contains the code implementation used in the ICPR 2020 paper Temporally Coherent Embeddings for Self-Supervised Video Representation Learning (TCE). [arXiv] [Website] Our contributions in this repository are:

  • A PyTorch implementation of the self-supervised training used in the TCE paper
  • A PyTorch implementation of action recognition fine-tuning
  • Pre-trained checkpoints for models trained using the TCE self-supervised training paradigm
  • A PyTorch implementation of t-SNE visualisations of the network output

Network Architecture

We benchmark our code on Split 1 of the UCF101 action recognition dataset, providing pre-trained models for our downstream and upstream training. See Models for our provided models and Getting Started for instructions on training and evaluation.

If you find this repo useful for your research, please consider citing the paper

@inproceedings{knights2020tce,
  title={Temporally Coherent Embeddings for Self-Supervised Video Representation Learning},
  author={Joshua Knights and Ben Harwood and Daniel Ward and Anthony Vanderkop and Olivia Mackenzie-Ross and Peyman Moghadam},
  booktitle={25th International Conference on Pattern Recognition (ICPR)},
  year={2020}
}

Updates

  • 23/04/2020 : Initial Commit
  • 30/11/2020 : ICPR Update

Data Preparation

Kinetics400

Kinetics400 videos can be downloaded and split into frames directly from Showmax/kinetics-downloader

The file directory should have the following layout:

├── kinetics400/train
    |
    ├── CLASS_001
    ├── CLASS_002
    .
    .
    .
    ├── CLASS_400
        | 
        ├── VID_001
        ├── VID_002
        .
        .
        .
        ├── VID_###
            | 
            ├── frame1.jpg
            ├── frame2.jpg
            .
            .
            .
            ├── frame###.jpg

Once the dataset is downloaded and split into frames, edit the following parameter in config/default.py to point towards the frames:

  • DATASET.KINETICS400.FRAMES_PATH = /path/to/kinetics400/train
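For illustration only, the corresponding entry in config/default.py might be declared as below. This is a sketch assuming a yacs-style CfgNode, with names taken from the bullet above; it is not a copy of the actual file:

import os
from yacs.config import CfgNode as CN

_C = CN()
_C.DATASET = CN()
_C.DATASET.KINETICS400 = CN()
# Point this at the directory of extracted Kinetics400 training frames
_C.DATASET.KINETICS400.FRAMES_PATH = "/path/to/kinetics400/train"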

UCF101

UCF101 frames and splits can be downloaded directly from feichtenhofer/twostreamfusion

wget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_jpegs_256.zip.001
wget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_jpegs_256.zip.002
wget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_jpegs_256.zip.003

cat ucf101_jpegs_256.zip* > ucf101_jpegs_256.zip
unzip ucf101_jpegs_256.zip

The file directory should have the following layout:

├── UCF101
    |
    ├── v_{CLASS_001}_g01_c01
    .   | 
    .   ├── frame000001.jpg
    .   ├── frame000002.jpg 
    .   .
    .   .
    .   ├── frame000###.jpg
    .
    ├── v_{CLASS_101}_g##_c##
        | 
        ├── frame000001.jpg
        ├── frame000002.jpg 
        .
        .
        ├── frame000###.jpg

Once the dataset is downloaded and decompressed, edit the following parameters in config/default.py to point towards the frames and splits:

  • DATASET.UCF101.FRAMES_PATH = /path/to/UCF101_frames
  • DATASET.UCF101.SPLITS_PATH = /path/to/UCF101_splits
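As a quick sanity check of the layout above, a small helper like the following can flag clip directories that contain no extracted frames. This script is hypothetical and not part of the repository:

import os
import sys

# Usage: python check_frames.py /path/to/UCF101_frames
frames_path = sys.argv[1]  # the directory you will set as DATASET.UCF101.FRAMES_PATH
clips = [d for d in os.listdir(frames_path)
         if os.path.isdir(os.path.join(frames_path, d))]
empty = [c for c in clips
         if not any(f.endswith(".jpg")
                    for f in os.listdir(os.path.join(frames_path, c)))]
print(f"{len(clips)} clip directories found, {len(empty)} contain no .jpg frames")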

Installation

TCE is built using Python == 3.7.1 and PyTorch == 1.7.0

We use Conda to set up the Python environment for this repository. To create the environment, run the following commands from the root directory:

conda env create -f TCE.yaml
conda activate TCE

Once this is done, also specify a path in config/default.py where assets (such as dataset pickles for faster setup) will be saved:

  • ASSETS_PATH = /path/to/assets/folder

Models

Architecture   Pre-Training Dataset   Link
ResNet-18      Kinetics400            Link
ResNet-50      Kinetics400            Link

Getting Started

Self-Supervised Training

We provide a script for pre-training on the Kinetics400 dataset using TCE, pretrain.py. To train, run the following command:

python pretrain.py \
    --cfg config/pretrain_kinetics400miningr_finetune_UCF101_resnet18.yaml \
    TRAIN.PRETRAINING.SAVEDIR /path/to/savedir

If resuming from a previous pre-training checkpoint, set TRAIN.PRETRAINING.CHECKPOINT to the path of the checkpoint to resume from, as sketched below.
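For example, a resumed run might look like the following; the checkpoint path is a placeholder, and the override follows the same KEY VALUE pattern as the command above:

python pretrain.py \
    --cfg config/pretrain_kinetics400miningr_finetune_UCF101_resnet18.yaml \
    TRAIN.PRETRAINING.SAVEDIR /path/to/savedir \
    TRAIN.PRETRAINING.CHECKPOINT /path/to/pretraining_checkpoint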

Fine-tuning for action recognition

We provide a fine-tuning script for action recognition on the UCF101 dataset, finetune.py. To train, run the following command:

python finetune.py \
    --cfg config/pretrain_kinetics400miningr_finetune_UCF101_resnet18.yaml \
    TRAIN.FINETUNING.CHECKPOINT "/path/to/pretrained_checkpoint" \
    TRAIN.FINETUNING.SAVEDIR "/path/to/savedir"

If resuming training from an earlier fine-tuning checkpoint, set TRAIN.FINETUNING.RESUME to True, as sketched below.
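For example, a resumed fine-tuning run might look like the following; this assumes TRAIN.FINETUNING.CHECKPOINT then points at the earlier fine-tuning checkpoint rather than a pre-trained one:

python finetune.py \
    --cfg config/pretrain_kinetics400miningr_finetune_UCF101_resnet18.yaml \
    TRAIN.FINETUNING.CHECKPOINT "/path/to/finetuning_checkpoint" \
    TRAIN.FINETUNING.SAVEDIR "/path/to/savedir" \
    TRAIN.FINETUNING.RESUME True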

Visualisation

To demonstrate the ability of our approach to create temporally coherent embeddings, we provide a package for creating t-SNE visualisations of our features, similar to those found in the paper. This package can also be applied to other approaches and network architectures.

The files in this repository used for generating t-SNE visualisations are:

  • visualise_tsne.py is a wrapper around t-SNE and our network architecture for end-to-end generation of the t-SNE visualisation
  • utils/tsne_utils.py contains the t-SNE functionality for reducing the dimensionality of an array of embedded features for plotting, as well as tools for creating an animated visualisation of the embedding's behaviour over time (a minimal sketch of the reduction step follows)
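For orientation, the core reduction step might look like the minimal sketch below, using scikit-learn's TSNE; the function name reduce_embeddings is hypothetical, and the repository's utils/tsne_utils.py may differ:

import numpy as np
from sklearn.manifold import TSNE

def reduce_embeddings(features: np.ndarray, perplexity: float = 30.0) -> np.ndarray:
    """Reduce a (num_frames, embed_dim) feature array to (num_frames, 2) for plotting."""
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca", random_state=0)
    return tsne.fit_transform(features)

# Example: 250 frames of 512-D embeddings (e.g. ResNet-18 output)
points = reduce_embeddings(np.random.randn(250, 512).astype(np.float32))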

The following flags can be used as inputs for visualise_tsne.py:

  • --cfg : Path to config file
  • --target : Path to video to visualise t-SNE for. This video can either be a video file (avi, mp4) or a directory of images representing frames
  • --ckpt : Path to the model checkpoint to visualise the embedding space for
  • --gif : Use to visualise the change in the embedding space over time alongside the input video as a gif file
  • --fps : Set the framerate of the gif
  • --save : Path to save the output t-SNE to

To visualise the embeddings from TCE, download our self-supervised model above and use the following command to visualise our embedding space as a gif:

python visualise_tsne.py \
    --cfg config/pretrain_kinetics400miningr_finetune_UCF101_resnet18.yaml \
    --target "/path/to/target/video" \
    --ckpt "/path/to/TCE_checkpoint" \
    --gif \
    --fps 25 \
    --save "/path/to/save/folder/t-SNE.gif"

Alternatively, to visualise the t-SNE as a PNG image use the following:

python visualise_tsne.py \
    --cfg config/pretrain_kinetics400miningr_finetune_UCF101_resnet18.yaml \
    --target "/path/to/target/video" \
    --ckpt "/path/to/TCE_checkpoint" \
    --save "/path/to/save/folder/t-SNE.png"

Acknowledgements

Parts of this code base are derived from Yonglong Tian's unsupervised learning algorithm Contrastive Multiview Coding and Jeffrey Huang's implementation of action recognition.
