
moabitcoin / Ig65m Pytorch

Licence: mit
PyTorch 3D video classification models pre-trained on 65 million Instagram videos


Projects that are alternatives of or similar to Ig65m Pytorch

Mmaction
An open-source toolbox for action understanding based on PyTorch
Stars: ✭ 1,711 (+688.48%)
Mutual labels:  action-recognition
Dd Net
A lightweight network for body/hand action recognition
Stars: ✭ 161 (-25.81%)
Mutual labels:  action-recognition
Hidden Two Stream
Caffe implementation for "Hidden Two-Stream Convolutional Networks for Action Recognition"
Stars: ✭ 179 (-17.51%)
Mutual labels:  action-recognition
Hake
HAKE: Human Activity Knowledge Engine (CVPR'18/19/20, NeurIPS'20)
Stars: ✭ 132 (-39.17%)
Mutual labels:  action-recognition
Untrimmednet
Weakly Supervised Action Recognition and Detection
Stars: ✭ 152 (-29.95%)
Mutual labels:  action-recognition
C3d Keras
C3D for Keras + TensorFlow
Stars: ✭ 171 (-21.2%)
Mutual labels:  action-recognition
Skeleton Based Action Recognition Papers And Notes
Skeleton-based Action Recognition Papers and Small Notes and Top 2 Leaderboard for NTU-RGBD
Stars: ✭ 126 (-41.94%)
Mutual labels:  action-recognition
Mmskeleton
An OpenMMLAB toolbox for human pose estimation, skeleton-based action recognition, and action synthesis.
Stars: ✭ 2,378 (+995.85%)
Mutual labels:  action-recognition
Timeception
Timeception for Complex Action Recognition, CVPR 2019 (Oral Presentation)
Stars: ✭ 153 (-29.49%)
Mutual labels:  action-recognition
Vip
Video Platform for Action Recognition and Object Detection in Pytorch
Stars: ✭ 175 (-19.35%)
Mutual labels:  action-recognition
Actionrecognition
Explore Action Recognition
Stars: ✭ 139 (-35.94%)
Mutual labels:  action-recognition
Awesome Activity Prediction
Paper list of activity prediction and related area
Stars: ✭ 147 (-32.26%)
Mutual labels:  action-recognition
Video Caffe
Video-friendly caffe -- comes with the most recent version of Caffe (as of Jan 2019), a video reader, 3D(ND) pooling layer, and an example training script for C3D network and UCF-101 data
Stars: ✭ 172 (-20.74%)
Mutual labels:  action-recognition
Action Recognition
Exploration of different solutions to action recognition in video, using neural networks implemented in PyTorch.
Stars: ✭ 129 (-40.55%)
Mutual labels:  action-recognition
Amass
Data preparation and loader for AMASS
Stars: ✭ 180 (-17.05%)
Mutual labels:  action-recognition
I3d finetune
TensorFlow code for finetuning I3D model on UCF101.
Stars: ✭ 128 (-41.01%)
Mutual labels:  action-recognition
Dynamic Image Nets
Dynamic Image Networks for Action Recognition
Stars: ✭ 163 (-24.88%)
Mutual labels:  action-recognition
Step
STEP: Spatio-Temporal Progressive Learning for Video Action Detection. CVPR'19 (Oral)
Stars: ✭ 196 (-9.68%)
Mutual labels:  action-recognition
Optical Flow Guided Feature
Implementation Code of the paper Optical Flow Guided Feature, CVPR 2018
Stars: ✭ 186 (-14.29%)
Mutual labels:  action-recognition
Hand pose action
Dataset and code for the paper "First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations", CVPR 2018.
Stars: ✭ 173 (-20.28%)
Mutual labels:  action-recognition

IG-65M PyTorch

Unofficial PyTorch (and ONNX) 3D video classification models and weights pre-trained on IG-65M (65 million Instagram videos).

IG-65M activations for the Primer movie trailer video; time goes top to bottom

IG-65M video deep dream: maximizing activations; for more see this pull request

Usage 💻

The following describes how to use the model in your own project and how to use our conversion and extraction tools.

PyTorch Models

We provide convenient PyTorch Hub integration:

>>> import torch
>>>
>>> torch.hub.list("moabitcoin/ig65m-pytorch")
['r2plus1d_34_32_ig65m', 'r2plus1d_34_32_kinetics', 'r2plus1d_34_8_ig65m', 'r2plus1d_34_8_kinetics']
>>>
>>> model = torch.hub.load("moabitcoin/ig65m-pytorch", "r2plus1d_34_32_ig65m", num_classes=359, pretrained=True)

Tools

We build and publish Docker images (see all tags) via Travis CI/CD for master and for all releases.

In these images we provide the following tools:

  • convert - to convert Caffe2 blobs to PyTorch model and weights
  • extract - to compute clip features for a video with a pre-trained model
  • semcode - to visualize clip features for a video over time
  • index-build - to build an approximate nearest neighbor index from clip features
  • index-serve - to load an approximate nearest neighbor index and serve queries
  • index-query - to make approximate nearest neighbor queries against an index server
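
To make the idea behind the index tools concrete, here is a minimal sketch of the exact search that an approximate nearest neighbor index stands in for. It is not the project's implementation: the cosine metric, the exhaustive scan, the toy 3-dimensional vectors, and the names `cosine_similarity` and `nearest_clips` are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def nearest_clips(
    query: Sequence[float],
    features: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return the k (clip index, similarity) pairs most similar to the query.

    This scans every clip (exact search). An approximate index trades a
    little accuracy for not having to do this full scan.
    """
    scored = [(i, cosine_similarity(query, f)) for i, f in enumerate(features)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "clip features" (hypothetical); real features are much larger.
features = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
top = nearest_clips([1.0, 0.0, 0.0], features, k=2)
print(top)
```

An exhaustive scan like this grows linearly with the number of clips, which is the cost an approximate index is meant to avoid on large collections.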

Run these pre-built images via

docker run moabitcoin/ig65m-pytorch:latest-cpu --help

Example for running on CPUs:

docker run --ipc=host -v $PWD:/data moabitcoin/ig65m-pytorch:latest-cpu \
    extract /data/myvideo.mp4 /data/myfeatures.npy

Example for running on GPUs via nvidia-docker:

docker run --runtime=nvidia --ipc=host -v $PWD:/data moabitcoin/ig65m-pytorch:latest-gpu \
    extract /data/myvideo.mp4 /data/myfeatures.npy
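
If you want to drive the two commands above from a script, the following sketch assembles the same argument list for the CPU or GPU image. The helper name `build_extract_command` and the host directory `/home/me/videos` are hypothetical; the flags, image tags, and the `extract` arguments are copied from the examples above.

```python
import shlex
from typing import List


def build_extract_command(
    host_dir: str, video: str, features: str, gpu: bool = False
) -> List[str]:
    """Build the `docker run ... extract` argument list shown above.

    host_dir is mounted at /data inside the container; video and features
    are file names relative to that directory.
    """
    image_tag = "latest-gpu" if gpu else "latest-cpu"
    command = ["docker", "run"]
    if gpu:
        # The GPU example above runs through nvidia-docker.
        command.append("--runtime=nvidia")
    command += [
        "--ipc=host",
        "-v", f"{host_dir}:/data",
        f"moabitcoin/ig65m-pytorch:{image_tag}",
        "extract", f"/data/{video}", f"/data/{features}",
    ]
    return command


cpu_command = build_extract_command("/home/me/videos", "myvideo.mp4", "myfeatures.npy")
gpu_command = build_extract_command(
    "/home/me/videos", "myvideo.mp4", "myfeatures.npy", gpu=True
)
print(shlex.join(cpu_command))
print(shlex.join(gpu_command))
```

Passing the list to `subprocess.run(cpu_command, check=True)` would execute it; building a list instead of a shell string avoids quoting problems with paths that contain spaces.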

Development

We provide CPU and nvidia-docker based GPU Dockerfiles for self-contained and reproducible environments. Use the convenience Makefile to build the Docker image, then start a container with a host directory mounted at /data:

make
make run datadir=/Path/To/My/Videos

By default we build and run the CPU Docker images; for GPUs run:

make dockerfile=Dockerfile.gpu
make gpu

The WebcamDataset requires exposing /dev/video0 to the container, which only works on Linux:

make
make webcam

PyTorch and ONNX Models 🏆

We provide converted .pth and .pb PyTorch and ONNX weights, respectively.

| Model | Pretrain+Finetune | Input Size | pth | onnx | caffe2 |
|---|---|---|---|---|---|
| R(2+1)D_34 | IG-65M + None | 8x112x112 | r2plus1d_34_clip8_ig65m_from_scratch-9bae36ae.pth | r2plus1d_34_clip8_ig65m_from_scratch-748ab053.pb | r2plus1d_34_clip8_ig65m_from_scratch.pkl |
| R(2+1)D_34 | IG-65M + Kinetics | 8x112x112 | r2plus1d_34_clip8_ft_kinetics_from_ig65m-0aa0550b.pth | r2plus1d_34_clip8_ft_kinetics_from_ig65m-625d61b3.pb | r2plus1d_34_clip8_ft_kinetics_from_ig65m.pkl |
| R(2+1)D_34 | IG-65M + None | 32x112x112 | r2plus1d_34_clip32_ig65m_from_scratch-449a7af9.pth | r2plus1d_34_clip32_ig65m_from_scratch-e304d648.pb | r2plus1d_34_clip32_ig65m_from_scratch.pkl |
| R(2+1)D_34 | IG-65M + Kinetics | 32x112x112 | r2plus1d_34_clip32_ft_kinetics_from_ig65m-ade133f1.pth | r2plus1d_34_clip32_ft_kinetics_from_ig65m-10f4c3bf.pb | r2plus1d_34_clip32_ft_kinetics_from_ig65m.pkl |

Notes

  • ONNX models provided here have not been optimized for inference.
  • Models fine-tuned on Kinetics have 400 classes; the plain IG-65M models have 359 classes (32-frame clips) and 487 classes (8-frame clips).
  • For models fine-tuned on Kinetics you can use the labels from here.
  • For plain IG65 models there is no label map available.
  • Official Facebook Research Caffe2 models are here.
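
The class counts and input sizes above can be collected in a small lookup, sketched here so that `num_classes` and the clip length are not retyped by hand. The names `ModelSpec`, `MODEL_SPECS`, and `input_shape` are hypothetical; the pairing of each PyTorch Hub entry name with a clip length is inferred from the `_8_` / `_32_` part of the name, and the 3-channel (batch, channels, frames, height, width) layout is an assumption rather than something stated in this README.

```python
from typing import Dict, NamedTuple, Tuple


class ModelSpec(NamedTuple):
    clip_length: int  # frames per clip (first number of the input size)
    num_classes: int  # output classes, per the notes above


# Keys are the PyTorch Hub entry names listed earlier in this README.
MODEL_SPECS: Dict[str, ModelSpec] = {
    "r2plus1d_34_8_ig65m": ModelSpec(clip_length=8, num_classes=487),
    "r2plus1d_34_8_kinetics": ModelSpec(clip_length=8, num_classes=400),
    "r2plus1d_34_32_ig65m": ModelSpec(clip_length=32, num_classes=359),
    "r2plus1d_34_32_kinetics": ModelSpec(clip_length=32, num_classes=400),
}


def input_shape(name: str, batch_size: int = 1) -> Tuple[int, int, int, int, int]:
    """Expected input shape for a model, under the assumptions stated above.

    Assumed layout: (batch, channels, frames, height, width) with RGB frames
    of 112x112 pixels, matching the "Input Size" column of the table.
    """
    spec = MODEL_SPECS[name]
    return (batch_size, 3, spec.clip_length, 112, 112)


spec = MODEL_SPECS["r2plus1d_34_32_ig65m"]
print(spec.num_classes, input_shape("r2plus1d_34_32_ig65m"))
```

With such a table, the `num_classes` argument for `torch.hub.load` can be read from `MODEL_SPECS[name].num_classes` rather than hard-coded.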

References 📖

  1. D. Tran, H. Wang, L. Torresani, J. Ray, Y. LeCun and M. Paluri. A Closer Look at Spatiotemporal Convolutions for Action Recognition. CVPR 2018.
  2. D. Tran, H. Wang, L. Torresani and M. Feiszli. Video Classification with Channel-Separated Convolutional Networks. ICCV 2019.
  3. D. Ghadiyaram, M. Feiszli, D. Tran, X. Yan, H. Wang and D. Mahajan. Large-scale Weakly-supervised Pre-training for Video Action Recognition. CVPR 2019.
  4. VMZ: Model Zoo for Video Modeling
  5. Kinetics & IG-65M

License

Copyright © 2019 MoabitCoin

Distributed under the MIT License (MIT).
