Extract video features from raw videos using multiple GPUs. We support RAFT and PWC optical flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and ResNet features.
A PyTorch implementation of R2Plus1D and C3D, based on the CVPR 2018 paper "A Closer Look at Spatiotemporal Convolutions for Action Recognition" and the ICCV 2015 paper "Learning Spatiotemporal Features with 3D Convolutional Networks".
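
For readers new to the two architectures, the difference is that C3D applies full t x d x d 3D convolutions, while R(2+1)D factorizes each one into a 1 x d x d spatial convolution followed by a t x 1 x 1 temporal convolution. The R(2+1)D paper picks the number of intermediate channels M so that the factorized block has roughly the same number of parameters as the 3D convolution it replaces. The following is a minimal sketch of that arithmetic in plain Python. The function names are hypothetical and are not taken from this repository, and bias terms are ignored.

```python
def mid_channels(n_in, n_out, t=3, d=3):
    """Intermediate channel count M for a (2+1)D block.

    Follows the rule from the R(2+1)D paper:
    M = floor(t * d^2 * n_in * n_out / (d^2 * n_in + t * n_out)).
    Hypothetical helper name, used here for illustration only.
    """
    return (t * d * d * n_in * n_out) // (d * d * n_in + t * n_out)


def conv3d_params(n_in, n_out, t=3, d=3):
    """Weight count of a full t x d x d 3D convolution (no bias)."""
    return t * d * d * n_in * n_out


def conv2plus1d_params(n_in, n_out, t=3, d=3):
    """Weight count of the factorized spatial + temporal pair (no bias)."""
    m = mid_channels(n_in, n_out, t, d)
    spatial = d * d * n_in * m   # 1 x d x d convolution: n_in -> m channels
    temporal = t * m * n_out     # t x 1 x 1 convolution: m -> n_out channels
    return spatial + temporal


if __name__ == "__main__":
    for n_in, n_out in [(64, 64), (64, 128)]:
        print(
            n_in,
            n_out,
            mid_channels(n_in, n_out),
            conv3d_params(n_in, n_out),
            conv2plus1d_params(n_in, n_out),
        )
```

For a 64 to 64 channel layer with 3 x 3 x 3 kernels this gives M = 144, and both variants have 110592 weights. When the division is not exact, the floor makes the factorized block slightly smaller than the 3D one.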