gzsl-od: Out-of-Distribution Detection for Generalized Zero-Shot Action Recognition
Stars: ✭ 47 (+235.71%)
ntu-x: NTU-X, an extended version of the popular NTU dataset
Stars: ✭ 55 (+292.86%)
tfvaegan: [ECCV 2020] Official PyTorch implementation of "Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification". SOTA results for ZSL and GZSL
Stars: ✭ 107 (+664.29%)
Mmskeleton: An OpenMMLab toolbox for human pose estimation, skeleton-based action recognition, and action synthesis.
Stars: ✭ 2,378 (+16885.71%)
UAV-Human: [CVPR 2021] UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicles
Stars: ✭ 122 (+771.43%)
C3D-tensorflow: Action recognition with the C3D network, implemented in TensorFlow
Stars: ✭ 34 (+142.86%)
Two-Stream-CNN: Two-Stream CNN implemented in Keras for skeleton-based action recognition on the NTU RGB+D dataset
Stars: ✭ 75 (+435.71%)
dynamic-images-for-action-recognition: A public Python implementation for generating Dynamic Images, introduced in 'Dynamic Image Networks for Action Recognition' by Bilen et al.
Stars: ✭ 27 (+92.86%)
calvin: CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
Stars: ✭ 105 (+650%)
zero shot learning: A visual-semantic embedding model using word2vec and CNNs
Stars: ✭ 13 (-7.14%)
MSAF: Official implementation of the paper "MSAF: Multimodal Split Attention Fusion"
Stars: ✭ 47 (+235.71%)
VidSitu: [CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)
Stars: ✭ 41 (+192.86%)
CE-GZSL: Code for the CVPR 2021 paper "Contrastive Embedding for Generalized Zero-Shot Learning"
Stars: ✭ 73 (+421.43%)
TCE: This repository contains the code used in the paper "Temporally Coherent Embeddings for Self-Supervised Video Representation Learning" (TCE).
Stars: ✭ 51 (+264.29%)
BlazePoseBarracuda: BlazePoseBarracuda is a human 2D/3D pose estimation neural network that runs the Mediapipe Pose (BlazePose) pipeline on Unity Barracuda with GPU.
Stars: ✭ 131 (+835.71%)
clip playground: An ever-growing playground of notebooks showcasing CLIP's impressive zero-shot capabilities
Stars: ✭ 80 (+471.43%)
good robot: "Good Robot! Now Watch This!": Repurposing Reinforcement Learning for Task-to-Task Transfer; and “Good Robot!”: Efficient Reinforcement Learning for Multi-Step Visual Tasks with Sim to Real Transfer
Stars: ✭ 84 (+500%)
MTL-AQA: What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment [CVPR 2019]
Stars: ✭ 38 (+171.43%)
temporal-ssl: Video Representation Learning by Recognizing Temporal Transformations. In ECCV, 2020.
Stars: ✭ 46 (+228.57%)
class-norm: Class Normalization for Continual Zero-Shot Learning
Stars: ✭ 34 (+142.86%)
CBP: Official TensorFlow implementation of the AAAI-2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction"
Stars: ✭ 52 (+271.43%)
TRAR-VQA: [ICCV 2021] TRAR: Routing the Attention Spans in Transformers for Visual Question Answering -- Official Implementation
Stars: ✭ 49 (+250%)
Zero-Shot-Learning: A Python ZSL system that makes it easy to run zero-shot learning on new datasets by supplying features and attributes. Used for the paper "Zero-Shot Learning Based Approach For Medieval Word Recognition Using Deep-Learned Features", published at ICFHR 2018.
Stars: ✭ 21 (+50%)
pose2action: Experiments on classifying actions using poses
Stars: ✭ 24 (+71.43%)
X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)
Stars: ✭ 283 (+1921.43%)
iMIX: A framework for Multimodal Intelligence research from Inspur HSSLAB.
Stars: ✭ 21 (+50%)
conv3d-video-action-recognition: My experimentation around action recognition in videos. Contains a Keras implementation of the C3D network based on the original paper "Learning Spatiotemporal Features with 3D Convolutional Networks" by Tran et al., and includes video processing pipelines coded using the mPyPl package. The model is benchmarked on the popular UCF101 dataset and achieves result…
Stars: ✭ 50 (+257.14%)
wikiHow paper list: A paper list of research conducted based on wikiHow
Stars: ✭ 25 (+78.57%)
MUSES: [CVPR 2021] Multi-shot Temporal Event Localization: a Benchmark
Stars: ✭ 51 (+264.29%)
VideoTransformer-pytorch: PyTorch implementation of a collection of scalable Video Transformer benchmarks.
Stars: ✭ 159 (+1035.71%)
bLVNet-TAM: The official code for the NeurIPS 2019 paper: Quanfu Fan, Richard Chen, Hilde Kuehne, Marco Pistoia, David Cox, "More Is Less: Learning Efficient Video Representations by Temporal Aggregation Modules"
Stars: ✭ 54 (+285.71%)
MiCT-Net-PyTorch: Video Recognition using Mixed Convolutional Tube (MiCT) on PyTorch with a ResNet backbone
Stars: ✭ 48 (+242.86%)
TA3N: [ICCV 2019 Oral] TA3N: https://github.com/cmhungsteve/TA3N (Most updated repo)
Stars: ✭ 45 (+221.43%)
Squeeze-and-Recursion-Temporal-Gates: Code for [Pattern Recognit. Lett. 2021] "Learn to cycle: Time-consistent feature discovery for action recognition" and [IJCNN 2021] "Multi-Temporal Convolutions for Human Action Recognition in Videos".
Stars: ✭ 62 (+342.86%)
Zero-Shot-TTS: Unofficial implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration
Stars: ✭ 33 (+135.71%)
Lintel: A Python module to decode video frames directly, using the FFmpeg C API.
Stars: ✭ 240 (+1614.29%)
temporal-binding-network: Implementation of "EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition, ICCV, 2019" in PyTorch
Stars: ✭ 95 (+578.57%)
Pose2vec: A repository for maintaining various human skeleton preprocessing steps in NumPy and TensorFlow, along with a TensorFlow model to learn pose embeddings.
Stars: ✭ 25 (+78.57%)
lang2seg: Referring Expression Object Segmentation with Caption-Aware Consistency, BMVC 2019
Stars: ✭ 30 (+114.29%)
ailia-models: A collection of pre-trained, state-of-the-art AI models for the ailia SDK
Stars: ✭ 1,102 (+7771.43%)
sparseprop: Temporal action proposals
Stars: ✭ 46 (+228.57%)
cvxpnpl: A Perspective-n-Points-and-Lines method.
Stars: ✭ 56 (+300%)
maks: Motion Averaging
Stars: ✭ 52 (+271.43%)
Openpose-based-GUI-for-Realtime-Pose-Estimate-and-Action-Recognition: GUI based on the Python API of OpenPose on Windows, using CUDA 10 and cuDNN 7. Supports body, hand, and face keypoint estimation and data saving. Real-time gesture recognition is performed by a two-layer neural network on the skeletons collected from the GUI.
Stars: ✭ 69 (+392.86%)
Attentionalpoolingaction: Code/Model release for the NIPS 2017 paper "Attentional Pooling for Action Recognition"
Stars: ✭ 248 (+1671.43%)
MSPN: Multi-Stage Pose Network
Stars: ✭ 40 (+185.71%)
Alphaction: Spatio-Temporal Action Localization System
Stars: ✭ 221 (+1478.57%)
Action-Localization: Action-Localization, Atomic Visual Actions (AVA) Dataset
Stars: ✭ 22 (+57.14%)
adascan-public: Code for AdaScan: Adaptive Scan Pooling (CVPR 2017)
Stars: ✭ 43 (+207.14%)
stanford-cs231n-assignments-2020: This repository contains my solutions to the assignments for Stanford's CS231n "Convolutional Neural Networks for Visual Recognition" (Spring 2020).
Stars: ✭ 84 (+500%)