Sid2697 / awesome-egocentric-vision

Licence: CC0-1.0
A curated list of egocentric (first-person) vision and related area resources

Projects that are alternatives to, or similar to, awesome-egocentric-vision

Intro To Cv Ud810
Problem Set solutions for the "Introduction to Computer Vision (ud810)" MOOC from Udacity
Stars: ✭ 110 (+6.8%)
Mutual labels:  activity-recognition
Rnn For Human Activity Recognition Using 2d Pose Input
Activity Recognition from 2D pose using an LSTM RNN
Stars: ✭ 165 (+60.19%)
Mutual labels:  activity-recognition
R2Plus1D-C3D
A PyTorch implementation of R2Plus1D and C3D based on the CVPR 2018 paper "A Closer Look at Spatiotemporal Convolutions for Action Recognition" and the ICCV 2015 paper "Learning Spatiotemporal Features with 3D Convolutional Networks"
Stars: ✭ 54 (-47.57%)
Mutual labels:  activity-recognition
Hake
HAKE: Human Activity Knowledge Engine (CVPR'18/19/20, NeurIPS'20)
Stars: ✭ 132 (+28.16%)
Mutual labels:  activity-recognition
Motion Sense
MotionSense Dataset for Human Activity and Attribute Recognition (time-series data generated by smartphone sensors: accelerometer and gyroscope)
Stars: ✭ 159 (+54.37%)
Mutual labels:  activity-recognition
Video Caffe
Video-friendly caffe -- comes with the most recent version of Caffe (as of Jan 2019), a video reader, 3D(ND) pooling layer, and an example training script for C3D network and UCF-101 data
Stars: ✭ 172 (+66.99%)
Mutual labels:  activity-recognition
M Pact
A one-stop shop for all of your activity recognition needs.
Stars: ✭ 85 (-17.48%)
Mutual labels:  activity-recognition
Awesome-Human-Activity-Recognition
An up-to-date, curated list of awesome IMU-based Human Activity Recognition (ubiquitous computing) papers, methods, and resources. Note that most of the collected research is based on IMU data.
Stars: ✭ 72 (-30.1%)
Mutual labels:  activity-recognition
Deep Learning Activity Recognition
A tutorial on using deep learning for activity recognition (PyTorch and TensorFlow)
Stars: ✭ 159 (+54.37%)
Mutual labels:  activity-recognition
Gait-Recognition-Using-Smartphones
Deep Learning-Based Gait Recognition Using Smartphones in the Wild
Stars: ✭ 77 (-25.24%)
Mutual labels:  activity-recognition
Awesome Activity Prediction
Paper list for activity prediction and related areas
Stars: ✭ 147 (+42.72%)
Mutual labels:  activity-recognition
Fall Detection
Human Fall Detection from CCTV camera feed
Stars: ✭ 154 (+49.51%)
Mutual labels:  activity-recognition
Charades Algorithms
Activity Recognition Algorithms for the Charades Dataset
Stars: ✭ 181 (+75.73%)
Mutual labels:  activity-recognition
Machinelearning
Learning materials and research introductions related to machine learning
Stars: ✭ 1,707 (+1557.28%)
Mutual labels:  activity-recognition
stipcv
Real-time implementation of spatio-temporal local features
Stars: ✭ 14 (-86.41%)
Mutual labels:  activity-recognition
T3d
Temporal 3D ConvNet
Stars: ✭ 97 (-5.83%)
Mutual labels:  activity-recognition
C3d Keras
C3D for Keras + TensorFlow
Stars: ✭ 171 (+66.02%)
Mutual labels:  activity-recognition
Squeeze-and-Recursion-Temporal-Gates
Code for: [Pattern Recognit. Lett. 2021] "Learn to cycle: Time-consistent feature discovery for action recognition" and [IJCNN 2021] "Multi-Temporal Convolutions for Human Action Recognition in Videos".
Stars: ✭ 62 (-39.81%)
Mutual labels:  activity-recognition
glimpse clouds
PyTorch implementation of the paper "Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points", F. Baradel, C. Wolf, J. Mille, G.W. Taylor, CVPR 2018
Stars: ✭ 30 (-70.87%)
Mutual labels:  activity-recognition
Step
STEP: Spatio-Temporal Progressive Learning for Video Action Detection. CVPR'19 (Oral)
Stars: ✭ 196 (+90.29%)
Mutual labels:  activity-recognition

Awesome Egocentric Vision

A curated list of egocentric vision resources.

Egocentric (first-person) vision is a sub-field of computer vision that analyses image and video data captured by a wearable camera, which approximates the wearer's visual field.

Contents

Papers

Clustered by problem statement.

Action/Activity Recognition

Object/Hand Recognition

Action/Gaze Anticipation

Localization

Clustering

Video Summarization

Social Interactions

Pose Estimation

Human Object Interaction

Temporal Boundary Detection

Privacy in Egocentric Videos

Multiple Egocentric Tasks

  • Ego4D: Around the World in 3,000 Hours of Egocentric Video - Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Christian Fuegen, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Bernard Ghanem, Vamsi Krishna Ithapu, C.V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, and Jitendra Malik. In CVPR 2022. [Github] [project page] [video]

Task Understanding

Miscellaneous (New Tasks)

Clustered by conference.

CVPR

ECCV

ICCV

WACV

BMVC

Datasets

  • EgoProceL - 62 hours of egocentric videos recorded by 130 subjects performing 16 tasks for procedure learning.
  • EgoBody - Large-scale dataset capturing ground-truth 3D human motions during social interactions in 3D scenes.
  • UnrealEgo - Large-scale naturalistic dataset for egocentric 3D human pose estimation.
  • Hand-object Segments - Hand-object interactions in 11,235 frames from 1,000 videos covering daily activities in diverse scenarios.
  • Ego4D - 3,025 hours of daily-life activity video spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 855 unique camera wearers from 74 worldwide locations and 9 different countries.
  • HOI4D - 2.4M RGB-D egocentric video frames over 4,000 sequences, collected from 9 participants interacting with 800 different object instances from 16 categories in 610 different indoor rooms.
  • EgoCom - A natural conversations dataset containing multi-modal human communication data captured simultaneously from the participants' egocentric perspectives.
  • TREK-100 - Object tracking in first person vision.
  • MECCANO - 20 subjects assembling a toy motorbike.
  • EPIC-Kitchens 2020 - Subjects performing unscripted actions in their native environments.
  • EPIC-Tent - 29 participants assembling a tent while wearing two head-mounted cameras. [paper]
  • EGO-CH - 70 subjects visiting two cultural sites in Sicily, Italy.
  • EPIC-Kitchens 2018 - 32 subjects performing unscripted actions in their native environments.
  • Charades-Ego - Paired first- and third-person videos.
  • EGTEA Gaze+ - 32 subjects, 86 cooking sessions, 28 hours.
  • ADL - 20 subjects performing daily activities in their native environments.
  • CMU kitchen - Multimodal, 18 subjects cooking 5 different recipes: brownies, eggs, pizza, salad, sandwich.
  • EgoSeg - Long-term actions (walking, running, driving, etc.).
  • First-Person Social Interactions - 8 subjects at Disney World.
  • UEC Dataset - Two choreographed datasets with different ego-actions (walk, jump, climb, etc.) + 6 YouTube sports videos.
  • JPL - Interaction with a robot.
  • FPPA - Five subjects performing 5 daily actions.
  • UT Egocentric - 3-5 hour-long videos capturing a person's day.
  • VINST / Visual Diaries - 31 videos capturing the visual experience of a subject walking from a metro station to work.
  • Bristol Egocentric Object Interaction (BEOID) - 8 subjects, 6 locations; interactions with objects and the environment.
  • Object Search Dataset - 57 sequences of 55 subjects on search and retrieval tasks.
  • UNICT-VEDI - Different subjects visiting a museum.
  • UNICT-VEDI-POI - Different subjects visiting a museum.
  • Simulated Egocentric Navigations - Simulated navigations of a virtual agent within a large building.
  • EgoCart - Egocentric images collected by a shopping cart in a retail store.
  • Unsupervised Segmentation of Daily Living Activities - Egocentric videos of daily activities.
  • Visual Market Basket Analysis - Egocentric images collected by a shopping cart in a retail store.
  • Location Based Segmentation of Egocentric Videos - Egocentric videos of daily activities.
  • Recognition of Personal Locations from Egocentric Videos - Egocentric video clips of daily activities.
  • EgoGesture - 2k videos from 50 subjects performing 83 gestures.
  • EgoHands - 48 videos of interactions between two people.
  • DoMSEV - 80 hours of egocentric video covering different activities.
  • DR(eye)VE - 74 videos of people driving.
  • THU-READ - 8 subjects performing 40 actions with a head-mounted RGBD camera.
  • EgoDexter - 4 sequences with 4 actors (2 female) and varying interactions with various objects in a cluttered background. [paper]
  • First-Person Hand Action (FPHA) - 3D hand-object interaction. Includes 1175 videos belonging to 45 different activity categories performed by 6 actors. [paper]
  • UTokyo Paired Ego-Video (PEV) - 1,226 pairs of first-person clips extracted from videos recorded synchronously during dyadic conversations.
  • UTokyo Ego-Surf - Contains 8 diverse groups of first-person videos recorded synchronously during face-to-face conversations.
  • TEgO: Teachable Egocentric Objects Dataset - Contains egocentric images of 19 distinct objects taken by two people for training a teachable object recognizer.
  • Multimodal Focused Interaction Dataset - Contains 377 minutes of continuous multimodal recording captured during 19 sessions, with 17 conversational partners in 18 different indoor/outdoor locations.
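
Most of the datasets above ship with their own annotation formats and loaders, but they generally feed the same kind of model input. As a minimal, purely illustrative sketch (not code from this list or from any of the datasets), the snippet below runs torchvision's Kinetics-400 pre-trained R(2+1)D-18 on a single egocentric clip; the path clip.mp4 is a hypothetical placeholder, and the 400 output classes are Kinetics labels rather than any egocentric label set (e.g. EPIC-Kitchens verbs/nouns).

```python
# Minimal, hedged sketch (not from this list): run torchvision's Kinetics-400
# pre-trained R(2+1)D-18 on one clip. "clip.mp4" is a hypothetical placeholder.
# Assumes torchvision >= 0.13 for the `weights=` API.
import torch
import torch.nn.functional as F
import torchvision
from torchvision.io import read_video

frames, _, _ = read_video("clip.mp4", pts_unit="sec")   # (T, H, W, C), uint8
frames = frames.permute(3, 0, 1, 2).float() / 255.0     # -> (C, T, H, W) in [0, 1]
frames = frames[:, :16]                                  # keep a short, fixed-length snippet

# Normalise with the Kinetics-400 statistics used by torchvision's video models.
mean = torch.tensor([0.43216, 0.394666, 0.37645]).view(3, 1, 1, 1)
std = torch.tensor([0.22803, 0.22145, 0.216989]).view(3, 1, 1, 1)
frames = (frames - mean) / std

# Resize every frame to the 112x112 resolution r2plus1d_18 was trained on.
frames = F.interpolate(frames.permute(1, 0, 2, 3), size=(112, 112),
                       mode="bilinear", align_corners=False).permute(1, 0, 2, 3)

model = torchvision.models.video.r2plus1d_18(weights="KINETICS400_V1").eval()
with torch.no_grad():
    logits = model(frames.unsqueeze(0))                  # (1, 400) Kinetics-400 logits
print(int(logits.argmax(dim=1)))                         # predicted Kinetics class index
```

In practice one would fine-tune such a backbone on the annotations and train/val splits released with a specific dataset; the sketch only illustrates the (batch, channels, time, height, width) input layout that most 3D-CNN action-recognition models expect.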

Contribute

This is a work in progress. Contributions welcome! Read the contribution guidelines first.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].