
DirtyHarryLYL / Hoi Learning List

A list of Human-Object Interaction Learning studies.

Projects that are alternatives to or similar to Hoi Learning List

Daps
This repo contains the DAPs code of our ECCV 2016 publication
Stars: ✭ 74 (-48.97%)
Mutual labels:  action-recognition
Tdd
Trajectory-pooled Deep-Convolutional Descriptors
Stars: ✭ 99 (-31.72%)
Mutual labels:  action-recognition
I3d finetune
TensorFlow code for finetuning I3D model on UCF101.
Stars: ✭ 128 (-11.72%)
Mutual labels:  action-recognition
Vidvrd Helper
To keep up to date with the VRU Grand Challenge, please use https://github.com/NExTplusplus/VidVRD-helper
Stars: ✭ 81 (-44.14%)
Mutual labels:  action-recognition
Video Dataset Loading Pytorch
Generic PyTorch Dataset Implementation for Loading, Preprocessing and Augmenting Video Datasets
Stars: ✭ 92 (-36.55%)
Mutual labels:  action-recognition
Modelfeast
PyTorch model zoo for humans, including all kinds of 2D CNN, 3D CNN, and CRNN models
Stars: ✭ 116 (-20%)
Mutual labels:  action-recognition
Hake Action
Part of the HAKE project; includes the reproduced SOTA models and the corresponding HAKE-enhanced versions (CVPR2020).
Stars: ✭ 72 (-50.34%)
Mutual labels:  action-recognition
Hake
HAKE: Human Activity Knowledge Engine (CVPR'18/19/20, NeurIPS'20)
Stars: ✭ 132 (-8.97%)
Mutual labels:  action-recognition
3d Resnets
3D ResNets for Action Recognition
Stars: ✭ 95 (-34.48%)
Mutual labels:  action-recognition
Skeleton Based Action Recognition Papers And Notes
Skeleton-based Action Recognition Papers and Small Notes and Top 2 Leaderboard for NTU-RGBD
Stars: ✭ 126 (-13.1%)
Mutual labels:  action-recognition
M Pact
A one-stop shop for all of your activity recognition needs.
Stars: ✭ 85 (-41.38%)
Mutual labels:  action-recognition
Temporal Segment Networks
Code & Models for Temporal Segment Networks (TSN) in ECCV 2016
Stars: ✭ 1,287 (+787.59%)
Mutual labels:  action-recognition
Keras Kinetics I3d
Keras implementation of Inflated 3D networks (I3D) from the "Quo Vadis" paper, plus weights
Stars: ✭ 116 (-20%)
Mutual labels:  action-recognition
Hake Action Torch
HAKE-Action in PyTorch
Stars: ✭ 74 (-48.97%)
Mutual labels:  action-recognition
Mmaction
An open-source toolbox for action understanding based on PyTorch
Stars: ✭ 1,711 (+1080%)
Mutual labels:  action-recognition
Tdn
[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition
Stars: ✭ 72 (-50.34%)
Mutual labels:  action-recognition
Movienet Tools
Tools for movie and video research
Stars: ✭ 113 (-22.07%)
Mutual labels:  action-recognition
Actionrecognition
Explore Action Recognition
Stars: ✭ 139 (-4.14%)
Mutual labels:  action-recognition
Action Recognition
Exploration of different solutions to action recognition in video, using neural networks implemented in PyTorch.
Stars: ✭ 129 (-11.03%)
Mutual labels:  action-recognition
Epic Kitchens 55 Annotations
🍴 Annotations for the EPIC KITCHENS-55 Dataset.
Stars: ✭ 120 (-17.24%)
Mutual labels:  action-recognition

HOI-Learning-List

Some recent (2015-now) Human-Object Interaction Learning studies. If you find any errors or problems, please feel free to comment.

A list of Transformer-based vision works: https://github.com/DirtyHarryLYL/Transformer-in-Vision.

Dataset

More...

Method

HOI Recognition: Image-based, to recognize all the HOIs in one image.
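Image-level HOI recognition is commonly cast as multi-label classification: one image may contain several HOIs, so every HOI category (600 in HICO) gets an independent sigmoid score. A minimal sketch of this setting, assuming a ResNet-50 backbone; the model and head are illustrative, not any specific method listed below:

```python
# Minimal multi-label HOI recognition sketch (illustrative, not a listed method).
import torch
import torch.nn as nn
import torchvision.models as models

class HOIRecognizer(nn.Module):
    def __init__(self, num_hois=600):          # HICO defines 600 HOI categories
        super().__init__()
        backbone = models.resnet50(weights=None)
        backbone.fc = nn.Identity()             # keep the 2048-d pooled feature
        self.backbone = backbone
        self.classifier = nn.Linear(2048, num_hois)

    def forward(self, images):                  # images: (B, 3, H, W)
        feats = self.backbone(images)           # (B, 2048)
        return self.classifier(feats)           # (B, num_hois) logits

model = HOIRecognizer()
logits = model(torch.randn(2, 3, 224, 224))
probs = torch.sigmoid(logits)                   # one score per HOI, not a softmax
loss = nn.BCEWithLogitsLoss()(logits, torch.zeros_like(logits))
```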

More...

Unseen or zero-shot learning (image-level recognition).

  • Compositional Learning for Human Object Interaction (ECCV2018) [Paper]

  • Zero-Shot Human-Object Interaction Recognition via Affordance Graphs (Sep. 2020) [Paper]

More...

HOI Detection: Instance-based, to detect the human-object pairs and classify the interactions.
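A typical two-stage detector first detects humans and objects, then scores verbs for every human-object pair. A minimal sketch of the exhaustive pairing step, assuming per-box appearance features are already pooled; the dimensions and scoring head are illustrative (HICO-DET defines 117 verb classes over the 80 COCO object classes):

```python
# Exhaustive human-object pairing sketch (illustrative head, not a listed method).
import torch
import torch.nn as nn

class PairScorer(nn.Module):
    def __init__(self, feat_dim=1024, num_verbs=117):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim + 8, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, num_verbs))

    def forward(self, h_feats, o_feats, h_boxes, o_boxes):
        # h_feats: (H, D), o_feats: (O, D); boxes: (., 4) as (x1, y1, x2, y2).
        H, O = h_feats.size(0), o_feats.size(0)
        hf = h_feats[:, None].expand(H, O, -1)
        of = o_feats[None, :].expand(H, O, -1)
        hb = h_boxes[:, None].expand(H, O, 4)
        ob = o_boxes[None, :].expand(H, O, 4)
        pair = torch.cat([hf, of, hb, ob], dim=-1)   # (H, O, 2D + 8)
        return self.head(pair)                        # (H, O, num_verbs) logits
```

In practice the pair scores are further combined with the detector's human and object confidences before evaluation.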

More...

Unseen or zero-shot learning (instance-level detection); a sketch of the shared compositional idea follows this list.

  • FCL (CVPR2021) [Paper], [Code]

  • Detecting Human-Object Interaction with Mixed Supervision (WACV 2021) [Paper]

  • Zero-Shot Human-Object Interaction Recognition via Affordance Graphs (Sep. 2020) [Paper]

  • VCL (ECCV2020) [Paper] [Code]

  • HOID (CVPR2020) [Code] [Paper]

  • Novel Human-Object Interaction Detection via Adversarial Domain Generalization (May 2020) [Paper]

  • Analogy (ICCV2019) [Code] [Paper]

  • Functional (AAAI2020) [Paper]

  • Scaling Human-Object Interaction Recognition through Zero-Shot Learning (WACV2018) [Paper]

More...
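Several of the zero-shot works above (e.g. VCL, FCL, Analogy) share a compositional idea: factor each HOI into a verb part and an object part, so that verb-object combinations never seen together in training can still be scored from parts that were seen separately. A generic, hedged sketch of that idea; the embedding sizes and scoring rule are illustrative, not any single paper's formulation:

```python
# Generic verb-object compositional scoring sketch (not a specific paper's model).
import torch
import torch.nn as nn

class CompositionalHOI(nn.Module):
    def __init__(self, feat_dim=1024, num_verbs=117, num_objects=80, emb=512):
        super().__init__()
        self.verb_emb = nn.Embedding(num_verbs, emb)
        self.obj_emb = nn.Embedding(num_objects, emb)
        self.proj = nn.Linear(feat_dim, emb)

    def forward(self, pair_feat, verb_ids, obj_ids):
        # pair_feat: (B, feat_dim) visual feature of a human-object pair.
        composed = self.verb_emb(verb_ids) * self.obj_emb(obj_ids)   # (B, emb)
        return (self.proj(pair_feat) * composed).sum(-1)             # (B,) scores
```

At test time, scoring every verb-object composition yields predictions for combinations that never co-occurred in training.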

Video HOI methods (a generic temporal-aggregation sketch follows this list)

  • LIGHTEN (ACMMM2020) [Paper] [Code]

  • Generating Videos of Zero-Shot Compositions of Actions and Objects (Jul 2020), HOI GAN, [Paper]

  • Grounded Human-Object Interaction Hotspots from Video (ICCV2019) [Code] [Paper]

  • GPNN (ECCV2018) [Code] [Paper]

More...
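The video methods above additionally reason over time. As a generic illustration (not LIGHTEN's or GPNN's actual architecture), per-frame human-object pair features can be aggregated with a recurrent layer before classifying the interaction once per clip:

```python
# Generic temporal aggregation sketch for video HOI (illustrative only).
import torch
import torch.nn as nn

class TemporalHOI(nn.Module):
    def __init__(self, feat_dim=1024, hidden=512, num_verbs=117):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
        self.cls = nn.Linear(hidden, num_verbs)

    def forward(self, pair_feats):           # (B, T, feat_dim) over T frames
        _, h_n = self.gru(pair_feats)        # h_n: (1, B, hidden), last step
        return self.cls(h_n.squeeze(0))      # (B, num_verbs) clip-level logits
```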

Results

PaStaNet-HOI:

Proposed in the TPAMI version of TIN (Transferable Interactiveness Network). It is built on HAKE data and includes 110K+ images and 520 HOIs (the 80 "no_interaction" HOIs of HICO-DET are excluded to avoid incomplete labeling). Its long-tailed data distribution is more severe, which makes it more difficult.

Detector: COCO pre-trained

| Method | mAP |
|:---:|:---:|
| iCAN | 11.00 |
| iCAN+NIS | 13.13 |
| TIN | 15.38 |
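The mAP numbers in this section follow the standard HOI-detection protocol: a predicted (human, verb, object) triplet counts as a true positive only if its HOI class is correct and both boxes overlap the ground truth with IoU >= 0.5. A minimal sketch of that matching rule (the dict layout is illustrative):

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(pred, gt, thresh=0.5):
    """pred/gt: dicts with 'human' and 'object' boxes and an 'hoi' class id."""
    return (pred["hoi"] == gt["hoi"]
            and iou(pred["human"], gt["human"]) >= thresh
            and iou(pred["object"], gt["object"]) >= thresh)
```

The full protocol additionally matches each ground-truth triplet to at most one prediction (highest score first) and averages per-class AP over the HOI categories.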

HICO-DET:

1) Detector: COCO pre-trained

(def: Default setting, evaluated over all images; ko: Known Object setting, evaluated only on images known to contain the object category. Full / Rare / Non-Rare split the 600 HOI categories by training-instance count; Rare classes have fewer than 10 training instances.)

| Method | Pub | Full(def) | Rare(def) | Non-Rare(def) | Full(ko) | Rare(ko) | Non-Rare(ko) |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| Shen et al. | WACV2018 | 6.46 | 4.24 | 7.12 | - | - | - |
| HO-RCNN | WACV2018 | 7.81 | 5.37 | 8.54 | 10.41 | 8.94 | 10.85 |
| InteractNet | CVPR2018 | 9.94 | 7.16 | 10.77 | - | - | - |
| Turbo | AAAI2019 | 11.40 | 7.30 | 12.60 | - | - | - |
| GPNN | ECCV2018 | 13.11 | 9.34 | 14.23 | - | - | - |
| Xu et al. | ICCV2019 | 14.70 | 13.26 | 15.13 | - | - | - |
| iCAN | BMVC2018 | 14.84 | 10.45 | 16.15 | 16.26 | 11.33 | 17.73 |
| Wang et al. | ICCV2019 | 16.24 | 11.16 | 17.75 | 17.73 | 12.78 | 19.21 |
| Lin et al. | IJCAI2020 | 16.63 | 11.30 | 18.22 | 19.22 | 14.56 | 20.61 |
| Functional (suppl) | AAAI2020 | 16.96 | 11.73 | 18.52 | - | - | - |
| Interactiveness | CVPR2019 | 17.03 | 13.42 | 18.11 | 19.17 | 15.51 | 20.26 |
| No-Frills | ICCV2019 | 17.18 | 12.17 | 18.68 | - | - | - |
| RPNN | ICCV2019 | 17.35 | 12.78 | 18.71 | - | - | - |
| PMFNet | ICCV2019 | 17.46 | 15.65 | 18.00 | 20.34 | 17.47 | 21.20 |
| SIGN | ICME2020 | 17.51 | 15.31 | 18.53 | 20.49 | 17.53 | 21.51 |
| Interactiveness-optimized | CVPR2019 | 17.54 | 13.80 | 18.65 | 19.75 | 15.70 | 20.96 |
| Wang et al. | ECCV2020 | 17.57 | 16.85 | 17.78 | 21.00 | 20.74 | 21.08 |
| In-GraphNet | IJCAI-PRICAI 2020 | 17.72 | 12.93 | 19.31 | - | - | - |
| HOID | CVPR2020 | 17.85 | 12.85 | 19.34 | - | - | - |
| MLCNet | ICMR2020 | 17.95 | 16.62 | 18.35 | 22.28 | 20.73 | 22.74 |
| SAG | arXiv | 18.26 | 13.40 | 19.71 | - | - | - |
| Sarullo et al. | arXiv | 18.74 | - | - | - | - | - |
| DRG | ECCV2020 | 19.26 | 17.74 | 19.71 | 23.40 | 21.75 | 23.89 |
| Analogy | ICCV2019 | 19.40 | 14.60 | 20.90 | - | - | - |
| VCL | ECCV2020 | 19.43 | 16.55 | 20.29 | 22.00 | 19.09 | 22.87 |
| VS-GATs | arXiv | 19.66 | 15.79 | 20.81 | - | - | - |
| VSGNet | CVPR2020 | 19.80 | 16.05 | 20.91 | - | - | - |
| PFNet | CVM | 20.05 | 16.66 | 21.07 | 24.01 | 21.09 | 24.89 |
| FCMNet | ECCV2020 | 20.41 | 17.34 | 21.56 | 22.04 | 18.97 | 23.12 |
| ACP | ECCV2020 | 20.59 | 15.92 | 21.98 | - | - | - |
| PD-Net | ECCV2020 | 20.81 | 15.90 | 22.28 | 24.78 | 18.88 | 26.54 |
| TIN-PAMI | TPAMI2021 | 20.93 | 18.95 | 21.32 | 23.02 | 20.96 | 23.42 |
| PMN | arXiv | 21.21 | 17.60 | 22.29 | - | - | - |
| DJ-RN | CVPR2020 | 21.34 | 18.53 | 22.18 | 23.69 | 20.64 | 24.60 |
| OSGNet | IEEE Access | 21.40 | 18.12 | 22.38 | - | - | - |
| DIRV | AAAI2021 | 21.78 | 16.38 | 23.39 | 25.52 | 20.84 | 26.92 |
| ConsNet | ACMMM2020 | 22.15 | 17.12 | 23.65 | - | - | - |
| IDN | NeurIPS2020 | 23.36 | 22.47 | 23.63 | 26.43 | 25.01 | 26.85 |

2) Detector: pre-trained on COCO, fine-tuned on HICO-DET train set (with GT human-object pair boxes) or one-stage detector

A fine-tuned detector learns to detect only the interactive humans and objects (i.e., those with interactiveness), thus suppressing many wrong pairings (non-interactive human-object pairs) and boosting performance.

| Method | Pub | Full(def) | Rare(def) | Non-Rare(def) | Full(ko) | Rare(ko) | Non-Rare(ko) |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| UniDet | ECCV2020 | 17.58 | 11.72 | 19.33 | 19.76 | 14.68 | 21.27 |
| IP-Net | CVPR2020 | 19.56 | 12.79 | 21.58 | 22.05 | 15.77 | 23.92 |
| PPDM (paper) | CVPR2020 | 21.10 | 14.46 | 23.09 | - | - | - |
| PPDM (github-hourglass104) | CVPR2020 | 21.73/21.94 | 13.78/13.97 | 24.10/24.32 | 24.58/24.81 | 16.65/17.09 | 26.84/27.12 |
| Functional | AAAI2020 | 21.96 | 16.43 | 23.62 | - | - | - |
| SABRA-Res50 | arXiv | 23.48 | 16.39 | 25.59 | 28.79 | 22.75 | 30.54 |
| VCL | ECCV2020 | 23.63 | 17.21 | 25.55 | 25.98 | 19.12 | 28.03 |
| SABRA-Res50FPN | arXiv | 24.12 | 15.91 | 26.57 | 29.65 | 22.92 | 31.65 |
| ConsNet | ACMMM2020 | 24.39 | 17.10 | 26.56 | - | - | - |
| DRG | ECCV2020 | 24.53 | 19.47 | 26.04 | 27.98 | 23.11 | 29.43 |
| SABRA-Res152 | arXiv | 26.09 | 16.29 | 29.02 | 31.08 | 23.44 | 33.37 |
| IDN | NeurIPS2020 | 26.29 | 22.61 | 27.39 | 28.24 | 24.47 | 29.37 |
| Zou et al. | CVPR2021 | 26.61 | 19.15 | 28.84 | 29.13 | 20.98 | 31.57 |
| AS-Net | CVPR2021 | 28.87 | 24.25 | 30.25 | 31.74 | 27.07 | 33.14 |
| QPIC-Res50 | CVPR2021 | 29.07 | 21.85 | 31.23 | 31.68 | 24.14 | 33.93 |
| FCL | CVPR2021 | 29.12 | 23.67 | 30.75 | 31.31 | 25.62 | 33.02 |
| QPIC-Res101 | CVPR2021 | 29.90 | 23.92 | 31.69 | 32.38 | 26.06 | 34.27 |
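The interactiveness idea above can be read as a filtering step: candidate pairs with a low interactiveness score are suppressed before final HOI scoring. A hedged sketch of such non-interaction suppression, assuming a per-pair interactiveness probability is already available (the threshold and combination rule are illustrative, not TIN's exact NIS):

```python
import torch

def suppress_non_interactive(pair_scores, interactiveness, thresh=0.1):
    """pair_scores: (N, num_verbs) HOI scores for N candidate pairs.
    interactiveness: (N,) probability that each pair interacts at all.
    Low-interactiveness pairs are zeroed; the rest are score-reweighted."""
    keep = (interactiveness >= thresh).float().unsqueeze(-1)   # (N, 1) mask
    return pair_scores * interactiveness.unsqueeze(-1) * keep
```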

3) Ground Truth human-object pair boxes (only evaluating HOI recognition)

| Method | Pub | Full(def) | Rare(def) | Non-Rare(def) |
|:---:|:---:|:---:|:---:|:---:|
| iCAN | BMVC2018 | 33.38 | 21.43 | 36.95 |
| Interactiveness | CVPR2019 | 34.26 | 22.90 | 37.65 |
| Analogy | ICCV2019 | 34.35 | 27.57 | 36.38 |
| IDN | NeurIPS2020 | 43.98 | 40.27 | 45.09 |
| FCL | CVPR2021 | 45.25 | 36.27 | 47.94 |

4) Enhanced with HAKE:

| Method | Pub | Full(def) | Rare(def) | Non-Rare(def) | Full(ko) | Rare(ko) | Non-Rare(ko) |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| iCAN | BMVC2018 | 14.84 | 10.45 | 16.15 | 16.26 | 11.33 | 17.73 |
| iCAN + HAKE-HICO-DET | CVPR2020 | 19.61 (+4.77) | 17.29 | 20.30 | 22.10 | 20.46 | 22.59 |
| Interactiveness | CVPR2019 | 17.03 | 13.42 | 18.11 | 19.17 | 15.51 | 20.26 |
| Interactiveness + HAKE-HICO-DET | CVPR2020 | 22.12 (+5.09) | 20.19 | 22.69 | 24.06 | 22.19 | 24.62 |
| Interactiveness + HAKE-Large | CVPR2020 | 22.66 (+5.63) | 21.17 | 23.09 | 24.53 | 23.00 | 24.99 |

Ambiguous-HOI

Detector: COCO pre-trained

| Method | mAP |
|:---:|:---:|
| iCAN | 8.14 |
| Interactiveness | 8.22 |
| Analogy (reproduced) | 9.72 |
| DJ-RN | 10.37 |

V-COCO: Scenario 1

(In Scenario 1 of the AP(role) metric, a method must predict an empty box for occluded target objects to score a correct role; Scenario 2 ignores the predicted box in such cases.)

1) Detector: COCO pre-trained or one-stage detector

| Method | Pub | AP(role) |
|:---:|:---:|:---:|
| Gupta et al. | arXiv | 31.8 |
| InteractNet | CVPR2018 | 40.0 |
| Turbo | AAAI2019 | 42.0 |
| GPNN | ECCV2018 | 44.0 |
| iCAN | BMVC2018 | 45.3 |
| Xu et al. | CVPR2019 | 45.9 |
| Wang et al. | ICCV2019 | 47.3 |
| UniDet | ECCV2020 | 47.5 |
| Interactiveness | CVPR2019 | 47.8 |
| Lin et al. | IJCAI2020 | 48.1 |
| VCL | ECCV2020 | 48.3 |
| Zhou et al. | CVPR2020 | 48.9 |
| In-GraphNet | IJCAI-PRICAI 2020 | 48.9 |
| Interactiveness-optimized | CVPR2019 | 49.0 |
| TIN-PAMI | TPAMI2021 | 49.1 |
| IP-Net | CVPR2020 | 51.0 |
| DRG | ECCV2020 | 51.0 |
| VSGNet | CVPR2020 | 51.8 |
| PMN | arXiv | 51.8 |
| PMFNet | ICCV2019 | 52.0 |
| FCL | CVPR2021 | 52.35 |
| PD-Net | ECCV2020 | 52.6 |
| Wang et al. | ECCV2020 | 52.7 |
| PFNet | CVM | 52.8 |
| Zou et al. | CVPR2021 | 52.9 |
| ACP | ECCV2020 | 52.98 (53.23) |
| SIGN | ICME2020 | 53.1 |
| FCMNet | ECCV2020 | 53.1 |
| ConsNet | ACMMM2020 | 53.2 |
| IDN | NeurIPS2020 | 53.3 |
| OSGNet | IEEE Access | 53.43 |
| SABRA-Res50 | arXiv | 53.57 |
| AS-Net | CVPR2021 | 53.9 |
| SABRA-Res50FPN | arXiv | 54.69 |
| MLCNet | ICMR2020 | 55.2 |
| DIRV | AAAI2021 | 56.1 |
| SABRA-Res152 | arXiv | 56.62 |
| QPIC-Res101 | CVPR2021 | 58.3 |
| QPIC-Res50 | CVPR2021 | 58.8 |

2) Enhanced with HAKE:

| Method | Pub | AP(role) |
|:---:|:---:|:---:|
| iCAN | BMVC2018 | 45.3 |
| iCAN + HAKE-Large (transfer learning) | CVPR2020 | 49.2 (+3.9) |
| Interactiveness | CVPR2019 | 47.8 |
| Interactiveness + HAKE-Large (transfer learning) | CVPR2020 | 51.0 (+3.2) |

HICO

1) Default

| Method | mAP |
|:---:|:---:|
| R*CNN | 28.5 |
| Girdhar et al. | 34.6 |
| Mallya et al. | 36.1 |
| Pairwise | 39.9 |

2) Enhanced with HAKE:

| Method | mAP |
|:---:|:---:|
| Mallya et al. | 36.1 |
| Mallya et al. + HAKE-HICO | 45.0 (+8.9) |
| Pairwise | 39.9 |
| Pairwise + HAKE-HICO | 45.9 (+6.0) |
| Pairwise + HAKE-Large | 46.3 (+6.4) |