All Projects → wmcnally → kapao

wmcnally / kapao

Licence: GPL-3.0 license
KAPAO is an efficient single-stage human pose estimation model that detects keypoints and poses as objects and fuses the detections to predict human poses.

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects

Projects that are alternatives of or similar to kapao

Keras realtime multi Person pose estimation
Keras version of Realtime Multi-Person Pose Estimation project
Stars: ✭ 728 (+20.53%)
Mutual labels:  human-pose-estimation, pose-estimation
Tensorflow realtime multi Person pose estimation
Multi-Person Pose Estimation project for Tensorflow 2.0 with a small and fast model based on MobilenetV3
Stars: ✭ 129 (-78.64%)
Mutual labels:  human-pose-estimation, pose-estimation
Convolutional Pose Machines Tensorflow
Stars: ✭ 758 (+25.5%)
Mutual labels:  human-pose-estimation, pose-estimation
Openpose
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
Stars: ✭ 22,892 (+3690.07%)
Mutual labels:  human-pose-estimation, pose-estimation
Multiposenet.pytorch
pytorch implementation of MultiPoseNet (ECCV 2018, Muhammed Kocabas et al.)
Stars: ✭ 191 (-68.38%)
Mutual labels:  human-pose-estimation, pose-estimation
Alphapose
Real-Time and Accurate Full-Body Multi-Person Pose Estimation&Tracking System
Stars: ✭ 5,697 (+843.21%)
Mutual labels:  human-pose-estimation, pose-estimation
Pytorch Pose
A PyTorch toolkit for 2D Human Pose Estimation.
Stars: ✭ 932 (+54.3%)
Mutual labels:  human-pose-estimation, pose-estimation
rmpe dataset server
Realtime Multi-Person Pose Estimation data server. Used as a training and validation data provider in training process.
Stars: ✭ 14 (-97.68%)
Mutual labels:  human-pose-estimation, pose-estimation
posture recognition
Posture recognition based on common camera
Stars: ✭ 91 (-84.93%)
Mutual labels:  human-pose-estimation, pose-estimation
Awesome Human Pose Estimation
A collection of awesome resources in Human Pose estimation.
Stars: ✭ 2,022 (+234.77%)
Mutual labels:  human-pose-estimation, pose-estimation
Tf Pose Estimation
Deep Pose Estimation implemented using Tensorflow with Custom Architectures for fast inference.
Stars: ✭ 3,856 (+538.41%)
Mutual labels:  human-pose-estimation, pose-estimation
Ai Basketball Analysis
🏀🤖🏀 AI web app and API to analyze basketball shots and shooting pose.
Stars: ✭ 582 (-3.64%)
Mutual labels:  yolo, pose-estimation
Pytorch Human Pose Estimation
Implementation of various human pose estimation models in pytorch on multiple datasets (MPII & COCO) along with pretrained models
Stars: ✭ 346 (-42.72%)
Mutual labels:  human-pose-estimation, pose-estimation
Openpifpaf
Official implementation of "OpenPifPaf: Composite Fields for Semantic Keypoint Detection and Spatio-Temporal Association" in PyTorch.
Stars: ✭ 662 (+9.6%)
Mutual labels:  human-pose-estimation, pose-estimation
Pose Residual Network Pytorch
Code for the Pose Residual Network introduced in 'MultiPoseNet: Fast Multi-Person Pose Estimation using Pose Residual Network' paper https://arxiv.org/abs/1807.04067
Stars: ✭ 277 (-54.14%)
Mutual labels:  human-pose-estimation, pose-estimation
Poseestimationformobile
💃 Real-time single person pose estimation for Android and iOS.
Stars: ✭ 783 (+29.64%)
Mutual labels:  human-pose-estimation, pose-estimation
ICON
ICON: Implicit Clothed humans Obtained from Normals (CVPR 2022)
Stars: ✭ 641 (+6.13%)
Mutual labels:  human-pose-estimation, pose-estimation
openpifpaf
Official implementation of "OpenPifPaf: Composite Fields for Semantic Keypoint Detection and Spatio-Temporal Association" in PyTorch.
Stars: ✭ 900 (+49.01%)
Mutual labels:  human-pose-estimation, pose-estimation
Gccpm Look Into Person Cvpr19.pytorch
Fast and accurate single-person pose estimation, ranked 10th at CVPR'19 LIP challenge. Contains implementation of "Global Context for Convolutional Pose Machines" paper.
Stars: ✭ 137 (-77.32%)
Mutual labels:  human-pose-estimation, pose-estimation
Monoloco
[ICCV 2019] Official implementation of "MonoLoco: Monocular 3D Pedestrian Localization and Uncertainty Estimation" in PyTorch + Social Distancing
Stars: ✭ 242 (-59.93%)
Mutual labels:  human-pose-estimation, pose-estimation

KAPAO (Keypoints and Poses as Objects)

Accepted to ECCV 2022

KAPAO is an efficient single-stage multi-person human pose estimation method that models keypoints and poses as objects within a dense anchor-based detection framework. KAPAO simultaneously detects pose objects and keypoint objects and fuses the detections to predict human poses:

alt text

When not using test-time augmentation (TTA), KAPAO is much faster and more accurate than previous single-stage methods like DEKR, HigherHRNet, HigherHRNet + SWAHR, and CenterGroup:

alt text

This repository contains the official PyTorch implementation for the paper:
Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation.

Our code was forked from ultralytics/yolov5 at commit 5487451.

Setup

  1. If you haven't already, install Anaconda or Miniconda.
  2. Create a new conda environment with Python 3.6: $ conda create -n kapao python=3.6.
  3. Activate the environment: $ conda activate kapao
  4. Clone this repo: $ git clone https://github.com/wmcnally/kapao.git
  5. Install the dependencies: $ cd kapao && pip install -r requirements.txt
  6. Download the trained models: $ python data/scripts/download_models.py

Inference Demos

Note: FPS calculations include all processing (i.e., including image loading, resizing, inference, plotting / tracking, etc.). See script arguments for inference options.


Static Image

To generate the four images in the GIF above:

  1. $ python demos/image.py --bbox
  2. $ python demos/image.py --bbox --pose --face --no-kp-dets
  3. $ python demos/image.py --bbox --pose --face --no-kp-dets --kp-bbox
  4. $ python demos/image.py --pose --face

Shuffling Video

KAPAO runs fastest on low resolution video with few people in the frame. This demo runs KAPAO-S on a single-person 480p dance video using an input size of 1024. The inference speed is ~9.5 FPS on our CPU, and ~60 FPS on our TITAN Xp.

CPU inference:
alt text

To display the results in real-time:
$ python demos/video.py --face --display

To create the GIF above:
$ python demos/video.py --face --device cpu --gif


Flash Mob Video

This demo runs KAPAO-S on a 720p flash mob video using an input size of 1280.

GPU inference:
alt text

To display the results in real-time:
$ python demos/video.py --yt-id 2DiQUX11YaY --tag 136 --imgsz 1280 --color 255 0 255 --start 188 --end 196 --display

To create the GIF above:
$ python demos/video.py --yt-id 2DiQUX11YaY --tag 136 --imgsz 1280 --color 255 0 255 --start 188 --end 196 --gif


Red Light Green Light

This demo runs KAPAO-L on a 480p clip from the TV show Squid Game using an input size of 1024. The plotted poses constitute keypoint objects only.

GPU inference:
alt text

To display the results in real-time:
$ python demos/video.py --yt-id nrchfeybHmw --imgsz 1024 --weights kapao_l_coco.pt --conf-thres-kp 0.01 --kp-obj --face --start 56 --end 72 --display

To create the GIF above:
$ python demos/video.py --yt-id nrchfeybHmw --imgsz 1024 --weights kapao_l_coco.pt --conf-thres-kp 0.01 --kp-obj --face --start 56 --end 72 --gif


Squash Video

This demo runs KAPAO-S on a 1080p slow motion squash video. It uses a simple player tracking algorithm based on the frame-to-frame pose differences.

GPU inference:
alt text

To display the inference results in real-time:
$ python demos/squash.py --display --fps

To create the GIF above:
$ python demos/squash.py --start 42 --end 50 --gif --fps


Depth Video

Pose objects generalize well and can even be detected in depth video. Here KAPAO-S was run on a depth video from a fencing action recognition dataset.

alt text

The depth video above can be downloaded directly from here. To create the GIF above:
$ python demos/video.py -p 2016-01-04_21-33-35_Depth.avi --face --start 0 --end -1 --gif --gif-size 480 360


Web Demo

A web demo was integrated to Huggingface Spaces with Gradio (credit to @AK391). It uses KAPAO-S to run CPU inference on short video clips.

COCO Experiments

Download the COCO dataset: $ sh data/scripts/get_coco_kp.sh

Validation (without TTA)

  • KAPAO-S (63.0 AP): $ python val.py --rect
  • KAPAO-M (68.5 AP): $ python val.py --rect --weights kapao_m_coco.pt
  • KAPAO-L (70.6 AP): $ python val.py --rect --weights kapao_l_coco.pt

Validation (with TTA)

  • KAPAO-S (64.3 AP): $ python val.py --scales 0.8 1 1.2 --flips -1 3 -1
  • KAPAO-M (69.6 AP): $ python val.py --weights kapao_m_coco.pt \
    --scales 0.8 1 1.2 --flips -1 3 -1
  • KAPAO-L (71.6 AP): $ python val.py --weights kapao_l_coco.pt \
    --scales 0.8 1 1.2 --flips -1 3 -1

Testing

  • KAPAO-S (63.8 AP): $ python val.py --scales 0.8 1 1.2 --flips -1 3 -1 --task test
  • KAPAO-M (68.8 AP): $ python val.py --weights kapao_m_coco.pt \
    --scales 0.8 1 1.2 --flips -1 3 -1 --task test
  • KAPAO-L (70.3 AP): $ python val.py --weights kapao_l_coco.pt \
    --scales 0.8 1 1.2 --flips -1 3 -1 --task test

Training

The following commands were used to train the KAPAO models on 4 V100s with 32GB memory each.

KAPAO-S:

python -m torch.distributed.launch --nproc_per_node 4 train.py \
--img 1280 \
--batch 128 \
--epochs 500 \
--data data/coco-kp.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5s6.pt \
--project runs/s_e500 \
--name train \
--workers 128

KAPAO-M:

python train.py \
--img 1280 \
--batch 72 \
--epochs 500 \
--data data/coco-kp.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5m6.pt \
--project runs/m_e500 \
--name train \
--workers 128

KAPAO-L:

python train.py \
--img 1280 \
--batch 48 \
--epochs 500 \
--data data/coco-kp.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5l6.pt \
--project runs/l_e500 \
--name train \
--workers 128

Note: DDP is usually recommended but we found training was less stable for KAPAO-M/L using DDP. We are investigating this issue.

CrowdPose Experiments

  • Install the CrowdPose API to your conda environment:
    $ cd .. && git clone https://github.com/Jeff-sjtu/CrowdPose.git
    $ cd CrowdPose/crowdpose-api/PythonAPI && sh install.sh && cd ../../../kapao
  • Download the CrowdPose dataset: $ sh data/scripts/get_crowdpose.sh

Testing

  • KAPAO-S (63.8 AP): $ python val.py --data crowdpose.yaml \
    --weights kapao_s_crowdpose.pt --scales 0.8 1 1.2 --flips -1 3 -1
  • KAPAO-M (67.1 AP): $ python val.py --data crowdpose.yaml \
    --weights kapao_m_crowdpose.pt --scales 0.8 1 1.2 --flips -1 3 -1
  • KAPAO-L (68.9 AP): $ python val.py --data crowdpose.yaml \
    --weights kapao_l_crowdpose.pt --scales 0.8 1 1.2 --flips -1 3 -1

Training

The following commands were used to train the KAPAO models on 4 V100s with 32GB memory each. Training was performed on the trainval split with no validation. The test results above were generated using the last model checkpoint.

KAPAO-S:

python -m torch.distributed.launch --nproc_per_node 4 train.py \
--img 1280 \
--batch 128 \
--epochs 300 \
--data data/crowdpose.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5s6.pt \
--project runs/cp_s_e300 \
--name train \
--workers 128 \
--noval

KAPAO-M:

python train.py \
--img 1280 \
--batch 72 \
--epochs 300 \
--data data/crowdpose.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5m6.pt \
--project runs/cp_m_e300 \
--name train \
--workers 128 \
--noval

KAPAO-L:

python train.py \
--img 1280 \
--batch 48 \
--epochs 300 \
--data data/crowdpose.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5l6.pt \
--project runs/cp_l_e300 \
--name train \
--workers 128 \
--noval

Acknowledgements

This work was supported in part by Compute Canada, the Canada Research Chairs Program, the Natural Sciences and Engineering Research Council of Canada, a Microsoft Azure Grant, and an NVIDIA Hardware Grant.

If you find this repo is helpful in your research, please cite our paper:

@article{mcnally2021kapao,
  title={Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation},
  author={McNally, William and Vats, Kanav and Wong, Alexander and McPhee, John},
  journal={arXiv preprint arXiv:2111.08557},
  year={2021}
}

Please also consider citing our previous works:

@inproceedings{mcnally2021deepdarts,
  title={DeepDarts: Modeling Keypoints as Objects for Automatic Scorekeeping in Darts using a Single Camera},
  author={McNally, William and Walters, Pascale and Vats, Kanav and Wong, Alexander and McPhee, John},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4547--4556},
  year={2021}
}

@article{mcnally2021evopose2d,
  title={EvoPose2D: Pushing the Boundaries of 2D Human Pose Estimation Using Accelerated Neuroevolution With Weight Transfer},
  author={McNally, William and Vats, Kanav and Wong, Alexander and McPhee, John},
  journal={IEEE Access},
  volume={9},
  pages={139403--139414},
  year={2021},
  publisher={IEEE}
}
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].