
dragonfly90 / mxnet_Realtime_Multi-Person_Pose_Estimation

This is an MXNet version of Realtime_Multi-Person_Pose_Estimation; the original code is here: https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation


Reimplementation of human keypoint detection in MXNet

  1. Download the MXNet model and parameters (COCO and MPII) from Google Drive:

    https://drive.google.com/drive/folders/0BzffphMuhDDMV0RZVGhtQWlmS1U

    or check the caffe_to_mxnet folder to download the original Caffe model and convert it to an MXNet model.

    Build the heatmap and PAF map Cython extensions: cython/rebuild.sh

  2. Test the demo with the COCO model: testModel.ipynb (see the loading/inference sketch after this list).

  3. Test the demo with the MPII model: testModel_mpi.ipynb

  4. Train with VGG model warm-up. You can download the MXNet model and parameters for VGG19 from here:

    python TrainWeightOnVgg.py
    

    Or train from CMU's converted model:

    python TrainWeight.py 
    
  5. Check that the heat maps, part affinity graph maps, and masks are generated correctly during training: test_generateLabel.ipynb (see the label-generation sketch after this list).

  6. Evaluate on the COCO validation dataset with the transferred MXNet model: evaluation_coco.py
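
A minimal sketch of loading the converted checkpoint and running a forward pass, as used by the test demo in step 2. The checkpoint prefix, epoch number, and input size are assumptions; use the file names of the model you actually downloaded, and this assumes a deploy symbol whose outputs are the final heat maps and PAF maps.

    import mxnet as mx
    import numpy as np

    ctx = mx.cpu()  # or mx.gpu(0)
    # 'pose' and epoch 0 are placeholders for the downloaded checkpoint files
    sym, arg_params, aux_params = mx.model.load_checkpoint('pose', 0)

    mod = mx.mod.Module(symbol=sym, label_names=None, context=ctx)
    mod.bind(data_shapes=[('data', (1, 3, 368, 368))], for_training=False)
    mod.set_params(arg_params, aux_params, allow_missing=True)

    # stand-in for a preprocessed image batch (1 x 3 x H x W, float32)
    img = np.random.uniform(size=(1, 3, 368, 368)).astype('float32')
    mod.forward(mx.io.DataBatch(data=[mx.nd.array(img)], label=None), is_train=False)
    heatmaps_and_pafs = mod.get_outputs()  # last-stage heat maps and PAF maps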
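
For step 5, the heat-map labels checked in test_generateLabel.ipynb are, in the original approach, Gaussian peaks placed at each keypoint (the PAF maps and masks are built analogously). Below is a small pure-Python reference of that idea, with an assumed grid size and sigma; the Cython/C++ generators in this repo compute the same kind of labels much faster.

    import numpy as np

    def gaussian_heatmap(height, width, cx, cy, sigma=7.0):
        # Heat map of shape (height, width) with a Gaussian peak at keypoint (cx, cy).
        ys, xs = np.mgrid[0:height, 0:width]
        d2 = (xs - cx) ** 2 + (ys - cy) ** 2
        return np.exp(-d2 / (2.0 * sigma ** 2)).astype('float32')

    # one 46x46 heat-map channel for a keypoint at (20, 30); sizes and sigma are assumptions
    hm = gaussian_heatmap(46, 46, cx=20, cy=30, sigma=7.0)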

The result is as follows: the mean average precision (AP) over 10 OKS thresholds on the first 2644 images of the validation set is 0.550, compared with 0.577 for the original implementation.

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.550
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.800
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.610
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.541
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.576
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.591
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.812
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.644
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.549
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.651
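
These are the standard COCO keypoint metrics. Below is a hedged sketch of how they can be reproduced with pycocotools; the annotation and result file paths are assumptions, and evaluation_coco.py is the script that actually produces the predictions.

    from pycocotools.coco import COCO
    from pycocotools.cocoeval import COCOeval

    ann_file = 'annotations/person_keypoints_val2014.json'  # assumed path to ground truth
    res_file = 'results/keypoint_results.json'               # assumed output of evaluation_coco.py

    coco_gt = COCO(ann_file)
    coco_dt = coco_gt.loadRes(res_file)

    img_ids = sorted(coco_gt.getImgIds())[:2644]  # first 2644 validation images, as above

    coco_eval = COCOeval(coco_gt, coco_dt, iouType='keypoints')
    coco_eval.params.imgIds = img_ids
    coco_eval.evaluate()
    coco_eval.accumulate()
    coco_eval.summarize()  # prints the Average Precision / Average Recall lines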

Please cite the paper Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields:

@article{cao2016realtime,
  title={Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields},
  author={Zhe Cao and Tomas Simon and Shih-En Wei and Yaser Sheikh},
  journal={arXiv preprint arXiv:1611.08050},
  year={2016}
}

Original Caffe training code: https://github.com/CMU-Perceptual-Computing-Lab/caffe_rtpose

TODO:

  • [x] Test demo
  • [x] Train demo
  • [x] Add image augmentation: rotation, flip
  • [x] Add weight vector
  • [x] Train all images
  • [x] Train from vgg model
  • [x] Evaluation code
  • [x] Generate heat map and part affinity graph map in C++
  • [ ] Enhancement: feature pyramid backend in training, symbol and iterator in featurePyramidCPM.py

Training with VGG warm-up

python TrainWeightOnVgg.py

(1) Before the iterator bug was fixed (see (2) below): we tested the code using two K80 GPUs on the COCO dataset, with the batch size set to 10 and the learning rate set to 0.00004, and used the pretrained VGG model to initialize our parameters. After 20 epochs, we tested the model on the COCO validation dataset (only 50 images) and got an mAP of only 0.048, which is very low compared to the original implementation. Please reach out to us if you have ideas about this issue. (A minimal sketch of this training setup follows the results below.)

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.048
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.183
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.019
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.078
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.035
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.066
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.224
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.022
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.075
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.054
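
A minimal, self-contained sketch of the training setup described in (1), using MXNet's Module API with a toy network and random data so it runs as-is. The real symbol, data iterator, and optimizer live in TrainWeightOnVgg.py and are assumptions here; only the batch size and learning rate are taken from the report above.

    import mxnet as mx
    import numpy as np

    # Toy stand-in network; the real pose symbol is defined in the project code.
    data = mx.sym.Variable('data')
    label = mx.sym.Variable('softmax_label')
    net = mx.sym.Convolution(data, kernel=(3, 3), pad=(1, 1), num_filter=8)
    net = mx.sym.FullyConnected(mx.sym.Flatten(net), num_hidden=2)
    sym = mx.sym.SoftmaxOutput(net, label=label)

    # Random stand-in data; the real iterator yields images plus heat-map/PAF labels.
    x = np.random.uniform(size=(20, 3, 32, 32)).astype('float32')
    y = np.random.randint(0, 2, size=(20,)).astype('float32')
    train_iter = mx.io.NDArrayIter(x, y, batch_size=10)  # batch size 10, as in (1)

    mod = mx.mod.Module(symbol=sym, context=mx.cpu())  # the report used two K80s: [mx.gpu(0), mx.gpu(1)]
    mod.fit(train_iter,
            optimizer='sgd',                              # optimizer choice is an assumption
            optimizer_params={'learning_rate': 0.00004},  # learning rate from the report
            num_epoch=2)                                  # the report trained for 20 epochs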

(2) After fixing the iterator bug, no data augmentation

We tested the code using one TITAN X (Pascal) on the COCO dataset, with the batch size set to 10 and the learning rate set to 0.00004, and used the pretrained VGG model to initialize our parameters. After 4 epochs, we tested the model on the COCO validation dataset (only the first 50 images) and got an mAP of only 0.115; the original transferred model gets 0.530.

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.115
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.350
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.030
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.168
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.091
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.141
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.373
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.067
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.164
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.117

After 18 epochs, we tested the model on the COCO validation dataset (only the first 50 images) and got an mAP of 0.226.

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.226
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.434
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.201
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.254
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.226
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.250
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.440
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.239
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.252
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.261

After 23 epochs, we tested the model on the COCO validation dataset (only the first 50 images) and got an mAP of 0.231.

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.231
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.466
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.230
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.245
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.249
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.251
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.470
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.261
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.243
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.278

After 36 epochs, we tested the model on the COCO validation dataset (only the first 50 images) and got an mAP of 0.229.

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.229
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.442
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.218
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.233
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.260
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.257
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.455
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.269
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.232
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.302

(3) Batch size set to 10 and learning rate set to 0.00004, on a GTX 1080

First level

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.190
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.403
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.146
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.218
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.185
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.216
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.418
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.187
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.216
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.224

Six levels

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.258
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.478
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.251
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.280
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.268
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.284
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.493
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.291
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.280
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.307

The training process is not easy: I found that this model cannot even converge if all layers are initialized randomly. I suspect one reason is that the model uses many convolution layers with large kernels, whose large padding may introduce a lot of noise; another may be that the model uses MSE as the loss function, and it might be better to use a sigmoid activation on the last layer together with a cross-entropy loss instead.
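
A hedged sketch of the two loss choices discussed above, written with MXNet's symbol API; the variable names are illustrative, and the actual network wiring lives in the project's symbol files.

    import mxnet as mx

    pred = mx.sym.Variable('heatmap_pred')    # last-stage heat-map prediction (illustrative)
    label = mx.sym.Variable('heatmap_label')  # ground-truth heat maps (illustrative)

    # Current approach: regress the heat-map values directly with an L2 (MSE) loss.
    mse_loss = mx.sym.LinearRegressionOutput(data=pred, label=label, name='mse_loss')

    # Suggested alternative: squash the last layer with a sigmoid and train with a
    # cross-entropy style objective instead.
    ce_loss = mx.sym.LogisticRegressionOutput(data=pred, label=label, name='ce_loss')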

Other implementations

  • Original Caffe training model
  • Original data preparation and demo
  • PyTorch
  • Keras
