ASDA: Adversarial Semantic Data Augmentation for Human Pose Estimation

Introduction

This is an official PyTorch implementation of Adversarial Semantic Data Augmentation for Human Pose Estimation (ECCV 2020). The code is based on the official PyTorch implementation of HRNet.

Environment

Python 3.7
torch==1.0.1.post2
torchvision==0.2.2

EasyDict==1.7
opencv-python==3.4.1.15
shapely==1.6.4
Cython
scipy
pandas
pyyaml
json_tricks
scikit-image
yacs>=0.1.5
tensorboardX==1.6

Quick start

  1. Install the dependencies listed in the Environment section (the repository ships a requirements.txt):

pip install -r requirements.txt

  2. Make libs:

cd ${POSE_ROOT}/lib
make
  3. Install COCOAPI:
# COCOAPI=/path/to/clone/cocoapi
git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
cd $COCOAPI/PythonAPI
# Install into global site-packages
make install
# Alternatively, if you do not have permissions or prefer
# not to install the COCO API into global site-packages
python3 setup.py install --user

Note that instructions like # COCOAPI=/path/to/clone/cocoapi indicate that you should pick a path where you would like the software cloned, and then set an environment variable (COCOAPI in this case) accordingly.
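
Once COCOAPI is installed, a quick smoke test (a minimal sketch, not part of the repository) confirms the package is importable:

# Verify the COCO API installed correctly; this raises ImportError otherwise.
from pycocotools.coco import COCO
print('pycocotools imported OK')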

  4. Initialize the output (trained-model output) and log (TensorBoard log) directories:
mkdir output 
mkdir log

Your directory tree should look like this:

${POSE_ROOT}
├── data
├── experiments
├── lib
├── log
├── models
├── output
├── tools 
├── README.md
└── requirements.txt
  5. Download pretrained models from the HRNet model zoo (GoogleDrive or OneDrive) and arrange them as follows:
${POSE_ROOT}
 |-- models
     |-- pytorch
         |-- imagenet
         |   |-- hrnet_w32-36af842e.pth
         |   |-- hrnet_w48-8ef0771d.pth
         |   |-- resnet50-19c8e357.pth
         |   |-- resnet101-5d3b4d8f.pth
         |   |-- resnet152-b121ed2d.pth
         |-- pose_coco
         |   |-- pose_hrnet_w32_256x192.pth
         |   |-- pose_hrnet_w32_384x288.pth
         |   |-- pose_hrnet_w48_256x192.pth
         |   |-- pose_hrnet_w48_384x288.pth
         |   |-- pose_resnet_101_256x192.pth
         |   |-- pose_resnet_101_384x288.pth
         |   |-- pose_resnet_152_256x192.pth
         |   |-- pose_resnet_152_384x288.pth
         |   |-- pose_resnet_50_256x192.pth
         |   |-- pose_resnet_50_384x288.pth
         |-- pose_mpii
             |-- pose_hrnet_w32_256x256.pth
             |-- pose_hrnet_w48_256x256.pth
             |-- pose_resnet_101_256x256.pth
             |-- pose_resnet_152_256x256.pth
             |-- pose_resnet_50_256x256.pth
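
A small sanity check that a downloaded checkpoint loads (a sketch; that these files are plain state dicts is my assumption):

import torch

# Load one of the ImageNet-pretrained checkpoints on CPU, from ${POSE_ROOT}.
state = torch.load('models/pytorch/imagenet/hrnet_w32-36af842e.pth', map_location='cpu')
print(type(state), len(state))  # expect a dict-like object of parameter tensors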

Data preparation

For MPII data, please download from the MPII Human Pose Dataset website. The original annotation files are in MATLAB format; we use the JSON-format annotations from HRNet, which you can download from OneDrive or GoogleDrive. Extract them under ${POSE_ROOT}/data so that the layout looks like this:

${POSE_ROOT}
|-- data
`-- |-- mpii
    `-- |-- annot
        |   |-- gt_valid.mat
        |   |-- test.json
        |   |-- train.json
        |   |-- trainval.json
        |   `-- valid.json
        `-- images
            |-- 000001163.jpg
            |-- 000003072.jpg

For COCO data, please download from the COCO download page; 2017 Train/Val is needed for COCO keypoints training and validation. We use the person detection results provided by HRNet to reproduce our multi-person pose estimation results; you can download them from OneDrive or GoogleDrive. Download and extract everything under ${POSE_ROOT}/data so that the layout looks like this:

${POSE_ROOT}
|-- data
`-- |-- coco
    `-- |-- annotations
        |   |-- person_keypoints_train2017.json
        |   `-- person_keypoints_val2017.json
        |-- person_detection_results
        |   |-- COCO_val2017_detections_AP_H_56_person.json
        |   |-- COCO_test-dev2017_detections_AP_H_609_person.json
        `-- images
            |-- train2017
            |   |-- 000000000009.jpg
            |   |-- 000000000025.jpg
            |   |-- 000000000030.jpg
            |   |-- ... 
            `-- val2017
                |-- 000000000139.jpg
                |-- 000000000285.jpg
                |-- 000000000632.jpg
                |-- ... 
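
A minimal sketch (assuming ${POSE_ROOT} is the current working directory) to sanity-check that both dataset layouts above are in place:

import os

# Representative files and directories from the trees above.
expected = [
    'data/mpii/annot/valid.json',
    'data/mpii/images',
    'data/coco/annotations/person_keypoints_train2017.json',
    'data/coco/person_detection_results/COCO_val2017_detections_AP_H_56_person.json',
    'data/coco/images/train2017',
    'data/coco/images/val2017',
]
for path in expected:
    print('ok      ' if os.path.exists(path) else 'MISSING ', path)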

Training and Testing

Training on MPII dataset

python tools/train_stn.py \
    --cfg experiments/mpii/hrnet/stn/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart1_bmp.yaml

Testing on MPII dataset

python tools/test.py \
    --cfg experiments/mpii/hrnet/stn/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart1_bmp.yaml \
    TEST.MODEL_FILE path/to/res_dir/model_best.pth

Multi-scale voting test

python tools/test_multiscale_multistage_voting.py \
    --cfg experiments/mpii/hrnet/stn/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart1_bmp.yaml \
    TEST.MODEL_FILE path/to/res_dir/model_best.pth

The parameters num_stage_to_fuse and scale_pyramid can be adjusted in this script to fuse predictions across different scales and stages. To avoid recomputing predictions, the script caches its results in a file named test_preds_pyramid_mstage.npy under ${POSE_ROOT}, holding the predictions for all num_scale scales and num_stage stages. Before running multi-scale testing with a different model, delete this cache file first; see the sketch below.
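
A minimal sketch of inspecting and clearing the cache (the array layout is my assumption, not confirmed by the source):

import os
import numpy as np

cache = 'test_preds_pyramid_mstage.npy'  # written under ${POSE_ROOT}

if os.path.exists(cache):
    preds = np.load(cache, allow_pickle=True)
    print('cached prediction array shape:', getattr(preds, 'shape', None))
    os.remove(cache)  # clear before testing a different model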

Training on COCO dataset

python tools/train_stn.py \
    --cfg experiments/coco/hrnet/w32_256x192_adam_lr1e-3_numPart1_bmp_notBothHfpSa.yaml

Testing on COCO dataset

python tools/test.py \
    --cfg experiments/coco/hrnet/w32_256x192_adam_lr1e-3_numPart1_bmp_notBothHfpSa.yaml \
    TEST.MODEL_FILE path/to/res_dir/model_best.pth

Visualizing augmented image

python tools/visualize_stn.py \
    --cfg experiments/mpii/hrnet/stn/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart1_bmp.yaml

Results will be saved in path/to/output_dir/stn_vis/; a quick way to inspect them is sketched below.
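
A sketch for browsing a few of the saved visualizations (matplotlib is not in the dependency list above, and the .jpg extension is an assumption):

import glob
import matplotlib.pyplot as plt

# Show the first few augmentation visualizations.
paths = sorted(glob.glob('path/to/output_dir/stn_vis/*.jpg'))
for path in paths[:4]:
    plt.figure()
    plt.imshow(plt.imread(path))
    plt.axis('off')
    plt.title(path)
plt.show()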

Important Parameters

Name               Type   Optimal
ASA.NUM_AUG        tuple  (1,)
ASA.PART_ANN_FILE  str    './lip/parts_bmp_filter_done/part_anns.json'
ASA.PART_ROOT_DIR  str    './lip/parts_bmp_filter_done/'
ASA.BOTH_HF_SA     bool   False
STN.LR             float  0.001
STN.STN_FIRST      bool   False
STN.NG             int    1
STN.ND             int    1
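
These parameters live in the experiment YAML files and are parsed with yacs; extra KEY VALUE pairs on the command line (like TEST.MODEL_FILE above) override them. A minimal sketch of how such a config could be defined and overridden (the repository's actual defaults may differ):

from yacs.config import CfgNode as CN

cfg = CN()
cfg.ASA = CN()
cfg.ASA.NUM_AUG = (1,)
cfg.ASA.PART_ANN_FILE = './lip/parts_bmp_filter_done/part_anns.json'
cfg.ASA.PART_ROOT_DIR = './lip/parts_bmp_filter_done/'
cfg.ASA.BOTH_HF_SA = False
cfg.STN = CN()
cfg.STN.LR = 0.001
cfg.STN.STN_FIRST = False
cfg.STN.NG = 1
cfg.STN.ND = 1

# Command-line overrides map onto merge_from_list, e.g.:
cfg.merge_from_list(['STN.LR', '0.0005'])
print(cfg.STN.LR)  # -> 0.0005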

Model Zoo

MPII dataset

Name                            Path
HRNet-w32 Num_Parts1            output/mpii/pose_hrnet2/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart1_nohfp_bmp/model_best.pth
HRNet-w32 Num_Parts2            output/mpii/pose_hrnet2/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart2_nohfp_bmp/model_best.pth
HRNet-w32 Num_Parts3            output/mpii/pose_hrnet2/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart3_nohfp_bmp/model_best.pth
HRNet-w32 Num_Parts4            output/mpii/pose_hrnet2/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart4_nohfp_bmp/model_best.pth
HRNet-w32 Num_Parts6            output/mpii/pose_hrnet2/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart6_nohfp_bmp/model_best.pth
HRNet-w32 Num_Parts8            output/mpii/pose_hrnet2/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart8_nohfp_bmp/model_best.pth

Name                            Path
2-stacked hourglass Num_Parts1  output/mpii/pose_hourglass/stack2_256x256_c256_adam_lr1e-3_stn_res18_numPart1_nohfp_bmp/model_best.pth
8-stacked hourglass Num_Parts1  output/mpii/pose_hourglass/stack8_256x256_c256_adam_lr1e-3_stn_res18_numPart1_nohfp_bmp/model_best.pth
HRNet-w32 Num_Parts1            output/mpii/pose_hrnet2/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart1_nohfp_bmp/model_best.pth
HRNet-w48 Num_Parts1            output/mpii/pose_hrnet2/w48_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart1_nohfp_bmp/model_best.pth
SIM50 Num_Parts4                output/mpii/pose_resnet/res50_256x256_d256x3_adam_lr1e-3_stn_res18_numPart4/model_best.pth
SIM101 Num_Parts1               output/mpii/pose_resnet/res101_256x256_d256x3_adam_lr1e-3_stn_res18_numPart1/model_best.pth

Citation

If you use our code or models in your research, please cite:

@inproceedings{bin2020adversarial,  
  title={Adversarial semantic data augmentation for human pose estimation},  
  author={Bin, Yanrui and Cao, Xuan and Chen, Xinya and Ge, Yanhao and Tai, Ying and Wang, Chengjie and Li, Jilin and Huang, Feiyue and Gao, Changxin and Sang, Nong},  
  booktitle={European Conference on Computer Vision},  
  pages={606--622},  
  year={2020},  
  organization={Springer}  
}