ASDA: Adversarial Semantic Data Augmentation for Human Pose Estimation

Introduction

This is an official PyTorch implementation of Adversarial Semantic Data Augmentation for Human Pose Estimation (ECCV 2020). The code is based on the official PyTorch implementation of HRNet.

Environment

Python 3.7
torch==1.0.1.post2
torchvision==0.2.2

EasyDict==1.7
opencv-python==3.4.1.15
shapely==1.6.4
Cython
scipy
pandas
pyyaml
json_tricks
scikit-image
yacs>=0.1.5
tensorboardX==1.6

Quick start

  1. Install the dependencies listed in the Environment section (the repository ships a requirements.txt):

pip install -r requirements.txt

  2. Make libs:

cd ${POSE_ROOT}/lib
make
  3. Install COCOAPI:
# COCOAPI=/path/to/clone/cocoapi
git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
cd $COCOAPI/PythonAPI
# Install into global site-packages
make install
# Alternatively, if you do not have permissions or prefer
# not to install the COCO API into global site-packages
python3 setup.py install --user

Note that instructions like # COCOAPI=/path/to/clone/cocoapi indicate that you should pick a path where you would like the software cloned, and then set an environment variable (COCOAPI in this case) accordingly.
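
Once COCOAPI is installed, a quick smoke test (a minimal sketch, not part of the repository) confirms the package is importable:

# Verify the COCO API installed correctly; this raises ImportError otherwise.
from pycocotools.coco import COCO
print('pycocotools imported OK')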

  4. Initialize the output (trained-model output) and log (TensorBoard log) directories:
mkdir output 
mkdir log

Your directory tree should look like this:

${POSE_ROOT}
├── data
├── experiments
├── lib
├── log
├── models
├── output
├── tools 
├── README.md
└── requirements.txt
  5. Download pretrained models from the HRNet model zoo (GoogleDrive or OneDrive) and arrange them as follows:
${POSE_ROOT}
 |-- models
     |-- pytorch
         |-- imagenet
         |   |-- hrnet_w32-36af842e.pth
         |   |-- hrnet_w48-8ef0771d.pth
         |   |-- resnet50-19c8e357.pth
         |   |-- resnet101-5d3b4d8f.pth
         |   |-- resnet152-b121ed2d.pth
         |-- pose_coco
         |   |-- pose_hrnet_w32_256x192.pth
         |   |-- pose_hrnet_w32_384x288.pth
         |   |-- pose_hrnet_w48_256x192.pth
         |   |-- pose_hrnet_w48_384x288.pth
         |   |-- pose_resnet_101_256x192.pth
         |   |-- pose_resnet_101_384x288.pth
         |   |-- pose_resnet_152_256x192.pth
         |   |-- pose_resnet_152_384x288.pth
         |   |-- pose_resnet_50_256x192.pth
         |   |-- pose_resnet_50_384x288.pth
         |-- pose_mpii
             |-- pose_hrnet_w32_256x256.pth
             |-- pose_hrnet_w48_256x256.pth
             |-- pose_resnet_101_256x256.pth
             |-- pose_resnet_152_256x256.pth
             |-- pose_resnet_50_256x256.pth
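
A small sanity check that a downloaded checkpoint loads (a sketch; that these files are plain state dicts is my assumption):

import torch

# Load one of the ImageNet-pretrained checkpoints on CPU, from ${POSE_ROOT}.
state = torch.load('models/pytorch/imagenet/hrnet_w32-36af842e.pth', map_location='cpu')
print(type(state), len(state))  # expect a dict-like object of parameter tensors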

Data preparation

For MPII data, please download from the MPII Human Pose Dataset website. The original annotation files are in MATLAB format; we use the JSON-format annotations from HRNet, which you can download from OneDrive or GoogleDrive. Extract them under ${POSE_ROOT}/data so that the layout looks like this:

${POSE_ROOT}
|-- data
`-- |-- mpii
    `-- |-- annot
        |   |-- gt_valid.mat
        |   |-- test.json
        |   |-- train.json
        |   |-- trainval.json
        |   `-- valid.json
        `-- images
            |-- 000001163.jpg
            |-- 000003072.jpg

For COCO data, please download from the COCO download page; 2017 Train/Val is needed for COCO keypoints training and validation. We use the person detection results provided by HRNet to reproduce our multi-person pose estimation results; you can download them from OneDrive or GoogleDrive. Download and extract everything under ${POSE_ROOT}/data so that the layout looks like this:

${POSE_ROOT}
|-- data
`-- |-- coco
    `-- |-- annotations
        |   |-- person_keypoints_train2017.json
        |   `-- person_keypoints_val2017.json
        |-- person_detection_results
        |   |-- COCO_val2017_detections_AP_H_56_person.json
        |   |-- COCO_test-dev2017_detections_AP_H_609_person.json
        `-- images
            |-- train2017
            |   |-- 000000000009.jpg
            |   |-- 000000000025.jpg
            |   |-- 000000000030.jpg
            |   |-- ... 
            `-- val2017
                |-- 000000000139.jpg
                |-- 000000000285.jpg
                |-- 000000000632.jpg
                |-- ... 
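
A minimal sketch (assuming ${POSE_ROOT} is the current working directory) to sanity-check that both dataset layouts above are in place:

import os

# Representative files and directories from the trees above.
expected = [
    'data/mpii/annot/valid.json',
    'data/mpii/images',
    'data/coco/annotations/person_keypoints_train2017.json',
    'data/coco/person_detection_results/COCO_val2017_detections_AP_H_56_person.json',
    'data/coco/images/train2017',
    'data/coco/images/val2017',
]
for path in expected:
    print('ok      ' if os.path.exists(path) else 'MISSING ', path)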

Training and Testing

Training on MPII dataset

python tools/train_stn.py \
    --cfg experiments/mpii/hrnet/stn/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart1_bmp.yaml

Testing on MPII dataset

python tools/test.py \
    --cfg experiments/mpii/hrnet/stn/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart1_bmp.yaml \
    TEST.MODEL_FILE path/to/res_dir/model_best.pth

Multi-scale voting test

python tools/test_multiscale_multistage_voting.py \
    --cfg experiments/mpii/hrnet/stn/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart1_bmp.yaml \
    TEST.MODEL_FILE path/to/res_dir/model_best.pth

The parameters num_stage_to_fuse and scale_pyramid can be adjusted in this script to fuse predictions across different scales and stages. To avoid recomputing predictions, the script caches its results in a file named test_preds_pyramid_mstage.npy under ${POSE_ROOT}, holding the predictions for all num_scale scales and num_stage stages. Before running multi-scale testing with a different model, delete this cache file first; see the sketch below.
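
A minimal sketch of inspecting and clearing the cache (the array layout is my assumption, not confirmed by the source):

import os
import numpy as np

cache = 'test_preds_pyramid_mstage.npy'  # written under ${POSE_ROOT}

if os.path.exists(cache):
    preds = np.load(cache, allow_pickle=True)
    print('cached prediction array shape:', getattr(preds, 'shape', None))
    os.remove(cache)  # clear before testing a different model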

Training on COCO dataset

python tools/train_stn.py \
    --cfg experiments/coco/hrnet/w32_256x192_adam_lr1e-3_numPart1_bmp_notBothHfpSa.yaml

Testing on COCO dataset

python tools/test.py \
    --cfg experiments/coco/hrnet/w32_256x192_adam_lr1e-3_numPart1_bmp_notBothHfpSa.yaml \
    TEST.MODEL_FILE path/to/res_dir/model_best.pth

Visualizing augmented image

python tools/visualize_stn.py \
    --cfg experiments/mpii/hrnet/stn/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart1_bmp.yaml

Results will be saved in path/to/output_dir/stn_vis/; a quick way to inspect them is sketched below.
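
A sketch for browsing a few of the saved visualizations (matplotlib is not in the dependency list above, and the .jpg extension is an assumption):

import glob
import matplotlib.pyplot as plt

# Show the first few augmentation visualizations.
paths = sorted(glob.glob('path/to/output_dir/stn_vis/*.jpg'))
for path in paths[:4]:
    plt.figure()
    plt.imshow(plt.imread(path))
    plt.axis('off')
    plt.title(path)
plt.show()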

Important Parameters

Name               Type   Optimal
ASA.NUM_AUG        tuple  (1,)
ASA.PART_ANN_FILE  str    './lip/parts_bmp_filter_done/part_anns.json'
ASA.PART_ROOT_DIR  str    './lip/parts_bmp_filter_done/'
ASA.BOTH_HF_SA     bool   False
STN.LR             float  0.001
STN.STN_FIRST      bool   False
STN.NG             int    1
STN.ND             int    1
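
These parameters live in the experiment YAML files and are parsed with yacs; extra KEY VALUE pairs on the command line (like TEST.MODEL_FILE above) override them. A minimal sketch of how such a config could be defined and overridden (the repository's actual defaults may differ):

from yacs.config import CfgNode as CN

cfg = CN()
cfg.ASA = CN()
cfg.ASA.NUM_AUG = (1,)
cfg.ASA.PART_ANN_FILE = './lip/parts_bmp_filter_done/part_anns.json'
cfg.ASA.PART_ROOT_DIR = './lip/parts_bmp_filter_done/'
cfg.ASA.BOTH_HF_SA = False
cfg.STN = CN()
cfg.STN.LR = 0.001
cfg.STN.STN_FIRST = False
cfg.STN.NG = 1
cfg.STN.ND = 1

# Command-line overrides map onto merge_from_list, e.g.:
cfg.merge_from_list(['STN.LR', '0.0005'])
print(cfg.STN.LR)  # -> 0.0005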

Model Zoo

MPII dataset

Name                            Path
HRNet-w32 Num_Parts1            output/mpii/pose_hrnet2/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart1_nohfp_bmp/model_best.pth
HRNet-w32 Num_Parts2            output/mpii/pose_hrnet2/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart2_nohfp_bmp/model_best.pth
HRNet-w32 Num_Parts3            output/mpii/pose_hrnet2/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart3_nohfp_bmp/model_best.pth
HRNet-w32 Num_Parts4            output/mpii/pose_hrnet2/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart4_nohfp_bmp/model_best.pth
HRNet-w32 Num_Parts6            output/mpii/pose_hrnet2/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart6_nohfp_bmp/model_best.pth
HRNet-w32 Num_Parts8            output/mpii/pose_hrnet2/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart8_nohfp_bmp/model_best.pth

Name                            Path
2-stacked hourglass Num_Parts1  output/mpii/pose_hourglass/stack2_256x256_c256_adam_lr1e-3_stn_res18_numPart1_nohfp_bmp/model_best.pth
8-stacked hourglass Num_Parts1  output/mpii/pose_hourglass/stack8_256x256_c256_adam_lr1e-3_stn_res18_numPart1_nohfp_bmp/model_best.pth
HRNet-w32 Num_Parts1            output/mpii/pose_hrnet2/w32_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart1_nohfp_bmp/model_best.pth
HRNet-w48 Num_Parts1            output/mpii/pose_hrnet2/w48_256x256_adam_lr1e-3_adversalstnlr0001_posefirst_res18_numPart1_nohfp_bmp/model_best.pth
SIM50 Num_Parts4                output/mpii/pose_resnet/res50_256x256_d256x3_adam_lr1e-3_stn_res18_numPart4/model_best.pth
SIM101 Num_Parts1               output/mpii/pose_resnet/res101_256x256_d256x3_adam_lr1e-3_stn_res18_numPart1/model_best.pth

Citation

If you use our code or models in your research, please cite:

@inproceedings{bin2020adversarial,  
  title={Adversarial semantic data augmentation for human pose estimation},  
  author={Bin, Yanrui and Cao, Xuan and Chen, Xinya and Ge, Yanhao and Tai, Ying and Wang, Chengjie and Li, Jilin and Huang, Feiyue and Gao, Changxin and Sang, Nong},  
  booktitle={European Conference on Computer Vision},  
  pages={606--622},  
  year={2020},  
  organization={Springer}  
}