
shrubb / latent-pose-reenactment

License: Apache-2.0
The authors' implementation of the "Neural Head Reenactment with Latent Pose Descriptors" (CVPR 2020) paper.

Programming Languages

Python
Shell

Projects that are alternatives of or similar to latent-pose-reenactment

Awesome-Vision-Transformer-Collection
Variants of Vision Transformer and its downstream tasks
Stars: ✭ 124 (-6.06%)
Mutual labels:  generative-model, pose-estimation, self-supervised-learning
sc_depth_pl
Pytorch Lightning Implementation of SC-Depth (V1, V2...) for Unsupervised Monocular Depth Estimation.
Stars: ✭ 86 (-34.85%)
Mutual labels:  pose-estimation, self-supervised-learning
deep_alignment_network_pytorch
PyTorch Implementation of the Deep Alignment Network
Stars: ✭ 37 (-71.97%)
Mutual labels:  landmark-detection, facial-landmarks
Deep-MVLM
A tool for precisely placing 3D landmarks on 3D facial scans based on the paper "Multi-view Consensus CNN for 3D Facial Landmark Placement"
Stars: ✭ 71 (-46.21%)
Mutual labels:  landmark-detection, facial-landmarks
EgoNet
Official project website for the CVPR 2021 paper "Exploring intermediate representation for monocular vehicle pose estimation"
Stars: ✭ 111 (-15.91%)
Mutual labels:  pose-estimation, self-supervised-learning
naru
Neural Relation Understanding: neural cardinality estimators for tabular data
Stars: ✭ 76 (-42.42%)
Mutual labels:  generative-model, self-supervised-learning
CLMR
Official PyTorch implementation of Contrastive Learning of Musical Representations
Stars: ✭ 216 (+63.64%)
Mutual labels:  self-supervised-learning
coursera-gan-specialization
Programming assignments and quizzes from all courses within the GANs specialization offered by deeplearning.ai
Stars: ✭ 277 (+109.85%)
Mutual labels:  generative-model
DisCont
Code for the paper "DisCont: Self-Supervised Visual Attribute Disentanglement using Context Vectors".
Stars: ✭ 13 (-90.15%)
Mutual labels:  self-supervised-learning
tianchi-fashionai
FashionAI Global Challenge: apparel key point localization
Stars: ✭ 21 (-84.09%)
Mutual labels:  landmark-detection
pytorch-PyraNet
PyTorch reimplementation of PyraNet, from the paper "Learning Feature Pyramids for Human Pose Estimation"
Stars: ✭ 32 (-75.76%)
Mutual labels:  pose-estimation
GaitGraph
Official repository for "GaitGraph: Graph Convolutional Network for Skeleton-Based Gait Recognition" (ICIP'21)
Stars: ✭ 68 (-48.48%)
Mutual labels:  pose-estimation
lossyless
Generic image compressor for machine learning. Pytorch code for our paper "Lossy compression for lossless prediction".
Stars: ✭ 81 (-38.64%)
Mutual labels:  self-supervised-learning
All4Depth
Self-Supervised Depth Estimation on Monocular Sequences
Stars: ✭ 58 (-56.06%)
Mutual labels:  pose-estimation
android_tflite
GPU Accelerated TensorFlow Lite applications on Android NDK. Higher accuracy face detection, Age and gender estimation, Human pose estimation, Artistic style transfer
Stars: ✭ 105 (-20.45%)
Mutual labels:  pose-estimation
Landmark_Detection_Robot_Tracking_SLAM-
Simultaneous Localization and Mapping (SLAM) gives you a way to track the location of a robot in the world in real time and identify the locations of landmarks such as buildings, trees, rocks, and other world features.
Stars: ✭ 14 (-89.39%)
Mutual labels:  landmark-detection
SRN
Code for "SRN: Stacked Regression Network for Real-time 3D Hand Pose Estimation" BMVC 2019
Stars: ✭ 27 (-79.55%)
Mutual labels:  pose-estimation
CondGen
Conditional Structure Generation through Graph Variational Generative Adversarial Nets, NeurIPS 2019.
Stars: ✭ 46 (-65.15%)
Mutual labels:  generative-model
ailia-models
The collection of pre-trained, state-of-the-art AI models for ailia SDK
Stars: ✭ 1,102 (+734.85%)
Mutual labels:  pose-estimation
gltf-avatar-threejs
A glTF-based 3d avatar system
Stars: ✭ 195 (+47.73%)
Mutual labels:  avatar

Neural Head Reenactment with Latent Pose Descriptors

Burkov, E., Pasechnik, I., Grigorev, A., & Lempitsky, V. (2020, June). Neural Head Reenactment with Latent Pose Descriptors. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

See the project page for an overview.

Prerequisites

For fine-tuning a pre-trained model, you'll need an NVIDIA GPU, preferably with 8+ GB VRAM. To train from scratch, we recommend a total of 40+ GB VRAM.
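If you're unsure how much VRAM you have, nvidia-smi (which ships with the NVIDIA driver) can report it:

nvidia-smi --query-gpu=name,memory.total --format=csv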

Set up your environment as described here.

Running the pretrained model

  • Collect images of the person to reenact.
  • Run utils/preprocess_dataset.sh to preprocess them; read the script itself for instructions.
  • Download the meta-model checkpoint.
  • Run the script below to fine-tune the meta-model to your person, first setting the variables at the top. Optionally, launch TensorBoard at "$OUTPUT_PATH" to monitor progress (see the example after the note below), preferably with the --samples_per_plugin "scalars=1000,images=100" option; check mainly the "images" tab to find the iteration at which the identity gap becomes small enough.
# in this example, your images should be "$DATASET_ROOT/images-cropped/$IDENTITY_NAME/*.jpg"
DATASET_ROOT="/where/is/your/data"
IDENTITY_NAME="identity/name"
MAX_BATCH_SIZE=8             # pick the largest possible, start with 8 and decrease until it fits in VRAM
CHECKPOINT_PATH="/where/is/checkpoint.pth"
OUTPUT_PATH="outputs/"       # a directory for outputs, will be created
RUN_NAME="tony_hawk_take_1"  # give your run a name if you want

# Important. See the note below
TARGET_NUM_ITERATIONS=230

# Don't change these
NUM_IMAGES=$(ls -1 "$DATASET_ROOT/images-cropped/$IDENTITY_NAME" | wc -l)
BATCH_SIZE=$((NUM_IMAGES<MAX_BATCH_SIZE ? NUM_IMAGES : MAX_BATCH_SIZE))
ITERATIONS_IN_EPOCH=$(( NUM_IMAGES / BATCH_SIZE ))

mkdir -p "$OUTPUT_PATH"

python3 train.py \
    --config finetuning-base                 \
    --checkpoint_path "$CHECKPOINT_PATH"     \
    --data_root "$DATASET_ROOT"              \
    --train_split_path "$IDENTITY_NAME"      \
    --batch_size $BATCH_SIZE                 \
    --num_epochs $(( (TARGET_NUM_ITERATIONS + ITERATIONS_IN_EPOCH - 1) / ITERATIONS_IN_EPOCH )) \
    --experiments_dir "$OUTPUT_PATH"         \
    --experiment_name "$RUN_NAME"

Note. TARGET_NUM_ITERATIONS is important; make sure to tune it. Set it too low and you'll underfit and get an identity gap; set it too high and you'll overfit and get poor mimics. I suggest starting with 125 when NUM_IMAGES=1 and increasing it with more images, say, to 230 when NUM_IMAGES>30, but your concrete case may differ. For example, with NUM_IMAGES=20 and MAX_BATCH_SIZE=8, the script above computes ITERATIONS_IN_EPOCH=2, so TARGET_NUM_ITERATIONS=230 translates to --num_epochs 115. If you have a lot of disk space, pass a flag to save checkpoints every so often (e.g. --save_frequency 4 will save a checkpoint every 4 * NUM_IMAGES iterations), then drive each of them (see below how) to find the iteration where the best tradeoff happens for your avatar.
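For reference, a TensorBoard launch for the run above might look like this (the port is your choice; --logdir and --samples_per_plugin are standard TensorBoard flags):

tensorboard --logdir "$OUTPUT_PATH" --samples_per_plugin "scalars=1000,images=100" --port 6006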

  • Take your driving video and crop it with python3 utils/crop_as_in_dataset.py (run with --help to learn how); or, equivalently, just reuse utils/preprocess_dataset.sh with COMPUTE_SEGMENTATION=false.
  • Organize the cropped images from the previous step as "<data_root>/images-cropped/<images_path>/*.jpg".
  • Use them to drive your fine-tuned model (the checkpoint is at "$OUTPUT_PATH/$RUN_NAME/checkpoints") with python3 drive.py; run with --help to learn how (see the sketch below).
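As a rough sketch of these last two steps (the directory names here are hypothetical; only the "<data_root>/images-cropped/<images_path>" layout is prescribed, and drive.py's exact arguments are in its --help):

# hypothetical names; adapt to where your cropped frames actually are
DRIVER_NAME="tony_hawk_driving_video"
mkdir -p "$DATASET_ROOT/images-cropped/$DRIVER_NAME"
mv my-cropped-frames/*.jpg "$DATASET_ROOT/images-cropped/$DRIVER_NAME/"

# the fine-tuned checkpoint(s) to drive:
ls "$OUTPUT_PATH/$RUN_NAME/checkpoints"
python3 drive.py --help   # consult this for the exact invocation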

Training (meta-learning) your own model

You'll need a training configuration (a.k.a. config) file. Start from "configs/default.yaml", or simply edit that file. Config files specify various training options, which you can find in the code as argparse parameters. Any of these options can be specified both in the config file and on the command line (e.g. --batch_size=7), and they are resolved as follows, each source overriding all the preceding ones (see the example after the list):

  • argparse defaults, specified directly in the code;
  • options saved in a loaded checkpoint (if starting from a checkpoint);
  • your --config file;
  • the command line.
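For illustration (the YAML key is an assumption that mirrors the argparse parameter name, as described above): suppose a hypothetical "configs/my_config.yaml" contains the line batch_size: 4. Then

python3 train.py --config my_config --batch_size=7

trains with batch_size=7, because the command line overrides the config file, which itself overrides any checkpoint-saved value and the argparse default.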

The command is

python3 train.py --config=config_name [any extra arguments ...]

Or, with multiple GPUs,

python3 -um torch.distributed.launch --nproc_per_node=<number of GPUs> train.py --config=config_name [any extra arguments ...]
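For instance, a run of the stock config on 4 GPUs with an example batch size override (both values are illustrative) would be:

python3 -um torch.distributed.launch --nproc_per_node=4 train.py --config=default --batch_size=8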

Reference

Consider citing us if you use the code:

@InProceedings{Burkov_2020_CVPR,
  author    = {Burkov, Egor and Pasechnik, Igor and Grigorev, Artur and Lempitsky, Victor},
  title     = {Neural Head Reenactment with Latent Pose Descriptors},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2020}
}