
Deep learning based 3D landmark placement

A tool for accurately placing 3D landmarks on 3D facial scans based on the paper Multi-view Consensus CNN for 3D Facial Landmark Placement.

Overview

Citing Deep-MVLM

If you use Deep-MVLM in your research, please cite the paper:

@inproceedings{paulsen2018multi,
  title={Multi-view Consensus CNN for 3D Facial Landmark Placement},
  author={Paulsen, Rasmus R and Juhl, Kristine Aavild and Haspang, Thilde Marie and Hansen, Thomas and Ganz, Melanie and Einarsson, Gudmundur},
  booktitle={Asian Conference on Computer Vision},
  pages={706--719},
  year={2018},
  organization={Springer}
}

Updates

  • 24-03-2021: The "cannot instantiate 'WindowsPath'" issue should now be solved; the pre-trained models no longer contain path variables.

Getting Deep-MVLM

Download or clone the repository from GitHub.

Requirements

The code has been tested under Windows 10, both on a GPU-enabled (Titan X) machine and without a GPU (works, but slowly). It has been tested with the following dependencies:

  • Python 3.7
  • Pytorch 1.2
  • vtk 8.2
  • libnetcdf 4.7.1 (needed by vtk)
  • imageio 2.6
  • matplotlib 3.1.1
  • scipy 1.3.1
  • scikit-image 0.15
  • tensorboard 1.14
  • absl-py 0.8
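
One possible conda-based setup consistent with this list is sketched below. This is a hedged example only: the package channels and exact version availability are assumptions, and the repository may recommend a different installation procedure.

conda install -c pytorch pytorch=1.2
conda install -c conda-forge vtk=8.2 libnetcdf=4.7.1 imageio matplotlib scipy scikit-image tensorboard absl-py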

Getting started

The easiest way to use Deep-MVLM is to use the pre-trained models to place landmarks on your meshes. To place the DTU-3D landmarks on a mesh, try:

python predict.py --c configs/DTU3D-RGB.json --n assets/testmeshA.obj

This should create two landmark files (a .vtk file and a .txt file) in the assets directory and also show a window with the face mesh and its landmarks (it is a 3D rendering that can be manipulated with the mouse):

Predicted output
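
As a quick sanity check you can load the text file with NumPy. This is a minimal sketch, assuming the .txt file contains one landmark per line as x y z coordinates; the file name below is a hypothetical placeholder, so check the assets directory for the actual name written next to your mesh.

import numpy as np

# Hypothetical output file name - the landmark files are written to the assets directory.
landmarks = np.loadtxt('assets/testmeshA_landmarks.txt')
print(landmarks.shape)   # expected: (number_of_landmarks, 3), assuming x y z per line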

Supported formats and types

The framework can place landmarks on surfaces without textures, with textures, and with vertex coloring. The supported formats are:

  • OBJ textured surfaces (including multi textures), non-textured surfaces
  • WRL textured surfaces (only single texture), non-textured surfaces
  • VTK textured surfaces (only single texture), vertex colored surfaces, non-textured surfaces
  • STL non-textured surfaces
  • PLY non-textured surfaces

Rendering types

The type of 3D rendering used is specified in the image_channels setting in the JSON configuration file. The options are:

  • geometry pure geometry rendering without texture (1 image channel)
  • depth depth rendering (the z-buffer) similar to range scanners like the Kinect (1 image channel)
  • RGB texture rendering (3 image channels)
  • RGB+depth texture plus depth rendering (3+1=4 image channels)
  • geometry+depth geometry plus depth rendering (1+1=2 image channels)
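
For example, texture plus depth rendering is selected with a setting like the one below. This is a minimal excerpt based on the options above; the exact placement of the key inside the shipped configuration files (e.g. configs/DTU3D-RGB+depth.json) may differ.

"image_channels": "RGB+depth"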

Pre-trained networks

The algorithm comes with pre-trained networks for two landmark sets: DTU3D, consisting of 73 landmarks (described in this paper and here), and the landmark set from BU-3DFE described further down.

Predict landmarks on a single scan

First, determine which landmark set you want to place: either DTU3D or BU-3DFE. Secondly, choose the rendering type that suits your scan. Here are some recommendations:

  • surface with RGB texture use RGB+depth or RGB
  • surface with vertex colors use RGB+depth or RGB
  • surface with no texture use geometry+depth, geometry or depth

Now you can choose the JSON config file that fits your needs, for example configs/DTU3D-RGB+depth.json. Finally, do the prediction:

python predict.py --c configs/DTU3D-RGB+depth.json --n yourscan

Predict landmarks on a directory with scans

Select a configuration file following the approach above and do the prediction:

python predict.py --c configs/DTU3D-RGB+depth.json --n yourdirectory

where yourdirectory is a directory (or directory tree) containing scans. It will process all obj, wrl, vtk, stl and ply files.

Predict landmarks on a file with scan names

Select a configuration file following the approach above and do the prediction:

python predict.py --c configs/DTU3D-RGB+depth.json --n yourfile.txt

where yourfile.txt is a text file containing names of scans to be processed.
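
The expected content is a plain list of scan files, assumed here to be one scan name per line; the paths below are hypothetical examples for illustration.

assets/testmeshA.obj
D:\scans\subject01.wrl
D:\scans\subject02.stl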

Specifying a pre-transformation

The algorithm expects the face to have a specific overall placement and orientation: the scan should be centered around the origin, the nose should point in the z-direction, and the up direction of the head should be aligned with the y-axis, as seen here:

coord-system

In order to re-align a scan to this system, a section of the JSON configuration file can be modified:

"pre-align": {
	"align_center_of_mass" : true,
	"rot_x": 0,
	"rot_y": 0,
	"rot_z": 180,
	"scale": 1,
	"write_pre_aligned": true
}

Here the scan is first translated so its center of mass coincides with the origin. Secondly, it is rotated 180 degrees around the z-axis. The rotation order is z-x-y. This will align this scan:

mri-coord-system

so it is aligned for processing and the result is:

mri-results3

This configuration file can be found as configs/DTU3D-depth-MRI.json.
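
For intuition, the pre-alignment corresponds roughly to the following NumPy sketch. This is an illustration only, not the repository's implementation; in particular, the exact composition of the z-x-y rotation order and the sign conventions should be verified against the code.

import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def pre_align(points, rx=0, ry=0, rz=180, scale=1.0):
    # Translate so the center of mass sits at the origin.
    centered = points - points.mean(axis=0)
    # Apply the rotations in z-x-y order (z first), as described above (assumed composition).
    R = rot_y(np.radians(ry)) @ rot_x(np.radians(rx)) @ rot_z(np.radians(rz))
    return scale * centered @ R.T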

How to use the framework in your own code

Detect 3D landmarks in a 3D facial scan

import argparse
from parse_config import ConfigParser
import deepmvlm
from utils3d import Utils3D

# config is a ConfigParser instance built from a JSON configuration file
# (see predict.py for how the command line arguments are parsed into it).
dm = deepmvlm.DeepMVLM(config)
landmarks = dm.predict_one_file(file_name)                 # file_name: path to the 3D scan
dm.write_landmarks_as_vtk_points(landmarks, name_lm_vtk)   # save landmarks as VTK points
dm.write_landmarks_as_text(landmarks, name_lm_txt)         # save landmarks as plain text
dm.visualise_mesh_and_landmarks(file_name, landmarks)      # interactive 3D rendering

The full source (including how the JSON config files are read) is predict.py.

Examples

The following examples use data from external sources.

Artec3D Eva example

Placing landmarks on a scan produced using an Artec3D Eva 3D scanner can be done like this:

  • download the example head scan in obj format
  • then:
python predict.py --c configs\DTU3D-RGB_Artec3D.json --n Pasha_guard_head.obj

Artec3D

  • download the example man bust in obj format
  • then:
python predict.py --c configs\DTU3D-depth.json --n man_bust.obj

Artec3D

Using Multiple GPUs

Multi-GPU training and evaluation can be enabled by setting the n_gpu argument in the config file to a number greater than one. If a smaller number of GPUs than available is configured, the first n devices will be used by default. To use a specific set of GPUs, the command line option --device can be used:

python train.py --device 2,3 --c config.json

The program checks whether a GPU is present and whether it has the required CUDA capabilities (3.5 and up). If not, the CPU is used, which is slow but still works.
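
A minimal sketch of this kind of check with PyTorch is shown below; it is for illustration only and is not the repository's own device-selection code.

import torch

if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(f'GPU found with CUDA capability {major}.{minor}')
    if (major, minor) < (3, 5):
        print('CUDA capability below 3.5 - falling back to the CPU (slow but works)')
else:
    print('No GPU found - using the CPU (slow but works)')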

How to train and use Deep-MVLM with the BU-3DFE dataset

The Binghamton University 3D Facial Expression Database (BU-3DFE) is a standard database for testing the performance of 3D facial analysis software tools. Here it is described how this database can be used to train and evaluate the performance of Deep-MVLM. The following approach can be adapted to your own dataset.

Start by requesting and downloading the database from the official BU-3DFE site

Secondly, download the 3D landmarks for the raw data from Rasmus R. Paulsen's homepage. The landmarks from the original BU-3DFE distribution are fitted to the cropped face data. Unfortunately, the raw and cropped face data are not in alignment. The data from Rasmus' site has been aligned to the raw data, thus making it possible to train and evaluate on the raw face data. There are 84 landmarks in this set and they are defined here.

A set of example JSON configuration files is provided. Use, for example, configs/BU_3DFE-RGB_train_test.json and modify it to your needs. Change raw_data_dir, processed_data_dir and data_dir (should be equal to processed_data_dir) to match your setup, as sketched below.
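
A minimal excerpt of what these entries could look like is given here; the paths are placeholders consistent with the example further down, and the exact key nesting inside the shipped configuration file may differ.

"raw_data_dir": "D:/data/BU-3DFE/",
"processed_data_dir": "D:/data/BU-3DFE_processed/",
"data_dir": "D:/data/BU-3DFE_processed/"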

Preparing the BU-3DFE data

In order to train the network, the data must be prepared. This means that we pre-render a set of views for each input model; on-the-fly rendering during training is too slow due to the loading of the 3D models. Preparing the data is done by issuing the command:

python preparedata --c configs/BU_3DFE-RGB_train_test.json

This will pre-render the image channels rgb, geometry and depth. If processed_data_dir is set to, for example, D:\data\BU-3DFE_processed\, the rendered images will be placed in the folder D:\data\BU-3DFE_processed\images\ and the corresponding 2D landmarks in the folder D:\data\BU-3DFE_processed\2D LM\. The renderings should look like this:

RGB rendering, geometry rendering, depth rendering

The corresponding landmark file is a standard text file with landmark positions corresponding to their placement in the rendered images. This means that this dataset can now be used to train a standard 2D face landmark detector.
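
For a quick visual sanity check you could overlay the 2D landmarks on a rendering, for instance with the sketch below. The file names are hypothetical placeholders and the (x, y) column order is an assumption; use an actual file from the images folder and its matching landmark file.

import os
import imageio
import numpy as np
import matplotlib.pyplot as plt

processed_dir = r'D:\data\BU-3DFE_processed'                                        # example path from above
image = imageio.imread(os.path.join(processed_dir, 'images', 'example_view.png'))   # hypothetical file name
lm2d = np.loadtxt(os.path.join(processed_dir, '2D LM', 'example_view.txt'))         # hypothetical file name

plt.imshow(image)
plt.scatter(lm2d[:, 0], lm2d[:, 1], s=5, c='r')                                     # assuming (x, y) columns
plt.show()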

The dataset will also be split into a training and a test set. The ids of the scans used for training can be found in the dataset_train.txt file and the test set in the dataset_test.txt file. Both files are found in the processed_data_dir.

Training on the BU-3DFE pre-rendered data

To do the training on the pre-rendered images and landmarks the command

python train --c configs/BU_3DFE-RGB_train_test.json

is used. The result of the training (the model) will be placed in a folder saved\models\MVLMModel_BU_3DFE\DDMMYY_HHMMSS\, where the saved folder can be specified in the JSON configuration file and DDMMYY_HHMMSS is the current date and time. A simple training log can be found in saved\log\MVLMModel_BU_3DFE\DDMMYY_HHMMSS\. After training, it is recommended to rename and copy the best trained model best-model.pth to a suitable location, for example saved\trained\.

Tensorboard visualisation

Tensorboard visualisation of the training and validation losses can be enabled in the JSON configuration file. The tensorboard data will be placed in the saved\log\MVLMModel_BU_3DFE\DDMMYY_HHMMSS\ directory.

Resuming training

If training is stopped for some reason, it is possible to resume training by using

python train --c configs/BU_3DFE-RGB_train_test.json --r path-to-model\best-model.pth

where path-to-model is the path to the current best model (for example saved\models\MVLMModel_BU_3DFE\DDMMYY_HHMMSS\).

Evaluating the model trained on the BU-3DFE data

In order to evaluate the performance of the trained model, the following command is used:

python test --c configs/BU_3DFE-RGB_train_test.json --r path-and-file-name-of-model.pth

where path-and-file-name-of-model.pth is the path and filename of the model to be tested. It should match the configuration in the supplied JSON file. Test results will be placed in a folder named saved\temp\MVLMModel_BU_3DFE\DDMMYY_HHMMSS\. Most interesting is results.csv, which lists the distance error for each landmark for each test mesh.
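
If you want a quick summary of the per-landmark errors, something like the following could be used. This is a sketch only: the column layout of results.csv is not documented here, so the column name is a hypothetical assumption and should be checked against the actual header of the file produced by the test script.

import csv
import statistics

# Path placeholder - use the folder created by your own test run.
results_file = r'saved\temp\MVLMModel_BU_3DFE\DDMMYY_HHMMSS\results.csv'

with open(results_file, newline='') as f:
    rows = list(csv.DictReader(f))

# 'error' is a hypothetical column name for the per-landmark distance error.
errors = [float(row['error']) for row in rows]
print('mean landmark error:', statistics.mean(errors))
print('max landmark error:', max(errors))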

Team

Rasmus R. Paulsen and Kristine Aavild Juhl

License

Deep-MVLM is released under the MIT license. See the LICENSE file for more details.

Credits

This project is based on the PyTorch template pytorch-template by Victor Huang.

We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.
