lmb-freiburg / Hand3d

Licence: gpl-2.0
Network estimating 3D Handpose from single color images

ColorHandPose3D network

ColorHandPose3D is a Convolutional Neural Network estimating 3D Hand Pose from a single RGB Image. See the project page for the dataset used and additional information.

Usage: Forward pass

The network ships with a minimal example that performs a forward pass and shows the predictions.

  • Download the data and unzip it into the project's root folder (this creates three folders: "data", "results" and "weights")
  • Run run.py to perform a forward pass of the network on the provided examples

You can compare your results to the content of the folder "results", which shows the predictions we get on our system.

Recommended system

The following configuration has been tested:

  • Ubuntu 16.04.2 (xenial)
  • Tensorflow 1.3.0 GPU build with CUDA 8.0.44 and CUDNN 5.1
  • Python 3.5.2

Python packages used by the provided example and their recommended versions:

  • tensorflow==1.3.0
  • numpy==1.13.0
  • scipy==0.18.1
  • matplotlib==1.5.3
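
The pinned versions above can be compared against the local environment before running the example. The following is a minimal sketch, not part of the repository: the helper names `installed_versions` and `version_mismatches` are hypothetical, and it assumes each package exposes a `__version__` attribute.

```python
import importlib

# Recommended versions, copied from the list above.
RECOMMENDED = {
    "tensorflow": "1.3.0",
    "numpy": "1.13.0",
    "scipy": "0.18.1",
    "matplotlib": "1.5.3",
}


def installed_versions(names):
    """Return {package: version string, or None if it cannot be imported}."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            found[name] = getattr(module, "__version__", None)
        except ImportError:
            found[name] = None
    return found


def version_mismatches(installed, recommended=RECOMMENDED):
    """Return {package: (installed, recommended)} for every differing entry."""
    return {
        name: (installed.get(name), wanted)
        for name, wanted in recommended.items()
        if installed.get(name) != wanted
    }
```

Calling `version_mismatches(installed_versions(RECOMMENDED))` returns an empty dictionary when the environment matches the list exactly. A mismatch does not necessarily mean the example will fail; it only means the setup differs from the tested one.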

Preprocessing for training and evaluation

To use the training and evaluation scripts, you need to download and preprocess the datasets.

Rendered Hand Pose Dataset (RHD)

  • Download the dataset accompanying this publication: RHD dataset v. 1.1

  • Set the variable 'path_to_db' to where the dataset is located on your machine

  • Optionally set the variable 'set' to training or evaluation

  • Run

      python3.5 create_binary_db.py
    
  • This will create a binary file in ./data/bin according to how 'set' was configured
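
Before starting the conversion, it can help to check the two settings mentioned above. The sketch below is a hypothetical helper and not part of the repository. It assumes that 'set' takes the string values "training" or "evaluation" and that 'path_to_db' points to a directory.

```python
import os

# Assumed valid values for the 'set' variable (taken from the step above).
VALID_SETS = ("training", "evaluation")


def config_problems(path_to_db, set_name):
    """Return a list of readable problems; an empty list means the settings look usable."""
    problems = []
    if set_name not in VALID_SETS:
        problems.append(
            "'set' should be one of {}, got {!r}".format(VALID_SETS, set_name)
        )
    if not os.path.isdir(path_to_db):
        problems.append(
            "'path_to_db' is not an existing directory: {}".format(path_to_db)
        )
    return problems
```

An empty result only confirms that the values are plausible; it does not verify that the directory actually contains the RHD dataset.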

Stereo Tracking Benchmark Dataset (STB)

  • For eval3d_full.py it is necessary to get the dataset presented in Zhang et al., ‘3d Hand Pose Tracking and Estimation Using Stereo Matching’, 2016

  • After unzipping the dataset, run

      cd ./data/stb/
      matlab -nodesktop -nosplash -r "create_db"
    
  • This will create the binary file ./data/stb/stb_evaluation.bin

Network training

We provide scripts to train HandSegNet and PoseNet on the Rendered Hand Pose Dataset (RHD). To retrain the networks on new data, adapt the provided code to your needs.

The following steps guide you through the training:

  • Make sure you followed the steps in the section 'Preprocessing for training and evaluation'
  • Start training of HandSegNet with training_handsegnet.py
  • Start training of PoseNet with training_posenet.py
  • Set USE_RETRAINED = True on line 32 in eval2d_gt_cropped.py
  • Run eval2d_gt_cropped.py to evaluate the retrained PoseNet on RHD-e
  • Set USE_RETRAINED = True on line 31 in eval2d.py
  • Run eval2d.py to evaluate the retrained HandSegNet + PoseNet on RHD-e

You should be able to obtain results that roughly match the following numbers we obtain with Tensorflow v1.3:

eval2d_gt_cropped.py yields:

Evaluation results:
Average mean EPE: 7.630 pixels
Average median EPE: 3.939 pixels
Area under curve: 0.771

eval2d.py yields:

Evaluation results:
Average mean EPE: 15.469 pixels
Average median EPE: 4.374 pixels
Area under curve: 0.715

Because training is not a deterministic process, results will differ between runs. Note that these results are not listed in the paper.
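
For readers unfamiliar with the three measures, the sketch below shows one common way to compute them with NumPy. It is a simplified illustration, not the repository's evaluation code: it pools all keypoints instead of averaging per keypoint, and the threshold range (0 to 30 pixels, 20 steps) and the function names are assumptions.

```python
import numpy as np


def epe_per_keypoint(pred, gt):
    """End-point error: Euclidean distance per keypoint.

    pred and gt are arrays of shape (N, K, 2) holding pixel coordinates.
    """
    diff = np.asarray(pred, dtype=float) - np.asarray(gt, dtype=float)
    return np.linalg.norm(diff, axis=-1)


def pck_auc(errors, max_threshold=30.0, steps=20):
    """Normalised area under the PCK curve for thresholds in [0, max_threshold]."""
    thresholds = np.linspace(0.0, max_threshold, steps)
    # Fraction of keypoints whose error is at most each threshold.
    pck = np.array([np.mean(errors <= t) for t in thresholds])
    # Trapezoidal rule, divided by the threshold range so the result lies in [0, 1].
    area = np.sum(0.5 * (pck[1:] + pck[:-1]) * np.diff(thresholds))
    return area / max_threshold


def summarize(pred, gt):
    """Return mean EPE, median EPE and AUC over all keypoints."""
    errors = epe_per_keypoint(pred, gt)
    return {
        "mean_epe": float(np.mean(errors)),
        "median_epe": float(np.median(errors)),
        "auc": float(pck_auc(errors)),
    }
```

Lower EPE values and a higher area under the curve indicate better keypoint localization. A large gap between mean and median EPE, as in the eval2d.py numbers above, indicates that a small number of keypoints have very large errors.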

Evaluation

There are four scripts that evaluate different parts of the architecture:

  1. eval2d_gt_cropped.py: Evaluates PoseNet on 2D keypoint localization using ground truth annotation to create hand-cropped images (section 6.1, Table 1 of the paper)
  2. eval2d.py: Evaluates HandSegNet and PoseNet on 2D keypoint localization (section 6.1, Table 1 of the paper)
  3. eval3d.py: Evaluates different approaches on lifting 2D predictions into 3D (section 6.2.1, Table 2 of the paper)
  4. eval3d_full.py: Evaluates our full pipeline on 3D keypoint localization from RGB (section 6.2.1, Table 2 of the paper)

These scripts make it possible to reproduce the results from the paper that are based on the RHD dataset.

License and Citation

This project is licensed under the terms of the GPL v2 license. By using the software, you are agreeing to the terms of the license agreement.

Please cite us in your publications if it helps your research:

@InProceedings{zb2017hand,
  author    = {Christian Zimmermann and Thomas Brox},
  title     = {Learning to Estimate 3D Hand Pose from Single RGB Images},
  booktitle = {IEEE International Conference on Computer Vision (ICCV)},
  year      = {2017},
  note      = {https://arxiv.org/abs/1705.01389},
  url       = {https://lmb.informatik.uni-freiburg.de/projects/hand3d/}
}

Known issues

  • There is an issue with the results of section 6.1, Table 1 that reports performance of 2D keypoint localization on full scale images (eval2d.py). PoseNet was trained to predict the "palm center", but the evaluation script compares to the "wrist". This results in a systematic error, and therefore the reported results are significantly worse than under a correct evaluation setting. Using the correct setting during evaluation improves results by approximately 2-10% (depending on the measure).
  • The numbers reported for the "Bottleneck" approach in Table 2 of the paper are not correct. The actual results are approx. 8% worse.
  • There is a minor issue with the first version of RHD. A rounding/casting problem caused image values to be off by one every now and then compared to the version used in the paper. The difference is not visually noticeable and not large, but it prevents reaching the reported numbers exactly.
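
To illustrate the kind of rounding/casting discrepancy described in the last item, the sketch below compares two ways of converting floating-point intensities to 8-bit values. This is only one common way such off-by-one differences can arise. The README does not state which conversion was involved, so the functions and sample values here are purely illustrative.

```python
import numpy as np


def to_uint8_truncating(values):
    """Scale floats in [0, 1] to [0, 255] and truncate toward zero."""
    return (np.asarray(values, dtype=np.float64) * 255.0).astype(np.uint8)


def to_uint8_rounding(values):
    """Scale floats in [0, 1] to [0, 255] and round to the nearest integer."""
    return np.round(np.asarray(values, dtype=np.float64) * 255.0).astype(np.uint8)


# Illustrative intensities; some scale to a fractional part of 0.5 or more.
values = np.array([0.0, 0.25, 0.5, 0.998, 1.0])
truncated = to_uint8_truncating(values)
rounded = to_uint8_rounding(values)

# Signed difference per value: 0 where both conversions agree, 1 where they differ.
difference = rounded.astype(np.int16) - truncated.astype(np.int16)
```

Here the two conversions agree on some values and differ by exactly one on others, which matches the "off by one every now and then" pattern: small enough to be invisible, but enough to change numerical results slightly.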