All Projects → NVlabs → DREAM

NVlabs / DREAM

Licence: other
DREAM: Deep Robot-to-Camera Extrinsics for Articulated Manipulators (ICRA 2020)

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects

Projects that are alternatives of or similar to DREAM

Stag
STag: A Stable Fiducial Marker System
Stars: ✭ 75 (-31.19%)
Mutual labels:  camera-calibration, pose-estimation
Openpose-based-GUI-for-Realtime-Pose-Estimate-and-Action-Recognition
GUI based on the python api of openpose in windows using cuda10 and cudnn7. Support body , hand, face keypoints estimation and data saving. Realtime gesture recognition is realized through two-layer neural network based on the skeleton collected from the gui.
Stars: ✭ 69 (-36.7%)
Mutual labels:  pose-estimation
RGBDAcquisition
A uniform library wrapper for input from V4L2,Freenect,OpenNI,OpenNI2,DepthSense,Intel Realsense,OpenGL simulations and other types of video and depth input..
Stars: ✭ 56 (-48.62%)
Mutual labels:  camera-calibration
kPAM
kPAM: Generalizable Robotic Manipulation
Stars: ✭ 73 (-33.03%)
Mutual labels:  pose-estimation
UniFormer
[ICLR2022] official implementation of UniFormer
Stars: ✭ 574 (+426.61%)
Mutual labels:  pose-estimation
SRN
Code for "SRN: Stacked Regression Network for Real-time 3D Hand Pose Estimation" BMVC 2019
Stars: ✭ 27 (-75.23%)
Mutual labels:  pose-estimation
qml-ar
Seamless Augmented Reality module for QML using UchiyaMarkers
Stars: ✭ 32 (-70.64%)
Mutual labels:  pose-estimation
pytorch-PyraNet
Pytorch version reinplement code of PyraNet , for paper : Learning Feature Pyramids for Human Pose Estimation
Stars: ✭ 32 (-70.64%)
Mutual labels:  pose-estimation
android tflite
GPU Accelerated TensorFlow Lite applications on Android NDK. Higher accuracy face detection, Age and gender estimation, Human pose estimation, Artistic style transfer
Stars: ✭ 105 (-3.67%)
Mutual labels:  pose-estimation
3d-pose-2d-keypoints
3d Human Pose Estimation from 2d Keypoints
Stars: ✭ 56 (-48.62%)
Mutual labels:  pose-estimation
OffsetGuided
Code for "Greedy Offset-Guided Keypoint Grouping for Human Pose Estimation"
Stars: ✭ 31 (-71.56%)
Mutual labels:  pose-estimation
CameraCalibTools
List of Camera Calibration Tools + Patterns
Stars: ✭ 64 (-41.28%)
Mutual labels:  camera-calibration
All4Depth
Self-Supervised Depth Estimation on Monocular Sequences
Stars: ✭ 58 (-46.79%)
Mutual labels:  pose-estimation
flydra
live, low-latency markerless multi-camera 3D animal tracking system
Stars: ✭ 52 (-52.29%)
Mutual labels:  pose-estimation
ailia-models
The collection of pre-trained, state-of-the-art AI models for ailia SDK
Stars: ✭ 1,102 (+911.01%)
Mutual labels:  pose-estimation
3DObjectTracking
Official Code: A Sparse Gaussian Approach to Region-Based 6DoF Object Tracking
Stars: ✭ 375 (+244.04%)
Mutual labels:  pose-estimation
awesome-visual-localization-papers
The relocalization task aims to estimate the 6-DoF pose of a novel (unseen) frame in the coordinate system given by the prior model of the world.
Stars: ✭ 60 (-44.95%)
Mutual labels:  pose-estimation
zhang
Numpy implementation of Z. Zhang's camera calibration algorithm
Stars: ✭ 36 (-66.97%)
Mutual labels:  camera-calibration
latent-pose-reenactment
The authors' implementation of the "Neural Head Reenactment with Latent Pose Descriptors" (CVPR 2020) paper.
Stars: ✭ 132 (+21.1%)
Mutual labels:  pose-estimation
GaitGraph
Official repository for "GaitGraph: Graph Convolutional Network for Skeleton-Based Gait Recognition" (ICIP'21)
Stars: ✭ 68 (-37.61%)
Mutual labels:  pose-estimation

DREAM: Deep Robot-to-Camera Extrinsics for Articulated Manipulators

This is the official implementation of "Camera-to-robot pose estimation from a single image" (ICRA 2020). The DREAM system uses a robot-specific deep neural network to detect keypoints (typically joint locations) in the RGB image of a robot manipulator. Using these keypoint locations along with the robot forward kinematics, the camera pose with respect to the robot is estimated using a perspective-n-point (PnP) algorithm. For more details, please see our paper and video.

DREAM in operation

Installation

We have tested on Ubuntu 16.04 and 18.04 with an NVIDIA GeForce RTX 2080 and Titan X, with both Python 2.7 and Python 3.6. The code may work on other systems.

Install the DREAM package and its dependencies using pip:

pip install . -r requirements.txt

Download the pre-trained models and (optionally) data. In the scripts below, be sure to comment out files you do not want, as they are very large. Alternatively, you can download files manually

cd trained_models; ./DOWNLOAD.sh; cd ..
cd data; ./DOWNLOAD.sh; cd ..

Unit tests are implemented in the pytest framework. Verify your installation by running them: pytest test/

Offline inference

There are three scripts for offline inference:

# Run the network on a single image to display detected 2D keypoints.
python scripts/network_inference.py -i <path/to/network.pth> -m <path/to/image.png> 

# Process a dataset to save both 2D keypoints and 3D poses.
python scripts/network_inference_dataset.py -i <path/to/network.pth> -d <path/to/dataset_dir/> -o <path/to/output_dir/> -b 16 -w 8

# Run the network on an image sequence 
# (either a dataset or a directory of images, e.g., from a video),
# and saves the resulting visualizations as videos.
python scripts/visualize_network_inference.py -i <path/to/network.pth> -d <path/to/dataset_dir/> -o <path/to/output_dir/> -s <start_frame_name> -e <end_frame_name>

Pass -h for help on command line arguments. Datasets are assumed to be in NDDS format.

Example for single-image inference

Single-image inference from one frame of the Panda-3Cam RealSense dataset using the DREAM-vgg-Q network:

python scripts/network_inference.py -i trained_models/panda_dream_vgg_q.pth -m data/real/panda-3cam_realsense/000000.rgb.jpg

You should see the detected keypoints printed to the screen as well as overlaid on the Panda robot. (See note below regarding the Panda keypoint locations.)

Example for dataset inference

Inference on the Panda-3Cam RealSense dataset using the DREAM-vgg-Q network:

python scripts/network_inference_dataset.py -i trained_models/panda_dream_vgg_q.pth -d data/real/panda-3cam_realsense/ -o <path/to/output_results> -b 16 -w 8

The analysis will print to both the screen and file. You should see that the percentage of correct keypoints (PCK) and the area under the curve (AUC) is about 0.720, and the average distance (ADD) AUC is about 0.792. Various visualizations will also be saved to disk.

Example for generating inference visualizations

Generating visualizations for one of the sequences in the Panda-3Cam RealSense dataset using the DREAM-vgg-Q network:

python scripts/visualize_network_inference.py -i trained_models/panda_dream_vgg_q.pth -d data/real/panda-3cam_realsense -o <path/to/output_results> -fps 120.0 -s 004151

This creates videos at 4x normal camera framerate.

Online inference using ROS

A ROS node is provided for real-time camera pose estimation. Some values, such as ROS topic names, may need to be changed for your application. Because of incompatabilities between ROS (before Noetic) and Python 3, the DREAM ROS node is implemented using Python 2.7. For ease of use, we have provided a Docker setup containing all the necessary compoinents to run DREAM with ROS Kinetic.

Example to run the DREAM ROS node (in verbose mode):

python scripts/launch_dream_ros.py -i trained_models/baxter_dream_vgg_q.pth -b torso -v

Training

Below is an example for training a DREAM-vgg-Q model for the Franka Emika Panda robot:

python scripts/train_network.py -i data/synthetic/panda_synth_train_dr/ -t 0.8 -m manip_configs/panda.yaml -ar arch_configs/dream_hourglass_example.yaml -e 25 -lr 0.00015 -b 128 -w 16 -o <path/to/output_dir/>

The models below are defined in the following architecture files:

  • DREAM-vgg-Q: arch_configs/dream_vgg_q.yaml
  • DREAM-vgg-F: arch_configs/deam_vgg_f.yaml
  • DREAM-resnet-H: arch_configs/dream_resnet_h.yaml
  • DREAM-resnet-F: arch_configs/dream_resnet_f.yaml (very large network and unwieldy to train)

Note on Panda keypoints

By default, keypoints are defined at the joint locations as defined by the robot URDF file. In the case of the Panda robot, the URDF file defines the joints at non-intuitive locations. As a result, visualizations of keypoint detections may appear to be wrong when they are in fact correct (see our video). We have since modified the URDF to place the keypoints at the actual joint locations (see Fig. 5a of our paper), but for simplicity we are not releasing the modified URDF at this time.

Note on reproducing results

The experiments in the paper used the image preprocessing type shrink-and-crop, which preserves the same aspect ratio of the input image, but crops the width to send 400 x 400 resolution to the network (which is the resolution used during training). In order to allow for full-frame inference, the models we released have the default image preprocessing type resize, which prevents this cropping. Careful analysis has shown almost no difference in quantitative results, but if you are looking to reproduce our ICRA results exactly, please change the architecture/image_processing value to shrink-and-crop.

The PCK and ADD plots in the paper are generated from oks_plots.py and add_plots.py. The AUC in these figures (and in Table 1) are in the analysis_results.txt file that is produced by scripts/network_inference_dataset.py.

Further information

For an example of how DREAM can be used in practice for vision-based object manipulation, please refer to "Indirect Object-to-Robot Pose Estimation from an External Monocular RGB Camera" by Jonathan Tremblay, Stephen Tyree, Terry Mosier, and Stan Birchfield.

For more information on how DREAM learns robot keypoint detection from sim-to-real transfer using only synthetic domain randomized images, please refer to our sim2realAI blog post.

License

DREAM is licensed under the NVIDIA Source Code License - Non-commercial.

Citation

Please cite our work if you use it for your research. Thank you!

@inproceedings{lee2020icra:dream,
  title={Camera-to-Robot Pose Estimation from a Single Image},
  author={Lee, Timothy E and Tremblay, Jonathan and To, Thang and Cheng, Jia and Mosier, Terry and Kroemer, Oliver and Fox, Dieter and Birchfield, Stan},
  booktitle={International Conference on Robotics and Automation (ICRA)},
  year=2020,
  url={https://arxiv.org/abs/1911.09231}
}

Acknowledgment

Thanks to Jeffrey Smith ([email protected]) for assistance in preparing this release.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].