ntu-rris / google-mediapipe

Licence: other
Google MediaPipe Face + Hands + Body + Object

Google MediaPipe for Pose Estimation

MediaPipe is a cross-platform framework for building multimodal applied machine learning pipelines including inference models and media processing functions.

The main purpose of this repo is to:

  • Customize output of MediaPipe solutions
  • Customize visualization of 2D & 3D outputs
  • Demo some simple applications on Python (refer to Demo Overview)
  • Demo some simple applications on JavaScript (refer to java folder)

Pose Estimation with Input Color Image

Advantages of Google MediaPipe compared with other state-of-the-art (SOTA) methods (e.g. FrankMocap, CMU OpenPose, DeepPoseKit, DeepLabCut, MinimalHand):

  • Fast: Runs at near real-time rates on CPU and even on mobile devices
  • Open-source: Code is freely available on GitHub (although details of the network models are not released)
  • User-friendly: For the Python API, pip install mediapipe is all that is needed (the C++ API is much more troublesome to build and use)
  • Cross-platform: Works across Android, iOS, desktop, JavaScript and web (Note: this repo focuses only on using the Python API for desktop usage)
  • ML Solutions: Apart from face, hand, body and object pose estimation, MediaPipe offers an array of machine learning applications; refer to their GitHub for more details

Features

Latest MediaPipe Python API version 0.8.9.1 (Released 14 Dec 2021) features:

Face Detect (2D face detection)

Face Mesh (468/478 3D face landmarks)

Hands (21 3D landmarks and able to support multiple hands, 2 levels of model complexity) (NEW world coordinates)

Body Pose (33 3D landmarks for whole body, 3 levels of model complexity)

Holistic (Face + Hands + Body) (A total of 543/535 landmarks: 468 face + 2 x 21 hands + 33/25 pose)

Objectron (3D object detection and tracking) (4 possible objects: Shoe / Chair / Camera / Cup)

Selfie Segmentation (Segments human for selfie effect/video conferencing)

Note: The above videos were presented at the CVPR 2020 Fourth Workshop on Computer Vision for AR/VR; interested readers can refer to the link for other related work.
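As a minimal sketch of how the landmark outputs listed above are typically consumed: MediaPipe's Python solutions report image landmarks with x and y normalized to [0, 1] by image width and height, so drawing them requires scaling back to pixels. The `Landmark` tuple and the `to_pixels` helper below are hypothetical names used for illustration only; they are not part of this repo or of MediaPipe.

```python
from typing import List, NamedTuple, Tuple


class Landmark(NamedTuple):
    """Hypothetical stand-in for one normalized landmark (x, y in [0, 1])."""
    x: float
    y: float


def to_pixels(landmarks: List[Landmark], width: int, height: int) -> List[Tuple[int, int]]:
    """Scale normalized coordinates to pixel coordinates, clamped to the image."""
    pixels = []
    for lm in landmarks:
        px = min(max(int(lm.x * width), 0), width - 1)
        py = min(max(int(lm.y * height), 0), height - 1)
        pixels.append((px, py))
    return pixels


if __name__ == "__main__":
    # A hand has 21 landmarks; only three made-up points are shown here.
    hand = [Landmark(0.5, 0.5), Landmark(0.0, 1.0), Landmark(0.25, 0.75)]
    print(to_pixels(hand, width=640, height=480))
```

Clamping keeps landmarks that fall slightly outside the frame (which can happen near image borders) from producing invalid pixel indices.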

Installation

The simplest way to run our implementation is to use Anaconda.

You can create an Anaconda environment called mp with

conda env create -f environment.yaml
conda activate mp

Demo Overview

  • Single Image
  • Video Input
  • Gesture Recognition
  • Rock Paper Scissor Game
  • Measure Hand ROM
  • Measure Wrist and Forearm ROM
  • Face Mask
  • Triangulate Points for 3D Pose
  • 3D Skeleton
  • 3D Object Detection
  • Selfie Segmentation

Usage

0. Single Image

5 different modes are available and sample images are located in the data/sample/ folder

python 00_image.py --mode face_detect
python 00_image.py --mode face
python 00_image.py --mode hand
python 00_image.py --mode body
python 00_image.py --mode holistic

Note: The sample images of the subject with body markers are adapted from An Asian-centric human movement database capturing activities of daily living, and the image of Mona Lisa is adapted from Wiki

1. Video Input

5 different modes are available and video capture can be done online through webcam or offline from your own .mp4 file

python 01_video.py --mode face_detect
python 01_video.py --mode face
python 01_video.py --mode hand
python 01_video.py --mode body
python 01_video.py --mode holistic

Note: It runs at around 10 to 30 FPS on CPU, depending on the mode selected. The video demonstrating supported mini-squats is adapted from National Stroke Association
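The frame rate on your own machine can be checked with a simple timer. The sketch below is one way to do it, assuming a hypothetical `process_frame` callable that stands in for a MediaPipe pipeline call; it is not taken from 01_video.py.

```python
import time
from typing import Callable, Iterable


def measure_fps(process_frame: Callable[[object], object], frames: Iterable[object]) -> float:
    """Return the average number of frames processed per second."""
    count = 0
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Dummy 10 ms workload standing in for a model call (illustrative assumption).
    fps = measure_fps(lambda frame: time.sleep(0.01), range(20))
    print(f"{fps:.1f} FPS")
```

Averaging over many frames gives a steadier figure than timing a single frame, since per-frame latency can vary with image content.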

2. Gesture Recognition

2 modes are available: Use evaluation mode to perform recognition of 11 gestures and use train mode to log your own training data

python 02_gesture.py --mode eval
python 02_gesture.py --mode train

Note: A simple but effective K-nearest neighbor (KNN) algorithm is used as the classifier. For the hand gesture recognition demo, since 3D hand joints are available, we can compute flexion joint angles (the feature vector) and use them to classify different hand poses. On the other hand, if 3D body joints are not yet reliable, the normalized pairwise distances between predefined lists of joints, as described in MediaPipe Pose Classification, could also be used as the feature vector for KNN.
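The idea of classifying flexion angles with KNN can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the code in 02_gesture.py: flexion is taken here as the angle between consecutive bone vectors (0 degrees for a straight finger), and `flexion_angle`, `knn_predict` and the toy training data are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def flexion_angle(a: Point, b: Point, c: Point) -> float:
    """Angle in degrees between bone a->b and bone b->c (0 = straight)."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return 0.0  # degenerate bone; treat as no flexion
    cos = sum(x * y for x, y in zip(u, v)) / norm
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def knn_predict(train_x: List[Sequence[float]], train_y: List[str],
                query: Sequence[float], k: int = 3) -> str:
    """Majority vote among the k training vectors closest to query."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy feature vectors of two flexion angles each (made-up values).
    train_x = [[0, 5], [5, 0], [90, 85], [85, 95]]
    train_y = ["open", "open", "fist", "fist"]
    print(knn_predict(train_x, train_y, [80, 90]))
```

Joint angles are invariant to hand position and scale in the image, which is one reason they make a convenient feature vector for a distance-based classifier such as KNN.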

3. Rock Paper Scissor Game

A simple game of rock paper scissors that requires a pair of hands facing the camera

python 03_game_rps.py
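Once each hand's gesture has been classified, deciding the round is a small lookup. The sketch below assumes the two hands play against each other and that gestures arrive as the strings "rock", "paper" and "scissors"; the `decide_winner` helper is hypothetical and not taken from 03_game_rps.py.

```python
# Each key beats the gesture it maps to.
BEATS = {"rock": "scissors", "scissors": "paper", "paper": "rock"}


def decide_winner(left: str, right: str) -> str:
    """Return "left", "right" or "draw" for one round."""
    if left not in BEATS or right not in BEATS:
        raise ValueError(f"unknown gesture: {left!r} or {right!r}")
    if left == right:
        return "draw"
    return "left" if BEATS[left] == right else "right"


if __name__ == "__main__":
    print(decide_winner("rock", "scissors"))
    print(decide_winner("paper", "scissors"))
```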

For another game, Flappy Bird, refer to this GitHub repository

4. Measure Hand Range of Motion

2 modes are available: Use evaluation mode to perform hand ROM recognition and use train mode to log your own training data

python 04_hand_rom.py --mode eval
python 04_hand_rom.py --mode train

5. Measure Wrist and Forearm Range of Motion

3 modes are available and the user has to specify the side of the hand to be measured

  • 0: Wrist flexion/extension
  • 1: Wrist radial/ulnar deviation
  • 2: Forearm pronation/supination

python 05_wrist_rom.py --mode 0 --side right
python 05_wrist_rom.py --mode 1 --side right
python 05_wrist_rom.py --mode 2 --side right
python 05_wrist_rom.py --mode 0 --side left
python 05_wrist_rom.py --mode 1 --side left
python 05_wrist_rom.py --mode 2 --side left

Note: For measuring forearm pronation/supination, the camera has to be placed at the same level as the hand such that the palmar side of the hand directly faces the camera. For measuring wrist ROM, the camera has to be placed such that the upper body of the subject is visible; refer to the example wrist_XXX.png images in the data/sample/ folder. The wrist images are adapted from Goni Wrist Flexion, Extension, Radial & Ulnar Deviation
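A wrist angle of this kind can be approximated in the image plane as the signed angle between the forearm direction and the hand direction. The sketch below is a hedged illustration, not the computation in 05_wrist_rom.py: the choice of elbow, wrist and knuckle points, and the `wrist_angle_deg` helper, are assumptions.

```python
import math
from typing import Tuple

Point2D = Tuple[float, float]


def wrist_angle_deg(elbow: Point2D, wrist: Point2D, knuckle: Point2D) -> float:
    """Signed angle in degrees from the forearm (elbow->wrist) to the hand (wrist->knuckle)."""
    fx, fy = wrist[0] - elbow[0], wrist[1] - elbow[1]
    hx, hy = knuckle[0] - wrist[0], knuckle[1] - wrist[1]
    cross = fx * hy - fy * hx  # sign gives the turning direction
    dot = fx * hx + fy * hy
    return math.degrees(math.atan2(cross, dot))


if __name__ == "__main__":
    # Image coordinates have y pointing down, so the sign convention
    # (which direction counts as flexion) must be chosen per camera setup.
    print(wrist_angle_deg((0.0, 0.0), (1.0, 0.0), (1.0, 1.0)))
```

Using atan2 of the cross and dot products keeps the sign of the rotation, which a plain arccosine would discard; that sign is what separates flexion from extension, or radial from ulnar deviation.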

6. Face Mask

Overlay a 3D face mask on the detected face in image plane

python 06_face_mask.py

Note: The face image is adapted from MediaPipe 3D Face Transform

7. Triangulate Points

Estimating 3D body pose from a single 2D image is an ill-posed and extremely challenging problem. One way to reconstruct 3D body pose is to use a multiview setup and perform triangulation. For offline testing, use the CMU Panoptic Dataset: follow the instructions in the PanopticStudio Toolbox to download the sample dataset 171204_pose1_sample into the data/ folder

python 07_triangulate.py --mode body --use_panoptic_dataset
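To illustrate the triangulation idea, the sketch below uses midpoint triangulation: each 2D detection is back-projected from its camera as a 3D ray, and the 3D point is taken as the midpoint of the shortest segment between the two rays. This is a simplified stand-in under stated assumptions (camera centres and ray directions already known from calibration, two views only); the source does not describe the method used in 07_triangulate.py, and `triangulate_midpoint` is a hypothetical helper.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(u, v))


def triangulate_midpoint(p1: Vec3, d1: Vec3, p2: Vec3, d2: Vec3) -> Vec3:
    """Midpoint of the shortest segment between rays p1 + s*d1 and p2 + t*d2."""
    w0 = [p1[i] - p2[i] for i in range(3)]
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w0), _dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")
    s = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point on ray 2
    q1 = [p1[i] + s * d1[i] for i in range(3)]
    q2 = [p2[i] + t * d2[i] for i in range(3)]
    return tuple((q1[i] + q2[i]) / 2.0 for i in range(3))


if __name__ == "__main__":
    # Two cameras 2 units apart, both looking at the point (1, 0, 5).
    print(triangulate_midpoint((0, 0, 0), (1, 0, 5), (2, 0, 0), (-1, 0, 5)))
```

With noisy detections the two rays rarely intersect exactly, which is why the midpoint of their closest approach is used rather than an exact intersection.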

8. 3D Skeleton

3D pose estimation is available in full-body mode and this demo displays the estimated 3D skeleton of the hand and/or body. 3 different modes are available and video capture can be done online through webcam or offline from your own .mp4 file

python 08_skeleton_3D.py --mode hand
python 08_skeleton_3D.py --mode body
python 08_skeleton_3D.py --mode holistic

9. 3D Object Detection

4 different modes are available and a sample image is located in data/sample/ folder. Currently supports 4 classes: Shoe / Chair / Cup / Camera.

python 09_objectron.py --mode shoe
python 09_objectron.py --mode chair
python 09_objectron.py --mode cup
python 09_objectron.py --mode camera

10. Selfie Segmentation

2 modes are available. The landscape model has fewer FLOPs than the general model and may run faster. Selfie segmentation works best for selfie effects and video conferencing, where the person is close (< 2m) to the camera.

python 10_segmentation.py --mode general
python 10_segmentation.py --mode landscape
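A segmentation mask is typically used to composite the person over a new background. MediaPipe's Selfie Segmentation returns a per-pixel float mask, with values near 1 for the person. The plain-Python sketch below shows the thresholding step on nested lists for clarity; in practice the same operation would usually be vectorized with NumPy. The `composite` helper and the 0.5 threshold are illustrative assumptions, not taken from 10_segmentation.py.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def composite(foreground: Image, background: Image,
              mask: List[List[float]], threshold: float = 0.5) -> Image:
    """Keep foreground pixels where mask > threshold, else use the background."""
    out = []
    for fg_row, bg_row, m_row in zip(foreground, background, mask):
        out.append([fg if m > threshold else bg
                    for fg, bg, m in zip(fg_row, bg_row, m_row)])
    return out


if __name__ == "__main__":
    person = [[(255, 0, 0), (255, 0, 0)]]  # 1x2 red "camera frame"
    backdrop = [[(0, 0, 0), (0, 0, 0)]]    # 1x2 black background
    mask = [[0.9, 0.1]]                    # left pixel is person, right is not
    print(composite(person, backdrop, mask))
```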

Limitations:

Estimating 3D pose from a single 2D image is an ill-posed problem and extremely challenging, thus the measurement of ROM may not be accurate! Please refer to the respective model cards for more details on other types of limitations such as lighting, motion blur, occlusions, image resolution, etc.