
ofnote / gestop

License: Apache-2.0
A tool to navigate the desktop with hand gestures. Builds on mediapipe.

Programming Languages

Python
139335 projects - #7 most used programming language
C++
36643 projects - #6 most used programming language
Starlark
911 projects

Projects that are alternatives of or similar to gestop

Face-Mask
Real time webcam face detection, protect yourself from COVID19 with a virtual mask
Stars: ✭ 64 (+220%)
Mutual labels:  mediapipe
fastface
Light Face Detection using PyTorch Lightning
Stars: ✭ 71 (+255%)
Mutual labels:  pytorch-lightning
2021-dialogue-summary-competition
[2021 Hunminjeongeum Korean Speech & Natural Language AI Competition] A repository sharing the Allakkungdallakkung team's training and inference code for the dialogue summarization track.
Stars: ✭ 86 (+330%)
Mutual labels:  pytorch-lightning
quickvision
An Easy To Use PyTorch Computer Vision Library
Stars: ✭ 49 (+145%)
Mutual labels:  pytorch-lightning
Handgator
✋ Navigating desktop with hand gesture
Stars: ✭ 13 (-35%)
Mutual labels:  hand-gestures
Akihabara
A pure .NET port for Google Mediapipe, inspired of MediaPipeUnity.
Stars: ✭ 21 (+5%)
Mutual labels:  mediapipe
lightning-transformers
Flexible components pairing 🤗 Transformers with Pytorch Lightning
Stars: ✭ 551 (+2655%)
Mutual labels:  pytorch-lightning
pytorch tempest
My repo for training neural nets using pytorch-lightning and hydra
Stars: ✭ 124 (+520%)
Mutual labels:  pytorch-lightning
ue4-mediapipe-plugin
UE4 MediaPipe plugin
Stars: ✭ 159 (+695%)
Mutual labels:  mediapipe
ai-virtual-mouse
Developed an AI-based system to control the mouse cursor using Python and OpenCV with the real-time camera. Fingertip location is mapped to RGB images to control the mouse cursor.
Stars: ✭ 59 (+195%)
Mutual labels:  mediapipe
pytorch multi input example
Multi-Input Deep Neural Networks with PyTorch-Lightning - Combine Image and Tabular Data
Stars: ✭ 40 (+100%)
Mutual labels:  pytorch-lightning
embeddings-for-trees
Set of PyTorch modules for developing and evaluating different algorithms for embedding trees.
Stars: ✭ 19 (-5%)
Mutual labels:  pytorch-lightning
uetai
Custom ML tracking experiment and debugging tools.
Stars: ✭ 17 (-15%)
Mutual labels:  pytorch-lightning
bert-squeeze
🛠️ Tools for Transformers compression using PyTorch Lightning ⚡
Stars: ✭ 56 (+180%)
Mutual labels:  pytorch-lightning
DOLG-pytorch
Unofficial PyTorch Implementation of "DOLG: Single-Stage Image Retrieval with Deep Orthogonal Fusion of Local and Global Features"
Stars: ✭ 69 (+245%)
Mutual labels:  pytorch-lightning
mediapipe-osc
MediaPipe examples which stream their detections over OSC.
Stars: ✭ 26 (+30%)
Mutual labels:  mediapipe
Neural-HMM
Neural HMMs are all you need (for high-quality attention-free TTS)
Stars: ✭ 69 (+245%)
Mutual labels:  pytorch-lightning
UnityHandTrackingWithMediapipe
Realtime hand tracking and finger tracking in Unity using Mediapipe
Stars: ✭ 129 (+545%)
Mutual labels:  mediapipe
classy
classy is a simple-to-use library for building high-performance Machine Learning models in NLP.
Stars: ✭ 61 (+205%)
Mutual labels:  pytorch-lightning
lightning-asr
Modular and extensible speech recognition library leveraging pytorch-lightning and hydra.
Stars: ✭ 36 (+80%)
Mutual labels:  pytorch-lightning

Gestop: Customizable Gesture Control of Computer Systems

This is the implementation of the approach described in the paper:

Sriram Krishna and Nishant Sinha. Gestop: Customizable Gesture Control of Computer Systems. In 8th ACM IKDD CODS and 26th COMAD, 2021, 405-409.

Built on top of MediaPipe, this project is a tool for interacting with a computer through hand gestures. Out of the box, it is possible to:

  1. Use your hand to act as a replacement for the mouse.
  2. Perform hand gestures to control system parameters like screen brightness, volume, etc.

In addition, it is possible to extend and customize the functionality of the application in numerous ways:

  1. Remap existing hand gestures to different functions to better suit your needs.
  2. Create custom functionality through the use of either Python functions or shell scripts (see the sketch after this list).
  3. Collect data and create your own custom gestures to use alongside the existing ones.
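
As a rough illustration of point 2, the sketch below maps gesture names to actions, one a Python function and one a shell script. The mapping dict, the gesture names, and the scrot/my_action.sh commands are all hypothetical stand-ins, not Gestop's actual configuration format.

import subprocess

def take_screenshot():
    # Hypothetical Python action; assumes the scrot utility is installed
    subprocess.run(["scrot", "screenshot.png"])

def run_user_script():
    # Hypothetical shell-script action; my_action.sh is a placeholder
    subprocess.run(["sh", "my_action.sh"])

# Hypothetical gesture -> action mapping
GESTURE_ACTIONS = {
    "fist": take_screenshot,
    "swipe_left": run_user_script,
}

def execute(gesture_name):
    action = GESTURE_ACTIONS.get(gesture_name)
    if action is not None:
        action()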

Demo (Click on the image to see the full video)

Demo video link

Static and Dynamic Gestures Dataset link

Installation

Installation using pip inside a virtual environment is highly recommended. To do so:

python -m venv env
source env/bin/activate
pip install gestop

In addition to the Python dependencies, OpenCV and xdotool are also required by Gestop.

Usage

Server

To start the Gestop server:

python -m gestop.receiver

Client

The client, or the keypoint generator, can be set up either through MediaPipe's C++ API or through its Python API. The Python API is simpler to set up and is recommended.

MediaPipe Python API

python -m gestop.keypoint_gen.hand_tracking
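
For orientation, the sketch below shows the general shape of such a keypoint generator: MediaPipe's Hands solution extracts 21 landmarks per frame, and a socket ships them to the server. The address 127.0.0.1:5000 and the JSON-lines wire format are assumptions for illustration; the bundled gestop.keypoint_gen.hand_tracking module is the real client.

import json
import socket
import cv2
import mediapipe as mp

# Assumed server address and wire format; Gestop's actual transport may differ
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("127.0.0.1", 5000))

hands = mp.solutions.hands.Hands(max_num_hands=1)
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB input; OpenCV captures BGR
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        landmarks = results.multi_hand_landmarks[0].landmark
        keypoints = [(p.x, p.y, p.z) for p in landmarks]  # 21 normalized points
        sock.sendall(json.dumps(keypoints).encode() + b"\n")
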
MediaPipe C++ API
  1. Download MediaPipe and set it up. MediaPipe >=0.8.0 is NOT supported and should not be used. Make sure the provided hand tracking example is working to verify that all dependencies are installed.
  2. Clone this repo in the top level directory of mediapipe. Install all of Gestop's dependencies.
  3. Run the instructions below to build and then execute the code.

Note: Run build instructions in the mediapipe/ directory, not inside this directory.

GPU (Linux only)
bazel build -c opt --verbose_failures --copt -DMESA_EGL_NO_X11_HEADERS --copt -DEGL_NO_X11 gestop:hand_tracking_gpu

GLOG_logtostderr=1 bazel-bin/gestop/hand_tracking_gpu --calculator_graph_config_file=gestop/gestop/keypoint_gen/hand_tracking_desktop_live.pbtxt
CPU
bazel build -c opt --define MEDIAPIPE_DISABLE_GPU=1 gestop:hand_tracking_cpu

GLOG_logtostderr=1 bazel-bin/gestop/hand_tracking_cpu --calculator_graph_config_file=gestop/keypoint_gen/hand_tracking_desktop_live.pbtxt

Overview

The hand keypoints are detected using Google's MediaPipe. These keypoints are then fed into receiver.py. The tool recognizes two kinds of gestures:

  1. Static Gestures : Gestures whose meaning can be inferred from a single image.
  2. Dynamic Gestures : Gestures which can only be understood through a sequence of images, i.e. a video.

Static gestures, by default, are mapped to all functionality relevant to the mouse, such as left mouse click, scroll, etc. Combined with mouse tracking, this allows one to replace the mouse entirely. The mouse is tracked simply by moving the hand, with the tip of the index finger reflecting the position of the cursor. The gestures related to mouse actions are detailed below. To train the neural network to recognize static gestures, a dataset was created manually for the available gestures.
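
To make the fingertip-to-cursor mapping concrete, here is a minimal sketch that scales MediaPipe's normalized index-fingertip coordinates to screen pixels. pyautogui stands in for Gestop's actual mouse backend, and the smoothing factor is an assumption.

import pyautogui

screen_w, screen_h = pyautogui.size()
smooth_x, smooth_y = screen_w / 2, screen_h / 2  # last cursor position

def move_cursor(tip_x, tip_y, alpha=0.5):
    """tip_x/tip_y are MediaPipe landmark coordinates in [0, 1]."""
    global smooth_x, smooth_y
    # Exponential smoothing so small hand tremors don't shake the cursor
    smooth_x = alpha * (tip_x * screen_w) + (1 - alpha) * smooth_x
    smooth_y = alpha * (tip_y * screen_h) + (1 - alpha) * smooth_y
    pyautogui.moveTo(smooth_x, smooth_y)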

For more complicated gestures involving movement of the hand, dynamic gestures can be used. By default, these are mapped to various other actions to interface with the system, such as modifying screen brightness, switching workspaces, taking screenshots, etc. The data for these dynamic gestures comes from the SHREC2017 dataset. Dynamic gestures are detected by holding down the Ctrl key (which freezes the cursor), performing the gesture, and then releasing the key.
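
The hold-Ctrl-then-release flow can be pictured as buffering keypoint frames while the key is down and classifying the sequence on release. The sketch below uses pynput's keyboard listener; classify_dynamic_gesture and on_new_keypoints are hypothetical hooks standing in for Gestop's recognizer and keypoint stream.

from pynput import keyboard

frame_buffer = []
ctrl_held = False
CTRL_KEYS = (keyboard.Key.ctrl, keyboard.Key.ctrl_l, keyboard.Key.ctrl_r)

def classify_dynamic_gesture(frames):
    # Hypothetical stand-in for the dynamic-gesture network
    print(f"classifying a sequence of {len(frames)} frames")

def on_new_keypoints(keypoints):
    """Called once per frame by the keypoint stream (hypothetical hook)."""
    if ctrl_held:
        frame_buffer.append(keypoints)

def on_press(key):
    global ctrl_held
    if key in CTRL_KEYS:
        ctrl_held = True  # cursor tracking would freeze here

def on_release(key):
    global ctrl_held
    if key in CTRL_KEYS:
        ctrl_held = False
        if frame_buffer:
            classify_dynamic_gesture(list(frame_buffer))
            frame_buffer.clear()

listener = keyboard.Listener(on_press=on_press, on_release=on_release)
listener.start()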

The project consists of a few distinct pieces, tied together in the sketch after this list:

  • MediaPipe - Accessed through either the Python API or the C++ API, MediaPipe tracks the hand, generates the keypoints and transmits them.
  • Gesture Receiver - See receiver.py, responsible for handling the stream and utilizing the following modules.
  • Mouse Tracker - See mouse_tracker.py, responsible for moving the cursor using the position of the index finger.
  • Gesture Recognizer - See recognizer.py, takes in the keypoints from the mediapipe executable, and converts them into a high level description of the state of the hand, i.e. a gesture name.
  • Gesture Executor - See executor.py, uses the gesture name from the previous module, and executes an action.
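
Put together, the per-frame flow looks roughly like the sketch below; every name is a hypothetical stand-in for the corresponding module, not Gestop's actual API.

def track_mouse(keypoints):
    """Stand-in for the Mouse Tracker (mouse_tracker.py)."""
    pass  # see the cursor sketch in the Overview above

def recognize(keypoints):
    """Stand-in for the Gesture Recognizer (recognizer.py)."""
    return "left_click"  # a real recognizer feeds the keypoints to a network

def execute(gesture):
    """Stand-in for the Gesture Executor (executor.py)."""
    print(f"executing action for: {gesture}")

def handle_frame(keypoints):
    """Stand-in for the Gesture Receiver's loop body (receiver.py)."""
    track_mouse(keypoints)
    execute(recognize(keypoints))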

Notes

  • For best performance, perform dynamic gestures with the right hand only, as all data from SHREC is right-hand only.
  • For dynamic gestures to work properly, you may need to change the keycodes being used in executor.py. Use the provided find_keycode.py script (a generic version is sketched after these notes) to find the keycodes of the keys used to change screen brightness and volume. Finally, system shortcuts may need to be remapped so that they work even with the Ctrl key held down. For example, in addition to the usual default behaviour of <Prnt_Screen> taking a screenshot, you may need to add <Ctrl+Prnt_Screen> as a shortcut as well.
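
If you need to discover keycodes the way find_keycode.py does, a pynput listener along these lines prints the virtual keycode of every key pressed; this is a generic sketch, not the bundled script itself.

from pynput import keyboard

def on_press(key):
    # Media keys (brightness, volume, etc.) expose a virtual keycode via .vk
    print(f"key: {key}, vk: {getattr(key, 'vk', None)}")
    if key == keyboard.Key.esc:
        return False  # returning False stops the listener

with keyboard.Listener(on_press=on_press) as listener:
    listener.join()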

Customizing Gestop

Available Gestures

API Reference

Useful Information

  • Joints of the hand
  • HandCommander
  • Video recorded with VokoScreenNG
