All Projects → victordibia → Handtrack.js

victordibia / Handtrack.js

Licence: mit
A library for prototyping realtime hand detection (bounding box), directly in the browser.

Programming Languages

javascript
184084 projects - #8 most used programming language

Projects that are alternatives of or similar to Handtrack.js

Ncrfpp
NCRF++, a Neural Sequence Labeling Toolkit. Easy use to any sequence labeling tasks (e.g. NER, POS, Segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.
Stars: ✭ 1,767 (-30.19%)
Mutual labels:  artificial-intelligence, neural-networks
Avalanche
Avalanche: a End-to-End Library for Continual Learning.
Stars: ✭ 151 (-94.03%)
Mutual labels:  artificial-intelligence, neural-networks
100daysofmlcode
My journey to learn and grow in the domain of Machine Learning and Artificial Intelligence by performing the #100DaysofMLCode Challenge.
Stars: ✭ 146 (-94.23%)
Mutual labels:  artificial-intelligence, neural-networks
Xaynet
Xaynet represents an agnostic Federated Machine Learning framework to build privacy-preserving AI applications.
Stars: ✭ 111 (-95.61%)
Mutual labels:  artificial-intelligence, neural-networks
Deep Learning Notes
My personal notes, presentations, and notebooks on everything Deep Learning.
Stars: ✭ 191 (-92.45%)
Mutual labels:  artificial-intelligence, neural-networks
Machine Learning Flappy Bird
Machine Learning for Flappy Bird using Neural Network and Genetic Algorithm
Stars: ✭ 1,683 (-33.5%)
Mutual labels:  artificial-intelligence, neural-networks
Hands On Machine Learning With Scikit Learn Keras And Tensorflow
Notes & exercise solutions of Part I from the book: "Hands-On ML with Scikit-Learn, Keras & TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems" by Aurelien Geron
Stars: ✭ 151 (-94.03%)
Mutual labels:  artificial-intelligence, neural-networks
Ai Residency List
List of AI Residency & Research programs, Ph.D Fellowships, Research Internships
Stars: ✭ 69 (-97.27%)
Mutual labels:  artificial-intelligence, neural-networks
Fixy
Amacımız Türkçe NLP literatüründeki birçok farklı sorunu bir arada çözebilen, eşsiz yaklaşımlar öne süren ve literatürdeki çalışmaların eksiklerini gideren open source bir yazım destekleyicisi/denetleyicisi oluşturmak. Kullanıcıların yazdıkları metinlerdeki yazım yanlışlarını derin öğrenme yaklaşımıyla çözüp aynı zamanda metinlerde anlamsal analizi de gerçekleştirerek bu bağlamda ortaya çıkan yanlışları da fark edip düzeltebilmek.
Stars: ✭ 165 (-93.48%)
Mutual labels:  artificial-intelligence, neural-networks
Iresnet
Improved Residual Networks (https://arxiv.org/pdf/2004.04989.pdf)
Stars: ✭ 163 (-93.56%)
Mutual labels:  artificial-intelligence, neural-networks
Sonnet
TensorFlow-based neural network library
Stars: ✭ 9,129 (+260.69%)
Mutual labels:  artificial-intelligence, neural-networks
Vectorai
Vector AI — A platform for building vector based applications. Encode, query and analyse data using vectors.
Stars: ✭ 195 (-92.3%)
Mutual labels:  artificial-intelligence, neural-networks
Mit Deep Learning
Tutorials, assignments, and competitions for MIT Deep Learning related courses.
Stars: ✭ 8,912 (+252.11%)
Mutual labels:  artificial-intelligence, neural-networks
Persephone
A tool for automatic phoneme transcription
Stars: ✭ 130 (-94.86%)
Mutual labels:  artificial-intelligence, neural-networks
Get started with deep learning for text with allennlp
Getting started with AllenNLP and PyTorch by training a tweet classifier
Stars: ✭ 69 (-97.27%)
Mutual labels:  artificial-intelligence, neural-networks
Ai Blocks
A powerful and intuitive WYSIWYG interface that allows anyone to create Machine Learning models!
Stars: ✭ 1,818 (-28.17%)
Mutual labels:  artificial-intelligence, neural-networks
Ai Platform
An open-source platform for automating tasks using machine learning models
Stars: ✭ 61 (-97.59%)
Mutual labels:  artificial-intelligence, neural-networks
Graph 2d cnn
Code and data for the paper 'Classifying Graphs as Images with Convolutional Neural Networks' (new title: 'Graph Classification with 2D Convolutional Neural Networks')
Stars: ✭ 67 (-97.35%)
Mutual labels:  artificial-intelligence, neural-networks
Wyrm
Autodifferentiation package in Rust.
Stars: ✭ 164 (-93.52%)
Mutual labels:  artificial-intelligence, neural-networks
Dm control
DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.
Stars: ✭ 2,592 (+2.41%)
Mutual labels:  artificial-intelligence, neural-networks

Handtrack.js

npm version npm

View a live demo in your browser here.

Note: Version 0.0.13 is the old version of handtrack.js which tracks only hands and was trained on the egohands dataset. It is slightly more stable than the recent version v 0.1.x which is trained on a new dataset (still in active development) and supports more classes (open, closed, pinch, point, zoom etc.). You might see some issues with the new version (feel free to downgrade to 0.0.13 as needed) and also report the issues you see.

Handtrack.js is a library for prototyping realtime hand detection (bounding box), directly in the browser. It frames handtracking as an object detection problem, and uses a trained convolutional neural network to predict bounding boxes for the location of hands in an image.

Whats New? v0.1.x

Handtrack.js is currently being updated (mostly around optimizations for speed/accuracy and functionality). Here is a list of recent changes:

  • New dataset curation: A new dataset (~2000 images, 6000 labels) has been curated to cover new hand poses (discussed below) and focuses on the viewpoint of a user facing a webcam. Note that the dataset is not released (mostly because it contains personal information on the participants and effort is still underway to extract a subset that is free of PII). In the meantime, the project can still be reproduced using the egohands dataset which is public.

  • New Classes: Following a review of the use cases that developers have created so far with handtrack.js (e.g. game controls, detect face touching to minimize covid spread, air guitar etc), a new set of hand pose labels have been curated:

    • Open: All fingers are extended in an open palm position. This represents an open hand which can be the drop mode of a drag and drop operation.
    • Closed: All fingers are contracted in a ball in a closed fist position. The closed hand is similar to the drag mode for a drag and drop operation.
    • Pinch: The thumb and index finger are are together in an picking gesture. This can also double as a grab or drag mode in a drag and drop operation.
    • Point: Index finger is extended in a pointing gesture.
    • Face: To help disambiguate between the face and hands (a failure point for the previous version of handtrack.js), and to also enable face tracking applications in the same library, a face label has been added.
  • Reduced Model size: Handtrack.js now supports multiple models (e.g. ssd320fpnlite, ssd640fpnlite) with multiple sizes (large, medium and small). The large size is the default fp32 version of the each model while medium and small are fp16 and Int8 quantized versions respectively. In my experiments, the small version yields comparable accuracy but with a much smaller model weight size. For example, ssd320fpnlite sizes (large -> 12MB, medium -> 6MB, small -> 3MB!)

Note that smaller models don't translate to faster inference speed - all three sizes yield about the same FPS.

  • Model Accuracy: Early testing shows the new model to be more accurate for the front facing web cam viewpoint detection. The inclusion of face labels also reduces the earlier face misclassifications
  • Javascript Library: The handtrack.js library has been updated to fix issues with correct input image resolutions, upgrade the underlying tensorflowjs models, provide more customization options (e.g. use of small medium or large base models etc)

The underlying models are trained using the tensorflow object detection api (see here).

FPS Image Size Device Browser Comments
26 450 * 380 Macbook Pro (i7, 2.2GHz, 2018) Chrome Version 72.0.3626 --
14 450 * 380 Macbook Pro (i7, 2.2GHz, mid 2014) Chrome Version 72.0.3626 --

Note: Handtrack.js has not been extensively tested on mobile browsers. There have been some known inconsistencies still being investigated.

Documentation

Handtrack.js is provided as a useful wrapper to allow you prototype hand/gesture based interactions in your web applications. without the need to understand machine learning. It takes in a html image element (img, video, canvas elements, for example) and returns an array of bounding boxes, class names and confidence scores.

Import the Handtrack.js Library

Note that the current version of the handtrack.js library is designed to work in the browser (frontend Javascript) and not Node.js.

Handtrack.js can be imported into your application either via a script tag or via npm. Once imported, handtrack.js provides an asynchronous load() method which returns a promise for a object detection model object.

Import via Script Tag

<!-- Load the handtrackjs model. -->
<script src="https://cdn.jsdelivr.net/npm/handtrackjs@latest/dist/handtrack.min.js"> </script>

<!-- Replace this with your image. Make sure CORS settings allow reading the image! -->
<img id="img" src="hand.jpg"/>  

<!-- Place your code in the script tag below. You can also use an external .js file -->
<script>
  const img = document.getElementById('img'); 
  const model =  await handTrack.load();
  const predictions = await model.detect(img); 
</script>

via NPM

npm install --save handtrackjs
import * as handTrack from 'handtrackjs';

const img = document.getElementById('img'); 
const model =  await handTrack.load();
const predictions = await model.detect(img); 

Handtrack.js also proivdes a set of library helper methods (e.g. to start and stop video playback on a video element) and some model methods (e.g. detect, getFPS etc). Please see the project documentation page for more details on the API and examples.

How does Handtrack.js Work?

  • Trained using a dataset of annotated hand poses (open, closed, pointing, pinch ..) mostly captured from a user webcam viewpoint. This is also augmented with samples from other viewpoints (round table discussion participants, egocentric viewpoints) and varied lighting conditions. An earlier version of handtrack.js was trained with the egohands dataset.
  • Trained model is converted to the Tensorflowjs webmodel format
  • Model is wrapped into an npm package, and can be accessed using jsdelivr, a free open source cdn that lets you include any npm package in your web application. You may notice the model is loaded slowly the first time the page is opened but gets faster on subsequent loads (caching).

When Should I Use Handtrack.js

If you are interested in prototyping gesture based (body as input) interactive experiences, Handtrack.js can be useful. The user does not need to attach any additional sensors or hardware but can immediately take advantage of engagement benefits that result from gesture based and body-as-input interactions.

Some (not all) relevant scenarios are listed below: 

  • When mouse motion can be mapped to hand motion for control purposes.
  • When an overlap of hand and other objects can represent meaningful interaction signals (e.g a touch or selection event for an object).
  • Scenarios where the human hand motion can be a proxy for activity recognition (e.g. automatically tracking movement activity from a video or images of individuals playing chess). Or simply counting how many humans are present in an image or video frame.
  • You want an accessible demonstration that anyone can easily run or tryout with minimal setup.
  • You are interested in usecases that have offline, realtime, privacy requirements.

Limitations

The main limitation currently is that handtrack.js is still a fairly heavy model and there have been some inconsistent results when run on mobile.

Run Demo

Commands below runs the demo example in the demo folder.

npm install
npm run start

The start script launches a simple python3 webserver from the demo folder using http.server. You should be able to view it in your browser at http://localhost:3005/. You can also view the pong game control demo on same link http://localhost:3005/pong.html

Citing this Work (see abstract)

Paper abstract of the paper is here. (a full paper will be added when complete).

If you use this code in your project/paper/research and would like to cite this work, use the below.

Victor Dibia, HandTrack: A Library For Prototyping Real-time Hand TrackingInterfaces using Convolutional Neural Networks, https://github.com/victordibia/handtracking

@article{Dibia2017,
  author = {Victor, Dibia},
  title = {HandTrack: A Library For Prototyping Real-time Hand TrackingInterfaces using Convolutional Neural Networks},
  year = {2017},
  publisher = {GitHub},
  journal = {GitHub repository},
  url = {https://github.com/victordibia/handtracking/tree/master/docs/handtrack.pdf}, 
}

TODO (ideas welcome)

  • Optimization: This thing is still compute heavy (your fans may spin after while). This is mainly because of the neural net operations needed to predict bounding boxes. I am currently exploring CenterNets (an anchor free object detection model) as one way to minimize compute requirements.

  • Tracking id's across frames. Perhaps some nifty methods that assigns ids to each had as they enter a frame and tracks them (e.g based on naive euclidean distance).

  • Add some discrete poses (e.g. instead of just hand, detect open hand, closed, ).

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].