Dragonfly
=========

|Build Status| |Docs Status| |Join Gitter chat| |Join Matrix chat|

.. contents:: Contents

Introduction
------------

Dragonfly is a speech recognition framework for Python that makes it convenient to create custom commands to use with speech recognition software. It allows Python macros, scripts, and applications to interface easily with speech recognition engines, and its design allows speech commands and grammar objects to be treated as first-class Python objects.

Dragonfly can be used for general programming by voice. It is flexible enough to allow programming in any language, not just Python. It can also be used for speech-enabling applications, automating computer activities and dictating prose.

Dragonfly contains its own powerful framework for defining and executing actions. It includes actions for text input and key-stroke simulation. This framework is cross-platform, working on Windows, macOS and Linux (X11 only). See the `actions sub-package documentation <https://dragonfly2.readthedocs.io/en/latest/actions.html>`__ for more information, including code examples.

This project is a fork of the original `t4ngo/dragonfly <https://github.com/t4ngo/dragonfly>`__ project.

Dragonfly currently supports the following speech recognition engines:

  • Dragon, a product of Nuance. All versions up to 15 (the latest) should be supported. The Home, Professional Individual, and previous similar editions of Dragon are supported; other editions may work too
  • Windows Speech Recognition (WSR), included with Microsoft Windows Vista, Windows 7+, and freely available for Windows XP
  • Kaldi (under development)
  • CMU Pocket Sphinx (with caveats)

Documentation and FAQ
---------------------

Dragonfly's documentation is available online at `Read the Docs <http://dragonfly2.readthedocs.org/en/latest/>`_. The changes in each release are listed in the project's `changelog <https://github.com/dictation-toolbox/dragonfly/blob/master/CHANGELOG.rst>`_. Dragonfly's FAQ is available in the documentation `here <https://dragonfly2.readthedocs.io/en/latest/faq.html>`__. There are also a number of Dragonfly-related questions on `Stack Overflow <http://stackoverflow.com/questions/tagged/python-dragonfly>`_, although many of them are related to issues resolved in the latest version of Dragonfly.

CompoundRule usage example
--------------------------

A very simple example of Dragonfly usage is to create a static voice command with a callback that will be called when the command is spoken. This is done as follows:

.. code-block:: python

    from dragonfly import Grammar, CompoundRule

    # Voice command rule combining spoken form and recognition processing.
    class ExampleRule(CompoundRule):
        spec = "do something computer"                  # Spoken form of command.
        def _process_recognition(self, node, extras):   # Callback when command is spoken.
            print("Voice command spoken.")

    # Create a grammar which contains and loads the command rule.
    grammar = Grammar("example grammar")                # Create a grammar to contain the command rule.
    grammar.add_rule(ExampleRule())                     # Add the command rule to the grammar.
    grammar.load()                                      # Load the grammar.

To use this example, save it in a command module in your module loader directory or Natlink user directory, load it and then say *do something computer*. If the speech recognition engine recognizes the command, then ``Voice command spoken.`` will be printed in the Natlink messages window. If you're not using Dragon, it will be printed to the console window instead.

MappingRule usage example
-------------------------

A more common use of Dragonfly is the MappingRule class, which allows defining multiple voice commands. The following example is a simple grammar to be used when Notepad is the foreground window:

.. code-block:: python

    from dragonfly import (Grammar, AppContext, MappingRule, Dictation,
                           Key, Text)

    # Voice command rule combining spoken forms and action execution.
    class NotepadRule(MappingRule):
        # Define the commands and the actions they execute.
        mapping = {
            "save [file]":            Key("c-s"),
            "save [file] as":         Key("a-f, a/20"),
            "save [file] as <text>":  Key("a-f, a/20") + Text("%(text)s"),
            "find <text>":            Key("c-f/20") + Text("%(text)s\n"),
        }

        # Define the extras list of Dragonfly elements which are available
        # to be used in mapping specs and actions.
        extras = [
            Dictation("text")
        ]


    # Create the grammar and the context under which it'll be active.
    context = AppContext(executable="notepad")
    grammar = Grammar("Notepad example", context=context)

    # Add the command rule to the grammar and load it.
    grammar.add_rule(NotepadRule())
    grammar.load()

To use this example, save it in a command module in your module loader directory or Natlink user directory, load it, open a Notepad window and then say one of the mapping commands. For example, saying *save* or *save file* will cause the Control and S keys to be pressed.
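The ``%(text)s`` placeholders in the mapping above rely on Python's named ``%``-string formatting: when a command is recognized, the extras (here, the dictated ``text``) are substituted into the action's spec. A minimal stand-alone sketch of that substitution, in plain Python with no speech engine involved (the example filename is made up):

```python
# The Text action's spec behaves like a %-format string; the extras
# recognized at decode time form the substitution mapping.
extras = {"text": "report.txt"}          # e.g. the dictated <text> extra

save_as = "%(text)s" % extras            # what "save [file] as <text>" would type
find = "find %(text)s\n" % extras        # hypothetical rendering of a find spec

print(save_as)   # report.txt
print(find)      # find report.txt
```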

The examples above don't show any of Dragonfly's exciting features, such as dynamic speech elements. To learn more about these, please take a look at `Dragonfly's online docs <http://dragonfly2.readthedocs.org/en/latest/>`__.

Installation
------------

Dragonfly is a Python package. It can be installed as ``dragonfly2`` using pip:

.. code:: shell

    pip install dragonfly2

The distribution name has been changed to *dragonfly2* in order to upload releases to PyPI.org, but everything can still be imported using *dragonfly*. If you use any grammar modules that include something like :code:`pkg_resources.require("dragonfly >= 0.6.5")`, you will need to either replace :code:`dragonfly` with :code:`dragonfly2` or remove lines like this altogether.

If you are installing this on Linux, you will also need to install the `xdotool <https://www.semicomplete.com/projects/xdotool/>`__ and `wmctrl <https://www.freedesktop.org/wiki/Software/wmctrl/>`__ programs. You may also need to manually set the XDG_SESSION_TYPE environment variable to x11.
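For example, on an apt-based distribution the two programs can typically be installed as follows (a sketch; the package names and the manual environment-variable step are assumptions to adapt for your system):

```shell
# Install the helper programs Dragonfly uses on X11
# (package names assumed for Debian/Ubuntu-style systems).
sudo apt-get install xdotool wmctrl

# If the session type is not detected automatically, set it manually:
export XDG_SESSION_TYPE=x11
```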

Please note that Dragonfly is only fully functional in an X11 session on Linux. Input action classes, application contexts and the Window class will not be functional under Wayland. It is recommended that Wayland users switch to X11.

If you have Dragonfly installed under the original ``dragonfly`` distribution name, you'll need to remove the old version using:

.. code:: shell

    pip uninstall dragonfly

Dragonfly can also be installed by cloning this repository or downloading it from the `releases page <https://github.com/dictation-toolbox/dragonfly/releases>`__ and running the following (or similar) command in the project's root directory:

.. code:: shell

    python setup.py install

If pip fails to install dragonfly2 or any of its required or extra dependencies, then you may need to upgrade pip with the following command:

.. code:: shell

    pip install --upgrade pip

SR engine back-ends
-------------------

Each Dragonfly speech recognition engine back-end and its requirements are documented separately:

  • `Natlink and DNS engine <http://dragonfly2.readthedocs.org/en/latest/natlink_engine.html>`_
  • `SAPI 5 and WSR engine <http://dragonfly2.readthedocs.org/en/latest/sapi5_engine.html>`_
  • `Kaldi engine <http://dragonfly2.readthedocs.org/en/latest/kaldi_engine.html>`_
  • `CMU Pocket Sphinx engine <http://dragonfly2.readthedocs.org/en/latest/sphinx_engine.html>`_
  • `Text-input engine <http://dragonfly2.readthedocs.org/en/latest/text_engine.html>`_

Existing command modules
------------------------

The related resources page of Dragonfly's documentation has a section on `command modules <http://dragonfly2.readthedocs.org/en/latest/related_resources.html#command-modules>`__ which lists various sources.

.. |Build Status| image:: https://travis-ci.org/dictation-toolbox/dragonfly.svg?branch=master
   :target: https://travis-ci.org/dictation-toolbox/dragonfly
.. |Docs Status| image:: https://readthedocs.org/projects/dragonfly2/badge/?version=latest&style=flat
   :target: https://dragonfly2.readthedocs.io
.. |Join Gitter chat| image:: https://badges.gitter.im/Join%20Chat.svg
   :target: https://gitter.im/dictation-toolbox/dragonfly
.. |Join Matrix chat| image:: https://img.shields.io/matrix/dragonfly2:matrix.org.svg?label=%5Bmatrix%5D
   :target: https://app.element.io/#/room/#dictation-toolbox_dragonfly:gitter.im
