
mklement0 / speak.awf

Licence: other
An Alfred 3 workflow that uses macOS's TTS (text-to-speech) feature to speak text aloud.

Programming Languages

shell
77523 projects
Makefile
30231 projects

Projects that are alternatives of or similar to speak.awf

voices
macOS CLI for changing the default TTS (text-to-speech) voice and printing information about and speaking text with multiple voices.
Stars: ✭ 53 (+82.76%)
Mutual labels:  text-to-speech, tts, voices
tts dataset maker
A gui to help make a text to speech dataset.
Stars: ✭ 20 (-31.03%)
Mutual labels:  text-to-speech, tts
sam
SAM: Software Automatic Mouth (Ported from https://github.com/vidarh/SAM)
Stars: ✭ 33 (+13.79%)
Mutual labels:  text-to-speech, tts
text-to-speech
⚡️ Capacitor plugin for synthesizing speech from text.
Stars: ✭ 50 (+72.41%)
Mutual labels:  text-to-speech, tts
laravel-text-to-speech
💬 A wrapper for popular TTS services to create a more simple & uniform API. Currently, only AWS Polly is supported.
Stars: ✭ 26 (-10.34%)
Mutual labels:  text-to-speech, tts
TTS tf
WIP Tensorflow implementation of https://github.com/mozilla/TTS
Stars: ✭ 14 (-51.72%)
Mutual labels:  text-to-speech, tts
dctts-pytorch
The pytorch implementation of DC-TTS
Stars: ✭ 73 (+151.72%)
Mutual labels:  text-to-speech, tts
WaveGrad2
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Stars: ✭ 55 (+89.66%)
Mutual labels:  text-to-speech, tts
ttslearn
ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)
Stars: ✭ 158 (+444.83%)
Mutual labels:  text-to-speech, tts
spokestack-android
Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!
Stars: ✭ 52 (+79.31%)
Mutual labels:  text-to-speech, tts
bingspeech-api-client
Microsoft Bing Speech API client in node.js
Stars: ✭ 32 (+10.34%)
Mutual labels:  text-to-speech, tts
FastSpeech2
PyTorch Implementation of FastSpeech 2 : Fast and High-Quality End-to-End Text to Speech
Stars: ✭ 163 (+462.07%)
Mutual labels:  text-to-speech, tts
STYLER
Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech, INTERSPEECH 2021
Stars: ✭ 105 (+262.07%)
Mutual labels:  text-to-speech, tts
Tacotron2-PyTorch
Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed.
Stars: ✭ 118 (+306.9%)
Mutual labels:  text-to-speech, tts
SpeakIt Vietnamese TTS
Vietnamese Text-to-Speech on Windows Project (zalo-speech)
Stars: ✭ 81 (+179.31%)
Mutual labels:  text-to-speech, tts
Parallel-Tacotron2
PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
Stars: ✭ 149 (+413.79%)
Mutual labels:  text-to-speech, tts
persian-tts
🔊 A simple human-based text-to-speech synthesiser and React Native app for the Persian language.
Stars: ✭ 18 (-37.93%)
Mutual labels:  text-to-speech, tts
golang-tts
Text-to-speech golang package based on the Amazon Polly service
Stars: ✭ 19 (-34.48%)
Mutual labels:  text-to-speech, tts
Daft-Exprt
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis
Stars: ✭ 41 (+41.38%)
Mutual labels:  text-to-speech, tts
LVCNet
LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation
Stars: ✭ 67 (+131.03%)
Mutual labels:  text-to-speech, tts



speak.awf — Alfred 3+ TTS (text-to-speech) workflows

An Alfred workflow that uses macOS's TTS (text-to-speech) feature to speak text aloud, especially for multi-lingual use (macOS allows on-demand download of voices in other languages).
Note: Use of workflows in Alfred requires the paid Power Pack add-on - an investment well worth making.

The workflow comes with two distinct feature groups:

  • Speak the active application's text with a specific voice.

    • Useful for multi-lingual setups where you want to have text spoken in one of several languages on demand; for instance, you could have one global keyboard shortcut for speaking text in English, and another for Spanish.
  • Speak specified text with one or more voices, selectable by name(s) or language(s).

    • Useful for interactive experimentation with multiple voices, such as to contrast regional accents.

Note that both feature groups target active voices; i.e., the set of voices selected for active use in System Preferences.
If you know that a voice is installed, yet it doesn't show up in the workflows, make sure it has a check mark in System Preferences > Dictation & Speech > Text to Speech > System Voice > Customize....
This is also where you download additional voices.
You can get there more quickly from either the speak or say keywords with no arguments by pressing ⌥↩ (Option+Return).
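
Tip: If you just want to check from Terminal which voices are installed and what their language IDs look like, macOS's standard say utility can do so independently of this workflow; a quick sketch (note that say lists installed voices, which need not exactly match the active set the workflow shows):

# List installed voices, one per line, with each voice's language ID and demo text.
say -v '?'

# Speak a test sentence with a specific voice (here: the preinstalled "Alex" voice).
say -v Alex 'Just a test.'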

Caveats:

  • Only voices provided by Apple are supported, not third-party voices such as InfoVox iVox.

  • Additionally, as of macOS 10.15, Siri voices are not supported, due to lack of API support (see this Stack Overflow question).

See also: CLI voices, which this workflow uses behind the scenes.

Speaking the active application's text with a specific voice

This feature group comprises:

  • Keyword speak with an ad-hoc selectable voice; e.g.:
    • speak alex or speak @alex speaks with voice "Alex".
  • Potentially multiple global keyboard shortcuts to speak with a specific, predefined voice, which is particularly useful for multilingual setups: you can define dedicated keyboard shortcuts to speak with language-specific voices;
    for instance, you could have one keyboard shortcut for speaking English texts, and another for Spanish.

This feature is based on the system feature for speaking the active application's text, wrapping it with the ability to speak with a specific voice on demand.

This implies the following, whether you invoke the feature with keyword speak for ad-hoc voice selection or via a dedicated keyboard shortcut:

  • It acts as a toggle: invoking the keyword or a shortcut again while speech is still in progress stops it.

  • You can, but do not need to, select the text to speak: depending on the application, all text may be spoken implicitly (e.g., in TextEdit.app); in Safari.app, Reader view (if available) is automatically activated to read only the text of interest.
    Conversely, however, non-native macOS applications may not report even explicitly selected text to the system; in that case, use the global keyboard shortcut associated with the say keyword (see next chapter).

Note:

  • A side effect of speaking with a given voice is that the chosen voice implicitly becomes the new default voice.
    This means that, from that point on, invoking speech without specifying a voice will use that voice.

  • Keystrokes are sent behind the scenes to activate the system feature for speaking the active application's text with the default voice. For the most part, this works fine, but occasionally, especially under heavy system load, this may fail.
    (If you know of a way to invoke this system feature programmatically, do let me know.)

    • If speaking doesn't start, at least the part of switching to the target voice should have succeeded, so you can then try to use the system keyboard shortcut (Option+Esc by default) to trigger speaking.
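
For the curious: sending such a keystroke programmatically amounts to a System Events key event. The following is a minimal illustration of the general technique (not the workflow's actual script), assuming the default ⌥⎋ shortcut is still assigned and the invoking process has been granted assistive (accessibility) access:

# Simulate pressing Option+Esc (key code 53 = Esc), which toggles the system's
# speak-selected-text feature while the default shortcut is in effect.
osascript -e 'tell application "System Events" to key code 53 using {option down}'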

Speak specified text with one or more voices

This feature group comprises:

  • Keyword say, which speaks given text as part of the Alfred 3 command line, with one or more voices selectable by name(s) or language(s).
  • A global keyboard shortcut that invokes say prefilled with the text explicitly selected in the active application.
    This is not only useful for speaking an application's text with multiple voices, but also for speaking explicitly selected text in non-native macOS applications, whose selected text the speak keyword wouldn't recognize.

say redisplays itself after speaking so as to facilitate iterative experimentation.

  • say <voice> or say @<voice>[,...] selects one or more voices by name; e.g.:
    • say alex Speak this. speaks "Speak this." with voice "Alex".
    • say @alex Speak this. does the same.
    • say @al,ji Speak this. speaks with voice "Alex", then "Jill" - note how using name prefixes is enough.
  • say #<lang,...> selects one or more languages by their IDs; e.g.:
    • say #enus,enin Speak this. speaks with all US-English (en_US) and Indian English voices (en_IN) - note how case and punctuation do not matter.
    • The list of active voices that is displayed by default shows each voice's language ID in parentheses.
  • The @ or # specifier may be placed either before or after the text to speak.
  • If you don't specify text to speak, the selected voices' demo text is spoken.
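
Outside Alfred, the standard macOS say command-line utility offers rough analogs of these selections, which can be handy for sanity-checking voices; the sketch below is illustrative only and not necessarily how the workflow implements them (speaking with multiple voices in sequence is the workflow's own feature, approximated here with a simple loop):

# Speak given text with a single voice, selected by name.
say -v Alex 'Speak this.'

# Find the names of voices for a given language by filtering the voice list on its language ID.
say -v '?' | grep 'en_US'

# Approximate speaking the same text with multiple voices, one after another.
for voice in Alex Jill; do say -v "$voice" 'Speak this.'; done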

Additionally, pressing ↩ (Return) together with a modifier key offers the following functionality:

  • ⌥↩ (Option+Return)
    • When invoked on a specific voice: makes that voice the new default voice.
    • When invoked on the 1st item (the current default voice): opens System Preferences to the relevant pane to manage voices and TTS features.
  • ⌃↩ (Control+Return)
    • Stops ongoing speech.
  • ⇧↩ (Shift+Return)
    • Clears the current arguments on the Alfred command line.

Feature summary

  • Voice selection, optionally with voice-name and target-language filtering.
  • Ability to speak text in sequence with multiple voices.
  • Rich, dynamic feedback (name of default voice, voice languages, demo text).
  • Selected voices can speak their demo text.
  • Redisplays Alfred with the same query for interactive experimentation.
  • Ability to change the default voice directly from Alfred.
  • Option to open System Preferences to manage voices and TTS options.
  • Option to use a hotkey to speak the selected text in any application.

Installation

Prerequisites

  • macOS (OS X) 10.10 or higher
    • If you only use the say keyword, you can use the workflow on older macOS versions too, provided you install it manually.
  • Alfred 3 with its paid Power Pack add-on.
  • The global TTS keyboard shortcut must be activated:
    • Open System Preferences.
    • In the Dictation & Speech pane, on the Text to Speech tab, ensure that Speak selected text when the key is pressed is checked.
    • No extra customization steps are needed if you leave the default global keyboard shortcut, ⌥⎋ (Option+Esc), in place (recommended).

Installation from the npm registry

Note: Even if you don't use Node.js itself, its package manager, npm, works across platforms and is easy to install; try:
curl -L http://git.io/n-install | bash

With Node.js installed, install the package as follows:

[sudo] npm install speak.awf -g

Note:

  • Whether you need sudo depends on how you installed Node.js and whether you've changed permissions later; if you get an EACCES error, try again with sudo.
  • Alfred 3 will prompt you to import the workflow - select a category (optional; "Tools" recommended), and confirm.
  • After importing, proceed with customization below.
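
To verify the global install afterwards, the following generic npm commands (not specific to this workflow) can be used:

# Show the globally installed package and its version.
npm ls -g speak.awf

# Print npm's global installation root, where the package's files were placed.
npm root -g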

Manual installation

  • Click here to download the installer.
  • Open the downloaded file: Alfred 3 will prompt you to import the workflow - select a category (optional; "Tools" recommended), and confirm.
  • After importing, proceed with customization below.

Customization

Caveat: If you reinstall or upgrade this workflow, your custom keyboard shortcuts and customized keywords are retained, but the following aspects of customization must be performed again:

  • You must reassign the specific voices assigned to the 3 predefined hotkey workflows for speaking the active application's text (see below).
    • If you've manually added additional hotkeys, they are, unfortunately, lost, and have to be recreated.
  • If you're using a custom system hotkey for speaking the active application's text (not recommended), you must tell this workflow about it again (see below).

Customizing the speaking-the-active-application's-text-with-a-specific-voice feature

Customization has two to three parts:

  • Decide what predefined voices you want text to be spoken with by global hotkey (keyboard shortcut).
  • Assign a hotkey to each such voice.
  • If you've chosen a custom hotkey for the Speak selected text when the key is pressed system feature (not recommended): see the next chapter.

Unless you're already there right after installing the workflow, open Alfred 3's Preferences... dialog via Alfred 3's menu-bar icon and locate the workflow Speak Active App's Text.

The workflow comes with three predefined hotkey-based definitions, based on the preinstalled voices "Alex", "Vicki", and "Victoria". Adapt them to your needs:

  • Double-click on each Hotkey box:
    • Assign the desired hotkey by clicking in the Hotkey field and pressing the desired key combination.
      • Recommendation: use ⌥1 (Option+1), ⌥2, ... for the voices of interest.
    • Specify the desired voice in the Text input field, using a voice name as displayed in System Preferences > Dictation & Speech > Text to Speech.

To define additional hotkey-triggered voices:

  • Control-click any existing Hotkey box and select Copy.
  • Control-click again and select Paste - a new, empty Hotkey box will appear.
  • Do the same thing for any existing Run Script box.
  • From the right edge of the new Hotkey box, drag a connection to the new Run Script box.
  • Customize the Hotkey box as described above.

Configuration with a custom system keyboard shortcut

  • Open Alfred's Preferences, locate this workflow (Speak - TTS (Text-To-Speech) Workflows), control-click on it in the list on the left, and select Show in Finder.
  • Open file toggleSpeaking in a text editor and follow the instructions at the top of the file.

Customizing the speaking-given-text-with-one-or-more-voices feature

To assign a hotkey (global keyboard shortcut) to the feature that invokes say with the text currently selected in the active application:

  • Double-click on the Hotkey box below the say keyword box.
  • Assign a hotkey (global keyboard shortcut) of choice.
    • Recommendation: use ⌥` (Option+`)

License

Copyright (c) 2015-2017 Michael Klement [email protected] (http://same2u.net), released under the MIT license.

Acknowledgements

This project gratefully depends on the following open-source components, according to the terms of their respective licenses.

npm dependencies below have optional suffixes denoting the type of dependency; the absence of a suffix denotes a required run-time dependency: (D) denotes a development-time-only dependency, (O) an optional dependency, and (P) a peer dependency.

npm dependencies

Changelog

Versioning complies with semantic versioning (semver).

  • v0.4.2 (2017-01-03):

    • [doc] Lack of support for third-party voices noted.
    • [fix] Invoking System Preferences to manage installed voices now works on macOS Sierra.
  • v0.4.1 (2016-10-02):

    • [breaking change] Updated to work with Alfred 3. If you still need Alfred 2 support, download v0.3.5.
  • v0.3.5 (2015-11-08):

    • [doc] README.md link to current installer fixed.
  • v0.3.4 (2015-11-07):

    • [doc] README.md corrections and improvements.
  • v0.3.3 (2015-11-03):

    • [doc] README.md corrections and improvements.
  • v0.3.2 (2015-11-03):

    • [dev] The workflow's source code is now in Alfred 2's "Tools" category (was previously uncategorized), though it turns out that Alfred 2 defaults to "Uncategorised" on import (installation).
  • v0.3.1 (2015-11-03):

    • [enhancement] The say-invoking hotkey now appends a space to the pasted text so as to allow typing @ right away to select a voice or voices of interest.
    • [fix] say now correctly reflects the current default voice even after changing it implicitly via hotkey.
    • [fix] Cache files are now stored in a folder that reflects the actual bundle ID, $HOME/Library/Caches/com.runningwithcrayons.Alfred-2/Workflow Data/net.same2u.speak.awf; the old folder, $HOME/Library/Caches/com.runningwithcrayons.Alfred-2/Workflow Data/net.same2u.say.awf, can safely be removed.
    • [doc] README.md corrections and improvements.
  • v0.3.0 (2015-11-02):

    • [major enhancements] Added keyword say for interactive experimentation with multiple voices, selectable by name(s) or language(s). Consistent use of modifier keys across keywords speak and say: ⌥↩ to make a specific voice the new default / invoke System Preferences to manage voices, ⌃↩ to stop ongoing speech, ⇧↩ to clear the current argument list.
  • v0.1.6 (2015-11-01):

    • [enhancement] Option+Enter makes a specific voice the new default voice; on the speak-with-default-voice and no-matching-voice-found result items it instead displays System Preferences for managing the installed/active voices.
    • [enhancement] Speak-with-default-voice result item now names the current default voice.
    • [doc] README.md corrections.
  • v0.1.5 (2015-10-30):

    • [doc] README.md update: npm badge and install instructions added.
  • v0.1.4 (2015-10-30):

    • [fix] Removed accidentally-left-behind debug output.
  • v0.1.3 (2015-10-30):

    • First version to be published at the npm registry.
  • v0.1.2 (2015-10-30):

    • [fix] Fix for neglecting to include the updated-by-commit-hook alfredworkflow/version file in the commit.
  • v0.1.1 (2015-10-30):

    • [doc] README.md improvements.
  • v0.1.0 (2015-10-30):

    • Initial release.