ZDisket / TensorVox

Licence: MIT license

Desktop application for neural speech synthesis written in C++

Programming Languages

36643 projects - #6 most used programming language

Labels

text-to-speech real-time desktop tts speech-synthesis phoneme voice-synthesis tacotron2 multiband-melgan mb-melgan fastspeech2

Projects that are alternatives of or similar to TensorVox

😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)

Stars: ✭ 2,382 (+1601.43%)

Mutual labels: text-to-speech, tts, speech-synthesis, tacotron2, multiband-melgan, fastspeech2

Comprehensive-Tacotron2

PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single-, multi-speaker TTS and several techniques to enforce the robustness and efficiency of the model.

Stars: ✭ 22 (-84.29%)

Mutual labels: text-to-speech, tts, speech-synthesis, tacotron2

AdaSpeech: Adaptive Text to Speech for Custom Voice

Stars: ✭ 108 (-22.86%)

Mutual labels: text-to-speech, tts, speech-synthesis, fastspeech2

spokestack-android

Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!

Stars: ✭ 52 (-62.86%)

Mutual labels: text-to-speech, tts, speech-synthesis, voice-synthesis

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

Stars: ✭ 5,427 (+3776.43%)

Mutual labels: text-to-speech, tts, tacotron2, multiband-melgan

Expressive-FastSpeech2

PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.

Stars: ✭ 139 (-0.71%)

Mutual labels: text-to-speech, tts, speech-synthesis

Windows "say"

Stars: ✭ 36 (-74.29%)

Mutual labels: text-to-speech, tts, speech-synthesis

Official implementation of Meta-StyleSpeech and StyleSpeech

Stars: ✭ 161 (+15%)

Mutual labels: text-to-speech, tts, speech-synthesis

Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.

Stars: ✭ 245 (+75%)

Mutual labels: text-to-speech, tts, speech-synthesis

An opensource text-to-speech (TTS) voice building tool

Stars: ✭ 362 (+158.57%)

Mutual labels: text-to-speech, tts, speech-synthesis

Spokestack Python

Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application.

Stars: ✭ 103 (-26.43%)

Mutual labels: text-to-speech, tts, speech-synthesis

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Stars: ✭ 1,604 (+1045.71%)

Mutual labels: text-to-speech, tts, speech-synthesis

LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

Stars: ✭ 31 (-77.86%)

Mutual labels: text-to-speech, tts, speech-synthesis

HTS-style full-context labels for JSUT v1.1

Stars: ✭ 28 (-80%)

Mutual labels: text-to-speech, tts, speech-synthesis

Cs224n Gpu That Talks

Attention, I'm Trying to Speak: End-to-end speech synthesis (CS224n '18)

Stars: ✭ 52 (-62.86%)

Mutual labels: text-to-speech, tts, speech-synthesis

Parallelwavegan

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN) with Pytorch

Stars: ✭ 682 (+387.14%)

Mutual labels: text-to-speech, tts, speech-synthesis

WaveRNN Vocoder + TTS

Stars: ✭ 1,636 (+1068.57%)

Mutual labels: text-to-speech, tts, speech-synthesis

Text to Speech with PyTorch (English and Mongolian)

Stars: ✭ 122 (-12.86%)

Mutual labels: text-to-speech, tts, speech-synthesis

Implementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.

Stars: ✭ 111 (-20.71%)

Mutual labels: text-to-speech, tts, speech-synthesis

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

Stars: ✭ 295 (+110.71%)

Mutual labels: text-to-speech, tts, speech-synthesis

TensorVox

TensorVox is an application designed to enable user-friendly and lightweight neural speech synthesis on the desktop, aimed at increasing accessibility to such technology.

Powered mainly by TensorFlowTTS and also by Coqui-TTS and VITS, it is written in pure C++/Qt, using the TensorFlow C API to run TensorFlow models (TensorFlowTTS and Coqui-TTS) and LibTorch to run PyTorch ones (VITS). This way, we can perform inference without having to install gigabytes' worth of Python libraries, just a few DLLs.
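The backend split described above can be pictured as a small dispatch step. The following is only a minimal sketch, not TensorVox's actual code: the names ModelRepo, Backend, and backendFor are hypothetical and exist purely to illustrate which runtime each model family would be routed to.

```cpp
#include <stdexcept>

// Hypothetical names for illustration only; not taken from the TensorVox sources.
enum class ModelRepo { TensorFlowTTS, CoquiTTS, VITS };
enum class Backend { TensorFlowCApi, LibTorch };

// Pick the inference runtime for a model family, following the README:
// TensorFlowTTS and (converted) Coqui-TTS models run through the TensorFlow C API,
// while VITS, a PyTorch model, runs through LibTorch.
Backend backendFor(ModelRepo repo) {
    switch (repo) {
        case ModelRepo::TensorFlowTTS:
        case ModelRepo::CoquiTTS:
            return Backend::TensorFlowCApi;
        case ModelRepo::VITS:
            return Backend::LibTorch;
    }
    // Reached only if an out-of-range value was cast to ModelRepo.
    throw std::invalid_argument("unknown model repo");
}
```

Under these assumptions, calling backendFor(ModelRepo::VITS) yields Backend::LibTorch, and both TensorFlow-based families yield Backend::TensorFlowCApi.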

Try it out

Detailed guide in Google Docs

Grab a copy from the releases, extract the .zip, and check the Google Drive folder for models and installation instructions.

If you're interested in using your own model, you first need to train it and then export it.

Supported architectures

TensorVox supports models from three repos:

TensorFlowTTS: FastSpeech2 and Tacotron2 (both char- and phoneme-based), and Multi-Band MelGAN. Here's a Colab notebook demonstrating how to export the LJSpeech pretrained, char-based Tacotron2 model:
Coqui-TTS: Tacotron2 (phoneme-based IPA) and Multi-Band MelGAN, after converting from PyTorch to Tensorflow. Here's a notebook showing how to export the LJSpeech DDC model:
jaywalnut310/VITS: VITS, which is a fully E2E model. (Stressed IPA as phonemes) Export notebook:

Those examples should provide you with enough guidance to understand what is needed. If you're looking to train a model specifically for this purpose, then I recommend TensorFlowTTS, as it is the one with the best support, and VITS, as it's the closest thing to perfect.

As for languages, out-of-the-box support is provided for English (Coqui, TFTTS, and VITS), and for German and Spanish (TensorFlowTTS only); that is, you won't have to do anything. You can add languages without modifying code, as long as the phoneme set is IPA (stressed or unstressed), ARPA, or GlobalPhone (open an issue and I'll explain it to you).
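To see why a new language need not require code changes, it helps to picture the phoneme set as data rather than code. The sketch below is an assumption-laden illustration, not TensorVox's real format or API: PhonemeTable, makeTable, and phonemesToIds are hypothetical names, and the idea that IDs follow the symbol order is assumed for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical symbol table: phoneme symbol -> integer ID fed to a model.
using PhonemeTable = std::unordered_map<std::string, std::int32_t>;

// Build a table from an ordered symbol list; each ID is the symbol's position.
// (Assumption: the real ID assignment depends on how the model was trained.)
PhonemeTable makeTable(const std::vector<std::string>& symbols) {
    PhonemeTable table;
    for (std::size_t i = 0; i < symbols.size(); ++i) {
        table.emplace(symbols[i], static_cast<std::int32_t>(i));
    }
    return table;
}

// Convert a space-separated phoneme string into a sequence of IDs.
// Throws std::out_of_range if a symbol is missing from the table.
std::vector<std::int32_t> phonemesToIds(const std::string& phonemes,
                                        const PhonemeTable& table) {
    std::vector<std::int32_t> ids;
    std::istringstream stream(phonemes);
    std::string symbol;
    while (stream >> symbol) {
        const auto it = table.find(symbol);
        if (it == table.end()) {
            throw std::out_of_range("unknown phoneme: " + symbol);
        }
        ids.push_back(it->second);
    }
    return ids;
}
```

With a table built from the ARPA symbols HH, AH0, L, and OW1 (in that order), the string "HH AH0 L OW1" maps to the IDs 0, 1, 2, 3. Because the table is plain data, a different phoneme set only means loading a different symbol list.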

Build instructions

Currently, only Windows 10 x64 is supported (although I've heard reports of it running on 8.1).

Requirements:

Qt Creator
MSVC 2017 (v141) compiler

Primed build (with all provided libraries):

Download precompiled binary dependencies and includes
Unzip it so that the deps folder is in the same place as the .pro and main source files.
Open the project with Qt Creator, add your compiler, and compile.

Note that to try your shiny new executable, you'll need to download a release of the program as described above and replace the executable in that release with your new one, so that you have all the DLLs in place.

TODO: Add instructions for compiling from scratch.

Externals (and thanks)

LibTorch: https://pytorch.org/cppdocs/installing.html
Tensorflow C API: https://www.tensorflow.org/install/lang_c
CppFlow (TF C API -> C++ wrapper): https://github.com/serizba/cppflow
AudioFile (for WAV export): https://github.com/adamstark/AudioFile
Frameless Dark Style Window: https://github.com/Jorgen-VikingGod/Qt-Frameless-Window-DarkStyle
JSON for modern C++: https://github.com/nlohmann/json
r8brain-free-src (Resampling): https://github.com/avaneev/r8brain-free-src
rnnoise (CMake version, denoising output): https://github.com/almogh52/rnnoise-cmake
Logitech LED Illumination SDK (Mouse RGB integration): https://www.logitechg.com/en-us/innovation/developer-lab.html
QCustomPlot: https://www.qcustomplot.com/index.php/introduction
libnumbertext: https://github.com/Numbertext/libnumbertext

Contact

You can open an issue here or join the Discord server and discuss or ask anything there.

For media, licensing, or any other formal inquiries, send an email to: [email protected]

Note about licensing

This program itself is MIT licensed, but the models you use are subject to their own license terms. For example, if you're in Vietnam and using TensorFlowTTS models, you'll have to check here for some details.

Stars: ✭ 140
