thuhcsi / Crystal

Licence: apache-2.0
Crystal - C++ implementation of a unified framework for multilingual TTS synthesis engine with SSML specification as interface.

Projects that are alternatives of or similar to Crystal

Spokestack Python
Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application.
Stars: ✭ 103 (-4.63%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Lightspeech
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
Stars: ✭ 31 (-71.3%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Parakeet
PAddle PARAllel text-to-speech toolKIT (supporting WaveFlow, WaveNet, Transformer TTS and Tacotron2)
Stars: ✭ 279 (+158.33%)
Mutual labels:  speech-synthesis, text-to-speech, tts
esp32-flite
Speech synthesis running on ESP32 based on Flite engine.
Stars: ✭ 28 (-74.07%)
Mutual labels:  text-to-speech, tts, speech-synthesis
Cs224n Gpu That Talks
Attention, I'm Trying to Speak: End-to-end speech synthesis (CS224n '18)
Stars: ✭ 52 (-51.85%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Comprehensive-Tacotron2
PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single-, multi-speaker TTS and several techniques to enforce the robustness and efficiency of the model.
Stars: ✭ 22 (-79.63%)
Mutual labels:  text-to-speech, tts, speech-synthesis
Parallelwavegan
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN) with Pytorch
Stars: ✭ 682 (+531.48%)
Mutual labels:  speech-synthesis, text-to-speech, tts
talkie
Text-to-speech browser extension button. Select text on any web page, and have the computer read it out loud for you by simply clicking the Talkie button.
Stars: ✭ 43 (-60.19%)
Mutual labels:  text-to-speech, tts, speech-synthesis
Multilingual text to speech
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
Stars: ✭ 324 (+200%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Hifi Gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Stars: ✭ 325 (+200.93%)
Mutual labels:  speech-synthesis, text-to-speech, tts
editts
Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech
Stars: ✭ 74 (-31.48%)
Mutual labels:  text-to-speech, tts, speech-synthesis
Wsay
Windows "say"
Stars: ✭ 36 (-66.67%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Fre-GAN-pytorch
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis
Stars: ✭ 73 (-32.41%)
Mutual labels:  text-to-speech, tts, speech-synthesis
Jsut Lab
HTS-style full-context labels for JSUT v1.1
Stars: ✭ 28 (-74.07%)
Mutual labels:  speech-synthesis, text-to-speech, tts
spokestack-android
Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!
Stars: ✭ 52 (-51.85%)
Mutual labels:  text-to-speech, tts, speech-synthesis
Glow Tts
A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Stars: ✭ 284 (+162.96%)
Mutual labels:  speech-synthesis, text-to-speech, tts
LVCNet
LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation
Stars: ✭ 67 (-37.96%)
Mutual labels:  text-to-speech, tts, speech-synthesis
ttslearn
ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)
Stars: ✭ 158 (+46.3%)
Mutual labels:  text-to-speech, tts, speech-synthesis
Cognitive Speech Tts
Microsoft Text-to-Speech API sample code in several languages, part of Cognitive Services.
Stars: ✭ 312 (+188.89%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Voice Builder
An opensource text-to-speech (TTS) voice building tool
Stars: ✭ 362 (+235.19%)
Mutual labels:  speech-synthesis, text-to-speech, tts

Crystal Text-to-Speech (TTS) Engine

C++ implementation of Crystal Text-to-Speech (TTS) engine.

The Crystal TTS engine implements Crystal, a unified framework for multilingual TTS synthesis. The framework defines the common TTS modules shared across languages and dialects. The interfaces between consecutive modules conform to the Speech Synthesis Markup Language (SSML) specification for standardization, interoperability, multilinguality, and extensibility.

Architecture

Reference

For the motivation and design of the framework, refer to the following paper. Please also cite this paper when referring to this project:

Native Support of SSML

The framework uses the Speech Synthesis Markup Language (SSML) specification as the interface between modules, so it supports SSML tags natively.

The framework also provides cst::xml::CSSMLTraversal (xml/ssml_traversal), which converts an SSML document into an internal data structure for convenient processing. As a result, you do not need to handle SSML parsing yourself when implementing your own algorithms. You only need to override the functions of the modules in cst::tts::base::*, which operate on the internal data structures.
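The override pattern described above can be illustrated with a minimal, self-contained sketch. Every name in it (the sketch namespace, Unit, ModuleBase, ToyG2P, run) is hypothetical and is not taken from Crystal's source; the real base classes in cst::tts::base::* and the real internal data structures will differ. The sketch only shows the division of labour: a base class owns the traversal, and the algorithm author overrides a single hook that receives already-parsed data.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT Crystal's real types.
namespace sketch {

// Internal representation of one text unit after the SSML document was traversed.
struct Unit {
    std::string text;      // surface text of the unit
    std::string phonemes;  // annotation filled in by a processing module
};

// Base module: owns the traversal and exposes one virtual hook per unit.
class ModuleBase {
public:
    virtual ~ModuleBase() = default;

    // Walks the already-parsed units and calls the hook for each one.
    void process(std::vector<Unit>& units) {
        for (Unit& u : units) {
            processUnit(u);
        }
    }

protected:
    // Algorithm authors override only this function.
    virtual void processUnit(Unit& unit) = 0;
};

// Toy module: lower-cases the text as a placeholder "pronunciation".
class ToyG2P : public ModuleBase {
protected:
    void processUnit(Unit& unit) override {
        unit.phonemes.clear();
        for (unsigned char c : unit.text) {
            unit.phonemes.push_back(static_cast<char>(std::tolower(c)));
        }
    }
};

// Convenience wrapper: builds units from plain words and runs a module over them.
std::vector<Unit> run(ModuleBase& module, const std::vector<std::string>& words) {
    std::vector<Unit> units;
    for (const std::string& w : words) {
        units.push_back(Unit{w, ""});
    }
    module.process(units);
    assert(units.size() == words.size());  // modules annotate units, they do not add or drop them
    return units;
}

}  // namespace sketch
```

With this layout, calling sketch::run with a ToyG2P instance and a list of words returns the same units with the phonemes field filled in; no parsing code appears in the derived class.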

Support of Dynamic Module Loading & Cross-platform

The framework supports dynamic module loading on different platforms.

You can implement a different algorithm for each module and compile it as a new dynamic library (.dll on Windows, .so on Linux). The backbone classes of the framework, cst::tts::base::CTextParser (ttsbase/tts.text/tts_textparser) and cst::tts::base::CSynthesizer (ttsbase/tts.synth/tts_synthesizer), automatically load the modules specified in an XML-based configuration file. This makes it easy to switch between different TTS engines or algorithms.
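The idea of selecting modules from a configuration file can be sketched as follows. This is a simplified stand-in, not Crystal's loader: it replaces real dynamic library loading (LoadLibrary on Windows, dlopen on Linux) with an in-process factory registry so the example stays self-contained. All names (Module, ModuleRegistry, libraryFileName, the identifiers "synth.concat" and "synth.hmm") are assumptions made for illustration, and the configuration lookup is reduced to a plain string identifier.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration; these are NOT Crystal's real types.
namespace sketch {

// Minimal module interface.
class Module {
public:
    virtual ~Module() = default;
    virtual std::string name() const = 0;
};

class ConcatenativeSynth : public Module {
public:
    std::string name() const override { return "concatenative"; }
};

class HmmSynth : public Module {
public:
    std::string name() const override { return "hmm"; }
};

// File name a real loader might pass to LoadLibrary or dlopen (naming scheme assumed).
std::string libraryFileName(const std::string& moduleName) {
#ifdef _WIN32
    return moduleName + ".dll";
#else
    return moduleName + ".so";
#endif
}

// Maps a module identifier, as it might appear in a configuration file, to a factory.
class ModuleRegistry {
public:
    using Factory = std::function<std::unique_ptr<Module>()>;

    void add(const std::string& id, Factory factory) {
        assert(factory);  // an empty factory would only fail later, inside create()
        factories_[id] = std::move(factory);
    }

    // Creates the module registered under id, or throws if the id is unknown.
    std::unique_ptr<Module> create(const std::string& id) const {
        auto it = factories_.find(id);
        if (it == factories_.end()) {
            throw std::runtime_error("unknown module: " + id);
        }
        return it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};

// Registers the two toy synthesizers under hypothetical identifiers.
ModuleRegistry makeDefaultRegistry() {
    ModuleRegistry registry;
    registry.add("synth.concat", [] { return std::unique_ptr<Module>(new ConcatenativeSynth()); });
    registry.add("synth.hmm", [] { return std::unique_ptr<Module>(new HmmSynth()); });
    return registry;
}

}  // namespace sketch
```

Switching engines then amounts to passing a different identifier to create(). In a real plug-in design the factory would instead be a function exported by the dynamic library and resolved at run time.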

For example, the left figure above shows the concatenative Putonghua TTS engine, run with "cmn.xml" as the configuration input, while the right figure above shows the HMM-based Chinese TTS engine, run with "zh.xml" as the configuration input.

Support of Multilingual TTS Engine

You can implement different TTS engines for different languages by overriding the TTSBase modules in cst::tts::base::*. The following figure depicts the multilingual support of the architecture.

About the Project

Copyright (c) Tsinghua-CUHK Joint Research Center for Media Sciences, Technologies and Systems. All rights reserved.

http://mjrc.sz.tsinghua.edu.cn

Tsinghua-CUHK Joint Research Center has the rights to create, modify, copy, compile, remove, rename, explain and deliver the source codes.
