End-to-end Solution for Speech Recognition, Text Translation, and Text-to-Speech for iOS using Amazon Translate and Amazon Polly as AWS Machine Learning managed services.

Stars: ✭ 50 (+100%)

Mutual labels: speech-synthesis

TinyCog

Small Robot, Toy Robot platform

Stars: ✭ 29 (+16%)

Mutual labels: speech-synthesis

speechrec

a simple speech recognition app using the Web Speech API Interfaces

Stars: ✭ 18 (-28%)

Mutual labels: speech-synthesis

open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

Stars: ✭ 841 (+3264%)

Mutual labels: speech-synthesis

klatt-syn

Klatt formant synthesizer

Stars: ✭ 18 (-28%)

Mutual labels: speech-synthesis

ppg-vc

PPG-Based Voice Conversion

Stars: ✭ 154 (+516%)

Mutual labels: speech-synthesis

Daft-Exprt

PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Stars: ✭ 41 (+64%)

Mutual labels: speech-synthesis

ExtensibleTTS-PyTorch

An extensible speech synthesis system built with PyTorch. The original code comes from r9y9's https://github.com/r9y9/nnmnkwii_gallery. It is easy to train an acoustic model with popular components such as Tacotron's encoder, Deep Voice's encoder, the Transformer's encoder, or any other encoder you create.

Quick Start

Dependencies

python 3.6
CUDA 9.0
pytorch
nnmnkwii
pyworld
pysptk
scipy
numpy
pickle

Prepare Dataset

Note: the repo requires wav files with aligned HTS-style full-context label files.
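
As a rough illustration of what "aligned" means here, the sketch below parses label lines of the form `start end context`, where HTS-style labels write times in units of 100 ns. The `LabelSegment` type, the function name, and the shortened context strings are hypothetical and not part of this repo, which reads labels through its own preprocessing code.

```python
from typing import List, NamedTuple

# HTS-style label files express start/end times in units of 100 nanoseconds.
HTS_TIME_UNIT_SEC = 1e-7


class LabelSegment(NamedTuple):
    """One aligned segment (hypothetical helper type for this sketch)."""
    start_sec: float
    end_sec: float
    context: str


def parse_aligned_labels(text: str) -> List[LabelSegment]:
    """Parse 'start end context' lines into segments with times in seconds."""
    segments = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines and lines without alignment times
        start, end, context = parts
        segments.append(
            LabelSegment(
                int(start) * HTS_TIME_UNIT_SEC,
                int(end) * HTS_TIME_UNIT_SEC,
                context,
            )
        )
    return segments


if __name__ == "__main__":
    # Shortened, made-up context strings; real full-context labels are much longer.
    sample = "0 50000 x^x-pau+ao=th[2]\n50000 150000 x^x-pau+ao=th[3]\n"
    for seg in parse_aligned_labels(sample):
        print(f"{seg.start_sec:.4f} {seg.end_sec:.4f} {seg.context}")
```

Lines that carry only a context string, without start and end times, are skipped by this sketch because they hold no alignment information.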

Download a dataset

cmu_slt_arctic

Unpack the dataset into ~/ExtensibleTTS-PyTorch/datasets

After unpacking, your tree should look like this for cmu_slt_arctic:

ExtensibleTTS-PyTorch   
  |- datasets    
      |- slt_arctic_full_data
          |- label_phone_align
          |- label_state_align
          |- wav
          |- file_id_list_full.scp
          |- questions-radio_dnn_416.hed
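
Before preprocessing, it can help to confirm that the unpacked dataset matches the tree above. The following is a minimal sketch that only checks for the entries listed in that tree; the function name `missing_dataset_entries` is hypothetical and is not provided by this repo.

```python
from pathlib import Path
from typing import List

# Entries taken from the directory tree shown above.
EXPECTED_ENTRIES = [
    "label_phone_align",
    "label_state_align",
    "wav",
    "file_id_list_full.scp",
    "questions-radio_dnn_416.hed",
]


def missing_dataset_entries(data_root: Path) -> List[str]:
    """Return the expected files/directories that are absent under data_root."""
    return [name for name in EXPECTED_ENTRIES if not (data_root / name).exists()]


if __name__ == "__main__":
    root = Path.home() / "ExtensibleTTS-PyTorch" / "datasets" / "slt_arctic_full_data"
    missing = missing_dataset_entries(root)
    print("dataset layout OK" if not missing else f"missing: {missing}")
```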

Training

Preprocess the data to extract linguistic, duration, and acoustic features

python preprocess.py --label state_align

Use --label phone_align to preprocess with the phone-aligned labels instead

Compute the min/max/mean/variance/scale values of the data for input/output feature normalization

python norm_params.py
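
To make the statistics concrete, here is a small standard-library sketch of per-dimension normalization statistics and mean-variance normalization. It is not the code in norm_params.py: the function names are hypothetical, and treating "scale" as the standard deviation is an assumption.

```python
import math
from typing import Dict, List, Sequence


def normalization_stats(frames: Sequence[Sequence[float]]) -> Dict[str, List[float]]:
    """Per-dimension min/max/mean/var/scale over a list of feature frames."""
    n = len(frames)
    columns = list(zip(*frames))  # transpose: one tuple per feature dimension
    mean = [sum(col) / n for col in columns]
    var = [sum((x - m) ** 2 for x in col) / n for col, m in zip(columns, mean)]
    return {
        "min": [min(col) for col in columns],
        "max": [max(col) for col in columns],
        "mean": mean,
        "var": var,
        "scale": [math.sqrt(v) for v in var],  # assumption: scale = standard deviation
    }


def mean_variance_normalize(frame: Sequence[float], stats: Dict[str, List[float]]) -> List[float]:
    """Shift by the mean and divide by the scale; constant dimensions are only shifted."""
    return [
        (x - m) / (s if s > 0 else 1.0)
        for x, m, s in zip(frame, stats["mean"], stats["scale"])
    ]


if __name__ == "__main__":
    frames = [[1.0, 10.0], [3.0, 10.0]]
    stats = normalization_stats(frames)
    print(stats)
    print(mean_variance_normalize(frames[1], stats))
```

A common convention, assumed here rather than confirmed for this repo, is to min/max-normalize the linguistic input features and mean-variance-normalize the output features.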

Train a model

python train_dnn.py --train_model duration

Use --train_model acoustic to train an acoustic model
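
The training steps above can be collected into one driver script. The sketch below only builds and prints the commands, using the script names and flags listed in this section; the function name `training_commands` is hypothetical, and the order simply follows the order of the steps above.

```python
import sys
from typing import List


def training_commands(label: str = "state_align") -> List[List[str]]:
    """Build the preprocessing and training commands in the order listed above."""
    python = sys.executable or "python"
    return [
        [python, "preprocess.py", "--label", label],
        [python, "norm_params.py"],
        [python, "train_dnn.py", "--train_model", "duration"],
        [python, "train_dnn.py", "--train_model", "acoustic"],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for cmd in training_commands():
        print(" ".join(cmd))
```

To actually run the steps from the repository root, each command can be passed to `subprocess.run(cmd, check=True)`, which stops at the first failing step.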

Synthesize a speech waveform from labels using a duration checkpoint and an acoustic checkpoint

python synthesis.py --label state_align --duration_checkpint * --acoustic_checkpint *

Restore from a checkpoint

python train.py --restore_step *

WIP

integration with MTTS, the Mandarin frontend
batch inference for synthesis speedup
scheduled sampling
model pruning

Reference


huiw39 / ExtensibleTTS-PyTorch
