
Yangyangii / DeepConvolutionalTTS-pytorch

Licence: other
Deep Convolutional TTS pytorch implementation

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to DeepConvolutionalTTS-pytorch

FCH-TTS
A fast Text-to-Speech (TTS) model. Works well for English, Mandarin/Chinese, Japanese, Korean, Russian, and Tibetan (so far).
Stars: ✭ 154 (+492.31%)
Mutual labels:  tts, dctts
totalvoice-php
PHP client for the Totalvoice API
Stars: ✭ 30 (+15.38%)
Mutual labels:  tts
ForgetMeNot
A flashcard app for Android.
Stars: ✭ 234 (+800%)
Mutual labels:  tts
openctp
The CTP open platform provides access channels for A-shares, Hong Kong stocks, US stocks, futures, options, and other instrument types. By providing CTPAPI-compatible interfaces for channels such as Zhongtai Securities XTP, Huaxin Securities Qidian, Orient Securities OST, East Money Securities EMT, and Interactive Brokers TWS, CTP programs can connect seamlessly to various brokerage counters. The platform also provides a simulation environment based on the TTS trading system, likewise with a CTPAPI-compatible interface, which can replace Simnow and gives CTP quant developers a 7x24 available simulation environment.
Stars: ✭ 389 (+1396.15%)
Mutual labels:  tts
tts
Table Top Simulator Mod for Star Wars: Legion
Stars: ✭ 32 (+23.08%)
Mutual labels:  tts
text-to-speech
⚡️ Capacitor plugin for synthesizing speech from text.
Stars: ✭ 50 (+92.31%)
Mutual labels:  tts
TTS tf
WIP Tensorflow implementation of https://github.com/mozilla/TTS
Stars: ✭ 14 (-46.15%)
Mutual labels:  tts
py-espeak-ng
Some simple wrappers around eSpeak NG intended to make using this excellent TTS for waveform and IPA generation as convenient as possible.
Stars: ✭ 27 (+3.85%)
Mutual labels:  tts
LVCNet
LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation
Stars: ✭ 67 (+157.69%)
Mutual labels:  tts
Parallel-Tacotron2
PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
Stars: ✭ 149 (+473.08%)
Mutual labels:  tts
EbookReader
The EbookReader Android App. Support file format like epub, pdf, txt, html, mobi, azw, azw3, html, doc, docx,cbz, cbr. Support tts.
Stars: ✭ 37 (+42.31%)
Mutual labels:  tts
tts dataset maker
A gui to help make a text to speech dataset.
Stars: ✭ 20 (-23.08%)
Mutual labels:  tts
Text-to-Speech-Landscape
No description or website provided.
Stars: ✭ 31 (+19.23%)
Mutual labels:  tts
soundoftext
API & Web Client for soundoftext.com
Stars: ✭ 41 (+57.69%)
Mutual labels:  tts
ttslearn
ttslearn: Library for the book "Pythonで学ぶ音声合成" (Text-to-Speech with Python)
Stars: ✭ 158 (+507.69%)
Mutual labels:  tts
Tacotron2-PyTorch
Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed.
Stars: ✭ 118 (+353.85%)
Mutual labels:  tts
spoken-word
Spoken Word
Stars: ✭ 46 (+76.92%)
Mutual labels:  tts
dctts-pytorch
The pytorch implementation of DC-TTS
Stars: ✭ 73 (+180.77%)
Mutual labels:  tts
talkie
Text-to-speech browser extension button. Select text on any web page, and have the computer read it out loud for you by simply clicking the Talkie button.
Stars: ✭ 43 (+65.38%)
Mutual labels:  tts
bingspeech-api-client
Microsoft Bing Speech API client in node.js
Stars: ✭ 32 (+23.08%)
Mutual labels:  tts

DCTTS (Deep Convolutional TTS) - pytorch implementation

Paper: Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention

Prerequisite

  • python 3.6
  • pytorch 1.0
  • librosa, scipy, tqdm, tensorboardX

Dataset

Usage

  1. Download the above dataset and modify the path in config.py. Then run the command below. The first argument enables signal preprocessing; the second enables metadata preparation (the train/test split).

    python prepro.py 1 1
    
  2. DCTTS consists of two models. First, train the Text2Mel model. Around 20k steps (roughly an hour) is enough for intelligible output, but you should keep training as the guided attention loss decays.

    python train.py 1 <gpu_id>
    
  3. Second, train the SSRN. Its outputs are high-resolution spectrograms, so training SSRN is slower than training Text2Mel.

    python train.py 2 <gpu_id>
    
  4. After training, you can synthesize some speech from text.

    python synthesize.py <gpu_id>
    

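The decaying guided attention loss mentioned in step 2 comes from the DCTTS paper: a penalty matrix W[n, t] = 1 - exp(-((n/N - t/T)^2) / (2g^2)) pushes attention toward the diagonal. Below is a minimal NumPy sketch of that loss; the repo's actual implementation is in PyTorch and may differ in details (e.g., the value of g).

```python
import numpy as np

def guided_attention_weights(N, T, g=0.2):
    """Penalty matrix from the DCTTS paper: ~0 near the diagonal, ~1 far from it."""
    n = np.arange(N)[:, None] / N   # text positions, shape (N, 1)
    t = np.arange(T)[None, :] / T   # mel-frame positions, shape (1, T)
    return 1.0 - np.exp(-((n - t) ** 2) / (2.0 * g ** 2))

def guided_attention_loss(A, g=0.2):
    """Mean of A * W, where A is an (N, T) attention matrix."""
    N, T = A.shape
    return float(np.mean(A * guided_attention_weights(N, T, g)))

# A perfectly diagonal alignment incurs almost no penalty,
# while diffuse attention is penalized.
A_diag = np.eye(50)
A_flat = np.full((50, 50), 1.0 / 50)
```

During training this term is typically added to the spectrogram reconstruction loss with a weight that decays over steps, which is why longer training with the decaying loss helps.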
Attention

  • In speech synthesis, the attention module is important. If the model trains normally, you will see monotonic attention like the following figures.

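One quick sanity check for an alignment plot (a hypothetical helper, not part of this repo) is to verify that the attended text position never moves backward as decoder frames advance:

```python
import numpy as np

def is_monotonic(A):
    """True if the attended text index (argmax over rows) never decreases
    as the mel-frame index (columns) advances. A has shape (N_text, T_mel)."""
    peaks = A.argmax(axis=0)
    return bool(np.all(np.diff(peaks) >= 0))

# A well-trained model produces a roughly diagonal, monotonic alignment.
```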
Notes

  • To do: previous attention for inference.
  • To do: alleviate overfitting.
  • The paper does not mention normalization, so I used weight normalization as in DeepVoice3.
  • Some hyperparameters differ from the paper.
  • If you want to improve performance, you should use all of the data; I separated the training set and validation set for various experiments.
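Weight normalization, as used in DeepVoice3, reparameterizes each weight tensor as a learned magnitude times a direction, w = g · v / ||v||; in PyTorch this is available as torch.nn.utils.weight_norm. A minimal NumPy sketch of the reparameterization (for illustration only):

```python
import numpy as np

def weight_norm(v, g):
    """Reparameterize w = g * v / ||v||, so the norm of w (g) is learned
    independently of its direction (v / ||v||)."""
    return g * v / np.linalg.norm(v)

v = np.array([3.0, 4.0])   # direction parameter
g = 2.0                    # magnitude parameter
w = weight_norm(v, g)      # ||w|| equals g by construction
```

Decoupling magnitude from direction tends to stabilize and speed up training of deep convolutional stacks, which is presumably why DeepVoice3-style implementations adopt it.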

Other Codes
