
candlewill / AiVoice

Licence: other
Deep CNN networks for Speech Synthesis

Programming Languages

python

Projects that are alternatives of or similar to AiVoice

deep-learning-german-tts
Thorsten-Voice: A free-to-use, offline, high-quality German TTS voice that should be available to every project without any licensing struggles.
Stars: ✭ 268 (+495.56%)
Mutual labels:  tts
totalvoice-node
NodeJS client for the Totalvoice API
Stars: ✭ 54 (+20%)
Mutual labels:  tts
Tacotron2-PyTorch
Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed.
Stars: ✭ 118 (+162.22%)
Mutual labels:  tts
voices
macOS CLI for changing the default TTS (text-to-speech) voice and printing information about and speaking text with multiple voices.
Stars: ✭ 53 (+17.78%)
Mutual labels:  tts
ttskit
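Text-to-speech toolkit. An easy-to-use Chinese speech synthesis toolbox that includes a speech encoder, a speech synthesizer, a vocoder, and a visualization module.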
Stars: ✭ 336 (+646.67%)
Mutual labels:  tts
FastSpeech2
PyTorch Implementation of FastSpeech 2 : Fast and High-Quality End-to-End Text to Speech
Stars: ✭ 163 (+262.22%)
Mutual labels:  tts
klaam
Arabic speech recognition, classification and text-to-speech.
Stars: ✭ 151 (+235.56%)
Mutual labels:  tts
tts
Table Top Simulator Mod for Star Wars: Legion
Stars: ✭ 32 (-28.89%)
Mutual labels:  tts
SpeakIt Vietnamese TTS
Vietnamese Text-to-Speech on Windows Project (zalo-speech)
Stars: ✭ 81 (+80%)
Mutual labels:  tts
TTS tf
WIP Tensorflow implementation of https://github.com/mozilla/TTS
Stars: ✭ 14 (-68.89%)
Mutual labels:  tts
golang-tts
Text-to-Speech golang package based on the Amazon Polly service
Stars: ✭ 19 (-57.78%)
Mutual labels:  tts
WaveGrad2
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Stars: ✭ 55 (+22.22%)
Mutual labels:  tts
laravel-text-to-speech
💬 A wrapper for popular TTS services to create a more simple & uniform API. Currently, only AWS Polly is supported.
Stars: ✭ 26 (-42.22%)
Mutual labels:  tts
open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
Stars: ✭ 841 (+1768.89%)
Mutual labels:  tts
ForgetMeNot
A flashcard app for Android.
Stars: ✭ 234 (+420%)
Mutual labels:  tts
AdaSpeech
AdaSpeech: Adaptive Text to Speech for Custom Voice
Stars: ✭ 108 (+140%)
Mutual labels:  tts
STYLER
Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech, INTERSPEECH 2021
Stars: ✭ 105 (+133.33%)
Mutual labels:  tts
tts dataset maker
A GUI to help make a text-to-speech dataset.
Stars: ✭ 20 (-55.56%)
Mutual labels:  tts
soundoftext
API & Web Client for soundoftext.com
Stars: ✭ 41 (-8.89%)
Mutual labels:  tts
One-Shot-Voice-Cloning
☺️ One Shot Voice Cloning based on Unet-TTS
Stars: ✭ 118 (+162.22%)
Mutual labels:  tts

Deep Voice 3

This is a TensorFlow implementation of DEEP VOICE 3: 2000-SPEAKER NEURAL TEXT-TO-SPEECH. For now, we focus only on single-speaker synthesis.

Requirements

  • TensorFlow >= 1.2
  • Python >= 3.0

Dataset

The LJ Speech Dataset

Pre-process

Download and unzip the LJ Speech Dataset. Run:

python prepro.py

Note: Make sure the dataset is unzipped into the same folder as prepro.py.

After this, the dataset directory contains three new folders:

├── dones          [New]
├── mags           [New]
├── mels           [New]
├── metadata.csv
├── README
└── wavs
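
Before training, it can help to confirm that preprocessing produced this layout. The following is a minimal sketch and not part of the repository: the helper name `missing_entries` is hypothetical, and the expected folder and file names are taken from the tree above.

```python
from pathlib import Path

# Entries expected after running prepro.py, taken from the tree above.
EXPECTED_DIRS = ("dones", "mags", "mels", "wavs")
EXPECTED_FILES = ("metadata.csv",)


def missing_entries(dataset_dir):
    """Return the expected folders/files that are absent (hypothetical helper)."""
    root = Path(dataset_dir)
    missing = [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    return missing


if __name__ == "__main__":
    # "./LJSpeech-1.0" is the default dataset location used by this project.
    absent = missing_entries("./LJSpeech-1.0")
    print("missing:", absent if absent else "nothing")
```

An empty result means all three new folders sit next to metadata.csv and wavs; any listed name suggests prepro.py was run from a different folder or did not finish.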

Training

By default, training data is loaded from ./LJSpeech-1.0/metadata.csv, ./LJSpeech-1.0/mels, ./LJSpeech-1.0/dones, and ./LJSpeech-1.0/mags. To load from a different path, change the config in class Hyperparams.
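
As an illustration of what such a path change could look like, the sketch below derives the four default locations from a single data directory. It is a stand-in, not the repository's code: the attribute `data_dir` and the method `paths` are assumptions made for this example, so check the actual field names in hyperparams.py before editing.

```python
import os


class Hyperparams:
    """Illustrative stand-in for the config class; attribute names are assumptions."""

    # Root of the unzipped and preprocessed dataset (the default named above).
    data_dir = "./LJSpeech-1.0"

    @classmethod
    def paths(cls):
        """Return the metadata file and feature folders derived from data_dir."""
        return {
            "metadata": os.path.join(cls.data_dir, "metadata.csv"),
            "mels": os.path.join(cls.data_dir, "mels"),
            "dones": os.path.join(cls.data_dir, "dones"),
            "mags": os.path.join(cls.data_dir, "mags"),
        }


# Pointing the loader at another copy of the dataset changes one value only.
# "/data/LJSpeech-1.0" is a made-up example location.
Hyperparams.data_dir = "/data/LJSpeech-1.0"
print(Hyperparams.paths()["mels"])
```

Deriving every path from one root keeps the four locations consistent, which matters because the mels, dones, and mags folders are all produced together by prepro.py.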

To train the model, run:

python train.py

Pre-trained Model

Currently, the model does not produce good results. However, we still provide our pre-trained model in case anyone is interested in it.

Pre-trained Model.

Its attention figure is as follows:

Image of attention

All the attention figures generated during training are included in the zipped pre-trained model file.

File Description

  • hyperparams.py: hyperparameters
  • prepro.py: creates inputs and targets, i.e., mel spectrogram, magnitude, and dones.
  • data_load.py
  • utils.py: several custom operational functions.
  • modules.py: building blocks for the networks.
  • networks.py: encoder, decoder, and converter
  • train.py: train
  • synthesize.py: inference
  • test_sents.txt: some test sentences from the paper.

Reference

Most of the code is borrowed from Kyubyong/deepvoice3.
