rishikksh20 / AdaSpeech

Licence: Apache-2.0 license

AdaSpeech: Adaptive Text to Speech for Custom Voice

Programming Languages

Jupyter Notebook

python

Projects that are alternatives of or similar to AdaSpeech

Zero-Shot-TTS

Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration

Stars: ✭ 33 (-69.44%)

Mutual labels: text-to-speech, speech, tts, speech-synthesis, transformer

Tensorflowtts

😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)

Stars: ✭ 2,382 (+2105.56%)

Mutual labels: text-to-speech, tts, speech-synthesis, fastspeech, fastspeech2

TensorVox

Desktop application for neural speech synthesis written in C++

Stars: ✭ 140 (+29.63%)

Mutual labels: text-to-speech, tts, speech-synthesis, fastspeech2

Fre-GAN-pytorch

Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

Stars: ✭ 73 (-32.41%)

Mutual labels: text-to-speech, speech, tts, speech-synthesis

StyleSpeech

Official implementation of Meta-StyleSpeech and StyleSpeech

Stars: ✭ 161 (+49.07%)

Mutual labels: text-to-speech, speech, tts, speech-synthesis

Parallel-Tacotron2

PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

Stars: ✭ 149 (+37.96%)

Mutual labels: text-to-speech, tts, speech-synthesis, fastspeech

ttslearn

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

Stars: ✭ 158 (+46.3%)

Mutual labels: text-to-speech, speech, tts, speech-synthesis

editts

Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech

Stars: ✭ 74 (-31.48%)

Mutual labels: text-to-speech, speech, tts, speech-synthesis

Voice Builder

An opensource text-to-speech (TTS) voice building tool

Stars: ✭ 362 (+235.19%)

Mutual labels: text-to-speech, speech, tts, speech-synthesis

IMS-Toucan

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

Stars: ✭ 295 (+173.15%)

Mutual labels: text-to-speech, speech, tts, speech-synthesis

Wavegrad

Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.

Stars: ✭ 245 (+126.85%)

Mutual labels: text-to-speech, speech, tts, speech-synthesis

Wsay

Windows "say"

Stars: ✭ 36 (-66.67%)

Mutual labels: text-to-speech, speech, tts, speech-synthesis

FastSpeech2

PyTorch Implementation of FastSpeech 2 : Fast and High-Quality End-to-End Text to Speech

Stars: ✭ 163 (+50.93%)

Mutual labels: text-to-speech, tts, fastspeech, fastspeech2

spokestack-android

Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!

Stars: ✭ 52 (-51.85%)

Mutual labels: text-to-speech, speech, tts, speech-synthesis

Cognitive Speech Tts

Microsoft Text-to-Speech API sample code in several languages, part of Cognitive Services.

Stars: ✭ 312 (+188.89%)

Mutual labels: text-to-speech, tts, speech-synthesis, transformer

Lightspeech

LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

Stars: ✭ 31 (-71.3%)

Mutual labels: text-to-speech, speech, tts, speech-synthesis

Durian

Implementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.

Stars: ✭ 111 (+2.78%)

Mutual labels: text-to-speech, speech, tts, speech-synthesis

Marytts

MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure java

Stars: ✭ 1,699 (+1473.15%)

Mutual labels: text-to-speech, tts, speech-synthesis

vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Stars: ✭ 1,604 (+1385.19%)

Mutual labels: text-to-speech, tts, speech-synthesis

Pytorch Dc Tts

Text to Speech with PyTorch (English and Mongolian)

Stars: ✭ 122 (+12.96%)

Mutual labels: text-to-speech, tts, speech-synthesis

AdaSpeech: Adaptive Text to Speech for Custom Voice [WIP]

Unofficial PyTorch implementation of AdaSpeech.

Note:

I am not considering the multi-speaker use case; my focus is on the single-speaker setting only.
I will use only the utterance-level encoder and the phoneme-level encoder, not conditional layer normalization (which is the soul of the AdaSpeech paper). This definitely restricts the adaptive nature of AdaSpeech, but my focus is on improving the acoustic generalization of FastSpeech 2 rather than on adaptation.

Citations

@misc{chen2021adaspeech,
      title={AdaSpeech: Adaptive Text to Speech for Custom Voice}, 
      author={Mingjian Chen and Xu Tan and Bohan Li and Yanqing Liu and Tao Qin and Sheng Zhao and Tie-Yan Liu},
      year={2021},
      eprint={2103.00993},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}

Requirements :

All code is written for Python 3.6.2.

Install PyTorch

Before installing PyTorch, check your CUDA version by running the following command: nvcc --version

pip install torch torchvision

This repo uses PyTorch 1.6.0 for the torch.bucketize function, which is not available in earlier versions of PyTorch.
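As a rough illustration of what bucketization does, the sketch below reimplements the default (`right=False`) behaviour of `torch.bucketize` in plain Python with `bisect`, so it runs without PyTorch. FastSpeech 2-style models typically quantize continuous pitch and energy values this way before an embedding lookup. Whether this repo does exactly the same is an assumption, and the helper names, pitch range, and bin count below are hypothetical.

```python
from bisect import bisect_left


def bucketize(values, boundaries):
    """Return the bin index of each value.

    Mirrors torch.bucketize(values, boundaries, right=False):
    index i satisfies boundaries[i-1] < value <= boundaries[i].
    `boundaries` must be sorted in ascending order.
    """
    return [bisect_left(boundaries, v) for v in values]


def linear_boundaries(lo, hi, n_bins):
    """Return the n_bins - 1 evenly spaced interior boundaries between lo and hi."""
    step = (hi - lo) / n_bins
    return [lo + step * i for i in range(1, n_bins)]


# Hypothetical pitch range (Hz) and bin count, for illustration only.
p_min, p_max, n_bins = 80.0, 400.0, 4
boundaries = linear_boundaries(p_min, p_max, n_bins)

# Values below the first boundary fall in bin 0, values above the last
# boundary fall in the final bin (n_bins - 1).
pitch_ids = bucketize([75.0, 160.0, 200.0, 399.0, 450.0], boundaries)
print(boundaries)
print(pitch_ids)
```

With PyTorch 1.6.0 or later installed, `torch.bucketize(torch.tensor(values), torch.tensor(boundaries))` should give the same indices as a tensor.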

Installing other requirements :

pip install -r requirements.txt

To use TensorBoard, install tensorboard version 1.14.0 separately, along with a supported TensorFlow version (1.14.0).

For Preprocessing :

The filelists folder contains LJSpeech dataset files already processed with MFA (Montreal Forced Aligner), so you don't need to align text with audio (to extract durations) for the LJSpeech dataset. For other datasets, follow the instructions here. For the remaining pre-processing, run the following command:

python nvidia_preprocessing.py -d path_of_wavs

To find the min and max of F0 and energy:

python compute_statistics.py

Update the following in hparams.py with the min and max of F0 and energy:

p_min = Min F0/pitch
p_max = Max F0
e_min = Min energy
e_max = Max energy
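What compute_statistics.py does internally is not shown here. The sketch below only illustrates how such global statistics could be gathered, assuming per-utterance F0 and energy contours are available as lists of numbers. The function name, the toy data, and the choice to skip zero-valued (unvoiced) F0 frames are all assumptions, not the repo's actual behaviour.

```python
def global_min_max(sequences, skip_zeros=False):
    """Return (min, max) over all values in a collection of sequences.

    If skip_zeros is True, zero-valued entries are ignored. This is an
    assumed convention for unvoiced F0 frames, not taken from the repo.
    """
    lo, hi = float("inf"), float("-inf")
    for seq in sequences:
        for v in seq:
            if skip_zeros and v == 0:
                continue
            lo = min(lo, v)
            hi = max(hi, v)
    if lo > hi:
        raise ValueError("no values to summarise")
    return lo, hi


# Toy contours standing in for two utterances (hypothetical numbers).
f0_tracks = [[0.0, 110.5, 220.0], [0.0, 95.2, 310.7]]
energy_tracks = [[0.02, 1.5], [0.4, 3.1]]

p_min, p_max = global_min_max(f0_tracks, skip_zeros=True)
e_min, e_max = global_min_max(energy_tracks)

# Print in the same key = value form used for hparams.py above.
print(f"p_min = {p_min}")
print(f"p_max = {p_max}")
print(f"e_min = {e_min}")
print(f"e_max = {e_max}")
```

Computing the statistics over the whole training set, rather than per utterance, keeps the resulting range consistent for every sample the model sees.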

For training

python train_fastspeech.py --outdir etc -c configs/default.yaml -n "name"

Note

For a more complete, end-to-end voice cloning or text-to-speech (TTS) toolbox, please visit Deepsync Technologies.
