ga642381 / FastSpeech2

Licence: other

Multi-Speaker Pytorch FastSpeech2: Fast and High-Quality End-to-End Text to Speech ✊

Labels

text-to-speech pytorch tts waveglow melgan multi-speaker-tts fastspeech2

Multi-speaker FastSpeech 2 - PyTorch Implementation ⚡

This is a PyTorch implementation of Microsoft's FastSpeech 2: Fast and High-Quality End-to-End Text to Speech.
Now supporting about 900 speakers in 🔥 LibriTTS for multi-speaker text-to-speech.

Datasets 🐘

This project supports three datasets, one single-speaker and two multi-speaker:

🔥 Single-Speaker

LJSpeech

🔥 Multi-Speaker

LibriTTS
VCTK

Config

Configurations are in:

config/dataset.yaml
config/hparams.py

Please modify dataset and mfa_path in config/hparams.py.

This repo uses Montreal Forced Aligner (MFA) v1. Migrating to MFA v2 is a TODO item.

Steps

preprocess.py
train.py
synthesize.py

1. Preprocess

File Structures:

[DATASET] / wavs / speaker / wav_files
[DATASET] / txts / speaker / txt_files

wav_dir : the folder containing speaker dirs ( [DATASET] / wavs )
txt_dir : the folder containing speaker dirs ( [DATASET] / txts )
save_dir : the output directory (e.g. "./processed" )
--prepare_mfa : create mfa_data
--mfa : create textgrid files
--create_dataset : generate mel, phone, f0, ..., and metadata.json
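Before running the preprocessing step, it can help to confirm that the two directory trees line up. The following is a minimal sketch, not part of this repo: the function name `find_unpaired_utterances` is hypothetical, and it assumes each `speaker/name.wav` has a transcript at `speaker/name.txt` with the same file stem, which the layout above suggests but does not state.

```python
from pathlib import Path


def find_unpaired_utterances(wav_dir, txt_dir):
    """Return (speaker, stem) pairs that have a .wav file but no matching .txt.

    Assumption (not confirmed by the repo): transcripts share the wav's
    file stem and live under the same speaker folder name.
    """
    wav_root, txt_root = Path(wav_dir), Path(txt_dir)
    missing = []
    # Each immediate subdirectory of wav_dir is treated as one speaker.
    for speaker_dir in sorted(p for p in wav_root.iterdir() if p.is_dir()):
        for wav in sorted(speaker_dir.glob("*.wav")):
            txt = txt_root / speaker_dir.name / (wav.stem + ".txt")
            if not txt.is_file():
                missing.append((speaker_dir.name, wav.stem))
    return missing
```

An empty list means every wav file found has a transcript under the assumed naming. Datasets whose raw layout differs (for example, nested chapter folders) would need an adjusted glob.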

Example commands:

LJSpeech:

# Run the script that organizes LJSpeech first
python ./script/organizeLJ.py

python preprocess.py /storage/tts2021/LJSpeech-organized/wavs /storage/tts2021/LJSpeech-organized/txts ./processed/LJSpeech --prepare_mfa --mfa --create_dataset

LibriTTS:

python preprocess.py /storage/tts2021/LibriTTS/train-clean-360 /storage/tts2021/LibriTTS/train-clean-360 ./processed/LibriTTS --prepare_mfa --mfa --create_dataset

VCTK:

python preprocess.py /storage/tts2021/VCTK-Corpus/wav48/ /storage/tts2021/VCTK-Corpus/txt ./processed/VCTK --prepare_mfa --mfa --create_dataset

metadata.json includes:

speaker table
training data
validation data
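The exact key names inside metadata.json are not listed here, so the sketch below inspects the file without assuming them. It is an illustration only: `summarize_metadata` is a hypothetical helper, and it assumes the file's top level is a JSON object.

```python
import json


def summarize_metadata(path):
    """Map each top-level key to (type name, size) without assuming key names.

    Assumption: the top level of metadata.json is a JSON object.
    Size is None for values that have no length (numbers, booleans, null).
    """
    with open(path, encoding="utf-8") as f:
        metadata = json.load(f)
    summary = {}
    for key, value in metadata.items():
        size = len(value) if isinstance(value, (list, dict, str)) else None
        summary[key] = (type(value).__name__, size)
    return summary
```

Calling it on the metadata.json produced by preprocessing (its location inside save_dir is not documented here) should show one entry each for the speaker table and the training and validation data.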

2. Train

data_dir : the preprocessed data directory
--comment : a comment describing the run

Example commands:

LJSpeech:

python train.py ./processed/LJSpeech --comment "Hello LJSpeech"

LibriTTS:

python train.py ./processed/LibriTTS --comment "Hello LibriTTS"

VCTK:

python train.py ./processed/VCTK --comment "Hello VCTK"
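The preprocess and train steps can also be chained from a small driver script. This is a sketch under stated assumptions: `build_pipeline_commands` and `run_pipeline` are hypothetical helpers, not part of this repo, and they only reuse the positional arguments and flags shown in the example commands above.

```python
import subprocess
import sys


def build_pipeline_commands(wav_dir, txt_dir, save_dir, comment):
    """Build the preprocess and train command lines for one dataset."""
    preprocess = [
        sys.executable, "preprocess.py", wav_dir, txt_dir, save_dir,
        "--prepare_mfa", "--mfa", "--create_dataset",
    ]
    train = [sys.executable, "train.py", save_dir, "--comment", comment]
    return [preprocess, train]


def run_pipeline(commands):
    """Run the commands in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on non-zero exit
```

For example, `run_pipeline(build_pipeline_commands(wav_dir, txt_dir, "./processed/VCTK", "Hello VCTK"))` would run both steps from the repo root, with `wav_dir` and `txt_dir` pointing at your local copy of the dataset.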

3. Synthesize

--ckpt_path : the checkpoint path
--output_dir : the directory where the synthesized audio is written

Example commands:

python synthesize.py --ckpt_path ./records/LJSpeech_2021-11-22-22:42/ckpt/checkpoint_125000.pth.tar --output_dir ./output
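To avoid typing the checkpoint name by hand, the newest checkpoint in a run's ckpt folder can be picked by the number in its filename. This is a minimal sketch: `latest_checkpoint` is a hypothetical helper, and it assumes every checkpoint follows the `checkpoint_<number>.pth.tar` naming seen in the example command, with a larger number meaning a later checkpoint.

```python
import re
from pathlib import Path

# Filename pattern taken from the example command above; treated as an assumption.
CKPT_PATTERN = re.compile(r"checkpoint_(\d+)\.pth\.tar$")


def latest_checkpoint(ckpt_dir):
    """Return the checkpoint path with the highest number, or None if none match."""
    best_step, best_path = -1, None
    for path in Path(ckpt_dir).iterdir():
        match = CKPT_PATTERN.match(path.name)
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path
```

The returned path can then be passed as the value of --ckpt_path.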
