NTT123 / vietTTS

Licence: MIT license

Vietnamese Text to Speech library

Programming Languages

Python, Jupyter Notebook, Shell

Projects that are alternatives of or similar to vietTTS

Comprehensive-Tacotron2

PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single-, multi-speaker TTS and several techniques to enforce the robustness and efficiency of the model.

Stars: ✭ 22 (-71.79%)

Mutual labels: text-to-speech, tacotron, hifi-gan

Tts

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

Stars: ✭ 5,427 (+6857.69%)

Mutual labels: text-to-speech, vocoder, tacotron

FFTNet

FFTNet: a Real-Time Speaker-Dependent Neural Vocoder

Stars: ✭ 63 (-19.23%)

Mutual labels: text-to-speech, vocoder

LVCNet

LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation

Stars: ✭ 67 (-14.1%)

Mutual labels: text-to-speech, vocoder

Fre-GAN-pytorch

Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

Stars: ✭ 73 (-6.41%)

Mutual labels: text-to-speech, vocoder

FastSpeech2

PyTorch Implementation of FastSpeech 2 : Fast and High-Quality End-to-End Text to Speech

Stars: ✭ 163 (+108.97%)

Mutual labels: text-to-speech, tts-engines

TTS tf

WIP Tensorflow implementation of https://github.com/mozilla/TTS

Stars: ✭ 14 (-82.05%)

Mutual labels: text-to-speech, tacotron

Wavernn

WaveRNN Vocoder + TTS

Stars: ✭ 1,636 (+1997.44%)

Mutual labels: text-to-speech, tacotron

Tacotron Pytorch

A Pytorch Implementation of Tacotron: End-to-end Text-to-speech Deep-Learning Model

Stars: ✭ 104 (+33.33%)

Mutual labels: text-to-speech, tacotron

Tacotron2

A PyTorch implementation of Tacotron2, an end-to-end text-to-speech(TTS) system described in "Natural TTS Synthesis By Conditioning Wavenet On Mel Spectrogram Predictions".

Stars: ✭ 43 (-44.87%)

Mutual labels: text-to-speech, tacotron

SpeakIt Vietnamese TTS

Vietnamese Text-to-Speech on Windows Project (zalo-speech)

Stars: ✭ 81 (+3.85%)

Mutual labels: text-to-speech, vietnamese

Tacotron 2

DeepMind's Tacotron-2 Tensorflow implementation

Stars: ✭ 1,968 (+2423.08%)

Mutual labels: text-to-speech, tacotron

melgan

MelGAN implementation with Multi-Band and Full Band supports...

Stars: ✭ 54 (-30.77%)

Mutual labels: text-to-speech, vocoder

Tacotron2-PyTorch

Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed.

Stars: ✭ 118 (+51.28%)

Mutual labels: text-to-speech, tacotron

Tts

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Stars: ✭ 305 (+291.03%)

Mutual labels: text-to-speech, tacotron

Tensorflowtts

😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)

Stars: ✭ 2,382 (+2953.85%)

Mutual labels: text-to-speech, vocoder

Tacotron Pytorch

Pytorch implementation of Tacotron

Stars: ✭ 189 (+142.31%)

Mutual labels: text-to-speech, tacotron

react-native-spokestack

Spokestack: give your React Native app a voice interface!

Stars: ✭ 53 (-32.05%)

Mutual labels: text-to-speech

vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Stars: ✭ 1,604 (+1956.41%)

Mutual labels: text-to-speech

hawking

The retro text-to-speech bot for Discord

Stars: ✭ 24 (-69.23%)

Mutual labels: text-to-speech

A Vietnamese TTS

Duration model + acoustic model + HiFiGAN vocoder for Vietnamese text-to-speech applications.
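To illustrate how a duration model and an acoustic model are commonly connected in pipelines of this kind, here is a minimal sketch of the "length regulation" step: each phoneme is repeated for as many frames as its predicted duration covers, and the resulting frame-level sequence is what an acoustic model would consume. This is not taken from the vietTTS code; the function name `expand_by_duration`, the phoneme ids, and the frame rate of 100 frames per second are all assumptions made for the example.

```python
def expand_by_duration(phoneme_ids, durations_sec, frame_rate_hz=100.0):
    """Repeat each phoneme id once per frame covered by its duration.

    All names and the default frame rate are hypothetical; the real
    project may organise this step differently.
    """
    frames = []
    for pid, dur in zip(phoneme_ids, durations_sec):
        # Round the duration to a whole number of frames, never negative.
        n_frames = max(int(round(dur * frame_rate_hz)), 0)
        frames.extend([pid] * n_frames)
    return frames


# Three phonemes lasting 20 ms, 50 ms and 10 ms.
frames = expand_by_duration([5, 9, 2], [0.02, 0.05, 0.01])
```

The acoustic model would then map this frame-level sequence to mel-spectrogram frames, and the vocoder would turn those frames into a waveform.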

Online demo at https://huggingface.co/spaces/ntt123/vietTTS.

A synthesized audio clip: clip.wav. A Colab notebook: notebook.

🔔Check out the experimental multi-speaker branch (git checkout multi-speaker) for multi-speaker support.🔔

Install

git clone https://github.com/NTT123/vietTTS.git
cd vietTTS 
pip3 install -e .

Quick start using pretrained models

bash ./scripts/quick_start.sh

Download InfoRe dataset

python ./scripts/download_aligned_infore_dataset.py

Note: this is a denoised and aligned version of the original dataset, which was donated by the InfoRe Technology company (see here). You can download the original dataset (InfoRe Technology 1) here.

See notebooks/denoise_infore_dataset.ipynb for instructions on how to denoise the dataset. We use the Montreal Forced Aligner (MFA) to align transcripts and speech (TextGrid files). See notebooks/align_text_audio_infore_mfa.ipynb for instructions on how to create TextGrid files.

Train duration model

python -m vietTTS.nat.duration_trainer

Train acoustic model

python -m vietTTS.nat.acoustic_trainer

Train HiFiGAN vocoder

We use the original implementation from HiFiGAN authors at https://github.com/jik876/hifi-gan. Use the config file at assets/hifigan/config.json to train your model.

git clone https://github.com/jik876/hifi-gan.git

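# create dataset in hifi-gan format
ln -sf "$(pwd)/train_data" hifi-gan/data
cd hifi-gan/data
# list utterance names (TextGrid file names without the extension)
ls -1 *.TextGrid | sed -e 's/\.TextGrid$//' > files.txt
cd ..
# first 100 utterances for validation, the rest for training
head -n 100 data/files.txt > val_files.txt
tail -n +101 data/files.txt > train_files.txt
rm data/files.txt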

# training
python train.py \
  --config ../assets/hifigan/config.json \
  --input_wavs_dir=data \
  --input_training_file=train_files.txt \
  --input_validation_file=val_files.txt
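If the shell tools used for the train/validation split above are not available (for example on Windows), the same split can be sketched in Python. This is an illustrative equivalent rather than part of the project: the helper names `split_file_list` and `write_list` are hypothetical, and `sorted()` may order names slightly differently from `ls -1` depending on the locale.

```python
from pathlib import Path


def split_file_list(data_dir, n_val=100):
    """Return (val, train) lists of utterance names from *.TextGrid files.

    The first `n_val` names (in sorted order) go to validation and the
    rest to training, mirroring the head/tail commands.
    """
    names = sorted(p.stem for p in Path(data_dir).glob("*.TextGrid"))
    return names[:n_val], names[n_val:]


def write_list(path, names):
    """Write one utterance name per line, as the file lists expect."""
    Path(path).write_text("".join(f"{n}\n" for n in names), encoding="utf-8")
```

From the hifi-gan directory, calling `val, train = split_file_list("data")` followed by `write_list("val_files.txt", val)` and `write_list("train_files.txt", train)` would produce the two file lists.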

Fine-tune on ground-truth-aligned (GTA) mel-spectrograms:

cd /path/to/vietTTS # go to vietTTS directory
python -m vietTTS.nat.zero_silence_segments -o train_data # zero all [sil, sp, spn] segments
python -m vietTTS.nat.gta -o /path/to/hifi-gan/ft_dataset  # create gta melspectrograms at hifi-gan/ft_dataset directory

# turn on finetune
cd /path/to/hifi-gan
python train.py \
  --fine_tuning True \
  --config ../assets/hifigan/config.json \
  --input_wavs_dir=data \
  --input_training_file=train_files.txt \
  --input_validation_file=val_files.txt
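The "zero all [sil, sp, spn] segments" step can be pictured with the following minimal sketch. It is not the code of vietTTS.nat.zero_silence_segments: it assumes the alignment has already been read into a list of (start_sec, end_sec, label) tuples, and the function name `zero_silence` is hypothetical. Samples inside any interval labelled sil, sp, or spn are set to zero, and everything else is left untouched.

```python
SILENCE_LABELS = {"sil", "sp", "spn"}


def zero_silence(samples, intervals, sample_rate):
    """Return a copy of `samples` with silence intervals set to zero.

    `intervals` is a list of (start_sec, end_sec, label) tuples; this
    input format is an assumption made for the example.
    """
    out = list(samples)
    for start, end, label in intervals:
        if label not in SILENCE_LABELS:
            continue
        # Convert seconds to sample indices and clamp to the signal length.
        lo = max(int(start * sample_rate), 0)
        hi = min(int(end * sample_rate), len(out))
        for i in range(lo, hi):
            out[i] = 0
    return out


# Toy signal at 8 samples per second with silence at both ends.
cleaned = zero_silence(
    [3, 4, 5, 6, 7, 8],
    [(0.0, 0.25, "sil"), (0.25, 0.5, "a"), (0.5, 0.75, "sp")],
    sample_rate=8,
)
```

Zeroing these regions removes residual noise from pauses, so the vocoder is fine-tuned on clean silence.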

Then, use the following command to convert the PyTorch model to Haiku format:

cd ..
python -m vietTTS.hifigan.convert_torch_model_to_haiku \
  --config-file=assets/hifigan/config.json \
  --checkpoint-file=hifi-gan/cp_hifigan/g_[latest_checkpoint]

Synthesize speech

python -m vietTTS.synthesizer \
  --lexicon-file=train_data/lexicon.txt \
  --text="hôm qua em tới trường" \
  --output=clip.wav
