yistLin / universal-vocoder

Licence: other
A PyTorch implementation of the universal neural vocoder

Universal Vocoder

This is a restructured and rewritten version of bshall/UniversalVocoding. The main difference is that the model is converted into a TorchScript module during training, so it can be loaded for inference anywhere without Python dependencies.
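
As a rough illustration of what that conversion involves, here is a minimal sketch that scripts a toy module, saves it, and loads it back. `ToyVocoder`, its single linear layer, and the `hop_length` value are hypothetical stand-ins chosen for this example; they are not this project's architecture or training code.

```python
import io
from typing import List

import torch
from torch import nn


class ToyVocoder(nn.Module):
    """Hypothetical stand-in model, NOT the architecture used by this project."""

    def __init__(self, mel_dim: int = 80, hop_length: int = 4):
        super().__init__()
        # Map each mel frame to `hop_length` waveform samples.
        self.proj = nn.Linear(mel_dim, hop_length)

    def forward(self, mel: torch.Tensor) -> torch.Tensor:
        # (length, mel_dim) -> (length * hop_length,)
        return torch.tanh(self.proj(mel)).reshape(-1)

    @torch.jit.export  # keep this method available on the scripted module
    def generate(self, mels: List[torch.Tensor]) -> List[torch.Tensor]:
        wavs: List[torch.Tensor] = []
        for mel in mels:
            wavs.append(self.forward(mel))
        return wavs


# Compile the module to TorchScript and serialize it.
scripted = torch.jit.script(ToyVocoder())
buffer = io.BytesIO()  # an in-memory file; a path such as "toy.pt" also works
torch.jit.save(scripted, buffer)

# Loading needs only the serialized file, not the ToyVocoder class definition.
buffer.seek(0)
loaded = torch.jit.load(buffer)

with torch.no_grad():
    wavs = loaded.generate([torch.randn(100, 80), torch.randn(200, 80)])
```

The serialized file carries both the weights and the compiled code, which is why the loading side does not need the original class definition.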

Generate waveforms using pretrained models

Because the pretrained models were converted to TorchScript, you can load a trained model anywhere. You can also generate multiple waveforms in parallel, e.g.

import torch

vocoder = torch.jit.load("vocoder.pt")

mels = [
    torch.randn(100, 80),
    torch.randn(200, 80),
    torch.randn(300, 80),
] # (length, mel_dim)

with torch.no_grad():
    wavs = vocoder.generate(mels)

Empirically, with the default architecture, you can generate 30 samples at the same time on a GTX 1080 Ti.
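
If you have more mel spectrograms than fit in one call, one option is to split them into fixed-size batches. The sketch below is only an illustration: `chunked` and `generate_in_batches` are hypothetical helper names, not part of this project, and it assumes the generate function returns one waveform per input, in the same order.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
W = TypeVar("W")


def chunked(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def generate_in_batches(
    generate_fn: Callable[[List[T]], List[W]],
    mels: Sequence[T],
    batch_size: int = 30,  # the figure quoted above for a GTX 1080 Ti
) -> List[W]:
    """Call `generate_fn` on one batch at a time and concatenate the results.

    Assumption: `generate_fn` returns one output per input, in input order.
    """
    wavs: List[W] = []
    for batch in chunked(mels, batch_size):
        wavs.extend(generate_fn(list(batch)))
    return wavs
```

With the loaded model from the example above, this would be called as `generate_in_batches(vocoder.generate, mels)` inside the `torch.no_grad()` block. A smaller `batch_size` may be needed on GPUs with less memory.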

Train from scratch

Multiple directories containing audio files can be processed at the same time, e.g.

python preprocess.py \
    VCTK-Corpus \
    LibriTTS/train-clean-100 \
    preprocessed # the output directory of preprocessed data

Then train the model on the preprocessed data, e.g.

python train.py preprocessed

With the default settings, training to 100K steps takes around 12 hours on an RTX 2080 Ti.
