rishikksh20 / melgan

License: BSD-3-Clause

MelGAN implementation with Multi-Band and Full Band supports...

Programming Languages

Jupyter Notebook

python

Projects that are alternatives of or similar to melgan

Tts

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

Stars: ✭ 5,427 (+9950%)

Mutual labels: text-to-speech, speech, vocoder, melgan

Fre-GAN-pytorch

Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

Stars: ✭ 73 (+35.19%)

Mutual labels: text-to-speech, speech, speech-synthesis, vocoder

Tensorflowtts

😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)

Stars: ✭ 2,382 (+4311.11%)

Mutual labels: text-to-speech, speech-synthesis, vocoder, melgan

IMS-Toucan

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

Stars: ✭ 295 (+446.3%)

Mutual labels: text-to-speech, speech, speech-synthesis

AdaSpeech

AdaSpeech: Adaptive Text to Speech for Custom Voice

Stars: ✭ 108 (+100%)

Mutual labels: text-to-speech, speech, speech-synthesis

editts

Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech

Stars: ✭ 74 (+37.04%)

Mutual labels: text-to-speech, speech, speech-synthesis

Voice Builder

An opensource text-to-speech (TTS) voice building tool

Stars: ✭ 362 (+570.37%)

Mutual labels: text-to-speech, speech, speech-synthesis

Lightspeech

LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

Stars: ✭ 31 (-42.59%)

Mutual labels: text-to-speech, speech, speech-synthesis

StyleSpeech

Official implementation of Meta-StyleSpeech and StyleSpeech

Stars: ✭ 161 (+198.15%)

Mutual labels: text-to-speech, speech, speech-synthesis

Zero-Shot-TTS

Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration

Stars: ✭ 33 (-38.89%)

Mutual labels: text-to-speech, speech, speech-synthesis

Durian

Implementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.

Stars: ✭ 111 (+105.56%)

Mutual labels: text-to-speech, speech, speech-synthesis

Diffwave

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

Stars: ✭ 139 (+157.41%)

Mutual labels: text-to-speech, speech, speech-synthesis

spokestack-android

Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!

Stars: ✭ 52 (-3.7%)

Mutual labels: text-to-speech, speech, speech-synthesis

ttslearn

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

Stars: ✭ 158 (+192.59%)

Mutual labels: text-to-speech, speech, speech-synthesis

LVCNet

LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation

Stars: ✭ 67 (+24.07%)

Mutual labels: text-to-speech, speech-synthesis, vocoder

Wsay

Windows "say"

Stars: ✭ 36 (-33.33%)

Mutual labels: text-to-speech, speech, speech-synthesis

Wavegrad

A fast, high-quality neural vocoder.

Stars: ✭ 138 (+155.56%)

Mutual labels: text-to-speech, speech, speech-synthesis

Wavegrad

Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.

Stars: ✭ 245 (+353.7%)

Mutual labels: text-to-speech, speech, speech-synthesis

GlottDNN

GlottDNN vocoder and tools for training DNN excitation models

Stars: ✭ 30 (-44.44%)

Mutual labels: speech-synthesis, vocoder

react-native-spokestack

Spokestack: give your React Native app a voice interface!

Stars: ✭ 53 (-1.85%)

Mutual labels: text-to-speech, speech-synthesis

Multi-band MelGAN and Full band MelGAN

Unofficial PyTorch implementation of the Multi-Band MelGAN paper. This implementation uses Seungwon Park's MelGAN repo as a base and the PQMF filter implementation from this repo.
MelGAN:
Multi-band MelGAN:

Prerequisites

Tested on Python 3.6

pip install -r requirements.txt

Prepare Dataset

Download a dataset for training. Any set of wav files with a sample rate of 22050 Hz will work (e.g. LJSpeech, which was used in the paper).
Preprocess: python preprocess.py -c config/default.yaml -d [data's root path]
Edit the configuration yaml file.
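
Since preprocessing expects 22050 Hz audio, it can help to scan the dataset for files at other sample rates before running it. The following is a minimal sketch, not part of this repository: the function name find_wrong_sample_rates is hypothetical, and it relies on Python's standard wave module, which only reads uncompressed PCM wav files.

```python
import wave
from pathlib import Path

EXPECTED_SAMPLE_RATE = 22050  # sample rate stated in this README


def find_wrong_sample_rates(root, expected=EXPECTED_SAMPLE_RATE):
    """Return (path, rate) pairs for .wav files under root whose rate differs.

    Hypothetical helper for illustration; it is not shipped with the repo.
    """
    mismatches = []
    # rglob walks the directory tree recursively.
    for wav_path in sorted(Path(root).rglob("*.wav")):
        with wave.open(str(wav_path), "rb") as wav_file:
            rate = wav_file.getframerate()
        if rate != expected:
            mismatches.append((wav_path, rate))
    return mismatches
```

Calling find_wrong_sample_rates with the dataset root returns an empty list when every file is already at 22050 Hz; any file it reports would need resampling first.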

Train & Tensorboard

python trainer.py -c [config yaml file] -n [name of the run]
- Run cp config/default.yaml config/config.yaml, then edit config.yaml.
- Write the root paths of the train and validation files on the 2nd and 3rd lines.
- Each path should contain pairs of *.wav files with their corresponding (preprocessed) *.mel files.
- The data loader parses the list of files within each path recursively.
- For Multi-Band training, pass the config/mb_melgan config file to -c.
tensorboard --logdir logs/
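
The pairing requirement in the notes above can be checked before training starts. The sketch below is an illustration only: find_unpaired_wavs is a hypothetical name, and it assumes each *.mel file sits next to its *.wav file and shares the same file stem, which this README does not state explicitly.

```python
from pathlib import Path


def find_unpaired_wavs(root):
    """Return .wav files under root that have no sibling .mel file.

    Assumption (not confirmed by the README): "clip.wav" pairs with
    "clip.mel" in the same directory.
    """
    return [
        wav_path
        for wav_path in sorted(Path(root).rglob("*.wav"))  # recursive walk
        if not wav_path.with_suffix(".mel").is_file()
    ]
```

An empty result for both the train and validation roots suggests preprocessing covered every file; any path it lists likely needs preprocess.py to be run again.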

Pretrained model

Check out here.

Inference

python inference.py -p [checkpoint path] -i [input mel path]

Results

References

License

BSD 3-Clause License.

utils/stft.py by Prem Seetharaman (BSD 3-Clause License)
datasets/mel2samp.py from https://github.com/NVIDIA/waveglow (BSD 3-Clause License)
utils/hparams.py from https://github.com/HarryVolek/PyTorch_Speaker_Verification (No License specified)

Useful resources

How to Train a GAN? Tips and tricks to make GANs work by Soumith Chintala
Official MelGAN implementation by original authors
Reproduction of MelGAN - NeurIPS 2019 Reproducibility Challenge (Ablation Track) by Yifei Zhao, Yichao Yang, and Yang Gao
- "replacing the average pooling layer with max pooling layer and replacing reflection padding with replication padding improves the performance significantly, while combining them produces worse results"
