
SungFeng-Huang / Meta-TTS

Licence: other
Official repository of https://arxiv.org/abs/2111.04040v1

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Meta-TTS

Transferlearning
Transfer learning / domain adaptation / domain generalization / multi-task learning, etc. Papers, code, datasets, applications, and tutorials.
Stars: ✭ 8,481 (+12191.3%)
Mutual labels:  meta-learning, few-shot-learning
awesome-few-shot-meta-learning
awesome few shot / meta learning papers
Stars: ✭ 44 (-36.23%)
Mutual labels:  meta-learning, few-shot-learning
Meta Learning Papers
Meta Learning / Learning to Learn / One Shot Learning / Few Shot Learning
Stars: ✭ 2,420 (+3407.25%)
Mutual labels:  meta-learning, few-shot-learning
LibFewShot
LibFewShot: A Comprehensive Library for Few-shot Learning.
Stars: ✭ 629 (+811.59%)
Mutual labels:  meta-learning, few-shot-learning
sib meta learn
Code of Empirical Bayes Transductive Meta-Learning with Synthetic Gradients
Stars: ✭ 56 (-18.84%)
Mutual labels:  meta-learning, few-shot-learning
CDFSL-ATA
[IJCAI 2021] Cross-Domain Few-Shot Classification via Adversarial Task Augmentation
Stars: ✭ 21 (-69.57%)
Mutual labels:  meta-learning, few-shot-learning
Awesome-Few-shot
Awesome Few-shot learning
Stars: ✭ 50 (-27.54%)
Mutual labels:  meta-learning, few-shot-learning
MeTAL
Official PyTorch implementation of "Meta-Learning with Task-Adaptive Loss Function for Few-Shot Learning" (ICCV2021 Oral)
Stars: ✭ 24 (-65.22%)
Mutual labels:  meta-learning, few-shot-learning
FUSION
PyTorch code for NeurIPSW 2020 paper (4th Workshop on Meta-Learning) "Few-Shot Unsupervised Continual Learning through Meta-Examples"
Stars: ✭ 18 (-73.91%)
Mutual labels:  meta-learning, few-shot-learning
simple-cnaps
Source codes for "Improved Few-Shot Visual Classification" (CVPR 2020), "Enhancing Few-Shot Image Classification with Unlabelled Examples" (WACV 2022), and "Beyond Simple Meta-Learning: Multi-Purpose Models for Multi-Domain, Active and Continual Few-Shot Learning" (Neural Networks 2022 - in submission)
Stars: ✭ 88 (+27.54%)
Mutual labels:  meta-learning, few-shot-learning
Learning-To-Compare-For-Text
Learning To Compare For Text: few-shot learning for text classification
Stars: ✭ 38 (-44.93%)
Mutual labels:  meta-learning, few-shot-learning
LearningToCompare-Tensorflow
Tensorflow implementation for paper: Learning to Compare: Relation Network for Few-Shot Learning.
Stars: ✭ 17 (-75.36%)
Mutual labels:  meta-learning, few-shot-learning
StyleSpeech
Official implementation of Meta-StyleSpeech and StyleSpeech
Stars: ✭ 161 (+133.33%)
Mutual labels:  speech-synthesis, meta-learning
FSL-Mate
FSL-Mate: A collection of resources for few-shot learning (FSL).
Stars: ✭ 1,346 (+1850.72%)
Mutual labels:  meta-learning, few-shot-learning
Parallel-Tacotron2
PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
Stars: ✭ 149 (+115.94%)
Mutual labels:  speech-synthesis
ttslearn
ttslearn: Library for "Pythonで学ぶ音声合成" (Text-to-Speech with Python)
Stars: ✭ 158 (+128.99%)
Mutual labels:  speech-synthesis
mimic2
Text to Speech engine based on the Tacotron architecture, initially implemented by Keith Ito.
Stars: ✭ 537 (+678.26%)
Mutual labels:  speech-synthesis
mliis
Code for meta-learning initializations for image segmentation
Stars: ✭ 21 (-69.57%)
Mutual labels:  meta-learning
talkie
Text-to-speech browser extension button. Select text on any web page, and have the computer read it out loud for you by simply clicking the Talkie button.
Stars: ✭ 43 (-37.68%)
Mutual labels:  speech-synthesis
LVCNet
LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation
Stars: ✭ 67 (-2.9%)
Mutual labels:  speech-synthesis

Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech

This repository is the official implementation of "Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech".

Topics: multi-task learning, meta learning

Meta-TTS


Requirements

This is how I built my environment; you don't need to follow it exactly:

  • Sign up for Comet.ml, find your workspace and API key via www.comet.ml/api/my/settings, and fill them in config/comet.py (a sketch of this file is shown after this list). The Comet logger is used throughout the train/val/test stages.
    • Check my training logs here.
  • [Optional] Install pyenv for Python version control and switch to Python 3.8.6.
# After downloading and installing pyenv:
pyenv install 3.8.6
pyenv local 3.8.6
  • [Optional] Install pyenv-virtualenv as a pyenv plugin for a clean virtual environment.
# After installing pyenv-virtualenv:
pyenv virtualenv meta-tts
pyenv activate meta-tts
  • Install requirements:
pip install -r requirements.txt
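
For reference, config/comet.py only needs to hold your Comet credentials. The sketch below is an assumption about its layout (the variable names are hypothetical), so match it against the actual file in the repository:

# config/comet.py -- hypothetical sketch, not the repository's actual contents.
# Fill in the values from www.comet.ml/api/my/settings.
api_key = "YOUR_COMET_API_KEY"    # personal Comet API key (assumed variable name)
workspace = "your-workspace"      # Comet workspace (assumed variable name)
project_name = "meta-tts"         # also appears in the checkpoint output path (assumed variable name)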

Preprocessing

First, download LibriTTS and VCTK, change the corpus paths in config/LibriTTS/preprocess.yaml and config/VCTK/preprocess.yaml, and then run

python3 prepare_align.py config/LibriTTS/preprocess.yaml
python3 prepare_align.py config/VCTK/preprocess.yaml

to prepare the data for alignment.

The alignments of LibriTTS are provided here, and the alignments of VCTK are provided here. Unzip the files into preprocessed_data/LibriTTS/TextGrid/ and preprocessed_data/VCTK/TextGrid/, respectively.
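
For example, assuming the downloaded archives are plain zip files named LibriTTS.zip and VCTK.zip (hypothetical filenames), extraction could look like this; adjust the -d target if an archive's internal directory layout already contains a TextGrid/ folder:

unzip LibriTTS.zip -d preprocessed_data/LibriTTS/TextGrid/
unzip VCTK.zip -d preprocessed_data/VCTK/TextGrid/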

Then run the preprocessing script:

python3 preprocess.py config/LibriTTS/preprocess.yaml

# Copy stats from LibriTTS to VCTK so pitch/energy normalization uses the same shift and bias.
cp preprocessed_data/LibriTTS/stats.json preprocessed_data/VCTK/

python3 preprocess.py config/VCTK/preprocess.yaml

Training

To train the models in the paper, run this command:

python3 main.py -s train \
                -p config/preprocess/<corpus>.yaml \
                -m config/model/base.yaml \
                -t config/train/base.yaml config/train/<corpus>.yaml \
                -a config/algorithm/<algorithm>.yaml
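
For example, a training run on LibriTTS could look like the following; the algorithm config filename is an assumption, so check config/algorithm/ for the actual options:

python3 main.py -s train \
                -p config/preprocess/LibriTTS.yaml \
                -m config/model/base.yaml \
                -t config/train/base.yaml config/train/LibriTTS.yaml \
                -a config/algorithm/meta.yaml  # assumed filename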

To reproduce the paper's results, please use 8 V100 GPUs for the meta models and 1 V100 GPU for the baseline models; otherwise you may need to tune the gradient accumulation step (grad_acc_step) setting in config/train/base.yaml to keep the meta batch size correct (for example, halving the number of GPUs roughly means doubling grad_acc_step to keep the same effective meta batch size). Note that each GPU has its own random seed, so even with the same meta batch size, a different number of GPUs is equivalent to a different random seed.

After training, you can find your checkpoints under output/ckpt/<corpus>/<project_name>/<experiment_key>/checkpoints/, where the project name is set in config/comet.py.

To run inference with the trained models, run:

python3 main.py -s test \
                -p config/preprocess/<corpus>.yaml \
                -m config/model/base.yaml \
                -t config/train/base.yaml config/train/<corpus>.yaml \
                -a config/algorithm/<algorithm>.yaml \
                -e <experiment_key> -c <checkpoint_file_name>
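
A filled-in test call could look like the following; the algorithm config, experiment key, and checkpoint filename are made-up placeholders, so substitute the values from your own training run:

python3 main.py -s test \
                -p config/preprocess/LibriTTS.yaml \
                -m config/model/base.yaml \
                -t config/train/base.yaml config/train/LibriTTS.yaml \
                -a config/algorithm/meta.yaml \
                -e 0a1b2c3d \
                -c "epoch=99.ckpt"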

The results will be under output/result/<corpus>/<experiment_key>/<algorithm>/.

Evaluation

Note: the evaluation code has not been refactored yet.

cd into evaluation/ and follow the instructions in its README.md.

Pre-trained Models

Note: the checkpoints were saved with an older version of the code and might not be compatible with the current code. We will fix this in the future.

Since our code uses the Comet logger, you might need to create a dummy experiment by running:

# Create a dummy Comet experiment; this requires your Comet API key to be configured.
from comet_ml import Experiment
experiment = Experiment()

then put the checkpoint files under output/ckpt/LibriTTS/<project_name>/<experiment_key>/checkpoints/.
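
For example (a minimal sketch; <project_name>, <experiment_key>, and the downloaded checkpoint filename are placeholders to fill in yourself):

mkdir -p output/ckpt/LibriTTS/<project_name>/<experiment_key>/checkpoints/
cp <downloaded_checkpoint>.ckpt output/ckpt/LibriTTS/<project_name>/<experiment_key>/checkpoints/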

You can download pretrained models here.

Results

(Result figures, per corpus LibriTTS / VCTK: Speaker Similarity, Speaker Verification, Synthesized Speech Detection.)
