Kyubyong / Deepvoice3

TensorFlow Implementation of Deep Voice 3

Deep Voice 3

Work In Progress

To check the current status, see this.

This is a TensorFlow implementation of DEEP VOICE 3: 2000-SPEAKER NEURAL TEXT-TO-SPEECH. For now, I'm focusing on single-speaker synthesis.

Data

I'm experimenting with Nick Offerman's audiobook files, for fun, and with The LJ Speech Dataset, which is in the public domain.

File Description

  • hyperparams.py: hyperparameters
  • prepro.py: creates inputs and targets, i.e., mel spectrogram, magnitude, and dones.
  • data_load.py
  • utils.py: several custom operational functions.
  • modules.py: building blocks for the networks.
  • networks.py: encoder, decoder, and converter
  • train.py: training
  • synthesize.py: inference
  • test_sents.txt: some test sentences from the paper.
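
To make the three targets listed for prepro.py more concrete, here is a minimal NumPy sketch of how a mel spectrogram, a magnitude spectrogram, and done flags could be derived from a waveform. It is not the code in prepro.py. The function names (`mel_filterbank`, `make_targets`), the parameter values (sample rate, FFT size, hop length, number of mel bands), and the convention that the done flag is 1 only on the final frame are all assumptions chosen for illustration.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, 1 + n_fft // 2)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, 1 + n_fft // 2)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, center, hi = hz_points[i:i + 3]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def make_targets(wav, sr=22050, n_fft=1024, hop=256, n_mels=80):
    """Return (mel, magnitude, dones) for a 1-D waveform (hypothetical helper).

    All default values are illustrative assumptions, not the repo's settings.
    Requires len(wav) >= n_fft.
    """
    # Slice the signal into overlapping frames and apply a Hann window.
    n_frames = 1 + (len(wav) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack(
        [wav[t * hop:t * hop + n_fft] * window for t in range(n_frames)]
    )
    # Magnitude spectrogram: absolute value of the real FFT of each frame.
    mag = np.abs(np.fft.rfft(frames, axis=1))          # (n_frames, 1 + n_fft // 2)
    # Mel spectrogram: project the magnitudes onto the mel filterbank.
    mel = mag @ mel_filterbank(sr, n_fft, n_mels).T    # (n_frames, n_mels)
    # Done flags (assumed convention): 1 on the last frame, 0 elsewhere.
    dones = np.zeros(n_frames, dtype=np.int32)
    dones[-1] = 1
    return mel, mag, dones


if __name__ == "__main__":
    # One second of a 440 Hz sine wave as a stand-in for real speech.
    wav = np.sin(2 * np.pi * 440 * np.arange(22050) / 22050)
    mel, mag, dones = make_targets(wav)
    print(mel.shape, mag.shape, dones.shape)
```

In practice an audio library would normally handle the STFT and the mel filterbank. They are written out here only so that the sketch depends on nothing but NumPy.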

Papers that referenced this repo
