CSTR-Edinburgh / Merlin

Licence: other
This is now the official location of the Merlin project.

Merlin: The Neural Network (NN) based Speech Synthesis System

This repository contains the Neural Network (NN) based Speech Synthesis System
developed at the Centre for Speech Technology Research (CSTR), University of Edinburgh.

Merlin is a toolkit for building Deep Neural Network models for statistical parametric speech synthesis. It must be used in combination with a front-end text processor (e.g., Festival) and a vocoder (e.g., STRAIGHT or WORLD).

The system is written in Python and relies on the Theano numerical computation library.

Merlin comes with recipes (in the spirit of the Kaldi automatic speech recognition toolkit) to show you how to build state-of-the-art systems.

Merlin is free software, distributed under the Apache License, Version 2.0, which allows unrestricted commercial and non-commercial use alike.

Read the documentation at cstr-edinburgh.github.io/merlin.

Merlin is compatible with Python 2.7 through 3.6.

Installation

Merlin uses the following dependencies:

  • numpy, scipy
  • matplotlib
  • bandmat
  • theano
  • tensorflow (optional, required if you use tensorflow models)
  • sklearn, keras, h5py (optional, required if you use keras models)
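
Before running the install steps, it can help to see which of the dependencies listed above are already importable. The following is a minimal sketch and is not part of Merlin: the names `REQUIRED`, `OPTIONAL`, `missing_modules`, and `report` are made up for illustration, and the import names are assumed to match the package names in the list. It uses `importlib.util.find_spec`, so it needs Python 3.4 or later.

```python
import importlib.util

# Import names taken from the dependency list above (assumed to match
# the top-level module names of the installed packages).
REQUIRED = ["numpy", "scipy", "matplotlib", "bandmat", "theano"]
OPTIONAL = {
    "tensorflow models": ["tensorflow"],
    "keras models": ["sklearn", "keras", "h5py"],
}


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def report():
    """Build one status line for the required set and one per optional set."""
    lines = []
    missing = missing_modules(REQUIRED)
    status = "ok" if not missing else "missing " + ", ".join(missing)
    lines.append("required: " + status)
    for feature, names in OPTIONAL.items():
        missing = missing_modules(names)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        lines.append("optional (" + feature + "): " + status)
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

The check only looks the modules up without importing them, so it stays fast even when a heavy package such as theano or tensorflow is installed. It does not verify versions.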

To install Merlin, change into the merlin directory and run the steps below:

  • Install some basic tools in Merlin:
    bash tools/compile_tools.sh
  • Install the Python dependencies:
    pip install -r requirements.txt

For detailed instructions on building the toolkit, see INSTALL and the CSTR blog post.
These instructions are valid for UNIX systems, including various flavors of Linux.

Getting started with Merlin

To run the example system builds, see egs/README.txt.

As a first demo, please follow the scripts in egs/slt_arctic.

You can also follow Josh Meyer's blog post for detailed instructions
on how to install Merlin and build the SLT demo voice.

For a more in-depth tutorial about building voices with Merlin, you can check out:

Synthetic speech samples

Listen to synthetic speech samples from our SLT arctic voice.

Development pattern for contributors

  1. Create a personal fork of the main Merlin repository in GitHub.
  2. Make your changes in a named branch other than master, e.g., a branch called my-new-feature.
  3. Generate a pull request through the Web interface of GitHub.

Contact Us

Post your questions, suggestions, and discussions to GitHub Issues.

Citation

If you publish work based on Merlin, please cite:

Zhizheng Wu, Oliver Watts, Simon King, "Merlin: An Open Source Neural Network Speech Synthesis System" in Proc. 9th ISCA Speech Synthesis Workshop (SSW9), September 2016, Sunnyvale, CA, USA.
