vlomme / Multi Tacotron Voice Cloning

Licence: other

Phoneme-based multilingual (Russian-English) voice cloning based on

Programming Languages

python


Labels

deep-learning pytorch tensorflow tts russian tacotron

Projects that are alternatives of or similar to Multi Tacotron Voice Cloning

FCH-TTS

A fast Text-to-Speech (TTS) model. Works well for English, Mandarin Chinese, Japanese, Korean, Russian and Tibetan (the languages tested so far).

Stars: ✭ 154 (-19.79%)

Mutual labels: tts, russian, tacotron

TTS tf

WIP Tensorflow implementation of https://github.com/mozilla/TTS

Stars: ✭ 14 (-92.71%)

Mutual labels: tts, tacotron

tacotron2

Multispeaker & Emotional TTS based on Tacotron 2 and Waveglow

Stars: ✭ 102 (-46.87%)

Mutual labels: tts, tacotron

Gst Tacotron

A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis

Stars: ✭ 175 (-8.85%)

Mutual labels: tacotron, tts

Mimic Recording Studio

Mimic Recording Studio is a Docker-based application you can install to record voice samples, which can then be trained into a TTS voice with Mimic2

Stars: ✭ 202 (+5.21%)

Mutual labels: tacotron, tts

Tacotron

Audio samples accompanying publications related to Tacotron, an end-to-end speech synthesis model.

Stars: ✭ 493 (+156.77%)

Mutual labels: tacotron, tts

Text-to-Speech-Landscape

No description or website provided.

Stars: ✭ 31 (-83.85%)

Mutual labels: tts, tacotron

Tacotron

A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)

Stars: ✭ 2,581 (+1244.27%)

Mutual labels: tacotron, tts

Tts

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Stars: ✭ 305 (+58.85%)

Mutual labels: tacotron, tts

Tts

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

Stars: ✭ 5,427 (+2726.56%)

Mutual labels: tacotron, tts

Tacotron2-PyTorch

Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed.

Stars: ✭ 118 (-38.54%)

Mutual labels: tts, tacotron

Wavernn

WaveRNN Vocoder + TTS

Stars: ✭ 1,636 (+752.08%)

Mutual labels: tacotron, tts

Comprehensive-Tacotron2

PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single-, multi-speaker TTS and several techniques to enforce the robustness and efficiency of the model.

Stars: ✭ 22 (-88.54%)

Mutual labels: tts, tacotron

Tacotron Wavernn

TTS (Tacotron + WaveRNN)

Stars: ✭ 40 (-79.17%)

Mutual labels: tacotron, tts

Tacotron Pytorch

Pytorch implementation of Tacotron

Stars: ✭ 189 (-1.56%)

Mutual labels: tacotron, tts

Automatic Youtube Reddit Text To Speech Video Generator And Uploader

A series of 3 programs that will automatically receive scripts from Reddit, allow the user to edit them, then be sent off to a video generator where they will be uploaded to YouTube automatically.

Stars: ✭ 152 (-20.83%)

Mutual labels: tts

Mrcp Plugin With Freeswitch

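Uses FreeSWITCH to accept calls from users' mobile phones, runs speech recognition (ASR) on the caller's speech through the iFLYTEK open platform (xfyun) plugin integrated into UniMRCP Server, and invokes speech synthesis (TTS) according to custom business logic, to build a simple end-to-end voice call center.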

Stars: ✭ 168 (-12.5%)

Mutual labels: tts

Aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

Stars: ✭ 1,942 (+911.46%)

Mutual labels: tts

Awesome Speech Recognition Speech Synthesis Papers

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

Stars: ✭ 2,085 (+985.94%)

Mutual labels: tts

Google Tts

Google TTS (Text-To-Speech) for node.js

Stars: ✭ 180 (-6.25%)

Mutual labels: tts


Multi-Tacotron Voice Cloning

This repository is a phoneme-based multilingual (Russian-English) implementation based on Real-Time-Voice-Cloning. It is a four-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model. If you only need the English version, please use the original implementation.
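The "numerical representation of a voice" is a fixed-length speaker embedding. The sketch below illustrates the general idea only and is not this repository's code: it assumes some encoder has already produced one vector per utterance (the toy 3-dimensional values and the function names are hypothetical), averages them into a single speaker vector, and compares voices by cosine similarity.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def speaker_embedding(utterance_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average per-utterance embeddings, then re-normalize the mean."""
    count = len(utterance_embeddings)
    dim = len(utterance_embeddings[0])
    mean = [sum(e[i] for e in utterance_embeddings) / count for i in range(dim)]
    return l2_normalize(mean)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors."""
    a_unit, b_unit = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_unit, b_unit))


# Toy vectors standing in for real encoder outputs (assumed values).
alice = speaker_embedding([[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]])
bob = speaker_embedding([[0.0, 0.2, 0.9], [0.1, 0.1, 0.8]])
new_clip = l2_normalize([0.85, 0.15, 0.05])

# The new clip should be closer to Alice's voice than to Bob's.
print(cosine_similarity(new_clip, alice) > cosine_similarity(new_clip, bob))
```

In a voice-cloning pipeline, a vector like `alice` would be passed to the synthesizer alongside the text so that the generated speech takes on that speaker's characteristics.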

Example

Quick start

Use the colab online demo

Requirements

You will need the following whether you plan to use the toolbox only or to retrain the models.

Python 3.6 or later.

PyTorch (>=1.0.1).

Run pip install -r requirements.txt to install the necessary packages.

A GPU is mandatory, but you do not need a high-end GPU if you only want to use the toolbox.
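The requirements above can be checked before launching anything. This is a minimal sketch, not a script shipped with the repository: `parse_version` and `check_environment` are hypothetical helper names, and the only PyTorch calls used are `torch.__version__` and `torch.cuda.is_available()`.

```python
import importlib
import re
import sys
from typing import Dict, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.0.1' or '2.1.0+cu118' into a tuple of leading integers."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_environment() -> Dict[str, bool]:
    """Report whether Python, PyTorch and a CUDA-capable GPU are available."""
    report = {"python_ok": sys.version_info >= (3, 6)}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not installed, so the GPU cannot be queried either.
        report["torch_ok"] = False
        report["cuda_ok"] = False
        return report
    report["torch_ok"] = parse_version(torch.__version__) >= (1, 0, 1)
    report["cuda_ok"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print("%s: %s" % (name, "ok" if ok else "MISSING"))
```

The check only covers the three items listed here; the packages in requirements.txt still have to be installed with pip.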

Pretrained models

Download the latest here.

Datasets

Name	Language	Link	Comments	My link	My comments
Phoneme dictionary	En, Ru	En, Ru	Phoneme dictionary	link	Combined the Russian and English phoneme dictionaries
LibriSpeech	En	link	300 speakers, 360h clean speech
VoxCeleb	En	link	7000 speakers, many hours of low-quality speech
M-AILABS	Ru	link	3 speakers, 46h clean speech
open_tts, open_stt	Ru	open_tts, open_stt	Many speakers, many hours of low-quality speech	link	Cleaned 4 hours of speech from one speaker. Fixed the annotations and split the audio into segments of up to 7 seconds
Voxforge+audiobook	Ru	link	Many speakers, 25h of varying quality	link	Selected the good files and split them into segments. Added audiobooks from the internet. The result is 200 speakers with a couple of minutes each
RUSLAN	Ru	link	One speaker, 40h good speech	link	Resampled to 16 kHz
Mozilla	Ru	link	50 speakers, 30h good speech	link	Resampled to 16 kHz and sorted the speakers into separate folders
Russian Single	Ru	link	One speaker, 9h good speech	link	Resampled to 16 kHz
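Two preprocessing steps recur in the comments above: splitting recordings into segments of at most 7 seconds, and converting audio to 16 kHz. The arithmetic behind both can be sketched as follows. This is an illustration, not the author's preprocessing code: the function names are hypothetical, and the fixed-length cuts are a simplifying assumption (a real pipeline would more likely cut at pauses so that words are not split).

```python
from typing import List, Tuple


def split_into_segments(n_samples: int, sample_rate: int,
                        max_seconds: float = 7.0) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of consecutive segments,
    each no longer than max_seconds."""
    max_len = int(sample_rate * max_seconds)
    if max_len <= 0:
        raise ValueError("sample_rate * max_seconds must be at least one sample")
    return [(start, min(start + max_len, n_samples))
            for start in range(0, n_samples, max_len)]


def resampled_length(n_samples: int, source_rate: int,
                     target_rate: int = 16000) -> int:
    """Number of samples a clip will have after conversion to target_rate."""
    return round(n_samples * target_rate / source_rate)


# A 20-second clip at 16 kHz is cut into segments of 7 s, 7 s and 6 s.
for start, end in split_into_segments(20 * 16000, 16000):
    print(start, end, (end - start) / 16000)

# A 10-second clip recorded at 44.1 kHz shrinks to 160000 samples at 16 kHz.
print(resampled_length(441000, 44100))
```

The index pairs can be used to slice a decoded waveform; the resampling itself would be done with an audio library rather than by hand.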

Toolbox

You can then try the toolbox:

python demo_toolbox.py -d <datasets_root>
or
python demo_toolbox.py

Wiki

Pretrained models

Training (and for other languages), in Russian

Training (and for other languages)

Contribution

For any questions, please email me.

Papers implemented

URL	Designation	Title	Implementation source
1806.04558	SV2TTS	Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis	CorentinJ
1802.08435	WaveRNN (vocoder)	Efficient Neural Audio Synthesis	fatchord/WaveRNN
1712.05884	Tacotron 2 (synthesizer)	Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions	Rayhane-mamah/Tacotron-2
1710.10467	GE2E (encoder)	Generalized End-To-End Loss for Speaker Verification	CorentinJ


Stars: ✭ 192
