
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

Stars: ✭ 2,085 (+1039.34%)

Mutual labels: tts

Melnet

Implementation of "MelNet: A Generative Model for Audio in the Frequency Domain"

Stars: ✭ 161 (-12.02%)

Mutual labels: tts

Speechrecognizerbutton

UIButton subclass with push to talk recording, speech recognition and Siri-style waveform view.

Stars: ✭ 144 (-21.31%)

Mutual labels: speech-to-text

Tensorflowtts

😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

Stars: ✭ 2,382 (+1201.64%)

Mutual labels: tts

Speech To Text Russian

A project for Russian speech recognition based on pykaldi.

Stars: ✭ 151 (-17.49%)

Mutual labels: speech-to-text

Hey Jetson

Deep Learning based Automatic Speech Recognition with attention for the Nvidia Jetson.

Stars: ✭ 161 (-12.02%)

Mutual labels: speech-to-text

Zzz Retired openstt

RETIRED - OpenSTT is now retired. If you would like more information on Mycroft AI's open source STT projects, please visit:

Stars: ✭ 146 (-20.22%)

Mutual labels: speech-to-text

Gst Tacotron

A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis

Stars: ✭ 175 (-4.37%)

Mutual labels: tts

Tacotron

A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model

Stars: ✭ 1,756 (+859.56%)

Mutual labels: tts

Google Tts

Google TTS (Text-To-Speech) for node.js

Stars: ✭ 180 (-1.64%)

Mutual labels: tts

Naomi

The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!

Stars: ✭ 171 (-6.56%)

Mutual labels: speech-to-text

Jiwer

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

Stars: ✭ 158 (-13.66%)

Mutual labels: speech-to-text

Proctoring Ai

Creating software for automatic monitoring in online proctoring

Stars: ✭ 155 (-15.3%)

Mutual labels: speech-to-text


Making a TTS model with 1 minute of speech samples within 10 minutes

After seeing my implementations of Tacotron and DCTTS, many people have asked me "How large a speech dataset is needed for neural TTS?" or "Can you make a TTS model with X hour(s)/minute(s) of training data?" I'm fully aware of the importance of those questions. When you plan a service using TTS, you cannot always count on getting lots of speech samples. I would like to give an answer. I really do. But unfortunately I have no answer.

The only thing I know is that I could train a model successfully with five hours of speech samples I extracted from Kate Winslet's audiobook. I haven't tried less data than that. I could try it, but actually I have a better idea. Since I have a decent model trained on the LJ Speech Dataset for several days, why don't I use it? After all, we all have different voices, but the way we speak English is not totally different.

In the above two repos, I trained TTS models from scratch using all the speech samples of my two favorite celebrities, Nick Offerman and Kate Winslet. This time, I use only one minute of those speech samples. The following are the synthesized samples after 10 minutes of fine-tuning. Do you think they sound like them?

Check Nick Samples
Check Kate Samples

Additionally, I collected 10 speech samples of Modern Family celebrities from YouTube and generated their voices by training on those samples.

Check here to see the model details, source code and the pretrained model which served as a seed.
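
The warm-start idea described above (reuse a well-trained seed model, then fine-tune briefly on a small amount of new data) can be illustrated with a toy sketch. This is not the project's code: a one-dimensional linear model stands in for the TTS network, and every name and number below (`make_data`, `train`, the weights, the step budget) is a hypothetical chosen only to make the comparison visible.

```python
# Toy illustration only: a 1-D linear model stands in for a TTS network.
# All names and numbers here are hypothetical, not taken from the project.

def make_data(w, b, n):
    """Return n noise-free (x, y) pairs on an even grid over [-1, 1]."""
    xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return [(x, w * x + b) for x in xs]

def mse(params, data):
    """Mean squared error of the line (w, b) on the given pairs."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train(params, data, steps, lr=0.1):
    """Full-batch gradient descent on MSE, starting from params."""
    w, b = params
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

# "Large corpus, long training": stands in for the seed model.
base_data = make_data(w=2.0, b=1.0, n=200)
seed = train((0.0, 0.0), base_data, steps=500)

# "One minute of a new speaker": a few samples from a nearby target.
target_train = make_data(w=2.2, b=0.8, n=5)
target_eval = make_data(w=2.2, b=0.8, n=50)

# Same small training budget for both runs; only the starting point differs.
budget = 20
adapted = train(seed, target_train, steps=budget)
scratch = train((0.0, 0.0), target_train, steps=budget)

adapted_loss = mse(adapted, target_eval)
scratch_loss = mse(scratch, target_eval)
print(f"fine-tuned from seed: {adapted_loss:.5f}")
print(f"trained from scratch: {scratch_loss:.5f}")
```

With the same step budget, the run that starts from the seed ends with a lower loss on the target data than the run that starts from zero weights, because the seed is already close to the target. Whether the same holds for a real TTS network depends on the model and the data.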


Stars: ✭ 183
