open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
Stars: ✭ 841 (+56.61%)
Wavegrad
A fast, high-quality neural vocoder.
Stars: ✭ 138 (-74.3%)
Khronos
The open source intelligent personal assistant
Stars: ✭ 25 (-95.34%)
Tacotron2-PyTorch
Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed.
Stars: ✭ 118 (-78.03%)
Voice Builder
An open-source text-to-speech (TTS) voice building tool
Stars: ✭ 362 (-32.59%)
Cotatron
Official code for Cotatron @ INTERSPEECH 2020
Stars: ✭ 137 (-74.49%)
web-speech-cognitive-services
Polyfill Web Speech API with Cognitive Services Bing Speech for both speech-to-text and text-to-speech service.
Stars: ✭ 35 (-93.48%)
Legacy straight
A vocoder framework that has been widely used in the research community since 1999.
Stars: ✭ 130 (-75.79%)
AdaSpeech
AdaSpeech: Adaptive Text to Speech for Custom Voice
Stars: ✭ 108 (-79.89%)
Pytorch Dc Tts
Text to Speech with PyTorch (English and Mongolian)
Stars: ✭ 122 (-77.28%)
vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Stars: ✭ 1,604 (+198.7%)
Durian
Implementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.
Stars: ✭ 111 (-79.33%)
ExtensibleTTS-PyTorch
An extensible speech synthesis system built with PyTorch; the original code is from r9y9's https://github.com/r9y9/nnmnkwii_gallery
Stars: ✭ 25 (-95.34%)
Catch-A-Waveform
Official PyTorch implementation of the paper: "Catch-A-Waveform: Learning to Generate Audio from a Single Short Example" (NeurIPS 2021)
Stars: ✭ 117 (-78.21%)
Espeak
eSpeak NG is an open source speech synthesizer that supports 101 languages and accents.
Stars: ✭ 339 (-36.87%)
tacotron2
Multispeaker & Emotional TTS based on Tacotron 2 and Waveglow
Stars: ✭ 102 (-81.01%)
Openseq2seq
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
Stars: ✭ 1,378 (+156.61%)
Cross vc
Cross-lingual Voice Conversion
Stars: ✭ 91 (-83.05%)
TinyCog
Small Robot, Toy Robot platform
Stars: ✭ 29 (-94.6%)
Merlin
This is now the official location of the Merlin project.
Stars: ✭ 1,168 (+117.5%)
wiki2ssml
Wiki2SSML provides the WikiVoice markup language used for fine-tuning synthesised voice.
Stars: ✭ 31 (-94.23%)
FCH-TTS
A fast Text-to-Speech (TTS) model. Works well for English, Mandarin/Chinese, Japanese, Korean, Russian and Tibetan (so far).
Stars: ✭ 154 (-71.32%)
GlottDNN
GlottDNN vocoder and tools for training DNN excitation models
Stars: ✭ 30 (-94.41%)
Cross-Speaker-Emotion-Transfer
PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech
Stars: ✭ 107 (-80.07%)
Lingvo
Lingvo
Stars: ✭ 2,361 (+339.66%)
Hifi Gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Stars: ✭ 325 (-39.48%)
Artyom.js
A voice control, voice command, speech recognition, and speech synthesis JavaScript library. Create your own Siri, Google Now, or Cortana with Google Chrome within your website.
Stars: ✭ 1,011 (+88.27%)
Lightspeech
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
Stars: ✭ 31 (-94.23%)
sam
Software Automatic Mouth - Tiny Speech Synthesizer
Stars: ✭ 316 (-41.15%)
Pororo
PORORO: Platform Of neuRal mOdels for natuRal language prOcessing
Stars: ✭ 812 (+51.21%)
ml-with-audio
HF's ML for Audio study group
Stars: ✭ 104 (-80.63%)
World
A high-quality speech analysis, manipulation and synthesis system
Stars: ✭ 769 (+43.2%)
voder
An emulation of the Voder Speech Synthesizer.
Stars: ✭ 19 (-96.46%)
Parallelwavegan
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN) with PyTorch
Stars: ✭ 682 (+27%)
WaveGrad2
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Stars: ✭ 55 (-89.76%)
Parrot
RNN-based generative models for speech.
Stars: ✭ 601 (+11.92%)
Melgan Neurips
GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis
Stars: ✭ 592 (+10.24%)
few-shot-transformer-tts
Byte-based multilingual transformer TTS for low-resource/few-shot language adaptation.
Stars: ✭ 60 (-88.83%)
Athena
An open-source implementation of a sequence-to-sequence based speech processing engine
Stars: ✭ 542 (+0.93%)
Java Speech Api
The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.
Stars: ✭ 490 (-8.75%)
Sinsy-NG
(discontinued) 🎵 The Formant-Based All Language Singing Voice Synthesis System: Sinsy-NG
Stars: ✭ 15 (-97.21%)
Gantts
PyTorch implementation of GAN-based text-to-speech synthesis and voice conversion (VC)
Stars: ✭ 460 (-14.34%)
Cognitive Speech Tts
Microsoft Text-to-Speech API sample code in several languages, part of Cognitive Services.
Stars: ✭ 312 (-41.9%)
Espnet
End-to-End Speech Processing Toolkit
Stars: ✭ 4,533 (+744.13%)
MediumVC
Any-to-any voice conversion using synthetic specific-speaker speech as intermediate features
Stars: ✭ 46 (-91.43%)
Libfaceid
libfaceid is a research framework for prototyping face recognition solutions. It seamlessly integrates multiple detection, recognition and liveness models with speech synthesis and speech recognition.
Stars: ✭ 354 (-34.08%)
Universalvocoding
A PyTorch implementation of "Robust Universal Neural Vocoding"
Stars: ✭ 197 (-63.31%)
Multilingual text to speech
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
Stars: ✭ 324 (-39.66%)
melgan
MelGAN implementation with Multi-Band and Full Band supports...
Stars: ✭ 54 (-89.94%)
Zero-Shot-TTS
Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration
Stars: ✭ 33 (-93.85%)
Naomi
The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!
Stars: ✭ 171 (-68.16%)
Cyclegan Vc2
Voice Conversion by CycleGAN (voice cloning / voice conversion): CycleGAN-VC2
Stars: ✭ 158 (-70.58%)
Neural-HMM
Neural HMMs are all you need (for high-quality attention-free TTS)
Stars: ✭ 69 (-87.15%)
Voice2Mesh
CVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?
Stars: ✭ 67 (-87.52%)
ppg-vc
PPG-Based Voice Conversion
Stars: ✭ 154 (-71.32%)
Speech-Backbones
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
Stars: ✭ 205 (-61.82%)