Timit: The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus.
Stars: ✭ 202 (+210.77%)
Vad: Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
Stars: ✭ 622 (+856.92%)
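Zhrtvc: Chinese real-time voice cloning (VC) and Chinese text-to-speech (TTS). An easy-to-use Chinese voice cloning and speech synthesis system that includes a speech encoder, a speech synthesizer, a vocoder, and a visualization module.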
Stars: ✭ 771 (+1086.15%)
tacotron2: Multispeaker & Emotional TTS based on Tacotron 2 and Waveglow
Stars: ✭ 102 (+56.92%)
torch-asg: Auto Segmentation Criterion (ASG) implemented in PyTorch
Stars: ✭ 42 (-35.38%)
Mtrans: Multi-source Translation
Stars: ✭ 711 (+993.85%)
Nodejs Speech: Node.js client for Google Cloud Speech: Speech to text conversion powered by machine learning.
Stars: ✭ 545 (+738.46%)
Ekho: Chinese text-to-speech engine
Stars: ✭ 690 (+961.54%)
Esp8266sam: Speech synthesis for ESP8266 using S.A.M. port
Stars: ✭ 199 (+206.15%)
Sonus: 💬 /so.nus/ STT (speech to text) for Node with offline hotword detection
Stars: ✭ 532 (+718.46%)
Transformertts: 🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.
Stars: ✭ 617 (+849.23%)
ukrainian-tts: Ukrainian TTS (text-to-speech) using Coqui TTS
Stars: ✭ 74 (+13.85%)
SER-datasets: A collection of datasets for the purpose of emotion recognition/detection in speech.
Stars: ✭ 74 (+13.85%)
magphase: MagPhase Vocoder: Speech analysis/synthesis system for TTS and related applications.
Stars: ✭ 76 (+16.92%)
Allosaurus: Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
Stars: ✭ 135 (+107.69%)
lectures-all: Central repository for all lectures on deep learning at UPC ETSETB TelecomBCN.
Stars: ✭ 46 (-29.23%)
Real Time Voice Cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time
Stars: ✭ 32,095 (+49276.92%)
Melgan: MelGAN vocoder (compatible with NVIDIA/tacotron2)
Stars: ✭ 444 (+583.08%)
Transformer Tts: A PyTorch Implementation of "Neural Speech Synthesis with Transformer Network"
Stars: ✭ 418 (+543.08%)
Avpi: An open source voice command macro program
Stars: ✭ 130 (+100%)
Normit: Translations with speech synthesis in your terminal as a node package
Stars: ✭ 219 (+236.92%)
wav2vec2-live: Live speech recognition using Facebook's wav2vec 2.0 model.
Stars: ✭ 205 (+215.38%)
Cyclegan Vc2: Voice Conversion by CycleGAN (voice cloning / voice conversion): CycleGAN-VC2
Stars: ✭ 158 (+143.08%)
HTK: The Hidden Markov Model Toolkit (HTK) from University of Cambridge, with fixed issues.
Stars: ✭ 23 (-64.62%)
Tacotron 2: DeepMind's Tacotron-2 TensorFlow implementation
Stars: ✭ 1,968 (+2927.69%)
Asr audio data links: A list of publicly available audio data that anyone can download for ASR or other speech activities
Stars: ✭ 128 (+96.92%)
Zerospeech: VQ-VAE for Acoustic Unit Discovery and Voice Conversion
Stars: ✭ 137 (+110.77%)
opensnips: Open source projects related to Snips https://snips.ai/.
Stars: ✭ 50 (-23.08%)
Cotatron: Official code for Cotatron @ INTERSPEECH 2020
Stars: ✭ 137 (+110.77%)
ttsflow: TensorFlow speech synthesis C++ inference for voicenet
Stars: ✭ 17 (-73.85%)
Legacy straight: A vocoder framework that has been widely used in the research community since 1999.
Stars: ✭ 130 (+100%)
nlp-class: A Natural Language Processing course taught by Professor Ghassemi
Stars: ✭ 95 (+46.15%)
Tacotron Pytorch: A PyTorch Implementation of Tacotron: End-to-end Text-to-speech Deep-Learning Model
Stars: ✭ 104 (+60%)
Kaldi: kaldi-asr/kaldi is the official location of the Kaldi project.
Stars: ✭ 11,151 (+17055.38%)
Waveflow: A PyTorch implementation of "WaveFlow: A Compact Flow-based Model for Raw Audio"
Stars: ✭ 95 (+46.15%)
Xzvoice: Free and open source text-to-speech software
Stars: ✭ 355 (+446.15%)
Merlin: This is now the official location of the Merlin project.
Stars: ✭ 1,168 (+1696.92%)
UniSpeech: UniSpeech - Large Scale Self-Supervised Learning for Speech
Stars: ✭ 224 (+244.62%)
Tacotron2: A PyTorch implementation of Tacotron2, an end-to-end text-to-speech (TTS) system described in "Natural TTS Synthesis By Conditioning Wavenet On Mel Spectrogram Predictions".
Stars: ✭ 43 (-33.85%)
Code Switching Papers: A curated list of research papers and resources on code-switching
Stars: ✭ 122 (+87.69%)
Pororo: PORORO: Platform Of neuRal mOdels for natuRal language prOcessing
Stars: ✭ 812 (+1149.23%)
gtranscribe: Software for interview transcription
Stars: ✭ 12 (-81.54%)
oddvoices: An indie singing synthesizer
Stars: ✭ 4 (-93.85%)
linear16: Converts an audio file to a LINEAR16 file compatible with Google Speech.
Stars: ✭ 14 (-78.46%)
KeenASR-Android-PoC: A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html
Stars: ✭ 21 (-67.69%)
brasiltts: Brasil TTS is a set of Brazilian Portuguese speech synthesizers that read screens aloud for people with visual impairments. It converts text to audio, allowing blind or low-vision people to access the content displayed on the screen. Although the main target audience of text-to-speech systems such as Brasil TTS is made up of…
Stars: ✭ 34 (-47.69%)
SignDetect: This application was developed to help people who cannot speak interact with others with ease. It detects voice and converts the input speech into a sign-language-based video.
Stars: ✭ 21 (-67.69%)
Xr3player: 🎧 🎼 Advanced JavaFX Media Player
Stars: ✭ 472 (+626.15%)
Facemoji: 😆 A voice chatbot that can imitate your expression. OpenCV+Dlib+Live2D+Moments Recorder+Turing Robot+Iflytek IAT+Iflytek TTS
Stars: ✭ 320 (+392.31%)