TASNET: Time-domain Audio Separation Network (in PyTorch)
Stars: ✭ 18 (-76.62%)
Lhotse
Stars: ✭ 236 (+206.49%)
Vq Vae Speech: PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]
Stars: ✭ 187 (+142.86%)
Tts Papers: 🐸 Collection of TTS papers
Stars: ✭ 160 (+107.79%)
Lingvo
Stars: ✭ 2,361 (+2966.23%)
Wavegrad: Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.
Stars: ✭ 245 (+218.18%)
IMS-Toucan: Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. The development objectives are simplicity, modularity, controllability, and multilinguality.
Stars: ✭ 295 (+283.12%)
Volute: Raspberry Pi + Node.js = Speech Robot
Stars: ✭ 224 (+190.91%)
Wavegrad: A fast, high-quality neural vocoder.
Stars: ✭ 138 (+79.22%)
Avpi: Open-source voice command macro software
Stars: ✭ 130 (+68.83%)
Timit: The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus.
Stars: ✭ 202 (+162.34%)
lectures-all: Central repository for all lectures on deep learning at UPC ETSETB TelecomBCN.
Stars: ✭ 46 (-40.26%)
TF-Speech-Recognition-Challenge-Solution: Source code of the model used in the TensorFlow Speech Recognition Challenge (https://www.kaggle.com/c/tensorflow-speech-recognition-challenge). The solution ranked in the top 5% on the private leaderboard.
Stars: ✭ 58 (-24.68%)
Kerasdeepspeech: A Keras CTC implementation of Baidu's DeepSpeech for model experimentation
Stars: ✭ 245 (+218.18%)
Pytorch Kaldi: pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
Stars: ✭ 2,097 (+2623.38%)
pytorch-pcen: PyTorch reimplementation of per-channel energy normalization for audio.
Stars: ✭ 80 (+3.9%)
Setk: Tools for Speech Enhancement integrated with Kaldi
Stars: ✭ 227 (+194.81%)
Allosaurus: Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
Stars: ✭ 135 (+75.32%)
VQMIVC: Official implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo!
Stars: ✭ 278 (+261.04%)
Asr audio data links: A list of publicly available audio data that anyone can download for ASR or other speech activities
Stars: ✭ 128 (+66.23%)
Kaldi: kaldi-asr/kaldi is the official location of the Kaldi project.
Stars: ✭ 11,151 (+14381.82%)
Edgedict: Working online speech recognition based on RNN Transducer. (A trained model is available under Releases.)
Stars: ✭ 205 (+166.23%)
browser-apis: 🦄 Cool & Fun Browser Web APIs 🥳
Stars: ✭ 21 (-72.73%)
Esp8266sam: Speech synthesis for ESP8266 using S.A.M. port
Stars: ✭ 199 (+158.44%)
anycontrol: Voice control for your websites and applications
Stars: ✭ 53 (-31.17%)
Voice Gender: Gender recognition by voice and speech analysis
Stars: ✭ 248 (+222.08%)
Depression Detect: Predicting depression from acoustic features of speech using a Convolutional Neural Network.
Stars: ✭ 187 (+142.86%)
ventib: 📈 Ventib records your voice, transcribes it in real time, and performs speech pattern analysis to give you objective statistics about how you speak.
Stars: ✭ 43 (-44.16%)
Speechbrain.github.io: The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
Stars: ✭ 242 (+214.29%)
Multimodal-Gesture-Recognition-with-LSTMs-and-CTC: An end-to-end system that performs temporal recognition of gesture sequences using speech and skeletal input. The model combines three networks with a CTC output layer that recognises gestures from a continuous stream.
Stars: ✭ 25 (-67.53%)
Chatbot Watson Android: An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.
Stars: ✭ 169 (+119.48%)
Tacotron pytorch: PyTorch implementation of Tacotron speech synthesis model.
Stars: ✭ 242 (+214.29%)
Tacotron asr: Speech Recognition Using Tacotron
Stars: ✭ 165 (+114.29%)
Aeneas: aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Stars: ✭ 1,942 (+2422.08%)
Gcc Nmf: Real-time GCC-NMF Blind Speech Separation and Enhancement
Stars: ✭ 231 (+200%)
Tacotron: A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Stars: ✭ 1,756 (+2180.52%)
Diffwave: DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
Stars: ✭ 139 (+80.52%)
Source separation: Deep learning based speech source separation using PyTorch
Stars: ✭ 226 (+193.51%)
wav2vec2-live: Live speech recognition using Facebook's wav2vec 2.0 model.
Stars: ✭ 205 (+166.23%)
Voc: A physical model of the human vocal tract using literate programming, based on Pink Trombone.
Stars: ✭ 129 (+67.53%)
Speech Denoiser: A speech-denoising LV2 plugin based on the RNNoise library
Stars: ✭ 220 (+185.71%)
idear: 🎙️ Hands-free Audio Development Interface
Stars: ✭ 84 (+9.09%)
Tts Cube: End-to-end speech synthesis with recurrent neural networks
Stars: ✭ 213 (+176.62%)
cape: Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch
Stars: ✭ 29 (-62.34%)
ASR-Audio-Data-Links: A list of publicly available audio data that anyone can download for ASR or other speech activities
Stars: ✭ 179 (+132.47%)
txt2speech: Convert text to speech using Google Translate API
Stars: ✭ 38 (-50.65%)