web-speech-demo: Learn how to build a simple text-to-speech voice app for the web using the Web Speech API.
Stars: ✭ 19 (-52.5%)
idear: 🎙️ Handsfree Audio Development Interface
Stars: ✭ 84 (+110%)
CVC: Contrastive Learning for Non-parallel Voice Conversion (INTERSPEECH 2021, in PyTorch)
Stars: ✭ 45 (+12.5%)
browser-apis: 🦄 Cool & Fun Browser Web APIs 🥳
Stars: ✭ 21 (-47.5%)
Voice Gender: Gender recognition by voice and speech analysis
Stars: ✭ 248 (+520%)
Speech Feature Extraction: Feature extraction from the speech signal is the initial stage of any speech recognition system.
Stars: ✭ 78 (+95%)
Durian: Implementation of the paper "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf).
Stars: ✭ 111 (+177.5%)
Tacotron pytorch: PyTorch implementation of the Tacotron speech synthesis model.
Stars: ✭ 242 (+505%)
kaldi ag training: Docker image and scripts for training fine-tuned or completely personal Kaldi speech models, particularly for use with kaldi-active-grammar.
Stars: ✭ 14 (-65%)
Gcc Nmf: Real-time GCC-NMF Blind Speech Separation and Enhancement
Stars: ✭ 231 (+477.5%)
MelNet-SpeechGeneration: Implementation of MelNet in PyTorch to generate high-fidelity audio samples
Stars: ✭ 19 (-52.5%)
Source separation: Deep-learning-based speech source separation using PyTorch
Stars: ✭ 226 (+465%)
StyleSpeech: Official implementation of Meta-StyleSpeech and StyleSpeech
Stars: ✭ 161 (+302.5%)
Speech Denoiser: A speech denoising LV2 plugin based on the RNNoise library
Stars: ✭ 220 (+450%)
linear16: Converts an audio file to a LINEAR16 file compatible with Google Speech.
Stars: ✭ 14 (-65%)
Tts Cube: End-2-end speech synthesis with recurrent neural networks
Stars: ✭ 213 (+432.5%)
audio noise clustering: An experiment with a variety of clustering (and clustering-like) techniques to reduce noise in an audio speech recording. Results: https://dodiku.github.io/audio_noise_clustering/results/
Stars: ✭ 24 (-40%)
Edgedict: Working online speech recognition based on RNN Transducer. (Trained model available in release.)
Stars: ✭ 205 (+412.5%)
speech to text: How to use the Google Cloud Speech API to transcribe audio/video files.
Stars: ✭ 35 (-12.5%)
Esp8266sam: Speech synthesis for ESP8266 using S.A.M. port
Stars: ✭ 199 (+397.5%)
simple-obs-stt: Speech-to-text and keyboard input captions for OBS.
Stars: ✭ 89 (+122.5%)
DeepSegmentor: Sequence Segmentation using Joint RNN and Structured Prediction Models (ICASSP 2017)
Stars: ✭ 17 (-57.5%)
Depression Detect: Predicting depression from acoustic features of speech using a Convolutional Neural Network.
Stars: ✭ 187 (+367.5%)
KeenASR-Android-PoC: A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html
Stars: ✭ 21 (-47.5%)
ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)
Stars: ✭ 158 (+295%)
opensource-voice-tools: A repo listing known open-source voice tools, ordered by where they sit in the voice stack
Stars: ✭ 21 (-47.5%)
Chatbot Watson Android: An Android chatbot powered by Watson services (Assistant, Speech-to-Text and Text-to-Speech) on IBM Cloud.
Stars: ✭ 169 (+322.5%)
datasets: 🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Stars: ✭ 13,870 (+34575%)
Tacotron asr: Speech Recognition Using Tacotron
Stars: ✭ 165 (+312.5%)
FAST-RIR: The official implementation of a neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
Stars: ✭ 90 (+125%)
Aeneas: aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Stars: ✭ 1,942 (+4755%)
TASNET: Time-domain Audio Separation Network (in PyTorch)
Stars: ✭ 18 (-55%)
Tacotron: A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Stars: ✭ 1,756 (+4290%)
rnn benchmarks: RNN benchmarks of PyTorch, TensorFlow and Theano
Stars: ✭ 85 (+112.5%)
Diffwave: DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
Stars: ✭ 139 (+247.5%)
deepspeech.mxnet: An MXNet implementation of Baidu's DeepSpeech architecture
Stars: ✭ 82 (+105%)
NBSS: The official repo of "Multi-channel Narrow-band Deep Speech Separation with Full-band Permutation Invariant Training", "Multichannel Speech Separation with Narrow-band Conformer" and "NBC2: Multichannel Speech Separation with Revised Narrow-band Conformer".
Stars: ✭ 77 (+92.5%)
Voc: A physical model of the human vocal tract using literate programming, based on Pink Trombone.
Stars: ✭ 129 (+222.5%)
HTK: The Hidden Markov Model Toolkit (HTK) from the University of Cambridge, with issues fixed.
Stars: ✭ 23 (-42.5%)
AESRC2020: A deep accent recognition network
Stars: ✭ 35 (-12.5%)
txt2speech: Convert text to speech using the Google Translate API
Stars: ✭ 38 (-5%)
kaldi helpers: 🙊 A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.
Stars: ✭ 13 (-67.5%)
Tts: Text-to-Speech for Arduino
Stars: ✭ 118 (+195%)
ventib: 📈 Ventib records your voice, transcribes it in real time, and performs speech pattern analysis to give you objective statistics about how you speak.
Stars: ✭ 43 (+7.5%)
anycontrol: Voice control for your websites and applications
Stars: ✭ 53 (+32.5%)
wav2vec2-live: Live speech recognition using Facebook's wav2vec 2.0 model.
Stars: ✭ 205 (+412.5%)
jackpair: P2P speech-encrypting device with an analog audio interface, suitable for GSM phones
Stars: ✭ 26 (-35%)
fade: A Simulation Framework for Auditory Discrimination Experiments
Stars: ✭ 12 (-70%)
KAREN: Unifying Hatespeech Detection and Benchmarking
Stars: ✭ 18 (-55%)
UniSpeech: Large Scale Self-Supervised Learning for Speech
Stars: ✭ 224 (+460%)