Top 184 speech open source projects

Wavegrad
Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.
Speechbrain.github.io
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
Tacotron pytorch
PyTorch implementation of Tacotron speech synthesis model.
Gcc Nmf
Real-time GCC-NMF Blind Speech Separation and Enhancement
Setk
Tools for Speech Enhancement integrated with Kaldi
Source separation
Deep-learning-based speech source separation using PyTorch
Volute
Raspberry Pi + Nodejs = Speech Robot
Speech Denoiser
A speech denoising LV2 plugin based on the RNNoise library
Speech Enhancement
Deep learning for audio denoising
Tts Cube
End-2-end speech synthesis with recurrent neural networks
Neural Voice Cloning With Few Samples
Implementation of Baidu's research paper "Neural Voice Cloning with Few Samples"
Edgedict
Working online speech recognition based on RNN Transducer. (A trained model is available in the releases.)
Timit
The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus.
Esp8266sam
Speech synthesis for ESP8266 using S.A.M. port
Depression Detect
Predicting depression from acoustic features of speech using a Convolutional Neural Network.
Vq Vae Speech
PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]
React Native Dialogflow
A React-Native Bridge for the Google Dialogflow (API.AI) SDK
End2end Asr Pytorch
End-to-End Automatic Speech Recognition on PyTorch
Deep speaker Speaker recognition system
Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)
Chatbot Watson Android
An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.
Pytorch Kaldi
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
Tts Papers
🐸 collection of TTS papers
Aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Tacotron
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Diffwave
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
Allosaurus
Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
Avpi
Open-source voice command macro software
Voc
A physical model of the human vocal tract using literate programming, based on Pink Trombone.
Asr audio data links
A list of publicly available audio data that anyone can download for ASR or other speech activities
Reconstructing faces from voices
An example implementation of the paper "Reconstructing Faces from Voices"
Kaldi
kaldi-asr/kaldi is the official location of the Kaldi project.
Code Switching Papers
A curated list of research papers and resources on code-switching
Speech And Text Unity Ios Android
Speech to text in Unity on iOS using native speech recognition
Holobot
HoloBot is a reusable 3D interface that allows HoloLens & VR users to interact with any bot using Mixed Reality & Speech.
Durian
Implementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.
Python Speech recognition
A simple example of using the Baidu speech recognition API with Python.
Audiomate
Python library for handling audio datasets.
Wikipron
Massively multilingual pronunciation mining
Gtts
Python library and CLI tool to interface with Google Translate's text-to-speech API
Wavenet Enhancement
Speech Enhancement using Bayesian WaveNet
Audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
Julius
Open-Source Large Vocabulary Continuous Speech Recognition Engine
Tts
Tools to convert text to speech 📚💬
Deepspeech
A PaddlePaddle implementation of ASR.
Openasr
A PyTorch-based end-to-end speech recognition system.