
Top 184 speech open source projects

Gender recognition by voice and speech analysis

✭ 248

r machine-learning data-science neural-network artificial-intelligence ai speech voice logistic-regression signal

Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.

✭ 245

jupyter-notebook speech text-to-speech tts speech-synthesis

Speechbrain.github.io

The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.

✭ 242

html deep-learning neural-network neural-networks deeplearning speech-recognition speech speech-to-text speech-processing

Kerasdeepspeech

A Keras CTC implementation of Baidu's DeepSpeech for model experimentation

✭ 245

python deep-learning machine-learning keras neural-network neural-networks deeplearning speech speech-to-text coreml baidu asr ctc

Tacotron pytorch

PyTorch implementation of Tacotron speech synthesis model.

✭ 242

python jupyter-notebook pytorch speech speech-synthesis tacotron

Lhotse

✭ 236

python deep-learning machine-learning pytorch audio ai data speech kaldi

Gcc Nmf

Real-time GCC-NMF Blind Speech Separation and Enhancement

✭ 231

python machine-learning real-time speech gcc low-latency speech-processing ipython-notebook

Setk

Tools for Speech Enhancement integrated with Kaldi

✭ 227

python speech kaldi

Source separation

Deep learning based speech source separation using Pytorch

✭ 226

jupyter-notebook deep-learning pytorch audio speech

Volute

Raspberry Pi + Nodejs = Speech Robot

✭ 224

javascript raspberry-pi speech

Speech Denoiser

A speech denoising LV2 plugin based on the RNNoise library

✭ 220

c audio rnn speech

Speech Enhancement

Deep learning for audio denoising

✭ 207

python deep-learning cnn speech unet

Tts Cube

End-2-end speech synthesis with recurrent neural networks

✭ 213

python neural-network lstm speech text-to-speech synthesis character neural

Neural Voice Cloning With Few Samples

Implementation of the research paper "Neural Voice Cloning with Few Samples" by Baidu

✭ 211

python speech speech-synthesis speech-processing

Edgedict

Working online speech recognition based on RNN Transducer. (Trained model available in the releases.)

✭ 205

python speech-recognition speech speech-to-text asr

Timit

The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus.

✭ 202

speech

Esp8266sam

Speech synthesis for ESP8266 using S.A.M. port

✭ 199

c esp8266 speech synthesis sam

Speechtotext Websockets Javascript

SDK and sample for speech recognition using WebSockets in JavaScript

✭ 191

javascript typescript js ts sdk websocket browser websockets microsoft speech-recognition speech recognition cognitive-services

Emotion Classification From Audio Files

Understanding emotions from audio files using neural networks and multiple datasets.

✭ 189

python python3 deep-learning machine-learning tensorflow keras audio deep-neural-networks speech audio-processing datascience emotion songs

Depression Detect

Predicting depression from acoustic features of speech using a Convolutional Neural Network.

✭ 187

python convolutional-neural-networks cnn speech

Vq Vae Speech

PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]

✭ 187

python pytorch speech speech-processing wavenet

React Native Dialogflow

A React-Native Bridge for the Google Dialogflow (API.AI) SDK

✭ 182

javascript react-native google speech voice speech-processing text-recognition

Siricontrol System

Control anything with Siri voice commands.

✭ 180

python ios framework raspberry-pi iot internet-of-things speech internet voice-commands siri voice-control

End2end Asr Pytorch

End-to-End Automatic Speech Recognition on PyTorch

✭ 175

python pytorch speech-recognition transformer speech asr end-to-end

Deep speaker Speaker recognition system

Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)

✭ 174

python keras speech

Chatbot Watson Android

An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.

✭ 169

java android chatbot dialog android-studio speech entity conversation intent workspace watson

Pytorch Kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

Tacotron asr

Speech Recognition Using Tacotron

✭ 165

python speech-recognition speech speech-to-text tacotron

Tts Papers

🐸 collection of TTS papers

✭ 160

deep-learning speech papers tts

Aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

✭ 1,942

python C++ c HTML linux cli macos windows nlp audio ffmpeg text speech text-to-speech tts alignment srt dtw festival espeak smil espeak-ng forced-alignment

Wavenet vocoder

WaveNet vocoder

✭ 1,926

python shell pytorch speech speech-synthesis speech-processing wavenet wavenet-vocoder neural-vocoder

Tacotron

A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model

✭ 1,756

python tensorflow speech tts speech-synthesis-model

Wavegrad

A fast, high-quality neural vocoder.

✭ 138

python machine-learning pytorch neural-network paper speech text-to-speech pretrained-models speech-synthesis

Diffwave

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

✭ 139

python machine-learning pytorch neural-network paper speech text-to-speech pretrained-models speech-synthesis

Allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

✭ 135

python pytorch speech-recognition speech

Voice activity detection

Voice Activity Detection based on Deep Learning & TensorFlow

✭ 132

python deep-learning machine-learning tensorflow artificial-intelligence deep-neural-networks deeplearning time-series speech-recognition resnet speech

Avpi

an open source voice command macro software

✭ 130

macro windows speech voice recognition

Voc

A physical model of the human vocal tract using literate programming, based on Pink Trombone.

✭ 129

tex music model speech dsp synthesis

Asr audio data links

A list of publicly available audio data that anyone can download for ASR or other speech activities

✭ 128

data speech-recognition speech speech-to-text asr

Reconstructing faces from voices

An example of the paper "reconstructing faces from voices"

✭ 127

python gan speech

Kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

✭ 11,151

shell C++ python perl c TeX cuda speech-recognition speech speech-to-text kaldi speaker-verification speaker-id

Pytorch Asr

ASR with PyTorch

✭ 124

python pytorch speech-recognition resnet speech decoder asr densenet kaldi ctc capsule-network

Code Switching Papers

A curated list of research papers and resources on code-switching

✭ 122

language nlp research speech papers

Tts

Text-to-Speech for Arduino

✭ 118

c arduino esp8266 esp32 speech text-to-speech tts teensy

Speech And Text Unity Ios Android

Speech to text in Unity iOS using native Speech Recognition

✭ 117

speech

Tfg Voice Conversion

Deep Learning-based Voice Conversion system

✭ 115

python deep-learning tensorflow keras deep-neural-networks numpy speech gplv3 speech-processing

Holobot

HoloBot is a reusable 3D interface that allows HoloLens & VR users to interact with any bot using Mixed Reality & Speech.

✭ 114

unity bot speech-recognition speech hololens microsoft-bot-framework

Durian

Implementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.

✭ 111

python speech text-to-speech tts speech-synthesis

Python Speech recognition

A simple example of using the Baidu speech recognition API with Python.

✭ 106

python speech-recognition speech scipy

Delta

DELTA is a deep learning based natural language and speech processing platform.

Audiomate

Python library for handling audio datasets.

✭ 99

python audio music speech-recognition speech noise

Wikipron

Massively multilingual pronunciation mining

✭ 99

python language nlp speech linguistics python-api

Gtts

Python library and CLI tool to interface with Google Translate's text-to-speech API

✭ 1,303

python cli pypi speech text-to-speech tts

Wavenet Enhancement

Speech Enhancement using Bayesian WaveNet

✭ 86

python speech wavenet

Audio

Data manipulation and transformation for audio signal processing, powered by PyTorch

✭ 1,262

python audio mp3 speech io wav

Julius

Open-Source Large Vocabulary Continuous Speech Recognition Engine

✭ 1,258

c speech-recognition speech audio-processing recognition

Tts

Tools to convert text to speech 📚💬

✭ 84

javascript amazon speech tts

Deepspeech

A PaddlePaddle implementation of ASR.

✭ 1,219

python speech-recognition speech speech-to-text

Openasr

A PyTorch-based end-to-end speech recognition system.

✭ 69

python speech-recognition transformer speech speech-to-text asr

