soxan: Wav2Vec for speech recognition, classification, and audio classification
A chronology of deep learning: Traces, in chronological order, the main ideas in the field of deep learning, to help everyone better understand the current intense research in AI.
deep avsr: A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
TinyCog: Small Robot, Toy Robot platform
cobra: On-device voice activity detection (VAD) powered by deep learning.
speechless: Speech-to-text based on wav2letter, built for transfer learning
Unity live caption: Use the Google Speech-to-Text API for real-time live-stream captioning in Unity. Works best when combined with your virtual character.
Speech-Backbones: The main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
open-speech-corpora: 💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
syn-speech-samples: An application that demonstrates the usage of the Syn.Speech library for speech recognition
wenet: Production First and Production Ready End-to-End Speech Recognition Toolkit
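VoiceDictation: iFlytek voice dictation WebAPI. Converts speech (≤60 seconds) into the corresponding text so that machines can "understand" human language, in effect giving the machine "ears" so that it is able to "hear".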
scripty: Speech to text bot for Discord using Mozilla's DeepSpeech
Transformer-Transducer: PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASSP 2020)
rustfst: Rust re-implementation of OpenFST - library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.
QuantumSpeech-QCNN: IEEE ICASSP 21 - Quantum Convolution Neural Networks for Speech Processing and Automatic Speech Recognition
VoiceBridge: An AI-TOOLKIT Open Source C++ Speech Recognition Toolkit
kaldi ag training: Docker image and scripts for training finetuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-grammar.
PCPM: Presenting Collection of Pretrained Models. Links to pretrained models in NLP and voice.
Inimesed: An Android app that lets you search your contacts by voice. Internet not required. Based on Pocketsphinx. Uses Estonian acoustic models.
DeepSpeech-API: Code that lets users run Mozilla's DeepSpeech model from a web browser.
AmazonSpeechTranslator: End-to-end Solution for Speech Recognition, Text Translation, and Text-to-Speech for iOS using Amazon Translate and Amazon Polly as AWS Machine Learning managed services.
api: Speechly public API definitions and generated code
2018-dlsl: UPC Deep Learning for Speech and Language 2018
rnnt decoder cuda: An efficient implementation of RNN-T Prefix Beam Search in C++/CUDA.
salutejs: SmartApp Framework for building skills for the "Салют" (Salute) family of virtual assistants in JavaScript
speechrec: A simple speech recognition app using the Web Speech API interfaces
Android-TTS-STT: One-line solution for text-to-speech (TTS) and speech-to-text (STT) conversion on Android
telltime: iOS application to tell the time in the British way 🇬🇧⏰
Khronos: The open source intelligent personal assistant
ctc-asr: End-to-end trained speech recognition system, based on RNNs and the connectionist temporal classification (CTC) cost function.
kospeech: Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.
titanium-speech: Use the iOS 10 SFSpeechRecognizer API in JavaScript with Appcelerator Hyperloop.
KodiSharp: Use Kodi Python APIs in C#, and write rich addons using the .NET Framework/Mono
praise: Do stuff with your voice in the browser.
KeenASR-Android-PoC: A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html
React.ai: Recognizes your speech and has a trained AI bot respond (e.g. customer service, personal assistant), using a machine learning API (DialogFlow, apiai), speech recognition, GraphQL, Next.js, React, and Redux