speech-to-text: Python helper for Google and IBM Watson speech-to-text cloud APIs.
Unity live caption: Uses the Google Speech-to-Text API to caption live streams in real time in Unity. Works best when combined with your virtual character!
open-speech-corpora: 💎 A list of accessible speech corpora for ASR, TTS, and other speech technologies.
vspeech: 📢 Complete V bindings for Mozilla's DeepSpeech, a TensorFlow-based speech-to-text library. 📜
scripty: Speech-to-text bot for Discord using Mozilla's DeepSpeech.
kaldi ag training: Docker image and scripts for training fine-tuned or completely personal Kaldi speech models, particularly for use with kaldi-active-grammar.
PCPM: Presenting Collection of Pretrained Models. Links to pretrained models in NLP and voice.
Inimesed: An Android app that lets you search your contacts by voice. Internet not required. Based on Pocketsphinx. Uses Estonian acoustic models.
DeepSpeech-API: Code that lets users run Mozilla's DeepSpeech model from a web browser.
AmazonSpeechTranslator: End-to-end solution for speech recognition, text translation, and text-to-speech on iOS, using Amazon Translate and Amazon Polly as AWS managed machine learning services.
rnnt decoder cuda: An efficient implementation of RNN-T prefix beam search in C++/CUDA.
speechrec: A simple speech recognition app using the Web Speech API interfaces.
benchmarkstt: Open-source AI benchmarking toolkit for speech-to-text services.
KeenASR-Android-PoC: A proof-of-concept app using the KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html
React.ai: Recognizes your speech, and a trained AI bot responds (e.g., customer service, personal assistant), using a machine learning API (DialogFlow, apiai), speech recognition, GraphQL, Next.js, React, and Redux.
octopus: On-device speech-to-index engine powered by deep learning.
ASR-Audio-Data-Links: A list of publicly available audio data that anyone can download for ASR or other speech activities.
wav2vec2-live: Live speech recognition using Facebook's wav2vec 2.0 model.
leopard: On-device speech-to-text engine powered by deep learning.
anycontrol: Voice control for your websites and applications.
megs: A merged version of multiple open-source German speech datasets.