deepspeech.mxnet: An MXNet implementation of Baidu's DeepSpeech architecture
Stars: 82 (-7.87%)
open-speech-corpora: A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
Stars: 841 (+844.94%)
opensource-voice-tools: A repo listing known open source voice tools, ordered by where they sit in the voice stack
Stars: 21 (-76.4%)
Lingvo: Lingvo
Stars: 2,361 (+2552.81%)
sova-asr: SOVA ASR (Automatic Speech Recognition)
Stars: 123 (+38.2%)
Openasr: A PyTorch-based end-to-end speech recognition system.
Stars: 69 (-22.47%)
Edgedict: Working online speech recognition based on RNN Transducer. (Trained model available in release.)
Stars: 205 (+130.34%)
Dc tts: A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
Stars: 1,017 (+1042.7%)
scripty: Speech-to-text bot for Discord using Mozilla's DeepSpeech
Stars: 14 (-84.27%)
Awesome Kaldi: This is a list of features, scripts, blogs and resources for better using Kaldi (http://kaldi-asr.org/)
Stars: 393 (+341.57%)
anycontrol: Voice control for your websites and applications
Stars: 53 (-40.45%)
Discordspeechbot: A speech-to-text bot for Discord with music commands and more, using Node.js. Ideal for controlling your Discord server using voice commands; can also be useful for hearing-impaired people.
Stars: 35 (-60.67%)
KeenASR-Android-PoC: A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html
Stars: 21 (-76.4%)
Deepspeech: A PaddlePaddle implementation of ASR.
Stars: 1,219 (+1269.66%)
ASR-Audio-Data-Links: A list of publicly available audio data that anyone can download for ASR or other speech activities
Stars: 179 (+101.12%)
Java Speech Api: The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.
Stars: 490 (+450.56%)
Asr audio data links: A list of publicly available audio data that anyone can download for ASR or other speech activities
Stars: 128 (+43.82%)
Tacotron asr: Speech Recognition Using Tacotron
Stars: 165 (+85.39%)
Kaldi: kaldi-asr/kaldi is the official location of the Kaldi project.
Stars: 11,151 (+12429.21%)
Speechbrain.github.io: The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
Stars: 242 (+171.91%)
demo vietasr: Vietnamese Speech Recognition
Stars: 22 (-75.28%)
kaldi ag training: Docker image and scripts for training fine-tuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-grammar.
Stars: 14 (-84.27%)
Spokestack Python: Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application.
Stars: 103 (+15.73%)
speech to text: How to use the Google Cloud Speech API to transcribe audio/video files.
Stars: 35 (-60.67%)
spokestack-android: Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!
Stars: 52 (-41.57%)
Syn Speech: Syn.Speech is a flexible speaker-independent continuous speech recognition engine for Mono and .NET Framework
Stars: 57 (-35.96%)
wav2vec2-live: Live speech recognition using Facebook's wav2vec 2.0 model.
Stars: 205 (+130.34%)
Tensorflow Speech Recognition: Speech recognition using the TensorFlow deep learning framework, sequence-to-sequence neural networks
Stars: 2,118 (+2279.78%)
Annyang: Speech recognition for your site
Stars: 6,216 (+6884.27%)
Sonus: /so.nus/ STT (speech to text) for Node with offline hotword detection
Stars: 532 (+497.75%)
leopard: On-device speech-to-text engine powered by deep learning
Stars: 354 (+297.75%)
Watbot: An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.
Stars: 64 (-28.09%)
Soloud: Free, easy, portable audio engine for games
Stars: 1,048 (+1077.53%)
Tts: Tools to convert text to speech
Stars: 84 (-5.62%)
Audiomate: Python library for handling audio datasets.
Stars: 99 (+11.24%)
Gtts: Python library and CLI tool to interface with Google Translate's text-to-speech API
Stars: 1,303 (+1364.04%)
React.ai: It recognizes your speech, and a trained AI bot responds (e.g. customer service, personal assistant), using Machine Learning API (DialogFlow, apiai), Speech Recognition, GraphQL, Next.js, React, redux
Stars: 38 (-57.3%)
Julius: Open-Source Large Vocabulary Continuous Speech Recognition Engine
Stars: 1,258 (+1313.48%)
Delta: DELTA is a deep-learning-based natural language and speech processing platform.
Stars: 1,479 (+1561.8%)
Durian: Implementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.
Stars: 111 (+24.72%)
Tts: Text-to-Speech for Arduino
Stars: 118 (+32.58%)
Allosaurus: Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
Stars: 135 (+51.69%)
Aeneas: aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Stars: 1,942 (+2082.02%)
Tts Papers: collection of TTS papers
Stars: 160 (+79.78%)
octopus: On-device speech-to-index engine powered by deep learning.
Stars: 30 (-66.29%)
Holobot: HoloBot is a reusable 3D interface that allows HoloLens & VR users to interact with any bot using Mixed Reality & Speech.
Stars: 114 (+28.09%)
Tacotron: A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Stars: 1,756 (+1873.03%)
Pytorch Kaldi: pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
Stars: 2,097 (+2256.18%)
web-voice-processor: A library for real-time voice processing in web browsers
Stars: 69 (-22.47%)
Kerasdeepspeech: A Keras CTC implementation of Baidu's DeepSpeech for model experimentation
Stars: 245 (+175.28%)
XION-ChaseCam: This is a free-to-use HTML/JavaScript-based overlay for roleplay streamers. Basically it mimics the overlay of the AXON bodycam, but since most folks play in 3rd person, it's a ChaseCam. I've included a logo and the HTML file. The HTML file has the CSS, HTML, and JavaScript all in one file for ease of editing. Go to line 81 of the HTML file to c…
Stars: 27 (-69.66%)