DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

Stars: ✭ 18,680 (+40508.7%)

Mutual labels: speech-to-text, deepspeech

leopard

On-device speech-to-text engine powered by deep learning

Stars: ✭ 354 (+669.57%)

Mutual labels: speech-to-text, transcription

kaldi helpers

🙊 A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.

Stars: ✭ 13 (-71.74%)

Mutual labels: speech-to-text, transcription

speech-to-text

Python helper for Google and IBM Watson speech-to-text cloud APIs.

Stars: ✭ 14 (-69.57%)

Mutual labels: speech-to-text, transcription

leon

🧠 Leon is your open-source personal assistant.

Stars: ✭ 8,560 (+18508.7%)

Mutual labels: speech-to-text, deepspeech

speechmatics-python

Python library and CLI for Speechmatics

Stars: ✭ 24 (-47.83%)

Mutual labels: speech-to-text, transcription

simple-obs-stt

Speech-to-text and keyboard input captions for OBS.

Stars: ✭ 89 (+93.48%)

Mutual labels: speech-to-text

KeenASR-Android-PoC

A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html

Stars: ✭ 21 (-54.35%)

Mutual labels: speech-to-text

React.ai

Recognizes your speech, and a trained AI bot responds (e.g. customer service, personal assistant) using Machine Learning API (DialogFlow, apiai), Speech Recognition, GraphQL, Next.js, React, Redux

Stars: ✭ 38 (-17.39%)

Mutual labels: speech-to-text

web-voice-processor

A library for real-time voice processing in web browsers

Stars: ✭ 69 (+50%)

Mutual labels: speech-to-text

revai-java-sdk

Rev.ai Java SDK

Stars: ✭ 16 (-65.22%)

Mutual labels: speech-to-text

scripty

Speech-to-text bot for Discord using Mozilla's DeepSpeech

Stars: ✭ 14 (-69.57%)

Mutual labels: speech-to-text

react-native-spokestack

Spokestack: give your React Native app a voice interface!

Stars: ✭ 53 (+15.22%)

Mutual labels: speech-to-text

ASR-Audio-Data-Links

A list of publicly available audio data that anyone can download for ASR or other speech activities

Stars: ✭ 179 (+289.13%)

Mutual labels: speech-to-text

glaemscribe

Glaemscribe, the Tolkienian languages/writings transcription engine.

Stars: ✭ 29 (-36.96%)

Mutual labels: transcription

wav2vec2-live

Live speech recognition using Facebook's wav2vec 2.0 model.

Stars: ✭ 205 (+345.65%)

Mutual labels: speech-to-text

revai-python-sdk

Rev AI Python SDK

Stars: ✭ 35 (-23.91%)

Mutual labels: speech-to-text

aws-transcribe-demo

A simple AWS demo that uses Amazon Transcribe to convert audio to text and analyse it.

Stars: ✭ 39 (-15.22%)

Mutual labels: aws-transcribe

spokestack-ios

Spokestack: give your iOS app a voice interface!

Stars: ✭ 27 (-41.3%)

Mutual labels: speech-to-text

PCPM

Presenting Collection of Pretrained Models. Links to pretrained models in NLP and voice.

Stars: ✭ 21 (-54.35%)

Mutual labels: speech-to-text

vave

🌊 A crazy simple library for reading/writing WAV files in V. Zero dependencies, 100% cross-platform.

Stars: ✭ 35 (-23.91%)

Mutual labels: deepspeech

benchmarkstt

Open Source AI Benchmarking toolkit for benchmarking speech to text services

Stars: ✭ 43 (-6.52%)

Mutual labels: speech-to-text

open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

Stars: ✭ 841 (+1728.26%)

Mutual labels: speech-to-text

revai-node-sdk

Node.js SDK for the Rev AI API

Stars: ✭ 21 (-54.35%)

Mutual labels: speech-to-text

octopus

On-device speech-to-index engine powered by deep learning.

Stars: ✭ 30 (-34.78%)

Mutual labels: speech-to-text

Chinese-automatic-speech-recognition

Chinese speech recognition

Stars: ✭ 147 (+219.57%)

Mutual labels: speech-to-text

Generate-Live-Transcription

This extension helps to get a real-time transcription of audio playing in the browser using Deep Speech.

Stars: ✭ 16 (-65.22%)

Mutual labels: deepspeech

parlatype

GNOME audio player for transcription

Stars: ✭ 151 (+228.26%)

Mutual labels: transcription

aws-content-analysis

This project is a fully automated video search engine which uses AWS AI services for computer vision and speech recognition to catalog video archives.

Stars: ✭ 67 (+45.65%)

Mutual labels: aws-transcribe

asr24

24-hour Automatic Speech Recognition

Stars: ✭ 27 (-41.3%)

Mutual labels: transcription

deep avsr

A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.

Stars: ✭ 104 (+126.09%)

Mutual labels: speech-to-text

Speech recognition with tensorflow

Implementation of a seq2seq model for Speech Recognition using the latest version of TensorFlow. Architecture similar to Listen, Attend and Spell.

Stars: ✭ 253 (+450%)

Mutual labels: speech-to-text

kaldi ag training

Docker image and scripts for training fine-tuned or completely personal Kaldi speech models, particularly for use with kaldi-active-grammar.

Stars: ✭ 14 (-69.57%)

Mutual labels: speech-to-text

anycontrol

Voice control for your websites and applications

Stars: ✭ 53 (+15.22%)

Mutual labels: speech-to-text

digital-paper-edit-client

Work in progress - BBC News Labs digital paper edit project - React Client

Stars: ✭ 36 (-21.74%)

Mutual labels: speech-to-text

megs

A merged version of multiple open-source German speech datasets.

Stars: ✭ 21 (-54.35%)

Mutual labels: speech-to-text

Inimesed

An Android app that lets you search your contacts by voice. Internet not required. Based on Pocketsphinx. Uses Estonian acoustic models.

Stars: ✭ 65 (+41.3%)

Mutual labels: speech-to-text

Speechbrain.github.io

The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.

Stars: ✭ 242 (+426.09%)

Mutual labels: speech-to-text

Kerasdeepspeech

A Keras CTC implementation of Baidu's DeepSpeech for model experimentation

Stars: ✭ 245 (+432.61%)

Mutual labels: speech-to-text

AmazonSpeechTranslator

End-to-end Solution for Speech Recognition, Text Translation, and Text-to-Speech for iOS using Amazon Translate and Amazon Polly as AWS Machine Learning managed services.

Stars: ✭ 50 (+8.7%)

Mutual labels: speech-to-text

Stt

🐸STT - a deep learning toolkit for Speech-to-Text, battle-tested in research and production

Stars: ✭ 197 (+328.26%)

Mutual labels: speech-to-text

Go Astibob

Golang framework to build an AI that can understand and speak back to you, and everything else you want

Stars: ✭ 222 (+382.61%)

Mutual labels: speech-to-text

dataflow-contact-center-speech-analysis

Speech Analysis Framework, a collection of components and code from Google Cloud that you can use to transcribe audio files to create analytics.

Stars: ✭ 46 (+0%)

Mutual labels: speech-to-text

Rnn ctc

Recurrent Neural Network and Long Short Term Memory (LSTM) with Connectionist Temporal Classification implemented in Theano. Includes a Toy training example.

Stars: ✭ 220 (+378.26%)

Mutual labels: speech-to-text

rnnt decoder cuda

An efficient implementation of RNN-T Prefix Beam Search in C++/CUDA.

Stars: ✭ 60 (+30.43%)

Mutual labels: speech-to-text

Edgedict

Working online speech recognition based on RNN Transducer. (Trained model available in the releases.)

Stars: ✭ 205 (+345.65%)

Mutual labels: speech-to-text

Kaldi Active Grammar

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time

Stars: ✭ 196 (+326.09%)

Mutual labels: speech-to-text

speechrec

A simple speech recognition app using the Web Speech API interfaces

Stars: ✭ 18 (-60.87%)

Mutual labels: speech-to-text

K6nele

An Android app that offers speech-to-text services and user interfaces to other apps

Stars: ✭ 196 (+326.09%)

Mutual labels: speech-to-text

Dictate.js

A small JavaScript library for browser-based real-time speech recognition, which uses Recorderjs for audio capture and a WebSocket connection to the Kaldi GStreamer server for speech recognition.

Stars: ✭ 195 (+323.91%)

Mutual labels: speech-to-text

web-speech-cognitive-services

Polyfill Web Speech API with Cognitive Services Bing Speech for both speech-to-text and text-to-speech service.

Stars: ✭ 35 (-23.91%)

Mutual labels: speech-to-text

Lingvo

Stars: ✭ 2,361 (+5032.61%)

Mutual labels: speech-to-text

Expressive tacotron

TensorFlow implementation of Expressive Tacotron

Stars: ✭ 192 (+317.39%)

Mutual labels: speech-to-text
