pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.

Stars: ✭ 2,097 (+1098.29%)

Mutual labels: speech-recognition, speech, asr

opensource-voice-tools

A repo listing known open source voice tools, ordered by where they sit in the voice stack

Stars: ✭ 21 (-88%)

Mutual labels: speech, speech-recognition, asr

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Stars: ✭ 2,384 (+1262.29%)

Mutual labels: transformer, speech-recognition, asr

Espresso

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

Stars: ✭ 808 (+361.71%)

Mutual labels: speech-recognition, asr, end-to-end

Speech Transformer

A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.

Stars: ✭ 565 (+222.86%)

Mutual labels: asr, end-to-end, transformer

Edgedict

Working online speech recognition based on RNN Transducer. (A trained model is available in the releases.)

Stars: ✭ 205 (+17.14%)

Mutual labels: speech-recognition, speech, asr

Asr audio data links

A list of publicly available audio data that anyone can download for ASR or other speech activities

Stars: ✭ 128 (-26.86%)

Mutual labels: speech-recognition, speech, asr

Syn Speech

Syn.Speech is a flexible, speaker-independent continuous speech recognition engine for Mono and the .NET Framework

Stars: ✭ 57 (-67.43%)

Mutual labels: speech-recognition, speech, asr

Wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Stars: ✭ 617 (+252.57%)

Mutual labels: speech-recognition, asr, transformer

ASR-Audio-Data-Links

A list of publicly available audio data that anyone can download for ASR or other speech activities

Stars: ✭ 179 (+2.29%)

Mutual labels: speech, speech-recognition, asr

Transformer-Transducer

PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASSP 2020)

Stars: ✭ 61 (-65.14%)

Mutual labels: end-to-end, transformer, speech-recognition

Lingvo

Stars: ✭ 2,361 (+1249.14%)

Mutual labels: speech-recognition, speech, asr

Speech Transformer Tf2.0

Transformer for ASR systems (via TensorFlow 2.0)

Stars: ✭ 90 (-48.57%)

Mutual labels: speech-recognition, end-to-end, transformer

wav2vec2-live

Live speech recognition using Facebook's wav2vec 2.0 model.

Stars: ✭ 205 (+17.14%)

Mutual labels: speech, speech-recognition, asr

Delta

DELTA is a deep-learning-based natural language and speech processing platform.

Stars: ✭ 1,479 (+745.14%)

Mutual labels: speech-recognition, speech, asr

E2e Asr

PyTorch Implementations for End-to-End Automatic Speech Recognition

Stars: ✭ 106 (-39.43%)

Mutual labels: speech-recognition, asr, end-to-end

Rnn Transducer

MXNet implementation of RNN Transducer (Graves 2012): Sequence Transduction with Recurrent Neural Networks

Stars: ✭ 114 (-34.86%)

Mutual labels: speech-recognition, asr, end-to-end

sova-asr

SOVA ASR (Automatic Speech Recognition)

Stars: ✭ 123 (-29.71%)

Mutual labels: speech, speech-recognition, asr

End-to-End-Mandarin-ASR

End-to-end speech recognition on AISHELL dataset.

Stars: ✭ 20 (-88.57%)

Mutual labels: end-to-end, speech-recognition, asr

spokestack-android

Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!

Stars: ✭ 52 (-70.29%)

Mutual labels: speech, speech-recognition, asr

Pytorch Asr

ASR with PyTorch

Stars: ✭ 124 (-29.14%)

Mutual labels: speech-recognition, speech, asr

Zamia Speech

Open tools and data for cloudless automatic speech recognition

Stars: ✭ 374 (+113.71%)

Mutual labels: speech-recognition, asr

Espnet

End-to-End Speech Processing Toolkit

Stars: ✭ 4,533 (+2490.29%)

Mutual labels: speech-recognition, end-to-end

Listen Attend Spell

A PyTorch implementation of Listen, Attend and Spell (LAS), an End-to-End ASR framework.

Stars: ✭ 147 (-16%)

Mutual labels: asr, end-to-end

Nmtpytorch

Sequence-to-Sequence Framework in PyTorch

Stars: ✭ 392 (+124%)

Mutual labels: speech-recognition, asr

Cheetah

On-device streaming speech-to-text engine powered by deep learning

Stars: ✭ 383 (+118.86%)

Mutual labels: speech-recognition, asr

Awesome Kaldi

A list of features, scripts, blogs, and resources for making better use of Kaldi (http://kaldi-asr.org/)

Stars: ✭ 393 (+124.57%)

Mutual labels: speech-recognition, speech

Java Speech Api

The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.

Stars: ✭ 490 (+180%)

Mutual labels: speech-recognition, speech

Specaugment

An implementation of SpecAugment (introduced by Google Brain) in TensorFlow and PyTorch

Stars: ✭ 408 (+133.14%)

Mutual labels: speech-recognition, speech

Speech Denoising Wavenet

A neural network for end-to-end speech denoising

Stars: ✭ 516 (+194.86%)

Mutual labels: speech, end-to-end

Pocketsphinx Python

Python interface to CMU Sphinxbase and Pocketsphinx libraries

Stars: ✭ 298 (+70.29%)

Mutual labels: speech-recognition, speech

Silero Models

Silero Models: pre-trained STT models and benchmarks made embarrassingly simple

Stars: ✭ 522 (+198.29%)

Mutual labels: speech-recognition, asr

Sonus

💬 /so.nus/ STT (speech to text) for Node with offline hotword detection

Stars: ✭ 532 (+204%)

Mutual labels: speech-recognition, speech

Libreasr

💬 An On-Premises, Streaming Speech Recognition System

Stars: ✭ 633 (+261.71%)

Mutual labels: speech-recognition, asr

Vad

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.

Stars: ✭ 622 (+255.43%)

Mutual labels: speech-recognition, speech

Py Kaldi Asr

Some simple wrappers around kaldi-asr intended to make using Kaldi's (online) decoders as convenient as possible.

Stars: ✭ 156 (-10.86%)

Mutual labels: speech-recognition, asr

Speech Emotion Analyzer

A neural network model capable of detecting five different emotions in male and female speech audio. (Deep Learning, NLP, Python)

Stars: ✭ 633 (+261.71%)

Mutual labels: speech-recognition, speech

Wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit

Stars: ✭ 5,907 (+3275.43%)

Mutual labels: speech-recognition, end-to-end

Eesen

The official repository of the Eesen project

Stars: ✭ 738 (+321.71%)

Mutual labels: speech-recognition, asr

Sincnet

SincNet is a neural architecture for efficiently processing raw audio samples.

Stars: ✭ 764 (+336.57%)

Mutual labels: speech-recognition, asr

Vosk Server

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries

Stars: ✭ 277 (+58.29%)

Mutual labels: speech-recognition, asr

Annyang

💬 Speech recognition for your site

Stars: ✭ 6,216 (+3452%)

Mutual labels: speech-recognition, speech

Discordspeechbot

A speech-to-text bot for Discord with music commands and more, built with Node.js. Ideal for controlling your Discord server with voice commands; it can also be useful for hearing-impaired people.

Stars: ✭ 35 (-80%)

Mutual labels: speech-recognition, speech

Keras Sincnet

Keras (TensorFlow) implementation of SincNet (Mirco Ravanelli, Yoshua Bengio - https://github.com/mravanelli/SincNet)

Stars: ✭ 47 (-73.14%)

Mutual labels: speech-recognition, asr

Allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

Stars: ✭ 135 (-22.86%)

Mutual labels: speech-recognition, speech

Asr benchmark

Program to benchmark various speech recognition APIs