The J.A.R.V.I.S. Speech API is designed to be simple and efficient. Written in Java, it includes a recognizer, a synthesizer, and a microphone capture utility, and it relies on Google's speech services for the recognizer and synthesizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.

Stars: ✭ 490 (-35.19%)

Mutual labels: speech-recognition, speech

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Stars: ✭ 2,384 (+215.34%)

Mutual labels: speech-recognition, asr

Keras Sincnet

Keras (tensorflow) implementation of SincNet (Mirco Ravanelli, Yoshua Bengio - https://github.com/mravanelli/SincNet)

Stars: ✭ 47 (-93.78%)

Mutual labels: speech-recognition, asr

syn-speech-samples

An application that demonstrates the usage of the Syn.Speech library for speech recognition

Stars: ✭ 24 (-96.83%)

Mutual labels: speech-recognition, asr

torchain

WIP: pytorch FFI wrapper for Kaldi chain loss (a.k.a. Lattice Free MMI)

Stars: ✭ 20 (-97.35%)

Mutual labels: kaldi, asr

Kaldi Onnx

Kaldi model converter to ONNX

Stars: ✭ 174 (-76.98%)

Mutual labels: speech-recognition, kaldi

Speechtotext Websockets Javascript

SDK & Sample to do speech recognition using websockets in Javascript

Stars: ✭ 191 (-74.74%)

Mutual labels: speech-recognition, speech

spokestack-ios

Spokestack: give your iOS app a voice interface!

Stars: ✭ 27 (-96.43%)

Mutual labels: speech-recognition, asr

datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Stars: ✭ 13,870 (+1734.66%)

Mutual labels: numpy, speech

deepspeech.mxnet

An MXNet implementation of Baidu's DeepSpeech architecture

Stars: ✭ 82 (-89.15%)

Mutual labels: speech, speech-recognition

speech-transformer

Transformer implementation specialized in speech recognition tasks using PyTorch.

Stars: ✭ 40 (-94.71%)

Mutual labels: speech, asr

Kaldi Active Grammar

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time

Stars: ✭ 196 (-74.07%)

Mutual labels: speech-recognition, kaldi

TF-Speech-Recognition-Challenge-Solution

Source code of the model used in the TensorFlow Speech Recognition Challenge (https://www.kaggle.com/c/tensorflow-speech-recognition-challenge). The solution ranked in the top 5% on the private leaderboard.

Stars: ✭ 58 (-92.33%)

Mutual labels: speech, speech-recognition

megs

A merged version of multiple open-source German speech datasets.

Stars: ✭ 21 (-97.22%)

Mutual labels: speech-recognition, asr

speech to text

How to use the Google Cloud Speech API to transcribe audio/video files.

Stars: ✭ 35 (-95.37%)

Mutual labels: speech, speech-recognition

lightning-asr

Modular and extensible speech recognition library leveraging pytorch-lightning and hydra.

Stars: ✭ 36 (-95.24%)

Mutual labels: speech-recognition, asr

kaldi-alligner

Scripts to align a given wave file to its transcription using models trained with Kaldi

Stars: ✭ 24 (-96.83%)

Mutual labels: kaldi, asr

Discordspeechbot

A speech-to-text bot for Discord with music commands and more, built with NodeJS. Ideal for controlling your Discord server using voice commands; it can also be useful for hearing-impaired people.

Stars: ✭ 35 (-95.37%)

Mutual labels: speech-recognition, speech

Specaugment

An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain

Stars: ✭ 408 (-46.03%)

Mutual labels: speech-recognition, speech

ctc-asr

End-to-end trained speech recognition system, based on RNNs and the connectionist temporal classification (CTC) cost function.

Stars: ✭ 112 (-85.19%)

Mutual labels: speech-recognition, asr

Speech Feature Extraction

Feature extraction of speech signal is the initial stage of any speech recognition system.

Stars: ✭ 78 (-89.68%)

Mutual labels: speech, feature-extraction

speech-recognition

SDKs and docs for Skit's speech to text service

Stars: ✭ 20 (-97.35%)

Mutual labels: speech-recognition, asr

speech-to-text

mixlingual speech recognition system; hybrid (GMM+NNet) model; Kaldi + Keras

Stars: ✭ 61 (-91.93%)

Mutual labels: speech-recognition, kaldi

mongolian-nlp

Useful resources for Mongolian NLP

Stars: ✭ 119 (-84.26%)

Mutual labels: speech-recognition, language-model

kaldi helpers

🙊 A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.

Stars: ✭ 13 (-98.28%)

Mutual labels: speech, kaldi

srvk-eesen-offline-transcriber

Top level code to transcribe English audio/video files into text/subtitles

Stars: ✭ 22 (-97.09%)

Mutual labels: speech-recognition, kaldi

End-to-End-Mandarin-ASR

End-to-end speech recognition on AISHELL dataset.

Stars: ✭ 20 (-97.35%)

Mutual labels: speech-recognition, asr

Docker Kaldi Gstreamer Server

Dockerfile for kaldi-gstreamer-server.

Stars: ✭ 266 (-64.81%)

Mutual labels: asr, kaldi

vosk-asterisk

Speech Recognition in Asterisk with Vosk Server

Stars: ✭ 52 (-93.12%)

Mutual labels: speech-recognition, asr

demo vietasr

Vietnamese Speech Recognition

Stars: ✭ 22 (-97.09%)

Mutual labels: speech-recognition, asr

Speech Emotion Analyzer

The neural network model is capable of detecting five different male/female emotions from speech audio. (Deep Learning, NLP, Python)

Stars: ✭ 633 (-16.27%)

Mutual labels: speech-recognition, speech

UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

Stars: ✭ 224 (-70.37%)

Mutual labels: speech, speech-recognition

Athena

An open-source implementation of a sequence-to-sequence based speech processing engine

Stars: ✭ 542 (-28.31%)

Mutual labels: speech-recognition, asr

kosr

Transformer-based Korean speech recognition

Stars: ✭ 25 (-96.69%)

Mutual labels: speech-recognition, asr

Ctcwordbeamsearch

Connectionist Temporal Classification (CTC) decoder with dictionary and language model for TensorFlow.

Stars: ✭ 398 (-47.35%)

Mutual labels: speech-recognition, language-model

Pocketsphinx Python

Python interface to CMU Sphinxbase and Pocketsphinx libraries

Stars: ✭ 298 (-60.58%)

Mutual labels: speech-recognition, speech

Espnet

End-to-End Speech Processing Toolkit

Stars: ✭ 4,533 (+499.6%)

Mutual labels: speech-recognition, kaldi

Libreasr

💬 An On-Premises, Streaming Speech Recognition System

Stars: ✭ 633 (-16.27%)

Mutual labels: speech-recognition, asr

Speech Aligner

speech-aligner is a tool that generates phoneme-level time alignment annotations from human speech and its transcription.

Stars: ✭ 259 (-65.74%)

Mutual labels: speech, kaldi

UnityASR

Automatic Speech Recognition in Unity.

Stars: ✭ 14 (-98.15%)

Mutual labels: speech-recognition, asr

Asr theory

Speech recognition theory, papers, and slides (PPT)

Stars: ✭ 344 (-54.5%)

Mutual labels: asr, kaldi

Tensorflow end2end speech recognition

End-to-end speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)