
Top 326 speech-recognition open source projects

Wav2Vec for speech recognition, classification, and audio classification

✭ 113

Jupyter Notebook python speech-recognition automatic-speech-recognition emotion-recognition speech-emotion-recognition speech-classification

Tensorflow-Keyword-Spotting

Keyword spotting using various architectures, such as a convolutional VGGNet, a 1D convolutional network, and CTC.

✭ 27

python tensorflow speech-recognition

A chronology of deep learning

Tracing back and exposing in chronological order the main ideas in the field of deep learning, to help everyone better understand the current intense research in AI.

✭ 47

natural-language-processing computer-vision deep-learning optimization history speech-recognition

Deep-learning-And-Paper

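[For exchange and learning purposes only] Machine intelligence: related books and classic papers, including experiment code for AutoML, sentiment classification, speech recognition, voiceprint recognition, speech synthesis, and more.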

✭ 62

Jupyter Notebook python ai deep-learning book speech-recognition speech-to-text papers speaker-verification automl sentiment-classification meachinelearning

revai-node-sdk

Node.js SDK for the Rev AI API

✭ 21

typescript javascript Dockerfile nodejs sdk realtime captions speech-recognition speech-to-text rev revai

deepspeech.mxnet

An MXNet implementation of Baidu's DeepSpeech architecture

✭ 82

python shell mxnet arch speech speech-recognition baidu speech-to-text stt warp-ctc deepspeech

deep avsr

A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.

✭ 104

python speech-recognition automatic-speech-recognition speech-to-text audio-visual-speech-recognition lip-reading visual-speech-recognition

TinyCog

Small Robot, Toy Robot platform

✭ 29

C++ scheme CMake shell audio rpi robotics toy embedded-systems speech-synthesis robot-framework vision speech-recognition motor-controller sense rpi3-computer

spokestack-ios

Spokestack: give your iOS app a voice interface!

✭ 27

swift ios text-to-speech tensorflow speech-synthesis voice-recognition speech-recognition vad speech-to-text hacktoberfest speech-processing asr voice-assistant natural-language-understanding speech-api voice-activity-detection voice-synthesis wakeword wakeword-activation

cobra

On-device voice activity detection (VAD) powered by deep learning.

✭ 76

javascript python typescript rust c swift android ios web deep-learning voice-recognition speech-recognition vad voice-activity-detection

react-client

A React client library for the Speechly API

✭ 71

typescript javascript react natural-language-processing voice speech-recognition

srvk-eesen-offline-transcriber

Top-level code to transcribe English audio/video files into text/subtitles

✭ 22

shell python Makefile speech-recognition kaldi eesen

kaldi-long-audio-alignment

Long audio alignment using Kaldi

✭ 21

shell python speech-recognition automatic-speech-recognition speech-to-text kaldi transcription asr speechrecognition split-audio longaudio-alignment audio-segments speech-transcription

speechless

Speech-to-text based on wav2letter built for transfer learning

✭ 92

python tensorflow keras speech-recognition

Unity live caption

Use the Google Speech-to-Text API for real-time live-stream captioning in Unity! Works best when combined with your virtual character!

✭ 26

python C# stream unity speech-recognition google-api speech-to-text vtuber youtuber automatic-caption live-caption

mongolian-nlp

Useful resources for Mongolian NLP

✭ 119

Jupyter Notebook nlp natural-language-processing text-to-speech deep-learning pytorch speech-recognition language-model mongolian

Speech-Backbones

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

✭ 205

Jupyter Notebook python speech-synthesis speech-recognition speech-processing

open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

✭ 841

text-to-speech tts speech-synthesis voice-recognition speech-recognition speech-to-text stt speech-processing voice-activity-detection speech-separation speech-emotion-recognition voice-cloning

syn-speech-samples

An application that demonstrates the usage of the Syn.Speech library for speech recognition

✭ 24

C# speech-recognition asr speech-recognizer

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

✭ 2,384

C++ python shell perl CMake java pytorch transformer speech-recognition automatic-speech-recognition production-ready asr conformer e2e-models

pytorch audio

Audio processing module for PyTorch: stft, istft

✭ 33

python pytorch speech-recognition pytorch-audio

favorite-research-papers

Listing my favorite research papers 📝 from different fields as I read them.

✭ 12

machine-learning deep-learning neural-network artificial-intelligence generative-adversarial-network style-transfer speech-recognition generative-model image-classification transfer-learning research-paper

Chinese-automatic-speech-recognition

Chinese speech recognition

✭ 147

Jupyter Notebook python machine-learning deep-learning signal-processing speech-recognition chinese-nlp speech-to-text chinese-speech-recognition chinese-speech-to-text

VoiceDictation

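Xunfei (iFLYTEK) voice dictation WebAPI: converts speech (≤60 seconds) into the corresponding text so that machines can "understand" human language, in effect giving the machine "ears" so that it is able to listen.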

✭ 36

javascript CSS HTML voice speech-recognition webapi websocket-api voice-dictation

scripty

Speech to text bot for Discord using Mozilla's DeepSpeech

✭ 14

rust discord discord-bot speech-recognition speech-to-text stt

Transformer-Transducer

PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASSP 2020)

✭ 61

python end-to-end transformer speech-recognition sequence-to-sequence rnnt transformer-transducer

timit-preprocessor

Extract mfcc vectors and phones from TIMIT dataset

✭ 14

shell python deep-learning phone speech-recognition data-preprocessing mfcc timit-dataset timit

speech-recognition-transfer-learning

Speech command recognition DenseNet transfer learning from UrbanSound8k in keras tensorflow

✭ 18

python Jupyter Notebook tensorflow keras kaggle speech-recognition densenet transfer-learning dilatednet speech-commands

rustfst

Rust re-implementation of OpenFST - library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.

✭ 104

rust python C++ automata graph tokenizer composition speech-recognition transducers kaldi transducer asr rust-crate fst openfst shortest-path finite-state-transducers kaldi-asr wfst finite-state-acceptors fsts

QuantumSpeech-QCNN

IEEE ICASSP 21 - Quantum Convolution Neural Networks for Speech Processing and Automatic Speech Recognition

✭ 71

Jupyter Notebook python speech-recognition speech-processing quantum-machine-learning colab-notebook tensorflow2 pennylane ctc-model

VoiceBridge

VoiceBridge - an AI-TOOLKIT Open Source C++ Speech Recognition Toolkit

✭ 17

C++ c Makefile shell Cuda fortran dll examples speech-recognition voicebridge ai-toolkit language-model-generation pronunciation-lexicon-generation

kaldi ag training

Docker image and scripts for training finetuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-grammar.

✭ 14

shell python training custom personal speech speech-recognition speech-to-text kaldi fine-tuning kaldi-asr

PCPM

Presenting Collection of Pretrained Models. Links to pretrained models in NLP and voice.

✭ 21

sentiment speech-recognition speech-to-text pretrained-models language-model asr pretrained

Inimesed

An Android app that lets you search your contacts by voice. Internet not required. Based on Pocketsphinx. Uses Estonian acoustic models.

✭ 65

java c HTML Makefile CSS android speech-recognition speech-to-text pocketsphinx android-ndk estonian

ml-with-audio

HF's ML for Audio study group

✭ 104

Jupyter Notebook speech-synthesis speech-recognition huggingface

DeepSpeech-API

The code enables users to use Mozilla's Deep Speech model over the Web Browser.

✭ 31

typescript python HTML javascript CSS speech-recognition speech-to-text mozilla-deepspeech

End-to-End-Mandarin-ASR

End-to-end speech recognition on AISHELL dataset.

✭ 20

python end-to-end pytorch speech-recognition asr mandarin chinese-speech-recognition specaugment aishell

AmazonSpeechTranslator

End-to-end Solution for Speech Recognition, Text Translation, and Text-to-Speech for iOS using Amazon Translate and Amazon Polly as AWS Machine Learning managed services.

✭ 50

swift ruby text-to-speech interpreter translation youtube-video speech-synthesis voice-recognition speech-recognition speech-to-text amazon-polly amazon-cognito mobile-development speech-recognizer translation-api aws-sdk-ios aws-mobilehub amazon-translate

api

Speechly public API definitions and generated code

✭ 15

swift python api natural-language-processing protobuf voice grpc speech-recognition

2018-dlsl

UPC Deep Learning for Speech and Language 2018

✭ 18

translation deep-learning speech-recognition automatic-speech-recognition neural-machine-translation teaching-materials speaker-identification

rnnt decoder cuda

An efficient implementation of RNN-T Prefix Beam Search in C++/CUDA.

✭ 60

Cuda C++ python Makefile cuda speech-recognition beam-search speech-to-text transducer handwriting-recognition prefix-search rnnt

salutejs

SmartApp Framework for building skills for the "Salute" family of virtual assistants in JavaScript

✭ 35

typescript javascript nodejs artificial-intelligence speech-recognition voice-assistant virtual-assistant

speechrec

A simple speech recognition app using the Web Speech API interfaces

✭ 18

javascript CSS HTML speech-synthesis speech-recognition speech-to-text speech-processing speech-api

Android-TTS-STT

One-line solution for the Android text-to-speech (TTS) and speech-to-text (STT) translation problem

✭ 77

kotlin speech-recognition tts-android texttospeech-android speechtotext android-speech

houndify-sdk-go

The official Houndify SDK for Go

✭ 23

go sdk voice-recognition speech-recognition houndify voice-search

telltime

iOS application to tell the time in the British way 🇬🇧⏰

✭ 49

swift redux text-to-speech time clock speech-recognition swiftui composable-architecture the-composable-architecture

hf-experiments

Experiments with Hugging Face 🔬 🤗

Khronos

The open source intelligent personal assistant

✭ 25

c CMake cmake speech-synthesis speech-recognition libsndfile khronos portaudio

web-speech-cognitive-services

Polyfill Web Speech API with Cognitive Services Bing Speech for both speech-to-text and text-to-speech service.

✭ 35

javascript HTML shell text-to-speech azure speech-synthesis speech-recognition speech-to-text cognitive-services

ctc-asr

End-to-end trained speech recognition system, based on RNNs and the connectionist temporal classification (CTC) cost function.

✭ 112

python shell machine-learning mit neural-network tensorflow speech-recognition asr ctc

kospeech

Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.

✭ 456

python shell end-to-end pytorch transformer speech-recognition las seq2seq jasper asr conformer attention-is-all-you-need korean-speech e2e-asr las-models ksponspeech

speechmatics-python

Python library and CLI for Speechmatics

✭ 24

python Makefile cli speech-recognition speech-to-text transcription

speech-recognition-evaluation

Evaluate results from ASR/Speech-to-Text quickly

✭ 25

javascript diff statistics evaluation comparison speech-recognition accuracy words speech-to-text stt difference asr mismatches wer word-error-rate transcriptions punctuations insertions

titanium-speech

Use the iOS 10 SFSpeechRecognizer API in JavaScript with Appcelerator Hyperloop.

✭ 21

javascript python ios native speech-recognition hyperloop appcelerator-hyperloop

simple-obs-stt

Speech-to-text and keyboard input captions for OBS.

✭ 89

typescript HTML rust CSS javascript twitch angular azure webrtc speech captions tts subtitles speech-recognition speech-to-text obs stt text-animation tauri akita stt-plugins

KodiSharp

Use Kodi python APIs in C#, and write rich addons using the .NET framework/Mono

✭ 22

C# C++ python CMake cross-platform dotnet mono kodi interop speech-recognition kodi-plugin kodi-module kodi-addon kodi-python-apis

awesome-keyword-spotting

This repository is a curated list of awesome Speech Keyword Spotting (Wake-Up Word Detection).

✭ 150

speech-recognition hotword-detection keyword-spotting speech-processing awesome-lists

praise

Do stuff with your voice in the browser.

✭ 13

typescript javascript experimental web-speech-api speech-recognition

KeenASR-Android-PoC

A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html

✭ 21

java offline voice-commands speech voice-recognition speech-recognition voice-chat speech-to-text voice-control voice-assistant speech-to-text-android on-device

React.ai

It recognizes your speech, and a trained AI bot responds (e.g. customer service, personal assistant) using a machine learning API (DialogFlow, apiai), speech recognition, GraphQL, Next.js, React, and Redux

241-300 of 326 speech-recognition projects
